rmetzger edited a comment on pull request #18692:
URL: https://github.com/apache/flink/pull/18692#issuecomment-1034859586
Sadly, the JobResultStore (JRS) still doesn't work on K8s when using a MinIO S3 implementation:
```
2022-02-10 12:20:23,679 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Starting the resource manager.
2022-02-10 12:20:23,765 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Start SessionDispatcherLeaderProcess.
2022-02-10 12:20:25,060 INFO org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Stopping SessionDispatcherLeaderProcess.
2022-02-10 12:20:25,164 INFO org.apache.flink.runtime.jobmanager.DefaultJobGraphStore [] - Stopping DefaultJobGraphStore.
2022-02-10 12:20:25,255 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of globally-terminated jobs from JobResultStore
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[?:1.8.0_322]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) [?:1.8.0_322]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606) [?:1.8.0_322]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not retrieve JobResults of globally-terminated jobs from JobResultStore
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:186) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:178) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
    ... 3 more
Caused by: java.io.FileNotFoundException: No such file or directory: s3://xxx-eu-west-1-dev-store/myorg/myscope/3d78a6e7-4c88-4e6f-8e59-4fb4b6dd6319-test-job-name-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/ha/job-result-store/default
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2344) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2226) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2160) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:1961) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$9(S3AFileSystem.java:1940) ~[?:?]
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) ~[?:?]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:1940) ~[?:?]
    at org.apache.flink.fs.s3hadoop.common.HadoopFileSystem.listStatus(HadoopFileSystem.java:170) ~[?:?]
    at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.listStatus(PluginFileSystemFactory.java:141) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.highavailability.FileSystemJobResultStore.getDirtyResultsInternal(FileSystemJobResultStore.java:158) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.withReadLock(AbstractThreadsafeJobResultStore.java:118) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.getDirtyResults(AbstractThreadsafeJobResultStore.java:100) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:184) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:178) ~[flink-dist-1.15-jrs-fix.jar:1.15-jrs-fix]
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_322]
    ... 3 more
2022-02-10 12:20:25,384 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneApplicationClusterEntryPoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally..
```
The directory exists:
```
AWS_ACCESS_KEY_ID=admin AWS_SECRET_ACCESS_KEY=password aws --endpoint-url http://localhost:9000 s3 ls \
  s3://xxx-eu-west-1-dev-store/myorg/myscope/3d78a6e7-4c88-4e6f-8e59-4fb4b6dd6319-test-job-name-aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/ha/job-result-store/default
```
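The trace shows `listStatus` on the JRS base path throwing `FileNotFoundException` out of `getDirtyResultsInternal`, which then takes down the cluster entrypoint. One possible mitigation, offered only as an assumption and not as what this PR does, would be to treat a missing base directory as "no dirty results yet" instead of a fatal error. The sketch below illustrates that idea with `java.nio.file` on the local filesystem as a stand-in; it does not use Flink's `FileSystem` API, and `DirtyResultLister`, `listNaive` and `listOrEmpty` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Local-filesystem analogy only; this is NOT Flink's FileSystemJobResultStore. */
class DirtyResultLister {

    /** Naive listing: propagates NoSuchFileException if the base directory is missing. */
    static List<String> listNaive(Path baseDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(baseDir)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    /** Defensive listing (assumed mitigation): a missing base directory means "no entries". */
    static List<String> listOrEmpty(Path baseDir) throws IOException {
        try {
            return listNaive(baseDir);
        } catch (NoSuchFileException e) {
            // Nothing has been written under the base directory yet.
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path: a child of a fresh temp directory that was never created.
        Path missing = Files.createTempDirectory("jrs-demo").resolve("job-result-store");
        System.out.println(listOrEmpty(missing)); // the missing directory is treated as empty
    }
}
```

Whether swallowing the error is the right call for the real store is a separate question, since it could also hide a misconfigured path or endpoint.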