Hi amenreet,

According to the error message, I think you can log in to the JM pod after it starts and check the access permissions for the directory `file:///opt/flink/pm/ha`.

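A rough sketch of that check follows, assuming a Python interpreter is available in the pod or on another machine that mounts the same NFS share (the Flink image may not ship one), and that it runs as the same user as the Flink process. The function name `check_storage_dir` and the report keys are made up for illustration and are not part of Flink. It converts the `file://` URI to a path, reports the permission bits, and then tries to create a temporary file, since permission checks alone can be misleading on network filesystems.

```python
import os
import tempfile
from urllib.parse import urlparse


def check_storage_dir(uri: str) -> dict:
    """Report basic accessibility facts for a file:// storage directory.

    Hypothetical helper for illustration only; not a Flink API.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got: {uri}")
    path = parsed.path

    report = {
        "path": path,
        "exists": os.path.isdir(path),
        # Read + execute (traverse) permission for the current user.
        "readable": os.access(path, os.R_OK | os.X_OK),
        "writable": os.access(path, os.W_OK),
        "can_create_file": False,
    }

    if report["exists"]:
        try:
            # os.access() may not reflect what an NFS server allows,
            # so also try to create (and automatically remove) a file.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            report["can_create_file"] = True
        except OSError:
            pass
    return report


if __name__ == "__main__":
    # The storageDir value from the configuration quoted in this thread.
    print(check_storage_dir("file:///opt/flink/pm/ha"))
```

If `exists` is false or `can_create_file` is false right after the first deployment, that would point at the mount or its ownership rather than at the HA settings themselves.
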
Best,
Shammon FY

On Fri, Jul 7, 2023 at 6:04 PM amenreet sodhi <amenso...@gmail.com> wrote:

> Hi Shammon,
>
> I am using an external NFS mount which gets mounted at the path
> /opt/flink/pm/, and the path mentioned there refers to that mount,
> so it is not a local file. Could there be any other configuration issue?
>
> Thanks
> Regards
> Amenreet Singh Sodhi
>
> On Fri, Jul 7, 2023 at 2:00 PM Shammon FY <zjur...@gmail.com> wrote:
>
>> Hi amenreet,
>>
>> Maybe you can try using HDFS or S3 for `high-availability.storageDir`.
>> I found that your current job is using a local file path, which starts
>> with `file:///`.
>>
>> Best,
>> Shammon FY
>>
>>
>> On Fri, Jul 7, 2023 at 4:20 PM amenreet sodhi <amenso...@gmail.com>
>> wrote:
>>
>>> Hi All,
>>> I am deploying a Flink cluster on Kubernetes in HA mode. I noticed that
>>> whenever I deploy the Flink cluster for the first time on a K8s cluster,
>>> it is not able to populate the cluster ConfigMap, and because of this the
>>> JM fails with the following exception:
>>>
>>> 2023-07-06 16:46:11,428 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Fatal error occurred in the cluster entrypoint.
>>> java.util.concurrent.CompletionException: java.lang.IllegalStateException: The base directory of the JobResultStore isn't accessible. No dirty JobResults can be restored.
>>>     at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314) ~[?:?]
>>>     at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319) [?:?]
>>>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702) [?:?]
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>>>     at java.lang.Thread.run(Thread.java:834) [?:?]
>>> Caused by: java.lang.IllegalStateException: The base directory of the JobResultStore isn't accessible. No dirty JobResults can be restored.
>>>     at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.highavailability.FileSystemJobResultStore.getDirtyResultsInternal(FileSystemJobResultStore.java:182) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.withReadLock(AbstractThreadsafeJobResultStore.java:118) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.highavailability.AbstractThreadsafeJobResultStore.getDirtyResults(AbstractThreadsafeJobResultStore.java:100) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResults(SessionDispatcherLeaderProcess.java:194) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198) ~[event_executor-1.1.20.jar:?]
>>>     at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.getDirtyJobResultsIfRunning(SessionDispatcherLeaderProcess.java:188) ~[event_executor-1.1.20.jar:?]
>>>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) ~[?:?]
>>>
>>> Once we reinstall or run helm upgrade, this exception goes away. How can
>>> this be resolved? Is any additional configuration required?
>>>
>>> I am using the following configuration for HA:
>>>
>>> high-availability.storageDir: file:///opt/flink/pm/ha
>>> kubernetes.cluster-id: {{ include "fullname" . }}-cluster-{{ now | date "20060102150405" }}
>>> high-availability.jobmanager.port: 6123
>>> high-availability.type: kubernetes
>>> high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>>> kubernetes.namespace: {{ .Release.Namespace }}
>>>
>>> Thanks
>>>
>>> Regards
>>> Amenreet Singh Sodhi