Hey, I've been struggling with this problem for some days now and it's driving me crazy.
I have a standalone Kubernetes Flink (1.12.5) deployment running in application cluster mode.

*The problem*
I am getting a NullPointerException when specifying the FQN of the Kubernetes HA services factory class, i.e. *org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory*

What other configuration options, besides the ones specified here (<https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions>), might I be missing?

*Details*
* We are using a custom image with flink: 1.12 <https://hub.docker.com/layers/flink/library/flink/1.12/images/sha256-4b4290888e30d27a28517bac3b1678674cd4b17aa7b8329969d1d12fcdf68f02?context=explore> as the base image.

flink-conf.yaml (thought this may be useful):

    flink-conf.yaml: |+
      jobmanager.rpc.address: {{ $fullName }}-jobmanager
      jobmanager.rpc.port: 6123
      jobmanager.memory.process.size: 1600m
      taskmanager.numberOfTaskSlots: 2
      taskmanager.rpc.port: 6122
      taskmanager.memory.process.size: 1728m
      blob.server.port: 6124
      queryable-state.proxy.ports: 6125
      parallelism.default: 2
      scheduler-mode: reactive
      execution.checkpointing.interval: 10s
      restart-strategy: fixed-delay
      restart-strategy.fixed-delay.attempts: 10
      high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
      high-availability.cluster-id: /{{ $fullName }}
      high-availability.storageDir: s3://redacted-flink-dev/recovery

*Snippet of Job Manager pod log*

    2021-08-25 06:14:20,652 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneApplicationClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the ha services from the instantiated HighAvailabilityServicesFactory org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178)
        at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
        at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585)
        at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:85)
    Caused by: java.lang.NullPointerException
        at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
        at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:85)
        at org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory.fromConfiguration(FlinkKubeClientFactory.java:106)
        at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
        at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265)
        ... 9 more

--
Many thanks,
Jonas
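
PS: in case it helps narrow things down, here is a rough sketch of how I am sanity-checking the rendered config. It parses the flat `key: value` lines and reports which keys from a list are absent. The key list is only my assumption from reading the Kubernetes HA docs, which mention `kubernetes.cluster-id` rather than `high-availability.cluster-id`; I have not confirmed that this is what triggers the NullPointerException on 1.12.5. The names `parse_flink_conf`, `missing_keys` and `ASSUMED_HA_KEYS` are made up for this check, and the sample uses a placeholder in place of the templated `{{ $fullName }}`.

```python
def parse_flink_conf(text):
    """Parse flat 'key: value' lines; blank lines and '#' comments are skipped."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first ': ' only, so values such as s3://... stay intact.
        key, sep, value = line.partition(": ")
        if not sep:
            continue
        conf[key.strip()] = value.strip()
    return conf


# Assumption: keys I believe the Kubernetes HA setup needs (taken from the docs,
# not verified against the 1.12.5 source).
ASSUMED_HA_KEYS = (
    "kubernetes.cluster-id",
    "high-availability",
    "high-availability.storageDir",
)


def missing_keys(conf, required=ASSUMED_HA_KEYS):
    """Return the required keys that are not present in the parsed config."""
    return [key for key in required if key not in conf]


# HA-related part of my config, with a placeholder release name.
SAMPLE = """\
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.cluster-id: /my-release
high-availability.storageDir: s3://redacted-flink-dev/recovery
"""

if __name__ == "__main__":
    print(missing_keys(parse_flink_conf(SAMPLE)))
```

If that assumption is right, the check would flag `kubernetes.cluster-id` as the one key my config never sets, but I would appreciate confirmation from someone who knows the HA code path.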