Hi Jonas,

this exception is raised because "kubernetes.cluster-id" [1] is not set.
I'd also recommend setting the "kubernetes.namespace" option, unless you're
using the "default" namespace.
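
As a rough sketch, the HA-related part of your flink-conf.yaml could look
something like the fragment below. The cluster id "my-flink-cluster" and the
namespace "my-namespace" are placeholders I made up for illustration, so
substitute your own values; the other two lines are taken from the config
you posted.

```yaml
# Assumed placeholder value: use an id that is unique per Flink cluster.
kubernetes.cluster-id: my-flink-cluster
# Assumed placeholder value: only needed when not running in "default".
kubernetes.namespace: my-namespace
# Unchanged from the configuration in the original message.
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://redacted-flink-dev/recovery
```

Note that "kubernetes.cluster-id" is a separate option from the
"high-availability.cluster-id" you already have in your config.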

I've filed FLINK-23961 [2] so we provide a more descriptive warning for this
issue next time ;)

[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#example-configuration
[2] https://issues.apache.org/jira/browse/FLINK-23961

Best,
D.

On Wed, Aug 25, 2021 at 8:48 AM jonas eyob <jonas.e...@gmail.com> wrote:

> Hey, I've been struggling with this problem now for some days - driving me
> crazy.
>
> I have a standalone kubernetes Flink (1.12.5) using an application cluster
> mode approach.
>
> *The problem*
> I am getting a NullPointerException when specifying the FQN of the
> Kubernetes HA Service Factory class
> i.e.
> *org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory*
>
> What other configurations besides the ones specified (here
> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions>)
> may I be missing?
>
> Details:
> * we are using a custom image using the flink: 1.12
> <https://hub.docker.com/layers/flink/library/flink/1.12/images/sha256-4b4290888e30d27a28517bac3b1678674cd4b17aa7b8329969d1d12fcdf68f02?context=explore>
> as base image
>
> flink-conf.yaml -- thought this may be useful?
> flink-conf.yaml: |+
> jobmanager.rpc.address: {{ $fullName }}-jobmanager
> jobmanager.rpc.port: 6123
> jobmanager.memory.process.size: 1600m
> taskmanager.numberOfTaskSlots: 2
> taskmanager.rpc.port: 6122
> taskmanager.memory.process.size: 1728m
> blob.server.port: 6124
> queryable-state.proxy.ports: 6125
> parallelism.default: 2
> scheduler-mode: reactive
> execution.checkpointing.interval: 10s
> restart-strategy: fixed-delay
> restart-strategy.fixed-delay.attempts: 10
> high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
> high-availability.cluster-id: /{{ $fullName }}
> high-availability.storageDir: s3://redacted-flink-dev/recovery
>
> *Snippet of Job Manager pod log*
> 2021-08-25 06:14:20,652 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneApplicationClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the ha services from the instantiated HighAvailabilityServicesFactory org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268)
> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178)
> at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585)
> at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:85)
> Caused by: java.lang.NullPointerException
> at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
> at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:85)
> at org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory.fromConfiguration(FlinkKubeClientFactory.java:106)
> at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265)
> ... 9 more
> .
>
> --
> Many thanks,
> Jonas
>
