The exception is showing up in both the TM and JM. This, however, only seemed to appear when running on my local Kubernetes setup.

> I'd also recommend setting the "kubernetes.namespace" option, unless
> you're using the "default" namespace.
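For reference, the HA-related keys that advice points at can be sketched as follows. This is a hedged example only: "my-cluster", "my-namespace" and "s3://my-bucket" are placeholders rather than values from the actual deployment, and the shell wrapper just writes the fragment to a temp file so it can be inspected.

```shell
# Sketch of the HA-related flink-conf.yaml keys this thread converges on
# (Flink 1.12, standalone Kubernetes). All values below are placeholders.
conf=$(mktemp)
cat > "$conf" <<'EOF'
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3://my-bucket/recovery
kubernetes.cluster-id: my-cluster
kubernetes.namespace: my-namespace
EOF
# The NPE traced back to missing kubernetes.* keys; sanity-check they are set.
grep -c '^kubernetes\.' "$conf"   # -> 2
```

Note the working key is `kubernetes.cluster-id`, not the `high-availability.cluster-id` spelling that tripped things up earlier in the thread.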
Yes, good point - I now see why that was needed.

On Wed, 25 Aug 2021 at 11:37, David Morávek <d...@apache.org> wrote:

> Hi Jonas,
>
> Where does the exception pop up? In the job driver, TM, JM? You need to
> make sure that the plugin folder is set up for all of them, because they
> all may need to access S3 at some point.
>
> Best,
> D.
>
> On Wed, Aug 25, 2021 at 11:54 AM jonas eyob <jonas.e...@gmail.com> wrote:
>
>> Hey Thms,
>>
>> Tried the s3p:// option as well - same issue.
>>
>> > Also check if your user that executes the process is able to read the
>> > jars.
>>
>> Not exactly sure how to do that? The user "flink" in the docker image is
>> able to read the contents, as far as I understand. But maybe that's not
>> how I would check it?
>>
>> On Wed, 25 Aug 2021 at 10:12, Thms Hmm <thms....@gmail.com> wrote:
>>
>>> Hey Jonas,
>>> You could also try the `s3p://` scheme to specify directly that Presto
>>> should be used. Also check if the user that executes the process is
>>> able to read the jars.
>>>
>>> On Wed, 25 Aug 2021 at 10:01, jonas eyob <jonas.e...@gmail.com> wrote:
>>>
>>>> Thanks David for the quick response!
>>>>
>>>> *face palm* - thanks a lot, that seems to have addressed the
>>>> NullPointerException issue.
>>>> May I also suggest that this [1] page be updated, since it says the
>>>> key is "high-availability.cluster-id"?
>>>>
>>>> This led me to another issue, however:
>>>> "org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
>>>> Could not find a file system implementation for scheme 's3'"
>>>>
>>>> The section [2] describes how I can either use environment variables,
>>>> e.g.
>>>> ENABLE_BUILT_IN_PLUGINS, or bake them into the image by copying the
>>>> provided plugins from opt/ into /plugins.
>>>>
>>>> Dockerfile (snippet):
>>>> # Configure Flink-provided plugin for S3 access
>>>> RUN mkdir -p $FLINK_HOME/plugins/s3-fs-presto
>>>> RUN cp $FLINK_HOME/opt/flink-s3-fs-presto-*.jar $FLINK_HOME/plugins/s3-fs-presto/
>>>>
>>>> When bashing into the image:
>>>>
>>>> flink@dd86717a92a0:~/plugins/s3-fs-presto$ ls
>>>> flink-s3-fs-presto-1.12.5.jar
>>>>
>>>> Any idea?
>>>>
>>>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#high-availability
>>>> [2] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins
>>>>
>>>> On Wed, 25 Aug 2021 at 08:00, David Morávek <d...@apache.org> wrote:
>>>>
>>>>> Hi Jonas,
>>>>>
>>>>> This exception is raised because "kubernetes.cluster-id" [1] is not
>>>>> set. I'd also recommend setting the "kubernetes.namespace" option,
>>>>> unless you're using the "default" namespace.
>>>>>
>>>>> I've filed FLINK-23961 [2] so we provide a more descriptive warning
>>>>> for this issue next time ;)
>>>>>
>>>>> [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/#example-configuration
>>>>> [2] https://issues.apache.org/jira/browse/FLINK-23961
>>>>>
>>>>> Best,
>>>>> D.
>>>>>
>>>>> On Wed, Aug 25, 2021 at 8:48 AM jonas eyob <jonas.e...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hey, I've been struggling with this problem for some days now -
>>>>>> it's driving me crazy.
>>>>>>
>>>>>> I have a standalone Kubernetes Flink (1.12.5) setup using the
>>>>>> application cluster mode approach.
>>>>>>
>>>>>> *The problem*
>>>>>> I am getting a NullPointerException when specifying the FQN of the
>>>>>> Kubernetes HA Service Factory class, i.e.
>>>>>> *org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory*
>>>>>>
>>>>>> What other configurations, besides the ones specified here
>>>>>> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/resource-providers/standalone/kubernetes/#common-cluster-resource-definitions>,
>>>>>> may I be missing?
>>>>>>
>>>>>> Details:
>>>>>> * We are using a custom image with flink:1.12
>>>>>> <https://hub.docker.com/layers/flink/library/flink/1.12/images/sha256-4b4290888e30d27a28517bac3b1678674cd4b17aa7b8329969d1d12fcdf68f02?context=explore>
>>>>>> as the base image.
>>>>>>
>>>>>> flink-conf.yaml - thought this may be useful:
>>>>>> flink-conf.yaml: |+
>>>>>>   jobmanager.rpc.address: {{ $fullName }}-jobmanager
>>>>>>   jobmanager.rpc.port: 6123
>>>>>>   jobmanager.memory.process.size: 1600m
>>>>>>   taskmanager.numberOfTaskSlots: 2
>>>>>>   taskmanager.rpc.port: 6122
>>>>>>   taskmanager.memory.process.size: 1728m
>>>>>>   blob.server.port: 6124
>>>>>>   queryable-state.proxy.ports: 6125
>>>>>>   parallelism.default: 2
>>>>>>   scheduler-mode: reactive
>>>>>>   execution.checkpointing.interval: 10s
>>>>>>   restart-strategy: fixed-delay
>>>>>>   restart-strategy.fixed-delay.attempts: 10
>>>>>>   high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
>>>>>>   high-availability.cluster-id: /{{ $fullName }}
>>>>>>   high-availability.storageDir: s3://redacted-flink-dev/recovery
>>>>>>
>>>>>> *Snippet of Job Manager pod log*
>>>>>> 2021-08-25 06:14:20,652 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneApplicationClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the ha services from the instantiated HighAvailabilityServicesFactory org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.
>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:268)
>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:124)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178)
>>>>>> at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175)
>>>>>> at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585)
>>>>>> at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:85)
>>>>>> Caused by: java.lang.NullPointerException
>>>>>> at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59)
>>>>>> at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.<init>(Fabric8FlinkKubeClient.java:85)
>>>>>> at org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory.fromConfiguration(FlinkKubeClientFactory.java:106)
>>>>>> at org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory.createHAServices(KubernetesHaServicesFactory.java:37)
>>>>>> at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createCustomHAServices(HighAvailabilityServicesUtils.java:265)
>>>>>> ... 9 more
>>>>>> --
>>>>>> Many thanks,
>>>>>> Jonas

--
*With kind regards*
*Jonas Eyob*
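As a footnote to Thms's "check if your user ... is able to read the jars" suggestion above: a readability probe could look like the sketch below. `check_jar` is a hypothetical helper, not a Flink or docker-image tool; in the real container you would run it as the "flink" user against $FLINK_HOME/plugins/s3-fs-presto/flink-s3-fs-presto-*.jar, while the demo here uses a temp file as a stand-in.

```shell
# Hypothetical helper for "can the current user read this jar?".
# Not part of Flink; names and paths are assumptions for illustration.
check_jar() {
  if [ -r "$1" ]; then
    echo "readable: $1"
  else
    echo "NOT readable: $1"
  fi
}

# Demo against a temp file standing in for flink-s3-fs-presto-1.12.5.jar;
# mktemp creates it mode 0600 for the current user, so it reports readable.
jar=$(mktemp)
check_jar "$jar"
```

Inside the pod, wrapping the same `[ -r ... ]` test in something like `su -s /bin/sh flink -c '...'` would exercise it with the identity that actually runs the Flink process.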