So I'm able to put this to debug mode within the flink jobmanager and taskmanager and at the moment, the taskmanager is complaining about these: ``
2019-12-23 22:43:06,892 DEBUG org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryLoader - Unable to load the library 'org_apache_beam_vendor_grpc_v1p21p0_netty_transport_native_epoll_x86_64', trying other loading mechanism. java.lang.UnsatisfiedLinkError: no org_apache_beam_vendor_grpc_v1p21p0_netty_transport_native_epoll_x86_64 in java.library.path at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860) at java.lang.Runtime.loadLibrary0(Runtime.java:870) at java.lang.System.loadLibrary(System.java:1122) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:369) at java.security.AccessController.doPrivileged(Native Method) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:361) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:339) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:136) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:198) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.channel.epoll.Native.<clinit>(Native.java:61) at org.apache.beam.vendor.grpc.v1p21p0.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:38) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) `` And the other error is this: `` 2019/12/23 22:43:09 Executing: python -m apache_beam.runners.worker.sdk_worker_main 2019/12/23 22:43:09 Python exited: exec: "python": executable file not found in $PATH `` This is more towards complaining that the exec command for the sdk is complaining about python not existing ( my current environment that I'm running this on, in both the job manager server and the python code itself are both inside the jobmanager...I have the flink jobmanager container and the flink taskmanager with base image as flink version 1.8.3 ) Just to verify that the preferred python version is python3? So re-pointing `python` to python3 was a success! I did run into another problem where I sink the transformed data into a local file but I think I'll create a separate thread on that. In production, I don't really want to sink into a local file but rather into some distributed storage, so this is a non-concern atm. My last question is...do I need to worry about that java package issue? If not, then that's it for now, thanks! On Mon, Dec 23, 2019 at 7:55 AM Kyle Weaver <[email protected]> wrote: > > It will be great if you can get the error from the failing process. > > Note that you will have to set the log level to DEBUG to get output from > the process. > > On Fri, Dec 20, 2019 at 6:23 PM Ankur Goenka <[email protected]> wrote: > >> Hi Matthew, >> >> It will be great if you can get the error from the failing process. >> My suspicions are: >> - Flink task manager container access to gcs location. >> - As you are running Flink task manager in a container >> "/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\" >> path is not applicable. >> - Boot command is not run from a valid activated python environment. >> - You can use >> https://github.com/apache/beam/blob/605c59b383a77b117bb6b07021e8c41cb13b438f/sdks/python/test-suites/portable/py2/build.gradle#L179 >> to create an script for process worker. >> - Alternatively, you can install all the dependencies in the python >> environment of the taskmanager container. >> >> Thanks, >> Ankur >> >> On Fri, Dec 20, 2019 at 1:21 PM Matthew Rafael Magsombol < >> [email protected]> wrote: >> >>> ( Cross posting from stackoverflow: >>> https://stackoverflow.com/questions/59429897/beam-running-on-flink-with-python-sdk-and-using-google-cloud-storage-for-artifac?fbclid=IwAR2M6pcpmdGehgJjU36xkjaFxvHP0iuhi6UYDqO9jA8t7W6jlF9pWx3VGt4 >>> ...tried >>> sending before, but someone mentioned that I should subscribe first to have >>> my email send ) >>> >>> I'm trying to set up beam running on flink with python sdk locally using >>> docker. >>> Currently, I just have a single flink job manager in a docker container, >>> and a single flink task manager in a docker container with the beam job >>> server running on the job manager container. >>> The pipeline options are as follows: >>> ``` >>> PipelineOptions([ >>> "--runner=PortableRunner", >>> "--environment_type=PROCESS", >>> "--environment_config={\"command\": >>> \"/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\"}", >>> "--job_endpoint=localhost:8099"]) >>> ``` >>> And the job server is ran as follows with artifacts directory in gcs: >>> java -jar jar_lib/beam-runners-flink-1.8-job-server-2.16.0.jar >>> --flink-master-url localhost:8081 --artifacts-dir >>> gs://<gcs_location>/matthew_magsombol/ --job-port 8099 >>> The behavior is that the job gets submitted to the flink cluster, but it >>> freezes up and eventually runs into an illegal state and throwing an >>> exception. >>> >>> Flink Jobmanager Logs: >>> ``` >>> java.lang.Exception: The user defined 'open()' method caused an >>> exception: java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> 2019-12-20 16:55:24,726 INFO >>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job >>> BeamApp-flink-1220165317-c558fc1d (f7a4b06e1ca475f4b935b84dc0c8e186) >>> switched from state RUNNING to FAILING. >>> java.lang.Exception: The user defined 'open()' method caused an >>> exception: java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> ``` >>> Flink taskmanager logs: >>> ``` >>> 2019-12-20 17:01:33,095 ERROR >>> org.apache.flink.runtime.operators.BatchTask - Error in >>> task code: CHAIN MapPartition (MapPartition at >>> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), >>> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >>> java.lang.Exception: The user defined 'open()' method caused an >>> exception: java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> 2019-12-20 17:01:33,095 ERROR >>> org.apache.flink.runtime.operators.BatchTask - Error in >>> task code: CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> >>> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >>> java.lang.Exception: The user defined 'open()' method caused an >>> exception: java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> 2019-12-20 17:01:33,097 INFO >>> org.apache.flink.runtime.taskmanager.Task - CHAIN >>> MapPartition (MapPartition at >>> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), >>> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >>> (2a610de72b94e612a977136c7381e6e2) switched from RUNNING to FAILED. >>> java.lang.Exception: The user defined 'open()' method caused an >>> exception: java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> ``` >>> Beam Jobserver logs: >>> ``` >>> [flink-runner-job-invoker] ERROR >>> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error >>> during job invocation >>> BeamApp-flink-1220165928-dbc405d9_f898fc97-9a7c-4f6a-988f-8ee74bd7b9a7. >>> org.apache.flink.client.program.ProgramInvocationException: Job >>> failed. (JobID: 50c07fcc2f2b1dde3a7073f430e4aa50) >>> at >>> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268) >>> at >>> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) >>> at >>> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:471) >>> at >>> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:446) >>> at >>> org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:210) >>> at >>> org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187) >>> at >>> org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173) >>> at >>> org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:191) >>> at >>> org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:104) >>> at >>> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:80) >>> at >>> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) >>> at >>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) >>> at java.lang.Thread.run(Thread.java:748) >>> Caused by: org.apache.flink.runtime.client.JobExecutionException: >>> Job execution failed. >>> at >>> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146) >>> at >>> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265) >>> ... 16 more >>> Caused by: java.lang.Exception: The user defined 'open()' method >>> caused an exception: java.lang.IllegalStateException: Process died with >>> exit code 1 >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >>> at >>> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >>> ... 1 more >>> Caused by: >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >>> java.lang.IllegalStateException: Process died with exit code 1 >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >>> at >>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >>> at >>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >>> at >>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >>> at >>> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >>> ... 3 more >>> Caused by: java.lang.IllegalStateException: Process died with exit >>> code 1 >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >>> at >>> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >>> at >>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >>> at >>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >>> ... 15 more >>> Dec 20, 2019 5:01:33 PM >>> com.google.api.client.googleapis.services.AbstractGoogleClient <init> >>> WARNING: Application name is not set. Call >>> Builder#setApplicationName. >>> [flink-runner-job-invoker] INFO >>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService >>> - Manifest at >>> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/MANIFEST >>> has 1 artifact locations >>> [pool-11-thread-1] INFO >>> org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion >>> of file >>> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/ >>> which already does not exist: >>> {"code":404,"errors":[{"domain":"global","message":"No such object: >>> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/","reason":"notFound"}],"message":"No >>> such object: >>> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/"} >>> [pool-13-thread-1] INFO >>> org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion >>> of file >>> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ >>> which already does not exist: >>> {"code":404,"errors":[{"domain":"global","message":"No such object: >>> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/","reason":"notFound"}],"message":"No >>> such object: >>> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/"} >>> [flink-runner-job-invoker] INFO >>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService >>> - Removed dir >>> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ >>> ``` >>> Midway through the run, I've confirmed that the beam job server IS able >>> to create a job directory in the gcs location ( it does auto delete when >>> the run ends ). >>> ( My previous attempts at this, the error led me to an authentication >>> issue with gcs, and that was mainly due to gcp creds not being accessible >>> by flink job manager and flink task manager...already fixed it, so it is no >>> longer spitting that error out...so I don't think this might be a gcp >>> credentials issue?) >>> I've followed most of Yu Watanabe's email threads: >>> https://lists.apache.org/thread.html/2ca72cb1946f428aabeb2b8573ba8e3270bbf34b97107c53d26fbf76%40%3Cuser.beam.apache.org%3E >>> ( with a slight difference in artifact directory ). >>> >>> Thanks! >>> >>
