Hi Matthew, It will be great if you can get the error from the failing process. My suspicions are: - Flink task manager container access to gcs location. - As you are running Flink task manager in a container "/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\" path is not applicable. - Boot command is not run from a valid activated python environment. - You can use https://github.com/apache/beam/blob/605c59b383a77b117bb6b07021e8c41cb13b438f/sdks/python/test-suites/portable/py2/build.gradle#L179 to create an script for process worker. - Alternatively, you can install all the dependencies in the python environment of the taskmanager container.
Thanks, Ankur On Fri, Dec 20, 2019 at 1:21 PM Matthew Rafael Magsombol < [email protected]> wrote: > ( Cross posting from stackoverflow: > https://stackoverflow.com/questions/59429897/beam-running-on-flink-with-python-sdk-and-using-google-cloud-storage-for-artifac?fbclid=IwAR2M6pcpmdGehgJjU36xkjaFxvHP0iuhi6UYDqO9jA8t7W6jlF9pWx3VGt4 > ...tried > sending before, but someone mentioned that I should subscribe first to have > my email send ) > > I'm trying to set up beam running on flink with python sdk locally using > docker. > Currently, I just have a single flink job manager in a docker container, > and a single flink task manager in a docker container with the beam job > server running on the job manager container. > The pipeline options are as follows: > ``` > PipelineOptions([ > "--runner=PortableRunner", > "--environment_type=PROCESS", > "--environment_config={\"command\": > \"/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\"}", > "--job_endpoint=localhost:8099"]) > ``` > And the job server is ran as follows with artifacts directory in gcs: > java -jar jar_lib/beam-runners-flink-1.8-job-server-2.16.0.jar > --flink-master-url localhost:8081 --artifacts-dir > gs://<gcs_location>/matthew_magsombol/ --job-port 8099 > The behavior is that the job gets submitted to the flink cluster, but it > freezes up and eventually runs into an illegal state and throwing an > exception. > > Flink Jobmanager Logs: > ``` > java.lang.Exception: The user defined 'open()' method caused an > exception: java.lang.IllegalStateException: Process died with exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > 2019-12-20 16:55:24,726 INFO > org.apache.flink.runtime.executiongraph.ExecutionGraph - Job > BeamApp-flink-1220165317-c558fc1d (f7a4b06e1ca475f4b935b84dc0c8e186) > switched from state RUNNING to FAILING. > java.lang.Exception: The user defined 'open()' method caused an > exception: java.lang.IllegalStateException: Process died with exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > ``` > Flink taskmanager logs: > ``` > 2019-12-20 17:01:33,095 ERROR > org.apache.flink.runtime.operators.BatchTask - Error in > task code: CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) > java.lang.Exception: The user defined 'open()' method caused an > exception: java.lang.IllegalStateException: Process died with exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > 2019-12-20 17:01:33,095 ERROR > org.apache.flink.runtime.operators.BatchTask - Error in > task code: CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> > FlatMap (FlatMap at ExtractOutput[0]) (1/1) > java.lang.Exception: The user defined 'open()' method caused an > exception: java.lang.IllegalStateException: Process died with exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > 2019-12-20 17:01:33,097 INFO > org.apache.flink.runtime.taskmanager.Task - CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) > (2a610de72b94e612a977136c7381e6e2) switched from RUNNING to FAILED. > java.lang.Exception: The user defined 'open()' method caused an > exception: java.lang.IllegalStateException: Process died with exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > ``` > Beam Jobserver logs: > ``` > [flink-runner-job-invoker] ERROR > org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error > during job invocation > BeamApp-flink-1220165928-dbc405d9_f898fc97-9a7c-4f6a-988f-8ee74bd7b9a7. > org.apache.flink.client.program.ProgramInvocationException: Job > failed. (JobID: 50c07fcc2f2b1dde3a7073f430e4aa50) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:471) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:446) > at > org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:210) > at > org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187) > at > org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173) > at > org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:191) > at > org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:104) > at > org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:80) > at > org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed. > at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146) > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265) > ... 16 more > Caused by: java.lang.Exception: The user defined 'open()' method > caused an exception: java.lang.IllegalStateException: Process died with > exit code 1 > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > ... 1 more > Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.lang.IllegalStateException: Process died with exit code 1 > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) > at > org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) > at > org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) > at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) > at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > ... 3 more > Caused by: java.lang.IllegalStateException: Process died with exit > code 1 > at > org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) > at > org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) > at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) > at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) > ... 15 more > Dec 20, 2019 5:01:33 PM > com.google.api.client.googleapis.services.AbstractGoogleClient <init> > WARNING: Application name is not set. Call Builder#setApplicationName. > [flink-runner-job-invoker] INFO > org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService > - Manifest at > gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/MANIFEST > has 1 artifact locations > [pool-11-thread-1] INFO > org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion > of file > gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/ > which already does not exist: > {"code":404,"errors":[{"domain":"global","message":"No such object: > <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/","reason":"notFound"}],"message":"No > such object: > <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/"} > [pool-13-thread-1] INFO > org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion > of file > gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ > which already does not exist: > {"code":404,"errors":[{"domain":"global","message":"No such object: > <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/","reason":"notFound"}],"message":"No > such object: > <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/"} > [flink-runner-job-invoker] INFO > org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService > - Removed dir > gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ > ``` > Midway through the run, I've confirmed that the beam job server IS able to > create a job directory in the gcs location ( it does auto delete when the > run ends ). > ( My previous attempts at this, the error led me to an authentication > issue with gcs, and that was mainly due to gcp creds not being accessible > by flink job manager and flink task manager...already fixed it, so it is no > longer spitting that error out...so I don't think this might be a gcp > credentials issue?) > I've followed most of Yu Watanabe's email threads: > https://lists.apache.org/thread.html/2ca72cb1946f428aabeb2b8573ba8e3270bbf34b97107c53d26fbf76%40%3Cuser.beam.apache.org%3E > ( with a slight difference in artifact directory ). > > Thanks! >
