> It will be great if you can get the error from the failing process. Note that you will have to set the log level to DEBUG to get output from the process.
On Fri, Dec 20, 2019 at 6:23 PM Ankur Goenka <[email protected]> wrote: > Hi Matthew, > > It will be great if you can get the error from the failing process. > My suspicions are: > - Flink task manager container access to gcs location. > - As you are running Flink task manager in a container > "/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\" > path is not applicable. > - Boot command is not run from a valid activated python environment. > - You can use > https://github.com/apache/beam/blob/605c59b383a77b117bb6b07021e8c41cb13b438f/sdks/python/test-suites/portable/py2/build.gradle#L179 > to create an script for process worker. > - Alternatively, you can install all the dependencies in the python > environment of the taskmanager container. > > Thanks, > Ankur > > On Fri, Dec 20, 2019 at 1:21 PM Matthew Rafael Magsombol < > [email protected]> wrote: > >> ( Cross posting from stackoverflow: >> https://stackoverflow.com/questions/59429897/beam-running-on-flink-with-python-sdk-and-using-google-cloud-storage-for-artifac?fbclid=IwAR2M6pcpmdGehgJjU36xkjaFxvHP0iuhi6UYDqO9jA8t7W6jlF9pWx3VGt4 >> ...tried >> sending before, but someone mentioned that I should subscribe first to have >> my email send ) >> >> I'm trying to set up beam running on flink with python sdk locally using >> docker. >> Currently, I just have a single flink job manager in a docker container, >> and a single flink task manager in a docker container with the beam job >> server running on the job manager container. >> The pipeline options are as follows: >> ``` >> PipelineOptions([ >> "--runner=PortableRunner", >> "--environment_type=PROCESS", >> "--environment_config={\"command\": >> \"/beam_src_code/sdks/python/container/build/target/launcher/linux_amd64/boot\"}", >> "--job_endpoint=localhost:8099"]) >> ``` >> And the job server is ran as follows with artifacts directory in gcs: >> java -jar jar_lib/beam-runners-flink-1.8-job-server-2.16.0.jar >> --flink-master-url localhost:8081 --artifacts-dir >> gs://<gcs_location>/matthew_magsombol/ --job-port 8099 >> The behavior is that the job gets submitted to the flink cluster, but it >> freezes up and eventually runs into an illegal state and throwing an >> exception. >> >> Flink Jobmanager Logs: >> ``` >> java.lang.Exception: The user defined 'open()' method caused an >> exception: java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> 2019-12-20 16:55:24,726 INFO >> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job >> BeamApp-flink-1220165317-c558fc1d (f7a4b06e1ca475f4b935b84dc0c8e186) >> switched from state RUNNING to FAILING. >> java.lang.Exception: The user defined 'open()' method caused an >> exception: java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> ``` >> Flink taskmanager logs: >> ``` >> 2019-12-20 17:01:33,095 ERROR >> org.apache.flink.runtime.operators.BatchTask - Error in >> task code: CHAIN MapPartition (MapPartition at >> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), >> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >> java.lang.Exception: The user defined 'open()' method caused an >> exception: java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> 2019-12-20 17:01:33,095 ERROR >> org.apache.flink.runtime.operators.BatchTask - Error in >> task code: CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> >> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >> java.lang.Exception: The user defined 'open()' method caused an >> exception: java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> 2019-12-20 17:01:33,097 INFO >> org.apache.flink.runtime.taskmanager.Task - CHAIN >> MapPartition (MapPartition at >> [2]write/Write/WriteImpl/DoOnce/{FlatMap(<lambda at core.py:2468>), >> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/1) >> (2a610de72b94e612a977136c7381e6e2) switched from RUNNING to FAILED. >> java.lang.Exception: The user defined 'open()' method caused an >> exception: java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> ``` >> Beam Jobserver logs: >> ``` >> [flink-runner-job-invoker] ERROR >> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error >> during job invocation >> BeamApp-flink-1220165928-dbc405d9_f898fc97-9a7c-4f6a-988f-8ee74bd7b9a7. >> org.apache.flink.client.program.ProgramInvocationException: Job >> failed. (JobID: 50c07fcc2f2b1dde3a7073f430e4aa50) >> at >> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268) >> at >> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483) >> at >> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:471) >> at >> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:446) >> at >> org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:210) >> at >> org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187) >> at >> org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173) >> at >> org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:191) >> at >> org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:104) >> at >> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:80) >> at >> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) >> at java.lang.Thread.run(Thread.java:748) >> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job >> execution failed. >> at >> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146) >> at >> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265) >> ... 16 more >> Caused by: java.lang.Exception: The user defined 'open()' method >> caused an exception: java.lang.IllegalStateException: Process died with >> exit code 1 >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) >> at >> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) >> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) >> ... 1 more >> Caused by: >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: >> java.lang.IllegalStateException: Process died with exit code 1 >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:212) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:203) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:186) >> at >> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:42) >> at >> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:198) >> at >> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:130) >> at >> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) >> at >> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >> ... 3 more >> Caused by: java.lang.IllegalStateException: Process died with exit >> code 1 >> at >> org.apache.beam.runners.fnexecution.environment.ProcessManager$RunningProcess.isAliveOrThrow(ProcessManager.java:76) >> at >> org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory.createEnvironment(ProcessEnvironmentFactory.java:125) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:179) >> at >> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:163) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154) >> at >> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044) >> ... 15 more >> Dec 20, 2019 5:01:33 PM >> com.google.api.client.googleapis.services.AbstractGoogleClient <init> >> WARNING: Application name is not set. Call Builder#setApplicationName. >> [flink-runner-job-invoker] INFO >> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService >> - Manifest at >> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/MANIFEST >> has 1 artifact locations >> [pool-11-thread-1] INFO >> org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion >> of file >> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/ >> which already does not exist: >> {"code":404,"errors":[{"domain":"global","message":"No such object: >> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/","reason":"notFound"}],"message":"No >> such object: >> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/artifacts/"} >> [pool-13-thread-1] INFO >> org.apache.beam.sdk.extensions.gcp.util.GcsUtil - Ignoring failed deletion >> of file >> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ >> which already does not exist: >> {"code":404,"errors":[{"domain":"global","message":"No such object: >> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/","reason":"notFound"}],"message":"No >> such object: >> <gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/"} >> [flink-runner-job-invoker] INFO >> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService >> - Removed dir >> gs://<gcs_location>/matthew_magsombol/job_53b08ce8-cec3-4a39-ae60-95398eb04b43/ >> ``` >> Midway through the run, I've confirmed that the beam job server IS able >> to create a job directory in the gcs location ( it does auto delete when >> the run ends ). >> ( My previous attempts at this, the error led me to an authentication >> issue with gcs, and that was mainly due to gcp creds not being accessible >> by flink job manager and flink task manager...already fixed it, so it is no >> longer spitting that error out...so I don't think this might be a gcp >> credentials issue?) >> I've followed most of Yu Watanabe's email threads: >> https://lists.apache.org/thread.html/2ca72cb1946f428aabeb2b8573ba8e3270bbf34b97107c53d26fbf76%40%3Cuser.beam.apache.org%3E >> ( with a slight difference in artifact directory ). >> >> Thanks! >> >
