Are you able to SSH into your TaskManager nodes and run `docker run
hello-world` successfully?

On Wed, Oct 21, 2020 at 11:04 AM Mike Lo <[email protected]> wrote:

> Hi all,
>
> I'm following these instructions
> <https://beam.apache.org/documentation/runners/flink/> to try to run a
> Python SDK Beam job on a distributed Flink (Version: 1.8.2) cluster built
> using AWS EC2 instances. I've pulled Flink Job Service Docker image (Flink
> 1.8 <https://hub.docker.com/r/apache/beam_flink1.8_job_server>) on my
> JobManager node and installed Docker on the other TaskManager nodes (but
> did not pull the job service image). Has anyone else run into this problem
> before or has some clue as to what the problem is? Any input is
> appreciated. Below are some more details.
>
> I submit the job using this command:
>
>> python -m meeting_detector \
>> --input=/path/to/sample-data.csv \
>> --output=/path/to/result.csv \
>> --job_name=python-flink-job-test \
>> --job_endpoint=localhost:8099 \
>> --runner=PortableRunner \
>> --environment_type=DOCKER
>
>
> And this is the traceback:
>
> INFO:apache_beam.runners.portability.portable_runner:Job state changed to
>> STOPPED
>> INFO:apache_beam.runners.portability.portable_runner:Job state changed to
>> STARTING
>> INFO:apache_beam.runners.portability.portable_runner:Job state changed to
>> RUNNING
>> ERROR:root:java.io.IOException: error=2, No such file or directory
>> INFO:apache_beam.runners.portability.portable_runner:Job state changed to
>> FAILED
>> Traceback (most recent call last):
>>   File "/home/anaconda3/envs/fraud/lib/python3.6/runpy.py", line 193, in
>> _run_module_as_main
>>     "__main__", mod_spec)
>>   File "/home/anaconda3/envs/fraud/lib/python3.6/runpy.py", line 85, in
>> _run_code
>>     exec(code, run_globals)
>>   File "/data/jobs/meeting_detector.py", line 106, in <module>
>>     run()
>>   File "/data/jobs/meeting_detector.py", line 98, in run
>>     | 'WriteUserScoreSums' >> beam.io.WriteToText(args.output)
>>   File
>> "/home/anaconda3/envs/fraud/lib/python3.6/site-packages/apache_beam/pipeline.py",
>> line 556, in __exit__
>>     self.result.wait_until_finish()
>>   File
>> "/home/anaconda3/envs/fraud/lib/python3.6/site-packages/apache_beam/runners/portability/portable_runner.py",
>> line 547, in wait_until_finish
>>     raise self._runtime_exception
>> RuntimeError: Pipeline
>> python-flink-job-test_1faab21e-5b8a-4d62-ad67-879453b78d1e failed in state
>> FAILED: java.io.IOException: error=2, No such file or directory
>
>
> Below is the trace from the Docker Flink Job Service:
>
> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job
>> python-flink-job-test (7a03cd7dbe75dc23e2d00377ee9c2e1e) switched from
>> state FAILING to FAILED.
>> org.apache.flink.runtime.JobException: Recovery is suppressed by
>> NoRestartBackoffTimeStrategy
>> at
>> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
>> at
>> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:186)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:180)
>> at
>> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:496)
>> at
>> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
>> at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
>> at
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> Caused by: java.lang.Exception: The user defined 'open()' method caused
>> an exception: java.io.IOException: Cannot run program "docker": error=2, No
>> such file or directory
>> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:499)
>> at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by:
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>> java.io.IOException: Cannot run program "docker": error=2, No such file or
>> directory
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:38)
>> at
>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:199)
>> at
>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:137)
>> at
>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:495)
>> ... 4 more
>> Caused by: java.io.IOException: Cannot run program "docker": error=2, No
>> such file or directory
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:186)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:168)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runImage(DockerCommand.java:92)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:131)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:248)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:227)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>> ... 12 more
>> Caused by: java.io.IOException: error=2, No such file or directory
>> at java.lang.UNIXProcess.forkAndExec(Native Method)
>> at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>> at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>> ... 26 more
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the
>> results produced by task execution edead64f179a6403320feb57ec4ed0db.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph - Discarding the
>> results produced by task execution edead64f179a6403320feb57ec4ed0db.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.minicluster.MiniCluster - Shutting down Flink Mini
>> Cluster
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job
>> 7a03cd7dbe75dc23e2d00377ee9c2e1e reached globally terminal state FAILED.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shutting down
>> rest endpoint.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Stopping TaskExecutor
>> akka://flink/user/taskmanager_2.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Close ResourceManager
>> connection 13d8396855b02af0c138796ff9fe4120.
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager -
>> Closing TaskExecutor connection efe9264c-ea4a-49a1-a091-ad5266ed00da
>> because: The TaskExecutor is shutting down.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:15, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> ab8d50ec71c77142a07556c2c778b36b, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:13, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 924a7b5b69bf9ce045b90c3779cacb8b, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:14, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> f20cbec15d2eae109199b4a7f381ea6d, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:10, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 21724dc21eeb3b989cc61218b2ed0d33, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:3, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> dfb7dbaf74369b8a88c1d3f4458d99fa, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:7, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> ee95a6d3407ad30b8b92b96ed54ffd62, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:12, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> e6b5f2defe813f0d5fa0ac6864d6fabf, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:2, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 0ac66cecfbeec45c08aa7574713fe713, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:4, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 2065cd8bd1148e8872907ba77253f11c, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:11, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> e3410cb80b2049f07c6b28181fc0c8f7, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:1, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 81965b19791ad7d681d274e16a178f36, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:5, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> ffb3f5ec8f8af914156caa70aa1976f5, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:9, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 833c6fb3b038dbca13c9507b9c9fe3e1, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:0, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 00a67feee8fd2a9dae42fa21224e4e1f, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:6, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> bda8240ec3870e8408383388983b15f5, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl - Free slot
>> TaskSlot(index:8, state:ACTIVE, resource profile:
>> ResourceProfile{managedMemory=8.000mb (8388608 bytes),
>> networkMemory=4.000mb (4194304 bytes)}, allocationId:
>> 037a99d1e2c6e53b14a1131c86ec707c, jobId: 7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for
>> job python-flink-job-test(7a03cd7dbe75dc23e2d00377ee9c2e1e).
>> [flink-runner-job-invoker] ERROR
>> org.apache.beam.runners.jobsubmission.JobInvocation - Error during job
>> invocation python-flink-job-test_1faab21e-5b8a-4d62-ad67-879453b78d1e.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.taskexecutor.JobLeaderService - Stop job leader
>> service.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager -
>> Shutting down TaskExecutorLocalStateStoresManager.
>> java.util.concurrent.ExecutionException:
>> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
>> at
>> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
>> at
>> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:864)
>> at
>> org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:194)
>> at
>> org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:116)
>> at
>> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:83)
>> at
>> org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job
>> execution failed.
>> at
>> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
>> at
>> org.apache.flink.client.program.PerJobMiniClusterFactory$PerJobMiniClusterJobClient.lambda$getJobExecutionResult$2(PerJobMiniClusterFactory.java:175)
>> at
>> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
>> at
>> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
>> at
>> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>> at
>> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
>> at
>> org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:874)
>> at akka.dispatch.OnComplete.internal(Future.scala:264)
>> at akka.dispatch.OnComplete.internal(Future.scala:261)
>> at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
>> at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
>> at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>> at
>> org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
>> at
>> scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
>> at
>> scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
>> at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
>> at
>> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
>> at
>> akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
>> at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
>> at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
>> at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
>> at
>> akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
>> at
>> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
>> at
>> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>> at
>> akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
>> at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
>> at
>> akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
>> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
>> at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
>> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed
>> by NoRestartBackoffTimeStrategy
>> at
>> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:110)
>> at
>> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:76)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:192)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:186)
>> at
>> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:180)
>> at
>> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:496)
>> at
>> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:380)
>> at sun.reflect.GeneratedMethodAccessor27.invoke(Unknown Source)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:284)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:199)
>> at
>> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>> at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>> ... 4 more
>> Caused by: java.lang.Exception: The user defined 'open()' method caused
>> an exception: java.io.IOException: Cannot run program "docker": error=2, No
>> such file or directory
>> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:499)
>> at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:369)
>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by:
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>> java.io.IOException: Cannot run program "docker": error=2, No such file or
>> directory
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:447)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:432)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:299)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultExecutableStageContext.getStageBundleFactory(DefaultExecutableStageContext.java:38)
>> at
>> org.apache.beam.runners.fnexecution.control.ReferenceCountingExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingExecutableStageContextFactory.java:199)
>> at
>> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:137)
>> at
>> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:495)
>> ... 4 more
>> Caused by: java.io.IOException: Cannot run program "docker": error=2, No
>> such file or directory
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:186)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runShortCommand(DockerCommand.java:168)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerCommand.runImage(DockerCommand.java:92)
>> at
>> org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory.createEnvironment(DockerEnvironmentFactory.java:131)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:248)
>> at
>> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$1.load(DefaultJobBundleFactory.java:227)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
>> at
>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
>> ... 12 more
>> Caused by: java.io.IOException: error=2, No such file or directory
>> at java.lang.UNIXProcess.forkAndExec(Native Method)
>> at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>> at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>> ... 26 more
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Suspending
>> SlotPool.
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager
>> connection 13d8396855b02af0c138796ff9fe4120: JobManager is shutting down..
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Stopping
>> SlotPool.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager -
>> Disconnect job manager 
>> ac69ec5dfd8c2b5d90240c1af14c46de@akka://flink/user/jobmanager_3
>> for job 7a03cd7dbe75dc23e2d00377ee9c2e1e from the resource manager.
>> [ForkJoinPool.commonPool-worker-2] INFO
>> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Removing cache
>> directory /tmp/flink-web-ui
>> [ForkJoinPool.commonPool-worker-2] INFO
>> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Shut down
>> complete.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Shut
>> down cluster because application is in CANCELED, diagnostics
>> DispatcherResourceManagerComponent has been closed..
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent
>> - Closing components.
>> [flink-akka.actor.default-dispatcher-2] INFO
>> org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess -
>> Stopping SessionDispatcherLeaderProcess.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping
>> dispatcher akka://flink/user/dispatcher.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopping all
>> currently running jobs of dispatcher akka://flink/user/dispatcher.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.io.disk.FileChannelManagerImpl -
>> FileChannelManager removed spill file directory
>> /tmp/flink-io-d968bae7-ae23-4528-8e03-32a9173acdd0
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl -
>> Closing the SlotManager.
>> [flink-akka.actor.default-dispatcher-4] INFO
>> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl -
>> Suspending the SlotManager.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.io.network.NettyShuffleEnvironment - Shutting down
>> the network environment and its components.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.rest.handler.legacy.backpressure.BackPressureRequestCoordinator
>> - Shutting down back pressure request coordinator.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Stopped
>> dispatcher akka://flink/user/dispatcher.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.io.disk.FileChannelManagerImpl -
>> FileChannelManager removed spill file directory
>> /tmp/flink-netty-shuffle-9663a939-6608-4337-94b2-f32bfdca367d
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.taskexecutor.KvStateService - Shutting down the
>> kvState service and its components.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.taskexecutor.JobLeaderService - Stop job leader
>> service.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.filecache.FileCache - removed file cache directory
>> /tmp/flink-dist-cache-a640b6a1-c444-4886-8b6f-98aed046bf2b
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.taskexecutor.TaskExecutor - Stopped TaskExecutor
>> akka://flink/user/taskmanager_2.
>> [flink-akka.actor.default-dispatcher-5] INFO
>> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopping Akka RPC
>> service.
>> [flink-metrics-2] INFO
>> akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down
>> remote daemon.
>> [flink-metrics-2] INFO
>> akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut
>> down; proceeding with flushing remote transports.
>> [flink-metrics-2] INFO
>> akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
>> [flink-metrics-2] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService -
>> Stopping Akka RPC service.
>> [flink-metrics-2] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService -
>> Stopped Akka RPC service.
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.blob.PermanentBlobCache - Shutting down BLOB cache
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at
>> 0.0.0.0:42851
>> [flink-akka.actor.default-dispatcher-3] INFO
>> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
>
>
> Best,
> Mike
>

Reply via email to