[
https://issues.apache.org/jira/browse/FLINK-7351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16110777#comment-16110777
]
Nico Kruber edited comment on FLINK-7351 at 8/2/17 12:30 PM:
-------------------------------------------------------------
I did another run with the following snippet, and the failure is reproducible
(even locally):
{code}
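private static final Logger LOG =
    LoggerFactory.getLogger(JobClientActorRecoveryITCase.class);

// Repeats the existing test 1000 times in one run to provoke the intermittent failure.
@Test
public void testJobClientRecovery1000() throws Exception {
    for (int i = 0; i < 1000; ++i) {
        // Log the iteration so that a failure can be matched to its run in the log.
        LOG.info("starting test run {}", i);
        testJobClientRecovery();
    }
}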
{code}
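For anyone who wants to try the same repeat-until-failure approach outside the Flink test class, here is a minimal, self-contained sketch. The class {{RepeatUntilFailure}} and the helper {{firstFailingRun}} are hypothetical names made up for illustration and are not part of Flink; the simulated failure only stands in for a flaky test body.

```java
import java.util.concurrent.Callable;

public class RepeatUntilFailure {

    /**
     * Hypothetical helper (not part of Flink): runs {@code body} up to
     * {@code runs} times and returns the zero-based index of the first
     * run that throws, or -1 if every run completes normally.
     */
    public static int firstFailingRun(int runs, Callable<Void> body) {
        for (int i = 0; i < runs; i++) {
            try {
                body.call();
            } catch (Exception | AssertionError e) {
                // Stop at the first failure so the caller knows which run to inspect.
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a flaky test body: fails on its 4th invocation.
        int[] calls = {0};
        int failed = firstFailingRun(1000, () -> {
            if (++calls[0] == 4) {
                throw new IllegalStateException("simulated flaky failure");
            }
            return null;
        });
        System.out.println("first failing run: " + failed);
    }
}
```

Stopping at the first failing iteration keeps the relevant log section short, which is the same reason the snippet above logs the run index before each call.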
Then, {{mvn -Dlog.dir=logs
-Dlog4j.configuration=file://`pwd`/tools/log4j-travis.properties
-Dtest=JobClientActorRecoveryITCase#testJobClientRecovery -DfailIfNoTests=false
integration-test -pl flink-runtime}} yields the following in the log, i.e.
{{flink-runtime/target/logs/mvn-1.log}}:
{code}
12:17:38,304 INFO org.apache.flink.runtime.blob.FileSystemBlobStore
- Creating highly available BLOB storage directory at
/tmp/junit9004724949110959230/recovery//default/blob
12:17:38,304 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Enforcing default ACL for ZK connections
12:17:38,304 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Using '/flink/default' as Zookeeper namespace.
12:17:38,304 INFO org.apache.curator.framework.imps.CuratorFrameworkImpl
- Starting
12:17:38,348 INFO org.apache.flink.runtime.minicluster.FlinkMiniCluster
- Disabled queryable state server
12:17:38,348 INFO org.apache.flink.runtime.minicluster.FlinkMiniCluster
- Starting FlinkMiniCluster.
12:17:38,348 INFO org.apache.curator.framework.state.ConnectionStateManager
- State change: CONNECTED
12:17:38,354 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
12:17:38,355 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-a2ef16c3-6223-45a6-913b-748781acdb2d
12:17:38,356 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:35687 - max concurrent requests: 50 - max
backlog: 1000
12:17:38,356 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,357 INFO org.apache.flink.runtime.testingUtils.TestingMemoryArchivist
- Started memory archivist akka://flink/user/archive_1
12:17:38,359 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-bc758698-bd94-4802-95d5-da4c6d856883
12:17:38,359 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Starting JobManager at akka://flink/user/jobmanager_1.
12:17:38,359 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@5cb0f0ab.
12:17:38,359 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:32853 - max concurrent requests: 50 - max
backlog: 1000
12:17:38,359 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,360 INFO org.apache.flink.runtime.testingUtils.TestingMemoryArchivist
- Started memory archivist akka://flink/user/archive_2
12:17:38,363 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,363 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Grant
leadership to contender akka://flink/user/jobmanager_1 with session ID
3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,363 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Starting JobManager at akka://flink/user/jobmanager_2.
12:17:38,363 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@527cd5eb.
12:17:38,363 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- JobManager akka://flink/user/jobmanager_1 was granted leadership with leader
session ID Some(3f4d9edf-5fa7-48c4-85ae-15bed36d46e4).
12:17:38,363 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Confirm leader session ID 3f4d9edf-5fa7-48c4-85ae-15bed36d46e4 for leader
akka://flink/user/jobmanager_1.
12:17:38,364 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,365 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_1 is the leader with
session ID null.
12:17:38,363 INFO
org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration - Messages have
a max timeout of 100000 ms
12:17:38,365 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Write
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,365 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices
- Temporary file directory '/tmp': total 9 GB, usable 6 GB (66.67% usable)
12:17:38,415 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Attempting to recover job efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,416 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Successfully wrote leader information: Leader=akka://flink/user/jobmanager_1,
session ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,416 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Delaying recovery of all jobs by 10000 milliseconds.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,419 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore -
Recovered SubmittedJobGraph(efa7affb9fafdf7b682886f80a3bdeff, JobInfo(clients:
Set((Actor[akka://flink/user/$a#235524161],EXECUTION_RESULT_AND_STATE_CHANGES)),
start: 1501676245446)).
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting recovered job efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting job efa7affb9fafdf7b682886f80a3bdeff (Blocking Test Job)
(Recovery).
12:17:38,419 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Trying to associate with JobManager leader akka://flink/user/jobmanager_1
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Using restart strategy NoRestartStrategy for
efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_1 is the leader with
session ID 3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job recovers via failover strategy: full graph restart
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Running initialization on master for job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff).
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Successfully ran initialization on master in 0 ms.
12:17:38,420 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Resource Manager associating with leading JobManager
Actor[akka://flink/user/jobmanager_1#-1382860260] - leader session
3f4d9edf-5fa7-48c4-85ae-15bed36d46e4
12:17:38,420 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Scheduling job efa7affb9fafdf7b682886f80a3bdeff (Blocking Test Job).
12:17:38,420 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
CREATED to RUNNING.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (424ef083d9189043385f9f2b855aeb21) switched from
CREATED to SCHEDULED.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (424ef083d9189043385f9f2b855aeb21) switched from
SCHEDULED to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
RUNNING to FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Try to restart or fail the job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff) if no longer possible.
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
FAILING to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,422 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Could not restart the job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff) because the restart strategy prevented it.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,446 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Removed
job graph efa7affb9fafdf7b682886f80a3bdeff from ZooKeeper.
12:17:38,488 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool
- Allocated 197 MB for network buffer pool (number of memory segments: 6307,
bytes per segment: 32768).
12:17:38,488 INFO org.apache.flink.runtime.io.network.NetworkEnvironment
- Starting the network environment and its components.
12:17:38,488 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices
- Limiting managed memory to 621 MB, memory will be allocated lazily.
12:17:38,489 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager
- I/O manager uses directory
/tmp/flink-io-54a003b9-6f13-49c8-a585-238ec48019c5 for spill files.
12:17:38,489 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,489 INFO org.apache.flink.runtime.filecache.FileCache
- User file cache uses directory
/tmp/flink-dist-cache-b7357f57-18a2-40ad-8448-b251ad3af109
12:17:38,490 INFO org.apache.flink.runtime.filecache.FileCache
- User file cache uses directory
/tmp/flink-dist-cache-d3abe96c-6ec6-416e-9418-b1127cf407de
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Starting TaskManager actor at akka://flink/user/taskmanager_1#2009077452.
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager data connection information: c9a2fe7403e005322f998f352bbe5be5 @
localhost (dataPort=-1)
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager has 1 task slot(s).
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Memory usage stats: [HEAP: 207/247/1979 MB, NON HEAP: 43/44/-1 MB
(used/committed/max)]
12:17:38,490 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,492 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,492 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_1 (attempt 1,
timeout: 500 milliseconds)
12:17:38,493 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- TaskManager c9a2fe7403e005322f998f352bbe5be5 has started.
12:17:38,493 INFO org.apache.flink.runtime.instance.InstanceManager
- Registered TaskManager at localhost (akka://flink/user/taskmanager_1) as
cd09541e56c8913613dc9a58f61d304a. Current number of registered hosts is 1.
Current number of alive task slots is 1.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Successful registration at JobManager (akka://flink/user/jobmanager_1),
starting network stack and library cache.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Determined BLOB server address to be localhost/127.0.0.1:35687. Starting
BLOB cache.
12:17:38,493 INFO org.apache.flink.runtime.blob.BlobCache
- Created BLOB cache storage directory
/tmp/blobStore-9364d0ae-7fe4-45d9-93c9-d978fae7caa8
12:17:38,495 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Stopping ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@5cb0f0ab.
12:17:38,495 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager akka://flink/user/taskmanager_1 disconnects from JobManager
akka://flink/user/jobmanager_1: JobManager is no longer reachable
12:17:38,495 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,495 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Disassociating from JobManager
12:17:38,496 INFO org.apache.flink.runtime.blob.BlobCache
- Shutting down BlobCache
12:17:38,496 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Received SubmitJobAndWait(JobGraph(jobId: 41b8348843eb617e608df4f200590f37))
but there is no connection to a JobManager yet.
12:17:38,496 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Received job Blocking Test Job (41b8348843eb617e608df4f200590f37).
12:17:38,497 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_1 (attempt 1,
timeout: 500 milliseconds)
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Grant
leadership to contender akka://flink/user/jobmanager_2 with session ID
a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,498 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- JobManager akka://flink/user/jobmanager_2 was granted leadership with leader
session ID Some(a1124fe4-7739-452a-8b74-ee2b3fb7dad0).
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Confirm leader session ID a1124fe4-7739-452a-8b74-ee2b3fb7dad0 for leader
akka://flink/user/jobmanager_2.
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Write
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Successfully wrote leader information: Leader=akka://flink/user/jobmanager_2,
session ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,503 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Delaying recovery of all jobs by 10000 milliseconds.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Disconnect from JobManager null.
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Connect to JobManager Actor[akka://flink/user/jobmanager_2#2090247331].
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Connected to JobManager at Actor[akka://flink/user/jobmanager_2#2090247331]
with leader session id a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Sending message to JobManager akka://flink/user/jobmanager_2 to submit job
Blocking Test Job (41b8348843eb617e608df4f200590f37) and wait for progress
12:17:38,505 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Upload jar files to job manager akka://flink/user/jobmanager_2.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Submit job to the job manager akka://flink/user/jobmanager_2.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting job 41b8348843eb617e608df4f200590f37 (Blocking Test Job).
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Using restart strategy NoRestartStrategy for
41b8348843eb617e608df4f200590f37.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job recovers via failover strategy: full graph restart
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Running initialization on master for job Blocking Test Job
(41b8348843eb617e608df4f200590f37).
12:17:38,506 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Associated JobManager Actor[akka://flink/user/jobmanager_1#-1382860260] lost
leader status
12:17:38,506 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Successfully ran initialization on master in 0 ms.
12:17:38,506 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Trying to associate with JobManager leader akka://flink/user/jobmanager_2
12:17:38,507 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_2 is the leader with
session ID a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,507 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Resource Manager associating with leading JobManager
Actor[akka://flink/user/jobmanager_2#2090247331] - leader session
a1124fe4-7739-452a-8b74-ee2b3fb7dad0
12:17:38,507 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,508 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,509 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Added
SubmittedJobGraph(41b8348843eb617e608df4f200590f37, JobInfo(clients:
Set((Actor[akka://flink/user/$a#282433225],EXECUTION_RESULT_AND_STATE_CHANGES)),
start: 1501676258505)) to ZooKeeper.
12:17:38,510 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Scheduling job 41b8348843eb617e608df4f200590f37 (Blocking Test Job).
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
CREATED to RUNNING.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Job 41b8348843eb617e608df4f200590f37 was successfully submitted to the
JobManager akka://flink/deadLetters.
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (19c331a32d8716bc6cd6d5bf7d1f02dd) switched from
CREATED to SCHEDULED.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status RUNNING.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Blocking Vertex(1/1) switched to SCHEDULED
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (19c331a32d8716bc6cd6d5bf7d1f02dd) switched from
SCHEDULED to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
RUNNING to FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Blocking Vertex(1/1) switched to FAILED
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,589 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Try to restart or fail the job Blocking Test Job
(41b8348843eb617e608df4f200590f37) if no longer possible.
12:17:38,590 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
FAILING to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,590 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,590 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Could not restart the job Blocking Test Job
(41b8348843eb617e608df4f200590f37) because the restart strategy prevented it.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status FAILED.
12:17:38,591 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_2 (attempt 1,
timeout: 500 milliseconds)
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Terminate JobClientActor.
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Disconnect from JobManager Actor[akka://flink/user/jobmanager_2#2090247331].
12:17:38,591 INFO org.apache.flink.runtime.client.JobClient
- Job execution failed
12:17:38,591 INFO org.apache.flink.runtime.instance.InstanceManager
- Registered TaskManager at localhost (akka://flink/user/taskmanager_1) as
7b417742cd33c7f2e146a52a7e5597b9. Current number of registered hosts is 1.
Current number of alive task slots is 1.
12:17:38,592 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Stopping TaskManager akka://flink/user/taskmanager_1#2009077452.
12:17:38,593 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Stopping JobManager akka://flink/user/jobmanager_2.
12:17:38,593 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager
- I/O manager removed spill file directory
/tmp/flink-io-54a003b9-6f13-49c8-a585-238ec48019c5
12:17:38,593 INFO org.apache.flink.runtime.io.network.NetworkEnvironment
- Shutting down the network environment and its components.
12:17:38,594 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Removed
job graph 41b8348843eb617e608df4f200590f37 from ZooKeeper.
12:17:38,594 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Task manager akka://flink/user/taskmanager_1 is completely shut down.
12:17:38,594 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Stopping ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@527cd5eb.
12:17:38,595 INFO org.apache.flink.runtime.blob.BlobServer
- Stopped BLOB server at 0.0.0.0:32853
12:17:38,596 ERROR org.apache.flink.runtime.client.JobClientActorRecoveryITCase
-
--------------------------------------------------------------------------------
Test
testJobClientRecovery1000(org.apache.flink.runtime.client.JobClientActorRecoveryITCase)
failed with:
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:933)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
... 8 more
{code}
was (Author: nicok):
Did another run with the following snippet, and the failure is reproducible
(even locally):
{code}
private static Logger LOG =
    LoggerFactory.getLogger(JobClientActorRecoveryITCase.class);

@Test
public void testJobClientRecovery1000() throws Exception {
    for (int i = 0; i < 1000; ++i) {
        LOG.info("starting test run " + i);
        testJobClientRecovery();
    }
}
{code}
{code}
12:17:38,304 INFO org.apache.flink.runtime.blob.FileSystemBlobStore
- Creating highly available BLOB storage directory at
/tmp/junit9004724949110959230/recovery//default/blob
12:17:38,304 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Enforcing default ACL for ZK connections
12:17:38,304 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Using '/flink/default' as Zookeeper namespace.
12:17:38,304 INFO org.apache.curator.framework.imps.CuratorFrameworkImpl
- Starting
12:17:38,348 INFO org.apache.flink.runtime.minicluster.FlinkMiniCluster
- Disabled queryable state server
12:17:38,348 INFO org.apache.flink.runtime.minicluster.FlinkMiniCluster
- Starting FlinkMiniCluster.
12:17:38,348 INFO org.apache.curator.framework.state.ConnectionStateManager
- State change: CONNECTED
12:17:38,354 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
12:17:38,355 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-a2ef16c3-6223-45a6-913b-748781acdb2d
12:17:38,356 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:35687 - max concurrent requests: 50 - max
backlog: 1000
12:17:38,356 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,357 INFO org.apache.flink.runtime.testingUtils.TestingMemoryArchivist
- Started memory archivist akka://flink/user/archive_1
12:17:38,359 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-bc758698-bd94-4802-95d5-da4c6d856883
12:17:38,359 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Starting JobManager at akka://flink/user/jobmanager_1.
12:17:38,359 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@5cb0f0ab.
12:17:38,359 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:32853 - max concurrent requests: 50 - max
backlog: 1000
12:17:38,359 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,360 INFO org.apache.flink.runtime.testingUtils.TestingMemoryArchivist
- Started memory archivist akka://flink/user/archive_2
12:17:38,363 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,363 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Grant
leadership to contender akka://flink/user/jobmanager_1 with session ID
3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,363 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Starting JobManager at akka://flink/user/jobmanager_2.
12:17:38,363 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@527cd5eb.
12:17:38,363 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- JobManager akka://flink/user/jobmanager_1 was granted leadership with leader
session ID Some(3f4d9edf-5fa7-48c4-85ae-15bed36d46e4).
12:17:38,363 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Confirm leader session ID 3f4d9edf-5fa7-48c4-85ae-15bed36d46e4 for leader
akka://flink/user/jobmanager_1.
12:17:38,364 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,365 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_1 is the leader with
session ID null.
12:17:38,363 INFO
org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration - Messages have
a max timeout of 100000 ms
12:17:38,365 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Write
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,365 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices
- Temporary file directory '/tmp': total 9 GB, usable 6 GB (66.67% usable)
12:17:38,415 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Attempting to recover job efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,416 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Successfully wrote leader information: Leader=akka://flink/user/jobmanager_1,
session ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,416 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Delaying recovery of all jobs by 10000 milliseconds.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,418 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,419 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore -
Recovered SubmittedJobGraph(efa7affb9fafdf7b682886f80a3bdeff, JobInfo(clients:
Set((Actor[akka://flink/user/$a#235524161],EXECUTION_RESULT_AND_STATE_CHANGES)),
start: 1501676245446)).
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting recovered job efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting job efa7affb9fafdf7b682886f80a3bdeff (Blocking Test Job)
(Recovery).
12:17:38,419 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Trying to associate with JobManager leader akka://flink/user/jobmanager_1
12:17:38,419 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Using restart strategy NoRestartStrategy for
efa7affb9fafdf7b682886f80a3bdeff.
12:17:38,419 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_1 is the leader with
session ID 3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job recovers via failover strategy: full graph restart
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Running initialization on master for job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff).
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Successfully ran initialization on master in 0 ms.
12:17:38,420 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Resource Manager associating with leading JobManager
Actor[akka://flink/user/jobmanager_1#-1382860260] - leader session
3f4d9edf-5fa7-48c4-85ae-15bed36d46e4
12:17:38,420 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,420 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Scheduling job efa7affb9fafdf7b682886f80a3bdeff (Blocking Test Job).
12:17:38,420 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
CREATED to RUNNING.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (424ef083d9189043385f9f2b855aeb21) switched from
CREATED to SCHEDULED.
12:17:38,420 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (424ef083d9189043385f9f2b855aeb21) switched from
SCHEDULED to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
RUNNING to FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Try to restart or fail the job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff) if no longer possible.
12:17:38,421 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (efa7affb9fafdf7b682886f80a3bdeff) switched from state
FAILING to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,422 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Could not restart the job Blocking Test Job
(efa7affb9fafdf7b682886f80a3bdeff) because the restart strategy prevented it.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,446 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Removed
job graph efa7affb9fafdf7b682886f80a3bdeff from ZooKeeper.
12:17:38,488 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool
- Allocated 197 MB for network buffer pool (number of memory segments: 6307,
bytes per segment: 32768).
12:17:38,488 INFO org.apache.flink.runtime.io.network.NetworkEnvironment
- Starting the network environment and its components.
12:17:38,488 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices
- Limiting managed memory to 621 MB, memory will be allocated lazily.
12:17:38,489 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager
- I/O manager uses directory
/tmp/flink-io-54a003b9-6f13-49c8-a585-238ec48019c5 for spill files.
12:17:38,489 INFO org.apache.flink.runtime.metrics.MetricRegistry
- No metrics reporter configured, no metrics will be exposed/reported.
12:17:38,489 INFO org.apache.flink.runtime.filecache.FileCache
- User file cache uses directory
/tmp/flink-dist-cache-b7357f57-18a2-40ad-8448-b251ad3af109
12:17:38,490 INFO org.apache.flink.runtime.filecache.FileCache
- User file cache uses directory
/tmp/flink-dist-cache-d3abe96c-6ec6-416e-9418-b1127cf407de
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Starting TaskManager actor at akka://flink/user/taskmanager_1#2009077452.
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager data connection information: c9a2fe7403e005322f998f352bbe5be5 @
localhost (dataPort=-1)
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager has 1 task slot(s).
12:17:38,490 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Memory usage stats: [HEAP: 207/247/1979 MB, NON HEAP: 43/44/-1 MB
(used/committed/max)]
12:17:38,490 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,492 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,492 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_1, session
ID=3f4d9edf-5fa7-48c4-85ae-15bed36d46e4.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_1 (attempt 1,
timeout: 500 milliseconds)
12:17:38,493 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- TaskManager c9a2fe7403e005322f998f352bbe5be5 has started.
12:17:38,493 INFO org.apache.flink.runtime.instance.InstanceManager
- Registered TaskManager at localhost (akka://flink/user/taskmanager_1) as
cd09541e56c8913613dc9a58f61d304a. Current number of registered hosts is 1.
Current number of alive task slots is 1.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Successful registration at JobManager (akka://flink/user/jobmanager_1),
starting network stack and library cache.
12:17:38,493 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Determined BLOB server address to be localhost/127.0.0.1:35687. Starting
BLOB cache.
12:17:38,493 INFO org.apache.flink.runtime.blob.BlobCache
- Created BLOB cache storage directory
/tmp/blobStore-9364d0ae-7fe4-45d9-93c9-d978fae7caa8
12:17:38,495 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Stopping ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@5cb0f0ab.
12:17:38,495 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- TaskManager akka://flink/user/taskmanager_1 disconnects from JobManager
akka://flink/user/jobmanager_1: JobManager is no longer reachable
12:17:38,495 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Starting ZooKeeperLeaderRetrievalService.
12:17:38,495 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Disassociating from JobManager
12:17:38,496 INFO org.apache.flink.runtime.blob.BlobCache
- Shutting down BlobCache
12:17:38,496 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Received SubmitJobAndWait(JobGraph(jobId: 41b8348843eb617e608df4f200590f37))
but there is no connection to a JobManager yet.
12:17:38,496 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Received job Blocking Test Job (41b8348843eb617e608df4f200590f37).
12:17:38,497 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_1 (attempt 1,
timeout: 500 milliseconds)
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Grant
leadership to contender akka://flink/user/jobmanager_2 with session ID
a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,498 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- JobManager akka://flink/user/jobmanager_2 was granted leadership with leader
session ID Some(a1124fe4-7739-452a-8b74-ee2b3fb7dad0).
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Confirm leader session ID a1124fe4-7739-452a-8b74-ee2b3fb7dad0 for leader
akka://flink/user/jobmanager_2.
12:17:38,498 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Write
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Successfully wrote leader information: Leader=akka://flink/user/jobmanager_2,
session ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,503 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Delaying recovery of all jobs by 10000 milliseconds.
12:17:38,503 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,503 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Disconnect from JobManager null.
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Connect to JobManager Actor[akka://flink/user/jobmanager_2#2090247331].
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Connected to JobManager at Actor[akka://flink/user/jobmanager_2#2090247331]
with leader session id a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,504 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Sending message to JobManager akka://flink/user/jobmanager_2 to submit job
Blocking Test Job (41b8348843eb617e608df4f200590f37) and wait for progress
12:17:38,505 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Upload jar files to job manager akka://flink/user/jobmanager_2.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Submit job to the job manager akka://flink/user/jobmanager_2.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Submitting job 41b8348843eb617e608df4f200590f37 (Blocking Test Job).
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Using restart strategy NoRestartStrategy for
41b8348843eb617e608df4f200590f37.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,505 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job recovers via failover strategy: full graph restart
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,506 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Running initialization on master for job Blocking Test Job
(41b8348843eb617e608df4f200590f37).
12:17:38,506 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Associated JobManager Actor[akka://flink/user/jobmanager_1#-1382860260] lost
leader status
12:17:38,506 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Successfully ran initialization on master in 0 ms.
12:17:38,506 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Trying to associate with JobManager leader akka://flink/user/jobmanager_2
12:17:38,507 DEBUG
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Leader node changed while akka://flink/user/jobmanager_2 is the leader with
session ID a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,507 INFO org.apache.flink.runtime.testutils.TestingResourceManager
- Resource Manager associating with leading JobManager
Actor[akka://flink/user/jobmanager_2#2090247331] - leader session
a1124fe4-7739-452a-8b74-ee2b3fb7dad0
12:17:38,507 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,508 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,509 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Added
SubmittedJobGraph(41b8348843eb617e608df4f200590f37, JobInfo(clients:
Set((Actor[akka://flink/user/$a#282433225],EXECUTION_RESULT_AND_STATE_CHANGES)),
start: 1501676258505)) to ZooKeeper.
12:17:38,510 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Scheduling job 41b8348843eb617e608df4f200590f37 (Blocking Test Job).
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
CREATED to RUNNING.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Job 41b8348843eb617e608df4f200590f37 was successfully submitted to the
JobManager akka://flink/deadLetters.
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (19c331a32d8716bc6cd6d5bf7d1f02dd) switched from
CREATED to SCHEDULED.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status RUNNING.
12:17:38,510 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Blocking Vertex(1/1) switched to SCHEDULED
12:17:38,510 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Blocking Vertex (1/1) (19c331a32d8716bc6cd6d5bf7d1f02dd) switched from
SCHEDULED to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
RUNNING to FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Blocking Vertex(1/1) switched to FAILED
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Leader node has changed.
12:17:38,589 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status FAILING.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,589 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Try to restart or fail the job Blocking Test Job
(41b8348843eb617e608df4f200590f37) if no longer possible.
12:17:38,590 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Job Blocking Test Job (41b8348843eb617e608df4f200590f37) switched from state
FAILING to FAILED.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,590 DEBUG
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - New
leader information: Leader=akka://flink/user/jobmanager_2, session
ID=a1124fe4-7739-452a-8b74-ee2b3fb7dad0.
12:17:38,590 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph
- Could not restart the job Blocking Test Job
(41b8348843eb617e608df4f200590f37) because the restart strategy prevented it.
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- 08/02/2017 12:17:38 Job execution switched to status FAILED.
12:17:38,591 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Trying to register at JobManager akka://flink/user/jobmanager_2 (attempt 1,
timeout: 500 milliseconds)
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Terminate JobClientActor.
12:17:38,591 INFO org.apache.flink.runtime.client.JobSubmissionClientActor
- Disconnect from JobManager Actor[akka://flink/user/jobmanager_2#2090247331].
12:17:38,591 INFO org.apache.flink.runtime.client.JobClient
- Job execution failed
12:17:38,591 INFO org.apache.flink.runtime.instance.InstanceManager
- Registered TaskManager at localhost (akka://flink/user/taskmanager_1) as
7b417742cd33c7f2e146a52a7e5597b9. Current number of registered hosts is 1.
Current number of alive task slots is 1.
12:17:38,592 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Stopping TaskManager akka://flink/user/taskmanager_1#2009077452.
12:17:38,593 INFO org.apache.flink.runtime.testingUtils.TestingJobManager
- Stopping JobManager akka://flink/user/jobmanager_2.
12:17:38,593 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
Stopping ZooKeeperLeaderRetrievalService.
12:17:38,593 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager
- I/O manager removed spill file directory
/tmp/flink-io-54a003b9-6f13-49c8-a585-238ec48019c5
12:17:38,593 INFO org.apache.flink.runtime.io.network.NetworkEnvironment
- Shutting down the network environment and its components.
12:17:38,594 INFO
org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore - Removed
job graph 41b8348843eb617e608df4f200590f37 from ZooKeeper.
12:17:38,594 INFO org.apache.flink.runtime.testingUtils.TestingTaskManager
- Task manager akka://flink/user/taskmanager_1 is completely shut down.
12:17:38,594 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Stopping ZooKeeperLeaderElectionService
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@527cd5eb.
12:17:38,595 INFO org.apache.flink.runtime.blob.BlobServer
- Stopped BLOB server at 0.0.0.0:32853
12:17:38,596 ERROR org.apache.flink.runtime.client.JobClientActorRecoveryITCase
-
--------------------------------------------------------------------------------
Test
testJobClientRecovery1000(org.apache.flink.runtime.client.JobClientActorRecoveryITCase)
failed with:
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:933)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by:
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Not
enough free slots available to run the job. You can decrease the operator
parallelism or increase the number of slots per TaskManager in the
configuration. Resources available to scheduler: Number of instances=0, total
number of slots=0, available slots=0
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
at
org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
at
org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
at
org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
at
org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
at
org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
at
org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
at
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
... 8 more
{code}
> test instability in JobClientActorRecoveryITCase#testJobClientRecovery
> ----------------------------------------------------------------------
>
> Key: FLINK-7351
> URL: https://issues.apache.org/jira/browse/FLINK-7351
> Project: Flink
> Issue Type: Bug
> Components: Job-Submission, Tests
> Affects Versions: 1.4.0, 1.3.2
> Reporter: Nico Kruber
> Priority: Critical
> Labels: test-stability
>
> On a 16-core VM, the following test failed during {{mvn clean verify}}:
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 22.814 sec
> <<< FAILURE! - in org.apache.flink.runtime.client.JobClientActorRecoveryITCase
> testJobClientRecovery(org.apache.flink.runtime.client.JobClientActorRecoveryITCase)
> Time elapsed: 21.299 sec <<< ERROR!
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:933)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:876)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by:
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
> Not enough free slots available to run the job. You can decrease the operator
> parallelism or increase the number of slots per TaskManager in the
> configuration. Resources available to scheduler: Number of instances=0, total
> number of slots=0, available slots=0
> at
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.scheduleTask(Scheduler.java:334)
> at
> org.apache.flink.runtime.jobmanager.scheduler.Scheduler.allocateSlot(Scheduler.java:139)
> at
> org.apache.flink.runtime.executiongraph.Execution.allocateSlotForExecution(Execution.java:368)
> at
> org.apache.flink.runtime.executiongraph.Execution.scheduleForExecution(Execution.java:309)
> at
> org.apache.flink.runtime.executiongraph.ExecutionVertex.scheduleForExecution(ExecutionVertex.java:596)
> at
> org.apache.flink.runtime.executiongraph.ExecutionJobVertex.scheduleAll(ExecutionJobVertex.java:450)
> at
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleLazy(ExecutionGraph.java:834)
> at
> org.apache.flink.runtime.executiongraph.ExecutionGraph.scheduleForExecution(ExecutionGraph.java:814)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1425)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}