[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062235#comment-16062235 ] ASF GitHub Bot commented on FLINK-6682: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/4125 > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055790#comment-16055790 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4125 Thanks for your review. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055778#comment-16055778 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4125 merging. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055635#comment-16055635 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4125 +1. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055629#comment-16055629 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zhangminglei commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122953813 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -464,6 +464,14 @@ private void collectPartionableStates( private static void checkParallelismPreconditions(OperatorState operatorState, ExecutionJobVertex executionJobVertex) { //max parallelism preconditions- + if (operatorState.getMaxParallelism() < executionJobVertex.getParallelism()) { + throw new IllegalStateException("The state for task " + executionJobVertex.getJobVertexId() + + " can not be restored. The maximum parallelism " + operatorState.getMaxParallelism() + --- End diff -- Yes. I have updated the code. Please helps to check. :XD > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055576#comment-16055576 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122943463 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -464,6 +464,14 @@ private void collectPartionableStates( private static void checkParallelismPreconditions(OperatorState operatorState, ExecutionJobVertex executionJobVertex) { //max parallelism preconditions- + if (operatorState.getMaxParallelism() < executionJobVertex.getParallelism()) { + throw new IllegalStateException("The state for task " + executionJobVertex.getJobVertexId() + + " can not be restored. The maximum parallelism " + operatorState.getMaxParallelism() + --- End diff -- can you added braces around the (max)parallelism? The error message currently looks like this: ``` Caused by: java.lang.IllegalStateException: The state for task adf090656b210b1609ad3203d4ee7329 can not be restored. The maximum parallelism 128 of the restored state is lower than the configured parallelism 140. Please reduce the parallelism of the task to be lower or equal to the maximum parallelism. ``` But i think ``` Caused by: java.lang.IllegalStateException: The state for task adf090656b210b1609ad3203d4ee7329 can not be restored. The maximum parallelism (128) of the restored state is lower than the configured parallelism (140). Please reduce the parallelism of the task to be lower or equal to the maximum parallelism. ``` looks nicer. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at >
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055449#comment-16055449 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on the issue: https://github.com/apache/flink/pull/4125 Looks good to me, let me just quickly try it out. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055105#comment-16055105 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4125 Thank you so much @zentol . I did learn a lot from those. Also, the code have been updated. Please review. Thanks again. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055102#comment-16055102 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zhangminglei commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122869047 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex executionJobVertex, List operatorStates, ExecutionJobVertex executionJobVertex) { - + //parallelism compare preconditions- + + // if the max parallelism is lower than parallelism, we will throw an exception. + if (executionJobVertex.getMaxParallelism() < executionJobVertex.getParallelism()) { --- End diff -- Yes. You are very correct. I will move this check condition into this method ```checkParallelismCondition(OperatorState, ExecutionJobVertex).``` It is not probably easier, it is absolutely easier. :1st_place_medal: Lol. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053967#comment-16053967 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122686915 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex executionJobVertex, List operatorStates, ExecutionJobVertex executionJobVertex) { - + //parallelism compare preconditions- + + // if the max parallelism is lower than parallelism, we will throw an exception. + if (executionJobVertex.getMaxParallelism() < executionJobVertex.getParallelism()) { + throw new IllegalArgumentException("JobVertex: " + executionJobVertex.getJobVertex() + --- End diff -- We don't usually refer to JobVertices in exceptions, a more user-known "equivalent" are tasks. We should also throw an IllegalStateException to be consistent with existing exceptions in this area. Finally, let's reword this a bit: "The state for task () cannot be restored. The maximum parallelism ( ) of the restored state is lower than the configured parallelism ( ). Please reduce the parallelism of the task to be lower or equal to the maximum parallelism." > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053968#comment-16053968 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122685535 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex executionJobVertex, List operatorStates, ExecutionJobVertex executionJobVertex) { - + //parallelism compare preconditions- + + // if the max parallelism is lower than parallelism, we will throw an exception. + if (executionJobVertex.getMaxParallelism() < executionJobVertex.getParallelism()) { --- End diff -- you have to compare the `maxParallelism` of the `operatorStates` with the one from the `executionJobVertex`. This is probably easier by moving this check into the `checkParallelismCondition(OperatorState, ExecutionJobVertex)` method. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053966#comment-16053966 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zentol commented on a diff in the pull request: https://github.com/apache/flink/pull/4125#discussion_r122684131 --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java --- @@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex executionJobVertex, List operatorStates, ExecutionJobVertex executionJobVertex) { - + //parallelism compare preconditions- + + // if the max parallelism is lower than parallelism, we will throw an exception. --- End diff -- remove the comment. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050370#comment-16050370 ] ASF GitHub Bot commented on FLINK-6682: --- Github user zhangminglei commented on the issue: https://github.com/apache/flink/pull/4125 Hi @zentol . Please helps review if you are free, should I add some extra information ? Thanks. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler >Assignee: mingleizhang > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049288#comment-16049288 ] ASF GitHub Bot commented on FLINK-6682: --- GitHub user zhangminglei opened a pull request: https://github.com/apache/flink/pull/4125 [FLINK-6682] [checkpoints] Improve error message in case parallelism … …exceeds maxParallelism Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes. - [ ] General - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text") - The pull request addresses only one issue - Each commit in the PR has a meaningful commit message (including the JIRA id) - [ ] Documentation - Documentation has been added for new functionality - Old documentation affected by the pull request has been updated - JavaDoc for public methods has been added - [ ] Tests & Build - Functionality added by the pull request is covered by tests - `mvn clean verify` has been executed successfully locally or a Travis build has passed You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhangminglei/flink flink-6682 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/4125.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4125 commit 76349ac5e53eafb0c3324f8421d59311f3d7d7f3 Author: zhangmingleiDate: 2017-06-14T15:23:20Z [FLINK-6682] [checkpoints] Improve error message in case parallelism exceeds maxParallelism > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at >
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049285#comment-16049285 ] mingleizhang commented on FLINK-6682: - Thanks for telling me. [~Zentol] and I have put the updated code to the PR. What stuff about the user did wrong is that they just set the wrong parallelism I guess. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047627#comment-16047627 ] Chesnay Schepler commented on FLINK-6682: - It should probably be moved to {{StateAssignmentOperator#checkParallelismPreconditions()}}. Also, I wouldn't log a message but throw an exception instead. We already know that the restore will fail, so let's stop the process right there. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047618#comment-16047618 ] mingleizhang commented on FLINK-6682: - Yes. [~Zentol] I know what you mean. It is really dont give any useful information but only some numbers there. I just would like to know this time whether it is suitable that I put this log information at {{assignAttemptState}} method. For example, I might put the error message at {{createKeyGroupPartitions}} method. That is what I care about this time. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047549#comment-16047549 ] Chesnay Schepler commented on FLINK-6682: - While you are checking the right condition the exception still doesn't provide any useful information. The log message only displays the current state, and does neither explain what the user did wrong or how to resolve the issue. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047364#comment-16047364 ] mingleizhang commented on FLINK-6682: - Hi,[~Zentol] . I put the check conditions at {{assignAttemptState}} method. What do you think ? Not very sure about task info I wrote. It would be great helpful if you can check this. {code} private void assignAttemptState(ExecutionJobVertex executionJobVertex, List operatorStates) { List operatorIDs = executionJobVertex.getOperatorIDs(); //1. first compute the new parallelism checkParallelismPreconditions(operatorStates, executionJobVertex); int newParallelism = executionJobVertex.getParallelism(); if (executionJobVertex.getMaxParallelism() < newParallelism) { LOG.error("Task info: {}, maxParallelism number: {}, and parallelism number: {}", this.tasks, executionJobVertex.getMaxParallelism(), newParallelism); } List keyGroupPartitions = createKeyGroupPartitions( executionJobVertex.getMaxParallelism(), newParallelism); {code} > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047332#comment-16047332 ] mingleizhang commented on FLINK-6682: - I will look into this issue these days. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism
[ https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021128#comment-16021128 ] Chesnay Schepler commented on FLINK-6682: - The error message should include which task failed, the maxParallelism and parallelism that we tried to set. > Improve error message in case parallelism exceeds maxParallelism > > > Key: FLINK-6682 > URL: https://issues.apache.org/jira/browse/FLINK-6682 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.3.0, 1.4.0 >Reporter: Chesnay Schepler > > When restoring a job with a parallelism that exceeds the maxParallelism we're > not providing a useful error message, as all you get is an > IllegalArgumentException: > {code} > Caused by: org.apache.flink.runtime.client.JobExecutionException: Job > execution failed > at > org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343) > at > org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467) > ... 22 more > Caused by: java.lang.IllegalArgumentException > at > org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117) > at > org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038) > at > org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)