[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16062235#comment-16062235
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/4125


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055790#comment-16055790
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4125
  
Thanks for your review. 


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055778#comment-16055778
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4125
  
merging.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055635#comment-16055635
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4125
  
+1.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055629#comment-16055629
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122953813
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -464,6 +464,14 @@ private void collectPartionableStates(
private static void checkParallelismPreconditions(OperatorState 
operatorState, ExecutionJobVertex executionJobVertex) {
//max parallelism 
preconditions-
 
+   if (operatorState.getMaxParallelism() < 
executionJobVertex.getParallelism()) {
+   throw new IllegalStateException("The state for task " + 
executionJobVertex.getJobVertexId() +
+   " can not be restored. The maximum parallelism 
" + operatorState.getMaxParallelism() +
--- End diff --

Yes. I have updated the code. Please helps to check. :XD


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055576#comment-16055576
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122943463
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -464,6 +464,14 @@ private void collectPartionableStates(
private static void checkParallelismPreconditions(OperatorState 
operatorState, ExecutionJobVertex executionJobVertex) {
//max parallelism 
preconditions-
 
+   if (operatorState.getMaxParallelism() < 
executionJobVertex.getParallelism()) {
+   throw new IllegalStateException("The state for task " + 
executionJobVertex.getJobVertexId() +
+   " can not be restored. The maximum parallelism 
" + operatorState.getMaxParallelism() +
--- End diff --

can you added braces around the (max)parallelism? The error message 
currently looks like this:

```
Caused by: java.lang.IllegalStateException: The state for task 
adf090656b210b1609ad3203d4ee7329 can not be restored. The maximum parallelism 
128 of the restored state is lower than the configured parallelism 140. Please 
reduce the parallelism of the task to be lower or equal to the maximum 
parallelism.
```

But i think
```
Caused by: java.lang.IllegalStateException: The state for task 
adf090656b210b1609ad3203d4ee7329 can not be restored. The maximum parallelism 
(128) of the restored state is lower than the configured parallelism (140). 
Please reduce the parallelism of the task to be lower or equal to the maximum 
parallelism.
```
looks nicer.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> 

[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055449#comment-16055449
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/4125
  
Looks good to me, let me just quickly try it out.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055105#comment-16055105
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4125
  
Thank you so much @zentol . I did learn a lot from those. Also, the code 
have been updated. Please review. Thanks again.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16055102#comment-16055102
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122869047
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex 
executionJobVertex, List 
operatorStates, ExecutionJobVertex executionJobVertex) {
-
+   //parallelism compare 
preconditions-
+
+   // if the max parallelism is lower than parallelism, we will 
throw an exception.
+   if (executionJobVertex.getMaxParallelism() < 
executionJobVertex.getParallelism()) {
--- End diff --

Yes. You are very correct. I will move this check condition into this 
method ```checkParallelismCondition(OperatorState, ExecutionJobVertex).```  It 
is not probably easier, it is absolutely easier. :1st_place_medal: Lol.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053967#comment-16053967
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122686915
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex 
executionJobVertex, List 
operatorStates, ExecutionJobVertex executionJobVertex) {
-
+   //parallelism compare 
preconditions-
+
+   // if the max parallelism is lower than parallelism, we will 
throw an exception.
+   if (executionJobVertex.getMaxParallelism() < 
executionJobVertex.getParallelism()) {
+   throw new IllegalArgumentException("JobVertex: " + 
executionJobVertex.getJobVertex()  +
--- End diff --

We don't usually refer to JobVertices in exceptions, a more user-known 
"equivalent" are tasks.

We should also throw an IllegalStateException to be consistent with 
existing exceptions in this area.

Finally, let's reword this a bit:

"The state for task () cannot be 
restored. The maximum parallelism () of the 
restored state is lower than the configured parallelism 
(). Please reduce the parallelism of the 
task to be lower or equal to the maximum parallelism."


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053968#comment-16053968
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122685535
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex 
executionJobVertex, List 
operatorStates, ExecutionJobVertex executionJobVertex) {
-
+   //parallelism compare 
preconditions-
+
+   // if the max parallelism is lower than parallelism, we will 
throw an exception.
+   if (executionJobVertex.getMaxParallelism() < 
executionJobVertex.getParallelism()) {
--- End diff --

you have to compare the `maxParallelism` of the `operatorStates` with the 
one from the `executionJobVertex`. This is probably easier by moving this check 
into the `checkParallelismCondition(OperatorState, ExecutionJobVertex)` method.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053966#comment-16053966
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/4125#discussion_r122684131
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java
 ---
@@ -225,7 +225,16 @@ private void assignAttemptState(ExecutionJobVertex 
executionJobVertex, List 
operatorStates, ExecutionJobVertex executionJobVertex) {
-
+   //parallelism compare 
preconditions-
+
+   // if the max parallelism is lower than parallelism, we will 
throw an exception.
--- End diff --

remove the comment.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16050370#comment-16050370
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/4125
  
Hi @zentol . Please helps review if you are free, should I add some extra 
information ? Thanks.


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>Assignee: mingleizhang
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049288#comment-16049288
 ] 

ASF GitHub Bot commented on FLINK-6682:
---

GitHub user zhangminglei opened a pull request:

https://github.com/apache/flink/pull/4125

[FLINK-6682] [checkpoints] Improve error message in case parallelism …

…exceeds maxParallelism

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangminglei/flink flink-6682

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4125.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4125


commit 76349ac5e53eafb0c3324f8421d59311f3d7d7f3
Author: zhangminglei 
Date:   2017-06-14T15:23:20Z

[FLINK-6682] [checkpoints] Improve error message in case parallelism 
exceeds maxParallelism




> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> 

[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-14 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049285#comment-16049285
 ] 

mingleizhang commented on FLINK-6682:
-

Thanks for telling me. [~Zentol]  and I have put the updated code to the PR. 
What stuff about the user did wrong is that they just set the wrong parallelism 
I guess. 


> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-13 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047627#comment-16047627
 ] 

Chesnay Schepler commented on FLINK-6682:
-

It should probably be moved to 
{{StateAssignmentOperator#checkParallelismPreconditions()}}. Also, I wouldn't 
log a message but throw an exception instead. We already know that the restore 
will fail, so let's stop the process right there.

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-13 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047618#comment-16047618
 ] 

mingleizhang commented on FLINK-6682:
-

Yes. [~Zentol] I know what you mean. It is really dont give any useful 
information but only some numbers there. I just would like to know this time 
whether it is suitable that I put this log information at  
{{assignAttemptState}} method. For example, I might put the error message at 
{{createKeyGroupPartitions}} method. That is what I care about this time.

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-13 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047549#comment-16047549
 ] 

Chesnay Schepler commented on FLINK-6682:
-

While you are checking the right condition the exception still doesn't provide 
any useful information. The log message only displays the current state, and 
does neither explain what the user did wrong or how to resolve the issue.

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-12 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047364#comment-16047364
 ] 

mingleizhang commented on FLINK-6682:
-

Hi,[~Zentol] . I put the check conditions at {{assignAttemptState}} method. 
What do you think ?  Not very sure about task info I wrote. It would be great 
helpful if you can check this.
{code}
private void assignAttemptState(ExecutionJobVertex executionJobVertex, 
List operatorStates) {

List operatorIDs = 
executionJobVertex.getOperatorIDs();

//1. first compute the new parallelism
checkParallelismPreconditions(operatorStates, 
executionJobVertex);

int newParallelism = executionJobVertex.getParallelism();

if (executionJobVertex.getMaxParallelism() < newParallelism) {
LOG.error("Task info: {}, maxParallelism number: {}, 
and parallelism number: {}",
this.tasks, 
executionJobVertex.getMaxParallelism(), newParallelism);
}

List keyGroupPartitions = 
createKeyGroupPartitions(
executionJobVertex.getMaxParallelism(),
newParallelism);
{code}

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-06-12 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047332#comment-16047332
 ] 

mingleizhang commented on FLINK-6682:
-

I will look into this issue these days.

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6682) Improve error message in case parallelism exceeds maxParallelism

2017-05-23 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16021128#comment-16021128
 ] 

Chesnay Schepler commented on FLINK-6682:
-

The error message should include which task failed, the maxParallelism and 
parallelism that we tried to set.

> Improve error message in case parallelism exceeds maxParallelism
> 
>
> Key: FLINK-6682
> URL: https://issues.apache.org/jira/browse/FLINK-6682
> Project: Flink
>  Issue Type: Improvement
>  Components: State Backends, Checkpointing
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Chesnay Schepler
>
> When restoring a job with a parallelism that exceeds the maxParallelism we're 
> not providing a useful error message, as all you get is an 
> IllegalArgumentException:
> {code}
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed
> at 
> org.apache.flink.runtime.client.JobClient.awaitJobResult(JobClient.java:343)
> at 
> org.apache.flink.runtime.client.JobClient.submitJobAndWait(JobClient.java:396)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:467)
> ... 22 more
> Caused by: java.lang.IllegalArgumentException
> at 
> org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:123)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.createKeyGroupPartitions(StateAssignmentOperation.java:449)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignAttemptState(StateAssignmentOperation.java:117)
> at 
> org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:102)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1038)
> at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1101)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1386)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1372)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at 
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
> at 
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
> at 
> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at 
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)