[ 
https://issues.apache.org/jira/browse/FLINK-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402334#comment-15402334
 ] 

ASF GitHub Bot commented on FLINK-4296:
---------------------------------------

Github user uce commented on the issue:

    https://github.com/apache/flink/pull/2321
  
    Good catch. The change and test look good to me! This was broken for a long 
time (since the initial refactoring of the network stack, I think). It never 
surfaced because most use cases and tests run with pipelined results.


> Scheduler accepts more tasks than it has task slots available
> -------------------------------------------------------------
>
>                 Key: FLINK-4296
>                 URL: https://issues.apache.org/jira/browse/FLINK-4296
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager, TaskManager
>    Affects Versions: 1.1.0
>            Reporter: Maximilian Michels
>            Assignee: Till Rohrmann
>            Priority: Critical
>             Fix For: 1.1.0, 1.2.0
>
>
> Flink's scheduler doesn't support queued scheduling but expects to find all 
> necessary task slots upon scheduling. If it does not, it throws an error. Due 
> to some changes in the latest master, this behavior seems to be broken.
> Flink accepts jobs with {{parallelism > total number of task slots}}, 
> schedules and deploys tasks in all available task slots, and leaves the 
> remaining tasks lingering forever.
> Easy to reproduce: 
> {code}
> ./bin/flink run -p TASK_SLOTS+n
> {code} 
> where {{TASK_SLOTS}} is the number of total task slots of the cluster and 
> {{n>=1}}.
> Here, {{p=11}}, {{TASK_SLOTS=10}}:
> {{bin/flink run -p 11 examples/batch/EnumTriangles.jar}}
> {noformat}
> Cluster configuration: Standalone cluster with JobManager at 
> localhost/127.0.0.1:6123
> Using address localhost:6123 to connect to JobManager.
> JobManager web interface address http://localhost:8081
> Starting execution of program
> Executing EnumTriangles example with default edges data set.
> Use --edges to specify file input.
> Printing result to stdout. Use --output to specify output path.
> Submitting job with JobID: cd0c0b4cbe25643d8d92558168cfc045. Waiting for job 
> completion.
> 08/01/2016 12:12:12     Job execution switched to status RUNNING.
> 08/01/2016 12:12:12     CHAIN DataSource (at 
> getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
> (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
> main(EnumTriangles.java:108))(1/1) switched to SCHEDULED
> 08/01/2016 12:12:12     CHAIN DataSource (at 
> getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
> (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
> main(EnumTriangles.java:108))(1/1) switched to DEPLOYING
> 08/01/2016 12:12:12     CHAIN DataSource (at 
> getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
> (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
> main(EnumTriangles.java:108))(1/1) switched to RUNNING
> 08/01/2016 12:12:12     CHAIN DataSource (at 
> getDefaultEdgeDataSet(EnumTrianglesData.java:57) 
> (org.apache.flink.api.java.io.CollectionInputFormat)) -> Map (Map at 
> main(EnumTriangles.java:108))(1/1) switched to FINISHED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(1/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(3/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(2/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(7/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(7/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(6/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(4/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(5/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(4/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(3/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(9/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(9/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(5/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(1/11) switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) 
> switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(11/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(10/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(11/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(10/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(8/11) switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(6/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(2/11) switched to DEPLOYING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(3/11) switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(11/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(1/11) switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(1/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(2/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(3/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(9/11) switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(4/11) switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(5/11) switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(7/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(6/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(9/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(10/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(10/11) switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(11/11) switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(4/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(5/11) 
> switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(7/11) switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(2/11) switched to RUNNING
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(6/11) switched to RUNNING
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(1/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(2/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(7/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(6/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(3/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(9/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(11/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(5/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(10/11) switched to FINISHED
> 08/01/2016 12:12:13     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(4/11) switched to FINISHED
> {noformat}
> For {{8/11}}, the {{Join}} task switches to RUNNING, but the {{GroupReduce}} 
> does not:
> {noformat}
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to SCHEDULED
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to DEPLOYING
> ....
> 08/01/2016 12:12:12     GroupReduce (GroupReduce at 
> main(EnumTriangles.java:112))(8/11) switched to SCHEDULED
> ....
> 08/01/2016 12:12:12     Join(Join at main(EnumTriangles.java:114))(8/11) 
> switched to RUNNING
> {noformat}
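
The behavior the issue description expects (no queued scheduling, so a request that finds no free slot fails immediately instead of lingering) can be illustrated with a minimal sketch. This is not Flink's scheduler code: ToyScheduler, NoSlotAvailableException, scheduleImmediately and scheduleAll are hypothetical names made up for this illustration, and the sketch ignores slot sharing and the distinction between pipelined and blocking results.

```java
/**
 * Toy model of "immediate" (non-queued) scheduling.
 * All names here are hypothetical; this is not Flink's actual API.
 */
class SlotCheckSketch {

    /** Hypothetical exception type, used only in this sketch. */
    static class NoSlotAvailableException extends RuntimeException {
        NoSlotAvailableException(String message) {
            super(message);
        }
    }

    /** Scheduler that must find a free slot right away or fail. */
    static class ToyScheduler {
        private int freeSlots;

        ToyScheduler(int totalSlots) {
            this.freeSlots = totalSlots;
        }

        /** Takes one slot for the task, or throws; it never waits or queues. */
        void scheduleImmediately(String taskName) {
            if (freeSlots == 0) {
                throw new NoSlotAvailableException("no free slot for " + taskName);
            }
            freeSlots--;
        }

        int freeSlots() {
            return freeSlots;
        }
    }

    /** Schedules every parallel subtask; the first failed request aborts the whole attempt. */
    static void scheduleAll(ToyScheduler scheduler, String operator, int parallelism) {
        for (int i = 1; i <= parallelism; i++) {
            scheduler.scheduleImmediately(operator + "(" + i + "/" + parallelism + ")");
        }
    }

    public static void main(String[] args) {
        // Same numbers as in the report: TASK_SLOTS=10, p=11.
        ToyScheduler scheduler = new ToyScheduler(10);
        try {
            scheduleAll(scheduler, "GroupReduce", 11);
            System.out.println("Job accepted");
        } catch (NoSlotAvailableException e) {
            System.out.println("Job rejected: " + e.getMessage());
        }
    }
}
```

With 10 slots and parallelism 11, the eleventh request throws, so the job is rejected up front rather than leaving one subtask in SCHEDULED forever, which is the symptom visible for GroupReduce (8/11) in the log above.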



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
