[jira] [Updated] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaomin Zhang updated MAPREDUCE-7306:
-------------------------------------
    Description: 
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job DAG. However, the check is not implemented correctly in the `sort` phase. Consider the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob = controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
The loop adds the parent/dependent Job node to the processedMap entry for the current Job node. It is then supposed to add the parent/dependent Job node to SourceSet if that parent node has not been processed yet or has no child node other than the one currently being processed. However, it mistakenly checks the current node, *hasInComingEdge(controlledJob)*, while adding the parent node to SourceSet: *SourceSet.add(depenControlledJob)*.

This breaks Job DAGs like the one below:

job1.addDependingJob(job2);
job2.addDependingJob(job3);
job4.addDependingJob(job2);

The code above reports a cyclic dependency for job2, because job2 is added to SourceSet twice.

  was:
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job DAG. However, the check is not implemented correctly in the `sort` phase. Consider the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob = controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
A topological order visits the parent node followed by the child node. If the given graph contains a cycle, then at least one node is both a parent and a child, which breaks the topological order. Therefore, after the topological sort, check whether every directed edge follows the order.

> Mistaken cyclic check in JobControl Job DAG
> -------------------------------------------
>
>                 Key: MAPREDUCE-7306
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, mrv2
>    Affects Versions: 3.0.0
>            Reporter: Xiaomin Zhang
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail:
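To make the reported behaviour easier to reproduce outside Hadoop, here is a minimal, self-contained sketch of the loop described in MAPREDUCE-7306. Everything in it is an assumption made for illustration: `SketchJob` is a hypothetical stand-in for ControlledJob, `hasIncomingEdge` is a guess at what hasInComingEdge does, the final size comparison is an assumed form of the cycle report, and the worklist is a LIFO deque rather than a set so that the visiting order is deterministic. Passing `checkDependency=true` shows one possible correction (checking the node that is about to be added), which is not necessarily the change that was committed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ControlledJob: a name plus the jobs it depends on.
final class SketchJob {
  final String name;
  final List<SketchJob> dependsOn = new ArrayList<>();

  SketchJob(String name) {
    this.name = name;
  }

  void addDependingJob(SketchJob other) {
    dependsOn.add(other);
  }
}

final class DagCheckSketch {
  // True if some other job still has an unprocessed dependency edge to target.
  static boolean hasIncomingEdge(SketchJob target, List<SketchJob> jobs,
      Map<SketchJob, List<SketchJob>> processed) {
    for (SketchJob k : jobs) {
      if (k != target && k.dependsOn.contains(target)
          && !processed.get(k).contains(target)) {
        return true;
      }
    }
    return false;
  }

  // Runs the sort loop and returns true if it would report a cycle.
  // checkDependency=false mimics the reported check on the current job;
  // checkDependency=true checks the job that is about to be added.
  static boolean reportsCycle(List<SketchJob> jobs, boolean checkDependency) {
    Map<SketchJob, List<SketchJob>> processed = new HashMap<>();
    for (SketchJob j : jobs) {
      processed.put(j, new ArrayList<>());
    }
    // A LIFO worklist (an assumption) keeps the visiting order deterministic.
    Deque<SketchJob> sources = new ArrayDeque<>();
    for (SketchJob j : jobs) {
      if (!hasIncomingEdge(j, jobs, processed)) {
        sources.push(j);
      }
    }
    while (!sources.isEmpty()) {
      SketchJob current = sources.pop();
      for (SketchJob dep : current.dependsOn) {
        processed.get(current).add(dep);
        SketchJob checked = checkDependency ? dep : current;
        if (!hasIncomingEdge(checked, jobs, processed)) {
          sources.push(dep);
        }
      }
    }
    // Assumed final check: each dependency edge must be processed exactly once.
    for (SketchJob j : jobs) {
      if (j.dependsOn.size() != processed.get(j).size()) {
        return true;
      }
    }
    return false;
  }

  // The DAG from the report: job1 -> job2, job2 -> job3, job4 -> job2.
  static List<SketchJob> exampleDag() {
    SketchJob job1 = new SketchJob("job1");
    SketchJob job2 = new SketchJob("job2");
    SketchJob job3 = new SketchJob("job3");
    SketchJob job4 = new SketchJob("job4");
    job1.addDependingJob(job2);
    job2.addDependingJob(job3);
    job4.addDependingJob(job2);
    return Arrays.asList(job1, job2, job3, job4);
  }
}
```

With this worklist order, `DagCheckSketch.reportsCycle(DagCheckSketch.exampleDag(), false)` returns true because job2 is visited twice, while the same call with `true` returns false. With a set-based worklist the outcome of the uncorrected check can depend on iteration order.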
[jira] [Updated] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG
[ https://issues.apache.org/jira/browse/MAPREDUCE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaomin Zhang updated MAPREDUCE-7306:
-------------------------------------
    Description: 
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job DAG. However, the check is not implemented correctly in the `sort` phase. Consider the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob = controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
A topological order visits the parent node followed by the child node. If the given graph contains a cycle, then at least one node is both a parent and a child, which breaks the topological order. Therefore, after the topological sort, check whether every directed edge follows the order.

  was:
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job DAG. However, the check is not implemented correctly in the `sort` phase. Consider the while loop below:
{code:java}
// code placeholder
{code}
A topological order visits the parent node followed by the child node. If the given graph contains a cycle, then at least one node is both a parent and a child, which breaks the topological order. Therefore, after the topological sort, check whether every directed edge follows the order.

> Mistaken cyclic check in JobControl Job DAG
> -------------------------------------------
>
>                 Key: MAPREDUCE-7306
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client, mrv2
>    Affects Versions: 3.0.0
>            Reporter: Xiaomin Zhang
>            Priority: Major
[jira] [Created] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG
Xiaomin Zhang created MAPREDUCE-7306:
----------------------------------------

             Summary: Mistaken cyclic check in JobControl Job DAG
                 Key: MAPREDUCE-7306
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: client, mrv2
    Affects Versions: 3.0.0
            Reporter: Xiaomin Zhang


MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job DAG. However, the check is not implemented correctly in the `sort` phase. Consider the while loop below:
{code:java}
// code placeholder
{code}
A topological order visits the parent node followed by the child node. If the given graph contains a cycle, then at least one node is both a parent and a child, which breaks the topological order. Therefore, after the topological sort, check whether every directed edge follows the order.
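The edge-order check suggested in the description can be sketched as follows. This is only an illustration under stated assumptions: `EdgeOrderCheck` and `followsOrder` are hypothetical names, jobs are represented by plain strings, and each edge is written as {parent, child}, meaning the parent must appear earlier in the sorted order. It is not the code used in JobControl.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EdgeOrderCheck {
  // Each edge is {parent, child}: the parent must appear before the child.
  static boolean followsOrder(List<String> order, List<String[]> edges) {
    // Record the position of every node in the proposed order.
    Map<String, Integer> position = new HashMap<>();
    for (int i = 0; i < order.size(); i++) {
      position.put(order.get(i), i);
    }
    for (String[] edge : edges) {
      Integer parent = position.get(edge[0]);
      Integer child = position.get(edge[1]);
      // A node missing from the order, or a child placed first, breaks it.
      if (parent == null || child == null || parent >= child) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> order = Arrays.asList("job3", "job2", "job1", "job4");
    List<String[]> edges = Arrays.asList(
        new String[] {"job3", "job2"},
        new String[] {"job2", "job1"},
        new String[] {"job2", "job4"});
    System.out.println(followsOrder(order, edges));
  }
}
```

If an edge {job1, job3} is added, the graph contains a cycle, so no ordering of the four jobs can satisfy every edge and the check fails for any candidate order.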