[jira] [Updated] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG

2020-11-17 Thread Xiaomin Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomin Zhang updated MAPREDUCE-7306:
-
Description: 
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job
DAG. However, it is not implemented correctly in the `sort` phase.

In the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob =
          controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
The loop adds the parent/dependent Job node to the processedMap entry of the
current Job node. It is then supposed to add the parent/dependent Job node to
SourceSet only if that parent node has no child node left to process other
than the current one. However, it mistakenly checks the current node,
*hasInComingEdge(controlledJob)*, while adding the parent node to SourceSet:
*SourceSet.add(depenControlledJob)*

This breaks Job DAGs such as the following:

job1.addDependingJob(job2);
job2.addDependingJob(job3);
job4.addDependingJob(job2);

The code above reports a cyclic dependency for job2, even though the DAG has
no cycle, because job2 is added to SourceSet twice.
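
As an illustration only, the following self-contained Java sketch replays the loop on the DAG above. `CyclicCheckSketch`, `Job`, `reportDag` and the `checkDependent` flag are hypothetical names and not part of Hadoop; `Job` is a minimal stand-in for ControlledJob. HashSet iteration order is unspecified, so the sketch assumes one possible order by using a stack with set semantics in place of SourceSet. With `checkDependent=false` it mirrors the quoted check; with `checkDependent=true` it tests the node that is about to be added, which is one possible correction and not necessarily the fix that should be committed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CyclicCheckSketch {

  /** Hypothetical stand-in for ControlledJob: a name plus the jobs it depends on. */
  static final class Job {
    final String name;
    final List<Job> dependentJobs = new ArrayList<>();
    Job(String name) { this.name = name; }
    void addDependingJob(Job other) { dependentJobs.add(other); }
  }

  /** True if some other job still has an unprocessed edge pointing at target. */
  static boolean hasInComingEdge(Job target, List<Job> jobs,
      Map<Job, List<Job>> processed) {
    for (Job k : jobs) {
      if (k != target && k.dependentJobs.contains(target)
          && !processed.get(k).contains(target)) {
        return true;
      }
    }
    return false;
  }

  /**
   * checkDependent=false mirrors the check quoted in the report;
   * checkDependent=true checks the node being added instead (assumed fix).
   */
  static boolean isCircular(List<Job> jobs, boolean checkDependent) {
    Map<Job, List<Job>> processed = new HashMap<>();
    for (Job j : jobs) {
      processed.put(j, new ArrayList<>());
    }
    // Assumption: a stack with set semantics models one possible HashSet order.
    Deque<Job> sources = new ArrayDeque<>();
    for (Job j : jobs) {
      if (!hasInComingEdge(j, jobs, processed)) {
        sources.push(j);
      }
    }
    while (!sources.isEmpty()) {
      Job current = sources.pop();
      for (Job dep : current.dependentJobs) {
        processed.get(current).add(dep);
        Job checked = checkDependent ? dep : current;
        if (!hasInComingEdge(checked, jobs, processed)
            && !sources.contains(dep)) {
          sources.push(dep);
        }
      }
    }
    // A job is flagged when its processed-edge count differs from its edge count.
    for (Job j : jobs) {
      if (j.dependentJobs.size() != processed.get(j).size()) {
        return true;
      }
    }
    return false;
  }

  /** Builds the DAG from the report. */
  static List<Job> reportDag() {
    Job job1 = new Job("job1");
    Job job2 = new Job("job2");
    Job job3 = new Job("job3");
    Job job4 = new Job("job4");
    job1.addDependingJob(job2);
    job2.addDependingJob(job3);
    job4.addDependingJob(job2);
    return Arrays.asList(job1, job2, job3, job4);
  }

  public static void main(String[] args) {
    System.out.println(isCircular(reportDag(), false)); // prints true (false positive)
    System.out.println(isCircular(reportDag(), true));  // prints false
  }
}
```

Under the assumed order, job2 is taken from the source set before job1 has been processed and is added again afterwards, so its edge to job3 is recorded twice and the final size comparison flags job2.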

 

 

 

 

  was:
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job
DAG. However, it is not implemented correctly in the `sort` phase.

In the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob =
          controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
The idea of a topological sort is to visit the parent node followed by the
child node. If the given graph contains a cycle, then at least one node is
both a parent and a child, which breaks the topological order. Therefore,
after the topological sort, check whether every directed edge follows the
order.

 

 

 


> Mistaken cyclic check in JobControl Job DAG
> ---
>
> Key: MAPREDUCE-7306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mrv2
>Affects Versions: 3.0.0
>Reporter: Xiaomin Zhang
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Updated] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG

2020-11-16 Thread Xiaomin Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomin Zhang updated MAPREDUCE-7306:
-
Description: 
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job
DAG. However, it is not implemented correctly in the `sort` phase.

In the while loop below:
{code:java}
while (!SourceSet.isEmpty()) {
  ControlledJob controlledJob = SourceSet.iterator().next();
  SourceSet.remove(controlledJob);
  if (controlledJob.getDependentJobs() != null) {
    for (int i = 0; i < controlledJob.getDependentJobs().size(); i++) {
      ControlledJob depenControlledJob =
          controlledJob.getDependentJobs().get(i);
      processedMap.get(controlledJob).add(depenControlledJob);
      if (!hasInComingEdge(controlledJob, jobList, processedMap)) {
        SourceSet.add(depenControlledJob);
      }
    }
  }
}
{code}
The idea of a topological sort is to visit the parent node followed by the
child node. If the given graph contains a cycle, then at least one node is
both a parent and a child, which breaks the topological order. Therefore,
after the topological sort, check whether every directed edge follows the
order.
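
The edge check described above could be sketched in Java as follows. This is a minimal illustration and not JobControl code: `EdgeOrderCheckSketch`, `order` and `hasCycle` are hypothetical names, the graph is assumed to be a plain map from a node to the nodes its edges point to, and the ordering is assumed to come from a depth-first post-order traversal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class EdgeOrderCheckSketch {

  /** Reversed depth-first post-order; in a DAG each parent precedes its children. */
  static List<String> order(Map<String, List<String>> graph) {
    List<String> postOrder = new ArrayList<>();
    Set<String> visited = new HashSet<>();
    for (String node : graph.keySet()) {
      visit(node, graph, visited, postOrder);
    }
    Collections.reverse(postOrder);
    return postOrder;
  }

  private static void visit(String node, Map<String, List<String>> graph,
      Set<String> visited, List<String> postOrder) {
    if (!visited.add(node)) {
      return; // already visited
    }
    for (String child : graph.getOrDefault(node, Collections.emptyList())) {
      visit(child, graph, visited, postOrder);
    }
    postOrder.add(node);
  }

  /** Reports a cycle if any edge points from a later position to an earlier one. */
  static boolean hasCycle(Map<String, List<String>> graph) {
    List<String> sorted = order(graph);
    Map<String, Integer> position = new HashMap<>();
    for (int i = 0; i < sorted.size(); i++) {
      position.put(sorted.get(i), i);
    }
    for (Map.Entry<String, List<String>> entry : graph.entrySet()) {
      for (String child : entry.getValue()) {
        if (position.get(entry.getKey()) >= position.get(child)) {
          return true; // this edge does not follow the order
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, List<String>> graph = new LinkedHashMap<>();
    graph.put("job1", Arrays.asList("job2"));
    graph.put("job2", Arrays.asList("job3"));
    graph.put("job3", Collections.emptyList());
    graph.put("job4", Arrays.asList("job2"));
    System.out.println(hasCycle(graph)); // prints false

    graph.put("job3", Arrays.asList("job1")); // closes job1 -> job2 -> job3 -> job1
    System.out.println(hasCycle(graph)); // prints true
  }
}
```

Any linear order of a cyclic graph must contain at least one edge that points backwards, which is why a single pass over the edges is enough once the order exists.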

 

 

 

  was:
MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job
DAG. However, it is not implemented correctly in the `sort` phase.

In the while loop below:

 
{code:java}
// code placeholder
{code}
The idea of a topological sort is to visit the parent node followed by the
child node. If the given graph contains a cycle, then at least one node is
both a parent and a child, which breaks the topological order. Therefore,
after the topological sort, check whether every directed edge follows the
order.

 

 

 


> Mistaken cyclic check in JobControl Job DAG
> ---
>
> Key: MAPREDUCE-7306
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, mrv2
>Affects Versions: 3.0.0
>Reporter: Xiaomin Zhang
>Priority: Major
>






[jira] [Created] (MAPREDUCE-7306) Mistaken cyclic check in JobControl Job DAG

2020-11-16 Thread Xiaomin Zhang (Jira)
Xiaomin Zhang created MAPREDUCE-7306:


 Summary: Mistaken cyclic check in JobControl Job DAG
 Key: MAPREDUCE-7306
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7306
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client, mrv2
Affects Versions: 3.0.0
Reporter: Xiaomin Zhang


MAPREDUCE-4371 added a cyclic dependency check for a topologically sorted Job
DAG. However, it is not implemented correctly in the `sort` phase.

In the while loop below:

 
{code:java}
// code placeholder
{code}
The idea of a topological sort is to visit the parent node followed by the
child node. If the given graph contains a cycle, then at least one node is
both a parent and a child, which breaks the topological order. Therefore,
after the topological sort, check whether every directed edge follows the
order.

 

 

 


