[ 
https://issues.apache.org/jira/browse/SPARK-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Wendell updated SPARK-2086:
-----------------------------------

    Description: 
It would be nice if the toDebugString method of an RDD did a better job of 
explaining where shuffle boundaries occur in the lineage graph. One way to do 
this would be to only indent the tree at a shuffle boundary instead of 
indenting it for every parent. 

We can determine when a shuffle boundary occurs based on the type of dependency 
seen in the RDD.
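
As a rough illustration of the idea above, here is a minimal sketch in Python. It is a toy model, not Spark code: the Node and Dep classes, the is_shuffle flag, and the debug_lines function are all hypothetical stand-ins. In Spark itself the check would presumably be whether a dependency is a ShuffleDependency.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dep:
    """Hypothetical edge to a parent node in the lineage graph."""
    parent: "Node"
    # True models a shuffle dependency, False a narrow one (assumption).
    is_shuffle: bool = False


@dataclass
class Node:
    """Hypothetical stand-in for an RDD."""
    name: str
    deps: List[Dep] = field(default_factory=list)


def debug_lines(node: Node, depth: int = 0) -> List[str]:
    """Render the lineage, indenting only when a shuffle edge is crossed."""
    lines = ["  " * depth + node.name]
    for dep in node.deps:
        # Narrow dependencies stay at the current indentation level;
        # shuffle dependencies start a new, deeper level.
        child_depth = depth + 1 if dep.is_shuffle else depth
        lines.extend(debug_lines(dep.parent, child_depth))
    return lines


# Example lineage: textFile -> map -> (shuffle) -> reduceByKey -> filter
text = Node("textFile")
pairs = Node("map", [Dep(text)])
reduced = Node("reduceByKey", [Dep(pairs, is_shuffle=True)])
result = Node("filter", [Dep(reduced)])

print("\n".join(debug_lines(result)))
```

In this sketch, filter and reduceByKey print at the same level, while map and textFile are indented one step because they sit on the other side of the shuffle edge.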

  was:It would be nice if the toDebugString method of an RDD did a better job 
of explaining where shuffle boundaries occur in the lineage graph. One way to 
do this would be to only indent the tree at a shuffle boundary instead of 
indenting it for every parent. 


> Improve output of toDebugString to make shuffle boundaries more clear
> ---------------------------------------------------------------------
>
>                 Key: SPARK-2086
>                 URL: https://issues.apache.org/jira/browse/SPARK-2086
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Patrick Wendell
>            Assignee: Gregory Owen
>            Priority: Minor
>
> It would be nice if the toDebugString method of an RDD did a better job of 
> explaining where shuffle boundaries occur in the lineage graph. One way to do 
> this would be to only indent the tree at a shuffle boundary instead of 
> indenting it for every parent. 
> We can determine when a shuffle boundary occurs based on the type of 
> dependency seen in the RDD.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
