[jira] [Assigned] (SPARK-10008) Shuffle locality can take precedence over narrow dependencies for RDDs with both

2015-08-14 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia reassigned SPARK-10008:

Assignee: Matei Zaharia

> Shuffle locality can take precedence over narrow dependencies for RDDs with both
>
>         Key: SPARK-10008
>         URL: https://issues.apache.org/jira/browse/SPARK-10008
>     Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>    Reporter: Matei Zaharia
>    Assignee: Matei Zaharia
>
> The shuffle locality patch made the DAGScheduler aware of shuffle data, but 
> for RDDs that have both narrow and shuffle dependencies, it can cause them to 
> place tasks based on the shuffle dependency instead of the narrow one. This 
> case is common in iterative join-based algorithms like PageRank and ALS, 
> where one RDD is hash-partitioned and one isn't.
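
To make the reported behavior concrete, here is a small toy model in Python. It is not Spark code and not the DAGScheduler's actual logic: ToyRDD, preferred_locs, the narrow_first flag, and the host names are all hypothetical, and narrow dependencies are assumed to be one-to-one (child partition i reads parent partition i). It only illustrates how the order in which dependency types are consulted changes where a task is placed when an RDD has both a narrow and a shuffle dependency.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD; none of this is Spark's API."""
    name: str
    # partition index -> hosts that hold this partition (e.g. cached blocks)
    cached_locs: Dict[int, List[str]] = field(default_factory=dict)
    # parents reached through a one-to-one narrow dependency (assumption)
    narrow_parents: List["ToyRDD"] = field(default_factory=list)
    # reducer partition index -> hosts holding most of its shuffle map output
    shuffle_locs: Dict[int, List[str]] = field(default_factory=dict)


def preferred_locs(rdd: ToyRDD, partition: int, narrow_first: bool = True) -> List[str]:
    """Return preferred hosts for one partition of the toy RDD."""
    # An RDD's own known locations always win in this model.
    own = rdd.cached_locs.get(partition, [])
    if own:
        return own

    def from_narrow() -> List[str]:
        # Walk narrow parents and reuse the first non-empty answer.
        for parent in rdd.narrow_parents:
            locs = preferred_locs(parent, partition, narrow_first)
            if locs:
                return locs
        return []

    def from_shuffle() -> List[str]:
        return rdd.shuffle_locs.get(partition, [])

    # The ordering here is the whole point of the illustration.
    lookups = (from_narrow, from_shuffle) if narrow_first else (from_shuffle, from_narrow)
    for lookup in lookups:
        locs = lookup()
        if locs:
            return locs
    return []


# A hash-partitioned, cached RDD joined with a shuffled one, as in PageRank.
links = ToyRDD("links", cached_locs={0: ["host-a"]})
joined = ToyRDD("joined", narrow_parents=[links], shuffle_locs={0: ["host-b"]})

shuffle_wins = preferred_locs(joined, 0, narrow_first=False)
narrow_wins = preferred_locs(joined, 0, narrow_first=True)
```

In this model, consulting the shuffle dependency first sends the task for partition 0 to the host with the shuffle output, away from the host that holds the cached, hash-partitioned parent data. Consulting the narrow dependency first keeps the task next to that data.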



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-10008) Shuffle locality can take precedence over narrow dependencies for RDDs with both

2015-08-14 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-10008:


Assignee: Apache Spark

> Shuffle locality can take precedence over narrow dependencies for RDDs with both
>
>         Key: SPARK-10008
>         URL: https://issues.apache.org/jira/browse/SPARK-10008
>     Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>    Reporter: Matei Zaharia
>    Assignee: Apache Spark
>
> The shuffle locality patch made the DAGScheduler aware of shuffle data, but 
> for RDDs that have both narrow and shuffle dependencies, it can cause them to 
> place tasks based on the shuffle dependency instead of the narrow one. This 
> case is common in iterative join-based algorithms like PageRank and ALS, 
> where one RDD is hash-partitioned and one isn't.






[jira] [Assigned] (SPARK-10008) Shuffle locality can take precedence over narrow dependencies for RDDs with both

2015-08-14 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-10008:


Assignee: (was: Apache Spark)

> Shuffle locality can take precedence over narrow dependencies for RDDs with both
>
>         Key: SPARK-10008
>         URL: https://issues.apache.org/jira/browse/SPARK-10008
>     Project: Spark
>  Issue Type: Bug
>  Components: Scheduler
>    Reporter: Matei Zaharia
>
> The shuffle locality patch made the DAGScheduler aware of shuffle data, but 
> for RDDs that have both narrow and shuffle dependencies, it can cause them to 
> place tasks based on the shuffle dependency instead of the narrow one. This 
> case is common in iterative join-based algorithms like PageRank and ALS, 
> where one RDD is hash-partitioned and one isn't.


