[jira] [Updated] (SPARK-9862) Join: Handling data skew

2016-10-13 Thread wangyuhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangyuhu updated SPARK-9862:

Attachment: Handling skew data in join.pdf

> Join: Handling data skew
> 
>
> Key: SPARK-9862
> URL: https://issues.apache.org/jira/browse/SPARK-9862
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Yin Huai
> Attachments: Handling skew data in join.pdf
>
>
> For a two-way shuffle join, if one or more groups are skewed in one table 
> (say, the left table) but have a relatively small number of rows in the other 
> table (say, the right table), we can use a broadcast join for these skewed 
> groups and a shuffle join for the other groups.
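
The idea in the description above can be sketched in plain Python. This is only an illustration of the technique under stated assumptions, not Spark's implementation or the design in the attached PDF: the function name `skew_aware_join`, the `skew_threshold` parameter, and the "count rows per key" skew test are all hypothetical, and tables are modelled as lists of (key, value) pairs. The "shuffle" is simulated by hash-partitioning both sides in memory.

```python
from collections import Counter, defaultdict


def skew_aware_join(left, right, skew_threshold, num_partitions=4):
    """Inner-join two lists of (key, value) pairs, handling skewed keys separately."""
    # Hypothetical skew test: a key counts as skewed when it occurs in the
    # left table more than skew_threshold times.
    left_counts = Counter(k for k, _ in left)
    skewed = {k for k, n in left_counts.items() if n > skew_threshold}

    # "Broadcast" side: right-table rows for the skewed keys, assumed small
    # enough to be copied to every worker as an in-memory hash map.
    broadcast = defaultdict(list)
    for k, rv in right:
        if k in skewed:
            broadcast[k].append(rv)

    result = []
    # Skewed left rows are not shuffled; they probe the broadcast map in place.
    for k, lv in left:
        if k in skewed:
            result.extend((k, lv, rv) for rv in broadcast.get(k, []))

    # Remaining rows: simulate a shuffle join by hash-partitioning both sides
    # so that matching keys land in the same partition.
    left_parts = [[] for _ in range(num_partitions)]
    right_parts = [defaultdict(list) for _ in range(num_partitions)]
    for k, lv in left:
        if k not in skewed:
            left_parts[hash(k) % num_partitions].append((k, lv))
    for k, rv in right:
        if k not in skewed:
            right_parts[hash(k) % num_partitions][k].append(rv)
    for part, table in zip(left_parts, right_parts):
        for k, lv in part:
            result.extend((k, lv, rv) for rv in table.get(k, []))
    return result
```

Because every key is handled by exactly one of the two paths, the union of both partial results should equal an ordinary inner join. The benefit is that the many left-side rows of a skewed key are never funnelled into a single shuffle partition.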



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-9862) Join: Handling data skew

2015-10-12 Thread Reynold Xin (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-9862:
---
Assignee: (was: Yin Huai)

> Join: Handling data skew
> 
>
> Key: SPARK-9862
> URL: https://issues.apache.org/jira/browse/SPARK-9862
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Yin Huai
>
> For a two-way shuffle join, if one or more groups are skewed in one table 
> (say, the left table) but have a relatively small number of rows in the other 
> table (say, the right table), we can use a broadcast join for these skewed 
> groups and a shuffle join for the other groups.


