[ 
https://issues.apache.org/jira/browse/PIG-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754306#action_12754306
 ] 

Olga Natkovich commented on PIG-955:
------------------------------------

Hi Ying,

Thanks for the patch. From the description it is not clear what kind of scripts 
would be affected by this issue. Adding an example to the JIRA description 
would be helpful.

Also, the patch needs a unit test.

> Skewed join generates  incorrect results 
> -----------------------------------------
>
>                 Key: PIG-955
>                 URL: https://issues.apache.org/jira/browse/PIG-955
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Ying He
>         Attachments: PIG-955.patch
>
>
> Fragmented replicated join has a few limitations:
>  - One of the tables needs to be loaded into memory
>  - Join is limited to two tables
> Skewed join partitions the table and joins the records in the reduce phase. 
> It computes a histogram of the key space to account for skewing in the input 
> records. Further, it adjusts the number of reducers depending on the key 
> distribution.
> We need to implement the skewed join in pig.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
