[ https://issues.apache.org/jira/browse/PIG-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753888#action_12753888 ]
Ying He commented on PIG-954:
-----------------------------

The sampling job fails when pig.skewedjoin.reduce.memusage is not configured in the Pig properties file.

> Skewed join fails when pig.skewedjoin.reduce.memusage is not configured
> -----------------------------------------------------------------------
>
>                 Key: PIG-954
>                 URL: https://issues.apache.org/jira/browse/PIG-954
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Ying He
>
> Fragmented replicated join has a few limitations:
>  - One of the tables needs to be loaded into memory
>  - Join is limited to two tables
> Skewed join partitions the table and joins the records in the reduce phase.
> It computes a histogram of the key space to account for skew in the input
> records. Further, it adjusts the number of reducers depending on the key
> distribution.
> We need to implement the skewed join in Pig.
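
One way to guard against the failure reported in the comment is to read pig.skewedjoin.reduce.memusage with a fallback instead of assuming it is always set. The following is a minimal sketch, not the actual Pig code or the fix that was committed: the class SkewedJoinConfig, the method reduceMemUsage, and the fallback value of 0.5 are all assumptions made for illustration. Only the property name comes from the issue.

```java
import java.util.Properties;

// Hypothetical helper, not part of Pig: reads the skewed-join memory
// fraction and falls back to a default when the property is absent.
final class SkewedJoinConfig {
    // Property name taken from the issue title.
    static final String MEMUSAGE_KEY = "pig.skewedjoin.reduce.memusage";

    // Assumed fallback value, chosen for illustration only.
    static final float DEFAULT_MEMUSAGE = 0.5f;

    private SkewedJoinConfig() {}

    static float reduceMemUsage(Properties props) {
        String raw = props.getProperty(MEMUSAGE_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            // Property not configured: use the default rather than failing.
            return DEFAULT_MEMUSAGE;
        }
        // Throws NumberFormatException if the value is not a number.
        float value = Float.parseFloat(raw.trim());
        // The value is treated as a fraction of memory, so reject anything
        // outside (0, 1]; the negated form also rejects NaN.
        if (!(value > 0.0f && value <= 1.0f)) {
            throw new IllegalArgumentException(
                MEMUSAGE_KEY + " must be in (0, 1], got " + raw);
        }
        return value;
    }
}
```

With an empty Properties object the helper returns the assumed default, and with the property set to, say, 0.3 it returns that value. Whether a silent default or an explicit error message is the better behaviour for the sampling job is a design choice the issue does not settle.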