[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Attachment: (was: mj_phase2_1.patch)

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Status: Open (was: Patch Available)

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor
    Attachments: mj_phase2_1.patch

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Attachment: (was: mj_phase2_1.patch)

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor
    Attachments: mj_phase2_1.patch

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Attachment: mj_phase2_1.patch

Updated patch addressing Pradeep's comments.

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor
    Attachments: mj_phase2_1.patch

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Status: Patch Available (was: Open)

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor
    Attachments: mj_phase2_1.patch

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.
[jira] Updated: (PIG-926) Merge-Join phase 2
[ https://issues.apache.org/jira/browse/PIG-926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-926:
    Attachment: mj_phase2_1.patch

The attached first patch runs the full right-side pipeline in the indexer before sampling a tuple from each block. This has the following advantages:

a) It addresses the concern Pradeep pointed out in phase-1: Strictly we should not allow LOForeach since it could change sort order or position of join keys and hence invalidate the index - but we need it so that the Foreach introduced by the TypeCastInserter when there is a schema for either of the inputs remains. Since the pipeline now runs before the tuple is sampled, this becomes a non-issue.

b) Currently, type information does not reach the POSort that sorts the index entries in the reduce task of the index job. This works for other reasons, but the patch fixes it.

c) It improves performance. Instead of always sampling the first record of the block, the index now contains an entry for the first record in the block for which a join may happen. This saves the time spent fetching right-side tuples over the network that could not be joined in any case.

Merge-Join phase 2
    Key: PIG-926
    URL: https://issues.apache.org/jira/browse/PIG-926
    Project: Pig
    Issue Type: Improvement
    Components: impl
    Reporter: Ashutosh Chauhan
    Assignee: Ashutosh Chauhan
    Priority: Minor
    Attachments: mj_phase2_1.patch

This jira is created to keep track of phase-2 work for MergeJoin. Various limitations exist in phase-1 for Merge Join which are listed on:
http://wiki.apache.org/pig/PigMergeJoin
Those will be addressed here.
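
The indexing change described in the patch comment (run the right-side pipeline first, then sample) can be illustrated with a minimal in-memory sketch. This is not Pig's code: `build_index`, `cast_and_filter`, and the block layout are hypothetical names and data invented for illustration, and the real index is built by a MapReduce job rather than a Python loop. The sketch only assumes that the right input is already sorted on the join key and split into blocks.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Record = Tuple  # a raw or processed tuple; hypothetical stand-in for a Pig tuple


def build_index(
    blocks: Sequence[Sequence[Record]],
    pipeline: Callable[[Record], Optional[Record]],
    key_of: Callable[[Record], object],
) -> List[Tuple[object, int, int]]:
    """Return sorted (join_key, block_id, offset) entries, one per block.

    Each record is pushed through the pipeline *before* sampling, so the
    entry describes the first record of the block that survives the
    pipeline, not simply the first raw record.
    """
    index = []
    for block_id, block in enumerate(blocks):
        for offset, raw in enumerate(block):
            processed = pipeline(raw)
            if processed is not None:  # first record that could take part in the join
                index.append((key_of(processed), block_id, offset))
                break  # one index entry per block
    # Keys are already typed by the pipeline, so they sort as typed values.
    index.sort(key=lambda entry: entry[0])
    return index


def cast_and_filter(raw: Record) -> Optional[Record]:
    """Hypothetical pipeline: cast the key to int and drop rows with an empty value."""
    key, value = raw
    if value == "":
        return None
    return (int(key), value)


# Right-side input, sorted on the (numeric) join key and split into blocks.
blocks = [
    [("1", ""), ("2", "a"), ("3", "b")],
    [("5", ""), ("7", ""), ("8", "c")],
    [("9", "d"), ("10", "e")],
    [("11", "")],  # nothing survives the pipeline, so no index entry
]

index = build_index(blocks, cast_and_filter, key_of=lambda t: t[0])
print(index)  # [(2, 0, 1), (8, 1, 2), (9, 2, 0)]
```

In this toy data, sampling the first raw record of each block would have indexed keys 1 and 5, which the pipeline later discards. Indexing the first surviving record instead points the join directly at keys 2 and 8.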