[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin
[ https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756654#action_12756654 ]

Alan Gates commented on PIG-951:

I'll be reviewing this patch.

> Reset parallelism to 1 for indexing job in MergeJoin
>
> Key: PIG-951
> URL: https://issues.apache.org/jira/browse/PIG-951
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: pig-951.patch
>
> After sampling one tuple from every block, one reducer is used to sort the
> index entries in the reduce phase, producing the sorted index used by the
> actual join job. Thus, the parallelism of the index job should be explicitly
> set to 1. Currently, it is not.
>
> Currently, this is a non-issue, since we don't allow any blocking operators
> in the pipeline before merge-join. However, once we do allow blocking
> operators, the parallelism of the indexing job will be that of the preceding
> blocking operator. Even then, the job will complete successfully, because all
> tuples will go to only one reducer: we are grouping on only one key, "all".
> However, it will waste cluster resources by starting all the extra reducers,
> which get no data and thus do nothing.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
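The behavior described in the issue can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Pig's implementation: the function names (build_index, partition, run_index_job) are hypothetical, and zlib.crc32 stands in for whatever hash partitioner the real job uses. It shows that when every index entry is grouped under the single key "all", the sorted index is the same for any reducer count, and every reducer beyond the first receives no data.

```python
import zlib
from collections import defaultdict


def build_index(blocks):
    """Sample the first join key of every non-empty block as (key, block_id)."""
    return [(block[0], block_id) for block_id, block in enumerate(blocks) if block]


def partition(key, num_reducers):
    """Deterministic stand-in for a hash partitioner (an assumption, not Pig code)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def run_index_job(blocks, num_reducers):
    """Route every index entry under the constant key "all", then sort per reducer."""
    reducers = defaultdict(list)
    for entry in build_index(blocks):
        reducers[partition("all", num_reducers)].append(entry)
    sorted_index = [sorted(entries) for entries in reducers.values()]
    busy = len(reducers)            # reducers that received at least one entry
    idle = num_reducers - busy      # reducers started for nothing
    return sorted_index, busy, idle


if __name__ == "__main__":
    # Three blocks, each internally sorted, listed in arbitrary order.
    blocks = [[30, 35], [10, 12], [20, 25]]
    for n in (1, 10):
        index, busy, idle = run_index_job(blocks, n)
        print(f"reducers={n} busy={busy} idle={idle} index={index}")
```

With ten reducers the result matches the single-reducer run, but nine of the ten reducers do no work, which is the wasted resource the issue proposes to avoid by setting parallelism to 1.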
[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin
[ https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753941#action_12753941 ]

Hadoop QA commented on PIG-951:
---

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12419132/pig-951.patch
against trunk revision 813601.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/console

This message is automatically generated.