[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin

Hadoop QA (JIRA) Thu, 10 Sep 2009 18:34:22 -0700

    [ https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753941#action_12753941 ]


Hadoop QA commented on PIG-951:
-------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12419132/pig-951.patch
  against trunk revision 813601.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/console

This message is automatically generated.

> Reset parallelism to 1 for indexing job in MergeJoin
> ----------------------------------------------------
>
>                 Key: PIG-951
>                 URL: https://issues.apache.org/jira/browse/PIG-951
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>         Attachments: pig-951.patch
>
>
> After sampling one tuple from every block, a single reducer sorts the 
> index entries in the reduce phase to produce the sorted index used by the 
> actual join job. Thus, the parallelism of the index job should be explicitly 
> set to 1. Currently, it is not.
> Currently, this is a non-issue, since we don't allow any blocking operators 
> in the pipeline before merge-join. However, once we do allow blocking 
> operators, the parallelism of the indexing job will be that of the preceding 
> blocking operator. Even then, the job will complete successfully because all 
> tuples will go to only one reducer, since we are grouping on only one key, 
> "all". However, it will waste cluster resources by starting all the extra 
> reducers, which get no data and thus do nothing.
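
The effect described in the issue can be illustrated with a small standalone sketch. It is not Pig's code and does not use the Hadoop API. It only reuses the formula of Hadoop's default HashPartitioner, (hashCode & Integer.MAX_VALUE) % numReduceTasks, to show that records sharing the single key "all" always land on one reducer, whatever the parallelism. The class name, method names, and the figures of 1000 index entries and 20 reducers are assumptions made up for the illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from the PIG-951 patch.
public class IndexJobPartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Route numIndexEntries records, all keyed "all", and count how many
    // records each reducer receives.
    static int[] recordsPerReducer(int numIndexEntries, int numReducers) {
        int[] counts = new int[numReducers];
        for (int i = 0; i < numIndexEntries; i++) {
            counts[partitionFor("all", numReducers)]++;
        }
        return counts;
    }

    // Number of reducers that were started but received no data.
    static long idleReducers(int[] counts) {
        return Arrays.stream(counts).filter(c -> c == 0).count();
    }

    public static void main(String[] args) {
        // Parallelism inherited from a preceding blocking operator (assumed 20).
        int[] inherited = recordsPerReducer(1000, 20);
        System.out.println("parallelism 20: idle reducers = " + idleReducers(inherited)); // 19

        // Parallelism reset to 1, as the issue proposes.
        int[] reset = recordsPerReducer(1000, 1);
        System.out.println("parallelism 1: idle reducers = " + idleReducers(reset)); // 0
    }
}
```

In both runs the one busy reducer sees every index entry, so the sorted index comes out the same. The only difference is the number of reducers launched for nothing.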

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
