[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin

2009-09-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756654#action_12756654
 ] 

Alan Gates commented on PIG-951:


I'll be reviewing this patch.

> Reset parallelism to 1 for indexing job in MergeJoin
> 
>
> Key: PIG-951
> URL: https://issues.apache.org/jira/browse/PIG-951
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: pig-951.patch
>
>
> After sampling one tuple from every block, a single reducer sorts the 
> index entries in the reduce phase to produce the sorted index used by the 
> actual join job. Thus, the parallelism of the index job should be explicitly 
> set to 1. Currently, it is not.
> This is a non-issue today, since we don't allow any blocking operators 
> in the pipeline before merge-join. However, once blocking operators are 
> allowed, the indexing job will inherit the parallelism of the preceding 
> blocking operator. Even then, the job will complete successfully, because 
> we group on a single key, "all", so all tuples go to one reducer. 
> It will, however, waste cluster resources by starting extra reducers 
> that receive no data and thus do nothing.
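
The resource waste described above can be illustrated with a small, self-contained sketch. It is not taken from the patch or from Pig's code: the class and method names (IndexJobPartitionSketch, partitionFor, idleReducers) are hypothetical, and the partition formula is assumed to match Hadoop's default hash partitioning, (hashCode & Integer.MAX_VALUE) % numReducers. It only shows that when every record carries the single key "all", exactly one reducer receives data regardless of the configured parallelism.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Pig or of pig-951.patch.
public class IndexJobPartitionSketch {

    // Assumed to mirror default hash partitioning:
    // clear the sign bit, then take the remainder by the reducer count.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Returns how many reducers would receive no records for the given keys.
    static int idleReducers(String[] keys, int numReducers) {
        Set<Integer> used = new HashSet<>();
        for (String key : keys) {
            used.add(partitionFor(key, numReducers));
        }
        return numReducers - used.size();
    }

    public static void main(String[] args) {
        // Every index entry is grouped on the same key, "all".
        String[] keys = {"all", "all", "all", "all"};
        for (int parallelism : new int[] {1, 10, 100}) {
            System.out.println("parallelism=" + parallelism
                    + " idle reducers=" + idleReducers(keys, parallelism));
        }
    }
}
```

With a single distinct key, the number of idle reducers is always the parallelism minus one, so a parallelism of 1 is the only setting that starts no idle reducers.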

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-951) Reset parallelism to 1 for indexing job in MergeJoin

2009-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753941#action_12753941
 ] 

Hadoop QA commented on PIG-951:
---

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12419132/pig-951.patch
  against trunk revision 813601.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/23/console

This message is automatically generated.

