[ https://issues.apache.org/jira/browse/PIG-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784763#action_12784763 ]

Hadoop QA commented on PIG-1116:
--------------------------------

+1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12426637/PIG-1116.patch
  against trunk revision 886015.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/74/console

This message is automatically generated.

> Remove redundant map-reduce job for merge join
> ----------------------------------------------
>
>                 Key: PIG-1116
>                 URL: https://issues.apache.org/jira/browse/PIG-1116
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Pradeep Kamath
>             Fix For: 0.6.0
>
>         Attachments: PIG-1116.patch
>
>
> In merge join, when we convert the right-hand-side file into a side file, we
> do not remove it from the map-reduce plan; we only disconnect it from the
> plan. When we run the query, the redundant load still loads the data but does
> nothing with it. This operation should be removed entirely.
> Eg:
> a = load '/user/pig/tests/data/zebra/singlefile/studentsortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, gpa);
> b = load '/user/pig/tests/data/zebra/singlefile/votersortedtab10k' using org.apache.hadoop.zebra.pig.TableLoader('', 'sorted') as (name, age, registration, contributions);
> c = join a by name, b by name using "merge";
> explain c;
> {code}
> #--------------------------------------------------
> # Map Reduce Plan
> #--------------------------------------------------
> MapReduce node 1-21
> Map Plan
> Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/votersortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-13--------
> Global sort: false
> ----------------
> MapReduce node 1-20
> Map Plan
> Store(fakefile:org.apache.pig.builtin.PigStorage) - 1-19
> |
> |---MergeJoin[tuple] - 1-16
>     |
>     |---Load(hdfs://wilbur20.labs.corp.sp1.yahoo.com:9020/user/pig/tests/data/zebra/singlefile/studentsortedtab10k:org.apache.hadoop.zebra.pig.TableLoader('','sorted')) - 1-12--------
> Global sort: false
> ----------------
> {code}
> 1-21 should be removed.
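
To illustrate the difference between disconnecting a node and removing it, here is a minimal Python sketch of a toy plan graph. It is not Pig's code: the Plan class and its add, connect, disconnect and remove methods are hypothetical names invented for this illustration, and the edge from 1-21 to 1-20 is an assumption based on the description above. Only the node labels 1-20 and 1-21 come from the explain output.

```python
class Plan:
    """Toy stand-in for a map-reduce plan: a set of nodes plus directed edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # (predecessor, successor) pairs

    def add(self, node):
        self.nodes.add(node)

    def connect(self, src, dst):
        self.edges.add((src, dst))

    def disconnect(self, src, dst):
        # Drops only the edge; both nodes remain in the plan.
        self.edges.discard((src, dst))

    def remove(self, node):
        # Drops the node and every edge that touches it.
        self.nodes.discard(node)
        self.edges = {(s, d) for (s, d) in self.edges if node not in (s, d)}


def build_plan():
    plan = Plan()
    plan.add("1-21")  # job that only loads the right-hand input
    plan.add("1-20")  # job that performs the merge join
    plan.connect("1-21", "1-20")  # assumed original dependency
    return plan


# Reported behaviour: the load job is disconnected but still listed in the plan.
buggy = build_plan()
buggy.disconnect("1-21", "1-20")
print(sorted(buggy.nodes))  # ['1-20', '1-21']

# Requested behaviour: the load job is gone from the plan entirely.
fixed = build_plan()
fixed.remove("1-21")
print(sorted(fixed.nodes))  # ['1-20']
```

In the sketch, a runner that walks every node in the plan would still launch 1-21 in the first case, which matches the redundant load described in the issue.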