[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789440#action_12789440 ]

Alan Gates commented on PIG-1142:
---------------------------------

Additional test cases look good. +1

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1142-1.patch, PIG-1142-2.patch, PIG-1142-3.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina generate $0 as count, $1 as seed, $2 as int1, $3 as str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using org.apache.hadoop.zebra.pig.TableStorer('');
>
> Here is the stacktrace:
> java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:159)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789388#action_12789388 ]

Olga Natkovich commented on PIG-1142:
-------------------------------------

+1, Daniel, code changes look good but do we need to add more unit tests to cover them?

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1142-1.patch, PIG-1142-2.patch
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789215#action_12789215 ]

Hadoop QA commented on PIG-1142:
--------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12427671/PIG-1142-2.patch
  against trunk revision 889346.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/console

This message is automatically generated.

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1142-1.patch, PIG-1142-2.patch
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788949#action_12788949 ]

Jing Huang commented on PIG-1142:
---------------------------------

Verified fix on pig 0.6.0 branch. Fix works.

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.6.0
>
>         Attachments: PIG-1142-1.patch
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788511#action_12788511 ]

Olga Natkovich commented on PIG-1142:
-------------------------------------

+1. Changes look good.

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1142-1.patch
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788505#action_12788505 ]

Hadoop QA commented on PIG-1142:
--------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12427539/PIG-1142-1.patch
  against trunk revision 52.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/console

This message is automatically generated.

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1142-1.patch
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788458#action_12788458 ]

Daniel Dai commented on PIG-1142:
---------------------------------

To make it clear, this issue happens when we prune more than one column in front of the join key.

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: PIG-1142-1.patch
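
Editor's illustration (not part of the JIRA thread and not taken from the patch): the condition "more than one column pruned in front of the join key" can be pictured with a small Python sketch. Both functions and their names are hypothetical. `shift_once` is an assumed faulty adjustment that moves the key index down by one whenever anything before it was pruned; `shift_by_count` moves it down by the number of pruned columns that precede it. The two agree when a single column is pruned and diverge when two are, which would match the `Projections: [1]` seen against a one-column schema in the plan quoted elsewhere in this thread. Whether Pig's pruning code actually failed this way is an assumption.

```python
def shift_once(index, pruned):
    """Hypothetical faulty adjustment: subtract 1 if any pruned column precedes index."""
    return index - 1 if any(p < index for p in pruned) else index


def shift_by_count(index, pruned):
    """Subtract the number of pruned columns that precede index."""
    return index - sum(1 for p in pruned if p < index)


key = 2  # join key is the third column, e.g. b2 in (b0, b1, b2)

# One column pruned in front of the key: both adjustments agree.
print(shift_once(key, [0]), shift_by_count(key, [0]))

# Two columns pruned in front of the key: only the counting version
# yields an index that is valid for the remaining one-column schema.
print(shift_once(key, [0, 1]), shift_by_count(key, [0, 1]))
```

With `pruned = [0, 1]` the remaining schema has a single column, so any index other than 0 is out of range.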
[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning
[ https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788415#action_12788415 ]

Daniel Dai commented on PIG-1142:
---------------------------------

I can reproduce it in regular join as well. Here is the script:

a = LOAD '1.txt' as (a0, a1, a2);
b = LOAD '2.txt' as (b0, b1, b2);
c = join a by a2, b by b2;
d = foreach c generate $0, $1, $2;
dump d;

The logical plan is wrong:
{code}
.
|---LOJoin 1-17 Schema: {a::a0: bytearray,a::a1: bytearray,a::a2: bytearray,b::b2: bytearray} Type: bag
    |   |
    |   Project 1-15 Projections: [2] Overloaded: false FieldSchema: a2: bytearray Type: bytearray
    |   Input: Load 1-13
    |   |
    |   Project 1-16 Projections: [1] Overloaded: false FieldSchema: Caught Exception: Attempt to fetch field 1 from schema of size 1 Type: Unknown
    |   Input: Load 1-14
    |
    |---Load 1-13 Schema: {a0: bytearray,a1: bytearray,a2: bytearray} Type: bag
    |
    |---Load 1-14 Schema: {b2: bytearray} Type: bag
{code}
The second project of LOJoin should project column 0

> Got NullPointerException merge join with pruning
> ------------------------------------------------
>
>                 Key: PIG-1142
>                 URL: https://issues.apache.org/jira/browse/PIG-1142
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Jing Huang
>             Fix For: 0.7.0
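
Editor's illustration (not part of the JIRA thread): the statement that the second project of LOJoin "should project column 0" follows from mapping the key's original position onto the pruned schema. A minimal Python sketch of that mapping, assuming the pruner records which original column indexes survive; the function name `remap_projection` and the list `b_kept` are hypothetical and do not come from Pig's code base.

```python
def remap_projection(original_index, kept_columns):
    """Return the position of original_index within the pruned schema.

    kept_columns lists the original column indexes that survive pruning,
    in ascending order.
    """
    if original_index not in kept_columns:
        raise ValueError(
            f"column {original_index} was pruned but is still referenced"
        )
    return kept_columns.index(original_index)


# Relation b is loaded as (b0, b1, b2); after pruning only b2 remains,
# as in the plan's "Load 1-14 Schema: {b2: bytearray}".
b_kept = [2]

# The join key b2 was column 2 before pruning; in the pruned schema it is
# the only column, so the projection has to point at position 0.
print(remap_projection(2, b_kept))
```

Relation a keeps all three columns in the quoted plan, so the same mapping leaves its key projection at 2, which is what `Project 1-15 Projections: [2]` shows.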