[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-11 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789440#action_12789440
 ] 

Alan Gates commented on PIG-1142:
-

Additional test cases look good.  +1

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-1142-1.patch, PIG-1142-2.patch, PIG-1142-3.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-11 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789388#action_12789388
 ] 

Olga Natkovich commented on PIG-1142:
-

+1, Daniel, code changes look good but do we need to add more unit tests to 
cover them?

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-1142-1.patch, PIG-1142-2.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789215#action_12789215
 ] 

Hadoop QA commented on PIG-1142:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427671/PIG-1142-2.patch
  against trunk revision 889346.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/115/console

This message is automatically generated.

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-1142-1.patch, PIG-1142-2.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-10 Thread Jing Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788949#action_12788949
 ] 

Jing Huang commented on PIG-1142:
-

Verified fix on pig 0.6.0 branch. 
Fix works.

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.6.0
>
> Attachments: PIG-1142-1.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-09 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788511#action_12788511
 ] 

Olga Natkovich commented on PIG-1142:
-

+1. Changes look good.

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.7.0
>
> Attachments: PIG-1142-1.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788505#action_12788505
 ] 

Hadoop QA commented on PIG-1142:


+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427539/PIG-1142-1.patch
  against trunk revision 52.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/110/console

This message is automatically generated.

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.7.0
>
> Attachments: PIG-1142-1.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788458#action_12788458
 ] 

Daniel Dai commented on PIG-1142:
-

To make it clear, this issue happens when we prune more than one columns in 
front of join key.

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
>Assignee: Daniel Dai
> Fix For: 0.7.0
>
> Attachments: PIG-1142-1.patch
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1142) Got NullPointerException merge join with pruning

2009-12-09 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788415#action_12788415
 ] 

Daniel Dai commented on PIG-1142:
-

I can reproduce it in regular join as well. Here is the script:
a = LOAD '1.txt' as (a0, a1, a2);
b = LOAD '2.txt' as (b0, b1, b2);
c = join a by a2, b by b2;
d = foreach c generate $0,  $1,  $2;
dump d;

The logical plan is wrong:
{code}
.
|---LOJoin 1-17 Schema: {a::a0: bytearray,a::a1: bytearray,a::a2: 
bytearray,b::b2: bytearray} Type: bag
|   |
|   Project 1-15 Projections: [2] Overloaded: false FieldSchema: a2: 
bytearray Type: bytearray
|   Input: Load 1-13
|   |
|   Project 1-16 Projections: [1] Overloaded: false FieldSchema: Caught 
Exception: Attempt to fetch field 1 from schema of size 1 Type: Unknown
|   Input: Load 1-14
|
|---Load 1-13 Schema: {a0: bytearray,a1: bytearray,a2: bytearray} Type: 
bag
|
|---Load 1-14 Schema: {b2: bytearray} Type: bag
{code}

The second project of LOJoin should project column 0

> Got NullPointerException merge join with pruning
> 
>
> Key: PIG-1142
> URL: https://issues.apache.org/jira/browse/PIG-1142
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.6.0
>Reporter: Jing Huang
> Fix For: 0.7.0
>
>
> Here is my pig script:
> register $zebraJar;
> --fs -rmr $outputDir
> a1 = LOAD '$inputDir/small1' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> a2 = LOAD '$inputDir/small2' USING 
> org.apache.hadoop.zebra.pig.TableLoader('count,seed,int1,str2');
> sort1 = order a1 by str2;
> sort2 = order a2 by str2;
> --store sort1 into '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> --store sort2 into '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableStorer('[count,seed,int1,str2]');
> rec1 = load '$outputDir/smallsorted11' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> rec2 = load '$outputDir/smallsorted21' using 
> org.apache.hadoop.zebra.pig.TableLoader();
> joina = join rec1 by str2, rec2 by str2 using "merge" ;
> E = foreach joina  generate $0 as count,  $1 as seed,  $2 as int1,  $3 as 
> str2;
> --limitedVals = LIMIT E 5;
> --dump limitedVals;
> store E into '$outputDir/smalljoin2' using 
> org.apache.hadoop.zebra.pig.TableStorer('');
> 
> Here is the stacktrace:
> java.lang.NullPointerException at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:312)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.extractKeysFromTuple(POMergeJoin.java:464)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POMergeJoin.getNext(POMergeJoin.java:341)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:260)
>  at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:237)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:253)
>  at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.close(PigMapBase.java:107)
>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:307) at 
> org.apache.hadoop.mapred.Child.main(Child.java:159) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.