[ https://issues.apache.org/jira/browse/PIG-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542032#comment-13542032 ]
Youngwook Kim commented on PIG-3060:
------------------------------------
Below is the sample input.
{code}
2 {}
2 {(x),(y),(z)}
{code}
When running the script in 0.10, I get
(0)
while the expected result is
(3)
When running the script in trunk, I get the error below.
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [PORelationToExprProject (Name: RelationToExpressionProject[bag][*] - scope-26 Operator Key: scope-26) children: null at []]: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:360)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:228)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:282)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:416)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:349)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:376)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Caused by: java.lang.NullPointerException
    at org.apache.pig.data.DefaultAbstractBag.getMemorySize(DefaultAbstractBag.java:155)
    at org.apache.pig.data.DefaultAbstractBag.markSpillableIfNecessary(DefaultAbstractBag.java:100)
    at org.apache.pig.data.DefaultAbstractBag.add(DefaultAbstractBag.java:92)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:440)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:583)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNext(PORelationToExprProject.java:107)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
    ... 14 more
> FLATTEN in nested foreach fails when the input contains an empty bag
> --------------------------------------------------------------------
>
> Key: PIG-3060
> URL: https://issues.apache.org/jira/browse/PIG-3060
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Reporter: Youngwook Kim
>
> FLATTEN inside a nested foreach produces wrong results if the input
> contains an empty bag.
> {code}
> A = load 'flatten.txt' as (a0:int, a1:bag{(t:chararray)});
> B = group A by a0;
> C = foreach B {
>     c1 = foreach A generate FLATTEN(a1);
>     generate COUNT(c1);
> };
> {code}
> The easy workaround is to filter out empty bags.
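> As a rough sketch of that workaround (not a confirmed fix), the empty bags
> could be dropped before grouping. This assumes Pig's built-in IsEmpty
> function and reuses the aliases from the script above; A1 is a name
> introduced here for illustration only.
> {code}
> A = load 'flatten.txt' as (a0:int, a1:bag{(t:chararray)});
> -- Assumption: drop rows whose bag is empty before they reach the nested foreach.
> A1 = filter A by NOT IsEmpty(a1);
> B = group A1 by a0;
> C = foreach B {
>     c1 = foreach A1 generate FLATTEN(a1);
>     generate COUNT(c1);
> };
> {code}
> On the sample input this leaves only the row with the three-element bag.
> Note that a key whose bags are all empty would then produce no output row
> at all, rather than a count of 0.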