[jira] [Commented] (PIG-3060) FLATTEN in nested foreach fails when the input contains an empty bag

Youngwook Kim (JIRA) Tue, 01 Jan 2013 23:02:14 -0800

    [ https://issues.apache.org/jira/browse/PIG-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542032#comment-13542032 ]


Youngwook Kim commented on PIG-3060:
------------------------------------

Below is the sample input.
{code}
2 {}
2 {(x),(y),(z)}
{code}

When running the script in 0.10, I get
(0)

while the expected result is
(3)

When running the script in trunk, I get the error below.

org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [PORelationToExprProject (Name: RelationToExpressionProject[bag][*] - scope-26 Operator Key: scope-26) children: null at []]: java.lang.NullPointerException
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:360)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:228)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:282)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:416)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:349)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:376)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:282)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.runPipeline(PigGenericMapReduce.java:465)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:433)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
Caused by: java.lang.NullPointerException
        at org.apache.pig.data.DefaultAbstractBag.getMemorySize(DefaultAbstractBag.java:155)
        at org.apache.pig.data.DefaultAbstractBag.markSpillableIfNecessary(DefaultAbstractBag.java:100)
        at org.apache.pig.data.DefaultAbstractBag.add(DefaultAbstractBag.java:92)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:440)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:583)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PORelationToExprProject.getNext(PORelationToExprProject.java:107)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:335)
        ... 14 more

                
> FLATTEN in nested foreach fails when the input contains an empty bag
> --------------------------------------------------------------------
>
>                 Key: PIG-3060
>                 URL: https://issues.apache.org/jira/browse/PIG-3060
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>            Reporter: Youngwook Kim
>
> FLATTEN inside a foreach statement produces wrong results, if the input 
> contains an empty bag.
> {code}
> A = load 'flatten.txt' as (a0:int, a1:bag{(t:chararray)});
> B = group A by a0;
> C = foreach B {
>   c1 = foreach A generate FLATTEN(a1);
>   generate COUNT(c1);
> };
> {code}
> The easy workaround is to filter out empty bags.
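
The workaround mentioned in the description might look like the sketch below. It assumes the built-in IsEmpty function is used to drop rows with an empty a1 before grouping; the alias A1 is introduced here only for illustration and is not part of the original script.
{code}
A = load 'flatten.txt' as (a0:int, a1:bag{(t:chararray)});
-- Assumed workaround: remove rows whose bag is empty before grouping.
A1 = filter A by not IsEmpty(a1);
B = group A1 by a0;
C = foreach B {
  c1 = foreach A1 generate FLATTEN(a1);
  generate COUNT(c1);
};
{code}
One side effect of filtering before the group: a key whose rows all carry empty bags no longer appears in the output at all, rather than appearing with a count of 0.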
