[ 
https://issues.apache.org/jira/browse/PIG-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765964#comment-13765964
 ] 

Koji Noguchi commented on PIG-3458:
-----------------------------------

The scalar gets lost because we store C using PigStorage, but ReadScalars 
tries to read it back with a hardcoded InterStorage. 

{noformat}
...
[First mapreduce job]
Reduce Plan
C: Store(/.../deleteme6_C:PigStorage(',')) - scope-17
|
...

[Second mapreduce job]
    |   POUserFunc(org.apache.pig.impl.builtin.ReadScalars)[long] - scope-31
    |   |
    |   |---Constant(0) - scope-29
    |   |
    |   |---Constant(/.../deleteme6_C) - scope-30
{noformat}
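
To make the mismatch concrete, here is a toy model in Python. It is not Pig's code: the byte layout, the `MAGIC` marker, and all function names are hypothetical stand-ins, and the assumption that a format mismatch yields a null rather than an error is only inferred from the empty field in the output. It shows only the general shape of the problem, where a reader hardcoded to a binary format is handed a delimited text file.

```python
import struct

# Hypothetical record marker for the toy binary format (not Pig's real one).
MAGIC = b"\x01\x02\x03"


def write_text(values, delimiter=","):
    """Stand-in for PigStorage(','): one delimited text line per tuple."""
    return (delimiter.join(str(v) for v in values) + "\n").encode("utf-8")


def write_binary(values):
    """Stand-in for InterStorage: marker, field count, then 8-byte longs."""
    body = b"".join(struct.pack(">q", v) for v in values)
    return MAGIC + struct.pack(">i", len(values)) + body


def read_binary_scalar(data, column=0):
    """Stand-in for ReadScalars with a hardcoded binary reader.

    Returns None when the bytes are not in the expected binary format.
    """
    if not data.startswith(MAGIC):
        return None
    (count,) = struct.unpack_from(">i", data, len(MAGIC))
    if column >= count:
        return None
    offset = len(MAGIC) + 4 + 8 * column
    (value,) = struct.unpack_from(">q", data, offset)
    return value


stored_as_text = write_text([20])      # what "store C ... PigStorage(',')" leaves behind
stored_as_binary = write_binary([20])  # what the scalar reader expects to find

print(read_binary_scalar(stored_as_text))    # None
print(read_binary_scalar(stored_as_binary))  # 20
```

In this model the text file holds the correct total, but the binary reader cannot interpret it, which mirrors C being written correctly while X gets an empty scalar.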

I'm trying to understand what the fix should be. I see two options:
1. Make ReadScalars use the Loader that corresponds to the store function.
2. Split relation 'C' so that we store it in both PigStorage AND InterStorage.

I'm guessing the latter, but would appreciate your feedback.
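
The two options can be contrasted in a small Python sketch. Again, this is a toy model and not Pig's implementation: the dictionary-based "files" and every function name here are hypothetical, chosen only to show where each fix would intervene.

```python
def store_text(value):
    """Toy PigStorage-style output: the value rendered as a text line."""
    return {"format": "text", "payload": f"{value}\n"}


def store_binary(value):
    """Toy InterStorage-style output: the value kept in its native type."""
    return {"format": "binary", "payload": value}


def read_scalar_hardcoded(stored):
    """Current behavior in this model: only the binary format is understood."""
    return stored["payload"] if stored["format"] == "binary" else None


# Option 1: the reader picks a loader matching how the file was stored.
LOADERS = {
    "text": lambda stored: int(stored["payload"].strip()),
    "binary": lambda stored: stored["payload"],
}


def read_scalar_matching_loader(stored):
    return LOADERS[stored["format"]](stored)


# Option 2: split the relation and write it twice, once for the user
# and once in the format the hardcoded reader expects.
def store_split(value):
    return store_text(value), store_binary(value)


user_output = store_text(20)
print(read_scalar_hardcoded(user_output))        # None (the reported bug)
print(read_scalar_matching_loader(user_output))  # 20 (option 1)

user_copy, scalar_copy = store_split(20)
print(read_scalar_hardcoded(scalar_copy))        # 20 (option 2)
```

Option 1 changes the reader and leaves the stored output alone, while option 2 leaves the reader alone at the cost of writing the relation a second time.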

                
> ScalarExpression lost with multiquery optimization
> --------------------------------------------------
>
>                 Key: PIG-3458
>                 URL: https://issues.apache.org/jira/browse/PIG-3458
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Koji Noguchi
>            Assignee: Koji Noguchi
>
> Our user reported an issue where their scalar results go missing when 
> the script has two store statements.
> {noformat}
> A = load 'test1.txt' using PigStorage('\t') as (a:chararray, count:long);
> B = group A all;
> C = foreach B generate SUM(A.count) as total ;
> store C into 'deleteme6_C' using PigStorage(',');
> Z = load 'test2.txt' using PigStorage('\t') as (a:chararray, id:chararray );
> Y = group Z by id;
> X = foreach Y generate group, C.total;
> store X into 'deleteme6_X' using PigStorage(',');
> ====Inputs
>  pig> cat test1.txt
> a       1
> b       2
> c       8
> d       9
>  pig> cat test2.txt
> a       z
> b       y
> c       x
>  pig>
> {noformat}
> Result X should contain the total count of '20', but that field is empty instead.
> {noformat}
>  pig> cat deleteme6_C/part-r-00000
> 20
>  pig> cat deleteme6_X/part-r-00000
> x,
> y,
> z,
>  pig>
> {noformat}
> This works if we take out the first "store C" statement.

