Incorrect data generated by diff of SUM
---------------------------------------

                 Key: PIG-1525
                 URL: https://issues.apache.org/jira/browse/PIG-1525
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.7.0
            Reporter: Richard Ding
            Assignee: Richard Ding
             Fix For: 0.8.0


Given the following data:

input1:

{code}
id9     0
{code}

input2:

{code}
id8     1
id9     1
{code}

the following Pig script

{code}
A = LOAD 'input1' AS (id:chararray, val:long);
B = LOAD 'input2' AS (id:chararray, val:long);
C = COGROUP A BY id, B BY id;
D = FOREACH C GENERATE group, SUM(B.val), SUM(A.val), (SUM(A.val) - SUM(B.val));
dump D;
{code}

generates incorrect data (the last field for id9 should be -1L, i.e. 0 - 1, not -2L):

{code}
(id8,1L,,)
(id9,1L,0L,-2L)
{code}
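For reference, the expected values can be cross-checked outside Pig. The following is a minimal Python sketch, not Pig code: the helper names (null_sum, null_sub, expected_rows) are hypothetical, and it assumes that SUM over an empty bag yields null and that arithmetic on null yields null, which is consistent with the empty fields in the id8 row above.

```python
from collections import defaultdict


def null_sum(values):
    """Sum a bag of numbers; an empty bag yields None (assumed null semantics)."""
    return sum(values) if values else None


def null_sub(a, b):
    """Subtract b from a; the result is None if either operand is None."""
    return None if a is None or b is None else a - b


def expected_rows(input1, input2):
    """Mimic COGROUP A BY id, B BY id followed by the FOREACH in the script."""
    groups = defaultdict(lambda: ([], []))  # id -> (A values, B values)
    for key, val in input1:
        groups[key][0].append(val)
    for key, val in input2:
        groups[key][1].append(val)

    rows = []
    for key in sorted(groups):
        a_vals, b_vals = groups[key]
        sum_a, sum_b = null_sum(a_vals), null_sum(b_vals)
        # Same column order as the script: group, SUM(B.val), SUM(A.val), difference
        rows.append((key, sum_b, sum_a, null_sub(sum_a, sum_b)))
    return rows


rows = expected_rows([("id9", 0)], [("id8", 1), ("id9", 1)])
print(rows)
```

Under these assumptions the id9 row ends in -1, so only the last field of the second output row above is wrong.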

The workaround is to compute the two sums in one FOREACH statement and take their difference in a second one:

{code}
D = FOREACH C GENERATE group, SUM(B.val) as b, SUM(A.val) as a;
E = FOREACH D GENERATE $0, b, a, (a-b);
{code}

