[ 
https://issues.apache.org/jira/browse/PIG-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-4646.
-----------------------------
       Resolution: Fixed
         Assignee: Daniel Dai
    Fix Version/s: 0.12.0

Thanks for reporting. 

The script runs correctly on 0.12.0 and later. I'm not sure which patch to 
credit for the fix. Marking as fixed.

> PushUpFilter should not push before nested projection with FILTER operators
> ---------------------------------------------------------------------------
>
>                 Key: PIG-4646
>                 URL: https://issues.apache.org/jira/browse/PIG-4646
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11.1
>            Reporter: Haishan Liu
>            Assignee: Daniel Dai
>             Fix For: 0.12.0
>
>
> Verified the problem in 0.11.1. In short, a filter should not be pushed before 
> a nested foreach that contains another filter operator. See the 
> following minimal example:
> {code}
> cat data;
> (1, {(1000, 'a'), (1001, 'b')})
> (2, {(2000, 'a'), (2001, 'b'), (2002, 'c')})
> A = load 'data' as (id:int, hits:{(score:int, name:chararray)});
> B = foreach A {
>   filtered = filter hits by score > 2000;
>   generate id, filtered;
> };
> dump B;
> (1,{})
> (2,{(2001,'b'),(2002,'c')})
> C = filter B by SIZE(filtered) > 0;
> dump C;
> (1,{})
> (2,{(2001,'b'),(2002,'c')})
> {code}
> The desired result can be achieved either by passing '-optimizer_off PushUpFilter' 
> when invoking Pig, or with the following workaround:
> {code}
> C = foreach B generate SIZE(filtered) as size, id, filtered;
> D = filter C by size > 0;
> E = foreach D generate id, filtered;
> dump E;
> (2,{(2001,'b'),(2002,'c')})
> {code}
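
For readers without a Pig 0.11.1 install, the two evaluation orders in the report can be modeled in plain Python. This is only an illustrative sketch, not Pig code: the helper names (nested_foreach, filter_nonempty) are hypothetical, and the "pushed" order is an assumption about what the optimizer effectively did, chosen because it reproduces the output shown in the report.

```python
# Plain-Python model of the example in the report (not Pig code).
# Each row is (id, hits), where hits is a list of (score, name) tuples.
rows = [
    (1, [(1000, "a"), (1001, "b")]),
    (2, [(2000, "a"), (2001, "b"), (2002, "c")]),
]

def nested_foreach(rows):
    # Models B: inside each bag, keep only hits with score > 2000.
    return [(rid, [h for h in hits if h[0] > 2000]) for rid, hits in rows]

def filter_nonempty(rows):
    # Models C: keep rows whose bag is non-empty (SIZE(bag) > 0).
    return [(rid, bag) for rid, bag in rows if len(bag) > 0]

# Intended order: nested filter first, then the SIZE filter.
correct = filter_nonempty(nested_foreach(rows))

# Assumed buggy order: the SIZE filter is applied before the nested
# foreach, so it sees the unfiltered bags and drops nothing.
pushed = nested_foreach(filter_nonempty(rows))

print(correct)
print(pushed)
```

Under these assumptions, `correct` contains only the row for id 2, while `pushed` still contains the row for id 1 with an empty bag, matching the `dump C` output in the report.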



