[ 
https://issues.apache.org/jira/browse/PIG-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-1494:
-----------------------------------

    Assignee: Swati Jain

> PIG Logical Optimization: Use CNF in PushUpFilter
> -------------------------------------------------
>
>                 Key: PIG-1494
>                 URL: https://issues.apache.org/jira/browse/PIG-1494
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.7.0
>            Reporter: Swati Jain
>            Assignee: Swati Jain
>            Priority: Minor
>             Fix For: 0.8.0
>
>
> The PushUpFilter rule cannot handle complicated boolean expressions.
> The SplitFilter rule splits one LOFilter into two only when the top-level
> operator is "AND"; it cannot split an LOFilter whose top-level operator is
> "OR". For example:
> *Example script:*
> A = load 'file_a' USING PigStorage(',') as (a1:int,a2:int,a3:int);
> B = load 'file_b' USING PigStorage(',') as (b1:int,b2:int,b3:int);
> C = load 'file_c' USING PigStorage(',') as (c1:int,c2:int,c3:int);
> J1 = JOIN B by b1, C by c1;
> J2 = JOIN J1 by $0, A by a1;
> D = *Filter J2 by ( (c1 < 10) AND (a3+b3 > 10) ) OR (c2 == 5);*
> explain D;
> In the above example, PushUpFilter cannot push any part of the filter
> condition across either join, because the expression as a whole references
> columns from all branches (inputs). If we convert the expression into
> "Conjunctive Normal Form" (CNF), the conjunct that references only columns
> of C, (c1 < 10) OR (c2 == 5), can be pushed below both joins. Here is the
> CNF expression for the highlighted line:
> ( (c1 < 10) OR (c2 == 5) ) AND ( (a3+b3 > 10) OR (c2 == 5) )
> *Suggestion:* It would be a good idea to convert LOFilter's boolean
> expression into CNF; it would then be easy to push individual parts
> (conjuncts) of the expression selectively. The SplitFilter rule would also
> no longer be needed if this utility were added to the PushUpFilter rule
> itself.
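
The conversion suggested above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Pig's implementation: the names Atom, And, Or, to_cnf, relations_of and show are hypothetical, the expression tree is built by hand rather than parsed, and only AND/OR are handled (no NOT). Distributing OR over AND can grow the expression exponentially in the worst case, so a real rule would likely need a size limit.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    text: str             # predicate text, e.g. "c1 < 10"
    relations: frozenset  # aliases whose columns the predicate reads (assumed known)


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Atom, And, Or]


def to_cnf(expr):
    """Return CNF as a list of clauses; each clause is a list of Atoms joined by OR."""
    if isinstance(expr, Atom):
        return [[expr]]
    if isinstance(expr, And):
        # The clauses of both sides are simply ANDed together.
        return to_cnf(expr.left) + to_cnf(expr.right)
    # Or: distribute over AND by pairing every left clause with every right clause.
    return [l + r for l in to_cnf(expr.left) for r in to_cnf(expr.right)]


def relations_of(clause):
    """All aliases referenced by one clause (conjunct)."""
    return frozenset().union(*(atom.relations for atom in clause))


def show(clause):
    return " OR ".join(f"({atom.text})" for atom in clause)


# The filter from the example script:
# ( (c1 < 10) AND (a3+b3 > 10) ) OR (c2 == 5)
FILTER = Or(
    And(Atom("c1 < 10", frozenset({"C"})),
        Atom("a3+b3 > 10", frozenset({"A", "B"}))),
    Atom("c2 == 5", frozenset({"C"})),
)

if __name__ == "__main__":
    for clause in to_cnf(FILTER):
        print(show(clause), "reads", sorted(relations_of(clause)))
```

A conjunct that reads a single relation (here only C) is a candidate for pushing below both joins; a conjunct that reads A, B and C has to stay above J2.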

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
