Cheolsoo Park created PIG-3510:
----------------------------------

             Summary: New filter extractor fails with more than one filter statement
                 Key: PIG-3510
                 URL: https://issues.apache.org/jira/browse/PIG-3510
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.12.0
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.12.1

This is a regression from PIG-3461 (rewrite of the partition filter optimizer). Here is an example that demonstrates the problem:

{code:title=two filters}
b = FILTER a BY (dateint >= 20130901 AND dateint <= 20131001);
c = FILTER b BY (event_id == 419 OR event_id == 418);
{code}

{code:title=one filter}
b = FILTER a BY (dateint >= 20130901 AND dateint <= 20131001) AND (event_id == 419 OR event_id == 418);
{code}

Both dateint and event_id are partition columns. In the one-filter case, the whole expression is pushed down, whereas in the two-filter case, only (event_id == 419 OR event_id == 418) is pushed down.

The reason is that the filter extractor keeps a single pushdown expression and, while visiting the 2nd statement, overwrites the one it extracted from the 1st statement.

{code}
private Expression pushdownExpr = null;
{code}

The old filter extractor kept the pushdown expressions in a list and combined them with AND at the end.

{code}
private ArrayList<Expression> pColConditions = new ArrayList<Expression>();
{code}
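The difference between the two behaviours can be illustrated with a minimal, self-contained Java sketch. This is not the actual Pig code: conditions are modelled as plain strings rather than Pig Expression objects, and the class and method names (PushdownSketch, extractOverwriting, extractAccumulating) are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; conditions are strings, not Pig Expression objects.
class PushdownSketch {

    // Mirrors the reported behaviour: a single slot that every visited
    // FILTER statement overwrites, so only the last condition survives.
    static String extractOverwriting(List<String> filterConditions) {
        String pushdownExpr = null;
        for (String cond : filterConditions) {
            pushdownExpr = cond; // the previously extracted condition is lost here
        }
        return pushdownExpr;
    }

    // Mirrors the old approach described above: collect every condition,
    // then combine them with AND once all statements have been visited.
    static String extractAccumulating(List<String> filterConditions) {
        List<String> pColConditions = new ArrayList<>();
        for (String cond : filterConditions) {
            pColConditions.add("(" + cond + ")");
        }
        return pColConditions.isEmpty() ? null : String.join(" AND ", pColConditions);
    }

    public static void main(String[] args) {
        // The two FILTER statements from the example, in visiting order.
        List<String> conds = Arrays.asList(
                "dateint >= 20130901 AND dateint <= 20131001",
                "event_id == 419 OR event_id == 418");
        System.out.println("overwriting:  " + extractOverwriting(conds));
        System.out.println("accumulating: " + extractAccumulating(conds));
    }
}
```

With the two conditions from the example, the overwriting variant returns only the event_id condition, while the accumulating variant returns the same combined expression as the one-filter case. A fix could equally AND each new condition onto the existing pushdown expression instead of keeping a list; the sketch does not assume which approach the actual patch takes.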