Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15462 )
Change subject: IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form
......................................................................

IMPALA-9183: Convert certain disjunctive predicates to conjunctive normal form

Added an expression rewrite rule to convert a disjunctive predicate
to conjunctive normal form (CNF). Converting to CNF enables multi-table
predicates that were only evaluated by a Join operator to be converted
into either single-table conjuncts that are eligible for predicate
pushdown to the scan operator or other multi-table conjuncts that are
eligible to be pushed to a Join below. This helps improve performance
for such queries.

Since converting to CNF expands the number of expressions, we place
a limit on the maximum number of CNF exprs (each AND is counted as
1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default
is 100) is exceeded, whatever expression was supplied to the rule is
returned without further transformation. A setting of -1 or 0 allows
an unlimited number of CNF exprs to be created, up to int32 max.

Another option, ENABLE_CNF_REWRITES, enables or disables the entire
rewrite. This is False by default until we have done more thorough
functional testing.
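As a rough illustration of the rewrite described above (this is not the
code in ConvertToCNFRule.java), the following self-contained Java sketch
distributes OR over AND and applies De Morgan's laws on a toy expression
tree. The class CnfSketch, its nested Expr types, and the maxConjuncts
parameter are hypothetical names. The sketch also assumes a simpler limit
check than the patch: it counts resulting conjuncts rather than AND
exprs, and returns the input unchanged once the limit is exceeded.

```java
import java.util.ArrayList;
import java.util.List;

final class CnfSketch {
  // Toy expression tree; hypothetical types, not Impala's Expr classes.
  static abstract class Expr {}

  static final class Pred extends Expr {
    final String name;
    Pred(String name) { this.name = name; }
    @Override public String toString() { return name; }
  }

  static final class Not extends Expr {
    final Expr child;
    Not(Expr child) { this.child = child; }
    @Override public String toString() { return "NOT(" + child + ")"; }
  }

  static final class And extends Expr {
    final Expr left, right;
    And(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public String toString() { return "(" + left + " AND " + right + ")"; }
  }

  static final class Or extends Expr {
    final Expr left, right;
    Or(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override public String toString() { return "(" + left + " OR " + right + ")"; }
  }

  // Signals that the conjunct limit was exceeded during conversion.
  static final class LimitExceeded extends RuntimeException {}

  // Returns the CNF of e as a list of conjuncts (implicitly ANDed).
  // If maxConjuncts > 0 and is exceeded, returns e unchanged as a single
  // conjunct. A value of 0 or less means no limit (assumption of this sketch).
  static List<Expr> toCnf(Expr e, int maxConjuncts) {
    try {
      return conjuncts(e, maxConjuncts);
    } catch (LimitExceeded ex) {
      List<Expr> unchanged = new ArrayList<>();
      unchanged.add(e);
      return unchanged;
    }
  }

  private static List<Expr> conjuncts(Expr e, int max) {
    List<Expr> out = new ArrayList<>();
    if (e instanceof And) {
      // The conjuncts of an AND are the conjuncts of both sides.
      And a = (And) e;
      out.addAll(conjuncts(a.left, max));
      out.addAll(conjuncts(a.right, max));
    } else if (e instanceof Or) {
      // Distribute OR over the conjuncts of both sides.
      Or o = (Or) e;
      out = distribute(conjuncts(o.left, max), conjuncts(o.right, max));
    } else if (e instanceof Not) {
      Expr c = ((Not) e).child;
      if (c instanceof Or) {
        // De Morgan: NOT(x OR y) == NOT(x) AND NOT(y)
        Or o = (Or) c;
        out.addAll(conjuncts(new Not(o.left), max));
        out.addAll(conjuncts(new Not(o.right), max));
      } else if (c instanceof And) {
        // De Morgan: NOT(x AND y) == NOT(x) OR NOT(y)
        And a = (And) c;
        out = distribute(conjuncts(new Not(a.left), max),
                         conjuncts(new Not(a.right), max));
      } else if (c instanceof Not) {
        // Double negation.
        out = conjuncts(((Not) c).child, max);
      } else {
        out.add(e);
      }
    } else {
      out.add(e);
    }
    if (max > 0 && out.size() > max) throw new LimitExceeded();
    return out;
  }

  // Builds (l OR r) for every pair of conjuncts from the two sides.
  private static List<Expr> distribute(List<Expr> left, List<Expr> right) {
    List<Expr> out = new ArrayList<>();
    for (Expr l : left) {
      for (Expr r : right) out.add(new Or(l, r));
    }
    return out;
  }
}
```

Each element of the returned list is one conjunct. A planner could then
examine every conjunct separately to decide whether it refers to a single
table and can be pushed down to a scan.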
Examples of rewrites:
 original: (a AND b) OR c
 rewritten: (a OR c) AND (b OR c)

 original: (a AND b) OR (c AND d)
 rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d)

 original: NOT(a OR b)
 rewritten: NOT(a) AND NOT(b)

Testing:
 - Added new unit tests with variations of disjunctive predicates
   and verified their Explain plans
 - Manually tested the result correctness on impala shell by running
   these queries with ENABLE_CNF_REWRITES enabled and disabled
 - Preliminary performance testing of TPC-DS q13 on a 10TB scale
   factor shows almost 5x improvement:
    Original baseline: 47.5 sec
    With this patch and CNF rewrite enabled: 9.4 sec

Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
---
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
10 files changed, 522 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/1
--
To view, visit http://gerrit.cloudera.org:8080/15462
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072
Gerrit-Change-Number: 15462
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <amsi...@cloudera.com>