[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing (tracking JIRA IMPALA-9539). Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Reviewed-on: http://gerrit.cloudera.org:8080/15462 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 831 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 09:18:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: I have a fix for the flakiness thankfully, so we'll get there eventually :) -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 05:57:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: > Patch Set 8: > > Same issue again, I'm not sure what suddenly is triggering it as it seems > unrelated to your patch. Sigh. let's see if third time is a charm. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 05:33:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 04:14:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5535/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 04:14:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 8: Same issue again, I'm not sure what suddenly is triggering it as it seems unrelated to your patch. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 04:14:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5527/ -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 24 Mar 2020 02:46:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5527/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 23 Mar 2020 21:49:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 23 Mar 2020 21:49:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 7: Agree this looks like a flaky test. I filed IMPALA-9547 to track. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 23 Mar 2020 21:49:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 7: > Patch Set 7: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5511/ Looking at the archive.zip for this build failure, the failure I see is the following in TestImpalaShell which seems unrelated to the code changes. However it doesn't look like PlannerTest was run. Was it supposed to ? [gw15] FAILED shell/test_shell_commandline.py::TestImpalaShell::test_socket_opening[table_format_and_file_extension: ('parquet', '.parq') | protocol: beeswax] -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 23 Mar 2020 20:59:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 7: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5511/ -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 21 Mar 2020 22:14:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 21 Mar 2020 17:12:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5511/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 21 Mar 2020 17:13:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 21 Mar 2020 17:13:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5547/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Mar 2020 20:54:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5546/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Mar 2020 20:52:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/15462/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15462/4//COMMIT_MSG@22 PS4, Line 22: enables or disables the entire rewrite. This is False by default until we > Do we have a follow-on JIRA to change the default? Created IMPALA-9539 and added a reference here. http://gerrit.cloudera.org:8080/#/c/15462/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15462/4/be/src/service/query-options.cc@924 PS4, Line 924: // Parse the preaggregation bytes limit and validate it > It'd be good to use the stricter parsing that some other options use, e.g. Done -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Mar 2020 20:12:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#6). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing (tracking JIRA IMPALA-9539). Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 831 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/6 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/15462/5/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/15462/5/common/thrift/ImpalaService.thrift@518 PS5, Line 518: MAX_CNF_EXPRS = 100 line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 20 Mar 2020 20:08:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#5). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing (tracking JIRA IMPALA-9539). Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 831 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/5 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 4: Code-Review+1 (2 comments) Only one minor thing I noticed on my second pass. http://gerrit.cloudera.org:8080/#/c/15462/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15462/4//COMMIT_MSG@22 PS4, Line 22: enables or disables the entire rewrite. This is False by default until we Do we have a follow-on JIRA to change the default? http://gerrit.cloudera.org:8080/#/c/15462/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/15462/4/be/src/service/query-options.cc@924 PS4, Line 924: query_options->__set_max_cnf_exprs(atoi(value.c_str())); It'd be good to use the stricter parsing that some other options use, e.g. FETCH_ROWS_TIMEOUT_MS which uses StringToInt(). -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 19 Mar 2020 23:53:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5520/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 23:43:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG@35 PS3, Line 35: Testing: > Maybe add Q13 to testdata/workloads/functional-planner/queries/PlannerTest/ I have added tpcds q13 to the tpcds-all.test and a duplicate q7, q19 to tpch-all.test. These qualify for the rewrite and all have a QUERYOPTIONS section with cnf rewrite enabled. For tpcds q13, there was no previous entry in tpcds-all.test (I see only 24 queries in that file out of 99 total, so the 'all' is misleading). Since the rewrite is disabled by default (at least temporarily), other tests in PlannerTest are not affected. However, when I ran it with it globally enabled, 5 out of 102 tests failed. Most of these were related to TPC-H q7, TPCH-q19 since they are run against different databases - e.g tpch_nested_parquet, tpch_kudu etc. Once notable change in the latest patch upload: I changed the default max_cnf_exprs to 0 (unlimited) because the above tests do generate hundreds of exprs. I cannot think of a good default at this time. If you think we should limit it by default, let me know. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java: http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@37 PS3, Line 37: * > The single table rewrites might be useful for pushing predicates down into Agreed. I have removed the wording 'primarily intended for testing.' The default is still multi tables because otherwise we would see a lot of plan diffs. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@132 PS3, Line 132: } else if (cpred.getOp() == CompoundPredicate.Operator.NOT) { > Consider factoring the logic of the two branches out into a common function Yes, my intent was to maintain the ordering of applying the distributive property which is left-to-right. I was able to factor out the code though, so it is moved to a separate function. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@136 PS3, Line 136: // predicate: NOT (a OR b) > I think it would be more concise to use Arrays.asList() to construct the li Done. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@156 PS3, Line 156: private Predicate createPredAndIncrementCount(Expr first_lhs, Expr second_lhs, > It would be good to factor this check out into a common function. Moved this code up so that it is common to both the OR and the NOT blocks. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 18 Mar 2020 23:28:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#4). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is unlimited) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing. Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Added TPC-H q7, q19 and TPC-DS q13 with the CNF rewrite enabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 12 files changed, 819 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/4 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15462/3//COMMIT_MSG@35 PS3, Line 35: Testing: Maybe add Q13 to testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test so we're testing this directly. Did any of the existing planner tests change at all? http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java: http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@37 PS3, Line 37: * it can work for single table predicates also (primarily intended for testing). The single table rewrites might be useful for pushing predicates down into the storage (a lot of optimisations like min-max filtering or pushdown into Kudu won't work for OR conjuncts). I think this is OK as-is but maybe worth thinking about as an extension. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@132 PS3, Line 132: } else if (rhs instanceof CompoundPredicate && Consider factoring the logic of the two branches out into a common function, since it's essentialyl the same with lhs and rhs swapped (although I guess the order of the expressions in the output is slightly different, so OK to ignore if that was your intent.) http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@136 PS3, Line 136: List disjuncts = new ArrayList<>(); I think it would be more concise to use Arrays.asList() to construct the list. That would make it significantly more readable IMO by reducing the visual noise. http://gerrit.cloudera.org:8080/#/c/15462/3/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@156 PS3, Line 156: if (forMultiTablesOnly_) { It would be good to factor this check out into a common function. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Mar 2020 22:58:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/5511/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 17 Mar 2020 18:41:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 3: > Patch Set 2: > > (4 comments) Fixed -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 17 Mar 2020 18:00:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 3: > Patch Set 1: > > (11 comments) Fixed -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 17 Mar 2020 17:59:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#3). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is 100) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing. Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test 10 files changed, 532 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/3 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5508/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 17 Mar 2020 17:11:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15462 ) Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java File fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java: http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@122 PS2, Line 122: (CompoundPredicate) CompoundPredicate.createDisjunctivePredicate(disjuncts); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@127 PS2, Line 127: (CompoundPredicate) CompoundPredicate.createDisjunctivePredicate(disjuncts); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@140 PS2, Line 140: (CompoundPredicate) CompoundPredicate.createDisjunctivePredicate(disjuncts); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/15462/2/fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java@145 PS2, Line 145: (CompoundPredicate) CompoundPredicate.createDisjunctivePredicate(disjuncts); line too long (92 > 90) -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 17 Mar 2020 16:26:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9183: Convert disjunctive predicates to conjunctive normal form
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/15462 to look at the new patch set (#2). Change subject: IMPALA-9183: Convert disjunctive predicates to conjunctive normal form .. IMPALA-9183: Convert disjunctive predicates to conjunctive normal form Added an expression rewrite rule to convert a disjunctive predicate to conjunctive normal form (CNF). Converting to CNF enables multi-table predicates that were only evaluated by a Join operator to be converted into either single-table conjuncts that are eligible for predicate pushdown to the scan operator or other multi-table conjuncts that are eligible to be pushed to a Join below. This helps improve performance for such queries. Since converting to CNF expands the number of expressions, we place a limit on the maximum number of CNF exprs (each AND is counted as 1 CNF expr) that are considered. Once the MAX_CNF_EXPRS limit (default is 100) is exceeded, whatever expression was supplied to the rule is returned without further transformation. A setting of -1 or 0 allows unlimited number of CNF exprs to be created upto int32 max. Another option ENABLE_CNF_REWRITES enables or disables the entire rewrite. This is False by default until we have done more thorough functional testing. Examples of rewrites: original: (a AND b) OR c rewritten: (a OR c) AND (b OR c) original: (a AND b) OR (c AND d) rewritten: (a OR c) AND (a OR d) AND (b OR c) AND (b OR d) original: NOT(a OR b) rewritten: NOT(a) AND NOT(b) Testing: - Added new unit tests with variations of disjunctive predicates and verified their Explain plans - Manually tested the result correctness on impala shell by running these queries with ENABLE_CNF_REWRITES enabled and disabled - Preliminary performance testing of TPC-DS q13 on a 10TB scale factor shows almost 5x improvement: Original baseline: 47.5 sec With this patch and CNF rewrite enabled: 9.4 sec Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java A fe/src/main/java/org/apache/impala/rewrite/ConvertToCNFRule.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test 10 files changed, 532 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/15462/2 -- To view, visit http://gerrit.cloudera.org:8080/15462 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a03cd7239333aaf375416ef5f2b7608fcd4a072 Gerrit-Change-Number: 15462 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins