[Impala-ASF-CR] WIP: IMPALA-9951: put sort before exchange in analytic plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16186 ) Change subject: WIP: IMPALA-9951: put sort before exchange in analytic plans .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6126/ -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Tue, 14 Jul 2020 06:14:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6587/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Jul 2020 03:41:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 17: Verified+1 Code-Review+2 The next patch was verified, verifying this too since I"m merging together. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Jul 2020 03:13:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Reviewed-on: http://gerrit.cloudera.org:8080/16140 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,686 insertions(+), 82 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 19 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Reviewed-on: http://gerrit.cloudera.org:8080/16128 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,287 insertions(+), 65 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 18 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 18: Verified+1 Code-Review+2 This just failed because of a merge marker that I failed to remove in the final line of a planner test. I removed that and reran the single affected test locally. I'm going to mark as verified to avoid the need to rerun. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 18 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Jul 2020 03:12:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, Thomas Tauber-Marshall, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#17). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,287 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/17 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 17: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6125/ -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 14 Jul 2020 02:56:16 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-9951: put sort before exchange in analytic plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16186 ) Change subject: WIP: IMPALA-9951: put sort before exchange in analytic plans .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Tue, 14 Jul 2020 02:23:39 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-9951: put sort before exchange in analytic plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16186 ) Change subject: WIP: IMPALA-9951: put sort before exchange in analytic plans .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/16186/4/tests/query_test/test_analytic_tpcds.py File tests/query_test/test_analytic_tpcds.py: http://gerrit.cloudera.org:8080/#/c/16186/4/tests/query_test/test_analytic_tpcds.py@48 PS4, Line 48: = flake8: E501 line too long (95 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Tue, 14 Jul 2020 01:56:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-9951: put sort before exchange in analytic plans
Hello Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16186 to look at the new patch set (#4). Change subject: WIP: IMPALA-9951: put sort before exchange in analytic plans .. WIP: IMPALA-9951: put sort before exchange in analytic plans It is enabled by default with mt_dop > 0 ENABLE_PRE_EXCHANGE_SORT forces it on or off. Testing: TODO: * End-to-end tests * TPC-DS planner tests. Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns-mt-dop.test M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M tests/query_test/test_analytic_tpcds.py 10 files changed, 2,494 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/16186/4 -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian
[Impala-ASF-CR] HACK - put sort before exchange in analytic plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16186 ) Change subject: HACK - put sort before exchange in analytic plans .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6585/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Tue, 14 Jul 2020 01:36:39 + Gerrit-HasComments: No
[Impala-ASF-CR] HACK - put sort before exchange in analytic plans
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16186 Change subject: HACK - put sort before exchange in analytic plans .. HACK - put sort before exchange in analytic plans It is enabled by default with mt_dop > 0 ENABLE_PRE_EXCHANGE_SORT forces it on or off. Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java 6 files changed, 68 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/16186/3 -- To view, visit http://gerrit.cloudera.org:8080/16186 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9e7ab40d6cd8ee47d1d31e9aac46eddcc46634b2 Gerrit-Change-Number: 16186 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 13: Code-Review+2 (1 comment) Thanks for adding bunch of tests! LGTM. Let's ship it! http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1541 PS11, Line 1541: /*isPartitionKeySc > I think you meant 'false', and you are right. And we also need to set it to Oops, yes, I mean 'false'... Thanks for addressing this. -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 14 Jul 2020 00:20:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 14 Jul 2020 00:10:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 14 Jul 2020 00:03:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. IMPALA-5534: Fix and enable experimental failure tests Moves the test_catalog_hms_failures.py and test_process_failures.py from the experimental tests to custom cluster tests. catalog_service/test_hms_failure.py is combined with custom_cluster/test_catalog_hms_failure.py as well in order to unify all tests for HMS failures. Several modifications to the tests were necessary to get them working again, but for the most part, the logic of the tests remained the same. A few additional fault tolerance tests (e.g. TestHiveMetaStoreFailure::test_hms_client_retries) were added as well. The overall goal is to increase the process failure test coverage for all components: impalads, statestore, catalogd, HMS, etc. test_restart_catalogd in test_process_failures.py fails due to IMPALA-9848, so it is skipped for now. Testing: * Ran new tests locally Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Reviewed-on: http://gerrit.cloudera.org:8080/16157 Reviewed-by: Sahil Takiar Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/service/Frontend.java D tests/catalog_service/test_hms_failure.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_hms_failures.py A tests/custom_cluster/test_process_failures.py D tests/experiments/test_catalog_hms_failures.py D tests/experiments/test_process_failures.py 7 files changed, 485 insertions(+), 466 deletions(-) Approvals: Sahil Takiar: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 5 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9478: Profiles should indicate if custom UDFs are being used
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16188 ) Change subject: IMPALA-9478: Profiles should indicate if custom UDFs are being used .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6584/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I79122e6cc74fd5a62c76962289a1615fbac2f345 Gerrit-Change-Number: 16188 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 23:59:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 23:56:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9478: Profiles should indicate if custom UDFs are being used
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16188 Change subject: IMPALA-9478: Profiles should indicate if custom UDFs are being used .. IMPALA-9478: Profiles should indicate if custom UDFs are being used Adds a marker to runtime profiles and explain plans indicating if custom (e.g. non-built in) user-defined functions are being used. For explain plans, a SQL-style comment is added after any function call. For runtime profiles, a new Frontend entry called "User Defined Functions (UDFs)" lists out all UDFs analyzed during planning. Take the following example: create function hive_lower(string) returns string location '/test-warehouse/hive-exec.jar' symbol='org.apache.hadoop.hive.ql.udf.UDFLower'; set explain_level=3; explain select * from functional.alltypes order by hive_lower(string_col); 01:SORT order by: default.hive_lower(string_col) /* JAVA UDF */ ASC materialized: default.hive_lower(string_col) /* JAVA UDF */ This shows up in the runtime profile as well. When the above query is actually run, the runtime profile includes the following entry: Frontend User Defined Functions (UDFs): default.hive_lower, default.hive_lower Testing: * Will add tests after confirming approach Change-Id: I79122e6cc74fd5a62c76962289a1615fbac2f345 --- M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/service/FrontendProfile.java 2 files changed, 23 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/16188/1 -- To view, visit http://gerrit.cloudera.org:8080/16188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I79122e6cc74fd5a62c76962289a1615fbac2f345 Gerrit-Change-Number: 16188 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT The patch for IMPALA-8954 failed to account for subqueries that could produce < 1 row. SelectStmt.returnsSingleRow() is confusing because it actually returns true if it returns *at most* one row. As a fix I split it into returnsExactlyOneRow() and returnsAtMostOneRow(), then used returnsExactlyOneRow() to determine if the subquery should instead be rewritten into a LEFT OUTER JOIN, which produces the correct result. CROSS JOIN is still preferred because it can be more freely reordered during planning. Testing: * Added planner tests for a range of scenarios where it can be rewritten as a CROSS JOIN and where it needs to be a LEFT OUTER JOIN for correctness. * Added some targeted end-to-end tests where the results were previously incorrect. Checked the behaviour against Hive and postgres. Ran exhaustive tests. Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Reviewed-on: http://gerrit.cloudera.org:8080/16171 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 8 files changed, 463 insertions(+), 26 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 22:38:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6583/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 22:21:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6582/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 22:12:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6581/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 22:10:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 16: Code-Review+2 carry +2 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:53:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 17: Code-Review+2 carry +2 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:54:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 17: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6125/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:54:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, Thomas Tauber-Marshall, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#16). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,288 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/16 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 15: Code-Review+2 (2 comments) carry +2 http://gerrit.cloudera.org:8080/#/c/16128/14/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16128/14/fe/src/main/java/org/apache/impala/analysis/Expr.java@1125 PS14, Line 1125: > nit: formatting Done http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@4235 PS14, Line 4235: # TODO: This plan could be improved by removing the ROLLUP in the subquery or by > Maybe mark this as a TODO? Done -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:53:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/16140/15/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/16140/15/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@132 PS15, Line 132: return (fnName.getDb() == null || fnName.getDb().equals(BuiltinsDb.NAME)) && > Maybe add a brief comment about what this case represents Done http://gerrit.cloudera.org:8080/#/c/16140/15/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test File testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test: http://gerrit.cloudera.org:8080/#/c/16140/15/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1293 PS15, Line 1293: #constants > typo Done -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:42:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Thomas Tauber-Marshall, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#16). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,686 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/16 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, Thomas Tauber-Marshall, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#15). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,287 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/15 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 14: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/16128/14/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/16128/14/fe/src/main/java/org/apache/impala/analysis/Expr.java@1125 PS14, Line 1125: : nit: formatting http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test File testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test: http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@9 PS14, Line 9: | group by: CASE valid_tid(1,2,3,4) WHEN 1 THEN int_col WHEN 2 THEN int_col WHEN 3 THEN int_col WHEN 4 THEN NULL END, CASE valid_tid(1,2,3,4) WHEN 1 THEN bool_col WHEN 2 THEN bool_col WHEN 3 THEN NULL WHEN 4 THEN NULL END, CASE valid_tid(1,2,3,4) WHEN 1 THEN string_col WHEN 2 THEN NULL WHEN 3 THEN NULL WHEN 4 THEN NULL END, CASE valid_tid(1,2,3,4) WHEN 1 THEN 1 WHEN 2 THEN 2 WHEN 3 THEN 3 WHEN 4 THEN 4 END Obviously doesn't need to be fixed in this patch, but it would be nice if we had a way to make this output more decipherable http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test File testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test: http://gerrit.cloudera.org:8080/#/c/16128/14/testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test@4235 PS14, Line 4235: # This plan could be improved by removing the ROLLUP in the subquery or by Maybe mark this as a TODO? -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 21:12:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 21:08:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/16179/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16179/4//COMMIT_MSG@9 PS4, Line 9: in a : timestamps typo http://gerrit.cloudera.org:8080/#/c/16179/4//COMMIT_MSG@10 PS4, Line 10: becouse same typo http://gerrit.cloudera.org:8080/#/c/16179/4//COMMIT_MSG@10 PS4, Line 10: becouse typo http://gerrit.cloudera.org:8080/#/c/16179/4//COMMIT_MSG@12 PS4, Line 12: Updated files connected to the drop of dateless timestamps. I don't see the point of this sentence. Here we don't know what files have been changed by that other patch so this doesn't bring too much value. http://gerrit.cloudera.org:8080/#/c/16179/4/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/16179/4/be/src/exprs/expr-test.cc@3318 PS4, Line 3318: //IMPALA-9531 nit: leave a space after // http://gerrit.cloudera.org:8080/#/c/16179/4/be/src/exprs/expr-test.cc@3324 PS4, Line 3324: TestIsNull I don't understand. In Patch Set 3 this was a positive test, checking that this input results in a valid output. Then in PS4 you changed this to expect null instead without having any relevant code changes. Was it failing in PS3 or is it failing now in PS4? -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 19:51:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 15: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/16140/15/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/16140/15/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@132 PS15, Line 132: return (fnName.getDb() == null || fnName.getDb().equals(BuiltinsDb.NAME)) && Maybe add a brief comment about what this case represents http://gerrit.cloudera.org:8080/#/c/16140/15/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test File testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test: http://gerrit.cloudera.org:8080/#/c/16140/15/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1293 PS15, Line 1293: #constants typo -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 19:43:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6580/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 19:30:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6579/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 19:16:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6578/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 19:07:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6576/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 18:58:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6577/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 18:57:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6124/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:51:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 4: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py File tests/custom_cluster/test_catalog_hms_failures.py: http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py@136 PS3, Line 136: e > flake8: E722 do not use bare except' Done http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py@140 PS3, Line 140: e > flake8: E722 do not use bare except' Done -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:50:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Hello Thomas Tauber-Marshall, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16157 to look at the new patch set (#4). Change subject: IMPALA-5534: Fix and enable experimental failure tests .. IMPALA-5534: Fix and enable experimental failure tests Moves the test_catalog_hms_failures.py and test_process_failures.py from the experimental tests to custom cluster tests. catalog_service/test_hms_failure.py is combined with custom_cluster/test_catalog_hms_failure.py as well in order to unify all tests for HMS failures. Several modifications to the tests were necessary to get them working again, but for the most part, the logic of the tests remained the same. A few additional fault tolerance tests (e.g. TestHiveMetaStoreFailure::test_hms_client_retries) were added as well. The overall goal is to increase the process failure test coverage for all components: impalads, statestore, catalogd, HMS, etc. test_restart_catalogd in test_process_failures.py fails due to IMPALA-9848, so it is skipped for now. Testing: * Ran new tests locally Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 --- M fe/src/main/java/org/apache/impala/service/Frontend.java D tests/catalog_service/test_hms_failure.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_hms_failures.py A tests/custom_cluster/test_process_failures.py D tests/experiments/test_catalog_hms_failures.py D tests/experiments/test_process_failures.py 7 files changed, 485 insertions(+), 466 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/16157/4 -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6123/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:48:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Vihang Karajgaonkar has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. IMPALA-9946: Use table id when comparing transactional state This change adds support for catalog clients to optionally provide a table id when fetching the table metadata using the GetPartialCatalogObject API. This table id is used to compare the catalog table is same as the one which is requested by client. If the table id matches, we compare the ValidWriteIdList to provide a consistent view of the metadata to the clients. Testing: Added a new test which drops and recreates the table from hive to introduce the false positive in the existing ValidWriteIdList comparison logic. After the patch, the test succeeds. Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 --- M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java M shaded-deps/pom.xml 5 files changed, 135 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/16170/3 -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 3 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py File tests/custom_cluster/test_catalog_hms_failures.py: http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py@136 PS3, Line 136: e flake8: E722 do not use bare except' http://gerrit.cloudera.org:8080/#/c/16157/3/tests/custom_cluster/test_catalog_hms_failures.py@140 PS3, Line 140: e flake8: E722 do not use bare except' -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:47:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6122/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:46:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 3: Code-Review+2 Carrying +2. -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:46:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Vihang Karajgaonkar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/16170/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16170/2//COMMIT_MSG@9 PS2, Line 9: 's > nit: s Done http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@606 PS2, Line 606: -1 > nit: -1 is used in several places for illegal table id. Is there a constant Unfortunately, we don't have it. Although from code inspection from the Hive side, it looks like the table id will always be >=0 depending on the kind of backing database. I can create a private static constant in CatalogServiceCatalog.java to make it easy to track on our side. http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java File fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java: http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java@512 PS2, Line 512: long numMisses1 = getMetricCount(testDbName, testTblName, : HdfsTable.FILEMETADATA_CACHE_MISS_METRIC); : long numHits1 = getMetricCount(testDbName, testTblName, : HdfsTable.FILEMETADATA_CACHE_HIT_METRIC); > nit: not used Done -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:47:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Hello Thomas Tauber-Marshall, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16157 to look at the new patch set (#3). Change subject: IMPALA-5534: Fix and enable experimental failure tests .. IMPALA-5534: Fix and enable experimental failure tests Moves the test_catalog_hms_failures.py and test_process_failures.py from the experimental tests to custom cluster tests. catalog_service/test_hms_failure.py is combined with custom_cluster/test_catalog_hms_failure.py as well in order to unify all tests for HMS failures. Several modifications to the tests were necessary to get them working again, but for the most part, the logic of the tests remained the same. A few additional fault tolerance tests (e.g. TestHiveMetaStoreFailure::test_hms_client_retries) were added as well. The overall goal is to increase the process failure test coverage for all components: impalads, statestore, catalogd, HMS, etc. test_restart_catalogd in test_process_failures.py fails due to IMPALA-9848, so it is skipped for now. Testing: * Ran new tests locally Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 --- M fe/src/main/java/org/apache/impala/service/Frontend.java D tests/catalog_service/test_hms_failure.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_hms_failures.py A tests/custom_cluster/test_process_failures.py D tests/experiments/test_catalog_hms_failures.py D tests/experiments/test_process_failures.py 7 files changed, 485 insertions(+), 466 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/16157/3 -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-5534: Fix and enable experimental failure tests
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16157 ) Change subject: IMPALA-5534: Fix and enable experimental failure tests .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/16157/2/tests/custom_cluster/test_catalog_hms_failures.py File tests/custom_cluster/test_catalog_hms_failures.py: http://gerrit.cloudera.org:8080/#/c/16157/2/tests/custom_cluster/test_catalog_hms_failures.py@17 PS2, Line 17: import pytest > nit, May be move this line to the python way of documentation on line 36 as Done http://gerrit.cloudera.org:8080/#/c/16157/2/tests/custom_cluster/test_process_failures.py File tests/custom_cluster/test_process_failures.py: http://gerrit.cloudera.org:8080/#/c/16157/2/tests/custom_cluster/test_process_failures.py@143 PS2, Line 143: IMPALA-9848 > Thanks for writing a test case and creating a JIRA for this. I will try to Sounds good thanks. I think Anurag has been looking into this as well. -- To view, visit http://gerrit.cloudera.org:8080/16157 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9dbb98017fb6c40cea349e7c63a35c325cbbc288 Gerrit-Change-Number: 16157 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 13 Jul 2020 18:46:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Hello Aman Sinha, Quanlong Huang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16082 to look at the new patch set (#13). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) Hive ACID supports row-level DELETE and UPDATE operations on a table. It achieves it via assigning a unique row-id for each row, and maintaining two sets of files in a table. The first set is in the base/delta directories, they contain the INSERTed rows. The second set of files are in the delete-delta directories, they contain the DELETEd rows. (UPDATE operations are implemented via DELETE+INSERT.) In the filesystem it looks like e.g.: * full_acid/delta_001_001_/_0 * full_acid/delta_002_002_/_0 * full_acid/delete_delta_003_003_/_0 During scanning we need to return INSERTed rows minus DELETEd rows. This patch implements it by creating an ANTI JOIN between the INSERT and DELETE sets. It is a planner-only modification. Every HDFS SCAN that scans full ACID tables (that also have deleted rows) are converted to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is created above them. Later we can add support for other distribution modes if the performance requires it. E.g. if we have too many deleted rows then probably we are better off with PARTITIONED distribution mode. We could estimate the number of deleted rows by sampling the delete delta files. The current patch only works for primitive types. I.e. we cannot select nested data if the table has deleted rows. Testing: * added planner test * added e2e tests Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test M tests/custom_cluster/test_local_catalog.py M tests/query_test/test_acid.py 27 files changed, 1,710 insertions(+), 151 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/13 -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 13 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/16082/12/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16082/12/tests/custom_cluster/test_local_catalog.py@30 PS12, Line 30: from tests.common.skip import (SkipIfHive2, SkipIfCatalogV2, SkipIfS3, SkipIfABFS, > flake8: F401 'tests.common.skip.SkipIfCatalogV2' imported but unused Done -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 18:31:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 12: (9 comments) http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java: http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1368 PS11, Line 1368: // Validate all the partition key/values to ensure you can convert them toThrift() : Expr.treesToThrift(getPartitionValues()); : } catch (Exception e) { : throw new CatalogException("Partition (" + getPartitionName() + : ") has invalid partition column values: ", e); : } : } : } : : /** :* Comparator to allow ordering of partitions by their partition-key values. :*/ : public static final KeyValueComparator KV_COMPARATOR = new KeyValueComparator(); : public static class KeyValueComparator implements Comparator { : @Override : public int compare(FeFsPartition o1, FeFsPartition o2) { : return comparePartitionKeyValues(o1.getPartitionValues(), o2.getPartitionValues()); : } : } : : @VisibleForTesting : public static int comparePartitionKeyValues(List lhs, : List rhs) { : > nit: Could you move these non-static methods to somewhere above the HdfsPar Done http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1453 PS11, Line 1453: "that have deleted rows and complex types. As a workaround you can run a " + : "major compaction."; > Maybe mention that the query references complex types? Done. Full-ACID complex types are coming soon, but it was low effort to add some tests for it so I added. http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1468 PS11, Line 1468: throws AnalysisException { > unused arguement Done http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1471 PS11, Line 1471: hdfsTblRef.getUniqueAlias() > Looks like if the tableRef doesn't have an explict alias, this is the full It works well, added some tests. http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1537 PS11, Line 1537: /*isPartitionKeySc > Could you add some tests for isPartitionKeyScan=true? IIUC, we only scan on Thanks for pointing this out. The code really had a bug, though I needed a bit different statements to hit it. When 'isPartitionKeyScan=true' HdfsScanNode creates only one scan range for each file descriptor. Then the backend scanners only return a single row from each scan range. So to hit the bug I had to issue the following statements: create table my_acid (id int) partitioned by (p int) stored as orc tblproperties('transactional'='true'); insert into my_acid partition(p=0) values (0), (1), (2); insert into my_acid partition(p=1) values (0), (1), (2); delete from my_acid where p = 1 and id = 0; // Run this in Impala. The result should be 1 since partition(p=1) is NOT empty. select max(p) from my_acid; In p=1 we have only one insert delta file which contains (0), (1), and (2), all of them fit into one scan range. So the scanner only returns the first element (0) that we have deleted. Added this example as e2e test. http://gerrit.cloudera.org:8080/#/c/16082/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1541 PS11, Line 1541: /*isPartitionKeySc > Should we always set this to true to scan all delete rows? I think you meant 'false', and you are right. And we also need to set it to false for the insert delta files as well. http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test File testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test: http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test@714 PS11, Line 714: > Could you add some tests to make sure that the query hints are respected? A Done http://gerrit.cloudera.org:8080/#/c/16082/11/testdata/worklo
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/16082/12/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16082/12/tests/custom_cluster/test_local_catalog.py@30 PS12, Line 30: from tests.common.skip import (SkipIfHive2, SkipIfCatalogV2, SkipIfS3, SkipIfABFS, flake8: F401 'tests.common.skip.SkipIfCatalogV2' imported but unused -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 18:29:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Hello Aman Sinha, Quanlong Huang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16082 to look at the new patch set (#12). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) Hive ACID supports row-level DELETE and UPDATE operations on a table. It achieves it via assigning a unique row-id for each row, and maintaining two sets of files in a table. The first set is in the base/delta directories, they contain the INSERTed rows. The second set of files are in the delete-delta directories, they contain the DELETEd rows. (UPDATE operations are implemented via DELETE+INSERT.) In the filesystem it looks like e.g.: * full_acid/delta_001_001_/_0 * full_acid/delta_002_002_/_0 * full_acid/delete_delta_003_003_/_0 During scanning we need to return INSERTed rows minus DELETEd rows. This patch implements it by creating an ANTI JOIN between the INSERT and DELETE sets. It is a planner-only modification. Every HDFS SCAN that scans full ACID tables (that also have deleted rows) are converted to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is created above them. Later we can add support for other distribution modes if the performance requires it. E.g. if we have too many deleted rows then probably we are better off with PARTITIONED distribution mode. We could estimate the number of deleted rows by sampling the delete delta files. The current patch only works for primitive types. I.e. we cannot select nested data if the table has deleted rows. Testing: * added planner test * added e2e tests Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test M tests/custom_cluster/test_local_catalog.py M tests/query_test/test_acid.py 27 files changed, 1,710 insertions(+), 151 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/12 -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 12 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6575/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 17:49:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6573/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 17:35:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6574/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 17:35:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Hello Shant Hovsepian, Zoltan Borok-Nagy, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16171 to look at the new patch set (#7). Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT The patch for IMPALA-8954 failed to account for subqueries that could produce < 1 row. SelectStmt.returnsSingleRow() is confusing because it actually returns true if it returns *at most* one row. As a fix I split it into returnsExactlyOneRow() and returnsAtMostOneRow(), then used returnsExactlyOneRow() to determine if the subquery should instead be rewritten into a LEFT OUTER JOIN, which produces the correct result. CROSS JOIN is still preferred because it can be more freely reordered during planning. Testing: * Added planner tests for a range of scenarios where it can be rewritten as a CROSS JOIN and where it needs to be a LEFT OUTER JOIN for correctness. * Added some targeted end-to-end tests where the results were previously incorrect. Checked the behaviour against Hive and postgres. Ran exhaustive tests. Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 8 files changed, 463 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/16171/7 -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6121/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 17:32:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 7: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 17:32:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#14). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,287 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/14 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#15). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,685 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/15 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 15: Code-Review+1 rebased, carry +1 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 17:06:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 14: Code-Review+1 rebased, carry +1 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 17:06:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6572/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 16:34:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/16171/5/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/16171/5/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1380 PS5, Line 1380: : // HAVING cause can eliminate any rows produced by the statement, therefore we have : // no lower bound on rows returned. : if (exactlyOne && havingClause_ != null && !havingClause_.isTriviallyTrue()) { : > It is already handled at L1376. Done -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 16:08:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/16171/6/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/16171/6/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@319 PS6, Line 319: Preconditions.checkState(((SelectStmt) s.getStatement()).returnsAtMostOneRow(), line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 16:09:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6120/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 16:08:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 16:08:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Hello Shant Hovsepian, Zoltan Borok-Nagy, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16171 to look at the new patch set (#6). Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT The patch for IMPALA-8954 failed to account for subqueries that could produce < 1 row. SelectStmt.returnsSingleRow() is confusing because it actually returns true if it returns *at most* one row. As a fix I split it into returnsExactlyOneRow() and returnsAtMostOneRow(), then used returnsExactlyOneRow() to determine if the subquery should instead be rewritten into a LEFT OUTER JOIN, which produces the correct result. CROSS JOIN is still preferred because it can be more freely reordered during planning. Testing: * Added planner tests for a range of scenarios where it can be rewritten as a CROSS JOIN and where it needs to be a LEFT OUTER JOIN for correctness. * Added some targeted end-to-end tests where the results were previously incorrect. Checked the behaviour against Hive and postgres. Ran exhaustive tests. Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test 8 files changed, 463 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/16171/6 -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9924: handle single subquery in or predicate
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16152 ) Change subject: IMPALA-9924: handle single subquery in or predicate .. IMPALA-9924: handle single subquery in or predicate This patch supports a subset of cases of subqueries inside OR inside WHERE and HAVING clauses. The approach used is to rewrite the subquery into a many-to-one LEFT OUTER JOIN with the subquery and then replace the subquery in the expression with a reference to the single select list expressions of the subquery. This works because: * A many-to-one LEFT OUTER JOIN returns one output row for each left input row, meaning that for every row in the original query before the rewrite, we get the same row plus a single matched row from the subquery * Expressions can be rewritten to refer to a slotref from the right side of the LEFT OUTER JOIN without affecting semantics. E.g. an IN subquery becomes IS NOT NULL or () becomes . This does not affect SELECT list subqueries, which are rewritten using a different mechanism that can already support some subqueries in disjuncts. Correlated and uncorrelated subqueries are both supported, but various limitations are present. Limitations: * Only one subquery per predicate is supported. The rewriting approach should generalize to multiple subqueries but other code needs refactoring to handle this case. * EXISTS and NOT EXISTS subqueries are not supported. The rewriting approach can generalise to that, but we need to add or pick a select list item from the subquery to check for NULL/IS NOT NULL and a little more work is required to do that correctly. * NOT IN is not supported because of the special NULL semantics. * Subqueries with aggregates + grouping by are not supported because we rely on adding distinct to select list and we don't support distinct + aggregations because of IMPALA-5098. Tests: * Positive analysis tests for IN and binary predicate operators. * Negative analysis tests for unsupported subquery operators. * Negative analysis tests for multiple subqueries. * Negative analysis tests for runtime scalar subqueries. * Positive and negative analysis tests for aggregations in subquery. * TPC-DS Query 45 planner and query tests * Targeted planner tests for various supported queries. * Targeted functional tests to confirm plans are executable and return correct result. These exercise a mix of the supported features - correlated/correlated, aggregate functions, EXISTS/comparator, etc. * Tests for BETWEEN predicate, which is supported as a side-effect of being rewritten during analysis. Change-Id: I64588992901afd7cd885419a0b7f949b0b174976 Reviewed-on: http://gerrit.cloudera.org:8080/16152 Tested-by: Impala Public Jenkins Reviewed-by: Zoltan Borok-Nagy --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeSubqueriesTest.java M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/subquery.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q45.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 10 files changed, 1,436 insertions(+), 61 deletions(-) Approvals: Impala Public Jenkins: Verified Zoltan Borok-Nagy: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I64588992901afd7cd885419a0b7f949b0b174976 Gerrit-Change-Number: 16152 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16174 ) Change subject: IMPALA-9925 cast(pow(2, 31) as int) return 2147483647 on aarch64 .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16174/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16174/1//COMMIT_MSG@9 PS1, Line 9: cast(pow(2, 31) as int) return 2147483647 on aarch64 : but return 2147483648 on x86 : I think aarch64 is correct. Why do you think 2147483647 is the correct result? 2147483647 is 2^31 - 1. 2^31 is 2147483648. However, it cannot be represented in a signed 32-bit integer because 1000...000 means -2147483648. That's the reason the tests must use multiplication with -1, or abs(). http://gerrit.cloudera.org:8080/#/c/16174/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/16174/1/be/src/exprs/expr-test.cc@5810 PS1, Line 5810: ifndef nit: instead of 'if not def', you could just use #ifdef to make the code a bit more readable. -- To view, visit http://gerrit.cloudera.org:8080/16174 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58ab52acebb9bcddbf298efa886fd30ce35f68bf Gerrit-Change-Number: 16174 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 13:57:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9924: handle single subquery in or predicate
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16152 ) Change subject: IMPALA-9924: handle single subquery in or predicate .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16152 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I64588992901afd7cd885419a0b7f949b0b174976 Gerrit-Change-Number: 16152 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 13:00:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9946: Use table id when comparing transactional state
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16170 ) Change subject: IMPALA-9946: Use table id when comparing transactional state .. Patch Set 2: Code-Review+2 (4 comments) LGTM. http://gerrit.cloudera.org:8080/#/c/16170/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16170/2//COMMIT_MSG@9 PS2, Line 9: 's nit: s http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@606 PS2, Line 606: -1 nit: -1 is used in several places for illegal table id. Is there a constant in Hive codes for this? http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java File fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java: http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java@512 PS2, Line 512: long numMisses1 = getMetricCount(testDbName, testTblName, : HdfsTable.FILEMETADATA_CACHE_MISS_METRIC); : long numHits1 = getMetricCount(testDbName, testTblName, : HdfsTable.FILEMETADATA_CACHE_HIT_METRIC); nit: not used http://gerrit.cloudera.org:8080/#/c/16170/2/fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java@526 PS2, Line 526: > nit: Are they always equal? -- To view, visit http://gerrit.cloudera.org:8080/16170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I58c5bd58c4eb5663647c01ecd738b661e4e4cd74 Gerrit-Change-Number: 16170 Gerrit-PatchSet: 2 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 13 Jul 2020 12:56:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16171 ) Change subject: IMPALA-9949: fix SELECT list subqueries with HAVING/LIMIT .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/16171/5/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/16171/5/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1380 PS5, Line 1380: // Limit 0 or 1 clause. This is an upper bound on rows returned. : if (!exactlyOne && limitElement_ != null && hasLimit() && : limitElement_.getLimit() <= 1) { : return true; : } It is already handled at L1376. -- To view, visit http://gerrit.cloudera.org:8080/16171 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6034aedac776783bdc8cdb3a2df344e2b3662da6 Gerrit-Change-Number: 16171 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 13 Jul 2020 12:52:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6571/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 12:10:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Adam Tamas has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix Fixed the issue with the ASAN build where too short tokens in a timestamps fails becouse of indexing out of bounds becouse of a missing check. Updated files connected to the drop of dateless timestamps. Testing: - Ran CORE tests - Ran ASAN with EE_TEST_SHARDS=6 (with the help of IMPALA-9887) Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf --- M be/src/exprs/expr-test.cc M be/src/runtime/date-parse-util.cc M be/src/runtime/datetime-simple-date-format-parser.cc M be/src/runtime/datetime-simple-date-format-parser.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 6 files changed, 67 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/16179/4 -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/16179/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16179/3//COMMIT_MSG@13 PS3, Line 13: Testxing typo http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/exprs/expr-test.cc@3311 PS3, Line 3311: TestTimestampValue("cast(' \t\r\n 01:05:01 \t\r\n ' as timestamp)", : TimestampValue::ParseSimpleDateFormat("01:05:01")); Does this produce a valid TS? http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/exprs/expr-test.cc@3313 PS3, Line 3313: // TestIsNull("cast(' \t\r\n 01:05:01.123456789 \t\r\n ' as timestamp)", : // TYPE_TIMESTAMP); Please don't leave commented code. http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/exprs/expr-test.cc@3327 PS3, Line 3327: TestIsNull("cast(' \t\r\n 1:5:1\t\r\n' as timestamp)", TYPE_TIMESTAMP); : TestIsNull( : "cast(' \t\r\n 1:5:1.12345678\t\r\n' as timestamp)", TYPE_TIMESTAMP); These have nothing to do with IMPALA-6995, please move them to a different section of the test file. http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/runtime/datetime-simple-date-format-parser.cc File be/src/runtime/datetime-simple-date-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/16179/3/be/src/runtime/datetime-simple-date-format-parser.cc@346 PS3, Line 346: str[4] For me the root cause is still a bit unclear. The ASAN build said that for certain inputs here we try to read from memory that was actually released. I wonder how come we get to this point when we had released 'str' and where it is actually released? -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 09:24:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] POC: Remote codegen (Milestone 1)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14735 ) Change subject: POC: Remote codegen (Milestone 1) .. Patch Set 8: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6570/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f Gerrit-Change-Number: 14735 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 13 Jul 2020 08:47:05 + Gerrit-HasComments: No
[Impala-ASF-CR] POC: Remote codegen (Milestone 1)
Daniel Becker has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/14735 ) Change subject: POC: Remote codegen (Milestone 1) .. POC: Remote codegen (Milestone 1) Optional remote codegen, set by a query option. Can be used together with async. The Impala executor does everything as usual until the point when it would optimize and compile the LLVM module. Instead, it sends the LLVM bitcode to the compile server which optimizes it and sends back an object file (handled in memory, not written to disk). The executor blocks (in synchronous mode) until it receives the compiled object file. Note that the customization of the LLVM module (replacing function calls) is still done by the executor. The server is a python script that calls llc to compile the code. It will later be replaced probably by a special impalad that uses the LLVM API to compile the code. Communication is not secure. TODO: Now both async and remote are switched ON for testing, these should be disabled by default. Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/runtime/fragment-state.cc M be/src/service/fe-support.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/CMakeLists.txt M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A common/thrift/RemoteCompile.thrift M testdata/bin/kill-all.sh M testdata/bin/run-all.sh A testdata/bin/server.py 14 files changed, 487 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14735/8 -- To view, visit http://gerrit.cloudera.org:8080/14735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f Gerrit-Change-Number: 14735 Gerrit-PatchSet: 8 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6569/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 08:29:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16095 ) Change subject: IMPALA-9633: Implement ds_hll_union() .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6568/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 08:15:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16095 ) Change subject: IMPALA-9633: Implement ds_hll_union() .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6567/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 08:13:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix
Adam Tamas has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16179 ) Change subject: IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix .. IMPALA-9941: ExprTest.CastExprs fails when running with ASAN fix Fixed the issue with the ASAN build where only date tokens with wrong format in a timestamps fails. Updated files connected to the drop of dateless timestamps. Testxing: - Ran CORE tests - Ran ASAN with EE_TEST_SHARDS=6 (with the help of IMPALA-9887) Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf --- M be/src/exprs/expr-test.cc M be/src/runtime/date-parse-util.cc M be/src/runtime/datetime-simple-date-format-parser.cc M be/src/runtime/datetime-simple-date-format-parser.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-parse-util.h 6 files changed, 64 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/16179/3 -- To view, visit http://gerrit.cloudera.org:8080/16179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75bd47b6627a2e338c46dc354f7e67d34c1abbcf Gerrit-Change-Number: 16179 Gerrit-PatchSet: 3 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16095 ) Change subject: IMPALA-9633: Implement ds_hll_union() .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/16095/5/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test: http://gerrit.cloudera.org:8080/#/c/16095/5/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@202 PS5, Line 202: > Can you also add a union test with hll_sketches_from_hive? Sure, Done -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 13 Jul 2020 07:48:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16095 to look at the new patch set (#7). Change subject: IMPALA-9633: Implement ds_hll_union() .. IMPALA-9633: Implement ds_hll_union() This function receives a set of sketches produced by ds_hll_sketch() and merges them into a single sketch. An example usage is to create a sketch for each partition of a table, write these sketches to a separate table and based on which partition the user is interested of the relevant sketches can be union-ed together to get an estimate. E.g.: SELECT ds_hll_estimate(ds_hll_union(sketch_col)) FROM sketch_tbl WHERE partition_col=1 OR partition_col=5; Note, currently there is a known limitation of unioning string types where some input sketches come from Impala and some from Hive. In this case if there is an overlap in the input data used by Impala and by Hive this overlapping data is still counted twice due to some string representation difference between Impala and Hive. For more details see: https://issues.apache.org/jira/browse/IMPALA-9939 Testing: - Apart from the automated tests I added to this patch I also tested ds_hll_union() on a bigger dataset to check that serialization, deserialization and merging steps work well. I took TPCH25.linelitem, created a number of sketches with grouping by l_shipdate and called ds_hll_union() on those sketches. Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 --- M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-common.cc A be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/data/README A testdata/data/hll_sketches_from_impala.parquet M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test M tests/query_test/test_datasketches.py 11 files changed, 271 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/16095/7 -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16095 to look at the new patch set (#6). Change subject: IMPALA-9633: Implement ds_hll_union() .. IMPALA-9633: Implement ds_hll_union() This function receives a set of sketches produced by ds_hll_sketch() and merges them into a single sketch. An example usage is to create a sketch for each partition of a table, write these sketches to a separate table and based on which partition the user is interested of the relevant sketches can be union-ed together to get an estimate. E.g.: SELECT ds_hll_estimate(ds_hll_union(sketch_col)) FROM sketch_tbl WHERE partition_col=1 OR partition_col=5; Note, currently there is a known limitation of unioning string types where some input sketches come from Impala and some from Hive. In this case if there is an overlap in the input data used by Impala and by Hive this overlapping data is still counted twice due to some string representation difference between Impala and Hive. For more details see: https://issues.apache.org/jira/browse/IMPALA-9939 Testing: - Apart from the automated tests I added to this patch I also tested ds_hll_union() on a bigger dataset to check that serialization, deserialization and merging steps work well. I took TPCH25.linelitem, created a number of sketches with grouping by l_shipdate and called ds_hll_union() on those sketches. Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 --- M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-common.cc A be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/data/README M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test M tests/query_test/test_datasketches.py 10 files changed, 271 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/16095/6 -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 6 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins