[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6509/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 05:46:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 15: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 05:20:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/16112/14/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/16112/14/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@35 PS14, Line 35: import org.apache.impala.common.NotImplementedException; > nit: unused import Done -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 05:20:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 05:20:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6097/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 05:20:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Hello Aman Sinha, Quanlong Huang, Thomas Tauber-Marshall, Shant Hovsepian, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16112 to look at the new patch set (#15). Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis Supports a single ROLLUP, CUBE or GROUPING SETS clause in GROUP BY. Also adds non-standard "WITH ROLLUP" and "WITH CUBE" syntax that is supported by some other SQL dialects. This implements basic parsing and validation of the query, then raises an AnalysisException to report that it is not supported so that incorrect plans will not be generated. This patch adds a GroupByClause to each SelectStmt that contains info about the grouping sets and the original GROUP BY list. The grouping exprs are still represented as a List in SelectStmt. Most of the logic, including statement and expr rewrites, can operate on this list of expressions without requiring special handling for grouping sets. Testing: * Add Parser test. * Add toSql() test. * Added analysis tests to check that analysis accepts or rejects queries correctly. Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f --- M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java 10 files changed, 759 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16112/15 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 15 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 7: (3 comments) Grouping Sets are usually useful when saved as temp or logical views, then sliced and diced to get to the right level. Any potential edge case with saving a grouping set query as a logical view? http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG@28 PS7, Line 28: 27 and 36 sho https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_nulls_last/36.ans https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_nulls_last/27.ans http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394 PS7, Line 394: casingthe nit: casing__space__the http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@150 PS7, Line 150:* analysis. Mention in the comment that it's used for implementing grouping() as an example? -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 04:59:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6508/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 Jul 2020 04:49:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Wenzhe Zhou has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. IMPALA-9294: Support DATE for min-max runtime filter Implemented Date min-max filter and applied it to Kudu as other min-max runtime filters. Added new test cases for Date min-max filters. Testing: Passed all core tests. Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c --- M be/src/codegen/gen_ir_descriptions.py M be/src/runtime/date-value.h M be/src/util/min-max-filter-ir.cc M be/src/util/min-max-filter-test.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/protobuf/common.proto M common/thrift/Data.thrift M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 12 files changed, 426 insertions(+), 167 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/16103/7 -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6507/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 07 Jul 2020 03:39:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 14: Code-Review+2 (2 comments) LGTM. Thanks for working on this feature! Carry on the existing +1 to +2. http://gerrit.cloudera.org:8080/#/c/16112/14/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/16112/14/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@35 PS14, Line 35: import org.apache.impala.common.NotImplementedException; nit: unused import http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@75 PS13, Line 75: } > I think this was done indirectly by AbstractFrontendTest and the FrontendFi Oh, just realized that it's taken care in AbstractFrontendTest. Thanks for making the change! -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 03:25:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16146 Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs Impala planner uses the HiveLexer to check whether an ident needs to be quoted in toSql results. However, HiveLexer will print error messages to stderr which is redirected to impalad.ERROR, so they appear as ERROR level logs. Actually, they just mean HiveLexer can't parse the ident so they are not Hive keywords so don't need to be quoted. These error messages don't mean anything wrong so shouldn't be ERROR level logs. This patch overrides the HiveLexer used in ToSqlUtils to log the error messages to TRACE level logs. Tests * Manually verify the error messages don't appear in impalad.ERROR and are printed to TRACE level logs. Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java 1 file changed, 15 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/16146/1 -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang
[Impala-ASF-CR] IMPALA-9789: Disable ineffective bloom filters for Kudu scan
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16138 ) Change subject: IMPALA-9789: Disable ineffective bloom filters for Kudu scan .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Gerrit-Change-Number: 16138 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 01:00:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9789: Disable ineffective bloom filters for Kudu scan
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16138 ) Change subject: IMPALA-9789: Disable ineffective bloom filters for Kudu scan .. IMPALA-9789: Disable ineffective bloom filters for Kudu scan Bump Kudu version to 23f67ae0d. This pulls in the latest version of Kudu which includes a performance enhancement in Kudu server for bloom filter. Kudu uses same heuristics as HDFS to disable ineffective bloom filter. This fixs the performance regression issue for queries like TPCH-Q9 so that we can set query option ENABLED_RUNTIME_FILTER_TYPES as ALL to enable bloom filter for Kudu by default. Testing: - Ran single_node_perf_run.py with TPCH for Kudu and verified that there was no regression issue for each query. - Passed all core tests. Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Reviewed-on: http://gerrit.cloudera.org:8080/16138 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M bin/impala-config.sh M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift 3 files changed, 5 insertions(+), 5 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Gerrit-Change-Number: 16138 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168 PS3, Line 168: functional_kudu.date_tbl a : join [BROADCAST] functional_kudu.date_tbl b > Looks good to me. One question on the broadcast hint used in the query: jus It's working too. Will add a new test case. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 Jul 2020 00:54:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6506/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 00:27:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/16140/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16140/3//COMMIT_MSG@31 PS3, Line 31: Also TPC-DS planner tests. http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@399 PS6, Line 399: aggExpr.getFnName().getFunction() + "' must be combined with multiple " + > line too long (100 > 90) Done http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3750 PS6, Line 3750: ParsesOk("SELECT a, grouping(a) as `grouping` FROM foo " + > line too long (92 > 90) Done http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3751 PS6, Line 3751: "GROUP BY GROUPING SETS((a, b))"); > line too long (93 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 23:58:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#7). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 should also be runnable but I did not have reference results readily available. * Add targeted end-to-end tests. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 16 files changed, 1,884 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/7 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 10: Code-Review+1 carry -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 23:49:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 14: Code-Review+1 carry -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 23:49:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. IMPALA-9902: add rewrite of TPC-DS q38 I generated the query with dsqgen and then rewrote it to avoid intersect. Testing: Compared results to hive running the original version of the query. Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Reviewed-on: http://gerrit.cloudera.org:8080/16118 Reviewed-by: Tim Armstrong Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test M testdata/workloads/tpcds/queries/tpcds-q8.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 4 files changed, 40 insertions(+), 2 deletions(-) Approvals: Tim Armstrong: Looks good to me, but someone else must approve Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 23:47:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 21:16:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 6: Code-Review+1 (1 comment) The addition looks good to me. Thanks! http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168 PS3, Line 168: functional_kudu.date_tbl a : join [BROADCAST] functional_kudu.date_tbl b > Ok, I will add new 3-way join test case if I need to submit new patch. Looks good to me. One question on the broadcast hint used in the query: just wonder if a shuffle parallel plan works too. In such a plan, the min/max from different impalads need to be consolidated. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 06 Jul 2020 20:48:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9789: Disable ineffective bloom filters for Kudu scan
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16138 ) Change subject: IMPALA-9789: Disable ineffective bloom filters for Kudu scan .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Gerrit-Change-Number: 16138 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 06 Jul 2020 19:58:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9789: Disable ineffective bloom filters for Kudu scan
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16138 ) Change subject: IMPALA-9789: Disable ineffective bloom filters for Kudu scan .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6096/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Gerrit-Change-Number: 16138 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 06 Jul 2020 19:58:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9789: Disable ineffective bloom filters for Kudu scan
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16138 ) Change subject: IMPALA-9789: Disable ineffective bloom filters for Kudu scan .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16138 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib55fa359e2fdb2c03c890f2303ffd9173c3c4af1 Gerrit-Change-Number: 16138 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Bankim Bhavsar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 06 Jul 2020 19:54:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 13: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6093/ -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 19:23:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6095/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 18:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. Patch Set 3: Code-Review+2 This makes sense to me. -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 18:08:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6505/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 18:04:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6504/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Mon, 06 Jul 2020 18:01:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6502/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 17:48:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6503/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 17:48:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Hello Fang-Yu Rao, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16118 to look at the new patch set (#3). Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. IMPALA-9902: add rewrite of TPC-DS q38 I generated the query with dsqgen and then rewrote it to avoid intersect. Testing: Compared results to hive running the original version of the query. Change-Id: I81807683aa265a946729e15156bd2e33724103e1 --- A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q38-rewrite.test M testdata/workloads/tpcds/queries/tpcds-q8.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 4 files changed, 40 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/16118/3 -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9902: add rewrite of TPC-DS q38
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16118 ) Change subject: IMPALA-9902: add rewrite of TPC-DS q38 .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16118 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81807683aa265a946729e15156bd2e33724103e1 Gerrit-Change-Number: 16118 Gerrit-PatchSet: 3 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 17:37:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@399 PS6, Line 399: aggExpr.getFnName().getFunction() + "' must be combined with multiple grouping sets"); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3750 PS6, Line 3750: ParsesOk("SELECT a, grouping(a) as `grouping` FROM foo GROUP BY GROUPING SETS((a, b))"); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16140/6/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3751 PS6, Line 3751: ParserError("SELECT a, grouping(a) as grouping FROM foo GROUP BY GROUPING SETS((a, b))"); line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Mon, 06 Jul 2020 17:37:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#6). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 should also be runnable but I did not have reference results readily available. * Add targeted end-to-end tests. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 16 files changed, 1,881 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/6 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#10). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,207 insertions(+), 65 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/10 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 13: (11 comments) http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java File fe/src/main/java/org/apache/impala/analysis/GroupByClause.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@66 PS13, Line 66: > nit: 4 spaces identation Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@94 PS13, Line 94: e.clone() > Could you leave a comment for doing clone here? Oh this shouldn't be needed. http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@125 PS13, Line 125: if (groupingIDs_.size() >= GROUPING_SET_LIMIT) { : throw new AnalysisException("Limit of " + GROUPING_SET_LIMIT + : " grouping sets exceeded"); : } > Could you add a test for this? Done - in AnalyzeStmtsTest http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@173 PS13, Line 173: analyzer > nit: unused parameter. The SelectStmt parameter is also not used. Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@174 PS13, Line 174: > nit: 4 spaces indentation Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@175 PS13, Line 175: throws AnalysisException { > nit: this can be merged to the above line Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@218 PS13, Line 218: if (groupingSetsType_ == GroupingSetsType.ROLLUP) { > nit: could you merge this to the above if-statement as an else-if clause? Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@233 PS13, Line 233: : if (groupingSetsType_ == GroupingSetsType.SETS) { > nit: could you merge this to the above if-statement as an else-if clause? Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@235 PS13, Line 235: for (int setNum = 0; setNum < groupingSetsList_.size(); setNum++) { : List set = groupingSetsList_.get(setNum); > nit: can be simlified to Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2459 PS13, Line 2459: > nit: blank line Done http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@75 PS13, Line 75: > We'd better add an @AfterClass cleanUp method to reset the env to avoid fla I think this was done indirectly by AbstractFrontendTest and the FrontendFixture, but what you're proposing would make this more obviously correct without having to read a bunch of other code. So I made the change. -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 17:19:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Hello Aman Sinha, Quanlong Huang, Thomas Tauber-Marshall, Shant Hovsepian, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16112 to look at the new patch set (#14). Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis Supports a single ROLLUP, CUBE or GROUPING SETS clause in GROUP BY. Also adds non-standard "WITH ROLLUP" and "WITH CUBE" syntax that is supported by some other SQL dialects. This implements basic parsing and validation of the query, then raises an AnalysisException to report that it is not supported so that incorrect plans will not be generated. This patch adds a GroupByClause to each SelectStmt that contains info about the grouping sets and the original GROUP BY list. The grouping exprs are still represented as a List in SelectStmt. Most of the logic, including statement and expr rewrites, can operate on this list of expressions without requiring special handling for grouping sets. Testing: * Add Parser test. * Add toSql() test. * Added analysis tests to check that analysis accepts or rejects queries correctly. Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f --- M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java 10 files changed, 760 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/12/16112/14 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 14 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] WIP
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16144 Change subject: WIP .. WIP Change-Id: I31c3f019dc6d68cc5c433b53b74ac7b13e328dfa --- M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java 4 files changed, 35 insertions(+), 16 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/16144/1 -- To view, visit http://gerrit.cloudera.org:8080/16144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I31c3f019dc6d68cc5c433b53b74ac7b13e328dfa Gerrit-Change-Number: 16144 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 3: I'm in the process of loading the tables for TPCH scale factor 15, but it seems to be very slow but let's see. -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 17:15:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@409 PS10, Line 409: @Override : public ImmutableList getFileDescriptors() { : return fds_; : } : : @Override : public ImmutableList getInsertFileDescriptors() { : return fds_; : } : : @Override : public ImmutableList getDeleteFileDescriptors() { : return fds_; : } Need to update these http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java File fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java@497 PS10, Line 497: Fail not *Fail* anymore -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 17:07:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6500/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Mon, 06 Jul 2020 16:56:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 3: Code-Review+2 (1 comment) GOod point about fixing the crash. I think the risk of a performance regression is low. http://gerrit.cloudera.org:8080/#/c/16134/3/be/src/exec/hdfs-avro-scanner-test.cc File be/src/exec/hdfs-avro-scanner-test.cc: http://gerrit.cloudera.org:8080/#/c/16134/3/be/src/exec/hdfs-avro-scanner-test.cc@519 PS3, Line 519: memcpy([1], _value2, 16); > I wonder if it makes sense to keep and update the BIG_ENDIAN branch. AFAIK Yeah we should feel free to remove these. There was some effort to make stuff endian-neutral, I think cause it's just a "good practice", but that code isn't tested and will probably never be used, so no point maintaining it. -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 16:38:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/16140/5/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/5/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@399 PS5, Line 399: aggExpr.getFnName().getFunction() + "' must be combined with multiple grouping sets"); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16140/5/fe/src/test/java/org/apache/impala/analysis/ParserTest.java File fe/src/test/java/org/apache/impala/analysis/ParserTest.java: http://gerrit.cloudera.org:8080/#/c/16140/5/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3750 PS5, Line 3750: ParsesOk("SELECT a, grouping(a) as `grouping` FROM foo GROUP BY GROUPING SETS((a, b))"); line too long (92 > 90) http://gerrit.cloudera.org:8080/#/c/16140/5/fe/src/test/java/org/apache/impala/analysis/ParserTest.java@3751 PS5, Line 3751: ParserError("SELECT a, grouping(a) as grouping FROM foo GROUP BY GROUPING SETS((a, b))"); line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Comment-Date: Mon, 06 Jul 2020 16:35:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Shant Hovsepian, David Rorke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#5). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 should also be runnable but I did not have reference results readily available. * Add targeted end-to-end tests. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 16 files changed, 1,880 insertions(+), 74 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/5 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Shant Hovsepian
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6094/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 16:09:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6499/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 16:05:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@45 PS9, Line 45: import org.apache.impala.analysis.Path.PathType; > nit: keep the import list sorted in groups (usually the IDE will do this fo Yeah, I use my IDE to do that for me. Interestingly, even if I delete this line, and ask VSCode to import PathType for me, it puts the import to this line again... Same with HdfsPartition.FileDescriptor. Seems like if an import path has more components it can confuse vscode. Anyway, fixed it manually :D http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1418 PS9, Line 1418: if (addAcidSlotsIfNeeded(analyzer, hdfsTblRef, partitions)) { > nit: what about merging this if-statement with its outer scope so they are Done http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1518 PS9, Line 1518: // Let's separate insert delta File Descriptors from delete delta FDs. > I think we should separate the file descriptors in catalogd after loading t I implemented it in PS10, but I'm afraid that the change became a bit too invasive. An alternative approach is just to add the genInsertDeltaPartition() and genDeleteDeltaPartition() to the partition classes and those would filter out the unneeded file descs and return a new Partition object. This way the change would be smaller, but we'd still have to generate these insert/delta partitions for each query. What do you think? -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 15:46:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1536 PS10, Line 1536: hdfsTblRef.getDesc(), conjuncts, insertDeltaPartitions, hdfsTblRef, /*aggInfo=*/null, line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 06 Jul 2020 15:42:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Hello Aman Sinha, Quanlong Huang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16082 to look at the new patch set (#10). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) Hive ACID supports row-level DELETE and UPDATE operations on a table. It achieves it via assigning a unique row-id for each row, and maintaining two sets of files in a table. The first set is in the base/delta directories, they contain the INSERTed rows. The second set of files are in the delete-delta directories, they contain the DELETEd rows. (UPDATE operations are implemented via DELETE+INSERT.) In the filesystem it looks like e.g.: * full_acid/delta_001_001_/_0 * full_acid/delta_002_002_/_0 * full_acid/delete_delta_003_003_/_0 During scanning we need to return INSERTed rows minus DELETEd rows. This patch implements it by creating an ANTI JOIN between the INSERT and DELETE sets. It is a planner-only modification. Every HDFS SCAN that scans full ACID tables (that also have deleted rows) are converted to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is created above them. Later we can add support for other distribution modes if the performance requires it. E.g. if we have too many deleted rows then probably we are better off with PARTITIONED distribution mode. We could estimate the number of deleted rows by sampling the delete delta files. The current patch only works for primitive types. I.e. we cannot select nested data if the table has deleted rows. Testing: * added planner test * added e2e tests Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test M tests/query_test/test_acid.py 25 files changed, 1,429 insertions(+), 142 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/10 -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 3: Code-Review+1 (1 comment) lgtm I am also interested in the benchmark, but from my side it can be merged as it is even if the patch introduces regression, as it fixes a crash. http://gerrit.cloudera.org:8080/#/c/16134/3/be/src/exec/hdfs-avro-scanner-test.cc File be/src/exec/hdfs-avro-scanner-test.cc: http://gerrit.cloudera.org:8080/#/c/16134/3/be/src/exec/hdfs-avro-scanner-test.cc@519 PS3, Line 519: memcpy([1], _value2, 16); I wonder if it makes sense to keep and update the BIG_ENDIAN branch. AFAIK Impala is never compiled on BIG_ENDIAN architectures, end we have static_asserts to ensure this. -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 14:22:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 14:21:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 13: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6093/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 13 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 14:21:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 12 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 14:13:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6498/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 12 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 13:20:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6497/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 11 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 13:20:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16000 to look at the new patch set (#12). Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() These functions can be used to get cardinality estimates of data using HLL algorithm from Apache DataSketches. ds_hll_sketch() receives a dataset, e.g. a column from a table, and returns a serialized HLL sketch in string format. This can be written to a table or be fed directly to ds_hll_estimate() that returns the cardinality estimate for that sketch. Comparing to ndv() these functions bring more flexibility as once we fed data to the sketch it can be written to a table and next time we can save scanning through the dataset and simply return the estimate using the sketch. This doesn't come for free, however, as perfomance measurements show that ndv() is 2x-3.5x faster than sketching. On the other hand if we query the estimate from an existing sketch then the runtime is negligible. Another flexibility with these sketches is that they can be merged together so e.g. if we had saved a sketch for each of the partitions of a table then they can be combined with each other based on the query without touching the actual data. DataSketches HLL is sensitive for the order of the data fed to the sketch and as a result running these algorithms in Impala gets non-deterministic results within the error bounds of the algorithm. In terms of correctness DataSketches HLL is most of the time in 2% range from the correct result but there are occasional spikes where the difference is bigger but never goes out of the range of 5%. Even though the DataSketches HLL algorithm could be parameterized currently this implementation hard-codes these parameters and use HLL_4 and lg_k=12. For more details about Apache DataSketches' HLL implementation see: https://datasketches.apache.org/docs/HLL/HLL.html Testing: - Added some tests running estimates for small datasets where the amount of data is small enough to get the correct results. - Ran manual tests on TPCH25.lineitem to compare perfomance with ndv(). Depending on data characteristics ndv() appears 2x-3.5x faster. The lower the cardinality of the dataset the bigger the difference between the 2 algorithms is. - Ran manual tests on TPCH25.lineitem and functional_parquet.alltypes to compare correctness with ndv(). See results above. Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe --- M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-functions-ir.cc A be/src/exprs/datasketches-functions.h M be/src/exprs/datasketches-test.cc M be/src/exprs/scalar-expr-evaluator.cc M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/Function.java M testdata/data/README A testdata/data/hll_sketches_from_hive.parquet A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test A tests/query_test/test_datasketches.py 17 files changed, 503 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16000/12 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 12 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/16000/11/tests/query_test/test_datasketches.py File tests/query_test/test_datasketches.py: http://gerrit.cloudera.org:8080/#/c/16000/11/tests/query_test/test_datasketches.py@38 PS11, Line 38: flake8: W391 blank line at end of file -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 11 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 12:52:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16000 to look at the new patch set (#11). Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() These functions can be used to get cardinality estimates of data using HLL algorithm from Apache DataSketches. ds_hll_sketch() receives a dataset, e.g. a column from a table, and returns a serialized HLL sketch in string format. This can be written to a table or be fed directly to ds_hll_estimate() that returns the cardinality estimate for that sketch. Comparing to ndv() these functions bring more flexibility as once we fed data to the sketch it can be written to a table and next time we can save scanning through the dataset and simply return the estimate using the sketch. This doesn't come for free, however, as perfomance measurements show that ndv() is 2x-3.5x faster than sketching. On the other hand if we query the estimate from an existing sketch then the runtime is negligible. Another flexibility with these sketches is that they can be merged together so e.g. if we had saved a sketch for each of the partitions of a table then they can be combined with each other based on the query without touching the actual data. DataSketches HLL is sensitive for the order of the data fed to the sketch and as a result running these algorithms in Impala gets non-deterministic results within the error bounds of the algorithm. In terms of correctness DataSketches HLL is most of the time in 2% range from the correct result but there are occasional spikes where the difference is bigger but never goes out of the range of 5%. Even though the DataSketches HLL algorithm could be parameterized currently this implementation hard-codes these parameters and use HLL_4 and lg_k=12. For more details about Apache DataSketches' HLL implementation see: https://datasketches.apache.org/docs/HLL/HLL.html Testing: - Added some tests running estimates for small datasets where the amount of data is small enough to get the correct results. - Ran manual tests on TPCH25.lineitem to compare perfomance with ndv(). Depending on data characteristics ndv() appears 2x-3.5x faster. The lower the cardinality of the dataset the bigger the difference between the 2 algorithms is. - Ran manual tests on TPCH25.lineitem and functional_parquet.alltypes to compare correctness with ndv(). See results above. Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe --- M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-functions-ir.cc A be/src/exprs/datasketches-functions.h M be/src/exprs/datasketches-test.cc M be/src/exprs/scalar-expr-evaluator.cc M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/Function.java M testdata/data/README A testdata/data/hll_sketches_from_hive.parquet A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test A tests/query_test/test_datasketches.py 17 files changed, 504 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16000/11 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 11 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 13: (11 comments) http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java File fe/src/main/java/org/apache/impala/analysis/GroupByClause.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@66 PS13, Line 66: nit: 4 spaces identation http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@94 PS13, Line 94: e.clone() Could you leave a comment for doing clone here? http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@125 PS13, Line 125: if (groupingIDs_.size() >= GROUPING_SET_LIMIT) { : throw new AnalysisException("Limit of " + GROUPING_SET_LIMIT + : " grouping sets exceeded"); : } Could you add a test for this? http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@173 PS13, Line 173: analyzer nit: unused parameter. The SelectStmt parameter is also not used. http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@174 PS13, Line 174: nit: 4 spaces indentation http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@175 PS13, Line 175: throws AnalysisException { nit: this can be merged to the above line http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@218 PS13, Line 218: if (groupingSetsType_ == GroupingSetsType.ROLLUP) { nit: could you merge this to the above if-statement as an else-if clause? http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@233 PS13, Line 233: : if (groupingSetsType_ == GroupingSetsType.SETS) { nit: could you merge this to the above if-statement as an else-if clause? http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/main/java/org/apache/impala/analysis/GroupByClause.java@235 PS13, Line 235: for (int setNum = 0; setNum < groupingSetsList_.size(); setNum++) { : List set = groupingSetsList_.get(setNum); nit: can be simlified to for (List set : groupingSetsList_) http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@2459 PS13, Line 2459: nit: blank line http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java: http://gerrit.cloudera.org:8080/#/c/16112/13/fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java@75 PS13, Line 75: We'd better add an @AfterClass cleanUp method to reset the env to avoid flakiness like IMPALA-9743 when running all FE tests in a JVM. Some tests (e.g. CatalogTest) require isTestEnv = False. Ideally all tests should set their required flags (IMPALA-9780). Currently, we'd better reset these flags at the end. -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Jul 2020 09:07:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9741: Supported query icebreg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query icebreg table by impala .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6496/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 2 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 08:01:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Supported query icebreg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query icebreg table by impala .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6495/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 1 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 06 Jul 2020 07:49:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9741: Supported query icebreg table by impala
wangsheng has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query icebreg table by impala .. IMPALA-9741: Supported query icebreg table by impala This patch mainly realizes the query of iceberg table through impala, we can use the following sql to create an external iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string, ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or just including table name and location like this: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format in iceberg, currently support PARQUET and ORC. And if you don't identity this property in your SQL, default file format is PARQUT. Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg_create.test 23 files changed, 852 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/16143/2 -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 2 Gerrit-Owner: wangsheng Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9741: Supported query icebreg table by impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16143 ) Change subject: IMPALA-9741: Supported query icebreg table by impala .. Patch Set 1: (17 comments) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java@80 PS1, Line 80: "SHOW FILES not applicable to a non hdfs table and non iceberg table: %s", tableName_)); line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@84 PS1, Line 84: // There two cases here: Non-partitioned hdfs table and non-partitioned iceberg table line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@249 PS1, Line 249: public static Map getPartitionColToSourceIdMap(List specs) { line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@286 PS1, Line 286: tIcebergTable.setPartition_col_to_source_id_map(icebergTable.getPartitionColToSourceIdMap()); line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@309 PS1, Line 309: private static HdfsPartition.FileDescriptor getFileDescriptor(FileSystem fs, Path tableLoc, line too long (95 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@322 PS1, Line 322: return HdfsPartition.FileDescriptor.create(fileStatus, relPath, locations, hostIndex, line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@329 PS1, Line 329: public static Map loadAllPartition(String location, line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@337 PS1, Line 337: HdfsPartition.FileDescriptor fileDesc = getFileDescriptor(new Path(file.path().toString()), line too long (99 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@368 PS1, Line 368: partition.setFileFormat(IcebergUtil.toTHdfsFileFormat(icebergTable.getIcebergFileFormat())); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@176 PS1, Line 176: table_.getMetaStoreTable().getParameters().get(IcebergTable.ICEBERG_FILE_FORMAT); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@71 PS1, Line 71: Map partitionColToSourceIdMap = Utils.getPartitionColToSourceIdMap(partitionSpecs); line too long (104 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@78 PS1, Line 78: ColumnMap cmap, List partitionSpecs, Map sourceColsMap, line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@161 PS1, Line 161: localFsTable_.createPrototypePartition(), CatalogObject.ThriftObjectType.DESCRIPTOR_ONLY); line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@162 PS1, Line 162: THdfsTable hdfsTable = new THdfsTable(localFsTable_.getHdfsBaseDir(), getColumnNames(), line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@163 PS1, Line 163: localFsTable_.getNullPartitionKeyValue(), FeFsTable.DEFAULT_NULL_COLUMN_VALUE, idToPartition, line too long (101 > 90) http://gerrit.cloudera.org:8080/#/c/16143/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File
[Impala-ASF-CR] IMPALA-9741: Supported query icebreg table by impala
wangsheng has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16143 Change subject: IMPALA-9741: Supported query icebreg table by impala .. IMPALA-9741: Supported query icebreg table by impala This patch mainly realizes the query of iceberg table through impala, we can use the following sql to create an external iceberg table: CREATE EXTERNAL TABLE default.iceberg_test ( level string, event_time timestamp, message string, ) STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); Or just including table name and location like this: CREATE EXTERNAL TABLE default.iceberg_test STORED AS ICEBERG LOCATION 'hdfs://xxx' TBLPROPERTIES ('iceberg_file_format'='parquet'); 'iceberg_file_format' is the file format in iceberg, currently support PARQUET and ORC. And if you don't identity this property in your SQL, default file format is PARQUT. Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 --- M be/src/runtime/descriptors.cc M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/workloads/functional-query/queries/QueryTest/iceberg_create.test 23 files changed, 842 insertions(+), 135 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/16143/1 -- To view, visit http://gerrit.cloudera.org:8080/16143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006 Gerrit-Change-Number: 16143 Gerrit-PatchSet: 1 Gerrit-Owner: wangsheng