[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test File testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test: http://gerrit.cloudera.org:8080/#/c/15866/11/testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test@8 PS11, Line 8: localhost:20500 Note, the dockerised tests fail because the hardcoded localhost. -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 11 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 08 Jul 2020 06:23:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Wed, 08 Jul 2020 04:40:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs Impala planner uses the HiveLexer to check whether an ident needs to be quoted in toSql results. However, HiveLexer will print error messages to stderr which is redirected to impalad.ERROR, so they appear as ERROR level logs. Actually, they just mean HiveLexer can't parse the ident so they are not Hive keywords so don't need to be quoted. These error messages don't mean anything wrong so shouldn't be ERROR level logs. This patch overrides the HiveLexer used in ToSqlUtils to log the error messages to TRACE level logs. Tests * Manually verify the error messages don't appear in impalad.ERROR and are printed to TRACE level logs. Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Reviewed-on: http://gerrit.cloudera.org:8080/16146 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java 1 file changed, 15 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Shant Hovsepian has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Jul 2020 04:37:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState This is the final patch in the refactor of QuerySchedule for the single admission controller work. It was kept separate for ease of reviewing. This patch renames QuerySchedule and its related classes (FInstanceExecParams, BackendExecParams, and FragmentExecParams) to use the name 'ScheduleState', reflecting the fact that they are used as containers for intermediate info about a query during scheduling. The messages that are included in the QuerySchedulePB struct retain the 'ExecParams' name, reflecting the fact that they are used by the coordinator to start execution. Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Reviewed-on: http://gerrit.cloudera.org:8080/16122 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h R be/src/scheduling/schedule-state.cc R be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h 9 files changed, 631 insertions(+), 625 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 6 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 08 Jul 2020 04:07:34 + Gerrit-HasComments: No
[native-toolchain-CR] Remove colon from filename
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16139 ) Change subject: Remove colon from filename .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5 Gerrit-Change-Number: 16139 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Jul 2020 04:01:43 + Gerrit-HasComments: No
[native-toolchain-CR] Remove colon from filename
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16139 ) Change subject: Remove colon from filename .. Remove colon from filename Some OSes don't allow cloning the repo if it has a filename with a colon on it. See https://github.com/cloudera/native-toolchain/issues/65 for context. Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5 Reviewed-on: http://gerrit.cloudera.org:8080/16139 Reviewed-by: Laszlo Gaal Tested-by: Tim Armstrong --- R source/thrift/thrift-0.9.0-patches/0006-Revert-Thrift-p5_IMPALA-3159-Add-SAN-Wildcard-SSL-cert-client-support-in-python.patch 1 file changed, 0 insertions(+), 0 deletions(-) Approvals: Laszlo Gaal: Looks good to me, approved Tim Armstrong: Verified -- To view, visit http://gerrit.cloudera.org:8080/16139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5 Gerrit-Change-Number: 16139 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Remove colon from filename
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16139 ) Change subject: Remove colon from filename .. Patch Set 2: I'm not sure - these patches are usually manually generated so not sure what happened here. -- To view, visit http://gerrit.cloudera.org:8080/16139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5 Gerrit-Change-Number: 16139 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Jul 2020 04:02:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Jul 2020 02:56:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6526/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 08 Jul 2020 01:23:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Wenzhe Zhou has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. IMPALA-9294: Support DATE for min-max runtime filter Implemented Date min-max filter and applied it to Kudu as other min-max runtime filters. Added new test cases for Date min-max filters. Testing: Passed all core tests. Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c --- M be/src/codegen/gen_ir_descriptions.py M be/src/runtime/date-value.h M be/src/util/min-max-filter-ir.cc M be/src/util/min-max-filter-test.cc M be/src/util/min-max-filter.cc M be/src/util/min-max-filter.h M common/protobuf/common.proto M common/thrift/Data.thrift M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-query/queries/QueryTest/all_runtime_filters.test M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 12 files changed, 295 insertions(+), 167 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/16103/8 -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 8 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168 PS3, Line 168: functional_kudu.date_tbl a : join [BROADCAST] functional_kudu.date_tbl b > While its a little hard to tell and I think that we do have a test that tec Will add a functional-query test as suggested http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@167 PS7, Line 167: # Query with date-typed join column > I think this case can be left out - its only showing that we'll generate DA will remove it http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@206 PS7, Line 206: # Query with 3-way shuffer joins on date-typed join columns > I think its reasonable to leave this case in as we don't already have a pla Will update comments and change the type of one joined columns http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@256 PS7, Line 256: # Query with 3-way join on date-typed join columns > I don't think this case is testing anything different from the existing cas will remove it http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@312 PS7, Line 312: # Query with 3-way join on date-typed and int-typed join columns > Same as above, I don't think this case is testing anything different from t Will remove it. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 7 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 08 Jul 2020 00:46:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16150 ) Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6525/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Wed, 08 Jul 2020 00:04:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 14: > Uploaded patch set 14. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 14 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Jul 2020 00:01:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6524/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 14 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 23:55:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16150 ) Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6523/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:47:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16150 ) Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc@69 PS1, Line 69: DEFINE_int32(statestore_client_rpc_timeout_ms, 30, "(Advanced) The underlying TSocket " > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py File tests/custom_cluster/test_rpc_timeout.py: http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py@230 PS1, Line 230: # > flake8: E131 continuation line unaligned for hanging indent Done -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:43:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Hello Thomas Tauber-Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16150 to look at the new patch set (#2). Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. IMPALA-3380: Add a timeout for statestore RPCs Adds the following startup flags for statestore subscribers: 'statestore_client_rpc_timeout_ms'. The timeout is set to 5 minutes by default. Testing: * Adds some tests for catalog_client_rpc_timeout_ms that validate the timeout is used correctly, and that retries are triggered Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 --- M be/src/catalog/catalog-server.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/statestore/statestore-subscriber.cc M tests/custom_cluster/test_rpc_timeout.py 5 files changed, 65 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16150/2 -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6105/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 07 Jul 2020 23:36:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 07 Jul 2020 23:36:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans
Qifan Chen has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/16098 ) Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans .. IMPALA-9744: Treat corrupt table stats as missing to avoid bad plans This work addresses the current limitation in computing the total row count for a Hive table in a scan. The row count can be incorrectly computed as 0, even though there exists data in the Hive table. This is the stats corruption at table level. Similar stats corruption exists for a partition. The row count of a table or a partition sometime can also be -1 which indicates a missing stats situation. In the fix, as long as no partition in a Hive table exhibits any missing or corrupt stats, the total row count for the table is computed from the row counts in all partitions. Otherwise, Impala looks at the table level stats particularly the table row count. In addition, if the table stats is missing or corrupted, Impala estimates a row count for the table, if feasible. This row count is the sum of the row count from the partitions with good stats, and an estimation of the number of rows in the partitions with missing or corrupt stats. Such estimation also applies when some partition has missing or corrupt stats. One way to observe the fix is through the explain of queries scanning Hive tables with missing or corrupted stats. The cardinality for any full scan should be a positive value (i.e. the estimated row count), instead of 'unavailable'. At the beginning of the explain output, that table is still listed in the WARNING section for potentially corrupt table statistics. Testing: 1. Ran unit tests with queries documented in the case against Hive tables with the following configrations: a. No stats corruption in any partitions b. Stats corruption in some partitions c. Stats corruption in all partitions 2. Added two new tests in test_compute_stats.py: a. test_corrupted_stats_in_partitioned_Hive_tables b. test_corrupted_stats_in_unpartitioned_Hive_tables 3. Fixed failures in corrupt-stats.test 4. Ran "core" test Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 --- M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M testdata/workloads/functional-query/queries/QueryTest/corrupt-stats.test M tests/metadata/test_compute_stats.py 3 files changed, 182 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/16098/14 -- To view, visit http://gerrit.cloudera.org:8080/16098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576 Gerrit-Change-Number: 16098 Gerrit-PatchSet: 14 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16150 ) Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: http://gerrit.cloudera.org:8080/#/c/16150/1/be/src/statestore/statestore-subscriber.cc@69 PS1, Line 69: DEFINE_int32(statestore_client_rpc_timeout_ms, 30, "(Advanced) The underlying TSocket " line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py File tests/custom_cluster/test_rpc_timeout.py: http://gerrit.cloudera.org:8080/#/c/16150/1/tests/custom_cluster/test_rpc_timeout.py@230 PS1, Line 230: # flake8: E131 continuation line unaligned for hanging indent -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:27:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3380: Add a timeout for statestore RPCs
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16150 Change subject: IMPALA-3380: Add a timeout for statestore RPCs .. IMPALA-3380: Add a timeout for statestore RPCs Adds the following startup flags for statestore subscribers: 'statestore_client_rpc_timeout_ms'. The timeout is set to 5 minutes by default. Testing: * Adds some tests for catalog_client_rpc_timeout_ms that validate the timeout is used correctly, and that retries are triggered Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 --- M be/src/catalog/catalog-server.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/statestore/statestore-subscriber.cc M tests/custom_cluster/test_rpc_timeout.py 5 files changed, 64 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/16150/1 -- To view, visit http://gerrit.cloudera.org:8080/16150 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If49892ff1950cf474f951aabf4c952dbc44189e2 Gerrit-Change-Number: 16150 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6104/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:01:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. Patch Set 4: Code-Review+2 carrying forward -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:00:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 5 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 23:01:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6522/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 22:15:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6521/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 22:14:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6520/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 22:03:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6102/ -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 22:04:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/16140/12/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/12/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@324 PS12, Line 324: // all aggregation classes to the final output produced by the transposing aggregation. line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:50:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6103/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 11: Code-Review+1 rebased onto latest master -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 12: Code-Review+1 rebased onto latest master -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:49:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#12). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,684 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/12 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 12 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Hello Aman Sinha, Quanlong Huang, Fang-Yu Rao, Qifan Chen, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16128 to look at the new patch set (#11). Change subject: IMPALA-9898: generate grouping set plans .. IMPALA-9898: generate grouping set plans Integrates the parsing and analysis with plan generation. Testing: * Add analysis test to make sure we reject unsupported queries. * Added targeted planner tests to ensure we generate the correct aggregation classes for a variety of cases. * Add targeted end-to-end functional tests. Added five TPC-DS queries that use ROLLUP, building on some work done by Fang-Yu Rao. Some tweaks were required for these tests. * Add an extra ORDER BY clause to q77 to make fully deterministic. * Add backticks around `returns` to avoid reserved word. * Add INTERVAL keyword to date/timestamp arithmetic. We can run q80, too, but I haven't added or verified results yet - that can be done in a follow-up. Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 --- M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java A testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test A testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q5.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77.test M tests/query_test/test_aggregation.py M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 22 files changed, 4,208 insertions(+), 66 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/16128/11 -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/16140/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/11/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@324 PS11, Line 324: // all aggregation classes to the final output produced by the transposing aggregation. line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:35:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#11). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,684 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/11 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 11 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@309 PS10, Line 309: aggInfos_.get(0).getResultSmap() : new ExprSubstitutionMap(); > Should this also clone() the smap like in the 'if' case ? If so, this assi I was trying to avoid the clone() when it wasn't necessary but was probably overthinking it :) - I doubt it makes a measurable difference. http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@328 PS10, Line 328: // all aggregation classes to he final output produced by the transposing aggregation. > nit: 'he'->'the' Done -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 21:34:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16148 ) Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6519/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 21:16:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9834: De-flake TestQueryRetries on EC builds
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16149 ) Change subject: IMPALA-9834: De-flake TestQueryRetries on EC builds .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6518/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71 Gerrit-Change-Number: 16149 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 21:14:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis Supports a single ROLLUP, CUBE or GROUPING SETS clause in GROUP BY. Also adds non-standard "WITH ROLLUP" and "WITH CUBE" syntax that is supported by some other SQL dialects. This implements basic parsing and validation of the query, then raises an AnalysisException to report that it is not supported so that incorrect plans will not be generated. This patch adds a GroupByClause to each SelectStmt that contains info about the grouping sets and the original GROUP BY list. The grouping exprs are still represented as a List in SelectStmt. Most of the logic, including statement and expr rewrites, can operate on this list of expressions without requiring special handling for grouping sets. Testing: * Add Parser test. * Add toSql() test. * Added analysis tests to check that analysis accepts or rejects queries correctly. Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Reviewed-on: http://gerrit.cloudera.org:8080/16112 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/jflex/sql-scanner.flex M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java 10 files changed, 759 insertions(+), 23 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 18 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 17: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 20:49:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Hello Thomas Tauber-Marshall, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16148 to look at the new patch set (#2). Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors Fixed two warnings reported by TSAN. One of them is a data race on exec_request_. The other is a lock-inversion warning. The data race on exec_request_ is present because the same exec_request_ is used for the original and retried query. The issue is when a query is retried it needs to set a new query id in the exec_request_, so retrying a query requires mutating the exec_request_. However, even after a query is retried the exec_request_ can still be accessed for the original query. Specifically, an exec status report from a fragment can still be propagated even after the query has been cancelled (e.g. ControlService::ReportExecStatus can still access the original exec_request_). IMPALA-9199 attempted to be smart about re-using the TExecRequest for retried queries, but given the race conditions it seems like a pre-mature optimization. We can re-visit IMPALA-9502 if we think it actually makes a perf difference. The lock-inversion warning is between the sharded locks in ShardedQueryMap (specifically the QueryDriverMap query_driver_map_ in ImpalaServer) and QueryDriver::client_request_state_lock_. The correct lock ordering is to acquire the sharded lock in query_driver_map_ and then to acquire the client_request_state_lock_. QueryDriver::RetryQueryFromThread reversed the ordering. I wasn't able to actually produce a deadlock, but fixing the ordering issue was trivial. Testing: * Ran core tests * Manually checked that the query retry tests are TSAN clean Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h 2 files changed, 20 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/16148/2 -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 2 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16148 ) Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h File be/src/runtime/query-driver.h: http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h@251 PS1, Line 251: /// The TExecRequest for the retried query. Created in 'CreateRetriedClientRequestState'. > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc@320 PS1, Line 320: parent_server_, *session, retry_exec_request_.get(), request_state->parent_driver()); > line too long (91 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 20:47:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9834: De-flake TestQueryRetries on EC builds
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16149 Change subject: IMPALA-9834: De-flake TestQueryRetries on EC builds .. IMPALA-9834: De-flake TestQueryRetries on EC builds This patch makes the following changes to de-flake TestQueryRetries on EC builds. It skips any test that scans tpch.lineitem (referred to as a the _shuffle_heavy_query in TestQueryRetries). This query runs on three instances during regular builds (HDFS, S3, etc.), but only two instances on EC builds. This causes some non-deterministism during the test becase killing an impalad in the mini-cluster won't necessarily cause a retry to be triggered. It bumps up the timeout used when waiting for a query to be retried. It improves the assertion in __get_query_id_from_profile so that it dumps the full profile when the assertion fails. This should help debuggability of any test failures that fail in this assertion. Testing: * Ran TestQueryRetries locally Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71 --- M tests/custom_cluster/test_query_retries.py 1 file changed, 11 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16149/1 -- To view, visit http://gerrit.cloudera.org:8080/16149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Id5c73c2cbd0ef369175856c41f36d4b0de4b8d71 Gerrit-Change-Number: 16149 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16148 ) Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6517/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 20:22:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 10: Code-Review+1 (2 comments) LGTM. http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@309 PS10, Line 309: aggInfos_.get(0).getResultSmap() : new ExprSubstitutionMap(); Should this also clone() the smap like in the 'if' case ? If so, this assignment is common to both if-else and can be moved out. http://gerrit.cloudera.org:8080/#/c/16140/10/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@328 PS10, Line 328: // all aggregation classes to he final output produced by the transposing aggregation. nit: 'he'->'the' -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 20:02:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16148 ) Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h File be/src/runtime/query-driver.h: http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.h@251 PS1, Line 251: /// The TExecRequest for the retried query. Created in 'CreateRetriedClientRequestState'. line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/16148/1/be/src/runtime/query-driver.cc@320 PS1, Line 320: parent_server_, *session, retry_exec_request_.get(), request_state->parent_driver()); line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 19:55:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors
Sahil Takiar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16148 Change subject: IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors .. IMPALA-9855, IMPALA-9854: Fix query retry TSAN errors Fixed two warnings reported by TSAN. One of them is a data race on exec_request_. The other is a lock-inversion warning. The data race on exec_request_ is present because the same exec_request_ is used for the original and retried query. The issue is when a query is retried it needs to set a new query id in the exec_request_, so retrying a query requires mutating the exec_request_. However, even after a query is retried the exec_request_ can still be accessed for the original query. Specifically, an exec status report from a fragment can still be propagated even after the query has been cancelled (e.g. ControlService::ReportExecStatus can still access the original exec_request_). IMPALA-9199 attempted to be smart about re-using the TExecRequest for retried queries, but given the race conditions it seems like a pre-mature optimization. We can re-visit IMPALA-9502 if we think it actually makes a perf difference. The lock-inversion warning is between the sharded locks in ShardedQueryMap (specifically the QueryDriverMap query_driver_map_ in ImpalaServer) and QueryDriver::client_request_state_lock_. The correct lock ordering is to acquire the sharded lock in query_driver_map_ and then to acquire the client_request_state_lock_. QueryDriver::RetryQueryFromThread reversed the ordering. I wasn't able to actually produce a deadlock, but fixing the ordering issue was trivial. Testing: * Ran core tests * Manually checked that the query retry tests are TSAN clean Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 --- M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h 2 files changed, 17 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/16148/1 -- To view, visit http://gerrit.cloudera.org:8080/16148 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ife2c7492524647bfd8f053dacbda4c553a64eb61 Gerrit-Change-Number: 16148 Gerrit-PatchSet: 1 Gerrit-Owner: Sahil Takiar
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16095 ) Change subject: IMPALA-9633: Implement ds_hll_union() .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/16095/4/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/16095/4/be/src/exprs/aggregate-functions-ir.cc@1736 PS4, Line 1736: DS_SKETCH_CONFIG, DS_HLL_TYPE Do these configs matter, or they are overwritten during deserialization? My understanding is that we should be able to union sketches with different configs. See https://datasketches.apache.org/docs/Architecture/SketchCriteria.html / Mergeable With Different Size Parameters http://gerrit.cloudera.org:8080/#/c/16095/4/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java: http://gerrit.cloudera.org:8080/#/c/16095/4/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@1336 PS4, Line 1336: Lists.newArrayList(Type.STRING), Type.STRING, Type.STRING, : prefix + "14DsHllUnionInitEPN10impala_udf15FunctionContextEPNS1_9StringValE", : prefix + "16DsHllUnionUpdateEPN10impala_udf15FunctionContextERKNS1_9StringValEPS4_", : prefix + "15DsHllUnionMergeEPN10impala_udf15FunctionContextERKNS1_9StringValEPS4_", : prefix + "19DsHllUnionSerializeEPN10impala_udf15FunctionContextERKNS1_9StringValE", : prefix + "18DsHllUnionFinalizeEPN10impala_udf15FunctionContextERKNS1_9StringValE", : true, false, true)); nit: indentation -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 4 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 19:46:20 + Gerrit-HasComments: Yes
[native-toolchain-CR] Remove colon from filename
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/16139 ) Change subject: Remove colon from filename .. Patch Set 1: Code-Review+2 LGTM. Would there be an easy way to fix the process that emits such a filename? Not to stop this patch, but as a future improvement/safeguard against a reoccurrence. -- To view, visit http://gerrit.cloudera.org:8080/16139 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I731f9b818376fcc57f9e47a2bd1bc26ca0b41dc5 Gerrit-Change-Number: 16139 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 07 Jul 2020 19:30:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6516/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 19:27:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6515/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 19:11:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#10). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 were added with reference results from https://github.com/cwida/tpcds-result-reproduction * Add targeted end-to-end tests. * Added view compatibility test with Hive. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,688 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/10 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#9). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 should also be runnable but I did not have reference results readily available. * Add targeted end-to-end tests. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test M testdata/workloads/functional-query/queries/QueryTest/views-compatibility.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 19 files changed, 2,688 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/9 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 9 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 8: I added a couple of basic view compatibility tests. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 19:08:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 8: Oh I missed the logical view comment, will think about that. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 18:45:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Hello Aman Sinha, Shant Hovsepian, David Rorke, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16140 to look at the new patch set (#8). Change subject: IMPALA-9917: grouping() and grouping_id() support .. IMPALA-9917: grouping() and grouping_id() support Implements grouping() and grouping_id() builtins. grouping_id() has both a no-arg version, which returns a bit vector of all grouping exprs and a varargs version, which returns a bit vector of the provided arguments. Grouping is a keyword, so needs special handling in the parser to be accepted as a function name. These functions are implemented in the transpose agg with a CASE expression similar to other aggregate functions, but returning the grouping() or grouping_id() value for that aggregation class instead of an aggregated value. Testing: * Added parser test for grouping keyword. * Added analysis tests for the functions. * Added basic planner test to show expressions generated * Added some TPC-DS queries that use grouping() - queries 80, 70 and 86 using reference .test files from Fang-Yu Rao. 27 and 36 should also be runnable but I did not have reference results readily available. * Add targeted end-to-end tests. Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GroupByClause.java M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-query/queries/QueryTest/grouping-sets.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q27.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q36.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80.test A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86.test M tests/query_test/test_tpcds_queries.py M tests/util/parse_util.py 18 files changed, 2,659 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/16140/8 -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 8 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16140/7//COMMIT_MSG@28 PS7, Line 28: 27 and 36 sho > https://github.com/cwida/tpcds-result-reproduction/blob/master/answer_sets_ Added the rollup variants of those two with the same parameters as the references results and compared to those reference results. They were the same up to rounding and formatting differences. http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@2 PS7, Line 2: > nit: remove newline Done http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@352 PS7, Line 352: groupingIdExprs.add(aggExpr); > The groupingIdExprs list does not get used ? Done http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394 PS7, Line 394: // special-casingthe degenerate case of these functions being used with a single > nit: add space between 'casing', 'the' Done http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@398 PS7, Line 398: throw new AnalysisException("Aggregate function '" + > It's probably ok to reject the query but strictly speaking, grouping() func I looked at this again and it's not too hard to implement by adding smap entries, so I just went and did it - might as well be complete. http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@586 PS7, Line 586: String name = aggExpr.getFnName().getFunction(); > 'name' variable does not get used. Done http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@667 PS7, Line 667:* Return the return value for grouping() or grouping_id() for a particular aggregation > nit: remove the second 'return' Done http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java File fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java@150 PS7, Line 150:* analysis. > Mention in the comment that it's used for implementing grouping() as an exa Done http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test File testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test: http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1169 PS7, Line 1169: # introduce NULLs. > nit: not clear what 'introduce nulls' means here. I think this was a fat finger error. Removed it. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 18:44:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/ -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 11 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 18:17:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9294: Support DATE for min-max runtime filter
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/16103 ) Change subject: IMPALA-9294: Support DATE for min-max runtime filter .. Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/3/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@168 PS3, Line 168: functional_kudu.date_tbl a : join [BROADCAST] functional_kudu.date_tbl b > It's working too. Will add a new test case. While its a little hard to tell and I think that we do have a test that technically hits this path, it would be good to add a functional-query test that is specifically targeted at this, eg. in min_max_filters.test add a "Test case 5" with a query like one of the three-way shuffle joins that I'm asking you to remove from the functional-planner/min-max-runtime-filters.test http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test File testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test: http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@167 PS7, Line 167: # Query with date-typed join column I think this case can be left out - its only showing that we'll generate DATE min-max filters now, but we already have tests for that in the functional-query tests and we don't already have planner tests for all of the other min-max filter types. http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@206 PS7, Line 206: # Query with 3-way join on date-typed join columns I think its reasonable to leave this case in as we don't already have a planner test for a three-way join with all Kudu tables, but lets rewrite this comment to say that's what we're testing, i.e. its not the fact that we have DATE columns here that makes this case different from the others (you might even change the type of one of the columns just for variety) its the fact that its a three way join with two different join nodes generating different runtime filters. http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@256 PS7, Line 256: 00:SCAN KUDU [functional_kudu.date_tbl a] I don't think this case is testing anything different from the existing case above labeled "Query with both Kudu and HDFS targets" and so it can be removed. http://gerrit.cloudera.org:8080/#/c/16103/7/testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test@312 PS7, Line 312:runtime filters: RF001[min_max] -> a.id_col, RF003[min_max] -> a.date_col Same as above, I don't think this case is testing anything different from the existing case above labeled "Query with both Kudu and HDFS targets" and so it can be removed. -- To view, visit http://gerrit.cloudera.org:8080/16103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic2f6e2dc6949735d5f0fcf317361cc2969a5e82c Gerrit-Change-Number: 16103 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 07 Jul 2020 18:07:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9898: generate grouping set plans
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16128 ) Change subject: IMPALA-9898: generate grouping set plans .. Patch Set 10: (1 comment) > Uploaded patch set 10. http://gerrit.cloudera.org:8080/#/c/16128/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test File testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test: http://gerrit.cloudera.org:8080/#/c/16128/6/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test@8835 PS6, Line 8835: 75.61K > You're right about this - we apply CARDINALITY_FILTER in the planner tests Good to know. Thanks! -- To view, visit http://gerrit.cloudera.org:8080/16128 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie454c5bf7aee266321dee615548d7f2b71380197 Gerrit-Change-Number: 16128 Gerrit-PatchSet: 10 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 18:03:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16122 ) Change subject: IMPALA-9692 (part 4): Rename QuerySchedule to ScheduleState .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16122 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I57069c4a426f3e697df7e2a07754d063bdea26f7 Gerrit-Change-Number: 16122 Gerrit-PatchSet: 3 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Jul 2020 17:16:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 3: I was a little scared since Q1 is a decimal-intensive query, but seems to be fine when I tried to repro: Report Generated on 2020-07-07 Run Description: "6c8a3dfc339e43a8992af2ff3429ba5940a061ec vs 513c19bc0a750960b97f0d4cd14a9bdc8bbd2860" Cluster Name: UNKNOWN Lab Run Info: UNKNOWN Impala Version: impalad version 4.0.0-SNAPSHOT RELEASE () Baseline Impala Version: impalad version 4.0.0-SNAPSHOT RELEASE (2020-07-01) +--+---+-++++ | Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) | +--+---+-++++ | TPCH(30) | parquet / none / none | 10.05 | +0.18% | 10.05 | +0.18% | +--+---+-++++ +--+-+---++-++---++---++-+--+ | Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval | +--+-+---++-++---++---++-+--+ | TPCH(30) | TPCH-Q1 | parquet / none / none | 10.05 | 10.03 | +0.18% | 1.16% | 0.64%| 20| -0.00% | -0.04 | 0.61 | +--+-+---++-++---++---++-+--+ -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6102/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 16:57:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16146 ) Change subject: IMPALA-9921: Change error messages in checking needsQuotes to TRACE level logs .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16146 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0e1b5d2963285dc9125d8e0b8ed25c4db6821e0b Gerrit-Change-Number: 16146 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 07 Jul 2020 15:42:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 16: Hit IMPALA-9923 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 15:38:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 17: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6101/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 15:36:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 17 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 15:36:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16095 ) Change subject: IMPALA-9633: Implement ds_hll_union() .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6514/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 15:31:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6513/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Jul 2020 15:29:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 11: (12 comments) http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@376 PS10, Line 376: > Could you add a TODO in this method for IMPALA-9883? Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2010 PS10, Line 2010: getNumFileDescriptors(); > Please change this to getNumFileDescriptors() after we revise its implement Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@56 PS10, Line 56: import org.apache.thrift.TException; > nit: unused import Thanks for pointing these out! Done. http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@255 PS10, Line 255: } > nit: 4 spaces indent Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@409 PS10, Line 409: public ImmutableList getFileDescriptors() { : if (insertFds_.isEmpty()) return fds_; : List ret = Lists.newArrayListWithCapacity( : insertFds_.size() + deleteFds_.size()); : ret.addAll(insertFds_); : ret.addAll(deleteFds_); : return ImmutableList.copyOf(ret); : } : : @Override : public ImmutableList getInsertFileDescriptors() { : return insertFds_; : } : > Need to update these Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@133 PS10, Line 133: !fileDescriptors_.isEmpty() || > nit: It's inefficient to construct the list for this. Maybe change to Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@138 PS10, Line 138: > nit: It's inefficient to construct the list for this. Maybe change to Instead of the ternary operator, I rather just sum up the sizes to keep it more readable. http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@45 PS9, Line 45: import org.apache.impala.analysis.NullLiteral; > Looks like IntelliJ is more intelligent in this :D Yeah :D It's a shame vscode fails on such a trivial task. But apart from this it's a great IDE. http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1518 PS9, Line 1518: feTable.getMetaStoreTable().getParameters())); > PS10 looks good to me. I think the insert files and delete files are used s I'm OK with this if you think it's good. Yeah, I was also thinking about merging fileDescriptors_ and insertFileDescriptors_. My only concern is that it would be a bit weird for genDeleteDeltaPartition(), because it would put the delete delta descriptors into the regular descriptors. http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1536 PS10, Line 1536: hdfsTblRef.getDesc(), conjuncts, insertDeltaPartitions, hdfsTblRef, > line too long (93 > 90) Done http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java File fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java@497 PS10, Line 497: () { > not *Fail* anymore Done ht
[Impala-ASF-CR] IMPALA-9633: Implement ds hll union()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16095 to look at the new patch set (#3). Change subject: IMPALA-9633: Implement ds_hll_union() .. IMPALA-9633: Implement ds_hll_union() This function receives a set of sketches produced by ds_hll_sketch() and merges them into a single sketch. An example usage is to create a sketch for each partition of a table, write these sketches to a separate table and based on which partition the user is interested of the relevant sketches can be union-ed together to get an estimate. E.g.: SELECT ds_hll_estimate(ds_hll_union(sketch_col)) FROM sketch_tbl WHERE partition_col=1 OR partition_col=5; Testing: - Apart from the automated tests I added to this patch I also tested ds_hll_union() on a bigger dataset to check that serialization, deserialization and merging steps work well. I took TPCH25.linelitem, created a number of sketches with grouping by l_shipdate and called ds_hll_union() on those sketches. Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 --- M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-common.cc A be/src/exprs/datasketches-common.h M be/src/exprs/datasketches-functions-ir.cc M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test 8 files changed, 232 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/16095/3 -- To view, visit http://gerrit.cloudera.org:8080/16095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I67cdbf6f3ebdb1296fea38465a15642bc9612d09 Gerrit-Change-Number: 16095 Gerrit-PatchSet: 3 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/16082/11/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16082/11/tests/custom_cluster/test_local_catalog.py@30 PS11, Line 30: from tests.common.skip import (SkipIfHive2, SkipIfCatalogV2, SkipIfS3, SkipIfABFS, flake8: F401 'tests.common.skip.SkipIfCatalogV2' imported but unused -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Jul 2020 15:01:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Hello Aman Sinha, Quanlong Huang, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16082 to look at the new patch set (#11). Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) Hive ACID supports row-level DELETE and UPDATE operations on a table. It achieves it via assigning a unique row-id for each row, and maintaining two sets of files in a table. The first set is in the base/delta directories, they contain the INSERTed rows. The second set of files are in the delete-delta directories, they contain the DELETEd rows. (UPDATE operations are implemented via DELETE+INSERT.) In the filesystem it looks like e.g.: * full_acid/delta_001_001_/_0 * full_acid/delta_002_002_/_0 * full_acid/delete_delta_003_003_/_0 During scanning we need to return INSERTed rows minus DELETEd rows. This patch implements it by creating an ANTI JOIN between the INSERT and DELETE sets. It is a planner-only modification. Every HDFS SCAN that scans full ACID tables (that also have deleted rows) are converted to two HDFS SCANs, one for the INSERT deltas, and one for the DELETE deltas. Then a LEFT ANTI HASH JOIN with BROADCAST distribution mode is created above them. Later we can add support for other distribution modes if the performance requires it. E.g. if we have too many deleted rows then probably we are better off with PARTITIONED distribution mode. We could estimate the number of deleted rows by sampling the delete delta files. The current patch only works for primitive types. I.e. we cannot select nested data if the table has deleted rows. Testing: * added planner test * added e2e tests Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/catalog/local/MetaProvider.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/util/AcidUtils.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test M testdata/workloads/functional-query/queries/QueryTest/acid-negative.test A testdata/workloads/functional-query/queries/QueryTest/full-acid-scans.test M tests/custom_cluster/test_local_catalog.py M tests/query_test/test_acid.py 27 files changed, 1,460 insertions(+), 146 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16082/11 -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 11 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] POC: Remote codegen (Milestone 1)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14735 ) Change subject: POC: Remote codegen (Milestone 1) .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6512/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f Gerrit-Change-Number: 14735 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 14:16:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Gabor Kaszab has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() These functions can be used to get cardinality estimates of data using HLL algorithm from Apache DataSketches. ds_hll_sketch() receives a dataset, e.g. a column from a table, and returns a serialized HLL sketch in string format. This can be written to a table or be fed directly to ds_hll_estimate() that returns the cardinality estimate for that sketch. Comparing to ndv() these functions bring more flexibility as once we fed data to the sketch it can be written to a table and next time we can save scanning through the dataset and simply return the estimate using the sketch. This doesn't come for free, however, as perfomance measurements show that ndv() is 2x-3.5x faster than sketching. On the other hand if we query the estimate from an existing sketch then the runtime is negligible. Another flexibility with these sketches is that they can be merged together so e.g. if we had saved a sketch for each of the partitions of a table then they can be combined with each other based on the query without touching the actual data. DataSketches HLL is sensitive for the order of the data fed to the sketch and as a result running these algorithms in Impala gets non-deterministic results within the error bounds of the algorithm. In terms of correctness DataSketches HLL is most of the time in 2% range from the correct result but there are occasional spikes where the difference is bigger but never goes out of the range of 5%. Even though the DataSketches HLL algorithm could be parameterized currently this implementation hard-codes these parameters and use HLL_4 and lg_k=12. For more details about Apache DataSketches' HLL implementation see: https://datasketches.apache.org/docs/HLL/HLL.html Testing: - Added some tests running estimates for small datasets where the amount of data is small enough to get the correct results. - Ran manual tests on TPCH25.lineitem to compare perfomance with ndv(). Depending on data characteristics ndv() appears 2x-3.5x faster. The lower the cardinality of the dataset the bigger the difference between the 2 algorithms is. - Ran manual tests on TPCH25.lineitem and functional_parquet.alltypes to compare correctness with ndv(). See results above. Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Reviewed-on: http://gerrit.cloudera.org:8080/16000 Tested-by: Impala Public Jenkins Reviewed-by: Csaba Ringhofer --- M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-functions-ir.cc A be/src/exprs/datasketches-functions.h M be/src/exprs/datasketches-test.cc M be/src/exprs/scalar-expr-evaluator.cc M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/Function.java M testdata/data/README A testdata/data/hll_sketches_from_hive.parquet A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test A tests/query_test/test_datasketches.py 17 files changed, 505 insertions(+), 8 deletions(-) Approvals: Impala Public Jenkins: Verified Csaba Ringhofer: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 17 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 16: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 16 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 14:08:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 16: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 16 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 14:01:48 + Gerrit-HasComments: No
[Impala-ASF-CR] POC: Remote codegen (Milestone 1)
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14735 ) Change subject: POC: Remote codegen (Milestone 1) .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/codegen/llvm-codegen.h@346 PS5, Line 346: /// If 'async' is true, start executing 'FinalizeModule' in a separate thread and return. line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/service/fe-support.cc File be/src/service/fe-support.cc: http://gerrit.cloudera.org:8080/#/c/14735/5/be/src/service/fe-support.cc@242 PS5, Line 242: status = codegen->FinalizeModule(false, false, nullptr); /// TODO: Check whether the arguments are ok. line too long (106 > 90) http://gerrit.cloudera.org:8080/#/c/14735/5/testdata/bin/server.py File testdata/bin/server.py: http://gerrit.cloudera.org:8080/#/c/14735/5/testdata/bin/server.py@207 PS5, Line 207: if __name__ == "__main__": flake8: E305 expected 2 blank lines after class or function definition, found 1 -- To view, visit http://gerrit.cloudera.org:8080/14735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f Gerrit-Change-Number: 14735 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 13:56:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] POC: Remote codegen (Milestone 1)
Daniel Becker has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14735 ) Change subject: POC: Remote codegen (Milestone 1) .. POC: Remote codegen (Milestone 1) Optional remote codegen, set by a query option. Can be used together with async. The Impala executor does everything as usual until the point when it would optimize and compile the LLVM module. Instead, it sends the LLVM bitcode to the compile server which optimizes it and sends back an object file (handled in memory, not written to disk). The executor blocks until it receives the compiled object file. Note that the customization of the LLVM module (replacing function calls) is still done by the executor. The server is a python script that calls llc to compile the code. It will later be replaced probably by a special impalad that uses the LLVM API to compile the code. Communication is not secure. Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f --- A be/src/codegen/LLVMSmallVectorMemoryBuffer.h M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h A be/src/codegen/simple_message.h M be/src/runtime/fragment-state.cc M be/src/service/fe-support.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M testdata/bin/kill-all.sh M testdata/bin/run-all.sh A testdata/bin/server.py 13 files changed, 611 insertions(+), 60 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/14735/5 -- To view, visit http://gerrit.cloudera.org:8080/14735 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I375d969ceb31b3ee234d6ca56a77d508ae43bb0f Gerrit-Change-Number: 14735 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 11 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6100/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 11 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 13:08:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6511/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 10 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 11:56:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Adam Tamas has posted comments on this change. ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. Patch Set 10: (2 comments) http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc File be/src/exprs/timestamp-functions.cc: http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/exprs/timestamp-functions.cc@154 PS7, Line 154: // The purpose of making this .cc only is to avoid moving the code of > I'd rather say something to indicate that the purpose of making this .cc on Done http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc File be/src/runtime/datetime-simple-date-format-parser.cc: http://gerrit.cloudera.org:8080/#/c/15866/7/be/src/runtime/datetime-simple-date-format-parser.cc@56 PS7, Line 56: Tokenize(&DEFAULT_DATE_TIME_CTX[i], PARSE); > It's no longer needed to give the 3rd parameter in case it's true, right? L Done -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 10 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 11:27:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9531: Dropped support for dateless timestamps
Adam Tamas has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/15866 ) Change subject: IMPALA-9531: Dropped support for dateless timestamps .. IMPALA-9531: Dropped support for dateless timestamps Removed the support for dateless timestamps. During dateless timestamp casts if the format doesn't contain date part we get an error during tokenization of the format. If the input str doesn't contain a date part then we get null result. Examples: select cast('01:02:59' as timestamp); This will come back as NULL value. select to_timestamp('01:01:01', 'HH:mm:ss'); select cast('01:02:59' as timestamp format 'HH12:MI:SS'); select cast('12 AM' as timestamp FORMAT 'AM.HH12'); These will come back with a parsing errors. Casting from a table will generate similar results. Testing: Modified the previous tests related to dateless timestamps. Added test to read fromtables which are still containing dateless timestamps and covered timestamp to string path when no date tokens are requested in the output string. Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 --- M be/src/benchmarks/convert-timestamp-benchmark.cc M be/src/benchmarks/parse-timestamp-benchmark.cc M be/src/exec/text-converter.inline.h M be/src/exprs/cast-functions-ir.cc M be/src/exprs/expr-test.cc M be/src/exprs/scalar-expr-evaluator.cc M be/src/exprs/timestamp-functions-ir.cc M be/src/exprs/timestamp-functions.cc M be/src/exprs/timestamp-functions.h M be/src/runtime/date-parse-util.cc M be/src/runtime/date-test.cc M be/src/runtime/datetime-iso-sql-format-tokenizer.cc M be/src/runtime/datetime-parser-common.cc M be/src/runtime/datetime-parser-common.h M be/src/runtime/datetime-simple-date-format-parser.cc M be/src/runtime/datetime-simple-date-format-parser.h M be/src/runtime/timestamp-parse-util.cc M be/src/runtime/timestamp-test.cc M be/src/runtime/timestamp-value.h M bin/rat_exclude_files.txt M common/function-registry/impala_functions.py M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java M testdata/data/README A testdata/data/dateless_timestamps.parq A testdata/data/dateless_timestamps.txt M testdata/data/lazy_timestamp.csv M testdata/workloads/functional-query/queries/QueryTest/date.test A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_parquet.test A testdata/workloads/functional-query/queries/QueryTest/dateless_timestamp_text.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/select-lazy-timestamp.test M tests/data_errors/test_data_errors.py M tests/query_test/test_cast_with_format.py M tests/query_test/test_scanners.py 34 files changed, 277 insertions(+), 231 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/15866/10 -- To view, visit http://gerrit.cloudera.org:8080/15866 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I48c49bf027cc4b917849b3d58518facba372b322 Gerrit-Change-Number: 15866 Gerrit-PatchSet: 10 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16112 ) Change subject: IMPALA-9897: GROUPING SETS/CUBE/ROLLUP parsing and analysis .. Patch Set 16: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6097/ -- To view, visit http://gerrit.cloudera.org:8080/16112 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I178e45de94d736630c97ae1ec4a92423cd74621f Gerrit-Change-Number: 16112 Gerrit-PatchSet: 16 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 10:28:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7923: DecimalValue should be marked as packed
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/16134 ) Change subject: IMPALA-7923: DecimalValue should be marked as packed .. Patch Set 3: I did the test on parquet/none/none, scale factor 15. Copied the results to the Jira issue. TPCH-Q1 became a regression. -- To view, visit http://gerrit.cloudera.org:8080/16134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55f936a4f4f4b5faf129a9265222e64fc486b8ed Gerrit-Change-Number: 16134 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 10:20:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types)
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16082 ) Change subject: IMPALA-9859: Full ACID Milestone 4: Part 1 Reading modified tables (primitive types) .. Patch Set 10: (9 comments) Thanks for making the change! I'm still looking into the join related logic and tests. http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java File fe/src/main/java/org/apache/impala/catalog/FeFsTable.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/FeFsTable.java@376 PS10, Line 376: getFilesSample Could you add a TODO in this method for IMPALA-9883? http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@2010 PS10, Line 2010: getFileDescriptors().size() Please change this to getNumFileDescriptors() after we revise its implementation. http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@56 PS10, Line 56: import org.apache.orc.FileMetadata; nit: unused import http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@255 PS10, Line 255: nit: 4 spaces indent http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java File fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java: http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@133 PS10, Line 133: !getFileDescriptors().isEmpty() nit: It's inefficient to construct the list for this. Maybe change to return !fileDescriptors_.isEmpty() || !insertFileDescriptors_.isEmpty() || !deleteFileDescriptors_.isEmpty(). http://gerrit.cloudera.org:8080/#/c/16082/10/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java@138 PS10, Line 138: getFileDescriptors().size() nit: It's inefficient to construct the list for this. Maybe change to return fileDescriptors_.isEmpty()? insertFileDescriptors_.size() + deleteFileDescriptors_.size() : fileDescriptors_.size(); http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@45 PS9, Line 45: import org.apache.impala.analysis.NullLiteral; > Yeah, I use my IDE to do that for me. Interestingly, even if I delete this Looks like IntelliJ is more intelligent in this :D http://gerrit.cloudera.org:8080/#/c/16082/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@1518 PS9, Line 1518: feTable.getMetaStoreTable().getParameters())); > I implemented it in PS10, but I'm afraid that the change became a bit too i PS10 looks good to me. I think the insert files and delete files are used separately so it makes sense to store them in different fileds. However, It'd be good to ask more feedbacks on this decision. Maybe we can merge fileDescriptors_ and insertFileDescriptors_ to one field (since only one of them is valid at a time) to reduce the complexity. http://gerrit.cloudera.org:8080/#/c/16082/10/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/16082/10/tests/custom_cluster/test_local_catalog.py@467 PS10, Line 467: def test_full_acid_support(self): Could you add a test here for reading functional_orc_def.alltypes_deleted_rows in LocalCatalog mode? -- To view, visit http://gerrit.cloudera.org:8080/16082 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I15c8feabf40be1658f3dd46883f5a1b2aa5d0659 Gerrit-Change-Number: 16082 Gerrit-PatchSet: 10 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Jul 2020 09:59:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6510/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 15 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 09:25:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6099/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 16 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 08:57:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16000 ) Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. Patch Set 15: Apparently, test_ddl.py/test_functions_ddl failed with IllegalStateException because for unsupported functions there were some unset members. -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 15 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 07 Jul 2020 08:57:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9632: Implement ds hll sketch() and ds hll estimate()
Hello Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16000 to look at the new patch set (#15). Change subject: IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() .. IMPALA-9632: Implement ds_hll_sketch() and ds_hll_estimate() These functions can be used to get cardinality estimates of data using HLL algorithm from Apache DataSketches. ds_hll_sketch() receives a dataset, e.g. a column from a table, and returns a serialized HLL sketch in string format. This can be written to a table or be fed directly to ds_hll_estimate() that returns the cardinality estimate for that sketch. Comparing to ndv() these functions bring more flexibility as once we fed data to the sketch it can be written to a table and next time we can save scanning through the dataset and simply return the estimate using the sketch. This doesn't come for free, however, as perfomance measurements show that ndv() is 2x-3.5x faster than sketching. On the other hand if we query the estimate from an existing sketch then the runtime is negligible. Another flexibility with these sketches is that they can be merged together so e.g. if we had saved a sketch for each of the partitions of a table then they can be combined with each other based on the query without touching the actual data. DataSketches HLL is sensitive for the order of the data fed to the sketch and as a result running these algorithms in Impala gets non-deterministic results within the error bounds of the algorithm. In terms of correctness DataSketches HLL is most of the time in 2% range from the correct result but there are occasional spikes where the difference is bigger but never goes out of the range of 5%. Even though the DataSketches HLL algorithm could be parameterized currently this implementation hard-codes these parameters and use HLL_4 and lg_k=12. For more details about Apache DataSketches' HLL implementation see: https://datasketches.apache.org/docs/HLL/HLL.html Testing: - Added some tests running estimates for small datasets where the amount of data is small enough to get the correct results. - Ran manual tests on TPCH25.lineitem to compare perfomance with ndv(). Depending on data characteristics ndv() appears 2x-3.5x faster. The lower the cardinality of the dataset the bigger the difference between the 2 algorithms is. - Ran manual tests on TPCH25.lineitem and functional_parquet.alltypes to compare correctness with ndv(). See results above. Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe --- M be/src/codegen/impala-ir.cc M be/src/exprs/CMakeLists.txt M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h A be/src/exprs/datasketches-functions-ir.cc A be/src/exprs/datasketches-functions.h M be/src/exprs/datasketches-test.cc M be/src/exprs/scalar-expr-evaluator.cc M common/function-registry/impala_functions.py M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/catalog/AggregateFunction.java M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/Function.java M testdata/data/README A testdata/data/hll_sketches_from_hive.parquet A testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test A tests/query_test/test_datasketches.py 17 files changed, 505 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/16000/15 -- To view, visit http://gerrit.cloudera.org:8080/16000 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic602cb6eb2bfbeab37e5e4cba11fbf0ca40b03fe Gerrit-Change-Number: 16000 Gerrit-PatchSet: 15 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9917: grouping() and grouping id() support
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/16140 ) Change subject: IMPALA-9917: grouping() and grouping_id() support .. Patch Set 7: (7 comments) Mostly minor comments. http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@2 PS7, Line 2: nit: remove newline http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java File fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java: http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@352 PS7, Line 352: groupingIdExprs.add(aggExpr); The groupingIdExprs list does not get used ? http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@394 PS7, Line 394: // special-casingthe degenerate case of these functions being used with a single nit: add space between 'casing', 'the' http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@398 PS7, Line 398: throw new AnalysisException("Aggregate function '" + It's probably ok to reject the query but strictly speaking, grouping() function can occur with just a simple GROUP BY also without grouping sets. Hive actually supports it. All grouping() values show up as 0 in that case. http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@586 PS7, Line 586: String name = aggExpr.getFnName().getFunction(); 'name' variable does not get used. http://gerrit.cloudera.org:8080/#/c/16140/7/fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java@667 PS7, Line 667:* Return the return value for grouping() or grouping_id() for a particular aggregation nit: remove the second 'return' http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test File testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test: http://gerrit.cloudera.org:8080/#/c/16140/7/testdata/workloads/functional-planner/queries/PlannerTest/grouping-sets.test@1169 PS7, Line 1169: # introduce NULLs. nit: not clear what 'introduce nulls' means here. -- To view, visit http://gerrit.cloudera.org:8080/16140 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0b1640d606256c0fe9204d2a21a8f6d06abcdb6 Gerrit-Change-Number: 16140 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: David Rorke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Shant Hovsepian Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Jul 2020 07:02:02 + Gerrit-HasComments: Yes