[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8357/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 07:04:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6965/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 06:55:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 06:53:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@123
PS5, Line 123: // If maxAllowedScratchLimit < minMemReservat
> nit: Shouldn't this be maxAllowedScratchLimit < minMemReservationBytes ?
Thanks! Sorry for missing this.

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@125
PS5, Line 125: If maxAllowedScratchLimit < maxMemReserva
> nit: Similar update needed here ?
Done

-- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 06:44:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17166 to look at the new patch set (#6).

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

IMPALA-10565: Adjust result spooling memory based on scratch_limit

IMPALA-9856 enables result spooling by default. Result spooling depends on the ability to spill its entire BufferedTupleStream to disk once it hits maximum memory reservation. However, if the query option scratch_limit is set lower than max_spilled_result_spooling_mem, the query might fail in the middle of execution due to insufficient scratch space.

This patch adds a planner change to consider scratch_limit and scratch_dirs query option when computing the resources used by result spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes required to use BufferedPlanRootSink, we set spool_query_results to false and fall back to BlockingPlanRootSink.
* If scratch_limit > minMemReservationBytes but still fairly low, we lower max_result_spooling_mem (default is 100MB) and max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.
* If scratch_limit > max_spilled_result_spooling_mem, do nothing.

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 --- M be/src/service/query-options-test.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test M tests/custom_cluster/test_scratch_disk.py M tests/query_test/test_scratch_limit.py 8 files changed, 143 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/6 -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
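
The three cases in the commit message above can be sketched as a small decision function. This is only an illustration, not the code in PlanRootSink.java: the class `SpoolingMemorySketch`, the `Decision` holder, and the `plan` signature are hypothetical, the clamping with `Math.min` is an assumption about what "lower ... to fit scratch_limit" means, and the sketch assumes a bounded, non-negative scratch_limit.

```java
// Hypothetical sketch of the three cases from the commit message.
// None of these names come from the Impala code base.
final class SpoolingMemorySketch {
  static final long MB = 1024L * 1024L;

  /** Planner outcome: whether to spool, plus the (possibly lowered) limits. */
  static final class Decision {
    final boolean spoolQueryResults;
    final long maxResultSpoolingMem;
    final long maxSpilledResultSpoolingMem;

    Decision(boolean spool, long maxMem, long maxSpilled) {
      this.spoolQueryResults = spool;
      this.maxResultSpoolingMem = maxMem;
      this.maxSpilledResultSpoolingMem = maxSpilled;
    }
  }

  static Decision plan(boolean hasScratchDirs, long scratchLimit,
      long minMemReservationBytes, long maxResultSpoolingMem,
      long maxSpilledResultSpoolingMem) {
    // Case 1: no scratch dirs, or the limit cannot even cover the minimum
    // reservation: disable spooling (the blocking sink would be used).
    if (!hasScratchDirs || scratchLimit < minMemReservationBytes) {
      return new Decision(false, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 3: the limit is above the spill cap, so nothing needs to change.
    if (scratchLimit > maxSpilledResultSpoolingMem) {
      return new Decision(true, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 2: the limit is usable but low. Assumption: clamp both settings
    // so that neither exceeds scratch_limit.
    long spilled = Math.min(maxSpilledResultSpoolingMem, scratchLimit);
    long inMemory = Math.min(maxResultSpoolingMem, spilled);
    return new Decision(true, inMemory, spilled);
  }

  public static void main(String[] args) {
    Decision d = plan(true, 50 * MB, 4 * MB, 100 * MB, 1024 * MB);
    System.out.println(d.spoolQueryResults + " "
        + d.maxResultSpoolingMem / MB + "MB "
        + d.maxSpilledResultSpoolingMem / MB + "MB");
  }
}
```

The 4MB minimum reservation used in `main` is a made-up value for the example; the 100MB and 1GB figures are the defaults quoted in the commit message.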
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

Patch Set 5: Code-Review+1

(2 comments)

Couple of nits. Rest of it LGTM.

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@123
PS5, Line 123: // If scratch_limit < maxAllowedScratchLimit,
nit: Shouldn't this be maxAllowedScratchLimit < minMemReservationBytes ?

http://gerrit.cloudera.org:8080/#/c/17166/5/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@125
PS5, Line 125: If scratch_limit < maxAllowedScratchLimit
nit: Similar update needed here ?

-- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 06:39:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6964/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 13 Mar 2021 05:37:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 11: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6962/ -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Sat, 13 Mar 2021 05:29:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8355/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 04:17:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17181 ) Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8356/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 13 Mar 2021 04:10:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17181 ) Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/8354/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 13 Mar 2021 04:06:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17181 ) Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py@746 PS1, Line 746: > flake8: E501 line too long (101 > 90 characters) Done -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 13 Mar 2021 04:00:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Hello Thomas Tauber-Marshall, Kurt Deschler, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17181 to look at the new patch set (#2).

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. ..

IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

This patch adds an interface to ImpalaServer to retrieve the current executor membership snapshot from impalad. This involves sending a thrift request to impalad and receiving a thrift response.

Refactored some code in exec-env into a separate function in the impala namespace which makes it easier to populate the needed information for an external frontend.

Testing:
- Ran selected tests for sanity check (no impact is expected since this is adding a new interface):
  - Frontend tests (PlannerTest, CardinalityTest)
  - Backend tests under custom_cluster/test_executor_groups.py
- Manually tested with external frontend to ensure it gets the executor membership snapshot

Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 --- M be/src/runtime/exec-env.cc M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M common/thrift/ImpalaService.thrift M tests/hs2/test_hs2.py 7 files changed, 119 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17181/2 -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120
PS4, Line 120:
> ACK.
Done

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142: queryOptions.setSpool_query_results(false);
> Yeah, that will make the code cleaner. Agree about the first point about u
Done

-- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 03:59:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17166 to look at the new patch set (#5).

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

IMPALA-10565: Adjust result spooling memory based on scratch_limit

IMPALA-9856 enables result spooling by default. Result spooling depends on the ability to spill its entire BufferedTupleStream to disk once it hits maximum memory reservation. However, if the query option scratch_limit is set lower than max_spilled_result_spooling_mem, the query might fail in the middle of execution due to insufficient scratch space.

This patch adds a planner change to consider scratch_limit and scratch_dirs query option when computing the resources used by result spooling. The algorithm is as follows:

* If scratch_dirs is empty or scratch_limit < minMemReservationBytes required to use BufferedPlanRootSink, we set spool_query_results to false and fall back to BlockingPlanRootSink.
* If scratch_limit > minMemReservationBytes but still fairly low, we lower max_result_spooling_mem (default is 100MB) and max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit.
* If scratch_limit > max_spilled_result_spooling_mem, do nothing.

Testing:
- Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit
- Verify that spool_query_results query option is disabled in TestScratchDir::test_no_dirs
- Pass exhaustive tests.

Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 --- M be/src/service/query-options-test.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test M tests/custom_cluster/test_scratch_disk.py M tests/query_test/test_scratch_limit.py 8 files changed, 143 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/5 -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17181 ) Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py File tests/hs2/test_hs2.py: http://gerrit.cloudera.org:8080/#/c/17181/1/tests/hs2/test_hs2.py@746 PS1, Line 746: e flake8: E501 line too long (101 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 13 Mar 2021 03:55:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.
Aman Sinha has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17181 )

Change subject: IMPALA-10518: Add ImpalaServer interface to retrieve executor membership. ..

IMPALA-10518: Add ImpalaServer interface to retrieve executor membership.

This patch adds an interface to ImpalaServer to retrieve the current executor membership snapshot from impalad. This involves sending a thrift request to impalad and receiving a thrift response.

Refactored some code in exec-env into a separate function in the impala namespace which makes it easier to populate the needed information for an external frontend.

Testing:
- Ran selected tests for sanity check (no impact is expected since this is adding a new interface):
  - Frontend tests (PlannerTest, CardinalityTest)
  - Backend tests under custom_cluster/test_executor_groups.py
- Manually tested with external frontend to ensure it gets the executor membership snapshot

Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 --- M be/src/runtime/exec-env.cc M be/src/scheduling/cluster-membership-mgr.cc M be/src/scheduling/cluster-membership-mgr.h M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M common/thrift/ImpalaService.thrift M tests/hs2/test_hs2.py 7 files changed, 118 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17181/1 -- To view, visit http://gerrit.cloudera.org:8080/17181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie89b71f4555c368869ee7b9d6341756c60af12b5 Gerrit-Change-Number: 17181 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8353/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 13 Mar 2021 03:34:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6963/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 13 Mar 2021 03:17:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 )

Change subject: IMPALA-9234: Support Ranger row filtering policies ..

Patch Set 7:

> Patch Set 6: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/

Updated the patch to disallow enabling row filtering while column masking is disabled.

-- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 13 Mar 2021 03:15:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Hello Fang-Yu Rao, Tim Armstrong, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/16976 to look at the new patch set (#7).

Change subject: IMPALA-9234: Support Ranger row filtering policies ..

IMPALA-9234: Support Ranger row filtering policies

Ranger row filtering policies provide customized expressions to filter out rows for specific users when reading from a table. This patch adds support for this feature. A new feature flag, enable_row_filtering, is added so that this experimental feature can be disabled. It defaults to true, so the feature is enabled by default. Enabling row filtering requires --enable_column_masking=true since it depends on the column masking implementation.

Note that row filtering policies take effect prior to any column masking policies, because column masking policies apply on result data.

Implementation:
The existing table masking view infrastructure can be extended to support row filtering. Currently when analyzing a table with column masking policies, we replace the TableRef with an InlineViewRef which contains a SelectStmt wrapping the columns with masking expressions. This patch adds the row filtering expressions to the WhereClause of the SelectStmt.

Limitations:
- Expressions using subqueries are not supported (IMPALA-10483).
- Row filtering policies on nested tables will not be applied when nested collection columns are used directly in the FROM clause. This will leak data so we forbid such kinds of queries until IMPALA-10484 is resolved.

Tests:
- Add FE test for error message when disabling row filtering.
- Add e2e test with row filtering policies.
- Add e2e test with column masking and row filtering policies both taking effect.
- Verified audits in a CDP cluster with Ranger and Solr set up.

Change-Id: I580517be241225ca15e45686381b78890178d7cc --- M be/src/common/global-flags.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/TableMask.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationContext.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java M fe/src/main/java/org/apache/impala/authorization/ranger/RangerBufferAuditHandler.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/util/AuthorizationUtil.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java M fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java M fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test A testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_and_row_filtering.test A testdata/workloads/functional-query/queries/QueryTest/ranger_row_filtering.test M tests/authorization/test_ranger.py 23 files changed, 935 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/16976/7 -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong
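
The "Implementation" paragraph of the IMPALA-9234 commit message describes replacing a table reference with an inline view whose select list carries the masking expressions and whose WHERE clause carries the row filter. A rough string-level sketch of that rewrite follows. It does not use Impala's analyzer classes (TableRef, InlineViewRef, SelectStmt); `TableMaskViewSketch`, `rewrite`, and the table, column, mask, and filter strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical string-based illustration of the table masking view idea.
final class TableMaskViewSketch {
  /**
   * Builds an inline view over 'table'. 'columnMasks' maps each column to
   * its masking expression, or to null when the column is not masked.
   * 'rowFilter' may be null or empty when no row filtering policy applies.
   */
  static String rewrite(String table, String alias,
      Map<String, String> columnMasks, String rowFilter) {
    List<String> selectItems = new ArrayList<>();
    for (Map.Entry<String, String> e : columnMasks.entrySet()) {
      String column = e.getKey();
      String mask = e.getValue();
      // Masked columns keep their original name through an alias.
      selectItems.add(mask == null ? column : mask + " AS " + column);
    }
    StringBuilder sql = new StringBuilder("(SELECT ")
        .append(String.join(", ", selectItems))
        .append(" FROM ").append(table);
    // The filter sits in the WHERE clause, so it sees unmasked base columns.
    if (rowFilter != null && !rowFilter.isEmpty()) {
      sql.append(" WHERE ").append(rowFilter);
    }
    return sql.append(") ").append(alias).toString();
  }

  public static void main(String[] args) {
    Map<String, String> masks = new LinkedHashMap<>();
    masks.put("ssn", "mask(ssn)");
    masks.put("name", null);
    System.out.println(rewrite("db.t", "t", masks, "region = 'EU'"));
  }
}
```

Because the filter is evaluated in the WHERE clause of the same SELECT, it runs against the base rows before the select list produces masked values, which matches the ordering the commit message notes.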
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142: maxMemReservationBytes = scratchLimit - maxRowBufferSize;
> If scratch_limit is unbounded, the maxMemReservationBytes calculation in li
Yeah, that will make the code cleaner. Agree about the first point about unbounded scratch_limit.

-- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 02:05:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 )

Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit ..

Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java:

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120
PS4, Line 120: scratch_limit < minMemReservationBytes
> Update this and the one below to account for the extra maxRowBufferSize
ACK.

http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142
PS4, Line 142: maxMemReservationBytes = scratchLimit - maxRowBufferSize;
> For this adjustment for maxRowBufferSize, can we not just do it up front (o
If scratch_limit is unbounded, the maxMemReservationBytes calculation in line 114 is OK. A little overspill will not fail the query. In contrast, if scratch_limit is bounded, just a little overspill will terminate the query because scratch_limit is strictly enforced.

What if I tidy up the comparison a bit so that it looks simpler? We define

long maxAllowedScratchLimit = scratchLimit - maxRowBufferSize;

Instead of comparing against scratchLimit, these checks should compare against maxAllowedScratchLimit.

-- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 01:47:52 + Gerrit-HasComments: Yes
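
The cushion proposed in the reply above, comparing against scratchLimit minus maxRowBufferSize only when scratch_limit is bounded, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual PlanRootSink.java change: the class and method names are hypothetical, and treating a negative scratch_limit as "unbounded" is an assumption of this sketch.

```java
// Hypothetical sketch of the maxAllowedScratchLimit comparison.
final class ScratchCushionSketch {
  static final long MB = 1024L * 1024L;

  /** Assumption for this sketch: a negative scratch_limit means unbounded. */
  static boolean isUnbounded(long scratchLimit) {
    return scratchLimit < 0;
  }

  /** Caps the reservation so one extra row buffer still fits in scratch. */
  static long capMaxMemReservation(long maxMemReservationBytes,
      long scratchLimit, long maxRowBufferSize) {
    // Unbounded: a little overspill is harmless, so keep the value as is.
    if (isUnbounded(scratchLimit)) return maxMemReservationBytes;
    long maxAllowedScratchLimit = scratchLimit - maxRowBufferSize;
    return Math.min(maxMemReservationBytes, maxAllowedScratchLimit);
  }

  /** Spooling is only viable if the cushioned limit covers the minimum. */
  static boolean canSpool(long minMemReservationBytes, long scratchLimit,
      long maxRowBufferSize) {
    if (isUnbounded(scratchLimit)) return true;
    long maxAllowedScratchLimit = scratchLimit - maxRowBufferSize;
    return maxAllowedScratchLimit >= minMemReservationBytes;
  }

  public static void main(String[] args) {
    System.out.println(capMaxMemReservation(100 * MB, 50 * MB, 8 * MB) / MB);
    System.out.println(canSpool(4 * MB, 10 * MB, 8 * MB));
  }
}
```

Keeping the subtraction in one named value means both the "disable spooling" check and the "lower the reservation" step use the same cushioned limit, which is the simplification the thread converges on.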
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java: http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@120 PS4, Line 120: scratch_limit < minMemReservationBytes Update this and the one below to account for the extra maxRowBufferSize http://gerrit.cloudera.org:8080/#/c/17166/4/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142 PS4, Line 142: maxMemReservationBytes = scratchLimit - maxRowBufferSize; For this adjustment for maxRowBufferSize, can we not just do it up front (on line 114) since we know that maxMemReservationBytes should always be conservative such that it leaves a cushion for maxRowBufferSize. It should simplify the logic and presumably not cause other side effects (unless I am missing something). -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 13 Mar 2021 01:14:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17144 ) Change subject: IMPALA-10551: Add result sink support for external frontends .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8352/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072 Gerrit-Change-Number: 17144 Gerrit-PatchSet: 7 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Sat, 13 Mar 2021 00:02:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends
John Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/17144 ) Change subject: IMPALA-10551: Add result sink support for external frontends .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@796 PS6, Line 796: // All instances must have reported their final statuses before finalization, which is a : // post-condition of Wait. Result sink file clean up is the responsibility of the : // external frontend > This is a copy/paste from FinalizeHdfsDml(), so the second sentence doesn't Done http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@802 PS6, Line 802: RETURN_IF_ERROR(UpdateExecState(Status::OK(), nullptr, FLAGS_hostname)); > If there is an error from execution, it would show up here and this would r I agree that retry_failed_queries might be problematic and we might want to recommend that an external frontend not enable the feature. If I am reading the coordinator.cc code correctly though, we do not retry queries with a result sink. Are there other areas I should be concerned about? I see some usage in query-driver.cc http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@807 PS6, Line 807: > Nit: This "0" is the table id. I'm guessing 0 is a special constant for the I moved it to a named constant within this method since it is the only usage of it. I can also move it to DescriptorTbl if you prefer it there. http://gerrit.cloudera.org:8080/#/c/17144/6/be/src/runtime/coordinator.cc@810 PS6, Line 810: result_sink_table_id, obj_p > I think this should be "0" (i.e. 
the special table id used in the CreateHdf I removed the table_id portion from this message. -- To view, visit http://gerrit.cloudera.org:8080/17144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072 Gerrit-Change-Number: 17144 Gerrit-PatchSet: 7 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 12 Mar 2021 23:43:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10551: Add result sink support for external frontends
Hello Aman Sinha, Thomas Tauber-Marshall, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17144 to look at the new patch set (#7). Change subject: IMPALA-10551: Add result sink support for external frontends .. IMPALA-10551: Add result sink support for external frontends - The intended purpose of these changes is to allow external frontends to receive query results via files rather than streaming the results through the thrift interface. - External frontends are expected to provide an FeFsTable implementation that describes the desired location to store results. - External frontends are responsible for managing the files after the query is completed. - Testing has been manual and through an implementation of an external frontend. Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072 Reviewed-by: Aman Sinha --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/query-exec-params.cc M be/src/runtime/query-exec-params.h M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/TableSink.java 9 files changed, 99 insertions(+), 12 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/17144/7 -- To view, visit http://gerrit.cloudera.org:8080/17144 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I024bf41d77bb81f1ab0debdbd31ec3687c83f072 Gerrit-Change-Number: 17144 Gerrit-PatchSet: 7 Gerrit-Owner: John Sherman Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 23:18:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 11: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6962/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 11 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 23:18:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 10: (1 comment) > Patch Set 10: Code-Review+2 > > (1 comment) > > Given the analysis in IMPALA-10563, it seems fine to disable those test cases > for now. > > See my note about IMPALA-10579. I think it is ok to include this partial fix, > as it seems better than what we have right now. If IMPALA-10579 was landing > very soon, I would be ok with removing this piece of the fix and relying on > IMPALA-10579. > > This change makes sense to me, and it is good to have the GCS support land. Thanks Joe's review! IMPALA-10579 (https://gerrit.cloudera.org/c/17171/) will take some time to land. So let's have the conservative fix for GCS first. http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@713 PS10, Line 713: /** :* Wrapper around FileSystem.listStatusIterator() to make sure the path exists. :* :* @throws FileNotFoundException if p does not exist :* @throws IOException if any I/O error occurredd :*/ : public static RemoteIterator listStatusIterator(FileSystem fs, Path p) : throws IOException { : RemoteIterator iterator = fs.listStatusIterator(p); : // Some FileSystem implementations like GoogleHadoopFileSystem doesn't check : // existence of the start path when creating the RemoteIterator. Instead, their : // iterators throw the FileNotFoundException in the first call of hasNext() when : // the start path doesn't exist. Here we call hasNext() to ensure start path exists. : iterator.hasNext(); : return iterator; > This code will be replaced by IMPALA-10579. Yeah, exactly! 
For IMPALA-10579 (https://gerrit.cloudera.org/c/17171/), I plan to test the patch on Ozone, S3 and ABFS, so it will take some time. The changes in this patch are conservative, so we can be confident they won't impact other filesystems. (I have verified this on HDFS and GCS.) -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 23:17:16 + Gerrit-HasComments: Yes
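The behavior that the quoted listStatusIterator() wrapper works around can be reproduced without Hadoop on the classpath. The sketch below is an illustration only: RemoteIter is a stand-in for Hadoop's RemoteIterator, and lazyListing() is a hypothetical listing that, like the GoogleHadoopFileSystem behavior described in the code comment, reports a missing start path only on the first hasNext() call.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class EagerListingSketch {
  /** Stand-in for Hadoop's RemoteIterator: both methods may throw IOException. */
  interface RemoteIter<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Hypothetical lazy listing: creating it never fails, even for a missing path. */
  static RemoteIter<String> lazyListing(final boolean pathExists, List<String> entries) {
    final Iterator<String> it = entries.iterator();
    return new RemoteIter<String>() {
      @Override public boolean hasNext() throws IOException {
        if (!pathExists) throw new FileNotFoundException("start path does not exist");
        return it.hasNext();
      }
      @Override public String next() throws IOException {
        if (!pathExists) throw new FileNotFoundException("start path does not exist");
        return it.next();
      }
    };
  }

  /** Mirrors the wrapper's idea: call hasNext() once so a missing path fails here. */
  static RemoteIter<String> listStatusIterator(boolean pathExists, List<String> entries)
      throws IOException {
    RemoteIter<String> iterator = lazyListing(pathExists, entries);
    iterator.hasNext(); // forces the existence check; does not consume an entry
    return iterator;
  }

  public static void main(String[] args) throws IOException {
    RemoteIter<String> ok = listStatusIterator(true, Arrays.asList("a", "b"));
    while (ok.hasNext()) System.out.println(ok.next());
    try {
      listStatusIterator(false, Arrays.asList("a"));
    } catch (FileNotFoundException e) {
      System.out.println("failed at creation time: " + e.getMessage());
    }
  }
}
```

Because hasNext() does not advance the iterator, the extra call changes when the exception surfaces but not which entries the caller sees, which is consistent with the patch being described as conservative.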
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8351/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 12 Mar 2021 20:12:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 4: (8 comments) My exhaustive test run last night revealed that scratch_limit might still get violated if maxMemReservationBytes is equal to scratch_limit. This is because the contents of SpillableRowBatchQueue can be slightly larger than maxMemReservationBytes when it decides to spill. To account for that, I lowered the spooling mem config a little further here in Patch Set 4. http://gerrit.cloudera.org:8080/#/c/17166/3/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/17166/3/be/src/service/query-options.cc@1104 PS3, Line 1104: // max_spilled_result_spooling_mem (a value of 0 means memory is unbounded). > I just figured out in ParseUtil::ParseMemSpec() that -1 for memory query op Done http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java: http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@77 PS3, Line 77:* If SPOOL_QUERY_RESULTS is true, then the ResourceProfile sets a min/max resevation, > Some of the method level comment should be updated to reflect the behavior Done http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@92 PS3, Line 92: > nit: typo ? Done http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@110 PS3, Line 110: long bufferSize = queryOptions.getDefault_spillable_buffer_size(); : long maxRowBufferSize = PlanNode.computeMaxSpillableBufferSize( > It sounds like an existing bug. If you can create a test case for it can y I filed IMPALA-10583. Will work on that next. 
http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@126 PS3, Line 126: > Suggest rewording: 'to >=' minMemReservationBytes Done http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@126 PS3, Line 126: > nit: 'increasing' Done http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@142 PS3, Line 142: maxMemReservationBytes = scratchLimit - maxRowBufferSize; > Would be useful to add a trace level log message here as well. Done http://gerrit.cloudera.org:8080/#/c/17166/3/testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test File testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test: http://gerrit.cloudera.org:8080/#/c/17166/3/testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test@2 PS3, Line 2: QUERY > Could you add 1 tests with empty scratch dirs ? Since scratch_dirs is a backend flag, I piggybacked the test on TestScratchDir::test_no_dirs. -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 12 Mar 2021 20:00:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Hello Aman Sinha, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17166 to look at the new patch set (#4). Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. IMPALA-10565: Adjust result spooling memory based on scratch_limit IMPALA-9856 enables result spooling by default. Result spooling depends on the ability to spill its entire BufferedTupleStream to disk once it hits its maximum memory reservation. However, if the query option scratch_limit is set lower than max_spilled_result_spooling_mem, the query might fail in the middle of execution due to insufficient scratch space. This patch adds a planner change to consider the scratch_limit query option and the scratch_dirs backend flag when computing the resources used by result spooling. The algorithm is as follows: * If scratch_dirs is empty or scratch_limit < minMemReservationBytes required to use BufferedPlanRootSink, we set spool_query_results to false and fall back to BlockingPlanRootSink. * If scratch_limit > minMemReservationBytes but still fairly low, we lower max_result_spooling_mem (default is 100MB) and max_spilled_result_spooling_mem (default is 1GB) to fit scratch_limit. * If scratch_limit > max_spilled_result_spooling_mem, do nothing. Testing: - Add TestScratchLimit::test_result_spooling_and_varying_scratch_limit - Verify that spool_query_results query option is disabled in TestScratchDir::test_no_dirs - Pass exhaustive tests. 
Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 --- M be/src/service/query-options-test.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java A testdata/workloads/functional-query/queries/QueryTest/scratch-limit.test M tests/custom_cluster/test_scratch_disk.py M tests/query_test/test_scratch_limit.py 8 files changed, 140 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/17166/4 -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto
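The three cases in the commit message above can be sketched in Java as follows. This is a reading of the description, not the patch itself: the class, the Result holder, and the parameter names are hypothetical, a negative scratch_limit is assumed to mean "unbounded", both spooling limits are assumed to be positive, and "fit scratch_limit" is interpreted here as clamping both limits to scratch_limit (the real change also subtracts a cushion, as discussed in the review comments).

```java
final class SpoolingPlanSketch {
  /** Hypothetical result holder: whether spooling stays on, plus the two caps. */
  static final class Result {
    final boolean spoolQueryResults;
    final long maxResultSpoolingMem;
    final long maxSpilledResultSpoolingMem;

    Result(boolean spool, long maxMem, long maxSpilled) {
      this.spoolQueryResults = spool;
      this.maxResultSpoolingMem = maxMem;
      this.maxSpilledResultSpoolingMem = maxSpilled;
    }
  }

  static Result adjust(boolean hasScratchDirs, long scratchLimit,
      long minMemReservationBytes, long maxResultSpoolingMem,
      long maxSpilledResultSpoolingMem) {
    // Case 1a: nowhere to spill, so fall back to the blocking sink.
    if (!hasScratchDirs) {
      return new Result(false, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Assumption: negative means unbounded, so there is nothing to adjust.
    if (scratchLimit < 0) {
      return new Result(true, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 1b: the limit cannot even cover the minimum reservation.
    if (scratchLimit < minMemReservationBytes) {
      return new Result(false, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 3: the limit is above the spill cap, so do nothing.
    if (scratchLimit > maxSpilledResultSpoolingMem) {
      return new Result(true, maxResultSpoolingMem, maxSpilledResultSpoolingMem);
    }
    // Case 2: the limit is usable but low, so lower both caps to fit it.
    return new Result(true,
        Math.min(maxResultSpoolingMem, scratchLimit),
        Math.min(maxSpilledResultSpoolingMem, scratchLimit));
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    Result r = adjust(true, 50 * mb, 4 * mb, 100 * mb, 1024 * mb);
    System.out.println(r.spoolQueryResults + " " + r.maxResultSpoolingMem / mb
        + " " + r.maxSpilledResultSpoolingMem / mb);
  }
}
```

The 100MB and 1GB values in main are the defaults quoted in the commit message; the 4MB minimum reservation is a made-up figure for the example.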
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/ -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 12 Mar 2021 19:33:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 19:28:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables This patch adds support for CREATE TABLE AS SELECT statements for Iceberg tables. CTAS statements work like the following in Impala: 1. Analysis of the whole CTAS statement 2. Divide CTAS to CREATE stmt and INSERT stmt 3. Create temporary in-memory target table from the CREATE stmt 4. Analyse the INSERT statement by using the temporary target table 5. If everything is OK so far, create the target table 6. Execute the INSERT query For Iceberg tables the non-trivial thing was to create the temporary target table without actually creating it via Iceberg API. I've created a new class 'IcebergCtasTarget' that mimics an FeIceberg table. It can be used with catalog V1 and V2 as well. Testing * e2e CTAS tests in iceberg-ctas.test * SHOW CREATE TABLE stmts in show-create-table.test Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Reviewed-on: http://gerrit.cloudera.org:8080/17130 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java A fe/src/main/java/org/apache/impala/catalog/CtasTargetTable.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalDb.java M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M 
fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java A testdata/workloads/functional-query/queries/QueryTest/iceberg-ctas.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test M tests/metadata/test_show_create_table.py M tests/query_test/test_iceberg.py 18 files changed, 686 insertions(+), 46 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
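The six CTAS steps listed in the commit message can be read as an ordering constraint: the real target table is created only after both analysis passes succeed. The Java sketch below illustrates just that ordering. It is not Impala code; the class name, the step labels, and the boolean inputs standing in for analysis outcomes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CtasFlowSketch {
  /** Runs the steps in order and returns the labels of the steps that ran. */
  static List<String> run(boolean ctasAnalysisOk, boolean insertAnalysisOk) {
    List<String> steps = new ArrayList<>();
    steps.add("1. analyze whole CTAS statement");
    if (!ctasAnalysisOk) return steps;
    steps.add("2. split into CREATE and INSERT statements");
    // For Iceberg the message says this stand-in must not touch the Iceberg API.
    steps.add("3. build temporary in-memory target table");
    steps.add("4. analyze INSERT against the temporary target");
    if (!insertAnalysisOk) return steps;
    steps.add("5. create the real target table");
    steps.add("6. execute the INSERT query");
    return steps;
  }

  public static void main(String[] args) {
    for (String s : run(true, true)) System.out.println(s);
    // A failed INSERT analysis stops before any real table is created.
    System.out.println(run(true, false).size());
  }
}
```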
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 10: Code-Review+2 (1 comment) Given the analysis in IMPALA-10563, it seems fine to disable those test cases for now. See my note about IMPALA-10579. I think it is ok to include this partial fix, as it seems better than what we have right now. If IMPALA-10579 was landing very soon, I would be ok with removing this piece of the fix and relying on IMPALA-10579. This change makes sense to me, and it is good to have the GCS support land. http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java: http://gerrit.cloudera.org:8080/#/c/17121/10/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@713 PS10, Line 713: /** :* Wrapper around FileSystem.listStatusIterator() to make sure the path exists. :* :* @throws FileNotFoundException if p does not exist :* @throws IOException if any I/O error occurredd :*/ : public static RemoteIterator listStatusIterator(FileSystem fs, Path p) : throws IOException { : RemoteIterator iterator = fs.listStatusIterator(p); : // Some FileSystem implementations like GoogleHadoopFileSystem doesn't check : // existence of the start path when creating the RemoteIterator. Instead, their : // iterators throw the FileNotFoundException in the first call of hasNext() when : // the start path doesn't exist. Here we call hasNext() to ensure start path exists. : iterator.hasNext(); : return iterator; This code will be replaced by IMPALA-10579. I'm guessing that the thought here is that this is better than what we have, and the fuller fix will come from IMPALA-10579. 
-- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 18:42:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters In Python2, print() converts all non-keyword arguments to strings like str() does and writes them to the stream. str() on QueryStateException returns its value(i.e. error message) which could be in unicode type. Python2 will implicitly encode it to str type using the default encoding, 'ascii'. This could result in UnicodeEncodeError when there are non-ascii characters in the error message. This patch explicitly encodes the error message using 'utf-8' encoding if it's in unicode type and the shell is run in Python2. Tests: - Add test in test_shell_interactive.py Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Reviewed-on: http://gerrit.cloudera.org:8080/17099 Reviewed-by: Tamas Mate Reviewed-by: Laszlo Gaal Tested-by: Impala Public Jenkins --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 16 insertions(+), 1 deletion(-) Approvals: Tamas Mate: Looks good to me, but someone else must approve Laszlo Gaal: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 4 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 18:19:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10367: Impala-shell internal error - UnboundLocalError, local variable 'retry msg' referenced before assign
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/17172 ) Change subject: IMPALA-10367: Impala-shell internal error - UnboundLocalError, local variable 'retry_msg' referenced before assign .. Patch Set 1: Code-Review+2 LGTM -- To view, visit http://gerrit.cloudera.org:8080/17172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50a08a62a332de759022d0a4862e74f5a81945d9 Gerrit-Change-Number: 17172 Gerrit-PatchSet: 1 Gerrit-Owner: Abhishek Rawat Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 17:50:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10546: Add ImpalaServer interface to retrieve BackendConfig from impalad
Thomas Tauber-Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17116 ) Change subject: IMPALA-10546: Add ImpalaServer interface to retrieve BackendConfig from impalad .. IMPALA-10546: Add ImpalaServer interface to retrieve BackendConfig from impalad This patch adds a new interface ImpalaServer::GetBackendConfig() that returns the current TBackendGflags from impalad. Testing: Called new interface from external frontend. Verified that TBackendGflags were populated correctly. Reviewed-by: John Sherman Change-Id: I14a3cee29f1fc91f4431b7ea89053bb3fbfa5e69 Reviewed-on: http://gerrit.cloudera.org:8080/17116 Reviewed-by: Thomas Tauber-Marshall Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog.cc M be/src/rpc/hs2-http-test.cc M be/src/service/frontend.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.h M be/src/util/backend-gflag-util.cc M be/src/util/backend-gflag-util.h M be/src/util/logging-support.cc M common/thrift/ImpalaService.thrift M tests/hs2/test_hs2.py 10 files changed, 64 insertions(+), 6 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/17116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I14a3cee29f1fc91f4431b7ea89053bb3fbfa5e69 Gerrit-Change-Number: 17116 Gerrit-PatchSet: 17 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements
Thomas Tauber-Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17104 ) Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements .. IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements The ExecutePlannedStatement interface allows an externally supplied TExecRequest to be executed by impalad. The TExecRequest must be fully populated and will be sent directly to the backend for execution. The following fields in the TExecRequest are updated by the coordinator: - Hostname - KRPC address - Local Timezone In order to add the interface to ImpalaInternalService.thrift, several of the thrift classes were moved to Query.thrift to avoid a circular dependency with Frontend.thrift. Added functionality to format and dump TExecRequest structures to the path specified in the debug flag dump_exec_request_path. A start timestamp field has been added to TExecRequest to represent the interval in the query profile between when the request was sent by the external frontend and handled by the backend. A local timestamp field has been added to the Ping result struct to return the current backend timestamp. This is used by the external frontend to populate the start timestamp. Also included is a change to avoid generating silent AnalysisExceptions during table resolution. Tested with TExecRequest structures populated by external frontend. 
Local timezone change tested with the INT64 TIMESTAMP datatype. Reviewed-by: John Sherman Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa Reviewed-on: http://gerrit.cloudera.org:8080/17104 Reviewed-by: Thomas Tauber-Marshall Tested-by: Thomas Tauber-Marshall --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/rpc/hs2-http-test.cc M be/src/runtime/debug-options.h M be/src/runtime/query-driver.cc M be/src/runtime/query-driver.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M common/thrift/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A common/thrift/Query.thrift M fe/pom.xml M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 21 files changed, 959 insertions(+), 760 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa Gerrit-Change-Number: 17104 Gerrit-PatchSet: 14 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10522: Support external use of frontend libraries
Thomas Tauber-Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17115 ) Change subject: IMPALA-10522: Support external use of frontend libraries .. IMPALA-10522: Support external use of frontend libraries This patch enables the Impala frontend jar and dependent library libfesupport.so to be used by an external Java frontend. Calling FeSupport.setExternalFE() will cause external frontend initialization mode to be used during FeSupport.loadLibrary(). This mode builds upon logic that is used to initialize the frontend jar for unit tests. Initialization in external frontend mode differs as follows: - Skip instantiating Frontend object and its dependents - Skip loading libhdfs - Skip starting JVM Pause monitor - Disable Minidumper - Initialize TimezoneDatabase for external frontends - Disable redirect of stderr/stdout to libfesupport.so glog - Log messages from libfesupport.so to stderr - Use libfesupport.so for JNI symbol lookup Null checks were added in places where objects were assumed to be instantiated but are now skipped during initialization. 
Additional change: 1) Add libfesupport.lib path to JAVA_LIBRARY_PATH in test driver Testing: - Initialized frontend jar from external frontend - Verified that frontend Java objects can be used externally without issues - Verified that exceptions thrown from Impala Java or libfesupport can be caught or propagated correctly by the external frontend - Manual verification of minicluster logs - Ran queries with external frontend Co-authored-by: John Sherman Co-authored-by: Aman Sinha Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9 Reviewed-on: http://gerrit.cloudera.org:8080/17115 Reviewed-by: Thomas Tauber-Marshall Tested-by: Thomas Tauber-Marshall --- M be/src/benchmarks/expr-benchmark.cc M be/src/common/init.cc M be/src/common/init.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/lib-cache.cc M be/src/runtime/lib-cache.h M be/src/service/fe-support.cc M be/src/util/jni-util.cc M fe/src/main/java/org/apache/impala/service/FeSupport.java M testdata/bin/run-hive-server.sh 12 files changed, 99 insertions(+), 46 deletions(-) Approvals: Thomas Tauber-Marshall: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17115 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9 Gerrit-Change-Number: 17115 Gerrit-PatchSet: 10 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/17104 ) Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements .. Patch Set 13: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa Gerrit-Change-Number: 17104 Gerrit-PatchSet: 13 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 12 Mar 2021 17:49:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10522: Support external use of frontend libraries
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/17115 ) Change subject: IMPALA-10522: Support external use of frontend libraries .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17115 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4e3a84721ba196ec00773ce2923b19610b90edd9 Gerrit-Change-Number: 17115 Gerrit-PatchSet: 9 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 12 Mar 2021 17:49:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements
Thomas Tauber-Marshall has removed a vote on this change. Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/17104 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa Gerrit-Change-Number: 17104 Gerrit-PatchSet: 13 Gerrit-Owner: Kurt Deschler Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10494: Making use of the min/max column stats to improve min/max filters
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17075 ) Change subject: IMPALA-10494: Making use of the min/max column stats to improve min/max filters .. Patch Set 19: (8 comments) Did a quick walkthrough, will look into it in detail next week. http://gerrit.cloudera.org:8080/#/c/17075/19//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17075/19//COMMIT_MSG@19 PS19, Line 19: show_column_minmax_stats Do we need this query option? I mean if we have min/max stats then we'd probably want to show them. http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc@447 PS19, Line 447: ( nit: parentheses are not needed as the '.' member access takes precedence over the '&' address-of operator. http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/filter-context.cc@492 PS19, Line 492: ( nit: parentheses not needed http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/hdfs-scanner.h@348 PS19, Line 348: uint8_t enabled_for_rowgroup; Why do we need this flag? If enabled_for_rowgroup is false, then the min/max filter is completely turned off, right? In that case we shouldn't even send them to the scanner, or am I missing something? http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/parquet/hdfs-parquet-scanner.cc@662 PS19, Line 662: } nit: EvaluateOverlapForRowGroup() is already quite long, maybe this code could go into a separate function. 
http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/partitioned-hash-join-builder.cc File be/src/exec/partitioned-hash-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/17075/19/be/src/exec/partitioned-hash-join-builder.cc@950 PS19, Line 950: nit: indentation http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java: http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@50 PS19, Line 50: LOG seems unused http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java: http://gerrit.cloudera.org:8080/#/c/17075/19/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@278 PS19, Line 278:*/ It would be good to handle Iceberg tables that use Parquet data files. -- To view, visit http://gerrit.cloudera.org:8080/17075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I08581b44419bb8da5940cbf98502132acd1c86df Gerrit-Change-Number: 17075 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 12 Mar 2021 17:02:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17088 ) Change subject: IMPALA-10520: Implement ds_theta_intersect() function .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Gerrit-Change-Number: 17088 Gerrit-PatchSet: 5 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 16:13:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17088 ) Change subject: IMPALA-10520: Implement ds_theta_intersect() function .. IMPALA-10520: Implement ds_theta_intersect() function This function receives a set of serialized Apache DataSketches Theta sketches produced by ds_theta_sketch() and intersects them into a single sketch. An example usage is to create a sketch for each partition of a table, write these sketches to a separate table, and intersect the sketches of the partitions the user is interested in to get an estimate. E.g.: SELECT ds_theta_estimate(ds_theta_intersect(sketch_col)) FROM sketch_tbl WHERE partition_col=1 OR partition_col=5; Testing: - Apart from the automated tests I added to this patch, I also tested ds_theta_intersect() on a bigger dataset to check that the serialization, deserialization and merging steps work well. I took TPCH25.lineitem, created a number of sketches with grouping by l_shipdate and called ds_theta_intersect() on those sketches. Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Reviewed-on: http://gerrit.cloudera.org:8080/17088 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/aggregate-functions.h M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test 4 files changed, 182 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/17088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Gerrit-Change-Number: 17088 Gerrit-PatchSet: 6 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-10565: Adjust result spooling memory based on scratch limit
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17166 ) Change subject: IMPALA-10565: Adjust result spooling memory based on scratch_limit .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java File fe/src/main/java/org/apache/impala/planner/PlanRootSink.java: http://gerrit.cloudera.org:8080/#/c/17166/3/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java@122 PS3, Line 122: if (scratchLimit > -1) { > Should this check be scratchLimit > 0 since -1 or 0 mean unbounded right ? Ignore this comment. I was probably thinking about the memory setting (side effects of late night review). -- To view, visit http://gerrit.cloudera.org:8080/17166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I541f46e6911694e14c0fc25be1a6982fd929d3a9 Gerrit-Change-Number: 17166 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 12 Mar 2021 15:37:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation
Zoltan Borok-Nagy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16842 ) Change subject: IMPALA-10377: Improve the accuracy of resource estimation .. IMPALA-10377: Improve the accuracy of resource estimation PlanNode does not consider some factors when estimating memory, which causes a large error rate. AggregationNode: 1. MemoryEstimate = Ndv * (AvgRowSize + SizeOfBucket). 2. When estimating the Ndv of a merge aggregation, Ndv should be divided only once. 3. If there are no grouping exprs, MemoryEstimate = MIN_PLAIN_AGG_MEM. SortNode: 1. MemoryEstimate = Cardinality * AvgRowSize. This is the memory used when there is enough memory. HashJoinNode: 1. MemoryEstimate = DataRows + Buckets + DuplicateNodes, where DataRows = RightTableCardinality * AvgRowSize, Buckets = roundUpToPowerOf2(RightTableCardinality) * SizeOfBucket, and DuplicateNodes = (RightTableCardinality - RightNdv) * SizeOfDuplicateNode. KuduScanNode: 1. MemoryEstimate = Columns * BytesPerColumn * MaxScannerThreads, where Columns counts the columns scanned in the query, not all the columns of the table. UnitTest: 1. CardinalityTest adds test cases to test memory estimation. 
Modify existing test cases related to memory estimation Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Reviewed-on: http://gerrit.cloudera.org:8080/16842 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenkins --- M be/src/exec/hash-table.h M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test M 
testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/result-spooling.test M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test M
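The estimation formulas in the IMPALA-10377 commit message can be sketched as plain functions. This is only an illustration of the arithmetic as stated there, not the planner's Java code: the function and parameter names are made up for this example, and the byte sizes passed in the usage line are hypothetical values, not Impala's real constants.

```python
def round_up_to_power_of_2(n: int) -> int:
    """Smallest power of two that is >= n (1 for n <= 1)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def aggregation_mem_estimate(ndv, avg_row_size, size_of_bucket,
                             has_grouping_exprs, min_plain_agg_mem):
    # Without grouping exprs the estimate is the fixed minimum.
    if not has_grouping_exprs:
        return min_plain_agg_mem
    return ndv * (avg_row_size + size_of_bucket)


def sort_mem_estimate(cardinality, avg_row_size):
    # Memory used when the whole input fits in memory.
    return cardinality * avg_row_size


def hash_join_mem_estimate(right_cardinality, right_ndv, avg_row_size,
                           size_of_bucket, size_of_duplicate_node):
    data_rows = right_cardinality * avg_row_size
    buckets = round_up_to_power_of_2(right_cardinality) * size_of_bucket
    duplicate_nodes = (right_cardinality - right_ndv) * size_of_duplicate_node
    return data_rows + buckets + duplicate_nodes


def kudu_scan_mem_estimate(num_scanned_columns, bytes_per_column,
                           max_scanner_threads):
    # Only the columns referenced by the query count, not the full schema.
    return num_scanned_columns * bytes_per_column * max_scanner_threads


# Hypothetical sizes: 16-byte rows, 12-byte buckets, 16-byte duplicate nodes.
print(hash_join_mem_estimate(1000, 900, 16, 12, 16))  # prints 29888
```

In the usage line, 1000 build rows round up to 1024 buckets, so the estimate is 16000 + 12288 + 1600 bytes.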
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 14:16:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9234: Support Ranger row filtering policies
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16976 ) Change subject: IMPALA-9234: Support Ranger row filtering policies .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6961/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16976 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I580517be241225ca15e45686381b78890178d7cc Gerrit-Change-Number: 16976 Gerrit-PatchSet: 6 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 12 Mar 2021 13:57:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 ) Change subject: IMPALA-10377: Improve the accuracy of resource estimation .. Patch Set 22: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Gerrit-Change-Number: 16842 Gerrit-PatchSet: 22 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: liuyao Gerrit-Comment-Date: Fri, 12 Mar 2021 13:57:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8350/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 12 Mar 2021 13:54:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6960/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 13:45:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17081 ) Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17081/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java File fe/src/main/java/org/apache/impala/catalog/Catalog.java: http://gerrit.cloudera.org:8080/#/c/17081/3/fe/src/main/java/org/apache/impala/catalog/Catalog.java@78 PS3, Line 78: > Perhaps a better name could include the serviceID of the catalog instance s I've added an abstract method getAcidUserId() to this class. It actually revealed that we never actually logged the user by the Transaction object. -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 12 Mar 2021 13:37:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables
Hello Vihang Karajgaonkar, Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/17081 to look at the new patch set (#4). Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables .. IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables ALTER TABLE ADD PARTITION should bump the write id for ACID tables, both for INSERT-only and full ACID tables. For transactional tables we are adding partitions in an ACID transaction in the following sequence: 1. open transaction 2. allocate write id for table 3. add partitions to HMS table 4. commit transaction However, please note that table metadata modifications are independent of ACID transactions. I.e. if adding partitions succeeds, but we cannot commit the transaction, then the newly added partitions won't get removed. So why are we opening a txn then? We are doing it in order to bump the write id in a best-effort way. This aids table metadata caching, so by looking at the table write id we can determine if the cached table metadata is up-to-date. 
Testing: * added e2e test Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd --- M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java M fe/src/main/java/org/apache/impala/catalog/Transaction.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M tests/query_test/test_acid.py 6 files changed, 117 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17081/4 -- To view, visit http://gerrit.cloudera.org:8080/17081 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd Gerrit-Change-Number: 17081 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Zoltan Borok-Nagy
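The four-step sequence in the IMPALA-10512 commit message can be sketched against a toy in-memory metastore. Everything here is hypothetical: FakeMetastore and its methods are stand-ins invented for this example and do not mirror the real HMS client API or Impala's CatalogOpExecutor. The sketch only shows that partitions added in step 3 remain even if the commit in step 4 fails, and how a cached write id can be compared with the current one.

```python
class FakeMetastore:
    """Hypothetical in-memory stand-in for HMS; not a real API."""

    def __init__(self):
        self.next_txn_id = 1
        self.open_txns = set()
        self.write_ids = {}    # table name -> latest allocated write id
        self.partitions = {}   # table name -> list of partition names
        self.fail_commit = False  # test hook to simulate a failed commit

    def open_txn(self):
        txn_id = self.next_txn_id
        self.next_txn_id += 1
        self.open_txns.add(txn_id)
        return txn_id

    def allocate_write_id(self, txn_id, table):
        if txn_id not in self.open_txns:
            raise ValueError("transaction is not open")
        self.write_ids[table] = self.write_ids.get(table, 0) + 1
        return self.write_ids[table]

    def add_partitions(self, table, parts):
        # Metadata change; independent of the transaction in this model.
        self.partitions.setdefault(table, []).extend(parts)

    def commit_txn(self, txn_id):
        if self.fail_commit:
            raise RuntimeError("commit failed")
        self.open_txns.discard(txn_id)


def add_partitions_acid(hms, table, parts):
    """Returns True if the transaction committed, False otherwise."""
    txn_id = hms.open_txn()                # 1. open transaction
    hms.allocate_write_id(txn_id, table)   # 2. allocate write id for table
    hms.add_partitions(table, parts)       # 3. add partitions to the table
    try:
        hms.commit_txn(txn_id)             # 4. commit transaction
        return True
    except RuntimeError:
        # Best effort: the partitions added in step 3 are not rolled back.
        return False


def cache_is_stale(cached_write_id, current_write_id):
    """Cached table metadata is stale if the table's write id has moved on."""
    return cached_write_id != current_write_id
```

How a real metastore treats a write id allocated in a transaction that never commits is not modelled here; the fake simply bumps the counter at allocation time.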
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6959/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 12:31:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10558: Implement ds_theta_exclude() function
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17153 ) Change subject: IMPALA-10558: Implement ds_theta_exclude() function .. Patch Set 2: (8 comments) Thanks for these changes! I had some comments, mostly nits and test coverage. http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc File be/src/exprs/datasketches-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@128 PS2, Line 128: // a_not_b nit: this comment is not needed as it doesn't give extra info http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@131 PS2, Line 131: datasketches::theta_sketch::unique_ptr first_sketch_ptr; : if (!first_serialized_sketch.is_null && first_serialized_sketch.len > 0) { : try { : first_sketch_ptr = datasketches::theta_sketch::deserialize( : (void*)first_serialized_sketch.ptr, first_serialized_sketch.len); : } catch (const std::exception&) { : LogSketchDeserializationError(ctx); : return StringVal::null(); : } : } This part seems pretty identical to the section L141-150. Can you move it to a function to avoid code repetition? http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@155 PS2, Line 155: first_sketch_ptr.operator bool() I'm not sure I understand the condition in this format :) Could you please explain what goes on here? http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h@73 PS2, Line 73: 'serialized_sketch' Could you mention both sketch params? 
http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test: http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@330 PS2, Line 330: for A is an empty sketch. When A is empty and B is null. http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@331 PS2, Line 331: select ds_theta_estimate(ds_theta_exclude(ds_theta_sketch(f2), null)) Could you please add another test where A is null and B is empty? (the opposite of this one) http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@332 PS2, Line 332: from functional_parquet.emptytable; Another test would be where A and B are both empty. http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@379 PS2, Line 379: I miss a test where the result of an a-not-b is a non-empty sketch (where the estimate is greater than zero). -- To view, visit http://gerrit.cloudera.org:8080/17153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05119fd8c652c07ff248a99e44b0da3541e46ca3 Gerrit-Change-Number: 17153 Gerrit-PatchSet: 2 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 12:12:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 ) Change subject: IMPALA-10377: Improve the accuracy of resource estimation .. Patch Set 22: Code-Review+2 Yeah, it probably was an intermittent infrastructure issue. -- To view, visit http://gerrit.cloudera.org:8080/16842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Gerrit-Change-Number: 16842 Gerrit-PatchSet: 22 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: liuyao Gerrit-Comment-Date: Fri, 12 Mar 2021 12:03:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation
liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 ) Change subject: IMPALA-10377: Improve the accuracy of resource estimation .. Patch Set 22: The automated test run failed, and it doesn't look like my code caused the failure. I rebased my change in Gerrit and reran the tests. -- To view, visit http://gerrit.cloudera.org:8080/16842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Gerrit-Change-Number: 16842 Gerrit-PatchSet: 22 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: liuyao Gerrit-Comment-Date: Fri, 12 Mar 2021 11:32:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 ) Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1 .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8349/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287 Gerrit-Change-Number: 17026 Gerrit-PatchSet: 17 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 11:21:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/17177 ) Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations .. Patch Set 1: Code-Review+1 Hi Quanlong, nice catch. LGTM. -- To view, visit http://gerrit.cloudera.org:8080/17177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e Gerrit-Change-Number: 17177 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 11:04:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..

Patch Set 17:

(182 comments)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h
File be/src/thirdparty/xxhash/xxhash.h:

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@70
PS17, Line 70: https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html?showComment=1552696407071#c3490092340461170735
line too long (112 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@92
PS17, Line 92: * https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
line too long (96 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@113
PS17, Line 113: # elif defined (__cplusplus) || (defined (__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) /* C99 */)
line too long (104 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@243
PS17, Line 243: # define XXH3_64bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, XXH3_64bits_reset_withSecret)
line too long (93 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@253
PS17, Line 253: # define XXH3_128bits_reset_withSeed XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSeed)
line too long (91 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@254
PS17, Line 254: # define XXH3_128bits_reset_withSecret XXH_NAME2(XXH_NAMESPACE, XXH3_128bits_reset_withSecret)
line too long (95 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@270
PS17, Line 270: #define XXH_VERSION_NUMBER (XXH_VERSION_MAJOR *100*100 + XXH_VERSION_MINOR *100 + XXH_VERSION_RELEASE)
line too long (103 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@429
PS17, Line 429: * @param statePtr A pointer to an @ref XXH32_state_t allocated with @ref XXH32_createState().
line too long (94 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@441
PS17, Line 441: XXH_PUBLIC_API void XXH32_copyState(XXH32_state_t* dst_state, const XXH32_state_t* src_state);
line too long (94 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@476
PS17, Line 476: XXH_PUBLIC_API XXH_errorcode XXH32_update (XXH32_state_t* statePtr, const void* input, size_t length);
line too long (102 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@628
PS17, Line 628: XXH_PUBLIC_API void XXH64_copyState(XXH64_state_t* dst_state, const XXH64_state_t* src_state);
line too long (94 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@631
PS17, Line 631: XXH_PUBLIC_API XXH_errorcode XXH64_update (XXH64_state_t* statePtr, const void* input, size_t length);
line too long (102 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@700
PS17, Line 700: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSeed(const void* data, size_t len, XXH64_hash_t seed);
line too long (98 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@724
PS17, Line 724: XXH_PUBLIC_API XXH64_hash_t XXH3_64bits_withSecret(const void* data, size_t len, const void* secret, size_t secretSize);
line too long (120 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@743
PS17, Line 743: XXH_PUBLIC_API void XXH3_copyState(XXH3_state_t* dst_state, const XXH3_state_t* src_state);
line too long (91 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@756
PS17, Line 756: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_reset_withSeed(XXH3_state_t* statePtr, XXH64_hash_t seed);
line too long (99 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@766
PS17, Line 766: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_reset_withSecret(XXH3_state_t* statePtr, const void* secret, size_t secretSize);
line too long (121 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@768
PS17, Line 768: XXH_PUBLIC_API XXH_errorcode XXH3_64bits_update (XXH3_state_t* statePtr, const void* input, size_t length);
line too long (107 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@791
PS17, Line 791: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSeed(const void* data, size_t len, XXH64_hash_t seed);
line too long (100 > 90)

http://gerrit.cloudera.org:8080/#/c/17026/17/be/src/thirdparty/xxhash/xxhash.h@792
PS17, Line 792: XXH_PUBLIC_API XXH128_hash_t XXH3_128bits_withSecret(const void* data, size_t len, const void* secret, size_t secretSize);
line too long (122 > 90)
[Impala-ASF-CR] IMPALA-9470: Use Parquet Bloom filters - Part 1
Daniel Becker has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/17026 )

Change subject: IMPALA-9470: Use Parquet Bloom filters - Part 1
..

IMPALA-9470: Use Parquet Bloom filters - Part 1

This change adds read support for Parquet Bloom filters for some types. The supported Parquet type - Impala type pairs are the following:

-----------------------------------------
| Parquet type | Impala type            |
|--------------|------------------------|
| INT32        | TINYINT, SMALLINT, INT |
| INT64        | BIGINT                 |
| FLOAT        | FLOAT                  |
| DOUBLE       | DOUBLE                 |
| BYTE_ARRAY   | STRING                 |
-----------------------------------------

If a Bloom filter is available for a column that is fully dictionary encoded, the Bloom filter is not used as the dictionary can give exact results in filtering.

Testing:
- Added tests/query_test/test_parquet_bloom_filter.py that tests that Parquet Bloom filtering works for the supported types and that we do not incorrectly discard row groups for the unsupported type VARCHAR.

Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
---
M LICENSE.txt
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exprs/expr-value.h
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/kudu/util/block_bloom_filter.cc
M be/src/kudu/util/block_bloom_filter.h
M be/src/runtime/bufferpool/buffer-pool-internal.h
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
A be/src/thirdparty/xxhash/README.md
A be/src/thirdparty/xxhash/xxhash.h
M be/src/util/CMakeLists.txt
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
A be/src/util/impala-bloom-filter-buffer-allocator.cc
A be/src/util/impala-bloom-filter-buffer-allocator.h
A be/src/util/parquet-bloom-filter.cc
A be/src/util/parquet-bloom-filter.h
M bin/rat_exclude_files.txt
M bin/run_clang_tidy.sh
M common/thrift/parquet.thrift
A testdata/data/parquet-bloom-filtering.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bloom-filter.test
A tests/query_test/test_parquet_bloom_filter.py
27 files changed, 6,910 insertions(+), 132 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/17026/17

--
To view, visit http://gerrit.cloudera.org:8080/17026
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7119c7161fa3658e561fc1265430cb90079d8287
Gerrit-Change-Number: 17026
Gerrit-PatchSet: 17
Gerrit-Owner: Daniel Becker
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
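The two rules in this commit message (the supported type pairs and the skip for fully dictionary-encoded columns) can be sketched roughly as follows. This is an illustration only: the actual logic lives in Impala's C++ Parquet scanner and is not shown in this message, and `should_use_bloom_filter` and its `fully_dict_encoded` flag are hypothetical names.

```python
# Supported Parquet type -> Impala type pairs, copied from the commit message.
SUPPORTED_PAIRS = {
    "INT32": {"TINYINT", "SMALLINT", "INT"},
    "INT64": {"BIGINT"},
    "FLOAT": {"FLOAT"},
    "DOUBLE": {"DOUBLE"},
    "BYTE_ARRAY": {"STRING"},
}


def should_use_bloom_filter(parquet_type, impala_type, fully_dict_encoded):
    """Hypothetical decision helper, not Impala's real API."""
    # A fully dictionary-encoded column can be filtered exactly through its
    # dictionary, so the Bloom filter is not consulted.
    if fully_dict_encoded:
        return False
    # Otherwise the filter is only usable for a supported type pair.
    return impala_type in SUPPORTED_PAIRS.get(parquet_type, set())


print(should_use_bloom_filter("INT32", "SMALLINT", False))
print(should_use_bloom_filter("BYTE_ARRAY", "VARCHAR", False))
```

VARCHAR falls outside the table, which matches the test's requirement that row groups are not discarded for that type.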
[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17088 ) Change subject: IMPALA-10520: Implement ds_theta_intersect() function .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6958/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/17088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Gerrit-Change-Number: 17088 Gerrit-PatchSet: 5 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 10:29:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17088 ) Change subject: IMPALA-10520: Implement ds_theta_intersect() function .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Gerrit-Change-Number: 17088 Gerrit-PatchSet: 5 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 10:29:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10520: Implement ds_theta_intersect() function
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17088 ) Change subject: IMPALA-10520: Implement ds_theta_intersect() function .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17088 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97 Gerrit-Change-Number: 17088 Gerrit-PatchSet: 4 Gerrit-Owner: Fucun Chu Gerrit-Reviewer: Fucun Chu Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 10:28:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17130 ) Change subject: IMPALA-10222: CREATE TABLE AS SELECT for Iceberg tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17130 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81d2084e401b9fa74d5ad161b51fd3e2aa3fcc67 Gerrit-Change-Number: 17130 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 12 Mar 2021 10:14:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. Patch Set 3: Code-Review+2 LGTM; thanks for the fix, Quanlong! -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 10:02:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. Patch Set 3: Code-Review+1 Hi Quanlong, thanks for adding the comment, LGTM! -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 08:46:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 ) Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8348/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17099 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71 Gerrit-Change-Number: 17099 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Fri, 12 Mar 2021 08:37:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7712: Support Google Cloud Storage
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17121 ) Change subject: IMPALA-7712: Support Google Cloud Storage .. Patch Set 10: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6957/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/17121 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia91ec956de3b620cccf6a1244b56b7da7a45b32b Gerrit-Change-Number: 17121 Gerrit-PatchSet: 10 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 12 Mar 2021 08:36:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17177 ) Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8347/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17177 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e Gerrit-Change-Number: 17177 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 12 Mar 2021 08:20:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/17099 )

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
..

Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17099/2/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/17099/2/shell/impala_shell.py@1321
PS2, Line 1321: # Python2 will implicitly convert unicode to str when printing to stderr. It's done
> nit: could you add a short one line comment that explains this condition?
Done

--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tamas Mate
Gerrit-Comment-Date: Fri, 12 Mar 2021 08:17:15 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
Hello Tamas Mate, Laszlo Gaal, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17099

to look at the new patch set (#3).

Change subject: IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters
..

IMPALA-10523: Fix impala-shell crash in printing error messages that contain UTF-8 characters

In Python2, print() converts all non-keyword arguments to strings like str() does and writes them to the stream. str() on QueryStateException returns its value (i.e. the error message), which could be of unicode type. Python2 will implicitly encode it to str type using the default encoding, 'ascii'. This could result in UnicodeEncodeError when there are non-ascii characters in the error message.

This patch explicitly encodes the error message using 'utf-8' encoding if it's of unicode type and the shell is run in Python2.

Tests:
- Add test in test_shell_interactive.py

Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 16 insertions(+), 1 deletion(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/17099/3

--
To view, visit http://gerrit.cloudera.org:8080/17099
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie10f5b03ecc5877053c2fbada1afaf256b423a71
Gerrit-Change-Number: 17099
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Laszlo Gaal
Gerrit-Reviewer: Quanlong Huang
Gerrit-Reviewer: Tamas Mate
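The failure mode and the fix described in this commit message can be sketched as below. This is a minimal illustration under assumptions, not the code in shell/impala_shell.py: `printable_error` and `ascii_encode_fails` are hypothetical helpers, and both expect text input (unicode in Python 2, str in Python 3).

```python
import sys


def ascii_encode_fails(text):
    """Return True if encoding text with the 'ascii' codec raises an error.

    This mimics the implicit encode Python 2 performs when printing unicode.
    """
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


def printable_error(msg):
    """Encode msg as UTF-8 only when running Python 2 with a unicode value."""
    # `unicode` exists only in Python 2; the version check short-circuits
    # first, so the name is never evaluated on Python 3.
    if sys.version_info.major == 2 and isinstance(msg, unicode):  # noqa: F821
        return msg.encode("utf-8")
    return msg


message = u"Table not found: caf\u00e9"
print(ascii_encode_fails(message))
sys.stderr.write(printable_error(message) + "\n")
```

Encoding explicitly means the stream receives bytes that are already valid UTF-8, so no implicit ASCII conversion is attempted.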
[Impala-ASF-CR] IMPALA-10377: Improve the accuracy of resource estimation
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 ) Change subject: IMPALA-10377: Improve the accuracy of resource estimation .. Patch Set 22: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6956/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/16842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Gerrit-Change-Number: 16842 Gerrit-PatchSet: 22 Gerrit-Owner: liuyao Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: liuyao Gerrit-Comment-Date: Fri, 12 Mar 2021 08:09:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations
Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17177 )

Change subject: IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations
..

IMPALA-10582: Fix wrong summary numbers in the webpage of catalogd operations

Webpage of catalogd operations doesn't sum up requests correctly. Instead, the current meaning is summing by tables. As the column name is "Number of requests", we should sum up by requests.

Tests:
- Manually run test_concurrent_inserts and verify the number is correct.

Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
---
M be/src/catalog/catalog-server.cc
1 file changed, 1 insertion(+), 1 deletion(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/17177/1

--
To view, visit http://gerrit.cloudera.org:8080/17177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1c5361d981832d6f28db5f203a2c2538fe8ebb5e
Gerrit-Change-Number: 17177
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang
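The difference between summing by tables and summing by requests can be shown with a toy example. The real one-line change is in the C++ file be/src/catalog/catalog-server.cc and is not reproduced in this message; the dictionary shape and both function names below are assumptions made purely for illustration.

```python
def count_by_tables(requests_per_table):
    """Old meaning: one count per table that has in-flight operations."""
    return len(requests_per_table)


def count_by_requests(requests_per_table):
    """Meaning implied by the "Number of requests" column: total requests."""
    return sum(requests_per_table.values())


# Hypothetical snapshot: table name -> number of concurrent requests on it.
snapshot = {"db.t1": 3, "db.t2": 1}

print(count_by_tables(snapshot))
print(count_by_requests(snapshot))
```

With several concurrent requests on one table, as in test_concurrent_inserts, the two numbers diverge, which is how the discrepancy becomes visible.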