[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12202/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 06:52:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8977/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 06:46:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table Query statements that contain bucketed tables and use operations such as join, group by, or sort by can use bucket shuffle join to optimize the execution plan, reducing the amount of data transferred and the query latency. To ensure consistency with Hive, the bucket hash is calculated using the same method that Hive uses to calculate the hash value of a column. Add new query options as feature switches: ENABLE_BUCKET_SHUFFLE, FRAGMENT_INSTANCE_BUCKET_NUM, BUCKET_EXEC_BACKEND_RATIO. Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 52 files changed, 2,212 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/6 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 6 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins
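Note on the Hive-consistent bucket hash mentioned in the IMPALA-3120 commit message: the following is a minimal sketch, not the patch's implementation. It assumes Hive's legacy bucketing rule, `(hash & Integer.MAX_VALUE) % numBuckets`, and the convention that an INT column hashes to its own value; newer Hive bucketing versions may use a different hash function. The function names `bucket_id` and `group_by_bucket` are hypothetical.

```python
def bucket_id(hash_code: int, num_buckets: int) -> int:
    """Map a 32-bit hash code to a bucket using the assumed legacy Hive rule.

    Masking with 0x7FFFFFFF clears the sign bit, so negative hash codes
    still map to a non-negative bucket index.
    """
    return (hash_code & 0x7FFFFFFF) % num_buckets


def group_by_bucket(int_keys, num_buckets):
    """Group integer keys by bucket, assuming an INT hashes to itself."""
    buckets = {}
    for key in int_keys:
        buckets.setdefault(bucket_id(key, num_buckets), []).append(key)
    return buckets


if __name__ == "__main__":
    # Two tables bucketed on the same key with the same bucket count place
    # equal keys in the same bucket, which is what lets a join skip a full
    # shuffle of the bucketed side.
    left = group_by_bucket([1, 5, 9, 2], 4)
    right = group_by_bucket([5, 2, 13], 4)
    print(left)
    print(right)
```

Because both sides apply the same function, a key present in both tables always lands in the same bucket index, so only the matching buckets need to be brought together.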
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12201/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 5 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 06:30:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12200/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 06:20:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8976/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 5 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 06:14:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table Query statements that contain bucketed tables and use operations such as join, group by, or sort by can use bucket shuffle join to optimize the execution plan, reducing the amount of data transferred and the query latency. To ensure consistency with Hive, the bucket hash is calculated using the same method that Hive uses to calculate the hash value of a column. Add new query options as feature switches: ENABLE_BUCKET_SHUFFLE, FRAGMENT_INSTANCE_BUCKET_NUM, BUCKET_EXEC_BACKEND_RATIO. Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 51 files changed, 2,202 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/5 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 5 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 2: (7 comments) Thanks for Daniel's feedback! Adjusted the commit message. While revisiting the Precondition, I found another bug: IMPALA-11851. It happens when the catalog view exposes complex types and table masking policies need to be applied. Since it's a bug in branches that support complex types in the SelectList, I'll fix it in another patch. http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking > Do I understand it correctly that the error occurs if we try to query a sta Yeah, the error is due to "v.*" always being treated as a star on a struct. http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@10 PS1, Line 10: When > Nit: When Done http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@11 PS1, Line 11: hadn' > Nit: hadn't Done http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@12 PS1, Line 12: the table masking view can't expose complex type columns directly > I see you've already created IMPALA-11847 for this. Yeah, I filed IMPALA-11847 for the refactor since it's a broader change. It also depends on the full functionality of complex type support in the select list, which isn't finished yet, e.g. some remaining items: IMPALA-9551, IMPALA-10851, IMPALA-11052. This patch is a fix for the current solution and it will be backported to older branches. http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@32 PS1, Line 32: is rewritten to > Could you make it clearer that these are the conditions for returning the p Done http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@33 PS1, Line 33: > Nit: rooted?
Done http://gerrit.cloudera.org:8080/#/c/19429/1/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test: http://gerrit.cloudera.org:8080/#/c/19429/1/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test@63 PS1, Line 63: a > Nit: expanding. Done -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Thu, 19 Jan 2023 06:06:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Hello Fang-Yu Rao, Daniel Becker, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19429 to look at the new patch set (#2). Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking resolvePathWithMasking() is a wrapper on resolvePath() to further resolve nested columns inside the table masking view. When it was added, complex types in the select list hadn't been supported yet. So the table masking view can't expose complex type columns directly in the select list. Any paths in nested types will be further resolved inside the table masking view in resolvePathWithMasking(). Take the following query as an example: select id, nested_struct.* from complextypestbl; If Ranger column-masking/row-filter policies are applied on the table, the query is rewritten as select id, nested_struct.* from ( select mask(id) from complextypestbl where row-filtering-condition ) t; Table masking view "t" can't expose the nested column "nested_struct". So we further resolve "nested_struct" inside the inlineView to use the masked table "complextypestbl". The underlying TableRef is expected to be a BaseTableRef. Paths that don't reference nested columns should be resolved and returned directly (just like the original resolvePath() does). E.g. select v.* from masked_view v is rewritten to select v.* from ( select mask(c1), mask(c2), ..., mask(cn) from masked_view where row-filtering-condition ) v; The STAR path "v.*" should be resolved directly. However, it's unexpectedly treated as a nested column. The code then tries to resolve it inside the table "masked_view", finds that it's not a table, and throws an IllegalStateException.
These are the conditions for returning the STAR paths directly: - The type is STRUCT - And the resolved path is rooted at a valid tuple descriptor They don't really recognize the nested columns. STAR expansion is only valid for paths to a struct type (or a table/view). So the first condition always matches. The second condition also matches for STAR paths on table/view, i.e. paths of "v.*" when "v" is a catalog table/view. The rooted tuple descriptor is exactly the output tuple of the table/view. This patch fixes the check for nested struct STAR expansion by checking the matched types instead. Note that if "v.*" is a table/view expansion, the matched type list is empty. If "v.*" is a struct column expansion, the matched type list contains the STRUCT column type. Tests: - Add missing coverage on STAR paths (v.*) on masked views. Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 --- M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/Path.java M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test 4 files changed, 67 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/19429/2 -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang
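Note on the fixed check described in the IMPALA-11845 commit message: the sketch below only illustrates the rule as stated there (an empty matched type list for a table/view expansion, a STRUCT entry for a struct column expansion). It is not Impala's Java code; `ResolvedPath`, `matched_types`, and `is_struct_star` are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResolvedPath:
    """Hypothetical stand-in for a resolved path in the analyzer."""
    raw: str                                   # e.g. "v.*" or "nested_struct.*"
    matched_types: List[str] = field(default_factory=list)


def is_struct_star(path: ResolvedPath) -> bool:
    """Return True only for a STAR expansion of a STRUCT column.

    A table/view expansion such as "v.*" has no matched types, so it is
    returned directly instead of being resolved as a nested column.
    """
    if not path.raw.endswith(".*"):
        return False
    return bool(path.matched_types) and path.matched_types[-1] == "STRUCT"


if __name__ == "__main__":
    print(is_struct_star(ResolvedPath("v.*")))
    print(is_struct_star(ResolvedPath("nested_struct.*", ["STRUCT"])))
```

The point of checking the matched types, per the commit message, is that the earlier two conditions also held for plain table/view STAR paths and so could not tell the two cases apart.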
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change, it does not support non PEP440 compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in IMPALA-11849 (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Reviewed-on: http://gerrit.cloudera.org:8080/19431 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins --- M shell/CMakeLists.txt 1 file changed, 5 insertions(+), 2 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 8 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
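Note on the PEP 440 requirement behind IMPALA-11846: a minimal sketch of checking whether a version string is in PEP 440 canonical form, using the regular expression given in the PEP's appendix. This is not what setuptools or the Impala build runs; it covers canonical public versions only (no local version labels, no normalization), and the sample version strings are illustrative rather than taken from the patch.

```python
import re

# Canonical-form pattern from PEP 440 (appendix). It rejects local version
# labels ("+abc") and non-normalized spellings such as "1.0-rc1".
_PEP440_CANONICAL = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical_pep440(version: str) -> bool:
    """Return True if the version string is in PEP 440 canonical form."""
    return _PEP440_CANONICAL.match(version) is not None


if __name__ == "__main__":
    # Illustrative inputs only.
    for candidate in ["4.3.0", "4.3.0.dev0", "1.0rc1", "4.3.0-SNAPSHOT"]:
        print(candidate, is_canonical_pep440(candidate))
```

A suffix in the style of "-SNAPSHOT" fails the check, which is the kind of non-compliant name the commit message says setuptools 66.0.0 no longer accepts.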
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 7 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Thu, 19 Jan 2023 03:59:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 3: (11 comments) A few quick comments http://gerrit.cloudera.org:8080/#/c/19428/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19428/3//COMMIT_MSG@2 PS3, Line 2: Author: jasonmfehr Do you want to use your Cloudera email? http://gerrit.cloudera.org:8080/#/c/19428/3//COMMIT_MSG@9 PS3, Line 9: When using the hs2 protocol with the http transport, include several tracing http The commit message should be wrapped at 72 characters. http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.h File be/src/transport/THttpServer.h: http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.h@156 PS3, Line 156: // Client-defined string identifying the HTTP request, meaingful only to the client. Nit "meaningful" http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.h@157 PS3, Line 157: const char* HEADER_REQUEST_ID = "X-Request-Id"; I think this should be something like: static const std::string HEADER_REQUEST_ID; and then, in THttpServer.cpp const string THttpTransport::HEADER_REQUEST_ID = "X-Request-Id"; http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.h@236 PS3, Line 236: std::string header_request_id_ = ""; To me it's a bit confusing to have both HEADER_IMPALA_SESSION_ID and header_request_id_ Maybe x_request_id_ (etc.) is clearer? http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.cpp File be/src/transport/THttpServer.cpp: http://gerrit.cloudera.org:8080/#/c/19428/3/be/src/transport/THttpServer.cpp@245 PS3, Line 245: LOG(INFO) << "HTTP Connection Tracing" INFO means we will usually log this, maybe this should be VLOG(2)?
http://gerrit.cloudera.org:8080/#/c/19428/3/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/19428/3/shell/impala_client.py@651 PS3, Line 651: # when the transport is http, subclasses can override this function Nit: capitalize sentence, add period at end. http://gerrit.cloudera.org:8080/#/c/19428/3/tests/shell/test_shell_commandline.py File tests/shell/test_shell_commandline.py: http://gerrit.cloudera.org:8080/#/c/19428/3/tests/shell/test_shell_commandline.py@1532 PS3, Line 1532: args = ['--protocol', 'hs2-http', '-q', 'select version();profile'] Add a brief description of the test http://gerrit.cloudera.org:8080/#/c/19428/3/tests/shell/test_shell_commandline.py@1554 PS3, Line 1554: # find all HTTP Connection Tracing log lines Capitalize comments and add periods at the end. http://gerrit.cloudera.org:8080/#/c/19428/3/tests/shell/test_shell_commandline.py@1560 PS3, Line 1560: # request_id consists of the same guid with a serially increasing integer Nit: make it clear that this is what impala-shell does. Other clients, or istio, may have different conventions. http://gerrit.cloudera.org:8080/#/c/19428/3/tests/shell/test_shell_commandline.py@1619 PS3, Line 1619: @CustomClusterTestSuite.with_args("-log_dir={0}".format(LOG_DIR_HTTP_TRACING)) You are running 2 tests with the same --log_dir value, can they interfere with each other? -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 19 Jan 2023 03:46:46 + Gerrit-HasComments: Yes
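Note on the request-id convention discussed in the IMPALA-11850 review (the same guid with a serially increasing integer): a minimal sketch of how a client could build such a header value. Only the header name "X-Request-Id" comes from the review; the class name, the "-" separator, and the starting counter value are assumptions made for the illustration and may differ from what impala-shell actually does.

```python
import itertools
import uuid


class RequestIdGenerator:
    """Hypothetical helper producing per-request tracing header values."""

    def __init__(self):
        # One guid per connection; assumed to stay fixed for its lifetime.
        self._guid = str(uuid.uuid4())
        # Serially increasing integer appended to the guid (assumed to
        # start at 0).
        self._counter = itertools.count()

    def next_headers(self):
        """Return the tracing header for the next HTTP request."""
        request_id = "{0}-{1}".format(self._guid, next(self._counter))
        return {"X-Request-Id": request_id}


if __name__ == "__main__":
    gen = RequestIdGenerator()
    print(gen.next_headers())
    print(gen.next_headers())
```

As the review points out, other clients or a proxy such as istio may follow different conventions, so a server should treat the value as an opaque string.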
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 7 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Thu, 19 Jan 2023 03:00:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19155 ) Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12199/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 18 Jan 2023 23:18:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19155 to look at the new patch set (#6). Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS Since HIVE-24329 HMS emits an event when a compaction is committed, but Impala ignores it. Handling it would allow automatic refreshing of file metadata after commit compactions. Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 --- M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java 4 files changed, 107 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/19155/6 -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 6 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8975/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 7 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 22:51:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has uploaded a new patch set (#6) to the change originally created by Gergely Fürnstáhl. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change, it does not support non PEP440 compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in IMPALA-11849 (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef --- M shell/CMakeLists.txt 1 file changed, 5 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/19431/6 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 6 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 22:15:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19033 ) Change subject: IMPALA-11604 Planner changes for CPU usage .. Patch Set 32: (2 comments) http://gerrit.cloudera.org:8080/#/c/19033/32//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19033/32//COMMIT_MSG@169 PS32, Line 169: Compute the total ProcessingCost of the query > We calculate total ProcessingCost of the query and CoreRequirement in IV. W ProcessingCost is used to derive CoreRequirement in section IV. My intention is to only use CoreRequirement for finding suitable executor group. I will clarify this in commit message. http://gerrit.cloudera.org:8080/#/c/19033/32/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/19033/32/common/thrift/Frontend.thrift@779 PS32, Line 779: total CPU cores among all executor > The max_mem_limit is defined as per host estimated-memory limit. We may def Ack. I will cross check with your IMPALA-11617 patch as well. -- To view, visit http://gerrit.cloudera.org:8080/19033 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a Gerrit-Change-Number: 19033 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Wed, 18 Jan 2023 21:53:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12198/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 20:42:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Jason Fehr has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. When using the hs2 protocol with the http transport, include several tracing http headers by default. These headers are: * X-Request-Id-- client defined string that identifies the http request, this string is meaningful only to the client * X-Impala-Session-Id -- session id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated * X-Impala-Query-Id -- query id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated The Impala shell includes these flags by default. Command line arguments have been added to remove these headers. The Impala backend logs out these headers if they are on the http request. Testing: - manual testing (verified using debugging proxy and impala logs) - new python test Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f --- M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M shell/ImpalaHttpClient.py M shell/impala_client.py M shell/impala_shell.py M shell/impala_shell_config_defaults.py M shell/option_parser.py M tests/common/test_dimensions.py M tests/shell/test_shell_commandline.py 9 files changed, 241 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/19428/3 -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 3 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins
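The header behaviour described in the commit message above can be sketched roughly as follows. The three header names come from the commit message; the function name, its parameters, and the use of a random UUID for the request id are assumptions for illustration and are not taken from shell/ImpalaHttpClient.py.

```python
import uuid


def build_tracing_headers(session_id=None, query_id=None, request_id=None):
    """Hypothetical helper: assemble tracing headers for one HTTP call.

    The session and query headers are left out until the backend has
    generated the corresponding ids, as the commit message describes.
    """
    # Assumption: the client picks any string it likes as the request id.
    headers = {"X-Request-Id": request_id or str(uuid.uuid4())}
    if session_id is not None:
        headers["X-Impala-Session-Id"] = session_id
    if query_id is not None:
        headers["X-Impala-Query-Id"] = query_id
    return headers


# Before a session exists only the request id is sent.
print(sorted(build_tracing_headers()))
# Once both ids are known, all three headers are present.
print(sorted(build_tracing_headers(session_id="s1", query_id="q1")))
```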
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 5: Code-Review+2 Chopping off the -* makes sense to me -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 5 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 20:02:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12197/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 5 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 19:58:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has uploaded a new patch set (#5) to the change originally created by Gergely Fürnstáhl. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change, it does not support non PEP440 compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in IMPALA-11849 (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef --- M shell/CMakeLists.txt 1 file changed, 5 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/19431/5 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 5 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12196/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 19:21:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12195/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 2 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 19:21:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/19431/4/shell/CMakeLists.txt File shell/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/19431/4/shell/CMakeLists.txt@34 PS4, Line 34: "${CMAKE_SOURCE_DIR}/shell/build/dist/impala_shell-${PKG_SUFFIX}.tar.gz") This doesn't work right with IMPALA_VERSION=4.0.0.7.2.17.0-88, which is something we use in CDP builds. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 19:16:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 2: (4 comments) http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@9 PS1, Line 9: need > nit: needs Done http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@10 PS1, Line 10: implement > nit: implements Done http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@19 PS1, Line 19: - Checked that manifest caching works through debug logging. > Does it works in case of HadoopTables and Catalogs as well? I believe manifest caching is only touched if table is loaded using IcebergHadoopCatalog or IcebergHiveCatalog. http://gerrit.cloudera.org:8080/#/c/19423/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/19423/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1067 PS1, Line 1067: non-existent. > nit. non-existent or doesn't exist Done -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 19:08:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Hello Yida Wu, Zoltan Borok-Nagy, Gergely Fürnstáhl, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19423 to look at the new patch set (#2). Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. IMPALA-11658: Implement Iceberg manifest caching config for Impala Impala needs to supply Iceberg's catalog properties to enable the manifest caching feature. This commit implements the necessary config reading. Iceberg-related config is read from hadoop-conf.xml and supplied as a Map in catalog instantiation. Additionally, this patch also replaces the deprecated RuntimeIOException with its superclass, UncheckedIOException. Testing: - Pass core tests. - Checked that manifest caching works through debug logging. Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 --- M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopCatalog.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHadoopTables.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergHiveCatalog.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py 7 files changed, 65 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/19423/2 -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 2 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy
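The config reading described in the commit message above (Iceberg settings pulled out of the Hadoop configuration and handed to the catalog as a map) might look roughly like the sketch below. It is written in Python for brevity, whereas the patch itself is Java. The "iceberg." prefix, the sample keys, and the function name are all assumptions for illustration; the real selection logic lives in IcebergUtil.java and may differ.

```python
def collect_catalog_properties(conf, prefix="iceberg."):
    """Hypothetical helper: pick out the entries meant for the catalog.

    conf is a plain dict standing in for the loaded Hadoop configuration.
    Keys are returned unchanged; whether the real code strips the prefix
    is not stated in the commit message.
    """
    return {key: value for key, value in conf.items() if key.startswith(prefix)}


# Stand-in for a loaded configuration (keys are illustrative only).
hadoop_conf = {
    "fs.defaultFS": "hdfs://localhost:20500",
    "iceberg.io.manifest.cache-enabled": "true",
}
print(collect_catalog_properties(hadoop_conf))
```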
[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.
Jason Fehr has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/19428 ) Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. .. IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol. When using the hs2 protocol with the http transport, include several tracing http headers by default. These headers are: * X-Request-Id-- client defined string that identifies the http request, this string is meaningful only to the client * X-Impala-Session-Id -- session id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated * X-Impala-Query-Id -- query id generated by the Impala backend, will be omitted on http calls that occur before this id has been generated The Impala shell includes these flags by default. Command line arguments have been added to remove these headers. The Impala backend logs out these headers if they are on the http request. Testing: - manual testing (verified using debugging proxy and impala logs) - new python test Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f --- M be/src/transport/THttpServer.cpp M be/src/transport/THttpServer.h M shell/ImpalaHttpClient.py M shell/impala_client.py M shell/impala_shell.py M shell/impala_shell_config_defaults.py M shell/option_parser.py M tests/common/test_dimensions.py M tests/shell/test_shell_commandline.py 9 files changed, 235 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/19428/2 -- To view, visit http://gerrit.cloudera.org:8080/19428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f Gerrit-Change-Number: 19428 Gerrit-PatchSet: 2 Gerrit-Owner: Jason Fehr Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/19423/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/19423/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1067 PS1, Line 1067: it is not exist nit. non-existent or doesn't exist -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Yida Wu Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 18:26:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 4: Code-Review+2 This makes sense to me. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 18:13:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19155 ) Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12194/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala Gerrit-Comment-Date: Wed, 18 Jan 2023 17:54:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 17:46:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. IMPALA-11840: Error with joining unnest with views Queries fail in the following situation involving collections and views: 1. A view returns an array 2. A second view unnests the array returned from the first view 3. The unnested view is queried in an outer query For example: use functional_parquet; with sub as ( select id, arr1.item unnested_arr from complextypes_arrays_only_view, complextypes_arrays_only_view.int_array arr1) select id, unnested_arr from sub; ERROR: IllegalStateException: null The problem is that in CollectionTableRef.analyze(), if - there is a source view and - the collection ref is within a WITH clause and - it is not in the select list then 'desc_' is not set, but it has to be set in order for TableRef.analyzeJoin() to succeed. This commit solves the problem by assigning a value to 'desc_' also in the above case. Testing: - Added regression tests in nested-types-runtime.test. 
Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Reviewed-on: http://gerrit.cloudera.org:8080/19426 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 2 files changed, 104 insertions(+), 19 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 5 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS
Sai Hemanth Gantasala has posted comments on this change. ( http://gerrit.cloudera.org:8080/19155 ) Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. Patch Set 5: (15 comments) http://gerrit.cloudera.org:8080/#/c/19155/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19155/4//COMMIT_MSG@9 PS4, Line 9: HMS emits an event when a compaction is committed > For a partitioned table, does HMS still emit a single event if all partitio It'll be a single commit compaction event always. Consider a partitioned table foo, > Alter table foo compact 'minor'; --> will fire single commit compaction event > at the table level, partition name would be null, in this case, we need to > reload whole table file metadata. >Alter table foo partition(i=1) compact 'minor'; --> will fire single commit >compaction event for a single partition i=1 of the table foo. http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2292 PS4, Line 2292: { > nit: DatabaseNotFoundException is a CatalogException so we can remove it. Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2299 PS4, Line 2299: !(tbl instanceo > nit: remove null check since it is covered by subsequent check of 'instance Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2304 PS4, Line 2304: Coll > nit: our code style uses 4 spaces for continuation ident. 
Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2308 PS4, Line 2308: Coll > nit: 4 spaces Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@2310 PS4, Line 2310: if > nit: need a space after "if" Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java: http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@220 PS4, Line 220: case INSERT: > This can also be considered as transactional events that trigger incrementa This event can come as a standalone event after a major/minor compaction apart from an incremental refresh. I thought this would be the right place. Also, since the incremental refresh event processing is configurable I thought it is right to put it here. @yu-wen also thinks this is the right place. What do you think? http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2492 PS4, Line 2492: processTableReload(); : } : } else { : > These two are already defined in the base class. We can remove them here. Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2502 PS4, Line 2502: ivate vo > nit: 4 spaces for continuation ident Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2505 PS4, Line 2505: > If this method only exists in CDP Hive, we need to add a wrapper for it in Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2506 PS4, Line 2506: // Ignore event if table or database is not in catalog. Throw exception if : // refresh fails. 
If the partition does not > This is related to the above comment about removing these fields. Are these Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2515 PS4, Line 2515: + "p > nit: 4 spaces Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2518 PS4, Line 2518: Joiner.on(', > nit: please fix the idention to align with other code, e.g. Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2520 PS4, Line 2520: > nit: need a space after comma Ack http://gerrit.cloudera.org:8080/#/c/19155/4/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2536 PS4, Line 2536: getFullyQualifiedTblName()), e); > This comment confused me. Isn't this a single table event? Maybe I misunder Ack -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerri
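The event handling discussed in the review comments above (one event per compaction, a null partition name meaning a table-level compaction, and events for tables not in the catalog being ignored) can be sketched as below. This is an illustration in Python, not the Java code in MetastoreEvents.java; the function name, the tuple return value, and the set of known tables are assumptions made for the example.

```python
def plan_compaction_reload(known_tables, db, table, partition_name=None):
    """Hypothetical helper: decide what a commit-compaction event reloads.

    known_tables is a set of (db, table) pairs standing in for the catalog.
    Returns an (action, partition_name) pair.
    """
    if (db, table) not in known_tables:
        # Table or database is not in the catalog: nothing to refresh.
        return ("ignore", None)
    if partition_name is None:
        # Table-level compaction: file metadata of the whole table is reloaded.
        return ("reload_table", None)
    # Partition-level compaction: only that partition is reloaded.
    return ("reload_partition", partition_name)


catalog = {("default", "foo")}
print(plan_compaction_reload(catalog, "default", "foo"))         # table-level
print(plan_compaction_reload(catalog, "default", "foo", "i=1"))  # one partition
print(plan_compaction_reload(catalog, "default", "bar"))         # unknown table
```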
[Impala-ASF-CR] IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS
Hello Quanlong Huang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/19155 to look at the new patch set (#5). Change subject: IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS .. IMPALA-11626: Handled COMMIT_COMPACTION_EVENT from HMS Since HIVE-24329 HMS emits an event when a compaction is committed, but Impala ignores it. Handling it would allow automatic refreshing of file metadata after commit compactions. Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 --- M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java 4 files changed, 107 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/19155/5 -- To view, visit http://gerrit.cloudera.org:8080/19155 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I464faedb4e3bbcd417bab2e3cb0d57e339d42605 Gerrit-Change-Number: 19155 Gerrit-PatchSet: 5 Gerrit-Owner: Sai Hemanth Gantasala Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Sai Hemanth Gantasala
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12193/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 17:25:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8974/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 17:07:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 2: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8972/ -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 2 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 17:06:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has uploaded a new patch set (#4) to the change originally created by Gergely Fürnstáhl. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change: it does not support non-PEP440-compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in IMPALA-11849 (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef --- M shell/CMakeLists.txt 1 file changed, 8 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/19431/4 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 4 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
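Editorial sketch, not part of the patch: the IMPALA-11846 commit message turns on whether a version string is PEP 440 compliant. The snippet below is a rough way to check that locally. The pattern is modeled on the regular expression in the appendix of PEP 440 and is only an approximation of what setuptools enforces; `is_pep440` is a hypothetical helper name, and `4.3.0-SNAPSHOT` is a made-up non-compliant string, not the value Impala actually used.

```python
import re

# Permissive pattern modeled on the regex in the appendix of PEP 440.
# Assumption: this approximates, but is not, setuptools' own validation.
_PEP440 = re.compile(
    r"""
    v?
    (?:(?P<epoch>[0-9]+)!)?
    (?P<release>[0-9]+(?:\.[0-9]+)*)
    (?P<pre>[-_.]?(?:a|b|c|rc|alpha|beta|pre|preview)[-_.]?[0-9]*)?
    (?P<post>-[0-9]+|[-_.]?(?:post|rev|r)[-_.]?[0-9]*)?
    (?P<dev>[-_.]?dev[-_.]?[0-9]*)?
    (?:\+(?P<local>[a-z0-9]+(?:[-_.][a-z0-9]+)*))?
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_pep440(version: str) -> bool:
    """Return True if the whole string looks like a PEP 440 version."""
    return _PEP440.fullmatch(version.strip()) is not None


if __name__ == "__main__":
    # The first three strings appear in the review thread; the last is hypothetical.
    for candidate in ("1.0.0.dev", "4.3.0.dev", "4.2.0", "4.3.0-SNAPSHOT"):
        print(candidate, is_pep440(candidate))
```

Both development versions discussed in the review (`1.0.0.dev` and `4.3.0.dev`) pass this check, which is consistent with the reviewers treating either as an acceptable placeholder.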
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt File shell/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt@29 PS2, Line 29: 4.3.0.dev > Note: shell/CMakeFiles needs to be deleted, to be able to create the packag I mildly prefer a nonsense version if we can't programmatically set this based on the current build version, as it's only used in the dev environment. Although did you try $ENV{IMPALA_VERSION}? -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 3 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:56:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12192/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 3 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:50:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 3: Code-Review+2 Looks good to unblock builds. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 3 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:47:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Gergely Fürnstáhl has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt File shell/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt@29 PS2, Line 29: 1.0.0.dev > We could do 4.3.0 so we test is for the next release. Note: shell/CMakeFiles needs to be deleted, to be able to create the package with a new name. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:32:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Gergely Fürnstáhl has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change: it does not support non-PEP440-compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in a followup JIRA (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef --- M shell/CMakeLists.txt 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/19431/3 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 3 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Gergely Fürnstáhl has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt File shell/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt@29 PS2, Line 29: 1.0.0.dev > The Impala-shell version currently released on PyPI is 4.2.0. We could do 4.3.0 so we test it for the next release. I am no CMake expert; I tried it for a while but gave up for now and have already added the idea to the follow-up Jira. shell/make_shell_tarball.sh generates impala_build_version.py, which can be used, or shell/packaging/setup.py writes the used version to version.txt in the staging directory. The thing is, both are generated at build time, and we would need to guarantee that those targets are executed before creating the full path of the package. Another possibility could be to use wildcards, but I'm not sure that's better. The CMake config already isn't great for this part: the generated test package or its name got stuck somewhere, and I could not generate a new one with the new BUILD_VERSION until after git clean -dfx -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:25:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt File shell/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/19431/2/shell/CMakeLists.txt@29 PS2, Line 29: 1.0.0.dev The Impala-shell version currently released on PyPI is 4.2.0. Would it be possible to set this version number based on that value? Also (if that's not too much for a quick fix) it would be nicer to drive this value off of a CMake variable (or similar) to ensure coherency between the two affected lines. This is not a strong request; I think it would also be fine to do it in the context of the follow-up ticket you mentioned. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 16:13:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jason Fehr Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 15:55:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12191/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith Gerrit-Comment-Date: Wed, 18 Jan 2023 15:51:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11846: Fix builds with setuptools>=66.0.0
Gergely Fürnstáhl has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19431 ) Change subject: IMPALA-11846: Fix builds with setuptools>=66.0.0 .. IMPALA-11846: Fix builds with setuptools>=66.0.0 setuptools 66.0.0 introduced a breaking change: it does not support non-PEP440-compliant version names. This breaks impala_shell's packaging and installing test if the system python3's version is 3.8+. This is a quick fix to unblock builds. The rest of the work will be done in a followup JIRA (e.g. stabilizing the python environments version). impala_shell releases should not be affected by this, as the version number we generate is already PEP440 compliant. Testing: - Built locally with python3.8 Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef --- M shell/CMakeLists.txt 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/19431/2 -- To view, visit http://gerrit.cloudera.org:8080/19431 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4eb0957fb576e590b86b6fe570216cfb72d11aef Gerrit-Change-Number: 19431 Gerrit-PatchSet: 2 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Smith
[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19419 ) Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/19419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Gerrit-Change-Number: 19419 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 15:28:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19419 ) Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates .. IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates Similar to IMPALA-11591 but this Jira extends it to V2 tables. With this patch we group data files into two categories in IcebergContentFileStore: * data files without deletes * data files with deletes With this information we can avoid calling planFiles() when planning the scans of Iceberg tables. We can just set the lists of the file descriptors based on IcebergContentFileStore then invoke the regular planning methods. iceberg-v2-tables.test had to be updated a bit because now we are calculating the lengths of the file paths based on Impala's file descriptor objects + table location, and not based on data file information in Iceberg metadata (which has the file system prefix stripped) Testing: * executed existing tests * Updated plan tests Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Reviewed-on: http://gerrit.cloudera.org:8080/19419 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java A fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M fe/src/test/java/org/apache/impala/catalog/local/LocalCatalogTest.java M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test 9 files changed, 277 insertions(+), 172 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit 
http://gerrit.cloudera.org:8080/19419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Gerrit-Change-Number: 19419 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy
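Editorial sketch, not taken from the patch: the IMPALA-11826 commit message describes grouping data files into "without deletes" and "with deletes" so that scan planning can read precomputed lists instead of calling planFiles(). The idea can be illustrated as below. All names (`DataFile`, `GroupedFiles`, `group_data_files`) are hypothetical, and the per-file `has_deletes` flag is an assumption about how the two groups are told apart; the real IcebergContentFileStore and GroupedContentFiles classes are Java and may work differently.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass(frozen=True)
class DataFile:
    # Hypothetical stand-in for a file descriptor.
    path: str
    has_deletes: bool  # assumption: known when the table metadata is loaded


@dataclass
class GroupedFiles:
    without_deletes: List[DataFile] = field(default_factory=list)
    with_deletes: List[DataFile] = field(default_factory=list)


def group_data_files(files: Iterable[DataFile]) -> GroupedFiles:
    """Partition files once, so a planner can reuse the two lists later."""
    grouped = GroupedFiles()
    for data_file in files:
        if data_file.has_deletes:
            grouped.with_deletes.append(data_file)
        else:
            grouped.without_deletes.append(data_file)
    return grouped
```

With the files grouped once at load time, a predicate-free scan can take both lists as they are, which is the shortcut the commit message describes.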
[Impala-ASF-CR] WIP IMPALA-11745
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12190/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 6 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 15:00:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11658: Implement Iceberg manifest caching config for Impala
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/19423 ) Change subject: IMPALA-11658: Implement Iceberg manifest caching config for Impala .. Patch Set 1: Code-Review+1 (3 comments) Thanks for working on this, LGTM! http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@9 PS1, Line 9: need nit: needs http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@10 PS1, Line 10: implement nit: implements http://gerrit.cloudera.org:8080/#/c/19423/1//COMMIT_MSG@19 PS1, Line 19: - Checked that manifest caching works through debug logging. Does it work in case of HadoopTables and Catalogs as well? -- To view, visit http://gerrit.cloudera.org:8080/19423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5a60a700d2ae6302dfe395d1ef602e6b1d821888 Gerrit-Change-Number: 19423 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 14:52:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-11745
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12189/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 5 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 14:55:42 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-11745
Peter Rozsa has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. WIP IMPALA-11745 Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_geospatial_udfs.py M tests/query_test/test_udfs.py 31 files changed, 3,504 insertions(+), 141 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/6 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 6 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] WIP IMPALA-11745
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/19425/5/tests/custom_cluster/test_geospatial_udfs.py File tests/custom_cluster/test_geospatial_udfs.py: http://gerrit.cloudera.org:8080/#/c/19425/5/tests/custom_cluster/test_geospatial_udfs.py@23 PS5, Line 23: class TestGeospatialUdfs(CustomClusterTestSuite): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/5/tests/custom_cluster/test_geospatial_udfs.py@45 PS5, Line 45: flake8: W292 no newline at end of file -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 5 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 14:36:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-11745
Peter Rozsa has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. WIP IMPALA-11745 Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_geospatial_udfs.py M tests/query_test/test_udfs.py 31 files changed, 3,503 insertions(+), 141 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/5 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 5 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] WIP IMPALA-11745
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12188/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 4 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 14:24:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/19397 ) Change subject: IMPALA-11013 (part 1): Support 'MIGRATE TABLE' for external Hdfs tables .. Patch Set 6: (14 comments) Thanks for working on this! http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@19 PS6, Line 19: 'iceberg.catalog' = 'hadoop.catalog' Seems like the hadoop catalog is not supported. http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@26 PS6, Line 26: - InputFormat must be either PARQUET, ORC, or AVRO Hive tables support different file formats in different partitions. Does the migration work in such scenarios? It is not a problem if not, as it is an uncommon use case, and we can also deal with it in a follow-up jira. http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@33 PS6, Line 33: Hadoop Catalog Hive Catalog is the default, right? http://gerrit.cloudera.org:8080/#/c/19397/6//COMMIT_MSG@35 PS6, Line 35: - Child query 4: Drop the temporary Hdfs table. What happens if there is an error at any step? It would be nice if we could undo the steps that have been executed and restore the original table. Probably it could be tested by extending ChildQueryExecutor with the ability to inject errors at the Nth query. http://gerrit.cloudera.org:8080/#/c/19397/6/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/19397/6/be/src/service/client-request-state.cc@2130 PS6, Line 2130: RuntimeProfile* set_hdfs_table_profile = RuntimeProfile::Create( : &profile_pool_, "Set HDFS table query"); : child_profile->AddChild(set_hdfs_table_profile); : child_queries.emplace_back(params.set_hdfs_table_external_query, this, : parent_server_, set_hdfs_table_profile, &profile_pool_); nit: can we refactor these code fragments to a helper function? 
http://gerrit.cloudera.org:8080/#/c/19397/6/be/src/service/client-request-state.cc@2143 PS6, Line 2143: Refresh temporary HDFS table query This is not mentioned in the commit message. http://gerrit.cloudera.org:8080/#/c/19397/6/be/src/service/client-request-state.cc@2168 PS6, Line 2168: params.__isset.create_iceberg_table_query Could you please add a comment when this can be unset? IIUC it is not set in case of HiveCatalog, because in that case Iceberg's HiveCatalog creates the table for us. http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java File fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java: http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java@85 PS6, Line 85: FeTable table = analyzer.getTable(tableName_, Privilege.OWNER); Could you please add authorization tests (e.g. in test_ranger.py) for the case when someone without the necessary privileges tries to migrate an Iceberg table? http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java@118 PS6, Line 118: TRANSLATED_TO_EXTERNAL", "FALSE Do we need to set this as well? http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/analysis/MigrateStmt.java@139 PS6, Line 139: TBL_PROP_EXTERNAL_TABLE_PURGE, "false"); What happens if the original table's purge property was true? http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java File fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java: http://gerrit.cloudera.org:8080/#/c/19397/6/fe/src/main/java/org/apache/impala/util/MigrateTableUtil.java@156 PS6, Line 156: return Math.min(max, Math.max((max + 7) / 8, Integer.parseInt(threadNum))); Could you please add some comments about this formula, and why we are using this? 
http://gerrit.cloudera.org:8080/#/c/19397/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrate-from-external-hdfs-tables.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-migrate-from-external-hdfs-tables.test: http://gerrit.cloudera.org:8080/#/c/19397/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrate-from-external-hdfs-tables.test@26 PS6, Line 26: create table parquet_partitioned like alltypes stored as parquet; Could you please add tests with more partition types? E.g. partition by STRING, DATE, DECIMAL, etc. http://gerrit.cloudera.org:8080/#/c/19397/6/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrate-from-external-hdfs-tables.test@27 PS6, Line 27: insert into parquet_partitioned partition(year, month) select * from alltypes; Could you please also add tests with partition value being NULL? http://gerrit.cloudera.org:8080/#/c/19397/6/testdata/workloads/functional-query/queries/QueryTest/iceb
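Editorial sketch, not taken from the patch: the review above asks for comments on the formula `Math.min(max, Math.max((max + 7) / 8, Integer.parseInt(threadNum)))` in MigrateTableUtil.java. Read purely as arithmetic, it clamps the requested thread count to the range from ceil(max / 8) up to max, since `(max + 7) / 8` with integer division rounds max / 8 upward. The Python transcription below uses hypothetical names (`clamp_thread_count`, `max_threads`, `requested`); what `max` and `threadNum` represent in Impala is not stated in the thread, so the interpretation in the comments is an assumption.

```python
def clamp_thread_count(max_threads: int, requested: int) -> int:
    """Clamp `requested` into [ceil(max_threads / 8), max_threads].

    Mirrors the integer arithmetic of the Java expression quoted in the
    review; the meaning of the inputs is an assumption made here.
    """
    # (max + 7) // 8 is ceil(max / 8) for non-negative integers.
    lower_bound = (max_threads + 7) // 8
    return min(max_threads, max(lower_bound, requested))


if __name__ == "__main__":
    for requested in (1, 5, 100):
        print(requested, clamp_thread_count(16, requested))
```

For `max_threads = 16`, a request of 1 is raised to the floor of 2, a request of 5 is kept, and a request of 100 is capped at 16.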
[Impala-ASF-CR] WIP IMPALA-11745
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. Patch Set 4: (25 comments) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py File common/function-registry/gen_geospatial_udf_wrappers.py: http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@47 PS4, Line 47: e flake8: E501 line too long (169 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@48 PS4, Line 48: e flake8: E501 line too long (234 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@49 PS4, Line 49: e flake8: E501 line too long (234 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@50 PS4, Line 50: o flake8: E501 line too long (234 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@51 PS4, Line 51: o flake8: E501 line too long (159 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@54 PS4, Line 54: def generate_parameter(parameter_type, order): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@55 PS4, Line 55: ) flake8: E501 line too long (91 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@57 PS4, Line 57: def generate_argument(order): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@60 PS4, Line 60: def generate_argument_list(order): flake8: E302 expected 2 blank lines, found 1 
http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@66 PS4, Line 66: def generate_parameter_list(parameter_type, order): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@72 PS4, Line 72: def generate_method(return_type, parameter_type, exception_type, order): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@76 PS4, Line 76: u flake8: E501 line too long (300 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@76 PS4, Line 76: flake8: E202 whitespace before ')' http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@78 PS4, Line 78: def generate_methods(return_type, parameter_type, exception_type, min_number_of_params, max_number_of_params, increment): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@78 PS4, Line 78: x flake8: E501 line too long (121 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@81 PS4, Line 81: flake8: E203 whitespace before ',' http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@82 PS4, Line 82: flake8: E202 whitespace before ')' http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@86 PS4, Line 86: def generate_wrapper_class(config): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@87 PS4, Line 87: a flake8: E501 line too long (218 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@89 PS4, 
Line 89: def generate_file(config): flake8: E302 expected 2 blank lines, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@90 PS4, Line 90: g flake8: E501 line too long (152 > 90 characters) http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@92 PS4, Line 92: if not os.path.exists(FE_PATH): flake8: E305 expected 2 blank lines after class or function definition, found 1 http://gerrit.cloudera.org:8080/#/c/19425/4/common/function-registry/gen_geospatial_udf_wrappers.py@100 PS4, Line 100: flake8: W292 no newline at end of file http://gerrit.cloudera.org:8080/#/c/19425/4/tests/custom_cluster/test_geospatial_udfs.py File tests/custom_cluster/test_geospatial_udfs.py: http://gerrit.cloudera.org:8080/#/c/19425/4/tests/custom_cluster/test_geospatial_udfs.py@23 PS4, Line 23: class TestGeospatialUdfs(CustomClusterTestSuite): flake8: E302 e
[Impala-ASF-CR] WIP IMPALA-11745
Peter Rozsa has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19425 ) Change subject: WIP IMPALA-11745 .. WIP IMPALA-11745 Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 --- M be/src/common/global-flags.cc M be/src/exprs/hive-udf-call.cc M be/src/util/backend-gflag-util.cc M common/function-registry/CMakeLists.txt A common/function-registry/gen_geospatial_udf_wrappers.py M common/thrift/BackendGflags.thrift M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java A fe/src/main/java/org/apache/impala/catalog/HiveEsriGeospatialBuiltins.java M fe/src/main/java/org/apache/impala/catalog/ScalarFunction.java A fe/src/main/java/org/apache/impala/hive/executor/BinaryToBinaryHiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveGenericJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactory.java M fe/src/main/java/org/apache/impala/hive/executor/HiveJavaFunctionFactoryImpl.java A fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyFunctionExtractor.java M fe/src/main/java/org/apache/impala/hive/executor/HiveLegacyJavaFunction.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaDoubleWritable.java M fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/test/java/org/apache/impala/hive/executor/TestHiveJavaFunctionFactory.java M java/CMakeLists.txt M java/shaded-deps/hive-exec/pom.xml M testdata/datasets/README M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-query/queries/QueryTest/udf-esri-geospatial.test M tests/common/impala_test_suite.py A 
tests/custom_cluster/test_geospatial_udfs.py M tests/query_test/test_udfs.py 31 files changed, 3,422 insertions(+), 141 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/19425/4 -- To view, visit http://gerrit.cloudera.org:8080/19425 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225 Gerrit-Change-Number: 19425 Gerrit-PatchSet: 4 Gerrit-Owner: Peter Rozsa Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12187/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 4 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 13:59:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table For query statements that contain bucketed tables and have operations such as join, group by, sort by, etc., can use bucket shuffle join to optimize the execution plan, reduce the amount of data transferred, and reduce query latency. To ensure consistency with hive, the bucket hash is calculated using the same method that hive uses to calculate the hash value of a column. Add new query option as a function switch: ENABLE_BUCKET_SHUFFLE FRAGMENT_INSTANCE_BUCKET_NUM BUCKET_EXEC_BACKEND_RATIO Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M 
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 51 files changed, 2,186 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/4 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 4 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins
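The reduction in transferred data described in the commit message can be illustrated with a toy model, assuming both join inputs are bucketed on the join key with the same bucket count. The hash below is a stand-in chosen for readability, not the Hive column hash the patch refers to, and none of the names come from the patch.

```python
from collections import defaultdict


def bucket_of(key: int, num_buckets: int) -> int:
    # Stand-in hash (assumption): remainder of the key, always in [0, num_buckets).
    return key % num_buckets


def bucketize(rows, num_buckets):
    # Group (key, value) rows by bucket id, as a bucketed table would store them.
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[bucket_of(key, num_buckets)].append((key, value))
    return buckets


def bucket_local_join(left, right, num_buckets):
    # Equal keys always land in the same bucket id on both sides, so each
    # bucket can be joined on its own without moving rows between buckets.
    left_buckets = bucketize(left, num_buckets)
    right_buckets = bucketize(right, num_buckets)
    joined = []
    for bucket_id in range(num_buckets):
        right_by_key = defaultdict(list)
        for key, value in right_buckets.get(bucket_id, []):
            right_by_key[key].append(value)
        for key, left_value in left_buckets.get(bucket_id, []):
            for right_value in right_by_key.get(key, []):
                joined.append((key, left_value, right_value))
    return joined
```

In this model, joining rows keyed 1, 2, 5 against rows keyed 1, 5, 6 with four buckets only ever compares rows inside bucket 1 and bucket 2. The result depends on both sides using the same hash function, which is presumably why the commit message stresses consistency with Hive's hashing.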
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 3: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12186/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 3 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 13:30:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table For query statements that contain bucketed tables and have operations such as join, group by, sort by, etc., can use bucket shuffle join to optimize the execution plan, reduce the amount of data transferred, and reduce query latency. To ensure consistency with hive, the bucket hash is calculated using the same method that hive uses to calculate the hash value of a column. Add new query option as a function switch: ENABLE_BUCKET_SHUFFLE FRAGMENT_INSTANCE_BUCKET_NUM BUCKET_EXEC_BACKEND_RATIO Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M 
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 51 files changed, 2,186 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/3 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 3 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12185/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 12:45:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8973/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 12:36:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 4 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 12:36:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 3: Code-Review+2 Carrying Csaba's +2. -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 12:36:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/12184/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 12:33:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@12 PS1, Line 12: the table masking view can't expose complex type columns directly > Is it still true? If so, can't we / should we change table masking views so I see you've already created IMPALA-11847 for this. -- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Jan 2023 12:31:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Daniel Becker has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. IMPALA-11840: Error with joining unnest with views Queries fail in the following situation involving collections and views: 1. A view returns an array 2. A second view unnests the array returned from the first view 3. The unnested view is queried in an outer query For example: use functional_parquet; with sub as ( select id, arr1.item unnested_arr from complextypes_arrays_only_view, complextypes_arrays_only_view.int_array arr1) select id, unnested_arr from sub; ERROR: IllegalStateException: null The problem is that in CollectionTableRef.analyze(), if - there is a source view and - the collection ref is within a WITH clause and - it is not in the select list then 'desc_' is not set, but it has to be set in order for TableRef.analyzeJoin() to succeed. This commit solves the problem by assigning a value to 'desc_' also in the above case. Testing: - Added regression tests in nested-types-runtime.test. Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 --- M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 2 files changed, 104 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/19426/3 -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 3 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Daniel Becker has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. IMPALA-11840: Error with joining unnest with views Queries fail in the following situation involving collections and views: 1. A view returns an array 2. A second view unnests the array returned from the first view 3. the unnested view is queried in an outer query For example: use functional_parquet; with sub as ( select id, arr1.item unnested_arr from complextypes_arrays_only_view, complextypes_arrays_only_view.int_array arr1) select id, unnested_arr from sub; ERROR: IllegalStateException: null The problem is that in CollectionTableRef.analyze(), if - there is a source view and - the collection ref is within a WITH clause and - it is not in the select list then 'desc_' is not set, but it has to be set in order for TableRef.analyzeJoin() to succeed. This commit solves the problem by assigning a value to 'desc_' also in the above case. Testing: - Added regression tests in nested-types-runtime.test. Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 --- M fe/src/main/java/org/apache/impala/analysis/CollectionTableRef.java M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 2 files changed, 104 insertions(+), 19 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/19426/2 -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 2 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 2: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12183/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 2 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 12:07:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8972/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 2 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 11:54:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table For query statements that contain bucketed tables and have operations such as join, group by, sort by, etc., can use bucket shuffle join to optimize the execution plan, reduce the amount of data transferred, and reduce query latency. To ensure consistency with hive, the bucket hash is calculated using the same method that hive uses to calculate the hash value of a column. Add new query option as a function switch: ENABLE_BUCKET_SHUFFLE FRAGMENT_INSTANCE_BUCKET_NUM BUCKET_EXEC_BACKEND_RATIO Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M 
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 51 files changed, 2,187 insertions(+), 70 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/2 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 2 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 1: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/12182/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 1 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 11:42:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/runtime/coordinator-backend-state.cc@149 PS1, Line 149: *fragment_ctx->mutable_bucket_backend_map() = exec_params_.query_schedule().bucket_backend_map(); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/runtime/data-stream-test.cc File be/src/runtime/data-stream-test.cc: http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/runtime/data-stream-test.cc@624 PS1, Line 624: data_sink->tsink_->stream_sink, dest_, channel_buffer_size, &state, bucket_backend_map)); line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/scheduling/scheduler.cc@433 PS1, Line 433: const BackendDescriptorPB& backend_descriptor = LookUpBackendDesc(executor_config, host); line too long (93 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/be/src/scheduling/scheduler.cc@1046 PS1, Line 1046: scan_range_params.scan_range().bucket_id() % fragment_state->fragment.bucket_info.num_bucket) line too long (107 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@624 PS1, Line 624: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: 
http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2542 PS1, Line 2542:* Determine whether all selected partitions are bucketed by kudu hash under the same rules (whether the file naming rule is the same, and whether the total number of buckets is the same) line too long (145 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: http://gerrit.cloudera.org:8080/#/c/19430/1/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1207 PS1, Line 1207: if (1.0 * childBucketInfo.getNum_bucket() / numExecutors >= bucketExecBucketBackendRatio) { line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/19430/1/tests/query_test/test_bucket_shuffle.py File tests/query_test/test_bucket_shuffle.py: http://gerrit.cloudera.org:8080/#/c/19430/1/tests/query_test/test_bucket_shuffle.py@40 PS1, Line 40: flake8: W292 no newline at end of file -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 1 Gerrit-Owner: Baike Xia Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 18 Jan 2023 11:31:53 + Gerrit-HasComments: Yes
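The three kinds of findings in the comments above (lines longer than 90 characters, trailing whitespace, and a missing newline at end of file) can be reproduced locally before uploading a patch set. The following is a minimal sketch of such a check, not the script that the gerrit-code-review-checks job actually runs; `check_text` and `MAX_LINE_LENGTH` are hypothetical names, and the 90-character limit is taken from the messages above.

```python
MAX_LINE_LENGTH = 90  # limit quoted in the review comments above


def check_text(text, max_len=MAX_LINE_LENGTH):
    """Return a list of (line_number, message) pairs for simple style problems."""
    problems = []
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        if len(line) > max_len:
            problems.append(
                (number, "line too long (%d > %d)" % (len(line), max_len)))
        if line != line.rstrip():
            problems.append((number, "line has trailing whitespace"))
    # A non-empty file whose last character is not a newline (flake8 calls this W292).
    if text and not text.endswith("\n"):
        problems.append((len(lines), "no newline at end of file"))
    return problems


if __name__ == "__main__":
    sample = "short line\n" + "x" * 97 + "\ntrailing space \nlast line"
    for number, message in check_text(sample):
        print("Line %d: %s" % (number, message))
```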
[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table
Baike Xia has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19430 ) Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table .. IMPALA-3120: Support Bucket Shuffle Join for bucketed table Queries that read bucketed tables and contain operations such as join, group by, or sort by can use bucket shuffle join to optimize the execution plan, reduce the amount of data transferred, and reduce query latency. To ensure consistency with Hive, the bucket hash is calculated using the same method that Hive uses to calculate the hash value of a column. Add new query options to control the feature: ENABLE_BUCKET_SHUFFLE FRAGMENT_INSTANCE_BUCKET_NUM BUCKET_EXEC_BACKEND_RATIO Testing: - Add e2e tests - Add fe tests Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/initial-reservations.cc M be/src/runtime/initial-reservations.h M be/src/runtime/krpc-data-stream-sender-ir.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/scheduling/schedule-state.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/hash-util.h M common/protobuf/admission_control_service.proto M common/protobuf/control_service.proto M common/protobuf/planner.proto M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Partitions.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/TableDef.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M 
fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataPartition.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ScanNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/util/BucketUtils.java M fe/src/main/java/org/apache/impala/util/MathUtil.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/datasets/functional/functional_schema_template.sql A testdata/workloads/functional-planner/queries/PlannerTest/bucket-shuffle.test A testdata/workloads/functional-query/queries/QueryTest/bucket-shuffle.test A tests/query_test/test_bucket_shuffle.py 51 files changed, 2,176 insertions(+), 61 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19430/1 -- To view, visit http://gerrit.cloudera.org:8080/19430 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316 Gerrit-Change-Number: 19430 Gerrit-PatchSet: 1 Gerrit-Owner: Baike Xia
[Impala-ASF-CR] IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/19429 ) Change subject: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking .. Patch Set 1: (7 comments) http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-11845: Fix incorrect check of struct STAR path in resolvePathWithMasking Do I understand correctly that the error occurs if we try to query a star on a table/view, not when we try to query a star on a struct? Such as the example in the Jira: select v.* from masked_view v; ERROR: IllegalStateException: null If so, could you make this clear in the commit message, and include the example here? http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@10 PS1, Line 10: While Nit: When http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@11 PS1, Line 11: haven Nit: hadn't http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@12 PS1, Line 12: the table masking view can't expose complex type columns directly Is it still true? If so, can't we / should we change table masking views so that they can expose complex types too? http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@32 PS1, Line 32: - The type is STRUCT Could you make it clearer that these are the conditions for returning the paths directly? It wasn't obvious to me. http://gerrit.cloudera.org:8080/#/c/19429/1//COMMIT_MSG@33 PS1, Line 33: root Nit: rooted? http://gerrit.cloudera.org:8080/#/c/19429/1/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test File testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test: http://gerrit.cloudera.org:8080/#/c/19429/1/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test@63 PS1, Line 63: e Nit: expanding. 
-- To view, visit http://gerrit.cloudera.org:8080/19429 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f1e78e325baafbe23101909d47e82bf140a2d77 Gerrit-Change-Number: 19429 Gerrit-PatchSet: 1 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 18 Jan 2023 11:07:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19419 ) Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8971/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/19419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Gerrit-Change-Number: 19419 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 10:24:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/19419 ) Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/19419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Gerrit-Change-Number: 19419 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 10:24:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11798: Property 'external.table.purge' should not be ignored when CREATE Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/19416 ) Change subject: IMPALA-11798: Property 'external.table.purge' should not be ignored when CREATE Iceberg tables .. Patch Set 1: (1 comment) Thanks for this nice change! I only had one question about the default value. http://gerrit.cloudera.org:8080/#/c/19416/1/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/19416/1/fe/src/main/java/org/apache/impala/catalog/Table.java@191 PS1, Line 191: TRUE What is the reason behind setting the default to true for external tables? -- To view, visit http://gerrit.cloudera.org:8080/19416 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2649dd38fbe050044817d6c425ef447245aa2829 Gerrit-Change-Number: 19416 Gerrit-PatchSet: 1 Gerrit-Owner: Anonymous Coward Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 10:06:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/19419 ) Change subject: IMPALA-11826: Avoid calling planFiles() on Iceberg V2 tables when there are no predicates .. Patch Set 3: Code-Review+2 Thank you for this change! LGTM! -- To view, visit http://gerrit.cloudera.org:8080/19419 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia46bd2dce248a9e096fc1c0bd914fc3fa4686fb0 Gerrit-Change-Number: 19419 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Anonymous Coward Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Jan 2023 09:52:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11840: Error with joining unnest with views
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/19426 ) Change subject: IMPALA-11840: Error with joining unnest with views .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/19426/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19426/1//COMMIT_MSG@9 PS1, Line 9: Queries fail when an array is taken from a view, unnested in another view and then the result queried in an outer query. I think that mentioning that this can only occur with at least 2D collections would make this easier to understand: the first view returns the outer array, and the second view unnests it and returns the inner array. -- To view, visit http://gerrit.cloudera.org:8080/19426 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic52655631944913553a7e7d9e9169b93da46dde3 Gerrit-Change-Number: 19426 Gerrit-PatchSet: 1 Gerrit-Owner: Daniel Becker Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Peter Rozsa Gerrit-Comment-Date: Wed, 18 Jan 2023 08:49:27 + Gerrit-HasComments: Yes