[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 9: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 19 Jun 2024 09:32:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 19 Jun 2024 02:01:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10733/ -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 21:56:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16390/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 21:04:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#9). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 (cherry picked from commit c42de800a191ac82dcd676900991c6f5dbf31a76) --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 162 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/9 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10737/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 9 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 20:39:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16385/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 16:57:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#8). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 162 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/8 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10733/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 8 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 16:23:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10731/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 15:13:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 7: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/21452/7/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/7/be/src/exec/join-builder.h@214 PS7, Line 214: protected: Optional: we could put these protected functions before the fields (L157), then we wouldn't have two separate "protected" sections. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 14:47:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16382/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 14:40:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266 PS5, Line 266: int probes_waiting; : { : unique_lock l(separate_build_lock_); : probes_waiting = probes_waiting_on_builder_; : } > It could still be made private within the block, but that would leave us wi I've made probes_waiting_on_builder_ private. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275 PS5, Line 275: min(data_file_count, probes_waiting) * 2 / 3); > Actually I do too. But it would also fit on one line, or do you want to emp I think when there are nested expressions then line breaks can help readability. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 14:16:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#7). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 (cherry picked from commit 74a942d2a38555c39bc51d61a05b1c90712c) --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 163 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/7 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 6: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266 PS5, Line 266: int probes_waiting = NumProbesWaiting(); : : // Let's be conservative and only use 2/3 of available theads, because the scanner : // threads are not as CPU heavy as the sort threads. : r > Yeah, I think it's nice to extract it in a function. I didn't make 'probes_ It could still be made private within the block, but that would leave us with separate "protected" blocks, so yes, it's a tradeoff. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275 PS5, Line 275: for (auto& [path, intermediate_vec] : intermediate_delete_rows_) { > For formulas I prefer readability over the 4 spaces rule. Actually I do too. But it would also fit on one line, or do you want to emphasise that these are the two parameters? http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@296 PS5, Line 296: TURN_IF_ERROR(pool.Init()); : for (auto& [path, intermediate_vec] : intermediate_delete_rows_) { : BuildWorkItem work_item; : work_item.path = > Thanks for the suggestions. Ok, I agree that keeping RAII is more important. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 18 Jun 2024 09:24:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10726/ -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Jun 2024 21:45:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16371/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Jun 2024 17:02:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10726/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Jun 2024 16:39:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 6: (6 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266 PS5, Line 266: int probes_waiting = NumProbesWaiting(); : : // Let's be conservative and only use 2/3 of available theads, because the scanner : // threads are not as CPU heavy as the sort threads. : r > Do you think it would be worth extracting this to a member function in Join Yeah, I think it's nice to extract it in a function. I didn't make 'probes_waiting_on_builder_' private though, as currently it is in a nice /// BEGIN /// END section of members that I did not want to mess up. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275 PS5, Line 275: for (auto& [path, intermediate_vec] : intermediate_delete_rows_) { > Nit: we usually don't match the indentation of arguments but use 4 spaces f For formulas I prefer readability over the 4 spaces rule. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@279 PS5, Line 279: > Using the actual type would be more readable. Applies also to L322 and L324 Actual type is a pair, I switched to structured binding. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@296 PS5, Line 296: TURN_IF_ERROR(pool.Init()); : for (auto& [path, intermediate_vec] : intermediate_delete_rows_) { : BuildWorkItem work_item; : work_item.path = > This is the same as the loop body of FinalizeBuildImpl() with these excepti Thanks for the suggestions. About 1: I'd have to do something like: if (SHOULD_LOCK) separate_build_lock_.lock() deleted_rows_[*path] = std::move(final_delete_vec); if (SHOULD_LOCK) separate_build_lock_.unlock(); I don't really like such constructs as we lose RAII. To keep RAII, we could do: std::unique_lock guard(separate_build_lock_, std::defer_lock_t{}); if (SHOULD_LOCK) { guard.lock(); } deleted_rows_[*path] = std::move(final_delete_vec); But it feels a bit too complex, also we are creating an unnecessary guard object when SHOULD_LOCK is false (though the compiler might be smart enough to eliminate). So for now I'd leave it in its current form. I'd also keep BuildWorkItem as it makes the code a bit more explicit / readable. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@304 PS5, Line 304: medi > Using the actual type would be more readable. Switched to structured binding. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@319 PS5, Line 319: > Because 'intermediate_vec' is a const ref, this will not actually move the Nice catch! -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Jun 2024 16:39:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#6). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 (cherry picked from commit 74a942d2a38555c39bc51d61a05b1c90712c) --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 160 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/6 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.h File be/src/exec/iceberg-delete-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.h@148 PS5, Line 148: Impl To unambiguously differentiate it from FinalizeBuildImplParallel(), this could be called FinalizeBuildImplSequential(). http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@266 PS5, Line 266: int probes_waiting; : { : unique_lock l(separate_build_lock_); : probes_waiting = probes_waiting_on_builder_; : } Do you think it would be worth extracting this to a member function in JoinBuilder? Then 'probes_waiting_on_builder_' could also be private. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@275 PS5, Line 275: min(data_file_count, probes_waiting) * 2 / 3); Nit: we usually don't match the indentation of arguments but use 4 spaces for continuation indentation. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@279 PS5, Line 279: auto Using the actual type would be more readable. Applies also to L322 and L324. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@296 PS5, Line 296: DeleteRowVector final_delete_vec = GetFinalDeleteRowVector(intermediate_vec); : intermediate_vec.clear(); : unique_lock l(separate_build_lock_); : deleted_rows_[*path] = std::move(final_delete_vec); This is the same as the loop body of FinalizeBuildImpl() with these exceptions: 1. there is locking here 2. this function uses BuildWorkItem instead of std::pair. These could be unified in the same function so there would be no need for this lambda: 1. a boolean template parameter would decide whether we need to lock and the function could use "constexpr if" 2. the ThreadPool could also use std::pair as its template parameter, so BuildWorkItem would no longer be needed. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@304 PS5, Line 304: auto Using the actual type would be more readable. http://gerrit.cloudera.org:8080/#/c/21452/5/be/src/exec/iceberg-delete-builder.cc@319 PS5, Line 319: move Because 'intermediate_vec' is a const ref, this will not actually move the vector but copy it, see https://stackoverflow.com/questions/28595117/why-can-we-use-stdmove-on-a-const-object So I think 'intermediate_vec' should not be const and then 'intermediate_vec.clear()' could be brought here from L297 and L282. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Jun 2024 14:26:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10720/ -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 15 Jun 2024 13:54:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16358/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 15 Jun 2024 09:17:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#5). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 160 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/5 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10720/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 15 Jun 2024 08:53:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/10715/ -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Jun 2024 21:08:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/10715/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Jun 2024 15:56:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16349/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 14 Jun 2024 14:52:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#4). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. Testing: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h M testdata/workloads/functional-query/queries/QueryTest/iceberg-update-stress.test M tests/stress/test_update_stress.py 6 files changed, 159 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/4 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Gabor Kaszab, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#3). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many probe fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. This also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it reduced "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. TODO: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h 4 files changed, 105 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/3 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 2: (7 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@33 PS2, Line 33: build fragments > aren't these 'probe' fragments that are blocked on the builder? Thanks for catching this, done. http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@37 PS2, Line 37: The > nit: typo Done http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@57 PS2, Line 57: lowered > reduced? Done http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h File be/src/exec/iceberg-delete-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h@147 PS2, Line 147: Status FinalizeBuildImplParallel(int num_threads); > Since this one can return error codes too, could you add a comment in what Done http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@314 PS2, Line 314: IcebergDeleteBuilder::DeleteRowVector IcebergDeleteBuilder::GetFinalDeleteRowVector( > This and the other PR made me think about whther it makes sense to create a That's a good idea. However, I have another Jira ticket to throw out all of these, or at least the final delete vectors, and use RoaringBitmap: IMPALA-13109 I'm afraid I'd have to throw out half of the work with that change. So probably it would be better to do the encapsulation and refactoring for the RoaringBitmap. http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@320 PS2, Line 320: int64_t capacity = 0; > indentation Done http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h@197 PS2, Line 197: int probes_waiting_on_builder_ = 0; > The variable name is self-explanatory, but some additional comment that is Done -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Jun 2024 16:59:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16335/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 13 Jun 2024 17:26:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@33 PS2, Line 33: build fragments aren't these 'probe' fragments that are blocked on the builder? http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@37 PS2, Line 37: The nit: typo http://gerrit.cloudera.org:8080/#/c/21452/2//COMMIT_MSG@57 PS2, Line 57: lowered reduced? http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h File be/src/exec/iceberg-delete-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.h@147 PS2, Line 147: Status FinalizeBuildImplParallel(int num_threads); Since this one can return error codes too, could you add a comment in what cases can we except errors? http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc File be/src/exec/iceberg-delete-builder.cc: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@314 PS2, Line 314: IcebergDeleteBuilder::DeleteRowVector IcebergDeleteBuilder::GetFinalDeleteRowVector( This and the other PR made me think about whther it makes sense to create a separate class for this DeleteRowVector where we can encapsulate all the logic that is needed to populate the vector of vectors, allocating new vectors with increasing size, and also to flatten the vector of vectors into one vector. It would reduce code complexity from the builder code and would give the responsibility of representing the positions in a more efficient way to another code-level. What do you think? http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/iceberg-delete-builder.cc@320 PS2, Line 320: int64_t capacity = 0; indentation http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h File be/src/exec/join-builder.h: http://gerrit.cloudera.org:8080/#/c/21452/2/be/src/exec/join-builder.h@197 PS2, Line 197: int probes_waiting_on_builder_ = 0; The variable name is self-explanatory, but some additional comment that is a bit more verbose could be useful. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 31 May 2024 08:51:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16213/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 23 May 2024 13:20:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/21452 ) Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/16212/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 23 May 2024 13:09:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/21452 to look at the new patch set (#2). Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many build fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. The also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it lowered "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. TODO: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h 4 files changed, 105 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/2 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder
Zoltan Borok-Nagy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/21452 Change subject: IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder .. IMPALA-13088: (part 2) Parallelize final sorts in IcebergDeleteBuilder With this patch IcebergDeleteBuilder checks how many probe threads are actually blocked on the builder. Let's assume the following plan: UNION ALL / \ / \ / \ SCAN allANTI JOIN datafiles / \ without / \ deletes SCAN SCAN datafilesdeletes with deletes In that case UNION ALL, and the two "SCAN datafiles" operators are in the same fragment, while the builder of the ANTI JOIN is in a different fragment. This means that "SCAN datafiles without deletes" can run in parallel with the builder. But once that SCAN is exhausted, the UNION ALL will drain rows from "SCAN datafiles with deletes" via the ANTI JOIN operator, but that operator depends on the join builder output. This means in some cases the SCAN fragments are busy, while in other cases the SCAN fragments are blocked. It depends on how much work they need to do, and how much work the build-side needs to do. So to handle all cases, we dynamically check how many build fragments are blocked on the builder, then spin up as many threads to parellelize the final sort. The also works well when we have the following plan: ANTI JOIN / \ / \ SCAN SCAN datafilesdeletes with deletes The above plan is created when all data files have corresponding deletes, or when we are running a simple count(*) query. In that case all "SCAN datafiles" fragments are blocked on the builder, so we can use that many threads to sort the build results. A new field "ThreadCountInFinalBuild" was added, so we can check the query profile about how many threads were used for the final sorting in the builders. Measurements: In a table with 1 Trillion data records and 68.5 Billion delete records it lowered "IcebergDeletePositionSortTimer" from ~1 minute to 8-10 seconds, in an environment with 40 executors and MT_DOP=12. TODO: * e2e tests that check counter "ThreadCountInFinalBuild" Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 --- M be/src/exec/iceberg-delete-builder.cc M be/src/exec/iceberg-delete-builder.h M be/src/exec/join-builder.cc M be/src/exec/join-builder.h 4 files changed, 102 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/21452/1 -- To view, visit http://gerrit.cloudera.org:8080/21452 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7ca946a452d061238255e9b0e2c81a51cac68807 Gerrit-Change-Number: 21452 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy