[Impala-ASF-CR] IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17095 ) Change subject: IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing .. Patch Set 1: > Patch Set 1: Code-Review+1 > > (10 comments) > > Thanks Riza for working on this fix and also Joe for providing the analysis! > I really appreciate it. :-) > > I am able to reproduce the issue according to suggested steps at IMPALA-9767 > by Riza. I have pushed the patch that reproduces the issue to > https://gerrit.cloudera.org/c/17103/ so that it would be easier for people to > reproduce the issue. Will abandon this patch after the fix by Riza has been > merged. > > Only TPC-DS Q1 with the tpcds/tpcds_parquet database are needed to reproduce > the issue. Building Impala with the 'asan' flag is also required to reproduce > the issue. It seems it is unlikely to crash the coordinator in the local > environment without the 'asan' flag. > > I also verified locally that Riza's patch could avoid that > heap-use-after-free error in my local development environment. > > I only have some minor comments on the commit message. Just to make one of my comments above clearer. I verified locally that before cleaning up the memory consumed by the aggregated Bloom filter, by calling "state->WaitForPublishFilter()" to wait for the completion of the last invocation to Coordinator::BackendState::PublishFilterCompleteCb(), the coordinator won't crash (no heap-use-after-free error). This indicates Riza's fix that defers the release of memory until the completion of the last invocation to Coordinator::BackendState::PublishFilterCompleteCb() would work. -- To view, visit http://gerrit.cloudera.org:8080/17095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08 Gerrit-Change-Number: 17095 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sun, 21 Feb 2021 22:19:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17095 ) Change subject: IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing .. Patch Set 1: Code-Review+1 (10 comments) Thanks Riza for working on this fix and also Joe for providing the analysis! I really appreciate it. :-) I am able to reproduce the issue according to suggested steps at IMPALA-9767 by Riza. I have pushed the patch that reproduces the issue to https://gerrit.cloudera.org/c/17103/ so that it would be easier for people to reproduce the issue. Will abandon this patch after the fix by Riza has been merged. Only TPC-DS Q1 with the tpcds/tpcds_parquet database are needed to reproduce the issue. Building Impala with the 'asan' flag is also required to reproduce the issue. It seems it is unlikely to crash the coordinator in the local environment without the 'asan' flag. I also verified locally that Riza's patch could avoid that heap-use-after-free error in my local development environment. I only have some minor comments on the commit message. http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@7 PS1, Line 7: cleanup nit: clean up http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@9 PS1, Line 9: has been occurence nit: have been occurrences http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@10 PS1, Line 10: publish nit: publishing http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@10 PS1, Line 10: bloom nit: a Bloom http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@11 PS1, Line 11: KRPC still sending the filter nit: the coordinator is still sending the aggregated filter via KRPC http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@12 PS1, Line 12: decouple nit: decouples http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@12 PS1, Line 12: remove nit: removes http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@13 PS1, Line 13: separate nit: a separate http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@14 PS1, Line 14: assert nit: asserts http://gerrit.cloudera.org:8080/#/c/17095/1//COMMIT_MSG@15 PS1, Line 15: filter nit: the filter -- To view, visit http://gerrit.cloudera.org:8080/17095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08 Gerrit-Change-Number: 17095 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sun, 21 Feb 2021 21:13:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17095 ) Change subject: IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/8181/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/17095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08 Gerrit-Change-Number: 17095 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sat, 20 Feb 2021 18:45:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing
Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17095 Change subject: IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing .. IMPALA-9767: Do not cleanup filter while PublishFilter is ongoing There has been occurence of heap-use-after-free in ASAN build during runtime filter publish. The root cause of this issue is that bloom filter is being cleaned up while KRPC still sending the filter to workers. This patch remove the offending cleanup routine, decouple cleanup routine from FilterState::DisableAndRelease() into separate method FilterState::Release(), and assert that no rpc is inflight while cleaning up filter. Testing: - Reproduce the bug by instrumenting Coordinator::UpdateFilter(). - Manually verify that the bug does not happen anymore after the patch. - Pass core tests in ASAN build. Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc 3 files changed, 18 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17095/1 -- To view, visit http://gerrit.cloudera.org:8080/17095 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08 Gerrit-Change-Number: 17095 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto