Hello Thomas Marshall, Joe McDonnell, Bikramjeet Vig, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13057
to look at the new patch set (#2).
Change subject: IMPALA-8270: fix MemTracker teardown in FeSupport
......................................................................
IMPALA-8270: fix MemTracker teardown in FeSupport
This patch tries to simplify and standardise the order
in which control structures are torn down. As a
consequence the bug is fixed. I've described the bug
below.
The changes are:
* Make more control structures owned directly by
QueryState::obj_pool_, so that they are all
destroyed at the same time via ~QueryState.
* Tear down local_query_state_ explicitly before
other destructors run.
Either change is sufficient to fix the bug, but
I preferred to do both to reduce the chances
of similar bugs in future.
Description of bug:
===================
In the normal query execution flow:
- RuntimeState is in QueryState::obj_pool_
- RuntimeState owns RuntimeState::instance_mem_tracker_ via scoped_ptr
- QueryState::query_mem_tracker_ is in QueryState::obj_pool_
- QueryState::query_mem_tracker_ has a reference to
RuntimeState::instance_mem_tracker_
The tear-down works because ~QueryState unregisters query_mem_tracker_
from its parent, making the whole subtree unreachable before destroying
QueryState::obj_pool_. It is thus safe to destroy instance_mem_tracker_
along with the rest of obj_pool_.
FeSupport messes this up by having RuntimeState own the QueryState
RuntimeState::local_query_state_ via a scoped_ptr, and the implied
destructor order means that RuntimeState::instance_mem_tracker_ is
destroyed before RuntimeState::local_query_state_, which breaks the
above flow and the destroyed instance_mem_tracker_ is reachable from
the process MemTracker via QueryState::query_mem_tracker_ for a
small window until it is unregistered.
Testing:
Added a backend test that reproduced the ASAN use-after-free
failure when run against unmodified RuntimeState code. I did
not make it a unified backend test so that it would be easier
to backport this fix to older versions that don't have unified
tests.
Change-Id: If815130cd4db00917746f10b28514f779ee254f0
---
M be/src/runtime/CMakeLists.txt
A be/src/runtime/runtime-state-test.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
4 files changed, 92 insertions(+), 12 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/13057/2
--
To view, visit http://gerrit.cloudera.org:8080/13057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If815130cd4db00917746f10b28514f779ee254f0
Gerrit-Change-Number: 13057
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>