Todd Lipcon has submitted this change and it was merged.

Change subject: mem_tracker: fix race between FindTracker() and destructor
......................................................................


mem_tracker: fix race between FindTracker() and destructor

In very rare cases, it's possible for an interleaving between these two
functions to lead to a recursive lock acquisition of child_trackers_lock_
in the destructor.

For example:
1. The tracker hierarchy contains one parent (P) and one child (C1).
2. Thread 1 creates a second child (C2) parented to P. It has the sole ref
   to C2.
3. Thread 2 calls FindTracker() looking for C1.
4. Thread 2 runs as far as the loop in FindTrackerUnlocked() and is
   descheduled just after it has taken a ref to C2. It still holds P's
   child_trackers_lock_.
5. Thread 1 is rescheduled and drops its ref to C2.
6. Thread 2 is rescheduled and drops its ref to C2. That was the last ref,
   so Thread 2 runs C2's destructor, which tries to acquire P's
   child_trackers_lock_, a lock Thread 2 already holds. Boom.
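
The interleaving above can be replayed deterministically on a single thread.
The following is only an illustrative sketch, not the Kudu code: DemoTracker,
CheckedLock, AddChild() and ReplayInterleaving() are hypothetical names, and
CheckedLock is a stand-in that reports a second acquisition instead of
hanging the way a real non-recursive mutex would.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a non-recursive mutex, valid for this
// single-threaded replay only: a second Acquire() while the lock is held
// returns false instead of deadlocking.
class CheckedLock {
 public:
  bool Acquire() {
    if (held_) return false;
    held_ = true;
    return true;
  }
  void Release() { held_ = false; }

 private:
  bool held_ = false;
};

// Set by the destructor when it finds the parent's lock already held.
bool g_recursive_acquire_seen = false;

struct DemoTracker {
  DemoTracker(std::string id_in, std::shared_ptr<DemoTracker> parent_in)
      : id(std::move(id_in)), parent(std::move(parent_in)) {}

  // Like the destructor described above, this takes the parent's child lock
  // in order to unlink the dying tracker.
  ~DemoTracker() {
    if (!parent) return;
    if (!parent->child_lock.Acquire()) {
      g_recursive_acquire_seen = true;  // a real mutex would hang here
      return;
    }
    parent->children.remove_if(
        [](const std::weak_ptr<DemoTracker>& w) { return w.expired(); });
    parent->child_lock.Release();
  }

  std::string id;
  std::shared_ptr<DemoTracker> parent;
  CheckedLock child_lock;
  std::list<std::weak_ptr<DemoTracker>> children;
};

std::shared_ptr<DemoTracker> AddChild(
    const std::shared_ptr<DemoTracker>& parent, std::string id) {
  auto child = std::make_shared<DemoTracker>(std::move(id), parent);
  parent->child_lock.Acquire();
  parent->children.push_back(child);
  parent->child_lock.Release();
  return child;
}

// Replays steps 1-6 in order; returns true if the destructor ran while the
// parent's child lock was still held.
bool ReplayInterleaving() {
  g_recursive_acquire_seen = false;
  auto p = std::make_shared<DemoTracker>("P", nullptr);
  auto c1 = AddChild(p, "C1");  // step 1
  auto c2 = AddChild(p, "C2");  // step 2: "thread 1" has the sole ref
  p->child_lock.Acquire();      // steps 3-4: "thread 2" starts its lookup
  std::shared_ptr<DemoTracker> ref = p->children.back().lock();
  c2.reset();                   // step 5: "thread 1" drops its ref
  ref.reset();                  // step 6: last ref dropped under the lock
  p->child_lock.Release();
  return g_recursive_acquire_seen;
}
```

Calling ReplayInterleaving() returns true, because the last ref to C2 is
dropped while the parent's child lock is still held.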

The fix is simple: when looking up a tracker, make a local copy of the list
of children while holding child_trackers_lock_, then iterate without it.
It gets a little complicated due to the unusual requirements of
FindOrCreateTracker(); I ended up introducing a special static lock to
handle that case. The rest of the changes are basically code motion.
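
The snapshot approach might look roughly like the following. This is a
minimal sketch under assumptions, not the actual patch: Tracker, Create(),
FindChild() and NumChildren() are hypothetical names, the children are
assumed to be held as weak pointers, and the special static lock used for
FindOrCreateTracker() is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class Tracker {
 public:
  // Creates a tracker and, if a parent is given, registers it as a child.
  static std::shared_ptr<Tracker> Create(
      std::string id, std::shared_ptr<Tracker> parent = nullptr) {
    std::shared_ptr<Tracker> t(new Tracker(std::move(id), parent));
    if (parent) {
      std::lock_guard<std::mutex> l(parent->child_lock_);
      parent->children_.push_back(t);
    }
    return t;
  }

  // Unlinks expired children (including this one) from the parent. This
  // needs the parent's child lock, so it must never run while the calling
  // thread already holds that lock.
  ~Tracker() {
    if (!parent_) return;
    std::lock_guard<std::mutex> l(parent_->child_lock_);
    parent_->children_.remove_if(
        [](const std::weak_ptr<Tracker>& w) { return w.expired(); });
  }

  // Looks up a direct child by id using a local snapshot of the children.
  std::shared_ptr<Tracker> FindChild(const std::string& id) {
    std::vector<std::shared_ptr<Tracker>> snapshot;
    {
      // Hold the lock only long enough to copy the live children.
      std::lock_guard<std::mutex> l(child_lock_);
      for (const auto& w : children_) {
        if (auto child = w.lock()) snapshot.push_back(std::move(child));
      }
    }
    // The lock is released here. If the snapshot ends up holding the last
    // ref to some child, that child's destructor runs when the snapshot is
    // destroyed and can take child_lock_ without recursing.
    for (const auto& child : snapshot) {
      if (child->id_ == id) return child;
    }
    return nullptr;
  }

  std::size_t NumChildren() {
    std::lock_guard<std::mutex> l(child_lock_);
    return children_.size();
  }

 private:
  Tracker(std::string id, std::shared_ptr<Tracker> parent)
      : id_(std::move(id)), parent_(std::move(parent)) {}

  const std::string id_;
  const std::shared_ptr<Tracker> parent_;
  std::mutex child_lock_;
  std::list<std::weak_ptr<Tracker>> children_;
};
```

The cost of this design is one extra copy of the child list per lookup; in
exchange, no tracker ref is ever dropped while child_lock_ is held.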

Unfortunately, the new test does not fail reliably without the fix. In my
testing, it failed maybe once every few hundred runs. But it doesn't fail at
all with the fix in place.

Change-Id: I69610f782c7a00d161bfb48d2487c29c0309c985
Reviewed-on: http://gerrit.cloudera.org:8080/4614
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <[email protected]>
---
M src/kudu/consensus/log_cache.cc
M src/kudu/util/cache.cc
M src/kudu/util/mem_tracker-test.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
5 files changed, 78 insertions(+), 56 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4614

Gerrit-MessageType: merged
Gerrit-Change-Id: I69610f782c7a00d161bfb48d2487c29c0309c985
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
