Mike Percy has submitted this change and it was merged.

Change subject: Control mutex stack walking in DEBUG mode with a gflag
......................................................................


Control mutex stack walking in DEBUG mode with a gflag

This patch disables the Mutex owner stack trace collection on DEBUG
builds by default, only enabling it when a certain gflag is set.

In DEBUG mode, our Mutex implementation collects a stack trace of the
owning thread each time the Mutex is acquired. It does this by calling
google::GetStackTrace() from glog, which in the context of the Kudu
build environment calls into libunwind to collect the stack trace.

At the time of writing, google::GetStackTrace() only allows access by
one thread at a time. If more than one thread invokes this function
simultaneously, a compare-and-swap (CAS) decides which thread proceeds.
The "loser" of this race returns immediately with an empty stack trace,
indicating a failure to collect the stack trace.

NB: I have filed a glog issue about that behavior upstream. For more
information, see https://github.com/google/glog/issues/160
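
As a rough illustration of the single-caller behavior described above,
here is a minimal sketch. It is not glog's actual code: the names
g_collector_busy and CollectStackSketch are hypothetical, and the "stack
walk" is a placeholder that records one dummy frame.

```cpp
#include <atomic>

// Hypothetical guard flag; illustrative only, not taken from glog.
static std::atomic<bool> g_collector_busy{false};

// Dummy object whose address stands in for a real frame pointer.
static int g_dummy_frame = 0;

// Returns the number of frames written to 'frames'. Returns 0 (an empty
// trace) if another caller already holds the collector, i.e. this caller
// lost the CAS.
int CollectStackSketch(void** frames, int max_frames) {
  bool expected = false;
  if (!g_collector_busy.compare_exchange_strong(expected, true)) {
    return 0;  // Short-circuit: someone else is collecting right now.
  }
  int n = 0;
  if (max_frames > 0) {
    frames[n++] = &g_dummy_frame;  // Placeholder for the real stack walk.
  }
  g_collector_busy.store(false);  // Let the next caller in.
  return n;
}
```

In this sketch, a caller that sees a return value of 0 cannot tell a
contended collector apart from a genuinely empty stack, which matches the
symptom of an empty trace described in this message.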

This situation becomes a problem when there are one or more Mutexes
constantly being acquired. When that happens, there is always a thread
collecting a stack trace, and so the probability of being able to
successfully collect a stack trace at any given moment is greatly
reduced.

One important caller of google::GetStackTrace() is the glog failure
function and SIGABRT signal handler, which runs when a CHECK() fails or
LOG(FATAL) is invoked. I have observed that this crash handler
often prints an empty stack trace in DEBUG mode. Investigating this
issue led me to discover that we had a thread (our AsyncLogger thread)
constantly acquiring a Mutex and racing on the above-mentioned CAS check
inside google::GetStackTrace(). Depriving our DEBUG builds of stack
traces on LOG(FATAL) or CHECK failures, especially on Jenkins runs, is
counterproductive. One simple solution to this problem is to disable
this behavior by default.
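
The general shape of the fix can be sketched as follows. This is not the
code in mutex.cc: the flag is modeled as a plain global bool because the
actual gflag name is not stated here, and g_collect_owner_stack,
WalkStackSketch and DebugMutexSketch are hypothetical names. In a real
build the owner-stack bookkeeping would presumably sit behind a
DEBUG-only guard.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for the gflag; off by default, as in this patch.
static bool g_collect_owner_stack = false;

// Counts how often the (placeholder) stack walk runs.
static int g_stack_walks = 0;

// Hypothetical placeholder for an expensive stack walk.
std::string WalkStackSketch() {
  ++g_stack_walks;
  return "<stack>";
}

class DebugMutexSketch {
 public:
  void Acquire() {
    mu_.lock();
    // Only pay for the stack walk when the flag is enabled.
    if (g_collect_owner_stack) {
      owner_stack_ = WalkStackSketch();
    }
  }

  void Release() {
    owner_stack_.clear();  // The mutex no longer has an owner.
    mu_.unlock();
  }

  // Stack recorded at the last acquisition; empty when collection is off.
  const std::string& owner_stack() const { return owner_stack_; }

 private:
  std::mutex mu_;
  std::string owner_stack_;
};
```

With the flag left at its default, acquiring the mutex never touches the
stack collector, so a hot lock no longer competes with the crash handler
for it.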

Change-Id: Ie4593cf7173867ce2f6151e03df0be94f97d95d2
Reviewed-on: http://gerrit.cloudera.org:8080/5741
Tested-by: Mike Percy <[email protected]>
Reviewed-by: Adar Dembo <[email protected]>
---
M src/kudu/util/mutex.cc
M src/kudu/util/mutex.h
2 files changed, 37 insertions(+), 8 deletions(-)

Approvals:
  Mike Percy: Verified
  Adar Dembo: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5741
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie4593cf7173867ce2f6151e03df0be94f97d95d2
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
