Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10230 )


Change subject: pstack_watcher: blacklist older versions of gdb
......................................................................

pstack_watcher: blacklist older versions of gdb

Despite commit 6ed4690, I was still seeing timeouts in pstack_watcher-test
on CentOS 6.6. After going down a rabbit hole, I think I found the root
cause: a pretty-printing bug that causes gdb to read from uninitialized
memory. In theory the symptom varies, but in my case gdb spent a very
long time reading the memory backing 'std::string exit_info' in
PstackWatcher::RunStackDump. Because 'exit_info' was uninitialized at the
time of the read (i.e. while the test was blocked in Subprocess::Wait()),
its length contained garbage, and sometimes (varied with each execution of
the test binary) that garbage was a very large number. In such cases, gdb
would faithfully read the traced process's memory word by word. This looked
like an infinite loop, but attaching strace to gdb showed that it was still
making progress through that read.
Eventually the test would time out.
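
To give a feel for the scale involved, here is a rough sketch of the
arithmetic, not code from this change. The helper name WordReadsFor and the
default word size are assumptions made for illustration; the message above
only says that gdb read the memory word by word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many word-sized reads it takes to copy
// `length_bytes` out of another process when each read returns exactly one
// machine word. This is a ceiling division written to avoid overflow.
uint64_t WordReadsFor(uint64_t length_bytes,
                      uint64_t word_size = sizeof(long)) {
  assert(word_size > 0);
  uint64_t full_words = length_bytes / word_size;
  uint64_t has_partial_word = (length_bytes % word_size != 0) ? 1 : 0;
  return full_words + has_partial_word;
}
```

With 8-byte words, a sane 16-byte string needs 2 reads, while a garbage
length of 2^40 bytes needs 2^37 reads, which is consistent with a read that
outlasts the test timeout.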

I couldn't pinpoint the bug with precision, nor could I figure out exactly
when it was fixed, so I ended up blacklisting versions older than an
"oldest known good" version. But I'm somewhat confident that this is the
root cause behind all of pstack_watcher-test's gdb-related issues.
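
The kind of check described above might look roughly like the following
sketch. It is not the code in pstack_watcher.cc: the function names, the
parsing approach, and in particular the 7.7 threshold are assumptions made
for illustration, since this message does not state the actual "oldest
known good" version.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <tuple>

// ASSUMPTION: placeholder threshold for illustration only. The real
// "oldest known good" gdb version is not given in the commit message.
constexpr int kAssumedMinMajor = 7;
constexpr int kAssumedMinMinor = 7;

static bool IsDigit(char c) {
  return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Finds the first "<digits>.<digits>" token in the first line of
// `gdb --version` output and stores it as major/minor. Returns false if
// no such token exists.
bool ParseGdbVersion(const std::string& line, int* major_out, int* minor_out) {
  assert(major_out != nullptr && minor_out != nullptr);
  size_t i = 0;
  while (i < line.size()) {
    if (!IsDigit(line[i])) {
      ++i;
      continue;
    }
    // Consume one run of digits as a candidate major version.
    size_t j = i;
    int maj = 0;
    while (j < line.size() && IsDigit(line[j])) maj = maj * 10 + (line[j++] - '0');
    // Accept it only if it is followed by '.' and another digit.
    if (j + 1 < line.size() && line[j] == '.' && IsDigit(line[j + 1])) {
      ++j;
      int mnr = 0;
      while (j < line.size() && IsDigit(line[j])) mnr = mnr * 10 + (line[j++] - '0');
      *major_out = maj;
      *minor_out = mnr;
      return true;
    }
    i = j;  // Not a version token; keep scanning after this digit run.
  }
  return false;
}

// Compares numerically, so that 7.11 counts as newer than 7.7.
bool IsGdbNewEnough(int major, int minor) {
  return std::make_tuple(major, minor) >=
         std::make_tuple(kAssumedMinMajor, kAssumedMinMinor);
}
```

A caller would use gdb only when both ParseGdbVersion() and
IsGdbNewEnough() succeed, and otherwise fall back to pstack, as in the el6
case listed below. Comparing the components as integers matters here: a
plain string comparison would rank "7.11" below "7.7".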

Here are the combinations I manually tested:
- el6: gdb was too old and pstack was used.
- el6 with devtoolset-3: gdb was used.
- Ubuntu 14.04: gdb was used.
- Ubuntu 16.04: gdb was used.
In every case I couldn't repro the original timeout.

Change-Id: I8486a22462c12cbb88f6fe49cd54e82543f066bb
Reviewed-on: http://gerrit.cloudera.org:8080/9962
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <[email protected]>
(cherry picked from commit 70e1a9caa08cab94bd36f5ddc2817f28f6028ec8)
---
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/pstack_watcher.h
2 files changed, 69 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/10230/1
--
To view, visit http://gerrit.cloudera.org:8080/10230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8486a22462c12cbb88f6fe49cd54e82543f066bb
Gerrit-Change-Number: 10230
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
