Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10230 )
Change subject: pstack_watcher: blacklist older versions of gdb
......................................................................

pstack_watcher: blacklist older versions of gdb

Despite commit 6ed4690, I was still seeing timeouts in
pstack_watcher-test on CentOS 6.6. After going down a rabbit hole, I
think I found the root cause: a pretty printing bug that causes gdb to
read from uninitialized memory.

In theory the symptom varies, but in my case gdb spent a very long time
reading the memory backing 'std::string exit_info' in
PstackWatcher::RunStackDump. Because 'exit_info' was uninitialized at
the time of the read (i.e. while the test was blocked in
Subprocess::Wait()), its length contained garbage, and sometimes (it
varied with each execution of the test binary) that garbage was a very
large number. In such cases, gdb would faithfully read the traced
process' memory word by word. This looked like an infinite loop, but
attaching strace to gdb revealed the truth. Eventually the test would
time out.

I couldn't pinpoint the bug with precision, nor could I figure out
exactly when it was fixed, so I ended up blacklisting versions older
than an "oldest known good" version. But I'm somewhat confident that
this is the root cause behind all of pstack_watcher-test's gdb-related
issues.

Here are the combinations I manually tested:
- el6: gdb was too old and pstack was used.
- el6 with devtoolset-3: gdb was used.
- Ubuntu 14.04: gdb was used.
- Ubuntu 16.04: gdb was used.

In every case I couldn't repro the original timeout.

Change-Id: I8486a22462c12cbb88f6fe49cd54e82543f066bb
Reviewed-on: http://gerrit.cloudera.org:8080/9962
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <[email protected]>
(cherry picked from commit 70e1a9caa08cab94bd36f5ddc2817f28f6028ec8)
---
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/pstack_watcher.h
2 files changed, 69 insertions(+), 19 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/30/10230/1
--
To view, visit http://gerrit.cloudera.org:8080/10230
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.7.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8486a22462c12cbb88f6fe49cd54e82543f066bb
Gerrit-Change-Number: 10230
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
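
For readers who want a concrete picture of the "oldest known good" check described in the commit message, here is a minimal sketch. It is not the code from pstack_watcher.cc. The function names (ParseGdbVersion, IsGdbUsable) and the threshold value are hypothetical, since the message does not state which gdb version was chosen. The sketch assumes the first "major.minor" number in the first line of `gdb --version` output is the gdb version, which may not hold for every distribution's banner.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// (major, minor); std::pair compares lexicographically, so 7.11 > 7.7.
using Version = std::pair<int, int>;

// HYPOTHETICAL threshold, chosen only for illustration. The commit
// message does not say which version is the oldest known good one.
constexpr Version kOldestKnownGood{7, 7};

// Extracts the first "major.minor" number found in a gdb version banner.
// Returns std::nullopt if the banner contains no such number.
std::optional<Version> ParseGdbVersion(const std::string& banner) {
  static const std::regex kVersionRe(R"((\d+)\.(\d+))");
  std::smatch m;
  if (!std::regex_search(banner, m, kVersionRe)) {
    return std::nullopt;
  }
  return Version{std::stoi(m[1].str()), std::stoi(m[2].str())};
}

// True only if the version parsed successfully and is at least the
// threshold. An unparseable banner is treated as "too old", so the
// caller falls back to another tool (pstack, in the commit's case).
bool IsGdbUsable(const std::string& banner) {
  const std::optional<Version> v = ParseGdbVersion(banner);
  return v.has_value() && *v >= kOldestKnownGood;
}
```

Treating an unparseable banner as unusable is a deliberately conservative choice in this sketch: a wrong "too old" verdict only costs a fallback to pstack, while a wrong "usable" verdict could bring back the timeout.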
