Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14794 )

Change subject: IMPALA-9196: Dump jstack and collect logs when tests timeout
......................................................................

Patch Set 3:

> Patch Set 2:
>
> > > Patch Set 2:
> > >
> > > > I'm uncertain about how the privileges work. There are ptrace
> > > > limitations in Ubuntu that restrict ptrace by the same user to a
> > > > parent process, which I think is why the gdb part of this script
> > > > works. I'm not sure what permissions jstack would need, and if
> > > > this would work.
> > > >
> > > > If you haven't already, a test that you could run for the
> > > > permissions is to run the end to end tests and set
> > > > TIMEOUT_FOR_RUN_ALL_TESTS_MINS to some modest value (15 mins) and
> > > > verify you get the logs you want and jstack works.
> > > >
> > > > Once we verify that the permissions are ok in the normal way we
> > > > run this, the code looks good to me.
> > >
> > > Circling back to this review. My guess is that this doesn't work
> > > in its current form on Ubuntu, but it might work on other
> > > platforms.
> > >
> > > It looks like it is harmless if these debug commands fail
> > > (because the script doesn't have "set -euo pipefail"). I think any
> > > step forward in this debugging information is ok to merge as long
> > > as it improves some platform without regressing anything. We should
> > > add comments about dump statements that don't work on some
> > > platforms, but that shouldn't stop us from adding statements that
> > > do work on Centos7 or some other platform. Obviously, it would be
> > > nice for these things to work on Ubuntu.
> >
> > I'm still testing this script in internal Jenkins jobs. It looks
> > weird to me that the script fails with "lsof: command not found".
> > But when installing lsof explicitly, it says it's already
> > installed:
> >
> > ++ sudo yum install -y lsof
> > Loaded plugins: fastestmirror
> > Loading mirror speeds from cached hostfile
> > Package lsof-4.87-4.el7.x86_64 already installed and latest version
> > Nothing to do
> > ++ which lsof
> > which: no lsof in (/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin)
> >
> > I think it's the problem with PATH. Will check it later. Internal
> > job link:
> > https://master-02.jenkins.cloudera.com/job/impala-private-parameterized/6139
>
> Just as a reminder, it is important not to post links that are not publicly
> accessible. Reviews need to be conducted in a way that everyone can
> participate. This also protects any companies that may participate in Apache
> projects.
>
> About the lsof: Unfortunately, this is an area where different Linux
> distributions will be different. We've dealt with that in a couple ways:
> 1. Try to find a subset that works. If we are copying some log files, copying
> too many log files is pretty harmless as long as we get the ones we want. I
> think if you found a command that listed the most recent 10 log files that
> were modified in that directory using basic utilities like find, it would
> work on all Linux distributions.
> 2. In other scripts like bin/bootstrap_system.sh, we have commands that are
> conditional on the Linux version.

Sorry for the non-public link and for the late reply.

I realized that bin/run-all-tests.sh creates all log dirs under $IMPALA_HOME. The logs of custom cluster tests end up in /tmp when the tests are run manually. So we may not need to collect the log files again. I've removed the lsof stuff.

I'm still testing this patch in private Jenkins jobs.

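For reference, if collecting logs turns out to be needed after all, the find-based suggestion quoted above could be sketched roughly as follows. This is only an illustration, not what the patch does: the function name list_recent_logs and its arguments are hypothetical, and it assumes GNU find (for -printf), which should be available on the CentOS and Ubuntu hosts discussed here.

```shell
#!/usr/bin/env bash
# Sketch: print the N most recently modified regular files under a directory,
# newest first, using only find/sort/head/cut (no lsof).
# Assumption: GNU find, since -printf is not part of POSIX find.
list_recent_logs() {
  local log_dir="$1"      # hypothetical argument: directory to scan
  local count="${2:-10}"  # number of files to list; defaults to 10
  # %T@ is the modification time in seconds since the epoch, %p the path.
  # Sort numerically on the timestamp (newest first), keep the first N
  # entries, then strip the timestamp column so only paths are printed.
  find "$log_dir" -type f -printf '%T@ %p\n' 2>/dev/null \
    | sort -rn \
    | head -n "$count" \
    | cut -d' ' -f2-
}
```

A call such as list_recent_logs "$LOG_DIR" 10 would print up to ten paths, where LOG_DIR is a placeholder for whichever log directory is of interest. File names containing newlines are not handled.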
--
To view, visit http://gerrit.cloudera.org:8080/14794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8a5b140024c236209c7e44149660189890b9d06
Gerrit-Change-Number: 14794
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 31 Dec 2019 00:00:20 +0000
Gerrit-HasComments: No
