Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14794 )

Change subject: IMPALA-9196: Dump jstack and collect logs when tests timeout
......................................................................


Patch Set 2:

> > Patch Set 2:
 > >
 > > > I'm uncertain about how the privileges work. There are ptrace
 > > > limitations in Ubuntu that restrict ptrace by the same user to a
 > > > parent process, which I think is why the gdb part of this script
 > > > works. I'm not sure what permissions jstack would need, and if
 > > > this would work.
 > > >
 > > > If you haven't already, a test that you could run for the
 > > > permissions is to run the end to end tests and set
 > > > TIMEOUT_FOR_RUN_ALL_TESTS_MINS to some modest value (15 mins) and
 > > > verify you get the logs you want and jstack works.
 > > >
 > > > Once we verify that the permissions are ok in the normal way we
 > > > run this, the code looks good to me.
 > >
 > > Circling back to this review. My guess is that this doesn't work
 > > in its current form on Ubuntu, but it might work on other
 > > platforms.
 > >
 > > It looks like it is harmless if these debug commands fail
 > > (because the script doesn't have "set -euo pipefail"). I think any
 > > step forward in this debugging information is ok to merge as long
 > > as it improves some platform without regressing anything. We should
 > > add comments about dump statements that don't work on some
 > > platforms, but that shouldn't stop us from adding statements that
 > > do work on Centos7 or some other platform. Obviously, it would be
 > > nice for these things to work on Ubuntu.
 >
 > I'm still testing this script in internal Jenkins jobs. It looks
 > weird to me that the script fails with "lsof: command not found".
 > But when I install lsof explicitly, it says it's already
 > installed:
 >
 > ++ sudo yum install -y lsof
 > Loaded plugins: fastestmirror
 > Loading mirror speeds from cached hostfile
 > Package lsof-4.87-4.el7.x86_64 already installed and latest version
 > Nothing to do
 > ++ which lsof
 > which: no lsof in (/usr/lib64/qt-3.3/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin)
 >
 > I think it's a problem with PATH. Will check it later. Internal
 > job link:
 > https://master-02.jenkins.cloudera.com/job/impala-private-parameterized/6139

Just as a reminder, it is important not to post links that are not publicly 
accessible. Reviews need to be conducted in a way that everyone can 
participate. This also protects any companies that may participate in Apache 
projects.

About lsof: Unfortunately, this is an area where Linux distributions differ. 
We've dealt with that in a couple of ways:
1. Try to find a subset that works. If we are copying some log files, copying 
too many log files is pretty harmless as long as we get the ones we want. I 
think if you found a command that listed the 10 most recently modified log 
files in that directory using basic utilities like find, it would work on all 
Linux distributions.
2. In other scripts like bin/bootstrap_system.sh, we have commands that are 
conditional on the Linux version.
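
To make option 1 concrete, here is a rough sketch of what I have in mind. It 
is not taken from the patch: the function name list_recent_logs and its 
arguments are made up for illustration, and it assumes GNU find (for -printf), 
which the usual CentOS and Ubuntu images should have.

```shell
# Hypothetical helper (not part of the patch): print the N most recently
# modified regular files directly under a directory, newest first.
# Assumes GNU find, since -printf is a GNU extension.
list_recent_logs() {
  local dir="$1"
  local count="${2:-10}"
  # %T@ is the modification time in seconds since the epoch, %p is the path.
  # Sort numerically on the timestamp (newest first), keep the first N lines,
  # then strip the timestamp so only the path remains.
  find "$dir" -maxdepth 1 -type f -printf '%T@ %p\n' 2>/dev/null \
    | sort -rn \
    | head -n "$count" \
    | cut -d' ' -f2-
}

# Example call (the directory variable is an assumption for illustration):
#   list_recent_logs "${LOG_DIR}" 10
```

This avoids lsof entirely. The tradeoff is that it may pick up a few log files 
that are not from the hung process, which seems harmless for debugging.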


--
To view, visit http://gerrit.cloudera.org:8080/14794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8a5b140024c236209c7e44149660189890b9d06
Gerrit-Change-Number: 14794
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 04 Dec 2019 01:14:03 +0000
Gerrit-HasComments: No
