Ok that makes sense. I've updated the patch I sent yesterday with a unit
test that exposes the problem as well as a couple of other process attach
issues that were fixed in the last few days. It was in fact this getpgid()
issue that was causing problems when trying to create a unit test earlier
as the problem scenario occurs when the unit test script forks a subprocess
with popen().
On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <[email protected]> wrote:
>
> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson <[email protected]>
> wrote:
>
> > Ok great, I'm attaching a patch here that just uses getpgid() to
> determine the pgid to use with waitpid(). I get the same unit test results
> before and after the patch (there are a few tests that fail consistently
> for me). I tried using a simple -1 with waitpid() but this results in a
> hang in the unit tests. This is probably something worth investigating but
> for now this patch does resolve the issues around waitpid() when attaching.
>
> We should never use -1 with waitpid() as you could end up reaping a
> process that someone else is already trying to reap. You must only try to
> reap the process that you launched using some pid for you process. Many
> things launch processes within LLDB like ProcessGDBRemote launching
> "debugserver" binaries. The ProcessGDBRemote must be notified when and if
> "debugserver" crashes, so if anyone else does waitpid(-1, ...)
> ProcessGDBRemote might never find out if debugserver crashes for some
> reason.
>
> > I mentioned a few issues I ran into with the unit tests in the email I
> just pinged with a Linux detach patch, basically there appears to be a bug
> or limitation somewhere in that doing a "continue" in asynchronous mode in
> a unit test breaks things and this is needed for most attach-related debug
> testing. I can probably look into this (along with the couple of failing
> tests I have) but likely won't be able to this week. I haven't filed a bug
> for the consistently-failing tests on my end since it doesn't appear to be
> happening for others and I haven't had a chance to investigate here, let me
> know if I should file a bug anyway.
> >
> >
> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <[email protected]> wrote:
> > I'd suggest trying the change, running the tests, and see if anything
> new fails.
> >
> > And if this fixes currently broken behavior, add a test to make sure we
> don't regress it. I'm pretty sure if it breaks something we'll know pretty
> quickly (either via bugs or our own usage). At which point - we should add
> a test so we don't regress in the future and perhaps add a few comments as
> to the reason.
> >
> >
> > On Tue, Mar 25, 2014 at 9:54 AM, <[email protected]> wrote:
> > Why are you waiting for process groups? That's not something we have to
> do on Mac OS X.
> >
> > Jim
> >
> >
> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson <[email protected]>
> wrote:
> >
> > > Currently under Linux if you attach to a process whose process group
> id is not equal to its process id (such as the child process of a fork()
> call) the calls to waitpid() that pass -1*pid will return ECHILD since the
> pid argument refers to a process group that doesn't exist. These calls
> occur in Host::MonitorChildProcessThreadFunction() and the Linux
> ProcessMonitor.
> > >
> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the issue
> but it's not clear if this is the right fix as I'm unsure how other OSes
> deal with this scenario.
> > >
> > > Any thoughts?
> > >
> > > Thanks,
> > > Andrew
> > > _______________________________________________
> > > lldb-dev mailing list
> > > [email protected]
> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> > _______________________________________________
> > lldb-dev mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> > --
> > Todd Fiala | Software Engineer | [email protected] |
> 650-943-3180
> >
> >
> > <waitpid-getpgid.patch>_______________________________________________
> > lldb-dev mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>
>
diff --git a/source/Host/common/Host.cpp b/source/Host/common/Host.cpp
index e5a0f33..b66685c 100644
--- a/source/Host/common/Host.cpp
+++ b/source/Host/common/Host.cpp
@@ -169,7 +169,7 @@ MonitorChildProcessThreadFunction (void *arg)
const bool monitor_signals = info->monitor_signals;
assert (info->pid <= UINT32_MAX);
- const ::pid_t pid = monitor_signals ? -1 * info->pid : info->pid;
+ const ::pid_t pid = monitor_signals ? -1 * getpgid(info->pid) : info->pid;
delete info;
diff --git a/source/Plugins/Process/Linux/ProcessMonitor.cpp b/source/Plugins/Process/Linux/ProcessMonitor.cpp
index 3dec6de..554d54c 100644
--- a/source/Plugins/Process/Linux/ProcessMonitor.cpp
+++ b/source/Plugins/Process/Linux/ProcessMonitor.cpp
@@ -1767,7 +1767,7 @@ ProcessMonitor::StopThread(lldb::tid_t tid)
int status = -1;
if (log)
log->Printf ("ProcessMonitor::%s(bp) waitpid...", __FUNCTION__);
- lldb::pid_t wait_pid = ::waitpid (-1*m_pid, &status, __WALL);
+ lldb::pid_t wait_pid = ::waitpid (-1*getpgid(m_pid), &status, __WALL);
if (log)
log->Printf ("ProcessMonitor::%s(bp) waitpid, pid = %" PRIu64 ", status = %d", __FUNCTION__, wait_pid, status);
diff --git a/test/functionalities/attach_resume/Makefile b/test/functionalities/attach_resume/Makefile
new file mode 100644
index 0000000..0375c31
--- /dev/null
+++ b/test/functionalities/attach_resume/Makefile
@@ -0,0 +1,8 @@
+LEVEL = ../../make
+
+C_SOURCES := main.c
+LD_EXTRAS := -lpthread
+
+EXE := AttachResume
+
+include $(LEVEL)/Makefile.rules
diff --git a/test/functionalities/attach_resume/TestAttachResume.py b/test/functionalities/attach_resume/TestAttachResume.py
new file mode 100644
index 0000000..b79633d
--- /dev/null
+++ b/test/functionalities/attach_resume/TestAttachResume.py
@@ -0,0 +1,95 @@
+"""
+Test process attach/resume.
+"""
+
+import os, time
+import unittest2
+import lldb
+from lldbtest import *
+import lldbutil
+
+exe_name = "AttachResume" # Must match Makefile
+
+class AttachResumeTestCase(TestBase):
+
+ mydir = TestBase.compute_mydir(__file__)
+
+ @dwarf_test
+ def test_attach_continue_interrupt_detach(self):
+ """Test attach/continue/interrupt/detach"""
+ self.buildDwarf()
+ self.process_attach_continue_interrupt_detach()
+
+ def process_attach_continue_interrupt_detach(self):
+ """Test attach/continue/interrupt/detach"""
+
+ exe = os.path.join(os.getcwd(), exe_name)
+
+ popen = self.spawnSubprocess(exe)
+ self.addTearDownHook(self.cleanupSubprocesses)
+
+ self.runCmd("process attach -p " + str(popen.pid))
+
+ self._state = 0
+ def process_events():
+ event = lldb.SBEvent()
+ while self.dbg.GetListener().GetNextEvent(event):
+ self._state = lldb.SBProcess.GetStateFromEvent(event)
+
+ # using process.GetState() does not work: llvm.org/pr16172
+ def wait_for_state(s, timeout=5):
+ t = 0
+ period = 0.1
+ while self._state != s:
+ process_events()
+ time.sleep(period)
+ t += period
+ if t > timeout:
+ return False
+ return True
+
+ self.setAsync(True)
+
+ self.runCmd("c")
+ self.assertTrue(wait_for_state(lldb.eStateRunning),
+ 'Process not running after continue')
+
+ self.runCmd("process interrupt")
+ self.assertTrue(wait_for_state(lldb.eStateStopped),
+ 'Process not stopped after interrupt')
+
+ # be sure to continue/interrupt/continue (r204504)
+ self.runCmd("c")
+ self.assertTrue(wait_for_state(lldb.eStateRunning),
+ 'Process not running after continue')
+
+ self.runCmd("process interrupt")
+ self.assertTrue(wait_for_state(lldb.eStateStopped),
+ 'Process not stopped after interrupt')
+
+ # check that this breakpoint is auto-cleared on detach (r204752)
+ self.runCmd("br set -f main.c -l 12")
+
+ self.runCmd("c")
+ self.assertTrue(wait_for_state(lldb.eStateRunning),
+ 'Process not running after continue')
+
+ self.assertTrue(wait_for_state(lldb.eStateStopped),
+ 'Process not stopped after breakpoint')
+ self.expect('br list', 'Breakpoint not hit',
+ patterns = ['hit count = 1'])
+
+ self.runCmd("c")
+ self.assertTrue(wait_for_state(lldb.eStateRunning),
+ 'Process not running after continue')
+
+ # make sure to detach while in running state (r204759)
+ self.runCmd("detach")
+ self.assertTrue(wait_for_state(lldb.eStateDetached),
+ 'Process not detached after detach')
+
+if __name__ == '__main__':
+ import atexit
+ lldb.SBDebugger.Initialize()
+ atexit.register(lambda: lldb.SBDebugger.Terminate())
+ unittest2.main()
diff --git a/test/functionalities/attach_resume/main.c b/test/functionalities/attach_resume/main.c
new file mode 100644
index 0000000..24efd4b
--- /dev/null
+++ b/test/functionalities/attach_resume/main.c
@@ -0,0 +1,29 @@
+#include <unistd.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <pthread.h>
+
+void *start(void *data)
+{
+ size_t idx = (size_t)data;
+ for (int i=0; i<30; i++)
+ {
+ if ( idx == 0 )
+ usleep(1);
+ sleep(1);
+ }
+ return 0;
+}
+
+int main(int argc, char const *argv[])
+{
+ static const size_t nthreads = 16;
+ pthread_attr_t attr;
+ pthread_t threads[nthreads];
+ pthread_attr_init(&attr);
+ for (size_t i=0; i<nthreads; i++)
+ pthread_create(&threads[i], &attr, &start, (void *)i);
+
+ for (size_t i=0; i<nthreads; i++)
+ pthread_join(threads[i], 0);
+}
_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev