Ok that makes sense. I've updated the patch I sent yesterday with a unit
test that exposes the problem as well as a couple of other process attach
issues that were fixed in the last few days. It was in fact this getpgid()
issue that was causing problems when trying to create a unit test earlier
as the problem scenario occurs when the unit test script forks a subprocess
with popen().


On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <[email protected]> wrote:

>
> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson <[email protected]>
> wrote:
>
> > Ok great, I'm attaching a patch here that just uses getpgid() to
> determine the pgid to use with waitpid(). I get the same unit test results
> before and after the patch (there are a few tests that fail consistently
> for me). I tried using a simple -1 with waitpid() but this results in a
> hang in the unit tests. This is probably something worth investigating but
> for now this patch does resolve the issues around waitpid() when attaching.
>
> We should never use -1 with waitpid() as you could end up reaping a
> process that someone else is already trying to reap. You must only try to
> reap the process that you launched using some pid for you process. Many
> things launch processes within LLDB like ProcessGDBRemote launching
> "debugserver" binaries. The ProcessGDBRemote must be notified when and if
> "debugserver" crashes, so if anyone else does waitpid(-1, ...)
> ProcessGDBRemote might never find out if debugserver crashes for some
> reason.
>
> > I mentioned a few issues I ran into with the unit tests in the email I
> just pinged with a Linux detach patch, basically there appears to be a bug
> or limitation somewhere in that doing a "continue" in asynchronous mode in
> a unit test breaks things and this is needed for most attach-related debug
> testing. I can probably look into this (along with the couple of failing
> tests I have) but likely won't be able to this week. I haven't filed a bug
> for the consistently-failing tests on my end since it doesn't appear to be
> happening for others and I haven't had a chance to investigate here, let me
> know if I should file a bug anyway.
> >
> >
> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <[email protected]> wrote:
> > I'd suggest trying the change, running the tests, and see if anything
> new fails.
> >
> > And if this fixes currently broken behavior, add a test to make sure we
> don't regress it.  I'm pretty sure if it breaks something we'll know pretty
> quickly (either via bugs or our own usage).  At which point - we should add
> a test so we don't regress in the future and perhaps add a few comments as
> to the reason.
> >
> >
> > On Tue, Mar 25, 2014 at 9:54 AM, <[email protected]> wrote:
> > Why are you waiting for process groups?  That's not something we have to
> do on Mac OS X.
> >
> > Jim
> >
> >
> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson <[email protected]>
> wrote:
> >
> > > Currently under Linux if you attach to a process whose process group
> id is not equal to its process id (such as the child process of a fork()
> call) the calls to waitpid() that pass -1*pid will return ECHILD since the
> pid argument refers to a process group that doesn't exist. These calls
> occur in Host::MonitorChildProcessThreadFunction() and the Linux
> ProcessMonitor.
> > >
> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the issue
> but it's not clear if this is the right fix as I'm unsure how other OSes
> deal with this scenario.
> > >
> > > Any thoughts?
> > >
> > > Thanks,
> > > Andrew
> > > _______________________________________________
> > > lldb-dev mailing list
> > > [email protected]
> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> > _______________________________________________
> > lldb-dev mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >
> >
> >
> > --
> > Todd Fiala |   Software Engineer |     [email protected] |
> 650-943-3180
> >
> >
> > <waitpid-getpgid.patch>_______________________________________________
> > lldb-dev mailing list
> > [email protected]
> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>
>
diff --git a/source/Host/common/Host.cpp b/source/Host/common/Host.cpp
index e5a0f33..b66685c 100644
--- a/source/Host/common/Host.cpp
+++ b/source/Host/common/Host.cpp
@@ -169,7 +169,7 @@ MonitorChildProcessThreadFunction (void *arg)
     const bool monitor_signals = info->monitor_signals;
 
     assert (info->pid <= UINT32_MAX);
-    const ::pid_t pid = monitor_signals ? -1 * info->pid : info->pid;
+    const ::pid_t pid = monitor_signals ? -1 * getpgid(info->pid) : info->pid;
 
     delete info;
 
diff --git a/source/Plugins/Process/Linux/ProcessMonitor.cpp b/source/Plugins/Process/Linux/ProcessMonitor.cpp
index 3dec6de..554d54c 100644
--- a/source/Plugins/Process/Linux/ProcessMonitor.cpp
+++ b/source/Plugins/Process/Linux/ProcessMonitor.cpp
@@ -1767,7 +1767,7 @@ ProcessMonitor::StopThread(lldb::tid_t tid)
         int status = -1;
         if (log)
             log->Printf ("ProcessMonitor::%s(bp) waitpid...", __FUNCTION__);
-        lldb::pid_t wait_pid = ::waitpid (-1*m_pid, &status, __WALL);
+        lldb::pid_t wait_pid = ::waitpid (-1*getpgid(m_pid), &status, __WALL);
         if (log)
             log->Printf ("ProcessMonitor::%s(bp) waitpid, pid = %" PRIu64 ", status = %d", __FUNCTION__, wait_pid, status);
 
diff --git a/test/functionalities/attach_resume/Makefile b/test/functionalities/attach_resume/Makefile
new file mode 100644
index 0000000..0375c31
--- /dev/null
+++ b/test/functionalities/attach_resume/Makefile
@@ -0,0 +1,8 @@
+LEVEL = ../../make
+
+C_SOURCES := main.c
+LD_EXTRAS := -lpthread
+
+EXE := AttachResume
+
+include $(LEVEL)/Makefile.rules
diff --git a/test/functionalities/attach_resume/TestAttachResume.py b/test/functionalities/attach_resume/TestAttachResume.py
new file mode 100644
index 0000000..b79633d
--- /dev/null
+++ b/test/functionalities/attach_resume/TestAttachResume.py
@@ -0,0 +1,95 @@
+"""
+Test process attach/resume.
+"""
+
+import os, time
+import unittest2
+import lldb
+from lldbtest import *
+import lldbutil
+
+exe_name = "AttachResume"  # Must match Makefile
+
+class AttachResumeTestCase(TestBase):
+
+    mydir = TestBase.compute_mydir(__file__)
+
+    @dwarf_test
+    def test_attach_continue_interrupt_detach(self):
+        """Test attach/continue/interrupt/detach"""
+        self.buildDwarf()
+        self.process_attach_continue_interrupt_detach()
+
+    def process_attach_continue_interrupt_detach(self):
+        """Test attach/continue/interrupt/detach"""
+
+        exe = os.path.join(os.getcwd(), exe_name)
+
+        popen = self.spawnSubprocess(exe)
+        self.addTearDownHook(self.cleanupSubprocesses)
+
+        self.runCmd("process attach -p " + str(popen.pid))
+
+        self._state = 0
+        def process_events():
+            event = lldb.SBEvent()
+            while self.dbg.GetListener().GetNextEvent(event):
+                self._state = lldb.SBProcess.GetStateFromEvent(event)
+
+        # using process.GetState() does not work: llvm.org/pr16172
+        def wait_for_state(s, timeout=5):
+            t = 0
+            period = 0.1
+            while self._state != s:
+                process_events()
+                time.sleep(period)
+                t += period
+                if t > timeout:
+                    return False
+            return True
+
+        self.setAsync(True)
+
+        self.runCmd("c")
+        self.assertTrue(wait_for_state(lldb.eStateRunning),
+            'Process not running after continue')
+
+        self.runCmd("process interrupt")
+        self.assertTrue(wait_for_state(lldb.eStateStopped),
+            'Process not stopped after interrupt')
+
+        # be sure to continue/interrupt/continue (r204504)
+        self.runCmd("c")
+        self.assertTrue(wait_for_state(lldb.eStateRunning),
+            'Process not running after continue')
+
+        self.runCmd("process interrupt")
+        self.assertTrue(wait_for_state(lldb.eStateStopped),
+            'Process not stopped after interrupt')
+
+        # check that this breakpoint is auto-cleared on detach (r204752)
+        self.runCmd("br set -f main.c -l 12")
+
+        self.runCmd("c")
+        self.assertTrue(wait_for_state(lldb.eStateRunning),
+            'Process not running after continue')
+
+        self.assertTrue(wait_for_state(lldb.eStateStopped),
+            'Process not stopped after breakpoint')
+        self.expect('br list', 'Breakpoint not hit',
+            patterns = ['hit count = 1'])
+
+        self.runCmd("c")
+        self.assertTrue(wait_for_state(lldb.eStateRunning),
+            'Process not running after continue')
+
+        # make sure to detach while in running state (r204759)
+        self.runCmd("detach")
+        self.assertTrue(wait_for_state(lldb.eStateDetached),
+            'Process not detached after detach')
+
+if __name__ == '__main__':
+    import atexit
+    lldb.SBDebugger.Initialize()
+    atexit.register(lambda: lldb.SBDebugger.Terminate())
+    unittest2.main()
diff --git a/test/functionalities/attach_resume/main.c b/test/functionalities/attach_resume/main.c
new file mode 100644
index 0000000..24efd4b
--- /dev/null
+++ b/test/functionalities/attach_resume/main.c
@@ -0,0 +1,29 @@
+#include <unistd.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <pthread.h>
+
+void *start(void *data)
+{
+    size_t idx = (size_t)data;
+    for (int i=0; i<30; i++)
+    {
+        if ( idx == 0 )
+            usleep(1);
+        sleep(1);
+    }
+    return 0;
+}
+
+int main(int argc, char const *argv[])
+{
+    static const size_t nthreads = 16;
+    pthread_attr_t attr;
+    pthread_t threads[nthreads];
+    pthread_attr_init(&attr);
+    for (size_t i=0; i<nthreads; i++)
+        pthread_create(&threads[i], &attr, &start, (void *)i);
+
+    for (size_t i=0; i<nthreads; i++)
+        pthread_join(threads[i], 0);
+}
_______________________________________________
lldb-dev mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Reply via email to