Hey Todd, Sounds great. Sorry about the C99 issue, I was using clang which I'm guessing behaves differently there. For the waitpid + getpgid in Host.cpp, that section is #ifdef'd out on Windows so should be safe there too, I've committed the patch now. Thanks!
Andrew On Tue, Apr 1, 2014 at 7:46 PM, Todd Fiala <[email protected]> wrote: > Hey Andrew, > > I think the core I was getting on the assert is during exiting and doesn't > really seem to indicate a serious issue. I fixed it up with a null check > and added some logging. > > See the attached patch. This is a cumulative patch with my fixes for > those two test failures plus all of your original changes. > > As this patch stands, LGTM. If you want to check it in, I'm fine with it > from a Linux standpoint. In the common Host.cpp, you now have a getpgid() > call. I'm not sure that's going to work on a Windows build? Might need to > #ifdef that or create a macro for it, or something else entirely. Any > Windows folks able to comment on that? > > > > > > > On Tue, Apr 1, 2014 at 10:25 AM, Todd Fiala <[email protected]> wrote: > >> That change did successfully fix the TestAttachResume.py failure. >> >> This one is a bit different and looks like we may have perturbed >> something related to shutdown: >> >> TestHelloWorld.py: >> >> Executing tearDown hook: def cleanupSubprocesses(self): >> # Ensure any subprocesses are cleaned up >> for p in self.subprocesses: >> if p.poll() == None: >> p.terminate() >> del p >> del self.subprocesses[:] >> # Ensure any forked processes are cleaned up >> for pid in self.forkedProcessPids: >> if os.path.exists("/proc/" + str(pid)): >> os.kill(pid, signal.SIGTERM) >> >> >> python: >> /mnt/ssd/work/git/gen/llvm/tools/lldb/source/Plugins/Process/POSIX/ProcessPOSIX.cpp:427: >> virtual void ProcessPOSIX::SendMessage(const ProcessMessage&): Assertion >> `thread' failed. >> Aborted (core dumped) >> >> I'm having a look at that core now. >> >> >> >> On Tue, Apr 1, 2014 at 10:00 AM, Todd Fiala <[email protected]> wrote: >> >>> I built clean and ran tests error free on r205315. >>> >>> After testing with the patch applied, I'm getting the following failures >>> (3 times in a row): >>> >>> FAIL: LLDB (suite) :: TestHelloWorld.py (Linux >>> tfiala2.mtv.corp.google.com 3.2.5-gg1336 #1 SMP Thu Aug 29 02:37:18 PDT >>> 2013 x86_64 x86_64) >>> FAIL: LLDB (suite) :: TestAttachResume.py (Linux >>> tfiala2.mtv.corp.google.com 3.2.5-gg1336 #1 SMP Thu Aug 29 02:37:18 PDT >>> 2013 x86_64 x86_64) >>> >>> Looking deeper, it looks like the main.c code in >>> test/functionalities/attach_resume/main.c is assuming c99 mode which isn't >>> the default when using gcc. I'm clearing that locally (pulling for loop >>> variable declarations out) and will see if that fixes everything else. >>> >>> Back in a bit... >>> >>> >>> On Tue, Apr 1, 2014 at 9:06 AM, Todd Fiala <[email protected]> wrote: >>> >>>> Having a look at this now. >>>> >>>> >>>> On Thu, Mar 27, 2014 at 1:01 AM, Andrew MacPherson < >>>> [email protected]> wrote: >>>> >>>>> Thanks Todd, and no problem on the unit test, I'll try to include one >>>>> for anything I come across in the future as well. >>>>> >>>>> >>>>> On Wed, Mar 26, 2014 at 10:39 PM, Todd Fiala <[email protected]>wrote: >>>>> >>>>>> Ah nice. I'll definitely run this again soon. Thanks for getting a >>>>>> test around it - we can't have enough of that around important >>>>>> expectations. >>>>>> >>>>>> >>>>>> On Wed, Mar 26, 2014 at 2:06 PM, Andrew MacPherson < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Ok that makes sense. I've updated the patch I sent yesterday with a >>>>>>> unit test that exposes the problem as well as a couple of other process >>>>>>> attach issues that were fixed in the last few days. It was in fact this >>>>>>> getpgid() issue that was causing problems when trying to create a unit >>>>>>> test >>>>>>> earlier as the problem scenario occurs when the unit test script forks a >>>>>>> subprocess with popen(). >>>>>>> >>>>>>> >>>>>>> On Wed, Mar 26, 2014 at 7:20 PM, Greg Clayton <[email protected]>wrote: >>>>>>> >>>>>>>> >>>>>>>> On Mar 25, 2014, at 12:09 PM, Andrew MacPherson < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>> > Ok great, I'm attaching a patch here that just uses getpgid() to >>>>>>>> determine the pgid to use with waitpid(). I get the same unit test >>>>>>>> results >>>>>>>> before and after the patch (there are a few tests that fail >>>>>>>> consistently >>>>>>>> for me). I tried using a simple -1 with waitpid() but this results in a >>>>>>>> hang in the unit tests. This is probably something worth investigating >>>>>>>> but >>>>>>>> for now this patch does resolve the issues around waitpid() when >>>>>>>> attaching. >>>>>>>> >>>>>>>> We should never use -1 with waitpid() as you could end up reaping a >>>>>>>> process that someone else is already trying to reap. You must only try >>>>>>>> to >>>>>>>> reap the process that you launched using some pid for you process. Many >>>>>>>> things launch processes within LLDB like ProcessGDBRemote launching >>>>>>>> "debugserver" binaries. The ProcessGDBRemote must be notified when and >>>>>>>> if >>>>>>>> "debugserver" crashes, so if anyone else does waitpid(-1, ...) >>>>>>>> ProcessGDBRemote might never find out if debugserver crashes for some >>>>>>>> reason. >>>>>>>> >>>>>>>> > I mentioned a few issues I ran into with the unit tests in the >>>>>>>> email I just pinged with a Linux detach patch, basically there appears >>>>>>>> to >>>>>>>> be a bug or limitation somewhere in that doing a "continue" in >>>>>>>> asynchronous >>>>>>>> mode in a unit test breaks things and this is needed for most >>>>>>>> attach-related debug testing. I can probably look into this (along >>>>>>>> with the >>>>>>>> couple of failing tests I have) but likely won't be able to this week. >>>>>>>> I >>>>>>>> haven't filed a bug for the consistently-failing tests on my end since >>>>>>>> it >>>>>>>> doesn't appear to be happening for others and I haven't had a chance to >>>>>>>> investigate here, let me know if I should file a bug anyway. >>>>>>>> > >>>>>>>> > >>>>>>>> > On Tue, Mar 25, 2014 at 6:30 PM, Todd Fiala <[email protected]> >>>>>>>> wrote: >>>>>>>> > I'd suggest trying the change, running the tests, and see if >>>>>>>> anything new fails. >>>>>>>> > >>>>>>>> > And if this fixes currently broken behavior, add a test to make >>>>>>>> sure we don't regress it. I'm pretty sure if it breaks something we'll >>>>>>>> know pretty quickly (either via bugs or our own usage). At which >>>>>>>> point - >>>>>>>> we should add a test so we don't regress in the future and perhaps add >>>>>>>> a >>>>>>>> few comments as to the reason. >>>>>>>> > >>>>>>>> > >>>>>>>> > On Tue, Mar 25, 2014 at 9:54 AM, <[email protected]> wrote: >>>>>>>> > Why are you waiting for process groups? That's not something we >>>>>>>> have to do on Mac OS X. >>>>>>>> > >>>>>>>> > Jim >>>>>>>> > >>>>>>>> > >>>>>>>> > On Mar 25, 2014, at 1:17 AM, Andrew MacPherson < >>>>>>>> [email protected]> wrote: >>>>>>>> > >>>>>>>> > > Currently under Linux if you attach to a process whose process >>>>>>>> group id is not equal to its process id (such as the child process of a >>>>>>>> fork() call) the calls to waitpid() that pass -1*pid will return ECHILD >>>>>>>> since the pid argument refers to a process group that doesn't exist. >>>>>>>> These >>>>>>>> calls occur in Host::MonitorChildProcessThreadFunction() and the Linux >>>>>>>> ProcessMonitor. >>>>>>>> > > >>>>>>>> > > Changing -1*pid to simply -1 or to -1*getpgid(pid) resolves the >>>>>>>> issue but it's not clear if this is the right fix as I'm unsure how >>>>>>>> other >>>>>>>> OSes deal with this scenario. >>>>>>>> > > >>>>>>>> > > Any thoughts? >>>>>>>> > > >>>>>>>> > > Thanks, >>>>>>>> > > Andrew >>>>>>>> > > _______________________________________________ >>>>>>>> > > lldb-dev mailing list >>>>>>>> > > [email protected] >>>>>>>> > > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev >>>>>>>> > >>>>>>>> > _______________________________________________ >>>>>>>> > lldb-dev mailing list >>>>>>>> > [email protected] >>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > -- >>>>>>>> > Todd Fiala | Software Engineer | [email protected] | >>>>>>>> 650-943-3180 >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> <waitpid-getpgid.patch>_______________________________________________ >>>>>>>> > lldb-dev mailing list >>>>>>>> > [email protected] >>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Todd Fiala | Software Engineer | [email protected] | 650-943-3180 >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> Todd Fiala | Software Engineer | [email protected] | 650-943-3180 >>>> >>> >>> >>> >>> -- >>> Todd Fiala | Software Engineer | [email protected] | 650-943-3180 >>> >> >> >> >> -- >> Todd Fiala | Software Engineer | [email protected] | 650-943-3180 >> > > > > -- > Todd Fiala | Software Engineer | [email protected] | 650-943-3180 >
_______________________________________________ lldb-dev mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
