> One of the expedited registers (well, the PC) has the little-endian value
> for the address I see listed in the T response to the $? on startup.

Meaning, for the spurious disassembly on "(lldb) run", e.g.

> thread #1: tid = 4417, 0x00007f3b99b9c2d0, name = 'a.out', stop reason = trace

from above, the address of the stop is in fact the PC from the expedited
registers in the first T response to the $? query.
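For reference, this is roughly what that decode looks like. A stop-reply
("T") packet expedites registers as regno:hexbytes; pairs in target byte
order, so on x86_64 the PC (register 0x10) arrives as little-endian hex.
A minimal sketch (illustrative only, not lldb's actual parser; the sample
packet fragment in the comment is an assumption):

    #include <cstdint>
    #include <cstdio>
    #include <string>

    // Decode a little-endian hex register blob such as "d0c2b9993b7f0000"
    // (an expedited PC, e.g. from a reply like "T05...;10:d0c2b9993b7f0000;")
    // into the address 0x00007f3b99b9c2d0.
    static uint64_t DecodeLittleEndianHex(const std::string &hex)
    {
        uint64_t value = 0;
        for (size_t i = 0; i + 1 < hex.size(); i += 2)
        {
            // Each hex pair is one byte, least-significant byte first.
            uint8_t byte = static_cast<uint8_t>(
                std::stoul(hex.substr(i, 2), nullptr, 16));
            value |= static_cast<uint64_t>(byte) << ((i / 2) * 8);
        }
        return value;
    }

    int main()
    {
        printf("pc = 0x%016llx\n", (unsigned long long)
               DecodeLittleEndianHex("d0c2b9993b7f0000"));
        return 0;
    }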
On Sun, Jun 29, 2014 at 12:18 AM, Todd Fiala <todd.fi...@gmail.com> wrote:

> Er...
>
> Ok - so that's not quite right.
>
> In the llgs case, there is an initial probe with a $? command that asks
> where things are at after the inferior is launched. One of the expedited
> registers (well, the PC) has the little-endian value for the address I
> see listed in the T response to the $? on startup. I'm going to look and
> see if the $? is really supposed to respond with a T on startup. I've
> noticed (and adhere to) not responding with a T on launch for the
> initial stop. It might be that $? should likewise not respond with
> anything there.
>
> So this still might be a different thing on llgs than on local Linux.
>
> -Todd
>
>
> On Sun, Jun 29, 2014 at 12:11 AM, Todd Fiala <todd.fi...@gmail.com> wrote:
>
>> Here's another interesting bit. I have been combing through the
>> NativeProcessLinux code in the llgs branch as I tighten everything up
>> and work through known issues.
>>
>> So for this part Shawn called out above:
>>
>> (lldb) run
>>
>> Process 4417 launching
>> Process 4417 stopped
>> * thread #1: tid = 4417, 0x00007f3b99b9c2d0, name = 'a.out', stop reason = trace
>>     frame #0: 0x00007f3b99b9c2d0
>> ->  0x7f3b99b9c2d0: movq %rsp, %rdi
>>     0x7f3b99b9c2d3: callq 0x7f3b99b9fa70
>>     0x7f3b99b9c2d8: movq %rax, %r12
>>     0x7f3b99b9c2db: movl 0x221b17(%rip), %eax
>>
>> Process 4417 launched: '/home/shawn/Projects/a.out' (x86_64)
>>
>> I have been seeing this in lldb connected to llgs as well as in local
>> debugging. It happens sometimes. However, on lldb <=> llgs, I have a
>> complete view of the protocol (the gdb-remote packets), and I can see
>> that the disassembled stop point never comes up in the gdb-remote log
>> as a stop notification (a T). I am starting to suspect that there might
>> be something in the stack unwinding, symbolication or something else
>> that is perhaps racy, or maybe sensitive to taking too long to perform
>> some step. The address listed in the llgs case is close to some
>> addresses that are getting probed over the gdb-remote protocol.
>>
>> I'm still tracking this down in the llgs case and I'll double-check to
>> make sure I'm not somehow misreading the gdb-remote packet. More on
>> this later if I find anything useful.
>>
>> -Todd
>>
>> On Thu, Jun 26, 2014 at 6:09 PM, Todd Fiala <tfi...@google.com> wrote:
>>
>>> > This stopping at random locations seems like a racy bug in the
>>> > ProcessLinux that we should really look into fixing.
>>>
>>> (Just doing some correlation with the llgs NativeProcessLinux code,
>>> which is from the same lineage as the Linux ProcessMonitor but has
>>> diverged somewhat.)
>>>
>>> Some of these I have hit as well in the NativeProcessLinux code. I
>>> happen to still see this one (the "sometimes dump stack info on
>>> startup, then go forward") when creating a process with lldb connected
>>> to llgs. No doubt the same source, since NativeProcessLinux started
>>> from the same code.
>>>
>>> > This is bad, please use native types (::pid_t) for these locations
>>> > so that this works correctly.
>>>
>>> I've fixed most of those, as I got warnings on them on the
>>> NativeProcessLinux side - I think all of them, but I'll double-check.
>>> > 1 - on linux does a process start running first, then you quickly
>>> > try to attach to it? If so, this could explain the difference you
>>> > might be seeing when connecting to a process? On Darwin, our
>>> > posix_spawn() has a non-portable flag that stops the process at the
>>> > entry point with a SIGSTOP so we are guaranteed to not have a race
>>> > condition when we launch a process for debugging.
>>>
>>> We have complete control of the launch - we fork a process, do an
>>> execve(), and the monitor gets a trap on the execve(), so there
>>> shouldn't be a race. This is the case for both local Linux debugging
>>> and NativeProcessLinux/llgs.
>>>
>>> I've started diverging on signal handling fairly significantly in
>>> NativeProcessLinux, so it's possible some of the behavior will be
>>> different in some spots. I have llgs working in cases where local
>>> Linux debugging is failing on my end now.
>>>
>>> Shawn - make sure you look at these types of bugs in the llgs branch
>>> with llgs - I'd rather you put effort into NativeProcessLinux than
>>> into the Linux ProcessMonitor.
>>>
>>> For llgs, I do have the GDBRemoteCommunicationServer set up to be a
>>> delegate to get messages like process stops. It gets installed as the
>>> delegate before llgs actually does the launch or attach call, so
>>> there's no chance of the delegate missing a startup event. We must be
>>> doing something else funky (probably the same, but possibly a
>>> different root cause) in both the local Linux and NativeProcessLinux
>>> startup sequences to cause that race.
>>>
>>> > 2 - The messages coming in out of order seem to be related to
>>> > sending the eStateLaunching and eStateStopped not being delivered in
>>> > the correct order. Your first example, they came through OK, and in
>>> > the second case we got an eStateStopped first followed by the
>>> > eStateLaunching. I would take a look at who is sending these out of
>>> > order. If you fix these out-of-order events, it might fix your
>>> > random stopping at a wrong location?
>>>
>>> As far as llgs goes, I do ensure the delegate
>>> (GDBRemoteCommunicationServer) gets those two in order, but at the
>>> moment I don't have llgs doing anything when it gets an
>>> eStateLaunching event. Not sure if that's relevant there. What does
>>> debugserver do when it gets an eStateLaunching internally?
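The launch sequence Todd describes above (fork, execve(), and the monitor
catching a trap on the exec) is the standard ptrace pattern. A minimal,
self-contained sketch of that shape (an illustration, not the actual
NativeProcessLinux code; the function name is invented):

    #include <cstdio>
    #include <sys/ptrace.h>
    #include <sys/wait.h>
    #include <unistd.h>

    pid_t LaunchTracedInferior(const char *path, char *const argv[],
                               char *const envp[])
    {
        pid_t pid = fork();
        if (pid == 0)
        {
            // Child: request tracing before exec. The kernel stops the
            // child with SIGTRAP when execve() succeeds, so the monitor
            // owns it from the first instruction -- no attach race.
            ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);
            execve(path, argv, envp);
            _exit(127); // only reached if execve() failed
        }
        if (pid > 0)
        {
            // Parent/monitor: wait for the exec trap before doing
            // anything else with the inferior.
            int status = 0;
            if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status))
                printf("inferior %d stopped at exec, signal %d\n",
                       (int)pid, WSTOPSIG(status));
        }
        return pid;
    }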
>>> On Thu, Jun 26, 2014 at 5:21 PM, Greg Clayton <gclay...@apple.com> wrote:
>>>
>>>> > On Jun 26, 2014, at 4:51 PM, Shawn Best <sb...@blueshiftinc.com> wrote:
>>>> >
>>>> > In addition to the (lldb) prompt out of order, I am also
>>>> > investigating some other strange messages when I run a simple
>>>> > application with no breakpoints. It seems related to thread
>>>> > synchronization surrounding the startup/management of the inferior
>>>> > process.
>>>> >
>>>> > (lldb) run
>>>> >
>>>> > Process 4417 launching
>>>> > Process 4417 stopped
>>>> > * thread #1: tid = 4417, 0x00007f3b99b9c2d0, name = 'a.out', stop reason = trace
>>>> >     frame #0: 0x00007f3b99b9c2d0
>>>> > ->  0x7f3b99b9c2d0: movq %rsp, %rdi
>>>> >     0x7f3b99b9c2d3: callq 0x7f3b99b9fa70
>>>> >     0x7f3b99b9c2d8: movq %rax, %r12
>>>> >     0x7f3b99b9c2db: movl 0x221b17(%rip), %eax
>>>> >
>>>> > Process 4417 launched: '/home/shawn/Projects/a.out' (x86_64)
>>>> > Hello world!
>>>> > The string is Test String : 5
>>>> > Process 4417 exited with status = 0 (0x00000000)
>>>> > (lldb)
>>>> >
>>>> > ------------- or ----------------
>>>> >
>>>> > (lldb) run
>>>> >
>>>> > Process 4454 launching
>>>> > Process 4454 launched: '/home/shawn/Projects/a.out' (x86_64)
>>>> > Process 4454 stopped
>>>> > * thread #1: tid = 4454, 0x00007ffdec16c2d0, name = 'a.out', stop reason = trace
>>>> >     frame #0: 0x00007ffdec16c2d0
>>>> > error: No such process
>>>> >
>>>> > Hello world!
>>>> > The string is Test String : 5
>>>> > Process 4454 exited with status = 0 (0x00000000)
>>>> > (lldb)
>>>> >
>>>> >
>>>> > As it is launching the target application, it appears to stop in a
>>>> > random place (stop reason = trace) and then continue executing.
>>>> > When it momentarily stops, I see it pop/push an IOHandler.
>>>>
>>>> Yes, the Process IOHandler is pushed and popped on every _public_
>>>> stop. There are notions of public stops that the user finds out
>>>> about, and private stops where the Process might be in the middle of
>>>> trying to single-step over a source line and might start/stop the
>>>> process many, many times.
>>>>
>>>> This stopping at random locations seems like a racy bug in
>>>> ProcessLinux that we should really look into fixing.
>>>>
>>>> > I added some logging to ProcessPOSIX, and see it hitting
>>>> > RefreshAfterStop() and DoResume() many times. Is this
>>>> > normal/expected?
>>>>
>>>> When you start a process, you will run/stop many times as the shared
>>>> libraries get loaded. Normally a breakpoint is set in the dynamic
>>>> loader that allows us to intercept when shared libraries are
>>>> loaded/unloaded, so that may explain a few of the stops you are
>>>> seeing.
>>>>
>>>> Other run/stop flurries can result when single-stepping over a source
>>>> line, or stepping past a software breakpoint (disable bp, single
>>>> instruction step, re-enable breakpoint, resume).
>>>>
>>>> > I have added a bunch of logging to Push/Pop IOHandler, ThreadCreate,
>>>> > HandleProcessEvent and see big differences in the order of events
>>>> > changing from run to run.
>>>>
>>>> We have a lot of threading in LLDB, so some of this will be normal,
>>>> but other times it can indicate a bug, much like you are seeing when
>>>> the process stops at a random location 0x00007ffdec16c2d0. This could
>>>> also be an uninitialized variable in ProcessLinux (or other classes
>>>> like ThreadLinux, etc.) that gets a random value when a class
>>>> instance is initialized. Please do try and track that down. To get a
>>>> handle on process control you can enable process and step logging:
>>>>
>>>> (lldb) log enable -T -f /tmp/process.txt lldb process step
>>>>
>>>> Then compare a good and a bad run and see what differs.
>>>>
>>>> > One other small thing: in POSIX/ProcessMonitor, it calls waitpid()
>>>> > and checks the return code:
>>>> >
>>>> > lldb::pid_t wpid;
>>>> > if ((wpid = waitpid(pid, &status, 0)) < 0)
>>>> > {
>>>> >     args->m_error.SetErrorToErrno();
>>>> >     goto FINISH;
>>>> > }
>>>> > else ...
>>>> >
>>>> > lldb::pid_t is a uint64, while waitpid() returns an int32, with
>>>> > negative numbers used for error codes. This bug is repeated in a
>>>> > few places.
>>>>
>>>> This is bad, please use native types (::pid_t) for these locations so
>>>> that this works correctly.
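To make that concrete: the unsigned 64-bit lldb::pid_t silently converts
waitpid()'s -1 error return into a huge positive value, so the "< 0" test
can never fire. A small sketch of the bug and the fix (lldb_pid_t below is
a stand-in for lldb::pid_t, and the function name is invented):

    #include <cstdint>
    #include <sys/wait.h>

    typedef uint64_t lldb_pid_t; // stand-in for lldb::pid_t

    bool WaitForInferior(::pid_t pid, int &status)
    {
        // Buggy form: wpid is unsigned, so -1 becomes
        // 0xFFFFFFFFFFFFFFFF and the branch is never taken:
        //   lldb_pid_t wpid;
        //   if ((wpid = waitpid(pid, &status, 0)) < 0) { /* dead code */ }

        // Fixed form: keep the native signed type until the sign check
        // is done, then widen if an lldb::pid_t is needed.
        ::pid_t wpid = waitpid(pid, &status, 0);
        if (wpid < 0)
            return false; // errno describes the failure (ECHILD, EINTR, ...)

        lldb_pid_t lldb_pid = static_cast<lldb_pid_t>(wpid);
        (void)lldb_pid;
        return true;
    }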
>>>> So a few things regarding your race conditions:
>>>>
>>>> 1 - on linux, does a process start running first, and then you
>>>> quickly try to attach to it? If so, this could explain the difference
>>>> you might be seeing when connecting to a process. On Darwin, our
>>>> posix_spawn() has a non-portable flag that stops the process at the
>>>> entry point with a SIGSTOP, so we are guaranteed not to have a race
>>>> condition when we launch a process for debugging [see the posix_spawn
>>>> sketch after the thread].
>>>>
>>>> 2 - The messages coming in out of order seem to be related to the
>>>> eStateLaunching and eStateStopped events not being delivered in the
>>>> correct order. In your first example they came through OK, and in the
>>>> second case we got an eStateStopped first, followed by the
>>>> eStateLaunching. I would take a look at who is sending these out of
>>>> order. If you fix these out-of-order events, it might fix your random
>>>> stopping at a wrong location?
>>>>
>>>> Greg
>>>>
>>>> > Shawn.
>>>> >
>>>> > On Fri, Jun 20, 2014 at 3:34 PM, Greg Clayton <gclay...@apple.com> wrote:
>>>> >
>>>> > > On Jun 19, 2014, at 7:27 PM, Ed Maste <ema...@freebsd.org> wrote:
>>>> > >
>>>> > > Hi Greg,
>>>> > >
>>>> > > As far as I can tell what's happening here is just that
>>>> > > Process::Resume() completes and the next prompt is emitted (from
>>>> > > the main-thread?) before the IOHandler gets pushed in another
>>>> > > thread.
>>>> > >
>>>> > > Output from "log enable -n lldb process" with an added log printf
>>>> > > where ::Resume returns:
>>>> > >
>>>> > > step
>>>> > > main-thread Process::Resume -- locking run lock
>>>> > > main-thread Process::PrivateResume() m_stop_id = 4, public state:
>>>> > > stopped private state: stopped
>>>> > > main-thread Process::SetPrivateState (running)
>>>> > > main-thread Process thinks the process has resumed.
>>>> > > internal-state(p Process::ShouldBroadcastEvent (0x80c410480) => new
>>>> > > state: running, last broadcast state: running - YES
>>>> > > main-thread Process::PrivateResume() returning
>>>> > > (lldb) internal-state(p Process::HandlePrivateEvent (pid = 15646)
>>>> > > broadcasting new state running (old state stopped) to public
>>>> > > wait4(pid=15646) MonitorChildProcessThreadFunction ::waitpid (pid =
>>>> > > 15646, &status, options = 0) => pid = -15646, status = 0x0000057f
>>>> > > (STOPPED), signal = 5, exit_state = 0
>>>> > > internal-state(p PushIOHandler
>>>> > > wait4(pid=15646) Process::SetPrivateState (stopped)
>>>> > >
>>>> > > As before, I don't see how we intend to enforce synchronization
>>>> > > between those two threads. It looks like my tiny usleep in
>>>> > > ::PrivateResume delays the next prompt just long enough for the
>>>> > > other IOHandler to be pushed.
>>>> >
>>>> > That will do it. It is tough because Process::Resume() might not
>>>> > succeed, so we can't always push the ProcessIOHandler.
>>>> >
>>>> > I need to find a better way to coordinate the pushing of the
>>>> > ProcessIOHandler so it happens from the same thread that initiates
>>>> > the resume. Then we won't have this issue, but I need to do this
>>>> > carefully so it doesn't push when the process won't be resumed
>>>> > (since it might already be resumed) or in other edge cases.
>>>> >
>>>> > Other ideas would be to have Process::Resume() do some
>>>> > synchronization between the current thread and the internal-state
>>>> > thread, so it waits for the internal-state thread to get to the
>>>> > running state before it returns from Process::Resume()...
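Greg's last idea maps naturally onto a condition variable. A hedged sketch
of that shape (illustrative only; ResumeSync and its method names are
invented here, not LLDB's API):

    #include <condition_variable>
    #include <mutex>

    class ResumeSync
    {
        std::mutex m_mutex;
        std::condition_variable m_cv;
        bool m_running = false;

    public:
        // Called by the internal-state thread once it has handled the
        // running event and pushed the ProcessIOHandler.
        void NotifyRunning()
        {
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                m_running = true;
            }
            m_cv.notify_all();
        }

        // Called at the end of Process::Resume() on the driving thread,
        // so the next prompt cannot be printed before the IOHandler is
        // pushed.
        void WaitForRunning()
        {
            std::unique_lock<std::mutex> lock(m_mutex);
            m_cv.wait(lock, [this] { return m_running; });
        }
    };

Real code would also need a way out when the resume fails (Greg's caveat
above), e.g. a second flag or a timed wait, so Resume() doesn't block
forever on a process that never started running.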
>>>> > Greg
>>>
>>> --
>>> Todd Fiala | Software Engineer | tfi...@google.com | 650-943-3180
>>
>> --
>> -Todd
>
> --
> -Todd

--
-Todd
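As an aside on Greg's point 1 above: the non-portable Darwin flag he
refers to is POSIX_SPAWN_START_SUSPENDED, an Apple extension to
posix_spawn(). A minimal sketch of using it (Darwin-only; illustrative,
and the function name is invented):

    #include <signal.h>
    #include <spawn.h>
    #include <sys/types.h>

    pid_t SpawnSuspended(const char *path, char *const argv[],
                         char *const envp[])
    {
        posix_spawnattr_t attr;
        posix_spawnattr_init(&attr);
        // Apple-specific flag: the child is left stopped (as if by
        // SIGSTOP) at its entry point, so the debugger can attach before
        // any user code runs. Not available on Linux/FreeBSD.
        posix_spawnattr_setflags(&attr, POSIX_SPAWN_START_SUSPENDED);

        pid_t pid = -1;
        int err = posix_spawn(&pid, path, nullptr, &attr, argv, envp);
        posix_spawnattr_destroy(&attr);

        // Resume later with kill(pid, SIGCONT) once the debugger is set up.
        return err == 0 ? pid : -1;
    }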
_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev