Re: [lldb-dev] More linux process control and IOHandler races

Matthew Gardiner Wed, 13 Aug 2014 01:52:31 -0700

Hi Greg,

Yes, it's clear that a planned attack is required for this. I understandthe need to have public and private state e.g. for source-level steps,and indeed for launching a process. That's why I was a bit surprised todiscover, after debugging it, that the very first stop on "processlaunch" resulted in ShouldBroadcastEvent returning true, as I thoughtthat this would be an instance where we'd like to hide this event fromthe user. I did spend sometime trying to figure out how "ShouldStop" and"ShouldReportStop" worked, but quickly found myself in the world ofthread plans, a world in which I know very little. So I'd certainlyappreciate you guys, who have that knowledge, thrashing this design out.

Having these (proposed) different ProcessState subclasses sounds to be afar better solution than "broadcast hijacking + tricks". My only minorcritique of the design you floated being that you mentioned a "stack" ofthese subclass instances, although when we receive an event (i.e.waitpid returns a state change), it seems apparent that you'll then passthis event to (all) the private and the public subclasses. (I tend tothink of a stack as being a system where you'd send the event to justthe topmost element). Is there any reason to have more just a privatehandler and a public handler?

Regarding simple, interim fixes, I did come up with something (whichaddresses the observation associated with this thread's first post),which is a little icky. I'll attach it though, to see if it attracts anycomment. Essentially, I have a flag which allows first stop to beidentified, and arranges that ShouldBroadcastEvent returns false forthis stop. Also don't push the IOHandler for launching state (I think itgets pushed later for the running state), and force public state changeafter the first stop, which seemed weird, but was required, due to theimplementation of ProcessLaunch waiting for the initial trap.

Anyway, thanks for you and Jim's interest in getting a goodimplementation for this out there.


Matt


Greg Clayton wrote:

Jim and I have discussed the right thing to do which we plan to implement as 
soon as we get the chance.

Current we have two process states: public and private. The private state is 
always up to date and is used to track what the process is currently doing. The 
public state gets set when we believe the end user should see an event that 
will update the GUI or TTY with the current state if needed (stopped or 
exited). There is a lot of tricky code that uses the private process state to 
manage things like the thread plans. Things like source level single step might 
involve starting and stopping and hitting breakpoints and single instruction 
stepping, but the user only wants to see the final eStateStopped when the 
source level step is done. Expressions are also tricky, the public state of the 
process is stopped, but when we run an expression, we might resume the process 
many times and the user will always see that the public state if stopped. 
Internally though the thread plans are doing a whole bunch of stuff to the 
process and the user never hears anything about these (no events go public).

So the current public and private stuff uses broadcaster hijacking and all 
sorts of other tricks to avoid letting the end user know about the starts and 
stops that they shouldn't know about. We eventually want to have a new class, 
lets call in lldb_private::ProcessState for now, that manages the current 
process state and always receives the process events. There would be no more 
public and private events. We would have one version of the ProcessState 
subclass that would cause the GUI/TTY to update (replacing the public events) 
and one ProcessState subclass that would manage running thread plans. There 
would be stack of these and this will help us avoid the whole 
public/private/hijacking stuff we have now.

Launching a process would end up pushing a new ProcessState class that could 
handle all the necessary starts and stop we need to get the process to a 
quiescent state that is presentable to the user. The ProcessState subclass 
could eat all the events it needs to while launching and then propagate the 
stopped or running event when it gets popped off the stack.

So we should make some simple fixes for now to work around the issue and get 
things working, but it would also be great to get the new system up and running 
so we can make this easier in the future for people that want to make changes.

Greg

On Aug 8, 2014, at 1:08 AM, Matthew Gardiner <m...@csr.com> wrote:

Folks,

Regarding "launching using the shell",  I don't think applies in the buggy case 
that myself and Shawn are looking at.

I do:

(lldb) target create ./test
...
(lldb) process launch
...

and when I inspected the call to exec, I see that my exe name is the program 
passed.

Thanks for the insight into the broadcasting of stop events. That explains why 
I see in the ShouldBroadcastEvent, the ShouldStop and ShouldReportStop calls.

However, it would be nice to know if the first stop event should be broadcast for 
"process launch".  I think it's an implementation detail, and therefore should 
not. That would help to fix this issue.

However, Shawn's original suggestion to fix this issue circumvents the above 
debate, by replacing the call of HandlePrivateEvent with SetPublicState. So 
which fix is best? Calling SetPublicState rather than HandlePrivateEvent is 
certainly more expedient, and avoids this debate... but is it more 
portable/future-proof?

Matt


jing...@apple.com wrote:

If you are launching using the shell, you'll see more stops before you get to 
the executable you are actually trying to launch.  In that case, instead of 
just running the binary directly you effectively do:

/bin/bash exec <binary> arg1 arg2

so that you can get bash (or whatever is set in $SHELL) to do the argument 
expansion for you.  GetResumeCountForLaunchInfo calculates this, then it is 
stuffed into the ProcessLaunchInfo (SetResumeCount).  On Mac OS X we always let 
the Platform launch, then attach, so in that case the AttachCompletionHandler 
does the extra resumes.  I'm not all that familiar with how the Linux side 
work, but it also seems to use the ProcessLaunchInfo's resume count.

Note that in general in lldb not all publicly broadcast stop messages are going to 
result in a stop.  For instance, all the breakpoint command & condition 
handling goes on as a result of the broadcast of the public stop event, but the 
process might just turn around an continue based on that.  Whether to suppress the 
broadcast and continue from the private state thread or broadcast the event and let 
the upper levels of lldb take care of what happens from there on really depends on 
where it makes sense to handle the stop.  So for the case of breakpoint stops (or 
stop hooks, another example), those end up being equivalent to user typed commands, 
just done for the user automatically by the system.  So having them happen in a 
world where the public state wasn't sync'ed up to the private state ended up being 
very awkward.

I haven't looked at the launching code in detail recently so I am not sure 
whether it makes sense for the first stop to be handled as it is.

Jim

On Aug 7, 2014, at 6:29 AM, Matthew Gardiner <m...@csr.com> wrote:

Hi Shawn,

I spent some time today looking at how to arrange for ShouldBroadcast to return 
false for this first stop. I managed to produce a quick hack for this (i.e. 
just counted the number of stops), but due to other distractions (from the rest 
of my job) I didn't get that far into discovering a nice way of achieving 
this...

What I did discover is that with my build just doing "process launch" results 
in 3 stops (and 3 private resumes). That in itself I find surprising, since I was under 
the impression that I should see the inferior stop just once when exec is trapped by 
PTRACE_ME.

I then discovered that for each of these 3 stops ShouldBroadcast calls 
Thread::ShouldStop, the Thread::ShouldStop returns true for the first stop and 
false for the other 2. Looking into these behaviour differences I then found 
that from within Thread::ShouldStop we then call into the following:

    StopInfoSP private_stop_info (GetPrivateStopInfo());
    if (private_stop_info && 
private_stop_info->ShouldStopSynchronous(event_ptr) == false)
    {

and also

bool over_ride_stop = current_plan->ShouldAutoContinue(event_ptr);

the results from either of these, it seems providing the reasoning behind the 
different true/false returns. I'll return to spend a bit more time on this 
tomorrow. Let me know if you get any further on a similar vein!

thanks
Matt



Shawn Best wrote:

Matt,

I think you are probably right, although there are other places where it 
directly calls SetPublicState().   I was wondering about the possibility there 
could be other listeners waiting for a broadcast public Stop event.  Is that a 
possibility?

Some others here were investigating some unit tests that were failing 
intermittently (StopHook).  Their description of the problem sounds unrelated 
to the launch code, but this patch also magically fixes that.

Shawn.

On 8/6/2014 6:26 AM, Matthew Gardiner wrote:

Shawn,

Like I said earlier your patch worked. However I think the right fix is to 
arrange that ShouldBroadcast returns false for this first stop. I believe this, 
because firstly no stops should be reported here since the user is only 
interested in launching a program, and additionally because it enables us to 
fix lldb without removing the call to HandlePrivateEvent. This, I think, is 
important to preserve as the central point for process state change handling.

Matt



Shawn Best wrote:

Hi Matthew,

I have also been tracking this bug.  I believe there are other bugs in the unit 
tests failing indirectly because of this.  I also have a patch that will fix 
it, but was sitting on it until the other one landed.  These bugs do not show 
up on OSX since the inferiors are launched separately then attached to.

The first odd thing the launching code does is push an IOHandler when it sees 
the state transition to 'launching'. This is odd because I believe the 
launching program will always come up in a stopped state which will immediately 
pop the IOHandler.

At launch, the process comes up in the stopped state.  The launch code manually 
calls HandlePrivateEvent() with the stop event, which then broadcasts the 
Event.  When HandleProcessEvent gets the public stop, it dumps out the current 
thread state just as if an executing inferior hit a breakpoint and stopped.

One way to fix this would be:

1. Don't push io handler when state is 'launching'
2. Instead of manually calling HandlePrivateEvent, call SetPublicState().

Alternately, we could try and debug why ShouldBroadcast() returns true, but 
that appears to be by design since it is expecting the public stop event to pop 
the IOHandler that had been pushed when launching.

I have attached a patch demonstrating this.  In conjunction with the other 
patch for IOHandler race condition, it will fix a bunch of this kind of 
behaviour.

Shawn.

On 8/5/2014 6:59 AM, Matthew Gardiner wrote:

Jim,

I've been trying to debug an issue (I see it on 64-bit linux) where, I do "target create" 
and "process launch" and despite not requesting *stop at entry*, the first stop (which I 
believe is just the initial ptrace attach stop) is reported to the lldb command line. I added some 
fprintf to Process::HandlePrivateEvent, which counts the number of eStoppedState events seen and 
whether ShouldBroadcastEvent returns true for this event. Here's the output from my program with 
diagnostic:

(lldb) target create ~/me/i64-hello.elf
Current executable set to '~/me/i64-hello.elf' (x86_64).
(lldb) process launch
MG Process::HandlePrivateEvent launching stopped_count 0 should_broadcast 1
Process 31393 launching
MG Process::HandlePrivateEvent stopped stopped_count 1 should_broadcast 1
MG Process::HandlePrivateEvent running stopped_count 1 should_broadcast 1
Process 31393 launched: 'i64-hello.elf' (x86_64)
Process 31393 stopped
* thread #1: tid = 31393, 0x0000003675a011f0, name = 'i64-hello.elf', stop 
reason = trace
    frame #0: 0x0000003675a011f0
-> 0x3675a011f0:  movq   %rsp, %rdi
   0x3675a011f3:  callq  0x3675a046e0
   0x3675a011f8:  movq   %rax, %r12
   0x3675a011fb:  movl   0x21eb97(%rip), %eax
(lldb) MG Process::HandlePrivateEvent stopped stopped_count 2 should_broadcast 0
MG Process::HandlePrivateEvent running stopped_count 2 should_broadcast 0
MG Process::HandlePrivateEvent stopped stopped_count 3 should_broadcast 0
MG Process::HandlePrivateEvent running stopped_count 3 should_broadcast 0

In summary, lldb reports the inferior to be stopped (even though /proc/pid/status and 
lldb "target list" say it is running). Clearly this is wrong (hence my earlier 
post).

Am I correct in assuming that when  ShouldBroadcastEvent returns true this 
means that lldb should show this event to the debug user? (And thus hide other 
events where ShouldBroadcastEvent=false).

What puzzled me was why ShouldBroadcastEvent return true for this very first 
stop. Is this a bug?

I also spent sometime at ShouldBroadcastEvent and saw that this:

        case eStateStopped:
        case eStateCrashed:
        case eStateSuspended:
        {
         ....
                if (was_restarted || should_resume || m_resume_requested)
                {

evaluates as false, and hence the PrivateResume code is not called... does this 
seem buggy to you for this very first stop?

I thought I'd try asking you, since in a previous mail from Greg, he cited you 
as being a thread-plan expert. (Hope that's ok!). I'd really appreciate your 
help in clarifying the above questions for me, and if you have time, giving me 
some ideas as to how to trace this one further e.g. how 
m_thread_list.ShouldStop and m_thread_list.ShouldReportStop should behave, etc.

thanks for your help
Matt

Matthew Gardiner wrote:

Folks,

In addition to the overlapping prompt race Shawn Best and myself are looking 
at, I'm seeing another issue where if I launch a process, I get a stop 
(presumably the in) being reported to the UI.

(lldb) target create ~/mydir/i64-hello.elf
Current executable set to '~/mydir/i64-hello.elf' (x86_64).
(lldb) process launch
Process 27238 launching
Process 27238 launched: '64-hello.elf' (x86_64)
Process 27238 stopped
* thread #1: tid = 27238, 0x0000003675a011f0, name = 'i64-hello.elf'
    frame #0:
(lldb) target list
Current targets:
* target #0: i64-hello.elf ( arch=x86_64-unknown-linux, platform=host, 
pid=27238, state=running )
(lldb)

As you can see the "target list" reflects that the process is running. Which I 
confirmed by looking at /proc/27238/status.

Anyone else seeing this?

thanks
Matt



Member of the CSR plc group of companies. CSR plc registered in England and 
Wales, registered number 4187346, registered office Churchill House, Cambridge 
Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com <http://www.csr.com>. Keep up to date with CSR on our technical 
blog, www.csr.com/blog <http://www.csr.com/blog>, CSR people blog, www.csr.com/people 
<http://www.csr.com/people>, YouTube, www.youtube.com/user/CSRplc <http://www.youtube.com/user/CSRplc>, 
Facebook, www.facebook.com/pages/CSR/191038434253534 <http://www.facebook.com/pages/CSR/191038434253534>, or 
follow us on Twitter at www.twitter.com/CSR_plc <http://www.twitter.com/CSR_plc>.
New for 2014, you can now access the wide range of products powered by aptX at 
www.aptx.com <http://www.aptx.com>.
_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu <mailto:lldb-dev@cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev


To report this email as spam click 
https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj!GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==
 
<https://www.mailcontrol.com/sr/EjKNgqvIx0TGX2PQPOmvUj%21GOBh06pKKNwnTW0ZqkNYNbZeofOurgZMo6Cl2EgPiaCw7kl6fPUTCXaTERp6oIw==>
 .

_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu <mailto:lldb-dev@cs.uiuc.edu>
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Index: include/lldb/Target/Process.h
===================================================================
--- include/lldb/Target/Process.h	(revision 214950)
+++ include/lldb/Target/Process.h	(working copy)
@@ -3146,6 +3146,7 @@
     
     DISALLOW_COPY_AND_ASSIGN (Process);
 
+    bool m_first_stop;
 };
 
 } // namespace lldb_private
Index: source/Target/Process.cpp
===================================================================
--- source/Target/Process.cpp	(revision 214950)
+++ source/Target/Process.cpp	(working copy)
@@ -703,7 +703,8 @@
     m_force_next_event_delivery(false),
     m_last_broadcast_state (eStateInvalid),
     m_destroy_in_process (false),
-    m_can_jit(eCanJITDontKnow)
+    m_can_jit(eCanJITDontKnow),
+    m_first_stop(true)
 {
     CheckInWithManager ();
 
@@ -2774,6 +2775,7 @@
                         // This delays passing the stopped event to listeners till DidLaunch gets
                         // a chance to complete...
                         HandlePrivateEvent (event_sp);
+                        SetPublicState(state, false);
 
                         if (PrivateStateThreadIsValid ())
                             ResumePrivateStateThread ();
@@ -3661,6 +3663,12 @@
                     return_value = true;
                     SynchronouslyNotifyStateChanged (state);
                 }
+
+                if (m_first_stop)
+                {
+                    m_first_stop = false;
+                    return_value = false;
+                }
             }
         }
         break;
@@ -3886,7 +3894,7 @@
         {
             // Only push the input handler if we aren't fowarding events,
             // as this means the curses GUI is in use...
-            if (!GetTarget().GetDebugger().IsForwardingEvents())
+            if (!GetTarget().GetDebugger().IsForwardingEvents() && new_state != eStateLaunching)
                 PushProcessIOHandler ();
         }
         else if (StateIsStoppedState(new_state, false))

_______________________________________________
lldb-dev mailing list
lldb-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev

Re: [lldb-dev] More linux process control and IOHandler races

Reply via email to