Thanks! This has been very helpful. For reference, the reason why that SetResumeState() was ever there to begin with is because when I was first trying to make all this work I used the ProcessPOSIX as a model, and POSIXThread does the same thing. There's even a comment saying that it doesn't seem right, so I guess I should have looked more closely. But everything was still new to me at the time, so I wasn't sure what the implications of fiddling with it were.
On Mon, Jun 1, 2015 at 4:38 PM Jim Ingham <jing...@apple.com> wrote: > One other bit, just in case you're looking at this more closely... The > other place where we call SetResumeState in lldb is in "process continue" > and "thread continue". "process continue" sets all the states to running. > The idea is that "continue" should get all the threads that aren't > user-suspended running, but I don't think the code there actually does > anything useful. I'll play around with taking that out. > > The other place is in "thread continue", where it sets all the specified > threads to running, and all the others to suspended. That's appropriate > since thread continue is really a composite command that suspends all the > unspecified threads, resumes all the specified ones, and continues the > process. So we're posing as the user here. > > Jim > > > > On Jun 1, 2015, at 3:53 PM, Jim Ingham <jing...@apple.com> wrote: > > > > > >> On Jun 1, 2015, at 3:07 PM, Jim Ingham <jing...@apple.com> wrote: > >> > >> One thing to keep in mind is that there are two kinds of "resume > state". There's the user-specified resume state, which is set by > Thread::SetResumeState, and is stored in m_resume_state in the thread. It > should start out "running" and never change unless some user command does > so. > >> > >> I don't know why you are ever calling "SetResumeState" on Windows > threads, that's not something you should have to do. Who is changing this > value? ProcessGDBRemote calls this in one place, but that is when we are > restarting a process to kill it, and we're just trying to keep all the > other threads from doing something annoying like crashing while we're > trying to kill it. In general, though, internal to lldb you should never > need to change this. > >> > >> The more relevant state is the current thread resume state, which > records what the thread is going to do on this particular resume. This > state is stored in m_temporary_resume_state. That will control the state > of this thread on resume, but note that should only ever get consulted on > the "should stop" side of the thread plan logic. > > > > This comment wasn't very clear. What I meant was that it is legit to > ask "did this thread do anything interesting on last resume" when you stop, > for instance to decide whether you need to re-fetch its stop reason, etc. > But you shouldn't decide whether the thread is going to run or not on the > NEXT resume based on what it did in the previous resume. > > > >> And of course if m_resume_state is eStateSuspended, that should > short-circuit any of our normal logic about this thread. > >> > >> There is another tricky bit about the whole resume negotiation, which > is that you may not need to resume the target at all to perform the > continue action. For instance, if you have nested inline functions which > begin on the same address, then "step in" on that thread just moves the > virtual "inline depth" to the next younger stack frame, but doesn't move > the PC at all. So ShouldResume can return false. > >> > >> So the first step is to take out the call to SetResumeState, that's not > something you should ever have to do. Then if that causes problems, we > should figure out what's going wrong on Windows. > >> > >> Jim > >> > >> > >>> On Jun 1, 2015, at 1:30 PM, Zachary Turner <ztur...@google.com> wrote: > >>> > >>> I'm not quite ready to throw the blanket over this one yet :) > >>> > >>> What was the value of resume_state when it called WillResume()? It > sounds like it was eStateSuspended, which if that's the case, then it still > seems like something deeper inside of LLDB's thread plans is confused about > something, because calling WillResume(eStateSuspended) means "The process > is seriously about to resume, and when it does, this thread is not going to > remain suspended". > >>> > >>> But it's possible I'm not understanding something about the > interaction between resume states and the thread plans. > >>> > >>> Do you have any insight here Jim? > >>> > >>> On Mon, Jun 1, 2015 at 1:26 PM Adrian McCarthy <amcca...@google.com> > wrote: > >>> I think Zach's right. The only plugin that calls SetResumeState > inside WillResume is POSIXThread, and that seems to be overridden by > FreeBSDThread. > >>> > >>> If I remove the call of SetResumeState from > TargetThreadWindows::WillResume, everything starts to work. > >>> > >>> I'll look into adding some logging in ThreadList::WillResume. > >>> > >>> Thanks everyone. > >>> > >>> On Mon, Jun 1, 2015 at 12:58 PM, Zachary Turner <ztur...@google.com> > wrote: > >>> Currently ThreadWindows::WillResume() looks like this: > >>> > >>> > >>> void > >>> TargetThreadWindows::WillResume(lldb::StateType resume_state) > >>> { > >>> SetResumeState(resume_state); > >>> } > >>> > >>> I originally put this code in because that's what one or two of the > other plugins did and I wasn't sure what the "correct" thing to do was. > I'm not sure if it's correct though, or if it could be a cause for the > bug. But if the resume state is eStateSuspended as you say, then that > suggests that something lower level already decided that this thread should > continue to be suspended after the user continues. So the bug might > actually still be earlier. > >>> > >>> There's not a lot of logging in ThreadList::WillResume, I wonder if it > would be worth adding some? > >>> > >>> On Mon, Jun 1, 2015 at 12:44 PM Adrian McCarthy <amcca...@google.com> > wrote: > >>>> The way this works is that when we go to resume the process, all the > thread's get asked whether they need to stop other threads to implement > whatever strategy they are currently pursuing. That query ends up calling > the currently active thread plan's "StopOthers" method. > >>> > >>> Right, and since the ThreadPlanStepOverBreakpoint responds true to > StopOthers, all the other threads get suspended. Once the breakpoint is > restored and stepped over and the ThreadPlanStepOverBreakpoint is popped, > the rest of the threads are still suspended. > >>> > >>>> But ThreadPlanBase::ShouldStop returns false, so if all your threads > are running just the ThreadPlanBase, then they should all resume. > >>> > >>> Except that ShouldStop is not called for threads that are already > suspended (Thread::ShouldStop has an early out if the resume state is > eStateSuspended). So I still don't see how those threads can ever get out > of the suspended state. > >>> > >>> > >>> > >>> On Mon, Jun 1, 2015 at 11:51 AM, Jim Ingham <jing...@apple.com> wrote: > >>> The way this works is that when we go to resume the process, all the > thread's get asked whether they need to stop other threads to implement > whatever strategy they are currently pursuing. That query ends up calling > the currently active thread plan's "StopOthers" method. If one thread > returns true to StopOthers, then that thread will get to run solo. If more > than one thread returns true I do a little round robin to pick which one > gets to go. But ThreadPlanBase::ShouldStop returns false, so if all your > threads are running just the ThreadPlanBase, then they should all resume. > >>> > >>> This all happens in ThreadList::WillResume. > >>> > >>> Note I started to add some commands to manipulate the thread plans - > of which "thread plan list" is the relevant one. The work isn't done yet > (for instance I should actually DO something with the --internal and > --verbose options, but for now I only print user visible plans, not > implementation only plans. Anyway, if you are poking around in this area > that might be of some use. > >>> > >>> Jim > >>> > >>> > >>>> On Jun 1, 2015, at 8:04 AM, Adrian McCarthy <amcca...@google.com> > wrote: > >>>> > >>>> Thanks for the info. > >>>> > >>>> This is not theoretical. I'm trying to get TestBreakAfterJoin to > pass on Windows. Step 1 was to convert it use <thread> instead of > <pthreads.h>. Step 2 was to fix some minor issues in TargetThreadWindows. > >>>> > >>>> But now the inferior deadlocks because the one thread that's not > suspended is waiting on the ones that are. Once the > ThreadPlanStepOverBreakpoint plan is popped, the current plan is > ThreadPlanBase, which, as far as I can tell, does nothing to resume > suspended threads. > >>>> > >>>> I'll compare this to what happens on another platform to see if there > should be some other thread plan in use. > >>>> > >>>> Adrian. > >>>> > >>>> On Fri, May 29, 2015 at 4:50 PM, Jim Ingham <jing...@apple.com> > wrote: > >>>> When stepping over a breakpoint, the ThreadPlanStepOverBreakpoint > gets pushed, handles the single instruction step - during which it does > suspend the other threads - then it gets popped. When you next resume, > whatever plan was handling the stepping before the breakpoint was hit will > resume with whatever policy for running other threads it was using. > >>>> > >>>> So it's up to the plan that was on the stack before the > "StepOverBreakpoint" was pushed to decide this. Most of the more complex > plans (like step-over/step-into) try to keep the other threads from running > if possible (unless the user instructed otherwise) but they also will let > all the threads run if there's something going on that might deadlock. For > instance, if you are doing "next" and we step through straight-line > instructions in a function, we will only run the one thread you are > stepping in (by default, you can control this with options to the "thread > step-over" command. But if we step into a function, we set a breakpoint > on the return address and then run with all threads resumed because > stepping out of a function could run arbitrary code. > >>>> > >>>> Anyway, was this a theoretical question, or do you have some instance > where you are actually seeing a deadlock? > >>>> > >>>> Jim > >>>> > >>>> > >>>>> On May 29, 2015, at 1:59 PM, Adrian McCarthy <amcca...@google.com> > wrote: > >>>>> > >>>>> [I'm trying to make TestBreakAfterJoin work on Windows.] > >>>>> > >>>>> I'm unclear how continuing from a breakpoint in a multi-threaded > inferior is supposed to work. > >>>>> > >>>>> A breakpoint is set, and the inferior runs until one of its threads > hits the breakpoint. The user then selects continue. > >>>>> > >>>>> The thread that had hit the breakpoint has a thread plan type of > ThreadPlanStepOverBreakpoint, which causes all of the other threads to be > set to state eStateSuspended. The thread that had hit the breakpoint then > steps beyond the breakpoint, and the breakpoint is restored. The thread is > then resumed again. > >>>>> > >>>>> But the other threads are all still suspended, causing the inferior > to deadlock. > >>>>> > >>>>> The question is: Where should the other threads have their resume > states set back to a running state? > >>>>> > >>>>> Adrian > >>>>> _______________________________________________ > >>>>> lldb-dev mailing list > >>>>> lldb-dev@cs.uiuc.edu > >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev > >>>> > >>>> > >>> > >>> > >>> _______________________________________________ > >>> lldb-dev mailing list > >>> lldb-dev@cs.uiuc.edu > >>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev > >>> > >> > > > >
_______________________________________________ lldb-dev mailing list lldb-dev@cs.uiuc.edu http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev