Hi Thomas.

I think the problem is with the concurrent code in your test. I do not see
exactly what goes wrong, but the test passes when the full stepping loop is
moved into a single process:

    s := Semaphore new. done := false.
    "Create debugged process"
    debuggedProcess := [p := Processor activeProcess. done := true] newProcess name: 'D'; yourself.
    "Fork a single process that steps the debugged process until its execution is over"
    [
        [done] whileFalse: [debuggedProcess step].
        s signal
    ] forkNamed: 'F'.
    s wait.
    debuggedProcess == p
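
For comparison with your two tests, the same snippet can be written as an
SUnit test method. This is only a sketch: the selector name is hypothetical,
and it assumes the same `assert:identicalTo:` assertion that your tests
already use.

```smalltalk
testActiveProcessInProcessSteppedBySingleForkedProcess
	"Hypothetical selector. Same scenario as the snippet above, with the
	final comparison turned into an assertion."
	| s p debuggedProcess done |
	s := Semaphore new.
	done := false.
	"Create the debugged process D; it is not scheduled, only stepped."
	debuggedProcess := [ p := Processor activeProcess. done := true ] newProcess
		name: 'D';
		yourself.
	"A single forked process F runs the whole stepping loop, then signals."
	[ [ done ] whileFalse: [ debuggedProcess step ].
	s signal ] forkNamed: 'F'.
	"The test process blocks here until F has finished stepping D."
	s wait.
	self assert: debuggedProcess identicalTo: p
```

The only difference from your failing test is that F is forked once and runs
the whole loop, instead of a new F being forked for every step.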


Tue, 21 Jan 2020 at 15:19, Thomas Dupriez <[email protected]>:

> Hello
>
> Some time ago, I stumbled upon a challenging bug in Pharo. I tried some
> things, but this bug still eludes me. Maybe someone here has an idea?
>
> The bug is that the value of `Processor activeProcess` is wrong inside a
> process being stepped by a forked process.
>
> In other words, let's say process D is the (frozen) process I am
> debugging, and its code is to store the active process into some variable
> with `p := Processor activeProcess`:
> - If I step process D normally (with `D step`), then p is correct and
> is process D
> - If I fork to create a process F that steps process D, then p is
> incorrect and is process F
>
> You will find below the code of the two tests I am using to show the bug,
> as well as a condensed version of my findings so far. If you have any idea
> or lead as to where this bug could come from, I would be very grateful.
>
> Thomas Dupriez
>
> -----
>
> Here is the code of the failing test, where process F steps process D:
>
> ```
> testActiveProcessInProcessSteppedInForkedProcess
>     | s p D done |
>     s := Semaphore new. done := false.
>     "Create debugged process"
>     D := [p := Processor activeProcess. done := true] newProcess name: 'D'; yourself.
>     "Until the execution of the debugged process is over, create a forked process to step it"
>     [done]
>         whileFalse: [
>             [D step. s signal] forkNamed: 'F'.
>             s wait.
>         ].
>     self assert: D identicalTo: p
> ```
>
> And here is the passing test, where we step process D directly:
>
> ```
> testActiveProcessInProcessDirectlyStepped
>     | s p D done |
>     s := Semaphore new. done := false.
>     "Create debugged process"
>     D := [p := Processor activeProcess. done := true] newProcess name: 'D'; yourself.
>     "Until the execution of the debugged process is over, step it directly"
>     [done]
>         whileFalse: [
>             D step.
>         ].
>     self assert: D identicalTo: p
> ```
>
> -----
>
> Here are my findings so far:
>
> The call chain of Process>>step is:
> - Process>>step
> - which calls Process>>evaluate:onBehalfOf:
> - which calls BlockClosure>>ensure:
> - which calls BlockClosure>>valueNoContextSwitch
>
> 1) Replacing the call to BlockClosure>>valueNoContextSwitch with a call to
> BlockClosure>>value does not affect the results of the test
>
> 2) Since #valueNoContextSwitch is a primitive, it cannot be instrumented
> easily. I instrumented the code of BlockClosure>>ensure: right before and
> after the call to check the value of the active process. No wrong value
> there, so the problem appears inside the execution of
> #valueNoContextSwitch, and it disappears before this method call returns.
>
> 3) The block being evaluated by #valueNoContextSwitch contains a call to
> Context>>step, which ultimately calls
> InstructionStream>>interpretNextV3PlusClosureInstructionFor: (the method
> that reads the next bytecode and applies it to the execution it is
> stepping). I instrumented this method to log the name of the active process
> and the context being stepped during the execution of both tests. The logs
> show a difference between the passing and failing tests:
> - Passing test: the active process is D for a long time, then "Test
> execution watch dog" for a bit, and finally it is "Morphic UI Process". So
> everything looks in order: the active process is D until the test ends and
> the UI process takes control back
> - Failing test: The logged active process alternates between F and D, and
> looks like this: (I put some F D patterns in bold for readability) *F D*
> F D *F D*.....*F D* F F D F F *F D* F D *F D*...*F D* D D D D r M M M M...
> "M" is the Morphic UI Process, "r" is a seemingly random process whose
> name is "1006977792" in the log. I also logged the AST nodes being stepped,
> but I don't really know how to make use of them.
>
> 4) I did some experiments by tweaking the tests and changing which process
> creates D, which process steps D... and had surprising results:
> 4-1) Original failing test
> [image: a]
>
> In the original failing test, the test process creates the debugged
> process and a fork, and the fork steps the debugged process (blue arrow).
> This test fails.
> 4-2) Original passing test
> [image: a]
>
> In the original passing test, the test process creates the debugged
> process and steps it. This test passes.
> 4-3) Forked process creates AND steps the debugged process
> [image: a]
> If the forked process is the one to create the debugged process, the test
> passes!
> 4-4) Forked process creates the debugged process, and TestProcess steps it
> [image: a]
> So maybe the test passes whenever the debugged process is a descendant of
> the process stepping it? No, 4-5) shows that it is not necessary.
> 4-5) A forked process creates the debugged process. Another forked process
> steps the debugged process
> [image: a]
>
>
>
