2014-07-27 7:09 GMT+02:00 Ben Coman <[email protected]>:

>
>
>
>
> 2014-07-25 16:56 GMT+02:00 Ben Coman <[email protected]>:
>
>>
>> Over the last few days I have been looking deeper into the image locking
>> when suspending a process. It is an interesting rabbit hole [1] that leads
>> to pondering the Delay machinery, that leads to some VM questions.
>>
>> When  pressing the interrupt key it seems to always opens the debugger
>> with the following call stack.
>> Semaphore>>critical:   'self wait'
>> BlockClosure>>ensure:     'self valueNoContextSwitch'
>> Semaphore>>critical:      'ensure: [ caught ifTrue: [self signal]]
>> Delay>>schedule         'AccessProtect critical: ['
>> Delay>>wait              'self schedule'
>> WorldState>>interCyclePause:
>>
>>
> Nicolai Hess wrote:
>
>  Hi Ben,
>
>  I am on Windows too :(
>  So, the fixes does not work (not always) on winddows too. But at least
> they make it less probable to occure, but it still happens.
>  The most distracting thing is, after the first ui lock, pressing
> alt+dot, closing the debuggers, pressing alt+dot ....
> and trying to close the very first debugger, after that, it all works. The
> UI is responsive again and suspending the process does
> not block the ui anymore.
>  It "looks like" supsending the process reactivates another process that
> blocks the UI. And as soon as I terminate this
> process (alt+dot, close debugger ...) all works.
>  But I really don't know.
>
> Nicolai
>
>
> Hi Nicolai,  I have something for you to try.
>
> With the problem seeming to be stuck in the semaphore critical section, I
> read around on mutexes and semaphores and found [1] enlightening, which
> says: "The correct use of a semaphore is for signaling from one task to
> another. A mutex is meant to be taken and released, always in that order,
> by each task that uses the shared resource it protects. By contrast, tasks
> that use semaphores either signal or wait—not both. [...] Any two RTOS
> tasks that operate at **different priorities** and coordinate via a mutex,
> create the opportunity for priority inversion [...while...] semaphores
> [...] do not cause priority inversion when used for signaling." [1]
>
> To summarize with an example, here's how to use a mutex:
> /* Task 1 */
>    mutexWait(mutex_mens_room);
>       // Safely use shared resource
>    mutexRelease(mutex_mens_room);
>
> /* Task 2 */
>    mutexWait(mutex_mens_room);
>       // Safely use shared resource
>    mutexRelease(mutex_mens_room);
>
> By contrast, you should always use a semaphore like this:
> /* Task 1 - Producer */
>     semPost(sem_power_btn);   // Send the signal
>
> /* Task 2 - Consumer */
>     semPend(sem_power_btn);  // Wait for signal
>
> I am out of my depth here, but in light [1] the way the current code seems
> to mix semaphore/mutex concepts with Semaphore>>forMutalExclusion and
> Semaphore>>critical:  gave me some concern.  Anyway, the delay scheduling
> code seemed more like a producer/consumer signaling problem than a shared
> resource mutal exclusion problem, so I rewrote it like the former and fixes
> the problem without causing any *obvious* problems.  However I lack
> experience here and it could definitely use some review.  Can you take the
> attached zip file and...
>
> 1. File in "Delay class-new delay code part 1.st"
>     Defines #enableNewDelayCode and #enableNewDelayCode:
>     Manually define the referenced class variables
>
> 2. File in "Delay class-new delay code part 2.st"
>      Modifies #handleTimerEvent, guarded by enableNewDelayCode
>
> 3. File in "Delay-new delay code part 3.st"
>     Modifies #schedule and #unschedule, guarded by enableNewDelayCode
>
> 4. File in "BackgroundWorkDisplayMorph.st"
>
> 5. Save checkpoint Image to return to between trials.
>
> 6. Open System Browser on BackgroundWorkDisplayMorph>>initialize
>
> Trial 1.
> BackgroundWorkDisplayMorph new openInWorld.
>
> Trial 2.
> Delay enableNewDelayCode: true.
> BackgroundWorkDisplayMorph new openInWorld.
>
> If if its deemed worthwhile, I'll open a ticket to track this more
> formally.  One thing needed is to considered performance impact, I haven't
> so far.
>
> cheers -ben
>
> [1] http://www.barrgroup.com/Embedded-Systems/How-To/RTOS-Mutex-Semaphore
> [2] http://nerdfortress.com/2011/02/18/you-say-semaphore-i-say-mutex/
>
>
>    I notice...
>>     Delay class >> initialize
>>         TimingSemaphore := (Smalltalk specialObjectsArray at: 30).
>> and...
>>     Delay class >> startTimerEventLoop
>>         TimingSemaphore := Semaphore new.
>> which seems incongruous that TimingSemaphore is set in differently.  So
>> while I presume this critical stuff all works fine, just in an exotic way,
>> my entropy-guarding-neuron would just like confirm this is so.
>>
>> --------------
>>
>> In Delay class >> handleTimerEvent the comment says...
>>     "Handle a timer event....
>>           -a timer signal (not explicitly specified)"
>> ...is that event perhaps a 'tick' generated periodically by the VM via
>> that item from specialObjectArray ?  Or is there some other mechanism ?
>>
>> --------------
>>
>> [1] http://www.urbandictionary.com/define.php?term=Rabbit+Hole
>> cheers -ben
>>
>>
>> P.S. I've left the following for some initial context as I change the
>> subject.  btw Nicolai, I confirm that my proposed fixes only work on
>> Windows, not Mavericks (and I haven't checked Linux).
>>
>> Nicolai Hess wrote:
>>
>>
>> Hi ben, thank you for looking at this.
>>
>> 2014-07-22 20:17 GMT+02:00 <[email protected]>:
>>
>>> I thought this might be interesting to learn, so I've gave it a go.  I
>>> had some success at the end, but I'll give a progressive report.
>>>
>>> First I thought I'd try moving the update of StringMorph outside the
>>> worker-process using a Morph's #step method as follows...
>>>
>>> Morph subclass: #BackgroundWorkDisplayMorph
>>>     instanceVariableNames: 'interProcessString stringMorph'
>>>     classVariableNames: ''
>>>     category: 'BenPlay'
>>>     "---------"
>>>
>>> BackgroundWorkDisplayMorph>>initializeMorph
>>>     self color: Color red.
>>>     stringMorph := StringMorph new.
>>>     self addMorphBack: stringMorph.
>>>     self extent:(300@50).
>>>     "---------"
>>>
>>> BackgroundWorkDisplayMorph>>newWorkerProcess
>>>     ^[
>>>         | work |
>>>         work := 0.
>>>         [     20 milliSeconds asDelay wait.
>>>             work := work + 1.
>>>             interProcessString := work asString.
>>>         ] repeat.
>>>     ] newProcess.
>>>     "---------"
>>>
>>> BackgroundWorkDisplayMorph>>step
>>>     stringMorph contents: interProcessString.
>>>     "---------"
>>>
>>> BackgroundWorkDisplayMorph>>stepTime
>>>     ^50
>>>     "---------"
>>>
>>> BackgroundWorkDisplayMorph>>initialize
>>>     | workerProcess running |
>>>     super initialize.
>>>     self initializeMorph.
>>>
>>>     workerProcess := self newWorkerProcess.
>>>     running := false.
>>>
>>>     self on: #mouseUp send: #value to:
>>>     [      (running := running not)
>>>             ifTrue: [  workerProcess resume. self color: Color green.  ]
>>>             ifFalse: [ workerProcess suspend. self color: Color red. ]
>>>     ]
>>>     "---------"
>>>
>>>
>>>
>>> But evaluating "BackgroundWorkDisplayMorph new openInWorld"  found this
>>> exhibited the same problematic behavior you reported... Clicking on the
>>> morph worked a few times and then froze the UI until Cmd-. pressed a few
>>> times.
>>>
>>
>>> However I found the following never locked the GUI.
>>>
>>> BackgroundWorkDisplayMorph>>initialize
>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>     | workerProcess running |
>>>     super initialize.
>>>     self initializeMorph.
>>>
>>>     workerProcess := self newWorkerProcess.
>>>     running := false.
>>>
>>>     [ [      (running := running not)
>>>             ifTrue: [  workerProcess resume. self color: Color green  ]
>>>             ifFalse: [ workerProcess suspend. self color: Color red ].
>>>         10 milliSeconds asDelay wait.
>>>     ] repeat ] fork.
>>>     "---------"
>>>
>>>
>>  This locks the UI as well. Not every timet hough. I did this 5 times,
>> every time in a freshly loaded image and it happens two times.
>>
>>
>>
>>> So the problem seemed to not be with #suspend/#resume or with the shared
>>> variable /interProcessString/.  Indeed, since in the worker thread
>>> /interProcessString/ is atomically assigned a copy via #asString, and the
>>> String never updated, I think there is no need to surround use of it with a
>>> critical section.
>>>
>>> The solution then was to move the "#resume/#suspend" away from the "#on:
>>> #mouseUp send: #value to:" as follows...
>>>
>>> BackgroundWorkDisplayMorph>>initialize
>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>     | workerProcess running lastRunning |
>>>     super initialize.
>>>     self initializeMorph.
>>>
>>>     workerProcess := self newWorkerProcess.
>>>     lastRunning := running := false.
>>>
>>>     [ [    lastRunning = running ifFalse:
>>>         [    running
>>>                 ifTrue: [  workerProcess resume  ]
>>>                 ifFalse: [ workerProcess suspend ].
>>>             lastRunning := running.
>>>         ].
>>>         10 milliSeconds asDelay wait.
>>>     ] repeat ] fork.
>>>
>>>     self on: #mouseUp send: #value to:
>>>     [      (running := running not)
>>>             ifTrue: [  self color: Color green.  ]
>>>             ifFalse: [ self color: Color red. ]
>>>     ]
>>>     "---------"
>>>
>>
>>  And this too :(
>>
>>
>>
>>>
>>> And finally remove the busy loop.
>>>
>>> BackgroundWorkDisplayMorph>>initialize
>>>     "BackgroundWorkDisplayMorph new openInWorld"
>>>     | workerProcess running lastRunning semaphore |
>>>     super initialize.
>>>     self initializeMorph.
>>>
>>>     workerProcess := self newWorkerProcess.
>>>     lastRunning := running := false.
>>>     semaphore := Semaphore new.
>>>
>>>     [ [    semaphore wait.
>>>         running
>>>             ifTrue: [  workerProcess resume  ]
>>>             ifFalse: [ workerProcess suspend ].
>>>     ] repeat ] fork.
>>>
>>>     self on: #mouseUp send: #value to:
>>>     [      (running := running not)
>>>             ifTrue: [  self color: Color green.  ]
>>>             ifFalse: [ self color: Color red. ].
>>>         semaphore signal.
>>>     ]
>>>     "---------"
>>>
>>>
>>
>>  And this locks the UI too. (Loaded the code 20 times, every time after
>> a fresh image start up. Two times I got a locked
>>  ui after the first two clicks).
>>  And I don't understand this code :)
>>
>>
>>
>>> Now I can't say how close that is to how it "should" be done.  Its the
>>> first time I used sempahores and just what I discovered hacking around.
>>> But hey! it works :)
>>>
>>> cheers -ben
>>>
>>>
>>>
>>> Nicolai Hess wrote:
>>>
>>>  I am still struggling with it.
>>>
>>>  Any ideas?
>>>
>>>
>>> 2014-07-09 11:19 GMT+02:00 Nicolai Hess <[email protected]>:
>>>
>>>>
>>>>
>>>>
>>>> 2014-07-09 2:07 GMT+02:00 Eliot Miranda <[email protected]>:
>>>>
>>>>  Hi Nicolai,
>>>>>
>>>>>
>>>>>  On Tue, Jul 8, 2014 at 7:19 AM, Nicolai Hess <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>  I want to create a process doing some work and call #changed on a
>>>>>> Morph.
>>>>>> I want to start/suspend/resume or stop this process.
>>>>>> But sometimes, suspending the process locks the UI-Process,
>>>>>> and I don't know why. Did I miss something or do I have to care when
>>>>>> to call suspend?
>>>>>>
>>>>>>  Wrapping the "morph changed" call in
>>>>>>  UIManager default defer:[ morph changed].
>>>>>>  Does not change anything.
>>>>>>
>>>>>> Here is an example to reproduce it.
>>>>>> Create the process,
>>>>>> call resume, call supsend. It works, most of the time,
>>>>>> but sometimes, calling suspend locks the ui.
>>>>>>
>>>>>> p:=[[true] whileTrue:[ Transcript crShow: (DateAndTime now asString).
>>>>>> 30 milliSeconds asDelay wait]] newProcess.
>>>>>>
>>>>>  p resume.
>>>>>> p suspend.
>>>>>>
>>>>>
>>>>>   If you simply suspend this process at random form a user-priority
>>>>> process you'll never be able to damage the Delay machinery you're using,
>>>>> but chances are you'll suspend the process inside the critical section 
>>>>> that
>>>>> Transcript uses to make itself thread-safe, and that'll lock up the
>>>>> Transcript.
>>>>>
>>>>
>>>>  Thank you Eliot
>>>>  yes I guessed it locks up the critical section, but I hoped with
>>>> would not happen if I the use UIManager defer call.
>>>>
>>>>
>>>>
>>>>>
>>>>>  ThreadSafeTranscript>>nextPutAll: value
>>>>>   accessSemaphore
>>>>>  critical: [stream nextPutAll: value].
>>>>>  ^value
>>>>>
>>>>>  So instead you need to use a semaphore.  e.g.
>>>>>
>>>>>  | p s wait |
>>>>> s := Semaphore new.
>>>>> p:=[[true] whileTrue:[wait ifTrue: [s wait]. Transcript crShow:
>>>>> (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess.
>>>>> wait := true.
>>>>> 30 milliSeconds asDelay wait.
>>>>>  wait := false.
>>>>> s signal
>>>>>
>>>>>  etc...
>>>>>
>>>>
>>>>  Is this a common pattern I can find in pharos classes. Or I need some
>>>> help understanding this. The semaphore
>>>>  wait/signal is used instead of process resume/suspend?
>>>>
>>>> What I want is a process doing repeatly some computation,
>>>> calls or triggers an update on a morph, and I want to suspend and
>>>> resume this process.
>>>>
>>>> I would stop this discussion if someone tells me, "No your are doing it
>>>> wrong, go this way ..",  BUT what strikes me:
>>>> in this example, that reproduces my problem more closely:
>>>>
>>>> |p m s running|
>>>> running:=false.
>>>> m:=Morph new color:Color red.
>>>> s:= StringMorph new.
>>>> m addMorphBack:s.
>>>> p:=[[true]whileTrue:[20 milliSeconds asDelay wait. s
>>>> contents:(DateAndTime now asString). m changed]] newProcess.
>>>> m on:#mouseUp send:#value to:[
>>>>     running ifTrue:[p suspend. m color:Color red.]
>>>>     ifFalse:[p resume.m color:Color green.].
>>>>     running := running not].
>>>> m extent:(300@50).
>>>> m openInWorld
>>>>
>>>>
>>>>  clicking on the morph will stop or resume the process, if it locks up
>>>> I can still press alt+dot ->
>>>>  - a Debugger opens but the UI is still not responsive. I can click
>>>> with the mouse on the debuggers close icon.
>>>>  - nothing happens, as the UI is still blocked.
>>>>  - pressing alt+Dot again, the mouse click on the close icon is
>>>> processed and the first debugger window closes
>>>> - maybe other debuggers open.
>>>>
>>>> Repeating this steps, at some time the system is *fully* responsive
>>>> again!
>>>>  And miraculously, it works after that without further blockages.
>>>> What's happening here?
>>>>
>>>>
>>>>  Nicolai
>>>>
>>>>
>>>>
>>>>>
>>>>>  HTH
>>>>>
>>>>>   regards
>>>>>>  Nicolai
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>  --
>>>>> best,
>>>>> Eliot
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>

Interesting, it works.
So, it is the Delay>>schedule implementation?
Because it uses a #critical block for the wait instead of wait/signal only?

Nicolai

Reply via email to