2014-07-27 7:09 GMT+02:00 Ben Coman <[email protected]>: > > > > > 2014-07-25 16:56 GMT+02:00 Ben Coman <[email protected]>: > >> >> Over the last few days I have been looking deeper into the image locking >> when suspending a process. It is an interesting rabbit hole [1] that leads >> to pondering the Delay machinery, that leads to some VM questions. >> >> When pressing the interrupt key it seems to always opens the debugger >> with the following call stack. >> Semaphore>>critical: 'self wait' >> BlockClosure>>ensure: 'self valueNoContextSwitch' >> Semaphore>>critical: 'ensure: [ caught ifTrue: [self signal]] >> Delay>>schedule 'AccessProtect critical: [' >> Delay>>wait 'self schedule' >> WorldState>>interCyclePause: >> >> > Nicolai Hess wrote: > > Hi Ben, > > I am on Windows too :( > So, the fixes does not work (not always) on winddows too. But at least > they make it less probable to occure, but it still happens. > The most distracting thing is, after the first ui lock, pressing > alt+dot, closing the debuggers, pressing alt+dot .... > and trying to close the very first debugger, after that, it all works. The > UI is responsive again and suspending the process does > not block the ui anymore. > It "looks like" supsending the process reactivates another process that > blocks the UI. And as soon as I terminate this > process (alt+dot, close debugger ...) all works. > But I really don't know. > > Nicolai > > > Hi Nicolai, I have something for you to try. > > With the problem seeming to be stuck in the semaphore critical section, I > read around on mutexes and semaphores and found [1] enlightening, which > says: "The correct use of a semaphore is for signaling from one task to > another. A mutex is meant to be taken and released, always in that order, > by each task that uses the shared resource it protects. By contrast, tasks > that use semaphores either signal or wait—not both. [...] Any two RTOS > tasks that operate at **different priorities** and coordinate via a mutex, > create the opportunity for priority inversion [...while...] semaphores > [...] do not cause priority inversion when used for signaling." [1] > > To summarize with an example, here's how to use a mutex: > /* Task 1 */ > mutexWait(mutex_mens_room); > // Safely use shared resource > mutexRelease(mutex_mens_room); > > /* Task 2 */ > mutexWait(mutex_mens_room); > // Safely use shared resource > mutexRelease(mutex_mens_room); > > By contrast, you should always use a semaphore like this: > /* Task 1 - Producer */ > semPost(sem_power_btn); // Send the signal > > /* Task 2 - Consumer */ > semPend(sem_power_btn); // Wait for signal > > I am out of my depth here, but in light [1] the way the current code seems > to mix semaphore/mutex concepts with Semaphore>>forMutalExclusion and > Semaphore>>critical: gave me some concern. Anyway, the delay scheduling > code seemed more like a producer/consumer signaling problem than a shared > resource mutal exclusion problem, so I rewrote it like the former and fixes > the problem without causing any *obvious* problems. However I lack > experience here and it could definitely use some review. Can you take the > attached zip file and... > > 1. File in "Delay class-new delay code part 1.st" > Defines #enableNewDelayCode and #enableNewDelayCode: > Manually define the referenced class variables > > 2. File in "Delay class-new delay code part 2.st" > Modifies #handleTimerEvent, guarded by enableNewDelayCode > > 3. File in "Delay-new delay code part 3.st" > Modifies #schedule and #unschedule, guarded by enableNewDelayCode > > 4. File in "BackgroundWorkDisplayMorph.st" > > 5. Save checkpoint Image to return to between trials. > > 6. Open System Browser on BackgroundWorkDisplayMorph>>initialize > > Trial 1. > BackgroundWorkDisplayMorph new openInWorld. > > Trial 2. > Delay enableNewDelayCode: true. > BackgroundWorkDisplayMorph new openInWorld. > > If if its deemed worthwhile, I'll open a ticket to track this more > formally. One thing needed is to considered performance impact, I haven't > so far. > > cheers -ben > > [1] http://www.barrgroup.com/Embedded-Systems/How-To/RTOS-Mutex-Semaphore > [2] http://nerdfortress.com/2011/02/18/you-say-semaphore-i-say-mutex/ > > > I notice... >> Delay class >> initialize >> TimingSemaphore := (Smalltalk specialObjectsArray at: 30). >> and... >> Delay class >> startTimerEventLoop >> TimingSemaphore := Semaphore new. >> which seems incongruous that TimingSemaphore is set in differently. So >> while I presume this critical stuff all works fine, just in an exotic way, >> my entropy-guarding-neuron would just like confirm this is so. >> >> -------------- >> >> In Delay class >> handleTimerEvent the comment says... >> "Handle a timer event.... >> -a timer signal (not explicitly specified)" >> ...is that event perhaps a 'tick' generated periodically by the VM via >> that item from specialObjectArray ? Or is there some other mechanism ? >> >> -------------- >> >> [1] http://www.urbandictionary.com/define.php?term=Rabbit+Hole >> cheers -ben >> >> >> P.S. I've left the following for some initial context as I change the >> subject. btw Nicolai, I confirm that my proposed fixes only work on >> Windows, not Mavericks (and I haven't checked Linux). >> >> Nicolai Hess wrote: >> >> >> Hi ben, thank you for looking at this. >> >> 2014-07-22 20:17 GMT+02:00 <[email protected]>: >> >>> I thought this might be interesting to learn, so I've gave it a go. I >>> had some success at the end, but I'll give a progressive report. >>> >>> First I thought I'd try moving the update of StringMorph outside the >>> worker-process using a Morph's #step method as follows... >>> >>> Morph subclass: #BackgroundWorkDisplayMorph >>> instanceVariableNames: 'interProcessString stringMorph' >>> classVariableNames: '' >>> category: 'BenPlay' >>> "---------" >>> >>> BackgroundWorkDisplayMorph>>initializeMorph >>> self color: Color red. >>> stringMorph := StringMorph new. >>> self addMorphBack: stringMorph. >>> self extent:(300@50). >>> "---------" >>> >>> BackgroundWorkDisplayMorph>>newWorkerProcess >>> ^[ >>> | work | >>> work := 0. >>> [ 20 milliSeconds asDelay wait. >>> work := work + 1. >>> interProcessString := work asString. >>> ] repeat. >>> ] newProcess. >>> "---------" >>> >>> BackgroundWorkDisplayMorph>>step >>> stringMorph contents: interProcessString. >>> "---------" >>> >>> BackgroundWorkDisplayMorph>>stepTime >>> ^50 >>> "---------" >>> >>> BackgroundWorkDisplayMorph>>initialize >>> | workerProcess running | >>> super initialize. >>> self initializeMorph. >>> >>> workerProcess := self newWorkerProcess. >>> running := false. >>> >>> self on: #mouseUp send: #value to: >>> [ (running := running not) >>> ifTrue: [ workerProcess resume. self color: Color green. ] >>> ifFalse: [ workerProcess suspend. self color: Color red. ] >>> ] >>> "---------" >>> >>> >>> >>> But evaluating "BackgroundWorkDisplayMorph new openInWorld" found this >>> exhibited the same problematic behavior you reported... Clicking on the >>> morph worked a few times and then froze the UI until Cmd-. pressed a few >>> times. >>> >> >>> However I found the following never locked the GUI. >>> >>> BackgroundWorkDisplayMorph>>initialize >>> "BackgroundWorkDisplayMorph new openInWorld" >>> | workerProcess running | >>> super initialize. >>> self initializeMorph. >>> >>> workerProcess := self newWorkerProcess. >>> running := false. >>> >>> [ [ (running := running not) >>> ifTrue: [ workerProcess resume. self color: Color green ] >>> ifFalse: [ workerProcess suspend. self color: Color red ]. >>> 10 milliSeconds asDelay wait. >>> ] repeat ] fork. >>> "---------" >>> >>> >> This locks the UI as well. Not every timet hough. I did this 5 times, >> every time in a freshly loaded image and it happens two times. >> >> >> >>> So the problem seemed to not be with #suspend/#resume or with the shared >>> variable /interProcessString/. Indeed, since in the worker thread >>> /interProcessString/ is atomically assigned a copy via #asString, and the >>> String never updated, I think there is no need to surround use of it with a >>> critical section. >>> >>> The solution then was to move the "#resume/#suspend" away from the "#on: >>> #mouseUp send: #value to:" as follows... >>> >>> BackgroundWorkDisplayMorph>>initialize >>> "BackgroundWorkDisplayMorph new openInWorld" >>> | workerProcess running lastRunning | >>> super initialize. >>> self initializeMorph. >>> >>> workerProcess := self newWorkerProcess. >>> lastRunning := running := false. >>> >>> [ [ lastRunning = running ifFalse: >>> [ running >>> ifTrue: [ workerProcess resume ] >>> ifFalse: [ workerProcess suspend ]. >>> lastRunning := running. >>> ]. >>> 10 milliSeconds asDelay wait. >>> ] repeat ] fork. >>> >>> self on: #mouseUp send: #value to: >>> [ (running := running not) >>> ifTrue: [ self color: Color green. ] >>> ifFalse: [ self color: Color red. ] >>> ] >>> "---------" >>> >> >> And this too :( >> >> >> >>> >>> And finally remove the busy loop. >>> >>> BackgroundWorkDisplayMorph>>initialize >>> "BackgroundWorkDisplayMorph new openInWorld" >>> | workerProcess running lastRunning semaphore | >>> super initialize. >>> self initializeMorph. >>> >>> workerProcess := self newWorkerProcess. >>> lastRunning := running := false. >>> semaphore := Semaphore new. >>> >>> [ [ semaphore wait. >>> running >>> ifTrue: [ workerProcess resume ] >>> ifFalse: [ workerProcess suspend ]. >>> ] repeat ] fork. >>> >>> self on: #mouseUp send: #value to: >>> [ (running := running not) >>> ifTrue: [ self color: Color green. ] >>> ifFalse: [ self color: Color red. ]. >>> semaphore signal. >>> ] >>> "---------" >>> >>> >> >> And this locks the UI too. (Loaded the code 20 times, every time after >> a fresh image start up. Two times I got a locked >> ui after the first two clicks). >> And I don't understand this code :) >> >> >> >>> Now I can't say how close that is to how it "should" be done. Its the >>> first time I used sempahores and just what I discovered hacking around. >>> But hey! it works :) >>> >>> cheers -ben >>> >>> >>> >>> Nicolai Hess wrote: >>> >>> I am still struggling with it. >>> >>> Any ideas? >>> >>> >>> 2014-07-09 11:19 GMT+02:00 Nicolai Hess <[email protected]>: >>> >>>> >>>> >>>> >>>> 2014-07-09 2:07 GMT+02:00 Eliot Miranda <[email protected]>: >>>> >>>> Hi Nicolai, >>>>> >>>>> >>>>> On Tue, Jul 8, 2014 at 7:19 AM, Nicolai Hess <[email protected]> >>>>> wrote: >>>>> >>>>>> I want to create a process doing some work and call #changed on a >>>>>> Morph. >>>>>> I want to start/suspend/resume or stop this process. >>>>>> But sometimes, suspending the process locks the UI-Process, >>>>>> and I don't know why. Did I miss something or do I have to care when >>>>>> to call suspend? >>>>>> >>>>>> Wrapping the "morph changed" call in >>>>>> UIManager default defer:[ morph changed]. >>>>>> Does not change anything. >>>>>> >>>>>> Here is an example to reproduce it. >>>>>> Create the process, >>>>>> call resume, call supsend. It works, most of the time, >>>>>> but sometimes, calling suspend locks the ui. >>>>>> >>>>>> p:=[[true] whileTrue:[ Transcript crShow: (DateAndTime now asString). >>>>>> 30 milliSeconds asDelay wait]] newProcess. >>>>>> >>>>> p resume. >>>>>> p suspend. >>>>>> >>>>> >>>>> If you simply suspend this process at random form a user-priority >>>>> process you'll never be able to damage the Delay machinery you're using, >>>>> but chances are you'll suspend the process inside the critical section >>>>> that >>>>> Transcript uses to make itself thread-safe, and that'll lock up the >>>>> Transcript. >>>>> >>>> >>>> Thank you Eliot >>>> yes I guessed it locks up the critical section, but I hoped with >>>> would not happen if I the use UIManager defer call. >>>> >>>> >>>> >>>>> >>>>> ThreadSafeTranscript>>nextPutAll: value >>>>> accessSemaphore >>>>> critical: [stream nextPutAll: value]. >>>>> ^value >>>>> >>>>> So instead you need to use a semaphore. e.g. >>>>> >>>>> | p s wait | >>>>> s := Semaphore new. >>>>> p:=[[true] whileTrue:[wait ifTrue: [s wait]. Transcript crShow: >>>>> (DateAndTime now asString). 30 milliSeconds asDelay wait]] newProcess. >>>>> wait := true. >>>>> 30 milliSeconds asDelay wait. >>>>> wait := false. >>>>> s signal >>>>> >>>>> etc... >>>>> >>>> >>>> Is this a common pattern I can find in pharos classes. Or I need some >>>> help understanding this. The semaphore >>>> wait/signal is used instead of process resume/suspend? >>>> >>>> What I want is a process doing repeatly some computation, >>>> calls or triggers an update on a morph, and I want to suspend and >>>> resume this process. >>>> >>>> I would stop this discussion if someone tells me, "No your are doing it >>>> wrong, go this way ..", BUT what strikes me: >>>> in this example, that reproduces my problem more closely: >>>> >>>> |p m s running| >>>> running:=false. >>>> m:=Morph new color:Color red. >>>> s:= StringMorph new. >>>> m addMorphBack:s. >>>> p:=[[true]whileTrue:[20 milliSeconds asDelay wait. s >>>> contents:(DateAndTime now asString). m changed]] newProcess. >>>> m on:#mouseUp send:#value to:[ >>>> running ifTrue:[p suspend. m color:Color red.] >>>> ifFalse:[p resume.m color:Color green.]. >>>> running := running not]. >>>> m extent:(300@50). >>>> m openInWorld >>>> >>>> >>>> clicking on the morph will stop or resume the process, if it locks up >>>> I can still press alt+dot -> >>>> - a Debugger opens but the UI is still not responsive. I can click >>>> with the mouse on the debuggers close icon. >>>> - nothing happens, as the UI is still blocked. >>>> - pressing alt+Dot again, the mouse click on the close icon is >>>> processed and the first debugger window closes >>>> - maybe other debuggers open. >>>> >>>> Repeating this steps, at some time the system is *fully* responsive >>>> again! >>>> And miraculously, it works after that without further blockages. >>>> What's happening here? >>>> >>>> >>>> Nicolai >>>> >>>> >>>> >>>>> >>>>> HTH >>>>> >>>>> regards >>>>>> Nicolai >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> best, >>>>> Eliot >>>>> >>>> >>>> >>> >>> >> >> >> > >
Interesting, it works. So, it is the Delay>>schedule implementation? Because it uses a #critical block for the wait instead of wait/signal only? Nicolai
