Hi all,

This is my first post here. I just joined this group, so let me do a
quick introduction.
I have 10+ years of experience doing full-time development with  
VisualWorks (creating trading platforms for the European energy  
exchanges).
I also have near-zero experience with Squeak and with how its
development process works. I'll slowly cut my teeth, starting by
interacting on this mailing list.


Multithreading is something I have spent a lot of quality time with,
so I want to share some thoughts on the following:

|semaphores tr|
semaphores := Array new: 10.
tr := ThreadSafeTranscript new.
tr open.
1 to: 10 do: [ :index | semaphores at: index put: Semaphore new ].

     1 to: 10 do: [:i |
         [
         tr nextPutAll: i printString, ' fork '; cr.
         (semaphores at: i) signal.
         ] fork
     ].

     semaphores do: [:each | each wait ].
     tr show: 'all forks processed'; cr.


I have seen this pattern often (allocating a semaphore for every
forked process), and I usually interpret it as a signal that such code
is still in its first 'make it work/make it right' stages.
What a lot of people don't realize is that at its heart a semaphore is
a thread-safe counter/register (and if you look at the hierarchy it is
implemented in you wouldn't guess that either, since the hierarchy
stresses the implementation part that manages waiting processes rather
than the counter aspect).
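
To see the counter aspect in isolation, here is a minimal workspace
sketch. It assumes a stock Squeak or Pharo image and uses only
Semaphore new, #signal and #wait; the variable name is just for
illustration.

```smalltalk
| sem |
sem := Semaphore new.          "starts out with zero excess signals"
3 timesRepeat: [sem signal].   "nobody is waiting, so the count goes up to 3"
3 timesRepeat: [sem wait].     "each wait consumes one signal and returns at once"
"A fourth #wait at this point would block, because the count is back at zero."
```

Signals sent while no process is waiting are not lost; they are
counted, and each later #wait consumes one of them.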

So trying to take the code snippet toward 'make it abstract' territory  
this could be refactored to lean more on the counter aspect of  
semaphores and use only a single semaphore:


|count sem tr|
tr := ThreadSafeTranscript new.
tr open.
count := 10.
sem := Semaphore new.

1 to: count do: [:i |
        [       tr nextPutAll: (i printString, ' fork\') withCRs.
                sem signal.
        ] fork].

count timesRepeat: [sem wait].
tr show: 'all forks processed'; cr.



Now the above is about as far as you can go with the current Squeak
and VisualWorks implementations, so you can take it as simple
refactoring advice.




However I want to press on a bit more (and go a bit off-topic for this  
list ;-) because I feel it still has a big problem: we need to  
maintain a 'count' and pass that between the two loops in the above  
example.
In the current example this is not much of a problem but in more  
complex applications where the forking is done by yet other forked  
processes we will need to make 'count' thread-safe as well -- I find  
this very ugly, because you will need an extra semaphore just to make  
the original semaphore work as required.
Furthermore you cannot add new forked processes once the second loop  
has started running.
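
The extra semaphore described above could look roughly like the
following sketch. The names countLock and done are made up for
illustration, and the job body is left as a placeholder; it assumes a
stock image where #critical: answers the value of its block.

```smalltalk
| count countLock done |
count := 0.
countLock := Semaphore forMutualExclusion.   "the extra semaphore, guarding 'count'"
done := Semaphore new.

1 to: 10 do: [:i |
        countLock critical: [count := count + 1].   "register the job before forking it"
        [       "... the job's work goes here ..."
                done signal.
        ] fork].

"Take a snapshot of the count under the lock, then wait that many times."
(countLock critical: [count]) timesRepeat: [done wait].
```

Note that the snapshot freezes the number of jobs: anything registered
after the last line has started is not waited for, which is the second
limitation mentioned above.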

So here is an experiment I did a couple of years ago with VisualWorks:  
I altered the VM (just one line of its source ;-) so it would react  
properly to semaphores that have negative values in the  
'excessSignals' instance variable, and I added a method #unsignal to  
Semaphore that would decrease the value of that ivar.
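
The image-side half of that experiment might be sketched as below.
This is a guess at the shape of the method, not the original
VisualWorks code: the category name is arbitrary, the method makes no
attempt to be atomic, and the VM change is not shown. The demonstration
deliberately keeps the counter at zero or above, because a stock VM is
not expected to handle negative values.

```smalltalk
"Hypothetical sketch: install an #unsignal method on Semaphore."
Semaphore
        compile: 'unsignal
        "Take back one signal by decrementing the counter."
        excessSignals := excessSignals - 1'
        classified: 'communication'.

| sem |
sem := Semaphore new.
sem signal; signal.    "counter is now 2"
sem unsignal.          "counter is back to 1"
sem wait.              "consumes the remaining signal without blocking"
```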

In my experiments that yielded many opportunities to simplify  
multiprocessing code (not only for thread synchronization but also for  
passing around counts in a thread-safe register!).

In the single-semaphore code above, that would allow us to 'pre-load'
the semaphore at the place where the threads are created, with the
result that the 'count' variable can be removed and the bottom loop can
be removed too:


|sem tr|
tr := ThreadSafeTranscript new.
tr open.
sem := Semaphore forMutualExclusion. "We need one excessSignal to  
balance the #wait below"

1 to: 10 do: [:i |
        sem unsignal. "outside the forked code"
        [       tr nextPutAll: (i printString, ' fork\') withCRs.
                sem signal. "balance the unsignal"
        ] fork].

sem wait. "no loop, no need to know the count!"
tr show: 'all forks processed'; cr.





The above was just a simple refactoring, but here is how I actually needed it:

|sem tr|
tr := ThreadSafeTranscript new.
tr open.
sem := Semaphore new. "no excessSignal this time"

"set up a monitoring system first(!)"
[       sem wait.
        tr show: 'all forks processed'; cr
] fork.

"then create jobs (in my case I had only a single first job that would  
recursively create more jobs, not shown here)"

1 to: 10 do: [:i |
        sem unsignal.
        [       tr nextPutAll: (i printString, ' fork\') withCRs.
                sem signal.
        ] fork].
"Now that we are sure at least one job is entered balance the #wait we  
started out with"
sem signal.


Since we elided 'count', I can move the code that relied on it up in
front of the thread creation code. I very much like this flavor of
decoupling.


I guess this illustrates that Semaphore has been stuck in the
'make it work/make it right' phase for thirty years now, and that
moving it into 'make it abstract' territory will make lots of hairy
multi-threading code much simpler to express...

(And for those thinking this through: yes, I did implement a
thread-safe #add: and #valueWithReset on Semaphore too ;-)


I hope I didn't bore y'all and stray too far off-topic, but I did want
to share this bit of insight I gained by tinkering with the semaphore
implementation: semaphores are thread-safe counters at their heart.



Cheers,

Reinout
-------

PS: big congrats on the license cleaning milestone, this is what
finally pulled me into this project :-)


_______________________________________________
Pharo-project mailing list
[email protected]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
