On Mon, Mar 23, 2009 at 12:18 AM, Igor Stasenko <[email protected]> wrote:

> 2009/3/22 Igor Stasenko <[email protected]>:
> > 2009/3/22 Michael van der Gulik <[email protected]>:
> >>
> >>
> >> On Sat, Mar 21, 2009 at 8:38 AM, Janko Mivšek <[email protected]>
> >> wrote:
> >>>
> >>> Philippe Marschall pravi:
> >>> >> Michael van der Gulik wrote:
> >>>
> >>> >> So now it seems that Gemstone is the only multi-core capable
> >>> >> Smalltalk VM :-(.
> >>>
> >>> > AFAIK Gemstone isn't multi-core capable either. You can just run
> >>> > multiple gems and they share the same persistent memory. Which is
> >>> > similar but different.
> >>>
> >>> Well, Gemstone can for sure be considered multi-core capable. Every
> >>> gem runs in its own process and can therefore run on its own CPU core.
> >>> All gems then share a Shared Memory Cache. So, a typical multi-core
> >>> scenario.
> >>>
> >> By multi-core, I mean that the following code would spread CPU usage
> >> over at least two cores of a CPU or computer for a while:
> >>
> >> | sum1 sum2 |
> >>
> >> sum1 := 0. sum2 := 0.
> >>
> >> [ 1 to: 10000000 do: [ :i | sum1 := sum1 + 1 ] ] fork.
> >>
> >> [ 1 to: 10000000 do: [ :i | sum2 := sum2 + 1 ] ] fork.
> >>
> >> (I didn't try the above so there might be obvious bugs)
> >>
> >> If a VM can't distribute the load for the above over two or more CPU
> >> cores, I consider its multi-core capabilities a hack. No offense
> >> intended to the Hydra VM.
> >>
> >
> > Michael, that would be too ideal to be true, especially for Smalltalk.
> >
> > Consider the following:
> >
> > | array sum1 sum2 |
> >
> > sum1 := 0. sum2 := 0.
> > array := Array new: 10.
> >
> > [ 1 to: 10000000 do: [ :i | array at: 10 atRandom put: (Array new: 10) ] ] fork.
> > [ 1 to: 10000000 do: [ :i | array at: 10 atRandom put: (Array new: 10) ] ] fork.
> > 1 to: 10000000 do: [ :i | array at: 10 atRandom put: (Array new: 10) ].
> >
> > This code reveals the following problems:
> > - concurrent access to the same object
>


The problem here is that the code above wasn't written with concurrency in
mind. I'd look at what I'm trying to achieve, and then try to keep each
thread's data structures as disentangled as possible.

If, for example, you were trying to allocate an array of arrays
concurrently, then I would write (using my namespaces :-) ):

array := Concurrent.Collections.Array new: 1000000000. " We ignore the cost
of making this array to begin with :-) "
array withIndexDo: [ :each :i |
    " each will always be nil. "
    array at: i put: (Array new: 10) ].

Concurrent.Collections.Array>>withIndexDo: aBlock
    | nThreads chunkSize done |
    nThreads := 1000. " Pull this from a system setting or something. "
    chunkSize := self size // nThreads.
    done := Semaphore new. " Each worker signals this when its slice is done. "
    1 to: nThreads do: [ :i |
        [ | first last |
          first := (i - 1) * chunkSize + 1.
          last := i = nThreads ifTrue: [ self size ] ifFalse: [ i * chunkSize ].
          first to: last do: [ :j | aBlock value: (self at: j) value: j ].
          done signal ] fork ].
    nThreads timesRepeat: [ done wait ]. " Wait here until every worker has finished. "
    ^ self

There are probably bugs; I haven't tried running this. No synchronisation
should be needed between the workers in this example, because each one writes
to a disjoint range of indices, which keeps it fast. Currently, this wouldn't
work on the Squeak VM because block contexts aren't, er... "reentrant".
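
For what it's worth, the fork-and-wait part can already be done with the
standard Semaphore protocol; here's a rough, untested workspace sketch (the
worker count is just a placeholder):

| done nWorkers |
nWorkers := 4.
done := Semaphore new.
1 to: nWorkers do: [ :i |
    [ " ... this worker's share of the work goes here ... "
      done signal ] fork ].
nWorkers timesRepeat: [ done wait ]. " Resumes only once every worker has signalled. "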


> > - heavy memory allocation while running 3 processes, which at some
> > point will trigger a GC.
> > While the first is more or less in the hands of the developer (write
> > proper code to avoid such things), the second is a problem that you need
> > to solve to be able to collect garbage in real time when there are
> > multiple threads producing it.
>


Then... invent a better garbage collector. This might be a difficult problem
to solve, but it isn't impossible.

http://gulik.pbwiki.com/Block-based+virtual+machine (ideas only)



>
> But what strikes me is that there is a lot of code which never cares
> about this; for instance, see
> Symbol class>>intern:
> In some magical fashion it works without problems under green threading.
> I'm not sure it will keep working once you enable multiple native
> threads.



I plan to find and fix these at some stage. First, I would change the
scheduler to get rid of its predictable yielding behaviour, which would
expose these bugs. Doing so is required for the implementation of
"Dominions" in my SecureSqueak project anyway, and I'll try to feed my
changes back into the community.
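
To give an idea of the kind of fix I have in mind for code like Symbol
class>>intern:, the pattern is just to serialise access to the shared table.
This is only a sketch; SymbolTableLock, lookupString:ifAbsent: and
addNewSymbol: are placeholders I've made up, not the real Squeak methods:

Symbol class>>intern: aString
    " Run the whole look-up-or-create step inside one critical section so
      two native threads can't both add the same (or a corrupted) entry. "
    ^ SymbolTableLock critical: [
        self lookupString: aString
            ifAbsent: [ self addNewSymbol: aString ] ]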

I also plan to make the Squeak equivalent of java.util.concurrent at some
stage.
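
To give a flavour of what I mean, even something as small as an atomic
counter would belong there. A rough sketch (class name and protocol invented
here, and assuming #new calls #initialize as it does in Pharo):

Object subclass: #AtomicCounter
    instanceVariableNames: 'value lock'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Concurrent-Collections'.

AtomicCounter>>initialize
    value := 0.
    lock := Semaphore forMutualExclusion.

AtomicCounter>>increment
    " Atomically bump the counter and answer the new value. "
    ^ lock critical: [ value := value + 1 ]

AtomicCounter>>value
    " Answer the current count, reading it under the same lock. "
    ^ lock critical: [ value ]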



>
>
> There is another problem: Squeak processes are cheap (a few bytes in
> object memory), while allocating a new native thread consumes a
> considerable amount of memory & address space. So, if you map
> Processes to native threads, you lose the ability to have
> millions of them; instead you will be limited to thousands.
>


Anthony already answered this one, and I agree with his answer.

Gulik.


-- 
http://gulik.pbwiki.com/