Hi,

The threading model topic still needs lots of thinking, so I decided to
try out some ideas.

Every concurrency model has its advantages and drawbacks. I've been
wondering about these ideas for a while now, and I think I finally have
a sketch. My primary concerns were:

 1 - It can't require locking: Locking is just not scalable;
 2 - It should perform better with lots of cores even if it suffers
     when you have only a few;
 3 - It shouldn't require complicated memory management techniques that 
     will make it difficult to bind native libraries (yes, STM is damn 
     hard);
 4 - It should support implicit threading and implicit event-based
     programming (e.g. the feed operator);
 5 - It must be easier to use than Perl 5 shared variables;
 6 - It can't use a Global Interpreter Lock (that's already implied by
     1, but since a GIL is a widely accepted idea in some other
     environments, I thought it would be better to make it explicit).

The idea I started from was that every object has an "owner" thread,
and only that thread should talk to it. I ended up with the following;
comments are appreciated:

0 - The idea is similar to Erlang and the "Io" language. In addition
    to OS threads, there are interpreter processes.

1 - No memory is shared between processes, so no locking is
    necessary.

2 - The interpreter implements a scheduler, just like POE.

3 - The scheduler, unlike POE, should be able to schedule across
    several OS threads, such that any OS thread may wake any
    waiting process.

4 - Each process runs in only one OS thread at a time; it's like a
    Global Interpreter Lock, but scoped to one specific process.

5 - A process may block, and the scheduler must become aware of
    that blocking. That is implemented through Control
    Exceptions.
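
To make 2-5 a bit more concrete, here is a very rough sketch in Python
(not Perl 6, and not the actual implementation, just the shape of the
idea): the "processes" are plain generators multiplexed over a couple
of OS threads, and a process that would block yields a marker where
the real thing would throw the control exception.

    import queue, threading

    BLOCKED = object()           # stands in for the blocking control exception
    runnable = queue.Queue()     # interpreter processes that are ready to run

    def worker():
        # One OS thread.  It keeps picking up whatever process is
        # runnable, so any OS thread may wake any waiting process (3),
        # but a given process is only ever inside one worker at a time (4).
        while True:
            try:
                proc = runnable.get(timeout=0.2)
            except queue.Empty:
                return               # nothing left to run, this thread exits
            try:
                step = next(proc)    # run the process until it yields
            except StopIteration:
                continue             # process finished
            if step is not BLOCKED:
                runnable.put(proc)   # still runnable, any worker may resume it
            # if BLOCKED, it stays parked until something re-queues it (see 11)

    def demo(name):
        for i in range(3):
            print(name, "step", i, "on", threading.current_thread().name)
            yield                    # hand control back to the scheduler

    for n in ("p1", "p2", "p3"):
        runnable.put(demo(n))
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()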

6 - In order to implement inter-process communication, there are:

    6.1 - A "MessageQueue" works just like a Unix pipe; it looks
          like a slurpy array. It has a configurable buffer size,
          and processes might block when trying to read from
          and/or write to it.

    6.2 - A "RemoteInvocation" is an object that has an identifier, a
          capture (which might, optionally, point to a "MessageQueue"
          as input) and another "MessageQueue" to be used as output.

    6.3 - An "InvocationQueue" is a special type
          of "MessageQueue" that accepts "RemoteInvocation"
          objects.
          
    6.4 - A "RemoteValue" is an object that proxies requests to
          another process through a "RemoteInvocation".
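
Here is how 6.1-6.4 could fit together, again as a rough Python sketch
(the class names are just the ones above, not a real API):

    import itertools, queue

    class MessageQueue:
        # pipe-like and bounded: writers block when it is full,
        # readers block when it is empty (6.1)
        def __init__(self, size=8):
            self._q = queue.Queue(maxsize=size)
        def write(self, item): self._q.put(item)      # may block
        def read(self):        return self._q.get()   # may block

    class RemoteInvocation:
        # a call captured as data: an identifier, the capture, and a
        # MessageQueue the result comes back on (6.2)
        _ids = itertools.count()
        def __init__(self, name, args):
            self.id = next(RemoteInvocation._ids)
            self.capture = (name, args)
            self.output = MessageQueue(size=1)

    class InvocationQueue(MessageQueue):
        pass   # only RemoteInvocation objects travel on it (6.3)

    class RemoteValue:
        # proxy held by the calling process; the real object lives in
        # the owning process, reachable through its InvocationQueue (6.4)
        def __init__(self, invocations):
            self._invocations = invocations
        def invoke(self, name, *args):
            inv = RemoteInvocation(name, args)
            self._invocations.write(inv)     # ask the owning process
            return inv.output.read()         # block until it answers

    def serve(obj, invocations):
        # the loop the owning process runs: perform each capture
        # against the real object and send the result back
        while True:
            inv = invocations.read()
            name, args = inv.capture
            inv.output.write(getattr(obj, name)(*args))

So a method call on a "RemoteValue" turns into writing a
"RemoteInvocation" to the owner's "InvocationQueue" and blocking on
the invocation's output queue until the owner has answered.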

7 - The process boundary is drawn at each closure: every closure
    belongs to a process, and every value initialized inside a
    closure belongs to that closure. You might read "coroutine"
    instead of "closure" if you like.

8 - A value might have its ownership transferred to another closure if
    it can be detected that the value is in use only for that
    invocation or return value, in order to reduce the number of
    "RemoteInvocation"s.

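As a toy illustration of 8 (the "is it still in use?" test would of
course be done by the compiler/runtime, here it is just a flag, and
make_proxy stands in for building a "RemoteValue"):

    class Value:
        def __init__(self, payload, owner):
            self.payload = payload
            self.owner = owner        # the closure/process it belongs to (7)

    def pass_to(value, callee, make_proxy, still_used_by_caller):
        if not still_used_by_caller:
            value.owner = callee      # ownership transfer: no further
            return value              # "RemoteInvocation"s needed (8)
        return make_proxy(value)      # otherwise the callee gets a proxy
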
9 - A value might do a special "ThreadSafe" role if it is thread-safe
    (such as implementing bindings to thread-safe native libraries),
    in which case it is sent as-is to a different thread.

10 - A value might do a special "ThreadCloneable" role if it should
     be cloned instead of being proxied through a "RemoteValue"
     when sent to a different process.
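
In sketch form, 9 and 10 only change what actually crosses the
boundary ("ThreadSafe" and "ThreadCloneable" are plain marker classes
here, not the real roles, and make_proxy again stands in for building
a "RemoteValue"):

    import copy

    class ThreadSafe: pass        # role: may cross the boundary as-is (9)
    class ThreadCloneable: pass   # role: deep-copied for the receiver (10)

    def send_value(value, make_proxy):
        if isinstance(value, ThreadSafe):
            return value                  # e.g. a binding to a thread-safe C lib
        if isinstance(value, ThreadCloneable):
            return copy.deepcopy(value)   # the receiver gets its own copy
        return make_proxy(value)          # default: a RemoteValue to the owner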

11 - The "MessageQueue" notifies the scheduler through a Control
     Exception whenever new data is available in that queue, so
     the target process can be woken.

12 - Exception handling gets a bit hairy, since exceptions might only
     be raised at the calling scope when the value is consumed.

13 - List assignment and Sink context might result in synchronized
     behavior.
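
For 12, what I have in mind is roughly this: the result of a remote
call carries either the value or the exception that happened in the
owning process, and the exception only surfaces when the caller
actually consumes the value. That is also why I expect list assignment
and sink context (13) to synchronize: they force the value to be
consumed right there. A sketch:

    class DeferredResult:
        # what a RemoteInvocation hands back: either a value or the
        # exception raised in the owning process (12)
        def __init__(self, value=None, error=None):
            self._value, self._error = value, error
        def get(self):                  # called when the value is consumed
            if self._error is not None:
                raise self._error       # re-raised in the *calling* scope
            return self._value

    def run_remotely(code):
        # the owning process catches the exception instead of dying
        try:
            return DeferredResult(value=code())
        except Exception as e:
            return DeferredResult(error=e)

    r = run_remotely(lambda: 1 // 0)    # nothing blows up here...
    # r.get()                           # ...only when the value is consumed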

comments? ideas?

daniel
