On 7/13/10 1:08 PM, Andrew Whitworth wrote:
> I've been talking to Chandon a lot about his GSoC project, and one
> problem that he's going to start running into is the issue of sharing
> pointers (specifically PMCs) across threads, and the mechanisms
> necessary to lock and protect them. Unfortunately, implementing a
> whole system of synchronization or locking primitives is probably
> outside the scope of his GSoC project, so we can't rely on something
> like that being designed and implemented by the end of his project
> work in August.

> My question to the larger Parrot community is this: Assuming that in a
> month we have a more-or-less working and usable threading
> implementation in Parrot, how do we want to handle sharing of data?

As safely, cleanly, and with as little complexity as possible.

> Do we want a global interpreter lock, like Python uses?

The GIL is the single biggest problem with CPython today, and if they could get rid of it without breaking a ton of existing code, they would. So, definitely not that.

> A new implementation of STM?

STM isn't a silver bullet either. Even the greatest advocates of STM (e.g. Simon Peyton Jones) will tell you that STM doesn't actually resolve the problem of deadlocks. (And since deadlocks are one of the central problems STM was trying to eliminate, that's saying a lot.) There also tends to be a high cost associated with re-executing the transaction code repeatedly until it succeeds.
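
The re-execution cost can be made concrete with a minimal sketch. This is not a real STM and not any Parrot API: it is an optimistic retry loop over a single versioned cell, and the names `VersionedCell`, `atomically`, and `run_demo` are hypothetical, chosen only for illustration. The point it shows is that the update function may run more often than the number of updates that actually commit.

```python
import threading


class VersionedCell:
    """Hypothetical shared cell with a version counter (illustration only)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # guards only the short read/commit steps

    def read(self):
        # Return a consistent (value, version) snapshot.
        with self._lock:
            return self.value, self.version

    def try_commit(self, new_value, expected_version):
        # Commit only if nobody else committed since our snapshot.
        with self._lock:
            if self.version != expected_version:
                return False
            self.value = new_value
            self.version += 1
            return True


def atomically(cell, update):
    """Re-run `update` until its result commits; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        old, version = cell.read()
        new = update(old)  # under contention this work may be thrown away
        if cell.try_commit(new, version):
            return attempts


def run_demo(n_threads=4, increments=1000):
    cell = VersionedCell(0)
    attempts = [0] * n_threads  # each thread writes only its own slot

    def worker(slot):
        for _ in range(increments):
            attempts[slot] += atomically(cell, lambda v: v + 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value, sum(attempts)
```

Calling `run_demo()` returns the final value together with the total number of attempts. The final value always equals the number of committed updates, while the attempt count can only be equal or higher; the difference is the wasted re-execution.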

> COW clones for shared PMCs?

The case where two threads have independent copies of a variable really isn't a problem. (COW might save a bit of memory, but you might as well just make a copy.) The difficult and costly part of truly shared variables is when two threads are making changes to a variable and need to see each other's changes.

> A library of
> locking primitives (probably with some version of a limited GIL to
> protect interpreter-global data)?

There are several things you could mean by this, but I'm guessing you mean a set of thread-safe PMCs, that can be shared across threads. It's a possibility.

> Or, do we want to maybe start planning for a new architecture entirely
> and use a message-passing system like Erlang?

That is the direction the most recent round of refactors and the current PDD are heading, and the direction we'll continue heading through Lorito.

> This could be
> interesting, but does nothing to make threading usable in Parrot in
> the next few months.

It's a bit strong to say "unusable".

Make a list of the features Chandon needs (not a wishlist, but the absolute essentials for his GSoC project), and we'll make sure they work. Though, does he really even need shared variables for his project? The idea was to prototype a new style of threading; the GSoC project doesn't require him to drop it in as a wholesale replacement for the current implementation.



I've been thinking a lot about concurrency lately for Lorito. In the longer-term:

We don't have a single stock answer of "X concurrency model will rule the world" because no one does. It would be a really, really bad idea to sell our souls to any one concurrency model available today, because they're all broken in some way. The Parroty way is to provide the building blocks for multiple different approaches, without dictating one.

The wider world has thrown up its hands at concurrency and gone to the cloud. Cloud architectures are essentially the extreme case of the Erlang concurrency model: independent units of code where you don't have to think about parallelism, combined with message passing remotely over the network rather than within a single process on a single machine. The feature that would give us an advantage in this model is a lightweight interpreter with rapid startup time (hey presto, Lorito).

Message passing internally is the other most important model to support, and should be easy to integrate with message-passing externally.

The other two models are threads with and without shared variables. Threads with no shared variables are easy. Threads with shared variables (that aren't message-passing) are more complex in that there are multiple ways to do it, and we need to decide which ways to support, which ways to allow (i.e. you can use it, but not together with some other sets of features or some other concurrency models), which we can emulate, and whether there are any models we want to explicitly disallow.

I'd put chromatic's "unshared by default with explicit shared variables" in the "supported" category. We can gain some added safety by only allowing specific thread-safe or locking variable types to be shared, and by segmenting shared-variable memory off from the regular pools. This form of sharing can safely interact with message-passing (which is basically just unshared, plus one shared message channel).
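
As a rough illustration of "unshared by default with explicit shared variables", here is a minimal Python sketch. Nothing in it is a Parrot API: `SharedCounter`, `SHAREABLE`, `spawn`, `worker`, and `run_demo` are hypothetical names, and deep-copying arguments is just one assumed way to get the unshared default. Only types on an explicit allow-list cross the thread boundary by reference, and a single queue plays the role of the shared message channel.

```python
import copy
import queue
import threading


class SharedCounter:
    """Hypothetical locking variable type; the only data type we allow to be shared."""

    def __init__(self):
        self._lock = threading.Lock()
        self._n = 0

    def add(self, k):
        with self._lock:
            self._n += k

    def get(self):
        with self._lock:
            return self._n


# Explicit allow-list: thread-safe types that may be passed by reference.
SHAREABLE = (SharedCounter, queue.Queue)


def spawn(target, *args):
    """Start a thread; every argument is copied unless its type is shareable."""
    safe_args = [a if isinstance(a, SHAREABLE) else copy.deepcopy(a) for a in args]
    t = threading.Thread(target=target, args=safe_args)
    t.start()
    return t


def worker(items, total, channel):
    items.append(0)           # touches this thread's private copy only
    total.add(sum(items))     # explicit, lock-protected sharing
    channel.put(len(items))   # message passing over the one shared channel


def run_demo():
    data = [1, 2, 3]
    total = SharedCounter()
    channel = queue.Queue()
    threads = [spawn(worker, data, total, channel) for _ in range(3)]
    for t in threads:
        t.join()
    lengths = sorted(channel.get() for _ in range(3))
    return data, total.get(), lengths
```

After `run_demo()`, the parent's list is unchanged because each worker mutated its own copy, while the counter and the channel reflect contributions from all three threads. Message passing falls out as the special case where the queue is the only shareable object handed to the workers.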

Python's GIL and Perl 5 ithreads I'd put in the "allowed" category, i.e. we make it possible, but if you're going that way, it's a whole-hog option. Don't expect it to play well with other concurrency models running in the same interpreter at the same time.

Allison
_______________________________________________
http://lists.parrot.org/mailman/listinfo/parrot-dev
