Hi Dan,

On Oct 1, 2007, at 12:26 PM, Dan Creswell wrote:

Brian Goetz has brought up that we may have frog-boiled ourselves into
a bad situation by adopting the model of shared state with locks in
Java. In general the shared state/locks model makes concurrent programs
difficult to reason about, but in particular this approach to
concurrency isn't composable. You can't safely combine different modules
without understanding the details of what they do with locks and how
they will interact.


That's just about the consequences of shared state with concurrent
access, be it using locks, transactional memory, etc.

And I think in general concurrency is difficult to reason about even in message-based, shared-nothing systems, because you still have failure to deal with, including how it might affect message delivery.

Concurrency may be difficult to reason about in general, but some models are more difficult than others. Some models make it impossible to trip in certain ways, just as it is impossible in Java bytecodes to free a pointer to memory twice. As I understand it, Erlang prevents deadlock by only allowing threads to interact via messages. Transactional memory can still produce starved threads that keep retrying and keep getting rolled back, but the app as a whole will make progress because some threads will be getting stuff done. Java's basic model of synchronization is a bit more like a minefield, because you have to understand the whole application, including everything libraries are doing with locks and callbacks and such, to be sure there are no potential deadlocks. And that's really hard to do by analysis, and it is hard to detect problems via testing because they can happen quite rarely.
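
To make the composability point concrete, here is a minimal sketch of the classic lock-ordering hazard. Everything in it (the class name LockOrderDemo, the helper grab, the 100 ms timeout) is made up for illustration. Two threads each stand in for a "module" that takes two locks in the opposite order. The sketch uses a timed tryLock instead of a plain lock so that it reports the conflict and exits rather than hanging, and it uses latches to force the bad interleaving that would otherwise show up only rarely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: two "modules" that acquire the same locks in opposite order.
class LockOrderDemo {

    // Returns how many of the two threads could not obtain their second lock.
    static int run() {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger stuck = new AtomicInteger();

        // Module 1 locks a then b; module 2 locks b then a.
        Thread t1 = new Thread(() -> grab(a, b, bothHoldFirst, bothTried, stuck));
        Thread t2 = new Thread(() -> grab(b, a, bothHoldFirst, bothTried, stuck));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stuck.get();
    }

    static void grab(ReentrantLock first, ReentrantLock second,
                     CountDownLatch held, CountDownLatch tried, AtomicInteger stuck) {
        first.lock();
        try {
            // Wait until both threads hold their first lock (forces the bad interleaving).
            held.countDown();
            held.await();
            // With a plain second.lock() both threads would now wait forever.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                stuck.incrementAndGet();
            }
            // Keep holding the first lock until the other thread has also tried.
            tried.countDown();
            tried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads that would have deadlocked: " + run()); // prints 2
    }
}
```

Neither module is wrong on its own. The problem only exists once they are combined, which is the sense in which locks don't compose.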

If you have a JavaSpaces client that looks at how many CPUs or cores it has to work with when it starts up, and fires up one master thread and enough worker threads to keep all those cores busy, assuming each thread is an independent guy that only communicates with other threads over the network via a JavaSpace, then those threads can't deadlock. (Though I suppose you could design a JavaSpace protocol that could hang them up.)
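
A rough sketch of that master/worker shape, under a big assumption: an in-process BlockingQueue stands in for the JavaSpace so the example runs with nothing but the JDK. It is not the JavaSpaces API. The queue's put and take only play the roles that write and take would play against a real space, and the names SpaceStyleWorkers and Task are hypothetical. The worker count comes from Runtime.availableProcessors(), and workers share nothing except the two queues.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: in-process queues stand in for a JavaSpace.
class SpaceStyleWorkers {

    // Hypothetical task entry; a negative id is a "poison pill" that stops a worker.
    static final class Task {
        final int id;
        Task(int id) { this.id = id; }
    }

    // The master writes taskCount tasks, the workers take and square them,
    // and the master collects the results. Returns the sum of the squares.
    static long run(int taskCount) {
        int workers = Runtime.getRuntime().availableProcessors();
        BlockingQueue<Task> tasks = new LinkedBlockingQueue<>();
        BlockingQueue<Long> results = new LinkedBlockingQueue<>();

        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        Task t = tasks.take();            // blocks until a task is available
                        if (t.id < 0) return;             // poison pill: stop this worker
                        results.put((long) t.id * t.id);  // write the result back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        try {
            for (int i = 1; i <= taskCount; i++) tasks.put(new Task(i));
            long sum = 0;
            for (int i = 0; i < taskCount; i++) sum += results.take();
            for (int i = 0; i < workers; i++) tasks.put(new Task(-1));
            for (Thread t : pool) t.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(100)); // prints 338350
    }
}
```

The workers never hold a lock while waiting on each other; they only block on take, so the only way to hang them is to design a protocol in which nobody ever writes the entry they are waiting for.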

The Pragmatic Programmers recently published a book on Erlang, which got a lot of people talking about Erlang. Erlang uses a shared nothing model, with message passing between "processes" managed by "actors". Processes
can be implemented as threads I assume, or can be distributed. One
interesting thing about Erlang is that it tries to unify the remote and local models, as far as I can tell. Not that they haven't read A Note on
Distributed Computing. I think that instead of trying to make remote
nodes look like local ones, they may treat local ones as unreliable as
remote ones.


I've yet to see exactly how Erlang does failure detection of processes.
I guess there might be some timeout value somewhere in respect of
messages reaching a destination etc but I've not seen a description of
this aspect of Erlang.

Further, whilst Erlang might do failure detection (of a form), solving the
issues of failure is the difficult bit and I'm less convinced Erlang
offers much here.  For example, one solution to failure is replication
and it appears you are (unsurprisingly) left to do that for yourself
right now.  Putting my high-performance hat on I'd also point out that
replication has recognized limits especially when it's done with
transactions which leads to even more esoteric solutions that are
largely about appropriate architecture/interactions and less about
shared-nothing or message passing.

I'm not trying to promote Erlang's approach, only to point out that it is getting a lot of buzz, because people are thinking about multi-core.

I've been involved with a language called Scala lately, which has an
Erlang-like actors library. On the mailing list they keep talking about issues with implementing remote actors. I as yet don't understand these
details either, but I keep getting this weird feeling that wheel
reinvention is going on. They seem to be talking about how to solve
problems that Jini addressed almost 10 years ago.

So here's my question. I get the feeling that the trend to multi-core
architectures represents a disruptive technology shift that will shake
up the software industry to some extent. Does River have something to
offer here? If you expect the chips your software will run on will have
multiple cores, and maybe you don't know how many until your program
starts running, you'll want to organize your software so it distributes processing across those cores dynamically. Isn't JavaSpaces a good way
to do that?

I think what it might mean is that you treat another core on the same
box running a worker thread the same as a worker thread across the
network. That way you have a uniform programming model, and when you run out of cores, you just add more boxes and you get more worker nodes. So it would be the opposite of the concept targeted by the Note. Yes, you would use objects through a uniform interface, and whether or not that
object is implemented locally or remotely would be an implementation
detail of the object. But what you'd assume is not that the thing is
local (a thread on another core of the same box) but remote.


Hmmmm, so the uniform model concept is nice and cleans out one
difficulty but there are some others lying around in this which I reckon
are in need of consideration:

(1) A number of multi-core systems are threatening to head towards NUMA
type architectures where the cost of comms is in part related to the
number of memory spaces you have to hop.

(2)  There's at least some (significant?) difference between comms
performance across processors in the same box versus across a network
and therefore the protocols you design and what you pass around in
messages might be somewhat different.

I'm not sure how NUMA would affect things, but local versus remote interfaces usually get into considering chatty versus chunky designs. So my feeling was that if you only ever want to exploit multiple cores on one box, JavaSpaces would be overkill because you can reasonably rule out partial failure. But when someone wants to exploit multiple cores and also either distribute processing across the network, or at least leave the door open to doing so in the future, JavaSpaces has a compelling solution.

I can imagine J2EE people all over the place in a few years scratching their heads about how they will take advantage of multiple cores for tasks they need done. Will they run a separate J2EE app server on each core? Seems like they could run one app server with multiple threads on each box. But then how do you distribute tasks to those threads? JMS doesn't have a take semantic. I suppose they could install a load balancer in front of a cluster, and have a master server firing jobs into the load balancer.

JavaSpaces solves this problem very elegantly, and has for a long time. The change in the status quo is that the rise of multi-core means more people will be trying to figure out how to do this kind of parallel processing than before. To exploit multi-core, you have to figure out how to partition your app so that you can do parallel processing. You have to find the parallelism. If you actually can do that, you next have to figure out how to implement it. The opportunity I see for River is a marketing one, to simply try and promote the idea that JavaSpaces can be used to solve this kind of problem. So when people face the problem someday, they'll think of JavaSpaces.

Is it still called JavaSpaces? Jini isn't called Jini anymore. What about JavaSpaces?

Thanks.

Bill


