Hi Dan,
On Oct 1, 2007, at 12:26 PM, Dan Creswell wrote:
Brian Goetz has brought up that we may have frog-boiled ourselves
into a bad situation by adopting the model of shared state with locks
in Java. In general the shared state/locks model makes concurrent
programs difficult to reason about, but in particular this approach to
concurrency isn't composable. You can't safely combine different
modules without understanding the details of what they do with locks
and how they will interact.
That's just about the consequences of shared state with concurrent
access, be it using locks, transactional memory, etc.
And I think in general concurrency is difficult to reason about
even in message-based systems with shared nothing, because you still
have issues of failure to deal with, including how that might impact
message delivery.
Concurrency may be difficult to reason about in general, but some
models are more difficult than others. Some models make it impossible
to trip in certain ways, just as it is impossible in Java bytecodes
to free a pointer to memory twice. As I understand it, Erlang
prevents deadlock by only allowing threads to interact via messages.
Transactional memory can still produce starved threads that keep
retrying and keep getting rolled back, but the app as a whole will
make progress because some threads will be getting stuff done. Java's
basic model of synchronization is a bit more like a mine field,
because you have to understand the whole application, including
everything libraries are doing with locks and callbacks and such, to
be sure there are no potential deadlocks. And that's really hard to
do by analysis, and it is hard to detect problems via testing because
they can happen quite rarely.
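For what it's worth, the composability hazard here is the classic lock-ordering one: two modules that each acquire two locks, in opposite orders, can deadlock when combined. A minimal sketch of the standard defense, a single global acquisition order (all names here are illustrative, not from any particular library):

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockOrdering {
    // Two resources guarded by separate locks, as in two independent modules.
    static final ReentrantLock lockA = new ReentrantLock();
    static final ReentrantLock lockB = new ReentrantLock();

    // Deadlock-prone combination: one module takes A then B, another takes
    // B then A. The defense sketched here is a single global order: every
    // code path acquires A before B.
    static void useBoth() {
        lockA.lock();
        try {
            lockB.lock();
            try {
                // critical section touching both resources
            } finally {
                lockB.unlock();
            }
        } finally {
            lockA.unlock();
        }
    }

    // Two threads calling useBoth() concurrently cannot deadlock, because
    // neither can ever hold B while waiting for A.
    static boolean runBoth() throws InterruptedException {
        Thread t1 = new Thread(LockOrdering::useBoth);
        Thread t2 = new Thread(LockOrdering::useBoth);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed without deadlock: " + runBoth());
    }
}
```

The trouble, of course, is that nothing enforces this ordering across independently written libraries; any callback that takes B before calling into code that takes A silently reintroduces the hazard, which is why the whole application has to be understood.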
If you have a JavaSpaces client that looks at how many CPUs or cores
it has to work with when it starts up, and fires up one master thread
and enough worker threads to keep all those cores busy, assuming each
thread is an independent guy that only communicates with other
threads over the network via a JavaSpace, then those threads can't
deadlock. (Though I suppose you could design a JavaSpace protocol
that could hang them up.)
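A minimal sketch of that master/worker shape, using a LinkedBlockingQueue as a purely local stand-in for the space (put plays the role of write, take the role of take; a real client would use net.jini.space.JavaSpace, and the method names below are otherwise made up for illustration):

```java
import java.util.concurrent.*;

public class CoreSizedWorkers {

    // Distribute "square this number" tasks to a pool of workers sized
    // to the cores available at startup, communicating only through a
    // queue that stands in for a JavaSpace.
    static long sumOfSquares(int tasks) throws InterruptedException {
        BlockingQueue<Integer> space = new LinkedBlockingQueue<>();
        BlockingQueue<Long> results = new LinkedBlockingQueue<>();

        // Look at how many cores we have to work with when we start up.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService workers = Executors.newFixedThreadPool(cores);

        // Each worker is independent and communicates only through the
        // "space", so the workers cannot deadlock on each other's locks.
        for (int i = 0; i < cores; i++) {
            workers.submit(() -> {
                try {
                    while (true) {
                        int task = space.take();          // blocks until a task exists
                        if (task < 0) break;              // poison pill: shut down
                        results.put((long) task * task);  // "write" the result back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Master thread: write the task entries, then one pill per worker.
        for (int i = 0; i < tasks; i++) space.put(i);
        for (int i = 0; i < cores; i++) space.put(-1);

        long sum = 0;
        for (int i = 0; i < tasks; i++) sum += results.take();
        workers.shutdown();
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum of squares = " + sumOfSquares(100));
    }
}
```

The point of the queue stand-in is only the shape of the design; with a real space, the same workers could just as easily be on other boxes.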
The Pragmatic Programmers recently published a book on Erlang,
which got a lot of people talking about Erlang. Erlang uses a shared
nothing
model,
with message passing between "processes" managed by "actors".
Processes
can be implemented as threads I assume, or can be distributed. One
interesting thing about Erlang is that it tries to unify the
remote and
local models, as far as I can tell. Not that they haven't read "A
Note on Distributed Computing." I think that instead of trying to
make remote nodes look like local ones, they may treat local ones as
unreliable as remote ones.
I've yet to see exactly how Erlang does failure detection of
processes.
I guess there might be some timeout value somewhere in respect of
messages reaching a destination, etc., but I've not seen a
description of this aspect of Erlang.
Further, whilst Erlang might do failure detection (of a form),
solving the issues of failure is the difficult bit, and I'm less
convinced Erlang offers much here. For example, one solution to
failure is replication, and it appears you are (unsurprisingly) left
to do that for yourself right now. Putting my high-performance hat
on, I'd also point out that replication has recognized limits,
especially when it's done with transactions, which leads to even more
esoteric solutions that are largely about appropriate
architecture/interactions and less about shared-nothing or message
passing.
I'm not trying to promote Erlang's approach, only to point out that
it is getting a lot of buzz, because people are thinking about multi-
core.
I've been involved with a language called Scala lately, which has an
Erlang-like actors library. On the mailing list they keep talking
about
issues with implementing remote actors. I don't yet understand these
details either, but I keep getting this weird feeling that wheel
reinvention is going on. They seem to be talking about how to solve
problems that Jini addressed almost 10 years ago.
So here's my question. I get the feeling that the trend to multi-core
architectures represents a disruptive technology shift that will
shake
up the software industry to some extent. Does River have something to
offer here? If you expect the chips your software will run on will
have
multiple cores, and maybe you don't know how many until your program
starts running, you'll want to organize your software so it
distributes
processing across those cores dynamically. Isn't JavaSpaces a good
way
to do that?
I think what it might mean is that you treat another core on the same
box running a worker thread the same as a worker thread across the
network. That way you have a uniform programming model, and when
you run
out of cores, you just add more boxes and you get more worker
nodes. So
it would be the opposite of the concept targeted by the Note. Yes,
you
would use objects through a uniform interface, and whether or not
that
object is implemented locally or remotely would be an implementation
detail of the object. But what you'd assume is not that the thing is
local (a thread on another core of the same box) but remote.
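One way to read that "assume remote" stance: give workers a uniform contract that admits failure even when the implementation happens to be local. A sketch with hypothetical names (this is not a River or Jini API, just the shape of the idea):

```java
import java.rmi.RemoteException;

public class UniformModel {
    // Hypothetical uniform worker contract: callers must handle failure
    // whether the implementation is a thread on another core of the same
    // box or a node across the network.
    interface Worker {
        int square(int n) throws RemoteException;
    }

    // Local implementation: runs in-process, but still declares the
    // remote-style contract, so swapping in a remote proxy changes
    // nothing for the caller.
    static class LocalWorker implements Worker {
        public int square(int n) { return n * n; }
    }

    // The caller is forced to decide what failure means, even though
    // this particular worker can never actually fail remotely.
    static int runJob(Worker w, int n) {
        try {
            return w.square(n);
        } catch (RemoteException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(runJob(new LocalWorker(), 7));
    }
}
```

When you run out of cores, replacing LocalWorker with a network proxy leaves every call site untouched, which is the uniform-model payoff.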
Hmmmm, so the uniform model concept is nice and clears up one
difficulty, but there are some others lying around in this which I
reckon are in need of consideration:
(1) A number of multi-core systems are threatening to head towards
NUMA-type architectures where the cost of comms is in part related to
the number of memory spaces you have to hop.
(2) There's at least some (significant?) difference between comms
performance across processors in the same box versus across a network
and therefore the protocols you design and what you pass around in
messages might be somewhat different.
I'm not sure how NUMA would affect things, but local versus remote
interfaces usually get into considering chatty versus chunky designs.
So my feeling was that if you really are only ever going to want to
exploit multiple cores on one box, JavaSpaces would be overkill,
because you can reasonably rule out partial failure. But in the case
where someone wants to exploit multiple cores and also either
distribute processing across the network as well, or at least leave
the door open to make it easy to distribute across the network in the
future, JavaSpaces has a compelling solution.
I can imagine J2EE people all over the place in a few years
scratching their heads about how they will take advantage of multiple
cores for tasks they need done. Will they run a separate J2EE app
server on each core? Seems like they could run one app server with
multiple threads on each box. But then how do you distribute tasks to
those threads? JMS doesn't have a take semantic. I suppose they could
install a load balancer in front of a cluster, and have a master
server firing jobs into the load balancer.
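The take semantic in question can be sketched with a plain BlockingQueue, which, like a JavaSpace take (minus the template matching), removes each entry exactly once, so two competing workers never process the same job. The queue and names below are illustrative stand-ins, not any messaging API:

```java
import java.util.Set;
import java.util.concurrent.*;

public class TakeSemantics {

    // Distribute n jobs to two competing consumers. Because poll()
    // removes an entry from the queue, each job is consumed exactly
    // once -- the semantic a master/worker design needs.
    static int processDistinct(int n) throws InterruptedException {
        BlockingQueue<Integer> jobs = new LinkedBlockingQueue<>();
        for (int i = 0; i < n; i++) jobs.put(i);

        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Runnable consumer = () -> {
            Integer job;
            while ((job = jobs.poll()) != null) {
                // add() returning false would mean two consumers got one job
                if (!seen.add(job)) {
                    throw new IllegalStateException("duplicate " + job);
                }
            }
        };
        Thread c1 = new Thread(consumer);
        Thread c2 = new Thread(consumer);
        c1.start(); c2.start();
        c1.join(); c2.join();
        return seen.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("distinct jobs processed: " + processDistinct(10));
    }
}
```

A JavaSpace adds what the queue lacks: template matching on entry fields, transactions, and network reach, which is what makes it a task distributor rather than just a pipe.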
JavaSpaces solves this problem very elegantly, and has for a long
time. The change in the status quo is that the rise of multi-core
means more people will be trying to figure out how to do this kind of
parallel processing than before. To exploit multi-core, you have to
figure out how to partition your app so that you can do parallel
processing. You have to find the parallelism. If you actually can do
that, you next have to figure out how to implement it. The
opportunity I see for River is a marketing one, to simply try and
promote the idea that JavaSpaces can be used to solve this kind of
problem. So when people face the problem someday, they'll think of
JavaSpaces.
Is it still called JavaSpaces? Jini isn't called Jini anymore. What
about JavaSpaces?
Thanks.
Bill