Re: jini / javaspaces wrt scala and akka

Oliver Plohmann Sun, 23 Oct 2011 06:10:52 -0700

Hello,

I once developed an interest in STM and surfed a bit through the Internet to find more information about it. I discovered that JBoss Infinispan is, according to the JBoss Infinispan people themselves, very STM-like. It has MVCC (JBoss Cache already had MVCC). Infinispan seems to be an option as a central store through which actors can do lock-free communication. Infinispan also supports asynchronous event notification like JavaSpaces. Large parts of JBoss Cache were rewritten, because they also had to learn their lessons. Somewhere on TheServerside.com I once found a remark by some developer saying that they are seriously considering Infinispan since the license costs for Coherence are killing them. There was also some comment where somebody questioned how accurately MVCC was implemented in Akka. I should provide a link to that reference, I know. But if you search the Internet for "MVCC Akka Actors Infinispan Coherence, etc." you should find some interesting comments.

So my suggestion would be to go with something like River or Infinispan and add some actor framework to it, or develop your own little queue-based, thread-pool-aware mini actor solution on top of it. I very much like the idea of actors and STM used in conjunction. With STM a commit is guaranteed to be deadlock-free. At worst, you may get an optimistic locking exception. There is some overhead here, because you then have to refetch your data from your central store and restart your unit of work. For someone who knows what a pain deadlocks can be in a true production environment, the cost is small compared to the gain.
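To make the refetch-and-restart cycle concrete, here is a minimal sketch in Java. It is only an illustration: VersionedCell is a hypothetical stand-in for the central store, built on a plain AtomicReference, and it is not the Infinispan, River or Akka API. A commit only succeeds if nobody else has committed since the data was read; otherwise the unit of work is simply run again on fresh data.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a versioned central store (assumption, not a real product API).
final class VersionedCell<T> {
    // Immutable snapshot: a value plus the version it was committed under.
    static final class Snapshot<T> {
        final T value;
        final long version;
        Snapshot(T value, long version) { this.value = value; this.version = version; }
    }

    private final AtomicReference<Snapshot<T>> current;

    VersionedCell(T initial) {
        current = new AtomicReference<>(new Snapshot<>(initial, 0L));
    }

    Snapshot<T> read() { return current.get(); }

    // Succeeds only if no other commit happened since 'seen' was read.
    boolean tryCommit(Snapshot<T> seen, T newValue) {
        return current.compareAndSet(seen, new Snapshot<>(newValue, seen.version + 1));
    }

    // Runs the unit of work until its commit goes through.
    // Returns the number of attempts that were needed.
    int update(UnaryOperator<T> work) {
        int attempts = 0;
        while (true) {
            attempts++;
            Snapshot<T> seen = read();           // refetch from the store
            T result = work.apply(seen.value);   // redo the unit of work
            if (tryCommit(seen, result)) return attempts;
            // Optimistic conflict: start over. No lock is ever held, so no deadlock.
        }
    }
}
```

A caller would write something like cell.update(v -> v + 1). The unit of work has to be free of side effects, since it may run more than once before its commit succeeds.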

I think actors on their own are not that useful. Shared-nothing is sometimes really hard to achieve unless things can be well isolated. In that case you don't need actors, as you need nothing more than some map-reduce framework. But actors combined with STM really make sense to me. Through the STM space, actors are able to coordinate themselves in a lock-free way. Let's say executing a job in the actor queue throws an exception. Then how do you react properly? The first thing would be to ask all other actors to stop execution. How else can you unwind things properly if everyone else continues to do its job? By the time your actor understands what went wrong and wants to set itself up again, the surrounding environment has already been modified by the other actors. Then where do you restart? It gets really difficult to continue correctly after some exception or some logical error.

What you never hear from any of the actor proponents is that asynchronous communication is not easy. Messages don't necessarily arrive in the order they were sent. The receiving actor needs to make sure some meaningful order is established. Asynchronous programming is not new, and hence people have already discovered that hierarchical state machines work well for this job. But I never saw any comment on any homepage from some actor framework project group dropping a note about this. And how do you proceed in case of an exception? This is also never mentioned. I think in case of an exception the STM is really useful. Not changing some flag after an exception tells the actors that something went wrong somewhere else and that they have to return to some safe point, restart their unit of work, continue on their own, or whatever.
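As a small illustration of what "establishing a meaningful order" can mean for the receiver, here is a sketch in Java. It uses plain sequence numbers rather than a hierarchical state machine, so it covers only the simplest case. The Resequencer class is hypothetical, and it assumes that each sender numbers its messages 0, 1, 2, ... and that messages are never null.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Buffers messages that arrive early and releases them in sequence order.
// Hypothetical helper for illustration; not part of any actor framework.
final class Resequencer<M> {
    private final Map<Long, M> pending = new HashMap<>();
    private long nextExpected = 0;

    // Takes a message with its sender-assigned sequence number and returns
    // every message that is now deliverable in order (possibly none).
    List<M> accept(long seq, M message) {
        List<M> ready = new ArrayList<>();
        if (seq < nextExpected) {
            return ready;                 // already delivered: ignore the duplicate
        }
        pending.put(seq, message);        // park it until its turn comes
        M next;
        while ((next = pending.remove(nextExpected)) != null) {
            ready.add(next);              // release the contiguous run
            nextExpected++;
        }
        return ready;
    }
}
```

Note that the buffer grows without bound if a message is lost for good, so a real receiver would also need a timeout or a resend request for the missing sequence number.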

Actually, JavaSpaces is well-suited for actor-based distributed programming in my opinion. Only the actor libraries are a bit lacking. Most actor frameworks are hobby projects or thesis works where some work remains to be done before they are ready for production use. The pity is that River and Blitz are not clustered, and GigaSpaces is only free if there is only one server in the cluster (the same now goes for Terracotta). I fear that for JavaSpaces to do well in the future, a free clustered implementation is needed. Otherwise people will simply go with something like Infinispan or EHCache. Some backing company like JBoss would be nice as well.

I thought about combining Apache Hadoop with JavaSpaces lately (only the Hadoop distributed file system, without the map-reduce stuff). Hadoop is clustered (the name node is a single point of failure, I know ... but it is nonetheless clustered and scales well). I know the idea is a bit crazy, since Hadoop is based on a read-many/write-once approach, which might be fine for data analysis but is nonetheless very contrary to the idea in JavaSpaces. But I don't want to give up on the idea that quickly. Maybe write-once is a more practical interpretation of shared-nothing? Yah, now it gets really crazy... Time to finish off and enjoy the rest of my Sunday :-).


Cheers, Oliver

On 22.10.2011 09:27, Dan Creswell wrote:

Hey Patrick,

On 21 October 2011 21:44, Patrick Logan<[email protected]>  wrote:

Hi all,

I was involved in a jini/javaspaces project five years ago, with good
success. Recently I've been trying Scala including the Akka "actors"
library.

Akka has promise but has fewer capabilities than Jini and Javaspaces,
and is certainly less mature.

I am wondering if anyone on the River list has been using Akka with or
without Scala?

Yes and I know some other people doing similar. Quite a lot of
interest in the banks....

Might folks see River as a more mature, capable system worthy of a
revival now that Scala and Akka are attracting a lot of interest in a
related distributed programming model?

I'm afraid I think it's unlikely to help bridge the gap for several reasons:

(1) The programming models are substantially different -
single-threaded, message passing with queues and no clear
differentiation of remote ( I'll say a bit more about that in a
minute.) from local vs services and any threading model you like.

(2) I'll no doubt get into trouble for saying this: The Akka folks are
making all the same mistakes made by others previously. In particular
they've attempted uniform I/O models for different grades of
consistency, they consider various bits and pieces as optimisations
rather than critical requirements for stability/maintainability and
they're attempting to retrofit high availability and replication to a
non-distributed memory model (something that Terracotta have attempted
for example).

(3) I'll likely get into trouble for saying this too: It's not just
the Akka folks, the community of users has made the same old mistake
of "do distributed looking like local". Akka is nothing new (it's just
single-threaded executors and queues for the most part) but many think
it is. The same assumptions are being made with the same "discoveries"
to follow.

Fundamentally, people are buying in to the same old "there is remote
that looks like local" snake oil. River has never been in that camp,
it could move there but would be a completely different beast. Whilst
it remains a different beast and people still want the snake oil,
adoption will be limited (I think for various other reasons that will
always be the case, another story there).

Back to programming models as promised.....

Doug Lea has some interesting stuff to say on transactional memory,
actors and such here: http://days2011.scala-lang.org/node/138/274 -
I'd summarise as "horses for courses and no silver bullet". Watch it
though, it's a great talk.

And remoteness:

The number of services in a large system is likely very much smaller
than the number of actors one would see in a similar system. That will
require a re-think of monitoring and debugging. No one seems to be
worried about that even though most existing messaging systems have
considerably fewer moving parts than one will find in an actor based
system of comparable size (there are very few large-scale message
based systems other than the likes of TCP, UDP and IP networks to take
learnings from).

Whilst most hold that introducing asynchronous operation is sufficient
to paper over the cracks of local vs remote, it's not true. One must
also account for lost messages which means introducing timeouts. Note
that messages aren't lost in the local model but will be across
networks requiring resends. That introduces highly variable latency
where on networks with no problems, things are fine as they are on the
local box but not for the failing network. The actor model has no
accounting for such latency, nor does it hint it's an issue. Resending
of messages implies build up of queues and resource exhaustion which
needs tackling via the likes of flow control or limited queue sizes
and dropped messages. Again, the actor model lacks in this area.
Similar resource exhaustion problems exist when actors don't process
queues fast enough (perhaps because they are updating persistent
storage which is substantially slower than network I/O and memory
operations), again limited discussion.
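For illustration only, the "limited queue sizes and dropped messages" option can be sketched in Java as below. BoundedMailbox is a hypothetical class, not an Akka or River API. It assumes a single consumer thread calls processOne() in a loop and that senders decide for themselves whether to resend, back off or give up when send() returns false.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// A mailbox with a hard capacity limit: when it is full, new messages are
// rejected and counted instead of piling up until memory runs out.
final class BoundedMailbox<M> {
    private final ArrayBlockingQueue<M> queue;
    private final Consumer<M> handler;
    private final AtomicLong dropped = new AtomicLong();

    BoundedMailbox(int capacity, Consumer<M> handler) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.handler = handler;
    }

    // Non-blocking send. Returns false (and counts a drop) when the mailbox is full.
    boolean send(M message) {
        boolean accepted = queue.offer(message);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Handles at most one queued message. Returns false if the mailbox was empty.
    boolean processOne() {
        M message = queue.poll();
        if (message == null) {
            return false;
        }
        handler.accept(message);
        return true;
    }

    long droppedCount() { return dropped.get(); }
}
```

The drop counter is the minimum needed for monitoring. A slow handler (for example one writing to persistent storage) shows up as a rising droppedCount() rather than as an out-of-memory error.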

To summarise, there's a lot of noise about actors and plenty of
interest. Unfortunately, IMHO there's nothing more to it than adopting
a discipline of single-threaded executors with queues and dealing with
the well-known consequences. Sure, Akka has "remote queues" but they
don't work right/well and without a substantial mindset change that
will break the original promise of "easy remoteness" won't ever work.

Has anyone tried Scala with River (or related services such as Blitz)?

I have, I know that Paul Snively was looking at it way back (Scala on
Blitz) and in fact there is Fly (which isn't Jini/JavaSpaces but
similar) which has Scala bindings and is being used by a few people.
Wouldn't say there's widespread usage tho'

-Patrick



--

www.objectscape.org
