Hi Jim,

On Tue, Feb 25, 2014 at 12:02 AM, Jim Newsham <[email protected]> wrote:

>
> Hi everyone,
>
> I just became aware of how Akka can establish a quarantine between remote
> actor systems, requiring a restart of one of the actor systems.  I know
> this has already been discussed in several threads in this forum, as I've
> been searching and reading anything relevant to get an understanding of the
> issues.
>
> I apologize for beating a dead horse, but something feels very wrong to me
> about this approach.  It seems very heavy-handed to have a commonly
> occurring condition where an entire actor system must be restarted.
>

In short, the fallacy in the sentence above is the claim that quarantining
happens under commonly occurring conditions. It happens in only two kinds
of situations:
 - Irrecoverable conditions: system messages have been undeliverable for
a very long time (and I mean really long), or the system message sequence
numbers are in an inconsistent state on the two systems.
 - An irreversible decision has been made by declaring the other system
dead (hence the name DeathWatch and not TemporalUnreachabilityWatch, and
cluster node Down instead of cluster node Unreachable).


>  And it seems contrary to Akka's otherwise fine-grained error handling and
> recovery support which allows for individual actors to fail and be
> restarted.
>

Akka recovers lost connections on the remoting level, and recovers cluster
node unreachability on the cluster level (main feature of 2.3).
Quarantining triggers on stronger conditions than those above.

Also, an actor for which a Terminated has been sent is definitely stopped.
A restart does not send Terminated.


>
> Is quarantining the only reasonable approach?  I understand that after a
> remote actor system is declared unreachable
>

Not unreachable; this is the point where the argument breaks down. The
remote system is considered dead, and that is an irreversible decision. If
you want to track temporary unreachability, you have to either implement
heartbeating yourself (if you use pure remoting) or use the cluster
unreachability tracking facility.
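
To make the distinction concrete, here is a minimal sketch in plain Java.
It is not Akka's API or implementation: PeerTracker, PeerState and the
timeout value are hypothetical names chosen for illustration. It only shows
the idea that unreachability is a reversible observation, while declaring a
peer dead is a one-way decision after which its messages are no longer
accepted.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical peer states for this sketch; not part of Akka. */
enum PeerState { REACHABLE, UNREACHABLE, DEAD }

/** Illustrative heartbeat tracker: UNREACHABLE can recover, DEAD cannot. */
class PeerTracker {
    private final long unreachableAfterMillis;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    private final Map<String, PeerState> states = new HashMap<>();

    PeerTracker(long unreachableAfterMillis) {
        this.unreachableAfterMillis = unreachableAfterMillis;
    }

    /** Records a heartbeat. Returns false if the peer was already declared dead. */
    boolean heartbeat(String peer, long nowMillis) {
        if (states.get(peer) == PeerState.DEAD) {
            return false; // the decision is irreversible: ignore the peer
        }
        lastHeartbeat.put(peer, nowMillis);
        states.put(peer, PeerState.REACHABLE); // recovers from UNREACHABLE
        return true;
    }

    /** Marks peers that have been silent for too long as UNREACHABLE. */
    void tick(long nowMillis) {
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
            String peer = e.getKey();
            boolean silent = nowMillis - e.getValue() > unreachableAfterMillis;
            if (silent && states.get(peer) == PeerState.REACHABLE) {
                states.put(peer, PeerState.UNREACHABLE);
            }
        }
    }

    /** Explicit one-way decision, analogous in spirit to declaring a system dead. */
    void declareDead(String peer) {
        states.put(peer, PeerState.DEAD);
        lastHeartbeat.remove(peer);
    }

    /** Returns the current state, or null for a peer never seen. */
    PeerState stateOf(String peer) {
        return states.get(peer);
    }
}
```

In this sketch a late heartbeat moves a peer from UNREACHABLE back to
REACHABLE, but nothing moves it out of DEAD; the only way back would be a
peer with a new identity, which corresponds to restarting the remote system.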

I hope this clarifies the decision. Also, you might want to take a look at
this:
http://doc.akka.io/docs/akka/2.3.0-RC4/scala/remoting.html#Lifecycle_and_Failure_Recovery_Model

-Endre


> and Terminated messages are dispatched for remotely watched actors, that
> we don't want to receive any more messages from the actors which have been
> declared Terminated.
>
> As an alternative idea, what if you included an identifier for the current
> remote association as part of a remote actor's identifier?  In this case,
> when a new remote association is established, all actors on the remote side
> are considered distinct from any actors the local side may have
> communicated with in the past.  This guarantees we don't violate the "no
> messages after Terminated" rule, and allows the distributed system to
> continue operating without a restart.  Individual actors can look up and
> re-establish communication with their remote peers, achieving fine-grained
> fault tolerance.
>
> Best regards,
> Jim
>
>  --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ: http://akka.io/faq/
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/groups/opt_out.
>



-- 
Akka Team
Typesafe - The software stack for applications that scale
Blog: letitcrash.com
Twitter: @akkateam
