Greetings, 

We have a cluster-sharded actor that manages the UserSession for each of our 
users. The actor's persistence id is the session id, and its shard is 
determined by sessionId.hashCode() % 37. All of this works beautifully 
except when a node goes down, either by crashing (simulated with kill -9) or 
gracefully. When the node goes down gracefully it does the following: 

UserAuthenticationActor.shardRegion() ! ShardRegion.GracefulShutdown

// .. other code

cluster.leave(cluster.selfAddress)


Obviously, a crash executes none of this. 

When a node goes down in either way, any user whose session actor was on that 
node is completely frozen out of the system. If we restart the node it 
recovers, but we can't guarantee that the node will be restarted, and 
rebooting the whole cluster is not an acceptable response to a single node 
failure. Furthermore, we would like to be able to roll new code into the site 
without shutting it down and restarting it. With 9 nodes, we want to be able 
to roll them one at a time so that users don't notice any interruption. 

Following the recommendation in the Akka docs, we do not have auto-downing 
enabled, but I am now wondering whether we should enable it to cover the case 
of a total node failure. 

Are there any recommendations anyone can offer to make this process work 
the way we want it to? 
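
In case it helps, here is a minimal sketch of the shard mapping described 
above. It is not our production code: the names SessionShards, shardIndex and 
shardId are made up for illustration. One assumption to flag: the sketch uses 
floorMod instead of a plain %, because hashCode() can be negative and % keeps 
the sign of the dividend. 

```scala
// Hypothetical helper; names are illustrative only.
object SessionShards {
  // Shard count taken from the description above.
  val NumberOfShards = 37

  // hashCode() may be negative and % keeps the sign of the dividend, so
  // floorMod is used (an assumption of this sketch) to keep the index in
  // the range 0 until NumberOfShards.
  def shardIndex(hash: Int): Int =
    java.lang.Math.floorMod(hash, NumberOfShards)

  // Cluster Sharding identifies shards by String, so the index is rendered
  // as text.
  def shardId(sessionId: String): String =
    shardIndex(sessionId.hashCode).toString
}
```

The same session id always maps to the same shard id, so which node hosts the 
session depends only on where that shard is currently allocated. 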

Thanks for your time. 
