Greetings,

We have a cluster-sharded actor that manages the UserSession for our users. The actor's persistence id is the session id, and its shard is determined by sessionId.hashCode() % 37. All of this works beautifully except when a node goes down, whether by crashing (simulated with kill -9) or gracefully. When the node goes down gracefully, it does the following:

    UserAuthenticationActor.shardRegion() ! ShardRegion.GracefulShutdown
    // .. other code
    cluster.leave(cluster.selfAddress)

Obviously, a crash executes none of this. When a node goes down, any user whose session actor was on that node is completely frozen out of the system. If we restart the node, it recovers, but we can't guarantee that the node will be restarted, and rebooting the whole cluster is not a good procedure for handling a single node failure.

Furthermore, we would like to be able to warm-roll code into the site without shutting it down and restarting it. With 9 nodes, we want to be able to roll them one at a time so that users don't notice any interruption.

Following the recommendation in the Akka docs, we do not have auto-downing enabled, but I am now thinking maybe we should enable it for the case of a total node failure.

Are there any recommendations anyone can offer to make this process work the way we want it to?

Thanks for your time.
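
P.S. For context, here is a minimal sketch of the shard id computation described above. The names SessionSharding, NumberOfShards, and shardIdFor are illustrative, not our actual code. One assumption to flag: the sketch uses java.lang.Math.floorMod rather than the plain % operator, because String.hashCode can be negative and % would then yield negative shard ids (up to 73 distinct values rather than 37).

```scala
object SessionSharding {
  // Number of shards, taken from the modulus mentioned above.
  val NumberOfShards: Int = 37

  // Maps a session id to a shard id in the range "0".."36".
  // floorMod always returns a non-negative result for a positive modulus,
  // unlike %, which keeps the sign of the hash code.
  def shardIdFor(sessionId: String): String =
    java.lang.Math.floorMod(sessionId.hashCode, NumberOfShards).toString
}
```

For example, SessionSharding.shardIdFor("abc") yields "6", since "abc".hashCode is 96354 and 96354 mod 37 is 6.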
