Hello fellow hAkkers, we have multiple persistent actor types distributed using cluster sharding. Some of them logically belong together; let's say they're customers and their orders. Customers never talk to other customers' orders, and vice versa. Thus it makes sense to us to have these actors reside on the same cluster shard (and consequently, in the same VM).
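To make the scheme concrete: we give customers ids like c123 and their orders ids like c123-o0, c123-o1, and so on, and derive the shard id from the customer prefix. A simplified sketch of the id mapping (names and id format are my own shorthand; in real code this logic sits in each region's ShardResolver / message extractor):

```java
// Both entity types strip the order suffix, so the customer "c123" and
// its order "c123-o1" yield the same shard id string -- which, as it
// turns out, only co-locates entities within ONE ShardRegion.
public class SharedShardId {
    public static String shardId(String entityId) {
        int dash = entityId.indexOf('-');
        return dash < 0 ? entityId : entityId.substring(0, dash);
    }
}
```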
We implemented this by returning identical ShardIds for the customer c123 and its orders c123-o0, c123-o1, etc. But of course, this doesn't work like we thought it would. :) The ShardResolvers of two ShardRegion instances operate independently, and we just end up with two shards -- one for customers and one for orders -- which share a name but not necessarily a cluster host. I have seen this misunderstanding crop up a few times before on this list, which makes it slightly less embarrassing to admit the mistake. ;)

We could stop using cluster sharding for the orders completely and instead route all messages for the orders through the customers, which would restart the order actors on demand. But that sounds like a lot of extraneous code: many other actors talk to the orders[0], and the customers shouldn't need to route these messages or worry about them; the customer actors need not even be alive for them. We'd also have to reimplement the other things cluster sharding does: passivation of orders, gracefully handling rebalances of customers (killing all order actors when that happens, I guess), and maybe more.

[0] I realize this will lead to the question: if many other actors talk to the orders without involving the customers, why do you want them on the same host? Let's just assume for the sake of argument that circumstances make this a reasonable requirement -- unless you're saying it's not a reasonable requirement under any circumstances.

The alternative would involve writing a custom ShardAllocationStrategy that's shared between the customer and order ShardRegions. I suppose it would involve the following:

- maintain the associations between ShardRegion ActorRef and ShardIds for each entity type;

1. for a newly requested allocation for entity type X:
2. check whether the same shardId is already allocated for any other entity type Y, yielding (at least one) associated shardRegionActorRefY;
3.
if so, determine whether there is any shardRegionActorRefX for entity type X on the same host as shardRegionActorRefY;
4. if so, allocate the shardId to shardRegionActorRefX (i.e. return it; optionally balance between several candidates);
5. otherwise, fall back to any other ShardAllocationStrategy (updating the associations based on its return value).

Eugh. I feel dirty now. Apart from the general horribleness, I imagine step 3 is fraught with peril. And of course, the whole thing would need to be thread-safe, because it will be accessed and modified concurrently by several ShardRegions. (Time to dust off ye olde ConcurrentHashMap.) The more I look at it, the more fragile and less feasible it seems. At the same time, having this sort of control over the clustering of several entity types does not seem particularly outrageous. Are we missing something?

Thanks as always for your thoughts,
Moritz
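P.S. To make the bookkeeping in the steps above concrete, here is a pure-data sketch. All names are my own invention, and I've cheated by representing regions as host strings and by passing in a precomputed fallback host; real code would deal in ActorRefs, delegate to an actual ShardAllocationStrategy, and would somehow have to answer step 3's "which host is this region on?" question:

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CoLocatingAllocator {
    // "maintain the associations": (entityType, shardId) -> host it was allocated to.
    // ConcurrentHashMap because several ShardRegions hit this concurrently.
    private final Map<String, String> allocations = new ConcurrentHashMap<>();

    private static String key(String entityType, String shardId) {
        return entityType + "|" + shardId; // assumes entityType contains no '|'
    }

    // Steps 1-5 for a newly requested allocation of shardId for entityType.
    // regionHosts = hosts currently running a ShardRegion for entityType;
    // fallbackHost = whatever the default strategy would have picked.
    public String allocate(String entityType, String shardId,
                           Set<String> regionHosts, String fallbackHost) {
        // step 2: same shardId already allocated for any OTHER entity type?
        Optional<String> existing = allocations.entrySet().stream()
            .filter(e -> e.getKey().endsWith("|" + shardId)
                      && !e.getKey().equals(key(entityType, shardId)))
            .map(Map.Entry::getValue)
            .findFirst();
        // steps 3/4: co-locate if a region for our type runs on that host;
        // step 5: otherwise fall back, then record the association either way.
        String chosen = existing.filter(regionHosts::contains).orElse(fallbackHost);
        allocations.put(key(entityType, shardId), chosen);
        return chosen;
    }
}
```

The concurrent map covers the thread-safety part, but it does nothing for the actually scary bits: mapping a region ActorRef to a host, and keeping the associations honest across rebalances and region restarts.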
