Hi Moritz,

On Wed, Nov 19, 2014 at 3:27 PM, Moritz Schallaböck <[email protected]> wrote:
> Hello fellow hAkkers,
>
> we have multiple persistent actor types distributed using cluster
> sharding. Some of them logically belong together; let's say they're
> customers and their orders. Customers never talk to orders of other
> customers, and vice versa. Thus it makes sense to us to have these actors
> reside on the same cluster shard (and consequently, in the same VM).
>
> We implemented this by returning identical ShardIds for the customer c123
> and its orders c123-o0, c123-o1, etc. But of course, this doesn't work
> like we thought it would. :) The ShardResolvers of the two ShardRegion
> instances operate independently, and we just end up with two shards --
> one for customers and one for orders -- which share a name but not
> necessarily a cluster host. I have seen this misunderstanding crop up a
> few times before on this list, which makes it slightly less embarrassing
> to admit the mistake. ;)
>
> We could stop using cluster sharding for the orders completely and
> instead route all messages for the orders through the customers, which
> would restart the actors on demand. But that sounds like a lot of
> extraneous code: many other actors talk to the orders [0], and the
> customers shouldn't need to route these messages or worry about them; the
> customer actors need not even be alive for them. And we'd also have to
> worry about the other things that cluster sharding does: support for
> passivation of orders, gracefully handling rebalances of customers
> (killing all order actors when that happens, I guess), and maybe other
> things.
>
> [0] I realize that this will lead to the question: if many other actors
> talk to the orders without involving the customers, why do you want them
> on the same host? Let's just assume for the sake of argument that
> circumstances make this a reasonable requirement, unless you're saying
> it's not a reasonable requirement under any circumstances.
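For concreteness, the scheme described above might look roughly like this
against the Akka 2.3 contrib API (the `CustomerMsg`/`OrderMsg` envelope types
and the "c123-o0" id convention are illustrative assumptions, not actual code
from the thread):

```scala
import akka.contrib.pattern.ShardRegion

// Hypothetical message envelopes carrying the entity id.
final case class CustomerMsg(customerId: String, payload: Any)
final case class OrderMsg(orderId: String, payload: Any) // e.g. "c123-o0"

val customerShardResolver: ShardRegion.ShardResolver = {
  case CustomerMsg(customerId, _) => customerId
}

// The order resolver strips the "-oN" suffix so that order "c123-o0"
// yields the same ShardId string as customer "c123".
val orderShardResolver: ShardRegion.ShardResolver = {
  case OrderMsg(orderId, _) => orderId.takeWhile(_ != '-')
}
```

As you observed, equal ShardId strings in two different ShardRegions do not
imply co-location: each entity type has its own coordinator and its own
allocation, so the two shards named "c123" can land on different nodes.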
> The alternative would involve writing a custom ShardAllocationStrategy
> that's shared among the customer and order ShardRegions. I suppose it
> would involve the following:
> - maintain the associations between ShardRegion ActorRef and ShardIds
>   for each entity type;
> 1. for a newly requested allocation for entity type X:
> 2. check whether the same shardId is already allocated for any other
>    entity type Y, yielding (at least) one associated shardRegionActorRefY
> 3. if so, determine whether there is any shardRegionActorX for entity
>    type X that's on the same host as shardRegionActorRefY
> 4. if so, allocate the shardId to shardRegionActorX (i.e. return it;
>    optionally balance between several candidates)
> 5. otherwise, fall back to any other ShardAllocationStrategy (updating
>    the associations based on its return value)
>
> Eugh. I feel dirty now. Apart from the general horribleness, I imagine
> step 3 is fraught with peril. And of course, the whole thing would need
> to be thread-safe because it will be accessed and modified concurrently
> by several ShardRegions. (Time to dust off ye olde ConcurrentHashMap.)
> The more I look at it, the more fragile and less feasible it seems.

Yes, there are a lot of pitfalls in that approach. One more that you perhaps
didn't think of is that shardRegionActorX for entity type X might have been
allocated, and then later shardRegionActorY for entity type Y is to be
allocated by a coordinator running in a different JVM (because of a crash).
Then the shared ShardAllocationStrategy has no information about the
previous shardRegionActorX.

The design of cluster sharding is based on each entity type being managed
independently of other entity types. If Customer and Order have a tight
coupling, I think they should be modelled as one aggregate type.

> At the same time, having this sort of control over the clustering of
> several entity types does not seem particularly outrageous. Are we
> missing something?
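To make those pitfalls concrete, steps 1–5 above could be sketched roughly
like this (a hypothetical sketch against the Akka 2.3 contrib API, not a
recommendation; the entity-type tagging and the address comparison are
assumptions, and the crashed-coordinator caveat above still applies — a
coordinator restarted in another JVM starts with an empty map):

```scala
import akka.actor.ActorRef
import akka.contrib.pattern.ShardCoordinator
import akka.contrib.pattern.ShardRegion.ShardId
import scala.collection.concurrent.TrieMap
import scala.collection.immutable

// One instance per entity type, all sharing the same TrieMap.
class SharedAllocationStrategy(
    entityType: String,
    fallback: ShardCoordinator.ShardAllocationStrategy,
    shared: TrieMap[(String, ShardId), ActorRef])
  extends ShardCoordinator.ShardAllocationStrategy {

  override def allocateShard(
      requester: ActorRef,
      shardId: ShardId,
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]]): ActorRef = {
    // step 2: is this shardId already placed for some other entity type?
    val other = shared.collectFirst {
      case ((t, s), ref) if t != entityType && s == shardId => ref
    }
    // step 3: find one of our own regions on the same node -- fraught with
    // peril, e.g. the region local to the coordinator has an ActorRef
    // address without host/port, so a naive comparison misses it.
    val chosen = other
      .flatMap { ref =>
        currentShardAllocations.keys.find(_.path.address == ref.path.address)
      }
      // step 5: otherwise fall back to the wrapped strategy
      .getOrElse(fallback.allocateShard(requester, shardId, currentShardAllocations))
    // maintain the association (step 4's bookkeeping)
    shared.put((entityType, shardId), chosen)
    chosen
  }

  override def rebalance(
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]],
      rebalanceInProgress: Set[ShardId]): Set[ShardId] =
    fallback.rebalance(currentShardAllocations, rebalanceInProgress)
}
```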
You could do a best-effort co-location of associated customer and order
shards by using consistent hashing in the ShardAllocationStrategy. I can
explain more if you find that interesting.

Cheers,
Patrik

> Thanks as always for your thoughts,
> Moritz
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.

--
Patrik Nordwall
Typesafe <http://typesafe.com/> - Reactive apps on the JVM
Twitter: @patriknw
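P.S. A rough sketch of the consistent-hashing idea, for illustration (this
is an assumption-laden sketch against the Akka 2.3 contrib API, not a
definitive implementation; for brevity it uses a stable modulo over the
sorted node list rather than a true hash ring, which would move fewer
shards when membership changes):

```scala
import akka.actor.ActorRef
import akka.contrib.pattern.ShardCoordinator
import akka.contrib.pattern.ShardRegion.ShardId
import scala.collection.immutable

// Both the Customer and the Order region use an instance of this strategy.
// Because the choice is a pure function of (shardId, node set), the two
// coordinators pick the same node for equal ShardIds -- no shared mutable
// state, and it survives coordinator failover. Best effort only: when the
// node set changes, some shards hash to new homes. It also assumes both
// region types run on the same set of nodes, and it glosses over the fact
// that the region ActorRef local to the coordinator has an address without
// host/port, which a real implementation would have to normalize.
class ConsistentHashingAllocationStrategy
  extends ShardCoordinator.ShardAllocationStrategy {

  override def allocateShard(
      requester: ActorRef,
      shardId: ShardId,
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]]): ActorRef = {
    // Sort the candidate regions by address string for a stable order,
    // then pick deterministically from the shardId's hash.
    val regions = currentShardAllocations.keys.toVector
      .sortBy(_.path.address.toString)
    val i = ((shardId.hashCode % regions.size) + regions.size) % regions.size
    regions(i)
  }

  override def rebalance(
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]],
      rebalanceInProgress: Set[ShardId]): Set[ShardId] =
    Set.empty // no proactive rebalancing in this sketch
}
```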
