On Thu, Aug 17, 2017 at 11:19 AM, Igor Baltiyskiy <[email protected]> wrote:

> Thanks Patrik!
>
> One last question --- what should we watch out for when we have 10000
> entities per node role? Network usage, obviously, what else?
>

You should not put each entity as its own top-level entry in the Replicator,
and you should not keep them all in one ORMap either. Split them over a
reasonable number of ORMaps (hashing again). See how we do it in Cluster Sharding:
https://github.com/akka/akka/blob/master/akka-cluster-sharding/src/main/scala/akka/cluster/sharding/Shard.scala#L594
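
For example, something like this (just a sketch; the number of keys, the
"cache-" prefix and the LWWMap value type are arbitrary choices for
illustration, any of the DData map types splits the same way):

import akka.cluster.ddata.LWWMapKey

// Derive a stable top-level key from the entity id, so that the ~10000
// entities per role are spread over a bounded number of top-level entries.
val numberOfTopLevelKeys = 100

def dataKey(entityId: String): LWWMapKey[String, String] =
  LWWMapKey("cache-" + math.abs(entityId.hashCode % numberOfTopLevelKeys))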


>
> Thanks again
> Igor
>
> On Wednesday, August 16, 2017 at 4:11:09 PM UTC+3, Patrik Nordwall wrote:
>
>>
>>
>> On Wed, Aug 16, 2017 at 1:28 PM, Igor Baltiyskiy <[email protected]>
>> wrote:
>>
>>> Hello Patrik,
>>>
>>> thanks for the reply. Could you please clarify a few points:
>>>
>>> 1. How are Replicators tied to node roles?
>>>
>>
>> Replicator.props takes ReplicatorSettings, which contains a role property.
>>
>>
>>> Can I start more than 1 Replicator on a node?
>>>
>>
>> Yes, just start it as an ordinary actor. Make sure that you use the same
>> actor name on other nodes that it should interact with.
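>>
>> For example (a minimal sketch; the role name and the actor name scheme are
>> just illustrative):
>>
>> import akka.actor.ActorSystem
>> import akka.cluster.ddata.{ Replicator, ReplicatorSettings }
>>
>> val system = ActorSystem("MyCluster")
>> val role = "role-1"
>>
>> // One Replicator per role, started on the nodes that have that role.
>> // The actor name must be the same on all nodes that should gossip with it.
>> val settings = ReplicatorSettings(system).withRole(role)
>> val replicator = system.actorOf(Replicator.props(settings), s"replicator-$role")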
>>
>>
>>> If so, can I start only as many Replicators as the roles this node has?
>>>
>>
>> Yes, that was my idea
>>
>>
>>> 2. If a node has K roles, does it mean that its K replicators gossip
>>> independently of each other?
>>>
>>
>> Yes
>>
>>
>>> 3. In the last scenario --- one consistent hashing group router per role
>>> --- why do routees subscribe to changes from DData? Shouldn't DData be
>>> replicated across all nodes with role_i? If so, they can simply read the
>>> data if they are on the node with the right role.
>>>
>>
>> Yes, they can read instead, but then you would need to know when to read.
>> Perhaps you do that for each request; that would also work.
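>>
>> Reading per request could look roughly like this (sketch only; CacheReader,
>> the "read" message and the key scheme are made-up for illustration):
>>
>> import akka.actor.{ Actor, ActorRef }
>> import akka.cluster.ddata.LWWMapKey
>> import akka.cluster.ddata.Replicator.{ Get, GetSuccess, NotFound, ReadLocal }
>>
>> class CacheReader(entityId: String, replicator: ActorRef) extends Actor {
>>   private val key =
>>     LWWMapKey[String, String]("cache-" + math.abs(entityId.hashCode % 100))
>>
>>   def receive = {
>>     case "read" =>
>>       replicator ! Get(key, ReadLocal, Some(sender()))
>>     case g @ GetSuccess(`key`, Some(replyTo: ActorRef)) =>
>>       replyTo ! g.get(key).get(entityId) // Option value for this entity
>>     case NotFound(`key`, Some(replyTo: ActorRef)) =>
>>       replyTo ! None
>>   }
>> }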
>>
>>
>>>
>>> Thanks!
>>> Igor
>>>
>>> On Wednesday, August 16, 2017 at 1:14:31 PM UTC+3, Patrik Nordwall wrote:
>>>>
>>>> You can use Cluster Sharding and DData with roles. So, let's say that
>>>> you go with 10 roles, with 10,000 entities in each role. You would then
>>>> start Replicators on the nodes with the corresponding roles. You would also
>>>> start Sharding on the nodes with the corresponding roles. On a node that
>>>> doesn't have a given role you would start a sharding proxy for that role.
>>>>
>>>> When you want to send a message to an entity you first need to decide
>>>> which role to use for that message. That can be a simple hashCode modulo
>>>> algorithm. Then you delegate the message to the corresponding Sharding
>>>> region or proxy actor.
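>>>>
>>>> Roughly like this (just a sketch; the role naming, Envelope and the
>>>> extractors are illustrative, not a prescription):
>>>>
>>>> import akka.actor.{ ActorRef, ActorSystem, Props }
>>>> import akka.cluster.Cluster
>>>> import akka.cluster.sharding.{ ClusterSharding, ClusterShardingSettings }
>>>> import akka.cluster.sharding.ShardRegion
>>>>
>>>> val numberOfRoles = 10
>>>> def roleFor(entityId: String): String =
>>>>   "role-" + math.abs(entityId.hashCode % numberOfRoles)
>>>>
>>>> final case class Envelope(entityId: String, payload: Any)
>>>>
>>>> val extractEntityId: ShardRegion.ExtractEntityId = {
>>>>   case Envelope(id, payload) => (id, payload)
>>>> }
>>>> val extractShardId: ShardRegion.ExtractShardId = {
>>>>   case Envelope(id, _) => math.abs(id.hashCode % 100).toString
>>>> }
>>>>
>>>> // On every node: start the region if the node has the role, otherwise a proxy.
>>>> def startRegionOrProxy(system: ActorSystem, role: String,
>>>>     entityProps: Props): ActorRef =
>>>>   if (Cluster(system).selfRoles.contains(role))
>>>>     ClusterSharding(system).start(
>>>>       typeName = s"Entity-$role",
>>>>       entityProps = entityProps,
>>>>       settings = ClusterShardingSettings(system).withRole(role),
>>>>       extractEntityId = extractEntityId,
>>>>       extractShardId = extractShardId)
>>>>   else
>>>>     ClusterSharding(system).startProxy(
>>>>       typeName = s"Entity-$role",
>>>>       role = Some(role),
>>>>       extractEntityId = extractEntityId,
>>>>       extractShardId = extractShardId)
>>>>
>>>> // Sending: pick the role from the entity id and delegate to that region/proxy.
>>>> def send(regions: Map[String, ActorRef], entityId: String, msg: Any): Unit =
>>>>   regions(roleFor(entityId)) ! Envelope(entityId, msg)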
>>>>
>>>> You have defined the Props for the entities, and there you pass in the
>>>> Replicator corresponding to the role that the entity belongs to, i.e. the
>>>> entity takes the right Replicator ActorRef as a constructor parameter.
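>>>>
>>>> In code, roughly (CacheEntity is an illustrative name):
>>>>
>>>> import akka.actor.{ Actor, ActorRef, Props }
>>>>
>>>> class CacheEntity(replicator: ActorRef) extends Actor {
>>>>   // talk to the role-specific Replicator with Get/Update/Subscribe here
>>>>   def receive = Actor.emptyBehavior
>>>> }
>>>>
>>>> def entityProps(replicatorForRole: ActorRef): Props =
>>>>   Props(new CacheEntity(replicatorForRole))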
>>>>
>>>> If you don't need the strict guarantees of "only one entity" that
>>>> Cluster Sharding provides, and prefer better availability in case of
>>>> network partitions, you could use a consistent hashing group router instead
>>>> of Cluster Sharding. You would have one router per role, and decide which
>>>> router to use in the same way as above. Then the entities (routees of the
>>>> router) would have to subscribe to changes from DData to get notified when
>>>> a peer entity has changed something, since you can have more than one alive
>>>> at the same time.
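>>>>
>>>> A routee could subscribe like this (sketch; a String-valued LWWMap as the
>>>> cache type is just an example):
>>>>
>>>> import akka.actor.{ Actor, ActorRef }
>>>> import akka.cluster.ddata.LWWMapKey
>>>> import akka.cluster.ddata.Replicator.{ Changed, Subscribe }
>>>>
>>>> class CacheRoutee(replicator: ActorRef, key: LWWMapKey[String, String])
>>>>     extends Actor {
>>>>   override def preStart(): Unit = replicator ! Subscribe(key, self)
>>>>
>>>>   private var cache = Map.empty[String, String]
>>>>
>>>>   def receive = {
>>>>     case c @ Changed(`key`) =>
>>>>       cache = c.get(key).entries // merged view, including peer updates
>>>>   }
>>>> }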
>>>>
>>>> Cheers,
>>>> Patrik
>>>>
>>>> On Wed, Aug 16, 2017 at 11:46 AM, Igor Baltiyskiy <[email protected]>
>>>> wrote:
>>>>
>>>>> After some thought, I could instead do "master-master" replication,
>>>>> because each entry in the cache is versioned and newer versions always
>>>>> trump older ones --- kind of an LWWMap with versions as timestamps. In that
>>>>> scenario, I'd have several actors for each entity, and each of them would
>>>>> be able to initiate a download of the data, write it to its cache, and then
>>>>> send its values to its peer, which would trivially merge them. Clients
>>>>> reading the cache would specify a required version, and if the actor
>>>>> instance doesn't have a value that recent, it would initiate a download
>>>>> from the external source.
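>>>>>
>>>>> Roughly, the merge rule I have in mind (illustrative types):
>>>>>
>>>>> final case class Versioned[A](version: Long, value: A)
>>>>>
>>>>> def merge[A](local: Map[String, Versioned[A]],
>>>>>              remote: Map[String, Versioned[A]]): Map[String, Versioned[A]] =
>>>>>   remote.foldLeft(local) { case (acc, (key, incoming)) =>
>>>>>     acc.get(key) match {
>>>>>       case Some(existing) if existing.version >= incoming.version => acc
>>>>>       case _ => acc.updated(key, incoming)
>>>>>     }
>>>>>   }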
>>>>>
>>>>> However, the question remains how to merge the values after a netsplit
>>>>> heals. Cluster sharding isn't recommended for use with auto-downing, but
>>>>> seemingly only because there may be several instances of the entity actors
>>>>> --- which I'm OK with. The only problem is to detect when the partition
>>>>> heals and to find the peer. I think I'd listen to cluster membership
>>>>> changes and ask the added node whether it has this actor (by path), and if
>>>>> it does, exchange entries with it.
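>>>>>
>>>>> Something like this is what I mean (sketch; the actor path layout and the
>>>>> "exchange-entries" message are made up):
>>>>>
>>>>> import akka.actor.{ Actor, ActorIdentity, Identify }
>>>>> import akka.cluster.Cluster
>>>>> import akka.cluster.ClusterEvent.{ CurrentClusterState, MemberUp }
>>>>>
>>>>> class PeerDetector(entityId: String) extends Actor {
>>>>>   override def preStart(): Unit =
>>>>>     Cluster(context.system).subscribe(self, classOf[MemberUp])
>>>>>   override def postStop(): Unit =
>>>>>     Cluster(context.system).unsubscribe(self)
>>>>>
>>>>>   def receive = {
>>>>>     case _: CurrentClusterState => // ignore the initial state snapshot
>>>>>     case MemberUp(member) =>
>>>>>       // ask the (re)joined node whether it hosts a peer for this entity
>>>>>       context.actorSelection(
>>>>>         s"${member.address}/user/entities/$entityId") ! Identify(entityId)
>>>>>     case ActorIdentity(`entityId`, Some(peer)) =>
>>>>>       peer ! "exchange-entries"
>>>>>     case ActorIdentity(`entityId`, None) =>
>>>>>       // no peer for this entity on that node
>>>>>   }
>>>>> }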
>>>>>
>>>>> Do you see any obvious downsides to that approach?
>>>>>
>>>>> Thanks,
>>>>> Igor
>>>>>
>>>>>
>>>>> On Wednesday, August 16, 2017 at 4:54:32 AM UTC+3, Igor Baltiyskiy
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm wondering whether "sharded replication" is possible with Akka.
>>>>>> Let me describe that in more detail.
>>>>>>
>>>>>> In my model, entities contain caches that are very expensive to
>>>>>> recreate from scratch (because they cache results of multiple calls to
>>>>>> several external systems). So I can't just use cluster sharding, because
>>>>>> that would result in one actor per entity, and when the node where that
>>>>>> actor is running goes down, the data is lost. On the other hand, since
>>>>>> this data still can be fetched, I don't want to persist it. What I want is
>>>>>> to replicate each cache across a few nodes in the cluster.
>>>>>>
>>>>>> After reading the documentation, I initially thought about
>>>>>> "master-slave replication": for each entity actor, set up a router that
>>>>>> manages a pool of worker actors that receive new values whenever the actor
>>>>>> updates its cache. Then clients of this entity would be load-balanced
>>>>>> across these worker actors. (The clients are OK with non-monotonic reads.)
>>>>>> Whenever the "master" actor fails, one of the slaves should be promoted.
>>>>>> Note that though clients are OK with non-monotonic reads, writes must
>>>>>> be monotonic: older values in the cache must not overwrite newer values.
>>>>>> So promotion would require some complex merging of slave data.
>>>>>> A similar problem is behaviour during netsplit: after the partition
>>>>>> heals, masters need to merge their caches.
>>>>>> And I glossed over the promotion details, and how the clients locate
>>>>>> the actors after the master fails, etc. All in all, this case doesn't seem
>>>>>> to be handled by cluster sharding and routers out of the box.
>>>>>>
>>>>>> So I turned to distributed data: I might represent the cache as a
>>>>>> CRDT and that would fix the merging problems above. However, there are
>>>>>> several new problems:
>>>>>> - I don't want to have all caches replicated on all nodes in the
>>>>>> cluster. Rather, I'd like something like Riak, where a particular entity
>>>>>> is replicated across n nodes, with nodes chosen according to some rule
>>>>>> like consistent hashing.
>>>>>> From the docs, I get the impression that it is possible to define
>>>>>> more than one Replicator per node:
>>>>>>
>>>>>> > [Replicator] communicates with other Replicator instances with the
>>>>>> same path (without address) that are running on other nodes.
>>>>>>
>>>>>> > Cluster Sharding is using its own Distributed Data Replicator per
>>>>>> node role. In this way you can use a subset of all nodes for some entity
>>>>>> types and another subset for other entity types.
>>>>>>
>>>>>> If so, I would have a Replicator per entity. Is that correct? And
>>>>>> what are the practical limits on the number of different Replicators ---
>>>>>> per node, per cluster? My estimate is that the maximum possible number of
>>>>>> entities is on the order of 100 000.
>>>>>> - Do different Replicator instances gossip independently of each
>>>>>> other, or is it a node-wide activity?
>>>>>> - If my understanding is correct, I can specify which nodes will host
>>>>>> the replicas by starting Replicator actors on these nodes with the path
>>>>>> containing the entity ID. How do I select the nodes? I could perhaps use a
>>>>>> cluster-aware router (e.g. ConsistentHashingGroup) to handle that: I'd
>>>>>> have to death-watch actor instances to manage this Replicator's lifecycle.
>>>>>> Is this approach good practice?
>>>>>>
>>>>>> Thanks
>>>>>> Igor
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



-- 

Patrik Nordwall
Akka Tech Lead
Lightbend <http://www.lightbend.com/> -  Reactive apps on the JVM
Twitter: @patriknw
