Re: [akka-user] ClusterSharding: question about shard recovery

haghard Fri, 17 Oct 2014 13:15:37 -0700

Hi Konrad,
Sorry for the missing context; I'll try to provide it now.

I have a cluster with two types of nodes (roles), "A" and "B".
On start, each "B" node runs ClusterSharding(system) and creates N domain
actors (PersistentActor) simply by sending N messages, each with a
predefined key, to the shardRegion actor.
I also start ProcessCoordinator as a ClusterSingleton actor on the nodes
with the "B" role.


Before each processing iteration, ProcessCoordinator polls the shardRegion
actor and counts the number of responses.
If the number of responses matches N, then progress is possible.
That is how I check that the sharded actors are present and how many of
them there are.
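
A minimal sketch of this readiness check in plain Java (no Akka API is used here). It assumes each response carries the id of the actor that sent it, which is an assumption and not something stated above; the names ReadinessCheck, missing and progressPossible are hypothetical. Comparing ids instead of counting replies means a duplicate reply cannot be mistaken for a missing actor's reply, and it also shows which actors did not answer.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Akka: decides whether all expected
// sharded actors answered the poll.
final class ReadinessCheck {

    // Returns the expected ids for which no response was received.
    static Set<String> missing(Set<String> expectedIds, Collection<String> respondedIds) {
        Set<String> missing = new TreeSet<>(expectedIds);
        missing.removeAll(respondedIds);
        return missing;
    }

    // Progress is possible only when every expected actor has responded.
    // Duplicate responses do not affect the result.
    static boolean progressPossible(Set<String> expectedIds, Collection<String> respondedIds) {
        return missing(expectedIds, respondedIds).isEmpty();
    }
}
```

With expected ids {a, b, c}, the responses [a, a, b] contain three replies but progressPossible returns false, and missing returns {c}.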

If the node with ProcessCoordinator goes down, the ProcessCoordinator actor
automatically starts on another "B" node and polls the shardRegion actor
again to make sure all PersistentActors are ready.

During resilience testing I found this:

Suppose we have nodes B-1, B-2 and B-3 with the following initial shard
state (20, 10 and 0 are the numbers of domain actors on each node):
| B-1 - 20 + ProcessCoordinator  |   B-2 - 10  |   B-3 - 0 |

Now I repeatedly kill the node that runs ProcessCoordinator, bringing the
previously killed node back up before each kill:

1 Action: Kill B-1
1 Result state: |  B-2 - 10 | B-3 - 20 + ProcessCoordinator |    : total 30

2 Action: Up B-1  Kill B-3
2 Result state:  |  B-1 - 10 | B-2 - 20 +  ProcessCoordinator |  : total 30


3 Action: Up B-3  Kill B-2
3 Result state: | B-1 - 10 + ProcessCoordinator | B-3 - 0    |  : total 10
Progress impossible

This is where I am stuck.
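
The result states above can be tallied against the expected total of 30 as a quick sanity check. This is only a sketch in plain Java with hypothetical names (ShardTally, total, lost); the per-node counts are the ones reported in the steps above, nothing here queries a real cluster.

```java
import java.util.Map;

// Hypothetical helper, not part of Akka: sums reported per-node actor counts.
final class ShardTally {

    // Total number of domain actors across all nodes of one cluster state.
    static int total(Map<String, Integer> actorsPerNode) {
        int sum = 0;
        for (int count : actorsPerNode.values()) {
            sum += count;
        }
        return sum;
    }

    // Number of actors unaccounted for, given the expected total N.
    static int lost(int expected, Map<String, Integer> actorsPerNode) {
        return expected - total(actorsPerNode);
    }
}
```

For the state after step 3 (B-1 with 10, B-3 with 0) and N = 30, total gives 10 and lost gives 20, which matches the 20 actors that were on B-2 when it was killed.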




On Friday, 17 October 2014 at 18:55:26 UTC+4, Konrad Malawski wrote:
>
> Hi Vadim,
> I'm not exactly sure what you mean by "on other nodes with the same role".
> How did you setup cluster sharding?
> When do you expect the shard to be brought back?
>
> I'll happily help but seem to be missing context here a bit.
>
> On Wed, Oct 15, 2014 at 4:52 PM, Vadim Bondarev <[email protected]> wrote:
>
>> Hi all,
>>
>> I developed an application with Akka Cluster and ClusterSharding + Akka 
>> Persistence (for domain actors) as a POC.
>> During resilience testing I found what I think is a problem.
>> I have several nodes with some role in the cluster with sharded domain 
>> actors.
>> Sometimes after a node crash some shards do not recover on the other nodes 
>> with the same role, and I can't find the particular conditions for this 
>> situation.
>>
>>
>> Regards
>> Vadim Bondarev
>>
>> --
>> >>>>>>>>>>      Read the docs: http://akka.io/docs/
>> >>>>>>>>>>      Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>>      Search the archives: 
>> https://groups.google.com/group/akka-user
>> ---
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Cheers,
> Konrad 'ktoso' Malawski
> hAkker @ Typesafe
>
> <http://typesafe.com>
>  
