Last addendum: I just noticed that we get the same result when cutting the
network between the nodes at the same time:

sudo iptables -I INPUT 1 -s NODE_B -j DROP  # drop packets coming from NODE_B
sudo iptables -I INPUT 1 -d NODE_B -j DROP  # drop packets going to NODE_B
sleep 6
sudo iptables -I INPUT 1 -s NODE_C -j DROP  # drop packets coming from NODE_C
sudo iptables -I INPUT 1 -d NODE_C -j DROP  # drop packets going to NODE_C
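
A note on the rules above: in iptables the INPUT chain only sees packets
addressed to the local host, so on NODE_A a rule with -d NODE_B in INPUT
normally matches nothing, and traffic leaving the host is filtered in the
OUTPUT chain instead. Below is a minimal sketch, assuming the intent is to
block both directions. The helper name partition_rules is hypothetical, and
the script only prints the rules (a dry run) rather than applying them.

```shell
#!/bin/sh
# Dry-run sketch: prints iptables rules instead of applying them.
# partition_rules is a hypothetical helper, not part of iptables or Akka.
partition_rules() {
    peer="$1"
    # Incoming traffic from the peer is filtered in the INPUT chain.
    echo "iptables -I INPUT 1 -s $peer -j DROP"
    # Outgoing traffic to the peer is filtered in the OUTPUT chain.
    echo "iptables -I OUTPUT 1 -d $peer -j DROP"
}

# Print the rules for both peers; NODE_B and NODE_C are placeholders
# for the real addresses, as in the commands above.
for peer in NODE_B NODE_C; do
    partition_rules "$peer"
done
```

Each printed line could then be reviewed and run with sudo on NODE_A. With
only the INPUT rules in place, NODE_A would presumably still be able to send
to the other nodes, which may be relevant to the asymmetry being simulated.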

Cheers,
Francesco

On 22 December 2016 at 16:14, Francesco laTorre <
[email protected]> wrote:

> ... Forgot to mention we are using the freshly baked 2.4.16 :)
>
> On 22 December 2016 at 12:44, Francesco laTorre <
> [email protected]> wrote:
>
>> Hi,
>>
>> We are in production with akka-cluster and have recently been facing some
>> weird behaviour that we managed to reproduce locally.
>> Here is the simplified version :
>>
>>    - Node A :
>>       - active actor cluster-singleton ACT1
>>       - active actor cluster-singleton ACT2
>>       - actor ACT3
>>    - Node B :
>>       - inactive actor cluster-singleton ACT4
>>       - inactive actor cluster-singleton ACT5
>>       - actor ACT6
>>    - Node C :
>>       - inactive actor cluster-singleton ACT7
>>       - inactive actor cluster-singleton ACT8
>>       - actor ACT9
>>
>> Moreover <sin-to-forgive> we cannot yet get rid of auto-downing despite
>> your clear warning </sin-to-forgive>
>>
>> In terms of the failure detector (FD) :
>>
>> cluster {
>>     auto-down-unreachable-after = 120s
>>     failure-detector {
>>         acceptable-heartbeat-pause = 5s
>>         threshold = 12.0
>>     }
>> }
>>
>> Now, to simulate a sort of asymmetric network failure between Node A and
>> the other two, we execute the following commands on NODE_A :
>>
>> sudo iptables -I INPUT 1 -s NODE_B -j DROP  # drop packets coming from NODE_B
>> sleep 3
>> sudo iptables -I INPUT 1 -d NODE_B -j DROP  # drop packets going to NODE_B
>> sleep 3
>> sudo iptables -I INPUT 1 -s NODE_C -j DROP  # drop packets coming from NODE_C
>> sleep 3
>> sudo iptables -I INPUT 1 -d NODE_C -j DROP  # drop packets going to NODE_C
>>
>> We can see that NODE_A is correctly downed but also that, after the gossip
>> messages have propagated, every other node sees all the others as
>> unreachable and the cluster goes down.
>> The dynamics are not always the same; we identified mainly 2-3 patterns.
>>
>> I've read the remoting design principles :
>> http://doc.akka.io/docs/akka/2.4/general/remoting.html#symmetric-communication
>>
>> With the commands shown above, are we violating assumption 1, making this
>> a scenario in which the cluster cannot operate/recover by design ?
>>
>> Cheers,
>> Francesco
>>
>>
>
>
