Hi,

I have a Helix cluster set up with a resource in MASTER-SLAVE configuration
with 3 replicas. We are testing what happens when a client temporarily
loses network connectivity to the ZK cluster. The observations are:
1. The client continues to think it is MASTER (a DISCONNECT event is
triggered in the background and reconnecting messages are seen).
2. The controller makes another node MASTER.
3. Once the client is able to reconnect, a session-expired event is
triggered and reset is called on the MASTER-SLAVE state model, moving the
partitions to the OFFLINE state.

I would like the node to stop serving requests as MASTER as soon as the
client detects that the session timeout has elapsed locally, even though
the session may not yet have expired on the server. The Curator framework
injects a session timeout event based on a percentage of the negotiated
timeout, and I was hoping that something similar could be configured in
Helix. Otherwise there is a window in which multiple nodes think they are
MASTER.
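
To make the request concrete, here is a minimal sketch of the behavior I am
after. It is plain Java and does not use any Helix or ZooKeeper API: the
class name DisconnectWatchdog and its methods are hypothetical, and I am
assuming the caller feeds it the Disconnected/SyncConnected events plus a
monotonic clock reading in milliseconds.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of Helix or ZooKeeper): tracks how long the
 * ZK connection has been down and reports when a chosen fraction of the
 * negotiated session timeout has elapsed.
 */
final class DisconnectWatchdog {
    // Sentinel meaning "currently connected".
    private static final long CONNECTED = Long.MIN_VALUE;

    private final long thresholdMs;
    private final AtomicLong disconnectedAtMs = new AtomicLong(CONNECTED);

    DisconnectWatchdog(long negotiatedSessionTimeoutMs, double fraction) {
        if (negotiatedSessionTimeoutMs <= 0 || fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException(
                "timeout must be > 0 and fraction must be in (0, 1]");
        }
        this.thresholdMs = (long) (negotiatedSessionTimeoutMs * fraction);
    }

    /** Call on a Disconnected event; repeated calls keep the earliest timestamp. */
    void onDisconnected(long nowMs) {
        disconnectedAtMs.compareAndSet(CONNECTED, nowMs);
    }

    /** Call on a SyncConnected event. */
    void onReconnected() {
        disconnectedAtMs.set(CONNECTED);
    }

    /** True once the connection has been down for at least the threshold. */
    boolean shouldStepDown(long nowMs) {
        long since = disconnectedAtMs.get();
        return since != CONNECTED && nowMs - since >= thresholdMs;
    }
}
```

A participant would poll shouldStepDown(...) from a timer and stop serving
as MASTER when it returns true. This only narrows the dual-master window on
the client side; it does not change when the server expires the session.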

Thanks,
Imran
