Hi guys,

We have a use case where ZooKeeper is used to coordinate ownership of the partitions of a resource. When one server dies, its partitions should be moved to another server, and so on. The action we need to take on SessionExpired is clear: we just kill the server.
However, it is unclear what we should do on a SyncDisconnected. We can't just kill our server, as it may only be a single ZooKeeper server that failed. If we block all client requests to our server while we wait for SyncConnected, we may block forever if our server is partitioned away from the ZooKeeper cluster. If we continue to serve requests, we risk split brain [1].

What have people done in the past to resolve issues like this?

-Ivan

[1] This is a risk anyhow without proper fencing, but a limited amount is OK in our application.
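To make the trade-off concrete, here is a minimal sketch of one possible middle ground between the two options above: keep serving for a bounded grace period after a disconnect, then stop serving until the connection returns. This is only an illustration, not an established recipe. The `ZkState` enum, the `ServingPolicy` class, and the `grace_seconds` parameter are all hypothetical names, the event names simply mirror the ones used in this mail, and the sketch assumes the grace period is chosen to be shorter than the ZooKeeper session timeout.

```python
import time
from enum import Enum


class ZkState(Enum):
    # Hypothetical names mirroring the events discussed in this mail;
    # they are not taken from any particular ZooKeeper client library.
    SYNC_CONNECTED = "SyncConnected"
    SYNC_DISCONNECTED = "SyncDisconnected"
    SESSION_EXPIRED = "SessionExpired"


class ServingPolicy:
    """Decides whether this server may keep serving its partitions."""

    def __init__(self, grace_seconds, clock=time.monotonic):
        # Assumption: grace_seconds is shorter than the session timeout,
        # so we stop serving before another server can take ownership.
        self.grace_seconds = grace_seconds
        self.clock = clock
        self.state = ZkState.SYNC_CONNECTED
        self.disconnected_at = None

    def on_event(self, state):
        if self.state is ZkState.SESSION_EXPIRED:
            return  # expiry is terminal; the server is expected to die
        if state is ZkState.SYNC_DISCONNECTED and self.disconnected_at is None:
            # Remember when the connection was first lost.
            self.disconnected_at = self.clock()
        elif state is ZkState.SYNC_CONNECTED:
            self.disconnected_at = None
        self.state = state

    def must_shut_down(self):
        return self.state is ZkState.SESSION_EXPIRED

    def may_serve(self):
        if self.state is ZkState.SYNC_CONNECTED:
            return True
        if self.state is ZkState.SYNC_DISCONNECTED:
            # Serve only while still inside the grace period.
            return self.clock() - self.disconnected_at < self.grace_seconds
        return False
```

A request handler would call `may_serve()` before touching a partition and reject or queue the request when it returns `False`. The window in which split brain is possible is then bounded by the gap between the grace period and the real session expiry, rather than being unbounded.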
