Please correct me if I'm wrong, but I thought Curator goes into
SUSPENDED mode when it gets a Disconnected state event from its ZK
client. That is not necessarily the same as a network issue, because
the ZK keepalive could be stuck in the ZK server's processing queue,
blocked on a slow disk. What I'm proposing would be a true,
network-only timeout that could be used to declare a client disconnected
quickly if there's a network issue, without having to reduce the ZK
session timeout so low that a slow disk would cause false negatives.
Does that make sense?
Jeremy
On 02/26/2014 09:25 PM, Jordan Zimmerman wrote:
Curator should already go into SUSPENDED when there is a connection
issue, right? How would this be different?
-JZ
------------------------------------------------------------------------
From: Jeremy Stribling <[email protected]>
Reply: [email protected]
Date: February 26, 2014 at 7:56:26 AM
To: [email protected]
Subject: adding a "network timeout" to curator?
Hi all,
I started a thread on the ZK list a while back about timeouts in ZK.
You can find it in the archives here:
http://mail-archives.apache.org/mod_mbox/zookeeper-user/201309.mbox/%[email protected]%3E
The basic idea is that when ZK is running on a node with slow disks
(e.g., in a VM), you might want to set your session timeout to a long
value (e.g., 30 seconds or 60 seconds), but still detect network
timeouts quickly. On that thread, Michi proposed using 'ruok' commands
from the client to test network connectivity, along with the normal
client pings happening in the background to detect server slowness.
I was wondering if this would make sense to provide as part of the
Curator Framework or Client. There could be some background thread
sending 'ruok' commands to whatever server the client is connected to,
and going into SUSPENDED (or LOST?) mode when it hits a timeout or gets
a failure back. We might be able to implement something like that here
and contribute it back, if it sounds interesting to other people and we
can agree on a design. Any thoughts?
Jeremy
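
For illustration, the background 'ruok' probe described in the message above could be sketched roughly as follows. This is only a sketch under stated assumptions, not an existing Curator or ZooKeeper API: the class name RuokProbe, its methods, and the onResult callback are all hypothetical, and how a failed probe would map onto SUSPENDED or LOST is left to the caller. The one thing taken from ZooKeeper itself is the four-letter command: the client writes "ruok" to the server's client port and a healthy server answers "imok".

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not part of Curator or ZooKeeper. */
final class RuokProbe {

    /**
     * Sends the 'ruok' four-letter command to host:port.
     * Returns true only if the server answers "imok"; any connect failure,
     * read timeout, or unexpected reply counts as a failed probe.
     */
    static boolean probe(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // The timeout bounds the connect and, separately, each blocking read.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            OutputStream out = socket.getOutputStream();
            out.write("ruok".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            byte[] reply = new byte[4];
            int read = 0;
            while (read < reply.length) {
                int n = in.read(reply, read, reply.length - read);
                if (n < 0) {
                    break; // server closed the connection early
                }
                read += n;
            }
            return read == reply.length
                    && "imok".equals(new String(reply, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            // Includes SocketTimeoutException and connection refused.
            return false;
        }
    }

    /**
     * Runs probe() on a daemon thread every periodMs milliseconds and passes
     * each result to onResult (assumed to be supplied by the caller, e.g. to
     * flip a "network suspended" flag).
     */
    static ScheduledExecutorService start(String host, int port, int timeoutMs,
                                          long periodMs, Consumer<Boolean> onResult) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "ruok-probe");
            t.setDaemon(true);
            return t;
        });
        exec.scheduleWithFixedDelay(
                () -> onResult.accept(probe(host, port, timeoutMs)),
                0, periodMs, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

With something like this, the probe timeout could stay short (say, a second or two) while the ZK session timeout stays at 30 or 60 seconds, so a slow disk on the server would not by itself trip the network check. A real version would also need to follow the client to whichever server it is currently connected to.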