Jay:

It's unnecessary to ensure a client maintains a ZK connection. A heartbeat
mechanism is baked into the ZK session semantics. In other words, there's
no such thing as disconnecting from ZK due to inactivity since, in many
coordination algorithms, liveness (i.e. mere presence) is required for
correct functionality. You can prove this to yourself by reading through
http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkSessions

...although the following paragraph is what you're looking for:

"The session is kept alive by requests sent by the client. If the session
is idle for a period of time that would timeout the session, the client
will send a PING request to keep the session alive. This PING request not
only allows the ZooKeeper server to know that the client is still active,
but it also allows the client to verify that its connection to the
ZooKeeper server is still active. The timing of the PING is conservative
enough to ensure reasonable time to detect a dead connection and reconnect
to a new server."

To be clear, the bug in FLUME-60 is real, but it isn't caused by idle
disconnects. It would be a mistake to try to "manage" the ZK session
yourself. You're not even supposed to handle reconnects yourself with ZK
(because of the herd effect); the client retries internally and then, once
the connection is reestablished, ZK decides whether your session has
expired.
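
As a rough sketch of what that means for application code, here is how a
client might react to session events. Treat it as an illustration under
assumptions, not as the Flume master's actual code: the State enum is a
stand-in that borrows three names from ZooKeeper's
Watcher.Event.KeeperState so the example runs without the ZK jar, and the
Action enum and react() method are hypothetical names made up for this
example.

```java
// Stand-in sketch of reacting to ZK session events.
// State mirrors three names from ZooKeeper's Watcher.Event.KeeperState;
// Action and react() are hypothetical and not part of any ZooKeeper API.
class SessionEventSketch {
    enum State { SyncConnected, Disconnected, Expired }

    enum Action {
        RESUME,            // connection is usable, carry on
        WAIT_FOR_LIBRARY,  // do nothing; the client retries on its own
        REBUILD_HANDLE     // session is gone; a fresh handle is needed
    }

    static Action react(State state) {
        switch (state) {
            case SyncConnected:
                return Action.RESUME;
            case Disconnected:
                // No manual reconnect here: retrying ourselves is what
                // causes the herd effect.
                return Action.WAIT_FOR_LIBRARY;
            case Expired:
                // Assumption: the application responds to expiry by
                // creating a new client handle and re-registering state.
                return Action.REBUILD_HANDLE;
            default:
                throw new IllegalArgumentException("unhandled: " + state);
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + " -> " + react(s));
        }
    }
}
```

The point of the sketch is that only expiry calls for action from the
application; a plain disconnect is left to the client library.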

On Mon, May 7, 2012 at 3:03 PM, Jay Stricks <[email protected]> wrote:

> I'm wondering how people ensure that their masters stay connected to the
> ZooKeeper server during long periods of time when no config changes are
> made. I'm referring specifically to the issues raised in FLUME-60 (
> https://issues.apache.org/jira/browse/FLUME-60):
>
> This seems related to long pauses or breakpoints. Disconnecting from ZK is
> probably reasonable in these conditions, but ideally the connection should
> be recovered.
>
> As an example, after a long pause, a command that modifies ZK state has
> this error message:
>
> Not connected to ZooKeeper: CLOSED
>
>
> I'm trying to think of possible solutions that don't require restarting
> the master. One idea is to have a test agent periodically issue
> configuration statements to each master, but are there any other ideas out
> there?
>
> Thanks,
>
> Jay
>



-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
