On 09/01/2010 01:40 PM, Fournier, Camille F. [Tech] wrote:
Hmm. Tying the session directly to the connection basically reverses
my problem: I used to have crashed clients' ephemeral nodes visible
for the duration of the session timeout, now I have partitioned
clients' ephemeral nodes doubled for the duration of the session
timeout. Of course, I expect partitions far less frequently than
crashing/killed clients, so I suspect that is a cost I would
willingly pay.


I was suggesting that the session be tied directly to the connection: as soon as the connection is lost, the session would be cleaned up (expired) and the ephemerals deleted. This seemed to me to be a solution to your problem: if the client crashes, the ephemerals are cleaned up immediately; if the client goes into a GC pause, the connection (and therefore the session) is maintained.

If the session is tied directly to the connection it would have some
fairly big implications for the client-side ZooKeeper code, eh? Instead
of just seamlessly reconnecting upon zk server node failure you'd
have to recreate your old session state completely. My own wrapping
of the zkclient would handle this, but it's not completely trivial
and increases the work on the zk servers when a failure occurs.


Nothing that you don't already have to handle. If the session is expired you have to do this today regardless. The difference is that clients using this feature lose the benefit of session re-establishment and always drop back to full session recreation in cases of network failure. TANSTAAFL. :-) But unless I'm misunderstanding your original request, this solves your problem as originally stated.
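
A minimal sketch of the "full session recreation" path discussed here, assuming a client wrapper that remembers which ephemerals it has registered. FakeZk, SessionExpired and RecoveringClient are hypothetical in-memory stand-ins invented for illustration; they are not the ZooKeeper client API.

```python
class SessionExpired(Exception):
    """Raised by the toy server when a request uses a dead session."""


class FakeZk:
    """Hypothetical in-memory stand-in for an ensemble (not ZooKeeper code)."""

    def __init__(self):
        self.nodes = {}      # path -> (owning session id, data)
        self.live = set()    # session ids that have not expired
        self.next_id = 1

    def new_session(self):
        sid = self.next_id
        self.next_id += 1
        self.live.add(sid)
        return sid

    def expire(self, sid):
        # Expiring a session deletes every ephemeral it owns.
        self.live.discard(sid)
        self.nodes = {p: v for p, v in self.nodes.items() if v[0] != sid}

    def create_ephemeral(self, sid, path, data):
        if sid not in self.live:
            raise SessionExpired(sid)
        self.nodes[path] = (sid, data)


class RecoveringClient:
    """Assumed wrapper: tracks its ephemerals so it can rebuild them."""

    def __init__(self, zk):
        self.zk = zk
        self.sid = zk.new_session()
        self.registered = {}  # path -> data, survives session loss

    def register(self, path, data):
        self.registered[path] = data
        try:
            self.zk.create_ephemeral(self.sid, path, data)
        except SessionExpired:
            self.recover()

    def recover(self):
        # Full recreation: new session, then re-create every ephemeral.
        self.sid = self.zk.new_session()
        for path, data in self.registered.items():
            self.zk.create_ephemeral(self.sid, path, data)
```

The same recover() step runs whether the session expired by timeout (today's behaviour) or because it was bound to a lost connection (the proposal). What changes under the proposal is how often the client ends up on that path, and the extra create calls are the added server work mentioned in the thread.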

Patrick

C

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Wednesday, September 01, 2010 3:47 PM
To: zookeeper-user@hadoop.apache.org
Cc: Benjamin Reed
Subject: Re: closing session on socket close vs waiting for timeout

Ben, in this case the session would be tied directly to the
connection, we'd explicitly deny session re-establishment for this
session type (so 4 would fail). Would that address your concern,
others?

Patrick

On 09/01/2010 10:03 AM, Benjamin Reed wrote:
i'm a bit skeptical that this is going to work out properly. a
server may receive a socket reset even though the client is still
alive:

1) client sends a request to a server
2) client is partitioned from the server
3) server starts trying to send response
4) client reconnects to a different server
5) partition heals
6) server gets a reset from client

at step 6 i don't think you want to delete the ephemeral nodes.
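
The six steps above can be replayed against a toy model to compare the policies discussed in this thread. This is only a sketch under stated assumptions: ToyEnsemble, its two flags, and the node paths are hypothetical names made up for illustration, and the model ignores timeouts and real networking entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    server: str = ""                       # server the session is attached to
    ephemerals: set = field(default_factory=set)


class ToyEnsemble:
    """Hypothetical model of session handling; not ZooKeeper code."""

    def __init__(self, expire_on_reset, allow_reconnect):
        self.expire_on_reset = expire_on_reset   # socket reset kills session
        self.allow_reconnect = allow_reconnect   # session may move servers
        self.sessions = {}

    def connect(self, sid, server):
        self.sessions[sid] = Session(server=server)

    def create_ephemeral(self, sid, path):
        self.sessions[sid].ephemerals.add(path)

    def reconnect(self, sid, server):
        # Step 4: the client asks another server to resume its session.
        if sid not in self.sessions or not self.allow_reconnect:
            return False
        self.sessions[sid].server = server
        return True

    def socket_reset(self, sid):
        # Step 6: the original server finally sees the old socket reset.
        if self.expire_on_reset and sid in self.sessions:
            del self.sessions[sid]

    def live_ephemerals(self):
        return sorted(p for s in self.sessions.values() for p in s.ephemerals)


def run_scenario(expire_on_reset, allow_reconnect):
    ens = ToyEnsemble(expire_on_reset, allow_reconnect)
    ens.connect("s1", "A")
    ens.create_ephemeral("s1", "/members/c1")
    # Steps 1-3: request sent, partition, server A stuck sending a response.
    if not ens.reconnect("s1", "B"):       # step 4
        # Re-establishment denied: the client starts a fresh session.
        ens.connect("s2", "B")
        ens.create_ephemeral("s2", "/members/c1-new")
    after_step4 = ens.live_ephemerals()
    ens.socket_reset("s1")                 # steps 5-6
    return after_step4, ens.live_ephemerals()
```

In this model, run_scenario(True, True) ends with no ephemerals at all even though the client is alive, which is the concern raised at step 6. run_scenario(True, False), the variant that denies re-establishment, shows two nodes after step 4 (the "doubled" ephemerals) and only the new one after step 6. run_scenario(False, True), the current timeout-based behaviour, keeps the original node throughout.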

ben

On 08/31/2010 01:41 PM, Fournier, Camille F. [Tech] wrote:
Yes that's right. Which network issues can cause the socket to
close without the initiating process closing the socket? In my
limited experience in this area network issues were more prone to
leave dead sockets open rather than vice versa so I don't know
what to look out for.

Thanks, Camille

-----Original Message-----
From: Dave Wright [mailto:wrig...@gmail.com]
Sent: Tuesday, August 31, 2010 1:14 PM
To: zookeeper-user@hadoop.apache.org
Subject: Re: closing session on socket close vs waiting for timeout

I think he's saying that if the socket closes because of a crash
(i.e. not a normal zookeeper close request) then the session
stays alive until the session timeout, which is of course true
since ZK allows reconnection and resumption of the session in
case of disconnect due to network issues.

-Dave Wright

On Tue, Aug 31, 2010 at 1:03 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:

That doesn't sound right to me.

Is there a Zookeeper expert in the house?

On Tue, Aug 31, 2010 at 8:58 AM, Fournier, Camille F. [Tech] <camille.fourn...@gs.com> wrote:

I foolishly did not investigate the ZK code closely enough
and it seems that closing the socket still waits for the
session timeout to remove the session.
