hey Michi,
I'll have to double check the logs to see if the client got a session
expired event, but I would presume so because the ephemeral nodes lying
around had a different session ID. I guess it's a possibility that the old
connection stayed open, and a new one was also created, but I don't believe
this to be the case.
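
For anyone wanting to check the same thing: the session-to-ephemeral mapping
that 'dump' prints (as in the telnet step quoted below) can also be fetched
from a script. This is only a minimal Python sketch, assuming the server
answers four-letter commands on the client port. The function name
four_letter_word and the localhost:2181 address are placeholders of mine,
not anything that ships with ZooKeeper.

```python
import socket


def four_letter_word(host: str, port: int, command: str = "dump",
                     timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # The command is sent as plain ASCII, exactly as typed into telnet.
        conn.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling four_letter_word("localhost", 2181) should return the same text that
telnet shows. As far as I know, 'dump' only gives a useful answer on the
current leader, so it may need to be pointed at each server in turn.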
cheers


On Thu, May 15, 2014 at 12:41 PM, Michi Mutsuzaki <[email protected]> wrote:

> Hi Cameron,
>
> Did the client get the session expired event? Sessions don't expire
> during quorum loss, and I'm guessing the session got revalidated when
> the cluster reformed a quorum.
>
>
> On Thu, May 8, 2014 at 3:31 AM, Cameron McKenzie <[email protected]>
> wrote:
> > Sorry, bashed send prematurely!
> >
> > Guys,
> > I've noticed a weird problem with ephemeral nodes not being cleaned up if
> > the session they are tied to times out while ZooKeeper does not have a
> > quorum. The situation is basically as follows:
> >
> > 3 node cluster
> > -Client connects to cluster and creates an ephemeral node
> > -Two nodes die, so quorum is lost
> > -Some time passes (longer than the session timeout negotiated for the
> > client that created the ephemeral node)
> > -One (or both) of the dead nodes come back and a quorum is reformed.
> > -The ephemeral node tied to the session which should have timed out still
> > exists and never seems to get cleaned up.
> > -If I telnet in on port 2181 and 'dump', then I can see that ZK seems to
> > think that the session is still active and associated with the ephemeral
> > node in question.
> > -It seems to stay in this state for some extended period of time (20+
> > minutes). Interestingly, when I happened to fire up zkCli.sh I could see
> > that the node was still there, but after I exited, the node seemed to
> > disappear shortly afterwards. So, I wonder if ending the session
> > established by zkCli.sh somehow triggered the cleanup of this rogue
> > ephemeral node?
> >
> > Has anyone experienced this issue before? I understand that it's a bit
> > of an edge case, but I'm running across it quite frequently when testing
> > changes to the size of the ZK cluster.
> >
> > I've thought of a few workarounds for the issue, but I'd like to know if
> > it's a known issue.
> >
> > Any help appreciated!
> > cheers
> >
> >
> >
> > On Thu, May 8, 2014 at 8:15 PM, Cameron McKenzie <[email protected]>
> > wrote:
> >
> >> Guys,
> >> I've noticed a weird problem with ephemeral nodes not being cleaned up
> >> if the session they are tied to times out while ZooKeeper does not have a
> >> quorum. The situation is basically as follows:
> >>
> >> 3 node cluster
> >> -Client connects to cluster and creates an ephemeral node
> >> -Two nodes die, so quorum is lost
> >> -Some time passes (longer than the session timeout negotiated for the
> >> client that created the ephemeral node)
> >> -One (or both) of the dead nodes come back and a quorum is reformed.
> >> -The ephemeral node tied to the session which should have timed out
> >> still exists
> >>
> >>
>
