Re: Ephemeral nodes not removed

2019-08-02 Thread Patrick Hunt
The JIRA you referenced is the only one that comes to mind. In terms of
troubleshooting, try connecting a client to each of the servers in turn
and see if it's a situation where they have a different view of the world
with respect to those znodes. You might also have the client create
separate znodes on each server and ensure that they are consistent. The
logs are also typically a good source of information: search them for the
session id that owns the stale znodes.
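
As a rough illustration of the per-server check described above, here is a
minimal Python sketch. It assumes the third-party kazoo client library is
installed; the server addresses and znode paths in the usage note are
hypothetical placeholders, and the helper names fetch_view and
find_disagreements are made up for this example. It connects to one server
at a time, records which session owns each znode, and reports any path on
which the servers disagree.

```python
def fetch_view(server, paths, timeout=10.0):
    """Return {path: ephemeralOwner or None} as seen through ONE server.

    `server` is a single "host:port" string so the client cannot fail
    over to another ensemble member. None means the znode does not exist.
    """
    # Third-party dependency (assumed installed); imported lazily so the
    # comparison helper below can be used without it.
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=server, timeout=timeout)
    zk.start(timeout=timeout)
    try:
        view = {}
        for path in paths:
            # Ask this server to catch up with the leader first, so that
            # ordinary follower lag is not mistaken for a real divergence.
            zk.sync(path)
            stat = zk.exists(path)
            view[path] = None if stat is None else stat.ephemeralOwner
        return view
    finally:
        zk.stop()
        zk.close()


def find_disagreements(views):
    """Given {server: {path: owner}}, return the paths whose owner differs.

    The result maps each disputed path to {server: owner}, where owner is
    the ephemeralOwner session id (0 for a persistent znode) or None if
    that server does not have the znode at all.
    """
    all_paths = set()
    for view in views.values():
        all_paths.update(view)

    disputed = {}
    for path in sorted(all_paths):
        per_server = {server: view.get(path) for server, view in views.items()}
        if len(set(per_server.values())) > 1:
            disputed[path] = per_server
    return disputed


def format_owner(owner):
    """Render a session id in hex, the form it usually takes in server logs."""
    return "absent" if owner is None else hex(owner)
```

A possible way to use it, with placeholder host names and paths:
build views = {s: fetch_view(s, ["/app/locks/worker-1"]) for s in
["zk1:2181", "zk2:2181", "zk3:2181"]}, pass it to find_disagreements, and
then search each server's log for the hex session ids that
format_owner prints.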

Patrick


On Wed, Jul 31, 2019 at 11:29 PM John Lindwall wrote:

> ZooKeeper 3.4.6-1569965
>
> In our environment we seem to have a situation where ephemeral znodes
> are not getting removed after the zookeeper session has been
> terminated.  We can see examples of znodes that were created 3-4 days
> ago that still exist, though the zk sessions bound to those znodes
> should no longer exist.
>
> Note that we've had this cluster running for about 4 years and have not
> seen this problem until recently.
>
> 1. I am wondering if there are any known issues that would affect our
> zookeeper version that may cause this behavior?
> 2. Is it possible our servers are simply in a "bad state" and a simple
> reboot might clean things up?
> 3. Any tips on diagnosing this?
>
> We noticed this issue from 2011 but that seems to have been fixed in our
> branch.
>
> 
> https://issues.apache.org/jira/browse/ZOOKEEPER-1208
>
> I also see this issue which it seems was never resolved?
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3018
>
> Thanks,
> John Lindwall
>
>


Ephemeral nodes not removed

2019-08-01 Thread John Lindwall

ZooKeeper 3.4.6-1569965

In our environment we seem to have a situation where ephemeral znodes 
are not getting removed after the zookeeper session has been 
terminated.  We can see examples of znodes that were created 3-4 days
ago that still exist, though the zk sessions bound to those znodes
should no longer exist.


Note that we've had this cluster running for about 4 years and have not
seen this problem until recently.


1. I am wondering if there are any known issues that would affect our 
zookeeper version that may cause this behavior?
2. Is it possible our servers are simply in a "bad state" and a simple
reboot might clean things up?
3. Any tips on diagnosing this?

We noticed this issue from 2011 but that seems to have been fixed in our 
branch.


https://issues.apache.org/jira/browse/ZOOKEEPER-1208

I also see this issue which it seems was never resolved?

https://issues.apache.org/jira/browse/ZOOKEEPER-3018

Thanks,
John Lindwall