Got it, thanks for explaining. Since I was killing the process on that event (assuming ZooKeeper had removed the node), this change will help me avoid some restarts that were not needed.
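
To check my understanding, here is a minimal sketch of the rule as I read it from the explanation quoted below. It is not Curator code: `LinkState`, `SessionLossTracker` and `shouldRestart` are names I made up for illustration, and treating the full negotiated session timeout as the threshold is my assumption. The idea is that a disconnect alone only suspends the connection, and the session counts as lost either when ZooKeeper explicitly reports expiry or when the disconnect has lasted at least as long as the session timeout.

```java
import java.time.Duration;

/** Hypothetical stand-in for the connection states discussed in this thread. */
enum LinkState { CONNECTED, SUSPENDED, LOST }

/**
 * Hypothetical model (not Curator's implementation) of the 3.0.0 rule:
 * while disconnected, the session is treated as lost only once the
 * negotiated session timeout has elapsed since the disconnect.
 */
final class SessionLossTracker {
    private final long sessionTimeoutMs;
    private long disconnectedAtMs = -1;   // -1 means "currently connected"
    private boolean expiredByServer = false;

    SessionLossTracker(Duration negotiatedSessionTimeout) {
        this.sessionTimeoutMs = negotiatedSessionTimeout.toMillis();
    }

    /** Record the moment the connection dropped (first drop wins). */
    void onDisconnected(long nowMs) {
        if (disconnectedAtMs < 0) {
            disconnectedAtMs = nowMs;
        }
    }

    /** Connection re-established before the session was lost. */
    void onReconnected() {
        disconnectedAtMs = -1;
    }

    /** The server explicitly reported that the session expired. */
    void onSessionExpired() {
        expiredByServer = true;
    }

    /** State the client would report at time nowMs. */
    LinkState stateAt(long nowMs) {
        if (expiredByServer) {
            return LinkState.LOST;
        }
        if (disconnectedAtMs < 0) {
            return LinkState.CONNECTED;
        }
        boolean timedOut = nowMs - disconnectedAtMs >= sessionTimeoutMs;
        return timedOut ? LinkState.LOST : LinkState.SUSPENDED;
    }

    /** Restart only when the session, and so the ephemeral nodes, are gone. */
    static boolean shouldRestart(LinkState state) {
        return state == LinkState.LOST;
    }
}
```

With a 30 second session timeout, a 4 second network blip would leave this model in SUSPENDED, so the process would keep running instead of being killed.
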

On Wed, Nov 18, 2015 at 3:23 PM, Cameron McKenzie <[email protected]> wrote:

> Not necessarily false alarms, just that the LOST event didn't necessarily
> mean session loss, just that curator was giving up.
>
> With 3.0.0 the LOST event will occur when Curator is explicitly told that
> a session has expired by Zookeeper, or if no connection to Zookeeper is
> available, Curator will publish a LOST event when it thinks that the
> session has been lost. This is based on a timer and the negotiated session
> timeout with ZooKeeper.
>
> On Thu, Nov 19, 2015 at 10:13 AM, Vikrant Singh <[email protected]> wrote:
>
>> Thanks a lot for reply. So if I am understanding it correct, there were
>> false alarms (or mistaken connection lost). With 3.0.0 connection_lost
>> events will happen only when there is true session lost.
>>
>> On Wed, Nov 18, 2015 at 1:16 PM, Cameron McKenzie <[email protected]> wrote:
>>
>>> Hey Vikrant,
>>> The issue was that the LOST event was being published by Curator when it
>>> gave up trying to reconnect to Zookeeper after connection loss, whereas
>>> most people were interpreting it to mean that the session was lost.
>>>
>>> So, the change in CURATOR-3.0 is that the LOST event will be published
>>> when the session has either expired and Curator is explicitly told this by
>>> Zookeeper (implying that a connection is present), or when Curator has been
>>> disconnected from Zookeeper for long enough for the session to have expired
>>> on the server (this will occur when no connection to Zookeeper is present).
>>>
>>> So, I'm not sure how it will help your case. It is just a more reliable
>>> way of knowing that the session is gone and all related ephemeral state on
>>> the Zookeeper server will also be gone.
>>>
>>> Note that it's also possible to tell Curator to use the legacy way of
>>> interpreting the LOST event.
>>> cheers
>>>
>>> On Thu, Nov 19, 2015 at 8:09 AM, Vikrant Singh <[email protected]> wrote:
>>>
>>>> Hello All,
>>>> I need some guidance on understanding how to a fix done in latest
>>>> release 3.0.0. I am talking about following fix -
>>>> https://issues.apache.org/jira/browse/CURATOR-247.
>>>>
>>>> In my project we create some ephemeral nodes and monitor a cluster
>>>> through a tree cache. Framework for treecache and ephemeral node is
>>>> created using ExponentialBackoffRetry with retry interval of 1 sec and
>>>> retry count of 29 (which is MAX_RETRIES_LIMIT). We do kill the
>>>> process moment we get TreeCacheEvent.Type.CONNECTION_LOST event.
>>>>
>>>> As process restart is really expensive, I want to understand how I can
>>>> leverage from this fix.
>>>>
>>>> Please help me in understanding what is the issue and how it may affect
>>>> a setup like ours. We are still not on 3.0.0.
>>>>
>>>> Thanks,
>>>> Vikrant
