Local sessions are only available in 3.5+, so I don't think that's an issue for Mark (3.4.5). However, it's a really good point, and I'm not sure myself what would happen. Thanks Dan!
Patrick

On Wed, Jun 22, 2016 at 11:10 AM, Jordan Zimmerman <[email protected]> wrote:

> Curator 3.0 will simulate a session expiration when there’s a network
> partition, but Curator 2.0 does not. If you’re using ZK 3.4.5 you’d be
> using Curator 2.0 so the only way you’d see a session expiration is when
> you successfully reconnect to the ensemble.
>
> -JZ
>
> > On Jun 22, 2016, at 12:58 PM, Patrick Hunt <[email protected]> wrote:
> >
> > Hi Mark. See this jira for background:
> > https://issues.apache.org/jira/browse/ZOOKEEPER-1277
> >
> > However what you describe is correct behavior from our perspective. When
> > the lower 32 roll over we now (that was the fix) force a re-election of
> > the leader. Leader re-election causes the quorum to stop serving clients
> > until a new quorum forms.
> >
> > Leader re-election is a normal behavior for the ZK service, it happens
> > whenever the current leader is lost and a new quorum, with a (possibly
> > new) leader needs to reform. Say if the current leader process is
> > restarted. Your clients need to be able to handle this situation
> > (typically the client library does this for you).
> >
> > That said, you should not be seeing session expiration as a result of
> > this. Client timeouts certainly, but not session expiration. It might
> > happen for other reasons, but the leader is the one responsible for
> > expiring sessions. If there is no leader (e.g. being re-elected) there is
> > no session expiration. When the new leader is elected it will reset the
> > clock on session expiration, for all sessions, from the time it's
> > reelected. For example you can shutdown the entire ZK server ensemble,
> > start it back up an hour later and the clients should all be able to
> > rejoin. Hm, that said I'm not sure if Curator is doing some special
> > magic, that's the behavior of the stock client that we ship.
> >
> > Patrick
> >
> > On Wed, Jun 22, 2016 at 6:18 AM, Figura, Mark <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> We are using ZooKeeper 3.4.5 along with Curator to perform leader
> >> elections and also store some application data on a 3-node ensemble. Our
> >> application is not hard-realtime, but glitches in stream processing do
> >> get noticed and may raise support tickets.
> >>
> >> Yesterday, we had such a glitch and by looking through the logs, I found
> >> there was an XID rollover. When this happened, a new election within the
> >> ensemble was triggered and all client connections were closed. From our
> >> application's point of view (possibly filtered through Curator), we saw
> >> the session expire and then the connection was lost. This caused our
> >> application to shutdown each component, re-perform leader elections, and
> >> eventually start back up.
> >>
> >> We do have an issue where our application is making many more writes
> >> than it should, but once this is fixed, we'll still run into an XID
> >> rollover sooner or later.
> >>
> >> Is there something our application can do to handle this situation
> >> better? Are there any plans for Zookeeper to handle this situation
> >> without closing client connections?
> >>
> >> Thanks!
> >> Mark
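As a rough illustration of the rollover discussed in this thread, the sketch below estimates how long a given write rate takes to exhaust the lower 32 bits. It assumes the "lower 32" refers to the counter half of the 64-bit zxid, with the leader epoch in the upper 32 bits, as described in ZOOKEEPER-1277. The class name, method names, and sample values are hypothetical and are not part of ZooKeeper or Curator.

```java
public class ZxidRollover {
    // Assumption: a zxid is a 64-bit value laid out as
    // epoch (high 32 bits) | counter (low 32 bits).
    private static final long COUNTER_MASK = 0xFFFFFFFFL;

    // Leader epoch stored in the upper half of the zxid.
    static long epochOf(long zxid) {
        return zxid >>> 32;
    }

    // Per-epoch transaction counter stored in the lower half of the zxid.
    static long counterOf(long zxid) {
        return zxid & COUNTER_MASK;
    }

    // Number of transactions left before the counter half wraps.
    static long remainingBeforeRollover(long zxid) {
        return COUNTER_MASK - counterOf(zxid);
    }

    // Rough estimate, in days, assuming a steady write rate
    // (writesPerSecond is a hypothetical input, not a measured value).
    static double daysUntilRollover(long zxid, double writesPerSecond) {
        return remainingBeforeRollover(zxid) / writesPerSecond / 86_400.0;
    }

    public static void main(String[] args) {
        // Hypothetical zxid: epoch 5, counter close to the 32-bit limit.
        long zxid = (5L << 32) | 0xFFFF0000L;
        System.out.println("epoch     = " + epochOf(zxid));
        System.out.println("counter   = " + counterOf(zxid));
        System.out.println("remaining = " + remainingBeforeRollover(zxid));
        // Fresh counter at 1000 writes per second.
        System.out.println("days      = " + daysUntilRollover(5L << 32, 1000.0));
    }
}
```

Under these assumptions a sustained 1000 writes per second reaches the limit in roughly seven weeks, so cutting the write rate postpones the forced re-election but does not remove it. Clients still need to tolerate the temporary loss of the connection.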
