The behavior I mean is all servers in the quorum reading the snapshot from disk as part of the synchronization phase. From Thawan's email, it looks like whenever there is a leader election, all ZK servers read the snapshot from disk. I am not sure why all servers should reload the snapshot from disk, as this increases unavailability time.

On Tue, Jul 16, 2013 at 12:35 PM, Flavio Junqueira <[email protected]> wrote:

> The synchronization phase is part of the protocol and we use it to
> guarantee that we expose a consistent view of the state. During the
> synchronization phase, servers do not accept requests.
>
> Which behavior are you proposing we change, Kishore?
>
> -Flavio
>
> On Jul 16, 2013, at 7:04 PM, kishore g <[email protected]> wrote:
>
> > Thanks for clarification Flavio. Does this mean during the leader
> > election, both reads and writes are not supported? Do we start a
> > separate thread/jira of changing this behavior?
> >
> > thanks,
> > Kishore G
> >
> > On Tue, Jul 16, 2013 at 9:16 AM, Flavio Junqueira <[email protected]> wrote:
> >
> >> The disk state should be the authoritative state of a server, so if I
> >> remember correctly, we load the database as a way of validating the disk
> >> state. I don't claim that this is strictly necessary, but if we are to
> >> change it, then I would need to think this through.
> >>
> >> About leader election, if a leader loses support from a quorum of
> >> followers, then it will drop leadership. Any event that causes a
> >> follower to stop receiving messages from the leader or the follower to
> >> disconnect from the leader will make it stop supporting the current
> >> leader.
> >>
> >> -Flavio
> >>
> >> -----Original Message-----
> >> From: Sergey Maslyakov [mailto:[email protected]]
> >> Sent: 16 July 2013 16:16
> >> To: [email protected]
> >> Subject: Re: Maximum size of a snapshot
> >>
> >> And another extension on top of Kishore's question: do the reelections
> >> happen if the previously elected leader remains in the cluster? In other
> >> words, what events can trigger re-election and the corresponding
> >> temporary degradation of the service provided by Zookeeper?
> >>
> >> Thank you,
> >> /Sergey
> >>
> >> On Tue, Jul 16, 2013 at 2:21 AM, kishore g <[email protected]> wrote:
> >>
> >>> Regarding #2. Is that really true that during leader election every
> >>> machine reloads snapshot data from disk? Any reason why this is needed
> >>> unless it really needs to truncate or undo conflicting transactions
> >>> already applied?
> >>>
> >>> On Mon, Jul 15, 2013 at 9:50 PM, Thawan Kooburat <[email protected]> wrote:
> >>>
> >>>> Max snapshot size:
> >>>>
> >>>> Here is my take on these issues, others feel free to add or correct.
> >>>>
> >>>> 1. Depends on how much RAM your machine has. Snapshot should be
> >>>> less than the available RAM since everything is loaded into memory.
> >>>> 2. Depends on what is the availability guarantee that the client needs.
> >>>> If there is leader election, every machine needs to reload the data
> >>>> from disk. So the quorum will be down for at least the same as
> >>>> snapshot loading time. The session timeout on the client side should
> >>>> be at least longer than expected downtime during leader election.
> >>>>
> >>>> --
> >>>> Thawan Kooburat
> >>>>
> >>>> On 7/15/13 8:46 PM, "Sergey Maslyakov" <[email protected]> wrote:
> >>>>
> >>>>> I have a couple of sizing questions to the users and developers.
> >>>>> Hope, you don't mind answering those.
> >>>>>
> >>>>> What is the guideline for the maximum reasonable size of a DataTree
> >>>>> that a single ZK server can manage? If ZK server writes out a snapshot
> >>>>> of about 1GB in size, is it pushed beyond the limits or is it still
> >>>>> manageable? If so, where is the critical threshold when ZK is really
> >>>>> being abused?
> >>>>>
> >>>>> Similarly, how can I estimate the propagation delay of a change
> >>>>> across an ensemble of three ZK servers?
> >>>>>
> >>>>> Thank you,
> >>>>> /Sergey
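
A rough way to turn Thawan's point 2 into numbers, as a minimal sketch only: if every server reloads its snapshot during leader election, the client session timeout has to cover at least the snapshot load time plus the election itself. The function name, the disk read throughput, the election time, and the safety margin below are all assumptions chosen for illustration, not measurements and not anything ZooKeeper reports; only the 1GB snapshot size comes from Sergey's question.

```python
def estimate_session_timeout_ms(snapshot_bytes, read_bytes_per_sec,
                                election_ms, margin_pct=150):
    """Lower-bound estimate for a client session timeout, in milliseconds.

    Hypothetical helper: assumes the quorum is unavailable for the time it
    takes to read the snapshot from disk plus the election time, and pads
    the result by margin_pct percent. Integer math avoids float rounding.
    """
    # Time to read the whole snapshot from disk, rounded up to a whole ms.
    load_ms = -(-(snapshot_bytes * 1000) // read_bytes_per_sec)
    # Expected downtime is the load time plus the (assumed) election time.
    downtime_ms = load_ms + election_ms
    # Apply the safety margin, again rounding up.
    return -(-(downtime_ms * margin_pct) // 100)


if __name__ == "__main__":
    one_gib = 1024 ** 3              # snapshot size from the original question
    disk_rate = 100 * 1024 ** 2      # ASSUMED sequential read rate: 100 MiB/s
    election = 2000                  # ASSUMED election time: 2 s
    print(estimate_session_timeout_ms(one_gib, disk_rate, election))
```

Keep in mind that the server bounds the negotiated session timeout by minSessionTimeout and maxSessionTimeout, which default to 2 and 20 times tickTime, so an estimate above that range would also need a server-side configuration change. Deserializing the snapshot into the in-memory DataTree may well take longer than the raw disk read, so measuring the actual load time on your own hardware is the better input.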
