Sorry, I meant to say that 3.5 introduced a check that snapshots exist, not that snapshot files don't exist in 3.4.

On 2021/01/11 10:13:34, Stig Døssing <generalbas....@gmail.com> wrote:
> Hi,
>
> Zookeeper 3.5 introduced snapshot files, which did not exist in 3.4. 3.5
> won't start if data is present, and there is no snapshot file.
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-3056 added an option to
> disable this check, to enable migration from 3.4 clusters. The workaround
> before then was to add an empty snapshot file to the dataDir.
>
> As far as I can tell, the intended method of upgrading from 3.4 is to add
> snapshot.trust.empty=true to the Zookeeper configuration, upgrade to 3.5.x,
> and remove the snapshot.trust.empty property once snapshots exist on all
> nodes.
>
> Sadly this method turns out to be inconvenient, as upgraded nodes will not
> write snapshots immediately. See
> https://issues.apache.org/jira/browse/ZOOKEEPER-3781?focusedCommentId=17261317&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261317.
>
> The reason some nodes may not write snapshots seems to be that when a new
> leader is elected, it may opt to send DIFF to the followers if they are not
> too far behind. If a follower receives a DIFF, it will not write a snapshot
> once NEWLEADER is received.
>
> Is this snapshot write skipped for efficiency reasons, or to maintain
> correctness? If it is skipped only for efficiency, I think the upgrade
> experience could be improved, by always writing a snapshot at
> https://github.com/apache/zookeeper/blob/eeb053767c9e931ae72a2d8c59c0940da3da9679/zookeeper-server/src/main/java/org/apache/zookeeper/server/quorum/Learner.java#L739-L741
> if snapshot.trust.empty=true.
>
> This would allow people upgrading from 3.4.x to set
> snapshot.trust.empty=true, upgrade and boot the cluster, and remove the
> property again very shortly after the reboot.
>
> What do you think?
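
For the "remove the property once snapshots exist on all nodes" step, a small per-node check might help decide when it is safe. The following is only a rough sketch: it assumes the usual on-disk layout where snapshots are written to dataDir/version-2 as files named snapshot.<zxid>, and the class name SnapshotCheck is made up for this example and is not part of ZooKeeper. It only looks at file names, so an empty placeholder snapshot created as a workaround would also count.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of ZooKeeper.
class SnapshotCheck {

    // Returns true if dataDir/version-2 contains at least one entry whose
    // name starts with "snapshot." (assumed snapshot naming convention).
    static boolean hasSnapshot(Path dataDir) throws IOException {
        Path versionDir = dataDir.resolve("version-2");
        if (!Files.isDirectory(versionDir)) {
            return false;
        }
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(versionDir, "snapshot.*")) {
            return stream.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: SnapshotCheck <dataDir>");
            System.exit(2);
        }
        boolean found = hasSnapshot(Paths.get(args[0]));
        System.out.println(found ? "snapshot present" : "no snapshot yet");
        // Exit code 0 only when a snapshot file was found, for use in scripts.
        System.exit(found ? 0 : 1);
    }
}
```

Run it on each node against the configured dataDir, and drop snapshot.trust.empty only after every node reports a snapshot.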