[ https://issues.apache.org/jira/browse/ZOOKEEPER-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502830#comment-16502830 ]
Michael Han commented on ZOOKEEPER-3056:
----------------------------------------
I think we just need to distinguish a missing snapshot file from a snapshot
that was never taken. To do that, we can create a signal file under the
dataLogDir when we take our first snapshot. The presence of that signal file
indicates that the transaction logs depend on a snapshot, so we can't just
ignore a missing snapshot file. The conditional check then looks for both the
signal file and the snapshot file, and it only complains if it finds the
signal file but not the snapshot file. This should cover all cases, including
upgrade. Note this is best effort, as it's still possible to subvert the
check, but I think that's fine as we don't deal with Byzantine faults.
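
A rough sketch of the check described above, to make the idea concrete. This
is not the actual patch: the class name, the method names, and the signal
file name "first_snapshot_taken" are all made up for illustration. The only
thing assumed about ZooKeeper itself is that snapshot files are named with a
"snapshot." prefix.

```java
import java.io.File;
import java.io.IOException;

// Illustrative only: none of these names come from the ZooKeeper code base.
final class SnapshotSignalCheck {

    // Hypothetical marker file name; the comment above does not specify one.
    static final String SIGNAL_FILE_NAME = "first_snapshot_taken";

    // Called when the first snapshot is taken. Records that the transaction
    // logs now depend on a snapshot. Safe to call more than once.
    static void markFirstSnapshot(File dataLogDir) throws IOException {
        File signal = new File(dataLogDir, SIGNAL_FILE_NAME);
        // createNewFile() returns false if the file already exists,
        // which is fine for our purposes.
        signal.createNewFile();
    }

    // Returns true only when the signal file is present but no snapshot
    // file can be found, i.e. the one case where loading should complain.
    static boolean isSnapshotUnexpectedlyMissing(File dataLogDir, File snapDir) {
        boolean signalPresent = new File(dataLogDir, SIGNAL_FILE_NAME).exists();
        File[] snapshots =
            snapDir.listFiles((dir, name) -> name.startsWith("snapshot."));
        boolean snapshotPresent = snapshots != null && snapshots.length > 0;
        return signalPresent && !snapshotPresent;
    }
}
```

With this shape, an instance upgraded from an older version has no signal
file, so a missing snapshot is tolerated and the transaction logs are
replayed. An instance that has already taken a snapshot has the signal file,
so a missing snapshot is still reported as an error.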
> Fails to load database with missing snapshot file but valid transaction log
> file
> --------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3056
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3056
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.3, 3.5.4
> Reporter: Michael Han
> Priority: Critical
>
> [An
> issue|https://lists.apache.org/thread.html/cc17af6ef05d42318f74148f1a704f16934d1253f1472cccc1a93b4b@%3Cdev.zookeeper.apache.org%3E]
> was reported when a user failed to upgrade from 3.4.10 to 3.5.4 with a missing
> snapshot file.
> The code that complains about the missing snapshot file is
> [here|https://github.com/apache/zookeeper/blob/release-3.5.4/src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java#L206]
> which was introduced as part of ZOOKEEPER-2325.
> With this check, ZK will not load the db without a snapshot file, even if the
> transaction log files are present and valid. This could be a problem when
> restoring a ZK instance that does not have a snapshot file but has a sound
> state (e.g. it crashed before being able to take the first snapshot with a
> large snapCount parameter configured).