[
https://issues.apache.org/jira/browse/ZOOKEEPER-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506687#comment-16506687
]
Brian Nixon commented on ZOOKEEPER-3056:
----------------------------------------
[~mmerli] That's a very reasonable concern, and ideally I'd have all upgrades be
seamless in exactly the way you describe. Gating the validation behind a
property is undesirable only because it adds yet another configuration option.
[~hanm] I think the signal file is a very workable approach and pretty
straightforward to implement. The first intervention that I scoped out (create
a snapshot.0) was inspired by yours, as it simplifies the path of "signal file"
to "database load with trust in the transaction log" to "create snapshot,
delete signal file". It's a trade-off between admin time and server-side
code complexity for sure.
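To make the signal-file path concrete, here is a rough sketch of the three steps (signal file, load trusting the transaction log, create snapshot and delete signal file). It is not the actual ZooKeeper code: the class name, the method names, and the signal file name "trust-txnlog.signal" are all hypothetical, and writing an empty snapshot.0 only stands in for replaying the log and taking a real snapshot.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class SignalFileSketch {
    // Hypothetical marker name chosen for illustration only.
    static final String SIGNAL_FILE = "trust-txnlog.signal";

    /** Returns true if the directory contains a file named snapshot.<suffix>. */
    static boolean hasSnapshot(Path snapDir) throws IOException {
        try (Stream<Path> files = Files.list(snapDir)) {
            return files.anyMatch(
                p -> p.getFileName().toString().startsWith("snapshot."));
        }
    }

    /**
     * Returns true if the database load may proceed.
     * With a snapshot present, the load is always allowed. Without one, the
     * load is allowed only when the admin has placed the signal file; in that
     * case a snapshot is written and the signal file is removed, so the next
     * start goes through the normal validation.
     */
    static boolean prepareLoad(Path snapDir) throws IOException {
        if (hasSnapshot(snapDir)) {
            return true;
        }
        Path signal = snapDir.resolve(SIGNAL_FILE);
        if (!Files.exists(signal)) {
            return false; // no snapshot and no signal: refuse to load
        }
        // Placeholder for "replay the transaction log, then take a snapshot".
        Files.createFile(snapDir.resolve("snapshot.0"));
        Files.delete(signal);
        return true;
    }
}
```

The admin cost here is a single file creation before the first start after the upgrade; the server-side cost is the extra branch shown above.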
In order of decreasing seamlessness (and increasing admin time):
* property flag snapshot validation (default off)
* property flag snapshot validation (default on)
* signal file
* admin script to create a snapshot.0 file in the snapshot directory
* upgrade notes to create a snapshot.0 file in the snapshot directory
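For comparison, the two property-flag options differ only in the default. A minimal sketch, assuming a made-up system property name ("example.snapshot.validation") and made-up class and method names, none of which come from the ZooKeeper code base:

```java
public class SnapshotValidationFlag {
    // Hypothetical property name, used here for illustration only.
    static final String PROP = "example.snapshot.validation";

    /** Reads the flag, falling back to the given default when it is unset. */
    static boolean validationEnabled(boolean defaultOn) {
        return Boolean.parseBoolean(
            System.getProperty(PROP, Boolean.toString(defaultOn)));
    }

    /**
     * Returns true if the database load may proceed: either a snapshot was
     * found, or the missing-snapshot validation is switched off.
     */
    static boolean mayLoad(boolean snapshotFound, boolean defaultOn) {
        return snapshotFound || !validationEnabled(defaultOn);
    }
}
```

With the default off, a snapshot-less upgrade loads without any admin action; with the default on, the admin has to set the property to false once to get past the check.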
For the use cases that we maintain, it's far more likely that being unable to
load a snapshot indicates corruption or machine malfeasance than a legitimate
database, so I'd like to expand that impression with more information from the
community. Is a snapshot-less db expected/unremarkable under some reasonable
workloads, or is it something worth (politely) discouraging? I do believe
ZOOKEEPER-2325 is a good feature and it would be a shame to turn it off by
default.
> Fails to load database with missing snapshot file but valid transaction log
> file
> --------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3056
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3056
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.3, 3.5.4
> Reporter: Michael Han
> Priority: Critical
>
> [An
> issue|https://lists.apache.org/thread.html/cc17af6ef05d42318f74148f1a704f16934d1253f1472cccc1a93b4b@%3Cdev.zookeeper.apache.org%3E]
> was reported when a user failed to upgrade from 3.4.10 to 3.5.4 with a missing
> snapshot file.
> The code that complains about the missing snapshot file is
> [here|https://github.com/apache/zookeeper/blob/release-3.5.4/src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java#L206],
> which was introduced as part of ZOOKEEPER-2325.
> With this check, ZK will not load the db without a snapshot file, even if the
> transaction log files are present and valid. This could be a problem when
> restoring a ZK instance which does not have a snapshot file but has a sound
> state (e.g. it crashed before being able to take the first snapshot with a
> large snapCount parameter configured).