[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506687#comment-16506687
 ] 

Brian Nixon commented on ZOOKEEPER-3056:
----------------------------------------

[~mmerli] That's a very reasonable concern, and ideally I'd have all upgrades be 
seamless in exactly the way you describe. Gating the validation behind a 
property is undesirable only because it adds yet another config option.

[~hanm] I think the signal file is a very workable approach and pretty 
straightforward to implement. The first intervention that I scoped out (create 
a snapshot.0) was inspired by yours, as it simplifies the path from "signal file" 
to "database load with trust in the transaction log" to "create snapshot, 
delete signal file". It's a trade-off between admin time and server-side 
code complexity for sure.
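
To make the signal-file path concrete, here is a minimal sketch using only java.nio.file. Everything named in it is an assumption for illustration: the class SignalFileCheck, the marker name "trust-txnlog.signal", and the three-way decision are hypothetical and are not the actual FileTxnSnapLog code. The only thing taken from ZooKeeper itself is that snapshot files are named "snapshot.<zxid>".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class SignalFileCheck {
    // Hypothetical marker name; not a real ZooKeeper file name.
    static final String SIGNAL_FILE = "trust-txnlog.signal";

    enum Decision { LOAD_FROM_SNAPSHOT, LOAD_FROM_TXNLOG_ONLY, REFUSE }

    // True if the directory holds at least one file named snapshot.<something>.
    static boolean hasSnapshot(Path snapDir) throws IOException {
        try (DirectoryStream<Path> files =
                 Files.newDirectoryStream(snapDir, "snapshot.*")) {
            return files.iterator().hasNext();
        }
    }

    // Step 1 and 2: decide whether the transaction log may be trusted alone.
    static Decision decide(Path snapDir) throws IOException {
        if (hasSnapshot(snapDir)) {
            return Decision.LOAD_FROM_SNAPSHOT;
        }
        if (Files.exists(snapDir.resolve(SIGNAL_FILE))) {
            // The admin has vouched for the snapshot-less state.
            return Decision.LOAD_FROM_TXNLOG_ONLY;
        }
        // No snapshot and no signal: treat as possible corruption.
        return Decision.REFUSE;
    }

    // Step 3: once a snapshot has been written, remove the marker.
    static void clearSignal(Path snapDir) throws IOException {
        Files.deleteIfExists(snapDir.resolve(SIGNAL_FILE));
    }
}
```

The server would call decide() before loading, write a snapshot after a LOAD_FROM_TXNLOG_ONLY load, and then call clearSignal(), so the marker only ever covers a single startup.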

In order of decreasing seamlessness (and increasing admin time):
 * property flag snapshot validation (default off)
 * property flag snapshot validation (default on)
 * signal file
 * admin script to create a snapshot.0 file in the snapshot directory
 * upgrade notes to create a snapshot.0 file in the snapshot directory
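
For comparison, the two property-flag options at the top of the list differ only in the default. A rough sketch, where the property name "example.snapshot.validation.enabled" and the class are hypothetical and chosen purely for illustration:

```java
class SnapshotValidationFlag {
    // Hypothetical property name; not an existing ZooKeeper setting.
    static final String PROP = "example.snapshot.validation.enabled";

    // Reads the flag, falling back to the given default when it is unset.
    static boolean validationEnabled(boolean defaultValue) {
        String raw = System.getProperty(PROP);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw);
    }

    // A snapshot-less database loads only if validation is switched off.
    static boolean mayLoad(boolean snapshotFound, boolean defaultValue) {
        return snapshotFound || !validationEnabled(defaultValue);
    }
}
```

With the default off, an upgrade with no snapshot proceeds untouched. With the default on, the admin has to set the property to false before the first start.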

For the use cases that we maintain, being unable to load a snapshot is far more 
likely to indicate corruption or machine malfeasance than a legitimate 
database, so I'd like to broaden that impression with more information from the 
community. Is a snapshot-less db expected/unremarkable under some reasonable 
workloads, or is it something worth (politely) discouraging? I do believe 
ZOOKEEPER-2325 is a good feature and it would be a shame to turn it off by 
default.

> Fails to load database with missing snapshot file but valid transaction log 
> file
> --------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3056
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3056
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.3, 3.5.4
>            Reporter: Michael Han
>            Priority: Critical
>
> [An 
> issue|https://lists.apache.org/thread.html/cc17af6ef05d42318f74148f1a704f16934d1253f1472cccc1a93b4b@%3Cdev.zookeeper.apache.org%3E]
>  was reported when a user failed to upgrade from 3.4.10 to 3.5.4 with a missing 
> snapshot file.
> The code that complains about the missing snapshot file is 
> [here|https://github.com/apache/zookeeper/blob/release-3.5.4/src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java#L206]
>  which was introduced as part of ZOOKEEPER-2325.
> With this check, ZK will not load the db without a snapshot file, even if the 
> transaction log files are present and valid. This could be a problem for 
> restoring a ZK instance which does not have a snapshot file but has a sound 
> state (e.g. it crashes before being able to take the first snapshot with a 
> large snapCount parameter configured).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
