hanm commented on a change in pull request #1069: ZOOKEEPER-3056: Fails to load
database with missing snapshot file but with valid transaction log file.
URL: https://github.com/apache/zookeeper/pull/1069#discussion_r319711331
##########
File path:
zookeeper-server/src/main/java/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java
##########
@@ -232,11 +242,19 @@ public long restore(DataTree dt, Map<Long, Integer> sessions, PlayBackListener l
         } else {
             trustEmptyDB = autoCreateDB;
         }
+
         if (-1L == deserializeResult) {
             /* this means that we couldn't find any snapshot, so we need to
              * initialize an empty database (reported in ZOOKEEPER-2325) */
             if (txnLog.getLastLoggedZxid() != -1) {
-                throw new IOException("No snapshot found, but there are log entries. " + "Something is broken!");
+                // ZOOKEEPER-3056: provides an escape hatch for users upgrading
+                // from old versions of zookeeper (3.4.x, pre 3.5.3).
+                if (!trustEmptySnapshot) {
+                    throw new IOException(EMPTY_SNAPSHOT_WARNING + "Something is broken!");
+                } else {
+                    LOG.warn(EMPTY_SNAPSHOT_WARNING + "This should only be allowed during upgrading.");
+                    trustEmptySnapshot = false;
Review comment:
>> we don't clear the in memory data base anymore
Actually, we still clear the database sometimes (the case I was referring to is the database-retain optimization introduced in ZOOKEEPER-2678 and its follow-up JIRAs). There are two cases where we still clear the database: a recovering server gets a TRUNC from the leader, or a recovering server gets a SNAP sync from the leader. Both cases clear the db, which leads to reloading the local snapshot to reinitialize the database.
In any case, one invariant here is that the first invocation of `restore` will always happen when a fresh server is starting. I don't think a future refactoring will change this fact (if there is such a refactoring at all), given how essential this step is during start-up. With this invariant, we need to reset the flag so that further database reloads at run time do not accidentally skip the snapshot validation.
Now, one alternative is that instead of trying to be clever and resetting the flag in code, we remove the reset and require the ops admin to restart the ZooKeeper process after upgrading, resetting the flag manually. This means upgrading a cluster will trigger leader election at least twice instead of once, which seems not too bad, with the benefit of making the code easier to understand. Even in that case, we can still keep this reset logic in code as an extra layer of protection, since in the worst case that logic will only introduce a false negative rather than a false positive in terms of the data safety of the system.
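To make the reset invariant concrete, here is a minimal standalone sketch of the escape-hatch behavior being discussed. This is not the real `FileTxnSnapLog` code; the class, constructor, and method names are illustrative, and the real patch wires `trustEmptySnapshot` to a system property and uses `LOG.warn` instead of stderr. The point it demonstrates is that the flag is consumed exactly once, on the first restore, so later runtime reloads (after a TRUNC or SNAP sync) fail validation again:

```java
import java.io.IOException;

// Hypothetical minimal model of the ZOOKEEPER-3056 escape hatch; names are
// illustrative and do not match the real FileTxnSnapLog implementation.
class SnapshotGuard {
    static final String EMPTY_SNAPSHOT_WARNING =
            "No snapshot found, but there are log entries. ";

    // In the real patch this comes from a system property at startup.
    private boolean trustEmptySnapshot;

    SnapshotGuard(boolean trustEmptySnapshot) {
        this.trustEmptySnapshot = trustEmptySnapshot;
    }

    /** Called when restore finds txn-log entries but no snapshot file. */
    void onMissingSnapshot() throws IOException {
        if (!trustEmptySnapshot) {
            throw new IOException(EMPTY_SNAPSHOT_WARNING + "Something is broken!");
        }
        System.err.println(EMPTY_SNAPSHOT_WARNING
                + "This should only be allowed during upgrading.");
        // Reset the flag: only the first restore (server start-up) may skip
        // validation; later reloads at run time must validate again.
        trustEmptySnapshot = false;
    }

    boolean isTrusted() {
        return trustEmptySnapshot;
    }
}
```

With this shape, a server started with the flag enabled tolerates the missing snapshot once, but a second missing-snapshot condition during the same process lifetime throws, which is the "false negative rather than false positive" safety property mentioned above.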
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services