[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user asfgit closed the pull request at: https://github.com/apache/zookeeper/pull/120 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95866724 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; --- End diff -- If `suspectEmptyDB` looks vague to you, that's enough reason to change it. The intent is to capture "if you find can't find a data base then treat that condition with paranoia and don't vote until someone provides you with one". The recovery aspect could be emphasized with `forceRecoverEmptyDB`. Or we could invert the condition and call it `trustEmptyDB`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95862018 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -175,11 +193,20 @@ public long restore(DataTree dt, Mapsessions, "No snapshot found, but there are log entries. " + "Something is broken!"); } -/* TODO: (br33d) we should either put a ConcurrentHashMap on restore() - * or use Map on save() */ -save(dt, (ConcurrentHashMap )sessions); -/* return a zxid of zero, since we the database is empty */ -return 0; + +if (suspectEmptyDB) { +/* return a zxid of -1, since we are possibly missing data */ +LOG.warn("Unexpected empty data tree, setting zxid to -1"); --- End diff -- We only reach this point if our `SnapShot` interface was not able to deserialize anything into the `DataTree`. In the usual case of `FileSnap`, this happens only when there aren't any non-corrupted snapshots to be loaded. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95721046 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -175,11 +193,20 @@ public long restore(DataTree dt, Mapsessions, "No snapshot found, but there are log entries. " + "Something is broken!"); } -/* TODO: (br33d) we should either put a ConcurrentHashMap on restore() - * or use Map on save() */ -save(dt, (ConcurrentHashMap )sessions); -/* return a zxid of zero, since we the database is empty */ -return 0; + +if (suspectEmptyDB) { +/* return a zxid of -1, since we are possibly missing data */ +LOG.warn("Unexpected empty data tree, setting zxid to -1"); --- End diff -- Are we 100% sure the data tree is empty? Couldn't it be only partially complete? I mean the machine recorded up to transaction n, but lost transactions n+1, n+2, n+3, etc? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95714487 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { +if (!initFile.delete()) { +throw new IOException("Unable to delete initialization file " + initFile.toString()); +} +suspectEmptyDB = false; +} else { +suspectEmptyDB = !autoCreateDB; --- End diff -- I tempted to do put the log line on the other side of the conditional since this side is the expected case. We should only delete an initialize file once in the lifecycle of a given server while the check against `autoCreateDB` will happen every other time the server is restarted. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95714170 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { --- End diff -- Nice optimization, I like it! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95714094 --- Diff: bin/zkServer-initialize.sh --- @@ -113,6 +113,8 @@ initialize() { else echo "No myid provided, be sure to specify it in $ZOO_DATADIR/myid if using non-standalone" fi + +date > "$ZOO_DATADIR/initialize" --- End diff -- True enough, `touch` is sufficient. Using `date` is an optimization I've included in other scripts in the past as a way of sneaking a bit more information into an otherwise meaningless file but in this context it's probably just confusing. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95711916 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { --- End diff -- Disclaimer: I am not used to `Files` class so you may have to make sure it doesn't alter the current behaviour if you decide to use it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95710948 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { +if (!initFile.delete()) { +throw new IOException("Unable to delete initialization file " + initFile.toString()); +} +suspectEmptyDB = false; +} else { +suspectEmptyDB = !autoCreateDB; --- End diff -- IMO, it would be nice to put a `debug` (warn?) log message here. Something along the lines of "Initialize file doesn't found! Using autoCreateDB attribute." --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95703179 --- Diff: bin/zkServer-initialize.sh --- @@ -113,6 +113,8 @@ initialize() { else echo "No myid provided, be sure to specify it in $ZOO_DATADIR/myid if using non-standalone" fi + +date > "$ZOO_DATADIR/initialize" --- End diff -- Nit: If the sole purpose of this file is to act as a marker, in spite of its content, then a ```touch $ZOO_DATADIR/initialize``` would be enough, wouldn't it? Of course, `date` is fine as well, no problem. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95707294 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { --- End diff -- As Java 7 is the default we could use the code below? The benefits are that it automatically throws the `IOException` if an I/O error happens or return `false` if the file doesn't exists. ``` if (Files.deleteIfExists(initFile.toPath()) { suspectEmptyDB = false; } else { (...) ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95707659 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) throws IOException { txnLog = new FileTxnLog(this.dataDir); snapLog = new FileSnap(this.snapDir); + +autoCreateDB = Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE, --- End diff -- +1 with @hanm --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user eribeiro commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r95709489 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -167,6 +175,16 @@ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; --- End diff -- Could we rename this to `recoveringDB` or `recoveringNode`? My rationale is: `suspectEmptyDB` looks vague to me, **plus** __if I understood it right__ a node could have been shutdown and restarted after some time. So, not necessarily its DB will be empty, but it is in a recovering process so we want to avoid that it becoming the leader and messing up with transactions performed while it was offline, right? Could we rename this to `recoveringDB` or `recoveringNode`? My rationale is: `suspectEmptyDB` looks vague to me, **plus** because __if I understood it right__ a node could have been shutdown and restarted after some time. So, not necessarily its DB will be empty, but it is in a recovering process so we want to avoid that it becoming the leader and messing up with transactions performed while it was offline, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user hanm commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r91235960 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) throws IOException { txnLog = new FileTxnLog(this.dataDir); snapLog = new FileSnap(this.snapDir); + +autoCreateDB = Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE, --- End diff -- >> Is that in accord with Zookeeper style? I see - I thought the new property was not end user facing since there is no associated documents added here. Since the property "zookeeper.db.autocreate" is exposed to user some doc could be added to ZooKeeperAdmin.html (similarly like how the existing "zookeeper.datadir.autocreate" is documented there) to describe the motivation / usage of the property. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user enixon commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r91234951 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) throws IOException { txnLog = new FileTxnLog(this.dataDir); snapLog = new FileSnap(this.snapDir); + +autoCreateDB = Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE, --- End diff -- I included `ZOOKEEPER_DB_AUTOCREATE` to allow users to opt out of the feature until they're ready to update their ensemble management tooling to support creating the new file. Is that in accord with Zookeeper style? On the question of style, `ZOOKEEPER_DB_AUTOCREATE_DEFAULT` exists purely because `ZOOKEEPER_DATADIR_AUTOCREATE_DEFAULT` exists above it in the file. If including the defaults as static constants isn't Zookeeper style then I'm happy to replace it with a string literal in the constructor. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user hanm commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r91208138 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -165,8 +173,41 @@ public File getSnapDir() { */ public long restore(DataTree dt, Mapsessions, PlayBackListener listener) throws IOException { -snapLog.deserialize(dt, sessions); +long deserializeResult = snapLog.deserialize(dt, sessions); FileTxnLog txnLog = new FileTxnLog(dataDir); +boolean suspectEmptyDB; +File initFile = new File(dataDir.getParent(), "initialize"); +if (initFile.exists()) { +if (!initFile.delete()) { +LOG.warn("Unable to delete initialization file " + initFile.toString()); --- End diff -- It sounds pretty serious issue if the initialize file can't be cleaned up upon startup of a new ensemble for whatever reasons as the presence of this file is the key promise made here - not able to clean it up will possibly lead to inconsistent quorum state again that this PR is trying to fix. So, maybe throw an IOException here to abort server start process and let admin intervene instead? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: ZOOKEEPER-261
Github user hanm commented on a diff in the pull request: https://github.com/apache/zookeeper/pull/120#discussion_r91208860 --- Diff: src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java --- @@ -132,6 +137,9 @@ public FileTxnSnapLog(File dataDir, File snapDir) throws IOException { txnLog = new FileTxnLog(this.dataDir); snapLog = new FileSnap(this.snapDir); + +autoCreateDB = Boolean.parseBoolean(System.getProperty(ZOOKEEPER_DB_AUTOCREATE, --- End diff -- It seems that this variable `autoCreateDB` (and the property `ZOOKEEPER_DB_AUTOCREATE_DEFAULT` and `ZOOKEEPER_DB_AUTOCREATE`) is used solely for testing purpose (to change control flow and get code coverage). IIUC, maybe add some comments about these test only variables? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] zookeeper pull request #120: Zookeeper 261
GitHub user enixon opened a pull request: https://github.com/apache/zookeeper/pull/120 Zookeeper 261 You can merge this pull request into a Git repository by running: $ git pull https://github.com/enixon/zookeeper ZOOKEEPER-261 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/zookeeper/pull/120.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #120 commit 68433ae695438f6f97e740b8413b1f7aca56fbd3 Author: Benjamin ReedDate: 2016-11-29T22:08:22Z ZOOKEEPER-2325: Data inconsistency if all snapshots empty or missing commit 35ce7e275c4ec8431e5dc7775d8257d4f76ec251 Author: Benjamin Reed Date: 2016-12-01T23:04:52Z address comments from rgs1 commit f67e247d9fcb222563c5e5e1f363ad0fe56dcc05 Author: Brian Nixon Date: 2016-11-29T22:56:07Z ZOOKEEPER-261: Reinitialized servers should not participate in leader election --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---