[ https://issues.apache.org/jira/browse/BOOKKEEPER-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146460#comment-13146460 ]
jirapos...@reviews.apache.org commented on BOOKKEEPER-101: ---------------------------------------------------------- bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java, line 561 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56350#file56350line561> bq. > bq. > Can we pl get some javadoc here that specifies the precise semantics of the two new flags. Looking at the code actually makes me understand the semantics, but it will be better if we can create a javadoc here, especially since this is a public api, isn't it? bq. > The client api is in org.apache.bookkeeper.client. This is serverside, so not really public. Some javadoc here would be useful for developers though. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java, line 570 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56350#file56350line570> bq. > bq. > I have a major question here: suppose client A is executing a Bookie.addEntry(), it first checked that handle is not fencedout and is about to go ahead to write the entry into the log. It is about to execute handle.addEntry(). bq. > bq. > Now another client B issued a Bookie.addEntry(enableFencing=true). This client sets the fecing flag in the handle and then finishes doing the handle.addEntry() for its own entry. It now returns to the client and client B thinks that it has successfully fenced out other clients. bq. > bq. > Now the first thread that was processing the request for client A wakes up and finishes the handle.addEntry() call. Can this happen? If so, then we have a problem. bq. > bq. > Ah, thanks for reminding me of this. It isn't a problem now, as this part of the bookie is single threaded, but its best to address this now, than having to scrabble to fix it if/when we move to multithreaded bookies. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java, line 268 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56354#file56354line268> bq. > bq. > Some java doc here would be really realyl helpful I'll add. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClient.java, line 74 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56364#file56364line74> bq. > bq. > some javadocs on the meaning of "options" would be nice. I'll add. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerDescriptor.java, line 46 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56352#file56352line46> bq. > bq. > maybe we should use the volatile keyword here? I'll add. As I said above, this part is single threaded, so it shouldn't make a difference, but no harm in future proofing. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieException.java, line 69 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56351#file56351line69> bq. > bq. > Is it easy to print which client (client identifier, ip address, etc) has fenced off? This helps debugging. The client identifier of who fenced it? It would require modifications to the LedgerDescriptor. Perhaps we could log a message when the ledger itself gets fenced, and the person debugging could match it up. bq. On 2011-11-08 18:22:42, Dhruba Borthakur wrote: bq. > bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java, line 590 bq. > <https://reviews.apache.org/r/2594/diff/2/?file=56350#file56350line590> bq. > bq. > is there a use-case why we need the enableFencing call for both addEntry() and readEntry()? Isn't it enough to have this flag only for readEntry() call? There shouldn't be. I think I added it to addEntry because I was thinking I could get away with just one flag for fencing and recovery add, which I couldn't. I don't think there's any case were addEntry could actually fence, because by the time we get to it, readEntry() should have have hit the bookie. I need to think about this a little more. - Ivan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2594/#review3113 ----------------------------------------------------------- On 2011-11-04 17:40:53, Ivan Kelly wrote: bq. bq. ----------------------------------------------------------- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2594/ bq. ----------------------------------------------------------- bq. bq. (Updated 2011-11-04 17:40:53) bq. bq. bq. Review request for bookkeeper. bq. bq. bq. Summary bq. ------- bq. bq. BookKeeper is designed for use as a Write ahead log. In systems with a primary/backup architecture, the primary will write state updates to the WAL. If the primary dies the backup comes online, reads the WAL to get the latest state and starts serving requests. However, if the primary was only partitioned from the network, or stuck in a long GC, a split brain occurs. Both primary and backup can service client requests. bq. bq. Fencing(http://en.wikipedia.org/wiki/Fencing_%28computing%29) ensures that this cannot happen. With fencing, the backup can close the WAL of the primary, and cause any subsequent attempt by the primary to write to the WAL to give an error. bq. bq. bq. This addresses bug BOOKKEEPER-101. bq. https://issues.apache.org/jira/browse/BOOKKEEPER-101 bq. bq. bq. Diffs bq. ----- bq. bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/Bookie.java d651894 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/BookieException.java 292617e bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/LedgerDescriptor.java 024cac3 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BKException.java d7c8f67 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java a1fbab7 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java b3eb5b9 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerCreateOp.java 6f72e47 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 8c2a54f bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java 328c7ca bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java a68856c bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 7465c52 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingAddOp.java eddd760 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 385b16c bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java c2d4cee bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieClient.java d70ae27 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieProtocol.java 873dafe bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/BookieServer.java f1b3ad9 bq. bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java 2cd4de8 bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/client/LedgerCacheTest.java 07639aa bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java PRE-CREATION bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/proto/TestProtoVersions.java f6cd8c9 bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java 6bac569 bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookieClientTest.java 97dc2ab bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookieRecoveryTest.java ac54d9a bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/test/ConcurrentLedgerTest.java ebb17d2 bq. bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LoopbackClient.java 85822bf bq. hedwig-server/src/main/java/org/apache/hedwig/server/benchmark/BookieBenchmark.java 18319d7 bq. bq. Diff: https://reviews.apache.org/r/2594/diff bq. bq. bq. Testing bq. ------- bq. bq. bq. Thanks, bq. bq. Ivan bq. bq. > Add Fencing to Bookkeeper > ------------------------- > > Key: BOOKKEEPER-101 > URL: https://issues.apache.org/jira/browse/BOOKKEEPER-101 > Project: Bookkeeper > Issue Type: New Feature > Reporter: Ivan Kelly > Assignee: Ivan Kelly > Fix For: 4.0.0 > > Attachments: BOOKKEEPER-101.diff, BOOKKEEPER-101.diff > > > BookKeeper is designed for use as a Write ahead log. In systems with a > primary/backup architecture, the primary will write state updates to the WAL. > If the primary dies the backup comes online, reads the WAL to get the latest > state and starts serving requests. However, if the primary was only > partitioned from the network, or stuck in a long GC, a split brain occurs. > Both primary and backup can service client requests. > Fencing(http://en.wikipedia.org/wiki/Fencing_%28computing%29) ensures that > this cannot happen. With fencing, the backup can close the WAL of the > primary, and cause any subsequent attempt by the primary to write to the WAL > to give an error. > We fence a ledger whenever it is opened by another client using > BookKeeper#openLedger. BookKeeper#openLedgerNoRecovery will not fence. > The opening client marks the ledger as fenced in zookeeper, and then sends a > readEntry message to a all of bookies with the DO_FENCING flag set. Once at > least 1 bookie in each possible quorum of bookies have responded, we can > proceed with opening the ledger. Any subsequent attempt to write to the > ledger will fail as it will not be able to write to a quorum without one of > the bookie in the quorum responding with a ledger fenced error. The client > will also be unable to change the quorum without seeing that the ledger has > been marked as fenced in zookeeper. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira