[
https://issues.apache.org/jira/browse/BOOKKEEPER-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150275#comment-13150275
]
dhruba borthakur commented on BOOKKEEPER-101:
---------------------------------------------
The patch looks good. I will post a few minor comments on Review Board.
Two related questions:
1. There is an API, openLedger(), that triggers recovery if needed and opens
the ledger for reading. There is another API, openLedgerNoRecovery(), which
does not trigger any recovery. Does it make sense to swap the semantics of
these two calls? Intuitively, it makes more sense for an openLedger() call to
be a non-destructive, idempotent call that does not trigger any state change
on the servers. An intelligent client (e.g. the namenode) can then invoke an
openLedgerWithRecovery() call to fence off I/O from the original writer and
bring the replicas in sync.
2. Suppose there are three namenodes in the group. The active one is writing
to a ledger. Suppose the active (primary) namenode goes into a GC pause and
both standbys invoke openLedgerWithRecovery() on the same ledger. Is this use
case supported? Will both clients now start executing the code to recover the
ledger? I ask because the server does not record which client has fenced off
I/O to the ledger.
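
To make the scenario in question 2 concrete, here is a minimal toy model in
Java. It is only a sketch under stated assumptions: ToyBookie and
TwoStandbysSketch are hypothetical names, not BookKeeper classes, and the
model assumes that fencing is a plain flag on the bookie with no record of
the client that set it. Under that assumption a second fence request from the
other standby is a harmless no-op, and the original writer is rejected
either way.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToyBookie is hypothetical and is not a BookKeeper class.
class ToyBookie {
    private boolean fenced = false;
    private final List<String> entries = new ArrayList<>();

    // Assumption: fencing only sets a flag. The bookie does not remember
    // which client asked for it, so repeated calls have no further effect.
    synchronized void fence() {
        fenced = true;
    }

    // Returns false (standing in for a "ledger fenced" error) once fenced.
    synchronized boolean addEntry(String data) {
        if (fenced) {
            return false;
        }
        entries.add(data);
        return true;
    }

    synchronized int entryCount() {
        return entries.size();
    }

    synchronized boolean isFenced() {
        return fenced;
    }
}

class TwoStandbysSketch {
    public static void main(String[] args) {
        ToyBookie bookie = new ToyBookie();

        // The active namenode writes before anyone has fenced the ledger.
        System.out.println("write before fencing accepted: " + bookie.addEntry("edit-1"));

        // Both standbys fence the same ledger, one after the other.
        bookie.fence(); // standby A
        bookie.fence(); // standby B

        // The original writer wakes up from its GC pause and tries again.
        System.out.println("write after fencing accepted: " + bookie.addEntry("edit-2"));
        System.out.println("entries stored: " + bookie.entryCount());
    }
}
```

The sketch says nothing about whether both standbys would also run the full
recovery code in the real system. It only shows that, on the bookie side, a
flag-style fence does not need to know who set it.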
> Add Fencing to Bookkeeper
> -------------------------
>
> Key: BOOKKEEPER-101
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-101
> Project: Bookkeeper
> Issue Type: New Feature
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
> Fix For: 4.0.0
>
> Attachments: BOOKKEEPER-101.diff, BOOKKEEPER-101.diff,
> BOOKKEEPER-101.diff, BOOKKEEPER-101.diff
>
>
> BookKeeper is designed for use as a write-ahead log (WAL). In systems with a
> primary/backup architecture, the primary will write state updates to the WAL.
> If the primary dies the backup comes online, reads the WAL to get the latest
> state and starts serving requests. However, if the primary was only
> partitioned from the network, or stuck in a long GC, a split brain occurs.
> Both primary and backup can service client requests.
> Fencing (http://en.wikipedia.org/wiki/Fencing_%28computing%29) ensures that
> this cannot happen. With fencing, the backup can close the WAL of the
> primary, and cause any subsequent attempt by the primary to write to the WAL
> to give an error.
> We fence a ledger whenever it is opened by another client using
> BookKeeper#openLedger. BookKeeper#openLedgerNoRecovery will not fence.
> The opening client marks the ledger as fenced in zookeeper, and then sends a
> readEntry message to all of the bookies with the DO_FENCING flag set. Once at
> least 1 bookie in each possible quorum of bookies has responded, we can
> proceed with opening the ledger. Any subsequent attempt to write to the
> ledger will fail as it will not be able to write to a quorum without one of
> the bookies in the quorum responding with a ledger fenced error. The client
> will also be unable to change the quorum without seeing that the ledger has
> been marked as fenced in zookeeper.
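
The "at least 1 bookie in each possible quorum" condition in the description
above can be sketched as a simple counting check. This is a sketch under a
stated assumption, not the patch's implementation: it treats any set of
quorumSize bookies out of the ensemble as a possible quorum, which is a
conservative simplification. FenceQuorumSketch, coversEveryQuorum and the
parameter names are hypothetical. Under that assumption, every possible
quorum contains a bookie that has acknowledged the fence exactly when fewer
than quorumSize bookies have not yet responded.

```java
// Hypothetical helper, not part of BookKeeper.
class FenceQuorumSketch {

    // Returns true when every possible set of quorumSize bookies (assumed to
    // be any subset of the ensemble) contains at least one bookie that has
    // acknowledged the fencing request.
    static boolean coversEveryQuorum(int ensembleSize, int quorumSize, int fencedResponses) {
        if (quorumSize < 1 || quorumSize > ensembleSize) {
            throw new IllegalArgumentException("quorumSize must be in [1, ensembleSize]");
        }
        if (fencedResponses < 0 || fencedResponses > ensembleSize) {
            throw new IllegalArgumentException("fencedResponses must be in [0, ensembleSize]");
        }
        // If fewer than quorumSize bookies are still silent, no quorum can be
        // formed entirely from bookies that have not been fenced.
        int unresponsive = ensembleSize - fencedResponses;
        return unresponsive < quorumSize;
    }

    public static void main(String[] args) {
        // Ensemble of 3 bookies, write quorum of 2.
        System.out.println("1 response enough:  " + coversEveryQuorum(3, 2, 1));
        System.out.println("2 responses enough: " + coversEveryQuorum(3, 2, 2));
    }
}
```

With this check the opening client does not have to wait for every bookie,
which matters when some bookies are down at the time of fencing.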