[ https://issues.apache.org/jira/browse/ZOOKEEPER-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015314#comment-13015314 ]
Flavio Junqueira commented on ZOOKEEPER-1001:
---------------------------------------------

@Rob
{quote}
Can a client reading an open ledger from bookie B ever be confident that it will see committed record N short of the ledger being closed? If the answer is No, then it seems that even if an oracle tells me that N is committed, then my reading client must survey a quorum of bookies to be certain of finding record N.
{quote}

An oracle tells you that all entries up to N have been committed (written to a quorum), so the client knows that it is safe to read up to N. To find any given entry, we follow the same order scheme as the writer does (for efficiency), so the client does not have to scan all bookies to find an entry; your observation about reading from a quorum is correct. However, it is important to note that quorums in BookKeeper are not necessarily majorities.

{quote}
And a survey of a quorum of bookies would also find the record that commits N, obviating the need for any oracle. If the answer is Yes, the reading client will see record N without the complexity of consulting the oracle if only the reader is a little patient.
{quote}

This is not correct. The fact that a client finds a copy of an entry in one bookie does not necessarily mean that the entry has been committed. Not finding it in a quorum also does not mean that it hasn't been committed, since one bookie might be unavailable at the time of the query. There has to be agreement somewhere in the system about the committed entries, and so far we have done it through zookeeper when a ledger closes.

{quote}
Anyway, for HDFS, why not just start a new ledger every minute, and have the standby server only read from closed ledgers? (Sixty seconds of latency.)
{quote}

I don't see any important problem with this solution, aside from the overhead of creating many ledgers over time.

@dhruba
{quote}
is the writing of a ledger entry atomic (for a single bookie)?
For example, if an application writes a ledger entry to a certain bookie and a concurrent reader tries to read that entry (even if that entry is not committed to all bookies), is it possible that only a part of that ledger entry is visible to the reader?
{quote}

The interface to a bookie is simple: if you ask for an entry of a given ledger and the entry exists, then the bookie will return it. That's the main reason why we need to make sure that the reader only requests committed entries.

> Read from open ledger
> ---------------------
>
>                 Key: ZOOKEEPER-1001
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1001
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Junqueira
>         Attachments: zk-1001-design-doc.pdf, zk-1001-design-doc.pdf
>
>
> The BookKeeper client currently does not allow a client to read from an open
> ledger. That is, if the creator of a ledger is still writing to it (and the
> ledger is not closed), then an attempt to open the same ledger for reading
> will execute the code to recover the ledger, assuming that the ledger has not
> been correctly closed.
> It seems that there are applications that do require the ability to read from
> a ledger while it is being written to, and the main goal of this jira is to
> discuss possible implementations of this feature.
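
To make the reader-side rule from the comment concrete (only request entries up to the last committed entry N, and locate each entry with the same placement scheme the writer used), here is a minimal sketch in Python. It is a toy model, not the BookKeeper client API: the `Bookie` class, `replicas_for`, `read_committed`, and the round-robin placement rule are all hypothetical names and assumptions made for illustration, and `last_committed` stands in for whatever the oracle reports.

```python
from typing import Dict, List, Optional


class Bookie:
    """Toy in-memory bookie (hypothetical): returns an entry if it has one."""

    def __init__(self) -> None:
        self.entries: Dict[int, bytes] = {}
        self.available = True

    def add(self, entry_id: int, data: bytes) -> None:
        self.entries[entry_id] = data

    def read(self, entry_id: int) -> Optional[bytes]:
        if not self.available:
            raise ConnectionError("bookie unavailable")
        return self.entries.get(entry_id)


def replicas_for(entry_id: int, ensemble_size: int, quorum_size: int) -> List[int]:
    """Assumed placement: entry i is striped to bookies i, i+1, ... (mod ensemble)."""
    return [(entry_id + k) % ensemble_size for k in range(quorum_size)]


def read_committed(ensemble: List[Bookie], quorum_size: int,
                   last_committed: int) -> List[bytes]:
    """Read entries 0..last_committed, never anything beyond that point."""
    result = []
    for entry_id in range(last_committed + 1):
        data = None
        # Only the bookies the writer would have used are consulted.
        for idx in replicas_for(entry_id, len(ensemble), quorum_size):
            try:
                data = ensemble[idx].read(entry_id)
            except ConnectionError:
                continue  # this replica is down, try the next one
            if data is not None:
                break
        if data is None:
            raise LookupError(f"no reachable replica returned entry {entry_id}")
        result.append(data)
    return result


if __name__ == "__main__":
    # 4 bookies, quorum of 2: a quorum here is not a majority of the ensemble.
    ensemble = [Bookie() for _ in range(4)]
    for i in range(4):  # entries 0..3 reach their full quorum (committed)
        for idx in replicas_for(i, 4, 2):
            ensemble[idx].add(i, b"e%d" % i)
    # Entry 4 reaches only one bookie, so it is present but not committed.
    ensemble[replicas_for(4, 4, 2)[0]].add(4, b"e4")
    ensemble[1].available = False  # one bookie is down at read time
    print(read_committed(ensemble, quorum_size=2, last_committed=3))
```

In this model entry 4 exists on one bookie, yet the reader never asks for it because it lies beyond `last_committed`. That mirrors the point in the comment: finding a copy on a single bookie says nothing about whether the entry was committed.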