Jonathan,

On Thu, 14 Jan 2021 at 20:57, Jonathan Ellis <jbel...@apache.org> wrote:
> On 2021/01/11 08:31:03, Jack Vanlightly wrote:
> > Hi,
> >
> > I've recently modelled the BookKeeper protocol in TLA+ and can confirm that
> > once confirmed, that an entry is not replayed to another bookie. This
> > leaves a "hole" as the entry is now replicated only to 2 bookies, however,
> > the new data integrity check that Ivan worked on, when run periodically
> > will be able to repair that hole.
>
> Can I read from the bookie with a hole in the meantime, and silently miss
> data that it doesn't know about?

No, you cannot miss data: if the client cannot find a bookie that can answer with the entry, it receives an error.

The ledger has a known tail (the LastAddConfirmed entry), and this value is stored in the ledger metadata once the ledger is "closed".

While the ledger is still open, that is, while the writer is writing to it, the reader is allowed to read only up to the LastAddConfirmed (LAC) entry. This LAC value is returned to the reader through a piggyback mechanism, without reading from metadata. The reader cannot read beyond the latest position that has been confirmed to the writer by AQ (ack quorum) bookies.

There is a third case, the "recovery read". A reader starts a recovery read when it wants to recover a ledger that has been abandoned by a dead writer, when it is a new leader (Pulsar Bundle Owner), or when it wants to fence out the old leader.

In this case the reader merges the current status of the ledger in ZK with the result of a scan of the whole ledger. Basically, it reads the ledger from the beginning up to the tail, for as long as it is able to read entries. In doing so it sets the "fenced" flag on the ledger on every bookie, and it is also able to detect the actual tail of the ledger (because the writer died and was not able to flush metadata to ZK).
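To make the two ordinary read rules concrete (reads are bounded by the LAC, and a missing entry produces an error rather than a silent gap), here is a toy sketch in Python. It is not the BookKeeper client API: `read_entry`, the two exception classes and the dict-per-bookie model are all hypothetical names invented for illustration.

```python
class BeyondLacError(Exception):
    """Hypothetical: the reader asked for an entry past LastAddConfirmed."""


class EntryUnavailableError(Exception):
    """Hypothetical: no bookie in the write set could return the entry."""


def read_entry(entry_id, lac, write_set):
    """Toy model of a regular (non-recovery) read.

    write_set is a list of dicts, one per bookie, mapping entry id -> payload.
    This is an illustrative assumption, not how bookies store data.
    """
    # Rule 1: a reader may never read beyond the LAC.
    if entry_id > lac:
        raise BeyondLacError(f"entry {entry_id} is beyond LAC {lac}")

    # Rule 2: try each bookie in turn; a bookie with a "hole" simply
    # does not have the entry, so the reader moves on to the next one.
    for bookie in write_set:
        if entry_id in bookie:
            return bookie[entry_id]

    # No bookie answered: the reader gets an error, never a silent gap.
    raise EntryUnavailableError(f"entry {entry_id} not found on any bookie")


# Entry 1 was confirmed by two bookies but is missing on b3 (the "hole").
b1 = {0: b"a", 1: b"b"}
b2 = {0: b"a", 1: b"b"}
b3 = {0: b"a"}

# Even if b3 is asked first, the read falls through to b1.
payload = read_entry(1, lac=1, write_set=[b3, b1, b2])
```

In this model, asking only `b3` for entry 1 raises `EntryUnavailableError`, and asking for entry 2 while the LAC is 1 raises `BeyondLacError`.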
The recovery read fails if it is not possible to read every entry from at least AQ bookies (that is, it tolerates up to WQ-AQ read failures, where WQ is the write quorum size), and it does not risk "repairing" (truncating) the ledger if it does not find enough bookies.

I hope that helps

Enrico