eolivelli commented on issue #2273: Bookie does not try to download ledger from 
another bookie
URL: https://github.com/apache/bookkeeper/issues/2273#issuecomment-595081016
 
 
   We finally understood the "problem": it is not strictly a bug in BookKeeper, 
but it is very unexpected behaviour that needs a workaround in the 
application, and a workaround is not always possible.
   
   The scenario is the following:
   - we have the usual leader/follower pattern for BK users
   - the leader keeps a list of "active ledgers" in the metadata service 
(ZooKeeper); together they form an unbounded stream of data
   - the leader creates a new ledger with WQ=2, gets the id and appends it to 
the list of "active ledgers"
   - the ledger is empty, no write has ever been issued to the bookies
   - bookies locally do not know anything about the ledger
   - now let's stop one of the two bookies (or partition it away from the 
client network, which is @hamadodene's case)
   - so the ledger metadata has an ensemble of bookie1 and bookie2, the ledger 
is in state OPEN, LAC = -1
   - bookie1 is up and running, but it doesn't hold any entry
   - bookie2 is unreachable from the client
   - the follower tries to open the ledger (no recovery), and boom!
   - the ledger is OPEN, but the follower reads "NoSuchLedger" from Bookie1 and 
gets a network error (BookieHandleNotAvailable or something like that) from 
Bookie2
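
The scenario above can be sketched with a toy model. This is not BookKeeper code: the `Reply` enum, `bookie_reply`, `follower_open` and the ledger id `7` are hypothetical names made up for illustration, and the rule "the open succeeds only if at least one bookie answers OK" is an assumption that only mirrors the outcome described in this comment, not the client's real quorum logic.

```python
from enum import Enum


class Reply(Enum):
    OK = "OK"
    NO_SUCH_LEDGER = "NoSuchLedger"
    UNAVAILABLE = "BookieHandleNotAvailable"


def bookie_reply(reachable, known_ledgers, ledger_id):
    """What a single bookie answers to a read for ledger_id (toy model)."""
    if not reachable:
        return Reply.UNAVAILABLE
    if ledger_id not in known_ledgers:
        # The bookie is up, but no write for this ledger ever reached it.
        return Reply.NO_SUCH_LEDGER
    return Reply.OK


def follower_open(replies):
    """Assumed rule: the open succeeds only if some bookie answers OK."""
    return any(r is Reply.OK for r in replies)


ledger_id = 7  # hypothetical id of the freshly created, still empty ledger
replies = [
    bookie_reply(True, set(), ledger_id),   # bookie1: up, holds nothing
    bookie_reply(False, set(), ledger_id),  # bookie2: unreachable
]
print([r.value for r in replies], follower_open(replies))
```

Neither reply tells the follower that the ledger is simply empty, which is why the open fails in this model.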
   
   It looks like even a recovery read is not possible in this case.
   
   The workaround is to write an entry to the ledger, and block until it is 
successfully acknowledged, before adding the ledger to the list of "active 
ledgers". This way you are sure that each bookie knows about the ledger and 
does not answer NoSuchLedger.
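
A minimal sketch of that ordering, using in-memory stand-ins rather than the BookKeeper client. The `Bookie` class, `write_and_wait`, the `active_ledgers` list (standing in for the list kept in ZooKeeper) and the ledger id `7` are all hypothetical; the sketch assumes QA = WQ = 2.

```python
class Bookie:
    """In-memory stand-in for a bookie (not the real server)."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.ledgers = {}  # ledger_id -> list of entries

    def add_entry(self, ledger_id, entry):
        """Store the entry and acknowledge it, unless unreachable."""
        if not self.reachable:
            return False
        self.ledgers.setdefault(ledger_id, []).append(entry)
        return True

    def knows(self, ledger_id):
        return self.reachable and ledger_id in self.ledgers


def write_and_wait(ensemble, ledger_id, entry, ack_quorum):
    """Send the entry to every bookie; fail unless ack_quorum of them ack."""
    acks = sum(b.add_entry(ledger_id, entry) for b in ensemble)
    if acks < ack_quorum:
        raise RuntimeError("write not acknowledged by enough bookies")


bookie1, bookie2 = Bookie("bookie1"), Bookie("bookie2")
active_ledgers = []  # stands in for the list stored in ZooKeeper
ledger_id = 7

# 1. Write a first entry and wait for both acknowledgements (QA = WQ = 2).
write_and_wait([bookie1, bookie2], ledger_id, b"marker", ack_quorum=2)
# 2. Only then publish the ledger to the followers.
active_ledgers.append(ledger_id)

# bookie2 goes away afterwards; bookie1 still knows the ledger.
bookie2.reachable = False
print(bookie1.knows(ledger_id))
```

The point of the ordering is that a follower can never see a ledger id in the list before every bookie in the ensemble has stored something for it.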
   
   With QA < WQ this workaround does not work: the write of entry 0 may 
not be acknowledged by Bookie1 (the one running during the open), yet the 
client still considers it successfully written, because Bookie2 was up and 
running at the time of the write and its acknowledgement alone satisfies QA.
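
The quorum argument can be checked by enumeration. In this sketch (hypothetical helper names, no BookKeeper API involved), a write is reported as successful as soon as any `ack_quorum` bookies of the ensemble acknowledge it, so the bookies that are guaranteed to hold entry 0 are those present in every such set.

```python
from itertools import combinations


def possible_ack_sets(ensemble, ack_quorum):
    """All minimal sets of bookies whose acks make the client report success."""
    return [set(c) for c in combinations(ensemble, ack_quorum)]


def guaranteed_holders(ensemble, ack_quorum):
    """Bookies that hold the entry in every case where the write succeeded."""
    return set.intersection(*possible_ack_sets(ensemble, ack_quorum))


ensemble = ["bookie1", "bookie2"]
print(guaranteed_holders(ensemble, 2))  # QA = WQ = 2
print(guaranteed_holders(ensemble, 1))  # QA = 1 < WQ = 2
```

With QA = WQ both bookies are guaranteed holders. With QA = 1 the guaranteed set is empty, so a successful write says nothing about whether bookie1 knows the ledger.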
   
   cc @hamadodene @aluccaroni @ivankelly @fpj @jvrao @sijie 
