> I'm thinking, any bookie failure in the inprogress ledger will enter into the
> race situation, not only the last ensemble of the ledger.
>
> Consider the example of the following open/inprogress ledger:
> L00001
> 0 - A B C
> 10 - A B D
> 11 - A B E
> Say the ReplicationWorker (RW) has chosen this ledger L00001 to recover. Now
> assume D has rejoined, only C is not running.
> So the RW will re-replicate and update the metadata. This will lead to the
> race condition, as we end up with two writers for the same ledger L00001, and
> cause a BadVersionException in the actual writer bk client. Even though we are
> rereading and checking metadata.resolveConflict(), this will find a data
> mismatch and finally fail the bkclient.
>
> In general, what I understood is that any update to the inprogress ledger by
> the RW would result in a BadVersionException in the client, resulting in NN
> switching.
>
> Also, an ensemble reformation of an inprogress ledger by the bkclient (actual
> writer) would cause a BadVersionException on the ReplicationWorker side.
> I think we need to consider this case while designing the ReplicationWorker
> thread.

Both these cases should be fine. Both handleBookieFailure and CloseOp will retry if they see that the ledger metadata has been updated. For ReplicationWorker, I assume it'll use the mechanism which is in BookKeeperAdmin now. This also rereads and retries if it gets a bad version exception. resolveConflict will only find a data mismatch if the start entry of an ensemble has changed, not if the configuration of the ensemble has changed.

-Ivan
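
To make the reread-and-retry behaviour concrete, here is a minimal, self-contained sketch in Java. It is a toy model and not the BookKeeper code: `Store`, `Metadata`, `conflicts` and `replaceBookie` are hypothetical names invented for illustration, and the conflict rule is simply the one stated above (a mismatch only when the set of ensemble start entries differs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MetadataRetrySketch {
    /** Hypothetical snapshot: a version plus ensembles keyed by start entry id. */
    static final class Metadata {
        final int version;
        final TreeMap<Long, List<String>> ensembles;
        Metadata(int version, Map<Long, List<String>> ensembles) {
            this.version = version;
            this.ensembles = new TreeMap<>(ensembles);
        }
    }

    static final class BadVersionException extends Exception {}

    /** Hypothetical stand-in for the versioned metadata store. */
    static final class Store {
        private Metadata current;
        Store(Metadata initial) { current = initial; }
        synchronized Metadata read() { return current; }
        /** Conditional write: succeeds only if the caller saw the latest version. */
        synchronized void write(Map<Long, List<String>> ensembles, int expectedVersion)
                throws BadVersionException {
            if (current.version != expectedVersion) throw new BadVersionException();
            current = new Metadata(expectedVersion + 1, ensembles);
        }
    }

    /** Assumed rule: only a change in the start entries counts as a mismatch. */
    static boolean conflicts(Metadata local, Metadata stored) {
        return !local.ensembles.keySet().equals(stored.ensembles.keySet());
    }

    /** Replace a bookie, rereading and retrying whenever the write hits a bad version. */
    static boolean replaceBookie(Store store, Metadata seen, String failed,
                                 String replacement, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            TreeMap<Long, List<String>> next = new TreeMap<>();
            for (Map.Entry<Long, List<String>> e : seen.ensembles.entrySet()) {
                List<String> bookies = new ArrayList<>(e.getValue());
                bookies.replaceAll(b -> b.equals(failed) ? replacement : b);
                next.put(e.getKey(), bookies);
            }
            try {
                store.write(next, seen.version);
                return true;
            } catch (BadVersionException ex) {
                Metadata latest = store.read();          // reread
                if (conflicts(seen, latest)) return false; // start entries changed
                seen = latest;                            // same fragments: retry
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        TreeMap<Long, List<String>> initial = new TreeMap<>();
        initial.put(0L, Arrays.asList("A", "B", "C"));
        initial.put(10L, Arrays.asList("A", "B", "D"));
        initial.put(11L, Arrays.asList("A", "B", "E"));
        Store store = new Store(new Metadata(0, initial));

        Metadata stale = store.read();                   // RW reads the metadata
        TreeMap<Long, List<String>> byWriter = new TreeMap<>(initial);
        byWriter.put(11L, Arrays.asList("A", "B", "G")); // writer swaps E for G
        store.write(byWriter, 0);                        // RW's snapshot is now stale

        System.out.println(replaceBookie(store, stale, "C", "F", 3));
        System.out.println(store.read().ensembles);
    }
}
```

In this model the second writer changes only the membership of the ensemble starting at entry 11, so the stale update gets a bad version, rereads, sees the same start entries, and succeeds on the retry. If the concurrent update had added or removed a start entry instead, `conflicts` would report a mismatch and the update would give up.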
