Log truncation and sync up when bookie fails and rejoins

Unmesh Joshi Tue, 28 Jan 2020 01:15:53 -0800

Hi,

When implementing a replicated log, a few requirements must be met to keep
the logs on multiple nodes in sync after partial failures such as a node
crash. E.g., in Raft, when a node fails and comes back, the leader syncs it
up by pushing all the missing entries and truncating any conflicting ones.
The same happens in Kafka, where followers start re-pulling data after
startup. Consistency needs to be maintained by tracking something like a
commitIndex or high-water mark: the index up to which a majority of the
servers have acknowledged the write.
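
To make the two ideas above concrete, here is a minimal Python sketch of a
majority-based commit index and a Raft-style truncate-and-append sync. It is
only an illustration under simplifying assumptions: entries are (term,
payload) tuples, the whole log is compared from the start, and the names
commit_index and sync_follower are hypothetical, not taken from Raft, Kafka,
or BookKeeper code.

```python
from typing import List, Tuple

# Hypothetical entry shape for illustration: (term, payload).
Entry = Tuple[int, str]


def commit_index(acked: List[int]) -> int:
    """Return the highest index acknowledged by a majority of servers.

    `acked` holds, per server, the last log index that server has
    acknowledged (the leader's own last index included).
    """
    ranked = sorted(acked, reverse=True)
    majority = len(ranked) // 2 + 1
    # The majority-th largest value is acknowledged by at least
    # `majority` servers.
    return ranked[majority - 1]


def sync_follower(leader: List[Entry], follower: List[Entry]) -> List[Entry]:
    """Simplified sync: keep the common prefix, drop the rest, copy the leader.

    Entries are compared by term at the same position. Everything in the
    follower's log from the first mismatch onward is truncated, and the
    leader's remaining entries are appended.
    """
    match = 0
    while (
        match < len(leader)
        and match < len(follower)
        and leader[match][0] == follower[match][0]
    ):
        match += 1
    return follower[:match] + leader[match:]
```

For example, with three servers that have acknowledged indexes 5, 3 and 4,
commit_index([5, 3, 4]) is 4, because two of the three servers have
acknowledged everything up to index 4.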


In Apache BookKeeper, I see that the consistency part is described well by
the last-add-confirmed explanation in the documentation below:
https://bookkeeper.apache.org/archives/docs/r4.4.0/bookkeeperProtocol.html

But who takes care of updating a particular bookie in case it crashes (or
is temporarily partitioned) and rejoins the cluster? Is there any
initialization protocol that is run at startup?
In Kafka, there is the Controller, which manages this initialization (and
in Raft it is the current leader that manages it).
I am going through the code to understand this, but any pointers or hints
would be very helpful.

Thanks,
Unmesh
