Failure Guarantee and Expectations

Ashutosh Singh Fri, 27 Apr 2012 13:22:29 -0700

Folks,
I read Jun's paper, but I could not find enough detail on all the failure
scenarios. I am trying to use Kafka (or something else) for persistent
queues. A big requirement for me is not to lose any message that has been
persisted. I am ready to contribute code to add some sort of replication,
but I want to know where failures can happen.
1- What happens if ZooKeeper goes down and comes back up? What messages, if
any, do we lose, and what compensation, if any, do we need on the consumer
side?
2- What happens when the broker goes down?
    a- When the hard drive has a failure.
    b- When data is correctly written to disk but the process goes down and
is restarted.
    c- What will happen to consumers in the meantime?
    d- If we replicate the data, what reliability guarantees can we have?
    e- If CRC errors happen, can we pick up the record from another copy
saved somewhere?


There is deep interest in my group in this project, and if it fits our
needs we would like to run with it, both as users and as contributors.

Another question: why was the queue not built on Cassandra? Would that have
met sub-second latency SLAs?

Ashutosh Singh
