Hi everyone, I'd like to add a somewhat similar use case: going back to a specific offset. (Maybe this will be addressed by the time-based indexing in KAFKA-87. By the way, are any of the upcoming features documented?)

Let's say I want to design a fault-tolerant system around Kafka's ability to replay messages from a specific offset. A chain of consumers reads messages from Kafka, and the last one dumps data into some database. What I want to achieve is this: if any of the consumers fails, a monitoring system detects the failure and replays messages from a specific offset.

How this can be achieved:

1) Instead of having the consumer that reads from Kafka update ZK with the latest offset, I have the _last_ node in the consumer chain, the one that writes to the DB, update ZK with an offset.
2) A monitoring system detects node failures along the entire consumer chain.
3) If the monitoring system detects a failure condition, the consumer reading from Kafka resets to the last "successful" offset written into ZK.

So far, I believe this differs from 0.6 functionality in the following ways:

- The default KafkaConsumer is the one that updates ZK on a regular basis. Instead, I would like the ability to update ZK from a node other than the one directly consuming from Kafka.
- I would like the ability for the KafkaConsumer to reset to the last offset in ZK (or maybe some arbitrary offset). I know the SimpleConsumer can be given an offset, but I don't want to lose the nice auto load balancing features in the smart Consumer.

Any thoughts?

Many thanks.

On Thu, Sep 22, 2011 at 8:22 AM, Taylor Gautier <tgaut...@tagged.com> wrote:

> I think we are going to go with Kafka itself hopefully if the code isn't too
> hard to update - it already has everything we need, we just change it's
> receive message operation from:
>
> receive msg => write into log file
>
> receive msg => write into log file and get offset, write offset into index
> log file
>
> where index log file is just another topic that contains 64 bit offsets of
> the original log file.
>
> of course with batching and sendfile calls this may be trickier than we
> anticipate….
>
> On Thu, Sep 22, 2011 at 8:17 AM, Jeffrey Damick <jeffreydam...@gmail.com> wrote:
>
> > This was something i've asked for in the past as well. To neha's comment,
> > sounds like you'd need some kind of table to maintain the list of offsets
> > and which segment they live in?
> >
> > On Thu, Sep 22, 2011 at 9:07 AM, Chris Burroughs <chris.burrou...@gmail.com> wrote:
> >
> > > On 09/21/2011 10:06 PM, Taylor Gautier wrote:
> > > > I see that kafka-87 addresses this with a request for having a time based
> > > > index, this would be relatively useful, but I also would like to have a way
> > > > to go back say 1,000 messages. Other than walking backwards one segment at
> > > > a time, can then scanning forward from there, do you have any suggestions
> > > > how this might be done or is it also a feature request?
> > >
> > > Could you elaborate a little on your use case where you need to rewind
> > > by a fixed number of messages?

--
*Evan Chan*
Senior Software Engineer | e...@ooyala.com | (650) 996-4600
www.ooyala.com | blog <http://www.ooyala.com/blog> | @ooyala <http://www.twitter.com/ooyala>
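
P.S. To make the three-step proposal at the top of this mail a bit more concrete, here is a rough in-memory sketch of the behaviour I have in mind. It does not use any real Kafka or ZooKeeper client API: `OffsetStore` stands in for the ZK node, a plain Python list stands in for a partition log, and every class and function name (`ReplayableConsumer`, `DbSink`, `run_chain`, `crash_on_batch`) is hypothetical. It also assumes DB writes are keyed by offset so that replayed messages overwrite rather than duplicate.

```python
class OffsetStore:
    """In-memory stand-in for the ZK node that holds the last good offset."""

    def __init__(self):
        self._committed = 0

    def commit(self, offset):
        self._committed = offset

    def last_committed(self):
        return self._committed


class ReplayableConsumer:
    """Head of the chain: reads the log but never commits offsets itself."""

    def __init__(self, log, store):
        self.log = log
        self.store = store
        self.position = store.last_committed()
        self.resets = 0

    def poll(self):
        """Return (offset, message), or None when the log is exhausted."""
        if self.position >= len(self.log):
            return None
        item = (self.position, self.log[self.position])
        self.position += 1
        return item

    def reset_to_committed(self):
        """Step 3: rewind to the last offset the sink confirmed."""
        self.position = self.store.last_committed()
        self.resets += 1


class DbSink:
    """Tail of the chain: writes to the 'DB', then commits (step 1)."""

    def __init__(self, store):
        self.store = store
        self.rows = {}

    def write(self, offset, value):
        self.rows[offset] = value      # keyed by offset, so replays are idempotent
        self.store.commit(offset + 1)  # next offset to read after a reset


def run_chain(consumer, sink, batch_size=3, crash_on_batch=None):
    """Drive the chain; crash_on_batch simulates one mid-chain failure."""
    batch_no = 0
    while True:
        in_flight = []
        for _ in range(batch_size):
            item = consumer.poll()
            if item is None:
                break
            in_flight.append(item)
        if not in_flight:
            return
        batch_no += 1
        if batch_no == crash_on_batch:
            # Only the first message reaches the DB before the middle node dies;
            # the rest of the batch is lost in flight.
            offset, value = in_flight[0]
            sink.write(offset, value.upper())
            # Step 2 (monitoring) is implied here; step 3 is the reset.
            consumer.reset_to_committed()
            crash_on_batch = None
            continue
        for offset, value in in_flight:
            sink.write(offset, value.upper())
```

Running `run_chain` over a five-message log with `crash_on_batch=1` still leaves all five rows in the sink, because the consumer re-reads everything after the last offset the sink committed. The point of the sketch is only that the commit and the reset happen on different nodes, which is what I don't think the smart consumer allows today.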