Hi, I'm trying to write a system that takes a stream of many small events (an N-way split of a Kafka topic over partitions) and compresses them into one larger interval, which is then flushed to a write-through cache. For consistency, I'd like to turn off autocommit on the partition and only advance the read offset when I actually flush the aggregated result down to the cache. For durability, I'd also like to record the offset as part of the record I flush.
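To make the intended ordering concrete, here is a minimal sketch of the flush-then-commit pattern, with the Kafka consumer and the cache replaced by in-memory stand-ins (the `Event`, `Aggregate`, `flush`, and offset-map names are all illustrative, not part of any Kafka API). The point is step 2 happening strictly after step 1, and the flushed record carrying the offset it covers:

```scala
object FlushThenCommit {
  // One small incoming event, tagged with its partition and offset.
  case class Event(partition: Int, offset: Long, value: Int)
  // The aggregated interval; lastOffset is recorded for durability.
  case class Aggregate(sum: Int, count: Int, lastOffset: Long)

  // Stand-in for the write-through cache (keyed by partition).
  val cache = scala.collection.mutable.Map[Int, Aggregate]()
  // Stand-in for manually managed consumer offsets, one per partition.
  val committed = scala.collection.mutable.Map[Int, Long]()

  def flush(partition: Int, events: Seq[Event]): Unit = {
    val agg = Aggregate(events.map(_.value).sum, events.size, events.last.offset)
    cache(partition) = agg                     // 1. flush aggregate (offset embedded)
    committed(partition) = agg.lastOffset + 1  // 2. only then advance the read offset
  }

  def main(args: Array[String]): Unit = {
    val events = (0L until 10L).map(o => Event(0, o, 1))
    flush(0, events)
    println(cache(0))     // Aggregate(10,10,9)
    println(committed(0)) // 10
  }
}
```

With a real consumer, step 2 would be a manual commit call with autocommit disabled in the consumer config; on restart, the offset stored inside the cached record lets you cross-check (or rebuild) the committed position, which is the durability property described above.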
But I'm having trouble figuring out exactly how to do this with the Scala API. It seems the client code can't easily get access to this data. I know that the Hadoop consumer does it (according to the other thread on offsets and message lengths), but I can't figure out how. Any advice on how to do this? I thought I could use ZookeeperConsumerStats, but that class seems to have been removed.

-- 
Dave Fayram
dfay...@gmail.com