Hi kafka,

I've started implementing simple materialized views on top of the log
compaction feature to test it out, and it works great. I'll share the
code and an accompanying article shortly, but first I wanted to discuss
some of the production implications of this sandbox.

I've separated the project into two components:

- An HTTP API which reads from an in-memory cache (in this case Redis)
and produces mutations to a Kafka topic (a rough producer sketch
follows this list)
- A worker which consumes the stream and materializes the view in Redis.
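
To make the first component concrete, the producer side boils down to
something like the snippet below. This is only a simplified sketch in
Python using kafka-python, not the code I'll publish; names such as
ENTITY_TOPIC are placeholders. Mutations are keyed by entity id so that
compaction keeps the latest state per entity, and deletes are sent as
null-value tombstones:

    import json

    from kafka import KafkaProducer

    ENTITY_TOPIC = "entities"   # placeholder topic name

    producer = KafkaProducer(bootstrap_servers="localhost:9092")

    def publish_upsert(entity_id, entity):
        # Create/update: key by entity id, value is the full entity state.
        producer.send(ENTITY_TOPIC,
                      key=entity_id.encode("utf-8"),
                      value=json.dumps(entity).encode("utf-8"))

    def publish_delete(entity_id):
        # Delete: a null value acts as a tombstone for log compaction.
        producer.send(ENTITY_TOPIC,
                      key=entity_id.encode("utf-8"),
                      value=None)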

I have a single entity type, so the materialization is a very simple
process: it maintains a set of all entity keys and stores each entity's
content under its own key. In Redis, a create or update maps to a SADD
and a SET, and a delete maps to a SREM and a DEL.

I'm now considering the production implications this has and have a few
questions:

- How do you typically handle worker start-up? Do you always start at
offset 0 to make sure the view is correctly recreated? (A naive
sketch of what I mean follows these questions.)
- How do you handle topology changes in the consumer group, which lead
to a redistribution of keys across consumers?
- Is there a reliable mechanism to know that the log is being
reconsumed, so the client layer can be told about it?
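
For the first question, the naive approach I have in mind is to rewind
every partition to the earliest offset on start-up and replay the whole
compacted log, roughly like this (sketch only, again with kafka-python
and manual partition assignment; the "entities" topic name is a
placeholder):

    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers="localhost:9092",
                             enable_auto_commit=False)

    # Assign all partitions of the topic, then rewind to the earliest
    # available offset so the compacted log is replayed in full and
    # the Redis view is rebuilt from scratch.
    partitions = [TopicPartition("entities", p)
                  for p in consumer.partitions_for_topic("entities")]
    consumer.assign(partitions)
    consumer.seek_to_beginning(*partitions)

    for msg in consumer:
        pass  # materialize as in the worker sketch above

I'm wondering whether this is what people actually do, or whether there
is a more idiomatic way.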

Congrats on getting log compaction in; this feature opens up a ton of
reliability improvements for us :-)

  - pyr
