Hi Oli,

Inline.
On Tuesday, 17 May 2016, Olivier Lalonde <olalo...@gmail.com> wrote:
> Hi all,
>
> I am considering adopting an "event sourcing" architecture for a system I
> am developing and Kafka seems like a good choice of store for events.
>
> For those who aren't aware, this architecture style consists in storing all
> state changes of the system as an ordered log of events and building
> derivative views as needed for easier querying (using a SQL database for
> example). Those views must be completely derived from the event log alone
> so that the log effectively becomes a "single source of truth".
>
> I was wondering if anyone else is using Kafka for that purpose and more
> specifically:
>
> 1) Can Kafka store messages permanently?

No. Whilst you can tweak the config to get a very long retention period, this doesn't
work well with Kafka at all. Keeping data around forever has severe impacts on the
operability of your cluster. For example, if a machine fails, its replacement has to
catch up with vast quantities of data from its replicas. Currently we (Heroku Kafka)
restrict our customers to a maximum of 14 days of retention because of the operational
headaches anything longer brings. On your own cluster you *can* of course set retention
as high as you like; this is just anecdotal experience from a team that runs thousands
of clusters - infinite retention is an operational disaster waiting to happen.

Whilst Kafka does have a replay mechanism, it should mostly be thought of as a way of
handling other system failures. E.g. if the database you store indexed views in is
down, Kafka's replay and retention mechanisms mean you're not losing data whilst you
restore that database's availability.

What we typically suggest to customers who ask about this use case is to use Kafka as
a messaging system, but use e.g. S3 as the long term store. Kafka can help with
batching writes up to S3 (see e.g. Pinterest's Secor project), and act as a very high
throughput, durable, replicated messaging layer for communication. In this paradigm,
when you want to replay, you do so out of S3 until you've consumed the last offset
there, then start replaying out of and catching up with the small amount of remaining
data in Kafka. The replay logic there has to be hand rolled, as Kafka and its clients
have no knowledge of external stores.

Another thing to look at is Kafka's compacted topic mechanism. With compacted topics,
Kafka keeps the latest message for each key, making the topic act a little more like a
database table. Note that you still consume by offset - there's no "get the value for
key Y" operation. This also assumes that your keyspace stays tractably small and that
you're ok with keeping only the latest value per key. Compaction completely overrides
time based retention, so you have to "delete" keys or have a bounded keyspace if you
want to retain operational sanity with Kafka. I'd recommend reading the docs on
compacted topics, they cover the use cases quite well.

> 2) Let's say I throw away my derived view and want to re-build it from
> scratch, is it possible to consume messages from a topic from its very
> first message and once it has caught up, listen for new messages like it
> would normally do?

That's entirely possible: you can catch up from the first retained message and then
continue from there very easily. However, see above about infinite retention.
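To make that concrete, here's a minimal sketch using the Sarama Go client (since you
mentioned Go). The broker address and topic name are placeholders, it only reads
partition 0, and real code needs proper error handling and one consumer per partition:

    package main

    import (
        "fmt"
        "log"

        "github.com/Shopify/sarama"
    )

    func main() {
        // Placeholder broker address.
        consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, sarama.NewConfig())
        if err != nil {
            log.Fatal(err)
        }
        defer consumer.Close()

        // OffsetOldest starts from the first retained message; once the
        // consumer has caught up it simply keeps receiving new messages.
        pc, err := consumer.ConsumePartition("events", 0, sarama.OffsetOldest)
        if err != nil {
            log.Fatal(err)
        }
        defer pc.Close()

        for msg := range pc.Messages() {
            // Rebuild your derived view from each event here.
            fmt.Printf("offset %d: %s\n", msg.Offset, msg.Value)
        }
    }

The "catch up, then tail" behaviour falls out for free - the consumer doesn't
distinguish between old and new messages, it just reads forward from wherever you
point it.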
>
> 2) Does it support transactions? Let's say I want to push 3 messages
> atomically but the producer process crashes after sending only 2 messages,
> is it possible to "rollback" the first 2 messages (e.g. "all or nothing"
> semantics)?

No. Kafka at the moment only supports "at least once" semantics, and there are no
cross broker transactions of any kind. Implementing such a thing would likely have
huge negative impacts on the current performance characteristics of Kafka, which would
be an issue for many users.

>
> 3) Does it support request/response style semantics or can they be
> simulated? My system's primary interface with the outside world is an HTTP
> API so it would be nice if I could publish an event and wait for all the
> internal services which need to process the event to be "done"
> processing before returning a response.

In theory that's possible - the producer gets back the offset of the message it
produced, and you could check the latest offset of each consumer in your web request
handler (there's a rough sketch of the producer half of that in the PS at the bottom).
However, doing so is not going to work that well unless you're ok with your web
requests taking on the order of seconds to tens of seconds to fulfill. Kafka can do
low latency messaging reasonably well, but coordinating the offsets of many consumers
would likely have a huge latency impact, and writing that code and getting it to
handle failure correctly would be a lot of work (there's nothing like this in any of
the client libraries, because it is not a desirable or supported use case). Instead
I'd like to query *why* you need those semantics. What's the issue with just producing
a message, telling the user HTTP 200, and consuming it later?

>
> PS: I'm a Node.js/Go developer so when possible please avoid Java centric
> terminology.

Please note that the Node and Go clients are notably less mature than the JVM clients,
and that running Kafka in production means knowing enough about the JVM and Zookeeper
to handle them.

Thanks!

Tom Crayford
Heroku Kafka

>
> Thanks!
>
> - Oli
>
> --
> - Oli
>
> Olivier Lalonde
> http://www.syskall.com <-- connect with me!
>
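PS: since you mentioned Go, here's the producer half of what I described under 3) - a
minimal sketch only, again with the Sarama client, and with a placeholder broker
address, topic and payload rather than anything from your system. A synchronous
produce hands back the partition and offset of the written message; it's waiting for
every consumer group to pass that offset that gets slow, and that part isn't shown
because it would be hand rolled.

    package main

    import (
        "log"

        "github.com/Shopify/sarama"
    )

    func main() {
        config := sarama.NewConfig()
        // Wait for the full ISR to acknowledge before SendMessage returns.
        config.Producer.RequiredAcks = sarama.WaitForAll
        // The SyncProducer needs successes reported back to it.
        config.Producer.Return.Successes = true

        producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, config)
        if err != nil {
            log.Fatal(err)
        }
        defer producer.Close()

        partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
            Topic: "events",
            Value: sarama.StringEncoder(`{"type":"user_signed_up"}`),
        })
        if err != nil {
            log.Fatal(err)
        }

        // This is the offset you'd have to wait for every consumer group
        // to reach before answering the HTTP request.
        log.Printf("wrote to partition %d at offset %d", partition, offset)
    }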