Hi Oli,

Inline.
On Tuesday, 17 May 2016, Olivier Lalonde <olalo...@gmail.com> wrote:
> Hi all,
>
> I am considering adopting an "event sourcing" architecture for a system I
> am developing and Kafka seems like a good choice of store for events.
>
> For those who aren't aware, this architecture style consists in storing all
> state changes of the system as an ordered log of events and building
> derivative views as needed for easier querying (using a SQL database for
> example). Those views must be completely derived from the event log alone
> so that the log effectively becomes a "single source of truth".
>
> I was wondering if anyone else is using Kafka for that purpose and more
> specifically:
>
> 1) Can Kafka store messages permanently?

No. Whilst you can tweak the config to get a very long retention period, this doesn't
work well with Kafka at all. Keeping data around forever has severe impacts on the
operability of your cluster. For example, if a machine fails, its replacement has to
catch up with vast quantities of data from its replicas. Currently we (Heroku Kafka)
restrict our customers to a maximum of 14 days of retention because of the operational
headaches anything longer brings. On your own cluster you *can* of course set retention
as high as you like; this is just anecdotal experience from a team that runs thousands
of clusters - infinite retention is an operational disaster waiting to happen.

Whilst Kafka does have a replay mechanism, it should mostly be thought of as a way of
handling other system failures. E.g. if the database you store indexed views in is
down, Kafka's replay and retention mechanisms mean you're not losing data whilst you
restore that database's availability.

What we typically suggest to customers who ask about this use case is to use Kafka as
a messaging system, but use e.g. S3 as the long term store. Kafka can help with
batching writes up to S3 (see e.g. Pinterest's Secor project), and act as a very high
throughput, durable, replicated messaging layer for communication. In this paradigm,
when you want to replay, you do so out of S3 until you've consumed the last offset
there, then start replaying out of and catching up with the small amount of remaining
data in Kafka. The replay logic there has to be hand rolled, as Kafka and its clients
have no knowledge of external stores.

Another thing to look at is Kafka's compacted topic mechanism. With compacted topics,
Kafka keeps the latest message for each key, making the topic act a little more like a
database table. Note that you still consume by offset - there's no "get the value for
key Y" operation. This also assumes that your keyspace stays tractably small and that
you're ok with keeping only the latest value per key. Compaction completely overrides
time based retention, so you have to "delete" keys or have a bounded keyspace if you
want to retain operational sanity with Kafka. I'd recommend reading the docs on
compacted topics, they cover the use cases quite well.

> 2) Let's say I throw away my derived view and want to re-build it from
> scratch, is it possible to consume messages from a topic from its very
> first message and once it has caught up, listen for new messages like it
> would normally do?

That's entirely possible: you can catch up from the first retained message and then
continue from there very easily. However, see above about infinite retention.
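To make that concrete, here's a minimal sketch using the Sarama Go client (since you
mentioned Go). The broker address and topic name are placeholders, it only reads
partition 0, and real code needs proper error handling and one consumer per partition:

    package main

    import (
        "fmt"
        "log"

        "github.com/Shopify/sarama"
    )

    func main() {
        // Placeholder broker address.
        consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, sarama.NewConfig())
        if err != nil {
            log.Fatal(err)
        }
        defer consumer.Close()

        // OffsetOldest starts from the first retained message; once the
        // consumer has caught up it simply keeps receiving new messages.
        pc, err := consumer.ConsumePartition("events", 0, sarama.OffsetOldest)
        if err != nil {
            log.Fatal(err)
        }
        defer pc.Close()

        for msg := range pc.Messages() {
            // Rebuild your derived view from each event here.
            fmt.Printf("offset %d: %s\n", msg.Offset, msg.Value)
        }
    }

The "catch up, then tail" behaviour falls out for free - the consumer doesn't
distinguish between old and new messages, it just reads forward from wherever you
point it.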
>
> 2) Does it support transactions? Let's say I want to push 3 messages
> atomically but the producer process crashes after sending only 2 messages,
> is it possible to "rollback" the first 2 messages (e.g. "all or nothing"
> semantics)?

No. Kafka at the moment only supports "at least once" semantics, and there are no
cross broker transactions of any kind. Implementing such a thing would likely have
huge negative impacts on the current performance characteristics of Kafka, which would
be an issue for many users.

>
> 3) Does it support request/response style semantics or can they be
> simulated? My system's primary interface with the outside world is an HTTP
> API so it would be nice if I could publish an event and wait for all the
> internal services which need to process the event to be "done"
> processing before returning a response.

In theory that's possible - the producer gets back the offset of the message it
produced, and you could check the latest offset of each consumer in your web request
handler (there's a rough sketch of the producer half of that in the PS at the bottom).
However, doing so is not going to work that well unless you're ok with your web
requests taking on the order of seconds to tens of seconds to fulfill. Kafka can do
low latency messaging reasonably well, but coordinating the offsets of many consumers
would likely have a huge latency impact, and writing that code and getting it to
handle failure correctly would be a lot of work (there's nothing like this in any of
the client libraries, because it is not a desirable or supported use case). Instead
I'd like to query *why* you need those semantics. What's the issue with just producing
a message, telling the user HTTP 200, and consuming it later?

>
> PS: I'm a Node.js/Go developer so when possible please avoid Java centric
> terminology.

Please note that the Node and Go clients are notably less mature than the JVM clients,
and that running Kafka in production means knowing enough about the JVM and Zookeeper
to handle them.

Thanks!

Tom Crayford
Heroku Kafka

>
> Thanks!
>
> - Oli
>
> --
> - Oli
>
> Olivier Lalonde
> http://www.syskall.com <-- connect with me!
>
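PS: since you mentioned Go, here's the producer half of what I described under 3) - a
minimal sketch only, again with the Sarama client, and with a placeholder broker
address, topic and payload rather than anything from your system. A synchronous
produce hands back the partition and offset of the written message; it's waiting for
every consumer group to pass that offset that gets slow, and that part isn't shown
because it would be hand rolled.

    package main

    import (
        "log"

        "github.com/Shopify/sarama"
    )

    func main() {
        config := sarama.NewConfig()
        // Wait for the full ISR to acknowledge before SendMessage returns.
        config.Producer.RequiredAcks = sarama.WaitForAll
        // The SyncProducer needs successes reported back to it.
        config.Producer.Return.Successes = true

        producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, config)
        if err != nil {
            log.Fatal(err)
        }
        defer producer.Close()

        partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
            Topic: "events",
            Value: sarama.StringEncoder(`{"type":"user_signed_up"}`),
        })
        if err != nil {
            log.Fatal(err)
        }

        // This is the offset you'd have to wait for every consumer group
        // to reach before answering the HTTP request.
        log.Printf("wrote to partition %d at offset %d", partition, offset)
    }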