Just thought of a way to do transactions in Kafka. I think this solution would cover the most common types of transactions, but it's always useful to run an idea by a second set of eyes, and I'd like to know where the holes are in this design that I haven't been able to see. If you're interested in transactional Kafka, please review this and let me know any feedback you have.

A transactional topic can be approximated by using a second topic as a control stream. Each message in the control topic would contain the offset and length of one transaction in the data topic (and an optional transaction ID). There is no change to the messages written to the data topic. The performance impact would generally be low; the larger the transaction, the smaller the relative overhead.

To write a transaction to the data partition:
1. Note the end offset of the partition in memory.
2. Write all your messages to the partition.
3. Note the new offset at the end of the partition (to calculate the length).
4. Write the transaction offset+length into the control partition.

To read a set of committed data from the data stream:
1. Read the transaction from the control stream.
2. Start reading at the offset stored in the transaction, and continue until you've read the specified length of data.

If the producer crashes at any point, the written data will remain in the data partitions, but the transaction will not be written to the control topic, which will prevent those messages from being read by any transactional reader.

The assumptions and side-effects of this design are as follows:
1. The control topic mirrors the data topic in terms of brokers and partitions.
2. Each partition can only be fed by a single producer at any given time.
3. The offset at the end of the partition is available to a consumer.
4. Each transaction involves an extra message, so performance for very small transactions will not be ideal.
5. Rolled-back data remains in each individual partition.
6. A single partition can have more than one consumer (with all consumers coordinated by a single control partition reader).

Thanks in advance,
Tom Brown
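
P.S. For anyone who wants to poke at the idea, here is a rough Python sketch of the write and read paths described above. It does not use any Kafka client; the partitions are plain in-memory lists, and every name in it (Partition, Commit, write_transaction, read_committed, crash_before_commit) is made up for illustration. It also assumes offsets and lengths are counted in messages, which may not match how a real broker reports offsets.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Commit:
    """Control-topic message: where a transaction starts and how long it is."""
    offset: int
    length: int
    txn_id: Optional[str] = None


class Partition:
    """Hypothetical in-memory stand-in for one partition (append-only log)."""

    def __init__(self) -> None:
        self.log: List[Any] = []

    def end_offset(self) -> int:
        # Assumption: offsets are message indexes, not byte positions.
        return len(self.log)

    def append(self, message: Any) -> None:
        self.log.append(message)

    def read(self, offset: int, length: int) -> List[Any]:
        return self.log[offset:offset + length]


def write_transaction(data, control, messages, txn_id=None,
                      crash_before_commit=False):
    """Single-producer write path: data first, control record last."""
    start = data.end_offset()              # note the end offset
    for message in messages:               # write all messages
        data.append(message)
    if crash_before_commit:                # simulated producer crash
        return None
    length = data.end_offset() - start     # new end offset gives the length
    commit = Commit(start, length, txn_id)
    control.append(commit)                 # publish offset+length
    return commit


def read_committed(data, control):
    """Transactional reader: only return ranges named in the control log."""
    committed = []
    for commit in control.log:
        committed.extend(data.read(commit.offset, commit.length))
    return committed


if __name__ == "__main__":
    data, control = Partition(), Partition()
    write_transaction(data, control, ["a", "b"], txn_id="t1")
    write_transaction(data, control, ["x"], txn_id="t2", crash_before_commit=True)
    write_transaction(data, control, ["c"], txn_id="t3")
    print(read_committed(data, control))   # ['a', 'b', 'c']
    print(data.log)                        # ['a', 'b', 'x', 'c']
```

The uncommitted message "x" stays in the data partition but is skipped by the transactional reader, since no control record covers its offset. The sketch relies on the single-producer assumption: with two interleaved writers, the range between the two noted offsets could include someone else's messages.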