Re: Using Kafka for Event Sourcing
Bumping this up, as I am not sure you received this email.
Using Kafka for Event Sourcing
Hi,

after having read http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying, I am considering Kafka for an application built around CQRS and Event Sourcing.

Disclaimer: I have read the documentation but do not have any experience with Kafka yet.

In that setup, the queried state is built by applying every event from the beginning. It is also important:
- that all events are ordered, at least per entity
- that all events are stored (no deletion), OR that events are compacted in such a way that the final state stays the same

Questions:
- I read that Kafka can delete events based on time or on disk usage. Is it possible to completely deactivate event deletion (without using log compaction, which is my next question)? Kafka can also compact the log (https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction and http://kafka.apache.org/documentation.html#compaction).
- How can we structure the events so that the final state stays the same after compaction? For example, if I have the following events:
  - create user 456
  - for user 456, set email email1@dns
  - for user 456, set email email2@dns
  log compaction should keep the user creation and the last email setting. Should I key the events like this (a sketch follows below this message)?
  - key user-456-creation: create user 456
  - key user-456-email-set: for user 456, set email email1@dns
  - key user-456-email-set: for user 456, set email email2@dns
- Can we provide custom log compaction logic?

If somebody is using Kafka for this purpose, I'd be glad to hear about your experience with it.

Cheers,
Yann
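(A minimal sketch of the keying scheme asked about above, assuming a topic hypothetically named user-events created with cleanup.policy=compact and plain string serialization. Compaction keeps the latest record per message key, so giving the creation event and the email-set events distinct keys would preserve the creation record plus the most recent email record.)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserEventProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address and topic name below are placeholders.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // On a topic with cleanup.policy=compact, compaction retains only the
        // latest record for each key: the single "creation" record survives,
        // and only the most recent "email-set" record survives.
        producer.send(new ProducerRecord<>("user-events", "user-456-creation",
                "create user 456"));
        producer.send(new ProducerRecord<>("user-events", "user-456-email-set",
                "for user 456, set email email1@dns"));
        producer.send(new ProducerRecord<>("user-events", "user-456-email-set",
                "for user 456, set email email2@dns"));

        producer.close();
    }
}

(One caveat: with the default partitioner, different keys can hash to different partitions, so per-entity ordering across user-456-creation and user-456-email-set is not guaranteed without a custom partitioner.)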
Re: Using Kafka for Event Sourcing
Hey Yann,

Yes, you can just make the retention infinite, which will disable any deletion.

What you describe with compaction might work, but that wasn't exactly the intention. This type of event logging can work two ways: you can log the command, or you can log the result of the command. In databases this is sometimes referred to as logical versus physical logging. The intention of Kafka's compaction feature is to support physical logging, where the last update contains the full aggregated state so far. So in your example the events we would expect would be:

456 = {id: 456}
456 = {id: 456, email: email1@dns}
456 = {id: 456, email: email2@dns}

A more verbose approach could even log the prior state, the current state, and the command (as some databases do).

Why do it this way? Kafka doesn't allow customizing the compaction. There are embeddable event stores (like http://geteventstore.com/) that allow plugging in custom business logic for compacting events. However, Kafka is built to run as a central service, not per application, so in that model deploying business logic into the central Kafka cluster every time you need to change your compaction logic is a non-starter.

There were a number of data systems at LinkedIn that worked off the log like this, but I can't give a good comparison to other CQRS systems since I haven't used any of them.

-Jay
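(To make the suggestion above concrete, here is a minimal sketch of the physical-logging style, assuming a compacted topic hypothetically named users and JSON-as-string values. cleanup.policy=compact, retention.ms=-1, and retention.bytes=-1 are the relevant topic-level settings for compaction plus no time- or size-based deletion.)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserStateProducerSketch {
    public static void main(String[] args) {
        // Assumed topic configuration (set when creating the topic):
        //   cleanup.policy=compact   -> keep the latest record per key
        //   retention.ms=-1          -> no time-based deletion
        //   retention.bytes=-1       -> no size-based deletion
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Physical logging: every record carries the full aggregated state of
        // the entity, keyed by the entity id. After compaction the one record
        // left for key "456" is the complete current state of user 456.
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456}"));
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456, \"email\": \"email1@dns\"}"));
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456, \"email\": \"email2@dns\"}"));

        producer.close();
    }
}

(Rebuilding the query side then means replaying the topic from the beginning; after compaction that replay yields the latest full state per entity rather than the complete history of changes.)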