Re: Using Kafka for Event Sourcing
Bumping this up, as I am not sure you received this email.
Using Kafka for Event Sourcing
Hi,

after having read http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying, I am considering Kafka for an application built around CQRS and Event Sourcing.

Disclaimer: I have read the documentation but do not have any experience with Kafka yet.

In that setup, the queried state is built by applying every event from the beginning. It is also important:
- that all events are ordered, at least per entity
- that all events are stored (no deletion), OR that events are compacted in such a way that the final state stays the same

Questions:
- I read that Kafka can delete events based on time or on disk usage. Is it possible to completely deactivate event deletion (without using log compaction, which is my next question)? Kafka can also compact the log (https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction and http://kafka.apache.org/documentation.html#compaction).
- How can we structure the events so that the final state stays the same after compaction? For example, if I have the following events:
  - create user 456
  - for user 456, set email email1@dns
  - for user 456, set email email2@dns
  log compaction should keep the user creation and the last email setting. Should I key the events like this (a sketch follows below this message)?
  - key user-456-creation: create user 456
  - key user-456-email-set: for user 456, set email email1@dns
  - key user-456-email-set: for user 456, set email email2@dns
- Can we provide custom log compaction logic?

If somebody is using Kafka for this purpose, I'd be glad to hear about your experience with it.

Cheers,
Yann
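(A minimal sketch of the keying scheme asked about above, assuming a topic hypothetically named user-events created with cleanup.policy=compact and plain string serialization. Compaction keeps the latest record per message key, so giving the creation event and the email-set events distinct keys would preserve the creation record plus the most recent email record.)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserEventProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address and topic name below are placeholders.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // On a topic with cleanup.policy=compact, compaction retains only the
        // latest record for each key: the single "creation" record survives,
        // and only the most recent "email-set" record survives.
        producer.send(new ProducerRecord<>("user-events", "user-456-creation",
                "create user 456"));
        producer.send(new ProducerRecord<>("user-events", "user-456-email-set",
                "for user 456, set email email1@dns"));
        producer.send(new ProducerRecord<>("user-events", "user-456-email-set",
                "for user 456, set email email2@dns"));

        producer.close();
    }
}

(One caveat: with the default partitioner, different keys can hash to different partitions, so per-entity ordering across user-456-creation and user-456-email-set is not guaranteed without a custom partitioner.)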
Re: Using Kafka for Event Sourcing
Hey Yann,

Yes, you can just make the retention infinite, which will disable any deletion.

What you describe with compaction might work, but that wasn't exactly the intention. This type of event logging can work two ways: you can log the command, or you can log the result of the command. In databases this is sometimes referred to as logical versus physical logging. The intention of Kafka's compaction feature is to support physical logging, where the last update contains the full aggregated state so far. So in your example the events we would expect would be:

456 = {id: 456}
456 = {id: 456, email: email1@dns}
456 = {id: 456, email: email2@dns}

A more verbose approach could even log the prior state, the current state, and the command (as some databases do).

Why do it this way? Kafka doesn't allow customizing the compaction. There are embeddable event stores (like http://geteventstore.com/) that allow plugging in custom business logic for compacting events. However, Kafka is built to run as a central service, not per application, so in that model deploying business logic into the central Kafka cluster every time you need to change your compaction logic is a non-starter.

There were a number of data systems at LinkedIn that worked off the log like this, but I can't give a good comparison to other CQRS systems since I haven't used any of them.

-Jay
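(To make the suggestion above concrete, here is a minimal sketch of the physical-logging style, assuming a compacted topic hypothetically named users and JSON-as-string values. cleanup.policy=compact, retention.ms=-1, and retention.bytes=-1 are the relevant topic-level settings for compaction plus no time- or size-based deletion.)

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class UserStateProducerSketch {
    public static void main(String[] args) {
        // Assumed topic configuration (set when creating the topic):
        //   cleanup.policy=compact   -> keep the latest record per key
        //   retention.ms=-1          -> no time-based deletion
        //   retention.bytes=-1       -> no size-based deletion
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        // Physical logging: every record carries the full aggregated state of
        // the entity, keyed by the entity id. After compaction the one record
        // left for key "456" is the complete current state of user 456.
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456}"));
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456, \"email\": \"email1@dns\"}"));
        producer.send(new ProducerRecord<>("users", "456",
                "{\"id\": 456, \"email\": \"email2@dns\"}"));

        producer.close();
    }
}

(Rebuilding the query side then means replaying the topic from the beginning; after compaction that replay yields the latest full state per entity rather than the complete history of changes.)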