On 6/1/12 11:37, Niclas Hedhman wrote:
Gang,
I am contemplating the possibility of going the full distance with DDD
support and restrictions when it comes to Entities.
<snip>
Here's my general take on it. Modeling apps using DDD and the
Aggregate/Entity/Value breakdown I think works well for most apps.
Having entities output events, a la Greg Young, as their main result, is
generally the right way to go. This is just a continuation of what we
already have started, with separating entity store from querying, but
taking it to its logical conclusion.
Some thoughts/issues:
* When you work with aggregates instead of UoW, any usecase that
involves two Aggregates, such as "move child X from aggregate A to B",
has to involve a saga somehow. You would first send the command to A,
which would verify the removal and produce the event. A saga would then
consume that event and create a new command to B for adding X (see the
first sketch after this list). To keep it simple, that second step
should not be allowed to fail or throw exceptions. The move is hence
not atomic but eventually consistent, and it involves a bit of
infrastructure to handle. Without that infrastructure in place I don't
see using aggregates as being very realistic, since only simplistic
usecases that don't involve many aggregates are possible.
* If the above is done while taking advantage of the sharding
possibility that comes with using aggregates (an aggregate and all
subentities live on one server, i.e. you have a "document view" of the
whole thing), a "move" such as the above essentially involves taking
the child entity X, creating a VALUE for it in the command sent to B,
and then recreating it in B with new identities etc. With that you have
a fully shardable solution with no cross-server references, but it
relies on quite a bit of infrastructure to get going.
* If the aggregate store is events, a la Greg Young, where snapshots are
only created once in a while, then you need to consider how to handle
updates of the model. The pure way is to only do it through events, but
this is tricky and not a lot of people are used to it. It can be made
reasonably simple I think by using something like the Migration API
(which is data-centric), but that MUST be in place.
* If you go this route the emphasis would then also be on creating event
consumers that denormalize events into whatever read store you are using.
* If you go this route, then it becomes clear that the domain model is
ONLY used for processing commands, i.e. you don't create POJOs that you
use for both read AND write. This is a GOOD THING, as it just makes no
sense at all to go through an object model on view queries. I mean
seriously: make a database query, get the ids, load the objects, pick
out the stuff you need, and serialize to JSON/XML for the client,
versus make a database query and stream it to JSON/XML for the client.
As long as you are using a Query DSL for the query I don't see a
problem with that, and it would allow for much richer results involving
aggregates and such, which we just can't do with
queries-in-domain-model.
* On how to construct the domain model, I'm leaning towards DCI to a
large extent replacing how we use mixins now. You would load an entity
with state, and then add a DCI context and roles around it/them to
process the commands and the business rules. The entities would then
pretty much only be simple state, i.e. we could still use
interfaces + Property<>/Association<> as now, but they would have no
logic. Logic would be in DCI, which can be implemented using POJOs (see
the second sketch after this list). This simplifies testing as well, as
the interfaces can be implemented using TransientComposite for tests.
* Once you have all of this, and your main store is the event store, you
can plug in any number of NOSQL thingies as event consumers for
denormalized views, and you don't really have to rely on transactions or
blah blah in them for consistency, since that is already done at the
command level. All you need is an event consumer that feeds one or more
NOSQL stores for *view purposes only*. Awesome.
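To make the first point concrete, here's a rough sketch of how such a
cross-aggregate move could look. All the names (ChildRemoved, AddChild,
CommandBus, MoveChildSaga) are made up for illustration, this is not an
existing Qi4j API:

// Hypothetical event produced by aggregate A once it has verified and
// performed the removal. The child's state travels as a plain value,
// so B can recreate it with new identities.
class ChildRemoved
{
    final String sourceAggregateId;
    final String targetAggregateId;
    final String childSnapshot; // serialized value of child X

    ChildRemoved( String sourceAggregateId, String targetAggregateId, String childSnapshot )
    {
        this.sourceAggregateId = sourceAggregateId;
        this.targetAggregateId = targetAggregateId;
        this.childSnapshot = childSnapshot;
    }
}

// Hypothetical command sent to aggregate B in the second step.
class AddChild
{
    final String targetAggregateId;
    final String childSnapshot;

    AddChild( String targetAggregateId, String childSnapshot )
    {
        this.targetAggregateId = targetAggregateId;
        this.childSnapshot = childSnapshot;
    }
}

// Minimal command bus abstraction, purely for illustration.
interface CommandBus
{
    void send( Object command );
}

// The saga consumes the event from A and issues the follow-up command to B.
// This step must not fail or throw; the move as a whole is only
// eventually consistent.
class MoveChildSaga
{
    private final CommandBus commandBus;

    MoveChildSaga( CommandBus commandBus )
    {
        this.commandBus = commandBus;
    }

    void on( ChildRemoved event )
    {
        // Recreate X in B; B assigns new (aggregate-scoped) identities.
        commandBus.send( new AddChild( event.targetAggregateId, event.childSnapshot ) );
    }
}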
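And for the DCI point, a sketch of what "entities as pure state, logic
in roles" could look like. Again the names are made up, and I use a
local stand-in for Property<> just to keep the example self-contained:

// Stand-in for Qi4j's Property<>, just enough to make the sketch compile.
interface Property<T>
{
    T get();
    void set( T value );
}

// Entity state only: no logic, just properties. Could be an
// EntityComposite in production and a TransientComposite in tests.
interface OrderState
{
    Property<String> customerId();
    Property<Boolean> shipped();
}

// Hypothetical command handled in this context.
class ShipOrder
{
    final String orderId;
    ShipOrder( String orderId ) { this.orderId = orderId; }
}

// Hypothetical event produced when the command succeeds.
class OrderShipped
{
    final String orderId;
    OrderShipped( String orderId ) { this.orderId = orderId; }
}

// DCI context: wraps the loaded entity state in a role and runs the interaction.
class ShippingContext
{
    private final OrderState order;

    ShippingContext( OrderState order ) { this.order = order; }

    OrderShipped ship( ShipOrder command )
    {
        return new ShippableOrderRole( order ).ship( command );
    }
}

// The role is a plain POJO around the state; all business rules live here.
class ShippableOrderRole
{
    private final OrderState state;

    ShippableOrderRole( OrderState state ) { this.state = state; }

    OrderShipped ship( ShipOrder command )
    {
        if( Boolean.TRUE.equals( state.shipped().get() ) )
        {
            throw new IllegalStateException( "Order already shipped" );
        }
        state.shipped().set( true );
        return new OrderShipped( command.orderId );
    }
}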
That's my general take on it.
For your specific points:
* Entities are bound to an Aggregate, and the Aggregate has an
Aggregate Root, which is the only Entity within the Aggregate that is
globally reachable.
See above. Yes, with the condition that infrastructure around it is
needed for cross-aggregate stuff.
* Only changes within the Aggregate are atomic. Changes across
Aggregates are eventually consistent.
Agreed, as above.
* Invariants are declared on the Aggregate Root or assigned to the
Aggregate Root at assembly.
Yup.
* Aggregates are declared via @Aggregated annotation on Associations
and ManyAssociations.
Yup.
* The Aggregated entities' Identity is scoped by the Aggregate Root
(under the hood, Aggregate Root identity is prefixed to the aggregated
entity).
* When a non-Aggregated Association is traversed the retrieved Entity
is read-only.
Traversed for what reason is the question. Usually that will be for
query/view purposes, but as above: why not skip the model entirely,
just use a query DSL on the store, and stream the result to the client?
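Just to illustrate what I mean by "query, stream, done", here is a
rough sketch using plain JDBC against a read store. It is not any
existing Qi4j API, the names are made up, and JSON escaping is omitted
for brevity:

import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Stream a view query straight from the store to the client as JSON,
// without ever materializing domain objects.
class CaseListResource
{
    private final Connection connection; // read store connection, however it is obtained

    CaseListResource( Connection connection )
    {
        this.connection = connection;
    }

    void streamOpenCases( Writer out ) throws SQLException, IOException
    {
        String query = "SELECT id, description FROM cases WHERE status = ?";
        try( PreparedStatement stmt = connection.prepareStatement( query ) )
        {
            stmt.setString( 1, "OPEN" );
            try( ResultSet rs = stmt.executeQuery() )
            {
                out.write( "[" );
                boolean first = true;
                while( rs.next() )
                {
                    if( !first )
                    {
                        out.write( "," );
                    }
                    first = false;
                    // No domain objects: result row goes straight to the response
                    out.write( "{\"id\":\"" + rs.getString( "id" )
                               + "\",\"description\":\"" + rs.getString( "description" ) + "\"}" );
                }
                out.write( "]" );
            }
        }
    }
}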
Would then that mean that UnitOfWork is not needed at all?? The
Aggregate IS effectively the UnitOfWork, and obtaining an Aggregate
can be done directly on the EntityFactory/Module, and the aggregated
entities are created from the AggregateRoot. Various posts on the DDD
group also seem to suggest the same thing: IF you are modelling with
Aggregates, UnitOfWork should not exist.
Agreed. Takes more effort to model, or at least get used to, but
philosophically it's the right thing to do, given all we now know about
DDD and CQRS/EventSourcing.
In all, this seems to suggest that the whole persistence system can be
simplified, GoodThing(tm), yet with the Aggregates being the
Distribution boundary, Consistency boundary, Transaction boundary and
Concurrency boundary, I think we can obtain a more solid semantic
model for how things are expected to work, both locally as well as
distributed.
Agreed, see above. We provide the transaction boundary on the aggregate,
and make sure that isolation is done properly there. Once that is taken
care of, the rest becomes eventually consistent.
To add to the above, I would like to get in place an asynchronous
model for the Entity Store SPI as well;
* All changes to Entities are captured as Transitions.
Why not Events?
* Such transitions are pushed to the Entity Store SPI asynchronously.
Optimistic success, with callback for success/failures.
If Events from Aggregates are the main way to decide whether command
processing went well or not, I would make this part synchronous.
Processing these events to generate denormalized views would be
asynchronous though. This would be great! Right now, for example, I know
that a big performance issue in Streamflow is the updates to Sesame and
Solr as indexes upon transaction commit. With this, all of that would
architecturally be done asynchronously, and out of the critical path of
execution.
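As a sketch of what that asynchronous part could look like: command
handling returns once the events are stored, and a background consumer
applies them to the index afterwards. IndexUpdater and DomainEvent are
made-up names here, not the Streamflow or Qi4j API:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical domain event and index abstraction, for illustration only.
class DomainEvent
{
    final String entityId;
    final String name;

    DomainEvent( String entityId, String name )
    {
        this.entityId = entityId;
        this.name = name;
    }
}

interface IndexUpdater
{
    void apply( DomainEvent event ); // e.g. update Solr/Sesame documents
}

// Events are handed off only after the synchronous command/store step has
// completed, so index updates never sit in the critical path of execution.
class AsyncIndexer
{
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final IndexUpdater indexUpdater;

    AsyncIndexer( IndexUpdater indexUpdater )
    {
        this.indexUpdater = indexUpdater;
    }

    void onEventsStored( final List<DomainEvent> events )
    {
        executor.submit( new Runnable()
        {
            public void run()
            {
                for( DomainEvent event : events )
                {
                    indexUpdater.apply( event );
                }
            }
        } );
    }
}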
* Retrieval is likewise asynchronous. The request contains a callback
whereto deliver the transition stream.
Retrieval of an aggregate happens at the start of command processing.
You can definitely make that asynchronous, but what is the payoff? If
you are backing a REST API the response would have to be "command
received, processing" rather than worked/failed.
* Perhaps retrieval requests can be persistent, so that one can
register a Specification, which will continue to feed the callback
with all changes matching the specification. Not sure if this will be
useful though.
"Retrieval" is too vague. If we do separate between command processing
and view requests, then I (with what I know now) usually opt for
skipping the domain model COMPLETELY. Query, stream, done. The domain
model doesn't give me anything, other than schema help. But I can get
that, if I want to, by using query DSL's and composite interfaces for
view definition, or something like that.
This could also simplify the EntityStore SPI quite a bit, since the
only interface needed would be something like;
public interface EventStore<T extends Event>
{
    void save( Identity identity, Iterable<T> events, EventOutcome<T> handler );
    void load( Identity identity, EventRetriever<T> callback );
}
I would use Future<> as result for both of these, for outcome and
retriever, IF you want them to be asynch. I.e.
public interface EventStore
{
    Future<EventOutcome> save( Identity identity, Iterable<Event> events );
    EntityComposite load( Identity identity ); // Load given aggregate
}
This is for the purpose of command processing. For event consuming
there would be a different interface that allows for paging, either
over all events or per identity. EventOutcome above would be a POJO
rather than a callback.
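Something like this is what I have in mind for the consumer side. The
names are made up (the real one I link to further down is EventSource):

import java.util.List;

// Consumer-side view of the event store: paged access to events,
// either globally or per aggregate identity. Names are illustrative only.
interface EventFeed<T>
{
    // All events in store order, starting at the given offset.
    List<T> events( long offset, int maxCount );

    // Events for one aggregate, starting at the given sequence number.
    List<T> events( String aggregateIdentity, long fromSequence, int maxCount );
}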
public interface StateTransition extends Event // super interface for all ES transitions
{
    Identity entityIdentity();
    long sequenceNumber();
    DateTime timestamp();
}
Have a look at the event Value I did in the EventSourcing library:
https://github.com/Qi4j/qi4j-sdk/blob/develop/libraries/eventsourcing/src/main/java/org/qi4j/library/eventsourcing/domain/api/DomainEventValue.java
Since the aggregate outputs an iterable of events, you don't actually
put the timestamp into the event itself, but rather in a wrapper of that
iterable:
https://github.com/Qi4j/qi4j-sdk/blob/develop/libraries/eventsourcing/src/main/java/org/qi4j/library/eventsourcing/domain/api/UnitOfWorkDomainEventsValue.java
Notice that it also contains a bit of extra context, such as the
version of the app used to create it (helps with migration), the
usecase (helps with understanding the scope of the event), and who
triggered the usecase. With this, the EventStore becomes:
public interface EventStore
{
    Future<EventOutcome> save( AggregateEvents events ); // Contains identity, timestamp, and list of events
    Iterable<Event> load( Identity identity ); // Load events for given aggregate
}
Something like that. Again, the consumer of events would have a
different API, something like this:
https://github.com/Qi4j/qi4j-sdk/blob/develop/libraries/eventsourcing/src/main/java/org/qi4j/library/eventsourcing/domain/source/EventSource.java
IF the entity state is represented as a List of Transitions, the
"current state" must be rebuilt from these transitions, which seems to
suggest things will be much slower. This is probably true if the
number of modifications to a Property or Association is magnitudes
larger than the snapshot value, but only actual trials will tell what
can be expected, and how much will be serialization overhead versus
reconstruction of the snapshot state. A later optimization could be to
allow for a "snapshot", which the ES understands as a "temporal
starting point".
Having snapshot events is something I think we would need to have from
the start. In the Streamflow version the EntityStore and EventStore are
separated, but if you just join the two, with the introduction of
snapshot events, you're good to go.
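As a sketch of what load-with-snapshots could look like, with made-up
types: find the latest snapshot event and replay only what comes after
it.

import java.util.List;

// Hypothetical event hierarchy: a SnapshotEvent carries full state,
// ordinary events carry incremental changes.
abstract class StoredEvent
{
    final long sequenceNumber;
    StoredEvent( long sequenceNumber ) { this.sequenceNumber = sequenceNumber; }
}

class SnapshotEvent extends StoredEvent
{
    final String state; // serialized full state of the aggregate
    SnapshotEvent( long sequenceNumber, String state )
    {
        super( sequenceNumber );
        this.state = state;
    }
}

class AggregateLoader
{
    // Rebuild current state: start from the latest snapshot (if any),
    // then apply only the events that came after it.
    String load( List<StoredEvent> events )
    {
        String state = ""; // empty initial state
        int startIndex = 0;
        for( int i = events.size() - 1; i >= 0; i-- )
        {
            if( events.get( i ) instanceof SnapshotEvent )
            {
                state = ( (SnapshotEvent) events.get( i ) ).state;
                startIndex = i + 1;
                break;
            }
        }
        for( int i = startIndex; i < events.size(); i++ )
        {
            state = apply( state, events.get( i ) );
        }
        return state;
    }

    // Placeholder for applying one event to the state.
    String apply( String state, StoredEvent event )
    {
        return state + "|" + event.sequenceNumber;
    }
}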
The above is basically how I think a proper, modern, DDD-friendly app
building environment would look, and it would embrace NOSQL fully,
along with understanding how sharding and vertical scaling work.
/Rickard
_______________________________________________
qi4j-dev mailing list
[email protected]
http://lists.ops4j.org/mailman/listinfo/qi4j-dev