Hey,

Just wanted to make a few notes on the division of responsibility in the new persistence SPI, since it has changed radically from before.

Basically there will be three layers (instead of two): state, type, and instance. Instance is the UoW layer, as before; state is the EntityStore's responsibility, and this is the part that will be pluggable. The type layer will be managed by a new thingy called EntityRegistry, which sits between the UoW and the EntityStore. Any migration hooks will live in there.
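To make the layering a bit more concrete, here is a rough sketch of where the new piece would sit (purely illustrative; the method below is hypothetical, not the actual API):

// Instance layer: the UoW works with typed entity instances.
// Type layer:     EntityRegistry knows the EntityTypes and holds the migration hooks.
// State layer:    the EntityStore only sees untyped state (the pluggable part).
public interface EntityRegistry
{
    // Hypothetical: resolve type info for a stored reference; a natural place
    // for schema-migration hooks when the stored type no longer matches.
    EntityType getEntityType(EntityTypeReference reference);
}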

To illustrate the difference between the layers, here is the representation of a property in each of them:
Instance: Property<String> foo();
Type: QualifiedName(mypackage.myinterface.foo)
State: StateName(hashed QualifiedName) = "Bar"

The Property is short-lived, as it is local to a particular UoW. The QualifiedName in the type layer lives as long as the application is up. After a restart the property might have been refactored, and hence have a new QualifiedName. The StateName = "Bar" entry represents the actual stored value, and has a lifespan equal to that of the object, or until the next schema migration.

In the state layer we expect that values will be stored using a hash (a secure hash, like SHA-1) of the property's QualifiedName rather than the name itself. So, if "foo" has hash value AFEB3241 then on disk we store AFEB3241="Bar". This ensures that the next time we load the entity we know which version of the EntityType this property belonged to. If the EntityType has changed we can do schema migration properly.
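Just to make the hashing concrete, here's a minimal sketch (the exact scheme, i.e. algorithm, truncation and encoding, isn't settled, so the helper below is illustrative only):

import java.security.MessageDigest;

public class StateNameHashing
{
    // Illustrative only: derive a stable state key from the qualified property name.
    public static String hashOf(String qualifiedName) throws Exception
    {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(qualifiedName.getBytes("UTF-8"));
        StringBuilder hex = new StringBuilder();
        for(byte b : digest)
        {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception
    {
        // "mypackage.myinterface.foo" always hashes to the same hex string,
        // and the store then keeps <hash>="Bar" rather than foo="Bar".
        System.out.println(hashOf("mypackage.myinterface.foo"));
    }
}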

Since the hashes might be fairly long it is perfectly ok for the EntityStore to replace them with e.g. a number, so that in the above example it might be stored physically as 32="Bar". It is up to the EntityStore to manage the mapping between "AFEB3241" and 32.
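For that mapping, something as simple as this (hypothetical, store-internal) interning table would do:

import java.util.HashMap;
import java.util.Map;

// Illustrative only: replaces long hash strings with compact numeric keys
// for physical storage; the EntityStore owns and persists this table.
public class StateKeyInterner
{
    private final Map<String, Integer> keys = new HashMap<String, Integer>();
    private int nextKey = 1;

    public synchronized int keyFor(String hash)
    {
        Integer key = keys.get(hash);
        if(key == null)
        {
            key = nextKey++;
            keys.put(hash, key); // e.g. "AFEB3241" -> some small number like 32
        }
        return key;
    }
}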

Because of this separation of responsibilities, the EntityState interface changes to something like this (not the finished version):
public interface EntityState
{
    // Identity and bookkeeping
    EntityReference identity();
    long version();
    long lastModified();
    void remove();
    EntityStatus status();

    // The EntityTypes this state is known to carry
    void addEntityType(EntityTypeReference type);
    void removeEntityType(EntityTypeReference type);
    boolean hasEntityType(EntityTypeReference type);
    Set<EntityTypeReference> entityTypes();

    // Properties, keyed by StateName rather than QualifiedName
    Object getProperty(StateName stateName);
    void setProperty(StateName stateName, Object newValue);

    // Associations, likewise keyed by StateName
    EntityReference getAssociation(StateName stateName);
    void setAssociation(StateName stateName, EntityReference newEntity);

    ManyAssociationState getManyAssociation(StateName stateName);

    void hasBeenApplied();

    ValueState newValueState(Map<QualifiedName, Object> values);
}
Instead of using QualifiedName, which includes the class name and the property name (both of which can change over time), the methods above use StateName, which includes both the name of the property and its hashed name (there's a small sketch of StateName after the example below). Quick&Dirty stores will use the name, but "real" stores should use the hashed name to store the value, which ensures that it can be retrieved later on even if the QualifiedName of a property has changed. EntityTypeReference likewise contains the hashed name of the EntityType(s) rather than only the name. Typical storage of a single entity in a hashmap-oriented store hence becomes:
id=123
version=5
lastModified=<somedate>
entityTypes=<hash of type 1>,<hash of type 2>,<hash of type 3, etc.>
AFEB3241="Bar"
---
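For completeness, the StateName mentioned above can be pictured as a small value object along these lines (hypothetical layout, just to show the name/hash pairing):

// Illustrative only: StateName pairs the plain property name with its hashed form,
// so Quick&Dirty stores can key on name() while "real" stores key on hash().
public final class StateName
{
    private final String name; // e.g. "foo"
    private final String hash; // e.g. "AFEB3241"

    public StateName(String name, String hash)
    {
        this.name = name;
        this.hash = hash;
    }

    public String name() { return name; }
    public String hash() { return hash; }
}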
When the instance in the UoW wants to load a property it will go through a TypedEntity, which is a decorator of the EntityState, and do pretty much the same as before:
typedEntity.getProperty(qualifiedName);

The TypedEntity translates the QualifiedName into a StateName and then calls EntityState:
entityState.getProperty(stateName);
which can do the lookup in the above hashmap.
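In code, that translation step could look roughly like this (field and method names are hypothetical; TypedEntity's real shape isn't fixed yet):

import java.util.Map;

// Illustrative only: TypedEntity decorates an EntityState and translates
// type-level QualifiedNames into state-level StateNames before delegating.
public class TypedEntity
{
    private final EntityState entityState;
    private final Map<QualifiedName, StateName> stateNames; // supplied by the type layer

    public TypedEntity(EntityState entityState, Map<QualifiedName, StateName> stateNames)
    {
        this.entityState = entityState;
        this.stateNames = stateNames;
    }

    public Object getProperty(QualifiedName qualifiedName)
    {
        StateName stateName = stateNames.get(qualifiedName);
        return entityState.getProperty(stateName); // plain lookup in the store's map
    }
}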

Apart from making persistence more change-tolerant, this should also make it much easier to implement EntityStores. There's a lot of type-related code in the Neo4j store today, for example, which I think would simply go away; the implementation becomes pretty much a straight wrapper around the underlying node, since all type-related code would sit in Qi4j.

The above isn't all of it, but it's a very important part. Do y'all think it makes sense? Any potential problems with it? The main issue I can see right now is the mapping stores, which need access to the type info somehow; that will have to be solved.

/Rickard
