Re: [qi4j-dev] Entity persistence

Kent Sølvsten Fri, 16 Oct 2009 05:00:00 -0700

Loading policies based on usecases would be really awesome, and beyond cool!!

Maybe a compromise solution could be to separate persistence based on the mixins in the EntityComposite? So the state for each mixin is stored separately. The data in a mixin will probably tend to be used together, at least in a good design. And it should cause far fewer fragments than storing each property separately.

I even think it makes sense conceptually, since the basic atom in QI4J is the fragment, to store each mixin as a single unit. Not sure if any technical issues make it difficult/impossible.
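
To make the per-mixin idea concrete, here is a minimal sketch, assuming an in-memory map as the backing store. MixinGranularStore and its methods are hypothetical names for illustration, not Qi4j API; the only point is that one stored fragment corresponds to one mixin of one entity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Qi4j API: state is stored per (entity, mixin) pair,
// so each mixin's properties are read and written together as one fragment.
final class MixinGranularStore {
    private final Map<String, Map<String, Object>> fragments = new HashMap<>();

    private static String key(String entityId, String mixinType) {
        return entityId + "#" + mixinType;
    }

    // Stores a defensive copy of the state belonging to one mixin of one entity.
    void saveMixin(String entityId, String mixinType, Map<String, Object> state) {
        fragments.put(key(entityId, mixinType), new HashMap<>(state));
    }

    // Loads only the requested mixin; other mixins of the entity stay untouched.
    Map<String, Object> loadMixin(String entityId, String mixinType) {
        Map<String, Object> state = fragments.get(key(entityId, mixinType));
        return state == null ? new HashMap<>() : new HashMap<>(state);
    }

    // Number of stored fragments: one per mixin, not one per property.
    int fragmentCount() {
        return fragments.size();
    }
}
```

An entity with two mixins and three properties in total would produce two fragments here, instead of three with per-property storage or one with a single blob.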

Another point is, the EntityStore might want to get called the first time a part of a composite is being accessed regardless of whether the state of that part is already loaded or not, to allow it to learn about usage patterns for the current usecase, even if all state for the composite is currently eagerly fetched. Reducing the granularity to mixins instead of properties/associations could reduce the number of parts of a composite, and thus the number of callbacks a lot!
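
A rough sketch of such a first-access callback, assuming the runtime reports every access and the store side filters out repeats. UsageLearner, partAccessed and partsToPreload are made-up names, not existing Qi4j API, and the "already reported" set would presumably be scoped to a UnitOfWork in practice.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Qi4j API: learns which parts (e.g. mixins) a usecase touches.
final class UsageLearner {
    private final Map<String, Set<String>> partsByUsecase = new HashMap<>();
    private final Set<String> alreadyReported = new HashSet<>();
    private int callbacks = 0;

    // Called on every access; only the first access of a given part of a given
    // entity counts as a callback, whether or not its state was already loaded.
    void partAccessed(String usecase, String entityId, String part) {
        if (!alreadyReported.add(entityId + "#" + part)) {
            return; // not the first access, nothing new to learn
        }
        callbacks++;
        partsByUsecase.computeIfAbsent(usecase, u -> new LinkedHashSet<>()).add(part);
    }

    // Parts observed so far for this usecase, i.e. candidates for pre-loading.
    Set<String> partsToPreload(String usecase) {
        Set<String> seen = partsByUsecase.get(usecase);
        return seen == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(seen);
    }

    int callbackCount() {
        return callbacks;
    }
}
```

With mixins as the unit, an entity with 10 mixins causes at most 10 callbacks per load, however many properties each mixin holds.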

The drawback is, of course, that each association/property will now need to know which mixin it is a part of.

Any ideas how to let the EntityStore know whether it is being called because someone wants to load an entity, or because another part of the composite is being accessed? And how will the EntityStore know whether the requested part of the composite is already loaded? Would it be possible to store the info in the EntityReference?

On a sidenote: What is the proper way to handle partially loaded entities when exiting a UnitOfWork? Should state not yet loaded be lazy-loaded on exit, or should the client be prepared for LazyLoadExceptions? If state is lazy loaded on exit, it will probably be an incentive to use value objects.


/Kent

On 16/10/2009, at 08.25, Rickard Öberg wrote:

On 2009-10-16 12.25, Niclas Hedhman wrote:
Isn't this the equivalent of "Loading Policy" in JDO/JPA ??
Pretty much, yes I think so. Except aren't those on structural rather than usecase level? I.e. regardless of usecase "if you load property1 always load property2, property3 at the same time".
I think this is a "per EntityStore type issue" more than it is a
client code issue, although the client code will have a huge influence
which loading strategy is the best for a given EntityStore.
Agree.
So, I tend to lean towards introducing "Loading Policy" in the MapES
first, and perhaps even my favorite approach of 'self learning'
against a use-case, i.e. the ES will internally keep track of which
properties (and possibly associations) that a particular use-case
uses, and pre-load those upon a request.
That's an interesting idea, and it's probably the easiest to use and gives the best results. So basically the client developer doesn't have to care.
Another issue that I can see is that of performance in for instance
JDBM... I would assume that making many key lookups is relatively
expensive, and we would need to look at "Lookup/Size speed ratio",
i.e. how many bytes larger blob in the single lookup is required cost
wise for each new key-lookup?
I think there are two issues: one is the expense per lookup, and one is that with more key->value mappings the database is going to be bigger, so will consume more disk space. The indices will also be bigger and thus slower.
What would be interesting is to have the store work on two levels: one is the identity lookup, and then the next level would be property(/association) lookup. If one index can have only identity lookup and another has the per-property lookup, it should be much faster.
And that is needed at the particular
MapES implementation, as the values for JDBM would be dramatically
different from a network based. But, you probably realize that reading
the Blob is one step and creating the Property instances is another
with its own overhead, potentially very substantial, and here it is
probably a fixed time per instance, i.e. per first use -->  create.
Right, so one thing we can do *today* is to change so that the blob read does not do the property instantiation eagerly. There's no need to as far as I can tell. That on its own will make a huge difference.
Some of my entities right now have like 10-15 interfaces already, each with its own state (typically 1-5 properties), and having to load and instantiate all of that just to read one property seems like a massive performance hit. With lazy-instantiation of properties a big part of the problem goes away.
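
As an illustration of lazy instantiation (only a sketch, assuming the blob has already been decoded into a name->value map): property objects are created on first use and cached. SimpleProperty and LazyEntityState are hypothetical stand-ins, not the actual Qi4j classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a property instance; not the Qi4j Property type.
final class SimpleProperty {
    final String name;
    final Object value;

    SimpleProperty(String name, Object value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical sketch: the blob is read once, property objects are created lazily.
final class LazyEntityState {
    private final Map<String, Object> rawValues;                 // decoded blob contents
    private final Map<String, SimpleProperty> instantiated = new HashMap<>();

    LazyEntityState(Map<String, Object> rawValues) {
        this.rawValues = new HashMap<>(rawValues);
    }

    // Creates the property instance on first access and reuses it afterwards.
    SimpleProperty property(String name) {
        if (!rawValues.containsKey(name)) {
            throw new IllegalArgumentException("Unknown property: " + name);
        }
        return instantiated.computeIfAbsent(name, n -> new SimpleProperty(n, rawValues.get(n)));
    }

    // How many property instances have actually been created so far.
    int instantiatedCount() {
        return instantiated.size();
    }
}
```

Reading one property of an entity with 30 properties then costs one blob read plus one instantiation, rather than 30.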
The remaining question is whether having only an id->blob mapping like today is fine, or whether introducing extra fragmentation would be useful. One obvious issue is how to keep it all in sync. With the two-level solution it should be fine, since the first lookup should get {id,timestamp,version,app_version} and all such metadata about the entity as a whole, whereas the second level lookup would access the actual data. But I agree, that needs to be benchmarked.
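
The two-level layout could look roughly like the following. This is a sketch under assumptions: both indices are plain in-memory maps, only version and timestamp are kept as metadata (app_version is left out for brevity), and TwoLevelStore and EntityMetadata are invented names rather than anything in the MapES code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical metadata record for the first-level (identity) lookup.
final class EntityMetadata {
    final long version;
    final long timestamp;

    EntityMetadata(long version, long timestamp) {
        this.version = version;
        this.timestamp = timestamp;
    }
}

// Hypothetical sketch: one index for identity -> metadata, one for per-property data.
final class TwoLevelStore {
    private final Map<String, EntityMetadata> identityIndex = new HashMap<>();
    private final Map<String, Object> propertyIndex = new HashMap<>();
    private int propertyLookups = 0;

    // Writes the properties and bumps the entity-wide version in the identity index.
    void put(String id, long timestamp, Map<String, Object> properties) {
        EntityMetadata old = identityIndex.get(id);
        long nextVersion = (old == null) ? 1 : old.version + 1;
        identityIndex.put(id, new EntityMetadata(nextVersion, timestamp));
        for (Map.Entry<String, Object> e : properties.entrySet()) {
            propertyIndex.put(id + "#" + e.getKey(), e.getValue());
        }
    }

    // First level: entity-wide metadata only, never touches the property index.
    EntityMetadata lookupIdentity(String id) {
        return identityIndex.get(id);
    }

    // Second level: the actual data for a single property.
    Object lookupProperty(String id, String property) {
        propertyLookups++;
        return propertyIndex.get(id + "#" + property);
    }

    int propertyLookups() {
        return propertyLookups;
    }
}
```

Keeping the version in the identity index is what would let a real store detect that per-property data is out of sync with the entity as a whole; whether this beats a single id->blob lookup is exactly what needs benchmarking.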
I think it is a complex area, with room for significant speed
improvements in large entities, but I also think that the answers will
surprise us once we start measuring.
Any hints on what the surprise might be?
We need a Performance Expert :-) who can dedicate himself/herself to
chasing performance in ES and Indexing/Query.
That would be very nice, yes.

/Rickard

_______________________________________________
qi4j-dev mailing list
[email protected]
http://lists.ops4j.org/mailman/listinfo/qi4j-dev
