Re: New prefetching algorithms

Andrus Adamchik Mon, 07 Sep 2009 13:56:07 -0700

Finally some good news on performance. After tweaking the prefetch strategies, I got the following test numbers on PostgreSQL, fetching/prefetching a few thousand objects (fewer milliseconds means faster processing):


(disjoint)
n:1 ... M6 ...... 51 ms
n:1 ... trunk ... 45 ms


(joint)
n:1 ... M6 ...... 100 ms
n:1 ... trunk ... 45 ms

(disjoint)
1:n ... M6 ...... 100 ms
1:n ... trunk ... 54 ms

(disjoint)
n:m ... M6 ...... 54 ms
n:m ... trunk ... 51 ms

So the trunk code significantly improves on 3.0M6 when prefetching to-many and joint to-one relationships, and somewhat improves on the other cases (within a margin of error I guess).


Andrus




On Sep 7, 2009, at 8:53 AM, Andrus Adamchik wrote:

Been thinking about the new prefetching model some more and found a glaring performance hole - the most common N:1 prefetch case will result in processing a cartesian product in memory. E.g. if one Artist has 3 Paintings, and the Paintings are fetched with Artist prefetch, the Artist DB data will be read 3 times. The result will be correct - 3 Paintings all pointing to a single Artist object, however processing will be much slower.
Now will be making another pass over the code to restore the old prefetch strategy for N:1 relationships. Hopefully the resulting code will be tighter than it used to be.
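
To make the cost concrete, here is a minimal Java sketch of the Artist/Painting example above. It is not Cayenne code: the Row class and countArtistReads method are hypothetical names used only for illustration, and the rows stand in for a result set in which every Painting row carries a copy of its Artist's columns.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JointPrefetchSketch {

    // Hypothetical flattened row: one Painting plus a copy of its Artist's columns.
    static final class Row {
        final int paintingId;
        final int artistId;
        final String artistName;

        Row(int paintingId, int artistId, String artistName) {
            this.paintingId = paintingId;
            this.artistId = artistId;
            this.artistName = artistName;
        }
    }

    // Returns {number of times Artist columns were read, number of distinct Artists}.
    static int[] countArtistReads(List<Row> rows) {
        Map<Integer, String> artistsById = new HashMap<>();
        int artistReads = 0;
        for (Row row : rows) {
            // Artist data is read for every Painting row, even when it repeats.
            artistReads++;
            // Only the first occurrence actually produces a new Artist.
            artistsById.putIfAbsent(row.artistId, row.artistName);
        }
        return new int[] { artistReads, artistsById.size() };
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(1, 10, "Artist A"),
                new Row(2, 10, "Artist A"),
                new Row(3, 10, "Artist A"));
        int[] counts = countArtistReads(rows);
        // Prints "3 artist reads, 1 distinct artist(s)".
        System.out.println(counts[0] + " artist reads, " + counts[1] + " distinct artist(s)");
    }
}
```

The end result is the same single Artist either way; the repeated reads are pure overhead, and they grow with the number of Paintings per Artist.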
Andrus


On Sep 6, 2009, at 9:43 PM, Andrus Adamchik wrote:
Good to have a little time again to hack Cayenne internals.
Just committed a pretty big change to the prefetching algorithm motivated by CAY-1250 bug report. So combining prefetching and inheritance now works 100%.
One visible effect of this change is that all disjoint prefetch queries will now include the IDs of the source side of the prefetch relationship and a mandatory join to the source entity. In return for this small inefficiency (increased result set size... hopefully most IDs are small), we get a bunch of benefits, the main one being the ability to process related fetched objects in a consistent manner regardless of the relationship semantics (1..1, 1..N, N..M). This strategy was used before for flattened relationships; now it is used for everything. On the other hand, this change made it possible to optimize some related cases, so all in all, there may be no performance penalty.
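
As a rough illustration of why carrying the source ID helps, the Java sketch below groups prefetched rows by that ID. It is a simplified assumption of the idea, not the committed Cayenne code: PrefetchRow and groupBySourceId are hypothetical names, and strings stand in for the related objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DisjointPrefetchSketch {

    // Hypothetical row of a disjoint prefetch query: the related (target) object
    // together with the ID of the source object it belongs to.
    static final class PrefetchRow {
        final int sourceId;
        final String target;

        PrefetchRow(int sourceId, String target) {
            this.sourceId = sourceId;
            this.target = target;
        }
    }

    // Groups related objects by source ID. The same loop serves 1..1, 1..N and
    // N..M relationships; only the size of each group differs.
    static Map<Integer, List<String>> groupBySourceId(List<PrefetchRow> rows) {
        Map<Integer, List<String>> bySource = new LinkedHashMap<>();
        for (PrefetchRow row : rows) {
            bySource.computeIfAbsent(row.sourceId, id -> new ArrayList<>()).add(row.target);
        }
        return bySource;
    }

    public static void main(String[] args) {
        List<PrefetchRow> rows = Arrays.asList(
                new PrefetchRow(1, "Painting A"),
                new PrefetchRow(1, "Painting B"),
                new PrefetchRow(2, "Painting C"));
        // Each source ID maps to the list of objects to attach to that source.
        System.out.println(groupBySourceId(rows));
    }
}
```

Because the grouping key is always the source ID, no relationship-specific branching is needed when connecting the related objects to their sources.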
It is still possible to go back and optimize it further to prevent the addition of the extra columns to the result set in some cases (e.g. if both joined FK and PK are present in the result, only fetch one of them). I wish we could do that in some central location (like SelectTranslator) instead of writing endless if/else in the prefetch processing code.
Now the prefetch code is easier to make sense of, with fewer if/else. And I am planning to refactor it further.
Also I came very close to fixing the biggest remaining limitation of disjoint prefetching:
https://issues.apache.org/jira/browse/CAY-1025

Andrus
