Finally, some good news on performance. After tweaking the prefetch strategies, I got the following test numbers on PostgreSQL, fetching/prefetching a few thousand objects (fewer milliseconds means faster processing):

(disjoint)
n:1 ... M6 ...... 51 ms
n:1 ... trunk ... 45 ms

(joint)
n:1 ... M6 ...... 100 ms
n:1 ... trunk ... 45 ms

(disjoint)
1:n ... M6 ...... 100 ms
1:n ... trunk ... 54 ms

(disjoint)
n:m ... M6 ...... 54 ms
n:m ... trunk ... 51 ms

So the trunk code significantly improves on 3.0M6 when prefetching disjoint 1:n and joint to-one relationships (roughly 2x faster in both cases: 100 ms down to 54 ms and 45 ms respectively), and improves slightly in the other cases (within the margin of error, I guess).

Andrus




On Sep 7, 2009, at 8:53 AM, Andrus Adamchik wrote:

Been thinking about the new prefetching model some more and found a glaring performance hole: the most common N:1 prefetch case results in Cartesian product processing in memory. E.g. if one Artist has 3 Paintings, and the Paintings are fetched with an Artist prefetch, the Artist DB data will be read 3 times, once per Painting row. The result will be correct (3 Paintings all pointing to a single Artist object), but processing will be much slower.
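To make the repeated reads concrete, here is a minimal sketch in plain Java. It is not Cayenne code: the Artist and Painting classes, the row layout {paintingId, artistId, artistName} and the resolve() method are all hypothetical stand-ins for illustration. It shows three joined rows that each carry the same Artist columns, resolved into three Paintings sharing one Artist object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JointToOneSketch {

    // Hypothetical stand-ins for persistent objects; not Cayenne classes.
    static final class Artist {
        final int id;
        final String name;

        Artist(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static final class Painting {
        final int id;
        final Artist artist;

        Painting(int id, Artist artist) {
            this.id = id;
            this.artist = artist;
        }
    }

    // Each row is assumed to be {paintingId, artistId, artistName}, i.e. what a
    // joined Painting-to-Artist query would return. The Artist columns are
    // present (and have to be read) in every row, even when they repeat.
    static List<Painting> resolve(List<Object[]> rows) {
        Map<Integer, Artist> artistsById = new HashMap<>();
        List<Painting> paintings = new ArrayList<>();

        for (Object[] row : rows) {
            Integer artistId = (Integer) row[1];

            // Reuse the Artist if an earlier row already produced it.
            Artist artist = artistsById.get(artistId);
            if (artist == null) {
                artist = new Artist(artistId, (String) row[2]);
                artistsById.put(artistId, artist);
            }

            paintings.add(new Painting((Integer) row[0], artist));
        }
        return paintings;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1, 10, "Artist A"});
        rows.add(new Object[] {2, 10, "Artist A"});
        rows.add(new Object[] {3, 10, "Artist A"});

        List<Painting> paintings = resolve(rows);
        System.out.println("rows carrying Artist columns: " + rows.size());
        System.out.println("same Artist instance shared: "
                + (paintings.get(0).artist == paintings.get(2).artist));
    }
}
```

The object graph comes out right, but the Artist columns still travel and get processed once per Painting row, which is the cost described above.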

I will now make another pass over the code to restore the old prefetch strategy for N:1 relationships. Hopefully the resulting code will be tighter than it used to be.

Andrus


On Sep 6, 2009, at 9:43 PM, Andrus Adamchik wrote:

Good to have a little time again to hack Cayenne internals.

Just committed a pretty big change to the prefetching algorithm, motivated by the CAY-1250 bug report. So combining prefetching and inheritance now works 100%.

One visible effect of this change is that all disjoint prefetch queries will now include the IDs of the source side of the prefetch relationship and a mandatory join to the source entity. In return for this small inefficiency (increased result set size... hopefully most IDs are small), we get a bunch of benefits, the main one being the ability to process related fetched objects in a consistent manner regardless of the relationship semantics (1..1, 1..N, N..M). This strategy was used before for flattened relationships; now it is used for everything. On the other hand, this change made it possible to optimize some related cases, so all in all, there may be no performance penalty.
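As a rough illustration of why carrying the source IDs helps, here is a sketch in plain Java. It is not the actual Cayenne implementation: the row layout {sourceId, targetId} and the groupBySource() method are assumptions made up for this example. The point is that one grouping step connects targets to sources no matter whether the relationship is 1..1, 1..N or N..M.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DisjointPrefetchSketch {

    // Each prefetch row is assumed to be {sourceId, targetId}: the target's
    // own ID, prefixed with the ID of the source object it belongs to.
    // Grouping by the first column yields the targets for every source,
    // independent of the relationship semantics.
    static Map<Integer, List<Integer>> groupBySource(List<int[]> prefetchRows) {
        Map<Integer, List<Integer>> targetsBySource = new LinkedHashMap<>();
        for (int[] row : prefetchRows) {
            targetsBySource
                    .computeIfAbsent(row[0], key -> new ArrayList<>())
                    .add(row[1]);
        }
        return targetsBySource;
    }

    public static void main(String[] args) {
        List<int[]> rows = new ArrayList<>();
        rows.add(new int[] {1, 100}); // source 1 -> target 100
        rows.add(new int[] {1, 101}); // source 1 -> target 101 (to-many)
        rows.add(new int[] {2, 100}); // source 2 -> target 100 (N..M: shared target)

        System.out.println(groupBySource(rows));
    }
}
```

A 1..1 relationship simply produces single-element lists, so the same code path covers it; the price is the extra source ID column in every prefetch row.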

It is still possible to go back and optimize it further to prevent the addition of the extra columns to the result set in some cases (e.g. if both the joined FK and PK are present in the result, only fetch one of them). I wish we could do that in some central location (like SelectTranslator) instead of writing endless if/else in the prefetch processing code.

Now the prefetch code is easier to make sense of, with fewer if/else branches. And I am planning to refactor it further.

Also I came very close to fixing the biggest remaining limitation of disjoint prefetching:

https://issues.apache.org/jira/browse/CAY-1025

Andrus




