Been thinking about the new prefetching model some more and found a
glaring performance hole - the most common N:1 prefetch case will
result in Cartesian product processing in memory. E.g. if one Artist
has 3 Paintings, and the Paintings are fetched with an Artist prefetch,
the Artist DB data will be read 3 times. The result will be
correct (3 Paintings all pointing to a single Artist object), but
processing will be much slower.
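
To illustrate the difference, here is a minimal sketch in plain Java. It is
not Cayenne code: the classes, the row layouts and the method names
(connectViaSourceJoin, connectViaForeignKey) are all made up for this
example. It only counts how many Artist rows get read under each strategy.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ManyToOnePrefetchSketch {

    // Hypothetical stand-ins for persistent objects, not Cayenne classes.
    static final class Artist {
        final int id;
        final String name;
        Artist(int id, String name) { this.id = id; this.name = name; }
    }

    static final class Painting {
        final int id;
        final int artistId; // FK to the Artist
        Artist artist;
        Painting(int id, int artistId) { this.id = id; this.artistId = artistId; }
    }

    // Prefetch joined back to the source: one Artist row per Painting.
    // Assumed row layout: {paintingId, artistId, artistName}.
    // Returns the number of Artist rows that had to be read.
    static int connectViaSourceJoin(List<Painting> paintings, List<Object[]> rows) {
        Map<Integer, Painting> paintingsById = new HashMap<>();
        for (Painting p : paintings) {
            paintingsById.put(p.id, p);
        }

        Map<Integer, Artist> artistsById = new HashMap<>();
        int rowsRead = 0;
        for (Object[] row : rows) {
            rowsRead++; // the same Artist data is read again for every Painting
            Integer artistId = (Integer) row[1];
            String name = (String) row[2];
            // Only one Artist object is created, so the result stays correct.
            Artist artist = artistsById.computeIfAbsent(artistId, id -> new Artist(id, name));
            paintingsById.get((Integer) row[0]).artist = artist;
        }
        return rowsRead;
    }

    // N:1 prefetch without the source join: one row per distinct Artist.
    // Assumed row layout: {artistId, artistName}.
    // Paintings are matched through the FK value they already carry.
    static int connectViaForeignKey(List<Painting> paintings, List<Object[]> artistRows) {
        Map<Integer, Artist> artistsById = new HashMap<>();
        int rowsRead = 0;
        for (Object[] row : artistRows) {
            rowsRead++;
            Integer artistId = (Integer) row[0];
            artistsById.put(artistId, new Artist(artistId, (String) row[1]));
        }
        for (Painting p : paintings) {
            p.artist = artistsById.get(p.artistId);
        }
        return rowsRead;
    }
}
```

With 3 Paintings of one Artist, the first method reads 3 Artist rows and the
second reads 1, while both leave all 3 Paintings pointing to the same Artist
object.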
I will now be making another pass over the code to restore the old
prefetch strategy for N:1 relationships. Hopefully the resulting code
will be tighter than it used to be.
Andrus
On Sep 6, 2009, at 9:43 PM, Andrus Adamchik wrote:
Good to have a little time again to hack Cayenne internals.
Just committed a pretty big change to the prefetching algorithm
motivated by the CAY-1250 bug report. So combining prefetching and
inheritance now works 100%.
One visible effect of this change is that all disjoint prefetch
queries will now include the IDs of the source side of the prefetch
relationship and a mandatory join to the source entity. In return
for this small inefficiency (increased result set size... hopefully
most IDs are small), we get a bunch of benefits, the main one being the
ability to process related fetched objects in a consistent manner
regardless of the relationship semantics (1..1, 1..N, N..M). This
strategy was used before for flattened relationships, now it is used
for everything. On the other hand, this change made it possible to
optimize some related cases, so all in all, there may be no performance
penalty.
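
A rough sketch of why the source IDs make processing uniform, in plain Java.
Nothing here is a Cayenne class or API: PrefetchRow and groupBySourceId are
hypothetical names, and the row is simplified to a source ID plus a target
value. Once every prefetch row carries the ID of its source object, the same
grouping step works whether a source has one target or many.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SourceIdGroupingSketch {

    // Hypothetical simplified prefetch row: the ID of the source object
    // plus the related (target) value fetched for it.
    static final class PrefetchRow {
        final int sourceId;
        final String target;
        PrefetchRow(int sourceId, String target) {
            this.sourceId = sourceId;
            this.target = target;
        }
    }

    // Groups targets by source ID. The code does not need to know whether
    // the relationship is to-one or to-many: a to-one source simply ends up
    // with a single-element list.
    static Map<Integer, List<String>> groupBySourceId(List<PrefetchRow> rows) {
        Map<Integer, List<String>> targetsBySource = new LinkedHashMap<>();
        for (PrefetchRow row : rows) {
            targetsBySource
                    .computeIfAbsent(row.sourceId, id -> new ArrayList<>())
                    .add(row.target);
        }
        return targetsBySource;
    }
}
```

The price is the extra source ID column in every row, which is the result set
size increase mentioned above.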
It is still possible to go back and optimize it further to prevent
the addition of the extra columns to the result set in some cases
(e.g. if both joined FK and PK are present in the result, only fetch
one of them). I wish we could do that in some central location (like
SelectTranslator) instead of writing endless if/else in the prefetch
processing code.
Now the prefetch code is easier to make sense of, with fewer
if/else. And I am planning to refactor it further.
Also I came very close to fixing the biggest remaining limitation of
disjoint prefetching:
https://issues.apache.org/jira/browse/CAY-1025
Andrus