Author: aadamchik
Date: Fri May 25 20:04:13 2012
New Revision: 1342794
URL: http://svn.apache.org/viewvc?rev=1342794&view=rev
Log:
documentation - prefetching chapter
Modified:
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/appendix-a.xml
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/performance-tuning.xml
Modified:
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/appendix-a.xml
URL:
http://svn.apache.org/viewvc/cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/appendix-a.xml?rev=1342794&r1=1342793&r2=1342794&view=diff
==============================================================================
--- cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/appendix-a.xml
(original)
+++ cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/appendix-a.xml Fri
May 25 20:04:13 2012
@@ -92,6 +92,14 @@
<td>weak</td>
</tr>
<tr>
+
<td><code>cayenne.server.max_id_qualifier_size</code> - defines a maximum
number of ID
+ qualifiers in the WHERE clause
of queries that are generated for paginated
+ queries and for DISJOINT_BY_ID
prefetch processing. This is needed to avoid
+ hitting WHERE clause size
limitations and memory usage efficiency.</td>
+ <td>any positive int</td>
+ <td>10000</td>
+ </tr>
+ <tr>
<td><code>cayenne.rop.service_url</code> - defines the URL of the ROP
server</td>
<td/>
<td/>
Modified:
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/performance-tuning.xml
URL:
http://svn.apache.org/viewvc/cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/performance-tuning.xml?rev=1342794&r1=1342793&r2=1342794&view=diff
==============================================================================
---
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/performance-tuning.xml
(original)
+++
cayenne/main/trunk/docs/docbook/cayenne-guide/src/docbkx/performance-tuning.xml
Fri May 25 20:04:13 2012
@@ -4,6 +4,98 @@
<title>Performance Tuning</title>
<section xml:id="prefetching">
<title>Prefetching</title>
+ <para>Prefetching is a technique that allows to bring back in
one query not only the queried
+ objects, but also objects related to them. In other
words it is a controlled eager
+ relationship resolving mechanism. Prefetching is
discussed in the "Performance Tuning"
+ chapter, as it is a powerful performance optimization
method. Another common application
+ of prefetching is for refreshing stale object
relationships.</para>
+ <para>Prefetching example:
+ <programlisting>SelectQuery query = new
SelectQuery(Artist.class);
+
+// this instructs Cayenne to prefetch one of Artist's relationships
+query.addPrefetch("paintings");
+
+// query is expecuted as usual, but the resulting Artists will have
+// their paintings "inflated"
+List<Artist> artists = context.performQuery(query);</programlisting>
+ All types of relationships can be preftetched - to-one,
to-many, flattened. </para>
+ <para>A prefetch can span multiple relationships:
+ <programlisting>
query.addPrefetch("paintings.gallery");</programlisting></para>
+ <para>A query can have multiple
+
prefetches:<programlisting>query.addPrefetch("paintings");
+query.addPrefetch("paintings.gallery"); </programlisting></para>
+ <para>If a query is fetching DataRows, all "disjoint"
prefetches are ignored, only "joint"
+ prefetches are executed (see prefetching semantics
discussion below for what disjoint and
+ joint prefetches mean).</para>
+
+ <section xml:id="prefetching-semantics">
+ <title>Prefetching Semantics</title>
+ <para>Prefetching semantics defines a strategy to
prefetch relationships. Depending on
+ it, Cayenne would generate different types of
queries. The end result is the same -
+ query root objects with related objects fully
resolved. However semantics can affect
+ preformance, in some cases significantly. There
are 3 types of prefetch semantics,
+ all defined as constants in
+
org.apache.cayenne.query.PrefetchTreeNode:<programlisting>PrefetchTreeNode.JOINT_PREFETCH_SEMANTICS
+PrefetchTreeNode.DISJOINT_PREFETCH_SEMANTICS
+PrefetchTreeNode.DISJOINT_BY_ID_PREFETCH_SEMANTICS</programlisting></para>
+ <para>Each query has a default prefetch semantics, so
generally users do not have to
+ worry about changing it, except when
performance is a concern, or a few special
+ cases when a default sematics can't produce the
correct result. SelectQuery uses
+ DISJOINT_PREFETCH_SEMANTICS by default.
Semantics can be changed as
+ follows:<programlisting>SelectQuery query = new
SelectQuery(Artist.class);
+query.addPrefetch("paintings").setSemantics(
+ PrefetchTreeNode.JOINT_PREFETCH_SEMANTICS);
</programlisting></para>
+ <para>There's no limitation on mixing different types
of semantics in the same
+ SelectQuery. Multiple prefetches each can have
its own semantics. </para>
+ <para>SQLTemplate and ProcedureQuery are both using
JOINT_PREFETCH_SEMANTICS and it can
+ not be changed due to the nature of these two
queries.</para>
+ </section>
+ <section xml:id="disjoint-prefetch-semantics">
+ <title>Disjoint Prefetching Semantics</title>
+ <para>This semantics (only applicable to SelectQuery)
results in Cayenne generatiing one
+ SQL statement for the main objects, and a
separate statement for each prefetch path
+ (hence "disjoint" - related objects are not
fetched with the main query). Each
+ additional SQL statement uses a qualifier of
the main query plus a set of joins
+ traversing the preftech path between the main
and related entity. </para>
+ <para>This strategy has an advantage of efficient JVM
memory use, and faster overall
+ result processing by Cayenne, but it requires
(1+N) SQL statements to be executed,
+ where N is the number of prefetched
relationships.</para>
+
+ </section>
+ <section xml:id="disjoint-by-id-prefetch-semantics">
+ <title>Disjoint-by-ID Prefetching Semantics</title>
+ <para>This is a variation of disjoint prefetch where
related objects are matched against
+ a set of IDs derived from the fetched main
objects (or intermediate objects in a
+ multi-step prefetch). Cayenne limits the size
of the generated WHERE clause, as most
+ DBs can't parse arbitrary large SQL. So
prefetch queries are broken into smaller
+ queries. The size of is controlled by the DI
property
+ Constants.SERVER_MAX_ID_QUALIFIER_SIZE_PROPERTY
(the default number of conditions in
+ the generated WHERE clause is 10000). Cayenne
will generate (1 + N * M) SQL
+ statements for each query using disjoint-by-ID
prefetches, where N is the number of
+ relationships to prefetch, and M is the number
of queries for a given prefetch that
+ is dependent on the number of objects in the
result (ideally M = 1).</para>
+ <para>The advantage of this type of prefetch is that
matching database rows by ID may be
+ much faster than matching the qualifier of the
original query. Moreover this is
+ <emphasis role="bold">the only type of
prefetch</emphasis> that can handle
+ SelectQueries with <emphasis role="bold">fetch
limit</emphasis>. Both joint and
+ regular disjoint prefetches may produce invalid
results or generate inefficient
+ fetch-the-entire table SQL when fetch limit is
in effect. </para>
+ <para>The disadvantage is that query SQL can get
unwieldy for large result sets, as each
+ object will have to have its own condition in
the WHERE clause of the generated
+ SQL.</para>
+ </section>
+ <section xml:id="joint-prefetch-semantics">
+ <title>Joint Prefetching Semantics</title>
+ <para>Joint senantics results in a single SQL statement
for root objects and any number
+ of jointly prefetched paths. Cayenne processes
in memory a cartesian product of the
+ entities involved, converting it to an object
tree. It uses OUTER joins to connect
+ prefetched entities.</para>
+ <para>Joint is the most efficient prefetch type of the
three as far as generated SQL
+ goes. There's always just 1 SQL query
generated. Its downsides are the potentially
+ increased amount of data that needs to get
across the network between the
+ application server and the database, and more
data processing that needs to be done
+ on the Cayenne side.</para>
+ </section>
</section>
<section xml:id="datarows">
<title>Data Rows</title>