Hi,

I ran some tests using 3.0b with SQLTemplate in combination with prefetching and found
a possible new problem.

It seems that when the query itself runs in, e.g., 1 minute, it takes about a further 2 minutes
before Cayenne has constructed the prefetched objects.

My query produces 2.5 million records. The query will take about 30 minutes. Construction
of the objects will then take an extra hour.

This is not really workable.

Hans

Hans Pikkemaat wrote:
Hi,

What I can see when I use paging in combination with SQLTemplate is this:

Cayenne first runs the main SQLTemplate query, whose result is stored in memory.
When I get the first page, it determines the key values of the main query, which it then uses in a new query that returns the main table plus the detail table data. This produces the main table object, through which the detail table is accessible.

The problem here is that only the key of the main table is used. The SQLTemplate query was manually constructed and does a query on the main table with a left join to the detail table, so it produces a duplicate key value wherever a main table record has 2 related detail table records.

This doesn't have to be a problem; the query does actually return the number of records used as the page size. But internally in Cayenne something weird happens. Somehow the duplicate records are removed, and the IncrementalFaultList.checkPageResultConsistency method throws an exception because of this.
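To make the duplicate-key effect concrete, here is a minimal plain-Java sketch. It does not use Cayenne: `JoinPageSketch`, `Row` and `pageIsConsistent` are hypothetical names, and the check is only an assumption about the kind of comparison involved (rows expected in the page versus distinct objects left after de-duplication), not the actual code of checkPageResultConsistency.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not Cayenne code: shows how a left join repeats the
// main-table key and how de-duplication shrinks a page below its row count.
final class JoinPageSketch {

    // One row of a "main LEFT JOIN detail" result.
    static final class Row {
        final int mainId;
        final int detailId;

        Row(int mainId, int detailId) {
            this.mainId = mainId;
            this.detailId = detailId;
        }
    }

    // Distinct main-table keys in one page of joined rows, in encounter order.
    static Set<Integer> distinctMainKeys(List<Row> page) {
        Set<Integer> keys = new LinkedHashSet<>();
        for (Row row : page) {
            keys.add(row.mainId);
        }
        return keys;
    }

    // Simplified stand-in for a consistency check (an assumption): the page is
    // "consistent" only if every row maps to a different main object.
    static boolean pageIsConsistent(List<Row> page) {
        return distinctMainKeys(page).size() == page.size();
    }
}
```

With a page holding the rows (1,1), (1,2) and (2,3), there are three rows but only two distinct main keys, so a check of this kind would fail.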

Because the main query returns the main object but also the detail object I find it strange that the query generated for the page only uses the main table key. I would expect that
it also would use the key of the detail table.

An example. Say I have a main table key 1 and related detail records with key 1, 2 and 3. Say I run the SQLTemplate which returns key 1 but only key 1 and 2 for the detail table.

The page query will now run for all detail records and return all of them, including
records I did not request.

From this I conclude that if an SQLTemplate is used, it is not useful (read: faulty) to include the detail table in this query. When paging is used, all the detail tables are automatically
queried.

If I write the main SQLTemplate query such that it only returns the main object, then the
exception does not occur.

My conclusion is then that if you want to use paging with SQLTemplate the main query should only return the main table. Prefetching will then return ALL related
table records.

tx

HPI

Andrus Adamchik wrote:
Yeah, still need to check that one.

On Nov 12, 2009, at 10:43 AM, Hans Pikkemaat wrote:

Hi,

Yes, the paginated query would indeed be the only way for me to go forward.
The problem however is that I get the exception I posted earlier.

tx

Hans

Andrus Adamchik wrote:
For paginated queries we contemplated a strategy of a list with constant size of fully resolved objects. I.e. when a page is swapped in, some other (LRU?) page is swapped out. We decided against it, as in a general case it is hard to consistently predict which page should be swapped out.

However, it should be rather easy to write such a list for a specific case with a known access order (e.g. a standard iteration order). In fact I would vote to even include such an implementation in Cayenne going forward.

More specifically, you can extend IncrementalFaultList [1], overriding 'resolveInterval' to swap out previously read pages, turning them back into ids. And the good part is that you can use your extension directly without any need to modify the rest of Cayenne.
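The idea of swapping out previously read pages can be sketched in plain Java as below. This is not an IncrementalFaultList subclass and uses no Cayenne API: `SlidingPageList` and its `resolver` function are hypothetical, and the sketch assumes the resolver returns objects in the same order and number as the ids it is given. It keeps the full id list but at most one page of resolved objects at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch, not a Cayenne class: ids are always kept, objects are
// resolved one page at a time, and the previous page is dropped on a page change.
final class SlidingPageList<K, V> {

    private final List<K> ids;                          // full id list
    private final int pageSize;
    private final Function<List<K>, List<V>> resolver;  // loads objects for a page of ids
    private int resolvedPage = -1;                      // index of the page held in memory
    private List<V> resolved = new ArrayList<>();       // objects of that page only
    private int resolveCount = 0;                       // how many pages were loaded

    SlidingPageList(List<K> ids, int pageSize, Function<List<K>, List<V>> resolver) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = new ArrayList<>(ids);
        this.pageSize = pageSize;
        this.resolver = resolver;
    }

    int size() {
        return ids.size();
    }

    V get(int index) {
        if (index < 0 || index >= ids.size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int page = index / pageSize;
        if (page != resolvedPage) {
            int from = page * pageSize;
            int to = Math.min(from + pageSize, ids.size());
            // Replacing the list "swaps out" the old page: only its ids remain.
            resolved = resolver.apply(new ArrayList<>(ids.subList(from, to)));
            resolvedPage = page;
            resolveCount++;
        }
        return resolved.get(index % pageSize);
    }

    int resolveCount() {
        return resolveCount;
    }

    int residentObjects() {
        return resolved.size();
    }
}
```

For a straight front-to-back iteration each page is resolved exactly once, and the objects of earlier pages become eligible for garbage collection as long as nothing else references them.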

Andrus


[1] http://cayenne.apache.org/doc/api/org/apache/cayenne/access/IncrementalFaultList.html


On Nov 12, 2009, at 10:07 AM, Hans Pikkemaat wrote:

Hi,

So this means that if I use a generic query, the query results are always stored completely in the object store (or the query cache if I configure it).

Objects are returned in a list, so as long as I have a reference to this list (because I'm
traversing it) these objects are not garbage collected.

If I use the query cache the full query results are cached. This means that I can only
tell it to remove the whole query.

Effectively this means I'm unable to run a big query and process the results as a stream. So I cannot process the first results and then somehow make them available for
garbage collection.

The only option I have would be the iterated query, but this is only useful for queries on 1 table without any relations, because it is not possible to use prefetching nor is
it possible to manually construct relations between objects.

My conclusion here is that Cayenne is simply not suitable for large batch-wise
query processing because of the memory implications.

tx

HPI

Andrus Adamchik wrote:

As mentioned in the docs, individual objects and query lists are
cached independently. Of course query lists contain a subset of cached object store objects inside the lists. An object won't get gc'd if it
is also stored in the query list.
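A small plain-Java sketch of that last point, without any Cayenne classes: `WeakRefSketch`, `queryList` and `storeEntry` are hypothetical names standing in for a cached result list and a weakly referenced object store entry. While the list holds a strong reference, the weak reference cannot be cleared, whatever the garbage collector does.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a weakly referenced object stays alive as long as
// some list still holds a strong reference to it.
final class WeakRefSketch {

    static boolean survivesWhileListed() {
        List<Object> queryList = new ArrayList<>();   // stands in for a cached query result
        Object row = new Object();
        queryList.add(row);

        // Stands in for a weakly referenced entry in an object store.
        WeakReference<Object> storeEntry = new WeakReference<>(row);
        row = null;   // drop the local strong reference; the list still has one

        System.gc();  // only a hint, but it cannot clear a strongly reachable object

        boolean alive = storeEntry.get() != null;
        // queryList is used here, so it is still reachable during the gc() call above.
        return alive && !queryList.isEmpty();
    }
}
```

The reverse is not guaranteed at any particular moment: once the list is dropped, the object only becomes eligible for collection, so a registered object count may stay unchanged for a while.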

Now list cache expiration is controlled via query cache factory. By
default this is an LRU map, so as long as the map has enough space to hold lists (its capacity == # of lists, not # of objects), the objects
won't get gc'd.

You can explicitly remove entries from the cache via QueryCache remove and removeGroup methods. Or you can use a different QueryCacheFactory
that implements some custom expiration/cleanup mechanism.
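As an illustration of an LRU map whose capacity counts lists rather than objects, here is a minimal plain-Java sketch built on `LinkedHashMap`. It is not Cayenne's QueryCache or QueryCacheFactory: `ListCacheSketch` and its methods are hypothetical names, and Cayenne's actual implementation may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Cayenne's QueryCache: an LRU cache of result lists
// whose capacity is the number of lists, regardless of how long each list is.
final class ListCacheSketch<V> {

    private final Map<String, List<V>> lists;

    ListCacheSketch(final int maxLists) {
        // accessOrder = true makes iteration order least-recently-used first.
        this.lists = new LinkedHashMap<String, List<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<V>> eldest) {
                return size() > maxLists;   // evict one whole list when over capacity
            }
        };
    }

    void put(String key, List<V> list) {
        lists.put(key, list);
    }

    List<V> get(String key) {
        return lists.get(key);
    }

    // Explicit removal of a single cached list.
    void remove(String key) {
        lists.remove(key);
    }

    int size() {
        return lists.size();
    }
}
```

A single list with millions of objects counts as one entry here, so it stays cached (and keeps its objects strongly reachable) until it is evicted or removed explicitly.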

Andrus

On Nov 11, 2009, at 3:43 PM, Hans Pikkemaat wrote:



Hi,

I use the latest version of Cayenne, 3.0b, and am experimenting with
the object caching features.

The documentation states that committed objects are purged from the
cache because it uses weak references.
(http://cayenne.apache.org/doc/individual-object-caching.html)

However, if I run a query using SQLTemplate, which caches the objects
in the dataContext local cache (ObjectStore),
the objects don't seem to be purged at all. If I simply run the
query and dump the contents using an iterator on the resulting
List, then the number of registered objects in the ObjectStore stays the
same (dataContext.getObjectStore().registeredObjectsCount()).
Even if I manually run System.gc() I don't see any changes (I know
this can be normal, as gc() doesn't guarantee anything).

What am I doing wrong? Under which circumstances will Cayenne purge
the cache?

tx

Hans




