Hi Andrus,

First of all, thank you for the prompt support and your suggestions.

As you correctly guessed, I was talking about Cayenne 1.2.1.

I was thinking about the following automatic (but driven by custom configuration) "invalidation algorithm":

Configure somewhere a logical association DataObject ---> QueryData[]. For example, the Paintings DataObject would be associated with the queries:

  Select * from Paintings where year = 1300
  Select * from Artists, Paintings where Paintings.year > 1200
  Select * from Artists, Paintings where Paintings.year > 1200 order by Paintings.name

The "QueryData" object should contain, separately, information about the "expression" (i.e. the "where" part) and about the ordering. For example: "Select * from Artists, Paintings where Paintings.year > 1200 order by Paintings.name" ---> { Paintings.year > 1200 | order by Paintings.name }
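
A minimal sketch of such a holder in Java (the QueryData name, its fields and the cache key are my assumptions; only Expression comes from Cayenne 1.2):

  import java.util.List;
  import org.objectstyle.cayenne.exp.Expression;

  // Hypothetical holder that splits a cached query into its "where" part
  // and its ordering part, plus the key under which the result list is cached.
  public class QueryData {

      private final String cacheKey;      // key of the cached result list
      private final Expression qualifier; // e.g. Paintings.year > 1200
      private final List orderingKeys;    // property names used in the order by

      public QueryData(String cacheKey, Expression qualifier, List orderingKeys) {
          this.cacheKey = cacheKey;
          this.qualifier = qualifier;
          this.orderingKeys = orderingKeys;
      }

      public String getCacheKey() { return cacheKey; }
      public Expression getQualifier() { return qualifier; }
      public List getOrderingKeys() { return orderingKeys; }

      public boolean hasOrdering() {
          return orderingKeys != null && !orderingKeys.isEmpty();
      }
  }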

If I modify (or create or delete) a DataObject, I have to check the pre- and post-modification versions of that single DataObject against the associated QueryData's. We could do this by exploiting Expression's in-memory object filtering capabilities (or optionally using third-party utilities, commons-beanutils perhaps?):

Expression filter = Expression.fromString("Paintings.year > 1200");
// filterObjects() does not modify its argument; it returns the matching objects
List matching = filter.filterObjects(objects);

As a result we would have two sets of queries: those matching before the modification and those matching after it. We certainly have to invalidate all the query results that are not in the intersection of the two sets, since the object has either entered or left those results.

For the queries in the intersection:

a) If they have NO ordering (order by clause, paging limitation, etc.), they are still valid.

b) If they have ordering: if the ordering is on one of the modified DataObject fields, we have to invalidate the query result; otherwise it is still valid (see the sketch below).
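
To make the whole check concrete, here is a minimal sketch assuming the hypothetical QueryData holder above; the method name, the before/after snapshots and the modifiedFields set are my assumptions, while filterObjects() is the real 1.2 API:

  import java.util.Collections;
  import java.util.HashSet;
  import java.util.Iterator;
  import java.util.List;
  import java.util.Set;
  import org.objectstyle.cayenne.DataObject;

  // Returns the cache keys whose cached result lists became stale.
  // 'before' and 'after' are the pre/post-modification versions of the
  // object (null on insert or delete respectively); 'modifiedFields'
  // holds the names of the changed attributes.
  public Set staleCacheKeys(DataObject before, DataObject after,
                            List queryDataList, Set modifiedFields) {
      Set stale = new HashSet();
      for (Iterator it = queryDataList.iterator(); it.hasNext();) {
          QueryData qd = (QueryData) it.next();

          boolean matchedBefore = before != null && !qd.getQualifier()
                  .filterObjects(Collections.singletonList(before)).isEmpty();
          boolean matchedAfter = after != null && !qd.getQualifier()
                  .filterObjects(Collections.singletonList(after)).isEmpty();

          if (matchedBefore != matchedAfter) {
              // the object entered or left the result set: always stale
              stale.add(qd.getCacheKey());
          }
          else if (matchedAfter && qd.hasOrdering()) {
              // case b): still in the result set, stale only if a sort field changed
              for (Iterator keys = qd.getOrderingKeys().iterator(); keys.hasNext();) {
                  if (modifiedFields.contains(keys.next())) {
                      stale.add(qd.getCacheKey());
                      break;
                  }
              }
          }
          // case a): membership unchanged and no ordering affected -> still valid
      }
      return stale;
  }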

Of course this solution can have a high computational cost, proportional to the number of queries it has to check. But, for example, in the project I am collaborating on, the DB is the system under pressure / the bottleneck, while the middleware has much less load. In such a situation, "moving" load from the DB to the middleware is a benefit for the application as a whole.

For "basic" queries (I made some tests) I think the algorithm should work. Of course, more systematic test cases would be needed to fully validate the algorithm and/or find its limitations. Anyway, I wanted to share it with you hoping it can be useful, or at least some "inspiration" for a proper/more correct solution.


Francesco.




Andrus Adamchik wrote:
Hi Francesco,


On Sep 25, 2006, at 10:56 AM, Francesco Fuzio wrote:
Thank you for the answers: I'm definitely looking forward to trying the cool 3.0 features you mentioned.

As for 2.1 (since it is important for us to keep data updated without relying on expiration timing), I was thinking about this approach (for a clustered environment):

That would be version 1.2.*, right?

1) Enable Cayenne Replicated Shared Object Cache
2) Disable the Cayenne query (i.e. list) cache
3) Use a caching framework supporting an automatic distributed refresh/invalidation policy (e.g. OSCache or Ehcache) to save query results as lists of ObjectIds.
4) In case of a query "cache hit", use the cached ObjectIds to retrieve the associated DataObjects via the DataContext [public Persistent localObject(ObjectId id, Persistent prototype), see http://incubator.apache.org/cayenne/1_2/api/cayenne/org/objectstyle/cayenne/Persistent.html]
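
A minimal sketch of step 4 (the method name and the shape of the cached list are my assumptions; localObject() is the 1.2 DataContext method referenced above):

  import java.util.ArrayList;
  import java.util.Iterator;
  import java.util.List;
  import org.objectstyle.cayenne.ObjectId;
  import org.objectstyle.cayenne.access.DataContext;

  // On a query cache hit, rebuild the result list from the cached ObjectIds.
  // 'cachedIds' would come from OSCache/Ehcache under some query cache key.
  public List resolveFromCache(DataContext context, List cachedIds) {
      List results = new ArrayList(cachedIds.size());
      for (Iterator it = cachedIds.iterator(); it.hasNext();) {
          ObjectId oid = (ObjectId) it.next();
          // with a null prototype, localObject() returns the object registered
          // in this context, resolving it from the shared cache or the DB lazily
          results.add(context.localObject(oid, null));
      }
      return results;
  }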

What do you think, is this approach reasonable? Will it work?

This should work (you'll just use your own cache as a front end to the DataContext query API), and should provide a clean path to the future 3.0 migration. You'll need to consider a few things though:

A. Query cache key generation. In 1.2 this is based on the Query name, which is pretty dumb and barely usable; in 3.0 SelectQuery and SQLTemplate are smart enough to build the cache key based on their state. You may copy some of that code.
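
For instance, a hand-rolled key for a SelectQuery could be derived from the query's state along these lines (a simplification, not the actual 3.0 code; it ignores prefetches, fetch limits, etc.):

  import java.util.Iterator;
  import org.objectstyle.cayenne.query.Ordering;
  import org.objectstyle.cayenne.query.SelectQuery;

  // Naive cache key built from the query's state instead of its name.
  public String cacheKey(SelectQuery query) {
      StringBuffer key = new StringBuffer(32);
      key.append(query.getRoot());                      // entity
      if (query.getQualifier() != null) {
          key.append('/').append(query.getQualifier()); // "where" part
      }
      for (Iterator it = query.getOrderings().iterator(); it.hasNext();) {
          Ordering o = (Ordering) it.next();            // "order by" part
          key.append('/').append(o.getSortSpec())
             .append(o.isAscending() ? " asc" : " desc");
      }
      return key.toString();
  }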


B. Invalidation Strategies. That's a tricky one....

I couldn't come up with a well-performing generic solution (I tried, see CAY-577). Consider that the events that may cause automatic invalidation are object deletion, insertion and updating (an update can affect the ordering and also whether an object still matches the query condition). So *every* commit can potentially invalidate any number of cached lists for a given entity.

The trick is to create an efficient algorithm to invalidate just the right cache entries and avoid invalidating the entire entity cache. Manually scanning and rearranging all lists on every commit is of course very inefficient.

So in 3.0 we added a "cache group" notion so that users can categorize queries based on some criteria and then invalidate a whole category of cache entries. (The cache group notion is supported by OSCache, by the way.) Here is an example.... Consider a "BlogPost" entity. All queries that fetch a date range of BlogPosts can be arbitrarily divided into "old_posts" and "new_posts" categories. So once a user creates/updates/deletes a BlogPost, the code can check the date of this post and invalidate either "old_posts" or "new_posts".
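
With OSCache, for example, the invalidation side of this could look roughly like the following (the 30-day cutoff and the method are just illustration; GeneralCacheAdministrator and flushGroup() are the real OSCache API):

  import java.util.Date;
  import com.opensymphony.oscache.general.GeneralCacheAdministrator;

  // After a BlogPost commit, flush only the affected cache group; query
  // results would have been stored via putInCache(key, list, groups).
  public void onBlogPostChange(GeneralCacheAdministrator cache, Date postDate) {
      long cutoff = System.currentTimeMillis() - 30L * 24 * 60 * 60 * 1000;
      String group = postDate.getTime() < cutoff ? "old_posts" : "new_posts";
      cache.flushGroup(group);
  }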

This is just one solution that we came up with. Not automatic, but fairly simple and efficient. You can come up with your own strategies. If you can think of a better generic algorithm for invalidation, please share.

Andrus


