I've started using the Query Cache more in some apps and I have noticed a
problem with the way it uses memory.

BACKGROUND

Cayenne has two kinds of Query Cache: Local and Shared. Shared cache data
is available to all ObjectContexts. Local cache data is available only to
the ObjectContext that loaded it.


CURRENT PROBLEMS

The Shared and Local query caches actually use the same global cache
instance, BUT they are kept separate by namespacing the keys. This has a
few surprising consequences:

1) Data written to the Local query cache will outlive the ObjectContext
that wrote it. If the cache group it is written to does not have an
explicit expiration time, it will stay in the cache indefinitely,
potentially until you run out of memory.

2) Removing a Local query cache group explicitly to invalidate the cache
(and reclaim that memory) is not possible because NestedQueryCache (which
wraps the actual cache in order to provide the namespacing feature) is
only able to remove entries from the Shared query cache.

3) Entries written to the Local query cache will keep strong references to
the entire state of the ObjectContext, since whole DataObjects are stored,
each with a reference to its ObjectContext. This is in contrast to the
Shared query cache, which only stores DataRows. This means that if you
load a lot of data into a context that you don't want to cache, along with
a tiny amount of Local query cache data, all of it will be resident in
memory well past the lifetime of the ObjectContext, possibly forever
unless you have a specific expiration time set on that cache group.
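
To make consequence 1 concrete, here is a toy model of one global map
shared by all contexts, with Local entries kept apart by a key prefix. All
class names and the prefix format are made up for illustration; this is
not Cayenne's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy stand-in for the single global cache instance shared by every context.
final class GlobalCache {
    final Map<String, Object> entries = new ConcurrentHashMap<>();
}

// Toy stand-in for an ObjectContext whose Local cache is just a key prefix
// inside the global cache (hypothetical prefix format).
final class ToyContext {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();

    private final String prefix = "ctx" + NEXT_ID.getAndIncrement() + ":";
    private final GlobalCache global;

    ToyContext(GlobalCache global) {
        this.global = global;
    }

    void putLocal(String key, Object value) {
        global.entries.put(prefix + key, value);
    }

    Object getLocal(String key) {
        return global.entries.get(prefix + key);
    }
}

final class LocalCacheLeakDemo {
    // Returns how many entries remain in the global cache after the only
    // reference to the context that wrote them has been dropped.
    static int entriesAfterContextDiscarded() {
        GlobalCache global = new GlobalCache();
        ToyContext context = new ToyContext(global);
        context.putLocal("artists", new Object());
        context = null; // the application no longer holds the context
        return global.entries.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesAfterContextDiscarded());
    }
}
```

Nothing ties the entry's lifetime to the context that wrote it, so in this
model it stays in the global map until something removes or expires it.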


SHORT TERM WORKAROUND

The best short term workaround is to define short expiration times for all
of your local cache groups, especially the default group named
"cayenne.default.cachegroup". Reads from the local cache are already
specific to each ObjectContext, so the expiration is really just there to
protect you from having every locally cached query leak memory.


PROPOSED LONG TERM SOLUTION

#3 would be really nice to change so that only DataRows are stored in the
Local cache just like with the Shared cache. Since the Shared cache already
works this way I would expect this to be fairly simple.

For #2 I would recommend an API change to QueryCache that allows specifying
the QueryCacheStrategy along with the cache group name when removing a
cache group, i.e.:

    public void removeGroup(QueryCacheStrategy strategy, String groupKey);

Then the existing one-arg removeGroup method should be deprecated:

    @Deprecated
    public void removeGroup(String groupKey);
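
A rough sketch of how the two-argument form could behave, again on a toy
model rather than the real NestedQueryCache. ToyStrategy is a stand-in for
QueryCacheStrategy, and the "shared:" prefix and the per-context namespace
string are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for QueryCacheStrategy, reduced to the two cases discussed here.
enum ToyStrategy { LOCAL_CACHE, SHARED_CACHE }

// Toy per-context wrapper over a backing store shared by all contexts.
final class ToyNestedCache {
    private static final String SHARED_PREFIX = "shared:"; // hypothetical

    // group name -> (qualified key -> cached value)
    private final Map<String, Map<String, Object>> global;
    private final String namespace; // hypothetical per-context prefix, e.g. "ctxA:"

    ToyNestedCache(Map<String, Map<String, Object>> global, String namespace) {
        this.global = global;
        this.namespace = namespace;
    }

    void put(ToyStrategy strategy, String group, String key, Object value) {
        global.computeIfAbsent(group, g -> new HashMap<>())
              .put(qualify(strategy, key), value);
    }

    Object get(ToyStrategy strategy, String group, String key) {
        Map<String, Object> entries = global.get(group);
        return entries == null ? null : entries.get(qualify(strategy, key));
    }

    // Proposed form: the strategy selects which namespace of the group is
    // cleared. LOCAL_CACHE removes only this context's entries.
    void removeGroup(ToyStrategy strategy, String group) {
        Map<String, Object> entries = global.get(group);
        if (entries == null) {
            return;
        }
        String prefix = prefixFor(strategy);
        entries.keySet().removeIf(k -> k.startsWith(prefix));
    }

    private String prefixFor(ToyStrategy strategy) {
        return strategy == ToyStrategy.LOCAL_CACHE ? namespace : SHARED_PREFIX;
    }

    private String qualify(ToyStrategy strategy, String key) {
        return prefixFor(strategy) + key;
    }
}
```

In this sketch, removeGroup(ToyStrategy.LOCAL_CACHE, "g") on one context
leaves both the Shared entries and other contexts' Local entries in group
"g" untouched.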

For #1 I would recommend adding a finalizer to BaseContext that will purge
all the Local query cache entries from the global cache instance when the
ObjectContext is garbage collected. This is perhaps a little trickier since
the namespacing that defines the Local query cache is only available from
NestedQueryCache, which is supposed to be mostly invisible and accessed
only through the QueryCache API, which doesn't know about namespacing. An
alternative to namespacing would be to actually create a separate cache for
each ObjectContext, but this might be expensive from a performance
standpoint.
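
As one possible shape for the purge-on-collection idea, here is a sketch
that uses java.lang.ref.Cleaner (Java 9+) in place of a finalize() method,
since Object.finalize() is deprecated. PurgingContext, its prefix argument
and the flat global map are all hypothetical. Note that this can only work
if the cached values do not themselves reference the context (which is
what #3 asks for); otherwise the context stays strongly reachable through
the cache and would never be collected.

```java
import java.lang.ref.Cleaner;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy context that purges its own namespaced entries from a shared map,
// either on close() or once the context becomes unreachable.
final class PurgingContext implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action holds only the map and the prefix. It must not
    // reference the context, or the context could never become unreachable.
    private static final class Purge implements Runnable {
        private final Map<String, Object> global;
        private final String prefix;

        Purge(Map<String, Object> global, String prefix) {
            this.global = global;
            this.prefix = prefix;
        }

        @Override
        public void run() {
            global.keySet().removeIf(k -> k.startsWith(prefix));
        }
    }

    private final Map<String, Object> global;
    private final String prefix;
    private final Cleaner.Cleanable cleanable;

    PurgingContext(Map<String, Object> global, String prefix) {
        this.global = global;
        this.prefix = prefix;
        this.cleanable = CLEANER.register(this, new Purge(global, prefix));
    }

    void putLocal(String key, Object value) {
        global.put(prefix + key, value);
    }

    // Deterministic purge; the Cleaner runs the same action at most once
    // if close() was never called and the context is collected.
    @Override
    public void close() {
        cleanable.clean();
    }
}

final class PurgeDemo {
    public static void main(String[] args) {
        Map<String, Object> global = new ConcurrentHashMap<>();
        try (PurgingContext context = new PurgingContext(global, "ctx1:")) {
            context.putLocal("artists", "rows");
        }
        System.out.println(global.isEmpty());
    }
}
```

Garbage collection timing is not predictable, so an explicit close() (or
an expiration time as a backstop) would probably still be wanted alongside
any collection-triggered purge.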

Let me know your thoughts.

Thanks,
John
