Re: StoreJanitor (was: Re: Moving reduced version of CachingSource to core | Configuration issues)

Vadim Gritsenko Thu, 05 Apr 2007 05:48:39 -0700

Ard Schrijvers wrote:

1) How it works and its intention (I think :-) ):  The StoreJanitor was
originally invented to monitor Cocoon's memory usage and does this by
checking some memory values every X (default 10) seconds. Besides the fact
that I doubt users know that it is quite important to configure the store
janitor correctly,

It is stressed in several places. If you don't set at the very least maxheapsize, you are in for trouble.

I stick to the defaults and use a heapsize of just a
little lower than JVM maxmemory.


You also need min free memory and interval set according to site & usage.

Now, every 10 seconds, the StoreJanitor checks whether
(getJVM().totalMemory() >= getMaxHeapSize()) && (getJVM().freeMemory() <
getMinFreeMemory()) is true, and if so, the next store is chosen (compared
to the previous one) and entries are removed from this store (I saw a post that
in trunk not one single store is chosen anymore, but an equal part of all of
them is being removed, right?)


In both branch and trunk, two algorithms are supported.
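
For readers following along, the check quoted above can be written out as a small sketch. The class name and constructor here are hypothetical and this is not Cocoon's actual janitor code; only the condition itself is taken from the quoted message.

```java
// Sketch only: JanitorCheck is a hypothetical name, not a Cocoon class.
final class JanitorCheck {
    private final long maxHeapSize;   // bytes, corresponds to the janitor's maxheapsize setting
    private final long minFreeMemory; // bytes, corresponds to the min free memory setting

    JanitorCheck(long maxHeapSize, long minFreeMemory) {
        this.maxHeapSize = maxHeapSize;
        this.minFreeMemory = minFreeMemory;
    }

    /** Pure form of the condition, so it can be tried with made-up numbers. */
    static boolean memoryLow(long totalMemory, long freeMemory,
                             long maxHeapSize, long minFreeMemory) {
        return totalMemory >= maxHeapSize && freeMemory < minFreeMemory;
    }

    /** The same condition evaluated against the running JVM. */
    boolean memoryLow() {
        Runtime jvm = Runtime.getRuntime();
        return memoryLow(jvm.totalMemory(), jvm.freeMemory(), maxHeapSize, minFreeMemory);
    }
}
```

Note that both halves must hold: a heap that has grown to the configured size but still has enough free memory does not trigger a cleanup.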

2) My Observations: When running high traffic sites and render them live
(only mod_cache in between which holds pages for 5 to 10 min) like [1] or
[2], then checking every X sec for a JVM to be low on memory doesn't make
sense to me. At the moment of checking, the JVM might be perfectly sound but
just needed some extra memory for a moment, in that case, the Store Janitor
is removing items from cache while not needed. Also, when the JVM is really
in trouble, but the Store Janitor is not checking for 5 more sec....this
might be too long for a JVM in a high traffic site when it is low on memory.

That's the problem with your configuration. It is also a problem in the janitor, but it can be fixed only after Java 5 is made an option.
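
The reply does not say which Java 5 feature is meant. One plausible candidate, and this is an assumption, is the java.lang.management API, which lets the JVM send a notification when a heap pool crosses a usage threshold instead of polling every X seconds. A minimal sketch, with a hypothetical class name and callback:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Sketch only: LowMemoryWatcher is a hypothetical name, not part of Cocoon.
final class LowMemoryWatcher {

    /** Threshold in bytes for a pool of the given maximum size; fraction must be in (0, 1]. */
    static long thresholdFor(long poolMax, double fraction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        return (long) (poolMax * fraction);
    }

    /** Arms a usage threshold on every heap pool that supports one; returns how many were armed. */
    static int install(double fraction, final Runnable onLowMemory) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool invalid or maximum undefined
            }
            pool.setUsageThreshold(thresholdFor(usage.getMax(), fraction));
            armed++;
        }
        // The platform MemoryMXBean emits the threshold notifications.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(new NotificationListener() {
            public void handleNotification(Notification n, Object handback) {
                if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
                    onLowMemory.run(); // e.g. ask the stores to free entries
                }
            }
        }, null, null);
        return armed;
    }
}
```

With something like this the cleanup would react when memory actually runs low rather than at the next timer tick. Whether a later janitor works this way is not stated in this thread.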

- Since there is no way to remove cache entries from the used cache impl by
the cache's eviction policy,


Huh??? That's not the case for me.

- Once the JVM gets low on memory, and the StoreJanitor is needed, it is
quite likely that from that moment on, the StoreJanitor runs *every* 10
seconds, and keeps removing cache entries which you perhaps don't want to be
removed, like compiled stylesheets.

Since they are not used (the janitor removes least recently used entries), that's perfectly fine to me.
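
To illustrate the least-recently-used argument, here is a minimal sketch of a store that frees its oldest-accessed entries first. It is built on java.util.LinkedHashMap in access order; the class and method names are hypothetical and this is not the Cocoon store implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: LruStore is a hypothetical name, not a Cocoon class.
final class LruStore<K, V> {
    // accessOrder = true: iteration runs from least to most recently accessed entry.
    private final Map<K, V> entries = new LinkedHashMap<K, V>(16, 0.75f, true);

    void put(K key, V value) { entries.put(key, value); }

    /** A successful get counts as a use and moves the entry to the "recent" end. */
    V get(K key) { return entries.get(key); }

    /** containsKey does not count as a use, so it leaves the order untouched. */
    boolean contains(K key) { return entries.containsKey(key); }

    int size() { return entries.size(); }

    /** Removes up to count entries, least recently used first, and returns their keys. */
    List<K> freeLeastRecentlyUsed(int count) {
        List<K> removed = new ArrayList<K>();
        Iterator<K> it = entries.keySet().iterator();
        while (it.hasNext() && removed.size() < count) {
            removed.add(it.next());
            it.remove();
        }
        return removed;
    }
}
```

Under this policy a stylesheet that is read on every request always sits at the recent end, so it is among the last entries a cleanup would touch.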

1) suppose, from one store (or since
trunk from multiple stores) 10% (default) is removed. This 10% is from the
number of memory cache entries. I quite frequently happen to have only 200
entries in memory for each store ( I have added *many* different stores to
enable all we wanted in a high traffic environment) and the rest is disk
store. Now, suppose, the JVM which has 512 Mb of memory, is low on memory,
and removes 10% of 200 entries = 20 entries, helping me zero! These memory
entries are my most important ones, so, on the next request, they are either
added again, or, from diskcache I have a hit, implying that the cache will
put this cache entry in memory again. If I would use 2000 memory items, I am
very sure, the 200 items which are cleaned are put back in memory before the
next StoreJanitor runs.


This sounds like a problem in your configuration.
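
The arithmetic in the quoted point, written out as a sketch. The helper name is hypothetical, and how the real janitor rounds is not stated in this thread; integer division is an assumption.

```java
// Sketch only: PurgeEstimate is a hypothetical helper, not a Cocoon class.
final class PurgeEstimate {
    /** Number of in-memory entries one cleanup pass would free at the given percentage. */
    static int entriesToFree(int memoryEntries, int percent) {
        if (memoryEntries < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("need memoryEntries >= 0 and 0 <= percent <= 100");
        }
        return memoryEntries * percent / 100; // integer division rounds down (assumption)
    }
}
```

With the figures from the message, entriesToFree(200, 10) gives 20 and entriesToFree(2000, 10) gives 200, so the number of in-memory entries configured per store decides how much a single pass can release.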

2) I am not sure if in trunk you can configure whether
the StoreJanitor should leave one store alone, like the
DefaultTransientStore.


No. It should be a configuration parameter on store, IIUC.

In this store, typically, compiled stylesheets end up,
and i18n resource bundles. Since these files are needed on virtually every
request, I would rather the StoreJanitor did not remove entries from this store.

Quite often you need to purge the i18n catalogue for a country which is no longer using your website. Similarly, the janitor can purge the stylesheet for a document type you are no longer using.

I
think the StoreJanitor does so, leaving my "critical app" in an even worse
state, and on the next request, the hardly improved JVM needs to recompile
stylesheets and i18n resource bundles.

3) What if the JVM being low is not
because of the stores? For example, you have added some component which has
some problems you did not know about, and that component is the real reason for
your OOM.

Janitor or not, if you have buggy code, nothing will help you. You have to fix the bug, or the site is going down regardless of the janitor.

4) By default, probably most people are using ehcache. Naturally,
overflow-to-disk is true. In a high traffic site, the number of cache keys
can grow enormously

Not used it on a live site yet, so no comment. It still points, though, to the importance of configuration, including changing configuration away from ehcache.

--------o0o--------

The rules I try to follow to keep the Store Janitor from running

1) use readers in noncaching pipelines and use expires on them to avoid
cache/memory pollution


Better - there is Apache HTTPD for it.

2) use a different store for repository binary sources
which has only a disk store part and no memory part (cached-binary: protocol
added)


Doesn't it result in some frequently used binary resource always being read from
disk?

3) use a different store for repository sources than for pipeline
cache


Hm, what are the benefits?


Vadim

4) replaced the abstract double mapping event registry to use
weak references and let the JVM clean up my event registry

5) (4) gave me
undesired behavior by removing weakrefs in combination with ehcache when
overflowing items to disk (I could not reproduce this, but it seems that my
references to cache keys got lost). Testing with JCSCache solved this problem,
gave me faster response times and gave me for free a way to limit the number of
disk cache entries. A disadvantage of the weak references is that I disabled
persistent caches across JVM restarts, but I never wanted this anyway (this
might be implemented quite easily, but might lead to long start-up times)

6) JCSCache has a complex configuration IMO. Therefore, I added default
configurations to choose from, for example:




[1] http://www.minfin.nl [2] http://www.minbuza.nl
