RE: StoreJanitor

Ard Schrijvers Wed, 04 Apr 2007 01:08:12 -0700

> 
> AFAICS there are two freeing algorithms in trunk: round-robin 
> and all-stores.


I already suspected it would be something like this.

</snip>

> and this is IMO one of the major weaknesses of ehcache (or I 
> missed it completely): I did not find any way to limit the number of 
> disk store entries.
> 
> Actually we don't configure this value. According to 
> http://ehcache.sourceforge.net/documentation/configuration.html
> the default value is 0, meaning unlimited. We should use the 1.2.4 
> constructor that allows us to set a maxElementsOnDisk parameter.

That was added to ehcache only recently, right? I never saw it, but in my
opinion it is extremely important to set it to a sensible value. Cocoon
uses some quite ingenious caching tricks, but the average user won't be
aware of the millions of cache entries you can leave behind (for example
by putting a timestamp in a cachekey).
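
To make the timestamp problem concrete, here is a minimal sketch in plain
Java. It is not Cocoon or EHCache code: BoundedStore and
fillWithTimestampKeys are hypothetical names, and the LinkedHashMap cap only
stands in for what a limit such as maxElementsOnDisk would do in a real
store.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Cocoon or EHCache.
class BoundedStore<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedStore(int maxEntries) {
        // accessOrder = true: the eldest entry is the least recently used one
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict as soon as the cap is exceeded
        return size() > maxEntries;
    }

    // Simulates requests whose cache key embeds a changing timestamp,
    // then reports how many entries the store ends up holding.
    static int fillWithTimestampKeys(Map<String, String> store, int requests) {
        for (int i = 0; i < requests; i++) {
            long fakeTimestamp = 1175674092000L + i; // a new key on every request
            store.put("page.html?ts=" + fakeTimestamp, "cached response");
        }
        return store.size();
    }
}
```

With an unbounded map every request leaves an entry behind that will never
be hit again; with the cap the store stays at its configured size.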

> 
> I wonder what StoreJanitor is good for at all. EHCache takes 
> care that the number of items in the memory cache doesn't grow 
> indefinitely and starts its own cleanup threads for the disk store 
> (http://ehcache.sourceforge.net/documentation/storage_options.html#DiskStore). 
> JCS will probably do the same. 

Yes, this is exactly my point. The extra problem is that the StoreJanitor
never has access to the eviction policy of the cache, and just starts
throwing out entries "at random".
In my experience, my app only runs solidly when the StoreJanitor never
runs :-)
Therefore, I have created a few store size options to choose from, matching
different JVM memory sizes. Then, when the app is "finished", I crawl the
site (xenu [1]) for an hour and watch memory usage in the status generator,
the YourKit profiler or something similar. If I see the nicely shaped
sawtooth (is this only Dutch? :-) ) of memory usage, the stores are
configured correctly.
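
If no profiler is at hand, the sawtooth can also be checked roughly from
inside the JVM. The following is only a sketch under my own assumptions:
HeapSampler and its methods are hypothetical names, and counting drops
between samples is a crude stand-in for looking at a real memory graph.

```java
// Hypothetical helper; uses only java.lang.Runtime.
class HeapSampler {
    // Heap bytes currently in use (allocated heap minus free heap).
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Takes 'count' samples of used heap, sleeping 'intervalMillis' after each one.
    static long[] sample(int count, long intervalMillis) throws InterruptedException {
        long[] samples = new long[count];
        for (int i = 0; i < count; i++) {
            samples[i] = usedHeapBytes();
            Thread.sleep(intervalMillis);
        }
        return samples;
    }

    // Counts how often usage falls between consecutive samples.
    // Each fall is one "tooth" of the sawtooth.
    static int countDrops(long[] samples) {
        int drops = 0;
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] < samples[i - 1]) {
                drops++;
            }
        }
        return drops;
    }
}
```

Something like HeapSampler.countDrops(HeapSampler.sample(60, 1000)) while the
crawler runs gives a first impression: regular drops suggest the garbage
collector can reclaim memory, while a line that only climbs suggests the
stores are holding too much.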


> I guess that original purpose 
> of StoreJanitor was 
> when Cocoon had its own store implementations (transient, 
> persistent) and we had 
> to take care of cleaning them up in our code.

That must indeed have been the reason (I did not know this one, it was
before my time, so I never understood how the StoreJanitor would ever
help me out).

> Only the persistent store can grow unlimited but since it 
> should only be used 
> for special usecases, it shouldn't be a real problem.
> 

</snip>

> 
> 
> What do we want to do in order to improve the situation? 
> After reading your mail 
> and from my own experience I'd say
> 
>   - introduce a maxPersistentObjects parameter and use it in 
> EHDefaultCache to set maxElementsOnDisk

+1 

>   - make the registration of stores at StoreJanitor configurable
>     (Though I wonder what the default value should be, true or false?)

0: I would prevent the StoreJanitor from running anyway

>   - fix EventRegistry

+1: I have fixed this locally so that it also works when cache entries are
removed by the internals of the cache.
I did this by using WeakReferences instead of the
AbstractDoubleMapEventRegistry, so that when the cache keys aren't present
anymore, the JVM itself cleans the registry. Two problems:

1: I removed the persistent cache between JVM restarts, but could rebuild
this (at the cost of long startup times though).
2: With former versions of EHCache, my weak references were not honoured
when cache entries were overflowed to disk.
Therefore, I thought EHCache might be doing something with the cachekey when
moving it to the disk cachekey map. I could only see this behavior in
combination with Cocoon, and not when I tested EHCache separately.
On the EHCache user list, Greg told me that this was not possible, and also
showed it.
I am now using JCSCache, which I am pretty OK with (only the configuration
is hard).
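
The weak-reference idea can be sketched with java.util.WeakHashMap. This is
a simplified, hypothetical WeakEventRegistry and not my actual local fix: it
only maps cache keys to event names, while the real registry also needs the
reverse lookup from event to keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch; not the EventRegistry shipped with Cocoon.
class WeakEventRegistry {
    // Keys are held weakly: once nothing else references a key,
    // its entry becomes eligible for removal by the JVM.
    private final Map<Object, List<String>> eventsByKey =
            new WeakHashMap<Object, List<String>>();

    synchronized void register(Object cacheKey, String event) {
        eventsByKey.computeIfAbsent(cacheKey, k -> new ArrayList<String>()).add(event);
    }

    // Returns a copy so callers cannot modify the registry's internal list.
    synchronized List<String> eventsFor(Object cacheKey) {
        List<String> events = eventsByKey.get(cacheKey);
        return events == null ? new ArrayList<String>() : new ArrayList<String>(events);
    }

    synchronized int size() {
        return eventsByKey.size();
    }
}
```

Two things to keep in mind with this approach: the registry must be given
the very same key instance that the cache holds, and the values must not
reference the key strongly, otherwise the entry is never released.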

By the way, if we start fixing the other points, like setting a
maxdiskobjects, the OOMs due to the event registry will increase.
This is a problem with MultiHashMap (and also with its non-deprecated
replacement): when you do:
map.put("1","test");
map.put("1","test");

you have two values for key "1". 
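
The effect can be reproduced without Commons Collections. In this sketch
MultiMapDemo is a hypothetical class that builds a tiny multimap from
java.util types; backing each key with a list keeps the duplicate, backing
it with a set does not.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical demo; it imitates a multimap with plain java.util collections.
class MultiMapDemo {
    // Puts every value under the same key and returns how many values are stored.
    // 'factory' decides which collection type holds the values of a key.
    static int valuesStored(Supplier<Collection<String>> factory, String key, String... values) {
        Map<String, Collection<String>> map = new HashMap<String, Collection<String>>();
        for (String value : values) {
            map.computeIfAbsent(key, k -> factory.get()).add(value);
        }
        Collection<String> stored = map.get(key);
        return stored == null ? 0 : stored.size();
    }
}
```

Calling valuesStored(ArrayList::new, "1", "test", "test") stores two values,
while valuesStored(LinkedHashSet::new, "1", "test", "test") stores one. A
registry that re-registers the same event for the same key on every request
therefore keeps growing with list-backed values.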


> 
> Any further ideas?

Hmmm, yes, but I am not sure whether others will like it: I think it might
be good that, when the StoreJanitor runs, there is at least an info message
(error level...? I frequently want to put info in messages that is so
important that it must be at error level to not be missed, but this is
stupid, right?) about possible problems:

either:
1) your JVM memory settings are too low
2) your stores are configured to have too many memory items
3) your cached objects are very large
4) you have a memory leak in some custom component (a little vague yes :-) )
....etc
Try running a crawler (xenu) and watch the memory usage on your status page.

Another improvement might be to avoid binary readers putting entries in
the memory cache. But this might be too complex for the average user. In
principle, I have been bugging everybody here to:

1) use readers in *noncaching* pipelines, and use appropriate expires times
in the readers; this is very important for fast pages because browsers
honour the expires time
2) we also read binaries from our repository: these obviously need to be
cached, but what if they are mp3 files of 15 MB apiece? Storing these in a
normal store...so, I added a protocol, cached-binary:, which in our setup
uses a different store that is configured to have no memory part, only a
disk cache.
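
The routing behind point 2 boils down to a decision on the protocol prefix.
The sketch below is hypothetical: StoreRouter and StoreKind are made-up
names, and only the cached-binary: prefix comes from our setup. How the two
stores are actually wired up depends on the store configuration.

```java
// Hypothetical sketch of choosing a store by source protocol.
class StoreRouter {
    enum StoreKind {
        DEFAULT,   // normal store with a memory part and a disk part
        DISK_ONLY  // store configured without a memory part
    }

    // Large binaries fetched through the cached-binary: protocol go to the
    // disk-only store; everything else uses the default store.
    static StoreKind storeFor(String sourceUri) {
        if (sourceUri != null && sourceUri.startsWith("cached-binary:")) {
            return StoreKind.DISK_ONLY;
        }
        return StoreKind.DEFAULT;
    }
}
```

The point of the split is that one 15 MB entry cannot push hundreds of small,
frequently used entries out of the memory cache.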

Then again, perhaps the thing above isn't something we can code (except for
changing some things regarding having multiple event registries),
but...perhaps I should wikify it for advanced usage? It is quite some
stuff, though.

Sometimes people have complained to me that
1) cocoon caching is difficult
2) why nobody explained before how cachekeys work, the status generator 
cachekey overview, 
how validities work, etc etc

But, I doubt if there are frameworks around where you get so much ingenious 
caching for free,
where 95% of the users never have to know about it. And, indeed, when you want 
to run sites
with > 100.000 pages, you indeed need to know more about it. I do think that is 
normal. 

I think it is brilliant of cocoon that we run sites of 100.000 pages with
many users and editors, which never go down and run everything live with
eventcache, and have cached response times within 32 ms (my latest setups,
a skeleton generator with standard conf and sitemaps, even go to 0-15 ms).
I did not get this for free. It took me around 3 months to have everything
configured, rebuilt, added and understood correctly.
I am not sure about the best way to have it for free for everybody, without
needing to understand it all (or at least getting proper info about it).
 
WDOT?

Ard

> 
> 
> P.S. Ard, answering to your mails is very difficult because 
> there are no line 
> breaks. Is anybody else experiencing the same problem or is 
> it only me?

For the moment I am putting in line breaks with enter, but that probably
doesn't make it any better, does it?
Sorry if so; I will try to start using Thunderbird if it is still a problem.

Ard

[1] http://home.snafu.de/tilman/xenulink.html

> 
> -- 
> Reinhard Pötz           Independent Consultant, Trainer & (IT)-Coach 
> 
> {Software Engineering, Open Source, Web Applications, Apache Cocoon}
> 
>                                         web(log): http://www.poetz.cc
> --------------------------------------------------------------------
>
