introduction:

This is a proposal to extend the mmbase nodeEvent system and to implement a 
cache release strategy framework around the QueryResultCache class. The goal is 
to optimize the way the query result caches invalidate cached queries in 
response to node events. I suppose the target release is 1.8.

At vara we have some very busy mmbase websites, and we run into performance 
problems on a regular basis. To address these problems I have looked at the 
mmbase node caching (read: query caching) mechanism, and discovered that it 
presently does a very bad job of protecting the cached data during updates and 
inserts of new data. As long as there are no mutations in the data the caches 
are adequate, but otherwise you are in trouble.
At vara we are currently working towards a setup where we have a separate 
server in our cluster to serve the wizards, and there will also be a mechanism 
to 'save up' node event broadcasts, so the 'frontend' servers can enjoy their 
cached data longer even while editors do their job.
This is a nice hack, but the real problem lies elsewhere. What about fora, 
polls, directory services and so on, where (many) users are constantly 
committing nodes of the same type that are constantly queried on? In this 
situation the query caches of mmbase fail and you have a real problem (at 
20,000+ visitors a day or so). Presently the 'multilevelcache' hardly ever 
performs over 33%, which is very poor.
Currently the evaluation of a node event is minimalistic and will in nearly 
every situation result in a flush. What's worse, after inserts or updates all 
queries containing a step of the type that triggered the event are flushed 
without discrimination, outside the system SearchQueryCache uses for 
invalidating cached queries (see 'problems').


cache release strategy framework:

To address this problem I first thought about a cache pipeline system where 
custom caches could be inserted to accommodate specific systems like a forum. 
Data requests would then 'bubble' up through the hierarchy, with the database 
as last resort.
After investigating the current situation I changed my approach. All query 
result caches extend org.mmbase.cache.QueryResultCache, and this class also 
handles invalidation (see 'background'). I decided to use this class as an 
insertion point for modular 'release strategies'. If a node (or relation) 
event occurs, all release strategies on a cache are asked to evaluate each 
entry in the cache that appears to be affected by the event, until some 
evaluation decides there is no reason to flush that entry.
A release strategy class has to implement an interface, and any number of 
release strategies can be loaded and unloaded on any of the QueryResultCache 
subclasses. There is an AbstractReleaseStrategy class for convenience that does 
some plumbing, like keeping statistics and so on.
Every release strategy is responsible for keeping certain statistics. Every 
cache entry evaluation is timed, and statistics like average execution time 
and performance (as in: how many flushes were prevented) are collected, so you 
can weigh its cost against its performance. These statistics will be 
accessible in the admin/tools/caches page, where you can also enable/disable 
strategies, and load new ones on the fly.
Which strategies should be loaded for which cache at startup can be configured 
in caches.xml. You can configure strategies to be loaded by all caches or by a 
specific one. I added some semantics to the dtd for that, and adapted the 
Cache class to read the configuration.
It is a well known fact that node commits and updates can sometimes take a long 
time (500ms+ on big and busy clouds). Of course the caches are (at least 
partly) responsible for this, because a lot of cached entries must be 
evaluated. This framework allows you to see exactly how much a release strategy 
costs you, and how effective it is. 
You can even unload, edit, and reload a strategy on a running system to 
experiment with optimization.
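
The framework described above could look roughly like the following Java 
sketch. This is an illustration under assumptions, not the actual code: the 
interface name ReleaseStrategy, the class ReleaseStrategyChain, all method 
names, and the use of plain Strings as stand-ins for events and cached queries 
are hypothetical. Only the name AbstractReleaseStrategy comes from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; the name and signature are assumptions.
interface ReleaseStrategy {
    String getName();
    // Returns true if the cached entry should be flushed for this event.
    boolean shouldRelease(String event, String cachedQuery);
}

// Convenience base class that does the plumbing: it times every
// evaluation and counts how many flushes the strategy prevented.
// Not thread safe; a real implementation would need synchronization.
abstract class AbstractReleaseStrategy implements ReleaseStrategy {
    private long evaluations = 0;
    private long preventedFlushes = 0;
    private long totalNanos = 0;

    public final boolean shouldRelease(String event, String cachedQuery) {
        long start = System.nanoTime();
        boolean release = doEvaluate(event, cachedQuery);
        totalNanos += System.nanoTime() - start;
        evaluations++;
        if (!release) {
            preventedFlushes++;
        }
        return release;
    }

    // The actual rule, supplied by the concrete strategy.
    protected abstract boolean doEvaluate(String event, String cachedQuery);

    public long getEvaluations() { return evaluations; }
    public long getPreventedFlushes() { return preventedFlushes; }
    public double getAverageNanos() {
        return evaluations == 0 ? 0.0 : (double) totalNanos / evaluations;
    }
}

// Holds the strategies loaded on one cache and asks them in turn.
class ReleaseStrategyChain {
    private final List<ReleaseStrategy> strategies = new ArrayList<ReleaseStrategy>();

    void load(ReleaseStrategy strategy) { strategies.add(strategy); }

    boolean unload(String name) {
        return strategies.removeIf(s -> s.getName().equals(name));
    }

    // Stops at the first strategy that sees no reason to flush the entry.
    // With no strategies loaded, every affected entry is flushed.
    boolean shouldRelease(String event, String cachedQuery) {
        for (ReleaseStrategy s : strategies) {
            if (!s.shouldRelease(event, cachedQuery)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the timing and counting live in the base class, a concrete strategy 
only has to supply the rule itself, and the counters are what an admin page 
could display to weigh cost against prevented flushes.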

Basically there are two levels of evaluating an event against a query. First 
there is the structure, where you look at things like 'is the changed relation 
type actually used in the query?' or 'if the query has one step, ignore 
relation changes'.
The next level is where you start evaluating the cached data, where you get 
things like 'there is a list of some nodes with offset and max (paging), does 
the changed node belong to the page this query represents?'. This type is more 
costly, but can save database cpu. As mmbase servers can be clustered, and many 
database servers can not, it might be a strategic decision to pull the load 
away from the database into the (clustered) middle tier. To accommodate these 
kinds of scenarios I think the performance statistics are very important.
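
The two levels can be sketched in Java as follows. The classes 
CachedQuerySketch and EvaluationLevels are hypothetical stand-ins invented for 
this example and are not part of mmbase. The data-level check also assumes 
that the change is an update to an existing node that does not touch the 
query's constraints or ordering; inserts, deletes and reordering changes would 
need a more careful rule.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a cached query and its result.
class CachedQuerySketch {
    final List<String> stepTypes;     // builder names of the query steps
    final List<Integer> resultNodes;  // node numbers on the cached page

    CachedQuerySketch(List<String> stepTypes, List<Integer> resultNodes) {
        this.stepTypes = stepTypes;
        this.resultNodes = resultNodes;
    }
}

class EvaluationLevels {
    // Level 1: look at the structure of the query only (cheap).
    static boolean structureSaysFlush(CachedQuerySketch query,
                                      String changedType,
                                      boolean relationEvent) {
        // A query with a single step cannot depend on relations.
        if (relationEvent && query.stepTypes.size() == 1) {
            return false;
        }
        // If the changed type is not used in the query, keep the entry.
        return query.stepTypes.contains(changedType);
    }

    // Level 2: look at the cached data (more costly, but it can save a
    // trip to the database). Flush only if the changed node is on the
    // page that this cached query represents.
    static boolean dataSaysFlush(CachedQuerySketch query, int changedNode) {
        return query.resultNodes.contains(changedNode);
    }
}
```

A strategy would normally run the structural check first and only fall 
through to the data check when the structure alone cannot rule out a flush.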

application specific release strategies:
This is another point where the framework comes in handy. For instance you have 
a forum, and you have forumposts linked to forums. You could write a release 
strategy that takes into account that forumposts are related to forums, so if a 
new post is created in some forum, queries on other fora should not be touched. 
This kind of behaviour is what I was thinking of with the original caching 
pipeline.
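
A forum-aware rule of this kind might be sketched as below. Everything here 
is hypothetical: the classes PostCreated, CachedForumQuery and 
ForumReleaseStrategy are made up for the example, and the sketch simply 
assumes that the strategy can find out which forum a new post belongs to and 
which forum a cached query was constrained to.

```java
// Hypothetical event: a new forumpost was created in a given forum.
class PostCreated {
    final int postNumber;
    final int forumNumber;

    PostCreated(int postNumber, int forumNumber) {
        this.postNumber = postNumber;
        this.forumNumber = forumNumber;
    }
}

// Hypothetical description of a cached query that lists the posts
// of exactly one forum.
class CachedForumQuery {
    final int forumNumber;

    CachedForumQuery(int forumNumber) {
        this.forumNumber = forumNumber;
    }
}

// Application-specific rule: a new post only invalidates cached
// queries on the forum it was posted in.
class ForumReleaseStrategy {
    boolean shouldRelease(PostCreated event, CachedForumQuery query) {
        return event.forumNumber == query.forumNumber;
    }
}
```

With a rule like this, a post in one forum leaves the cached post lists of 
all other fora untouched.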

But all this is of course only interesting if we are able to build such release 
strategies, which brings me to the event model, the source of change 
information.

eventmodel:

Currently the SearchQueryCache listens to node events, and uses their 
information to decide if a query (and its data) should be flushed. This is 
basically sound, but the current node event yields hardly any information 
about the event. The only thing it says is: this node, of this type, has been 
created, changed, or deleted, or one of these happened to one of its relations. 
It also tells you if the event happened locally or remotely, and on what 
machine.
What it does not say is: 
if a node has changed:
- which field has changed
- what was the previous value, and what is the new value?
if a relation has changed:
- which relation (role, target, destination)
- what kind of change (new, changed, deleted)
- if changed, which field, and what was the previous value?

To address these shortcomings I introduced a new event model. I created an 
MMNodeEvent class and, extending from that, an MMRelationEvent class. There is 
also a matching eventListener interface. So far I have implemented only the 
local version of the event process, and not the unicast/broadcast variety, for 
which I think I will need help. The advantage of this new event model is that 
there is much more opportunity for optimizing cache release strategies. The 
event classes implement the Serializable interface.
I also had to make a change to MMObjectNode, adding a map that stores the old 
value when a field is changed. It only does this once per field, so it will 
always contain the value that is presently in the database. When the node is 
committed to the database, the map is cleared.
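
The old-value bookkeeping and the richer event can be sketched together in 
Java. The field layout of the event below and the TrackedNode class (a 
stand-in for the change made to MMObjectNode) are assumptions made for this 
example; only the class name MMNodeEvent and the fact that it is Serializable 
come from the text.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Sketch of an event that carries old and new field values.
// The fields are assumptions; the values stored in the maps must
// themselves be serializable for the event to be sent to other servers.
class MMNodeEvent implements Serializable {
    private static final long serialVersionUID = 1L;
    final int nodeNumber;
    final String builderName;
    final HashMap<String, Object> oldValues;
    final HashMap<String, Object> newValues;

    MMNodeEvent(int nodeNumber, String builderName,
                HashMap<String, Object> oldValues,
                HashMap<String, Object> newValues) {
        this.nodeNumber = nodeNumber;
        this.builderName = builderName;
        this.oldValues = oldValues;
        this.newValues = newValues;
    }
}

// Hypothetical stand-in for the change tracking added to MMObjectNode.
class TrackedNode {
    private final int number;
    private final String builderName;
    private final Map<String, Object> values = new HashMap<String, Object>();
    private final Map<String, Object> oldValues = new HashMap<String, Object>();

    TrackedNode(int number, String builderName) {
        this.number = number;
        this.builderName = builderName;
    }

    void setValue(String field, Object value) {
        // Remember the previous value only the first time a field is
        // changed, so oldValues always holds what is in the database.
        if (!oldValues.containsKey(field)) {
            oldValues.put(field, values.get(field));
        }
        values.put(field, value);
    }

    // Builds the event for the changed fields and clears the old values,
    // as would happen once the node has been written to the database.
    MMNodeEvent commit() {
        HashMap<String, Object> changed = new HashMap<String, Object>();
        for (String field : oldValues.keySet()) {
            changed.put(field, values.get(field));
        }
        MMNodeEvent event = new MMNodeEvent(number, builderName,
                new HashMap<String, Object>(oldValues), changed);
        oldValues.clear();
        return event;
    }
}
```

Setting the same field twice before a commit still reports the database value 
as the old value, which is what a release strategy needs to compare against 
the cached result.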

background:

how the QueryResultCache invalidates its content:
The QueryResultCache is the base class for all caches that store queries and 
their result sets. 
The QueryResultCache has an inner class called Observer. An Observer listens to 
the node events of a specific type of builder. When a query is cached, all its 
steps are iterated over, and if an Observer for the type of a step does not 
exist, it is created, and the query is registered with it. This creates a 
matrix where a query is registered with all the observers that correspond to 
its steps, and an Observer holds a reference to all queries containing a step 
of its type.
When an observer receives a node event, it iterates over all queries it holds 
a reference to, evaluates whether the query should be flushed (presently nearly 
always), and then removes it from the cache if so.
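
The registration matrix described above can be illustrated with a small Java 
sketch. It is a simplified model and not the real QueryResultCache: the class 
QueryCacheSketch, its methods, and the use of String keys for queries and 
builder types are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified model of a query cache with one observer per builder type.
class QueryCacheSketch {
    // query key -> cached result
    private final Map<String, String> cache = new HashMap<String, String>();
    // builder type -> keys of all cached queries with a step of that type
    private final Map<String, Set<String>> observers = new HashMap<String, Set<String>>();

    void put(String queryKey, List<String> stepTypes, String result) {
        cache.put(queryKey, result);
        for (String type : stepTypes) {
            // Create the observer for this type if it does not exist yet,
            // then register the query with it.
            observers.computeIfAbsent(type, t -> new HashSet<String>()).add(queryKey);
        }
    }

    boolean contains(String queryKey) {
        return cache.containsKey(queryKey);
    }

    // Called when a node of the given type changed.
    // Returns the number of flushed entries.
    int nodeChanged(String type) {
        Set<String> registered = observers.get(type);
        if (registered == null) {
            return 0;
        }
        int flushed = 0;
        // Iterate over a copy, because remove() modifies the observer sets.
        for (String queryKey : new ArrayList<String>(registered)) {
            if (shouldRelease(type, queryKey)) {
                remove(queryKey);
                flushed++;
            }
        }
        return flushed;
    }

    // Models the present behaviour: (nearly) always flush.
    protected boolean shouldRelease(String type, String queryKey) {
        return true;
    }

    private void remove(String queryKey) {
        cache.remove(queryKey);
        for (Set<String> keys : observers.values()) {
            keys.remove(queryKey);
        }
    }
}
```

The shouldRelease() method marks the spot where a more discriminating 
evaluation could be plugged in, in place of the unconditional flush.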


problems/todo:

There are some problems to solve yet:
1 the methods insert(), commit() and removeNode() of MMObjectBuilder call a 
special method in QueryResultCache: invalidateAll(). This method simply 
invalidates all queries containing steps of a certain type. This runs 
absolutely counter to the whole idea of the cache invalidation system, and 
must be killed. It is supposed to be there to make sure (local) queries return 
the right content straight after a database write, but logging shows that this 
method is never called before the proper event was handled, so it should not be 
necessary. But somebody has put it there for a reason, so I would like to know 
more about it.
2 the whole uni/multicasting side of the new event model has not been done yet, 
and I don't feel very comfortable about doing it. At least I will need help.
3 there are some questions about what events MMObjectBuilders should propagate 
to their super classes (like relation events).
4 I'm not sure if a changed relation should also trigger a nodeEvent (I don't 
think so, but am not sure).
5 the event objects must be serializable, but there might be problems with the 
oldvalues and newvalues maps.
6 in this project I mainly focused on the framework and the event model, but of 
course there must be a basic all-purpose release strategy to be loaded by 
default. I made a start on this as well, but I think some collaboration should 
be able to produce a more complete list of cheap, general validation rules.

conclusion:

Since becoming a committer I have not committed to mmbase yet (although I did 
some work on uml2mmbase), and this is quite a leap from nothing to some major 
changes. I do think this could be a nice improvement for mmbase, but I don't 
know the core well enough to pull this off on my own, so I'm looking for:
1 sufficient support for the idea.
2 a code review of what has been created so far by a senior committer.
3 help on implementing the event model and perhaps the release strategy 
framework (although most is done).
4 some serious testing.
5 should this be a project?

-----------
Ernst Bunders
tel: 035 6711653
