Hello

As has become clear (I hope), I have spent some time developing a small
framework for query cache release strategies. It aims to solve two
problems.

1 Currently caches are invalidated far too quickly. There is no proper
evaluation of changes (events) prior to flushing queries and their
result sets from the cache.

2 Extensibility: this framework will provide a simple way to optimize
cache release behaviour for specific applications.
Example:
Think of a forum with forum and forumpost nodes. Presently, when you
commit a forumpost, all queries containing a forumpost are flushed.
With the framework in place you can easily write some code to check
whether the queries in the cache query the same forum that this
forumpost was posted into, and if not, you don't have to flush the
query. This can be done either by examining the cached query object or
by examining the stored result set (the latter obviously being more
expensive).
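
A minimal sketch of the forum check described above. All class and field
names here (SketchNodeEvent, SketchCachedQuery, ForumPostStrategySketch)
are hypothetical stand-ins for illustration; they are not the MMBase
event or query classes, and the real strategy API may look different.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the event data; not the real MMBase event class.
final class SketchNodeEvent {
    final String nodeType;  // type of the committed node, e.g. "forumpost"
    final int forumNumber;  // number of the forum the post belongs to

    SketchNodeEvent(String nodeType, int forumNumber) {
        this.nodeType = nodeType;
        this.forumNumber = forumNumber;
    }
}

// Hypothetical stand-in for a cached query and the forum it is constrained to.
final class SketchCachedQuery {
    final Set<String> stepTypes;  // node types of the query steps
    final int forumNumber;        // forum this query selects posts from

    SketchCachedQuery(int forumNumber, String... stepTypes) {
        this.forumNumber = forumNumber;
        this.stepTypes = new HashSet<>(Arrays.asList(stepTypes));
    }
}

final class ForumPostStrategySketch {
    // Returns true when the cached query has to be flushed for this event.
    static boolean shouldFlush(SketchNodeEvent event, SketchCachedQuery query) {
        if (!query.stepTypes.contains(event.nodeType)) {
            return false; // the query has no step of the changed type
        }
        if (!"forumpost".equals(event.nodeType)) {
            return true;  // not a forumpost: keep the cautious default
        }
        // A forumpost only affects queries on the forum it was posted into.
        return query.forumNumber == event.forumNumber;
    }

    public static void main(String[] args) {
        SketchNodeEvent post = new SketchNodeEvent("forumpost", 7);
        SketchCachedQuery sameForum = new SketchCachedQuery(7, "forum", "forumpost");
        SketchCachedQuery otherForum = new SketchCachedQuery(8, "forum", "forumpost");
        System.out.println(shouldFlush(post, sameForum));   // prints "true"
        System.out.println(shouldFlush(post, otherForum));  // prints "false"
    }
}
```

This sketch inspects only the cached query object, which is the cheaper
of the two options mentioned above.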

The structure will be plugin-like. You can create a release strategy
class by extending an abstract base strategy class, and simply load it
(at runtime or by configuring the caches.xml file). The plugins form
a hierarchy that is traversed for each query in a cache until some
rule decides the query should not be flushed.

Extra benefits are performance measurements, like the average processing
time of a strategy class and its effectiveness, so you know at what
processing cost in the middle tier you are sparing your database tier.
This allows for interesting optimization strategies (you could choose to
move a lot of load to the middle tier through elaborate and expensive
optimizations, because you can easily scale that tier through
clustering if you cannot scale the database tier).
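
A rough sketch of how the strategy chain and the per-strategy statistics
could fit together. The names TimedStrategySketch and StrategyChainSketch
are made up for this example, and representing a query as a plain String
is a simplification; none of this is the actual framework code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class; the names are illustrative, not the MMBase API.
abstract class TimedStrategySketch {
    private long evaluations;  // how often this strategy was consulted
    private long vetoes;       // how often it prevented a flush
    private long totalNanos;   // total processing time spent evaluating

    // The single method a concrete strategy implements:
    // return true to flush the query, false to keep it in the cache.
    protected abstract boolean shouldFlush(String changedType, String query);

    final boolean evaluate(String changedType, String query) {
        long start = System.nanoTime();
        boolean flush = shouldFlush(changedType, query);
        totalNanos += System.nanoTime() - start;
        evaluations++;
        if (!flush) {
            vetoes++;
        }
        return flush;
    }

    long getEvaluations() { return evaluations; }

    long getVetoes() { return vetoes; }

    // Average processing time per evaluation in nanoseconds (0 if unused).
    double getAverageNanos() {
        return evaluations == 0 ? 0.0 : (double) totalNanos / evaluations;
    }
}

// Consults the strategies in order until one decides not to flush.
final class StrategyChainSketch {
    private final List<TimedStrategySketch> strategies = new ArrayList<>();

    void add(TimedStrategySketch strategy) {
        strategies.add(strategy);
    }

    boolean shouldFlush(String changedType, String query) {
        for (TimedStrategySketch strategy : strategies) {
            if (!strategy.evaluate(changedType, query)) {
                return false; // some rule decided to keep the query
            }
        }
        return true; // no rule objected, so the query is flushed
    }

    public static void main(String[] args) {
        TimedStrategySketch typeCheck = new TimedStrategySketch() {
            @Override
            protected boolean shouldFlush(String changedType, String query) {
                return query.contains(changedType);
            }
        };
        StrategyChainSketch chain = new StrategyChainSketch();
        chain.add(typeCheck);
        System.out.println(chain.shouldFlush("forumpost", "forum,forumpost")); // prints "true"
        System.out.println(chain.shouldFlush("forumpost", "news"));            // prints "false"
        System.out.println(typeCheck.getVetoes());                             // prints "1"
    }
}
```

The number of vetoes gives a simple measure of effectiveness (flushes
avoided), and the average time gives the middle tier cost paid for it.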

I hope some people will want to join this (little) project, for there are
some things yet to be done, and it needs to be tested well because it
touches some vital parts of MMBase.
Some people suggested I should give a presentation about this first.
That's OK with me (and I promise I will try to make it better than the
last one :) )


For more details I recommend reading my previous mail of 2005-08-02.

Included is a project proposal (in DocBook).

START OF VOTING:   2005-08-09 15:00
END OF CALL:       2005-08-16 12:00

[_] +1 (YEA)
[_] +0 (ABSTAIN )
[_] -1 (NAY), because :
[_] VETO, because:


regards,

Ernst
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd">
<article class="productsheet">
  <articleinfo>
    <title>Cache Release Strategy Framework Project Document</title>

    <date>2005-08-05</date>

    <edition>$Id$</edition>

    <authorgroup>
      <!-- one or more authors -->

      <author>
        <firstname>Ernst</firstname>

        <surname>Bunders</surname>
      </author>
    </authorgroup>

    <revhistory>
      <revision>
        <revnumber>1.0</revnumber>

        <date>2005-08-05</date>

        <authorinitials>PvR</authorinitials>

        <revremark>First version project proposal</revremark>
      </revision>
    </revhistory>

    <abstract>
      <para>This project will add an extension mechanism for query cache
      release strategies. It will also add a new event mechanism to
      facilitate more advanced cache release strategies. Finally, it will
      provide a good base release strategy for common use.</para>
    </abstract>

    <legalnotice>
      <para>This software is OSI Certified Open Source Software. OSI Certified
      is a certification mark of the Open Source Initiative.</para>

      <para>The license (Mozilla version 1.0) can be read at the MMBase site.
      See <ulink url="http://www.mmbase.org/license">
      http://www.mmbase.org/license </ulink></para>
    </legalnotice>
  </articleinfo>

  <section id="motivation">
    <title>Motivation</title>

    <formalpara>
      <title>Why is this project necessary?</title>

      <para>MMBase uses a number of query caches to reduce expensive database
      querying. These caches perform OK as long as you don't start updating or
      inserting nodes in the cloud. Because changes are never evaluated,
      every modification of a node results in the flushing of all
      queries that contain a step of this type.</para>

      <para>Apart from lacking a default strategy for evaluating node/relation
      changes and deciding whether to keep or flush cached data, MMBase
      also lacks the possibility to add application-specific cache
      invalidation logic. For that reason it is currently impossible to build
      applications in MMBase that query heavily on node types they also
      update or create frequently. Under high load these applications will
      suffer performance problems, unless you build your own caching
      mechanism.</para>
    </formalpara>

    <formalpara>
      <title>Why does this have to be an MMBase project instead of an external
      package?</title>

      <para>The changes will affect the core.</para>
    </formalpara>
  </section>

  <section id="goal">
    <title>Goal</title>

    <formalpara>
      <title>What is the purpose of this project?</title>

      <para>The purpose of this project is manifold. First, we need better
      default behaviour for evaluating node events by the QueryResultCache
      subclasses. The current procedure just begs for optimization. Then we
      want to create a clear extension point where node change event
      evaluation modules (called strategies) can be inserted. These strategies
      must be easy to create by extending an abstract strategy class and
      implementing a single method. The framework must allow us to evaluate
      our strategies in terms of performance and cost (processing time), and
      also to load, unload, disable and enable them at runtime. Finally, to
      accommodate all this we need a better event model that tells us more
      about the event that occurred.</para>
    </formalpara>

    <formalpara>
      <title>Which goals need to be achieved in order to consider the project
      completed?</title>

      <para>The following goals make up this project: <itemizedlist>
          <listitem>
            <para>Adding a new event model to the core. This event model will
            use MMNodeEvent and MMRelationEvent classes as containers for
            event data. More details of the event will be available. This is
            needed to make more fine-grained cache release strategies.</para>
          </listitem>

          <listitem>
            <para>Adding a framework for query cache release strategies. There
            will be some utilities like an AbstractReleaseStrategy to easily
            roll your own strategies. Release strategies will keep statistics
            about their cost and performance, so it is possible to investigate
            and optimize their performance.</para>
          </listitem>

          <listitem>
            <para>Providing a good default cache release strategy.</para>
          </listitem>

          <listitem>
            <para>Providing an HTML interface to monitor the release strategies
            on any cache. For this purpose the
            admin/tools/caches page will be extended. Through this interface
            it will also be possible to enable/disable strategies, and to load
            new ones.</para>
          </listitem>
        </itemizedlist></para>
    </formalpara>
  </section>

  <section id="Design">
    <title>Design</title>

    <formalpara>
      <title>What are the key classes in the design?</title>

      <para>For the event model there are a couple of new classes: in
      org.mmbase.module.core there are MMNodeEvent, MMRelationEvent and
      MMNodeEventListener. For the cache release strategy framework there are
      the following classes in the package org.mmbase.cache. First there is
      the interface QueryResultCacheReleaseStrategy. The basic implementation
      is AbstractReleaseStrategy.java, which does some things for you, like
      keeping statistics. Extending that there are MultiReleaseStrategy.java,
      which is a wrapper for a collection of strategies and serves as 'the'
      strategy implementation for the query caches, and
      BasicReleaseStrategy.java, which is a sensible basic set of tests for
      general purpose use. A MultiReleaseStrategy instance will always load a
      BasicReleaseStrategy instance and will not allow you to remove it.
      </para>
    </formalpara>

    <formalpara>
      <title>What are the key technologies used?</title>

      <para>Java.</para>
    </formalpara>
  </section>

  <section id="impact">
    <title>Impact</title>

    <formalpara>
      <title>Which existing parts of MMBase are modified?</title>

      <para>For the event model some core classes are modified. MMObjectNode
      gets a new map for old field values, for they are part of the new event
      information. MMBase and MMObjectBuilder are changed to allow adding and
      removing event listeners, and to notify them as needed.
      org.mmbase.module.core.MMBaseChangeInterface is extended with a method
      changedNode(MMNodeEvent) and org.mmbase.storage.util.ChangeManager is
      modified to create and pass on MMNodeEvents and MMRelationEvents to it.
      </para>
    </formalpara>

    <para>For the cache synchronization (unicast and multicast) the impact is
    not clear yet, as I have not worked that out. My estimate is that
    only the org.mmbase.modul.change.SharedStorage needs to be changed, in
    order to be able to create messages from events, and events from messages.
    This needs further attention though.</para>

    <para>For the cache release strategy framework the QueryResultCache is
    modified. It now must load a MultiReleaseStrategy instance and make the
    Observer instances use it to evaluate occurring events on cached queries.
    New methods are added to Observer to register as MMNodeEventListener with
    the builder. The method invalidateAll() will have to go. The class
    Cache.java is modified to read the extended cache configuration for the
    strategies that should be loaded at startup, and the caches_1_0.dtd has
    been extended (backwards compatible, so I reckon the version number can
    stay the same).</para>

    <formalpara>
      <title>Which new packages will be created?</title>

      <para>None.</para>
    </formalpara>

    <formalpara>
      <title>What are the backward compatibility issues?</title>

      <para>None. The old event model and methods will still work, although
      they will be deprecated. I don't expect the changes made to the cache
      classes to cause any backwards compatibility issues.</para>
    </formalpara>
  </section>

  <section id="planning">
    <title>Planning</title>

    <formalpara>
      <title>What is the estimated planning and total duration of the
      project?</title>

      <para>The plan is to finish this project within two months.</para>
    </formalpara>
  </section>
</article>