My original cache management scheme used a weighted LRU where a cache
hit would set the high-order bit of an aging value in the BDB. Periodic
aging cycles would right-shift the aging values, ending up with aging
values that, to a large degree, represented the historical hit rate. It
worked like a charm for a small number of buffers, but as memory got
larger and buffer pools much bigger, the cost of maintaining the list
became prohibitive. Borland, I believe, ripped it out.
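For concreteness, the mechanics looked roughly like this (a minimal
sketch, not the original code; the Buffer struct, field names, and the
linear victim scan are all my own stand-ins):

    #include <cstdint>
    #include <vector>

    struct Buffer {
        uint8_t age = 0;        // aging value: decayed history of hits
        // ... page contents, hash links, etc.
    };

    // On a cache hit: set the high-order bit of the aging value.
    inline void noteHit(Buffer& buf) {
        buf.age |= 0x80;
    }

    // Periodic aging cycle: right-shift every aging value so each hit
    // bit decays toward the low end and eventually falls off.  What
    // remains approximates the buffer's historical hit rate.
    void agingCycle(std::vector<Buffer>& pool) {
        for (Buffer& buf : pool)
            buf.age >>= 1;
    }

    // Eviction: the buffer with the smallest aging value has the
    // weakest hit history and is the best victim.
    Buffer* pickVictim(std::vector<Buffer>& pool) {
        Buffer* victim = nullptr;
        for (Buffer& buf : pool)
            if (!victim || buf.age < victim->age)
                victim = &buf;
        return victim;
    }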
A scheme I have used successfully in subsequent systems is a cycle
manager that wakes up from time to time, bumps the cycle number, does
any necessary work, and goes back to sleep. In this scheme, an object
reference copies the cycle number into the object without worrying about
contention or interlocking. When the cycle manager wakes up, it does a
linear scan through the managed objects, moving objects referenced in
the previous cycle to the front of the list with a single lock on the
LRU list.
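In outline (a sketch with hypothetical Object and CycleManager types;
the point is that noteReference needs no lock or interlocked
instruction, since a stale or torn stamp merely delays promotion by one
cycle):

    #include <atomic>
    #include <iterator>
    #include <list>
    #include <mutex>

    struct Object {
        unsigned lastCycle = 0;    // cycle of the most recent reference
        // ... payload ...
    };

    class CycleManager {
        std::atomic<unsigned> cycle { 1 };
        std::list<Object*> lru;    // front = most recently promoted
        std::mutex lruLock;

    public:
        // Called on every object reference: an unsynchronized stamp.
        void noteReference(Object* obj) {
            obj->lastCycle = cycle.load(std::memory_order_relaxed);
        }

        // Called when the cycle manager wakes up: bump the cycle
        // number, then promote everything referenced in the previous
        // cycle under a single acquisition of the LRU lock.
        void runCycle() {
            unsigned previous = cycle.fetch_add(1, std::memory_order_relaxed);
            std::lock_guard<std::mutex> guard(lruLock);
            for (auto it = lru.begin(); it != lru.end(); ) {
                auto next = std::next(it);
                if ((*it)->lastCycle == previous)      // referenced last cycle
                    lru.splice(lru.begin(), lru, it);  // move to front
                it = next;
            }
        }
    };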
The two techniques could easily be combined, giving priority to objects
with references in several cycles over objects referenced in only a
single cycle. It would take a slightly fancier data structure to merge
weighted references into the middle of the list rather than just at the
list head, but nothing that a little cleverness couldn't address.
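One shape the fancier data structure could take (my guess, not a worked
design) is one list per weight band, with the aging byte from the first
fragment supplying the weight:

    #include <array>
    #include <bit>       // std::popcount, C++20
    #include <cstdint>
    #include <list>

    struct Object {
        uint8_t age = 0;  // one bit per recent cycle with a reference
    };

    constexpr int BANDS = 8;

    // bands[0] holds the coolest objects, bands[BANDS - 1] the
    // hottest; eviction drains bands[0] first.
    std::array<std::list<Object*>, BANDS> bands;

    void place(Object* obj) {
        // The number of set bits is the number of recent cycles in
        // which the object was referenced -- its weight.
        int weight = std::popcount(static_cast<unsigned>(obj->age));
        if (weight >= BANDS)
            weight = BANDS - 1;
        bands[weight].push_front(obj);
    }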
A cycle manager, incidentally, could also allow a BDB cache hash table
to be searched without a lock, really speeding things up.
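The trick is essentially epoch-based reclamation (my sketch below, not
Firebird code): readers walk the hash chains with no lock; writers
unlink nodes but park them on a per-cycle retirement list, and the
cycle manager frees a batch only once every reader that could still
hold a pointer into it has had a full cycle to drain out.

    #include <atomic>
    #include <vector>

    struct Node {
        std::atomic<Node*> next { nullptr };
        unsigned key = 0;
    };

    std::atomic<unsigned> currentCycle { 1 };
    std::vector<Node*> retired[2];   // retirement lists, two cycles deep

    // Lock-free lookup: safe because an unlinked node remains readable
    // until the cycle manager decides no reader can still reach it.
    Node* find(std::atomic<Node*>& bucket, unsigned key) {
        for (Node* n = bucket.load(std::memory_order_acquire); n;
             n = n->next.load(std::memory_order_acquire))
            if (n->key == key)
                return n;
        return nullptr;
    }

    // Writer side: unlink the node from its chain (under the writer's
    // own synchronization), then park it instead of freeing it.
    void retire(Node* n) {
        retired[currentCycle.load(std::memory_order_relaxed) & 1].push_back(n);
    }

    // Cycle manager: bump the cycle, then free the batch parked during
    // the cycle before last.  (A real implementation would also confirm
    // that no reader is still inside that old cycle.)
    void cycleTick() {
        unsigned old = currentCycle.fetch_add(1, std::memory_order_acq_rel);
        auto& stale = retired[(old + 1) & 1];
        for (Node* n : stale)
            delete n;
        stale.clear();
    }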
On 5/6/2014 5:21 PM, Leyne, Sean wrote:
All,
I was reading details of a recent update to the Apache Hadoop engine,
and one of the changes dealt with their cache algorithm.
Hadoop had a cache which used a simple LRU algorithm. As we know, most
LRU algorithms have the problem that "hot" pages can be flushed when a
query which requires reading millions of pages is submitted -- the
Firebird equivalent of a full table scan. The reads of all those old
pages push the "hot" pages out of the cache.
Their solution was to define 2 cache levels:
* Level 1: for pages which have been accessed only a single time --
  this uses a simple index for locating the page in the list. Once
  a page has been accessed more than once, it is promoted to the
  second cache.
* Level 2: for pages which have been accessed/referenced more than
  once -- this uses an LRU.
Within the configuration, the ratio of memory allocated between the
caches can be set (default = 25%/75%, IIRC).
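For reference, the mechanics would look something like this (my sketch
with hypothetical types, not Hadoop's actual code; page numbers stand
in for real page buffers):

    #include <cstddef>
    #include <list>
    #include <unordered_map>
    #include <utility>

    class TwoLevelCache {
        size_t cap1, cap2;                   // e.g. the 25%/75% split
        std::list<unsigned> level1, level2;  // front = most recently used
        // page number -> (level, position in that level's list)
        std::unordered_map<unsigned,
            std::pair<int, std::list<unsigned>::iterator>> index;

    public:
        TwoLevelCache(size_t c1, size_t c2) : cap1(c1), cap2(c2) {}

        void touch(unsigned page) {
            auto it = index.find(page);
            if (it == index.end()) {
                // First access: level 1.  A million-page scan churns
                // only this list, never the hot pages in level 2.
                if (level1.size() >= cap1)
                    evictBack(level1);
                level1.push_front(page);
                index[page] = { 1, level1.begin() };
            } else if (it->second.first == 1) {
                // Second access: promote to level 2.
                level1.erase(it->second.second);
                if (level2.size() >= cap2)
                    evictBack(level2);
                level2.push_front(page);
                it->second = { 2, level2.begin() };
            } else {
                // Already hot: plain LRU move-to-front within level 2.
                level2.splice(level2.begin(), level2, it->second.second);
            }
        }

    private:
        void evictBack(std::list<unsigned>& lvl) {
            index.erase(lvl.back());
            lvl.pop_back();
        }
    };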
I realize that 2 caches will make single cache requests slower, but the
approach has the benefit of being more likely to keep the "hot" pages
in memory for longer, thus improving overall performance.
I also know that there has been historical discussion of changing the
engine to recognize table scans/NATURAL reads and backups and to modify
the cache operation accordingly.
But I wonder if this would be an approach that Firebird should
consider, since it seems to address the known issues while not
requiring significant modifications.
Sean