Derby might have more than one cache manager pretty soon, and as I've been reading the source code I've pondered how new cache managers should be wired into Derby. If we don't change any existing interfaces, only one cache manager will be used, and it will be chosen at Derby boot time.
I'm thinking it could be useful to give different cache users (i.e. the statement cache vs the page cache) some influence on how their data should be handled by the cache, perhaps choosing between different cache managers in the call to the cache factory methods. We'd need to modify the CacheFactory interface for that.

I've been looking at a number of modern buffer management algorithms in the last few days including LRU-k (LRU-2) [1], 2Q [2] and ARC [3]. While they all claim a performance advantage over good old LRU, especially when handling scans, they all seem to be geared towards storage buffering, and they jump through some hoops in order to handle sequential scans and very random workloads well. ARC in particular seems to be geared for storage systems with higher-level caches filtering load for them, not databases.

In the end, this makes me wonder whether it would make sense to use plain and simple LRU (or perhaps the good old clock) for generic caches such as the statement cache and then possibly add one of the new & fancy algorithms for page buffering?

Thanks for any input on this,
[EMAIL PROTECTED]

[1] http://www.cs.cmu.edu/~christos/courses/721-resources/p297-o_neil.pdf
[2] http://www.vldb.org/conf/1994/P439.PDF
[3] http://almaden.ibm.com/cs/people/dmodha/arcfast.pdf

--
Anders Morken
Trying to bend his head around the synchronization details in the clock cache manager =)
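
P.S. To make the factory idea a bit more concrete, here is a rough Java sketch of what I have in mind. Every name in it (ReplacementPolicy, SimpleCache, newCache) is made up for illustration and is not Derby's actual CacheFactory or CacheManager API. The "scan resistant" branch is only a placeholder that falls back to LRU, since I haven't written any of the fancier algorithms. The LRU itself leans on java.util.LinkedHashMap in access-order mode.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CachePolicySketch {

    /** Hypothetical hint a cache user could pass to the factory. */
    public enum ReplacementPolicy { LRU, SCAN_RESISTANT }

    /** Hypothetical, heavily simplified stand-in for a cache manager. */
    public interface SimpleCache<K, V> {
        V get(K key);
        void put(K key, V value);
        int size();
    }

    /** Plain LRU built on LinkedHashMap's access-order mode. */
    static final class LruCache<K, V> implements SimpleCache<K, V> {
        private final Map<K, V> map;

        LruCache(final int maxSize) {
            // accessOrder = true: iteration order runs from least to most
            // recently accessed, so the "eldest" entry is the LRU victim.
            map = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxSize;
                }
            };
        }

        // Coarse locking keeps the sketch simple; a real cache manager
        // would want something finer-grained.
        public synchronized V get(K key) { return map.get(key); }
        public synchronized void put(K key, V value) { map.put(key, value); }
        public synchronized int size() { return map.size(); }
    }

    /** Hypothetical factory method: the caller picks the policy per cache. */
    public static <K, V> SimpleCache<K, V> newCache(
            String name, int maxSize, ReplacementPolicy policy) {
        switch (policy) {
            case LRU:
                return new LruCache<K, V>(maxSize);
            case SCAN_RESISTANT:
                // Placeholder: LRU-2, 2Q or ARC would be plugged in here.
                return new LruCache<K, V>(maxSize);
            default:
                throw new IllegalArgumentException("Unknown policy for " + name);
        }
    }

    public static void main(String[] args) {
        SimpleCache<String, String> statements =
                newCache("StatementCache", 2, ReplacementPolicy.LRU);
        statements.put("a", "plan A");
        statements.put("b", "plan B");
        statements.get("a");            // touch "a" so "b" becomes the LRU entry
        statements.put("c", "plan C");  // exceeds maxSize, evicts "b"
        System.out.println(statements.get("b")); // prints "null"
        System.out.println(statements.get("a")); // prints "plan A"
    }
}
```

The point of the sketch is only that the policy choice moves from boot time into the factory call, so the statement cache and the page cache could each ask for what suits them.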
