Gokul Soundararajan <[EMAIL PROTECTED]> writes:

> Hi all,
>
> I'm a Google SoC student working on creating a CacheManager with a
> better cache replacement algorithm. After talking with my mentor
> (Oystein Grovlen), I have created a bunch of small prototypes to
> evaluate different algorithms found in literature.
>
> I have run them for a Zipf workload for 0.5 and 0.8 and posted the
> results on the Derby Wiki.
> Link: http://wiki.apache.org/db-derby/DerbyLruCacheManager
>
> The results show that the 2Q, CAR, and CART algorithms perform better
> than LRU and Clock.
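[For readers unfamiliar with the workload mentioned above: the following is a
minimal sketch of a Zipf page-request generator, not code from the prototypes
or from Derby. The class name ZipfWorkload and all parameters are illustrative;
it only assumes the standard Zipf setup, where page i is requested with
probability proportional to 1 / i^s.]

import java.util.Random;

// Toy Zipf page-request generator: page i (1-based) is drawn with
// probability proportional to 1 / i^s, where s is the Zipf parameter
// (0.5 or 0.8 in the experiments referenced above).
public class ZipfWorkload {
    private final double[] cdf;              // cumulative distribution over page ids
    private final Random rnd = new Random();

    public ZipfWorkload(int numPages, double s) {
        cdf = new double[numPages];
        double sum = 0.0;
        for (int i = 0; i < numPages; i++) {
            sum += 1.0 / Math.pow(i + 1, s);
            cdf[i] = sum;
        }
        for (int i = 0; i < numPages; i++) {
            cdf[i] /= sum;                   // normalize so the last entry is 1.0
        }
    }

    // Draw the next page id (0-based) by inverting the CDF with binary search.
    public int nextPage() {
        double u = rnd.nextDouble();
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] < u) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}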
Thanks! Interesting graphs!

I'm not sure I understand why the hit rate decreases when the cache size
increases from 100 to 500 and from 1000 to 5000 with Zipf 0.8. Do you
have an explanation for that?

> I would like to get some feedback on my results to help me decide what
> I should implement in Derby.

From your graphs, CART looks promising. And it's a relative of Clock,
which we currently use! :)

What you could do in order to discriminate more between the different
algorithms is to add a couple of scans to your workload. Since the Zipf
workload just selects pages at random (though with a different
probability for each page), the simple algorithms (like LRU and Clock)
will perform reasonably well. However, these simple algorithms normally
don't handle scans very well, and scans are very common in a database
(for instance, SELECT * FROM table). If a large table is scanned, Clock
and LRU will replace the entire cache with data from the scan, whereas
more intelligent algorithms like 2Q and CART won't (a toy simulation of
this effect follows below). With a mix of Zipf and scans, we might get
more input as to which algorithm we should go for.

But, I guess, you can always tune the workload to get the results you
want... ;)

--
Knut Anders
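[A minimal sketch of the scan-pollution effect described above, not code from
Derby or the prototypes. It reuses the ZipfWorkload sketch from earlier in this
thread; the class name LruScanDemo, the page counts, and the cache size are all
illustrative. It builds a tiny LRU cache from java.util.LinkedHashMap in
access order, then shows the hit rate dropping right after a cold sequential
scan larger than the cache.]

import java.util.LinkedHashMap;
import java.util.Map;

public class LruScanDemo {
    public static void main(String[] args) {
        final int cacheSize = 1000;
        // LinkedHashMap in access order plus removeEldestEntry is a tiny LRU cache.
        Map<Integer, Boolean> lru =
            new LinkedHashMap<Integer, Boolean>(cacheSize, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> e) {
                    return size() > cacheSize;
                }
            };

        ZipfWorkload zipf = new ZipfWorkload(10000, 0.8);   // sketch from above

        // Warm the cache with Zipf traffic and measure the steady-state hit rate.
        for (int i = 0; i < 100000; i++) lru.put(zipf.nextPage(), true);
        System.out.println("hit rate before scan: " + hitRate(lru, zipf, 2000));

        // A full "table scan" over 5000 cold pages, larger than the cache:
        // under LRU it evicts every hot Zipf page.
        for (int page = 20000; page < 25000; page++) lru.put(page, true);

        // The drop is transient (hot pages fault back in), which is why the
        // measurement window is kept short.
        System.out.println("hit rate after scan:  " + hitRate(lru, zipf, 2000));
    }

    // Replay n Zipf requests against the cache and count hits; each miss is
    // faulted in afterwards, as a real buffer manager would do.
    static double hitRate(Map<Integer, Boolean> cache, ZipfWorkload w, int n) {
        int hits = 0;
        for (int i = 0; i < n; i++) {
            int p = w.nextPage();
            if (cache.containsKey(p)) hits++;
            cache.put(p, true);
        }
        return (double) hits / n;
    }
}

A scan-resistant policy such as 2Q or CART would admit the scanned pages
only into a probationary area, so the hot Zipf pages would survive the
scan and the "after" hit rate would stay close to the "before" one.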
