On Thu, Feb 05, 2015 at 03:17:19PM +0100, Kevin Wolf wrote:
> > By never allowing the hit count to go down to zero, we make sure
> > that all unused entries are chosen first before a valid one is
> > discarded.
>
> But does this actually improve a lot? cache_hits is only 0 for the
> first few accesses and it never becomes 0 again after that. The
> result might be that soon all the entries have cache_hits == 1, and
> we get the same problem as you're describing - only the first few
> entries will be reused.
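(For context, this is roughly the kind of eviction loop we are
discussing. It is only a simplified sketch with made-up names, not the
actual qcow2-cache.c code: the victim is the unreferenced entry with
the lowest hit count, and the change keeps the counter of an entry that
has already been used from being aged back down to 0, so never-used
entries are always picked first.)

#include <limits.h>
#include <stdbool.h>

#define CACHE_SIZE 32

typedef struct CacheEntry {
    bool     in_use;      /* currently referenced, cannot be evicted */
    unsigned cache_hits;  /* aged hit counter used to pick a victim */
} CacheEntry;

static CacheEntry cache[CACHE_SIZE];

static int find_entry_to_replace(void)
{
    unsigned min_hits = UINT_MAX;
    int victim = -1;

    for (int i = 0; i < CACHE_SIZE; i++) {
        if (cache[i].in_use) {
            continue;
        }
        if (cache[i].cache_hits < min_hits) {
            min_hits = cache[i].cache_hits;
            victim = i;
        }
        /* Age the counter, but never let a used entry drop back to 0 */
        if (cache[i].cache_hits > 1) {
            cache[i].cache_hits /= 2;
        }
    }
    return victim;
}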
I wanted to measure the actual impact of this, so I modified the current
code to implement a quick and dirty LRU algorithm and collected some
numbers.

First of all, it's true that if the cache is not big enough and there are
a lot of cache misses, the entries being reused are always the first
ones, whereas with LRU all of them are evicted at some point (I actually
measured this and the results are very clear). However, this doesn't
seem to translate into actual performance improvements in my tests.

I compared all three algorithms (the original one, my patched version
and my LRU version) using a 20GB disk image and different L2 cache sizes
(the size in parentheses in the tables below is the amount of disk
covered by that many cache entries). I tested using fio doing random
reads for one minute. Here are the results:

|---------------+-----------------------------|
|               |        Average IOPS         |
|---------------+----------+---------+--------|
| Cache entries | Original | Patched |   LRU  |
|---------------+----------+---------+--------|
| 16 (8GB)      |    4.0 K |   4.9 K |  5.1 K |
| 24 (12GB)     |    4.1 K |   7.2 K |  7.3 K |
| 32 (16GB)     |    4.1 K |  12.8 K | 12.7 K |
| 40 (20GB)     |    4.0 K |  64.0 K | 63.6 K |
|---------------+----------+---------+--------|

And these are the results with a 40GB disk image (I skipped the original
code in this case since its limits are clear):

|---------------+------------------|
|               |   Average IOPS   |
|---------------+---------+--------|
| Cache entries | Patched |   LRU  |
|---------------+---------+--------|
| 16 (8GB)      |   3.7 K |  3.7 K |
| 40 (20GB)     |   5.4 K |  5.8 K |
| 72 (36GB)     |  20.8 K | 21.4 K |
| 78 (39GB)     |  42.6 K | 42.4 K |
| 80 (40GB)     |  64.2 K | 64.1 K |
|---------------+---------+--------|

So it seems that in this scenario, if the cache is not big enough for the
whole disk, the eviction algorithm doesn't really make a difference. This
makes sense: since the reads are random, the previous usage of a cache
entry has no influence on the probability that it will be used again.

Under a different workload the results might differ, but I don't have a
use case with data to prove that. That's why I chose this simple change
to the current algorithm rather than attempting a complete rewrite.

Berto
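P.S. For completeness, here is a sketch of one way the "quick and dirty"
LRU variant could look (made-up names, not necessarily the exact code I
ran): every hit stamps the entry with a global counter and the eviction
loop picks the unreferenced entry with the oldest stamp; never-used
entries keep stamp 0, so they still go first.

#include <stdbool.h>
#include <stdint.h>

#define CACHE_SIZE 32

typedef struct LruEntry {
    bool     in_use;     /* currently referenced, cannot be evicted */
    uint64_t last_used;  /* global counter at the last access, 0 = never */
} LruEntry;

static LruEntry lru_cache[CACHE_SIZE];
static uint64_t lru_counter;

/* Called on every cache hit to mark the entry as most recently used */
static void lru_touch(int i)
{
    lru_cache[i].last_used = ++lru_counter;
}

/* Evict the least recently used entry; never-used ones (stamp 0) go first */
static int lru_find_entry_to_replace(void)
{
    uint64_t oldest = UINT64_MAX;
    int victim = -1;

    for (int i = 0; i < CACHE_SIZE; i++) {
        if (lru_cache[i].in_use) {
            continue;
        }
        if (lru_cache[i].last_used < oldest) {
            oldest = lru_cache[i].last_used;
            victim = i;
        }
    }
    return victim;
}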