Normally file data is only cached via the file vnode, which means the cached data is blown away when the vnode gets cycled out of the vnode cache. With kern.maxvnodes sitting around 100,000 on 32-bit systems and 400,000 on 64-bit systems, any filesystem holding more files than that will cause vnode recycling to occur. Nearly all filesystems these days exceed these limits, particularly on 32-bit systems. And on 64-bit systems files are often not large enough to fill available memory before the vnode limit is hit, so the data gets thrown away despite there being plenty of free RAM.

It is now possible to bypass these limitations in DragonFly master by enabling both the HAMMER double_buffer feature (vfs.hammer.double_buffer=1) AND the swapcache data caching feature (vm.swapcache.data_enable=1). See 'man swapcache' for additional information on swapcache. When both features are enabled together, swapcache will cache file data via HAMMER's block device instead of via individual file vnodes, making the swapcache'd data immune to vnode recycling. Swapcache is thus able to cache the data for potentially millions of files, up to 75% of available swap (normally configured up to 32G on 32-bit systems and up to 512G on 64-bit systems).
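For concreteness, here is a minimal sketch of what turning this on might look like from the shell. The double_buffer and data_enable sysctls are the ones described above; vfs.numvnodes, vm.swapcache.read_enable, the swap device name, and the swapinfo check are illustrative assumptions based on standard DragonFly tooling, so consult 'man swapcache' before copying any of it:

    # Check the vnode limit and (assumed sysctl name) how many
    # vnodes are currently in use.
    sysctl kern.maxvnodes vfs.numvnodes

    # Enable HAMMER double buffering and swapcache data caching,
    # the two knobs described above.
    sysctl vfs.hammer.double_buffer=1
    sysctl vm.swapcache.data_enable=1

    # The swapcache man page also describes a separate read-side
    # knob, which you will typically want enabled as well.
    sysctl vm.swapcache.read_enable=1

    # Point swap at the SSD (device name here is just an example).
    swapon /dev/da1s1b

    # Verify swap is configured and watch swapcache fill over time.
    swapinfo

To make the settings persistent you would put the same sysctls in /etc/sysctl.conf in the usual way.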
Now add the fact that SATA-III is now widely available on motherboards and SATA-III SSDs are now in mass production. Intel's 510 series, OCZ's Vertex 3, and Crucial's C300 and M4 series are capable of delivering 300-500 MBytes/sec reading and 200-400 MBytes/sec writing from a single device. Crucial's C300 series is very cost effective, with 64GB at SATA-III speeds for $160. Compare this to the measly 2-5 MBytes/sec a hard drive can do in a random seek/read environment. We're talking 100x the performance already with just a single SSD swap device. With swapcache this means being able to shrink the cost and the size of what we might consider to be a 'server' by a factor of three or more.

The only downside to the new feature is that data is double-buffered in RAM. That is, file data is cached via the block device AND also via the file vnode, and there is really no way to get around this other than to expire one of the copies of the cached data more quickly (which we try to do). I still consider the feature a bit experimental due to these inefficiencies. We are definitely on the right track, and regardless of the memory inefficiency the HD accesses go away for real when the swapcache SSD can take the load instead. On one of our older servers I can now grep through 950,000 files (~15GB worth of file data) at ~2000-4000 files per second, pulling 40-50 MBytes/sec from the SSD with *zero* activity on the HD. That is a big deal that, prior to the advent of SSDs, only a big whopping RAID system or a ton of RAM could compete with... all from a little $700 box with an older $100 SSD in it.

-Matt

Matthew Dillon
<dil...@backplane.com>