Hi Prakash,

On Feb 7, 2014, at 10:41 AM, Prakash Surya <[email protected]> wrote:
> Hey guys,
>
> I've been working on some ARC performance work targeted for the ZFS on
> Linux implementation, but I think some of the patches I'm proposing
> _might_ be useful in the other implementations as well.
>
> As far as I know, the ARC code is largely the same between
> implementations.

NB: there are several different implementations that use different
metadata management approaches.

> Although, on Linux we try and maintain a hard limit on
> metadata using "arc_meta_limit" and "arc_meta_used". Thus, not all of
> the patches are relevant outside of ZoL, but my hunch is many definitely
> are.

Can you explain the reasoning here? Historically, we've tried to avoid
absolute limits, because they must be managed, and increasing management
complexity is a bad idea.

> To highlight, I think these might be of particular interest:
>
> * 22be556 Disable aggressive arc_p growth by default

MRU (p) growth is the result of demand, yes?

> * 5694b53 Allow "arc_p" to drop to zero or grow to "arc_c"

Zero p means zero demand? Also, can you explain the reasoning for not
wanting anything in the MFU cache? I suppose if you totally disable the
MFU cache, then you'll get the behaviour of most other file system
caches, but that isn't a good thing.

> * 517a0bc Disable arc_p adapt dampener by default
> * 2d1f779 Remove "arc_meta_used" from arc_adjust calculation
> * 32a96d6 Prioritize "metadata" in arc_get_data_buf
> * b3b7236 Split "data_size" into "meta" and "data"
>
> Keep in mind, my expertise with the ARC is still limited, so if anybody
> finds any of these patches "wrong" (for a particular workload, maybe),
> please let me know. The full patch stack I'm proposing on Linux is here:
>
> * https://github.com/zfsonlinux/zfs/pull/2110
>
> I posted some graphs of useful arcstat parameters vs. time for each of
> the 14 unique tests run.
> Those are in this comment:
>
> * https://github.com/zfsonlinux/zfs/pull/2110#issuecomment-34393733
>
> And here's a snippet from the pull request description with a summary of
> the benefits this patch stack has shown in my testing (go check out the
> pull request for more info on the tests run and results gathered):
>
> Improve ARC hit rate with metadata heavy workloads
>
> This stack of patches has been empirically shown to drastically improve
> the hit rate of the ARC for certain workloads. As a result, fewer reads
> to disk are required, which is generally a good thing and can
> drastically improve performance if the workload is disk limited.
>
> For the impatient, I'll summarize the results of the tests performed:
>
> * Test 1 - Creating many empty directories. This test saw 99.9%
>            fewer reads and 12.8% more inodes created when running
>            *with* these changes.
>
> * Test 2 - Creating many empty files. This test saw 4% fewer reads
>            and 0% more inodes created when running *with* these
>            changes.
>
> * Test 3 - Creating many 4 KiB files. This test saw 96.7% fewer
>            reads and 4.9% more inodes created when running *with*
>            these changes.
>
> * Test 4 - Creating many 4096 KiB files. This test saw 99.4% fewer
>            reads and 0% more inodes created (but took 6.9% fewer
>            seconds to complete) when running *with* these changes.
>
> * Test 5 - Rsync'ing a dataset with many empty directories. This
>            test saw 36.2% fewer reads and 66.2% more inodes created
>            when running *with* these changes.
>
> * Test 6 - Rsync'ing a dataset with many empty files. This test saw
>            30.9% fewer reads and 0% more inodes created (but took
>            24.3% fewer seconds to complete) when running *with*
>            these changes.
>
> * Test 7 - Rsync'ing a dataset with many 4 KiB files. This test saw
>            30.8% fewer reads and 173.3% more inodes created when
>            running *with* these changes.

AIUI, the tests will work better with a large MFU metadata cache.
Yet the proposed changes can also result in small, MRU-only metadata
caches -- which would be disastrous to most (all?) applications. I'd
love to learn more about where you want to go with this.
 -- richard

> So, in the interest of collaboration (and potentially getting much
> needed input from people with more ARC expertise than I have), I wanted
> to give this work a broader audience.
>
> --
> Cheers, Prakash
>
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
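P.S. For readers less familiar with the arc_p mechanics being debated above, here is a toy sketch of how the classic ARC algorithm adapts the MRU target size p (between 0 and the total cache target c) on ghost-list hits. This is a simplified illustration under my own naming, not the actual arc.c code; the real ZoL logic (the "adapt dampener" and aggressive growth that patches 22be556 and 517a0bc disable) differs in detail.

```python
# Toy model of classic ARC adaptation of the MRU target "p".
# Hypothetical names; a simplified sketch, NOT the real arc.c logic.

def adapt_p(p, c, b1_len, b2_len, hit_in):
    """Return the new MRU target after a ghost-list hit.

    p      -- current MRU target size (analogous to arc_p)
    c      -- total cache target size (analogous to arc_c)
    b1_len -- length of the MRU ghost list (recently evicted from MRU)
    b2_len -- length of the MFU ghost list (recently evicted from MFU)
    hit_in -- which ghost list was hit: "b1" or "b2"
    """
    if hit_in == "b1":
        # A hit in the MRU ghost list means recency is paying off:
        # grow the MRU side, faster when B2 dwarfs B1.
        delta = max(b2_len // max(b1_len, 1), 1)
        return min(p + delta, c)
    else:
        # A hit in the MFU ghost list means frequency is paying off:
        # shrink the MRU side, ceding room to MFU.
        delta = max(b1_len // max(b2_len, 1), 1)
        return max(p - delta, 0)
```

Note the clamps: p moves within [0, c], and c - p is the MFU target, so p == c leaves no room for MFU -- which is the behaviour patch 5694b53 ("Allow arc_p to drop to zero or grow to arc_c") permits at both extremes, and what my question about disabling the MFU cache refers to.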
