Hey guys,
I've been working on some ARC performance improvements targeted at the
ZFS on Linux implementation, but I think some of the patches I'm
proposing _might_ be useful in the other implementations as well.
As far as I know, the ARC code is largely the same between
implementations, although on Linux we try to maintain a hard limit on
metadata usage via "arc_meta_limit" and "arc_meta_used". Thus, not all
of the patches are relevant outside of ZoL, but my hunch is that many
definitely are. To highlight, I think these might be of particular
interest:
* 22be556 Disable aggressive arc_p growth by default
* 5694b53 Allow "arc_p" to drop to zero or grow to "arc_c"
* 517a0bc Disable arc_p adapt dampener by default
* 2d1f779 Remove "arc_meta_used" from arc_adjust calculation
* 32a96d6 Prioritize "metadata" in arc_get_data_buf
* b3b7236 Split "data_size" into "meta" and "data"
Keep in mind, my expertise with the ARC is still limited, so if anybody
finds any of these patches to be "wrong" (perhaps for a particular
workload), please let me know. The full patch stack I'm proposing on
Linux is here:
* https://github.com/zfsonlinux/zfs/pull/2110
I posted some graphs of useful arcstat parameters vs. time for each of
the 14 unique tests run. Those are in this comment:
* https://github.com/zfsonlinux/zfs/pull/2110#issuecomment-34393733
And here's a snippet from the pull request description with a summary of
the benefits this patch stack has shown in my testing (go check out the
pull request for more info on the tests run and results gathered):
Improve ARC hit rate with metadata heavy workloads
This stack of patches has been empirically shown to drastically improve
the ARC hit rate for certain workloads. As a result, fewer reads go to
disk, which is generally a good thing and can substantially improve
performance when the workload is disk-limited.
For the impatient, I'll summarize the results of the tests performed:
* Test 1 - Creating many empty directories. This test saw 99.9%
fewer reads and 12.8% more inodes created when running
*with* these changes.
* Test 2 - Creating many empty files. This test saw 4% fewer reads
and 0% more inodes created when running *with* these
changes.
* Test 3 - Creating many 4 KiB files. This test saw 96.7% fewer
reads and 4.9% more inodes created when running *with*
these changes.
* Test 4 - Creating many 4096 KiB files. This test saw 99.4% fewer
           reads and 0% more inodes created (but completed in 6.9%
           less time) when running *with* these changes.
* Test 5 - Rsync'ing a dataset with many empty directories. This
test saw 36.2% fewer reads and 66.2% more inodes created
when running *with* these changes.
* Test 6 - Rsync'ing a dataset with many empty files. This test saw
           30.9% fewer reads and 0% more inodes created (but
           completed in 24.3% less time) when running *with* these
           changes.
* Test 7 - Rsync'ing a dataset with many 4 KiB files. This test saw
30.8% fewer reads and 173.3% more inodes created when
running *with* these changes.
So, in the interest of collaboration (and potentially getting
much-needed input from people with more ARC expertise than I have), I
wanted to give this work a broader audience.
--
Cheers, Prakash
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer