Hey guys,

I've been working on some ARC performance work targeted for the ZFS on
Linux implementation, but I think some of the patches I'm proposing
_might_ be useful in the other implementations as well.

As far as I know, the ARC code is largely the same between
implementations, although on Linux we maintain a hard limit on the
amount of cached metadata using "arc_meta_limit" and "arc_meta_used".
Thus, not all of the patches are relevant outside of ZoL, but my hunch
is that many are. To highlight, I think these might be of particular
interest:

    * 22be556 Disable aggressive arc_p growth by default
    * 5694b53 Allow "arc_p" to drop to zero or grow to "arc_c"
    * 517a0bc Disable arc_p adapt dampener by default
    * 2d1f779 Remove "arc_meta_used" from arc_adjust calculation
    * 32a96d6 Prioritize "metadata" in arc_get_data_buf
    * b3b7236 Split "data_size" into "meta" and "data"

Keep in mind, my expertise with the ARC is still limited, so if anybody
finds any of these patches to be "wrong" (for a particular workload,
say), please let me know. The full patch stack I'm proposing on Linux is
here:

    * https://github.com/zfsonlinux/zfs/pull/2110

I posted some graphs of useful arcstat parameters vs. time for each of
the 14 unique tests run. Those are in this comment:

    * https://github.com/zfsonlinux/zfs/pull/2110#issuecomment-34393733

And here's a snippet from the pull request description with a summary of
the benefits this patch stack has shown in my testing (go check out the
pull request for more info on the tests run and results gathered):

    Improve ARC hit rate with metadata heavy workloads

    This stack of patches has been empirically shown to drastically improve
    the hit rate of the ARC for certain workloads. As a result, fewer reads
    to disk are required, which is generally a good thing and can
    drastically improve performance if the workload is disk limited.

    For the impatient, I'll summarize the results of the tests performed:

        * Test 1 - Creating many empty directories. This test saw 99.9%
                   fewer reads and 12.8% more inodes created when running
                   *with* these changes.

        * Test 2 - Creating many empty files. This test saw 4% fewer reads
                   and 0% more inodes created when running *with* these
                   changes.

        * Test 3 - Creating many 4 KiB files. This test saw 96.7% fewer
                   reads and 4.9% more inodes created when running *with*
                   these changes.

        * Test 4 - Creating many 4096 KiB files. This test saw 99.4% fewer
                   reads and 0% more inodes created (but took 6.9% fewer
                   seconds to complete) when running *with* these changes.

        * Test 5 - Rsync'ing a dataset with many empty directories. This
                   test saw 36.2% fewer reads and 66.2% more inodes created
                   when running *with* these changes.

        * Test 6 - Rsync'ing a dataset with many empty files. This test saw
                   30.9% fewer reads and 0% more inodes created (but took
                   24.3% fewer seconds to complete) when running *with*
                   these changes.

        * Test 7 - Rsync'ing a dataset with many 4 KiB files. This test saw
                   30.8% fewer reads and 173.3% more inodes created when
                   running *with* these changes.

So, in the interest of collaboration (and potentially getting much
needed input from people with more ARC expertise than I have), I wanted
to give this work a broader audience.

--
Cheers, Prakash

_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
