Hi Prakash,

On Feb 7, 2014, at 10:41 AM, Prakash Surya <[email protected]> wrote:
> Hey guys,
>
> I've been working on some ARC performance work targeted for the ZFS on
> Linux implementation, but I think some of the patches I'm proposing
> _might_ be useful in the other implementations as well.
>
> As far as I know, the ARC code is largely the same between
> implementations.

NB: there are several different implementations that use different
metadata management approaches.

> Although, on Linux we try and maintain a hard limit on
> metadata using "arc_meta_limit" and "arc_meta_used". Thus, not all of
> the patches are relevant outside of ZoL, but my hunch is many definitely
> are.

Can you explain the reasoning here? Historically, we've tried to avoid
absolute limits, because they must be managed, and increasing management
complexity is a bad idea.

> To highlight, I think these might be of particular interest:
>
> * 22be556 Disable aggressive arc_p growth by default

MRU (p) growth is the result of demand, yes?

> * 5694b53 Allow "arc_p" to drop to zero or grow to "arc_c"

Zero p means zero demand? Also, can you explain the reasoning for not
wanting anything in the MFU cache? I suppose if you totally disable the
MFU cache, then you'll get the behaviour of most other file system
caches, but that isn't a good thing.

> * 517a0bc Disable arc_p adapt dampener by default
> * 2d1f779 Remove "arc_meta_used" from arc_adjust calculation
> * 32a96d6 Prioritize "metadata" in arc_get_data_buf
> * b3b7236 Split "data_size" into "meta" and "data"
>
> Keep in mind, my expertise with the ARC is still limited, so if anybody
> finds any of these patches "wrong" (for a particular workload, maybe),
> please let me know. The full patch stack I'm proposing on Linux is here:
>
> * https://github.com/zfsonlinux/zfs/pull/2110
>
> I posted some graphs of useful arcstat parameters vs. time for each of
> the 14 unique tests run.
> Those are in this comment:
>
> * https://github.com/zfsonlinux/zfs/pull/2110#issuecomment-34393733
>
> And here's a snippet from the pull request description with a summary of
> the benefits this patch stack has shown in my testing (go check out the
> pull request for more info on the tests run and results gathered):
>
> Improve ARC hit rate with metadata heavy workloads
>
> This stack of patches has been empirically shown to drastically improve
> the hit rate of the ARC for certain workloads. As a result, fewer reads
> to disk are required, which is generally a good thing and can
> drastically improve performance if the workload is disk limited.
>
> For the impatient, I'll summarize the results of the tests performed:
>
> * Test 1 - Creating many empty directories. This test saw 99.9%
>            fewer reads and 12.8% more inodes created when running
>            *with* these changes.
>
> * Test 2 - Creating many empty files. This test saw 4% fewer reads
>            and 0% more inodes created when running *with* these
>            changes.
>
> * Test 3 - Creating many 4 KiB files. This test saw 96.7% fewer
>            reads and 4.9% more inodes created when running *with*
>            these changes.
>
> * Test 4 - Creating many 4096 KiB files. This test saw 99.4% fewer
>            reads and 0% more inodes created (but took 6.9% fewer
>            seconds to complete) when running *with* these changes.
>
> * Test 5 - Rsync'ing a dataset with many empty directories. This
>            test saw 36.2% fewer reads and 66.2% more inodes created
>            when running *with* these changes.
>
> * Test 6 - Rsync'ing a dataset with many empty files. This test saw
>            30.9% fewer reads and 0% more inodes created (but took
>            24.3% fewer seconds to complete) when running *with*
>            these changes.
>
> * Test 7 - Rsync'ing a dataset with many 4 KiB files. This test saw
>            30.8% fewer reads and 173.3% more inodes created when
>            running *with* these changes.

AIUI, the tests will work better with a large MFU metadata cache.
Yet the proposed changes can also result in small, MRU-only metadata
caches -- which would be disastrous to most (all?) applications. I'd
love to learn more about where you want to go with this.
 -- richard

> So, in the interest of collaboration (and potentially getting much
> needed input from people with more ARC expertise than I have), I wanted
> to give this work a broader audience.
>
> --
> Cheers, Prakash
>
> _______________________________________________
> developer mailing list
> [email protected]
> http://lists.open-zfs.org/mailman/listinfo/developer
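P.S. For readers less familiar with the arc_p mechanics being debated above, here is a toy sketch of how the classic ARC algorithm adapts the MRU target size p (between 0 and the total cache target c) on ghost-list hits. This is a simplified illustration under my own naming, not the actual arc.c code; the real ZoL logic (the "adapt dampener" and aggressive growth that patches 22be556 and 517a0bc disable) differs in detail.

```python
# Toy model of classic ARC adaptation of the MRU target "p".
# Hypothetical names; a simplified sketch, NOT the real arc.c logic.

def adapt_p(p, c, b1_len, b2_len, hit_in):
    """Return the new MRU target after a ghost-list hit.

    p      -- current MRU target size (analogous to arc_p)
    c      -- total cache target size (analogous to arc_c)
    b1_len -- length of the MRU ghost list (recently evicted from MRU)
    b2_len -- length of the MFU ghost list (recently evicted from MFU)
    hit_in -- which ghost list was hit: "b1" or "b2"
    """
    if hit_in == "b1":
        # A hit in the MRU ghost list means recency is paying off:
        # grow the MRU side, faster when B2 dwarfs B1.
        delta = max(b2_len // max(b1_len, 1), 1)
        return min(p + delta, c)
    else:
        # A hit in the MFU ghost list means frequency is paying off:
        # shrink the MRU side, ceding room to MFU.
        delta = max(b1_len // max(b2_len, 1), 1)
        return max(p - delta, 0)
```

Note the clamps: p moves within [0, c], and c - p is the MFU target, so p == c leaves no room for MFU -- which is the behaviour patch 5694b53 ("Allow arc_p to drop to zero or grow to arc_c") permits at both extremes, and what my question about disabling the MFU cache refers to.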
