On Wed, Mar 24, 2021 at 09:44:21AM -0400, Josef Bacik wrote:
> Looking at perf data for a fio workload I noticed that we were spending
> a pretty large chunk of time (around 5%) doing percpu_counter_sum() in
> need_preemptive_reclaim.  This is silly, as we only want to know if we
> have more ordered than delalloc to see if we should be counting the
> delayed items in our threshold calculation.  Change this to
> percpu_read_positive() to avoid the overhead.
> 
> I ran this through fsperf to validate the changes, obviously the latency
> numbers in dbench and fio are quite jittery, so take them as you wish,
> but overall the improvements on throughput, iops, and bw are all
> positive.  Each test was run two times, the given value is the average
> of both runs for their respective column.
> 
> btrfs ssd normal test results
> 
> bufferedrandwrite16g results
>      metric         baseline   current          diff
> ==========================================================
> write_io_kbytes     16777216   16777216     0.00%
> read_clat_ns_p99           0          0     0.00%
> write_bw_bytes      1.04e+08   1.05e+08     1.12%
> read_iops                  0          0     0.00%
> write_clat_ns_p50      13888      11840   -14.75%
> read_io_kbytes             0          0     0.00%
> read_io_bytes              0          0     0.00%
> write_clat_ns_p99      35008      29312   -16.27%
> read_bw_bytes              0          0     0.00%
> elapsed                  170        167    -1.76%
> write_lat_ns_min     4221.50    3762.50   -10.87%
> sys_cpu                39.65      35.37   -10.79%
> write_lat_ns_max    2.67e+10   2.50e+10    -6.63%
> read_lat_ns_min            0          0     0.00%
> write_iops          25270.10   25553.43     1.12%
> read_lat_ns_max            0          0     0.00%
> read_clat_ns_p50           0          0     0.00%
> 
> dbench60 results
>   metric     baseline   current         diff
> ==================================================
> qpathinfo       11.12     12.73    14.52%
> throughput     416.09    445.66     7.11%
> flush         3485.63   1887.55   -45.85%
> qfileinfo        0.70      1.92   173.86%
> ntcreatex      992.60    695.76   -29.91%
> qfsinfo          2.43      3.71    52.48%
> close            1.67      3.14    88.09%
> sfileinfo       66.54    105.20    58.10%
> rename         809.23    619.59   -23.43%
> find            16.88     15.46    -8.41%
> unlink         820.54    670.86   -18.24%
> writex        3375.20   2637.91   -21.84%
> deltree        386.33    449.98    16.48%
> readx            3.43      3.41    -0.60%
> mkdir            0.05      0.03   -38.46%
> lockx            0.26      0.26    -0.76%
> unlockx          0.81      0.32   -60.33%
> 
> dio4kbs16threads results
>      metric          baseline       current           diff
> ================================================================
> write_io_kbytes         5249676       3357150   -36.05%
> read_clat_ns_p99              0             0     0.00%
> write_bw_bytes      89583501.50   57291192.50   -36.05%
> read_iops                     0             0     0.00%
> write_clat_ns_p50        242688        263680     8.65%
> read_io_kbytes                0             0     0.00%
> read_io_bytes                 0             0     0.00%
> write_clat_ns_p99      15826944      36732928   132.09%
> read_bw_bytes                 0             0     0.00%
> elapsed                      61            61     0.00%
> write_lat_ns_min          42704         42095    -1.43%
> sys_cpu                    5.27          3.45   -34.52%
> write_lat_ns_max       7.43e+08      9.27e+08    24.71%
> read_lat_ns_min               0             0     0.00%
> write_iops             21870.97      13987.11   -36.05%
> read_lat_ns_max               0             0     0.00%
> read_clat_ns_p50              0             0     0.00%
> 
> randwrite2xram results
>      metric          baseline       current           diff
> ================================================================
> write_io_kbytes        24831972      28876262    16.29%
> read_clat_ns_p99              0             0     0.00%
> write_bw_bytes      83745273.50   92182192.50    10.07%
> read_iops                     0             0     0.00%
> write_clat_ns_p50         13952         11648   -16.51%
> read_io_kbytes                0             0     0.00%
> read_io_bytes                 0             0     0.00%
> write_clat_ns_p99         50176         52992     5.61%
> read_bw_bytes                 0             0     0.00%
> elapsed                     314           332     5.73%
> write_lat_ns_min        5920.50          5127   -13.40%
> sys_cpu                    7.82          7.35    -6.07%
> write_lat_ns_max       5.27e+10      3.88e+10   -26.44%
> read_lat_ns_min               0             0     0.00%
> write_iops             20445.62      22505.42    10.07%
> read_lat_ns_max               0             0     0.00%
> read_clat_ns_p50              0             0     0.00%
> 
> untarfirefox results
> metric    baseline   current        diff
> ==============================================
> elapsed      47.41     47.40   -0.03%
> 
> Signed-off-by: Josef Bacik <jo...@toxicpanda.com>

Added to misc-next, thanks.

Reply via email to