On Wed, Jun 21, 2017 at 4:09 AM, Paolo Abeni <pab...@redhat.com> wrote:
> The 'rep' prefix suffers from a significant "setup cost"; as a result,
> string copies with unrolled loops are faster than even the
> optimized string copy using the 'rep' variant, for short strings.
>
> This change updates copy_user_generic() to use the unrolled
> version for short strings. The threshold length for short
> strings - 64 bytes - was selected by empirical measurement as the
> largest value that still ensures a measurable gain.
>
> A micro-benchmark of __copy_from_user() with different lengths shows
> the following:
>
> string len      vanilla         patched         delta
> bytes           ticks           ticks           tick(%)
>
> 0               58              26              32(55%)
> 1               49              29              20(40%)
> 2               49              31              18(36%)
> 3               49              32              17(34%)
> 4               50              34              16(32%)
> 5               49              35              14(28%)
> 6               49              36              13(26%)
> 7               49              38              11(22%)
> 8               50              31              19(38%)
> 9               51              33              18(35%)
> 10              52              36              16(30%)
> 11              52              37              15(28%)
> 12              52              38              14(26%)
> 13              52              40              12(23%)
> 14              52              41              11(21%)
> 15              52              42              10(19%)
> 16              51              34              17(33%)
> 17              51              35              16(31%)
> 18              52              37              15(28%)
> 19              51              38              13(25%)
> 20              52              39              13(25%)
> 21              52              40              12(23%)
> 22              51              42              9(17%)
> 23              51              46              5(9%)
> 24              52              35              17(32%)
> 25              52              37              15(28%)
> 26              52              38              14(26%)
> 27              52              39              13(25%)
> 28              52              40              12(23%)
> 29              53              42              11(20%)
> 30              52              43              9(17%)
> 31              52              44              8(15%)
> 32              51              36              15(29%)
> 33              51              38              13(25%)
> 34              51              39              12(23%)
> 35              51              41              10(19%)
> 36              52              41              11(21%)
> 37              52              43              9(17%)
> 38              51              44              7(13%)
> 39              52              46              6(11%)
> 40              51              37              14(27%)
> 41              50              38              12(24%)
> 42              50              39              11(22%)
> 43              50              40              10(20%)
> 44              50              42              8(16%)
> 45              50              43              7(14%)
> 46              50              43              7(14%)
> 47              50              45              5(10%)
> 48              50              37              13(26%)
> 49              49              38              11(22%)
> 50              50              40              10(20%)
> 51              50              42              8(16%)
> 52              50              42              8(16%)
> 53              49              46              3(6%)
> 54              50              46              4(8%)
> 55              49              48              1(2%)
> 56              50              39              11(22%)
> 57              50              40              10(20%)
> 58              49              42              7(14%)
> 59              50              42              8(16%)
> 60              50              46              4(8%)
> 61              50              47              3(6%)
> 62              50              48              2(4%)
> 63              50              48              2(4%)
> 64              51              38              13(25%)
>
> Above 64 bytes the gain fades away.
>
> Very similar values are collected for __copy_to_user().
> UDP receive performance under flood with small packets using recvfrom()
> increases by ~5%.
>
> Signed-off-by: Paolo Abeni <pab...@redhat.com>

Since there are no regressions here, this seems sensible to me. :)
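
For reference, the two delta columns in the quoted table follow directly from
the tick columns. The sketch below shows the arithmetic; the helper names are
hypothetical, and the truncating integer division is an assumption that
matches the rows I checked (e.g. 52 -> 46 ticks is listed as 11%, not 12%).

```c
/* Sketch only: helper names are hypothetical, not part of the patch. */

/* Absolute gain in ticks: vanilla minus patched. */
static unsigned delta_ticks(unsigned vanilla, unsigned patched)
{
    return vanilla - patched;
}

/*
 * Relative gain as a whole percentage of the vanilla cost.
 * Assumption: the table truncates rather than rounds.
 */
static unsigned delta_percent(unsigned vanilla, unsigned patched)
{
    return (vanilla - patched) * 100 / vanilla;
}
```

A regression would show up as patched > vanilla, which does not occur in any
of the rows from 0 to 64 bytes.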

Reviewed-by: Kees Cook <keesc...@chromium.org>

-Kees

> ---
>  arch/x86/include/asm/uaccess_64.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
> index c5504b9..16a8871 100644
> --- a/arch/x86/include/asm/uaccess_64.h
> +++ b/arch/x86/include/asm/uaccess_64.h
> @@ -28,6 +28,9 @@ copy_user_generic(void *to, const void *from, unsigned len)
>  {
>         unsigned ret;
>
> +       if (len <= 64)
> +               return copy_user_generic_unrolled(to, from, len);
> +
>         /*
>          * If CPU has ERMS feature, use copy_user_enhanced_fast_string.
>          * Otherwise, if CPU has rep_good feature, use copy_user_generic_string.
> --
> 2.9.4
>
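
The shape of the dispatch in the hunk above can be sketched in user space as
follows. This is only an illustration under stated assumptions: copy_short()
and copy_bulk() are hypothetical stand-ins for copy_user_generic_unrolled()
and the 'rep'-based variants, SHORT_COPY_MAX is a made-up name for the
64-byte threshold, and nothing here handles faults the way the kernel
routines do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name for the 64-byte cutoff used in the patch. */
#define SHORT_COPY_MAX 64

/*
 * Stand-in for copy_user_generic_unrolled(): a plain byte loop (not
 * actually unrolled). Like the kernel routine, it returns the number
 * of bytes NOT copied, which is always 0 here since user-space
 * copies cannot fault in this sketch.
 */
static unsigned copy_short(void *to, const void *from, unsigned len)
{
    unsigned char *d = to;
    const unsigned char *s = from;

    while (len--)
        *d++ = *s++;
    return 0;
}

/* Stand-in for the 'rep'-based string copy variants. */
static unsigned copy_bulk(void *to, const void *from, unsigned len)
{
    memcpy(to, from, len);
    return 0;
}

/* Short lengths skip the bulk path, mirroring the added early return. */
static unsigned copy_dispatch(void *to, const void *from, unsigned len)
{
    if (len <= SHORT_COPY_MAX)
        return copy_short(to, from, len);

    return copy_bulk(to, from, len);
}
```

Note that the comparison is inclusive, so a 64-byte copy still takes the
short path, consistent with the 64-byte row in the benchmark table.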



-- 
Kees Cook
Pixel Security
