Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations

2018-08-03 Thread Herbert Xu
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote:
> As it turns out, checking the TIF_NEED_RESCHED flag after each
> iteration results in a significant performance regression (~10%)
> when running fast algorithms (i.e., ones that use special instructions
> and operate in the < 4 cycles per byte range) on in-order cores with
> comparatively slow memory accesses such as the Cortex-A53.
> 
> Given the speed of these ciphers, and the fact that the page based
> nature of the AEAD scatterwalk API guarantees that the core NEON
> transform is never invoked with more than a single page's worth of
> input, we can estimate the worst case duration of any resulting
> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
> processing a page's worth of input at 4 cycles per byte results in
> a delay of ~250 us, which is a reasonable upper bound.
> 
> So let's remove the yield checks from the fused AES-CCM and AES-GCM
> routines entirely.
> 
> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and
> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3.
> 
> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...")
> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...")
> Signed-off-by: Ard Biesheuvel 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations

2018-08-03 Thread Ard Biesheuvel
On 3 August 2018 at 10:17, Herbert Xu  wrote:
> On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote:
>> But I think it's too late now to take this into v4.18. Could you
>> please queue this (and my other two pending arm64/aes-gcm patches, if
>> possible) for v4.19 instead?
>
> OK I'll do that.
>

Thanks.

But, actually, those two pending patches are 3 piece series now ...
(v2 of 'crypto/arm64: aes-ce-gcm - switch to 2-way aggregation')

Thanks,
Ard.


Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations

2018-08-03 Thread Herbert Xu
On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote:
> But I think it's too late now to take this into v4.18. Could you
> please queue this (and my other two pending arm64/aes-gcm patches, if
> possible) for v4.19 instead?

OK I'll do that.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations

2018-08-03 Thread Ard Biesheuvel
On 3 August 2018 at 08:14, Herbert Xu  wrote:
> On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote:
>> As it turns out, checking the TIF_NEED_RESCHED flag after each
>> iteration results in a significant performance regression (~10%)
>> when running fast algorithms (i.e., ones that use special instructions
>> and operate in the < 4 cycles per byte range) on in-order cores with
>> comparatively slow memory accesses such as the Cortex-A53.
>>
>> Given the speed of these ciphers, and the fact that the page based
>> nature of the AEAD scatterwalk API guarantees that the core NEON
>> transform is never invoked with more than a single page's worth of
>> input, we can estimate the worst case duration of any resulting
>> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
>> processing a page's worth of input at 4 cycles per byte results in
>> a delay of ~250 us, which is a reasonable upper bound.
>>
>> So let's remove the yield checks from the fused AES-CCM and AES-GCM
>> routines entirely.
>>
>> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and
>> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3.
>>
>> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...")
>> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...")
>> Signed-off-by: Ard Biesheuvel 
>> ---
>> This supersedes my series 'crypto/arm64: reduce impact of NEON yield checks'
>> sent out on the 24th of July.
>>
>> Given the significant performance regression, we may want to treat this as
>> a fix (the patches in question went into v4.18)
>>
>> This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing
>> kernel_neon_begin/end pair' which I sent out on the 27th of July, which
>> fixes a data corruption bug, whic should also be applied as a fix.
>
> Acked-by: Herbert Xu 
>

Thanks Herbert.

But I think it's too late now to take this into v4.18. Could you
please queue this (and my other two pending arm64/aes-gcm patches, if
possible) for v4.19 instead?

Thanks,


Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations

2018-08-03 Thread Herbert Xu
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote:
> As it turns out, checking the TIF_NEED_RESCHED flag after each
> iteration results in a significant performance regression (~10%)
> when running fast algorithms (i.e., ones that use special instructions
> and operate in the < 4 cycles per byte range) on in-order cores with
> comparatively slow memory accesses such as the Cortex-A53.
> 
> Given the speed of these ciphers, and the fact that the page based
> nature of the AEAD scatterwalk API guarantees that the core NEON
> transform is never invoked with more than a single page's worth of
> input, we can estimate the worst case duration of any resulting
> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
> processing a page's worth of input at 4 cycles per byte results in
> a delay of ~250 us, which is a reasonable upper bound.
> 
> So let's remove the yield checks from the fused AES-CCM and AES-GCM
> routines entirely.
> 
> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and
> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3.
> 
> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...")
> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...")
> Signed-off-by: Ard Biesheuvel 
> ---
> This supersedes my series 'crypto/arm64: reduce impact of NEON yield checks'
> sent out on the 24th of July.
> 
> Given the significant performance regression, we may want to treat this as
> a fix (the patches in question went into v4.18)
> 
> This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing
> kernel_neon_begin/end pair' which I sent out on the 27th of July, which
> fixes a data corruption bug, whic should also be applied as a fix.

Acked-by: Herbert Xu 

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt