Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote: > As it turns out, checking the TIF_NEED_RESCHED flag after each > iteration results in a significant performance regression (~10%) > when running fast algorithms (i.e., ones that use special instructions > and operate in the < 4 cycles per byte range) on in-order cores with > comparatively slow memory accesses such as the Cortex-A53. > > Given the speed of these ciphers, and the fact that the page based > nature of the AEAD scatterwalk API guarantees that the core NEON > transform is never invoked with more than a single page's worth of > input, we can estimate the worst case duration of any resulting > scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages, > processing a page's worth of input at 4 cycles per byte results in > a delay of ~250 us, which is a reasonable upper bound. > > So let's remove the yield checks from the fused AES-CCM and AES-GCM > routines entirely. > > This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and > partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3. > > Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...") > Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...") > Signed-off-by: Ard Biesheuvel Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On 3 August 2018 at 10:17, Herbert Xu wrote: > On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote: >> But I think it's too late now to take this into v4.18. Could you >> please queue this (and my other two pending arm64/aes-gcm patches, if >> possible) for v4.19 instead? > > OK I'll do that. > Thanks. But, actually, those two pending patches are 3 piece series now ... (v2 of 'crypto/arm64: aes-ce-gcm - switch to 2-way aggregation') Thanks, Ard.
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote: > But I think it's too late now to take this into v4.18. Could you > please queue this (and my other two pending arm64/aes-gcm patches, if > possible) for v4.19 instead? OK I'll do that. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On 3 August 2018 at 08:14, Herbert Xu wrote: > On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote: >> As it turns out, checking the TIF_NEED_RESCHED flag after each >> iteration results in a significant performance regression (~10%) >> when running fast algorithms (i.e., ones that use special instructions >> and operate in the < 4 cycles per byte range) on in-order cores with >> comparatively slow memory accesses such as the Cortex-A53. >> >> Given the speed of these ciphers, and the fact that the page based >> nature of the AEAD scatterwalk API guarantees that the core NEON >> transform is never invoked with more than a single page's worth of >> input, we can estimate the worst case duration of any resulting >> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages, >> processing a page's worth of input at 4 cycles per byte results in >> a delay of ~250 us, which is a reasonable upper bound. >> >> So let's remove the yield checks from the fused AES-CCM and AES-GCM >> routines entirely. >> >> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and >> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3. >> >> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...") >> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...") >> Signed-off-by: Ard Biesheuvel >> --- >> This supersedes my series 'crypto/arm64: reduce impact of NEON yield checks' >> sent out on the 24th of July. >> >> Given the significant performance regression, we may want to treat this as >> a fix (the patches in question went into v4.18) >> >> This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing >> kernel_neon_begin/end pair' which I sent out on the 27th of July, which >> fixes a data corruption bug, whic should also be applied as a fix. > > Acked-by: Herbert Xu > Thanks Herbert. But I think it's too late now to take this into v4.18. Could you please queue this (and my other two pending arm64/aes-gcm patches, if possible) for v4.19 instead? Thanks,
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote: > As it turns out, checking the TIF_NEED_RESCHED flag after each > iteration results in a significant performance regression (~10%) > when running fast algorithms (i.e., ones that use special instructions > and operate in the < 4 cycles per byte range) on in-order cores with > comparatively slow memory accesses such as the Cortex-A53. > > Given the speed of these ciphers, and the fact that the page based > nature of the AEAD scatterwalk API guarantees that the core NEON > transform is never invoked with more than a single page's worth of > input, we can estimate the worst case duration of any resulting > scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages, > processing a page's worth of input at 4 cycles per byte results in > a delay of ~250 us, which is a reasonable upper bound. > > So let's remove the yield checks from the fused AES-CCM and AES-GCM > routines entirely. > > This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and > partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3. > > Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...") > Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...") > Signed-off-by: Ard Biesheuvel > --- > This supersedes my series 'crypto/arm64: reduce impact of NEON yield checks' > sent out on the 24th of July. > > Given the significant performance regression, we may want to treat this as > a fix (the patches in question went into v4.18) > > This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing > kernel_neon_begin/end pair' which I sent out on the 27th of July, which > fixes a data corruption bug, whic should also be applied as a fix. Acked-by: Herbert Xu Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt