Re: Backport e666d4e9ceec crypto: vmx - Use skcipher for ctr fallback

2018-08-23 Thread Herbert Xu
On Thu, Aug 23, 2018 at 05:31:01PM +0200, Petr Vorel wrote: > Hi, > > I wonder if it makes sense to backport commit > e666d4e9ceec crypto: vmx - Use skcipher for ctr fallback > to the v4.14 stable kernel. > I'm using it in 4.12+. > > These (somewhat similar) commits have been backported to 4.10.2: >

Re: [PATCH v2] crypto: arm64/aes-modes - get rid of literal load of addend vector

2018-08-23 Thread Ard Biesheuvel
On 23 August 2018 at 21:04, Nick Desaulniers wrote: > On Thu, Aug 23, 2018 at 9:48 AM Ard Biesheuvel > wrote: >> >> Replace the literal load of the addend vector with a sequence that >> performs each add individually. This sequence is only 2 instructions >> longer than the original, and 2%

Re: [PATCH v2] crypto: arm64/aes-modes - get rid of literal load of addend vector

2018-08-23 Thread Nick Desaulniers
On Thu, Aug 23, 2018 at 9:48 AM Ard Biesheuvel wrote: > > Replace the literal load of the addend vector with a sequence that > performs each add individually. This sequence is only 2 instructions > longer than the original, and 2% faster on Cortex-A53. > > This is an improvement by itself, but

[PATCH v2] crypto: arm64/aes-modes - get rid of literal load of addend vector

2018-08-23 Thread Ard Biesheuvel
Replace the literal load of the addend vector with a sequence that performs each add individually. This sequence is only 2 instructions longer than the original, and 2% faster on Cortex-A53. This is an improvement by itself, but also works around a Clang issue, whose integrated assembler does not

Backport e666d4e9ceec crypto: vmx - Use skcipher for ctr fallback

2018-08-23 Thread Petr Vorel
Hi, I wonder if it makes sense to backport commit e666d4e9ceec crypto: vmx - Use skcipher for ctr fallback to the v4.14 stable kernel. I'm using it in 4.12+. These (somewhat similar) commits have been backported to 4.10.2: 5839f555fa57 crypto: vmx - Use skcipher for xts fallback c96d0a1c47ab crypto:

[PATCH v2] crypto: arm/ghash-ce - implement support for 4-way aggregation

2018-08-23 Thread Ard Biesheuvel
Speed up the GHASH algorithm based on 64-bit polynomial multiplication by adding support for 4-way aggregation. This improves throughput by ~85% on Cortex-A53, from 1.7 cycles per byte to 0.9 cycles per byte. When combined with AES into GCM, throughput improves by ~25%, from 3.8 cycles per byte