[PATCH v2 0/3] crypto: arm64/chacha - performance improvements

2018-12-04 Thread Ard Biesheuvel
Improve the performance of NEON-based ChaCha: Patch #1 adds a block size of 1472 to the tcrypt test template so we have something that reflects the VPN case. Patch #2 improves performance for arbitrary length inputs: on deep pipelines, throughput increases ~30% when running on inputs blocks

[PATCH v2 1/3] crypto: tcrypt - add block size of 1472 to skcipher template

2018-12-04 Thread Ard Biesheuvel
In order to have better coverage of algorithms operating on block sizes that are in the ballpark of a VPN packet, add 1472 to the block_sizes array.

Signed-off-by: Ard Biesheuvel
---
 crypto/tcrypt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/tcrypt.c

[PATCH v2 3/3] crypto: arm64/chacha - use combined SIMD/ALU routine for more speed

2018-12-04 Thread Ard Biesheuvel
To some degree, most known AArch64 micro-architectures appear to be able to issue ALU instructions in parallel with SIMD instructions without affecting the SIMD throughput. This means we can use the ALU to process a fifth ChaCha block while the SIMD unit is processing four blocks in parallel.
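
For readers unfamiliar with what "processing a ChaCha block on the ALU" involves, the following is a minimal C sketch of the scalar ChaCha quarter round as specified in RFC 7539. It is not the arm64 assembly from the patch; the function names `rotl32` and `chacha_quarter_round` are chosen here for illustration. It only shows that the core operation needs nothing beyond 32-bit add, xor and rotate, which is why general-purpose registers can carry an extra block alongside the NEON lanes.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (valid for 0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/*
 * One ChaCha quarter round on four words of the 16-word state,
 * following RFC 7539 section 2.1. Illustrative helper, not kernel code.
 */
static void chacha_quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}
```

With the RFC 7539 section 2.1.1 test input (a = 0x11111111, b = 0x01020304, c = 0x9b8d6f43, d = 0x01234567), the quarter round yields a = 0xea2a92f4, b = 0xcb1cf8ce, c = 0x4581472e, d = 0x5881c4bb.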

[PATCH v2 2/3] crypto: arm64/chacha - optimize for arbitrary length inputs

2018-12-04 Thread Ard Biesheuvel
Update the 4-way NEON ChaCha routine so it can handle input of any length >64 bytes in its entirety, rather than having to call into the 1-way routine and/or memcpy()s via temp buffers to handle the tail of a ChaCha invocation that is not a multiple of 256 bytes. On inputs that are a multiple of
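
To make the "not a multiple of 256 bytes" case concrete, here is a small C sketch of the arithmetic only. It assumes the standard 64-byte ChaCha block, so one 4-way pass covers 256 bytes; the struct and function names (`chacha_split`, `chacha_split_len`) are hypothetical and do not come from the patch.

```c
#include <stddef.h>

#define CHACHA_BLOCK_SIZE	64			/* bytes per ChaCha block */
#define CHACHA_4WAY_BYTES	(4 * CHACHA_BLOCK_SIZE)	/* bytes per 4-way pass */

/* Hypothetical helper type: how an input length divides into 4-way passes. */
struct chacha_split {
	size_t full_chunks;	/* whole 256-byte (4-block) chunks */
	size_t tail_bytes;	/* bytes left over after those chunks */
	size_t tail_blocks;	/* 64-byte blocks needed to cover the tail */
};

static struct chacha_split chacha_split_len(size_t len)
{
	struct chacha_split s;

	s.full_chunks = len / CHACHA_4WAY_BYTES;
	s.tail_bytes = len % CHACHA_4WAY_BYTES;
	/* Round up: a partial final block still needs a full keystream block. */
	s.tail_blocks = (s.tail_bytes + CHACHA_BLOCK_SIZE - 1) / CHACHA_BLOCK_SIZE;
	return s;
}
```

For the 1472-byte size added in patch 1/3, this gives five full 256-byte chunks plus a 192-byte tail (three blocks). A 1000-byte input gives three chunks plus a 232-byte tail whose last block is only partially used, which is the kind of tail that previously went through the 1-way routine and temporary buffers.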

[PATCH] crypto: adiantum - propagate CRYPTO_ALG_ASYNC flag to instance

2018-12-04 Thread Eric Biggers
From: Eric Biggers

If the stream cipher implementation is asynchronous, then the Adiantum instance must be flagged as asynchronous as well. Otherwise someone asking for a synchronous algorithm can get an asynchronous algorithm. There are no asynchronous xchacha12 or xchacha20 implementations
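
The failure mode described above can be modelled in a few lines of C. This is a simplified sketch, not the kernel's code: `SKETCH_ALG_ASYNC` stands in for `CRYPTO_ALG_ASYNC` (its value here is an assumption for illustration), and `sketch_alg`, `sketch_inherit_async` and `sketch_matches` are hypothetical names. The lookup is modelled as a type/mask comparison on the flags word, where a caller wanting a synchronous algorithm passes type 0 with the async bit set in the mask.

```c
#include <stdint.h>

/* Stand-in for the kernel's CRYPTO_ALG_ASYNC bit (value assumed here). */
#define SKETCH_ALG_ASYNC	0x00000080u

/* Hypothetical, minimal model of an algorithm's flags word. */
struct sketch_alg {
	uint32_t flags;
};

/* An instance wrapping an inner cipher inherits the inner async bit. */
static uint32_t sketch_inherit_async(const struct sketch_alg *inner)
{
	return inner->flags & SKETCH_ALG_ASYNC;
}

/* Simplified lookup rule: every bit selected by mask must equal type. */
static int sketch_matches(const struct sketch_alg *alg,
			  uint32_t type, uint32_t mask)
{
	return ((alg->flags ^ type) & mask) == 0;
}
```

In this model, an instance built on an asynchronous inner cipher but left with flags 0 would match a request with type 0 and mask `SKETCH_ALG_ASYNC`, which is the mismatch the patch description warns about. Once the instance carries the inherited bit, the same request no longer matches it.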

Re: [PATCH] fscrypt: remove CRYPTO_CTR dependency

2018-12-04 Thread Eric Biggers
On Thu, Sep 06, 2018 at 12:43:41PM +0200, Ard Biesheuvel wrote:
> On 5 September 2018 at 21:24, Eric Biggers wrote:
> > From: Eric Biggers
> >
> > fscrypt doesn't use the CTR mode of operation for anything, so there's
> > no need to select CRYPTO_CTR. It was added by commit 71dea01ea2ed
> >

Using Advanced Vector eXtensions with hand-coded x64 algorithms (e.g. /arch/x86/blowfish-x86_64-asm_64.S)

2018-12-04 Thread Shipof _
I was curious if it might make implementing F() faster to use instructions that are meant to work with sets of data similar to what would be processed