[RFC PATCH] crypto: chacha20 - add implementation using 96-bit nonce

2017-12-08 Thread Ard Biesheuvel
alc...@google.com> Cc: Paul Crowley <paulcrow...@google.com> Cc: Martin Willi <mar...@strongswan.org> Cc: David Gstir <da...@sigma-star.at> Cc: "Jason A . Donenfeld" <ja...@zx2c4.com> Cc: Stephan Mueller <smuel...@chronox.de> Signed-off-by: Ard Biesheuvel

Re: [PATCH] fscrypt: add support for ChaCha20 contents encryption

2017-12-08 Thread Ard Biesheuvel
On 8 December 2017 at 10:14, Stephan Mueller <smuel...@chronox.de> wrote: > Am Freitag, 8. Dezember 2017, 11:06:31 CET schrieb Ard Biesheuvel: > > Hi Ard, > >> >> Given how it is not uncommon for counters to be used as IV, this is a >> fundamental flaw that

Re: [PATCH] fscrypt: add support for ChaCha20 contents encryption

2017-12-08 Thread Ard Biesheuvel
On 8 December 2017 at 09:11, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > On 8 December 2017 at 09:11, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: >> Hi Eric, >> >> On 8 December 2017 at 01:38, Eric Biggers <ebigge...@gmail.com> wrote: >

Re: [PATCH] fscrypt: add support for ChaCha20 contents encryption

2017-12-08 Thread Ard Biesheuvel
On 8 December 2017 at 09:11, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > Hi Eric, > > On 8 December 2017 at 01:38, Eric Biggers <ebigge...@gmail.com> wrote: >> From: Eric Biggers <ebigg...@google.com> >> >> fscrypt currently only supports AES

Re: [PATCH] fscrypt: add support for ChaCha20 contents encryption

2017-12-08 Thread Ard Biesheuvel
Hi Eric, On 8 December 2017 at 01:38, Eric Biggers wrote: > From: Eric Biggers > > fscrypt currently only supports AES encryption. However, many low-end > mobile devices still use older CPUs such as ARMv7, which do not support > the AES instructions

Re: [PATCH] fscrypt: add support for ChaCha20 contents encryption

2017-12-07 Thread Ard Biesheuvel
On 8 December 2017 at 02:51, Jason A. Donenfeld wrote: > Hi Eric, > > Nice to see more use of ChaCha20. However... > > Can we skip over the "sort of worse than XTS, but not having _real_ > authentication sucks anyway in either case, so whatever" and move > directly to, "linux

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Ard Biesheuvel
On 7 December 2017 at 14:50, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > On 7 December 2017 at 14:39, Dave Martin <dave.mar...@arm.com> wrote: >> On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote: >>> Add support macros to conditionally yi

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Ard Biesheuvel
On 7 December 2017 at 15:47, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > On 7 December 2017 at 14:50, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: >> On 7 December 2017 at 14:39, Dave Martin <dave.mar...@arm.com> wrote: >>> On Wed, Dec 06, 2017

Re: [PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames

2017-12-07 Thread Ard Biesheuvel
On 7 December 2017 at 14:53, Dave Martin <dave.mar...@arm.com> wrote: > On Thu, Dec 07, 2017 at 02:21:17PM +0000, Ard Biesheuvel wrote: >> On 7 December 2017 at 14:11, Dave Martin <dave.mar...@arm.com> wrote: >> > On Wed, Dec 06, 2017 at 07:43:36PM +0000, Ard Bieshe

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Ard Biesheuvel
On 7 December 2017 at 14:39, Dave Martin <dave.mar...@arm.com> wrote: > On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote: >> Add support macros to conditionally yield the NEON (and thus the CPU) >> that may be called from the assembler code. >> >>

Re: [PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames

2017-12-07 Thread Ard Biesheuvel
On 7 December 2017 at 14:11, Dave Martin <dave.mar...@arm.com> wrote: > On Wed, Dec 06, 2017 at 07:43:36PM +0000, Ard Biesheuvel wrote: >> We are going to add code to all the NEON crypto routines that will >> turn them into non-leaf functions, so we need to manage the sta

[PATCH v3 19/20] crypto: arm64/crct10dif-ce - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crct10dif-ce-core.S | 32 +--- 1 file changed, 28 insertions(+), 4 deletions(-)

[PATCH v3 20/20] DO NOT MERGE

2017-12-06 Thread Ard Biesheuvel
Test code to force a kernel_neon_end+begin sequence at every yield point, and wipe the entire NEON state before resuming the algorithm. --- arch/arm64/include/asm/assembler.h | 33 1 file changed, 33 insertions(+) diff --git a/arch/arm64/include/asm/assembler.h

[PATCH v3 18/20] crypto: arm64/crc32-ce - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crc32-ce-core.S | 44 ++-- 1 file changed, 32 insertions(+), 12 deletions(-) diff

[PATCH v3 17/20] crypto: arm64/aes-ghash - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/ghash-ce-core.S | 113 ++-- arch/arm64/crypto/ghash-ce-glue.c | 28 +++-- 2 files c

[PATCH v3 13/20] crypto: arm64/sha2-ce - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha2-ce-core.S | 37 ++-- 1 file changed, 26 insertions(+), 11 deletions(-) diff

[PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-06 Thread Ard Biesheuvel
, and the code in between is only executed when the yield path is taken, allowing the context to be preserved. The third macro takes an optional label argument that marks the resume path after a yield has been performed. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/inclu

[PATCH v3 12/20] crypto: arm64/sha1-ce - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha1-ce-core.S | 42 ++-- 1 file changed, 29 insertions(+), 13 deletions(-) diff

[PATCH v3 15/20] crypto: arm64/aes-blk - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce.S| 15 +- arch/arm64/crypto/aes-modes.S | 331 2 files change

[PATCH v3 14/20] crypto: arm64/aes-ccm - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce-ccm-core.S | 150 +--- 1 file changed, 95 insertions(+), 55 deletions(-)

[PATCH v3 16/20] crypto: arm64/aes-bs - yield NEON after every block of input

2017-12-06 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON after every block of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-neonbs-core.S | 305 +++- 1 file changed, 170 insertions(+), 135 deletions(-)

[PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames

2017-12-06 Thread Ard Biesheuvel
in the stack frame (for locals) and emit the ldp/stp sequences. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/include/asm/assembler.h | 60 1 file changed, 60 insertions(+) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/inclu

[PATCH v3 07/20] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path

2017-12-06 Thread Ard Biesheuvel
routines yield every 64 bytes and not have an exception for CBC encrypt which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch

[PATCH v3 06/20] crypto: arm64/aes-blk - remove configurable interleave

2017-12-06 Thread Ard Biesheuvel
INTERLEAVE=4 with inlining disabled for both flavors of the core AES routines, so let's stick with that, and remove the option to configure this at build time. This makes the code easier to modify, which is nice now that we're adding yield support. Signed-off-by: Ard Biesheuvel <ard.bies

[PATCH v3 09/20] crypto: arm64/sha256-neon - play nice with CONFIG_PREEMPT kernels

2017-12-06 Thread Ard Biesheuvel
contexts. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha256-glue.c | 36 +--- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c index b064d925fe2a..e8880c

[PATCH v3 08/20] crypto: arm64/aes-blk - add 4 way interleave to CBC-MAC encrypt path

2017-12-06 Thread Ard Biesheuvel
every 64 bytes and not have an exception for CBC MAC which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/cryp

[PATCH v3 03/20] crypto: arm64/aes-blk - move kernel mode neon en/disable into loop

2017-12-06 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard

[PATCH v3 00/20] crypto: arm64 - play nice with CONFIG_PREEMPT

2017-12-06 Thread Ard Biesheuvel
us...@vger.kernel.org Cc: Peter Zijlstra <pet...@infradead.org> Cc: Catalin Marinas <catalin.mari...@arm.com> Cc: Will Deacon <will.dea...@arm.com> Cc: Steven Rostedt <rost...@goodmis.org> Cc: Thomas Gleixner <t...@linutronix.de> Ard Biesheuvel (20): crypto: testmgr -

[PATCH v3 01/20] crypto: testmgr - add a new test case for CRC-T10DIF

2017-12-06 Thread Ard Biesheuvel
In order to be able to test yield support under preempt, add a test vector for CRC-T10DIF that is long enough to take multiple iterations (and thus possible preemption between them) of the primary loop of the accelerated x86 and arm64 implementations. Signed-off-by: Ard Biesheuvel <ard.bies

[PATCH v3 04/20] crypto: arm64/aes-bs - move kernel mode neon en/disable into loop

2017-12-06 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-neonbs-glue.c | 36 +--- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/arch

[PATCH v3 02/20] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop

2017-12-06 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce-ccm-glue.c | 47 ++-- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/arch

[PATCH v3 05/20] crypto: arm64/chacha20 - move kernel mode neon en/disable into loop

2017-12-06 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/chacha20-neon-glue.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/arm64/

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-06 Thread Ard Biesheuvel
On 6 December 2017 at 12:12, Dave P Martin <dave.mar...@arm.com> wrote: > On Wed, Dec 06, 2017 at 11:57:12AM +0000, Ard Biesheuvel wrote: >> On 6 December 2017 at 11:51, Dave Martin <dave.mar...@arm.com> wrote: >> > On Tue, Dec 05, 2017 at 06:04:34PM +0000, Ar

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-06 Thread Ard Biesheuvel
On 6 December 2017 at 11:51, Dave Martin <dave.mar...@arm.com> wrote: > On Tue, Dec 05, 2017 at 06:04:34PM +0000, Ard Biesheuvel wrote: >> On 5 December 2017 at 12:45, Ard Biesheuvel <ard.biesheu...@linaro.org> >> wrote: >> > >> > >> >>

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-05 Thread Ard Biesheuvel
On 5 December 2017 at 12:45, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > > >> On 5 Dec 2017, at 12:28, Dave Martin <dave.mar...@arm.com> wrote: >> >>> On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrote: >>> Add a support macro t

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-05 Thread Ard Biesheuvel
> On 5 Dec 2017, at 12:28, Dave Martin <dave.mar...@arm.com> wrote: > >> On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrote: >> Add a support macro to conditionally yield the NEON (and thus the CPU) >> that may be called from the assembl

[PATCH v2 18/19] crypto: arm64/crct10dif-ce - yield NEON every 8 blocks of input

2017-12-04 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON every 8 blocks of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crct10dif-ce-core.S | 39 ++-- 1 file changed, 35 insertions(+), 4 deletions(-) diff

[PATCH v2 09/19] crypto: arm64/aes-blk - add 4 way interleave to CBC-MAC encrypt path

2017-12-04 Thread Ard Biesheuvel
every 64 bytes and not have an exception for CBC MAC which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/cryp

[PATCH v2 19/19] DO NOT MERGE

2017-12-04 Thread Ard Biesheuvel
Test code to force a kernel_neon_end+begin sequence at every yield point, and wipe the entire NEON state before resuming the algorithm. --- arch/arm64/include/asm/assembler.h | 33 1 file changed, 33 insertions(+) diff --git a/arch/arm64/include/asm/assembler.h

[PATCH v2 10/19] crypto: arm64/sha256-neon - play nice with CONFIG_PREEMPT kernels

2017-12-04 Thread Ard Biesheuvel
contexts. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha256-glue.c | 36 +--- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/arch/arm64/crypto/sha256-glue.c b/arch/arm64/crypto/sha256-glue.c index b064d925fe2a..e8880c

[PATCH v2 16/19] crypto: arm64/aes-ghash - yield after processing fixed number of blocks

2017-12-04 Thread Ard Biesheuvel
bytes for that one. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/ghash-ce-core.S | 128 ++-- 1 file changed, 92 insertions(+), 36 deletions(-) diff --git a/arch/arm64/crypto/ghash-ce-core.S b/arch/arm64/crypto/ghash-ce-core.S index 11ebf1

[PATCH v2 08/19] crypto: arm64/aes-blk - add 4 way interleave to CBC encrypt path

2017-12-04 Thread Ard Biesheuvel
routines yield every 64 bytes and not have an exception for CBC encrypt which yields every 16 bytes) So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch

[PATCH v2 17/19] crypto: arm64/crc32-ce - yield NEON every 16 blocks of input

2017-12-04 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON every 16 blocks of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crc32-ce-core.S | 55 +++- 1 file changed, 43 insertions(+), 12 deletions(-) diff

[PATCH v2 15/19] crypto: arm64/aes-bs - yield after processing each 128 bytes of input

2017-12-04 Thread Ard Biesheuvel
let's add a yield after each 128 bytes of input, (i.e., 8x the AES block size, which is the natural granularity for a bit sliced algorithm.) This will disable and re-enable kernel mode NEON if a reschedule is pending. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/

[PATCH v2 14/19] crypto: arm64/aes-blk - yield after processing a fixed chunk of input

2017-12-04 Thread Ard Biesheuvel
add a yield after each 16 blocks (for the CE case) or after every block (for the pure NEON case), which will disable and re-enable kernel mode NEON if a reschedule is pending. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce.S| 17 +- arch/arm64/

[PATCH v2 13/19] crypto: arm64/sha2-ce - yield every 8 blocks of input

2017-12-04 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON every 8 blocks of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha2-ce-core.S | 40 ++-- 1 file changed, 29 insertions(+), 11 deletions(-) diff

[PATCH v2 12/19] crypto: arm64/sha1-ce - yield every 8 blocks of input

2017-12-04 Thread Ard Biesheuvel
Avoid excessive scheduling delays under a preemptible kernel by yielding the NEON every 8 blocks of input. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/sha1-ce-core.S | 45 ++-- 1 file changed, 32 insertions(+), 13 deletions(-) diff

[PATCH v2 03/19] crypto: arm64/aes-blk - move kernel mode neon en/disable into loop

2017-12-04 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard

[PATCH v2 06/19] crypto: arm64/ghash - move kernel mode neon en/disable into loop

2017-12-04 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/ghash-ce-glue.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/arch/arm64/

[PATCH v2 02/19] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop

2017-12-04 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce-ccm-glue.c | 47 ++-- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/arch

[PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-04 Thread Ard Biesheuvel
to be preserved. The second macro takes a label argument that marks the resume-from-yield path, which should restore the preserved context again. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/include/asm/assembler.h | 50 1 file changed, 50 insertions(+)

[PATCH v2 07/19] crypto: arm64/aes-blk - remove configurable interleave

2017-12-04 Thread Ard Biesheuvel
INTERLEAVE=4 with inlining disabled for both flavors of the core AES routines, so let's stick with that, and remove the option to configure this at build time. This makes the code easier to modify, which is nice now that we're adding yield support. Signed-off-by: Ard Biesheuvel <ard.bies

[PATCH v2 05/19] crypto: arm64/chacha20 - move kernel mode neon en/disable into loop

2017-12-04 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/chacha20-neon-glue.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/arch/arm64/

[PATCH v2 04/19] crypto: arm64/aes-bs - move kernel mode neon en/disable into loop

2017-12-04 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-neonbs-glue.c | 36 +--- 1 file changed, 17 insertions(+), 19 deletions(-) diff --git a/arch

[PATCH v2 01/19] crypto: testmgr - add a new test case for CRC-T10DIF

2017-12-04 Thread Ard Biesheuvel
In order to be able to test yield support under preempt, add a test vector for CRC-T10DIF that is long enough to take multiple iterations (and thus possible preemption between them) of the primary loop of the accelerated x86 and arm64 implementations. Signed-off-by: Ard Biesheuvel <ard.bies

[PATCH v2 00/19] crypto: arm64 - play nice with CONFIG_PREEMPT

2017-12-04 Thread Ard Biesheuvel
<rost...@goodmis.org> Cc: Thomas Gleixner <t...@linutronix.de> Ard Biesheuvel (19): crypto: testmgr - add a new test case for CRC-T10DIF crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop crypto: arm64/aes-blk - move kernel mode neon en/disable into loop crypto:

Re: [PATCH 0/5] crypto: arm64 - disable NEON across scatterwalk API calls

2017-12-04 Thread Ard Biesheuvel
On 2 December 2017 at 13:59, Peter Zijlstra <pet...@infradead.org> wrote: > On Sat, Dec 02, 2017 at 11:15:14AM +0000, Ard Biesheuvel wrote: >> On 2 December 2017 at 09:11, Ard Biesheuvel <ard.biesheu...@linaro.org> >> wrote: > >> > They consume the entir

Re: [PATCH 0/5] crypto: arm64 - disable NEON across scatterwalk API calls

2017-12-02 Thread Ard Biesheuvel
On 2 December 2017 at 09:11, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > On 2 December 2017 at 09:01, Peter Zijlstra <pet...@infradead.org> wrote: >> On Fri, Dec 01, 2017 at 09:19:22PM +0000, Ard Biesheuvel wrote: >>> Note that the remaining crypto drivers s

Re: [PATCH 0/5] crypto: arm64 - disable NEON across scatterwalk API calls

2017-12-02 Thread Ard Biesheuvel
On 2 December 2017 at 09:01, Peter Zijlstra <pet...@infradead.org> wrote: > On Fri, Dec 01, 2017 at 09:19:22PM +0000, Ard Biesheuvel wrote: >> Note that the remaining crypto drivers simply operate on fixed buffers, so >> while the RT crowd may still feel the need to disabl

[PATCH 5/5] crypto: arm64/ghash - move kernel mode neon en/disable into loop

2017-12-01 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/ghash-ce-glue.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/arm64/crypto/gh

[PATCH 4/5] crypto: arm64/chacha20 - move kernel mode neon en/disable into loop

2017-12-01 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/chacha20-neon-glue.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/arm64/crypto/chacha2

[PATCH 3/5] crypto: arm64/aes-bs - move kernel mode neon en/disable into loop

2017-12-01 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-neonbs-glue.c | 26 +--- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/arch

[PATCH 1/5] crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop

2017-12-01 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce-ccm-glue.c | 47 ++-- 1 file changed, 23 insertions(+), 24 deletions(-) diff --git a/arch

[PATCH 0/5] crypto: arm64 - disable NEON across scatterwalk API calls

2017-12-01 Thread Ard Biesheuvel
con <will.dea...@arm.com> Cc: Steven Rostedt <rost...@goodmis.org> Cc: Thomas Gleixner <t...@linutronix.de> Ard Biesheuvel (5): crypto: arm64/aes-ce-ccm - move kernel mode neon en/disable into loop crypto: arm64/aes-blk - move kernel mode neon en/disable into loop c

[PATCH 2/5] crypto: arm64/aes-blk - move kernel mode neon en/disable into loop

2017-12-01 Thread Ard Biesheuvel
code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled) Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard

Re: [PATCH 5/5] crypto: chacha20 - Fix keystream alignment for chacha20_block()

2017-11-22 Thread Ard Biesheuvel
On 22 November 2017 at 21:29, Eric Biggers <ebigge...@gmail.com> wrote: > On Wed, Nov 22, 2017 at 08:51:57PM +0000, Ard Biesheuvel wrote: >> On 22 November 2017 at 19:51, Eric Biggers <ebigge...@gmail.com> wrote: >> > From: Eric Biggers <ebigg...@google.com> >

Re: [PATCH 5/5] crypto: chacha20 - Fix keystream alignment for chacha20_block()

2017-11-22 Thread Ard Biesheuvel
On 22 November 2017 at 19:51, Eric Biggers wrote: > From: Eric Biggers > > When chacha20_block() outputs the keystream block, it uses 'u32' stores > directly. However, the callers (crypto/chacha20_generic.c and > drivers/char/random.c) declare the

Re: [PATCH 4/5] crypto: x86/chacha20 - Remove cra_alignmask

2017-11-22 Thread Ard Biesheuvel
t need the alignment itself. > > Signed-off-by: Eric Biggers <ebigg...@google.com> Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org> > --- > arch/x86/crypto/chacha20_glue.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/x86/crypto/chacha20_g

Re: [PATCH 3/5] crypto: chacha20 - Remove cra_alignmask

2017-11-22 Thread Ard Biesheuvel
t; there is no need to have a cra_alignmask set for chacha20-generic. > > Signed-off-by: Eric Biggers <ebigg...@google.com> Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org> > --- > crypto/chacha20_generic.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/c

Re: [PATCH 2/5] crypto: chacha20 - Use unaligned access macros when loading key and IV

2017-11-22 Thread Ard Biesheuvel
ers without the unaligned access macros. > > Fix it by using the unaligned access macros when loading the key and IV. > > Signed-off-by: Eric Biggers <ebigg...@google.com> Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org> > --- > crypto/chacha20_generic.c |

Re: [PATCH 1/5] crypto: chacha20 - Fix unaligned access when loading constants

2017-11-22 Thread Ard Biesheuvel
> Fix it by just assigning the constants directly instead. > > Signed-off-by: Eric Biggers <ebigg...@google.com> I'm not thrilled about the open coded hex numbers but I don't care enough to object. Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org> > --- > crypto/c

Re: [PATCH] crypto/arm64: aes-ce-cipher - move assembler code to .S file

2017-11-22 Thread Ard Biesheuvel
cesses. > On Tue, 21 Nov 2017 13:40:17 +0000 > Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: > >> Most crypto drivers involving kernel mode NEON take care to put the >> code that actually touches the NEON register file in a separate >> compilation unit, to preve

Re: [PATCH] crypto: arm64/aes - do not call crypto_unregister_skcipher twice on error

2017-11-22 Thread Ard Biesheuvel
Hello Corentin, On 22 November 2017 at 08:08, Corentin Labbe wrote: > When a cipher fails to register in aes_init(), the error path goes through > aes_exit() then crypto_unregister_skciphers(). > Since aes_exit also calls crypto_unregister_skcipher,

[PATCH] crypto/arm64: aes-ce-cipher - move assembler code to .S file

2017-11-21 Thread Ard Biesheuvel
() and kernel_neon_end() with instantiations of the IR that make up its implementation, allowing further reordering with the asm block. So let's clean this up, and move the asm() blocks into a separate .S file. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Ma

Re: [PATCH -stable] arm: crypto: reduce priority of bit-sliced AES cipher

2017-11-17 Thread Ard Biesheuvel
- > replace bit-sliced OpenSSL NEON code"), but it was just a small part of > a complete rewrite. This patch just fixes the priority bug for older > kernels. > > Signed-off-by: Eric Biggers <ebigg...@google.com> Acked-by: Ard Biesheuvel <ard.biesheu...@linaro.org> >

Re: [PATCH v2 2/8] crypto: scompress - use sgl_alloc() and sgl_free()

2017-11-01 Thread Ard Biesheuvel
On 1 November 2017 at 15:45, Bart Van Assche <bart.vanass...@wdc.com> wrote: > On Wed, 2017-11-01 at 15:17 +0000, Ard Biesheuvel wrote: >> On 1 November 2017 at 14:50, Bart Van Assche <bart.vanass...@wdc.com> wrote: >> > On Mon, 2017-10-16 at 15:49 -0700, Bar

Re: [PATCH v2 2/8] crypto: scompress - use sgl_alloc() and sgl_free()

2017-11-01 Thread Ard Biesheuvel
he <bart.vanass...@wdc.com> >> Cc: Ard Biesheuvel <ard.biesheu...@linaro.org> >> Cc: Herbert Xu <herb...@gondor.apana.org.au> > > Ard and/or Herbert, can you please have a look at this patch and let us know > whether or not it looks fine to you? > The pat

[PATCH 1/2] crypto/chacha20: fix handling of chunked input

2017-08-14 Thread Ard Biesheuvel
9 ("crypto: chacha20 - convert generic and x86 ...") Cc: <sta...@vger.kernel.org> # v4.11+ Cc: Steffen Klassert <steffen.klass...@secunet.com> Reported-by: Tobias Brunner <tob...@strongswan.org> Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- cry

[PATCH 2/2] crypto: testmgr - add chunked test cases for chacha20

2017-08-14 Thread Ard Biesheuvel
We failed to catch a bug in the chacha20 code after porting it to the skcipher API. We would have caught it if any chunked tests had been defined, so define some now so we will catch future regressions. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- crypto/testmgr.h | 7

Re: [PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use

2017-07-25 Thread Ard Biesheuvel
> On 26 Jul 2017, at 00:36, Giovanni Cabiddu <giovanni.cabi...@intel.com> wrote: > > Hi Ard, > >> On Fri, Jul 21, 2017 at 04:42:38PM +0100, Ard Biesheuvel wrote: >> +static int crypto_scomp_init_tfm(struct crypto_tfm *tfm) >> +{ >> +int ret; &

Re: [PATCH v4 0/8] crypto: aes - retire table based generic AES

2017-07-24 Thread Ard Biesheuvel
On 24 July 2017 at 17:57, Eric Biggers <ebigge...@gmail.com> wrote: > On Mon, Jul 24, 2017 at 07:59:43AM +0100, Ard Biesheuvel wrote: >> On 18 July 2017 at 13:06, Ard Biesheuvel <ard.biesheu...@linaro.org> wrote: >> > The generic AES driver uses 16 looku

[PATCH resend 18/18] crypto: arm64/aes - avoid expanded lookup tables in the final round

2017-07-24 Thread Ard Biesheuvel
AES module for the shared key expansion routines. It also frees up register x18, which is not available as a scratch register on all platforms, and so avoiding it improves shareability of this code. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-

[PATCH resend 17/18] crypto: arm/aes - avoid expanded lookup tables in the final round

2017-07-24 Thread Ard Biesheuvel
AES module for the shared key expansion routines. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm/crypto/aes-cipher-core.S | 88 +++- 1 file changed, 65 insertions(+), 23 deletions(-) diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-

[PATCH resend 14/18] crypto: arm64/gcm - implement native driver using v8 Crypto Extensions

2017-07-24 Thread Ard Biesheuvel
. So implement a new GCM driver that combines the AES and PMULL instructions at the block level. This improves performance on Cortex-A57 by ~37% (from 3.5 cpb to 2.6 cpb) Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig | 4 +- arch/arm64/crypto

[PATCH resend 16/18] crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL

2017-07-24 Thread Ard Biesheuvel
ble based one, and is time invariant as well, making it less vulnerable to timing attacks. When combined with the bit-sliced NEON implementation of AES-CTR, the AES-GCM performance increases by 2x (from 58 to 29 cycles per byte). Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/ar

[PATCH resend 15/18] crypto: arm/ghash - add NEON accelerated fallback for vmull.p64

2017-07-24 Thread Ard Biesheuvel
mull.p64 code is 16x faster on this core). Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm/crypto/Kconfig | 5 +- arch/arm/crypto/ghash-ce-core.S | 234 arch/arm/crypto/ghash-ce-glue.c | 24 +- 3 files changed, 215 insertions(+), 48 delet

[PATCH resend 07/18] crypto: arm64/sha2-ce - add non-SIMD scalar fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig| 3 +- arch/arm64/crypto/sha2-ce-glue.

[PATCH resend 03/18] crypto: arm64/ghash-ce - add non-SIMD scalar fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig | 3 +- arch/arm64/crypto/ghash-ce-glue.

[PATCH resend 09/18] crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig | 1 + arch/arm64/crypto/aes-ce-cipher.

[PATCH resend 06/18] crypto: arm64/sha1-ce - add non-SIMD generic fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig| 3 ++- arch/arm64/crypto/sha1-ce-glue.

[PATCH resend 04/18] crypto: arm64/crct10dif - add non-SIMD generic fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crct10dif-ce-glue.c | 13 + 1 file changed, 9 insertions

[PATCH resend 11/18] crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTR

2017-07-24 Thread Ard Biesheuvel
To accommodate systems that may disallow use of the NEON in kernel mode in some circumstances, introduce a C fallback for synchronous AES in CTR mode, and use it if may_use_simd() returns false. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/K

[PATCH resend 12/18] crypto: arm64/chacha20 - take may_use_simd() into account

2017-07-24 Thread Ard Biesheuvel
To accommodate systems that disallow the use of kernel mode NEON in some circumstances, take the return value of may_use_simd into account when deciding whether to invoke the C fallback routine. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/chacha20-neon-

[PATCH resend 02/18] crypto/algapi - make crypto_xor() take separate dst and src arguments

2017-07-24 Thread Ard Biesheuvel
called crypto_xor_cpy(), taking separate input and output arguments. This removes the need for the separate memcpy(). Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm/crypto/aes-ce-glue.c | 4 +--- arch/arm/crypto/aes-neonbs-glue.c | 5 ++--- arch/arm64/cryp

[PATCH resend 05/18] crypto: arm64/crc32 - add non-SIMD scalar fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/crc32-ce-glue.c | 11 ++- 1 file changed, 6 insertions(+), 5 del

[PATCH resend 08/18] crypto: arm64/aes-ce-cipher - match round key endianness with generic code

2017-07-24 Thread Ard Biesheuvel
to be updated to reflect that. Signed-off-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/aes-ce-ccm-core.S | 30 - arch/arm64/crypto/aes-ce-cipher.c | 35 +--- arch/arm64/crypto/aes-ce.S | 12 +++ 3 files changed, 37 insertions(

[PATCH resend 01/18] crypto/algapi - use separate dst and src operands for __crypto_xor()

2017-07-24 Thread Ard Biesheuvel
In preparation of introducing crypto_xor_cpy(), which will use separate operands for input and output, modify the __crypto_xor() implementation, which it will share with the existing crypto_xor(), which provides the actual functionality when not using the inline version. Signed-off-by: Ard

[PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14

2017-07-24 Thread Ard Biesheuvel
. This supersedes all other crypto patches I have outstanding, including the AES refactor ones which I will rework later. Ard Biesheuvel (18): crypto/algapi - use separate dst and src operands for __crypto_xor() crypto/algapi - make crypto_xor() take separate dst and src arguments crypto: arm64

[PATCH resend 10/18] crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback

2017-07-24 Thread Ard Biesheuvel
The arm64 kernel will shortly disallow nested kernel mode NEON. So honour this in the ARMv8 Crypto Extensions implementation of CCM-AES, and fall back to a scalar implementation using the generic crypto helpers for AES, XOR and incrementing the CTR counter. Signed-off-by: Ard Biesheuvel

[PATCH resend 13/18] crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTR

2017-07-24 Thread Ard Biesheuvel
-by: Ard Biesheuvel <ard.biesheu...@linaro.org> --- arch/arm64/crypto/Kconfig | 1 + arch/arm64/crypto/aes-neonbs-glue.c | 48 ++-- 2 files changed, 44 insertions(+), 5 deletions(-) diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig index a068dc
