[PATCH V3 6/6] crypto/nx: Add P9 NX support for 842 compression engine

2017-07-21 Thread Haren Myneni
This patch adds P9 NX support for the 842 compression engine. Virtual Accelerator Switchboard (VAS) is used to access the 842 engine on P9. For each NX engine per chip, set up a receive window using vas_rx_win_open(), which configures the RxFIFO with FIFO address, lpid, pid and tid values. This unique (lpid,

[PATCH V3 5/6] crypto/nx: Add P9 NX specific error codes for 842 engine

2017-07-21 Thread Haren Myneni
This patch adds changes for checking P9 specific 842 engine error codes. These errors are reported in coprocessor status block (CSB) for failures. Signed-off-by: Haren Myneni --- arch/powerpc/include/asm/icswx.h | 3 +++ drivers/crypto/nx/nx-842-powernv.c | 18

[PATCH V3 3/6] crypto/nx: Create nx842_delete_coprocs function

2017-07-21 Thread Haren Myneni
Move deleting coprocessors info upon exit or failure to nx842_delete_coprocs(). Signed-off-by: Haren Myneni --- drivers/crypto/nx/nx-842-powernv.c | 25 - 1 file changed, 12 insertions(+), 13 deletions(-) diff --git

[PATCH V3 4/6] crypto/nx: Add nx842_add_coprocs_list function

2017-07-21 Thread Haren Myneni
Updating coprocessor list is moved to nx842_add_coprocs_list(). This function will be used for both icswx and VAS functions. Signed-off-by: Haren Myneni --- drivers/crypto/nx/nx-842-powernv.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git

[PATCH V3 2/6] crypto/nx: Create nx842_configure_crb function

2017-07-21 Thread Haren Myneni
CRB configuration is moved to nx842_configure_crb() so that it can be used for icswx and VAS exec functions. The VAS function will be added later with P9 support. Signed-off-by: Haren Myneni --- drivers/crypto/nx/nx-842-powernv.c | 57 +- 1 file

[PATCH V3 1/6] crypto/nx842: Rename nx842_powernv_function as icswx function

2017-07-21 Thread Haren Myneni
Rename nx842_powernv_function to nx842_powernv_exec. nx842_powernv_exec points to nx842_exec_icswx and will point to the VAS exec function, which will be added later for P9 NX support. Signed-off-by: Haren Myneni --- drivers/crypto/nx/nx-842-powernv.c | 20 +---

[PATCH V3 0/6] Enable NX 842 compression engine on Power9

2017-07-21 Thread Haren Myneni
P9 introduces Virtual Accelerator Switchboard (VAS) to communicate with the NX 842 engine. Previously, the icswx function was used to access NX. On powerNV systems, the NX-842 driver invokes VAS functions to configure an RxFIFO (receive window) for each NX engine. VAS uses this FIFO to communicate the request to

Re: [PATCH V2 6/6] crypto/nx: Add P9 NX support for 842 compression engine

2017-07-21 Thread Haren Myneni
On 07/17/2017 11:53 PM, Ram Pai wrote: > On Mon, Jul 17, 2017 at 04:50:38PM -0700, Haren Myneni wrote: >> >> This patch adds P9 NX support for 842 compression engine. Virtual >> Accelerator Switchboard (VAS) is used to access 842 engine on P9. >> >> For each NX engine per chip, setup receive

[PATCH v2 4/4] crypto: ccp - Add XTS-AES-256 support for CCP version 5

2017-07-21 Thread Gary R Hook
Signed-off-by: Gary R Hook --- drivers/crypto/ccp/ccp-crypto-aes-xts.c | 16 +--- drivers/crypto/ccp/ccp-crypto.h |2 +- drivers/crypto/ccp/ccp-ops.c|3 +++ 3 files changed, 17 insertions(+), 4 deletions(-) diff --git

[PATCH v2 0/4] Update support for XTS-AES on AMD CCPs

2017-07-21 Thread Gary R Hook
The following series adds support for XTS-AES on version 5 CCPs, both 128- and 256-bit, and enhances/clarifies/simplifies some crypto layer code. Changes since v1: - rework the validation of the unit-size; move to a separate patch - expand the key buffer to accommodate 256-bit keys - use

[PATCH v2 3/4] crypto: ccp - Rework the unit-size check for XTS-AES

2017-07-21 Thread Gary R Hook
The CCP supports a limited set of unit-size values. Change the check for this parameter such that acceptable values match the enumeration. Then clarify the conditions under which we must use the fallback implementation. Signed-off-by: Gary R Hook ---
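
The check described above can be sketched as a lookup against a table of supported sizes. This is a minimal illustration, not the driver's code: the table contents and the names `xts_unit_sizes`, `xts_unit_size_supported()` and `xts_needs_fallback()` are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table of unit sizes (in bytes) the engine accepts.
 * The values are an assumption for illustration only. */
static const unsigned int xts_unit_sizes[] = { 16, 512, 1024, 2048, 4096 };

/* Return true if nbytes exactly matches one of the supported unit sizes. */
static bool xts_unit_size_supported(unsigned int nbytes)
{
    size_t count = sizeof(xts_unit_sizes) / sizeof(xts_unit_sizes[0]);

    for (size_t i = 0; i < count; i++) {
        if (xts_unit_sizes[i] == nbytes)
            return true;
    }
    return false;
}

/* A request whose size is not in the table goes to the software fallback. */
static bool xts_needs_fallback(unsigned int nbytes)
{
    return !xts_unit_size_supported(nbytes);
}
```

Keeping the accepted values in one table means the validity check and the fallback decision cannot drift apart.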

[PATCH v2 1/4] crypto: ccp - Add a call to xts_check_key()

2017-07-21 Thread Gary R Hook
Vet the key using the available standard function. Signed-off-by: Gary R Hook --- drivers/crypto/ccp/ccp-crypto-aes-xts.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c

[PATCH v2 2/4] crypto: ccp - Enable XTS-AES-128 support on all CCPs

2017-07-21 Thread Gary R Hook
Version 5 CCPs have some new requirements for XTS-AES: the type field must be specified, and the key requires 512 bits, with each part occupying 256 bits and padded with zeroes. Signed-off-by: Gary R Hook --- drivers/crypto/ccp/ccp-dev-v5.c |2 ++ drivers/crypto/ccp/ccp-dev.h
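
The key layout described above (a 512-bit buffer, each key part in its own zero-padded 256-bit slot) might look roughly like the following. This is a sketch under stated assumptions: the function name `xts_pack_key_v5()` and the macros are hypothetical, and placing each part at the start of its slot with trailing zeroes is a guess, since the snippet does not say where the padding goes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_BYTES    32                /* 256 bits per key part */
#define KEY_BUF_BYTES (2 * SLOT_BYTES)  /* 512 bits in total */

/*
 * Pack an XTS key made of two equal parts of part_len bytes each into
 * a 512-bit buffer, one part per 256-bit slot.
 * Assumption: each part sits at the start of its slot and the rest of
 * the slot is zero; real hardware may expect a different placement.
 * Returns 0 on success, -1 if a part does not fit into a slot.
 */
static int xts_pack_key_v5(uint8_t out[KEY_BUF_BYTES],
                           const uint8_t *key, size_t part_len)
{
    if (part_len == 0 || part_len > SLOT_BYTES)
        return -1;

    memset(out, 0, KEY_BUF_BYTES);                      /* zero padding */
    memcpy(out, key, part_len);                         /* first part   */
    memcpy(out + SLOT_BYTES, key + part_len, part_len); /* second part  */
    return 0;
}
```

With `part_len` set to 16 this covers an XTS-AES-128 key; with 32 both slots are filled completely and no padding remains.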

[PATCH] crypto: Kconfig: Correct help text about feeding entropy pool

2017-07-21 Thread PrasannaKumar Muralidharan
Modify Kconfig help text to reflect the fact that random data from hwrng is fed into kernel random number generator's entropy pool. Signed-off-by: PrasannaKumar Muralidharan --- drivers/char/hw_random/Kconfig | 6 ++ 1 file changed, 2 insertions(+), 4

Re: [PATCH] crypto: ccp - Fix XTS-AES support on a version 5 CCP

2017-07-21 Thread Gary R Hook
On 07/17/2017 04:48 PM, Lendacky, Thomas wrote: On 7/17/2017 3:08 PM, Gary R Hook wrote: Version 5 CCPs have differing requirements for XTS-AES: key components are stored in a 512-bit vector. The context must be little-endian justified. AES-256 is supported now, so propagate the cipher size to

[PATCH v2 0/3] crypto: scompress - defer allocation of percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
This is a followup to 'crypto: scompress - eliminate percpu scratch buffers', which attempted to replace the scompress per-CPU buffer entirely, but as Herbert pointed out, this is not going to fly in the targeted use cases. Instead, move the alloc/free of the buffers into the tfm init/exit hooks,

[PATCH v2 3/3] crypto: scompress - defer allocation of scratch buffer to first use

2017-07-21 Thread Ard Biesheuvel
The scompress code allocates 2 x 128 KB of scratch buffers for each CPU, so that clients of the async API can use synchronous implementations even from atomic context. However, on systems such as Cavium ThunderX (which has 96 cores), this adds up to a non-negligible 24 MB. Also, 32-bit systems may
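
The 24 MB figure follows directly from the numbers in the message: 2 buffers x 128 KB x 96 CPUs. A small helper makes the arithmetic explicit; the names `SCRATCH_BUF_BYTES`, `SCRATCH_BUFS_PER_CPU` and `scomp_scratch_total_bytes()` are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define SCRATCH_BUF_BYTES    (128u * 1024u) /* 128 KB per buffer */
#define SCRATCH_BUFS_PER_CPU 2u             /* source + destination */

/* Total scratch memory reserved when every CPU gets its own buffers. */
static size_t scomp_scratch_total_bytes(unsigned int ncpus)
{
    return (size_t)ncpus * SCRATCH_BUFS_PER_CPU * SCRATCH_BUF_BYTES;
}
```

For 96 CPUs this gives 96 x 2 x 131072 = 25165824 bytes, which is exactly 24 MB.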

[PATCH v2 2/3] crypto: scompress - free partially allocated scratch buffers on failure

2017-07-21 Thread Ard Biesheuvel
When allocating the per-CPU scratch buffers, we allocate the source and destination buffers separately, but bail immediately if the second allocation fails, without freeing the first one. Fix that. Signed-off-by: Ard Biesheuvel --- crypto/scompress.c | 5 - 1 file
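
The fix described above follows a common pattern: when the second of two allocations fails, release the first before returning the error. Below is a user-space sketch using `malloc()`; the struct and function names are hypothetical, and the kernel patch itself operates on per-CPU buffers with kernel allocators rather than this simplified form.

```c
#include <stdlib.h>

/* Hypothetical pair of scratch buffers, allocated separately. */
struct scratch {
    void *src;
    void *dst;
};

/*
 * Allocate both buffers. If the second allocation fails, free the
 * first one so that nothing is leaked. Returns 0 on success, -1 on
 * failure; on failure both pointers are left NULL.
 */
static int scratch_alloc(struct scratch *s, size_t src_size, size_t dst_size)
{
    s->src = malloc(src_size);
    s->dst = NULL;
    if (!s->src)
        return -1;

    s->dst = malloc(dst_size);
    if (!s->dst) {
        free(s->src);   /* undo the first allocation */
        s->src = NULL;
        return -1;
    }
    return 0;
}

/* Release both buffers; safe to call on a partially set up struct. */
static void scratch_free(struct scratch *s)
{
    free(s->src);
    free(s->dst);
    s->src = NULL;
    s->dst = NULL;
}
```

Resetting the pointers to NULL after freeing keeps a later `scratch_free()` call harmless, because `free(NULL)` is a no-op.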

[PATCH v2 1/3] crypto: scompress - don't sleep with preemption disabled

2017-07-21 Thread Ard Biesheuvel
Due to the use of per-CPU buffers, scomp_acomp_comp_decomp() executes with preemption disabled, and so whether the CRYPTO_TFM_REQ_MAY_SLEEP flag is set is irrelevant, since we cannot sleep anyway. So disregard the flag, and use GFP_ATOMIC unconditionally. Cc: # v4.10+

Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-21 Thread Stephan Müller
Am Freitag, 21. Juli 2017, 17:09:11 CEST schrieb Arnd Bergmann: Hi Arnd, > On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller wrote: > > Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o: > >> Um, the timer is the largest number of interrupts on my system.

Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-21 Thread Arnd Bergmann
On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller wrote: > Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o: >> Um, the timer is the largest number of interrupts on my system. Compare: >> >> CPU0 CPU1 CPU2 CPU3 >> LOC:6396552

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Gary R Hook
On 07/21/2017 09:47 AM, Theodore Ts'o wrote: On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote: Better, but obviously there is still much room for improvement by reducing the number of calls to RDRAND. Hmm, is there some way we can easily tell we are running on Ryzen? Or do we

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Oliver Mangold
On 21.07.2017 16:47, Theodore Ts'o wrote: On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote: Better, but obviously there is still much room for improvement by reducing the number of calls to RDRAND. Hmm, is there some way we can easily tell we are running on Ryzen? Or do we

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Theodore Ts'o
On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote: > Better, but obviously there is still much room for improvement by reducing > the number of calls to RDRAND. Hmm, is there some way we can easily tell we are running on Ryzen? Or do we believe this is going to be true for all AMD

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
On 21 July 2017 at 14:44, Herbert Xu wrote: > On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote: >> >> >> - Would you mind a patch that makes the code only use the per-CPU >> >> buffers if we are running atomically to begin with? >> > >> > That would mean

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Herbert Xu
On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote: > > >> - Would you mind a patch that makes the code only use the per-CPU > >> buffers if we are running atomically to begin with? > > > > That would mean dropping the first packet so no. > > > > I think you misunderstood me: the

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
On 21 July 2017 at 14:31, Herbert Xu wrote: > On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote: >> >> OK, but that doesn't really answer any of my questions: >> - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually >> exclusive with

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Herbert Xu
On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote: > > OK, but that doesn't really answer any of my questions: > - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually > exclusive with CRYPTO_TFM_REQ_MAY_SLEEP, or should > crypto_scomp_sg_alloc() always use GFP_ATOMIC? We need

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
On 21 July 2017 at 14:24, Ard Biesheuvel wrote: > On 21 July 2017 at 14:11, Herbert Xu wrote: >> On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote: >>> >>> Right. And is req->dst guaranteed to be assigned in that case? Because

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
On 21 July 2017 at 14:11, Herbert Xu wrote: > On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote: >> >> Right. And is req->dst guaranteed to be assigned in that case? Because >> crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the >>

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Herbert Xu
On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote: > > Right. And is req->dst guaranteed to be assigned in that case? Because > crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the > scatterlist if req->dst == NULL. > > Is there any way we could make these scratch

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Ard Biesheuvel
On 21 July 2017 at 13:42, Herbert Xu wrote: > On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote: >> The scompress code unconditionally allocates 2 per-CPU scratch buffers >> of 128 KB each, in order to avoid allocation overhead in the async >> wrapper

Re: [PATCH] crypto: scompress - eliminate percpu scratch buffers

2017-07-21 Thread Herbert Xu
On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote: > The scompress code unconditionally allocates 2 per-CPU scratch buffers > of 128 KB each, in order to avoid allocation overhead in the async > wrapper that encapsulates the synchronous compression algorithm, since > it may execute in

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Jeffrey Walton
On Fri, Jul 21, 2017 at 3:12 AM, Oliver Mangold wrote: > Hi, > > I was wondering why reading from /dev/urandom is much slower on Ryzen than > on Intel, and did some analysis. It turns out that the RDRAND instruction is > at fault, which takes much longer on AMD. > > if I read

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Oliver Mangold
On 21.07.2017 11:26, Jan Glauber wrote: Nice catch. How much does the performance improve on Ryzen when you use arch_get_random_int()? Okay, now I have some results for you: On Ryzen 1800X (using arch_get_random_int()): --- # dd if=/dev/urandom of=/dev/null bs=1M status=progress 8751415296

Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-21 Thread Jeffrey Walton
Hi Ted, Snipping one comment: > Practically no one uses /dev/random. It's essentially a deprecated > interface; the primary interfaces that have been recommended for well > over a decade is /dev/urandom, and now, getrandom(2). We only need > 384 bits of randomness every 5 minutes to reseed the

Re: Poor RNG performance on Ryzen

2017-07-21 Thread Jan Glauber
On Fri, Jul 21, 2017 at 09:12:01AM +0200, Oliver Mangold wrote: > Hi, > > I was wondering why reading from /dev/urandom is much slower on > Ryzen than on Intel, and did some analysis. It turns out that the > RDRAND instruction is at fault, which takes much longer on AMD. > > if I read this

Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-21 Thread Stephan Müller
Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o: Hi Theodore, > On Thu, Jul 20, 2017 at 09:00:02PM +0200, Stephan Müller wrote: > > I concur with your rationale where de-facto the correlation effect is > > diminished and eliminated with the fast_pool and the minimal entropy > >

Poor RNG performance on Ryzen

2017-07-21 Thread Oliver Mangold
Hi, I was wondering why reading from /dev/urandom is much slower on Ryzen than on Intel, and did some analysis. It turns out that the RDRAND instruction is at fault, which takes much longer on AMD. if I read this correctly: --- drivers/char/random.c --- 862

Re: [PATCH 3/4] crypto: axis: add ARTPEC-6/7 crypto accelerator driver

2017-07-21 Thread Lars Persson
On 07/20/2017 04:51 PM, Stephan Müller wrote: Am Donnerstag, 20. Juli 2017, 15:44:31 CEST schrieb Lars Persson: Hi Lars, +static int +artpec6_crypto_cipher_set_key(struct crypto_skcipher *cipher, const u8 *key, + unsigned int keylen) +{ + struct