This patch adds P9 NX support for 842 compression engine. Virtual
Accelerator Switchboard (VAS) is used to access 842 engine on P9.
For each NX engine per chip, set up a receive window using
vas_rx_win_open(), which configures the RxFIFO with the FIFO address,
lpid, pid and tid values. This unique (lpid,
This patch adds changes for checking P9-specific 842 engine
error codes. These errors are reported in the coprocessor status
block (CSB) on failures.
Signed-off-by: Haren Myneni
---
arch/powerpc/include/asm/icswx.h | 3 +++
drivers/crypto/nx/nx-842-powernv.c | 18
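The CSB error-code checking described above can be sketched as a plain translation from completion codes to errno values. The code names and values below are illustrative stand-ins, not the actual P9 NX definitions from the kernel's nx-842 headers:

```c
#include <errno.h>

/* Hypothetical subset of CSB completion codes; the real P9 NX 842
 * values live in the kernel's NX headers and may differ. */
enum csb_cc {
	CSB_CC_SUCCESS     = 0,
	CSB_CC_INVALID_CRB = 1,	/* malformed coprocessor request block */
	CSB_CC_DATA_LENGTH = 2,	/* bad source/target buffer length */
	CSB_CC_HW_ERROR    = 3,	/* engine internal failure */
};

/* Map a completion code reported in the CSB to a Linux errno. */
static int csb_cc_to_errno(enum csb_cc cc)
{
	switch (cc) {
	case CSB_CC_SUCCESS:
		return 0;
	case CSB_CC_INVALID_CRB:
	case CSB_CC_DATA_LENGTH:
		return -EINVAL;
	case CSB_CC_HW_ERROR:
		return -EIO;
	default:
		return -EPROTO;	/* unrecognized code */
	}
}
```

The point is only the shape of the check: the driver inspects the CSB after the engine signals completion and folds hardware codes into a small errno set.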
Move deletion of coprocessor info upon exit or failure into
nx842_delete_coprocs().
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 25 -
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git
Updating the coprocessor list is moved to nx842_add_coprocs_list().
This function will be used by both the icswx and VAS code paths.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 12 +---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git
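The shared-helper pattern this patch describes can be sketched outside the kernel. The real driver uses list_add() on a struct list_head; this stand-alone sketch uses a plain singly linked list, and the struct fields are illustrative:

```c
#include <stddef.h>

/* Minimal stand-in for the driver's coprocessor tracking. */
struct nx842_coproc {
	int chip_id;
	struct nx842_coproc *next;
};

static struct nx842_coproc *nx842_coprocs;

/* Single place where the list is updated, callable from both the
 * icswx and the VAS setup paths, as the changelog above describes. */
static void nx842_add_coprocs_list(struct nx842_coproc *coproc, int chip_id)
{
	coproc->chip_id = chip_id;
	coproc->next = nx842_coprocs;	/* push onto the global list */
	nx842_coprocs = coproc;
}
```

Centralizing the list update means a later VAS-based probe path does not have to duplicate the bookkeeping.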
CRB configuration is moved to nx842_configure_crb() so that it can
be used by both the icswx and VAS exec functions. The VAS function
will be added later with P9 support.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 57 +-
1 file
Rename nx842_powernv_function to nx842_powernv_exec.
nx842_powernv_exec points to nx842_exec_icswx and
will later point to the VAS exec function that will be
added for P9 NX support.
Signed-off-by: Haren Myneni
---
drivers/crypto/nx/nx-842-powernv.c | 20 +---
P9 introduces the Virtual Accelerator Switchboard (VAS) to communicate
with the NX 842 engine; previously, the icswx instruction was used to
access NX. On PowerNV systems, the NX-842 driver invokes VAS functions
to configure an RxFIFO (receive window) for each NX engine. VAS uses
this FIFO to communicate the request to
On 07/17/2017 11:53 PM, Ram Pai wrote:
> On Mon, Jul 17, 2017 at 04:50:38PM -0700, Haren Myneni wrote:
>>
>> This patch adds P9 NX support for 842 compression engine. Virtual
>> Accelerator Switchboard (VAS) is used to access 842 engine on P9.
>>
>> For each NX engine per chip, setup receive
Signed-off-by: Gary R Hook
---
drivers/crypto/ccp/ccp-crypto-aes-xts.c | 16 +---
drivers/crypto/ccp/ccp-crypto.h         |  2 +-
drivers/crypto/ccp/ccp-ops.c            |  3 +++
3 files changed, 17 insertions(+), 4 deletions(-)
diff --git
The following series adds support for XTS-AES on version 5 CCPs,
both 128- and 256-bit, and enhances/clarifies/simplifies some
crypto layer code.
Changes since v1:
- rework the validation of the unit-size; move to a separate patch
- expand the key buffer to accommodate 256-bit keys
- use
The CCP supports a limited set of unit-size values. Change the check
for this parameter such that acceptable values match the enumeration.
Then clarify the conditions under which we must use the fallback
implementation.
Signed-off-by: Gary R Hook
---
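The unit-size check described above amounts to accepting only values present in the driver's enumeration and routing everything else to the fallback. A minimal sketch, with illustrative sizes rather than the CCP's actual enumeration:

```c
#include <stdbool.h>

/* Illustrative XTS unit sizes; the driver's real set is defined by the
 * CCP hardware enumeration and may differ. */
static const unsigned int ccp_unit_sizes[] = { 16, 512, 1024, 2048, 4096 };

/* Return true if the requested XTS unit size is one the CCP supports;
 * otherwise the caller must use the fallback implementation. */
static bool ccp_unit_size_supported(unsigned int unit_size)
{
	unsigned int i;
	unsigned int n = sizeof(ccp_unit_sizes) / sizeof(ccp_unit_sizes[0]);

	for (i = 0; i < n; i++)
		if (ccp_unit_sizes[i] == unit_size)
			return true;
	return false;	/* not in the enumeration: fall back */
}
```

Matching against the enumeration directly, rather than range-checking, is what keeps the acceptable values in lockstep with what the hardware actually implements.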
Vet the key using the available standard function.
Signed-off-by: Gary R Hook
---
drivers/crypto/ccp/ccp-crypto-aes-xts.c |  9 -
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
Version 5 CCPs have some new requirements for XTS-AES: the type field
must be specified, and the key requires 512 bits, with each part
occupying 256 bits and padded with zeroes.
Signed-off-by: Gary R Hook
---
drivers/crypto/ccp/ccp-dev-v5.c |2 ++
drivers/crypto/ccp/ccp-dev.h
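The 512-bit key layout described above can be sketched as copying each key half into its own 256-bit slot and zero-padding the remainder. Offsets and names here are illustrative, not taken from the CCP hardware spec:

```c
#include <string.h>

#define CCP5_XTS_KEY_VEC_LEN 64	/* 512 bits total */
#define CCP5_XTS_PART_LEN    32	/* each part occupies 256 bits */

/* Build the v5 key vector: the cipher-key half and the tweak-key half
 * each occupy a 256-bit slot, zero-padded when the key is shorter
 * (e.g. AES-128, where half_len is 16). */
static void ccp5_build_xts_key(unsigned char vec[CCP5_XTS_KEY_VEC_LEN],
			       const unsigned char *key,
			       unsigned int half_len)
{
	memset(vec, 0, CCP5_XTS_KEY_VEC_LEN);
	memcpy(vec, key, half_len);			/* cipher key half */
	memcpy(vec + CCP5_XTS_PART_LEN,			/* tweak key half  */
	       key + half_len, half_len);
}
```

This is also why the v2 changelog earlier in the series mentions expanding the key buffer: the buffer must hold the full 512-bit vector even for 128-bit keys.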
Modify the Kconfig help text to reflect the fact that random data from
hwrng is fed into the kernel random number generator's entropy pool.
Signed-off-by: PrasannaKumar Muralidharan
---
drivers/char/hw_random/Kconfig | 6 ++
1 file changed, 2 insertions(+), 4
On 07/17/2017 04:48 PM, Lendacky, Thomas wrote:
On 7/17/2017 3:08 PM, Gary R Hook wrote:
Version 5 CCPs have differing requirements for XTS-AES: key components
are stored in a 512-bit vector. The context must be little-endian
justified. AES-256 is supported now, so propagate the cipher size to
This is a followup to 'crypto: scompress - eliminate percpu scratch buffers',
which attempted to replace the scompress per-CPU buffer entirely, but as
Herbert pointed out, this is not going to fly in the targeted use cases.
Instead, move the alloc/free of the buffers into the tfm init/exit hooks,
The scompress code allocates 2 x 128 KB of scratch buffers for each CPU,
so that clients of the async API can use synchronous implementations
even from atomic context. However, on systems such as Cavium Thunderx
(which has 96 cores), this adds up to a non-negligible 24 MB. Also,
32-bit systems may
When allocating the per-CPU scratch buffers, we allocate the source
and destination buffers separately, but bail immediately if the second
allocation fails, without freeing the first one. Fix that.
Signed-off-by: Ard Biesheuvel
---
crypto/scompress.c | 5 -
1 file
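The leak pattern and its fix can be shown in stand-alone C. The kernel code allocates with vmalloc_node() rather than malloc(), but the unwinding logic is the same:

```c
#include <stdlib.h>

#define SCRATCH_SIZE (128 * 1024)

struct scratch {
	void *src;
	void *dst;
};

/* Allocate both scratch buffers, or neither: if the second allocation
 * fails, release the first before reporting failure. This mirrors the
 * leak fixed in crypto/scompress.c, where the source buffer was not
 * freed when the destination allocation failed. */
static int scratch_alloc(struct scratch *s)
{
	s->src = malloc(SCRATCH_SIZE);
	if (!s->src)
		return -1;

	s->dst = malloc(SCRATCH_SIZE);
	if (!s->dst) {
		free(s->src);	/* do not leak the source buffer */
		s->src = NULL;
		return -1;
	}
	return 0;
}

static void scratch_free(struct scratch *s)
{
	free(s->src);
	free(s->dst);
	s->src = s->dst = NULL;
}
```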
Due to the use of per-CPU buffers, scomp_acomp_comp_decomp() executes
with preemption disabled, and so whether the CRYPTO_TFM_REQ_MAY_SLEEP
flag is set is irrelevant, since we cannot sleep anyway. So disregard
the flag, and use GFP_ATOMIC unconditionally.
Cc: # v4.10+
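The flag-handling argument above reduces to a one-liner; the sketch below uses stand-in values for flags that really come from the kernel's gfp and crypto headers:

```c
/* Stand-in values for illustration only; the real flags are defined in
 * <linux/gfp.h> and <linux/crypto.h>. */
#define GFP_ATOMIC		 0x01u
#define GFP_KERNEL		 0x02u
#define CRYPTO_TFM_REQ_MAY_SLEEP 0x100u

/* With preemption disabled around the per-CPU scratch buffers, sleeping
 * is never permitted, so the MAY_SLEEP hint is irrelevant and the
 * allocation must use GFP_ATOMIC unconditionally. */
static unsigned int scomp_gfp_flags(unsigned int req_flags)
{
	(void)req_flags;	/* CRYPTO_TFM_REQ_MAY_SLEEP is ignored here */
	return GFP_ATOMIC;
}
```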
Am Freitag, 21. Juli 2017, 17:09:11 CEST schrieb Arnd Bergmann:
Hi Arnd,
> On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller
wrote:
> > Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
> >> Um, the timer is the largest number of interrupts on my system.
On Fri, Jul 21, 2017 at 10:57 AM, Stephan Müller wrote:
> Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
>> Um, the timer is the largest number of interrupts on my system. Compare:
>>
>> CPU0 CPU1 CPU2 CPU3
>> LOC:6396552
On 07/21/2017 09:47 AM, Theodore Ts'o wrote:
On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote:
Better, but obviously there is still much room for improvement by reducing
the number of calls to RDRAND.
Hmm, is there some way we can easily tell we are running on Ryzen? Or
do we
On Fri, Jul 21, 2017 at 01:39:13PM +0200, Oliver Mangold wrote:
> Better, but obviously there is still much room for improvement by reducing
> the number of calls to RDRAND.
Hmm, is there some way we can easily tell we are running on Ryzen? Or
do we believe this is going to be true for all AMD
On 21 July 2017 at 14:44, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote:
>>
>> >> - Would you mind a patch that makes the code only use the per-CPU
>> >> buffers if we are running atomically to begin with?
>> >
>> > That would mean
On Fri, Jul 21, 2017 at 02:42:20PM +0100, Ard Biesheuvel wrote:
>
> >> - Would you mind a patch that makes the code only use the per-CPU
> >> buffers if we are running atomically to begin with?
> >
> > That would mean dropping the first packet so no.
> >
>
> I think you misunderstood me: the
On 21 July 2017 at 14:31, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote:
>>
>> OK, but that doesn't really answer any of my questions:
>> - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually
>> exclusive with
On Fri, Jul 21, 2017 at 02:24:02PM +0100, Ard Biesheuvel wrote:
>
> OK, but that doesn't really answer any of my questions:
> - Should we enforce that CRYPTO_ACOMP_ALLOC_OUTPUT is mutually
> exclusive with CRYPTO_TFM_REQ_MAY_SLEEP, or should
> crypto_scomp_sg_alloc() always use GFP_ATOMIC? We need
On 21 July 2017 at 14:24, Ard Biesheuvel wrote:
> On 21 July 2017 at 14:11, Herbert Xu wrote:
>> On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>>>
>>> Right. And is req->dst guaranteed to be assigned in that case? Because
On 21 July 2017 at 14:11, Herbert Xu wrote:
> On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>>
>> Right. And is req->dst guaranteed to be assigned in that case? Because
>> crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the
>>
On Fri, Jul 21, 2017 at 02:09:39PM +0100, Ard Biesheuvel wrote:
>
> Right. And is req->dst guaranteed to be assigned in that case? Because
> crypto_scomp_sg_alloc() happily allocates pages and kmalloc()s the
> scatterlist if req->dst == NULL.
>
> Is there any way we could make these scratch
On 21 July 2017 at 13:42, Herbert Xu wrote:
> On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote:
>> The scompress code unconditionally allocates 2 per-CPU scratch buffers
>> of 128 KB each, in order to avoid allocation overhead in the async
>> wrapper
On Thu, Jul 20, 2017 at 12:40:00PM +0100, Ard Biesheuvel wrote:
> The scompress code unconditionally allocates 2 per-CPU scratch buffers
> of 128 KB each, in order to avoid allocation overhead in the async
> wrapper that encapsulates the synchronous compression algorithm, since
> it may execute in
On Fri, Jul 21, 2017 at 3:12 AM, Oliver Mangold wrote:
> Hi,
>
> I was wondering why reading from /dev/urandom is much slower on Ryzen than
> on Intel, and did some analysis. It turns out that the RDRAND instruction is
> at fault, which takes much longer on AMD.
>
> if I read
On 21.07.2017 11:26, Jan Glauber wrote:
Nice catch. How much does the performance improve on Ryzen when you
use arch_get_random_int()?
Okay, now I have some results for you:
On Ryzen 1800X (using arch_get_random_int()):
---
# dd if=/dev/urandom of=/dev/null bs=1M status=progress
8751415296
Hi Ted,
Snipping one comment:
> Practically no one uses /dev/random. It's essentially a deprecated
> interface; the primary interfaces that have been recommended for well
> over a decade is /dev/urandom, and now, getrandom(2). We only need
> 384 bits of randomness every 5 minutes to reseed the
On Fri, Jul 21, 2017 at 09:12:01AM +0200, Oliver Mangold wrote:
> Hi,
>
> I was wondering why reading from /dev/urandom is much slower on
> Ryzen than on Intel, and did some analysis. It turns out that the
> RDRAND instruction is at fault, which takes much longer on AMD.
>
> if I read this
Am Freitag, 21. Juli 2017, 05:08:47 CEST schrieb Theodore Ts'o:
Hi Theodore,
> On Thu, Jul 20, 2017 at 09:00:02PM +0200, Stephan Müller wrote:
> > I concur with your rationale where de-facto the correlation is effect is
> > diminished and eliminated with the fast_pool and the minimal entropy
> >
Hi,
I was wondering why reading from /dev/urandom is much slower on Ryzen
than on Intel, and did some analysis. It turns out that the RDRAND
instruction is at fault, which takes much longer on AMD.
if I read this correctly:
--- drivers/char/random.c ---
862
On 07/20/2017 04:51 PM, Stephan Müller wrote:
Am Donnerstag, 20. Juli 2017, 15:44:31 CEST schrieb Lars Persson:
Hi Lars,
+static int
+artpec6_crypto_cipher_set_key(struct crypto_skcipher *cipher, const u8 *key,
+			      unsigned int keylen)
+{
+	struct