RE: [PATCH 01/10] crypto: aead - allow to allocate AEAD requests on the stack

2018-05-02 Thread David Laight
From: Antoine Tenart > Sent: 02 May 2018 10:57 > Adds the AEAD_REQUEST_ON_STACK primitive to allow allocating AEAD > requests on the stack, as it can already be done with various other > crypto algorithms within the kernel. > > Signed-off-by: Antoine Tenart > --- >

RE: [PATCH v2 0/2] crypto: removing various VLAs

2018-04-11 Thread David Laight
From: Salvatore Mesoraca > Sent: 09 April 2018 17:38 ... > > You can also do much better than allocating MAX_BLOCKSIZE + MAX_ALIGNMASK > > bytes by requesting 'long' aligned on-stack memory. > > The easiest way is to define a union like: > > > > union crypto_tmp { > > u8
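The union suggestion quoted above is cut off in this listing, so the following is only a minimal sketch of the general idea, not the definition from the mail. The names crypto_tmp_sketch, TMP_MAX_BLOCKSIZE and xor_block_aligned are hypothetical, and the size of 16 is chosen purely for illustration. Placing a 'long' member in the union makes the compiler align the byte buffer for 'long', so no extra alignmask bytes need to be reserved on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical limit, chosen for illustration only. */
#define TMP_MAX_BLOCKSIZE 16

/* The 'long' member forces the whole union, and therefore buf, to be
 * at least long-aligned. */
union crypto_tmp_sketch {
	uint8_t buf[TMP_MAX_BLOCKSIZE];
	long align;
};

static_assert(_Alignof(union crypto_tmp_sketch) >= _Alignof(long),
	      "union must be at least long-aligned");

/* XOR src into dst through a long-aligned on-stack temporary.
 * Returns -1 if len is too large, otherwise 1 when the temporary
 * was long-aligned and 0 when it was not. */
int xor_block_aligned(uint8_t *dst, const uint8_t *src, size_t len)
{
	union crypto_tmp_sketch tmp;
	size_t i;

	if (len > sizeof(tmp.buf))
		return -1;

	memcpy(tmp.buf, src, len);
	for (i = 0; i < len; i++)
		dst[i] ^= tmp.buf[i];

	return ((uintptr_t)tmp.buf % _Alignof(long)) == 0;
}
```

This only requests the alignment the compiler already guarantees for 'long' locals; it does not attempt larger alignments such as 16 bytes.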

RE: [PATCH v2 0/2] crypto: removing various VLAs

2018-04-09 Thread David Laight
From: Salvatore Mesoraca > Sent: 09 April 2018 14:55 > > v2: > As suggested by Herbert Xu, the blocksize and alignmask checks > have been moved to crypto_check_alg. > So, now, all the other separate checks are not necessary. > Also, the defines have been moved to

RE: [PATCH] crypto: ctr: avoid VLA use

2018-03-15 Thread David Laight
From: Eric Biggers > Sent: 14 March 2018 18:32 ... > Also, I recall there being a long discussion a while back about how > __aligned(16) doesn't work on local variables because the kernel's stack > pointer > isn't guaranteed to maintain the alignment assumed by the compiler (see commit >

RE: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage

2017-12-19 Thread David Laight
From: Juergen Gross > Sent: 19 December 2017 08:05 .. > > Exchanging 2 registers can be done without memory access via: > > xor reg1, reg2 > xor reg2, reg1 > xor reg1, reg2 That'll generate horrid data dependencies. ISTR that there are some optimisations for the stack, so even 'push reg1', 'mov
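The data-dependency point can be seen in a C rendering of the two approaches. This is an illustrative sketch only (swap_xor and swap_tmp are hypothetical names, and a compiler is free to turn either form into something else): in the XOR version each statement consumes the result of the previous one, giving a serial chain of three operations, whereas the temporary version consists of independent moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR swap: every step depends on the result of the step before it. */
void swap_xor(uint32_t *a, uint32_t *b)
{
	assert(a != NULL && b != NULL);
	if (a == b)
		return;		/* XOR-swapping an object with itself would zero it */
	*a ^= *b;
	*b ^= *a;
	*a ^= *b;
}

/* Swap through a temporary: the loads and stores do not form a
 * three-deep arithmetic dependency chain. */
void swap_tmp(uint32_t *a, uint32_t *b)
{
	uint32_t t;

	assert(a != NULL && b != NULL);
	t = *a;
	*a = *b;
	*b = t;
}
```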

RE: [PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function

2017-04-18 Thread David Laight
From: Logan Gunthorpe > Sent: 13 April 2017 23:05 > Straightforward conversion to the new helper, except due to > the lack of error path, we have to warn if unmappable memory > is ever present in the sgl. > > Signed-off-by: Logan Gunthorpe > --- >

RE: [PATCH 1/4] crypto: powerpc - Factor out the core CRC vpmsum algorithm

2017-03-16 Thread David Laight
From: Daniel Axtens > Sent: 15 March 2017 22:30 > Hi David, > > > While not part of this change, the unrolled loops look as though > > they just destroy the cpu cache. > > I'd like to be convinced that anything does CRC over long enough buffers > > to make it a gain at all. > > > > With modern (not

RE: [PATCH 1/4] crypto: powerpc - Factor out the core CRC vpmsum algorithm

2017-03-15 Thread David Laight
From: Linuxppc-dev Daniel Axtens > Sent: 15 March 2017 12:38 > The core nuts and bolts of the crc32c vpmsum algorithm will > also work for a number of other CRC algorithms with different > polynomials. Factor out the function into a new asm file. > > To handle multiple users of the function, a

RE: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread David Laight
From: Andy Lutomirski > Sent: 10 January 2017 23:25 > There are some hashes (e.g. sha224) that have some internal trickery > to make sure that only the correct number of output bytes are > generated. If something goes wrong, they could potentially overrun > the output buffer. > > Make the test

RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-19 Thread David Laight
From: George Spelvin > Sent: 17 December 2016 15:21 ... > uint32_t > hsiphash24(char const *in, size_t len, uint32_t const key[2]) > { > uint32_t c = key[0]; > uint32_t d = key[1]; > uint32_t a = 0x6c796765 ^ 0x736f6d65; > uint32_t b = d ^ 0x74656462 ^ 0x646f7261; I've

RE: [PATCH v5 1/4] siphash: add cryptographically secure PRF

2016-12-16 Thread David Laight
From: George Spelvin > Sent: 15 December 2016 23:29 > > If a halved version of SipHash can bring significant performance boost > > (with 32b words instead of 64b words) with an acceptable security level > > (64-bit enough?) then we may design such a version. > > I was thinking if the key could be

RE: [PATCH v5 2/4] siphash: add Nu{32,64} helpers

2016-12-16 Thread David Laight
From: Jason A. Donenfeld > Sent: 15 December 2016 20:30 > These restore parity with the jhash interface by providing high > performance helpers for common input sizes. ... > +#define PREAMBLE(len) \ > + u64 v0 = 0x736f6d6570736575ULL; \ > + u64 v1 = 0x646f72616e646f6dULL; \ > + u64 v2

RE: [PATCH v5 3/4] secure_seq: use SipHash in place of MD5

2016-12-16 Thread David Laight
From: Jason A. Donenfeld > Sent: 15 December 2016 20:30 > This gives a clear speed and security improvement. Siphash is both > faster and is more solid crypto than the aging MD5. > > Rather than manually filling MD5 buffers, for IPv6, we simply create > a layout by a simple anonymous struct, for

RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa > Sent: 15 December 2016 14:57 > On 15.12.2016 14:56, David Laight wrote: > > From: Hannes Frederic Sowa > >> Sent: 15 December 2016 12:50 > >> On 15.12.2016 13:28, David Laight wrote: > >>> From: Hannes Frederi

RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:50 > On 15.12.2016 13:28, David Laight wrote: > > From: Hannes Frederic Sowa > >> Sent: 15 December 2016 12:23 > > ... > >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 >

RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:23 ... > Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > bytes on 32 bit. Do you question that? Yes. The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). David
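The alignment claim can be probed with a small test program. This is a sketch (struct probe and u64_member_offset are hypothetical names): the offset of a 64-bit member that follows a single byte shows how much alignment the ABI demands for it inside a structure. Built for 32-bit x86 Linux with gcc -m32 the function is expected to return 4, while on x86-64 it returns 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 8, "uint64_t must be 8 bytes wide");

/* One byte followed by a 64-bit value: the padding the compiler
 * inserts between them reveals the member's required alignment. */
struct probe {
	uint8_t pad;
	uint64_t value;
};

/* Offset of the 64-bit member: 4 where the ABI only requires 32-bit
 * alignment for u64, 8 where it requires natural alignment. */
size_t u64_member_offset(void)
{
	return offsetof(struct probe, value);
}
```

Code that needs 8-byte alignment on every architecture therefore has to ask for it explicitly rather than rely on the type.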

RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Hannes Frederic Sowa > Sent: 14 December 2016 22:03 > On 14.12.2016 13:46, Jason A. Donenfeld wrote: > > Hi David, > > > > On Wed, Dec 14, 2016 at 10:56 AM, David Laight <david.lai...@aculab.com> > > wrote: > >> ... > >>> +u64

RE: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Linus Torvalds > Sent: 15 December 2016 00:11 > On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld wrote: > > > > Or does your reasonable dislike of "word" still allow for the use of > > dword and qword, so that the current function names of: > > dword really is confusing

RE: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long

2016-12-15 Thread David Laight
From: Jason A. Donenfeld > Sent: 14 December 2016 18:46 ... > + ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), > end), If you make the first argument 'const void *' you won't need the cast on every call. I'd also suggest making the key u64[2]. David
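A sketch of the suggested signature follows. toy_hash is a hypothetical stand-in that only mixes bytes and is NOT SipHash, and struct combined_sketch is an invented layout, not the one from the patch. The point is limited to the prototype: with a 'const void *' first argument any object pointer converts implicitly, so callers need no cast, and the key is passed as two 64-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy byte mixer standing in for a real keyed hash; NOT SipHash. */
uint64_t toy_hash(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	uint64_t h = key[0] ^ key[1];
	size_t i;

	for (i = 0; i < len; i++)
		h = h * 31 + p[i];
	return h;
}

/* Hypothetical layout: everything before 'end' is hashed. */
struct combined_sketch {
	uint32_t a;
	uint32_t b;
	uint32_t end;
};

static_assert(offsetof(struct combined_sketch, end) == 8,
	      "no padding expected before 'end'");

/* No (u8 *) cast is needed at the call site. */
uint64_t hash_combined(const struct combined_sketch *c, const uint64_t key[2])
{
	return toy_hash(c, offsetof(struct combined_sketch, end), key);
}
```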

RE: [PATCH] keys/encrypted: Fix two crypto-on-the-stack bugs

2016-12-13 Thread David Laight
From: Andy Lutomirski > Sent: 12 December 2016 20:53 > The driver put a constant buffer of all zeros on the stack and > pointed a scatterlist entry at it in two places. This doesn't work > with virtual stacks. Use a static 16-byte buffer of zeros instead. ... I didn't think you could dma from

RE: [PATCH] crypto: vmx - Ignore generated files

2016-07-20 Thread David Laight
From: Paulo Flabiano Smorigo > Sent: 19 July 2016 14:36 > Ignore assembly files generated by the perl script. ... > diff --git a/drivers/crypto/vmx/.gitignore b/drivers/crypto/vmx/.gitignore > new file mode 100644 > index 000..af4a7ce > --- /dev/null > +++ b/drivers/crypto/vmx/.gitignore > @@

RE: [PATCH 1/2] crypto: vmx - Adding asm subroutines for XTS

2016-07-12 Thread David Laight
From: Paulo Flabiano Smorigo > Sent: 11 July 2016 20:08 > > This patch adds XTS subroutines using the VMX-crypto driver. > > It gives a boost of 20 times using XTS. > > This code has been adopted from the OpenSSL project in collaboration > with the original author (Andy Polyakov).

RE: ipsec impact on performance

2015-12-02 Thread David Laight
From: Sowmini Varadhan > Sent: 01 December 2015 18:37 ... > I was using esp-null merely to not have the crypto itself perturb > the numbers (i.e., just focus on the s/w overhead for now), but here > are the numbers for the stock linux kernel stack > Gbps peak cpu util > esp-null

RE: ipsec impact on performance

2015-12-02 Thread David Laight
From: Sowmini Varadhan > Sent: 02 December 2015 12:12 > On (12/02/15 11:56), David Laight wrote: > > > Gbps peak cpu util > > > esp-null 1.8 71% > > > aes-gcm-c-256 1.6 79% > > > aes-ccm-a-128 0.7 96% > > >

RE: [PATCH v3 03/17] crypto: talitos - talitos_ptr renamed ptr for more lisibility

2015-04-17 Thread David Laight
From: Christophe Leroy Linux CodingStyle recommends using short names for local variables. ptr is just good enough for those 3-line functions. It helps keep single lines shorter than 80 characters. ... -static void to_talitos_ptr(struct talitos_ptr *talitos_ptr, dma_addr_t dma_addr)

RE: [PATCH v1 1/3] SHA1 for PPC/SPE - assembler

2015-02-25 Thread David Laight
From: Markus Stockhausen [PATCH v1 1/3] SHA1 for PPC/SPE - assembler This is the assembler code for SHA1 implementation with the SIMD SPE instruction set. With the enhanced instruction set we can operate on 2 32 bit words in parallel. That helps reducing the time to calculate W16-W79. For

RE: [PATCH v1 2/7] AES for PPC/SPE - aes tables

2015-02-16 Thread David Laight
From: Markus Stockhausen 4K AES tables for big endian I can't help feeling that you could give more information about how the values are generated. ... + * These big endian AES encryption/decryption tables are designed to be simply + * accessed by a combination of rlwimi/lwz instructions

RE: [PATCH 2/3] sha512: reduce stack usage to safe number

2012-01-16 Thread David Laight
Doesn't this badly overflow W[] .. +#define SHA512_0_15(i, a, b, c, d, e, f, g, h) \ + t1 = h + e1(e) + Ch(e, f, g) + sha512_K[i] + W[i]; \ ... + for (i = 0; i < 16; i += 8) { ... + SHA512_0_15(i + 7, b, c, d, e, f, g, h, a); + } David

RE: sha512: make it work, undo percpu message schedule

2012-01-13 Thread David Laight
Trying a dynamic memory allocation, and fallback on a single pre-allocated block of memory, shared by all cpus, protected by a spinlock ... - + static u64 msg_schedule[80]; + static DEFINE_SPINLOCK(msg_schedule_lock); int i; - u64 *W = get_cpu_var(msg_schedule); +