Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Andy Lutomirski
On Wed, Jan 11, 2017 at 11:05 PM, Herbert Xu wrote: > On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: >> >> I'm pretty sure we have random asm code that may not maintain a >> 16-byte stack alignment when it calls other code (including, in some >>

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Andy Lutomirski
On Wed, Jan 11, 2017 at 11:47 PM, Herbert Xu wrote: > Andy Lutomirski wrote: >> There are some hashes (e.g. sha224) that have some internal trickery >> to make sure that only the correct number of output bytes are >> generated. If something goes

Re: arm64 broken

2017-01-11 Thread Herbert Xu
Rob Rice wrote: > I’m working on updating a patchset. The master branch in crypto-2.6 doesn’t > compile for ARM64. The first couple errors are listed below. A colleague > believes that the following commit in rc2 fixes the problem. I presume you mean cryptodev and not

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Herbert Xu
Andy Lutomirski wrote: > There are some hashes (e.g. sha224) that have some internal trickery > to make sure that only the correct number of output bytes are > generated. If something goes wrong, they could potentially overrun > the output buffer. > > Make the test more robust
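A minimal userspace sketch of the failure mode described above, not the testmgr patch itself: a hash that emits more bytes than its digest size writes past a buffer sized for the digest. Guard bytes stand in here for the exact-size allocation named in the subject line; `fake_hash`, `CANARY`, and `GUARD_LEN` are hypothetical names used only for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define CANARY    0xA5  /* hypothetical guard pattern */
#define GUARD_LEN 16    /* hypothetical guard size in bytes */

/* Hypothetical stand-in for a hash: writes `written` bytes of output. */
static void fake_hash(unsigned char *out, size_t written)
{
    memset(out, 0x11, written);
}

/*
 * Returns 1 if the hash stayed within digest_size bytes, 0 if it touched
 * the guard area, -1 on allocation failure or out-of-range arguments.
 */
static int output_stays_in_bounds(size_t digest_size, size_t written)
{
    unsigned char *buf;
    size_t i;
    int ok = 1;

    /* Keep the simulated overrun inside our own allocation. */
    if (written > digest_size + GUARD_LEN)
        return -1;

    buf = malloc(digest_size + GUARD_LEN);
    if (!buf)
        return -1;

    memset(buf, 0, digest_size);
    memset(buf + digest_size, CANARY, GUARD_LEN);

    fake_hash(buf, written);

    /* Any changed guard byte means the output overran the digest area. */
    for (i = 0; i < GUARD_LEN; i++) {
        if (buf[digest_size + i] != CANARY)
            ok = 0;
    }

    free(buf);
    return ok;
}
```

With a 28-byte digest such as sha224, `output_stays_in_bounds(28, 28)` reports a clean run, while `output_stays_in_bounds(28, 32)` flags the four stray bytes.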

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Ingo Molnar
* Herbert Xu wrote: > On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: > > > > I'm pretty sure we have random asm code that may not maintain a > > 16-byte stack alignment when it calls other code (including, in some > > cases, calling C code). > > >

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Ingo Molnar
* Andy Lutomirski wrote: > I find it rather annoying that gcc before 4.8 malfunctions when it > sees __aligned__(16) on x86_64 kernels. Sigh. Ran into this when writing silly FPU in-kernel testcases a couple of months ago... Thanks, Ingo

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Herbert Xu
On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: > > I'm pretty sure we have random asm code that may not maintain a > 16-byte stack alignment when it calls other code (including, in some > cases, calling C code). > > So I'm not at all convinced that this is a good idea. We

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Andy Lutomirski
On Tue, Jan 10, 2017 at 10:01 PM, Andy Lutomirski wrote: > On Tue, Jan 10, 2017 at 8:35 PM, Herbert Xu > wrote: >> On Tue, Jan 10, 2017 at 08:17:17PM -0800, Linus Torvalds wrote: >>> >>> That said, I do think that the "don't assume stack

Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers

2017-01-11 Thread Herbert Xu
On Tue, Jan 10, 2017 at 02:36:21AM +0100, Stephan Müller wrote: > > to all driver maintainers: the patches I added are compile tested, but > I do not have the hardware to verify the code. May I ask the respective > hardware maintainers to verify that the code is appropriate and works > as

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Herbert Xu
On Tue, Jan 10, 2017 at 05:30:48PM +, Ard Biesheuvel wrote: > > Apologies for introducing this breakage. It seemed like an obvious and > simple cleanup, so I didn't even bother to mention it in the commit > log, but if the kernel does not guarantee 16 byte alignment, I guess > we should revert

arm64 broken

2017-01-11 Thread Rob Rice
I’m working on updating a patchset. The master branch in crypto-2.6 doesn’t compile for ARM64. The first couple errors are listed below. A colleague believes that the following commit in rc2 fixes the problem. commit b4b8664d291ac1998e0f0bcdc96b6397f0fe68b3 Author: Al Viro

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Andy Lutomirski
On Wed, Jan 11, 2017 at 12:09 AM, Herbert Xu wrote: > On Wed, Jan 11, 2017 at 08:06:54AM +, Ard Biesheuvel wrote: >> >> Couldn't we update the __aligned(x) macro to emit 32 if arch == x86 >> and x == 16? All other cases should work just fine afaict > > Not

Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-11 Thread Andy Lutomirski
On Wed, Jan 11, 2017 at 1:09 AM, Daniel Borkmann wrote: > Hi Andy, > > On 01/11/2017 04:11 AM, Andy Lutomirski wrote: >> >> On Tue, Jan 10, 2017 at 4:50 PM, Daniel Borkmann >> wrote: >>> >>> On 01/11/2017 12:24 AM, Andy Lutomirski wrote:

Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread Andy Lutomirski
On Wed, Jan 11, 2017 at 7:13 AM, David Laight wrote: > From: Andy Lutomirski >> Sent: 10 January 2017 23:25 >> There are some hashes (e.g. sha224) that have some internal trickery >> to make sure that only the correct number of output bytes are >> generated. If something

[PATCH v2 7/7] crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64

2017-01-11 Thread Ard Biesheuvel
This is a reimplementation of the NEON version of the bit-sliced AES algorithm. This code is heavily based on Andy Polyakov's OpenSSL version for ARM, which is also available in the kernel. This is an alternative for the existing NEON implementation for arm64 authored by me, which suffers from

[PATCH v2 4/7] crypto: arm64/aes - add scalar implementation

2017-01-11 Thread Ard Biesheuvel
This adds a scalar implementation of AES, based on the precomputed tables that are exposed by the generic AES code. Since rotates are cheap on arm64, this implementation only uses the 4 core tables (of 1 KB each), and avoids the prerotated ones, reducing the D-cache footprint by 75%. On

[PATCH v2 5/7] crypto: arm/aes - replace scalar AES cipher

2017-01-11 Thread Ard Biesheuvel
This replaces the scalar AES cipher that originates in the OpenSSL project with a new implementation that is ~15% (*) faster (on modern cores), and reuses the lookup tables and the key schedule generation routines from the generic C implementation (which is usually compiled in anyway due to

[PATCH v2 2/7] crypto: arm/chacha20 - implement NEON version based on SSE3 code

2017-01-11 Thread Ard Biesheuvel
This is a straight port to ARM/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel --- arch/arm/crypto/Kconfig |

[PATCH v2 1/7] crypto: arm64/chacha20 - implement NEON version based on SSE3 code

2017-01-11 Thread Ard Biesheuvel
This is a straight port to arm64/NEON of the x86 SSE3 implementation of the ChaCha20 stream cipher. It uses the new skcipher walksize attribute to process the input in strides of 4x the block size. Signed-off-by: Ard Biesheuvel --- arch/arm64/crypto/Kconfig

[PATCH v2 3/7] crypto: arm64/aes-blk - expose AES-CTR as synchronous cipher as well

2017-01-11 Thread Ard Biesheuvel
In addition to wrapping the AES-CTR cipher into the async SIMD wrapper, which exposes it as an async skcipher that defers processing to process context, expose our AES-CTR implementation directly as a synchronous cipher as well, but with a lower priority. This makes the AES-CTR transform usable

[PATCH v2 0/7] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11

2017-01-11 Thread Ard Biesheuvel
This adds ARM and arm64 implementations of ChaCha20, scalar AES and SIMD AES (using bit slicing). The SIMD algorithms in this series take advantage of the new skcipher walksize attribute to iterate over the input in the most efficient manner possible. Patch #1 adds a NEON implementation of

RE: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests

2017-01-11 Thread David Laight
From: Andy Lutomirski > Sent: 10 January 2017 23:25 > There are some hashes (e.g. sha224) that have some internal trickery > to make sure that only the correct number of output bytes are > generated. If something goes wrong, they could potentially overrun > the output buffer. > > Make the test

Re: [RFC PATCH v2] crypto: Add IV generation algorithms

2017-01-11 Thread Ondrej Mosnáček
Hi Binoy, 2016-12-13 9:49 GMT+01:00 Binoy Jayan : > Currently, the iv generation algorithms are implemented in dm-crypt.c. > The goal is to move these algorithms from the dm layer to the kernel > crypto layer by implementing them as template ciphers so they can be >

[PATCH 2/2] crypto: mediatek - fix format string for 64-bit builds

2017-01-11 Thread Arnd Bergmann
After I enabled COMPILE_TEST for non-ARM targets, I ran into these warnings: crypto/mediatek/mtk-aes.c: In function 'mtk_aes_info_map': crypto/mediatek/mtk-aes.c:224:28: error: format '%d' expects argument of type 'int', but argument 3 has type 'long unsigned int' [-Werror=format=]
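A hedged sketch of the class of warning quoted above, assuming the offending argument is a `sizeof` expression (the truncated snippet does not confirm this). On 64-bit targets `size_t` is typically `unsigned long`, so `%d` no longer matches; `%zu` is the portable conversion for `size_t`. The struct `example_info` and the function `format_size` are hypothetical and are not taken from the mediatek driver.

```c
#include <stdio.h>

/* Hypothetical structure; only its size matters for this example. */
struct example_info {
    unsigned int cmd[4];
    unsigned int state[12];
};

/*
 * Formats the structure size into `out`. Using "%d" here would trigger
 * -Wformat on 64-bit builds because sizeof yields a size_t, not an int.
 * Returns the value snprintf returns (the length that was needed).
 */
static int format_size(char *out, size_t outlen)
{
    return snprintf(out, outlen, "info size: %zu",
                    sizeof(struct example_info));
}
```

The same format string compiles cleanly on both 32-bit and 64-bit builds, which is the property a COMPILE_TEST-enabled driver needs.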

[PATCH 1/2] crypto: mediatek - remove ARM dependencies

2017-01-11 Thread Arnd Bergmann
Building the mediatek driver on an older ARM architecture results in a harmless warning: warning: (ARCH_OMAP2PLUS_TYPICAL && CRYPTO_DEV_MEDIATEK) selects NEON which has unmet direct dependencies (VFPv3 && CPU_V7) We could add an explicit dependency on CPU_V7, but it seems nicer to open up the

Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT

2017-01-11 Thread Stephan Müller
On Wednesday, 11 January 2017, 16:58:17 CET, George Cherian wrote: Hi George, > I will add a separate function for xts setkey and make changes as follows. > > ... > > > >> + > >> +struct crypto_alg algs[] = { { > >> + .cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC, > >> +

Re: [PATCH v2] crypto: x86/chacha20 - Manually align stack buffer

2017-01-11 Thread Ard Biesheuvel
On 11 January 2017 at 12:28, Herbert Xu wrote: > On Wed, Jan 11, 2017 at 12:14:24PM +, Ard Biesheuvel wrote: >> >> I think the old code was fine, actually: >> >> u32 *state, state_buf[16 + (CHACHA20_STATE_ALIGN / sizeof(u32)) - 1]; >> >> ends up allocating 16 + 3

[PATCH v2] crypto: x86/chacha20 - Manually align stack buffer

2017-01-11 Thread Herbert Xu
On Wed, Jan 11, 2017 at 12:14:24PM +, Ard Biesheuvel wrote: > > I think the old code was fine, actually: > > u32 *state, state_buf[16 + (CHACHA20_STATE_ALIGN / sizeof(u32)) - 1]; > > ends up allocating 16 + 3 *words* == 64 + 12 bytes , which given the > guaranteed 4 byte alignment is
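The arithmetic quoted above (16 + 3 words, with 4-byte guaranteed alignment) can be checked with a small userspace sketch. This is an illustration of hand alignment in general, not the code from either version of the patch; `STATE_ALIGN` is a hypothetical stand-in for `CHACHA20_STATE_ALIGN`, assumed here to be 16, and `align_up_u32` and `aligned_state_fits` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STATE_WORDS 16  /* 16 x 32-bit words = 64 bytes of state */
#define STATE_ALIGN 16  /* assumed value of CHACHA20_STATE_ALIGN */
#define BUF_WORDS   (STATE_WORDS + STATE_ALIGN / sizeof(uint32_t) - 1)

/* Round a pointer up to the next multiple of `align` (a power of two). */
static uint32_t *align_up_u32(uint32_t *p, uintptr_t align)
{
    uintptr_t addr = (uintptr_t)p;

    return (uint32_t *)((addr + align - 1) & ~(align - 1));
}

/*
 * Over-allocate by (STATE_ALIGN / 4 - 1) words, then align by hand.
 * A u32 array is at least 4-byte aligned, so at most 12 bytes (3 words)
 * are skipped to reach a 16-byte boundary. Returns 1 if the aligned
 * 16-word window still fits inside the buffer.
 */
static int aligned_state_fits(void)
{
    uint32_t state_buf[BUF_WORDS];
    uint32_t *state = align_up_u32(state_buf, STATE_ALIGN);

    assert(((uintptr_t)state % STATE_ALIGN) == 0);
    memset(state, 0, STATE_WORDS * sizeof(uint32_t));

    return state + STATE_WORDS <= state_buf + BUF_WORDS;
}
```

Under these assumptions the buffer holds 19 words (76 bytes), and the worst-case skip of 12 bytes leaves exactly 64 bytes for the state.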

Re: crypto: x86/chacha20 - Manually align stack buffer

2017-01-11 Thread Ard Biesheuvel
On 11 January 2017 at 12:08, Herbert Xu wrote: > The kernel on x86-64 cannot use gcc attribute align to align to > a 16-byte boundary. This patch reverts to the old way of aligning > it by hand. > > Incidentally the old way was actually broken in not allocating >

crypto: x86/chacha20 - Manually align stack buffer

2017-01-11 Thread Herbert Xu
The kernel on x86-64 cannot use gcc attribute align to align to a 16-byte boundary. This patch reverts to the old way of aligning it by hand. Incidentally the old way was actually broken in not allocating enough space and would silently corrupt the stack. This patch fixes it by allocating an

Crypto Fixes for 4.10

2017-01-11 Thread Herbert Xu
Hi Linus: This push fixes a regression in aesni that renders it useless if it's built-in with a modular pcbc configuration. Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus Herbert Xu (1): crypto: aesni - Fix failure when built-in with modular

Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT

2017-01-11 Thread George Cherian
Hi Stephan, Thanks for pointing it out!! On 01/11/2017 04:42 PM, Stephan Müller wrote: On Wednesday, 11 January 2017, 10:56:50 CET, George Cherian wrote: Hi George, +int cvm_enc_dec_setkey(struct crypto_ablkcipher *cipher, const u8 *key, + u32 keylen) +{ + struct

Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT

2017-01-11 Thread Stephan Müller
On Wednesday, 11 January 2017, 10:56:50 CET, George Cherian wrote: Hi George, > +int cvm_enc_dec_setkey(struct crypto_ablkcipher *cipher, const u8 *key, > +u32 keylen) > +{ > + struct crypto_tfm *tfm = crypto_ablkcipher_tfm(cipher); > + struct cvm_enc_ctx *ctx =

[PATCH v4 1/3] drivers: crypto: Add Support for Octeon-tx CPT Engine

2017-01-11 Thread George Cherian
Enable the Physical Function driver for the Cavium Crypto Engine (CPT) found in the Octeon-tx series of SoCs. CPT is the Cryptographic Acceleration Unit. CPT includes microcoded GigaCypher symmetric engines (SEs) and asymmetric engines (AEs). Signed-off-by: George Cherian

[PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT

2017-01-11 Thread George Cherian
Enable the CPT VF driver. CPT is the Cryptographic Acceleration Unit in the Octeon-tx series of processors. Signed-off-by: George Cherian Reviewed-by: David Daney --- drivers/crypto/cavium/cpt/Makefile | 3 +-

[PATCH v4 3/3] drivers: crypto: Enable CPT options crypto for build

2017-01-11 Thread George Cherian
Add the CPT options in crypto Kconfig and update the crypto Makefile Signed-off-by: George Cherian Reviewed-by: David Daney --- drivers/crypto/Kconfig | 1 + drivers/crypto/Makefile | 1 + 2 files changed, 2 insertions(+) diff --git

[PATCH v4 0/3] Add Support for Cavium Cryptographic Acceleration Unit

2017-01-11 Thread George Cherian
This series adds support for the Cavium Cryptographic Acceleration Unit (CPT). CPT is available in Cavium's Octeon-Tx SoC series. The series was tested with ecryptfs and dm-crypt for in

Re: [PATCH v2 7/8] net: Rename TCA*BPF_DIGEST to ..._SHA256

2017-01-11 Thread Daniel Borkmann
Hi Andy, On 01/11/2017 04:11 AM, Andy Lutomirski wrote: On Tue, Jan 10, 2017 at 4:50 PM, Daniel Borkmann wrote: On 01/11/2017 12:24 AM, Andy Lutomirski wrote: This makes it easier to add another digest algorithm down the road if needed. It also serves to force any

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Herbert Xu
On Wed, Jan 11, 2017 at 08:06:54AM +, Ard Biesheuvel wrote: > > Couldn't we update the __aligned(x) macro to emit 32 if arch == x86 > and x == 16? All other cases should work just fine afaict Not everyone uses that macro. You'd also need to add some checks to stop people from using the gcc

Re: x86-64: Maintain 16-byte stack alignment

2017-01-11 Thread Ard Biesheuvel
On 11 January 2017 at 06:53, Linus Torvalds wrote: > > > On Jan 10, 2017 8:36 PM, "Herbert Xu" wrote: > > > Sure we can ban the use of attribute aligned on stacks. But > what about indirect uses through structures? > > > It should be