Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Segher Boessenkool
On Fri, Jan 12, 2018 at 10:45:31PM +0100, Arnd Bergmann wrote:
> > I guess you could enable the _x routines whenever you use ubsan?  Ubsan
> > will cause much bigger code growth than the handful of insns in those
> > routines?
> 
> Right, that could work, too. My patch that Herbert merged intentionally
> used -Os also for non-UBSAN builds because it turned out to
> be much faster (see gcc PR83651),

"Much"?

-Os is *slower* with 8.0, 5% faster with 7.2, 4% faster with 7.1,
slower with 7.0 and 6.3.  Your numbers, #c1.

Anf this is the generic code of course, which is slow anyway (not to
mention insecure).

> but we could revert that back
> to the default and only use the -Os for UBSAN, essentially
> addressing only PR83356 but not PR83651.


Segher


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Arnd Bergmann
On Fri, Jan 12, 2018 at 10:41 PM, Segher Boessenkool
 wrote:
> On Fri, Jan 12, 2018 at 10:29:01PM +0100, Arnd Bergmann wrote:
>> On Fri, Jan 12, 2018 at 9:41 PM, Segher Boessenkool
>>  wrote:
>> > On Fri, Jan 12, 2018 at 08:43:21PM +0100, Arnd Bergmann wrote:
>> >> On Fri, Jan 12, 2018 at 5:39 PM, Segher Boessenkool
>>
>> >> We could theoretically work around it by turning that into
>> >> "#if defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) ||
>> >> defined(CONFIG_CRYPTO_AES)", but that seems rather ugly.
>> >>
>> >> My earlier patch already tried to be more specific, turning very
>> >> specific optimizations off rather than moving from -O2 to -Os,
>> >> but that turned out to lead to significantly worse performance,
>> >> where -Os improved performance slightly. Is there a way
>> >> to ask powerpc compilers to use mostly -Os but not the
>> >> specific thing that makes it link to _restgpr_31_x?
>> >
>> > There is no such thing, sorry.  Would be very hard to implement, and
>> > older compilers will never get it, so it won't help you anyway :-(
>>
>> We use -Os only for gcc-7.1 and higher, where it produces faster
>> code for AES and avoids running into
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
>>
>> > Maybe for now just enable it in crtsavres.S always, with a comment?
>> > That -Os workaround is hopefully not going to live long either...
>>
>> It depends on whether or how soon someone comes up with a
>> better fix for PR83356.
>> gcc-8.0.0 is currently not affected by it, so we could limit the
>> workaround (and the hack in crtsavres.S) to gcc-7-only.
>
> I guess you could enable the _x routines whenever you use ubsan?  Ubsan
> will cause much bigger code growth than the handful of insns in those
> routines?

Right, that could work, too. My patch that Herbert merged intentionally
used -Os also for non-UBSAN builds because it turned out to
be much faster (see gcc PR83651), but we could revert that back
to the default and only use the -Os for UBSAN, essentially
addressing only PR83356 but not PR83651.

   Arnd


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Segher Boessenkool
On Fri, Jan 12, 2018 at 10:29:01PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 9:41 PM, Segher Boessenkool
>  wrote:
> > On Fri, Jan 12, 2018 at 08:43:21PM +0100, Arnd Bergmann wrote:
> >> On Fri, Jan 12, 2018 at 5:39 PM, Segher Boessenkool
> 
> >> We could theoretically work around it by turning that into
> >> "#if defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) ||
> >> defined(CONFIG_CRYPTO_AES)", but that seems rather ugly.
> >>
> >> My earlier patch already tried to be more specific, turning very
> >> specific optimizations off rather than moving from -O2 to -Os,
> >> but that turned out to lead to significantly worse performance,
> >> where -Os improved performance slightly. Is there a way
> >> to ask powerpc compilers to use mostly -Os but not the
> >> specific thing that makes it link to _restgpr_31_x?
> >
> > There is no such thing, sorry.  Would be very hard to implement, and
> > older compilers will never get it, so it won't help you anyway :-(
> 
> We use -Os only for gcc-7.1 and higher, where it produces faster
> code for AES and avoids running into
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> 
> > Maybe for now just enable it in crtsavres.S always, with a comment?
> > That -Os workaround is hopefully not going to live long either...
> 
> It depends on whether or how soon someone comes up with a
> better fix for PR83356.
> gcc-8.0.0 is currently not affected by it, so we could limit the
> workaround (and the hack in crtsavres.S) to gcc-7-only.

I guess you could enable the _x routines whenever you use ubsan?  Ubsan
will cause much bigger code growth than the handful of insns in those
routines?


Segher


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Arnd Bergmann
On Fri, Jan 12, 2018 at 9:41 PM, Segher Boessenkool
 wrote:
> On Fri, Jan 12, 2018 at 08:43:21PM +0100, Arnd Bergmann wrote:
>> On Fri, Jan 12, 2018 at 5:39 PM, Segher Boessenkool

>> We could theoretically work around it by turning that into
>> "#if defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) ||
>> defined(CONFIG_CRYPTO_AES)", but that seems rather ugly.
>>
>> My earlier patch already tried to be more specific, turning very
>> specific optimizations off rather than moving from -O2 to -Os,
>> but that turned out to lead to significantly worse performance,
>> where -Os improved performance slightly. Is there a way
>> to ask powerpc compilers to use mostly -Os but not the
>> specific thing that makes it link to _restgpr_31_x?
>
> There is no such thing, sorry.  Would be very hard to implement, and
> older compilers will never get it, so it won't help you anyway :-(

We use -Os only for gcc-7.1 and higher, where it produces faster
code for AES and avoids running into
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

> Maybe for now just enable it in crtsavres.S always, with a comment?
> That -Os workaround is hopefully not going to live long either...

It depends on whether or how soon someone comes up with a
better fix for PR83356.
gcc-8.0.0 is currently not affected by it, so we could limit the
workaround (and the hack in crtsavres.S) to gcc-7-only.

 Arnd


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Segher Boessenkool
On Fri, Jan 12, 2018 at 08:43:21PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 5:39 PM, Segher Boessenkool
>  wrote:
> 
> >> or why the aes_generic implementation needs this on
> >> powerpc when built with 'gcc -Os'. FWIW, the -Os change was needed
> >> to work around a possible kernel stack overflow that can happen with
> >> gcc-7.2, see https://patchwork.kernel.org/patch/10143607/ and
> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> >
> > The _x versions are smaller but slower; that's why they are used with -Os.
> > Apparently nothing else was built with -Os (and the other needed flags)
> > before.
> 
> Ah, that explains it, the definition is in arch/powerpc/lib/crtsavres.S,
> but inside of #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE.

Ah ok.  Right.

> We could theoretically work around it by turning that into
> "#if defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) ||
> defined(CONFIG_CRYPTO_AES)", but that seems rather ugly.
> 
> My earlier patch already tried to be more specific, turning very
> specific optimizations off rather than moving from -O2 to -Os,
> but that turned out to lead to significantly worse performance,
> where -Os improved performance slightly. Is there a way
> to ask powerpc compilers to use mostly -Os but not the
> specific thing that makes it link to _restgpr_31_x?

There is no such thing, sorry.  Would be very hard to implement, and
older compilers will never get it, so it won't help you anyway :-(

Maybe for now just enable it in crtsavres.S always, with a comment?
That -Os workaround is hopefully not going to live long either...


Segher


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Arnd Bergmann
On Fri, Jan 12, 2018 at 5:39 PM, Segher Boessenkool
 wrote:

>> or why the aes_generic implementation needs this on
>> powerpc when built with 'gcc -Os'. FWIW, the -Os change was needed
>> to work around a possible kernel stack overflow that can happen with
>> gcc-7.2, see https://patchwork.kernel.org/patch/10143607/ and
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
>
> The _x versions are smaller but slower; that's why they are used with -Os.
> Apparently nothing else was built with -Os (and the other needed flags)
> before.

Ah, that explains it, the definition is in arch/powerpc/lib/crtsavres.S,
but inside of #ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE.

We could theoretically work around it by turning that into
"#if defined(CONFIG_CC_OPTIMIZE_FOR_SIZE) ||
defined(CONFIG_CRYPTO_AES)", but that seems rather ugly.

My earlier patch already tried to be more specific, turning very
specific optimizations off rather than moving from -O2 to -Os,
but that turned out to lead to significantly worse performance,
where -Os improved performance slightly. Is there a way
to ask powerpc compilers to use mostly -Os but not the
specific thing that makes it link to _restgpr_31_x?

   Arnd


Re: [PATCH v2] crypto: testmgr: change `guard` to unsigned char

2018-01-12 Thread Joey Pabalinas
On Fri, Jan 12, 2018 at 11:23:28PM +1100, Herbert Xu wrote:
> 
> Patch applied.  Thanks.

No problem, cheers.

-- 
Joey Pabalinas


signature.asc
Description: PGP signature


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Segher Boessenkool
Hi!

On Fri, Jan 12, 2018 at 03:55:47PM +0100, Arnd Bergmann wrote:
> >crypto/aes_generic.o: In function `crypto_aes_set_key':
> >>> aes_generic.c:(.text+0x4e0): undefined reference to `_restgpr_31_x'
> 
> adding linuxpcc-dev to Cc, maybe someone knows a way out of this.
> It appears related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43810

It is not.

> but I don't know what _restgpr_31_x actually does,

It restores GPR31 from stack, restores LR from stack, and returns
(_x is "exit").

> why it's not provided by the kernel

Because the kernel refuses to use libgcc.  Let's, uh, not start that
again?  :-)

> or why the aes_generic implementation needs this on
> powerpc when built with 'gcc -Os'. FWIW, the -Os change was needed
> to work around a possible kernel stack overflow that can happen with
> gcc-7.2, see https://patchwork.kernel.org/patch/10143607/ and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

The _x versions are smaller but slower; that's why they are used with -Os.
Apparently nothing else was built with -Os (and the other needed flags)
before.


Segher


[PATCH][next] hwrng: exynos: check for -ve error return from readl_poll_timeout

2018-01-12 Thread Colin King
From: Colin Ian King 

Currently, the return from readl_poll_timeout is being assigned to
a u32 and this is being checked for a -ve return which is always
false since a u32 cannot be less than zero.  Fix this by changing
val to an int so that error returns can be correctly detected.

Detected by CoverityScan, CID#1463776 ("Logically dead code")

Fixes: 6cd225cc5d8a ("hwrng: exynos - add Samsung Exynos True RNG driver")
Signed-off-by: Colin Ian King 
---
 drivers/char/hw_random/exynos-trng.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/exynos-trng.c 
b/drivers/char/hw_random/exynos-trng.c
index 34d6f51ecbee..f4643e3ec346 100644
--- a/drivers/char/hw_random/exynos-trng.c
+++ b/drivers/char/hw_random/exynos-trng.c
@@ -55,7 +55,7 @@ static int exynos_trng_do_read(struct hwrng *rng, void *data, 
size_t max,
   bool wait)
 {
struct exynos_trng_dev *trng;
-   u32 val;
+   int val;
 
max = min_t(size_t, max, (EXYNOS_TRNG_FIFO_LEN * 4));
 
-- 
2.15.1



[PATCH][next] staging: ccree: fix memory leaks in cc_ivgen_init

2018-01-12 Thread Colin King
From: Colin Ian King 

The current error exit path in function cc_ivgen_init via label
'out' free's resources from the drvdata->ivgen_handle context.
However, drvdata->ivgen_handle has not been assigned to the
context ivgen_ctx at this point, so the resources are not freed.
Fix this by setting drvdata->ivgen_handle to ivgen_ctx as early
as possible so that the clean up error exit return path can free
the resources.

Detected by CoveritScan, CID#1463795 ("Resource leak")

Signed-off-by: Colin Ian King 
---
 drivers/staging/ccree/cc_ivgen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/ccree/cc_ivgen.c b/drivers/staging/ccree/cc_ivgen.c
index 25a3131a93ce..c47f419b277b 100644
--- a/drivers/staging/ccree/cc_ivgen.c
+++ b/drivers/staging/ccree/cc_ivgen.c
@@ -178,6 +178,8 @@ int cc_ivgen_init(struct cc_drvdata *drvdata)
if (!ivgen_ctx)
return -ENOMEM;
 
+   drvdata->ivgen_handle = ivgen_ctx;
+
/* Allocate pool's header for initial enc. key/IV */
ivgen_ctx->pool_meta = dma_alloc_coherent(device, CC_IVPOOL_META_SIZE,
  _ctx->pool_meta_dma,
@@ -196,8 +198,6 @@ int cc_ivgen_init(struct cc_drvdata *drvdata)
goto out;
}
 
-   drvdata->ivgen_handle = ivgen_ctx;
-
return cc_init_iv_sram(drvdata);
 
 out:
-- 
2.15.1



Re: [PATCH 0/5] sha3 fixes and new implementation for arm64

2018-01-12 Thread Ard Biesheuvel
On 12 January 2018 at 13:15, Ard Biesheuvel  wrote:
> Add an implementation of SHA3 to arm64 using the new special instructions (#4)
>
> In preparation of that, fix a bug in the SHA3 and refactor it a bit so it
> can serve as a fallback for the other code. Also, add some new test vectors
> to get better test coverage.
>
> Ard Biesheuvel (5):
>   crypto/generic: sha3 - fixes for alignment and big endian operation
>   crypto/generic: sha3 - simplify code
>   crypto/generic: sha3 - export init/update/final routines
>   crypto/arm64: sha3 - new implementation based on special instructions

Forgot to mention: this is an RFT for patch #4, as it has not been
validated against a real implementation, only against my own QEMU
code.

>   crypto/testmgr: sha3 - add new testcases
>
>  arch/arm64/crypto/Kconfig|   6 +
>  arch/arm64/crypto/Makefile   |   3 +
>  arch/arm64/crypto/sha3-ce-core.S | 224 
>  arch/arm64/crypto/sha3-ce-glue.c | 156 ++
>  crypto/sha3_generic.c| 198 +++
>  crypto/testmgr.h | 550 
>  include/crypto/sha3.h|   6 +-
>  7 files changed, 1012 insertions(+), 131 deletions(-)
>  create mode 100644 arch/arm64/crypto/sha3-ce-core.S
>  create mode 100644 arch/arm64/crypto/sha3-ce-glue.c
>
> --
> 2.11.0
>


Re: [cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread Arnd Bergmann
On Fri, Jan 12, 2018 at 3:11 PM, kbuild test robot
 wrote:
> tree:   
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git 
> master
> head:   b40fa82cd6138350f723aa47b37e3e3e80906b40
> commit: 148b974deea927f5dbb6c468af2707b488bfa2de [130/134] crypto: 
> aes-generic - build with -Os on gcc-7+
> config: powerpc-linkstation_defconfig (attached as .config)
> compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> reproduce:
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout 148b974deea927f5dbb6c468af2707b488bfa2de
> # save the attached .config to linux build tree
> make.cross ARCH=powerpc
>
> All errors (new ones prefixed by >>):
>
>crypto/aes_generic.o: In function `crypto_aes_set_key':
>>> aes_generic.c:(.text+0x4e0): undefined reference to `_restgpr_31_x'

adding linuxpcc-dev to Cc, maybe someone knows a way out of this.
It appears related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43810
but I don't know what _restgpr_31_x actually does, why it's not provided
by the kernel or why the aes_generic implementation needs this on
powerpc when built with 'gcc -Os'. FWIW, the -Os change was needed
to work around a possible kernel stack overflow that can happen with
gcc-7.2, see https://patchwork.kernel.org/patch/10143607/ and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

 Arnd


[cryptodev:master 130/134] aes_generic.c:undefined reference to `_restgpr_31_x'

2018-01-12 Thread kbuild test robot
tree:   
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head:   b40fa82cd6138350f723aa47b37e3e3e80906b40
commit: 148b974deea927f5dbb6c468af2707b488bfa2de [130/134] crypto: aes-generic 
- build with -Os on gcc-7+
config: powerpc-linkstation_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout 148b974deea927f5dbb6c468af2707b488bfa2de
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   crypto/aes_generic.o: In function `crypto_aes_set_key':
>> aes_generic.c:(.text+0x4e0): undefined reference to `_restgpr_31_x'

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[RFC] AF_ALG AIO and IV

2018-01-12 Thread Stephan Mueller
Hi,

The kernel crypto API requires the caller to set an IV in the request data 
structure. That request data structure shall define one particular cipher 
operation. During the cipher operation, the IV is read by the cipher 
implementation and eventually the potentially updated IV (e.g. in case of CBC) 
is written back to the memory location the request data structure points to.

AF_ALG allows setting the IV with a sendmsg request, where the IV is stored in 
the AF_ALG context that is unique to one particular AF_ALG socket. Note the 
analogy: an AF_ALG socket is like a TFM where one recvmsg operation uses one 
request with the TFM from the socket.

AF_ALG these days supports AIO operations with multiple IOCBs. I.e. with one 
recvmsg call, multiple IOVECs can be specified. Each individual IOCB (derived 
from one IOVEC) implies that one request data structure is created with the 
data to be processed by the cipher implementation. The IV that was set with 
the sendmsg call is registered with the request data structure before the 
cipher operation.

In case of an AIO operation, the cipher operation invocation returns 
immediately, queuing the request to the hardware. While the AIO request is 
processed by the hardware, recvmsg processes the next IOVEC for which another 
request is created. Again, the IV buffer from the AF_ALG socket context is 
registered with the new request and the cipher operation is invoked.

You may now see that there is a potential race condition regarding the IV 
handling, because there is *no* separate IV buffer for the different requests. 
This is nicely demonstrated with libkcapi using the following command which 
creates an AIO request with two IOCBs each encrypting one AES block in CBC 
mode:

kcapi  -d 2 -x 9  -e -c "cbc(aes)" -k 
8d7dd9b0170ce0b5f2f8e1aa768e01e91da8bfc67fd486d081b28254c99eb423 -i 
7fbc02ebf5b93322329df9bfccb635af -p 48981da18e4bb9ef7e2e3162d16b1910

When the first AIO request finishes before the 2nd AIO request is processed, 
the returned value is:

8b19050f66582cb7f7e4b6c873819b7108afa0eaa7de29bac7d903576b674c32

I.e. two blocks where the IV output from the first request is the IV input to 
the 2nd block.

In case the first AIO request is not completed before the 2nd request 
commences, the result is two identical AES blocks (i.e. both use the same IV):

8b19050f66582cb7f7e4b6c873819b718b19050f66582cb7f7e4b6c873819b71

This inconsistent result may even lead to the conclusion that there can be a 
memory corruption in the IV buffer if both AIO requests write to the IV buffer 
at the same time.

This needs to be solved somehow. I see the following options which I would 
like to have vetted by the community.

1. Require that the cipher implementations serialize any AIO requests that 
have dependencies. I.e. for CBC, requests need to be serialized by the driver. 
For, say, ECB or XTS no serialization is necessary.

2. Change AF_ALG to require a per-request IV. This could be implemented by 
moving the IV submission via CMSG from sendmsg to recvmsg. I.e. the recvmsg 
code path would obtain the IV.

I would tend to favor option 2 as this requires code change at only location. 
If option 2 is considered, I would recommend to still allow setting the IV via 
sendmsg CMSG (to keep the interface stable). If, however, the caller provides 
an IV via recvmsg, this takes precedence.

If there are other options, please allow us to learn about them.

Ciao
Stephan




[PATCH 0/5] sha3 fixes and new implementation for arm64

2018-01-12 Thread Ard Biesheuvel
Add an implementation of SHA3 to arm64 using the new special instructions (#4)

In preparation of that, fix a bug in the SHA3 and refactor it a bit so it
can serve as a fallback for the other code. Also, add some new test vectors
to get better test coverage.

Ard Biesheuvel (5):
  crypto/generic: sha3 - fixes for alignment and big endian operation
  crypto/generic: sha3 - simplify code
  crypto/generic: sha3 - export init/update/final routines
  crypto/arm64: sha3 - new implementation based on special instructions
  crypto/testmgr: sha3 - add new testcases

 arch/arm64/crypto/Kconfig|   6 +
 arch/arm64/crypto/Makefile   |   3 +
 arch/arm64/crypto/sha3-ce-core.S | 224 
 arch/arm64/crypto/sha3-ce-glue.c | 156 ++
 crypto/sha3_generic.c| 198 +++
 crypto/testmgr.h | 550 
 include/crypto/sha3.h|   6 +-
 7 files changed, 1012 insertions(+), 131 deletions(-)
 create mode 100644 arch/arm64/crypto/sha3-ce-core.S
 create mode 100644 arch/arm64/crypto/sha3-ce-glue.c

-- 
2.11.0



[PATCH 3/5] crypto/generic: sha3 - export init/update/final routines

2018-01-12 Thread Ard Biesheuvel
To allow accelerated implementations to fall back to the generic
routines, e.g., in contexts where a SIMD based implementation is
not allowed to run, expose the generic SHA3 init/update/final
routines to other modules.

Signed-off-by: Ard Biesheuvel 
---
 crypto/sha3_generic.c | 33 +++-
 include/crypto/sha3.h |  5 +++
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/crypto/sha3_generic.c b/crypto/sha3_generic.c
index 677247d429a1..86db5baafc83 100644
--- a/crypto/sha3_generic.c
+++ b/crypto/sha3_generic.c
@@ -87,7 +87,7 @@ static void keccakf(u64 st[25])
}
 }
 
-static int sha3_init(struct shash_desc *desc)
+int crypto_sha3_init(struct shash_desc *desc)
 {
struct sha3_state *sctx = shash_desc_ctx(desc);
unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
@@ -99,8 +99,9 @@ static int sha3_init(struct shash_desc *desc)
memset(sctx->st, 0, sizeof(sctx->st));
return 0;
 }
+EXPORT_SYMBOL(crypto_sha3_init);
 
-static int sha3_update(struct shash_desc *desc, const u8 *data,
+int crypto_sha3_update(struct shash_desc *desc, const u8 *data,
   unsigned int len)
 {
struct sha3_state *sctx = shash_desc_ctx(desc);
@@ -136,8 +137,9 @@ static int sha3_update(struct shash_desc *desc, const u8 
*data,
 
return 0;
 }
+EXPORT_SYMBOL(crypto_sha3_update);
 
-static int sha3_final(struct shash_desc *desc, u8 *out)
+int crypto_sha3_final(struct shash_desc *desc, u8 *out)
 {
struct sha3_state *sctx = shash_desc_ctx(desc);
unsigned int i, inlen = sctx->partial;
@@ -162,12 +164,13 @@ static int sha3_final(struct shash_desc *desc, u8 *out)
memset(sctx, 0, sizeof(*sctx));
return 0;
 }
+EXPORT_SYMBOL(crypto_sha3_final);
 
 static struct shash_alg algs[] = { {
.digestsize = SHA3_224_DIGEST_SIZE,
-   .init   = sha3_init,
-   .update = sha3_update,
-   .final  = sha3_final,
+   .init   = crypto_sha3_init,
+   .update = crypto_sha3_update,
+   .final  = crypto_sha3_final,
.descsize   = sizeof(struct sha3_state),
.base.cra_name  = "sha3-224",
.base.cra_driver_name   = "sha3-224-generic",
@@ -176,9 +179,9 @@ static struct shash_alg algs[] = { {
.base.cra_module= THIS_MODULE,
 }, {
.digestsize = SHA3_256_DIGEST_SIZE,
-   .init   = sha3_init,
-   .update = sha3_update,
-   .final  = sha3_final,
+   .init   = crypto_sha3_init,
+   .update = crypto_sha3_update,
+   .final  = crypto_sha3_final,
.descsize   = sizeof(struct sha3_state),
.base.cra_name  = "sha3-256",
.base.cra_driver_name   = "sha3-256-generic",
@@ -187,9 +190,9 @@ static struct shash_alg algs[] = { {
.base.cra_module= THIS_MODULE,
 }, {
.digestsize = SHA3_384_DIGEST_SIZE,
-   .init   = sha3_init,
-   .update = sha3_update,
-   .final  = sha3_final,
+   .init   = crypto_sha3_init,
+   .update = crypto_sha3_update,
+   .final  = crypto_sha3_final,
.descsize   = sizeof(struct sha3_state),
.base.cra_name  = "sha3-384",
.base.cra_driver_name   = "sha3-384-generic",
@@ -198,9 +201,9 @@ static struct shash_alg algs[] = { {
.base.cra_module= THIS_MODULE,
 }, {
.digestsize = SHA3_512_DIGEST_SIZE,
-   .init   = sha3_init,
-   .update = sha3_update,
-   .final  = sha3_final,
+   .init   = crypto_sha3_init,
+   .update = crypto_sha3_update,
+   .final  = crypto_sha3_final,
.descsize   = sizeof(struct sha3_state),
.base.cra_name  = "sha3-512",
.base.cra_driver_name   = "sha3-512-generic",
diff --git a/include/crypto/sha3.h b/include/crypto/sha3.h
index 1339dcdbc9b2..080f60c2e6b1 100644
--- a/include/crypto/sha3.h
+++ b/include/crypto/sha3.h
@@ -26,4 +26,9 @@ struct sha3_state {
u8  buf[SHA3_224_BLOCK_SIZE];
 };
 
+int crypto_sha3_init(struct shash_desc *desc);
+int crypto_sha3_update(struct shash_desc *desc, const u8 *data,
+  unsigned int len);
+int crypto_sha3_final(struct shash_desc *desc, u8 *out);
+
 #endif
-- 
2.11.0



[PATCH 1/5] crypto/generic: sha3 - fixes for alignment and big endian operation

2018-01-12 Thread Ard Biesheuvel
Ensure that the input is byte swabbed before injecting it into the
SHA3 transform. Use the get_unaligned() accessor for this so that
we don't perform unaligned access inadvertently on architectures
that do not support that.

Signed-off-by: Ard Biesheuvel 
---
 crypto/sha3_generic.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/crypto/sha3_generic.c b/crypto/sha3_generic.c
index 7e8ed96236ce..a68be626017c 100644
--- a/crypto/sha3_generic.c
+++ b/crypto/sha3_generic.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define KECCAK_ROUNDS 24
 
@@ -149,7 +150,7 @@ static int sha3_update(struct shash_desc *desc, const u8 
*data,
unsigned int i;
 
for (i = 0; i < sctx->rsizw; i++)
-   sctx->st[i] ^= ((u64 *) src)[i];
+   sctx->st[i] ^= get_unaligned_le64(src + 8 * i);
keccakf(sctx->st);
 
done += sctx->rsiz;
@@ -174,7 +175,7 @@ static int sha3_final(struct shash_desc *desc, u8 *out)
sctx->buf[sctx->rsiz - 1] |= 0x80;
 
for (i = 0; i < sctx->rsizw; i++)
-   sctx->st[i] ^= ((u64 *) sctx->buf)[i];
+   sctx->st[i] ^= get_unaligned_le64(sctx->buf + 8 * i);
 
keccakf(sctx->st);
 
-- 
2.11.0



[PATCH 2/5] crypto/generic: sha3 - simplify code

2018-01-12 Thread Ard Biesheuvel
In preparation of exposing the generic SHA3 implementation to other
versions as a fallback, simplify the code, and remove an inconsistency
in the output handling (endian swabbing rsizw words of state before
writing the output does not make sense)

Signed-off-by: Ard Biesheuvel 
---
 crypto/sha3_generic.c | 184 +++-
 include/crypto/sha3.h |   1 -
 2 files changed, 59 insertions(+), 126 deletions(-)

diff --git a/crypto/sha3_generic.c b/crypto/sha3_generic.c
index a68be626017c..677247d429a1 100644
--- a/crypto/sha3_generic.c
+++ b/crypto/sha3_generic.c
@@ -17,7 +17,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #define KECCAK_ROUNDS 24
@@ -88,43 +87,16 @@ static void keccakf(u64 st[25])
}
 }
 
-static void sha3_init(struct sha3_state *sctx, unsigned int digest_sz)
-{
-   memset(sctx, 0, sizeof(*sctx));
-   sctx->md_len = digest_sz;
-   sctx->rsiz = 200 - 2 * digest_sz;
-   sctx->rsizw = sctx->rsiz / 8;
-}
-
-static int sha3_224_init(struct shash_desc *desc)
+static int sha3_init(struct shash_desc *desc)
 {
struct sha3_state *sctx = shash_desc_ctx(desc);
+   unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
 
-   sha3_init(sctx, SHA3_224_DIGEST_SIZE);
-   return 0;
-}
-
-static int sha3_256_init(struct shash_desc *desc)
-{
-   struct sha3_state *sctx = shash_desc_ctx(desc);
-
-   sha3_init(sctx, SHA3_256_DIGEST_SIZE);
-   return 0;
-}
-
-static int sha3_384_init(struct shash_desc *desc)
-{
-   struct sha3_state *sctx = shash_desc_ctx(desc);
-
-   sha3_init(sctx, SHA3_384_DIGEST_SIZE);
-   return 0;
-}
-
-static int sha3_512_init(struct shash_desc *desc)
-{
-   struct sha3_state *sctx = shash_desc_ctx(desc);
+   sctx->rsiz = 200 - 2 * digest_size;
+   sctx->rsizw = sctx->rsiz / 8;
+   sctx->partial = 0;
 
-   sha3_init(sctx, SHA3_512_DIGEST_SIZE);
+   memset(sctx->st, 0, sizeof(sctx->st));
return 0;
 }
 
@@ -169,6 +141,8 @@ static int sha3_final(struct shash_desc *desc, u8 *out)
 {
struct sha3_state *sctx = shash_desc_ctx(desc);
unsigned int i, inlen = sctx->partial;
+   unsigned int digest_size = crypto_shash_digestsize(desc->tfm);
+   __le64 *digest = (__le64 *)out;
 
sctx->buf[inlen++] = 0x06;
memset(sctx->buf + inlen, 0, sctx->rsiz - inlen);
@@ -179,110 +153,70 @@ static int sha3_final(struct shash_desc *desc, u8 *out)
 
keccakf(sctx->st);
 
-   for (i = 0; i < sctx->rsizw; i++)
-   sctx->st[i] = cpu_to_le64(sctx->st[i]);
+   for (i = 0; i < digest_size / 8; i++)
+   put_unaligned_le64(sctx->st[i], digest++);
 
-   memcpy(out, sctx->st, sctx->md_len);
+   if (digest_size & 4)
+   put_unaligned_le32(sctx->st[i], (__le32 *)digest);
 
memset(sctx, 0, sizeof(*sctx));
return 0;
 }
 
-static struct shash_alg sha3_224 = {
-   .digestsize =   SHA3_224_DIGEST_SIZE,
-   .init   =   sha3_224_init,
-   .update =   sha3_update,
-   .final  =   sha3_final,
-   .descsize   =   sizeof(struct sha3_state),
-   .base   =   {
-   .cra_name   =   "sha3-224",
-   .cra_driver_name =  "sha3-224-generic",
-   .cra_flags  =   CRYPTO_ALG_TYPE_SHASH,
-   .cra_blocksize  =   SHA3_224_BLOCK_SIZE,
-   .cra_module =   THIS_MODULE,
-   }
-};
-
-static struct shash_alg sha3_256 = {
-   .digestsize =   SHA3_256_DIGEST_SIZE,
-   .init   =   sha3_256_init,
-   .update =   sha3_update,
-   .final  =   sha3_final,
-   .descsize   =   sizeof(struct sha3_state),
-   .base   =   {
-   .cra_name   =   "sha3-256",
-   .cra_driver_name =  "sha3-256-generic",
-   .cra_flags  =   CRYPTO_ALG_TYPE_SHASH,
-   .cra_blocksize  =   SHA3_256_BLOCK_SIZE,
-   .cra_module =   THIS_MODULE,
-   }
-};
-
-static struct shash_alg sha3_384 = {
-   .digestsize =   SHA3_384_DIGEST_SIZE,
-   .init   =   sha3_384_init,
-   .update =   sha3_update,
-   .final  =   sha3_final,
-   .descsize   =   sizeof(struct sha3_state),
-   .base   =   {
-   .cra_name   =   "sha3-384",
-   .cra_driver_name =  "sha3-384-generic",
-   .cra_flags  =   CRYPTO_ALG_TYPE_SHASH,
-   .cra_blocksize  =   SHA3_384_BLOCK_SIZE,
-   .cra_module =   THIS_MODULE,
-   }
-};
-
-static struct shash_alg sha3_512 = {
-   .digestsize =   SHA3_512_DIGEST_SIZE,
-   .init   =   sha3_512_init,
-   .update =   sha3_update,
-   .final  

[PATCH 4/5] crypto/arm64: sha3 - new implementation based on special instructions

2018-01-12 Thread Ard Biesheuvel
Implement the various flavours of SHA3 using the new optional
EOR3/RAX1/XAR/BCAX instructions introduced by ARMv8.2.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/Kconfig|   6 +
 arch/arm64/crypto/Makefile   |   3 +
 arch/arm64/crypto/sha3-ce-core.S | 224 
 arch/arm64/crypto/sha3-ce-glue.c | 156 ++
 4 files changed, 389 insertions(+)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index aad288f4b9de..4f2974687606 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -35,6 +35,12 @@ config CRYPTO_SHA512_ARM64_CE
select CRYPTO_HASH
select CRYPTO_SHA512_ARM64
 
+config CRYPTO_SHA3_ARM64_CE
+   tristate "SHA3 digest algorithm (ARMv8 Crypto Extensions)"
+   depends on KERNEL_MODE_NEON
+   select CRYPTO_HASH
+   select CRYPTO_SHA3
+
 config CRYPTO_GHASH_ARM64_CE
tristate "GHASH/AES-GCM using ARMv8 Crypto Extensions"
depends on KERNEL_MODE_NEON
diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
index d7573d31d397..04eaf8b78816 100644
--- a/arch/arm64/crypto/Makefile
+++ b/arch/arm64/crypto/Makefile
@@ -17,6 +17,9 @@ sha2-ce-y := sha2-ce-glue.o sha2-ce-core.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM64_CE) += sha512-ce.o
 sha512-ce-y := sha512-ce-glue.o sha512-ce-core.o
 
+obj-$(CONFIG_CRYPTO_SHA3_ARM64_CE) += sha3-ce.o
+sha3-ce-y := sha3-ce-glue.o sha3-ce-core.o
+
 obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
 ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
 
diff --git a/arch/arm64/crypto/sha3-ce-core.S b/arch/arm64/crypto/sha3-ce-core.S
new file mode 100644
index ..b0b3d68ef3d3
--- /dev/null
+++ b/arch/arm64/crypto/sha3-ce-core.S
@@ -0,0 +1,224 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * sha512-ce-core.S - core SHA-384/SHA-512 transform using v8 Crypto Extensions
+ *
+ * Copyright (C) 2018 Linaro Ltd 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+
+   .text
+
+   .irp
b,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
+   .set.Lv\b\().2d, \b
+   .set.Lv\b\().16b, \b
+   .endr
+
+   .macro  eor3, rd, rn, ra, rm
+   .inst   0xce00 | .L\rd | (.L\rn << 5) | (.L\ra << 10) | 
(.L\rm << 16)
+   .endm
+
+   .macro  rax1, rd, rn, rm
+   .inst   0xce608c00 | .L\rd | (.L\rn << 5) | (.L\rm << 16)
+   .endm
+
+   .macro  bcax, rd, rn, ra, rm
+   .inst   0xce20 | .L\rd | (.L\rn << 5) | (.L\ra << 10) | 
(.L\rm << 16)
+   .endm
+
+   .macro  xar, rd, rn, rm, imm6
+   .inst   0xce80 | .L\rd | (.L\rn << 5) | ((\imm6) << 10) | 
(.L\rm << 16)
+   .endm
+
+   /*
+* sha3_ce_transform(u64 *st, const u8 *data, int blocks, int dg_size);
+*/
+ENTRY(sha3_ce_transform)
+   /* load state */
+   mov x8, x0
+   ld1 { v0.1d- v3.1d}, [x8], #32
+   ld1 { v4.1d- v7.1d}, [x8], #32
+   ld1 { v8.1d-v11.1d}, [x8], #32
+   ld1 {v12.1d-v15.1d}, [x8], #32
+   ld1 {v16.1d-v19.1d}, [x8], #32
+   ld1 {v20.1d-v23.1d}, [x8], #32
+   ld1 {v24.1d}, [x8]
+
+0: sub w2, w2, #1
+   mov w8, #24
+   adr_l   x9, .Lsha3_rcon
+
+   /* load input */
+   ld1 {v25.8b-v28.8b}, [x1], #32
+   ld1 {v29.8b-v31.8b}, [x1], #24
+   eor v0.8b, v0.8b, v25.8b
+   eor v1.8b, v1.8b, v26.8b
+   eor v2.8b, v2.8b, v27.8b
+   eor v3.8b, v3.8b, v28.8b
+   eor v4.8b, v4.8b, v29.8b
+   eor v5.8b, v5.8b, v30.8b
+   eor v6.8b, v6.8b, v31.8b
+
+   tbnzx3, #6, 2f  // SHA3-512
+
+   ld1 {v25.8b-v28.8b}, [x1], #32
+   ld1 {v29.8b-v30.8b}, [x1], #16
+   eor  v7.8b,  v7.8b, v25.8b
+   eor  v8.8b,  v8.8b, v26.8b
+   eor  v9.8b,  v9.8b, v27.8b
+   eor v10.8b, v10.8b, v28.8b
+   eor v11.8b, v11.8b, v29.8b
+   eor v12.8b, v12.8b, v30.8b
+
+   tbnzx3, #4, 1f  // SHA3-384 or SHA3-224
+
+   // SHA3-256
+   ld1 {v25.8b-v28.8b}, [x1], #32
+   eor v13.8b, v13.8b, v25.8b
+   eor v14.8b, v14.8b, v26.8b
+   eor v15.8b, v15.8b, v27.8b
+   eor v16.8b, v16.8b, v28.8b
+   b   3f
+
+1: tbz x3, #2, 3f  // bit 2 cleared? SHA-384
+
+   // SHA3-224
+   ld1 

[PATCH 5/5] crypto/testmgr: sha3 - add new testcases

2018-01-12 Thread Ard Biesheuvel
All current SHA3 test cases are smaller than the SHA3 block size, which
means not all code paths are being exercised. So add a new test case to
each variant, and make one of the existing test cases chunked.

Signed-off-by: Ard Biesheuvel 
---
 crypto/testmgr.h | 550 
 1 file changed, 550 insertions(+)

diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index a714b6293959..6044f6906bd6 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -1052,6 +1052,142 @@ static const struct hash_testvec sha3_224_tv_template[] 
= {
"\xc9\xfd\x55\x74\x49\x44\x79\xba"
"\x5c\x7e\x7a\xb7\x6e\xf2\x64\xea"
"\xd0\xfc\xce\x33",
+   .np = 2,
+   .tap= { 28, 28 },
+   }, {
+   .plaintext = "\x08\x9f\x13\xaa\x41\xd8\x4c\xe3"
+"\x7a\x11\x85\x1c\xb3\x27\xbe\x55"
+"\xec\x60\xf7\x8e\x02\x99\x30\xc7"
+"\x3b\xd2\x69\x00\x74\x0b\xa2\x16"
+"\xad\x44\xdb\x4f\xe6\x7d\x14\x88"
+"\x1f\xb6\x2a\xc1\x58\xef\x63\xfa"
+"\x91\x05\x9c\x33\xca\x3e\xd5\x6c"
+"\x03\x77\x0e\xa5\x19\xb0\x47\xde"
+"\x52\xe9\x80\x17\x8b\x22\xb9\x2d"
+"\xc4\x5b\xf2\x66\xfd\x94\x08\x9f"
+"\x36\xcd\x41\xd8\x6f\x06\x7a\x11"
+"\xa8\x1c\xb3\x4a\xe1\x55\xec\x83"
+"\x1a\x8e\x25\xbc\x30\xc7\x5e\xf5"
+"\x69\x00\x97\x0b\xa2\x39\xd0\x44"
+"\xdb\x72\x09\x7d\x14\xab\x1f\xb6"
+"\x4d\xe4\x58\xef\x86\x1d\x91\x28"
+"\xbf\x33\xca\x61\xf8\x6c\x03\x9a"
+"\x0e\xa5\x3c\xd3\x47\xde\x75\x0c"
+"\x80\x17\xae\x22\xb9\x50\xe7\x5b"
+"\xf2\x89\x20\x94\x2b\xc2\x36\xcd"
+"\x64\xfb\x6f\x06\x9d\x11\xa8\x3f"
+"\xd6\x4a\xe1\x78\x0f\x83\x1a\xb1"
+"\x25\xbc\x53\xea\x5e\xf5\x8c\x00"
+"\x97\x2e\xc5\x39\xd0\x67\xfe\x72"
+"\x09\xa0\x14\xab\x42\xd9\x4d\xe4"
+"\x7b\x12\x86\x1d\xb4\x28\xbf\x56"
+"\xed\x61\xf8\x8f\x03\x9a\x31\xc8"
+"\x3c\xd3\x6a\x01\x75\x0c\xa3\x17"
+"\xae\x45\xdc\x50\xe7\x7e\x15\x89"
+"\x20\xb7\x2b\xc2\x59\xf0\x64\xfb"
+"\x92\x06\x9d\x34\xcb\x3f\xd6\x6d"
+"\x04\x78\x0f\xa6\x1a\xb1\x48\xdf"
+"\x53\xea\x81\x18\x8c\x23\xba\x2e"
+"\xc5\x5c\xf3\x67\xfe\x95\x09\xa0"
+"\x37\xce\x42\xd9\x70\x07\x7b\x12"
+"\xa9\x1d\xb4\x4b\xe2\x56\xed\x84"
+"\x1b\x8f\x26\xbd\x31\xc8\x5f\xf6"
+"\x6a\x01\x98\x0c\xa3\x3a\xd1\x45"
+"\xdc\x73\x0a\x7e\x15\xac\x20\xb7"
+"\x4e\xe5\x59\xf0\x87\x1e\x92\x29"
+"\xc0\x34\xcb\x62\xf9\x6d\x04\x9b"
+"\x0f\xa6\x3d\xd4\x48\xdf\x76\x0d"
+"\x81\x18\xaf\x23\xba\x51\xe8\x5c"
+"\xf3\x8a\x21\x95\x2c\xc3\x37\xce"
+"\x65\xfc\x70\x07\x9e\x12\xa9\x40"
+"\xd7\x4b\xe2\x79\x10\x84\x1b\xb2"
+"\x26\xbd\x54\xeb\x5f\xf6\x8d\x01"
+"\x98\x2f\xc6\x3a\xd1\x68\xff\x73"
+"\x0a\xa1\x15\xac\x43\xda\x4e\xe5"
+"\x7c\x13\x87\x1e\xb5\x29\xc0\x57"
+"\xee\x62\xf9\x90\x04\x9b\x32\xc9"
+"\x3d\xd4\x6b\x02\x76\x0d\xa4\x18"
+"\xaf\x46\xdd\x51\xe8\x7f\x16\x8a"
+"\x21\xb8\x2c\xc3\x5a\xf1\x65\xfc"
+"\x93\x07\x9e\x35\xcc\x40\xd7\x6e"
+"\x05\x79\x10\xa7\x1b\xb2\x49\xe0"
+"\x54\xeb\x82\x19\x8d\x24\xbb\x2f"
+"\xc6\x5d\xf4\x68\xff\x96\x0a\xa1"
+"\x38\xcf\x43\xda\x71\x08\x7c\x13"
+"\xaa\x1e\xb5\x4c\xe3\x57\xee\x85"
+"\x1c\x90\x27\xbe\x32\xc9\x60\xf7"
+"\x6b\x02\x99\x0d\xa4\x3b\xd2\x46"
+"\xdd\x74\x0b\x7f\x16\xad\x21\xb8"
+"\x4f\xe6\x5a\xf1\x88\x1f\x93\x2a"
+"\xc1\x35\xcc\x63\xfa\x6e\x05\x9c"
+ 

Re: [PATCH 00/17] crypto: inside-secure - various improvements

2018-01-12 Thread Antoine Tenart
Hi Herbert,

On Fri, Dec 22, 2017 at 03:28:38PM +0100, Antoine Tenart wrote:
> On Fri, Dec 22, 2017 at 08:11:54PM +1100, Herbert Xu wrote:
> > 
> > Patches 1-16 applied.  If patch 17 needs to go through the crypto
> > tree please let me know.
> 
> Thanks! Yes patch 17 should got through the crypto tree as well.

I'm not sure you took patch 17 in your tree. Could you check?

Thanks!
Antoine

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: [PATCH] crypto: chacha20 - use rol32() macro from bitops.h

2018-01-12 Thread Herbert Xu
On Sun, Dec 31, 2017 at 06:02:45PM -0600, Eric Biggers wrote:
> From: Eric Biggers 
> 
> For chacha20_block(), use the existing 32-bit left-rotate function
> instead of defining one ourselves.
> 
> Signed-off-by: Eric Biggers 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4] crypto: AF_ALG - whitelist mask and type

2018-01-12 Thread Herbert Xu
On Tue, Jan 02, 2018 at 08:55:25AM +0100, Stephan Müller wrote:
> Hi,
> 
> sorry, I forgot the right tags.
> 
> ---8<---
> 
> The user space interface allows specifying the type and mask field used
> to allocate the cipher. Only a subset of the possible flags are intended
> for user space. Therefore, white-list the allowed flags.
> 
> In case the user space caller uses at least one non-allowed flag, EINVAL
> is returned.
> 
> Reported-by: syzbot 
> Cc: 
> Signed-off-by: Stephan Mueller 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] [v2] crypto: aes-generic - build with -Os on gcc-7+

2018-01-12 Thread Herbert Xu
On Wed, Jan 03, 2018 at 11:39:27PM +0100, Arnd Bergmann wrote:
> While testing other changes, I discovered that gcc-7.2.1 produces badly
> optimized code for aes_encrypt/aes_decrypt. This is especially true when
> CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
> large stack usage that in turn might cause kernel stack overflows:
> 
> crypto/aes_generic.c: In function 'aes_encrypt':
> crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger 
> than 2048 bytes [-Wframe-larger-than=]
> crypto/aes_generic.c: In function 'aes_decrypt':
> crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger 
> than 2048 bytes [-Wframe-larger-than=]
> 
> I verified that this problem exists on all architectures that are
> supported by gcc-7.2, though arm64 in particular is less affected than
> the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
> stack usage but still produce worse code than earlier versions for this
> file, apparently because of optimization passes that generally provide
> a substantial improvement in object code quality but understandably fail
> to find any shortcuts in the AES algorithm.
> 
> Possible workarounds include
> 
> a) disabling -ftree-pre and -ftree-sra optimizations, this was an earlier
>patch I tried, which reliably fixed the stack usage, but caused a
>serious performance regression in some versions, as later testing
>found.
> 
> b) disabling UBSAN on this file or all ciphers, as suggested by Ard
>Biesheuvel. This would lead to massively better crypto performance in
>UBSAN-enabled kernels and avoid the stack usage, but there is a concern
>over whether we should exclude arbitrary files from UBSAN at all.
> 
> c) Forcing the optimization level in a different way. Similar to a),
>but rather than deselecting specific optimization stages,
>this now uses "gcc -Os" for this file, regardless of the
>CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE/SIZE option. This is a reliable
>workaround for the stack consumption on all architecture, and I've
>retested the performance results now on x86, cycles/byte (lower is
>better) for cbc(aes-generic) with 256 bit keys:
> 
>   -O2 -Os
>   gcc-6.3.1   14.915.1
>   gcc-7.0.1   14.715.3
>   gcc-7.1.1   15.314.7
>   gcc-7.2.1   16.815.9
>   gcc-8.0.0   15.515.6
> 
> This implements the option c) by enabling forcing -Os on all compiler
> versions starting with gcc-7.1. As a workaround for PR83356, it would
> only be needed for gcc-7.2+ with UBSAN enabled, but since it also shows
> better performance on gcc-7.1 without UBSAN, it seems appropriate to
> use the faster version here as well.
> 
> Side note: during testing, I also played with the AES code in libressl,
> which had a similar performance regression from gcc-6 to gcc-7.2,
> but was three times slower overall. It might be interesting to
> investigate that further and possibly port the Linux implementation
> into that.
> 
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651
> Cc: Richard Biener 
> Cc: Jakub Jelinek 
> Cc: Ard Biesheuvel 
> Signed-off-by: Arnd Bergmann 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: clear htmldocs build warnings for crypto/hash

2018-01-12 Thread Herbert Xu
On Sun, Jan 07, 2018 at 10:01:43AM +1100, Tobin C. Harding wrote:
> SPHINX build emits multiple warnings of kind:
> 
>   warning: duplicate section name 'Note'
> 
> (when building kernel via make target 'htmldocs')
> 
> This is caused by repeated use of comments of form:
> 
>   * Note: soau soaeusoa uoe
> 
> We can change the format without loss of clarity and clear the build
> warnings.
> 
> Add '**[mandatory]**' or '**[optional]**' as kernel-doc field element
> description prefix
> 
> This renders in HTML as (prefixes in bold)
> 
> final
> [mandatory] Retrieve result from the driver. This function finalizes the
> transformation and retrieves the resulting hash from the driver and
> pushes it back to upper layers. No data processing happens at this
> point unless hardware requires it to finish the transformation (then
> the data buffered by the device driver is processed).
> 
> Signed-off-by: Tobin C. Harding 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 0/3] crypto: salsa20 - cleanup and convert to skcipher API

2018-01-12 Thread Herbert Xu
On Fri, Jan 05, 2018 at 11:09:56AM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> This series converts the Salsa20 implementations over to the skcipher
> API, in the process fixing a couple bugs and making them be more similar
> to the ChaCha20 implementations, rather than doing things slightly
> differently for no good reason.  (Note, however, that Salsa20 still
> interprets the IV differently from ChaCha20; I didn't change that.)
> 
> I'm not sure who is actually using Salsa20 in the kernel, but either way
> if we're going to have it at all we might as well keep it up to date.
> 
> The Salsa20 algorithms still pass the self-tests after these changes.
> 
> Changed since v1:
> 
> - Use '__le32' type for keystream, as suggested by Stephan.
> 
> Eric Biggers (3):
>   crypto: salsa20-generic - cleanup and convert to skcipher API
>   crypto: salsa20 - export generic helpers
>   crypto: x86/salsa20 - cleanup and convert to skcipher API
> 
>  arch/x86/crypto/salsa20-i586-asm_32.S   | 184 +---
>  arch/x86/crypto/salsa20-x86_64-asm_64.S | 114 ---
>  arch/x86/crypto/salsa20_glue.c  | 105 ++
>  crypto/Kconfig  |   2 +
>  crypto/salsa20_generic.c| 240 
> ++--
>  include/crypto/salsa20.h|  27 
>  6 files changed, 174 insertions(+), 498 deletions(-)
>  create mode 100644 include/crypto/salsa20.h

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: tcrypt - free xoutbuf instead of axbuf

2018-01-12 Thread Herbert Xu
On Tue, Jan 02, 2018 at 03:43:04PM +, Colin King wrote:
> From: Colin Ian King 
> 
> There seems to be a cut-n-paste bug with the name of the buffer being
> free'd, xoutbuf should be used instead of axbuf.
> 
> Detected by CoverityScan, CID#1463420 ("Copy-paste error")
> 
> Fixes: 427988d981c4 ("crypto: tcrypt - add multibuf aead speed test")
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: Use zeroing memory allocator instead of allocator/memset

2018-01-12 Thread Herbert Xu
On Sun, Dec 31, 2017 at 05:54:23PM +0530, Himanshu Jha wrote:
> Use dma_zalloc_coherent for allocating zeroed
> memory and remove unnecessary memset function.
> 
> Done using Coccinelle.
> Generated-by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci
> 0-day tested with no failures.
> 
> Signed-off-by: Himanshu Jha 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH 0/9] crypto: prevent unkeyed use of keyed algorithms

2018-01-12 Thread Herbert Xu
On Wed, Jan 03, 2018 at 11:16:21AM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> This series updates the crypto API to consistently prevent using keyed
> algorithms without setting the key.  Currently this is prevented for
> AF_ALG but not for other crypto API users, which is very problematic for
> other places in the kernel where userspace can specify a hash algorithm
> by name, e.g. KEYCTL_DH_COMPUTE as demonstrated by syzbot
> (https://marc.info/?l=linux-crypto-vger=151395810921850).
> 
> This series fixes the bug for all users by adding a flag
> CRYPTO_ALG_NEED_KEY to crypto_tfm.crt_flags.  This flag is set if needed
> when the tfm is created, is cleared when the key is set, and is checked
> when doing an operation that would require the key.
> 
> Patches 1-6 update the hash API, which is the primary fix.  I've marked
> all those patches for stable, which is kind of a pain, but it seems the
> alternative would be very messy -- we'd have to patch at least 5
> different crypto API users (probably missing some), then revert those
> patches upstream once we have the proper fix at the API level.
> 
> The last two patches also extend the fix to the skcipher and AEAD APIs,
> primarily as a sanity check since users should be less likely to try to
> use skciphers or AEADs without setting a key.
> 
> Eric Biggers (9):
>   crypto: hash - introduce crypto_hash_alg_has_setkey()
>   crypto: cryptd - pass through absence of ->setkey()
>   crypto: mcryptd - pass through absence of ->setkey()
>   crypto: poly1305 - remove ->setkey() method
>   crypto: hash - annotate algorithms taking optional key
>   crypto: hash - prevent using keyed hashes without setting key
>   crypto: ghash - remove checks for key being set
>   crypto: skcipher - prevent using skciphers without setting key
>   crypto: aead - prevent using AEADs without setting key

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2] crypto: testmgr: change `guard` to unsigned char

2018-01-12 Thread Herbert Xu
On Mon, Jan 01, 2018 at 10:40:14AM -1000, Joey Pabalinas wrote:
> When char is signed, storing the values 0xba (186) and 0xad (173) in the
> `guard` array produces signed overflow. Change the type of `guard` to
> static unsigned char to correct undefined behavior and reduce function
> stack usage.
> 
> Signed-off-by: Joey Pabalinas 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH][crypto-next] crypto: tcrypt: fix spelling mistake: "bufufer"-> "buffer"

2018-01-12 Thread Herbert Xu
On Tue, Jan 02, 2018 at 09:21:06AM +, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistakes in pr_err error message text.
> 
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 1/2] crypto: Implement a generic crypto statistics

2018-01-12 Thread Stephan Mueller
Am Freitag, 12. Januar 2018, 10:07:30 CET schrieb LABBE Corentin:

Hi LABBE,

> 
> > > diff --git a/include/uapi/linux/cryptouser.h
> > > b/include/uapi/linux/cryptouser.h index 19bf0ca6d635..15e51ccb3679
> > > 100644
> > > --- a/include/uapi/linux/cryptouser.h
> > > +++ b/include/uapi/linux/cryptouser.h
> > > @@ -73,6 +73,8 @@ struct crypto_report_hash {
> > > 
> > >   char type[CRYPTO_MAX_NAME];
> > >   unsigned int blocksize;
> > >   unsigned int digestsize;
> > > 
> > > + __u64 stat_hash;
> > 
> > Why do you use __u64? The atomic_t variable is an int, i.e. 32 bit. Thus I
> > would think that __u32 would suffice?
> 
> You are right, I will downgrade to __u32
> But I think I will set len stats to atomic64/__u64 and keep count on
> atomic_t/__u32.

Fine with me.

> > > + __u64 stat_hash_tlen;
> > > 
> > >  };
> > 
> > What I am slightly unsure here is: how should user space detect whether
> > these additional parameters are part of the NETLINK_USER API or not? I
> > use that interface in my libkcapi whose binary may be used on multiple
> > different kernel versions. How should that library operate if one kernel
> > has these parameters and another does not?
> 
> Userspace could check for kernel version and know if stat are present or
> not. Another way is to add a new netlink request.

Well, I am not sure that checking the kernel version is good enough. Distros 
and other vendors may backport this patch. This means that for some older 
kernel versions this interface is present.

Hence I would rather opt for a separate stat message where the user spacee 
caller receives an error on kernels that does not support it.

Ciao
Stephan




Re: [PATCH 1/2] crypto: Implement a generic crypto statistics

2018-01-12 Thread LABBE Corentin
On Fri, Jan 12, 2018 at 07:49:43AM +0100, Stephan Mueller wrote:
> Am Donnerstag, 11. Januar 2018, 20:56:56 CET schrieb Corentin Labbe:
> 
> Hi Corentin,
> 
> > This patch implement a generic way to get statistics about all crypto
> > usages.
> > 
> > Signed-off-by: Corentin Labbe 
> > ---
> >  crypto/Kconfig  | 11 
> >  crypto/ablkcipher.c |  9 +++
> >  crypto/acompress.c  |  9 +++
> >  crypto/aead.c   | 10 
> >  crypto/ahash.c  |  8 ++
> >  crypto/akcipher.c   | 13 ++
> >  crypto/algapi.c |  6 +
> >  crypto/blkcipher.c  |  9 +++
> >  crypto/crypto_user.c| 28 +
> >  crypto/kpp.c|  7 ++
> >  crypto/rng.c|  8 ++
> >  crypto/scompress.c  |  9 +++
> >  crypto/shash.c  |  5 
> >  crypto/skcipher.c   |  9 +++
> >  include/crypto/acompress.h  | 22 
> >  include/crypto/aead.h   | 22 
> >  include/crypto/akcipher.h   | 42 +++
> >  include/crypto/hash.h   | 21 
> >  include/crypto/kpp.h| 28 +
> >  include/crypto/rng.h| 17 +
> >  include/crypto/skcipher.h   | 22 
> >  include/linux/crypto.h  | 56
> > + include/uapi/linux/cryptouser.h |
> > 34 +
> >  23 files changed, 405 insertions(+)
> > 
> > diff --git a/crypto/Kconfig b/crypto/Kconfig
> > index 971d558494c3..3b88fba14b59 100644
> > --- a/crypto/Kconfig
> > +++ b/crypto/Kconfig
> > @@ -1780,6 +1780,17 @@ config CRYPTO_USER_API_AEAD
> >   This option enables the user-spaces interface for AEAD
> >   cipher algorithms.
> > 
> > +config CRYPTO_STATS
> > +   bool "Crypto usage statistics for User-space"
> > +   help
> > + This option enables the gathering of crypto stats.
> > + This will collect:
> > + - encrypt/decrypt size and numbers of symmeric operations
> > + - compress/decompress size and numbers of compress operations
> > + - size and numbers of hash operations
> > + - encrypt/decrypt/sign/verify numbers for asymmetric operations
> > + - generate/seed numbers for rng operations
> > +
> >  config CRYPTO_HASH_INFO
> > bool
> > 

[...]
> >   * crypto_acomp_compress() -- Invoke asynchronous compress operation
> >   *
> > @@ -247,6 +267,7 @@ static inline int crypto_acomp_compress(struct acomp_req
> > *req) {
> > struct crypto_acomp *tfm = crypto_acomp_reqtfm(req);
> > 
> > +   crypto_stat_compress(req);
> > return tfm->compress(req);
> 
> In general: should'nt the statistics increment only happen if the associated 
> operation was successful?
> 

I will do it.
This bring also a possibility to get errors counters.

> >  /**
> >   * crypto_ahash_finup() - update and finalize message digest
> >   * @req: reference to the ahash_request handle that holds all information
> > @@ -519,6 +538,8 @@ static inline int crypto_ahash_init(struct ahash_request
> > *req) */
> >  static inline int crypto_ahash_update(struct ahash_request *req)
> >  {
> > +
> > +   crypto_stat_ahash_update(req);
> > return crypto_ahash_reqtfm(req)->update(req);
> 
> In case you roll another update: please remove the blank line.

Ok
 
> > diff --git a/include/uapi/linux/cryptouser.h
> > b/include/uapi/linux/cryptouser.h index 19bf0ca6d635..15e51ccb3679 100644
> > --- a/include/uapi/linux/cryptouser.h
> > +++ b/include/uapi/linux/cryptouser.h
> > @@ -73,6 +73,8 @@ struct crypto_report_hash {
> > char type[CRYPTO_MAX_NAME];
> > unsigned int blocksize;
> > unsigned int digestsize;
> > +   __u64 stat_hash;
> 
> Why do you use __u64? The atomic_t variable is an int, i.e. 32 bit. Thus I 
> would think that __u32 would suffice?

You are right, I will downgrade to __u32
But I think I will set len stats to atomic64/__u64 and keep count on 
atomic_t/__u32.

> 
> 
> > +   __u64 stat_hash_tlen;
> >  };
> 
> What I am slightly unsure here is: how should user space detect whether these 
> additional parameters are part of the NETLINK_USER API or not? I use that 
> interface in my libkcapi whose binary may be used on multiple different 
> kernel 
> versions. How should that library operate if one kernel has these parameters 
> and another does not?
> 

Userspace could check for kernel version and know if stat are present or not.
Another way is to add a new netlink request.

Thanks for your review.
Regards
Corentin Labbe