Re: [PATCH] crypto: CTR DRBG - in-place cipher operation

2018-08-03 Thread Herbert Xu
On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote:
> The cipher implementations of the kernel crypto API favor in-place
> cipher operations. Thus, switch the CTR cipher operation in the DRBG to
> perform in-place operations. This is implemented by using the output
> buffer as input buffer and zeroizing it before the cipher operation to
> implement a CTR encryption of a NULL buffer.
> 
> The speed improvement is quite visibile with the following comparison
> using the LRNG implementation.
> 
> Without the patch set:
> 
>   16 bytes|   12.267661 MB/s|61338304 bytes |  500213 ns
>   32 bytes|   23.603770 MB/s|   118018848 bytes |  500073 ns
>   64 bytes|   46.732262 MB/s|   233661312 bytes |  500241 ns
>  128 bytes|   90.038042 MB/s|   450190208 bytes |  500244 ns
>  256 bytes|  160.399616 MB/s|   801998080 bytes |  500393 ns
>  512 bytes|  259.878400 MB/s|  1299392000 bytes |  501675 ns
> 1024 bytes|  386.050662 MB/s|  1930253312 bytes |  501661 ns
> 2048 bytes|  493.641728 MB/s|  2468208640 bytes |  501598 ns
> 4096 bytes|  581.835981 MB/s|  2909179904 bytes |  503426 ns
> 
> With the patch set:
> 
>   16 bytes | 17.051142 MB/s | 85255712 bytes |  500854 ns
>   32 bytes | 32.695898 MB/s |163479488 bytes |  500544 ns
>   64 bytes | 64.490739 MB/s |322453696 bytes |  500954 ns
>  128 bytes |123.285043 MB/s |616425216 bytes |  500201 ns
>  256 bytes |233.434573 MB/s |   1167172864 bytes |  500573 ns
>  512 bytes |384.405197 MB/s |   1922025984 bytes |  500671 ns
> 1024 bytes |566.313370 MB/s |   2831566848 bytes |  501080 ns
> 2048 bytes |744.518042 MB/s |   3722590208 bytes |  500926 ns
> 4096 bytes |867.501670 MB/s |   4337508352 bytes |  502181 ns
> 
> Signed-off-by: Stephan Mueller 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: CTR DRBG - in-place cipher operation

2018-07-27 Thread Herbert Xu
On Fri, Jul 27, 2018 at 01:49:08PM +0200, Stephan Mueller wrote:
>
> This is guaranteed by the invokers of drbg_kcapi_sym_ctr as there are two 
> only:
> 
> - the one in drbg_ctr_update uses the scratchpad for inbuf
> 
> - the one in drbg_ctr_generate uses NULL which implies that the outscratchpad 
> is used.

I see.  I missed the fact that the buffer provided by the user is
the output buffer and not the input buffer.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: CTR DRBG - in-place cipher operation

2018-07-27 Thread Stephan Mueller
Am Freitag, 27. Juli 2018, 12:53:35 CEST schrieb Herbert Xu:

Hi Herbert,

> On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote:
> > @@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state
> > *drbg,> 
> >   u8 *outbuf, u32 outlen)
> >  
> >  {
> >  
> > struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out;
> > 
> > +   u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN);
> > 
> > int ret;
> > 
> > -   sg_set_buf(sg_in, inbuf, inlen);
> > -   sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN);
> > +   if (inbuf) {
> > +   /* Use caller-provided input buffer */
> > +   sg_set_buf(sg_in, inbuf, inlen);
> > +   } else {
> > +   /* Use scratchpad for in-place operation */
> > +   inlen = scratchpad_use;
> > +   memset(drbg->outscratchpad, 0, scratchpad_use);
> > +   sg_set_buf(sg_in, drbg->outscratchpad, scratchpad_use);
> > +   }
> 
> What guarantees that inbuf isn't on the stack?

This is guaranteed by the invokers of drbg_kcapi_sym_ctr as there are two 
only:

- the one in drbg_ctr_update uses the scratchpad for inbuf

- the one in drbg_ctr_generate uses NULL which implies that the outscratchpad 
is used.

Ciao
Stephan




Re: [PATCH] crypto: CTR DRBG - in-place cipher operation

2018-07-27 Thread Herbert Xu
On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote:
>
> @@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state *drbg,
> u8 *outbuf, u32 outlen)
>  {
>   struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out;
> + u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN);
>   int ret;
>  
> - sg_set_buf(sg_in, inbuf, inlen);
> - sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN);
> + if (inbuf) {
> + /* Use caller-provided input buffer */
> + sg_set_buf(sg_in, inbuf, inlen);
> + } else {
> + /* Use scratchpad for in-place operation */
> + inlen = scratchpad_use;
> + memset(drbg->outscratchpad, 0, scratchpad_use);
> + sg_set_buf(sg_in, drbg->outscratchpad, scratchpad_use);
> + }

What guarantees that inbuf isn't on the stack?

I think rather than doing this we need to fix the existing code
to copy inbuf onto the scratch pad and then do in-place operation
on that.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH] crypto: CTR DRBG - in-place cipher operation

2018-07-20 Thread Stephan Müller
The cipher implementations of the kernel crypto API favor in-place
cipher operations. Thus, switch the CTR cipher operation in the DRBG to
perform in-place operations. This is implemented by using the output
buffer as input buffer and zeroizing it before the cipher operation to
implement a CTR encryption of a NULL buffer.

The speed improvement is quite visibile with the following comparison
using the LRNG implementation.

Without the patch set:

  16 bytes|   12.267661 MB/s|61338304 bytes |  500213 ns
  32 bytes|   23.603770 MB/s|   118018848 bytes |  500073 ns
  64 bytes|   46.732262 MB/s|   233661312 bytes |  500241 ns
 128 bytes|   90.038042 MB/s|   450190208 bytes |  500244 ns
 256 bytes|  160.399616 MB/s|   801998080 bytes |  500393 ns
 512 bytes|  259.878400 MB/s|  1299392000 bytes |  501675 ns
1024 bytes|  386.050662 MB/s|  1930253312 bytes |  501661 ns
2048 bytes|  493.641728 MB/s|  2468208640 bytes |  501598 ns
4096 bytes|  581.835981 MB/s|  2909179904 bytes |  503426 ns

With the patch set:

  16 bytes | 17.051142 MB/s | 85255712 bytes |  500854 ns
  32 bytes | 32.695898 MB/s |163479488 bytes |  500544 ns
  64 bytes | 64.490739 MB/s |322453696 bytes |  500954 ns
 128 bytes |123.285043 MB/s |616425216 bytes |  500201 ns
 256 bytes |233.434573 MB/s |   1167172864 bytes |  500573 ns
 512 bytes |384.405197 MB/s |   1922025984 bytes |  500671 ns
1024 bytes |566.313370 MB/s |   2831566848 bytes |  501080 ns
2048 bytes |744.518042 MB/s |   3722590208 bytes |  500926 ns
4096 bytes |867.501670 MB/s |   4337508352 bytes |  502181 ns

Signed-off-by: Stephan Mueller 
---
 crypto/drbg.c | 34 ++
 include/crypto/drbg.h |  2 --
 2 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/crypto/drbg.c b/crypto/drbg.c
index ee302fd229ad..bc52d9562611 100644
--- a/crypto/drbg.c
+++ b/crypto/drbg.c
@@ -261,8 +261,7 @@ static int drbg_fini_sym_kernel(struct drbg_state *drbg);
 static int drbg_kcapi_sym_ctr(struct drbg_state *drbg,
  u8 *inbuf, u32 inbuflen,
  u8 *outbuf, u32 outlen);
-#define DRBG_CTR_NULL_LEN 128
-#define DRBG_OUTSCRATCHLEN DRBG_CTR_NULL_LEN
+#define DRBG_OUTSCRATCHLEN 256
 
 /* BCC function for CTR DRBG as defined in 10.4.3 */
 static int drbg_ctr_bcc(struct drbg_state *drbg,
@@ -555,8 +554,7 @@ static int drbg_ctr_generate(struct drbg_state *drbg,
}
 
/* 10.2.1.5.2 step 4.1 */
-   ret = drbg_kcapi_sym_ctr(drbg, drbg->ctr_null_value, DRBG_CTR_NULL_LEN,
-buf, len);
+   ret = drbg_kcapi_sym_ctr(drbg, NULL, 0, buf, len);
if (ret)
return ret;
 
@@ -1644,9 +1642,6 @@ static int drbg_fini_sym_kernel(struct drbg_state *drbg)
skcipher_request_free(drbg->ctr_req);
drbg->ctr_req = NULL;
 
-   kfree(drbg->ctr_null_value_buf);
-   drbg->ctr_null_value = NULL;
-
kfree(drbg->outscratchpadbuf);
drbg->outscratchpadbuf = NULL;
 
@@ -1697,15 +1692,6 @@ static int drbg_init_sym_kernel(struct drbg_state *drbg)
crypto_req_done, >ctr_wait);
 
alignmask = crypto_skcipher_alignmask(sk_tfm);
-   drbg->ctr_null_value_buf = kzalloc(DRBG_CTR_NULL_LEN + alignmask,
-  GFP_KERNEL);
-   if (!drbg->ctr_null_value_buf) {
-   drbg_fini_sym_kernel(drbg);
-   return -ENOMEM;
-   }
-   drbg->ctr_null_value = (u8 *)PTR_ALIGN(drbg->ctr_null_value_buf,
-  alignmask + 1);
-
drbg->outscratchpadbuf = kmalloc(DRBG_OUTSCRATCHLEN + alignmask,
 GFP_KERNEL);
if (!drbg->outscratchpadbuf) {
@@ -1716,7 +1702,7 @@ static int drbg_init_sym_kernel(struct drbg_state *drbg)
  alignmask + 1);
 
sg_init_table(>sg_in, 1);
-   sg_init_table(>sg_out, 1);
+   sg_init_one(>sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN);
 
return alignmask;
 }
@@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state *drbg,
  u8 *outbuf, u32 outlen)
 {
struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out;
+   u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN);
int ret;
 
-   sg_set_buf(sg_in, inbuf, inlen);
-   sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN);
+   if (inbuf) {
+   /* Use caller-provided input buffer */
+   sg_set_buf(sg_in, inbuf, inlen);
+   } else {
+   /* Use scratchpad for in-place operation */
+