Re: [PATCH] crypto: CTR DRBG - in-place cipher operation
On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote: > The cipher implementations of the kernel crypto API favor in-place > cipher operations. Thus, switch the CTR cipher operation in the DRBG to > perform in-place operations. This is implemented by using the output > buffer as input buffer and zeroizing it before the cipher operation to > implement a CTR encryption of a NULL buffer. > > The speed improvement is quite visibile with the following comparison > using the LRNG implementation. > > Without the patch set: > > 16 bytes| 12.267661 MB/s|61338304 bytes | 500213 ns > 32 bytes| 23.603770 MB/s| 118018848 bytes | 500073 ns > 64 bytes| 46.732262 MB/s| 233661312 bytes | 500241 ns > 128 bytes| 90.038042 MB/s| 450190208 bytes | 500244 ns > 256 bytes| 160.399616 MB/s| 801998080 bytes | 500393 ns > 512 bytes| 259.878400 MB/s| 1299392000 bytes | 501675 ns > 1024 bytes| 386.050662 MB/s| 1930253312 bytes | 501661 ns > 2048 bytes| 493.641728 MB/s| 2468208640 bytes | 501598 ns > 4096 bytes| 581.835981 MB/s| 2909179904 bytes | 503426 ns > > With the patch set: > > 16 bytes | 17.051142 MB/s | 85255712 bytes | 500854 ns > 32 bytes | 32.695898 MB/s |163479488 bytes | 500544 ns > 64 bytes | 64.490739 MB/s |322453696 bytes | 500954 ns > 128 bytes |123.285043 MB/s |616425216 bytes | 500201 ns > 256 bytes |233.434573 MB/s | 1167172864 bytes | 500573 ns > 512 bytes |384.405197 MB/s | 1922025984 bytes | 500671 ns > 1024 bytes |566.313370 MB/s | 2831566848 bytes | 501080 ns > 2048 bytes |744.518042 MB/s | 3722590208 bytes | 500926 ns > 4096 bytes |867.501670 MB/s | 4337508352 bytes | 502181 ns > > Signed-off-by: Stephan Mueller Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: CTR DRBG - in-place cipher operation
On Fri, Jul 27, 2018 at 01:49:08PM +0200, Stephan Mueller wrote: > > This is guaranteed by the invokers of drbg_kcapi_sym_ctr as there are two > only: > > - the one in drbg_ctr_update uses the scratchpad for inbuf > > - the one in drbg_ctr_generate uses NULL which implies that the outscratchpad > is used. I see. I missed the fact that the buffer provided by the user is the output buffer and not the input buffer. Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: CTR DRBG - in-place cipher operation
Am Freitag, 27. Juli 2018, 12:53:35 CEST schrieb Herbert Xu: Hi Herbert, > On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote: > > @@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state > > *drbg,> > > u8 *outbuf, u32 outlen) > > > > { > > > > struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out; > > > > + u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN); > > > > int ret; > > > > - sg_set_buf(sg_in, inbuf, inlen); > > - sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN); > > + if (inbuf) { > > + /* Use caller-provided input buffer */ > > + sg_set_buf(sg_in, inbuf, inlen); > > + } else { > > + /* Use scratchpad for in-place operation */ > > + inlen = scratchpad_use; > > + memset(drbg->outscratchpad, 0, scratchpad_use); > > + sg_set_buf(sg_in, drbg->outscratchpad, scratchpad_use); > > + } > > What guarantees that inbuf isn't on the stack? This is guaranteed by the invokers of drbg_kcapi_sym_ctr as there are two only: - the one in drbg_ctr_update uses the scratchpad for inbuf - the one in drbg_ctr_generate uses NULL which implies that the outscratchpad is used. Ciao Stephan
Re: [PATCH] crypto: CTR DRBG - in-place cipher operation
On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote: > > @@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state *drbg, > u8 *outbuf, u32 outlen) > { > struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out; > + u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN); > int ret; > > - sg_set_buf(sg_in, inbuf, inlen); > - sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN); > + if (inbuf) { > + /* Use caller-provided input buffer */ > + sg_set_buf(sg_in, inbuf, inlen); > + } else { > + /* Use scratchpad for in-place operation */ > + inlen = scratchpad_use; > + memset(drbg->outscratchpad, 0, scratchpad_use); > + sg_set_buf(sg_in, drbg->outscratchpad, scratchpad_use); > + } What guarantees that inbuf isn't on the stack? I think rather than doing this we need to fix the existing code to copy inbuf onto the scratch pad and then do in-place operation on that. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] crypto: CTR DRBG - in-place cipher operation
The cipher implementations of the kernel crypto API favor in-place cipher operations. Thus, switch the CTR cipher operation in the DRBG to perform in-place operations. This is implemented by using the output buffer as input buffer and zeroizing it before the cipher operation to implement a CTR encryption of a NULL buffer. The speed improvement is quite visibile with the following comparison using the LRNG implementation. Without the patch set: 16 bytes| 12.267661 MB/s|61338304 bytes | 500213 ns 32 bytes| 23.603770 MB/s| 118018848 bytes | 500073 ns 64 bytes| 46.732262 MB/s| 233661312 bytes | 500241 ns 128 bytes| 90.038042 MB/s| 450190208 bytes | 500244 ns 256 bytes| 160.399616 MB/s| 801998080 bytes | 500393 ns 512 bytes| 259.878400 MB/s| 1299392000 bytes | 501675 ns 1024 bytes| 386.050662 MB/s| 1930253312 bytes | 501661 ns 2048 bytes| 493.641728 MB/s| 2468208640 bytes | 501598 ns 4096 bytes| 581.835981 MB/s| 2909179904 bytes | 503426 ns With the patch set: 16 bytes | 17.051142 MB/s | 85255712 bytes | 500854 ns 32 bytes | 32.695898 MB/s |163479488 bytes | 500544 ns 64 bytes | 64.490739 MB/s |322453696 bytes | 500954 ns 128 bytes |123.285043 MB/s |616425216 bytes | 500201 ns 256 bytes |233.434573 MB/s | 1167172864 bytes | 500573 ns 512 bytes |384.405197 MB/s | 1922025984 bytes | 500671 ns 1024 bytes |566.313370 MB/s | 2831566848 bytes | 501080 ns 2048 bytes |744.518042 MB/s | 3722590208 bytes | 500926 ns 4096 bytes |867.501670 MB/s | 4337508352 bytes | 502181 ns Signed-off-by: Stephan Mueller --- crypto/drbg.c | 34 ++ include/crypto/drbg.h | 2 -- 2 files changed, 14 insertions(+), 22 deletions(-) diff --git a/crypto/drbg.c b/crypto/drbg.c index ee302fd229ad..bc52d9562611 100644 --- a/crypto/drbg.c +++ b/crypto/drbg.c @@ -261,8 +261,7 @@ static int drbg_fini_sym_kernel(struct drbg_state *drbg); static int drbg_kcapi_sym_ctr(struct drbg_state *drbg, u8 *inbuf, u32 inbuflen, u8 *outbuf, u32 outlen); -#define DRBG_CTR_NULL_LEN 128 -#define DRBG_OUTSCRATCHLEN DRBG_CTR_NULL_LEN +#define DRBG_OUTSCRATCHLEN 256 /* BCC function for CTR DRBG as defined in 10.4.3 */ static int drbg_ctr_bcc(struct drbg_state *drbg, @@ -555,8 +554,7 @@ static int drbg_ctr_generate(struct drbg_state *drbg, } /* 10.2.1.5.2 step 4.1 */ - ret = drbg_kcapi_sym_ctr(drbg, drbg->ctr_null_value, DRBG_CTR_NULL_LEN, -buf, len); + ret = drbg_kcapi_sym_ctr(drbg, NULL, 0, buf, len); if (ret) return ret; @@ -1644,9 +1642,6 @@ static int drbg_fini_sym_kernel(struct drbg_state *drbg) skcipher_request_free(drbg->ctr_req); drbg->ctr_req = NULL; - kfree(drbg->ctr_null_value_buf); - drbg->ctr_null_value = NULL; - kfree(drbg->outscratchpadbuf); drbg->outscratchpadbuf = NULL; @@ -1697,15 +1692,6 @@ static int drbg_init_sym_kernel(struct drbg_state *drbg) crypto_req_done, >ctr_wait); alignmask = crypto_skcipher_alignmask(sk_tfm); - drbg->ctr_null_value_buf = kzalloc(DRBG_CTR_NULL_LEN + alignmask, - GFP_KERNEL); - if (!drbg->ctr_null_value_buf) { - drbg_fini_sym_kernel(drbg); - return -ENOMEM; - } - drbg->ctr_null_value = (u8 *)PTR_ALIGN(drbg->ctr_null_value_buf, - alignmask + 1); - drbg->outscratchpadbuf = kmalloc(DRBG_OUTSCRATCHLEN + alignmask, GFP_KERNEL); if (!drbg->outscratchpadbuf) { @@ -1716,7 +1702,7 @@ static int drbg_init_sym_kernel(struct drbg_state *drbg) alignmask + 1); sg_init_table(>sg_in, 1); - sg_init_table(>sg_out, 1); + sg_init_one(>sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN); return alignmask; } @@ -1747,10 +1733,18 @@ static int drbg_kcapi_sym_ctr(struct drbg_state *drbg, u8 *outbuf, u32 outlen) { struct scatterlist *sg_in = >sg_in, *sg_out = >sg_out; + u32 scratchpad_use = min_t(u32, outlen, DRBG_OUTSCRATCHLEN); int ret; - sg_set_buf(sg_in, inbuf, inlen); - sg_set_buf(sg_out, drbg->outscratchpad, DRBG_OUTSCRATCHLEN); + if (inbuf) { + /* Use caller-provided input buffer */ + sg_set_buf(sg_in, inbuf, inlen); + } else { + /* Use scratchpad for in-place operation */ +