Re: FW: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-25 Thread Megha Dey
On Tue, 2017-07-25 at 10:17 +0800, Herbert Xu wrote:
> On Mon, Jul 24, 2017 at 06:09:56PM -0700, Megha Dey wrote:
> >
> > Under the skcipher interface, if both the outer and inner alg are async,
> > there should not be any problem right? Currently I do not see any
> > existing algorithms have both algorithms async.
> 
> That's because the purpose of cryptd is to turn a sync algorithm
> into an async one.
> 
> Your mcryptd is completely different.  We already went through
> this discussion for sha1-mb.  This is no different.  You should
> choose the type that fits your circumstances.

I have updated the inner algorithm to also be async in v7 patch series.
> 
> Cheers,




Re: FW: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-24 Thread Herbert Xu
On Mon, Jul 24, 2017 at 06:09:56PM -0700, Megha Dey wrote:
>
> Under the skcipher interface, if both the outer and inner alg are async,
> there should not be any problem right? Currently I do not see any
> existing algorithms have both algorithms async.

That's because the purpose of cryptd is to turn a sync algorithm
into an async one.

Your mcryptd is completely different.  We already went through
this discussion for sha1-mb.  This is no different.  You should
choose the type that fits your circumstances.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: FW: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-24 Thread Megha Dey
On Tue, 2017-07-25 at 00:53 +, Dey, Megha wrote:
> 
> -Original Message-
> From: Herbert Xu [mailto:herb...@gondor.apana.org.au] 
> Sent: Wednesday, July 19, 2017 12:02 AM
> To: Dey, Megha <megha@intel.com>
> Cc: Tim Chen <tim.c.c...@linux.intel.com>; da...@davemloft.net; 
> linux-crypto@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code
> 
> On Tue, Jul 18, 2017 at 06:18:59PM -0700, Megha Dey wrote:
> >
> > > >> +/*
> > > >> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> > > >> + * scatter-gather walk.
> > > >> + */
> > > >> +static struct skcipher_alg aes_cbc_mb_alg = {
> > > >> +  .base = {
> > > >> +  .cra_name   = "cbc(aes)",
> > > >> +  .cra_driver_name= "cbc-aes-aesni-mb",
> > > >> +  .cra_priority   = 500,
> > > >> +  .cra_flags  = CRYPTO_ALG_INTERNAL,
> > > >> +  .cra_blocksize  = AES_BLOCK_SIZE,
> > > >> +  .cra_ctxsize= CRYPTO_AES_CTX_SIZE,
> > > >> +  .cra_module = THIS_MODULE,
> > > >> +  },
> > > >> +  .min_keysize= AES_MIN_KEY_SIZE,
> > > >> +  .max_keysize= AES_MAX_KEY_SIZE,
> > > >> +  .ivsize = AES_BLOCK_SIZE,
> > > >> +  .setkey = aes_set_key,
> > > >> +  .encrypt= mb_aes_cbc_encrypt,
> > > >> +  .decrypt= mb_aes_cbc_decrypt
> > > >> +};
> > > > 
> > > > So this claims to be a sync algorithm.  Is this really the case?
> > 
> > yes, the inner algorithm is sync whereas the outer algorithm is async.
> 
> Are you saying that once mb_aes_cbc_encrypt returns it will never touch the 
> input/output again? This doesn't seem to agree with the actual code:
> 
> +   if (!ret_rctx) {
> +   /* we submitted a job, but none completed */
> +   /* just notify the caller */
> +   notify_callback(rctx, cstate, -EINPROGRESS);
> +   return 0;
> +   }
> 
> Just because an algorithm is internal doesn't mean that it can arbitrarily 
> violate the crypto API requirements.
> 
> Cheers,

Hi Herbert,

Under the skcipher interface, if both the outer and inner alg are async,
there should not be any problem right? Currently I do not see any
existing algorithms have both algorithms async.
> --
> Email: Herbert Xu <herb...@gondor.apana.org.au> Home Page: 
> http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt




Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-19 Thread Herbert Xu
On Tue, Jul 18, 2017 at 06:18:59PM -0700, Megha Dey wrote:
>
> > >> +/*
> > >> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> > >> + * scatter-gather walk.
> > >> + */
> > >> +static struct skcipher_alg aes_cbc_mb_alg = {
> > >> +.base = {
> > >> +.cra_name   = "cbc(aes)",
> > >> +.cra_driver_name= "cbc-aes-aesni-mb",
> > >> +.cra_priority   = 500,
> > >> +.cra_flags  = CRYPTO_ALG_INTERNAL,
> > >> +.cra_blocksize  = AES_BLOCK_SIZE,
> > >> +.cra_ctxsize= CRYPTO_AES_CTX_SIZE,
> > >> +.cra_module = THIS_MODULE,
> > >> +},
> > >> +.min_keysize= AES_MIN_KEY_SIZE,
> > >> +.max_keysize= AES_MAX_KEY_SIZE,
> > >> +.ivsize = AES_BLOCK_SIZE,
> > >> +.setkey = aes_set_key,
> > >> +.encrypt= mb_aes_cbc_encrypt,
> > >> +.decrypt= mb_aes_cbc_decrypt
> > >> +};
> > > 
> > > So this claims to be a sync algorithm.  Is this really the case?
> 
> yes, the inner algorithm is sync whereas the outer algorithm is async.

Are you saying that once mb_aes_cbc_encrypt returns it will never
touch the input/output again? This doesn't seem to agree with the
actual code:

+   if (!ret_rctx) {
+   /* we submitted a job, but none completed */
+   /* just notify the caller */
+   notify_callback(rctx, cstate, -EINPROGRESS);
+   return 0;
+   }

Just because an algorithm is internal doesn't mean that it can
arbitrarily violate the crypto API requirements.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-18 Thread Megha Dey
On Tue, 2017-07-18 at 17:52 -0700, Tim Chen wrote:
> On 07/17/2017 10:41 PM, Herbert Xu wrote:
> > On Tue, Jun 27, 2017 at 05:26:13PM -0700, Megha Dey wrote:
> >>
> >> +static void completion_callback(struct mcryptd_skcipher_request_ctx *rctx,
> >> +  struct mcryptd_alg_cstate *cstate,
> >> +  int err)
> >> +{
> >> +  struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
> >> +
> >> +   /* remove from work list and invoke completion callback */
> >> +  spin_lock(>work_lock);
> >> +  list_del(>waiter);
> >> +  spin_unlock(>work_lock);
> >> +
> >> +  if (irqs_disabled())
> >> +  rctx->complete(>base, err);
> >> +  else {
> >> +  local_bh_disable();
> >> +  rctx->complete(>base, err);
> >> +  local_bh_enable();
> >> +  }
> >> +}
> > 
> > The fact that you need to do this check means that this design is
> > wrong.  You should always know what context you are in.
> > 
> 
> I think you are right.  The irqs_disabled check is not necessary
> as we only call this function in the context of the mcryptd thread.
> When I wrote the original mb algorithms I was probably unsure
> and put this check in as a precaution in other mb algorithms and
> Megha did the same.

I will make this change.
> 
> >> +/*
> >> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> >> + * scatter-gather walk.
> >> + */
> >> +static struct skcipher_alg aes_cbc_mb_alg = {
> >> +  .base = {
> >> +  .cra_name   = "cbc(aes)",
> >> +  .cra_driver_name= "cbc-aes-aesni-mb",
> >> +  .cra_priority   = 500,
> >> +  .cra_flags  = CRYPTO_ALG_INTERNAL,
> >> +  .cra_blocksize  = AES_BLOCK_SIZE,
> >> +  .cra_ctxsize= CRYPTO_AES_CTX_SIZE,
> >> +  .cra_module = THIS_MODULE,
> >> +  },
> >> +  .min_keysize= AES_MIN_KEY_SIZE,
> >> +  .max_keysize= AES_MAX_KEY_SIZE,
> >> +  .ivsize = AES_BLOCK_SIZE,
> >> +  .setkey = aes_set_key,
> >> +  .encrypt= mb_aes_cbc_encrypt,
> >> +  .decrypt= mb_aes_cbc_decrypt
> >> +};
> > 
> > So this claims to be a sync algorithm.  Is this really the case?

yes, the inner algorithm is sync whereas the outer algorithm is async.
> > 
> > Cheers,
> > 
> 




Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-18 Thread Tim Chen
On 07/17/2017 10:41 PM, Herbert Xu wrote:
> On Tue, Jun 27, 2017 at 05:26:13PM -0700, Megha Dey wrote:
>>
>> +static void completion_callback(struct mcryptd_skcipher_request_ctx *rctx,
>> +struct mcryptd_alg_cstate *cstate,
>> +int err)
>> +{
>> +struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
>> +
>> +   /* remove from work list and invoke completion callback */
>> +spin_lock(>work_lock);
>> +list_del(>waiter);
>> +spin_unlock(>work_lock);
>> +
>> +if (irqs_disabled())
>> +rctx->complete(>base, err);
>> +else {
>> +local_bh_disable();
>> +rctx->complete(>base, err);
>> +local_bh_enable();
>> +}
>> +}
> 
> The fact that you need to do this check means that this design is
> wrong.  You should always know what context you are in.
> 

I think you are right.  The irqs_disabled check is not necessary
as we only call this function in the context of the mcryptd thread.
When I wrote the original mb algorithms I was probably unsure
and put this check in as a precaution in other mb algorithms and
Megha did the same.

>> +/*
>> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
>> + * scatter-gather walk.
>> + */
>> +static struct skcipher_alg aes_cbc_mb_alg = {
>> +.base = {
>> +.cra_name   = "cbc(aes)",
>> +.cra_driver_name= "cbc-aes-aesni-mb",
>> +.cra_priority   = 500,
>> +.cra_flags  = CRYPTO_ALG_INTERNAL,
>> +.cra_blocksize  = AES_BLOCK_SIZE,
>> +.cra_ctxsize= CRYPTO_AES_CTX_SIZE,
>> +.cra_module = THIS_MODULE,
>> +},
>> +.min_keysize= AES_MIN_KEY_SIZE,
>> +.max_keysize= AES_MAX_KEY_SIZE,
>> +.ivsize = AES_BLOCK_SIZE,
>> +.setkey = aes_set_key,
>> +.encrypt= mb_aes_cbc_encrypt,
>> +.decrypt= mb_aes_cbc_decrypt
>> +};
> 
> So this claims to be a sync algorithm.  Is this really the case?
> 
> Cheers,
> 



Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-17 Thread Herbert Xu
On Tue, Jun 27, 2017 at 05:26:13PM -0700, Megha Dey wrote:
>
> +static void completion_callback(struct mcryptd_skcipher_request_ctx *rctx,
> + struct mcryptd_alg_cstate *cstate,
> + int err)
> +{
> + struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
> +
> +   /* remove from work list and invoke completion callback */
> + spin_lock(>work_lock);
> + list_del(>waiter);
> + spin_unlock(>work_lock);
> +
> + if (irqs_disabled())
> + rctx->complete(>base, err);
> + else {
> + local_bh_disable();
> + rctx->complete(>base, err);
> + local_bh_enable();
> + }
> +}

The fact that you need to do this check means that this design is
wrong.  You should always know what context you are in.

> +/*
> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> + * scatter-gather walk.
> + */
> +static struct skcipher_alg aes_cbc_mb_alg = {
> + .base = {
> + .cra_name   = "cbc(aes)",
> + .cra_driver_name= "cbc-aes-aesni-mb",
> + .cra_priority   = 500,
> + .cra_flags  = CRYPTO_ALG_INTERNAL,
> + .cra_blocksize  = AES_BLOCK_SIZE,
> + .cra_ctxsize= CRYPTO_AES_CTX_SIZE,
> + .cra_module = THIS_MODULE,
> + },
> + .min_keysize= AES_MIN_KEY_SIZE,
> + .max_keysize= AES_MAX_KEY_SIZE,
> + .ivsize = AES_BLOCK_SIZE,
> + .setkey = aes_set_key,
> + .encrypt= mb_aes_cbc_encrypt,
> + .decrypt= mb_aes_cbc_decrypt
> +};

So this claims to be a sync algorithm.  Is this really the case?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-06-27 Thread Megha Dey
This patch introduces the multi-buffer job manager which is responsible
for submitting scatter-gather buffers from several AES CBC jobs
to the multi-buffer algorithm. The glue code interfaces with the
underlying algorithm that handles 8 data streams of AES CBC encryption
in parallel. AES key expansion and CBC decryption requests are performed
in a manner similar to the existing AESNI Intel glue driver.

The outline of the algorithm for AES CBC encryption requests is
sketched below:

Any driver requesting the crypto service will place an async crypto
request on the workqueue.  The multi-buffer crypto daemon will pull an
AES CBC encryption request from work queue and put each request in an
empty data lane for multi-buffer crypto computation.  When all the empty
lanes are filled, computation will commence on the jobs in parallel and
the job with the shortest remaining buffer will get completed and be
returned. To prevent prolonged stall, when no new jobs arrive, we will
flush workqueue of jobs after a maximum allowable delay has elapsed.

To accommodate the fragmented nature of scatter-gather, we will keep
submitting the next scatter-buffer fragment for a job for multi-buffer
computation until a job is completed and no more buffer fragments
remain. At that time we will pull a new job to fill the now empty data slot.
We check with the multibuffer scheduler to see if there are other
completed jobs to prevent extraneous delay in returning any completed
jobs.

This multi-buffer algorithm should be used for cases where we get at
least 8 streams of crypto jobs submitted at a reasonably high rate.
For low crypto job submission rate and low number of data streams, this
algorithm will not be beneficial. The reason is at low rate, we do not
fill out the data lanes before flushing the jobs instead of processing
them with all the data lanes full. We will miss the benefit of parallel
computation, and adding delay to the processing of the crypto job at the
same time.  Some tuning of the maximum latency parameter may be needed
to get the best performance.

Originally-by: Chandramouli Narayanan 
Signed-off-by: Megha Dey 
Acked-by: Tim Chen 
---
 arch/x86/crypto/Makefile|   1 +
 arch/x86/crypto/aes-cbc-mb/Makefile |  22 +
 arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c | 727 
 3 files changed, 750 insertions(+)
 create mode 100644 arch/x86/crypto/aes-cbc-mb/Makefile
 create mode 100644 arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index 34b3fa2..cc556a7 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_CRYPTO_CRC32_PCLMUL) += crc32-pclmul.o
 obj-$(CONFIG_CRYPTO_SHA256_SSSE3) += sha256-ssse3.o
 obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
 obj-$(CONFIG_CRYPTO_CRCT10DIF_PCLMUL) += crct10dif-pclmul.o
+obj-$(CONFIG_CRYPTO_AES_CBC_MB) += aes-cbc-mb/
 obj-$(CONFIG_CRYPTO_POLY1305_X86_64) += poly1305-x86_64.o
 
 # These modules require assembler to support AVX.
diff --git a/arch/x86/crypto/aes-cbc-mb/Makefile 
b/arch/x86/crypto/aes-cbc-mb/Makefile
new file mode 100644
index 000..b642bd8
--- /dev/null
+++ b/arch/x86/crypto/aes-cbc-mb/Makefile
@@ -0,0 +1,22 @@
+#
+# Arch-specific CryptoAPI modules.
+#
+
+avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no)
+
+# we need decryption and key expansion routine symbols
+# if either AESNI_NI_INTEL or AES_CBC_MB is a module
+
+ifeq ($(CONFIG_CRYPTO_AES_NI_INTEL),m)
+   dec_support := ../aesni-intel_asm.o
+endif
+ifeq ($(CONFIG_CRYPTO_AES_CBC_MB),m)
+   dec_support := ../aesni-intel_asm.o
+endif
+
+ifeq ($(avx_supported),yes)
+   obj-$(CONFIG_CRYPTO_AES_CBC_MB) += aes-cbc-mb.o
+   aes-cbc-mb-y := $(dec_support) aes_cbc_mb.o aes_mb_mgr_init.o \
+   mb_mgr_inorder_x8_asm.o mb_mgr_ooo_x8_asm.o \
+   aes_cbc_enc_x8.o
+endif
diff --git a/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c 
b/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c
new file mode 100644
index 000..7d38041
--- /dev/null
+++ b/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c
@@ -0,0 +1,727 @@
+/*
+ * Multi buffer AES CBC algorithm glue code
+ *
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2016 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Contact Information:
+