Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-15 Thread Tim Chen
On Sat, 2015-12-12 at 13:48 +0800, Herbert Xu wrote:
> On Fri, Dec 11, 2015 at 08:54:40AM -0800, Tim Chen wrote:
> > A direct call, I assume, has less overhead.  Let me think about
> 
> static inline int crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
> {
> 	struct ablkcipher_tfm *crt =
> 		crypto_ablkcipher_crt(crypto_ablkcipher_reqtfm(req));
> 	return crt->encrypt(req);
> }
> 
> I don't see any overhead there that you could eliminate by calling
> crt->encrypt directly.
> 
> Cheers,

Okay, I'll attempt to convert the inner cipher to an ablkcipher.  It may
take me a bit of time.
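
Roughly, I'm picturing the internal alg registered as an ablkcipher along
the lines of the sketch below.  This is only a sketch to frame the work:
the *_ablk names are placeholders, and the setkey/encrypt/decrypt ops will
need to grow the ablkcipher signatures.

/* Rough sketch only -- placeholder names, not the eventual patch. */
static struct crypto_alg aes_cbc_mb_ablk_alg = {
	.cra_name		= "__cbc-aes-aesni-mb",
	.cra_driver_name	= "__driver-cbc-aes-aesni-mb",
	.cra_priority		= 100,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC
				  | CRYPTO_ALG_INTERNAL,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct crypto_aes_ctx) + AESNI_ALIGN - 1,
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_u = {
		.ablkcipher = {
			.min_keysize	= AES_MIN_KEY_SIZE,
			.max_keysize	= AES_MAX_KEY_SIZE,
			.ivsize		= AES_BLOCK_SIZE,
			/* placeholders: these take ablkcipher types/requests */
			.setkey		= aes_cbc_mb_ablk_setkey,
			.encrypt	= mb_aes_cbc_encrypt_ablk,
			.decrypt	= mb_aes_cbc_decrypt_ablk,
		},
	},
};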

Thanks.

Tim



Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-11 Thread Tim Chen
On Fri, 2015-12-11 at 11:23 +0800, Herbert Xu wrote:
> On Thu, Dec 10, 2015 at 08:39:45AM -0800, Tim Chen wrote:
> >
> > The inner cipher is called synchronously from the outer layer 
> > async cipher algorithm in cbc_mb_async_ablk_decrypt via
> > 
> > err = crypto_blkcipher_crt(child_tfm)->decrypt(
> > 		&rctx->desc, req->dst, req->src, req->nbytes);
> 
> We should avoid bypassing the API and directly calling crypto
> driver functions like this.
> 
> What if you made it a proper ablkcipher and just called it through
> the API?
> 
> Thanks,

A direct call, I assume, has less overhead.  Let me think about
making the inner cipher an ablkcipher.  I may need to rewrite a lot
of the glue layer :(

Tim



Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-11 Thread Herbert Xu
On Fri, Dec 11, 2015 at 08:54:40AM -0800, Tim Chen wrote:
> A direct call, I assume, has less overhead.  Let me think about

static inline int crypto_ablkcipher_encrypt(struct ablkcipher_request *req)
{
	struct ablkcipher_tfm *crt =
		crypto_ablkcipher_crt(crypto_ablkcipher_reqtfm(req));
	return crt->encrypt(req);
}

I don't see any overhead there that you could eliminate by calling
crt->encrypt directly.
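
IOW the glue code would just set the child tfm on the request and call
through the API.  Something like this hypothetical helper (not from any
existing driver, just to show the call path):

/* Hypothetical helper: submit to the inner ablkcipher through the API. */
static int mb_call_child_encrypt(struct crypto_ablkcipher *child,
				 struct ablkcipher_request *req,
				 struct scatterlist *dst,
				 struct scatterlist *src,
				 unsigned int nbytes, void *iv)
{
	ablkcipher_request_set_tfm(req, child);
	ablkcipher_request_set_crypt(req, src, dst, nbytes, iv);
	return crypto_ablkcipher_encrypt(req);
}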

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-10 Thread Tim Chen
On Thu, 2015-12-10 at 09:45 +0800, Herbert Xu wrote:
> On Wed, Dec 09, 2015 at 09:23:14AM -0800, Tim Chen wrote:
> > 
> > This is an internal algorithm.  We are indeed casting the request
> > to the outer ablkcipher request when we do the async cipher walk.
> 
> The question remains: why does it have to be a blkcipher rather
> than an ablkcipher?
> 
> Cheers,

The inner cipher is called synchronously from the outer layer 
async cipher algorithm in cbc_mb_async_ablk_decrypt via

err = crypto_blkcipher_crt(child_tfm)->decrypt(
		&rctx->desc, req->dst, req->src, req->nbytes);

or directly by mcryptd_queue_worker via mcryptd_blkcipher_crypt.  We
do not allow this operation to return something like EINPROGRESS.

But the walk of the scatterlist is done with the async walk.
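
For reference, the synchronous entry just fills in a blkcipher_desc for
the child tfm and invokes its crt op, roughly following the pattern below
(a sketch only; the exact desc setup in the patch may differ):

	/* Sketch of the synchronous entry; desc lives in the request ctx. */
	struct blkcipher_desc *d = &rctx->desc;

	d->tfm = child_tfm;
	d->info = req->info;	/* IV from the outer ablkcipher request */
	d->flags = 0;

	err = crypto_blkcipher_crt(child_tfm)->decrypt(d, req->dst, req->src,
						       req->nbytes);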

Thanks.

Tim



Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-10 Thread Herbert Xu
On Thu, Dec 10, 2015 at 08:39:45AM -0800, Tim Chen wrote:
>
> The inner cipher is called synchronously from the outer layer 
> async cipher algorithm in cbc_mb_async_ablk_decrypt via
> 
> err = crypto_blkcipher_crt(child_tfm)->decrypt(
> 		&rctx->desc, req->dst, req->src, req->nbytes);

We should avoid bypassing the API and directly calling crypto
driver functions like this.

What if you made it a proper ablkcipher and just called it through
the API?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-09 Thread Herbert Xu
On Wed, Dec 09, 2015 at 09:23:14AM -0800, Tim Chen wrote:
> 
> This is an internal algorithm.  We are indeed casting the request
> to the outer ablkcipher request when we do the async cipher walk.

The question remains: why does it have to be a blkcipher rather
than an ablkcipher?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-09 Thread Tim Chen
On Wed, 2015-12-09 at 10:52 +0800, Herbert Xu wrote:
> On Wed, Dec 02, 2015 at 12:02:45PM -0800, Tim Chen wrote:
> >
> > +/*
> > + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> > + * scatter-gather walk.
> > + */
> > +
> > +static struct crypto_alg aes_cbc_mb_alg = {
> > +   .cra_name   = "__cbc-aes-aesni-mb",
> > +   .cra_driver_name= "__driver-cbc-aes-aesni-mb",
> > +   .cra_priority   = 100,
> > +   .cra_flags  = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_ASYNC
> > +   | CRYPTO_ALG_INTERNAL,
> > +   .cra_blocksize  = AES_BLOCK_SIZE,
> > +   .cra_ctxsize= sizeof(struct crypto_aes_ctx) +
> > + AESNI_ALIGN - 1,
> > +   .cra_alignmask  = 0,
> > +   .cra_type   = _blkcipher_type,
> > +   .cra_module = THIS_MODULE,
> > +   .cra_list   = LIST_HEAD_INIT(aes_cbc_mb_alg.cra_list),
> > +   .cra_u = {
> > +   .blkcipher = {
> > +   .min_keysize= AES_MIN_KEY_SIZE,
> > +   .max_keysize= AES_MAX_KEY_SIZE,
> > +   .ivsize = AES_BLOCK_SIZE,
> > +   .setkey = aes_set_key,
> > +   .encrypt= mb_aes_cbc_encrypt,
> > +   .decrypt= mb_aes_cbc_decrypt
> > +   },
> > +   },
> > +};
> 
> So why do we still need this? Shouldn't a single ablkcipher cover
> all the cases?
> 
> Thanks,

This is an internal algorithm.  We are indeed casting the request
to the outer ablkcipher request when we do the async cipher walk.

See

static int mb_aes_cbc_decrypt(struct blkcipher_desc *desc,
		struct scatterlist *dst, struct scatterlist *src,
		unsigned int nbytes)
{
	struct crypto_aes_ctx *aesni_ctx;
	struct mcryptd_blkcipher_request_ctx *rctx =
		container_of(desc, struct mcryptd_blkcipher_request_ctx, desc);
	struct ablkcipher_request *req;
	bool is_mcryptd_req;
	unsigned long src_paddr;
	unsigned long dst_paddr;
	int err;

	/* note here whether it is mcryptd req */
	is_mcryptd_req = desc->flags & CRYPTO_TFM_REQ_MAY_SLEEP;
	req = cast_mcryptd_ctx_to_req(rctx);
	aesni_ctx = aes_ctx(crypto_blkcipher_ctx(desc->tfm));

	ablkcipher_walk_init(&rctx->walk, dst, src, nbytes);
	err = ablkcipher_walk_phys(req, &rctx->walk);
	if (err || !rctx->walk.nbytes)
		goto done1;
	desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;

	kernel_fpu_begin();
	while ((nbytes = rctx->walk.nbytes)) {
		src_paddr = (page_to_phys(rctx->walk.src.page) +
			     rctx->walk.src.offset);
		dst_paddr = (page_to_phys(rctx->walk.dst.page) +
			     rctx->walk.dst.offset);
		aesni_cbc_dec(aesni_ctx, phys_to_virt(dst_paddr),
			      phys_to_virt(src_paddr),
			      rctx->walk.nbytes & AES_BLOCK_MASK,
			      rctx->walk.iv);
		nbytes &= AES_BLOCK_SIZE - 1;
		err = ablkcipher_walk_done(req, &rctx->walk, nbytes);
		if (err)
			goto done2;
	}
done2:
	kernel_fpu_end();
done1:
	ablkcipher_walk_complete(&rctx->walk);
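
The cast_mcryptd_ctx_to_req() above recovers the outer ablkcipher request
from the request context.  Presumably it is just a container_of() cast
along the lines of the helper the SHA1 multi-buffer code uses, i.e.
something like (shown as an assumption, not the literal definition in the
patch):

#define cast_mcryptd_ctx_to_req(x)	((struct ablkcipher_request *)	\
	container_of((void *) (x), struct ablkcipher_request, __ctx))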


Thanks.

Tim



Re: [PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-08 Thread Herbert Xu
On Wed, Dec 02, 2015 at 12:02:45PM -0800, Tim Chen wrote:
>
> +/*
> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> + * scatter-gather walk.
> + */
> +
> +static struct crypto_alg aes_cbc_mb_alg = {
> + .cra_name   = "__cbc-aes-aesni-mb",
> + .cra_driver_name= "__driver-cbc-aes-aesni-mb",
> + .cra_priority   = 100,
> + .cra_flags  = CRYPTO_ALG_TYPE_BLKCIPHER | CRYPTO_ALG_ASYNC
> + | CRYPTO_ALG_INTERNAL,
> + .cra_blocksize  = AES_BLOCK_SIZE,
> + .cra_ctxsize= sizeof(struct crypto_aes_ctx) +
> +   AESNI_ALIGN - 1,
> + .cra_alignmask  = 0,
> + .cra_type   = &crypto_blkcipher_type,
> + .cra_module = THIS_MODULE,
> + .cra_list   = LIST_HEAD_INIT(aes_cbc_mb_alg.cra_list),
> + .cra_u = {
> + .blkcipher = {
> + .min_keysize= AES_MIN_KEY_SIZE,
> + .max_keysize= AES_MAX_KEY_SIZE,
> + .ivsize = AES_BLOCK_SIZE,
> + .setkey = aes_set_key,
> + .encrypt= mb_aes_cbc_encrypt,
> + .decrypt= mb_aes_cbc_decrypt
> + },
> + },
> +};

So why do we still need this? Shouldn't a single ablkcipher cover
all the cases?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH v4 5/5] crypto: AES CBC multi-buffer glue code

2015-12-02 Thread Tim Chen

This patch introduces the multi-buffer job manager which is responsible
for submitting scatter-gather buffers from several AES CBC jobs
to the multi-buffer algorithm. The glue code interfaces with the
underlying algorithm that handles 8 data streams of AES CBC encryption
in parallel. AES key expansion and CBC decryption requests are performed
in a manner similar to the existing AESNI Intel glue driver.

The outline of the algorithm for AES CBC encryption requests is
sketched below:

Any driver requesting the crypto service will place an async crypto
request on the workqueue.  The multi-buffer crypto daemon pulls AES
CBC encryption requests from the work queue and puts each request in
an empty data lane for multi-buffer crypto computation.  When all the
empty lanes are filled, computation commences on the jobs in parallel,
and the job with the shortest remaining buffer completes first and is
returned.  To prevent a prolonged stall when no new jobs arrive, we
flush the queued jobs after a maximum allowable delay has elapsed.
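
For illustration, the lane-filling and flush behavior described above
boils down to a loop of roughly the following shape.  This is illustrative
C only; none of these identifiers (mb_mgr, mb_job, MAX_FLUSH_DELAY, the
helper functions) appear in the patch.

/* Illustrative sketch of the crypto daemon's job flow (made-up names). */
static void mb_daemon_loop(struct mb_mgr *mgr)
{
	struct mb_job *job;

	while (lanes_busy(mgr) || !queue_empty(mgr)) {
		/* Fill empty data lanes with requests from the work queue. */
		while (lane_available(mgr) && (job = dequeue_request(mgr)))
			submit_to_lane(mgr, job);

		/* Return whichever jobs the multi-buffer scheduler finished. */
		while ((job = get_completed_job(mgr)))
			complete_request(job);

		/*
		 * If no new job has arrived for MAX_FLUSH_DELAY, flush the
		 * partially filled lanes so queued jobs are not stalled.
		 */
		if (time_after(jiffies, mgr->last_submit + MAX_FLUSH_DELAY))
			flush_lanes(mgr);
	}
}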

To accommodate the fragmented nature of scatter-gather lists, we keep
submitting the next scatter-gather buffer fragment of a job for
multi-buffer computation until the job is completed and no more buffer
fragments remain.  At that time we pull a new job to fill the now
empty data slot.  We also check with the multi-buffer scheduler for
other completed jobs, to prevent extraneous delay in returning any
completed job.
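
Likewise, when a lane reports that a buffer fragment has finished, the
handling amounts to something like the following (again illustrative
only, with made-up names):

/* Illustrative only: names are made up, not from the patch. */
static void lane_fragment_done(struct mb_mgr *mgr, struct mb_job *job)
{
	if (job_has_more_fragments(job)) {
		/* Keep feeding the next scatter-gather fragment of this job. */
		submit_fragment(mgr, job, next_fragment(job));
		return;
	}

	/* Job complete: return it and refill the now-empty data slot. */
	complete_request(job);
	if ((job = dequeue_request(mgr)))
		submit_to_lane(mgr, job);
}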

This multi-buffer algorithm should be used for cases where we get at
least 8 streams of crypto jobs submitted at a reasonably high rate.
For a low crypto job submission rate or a low number of data streams,
this algorithm will not be beneficial.  The reason is that at a low
rate we do not fill all the data lanes before the jobs are flushed,
so they are processed without the lanes full.  We then miss the
benefit of parallel computation while adding latency to each crypto
job at the same time.  Some tuning of the maximum latency parameter
may be needed to get the best performance.

Originally-by: Chandramouli Narayanan 
Signed-off-by: Tim Chen 
---
 arch/x86/crypto/Makefile|   1 +
 arch/x86/crypto/aes-cbc-mb/Makefile |  22 +
 arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c | 835 
 3 files changed, 858 insertions(+)
 create mode 100644 arch/x86/crypto/aes-cbc-mb/Makefile
 create mode 100644 arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c

diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile
index b9b912a..000db49 100644
--- a/arch/x86/crypto/Makefile
+++ b/arch/x86/crypto/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_CRYPTO_CRC32_PCLMUL) += crc32-pclmul.o
 obj-$(CONFIG_CRYPTO_SHA256_SSSE3) += sha256-ssse3.o
 obj-$(CONFIG_CRYPTO_SHA512_SSSE3) += sha512-ssse3.o
 obj-$(CONFIG_CRYPTO_CRCT10DIF_PCLMUL) += crct10dif-pclmul.o
+obj-$(CONFIG_CRYPTO_AES_CBC_MB) += aes-cbc-mb/
 obj-$(CONFIG_CRYPTO_POLY1305_X86_64) += poly1305-x86_64.o
 
 # These modules require assembler to support AVX.
diff --git a/arch/x86/crypto/aes-cbc-mb/Makefile b/arch/x86/crypto/aes-cbc-mb/Makefile
new file mode 100644
index 000..b642bd8
--- /dev/null
+++ b/arch/x86/crypto/aes-cbc-mb/Makefile
@@ -0,0 +1,22 @@
+#
+# Arch-specific CryptoAPI modules.
+#
+
+avx_supported := $(call as-instr,vpxor %xmm0$(comma)%xmm0$(comma)%xmm0,yes,no)
+
+# we need decryption and key expansion routine symbols
+# if either AES_NI_INTEL or AES_CBC_MB is built as a module
+
+ifeq ($(CONFIG_CRYPTO_AES_NI_INTEL),m)
+   dec_support := ../aesni-intel_asm.o
+endif
+ifeq ($(CONFIG_CRYPTO_AES_CBC_MB),m)
+   dec_support := ../aesni-intel_asm.o
+endif
+
+ifeq ($(avx_supported),yes)
+   obj-$(CONFIG_CRYPTO_AES_CBC_MB) += aes-cbc-mb.o
+   aes-cbc-mb-y := $(dec_support) aes_cbc_mb.o aes_mb_mgr_init.o \
+   mb_mgr_inorder_x8_asm.o mb_mgr_ooo_x8_asm.o \
+   aes_cbc_enc_x8.o
+endif
diff --git a/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c b/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c
new file mode 100644
index 000..4d16a5d
--- /dev/null
+++ b/arch/x86/crypto/aes-cbc-mb/aes_cbc_mb.c
@@ -0,0 +1,835 @@
+/*
+ * Multi buffer AES CBC algorithm glue code
+ *
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * Copyright(c) 2015 Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * Contact Information:
+ * James Guilford 
+