Re: [PATCH v4 1/5] powerpc: io.h: move iomap.h include so that it can use readq/writeq defs

2017-07-18 Thread Michael Ellerman
Logan Gunthorpe  writes:

> Subsequent patches in this series make use of the readq and writeq
> defines in iomap.h. However, as is, they get missed on the powerpc
> platform since the include comes before the define. This patch
> moves the include down to fix this.
>
> Signed-off-by: Logan Gunthorpe 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: Suresh Warrier 
> Cc: "Oliver O'Halloran" 
> ---
>  arch/powerpc/include/asm/io.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Seems fair enough, have you tested it at all?

cheers


RE: [PATCH] crypto: brcm - Support more FlexRM rings than SPU engines.

2017-07-18 Thread Raveendra Padasalagi
I need to address a few issues in this patch, so I am NAKing it.
I will send out a revised version.

Regards,
Raveendra
> -----Original Message-----
> From: Raveendra Padasalagi [mailto:raveendra.padasal...@broadcom.com]
> Sent: 13 July 2017 13:58
> To: Herbert Xu; David S. Miller; Rob Rice; Scott Branden; linux-
> cry...@vger.kernel.org
> Cc: Ray Jui; Steve Lin; bcm-kernel-feedback-l...@broadcom.com; linux-
> ker...@vger.kernel.org; Raveendra Padasalagi; sta...@vger.kernel.org
> Subject: [PATCH] crypto: brcm - Support more FlexRM rings than SPU engines.
>
> Enhance code to generically support cases where the number of DMA rings is
> greater than or equal to the number of SPU engines.
> New hardware has an underlying DMA engine, FlexRM, with 32 rings which can
> be used to communicate with any of the 10 available SPU engines.
>
> Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver")
> Signed-off-by: Raveendra Padasalagi 
> cc: sta...@vger.kernel.org
> ---
>  drivers/crypto/bcm/cipher.c | 105
+---
>  drivers/crypto/bcm/cipher.h |  15 ---
>  2 files changed, 57 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c
> index cc0d5b9..ecc32d8 100644
> --- a/drivers/crypto/bcm/cipher.c
> +++ b/drivers/crypto/bcm/cipher.c
> @@ -119,7 +119,7 @@ static u8 select_channel(void)
>  {
>   u8 chan_idx = atomic_inc_return(&iproc_priv.next_chan);
>
> - return chan_idx % iproc_priv.spu.num_spu;
> + return chan_idx % iproc_priv.spu.num_chan;
>  }
>
>  /**
> @@ -4527,8 +4527,13 @@ static void spu_functions_register(struct device *dev,
>   */
>  static int spu_mb_init(struct device *dev)
>  {
> - struct mbox_client *mcl = &iproc_priv.mcl[iproc_priv.spu.num_spu];
> - int err;
> + struct mbox_client *mcl = &iproc_priv.mcl;
> + int err, i;
> +
> + iproc_priv.mbox = devm_kcalloc(dev, iproc_priv.spu.num_chan,
> +   sizeof(struct mbox_chan *), GFP_KERNEL);
> + if (iproc_priv.mbox == NULL)
> + return -ENOMEM;
>
>   mcl->dev = dev;
>   mcl->tx_block = false;
> @@ -4537,15 +4542,16 @@ static int spu_mb_init(struct device *dev)
>   mcl->rx_callback = spu_rx_callback;
>   mcl->tx_done = NULL;
>
> - iproc_priv.mbox[iproc_priv.spu.num_spu] =
> - mbox_request_channel(mcl, 0);
> - if (IS_ERR(iproc_priv.mbox[iproc_priv.spu.num_spu])) {
> - err = (int)PTR_ERR(iproc_priv.mbox[iproc_priv.spu.num_spu]);
> - dev_err(dev,
> - "Mbox channel %d request failed with err %d",
> - iproc_priv.spu.num_spu, err);
> - iproc_priv.mbox[iproc_priv.spu.num_spu] = NULL;
> - return err;
> + for (i = 0; i < iproc_priv.spu.num_chan; i++) {
> + iproc_priv.mbox[i] = mbox_request_channel(mcl, i);
> + if (IS_ERR(iproc_priv.mbox[i])) {
> + err = (int)PTR_ERR(iproc_priv.mbox[i]);
> + dev_err(dev,
> + "Mbox channel %d request failed with err %d",
> + i, err);
> + iproc_priv.mbox[i] = NULL;
> + return err;
> + }
>   }
>
>   return 0;
> @@ -4555,7 +4561,7 @@ static void spu_mb_release(struct platform_device *pdev)
>  {
>   int i;
>
> - for (i = 0; i < iproc_priv.spu.num_spu; i++)
> + for (i = 0; i < iproc_priv.spu.num_chan; i++)
>   mbox_free_channel(iproc_priv.mbox[i]);
>  }
>
> @@ -4566,7 +4572,7 @@ static void spu_counters_init(void)
>
>   atomic_set(&iproc_priv.session_count, 0);
>   atomic_set(&iproc_priv.stream_count, 0);
> - atomic_set(&iproc_priv.next_chan, (int)iproc_priv.spu.num_spu);
> + atomic_set(&iproc_priv.next_chan, (int)iproc_priv.spu.num_chan);
>   atomic64_set(&iproc_priv.bytes_in, 0);
>   atomic64_set(&iproc_priv.bytes_out, 0);
>   for (i = 0; i < SPU_OP_NUM; i++) {
> @@ -4809,46 +4815,41 @@ static int spu_dt_read(struct platform_device *pdev)
>   const struct of_device_id *match;
>   const struct spu_type_subtype *matched_spu_type;
>   void __iomem *spu_reg_vbase[MAX_SPUS];
> - int err;
> + struct device_node *dn = pdev->dev.of_node;
> + int err, i;
> +
> + /* Count number of mailbox channels */
> + spu->num_chan = of_count_phandle_with_args(dn, "mboxes",
> +"#mbox-cells");
>
>   match = of_match_device(of_match_ptr(bcm_spu_dt_ids), dev);
>   matched_spu_type = match->data;
>
> - if (iproc_priv.spu.num_spu > 1) {
> - /* If this is 2nd or later SPU, make sure it's same type */
> - if ((spu->spu_type != matched_spu_type->type) ||
> - (spu->spu_subtype != matched_spu_type->subtype)) {
> - err = -EINVAL;
> - dev_err(&pdev->dev, "Multiple SPU types not allowed");
> - return err;
> - }
> - } else {
> -
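
The channel selection changed by the first hunk above can be sketched in userspace C. This is only a minimal illustration, assuming C11 atomics in place of the kernel's atomic_inc_return(); next_chan, counters_init() and the num_chan parameter are hypothetical stand-ins for the iproc_priv fields, not the driver's actual code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for iproc_priv.next_chan (hypothetical, userspace only). */
static atomic_int next_chan;

/* Mirrors spu_counters_init(): start the counter at the channel count. */
static void counters_init(unsigned int num_chan)
{
	atomic_store(&next_chan, (int)num_chan);
}

/*
 * Pick the next channel round-robin. atomic_fetch_add() returns the old
 * value, so 1 is added to mimic the kernel's atomic_inc_return().
 * num_chan must be nonzero.
 */
static uint8_t select_channel(unsigned int num_chan)
{
	uint8_t chan_idx = (uint8_t)(atomic_fetch_add(&next_chan, 1) + 1);

	return (uint8_t)(chan_idx % num_chan);
}
```

With 32 rings and 10 SPU engines, taking the modulus by the SPU count would only ever return channels 0 through 9, which appears to be the case the patch description is addressing.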

Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Theodore Ts'o
On Tue, Jul 18, 2017 at 09:00:10PM -0400, Sandy Harris wrote:
> The only really good solution I know of is to find a way to provide a
> chunk of randomness early in the boot process. John Denker has a good
> discussion of doing this by modifying the kernel image & Ted talks of
> doing it via the boot loader. Neither looks remarkably easy. Other
> approaches like making the kernel read a seed file or passing a
> parameter on the kernel command line have been suggested but, if I
> recall right, rejected.

It's actually not that _hard_ to modify the boot loader.  It's not
finicky work like, say, adding support for metadata checksums or xattr
deduplication to ext4.  It's actually mostly plumbing.  It's just that
we haven't found a lot of people willing to do it as paid work, and
the hobbyists haven't been interested.

> As I see it, the questions about Jitter, or any other in-kernel
> generator based on timing, are whether it is good enough to be useful
> until we have one of the above solutions or useful as a
> defense-in-depth trick after we have one. I'd say yes to both.
> 
> There's been a lot of analysis. Stephan has a detailed rationale & a
> lot of test data in his papers & the Havege papers also discuss
> getting entropy from timer operations. I'd say the best paper is
> McGuire et al:
> https://static.lwn.net/images/conf/rtlws11/random-hardware.pdf

So here's the problem that I have with most of these analyses.  Most
of them are done using the x86 as the CPU.  This is true of the
McGuire, Okech, and Schiesser paper you've cited above.  But things
are largely irrelevant on the x86, because we have RDRAND.  And while
I like to mix in environmental noise before generating personal
long-term public keys, I'm actually mostly OK with relying on RDRAND
for initializing the seeds for hash tables to protect against network
denial of service attacks.  (Which is currently the first user of the
not-yet-initialized CRNG on my laptop during kernel boot.)

The real problem is with the non-x86 systems that don't have a
hardware RNG, and there depending on timing events which don't depend on
external devices is much more dodgy.  Remember that on most embedded
devices there is only a single oscillator driving the entire system.
It's not like you even have multiple crystal oscillators beating
against one another.

So if you are only depending on CPU timing loops, you basically have a
very complex state machine, driven by a single oscillator, and you're
trying to kid yourself that you're getting entropy out the other end.
How is that any different from using AES in counter mode and claiming
because you don't know the seed, that it's "true randomness"?  It
certainly passes all of the statistical tests!

Hence, we have to rely on external events outside of the CPU and so we
need to depend on interrupt timing --- and that's what we do in
drivers/char/random.c already!  You can debate whether we are being
too conservative with when we judge that we've collected enough
unpredictability to count it as a "bit" of randomness.  So it's
trivially easy to turn the knob and make sure the CRNG gets
initialized more quickly using fewer interrupt timings, and boom!
Problem solved.

Simply turning the knob to make our entropy estimator more lax makes
people uncomfortable, and since they don't have access to the internal
microarchitecture of the CPU, they take comfort in the fact that it's
really, really complicated, and so something like the Jitter RNG
*must* be a more secure way to do things.  But that's really an illusion.

If the real unpredictability is really coming from the interrupts
changing the state of the CPU microarchitecture, the real question is
how many interrupts do you need before you consider things
"unpredictable" to an adequate level of security?  Arguing that we
should turn down the "interrupts per bit of entropy" in
drivers/char/random.c is a much more honest way of having that
discussion.

- Ted

P.S.  In the McGuire paper you cited, it assumes that the system is
fully booted and there are multiple processes running which are
influencing the kernel scheduler.  This makes the paper **not**
applicable at all.  So if you think that is the most compelling
analysis, I'm definitely not impressed.
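
The point about statistical tests can be illustrated with a short userspace sketch. It uses xorshift64 rather than AES in counter mode, purely to keep the example self-contained, and the function names are hypothetical. The generator is fully determined by its seed, yet a simple monobit frequency check finds its output well balanced.

```c
#include <stddef.h>
#include <stdint.h>

/* One step of Marsaglia's xorshift64 generator; state must be nonzero. */
static uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/* Count the set bits in a 64-bit word. */
static unsigned int popcount64(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Monobit frequency check: fraction of one bits across nwords outputs. */
static double ones_fraction(uint64_t seed, size_t nwords)
{
	uint64_t state = seed;
	uint64_t ones = 0;
	size_t i;

	for (i = 0; i < nwords; i++)
		ones += popcount64(xorshift64(&state));
	return (double)ones / (64.0 * (double)nwords);
}
```

Calling ones_fraction() twice with the same seed gives the same answer, which is the whole problem: a balanced bit ratio says nothing about how much an attacker who can model the generator is able to predict.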


Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-18 Thread Megha Dey
On Tue, 2017-07-18 at 17:52 -0700, Tim Chen wrote:
> On 07/17/2017 10:41 PM, Herbert Xu wrote:
> > On Tue, Jun 27, 2017 at 05:26:13PM -0700, Megha Dey wrote:
> >>
> >> +static void completion_callback(struct mcryptd_skcipher_request_ctx *rctx,
> >> +  struct mcryptd_alg_cstate *cstate,
> >> +  int err)
> >> +{
> >> +  struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
> >> +
> >> +   /* remove from work list and invoke completion callback */
> >> +  spin_lock(&cstate->work_lock);
> >> +  list_del(&rctx->waiter);
> >> +  spin_unlock(&cstate->work_lock);
> >> +
> >> +  if (irqs_disabled())
> >> +  rctx->complete(&req->base, err);
> >> +  else {
> >> +  local_bh_disable();
> >> +  rctx->complete(&req->base, err);
> >> +  local_bh_enable();
> >> +  }
> >> +}
> > 
> > The fact that you need to do this check means that this design is
> > wrong.  You should always know what context you are in.
> > 
> 
> I think you are right.  The irqs_disabled check is not necessary
> as we only call this function in the context of the mcryptd thread.
> When I wrote the original mb algorithms I was probably unsure
> and put this check in as a precaution in other mb algorithms and
> Megha did the same.

I will make this change.
> 
> >> +/*
> >> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
> >> + * scatter-gather walk.
> >> + */
> >> +static struct skcipher_alg aes_cbc_mb_alg = {
> >> +  .base = {
> >> +  .cra_name   = "cbc(aes)",
> >> +  .cra_driver_name= "cbc-aes-aesni-mb",
> >> +  .cra_priority   = 500,
> >> +  .cra_flags  = CRYPTO_ALG_INTERNAL,
> >> +  .cra_blocksize  = AES_BLOCK_SIZE,
> >> +  .cra_ctxsize= CRYPTO_AES_CTX_SIZE,
> >> +  .cra_module = THIS_MODULE,
> >> +  },
> >> +  .min_keysize= AES_MIN_KEY_SIZE,
> >> +  .max_keysize= AES_MAX_KEY_SIZE,
> >> +  .ivsize = AES_BLOCK_SIZE,
> >> +  .setkey = aes_set_key,
> >> +  .encrypt= mb_aes_cbc_encrypt,
> >> +  .decrypt= mb_aes_cbc_decrypt
> >> +};
> > 
> > So this claims to be a sync algorithm.  Is this really the case?

Yes, the inner algorithm is sync whereas the outer algorithm is async.
> > 
> > Cheers,
> > 
> 




Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Sandy Harris
On Tue, Jul 18, 2017 at 5:08 PM, Theodore Ts'o  wrote:

> I've been trying to take the best features and suggestions from your
> proposal and integrate them into /dev/random already.

A good approach.

> The things that I've chosen not to take, I've left out basically
> because I disbelieve that the Jitter RNG is valid. ...

The biggest problem with random(4) is that you cannot generate good
output without a good seed & just after boot, especially first boot on
a new system, you may not have enough entropy. A user space process
cannot do it soon enough and all the in-kernel solutions (unless you
have a hardware RNG) pose difficulties.

The only really good solution I know of is to find a way to provide a
chunk of randomness early in the boot process. John Denker has a good
discussion of doing this by modifying the kernel image & Ted talks of
doing it via the boot loader. Neither looks remarkably easy. Other
approaches like making the kernel read a seed file or passing a
parameter on the kernel command line have been suggested but, if I
recall right, rejected.

As I see it, the questions about Jitter, or any other in-kernel
generator based on timing, are whether it is good enough to be useful
until we have one of the above solutions or useful as a
defense-in-depth trick after we have one. I'd say yes to both.

There's been a lot of analysis. Stephan has a detailed rationale & a
lot of test data in his papers & the Havege papers also discuss
getting entropy from timer operations. I'd say the best paper is
McGuire et al:
https://static.lwn.net/images/conf/rtlws11/random-hardware.pdf

There is enough there to convince me that grabbing some (256?) bits
from such a generator early in the initialization is worthwhile.

> So I have been trying to do the evolution thing already.
> ...

> I'm obviously biased, but I don't see the raison d'être for
> merging LRNG into the kernel.

Nor I.


Re: [PATCH V6 5/7] crypto: AES CBC multi-buffer glue code

2017-07-18 Thread Tim Chen
On 07/17/2017 10:41 PM, Herbert Xu wrote:
> On Tue, Jun 27, 2017 at 05:26:13PM -0700, Megha Dey wrote:
>>
>> +static void completion_callback(struct mcryptd_skcipher_request_ctx *rctx,
>> +struct mcryptd_alg_cstate *cstate,
>> +int err)
>> +{
>> +struct skcipher_request *req = cast_mcryptd_ctx_to_req(rctx);
>> +
>> +   /* remove from work list and invoke completion callback */
>> +spin_lock(&cstate->work_lock);
>> +list_del(&rctx->waiter);
>> +spin_unlock(&cstate->work_lock);
>> +
>> +if (irqs_disabled())
>> +rctx->complete(&req->base, err);
>> +else {
>> +local_bh_disable();
>> +rctx->complete(&req->base, err);
>> +local_bh_enable();
>> +}
>> +}
> 
> The fact that you need to do this check means that this design is
> wrong.  You should always know what context you are in.
> 

I think you are right.  The irqs_disabled check is not necessary
as we only call this function in the context of the mcryptd thread.
When I wrote the original mb algorithms I was probably unsure
and put this check in as a precaution in other mb algorithms and
Megha did the same.

>> +/*
>> + * CRYPTO_ALG_ASYNC flag is passed to indicate we have an ablk
>> + * scatter-gather walk.
>> + */
>> +static struct skcipher_alg aes_cbc_mb_alg = {
>> +.base = {
>> +.cra_name   = "cbc(aes)",
>> +.cra_driver_name= "cbc-aes-aesni-mb",
>> +.cra_priority   = 500,
>> +.cra_flags  = CRYPTO_ALG_INTERNAL,
>> +.cra_blocksize  = AES_BLOCK_SIZE,
>> +.cra_ctxsize= CRYPTO_AES_CTX_SIZE,
>> +.cra_module = THIS_MODULE,
>> +},
>> +.min_keysize= AES_MIN_KEY_SIZE,
>> +.max_keysize= AES_MAX_KEY_SIZE,
>> +.ivsize = AES_BLOCK_SIZE,
>> +.setkey = aes_set_key,
>> +.encrypt= mb_aes_cbc_encrypt,
>> +.decrypt= mb_aes_cbc_decrypt
>> +};
> 
> So this claims to be a sync algorithm.  Is this really the case?
> 
> Cheers,
> 



[PATCH] crypto: omap-sham: remove unnecessary static in omap_sham_remove()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variable dd. The variable
is initialized before being used on every execution path throughout
the function. The static has no benefit, and removing it reduces the
object file size.

This issue was detected using Coccinelle and the following semantic patch:
https://github.com/GustavoARSilva/coccinelle/blob/master/static/static_unused.cocci

In the following log you can see a difference in the object file size.
This log is the output of the size command, before and after the code
change:

before:
   text    data     bss     dec     hex filename
  26135   11944     128   38207    953f drivers/crypto/omap-sham.o

after:
   text    data     bss     dec     hex filename
  26084   11856      64   38004    9474 drivers/crypto/omap-sham.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/crypto/omap-sham.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/omap-sham.c b/drivers/crypto/omap-sham.c
index 9ad9d39..c40ac30 100644
--- a/drivers/crypto/omap-sham.c
+++ b/drivers/crypto/omap-sham.c
@@ -2133,7 +2133,7 @@ static int omap_sham_probe(struct platform_device *pdev)
 
 static int omap_sham_remove(struct platform_device *pdev)
 {
-   static struct omap_sham_dev *dd;
+   struct omap_sham_dev *dd;
int i, j;
 
dd = platform_get_drvdata(pdev);
-- 
2.5.0
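
For readers unfamiliar with the pattern, here is a minimal userspace sketch of what the patch changes. The demo_dev struct and the get_drvdata() helper are hypothetical stand-ins for omap_sham_dev and platform_get_drvdata(). Both functions return the same result because the pointer is assigned before every use; the only difference is that the static version reserves permanent storage for it.

```c
#include <stddef.h>

struct demo_dev {
	int id;
};

static struct demo_dev demo_devices[2] = { { 10 }, { 20 } };

/* Hypothetical stand-in for platform_get_drvdata(). */
static struct demo_dev *get_drvdata(size_t idx)
{
	return &demo_devices[idx];
}

static int remove_with_static(size_t idx)
{
	/* Static storage: the pointer lives for the whole program. */
	static struct demo_dev *dd;

	dd = get_drvdata(idx);
	return dd->id;
}

static int remove_with_auto(size_t idx)
{
	/* Automatic storage: the pointer exists only during the call. */
	struct demo_dev *dd;

	dd = get_drvdata(idx);
	return dd->id;
}
```

Comparing the output of size on an object built each way should show a difference like the one in the log above, though the exact numbers depend on the compiler and architecture.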



[PATCH] crypto: img-hash: remove unnecessary static in img_hash_remove()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variable hdev. The variable
is initialized before being used on every execution path throughout
the function. The static has no benefit, and removing it reduces the
object file size.

This issue was detected using Coccinelle and the following semantic patch:
https://github.com/GustavoARSilva/coccinelle/blob/master/static/static_unused.cocci

In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:

before:
   text    data     bss     dec     hex filename
  14842    6464     128   21434    53ba drivers/crypto/img-hash.o

after:
   text    data     bss     dec     hex filename
  14789    6376      64   21229    52ed drivers/crypto/img-hash.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/crypto/img-hash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/img-hash.c b/drivers/crypto/img-hash.c
index 0c6a917..b87000a 100644
--- a/drivers/crypto/img-hash.c
+++ b/drivers/crypto/img-hash.c
@@ -1054,7 +1054,7 @@ static int img_hash_probe(struct platform_device *pdev)
 
 static int img_hash_remove(struct platform_device *pdev)
 {
-   static struct img_hash_dev *hdev;
+   struct img_hash_dev *hdev;
 
hdev = platform_get_drvdata(pdev);
spin_lock(&img_hash.lock);
-- 
2.5.0



[PATCH] crypto: atmel-sha: remove unnecessary static in atmel_sha_remove()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variable sha_dd. The variable
is initialized before being used on every execution path throughout
the function. The static has no benefit, and removing it reduces the
object file size.

This issue was detected using Coccinelle and the following semantic patch:
https://github.com/GustavoARSilva/coccinelle/blob/master/static/static_unused.cocci

In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:

before:
   text    data     bss     dec     hex filename
  30005   10264     128   40397    9dcd drivers/crypto/atmel-sha.o

after:
   text    data     bss     dec     hex filename
  29934   10208      64   40206    9d0e drivers/crypto/atmel-sha.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/crypto/atmel-sha.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/atmel-sha.c b/drivers/crypto/atmel-sha.c
index dad4e5b..3e2f41b 100644
--- a/drivers/crypto/atmel-sha.c
+++ b/drivers/crypto/atmel-sha.c
@@ -2883,7 +2883,7 @@ static int atmel_sha_probe(struct platform_device *pdev)
 
 static int atmel_sha_remove(struct platform_device *pdev)
 {
-   static struct atmel_sha_dev *sha_dd;
+   struct atmel_sha_dev *sha_dd;
 
sha_dd = platform_get_drvdata(pdev);
if (!sha_dd)
-- 
2.5.0



[PATCH] crypto: atmel-tdes: remove unnecessary static in atmel_tdes_remove()

2017-07-18 Thread Gustavo A. R. Silva
Remove unnecessary static on local variable tdes_dd. The variable
is initialized before being used on every execution path throughout
the function. The static has no benefit, and removing it reduces the
object file size.

This issue was detected using Coccinelle and the following semantic patch:
https://github.com/GustavoARSilva/coccinelle/blob/master/static/static_unused.cocci

In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:

before:
   text    data     bss     dec     hex filename
  17079    8704     128   25911    6537 drivers/crypto/atmel-tdes.o

after:
   text    data     bss     dec     hex filename
  17039    8616      64   25719    6477 drivers/crypto/atmel-tdes.o

Signed-off-by: Gustavo A. R. Silva 
---
 drivers/crypto/atmel-tdes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/atmel-tdes.c b/drivers/crypto/atmel-tdes.c
index b25f1b3..f4b335d 100644
--- a/drivers/crypto/atmel-tdes.c
+++ b/drivers/crypto/atmel-tdes.c
@@ -1487,7 +1487,7 @@ static int atmel_tdes_probe(struct platform_device *pdev)
 
 static int atmel_tdes_remove(struct platform_device *pdev)
 {
-   static struct atmel_tdes_dev *tdes_dd;
+   struct atmel_tdes_dev *tdes_dd;
 
tdes_dd = platform_get_drvdata(pdev);
if (!tdes_dd)
-- 
2.5.0



[PATCH] crypto: n2_core: Convert to using %pOF instead of full_name

2017-07-18 Thread Rob Herring
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.

Signed-off-by: Rob Herring 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: linux-crypto@vger.kernel.org
---
 drivers/crypto/n2_core.c | 60 ++--
 1 file changed, 28 insertions(+), 32 deletions(-)

diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c
index 269451375b63..a9fd8b9e86cd 100644
--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -1730,8 +1730,8 @@ static int spu_mdesc_walk_arcs(struct mdesc_handle *mdesc,
continue;
id = mdesc_get_property(mdesc, tgt, "id", NULL);
if (table[*id] != NULL) {
-   dev_err(&dev->dev, "%s: SPU cpu slot already set.\n",
-   dev->dev.of_node->full_name);
+   dev_err(&dev->dev, "%pOF: SPU cpu slot already set.\n",
+   dev->dev.of_node);
return -EINVAL;
}
cpumask_set_cpu(*id, &p->sharing);
@@ -1751,8 +1751,8 @@ static int handle_exec_unit(struct spu_mdesc_info *ip, struct list_head *list,

p = kzalloc(sizeof(struct spu_queue), GFP_KERNEL);
if (!p) {
-   dev_err(&dev->dev, "%s: Could not allocate SPU queue.\n",
-   dev->dev.of_node->full_name);
+   dev_err(&dev->dev, "%pOF: Could not allocate SPU queue.\n",
+   dev->dev.of_node);
return -ENOMEM;
}

@@ -1981,41 +1981,39 @@ static void n2_spu_driver_version(void)
 static int n2_crypto_probe(struct platform_device *dev)
 {
struct mdesc_handle *mdesc;
-   const char *full_name;
struct n2_crypto *np;
int err;

n2_spu_driver_version();

-   full_name = dev->dev.of_node->full_name;
-   pr_info("Found N2CP at %s\n", full_name);
+   pr_info("Found N2CP at %pOF\n", dev->dev.of_node);

np = alloc_n2cp();
if (!np) {
-   dev_err(&dev->dev, "%s: Unable to allocate n2cp.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: Unable to allocate n2cp.\n",
+   dev->dev.of_node);
return -ENOMEM;
}

err = grab_global_resources();
if (err) {
-   dev_err(&dev->dev, "%s: Unable to grab "
-   "global resources.\n", full_name);
+   dev_err(&dev->dev, "%pOF: Unable to grab global resources.\n",
+   dev->dev.of_node);
goto out_free_n2cp;
}

mdesc = mdesc_grab();

if (!mdesc) {
-   dev_err(&dev->dev, "%s: Unable to grab MDESC.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: Unable to grab MDESC.\n",
+   dev->dev.of_node);
err = -ENODEV;
goto out_free_global;
}
err = grab_mdesc_irq_props(mdesc, dev, &np->cwq_info, "n2cp");
if (err) {
-   dev_err(&dev->dev, "%s: Unable to grab IRQ props.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: Unable to grab IRQ props.\n",
+   dev->dev.of_node);
mdesc_release(mdesc);
goto out_free_global;
}
@@ -2026,15 +2024,15 @@ static int n2_crypto_probe(struct platform_device *dev)
mdesc_release(mdesc);

if (err) {
-   dev_err(&dev->dev, "%s: CWQ MDESC scan failed.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: CWQ MDESC scan failed.\n",
+   dev->dev.of_node);
goto out_free_global;
}

err = n2_register_algs();
if (err) {
-   dev_err(&dev->dev, "%s: Unable to register algorithms.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: Unable to register algorithms.\n",
+   dev->dev.of_node);
goto out_free_spu_list;
}

@@ -2092,42 +2090,40 @@ static void free_ncp(struct n2_mau *mp)
 static int n2_mau_probe(struct platform_device *dev)
 {
struct mdesc_handle *mdesc;
-   const char *full_name;
struct n2_mau *mp;
int err;

n2_spu_driver_version();

-   full_name = dev->dev.of_node->full_name;
-   pr_info("Found NCP at %s\n", full_name);
+   pr_info("Found NCP at %pOF\n", dev->dev.of_node);

mp = alloc_ncp();
if (!mp) {
-   dev_err(&dev->dev, "%s: Unable to allocate ncp.\n",
-   full_name);
+   dev_err(&dev->dev, "%pOF: Unable to allocate ncp.\n",
+   dev->dev.of_node);
return -ENOMEM;
}

err = grab_global_resources();
if (err) {
-   dev_err(>dev, "%s: 

Re: [PATCH] crypto: n2_core: Convert to using %pOF instead of full_name

2017-07-18 Thread David Miller
From: Rob Herring 
Date: Tue, 18 Jul 2017 16:42:56 -0500

> Now that we have a custom printf format specifier, convert users of
> full_name to use %pOF instead. This is preparation to remove storing
> of the full path string for each node.
> 
> Signed-off-by: Rob Herring 

Acked-by: David S. Miller 


Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Theodore Ts'o
On Tue, Jul 18, 2017 at 04:37:11PM +0200, Stephan Müller wrote:
> > 
> > > I have stated the core concerns I have with random.c in [1]. To remedy
> > > these core concerns, major changes to random.c are needed. With the past
> > > experience, I would doubt that I get the changes into random.c.
> > > 
> > > [1] https://www.spinics.net/lists/linux-crypto/msg26316.html
> > 
> > Evolution is the correct way to do this, kernel development relies on
> > that.  We don't do the "use this totally different and untested file
> > instead!" method.
> 
> I am not sure I understand your reply. The offered patch set does not rip out 
> existing code. It adds a replacement implementation which can be enabled 
> during compile time. Yet it is even disabled per default (and thus the legacy 
> code is compiled).

I've been trying to take the best features and suggestions from your
proposal and integrate them into /dev/random already.  The things
that I've chosen not to take, I've left out basically because I
disbelieve that the Jitter RNG is valid.  And that's mostly because I
trust Peter Anvin (who has access to Intel chip architects, and who
has expressed unease) more than you.  (No hard feelings).

So I have been trying to do the evolution thing already.  

> I see such a development approach in numerous different kernel core areas: 
> memory allocators (SLAB, SLOB, SLUB), process schedulers, IRQ schedulers.

But we don't have two VFS layers or two MM layers.  We also don't have
two implementations of printk.

I'm obviously biased, but I don't see the raison d'être for
merging LRNG into the kernel.

- Ted


[PATCH 4/6] staging: ccree: Fix alignment issues in ssi_cipher.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_cipher.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/ccree/ssi_cipher.c b/drivers/staging/ccree/ssi_cipher.c
index bfe9b1ccbf37..aec7c1480336 100644
--- a/drivers/staging/ccree/ssi_cipher.c
+++ b/drivers/staging/ccree/ssi_cipher.c
@@ -203,7 +203,8 @@ static int ssi_blkcipher_init(struct crypto_tfm *tfm)
 
/* Map key buffer */
ctx_p->user.key_dma_addr = dma_map_single(dev, (void *)ctx_p->user.key,
-max_key_buf_size, DMA_TO_DEVICE);
+ max_key_buf_size,
+ DMA_TO_DEVICE);
if (dma_mapping_error(dev, ctx_p->user.key_dma_addr)) {
SSI_LOG_ERR("Mapping Key %u B at va=%pK for DMA failed\n",
max_key_buf_size, ctx_p->user.key);
-- 
2.11.0
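
As a hedged illustration of the style checkpatch.pl asks for in patches like this one: continuation lines of a call are indented so that the wrapped arguments line up just after the opening parenthesis. The function names below are made up for the example and do not come from the ccree driver.

```c
/* Hypothetical helper with enough parameters to need wrapping. */
static int map_buffer(int dev, int buf, unsigned int size, int direction)
{
	return dev + buf + (int)size + direction;
}

static int map_key_buffer(void)
{
	/* Continuation lines start in the column after the open paren. */
	return map_buffer(1, 2,
			  3,
			  4);
}
```

The indentation uses tabs followed by spaces to reach the alignment column, which is the usual kernel convention.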



[PATCH 6/6] staging: ccree: Fix alignment issues in ssi_request_mgr.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_request_mgr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/ccree/ssi_request_mgr.c b/drivers/staging/ccree/ssi_request_mgr.c
index 3f39150cda4f..2eda82f317d2 100644
--- a/drivers/staging/ccree/ssi_request_mgr.c
+++ b/drivers/staging/ccree/ssi_request_mgr.c
@@ -136,7 +136,9 @@ int request_mgr_init(struct ssi_drvdata *drvdata)
 
/* Allocate DMA word for "dummy" completion descriptor use */
req_mgr_h->dummy_comp_buff = dma_alloc_coherent(&drvdata->plat_dev->dev,
-   sizeof(u32), &req_mgr_h->dummy_comp_buff_dma, GFP_KERNEL);
+   sizeof(u32),
+   &req_mgr_h->dummy_comp_buff_dma,
+   GFP_KERNEL);
if (!req_mgr_h->dummy_comp_buff) {
SSI_LOG_ERR("Not enough memory to allocate DMA (%zu) dropped "
   "buffer\n", sizeof(u32));
-- 
2.11.0



[PATCH 0/6] Fix additional alignment issues in staging/ccree

2017-07-18 Thread Simon Sandström
Here are a few more patches that fix alignment issues in
staging/ccree. The series includes the patches that I sent previously,
which could not be applied, plus a few more fixes for issues that I
found. These patches should fix all remaining alignment warnings
reported by checkpatch.pl in staging/ccree.

- Simon

---

Simon Sandström (6):
  staging: ccree: Fix alignment issues in ssi_aead.c
  staging: ccree: Fix alignment issues in ssi_hash.c
  staging: ccree: Fix alignment issues in ssi_buffer_mgr.c
  staging: ccree: Fix alignment issues in ssi_cipher.c
  staging: ccree: Fix alignment issues in ssi_ivgen.c
  staging: ccree: Fix alignment issues in ssi_request_mgr.c

 drivers/staging/ccree/ssi_aead.c|  47 +++---
 drivers/staging/ccree/ssi_buffer_mgr.c  |  40 +++-
 drivers/staging/ccree/ssi_cipher.c  |   3 +-
 drivers/staging/ccree/ssi_hash.c| 105 +---
 drivers/staging/ccree/ssi_ivgen.c   |   3 +-
 drivers/staging/ccree/ssi_request_mgr.c |   4 +-
 6 files changed, 112 insertions(+), 90 deletions(-)

-- 
2.11.0



[PATCH 2/6] staging: ccree: Fix alignment issues in ssi_hash.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_hash.c | 105 +--
 1 file changed, 56 insertions(+), 49 deletions(-)

diff --git a/drivers/staging/ccree/ssi_hash.c b/drivers/staging/ccree/ssi_hash.c
index fba0643e78fa..a5b3e9bebd95 100644
--- a/drivers/staging/ccree/ssi_hash.c
+++ b/drivers/staging/ccree/ssi_hash.c
@@ -70,8 +70,8 @@ static void ssi_hash_create_xcbc_setup(
unsigned int *seq_size);
 
 static void ssi_hash_create_cmac_setup(struct ahash_request *areq,
- struct cc_hw_desc desc[],
- unsigned int *seq_size);
+  struct cc_hw_desc desc[],
+  unsigned int *seq_size);
 
 struct ssi_hash_alg {
struct list_head entry;
@@ -117,8 +117,8 @@ static void ssi_hash_create_data_desc(
 static inline void ssi_set_hash_endianity(u32 mode, struct cc_hw_desc *desc)
 {
if (unlikely((mode == DRV_HASH_MD5) ||
-   (mode == DRV_HASH_SHA384) ||
-   (mode == DRV_HASH_SHA512))) {
+(mode == DRV_HASH_SHA384) ||
+(mode == DRV_HASH_SHA512))) {
set_bytes_swap(desc, 1);
} else {
set_cipher_config0(desc, HASH_DIGEST_RESULT_LITTLE_ENDIAN);
@@ -135,7 +135,7 @@ static int ssi_hash_map_result(struct device *dev,
   DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, state->digest_result_dma_addr))) {
SSI_LOG_ERR("Mapping digest result buffer %u B for DMA failed\n",
-   digestsize);
+   digestsize);
return -ENOMEM;
}
SSI_LOG_DEBUG("Mapped digest result buffer %u B "
@@ -200,12 +200,12 @@ static int ssi_hash_map_request(struct device *dev,
state->digest_buff_dma_addr = dma_map_single(dev, (void *)state->digest_buff, ctx->inter_digestsize, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, state->digest_buff_dma_addr)) {
SSI_LOG_ERR("Mapping digest len %d B at va=%pK for DMA failed\n",
-   ctx->inter_digestsize, state->digest_buff);
+   ctx->inter_digestsize, state->digest_buff);
goto fail3;
}
SSI_LOG_DEBUG("Mapped digest %d B at va=%pK to dma=%pad\n",
-   ctx->inter_digestsize, state->digest_buff,
-   state->digest_buff_dma_addr);
+ ctx->inter_digestsize, state->digest_buff,
+ state->digest_buff_dma_addr);
 
if (is_hmac) {
dma_sync_single_for_cpu(dev, ctx->digest_buff_dma_addr, 
ctx->inter_digestsize, DMA_BIDIRECTIONAL);
@@ -249,12 +249,12 @@ static int ssi_hash_map_request(struct device *dev,
state->digest_bytes_len_dma_addr = dma_map_single(dev, (void 
*)state->digest_bytes_len, HASH_LEN_SIZE, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, state->digest_bytes_len_dma_addr)) {
SSI_LOG_ERR("Mapping digest len %u B at va=%pK for DMA 
failed\n",
-   HASH_LEN_SIZE, state->digest_bytes_len);
+   HASH_LEN_SIZE, state->digest_bytes_len);
goto fail4;
}
SSI_LOG_DEBUG("Mapped digest len %u B at va=%pK to dma=%pad\n",
-   HASH_LEN_SIZE, state->digest_bytes_len,
-   state->digest_bytes_len_dma_addr);
+ HASH_LEN_SIZE, state->digest_bytes_len,
+ state->digest_bytes_len_dma_addr);
} else {
state->digest_bytes_len_dma_addr = 0;
}
@@ -263,12 +263,13 @@ static int ssi_hash_map_request(struct device *dev,
state->opad_digest_dma_addr = dma_map_single(dev, (void 
*)state->opad_digest_buff, ctx->inter_digestsize, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, state->opad_digest_dma_addr)) {
SSI_LOG_ERR("Mapping opad digest %d B at va=%pK for DMA 
failed\n",
-   ctx->inter_digestsize, state->opad_digest_buff);
+   ctx->inter_digestsize,
+   state->opad_digest_buff);
goto fail5;
}
SSI_LOG_DEBUG("Mapped opad digest %d B at va=%pK to dma=%pad\n",
-   ctx->inter_digestsize, state->opad_digest_buff,
-   state->opad_digest_dma_addr);
+ ctx->inter_digestsize, state->opad_digest_buff,
+ state->opad_digest_dma_addr);
} else {
state->opad_digest_dma_addr = 0;
}
@@ -602,7 +603,7 @@ static int ssi_hash_update(struct ahash_req_ctx *state,
if (unlikely(rc)) {
if 

[PATCH 5/6] staging: ccree: Fix alignment issues in ssi_ivgen.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_ivgen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/ccree/ssi_ivgen.c 
b/drivers/staging/ccree/ssi_ivgen.c
index f140dbc5195c..86364f81acab 100644
--- a/drivers/staging/ccree/ssi_ivgen.c
+++ b/drivers/staging/ccree/ssi_ivgen.c
@@ -202,7 +202,8 @@ int ssi_ivgen_init(struct ssi_drvdata *drvdata)
 
/* Allocate pool's header for intial enc. key/IV */
ivgen_ctx->pool_meta = dma_alloc_coherent(device, SSI_IVPOOL_META_SIZE,
-   _ctx->pool_meta_dma, GFP_KERNEL);
+ _ctx->pool_meta_dma,
+ GFP_KERNEL);
if (!ivgen_ctx->pool_meta) {
SSI_LOG_ERR("Not enough memory to allocate DMA of pool_meta "
   "(%u B)\n", SSI_IVPOOL_META_SIZE);
-- 
2.11.0



[PATCH 3/6] staging: ccree: Fix alignment issues in ssi_buffer_mgr.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_buffer_mgr.c | 40 --
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/ccree/ssi_buffer_mgr.c 
b/drivers/staging/ccree/ssi_buffer_mgr.c
index 6579a54f9dc4..63936091d524 100644
--- a/drivers/staging/ccree/ssi_buffer_mgr.c
+++ b/drivers/staging/ccree/ssi_buffer_mgr.c
@@ -371,7 +371,7 @@ static int ssi_buffer_mgr_map_scatterlist(
*mapped_nents = 1;
} else {  /*sg_is_last*/
*nents = ssi_buffer_mgr_get_sgl_nents(sg, nbytes, lbytes,
-_chained);
+ _chained);
if (*nents > max_sg_nents) {
*nents = 0;
SSI_LOG_ERR("Too many fragments. current %d max %d\n",
@@ -393,9 +393,9 @@ static int ssi_buffer_mgr_map_scatterlist(
 * must have the same nents before and after map
 */
*mapped_nents = ssi_buffer_mgr_dma_map_sg(dev,
-sg,
-*nents,
-direction);
+ sg,
+ *nents,
+ direction);
if (unlikely(*mapped_nents != *nents)) {
*nents = *mapped_nents;
SSI_LOG_ERR("dma_map_sg() sg buffer failed\n");
@@ -783,8 +783,8 @@ static inline int ssi_buffer_mgr_aead_chain_iv(
goto chain_iv_exit;
}
 
-   areq_ctx->gen_ctx.iv_dma_addr = dma_map_single(dev, req->iv,
-   hw_iv_size, DMA_BIDIRECTIONAL);
+   areq_ctx->gen_ctx.iv_dma_addr = dma_map_single(dev, req->iv, hw_iv_size,
+  DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, areq_ctx->gen_ctx.iv_dma_addr))) {
SSI_LOG_ERR("Mapping iv %u B at va=%pK for DMA failed\n",
hw_iv_size, req->iv);
@@ -1323,8 +1323,9 @@ int ssi_buffer_mgr_map_aead_request(
req->cryptlen :
(req->cryptlen - authsize);
 
-   areq_ctx->mac_buf_dma_addr = dma_map_single(dev,
-   areq_ctx->mac_buf, MAX_MAC_SIZE, DMA_BIDIRECTIONAL);
+   areq_ctx->mac_buf_dma_addr = dma_map_single(dev, areq_ctx->mac_buf,
+   MAX_MAC_SIZE,
+   DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, areq_ctx->mac_buf_dma_addr))) {
SSI_LOG_ERR("Mapping mac_buf %u B at va=%pK for DMA failed\n",
MAX_MAC_SIZE, areq_ctx->mac_buf);
@@ -1334,8 +1335,9 @@ int ssi_buffer_mgr_map_aead_request(
 
if (areq_ctx->ccm_hdr_size != ccm_header_size_null) {
areq_ctx->ccm_iv0_dma_addr = dma_map_single(dev,
-   (areq_ctx->ccm_config + CCM_CTR_COUNT_0_OFFSET),
-   AES_BLOCK_SIZE, DMA_TO_DEVICE);
+   
(areq_ctx->ccm_config + CCM_CTR_COUNT_0_OFFSET),
+   AES_BLOCK_SIZE,
+   DMA_TO_DEVICE);
 
if (unlikely(dma_mapping_error(dev, 
areq_ctx->ccm_iv0_dma_addr))) {
SSI_LOG_ERR("Mapping mac_buf %u B at va=%pK "
@@ -1356,7 +1358,9 @@ int ssi_buffer_mgr_map_aead_request(
 #if SSI_CC_HAS_AES_GCM
if (areq_ctx->cipher_mode == DRV_CIPHER_GCTR) {
areq_ctx->hkey_dma_addr = dma_map_single(dev,
-   areq_ctx->hkey, AES_BLOCK_SIZE, DMA_BIDIRECTIONAL);
+areq_ctx->hkey,
+AES_BLOCK_SIZE,
+DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, areq_ctx->hkey_dma_addr))) {
SSI_LOG_ERR("Mapping hkey %u B at va=%pK for DMA 
failed\n",
AES_BLOCK_SIZE, areq_ctx->hkey);
@@ -1365,7 +1369,9 @@ int ssi_buffer_mgr_map_aead_request(
}
 
areq_ctx->gcm_block_len_dma_addr = dma_map_single(dev,
-   _ctx->gcm_len_block, AES_BLOCK_SIZE, 
DMA_TO_DEVICE);
+ 
_ctx->gcm_len_block,
+ 

[PATCH 1/6] staging: ccree: Fix alignment issues in ssi_aead.c

2017-07-18 Thread Simon Sandström
Fixes checkpatch.pl alignment warnings.

Signed-off-by: Simon Sandström 
---
 drivers/staging/ccree/ssi_aead.c | 47 +---
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/ccree/ssi_aead.c b/drivers/staging/ccree/ssi_aead.c
index ea29b8a1a71d..ad53126d6705 100644
--- a/drivers/staging/ccree/ssi_aead.c
+++ b/drivers/staging/ccree/ssi_aead.c
@@ -96,7 +96,7 @@ static void ssi_aead_exit(struct crypto_aead *tfm)
struct ssi_aead_ctx *ctx = crypto_aead_ctx(tfm);
 
SSI_LOG_DEBUG("Clearing context @%p for %s\n",
-   crypto_aead_ctx(tfm), crypto_tfm_alg_name(>base));
+ crypto_aead_ctx(tfm), crypto_tfm_alg_name(>base));
 
dev = >drvdata->plat_dev->dev;
/* Unmap enckey buffer */
@@ -163,7 +163,7 @@ static int ssi_aead_init(struct crypto_aead *tfm)
 
/* Allocate key buffer, cache line aligned */
ctx->enckey = dma_alloc_coherent(dev, AES_MAX_KEY_SIZE,
-   >enckey_dma_addr, GFP_KERNEL);
+>enckey_dma_addr, GFP_KERNEL);
if (!ctx->enckey) {
SSI_LOG_ERR("Failed allocating key buffer\n");
goto init_failed;
@@ -239,7 +239,7 @@ static void ssi_aead_complete(struct device *dev, void 
*ssi_req, void __iomem *c
 
if (areq_ctx->gen_ctx.op_type == DRV_CRYPTO_DIRECTION_DECRYPT) {
if (memcmp(areq_ctx->mac_buf, areq_ctx->icv_virt_addr,
-   ctx->authsize) != 0) {
+  ctx->authsize) != 0) {
SSI_LOG_DEBUG("Payload authentication failure, "
"(auth-size=%d, cipher=%d).\n",
ctx->authsize, ctx->cipher_mode);
@@ -378,7 +378,7 @@ static int hmac_setkey(struct cc_hw_desc *desc, struct 
ssi_aead_ctx *ctx)
 static int validate_keys_sizes(struct ssi_aead_ctx *ctx)
 {
SSI_LOG_DEBUG("enc_keylen=%u  authkeylen=%u\n",
-   ctx->enc_keylen, ctx->auth_keylen);
+ ctx->enc_keylen, ctx->auth_keylen);
 
switch (ctx->auth_mode) {
case DRV_HASH_SHA1:
@@ -402,7 +402,7 @@ static int validate_keys_sizes(struct ssi_aead_ctx *ctx)
if (unlikely(ctx->flow_mode == S_DIN_to_DES)) {
if (ctx->enc_keylen != DES3_EDE_KEY_SIZE) {
SSI_LOG_ERR("Invalid cipher(3DES) key size: %u\n",
-   ctx->enc_keylen);
+   ctx->enc_keylen);
return -EINVAL;
}
} else { /* Default assumed to be AES ciphers */
@@ -410,7 +410,7 @@ static int validate_keys_sizes(struct ssi_aead_ctx *ctx)
(ctx->enc_keylen != AES_KEYSIZE_192) &&
(ctx->enc_keylen != AES_KEYSIZE_256)) {
SSI_LOG_ERR("Invalid cipher(AES) key size: %u\n",
-   ctx->enc_keylen);
+   ctx->enc_keylen);
return -EINVAL;
}
}
@@ -553,7 +553,8 @@ ssi_aead_setkey(struct crypto_aead *tfm, const u8 *key, 
unsigned int keylen)
int seq_len = 0, rc = -EINVAL;
 
SSI_LOG_DEBUG("Setting key in context @%p for %s. key=%p keylen=%u\n",
-   ctx, crypto_tfm_alg_name(crypto_aead_tfm(tfm)), key, keylen);
+ ctx, crypto_tfm_alg_name(crypto_aead_tfm(tfm)),
+ key, keylen);
 
/* STAT_PHASE_0: Init and sanity checks */
 
@@ -684,7 +685,7 @@ static int ssi_aead_setauthsize(
 
 #if SSI_CC_HAS_AES_CCM
 static int ssi_rfc4309_ccm_setauthsize(struct crypto_aead *authenc,
- unsigned int authsize)
+  unsigned int authsize)
 {
switch (authsize) {
case 8:
@@ -699,7 +700,7 @@ static int ssi_rfc4309_ccm_setauthsize(struct crypto_aead 
*authenc,
 }
 
 static int ssi_ccm_setauthsize(struct crypto_aead *authenc,
- unsigned int authsize)
+  unsigned int authsize)
 {
switch (authsize) {
case 4:
@@ -1183,8 +1184,8 @@ static inline void ssi_aead_load_mlli_to_sram(
(req_ctx->data_buff_type == SSI_DMA_BUF_MLLI) ||
!req_ctx->is_single_pass)) {
SSI_LOG_DEBUG("Copy-to-sram: mlli_dma=%08x, mlli_size=%u\n",
-   (unsigned int)ctx->drvdata->mlli_sram_addr,
-   req_ctx->mlli_params.mlli_len);
+ (unsigned int)ctx->drvdata->mlli_sram_addr,
+ req_ctx->mlli_params.mlli_len);
/* Copy MLLI table host-to-sram */
hw_desc_init([*seq_size]);
set_din_type([*seq_size], DMA_DLLI,
@@ -1328,7 +1329,8 @@ ssi_aead_xcbc_authenc(
 }
 
 static int validate_data_size(struct ssi_aead_ctx *ctx,
-   enum 

Re: [PATCH V2 0/6] Enable NX 842 compression engine on Power9

2017-07-18 Thread Haren Myneni
On 07/18/2017 11:06 AM, Sukadev Bhattiprolu wrote:
> Nicholas Piggin [nicholas.pig...@gmail.com] wrote:
>> On Mon, 17 Jul 2017 16:43:19 -0700
>> Haren Myneni  wrote:
>>
>>> [PATCH V2 0/6] Enable NX 842 compression engine on Power9
>>> This patchset depends on VAS kernel changes:
>>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-May/158178.html
>>
>> Just a question, we no longer invalidate the copy buffer on context
>> switch after this patch:
>>
>> 07d2a628bc ("powerpc/64s: Avoid cpabort in context switch when possible")
>>
>> If your vas address mappings are visible only to kernel, only used in
>> process / kthread context, and only used with kernel preemption disabled,
>> this is okay.
> 
> Kernel preemption is not explicitly disabled in the NX driver I think
> and
> 
>>
>> If userspace can possibly copy/paste to the mappings or if you need to
>> sleep or call this from interrupt context, we need to work out how to
>> invalidate the copy buffer.
> 
> user space cannot copy/paste to the mappings yet (that mechanism is
> further out).
> 
> NX driver calls:
> 
>   vas_copy(, ...);
>   vas_paste(addr, ...);
> 
> but not from an interrupt context. Can/should we disable preemption between
> the copy and the paste to avoid having to invalidate the copy buffer?

Nick, also, we do not support 842 in user space; only the future NX gzip
compression module will be.

If that is OK, I will disable preemption around the copy/paste. Thanks for the review.
 
> 
> Sukadev
> 



Re: [PATCH V2 0/6] Enable NX 842 compression engine on Power9

2017-07-18 Thread Sukadev Bhattiprolu
Nicholas Piggin [nicholas.pig...@gmail.com] wrote:
> On Mon, 17 Jul 2017 16:43:19 -0700
> Haren Myneni  wrote:
> 
> > [PATCH V2 0/6] Enable NX 842 compression engine on Power9
> > This patchset depends on VAS kernel changes:
> > https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-May/158178.html
> 
> Just a question, we no longer invalidate the copy buffer on context
> switch after this patch:
> 
> 07d2a628bc ("powerpc/64s: Avoid cpabort in context switch when possible")
> 
> If your vas address mappings are visible only to kernel, only used in
> process / kthread context, and only used with kernel preemption disabled,
> this is okay.

Kernel preemption is not explicitly disabled in the NX driver I think
and

> 
> If userspace can possibly copy/paste to the mappings or if you need to
> sleep or call this from interrupt context, we need to work out how to
> invalidate the copy buffer.

user space cannot copy/paste to the mappings yet (that mechanism is
further out).

NX driver calls:

vas_copy(, ...);
vas_paste(addr, ...);

but not from an interrupt context. Can/should we disable preemption between
the copy and the paste to avoid having to invalidate the copy buffer?

Sukadev



RE: [PATCH v4 4/5] ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks

2017-07-18 Thread Allen Hubbe
From: Logan Gunthorpe
> Now that ioread64 and iowrite64 are available in io-64-nonatomic,
> we can remove the hack at the top of ntb_hw_intel.c and replace it
> with an include.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Jon Mason 
> Cc: Allen Hubbe 
> Acked-by: Dave Jiang 

Acked-by: Allen Hubbe 

> ---
>  drivers/ntb/hw/intel/ntb_hw_intel.c | 30 +-
>  1 file changed, 1 insertion(+), 29 deletions(-)
> 
> diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c 
> b/drivers/ntb/hw/intel/ntb_hw_intel.c
> index 2557e2c05b90..606c90f59d4b 100644
> --- a/drivers/ntb/hw/intel/ntb_hw_intel.c
> +++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
> @@ -59,6 +59,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "ntb_hw_intel.h"
> 
> @@ -155,35 +156,6 @@ MODULE_PARM_DESC(xeon_b2b_dsd_bar5_addr32,
>  static inline enum ntb_topo xeon_ppd_topo(struct intel_ntb_dev *ndev, u8 
> ppd);
>  static int xeon_init_isr(struct intel_ntb_dev *ndev);
> 
> -#ifndef ioread64
> -#ifdef readq
> -#define ioread64 readq
> -#else
> -#define ioread64 _ioread64
> -static inline u64 _ioread64(void __iomem *mmio)
> -{
> - u64 low, high;
> -
> - low = ioread32(mmio);
> - high = ioread32(mmio + sizeof(u32));
> - return low | (high << 32);
> -}
> -#endif
> -#endif
> -
> -#ifndef iowrite64
> -#ifdef writeq
> -#define iowrite64 writeq
> -#else
> -#define iowrite64 _iowrite64
> -static inline void _iowrite64(u64 val, void __iomem *mmio)
> -{
> - iowrite32(val, mmio);
> - iowrite32(val >> 32, mmio + sizeof(u32));
> -}
> -#endif
> -#endif
> -
>  static inline int pdev_is_atom(struct pci_dev *pdev)
>  {
>   switch (pdev->device) {
> --
> 2.11.0



Re: [PATCH] crypto: ccp - Fix XTS-AES support on a version 5 CCP

2017-07-18 Thread Gary R Hook

On 07/18/2017 01:28 AM, Stephan Müller wrote:
> On Monday, 17 July 2017, 22:08:27 CEST, Gary R Hook wrote:
>
> Hi Gary,
>
>> Version 5 CCPs have differing requirements for XTS-AES: key components
>> are stored in a 512-bit vector. The context must be little-endian
>> justified. AES-256 is supported now, so propagate the cipher size to
>> the command descriptor.
>>
>> Signed-off-by: Gary R Hook 
>> ---
>>  drivers/crypto/ccp/ccp-crypto-aes-xts.c |   79
>
> ..
>
>> @@ -97,14 +77,20 @@ static int ccp_aes_xts_setkey(struct crypto_ablkcipher
>> *tfm, const u8 *key, unsigned int key_len)
>>  {
>>    struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm));
>> + unsigned int ccpversion = ccp_version();
>>
>>    /* Only support 128-bit AES key with a 128-bit Tweak key,
>> * otherwise use the fallback
>> */
>> +
>
> Can you please add xts_check_key here?

Certainly!



[PATCH v4 5/5] crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

2017-07-18 Thread Logan Gunthorpe
From: Horia Geantă 

We can now make use of the io-64-nonatomic-lo-hi header to always
provide 64 bit IO operations. So this patch cleans up the extra
CONFIG_64BIT ifdefs.

To be consistent with the CAAM engine HW spec: in case of 64-bit registers,
irrespective of device endianness, the lower address should be read from
/ written to first, followed by the upper address. Admittedly, the I/O
accessors in the CAAM driver currently don't follow the spec; however, this
is a good opportunity to fix the code.

Signed-off-by: Horia Geantă 
Signed-off-by: Logan Gunthorpe 
Cc: Horia Geantă 
Cc: Dan Douglass 
Cc: Herbert Xu 
Cc: "David S. Miller" 
---
 drivers/crypto/caam/regs.h | 35 +--
 1 file changed, 5 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 84d2f838a063..0c45505458e7 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -9,7 +9,7 @@
 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * Architecture-specific register access methods
@@ -134,50 +134,25 @@ static inline void clrsetbits_32(void __iomem *reg, u32 
clear, u32 set)
  *base + 0x : least-significant 32 bits
  *base + 0x0004 : most-significant 32 bits
  */
-#ifdef CONFIG_64BIT
 static inline void wr_reg64(void __iomem *reg, u64 data)
 {
+#ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
if (caam_little_end)
iowrite64(data, reg);
else
-   iowrite64be(data, reg);
-}
-
-static inline u64 rd_reg64(void __iomem *reg)
-{
-   if (caam_little_end)
-   return ioread64(reg);
-   else
-   return ioread64be(reg);
-}
-
-#else /* CONFIG_64BIT */
-static inline void wr_reg64(void __iomem *reg, u64 data)
-{
-#ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
-   if (caam_little_end) {
-   wr_reg32((u32 __iomem *)(reg) + 1, data >> 32);
-   wr_reg32((u32 __iomem *)(reg), data);
-   } else
 #endif
-   {
-   wr_reg32((u32 __iomem *)(reg), data >> 32);
-   wr_reg32((u32 __iomem *)(reg) + 1, data);
-   }
+   iowrite64be(data, reg);
 }
 
 static inline u64 rd_reg64(void __iomem *reg)
 {
 #ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
if (caam_little_end)
-   return ((u64)rd_reg32((u32 __iomem *)(reg) + 1) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg)));
+   return ioread64(reg);
else
 #endif
-   return ((u64)rd_reg32((u32 __iomem *)(reg)) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg) + 1));
+   return ioread64be(reg);
 }
-#endif /* CONFIG_64BIT  */
 
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 #ifdef CONFIG_SOC_IMX7D
-- 
2.11.0
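
The commit message above is about access order for 64-bit registers. As a rough user-space illustration (not the CAAM driver's code), the sketch below assumes the two-access fallback follows iowrite64be_lo_hi from patch 3/5 of this series; fake_reg64 and fake_write32be are hypothetical names. It logs which byte offset is touched first, so the ordering can be compared against what the hardware spec asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte register window plus a log of the offsets written. */
struct fake_reg64 {
	uint8_t bytes[8];
	size_t  offs[2];	/* byte offset of the 1st and 2nd 32-bit write */
	int     n;		/* number of writes logged so far */
};

/* Store a 32-bit value in big-endian byte order at the given offset. */
static void fake_write32be(struct fake_reg64 *r, size_t off, uint32_t val)
{
	assert(off == 0 || off == 4);
	assert(r->n < 2);

	r->offs[r->n++] = off;
	r->bytes[off + 0] = (uint8_t)(val >> 24);
	r->bytes[off + 1] = (uint8_t)(val >> 16);
	r->bytes[off + 2] = (uint8_t)(val >> 8);
	r->bytes[off + 3] = (uint8_t)val;
}

/* Low word first, then high word, following iowrite64be_lo_hi in patch 3/5. */
static void fake_write64be_lo_hi(struct fake_reg64 *r, uint64_t val)
{
	fake_write32be(r, 4, (uint32_t)val);
	fake_write32be(r, 0, (uint32_t)(val >> 32));
}
```

In this sketch the low word lands at offset 4 before the high word lands at offset 0, and the final byte image is the big-endian encoding of the value. Whether a given platform takes this two-access path at all depends on whether it already provides a native iowrite64be.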



[PATCH v4 3/5] io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros

2017-07-18 Thread Logan Gunthorpe
This patch adds generic io{read|write}64[be]{_lo_hi|_hi_lo} macros if
they are not already defined by the architecture (for example, when they
are provided by the generic iomap library).

The patch also points io{read|write}64[be] to the variant specified by the
header name.

This is because new drivers are encouraged to use ioreadXX, et al instead
of readX[1], et al -- and mixing ioreadXX with readq is pretty ugly.

[1] ldd3: section 9.4.2

Signed-off-by: Logan Gunthorpe 
cc: Christoph Hellwig 
cc: Arnd Bergmann 
cc: Alan Cox 
cc: Greg Kroah-Hartman 
---
 include/linux/io-64-nonatomic-hi-lo.h | 60 +++
 include/linux/io-64-nonatomic-lo-hi.h | 60 +++
 2 files changed, 120 insertions(+)

diff --git a/include/linux/io-64-nonatomic-hi-lo.h 
b/include/linux/io-64-nonatomic-hi-lo.h
index defcc4644ce3..31d28e981299 100644
--- a/include/linux/io-64-nonatomic-hi-lo.h
+++ b/include/linux/io-64-nonatomic-hi-lo.h
@@ -54,4 +54,64 @@ static inline void hi_lo_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed hi_lo_writeq_relaxed
 #endif
 
+#ifndef ioread64_hi_lo
+#define ioread64_hi_lo ioread64_hi_lo
+static inline u64 ioread64_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32(addr + sizeof(u32));
+   low = ioread32(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_hi_lo
+#define iowrite64_hi_lo iowrite64_hi_lo
+static inline void iowrite64_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32(val >> 32, addr + sizeof(u32));
+   iowrite32(val, addr);
+}
+#endif
+
+#ifndef ioread64be_hi_lo
+#define ioread64be_hi_lo ioread64be_hi_lo
+static inline u64 ioread64be_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32be(addr);
+   low = ioread32be(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_hi_lo
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+static inline void iowrite64be_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32be(val >> 32, addr);
+   iowrite32be(val, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64
+#define ioread64 ioread64_hi_lo
+#endif
+
+#ifndef iowrite64
+#define iowrite64 iowrite64_hi_lo
+#endif
+
+#ifndef ioread64be
+#define ioread64be ioread64be_hi_lo
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be iowrite64be_hi_lo
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_HI_LO_H_ */
diff --git a/include/linux/io-64-nonatomic-lo-hi.h 
b/include/linux/io-64-nonatomic-lo-hi.h
index 084461a4e5ab..437a34f20f5a 100644
--- a/include/linux/io-64-nonatomic-lo-hi.h
+++ b/include/linux/io-64-nonatomic-lo-hi.h
@@ -54,4 +54,64 @@ static inline void lo_hi_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed lo_hi_writeq_relaxed
 #endif
 
+#ifndef ioread64_lo_hi
+#define ioread64_lo_hi ioread64_lo_hi
+static inline u64 ioread64_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32(addr);
+   high = ioread32(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_lo_hi
+#define iowrite64_lo_hi iowrite64_lo_hi
+static inline void iowrite64_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32(val, addr);
+   iowrite32(val >> 32, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64be_lo_hi
+#define ioread64be_lo_hi ioread64be_lo_hi
+static inline u64 ioread64be_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32be(addr + sizeof(u32));
+   high = ioread32be(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_lo_hi
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+static inline void iowrite64be_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32be(val, addr + sizeof(u32));
+   iowrite32be(val >> 32, addr);
+}
+#endif
+
+#ifndef ioread64
+#define ioread64 ioread64_lo_hi
+#endif
+
+#ifndef iowrite64
+#define iowrite64 iowrite64_lo_hi
+#endif
+
+#ifndef ioread64be
+#define ioread64be ioread64be_lo_hi
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be iowrite64be_lo_hi
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_LO_HI_H_ */
-- 
2.11.0
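
The lo_hi and hi_lo read helpers added by the patch above differ only in which 32-bit half is accessed first. A minimal user-space sketch of that idea follows; it assumes a fake two-register device (fake_dev, fake_read32 and the fake_read64_* names are hypothetical and are not kernel APIs) and records the access order so the two variants can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a device: two 32-bit registers and an access log. */
struct fake_dev {
	uint32_t reg[2];	/* reg[0] = lower address, reg[1] = upper address */
	int      order[2];	/* index of the register read 1st and 2nd */
	int      n;		/* number of reads logged so far */
};

/* Read one 32-bit register and remember which one was touched. */
static uint32_t fake_read32(struct fake_dev *d, int idx)
{
	assert(idx == 0 || idx == 1);

	d->order[d->n++ % 2] = idx;
	return d->reg[idx];
}

/* Low word first, then high word (mirrors ioread64_lo_hi in the patch). */
static uint64_t fake_read64_lo_hi(struct fake_dev *d)
{
	uint32_t low, high;

	low = fake_read32(d, 0);
	high = fake_read32(d, 1);

	return low + ((uint64_t)high << 32);
}

/* High word first, then low word (mirrors ioread64_hi_lo in the patch). */
static uint64_t fake_read64_hi_lo(struct fake_dev *d)
{
	uint32_t low, high;

	high = fake_read32(d, 1);
	low = fake_read32(d, 0);

	return low + ((uint64_t)high << 32);
}
```

Both helpers return the same 64-bit value; only the order of the two 32-bit reads differs, which matters for devices where reading one half latches or clears the other. The cast to a 64-bit type before the shift is what keeps the high word from being shifted out of a 32-bit value.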



[PATCH v4 2/5] iomap: introduce io{read|write}64_{lo_hi|hi_lo}

2017-07-18 Thread Logan Gunthorpe
In order to provide non-atomic functions for io{read|write}64 that will
use readq and writeq when appropriate, we define a number of variants
of these functions in the generic iomap that will do non-atomic
operations on pio but atomic operations on mmio.

These functions are only defined if readq and writeq are defined. If
they are not, then the wrappers that always use non-atomic operations
from include/linux/io-64-nonatomic*.h will be used.

Signed-off-by: Logan Gunthorpe 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Arnd Bergmann 
Cc: Suresh Warrier 
Cc: Nicholas Piggin 
---
 arch/powerpc/include/asm/io.h |   2 +
 include/asm-generic/iomap.h   |  26 +++--
 lib/iomap.c   | 132 ++
 3 files changed, 154 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index af074923d598..4cc420cfaa78 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -788,8 +788,10 @@ extern void __iounmap_at(void *ea, unsigned long size);
 
 #define mmio_read16be(addr)readw_be(addr)
 #define mmio_read32be(addr)readl_be(addr)
+#define mmio_read64be(addr)readq_be(addr)
 #define mmio_write16be(val, addr)  writew_be(val, addr)
 #define mmio_write32be(val, addr)  writel_be(val, addr)
+#define mmio_write64be(val, addr)  writeq_be(val, addr)
 #define mmio_insb(addr, dst, count)readsb(addr, dst, count)
 #define mmio_insw(addr, dst, count)readsw(addr, dst, count)
 #define mmio_insl(addr, dst, count)readsl(addr, dst, count)
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 650fede33c25..e4601455ac4a 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -30,9 +30,16 @@ extern unsigned int ioread16(void __iomem *);
 extern unsigned int ioread16be(void __iomem *);
 extern unsigned int ioread32(void __iomem *);
 extern unsigned int ioread32be(void __iomem *);
-#ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+
+#ifdef readq
+#define ioread64_lo_hi ioread64_lo_hi
+#define ioread64_hi_lo ioread64_hi_lo
+#define ioread64be_lo_hi ioread64be_lo_hi
+#define ioread64be_hi_lo ioread64be_hi_lo
+extern u64 ioread64_lo_hi(void __iomem *addr);
+extern u64 ioread64_hi_lo(void __iomem *addr);
+extern u64 ioread64be_lo_hi(void __iomem *addr);
+extern u64 ioread64be_hi_lo(void __iomem *addr);
 #endif
 
 extern void iowrite8(u8, void __iomem *);
@@ -40,9 +47,16 @@ extern void iowrite16(u16, void __iomem *);
 extern void iowrite16be(u16, void __iomem *);
 extern void iowrite32(u32, void __iomem *);
 extern void iowrite32be(u32, void __iomem *);
-#ifdef CONFIG_64BIT
-extern void iowrite64(u64, void __iomem *);
-extern void iowrite64be(u64, void __iomem *);
+
+#ifdef writeq
+#define iowrite64_lo_hi iowrite64_lo_hi
+#define iowrite64_hi_lo iowrite64_hi_lo
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+void iowrite64_lo_hi(u64 val, void __iomem *addr);
+void iowrite64_hi_lo(u64 val, void __iomem *addr);
+void iowrite64be_lo_hi(u64 val, void __iomem *addr);
+void iowrite64be_hi_lo(u64 val, void __iomem *addr);
 #endif
 
 /*
diff --git a/lib/iomap.c b/lib/iomap.c
index fc3dcb4b238e..b993400d60bd 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -66,6 +66,7 @@ static void bad_io_access(unsigned long port, const char 
*access)
 #ifndef mmio_read16be
 #define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
 #define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
+#define mmio_read64be(addr) be64_to_cpu(__raw_readq(addr))
 #endif
 
 unsigned int ioread8(void __iomem *addr)
@@ -99,6 +100,80 @@ EXPORT_SYMBOL(ioread16be);
 EXPORT_SYMBOL(ioread32);
 EXPORT_SYMBOL(ioread32be);
 
+#ifdef readq
+static u64 pio_read64_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = inl(port);
+   hi = inl(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = inl(port + sizeof(u32));
+   lo = inl(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = pio_read32be(port + sizeof(u32));
+   hi = pio_read32be(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = pio_read32be(port);
+   lo = pio_read32be(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+u64 ioread64_lo_hi(void __iomem *addr)
+{
+   IO_COND(addr, return pio_read64_lo_hi(port), return readq(addr));
+   return 0xLL;
+}
+
+u64 ioread64_hi_lo(void __iomem *addr)
+{
+   IO_COND(addr, return 

[PATCH v4 1/5] powerpc: io.h: move iomap.h include so that it can use readq/writeq defs

2017-07-18 Thread Logan Gunthorpe
Subsequent patches in this series make use of the readq and writeq
defines in iomap.h. However, as is, they get missed on the powerpc
platform because the include comes before the defines. This patch
moves the include down to fix this.

Signed-off-by: Logan Gunthorpe 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Suresh Warrier 
Cc: "Oliver O'Halloran" 
---
 arch/powerpc/include/asm/io.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 422f99cf9924..af074923d598 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -33,8 +33,6 @@ extern struct pci_dev *isa_bridge_pcidev;
 #include 
 #include 
 
-#include 
-
 #ifdef CONFIG_PPC64
 #include 
 #endif
@@ -663,6 +661,8 @@ static inline void name at  
\
 #define writel_relaxed(v, addr)writel(v, addr)
 #define writeq_relaxed(v, addr)writeq(v, addr)
 
+#include 
+
 #ifdef CONFIG_PPC32
 #define mmiowb()
 #else
-- 
2.11.0
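
The ordering problem described in the patch above comes from the preprocessor evaluating `#ifdef` at the point where a header is included. The sketch below reproduces that effect in a single file; demo_readq, EARLY_HAS_READQ and LATE_HAS_READQ are hypothetical names chosen for illustration and do not appear in the kernel.

```c
#include <assert.h>

/* Stage 1: stands in for a header that is included BEFORE the accessor
 * macro exists. The #ifdef is evaluated right here, so it sees nothing. */
#ifdef demo_readq
#define EARLY_HAS_READQ 1
#else
#define EARLY_HAS_READQ 0
#endif

/* Stage 2: the accessor macro is defined (hypothetical stand-in for readq). */
#define demo_readq(p) (*(p))

/* Stage 3: the same check placed AFTER the definition now succeeds. */
#ifdef demo_readq
#define LATE_HAS_READQ 1
#else
#define LATE_HAS_READQ 0
#endif

/* Confirms that only the late check observed the macro. */
static void check_include_order(void)
{
	assert(EARLY_HAS_READQ == 0);
	assert(LATE_HAS_READQ == 1);
}
```

Moving the include below the definitions, as the patch does, corresponds to moving the stage 1 check down to stage 3.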



[PATCH v4 0/5] make io{read|write}64 globally usable

2017-07-18 Thread Logan Gunthorpe
This is version four of my patchset to enable drivers to use
io{read|write}64 on all arches.

Changes since v3:

- I noticed powerpc didn't use the appropriate functions because
  readq/writeq were not defined when iomap.h was included. Thus I've
  included a patch to adjust this.
- Fixed some mistakes with a couple of the defines in the
  io-64-nonatomic* headers.
- Fixed a typo noticed by Horia.


Horia Geantă (1):
  crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

Logan Gunthorpe (4):
  powerpc: io.h: move iomap.h include so that it can use readq/writeq
defs
  iomap: introduce io{read|write}64_{lo_hi|hi_lo}
  io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros
  ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks

 arch/powerpc/include/asm/io.h |   6 +-
 drivers/crypto/caam/regs.h|  35 ++---
 drivers/ntb/hw/intel/ntb_hw_intel.c   |  30 +---
 include/asm-generic/iomap.h   |  26 +--
 include/linux/io-64-nonatomic-hi-lo.h |  60 
 include/linux/io-64-nonatomic-lo-hi.h |  60 
 lib/iomap.c   | 132 ++
 7 files changed, 282 insertions(+), 67 deletions(-)

--
2.11.0


[PATCH v4 4/5] ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks

2017-07-18 Thread Logan Gunthorpe
Now that ioread64 and iowrite64 are available in io-64-nonatomic,
we can remove the hack at the top of ntb_hw_intel.c and replace it
with an include.

Signed-off-by: Logan Gunthorpe 
Cc: Jon Mason 
Cc: Allen Hubbe 
Acked-by: Dave Jiang 
---
 drivers/ntb/hw/intel/ntb_hw_intel.c | 30 +-
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c 
b/drivers/ntb/hw/intel/ntb_hw_intel.c
index 2557e2c05b90..606c90f59d4b 100644
--- a/drivers/ntb/hw/intel/ntb_hw_intel.c
+++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ntb_hw_intel.h"
 
@@ -155,35 +156,6 @@ MODULE_PARM_DESC(xeon_b2b_dsd_bar5_addr32,
 static inline enum ntb_topo xeon_ppd_topo(struct intel_ntb_dev *ndev, u8 ppd);
 static int xeon_init_isr(struct intel_ntb_dev *ndev);
 
-#ifndef ioread64
-#ifdef readq
-#define ioread64 readq
-#else
-#define ioread64 _ioread64
-static inline u64 _ioread64(void __iomem *mmio)
-{
-   u64 low, high;
-
-   low = ioread32(mmio);
-   high = ioread32(mmio + sizeof(u32));
-   return low | (high << 32);
-}
-#endif
-#endif
-
-#ifndef iowrite64
-#ifdef writeq
-#define iowrite64 writeq
-#else
-#define iowrite64 _iowrite64
-static inline void _iowrite64(u64 val, void __iomem *mmio)
-{
-   iowrite32(val, mmio);
-   iowrite32(val >> 32, mmio + sizeof(u32));
-}
-#endif
-#endif
-
 static inline int pdev_is_atom(struct pci_dev *pdev)
 {
switch (pdev->device) {
-- 
2.11.0



[PATCH 2/4] arm64: dts: freescale: ls208xa: share aliases node

2017-07-18 Thread Horia Geantă
The aliases node is identical for all boards, so move it
to the common file fsl-ls208xa.dtsi.

Signed-off-by: Horia Geantă 
---
 arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts  | 5 -
 arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts  | 5 -
 arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts | 5 -
 arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts  | 5 -
 arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts  | 5 -
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 5 +
 6 files changed, 5 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts
index ed209cd57283..3c99608b9b45 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts
@@ -55,11 +55,6 @@
model = "Freescale Layerscape 2080a QDS Board";
compatible = "fsl,ls2080a-qds", "fsl,ls2080a";
 
-   aliases {
-   serial0 = 
-   serial1 = 
-   };
-
chosen {
stdout-path = "serial0:115200n8";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts
index 67ec3f9c81a1..a4e7de9f70d8 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts
@@ -55,11 +55,6 @@
model = "Freescale Layerscape 2080a RDB Board";
compatible = "fsl,ls2080a-rdb", "fsl,ls2080a";
 
-   aliases {
-   serial0 = 
-   serial1 = 
-   };
-
chosen {
stdout-path = "serial1:115200n8";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts b/arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts
index 3ee718f0aaf8..fbbb73e571c0 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts
@@ -52,11 +52,6 @@
model = "Freescale Layerscape 2080a software Simulator model";
compatible = "fsl,ls2080a-simu", "fsl,ls2080a";
 
-   aliases {
-   serial0 = 
-   serial1 = 
-   };
-
ethernet@221 {
compatible = "smsc,lan91c111";
reg = <0x0 0x221 0x0 0x100>;
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts b/arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts
index 4a1df5ce3229..eaee5b1c3a44 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts
@@ -54,11 +54,6 @@
model = "Freescale Layerscape 2088A QDS Board";
compatible = "fsl,ls2088a-qds", "fsl,ls2088a";
 
-   aliases {
-   serial0 = 
-   serial1 = 
-   };
-
chosen {
stdout-path = "serial0:115200n8";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts b/arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts
index a76d4b4debd1..c411442cac62 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts
@@ -54,11 +54,6 @@
model = "Freescale Layerscape 2088A RDB Board";
compatible = "fsl,ls2088a-rdb", "fsl,ls2088a";
 
-   aliases {
-   serial0 = 
-   serial1 = 
-   };
-
chosen {
stdout-path = "serial1:115200n8";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
index 94cdd3045037..f135b987d13b 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
@@ -53,6 +53,11 @@
#address-cells = <2>;
#size-cells = <2>;
 
+   aliases {
+   serial0 = 
+   serial1 = 
+   };
+
cpu: cpus {
#address-cells = <1>;
#size-cells = <0>;
-- 
2.12.0.264.gd6db3f216544



[PATCH 3/4] arm64: dts: freescale: ls208xa: add crypto node

2017-07-18 Thread Horia Geantă
LS208xA has a SEC v5.1 security engine.

Signed-off-by: Horia Geantă 
---
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 41 ++
 1 file changed, 41 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
index f135b987d13b..fc1234dc90f9 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
@@ -46,6 +46,7 @@
  */
 
 #include 
+#include 
 
 / {
compatible = "fsl,ls2080a";
@@ -54,6 +55,7 @@
#size-cells = <2>;
 
aliases {
+   crypto = 
serial0 = 
serial1 = 
};
@@ -306,6 +308,45 @@
clock-names = "apb_pclk", "wdog_clk";
};
 
+   crypto: crypto@800 {
+   compatible = "fsl,sec-v5.0", "fsl,sec-v4.0";
+   fsl,sec-era = <8>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x00 0x800 0x10>;
+   reg = <0x00 0x800 0x0 0x10>;
+   interrupts = ;
+   dma-coherent;
+
+   sec_jr0: jr@1 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x1 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr1: jr@2 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x2 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr2: jr@3 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x3 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr3: jr@4 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x4 0x1>;
+   interrupts = ;
+   };
+   };
+
fsl_mc: fsl-mc@80c00 {
compatible = "fsl,qoriq-mc";
reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */
-- 
2.12.0.264.gd6db3f216544



[PATCH 1/4] crypto: caam/jr - add support for DPAA2 parts

2017-07-18 Thread Horia Geantă
Add support for using the caam/jr backend on DPAA2-based SoCs.
These have some particularities we have to account for:
-HW S/G format is different
-Management Complex (MC) firmware initializes / manages (partially)
the CAAM block: MCFGR, QI enablement in QICTL, RNG

Signed-off-by: Horia Geantă 
---
 drivers/crypto/caam/caamhash.c   |  7 ++--
 drivers/crypto/caam/ctrl.c   | 45 ++
 drivers/crypto/caam/ctrl.h   |  2 +
 drivers/crypto/caam/jr.c |  7 +++-
 drivers/crypto/caam/regs.h   |  1 +
 drivers/crypto/caam/sg_sw_qm2.h  | 81 
 drivers/crypto/caam/sg_sw_sec4.h | 30 +--
 7 files changed, 148 insertions(+), 25 deletions(-)
 create mode 100644 drivers/crypto/caam/sg_sw_qm2.h

diff --git a/drivers/crypto/caam/caamhash.c b/drivers/crypto/caam/caamhash.c
index 910ec61cae09..698580b60b2f 100644
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -791,8 +791,8 @@ static int ahash_update_ctx(struct ahash_request *req)
 to_hash - *buflen,
 *next_buflen, 0);
} else {
-   (edesc->sec4_sg + sec4_sg_src_index - 1)->len |=
-   cpu_to_caam32(SEC4_SG_LEN_FIN);
+   sg_to_sec4_set_last(edesc->sec4_sg + sec4_sg_src_index -
+   1);
}
 
desc = edesc->hw_desc;
@@ -882,8 +882,7 @@ static int ahash_final_ctx(struct ahash_request *req)
if (ret)
goto unmap_ctx;
 
-   (edesc->sec4_sg + sec4_sg_src_index - 1)->len |=
-   cpu_to_caam32(SEC4_SG_LEN_FIN);
+   sg_to_sec4_set_last(edesc->sec4_sg + sec4_sg_src_index - 1);
 
edesc->sec4_sg_dma = dma_map_single(jrdev, edesc->sec4_sg,
sec4_sg_bytes, DMA_TO_DEVICE);
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 7338f15b8674..fdbcba13824c 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -17,6 +17,8 @@
 
 bool caam_little_end;
 EXPORT_SYMBOL(caam_little_end);
+bool caam_dpaa2;
+EXPORT_SYMBOL(caam_dpaa2);
 
 #ifdef CONFIG_CAAM_QI
 #include "qi.h"
@@ -319,8 +321,11 @@ static int caam_remove(struct platform_device *pdev)
caam_qi_shutdown(ctrlpriv->qidev);
 #endif
 
-   /* De-initialize RNG state handles initialized by this driver. */
-   if (ctrlpriv->rng4_sh_init)
+   /*
+* De-initialize RNG state handles initialized by this driver.
+* In case of DPAA 2.x, RNG is managed by MC firmware.
+*/
+   if (!caam_dpaa2 && ctrlpriv->rng4_sh_init)
deinstantiate_rng(ctrldev, ctrlpriv->rng4_sh_init);
 
/* Shut down debug views */
@@ -552,12 +557,17 @@ static int caam_probe(struct platform_device *pdev)
 
/*
 * Enable DECO watchdogs and, if this is a PHYS_ADDR_T_64BIT kernel,
-* long pointers in master configuration register
+* long pointers in master configuration register.
+* In case of DPAA 2.x, Management Complex firmware performs
+* the configuration.
 */
-   clrsetbits_32(>mcr, MCFGR_AWCACHE_MASK | MCFGR_LONG_PTR,
- MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
- MCFGR_WDENABLE | MCFGR_LARGE_BURST |
- (sizeof(dma_addr_t) == sizeof(u64) ? MCFGR_LONG_PTR : 0));
+   caam_dpaa2 = !!(comp_params & CTPR_MS_DPAA2);
+   if (!caam_dpaa2)
+   clrsetbits_32(>mcr, MCFGR_AWCACHE_MASK | MCFGR_LONG_PTR,
+ MCFGR_AWCACHE_CACH | MCFGR_AWCACHE_BUFF |
+ MCFGR_WDENABLE | MCFGR_LARGE_BURST |
+ (sizeof(dma_addr_t) == sizeof(u64) ?
+  MCFGR_LONG_PTR : 0));
 
/*
 *  Read the Compile Time paramters and SCFGR to determine
@@ -586,7 +596,9 @@ static int caam_probe(struct platform_device *pdev)
  JRSTART_JR3_START);
 
if (sizeof(dma_addr_t) == sizeof(u64)) {
-   if (of_device_is_compatible(nprop, "fsl,sec-v5.0"))
+   if (caam_dpaa2)
+   ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(49));
+   else if (of_device_is_compatible(nprop, "fsl,sec-v5.0"))
ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
else
ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(36));
@@ -629,11 +641,9 @@ static int caam_probe(struct platform_device *pdev)
ring++;
}
 
-   /* Check to see if QI present. If so, enable */
-   ctrlpriv->qi_present =
-   !!(rd_reg32(>perfmon.comp_parms_ms) &
-  CTPR_MS_QI_MASK);
-   if 

[PATCH 0/4] crypto: caam - add Job Ring support for DPAA2 parts

2017-07-18 Thread Horia Geantă
This patch set adds support for CAAM's legacy Job Ring backend / interface
on platforms having DPAA2 (Datapath Acceleration Architecture v2), like
LS1088A or LS2088A.

I would like to get the DT patches through the crypto list (to make sure
they don't end up merged before driver support).

Thanks,
Horia

Horia Geantă (4):
  crypto: caam/jr - add support for DPAA2 parts
  arm64: dts: freescale: ls208xa: share aliases node
  arm64: dts: freescale: ls208xa: add crypto node
  arm64: dts: freescale: ls1088a: add crypto node

 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 43 
 arch/arm64/boot/dts/freescale/fsl-ls2080a-qds.dts  |  5 --
 arch/arm64/boot/dts/freescale/fsl-ls2080a-rdb.dts  |  5 --
 arch/arm64/boot/dts/freescale/fsl-ls2080a-simu.dts |  5 --
 arch/arm64/boot/dts/freescale/fsl-ls2088a-qds.dts  |  5 --
 arch/arm64/boot/dts/freescale/fsl-ls2088a-rdb.dts  |  5 --
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 46 
 drivers/crypto/caam/caamhash.c |  7 +-
 drivers/crypto/caam/ctrl.c | 45 +++-
 drivers/crypto/caam/ctrl.h |  2 +
 drivers/crypto/caam/jr.c   |  7 +-
 drivers/crypto/caam/regs.h |  1 +
 drivers/crypto/caam/sg_sw_qm2.h| 81 ++
 drivers/crypto/caam/sg_sw_sec4.h   | 30 ++--
 14 files changed, 237 insertions(+), 50 deletions(-)
 create mode 100644 drivers/crypto/caam/sg_sw_qm2.h

-- 
2.12.0.264.gd6db3f216544



[PATCH 4/4] arm64: dts: freescale: ls1088a: add crypto node

2017-07-18 Thread Horia Geantă
LS1088A has a SEC v5.3 security engine.

Signed-off-by: Horia Geantă 
---
 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 43 ++
 1 file changed, 43 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index c144d06a6e33..6c22d75bc504 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -52,6 +52,10 @@
#address-cells = <2>;
#size-cells = <2>;
 
+   aliases {
+   crypto = 
+   };
+
cpus {
#address-cells = <1>;
#size-cells = <0>;
@@ -369,6 +373,45 @@
dma-coherent;
status = "disabled";
};
+
+   crypto: crypto@800 {
+   compatible = "fsl,sec-v5.0", "fsl,sec-v4.0";
+   fsl,sec-era = <8>;
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges = <0x0 0x00 0x800 0x10>;
+   reg = <0x00 0x800 0x0 0x10>;
+   interrupts = ;
+   dma-coherent;
+
+   sec_jr0: jr@1 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x1 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr1: jr@2 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x2 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr2: jr@3 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x3 0x1>;
+   interrupts = ;
+   };
+
+   sec_jr3: jr@4 {
+   compatible = "fsl,sec-v5.0-job-ring",
+"fsl,sec-v4.0-job-ring";
+   reg= <0x4 0x1>;
+   interrupts = ;
+   };
+   };
};
 
 };
-- 
2.12.0.264.gd6db3f216544



Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Stephan Müller
Am Dienstag, 18. Juli 2017, 10:52:12 CEST schrieb Greg Kroah-Hartman:

Hi Greg,

> 
> > I have stated the core concerns I have with random.c in [1]. To remedy
> > these core concerns, major changes to random.c are needed. With the past
> > experience, I would doubt that I get the changes into random.c.
> > 
> > [1] https://www.spinics.net/lists/linux-crypto/msg26316.html
> 
> Evolution is the correct way to do this, kernel development relies on
> that.  We don't do the "use this totally different and untested file
> instead!" method.

I am not sure I understand your reply. The offered patch set does not rip out
existing code. It adds a replacement implementation which can be enabled
at compile time. It is even disabled by default (and thus the legacy
code is compiled).

I see such a development approach in numerous different kernel core areas: 
memory allocators (SLAB, SLOB, SLUB), process schedulers, IRQ schedulers.

What is so different for the realm of RNGs?

Ciao
Stephan


Re: [v3 RFC PATCH 2/2] crypto: ecc: use caller's GFP flags

2017-07-18 Thread Tudor Ambarus

Hi, Herbert,

On 07/18/2017 08:52 AM, Herbert Xu wrote:
> On Wed, Jun 28, 2017 at 05:08:36PM +0300, Tudor Ambarus wrote:
>> Using GFP_KERNEL when allocating data and implicitly
>> assuming that we can sleep was wrong because the caller
>> could be in atomic context. Let the caller decide whether
>> sleeping is possible or not.
>>
>> The caller (ecdh) was updated in the same patch in order
>> to not affect bisectability.
>>
>> Signed-off-by: Tudor Ambarus 
>
> Hmm, who wants to do asymmetric key crypto in interrupt context?

I thought of a caller that doesn't set the CRYPTO_TFM_REQ_MAY_SLEEP
flag. As of now I'm not aware of a user who wants asymmetric key
crypto in interrupt context. We can consider this patch superfluous
if CRYPTO_TFM_REQ_MAY_SLEEP is not meaningful for asymmetric key crypto.

Thanks,
ta
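
The "let the caller decide" idea above is usually expressed as a test of the request flags. A minimal userspace sketch follows; `DEMO_REQ_MAY_SLEEP`, `enum demo_gfp` and `demo_req_gfp()` are hypothetical stand-ins for CRYPTO_TFM_REQ_MAY_SLEEP and the kernel's GFP_KERNEL/GFP_ATOMIC, and the flag value is illustrative only. It is not the code from the patch.

```c
/* Stand-in for CRYPTO_TFM_REQ_MAY_SLEEP; the value is illustrative only. */
#define DEMO_REQ_MAY_SLEEP 0x200u

/* Stand-ins for the kernel's GFP_ATOMIC / GFP_KERNEL allocation modes. */
enum demo_gfp {
	DEMO_GFP_ATOMIC,	/* allocation must not sleep */
	DEMO_GFP_KERNEL,	/* allocation may sleep */
};

/* Let the request flags, not the helper, decide whether sleeping is allowed. */
static enum demo_gfp demo_req_gfp(unsigned int req_flags)
{
	return (req_flags & DEMO_REQ_MAY_SLEEP) ? DEMO_GFP_KERNEL
						 : DEMO_GFP_ATOMIC;
}
```

A helper deep in the call chain would then take the resulting mode as a parameter instead of hard-coding the sleeping variant.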


Re: [PATCH 1/2] crypto: inside-secure - fix invalidation check in hmac_sha1_setkey

2017-07-18 Thread Antoine Tenart
Hi,

On Mon, Jul 17, 2017 at 11:45:19AM +0200, Antoine Tenart wrote:
> The safexcel_hmac_sha1_setkey function checks if an invalidation command
> should be issued, i.e. when the context ipad/opad change. This check is
> done after filling the ipad/opad, so it can never be true. The patch
> fixes this by moving the check before the ipad/opad memcpy operations.
> 
> Signed-off-by: Antoine Tenart 

This patch should have stable: and fixes: tags. I'll add them and send a
v2.

Thanks,
Antoine

> ---
>  drivers/crypto/inside-secure/safexcel_hash.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
> index 8527a5899a2f..a11b2edb41b9 100644
> --- a/drivers/crypto/inside-secure/safexcel_hash.c
> +++ b/drivers/crypto/inside-secure/safexcel_hash.c
> @@ -883,9 +883,6 @@ static int safexcel_hmac_sha1_setkey(struct crypto_ahash *tfm, const u8 *key,
>   if (ret)
>   return ret;
>  
> - memcpy(ctx->ipad, , SHA1_DIGEST_SIZE);
> - memcpy(ctx->opad, , SHA1_DIGEST_SIZE);
> -
>   for (i = 0; i < ARRAY_SIZE(istate.state); i++) {
>   if (ctx->ipad[i] != le32_to_cpu(istate.state[i]) ||
>   ctx->opad[i] != le32_to_cpu(ostate.state[i])) {
> @@ -894,6 +891,9 @@ static int safexcel_hmac_sha1_setkey(struct crypto_ahash *tfm, const u8 *key,
>   }
>   }
>  
> + memcpy(ctx->ipad, , SHA1_DIGEST_SIZE);
> + memcpy(ctx->opad, , SHA1_DIGEST_SIZE);
> +
>   return 0;
>  }
>  
> -- 
> 2.13.3
> 

-- 
Antoine Ténart, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
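
The ordering issue described in the quoted commit message can be reproduced outside the driver. Below is a minimal userspace sketch, assuming a hypothetical context type (`struct demo_hmac_ctx`) and function (`demo_hmac_setkey`) that only mirror the shape of the driver code: the new state is compared against the cached pads first, and copied in afterwards. Doing the copy first would make the comparison always find equal values.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_STATE_WORDS 5	/* SHA-1 state: 20 bytes = 5 x 32-bit words */

/* Hypothetical context: cached pads plus an "invalidate" flag. */
struct demo_hmac_ctx {
	uint32_t ipad[DEMO_STATE_WORDS];
	uint32_t opad[DEMO_STATE_WORDS];
	bool needs_inv;
};

static void demo_hmac_setkey(struct demo_hmac_ctx *ctx,
			     const uint32_t *istate, const uint32_t *ostate)
{
	size_t i;

	/* Compare against the pads cached by the previous setkey first ... */
	for (i = 0; i < DEMO_STATE_WORDS; i++) {
		if (ctx->ipad[i] != istate[i] || ctx->opad[i] != ostate[i]) {
			ctx->needs_inv = true;
			break;
		}
	}

	/* ... and only then overwrite them with the new state. */
	memcpy(ctx->ipad, istate, sizeof(ctx->ipad));
	memcpy(ctx->opad, ostate, sizeof(ctx->opad));
}
```

Setting the same key twice leaves the flag untouched on the second call, while a changed key raises it.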




Re: [v3 RFC PATCH 1/2] crypto: ecdh: fix concurrency on ecdh_ctx

2017-07-18 Thread Tudor Ambarus

Hi, Herbert,

On 07/18/2017 08:50 AM, Herbert Xu wrote:
> On Wed, Jun 28, 2017 at 05:08:35PM +0300, Tudor Ambarus wrote:
>> ecdh_ctx contained statically allocated data for the shared secret,
>> for the public and private key.
>>
>> When talking about shared secret and public key, they were
>> doomed to concurrency issues because they could be shared by
>> multiple crypto requests. The requests were generating specific
>> data to the same zone of memory without any protection.
>>
>> The private key was subject to concurrency problems because
>> multiple setkey calls could fight to memcpy to the same zone
>> of memory.
>>
>> Shared secret and public key concurrency is fixed by allocating
>> memory on heap for each request. In the end, the shared secret
>> and public key are computed for each request, there is no reason
>> to use shared memory.
>>
>> Private key concurrency is fixed by allocating memory on heap
>> for each setkey call, by memcopying the parsed/generated private
>> key to the heap and by making the private key pointer from
>> ecdh_ctx to point to the newly allocated data.
>>
>> On all systems running Linux, loads from and stores to pointers
>> are atomic, that is, if a store to a pointer occurs at the same
>> time as a load from that same pointer, the load will return either
>> the initial value or the value stored, never some bitwise mashup
>> of the two.
>>
>> With this, the private key will always point to a valid key;
>> which setkey call it belongs to is the responsibility of the
>> caller, as is already the case throughout the crypto framework.
>
> I don't get it.  You're replacing a per-tfm shared secret with
> a per-tfm dynamically allocated shared secret.  As far as I can
> see nothing has changed?

I'm replacing a per-tfm shared secret with a per-request dynamically
allocated shared secret. The same for the public key, I'm replacing
a per-tfm public key with a per-request dynamically allocated public
key. No shared memory, no concurrency.

> If the caller is making simultaneous setkey calls on the same tfm,
> then it's their problem.

For the private key I wanted to be sure that we don't interleave data
from multiple setkey calls. As of now, the setkey calls can fight to
memcopy to the same zone of memory. I'm making sure that the private key
is valid and not a mash of multiple private keys.

Thanks,
ta
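
The private-key scheme discussed above (copy into a fresh heap buffer, then publish it with a single pointer store) can be sketched in userspace C11. The names `struct demo_ecdh_ctx`, `demo_set_secret()` and `DEMO_KEY_NDIGITS` are hypothetical, and C11 atomics stand in for the kernel's pointer-store guarantees; this is an illustration of the idea, not the patch itself. Note the caveat in the comment: releasing the old buffer is only safe when no reader can still hold it, which the sketch does not address.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_KEY_NDIGITS 4	/* hypothetical: 4 x 64-bit digits per key */

struct demo_ecdh_ctx {
	_Atomic(uint64_t *) private_key;	/* NULL or a complete key */
};

/* Copy the key to a fresh buffer, then publish it with one pointer swap. */
static int demo_set_secret(struct demo_ecdh_ctx *ctx, const uint64_t *key)
{
	uint64_t *new_key = malloc(DEMO_KEY_NDIGITS * sizeof(*new_key));
	uint64_t *old_key;

	if (!new_key)
		return -1;
	memcpy(new_key, key, DEMO_KEY_NDIGITS * sizeof(*new_key));

	/* Readers see either the old or the new pointer, never a mix. */
	old_key = atomic_exchange(&ctx->private_key, new_key);

	/* Caveat: only safe if no reader is still using old_key. */
	free(old_key);
	return 0;
}
```

Two racing setkey calls each fill their own buffer, so the published key is always one complete key; which of the two wins remains up to the caller.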


Re: [PATCH v2 2/2] crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 10:56, Ard Biesheuvel  wrote:
> On 18 July 2017 at 10:49, Herbert Xu  wrote:
>> On Wed, Jul 05, 2017 at 12:43:19AM +0100, Ard Biesheuvel wrote:
>>> Implement a NEON fallback for systems that do support NEON but have
>>> no support for the optional 64x64->128 polynomial multiplication
>>> instruction that is part of the ARMv8 Crypto Extensions. It is based
>>> on the paper "Fast Software Polynomial Multiplication on ARM Processors
>>> Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
>>> Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked
>>> extensively for the AArch64 ISA.
>>>
>>> On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the
>>> NEON based implementation is 4x faster than the table based one, and
>>> is time invariant as well, making it less vulnerable to timing attacks.
>>> When combined with the bit-sliced NEON implementation of AES-CTR, the
>>> AES-GCM performance increases by ~2x (from 58 to 30 cycles per byte).
>>>
>>> Signed-off-by: Ard Biesheuvel 
>>
>> This patch does not apply against cryptodev.
>>
>
> Yeah, it implements a non-SIMD fallback which depends on the AES
> refactor series.

FYI I have pushed everything I have queued up locally here:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=crypto-arm-for-v4.14

Once the crypto_xor() and AES refactor stuff looks satisfactory to
you, I will repost the remaining bits, including these GCM and GHASH
changes.

Thanks,
Ard.


[PATCH v4 8/8] crypto: aes - add meaningful help text to the various AES drivers

2017-07-18 Thread Ard Biesheuvel
Remove the duplicated boilerplate help text and add a bit of
explanation about the nature of the various AES implementations that
exist for various architectures. In particular, highlight the time
variant nature of some implementations, and the fact that they can be
omitted if required.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm/crypto/Kconfig   |  16 ++-
 arch/arm64/crypto/Kconfig |  30 +-
 crypto/Kconfig| 104 +++-
 3 files changed, 75 insertions(+), 75 deletions(-)

diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index b9adedcc5b2e..f611127c5ef9 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -62,11 +62,23 @@ config CRYPTO_SHA512_ARM
  using optimized ARM assembler and NEON, when available.
 
 config CRYPTO_AES_ARM
-   tristate "Scalar AES cipher for ARM"
+   tristate "Table based AES cipher for 32-bit ARM"
select CRYPTO_ALGAPI
select CRYPTO_AES
help
- Use optimized AES assembler routines for ARM platforms.
+ Table based implementation in 32-bit ARM assembler of the FIPS-197
+ Advanced Encryption Standard (AES) symmetric cipher algorithm. This
+ driver reuses the tables exposed by the generic AES driver.
+
+ For CPUs that lack the special ARMv8-CE instructions, this is the
+ fastest implementation available of the core cipher, but it may be
+ susceptible to known-plaintext attacks on the key due to the
+ correlation between the processing time and the input of the first
+ round. Therefore, it is recommended to also enable the time invariant
+ NEON based driver below (CRYPTO_AES_ARM_BS), which will supersede
+ this driver on NEON capable CPUs when using AES in CBC, CTR and XTS
+ modes. If time invariance is a requirement, this driver should not
+ be enabled.
 
 config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index d92293747d63..bf38680a2dbb 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -42,13 +42,37 @@ config CRYPTO_CRC32_ARM64_CE
select CRYPTO_HASH
 
 config CRYPTO_AES_ARM64
-   tristate "AES core cipher using scalar instructions"
+   tristate "Table based AES cipher for 64-bit ARM"
select CRYPTO_AES
+   help
+ Table based implementation in 64-bit ARM assembler of the FIPS-197
+ Advanced Encryption Standard (AES) symmetric cipher algorithm. This
+ driver reuses the tables exposed by the generic AES driver.
+
+ For CPUs that lack the special ARMv8-CE instructions, this is the
+ fastest implementation available of the core cipher, but it may be
+ susceptible to known-plaintext attacks on the key due to the
+ correlation between the processing time and the input of the first
+ round. Therefore, it is recommended to also enable the time invariant
+ drivers below (CRYPTO_AES_ARM64_NEON_BLK and CRYPTO_AES_ARM64_BS),
+ which will supersede this driver when using AES in the specific modes
+ that they implement. If time invariance is a requirement, this driver
+ should not be enabled.
 
 config CRYPTO_AES_ARM64_CE
-   tristate "AES core cipher using ARMv8 Crypto Extensions"
-   depends on ARM64 && KERNEL_MODE_NEON
+   tristate "AES cipher using ARMv8 Crypto Extensions"
+   depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
+   help
+ Implementation in assembler of the FIPS-197 Advanced Encryption
+ Standard (AES) symmetric cipher algorithm, using instructions from
+ ARM's optional ARMv8 Crypto Extensions. This implementation is time
+ invariant, and is by far the preferred option for CPUs that support
+ this extension.
+
+ If in doubt, enable as a module: it will be loaded automatically on
+ CPUs that support it, and supersede other implementations of the AES
+ cipher.
 
 config CRYPTO_AES_ARM64_CE_CCM
tristate "AES in CCM mode using ARMv8 Crypto Extensions"
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 8f4b9f3381e2..9bec9f7a81d9 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -909,51 +909,37 @@ config CRYPTO_AES
  block.
 
 config CRYPTO_AES_586
-   tristate "AES cipher algorithms (i586)"
+   tristate "Table based AES cipher for 32-bit x86"
depends on (X86 || UML_X86) && !64BIT
select CRYPTO_ALGAPI
select CRYPTO_AES
help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and 

[PATCH v4 5/8] crypto: arm/aes - avoid expanded lookup tables in the final round

2017-07-18 Thread Ard Biesheuvel
For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox exported by the generic AES driver, which is 4x smaller than
the inverse table exported by the generic driver.

This significantly reduces the Dcache footprint of our code, and does
not introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm/crypto/aes-cipher-core.S | 51 ++--
 1 file changed, 26 insertions(+), 25 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index a727692cd9c1..5e9ddc576ec1 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -33,19 +33,19 @@
.endif
.endm
 
-   .macro  __load, out, in, idx
+   .macro  __load, out, in, idx, sz, op
.if __LINUX_ARM_ARCH__ < 7 && \idx > 0
-   ldr \out, [ttab, \in, lsr #(8 * \idx) - 2]
+   ldr\op  \out, [ttab, \in, lsr #(8 * \idx) - \sz]
.else
-   ldr \out, [ttab, \in, lsl #2]
+   ldr\op  \out, [ttab, \in, lsl #\sz]
.endif
.endm
 
-   .macro  __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
+   .macro  __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc, sz, op
__select \out0, \in0, 0
__select t0, \in1, 1
-   __load  \out0, \out0, 0
-   __load  t0, t0, 1
+   __load  \out0, \out0, 0, \sz, \op
+   __load  t0, t0, 1, \sz, \op
 
.if \enc
__select \out1, \in1, 0
@@ -54,10 +54,10 @@
__select \out1, \in3, 0
__select t1, \in0, 1
.endif
-   __load  \out1, \out1, 0
+   __load  \out1, \out1, 0, \sz, \op
__select t2, \in2, 2
-   __load  t1, t1, 1
-   __load  t2, t2, 2
+   __load  t1, t1, 1, \sz, \op
+   __load  t2, t2, 2, \sz, \op
 
eor \out0, \out0, t0, ror #24
 
@@ -69,9 +69,9 @@
__select \t3, \in1, 2
__select \t4, \in2, 3
.endif
-   __load  \t3, \t3, 2
-   __load  t0, t0, 3
-   __load  \t4, \t4, 3
+   __load  \t3, \t3, 2, \sz, \op
+   __load  t0, t0, 3, \sz, \op
+   __load  \t4, \t4, 3, \sz, \op
 
eor \out1, \out1, t1, ror #24
eor \out0, \out0, t2, ror #16
@@ -83,14 +83,14 @@
eor \out1, \out1, t2
.endm
 
-   .macro  fround, out0, out1, out2, out3, in0, in1, in2, in3
-   __hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1
-   __hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1
+   .macro  fround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+   __hround \out0, \out1, \in0, \in1, \in2, \in3, \out2, \out3, 1, \sz, \op
+   __hround \out2, \out3, \in2, \in3, \in0, \in1, \in1, \in2, 1, \sz, \op
.endm
 
-   .macro  iround, out0, out1, out2, out3, in0, in1, in2, in3
-   __hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0
-   __hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0
+   .macro  iround, out0, out1, out2, out3, in0, in1, in2, in3, sz=2, op
+   __hround \out0, \out1, \in0, \in3, \in2, \in1, \out2, \out3, 0, \sz, \op
+   __hround \out2, \out3, \in2, \in1, \in0, \in3, \in1, \in0, 0, \sz, \op
.endm
 
.macro  __rev, out, in
@@ -115,7 +115,7 @@
.endif
.endm
 
-   .macro  do_crypt, round, ttab, ltab
+   .macro  do_crypt, round, ttab, ltab, bsz
push {r3-r11, lr}
 
ldr r4, [in]
@@ -147,9 +147,12 @@
 
 1: subs rounds, rounds, #4
\round  r8, r9, r10, r11, r4, r5, r6, r7
-   __adrl  ttab, \ltab, ls
+   bls 2f
\round  r4, r5, r6, r7, r8, r9, r10, r11
-   bhi 0b
+   b   0b
+
+2: __adrl  ttab, \ltab
+   \round  r4, r5, r6, r7, r8, r9, r10, r11, \bsz, b
 
 #ifdef CONFIG_CPU_BIG_ENDIAN
__rev   r4, r4
@@ -173,14 +176,12 @@
 
.align  6
aes_table_reduced   crypto_ft_tab
-   aes_table_reduced   crypto_fl_tab
aes_table_reduced   crypto_it_tab
-   aes_table_reduced   crypto_il_tab
 
 ENTRY(__aes_arm_encrypt)
-   do_cryptfround, crypto_ft_tab, 
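
The encryption half of the commit message (byte loads from the inner-round table in the final round) can be illustrated in plain C. This is a sketch under stated assumptions: each forward-table entry is taken to hold the bytes { 2*S[x], S[x], S[x], 3*S[x] } in memory (a little-endian view of the usual 32-bit entry), so a byte load at offset 1 returns the plain S-box value. All names (`demo_ft_tab`, `demo_final_round_lookup`, and so on) are hypothetical, and the S-box is computed here rather than taken from the kernel.

```c
#include <stdint.h>

/* Multiply two elements of GF(2^8) modulo the AES polynomial 0x11b. */
static uint8_t demo_gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
		b >>= 1;
	}
	return r;
}

static uint8_t demo_rotl8(uint8_t x, unsigned int n)
{
	return (uint8_t)((x << n) | (x >> (8 - n)));
}

/* AES S-box: multiplicative inverse followed by the affine transform. */
static uint8_t demo_sbox(uint8_t x)
{
	uint8_t inv = 0;	/* 0 has no inverse and maps to 0 */
	unsigned int y;

	for (y = 1; y < 256 && x; y++) {
		if (demo_gf_mul(x, (uint8_t)y) == 1) {
			inv = (uint8_t)y;
			break;
		}
	}
	return inv ^ demo_rotl8(inv, 1) ^ demo_rotl8(inv, 2) ^
	       demo_rotl8(inv, 3) ^ demo_rotl8(inv, 4) ^ 0x63;
}

/* Forward table stored as bytes, 4 per entry: { 2*S, S, S, 3*S }. */
static uint8_t demo_ft_tab[256 * 4];

static void demo_build_ft_tab(void)
{
	unsigned int x;

	for (x = 0; x < 256; x++) {
		uint8_t s = demo_sbox((uint8_t)x);

		demo_ft_tab[4 * x + 0] = demo_gf_mul(s, 2);
		demo_ft_tab[4 * x + 1] = s;
		demo_ft_tab[4 * x + 2] = s;
		demo_ft_tab[4 * x + 3] = demo_gf_mul(s, 3);
	}
}

/* Final round: a byte load at offset 1 of the entry yields plain S[x]. */
static uint8_t demo_final_round_lookup(uint8_t x)
{
	return demo_ft_tab[4 * x + 1];
}
```

Because the final round has no MixColumns step, only the unmultiplied S[x] byte is needed, which is why a separate expanded table for that round can be dropped without adding a new one.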

[PATCH v4 7/8] crypto: arm64/aes-neon - reuse Sboxes from AES core module

2017-07-18 Thread Ard Biesheuvel
The newly introduced AES core module exposes its Sboxes for the benefit
of the fixed time AES driver. Since the arm64 NEON based implementation
already depends on the same core module for its key expansion routines,
let's use its Sboxes as well, and remove the local copy.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/aes-neon.S | 74 +---
 1 file changed, 3 insertions(+), 71 deletions(-)

diff --git a/arch/arm64/crypto/aes-neon.S b/arch/arm64/crypto/aes-neon.S
index f1e3aa2732f9..2acb5f81dcdb 100644
--- a/arch/arm64/crypto/aes-neon.S
+++ b/arch/arm64/crypto/aes-neon.S
@@ -32,7 +32,7 @@
 
/* preload the entire Sbox */
.macro  prepare, sbox, shiftrows, temp
-   adr \temp, \sbox
+   adr_l   \temp, \sbox
movi v12.16b, #0x1b
ldr q13, \shiftrows
ldr q14, .Lror32by8
@@ -44,7 +44,7 @@
 
/* do preload for encryption */
.macro  enc_prepare, ignore0, ignore1, temp
-   prepare .LForward_Sbox, .LForward_ShiftRows, \temp
+   prepare crypto_aes_sbox, .LForward_ShiftRows, \temp
.endm
 
.macro  enc_switch_key, ignore0, ignore1, temp
@@ -53,7 +53,7 @@
 
/* do preload for decryption */
.macro  dec_prepare, ignore0, ignore1, temp
-   prepare .LReverse_Sbox, .LReverse_ShiftRows, \temp
+   prepare crypto_aes_inv_sbox, .LReverse_ShiftRows, \temp
.endm
 
/* apply SubBytes transformation using the the preloaded Sbox */
@@ -274,74 +274,6 @@
 
.text
.align  6
-.LForward_Sbox:
-   .byte   0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5
-   .byte   0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76
-   .byte   0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0
-   .byte   0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0
-   .byte   0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc
-   .byte   0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15
-   .byte   0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a
-   .byte   0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75
-   .byte   0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0
-   .byte   0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84
-   .byte   0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b
-   .byte   0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf
-   .byte   0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85
-   .byte   0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8
-   .byte   0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5
-   .byte   0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2
-   .byte   0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17
-   .byte   0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73
-   .byte   0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88
-   .byte   0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb
-   .byte   0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c
-   .byte   0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79
-   .byte   0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9
-   .byte   0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08
-   .byte   0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6
-   .byte   0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a
-   .byte   0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e
-   .byte   0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e
-   .byte   0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94
-   .byte   0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf
-   .byte   0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68
-   .byte   0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
-
-.LReverse_Sbox:
-   .byte   0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38
-   .byte   0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb
-   .byte   0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87
-   .byte   0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb
-   .byte   0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d
-   .byte   0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e
-   .byte   0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2
-   .byte   0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25
-   .byte   0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16
-   .byte   0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92
-   .byte   0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda
-   .byte   0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84
-   .byte   0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a
-   .byte   0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06
-   .byte   

[PATCH v4 4/8] crypto: x86/aes-ni - switch to generic fallback

2017-07-18 Thread Ard Biesheuvel
The time invariant AES-NI implementation is SIMD based, and so it needs
a fallback in case the code is called from a context where SIMD is not
allowed. On x86, this is really only when executing in the context of an
interrupt taken while in kernel mode, since SIMD is allowed in all other
cases.

There is very little code in the kernel that actually performs AES in
interrupt context, and the code that does (mac80211) only does so when
running on 802.11 devices that have no support for AES in hardware, and
those are rare these days.

So switch to the new AES core code as a fallback. It is much smaller, as
well as more resistant to cache timing attacks, and removing the
dependency allows us to disable the time variant drivers altogether if
desired.

Signed-off-by: Ard Biesheuvel 
---
 arch/x86/crypto/aesni-intel_glue.c | 4 ++--
 crypto/Kconfig | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 4a55cdcdc008..1734e6185800 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -334,7 +334,7 @@ static void aes_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
 
if (!irq_fpu_usable())
-   crypto_aes_encrypt_x86(ctx, dst, src);
+   crypto_aes_encrypt(ctx, dst, src);
else {
kernel_fpu_begin();
aesni_enc(ctx, dst, src);
@@ -347,7 +347,7 @@ static void aes_decrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
struct crypto_aes_ctx *ctx = aes_ctx(crypto_tfm_ctx(tfm));
 
if (!irq_fpu_usable())
-   crypto_aes_decrypt_x86(ctx, dst, src);
+   crypto_aes_decrypt(ctx, dst, src);
else {
kernel_fpu_begin();
aesni_dec(ctx, dst, src);
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 7766fea9c18e..8f4b9f3381e2 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -956,8 +956,7 @@ config CRYPTO_AES_NI_INTEL
tristate "AES cipher algorithms (AES-NI)"
depends on X86
select CRYPTO_AEAD
-   select CRYPTO_AES_X86_64 if 64BIT
-   select CRYPTO_AES_586 if !64BIT
+   select CRYPTO_AES
select CRYPTO_ALGAPI
select CRYPTO_BLKCIPHER
select CRYPTO_GLUE_HELPER_X86 if 64BIT
-- 
2.9.3
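
The dispatch pattern this patch touches can be sketched in plain userspace C. This is only an illustration under stated assumptions: `simd_allowed` is a hypothetical stand-in for irq_fpu_usable(), and the two toy_encrypt_*() helpers are placeholders that XOR with a constant rather than perform AES. The point is the shape of the decision: the scalar fallback runs only when SIMD is off-limits, and both paths must produce the same result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for irq_fpu_usable(): may SIMD be used right now? */
static bool simd_allowed = true;

/* Counters so a caller can observe which path was taken. */
static unsigned int simd_calls;
static unsigned int scalar_calls;

/* Placeholder for the SIMD path (aesni_enc() in the patch); not real AES. */
static void toy_encrypt_simd(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		dst[i] = src[i] ^ 0x5a;
	simd_calls++;
}

/* Placeholder for the generic fallback (crypto_aes_encrypt() in the patch). */
static void toy_encrypt_scalar(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		dst[i] = src[i] ^ 0x5a;
	scalar_calls++;
}

/* Same shape as aes_encrypt() above: fall back only when SIMD is unusable. */
static void toy_encrypt(uint8_t *dst, const uint8_t *src)
{
	if (!simd_allowed)
		toy_encrypt_scalar(dst, src);
	else
		toy_encrypt_simd(dst, src);
}
```

Because the fallback is reached only through this one branch, swapping the routine behind it (as the patch does) needs no change in the callers.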



[PATCH v4 2/8] crypto - aes: use dedicated lookup tables for table based asm routines

2017-07-18 Thread Ard Biesheuvel
Instead of linking against the table based AES generic C code to reuse
the lookup tables, add an assembler file that defines a couple of macros
that instantiate the tables in-place. This allows us to replace AES in
a subsequent patch.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm/crypto/aes-cipher-core.S   |7 +
 arch/arm64/crypto/aes-cipher-core.S |8 +-
 arch/x86/crypto/aes-i586-asm_32.S   |   13 +-
 arch/x86/crypto/aes-x86_64-asm_64.S |   12 +-
 include/crypto/aes-tables.S | 1104 
 include/crypto/aes.h|5 -
 6 files changed, 1132 insertions(+), 17 deletions(-)

diff --git a/arch/arm/crypto/aes-cipher-core.S b/arch/arm/crypto/aes-cipher-core.S
index c817a86c4ca8..a727692cd9c1 100644
--- a/arch/arm/crypto/aes-cipher-core.S
+++ b/arch/arm/crypto/aes-cipher-core.S
@@ -9,6 +9,7 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
 
.text
@@ -170,6 +171,12 @@
.ltorg
.endm
 
+   .align  6
+   aes_table_reduced   crypto_ft_tab
+   aes_table_reduced   crypto_fl_tab
+   aes_table_reduced   crypto_it_tab
+   aes_table_reduced   crypto_il_tab
+
 ENTRY(__aes_arm_encrypt)
do_crypt        fround, crypto_ft_tab, crypto_fl_tab
 ENDPROC(__aes_arm_encrypt)
diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index f2f9cc519309..bbe5dd96135c 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -8,6 +8,7 @@
  * published by the Free Software Foundation.
  */
 
+#include 
 #include 
 #include 
 
@@ -99,7 +100,12 @@ CPU_BE( rev w8, w8  )
ret
.endm
 
-   .align  5
+   .align  7
+   aes_table_reduced   crypto_ft_tab
+   aes_table_reduced   crypto_fl_tab
+   aes_table_reduced   crypto_it_tab
+   aes_table_reduced   crypto_il_tab
+
 ENTRY(__aes_arm64_encrypt)
do_crypt        fround, crypto_ft_tab, crypto_fl_tab
 ENDPROC(__aes_arm64_encrypt)
diff --git a/arch/x86/crypto/aes-i586-asm_32.S b/arch/x86/crypto/aes-i586-asm_32.S
index 2849dbc59e11..d68c57ca2ace 100644
--- a/arch/x86/crypto/aes-i586-asm_32.S
+++ b/arch/x86/crypto/aes-i586-asm_32.S
@@ -38,6 +38,13 @@
 
 #include 
 #include 
+#include 
+
+.align 4
+aes_table_prerotated crypto_ft_tab
+aes_table_prerotated crypto_fl_tab
+aes_table_prerotated crypto_it_tab
+aes_table_prerotated crypto_il_tab
 
 #define tlen 1024   // length of each of 4 'xor' arrays (256 32-bit words)
 
@@ -220,9 +227,6 @@
 // AES (Rijndael) Encryption Subroutine
 /* void aes_enc_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
 
-.extern  crypto_ft_tab
-.extern  crypto_fl_tab
-
 ENTRY(aes_enc_blk)
push    %ebp
mov     ctx(%esp),%ebp
@@ -292,9 +296,6 @@ ENDPROC(aes_enc_blk)
 // AES (Rijndael) Decryption Subroutine
 /* void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
 
-.extern  crypto_it_tab
-.extern  crypto_il_tab
-
 ENTRY(aes_dec_blk)
push    %ebp
mov     ctx(%esp),%ebp
diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
index 8739cf7795de..7b5a9ef3e51d 100644
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ b/arch/x86/crypto/aes-x86_64-asm_64.S
@@ -8,15 +8,17 @@
  * including this sentence is retained in full.
  */
 
-.extern crypto_ft_tab
-.extern crypto_it_tab
-.extern crypto_fl_tab
-.extern crypto_il_tab
-
 .text
 
 #include 
 #include 
+#include 
+
+.align 4
+aes_table_prerotated crypto_ft_tab
+aes_table_prerotated crypto_fl_tab
+aes_table_prerotated crypto_it_tab
+aes_table_prerotated crypto_il_tab
 
 #define R1 %rax
 #define R1E %eax
diff --git a/include/crypto/aes-tables.S b/include/crypto/aes-tables.S
new file mode 100644
index ..9625c38a76fb
--- /dev/null
+++ b/include/crypto/aes-tables.S
@@ -0,0 +1,1104 @@
+/*
+ * ---
+ * Copyright (c) 2002, Dr Brian Gladman , Worcester, UK.
+ * All rights reserved.
+ *
+ * LICENSE TERMS
+ *
+ * The free distribution and use of this software in both source and binary
+ * form is allowed (with or without changes) provided that:
+ *
+ *   1. distributions of this source code include the above copyright
+ *  notice, this list of conditions and the following disclaimer;
+ *
+ *   2. distributions in binary form include the above copyright
+ *  notice, this list of conditions and the following disclaimer
+ *  in the documentation and/or other associated materials;
+ *
+ *   3. the copyright holder's name is not used to endorse products
+ *  built using this software without specific written permission.
+ *
+ * ALTERNATIVELY, provided that this notice is retained in full, this product
+ * may be distributed under the terms of the GNU General Public License 

[PATCH v4 3/8] crypto: aes - retire table based generic AES in favor of fixed time driver

2017-07-18 Thread Ard Biesheuvel
Rework the fixed time AES code so that it can fulfil dependencies of other
drivers on the shared AES key expansion routines. This way, we can remove
the table based generic AES code altogether, and use the much smaller and
time invariant fixed time driver as the global default for systems that
don't have an architecture specific accelerated implementation of the
cipher.

Signed-off-by: Ard Biesheuvel 
---
 crypto/Kconfig |   31 +-
 crypto/Makefile|3 +-
 crypto/{aes_ti.c => aes.c} |  169 ++-
 crypto/aes_generic.c   | 1478 
 drivers/crypto/chelsio/chcr_algo.c |4 +-
 include/crypto/aes.h   |6 +
 6 files changed, 121 insertions(+), 1570 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index caa770e535a2..7766fea9c18e 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -895,35 +895,12 @@ config CRYPTO_GHASH_CLMUL_NI_INTEL
 comment "Ciphers"
 
 config CRYPTO_AES
-   tristate "AES cipher algorithms"
+   tristate "Generic AES cipher (fixed time)"
select CRYPTO_ALGAPI
help
- AES cipher algorithms (FIPS-197). AES uses the Rijndael
- algorithm.
-
- Rijndael appears to be consistently a very good performer in
- both hardware and software across a wide range of computing
- environments regardless of its use in feedback or non-feedback
- modes. Its key setup time is excellent, and its key agility is
- good. Rijndael's very low memory requirements make it very well
- suited for restricted-space environments, in which it also
- demonstrates excellent performance. Rijndael's operations are
- among the easiest to defend against power and timing attacks.
-
- The AES specifies three key sizes: 128, 192 and 256 bits
-
- See  for more information.
-
-config CRYPTO_AES_TI
-   tristate "Fixed time AES cipher"
-   select CRYPTO_ALGAPI
-   help
- This is a generic implementation of AES that attempts to eliminate
- data dependent latencies as much as possible without affecting
- performance too much. It is intended for use by the generic CCM
- and GCM drivers, and other CTR or CMAC/XCBC based modes that rely
- solely on encryption (although decryption is supported as well, but
- with a more dramatic performance hit)
+ This is a generic implementation of AES that was designed to be
+ small (in terms of code size and D-cache footprint) and time
+ invariant, with reasonable performance.
 
  Instead of using 16 lookup tables of 1 KB each, (8 for encryption and
  8 for decryption), this implementation only uses just two S-boxes of
diff --git a/crypto/Makefile b/crypto/Makefile
index d41f0331b085..6163d47b3e12 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -96,8 +96,7 @@ obj-$(CONFIG_CRYPTO_TWOFISH) += twofish_generic.o
 obj-$(CONFIG_CRYPTO_TWOFISH_COMMON) += twofish_common.o
 obj-$(CONFIG_CRYPTO_SERPENT) += serpent_generic.o
 CFLAGS_serpent_generic.o := $(call cc-option,-fsched-pressure)  # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
-obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
-obj-$(CONFIG_CRYPTO_AES_TI) += aes_ti.o
+obj-$(CONFIG_CRYPTO_AES) += aes.o
 obj-$(CONFIG_CRYPTO_CAMELLIA) += camellia_generic.o
 obj-$(CONFIG_CRYPTO_CAST_COMMON) += cast_common.o
 obj-$(CONFIG_CRYPTO_CAST5) += cast5_generic.o
diff --git a/crypto/aes_ti.c b/crypto/aes.c
similarity index 76%
rename from crypto/aes_ti.c
rename to crypto/aes.c
index 03023b2290e8..1c246274bfa3 100644
--- a/crypto/aes_ti.c
+++ b/crypto/aes.c
@@ -13,11 +13,7 @@
 #include 
 #include 
 
-/*
- * Emit the sbox as volatile const to prevent the compiler from doing
- * constant folding on sbox references involving fixed indexes.
- */
-static volatile const u8 __cacheline_aligned __aesti_sbox[] = {
+static volatile const u8 __cacheline_aligned aes_sbox[] = {
0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5,
0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0,
@@ -52,7 +48,7 @@ static volatile const u8 __cacheline_aligned __aesti_sbox[] = {
0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16,
 };
 
-static volatile const u8 __cacheline_aligned __aesti_inv_sbox[] = {
+static volatile const u8 __cacheline_aligned aes_inv_sbox[] = {
0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38,
0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87,
@@ -145,30 +141,30 @@ static u32 inv_mix_columns(u32 x)
 
 static __always_inline u32 subshift(u32 in[], int pos)
 {
-   return (__aesti_sbox[in[pos] & 0xff]) ^
-  (__aesti_sbox[(in[(pos + 1) % 4] >>  8) & 0xff] <<  8) ^
-  (__aesti_sbox[(in[(pos + 2) % 4] >> 16) & 0xff] << 16) ^
-  
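
The Kconfig text above contrasts 16 lookup tables of 1 KB each with just two 256-byte S-boxes. As a rough illustration of how little data that is, the sketch below regenerates both S-boxes from the AES definition (multiplicative inverse in GF(2^8) followed by the affine transform) instead of storing them. This is not how the patch does it (the patch keeps the tables as static arrays); `rotl8()` and `build_sboxes()` are hypothetical helper names used only here.

```c
#include <stdint.h>

/* Rotate an 8-bit value left by n bits (0 < n < 8). */
static uint8_t rotl8(uint8_t x, unsigned int n)
{
	return (uint8_t)((x << n) | (x >> (8 - n)));
}

/*
 * Fill sbox[] with the AES S-box and inv_sbox[] with its inverse.
 * p steps through all 255 non-zero field elements as powers of 3, while
 * q tracks the multiplicative inverse of p, so p * q == 1 throughout.
 */
static void build_sboxes(uint8_t sbox[256], uint8_t inv_sbox[256])
{
	uint8_t p = 1, q = 1;
	unsigned int i;

	do {
		/* Multiply p by 3 in GF(2^8), reducing by the AES polynomial. */
		p = (uint8_t)(p ^ (p << 1) ^ ((p & 0x80) ? 0x1b : 0));

		/* Divide q by 3 (i.e. multiply by 0xf6). */
		q ^= (uint8_t)(q << 1);
		q ^= (uint8_t)(q << 2);
		q ^= (uint8_t)(q << 4);
		q ^= (q & 0x80) ? 0x09 : 0;

		/* Affine transform of the inverse gives the S-box entry. */
		sbox[p] = (uint8_t)(q ^ rotl8(q, 1) ^ rotl8(q, 2) ^
				    rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63);
	} while (p != 1);

	/* Zero has no inverse and is handled as a special case. */
	sbox[0] = 0x63;

	for (i = 0; i < 256; i++)
		inv_sbox[sbox[i]] = (uint8_t)i;
}
```

The generated values can be compared against the table rows quoted in the diff, for example 0x63, 0x7c, 0x77, 0x7b at the start of aes_sbox[] and 0x52, 0x09 at the start of aes_inv_sbox[].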

[PATCH v4 6/8] crypto: arm64/aes - avoid expanded lookup tables in the final round

2017-07-18 Thread Ard Biesheuvel
For the final round, avoid the expanded and padded lookup tables
exported by the generic AES driver. Instead, for encryption, we can
perform byte loads from the same table we used for the inner rounds,
which will still be hot in the caches. For decryption, use the inverse
AES Sbox exported by the generic AES driver, which is 4x smaller than
the inverse table exported by the generic driver.

This significantly reduces the Dcache footprint of our code, and does
not introduce any additional module dependencies, given that we already
rely on the core AES module for the shared key expansion routines. It
also frees up register x18, which is not available as a scratch register
on all platforms, so avoiding it improves the shareability of this
code.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/aes-cipher-core.S | 155 ++--
 1 file changed, 108 insertions(+), 47 deletions(-)

diff --git a/arch/arm64/crypto/aes-cipher-core.S b/arch/arm64/crypto/aes-cipher-core.S
index bbe5dd96135c..fe807f164d83 100644
--- a/arch/arm64/crypto/aes-cipher-core.S
+++ b/arch/arm64/crypto/aes-cipher-core.S
@@ -18,99 +18,160 @@
out     .req    x1
in      .req    x2
rounds  .req    x3
-   tt      .req    x4
-   lt      .req    x2
+   tt      .req    x2
 
-   .macro  __pair, enc, reg0, reg1, in0, in1e, in1d, shift
+   .macro  __ubf1, reg0, reg1, in0, in1e, in1d, sz, shift
ubfx    \reg0, \in0, #\shift, #8
-   .if \enc
ubfx    \reg1, \in1e, #\shift, #8
-   .else
+   .endm
+
+   .macro  __ubf0, reg0, reg1, in0, in1e, in1d, sz, shift
+   ubfx    \reg0, \in0, #\shift, #8
ubfx    \reg1, \in1d, #\shift, #8
+   .endm
+
+   .macro  __ubf1b, reg0, reg1, in0, in1e, in1d, sz, shift
+   .if \shift == 0 && \sz > 0
+   ubfiz   \reg0, \in0, #\sz, #8
+   ubfiz   \reg1, \in1e, #\sz, #8
+   .else
+   __ubf1  \reg0, \reg1, \in0, \in1e, \in1d, \sz, \shift
+   .endif
+   .endm
+
+   .macro  __ubf0b, reg0, reg1, in0, in1e, in1d, sz, shift
+   .if \shift == 0 && \sz > 0
+   ubfiz   \reg0, \in0, #\sz, #8
+   ubfiz   \reg1, \in1d, #\sz, #8
+   .else
+   __ubf0  \reg0, \reg1, \in0, \in1e, \in1d, \sz, \shift
.endif
+   .endm
+
+   /*
+* AArch64 cannot do byte size indexed loads from a table containing
+* 32-bit quantities, i.e., 'ldrb w12, [tt, w12, uxtw #2]' is not a
+* valid instruction.
+*
+* For shift == 0, we can simply fold the size shift of the index
+* into the ubfx instruction, by switching to ubfiz and using \sz as
+* the destination offset.
+* For shift > 0, we perform a 32-bit wide load instead, which does
+* allow an index shift of 2, and discard the high bytes later using
+* uxtb or lsl #24.
+*/
+   .macro  __pair, enc, sz, op, reg0, reg1, in0, in1e, in1d, shift
+   __ubf\enc\op    \reg0, \reg1, \in0, \in1e, \in1d, \sz, \shift
+   .ifnc   \op\sz, b2
+   ldr\op  \reg0, [tt, \reg0, uxtw #\sz]
+   ldr\op  \reg1, [tt, \reg1, uxtw #\sz]
+   .elseif \shift == 0
+   ldrb    \reg0, [tt, \reg0, uxtw]
+   ldrb    \reg1, [tt, \reg1, uxtw]
+   .else
ldr \reg0, [tt, \reg0, uxtw #2]
ldr \reg1, [tt, \reg1, uxtw #2]
+   .endif
.endm
 
-   .macro  __hround, out0, out1, in0, in1, in2, in3, t0, t1, enc
+   .macro  __hround, out0, out1, in0, in1, in2, in3, t0, t1, enc, sz, op
ldp \out0, \out1, [rk], #8
 
-   __pair  \enc, w13, w14, \in0, \in1, \in3, 0
-   __pair  \enc, w15, w16, \in1, \in2, \in0, 8
-   __pair  \enc, w17, w18, \in2, \in3, \in1, 16
-   __pair  \enc, \t0, \t1, \in3, \in0, \in2, 24
-
-   eor \out0, \out0, w13
-   eor \out1, \out1, w14
-   eor \out0, \out0, w15, ror #24
-   eor \out1, \out1, w16, ror #24
-   eor \out0, \out0, w17, ror #16
-   eor \out1, \out1, w18, ror #16
-   eor \out0, \out0, \t0, ror #8
-   eor \out1, \out1, \t1, ror #8
+   __pair  \enc, \sz, \op, w12, w13, \in0, \in1, \in3, 0
+   __pair  \enc, \sz, \op, w14, w15, \in3, \in0, \in2, 24
+   __pair  \enc, \sz, \op, w16, w17, \in2, \in3, \in1, 16
+   __pair  \enc, \sz, \op, \t0, \t1, \in1, \in2, \in0, 8
+
+   eor \out0, \out0, w12
+   eor \out1, \out1, w13
+
+   .ifnc   \op\sz, b2
+   eor \out0, \out0, w14, ror #8
+   eor   

[PATCH v4 0/8] crypto: aes - retire table based generic AES

2017-07-18 Thread Ard Biesheuvel
The generic AES driver uses 16 lookup tables of 1 KB each, and has
encryption and decryption routines that are fully unrolled. Given how
the dependencies between this code and other drivers are declared in
Kconfig files, this code is always pulled into the core kernel, even
if it is usually superseded at runtime by accelerated drivers that
exist for many architectures.

This leaves us with 25 KB of dead code in the kernel, which is negligible
in typical environments, but which is actually a big deal for the IoT
domain, where every kilobyte counts.

Also, the scalar, table based AES routines that exist for ARM, arm64, i586
and x86_64 share the lookup tables with AES generic, and may be invoked
occasionally when the time-invariant AES-NI or other special instruction
drivers are called in interrupt context, at which time the SIMD register
file cannot be used. Pulling 16 KB of lookup tables and 9 KB of instructions into
the L1s (and evicting what was already there) when a softirq happens to
be handled in the context of an interrupt taken from kernel mode (which
means no SIMD on x86) is also something that we may like to avoid, by
falling back to a much smaller and moderately less performant driver.
(Note that arm64 will be updated shortly to supply fallbacks for all
SIMD based AES implementations, which will be based on the core routines)

For the reasons above, this series refactors the way the various AES
implementations are wired up, to allow the generic version in
crypto/aes_generic.c to be omitted from the build entirely.

Patch #1 removes some bogus 'select CRYPTO_AES' statements.

Patch #2 factors out aes-generic's lookup tables, which are shared with
arch-specific implementations in arch/x86, arch/arm and arch/arm64.

Patch #3 replaces the table based aes-generic.o with a new aes.o based on
the fixed time cipher, and uses it to fulfil dependencies on CRYPTO_AES.

Patch #4 switches the fallback in the AES-NI code to the new, generic encrypt
and decrypt routines so it no longer depends on the x86 scalar code or
[transitively] on AES-generic.

Patch #5 tweaks the ARM table based code to only use 2 KB + 256 bytes worth
of lookup tables instead of 4 KB.

Patch #6 does the same for arm64.

Patch #7 removes the local copy of the AES sboxes from the arm64 NEON driver,
and switches to the ones exposed by the new AES core module instead.

Patch #8 updates the Kconfig help texts to describe what the options
actually control, rather than duplicating AES's Wikipedia entry a number of
times.

v4: - remove aes-generic altogether instead of allowing a preference to be set
- factor out shared lookup tables (#2)
- reduce dependency of ARM's table based code on shared lookup tables
  (#5, #6)

v3: - fix big-endian issue in refactored fixed-time AES driver
- improve Kconfig help texts
- add patch #4

v2: - repurpose CRYPTO_AES and avoid HAVE_AES/NEED_AES Kconfig symbols
- don't factor out tables from AES generic to be reused by per arch drivers,
  since the space saving is moderate (the generic code only), and the
  drivers weren't made to be small anyway

Ard Biesheuvel (8):
  drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies
  crypto - aes: use dedicated lookup tables for table based asm routines
  crypto: aes - retire table based generic AES in favor of fixed time
driver
  crypto: x86/aes-ni - switch to generic fallback
  crypto: arm/aes - avoid expanded lookup tables in the final round
  crypto: arm64/aes - avoid expanded lookup tables in the final round
  crypto: arm64/aes-neon - reuse Sboxes from AES core module
  crypto: aes - add meaningful help text to the various AES drivers

 arch/arm/crypto/Kconfig |   16 +-
 arch/arm/crypto/aes-cipher-core.S   |   54 +-
 arch/arm64/crypto/Kconfig   |   30 +-
 arch/arm64/crypto/aes-cipher-core.S |  159 ++-
 arch/arm64/crypto/aes-neon.S|   74 +-
 arch/x86/crypto/aes-i586-asm_32.S   |   13 +-
 arch/x86/crypto/aes-x86_64-asm_64.S |   12 +-
 arch/x86/crypto/aesni-intel_glue.c  |4 +-
 crypto/Kconfig  |  138 +-
 crypto/Makefile |3 +-
 crypto/{aes_ti.c => aes.c}  |  169 ++-
 crypto/aes_generic.c| 1478 
 drivers/crypto/Kconfig  |5 -
 drivers/crypto/chelsio/chcr_algo.c  |4 +-
 include/crypto/aes-tables.S | 1104 +++
 include/crypto/aes.h|   11 +-
 16 files changed, 1464 insertions(+), 1810 deletions(-)
 rename crypto/{aes_ti.c => aes.c} (76%)
 delete mode 100644 crypto/aes_generic.c
 create mode 100644 include/crypto/aes-tables.S

-- 
2.9.3



[PATCH v4 1/8] drivers/crypto/Kconfig: drop bogus CRYPTO_AES dependencies

2017-07-18 Thread Ard Biesheuvel
In preparation for fine-tuning the dependency relations between the
accelerated AES drivers and the core support code, let's remove the
dependency declarations that are false. None of these modules have
link time dependencies on the generic AES code, nor do they declare
any AES algos with CRYPTO_ALG_NEED_FALLBACK, so they can function
perfectly fine without crypto/aes_generic.o loaded.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/Kconfig | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 5b5393f1b87a..46a48ea99fb9 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -432,7 +432,6 @@ config CRYPTO_DEV_S5P
tristate "Support for Samsung S5PV210/Exynos crypto accelerator"
depends on ARCH_S5PV210 || ARCH_EXYNOS || COMPILE_TEST
depends on HAS_IOMEM && HAS_DMA
-   select CRYPTO_AES
select CRYPTO_BLKCIPHER
help
  This option allows you to have support for S5P crypto acceleration.
@@ -486,7 +485,6 @@ config CRYPTO_DEV_ATMEL_AES
tristate "Support for Atmel AES hw accelerator"
depends on HAS_DMA
depends on ARCH_AT91 || COMPILE_TEST
-   select CRYPTO_AES
select CRYPTO_AEAD
select CRYPTO_BLKCIPHER
help
@@ -618,7 +616,6 @@ config CRYPTO_DEV_SUN4I_SS
depends on ARCH_SUNXI && !64BIT
select CRYPTO_MD5
select CRYPTO_SHA1
-   select CRYPTO_AES
select CRYPTO_DES
select CRYPTO_BLKCIPHER
help
@@ -641,7 +638,6 @@ config CRYPTO_DEV_SUN4I_SS_PRNG
 config CRYPTO_DEV_ROCKCHIP
tristate "Rockchip's Cryptographic Engine driver"
depends on OF && ARCH_ROCKCHIP
-   select CRYPTO_AES
select CRYPTO_DES
select CRYPTO_MD5
select CRYPTO_SHA1
@@ -657,7 +653,6 @@ config CRYPTO_DEV_MEDIATEK
tristate "MediaTek's EIP97 Cryptographic Engine driver"
depends on HAS_DMA
depends on (ARM && ARCH_MEDIATEK) || COMPILE_TEST
-   select CRYPTO_AES
select CRYPTO_AEAD
select CRYPTO_BLKCIPHER
select CRYPTO_CTR
-- 
2.9.3



[PATCH v2 1/3] staging: ccree: Replace kzalloc with devm_kzalloc

2017-07-18 Thread sunil . m
From: Suniel Mahesh <suni...@techveda.org>

It is recommended to use managed function devm_kzalloc, which
simplifies driver cleanup paths and driver code.
This patch does the following:
(a) replace kzalloc with devm_kzalloc.
(b) drop kfree(), because memory allocated with devm_kzalloc() is
automatically freed on driver detach; keeping the kfree() would lead to a
double free.
(c) remove unnecessary blank lines.

Signed-off-by: Suniel Mahesh <suni...@techveda.org>
---
Changes for v2:

- Changes done as suggested by Greg-KH.
- Rebased on top of next-20170718.
---
Note:

- Patch was tested and built(ARCH=arm) on next-20170718.
  No build issues reported, however it was not tested on
  real hardware.
---
 drivers/staging/ccree/ssi_driver.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/ccree/ssi_driver.c b/drivers/staging/ccree/ssi_driver.c
index d7b9a63..e918cf4 100644
--- a/drivers/staging/ccree/ssi_driver.c
+++ b/drivers/staging/ccree/ssi_driver.c
@@ -223,13 +223,15 @@ static int init_cc_resources(struct platform_device *plat_dev)
struct resource *req_mem_cc_regs = NULL;
void __iomem *cc_base = NULL;
bool irq_registered = false;
-   struct ssi_drvdata *new_drvdata = kzalloc(sizeof(struct ssi_drvdata), GFP_KERNEL);
+   struct ssi_drvdata *new_drvdata;
struct device *dev = &plat_dev->dev;
struct device_node *np = dev->of_node;
u32 signature_val;
int rc = 0;
 
-   if (unlikely(!new_drvdata)) {
+   new_drvdata = devm_kzalloc(&plat_dev->dev, sizeof(*new_drvdata),
+  GFP_KERNEL);
+   if (!new_drvdata) {
SSI_LOG_ERR("Failed to allocate drvdata");
rc = -ENOMEM;
goto init_cc_res_err;
@@ -434,10 +436,8 @@ static int init_cc_resources(struct platform_device *plat_dev)
   resource_size(new_drvdata->res_mem));
new_drvdata->res_mem = NULL;
}
-   kfree(new_drvdata);
dev_set_drvdata(&plat_dev->dev, NULL);
}
-
return rc;
 }
 
@@ -478,8 +478,6 @@ static void cleanup_cc_resources(struct platform_device *plat_dev)
drvdata->cc_base = NULL;
drvdata->res_mem = NULL;
}
-
-   kfree(drvdata);
dev_set_drvdata(&plat_dev->dev, NULL);
 }
 
-- 
1.9.1
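
To make point (b) concrete, here is a minimal userspace sketch of the idea behind managed allocations, under loud assumptions: `toy_device`, `toy_devm_zalloc()` and `toy_detach()` are hypothetical names invented for this illustration and are not the kernel's devres implementation. Each allocation is recorded on a per-device list, and detaching the device frees everything on that list. A caller that also freed one of these buffers by hand would therefore release it twice.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every managed allocation. */
struct toy_devres {
	/* Aligned so that the payload after the header is suitably aligned. */
	_Alignas(max_align_t) struct toy_devres *next;
};

/* Hypothetical device object that owns a list of managed allocations. */
struct toy_device {
	struct toy_devres *resources;
	unsigned int live;	/* allocations not yet released */
};

/* Allocate zeroed memory whose lifetime is tied to the device. */
static void *toy_devm_zalloc(struct toy_device *dev, size_t size)
{
	struct toy_devres *node = calloc(1, sizeof(*node) + size);

	if (!node)
		return NULL;

	node->next = dev->resources;
	dev->resources = node;
	dev->live++;

	/* The payload starts right after the header. */
	return node + 1;
}

/* Release everything the device owns, as happens on driver detach. */
static void toy_detach(struct toy_device *dev)
{
	while (dev->resources) {
		struct toy_devres *node = dev->resources;

		dev->resources = node->next;
		free(node);
		dev->live--;
	}
}
```

With this shape, error paths and the remove callback need no explicit free for the managed buffers, which is why the kfree() calls in the diff above can simply be dropped.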



[PATCH v2 2/3] staging: ccree: Convert to devm_ioremap_resource for map, unmap

2017-07-18 Thread sunil . m
From: Suniel Mahesh <suni...@techveda.org>

It is recommended to use managed function devm_ioremap_resource(),
which simplifies driver cleanup paths and driver code.
This patch does the following:
(a) replace request_mem_region(), ioremap() and corresponding error
handling with devm_ioremap_resource().
(b) remove struct resource pointer(res_mem) in struct ssi_drvdata as it
seems redundant, use struct resource pointer which is defined locally and
adjust return value of platform_get_resource() accordingly.
(c) release_mem_region() and iounmap() are dropped, since devm_ioremap_
resource() releases and unmaps mem region on driver detach.
(d) adjust log messages accordingly and remove any blank lines.

Signed-off-by: Suniel Mahesh <suni...@techveda.org>
---
Changes for v2:

- format specifiers changed in log messages.
- Rebased on top of next-20170718.
---
Note:
- Patch was tested and built(ARCH=arm) on next-20170718.
  No build issues reported, however it was not tested on
  real hardware.
---
 drivers/staging/ccree/ssi_driver.c | 60 ++
 drivers/staging/ccree/ssi_driver.h |  1 -
 2 files changed, 15 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/ccree/ssi_driver.c b/drivers/staging/ccree/ssi_driver.c
index e918cf4..36b7c92 100644
--- a/drivers/staging/ccree/ssi_driver.c
+++ b/drivers/staging/ccree/ssi_driver.c
@@ -246,35 +246,21 @@ static int init_cc_resources(struct platform_device *plat_dev)
dev_set_drvdata(&plat_dev->dev, new_drvdata);
/* Get device resources */
/* First CC registers space */
-   new_drvdata->res_mem = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
-   if (unlikely(!new_drvdata->res_mem)) {
-   SSI_LOG_ERR("Failed getting IO memory resource\n");
-   rc = -ENODEV;
-   goto init_cc_res_err;
-   }
-   SSI_LOG_DEBUG("Got MEM resource (%s): start=%pad end=%pad\n",
- new_drvdata->res_mem->name,
- new_drvdata->res_mem->start,
- new_drvdata->res_mem->end);
+   req_mem_cc_regs = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
/* Map registers space */
-   req_mem_cc_regs = request_mem_region(new_drvdata->res_mem->start, resource_size(new_drvdata->res_mem), "arm_cc7x_regs");
-   if (unlikely(!req_mem_cc_regs)) {
-   SSI_LOG_ERR("Couldn't allocate registers memory region at "
-"0x%08X\n", (unsigned int)new_drvdata->res_mem->start);
-   rc = -EBUSY;
-   goto init_cc_res_err;
-   }
-   cc_base = ioremap(new_drvdata->res_mem->start, resource_size(new_drvdata->res_mem));
-   if (unlikely(!cc_base)) {
-   SSI_LOG_ERR("ioremap[CC](0x%08X,0x%08X) failed\n",
-   (unsigned int)new_drvdata->res_mem->start,
-   (unsigned int)resource_size(new_drvdata->res_mem));
-   rc = -ENOMEM;
+   new_drvdata->cc_base = devm_ioremap_resource(&plat_dev->dev,
+req_mem_cc_regs);
+   if (IS_ERR(new_drvdata->cc_base)) {
+   rc = PTR_ERR(new_drvdata->cc_base);
goto init_cc_res_err;
}
-   SSI_LOG_DEBUG("CC registers mapped from %pa to 0x%p\n", &new_drvdata->res_mem->start, cc_base);
-   new_drvdata->cc_base = cc_base;
-
+   SSI_LOG_DEBUG("Got MEM resource (%s): start=%pad end=%pad\n",
+ req_mem_cc_regs->name,
+ req_mem_cc_regs->start,
+ req_mem_cc_regs->end);
+   SSI_LOG_DEBUG("CC registers mapped from %pa to 0x%p\n",
+ &req_mem_cc_regs->start, new_drvdata->cc_base);
+   cc_base = new_drvdata->cc_base;
/* Then IRQ */
new_drvdata->res_irq = platform_get_resource(plat_dev, IORESOURCE_IRQ, 0);
if (unlikely(!new_drvdata->res_irq)) {
@@ -424,17 +410,9 @@ static int init_cc_resources(struct platform_device *plat_dev)
 #ifdef ENABLE_CC_SYSFS
ssi_sysfs_fini();
 #endif
-
-   if (req_mem_cc_regs) {
-   if (irq_registered) {
-   free_irq(new_drvdata->res_irq->start, new_drvdata);
-   new_drvdata->res_irq = NULL;
-   iounmap(cc_base);
-   new_drvdata->cc_base = NULL;
-   }
-   release_mem_region(new_drvdata->res_mem->start,
-  resource_size(new_drvdata->res_mem));
-   new_drvdata->res_mem = NULL;
+   if (irq_registered) {
+   free_irq(new_drvdata->res_irq->st

[PATCH v2 3/3] staging: ccree: Use platform_get_irq and devm_request_irq

2017-07-18 Thread sunil . m
From: Suniel Mahesh <suni...@techveda.org>

It is recommended to use managed function devm_request_irq(),
which simplifies driver cleanup paths and driver code.
This patch does the following:
(a) replace platform_get_resource(), request_irq() and corresponding
error handling with platform_get_irq() and devm_request_irq().
(b) remove struct resource pointer(res_irq) in struct ssi_drvdata as
it seems redundant.
(c) change type of member irq in struct ssi_drvdata from unsigned int
to int, as return type of platform_get_irq is int and can be used in
error handling.
(d) remove irq_registered variable from driver probe as it seems
redundant.
(e) free_irq() is no longer required, since devm_request_irq() frees the
IRQ on driver detach.
(f) adjust log messages accordingly and remove any blank lines.

Signed-off-by: Suniel Mahesh <suni...@techveda.org>
---
Changes for v2:

- Rebased on top of next-20170718.
---
Note:
- Patch was tested and built(ARCH=arm) on next-20170718.
  No build issues reported, however it was not tested on
  real hardware.
---
 drivers/staging/ccree/ssi_driver.c | 30 +-
 drivers/staging/ccree/ssi_driver.h |  3 +--
 2 files changed, 10 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/ccree/ssi_driver.c b/drivers/staging/ccree/ssi_driver.c
index 36b7c92..bac27d4 100644
--- a/drivers/staging/ccree/ssi_driver.c
+++ b/drivers/staging/ccree/ssi_driver.c
@@ -222,7 +222,6 @@ static int init_cc_resources(struct platform_device *plat_dev)
 {
struct resource *req_mem_cc_regs = NULL;
void __iomem *cc_base = NULL;
-   bool irq_registered = false;
struct ssi_drvdata *new_drvdata;
struct device *dev = &plat_dev->dev;
struct device_node *np = dev->of_node;
@@ -262,26 +261,22 @@ static int init_cc_resources(struct platform_device *plat_dev)
  &req_mem_cc_regs->start, new_drvdata->cc_base);
cc_base = new_drvdata->cc_base;
/* Then IRQ */
-   new_drvdata->res_irq = platform_get_resource(plat_dev, IORESOURCE_IRQ, 0);
-   if (unlikely(!new_drvdata->res_irq)) {
+   new_drvdata->irq = platform_get_irq(plat_dev, 0);
+   if (new_drvdata->irq < 0) {
SSI_LOG_ERR("Failed getting IRQ resource\n");
-   rc = -ENODEV;
+   rc = new_drvdata->irq;
goto init_cc_res_err;
}
-   rc = request_irq(new_drvdata->res_irq->start, cc_isr,
-IRQF_SHARED, "arm_cc7x", new_drvdata);
-   if (unlikely(rc != 0)) {
-   SSI_LOG_ERR("Could not register to interrupt %llu\n",
-   (unsigned long long)new_drvdata->res_irq->start);
+   rc = devm_request_irq(&plat_dev->dev, new_drvdata->irq, cc_isr,
+ IRQF_SHARED, "arm_cc7x", new_drvdata);
+   if (rc) {
+   SSI_LOG_ERR("Could not register to interrupt %d\n",
+   new_drvdata->irq);
goto init_cc_res_err;
}
init_completion(&new_drvdata->icache_setup_completion);
 
-   irq_registered = true;
-   SSI_LOG_DEBUG("Registered to IRQ (%s) %llu\n",
- new_drvdata->res_irq->name,
- (unsigned long long)new_drvdata->res_irq->start);
-
+   SSI_LOG_DEBUG("Registered to IRQ: %d\n", new_drvdata->irq);
new_drvdata->plat_dev = plat_dev;
 
rc = cc_clk_on(new_drvdata);
@@ -410,10 +405,6 @@ static int init_cc_resources(struct platform_device *plat_dev)
 #ifdef ENABLE_CC_SYSFS
ssi_sysfs_fini();
 #endif
-   if (irq_registered) {
-   free_irq(new_drvdata->res_irq->start, new_drvdata);
-   new_drvdata->res_irq = NULL;
-   }
dev_set_drvdata(&plat_dev->dev, NULL);
}
return rc;
@@ -443,11 +434,8 @@ static void cleanup_cc_resources(struct platform_device *plat_dev)
 #ifdef ENABLE_CC_SYSFS
ssi_sysfs_fini();
 #endif
-
fini_cc_regs(drvdata);
cc_clk_off(drvdata);
-   free_irq(drvdata->res_irq->start, drvdata);
-   drvdata->res_irq = NULL;
dev_set_drvdata(&plat_dev->dev, NULL);
 }
 
diff --git a/drivers/staging/ccree/ssi_driver.h b/drivers/staging/ccree/ssi_driver.h
index 518c0bf..88ef370 100644
--- a/drivers/staging/ccree/ssi_driver.h
+++ b/drivers/staging/ccree/ssi_driver.h
@@ -128,9 +128,8 @@ struct ssi_crypto_req {
  * @fw_ver:SeP loaded firmware version
  */
 struct ssi_drvdata {
-   struct resource *res_irq;
void __iomem *cc_base;
-   unsigned int irq;
+   int irq;
u32 irq_mask;
u32 fw_ver;
/* Calibration time of start/stop
-- 
1.9.1
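
Point (c) of the commit message, changing `irq` from unsigned int to int, can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `toy_get_irq()` is a hypothetical stand-in for platform_get_irq() that returns either an IRQ number or a negative errno (-ENXIO is used as an example), and the two structs mimic only the `irq` field of struct ssi_drvdata before and after the patch.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for platform_get_irq(): IRQ number or negative errno. */
static int toy_get_irq(int present)
{
	return present ? 42 : -ENXIO;
}

/* Field type before the patch. */
struct old_drvdata {
	unsigned int irq;
};

/* Field type after the patch. */
struct new_drvdata {
	int irq;
};

/* With a signed field the error survives and can be passed on as rc. */
static int probe_signed(struct new_drvdata *d, int present)
{
	d->irq = toy_get_irq(present);
	if (d->irq < 0)
		return d->irq;
	return 0;
}

/*
 * With an unsigned field the negative errno wraps around to a very large
 * value that looks like a valid IRQ number, so a "< 0" check can never fire.
 */
static unsigned int store_unsigned(struct old_drvdata *d, int present)
{
	d->irq = (unsigned int)toy_get_irq(present);
	return d->irq;
}
```

This is why the error check `new_drvdata->irq < 0` in the diff above only makes sense once the field is a plain int.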



Re: [PATCH] crypto: sahara : make of_device_ids const.

2017-07-18 Thread Herbert Xu
On Tue, Jun 27, 2017 at 05:11:23PM +0530, Arvind Yadav wrote:
> of_device_ids are not supposed to change at runtime. All functions
> working with of_device_ids provided by  work with const
> of_device_ids. So mark the non-const structs as const.
> 
> File size before:
>    text    data     bss     dec     hex filename
>    9759    2736       8   12503    30d7 drivers/crypto/sahara.o
> 
> File size after constify:
>    text    data     bss     dec     hex filename
>   10367    2128       8   12503    30d7 drivers/crypto/sahara.o
> 
> Signed-off-by: Arvind Yadav 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: ccp - Fix some line spacing

2017-07-18 Thread Herbert Xu
On Tue, Jun 27, 2017 at 08:58:04AM -0500, Gary R Hook wrote:
> Add/remove blank lines as appropriate.
> 
> Signed-off-by: Gary R Hook 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: qat: fix spelling mistake: "runing" -> "running"

2017-07-18 Thread Herbert Xu
On Mon, Jun 26, 2017 at 08:41:03PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> trivial fix to spelling mistake in dev_info message
> 
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: ccp - Change all references to use the JOB ID macro

2017-07-18 Thread Herbert Xu
On Tue, Jun 27, 2017 at 08:58:16AM -0500, Gary R Hook wrote:
> Use the CCP_NEW_JOBID() macro when assigning an identifier
> 
> Signed-off-by: Gary R Hook 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: ccp-platform: print error message on platform_get_irq failure

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 12:59:52AM -0500, Gustavo A. R. Silva wrote:
> Print error message on platform_get_irq failure before return.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 0/2] crypto hwrng consider quality value, remember user choice

2017-07-18 Thread Herbert Xu
On Tue, Jul 11, 2017 at 09:36:21AM +0200, Harald Freudenberger wrote:
> The hwrng core implementation currently doesn't consider the
> quality field of the struct hwrng. So the first registered rng
> is the winner and further rng sources even with much better
> quality are ignored.
> 
> The rng with the highest quality value should always be used as the
> current rng source. This automatic choice should be suppressed only
> if the user explicitly selects an rng source (by writing an rng name
> to /sys/class/misc/hw_random/rng_current).
> 
> These two patches make hwrng always hold a list of registered rng
> sources sorted by decreasing quality. When a new hwrng source is
> registered, the list is updated; if the current rng source was not
> chosen by the user and the new rng provides better quality, the new
> rng becomes the current source. Similarly, when the currently used
> rng is unregistered, the one with the next-highest quality takes
> over. An rng source selected from userland via sysfs stays current
> until it unregisters, regardless of any 'better' rng sources that
> are registered in the meantime.
> 
> Patch 1 introduces the sorted list of registered rngs and the
> always use the best quality behavior.
> 
> Patch 2 makes hwrng remember that the user has selected an
> rng via echo to /sys/class/misc/hw_random/rng_current and
> adds a new sysfs attribute file 'rng_selected' to the rng core.
> 
> Harald Freudenberger (2):
>   crypto: hwrng use rng source with best quality
>   crypto: hwrng remember rng chosen by user

All patches applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
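
The selection policy described in the cover letter quoted above can be
sketched in user-space C. This is only an illustration under stated
assumptions, not the code from the applied patches: struct rng_src,
struct rng_core and the rng_*() helpers are hypothetical stand-ins for
struct hwrng and the hwrng core, and only the quality field and the
"chosen by user" flag are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hwrng: just a name and a quality. */
struct rng_src {
	const char *name;
	unsigned short quality;		/* higher is better */
	struct rng_src *next;
};

/* Hypothetical stand-in for the hwrng core state. */
struct rng_core {
	struct rng_src *list;		/* sorted by decreasing quality */
	struct rng_src *current;	/* source currently in use */
	int user_selected;		/* non-zero if chosen via sysfs */
};

/*
 * Insert the new source so the list stays sorted by decreasing quality.
 * Switch to it only if nothing is in use yet, or if the user has not
 * pinned a source and the new one is strictly better.
 */
static void rng_register(struct rng_core *core, struct rng_src *rng)
{
	struct rng_src **pos = &core->list;

	assert(core && rng);
	while (*pos && (*pos)->quality >= rng->quality)
		pos = &(*pos)->next;
	rng->next = *pos;
	*pos = rng;

	if (!core->current ||
	    (!core->user_selected && rng->quality > core->current->quality))
		core->current = rng;
}

/*
 * Remove a source. If it was the one in use, fall back to the list
 * head (the best remaining source) and drop the user's pin.
 */
static void rng_unregister(struct rng_core *core, struct rng_src *rng)
{
	struct rng_src **pos = &core->list;

	assert(core && rng);
	while (*pos && *pos != rng)
		pos = &(*pos)->next;
	if (!*pos)
		return;			/* not registered */
	*pos = rng->next;
	rng->next = NULL;

	if (core->current == rng) {
		core->current = core->list;
		core->user_selected = 0;
	}
}

/* Models a write to rng_current: pin the chosen source. */
static void rng_user_select(struct rng_core *core, struct rng_src *rng)
{
	assert(core && rng);
	core->current = rng;
	core->user_selected = 1;
}
```

Keeping the list sorted at registration time means the fallback on
unregistration is simply the list head, at the cost of a linear walk on
each registration.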


Re: [PATCH] crypto: brcm - remove BCM_PDC_MBOX dependency in Kconfig

2017-07-18 Thread Herbert Xu
On Tue, Jul 11, 2017 at 03:50:06PM +0530, Raveendra Padasalagi wrote:
> SPU driver is dependent on generic MAILBOX API's to
> communicate with underlying DMA engine driver.
> 
> So this patch removes BCM_PDC_MBOX "depends on" for SPU driver
> in Kconfig and adds MAILBOX as dependent module.
> 
> Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver")
> Signed-off-by: Raveendra Padasalagi 
> Reviewed-by: Ray Jui 
> Reviewed-by: Scott Branden 
> Cc: sta...@vger.kernel.org

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: mediatek: fix error return code in mtk_crypto_probe()

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 01:24:54AM -0500, Gustavo A. R. Silva wrote:
> Propagate the return value of platform_get_irq on failure.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: mxs-dcp: print error message on platform_get_irq failure

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 01:54:16AM -0500, Gustavo A. R. Silva wrote:
> Print error message on platform_get_irq failure before return.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] drivers: crypto: geode-aes: fixed coding style warnings and error

2017-07-18 Thread Herbert Xu
On Thu, Jul 06, 2017 at 02:44:56PM -0400, Chris Gorman wrote:
> fixed WARNING: Block comments should align the * on each line
> fixed WARNINGs: Missing a blank line after declarations
> fixed ERROR: space prohibited before that ',' (ctx:WxE)
> 
> Signed-off-by: Chris Gorman 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: omap-des: fix error return code in omap_des_probe()

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 02:07:04AM -0500, Gustavo A. R. Silva wrote:
> Print and propagate the return value of platform_get_irq on failure.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 00/13] crypto: caam - fixes, clean-up

2017-07-18 Thread Herbert Xu
On Mon, Jul 10, 2017 at 08:40:26AM +0300, Horia Geantă wrote:
> [
> Change log:
> v1 -> v2
> -patch 05/13 - add missing check in ablkcipher_giv_edesc_alloc(),
> to make sure number of reserved S/G entries is not overflown
> -patch 12/13 - fix author - replace my Freescale address with
> corresponding NXP one
> ]
> 
> Hi,
> 
> Current patch set consists of:
> 
> Patches 1-4 fix some issues in caam/qi driver;
> they should be sent to -stable.

While they should go to stable eventually, I think as the bug has
been around for so long we might as well wait for another cycle.

> Patches 5-7 also fix some problems in caam/qi driver, however these are
> ARM-specific. Considering that caam/qi does not have support for ARM in
> kernel v4.12 (lacking one dependency - Queue Manager), there's no need
> to be applied on v4.12.y.
> 
> Patches 8-13 contain code clean-up.

All patches applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: omap-aes: fix error return code in omap_aes_probe()

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 02:00:54AM -0500, Gustavo A. R. Silva wrote:
> Propagate the return value of platform_get_irq on failure.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: brcm: add NULL check on of_match_device() return value

2017-07-18 Thread Herbert Xu
On Fri, Jul 07, 2017 at 01:33:33AM -0500, Gustavo A. R. Silva wrote:
> Check return value from call to of_match_device()
> in order to prevent a NULL pointer dereference.
> 
> In case of NULL print error message and return -ENODEV
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH 0/3] crypto: introduce Microchip / Atmel ECC driver

2017-07-18 Thread Herbert Xu
On Wed, Jul 05, 2017 at 01:07:57PM +0300, Tudor Ambarus wrote:
> Hi,
> 
> This patch set introduces Microchip / Atmel ECC driver.
> 
> The first patch adds some helpers that will be used by fallbacks to
> kpp software implementations.
> 
> The second patch adds ECDH support for the ATECC508A (I2C)
> cryptographic engine. The I2C interface is designed to operate
> at a maximum clock speed of 1MHz.
> 
> The device features hardware acceleration for the NIST standard
> P256 prime curve and supports the complete key life cycle from
> private key generation to ECDH key agreement.
> 
> Random private key generation is supported internally within
> the device to ensure that the private key can never be known
> outside of the device. If the user wants to use its own private
> keys, the driver will fallback to the ecdh software implementation.
> 
> Tudor Ambarus (3):
>   crypto: kpp: add get/set_flags helpers
>   crypto: introduce Microchip / Atmel ECC driver
>   MAINTAINERS: add a maintainer for Microchip / Atmel ECC driver

All patches applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v5] crypto: sun4i-ss: support the Security System PRNG

2017-07-18 Thread Herbert Xu
On Mon, Jul 03, 2017 at 08:48:48PM +0200, Corentin Labbe wrote:
> The Security System has a PRNG, this patch adds support for it via
> crypto_rng.
> 
> Signed-off-by: Corentin Labbe 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v4 0/5] Introduce AMD Secure Processor device

2017-07-18 Thread Herbert Xu
On Thu, Jul 06, 2017 at 09:59:12AM -0500, Brijesh Singh wrote:
> CCP device (drivers/crypto/ccp/ccp.ko) is part of AMD Secure Processor,
> which is not dedicated solely to crypto. The AMD Secure Processor includes
> CCP and PSP (Platform Secure Processor) devices.
> 
> This patch series adds a framework that allows functional components of the
> AMD Secure Processor to be initialized and handled appropriately. The series
> does not make any logic modifications to CCP - it refactors the code to
> integrate CCP into the AMD Secure Processor framework.
> 
> ---
> 
> Changes since v3:
>  - guard sp_dev_resume and sp_dev_suspend with CONFIG_PM
>  - update Kconfig description for AMD SP device
> 
> Changes since v2:
>  - move the ccp->io_regs initialization before device setup().
>  - maintain the original Kconfig hierarchy
>  - rename ccp-{pci,platform}.c -> sp-{pci,platform}.c
>  - do not fail the module_init() when ccp device is not found
> 
> Changes since v1:
>  - remove unused function [sp_get_device()]
> 
> Brijesh Singh (5):
>   crypto: ccp - Use devres interface to allocate PCI/iomap and cleanup
>   crypto: ccp - Introduce the AMD Secure Processor device
>   crypto: cpp - Abstract interrupt registeration
>   crypto: ccp - rename ccp driver initialize files as sp device
>   crypto: ccp - remove ccp_present() check from device initialize

All patches applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v3] crypto: ccp - Provide an error path for debugfs setup failure

2017-07-18 Thread Herbert Xu
On Wed, Jun 28, 2017 at 11:56:47AM -0500, Gary R Hook wrote:
> Changes since v2:
>   - On failure remove only the DebugFS hierarchy for this device
> Changes since v1:
>   - Remove unneeded local variable
> 
> Signed-off-by: Gary R Hook 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: mxc-scc: fix error code in mxc_scc_probe()

2017-07-18 Thread Herbert Xu
On Fri, Jun 30, 2017 at 01:42:12AM -0500, Gustavo A. R. Silva wrote:
> Print and propagate the return value of platform_get_irq on failure.
> 
> Signed-off-by: Gustavo A. R. Silva 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: virtio - Refacotor virtio_crypto driver for new virito crypto services

2017-07-18 Thread Herbert Xu
On Fri, Jun 23, 2017 at 11:31:19AM -0400, Xin Zeng wrote:
> In the current virtio crypto device driver, some common data structures
> and implementations that should be usable by other virtio crypto
> algorithms (e.g. asymmetric crypto algorithms) contain code that is
> specific to symmetric crypto algorithms.
> This patch refactors these pieces of code so that they can be reused by
> other virtio crypto algorithms.
> 
> Acked-by: Gonglei 
> Signed-off-by: Xin Zeng 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] Documentation/bindings: crypto: remove the dma-mask property

2017-07-18 Thread Herbert Xu
On Fri, Jun 23, 2017 at 04:52:18PM +0200, Antoine Tenart wrote:
> The dma-mask property is broken and was removed in the device trees
> having a safexcel-eip197 node and in the safexcel cryptographic
> driver. This patch removes the dma-mask property from the documentation
> as well.
> 
> Signed-off-by: Antoine Tenart 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: inside-secure - do not parse the dma mask from dt

2017-07-18 Thread Herbert Xu
On Fri, Jun 23, 2017 at 04:05:25PM +0200, Antoine Tenart wrote:
> Remove the dma mask parsing from dt as this should not be encoded into
> the engine device tree node. Keep the fallback value for now, which
> should work for the boards already supported upstream.
> 
> Signed-off-by: Antoine Tenart 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v1] crypto: brcm - Fix SHA3-512 algorithm failure

2017-07-18 Thread Herbert Xu
Raveendra Padasalagi  wrote:
> In Broadcom SPU driver, due to missing break statement
> in spu2_hash_xlate() while mapping SPU2 equivalent
> SHA3-512 value, -EINVAL is chosen and hence leading to
> failure of SHA3-512 algorithm. This patch fixes the same.
> 
> Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver")
> Signed-off-by: Raveendra Padasalagi 
> Reviewed-by: Ray Jui 
> Reviewed-by: Scott Branden 
> Cc: sta...@vger.kernel.org

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
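
The failure mode described in the quoted commit message, a missing
break that lets a valid case fall into the error path, can be
reproduced with a small sketch. The enum, the mapped values and the two
function names below are hypothetical; this is not the code of
spu2_hash_xlate(), only the same shape of bug.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical algorithm identifiers, for illustration only. */
enum hash_alg { HASH_SHA3_384, HASH_SHA3_512, HASH_UNKNOWN };

/* Buggy variant: the SHA3-512 case lacks a break. */
static int xlate_buggy(enum hash_alg alg, int *out)
{
	int err = 0;

	assert(out);
	switch (alg) {
	case HASH_SHA3_384:
		*out = 1;
		break;
	case HASH_SHA3_512:
		*out = 2;
		/* fall through - the missing break is the bug */
	default:
		err = -EINVAL;
		break;
	}
	return err;
}

/* Fixed variant: each valid case ends with its own break. */
static int xlate_fixed(enum hash_alg alg, int *out)
{
	int err = 0;

	assert(out);
	switch (alg) {
	case HASH_SHA3_384:
		*out = 1;
		break;
	case HASH_SHA3_512:
		*out = 2;
		break;
	default:
		err = -EINVAL;
		break;
	}
	return err;
}
```

In the buggy variant the output value is written correctly, yet the
caller still sees -EINVAL, which matches the symptom described: the
algorithm is mapped but reported as unsupported.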


Re: [PATCH][crypto-next] crypto: cavium/nitrox - Change in firmware path.

2017-07-18 Thread Herbert Xu
Srikanth Jampala  wrote:
> Moved the firmware to "cavium" subdirectory as suggested by
> Kyle McMartin.
> 
> Signed-off-by: Srikanth Jampala 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 2/2] crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 10:49, Herbert Xu  wrote:
> On Wed, Jul 05, 2017 at 12:43:19AM +0100, Ard Biesheuvel wrote:
>> Implement a NEON fallback for systems that do support NEON but have
>> no support for the optional 64x64->128 polynomial multiplication
>> instruction that is part of the ARMv8 Crypto Extensions. It is based
>> on the paper "Fast Software Polynomial Multiplication on ARM Processors
>> Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
>> Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked
>> extensively for the AArch64 ISA.
>>
>> On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the
>> NEON based implementation is 4x faster than the table based one, and
>> is time invariant as well, making it less vulnerable to timing attacks.
>> When combined with the bit-sliced NEON implementation of AES-CTR, the
>> AES-GCM performance increases by ~2x (from 58 to 30 cycles per byte).
>>
>> Signed-off-by: Ard Biesheuvel 
>
> This patch does not apply against cryptodev.
>

Yeah, it implements a non-SIMD fallback which depends on the AES
refactor series.


Re: [PATCH v2 2/2] crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL

2017-07-18 Thread Herbert Xu
On Wed, Jul 05, 2017 at 12:43:19AM +0100, Ard Biesheuvel wrote:
> Implement a NEON fallback for systems that do support NEON but have
> no support for the optional 64x64->128 polynomial multiplication
> instruction that is part of the ARMv8 Crypto Extensions. It is based
> on the paper "Fast Software Polynomial Multiplication on ARM Processors
> Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and
> Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked
> extensively for the AArch64 ISA.
> 
> On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the
> NEON based implementation is 4x faster than the table based one, and
> is time invariant as well, making it less vulnerable to timing attacks.
> When combined with the bit-sliced NEON implementation of AES-CTR, the
> AES-GCM performance increases by ~2x (from 58 to 30 cycles per byte).
> 
> Signed-off-by: Ard Biesheuvel 

This patch does not apply against cryptodev.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
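
For readers unfamiliar with the 64x64->128 polynomial multiplication
that the quoted patch provides a fallback for, here is a portable,
bit-by-bit sketch of the operation in plain C. It is not the NEON
algorithm from the paper cited above and is far slower; struct u128 and
clmul64() are hypothetical names used only to show what the instruction
computes.

```c
#include <stdint.h>

/* 128-bit result split into two 64-bit halves. */
struct u128 {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Carry-less multiplication of two 64-bit polynomials over GF(2):
 * for every set bit i of b, XOR (a << i) into the 128-bit result.
 * A mask is used instead of a branch on the bits of b; the only
 * branch depends on the loop counter, not on the operands.
 */
static struct u128 clmul64(uint64_t a, uint64_t b)
{
	struct u128 r = { 0, 0 };
	unsigned int i;

	for (i = 0; i < 64; i++) {
		/* all ones if bit i of b is set, all zeroes otherwise */
		uint64_t mask = -((b >> i) & 1);

		r.lo ^= (a << i) & mask;
		if (i)	/* a >> 64 would be undefined, so skip i == 0 */
			r.hi ^= (a >> (64 - i)) & mask;
	}
	return r;
}
```

As a sanity check, squaring a polynomial over GF(2) spreads its bits to
the even positions, so clmul64(3, 3) yields 5, i.e. (x + 1)^2 = x^2 + 1.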


[PATCH v2 2/2] crypto/algapi - make crypto_xor() take separate dst and src arguments

2017-07-18 Thread Ard Biesheuvel
There are quite a number of occurrences in the kernel of the pattern

  if (dst != src)
memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
  crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);

or

  crypto_xor(keystream, src, nbytes);
  memcpy(dst, keystream, nbytes);

where crypto_xor() is preceded or followed by a memcpy() invocation
that is only there because crypto_xor() uses its output parameter as
one of the inputs. To avoid having to add new instances of this pattern
in the arm64 code, which will be refactored to implement non-SIMD
fallbacks, add an alternative implementation called crypto_xor_cpy(),
taking separate input and output arguments. This removes the need for
the separate memcpy().

Signed-off-by: Ard Biesheuvel 
---
 arch/arm/crypto/aes-ce-glue.c   |  4 +---
 arch/arm/crypto/aes-neonbs-glue.c   |  5 ++---
 arch/arm64/crypto/aes-glue.c|  4 +---
 arch/arm64/crypto/aes-neonbs-glue.c |  5 ++---
 arch/sparc/crypto/aes_glue.c|  3 +--
 arch/x86/crypto/aesni-intel_glue.c  |  4 ++--
 arch/x86/crypto/blowfish_glue.c |  3 +--
 arch/x86/crypto/cast5_avx_glue.c|  3 +--
 arch/x86/crypto/des3_ede_glue.c |  3 +--
 crypto/ctr.c|  3 +--
 crypto/pcbc.c   | 12 
 drivers/crypto/vmx/aes_ctr.c|  3 +--
 drivers/md/dm-crypt.c   | 11 +--
 include/crypto/algapi.h | 19 +++
 14 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index 0f966a8ca1ce..d0a9cec73707 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -285,9 +285,7 @@ static int ctr_encrypt(struct skcipher_request *req)
 
ce_aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc,
   num_rounds(ctx), blocks, walk.iv);
-   if (tdst != tsrc)
-   memcpy(tdst, tsrc, nbytes);
-   crypto_xor(tdst, tail, nbytes);
+   crypto_xor_cpy(tdst, tsrc, tail, nbytes);
err = skcipher_walk_done(&walk, 0);
}
kernel_neon_end();
diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index c76377961444..18768f330449 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -221,9 +221,8 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
 
-   if (dst != src)
-   memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
-   crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
+   crypto_xor_cpy(dst, src, final,
+  walk.total % AES_BLOCK_SIZE);
 
err = skcipher_walk_done(&walk, 0);
break;
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index bcf596b0197e..0da30e3b0e4b 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -241,9 +241,7 @@ static int ctr_encrypt(struct skcipher_request *req)
 
aes_ctr_encrypt(tail, NULL, (u8 *)ctx->key_enc, rounds,
blocks, walk.iv, first);
-   if (tdst != tsrc)
-   memcpy(tdst, tsrc, nbytes);
-   crypto_xor(tdst, tail, nbytes);
+   crypto_xor_cpy(tdst, tsrc, tail, nbytes);
err = skcipher_walk_done(&walk, 0);
}
kernel_neon_end();
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index db2501d93550..9001aec16007 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -224,9 +224,8 @@ static int ctr_encrypt(struct skcipher_request *req)
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
 
-   if (dst != src)
-   memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
-   crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
+   crypto_xor_cpy(dst, src, final,
+  walk.total % AES_BLOCK_SIZE);
 
err = skcipher_walk_done(&walk, 0);
break;
diff --git a/arch/sparc/crypto/aes_glue.c b/arch/sparc/crypto/aes_glue.c
index c90930de76ba..3cd4f6b198b6 100644
--- a/arch/sparc/crypto/aes_glue.c
+++ b/arch/sparc/crypto/aes_glue.c
@@ -344,8 +344,7 @@ static void ctr_crypt_final(struct crypto_sparc64_aes_ctx *ctx,
 
ctx->ops->ecb_encrypt(&ctx->key[0], (const u64 *)ctrblk,
  keystream, AES_BLOCK_SIZE);
-   crypto_xor((u8 *) keystream, src, nbytes);
-  

[PATCH v2 1/2] crypto/algapi - use separate dst and src operands for __crypto_xor()

2017-07-18 Thread Ard Biesheuvel
In preparation for introducing crypto_xor_cpy(), which will use separate
operands for input and output, modify the __crypto_xor() implementation
to take separate source and destination operands. __crypto_xor() provides
the actual functionality when the inline version is not used, and it will
be shared by crypto_xor_cpy() and the existing crypto_xor().

Signed-off-by: Ard Biesheuvel 
---
 crypto/algapi.c | 25 
 include/crypto/algapi.h |  4 ++--
 2 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index e4cc7615a139..aa699ff6c876 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -975,13 +975,15 @@ void crypto_inc(u8 *a, unsigned int size)
 }
 EXPORT_SYMBOL_GPL(crypto_inc);
 
-void __crypto_xor(u8 *dst, const u8 *src, unsigned int len)
+void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
 {
int relalign = 0;
 
if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
int size = sizeof(unsigned long);
-   int d = ((unsigned long)dst ^ (unsigned long)src) & (size - 1);
+   int d = (((unsigned long)dst ^ (unsigned long)src1) |
+((unsigned long)dst ^ (unsigned long)src2)) &
+   (size - 1);
 
relalign = d ? 1 << __ffs(d) : size;
 
@@ -992,34 +994,37 @@ void __crypto_xor(u8 *dst, const u8 *src, unsigned int len)
 * process the remainder of the input using optimal strides.
 */
while (((unsigned long)dst & (relalign - 1)) && len > 0) {
-   *dst++ ^= *src++;
+   *dst++ = *src1++ ^ *src2++;
len--;
}
}
 
while (IS_ENABLED(CONFIG_64BIT) && len >= 8 && !(relalign & 7)) {
-   *(u64 *)dst ^= *(u64 *)src;
+   *(u64 *)dst = *(u64 *)src1 ^  *(u64 *)src2;
dst += 8;
-   src += 8;
+   src1 += 8;
+   src2 += 8;
len -= 8;
}
 
while (len >= 4 && !(relalign & 3)) {
-   *(u32 *)dst ^= *(u32 *)src;
+   *(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
dst += 4;
-   src += 4;
+   src1 += 4;
+   src2 += 4;
len -= 4;
}
 
while (len >= 2 && !(relalign & 1)) {
-   *(u16 *)dst ^= *(u16 *)src;
+   *(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
dst += 2;
-   src += 2;
+   src1 += 2;
+   src2 += 2;
len -= 2;
}
 
while (len--)
-   *dst++ ^= *src++;
+   *dst++ = *src1++ ^ *src2++;
 }
 EXPORT_SYMBOL_GPL(__crypto_xor);
 
diff --git a/include/crypto/algapi.h b/include/crypto/algapi.h
index 436c4c2683c7..fd547f946bf8 100644
--- a/include/crypto/algapi.h
+++ b/include/crypto/algapi.h
@@ -192,7 +192,7 @@ static inline unsigned int crypto_queue_len(struct crypto_queue *queue)
 }
 
 void crypto_inc(u8 *a, unsigned int size);
-void __crypto_xor(u8 *dst, const u8 *src, unsigned int size);
+void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int size);
 
 static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
 {
@@ -207,7 +207,7 @@ static inline void crypto_xor(u8 *dst, const u8 *src, unsigned int size)
size -= sizeof(unsigned long);
}
} else {
-   __crypto_xor(dst, src, size);
+   __crypto_xor(dst, dst, src, size);
}
 }
 
-- 
2.9.3
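
The two ideas in the patch above, a three-operand XOR and a relative
alignment computed over all three pointers, can be tried out in user
space with the sketch below. It makes several assumptions: xor3() and
rel_align() are hypothetical names, the word-sized loop uses memcpy()
for its loads and stores instead of the pointer casts in the kernel
code, and only exact aliasing (dst == src1) is supported, not partial
overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * dst = src1 ^ src2 over len bytes. Passing dst == src1 gives the
 * old in-place behaviour (dst ^= src2).
 */
static void xor3(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
		 size_t len)
{
	assert(dst && src1 && src2);

	/* 8-byte strides; memcpy() keeps the accesses alignment-safe */
	while (len >= sizeof(uint64_t)) {
		uint64_t a, b;

		memcpy(&a, src1, sizeof(a));
		memcpy(&b, src2, sizeof(b));
		a ^= b;
		memcpy(dst, &a, sizeof(a));
		dst += sizeof(a);
		src1 += sizeof(a);
		src2 += sizeof(a);
		len -= sizeof(a);
	}

	/* byte-wise tail */
	while (len--)
		*dst++ = *src1++ ^ *src2++;
}

/*
 * Largest power-of-two stride, capped at the word size, at which all
 * three pointers are mutually aligned. OR-ing the two pointer
 * differences mirrors the relalign computation in the patch; d & -d
 * isolates the lowest set bit, i.e. 1 << __ffs(d).
 */
static unsigned int rel_align(const void *dst, const void *src1,
			      const void *src2)
{
	uintptr_t size = sizeof(unsigned long);
	uintptr_t d = (((uintptr_t)dst ^ (uintptr_t)src1) |
		       ((uintptr_t)dst ^ (uintptr_t)src2)) & (size - 1);

	return d ? (unsigned int)(d & -d) : (unsigned int)size;
}
```

With three operands the stride is limited by the worst of the two
pointer pairs, which is why the two differences are OR-ed before the
lowest set bit is taken.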



[PATCH v2 0/2] crypto/algapi - refactor crypto_xor() to avoid memcpy()s

2017-07-18 Thread Ard Biesheuvel
From 2/2:

"""
There are quite a number of occurrences in the kernel of the pattern

if (dst != src)
memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);

or

crypto_xor(keystream, src, nbytes);
memcpy(dst, keystream, nbytes);

where crypto_xor() is preceded or followed by a memcpy() invocation
that is only there because crypto_xor() uses its output parameter as
one of the inputs.
"""

Patch #1 is a preparatory patch, which is split off for ease of review.

Patch #2 updates all occurrences of crypto_xor() involving a memcpy() to
use a new API function crypto_xor_cpy() which combines the two operations.

v2: - keep existing crypto_xor() as-is, and add crypto_xor_cpy() for the
  cases where a redundant memcpy() can be eliminated.

Ard Biesheuvel (2):
  crypto/algapi - use separate dst and src operands for __crypto_xor()
  crypto/algapi - make crypto_xor() take separate dst and src arguments

 arch/arm/crypto/aes-ce-glue.c   |  4 +---
 arch/arm/crypto/aes-neonbs-glue.c   |  5 ++--
 arch/arm64/crypto/aes-glue.c|  4 +---
 arch/arm64/crypto/aes-neonbs-glue.c |  5 ++--
 arch/sparc/crypto/aes_glue.c|  3 +--
 arch/x86/crypto/aesni-intel_glue.c  |  4 ++--
 arch/x86/crypto/blowfish_glue.c |  3 +--
 arch/x86/crypto/cast5_avx_glue.c|  3 +--
 arch/x86/crypto/des3_ede_glue.c |  3 +--
 crypto/algapi.c | 25 
 crypto/ctr.c|  3 +--
 crypto/pcbc.c   | 12 --
 drivers/crypto/vmx/aes_ctr.c|  3 +--
 drivers/md/dm-crypt.c   | 11 -
 include/crypto/algapi.h | 23 --
 15 files changed, 59 insertions(+), 52 deletions(-)

-- 
2.9.3
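
To make the motivation in the cover letter concrete, the sketch below
models both memcpy()-based patterns and the fused call in user-space C
and lets them be compared. The helpers are simplified byte-wise
stand-ins with hypothetical names; xor_inplace() and xor_cpy() only
model the semantics of crypto_xor() and crypto_xor_cpy() and are not
the kernel implementations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two-operand form, dst ^= src: models the existing crypto_xor(). */
static void xor_inplace(uint8_t *dst, const uint8_t *src, size_t len)
{
	while (len--)
		*dst++ ^= *src++;
}

/* Three-operand form, dst = src1 ^ src2: models crypto_xor_cpy(). */
static void xor_cpy(uint8_t *dst, const uint8_t *src1, const uint8_t *src2,
		    size_t len)
{
	while (len--)
		*dst++ = *src1++ ^ *src2++;
}

/* Old pattern 1: copy the input, then XOR the keystream in place. */
static void old_copy_then_xor(uint8_t *dst, const uint8_t *src,
			      const uint8_t *ks, size_t len)
{
	if (dst != src)
		memcpy(dst, src, len);
	xor_inplace(dst, ks, len);
}

/* Old pattern 2: XOR into the keystream buffer, then copy it out. */
static void old_xor_then_copy(uint8_t *dst, const uint8_t *src,
			      uint8_t *ks, size_t len)
{
	xor_inplace(ks, src, len);
	memcpy(dst, ks, len);
}
```

All three routes produce the same output; the fused form touches each
byte once and, unlike the second pattern, leaves the keystream buffer
unmodified.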



Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Stephan Müller
Am Dienstag, 18. Juli 2017, 11:16:10 CEST schrieb Arnd Bergmann:

Hi Arnd,

> I guess ideally you just move the inner half of lrng_get_jent(),
> i.e. everything inside of the spinlock, plus the buffer, into that file.
> That should keep the low-level side separate from the caller.

Yes, I concur.

Thanks.

Ciao
Stephan


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Arnd Bergmann
On Tue, Jul 18, 2017 at 11:10 AM, Stephan Müller  wrote:
> Am Dienstag, 18. Juli 2017, 11:02:02 CEST schrieb Arnd Bergmann:
>
> Hi Arnd,
>>
>> I can see why the jitterentropy implementation avoids using kernel headers,
>> the problem now is that part of it gets moved into a new header, and that
>> already violates the original principle.
>>
>> From my reading of the code, we could probably leave the structure
>> definition in the crypto/jitterentropy.c, and have the statically
>> allocated instance in the same file when CONFIG_LRNG is
>> set,
>
> That is a very good idea -- I will implement this approach.

I guess ideally you just move the inner half of lrng_get_jent(),
i.e. everything inside of the spinlock, plus the buffer, into that file.
That should keep the low-level side separate from the caller.

  Arnd


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Stephan Müller
Am Dienstag, 18. Juli 2017, 11:02:02 CEST schrieb Arnd Bergmann:

Hi Arnd,
> 
> I can see why the jitterentropy implementation avoids using kernel headers,
> the problem now is that part of it gets moved into a new header, and that
> already violates the original principle.
> 
> From my reading of the code, we could probably leave the structure
> definition in the crypto/jitterentropy.c, and have the statically
> allocated instance in the same file when CONFIG_LRNG is
> set,

That is a very good idea -- I will implement this approach.

> or provide a way to allocate an instance early (I assume you
> can't call jent_entropy_collector_alloc() here since you need
> the RNG long before kzalloc() works).

Correct. I cannot assume that any of the memory allocation routines are 
available.

Thank you.

Ciao
Stephan


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Arnd Bergmann
On Tue, Jul 18, 2017 at 10:49 AM, Greg Kroah-Hartman
 wrote:
> On Tue, Jul 18, 2017 at 10:40:07AM +0200, Stephan Müller wrote:
>> Am Dienstag, 18. Juli 2017, 10:30:14 CEST schrieb Greg Kroah-Hartman:
>>
>> Hi Greg,
>>
>> > > +typedef  unsigned long long  __u64;
>> > > +typedef  long long   __s64;
>> >
>> > types.h already has these defines, don't re-typedef them again...
>>
>> The issue is that the C code is compiled without optimizations. Thus, the C
>> code shall not depend on any other header file.
>
> That is very strange for a kernel file, I don't know what to say...
>
>> This issue was discussed during the inclusion of the Jitter RNG C code into
>> the kernel.
>
> Ok, that was then, this is now, why not change it now?  How does
> including types.h change anything?

I can see why the jitterentropy implementation avoids using kernel headers,
the problem now is that part of it gets moved into a new header, and that
already violates the original principle.

From my reading of the code, we could probably leave the structure
definition in the crypto/jitterentropy.c, and have the statically
allocated instance in the same file when CONFIG_LRNG is
set, or provide a way to allocate an instance early (I assume you
can't call jent_entropy_collector_alloc() here since you need
the RNG long before kzalloc() works).

Arnd


Re: [RFC PATCH v12 4/4] LRNG - enable compile

2017-07-18 Thread Stephan Müller
Am Dienstag, 18. Juli 2017, 10:51:41 CEST schrieb Arnd Bergmann:

Hi Arnd,

> On Tue, Jul 18, 2017 at 9:59 AM, Stephan Müller  wrote:
> > Add LRNG compilation support.
> > 
> > diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> > index 53e3372..87e06ec 100644
> > --- a/drivers/char/Makefile
> > +++ b/drivers/char/Makefile
> > @@ -2,7 +2,15 @@
> > 
> >  # Makefile for the kernel character device drivers.
> >  #
> > 
> > -obj-y  += mem.o random.o
> > +obj-y  += mem.o
> > +
> > +ifeq ($(CONFIG_LRNG),y)
> > +  obj-$(CONFIG_LRNG)   += lrng.o
> > +  lrng-y   += lrng_base.o lrng_chacha20.o
> > +else
> > +  obj-y+= random.o
> > +endif
> 
> I think you can write the same in a more readable way without the
> intermediate object:
> 
> ifdef CONFIG_LRNG
>   obj-y   += lrng_base.o lrng_chacha20.o
> else
>   obj-y   += random.o
> endif

Thank you for the hint, it will be included.
> 
>   Arnd


Ciao
Stephan


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Stephan Müller
On Tuesday, 18 July 2017, 10:49:59 CEST, Greg Kroah-Hartman wrote:

Hi Greg,
> 
> > This issue was discussed during the inclusion of the Jitter RNG C code into
> > the kernel.
> 
> Ok, that was then, this is now, why not change it now?  How does
> including types.h change anything?

At the time of discussion, I had no issue compiling it with types.h, but on 
other architectures, there were some issues. Allow me to check back with the 
developer who notified me about this issue to see whether types.h can be 
included.

Ciao
Stephan


Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Greg Kroah-Hartman
On Tue, Jul 18, 2017 at 10:45:12AM +0200, Stephan Müller wrote:
> On Tuesday, 18 July 2017, 10:32:10 CEST, Greg Kroah-Hartman wrote:
> 
> Hi Greg,
> 
> > external references do not last as long as the kernel change log does :(
> 
> What would be the best way to cite a 50+ page document? I got a suggestion to 
> include the ASCII version of the document into Documentation/ -- but for the 
> first inclusion request, I was not sure whether to add such a large document.

Sure, we like lots of documentation, what's 50+ more pages of it?  :)


Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Greg Kroah-Hartman
On Tue, Jul 18, 2017 at 10:45:12AM +0200, Stephan Müller wrote:
> On Tuesday, 18 July 2017, 10:32:10 CEST, Greg Kroah-Hartman wrote:
> 
> Hi Greg,
> 
> > external references do not last as long as the kernel change log does :(
> 
> What would be the best way to cite a 50+ page document? I got a suggestion to 
> include the ASCII version of the document into Documentation/ -- but for the 
> first inclusion request, I was not sure whether to add such a large document.
> > 
> > Also a "wholesale" replacement of random.c is a major thing, why not
> > just submit patches to fix it up to add the needed changes you feel are
> > necessary?  We don't like to have major changes like this, that's not
> > how kernel development is done.
> 
> I have to admit that I tried that over the last years. I sent numerous small 
> cleanup patches (not changing any logic) and larger patches (with logic 
> changes). Even after pinging, I hardly got a response to any of my patches, 
> let alone had any of them accepted.

Changing core kernel code is hard, really hard, for good reason.  I
don't recall seeing a patch series from you that addressed minor things
that you might have complaints about, why not send them again?

> I have stated the core concerns I have with random.c in [1]. To remedy these 
> core concerns, major changes to random.c are needed. Given past experience,
> I doubt that I would get the changes into random.c.
> 
> [1] https://www.spinics.net/lists/linux-crypto/msg26316.html

Evolution is the correct way to do this, kernel development relies on
that.  We don't do the "use this totally different and untested file
instead!" method.

thanks,

greg k-h


Re: [RFC PATCH v12 4/4] LRNG - enable compile

2017-07-18 Thread Arnd Bergmann
On Tue, Jul 18, 2017 at 9:59 AM, Stephan Müller  wrote:
> Add LRNG compilation support.
>
> diff --git a/drivers/char/Makefile b/drivers/char/Makefile
> index 53e3372..87e06ec 100644
> --- a/drivers/char/Makefile
> +++ b/drivers/char/Makefile
> @@ -2,7 +2,15 @@
>  # Makefile for the kernel character device drivers.
>  #
>
> -obj-y  += mem.o random.o
> +obj-y  += mem.o
> +
> +ifeq ($(CONFIG_LRNG),y)
> +  obj-$(CONFIG_LRNG)   += lrng.o
> +  lrng-y   += lrng_base.o lrng_chacha20.o
> +else
> +  obj-y += random.o
> +endif

I think you can write the same in a more readable way without the
intermediate object:

ifdef CONFIG_LRNG
  obj-y   += lrng_base.o lrng_chacha20.o
else
  obj-y   += random.o
endif

  Arnd


Re: [PATCH 2/2] crypto/algapi - make crypto_xor() take separate dst and src arguments

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 09:39, Herbert Xu  wrote:
> On Mon, Jul 10, 2017 at 02:45:48PM +0100, Ard Biesheuvel wrote:
>> There are quite a number of occurrences in the kernel of the pattern
>>
>> if (dst != src)
>> memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
>> crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
>>
>> or
>>
>> crypto_xor(keystream, src, nbytes);
>> memcpy(dst, keystream, nbytes);
>
> What about keeping crypto_xor as it is and adding a new entry point for
> the 4-argument case?
>

Also fine.


Re: [RFC PATCH v12 2/4] random: conditionally compile code depending on LRNG

2017-07-18 Thread Stephan Müller
On Tuesday, 18 July 2017, 10:47:00 CEST, Arnd Bergmann wrote:

Hi Arnd,

> On Tue, Jul 18, 2017 at 10:37 AM, Stephan Müller wrote:
> > On Tuesday, 18 July 2017, 10:13:55 CEST, Arnd Bergmann wrote:
> >> On Tue, Jul 18, 2017 at 9:58 AM, Stephan Müller wrote:
> >> > When selecting the LRNG for compilation, disable add_disk_randomness and
> >> > its supporting function.
> >> > 
> >> > CC: Greg Kroah-Hartman 
> >> > CC: Arnd Bergmann 
> >> > CC: Jason A. Donenfeld 
> >> > Signed-off-by: Stephan Mueller 
> >> 
> >> I think this needs a better explanation. Why do we ignore the extra
> >> entropy here?
> > 
> > I was not sure whether to add all the details about the reason into the
> > patch submission.
> > 
> > The reason is explained here in [1] page 3 and re-iterated in [2].
> 
> Ok, got it. A half-sentence summary of that ("... to avoid adding the
> same event twice from interrupt and block") would be sufficient for
> the patch description, longer is also fine.

Perfect, thank you for that hint. I will add this information to the next
iteration.
> 
> Generally speaking, each patch description should describe why
> that particular patch is required rather than describe what it does
> (which in cases like this is plain to see from looking a few lines
> down).
> 
> Arnd



Ciao
Stephan


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Greg Kroah-Hartman
On Tue, Jul 18, 2017 at 10:40:07AM +0200, Stephan Müller wrote:
> On Tuesday, 18 July 2017, 10:30:14 CEST, Greg Kroah-Hartman wrote:
> 
> Hi Greg,
> 
> > > +typedef  unsigned long long  __u64;
> > > +typedef  long long   __s64;
> > 
> > types.h already has these defines, don't re-typedef them again...
> 
> The issue is that the C code is compiled without optimizations. Thus, the C 
> code shall not depend on any other header file.

That is very strange for a kernel file, I don't know what to say...

> This issue was discussed during the inclusion of the Jitter RNG C code into 
> the kernel.

Ok, that was then, this is now, why not change it now?  How does
including types.h change anything?

thanks,

greg k-h


Re: [RFC PATCH v12 2/4] random: conditionally compile code depending on LRNG

2017-07-18 Thread Arnd Bergmann
On Tue, Jul 18, 2017 at 10:37 AM, Stephan Müller  wrote:
> On Tuesday, 18 July 2017, 10:13:55 CEST, Arnd Bergmann wrote:
>> On Tue, Jul 18, 2017 at 9:58 AM, Stephan Müller  wrote:
>> > When selecting the LRNG for compilation, disable add_disk_randomness and
>> > its supporting function.
>> >
>> > CC: Greg Kroah-Hartman 
>> > CC: Arnd Bergmann 
>> > CC: Jason A. Donenfeld 
>> > Signed-off-by: Stephan Mueller 
>>
>> I think this needs a better explanation. Why do we ignore the extra
>> entropy here?
>
> I was not sure whether to add all the details about the reason into the patch
> submission.
>
> The reason is explained here in [1] page 3 and re-iterated in [2].
>

Ok, got it. A half-sentence summary of that ("... to avoid adding the
same event twice from interrupt and block") would be sufficient for
the patch description, longer is also fine.

Generally speaking, each patch description should describe why
that particular patch is required rather than describe what it does
(which in cases like this is plain to see from looking a few lines
down).

Arnd


Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Stephan Müller
On Tuesday, 18 July 2017, 10:32:10 CEST, Greg Kroah-Hartman wrote:

Hi Greg,

> external references do not last as long as the kernel change log does :(

What would be the best way to cite a 50+ page document? I got a suggestion to 
include the ASCII version of the document into Documentation/ -- but for the 
first inclusion request, I was not sure whether to add such a large document.
> 
> Also a "wholesale" replacement of random.c is a major thing, why not
> just submit patches to fix it up to add the needed changes you feel are
> necessary?  We don't like to have major changes like this, that's not
> how kernel development is done.

I have to admit that I tried that over the last years. I sent numerous small 
cleanup patches (not changing any logic) and larger patches (with logic 
changes). Even after pinging, I hardly got a response to any of my patches, 
let alone had any of them accepted.

I have stated the core concerns I have with random.c in [1]. To remedy these 
core concerns, major changes to random.c are needed. Given past experience,
I doubt that I would get the changes into random.c.

[1] https://www.spinics.net/lists/linux-crypto/msg26316.html

Ciao
Stephan


Re: [PATCH 2/2] crypto/algapi - make crypto_xor() take separate dst and src arguments

2017-07-18 Thread Herbert Xu
On Mon, Jul 10, 2017 at 02:45:48PM +0100, Ard Biesheuvel wrote:
> There are quite a number of occurrences in the kernel of the pattern
> 
> if (dst != src)
> memcpy(dst, src, walk.total % AES_BLOCK_SIZE);
> crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE);
> 
> or
> 
> crypto_xor(keystream, src, nbytes);
> memcpy(dst, keystream, nbytes);

What about keeping crypto_xor as it is and adding a new entry point for
the 4-argument case?
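
The two shapes discussed here can be sketched in plain userspace C as below. The names xor_inplace() and xor_cpy() are hypothetical and only illustrate the idea of leaving the existing in-place helper alone and adding a separate 4-argument entry point; this is not the kernel implementation, which also optimizes for word-sized accesses.

```c
#include <stddef.h>

/* Existing shape: dst ^= src, in place. */
static void xor_inplace(unsigned char *dst, const unsigned char *src,
			size_t size)
{
	for (size_t i = 0; i < size; i++)
		dst[i] ^= src[i];
}

/* Hypothetical new entry point: dst = src1 ^ src2.
 * dst may be the same pointer as src1 or src2; partial overlap is not
 * supported by this sketch. */
static void xor_cpy(unsigned char *dst, const unsigned char *src1,
		    const unsigned char *src2, size_t size)
{
	for (size_t i = 0; i < size; i++)
		dst[i] = src1[i] ^ src2[i];
}
```

With the second helper, the quoted "memcpy then crypto_xor" and "crypto_xor then memcpy" patterns each collapse into a single call, while existing in-place callers stay untouched.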

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Stephan Müller
On Tuesday, 18 July 2017, 10:30:14 CEST, Greg Kroah-Hartman wrote:

Hi Greg,

> > +typedef  unsigned long long  __u64;
> > +typedef  long long   __s64;
> 
> types.h already has these defines, don't re-typedef them again...

The issue is that the C code is compiled without optimizations. Thus, the C 
code shall not depend on any other header file.

This issue was discussed during the inclusion of the Jitter RNG C code into 
the kernel.
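
For illustration, a header-free variant of such typedefs can still be checked at compile time. The sketch below uses hypothetical jent_u64/jent_s64 names (to avoid clashing with the real __u64/__s64) and assumes a C11 compiler for _Static_assert; it is not part of the patch.

```c
/* Local fixed-width typedefs with no header dependency.
 * The names are hypothetical, chosen to avoid clashing with __u64/__s64. */
typedef unsigned long long jent_u64;
typedef long long jent_s64;

/* Compile-time check that "long long is 64 bits" really holds here. */
_Static_assert(sizeof(jent_u64) == 8, "jent_u64 must be 64 bits wide");
_Static_assert(sizeof(jent_s64) == 8, "jent_s64 must be 64 bits wide");

/* Example use: time delta that relies on unsigned wrap-around. */
jent_u64 jent_delta(jent_u64 prev, jent_u64 now)
{
	return now - prev;
}
```

The assertions cost nothing at run time and would catch a platform where the local typedefs disagree with what types.h provides.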

Ciao
Stephan


Re: [RFC PATCH v12 2/4] random: conditionally compile code depending on LRNG

2017-07-18 Thread Stephan Müller
On Tuesday, 18 July 2017, 10:13:55 CEST, Arnd Bergmann wrote:

Hi Arnd,

> On Tue, Jul 18, 2017 at 9:58 AM, Stephan Müller  wrote:
> > When selecting the LRNG for compilation, disable add_disk_randomness and
> > its supporting function.
> > 
> > CC: Greg Kroah-Hartman 
> > CC: Arnd Bergmann 
> > CC: Jason A. Donenfeld 
> > Signed-off-by: Stephan Mueller 
> 
> I think this needs a better explanation. Why do we ignore the extra
> entropy here?

I was not sure whether to add all the details about the reason into the patch 
submission.

The reason is explained here in [1] page 3 and re-iterated in [2].

The gist is the following:

A HID or block device event that feeds entropy to its respective noise source
also generates an interrupt. These interrupts are also 
processed by the interrupt noise source. The majority of entropy is delivered 
by the high-resolution time stamp of the occurrence of such an event. Now, 
that event is processed twice in the legacy /dev/random implementation: once 
by the HID or block device noise source and once by the interrupt noise 
source. Thus, the two time stamps of the one event (HID noise source and 
interrupt noise source, or block device noise source and interrupt noise 
source) used as a basis for entropy are highly correlated. Correlation or even 
a possible reuse of the same random value diminishes entropy significantly.

The additional data provided via the block noise source (block device number) 
has no real entropy.

Bottom line: for entropy, the HID and block device noise sources are just a 
derivative of the interrupt noise source. Thus, discarding the block device 
noise source will not lose any entropy. Regarding the HID noise source, only 
the key/mouse event numbers are injected into the LRNG without attributing any 
entropy to them.
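
The argument can be written down with the chain rule for Shannon entropy. The symbols are assumptions introduced here for illustration: T_1 is the time stamp taken by the interrupt noise source, T_2 the time stamp taken by the HID or block device noise source for the same event, and \delta the processing delay between the two.

```latex
H(T_1, T_2) = H(T_1) + H(T_2 \mid T_1) \le H(T_1) + H(T_2)

% If T_2 = T_1 + \delta and \delta is (nearly) deterministic:
H(T_2 \mid T_1) = H(\delta \mid T_1) \approx 0
\quad\Longrightarrow\quad
H(T_1, T_2) \approx H(T_1)
```

Crediting both time stamps as if they were independent would therefore count roughly H(T_1) twice, which is the double accounting described above.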

[1] http://www.chronox.de/lrng/doc/lrng.pdf

[2] https://www.spinics.net/lists/linux-crypto/msg26316.html

Ciao
Stephan


Re: [PATCH v3 0/7] crypto: aes - allow generic AES to be omitted

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 09:30, Herbert Xu  wrote:
> On Tue, Jul 18, 2017 at 08:57:28AM +0100, Ard Biesheuvel wrote:
>>
>> So if you care about security and/or the cache/memory footprint more
>> than about speed, you can disable the table based implementations that
>> exist for i586, x86, ARM and arm64 (all of which have faster and time
>> invariant implementations based on SIMD or special instructions
>> anyway, so for 95% of the cases, it does not really matter).
>
> The thing is that anybody who cares about speed won't be using
> aes-generic anyway.  We have way too many AES implementations
> as it is, and having two C implementations is really getting
> silly.
>
> So would it be possible for you to proceed with your work in
> such a way that we end up with just aes-ti as the generic C
> implementation?
>

Sure.

> As for the table-based asm implementations yes they can stay and
> work out some way of sharing that table at the source-code level.
> At run-time the table can just go into the asm module directly
> since you'd only have one on each platform, right?
>

Indeed. And ARM only uses 4 of those 16 tables anyway (and really only
needs two of them, so I will fix that as well)
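
As a generic illustration of the table-based versus time-invariant trade-off mentioned above, the sketch below contrasts a direct lookup with one that touches every entry. Both function names are hypothetical, and this is not how aes-ti or any of the kernel's SIMD implementations work; it only shows why avoiding secret-dependent table indexes costs speed.

```c
#include <stdint.h>

/* Direct lookup: the memory address touched depends on the secret index,
 * which is what makes table-based ciphers sensitive to cache timing. */
static uint8_t lookup_direct(const uint8_t table[256], uint8_t idx)
{
	return table[idx];
}

/* Scan the whole table and select the wanted entry with a mask, so the
 * access pattern does not depend on idx. */
static uint8_t lookup_scan(const uint8_t table[256], uint8_t idx)
{
	uint8_t out = 0;

	for (unsigned int i = 0; i < 256; i++) {
		uint8_t diff = (uint8_t)(i ^ idx);
		/* mask is 0xFF when i == idx and 0x00 otherwise, no branch */
		uint8_t mask = (uint8_t)(((unsigned int)diff - 1u) >> 8);

		out |= table[i] & mask;
	}
	return out;
}
```

The scanning variant does 256 loads per lookup instead of one, which is the kind of speed-for-footprint-and-security trade described in the quoted text.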


Re: [RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Greg Kroah-Hartman
On Tue, Jul 18, 2017 at 09:59:09AM +0200, Stephan Müller wrote:
> The LRNG with the following properties:
> 
> * noise source: interrupts timing with fast boot time seeding
> 
> * lockless LFSR to collect raw entropy
> 
> * use of standalone ChaCha20 based RNG with the option to use a
>   different DRNG selectable at compile time
> 
> * "atomic" seeding of secondary DRBG to ensure full entropy
>   transport
> 
> * instantiate one DRNG per NUMA node
> 
> Further details including the rationale for the design choices and
> properties of the LRNG together with testing is provided at [1].
> In addition, the documentation explains the conducted regression
> tests to verify that the LRNG is API and ABI compatible with the
> legacy /dev/random implementation.
> 
> [1] http://www.chronox.de/lrng.html

external references do not last as long as the kernel change log does :(

Also a "wholesale" replacement of random.c is a major thing, why not
just submit patches to fix it up to add the needed changes you feel are
necessary?  We don't like to have major changes like this, that's not
how kernel development is done.

thanks,

greg k-h

