Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-06-03 Thread Corentin LABBE
Le 23/05/2015 16:35, Boris Brezillon a écrit :
> Hi Corentin,
> 
> On Sat, 23 May 2015 15:12:23 +0200
> Corentin LABBE  wrote:
> 
>> Le 17/05/2015 10:45, Boris Brezillon a écrit :
>>> Hi Corentin,
>>>
>>> I started to review this new version, and I still think there's
>>> something wrong with the way your processing crypto requests.
>>> From my POV this is not asynchronous at all (see my comments inline),
>>> but maybe Herbert can confirm that.
>>> I haven't reviewed the hash part yet, but I guess it has the same
>>> problem.
>>
>> For resuming your conversation with Herbert, I have removed all 
>> CRYPTO_ALG_ASYNC flags.
> 
> Okay. I really think you can easily deal with asynchronous request (I
> had a look at the datasheet), but I'll let maintainers decide whether
> this is important or not.
> 
> 
 +
 +  if (areq->src == NULL || areq->dst == NULL) {
 +  dev_err(ss->dev, "ERROR: Some SGs are NULL\n");
 +  return -EINVAL;
 +  }
 +
 +  spin_lock_irqsave(&ss->slock, flags);
>>>
>>> Do you really need to take this lock so early ?
>>> BTW, what is this lock protecting ?
>>
>> As I have written in the header, the spinlock protect the usage of the 
>> device.
>> In this case, I need to lock just before writing the key in the device.
> 
> I'm not denying the fact that you need some locking, just how this
> locking is done: you're preventing the all system from receiving
> interrupts for the all time your doing your crypto request.
> 
> Here is a suggestion (if you still want to keep the synchronous model,
> which IMHO is not a good idea, but hey, that's not my call to make).
> 
> 1/ wait for the device to be ready (using a waitqueue)
> 2/ take the lock
> 3/ check if the engine is busy (already handling another crypto
>request).
> 4.1/ If the engine is not busy, mark the engine as busy, release the
>  lock and proceed with the crytpo request handling.
> 4.2/ If the engine is busy, release the lock and go back to 1/ 
> 
> 

I have done a version with a crypto_queue and a kthread.
This works perfectly but.. the performance are really really bad.
Never more than 20k request/s when both generic and synchronous SS can go 
beyond 80k.
With this asynchronous driver, the Security System become useful only with 
request larger than 2048 bytes
I have patched my driver to create stats on request size, this show that real 
usage is less than that. (512bytes for cryptoluks for example)

Furthermore, my current patch for using the PRNG cannot be used with the 
kthread since it use the hwrng interface.
But perhaps by using also a crypto_rng alg it could be done.

So I think that if I want to make the SS driver useful, I cannot set it 
asynchronous.
Perhaps when the DMA engine will be available, this will change.

I have attached the patch that make my driver asynchronous for comments on 
possible improvement.

>>
>>>
>>> IMHO, taking a spinlock and disabling irqs for the whole time you're
>>> executing a crypto request is not a good idea (it will prevent all
>>> other irqs from running, and potentially introduce latencies in other
>>> parts of the kernel).
>>
>> Since crypto operation could be called by software interrupt, I need to 
>> disable them.
>> (Confirmed by http://www.makelinux.net/ldd3/chp-5-sect-5 5.5.3)
> 
> Hm, you're not even using the interrupts provided by the IP to detect
> when the engine is ready to accept new data chunks (which is another
> aspect that should be addressed IMO), so I don't get why you need to
> disable HW interrupts.
> If you just want to disable SW interrupts, you can use spin_lock_bh()
> instead of spin_lock_irqsave().
> 

Thanks I use spin_lock_bh now.

>>
>>>
>>> What you can do though is declare the following fields in your crypto
>>> engine struct (sun4i_ss_ctx):
>>> - a crypto request queue (struct crypto_queue [1])
>>> - a crypto_async_request variable storing the request being processed
>>> - a lock protecting the queue and the current request variable
>>>
>>> This way you'll only have to take the lock when queuing or dequeuing a
>>> request.
>>>
>>> Another comment, your implementation does not seem to be asynchronous at
>>> all: you're blocking the caller until its crypto request is complete.
>>>
>>>
 +
 +  for (i = 0; i < op->keylen; i += 4)
 +  writel(*(op->key + i / 4), ss->base + SS_KEY0 + i);
 +
 +  if (areq->info != NULL) {
 +  for (i = 0; i < 4 && i < ivsize / 4; i++) {
 +  v = *(u32 *)(areq->info + i * 4);
 +  writel(v, ss->base + SS_IV0 + i * 4);
 +  }
 +  }
 +  writel(mode, ss->base + SS_CTL);
 +
 +  sgnum = sg_nents(areq->src);
 +  if (sgnum == 1)
 +  miter_flag = SG_MITER_FROM_SG | SG_MITER_ATOMIC;
 +  else
 +  miter_flag = SG_MITER_FROM_SG;
>>>
>>>
>>> Why is the ATOMIC flag depending on the number of sg elements.
>>> IMO it should only depends on the context you're currently in, 

Re: [linux-sunxi] Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-24 Thread Herbert Xu
On Sun, May 24, 2015 at 10:04:16AM +0200, Corentin LABBE wrote:
> 
> For aes_cbc it exists a test with 3 SG with .tap = { 496 - 20, 4, 16 }
> But my driver handle that. (multiple of 4)
> 
> What do you think about adding a test with 16 SG of 1 byte ? (or 3 + 2 + 3 + 
> 8 * 1)

Sure please send a patch.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-sunxi] Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-24 Thread Corentin LABBE
Le 24/05/2015 05:32, Herbert Xu a écrit :
> On Sat, May 23, 2015 at 04:35:36PM +0200, Boris Brezillon wrote:
>>
>>> Since all SG I get was always a multiple of 16 (AES BLOCK SIZE) it was a 
>>> sort of confirmation.
>>>
>>> Herbert ? does am I right or a chunking test is missing for cbc(aes) in 
>>> testmgr.h
>>
>> Okay, just sharing my vision of this thing (I'll let Herbert comment on
>> this aspect): I'd say that theoretically nothing prevents one from
>> splitting its sg list in chunks smaller than the block size, so I'd
>> say you should use the same trick for AES. 
> 
> Indeed, you must be able to handle a block that straddles two SG
> entries.  If the hardware is not capable then you have to linearise
> it by copying or use a fallback.
> 
> I can't believe we don't have a generic test for this in testmgr.
> 

For aes_cbc it exists a test with 3 SG with .tap = { 496 - 20, 4, 16 }
But my driver handle that. (multiple of 4)

What do you think about adding a test with 16 SG of 1 byte ? (or 3 + 2 + 3 + 8 
* 1)

Furthermore, I need to re-read testmgr.h but it seems that no aes test exists 
with odd SG size.

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-23 Thread Herbert Xu
On Sat, May 23, 2015 at 04:35:36PM +0200, Boris Brezillon wrote:
>
> > Since all SG I get was always a multiple of 16 (AES BLOCK SIZE) it was a 
> > sort of confirmation.
> > 
> > Herbert ? does am I right or a chunking test is missing for cbc(aes) in 
> > testmgr.h
> 
> Okay, just sharing my vision of this thing (I'll let Herbert comment on
> this aspect): I'd say that theoretically nothing prevents one from
> splitting its sg list in chunks smaller than the block size, so I'd
> say you should use the same trick for AES. 

Indeed, you must be able to handle a block that straddles two SG
entries.  If the hardware is not capable then you have to linearise
it by copying or use a fallback.

I can't believe we don't have a generic test for this in testmgr.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [linux-sunxi] Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-23 Thread Herbert Xu
On Sat, May 23, 2015 at 02:18:06PM +0200, Corentin LABBE wrote:
>
> What do you think about adding a BUG_ON(in_atomic()) in 
> crypto_ablkcipher_setkey() ?

Just add a might_sleep() to it.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-23 Thread Boris Brezillon
Hi Corentin,

On Sat, 23 May 2015 15:12:23 +0200
Corentin LABBE  wrote:

> Le 17/05/2015 10:45, Boris Brezillon a écrit :
> > Hi Corentin,
> > 
> > I started to review this new version, and I still think there's
> > something wrong with the way your processing crypto requests.
> > From my POV this is not asynchronous at all (see my comments inline),
> > but maybe Herbert can confirm that.
> > I haven't reviewed the hash part yet, but I guess it has the same
> > problem.
> 
> For resuming your conversation with Herbert, I have removed all 
> CRYPTO_ALG_ASYNC flags.

Okay. I really think you can easily deal with asynchronous request (I
had a look at the datasheet), but I'll let maintainers decide whether
this is important or not.


> > 
> >> +
> >> +int sun4i_ss_aes_poll(struct ablkcipher_request *areq, u32 mode)
> >> +{
> >> +  u32 spaces;
> >> +  struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
> >> +  struct sun4i_tfm_ctx *op = crypto_ablkcipher_ctx(tfm);
> >> +  struct sun4i_ss_ctx *ss = op->ss;
> >> +  unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
> >> +  /* when activating SS, the default FIFO space is 32 */
> >> +  u32 rx_cnt = 32;
> >> +  u32 tx_cnt = 0;
> >> +  u32 v;
> >> +  int i, sgnum, err = 0;
> >> +  unsigned int ileft = areq->nbytes;
> >> +  unsigned int oleft = areq->nbytes;
> >> +  unsigned int todo, miter_flag;
> >> +  unsigned long flags;
> >> +  struct sg_mapping_iter mi, mo;
> >> +  unsigned int oi, oo; /* offset for in and out */
> >> +
> >> +  if (areq->nbytes == 0)
> >> +  return 0;
> >> +
> >> +  if (areq->info == NULL) {
> >> +  dev_err(ss->dev, "ERROR: Empty IV\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  if (areq->src == NULL || areq->dst == NULL) {
> >> +  dev_err(ss->dev, "ERROR: Some SGs are NULL\n");
> >> +  return -EINVAL;
> >> +  }
> >> +
> >> +  spin_lock_irqsave(&ss->slock, flags);
> > 
> > Do you really need to take this lock so early ?
> > BTW, what is this lock protecting ?
> 
> As I have written in the header, the spinlock protect the usage of the device.
> In this case, I need to lock just before writing the key in the device.

I'm not denying the fact that you need some locking, just how this
locking is done: you're preventing the all system from receiving
interrupts for the all time your doing your crypto request.

Here is a suggestion (if you still want to keep the synchronous model,
which IMHO is not a good idea, but hey, that's not my call to make).

1/ wait for the device to be ready (using a waitqueue)
2/ take the lock
3/ check if the engine is busy (already handling another crypto
   request).
4.1/ If the engine is not busy, mark the engine as busy, release the
 lock and proceed with the crytpo request handling.
4.2/ If the engine is busy, release the lock and go back to 1/ 


> 
> > 
> > IMHO, taking a spinlock and disabling irqs for the whole time you're
> > executing a crypto request is not a good idea (it will prevent all
> > other irqs from running, and potentially introduce latencies in other
> > parts of the kernel).
> 
> Since crypto operation could be called by software interrupt, I need to 
> disable them.
> (Confirmed by http://www.makelinux.net/ldd3/chp-5-sect-5 5.5.3)

Hm, you're not even using the interrupts provided by the IP to detect
when the engine is ready to accept new data chunks (which is another
aspect that should be addressed IMO), so I don't get why you need to
disable HW interrupts.
If you just want to disable SW interrupts, you can use spin_lock_bh()
instead of spin_lock_irqsave().

> 
> > 
> > What you can do though is declare the following fields in your crypto
> > engine struct (sun4i_ss_ctx):
> > - a crypto request queue (struct crypto_queue [1])
> > - a crypto_async_request variable storing the request being processed
> > - a lock protecting the queue and the current request variable
> > 
> > This way you'll only have to take the lock when queuing or dequeuing a
> > request.
> > 
> > Another comment, your implementation does not seem to be asynchronous at
> > all: you're blocking the caller until its crypto request is complete.
> > 
> > 
> >> +
> >> +  for (i = 0; i < op->keylen; i += 4)
> >> +  writel(*(op->key + i / 4), ss->base + SS_KEY0 + i);
> >> +
> >> +  if (areq->info != NULL) {
> >> +  for (i = 0; i < 4 && i < ivsize / 4; i++) {
> >> +  v = *(u32 *)(areq->info + i * 4);
> >> +  writel(v, ss->base + SS_IV0 + i * 4);
> >> +  }
> >> +  }
> >> +  writel(mode, ss->base + SS_CTL);
> >> +
> >> +  sgnum = sg_nents(areq->src);
> >> +  if (sgnum == 1)
> >> +  miter_flag = SG_MITER_FROM_SG | SG_MITER_ATOMIC;
> >> +  else
> >> +  miter_flag = SG_MITER_FROM_SG;
> > 
> > 
> > Why is the ATOMIC flag depending on the number of sg elements.
> > IMO it should only depends on the context you're currently in, and in
> > your specific case, you're always in atomic context since you've taken
> > 

Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-23 Thread Corentin LABBE
Le 17/05/2015 10:45, Boris Brezillon a écrit :
> Hi Corentin,
> 
> I started to review this new version, and I still think there's
> something wrong with the way your processing crypto requests.
> From my POV this is not asynchronous at all (see my comments inline),
> but maybe Herbert can confirm that.
> I haven't reviewed the hash part yet, but I guess it has the same
> problem.

For resuming your conversation with Herbert, I have removed all 
CRYPTO_ALG_ASYNC flags.

> 
> Best Regards,
> 
> Boris
> 
> On Thu, 14 May 2015 14:59:01 +0200
> LABBE Corentin  wrote:
> 
>> Add support for the Security System included in Allwinner SoC A20.
>> The Security System is a hardware cryptographic accelerator that support:
>> - MD5 and SHA1 hash algorithms
>> - AES block cipher in CBC/ECB mode with 128/196/256bits keys.
>> - DES and 3DES block cipher in CBC/ECB mode
>>
>> Signed-off-by: LABBE Corentin 
>> ---
> 
> [...]
> 
>> +
>> +int sun4i_ss_aes_poll(struct ablkcipher_request *areq, u32 mode)
>> +{
>> +u32 spaces;
>> +struct crypto_ablkcipher *tfm = crypto_ablkcipher_reqtfm(areq);
>> +struct sun4i_tfm_ctx *op = crypto_ablkcipher_ctx(tfm);
>> +struct sun4i_ss_ctx *ss = op->ss;
>> +unsigned int ivsize = crypto_ablkcipher_ivsize(tfm);
>> +/* when activating SS, the default FIFO space is 32 */
>> +u32 rx_cnt = 32;
>> +u32 tx_cnt = 0;
>> +u32 v;
>> +int i, sgnum, err = 0;
>> +unsigned int ileft = areq->nbytes;
>> +unsigned int oleft = areq->nbytes;
>> +unsigned int todo, miter_flag;
>> +unsigned long flags;
>> +struct sg_mapping_iter mi, mo;
>> +unsigned int oi, oo; /* offset for in and out */
>> +
>> +if (areq->nbytes == 0)
>> +return 0;
>> +
>> +if (areq->info == NULL) {
>> +dev_err(ss->dev, "ERROR: Empty IV\n");
>> +return -EINVAL;
>> +}
>> +
>> +if (areq->src == NULL || areq->dst == NULL) {
>> +dev_err(ss->dev, "ERROR: Some SGs are NULL\n");
>> +return -EINVAL;
>> +}
>> +
>> +spin_lock_irqsave(&ss->slock, flags);
> 
> Do you really need to take this lock so early ?
> BTW, what is this lock protecting ?

As I have written in the header, the spinlock protect the usage of the device.
In this case, I need to lock just before writing the key in the device.

> 
> IMHO, taking a spinlock and disabling irqs for the whole time you're
> executing a crypto request is not a good idea (it will prevent all
> other irqs from running, and potentially introduce latencies in other
> parts of the kernel).

Since crypto operation could be called by software interrupt, I need to disable 
them.
(Confirmed by http://www.makelinux.net/ldd3/chp-5-sect-5 5.5.3)

> 
> What you can do though is declare the following fields in your crypto
> engine struct (sun4i_ss_ctx):
> - a crypto request queue (struct crypto_queue [1])
> - a crypto_async_request variable storing the request being processed
> - a lock protecting the queue and the current request variable
> 
> This way you'll only have to take the lock when queuing or dequeuing a
> request.
> 
> Another comment, your implementation does not seem to be asynchronous at
> all: you're blocking the caller until its crypto request is complete.
> 
> 
>> +
>> +for (i = 0; i < op->keylen; i += 4)
>> +writel(*(op->key + i / 4), ss->base + SS_KEY0 + i);
>> +
>> +if (areq->info != NULL) {
>> +for (i = 0; i < 4 && i < ivsize / 4; i++) {
>> +v = *(u32 *)(areq->info + i * 4);
>> +writel(v, ss->base + SS_IV0 + i * 4);
>> +}
>> +}
>> +writel(mode, ss->base + SS_CTL);
>> +
>> +sgnum = sg_nents(areq->src);
>> +if (sgnum == 1)
>> +miter_flag = SG_MITER_FROM_SG | SG_MITER_ATOMIC;
>> +else
>> +miter_flag = SG_MITER_FROM_SG;
> 
> 
> Why is the ATOMIC flag depending on the number of sg elements.
> IMO it should only depends on the context you're currently in, and in
> your specific case, you're always in atomic context since you've taken
> a spinlock (and disabled irqs) a few lines above.
> 
> Note that with the approach I previously proposed, you can even get rid
> of this ATMIC flag (or always set it depending on the context you're in
> when dequeuing the crypto requests).
> 

For sg_miter, the ATOMIC flag control the usage of kmap vs kmap_atomic.
Since kmap_atomic must not be held too long, I use them only for short crypto 
operation.
But I need to rebench for be sure that using kmap_atomic give faster result.
If I keep that, I will add a comment explaining that.

> 
>> +sg_miter_start(&mi, areq->src, sgnum, miter_flag);
>> +sgnum = sg_nents(areq->dst);
>> +if (sgnum == 1)
>> +miter_flag = SG_MITER_TO_SG | SG_MITER_ATOMIC;
>> +else
>> +miter_flag = SG_MITER_TO_SG;
> 
> Ditto
> 
>> +sg_miter_start(&mo, areq->dst, sgnum, miter_flag);
>> +sg_miter_next(&mi);
>> +sg_miter_next(&mo);
>> +if (mi

Re: [linux-sunxi] Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-23 Thread Corentin LABBE
Le 15/05/2015 08:49, Herbert Xu a écrit :
> On Thu, May 14, 2015 at 02:59:01PM +0200, LABBE Corentin wrote:
>>
>> +err = crypto_ablkcipher_setkey(op->fallback, kkey, op->keylen);
>> +if (err != 0) {
>> +dev_err(ss->dev, "Cannot set key on fallback\n");
>> +return -EINVAL;
>> +}
> 
> You cannot call setkey from an atomic context.  The easiest solution
> is to call setkey in your setkey function instead.
> 
> Cheers,
> 
Fixed

What do you think about adding a BUG_ON(in_atomic()) in 
crypto_ablkcipher_setkey() ?

Thanks

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-18 Thread Boris Brezillon
Hi Herbert,

On Mon, 18 May 2015 08:41:21 +0800
Herbert Xu  wrote:

> On Sun, May 17, 2015 at 12:48:11PM +0200, Boris Brezillon wrote:
> > 
> > Yep, but then they shouldn't be declared with CRYPTO_ALG_ASYNC and as an

  not ^

> > ablkcipher algorithm (*Asynchronous* Block Cipher), right ?
> 
> Right.  They can still use ablkcipher but should clear the ASYNC
> bit.

Okay, just to be sure, what does "Asynchronous" mean in ablkcipher or
ahash ?
Is it related to the fact that crypto operations can be done in
multiple steps (e.g.: one set_key + several encrypt chunk operations),
or is this something else ?

Best Regards,

Boris

-- 
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 4/4] crypto: Add Allwinner Security System crypto accelerator

2015-05-17 Thread Herbert Xu
On Sun, May 17, 2015 at 12:48:11PM +0200, Boris Brezillon wrote:
> 
> Yep, but then they shouldn't be declared with CRYPTO_ALG_ASYNC and as an
> ablkcipher algorithm (*Asynchronous* Block Cipher), right ?

Right.  They can still use ablkcipher but should clear the ASYNC
bit.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html