Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-28 Thread Geert Uytterhoeven
On Wed, Sep 22, 2021 at 4:12 AM Joel Stanley wrote:
> On Tue, 21 Sept 2021 at 21:47, Emmanuel Gil Peyrot wrote:
> >
> > This engine implements AES in CBC mode, using 128-bit keys only.  It is
> > present on both the Wii and the Wii U, and is apparently identical in
> > both consoles.
> >
> > The hardware is capable of firing an interrupt when the operation is
> > done, but this driver currently uses a busy loop, I’m not too sure
> > whether it would be preferable to switch, nor how to achieve that.
> >
> > It also supports a mode where no operation is done, and thus could be
> > used as a DMA copy engine, but I don’t know how to expose that to the
> > kernel or whether it would even be useful.
> >
> > In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> > aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> > speedup.
> >
> > This driver was written based on reversed documentation, see:
> > https://wiibrew.org/wiki/Hardware/AES
> >
> > Signed-off-by: Emmanuel Gil Peyrot 
> > Tested-by: Emmanuel Gil Peyrot   # on Wii U
> > ---
> >  drivers/crypto/Kconfig        |  11 ++
> >  drivers/crypto/Makefile       |   1 +
> >  drivers/crypto/nintendo-aes.c | 273 ++
> >  3 files changed, 285 insertions(+)
> >  create mode 100644 drivers/crypto/nintendo-aes.c
> >
> > diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> > index 9a4c275a1335..adc94ad7462d 100644
> > --- a/drivers/crypto/Kconfig
> > +++ b/drivers/crypto/Kconfig
> > @@ -871,4 +871,15 @@ config CRYPTO_DEV_SA2UL
> >
> >  source "drivers/crypto/keembay/Kconfig"
> >
> > +config CRYPTO_DEV_NINTENDO
> > +   tristate "Support for the Nintendo Wii U AES engine"
> > +   depends on WII || WIIU || COMPILE_TEST
>
> This current setup will allow the driver to be compile-tested for
> non-powerpc, which will fail on the dcbf instructions.
>
> Perhaps use this instead:
>
>    depends on WII || WIIU || (COMPILE_TEST && PPC)

Or:

    depends on PPC
    depends on WII || WIIU || COMPILE_TEST

to distinguish between hard and soft dependencies.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-22 Thread Ard Biesheuvel
On Wed, 22 Sept 2021 at 12:43, Emmanuel Gil Peyrot wrote:
>
> On Wed, Sep 22, 2021 at 12:10:41PM +0200, Ard Biesheuvel wrote:
> > On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot wrote:
> > >
> > > This engine implements AES in CBC mode, using 128-bit keys only.  It is
> > > present on both the Wii and the Wii U, and is apparently identical in
> > > both consoles.
> > >
> > > The hardware is capable of firing an interrupt when the operation is
> > > done, but this driver currently uses a busy loop, I’m not too sure
> > > whether it would be preferable to switch, nor how to achieve that.
> > >
> > > It also supports a mode where no operation is done, and thus could be
> > > used as a DMA copy engine, but I don’t know how to expose that to the
> > > kernel or whether it would even be useful.
> > >
> > > In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> > > aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> > > speedup.
> > >
> > > This driver was written based on reversed documentation, see:
> > > https://wiibrew.org/wiki/Hardware/AES
> > >
> > > Signed-off-by: Emmanuel Gil Peyrot 
> > > Tested-by: Emmanuel Gil Peyrot   # on Wii U
> >
> > This is redundant - everybody should test the code they submit.
>
> Indeed, except for the comment, as I haven’t been able to test on the
> Wii just yet and that’s kind of a call for doing exactly that. :)
>
> >
> > ...
> > > +   /* TODO: figure out how to use interrupts here, this will probably
> > > +* lower throughput but let the CPU do other things while the AES
> > > +* engine is doing its work. */
> >
> > So is it worthwhile like this? How much faster is it to use this
> > accelerator rather than the CPU?
>
> As I mentioned above, on my hardware it reaches 80.7 MiB/s using this
> busy loop instead of 30.9 MiB/s using aes-generic, measured using
> `cryptsetup benchmark --cipher=aes --key-size=128`.  I expect the
> difference would be even more pronounced on the Wii, with its CPU being
> clocked lower.
>

Ah apologies for not spotting that. This is a nice speedup.

> I will give a try at using the interrupt, but I fully expect a lower
> throughput alongside a lower CPU usage (for large requests).
>

You should consider latency as well. Is it really necessary to disable
interrupts as well? A scheduling blackout of ~1ms (for the worst case
of 64k of input @ 80 MB/s) may be tolerable but keeping interrupts
disabled for that long is probably not a great idea. (Just make sure
you use spin_lock_bh() to prevent deadlocks that could occur if your
code is called from softirq context)

But using the interrupt is obviously preferred. What's wrong with it?

Btw the crypto API does not permit AES-128 only - you will need to add
a fallback for other key sizes as well.
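
The ~1 ms blackout estimate above can be checked with a few lines of arithmetic. This is a minimal userspace sketch, not driver code; the function name busy_wait_us() is hypothetical, and the rate is taken as 80,000,000 bytes/s (the "80 MB/s" figure, rather than the measured 80.7 MiB/s).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time, in microseconds (rounded up), that a busy-wait would hold the CPU
 * while the engine processes `len` bytes at `bytes_per_sec`.
 * Hypothetical helper for illustration only.
 */
uint64_t busy_wait_us(uint64_t len, uint64_t bytes_per_sec)
{
	assert(bytes_per_sec > 0);
	return (len * 1000000ULL + bytes_per_sec - 1) / bytes_per_sec;
}
```

For a 64 KiB request, busy_wait_us(65536, 80000000) evaluates to 820, i.e. a little under 1 ms, which matches the order of magnitude quoted for the worst case.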


> >
> > > +   do {
> > > +   status = ioread32be(base + AES_CTRL);
> > > +   cpu_relax();
> > > +   } while ((status & AES_CTRL_EXEC) && --counter);
> > > +
> > > +   /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> > > +* dst in addition to the earlier flush of src. */
> > > +   if (unlikely(dst != src)) {
> > > +   for (i = 0; i < len; i += 32)
> > > +   __asm__("dcbi 0, %0" : : "r" (dst + i));
> > > +   __asm__("sync" : : : "memory");
> > > +   }
> > > +
> > > +   return counter ? 0 : 1;
> > > +}
> > > +
> > > +static void
> > > +nintendo_aes_crypt(const void *src, void *dst, u32 len, u8 *iv, int dir,
> > > +  bool firstchunk)
> > > +{
> > > +   u32 flags = 0;
> > > +   unsigned long iflags;
> > > +   int ret;
> > > +
> > > +   flags |= AES_CTRL_EXEC_INIT /* | AES_CTRL_IRQ */ | AES_CTRL_ENA;
> > > +
> > > +   if (dir == AES_DIR_DECRYPT)
> > > +   flags |= AES_CTRL_DEC;
> > > +
> > > +   if (!firstchunk)
> > > +   flags |= AES_CTRL_IV;
> > > +
> > > +   /* Start the critical section */
> > > +   spin_lock_irqsave(, iflags);
> > > +
> > > +   if (firstchunk)
> > > +   writefield(AES_IV, iv);
> > > +
> > > +   ret = do_crypt(src, dst, len, flags);
> > > +   BUG_ON(ret);
> > > +
> > > +   spin_unlock_irqrestore(, iflags);
> > > +}
> > > +
> > > +static int nintendo_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
> > > +   unsigned int len)
> > > +{
> > > +   /* The hardware only supports AES-128 */
> > > +   if (len != AES_KEYSIZE_128)
> > > +   return -EINVAL;
> > > +
> > > +   writefield(AES_KEY, key);
> > > +   return 0;
> > > +}
> > > +
> > > +static int nintendo_skcipher_crypt(struct skcipher_request *req, int dir)
> > > +{
> > > +   struct skcipher_walk walk;
> > > +   unsigned int nbytes;
> > > +   int err;
> > > +   char ivbuf[AES_BLOCK_SIZE];
> > > +   unsigned int ivsize;
> > > +
> > > +   bool firstchunk = true;

Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-22 Thread Emmanuel Gil Peyrot
On Wed, Sep 22, 2021 at 12:10:41PM +0200, Ard Biesheuvel wrote:
> On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot wrote:
> >
> > This engine implements AES in CBC mode, using 128-bit keys only.  It is
> > present on both the Wii and the Wii U, and is apparently identical in
> > both consoles.
> >
> > The hardware is capable of firing an interrupt when the operation is
> > done, but this driver currently uses a busy loop, I’m not too sure
> > whether it would be preferable to switch, nor how to achieve that.
> >
> > It also supports a mode where no operation is done, and thus could be
> > used as a DMA copy engine, but I don’t know how to expose that to the
> > kernel or whether it would even be useful.
> >
> > In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> > aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> > speedup.
> >
> > This driver was written based on reversed documentation, see:
> > https://wiibrew.org/wiki/Hardware/AES
> >
> > Signed-off-by: Emmanuel Gil Peyrot 
> > Tested-by: Emmanuel Gil Peyrot   # on Wii U
> 
> This is redundant - everybody should test the code they submit.

Indeed, except for the comment, as I haven’t been able to test on the
Wii just yet and that’s kind of a call for doing exactly that. :)

> 
> ...
> > +   /* TODO: figure out how to use interrupts here, this will probably
> > +* lower throughput but let the CPU do other things while the AES
> > +* engine is doing its work. */
> 
> So is it worthwhile like this? How much faster is it to use this
> accelerator rather than the CPU?

As I mentioned above, on my hardware it reaches 80.7 MiB/s using this
busy loop instead of 30.9 MiB/s using aes-generic, measured using
`cryptsetup benchmark --cipher=aes --key-size=128`.  I expect the
difference would be even more pronounced on the Wii, with its CPU being
clocked lower.

I will give a try at using the interrupt, but I fully expect a lower
throughput alongside a lower CPU usage (for large requests).

> 
> > +   do {
> > +   status = ioread32be(base + AES_CTRL);
> > +   cpu_relax();
> > +   } while ((status & AES_CTRL_EXEC) && --counter);
> > +
> > +   /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> > +* dst in addition to the earlier flush of src. */
> > +   if (unlikely(dst != src)) {
> > +   for (i = 0; i < len; i += 32)
> > +   __asm__("dcbi 0, %0" : : "r" (dst + i));
> > +   __asm__("sync" : : : "memory");
> > +   }
> > +
> > +   return counter ? 0 : 1;
> > +}
> > +
> > +static void
> > +nintendo_aes_crypt(const void *src, void *dst, u32 len, u8 *iv, int dir,
> > +  bool firstchunk)
> > +{
> > +   u32 flags = 0;
> > +   unsigned long iflags;
> > +   int ret;
> > +
> > +   flags |= AES_CTRL_EXEC_INIT /* | AES_CTRL_IRQ */ | AES_CTRL_ENA;
> > +
> > +   if (dir == AES_DIR_DECRYPT)
> > +   flags |= AES_CTRL_DEC;
> > +
> > +   if (!firstchunk)
> > +   flags |= AES_CTRL_IV;
> > +
> > +   /* Start the critical section */
> > +   spin_lock_irqsave(, iflags);
> > +
> > +   if (firstchunk)
> > +   writefield(AES_IV, iv);
> > +
> > +   ret = do_crypt(src, dst, len, flags);
> > +   BUG_ON(ret);
> > +
> > +   spin_unlock_irqrestore(, iflags);
> > +}
> > +
> > +static int nintendo_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
> > +   unsigned int len)
> > +{
> > +   /* The hardware only supports AES-128 */
> > +   if (len != AES_KEYSIZE_128)
> > +   return -EINVAL;
> > +
> > +   writefield(AES_KEY, key);
> > +   return 0;
> > +}
> > +
> > +static int nintendo_skcipher_crypt(struct skcipher_request *req, int dir)
> > +{
> > +   struct skcipher_walk walk;
> > +   unsigned int nbytes;
> > +   int err;
> > +   char ivbuf[AES_BLOCK_SIZE];
> > +   unsigned int ivsize;
> > +
> > +   bool firstchunk = true;
> > +
> > +   /* Reset the engine */
> > +   iowrite32be(0, base + AES_CTRL);
> > +
> > +   err = skcipher_walk_virt(&walk, req, false);
> > +   ivsize = min(sizeof(ivbuf), walk.ivsize);
> > +
> > +   while ((nbytes = walk.nbytes) != 0) {
> > +   unsigned int chunkbytes = round_down(nbytes, AES_BLOCK_SIZE);
> > +   unsigned int ret = nbytes % AES_BLOCK_SIZE;
> > +
> > +   if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> > +   /* If this is the last chunk and we're decrypting, take
> > +* note of the IV (which is the last ciphertext block)
> > +*/
> > +   memcpy(ivbuf, walk.src.virt.addr + walk.total - ivsize,
> > +  ivsize);
> > +   }
> > +
> > +   

Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-22 Thread Ard Biesheuvel
On Tue, 21 Sept 2021 at 23:49, Emmanuel Gil Peyrot wrote:
>
> This engine implements AES in CBC mode, using 128-bit keys only.  It is
> present on both the Wii and the Wii U, and is apparently identical in
> both consoles.
>
> The hardware is capable of firing an interrupt when the operation is
> done, but this driver currently uses a busy loop, I’m not too sure
> whether it would be preferable to switch, nor how to achieve that.
>
> It also supports a mode where no operation is done, and thus could be
> used as a DMA copy engine, but I don’t know how to expose that to the
> kernel or whether it would even be useful.
>
> In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> speedup.
>
> This driver was written based on reversed documentation, see:
> https://wiibrew.org/wiki/Hardware/AES
>
> Signed-off-by: Emmanuel Gil Peyrot 
> Tested-by: Emmanuel Gil Peyrot   # on Wii U

This is redundant - everybody should test the code they submit.

...
> +   /* TODO: figure out how to use interrupts here, this will probably
> +* lower throughput but let the CPU do other things while the AES
> +* engine is doing its work. */

So is it worthwhile like this? How much faster is it to use this
accelerator rather than the CPU?

> +   do {
> +   status = ioread32be(base + AES_CTRL);
> +   cpu_relax();
> +   } while ((status & AES_CTRL_EXEC) && --counter);
> +
> +   /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> +* dst in addition to the earlier flush of src. */
> +   if (unlikely(dst != src)) {
> +   for (i = 0; i < len; i += 32)
> +   __asm__("dcbi 0, %0" : : "r" (dst + i));
> +   __asm__("sync" : : : "memory");
> +   }
> +
> +   return counter ? 0 : 1;
> +}
> +
> +static void
> +nintendo_aes_crypt(const void *src, void *dst, u32 len, u8 *iv, int dir,
> +  bool firstchunk)
> +{
> +   u32 flags = 0;
> +   unsigned long iflags;
> +   int ret;
> +
> +   flags |= AES_CTRL_EXEC_INIT /* | AES_CTRL_IRQ */ | AES_CTRL_ENA;
> +
> +   if (dir == AES_DIR_DECRYPT)
> +   flags |= AES_CTRL_DEC;
> +
> +   if (!firstchunk)
> +   flags |= AES_CTRL_IV;
> +
> +   /* Start the critical section */
> +   spin_lock_irqsave(, iflags);
> +
> +   if (firstchunk)
> +   writefield(AES_IV, iv);
> +
> +   ret = do_crypt(src, dst, len, flags);
> +   BUG_ON(ret);
> +
> +   spin_unlock_irqrestore(, iflags);
> +}
> +
> +static int nintendo_setkey_skcipher(struct crypto_skcipher *tfm, const u8 *key,
> +   unsigned int len)
> +{
> +   /* The hardware only supports AES-128 */
> +   if (len != AES_KEYSIZE_128)
> +   return -EINVAL;
> +
> +   writefield(AES_KEY, key);
> +   return 0;
> +}
> +
> +static int nintendo_skcipher_crypt(struct skcipher_request *req, int dir)
> +{
> +   struct skcipher_walk walk;
> +   unsigned int nbytes;
> +   int err;
> +   char ivbuf[AES_BLOCK_SIZE];
> +   unsigned int ivsize;
> +
> +   bool firstchunk = true;
> +
> +   /* Reset the engine */
> +   iowrite32be(0, base + AES_CTRL);
> +
> +   err = skcipher_walk_virt(&walk, req, false);
> +   ivsize = min(sizeof(ivbuf), walk.ivsize);
> +
> +   while ((nbytes = walk.nbytes) != 0) {
> +   unsigned int chunkbytes = round_down(nbytes, AES_BLOCK_SIZE);
> +   unsigned int ret = nbytes % AES_BLOCK_SIZE;
> +
> +   if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> +   /* If this is the last chunk and we're decrypting, take
> +* note of the IV (which is the last ciphertext block)
> +*/
> +   memcpy(ivbuf, walk.src.virt.addr + walk.total - ivsize,
> +  ivsize);
> +   }
> +
> +   nintendo_aes_crypt(walk.src.virt.addr, walk.dst.virt.addr,
> +  chunkbytes, walk.iv, dir, firstchunk);
> +
> +   if (walk.total == chunkbytes && dir == AES_DIR_ENCRYPT) {
> +   /* If this is the last chunk and we're encrypting, take
> +* note of the IV (which is the last ciphertext block)
> +*/
> +   memcpy(walk.iv,
> +  walk.dst.virt.addr + walk.total - ivsize,
> +  ivsize);
> +   } else if (walk.total == chunkbytes && dir == AES_DIR_DECRYPT) {
> +   memcpy(walk.iv, ivbuf, ivsize);
> +   }
> +
> +   err = skcipher_walk_done(&walk, ret);
> +   firstchunk = false;
> +   }
> +
> +   return err;
> +}
> +
> +static int nintendo_cbc_encrypt(struct 
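
The IV handling in the quoted loop relies on a property of CBC: the last ciphertext block of one chunk serves as the IV for the next, so a request processed in several chunks produces the same output as one processed in a single pass. The following is a minimal userspace sketch of that property only. The block function toy_encrypt_block() is a hypothetical stand-in and is NOT AES, and none of these names come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLK 16 /* AES block size in bytes */

/* Toy block function standing in for AES. Not a real cipher. */
static void toy_encrypt_block(uint8_t *b, const uint8_t key[BLK])
{
	for (int i = 0; i < BLK; i++)
		b[i] = (uint8_t)((b[i] ^ key[i]) + 1);
}

/*
 * CBC-encrypt buf in place. On return, iv holds the last ciphertext
 * block, ready to be used as the IV of the following chunk.
 */
void cbc_encrypt(uint8_t *buf, size_t len, const uint8_t key[BLK],
		 uint8_t iv[BLK])
{
	assert(len % BLK == 0);
	for (size_t off = 0; off < len; off += BLK) {
		for (int i = 0; i < BLK; i++)
			buf[off + i] ^= iv[i];
		toy_encrypt_block(buf + off, key);
		memcpy(iv, buf + off, BLK);
	}
}

/*
 * Encrypt the same 64-byte message once in one pass and once split at
 * `split` bytes. Returns 1 if ciphertext and final IV are identical.
 */
int chunked_matches_oneshot(size_t split)
{
	uint8_t key[BLK], iv1[BLK], iv2[BLK], a[4 * BLK], b[4 * BLK];

	assert(split % BLK == 0 && split <= sizeof(b));
	for (int i = 0; i < BLK; i++) {
		key[i] = (uint8_t)(i * 7 + 3);
		iv1[i] = iv2[i] = (uint8_t)(0xA5 ^ i);
	}
	for (size_t i = 0; i < sizeof(a); i++)
		a[i] = b[i] = (uint8_t)(i * 13);

	cbc_encrypt(a, sizeof(a), key, iv1);          /* one pass */
	cbc_encrypt(b, split, key, iv2);              /* first chunk */
	cbc_encrypt(b + split, sizeof(b) - split, key, iv2); /* rest */

	return memcmp(a, b, sizeof(a)) == 0 && memcmp(iv1, iv2, BLK) == 0;
}
```

Calling chunked_matches_oneshot() with any block-aligned split point returns 1, which is why the driver only needs to carry the last ciphertext block between chunks.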

Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-22 Thread Corentin Labbe
On Tue, Sep 21, 2021 at 11:39:27PM +0200, Emmanuel Gil Peyrot wrote:
> This engine implements AES in CBC mode, using 128-bit keys only.  It is
> present on both the Wii and the Wii U, and is apparently identical in
> both consoles.
> 
> The hardware is capable of firing an interrupt when the operation is
> done, but this driver currently uses a busy loop, I’m not too sure
> whether it would be preferable to switch, nor how to achieve that.
> 
> It also supports a mode where no operation is done, and thus could be
> used as a DMA copy engine, but I don’t know how to expose that to the
> kernel or whether it would even be useful.
> 
> In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> speedup.
> 
> This driver was written based on reversed documentation, see:
> https://wiibrew.org/wiki/Hardware/AES
> 
> Signed-off-by: Emmanuel Gil Peyrot 
> Tested-by: Emmanuel Gil Peyrot   # on Wii U

[...]

> +static int
> +do_crypt(const void *src, void *dst, u32 len, u32 flags)
> +{
> + u32 blocks = ((len >> 4) - 1) & AES_CTRL_BLOCK;
> + u32 status;
> + u32 counter = OP_TIMEOUT;
> + u32 i;
> +
> + /* Flush out all of src, we can’t know whether any of it is in cache */
> + for (i = 0; i < len; i += 32)
> + __asm__("dcbf 0, %0" : : "r" (src + i));
> + __asm__("sync" : : : "memory");
> +
> + /* Set the addresses for DMA */
> + iowrite32be(virt_to_phys((void *)src), base + AES_SRC);
> + iowrite32be(virt_to_phys(dst), base + AES_DEST);

Hello

Since you are doing DMA, I think you should use the DMA API and call
dma_map_xxx().
This would avoid the need for __asm__ and virt_to_phys().

Regards


Re: [PATCH 1/4] crypto: nintendo-aes - add a new AES driver

2021-09-21 Thread Joel Stanley
On Tue, 21 Sept 2021 at 21:47, Emmanuel Gil Peyrot wrote:
>
> This engine implements AES in CBC mode, using 128-bit keys only.  It is
> present on both the Wii and the Wii U, and is apparently identical in
> both consoles.
>
> The hardware is capable of firing an interrupt when the operation is
> done, but this driver currently uses a busy loop, I’m not too sure
> whether it would be preferable to switch, nor how to achieve that.
>
> It also supports a mode where no operation is done, and thus could be
> used as a DMA copy engine, but I don’t know how to expose that to the
> kernel or whether it would even be useful.
>
> In my testing, on a Wii U, this driver reaches 80.7 MiB/s, while the
> aes-generic driver only reaches 30.9 MiB/s, so it is a quite welcome
> speedup.
>
> This driver was written based on reversed documentation, see:
> https://wiibrew.org/wiki/Hardware/AES
>
> Signed-off-by: Emmanuel Gil Peyrot 
> Tested-by: Emmanuel Gil Peyrot   # on Wii U
> ---
>  drivers/crypto/Kconfig        |  11 ++
>  drivers/crypto/Makefile       |   1 +
>  drivers/crypto/nintendo-aes.c | 273 ++
>  3 files changed, 285 insertions(+)
>  create mode 100644 drivers/crypto/nintendo-aes.c
>
> diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
> index 9a4c275a1335..adc94ad7462d 100644
> --- a/drivers/crypto/Kconfig
> +++ b/drivers/crypto/Kconfig
> @@ -871,4 +871,15 @@ config CRYPTO_DEV_SA2UL
>
>  source "drivers/crypto/keembay/Kconfig"
>
> +config CRYPTO_DEV_NINTENDO
> +   tristate "Support for the Nintendo Wii U AES engine"
> +   depends on WII || WIIU || COMPILE_TEST

This current setup will allow the driver to be compile-tested for
non-powerpc, which will fail on the dcbf instructions.

Perhaps use this instead:

   depends on WII || WIIU || (COMPILE_TEST && PPC)

> +   select CRYPTO_AES
> +   help
> + Say 'Y' here to use the Nintendo Wii or Wii U on-board AES
> + engine for the CryptoAPI AES algorithm.
> +
> + To compile this driver as a module, choose M here: the module
> + will be called nintendo-aes.
> +
>  endif # CRYPTO_HW
> diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
> index fa22cb19e242..004dae7bbf39 100644
> --- a/drivers/crypto/Makefile
> +++ b/drivers/crypto/Makefile
> @@ -22,6 +22,7 @@ obj-$(CONFIG_CRYPTO_DEV_MARVELL) += marvell/
>  obj-$(CONFIG_CRYPTO_DEV_MXS_DCP) += mxs-dcp.o
>  obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
>  n2_crypto-y := n2_core.o n2_asm.o
> +obj-$(CONFIG_CRYPTO_DEV_NINTENDO) += nintendo-aes.o
>  obj-$(CONFIG_CRYPTO_DEV_NX) += nx/
>  obj-$(CONFIG_CRYPTO_DEV_OMAP) += omap-crypto.o
>  obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes-driver.o
> diff --git a/drivers/crypto/nintendo-aes.c b/drivers/crypto/nintendo-aes.c
> new file mode 100644
> index 000000000000..79ae77500999
> --- /dev/null
> +++ b/drivers/crypto/nintendo-aes.c
> @@ -0,0 +1,273 @@
> +/*
> + * Copyright (C) 2021 Emmanuel Gil Peyrot 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 and
> + * only version 2 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.

The kernel uses an SPDX header instead of pasting the license text.

> +static int
> +do_crypt(const void *src, void *dst, u32 len, u32 flags)
> +{
> +   u32 blocks = ((len >> 4) - 1) & AES_CTRL_BLOCK;
> +   u32 status;
> +   u32 counter = OP_TIMEOUT;
> +   u32 i;
> +
> +   /* Flush out all of src, we can’t know whether any of it is in cache */
> +   for (i = 0; i < len; i += 32)
> +   __asm__("dcbf 0, %0" : : "r" (src + i));
> +   __asm__("sync" : : : "memory");

This could be flush_dcache_range, from asm/cacheflush.h

> +
> +   /* Set the addresses for DMA */
> +   iowrite32be(virt_to_phys((void *)src), base + AES_SRC);
> +   iowrite32be(virt_to_phys(dst), base + AES_DEST);
> +
> +   /* Start the operation */
> +   iowrite32be(flags | blocks, base + AES_CTRL);
> +
> +   /* TODO: figure out how to use interrupts here, this will probably
> +* lower throughput but let the CPU do other things while the AES
> +* engine is doing its work. */
> +   do {
> +   status = ioread32be(base + AES_CTRL);
> +   cpu_relax();
> +   } while ((status & AES_CTRL_EXEC) && --counter);

You could add a msleep in here?

Consider using readl_poll_timeout().
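
For reference, the shape of a bounded poll can be sketched in plain userspace C as below. This is only an illustration of the pattern, not the kernel helper: poll_until_clear(), fake_read_ctrl(), demo_poll() and FAKE_CTRL_EXEC are all hypothetical names, and the fake register exists purely so the loop can be exercised without hardware. In the kernel, linux/iopoll.h provides readl_poll_timeout(); since the driver reads the register with ioread32be(), the generic readx_poll_timeout() form may be the closer fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical busy bit, standing in for the driver's AES_CTRL_EXEC. */
#define FAKE_CTRL_EXEC 0x80000000u

/* Callback that returns the current value of a control register. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & busy_mask) is clear, giving up after max_polls reads.
 * Returns 0 when the bit cleared, -1 on timeout.
 */
int poll_until_clear(read_reg_fn read_reg, void *ctx,
		     uint32_t busy_mask, uint32_t max_polls)
{
	assert(read_reg != NULL);
	while (max_polls--) {
		if (!(read_reg(ctx) & busy_mask))
			return 0;
	}
	return -1;
}

/* Fake engine for the demo: reports busy for a fixed number of reads. */
struct fake_engine {
	uint32_t busy_reads;
};

uint32_t fake_read_ctrl(void *ctx)
{
	struct fake_engine *e = ctx;

	if (e->busy_reads) {
		e->busy_reads--;
		return FAKE_CTRL_EXEC;
	}
	return 0;
}

/* Run one poll against a fake engine that stays busy for busy_reads reads. */
int demo_poll(uint32_t busy_reads, uint32_t max_polls)
{
	struct fake_engine e = { .busy_reads = busy_reads };

	return poll_until_clear(fake_read_ctrl, &e, FAKE_CTRL_EXEC, max_polls);
}
```

Returning a negative value on timeout lets the caller propagate an error instead of hitting BUG_ON(), which is the main behavioural difference from the loop in the patch.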

Cheers,

Joel

> +
> +   /* Do we ever get called with dst ≠ src?  If so we have to invalidate
> +* dst in addition to the earlier flush of src. */
> +   if (unlikely(dst != src)) {
> +   for (i = 0; i < len; i += 32)
> +