Re: [RFC PATCH 2/2] Crypto kernel tls socket

2015-11-24 Thread Phil Sutter
Hi,

On Tue, Nov 24, 2015 at 12:20:00PM +0100, Hannes Frederic Sowa wrote:
> Stephan Mueller  writes:
> 
> > Am Dienstag, 24. November 2015, 18:34:55 schrieb Herbert Xu:
> >
> > Hi Herbert,
> >
> >>On Mon, Nov 23, 2015 at 09:43:02AM -0800, Dave Watson wrote:
> >>> Userspace crypto interface for TLS.  Currently supports gcm(aes) 128bit
> >>> only, however the interface is the same as the rest of the SOCK_ALG
> >>> interface, so it should be possible to add more without any user interface
> >>> changes.
> >>
> >>SOCK_ALG exists to export crypto algorithms to user-space.  So if
> >>we decided to support TLS as an algorithm then I guess this makes
> >>sense.
> >>
> >>However, I must say that it wouldn't have been my first pick.  I'd
> >>imagine a TLS socket to look more like a TCP socket, or perhaps a
> >>KCM socket as proposed by Tom.
> >
> > If I may ask: what is the benefit of having TLS in kernel space? I do not
> > see any reason why higher-level protocols should be in the kernel as they
> > do not relate to accessing hardware.
> 
> There are some crypto accelerators out there, so putting TLS into the
> kernel would give a net benefit, because otherwise user space has to
> copy data into the kernel for device access and back to user space until
> it can finally be sent out on the wire.
> 
> Since processors provide AES-NI and other crypto extensions as part of
> their instruction set architecture, this, of course, does not make sense
> any more.

There "still" are dedicated crypto engines out there which need a driver
to be accessed, so using them from userspace is not as simple as with
Padlock or AES-NI. This was the reasoning behind the various cryptodev
implementations and af_alg. Using those to establish a TLS connection
with OpenSSL means fetching the encrypted data to userspace first and
then feeding it back to the kernel for decryption. With cryptodev-linux
this is zero-copy, but there is still an additional context switch
involved, which the approach here avoids.
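
As a rough illustration of that round-trip, here is a minimal userspace
sketch using the existing AF_ALG interface; the "cbc(aes)" algorithm, the
16-byte key length and the omitted IV/error handling are placeholders for
illustration, not taken from the patch under discussion:

#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/* decrypt len bytes that userspace already pulled off its TCP connection */
static int alg_decrypt(const void *in, void *out, size_t len,
		       const unsigned char key[16])
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type   = "skcipher",
		.salg_name   = "cbc(aes)",	/* placeholder cipher */
	};
	char cbuf[CMSG_SPACE(sizeof(__u32))] = { 0 };
	struct iovec iov = { .iov_base = (void *)in, .iov_len = len };
	struct msghdr msg = {
		.msg_control	= cbuf,
		.msg_controllen	= sizeof(cbuf),
		.msg_iov	= &iov,
		.msg_iovlen	= 1,
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
	int tfm, op;

	tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(tfm, (struct sockaddr *)&sa, sizeof(sa));
	setsockopt(tfm, SOL_ALG, ALG_SET_KEY, key, 16);
	op = accept(tfm, NULL, 0);

	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type  = ALG_SET_OP;
	cmsg->cmsg_len   = CMSG_LEN(sizeof(__u32));
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_DECRYPT;

	/* the ciphertext crosses into the kernel here ... */
	sendmsg(op, &msg, 0);
	/* ... and the plaintext has to be copied back out again */
	read(op, out, len);

	close(op);
	close(tfm);
	return 0;
}

Every request pays for those two crossings plus the scheduling in between,
which is the kind of overhead an in-kernel TLS record layer could avoid.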

Cheers, Phil


Re: [PATCH 1/2] [v3] crypto: sha1/ARM: make use of common SHA-1 structures

2014-07-01 Thread Phil Sutter
Hi,

On Mon, Jun 30, 2014 at 07:38:46PM +0300, Jussi Kivilinna wrote:
 Common SHA-1 structures are defined in crypto/sha.h for code sharing.
 
 This patch changes SHA-1/ARM glue code to use these structures.

I find it worth noting that this patch also fixes mv_cesa when sha1-arm is
enabled as well. This is because software SHA1 is used as a fallback,
accessing the context's data, which was laid out differently before this
patch.

I have not checked, but other crypto engine drivers might be affected by
that, as well.
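
For reference, the shared layout in question looks roughly like this (as
defined in crypto/sha.h around that time; reproduced from memory, so the
header remains the authoritative definition):

/* crypto/sha.h (approximate): SHA1_DIGEST_SIZE is 20, SHA1_BLOCK_SIZE is 64 */
struct sha1_state {
	u32 state[SHA1_DIGEST_SIZE / 4];	/* five words of intermediate digest */
	u64 count;				/* number of bytes hashed so far */
	u8 buffer[SHA1_BLOCK_SIZE];		/* pending partial block */
};

Any driver that hands its request context to the software implementation as
a fallback has to agree on exactly this layout, which is presumably why
mv_cesa broke while the ARM glue code still used a private structure.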

 Acked-by: Ard Biesheuvel ard.biesheu...@linaro.org
 Signed-off-by: Jussi Kivilinna jussi.kivili...@iki.fi

Acked-by: Phil Sutter p...@nwl.cc

Best wishes, Phil


Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-10-23 Thread Phil Sutter
Hey,

On Tue, Jul 31, 2012 at 08:12:02PM +0800, cloudy.linux wrote:
 Sorry for taking so long time to try the latest code. Just came back 
 from a vacation and tried several days to get a tight sleep.

My apologies for the roughly three month lag. Somehow I had totally forgotten
about your mail in my inbox and only recently found it again.

Luckily, I received testing hardware from a colleague a few days ago on
which I can reproduce the problems at hand. So for now, I can do the
testing on my own. Thanks a lot for yours!

I'll get back to you (probably via linux-crypto) as soon as I have some
useful progress. It could be that I have to implement bigger changes in the
code flow for Orion, as IDMA seems to lack the "Enhanced Software Flow"
functionality (as the Kirkwood datasheet calls it) that I am relying upon
in my current code. We will see.

Best wishes, Phil


Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-07-16 Thread Phil Sutter
Hey Andrew,

On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
 I've not been following this thread too closely..
 
 Do you have any patches you want included into mainline?

No need to fix anything mainline, he's just testing my RFC-state DMA
engine add-on to MV_CESA. The current state is failing operation on
IDMA-based machines due to errors in the hardware configuration I have not
been able to track down yet. On Kirkwood (i.e. TDMA), the only hardware
I have access to, the same code runs fine.

To be honest, I am not sure why he decided to put you on Cc in the first
place.

Greetings, Phil




Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-07-16 Thread Phil Sutter
On Mon, Jul 16, 2012 at 04:03:44PM +0200, Andrew Lunn wrote:
 On Mon, Jul 16, 2012 at 03:52:16PM +0200, Phil Sutter wrote:
  Hey Andrew,
  
  On Mon, Jul 16, 2012 at 11:32:25AM +0200, Andrew Lunn wrote:
   I've not been following this thread too closely..
   
   Do you have any patches you want included into mainline?
  
  No need to fix anything mainline
 
 O.K. I thought there was a problem with user space using it, some
 flushes missing somewhere? Or VM mapping problem? Has that been fixed?

Hmm, there was some discussion about an issue like that on this list at the
end of February/beginning of March, which was resolved then. On the other
hand, there is an unanswered mail from Cloudy dated 20 April about a
failing kernel hash test. Maybe he can elaborate on this?

Greetings, Phil




Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-07-09 Thread Phil Sutter
Hi,

On Sun, Jul 08, 2012 at 01:38:47PM +0800, cloudy.linux wrote:
 Newest result. Still couldn't boot up. This time the source was cloned 
 from your git repository.
 
 MV-DMA: window at bar0: target 0, attr 14, base 0, size 800
 MV-DMA: window at bar1: target 5, attr 0, base f220, size 1
 MV-DMA: IDMA engine up and running, IRQ 23
 MV-DMA: idma_print_and_clear_irq: address miss @0!
 MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x0
 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x0
 MV-DMA: tpg.reg + DMA_DST_ADDR = 0x0
 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1000
 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x0
 MV-DMA: DMA descriptor list:
 MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 
 0xf2200080, count 16, own 1, next 0x79b1010
 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 
 0xf220, count 80, own 1, next 0x79b1020
 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, 
 count 0, own 0, next 0x79b1030
 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 
 0x79b4000, count 16, own 1, next 0x0
 MV-CESA:got an interrupt but no pending timer?

That sucks. What makes me wonder here is that address decoding of address 0x0
actually shouldn't fail, since window 0 includes this address.

For now, I have pushed two new commits to my public git: they add more
debugging output for the decoding window logic and the interrupt case, fix
the decoding window permissions, and switch from FETCH_ND to programming
the first DMA descriptor's values manually.

In the long term, I should probably try to get access to some appropriate
hardware myself. This is more of a guessing game than actual bug tracking.

Greetings, Phil




Re: [PATCH 0/1] MV_CESA with DMA: Clk init fixes

2012-07-06 Thread Phil Sutter
Hi Simon,

On Tue, Jun 26, 2012 at 10:31:51PM +0200, Simon Baatz wrote:
 I just found the time to test your updates. Alas, the mv_dma module
 hangs at boot again.  The culprit seems to be setup_mbus_windows(),
 which is called before the clock is turned on but accesses the DMA
 engine.
 
 I shifted the clock init code a bit and while doing so, fixed some error
 case handling for mv_dma and mv_cesa.  See proposed patch in next mail.

I applied that to my public git, thanks a lot!

Greetings, Phil




Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-07-06 Thread Phil Sutter
Hi Cloudy,

On Sat, Jun 30, 2012 at 03:35:48PM +0800, cloudy.linux wrote:
 Although I had no idea about what's wrong, I looked in the functional 
 errata (again), And I found what's attached (The doc I got from Internet 
 was a protected PDF, that's why I had to use screen capture).
 Is this relevant? Or maybe you have already addressed this in the code 
 (I can just read some simple C code)?

To me, this doesn't read like a real problem, just a guideline for doing
things. From the output you sent me in your previous mail, I'd rather
suspect fetching the first descriptor to be faulty: the next descriptor
pointer contains the first descriptor's DMA address, all other fields
are zero (this is the situation when triggering the engine; on Kirkwood
all I have to do is fill in the first descriptor's address and TDMA does
the rest), and IDMA raises an address miss interrupt at address 0x0. So
probably IDMA starts up and tries to look up decoding windows for the
still-zero source and destination addresses.

According to the specs, when using the next descriptor field for fetching
the first descriptor, one also has to set the FETCH_ND field in the
DMA_CTRL register, for TDMA as well. Though, on my hardware the only
working configuration is the implemented one, i.e. without FETCH_ND
being set.

I have implemented a separate approach just for IDMA, which, instead of
just writing the first descriptor's address to NEXT_DESC, does:
1. clear the CTRL_ENABLE bit
2. fill NEXT_DESC
3. set CTRL_ENABLE along with FETCH_ND
Hopefully this is the way to go on Orion. Since Marvell's BSP doesn't
implement *DMA attached to CESA, I have nowhere to look this up. Getting
it right for TDMA was just a matter of trial and error.
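
In code, that sequence would look roughly like this (just a sketch: DMA_CTRL
and DMA_NEXT_DESC follow the register names in the debug output above, while
the CTRL_ENABLE and FETCH_ND bit macros are placeholders, not taken from
mv_dma.h):

static void idma_trigger_first_desc(void __iomem *regs, dma_addr_t first_desc)
{
	u32 ctrl = readl(regs + DMA_CTRL);

	/* 1. make sure the engine is stopped before touching NEXT_DESC */
	writel(ctrl & ~DMA_CTRL_ENABLE, regs + DMA_CTRL);

	/* 2. point the engine at the first descriptor of the chain */
	writel(first_desc, regs + DMA_NEXT_DESC);

	/* 3. re-enable it with FETCH_ND set, so it fetches that descriptor itself */
	writel(ctrl | DMA_CTRL_ENABLE | DMA_CTRL_FETCH_ND, regs + DMA_CTRL);
}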

My public git got a few updates, including the code described above.
Would be great if you could give it a try.

Greetings, Phil





Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-25 Thread Phil Sutter
Hi,

On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
 PS: I am currently working at the address decoding problem, will get
 back to in a few days when I have something to test. So stay tuned!

I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
code setting the decoding windows. I hope this fixes the issues on
Orion. I decided not to publish the changes regarding the second DMA
channel for now, as this seems to be support for a second crypto session
(handled consecutively, so no real improvement), which is not supported
anyway.

Greetings, Phil




Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-25 Thread Phil Sutter
Hi,

On Mon, Jun 25, 2012 at 10:25:01PM +0800, cloudy.linux wrote:
 On 2012-6-25 21:40, Phil Sutter wrote:
  Hi,
 
  On Wed, Jun 20, 2012 at 05:41:31PM +0200, Phil Sutter wrote:
  PS: I am currently working at the address decoding problem, will get
  back to in a few days when I have something to test. So stay tuned!
 
  I have updated the cesa-dma branch at git://nwl.cc/~n0-1/linux.git with
  code setting the decoding windows. I hope this fixes the issues on
  orion. I decided not to publish the changes regarding the second DMA
  channel for now, as this seems to be support for a second crypto session
  (handled consecutively, so no real improvement) which is not supported
  anyway.
 
  Greetings, Phil
 
 
  Phil Sutter
  Software Engineer
 
 
 Thanks Phil. I'm cloning your git now but the speed is really slow. Last 
 time I tried to do this but had to cancel after hours of downloading (at 
 only about 20% progress). So the previous tests were actually done with 
 3.5-rc3 (I tried the up-to-date Linus' linux-git, but met compiling 
 problem), of course with your patch and Simon's. Could you provide a 
 diff based on your last round patch (diff to the not patched kernel 
 should also be good, I think)?
 
 In the mean time, I'm still trying with a cloning speed of 5KiB/s ...

Ugh, that's horrible. No idea what's going wrong there, and I have no access
to the management interface right now. In the meantime, please refer to
the attached patch. It is based on 94fa83c in Linus' git but should apply
cleanly to its current HEAD, too.

Greetings, Phil


diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 25fb3fd..a011b93 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -232,6 +232,7 @@ void __init kirkwood_clk_init(void)
 	orion_clkdev_add(NULL, "orion-ehci.0", usb0);
 	orion_clkdev_add(NULL, "orion_nand", runit);
 	orion_clkdev_add(NULL, "mvsdio", sdio);
+	orion_clkdev_add(NULL, "mv_tdma", crypto);
 	orion_clkdev_add(NULL, "mv_crypto", crypto);
 	orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".0", xor0);
 	orion_clkdev_add(NULL, MV_XOR_SHARED_NAME ".1", xor1);
@@ -426,8 +427,41 @@ void __init kirkwood_uart1_init(void)
 /*****************************************************************************
  * Cryptographic Engines and Security Accelerator (CESA)
  ****************************************************************************/
+static struct resource kirkwood_tdma_res[] = {
+	{
+		.name	= "regs deco",
+		.start	= CRYPTO_PHYS_BASE + 0xA00,
+		.end	= CRYPTO_PHYS_BASE + 0xA24,
+		.flags	= IORESOURCE_MEM,
+	}, {
+		.name	= "regs control and error",
+		.start	= CRYPTO_PHYS_BASE + 0x800,
+		.end	= CRYPTO_PHYS_BASE + 0x8CF,
+		.flags	= IORESOURCE_MEM,
+	}, {
+		.name	= "crypto error",
+		.start	= IRQ_KIRKWOOD_TDMA_ERR,
+		.end	= IRQ_KIRKWOOD_TDMA_ERR,
+		.flags	= IORESOURCE_IRQ,
+	},
+};
+
+static u64 mv_tdma_dma_mask = DMA_BIT_MASK(32);
+
+static struct platform_device kirkwood_tdma_device = {
+	.name		= "mv_tdma",
+	.id		= -1,
+	.dev		= {
+		.dma_mask		= &mv_tdma_dma_mask,
+		.coherent_dma_mask	= DMA_BIT_MASK(32),
+	},
+	.num_resources	= ARRAY_SIZE(kirkwood_tdma_res),
+	.resource	= kirkwood_tdma_res,
+};
+
 void __init kirkwood_crypto_init(void)
 {
+	platform_device_register(&kirkwood_tdma_device);
 	orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE,
 			  KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO);
 }
diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h b/arch/arm/mach-kirkwood/include/mach/irqs.h
index 2bf8161..a66aa3f 100644
--- a/arch/arm/mach-kirkwood/include/mach/irqs.h
+++ b/arch/arm/mach-kirkwood/include/mach/irqs.h
@@ -51,6 +51,7 @@
 #define IRQ_KIRKWOOD_GPIO_HIGH_16_23	41
 #define IRQ_KIRKWOOD_GE00_ERR	46
 #define IRQ_KIRKWOOD_GE01_ERR	47
+#define IRQ_KIRKWOOD_TDMA_ERR	49
 #define IRQ_KIRKWOOD_RTC	53
 
 /*
diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c
index 9148b22..4734231 100644
--- a/arch/arm/mach-orion5x/common.c
+++ b/arch/arm/mach-orion5x/common.c
@@ -181,9 +181,49 @@ void __init orion5x_xor_init(void)
 /*****************************************************************************
  * Cryptographic

Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-25 Thread Phil Sutter
Hi,

On Tue, Jun 26, 2012 at 12:05:55AM +0800, cloudy.linux wrote:
 This time the machine can't finish the boot again and the console was 
 flooded by the message like below:

Oh well. I decided to drop that BUG_ON() again, since I once saw it
being triggered while in interrupt context. But since the error is
non-recoverable anyway, I guess it may stay there as well.

 Also, I had to make some modifications to the 
 arch/arm/mach-orion5x/common.c to let it compile successfully:
 1 Add including of mv_dma.h
 2 Add macro to define TARGET_SRAM as 9 (which is defined in addr-map.c, 
 so I think the clean solution should be modify the addr-map.h? Anyway, 
 as a quick solution the source finally got compiled)

Hmm, yeah. Test-compiling for the platform one is writing code for is
still a good idea. But it's even worse than that: according to the
specs, for IDMA the SRAM target ID is 5, not 9 like it is for the CPU.

Please apply the attached patch on top of the one I sent earlier,
without your modifications (the necessary parts are contained in it).
Also, I've added some log output to the decode window setter, so we see
what's going on there.

Anyway, thanks a lot for your help so far! I hope the next try shows at
least some progress.

Greetings, Phil


diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c
index 4734231..45450af 100644
--- a/arch/arm/mach-orion5x/common.c
+++ b/arch/arm/mach-orion5x/common.c
@@ -35,6 +35,7 @@
 #include <plat/time.h>
 #include <plat/common.h>
 #include <plat/addr-map.h>
+#include <plat/mv_dma.h>
 #include "common.h"
 
 /*
@@ -203,7 +204,7 @@ static struct resource orion_idma_res[] = {
 static u64 mv_idma_dma_mask = DMA_BIT_MASK(32);
 
 static struct mv_dma_pdata mv_idma_pdata = {
-	.sram_target_id	= TARGET_SRAM,
+	.sram_target_id	= 5,
 	.sram_attr	= 0,
 	.sram_base	= ORION5X_SRAM_PHYS_BASE,
 };
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index b75fdf5..4f48c63 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -195,6 +195,7 @@ static void mv_completion_timer_callback(unsigned long unused)
 	if (count < 0) {
 		printk(KERN_ERR MV_CESA
 		       "%s: engine reset timed out!\n", __func__);
+		BUG();
 	}
 	cpg->eng_st = ENGINE_W_DEQUEUE;
 	wake_up_process(cpg->queue_th);
diff --git a/drivers/crypto/mv_dma.c b/drivers/crypto/mv_dma.c
index dd1ce02..176976e 100644
--- a/drivers/crypto/mv_dma.c
+++ b/drivers/crypto/mv_dma.c
@@ -318,6 +318,8 @@ static void setup_mbus_windows(void __iomem *regs, struct mv_dma_pdata *pdata,
 	for (chan = 0; chan < dram->num_cs; chan++) {
 		const struct mbus_dram_window *cs = &dram->cs[chan];
 
+		printk(KERN_INFO MV_DMA "window at bar%d: target %d, attr %d, base %x, size %x\n",
+				chan, dram->mbus_dram_target_id, cs->mbus_attr, cs->base, cs->size);
 		(*win_setter)(regs, chan, dram->mbus_dram_target_id,
 				cs->mbus_attr, cs->base, cs->size);
 	}
@@ -330,6 +332,8 @@ static void setup_mbus_windows(void __iomem *regs, struct mv_dma_pdata *pdata,
 	 * Size is in 64k granularity, max SRAM size is 8k -
 	 * so a single unit easily suffices.
 	 */
+	printk(KERN_INFO MV_DMA "window at bar%d: target %d, attr %d, base %x, size %x\n",
+			chan, pdata->sram_target_id, pdata->sram_attr, pdata->sram_base, 1 << 16);
 	(*win_setter)(regs, chan, pdata->sram_target_id,
 			pdata->sram_attr, pdata->sram_base, 1 << 16);
 }


Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-20 Thread Phil Sutter
Hi Cloudy,

On Wed, Jun 20, 2012 at 09:31:10PM +0800, cloudy.linux wrote:
 On 2012-6-19 4:12, Simon Baatz wrote:
  I see one effect that I don't fully understand.
  Similar to the previous implementation, the system is mostly in
  kernel space when accessing an encrypted dm-crypt device:
 
 Today I also compiled the patched 3.5.0-rc3 for another NAS box with 
 MV88F6282-Rev-A0 (LS-WVL), I noticed one thing that when the CESA engine 
 was used, the interrupt number of mv_crypto kept rising, but the 
 interrupt number of mv_tdma was always zero.

Yes, that is exactly how it should be: the DMA engine is configured to
run attached to CESA, meaning that when CESA is triggered from
mv_cesa.c, it first enables the DMA engine. Using a special descriptor
in the chain, the DMA engine knows when to stop and signals CESA again
so it can start the crypto operation. Afterwards, CESA triggers the DMA
engine again to copy back the results (or, more specifically, to process
the remaining descriptors in the chain after the special one). After a
descriptor whose next-descriptor field is zero has been handled, CESA is
signaled again, which in turn generates the interrupt to notify the
software. So no DMA interrupt is needed, and of course no software
interaction in between data copying and the crypto operation. :)
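
To sketch what such a chain looks like (field names follow the MV-DMA debug
dumps seen earlier in this thread; the struct itself is illustrative, not
copied from mv_dma.h):

/* one entry of the hardware descriptor chain, roughly */
struct mv_dma_desc {
	u32 count;	/* bytes to copy; a zero-length entry acts as the separator */
	u32 src;	/* DMA address to read from */
	u32 dst;	/* DMA address to write to */
	u32 next;	/* DMA address of the next entry; 0 terminates the chain */
};

/*
 * Chain built for a single crypto request:
 *
 *   copy config/key/IV to SRAM -> copy input data to SRAM
 *     -> separator (hand over to CESA, which runs the crypto operation)
 *     -> copy results back from SRAM -> next == 0 (CESA raises its IRQ)
 *
 * Only that final CESA interrupt is visible to software, which is why the
 * mv_crypto interrupt count rises while the mv_tdma one stays at zero.
 */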

Greetings, Phil

PS: I am currently working at the address decoding problem, will get
back to in a few days when I have something to test. So stay tuned!



Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-19 Thread Phil Sutter
Hi Simon,

On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote:
 On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
  On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
   thanks for providing these patches; it's great to finally see DMA
   support for CESA in the kernel. Additionally, the implementation seems
   to be fine regarding cache incoherencies (at least my test in [*]
   works).
  
  Thanks for testing and the fixes. Could you also specify the platform
  you are testing on?
 
 This is a Marvell Kirkwood MV88F6281-A1. 

OK, thanks. Just wanted to be sure it's not already the Orion test I'm
hoping for. :)

 I see one effect that I don't fully understand. 
 Similar to the previous implementation, the system is mostly in
 kernel space when accessing an encrypted dm-crypt device:
 
 # cryptsetup --cipher=aes-cbc-plain --key-size=128 create c_sda2 /dev/sda2 
 Enter passphrase: 
 # dd if=/dev/mapper/c_sda2 of=/dev/null bs=64k count=2048
 2048+0 records in
 2048+0 records out
 134217728 bytes (134 MB) copied, 10.7324 s, 12.5 MB/s
 
 Doing an mpstat 1 at the same time gives:
 
 21:21:42     CPU    %usr   %nice    %sys %iowait    %irq   %soft ...
 21:21:45     all    0.00    0.00    0.00    0.00    0.00    0.00
 21:21:46     all    0.00    0.00   79.00    0.00    0.00    2.00
 21:21:47     all    0.00    0.00   95.00    0.00    0.00    5.00
 21:21:48     all    0.00    0.00   94.00    0.00    0.00    6.00
 21:21:49     all    0.00    0.00   96.00    0.00    0.00    4.00
 ...
 
 The underlying device is a SATA drive and should not be the limit:
 
 # dd if=/dev/sda2 of=/dev/null bs=64k count=2048
 2048+0 records in
 2048+0 records out
 134217728 bytes (134 MB) copied, 1.79804 s, 74.6 MB/s
 
 I did not dare hope the DMA implementation to be much faster than the
 old one, but I would have expected a rather low CPU usage using DMA. 
 Do you have an idea where the kernel spends its time?  (Am I hitting
 a non/only partially accelerated path here?)

Hmm. Though you passed bs=64k to dd, block sizes may still be the
bottleneck. I have no idea whether the parameter is really passed down to
dm-crypt or whether it uses the underlying device's block size anyway. I
just did a short speed test on the 2.6.39.2 we're using in production:

| Testing AES-128-CBC cipher:
|  Encrypting in chunks of 512 bytes: done. 46.19 MB in 5.00 secs: 9.24 MB/sec
|  Encrypting in chunks of 1024 bytes: done. 81.82 MB in 5.00 secs: 16.36 MB/sec
|  Encrypting in chunks of 2048 bytes: done. 124.63 MB in 5.00 secs: 24.93 MB/sec
|  Encrypting in chunks of 4096 bytes: done. 162.88 MB in 5.00 secs: 32.58 MB/sec
|  Encrypting in chunks of 8192 bytes: done. 200.47 MB in 5.00 secs: 40.09 MB/sec
|  Encrypting in chunks of 16384 bytes: done. 226.61 MB in 5.00 secs: 45.32 MB/sec
|  Encrypting in chunks of 32768 bytes: done. 242.78 MB in 5.00 secs: 48.55 MB/sec
|  Encrypting in chunks of 65536 bytes: done. 251.85 MB in 5.00 secs: 50.36 MB/sec
|
| Testing AES-256-CBC cipher:
|  Encrypting in chunks of 512 bytes: done. 45.15 MB in 5.00 secs: 9.03 MB/sec
|  Encrypting in chunks of 1024 bytes: done. 78.72 MB in 5.00 secs: 15.74 MB/sec
|  Encrypting in chunks of 2048 bytes: done. 117.59 MB in 5.00 secs: 23.52 MB/sec
|  Encrypting in chunks of 4096 bytes: done. 151.59 MB in 5.00 secs: 30.32 MB/sec
|  Encrypting in chunks of 8192 bytes: done. 182.95 MB in 5.00 secs: 36.59 MB/sec
|  Encrypting in chunks of 16384 bytes: done. 204.00 MB in 5.00 secs: 40.80 MB/sec
|  Encrypting in chunks of 32768 bytes: done. 216.17 MB in 5.00 secs: 43.23 MB/sec
|  Encrypting in chunks of 65536 bytes: done. 223.22 MB in 5.00 secs: 44.64 MB/sec

Observing top while it was running revealed that system load was
decreasing with increasing block sizes - ~75% at 512B, ~20% at 32kB. I
fear this is a limitation we have to live with: the overhead of setting
up DMA descriptors and handling the returned data is quite high,
especially when compared to the time it takes the engine to encrypt
512B (at 9.24 MB/s and 512 byte chunks, that is roughly 19,000 requests
per second, so around 50us per request). I was playing around with
descriptor preparation at some point (i.e. preparing the next descriptor
chain while the engine is active), but without satisfactory results.
Maybe I should have another look at it, especially regarding the case of
small chunk sizes. OTOH this all makes sense only when used
asynchronously, and I have no idea whether dm-crypt (or fellows like
IPsec) makes use of that interface at all.

   - My system locked up hard when mv_dma and mv_cesa were built as
 modules. mv_cesa has code to enable the crypto clock in 3.5, but
 mv_dma already accesses the CESA engine before. Thus, we need to
 enable this clock here, too.
  
  I have folded them into my patch series, thanks again. I somewhat miss
  the orion_clkdev_add() part for orion5x platforms, but also fail to find
  any equivalent place in the correspondent subdirectory. So I hope it is
  OK like this.
 
 The change follows the original clk changes by Andrew. I don't know
 orion5x, but apparently

Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-19 Thread Phil Sutter
Hi,

On Tue, Jun 19, 2012 at 11:09:43PM +0800, cloudy.linux wrote:
 On 2012-6-19 19:51, Phil Sutter wrote:
  Hi Simon,
 
  On Mon, Jun 18, 2012 at 10:12:36PM +0200, Simon Baatz wrote:
  On Mon, Jun 18, 2012 at 03:47:18PM +0200, Phil Sutter wrote:
  On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
  thanks for providing these patches; it's great to finally see DMA
  support for CESA in the kernel. Additionally, the implementation seems
  to be fine regarding cache incoherencies (at least my test in [*]
  works).
 
  Thanks for testing and the fixes. Could you also specify the platform
  you are testing on?
 
  This is a Marvell Kirkwood MV88F6281-A1.
 
  OK, thanks. Just wanted to be sure it's not already the Orion test I'm
  hoping for. :)
 
 
 OK, here comes the Orion test result - Linkstation Pro with 88F5182 A2. 
 I didn't enable any debug option yet (I don't know what to be enabled in 
 fact). Hope the mv_cesa and mv_dma related kernel messages below could 
 be helpful though:
 
 ...
 
 MV-DMA: IDMA engine up and running, IRQ 23
 MV-DMA: idma_print_and_clear_irq: address miss @0!
 MV-DMA: tpg.reg + DMA_CTRL = 0x80001d04
 MV-DMA: tpg.reg + DMA_BYTE_COUNT = 0x8010
 MV-DMA: tpg.reg + DMA_SRC_ADDR = 0x79b4008
 MV-DMA: tpg.reg + DMA_DST_ADDR = 0xf2200080
 MV-DMA: tpg.reg + DMA_NEXT_DESC = 0x79b1010
 MV-DMA: tpg.reg + DMA_CURR_DESC = 0x79b1000
 MV-DMA: DMA descriptor list:
 MV-DMA: entry 0 at 0xffdbb000: dma addr 0x79b1000, src 0x79b4000, dst 
 0xf2200080, count 16, own 1, next 0x79b1010
 MV-DMA: entry 1 at 0xffdbb010: dma addr 0x79b1010, src 0x799c28c, dst 
 0xf220, count 80, own 1, next 0x79b1020
 MV-DMA: entry 2 at 0xffdbb020: dma addr 0x79b1020, src 0x0, dst 0x0, 
 count 0, own 0, next 0x79b1030
 MV-DMA: entry 3 at 0xffdbb030: dma addr 0x79b1030, src 0xf2200080, dst 
 0x79b4000, count 16, own 1, next 0x0
 MV-CESA:got an interrupt but no pending timer?
 alg: skcipher: Test 1 failed on encryption for mv-ecb-aes
 : 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
 ...
 
 MV-CESA:completion timer expired (CESA active), cleaning up.
 MV-CESA:mv_completion_timer_callback: waiting for engine finishing
 MV-CESA:mv_completion_timer_callback: waiting for engine finishing
 
 Then the console was flooded by the waiting for engine finshing 
 message and the boot can't finish.
 
 I'll be happy to help to debug this. Just tell me how.

OK. IDMA bailing out was more or less expected, but the error path
flooding the log makes me deserve the Darwin award. ;)

I suspect address decoding to be the real problem here (Kirkwood seems
not to need any setup, so I skipped that completely); at least the IDMA
interrupt cause points in that direction. OTOH I found out that CESA wasn't
configured exactly as stated in the specs, so could you please test the
attached diff? (It should also sanitise the error case a bit.)

In any case, thanks a lot for your time!



Phil Sutter
Software Engineer

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 2a9fe8a..4361dff 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -9,6 +9,7 @@
 #include <crypto/aes.h>
 #include <crypto/algapi.h>
 #include <linux/crypto.h>
+#include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/dmapool.h>
 #include <linux/interrupt.h>
@@ -180,6 +181,7 @@ struct mv_req_hash_ctx {
 static void mv_completion_timer_callback(unsigned long unused)
 {
 	int active = readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_EN_SEC_ACCL0;
+	int count = 5;
 
 	printk(KERN_ERR MV_CESA
 	       "completion timer expired (CESA %sactive), cleaning up.\n",
@@ -187,8 +189,12 @@ static void mv_completion_timer_callback(unsigned long unused)
 
 	del_timer(&cpg->completion_timer);
 	writel(SEC_CMD_DISABLE_SEC, cpg->reg + SEC_ACCEL_CMD);
-	while(readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_DISABLE_SEC)
-		printk(KERN_INFO MV_CESA "%s: waiting for engine finishing\n", __func__);
+	while((readl(cpg->reg + SEC_ACCEL_CMD) & SEC_CMD_DISABLE_SEC) && count) {
+		printk(KERN_INFO MV_CESA "%s: waiting for engine finishing (%d)\n",
+				__func__, count--);
+		mdelay(1000);
+	}
+	BUG_ON(!count);
 	cpg->eng_st = ENGINE_W_DEQUEUE;
 	wake_up_process(cpg->queue_th);
 }
@@ -1288,9 +1294,9 @@ static int mv_probe(struct platform_device *pdev)
 	clk_prepare_enable(cp->clk);
 
 	writel(0, cpg->reg + SEC_ACCEL_INT_STATUS);
-	writel(SEC_INT_ACC0_IDMA_DONE, cpg->reg + SEC_ACCEL_INT_MASK);
-	writel((SEC_CFG_STOP_DIG_ERR

Re: [PATCH 0/2] Fixes for MV_CESA with IDMA or TDMA

2012-06-18 Thread Phil Sutter
Hi Simon,

On Sat, Jun 16, 2012 at 02:20:19AM +0200, Simon Baatz wrote:
 thanks for providing these patches; it's great to finally see DMA
 support for CESA in the kernel. Additionally, the implementation seems
 to be fine regarding cache incoherencies (at least my test in [*]
 works).

Thanks for testing and the fixes. Could you also specify the platform
you are testing on?

 I have two patches for your patchset...
 
 - Fix for mv_init_engine error handling
 
 - My system locked up hard when mv_dma and mv_cesa were built as
   modules. mv_cesa has code to enable the crypto clock in 3.5, but
   mv_dma already accesses the CESA engine before. Thus, we need to
   enable this clock here, too.

I have folded them into my patch series, thanks again. I somewhat miss
the orion_clkdev_add() part for orion5x platforms, but I also fail to find
any equivalent place in the corresponding subdirectory. So I hope it is
OK like this.

The updated patch series is available at git://nwl.cc/~n0-1/linux.git,
branch 'cesa-dma'. My push changed history, so you have to either reset
--hard to its HEAD, or rebase, skipping the outdated patches.

Greetings, Phil




Re: RFC: support for MV_CESA with IDMA or TDMA

2012-06-15 Thread Phil Sutter
Hi,

On Fri, Jun 15, 2012 at 09:40:28AM +0800, cloudy.linux wrote:
 I would like to have a try on those patches. But what version of kernel 
 should I apply those patches on?

Sorry for the confusion caused. I have applied those patches to Linus'
git, preceded by the three accepted ones of the earlier four. Yay.

Long story short, please just fetch git://nwl.cc/~n0-1/linux.git and
check out the 'cesa-dma' branch. It's exactly what I formatted the
patches from.

Greetings, Phil




Re: RFC: support for MV_CESA with TDMA

2012-06-12 Thread Phil Sutter
On Tue, Jun 12, 2012 at 06:04:37PM +0800, Herbert Xu wrote:
 On Fri, May 25, 2012 at 06:08:26PM +0200, Phil Sutter wrote:
 
  The point for this being RFC is backwards-compatibility: earlier
  hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
  with the same crypto engine, so in fact mv_cesa.c is in use on these
  platforms, too. But since I don't possess hardware of this kind, I am
  not able to make this code IDMA-compatible. Also, due to the quite
  massive reorganisation of code flow, I don't really see how to make TDMA
  support optional in mv_cesa.c.
 
 So does this break existing functionality or not?

It does break mv_cesa on Orion-based devices (precisely those with IDMA
instead of TDMA). I am currently working on a version which supports
IDMA, too. Since all CESA-equipped hardware comes with either TDMA or
IDMA, that version should then improve all platforms without breaking
any.

Greetings, Phil




[PATCH 12/13] mv_cesa: drop the now unused process callback

2012-06-12 Thread Phil Sutter
While at it, simplify dequeue_complete_req() a bit.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   21 ++---
 1 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 86b73d1..7b2b693 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -70,7 +70,6 @@ struct req_progress {
 	struct scatterlist *src_sg;
 	struct scatterlist *dst_sg;
 	void (*complete) (void);
-	void (*process) (void);
 
 	/* src mostly */
 	int sg_src_left;
@@ -650,25 +649,17 @@ static void mv_hash_algo_completion(void)
 static void dequeue_complete_req(void)
 {
 	struct crypto_async_request *req = cpg->cur_req;
-	cpg->p.hw_processed_bytes += cpg->p.crypt_len;
-	cpg->p.crypt_len = 0;
 
 	mv_dma_clear();
 	cpg->u32_usage = 0;
 
 	BUG_ON(cpg->eng_st != ENGINE_W_DEQUEUE);
-	if (cpg->p.hw_processed_bytes < cpg->p.hw_nbytes) {
-		/* process next scatter list entry */
-		cpg->eng_st = ENGINE_BUSY;
-		setup_data_in();
-		cpg->p.process();
-	} else {
-		cpg->p.complete();
-		cpg->eng_st = ENGINE_IDLE;
-		local_bh_disable();
-		req->complete(req, 0);
-		local_bh_enable();
-	}
+
+	cpg->p.complete();
+	cpg->eng_st = ENGINE_IDLE;
+	local_bh_disable();
+	req->complete(req, 0);
+	local_bh_enable();
 }
 
 static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
-- 
1.7.3.4



[PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit

2012-06-12 Thread Phil Sutter
Check early whether CESA can be used at all, and exit if it cannot.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   61 +-
 1 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7917d1a..9c65980 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -806,35 +806,13 @@ static void mv_start_new_hash_req(struct ahash_request *req)
 	else
 		ctx->extra_bytes = 0;
 
-	p->src_sg = req->src;
-	if (req->nbytes) {
-		BUG_ON(!req->src);
-		p->sg_src_left = req->src->length;
-	}
-
-	if (hw_bytes) {
-		p->hw_nbytes = hw_bytes;
-		p->complete = mv_hash_algo_completion;
-		p->process = mv_update_hash_config;
-
-		if (unlikely(old_extra_bytes)) {
-			dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
-					SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
-			mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
-					ctx->buffer_dma, old_extra_bytes);
-			p->crypt_len = old_extra_bytes;
+	if (unlikely(!hw_bytes)) { /* too little data for CESA */
+		if (req->nbytes) {
+			p->src_sg = req->src;
+			p->sg_src_left = req->src->length;
+			copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
+					req->nbytes);
 		}
-
-		if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
-			printk(KERN_ERR "%s: out of memory\n", __func__);
-			return;
-		}
-
-		setup_data_in();
-		mv_init_hash_config(req);
-	} else {
-		copy_src_to_buf(p, ctx->buffer + old_extra_bytes,
-				ctx->extra_bytes - old_extra_bytes);
 		if (ctx->last_chunk)
 			rc = mv_hash_final_fallback(req);
 		else
@@ -843,7 +821,34 @@ static void mv_start_new_hash_req(struct ahash_request *req)
 		local_bh_disable();
 		req->base.complete(&req->base, rc);
 		local_bh_enable();
+		return;
 	}
+
+	if (likely(req->nbytes)) {
+		BUG_ON(!req->src);
+
+		if (!mv_dma_map_sg(req->src, req->nbytes, DMA_TO_DEVICE)) {
+			printk(KERN_ERR "%s: out of memory\n", __func__);
+			return;
+		}
+		p->sg_src_left = sg_dma_len(req->src);
+		p->src_sg = req->src;
+	}
+
+	p->hw_nbytes = hw_bytes;
+	p->complete = mv_hash_algo_completion;
+	p->process = mv_update_hash_config;
+
+	if (unlikely(old_extra_bytes)) {
+		dma_sync_single_for_device(cpg->dev, ctx->buffer_dma,
+				SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+		mv_dma_memcpy(cpg->sram_phys + SRAM_DATA_IN_START,
+				ctx->buffer_dma, old_extra_bytes);
+		p->crypt_len = old_extra_bytes;
+	}
+
+	setup_data_in();
+	mv_init_hash_config(req);
 }
 
 static int queue_manag(void *data)
-- 
1.7.3.4



[PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now

2012-06-12 Thread Phil Sutter
This introduces a pool of four-byte DMA buffers for security
accelerator config updates.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |  134 --
 drivers/crypto/mv_cesa.h |1 +
 2 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7dfab85..7917d1a 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -10,6 +10,7 @@
 #include <crypto/algapi.h>
 #include <linux/crypto.h>
 #include <linux/dma-mapping.h>
+#include <linux/dmapool.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/kthread.h>
@@ -28,6 +29,9 @@
 #define MAX_HW_HASH_SIZE	0xFFFF
 #define MV_CESA_EXPIRE		500 /* msec */
 
+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
 static int count_sgs(struct scatterlist *, unsigned int);
 
 /*
@@ -97,6 +101,11 @@ struct sec_accel_sram {
 #define sa_ivo	type.hash.ivo
 } __attribute__((packed));
 
+struct u32_mempair {
+	u32 *vaddr;
+	dma_addr_t daddr;
+};
+
 struct crypto_priv {
 	struct device *dev;
 	void __iomem *reg;
@@ -120,6 +129,11 @@ struct crypto_priv {
 
 	struct sec_accel_sram sa_sram;
 	dma_addr_t sa_sram_dma;
+
+	struct dma_pool *u32_pool;
+	struct u32_mempair *u32_list;
+	int u32_list_len;
+	int u32_usage;
 };
 
 static struct crypto_priv *cpg;
@@ -191,6 +205,54 @@ static void mv_setup_timer(void)
 		  jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
 }
 
+#define U32_ITEM(x)		(cpg->u32_list[x].vaddr)
+#define U32_ITEM_DMA(x)		(cpg->u32_list[x].daddr)
+
+static inline int set_u32_poolsize(int nelem)
+{
+	/* need to increase size first if requested */
+	if (nelem > cpg->u32_list_len) {
+		struct u32_mempair *newmem;
+		int newsize = nelem * sizeof(struct u32_mempair);
+
+		newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
+		if (!newmem)
+			return -ENOMEM;
+		cpg->u32_list = newmem;
+	}
+
+	/* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
+	for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
+		U32_ITEM(cpg->u32_list_len) = dma_pool_alloc(cpg->u32_pool,
+				GFP_KERNEL, &U32_ITEM_DMA(cpg->u32_list_len));
+		if (!U32_ITEM((cpg->u32_list_len)))
+			return -ENOMEM;
+	}
+	for (; cpg->u32_list_len > nelem; cpg->u32_list_len--)
+		dma_pool_free(cpg->u32_pool, U32_ITEM(cpg->u32_list_len - 1),
+				U32_ITEM_DMA(cpg->u32_list_len - 1));
+
+	/* ignore size decreases but those to zero */
+	if (!nelem) {
+		kfree(cpg->u32_list);
+		cpg->u32_list = 0;
+	}
+	return 0;
+}
+
+static inline void mv_dma_u32_copy(dma_addr_t dst, u32 val)
+{
+	if (unlikely(cpg->u32_usage == cpg->u32_list_len)
+	    && set_u32_poolsize(cpg->u32_list_len << 1)) {
+		printk(KERN_ERR MV_CESA "resizing poolsize to %d failed\n",
+				cpg->u32_list_len << 1);
+		return;
+	}
+	*(U32_ITEM(cpg->u32_usage)) = val;
+	mv_dma_memcpy(dst, U32_ITEM_DMA(cpg->u32_usage), sizeof(u32));
+	cpg->u32_usage++;
+}
+
 static inline bool
 mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
 {
@@ -392,36 +454,13 @@ static void mv_init_crypt_config(struct ablkcipher_request *req)
 			sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
 	mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG, cpg->sa_sram_dma,
 			sizeof(struct sec_accel_sram));
-
-	mv_dma_separator();
-	dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
-	/* GO */
-	mv_setup_timer();
-	mv_dma_trigger();
-	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
 }
 
 static void mv_update_crypt_config(void)
 {
-	struct sec_accel_config *op = &cpg->sa_sram.op;
-
 	/* update the enc_len field only */
-
-	op->enc_len = cpg->p.crypt_len;
-
-	dma_sync_single_for_device(cpg->dev, cpg->sa_sram_dma + 2 * sizeof(u32),
-			sizeof(u32), DMA_TO_DEVICE);
-	mv_dma_memcpy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
-			cpg->sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
-
-	mv_dma_separator();
-	dma_copy_buf_to_dst(&cpg->p, cpg->sram_phys + SRAM_DATA_OUT_START, cpg->p.crypt_len);
-
-	/* GO */
-	mv_setup_timer();
-	mv_dma_trigger();
-	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+	mv_dma_u32_copy(cpg->sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+			(u32)cpg->p.crypt_len);
 }
 
 static void mv_crypto_algo_completion(void)
@@ -660,6 +699,7 @@ static void dequeue_complete_req(void)
 	cpg

[PATCH 13/13] mv_cesa, mv_dma: outsource common dma-pool handling code

2012-06-12 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/dma_desclist.h |   79 +++
 drivers/crypto/mv_cesa.c  |   81 +
 drivers/crypto/mv_dma.c   |   91 -
 3 files changed, 125 insertions(+), 126 deletions(-)
 create mode 100644 drivers/crypto/dma_desclist.h

diff --git a/drivers/crypto/dma_desclist.h b/drivers/crypto/dma_desclist.h
new file mode 100644
index 0000000..c471ad6
--- /dev/null
+++ b/drivers/crypto/dma_desclist.h
@@ -0,0 +1,79 @@
+#ifndef __DMA_DESCLIST__
+#define __DMA_DESCLIST__
+
+struct dma_desc {
+	void *virt;
+	dma_addr_t phys;
+};
+
+struct dma_desclist {
+	struct dma_pool *itempool;
+	struct dma_desc *desclist;
+	unsigned long length;
+	unsigned long usage;
+};
+
+#define DESCLIST_ITEM(dl, x)		((dl).desclist[(x)].virt)
+#define DESCLIST_ITEM_DMA(dl, x)	((dl).desclist[(x)].phys)
+#define DESCLIST_FULL(dl)		((dl).length == (dl).usage)
+
+static inline int
+init_dma_desclist(struct dma_desclist *dl, struct device *dev,
+		size_t size, size_t align, size_t boundary)
+{
+#define STRX(x) #x
+#define STR(x) STRX(x)
+	dl->itempool = dma_pool_create(
+			"DMA Desclist Pool at "__FILE__"("STR(__LINE__)")",
+			dev, size, align, boundary);
+#undef STR
+#undef STRX
+	if (!dl->itempool)
+		return 1;
+	dl->desclist = NULL;
+	dl->length = dl->usage = 0;
+	return 0;
+}
+
+static inline int
+set_dma_desclist_size(struct dma_desclist *dl, unsigned long nelem)
+{
+	/* need to increase size first if requested */
+	if (nelem > dl->length) {
+		struct dma_desc *newmem;
+		int newsize = nelem * sizeof(struct dma_desc);
+
+		newmem = krealloc(dl->desclist, newsize, GFP_KERNEL);
+		if (!newmem)
+			return -ENOMEM;
+		dl->desclist = newmem;
+	}
+
+	/* allocate/free dma descriptors, adjusting dl->length on the go */
+	for (; dl->length < nelem; dl->length++) {
+		DESCLIST_ITEM(*dl, dl->length) = dma_pool_alloc(dl->itempool,
+				GFP_KERNEL, &DESCLIST_ITEM_DMA(*dl, dl->length));
+		if (!DESCLIST_ITEM(*dl, dl->length))
+			return -ENOMEM;
+	}
+	for (; dl->length > nelem; dl->length--)
+		dma_pool_free(dl->itempool, DESCLIST_ITEM(*dl, dl->length - 1),
+				DESCLIST_ITEM_DMA(*dl, dl->length - 1));
+
+	/* ignore size decreases but those to zero */
+	if (!nelem) {
+		kfree(dl->desclist);
+		dl->desclist = 0;
+	}
+	return 0;
+}
+
+static inline void
+fini_dma_desclist(struct dma_desclist *dl)
+{
+	set_dma_desclist_size(dl, 0);
+	dma_pool_destroy(dl->itempool);
+	dl->length = dl->usage = 0;
+}
+
+#endif /* __DMA_DESCLIST__ */
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 7b2b693..2a9fe8a 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -24,6 +24,7 @@
 
 #include "mv_cesa.h"
 #include "mv_dma.h"
+#include "dma_desclist.h"
 
 #define MV_CESA	"MV-CESA:"
 #define MAX_HW_HASH_SIZE	0xFFFF
@@ -100,11 +101,6 @@ struct sec_accel_sram {
 #define sa_ivo	type.hash.ivo
 } __attribute__((packed));
 
-struct u32_mempair {
-	u32 *vaddr;
-	dma_addr_t daddr;
-};
-
 struct crypto_priv {
 	struct device *dev;
 	void __iomem *reg;
@@ -129,14 +125,14 @@ struct crypto_priv {
 	struct sec_accel_sram sa_sram;
 	dma_addr_t sa_sram_dma;
 
-	struct dma_pool *u32_pool;
-	struct u32_mempair *u32_list;
-	int u32_list_len;
-	int u32_usage;
+	struct dma_desclist desclist;
 };
 
 static struct crypto_priv *cpg;
 
+#define ITEM(x)		((u32 *)DESCLIST_ITEM(cpg->desclist, x))
+#define ITEM_DMA(x)	DESCLIST_ITEM_DMA(cpg->desclist, x)
+
 struct mv_ctx {
 	u8 aes_enc_key[AES_KEY_LEN];
 	u32 aes_dec_key[8];
@@ -204,52 +200,17 @@ static void mv_setup_timer(void)
 		  jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
 }
 
-#define U32_ITEM(x)		(cpg->u32_list[x].vaddr)
-#define U32_ITEM_DMA(x)		(cpg->u32_list[x].daddr)
-
-static inline int set_u32_poolsize(int nelem)
-{
-	/* need to increase size first if requested */
-	if (nelem > cpg->u32_list_len) {
-		struct u32_mempair *newmem;
-		int newsize = nelem * sizeof(struct u32_mempair);
-
-		newmem = krealloc(cpg->u32_list, newsize, GFP_KERNEL);
-		if (!newmem)
-			return -ENOMEM;
-		cpg->u32_list = newmem;
-	}
-
-	/* allocate/free dma descriptors, adjusting cpg->u32_list_len on the go */
-	for (; cpg->u32_list_len < nelem; cpg->u32_list_len++) {
-		U32_ITEM(cpg

[PATCH 04/13] mv_cesa: split up processing callbacks

2012-06-12 Thread Phil Sutter
Have a dedicated function initialising the full SRAM config, then use a
minimal callback for changing only relevant parts of it.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   87 +
 1 files changed, 64 insertions(+), 23 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 80dcf16..ad21c72 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -63,7 +63,7 @@ struct req_progress {
 	struct scatterlist *src_sg;
 	struct scatterlist *dst_sg;
 	void (*complete) (void);
-	void (*process) (int is_first);
+	void (*process) (void);
 
 	/* src mostly */
 	int sg_src_left;
@@ -267,9 +267,8 @@ static void setup_data_in(void)
 	p->crypt_len = data_in_sram;
 }
 
-static void mv_process_current_q(int first_block)
+static void mv_init_crypt_config(struct ablkcipher_request *req)
 {
-	struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
 	struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
 	struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
 	struct sec_accel_config *op = &cpg->sa_sram.op;
@@ -283,8 +282,6 @@ static void mv_process_current_q(int first_block)
 		op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
 		op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
 			ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
-		if (!first_block)
-			memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
 		memcpy(cpg->sa_sram.sa_iv, req->info, 16);
 		break;
 	}
@@ -310,9 +307,8 @@ static void mv_process_current_q(int first_block)
 	op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
 		ENC_P_DST(SRAM_DATA_OUT_START);
 	op->enc_key_p = SRAM_DATA_KEY_P;
-
-	setup_data_in();
 	op->enc_len = cpg->p.crypt_len;
+
 	memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
 			sizeof(struct sec_accel_sram));
 
@@ -321,6 +317,17 @@ static void mv_process_current_q(int first_block)
 	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
 }
 
+static void mv_update_crypt_config(void)
+{
+	/* update the enc_len field only */
+	memcpy(cpg->sram + SRAM_CONFIG + 2 * sizeof(u32),
+			&cpg->p.crypt_len, sizeof(u32));
+
+	/* GO */
+	mv_setup_timer();
+	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
+}
+
 static void mv_crypto_algo_completion(void)
 {
 	struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
@@ -332,9 +339,8 @@ static void mv_crypto_algo_completion(void)
 	memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
 }
 
-static void mv_process_hash_current(int first_block)
+static void mv_init_hash_config(struct ahash_request *req)
 {
-	struct ahash_request *req = ahash_request_cast(cpg->cur_req);
 	const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
 	struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
 	struct req_progress *p = &cpg->p;
@@ -357,8 +363,6 @@ static void mv_process_hash_current(int first_block)
 			MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
 			MAC_SRC_TOTAL_LEN((u32)req_ctx->count);
 
-		setup_data_in();
-
 		op->mac_digest =
 			MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p->crypt_len);
 		op->mac_iv =
@@ -381,13 +385,11 @@ static void mv_process_hash_current(int first_block)
 		else
 			op->config |= CFG_MID_FRAG;
 
-		if (first_block) {
-			writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
-			writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
-			writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
-			writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
-			writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
-		}
+		writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
+		writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
+		writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
+		writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
+		writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
 	}
 
 	memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
@@ -398,6 +400,42 @@ static void mv_process_hash_current(int first_block)
 	writel(SEC_CMD_EN_SEC_ACCL0, cpg->reg + SEC_ACCEL_CMD);
 }
 
+static void mv_update_hash_config(void)
+{
+	struct ahash_request *req = ahash_request_cast(cpg->cur_req);
+	struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
+	struct req_progress *p = &cpg->p;
+	struct sec_accel_config *op = &cpg->sa_sram.op;
+	int is_last;
+
+	/* update only the config

[PATCH 06/13] mv_cesa: use DMA engine for data transfers

2012-06-12 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 arch/arm/plat-orion/common.c |6 +
 drivers/crypto/mv_cesa.c |  214 +-
 2 files changed, 175 insertions(+), 45 deletions(-)

diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 61fd837..0c6c695 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -924,9 +924,15 @@ static struct resource orion_crypto_resources[] = {
},
 };
 
+static u64 mv_crypto_dmamask = DMA_BIT_MASK(32);
+
 static struct platform_device orion_crypto = {
.name   = mv_crypto,
.id = -1,
+   .dev= {
+   .dma_mask = mv_crypto_dmamask,
+   .coherent_dma_mask = DMA_BIT_MASK(32),
+   },
 };
 
 void __init orion_crypto_init(unsigned long mapbase,
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index ad21c72..cdbc82e 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -9,6 +9,7 @@
 #include crypto/aes.h
 #include crypto/algapi.h
 #include linux/crypto.h
+#include linux/dma-mapping.h
 #include linux/interrupt.h
 #include linux/io.h
 #include linux/kthread.h
@@ -21,11 +22,14 @@
 #include crypto/sha.h
 
 #include mv_cesa.h
+#include mv_dma.h
 
 #define MV_CESAMV-CESA:
 #define MAX_HW_HASH_SIZE   0x
 #define MV_CESA_EXPIRE 500 /* msec */
 
+static int count_sgs(struct scatterlist *, unsigned int);
+
 /*
  * STM:
  *   /---\
@@ -50,7 +54,6 @@ enum engine_status {
  * @src_start: offset to add to src start position (scatter list)
  * @crypt_len: length of current hw crypt/hash process
  * @hw_nbytes: total bytes to process in hw for this request
- * @copy_back: whether to copy data back (crypt) or not (hash)
  * @sg_dst_left:   bytes left dst to process in this scatter list
  * @dst_start: offset to add to dst start position (scatter list)
  * @hw_processed_bytes:number of bytes processed by hw (request).
@@ -71,7 +74,6 @@ struct req_progress {
int crypt_len;
int hw_nbytes;
/* dst mostly */
-   int copy_back;
int sg_dst_left;
int dst_start;
int hw_processed_bytes;
@@ -96,8 +98,10 @@ struct sec_accel_sram {
 } __attribute__((packed));
 
 struct crypto_priv {
+   struct device *dev;
void __iomem *reg;
void __iomem *sram;
+   u32 sram_phys;
int irq;
struct clk *clk;
struct task_struct *queue_th;
@@ -115,6 +119,7 @@ struct crypto_priv {
int has_hmac_sha1;
 
struct sec_accel_sram sa_sram;
+   dma_addr_t sa_sram_dma;
 };
 
 static struct crypto_priv *cpg;
@@ -183,6 +188,23 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
 }
 
+static inline bool
+mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+   int nents = count_sgs(sg, nbytes);
+
+   if (nbytes  dma_map_sg(cpg-dev, sg, nents, dir) != nents)
+   return false;
+   return true;
+}
+
+static inline void
+mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction 
dir)
+{
+   if (nbytes)
+   dma_unmap_sg(cpg-dev, sg, count_sgs(sg, nbytes), dir);
+}
+
 static void compute_aes_dec_key(struct mv_ctx *ctx)
 {
struct crypto_aes_ctx gen_aes_key;
@@ -257,12 +279,66 @@ static void copy_src_to_buf(struct req_progress *p, char 
*dbuf, int len)
}
 }
 
+static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int 
len)
+{
+   dma_addr_t sbuf;
+   int copy_len;
+
+   while (len) {
+   if (!p-sg_src_left) {
+   /* next sg please */
+   p-src_sg = sg_next(p-src_sg);
+   BUG_ON(!p-src_sg);
+   p-sg_src_left = sg_dma_len(p-src_sg);
+   p-src_start = 0;
+   }
+
+   sbuf = sg_dma_address(p-src_sg) + p-src_start;
+
+   copy_len = min(p-sg_src_left, len);
+   mv_dma_memcpy(dbuf, sbuf, copy_len);
+
+   p-src_start += copy_len;
+   p-sg_src_left -= copy_len;
+
+   len -= copy_len;
+   dbuf += copy_len;
+   }
+}
+
+static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int 
len)
+{
+   dma_addr_t dbuf;
+   int copy_len;
+
+   while (len) {
+   if (!p-sg_dst_left) {
+   /* next sg please */
+   p-dst_sg = sg_next(p-dst_sg);
+   BUG_ON(!p-dst_sg);
+   p-sg_dst_left = sg_dma_len(p-dst_sg);
+   p-dst_start = 0;
+   }
+
+   dbuf = sg_dma_address(p-dst_sg) + p-dst_start;
+
+   copy_len = min(p-sg_dst_left, len);
+   mv_dma_memcpy(dbuf, sbuf, copy_len

[PATCH 08/13] mv_cesa: fetch extra_bytes via DMA engine, too

2012-06-12 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4b08137..7dfab85 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -158,6 +158,7 @@ struct mv_req_hash_ctx {
u64 count;
u32 state[SHA1_DIGEST_SIZE / 4];
u8 buffer[SHA1_BLOCK_SIZE];
+   dma_addr_t buffer_dma;
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes;/* unprocessed bytes in buffer */
@@ -638,6 +639,9 @@ static void mv_hash_algo_completion(void)
dma_unmap_single(cpg-dev, ctx-result_dma,
ctx-digestsize, DMA_FROM_DEVICE);
 
+   dma_unmap_single(cpg-dev, ctx-buffer_dma,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+
if (unlikely(ctx-count  MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
@@ -757,8 +761,10 @@ static void mv_start_new_hash_req(struct ahash_request 
*req)
p-process = mv_update_hash_config;
 
if (unlikely(old_extra_bytes)) {
-   memcpy(cpg-sram + SRAM_DATA_IN_START, ctx-buffer,
-  old_extra_bytes);
+   dma_sync_single_for_device(cpg-dev, ctx-buffer_dma,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+   mv_dma_memcpy(cpg-sram_phys + SRAM_DATA_IN_START,
+   ctx-buffer_dma, old_extra_bytes);
p-crypt_len = old_extra_bytes;
}
 
@@ -903,6 +909,8 @@ static void mv_init_hash_req_ctx(struct mv_req_hash_ctx 
*ctx, int op,
ctx-first_hash = 1;
ctx-last_chunk = is_last;
ctx-count_add = count_add;
+   ctx-buffer_dma = dma_map_single(cpg-dev, ctx-buffer,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
 }
 
 static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last,
-- 
1.7.3.4



[PATCH 11/13] mv_cesa: implement descriptor chaining for hashes, too

2012-06-12 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   89 ++
 1 files changed, 35 insertions(+), 54 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 9c65980..86b73d1 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -538,34 +538,14 @@ static void mv_init_hash_config(struct ahash_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_dma_memcpy(cpg-sram_phys + SRAM_CONFIG, cpg-sa_sram_dma,
sizeof(struct sec_accel_sram));
-
-   mv_dma_separator();
-
-   if (req-result) {
-   req_ctx-result_dma = dma_map_single(cpg-dev, req-result,
-   req_ctx-digestsize, DMA_FROM_DEVICE);
-   mv_dma_memcpy(req_ctx-result_dma,
-   cpg-sram_phys + SRAM_DIGEST_BUF, 
req_ctx-digestsize);
-   } else {
-   /* XXX: this fixes some ugly register fuckup bug in the tdma 
engine
-*  (no need to sync since the data is ignored anyway) */
-   mv_dma_memcpy(cpg-sa_sram_dma,
-   cpg-sram_phys + SRAM_CONFIG, 1);
-   }
-
-   /* GO */
-   mv_setup_timer();
-   mv_dma_trigger();
-   writel(SEC_CMD_EN_SEC_ACCL0, cpg-reg + SEC_ACCEL_CMD);
 }
 
-static void mv_update_hash_config(void)
+static void mv_update_hash_config(struct ahash_request *req)
 {
-   struct ahash_request *req = ahash_request_cast(cpg-cur_req);
struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
struct req_progress *p = cpg-p;
-   struct sec_accel_config *op = cpg-sa_sram.op;
int is_last;
+   u32 val;
 
/* update only the config (for changed fragment state) and
 * mac_digest (for changed frag len) fields */
@@ -573,10 +553,10 @@ static void mv_update_hash_config(void)
switch (req_ctx-op) {
case COP_SHA1:
default:
-   op-config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+   val = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
break;
case COP_HMAC_SHA1:
-   op-config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+   val = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
break;
}
 
@@ -584,36 +564,11 @@ static void mv_update_hash_config(void)
 (p-hw_processed_bytes + p-crypt_len = p-hw_nbytes)
 (req_ctx-count = MAX_HW_HASH_SIZE);
 
-   op-config |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
-   dma_sync_single_for_device(cpg-dev, cpg-sa_sram_dma,
-   sizeof(u32), DMA_TO_DEVICE);
-   mv_dma_memcpy(cpg-sram_phys + SRAM_CONFIG,
-   cpg-sa_sram_dma, sizeof(u32));
-
-   op-mac_digest = MAC_DIGEST_P(SRAM_DIGEST_BUF) | 
MAC_FRAG_LEN(p-crypt_len);
-   dma_sync_single_for_device(cpg-dev, cpg-sa_sram_dma + 6 * sizeof(u32),
-   sizeof(u32), DMA_TO_DEVICE);
-   mv_dma_memcpy(cpg-sram_phys + SRAM_CONFIG + 6 * sizeof(u32),
-   cpg-sa_sram_dma + 6 * sizeof(u32), sizeof(u32));
-
-   mv_dma_separator();
-
-   if (req-result) {
-   req_ctx-result_dma = dma_map_single(cpg-dev, req-result,
-   req_ctx-digestsize, DMA_FROM_DEVICE);
-   mv_dma_memcpy(req_ctx-result_dma,
-   cpg-sram_phys + SRAM_DIGEST_BUF, 
req_ctx-digestsize);
-   } else {
-   /* XXX: this fixes some ugly register fuckup bug in the tdma 
engine
-*  (no need to sync since the data is ignored anyway) */
-   mv_dma_memcpy(cpg-sa_sram_dma,
-   cpg-sram_phys + SRAM_CONFIG, 1);
-   }
+   val |= is_last ? CFG_LAST_FRAG : CFG_MID_FRAG;
+   mv_dma_u32_copy(cpg-sram_phys + SRAM_CONFIG, val);
 
-   /* GO */
-   mv_setup_timer();
-   mv_dma_trigger();
-   writel(SEC_CMD_EN_SEC_ACCL0, cpg-reg + SEC_ACCEL_CMD);
+   val = MAC_DIGEST_P(SRAM_DIGEST_BUF) | MAC_FRAG_LEN(p-crypt_len);
+   mv_dma_u32_copy(cpg-sram_phys + SRAM_CONFIG + 6 * sizeof(u32), val);
 }
 
 static inline int mv_hash_import_sha1_ctx(const struct mv_req_hash_ctx *ctx,
@@ -837,7 +792,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)
 
p-hw_nbytes = hw_bytes;
p-complete = mv_hash_algo_completion;
-   p-process = mv_update_hash_config;
 
if (unlikely(old_extra_bytes)) {
dma_sync_single_for_device(cpg-dev, ctx-buffer_dma,
@@ -849,6 +803,33 @@ static void mv_start_new_hash_req(struct ahash_request 
*req)
 
setup_data_in();
mv_init_hash_config(req);
+   mv_dma_separator();
+   cpg-p.hw_processed_bytes += cpg-p.crypt_len;
+   while (cpg-p.hw_processed_bytes  cpg-p.hw_nbytes) {
+   cpg-p.crypt_len = 0;
+
+   setup_data_in

RFC: support for MV_CESA with IDMA or TDMA

2012-06-12 Thread Phil Sutter
Hi,

The following patch series adds support for the TDMA engine built into
Marvell's Kirkwood-based SoCs as well as the IDMA engine built into
Marvell's Orion-based SoCs and enhances mv_cesa.c in order to use it for
speeding up crypto operations. The hardware contains a security
accelerator, which can control DMA as well as crypto engines. It allows
for operation with minimal software intervention, which the following
patches implement: using a chain of DMA descriptors, data input,
configuration, engine startup and data output repeat fully automatically
until the whole input data has been handled.
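To make that flow concrete, here is a rough per-chunk sketch (illustrative
only, not lifted verbatim from the patches; the mv_dma_*() helpers, SRAM_*
offsets and struct sec_accel_sram are what this series introduces, while
sram_phys/reg stand in for the driver's cpg->sram_phys and cpg->reg; later
patches only re-copy the config words that actually change instead of the
whole block):

	static u32 sram_phys;           /* stand-in for cpg->sram_phys */
	static void __iomem *reg;       /* stand-in for cpg->reg       */

	static void chain_one_chunk(dma_addr_t src, dma_addr_t dst,
				    dma_addr_t config_dma, int len)
	{
		/* stage input data and the accelerator config into SRAM */
		mv_dma_memcpy(sram_phys + SRAM_DATA_IN_START, src, len);
		mv_dma_memcpy(sram_phys + SRAM_CONFIG, config_dma,
			      sizeof(struct sec_accel_sram));

		/* separator: the DMA chain pauses here while the crypto
		 * engine processes the chunk that was just staged */
		mv_dma_separator();

		/* copy the processed data back out of SRAM */
		mv_dma_memcpy(dst, sram_phys + SRAM_DATA_OUT_START, len);
	}

	static void kick_engines(void)
	{
		/* once the chain covers the whole request: */
		mv_dma_trigger();                                  /* start the DMA chain   */
		writel(SEC_CMD_EN_SEC_ACCL0, reg + SEC_ACCEL_CMD); /* start the accelerator */
	}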

The point for this being RFC is the lack of hardware on my side for testing
the IDMA support. I'd highly appreciate it if someone with Orion hardware
could test this, preferably using the hmac_comp tool shipped with
cryptodev-linux, as it does more extensive testing (with bigger buffer
sizes at least) than tcrypt or the standard kernel-internal use cases.

Greetings, Phil



[PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon

2012-06-12 Thread Phil Sutter
This is just to keep formatting changes out of the following commit,
hopefully simplifying it a bit.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   14 ++
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 818a5c7..59c2ed2 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -269,12 +269,10 @@ static void mv_process_current_q(int first_block)
}
if (req_ctx-decrypt) {
op.config |= CFG_DIR_DEC;
-   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_dec_key,
-   AES_KEY_LEN);
+   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_dec_key, 
AES_KEY_LEN);
} else {
op.config |= CFG_DIR_ENC;
-   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_enc_key,
-   AES_KEY_LEN);
+   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_enc_key, 
AES_KEY_LEN);
}
 
switch (ctx-key_len) {
@@ -335,9 +333,8 @@ static void mv_process_hash_current(int first_block)
}
 
op.mac_src_p =
-   MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)
-   req_ctx-
-   count);
+   MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
+   MAC_SRC_TOTAL_LEN((u32)req_ctx-count);
 
setup_data_in();
 
@@ -372,7 +369,8 @@ static void mv_process_hash_current(int first_block)
}
}
 
-   memcpy(cpg-sram + SRAM_CONFIG, op, sizeof(struct sec_accel_config));
+   memcpy(cpg-sram + SRAM_CONFIG, op,
+   sizeof(struct sec_accel_config));
 
/* GO */
mv_setup_timer();
-- 
1.7.3.4



[PATCH 05/13] add a driver for the Marvell IDMA/TDMA engines

2012-06-12 Thread Phil Sutter
These are DMA engines integrated into the Marvell Orion/Kirkwood SoCs,
designed to offload data transfers from/to the CESA crypto engine.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 arch/arm/mach-kirkwood/common.c  |   33 ++
 arch/arm/mach-kirkwood/include/mach/irqs.h   |1 +
 arch/arm/mach-orion5x/common.c   |   33 ++
 arch/arm/mach-orion5x/include/mach/orion5x.h |2 +
 drivers/crypto/Kconfig   |5 +
 drivers/crypto/Makefile  |3 +-
 drivers/crypto/mv_dma.c  |  464 ++
 drivers/crypto/mv_dma.h  |  127 +++
 8 files changed, 667 insertions(+), 1 deletions(-)
 create mode 100644 drivers/crypto/mv_dma.c
 create mode 100644 drivers/crypto/mv_dma.h

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 25fb3fd..dcd1327 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -426,8 +426,41 @@ void __init kirkwood_uart1_init(void)
 /*
  * Cryptographic Engines and Security Accelerator (CESA)
  /
+static struct resource kirkwood_tdma_res[] = {
+   {
+   .name   = regs deco,
+   .start  = CRYPTO_PHYS_BASE + 0xA00,
+   .end= CRYPTO_PHYS_BASE + 0xA24,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = regs control and error,
+   .start  = CRYPTO_PHYS_BASE + 0x800,
+   .end= CRYPTO_PHYS_BASE + 0x8CF,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = crypto error,
+   .start  = IRQ_KIRKWOOD_TDMA_ERR,
+   .end= IRQ_KIRKWOOD_TDMA_ERR,
+   .flags  = IORESOURCE_IRQ,
+   },
+};
+
+static u64 mv_tdma_dma_mask = DMA_BIT_MASK(32);
+
+static struct platform_device kirkwood_tdma_device = {
+   .name   = mv_tdma,
+   .id = -1,
+   .dev= {
+   .dma_mask   = mv_tdma_dma_mask,
+   .coherent_dma_mask  = DMA_BIT_MASK(32),
+   },
+   .num_resources  = ARRAY_SIZE(kirkwood_tdma_res),
+   .resource   = kirkwood_tdma_res,
+};
+
 void __init kirkwood_crypto_init(void)
 {
+   platform_device_register(kirkwood_tdma_device);
orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE,
  KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO);
 }
diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h 
b/arch/arm/mach-kirkwood/include/mach/irqs.h
index 2bf8161..a66aa3f 100644
--- a/arch/arm/mach-kirkwood/include/mach/irqs.h
+++ b/arch/arm/mach-kirkwood/include/mach/irqs.h
@@ -51,6 +51,7 @@
 #define IRQ_KIRKWOOD_GPIO_HIGH_16_23   41
 #define IRQ_KIRKWOOD_GE00_ERR  46
 #define IRQ_KIRKWOOD_GE01_ERR  47
+#define IRQ_KIRKWOOD_TDMA_ERR  49
 #define IRQ_KIRKWOOD_RTC53
 
 /*
diff --git a/arch/arm/mach-orion5x/common.c b/arch/arm/mach-orion5x/common.c
index 9148b22..553ccf2 100644
--- a/arch/arm/mach-orion5x/common.c
+++ b/arch/arm/mach-orion5x/common.c
@@ -181,9 +181,42 @@ void __init orion5x_xor_init(void)
 /*
  * Cryptographic Engines and Security Accelerator (CESA)
  /
+static struct resource orion_idma_res[] = {
+   {
+   .name   = regs deco,
+   .start  = ORION5X_IDMA_PHYS_BASE + 0xA00,
+   .end= ORION5X_IDMA_PHYS_BASE + 0xA24,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = regs control and error,
+   .start  = ORION5X_IDMA_PHYS_BASE + 0x800,
+   .end= ORION5X_IDMA_PHYS_BASE + 0x8CF,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = crypto error,
+   .start  = IRQ_ORION5X_IDMA_ERR,
+   .end= IRQ_ORION5X_IDMA_ERR,
+   .flags  = IORESOURCE_IRQ,
+   },
+};
+
+static u64 mv_idma_dma_mask = DMA_BIT_MASK(32);
+
+static struct platform_device orion_idma_device = {
+   .name   = mv_idma,
+   .id = -1,
+   .dev= {
+   .dma_mask   = mv_idma_dma_mask,
+   .coherent_dma_mask  = DMA_BIT_MASK(32),
+   },
+   .num_resources  = ARRAY_SIZE(orion_idma_res),
+   .resource   = orion_idma_res,
+};
+
 static void __init orion5x_crypto_init(void)
 {
orion5x_setup_sram_win();
+   platform_device_register(orion_idma_device);
orion_crypto_init(ORION5X_CRYPTO_PHYS_BASE, ORION5X_SRAM_PHYS_BASE,
  SZ_8K, IRQ_ORION5X_CESA);
 }
diff --git a/arch/arm/mach-orion5x/include/mach/orion5x.h 
b/arch/arm/mach-orion5x/include/mach/orion5x.h
index 2745f5d

[PATCH 01/13] mv_cesa: do not use scatterlist iterators

2012-06-12 Thread Phil Sutter
The big problem is that they cannot be used to iterate over DMA-mapped
scatterlists, so get rid of them in order to add DMA functionality to
mv_cesa.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   57 ++---
 1 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 0d40717..818a5c7 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -44,8 +44,8 @@ enum engine_status {
 
 /**
  * struct req_progress - used for every crypt request
- * @src_sg_it: sg iterator for src
- * @dst_sg_it: sg iterator for dst
+ * @src_sg:sg list for src
+ * @dst_sg:sg list for dst
  * @sg_src_left:   bytes left in src to process (scatter list)
  * @src_start: offset to add to src start position (scatter list)
  * @crypt_len: length of current hw crypt/hash process
@@ -60,8 +60,8 @@ enum engine_status {
  * track of progress within current scatterlist.
  */
 struct req_progress {
-   struct sg_mapping_iter src_sg_it;
-   struct sg_mapping_iter dst_sg_it;
+   struct scatterlist *src_sg;
+   struct scatterlist *dst_sg;
void (*complete) (void);
void (*process) (int is_first);
 
@@ -212,19 +212,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher 
*cipher, const u8 *key,
 
 static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
 {
-   int ret;
void *sbuf;
int copy_len;
 
while (len) {
if (!p-sg_src_left) {
-   ret = sg_miter_next(p-src_sg_it);
-   BUG_ON(!ret);
-   p-sg_src_left = p-src_sg_it.length;
+   /* next sg please */
+   p-src_sg = sg_next(p-src_sg);
+   BUG_ON(!p-src_sg);
+   p-sg_src_left = p-src_sg-length;
p-src_start = 0;
}
 
-   sbuf = p-src_sg_it.addr + p-src_start;
+   sbuf = sg_virt(p-src_sg) + p-src_start;
 
copy_len = min(p-sg_src_left, len);
memcpy(dbuf, sbuf, copy_len);
@@ -307,9 +307,6 @@ static void mv_crypto_algo_completion(void)
struct ablkcipher_request *req = ablkcipher_request_cast(cpg-cur_req);
struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
 
-   sg_miter_stop(cpg-p.src_sg_it);
-   sg_miter_stop(cpg-p.dst_sg_it);
-
if (req_ctx-op != COP_AES_CBC)
return ;
 
@@ -439,7 +436,6 @@ static void mv_hash_algo_completion(void)
 
if (ctx-extra_bytes)
copy_src_to_buf(cpg-p, ctx-buffer, ctx-extra_bytes);
-   sg_miter_stop(cpg-p.src_sg_it);
 
if (likely(ctx-last_chunk)) {
if (likely(ctx-count = MAX_HW_HASH_SIZE)) {
@@ -459,7 +455,6 @@ static void dequeue_complete_req(void)
 {
struct crypto_async_request *req = cpg-cur_req;
void *buf;
-   int ret;
cpg-p.hw_processed_bytes += cpg-p.crypt_len;
if (cpg-p.copy_back) {
int need_copy_len = cpg-p.crypt_len;
@@ -468,14 +463,14 @@ static void dequeue_complete_req(void)
int dst_copy;
 
if (!cpg-p.sg_dst_left) {
-   ret = sg_miter_next(cpg-p.dst_sg_it);
-   BUG_ON(!ret);
-   cpg-p.sg_dst_left = cpg-p.dst_sg_it.length;
+   /* next sg please */
+   cpg-p.dst_sg = sg_next(cpg-p.dst_sg);
+   BUG_ON(!cpg-p.dst_sg);
+   cpg-p.sg_dst_left = cpg-p.dst_sg-length;
cpg-p.dst_start = 0;
}
 
-   buf = cpg-p.dst_sg_it.addr;
-   buf += cpg-p.dst_start;
+   buf = sg_virt(cpg-p.dst_sg) + cpg-p.dst_start;
 
dst_copy = min(need_copy_len, cpg-p.sg_dst_left);
 
@@ -525,7 +520,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int 
total_bytes)
 static void mv_start_new_crypt_req(struct ablkcipher_request *req)
 {
struct req_progress *p = cpg-p;
-   int num_sgs;
 
cpg-cur_req = req-base;
memset(p, 0, sizeof(struct req_progress));
@@ -534,11 +528,14 @@ static void mv_start_new_crypt_req(struct 
ablkcipher_request *req)
p-process = mv_process_current_q;
p-copy_back = 1;
 
-   num_sgs = count_sgs(req-src, req-nbytes);
-   sg_miter_start(p-src_sg_it, req-src, num_sgs, SG_MITER_FROM_SG);
-
-   num_sgs = count_sgs(req-dst, req-nbytes);
-   sg_miter_start(p-dst_sg_it, req-dst, num_sgs, SG_MITER_TO_SG);
+   p-src_sg = req-src;
+   p-dst_sg = req-dst;
+   if (req-nbytes) {
+   BUG_ON(!req-src);
+   BUG_ON(!req-dst);
+   p

Re: RFC: support for MV_CESA with TDMA

2012-05-29 Thread Phil Sutter
Hi,

On Sun, May 27, 2012 at 10:03:07PM +0800, cloudy.linux wrote:
 Could the source code from the manufacturers of hardware using Kirkwood
 be helpful?
 I saw that the source code of the LS-WVL from Buffalo contains a driver for
 CESA, and it deals with both IDMA and TDMA. If you need, I can send you the
 download link.

Actually, I do have the sources. Just had doubts about how useful it
would be to write code for something I couldn't test at all. OTOH,
that's probably a better start than nothing.

 I also have to point out that the CESA of some Orion revisions has hardware
 flaws that need to be addressed, which the driver currently doesn't. Information
 about those flaws can be found in 88F5182_Functional_Errata.pdf, which is
 available on the net.

OK, thanks for the pointer! Looks like implementing combined
(crypto/digest) operation for Orion will be no fun at least.

Greetings, Phil





Re: [PATCH 2/4] mv_cesa: no need to write to that FPGA_INT_STATUS field

2012-05-29 Thread Phil Sutter
Hi,

On Mon, May 28, 2012 at 09:58:32AM +0800, cloudy.linux wrote:
 On 2012-5-25 21:54, Phil Sutter wrote:
  Also drop the whole definition, since it's unused otherwise.
 
  Signed-off-by: Phil Sutterphil.sut...@viprinet.com
  ---
drivers/crypto/mv_cesa.c |1 -
drivers/crypto/mv_cesa.h |7 ---
2 files changed, 0 insertions(+), 8 deletions(-)
 
  diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
  index 8327bed..4a1f872 100644
  --- a/drivers/crypto/mv_cesa.c
  +++ b/drivers/crypto/mv_cesa.c
  @@ -908,7 +908,6 @@ irqreturn_t crypto_int(int irq, void *priv)
 got an interrupt but no pending timer?\n);
  }
  val= ~SEC_INT_ACCEL0_DONE;
  -   writel(val, cpg-reg + FPGA_INT_STATUS);
  writel(val, cpg-reg + SEC_ACCEL_INT_STATUS);
  BUG_ON(cpg-eng_st != ENGINE_BUSY);
  cpg-eng_st = ENGINE_W_DEQUEUE;
  diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h
  index 08fcb11..81ce109 100644
  --- a/drivers/crypto/mv_cesa.h
  +++ b/drivers/crypto/mv_cesa.h
  @@ -29,13 +29,6 @@
#define SEC_ST_ACT_0  (1  0)
#define SEC_ST_ACT_1  (1  1)
 
  -/*
  - * FPGA_INT_STATUS looks like a FPGA leftover and is documented only in 
  Errata
  - * 4.12. It looks like that it was part of an IRQ-controller in FPGA and
  - * someone forgot to remove  it while switching to the core and moving to
  - * SEC_ACCEL_INT_STATUS.
  - */
  -#define FPGA_INT_STATUS0xdd68
#define SEC_ACCEL_INT_STATUS  0xde20
#define SEC_INT_AUTH_DONE (1  0)
#define SEC_INT_DES_E_DONE(1  1)
 
 According to the functional errata of 88F5182, the FPGA_INT_STATUS is 
 needed (at least for 88F5182-A1/A2). Here is the quote from that errata:
 
 4.12  Clearing the Cryptographic Engines and Security Accelerator 
 Interrupt Cause Register
   Type:   Guideline
   Ref#:   GL-CESA-100
   Relevant for:   88F5182-A1/A2
 
 Description:
 Writing 0 to bits[6:0] of the Cryptographic Engines ... Interrupt Cause
 register (offset 0x9DE20) has no effect.
 
 Steps to be performed by the designer
 Before writing 0 to any of the bits[6:0] of the Cryptographic Engines .. 
 Interrupt Cause register, the software must write 0 to the corresponding 
 bit of the internal register at offset 0x9DD68.
 Writing to offset 0x9DD68 is not possible when any of the Security 
 Accelerators' sessions are active. Therefore, the software must verify 
 that no channel is active before clearing any of those interrupts.

Oh, that explains why it's not needed on Kirkwood but was still left there.
I could make it compile-time optional, depending on ARCH_ORION5X e.g., or
simply drop the patch and leave it alone, since it really doesn't hurt
that much.
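For what it's worth, the compile-time variant could look roughly like this
(untested sketch only; the helper name is made up, and FPGA_INT_STATUS is the
define this patch would otherwise remove):

	static inline void mv_clear_int_cause(u32 val)
	{
	#ifdef CONFIG_ARCH_ORION5X
		/* GL-CESA-100 (88F5182-A1/A2): the cause bits only clear if the
		 * register at offset 0xdd68 is written first, and only while no
		 * accelerator session is active. */
		writel(val, cpg->reg + FPGA_INT_STATUS);
	#endif
		writel(val, cpg->reg + SEC_ACCEL_INT_STATUS);
	}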

Anyway, thanks a lot for your kind review!

Greetings, Phil





[PATCH 3/4] mv_cesa: initialise the interrupt status field to zero

2012-05-25 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4a1f872..d4763fb 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -1073,6 +1073,7 @@ static int mv_probe(struct platform_device *pdev)
if (ret)
goto err_thread;
 
+   writel(0, cpg-reg + SEC_ACCEL_INT_STATUS);
writel(SEC_INT_ACCEL0_DONE, cpg-reg + SEC_ACCEL_INT_MASK);
writel(SEC_CFG_STOP_DIG_ERR, cpg-reg + SEC_ACCEL_CFG);
writel(SRAM_CONFIG, cpg-reg + SEC_ACCEL_DESC_P0);
-- 
1.7.3.4



[PATCH 2/4] mv_cesa: no need to write to that FPGA_INT_STATUS field

2012-05-25 Thread Phil Sutter
Also drop the whole definition, since it's unused otherwise.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |1 -
 drivers/crypto/mv_cesa.h |7 ---
 2 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 8327bed..4a1f872 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -908,7 +908,6 @@ irqreturn_t crypto_int(int irq, void *priv)
   got an interrupt but no pending timer?\n);
}
val = ~SEC_INT_ACCEL0_DONE;
-   writel(val, cpg-reg + FPGA_INT_STATUS);
writel(val, cpg-reg + SEC_ACCEL_INT_STATUS);
BUG_ON(cpg-eng_st != ENGINE_BUSY);
cpg-eng_st = ENGINE_W_DEQUEUE;
diff --git a/drivers/crypto/mv_cesa.h b/drivers/crypto/mv_cesa.h
index 08fcb11..81ce109 100644
--- a/drivers/crypto/mv_cesa.h
+++ b/drivers/crypto/mv_cesa.h
@@ -29,13 +29,6 @@
 #define SEC_ST_ACT_0   (1  0)
 #define SEC_ST_ACT_1   (1  1)
 
-/*
- * FPGA_INT_STATUS looks like a FPGA leftover and is documented only in Errata
- * 4.12. It looks like that it was part of an IRQ-controller in FPGA and
- * someone forgot to remove  it while switching to the core and moving to
- * SEC_ACCEL_INT_STATUS.
- */
-#define FPGA_INT_STATUS0xdd68
 #define SEC_ACCEL_INT_STATUS   0xde20
 #define SEC_INT_AUTH_DONE  (1  0)
 #define SEC_INT_DES_E_DONE (1  1)
-- 
1.7.3.4



[PATCH 08/13] mv_cesa: fetch extra_bytes via TDMA engine, too

2012-05-25 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   12 ++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index d099aa0..bc2692e 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -156,6 +156,7 @@ struct mv_req_hash_ctx {
u64 count;
u32 state[SHA1_DIGEST_SIZE / 4];
u8 buffer[SHA1_BLOCK_SIZE];
+   dma_addr_t buffer_dma;
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes;/* unprocessed bytes in buffer */
@@ -636,6 +637,9 @@ static void mv_hash_algo_completion(void)
dma_unmap_single(cpg-dev, ctx-result_dma,
ctx-digestsize, DMA_FROM_DEVICE);
 
+   dma_unmap_single(cpg-dev, ctx-buffer_dma,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+
if (unlikely(ctx-count  MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
@@ -755,8 +759,10 @@ static void mv_start_new_hash_req(struct ahash_request 
*req)
p-process = mv_update_hash_config;
 
if (unlikely(old_extra_bytes)) {
-   memcpy(cpg-sram + SRAM_DATA_IN_START, ctx-buffer,
-  old_extra_bytes);
+   dma_sync_single_for_device(cpg-dev, ctx-buffer_dma,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+   mv_tdma_memcpy(cpg-sram_phys + SRAM_DATA_IN_START,
+   ctx-buffer_dma, old_extra_bytes);
p-crypt_len = old_extra_bytes;
}
 
@@ -901,6 +907,8 @@ static void mv_init_hash_req_ctx(struct mv_req_hash_ctx 
*ctx, int op,
ctx-first_hash = 1;
ctx-last_chunk = is_last;
ctx-count_add = count_add;
+   ctx-buffer_dma = dma_map_single(cpg-dev, ctx-buffer,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
 }
 
 static void mv_update_hash_req_ctx(struct mv_req_hash_ctx *ctx, int is_last,
-- 
1.7.3.4



[PATCH 07/13] mv_cesa: have TDMA copy back the digest result

2012-05-25 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   40 +---
 1 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index e10da2b..d099aa0 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -159,8 +159,10 @@ struct mv_req_hash_ctx {
int first_hash; /* marks that we don't have previous state */
int last_chunk; /* marks that this is the 'final' request */
int extra_bytes;/* unprocessed bytes in buffer */
+   int digestsize; /* size of the digest */
enum hash_op op;
int count_add;
+   dma_addr_t result_dma;
 };
 
 static void mv_completion_timer_callback(unsigned long unused)
@@ -497,9 +499,17 @@ static void mv_init_hash_config(struct ahash_request *req)
 
mv_tdma_separator();
 
-   /* XXX: this fixes some ugly register fuckup bug in the tdma engine
-*  (no need to sync since the data is ignored anyway) */
-   mv_tdma_memcpy(cpg-sa_sram_dma, cpg-sram_phys + SRAM_CONFIG, 1);
+   if (req-result) {
+   req_ctx-result_dma = dma_map_single(cpg-dev, req-result,
+   req_ctx-digestsize, DMA_FROM_DEVICE);
+   mv_tdma_memcpy(req_ctx-result_dma,
+   cpg-sram_phys + SRAM_DIGEST_BUF, 
req_ctx-digestsize);
+   } else {
+   /* XXX: this fixes some ugly register fuckup bug in the tdma 
engine
+*  (no need to sync since the data is ignored anyway) */
+   mv_tdma_memcpy(cpg-sa_sram_dma,
+   cpg-sram_phys + SRAM_CONFIG, 1);
+   }
 
/* GO */
mv_setup_timer();
@@ -546,9 +556,17 @@ static void mv_update_hash_config(void)
 
mv_tdma_separator();
 
-   /* XXX: this fixes some ugly register fuckup bug in the tdma engine
-*  (no need to sync since the data is ignored anyway) */
-   mv_tdma_memcpy(cpg-sa_sram_dma, cpg-sram_phys + SRAM_CONFIG, 1);
+   if (req-result) {
+   req_ctx-result_dma = dma_map_single(cpg-dev, req-result,
+   req_ctx-digestsize, DMA_FROM_DEVICE);
+   mv_tdma_memcpy(req_ctx-result_dma,
+   cpg-sram_phys + SRAM_DIGEST_BUF, 
req_ctx-digestsize);
+   } else {
+   /* XXX: this fixes some ugly register fuckup bug in the tdma 
engine
+*  (no need to sync since the data is ignored anyway) */
+   mv_tdma_memcpy(cpg-sa_sram_dma,
+   cpg-sram_phys + SRAM_CONFIG, 1);
+   }
 
/* GO */
mv_setup_timer();
@@ -615,11 +633,10 @@ static void mv_hash_algo_completion(void)
copy_src_to_buf(cpg-p, ctx-buffer, ctx-extra_bytes);
 
if (likely(ctx-last_chunk)) {
-   if (likely(ctx-count = MAX_HW_HASH_SIZE)) {
-   memcpy(req-result, cpg-sram + SRAM_DIGEST_BUF,
-  crypto_ahash_digestsize(crypto_ahash_reqtfm
-  (req)));
-   } else {
+   dma_unmap_single(cpg-dev, ctx-result_dma,
+   ctx-digestsize, DMA_FROM_DEVICE);
+
+   if (unlikely(ctx-count  MAX_HW_HASH_SIZE)) {
mv_save_digest_state(ctx);
mv_hash_final_fallback(req);
}
@@ -717,6 +734,7 @@ static void mv_start_new_hash_req(struct ahash_request *req)
memset(p, 0, sizeof(struct req_progress));
hw_bytes = req-nbytes + ctx-extra_bytes;
old_extra_bytes = ctx-extra_bytes;
+   ctx-digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
 
ctx-extra_bytes = hw_bytes % SHA1_BLOCK_SIZE;
if (ctx-extra_bytes != 0
-- 
1.7.3.4



[PATCH 10/13] mv_cesa: reorganise mv_start_new_hash_req a bit

2012-05-25 Thread Phil Sutter
Check early whether CESA can be used at all, and exit right away if it can't.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   61 +-
 1 files changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 8e66080..5dba9df 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -804,35 +804,13 @@ static void mv_start_new_hash_req(struct ahash_request 
*req)
else
ctx-extra_bytes = 0;
 
-   p-src_sg = req-src;
-   if (req-nbytes) {
-   BUG_ON(!req-src);
-   p-sg_src_left = req-src-length;
-   }
-
-   if (hw_bytes) {
-   p-hw_nbytes = hw_bytes;
-   p-complete = mv_hash_algo_completion;
-   p-process = mv_update_hash_config;
-
-   if (unlikely(old_extra_bytes)) {
-   dma_sync_single_for_device(cpg-dev, ctx-buffer_dma,
-   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
-   mv_tdma_memcpy(cpg-sram_phys + SRAM_DATA_IN_START,
-   ctx-buffer_dma, old_extra_bytes);
-   p-crypt_len = old_extra_bytes;
+   if (unlikely(!hw_bytes)) { /* too little data for CESA */
+   if (req-nbytes) {
+   p-src_sg = req-src;
+   p-sg_src_left = req-src-length;
+   copy_src_to_buf(p, ctx-buffer + old_extra_bytes,
+   req-nbytes);
}
-
-   if (!mv_dma_map_sg(req-src, req-nbytes, DMA_TO_DEVICE)) {
-   printk(KERN_ERR %s: out of memory\n, __func__);
-   return;
-   }
-
-   setup_data_in();
-   mv_init_hash_config(req);
-   } else {
-   copy_src_to_buf(p, ctx-buffer + old_extra_bytes,
-   ctx-extra_bytes - old_extra_bytes);
if (ctx-last_chunk)
rc = mv_hash_final_fallback(req);
else
@@ -841,7 +819,34 @@ static void mv_start_new_hash_req(struct ahash_request 
*req)
local_bh_disable();
req-base.complete(req-base, rc);
local_bh_enable();
+   return;
}
+
+   if (likely(req-nbytes)) {
+   BUG_ON(!req-src);
+
+   if (!mv_dma_map_sg(req-src, req-nbytes, DMA_TO_DEVICE)) {
+   printk(KERN_ERR %s: out of memory\n, __func__);
+   return;
+   }
+   p-sg_src_left = sg_dma_len(req-src);
+   p-src_sg = req-src;
+   }
+
+   p-hw_nbytes = hw_bytes;
+   p-complete = mv_hash_algo_completion;
+   p-process = mv_update_hash_config;
+
+   if (unlikely(old_extra_bytes)) {
+   dma_sync_single_for_device(cpg-dev, ctx-buffer_dma,
+   SHA1_BLOCK_SIZE, DMA_TO_DEVICE);
+   mv_tdma_memcpy(cpg-sram_phys + SRAM_DATA_IN_START,
+   ctx-buffer_dma, old_extra_bytes);
+   p-crypt_len = old_extra_bytes;
+   }
+
+   setup_data_in();
+   mv_init_hash_config(req);
 }
 
 static int queue_manag(void *data)
-- 
1.7.3.4



[PATCH 06/13] mv_cesa: use TDMA engine for data transfers

2012-05-25 Thread Phil Sutter
Simply choose the same DMA mask value as for mvsdio and ehci.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 arch/arm/plat-orion/common.c |6 +
 drivers/crypto/mv_cesa.c |  214 +-
 2 files changed, 175 insertions(+), 45 deletions(-)

diff --git a/arch/arm/plat-orion/common.c b/arch/arm/plat-orion/common.c
index 74daf5e..dd3a327 100644
--- a/arch/arm/plat-orion/common.c
+++ b/arch/arm/plat-orion/common.c
@@ -916,9 +916,15 @@ static struct resource orion_crypto_resources[] = {
},
 };
 
+static u64 mv_crypto_dmamask = DMA_BIT_MASK(32);
+
 static struct platform_device orion_crypto = {
.name   = mv_crypto,
.id = -1,
+   .dev= {
+   .dma_mask = mv_crypto_dmamask,
+   .coherent_dma_mask = DMA_BIT_MASK(32),
+   },
 };
 
 void __init orion_crypto_init(unsigned long mapbase,
diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4a989ea..e10da2b 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -9,6 +9,7 @@
 #include crypto/aes.h
 #include crypto/algapi.h
 #include linux/crypto.h
+#include linux/dma-mapping.h
 #include linux/interrupt.h
 #include linux/io.h
 #include linux/kthread.h
@@ -20,11 +21,14 @@
 #include crypto/sha.h
 
 #include mv_cesa.h
+#include mv_tdma.h
 
 #define MV_CESAMV-CESA:
 #define MAX_HW_HASH_SIZE   0x
 #define MV_CESA_EXPIRE 500 /* msec */
 
+static int count_sgs(struct scatterlist *, unsigned int);
+
 /*
  * STM:
  *   /---\
@@ -49,7 +53,6 @@ enum engine_status {
  * @src_start: offset to add to src start position (scatter list)
  * @crypt_len: length of current hw crypt/hash process
  * @hw_nbytes: total bytes to process in hw for this request
- * @copy_back: whether to copy data back (crypt) or not (hash)
  * @sg_dst_left:   bytes left dst to process in this scatter list
  * @dst_start: offset to add to dst start position (scatter list)
  * @hw_processed_bytes:number of bytes processed by hw (request).
@@ -70,7 +73,6 @@ struct req_progress {
int crypt_len;
int hw_nbytes;
/* dst mostly */
-   int copy_back;
int sg_dst_left;
int dst_start;
int hw_processed_bytes;
@@ -95,8 +97,10 @@ struct sec_accel_sram {
 } __attribute__((packed));
 
 struct crypto_priv {
+   struct device *dev;
void __iomem *reg;
void __iomem *sram;
+   u32 sram_phys;
int irq;
struct task_struct *queue_th;
 
@@ -113,6 +117,7 @@ struct crypto_priv {
int has_hmac_sha1;
 
struct sec_accel_sram sa_sram;
+   dma_addr_t sa_sram_dma;
 };
 
 static struct crypto_priv *cpg;
@@ -181,6 +186,23 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
 }
 
+static inline bool
+mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
+{
+   int nents = count_sgs(sg, nbytes);
+
+   if (nbytes  dma_map_sg(cpg-dev, sg, nents, dir) != nents)
+   return false;
+   return true;
+}
+
+static inline void
+mv_dma_unmap_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction 
dir)
+{
+   if (nbytes)
+   dma_unmap_sg(cpg-dev, sg, count_sgs(sg, nbytes), dir);
+}
+
 static void compute_aes_dec_key(struct mv_ctx *ctx)
 {
struct crypto_aes_ctx gen_aes_key;
@@ -255,12 +277,66 @@ static void copy_src_to_buf(struct req_progress *p, char 
*dbuf, int len)
}
 }
 
+static void dma_copy_src_to_buf(struct req_progress *p, dma_addr_t dbuf, int 
len)
+{
+   dma_addr_t sbuf;
+   int copy_len;
+
+   while (len) {
+   if (!p-sg_src_left) {
+   /* next sg please */
+   p-src_sg = sg_next(p-src_sg);
+   BUG_ON(!p-src_sg);
+   p-sg_src_left = sg_dma_len(p-src_sg);
+   p-src_start = 0;
+   }
+
+   sbuf = sg_dma_address(p-src_sg) + p-src_start;
+
+   copy_len = min(p-sg_src_left, len);
+   mv_tdma_memcpy(dbuf, sbuf, copy_len);
+
+   p-src_start += copy_len;
+   p-sg_src_left -= copy_len;
+
+   len -= copy_len;
+   dbuf += copy_len;
+   }
+}
+
+static void dma_copy_buf_to_dst(struct req_progress *p, dma_addr_t sbuf, int 
len)
+{
+   dma_addr_t dbuf;
+   int copy_len;
+
+   while (len) {
+   if (!p-sg_dst_left) {
+   /* next sg please */
+   p-dst_sg = sg_next(p-dst_sg);
+   BUG_ON(!p-dst_sg);
+   p-sg_dst_left = sg_dma_len(p-dst_sg);
+   p-dst_start = 0;
+   }
+
+   dbuf = sg_dma_address(p-dst_sg) + p-dst_start;
+
+   copy_len = min(p-sg_dst_left, len

[PATCH 02/13] mv_cesa: minor formatting cleanup, will all make sense soon

2012-05-25 Thread Phil Sutter
This is just to keep formatting changes out of the following commit,
hopefully simplifying it a bit.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   14 ++
 1 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index c305350..3862a93 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -267,12 +267,10 @@ static void mv_process_current_q(int first_block)
}
if (req_ctx-decrypt) {
op.config |= CFG_DIR_DEC;
-   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_dec_key,
-   AES_KEY_LEN);
+   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_dec_key, 
AES_KEY_LEN);
} else {
op.config |= CFG_DIR_ENC;
-   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_enc_key,
-   AES_KEY_LEN);
+   memcpy(cpg-sram + SRAM_DATA_KEY_P, ctx-aes_enc_key, 
AES_KEY_LEN);
}
 
switch (ctx-key_len) {
@@ -333,9 +331,8 @@ static void mv_process_hash_current(int first_block)
}
 
op.mac_src_p =
-   MAC_SRC_DATA_P(SRAM_DATA_IN_START) | MAC_SRC_TOTAL_LEN((u32)
-   req_ctx-
-   count);
+   MAC_SRC_DATA_P(SRAM_DATA_IN_START) |
+   MAC_SRC_TOTAL_LEN((u32)req_ctx-count);
 
setup_data_in();
 
@@ -370,7 +367,8 @@ static void mv_process_hash_current(int first_block)
}
}
 
-   memcpy(cpg-sram + SRAM_CONFIG, op, sizeof(struct sec_accel_config));
+   memcpy(cpg-sram + SRAM_CONFIG, op,
+   sizeof(struct sec_accel_config));
 
/* GO */
mv_setup_timer();
-- 
1.7.3.4



[PATCH 12/13] mv_cesa: drop the now unused process callback

2012-05-25 Thread Phil Sutter
And while here, simplify dequeue_complete_req() a bit.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   21 ++---
 1 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 9afed2d..9a2f413 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -69,7 +69,6 @@ struct req_progress {
struct scatterlist *src_sg;
struct scatterlist *dst_sg;
void (*complete) (void);
-   void (*process) (void);
 
/* src mostly */
int sg_src_left;
@@ -648,25 +647,17 @@ static void mv_hash_algo_completion(void)
 static void dequeue_complete_req(void)
 {
struct crypto_async_request *req = cpg-cur_req;
-   cpg-p.hw_processed_bytes += cpg-p.crypt_len;
-   cpg-p.crypt_len = 0;
 
mv_tdma_clear();
cpg-u32_usage = 0;
 
BUG_ON(cpg-eng_st != ENGINE_W_DEQUEUE);
-   if (cpg-p.hw_processed_bytes  cpg-p.hw_nbytes) {
-   /* process next scatter list entry */
-   cpg-eng_st = ENGINE_BUSY;
-   setup_data_in();
-   cpg-p.process();
-   } else {
-   cpg-p.complete();
-   cpg-eng_st = ENGINE_IDLE;
-   local_bh_disable();
-   req-complete(req, 0);
-   local_bh_enable();
-   }
+
+   cpg-p.complete();
+   cpg-eng_st = ENGINE_IDLE;
+   local_bh_disable();
+   req-complete(req, 0);
+   local_bh_enable();
 }
 
 static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
-- 
1.7.3.4



[PATCH 09/13] mv_cesa: implementing packet chain mode, only aes for now

2012-05-25 Thread Phil Sutter
This introduces a pool of four-byte DMA buffers for security
accelerator config updates.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |  134 --
 drivers/crypto/mv_cesa.h |1 +
 2 files changed, 106 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index bc2692e..8e66080 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -10,6 +10,7 @@
 #include crypto/algapi.h
 #include linux/crypto.h
 #include linux/dma-mapping.h
+#include linux/dmapool.h
 #include linux/interrupt.h
 #include linux/io.h
 #include linux/kthread.h
@@ -27,6 +28,9 @@
 #define MAX_HW_HASH_SIZE   0x
 #define MV_CESA_EXPIRE 500 /* msec */
 
+#define MV_DMA_INIT_POOLSIZE 16
+#define MV_DMA_ALIGN 16
+
 static int count_sgs(struct scatterlist *, unsigned int);
 
 /*
@@ -96,6 +100,11 @@ struct sec_accel_sram {
 #define sa_ivo type.hash.ivo
 } __attribute__((packed));
 
+struct u32_mempair {
+   u32 *vaddr;
+   dma_addr_t daddr;
+};
+
 struct crypto_priv {
struct device *dev;
void __iomem *reg;
@@ -118,6 +127,11 @@ struct crypto_priv {
 
struct sec_accel_sram sa_sram;
dma_addr_t sa_sram_dma;
+
+   struct dma_pool *u32_pool;
+   struct u32_mempair *u32_list;
+   int u32_list_len;
+   int u32_usage;
 };
 
 static struct crypto_priv *cpg;
@@ -189,6 +203,54 @@ static void mv_setup_timer(void)
jiffies + msecs_to_jiffies(MV_CESA_EXPIRE));
 }
 
+#define U32_ITEM(x)(cpg-u32_list[x].vaddr)
+#define U32_ITEM_DMA(x)(cpg-u32_list[x].daddr)
+
+static inline int set_u32_poolsize(int nelem)
+{
+   /* need to increase size first if requested */
+   if (nelem  cpg-u32_list_len) {
+   struct u32_mempair *newmem;
+   int newsize = nelem * sizeof(struct u32_mempair);
+
+   newmem = krealloc(cpg-u32_list, newsize, GFP_KERNEL);
+   if (!newmem)
+   return -ENOMEM;
+   cpg-u32_list = newmem;
+   }
+
+   /* allocate/free dma descriptors, adjusting cpg-u32_list_len on the go 
*/
+   for (; cpg-u32_list_len  nelem; cpg-u32_list_len++) {
+   U32_ITEM(cpg-u32_list_len) = dma_pool_alloc(cpg-u32_pool,
+   GFP_KERNEL, U32_ITEM_DMA(cpg-u32_list_len));
+   if (!U32_ITEM((cpg-u32_list_len)))
+   return -ENOMEM;
+   }
+   for (; cpg-u32_list_len  nelem; cpg-u32_list_len--)
+   dma_pool_free(cpg-u32_pool, U32_ITEM(cpg-u32_list_len - 1),
+   U32_ITEM_DMA(cpg-u32_list_len - 1));
+
+   /* ignore size decreases but those to zero */
+   if (!nelem) {
+   kfree(cpg-u32_list);
+   cpg-u32_list = 0;
+   }
+   return 0;
+}
+
+static inline void mv_tdma_u32_copy(dma_addr_t dst, u32 val)
+{
+   if (unlikely(cpg-u32_usage == cpg-u32_list_len)
+set_u32_poolsize(cpg-u32_list_len  1)) {
+   printk(KERN_ERR MV_CESA resizing poolsize to %d failed\n,
+   cpg-u32_list_len  1);
+   return;
+   }
+   *(U32_ITEM(cpg-u32_usage)) = val;
+   mv_tdma_memcpy(dst, U32_ITEM_DMA(cpg-u32_usage), sizeof(u32));
+   cpg-u32_usage++;
+}
+
 static inline bool
 mv_dma_map_sg(struct scatterlist *sg, int nbytes, enum dma_data_direction dir)
 {
@@ -390,36 +452,13 @@ static void mv_init_crypt_config(struct 
ablkcipher_request *req)
sizeof(struct sec_accel_sram), DMA_TO_DEVICE);
mv_tdma_memcpy(cpg-sram_phys + SRAM_CONFIG, cpg-sa_sram_dma,
sizeof(struct sec_accel_sram));
-
-   mv_tdma_separator();
-   dma_copy_buf_to_dst(cpg-p, cpg-sram_phys + SRAM_DATA_OUT_START, 
cpg-p.crypt_len);
-
-   /* GO */
-   mv_setup_timer();
-   mv_tdma_trigger();
-   writel(SEC_CMD_EN_SEC_ACCL0, cpg-reg + SEC_ACCEL_CMD);
 }
 
 static void mv_update_crypt_config(void)
 {
-   struct sec_accel_config *op = cpg-sa_sram.op;
-
/* update the enc_len field only */
-
-   op-enc_len = cpg-p.crypt_len;
-
-   dma_sync_single_for_device(cpg-dev, cpg-sa_sram_dma + 2 * sizeof(u32),
-   sizeof(u32), DMA_TO_DEVICE);
-   mv_tdma_memcpy(cpg-sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
-   cpg-sa_sram_dma + 2 * sizeof(u32), sizeof(u32));
-
-   mv_tdma_separator();
-   dma_copy_buf_to_dst(cpg-p, cpg-sram_phys + SRAM_DATA_OUT_START, 
cpg-p.crypt_len);
-
-   /* GO */
-   mv_setup_timer();
-   mv_tdma_trigger();
-   writel(SEC_CMD_EN_SEC_ACCL0, cpg-reg + SEC_ACCEL_CMD);
+   mv_tdma_u32_copy(cpg-sram_phys + SRAM_CONFIG + 2 * sizeof(u32),
+   (u32)cpg-p.crypt_len);
 }
 
 static void mv_crypto_algo_completion(void)
@@ -658,6 +697,7 @@ static void dequeue_complete_req(void

[PATCH 05/13] add a driver for the Marvell TDMA engine

2012-05-25 Thread Phil Sutter
This is a DMA engine integrated into the Marvell Kirkwood SoC, designed
to offload data transfers from/to the CESA crypto engine.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 arch/arm/mach-kirkwood/common.c|   33 +++
 arch/arm/mach-kirkwood/include/mach/irqs.h |1 +
 drivers/crypto/Kconfig |5 +
 drivers/crypto/Makefile|3 +-
 drivers/crypto/mv_tdma.c   |  377 
 drivers/crypto/mv_tdma.h   |   50 
 6 files changed, 468 insertions(+), 1 deletions(-)
 create mode 100644 drivers/crypto/mv_tdma.c
 create mode 100644 drivers/crypto/mv_tdma.h

diff --git a/arch/arm/mach-kirkwood/common.c b/arch/arm/mach-kirkwood/common.c
index 3ad0373..adc6eff 100644
--- a/arch/arm/mach-kirkwood/common.c
+++ b/arch/arm/mach-kirkwood/common.c
@@ -269,9 +269,42 @@ void __init kirkwood_uart1_init(void)
 /*
  * Cryptographic Engines and Security Accelerator (CESA)
  /
+static struct resource kirkwood_tdma_res[] = {
+   {
+   .name   = regs deco,
+   .start  = CRYPTO_PHYS_BASE + 0xA00,
+   .end= CRYPTO_PHYS_BASE + 0xA24,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = regs control and error,
+   .start  = CRYPTO_PHYS_BASE + 0x800,
+   .end= CRYPTO_PHYS_BASE + 0x8CF,
+   .flags  = IORESOURCE_MEM,
+   }, {
+   .name   = crypto error,
+   .start  = IRQ_KIRKWOOD_TDMA_ERR,
+   .end= IRQ_KIRKWOOD_TDMA_ERR,
+   .flags  = IORESOURCE_IRQ,
+   },
+};
+
+static u64 mv_tdma_dma_mask = 0xUL;
+
+static struct platform_device kirkwood_tdma_device = {
+   .name   = mv_tdma,
+   .id = -1,
+   .dev= {
+   .dma_mask   = mv_tdma_dma_mask,
+   .coherent_dma_mask  = 0x,
+   },
+   .num_resources  = ARRAY_SIZE(kirkwood_tdma_res),
+   .resource   = kirkwood_tdma_res,
+};
+
 void __init kirkwood_crypto_init(void)
 {
kirkwood_clk_ctrl |= CGC_CRYPTO;
+   platform_device_register(kirkwood_tdma_device);
orion_crypto_init(CRYPTO_PHYS_BASE, KIRKWOOD_SRAM_PHYS_BASE,
  KIRKWOOD_SRAM_SIZE, IRQ_KIRKWOOD_CRYPTO);
 }
diff --git a/arch/arm/mach-kirkwood/include/mach/irqs.h 
b/arch/arm/mach-kirkwood/include/mach/irqs.h
index 2bf8161..a66aa3f 100644
--- a/arch/arm/mach-kirkwood/include/mach/irqs.h
+++ b/arch/arm/mach-kirkwood/include/mach/irqs.h
@@ -51,6 +51,7 @@
 #define IRQ_KIRKWOOD_GPIO_HIGH_16_23   41
 #define IRQ_KIRKWOOD_GE00_ERR  46
 #define IRQ_KIRKWOOD_GE01_ERR  47
+#define IRQ_KIRKWOOD_TDMA_ERR  49
 #define IRQ_KIRKWOOD_RTC53
 
 /*
diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 1092a77..17becf3 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -159,6 +159,10 @@ config CRYPTO_GHASH_S390
 
  It is available as of z196.
 
+config CRYPTO_DEV_MV_TDMA
+   tristate
+   default no
+
 config CRYPTO_DEV_MV_CESA
tristate Marvell's Cryptographic Engine
depends on PLAT_ORION
@@ -166,6 +170,7 @@ config CRYPTO_DEV_MV_CESA
select CRYPTO_AES
select CRYPTO_BLKCIPHER2
select CRYPTO_HASH
+   select CRYPTO_DEV_MV_TDMA
help
  This driver allows you to utilize the Cryptographic Engines and
  Security Accelerator (CESA) which can be found on the Marvell Orion
diff --git a/drivers/crypto/Makefile b/drivers/crypto/Makefile
index 0139032..65806e8 100644
--- a/drivers/crypto/Makefile
+++ b/drivers/crypto/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_CRYPTO_DEV_GEODE) += geode-aes.o
 obj-$(CONFIG_CRYPTO_DEV_NIAGARA2) += n2_crypto.o
 n2_crypto-y := n2_core.o n2_asm.o
 obj-$(CONFIG_CRYPTO_DEV_HIFN_795X) += hifn_795x.o
+obj-$(CONFIG_CRYPTO_DEV_MV_TDMA) += mv_tdma.o
 obj-$(CONFIG_CRYPTO_DEV_MV_CESA) += mv_cesa.o
 obj-$(CONFIG_CRYPTO_DEV_TALITOS) += talitos.o
 obj-$(CONFIG_CRYPTO_DEV_FSL_CAAM) += caam/
@@ -14,4 +15,4 @@ obj-$(CONFIG_CRYPTO_DEV_OMAP_AES) += omap-aes.o
 obj-$(CONFIG_CRYPTO_DEV_PICOXCELL) += picoxcell_crypto.o
 obj-$(CONFIG_CRYPTO_DEV_S5P) += s5p-sss.o
 obj-$(CONFIG_CRYPTO_DEV_TEGRA_AES) += tegra-aes.o
-obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
\ No newline at end of file
+obj-$(CONFIG_CRYPTO_DEV_UX500) += ux500/
diff --git a/drivers/crypto/mv_tdma.c b/drivers/crypto/mv_tdma.c
new file mode 100644
index 000..aa5316a
--- /dev/null
+++ b/drivers/crypto/mv_tdma.c
@@ -0,0 +1,377 @@
+/*
+ * Support for Marvell's TDMA engine found on Kirkwood chips,
+ * used exclusively by the CESA crypto accelerator.
+ *
+ * Based on unpublished code for IDMA written by Sebastian Siewior.
+ *
+ * Copyright (C) 2012 Phil Sutter phil.sut...@viprinet.com
+ * License

[PATCH 03/13] mv_cesa: prepare the full sram config in dram

2012-05-25 Thread Phil Sutter
This way, reconfiguring the cryptographic accelerator consists of a
single step (a memcpy here), which in the future can be done by the tdma
engine.

This patch introduces some ugly IV copying, necessary for input buffers
above 1920 bytes. But this will go away later.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   83 -
 1 files changed, 52 insertions(+), 31 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 3862a93..68b83d8 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -76,6 +76,24 @@ struct req_progress {
int hw_processed_bytes;
 };
 
+struct sec_accel_sram {
+   struct sec_accel_config op;
+   union {
+   struct {
+   u32 key[8];
+   u32 iv[4];
+   } crypt;
+   struct {
+   u32 ivi[5];
+   u32 ivo[5];
+   } hash;
+   } type;
+#define sa_key type.crypt.key
+#define sa_iv  type.crypt.iv
+#define sa_ivi type.hash.ivi
+#define sa_ivo type.hash.ivo
+} __attribute__((packed));
+
 struct crypto_priv {
void __iomem *reg;
void __iomem *sram;
@@ -93,6 +111,8 @@ struct crypto_priv {
int sram_size;
int has_sha1;
int has_hmac_sha1;
+
+   struct sec_accel_sram sa_sram;
 };
 
 static struct crypto_priv *cpg;
@@ -250,48 +270,49 @@ static void mv_process_current_q(int first_block)
	struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
	struct mv_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
	struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
-	struct sec_accel_config op;
+	struct sec_accel_config *op = &cpg->sa_sram.op;
 
	switch (req_ctx->op) {
	case COP_AES_ECB:
-		op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
+		op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_ECB;
		break;
	case COP_AES_CBC:
	default:
-		op.config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
-		op.enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
+		op->config = CFG_OP_CRYPT_ONLY | CFG_ENCM_AES | CFG_ENC_MODE_CBC;
+		op->enc_iv = ENC_IV_POINT(SRAM_DATA_IV) |
			ENC_IV_BUF_POINT(SRAM_DATA_IV_BUF);
-		if (first_block)
-			memcpy(cpg->sram + SRAM_DATA_IV, req->info, 16);
+		if (!first_block)
+			memcpy(req->info, cpg->sram + SRAM_DATA_IV_BUF, 16);
+		memcpy(cpg->sa_sram.sa_iv, req->info, 16);
		break;
	}
	if (req_ctx->decrypt) {
-		op.config |= CFG_DIR_DEC;
-		memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_dec_key, AES_KEY_LEN);
+		op->config |= CFG_DIR_DEC;
+		memcpy(cpg->sa_sram.sa_key, ctx->aes_dec_key, AES_KEY_LEN);
	} else {
-		op.config |= CFG_DIR_ENC;
-		memcpy(cpg->sram + SRAM_DATA_KEY_P, ctx->aes_enc_key, AES_KEY_LEN);
+		op->config |= CFG_DIR_ENC;
+		memcpy(cpg->sa_sram.sa_key, ctx->aes_enc_key, AES_KEY_LEN);
	}
 
	switch (ctx->key_len) {
	case AES_KEYSIZE_128:
-		op.config |= CFG_AES_LEN_128;
+		op->config |= CFG_AES_LEN_128;
		break;
	case AES_KEYSIZE_192:
-		op.config |= CFG_AES_LEN_192;
+		op->config |= CFG_AES_LEN_192;
		break;
	case AES_KEYSIZE_256:
-		op.config |= CFG_AES_LEN_256;
+		op->config |= CFG_AES_LEN_256;
		break;
	}
-	op.enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
+	op->enc_p = ENC_P_SRC(SRAM_DATA_IN_START) |
		ENC_P_DST(SRAM_DATA_OUT_START);
-	op.enc_key_p = SRAM_DATA_KEY_P;
+	op->enc_key_p = SRAM_DATA_KEY_P;
 
	setup_data_in();
-	op.enc_len = cpg->p.crypt_len;
-	memcpy(cpg->sram + SRAM_CONFIG, &op,
-			sizeof(struct sec_accel_config));
+	op->enc_len = cpg->p.crypt_len;
+	memcpy(cpg->sram + SRAM_CONFIG, &cpg->sa_sram,
+			sizeof(struct sec_accel_sram));
 
	/* GO */
	mv_setup_timer();
@@ -315,30 +336,30 @@ static void mv_process_hash_current(int first_block)
	const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
	struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
	struct req_progress *p = &cpg->p;
-	struct sec_accel_config op = { 0 };
+	struct sec_accel_config *op = &cpg->sa_sram.op;
	int is_last;
 
	switch (req_ctx->op) {
	case COP_SHA1:
	default:
-		op.config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
+		op->config = CFG_OP_MAC_ONLY | CFG_MACM_SHA1;
		break;
	case COP_HMAC_SHA1:
-		op.config

RFC: support for MV_CESA with TDMA

2012-05-25 Thread Phil Sutter
Hi,

The following patch series adds support for the TDMA engine built into
Marvell's Kirkwood-based SoCs, and enhances mv_cesa.c in order to use it
for speeding up crypto operations. Kirkwood hardware contains a security
accelerator, which can control DMA as well as crypto engines. It allows
for operation with minimal software intervention, which the following
patches implement: using a chain of DMA descriptors, data input,
configuration, engine startup and data output repeat fully automatically
until the whole input data has been handled.
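
For illustration, a single transfer descriptor as consumed by the TDMA
engine boils down to four 32-bit words; the layout below is a sketch of
what mv_tdma.c builds and chains together (field names are mine, derived
from how the code uses them, not taken from any datasheet):

#include <linux/types.h>

/* Sketch of one TDMA transfer descriptor, four 32-bit words read
 * directly by the engine. The driver strings these together via
 * 'next' so that input data, SRAM configuration and output data are
 * moved back and forth without CPU involvement. */
struct tdma_desc {
	u32 count;	/* number of bytes to transfer */
	u32 src;	/* DMA source address (DRAM or CESA SRAM) */
	u32 dst;	/* DMA destination address */
	u32 next;	/* DMA address of next descriptor, 0 = end of chain */
} __attribute__((packed));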

The point for this being RFC is backwards-compatibility: earlier
hardware (Orion) ships a (slightly) different DMA engine (IDMA) along
with the same crypto engine, so in fact mv_cesa.c is in use on these
platforms, too. But since I don't possess hardware of this kind, I am
not able to make this code IDMA-compatible. Also, due to the quite
massive reorganisation of code flow, I don't really see how to make TDMA
support optional in mv_cesa.c.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/13] mv_cesa: do not use scatterlist iterators

2012-05-25 Thread Phil Sutter
The big problem is that they cannot be used to iterate over DMA-mapped
scatterlists, so get rid of them in order to add DMA functionality to
mv_cesa.
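
To illustrate why (this is not part of the patch): once the buffers are
mapped for DMA, the driver has to walk the list with the DMA accessors
rather than the kmap()-based sg_miter interface, roughly like this
sketch:

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Sketch only: iterate a DMA-mapped scatterlist. sg_miter_*() would
 * kmap() the pages, which is of no use once what we need are bus
 * addresses for the DMA engine. */
static void walk_mapped_sg(struct device *dev, struct scatterlist *sgl,
			   int nents)
{
	struct scatterlist *sg;
	int i, mapped = dma_map_sg(dev, sgl, nents, DMA_TO_DEVICE);

	for_each_sg(sgl, sg, mapped, i) {
		/* hand sg_dma_address(sg) / sg_dma_len(sg) to the engine */
	}

	dma_unmap_sg(dev, sgl, nents, DMA_TO_DEVICE);
}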

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   57 ++---
 1 files changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 3cc9237..c305350 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -43,8 +43,8 @@ enum engine_status {
 
 /**
  * struct req_progress - used for every crypt request
- * @src_sg_it: sg iterator for src
- * @dst_sg_it: sg iterator for dst
+ * @src_sg:	sg list for src
+ * @dst_sg:	sg list for dst
  * @sg_src_left:   bytes left in src to process (scatter list)
  * @src_start: offset to add to src start position (scatter list)
  * @crypt_len: length of current hw crypt/hash process
@@ -59,8 +59,8 @@ enum engine_status {
  * track of progress within current scatterlist.
  */
 struct req_progress {
-   struct sg_mapping_iter src_sg_it;
-   struct sg_mapping_iter dst_sg_it;
+   struct scatterlist *src_sg;
+   struct scatterlist *dst_sg;
void (*complete) (void);
void (*process) (int is_first);
 
@@ -210,19 +210,19 @@ static int mv_setkey_aes(struct crypto_ablkcipher *cipher, const u8 *key,
 
 static void copy_src_to_buf(struct req_progress *p, char *dbuf, int len)
 {
-	int ret;
	void *sbuf;
	int copy_len;
 
	while (len) {
		if (!p->sg_src_left) {
-			ret = sg_miter_next(&p->src_sg_it);
-			BUG_ON(!ret);
-			p->sg_src_left = p->src_sg_it.length;
+			/* next sg please */
+			p->src_sg = sg_next(p->src_sg);
+			BUG_ON(!p->src_sg);
+			p->sg_src_left = p->src_sg->length;
			p->src_start = 0;
		}
 
-		sbuf = p->src_sg_it.addr + p->src_start;
+		sbuf = sg_virt(p->src_sg) + p->src_start;
 
		copy_len = min(p->sg_src_left, len);
		memcpy(dbuf, sbuf, copy_len);
@@ -305,9 +305,6 @@ static void mv_crypto_algo_completion(void)
	struct ablkcipher_request *req = ablkcipher_request_cast(cpg->cur_req);
	struct mv_req_ctx *req_ctx = ablkcipher_request_ctx(req);
 
-	sg_miter_stop(&cpg->p.src_sg_it);
-	sg_miter_stop(&cpg->p.dst_sg_it);
-
	if (req_ctx->op != COP_AES_CBC)
		return ;
 
@@ -437,7 +434,6 @@ static void mv_hash_algo_completion(void)
 
	if (ctx->extra_bytes)
		copy_src_to_buf(&cpg->p, ctx->buffer, ctx->extra_bytes);
-	sg_miter_stop(&cpg->p.src_sg_it);
 
	if (likely(ctx->last_chunk)) {
		if (likely(ctx->count <= MAX_HW_HASH_SIZE)) {
@@ -457,7 +453,6 @@ static void dequeue_complete_req(void)
 {
	struct crypto_async_request *req = cpg->cur_req;
	void *buf;
-	int ret;
	cpg->p.hw_processed_bytes += cpg->p.crypt_len;
	if (cpg->p.copy_back) {
		int need_copy_len = cpg->p.crypt_len;
@@ -466,14 +461,14 @@ static void dequeue_complete_req(void)
			int dst_copy;
 
			if (!cpg->p.sg_dst_left) {
-				ret = sg_miter_next(&cpg->p.dst_sg_it);
-				BUG_ON(!ret);
-				cpg->p.sg_dst_left = cpg->p.dst_sg_it.length;
+				/* next sg please */
+				cpg->p.dst_sg = sg_next(cpg->p.dst_sg);
+				BUG_ON(!cpg->p.dst_sg);
+				cpg->p.sg_dst_left = cpg->p.dst_sg->length;
				cpg->p.dst_start = 0;
			}
 
-			buf = cpg->p.dst_sg_it.addr;
-			buf += cpg->p.dst_start;
+			buf = sg_virt(cpg->p.dst_sg) + cpg->p.dst_start;
 
			dst_copy = min(need_copy_len, cpg->p.sg_dst_left);
 
@@ -523,7 +518,6 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
 static void mv_start_new_crypt_req(struct ablkcipher_request *req)
 {
	struct req_progress *p = &cpg->p;
-	int num_sgs;
 
	cpg->cur_req = &req->base;
	memset(p, 0, sizeof(struct req_progress));
@@ -532,11 +526,14 @@ static void mv_start_new_crypt_req(struct ablkcipher_request *req)
	p->process = mv_process_current_q;
	p->copy_back = 1;
 
-	num_sgs = count_sgs(req->src, req->nbytes);
-	sg_miter_start(&p->src_sg_it, req->src, num_sgs, SG_MITER_FROM_SG);
-
-	num_sgs = count_sgs(req->dst, req->nbytes);
-	sg_miter_start(&p->dst_sg_it, req->dst, num_sgs, SG_MITER_TO_SG);
+	p->src_sg = req->src;
+	p->dst_sg = req->dst;
+	if (req->nbytes) {
+		BUG_ON(!req->src);
+		BUG_ON(!req->dst);
+		p

Re: Problem with mv_cesa

2012-03-14 Thread Phil Sutter
crypto session */
	if (ioctl(fd, CIOCFSESSION, &sess.ses)) {
		perror("ioctl(CIOCFSESSION)");
		return 1;
	}

	return 0;
}

int main(int argc, char *argv[])
{
	if (argc != 4) {
		printf("Usage: %s UNENCRYPTED_FILE ENCRYPTED_FILE "
		       "NUMBER_OF_BYTES", argv[0]);
		printf("\n\tUNENCRYPTED_FILE: Filename to write the input "
		       "data used to");
		printf("\n\tENCRYPTED_FILE: File to write the encrypted input "
		       "data to");
		printf("\n\tNUMBER_OF_BYTES: The number of bytes of data to "
		       "encrypt");
		printf("\n\n\tThis program will take some data from stack "
		       "garbage and encrypte it\n");
		printf("\n\toutputting the unencrypted and encrypted data to "
		       "files.\n");
		printf("\n\tThen invoke openssl like the follow to "
		       "verify:\n");
		printf("\n\topenssl enc -in UNENCRYPTED_FILE -iv 0 -K 0 "
		       "\\\n");
		printf("\t\t-out OPENSSL_ENCRYPTED_FILE -nopad "
		       "-aes-128-cbc\n");
		return 1;
	}

	int number_of_bytes = atoi(argv[3]);

	int fd = -1;
	fd = open("/dev/crypto", O_RDWR, 0);
	if (fd < 0)
	{
		perror("open(/dev/crypto)");
		return 1;
	}

	/* Set close-on-exec (not really neede here) */
	if (fcntl(fd, F_SETFD, 1) == -1) {
		perror("fcntl(F_SETFD)");
		return 1;
	}

	test_crypto(fd, argv[1], argv[2], number_of_bytes);

	if(close(fd))
	{
		perror("close(fd)");
		return 1;
	}

	return 0;
}



Phil Sutter
Software Engineer

-- 
Viprinet – Never be offline again!


Viprinet auf der CeBIT 2012 – 6.-10. März:
Besuchen Sie uns in Halle 13, Stand D27!
Alle gezeigten Produkte im Livebetrieb,
zahlreiche Beispielapplikationen. Gerne
schicken wir Ihnen kostenlose Eintrittskarten.

Viprinet at CeBIT 2012, March 6 to 10
in Hannover, Germany. Come and visit us
at Hall 13, Booth D27! All exhibits shown
live, many sample applications. We’ll be
happy to send you free admission tickets.



Viprinet GmbH
Mainzer Str. 43
55411 Bingen am Rhein
Germany

Phone/Zentrale: +49-6721-49030-0
Direct line/Durchwahl:  +49-6721-49030-134
Fax:+49-6721-49030-209

phil.sut...@viprinet.com
http://www.viprinet.com

Registered office/Sitz der Gesellschaft: Bingen am Rhein
Commercial register/Handelsregister: Amtsgericht Mainz HRB40380
CEO/Geschäftsführer: Simon Kissel
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] crypto: mv_cesa - fix final callback not ignoring input data

2012-02-27 Thread Phil Sutter
Broken by commit 6ef84509f3d439ed2d43ea40080643efec37f54f for users
passing a request with non-zero 'nbytes' field, like e.g. testmgr.

Cc: sta...@kernel.org # 3.0+
Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 597235a..0d40cf6 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -714,6 +714,7 @@ static int mv_hash_final(struct ahash_request *req)
 {
	struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
 
+	ahash_request_set_crypt(req, NULL, req->result, 0);
	mv_update_hash_req_ctx(ctx, 1, 0);
	return mv_handle_req(&req->base);
 }
-- 
1.7.3.4

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: mv_cesa hash functions

2012-02-23 Thread Phil Sutter
Hello,

On Wed, Feb 22, 2012 at 02:03:38PM +0100, Frank wrote:
 After doing some trials with hardware crypto offloading through usermode 
 interfaces (af_alg and cryptodev) to Marvell CESA accelerated ciphers and 
 hash functions with the 3.2.4 kernel's mv_cesa in Debian Wheezy on a Marvell 
 Kirkwood system, I've noticed the following kernel output when I load the 
 mv_cesa kernel module:
 
 [490889.448060] alg: hash: Test 1 failed for mv-sha1
 [490889.452786] : c1 94 3f 2e a2 41 ce 88 d5 47 07 43 c4 a8 17 5d
 [490889.459368] 0010: 77 e8 47 ca
 [490889.464321] alg: hash: Test 1 failed for mv-hmac-sha1
 [490889.469493] : 06 71 4d 7c cc cc b5 cf 1b d6 c7 ab d0 25 c4 21
 [490889.476068] 0010: 66 0b 8e 70

I've tracked down the problem to commit 6ef8450, "crypto: mv_cesa - make
count_sgs() null-pointer proof". So apparently it was me who broke it.

The simplification introduced by that commit assumes a zero 'nbytes'
field in the ahash_request struct passed to mv_hash_final, but
crypto/testmgr.c calls crypto_ahash_update() and crypto_ahash_final()
without altering the ahash_request in between.

Herbert, please clarify: is it intended behaviour that ahash_alg's final
callback ignores possibly present data in the request? If I wanted to
finalise a hash operation with some final data, would I then use the
finup callback instead? (Note the implied question of the actual
difference between the two callbacks.)

If my assumptions are correct, fixing this issue would be as easy as
adding a call to ahash_request_set_crypt() to mv_hash_final(). Or
manually zeroing req->nbytes, but that's probably unclean. Anyway, if
the 'final' callback is always to ignore request data, why doesn't the
API do this already? At least mv_cesa could use a single function for
both callbacks then.
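
To spell out the semantics I'm assuming, here is a sketch of my reading
of the API from a caller's point of view (return codes, async completion
and error handling omitted, so not authoritative):

#include <crypto/hash.h>

/* Sketch: the two ways I expect a digest over one buffer to work. */
static int hash_buf_twice(struct crypto_ahash *tfm, struct scatterlist *sg,
			  unsigned int len, u8 *result)
{
	struct ahash_request *req = ahash_request_alloc(tfm, GFP_KERNEL);

	/* variant 1: update() consumes data, final() only emits the digest */
	crypto_ahash_init(req);
	ahash_request_set_crypt(req, sg, result, len);
	crypto_ahash_update(req);
	ahash_request_set_crypt(req, NULL, result, 0);
	crypto_ahash_final(req);

	/* variant 2: finup() = update() followed by final() in one call */
	crypto_ahash_init(req);
	ahash_request_set_crypt(req, sg, result, len);
	crypto_ahash_finup(req);

	ahash_request_free(req);
	return 0;
}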

 Using SHA1 in a ssl/tls handshake fails in tests with mv_cesa loaded, which 
 might be related to this.

You are using cryptodev-linux for that, right? If so, this should be
unrelated since cryptodev-linux always calls crypto_ahash_final() with
an empty request payload (from cryptodev_cipher.c):

| ahash_request_set_crypt(hdata->async.request, NULL, output, 0);
| 
| ret = crypto_ahash_final(hdata->async.request);

But you might suffer from another problem, which is only present on ARM
machines with VIVT cache and linux >= 2.6.37: due to commit f8b63c1,
"ARM: 6382/1: Remove superfluous flush_kernel_dcache_page()", which
prevents pages being flushed from inside the scatterlist iterator API.
This patch seems to introduce problems in other places (namely NFS), too,
but I sadly did not have time to investigate this further. I will post a
possible (cryptodev-internal) solution to cryptodev-linux-de...@gna.org,
maybe this fixes the problem with openssl.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] crypto: mv_cesa - fix hashing of chunks > 1920 bytes

2011-11-14 Thread Phil Sutter
This was broken by commit 7759995c75ae0cbd4c861582908449f6b6208e7a (yes,
myself). The basic problem here is that, since the digest state is only saved
after the last chunk, the state array is only valid when handling the
first chunk of the next buffer. Broken since linux-3.0.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   12 +++-
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 5c6f56f..dcd8bab 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -343,11 +343,13 @@ static void mv_process_hash_current(int first_block)
		else
			op.config |= CFG_MID_FRAG;
 
-		writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
-		writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
-		writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
-		writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
-		writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
+		if (first_block) {
+			writel(req_ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
+			writel(req_ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
+			writel(req_ctx->state[2], cpg->reg + DIGEST_INITIAL_VAL_C);
+			writel(req_ctx->state[3], cpg->reg + DIGEST_INITIAL_VAL_D);
+			writel(req_ctx->state[4], cpg->reg + DIGEST_INITIAL_VAL_E);
+		}
	}
 
	memcpy(cpg->sram + SRAM_CONFIG, &op, sizeof(struct sec_accel_config));
-- 
1.7.6.1


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Phil Sutter
Herbert,

On Thu, Sep 01, 2011 at 10:14:45PM +0800, Herbert Xu wrote:
 Phil Sutter p...@nwl.cc wrote:
  
  chunksize   af_alg  cryptodev   (100 * cryptodev / af_alg)
  --
  512  4.169 MB/s  7.113 MB/s 171 %
  1024 7.904 MB/s 12.957 MB/s 164 %
  204813.163 MB/s 19.683 MB/s 150 %
  409620.218 MB/s 26.960 MB/s 133 %
  819227.539 MB/s 34.373 MB/s 125 %
  16384   33.730 MB/s 39.997 MB/s 119 %
  32768   37.399 MB/s 42.727 MB/s 114 %
  65536   40.004 MB/s 44.660 MB/s 112 %
 
 Are you maxing out your submission CPU? If not then you're testing
 the latency of the interface, as opposed to the throughput.

Good point. So in order to also test the throughput, I've put my OpenRD
under load:

| stress -c 2 -i 2 -m 2 --vm-bytes 64MB

and ran the tests again:

chunksize   af_alg  cryptodev   (100 * cryptodev / af_alg)
--
512  0.618 MB/s  1.14 MB/s  184 %
1024 1.258 MB/s  2.28 MB/s  181 %
2048 2.453 MB/s  4.39 MB/s  179 %
4096 4.540 MB/s  7.76 MB/s  171 %
8192 7.981 MB/s 11.67 MB/s  146 %
16384   12.543 MB/s 14.08 MB/s  112 %
32768   13.139 MB/s 14.46 MB/s  110 %
65536   14.254 MB/s 15.55 MB/s  109 %

So that means cryptodev-linux is superior in throughput as well as
latency, right? Or is it the lower latency of the interface causing the
higher throughput?
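
For the record, the af_alg side of such a test boils down to roughly the
following (a stripped-down sketch, not the actual benchmark tool; error
checking omitted):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_alg.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/* Minimal AF_ALG cbc(aes) encryption round trip: a real benchmark would
 * loop over sendmsg()/read() with varying chunk sizes and time it. */
int main(void)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type   = "skcipher",
		.salg_name   = "cbc(aes)",
	};
	char key[16] = { 0 }, iv[16] = { 0 }, buf[4096] = { 0 };
	char cbuf[CMSG_SPACE(4) + CMSG_SPACE(sizeof(struct af_alg_iv) + 16)] = { 0 };
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	struct af_alg_iv *alg_iv;
	int tfmfd, opfd;

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));
	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
	opfd = accept(tfmfd, NULL, 0);

	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(4);
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	cmsg = CMSG_NXTHDR(&msg, cmsg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(sizeof(*alg_iv) + 16);
	alg_iv = (void *)CMSG_DATA(cmsg);
	alg_iv->ivlen = 16;
	memcpy(alg_iv->iv, iv, 16);

	sendmsg(opfd, &msg, 0);		/* submit one chunk of plaintext */
	read(opfd, buf, sizeof(buf));	/* read back the ciphertext */

	close(opfd);
	close(tfmfd);
	return 0;
}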

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] mv_cesa: use ablkcipher_request_cast instead of the manual container_of

2011-05-05 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index c99305a..c443246 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -603,9 +603,7 @@ static int queue_manag(void *data)
			if (async_req->tfm->__crt_alg->cra_type !=
			    crypto_ahash_type) {
				struct ablkcipher_request *req =
-					container_of(async_req,
-						     struct ablkcipher_request,
-						     base);
+					ablkcipher_request_cast(async_req);
				mv_start_new_crypt_req(req);
			} else {
				struct ahash_request *req =
-- 
1.7.4.1


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 10/10] mv_cesa: make count_sgs() null-pointer proof

2011-05-05 Thread Phil Sutter
This also makes the dummy scatterlist in mv_hash_final() needless, so
drop it.

XXX: should this routine be made publicly available? There are probably
other users with their own implementations.

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |8 ++--
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index d704ed0..3cf303e 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -133,7 +133,6 @@ struct mv_req_hash_ctx {
	int extra_bytes;	/* unprocessed bytes in buffer */
	enum hash_op op;
	int count_add;
-	struct scatterlist dummysg;
 };
 
 static void compute_aes_dec_key(struct mv_ctx *ctx)
@@ -482,7 +481,7 @@ static int count_sgs(struct scatterlist *sl, unsigned int total_bytes)
	int i = 0;
	size_t cur_len;
 
-	while (1) {
+	while (sl) {
		cur_len = sl[i].length;
		++i;
		if (total_bytes > cur_len)
@@ -711,10 +710,7 @@ static int mv_hash_update(struct ahash_request *req)
 static int mv_hash_final(struct ahash_request *req)
 {
	struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
-	/* dummy buffer of 4 bytes */
-	sg_init_one(&ctx->dummysg, ctx->buffer, 4);
-	/* I think I'm allowed to do that... */
-	ahash_request_set_crypt(req, &ctx->dummysg, req->result, 0);
+
	mv_update_hash_req_ctx(ctx, 1, 0);
	return mv_handle_req(&req->base);
 }
-- 
1.7.4.1


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/10] mv_cesa: fill inner/outer IV fields only in HMAC case

2011-05-05 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index de09303..c1925c2 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -296,6 +296,7 @@ static void mv_crypto_algo_completion(void)
 static void mv_process_hash_current(int first_block)
 {
	struct ahash_request *req = ahash_request_cast(cpg->cur_req);
+	const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
	struct mv_req_hash_ctx *req_ctx = ahash_request_ctx(req);
	struct req_progress *p = &cpg->p;
	struct sec_accel_config op = { 0 };
@@ -308,6 +309,8 @@ static void mv_process_hash_current(int first_block)
		break;
	case COP_HMAC_SHA1:
		op.config = CFG_OP_MAC_ONLY | CFG_MACM_HMAC_SHA1;
+		memcpy(cpg->sram + SRAM_HMAC_IV_IN,
+				tfm_ctx->ivs, sizeof(tfm_ctx->ivs));
		break;
	}
 
@@ -510,7 +513,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)
 {
	struct req_progress *p = &cpg->p;
	struct mv_req_hash_ctx *ctx = ahash_request_ctx(req);
-	const struct mv_tfm_hash_ctx *tfm_ctx = crypto_tfm_ctx(req->base.tfm);
	int num_sgs, hw_bytes, old_extra_bytes, rc;
	cpg->cur_req = &req->base;
	memset(p, 0, sizeof(struct req_progress));
@@ -523,8 +525,6 @@ static void mv_start_new_hash_req(struct ahash_request *req)
		p->crypt_len = ctx->extra_bytes;
	}
 
-	memcpy(cpg->sram + SRAM_HMAC_IV_IN, tfm_ctx->ivs, sizeof(tfm_ctx->ivs));
 
	if (unlikely(!ctx->first_hash)) {
		writel(ctx->state[0], cpg->reg + DIGEST_INITIAL_VAL_A);
		writel(ctx->state[1], cpg->reg + DIGEST_INITIAL_VAL_B);
-- 
1.7.4.1


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/10] mv_cesa: print a warning when registration of AES algos fail

2011-05-05 Thread Phil Sutter

Signed-off-by: Phil Sutter phil.sut...@viprinet.com
---
 drivers/crypto/mv_cesa.c |   10 --
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/mv_cesa.c b/drivers/crypto/mv_cesa.c
index 4aac294..c018cd0 100644
--- a/drivers/crypto/mv_cesa.c
+++ b/drivers/crypto/mv_cesa.c
@@ -1061,12 +1061,18 @@ static int mv_probe(struct platform_device *pdev)
	writel(SRAM_CONFIG, cpg->reg + SEC_ACCEL_DESC_P0);
 
	ret = crypto_register_alg(&mv_aes_alg_ecb);
-	if (ret)
+	if (ret) {
+		printk(KERN_WARNING MV_CESA
+		       "Could not register aes-ecb driver\n");
		goto err_irq;
+	}
 
	ret = crypto_register_alg(&mv_aes_alg_cbc);
-	if (ret)
+	if (ret) {
+		printk(KERN_WARNING MV_CESA
+		       "Could not register aes-cbc driver\n");
		goto err_unreg_ecb;
+	}
 
	ret = crypto_register_ahash(&mv_sha1_alg);
	if (ret == 0)
-- 
1.7.4.1


--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: aead: driver side documentation

2011-04-05 Thread Phil Sutter
Hi,

On Mon, Apr 04, 2011 at 08:35:43PM -0500, Kim Phillips wrote:
 On Mon, 4 Apr 2011 19:03:37 +0200
 Phil Sutter p...@nwl.cc wrote:
 
  I would like to enhance drivers/crypto/mv_cesa.c by an AEAD algorithm
  (at least authenc(hmac(sha1),cbc(aes))), since the driver is able to do
  both operations in one go.
  
  Unfortunately, I have found little information about this task in
  Documentation/ or the web. Am I missing something? It would be really
  great if you could point me to the right direction here.
 
 use existing drivers for guidance.  The following drivers implement
 those types of algorithms:

Thanks for the hint, although I've already found the sample code. ;)
Was rather looking for something telling me what is crucial and what
options there are. Concrete code tends to just show the solution of a
specific problem. (My five cents on the question of why code is often,
but seldom good, documentation.)

Grepping reveals IPSec (i.e., esp{4,6}.c) as the only user of AEAD so
far. Is this correct?
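
Just to make clear what kind of usage I have in mind, roughly the
following (sketched against the AEAD API as I currently understand it,
so treat with care; the rtattr-encoded authenc setkey blob and all error
handling are glossed over):

#include <crypto/aead.h>
#include <linux/scatterlist.h>

/* Sketch of an in-kernel user of the combined mode; not meant as a
 * reference implementation. */
static int authenc_one_block(struct scatterlist *assoc, unsigned int alen,
			     struct scatterlist *sg, unsigned int clen,
			     u8 *iv, const u8 *key, unsigned int keylen)
{
	struct crypto_aead *tfm;
	struct aead_request *req;
	int ret;

	tfm = crypto_alloc_aead("authenc(hmac(sha1),cbc(aes))", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	crypto_aead_setauthsize(tfm, 20);
	crypto_aead_setkey(tfm, key, keylen);	/* rtattr-wrapped enc+auth keys */

	req = aead_request_alloc(tfm, GFP_KERNEL);
	aead_request_set_crypt(req, sg, sg, clen, iv);	/* in-place */
	aead_request_set_assoc(req, assoc, alen);

	ret = crypto_aead_encrypt(req);		/* may return -EINPROGRESS */

	aead_request_free(req);
	crypto_free_aead(tfm);
	return ret;
}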

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


aead: driver side documentation

2011-04-04 Thread Phil Sutter
Dear list,

I would like to enhance drivers/crypto/mv_cesa.c by an AEAD algorithm
(at least authenc(hmac(sha1),cbc(aes))), since the driver is able to do
both operations in one go.

Unfortunately, I have found little information about this task in
Documentation/ or the web. Am I missing something? It would be really
great if you could point me to the right direction here.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Improving SHA-1 performance with Intel SSE3

2010-12-23 Thread Phil Sutter
Dear list,

I am doing performance tests on an Intel i5 (661, I think) based machine.
Thanks to the AES-NI extensions, I am able to get a throughput of about
500 MB/s when doing AES-256.

But for TLS, hashing performance is important, too. SSE4.2 provides no
equivalent extension for SHA-1, so that needs to be done purely in
software - with a resulting throughput of about 200 MB/s.

Given a situation where I need to both hash and encrypt some plaintext,
throughput drops even further to about 180 MB/s, since both operations
need to be done sequentially (this is all done on a single core, btw).
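
(Back-of-the-envelope: running the two passes strictly back to back on
one core should give about 1 / (1/500 + 1/200) = 143 MB/s, so the
measured 180 MB/s is even a bit better than plain serialisation would
predict.)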

So despite the very stunning AES performance, I guess one has a hard
time saturating the wire (full-duplex) with TLS. Is that correct so far,
or am I getting something wrong here?

The actual question (indeed related to the subject) is this: are there
any implementations/plans on using SSE3 to speed up SHA-1 as stated in
[1]? Do you see any possible problems when trying to do so (regarding
e.g. SSE3 detection or something)?

Greetings, Phil

1: 
http://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1/?wapkw=(sha1+optimization)
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: padlock: fix for non-64byte aligned data

2010-12-07 Thread Phil Sutter
Hi,

On Tue, Dec 07, 2010 at 07:39:56PM +0800, Herbert Xu wrote:
 On Tue, Dec 07, 2010 at 11:41:41AM +0100, Phil Sutter wrote:
 
  Yes, CONFIG_PREEMPT is active in my test system's kernel.
 
 OK, can you see if the problem is still reproducible without
 preemption?

Yes, it is. I just redid the test with a CONFIG_PREEMPT_NONE kernel,
memory corruption occurs every time.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] crypto: padlock: fix for non-64byte aligned data

2010-11-05 Thread Phil Sutter
Herbert,

On Thu, Nov 04, 2010 at 01:46:06PM -0500, Herbert Xu wrote:
  On one hand, the original code is broken in padlock_xcrypt_cbc(): when
  passing the initial bytes to xcryptcbc, 'count' is incorrectly used as
  length. This may trigger prefetch-related issues, but will definitely
  lead to data corruption as xcryptcbc is called again afterwards,
  altering (count - initial) * AES_BLOCK_SIZE bytes after the end of
  'output' in memory.
 
 Ouch, does the attached patch fix this problem for you?

Yes, kind of. With that trivial fix applied, the driver is stable most
of the time.

  Another problem occurs when passing non-64byte aligned buffers, which
  leads to memory corruption in userspace (running applications crash
  randomly). This problem is too subtile for me to have more than vague
  assumptions about it's origin. Anyways, this patch fixes them:
 
 I'd like to determine whether this is due to the previous bug.
 If it still crashes randomly even with my one-line patch please
 let me know.

Yes, it does, but triggering the bug is not really trivial. I've had
best results with a speed testing tool using the asynchronous interface;
memory corruption occurred in each run. The same tool operating
synchronously doesn't crash as readily, but having three or more instances
running in parallel yields the same result.

This problem is so racy that a simple printk statement at the beginning of
padlock_xcrypt_ecb() fixes it. Enclosing the same function's content in
lock_kernel()/unlock_kernel() statements helps as well.

  Instead of handling the odd bytes (i.e., the remainder when dividing
  into prefetch blocks of 64bytes) at the beginning, go for them in the
  end, copying the data out if prefetching would run beyond the page
  boundary.
 
 I'd like to avoid this copying unless the hardware really needs
 it.

As stated initially, I'm not sure why the proposed change fixes
anything. AFAICT, both algorithms are correct in theory. I can't find a
case that breaks the original one reproducibly. So my confidence
regarding the change's validity is based on trial and error. Maybe
someone with more knowledge about the various VIA errata can provide
some insights here.
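
To make the copy-out idea more concrete, this is the concept (just a
sketch, not the actual patch hunk; do_xcrypt_ecb() stands in for the
real REP XCRYPT wrapper and is hypothetical, and the 64-byte prefetch
window is an assumption):

#include <linux/kernel.h>
#include <linux/string.h>
#include <crypto/aes.h>
#include <asm/page.h>

#define PREFETCH_BYTES		64	/* assumed prefetch window */
#define PADLOCK_ALIGNMENT	16

/* hypothetical wrapper around REP XCRYPTECB */
static void do_xcrypt_ecb(const u8 *in, u8 *out, u32 *key, void *cword,
			  int count);

/* If letting the engine run over the trailing blocks in place would
 * allow its prefetch to cross the end of the page, process them from
 * an aligned bounce buffer instead. */
static void crypt_tail(const u8 *in, u8 *out, u32 *key, void *cword,
		       int count)
{
	u8 buf[PREFETCH_BYTES + PADLOCK_ALIGNMENT - 1];
	u8 *tmp = PTR_ALIGN(&buf[0], PADLOCK_ALIGNMENT);

	if (((unsigned long)in & ~PAGE_MASK) + PREFETCH_BYTES > PAGE_SIZE) {
		memcpy(tmp, in, count * AES_BLOCK_SIZE);
		do_xcrypt_ecb(tmp, out, key, cword, count);
	} else {
		do_xcrypt_ecb(in, out, key, cword, count);
	}
}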

 Can you provide some information on the CPU where you're seeing
 this?

This is the faulty one:
| -bash-4.0# cat /proc/cpuinfo 
| processor : 0
| vendor_id : CentaurHauls
| cpu family: 6
| model : 15
| model name: VIA Nano processor l2...@1600mhz
| stepping  : 2
| cpu MHz   : 1599.696
| cache size: 1024 KB
| fdiv_bug  : no
| hlt_bug   : no
| f00f_bug  : no
| coma_bug  : no
| fpu   : yes
| fpu_exception : yes
| cpuid level   : 10
| wp: yes
| flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat clflush acpi mmx fxsr sse sse2 ss tm syscall nx fxsr_opt rdtscp lm 
constant_tsc rep_good pni monitor est tm2 ssse3 cx16 xtpr rng rng_en ace ace_en 
ace2 phe phe_en lahf_lm
| bogomips  : 3199.39
| clflush size  : 64
| cache_alignment   : 128
| address sizes : 36 bits physical, 48 bits virtual
| power management:

I have a C7 for comparison:
| -bash-4.0# cat /proc/cpuinfo 
| processor : 0
| vendor_id : CentaurHauls
| cpu family: 6
| model : 13
| model name: VIA C7 Processor 1500MHz
| stepping  : 0
| cpu MHz   : 1500.100
| cache size: 128 KB
| fdiv_bug  : no
| hlt_bug   : no
| f00f_bug  : no
| coma_bug  : no
| fpu   : yes
| fpu_exception : yes
| cpuid level   : 1
| wp: yes
| flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge cmov pat 
clflush acpi mmx fxsr sse sse2 tm nx pni est tm2 xtpr rng rng_en ace ace_en 
ace2 ace2_en phe phe_en pmm pmm_en
| bogomips  : 3000.20
| clflush size  : 64
| cache_alignment   : 64
| address sizes : 36 bits physical, 32 bits virtual
| power management:

The C7 is definitely not affected by this bug, so your one-liner fixes all
issues for it.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: kcrypto - (yet another) user space interface

2010-06-11 Thread Phil Sutter
Hey,

Seems like I'm stabbing into open wounds. :) First of all, thanks a lot
for your comments.

On Fri, Jun 11, 2010 at 11:08:56AM +0200, Sebastian Andrzej Siewior wrote:
 * Nikos Mavrogiannopoulos | 2010-06-11 09:47:15 [+0200]:
 
 Sebastian Andrzej Siewior wrote:
  * Phil Sutter | 2010-06-10 20:22:29 [+0200]:
 
 The problem with right or wrong is that they are only known afterwards.
 For me the right way to go is _to go_. I can see discussions in this
 least, years ago on talks about the perfect userspace crypto api and
 rejections implementations because they are not perfect enough. I don't
 believe there is such thing as a perfect crypto api. Other operating
 systems have a userspace crypto API (maybe not perfect) but linux
 hasn't. I don't think this is the way to go.
 
 Phil asked me for my opinion and he got it. The fundumention problems
 from what I've seen was the interface:
 - kernel structs which are exposed to userland which limit the
   parameters. For instance the iv was limited to 16 bytes while we have
   allready algos with a much longer iv.
 - the interface was using write()/poll()/read() and get_user_pages(). I
   pointed out Herbert's opinion about this and the alternative. So this
   _was_ allready discsussed.

For me, this project is a rather pragmatical one - this just needs to
get done, and it has to be just perfect enough so my employer finds it
usable. Nice to have if I happen to create the perfect CryptoAPI user
space interface ever (yeah, right ...) but this is unlikely to happen.
For me it's enough to first get the concept right and next make it
stable and functional. After that I'm sure we all can tell better if
it's worth pushing it towards the kernel or leave it as (yet another)
niche product.

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: kcrypto - (yet another) user space interface

2010-06-11 Thread Phil Sutter
Hey Bigeasy,

On Thu, Jun 10, 2010 at 11:14:33PM +0200, Sebastian Andrzej Siewior wrote:
 please take look at [0] and [1]. From README I can tell that those two
 posts are different from you have so far.

Hmm. Indeed, using something like AF_CRYPTO didn't come to my mind so
far. Though I'm not sure if this is good or bad - what's the big
advantage in introducing an address family for something which doesn't
even know addressing as such? No offense here, but all I have is a bunch
of bytes which should be transformed by the kernel. Using socket(),
connect() and sendmsg() for just that purpose seems a bit too fancy to
me.

 You might want to take a look at AF_PACKET interface. It does zero copy
 via a ring buffer interface of pre-mmaped user memory. So no
 get_user_pages() then :)

Yes, I've already thought about using just mmap() for the buffer
exchange. But what I don't like about it is that the shared buffer is
allocated from within the kernel, which brings two drawbacks:

1) Unless the user anyway has to fill a locally allocated buffer with
the data to transform, at least a single copy is needed to get the data
into the kernel buffer. Although get_user_pages() is quite ugly to use,
it's flexible enough to take any buffer directly from user space to
operate on. (Page alignment constraints, especially with hardware crypto
engines, should be another interesting topic in this context.)

2) Space constraints. I can take a hundred 1.5k buffers along with a
single, 64M one. Despite that my PoC actually doesn't work with buffers
above 64k, using only an in-kernel buffer would make things quite a bit
more complicated.

 
 I think that is the way to go.
 
 [0] http://article.gmane.org/gmane.linux.kernel.cryptoapi/2656
 [1] http://article.gmane.org/gmane.linux.kernel.cryptoapi/2658

Reading a bit further from there, splice() is mentioned as another way
of exchanging the data buffers. But although it does roughly what
I've implemented (i.e., using get_user_pages() to fetch the userspace
data), there seems to be no sane way back, at least not according to
the comments in fs/splice.c.

This is actually a limitation of my implementation: all data
transformation is done in situ. Fine for stream ciphers, acceptable for
block ciphers, but probably FUBAR for hashes, I guess.
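
The get_user_pages() path I keep referring to boils down to roughly the
following (a simplified sketch, not the actual kcrypto code; error
handling and releasing the pages again are omitted):

#include <linux/mm.h>
#include <linux/scatterlist.h>

/* Pin a user buffer and turn it into a scatterlist for the Crypto-API. */
static int pin_user_buf(unsigned long uaddr, size_t len,
			struct page **pages, struct scatterlist *sg)
{
	unsigned int off = uaddr & ~PAGE_MASK;
	int i, npages = DIV_ROUND_UP(off + len, PAGE_SIZE);

	if (get_user_pages_fast(uaddr, npages, 1, pages) != npages)
		return -EFAULT;

	sg_init_table(sg, npages);
	for (i = 0; i < npages; i++) {
		unsigned int plen = min_t(size_t, len, PAGE_SIZE - off);

		sg_set_page(&sg[i], pages[i], plen, off);
		len -= plen;
		off = 0;
	}
	return npages;
}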

Greetings, Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RFC: kcrypto - (yet another) user space interface

2010-06-10 Thread Phil Sutter
Hello everyone,

my employer wants to have a lightweight, zero-copy user space interface
to the Crypto-API and I *think* I'm on the right way to realising this.

What I've got so far is just a proof-of-concept, tested only with
cbc(aes), not nearly as generic as I'd like it to be, but with zero-copy of
the actual data buffers (which reside in user space) and asynchronous
operation.

You can check it out via git://nwl.cc/~n0-1/kcrypto.git or simply have a
look at http://nwl.cc/cgi-bin/git/gitweb.cgi?p=kcrypto.git;a=summary. If
you do so, you may in return flame me as much as you like to. :)

I know that it's far from being ready for use as the sole crypto API
user space interface, but I'm convinced as well that it will never be
unless someone with Crypto-API skills points out at least the very basic
design flaws I've already created.

As this is not yet being used in production (of course), I'm totally
open to changes. So taking this as start of a collaborative project
would be perfect for me, too.

Greetings (and sorry for yet another interface approach),
Phil
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html