[PADLOCK] PadLock SHA1 / SHA256 support

2006-07-04 Thread Michal Ludvig
Hi Herbert and others,

attached is a patch with support for VIA C7 crypto engine providing
SHA1/SHA256 digests. It compiles into a new module padlock-sha.ko.

Currently it allocates 1 page for its buffer and if there is more data
to be hashed it falls back to the software SHA implementation. By default it
requests the sha1-generic and sha256-generic modules as fallbacks. The patch
that adds these aliases to sha1.ko and sha256.ko is at
http://www.logix.cz/michal/devel/padlock/kernel-2.6.17-aliases.diff
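
The buffer-then-fallback behaviour described above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the code from the patch: every `sketch_*` name is hypothetical, the software hash is replaced by a stub that merely counts bytes, and the 4096-byte buffer stands in for "1 page".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 4096	/* assumed page size, stands in for "1 page" */

/* Hypothetical context; field names are illustrative only. */
struct sketch_ctx {
	unsigned char buf[SKETCH_BUF_SIZE];
	size_t used;		/* bytes buffered for the hardware path */
	int bypass;		/* non-zero once the software path took over */
	size_t sw_bytes;	/* bytes handed to the stubbed software hash */
};

static void sketch_init(struct sketch_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->used = 0;
	ctx->bypass = 0;
	ctx->sw_bytes = 0;
}

/* Stub standing in for the software hash's update step. */
static void sketch_sw_update(struct sketch_ctx *ctx, const void *data,
			     size_t len)
{
	(void)data;
	ctx->sw_bytes += len;
}

static void sketch_update(struct sketch_ctx *ctx, const void *data, size_t len)
{
	if (!ctx->bypass && len > SKETCH_BUF_SIZE - ctx->used) {
		/* Does not fit: replay the buffered bytes into software. */
		ctx->bypass = 1;
		sketch_sw_update(ctx, ctx->buf, ctx->used);
		ctx->used = 0;
	}
	if (ctx->bypass) {
		sketch_sw_update(ctx, data, len);
		return;
	}
	/* Still fits: keep collecting data for a single hardware call. */
	memcpy(ctx->buf + ctx->used, data, len);
	ctx->used += len;
}
```

Once the software path has taken over, all later updates go straight to it, which would be consistent with the 8192-byte rows in the benchmark showing software-level numbers.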

I also use some accessors as discussed earlier on the linux-crypto list:
http://www.logix.cz/michal/devel/padlock/kernel-2.6.17-accessors.diff

The patch is based on cryptodev-2.6 GIT tree but you'll need to fetch
commit ID 224f611c1639cb6c134a934dae7f7b9f0ac3b540 from Linus' tree to
compile it (or #if 0/#endif the two checks for cpu_has_phe(_enabled) in
padlock_init() function).

Here are some benchmarks from tcrypt.ko (mode=303 for sha1 and mode=304
for sha256):

SHA1
 Block size   Software          PadLock
   16 bytes:  272 cycles/byte   43 cycles/byte
   64 bytes:  126 cycles/byte   15 cycles/byte
  256 bytes:   74 cycles/byte    7 cycles/byte
 1024 bytes:   61 cycles/byte    5 cycles/byte
 2048 bytes:   59 cycles/byte    4 cycles/byte
 4096 bytes:   58 cycles/byte    4 cycles/byte
 8192 bytes:   58 cycles/byte   58 cycles/byte

SHA256
 Block size   Software          PadLock
   16 bytes:  311 cycles/byte   48 cycles/byte
   64 bytes:  144 cycles/byte   16 cycles/byte
  256 bytes:   86 cycles/byte    7 cycles/byte
 1024 bytes:   72 cycles/byte    5 cycles/byte
 2048 bytes:   70 cycles/byte    5 cycles/byte
 4096 bytes:   68 cycles/byte    4 cycles/byte
 8192 bytes:   68 cycles/byte   70 cycles/byte


For 8k blocks the data no longer fits in the one-page buffer, so it falls
back to software; hence the significant slowdown. All results are at
http://www.logix.cz/michal/devel/padlock/kernel-2.6.17-results.txt

Note - to compile this patch on vanilla 2.6.16 and 2.6.17 please apply
http://www.logix.cz/michal/devel/padlock/kernel-2.6.16-padlock-prereq.diff
first (it contains all the above mentioned diffs as well).

The attached patch is also available at
http://www.logix.cz/michal/devel/padlock/kernel-padlock-sha.diff
(just in case it gets wrapped in the email).

Please comment :-)

Michal Ludvig

---

Support for SHA1 / SHA256 algorithms in VIA C7 processors.

Signed-off-by: Michal Ludvig [EMAIL PROTECTED]

Index: linux-2.6.16.13-xenU/drivers/crypto/padlock-sha.c
===
--- /dev/null
+++ linux-2.6.16.13-xenU/drivers/crypto/padlock-sha.c
@@ -0,0 +1,366 @@
+/*
+ * Cryptographic API.
+ *
+ * Support for VIA PadLock hardware crypto engine.
+ *
+ * Copyright (c) 2006  Michal Ludvig [EMAIL PROTECTED]
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/errno.h>
+#include <linux/crypto.h>
+#include <linux/cryptohash.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/scatterlist.h>
+#include <asm/byteorder.h>
+#include "padlock.h"
+
+#define PADLOCK_CRA_PRIORITY   300
+
+#define SHA1_DEFAULT_FALLBACK	"sha1-generic"
+#define SHA1_DIGEST_SIZE	20
+#define SHA1_HMAC_BLOCK_SIZE	64
+
+#define SHA256_DEFAULT_FALLBACK "sha256-generic"
+#define SHA256_DIGEST_SIZE	32
+#define SHA256_HMAC_BLOCK_SIZE	64
+
+static char *sha1_fallback = SHA1_DEFAULT_FALLBACK;
+static char *sha256_fallback = SHA256_DEFAULT_FALLBACK;
+
+module_param(sha1_fallback, charp, 0444);
+module_param(sha256_fallback, charp, 0444);
+
+MODULE_PARM_DESC(sha1_fallback, "Fallback driver for SHA1. Default is " SHA1_DEFAULT_FALLBACK);
+MODULE_PARM_DESC(sha256_fallback, "Fallback driver for SHA256. Default is " SHA256_DEFAULT_FALLBACK);
+
+struct padlock_sha_ctx {
+	char		*data;
+	size_t		used;
+	size_t		data_len;
+	int		bypass;
+	void (*f_sha_padlock)(const char *in, char *out, int count);
+	const char	*fallback_driver_name;
+	struct crypto_tfm *fallback_tfm;
+};
+
+#define CTX(tfm)   ((struct padlock_sha_ctx*)(crypto_tfm_ctx(tfm)))
+
+/* We'll need aligned address on the stack */
+#define NEAREST_ALIGNED(ptr) ((unsigned char *)(ptr) + \
+	((0x10 - ((size_t)(ptr) & 0x0F)) & 0x0F))
+
+static struct crypto_tfm *tfm_sha1, *tfm_sha256;
+static struct crypto_alg sha1_alg, sha256_alg;
+
+static void padlock_sha_bypass(struct crypto_tfm *tfm)
+{
+	if (CTX(tfm)->bypass)
+		return;
+
+	/* We're attempting to use ALG from a module of the same name,
+	 * e.g. sha1 algo from sha1.ko. This could be more intelligent and
+	 * allow e.g. sha1-i586 module to be used instead. Hmm, maybe later.
+	 *
+	 * BTW We assume we get a 

Re: [PADLOCK] PadLock SHA1 / SHA256 support

2006-07-04 Thread Herbert Xu
On Wed, Jul 05, 2006 at 04:57:55PM +1200, Michal Ludvig wrote:
 
> Here are some benchmarks from tcrypt.ko (mode=303 for sha1 and mode=304
> for sha256):

Looks impressive!

The only thing I'd like to tweak is to use the newly added template
mechanism to pick the fallback.  But you don't have to worry about
this.  I'll see if it fits or not first :)

Also, we can probably bring down the 16-byte numbers if we change the
digest layer to cater for a direct digest interface that takes one sg
entry, i.e., if it's one sg entry then feed it directly to the algo's
digest function, otherwise go through the usual sg walker.
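
The dispatch idea above can be illustrated with a small user-space sketch. It makes several assumptions: `struct sketch_seg` is a hypothetical stand-in for a scatterlist entry, the "hash" is a toy polynomial checksum rather than SHA, and the two counters exist only to show which path was taken. It is not the crypto API's digest layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct sketch_seg {
	const unsigned char *ptr;
	size_t len;
};

static int direct_calls, walker_calls;	/* instrumentation only */

/* Toy incremental "hash"; a real digest would go here. */
static void toy_update(unsigned long *state, const unsigned char *in,
		       size_t len)
{
	for (size_t i = 0; i < len; i++)
		*state = *state * 31 + in[i];
}

/* One-shot path: the whole input is contiguous. */
static unsigned long toy_digest_direct(const unsigned char *in, size_t len)
{
	unsigned long state = 0;

	direct_calls++;
	toy_update(&state, in, len);
	return state;
}

static unsigned long sketch_digest(const struct sketch_seg *segs, size_t nsegs)
{
	unsigned long state = 0;

	assert(segs != NULL || nsegs == 0);

	/* A single entry is fed directly to the algorithm's digest. */
	if (nsegs == 1)
		return toy_digest_direct(segs[0].ptr, segs[0].len);

	/* Otherwise walk the entries one by one, as usual. */
	walker_calls++;
	for (size_t i = 0; i < nsegs; i++)
		toy_update(&state, segs[i].ptr, segs[i].len);
	return state;
}
```

Both paths must produce the same result for the same bytes; only the per-call overhead differs, which is where small inputs such as the 16-byte case would be expected to gain.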

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
-
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PADLOCK] PadLock SHA1 / SHA256 support

2006-07-04 Thread Herbert Xu
On Wed, Jul 05, 2006 at 03:21:41PM +1000, herbert wrote:
 
> Also, we can probably bring down the 16-byte numbers if we change the
> digest layer to cater for a direct digest interface that takes one sg
> entry, i.e., if it's one sg entry then feed it directly to the algo's
> digest function, otherwise go through the usual sg walker.

After thinking a bit more about this, IMHO this is definitely worthwhile.
If we do this, then we can basically do hmac(sha1) on IPsec/ESP packets
without copying most of the data at all!

This works because the bulk of HMAC time is spent on digesting the
input with a block-sized prefix based on the key.  So if the given
input has a block worth of headroom, we can copy the prefix there
and just feed the whole thing to the padlock.
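
To make the headroom trick concrete, here is a minimal user-space sketch of preparing the inner HMAC prefix in place. In HMAC the inner hash is taken over the zero-padded key XORed with the byte 0x36, followed by the message. The function name and parameters are hypothetical, the key is assumed to be already reduced to at most one block, and the sketch overwrites whatever was in the headroom, so a real caller would have to make sure those bytes are no longer needed or are restored afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK 64	/* SHA1/SHA256 block size in bytes */

/*
 * Write the inner HMAC prefix into the SKETCH_BLOCK bytes directly in
 * front of the payload. Returns the start of the contiguous region
 * (prefix followed by payload) to hand to a one-shot digest, or NULL
 * if there is not enough headroom or the key is longer than a block.
 */
static unsigned char *sketch_hmac_inner_prefix(unsigned char *payload,
					       size_t headroom,
					       const unsigned char *key,
					       size_t keylen)
{
	unsigned char *start;
	size_t i;

	assert(payload != NULL && key != NULL);
	if (headroom < SKETCH_BLOCK || keylen > SKETCH_BLOCK)
		return NULL;

	start = payload - SKETCH_BLOCK;
	memset(start, 0x36, SKETCH_BLOCK);	/* ipad over the zero padding */
	for (i = 0; i < keylen; i++)
		start[i] ^= key[i];		/* key bytes XOR ipad */
	return start;
}
```

The caller would then digest SKETCH_BLOCK plus the payload length bytes starting at the returned pointer in a single call; the outer HMAC step only hashes one block plus the inner digest, so it is cheap by comparison.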

With IPsec/ESP, it should be easy to make sure that the packet has
64 bytes of headroom in front of the ESP header (Ethernet + IP already
gives 34, so we won't be wasting too much).  The only catch is that
the packet has to be linear.  However, we always linearise it through
skb_cow_data currently because we assume frags to be read-only.  So we
might as well take advantage of that and do everything through one digest
call.

By the time we evolve away from a linear ESP implementation, hopefully
VIA would've produced a proper SHA1 CPU by then :)

Cheers,