Re: [PATCH 2/3] [eSTREAM] stream: Wrapper for eSTREAM ciphers
Hi Herbert,

On Nov 13, 2007 9:43 AM, Herbert Xu [EMAIL PROTECTED] wrote:
> Why couldn't stream ciphers that require an IV just use the blkcipher interface?

Please enlighten me :)

From what I understand, the blkcipher interface provides functions like crypto_blkcipher_set_iv() for the caller to set the IV. What it does is set *iv in blkcipher_tfm to point to the IV buffer. Later this pointer is passed to desc->info and walk->iv. (Some callers, like dm-crypt.c, set desc->info = iv directly though.)

Subsequently, templates like cbc and ctr pick up the IV pointer from walk->iv. For cbc, the IV is XORed into the input block before calling the underlying cipher. For ctr, the IV is used to form a counter block before calling the underlying cipher.

In fact, my stream template patch uses blkcipher in the same way. However, unlike cbc and ctr, stream cannot process the IV. It must pass it to the underlying eSTREAM cipher's setiv() because each cipher's setiv() manipulates the IV differently. (Salsa20 uses it in a counter block; other eSTREAM ciphers mix the IV with the key in their key expansion.)

So blkcipher is indeed fine for stream ciphers as you stated - I even use it in stream. The problem is that cipher_alg and cipher_tfm do not have callbacks for eSTREAM ciphers to expose setiv(). The estream patch tries to address this issue by introducing crypto_estream_type, estream_alg and estream_tfm.

I hope my explanation is clear. The difference in set-IV semantics between block modes and eSTREAM ciphers can be confusing. The patches I've submitted recently are my solution to this problem. It is probably not the best solution. If you or any other expert on this list have other ideas, please discuss and I will try to implement them.

(Although the patches pass tcrypt and seem to embody eSTREAM ciphers rather well, I just realized they are not usable in dm-crypt, as dm-crypt.c explicitly uses crypto_cipher. Bummer!)
Swee Heng - To unsubscribe from this list: send the line unsubscribe linux-crypto in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
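A minimal sketch of the cbc/ctr distinction described in this message, assuming a toy 8-byte block function. All names here (toy_encrypt_block, toy_cbc_encrypt, toy_ctr_crypt, BLK) are hypothetical and are not the kernel's API; the point is only that in both modes it is the mode code, not the underlying cipher, that consumes the IV.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 8 /* toy block size in bytes (assumption for this sketch) */

/* Toy stand-in for a block cipher. NOT secure, illustration only. */
static void toy_encrypt_block(const uint8_t key[BLK], const uint8_t in[BLK],
                              uint8_t out[BLK])
{
    for (size_t i = 0; i < BLK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) + 0x5a);
}

/* CBC: the mode XORs the IV into the input block, then calls the cipher. */
static void toy_cbc_encrypt(const uint8_t key[BLK], uint8_t iv[BLK],
                            const uint8_t *src, uint8_t *dst, size_t len)
{
    assert(len % BLK == 0);
    for (size_t off = 0; off < len; off += BLK) {
        uint8_t tmp[BLK];
        for (size_t i = 0; i < BLK; i++)
            tmp[i] = src[off + i] ^ iv[i];
        toy_encrypt_block(key, tmp, dst + off);
        memcpy(iv, dst + off, BLK); /* ciphertext becomes the next IV */
    }
}

/* CTR: the mode treats the IV as a counter block and encrypts that. */
static void toy_ctr_crypt(const uint8_t key[BLK], uint8_t ctr[BLK],
                          const uint8_t *src, uint8_t *dst, size_t len)
{
    for (size_t off = 0; off < len; off += BLK) {
        uint8_t ks[BLK];
        size_t n = len - off < BLK ? len - off : BLK;
        toy_encrypt_block(key, ctr, ks);
        for (size_t i = 0; i < n; i++)
            dst[off + i] = src[off + i] ^ ks[i];
        /* increment the counter block as a big-endian integer */
        for (int i = BLK - 1; i >= 0; i--)
            if (++ctr[i] != 0)
                break;
    }
}
```

In this sketch the block function never sees the IV. An eSTREAM cipher whose setiv() mixes the IV into its own state has no equivalent place to hook in, which is the gap the message describes.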
Re: [RFC] [crypto] S390-AES add fallback driver.
* Jan Glauber | 2007-11-12 18:04:29 [+]:

> Sebastian, thanks for working on this! Do you know if I need other posted patches that are not yet in cryptodev-2.6 for this to work?

Nope, it should work. I tested it on Herbert's cryptodev tree.

> I'm asking because I'm getting the following crash using tcrypt (aes 192-bit key, ecb-mode) :(

Too bad it doesn't work out of the box :D

> Call Trace:
> ([<02ee5680>] 0x2ee5680)
>  [<0001008292ae>] crypto_ecb_setkey+0x52/0x74 [ecb]
>  [<00010082316e>] setkey_fallback_blk+0x5e/0x98 [aes_s390]
>  [<000100886d76>] test_cipher+0x2da/0x8f0 [tcrypt]
>  [<00010080570e>] init+0x70e/0x1808 [tcrypt]
>  [<000674f4>] sys_init_module+0x148/0x1e64
>  [<000222f8>] sysc_noemu+0x10/0x16
>  [<0211ff6e>] 0x211ff6e
>
> From my limited understanding of the internals of the crypto API, I think this is because crypto_ecb_setkey() calls crypto_cipher_setkey() instead of crypto_blkcipher_setkey(), and the layout of struct blkcipher_tfm has the *iv where cipher_tfm has the setkey(). And oops, since the *iv is zero we have a null pointer call.
>
> But maybe I'm just missing another patch...

Please send me (private if you prefer) a full log and I'll look into it.

> thanks,
> Jan

Sebastian
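The layout clash Jan describes can be sketched with two simplified structs sharing one union slot. These are hypothetical stand-ins (toy_blkcipher_tfm, toy_cipher_tfm, toy_crt), not the kernel's real definitions, and the sketch assumes a typical platform where data and function pointers have the same size and a null pointer is all-zero bits.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: a data pointer comes first. */
struct toy_blkcipher_tfm {
    void *iv;
    int (*setkey)(void *tfm, const unsigned char *key, unsigned int keylen);
};

/* Simplified stand-in: a function pointer comes first. */
struct toy_cipher_tfm {
    int (*setkey)(void *tfm, const unsigned char *key, unsigned int keylen);
    void (*encrypt_one)(void *tfm, unsigned char *dst, const unsigned char *src);
};

/* Both views occupy the same storage, so field 0 of one aliases field 0
 * of the other. */
union toy_crt {
    struct toy_blkcipher_tfm blkcipher;
    struct toy_cipher_tfm cipher;
};

/* Returns 1 if treating this slot as a cipher_tfm and calling setkey()
 * would jump through a null pointer. */
static int cipher_setkey_is_null(const union toy_crt *crt)
{
    assert(crt != NULL);
    return crt->cipher.setkey == NULL;
}
```

If the slot was initialised as a blkcipher with iv == NULL, reading it back through the cipher view yields a null setkey, which matches the "null pointer call" in the trace above.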
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
From: Denys Vlasenko [EMAIL PROTECTED]
Date: Tue, 13 Nov 2007 15:34:33 -0700

> My preferred solution is to make loop unrolling conditional on CONFIG_CC_OPTIMIZE_FOR_SIZE - and this is what is done in my (first) patch (see attached).

This part: The default build is going to be CONFIG_CC_OPTIMIZE_FOR_SIZE basically for everyone; this is what people get by default and this is what every distribution uses. Therefore 99% of folks will get the slowdown.

So in my book this is not an acceptable way to deal with this problem.
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
On Tuesday 13 November 2007 18:41, David Miller wrote:
> From: Denys Vlasenko [EMAIL PROTECTED]
> Date: Tue, 13 Nov 2007 15:34:33 -0700
>
> > My preferred solution is to make loop unrolling conditional on CONFIG_CC_OPTIMIZE_FOR_SIZE - and this is what is done in my (first) patch (see attached).
>
> This part: The default build is going to be CONFIG_CC_OPTIMIZE_FOR_SIZE basically for everyone; this is what people get by default and this is what every distribution uses. Therefore 99% of folks will get the slowdown.
>
> So in my book this is not an acceptable way to deal with this problem.

Loop unrolling here amounts to 25% code growth:

   text    data     bss     dec     hex filename
  21714       0       0   21714    54d2 camellia5.o
  15906       0       0   15906    3e22 camellia5_Os.o

Saving 25% of code size and going 5% slower is a perfectly acceptable tradeoff for some users. NB: I'm not saying all, but some significant part of users would like to be able to have this choice.

If CONFIG_CC_OPTIMIZE_FOR_SIZE is not an acceptable method, do you have other ideas?
--
vda
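As a rough check of the figures quoted in this message, the two text sizes can be compared in both directions. The helper names (pct_saved, pct_growth) are made up for this sketch; only the byte counts 21714 and 15906 come from the thread.

```c
#include <assert.h>

/* Percentage by which `small_sz` is smaller than `big_sz`, relative to big_sz. */
static double pct_saved(unsigned long big_sz, unsigned long small_sz)
{
    assert(big_sz > 0 && small_sz <= big_sz);
    return 100.0 * (double)(big_sz - small_sz) / (double)big_sz;
}

/* Percentage by which `big_sz` is larger than `small_sz`, relative to small_sz. */
static double pct_growth(unsigned long small_sz, unsigned long big_sz)
{
    assert(small_sz > 0 && big_sz >= small_sz);
    return 100.0 * (double)(big_sz - small_sz) / (double)small_sz;
}
```

With these inputs, pct_saved(21714, 15906) comes to a little under 27 and pct_growth(15906, 21714) to a little under 37, so the "25%" in the message is closest to the saving measured against the unrolled build, rounded down.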
Re: [PATCH 2/3] [eSTREAM] stream: Wrapper for eSTREAM ciphers
On Wed, Nov 14, 2007 at 01:25:37AM +0800, Tan Swee Heng wrote:
>
> In fact, my stream template patch uses blkcipher in the same way. However unlike cbc and ctr, stream cannot process the IV. It must pass it to the underlying eSTREAM cipher's setiv() because each cipher's setiv() manipulates the IV differently. (Salsa20 uses it in a counter block; other eSTREAM ciphers mix the IV with the key in their key expansion.)

I think we're talking past each other :)

What I'm suggesting is that you implement the stream ciphers that use an IV directly using the blkcipher interface, and not the cipher interface. That way you can do whatever you want with the IV.

> So blkcipher is indeed fine for stream ciphers as you stated - I even use it in stream. The problem is that cipher_alg and cipher_tfm do not have callbacks for eSTREAM ciphers to expose setiv(). The estream patch tries to address this issue by introducing crypto_estream_type, estream_alg and estream_tfm.

That's right. Apart from Salsa you shouldn't have to use the cipher interface at all. Which means that what the cipher interface lacks is not a problem :)

Salsa can use the cipher interface because deep down it's a block cipher. It's just being used in counter mode.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED]
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
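One way to picture the suggestion is an algorithm interface in which the request descriptor carries the IV straight to the cipher, so the cipher can treat it however it likes. The sketch below is an assumption-laden toy: toy_desc, toy_stream_alg, toy_ctx and the keystream function are all hypothetical, and only loosely echo the idea of blkcipher's desc->info. It is not the kernel API and the "cipher" is not secure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request descriptor: the caller's IV travels with the request. */
struct toy_desc {
    const uint8_t *iv;
    size_t ivlen;
};

/* Hypothetical algorithm interface: crypt() receives the IV itself. */
struct toy_stream_alg {
    void (*setkey)(void *ctx, const uint8_t *key, size_t keylen);
    void (*crypt)(void *ctx, const struct toy_desc *desc,
                  uint8_t *dst, const uint8_t *src, size_t len);
};

struct toy_ctx {
    uint8_t key_sum; /* toy key schedule: one byte of state */
};

static void toy_setkey(void *ctx, const uint8_t *key, size_t keylen)
{
    struct toy_ctx *c = ctx;

    assert(c != NULL);
    c->key_sum = 0;
    for (size_t i = 0; i < keylen; i++)
        c->key_sum = (uint8_t)(c->key_sum + key[i]);
}

/* The cipher mixes key and IV in its own way, then XORs a keystream. */
static void toy_crypt(void *ctx, const struct toy_desc *desc,
                      uint8_t *dst, const uint8_t *src, size_t len)
{
    const struct toy_ctx *c = ctx;
    uint8_t state;

    assert(c != NULL && desc != NULL);
    state = c->key_sum;
    for (size_t i = 0; i < desc->ivlen; i++)
        state = (uint8_t)(state * 31 + desc->iv[i]);
    for (size_t i = 0; i < len; i++) {
        state = (uint8_t)(state * 5 + 1);
        dst[i] = src[i] ^ state;
    }
}

static const struct toy_stream_alg toy_alg = {
    .setkey = toy_setkey,
    .crypt = toy_crypt,
};
```

Because the IV reaches crypt() directly, no separate setiv() callback is needed on the single-block cipher interface, which is the point made in the reply above.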
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
From: Denys Vlasenko [EMAIL PROTECTED]
Date: Tue, 13 Nov 2007 19:47:08 -0700

> If CONFIG_CC_OPTIMIZE_FOR_SIZE is not an acceptable method, do you have other ideas?

Look at ways to make the code run faster without loop unrolling?
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
Hi,

Tue, 13 Nov 2007 19:47:08 -0700 [Subject: Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization] Denys Vlasenko [EMAIL PROTECTED] wrote...

> On Tuesday 13 November 2007 18:41, David Miller wrote:
> > From: Denys Vlasenko [EMAIL PROTECTED]
> > Date: Tue, 13 Nov 2007 15:34:33 -0700
> >
> > > My preferred solution is to make loop unrolling conditional on CONFIG_CC_OPTIMIZE_FOR_SIZE - and this is what is done in my (first) patch (see attached).
> >
> > This part: The default build is going to be CONFIG_CC_OPTIMIZE_FOR_SIZE basically for everyone; this is what people get by default and this is what every distribution uses. Therefore 99% of folks will get the slowdown.
> >
> > So in my book this is not an acceptable way to deal with this problem.
>
> Loop unrolling here amounts to 25% code growth:
>
>    text    data     bss     dec     hex filename
>   21714       0       0   21714    54d2 camellia5.o
>   15906       0       0   15906    3e22 camellia5_Os.o
>
> Saving 25% of code size and going 5% slower is a perfectly acceptable tradeoff for some users. NB: I'm not saying all, but some significant part of users would like to be able to have this choice.

IMHO, if you are going to use camellia on an embedded system, code size will be important.

On the other hand, I think CPU performance is typically restricted on an embedded system, so the performance of the code will be important too...

I'm not sure whether a 5% slowdown is important or not. It will depend on the system.

Regards,
--
Noriaki TAKAMYA
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
On Tuesday 13 November 2007 20:49, David Miller wrote:
> From: Denys Vlasenko [EMAIL PROTECTED]
> Date: Tue, 13 Nov 2007 19:47:08 -0700
>
> > If CONFIG_CC_OPTIMIZE_FOR_SIZE is not an acceptable method, do you have other ideas?
>
> Look at ways to make the code run faster without loop unrolling?

I did it. I noticed that key setup is mostly operating on 64-bit quantities, and provided an alternative implementation which exploits that fact. It's smaller and faster.

However, after I've done that, the question still stands: should I unroll the loop or not? The situation we are in now is exactly the situation I want to avoid:

On Wednesday 07 November 2007 06:22, Denys Vlasenko wrote:
> > Having two versions of the code is unmaintainable. So please either decide that 5% is worth it or isn't.
>
> *I* am happy with 5% speed sacrifice. I'm afraid other people won't be. I just want to escape the vicious cycle of -Os people arguing with -O2 people to no end. I don't want somebody to come later and unroll the loop again. And then me to come and de-unroll it again...

It's better for everybody to recognize that both POVs are valid, and have provisions for tuning the size/speed tradeoff by the user (the person who builds the binary).

That's why I made a patch where unrolling can be enabled by CONFIG_xxx.

I will resubmit the patch without de-unrolling. Meanwhile, I'd like to ask you guys to think about ways to make size/speed tradeoffs selectable at build time.
--
vda
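A build-time switch of the kind asked for here could look roughly like the following. TOY_OPTIMIZE_FOR_SIZE is a hypothetical macro standing in for a CONFIG_xxx option, and toy_round is a made-up round function; neither comes from the camellia patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up round function: XOR with the round key, rotate left by 5. */
static uint32_t toy_round(uint32_t x, uint32_t k)
{
    x ^= k;
    return (x << 5) | (x >> 27);
}

/* Four rounds. Define TOY_OPTIMIZE_FOR_SIZE (e.g. -DTOY_OPTIMIZE_FOR_SIZE)
 * to get the compact loop; leave it undefined for the unrolled variant. */
static uint32_t toy_rounds4(uint32_t x, const uint32_t k[4])
{
    assert(k != NULL);
#ifdef TOY_OPTIMIZE_FOR_SIZE
    for (int i = 0; i < 4; i++)
        x = toy_round(x, k[i]);
#else
    x = toy_round(x, k[0]);
    x = toy_round(x, k[1]);
    x = toy_round(x, k[2]);
    x = toy_round(x, k[3]);
#endif
    return x;
}
```

Both branches compute the same value, so the person building the binary chooses only between size and speed, and there is a single source file to maintain rather than two versions of the code.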
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
From: Denys Vlasenko [EMAIL PROTECTED]
Date: Tue, 13 Nov 2007 22:30:47 -0700

> On Tuesday 13 November 2007 20:49, David Miller wrote:
> > From: Denys Vlasenko [EMAIL PROTECTED]
> > Date: Tue, 13 Nov 2007 19:47:08 -0700
> >
> > > If CONFIG_CC_OPTIMIZE_FOR_SIZE is not an acceptable method, do you have other ideas?
> >
> > Look at ways to make the code run faster without loop unrolling?
>
> I did it. I noticed that key setup is mostly operating on 64-bit quantities, and provided alternative implementation which exploits that fact. It's smaller and faster.

Great, then you don't have to unroll the loop and performance is at least as good as before _and_ you save code space.

It's perfect, you don't need compile time checks or anything silly like that.

Please submit this new version :-)
Re: [camellia-oss:00952] Re: [PATCH 5/5] camellia: de-unrolling, 64bit-ization
On Tuesday 13 November 2007 22:30, Denys Vlasenko wrote:
> I will resubmit the patch without de-unrolling.
> Meanwhile, I'd like to ask you guys to think about ways to make size/speed tradeoffs selectable at build time.

Here is the patch which has loops still unrolled, but otherwise unchanged.

Description:

Use alternative key setup implementation with mostly 64-bit ops if BITS_PER_LONG >= 64. Both much smaller and much faster.

Unify camellia_en/decrypt128/256 into camellia_do_en/decrypt. Code was similar, with just one additional if() we can use same code.

Replace (x & 0xff) with (u8)x, gcc is not smart enough to realize that it can do (x & 0xff) this way (which is smaller at least on i386).

Don't do (x & 0xff) in a few places where x cannot be > 255 anyway:
	t0 = il >> 16;
	v = camellia_sp0222[(t1 >> 8) & 0xff];
il >> 16 is u32, thus (t1 >> 8) is one byte!

Signed-off-by: Denys Vlasenko [EMAIL PROTECTED]
--
vda

diff -urpN linux-2.6.23.1.camellia/crypto/camellia.c linux-2.6.23.1.camellia5/crypto/camellia.c
--- linux-2.6.23.1.camellia/crypto/camellia.c	2007-11-13 22:47:28.0 -0700
+++ linux-2.6.23.1.camellia5/crypto/camellia.c	2007-11-13 22:57:54.0 -0700
@@ -36,6 +36,13 @@
 #include <linux/kernel.h>
 #include <linux/module.h>
 
+#if BITS_PER_LONG >= 64
+
+/* Use alternative implementation with mostly 64-bit ops */
+#include "camellia_64.c"
+
+#else
+
 static const u32 camellia_sp1110[256] = {
 	0x70707000,0x82828200,0x2c2c2c00,0xececec00,
 	0xb3b3b300,0x27272700,0xc0c0c000,0xe5e5e500,
@@ -329,7 +336,6 @@ static const u32 camellia_sp4404[256] =
 /*
  * macros
  */
-
 # define GETU32(v, pt) \
     do { \
 	/* latest breed of gcc is clever enough to use move */ \
@@ -364,63 +370,28 @@ static const u32 camellia_sp4404[256] =
     } while(0)
 
 
+/*
+ * Key setup
+ */
 #define CAMELLIA_F(xl, xr, kl, kr, yl, yr, il, ir, t0, t1)	\
     do {							\
 	il = xl ^ kl;						\
 	ir = xr ^ kr;						\
 	t0 = il >> 16;						\
 	t1 = ir >> 16;						\
-	yl = camellia_sp1110[ir & 0xff]				\
-	    ^ camellia_sp0222[(t1 >> 8) & 0xff]			\
-	    ^ camellia_sp3033[t1 & 0xff]			\
-	    ^ camellia_sp4404[(ir >> 8) & 0xff];		\
-	yr = camellia_sp1110[(t0 >> 8) & 0xff]			\
-	    ^ camellia_sp0222[t0 & 0xff]			\
-	    ^ camellia_sp3033[(il >> 8) & 0xff]			\
-	    ^ camellia_sp4404[il & 0xff];			\
+	yl = camellia_sp1110[(u8)(ir     )]			\
+	   ^ camellia_sp0222[    (t1 >> 8)]			\
+	   ^ camellia_sp3033[(u8)(t1     )]			\
+	   ^ camellia_sp4404[(u8)(ir >> 8)];			\
+	yr = camellia_sp1110[    (t0 >> 8)]			\
+	   ^ camellia_sp0222[(u8)(t0     )]			\
+	   ^ camellia_sp3033[(u8)(il >> 8)]			\
+	   ^ camellia_sp4404[(u8)(il     )];			\
 	yl ^= yr;						\
 	yr = ROR8(yr);						\
 	yr ^= yl;						\
     } while(0)
 
-
-/*
- * for speed up
- *
- */
-#define CAMELLIA_FLS(ll, lr, rl, rr, kll, klr, krl, krr, t0, t1, t2, t3) \
-    do {								\
-	t0 = kll;							\
-	t2 = krr;							\
-	t0 &= ll;							\
-	t2 |= rr;							\
-	rl ^= t2;							\
-	lr ^= ROL1(t0);							\
-	t3 = krl;							\
-	t1 = klr;							\
-	t3 &= rl;							\
-	t1 |= lr;							\
-	ll ^= t1;							\
-	rr ^= ROL1(t3);							\
-    } while(0)
-
-#define CAMELLIA_ROUNDSM(xl, xr, kl, kr, yl, yr, il, ir, t0, t1)	\
-    do {								\
-	ir =  camellia_sp1110[xr & 0xff];				\
-	il =  camellia_sp1110[(xl>>24) & 0xff];				\
-	ir ^= camellia_sp0222[(xr>>24) & 0xff];				\
-	il ^= camellia_sp0222[(xl>>16) & 0xff];				\
-	ir ^= camellia_sp3033[(xr>>16) & 0xff];				\
-	il ^= camellia_sp3033[(xl>>8) & 0xff];				\
-	ir ^= camellia_sp4404[(xr>>8) & 0xff];				\
-	il ^= camellia_sp4404[xl & 0xff];				\
-	il ^= kl;							\
-	ir ^= il ^ kr;							\
-	yl ^= ir;							\
-	yr ^= ROR8(il) ^ ir;						\
-    } while(0)
-
-
 #define SUBKEY_L(INDEX) (subkey[(INDEX)*2])
 #define SUBKEY_R(INDEX) (subkey[(INDEX)*2 + 1])
 
@@ -622,7 +593,7 @@ static void camellia_setup128(const unsi
 	SUBKEY_L(6) = subL[5] ^ subL[7]; /* round 5 */
 	SUBKEY_R(6) = subR[5] ^ subR[7];
 	tl = subL[10] ^ (subR[10] & ~subR[8]);
-	dw = tl & subL[8], /* FL(kl1) */
+	dw = tl & subL[8]; /* FL(kl1) */
 	tr = subR[10] ^ ROL1(dw);
 	SUBKEY_L(7) = subL[6] ^ tl; /* round 6 */
 	SUBKEY_R(7) = subR[6] ^ tr;
@@ -1000,400 +971,150 @@ static void camellia_setup192(const unsi
 }
 
 
-static void camellia_encrypt128(const u32 *subkey, u32 *io_text)
-{
-	u32 il,ir,t0,t1; /* temporary variables */
-
-	u32 io[4];
-
-	/* pre whitening but absorb kw2 */
-	io[0] = io_text[0] ^ SUBKEY_L(0);
-	io[1] = io_text[1] ^ SUBKEY_R(0);
-	io[2] = io_text[2];
-	io[3] = io_text[3];
-
-	/* main iteration */
-	CAMELLIA_ROUNDSM(io[0],io[1],
-			 SUBKEY_L(2),SUBKEY_R(2),
-			 io[2],io[3],il,ir,t0,t1);
-	CAMELLIA_ROUNDSM(io[2],io[3],
-			 SUBKEY_L(3),SUBKEY_R(3),
-			 io[0],io[1],il,ir,t0,t1);
-	CAMELLIA_ROUNDSM(io[0],io[1],
-			 SUBKEY_L(4),SUBKEY_R(4),
-			 io[2],io[3],il,ir,t0,t1);
-	CAMELLIA_ROUNDSM(io[2],io[3],
-			 SUBKEY_L(5),SUBKEY_R(5),
-			 io[0],io[1],il,ir,t0,t1);
-	CAMELLIA_ROUNDSM(io[0],io[1],
-			 SUBKEY_L(6),SUBKEY_R(6),
-			 io[2],io[3],il,ir,t0,t1);
-