Re: [RFC PATCH v2] crypto: Add IV generation algorithms
Hi Milan, On 13 December 2016 at 15:31, Milan Brozwrote: > I think that IV generators should not modify or read encrypted data directly, > it should only generate IV. I was trying to find more information about what you said and how a iv generator should be written. I saw two examples of IV generators too used with AEAD ciphers (crypto/seqiv.c and crypto/echainiv.c) Excerpt from crypto api doc: http://www.chronox.de/crypto-API/crypto/architecture.html#crypto-api-cipher-references-and-priority 2. Now, SEQIV uses the AEAD API function calls to invoke the associated AEAD cipher. In our case, during the instantiation of SEQIV, the cipher handle for GCM is provided to SEQIV. This means that SEQIV invokes AEAD cipher operations with the GCM cipher handle. Here, it says seqiv invokes cipher operations. However the code crypto/seqiv.c does not look similar to how the modes are implemented which is confusing. I was looking for an example of an IV generator used with a regular block cipher and not a AEAD cipher. Could you point me out to some? Thanks, Binoy -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
Jean-Philippe Aumasson wrote: > If a halved version of SipHash can bring significant performance boost > (with 32b words instead of 64b words) with an acceptable security level > (64-bit enough?) then we may design such a version. It would be fairly significant, a 2x speed benefit on a lot of 32-bit machines. First is the fact that a 64-bit SipHash round on a generic 32-bit machine requires not twice as many instructions, but more than three. Consider the core SipHash quarter-round operation: a += b; b = rotate_left(b, k) b ^= a The add and xor are equivalent between 32- and 64-bit rounds; twice the instructions do twice the work. (There's a dependency via the carry bit between the two halves of the add, but it ends up not being on the critical path even in a superscalar implementation.) The problem is the rotates. Although some particularly nice code is possible on 32-bit ARM due to its support for shift-and-xor operations, on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle dependency chain (more in practice because barrel shifters are large and even quad-issue CPUs can't do 4 shifts per cycle): temp_lo = b_lo >> (32-k) temp_hi = b_hi >> (32-k) b_lo <<= k b_hi <<= k b_lo ^= temp_hi b_hi ^= temp_lo The resultant instruction counts and (assuming wide issue) latencies are: 64-bit SipHash "Half" SipHash Inst. Latency Inst. Latency 10 33 2 Quarter round 40 6 12 4 Full round 80 12 24 8 Two rounds 82 13 26 9 Mix in one word 82 13 52 18 Mix in 64 bits 166 26 61 18 Four round finalization + final XOR 248 39 113 36 Hash 64 bits 330 52 165 54 Hash 128 bits 412 65 217 72 Hash 192 bits While the ideal latencies are actually better for the 64-bit algorithm, that requires an unrealistic 6+-wide superscalar implementation that's more than twice as wide as the 64-bit code requires (which is already optimized for quad-issue). For a 1- or 2-wide processor, the instruction counts dominate, and not only does the 64-bit algorithm take 60% more time to mix in the same number of bytes, but the finalization rounds bring the ratio to 2:1 for small inputs. (And I haven't included the possible savings if the input size is an odd number of 32-bit words, such as networking applications which include the source/dest port numbers.) Notes on particular processors: - x86 can do a 64-bit rotate in 3 instructions and 2 cycles using the SHLD/SHRD instructions instead: movl%b_hi, %temp shldl $k, %b_lo, %b_hi shldl $k, %temp, %b_lo ... but as I mentioned the problem is registers. SipHash needs 8 32-bit words plus at least one temporary, and 32-bit x86 has only 7 available. (And compilers can rarely manage to keep more than 6 of them busy.) - 64-bit SipHash is particularly efficient on 32-bit ARM due to its support for shift-and-op instructions. The 64-bit shift and following xor can be done in 4 instructions. So the only benefit is from the reduced finalization. - Double-width adds cost a little more on CPUs like MIPS and RISC-V without condition codes. - Certain particularly crappy uClinux processors with slow shifts (68000, anyone?) really suffer from extra shifts. One *weakly* requested feature: It might simplify some programming interfaces if we could use the same key for multiple hash tables with a 1-word "tweak" (e.g. pointer to the hash table, so it could be assumed non-zero if that helped) to make distinct functions. That would let us more safely use a global key for multiple small hash tables without the need to add code to generate and store key material for each place that an unkeyed hash is replaced. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 1/5] siphash: add cryptographically secure PRF
SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. While SipHash is extremely fast for a cryptographically secure function, it is likely a bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 or SHA1 for creating secure sequence numbers, syn cookies, port numbers, or fast random numbers. SipHash is a faster and more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 and SHA1 with SipHash for these uses is obvious and straight-forward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch-up. Signed-off-by: Jason A. DonenfeldCc: Jean-Philippe Aumasson Cc: Linus Torvalds Cc: Eric Biggers Cc: David Laight --- MAINTAINERS | 7 ++ include/linux/siphash.h | 86 lib/Kconfig.debug | 6 +- lib/Makefile| 5 +- lib/siphash.c | 210 lib/test_siphash.c | 101 +++ 6 files changed, 410 insertions(+), 5 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c diff --git a/MAINTAINERS b/MAINTAINERS index 59c9895d73d5..5d87a8c1056a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11231,6 +11231,13 @@ F: arch/arm/mach-s3c24xx/mach-bast.c F: arch/arm/mach-s3c24xx/bast-ide.c F: arch/arm/mach-s3c24xx/bast-irq.c +SIPHASH PRF ROUTINES +M: Jason A. Donenfeld +S: Maintained +F: lib/siphash.c +F: lib/test_siphash.c +F: include/linux/siphash.h + TI DAVINCI MACHINE SUPPORT M: Sekhar Nori M: Kevin Hilman diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index ..e82fce48a0f6 --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,86 @@ +/* Copyright (C) 2016 Jason A. Donenfeld . All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. + */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include +#include + +#define SIPHASH_ALIGNMENT 8 +typedef u64 siphash_key_t[2]; + +u64 __siphash_aligned(const void *data, size_t len, const siphash_key_t key); +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +u64 __siphash_unaligned(const void *data, size_t len, const siphash_key_t key); +#endif + +u64 siphash_1u64(const u64 a, const siphash_key_t key); +u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key); +u64 siphash_3u64(const u64 a, const u64 b, const u64 c, +const siphash_key_t key); +u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d, +const siphash_key_t key); + +static inline u64 ___siphash_aligned(const u64 *data, size_t len, const siphash_key_t key) +{ + if (__builtin_constant_p(len) && len == 8) + return siphash_1u64(data[0], key); + if (__builtin_constant_p(len) && len == 16) + return siphash_2u64(data[0], data[1], key); + if (__builtin_constant_p(len) && len == 24) + return
[PATCH v6 5/5] syncookies: use SipHash in place of SHA1
SHA1 is slower and less secure than SipHash, and so replacing syncookie generation with SipHash makes natural sense. Some BSDs have been doing this for several years in fact. Signed-off-by: Jason A. Donenfeld--- net/ipv4/syncookies.c | 20 net/ipv6/syncookies.c | 37 - 2 files changed, 20 insertions(+), 37 deletions(-) diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 3e88467d70ee..03bb068f 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -13,13 +13,13 @@ #include #include #include -#include +#include #include #include #include #include -static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; +static siphash_key_t syncookie_secret[2] __read_mostly; #define COOKIEBITS 24 /* Upper bits store count */ #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1) @@ -48,24 +48,12 @@ static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; #define TSBITS 6 #define TSMASK (((__u32)1 << TSBITS) - 1) -static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv4_cookie_scratch); - static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 count, int c) { - __u32 *tmp; - net_get_random_once(syncookie_secret, sizeof(syncookie_secret)); - - tmp = this_cpu_ptr(ipv4_cookie_scratch); - memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c])); - tmp[0] = (__force u32)saddr; - tmp[1] = (__force u32)daddr; - tmp[2] = ((__force u32)sport << 16) + (__force u32)dport; - tmp[3] = count; - sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5); - - return tmp[17]; + return siphash_4u32(saddr, daddr, (u32)sport << 16 | dport, count, + syncookie_secret[c]); } diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index a4d49760bf43..04d19e89a3e0 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -16,7 +16,7 @@ #include #include -#include +#include #include #include #include @@ -24,7 +24,7 @@ #define COOKIEBITS 24 /* Upper bits store count */ #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1) -static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS] __read_mostly; +static siphash_key_t syncookie6_secret[2] __read_mostly; /* RFC 2460, Section 8.3: * [ipv6 tcp] MSS must be computed as the maximum packet size minus 60 [..] @@ -41,30 +41,25 @@ static __u16 const msstab[] = { 9000 - 60, }; -static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS], ipv6_cookie_scratch); - static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr, __be16 sport, __be16 dport, u32 count, int c) { - __u32 *tmp; + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + u32 count; + u16 sport; + u16 dport; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *saddr, + .daddr = *daddr, + .count = count, + .sport = sport, + .dport = dport + }; net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret)); - - tmp = this_cpu_ptr(ipv6_cookie_scratch); - - /* -* we have 320 bits of information to hash, copy in the remaining -* 192 bits required for sha_transform, from the syncookie6_secret -* and overwrite the digest with the secret -*/ - memcpy(tmp + 10, syncookie6_secret[c], 44); - memcpy(tmp, saddr, 16); - memcpy(tmp + 4, daddr, 16); - tmp[8] = ((__force u32)sport << 16) + (__force u32)dport; - tmp[9] = count; - sha_transform(tmp + 16, (__u8 *)tmp, tmp + 16 + 5); - - return tmp[17]; + return siphash(, sizeof(combined), syncookie6_secret[c]); } static __u32 secure_tcp_syn_cookie(const struct in6_addr *saddr, -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 3/5] random: use SipHash in place of MD5
This duplicates the current algorithm for get_random_int/long, but uses siphash instead. This comes with several benefits. It's certainly faster and more cryptographically secure than MD5. This patch also separates hashed fields into three values instead of one, in order to increase diffusion. The previous MD5 algorithm used a per-cpu MD5 state, which caused successive calls to the function to chain upon each other. While it's not entirely clear that this kind of chaining is absolutely necessary when using a secure PRF like siphash, it can't hurt, and the timing of the call chain does add a degree of natural entropy. So, in keeping with this design, instead of the massive per-cpu 64-byte MD5 state, there is instead a per-cpu previously returned value for chaining. The speed benefits are substantial: | siphash | md5| speedup | -- get_random_long | 137130 | 415983 | 3.03x | get_random_int | 86384 | 343323 | 3.97x | Signed-off-by: Jason A. DonenfeldCc: Jean-Philippe Aumasson Cc: Ted Tso --- drivers/char/random.c | 32 +--- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index d6876d506220..a51f0ff43f00 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -262,6 +262,7 @@ #include #include #include +#include #include #include @@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = { }; #endif /* CONFIG_SYSCTL */ -static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] cacheline_aligned; +static siphash_key_t random_int_secret; int random_int_secret_init(void) { @@ -2050,8 +2051,7 @@ int random_int_secret_init(void) return 0; } -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) - __aligned(sizeof(unsigned long)); +static DEFINE_PER_CPU(u64, get_random_int_chaining); /* * Get a random word for internal kernel use only. Similar to urandom but @@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) */ unsigned int get_random_int(void) { - __u32 *hash; unsigned int ret; + u64 *chaining; if (arch_get_random_int()) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = hash[0]; - put_cpu_var(get_random_int_hash); - + chaining = _cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_int); @@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int); */ unsigned long get_random_long(void) { - __u32 *hash; unsigned long ret; + u64 *chaining; if (arch_get_random_long()) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); - + chaining = _cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_long); -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 4/5] md5: remove from lib and only live in crypto
The md5_transform function is no longer used any where in the tree, except for the crypto api's actual implementation of md5, so we can drop the function from lib and put it as a static function of the crypto file, where it belongs. There should be no new users of md5_transform, anyway, since there are more modern ways of doing what it once achieved. Signed-off-by: Jason A. Donenfeld--- crypto/md5.c | 95 +++- lib/Makefile | 2 +- lib/md5.c| 95 3 files changed, 95 insertions(+), 97 deletions(-) delete mode 100644 lib/md5.c diff --git a/crypto/md5.c b/crypto/md5.c index 2355a7c25c45..f7ae1a48225b 100644 --- a/crypto/md5.c +++ b/crypto/md5.c @@ -21,9 +21,11 @@ #include #include #include -#include #include +#define MD5_DIGEST_WORDS 4 +#define MD5_MESSAGE_BYTES 64 + const u8 md5_zero_message_hash[MD5_DIGEST_SIZE] = { 0xd4, 0x1d, 0x8c, 0xd9, 0x8f, 0x00, 0xb2, 0x04, 0xe9, 0x80, 0x09, 0x98, 0xec, 0xf8, 0x42, 0x7e, @@ -47,6 +49,97 @@ static inline void cpu_to_le32_array(u32 *buf, unsigned int words) } } +#define F1(x, y, z)(z ^ (x & (y ^ z))) +#define F2(x, y, z)F1(z, x, y) +#define F3(x, y, z)(x ^ y ^ z) +#define F4(x, y, z)(y ^ (x | ~z)) + +#define MD5STEP(f, w, x, y, z, in, s) \ + (w += f(x, y, z) + in, w = (w<>(32-s)) + x) + +static void md5_transform(__u32 *hash, __u32 const *in) +{ + u32 a, b, c, d; + + a = hash[0]; + b = hash[1]; + c = hash[2]; + d = hash[3]; + + MD5STEP(F1, a, b, c, d, in[0] + 0xd76aa478, 7); + MD5STEP(F1, d, a, b, c, in[1] + 0xe8c7b756, 12); + MD5STEP(F1, c, d, a, b, in[2] + 0x242070db, 17); + MD5STEP(F1, b, c, d, a, in[3] + 0xc1bdceee, 22); + MD5STEP(F1, a, b, c, d, in[4] + 0xf57c0faf, 7); + MD5STEP(F1, d, a, b, c, in[5] + 0x4787c62a, 12); + MD5STEP(F1, c, d, a, b, in[6] + 0xa8304613, 17); + MD5STEP(F1, b, c, d, a, in[7] + 0xfd469501, 22); + MD5STEP(F1, a, b, c, d, in[8] + 0x698098d8, 7); + MD5STEP(F1, d, a, b, c, in[9] + 0x8b44f7af, 12); + MD5STEP(F1, c, d, a, b, in[10] + 0x5bb1, 17); + MD5STEP(F1, b, c, d, a, in[11] + 0x895cd7be, 22); + MD5STEP(F1, a, b, c, d, in[12] + 0x6b901122, 7); + MD5STEP(F1, d, a, b, c, in[13] + 0xfd987193, 12); + MD5STEP(F1, c, d, a, b, in[14] + 0xa679438e, 17); + MD5STEP(F1, b, c, d, a, in[15] + 0x49b40821, 22); + + MD5STEP(F2, a, b, c, d, in[1] + 0xf61e2562, 5); + MD5STEP(F2, d, a, b, c, in[6] + 0xc040b340, 9); + MD5STEP(F2, c, d, a, b, in[11] + 0x265e5a51, 14); + MD5STEP(F2, b, c, d, a, in[0] + 0xe9b6c7aa, 20); + MD5STEP(F2, a, b, c, d, in[5] + 0xd62f105d, 5); + MD5STEP(F2, d, a, b, c, in[10] + 0x02441453, 9); + MD5STEP(F2, c, d, a, b, in[15] + 0xd8a1e681, 14); + MD5STEP(F2, b, c, d, a, in[4] + 0xe7d3fbc8, 20); + MD5STEP(F2, a, b, c, d, in[9] + 0x21e1cde6, 5); + MD5STEP(F2, d, a, b, c, in[14] + 0xc33707d6, 9); + MD5STEP(F2, c, d, a, b, in[3] + 0xf4d50d87, 14); + MD5STEP(F2, b, c, d, a, in[8] + 0x455a14ed, 20); + MD5STEP(F2, a, b, c, d, in[13] + 0xa9e3e905, 5); + MD5STEP(F2, d, a, b, c, in[2] + 0xfcefa3f8, 9); + MD5STEP(F2, c, d, a, b, in[7] + 0x676f02d9, 14); + MD5STEP(F2, b, c, d, a, in[12] + 0x8d2a4c8a, 20); + + MD5STEP(F3, a, b, c, d, in[5] + 0xfffa3942, 4); + MD5STEP(F3, d, a, b, c, in[8] + 0x8771f681, 11); + MD5STEP(F3, c, d, a, b, in[11] + 0x6d9d6122, 16); + MD5STEP(F3, b, c, d, a, in[14] + 0xfde5380c, 23); + MD5STEP(F3, a, b, c, d, in[1] + 0xa4beea44, 4); + MD5STEP(F3, d, a, b, c, in[4] + 0x4bdecfa9, 11); + MD5STEP(F3, c, d, a, b, in[7] + 0xf6bb4b60, 16); + MD5STEP(F3, b, c, d, a, in[10] + 0xbebfbc70, 23); + MD5STEP(F3, a, b, c, d, in[13] + 0x289b7ec6, 4); + MD5STEP(F3, d, a, b, c, in[0] + 0xeaa127fa, 11); + MD5STEP(F3, c, d, a, b, in[3] + 0xd4ef3085, 16); + MD5STEP(F3, b, c, d, a, in[6] + 0x04881d05, 23); + MD5STEP(F3, a, b, c, d, in[9] + 0xd9d4d039, 4); + MD5STEP(F3, d, a, b, c, in[12] + 0xe6db99e5, 11); + MD5STEP(F3, c, d, a, b, in[15] + 0x1fa27cf8, 16); + MD5STEP(F3, b, c, d, a, in[2] + 0xc4ac5665, 23); + + MD5STEP(F4, a, b, c, d, in[0] + 0xf4292244, 6); + MD5STEP(F4, d, a, b, c, in[7] + 0x432aff97, 10); + MD5STEP(F4, c, d, a, b, in[14] + 0xab9423a7, 15); + MD5STEP(F4, b, c, d, a, in[5] + 0xfc93a039, 21); + MD5STEP(F4, a, b, c, d, in[12] + 0x655b59c3, 6); + MD5STEP(F4, d, a, b, c, in[3] + 0x8f0ccc92, 10); + MD5STEP(F4, c, d, a, b, in[10] + 0xffeff47d, 15); + MD5STEP(F4, b, c, d, a, in[1] + 0x85845dd1, 21); + MD5STEP(F4, a, b, c, d, in[8] + 0x6fa87e4f, 6); + MD5STEP(F4, d, a, b, c, in[15] + 0xfe2ce6e0, 10); + MD5STEP(F4, c, d, a, b, in[6] + 0xa3014314, 15); +
[PATCH v6 2/5] secure_seq: use SipHash in place of MD5
This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. Signed-off-by: Jason A. DonenfeldCc: Andi Kleen Cc: David Miller Cc: David Laight Cc: Tom Herbert Cc: Hannes Frederic Sowa --- net/core/secure_seq.c | 133 -- 1 file changed, 52 insertions(+), 81 deletions(-) diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 88a8e429fc3e..c80583bf3213 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -1,3 +1,5 @@ +/* Copyright (C) 2016 Jason A. Donenfeld . All Rights Reserved. */ + #include #include #include @@ -8,14 +10,14 @@ #include #include #include - +#include #include #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET) +#include #include -#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4) -static u32 net_secret[NET_SECRET_SIZE] cacheline_aligned; +static siphash_key_t net_secret; static __always_inline void net_secret_init(void) { @@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq) u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; + u64 hash; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash(, sizeof(combined), net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } EXPORT_SYMBOL(secure_tcpv6_sequence_number); u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 dport; + u16 padding1; + u32 padding2; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .dport = dport + }; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32) daddr[i]; - secret[4] = net_secret[4] + (__force u32)dport; - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - return hash[0]; + return siphash(, sizeof(combined), net_secret); } EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #endif @@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral); u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 hash[MD5_DIGEST_WORDS]; - + u64 hash; net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash_4u32(saddr, daddr, sport, dport, net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = (__force u32)dport ^ net_secret[14]; - hash[3] =
[PATCH v6 0/5] The SipHash Patchset
Hey again, This keeps getting more ambitious, which is good I suppose. If the frequency of new versioned patchsets is too high for LKML and not customary, please let me know. Otherwise, read on to see what's new this time... With Hannes' suggestion, there is now only one siphash() function, which will use the faster aligned version by compile-time constant folding. Additionally, I now use constant folding to optionally switch to the helper siphash_Nu64 functions that are a bit faster for data of length 8, 16, 24, and 32. So, the result is that you use siphash(data, len, key) if you have a buffer of sorts, and then everything is taken care of for you. Or, if you have a series of integers, you can opt to use siphash_Nu{32,64} functions instead. The basic API is now complete. After replacing MD5 in secure sequence number generation and the RNG, it turned out that md5_transform wasn't used any place else in the tree, so finally -- this is something to rejoice over -- lib/md5.c has been deleted and now that function lives as a static function in crypto/md5.c where it belongs. Meanwhile, it seems that sha_transform is used in places where SipHash would be more fitting, so the IPv4 and IPv6 syncookies implementation now uses SipHash, which should speed up TCP performance. Some BSDs already do this. I'd like to replace sha_transform in addrconf, but that code is a bit gnarley, so I don't want to be too meddlesome. I'm not entirely convinced either that SipHash is a good choice for it. But I'm open to discussion here, so if you have an opinion, please speak up. If you've been following the evolution of this patchset, and think that certain patches in it are fine, please do lend me your Reviewed-by to carry into any subsequent versions, so that in case you disappear your useful reviews will still keep the ball moving. Thanks for all the great feedback thus far. Jason Jason A. Donenfeld (5): siphash: add cryptographically secure PRF secure_seq: use SipHash in place of MD5 random: use SipHash in place of MD5 md5: remove from lib and only live in crypto syncookies: use SipHash in place of SHA1 MAINTAINERS | 7 ++ crypto/md5.c| 95 +- drivers/char/random.c | 32 +++- include/linux/siphash.h | 86 lib/Kconfig.debug | 6 +- lib/Makefile| 7 +- lib/md5.c | 95 -- lib/siphash.c | 210 lib/test_siphash.c | 101 +++ net/core/secure_seq.c | 133 -- net/ipv4/syncookies.c | 20 + net/ipv6/syncookies.c | 37 - 12 files changed, 590 insertions(+), 239 deletions(-) create mode 100644 include/linux/siphash.h delete mode 100644 lib/md5.c create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
Hi Jason, [auto build test ERROR on linus/master] [also build test ERROR on v4.9 next-20161215] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/siphash-add-cryptographically-secure-PRF/20161216-092837 config: ia64-allmodconfig (attached as .config) compiler: ia64-linux-gcc (GCC) 6.2.0 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=ia64 All errors (new ones prefixed by >>): lib/siphash.c: In function 'siphash_unaligned': >> lib/siphash.c:123:15: error: 'bytes' undeclared (first use in this function) case 1: b |= bytes[0]; ^ lib/siphash.c:123:15: note: each undeclared identifier is reported only once for each function it appears in vim +/bytes +123 lib/siphash.c 117 case 7: b |= ((u64)end[6]) << 48; 118 case 6: b |= ((u64)end[5]) << 40; 119 case 5: b |= ((u64)end[4]) << 32; 120 case 4: b |= get_unaligned_le32(end); break; 121 case 3: b |= ((u64)end[2]) << 16; 122 case 2: b |= get_unaligned_le16(end); break; > 123 case 1: b |= bytes[0]; 124 } 125 #endif 126 v3 ^= b; --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
RE: [virtio-dev] Re: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver
Hi Michael, > > > > > > > > Subject: RE: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver > > > > > > On Thursday, December 15, 2016 8:45 AM, Gonglei (Arei) Wrote: > > > < > > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c > > > < > b/drivers/crypto/virtio/virtio_crypto_core.c > > > < > > new file mode 100644 > > > < > > index 000..c0854a1 > > > < > > --- /dev/null > > > < > > +++ b/drivers/crypto/virtio/virtio_crypto_core.c > > > < > > @@ -0,0 +1,474 @@ > > > < > [..] > > > < > > + > > > < > > +static void virtcrypto_dataq_callback(struct virtqueue *vq) > > > < > > +{ > > > < > > + struct virtio_crypto *vcrypto = vq->vdev->priv; > > > < > > + struct virtio_crypto_request *vc_req; > > > < > > + unsigned long flags; > > > < > > + unsigned int len; > > > < > > + struct ablkcipher_request *ablk_req; > > > < > > + int error; > > > < > > + > > > < > > + spin_lock_irqsave(>lock, flags); > > > < > > > > < > Would it make sense to use a per virtqueue lock > > > < > like in virtio_blk for example instead of locking on the whole > > > < > device? OK, it seems you use only one dataqueue, so it > > > < > may not be that relevant. > > > < > > > > < Currently yes, both the backend device (cryptodev-backend-builtin) > > > < and the frontend driver use one dataqueue. > > > < > > > > > > I think it makes sense to use per virtqueue lock here though it only uses > > > one > > > queue so far, > > > but in the spec we already have multi queues support. > > > > > Yes, I agree. Will do that in V8 soon. > > Hope to catch up with Michael's pull request for 4.10. > > > > Regards, > > -Gonglei > > I merged v7, this change will have to wait. Sorry. > That's OK. Thanks! I can post a separate patch after this pull request. Regards, -Gonglei -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 16.12.2016 00:43, Jason A. Donenfeld wrote: > Hi Hannes, > > Good news. > > On Thu, Dec 15, 2016 at 10:45 PM, Hannes Frederic Sowa >wrote: >>> How's that sound? >> >> I am still very much concerned about the API. > > Thanks for pushing me and putting up with my daftness... the constant > folding works absolutely perfectly. I've run several tests. When gcc > knows that a struct is aligned (say, via __aligned(8)), then it erases > the branch and makes a direct jump to the aligned code. When it's > uncertain, it evaluates at runtime. So, now there is a single > siphash() function that chooses the best one automatically. Behind the > scene there's siphash_aligned and siphash_unaligned, but nobody needs > to call these directly. (Should I rename these to have a double > underscore prefix?) On platforms that have > CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, of course all of this > disappears and everything goes directly to the aligned version. > > So, I think this assuages your concerns entirely. A single API entry > point that does the right thing. > > Whew! Good thinking, and thanks again for the suggestion. Awesome, thanks for trying this out. This basically resolves my concern API-wise so far. Hannes out. ;) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 10:45 PM, Hannes Frederic Sowawrote: > By the way, if you target net-next, it is currently closed. So no need > to hurry. Honestly I have no idea what I'm targeting. The hash function touches lib/. The secure_seq stuff touches net/. The rng stuff touches random.c. Shall this be for net-next? For lib-next (doesn't exist)? For tytso-next? Since a lot of feedback has come from netdev people, I suspect net-next is the correct answer. In that case, I'll ask Ted for his sign-off to touch random.c, and then we'll get this queued up in net-next. Please correct me if this doesn't actually resemble how things work around here... Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
Hi Hannes, Good news. On Thu, Dec 15, 2016 at 10:45 PM, Hannes Frederic Sowawrote: >> How's that sound? > > I am still very much concerned about the API. Thanks for pushing me and putting up with my daftness... the constant folding works absolutely perfectly. I've run several tests. When gcc knows that a struct is aligned (say, via __aligned(8)), then it erases the branch and makes a direct jump to the aligned code. When it's uncertain, it evaluates at runtime. So, now there is a single siphash() function that chooses the best one automatically. Behind the scene there's siphash_aligned and siphash_unaligned, but nobody needs to call these directly. (Should I rename these to have a double underscore prefix?) On platforms that have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, of course all of this disappears and everything goes directly to the aligned version. So, I think this assuages your concerns entirely. A single API entry point that does the right thing. Whew! Good thinking, and thanks again for the suggestion. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
> If a halved version of SipHash can bring significant performance boost > (with 32b words instead of 64b words) with an acceptable security level > (64-bit enough?) then we may design such a version. I was thinking if the key could be pushed to 80 bits, that would be nice, but honestly 64 bits is fine. This is DoS protection, and while it's possible to brute-force a 64 bit secret, there are more effective (DDoS) attacks possible for the same cost. (I'd suggest a name of "HalfSipHash" to convey the reduced security effectively.) > Regarding output size, are 64 bits sufficient? As a replacement for jhash, 32 bits are sufficient. It's for indexing an in-memory hash table on a 32-bit machine. (When you're done thinking about this, as a matter of personal interest I'd love a hash expert's opinion on https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2a18da7a9c7886f1c7307f8d3f23f24318583f03 and https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8387ff2577eb9ed245df9a39947f66976c6bcd02 which is a non-cryptographic hash function of novel design that's inspired by SipHash.) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
> While SipHash is extremely fast for a cryptographically secure function, > it is likely a tiny bit slower than the insecure jhash, and so replacements > will be evaluated on a case-by-case basis based on whether or not the > difference in speed is negligible and whether or not the current jhash usage > poses a real security risk. To quantify that, jhash is 27 instructions per 12 bytes of input, with a dependency path length of 13 instructions. (24/12 in __jash_mix, plus 3/1 for adding the input to the state.) The final add + __jhash_final is 24 instructions with a path length of 15, which is close enough for this handwaving. Call it 18n instructions and 8n cycles for 8n bytes. SipHash (on a 64-bit machine) is 14 instructions with a dependency path length of 4 *per round*. Two rounds per 8 bytes, plus plus two adds and one cycle per input word, plus four rounds to finish makes 30n+46 instructions and 9n+16 cycles for 8n bytes. So *if* you have a 64-bit 4-way superscalar machine, it's not that much slower once it gets going, but the four-round finalization is quite noticeable for short inputs. For typical kernel input lengths "within a factor of 2" is probably more accurate than "a tiny bit". You lose a factor of 2 if you machine is 2-way or non-superscalar, and a second factor of 2 if it's a 32-bit machine. I mention this because there are a lot of home routers and other netwoek appliances running Linux on 32-bit ARM and MIPS processors. For those, it's a factor of *eight*, which is a lot more than "a tiny bit". The real killer is if you don't have enough registers; SipHash performs horribly on i386 because it uses more state than i386 has registers. (If i386 performance is desired, you might ask Jean-Philippe for some rotate constants for a 32-bit variant with 64 bits of key. Note that SipHash's security proof requires that key length + input length is strictly less than the state size, so for a 4x32-bit variant, while you could stretch the key length a little, you'd have a hard limit at 95 bits.) A second point, the final XOR in SipHash is either a (very minor) design mistake, or an opportunity for optimization, depending on how you look at it. Look at the end of the function: >+ SIPROUND; >+ SIPROUND; >+ return (v0 ^ v1) ^ (v2 ^ v3); Expanding that out, you get: + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; + v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); + return v0 ^ v1 ^ v2 ^ v3; Since the final XOR includes both v0 and v3, it's undoing the "v3 ^= v0" two lines earlier, so the value of v0 doesn't matter after its XOR into v1 on line one. The final SIPROUND and return can then be optimized to + v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; + v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; + v3 = rol64(v3, 21); + v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); + return v1 ^ v2 ^ v3; A 32-bit implementation could further tweak the 4 instructions of v1 ^= v2; v2 = rol64(v2, 32); v1 ^= v2; gcc 6.2.1 -O3 compiles it to basically: v1.low ^= v2.low; v1.high ^= v2.high; v1.low ^= v2.high; v1.high ^= v2.low; but it could be written as: v2.low ^= v2.high; v1.low ^= v2.low; v1.high ^= v2.low; Alternatively, if it's for private use only (key not shared with other systems), a slightly stronger variant would "return v1 ^ v3;". (The final swap of v2 is dead code, but a compiler can spot that easily.) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 09:43:04PM +0100, Jason A. Donenfeld wrote: > On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa >wrote: > > ARM64 and x86-64 have memory operations that are not vector operations > > that operate on 128 bit memory. > > Fair enough. imull I guess. imull is into rdx:rax, not memory. I suspect he's talking about cmpxchg16b. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 22:25, Jason A. Donenfeld wrote: > On Thu, Dec 15, 2016 at 10:17 PM, Hannes Frederic Sowa >wrote: >> And I was exactly questioning this. >> >> static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr, >> const struct in6_addr *daddr) >> { >> net_get_random_once(_frags.rnd, sizeof(ip6_frags.rnd)); >> return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr), >> (__force u32)id, ip6_frags.rnd); >> } > > For this example, the replacement is the function entitled siphash_4u32: > > static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr, > const struct in6_addr *daddr) > { > net_get_random_once(_frags.rnd, sizeof(ip6_frags.rnd)); > return siphash_4u32(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr), > (__force u32)id, 0, ip6_frags.rnd); > } > > And then you make ip6_frags.rnd be of type siphash_key_t. Then > everything is taken care of and works beautifully. Please see v5 of > this patchset. Sorry to not be specific enough, the Hash-DoS is in ipv6_addr_hash. Maybe it was a silly example to start with, sorry. But anyway, your proposal wouldn't have prevented the hash DoS. I wanted to show how it can be difficult to make sure that all pointers come from an appropriate aligned memory region. The idea would be to actually factor out the key in the data structure and align it with __aligned(SIPHASH_ALIGNMENT), make sure the padding bits are all equal zero to not cause any bugs and irregularities with the corresponding equality function. This might need some serious review when switching to siphash to actually make use of it and prevent HashDoS. Or simply use the unaligned version always... >> I would be interested if the compiler can actually constant-fold the >> address of the stack allocation with an simple if () or some >> __builtin_constant_p fiddeling, so we don't have this constant review >> overhead to which function we pass which data. This would also make >> this whole discussion moot. > > I'll play with it to see if the compiler is capable of doing that. > Does anybody know off hand if it is or if there are other examples of > the compiler doing that? Not of the top of my head, but it should be easy to test. > In any case, for all current replacement of jhash_1word, jhash_2words, > jhash_3words, there's the siphash_2u32 or siphash_4u32 functions. This > covers the majority of cases. Agreed and this is also totally fine by me. > For replacements of md5_transform, either the data is small and can > fit in siphash_Nu{32,64}, or it can be put into a struct explicitly > aligned on the stack. > For the remaining use of jhash_nwords, either siphash() can be used or > siphash_unaligned() can be used if the source is of unknown alignment. > Both functions have their alignment requirements (or lack thereof) > documented in a docbook comment. I think the warning needs to be bigger, seriously. Most of the people develop on 64 bit arch, where it will just work during testing and break later on 32 bit. ;) > I'll look into the constant folding to see if it actually works. If it > does, I'll use it. If not, I believe the current solution works. > > How's that sound? I am still very much concerned about the API. By the way, if you target net-next, it is currently closed. So no need to hurry. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 10:17 PM, Hannes Frederic Sowawrote: > And I was exactly questioning this. > > static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr, > const struct in6_addr *daddr) > { > net_get_random_once(_frags.rnd, sizeof(ip6_frags.rnd)); > return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr), > (__force u32)id, ip6_frags.rnd); > } For this example, the replacement is the function entitled siphash_4u32: static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr, const struct in6_addr *daddr) { net_get_random_once(_frags.rnd, sizeof(ip6_frags.rnd)); return siphash_4u32(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr), (__force u32)id, 0, ip6_frags.rnd); } And then you make ip6_frags.rnd be of type siphash_key_t. Then everything is taken care of and works beautifully. Please see v5 of this patchset. > I would be interested if the compiler can actually constant-fold the > address of the stack allocation with an simple if () or some > __builtin_constant_p fiddeling, so we don't have this constant review > overhead to which function we pass which data. This would also make > this whole discussion moot. I'll play with it to see if the compiler is capable of doing that. Does anybody know off hand if it is or if there are other examples of the compiler doing that? In any case, for all current replacement of jhash_1word, jhash_2words, jhash_3words, there's the siphash_2u32 or siphash_4u32 functions. This covers the majority of cases. For replacements of md5_transform, either the data is small and can fit in siphash_Nu{32,64}, or it can be put into a struct explicitly aligned on the stack. For the remaining use of jhash_nwords, either siphash() can be used or siphash_unaligned() can be used if the source is of unknown alignment. Both functions have their alignment requirements (or lack thereof) documented in a docbook comment. I'll look into the constant folding to see if it actually works. If it does, I'll use it. If not, I believe the current solution works. How's that sound? Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 21:43, Jason A. Donenfeld wrote: > On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa >wrote: >> ARM64 and x86-64 have memory operations that are not vector operations >> that operate on 128 bit memory. > > Fair enough. imull I guess. > >> How do you know that the compiler for some architecture will not chose a >> more optimized instruction to load a 64 bit memory value into two 32 bit >> registers if you tell the compiler it is 8 byte aligned but it actually >> isn't? I don't know the answer but telling the compiler some data is 8 >> byte aligned while it isn't really pretty much seems like a call for >> trouble. > > If a compiler is in the business of using special 64-bit instructions > on 64-bit aligned data, then it is also the job of the compiler to > align structs to 64-bits when passed __aligned(8), which is what we've > done in this code. If the compiler were to hypothetically choose to > ignore that and internally convert it to a __aligned(4), then it would > only be able to do so with the knowledge that it will never use 64-bit > aligned data instructions. But so far as I can tell, gcc always > respects __aligned(8), which is why I use it in this patchset. > > I think there might have been confusion here, because perhaps someone > was hoping that since in6_addr is 128-bits, that the __aligned > attribute would not be required and that the struct would just > automatically be aligned to at least 8 bytes. But in fact, as I > mentioned, in6_addr is actually composed of u32[4] and not u64[2], so > it will only be aligned to 4 bytes, making the __aligned(8) necessary. > > I think for the purposes of this patchset, this is a solved problem. > There's the unaligned version of the function if you don't know about > the data, and there's the aligned version if you're using > __aligned(SIPHASH_ALIGNMENT) on your data. Plain and simple. And I was exactly questioning this. static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr, const struct in6_addr *daddr) { net_get_random_once(_frags.rnd, sizeof(ip6_frags.rnd)); return jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr), (__force u32)id, ip6_frags.rnd); } This function had a hash DoS (and kind of still has), but it has been mitigated by explicit checks, I hope. So you start looking for all the pointers where ipv6 addresses could come from and find some globally defined struct where I would need to put the aligned(SIPHASH_ALIGNMENT) into to make this work on 32 bit code? Otherwise just the unaligned version is safe on 32 bit code. Who knows this? It isn't even obvious by looking at the header! I would be interested if the compiler can actually constant-fold the address of the stack allocation with an simple if () or some __builtin_constant_p fiddeling, so we don't have this constant review overhead to which function we pass which data. This would also make this whole discussion mood. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 10:14 PM, Linus Torvaldswrote: > I think you can/should just use the natural alignment for "u64". > > For architectures that need 8-byte alignment, u64 will already be > properly aligned. For architectures (like x86-32) that only need > 4-byte alignment, you get it. I should have added mention of that with my previous email. For the parameters that are always a multiple of u64 -- namely, the key -- I now do that in v5 of the patchset. So this is already done. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 1:11 PM, Jason A. Donenfeldwrote: > > Indeed, I stand corrected. But in any case, the use of __aligned(8) in > the patchset ensures that things are fixed and that we don't have this > issue. I think you can/should just use the natural alignment for "u64". For architectures that need 8-byte alignment, u64 will already be properly aligned. For architectures (like x86-32) that only need 4-byte alignment, you get it. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 10:09 PM, Peter Zijlstrawrote: > On Thu, Dec 15, 2016 at 07:50:36PM +0100, Jason A. Donenfeld wrote: >> There's no 32-bit platform >> that will trap on a 64-bit unaligned access because there's no such >> thing as a 64-bit access there. In short, we're fine. > > ARMv7 LPAE is a 32bit architecture that has 64bit load/stores IIRC. > > x86 has cmpxchg8b that can do 64bit things and very much wants the u64 > aligned. > > Also, IIRC we have a few platforms where u64 doesn't carry 8 byte > alignment, m68k or something like that, but yes, you likely don't care. Indeed, I stand corrected. But in any case, the use of __aligned(8) in the patchset ensures that things are fixed and that we don't have this issue. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 22:04, Peter Zijlstra wrote: > On Thu, Dec 15, 2016 at 09:43:04PM +0100, Jason A. Donenfeld wrote: >> On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowa >>wrote: >>> ARM64 and x86-64 have memory operations that are not vector operations >>> that operate on 128 bit memory. >> >> Fair enough. imull I guess. > > imull is into rdx:rax, not memory. I suspect he's talking about > cmpxchg16b. Exactly and I think I saw a ll/sc 128 bit on armv8 to atomically manipulate linked lists. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 07:50:36PM +0100, Jason A. Donenfeld wrote: > There's no 32-bit platform > that will trap on a 64-bit unaligned access because there's no such > thing as a 64-bit access there. In short, we're fine. ARMv7 LPAE is a 32bit architecture that has 64bit load/stores IIRC. x86 has cmpxchg8b that can do 64bit things and very much wants the u64 aligned. Also, IIRC we have a few platforms where u64 doesn't carry 8 byte alignment, m68k or something like that, but yes, you likely don't care. Just to make life interesting... -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, Dec 15, 2016 at 9:31 PM, Hannes Frederic Sowawrote: > ARM64 and x86-64 have memory operations that are not vector operations > that operate on 128 bit memory. Fair enough. imull I guess. > How do you know that the compiler for some architecture will not chose a > more optimized instruction to load a 64 bit memory value into two 32 bit > registers if you tell the compiler it is 8 byte aligned but it actually > isn't? I don't know the answer but telling the compiler some data is 8 > byte aligned while it isn't really pretty much seems like a call for > trouble. If a compiler is in the business of using special 64-bit instructions on 64-bit aligned data, then it is also the job of the compiler to align structs to 64-bits when passed __aligned(8), which is what we've done in this code. If the compiler were to hypothetically choose to ignore that and internally convert it to a __aligned(4), then it would only be able to do so with the knowledge that it will never use 64-bit aligned data instructions. But so far as I can tell, gcc always respects __aligned(8), which is why I use it in this patchset. I think there might have been confusion here, because perhaps someone was hoping that since in6_addr is 128-bits, that the __aligned attribute would not be required and that the struct would just automatically be aligned to at least 8 bytes. But in fact, as I mentioned, in6_addr is actually composed of u32[4] and not u64[2], so it will only be aligned to 4 bytes, making the __aligned(8) necessary. I think for the purposes of this patchset, this is a solved problem. There's the unaligned version of the function if you don't know about the data, and there's the aligned version if you're using __aligned(SIPHASH_ALIGNMENT) on your data. Plain and simple. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 2/4] siphash: add Nu{32,64} helpers
These restore parity with the jhash interface by providing high performance helpers for common input sizes. Signed-off-by: Jason A. DonenfeldCc: Tom Herbert --- include/linux/siphash.h | 33 ++ lib/siphash.c | 157 +--- lib/test_siphash.c | 18 ++ 3 files changed, 172 insertions(+), 36 deletions(-) diff --git a/include/linux/siphash.h b/include/linux/siphash.h index 145cf5667078..6f5a08a0fc7e 100644 --- a/include/linux/siphash.h +++ b/include/linux/siphash.h @@ -29,4 +29,37 @@ static inline u64 siphash_unaligned(const void *data, size_t len, u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key); #endif +u64 siphash_1u64(const u64 a, const siphash_key_t key); +u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key); +u64 siphash_3u64(const u64 a, const u64 b, const u64 c, +const siphash_key_t key); +u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d, +const siphash_key_t key); + +static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key) +{ + return siphash_1u64((u64)b << 32 | a, key); +} + +static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d, + const siphash_key_t key) +{ + return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key); +} + +static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const siphash_key_t key) +{ + return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + key); +} + +static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d, + const u32 e, const u32 f, const u32 g, const u32 h, + const siphash_key_t key) +{ + return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e, + (u64)h << 32 | g, key); +} + #endif /* _LINUX_SIPHASH_H */ diff --git a/lib/siphash.c b/lib/siphash.c index afc13cbb1b78..970c083ab06a 100644 --- a/lib/siphash.c +++ b/lib/siphash.c @@ -25,6 +25,29 @@ v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \ } while(0) +#define PREAMBLE(len) \ + u64 v0 = 0x736f6d6570736575ULL; \ + u64 v1 = 0x646f72616e646f6dULL; \ + u64 v2 = 0x6c7967656e657261ULL; \ + u64 v3 = 0x7465646279746573ULL; \ + u64 b = ((u64)len) << 56; \ + v3 ^= key[1]; \ + v2 ^= key[0]; \ + v1 ^= key[1]; \ + v0 ^= key[0]; + +#define POSTAMBLE \ + v3 ^= b; \ + SIPROUND; \ + SIPROUND; \ + v0 ^= b; \ + v2 ^= 0xff; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + SIPROUND; \ + return (v0 ^ v1) ^ (v2 ^ v3); + /** * siphash - compute 64-bit siphash PRF value * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT @@ -33,18 +56,10 @@ */ u64 siphash(const void *data, size_t len, const siphash_key_t key) { - u64 v0 = 0x736f6d6570736575ULL; - u64 v1 = 0x646f72616e646f6dULL; - u64 v2 = 0x6c7967656e657261ULL; - u64 v3 = 0x7465646279746573ULL; - u64 b = ((u64)len) << 56; - u64 m; const u8 *end = data + len - (len % sizeof(u64)); const u8 left = len & (sizeof(u64) - 1); - v3 ^= key[1]; - v2 ^= key[0]; - v1 ^= key[1]; - v0 ^= key[0]; + u64 m; + PREAMBLE(len) for (; data != end; data += sizeof(u64)) { m = le64_to_cpup(data); v3 ^= m; @@ -67,16 +82,7 @@ u64 siphash(const void *data, size_t len, const siphash_key_t key) case 1: b |= end[0]; } #endif - v3 ^= b; - SIPROUND; - SIPROUND; - v0 ^= b; - v2 ^= 0xff; - SIPROUND; - SIPROUND; - SIPROUND; - SIPROUND; - return (v0 ^ v1) ^ (v2 ^ v3); + POSTAMBLE } EXPORT_SYMBOL(siphash); @@ -89,18 +95,10 @@ EXPORT_SYMBOL(siphash); */ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key) { - u64 v0 = 0x736f6d6570736575ULL; - u64 v1 = 0x646f72616e646f6dULL; - u64 v2 = 0x6c7967656e657261ULL; - u64 v3 = 0x7465646279746573ULL; - u64 b = ((u64)len) << 56; - u64 m; const u8 *end = data + len - (len % sizeof(u64)); const u8 left = len & (sizeof(u64) - 1); - v3 ^= key[1]; - v2 ^= key[0]; - v1 ^= key[1]; - v0 ^= key[0]; + u64 m; + PREAMBLE(len) for (; data != end; data += sizeof(u64)) { m = get_unaligned_le64(data); v3 ^= m; @@ -123,16 +121,103 @@ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key) case 1: b |= bytes[0]; } #endif - v3 ^= b; + POSTAMBLE +}
[PATCH v5 1/4] siphash: add cryptographically secure PRF
SipHash is a 64-bit keyed hash function that is actually a cryptographically secure PRF, like HMAC. Except SipHash is super fast, and is meant to be used as a hashtable keyed lookup function, or as a general PRF for short input use cases, such as sequence numbers or RNG chaining. For the first usage: There are a variety of attacks known as "hashtable poisoning" in which an attacker forms some data such that the hash of that data will be the same, and then preceeds to fill up all entries of a hashbucket. This is a realistic and well-known denial-of-service vector. Currently hashtables use jhash, which is fast but not secure, and some kind of rotating key scheme (or none at all, which isn't good). SipHash is meant as a replacement for jhash in these cases. There are a modicum of places in the kernel that are vulnerable to hashtable poisoning attacks, either via userspace vectors or network vectors, and there's not a reliable mechanism inside the kernel at the moment to fix it. The first step toward fixing these issues is actually getting a secure primitive into the kernel for developers to use. Then we can, bit by bit, port things over to it as deemed appropriate. While SipHash is extremely fast for a cryptographically secure function, it is likely a tiny bit slower than the insecure jhash, and so replacements will be evaluated on a case-by-case basis based on whether or not the difference in speed is negligible and whether or not the current jhash usage poses a real security risk. For the second usage: A few places in the kernel are using MD5 for creating secure sequence numbers, port numbers, or fast random numbers. SipHash is a faster, more fitting, and more secure replacement for MD5 in those situations. Replacing MD5 with SipHash for these uses is obvious and straight- forward, and so is submitted along with this patch series. There shouldn't be much of a debate over its efficacy. Dozens of languages are already using this internally for their hash tables and PRFs. Some of the BSDs already use this in their kernels. SipHash is a widely known high-speed solution to a widely known set of problems, and it's time we catch-up. Signed-off-by: Jason A. DonenfeldCc: Jean-Philippe Aumasson Cc: Daniel J. Bernstein Cc: Linus Torvalds Cc: Eric Biggers Cc: David Laight --- include/linux/siphash.h | 32 +++ lib/Kconfig.debug | 6 +-- lib/Makefile| 5 +- lib/siphash.c | 138 lib/test_siphash.c | 83 + 5 files changed, 259 insertions(+), 5 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c diff --git a/include/linux/siphash.h b/include/linux/siphash.h new file mode 100644 index ..145cf5667078 --- /dev/null +++ b/include/linux/siphash.h @@ -0,0 +1,32 @@ +/* Copyright (C) 2016 Jason A. Donenfeld . All Rights Reserved. + * + * This file is provided under a dual BSD/GPLv2 license. + * + * SipHash: a fast short-input PRF + * https://131002.net/siphash/ + * + * This implementation is specifically for SipHash2-4. + */ + +#ifndef _LINUX_SIPHASH_H +#define _LINUX_SIPHASH_H + +#include + +#define SIPHASH_ALIGNMENT 8 + +typedef u64 siphash_key_t[2]; + +u64 siphash(const void *data, size_t len, const siphash_key_t key); + +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS +static inline u64 siphash_unaligned(const void *data, size_t len, + const siphash_key_t key) +{ + return siphash(data, len, key); +} +#else +u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key); +#endif + +#endif /* _LINUX_SIPHASH_H */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 7446097f72bd..86254ea99b45 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1843,9 +1843,9 @@ config TEST_HASH tristate "Perform selftest on hash functions" default n help - Enable this option to test the kernel's integer () - and string () hash functions on boot - (or module load). + Enable this option to test the kernel's integer (), + string (), and siphash () + hash functions on boot (or module load). This is intended to help people writing architecture-specific optimized versions. If unsure, say N. diff --git a/lib/Makefile b/lib/Makefile index 50144a3aeebd..71d398b04a74 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \ sha1.o chacha20.o md5.o irq_regs.o argv_split.o \ flex_proportions.o ratelimit.o show_mem.o \ is_single_threaded.o plist.o decompress.o kobject_uevent.o \ -earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o
[PATCH v5 0/4] The SipHash Patchset
Hey folks, I think we're approaching the end of the review for this patchset and we're getting somewhat close to being ready for it being queued up. At this point, I've incorporated all of the extremely helpful and instructive suggestions from the list. For this v5, we now accept u64[2] as the key, so that alignment is taken care of naturally. For other alignment issues, we have both the fast aligned version and the unaligned version, depending on what's necessary. We've worked out the issues for struct padding. The functions now take a void pointer to avoid ugly casting, which also helps us shed the inline helper functions which were not very pretty. The replacements of MD5 have been benchmarked and show a big increase in speed. We've even come up with a better naming scheme for dword/qword. All and all it's shaping up nicely. So, if this series looks good to you, please send along your Reviewed-by, so we can begin to get this completed. If there are still lingering issues, let me know and I'll incorporated them into a v6 if necessary. Thanks, Jason Jason A. Donenfeld (4): siphash: add cryptographically secure PRF siphash: add Nu{32,64} helpers secure_seq: use SipHash in place of MD5 random: use SipHash in place of MD5 drivers/char/random.c | 32 +++ include/linux/siphash.h | 65 ++ lib/Kconfig.debug | 6 +- lib/Makefile| 5 +- lib/siphash.c | 223 lib/test_siphash.c | 101 ++ net/core/secure_seq.c | 133 +++-- 7 files changed, 460 insertions(+), 105 deletions(-) create mode 100644 include/linux/siphash.h create mode 100644 lib/siphash.c create mode 100644 lib/test_siphash.c -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 3/4] secure_seq: use SipHash in place of MD5
This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. Signed-off-by: Jason A. DonenfeldCc: Andi Kleen Cc: David Miller Cc: David Laight Cc: Tom Herbert Cc: Hannes Frederic Sowa --- net/core/secure_seq.c | 133 -- 1 file changed, 52 insertions(+), 81 deletions(-) diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c index 88a8e429fc3e..c80583bf3213 100644 --- a/net/core/secure_seq.c +++ b/net/core/secure_seq.c @@ -1,3 +1,5 @@ +/* Copyright (C) 2016 Jason A. Donenfeld . All Rights Reserved. */ + #include #include #include @@ -8,14 +10,14 @@ #include #include #include - +#include #include #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET) +#include #include -#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4) -static u32 net_secret[NET_SECRET_SIZE] cacheline_aligned; +static siphash_key_t net_secret; static __always_inline void net_secret_init(void) { @@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq) u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 sport; + __be16 dport; + u32 padding; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .sport = sport, + .dport = dport + }; + u64 hash; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32)daddr[i]; - secret[4] = net_secret[4] + - (((__force u16)sport << 16) + (__force u16)dport); - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash(, sizeof(combined), net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } EXPORT_SYMBOL(secure_tcpv6_sequence_number); u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr, __be16 dport) { - u32 secret[MD5_MESSAGE_BYTES / 4]; - u32 hash[MD5_DIGEST_WORDS]; - u32 i; - + const struct { + struct in6_addr saddr; + struct in6_addr daddr; + __be16 dport; + u16 padding1; + u32 padding2; + } __aligned(SIPHASH_ALIGNMENT) combined = { + .saddr = *(struct in6_addr *)saddr, + .daddr = *(struct in6_addr *)daddr, + .dport = dport + }; net_secret_init(); - memcpy(hash, saddr, 16); - for (i = 0; i < 4; i++) - secret[i] = net_secret[i] + (__force u32) daddr[i]; - secret[4] = net_secret[4] + (__force u32)dport; - for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++) - secret[i] = net_secret[i]; - - md5_transform(hash, secret); - - return hash[0]; + return siphash(, sizeof(combined), net_secret); } EXPORT_SYMBOL(secure_ipv6_port_ephemeral); #endif @@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral); u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport, u32 *tsoff) { - u32 hash[MD5_DIGEST_WORDS]; - + u64 hash; net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = ((__force u16)sport << 16) + (__force u16)dport; - hash[3] = net_secret[15]; - - md5_transform(hash, net_secret); - - *tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0; - return seq_scale(hash[0]); + hash = siphash_4u32(saddr, daddr, sport, dport, net_secret); + *tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0; + return seq_scale(hash); } u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport) { - u32 hash[MD5_DIGEST_WORDS]; - net_secret_init(); - hash[0] = (__force u32)saddr; - hash[1] = (__force u32)daddr; - hash[2] = (__force u32)dport ^ net_secret[14]; - hash[3] =
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
Hello, On 15.12.2016 19:50, Jason A. Donenfeld wrote: > Hi David & Hannes, > > This conversation is veering off course. Why? > I think this doesn't really > matter at all. Gcc converts u64 into essentially a pair of u32 on > 32-bit platforms, so the alignment requirements for 32-bit is at a > maximum 32 bits. On 64-bit platforms the alignment requirements are > related at a maximum to the biggest register size, so 64-bit > alignment. For this reason, no matter the behavior of __aligned(8), > we're okay. Likewise, even without __aligned(8), if gcc aligns structs > by their biggest member, then we get 4 byte alignment on 32-bit and 8 > byte alignment on 64-bit, which is fine. There's no 32-bit platform > that will trap on a 64-bit unaligned access because there's no such > thing as a 64-bit access there. In short, we're fine. ARM64 and x86-64 have memory operations that are not vector operations that operate on 128 bit memory. How do you know that the compiler for some architecture will not chose a more optimized instruction to load a 64 bit memory value into two 32 bit registers if you tell the compiler it is 8 byte aligned but it actually isn't? I don't know the answer but telling the compiler some data is 8 byte aligned while it isn't really pretty much seems like a call for trouble. Why can't a compiler not vectorize this code if it can prove that it doesn't conflict with other register users? Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 4/4] random: use SipHash in place of MD5
This duplicates the current algorithm for get_random_int/long, but uses siphash instead. This comes with several benefits. It's certainly faster and more cryptographically secure than MD5. This patch also separates hashed fields into three values instead of one, in order to increase diffusion. The previous MD5 algorithm used a per-cpu MD5 state, which caused successive calls to the function to chain upon each other. While it's not entirely clear that this kind of chaining is absolutely necessary when using a secure PRF like siphash, it can't hurt, and the timing of the call chain does add a degree of natural entropy. So, in keeping with this design, instead of the massive per-cpu 64-byte MD5 state, there is instead a per-cpu previously returned value for chaining. The speed benefits are substantial: | siphash | md5| speedup | -- get_random_long | 137130 | 415983 | 3.03x | get_random_int | 86384 | 343323 | 3.97x | Signed-off-by: Jason A. DonenfeldCc: Jean-Philippe Aumasson Cc: Ted Tso --- drivers/char/random.c | 32 +--- 1 file changed, 13 insertions(+), 19 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index d6876d506220..a51f0ff43f00 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -262,6 +262,7 @@ #include #include #include +#include #include #include @@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = { }; #endif /* CONFIG_SYSCTL */ -static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] cacheline_aligned; +static siphash_key_t random_int_secret; int random_int_secret_init(void) { @@ -2050,8 +2051,7 @@ int random_int_secret_init(void) return 0; } -static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) - __aligned(sizeof(unsigned long)); +static DEFINE_PER_CPU(u64, get_random_int_chaining); /* * Get a random word for internal kernel use only. Similar to urandom but @@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) */ unsigned int get_random_int(void) { - __u32 *hash; unsigned int ret; + u64 *chaining; if (arch_get_random_int()) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = hash[0]; - put_cpu_var(get_random_int_hash); - + chaining = _cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_int); @@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int); */ unsigned long get_random_long(void) { - __u32 *hash; unsigned long ret; + u64 *chaining; if (arch_get_random_long()) return ret; - hash = get_cpu_var(get_random_int_hash); - - hash[0] += current->pid + jiffies + random_get_entropy(); - md5_transform(hash, random_int_secret); - ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); - + chaining = _cpu_var(get_random_int_chaining); + ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() + + current->pid, random_int_secret); + put_cpu_var(get_random_int_chaining); return ret; } EXPORT_SYMBOL(get_random_long); -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
Hi David & Hannes, This conversation is veering off course. I think this doesn't really matter at all. Gcc converts u64 into essentially a pair of u32 on 32-bit platforms, so the alignment requirements for 32-bit is at a maximum 32 bits. On 64-bit platforms the alignment requirements are related at a maximum to the biggest register size, so 64-bit alignment. For this reason, no matter the behavior of __aligned(8), we're okay. Likewise, even without __aligned(8), if gcc aligns structs by their biggest member, then we get 4 byte alignment on 32-bit and 8 byte alignment on 64-bit, which is fine. There's no 32-bit platform that will trap on a 64-bit unaligned access because there's no such thing as a 64-bit access there. In short, we're fine. (The reason in6_addr aligns itself to 4 bytes on 64-bit platforms is that it's defined as being u32 blah[4]. If we added a u64 blah[2], we'd get 8 byte alignment, but that's not in the header. Feel free to start a new thread about this issue if you feel this ought to be added for whatever reason.) One optimization that's been suggested on this list is that instead of u8 key[16] and requiring the alignment attribute, I should just use u64 key[2]. This seems reasonable to me, and it will also save the endian conversion call. These keys generally aren't transmitted over a network, so I don't think a byte-wise encoding is particularly important. The other suggestion I've seen is that I make the functions take a const void * instead of a const u8 * for the data, in order to save ugly casts. I'll do this too. Meanwhile Linus has condemned our 4dwords/2qwords naming, and I'll need to think of something different. The best I can think of right now is siphash_4_u32/siphash_2_u64, but I don't find it especially pretty. Open to suggestions. Regards, Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long
Hi David, On Thu, Dec 15, 2016 at 11:14 AM, David Laightwrote: > From: Behalf Of Jason A. Donenfeld >> Sent: 14 December 2016 18:46 > ... >> + ret = *chaining = siphash24((u8 *), >> offsetof(typeof(combined), end), > > If you make the first argument 'const void *' you won't need the cast > on every call. > > I'd also suggest making the key u64[2]. I'll do both. Thanks for the suggestion. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver
On Thu, Dec 15, 2016 at 01:08:51AM +, Gonglei (Arei) wrote: > > > > > Regards, > -Gonglei > > > > -Original Message- > > From: Zeng, Xin [mailto:xin.z...@intel.com] > > Sent: Thursday, December 15, 2016 8:59 AM > > To: Gonglei (Arei); Halil Pasic; linux-ker...@vger.kernel.org; > > qemu-de...@nongnu.org; virtio-...@lists.oasis-open.org; > > virtualizat...@lists.linux-foundation.org; linux-crypto@vger.kernel.org > > Cc: Huangweidong (C); Claudio Fontana; m...@redhat.com; Luonengjun; > > Hanweidong (Randy); Xuquan (Quan Xu); Wanzongshun (Vincent); > > stefa...@redhat.com; Zhoujian (jay, Euler); cornelia.h...@de.ibm.com; > > longpeng; arei.gong...@hotmail.com; da...@davemloft.net; Wubin (H); > > herb...@gondor.apana.org.au > > Subject: RE: [Qemu-devel] [PATCH v7 1/1] crypto: add virtio-crypto driver > > > > On Thursday, December 15, 2016 8:45 AM, Gonglei (Arei) Wrote: > > < > > diff --git a/drivers/crypto/virtio/virtio_crypto_core.c > > < > b/drivers/crypto/virtio/virtio_crypto_core.c > > < > > new file mode 100644 > > < > > index 000..c0854a1 > > < > > --- /dev/null > > < > > +++ b/drivers/crypto/virtio/virtio_crypto_core.c > > < > > @@ -0,0 +1,474 @@ > > < > [..] > > < > > + > > < > > +static void virtcrypto_dataq_callback(struct virtqueue *vq) > > < > > +{ > > < > > + struct virtio_crypto *vcrypto = vq->vdev->priv; > > < > > + struct virtio_crypto_request *vc_req; > > < > > + unsigned long flags; > > < > > + unsigned int len; > > < > > + struct ablkcipher_request *ablk_req; > > < > > + int error; > > < > > + > > < > > + spin_lock_irqsave(>lock, flags); > > < > > > < > Would it make sense to use a per virtqueue lock > > < > like in virtio_blk for example instead of locking on the whole > > < > device? OK, it seems you use only one dataqueue, so it > > < > may not be that relevant. > > < > > > < Currently yes, both the backend device (cryptodev-backend-builtin) > > < and the frontend driver use one dataqueue. > > < > > > > I think it makes sense to use per virtqueue lock here though it only uses > > one > > queue so far, > > but in the spec we already have multi queues support. > > > Yes, I agree. Will do that in V8 soon. > Hope to catch up with Michael's pull request for 4.10. > > Regards, > -Gonglei I merged v7, this change will have to wait. Sorry. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Crypto Fixes for 4.10
Hi Linus: This push fixes the following issues: - A crash regression in the new skcipher walker. - Incorrect return value in public_key_verify_signature. - Fix for in-place signing in the sign-file utility. Please pull from git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus Alex Yashchenko (1): sign-file: Fix inplace signing when src and dst names are both specified Ard Biesheuvel (1): crypto: skcipher - fix crash in virtual walk Pan Bian (1): crypto: asymmetric_keys - set error code on failure crypto/asymmetric_keys/public_key.c |1 + crypto/skcipher.c |4 +++- scripts/sign-file.c |2 +- 3 files changed, 5 insertions(+), 2 deletions(-) Thanks, -- Email: Herbert XuHome Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 16:41, David Laight wrote: > Try (retyped): > > echo 'struct { long a; long long b; } s; int bar { return sizeof s; }' >foo.c > gcc [-m32] -O2 -S foo.c; cat foo.s > > And look at what is generated. I used __alignof__(unsigned long long) with -m32. >> Right now ipv6 addresses have an alignment of 4. So we couldn't even >> naturally pass them to siphash but would need to copy them around, which >> I feel like a source of bugs. > > That is more of a problem on systems that don't support misaligned accesses. > Reading the 64bit values with two explicit 32bit reads would work. > I think you can get gcc to do that by adding an aligned(4) attribute to the > structure member. Yes, and that is actually my fear, because we support those architectures. I can't comment on that as I don't understand enough of this. If someone finds a way to cause misaligned reads on a small box this seems (maybe depending on sysctls they get fixed up or panic) to be a much bigger issue than having a hash DoS. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 15 December 2016 14:57 > On 15.12.2016 14:56, David Laight wrote: > > From: Hannes Frederic Sowa > >> Sent: 15 December 2016 12:50 > >> On 15.12.2016 13:28, David Laight wrote: > >>> From: Hannes Frederic Sowa > Sent: 15 December 2016 12:23 > >>> ... > Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > bytes on 32 bit. Do you question that? > >>> > >>> Yes. > >>> > >>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 > >>> (etc). > >> > >> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I > >> am actually not sure if the ABI would say anything about that (sorry > >> also for my wrong statement above). > >> > >> Alignment requirement of unsigned long long on gcc with -m32 actually > >> seem to be 8. > > > > It depends on the architecture. > > For x86 it is definitely 4. > > May I ask for a reference? Ask anyone who has had to do compatibility layers to support 32bit binaries on 64bit systems. > I couldn't see unsigned long long being > mentioned in the ia32 abi spec that I found. I agree that those accesses > might be synthetically assembled by gcc and for me the alignment of 4 > would have seemed natural. But my gcc at least in 32 bit mode disagrees > with that. Try (retyped): echo 'struct { long a; long long b; } s; int bar { return sizeof s; }' >foo.c gcc [-m32] -O2 -S foo.c; cat foo.s And look at what is generated. > Right now ipv6 addresses have an alignment of 4. So we couldn't even > naturally pass them to siphash but would need to copy them around, which > I feel like a source of bugs. That is more of a problem on systems that don't support misaligned accesses. Reading the 64bit values with two explicit 32bit reads would work. I think you can get gcc to do that by adding an aligned(4) attribute to the structure member. David
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 14:56, David Laight wrote: > From: Hannes Frederic Sowa >> Sent: 15 December 2016 12:50 >> On 15.12.2016 13:28, David Laight wrote: >>> From: Hannes Frederic Sowa Sent: 15 December 2016 12:23 >>> ... Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 bytes on 32 bit. Do you question that? >>> >>> Yes. >>> >>> The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). >> >> Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I >> am actually not sure if the ABI would say anything about that (sorry >> also for my wrong statement above). >> >> Alignment requirement of unsigned long long on gcc with -m32 actually >> seem to be 8. > > It depends on the architecture. > For x86 it is definitely 4. May I ask for a reference? I couldn't see unsigned long long being mentioned in the ia32 abi spec that I found. I agree that those accesses might be synthetically assembled by gcc and for me the alignment of 4 would have seemed natural. But my gcc at least in 32 bit mode disagrees with that. > It might be 8 for sparc, ppc and/or alpha. This is something to find out... Right now ipv6 addresses have an alignment of 4. So we couldn't even naturally pass them to siphash but would need to copy them around, which I feel like a source of bugs. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:50 > On 15.12.2016 13:28, David Laight wrote: > > From: Hannes Frederic Sowa > >> Sent: 15 December 2016 12:23 > > ... > >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > >> bytes on 32 bit. Do you question that? > > > > Yes. > > > > The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). > > Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I > am actually not sure if the ABI would say anything about that (sorry > also for my wrong statement above). > > Alignment requirement of unsigned long long on gcc with -m32 actually > seem to be 8. It depends on the architecture. For x86 it is definitely 4. It might be 8 for sparc, ppc and/or alpha. David N�r��yb�X��ǧv�^�){.n�+{�r����ܨ}���Ơz�:+v���zZ+��+zf���h���~i���z��w���?�&�)ߢf
[PATCH -resend with CC] crypto: algif_hash, avoid zero-sized array
With this reproducer: struct sockaddr_alg alg = { .salg_family = 0x26, .salg_type = "hash", .salg_feat = 0xf, .salg_mask = 0x5, .salg_name = "digest_null", }; int sock, sock2; sock = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(sock, (struct sockaddr *), sizeof(alg)); sock2 = accept(sock, NULL, NULL); setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2); accept(sock2, NULL, NULL); 8< 8< 8< 8< one can immediatelly see an UBSAN warning: UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7 variable length array bound value 0 <= 0 CPU: 0 PID: 15949 Comm: syz-executor Tainted: GE 4.4.30-0-default #1 ... Call Trace: ... [] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188 [] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc [] ? hash_accept+0x5bd/0x7d0 [algif_hash] [] ? hash_accept_nokey+0x3f/0x51 [algif_hash] [] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash] [] ? SyS_accept+0x2b/0x40 It is a correct warning, as hash state is propagated to accept as zero, but creating a zero-length variable array is not allowed in C. Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or similar happens in the code there, so we just allocate one byte even though we do not use the array. Signed-off-by: Jiri SlabyCc: Herbert Xu Cc: "David S. Miller" (maintainer:CRYPTO API) Reported-by: Sasha Levin --- crypto/algif_hash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c index d19b09cdf284..54fc90e8339c 100644 --- a/crypto/algif_hash.c +++ b/crypto/algif_hash.c @@ -245,7 +245,7 @@ static int hash_accept(struct socket *sock, struct socket *newsock, int flags) struct alg_sock *ask = alg_sk(sk); struct hash_ctx *ctx = ask->private; struct ahash_request *req = >req; - char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req))]; + char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1]; struct sock *sk2; struct alg_sock *ask2; struct hash_ctx *ctx2; -- 2.11.0 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: skcipher - fix crash in virtual walk
On 12/14/2016 11:39 AM, Herbert Xu wrote: > On Tue, Dec 13, 2016 at 01:34:02PM +, Ard Biesheuvel wrote: >> The new skcipher walk API may crash in the following way. (Interestingly, >> the tcrypt boot time tests seem unaffected, while an explicit test using >> the module triggers it) >> >> Unable to handle kernel NULL pointer dereference at virtual address >> >> ... >> [] __memcpy+0x84/0x180 >> [] skcipher_walk_done+0x328/0x340 >> [] ctr_encrypt+0x84/0x100 >> [] simd_skcipher_encrypt+0x88/0x98 >> [] crypto_rfc3686_crypt+0x8c/0x98 >> [] test_skcipher_speed+0x518/0x820 [tcrypt] >> [] do_test+0x1408/0x3b70 [tcrypt] >> [] tcrypt_mod_init+0x50/0x1000 [tcrypt] >> [] do_one_initcall+0x44/0x138 >> [] do_init_module+0x68/0x1e0 >> [] load_module+0x1fd0/0x2458 >> [] SyS_finit_module+0xe0/0xf0 >> [] el0_svc_naked+0x24/0x28 >> >> This is due to the fact that skcipher_done_slow() may be entered with >> walk->buffer unset. Since skcipher_walk_done() already deals with the >> case where walk->buffer == walk->page, it appears to be the intention >> that walk->buffer point to walk->page after skcipher_next_slow(), so >> ensure that is the case. >> >> Signed-off-by: Ard Biesheuvel> > Patch applied. Thanks. Fixed problem here as well... Tested-by: Milan Broz Without patch, just running cryptsetup tests (make check) on 32bit machine I get kernel crash in LRW mode. For me, the problem can be triggered from the userspace crypto API (no root needed!) dd if=/dev/zero of=test bs=1M count=32 echo blah | /sbin/cryptsetup luksFormat test -c aes-lrw-null -q --use-urandom and I get Dec 15 13:00:50 kernel: NET: Registered protocol family 38 Dec 15 13:00:50 kernel: BUG: unable to handle kernel NULL pointer dereference at (null) Dec 15 13:00:50 kernel: IP: memcpy+0xf/0x20 Dec 15 13:00:50 kernel: *pde = Dec 15 13:00:50 kernel: Dec 15 13:00:50 kernel: Oops: [#1] PREEMPT SMP Dec 15 13:00:50 kernel: Modules linked in: lrw algif_skcipher af_alg loop rpcsec_gss_krb5 dm_mod crc32_pclmul crc32c_intel pcbc aesni_intel aes_i586 crypto_simd cryptd ata_piix Dec 15 13:00:50 kernel: CPU: 0 PID: 1667 Comm: cryptsetup Tainted: GW 4.9.0+ #122 Dec 15 13:00:50 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 Dec 15 13:00:50 kernel: task: f5cc66c0 task.stack: f45e6000 Dec 15 13:00:50 kernel: EIP: memcpy+0xf/0x20 Dec 15 13:00:50 kernel: EFLAGS: 00010202 CPU: 0 Dec 15 13:00:50 kernel: EAX: fffbaffc EBX: 0004 ECX: 0001 EDX: Dec 15 13:00:50 kernel: ESI: EDI: fffbaffc EBP: f45e7d48 ESP: f45e7d3c Dec 15 13:00:50 kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Dec 15 13:00:50 kernel: CR0: 80050033 CR2: CR3: 35eba000 CR4: 001406d0 Dec 15 13:00:50 kernel: Call Trace: Dec 15 13:00:50 kernel: scatterwalk_copychunks+0x75/0xd0 Dec 15 13:00:50 kernel: skcipher_walk_done+0x2b1/0x300 Dec 15 13:00:50 kernel: pre_crypt+0x230/0x350 [lrw] Dec 15 13:00:50 kernel: ? __kmalloc+0x222/0x270 Dec 15 13:00:50 kernel: do_decrypt+0x27/0xb0 [lrw] Dec 15 13:00:50 kernel: decrypt+0x19/0x20 [lrw] Dec 15 13:00:50 kernel: skcipher_recvmsg+0x2d3/0x6c0 [algif_skcipher] Dec 15 13:00:50 kernel: ? trace_hardirqs_on_caller+0xd6/0x200 Dec 15 13:00:50 kernel: ? release_sock+0x63/0x90 Dec 15 13:00:50 kernel: ? trace_hardirqs_on+0xb/0x10 Dec 15 13:00:50 kernel: ? __local_bh_enable_ip+0x5c/0xd0 Dec 15 13:00:50 kernel: ? _raw_spin_unlock_bh+0x2a/0x30 Dec 15 13:00:50 kernel: skcipher_recvmsg_nokey+0x26/0x38 [algif_skcipher] Dec 15 13:00:50 kernel: sock_read_iter+0x77/0xb0 Dec 15 13:00:50 kernel: __vfs_read+0xaa/0x120 Dec 15 13:00:50 kernel: vfs_read+0x72/0x100 Dec 15 13:00:50 kernel: SyS_read+0x3d/0xa0 Dec 15 13:00:50 kernel: do_int80_syscall_32+0x40/0x110 Dec 15 13:00:50 kernel: entry_INT80_32+0x2f/0x2f Dec 15 13:00:50 kernel: EIP: 0xb77cc9f2 Dec 15 13:00:50 kernel: EFLAGS: 0246 CPU: 0 Dec 15 13:00:50 kernel: EAX: ffda EBX: 0006 ECX: bff01f4c EDX: 0200 Dec 15 13:00:50 kernel: ESI: 0030 EDI: fffb EBP: bff01e78 ESP: bff01dd4 Dec 15 13:00:50 kernel: DS: 007b ES: 007b FS: GS: 0033 SS: 007b Dec 15 13:00:50 kernel: Code: fc ff ff 8b 43 58 2b 43 50 88 43 4e 5b 5d c3 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 57 89 c7 56 89 d6 53 89 cb c1 e9 02 a5 89 d9 83 e1 03 74 02 f3 a4 5b 5e 5f 5d c3 90 55 89 e5 57 Dec 15 13:00:50 kernel: EIP: memcpy+0xf/0x20 SS:ESP: 0068:f45e7d3c Dec 15 13:00:50 kernel: CR2: Dec 15 13:00:50 kernel: ---[ end trace 6d5fbbd95de42602 ]--- Dec 15 13:00:50 kernel: note: cryptsetup[1667] exited with preempt_count 1 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 0/3] Add Support for Cavium Cryptographic Accelerarion Unit
Hi, I got few review comments for this series from David Daney. I am reworking on this series and will sent v3 once it is done. So,kindly ignore this series Regards, -George On 12/13/2016 07:33 PM, George Cherian wrote: This series adds the support for Cavium Cryptographic Accelerarion Unit (CPT) CPT is available in Cavium's Octeon-Tx SoC series. The series was tested with ecryptfs and dm-crypt for in kernel cryptographic offload operations. Changes v1 -> v2 -- Addressed a crash issue when more gather components are passed. -- Redo the cptvf request manager. - Get rid of the un necessary buffer copies. -- s/uint*_t/u* -- Remove unwanted Macro definitions -- Remove the redundant ROUNDUP* macros and use kernel function -- Select proper config option in Kconfig file. -- Removed some of the unwanted header file inclusions -- Miscellaneous Cleanup George Cherian (3): drivers: crypto: Add Support for Octeon-tx CPT Engine drivers: crypto: Add the Virtual Function driver for CPT drivers: crypto: Enable CPT options crypto for build drivers/crypto/Kconfig | 1 + drivers/crypto/Makefile | 1 + drivers/crypto/cavium/cpt/Kconfig| 16 + drivers/crypto/cavium/cpt/Makefile | 3 + drivers/crypto/cavium/cpt/cpt_common.h | 166 + drivers/crypto/cavium/cpt/cpt_hw_types.h | 736 drivers/crypto/cavium/cpt/cptpf.h| 69 ++ drivers/crypto/cavium/cpt/cptpf_main.c | 733 drivers/crypto/cavium/cpt/cptpf_mbox.c | 163 + drivers/crypto/cavium/cpt/cptvf.h| 145 drivers/crypto/cavium/cpt/cptvf_algs.c | 424 drivers/crypto/cavium/cpt/cptvf_algs.h | 110 +++ drivers/crypto/cavium/cpt/cptvf_main.c | 971 +++ drivers/crypto/cavium/cpt/cptvf_mbox.c | 205 ++ drivers/crypto/cavium/cpt/cptvf_reqmanager.c | 581 drivers/crypto/cavium/cpt/request_manager.h | 147 16 files changed, 4471 insertions(+) create mode 100644 drivers/crypto/cavium/cpt/Kconfig create mode 100644 drivers/crypto/cavium/cpt/Makefile create mode 100644 drivers/crypto/cavium/cpt/cpt_common.h create mode 100644 drivers/crypto/cavium/cpt/cpt_hw_types.h create mode 100644 drivers/crypto/cavium/cpt/cptpf.h create mode 100644 drivers/crypto/cavium/cpt/cptpf_main.c create mode 100644 drivers/crypto/cavium/cpt/cptpf_mbox.c create mode 100644 drivers/crypto/cavium/cpt/cptvf.h create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.c create mode 100644 drivers/crypto/cavium/cpt/cptvf_algs.h create mode 100644 drivers/crypto/cavium/cpt/cptvf_main.c create mode 100644 drivers/crypto/cavium/cpt/cptvf_mbox.c create mode 100644 drivers/crypto/cavium/cpt/cptvf_reqmanager.c create mode 100644 drivers/crypto/cavium/cpt/request_manager.h -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 13:28, David Laight wrote: > From: Hannes Frederic Sowa >> Sent: 15 December 2016 12:23 > ... >> Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 >> bytes on 32 bit. Do you question that? > > Yes. > > The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). Hmm, u64 on 32 bit is unsigned long long and not unsigned long. Thus I am actually not sure if the ABI would say anything about that (sorry also for my wrong statement above). Alignment requirement of unsigned long long on gcc with -m32 actually seem to be 8. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 15 December 2016 12:23 ... > Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 > bytes on 32 bit. Do you question that? Yes. The linux ABI for x86 (32 bit) only requires 32bit alignment for u64 (etc). David
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 12:04, David Laight wrote: > From: Hannes Frederic Sowa >> Sent: 14 December 2016 22:03 >> On 14.12.2016 13:46, Jason A. Donenfeld wrote: >>> Hi David, >>> >>> On Wed, Dec 14, 2016 at 10:56 AM, David Laight>>> wrote: ... > +u64 siphash24(const u8 *data, size_t len, const u8 > key[SIPHASH24_KEY_LEN]) ... > + u64 k0 = get_unaligned_le64(key); > + u64 k1 = get_unaligned_le64(key + sizeof(u64)); ... > + m = get_unaligned_le64(data); All these unaligned accesses are going to get expensive on architectures like sparc64. >>> >>> Yes, the unaligned accesses aren't pretty. Since in pretty much all >>> use cases thus far, the data can easily be made aligned, perhaps it >>> makes sense to create siphash24() and siphash24_unaligned(). Any >>> thoughts on doing something like that? >> >> I fear that the alignment requirement will be a source of bugs on 32 bit >> machines, where you cannot even simply take a well aligned struct on a >> stack and put it into the normal siphash(aligned) function without >> adding alignment annotations everywhere. Even blocks returned from >> kmalloc on 32 bit are not aligned to 64 bit. > > Are you doing anything that will require 64bit alignment on 32bit systems? > It is unlikely that the kernel can use any simd registers that have wider > alignment requirements. > > You also really don't want to request on-stack items have large alignments. > While gcc can generate code to do it, it isn't pretty. Hmm? Even the Intel ABI expects alignment of unsigned long long to be 8 bytes on 32 bit. Do you question that? -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa > Sent: 14 December 2016 22:03 > On 14.12.2016 13:46, Jason A. Donenfeld wrote: > > Hi David, > > > > On Wed, Dec 14, 2016 at 10:56 AM, David Laight> > wrote: > >> ... > >>> +u64 siphash24(const u8 *data, size_t len, const u8 > >>> key[SIPHASH24_KEY_LEN]) > >> ... > >>> + u64 k0 = get_unaligned_le64(key); > >>> + u64 k1 = get_unaligned_le64(key + sizeof(u64)); > >> ... > >>> + m = get_unaligned_le64(data); > >> > >> All these unaligned accesses are going to get expensive on architectures > >> like sparc64. > > > > Yes, the unaligned accesses aren't pretty. Since in pretty much all > > use cases thus far, the data can easily be made aligned, perhaps it > > makes sense to create siphash24() and siphash24_unaligned(). Any > > thoughts on doing something like that? > > I fear that the alignment requirement will be a source of bugs on 32 bit > machines, where you cannot even simply take a well aligned struct on a > stack and put it into the normal siphash(aligned) function without > adding alignment annotations everywhere. Even blocks returned from > kmalloc on 32 bit are not aligned to 64 bit. Are you doing anything that will require 64bit alignment on 32bit systems? It is unlikely that the kernel can use any simd registers that have wider alignment requirements. You also really don't want to request on-stack items have large alignments. While gcc can generate code to do it, it isn't pretty. David
RE: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function
From: Linus Torvalds > Sent: 15 December 2016 00:11 > On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeldwrote: > > > > Or does your reasonable dislike of "word" still allow for the use of > > dword and qword, so that the current function names of: > > dword really is confusing to people. > > If you have a MIPS background, it means 64 bits. While to people with > Windows programming backgrounds it means 32 bits. Guess what a DWORD_PTR is on 64bit windows ... (it is an integer type). David
RE: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long
From: Behalf Of Jason A. Donenfeld > Sent: 14 December 2016 18:46 ... > + ret = *chaining = siphash24((u8 *), offsetof(typeof(combined), > end), If you make the first argument 'const void *' you won't need the cast on every call. I'd also suggest making the key u64[2]. David -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] crypto: bfin_crc: Remove unneeded linux/miscdevice.h include
bfin_crc.h driver does not use any miscdevice, so this patch remove this unnecessary inclusion. Signed-off-by: Corentin Labbe--- drivers/crypto/bfin_crc.h | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/crypto/bfin_crc.h b/drivers/crypto/bfin_crc.h index 75cef4d..786ef74 100644 --- a/drivers/crypto/bfin_crc.h +++ b/drivers/crypto/bfin_crc.h @@ -55,7 +55,6 @@ struct crc_info { #include #include -#include struct crc_register { u32 control; -- 2.10.2 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] crypto: bfin_crc: Fix format printing warning
bfin_crc.c print some u32 as unsigned long ans so gcc complains about it. This patch remove the long print qualifier. Signed-off-by: Corentin Labbe--- drivers/crypto/bfin_crc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/bfin_crc.c b/drivers/crypto/bfin_crc.c index 10db7df..a118b9b 100644 --- a/drivers/crypto/bfin_crc.c +++ b/drivers/crypto/bfin_crc.c @@ -203,7 +203,7 @@ static void bfin_crypto_crc_config_dma(struct bfin_crypto_crc *crc) crc->sg_cpu[i].x_count = 1; crc->sg_cpu[i].x_modify = CHKSUM_DIGEST_SIZE; dev_dbg(crc->dev, "%d: crc_dma: start_addr:0x%lx, " - "cfg:0x%lx, x_count:0x%lx, x_modify:0x%lx\n", + "cfg:0x%x, x_count:0x%x, x_modify:0x%x\n", i, crc->sg_cpu[i].start_addr, crc->sg_cpu[i].cfg, crc->sg_cpu[i].x_count, crc->sg_cpu[i].x_modify); @@ -233,7 +233,7 @@ static void bfin_crypto_crc_config_dma(struct bfin_crypto_crc *crc) crc->sg_cpu[i].x_count = dma_count; crc->sg_cpu[i].x_modify = dma_mod; dev_dbg(crc->dev, "%d: crc_dma: start_addr:0x%lx, " - "cfg:0x%lx, x_count:0x%lx, x_modify:0x%lx\n", + "cfg:0x%x, x_count:0x%x, x_modify:0x%x\n", i, crc->sg_cpu[i].start_addr, crc->sg_cpu[i].cfg, crc->sg_cpu[i].x_count, crc->sg_cpu[i].x_modify); @@ -257,7 +257,7 @@ static void bfin_crypto_crc_config_dma(struct bfin_crypto_crc *crc) crc->sg_cpu[i].x_count = 1; crc->sg_cpu[i].x_modify = CHKSUM_DIGEST_SIZE; dev_dbg(crc->dev, "%d: crc_dma: start_addr:0x%lx, " - "cfg:0x%lx, x_count:0x%lx, x_modify:0x%lx\n", + "cfg:0x%x, x_count:0x%x, x_modify:0x%x\n", i, crc->sg_cpu[i].start_addr, crc->sg_cpu[i].cfg, crc->sg_cpu[i].x_count, crc->sg_cpu[i].x_modify); -- 2.10.2 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On 15.12.2016 00:29, Jason A. Donenfeld wrote: > Hi Hannes, > > On Wed, Dec 14, 2016 at 11:03 PM, Hannes Frederic Sowa >wrote: >> I fear that the alignment requirement will be a source of bugs on 32 bit >> machines, where you cannot even simply take a well aligned struct on a >> stack and put it into the normal siphash(aligned) function without >> adding alignment annotations everywhere. Even blocks returned from >> kmalloc on 32 bit are not aligned to 64 bit. > > That's what the "__aligned(SIPHASH24_ALIGNMENT)" attribute is for. The > aligned siphash function will be for structs explicitly made for > siphash consumption. For everything else there's siphash_unaligned. So in case you have a pointer from somewhere on 32 bit you can essentially only guarantee it has natural alignment or max. native alignment (based on the arch). gcc only fulfills your request for alignment when you allocate on the stack (minus gcc bugs). Let's say you get a pointer from somewhere, maybe embedded in a struct, which came from kmalloc. kmalloc doesn't care about aligned attribute, it will align according to architecture description. That said, if you want to hash that, you would need manually align the memory returned from kmalloc or make sure the the data is more than naturally aligned on that architecture. >> Can we do this a runtime check and just have one function (siphash) >> dealing with that? > > Seems like the runtime branching on the aligned function would be bad > for performance, when we likely know at compile time if it's going to > be aligned or not. I suppose we could add that check just to the > unaligned version, and rename it to "maybe_unaligned"? Is this what > you have in mind? I argue that you mostly don't know at compile time if it is correctly aligned if the alignment requirements are larger than the natural ones. Also, we don't even have that for memcpy, even we use it probably much more than hashing, so I think this is overkill. Bye, Hannes -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kernel-hardening] Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
On Thu, 2016-12-15 at 15:57 +0800, Herbert Xu wrote: > Jason A. Donenfeldwrote: > > > > Siphash needs a random secret key, yes. The point is that the hash > > function remains secure so long as the secret key is kept secret. > > Other functions can't make the same guarantee, and so nervous > > periodic > > key rotation is necessary, but in most cases nothing is done, and so > > things just leak over time. > > Actually those users that use rhashtable now have a much more > sophisticated defence against these attacks, dyanmic rehashing > when bucket length exceeds a preset limit. > > Cheers, Key independent collisions won't be mitigated by picking a new secret. A simple solution with clear security properties is ideal. signature.asc Description: This is a digitally signed message part
Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
Jason A. Donenfeldwrote: > > Siphash needs a random secret key, yes. The point is that the hash > function remains secure so long as the secret key is kept secret. > Other functions can't make the same guarantee, and so nervous periodic > key rotation is necessary, but in most cases nothing is done, and so > things just leak over time. Actually those users that use rhashtable now have a much more sophisticated defence against these attacks, dyanmic rehashing when bucket length exceeds a preset limit. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html