Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-17 Thread Christian Kujau
On Thu, 15 Dec 2016, Jason A. Donenfeld wrote:
> > I'd still drop the "24" unless you really think we're going to have
> > multiple variants coming into the kernel.
> 
> Okay. I don't have a problem with this, unless anybody has some reason
> to the contrary.

What if the 2/4-round version falls and we need more rounds to withstand 
future cryptoanalysis? We'd then have siphash_ and siphash48_ functions, 
no? My amateurish bike-shedding argument would be "let's keep the 24 then" :-)

C.
-- 
BOFH excuse #354:

Chewing gum on /dev/sd3c
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-15 Thread David Laight
From: Linus Torvalds
> Sent: 15 December 2016 00:11
> On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld  wrote:
> >
> > Or does your reasonable dislike of "word" still allow for the use of
> > dword and qword, so that the current function names of:
> 
> dword really is confusing to people.
>
> If you have a MIPS background, it means 64 bits. While to people with
> Windows programming backgrounds it means 32 bits.

Guess what a DWORD_PTR is on 64bit windows ...
(it is an integer type).

David



Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Linus Torvalds
On Wed, Dec 14, 2016 at 3:34 PM, Jason A. Donenfeld  wrote:
>
> Or does your reasonable dislike of "word" still allow for the use of
> dword and qword, so that the current function names of:

dword really is confusing to people.

If you have a MIPS background, it means 64 bits. While to people with
Windows programming backgrounds it means 32 bits.

Please try to avoid using it.

As mentioned, I think almost everybody agrees on the "q" part being 64
bits, but that may just be me not having seen it in any other context.

And before anybody points it out - yes, we already have lots of uses
of "dword" in various places. But they tend to be mostly
hardware-specific - either architectures or drivers.

So I'd _prefer_ to try to keep "word" and "dword" away from generic
helper routines. But it's not like anything is really black and white.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Linus Torvalds
On Wed, Dec 14, 2016 at 2:56 PM, Jason A. Donenfeld  wrote:
>
> So actually jhash_Nwords makes no sense, since it takes dwords
> (32-bits) not words (16-bits). The siphash analog should be called
> siphash24_Nqwords.

No. The bug is talking about "words" in the first place.

Depending on your background, a "word" can be generally be either 16
bits or 32 bits (or, in some cases, 18 bits).

In theory, a 64-bit entity can be a "word" too, but pretty much nobody
uses that. Even architectures that started out with a 64-bit register
size and never had any smaller historical baggage (eg alpha) tend to
call 32-bit entities "words".

So 16 bits can be a word, but some people/architectures will call it a
"half-word".

To make matters even more confusing, a "quadword" is generally always
64 bits, regardless of the size of "word".

So please try to avoid the use of "word" entirely. It's too ambiguous,
and it's not even helpful as a "size of the native register". It's
almost purely random.

For the kernel, we tend use

 - uX for types that have specific sizes (X being the number of bits)

 - "[unsigned] long" for native register size

But never "word".

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hey Tom,

On Thu, Dec 15, 2016 at 12:14 AM, Tom Herbert  wrote:
> I'm confused, doesn't 2dword == 1qword? Anyway, I think the qword
> functions are good enough. If someone needs to hash over some odd
> length they can either put them in a structure padded to 64 bits or
> call the hash function that takes a byte length.

Yes. Here's an example:

static inline u64 siphash24_2dwords(const u32 a, const u32 b, const u8
key[SIPHASH24_KEY_LEN])
{
   return siphash24_1qword(((u64)b << 32) | a, key);
}

This winds up being extremely useful and syntactically convenient in a
few places. Check out my git branch in about 10 minutes or wait for v4
to be posted tomorrow; these are nice helpers.

> I'd still drop the "24" unless you really think we're going to have
> multiple variants coming into the kernel.

Okay. I don't have a problem with this, unless anybody has some reason
to the contrary.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Tom Herbert
On Wed, Dec 14, 2016 at 2:56 PM, Jason A. Donenfeld  wrote:
> Hey Tom,
>
> On Wed, Dec 14, 2016 at 10:35 PM, Tom Herbert  wrote:
>> Those look good, although I would probably just do 1,2,3 words and
>> then have a function that takes n words like jhash. Might want to call
>> these dword to distinguish from 32 bit words in jhash.
>
> So actually jhash_Nwords makes no sense, since it takes dwords
> (32-bits) not words (16-bits). The siphash analog should be called
> siphash24_Nqwords.
>
Yeah, that's a "bug" with jhash function names.

> I think what I'll do is change what I already have to:
> siphash24_1qword
> siphash24_2qword
> siphash24_3qword
> siphash24_4qword
>
> And then add some static inline helpers to assist with smaller u32s
> like ipv4 addresses called:
>
> siphash24_2dword
> siphash24_4dword
> siphash24_6dword
> siphash24_8dword
>
> While we're having something new, might as well call it the right thing.
>
I'm confused, doesn't 2dword == 1qword? Anyway, I think the qword
functions are good enough. If someone needs to hash over some odd
length they can either put them in a structure padded to 64 bits or
call the hash function that takes a byte length.

>
>> Also, what is the significance of "24" in the function and constant
>> names? Can we just drop that and call this siphash?
>
> SipHash is actually a family of PRFs, differentiated by the number of
> SIPROUNDs after each 64-bit input is processed and the number of
> SIPROUNDs at the very end of the function. The best trade-off of speed
> and security for kernel usage is 2 rounds after each 64-bit input and
> 4 rounds at the end of the function. This doesn't fall to any known
> cryptanalysis and it's very fast.

I'd still drop the "24" unless you really think we're going to have
multiple variants coming into the kernel.

Tom
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hey Tom,

On Wed, Dec 14, 2016 at 10:35 PM, Tom Herbert  wrote:
> Those look good, although I would probably just do 1,2,3 words and
> then have a function that takes n words like jhash. Might want to call
> these dword to distinguish from 32 bit words in jhash.

So actually jhash_Nwords makes no sense, since it takes dwords
(32-bits) not words (16-bits). The siphash analog should be called
siphash24_Nqwords.

I think what I'll do is change what I already have to:
siphash24_1qword
siphash24_2qword
siphash24_3qword
siphash24_4qword

And then add some static inline helpers to assist with smaller u32s
like ipv4 addresses called:

siphash24_2dword
siphash24_4dword
siphash24_6dword
siphash24_8dword

While we're having something new, might as well call it the right thing.


> Also, what is the significance of "24" in the function and constant
> names? Can we just drop that and call this siphash?

SipHash is actually a family of PRFs, differentiated by the number of
SIPROUNDs after each 64-bit input is processed and the number of
SIPROUNDs at the very end of the function. The best trade-off of speed
and security for kernel usage is 2 rounds after each 64-bit input and
4 rounds at the end of the function. This doesn't fall to any known
cryptanalysis and it's very fast.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Interesting. Evidently gcc 4.8 doesn't like my use of:

enum siphash_lengths {
   SIPHASH24_KEY_LEN = 16,
   SIPHASH24_ALIGNMENT = 8
};

I'll convert this to the more boring:

#define SIPHASH24_KEY_LEN 16
#define SIPHASH24_ALIGNMENT 8
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread kbuild test robot
Hi Jason,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.9 next-20161214]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/siphash-add-cryptographically-secure-hashtable-function/20161215-041458
config: i386-randconfig-i1-201650 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   lib/test_siphash.c: In function 'siphash_test_init':
>> lib/test_siphash.c:49:2: error: requested alignment is not an integer 
>> constant
 u8 in[64] __aligned(SIPHASH24_ALIGNMENT);
 ^
   lib/test_siphash.c:50:2: error: requested alignment is not an integer 
constant
 u8 k[16] __aligned(SIPHASH24_ALIGNMENT);
 ^

vim +49 lib/test_siphash.c

43  0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 
0xe51b38608ef25f57ULL,
44  0x958a324ceb064572ULL
45  };
46  
47  static int __init siphash_test_init(void)
48  {
  > 49  u8 in[64] __aligned(SIPHASH24_ALIGNMENT);
50  u8 k[16] __aligned(SIPHASH24_ALIGNMENT);
51  u8 in_unaligned[65];
52  u8 k_unaligned[65];

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hey Tom,

Just following up on what I mentioned in my last email...

On Wed, Dec 14, 2016 at 8:35 PM, Jason A. Donenfeld  wrote:
> I think your suggestion for (2) will contribute to further
> optimizations for (1). In v2, I had another patch in there adding
> siphash_1word, siphash_2words, etc, like jhash, but I implemented it
> by taking u32 variables and then just concatenating these into a
> buffer and passing them to the main siphash function. I removed it
> from v3 because I thought that these kind of missed the whole point.
> In particular:
>
> a) siphash24_1word, siphash24_2words, siphash24_3words, etc should
> take u64, not u32, since that's what siphash operates on natively

I implemented these here:
https://git.zx2c4.com/linux-dev/commit/?h=siphash=4652b6f3643bdba217e2194d89661348bbac48a0

This will be part of the next version of the series I submit. It's not
immediately clear that using it is strictly faster than the struct
trick though. However, I'm not yet sure why this would be.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Jason A. Donenfeld
Hi Tom,

On Wed, Dec 14, 2016 at 8:18 PM, Tom Herbert  wrote:
> "super fast" is relative. My quick test shows that this faster than
> Toeplitz (good, but not exactly hard to achieve), but is about 4x
> slower than jhash.

Fast relative to other cryptographically secure PRFs.

>> SipHash isn't just some new trendy hash function. It's been around for a
>> while, and there really isn't anything that comes remotely close to
>> being useful in the way SipHash is. With that said, why do we need this?
> I don't think we need advertising nor a lesson on hashing. It would be
> much more useful if you just point us to the paper on siphash (which I
> assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

Ugh. Sorry. It definitely wasn't my intention to give an uninvited
lesson or an annoying advert. For the former, I didn't want to make
any expectations about fields of knowledge, because I honest have no
idea. For the latter, I wrote that sentence to indicate that siphash
isn't just some newfangled hipster function, but something useful and
well established. I didn't mean it as a form of advertising. My
apologies if I've offended your sensibilities.

That cr.yp.to link is fine, or https://131002.net/siphash/siphash.pdf I believe.

> Key rotation is important anyway, without any key rotation even if the
> key is compromised in siphash by some external means we would have an
> insecure hash until the system reboots.

I'm a bit surprised to read this. I've never designed a system to be
secure even in the event of remote arbitrary kernel memory disclosure,
and I wasn't aware this was generally considered an architectural
requirement or Linux.

In any case, if you want this, I suppose you can have it with siphash too.

> Maybe so, but we need to do due diligence before considering adopting
> siphash as the primary hashing in the network stack. Consider that we
> may very well perform a hash over L4 tuples on _every_ packet. We've
> done a good job at limiting this to be at most one hash per packet,
> but nevertheless the performance of the hash function must be take
> into account.

I agree with you. It seems like each case is going to needed to be
measured on a case by case basis. In this series I make the first use
of siphash in the secure sequence generation and get_random_int/long,
where siphash replaces md5, so there's a pretty clear performance in.
But for the jhash replacements indeed things are going to need to be
individually evaluated.

> 1) My quick test shows siphash is about four times more expensive than
> jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
> addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
> nsecs with siphash. Given that we have eliminated most of the packet
> header hashes this might be tolerable, but still should be looking at
> ways to optimize.
> 2) I like moving to use u64 (quad words) in the hash, this is an
> improvement over Jenkins which is based on 32 bit words. If we put
> this in the kernel we probably want to have several variants of
> siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
> siphashn for hash over one, two, three, or n sixty four bit words).

I think your suggestion for (2) will contribute to further
optimizations for (1). In v2, I had another patch in there adding
siphash_1word, siphash_2words, etc, like jhash, but I implemented it
by taking u32 variables and then just concatenating these into a
buffer and passing them to the main siphash function. I removed it
from v3 because I thought that these kind of missed the whole point.
In particular:

a) siphash24_1word, siphash24_2words, siphash24_3words, etc should
take u64, not u32, since that's what siphash operates on natively
b) Rather than concatenating them in a buffer, I should write
specializations of the siphash24 function _especially_ for these size
inputs to avoid the copy and to reduce the book keeping.

I'll add these functions to v4 implemented like that.

Thanks for the useful feedback and benchmarks!

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] siphash: add cryptographically secure hashtable function

2016-12-14 Thread Tom Herbert
On Wed, Dec 14, 2016 at 10:46 AM, Jason A. Donenfeld  wrote:
> SipHash is a 64-bit keyed hash function that is actually a
> cryptographically secure PRF, like HMAC. Except SipHash is super fast,
> and is meant to be used as a hashtable keyed lookup function.
>
"super fast" is relative. My quick test shows that this faster than
Toeplitz (good, but not exactly hard to achieve), but is about 4x
slower than jhash.

> SipHash isn't just some new trendy hash function. It's been around for a
> while, and there really isn't anything that comes remotely close to
> being useful in the way SipHash is. With that said, why do we need this?
>
I don't think we need advertising nor a lesson on hashing. It would be
much more useful if you just point us to the paper on siphash (which I
assume I http://cr.yp.to/siphash/siphash-20120918.pdf ?).

> There are a variety of attacks known as "hashtable poisoning" in which an
> attacker forms some data such that the hash of that data will be the
> same, and then preceeds to fill up all entries of a hashbucket. This is
> a realistic and well-known denial-of-service vector.
>
> Linux developers already seem to be aware that this is an issue, and
> various places that use hash tables in, say, a network context, use a
> non-cryptographically secure function (usually jhash) and then try to
> twiddle with the key on a time basis (or in many cases just do nothing
> and hope that nobody notices). While this is an admirable attempt at
> solving the problem, it doesn't actually fix it. SipHash fixes it.
>
Key rotation is important anyway, without any key rotation even if the
key is compromised in siphash by some external means we would have an
insecure hash until the system reboots.

> (It fixes it in such a sound way that you could even build a stream
> cipher out of SipHash that would resist the modern cryptanalysis.)
>
> There are a modicum of places in the kernel that are vulnerable to
> hashtable poisoning attacks, either via userspace vectors or network
> vectors, and there's not a reliable mechanism inside the kernel at the
> moment to fix it. The first step toward fixing these issues is actually
> getting a secure primitive into the kernel for developers to use. Then
> we can, bit by bit, port things over to it as deemed appropriate.
>
> Secondly, a few places are using MD5 for creating secure sequence
> numbers, port numbers, or fast random numbers. Siphash is a faster, more
> fittting, and more secure replacement for MD5 in those situations.
>
> Dozens of languages are already using this internally for their hash
> tables. Some of the BSDs already use this in their kernels. SipHash is
> a widely known high-speed solution to a widely known problem, and it's
> time we catch-up.
>
Maybe so, but we need to do due diligence before considering adopting
siphash as the primary hashing in the network stack. Consider that we
may very well perform a hash over L4 tuples on _every_ packet. We've
done a good job at limiting this to be at most one hash per packet,
but nevertheless the performance of the hash function must be take
into account.

A few points:

1) My quick test shows siphash is about four times more expensive than
jhash. On my test system, computing a hash over IPv4 tuple (two 32 bit
addresses and 2 16 bit source ports) is 6.9 nsecs in Jenkins hash, 33
nsecs with siphash. Given that we have eliminated most of the packet
header hashes this might be tolerable, but still should be looking at
ways to optimize.
2) I like moving to use u64 (quad words) in the hash, this is an
improvement over Jenkins which is based on 32 bit words. If we put
this in the kernel we probably want to have several variants of
siphash for specific sizes (e.g. siphash1, siphash2, siphash2,
siphashn for hash over one, two, three, or n sixty four bit words).
3) I also tested siphash distribution and Avalanche Effect (these
really should have been covered in the paper :-( ). Both of these are
good with siphash, in fact almost have identical characteristics to
Jenkins hash.

Tom

> Signed-off-by: Jason A. Donenfeld 
> Cc: Jean-Philippe Aumasson 
> Cc: Daniel J. Bernstein 
> Cc: Linus Torvalds 
> Cc: Eric Biggers 
> Cc: David Laight 
> ---
> Changes from v2->v3:
>
>   - There is now a fast aligned version of the function and a not-as-fast
> unaligned version. The requirements for each have been documented in
> a docbook-style comment. As well, the header now contains a constant
> for the expected alignment.
>
>   - The test suite has been updated to check both the unaligned and aligned
> version of the function.
>
>  include/linux/siphash.h |  30 ++
>  lib/Kconfig.debug   |   6 +-
>  lib/Makefile|   5 +-
>  lib/siphash.c   | 153 
> 
>  lib/test_siphash.c  |  85