Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
Miscellaneous notes -- I'm doing the benchmarking we discussed and pondering whether it is right to switch from hc-128 to salsa20, separately. On Thu, Apr 17, 2014 at 09:33:28PM +0000, Taylor R Campbell wrote: > > > +void > > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Joerg Sonnenberger
On Fri, Apr 18, 2014 at 05:05:37PM -0400, Thor Lancelot Simon wrote: > On Fri, Apr 18, 2014 at 05:00:50PM -0400, Thor Lancelot Simon wrote: > > > > Unfortunately, the virtual machines on this laptop that I use for most > > NetBSD development don't expose the AES-NI instructions to guests, even > >

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Izumi Tsutsui
tls@ wrote: > > Note the caller of this hc128_init() is: > > > > I'm afraid "9KB stack on rekeying" is fatal on most ports. > > Well, the cipher should hardly ever get rekeyed. The rekeying > intervals could be considerably larger; I did not want to > increase them too much compared to the old

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Matt Thomas
On Apr 18, 2014, at 11:23 AM, Markku-Juhani Olavi Saarinen wrote: > It has been there on all new systems purchased in some last 3 years, > so I would *guess* that it would be > 50% of systems fielded out > there. Not everything is x86 based.

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 11:04:04PM +0200, Markku-Juhani Olavi Saarinen wrote: > Hi, > > Just one last thought: it will be on all *future* systems ? No, it won't, unless you have some funny definition of "system" that excludes anything that's not a high-end x86 implementation from one particular m

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 05:00:50PM -0400, Thor Lancelot Simon wrote: > > Unfortunately, the virtual machines on this laptop that I use for most > NetBSD development don't expose the AES-NI instructions to guests, even > when doing hardware assisted virtualization. Not RDRAND neither, for So, sin

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Markku-Juhani Olavi Saarinen
Hi, Just one last thought: it will be on all *future* systems ? Cheers, - markku Dr. Markku-Juhani O. Saarinen US +1 (424) 666 2713 On Fri, Apr 18, 2014 at 11:00 PM, Thor Lancelot Simon wrote: > On Fri, Apr 18, 2014 at 09:54:09PM +0100, Roland C. Dowdeswell wrote: >> On Fri, Apr 18, 2014 at

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 09:54:09PM +0100, Roland C. Dowdeswell wrote: > On Fri, Apr 18, 2014 at 08:23:11PM +0200, Markku-Juhani Olavi Saarinen wrote: > > > > > Agreed. AES is worse if you don't have AES-NI. > > > > It has been there on all new systems purchased in some last 3 years, > > so I woul

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Roland C. Dowdeswell
On Fri, Apr 18, 2014 at 08:23:11PM +0200, Markku-Juhani Olavi Saarinen wrote: > > Agreed. AES is worse if you don't have AES-NI. > > It has been there on all new systems purchased in some last 3 years, > so I would *guess* that it would be > 50% of systems fielded out > there. It hasn't been the

Towards design criteria for cprng_fast()

2014-04-18 Thread Thor Lancelot Simon
I would like to offer some observations about the use of cprng_fast() (once known as arc4random()) in our kernel and, from these, express what I believe are reasonable design criteria for that function. O1) cprng_fast() is used in some performance-critical parts of the kernel: A) It's use

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 08:33:20PM +0200, Thomas Klausner wrote: > On Fri, Apr 18, 2014 at 01:39:18PM -0400, Thor Lancelot Simon wrote: > > How do you count to 9K? I see: > > > > 2K for p > > 2K for q > > 1280 bytes for w > > Are you talking about this w? > + uint32_t w[1280],

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 06:11:39PM +0000, Taylor R Campbell wrote: > > The majority of systems certainly don't have AES-NI. Only some recent > Intel CPUs do, and we can't use it in the kernel anyway. Right: plenty of systems accelerate AES, but in the wide world of systems that are not all x86

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thomas Klausner
On Fri, Apr 18, 2014 at 01:39:18PM -0400, Thor Lancelot Simon wrote: > How do you count to 9K? I see: > > 2K for p > 2K for q > 1280 bytes for w Are you talking about this w? + uint32_t w[1280], *p = state->p, *q = state->q; This looks like 1280x4 bytes to me. Thomas
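For reference, the stack arithmetic in this exchange can be checked with a small sketch. The constant and function names are hypothetical. The sizes come from the figures quoted in the thread (2K each for p and q, and uint32_t w[1280]). Counting the temporary state and w together on the stack is an assumption based on the "9KB stack on rekeying" remark elsewhere in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; sizes taken from the figures quoted in the thread. */
enum {
	HC128_TABLE_WORDS = 512,	/* p and q: 2K each as 32-bit words */
	HC128_W_WORDS = 1280		/* uint32_t w[1280] in hc128_init() */
};

/* Bytes used by the temporary array w: 1280 words of 4 bytes each. */
static size_t
hc128_w_bytes(void)
{
	return HC128_W_WORDS * sizeof(uint32_t);
}

/* Bytes used by w plus a temporary state holding both p and q. */
static size_t
hc128_rekey_stack_bytes(void)
{
	size_t tables = 2 * HC128_TABLE_WORDS * sizeof(uint32_t);

	return tables + hc128_w_bytes();
}
```

Under these assumptions w alone takes 5120 bytes (5 KB), and w plus the two tables take 9216 bytes, which matches the 9 KB figure.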

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Markku-Juhani Olavi Saarinen
On Fri, Apr 18, 2014 at 8:11 PM, Taylor R Campbell wrote: >Date: Fri, 18 Apr 2014 19:58:06 +0200 >From: Markku-Juhani Olavi Saarinen > >If you want to get rid of RC4, use AES in CTR mode. It is standard, >compact, clean, and really fast solution. May sound boring, but gives >m

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Taylor R Campbell
Date: Fri, 18 Apr 2014 19:58:06 +0200 From: Markku-Juhani Olavi Saarinen If you want to get rid of RC4, use AES in CTR mode. It is standard, compact, clean, and really fast solution. May sound boring, but gives me a feel of solid security engineering. We use that for /dev/u?random

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Markku-Juhani Olavi Saarinen
Hi, If you want to get rid of RC4, use AES in CTR mode. It is standard, compact, clean, and really fast solution. May sound boring, but gives me a feel of solid security engineering. Note that majority of systems now have the AES-NI instructions which speed up AES implementations by an order of m

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Taylor R Campbell
Date: Fri, 18 Apr 2014 12:38:38 -0400 From: Thor Lancelot Simon 3) If the algorithm's use of state-dependent array indices presents a real weakness in practice, why aren't there any published results on this and why was it chosen as, and

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 10:27:45PM +0900, Izumi Tsutsui wrote: > > Note the caller of this hc128_init() is: > > > > +static void > > > +cprng_fast_randrekey(cprng_fast_ctx_t *ctx) > > > +{ > > > + uint8_t key[16], iv[16]; > > > + hc128_state_t tempstate; > > > + int s; > > > + > > > + int have_in

Re: Inconsistency with COMPAT_10

2014-04-18 Thread Christos Zoulas
In article <5350e2b5.6000...@m00nbsd.net>, Maxime Villard wrote: >Hi all, >I think there's an inconsistency with COMPAT_10 in the open() syscall: > >- kern/vfs_syscalls.c - l.1631 -- > >#ifdef COMPAT_10 /* XXX: and perhaps later */ > if (path == NULL) {

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 04:26:19PM +0000, Taylor R Campbell wrote: > > Closer inspection of HC-128 reveals that it uses secret-dependent > array indices[*], so for that reason alone I don't think we should > adopt it. It also has a very large state, which is going to hurt the > cache on big system

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 10:20:24PM +0900, Izumi Tsutsui wrote: > > LITTLE_ENDIAN != x86 > > This should simply be le32dec(9) otherwise > it will cause unaligned trap on arm and mips etc. I believe the input is -- though declared as uint8_t -- required to always be aligned (see the comment on the
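To illustrate the alignment concern, here is a minimal sketch of a byte-wise little-endian load, which is roughly the behaviour le32dec(9) is being suggested for. The name load_le32 is hypothetical and this is not the code from the patch. It reads one byte at a time, so it depends neither on the host byte order nor on the alignment of v.

```c
#include <stdint.h>

/*
 * Hypothetical helper: assemble a 32-bit value from four bytes stored
 * least-significant byte first.  No pointer cast is involved, so an
 * unaligned v cannot cause an alignment trap.
 */
static inline uint32_t
load_le32(const uint8_t *v)
{
	return (uint32_t)v[0] |
	    ((uint32_t)v[1] << 8) |
	    ((uint32_t)v[2] << 16) |
	    ((uint32_t)v[3] << 24);
}
```

A cast such as *(const uint32_t *)v is only safe when the caller guarantees alignment, which is the point under discussion here.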

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Taylor R Campbell
Closer inspection of HC-128 reveals that it uses secret-dependent array indices[*], so for that reason alone I don't think we should adopt it. It also has a very large state, which is going to hurt the cache on big systems and hurt the stack on little systems. So, could you please split your chan
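As a rough illustration of what "secret-dependent array indices" means, the sketch below contrasts a direct table lookup with one that touches every entry. Both functions and their names are hypothetical and are not taken from HC-128 or the patch. Whether the second form stays branch-free depends on the compiler, so treat it as a sketch of the idea only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Direct lookup: the memory address read depends on secret_idx, so
 * cache behaviour may reveal information about the secret.
 */
static uint32_t
lookup_direct(const uint32_t *table, size_t secret_idx)
{
	return table[secret_idx];
}

/*
 * Scanning lookup: every entry is read regardless of secret_idx and
 * the wanted one is selected with a mask, so the access pattern does
 * not depend on the secret.  The cost is n reads instead of one.
 */
static uint32_t
lookup_scan(const uint32_t *table, size_t n, size_t secret_idx)
{
	uint32_t result = 0;

	for (size_t i = 0; i < n; i++) {
		/* All ones when i == secret_idx, all zeros otherwise. */
		uint32_t mask = (uint32_t)0 - (uint32_t)(i == secret_idx);

		result |= table[i] & mask;
	}
	return result;
}
```

With tables as large as the ones discussed in this thread, scanning on every lookup would be expensive, which is one reason a cipher designed around such lookups is hard to make safe in this respect.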

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Thor Lancelot Simon
On Fri, Apr 18, 2014 at 10:27:45PM +0900, Izumi Tsutsui wrote: > > Note the caller of this hc128_init() is: > > I'm afraid "9KB stack on rekeying" is fatal on most ports. Well, the cipher should hardly ever get rekeyed. The rekeying intervals could be considerably larger; I did not want to incr

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Izumi Tsutsui
campbell+netbsd-tech-kern@ wrote: > > +void > > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv) > > +{ > > + unsigned int i; > > + uint32_t w[1280], *p = state->p, *q = state->q; > > 5 KB on the stack is a lot! Granted, this is a leaf routine which in > our case will

Re: Inconsistency with COMPAT_10

2014-04-18 Thread Greg Troxel
Maxime Villard writes: > COMPAT_10 should be added in netbsd32, or removed from the native > syscall. But I'm not sure which fix should be applied. Probably added in compat32. But I don't know how common programs are that rely on this bug. I wonder if pathbuf_create results in a need to free

Re: Patch: cprng_fast performance - please review.

2014-04-18 Thread Izumi Tsutsui
tls@ wrote: > @@ -160,6 +160,7 @@ include "crypto/cast128/files.cast128" > --- /dev/null 1 Jan 1970 00:00:00 -0000 > +++ crypto/hc128/hc128.c 17 Apr 2014 03:17:18 -0000 : > +static inline uint32_t > +pack_littleendian(const uint8_t *v) > +{ > +#ifdef LITTLE_ENDIAN > + return *((const uin

Inconsistency with COMPAT_10

2014-04-18 Thread Maxime Villard
Hi all, I think there's an inconsistency with COMPAT_10 in the open() syscall: - kern/vfs_syscalls.c - l.1631 -- #ifdef COMPAT_10/* XXX: and perhaps later */ if (path == NULL) { pb = pathbuf_create("."); if (pb == NUL