Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-22 Thread Stephan Mueller
Am Mittwoch, 22. Juni 2016, 08:54:16 schrieb Austin S. Hemmelgarn:

Hi Austin,

> You're not demonstrating it's inherently present in the architecture,

I am not demonstrating it, interesting statement. I take different architectures 
from different vendors, which were developed independently of each other, and I 
see the phenomenon to an equal degree in all of them.

So, this is no demonstration in your eyes? Interesting conclusion.

Yes, I do not have a statement that gate X or function Y in a CPU is the 
trigger point. Thus the absolute root cause of the unpredictable phenomenon is 
still unknown to me. I am hunting for it -- I have spoken with hardware folks 
from 3 different major chip vendors so far, and they all have difficulties 
explaining it. One vendor is currently helping me dissect the issue.
> 
> > I do not care about the form factor of the test system (server, desktop
> > or embedded), nor do I care about the number of attached devices -- the
> > form factor and number of attached devices is the differentiator of what
> > you call embedded vs server vs desktop.
> 
> I don't care about form factor, I care about the CPU, and embedded
> systems almost always have simpler CPU designs (not including all the
> peripherals they love to add in on-die).  Your own analysis indicates
> that you're getting entropy from the complex interaction of the different
> parts of the CPU.  Such complexities are less present in simpler CPU
> designs.

My RNG has two safety measures to detect noise source failures (again, they 
are documented): one during startup and one at runtime. In both cases, no data 
will be produced upon failure. But for chips where the self tests pass, we can 
surely harvest that unpredictable phenomenon.

And funnily enough: these health tests would scream loudly in dmesg if the 
noise source did not work. Note, in more recent kernels, the RNG is used in a lot 
of configurations and I have only received one complaint that the health test 
indicated a bad noise source. That was a very special system where a 
separation kernel did funky things.

So, as of now, it would rather make sense for you to refer me to some embedded 
devices that you think will be a challenge. I do not like to theorize.

For me, embedded devices are something like the Raspberry Pi or the MIPS 
system which were tested.

> >> Android barely counts as an embedded system anymore, as many Android
> > 
> > Then read F.28ff -- these are truly embedded systems (i.e. the routers
> > that I have on my desk)
> 
> 1. I'm actually rather curious what router you managed to get Android
> running on.
> 2. This is still an insanely small sample size compared to the rest of
> your results.

I think this will be the last answer from my side for now: I ask you to read 
my document, as I really do not want to restate 150+ pages here!

This machine is not an Android system but an AVM FritzBox router system that 
is very popular in Germany, as referenced in the document. It runs a highly 
modified Linux system -- as documented.

And instead of complaining about the test sample, you should help us by doing 
more testing, if you feel that the health tests in the RNG are insufficient.


Besides, if you are really worried about the lower bound I mentioned in 
Appendix F, use the oversampling rate tuning knob of 
jent_entropy_collector_alloc as documented. Finally, if that is not to your 
liking, generate a bit more data with the Jitter RNG than you think you need 
entropy-wise.

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-22 Thread Austin S. Hemmelgarn

On 2016-06-22 01:16, Stephan Mueller wrote:

Am Dienstag, 21. Juni 2016, 15:31:07 schrieb Austin S. Hemmelgarn:

Hi Austin,


Little data, interesting statement for results on 200+ systems including
all major CPU architectures, all showing information leading in the same
direction.

Let me try rephrasing this to make it a bit clearer:
1. You have lots of data on server systems.
2. You have a significant amount of data on desktop/workstation type
systems.
3. You have very little data on embedded systems.

and here are your arguments:
A. This works well on server systems.
B. This works well on desktop systems.
C. This works well on embedded systems.

Arguments A and B are substantiated directly by points 1 and 2.
Argument C is not substantiated thoroughly because of point 3.
My complaint is about argument C given point 3.


Then let me rephrase what I am trying to say: my RNG rests on the intrinsic
functionality of CPUs. When I show that such intrinsic behavior is present in
various architectures I show that there is a common ground for the basis of
the RNG.
You're not demonstrating it's inherently present in the architecture, 
you're demonstrating that it's present for those particular 
micro-architectures you have tested.  You're dealing with effects 
resulting from such low-level details of the CPU that you can't assume 
it happens for the whole architecture.  In fact, your own results 
regarding the weak values from the Pentium Celeron Mobile system reinforce 
that it's not an architectural effect at the ISA level, but results from 
low level designs.  Given the naming of that particular CPU, it's almost 
certainly a P6 or NetBurst micro-arch, neither of which had particularly 
complex branch-prediction or pipelining or similar things, and more 
significantly, probably did not have HyperThreading, which I'd be 
willing to bet is a significant contributing factor on the Xeon and Core 
processors you've got results for.


I tested on all CPUs of all large scale architectures (including the
architectures that are commonly used for embedded devices) and demonstrated
that the fundamental phenomenon the RNG rests on is present in all
architectures.
In all architectures you've tested.  Note that technically, from a 
low-level perspective of something like this, different revisions of an 
ISA are different architectures, and when you start talking about 
licensed IP cores like ARM and PPC (and MIPS, and SPARC), different 
manufacturers' designs almost count separately.  You're relying on 
complexities inherent in the CPU itself, which will be different between 
micro-architectures, and possibly even individual revisions of a 
specific model number.  Just because the test gives good results on an 
ARMv6 or ARMv7 does not mean it will on an ARMv4, because there are 
enough differences in typical designs of ARM CPUs that you can't 
directly draw conclusions based on such a small sample size (you've 
tested at most 4 manufacturers' designs, and even then only one from 
each, and there are about 10 different companies making ARM chips, each 
selling dozens of slightly different CPUs).


I do not care about the form factor of the test system (server, desktop or
embedded), nor do I care about the number of attached devices -- the
form factor and number of attached devices is the differentiator of what you
call embedded vs server vs desktop.
I don't care about form factor, I care about the CPU, and embedded 
systems almost always have simpler CPU designs (not including all the 
peripherals they love to add in on-die).  Your own analysis indicates 
that you're getting entropy from the complex interaction of the different 
parts of the CPU.  Such complexities are less present in simpler CPU 
designs.


Heck, I have written a test that executes the RNG on bare metal (without an OS
and with only a keyboard as device present -- i.e. no interrupts are received
apart from the keyboard), which demonstrates that the phenomenon is present.

Furthermore, chapter 6 of my document analyzes the root cause of the RNG and
here you see clearly that it has nothing to do with the size of the CPU or its
attached devices or the size of RAM.
And yet averages are higher for systems that have more CPU cores, and 
thus more complex CPUs.  Prime examples of this are the UltraSPARC 
CPUs you've tested.  Those have more SMP cores (and threads of 
execution) than just about anything else you've listed, and they have 
significantly higher values than almost anything else you list.


The massive number of x86 tests shall demonstrate the common theme I see: the
newer the CPU, the larger the phenomenon the RNG rests on.
Except each iteration of x86 grows more complex branch-prediction and 
pipe-lining and other tricks to try and make the CPU process data 
faster.  That is the source of almost all of the increase you're seeing in 
entropy for newer revisions.  The same is not inherently true of 
embedded processors, especially ones designed for hard-real-time usage 

Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Stephan Mueller
Am Dienstag, 21. Juni 2016, 15:31:07 schrieb Austin S. Hemmelgarn:

Hi Austin,

> > Little data, interesting statement for results on 200+ systems including
> > all major CPU arches all showing information leading in the same
> > directions.
> Let me try rephrasing this to make it a bit clearer:
> 1. You have lots of data on server systems.
> 2. You have a significant amount of data on desktop/workstation type
> systems.
> 3. You have very little data on embedded systems.
> 
> and here are your arguments:
> A. This works well on server systems.
> B. This works well on desktop systems.
> C. This works well on embedded systems.
> 
> Arguments A and B are substantiated directly by points 1 and 2.
> Argument C is not substantiated thoroughly because of point 3.
> My complaint is about argument C given point 3.

Then let me rephrase what I am trying to say: my RNG rests on the intrinsic 
functionality of CPUs. When I show that such intrinsic behavior is present in 
various architectures I show that there is a common ground for the basis of 
the RNG.

I tested on all CPUs of all large scale architectures (including the 
architectures that are commonly used for embedded devices) and demonstrated 
that the fundamental phenomenon the RNG rests on is present in all 
architectures.

I do not care about the form factor of the test system (server, desktop or 
embedded), nor do I care about the number of attached devices -- the 
form factor and number of attached devices is the differentiator of what you 
call embedded vs server vs desktop.

Heck, I have written a test that executes the RNG on bare metal (without an OS 
and with only a keyboard as device present -- i.e. no interrupts are received 
apart from the keyboard), which demonstrates that the phenomenon is present.

Furthermore, chapter 6 of my document analyzes the root cause of the RNG and 
here you see clearly that it has nothing to do with the size of the CPU or its 
attached devices or the size of RAM.

The massive number of x86 tests shall demonstrate the common theme I see: the 
newer the CPU, the larger the phenomenon the RNG rests on.

I use different OSes (including microkernel systems) for testing to 
demonstrate that the OS does not materially change the test results.
> 
> I'm not saying you have insufficient data to support argument A or B,
> only that you have insufficient data to support argument C.

And I think that this statement is not correct. But I would always welcome 
more testing.
> 
> Android barely counts as an embedded system anymore, as many Android

Then read F.28ff -- these are truly embedded systems (i.e. the routers that I 
have on my desk)

> phones can outperform most inexpensive desktop and laptop systems, and
> even some rather expensive laptops.  This leaves the only systems that
> can be assumed without further information to be representative of
> embedded boards to be the ones running Genode, and possibly the MIPS
> systems, which is a total of about 10 results out of hundreds for
> servers and desktops/workstations.


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Austin S. Hemmelgarn

On 2016-06-21 14:04, Stephan Mueller wrote:

Am Dienstag, 21. Juni 2016, 13:51:15 schrieb Austin S. Hemmelgarn:

6. You have a significant lack of data regarding embedded systems, which
is one of the two biggest segments of Linux's market share.  You list no
results for any pre-ARMv6 systems (Linux still runs on and is regularly
used on ARMv4 CPU's, and it's worth also pointing out that the values on
the ARMv6 systems are themselves below average), any MIPS systems other
than 24k and 4k (which is not a good representation of modern embedded
usage), any SPARC CPU's other than UltraSPARC (ideally you should have
results on at least a couple of LEON systems as well), no tight-embedded
PPC chips (PPC 440 processors are very widely used, as are the 7xx and
970 families, and Freescale's e series), and only one set of results for
a tight-embedded x86 CPU (the Via Nano, you should ideally also have
results on things like an Intel Quark).  Overall, your test system
selection is not entirely representative of actual Linux usage (yeah,
there's a lot of x86 servers out there running Linux, there's at least
as many embedded systems running it too though, even without including
Android).


Perfectly valid argument. But I programmed that RNG as a hobby -- I do not
have the funds to buy all devices there are.


I'm not complaining as much about the lack of data for such devices as I
am about you stating that it will work fine for such devices when you
have so little data to support those claims.  Many of the devices you


Little data, interesting statement for results on 200+ systems including all
major CPU architectures, all showing information leading in the same direction.


Let me try rephrasing this to make it a bit clearer:
1. You have lots of data on server systems.
2. You have a significant amount of data on desktop/workstation type 
systems.
3. You have very little data on embedded systems.

and here are your arguments:
A. This works well on server systems.
B. This works well on desktop systems.
C. This works well on embedded systems.

Arguments A and B are substantiated directly by points 1 and 2. 
Argument C is not substantiated thoroughly because of point 3.

My complaint is about argument C given point 3.

I'm not saying you have insufficient data to support argument A or B, 
only that you have insufficient data to support argument C.


Android barely counts as an embedded system anymore, as many Android 
phones can outperform most inexpensive desktop and laptop systems, and 
even some rather expensive laptops.  This leaves the only systems that 
can be assumed without further information to be representative of 
embedded boards to be the ones running Genode, and possibly the MIPS 
systems, which is a total of about 10 results out of hundreds for 
servers and desktops/workstations.



Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Stephan Mueller
Am Dienstag, 21. Juni 2016, 13:51:15 schrieb Austin S. Hemmelgarn:

Hi Austin,

> 
> >> 2. Quite a few systems have a rather distressingly low lower bound and
> >> still get accepted by your algorithm (a number of the S/390 systems, and
> >> a handful of the AMD processors in particular).
> > 
> > I am aware of that, but please read the entire documentation to see where
> > the lower and upper boundaries come from and how the Jitter RNG really
> > operates. There you will see that the lower boundary is just that: it
> > will not be lower, but the common case is the upper boundary.
> 
> Talking about the common case is all well and good, but the lower bound
> still needs to be taken into account.  If the test results aren't
> uniformly distributed within that interval, or even following a typical
> Gaussian distribution within it (which is what I and many other people
> would probably assume without the data later in the appendix), then you
> really need to mention this _before_ the table itself.  Such information
> is very important, and not everyone has time to read everything.

Then read chapter 5 as mentioned above the table. What you read was an 
appendix that is supposed to supplement that chapter, not serve as 
full-fledged documentation.

And quickly glancing over that table to make a judgment is not helpful for 
such a topic.

> 
> >> 5. You discount the Pentium Celeron Mobile CPU as old and therefore not
> >> worth worrying about.  Linux still runs on 80486 and other 'ancient'
> >> systems, and there are people using it on such systems.  You need to
> >> account for this usage.
> > 
> > I do not account for that in the documentation. In real life though, I
> > certainly do -- see how the Jitter RNG is used in the kernel.
> 
> Then you shouldn't be pushing the documentation as what appears to be
> your sole argument for including it in the kernel.

I understand you read Appendix F -- but that document is 150+ pages in size 
with quite a bit of information, and you judge it based on an appendix 
providing supplemental data?

Note, I pointed you to this appendix in response to your initial question, 
where you wanted to see test results on execution timing variations on ARM 
and MIPS.
> 
> >> 6. You have a significant lack of data regarding embedded systems, which
> >> is one of the two biggest segments of Linux's market share.  You list no
> >> results for any pre-ARMv6 systems (Linux still runs on and is regularly
> >> used on ARMv4 CPU's, and it's worth also pointing out that the values on
> >> the ARMv6 systems are themselves below average), any MIPS systems other
> >> than 24k and 4k (which is not a good representation of modern embedded
> >> usage), any SPARC CPU's other than UltraSPARC (ideally you should have
> >> results on at least a couple of LEON systems as well), no tight-embedded
> >> PPC chips (PPC 440 processors are very widely used, as are the 7xx and
> >> 970 families, and Freescale's e series), and only one set of results for
> >> a tight-embedded x86 CPU (the Via Nano, you should ideally also have
> >> results on things like an Intel Quark).  Overall, your test system
> >> selection is not entirely representative of actual Linux usage (yeah,
> >> there's a lot of x86 servers out there running Linux, there's at least
> >> as many embedded systems running it too though, even without including
> >> Android).
> > 
> > Perfectly valid argument. But I programmed that RNG as a hobby -- I do not
> > have the funds to buy all devices there are.
> 
> I'm not complaining as much about the lack of data for such devices as I
> am about you stating that it will work fine for such devices when you
> have so little data to support those claims.  Many of the devices you

Little data, interesting statement for results on 200+ systems including all 
major CPU architectures, all showing information leading in the same direction.

> have listed that can be reasonably assumed to be embedded systems are
> relatively modern ones that most people would think of (smart-phones and
> similar).  Such systems have almost as much if not more interrupts as
> many desktop and server systems, so the entropy values there actually do

Please re-read the documentation: the testing and the test analysis remove 
outliers from interrupts and such to get worst-case testing.

> make some sense.  Not everything has this luxury.  Think for example of
> a router.  All it will generally have interrupts from is the timer
> interrupt (which should functionally have near zero entropy because it's
> monotonic most of the time) and the networking hardware, and quite
> often, many of the good routers operate their NIC's in polling mode,
> which means very few interrupts (which indirectly is part of the issue
> with some server systems too), and therefore will have little to no
> entropy there either.  This is an issue with the current system too, but
> you have almost zero data on such systems yourself, so you can't
> argue that it makes things better 

Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Stephan Mueller
Am Dienstag, 21. Juni 2016, 13:54:13 schrieb Austin S. Hemmelgarn:

Hi Austin,

> On 2016-06-21 13:23, Stephan Mueller wrote:
> > Am Dienstag, 21. Juni 2016, 13:18:33 schrieb Austin S. Hemmelgarn:
> > 
> > Hi Austin,
> > 
> >>> You have to trust the host for anything, not just for the entropy in
> >>> timings. This is completely invalid argument unless you can present a
> >>> method that one guest can manipulate timings in other guest in such a
> >>> way that _removes_ the inherent entropy from the host.
> >> 
> >> When dealing with almost any type 2 hypervisor, it is fully possible for
> >> a user other than the one running the hypervisor to manipulate
> >> scheduling such that entropy is reduced.  This does not imply that the
> > 
> > Please re-read the document: Jitter RNG does not rest on scheduling.
> 
> If you are running inside a VM, your interrupt timings depend on the

The RNG does not rest on interrupts either.

> hypervisor's scheduling, period.  You may not directly rely on
> scheduling from the OS you are running on, but if you are doing anything
> timing related in a VM, you are at the mercy of the scheduling used by
> the hypervisor and whatever host OS that may be running on.
> 
> In the attack I'm describing, the malicious user is not manipulating the
> guest OS's scheduling, they are manipulating the host system's scheduling.


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Austin S. Hemmelgarn

On 2016-06-21 13:23, Stephan Mueller wrote:

Am Dienstag, 21. Juni 2016, 13:18:33 schrieb Austin S. Hemmelgarn:

Hi Austin,


You have to trust the host for anything, not just for the entropy in
timings. This is completely invalid argument unless you can present a
method that one guest can manipulate timings in other guest in such a
way that _removes_ the inherent entropy from the host.


When dealing with almost any type 2 hypervisor, it is fully possible for
a user other than the one running the hypervisor to manipulate
scheduling such that entropy is reduced.  This does not imply that the


Please re-read the document: Jitter RNG does not rest on scheduling.
If you are running inside a VM, your interrupt timings depend on the 
hypervisor's scheduling, period.  You may not directly rely on 
scheduling from the OS you are running on, but if you are doing anything 
timing related in a VM, you are at the mercy of the scheduling used by 
the hypervisor and whatever host OS that may be running on.


In the attack I'm describing, the malicious user is not manipulating the 
guest OS's scheduling, they are manipulating the host system's scheduling.




Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Austin S. Hemmelgarn

On 2016-06-21 09:20, Stephan Mueller wrote:

Am Dienstag, 21. Juni 2016, 09:05:55 schrieb Austin S. Hemmelgarn:

Hi Austin,


On 2016-06-20 14:32, Stephan Mueller wrote:

Am Montag, 20. Juni 2016, 13:07:32 schrieb Austin S. Hemmelgarn:

Hi Austin,


On 2016-06-18 12:31, Stephan Mueller wrote:

Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:

Hi Theodore,


At the end of the day, with these devices you really badly need a
hardware RNG.  We can't generate randomness out of thin air.  The only
thing you really can do requires user space help, which is to generate
keys lazily, or as late as possible, so you can gather as much entropy
as you can --- and to feed in measurements from the WiFi (RSSI
measurements, MAC addresses seen, etc.)  This won't help much if you
have an FBI van parked outside your house trying to carry out a
TEMPEST attack, but hopefully it provides some protection against a
remote attacker who isn't trying to carry out an on-premises attack.


All my measurements on such small systems like MIPS or smaller/older
ARMs
do not seem to support that statement :-)


Was this on real hardware, or in a virtual machine/emulator?  Because if
it's not on real hardware, you're harvesting entropy from the host
system, not the emulated one.  While I haven't done this with MIPS or
ARM systems, I've taken similar measurements on SPARC64, x86_64, and
PPC64 systems comparing real hardware and emulated hardware, and the
emulated hardware _always_ has higher entropy, even when running the
emulator on an identical CPU to the one being emulated and using KVM
acceleration and passing through all the devices possible.

Even if you were testing on real hardware, I'm still rather dubious, as
every single test I've ever done on any hardware (SPARC, PPC, x86, ARM,
and even PA-RISC) indicates that you can't harvest entropy as
effectively from a smaller CPU compared to a large one, and this effect
is significantly more pronounced on RISC systems.


It was on real hardware. As part of my Jitter RNG project, I tested all
major CPUs from small to big -- see Appendix F [1]. For MIPS/ARM, see the
trailing part of the big table.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf


Specific things I notice about this:
1. QEMU systems are reporting higher values than almost anything else
with the same ISA.  This makes sense, but you don't appear to have
accounted for the fact that you can't trust almost any of the entropy in
a VM unless you have absolute trust in the host system, because the host
system can do whatever the hell it wants to you, including manipulating
timings directly (with a little patience and some time spent working on
it, you could probably get those numbers to show whatever you want just
by manipulating scheduling parameters on the host OS for the VM software).


I am not sure where you see QEMU systems listed there.
That would be the ones which list 'QEMU Virtual CPU version X.Y' as the 
CPU string.  The only things that return that in the CPUID data are 
either QEMU itself, or software that is based on QEMU.



2. Quite a few systems have a rather distressingly low lower bound and
still get accepted by your algorithm (a number of the S/390 systems, and
a handful of the AMD processors in particular).


I am aware of that, but please read the entire documentation to see where the
lower and upper boundaries come from and how the Jitter RNG really operates.
There you will see that the lower boundary is just that: it will not be lower,
but the common case is the upper boundary.
Talking about the common case is all well and good, but the lower bound 
still needs to be taken into account.  If the test results aren't 
uniformly distributed within that interval, or even following a typical 
Gaussian distribution within it (which is what I and many other people 
would probably assume without the data later in the appendix), then you 
really need to mention this _before_ the table itself.  Such information 
is very important, and not everyone has time to read everything.


Furthermore, the use case of the Jitter RNG is to support the DRBG seeding
with a very high reseed interval.


3. Your statement at the bottom of the table that 'all test systems at
least un-optimized have a lower bound of 1 bit' is refuted by your own
data; I count at least 2 data points where this is not the case.  One of 
them is mentioned at the bottom as an outlier, and you have data to back
this up listed in the table, but the other (MIPS 4Kec v4.8) is the only
system of that specific type that you tested, and thus can't be claimed
as an outlier.


You are right, I have added more and more test results to the table without
updating the statement below. I will fix that.

But note that there is a list below that statement providing explanations 
already. So, it is just that one statement that needs updating.


4. You state the S/390 systems gave different results when run
un-optimized, but don't provide any data regarding this.


The 

Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Stephan Mueller
Am Dienstag, 21. Juni 2016, 13:18:33 schrieb Austin S. Hemmelgarn:

Hi Austin,

> > You have to trust the host for anything, not just for the entropy in
> > timings. This is completely invalid argument unless you can present a
> > method that one guest can manipulate timings in other guest in such a
> > way that _removes_ the inherent entropy from the host.
> 
> When dealing with almost any type 2 hypervisor, it is fully possible for
> a user other than the one running the hypervisor to manipulate
> scheduling such that entropy is reduced.  This does not imply that the

Please re-read the document: Jitter RNG does not rest on scheduling.


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Austin S. Hemmelgarn

On 2016-06-21 09:42, Pavel Machek wrote:

Hi!


6. You have a significant lack of data regarding embedded systems, which is
one of the two biggest segments of Linux's market share.  You list no
results for any pre-ARMv6 systems (Linux still runs on and is regularly used
on ARMv4 CPU's, and it's worth also pointing out that the values on
the


Feel free to contribute more test results.

I mean... you can't expect every person who wants to improve something
in Linux to test on everything in the world... can you?

I was commenting less on the lack of results for such systems than on 
attempts to make statements about such systems based on a data-set that 
lacks information about such systems.




Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Tomas Mraz
On Út, 2016-06-21 at 09:05 -0400, Austin S. Hemmelgarn wrote:
> On 2016-06-20 14:32, Stephan Mueller wrote:
> > 
> > [1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf
> Specific things I notice about this:
> 1. QEMU systems are reporting higher values than almost anything
> else 
> with the same ISA.  This makes sense, but you don't appear to have 
> accounted for the fact that you can't trust almost any of the entropy
> in 
> a VM unless you have absolute trust in the host system, because the
> host 
> system can do whatever the hell it wants to you, including
> manipulating 
> timings directly (with a little patience and some time spent working
> on 
> it, you could probably get those numbers to show whatever you want
> just 
> by manipulating scheduling parameters on the host OS for the VM
> software).

You have to trust the host for anything, not just for the entropy in
timings. This is completely invalid argument unless you can present a
method that one guest can manipulate timings in other guest in such a
way that _removes_ the inherent entropy from the host.

-- 
Tomas Mraz
No matter how far down the wrong road you've gone, turn back.
  Turkish proverb
(You'll never know whether the road is wrong though.)





Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Stephan Mueller
Am Dienstag, 21. Juni 2016, 09:05:55 schrieb Austin S. Hemmelgarn:

Hi Austin,

> On 2016-06-20 14:32, Stephan Mueller wrote:
> > Am Montag, 20. Juni 2016, 13:07:32 schrieb Austin S. Hemmelgarn:
> > 
> > Hi Austin,
> > 
> >> On 2016-06-18 12:31, Stephan Mueller wrote:
> >>> Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:
> >>> 
> >>> Hi Theodore,
> >>> 
>  At the end of the day, with these devices you really badly need a
>  hardware RNG.  We can't generate randomness out of thin air.  The only
>  thing you really can do requires user space help, which is to generate
>  keys lazily, or as late as possible, so you can gather as much entropy
>  as you can --- and to feed in measurements from the WiFi (RSSI
>  measurements, MAC addresses seen, etc.)  This won't help much if you
>  have an FBI van parked outside your house trying to carry out a
>  TEMPEST attack, but hopefully it provides some protection against a
>  remote attacker who isn't trying to carry out an on-premises attack.
> >>> 
> >>> All my measurements on such small systems like MIPS or smaller/older
> >>> ARMs
> >>> do not seem to support that statement :-)
> >> 
> >> Was this on real hardware, or in a virtual machine/emulator?  Because if
> >> it's not on real hardware, you're harvesting entropy from the host
> >> system, not the emulated one.  While I haven't done this with MIPS or
> >> ARM systems, I've taken similar measurements on SPARC64, x86_64, and
> >> PPC64 systems comparing real hardware and emulated hardware, and the
> >> emulated hardware _always_ has higher entropy, even when running the
> >> emulator on an identical CPU to the one being emulated and using KVM
> >> acceleration and passing through all the devices possible.
> >> 
> >> Even if you were testing on real hardware, I'm still rather dubious, as
> >> every single test I've ever done on any hardware (SPARC, PPC, x86, ARM,
> >> and even PA-RISC) indicates that you can't harvest entropy as
> >> effectively from a smaller CPU compared to a large one, and this effect
> >> is significantly more pronounced on RISC systems.
> > 
> > It was on real hardware. As part of my Jitter RNG project, I tested all
> > major CPUs from small to big -- see Appendix F [1]. For MIPS/ARM, see the
> > trailing part of the big table.
> > 
> > [1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf
> 
> Specific things I notice about this:
> 1. QEMU systems are reporting higher values than almost anything else
> with the same ISA.  This makes sense, but you don't appear to have
> accounted for the fact that you can't trust almost any of the entropy in
> a VM unless you have absolute trust in the host system, because the host
> system can do whatever the hell it wants to you, including manipulating
> timings directly (with a little patience and some time spent working on
> it, you could probably get those numbers to show whatever you want just
> by manipulating scheduling parameters on the host OS for the VM software).

I am not sure where you see QEMU systems listed there.

> 2. Quite a few systems have a rather distressingly low lower bound and
> still get accepted by your algorithm (a number of the S/390 systems, and
> a handful of the AMD processors in particular).

I am aware of that, but please read the entire documentation, which explains
where the lower and upper boundaries come from and how the Jitter RNG really
operates. There you will see that the lower boundary is just that: measured
entropy will not drop below it, while the common case is the upper boundary.

Furthermore, the use case of the Jitter RNG is to support the DRBG seeding 
with a very high reseed interval.
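The kind of boundary estimate discussed above can be illustrated with a toy measurement (my sketch, not the Jitter RNG code; the workload and sample counts are arbitrary assumptions):

```python
# Toy sketch (not the Jitter RNG implementation): time a fixed workload
# repeatedly and estimate the min-entropy of the execution-time jitter.
import math
import time
from collections import Counter

def timing_deltas(samples=2000):
    """Collect execution times of a small, fixed workload in nanoseconds."""
    deltas = []
    for _ in range(samples):
        start = time.perf_counter_ns()
        x = 0
        for i in range(100):   # fixed work; its duration still jitters
            x += i * i
        deltas.append(time.perf_counter_ns() - start)
    return deltas

def min_entropy(values):
    """Conservative estimate: H_min = -log2(p_max) of the distribution."""
    counts = Counter(values)
    p_max = max(counts.values()) / len(values)
    return -math.log2(p_max)
```

A worst-case estimate of this kind corresponds to the paper's lower boundary; the real Jitter RNG works on a high-resolution timer and applies health tests on top.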

> 3. Your statement at the bottom of the table that 'all test systems at
> least un-optimized have a lower bound of 1 bit' is refuted by your own
> data, I count at least 2 data points where this is not the case.  One of
> them is mentioned at the bottom as an outlier, and you have data to back
> this up listed in the table, but the other (MIPS 4Kec v4.8) is the only
> system of that specific type that you tested, and thus can't be claimed
> as an outlier.

You are right, I have added more and more test results to the table without 
updating the statement below. I will fix that.

But note that there is already a list below that statement providing
explanations. So it is just that one statement that needs updating.

> 4. You state the S/390 systems gave different results when run
> un-optimized, but don't provide any data regarding this.

The pointer to appendix F.46 was supposed to cover that issue.

> 5. You discount the Pentium Celeron Mobile CPU as old and therefore not
> worth worrying about.  Linux still runs on 80486 and other 'ancient'
> systems, and there are people using it on such systems.  You need to
> account for this usage.

I do not account for that in the documentation. In real life though, I 
certainly do -- see how the Jitter RNG is used in the kernel.

> 6. You have a significant lack of data 

Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread Austin S. Hemmelgarn

On 2016-06-20 14:32, Stephan Mueller wrote:

Am Montag, 20. Juni 2016, 13:07:32 schrieb Austin S. Hemmelgarn:

Hi Austin,


On 2016-06-18 12:31, Stephan Mueller wrote:

Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:

Hi Theodore,


At the end of the day, with these devices you really badly need a
hardware RNG.  We can't generate randomness out of thin air.  The only
thing you really can do requires user space help, which is to generate
keys lazily, or as late as possible, so you can gather as much entropy
as you can --- and to feed in measurements from the WiFi (RSSI
measurements, MAC addresses seen, etc.)  This won't help much if you
have an FBI van parked outside your house trying to carry out a
TEMPEST attack, but hopefully it provides some protection against a
remote attacker who isn't trying to carry out an on-premises attack.


All my measurements on such small systems like MIPS or smaller/older ARMs
do not seem to support that statement :-)


Was this on real hardware, or in a virtual machine/emulator?  Because if
it's not on real hardware, you're harvesting entropy from the host
system, not the emulated one.  While I haven't done this with MIPS or
ARM systems, I've taken similar measurements on SPARC64, x86_64, and
PPC64 systems comparing real hardware and emulated hardware, and the
emulated hardware _always_ has higher entropy, even when running the
emulator on an identical CPU to the one being emulated and using KVM
acceleration and passing through all the devices possible.

Even if you were testing on real hardware, I'm still rather dubious, as
every single test I've ever done on any hardware (SPARC, PPC, x86, ARM,
and even PA-RISC) indicates that you can't harvest entropy as
effectively from a smaller CPU compared to a large one, and this effect
is significantly more pronounced on RISC systems.


It was on real hardware. As part of my Jitter RNG project, I tested all major
CPUs from small to big -- see Appendix F [1]. For MIPS/ARM, see the trailing
part of the big table.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf


Specific things I notice about this:
1. QEMU systems are reporting higher values than almost anything else 
with the same ISA.  This makes sense, but you don't appear to have 
accounted for the fact that you can't trust almost any of the entropy in 
a VM unless you have absolute trust in the host system, because the host 
system can do whatever the hell it wants to you, including manipulating 
timings directly (with a little patience and some time spent working on 
it, you could probably get those numbers to show whatever you want just 
by manipulating scheduling parameters on the host OS for the VM software).
2. Quite a few systems have a rather distressingly low lower bound and 
still get accepted by your algorithm (a number of the S/390 systems, and 
a handful of the AMD processors in particular).
3. Your statement at the bottom of the table that 'all test systems at 
least un-optimized have a lower bound of 1 bit' is refuted by your own 
data, I count at least 2 data points where this is not the case.  One of 
them is mentioned at the bottom as an outlier, and you have data to back 
this up listed in the table, but the other (MIPS 4Kec v4.8) is the only 
system of that specific type that you tested, and thus can't be claimed 
as an outlier.
4. You state the S/390 systems gave different results when run 
un-optimized, but don't provide any data regarding this.
5. You discount the Pentium Celeron Mobile CPU as old and therefore not 
worth worrying about.  Linux still runs on 80486 and other 'ancient' 
systems, and there are people using it on such systems.  You need to 
account for this usage.
6. You have a significant lack of data regarding embedded systems, which 
is one of the two biggest segments of Linux's market share.  You list no 
results for any pre-ARMv6 systems (Linux still runs on and is regularly 
used on ARMv4 CPUs, and it's also worth pointing out that the values on 
the ARMv6 systems are themselves below average), any MIPS systems other 
than 24k and 4k (which is not a good representation of modern embedded 
usage), any SPARC CPUs other than UltraSPARC (ideally you should have 
results on at least a couple of LEON systems as well), no tight-embedded 
PPC chips (PPC 440 processors are very widely used, as are the 7xx and 
970 families, and Freescale's e series), and only one set of results for 
a tight-embedded x86 CPU (the Via Nano, you should ideally also have 
results on things like an Intel Quark).  Overall, your test system 
selection is not entirely representative of actual Linux usage (yeah, 
there's a lot of x86 servers out there running Linux, there's at least 
as many embedded systems running it too though, even without including 
Android).
7. The RISC CPUs that you actually tested have more consistency within 
a particular type than the CISC CPUs.  Many of them do have higher 
values than the CISC CPUs, but a majority of the 

Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-21 Thread David Jaša
Hi,

On So, 2016-06-18 at 10:44 -0400, Theodore Ts'o wrote:
> On Fri, Jun 17, 2016 at 03:56:13PM +0200, David Jaša wrote:
> > I was thinking along the lines that "almost every important package
> > supports FreeBSD as well where they have to handle the condition so
> > option to switch to Rather Break Than Generate Weak Keys would be nice"
> > - but I didn't expect that systemd could be a roadblock here. :-/
> 
> It wasn't just systemd; it also broke OpenWRT and Ubuntu Quantal
> systems from booting.
> 
> > I was also thinking of little devices where OpenWRT or proprietary
> > Linux-based systems run that ended up with predictable keys way too
> ofter (or as in OpenWRT's case, with cumbersome tutorials on how to
> > generate keys elsewhere).
> 
> OpenWRT and other embedded devices (a) generally use a single master
> oscillator to drive everything, and (b) often use RISC architectures
> such as MIPS.
> 
> Which means that arguments of the form ``the Intel L1 / L2 cache
> architecture is so complicated that no human could possibly
> figure out how they would affect timing calculations, and besides, my
> generator passes FIPS 140-2 tests (never mind AES(NSA_KEY, CNTR++)

this

> also passes the FIPS 140-2 statistical tests)'' --- which I normally
> have trouble believing --- are even harder for me to believe.
> 
> At the end of the day, with these devices you really badly need a
> hardware RNG.  

and this.

It seems much easier to me to embed AES(NSA_KEY, CNTR++) logic directly
into a HW RNG than to tweak every microarchitecture to make
jitter/maxwell/havege return known numbers that are going to be mixed
with other entropy anyway (won't they?). So if I put the bits together
correctly, a HW RNG helps in getting more random numbers, but by itself
it is insufficient to ensure that the random numbers are truly random...
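The mixing property assumed in the paragraph above can be sketched as follows (illustration only; SHA-256 stands in for the kernel's mixing function, and the variable names are my own):

```python
# Illustration of the mixing David describes: the pool output is a hash
# over all inputs, so a known/backdoored source adds no entropy but also
# cannot cancel the entropy contributed by the other sources.
import hashlib
import os

def mix(*sources: bytes) -> bytes:
    """Combine entropy inputs via SHA-256 with length-prefixed framing."""
    h = hashlib.sha256()
    for s in sources:
        h.update(len(s).to_bytes(8, "big"))  # unambiguous encoding
        h.update(s)
    return h.digest()

hw_rng = bytes(32)          # worst case: output fully known to an attacker
jitter = os.urandom(32)     # stand-in for an independent noise source
out = mix(hw_rng, jitter)   # unpredictable as long as `jitter` is
```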

Cheers,

David Jaša

> We can't generate randomness out of thin air.  The only
> thing you really can do requires user space help, which is to generate
> keys lazily, or as late as possible, so you can gather as much entropy
> as you can --- and to feed in measurements from the WiFi (RSSI
> measurements, MAC addresses seen, etc.)  This won't help much if you
> have an FBI van parked outside your house trying to carry out a
> TEMPEST attack, but hopefully it provides some protection against a
> remote attacker who isn't trying to carry out an on-premises attack.
> 
> Cheers,
> 
>   - Ted




Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-20 Thread Stephan Mueller
Am Montag, 20. Juni 2016, 13:07:32 schrieb Austin S. Hemmelgarn:

Hi Austin,

> On 2016-06-18 12:31, Stephan Mueller wrote:
> > Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:
> > 
> > Hi Theodore,
> > 
> >> At the end of the day, with these devices you really badly need a
> >> hardware RNG.  We can't generate randomness out of thin air.  The only
> >> thing you really can do requires user space help, which is to generate
> >> keys lazily, or as late as possible, so you can gather as much entropy
> >> as you can --- and to feed in measurements from the WiFi (RSSI
> >> measurements, MAC addresses seen, etc.)  This won't help much if you
> >> have an FBI van parked outside your house trying to carry out a
> >> TEMPEST attack, but hopefully it provides some protection against a
> >> remote attacker who isn't trying to carry out an on-premises attack.
> > 
> > All my measurements on such small systems like MIPS or smaller/older ARMs
> > do not seem to support that statement :-)
> 
> Was this on real hardware, or in a virtual machine/emulator?  Because if
> it's not on real hardware, you're harvesting entropy from the host
> system, not the emulated one.  While I haven't done this with MIPS or
> ARM systems, I've taken similar measurements on SPARC64, x86_64, and
> PPC64 systems comparing real hardware and emulated hardware, and the
> emulated hardware _always_ has higher entropy, even when running the
> emulator on an identical CPU to the one being emulated and using KVM
> acceleration and passing through all the devices possible.
> 
> Even if you were testing on real hardware, I'm still rather dubious, as
> every single test I've ever done on any hardware (SPARC, PPC, x86, ARM,
> and even PA-RISC) indicates that you can't harvest entropy as
> effectively from a smaller CPU compared to a large one, and this effect
> is significantly more pronounced on RISC systems.

It was on real hardware. As part of my Jitter RNG project, I tested all major 
CPUs from small to big -- see Appendix F [1]. For MIPS/ARM, see the trailing 
part of the big table.

[1] http://www.chronox.de/jent/doc/CPU-Jitter-NPTRNG.pdf

Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-20 Thread Austin S. Hemmelgarn

On 2016-06-18 12:31, Stephan Mueller wrote:

Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:

Hi Theodore,



At the end of the day, with these devices you really badly need a
hardware RNG.  We can't generate randomness out of thin air.  The only
thing you really can do requires user space help, which is to generate
keys lazily, or as late as possible, so you can gather as much entropy
as you can --- and to feed in measurements from the WiFi (RSSI
measurements, MAC addresses seen, etc.)  This won't help much if you
have an FBI van parked outside your house trying to carry out a
TEMPEST attack, but hopefully it provides some protection against a
remote attacker who isn't trying to carry out an on-premises attack.


All my measurements on such small systems like MIPS or smaller/older ARMs do
not seem to support that statement :-)

Was this on real hardware, or in a virtual machine/emulator?  Because if 
it's not on real hardware, you're harvesting entropy from the host 
system, not the emulated one.  While I haven't done this with MIPS or 
ARM systems, I've taken similar measurements on SPARC64, x86_64, and 
PPC64 systems comparing real hardware and emulated hardware, and the 
emulated hardware _always_ has higher entropy, even when running the 
emulator on an identical CPU to the one being emulated and using KVM 
acceleration and passing through all the devices possible.


Even if you were testing on real hardware, I'm still rather dubious, as 
every single test I've ever done on any hardware (SPARC, PPC, x86, ARM, 
and even PA-RISC) indicates that you can't harvest entropy as 
effectively from a smaller CPU compared to a large one, and this effect 
is significantly more pronounced on RISC systems.



Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-18 Thread Stephan Mueller
Am Samstag, 18. Juni 2016, 10:44:08 schrieb Theodore Ts'o:

Hi Theodore,

> 
> At the end of the day, with these devices you really badly need a
> hardware RNG.  We can't generate randomness out of thin air.  The only
> thing you really can do requires user space help, which is to generate
> keys lazily, or as late as possible, so you can gather as much entropy
> as you can --- and to feed in measurements from the WiFi (RSSI
> measurements, MAC addresses seen, etc.)  This won't help much if you
> have an FBI van parked outside your house trying to carry out a
> TEMPEST attack, but hopefully it provides some protection against a
> remote attacker who isn't trying to carry out an on-premises attack.

All my measurements on such small systems like MIPS or smaller/older ARMs do 
not seem to support that statement :-)

Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-18 Thread Theodore Ts'o
On Fri, Jun 17, 2016 at 03:56:13PM +0200, David Jaša wrote:
> I was thinking along the lines that "almost every important package
> supports FreeBSD as well where they have to handle the condition so
> option to switch to Rather Break Than Generate Weak Keys would be nice"
> - but I didn't expect that systemd could be a roadblock here. :-/

It wasn't just systemd; it also broke OpenWRT and Ubuntu Quantal
systems from booting.

> I was also thinking of little devices where OpenWRT or proprietary
> Linux-based systems run that ended up with predictable keys way too
> often (or as in OpenWRT's case, with cumbersome tutorials on how to
> generate keys elsewhere).

OpenWRT and other embedded devices (a) generally use a single master
oscillator to drive everything, and (b) often use RISC architectures
such as MIPS.

Which means that arguments of the form ``the Intel L1 / L2 cache
architecture is so complicated that no human could possibly
figure out how they would affect timing calculations, and besides, my
generator passes FIPS 140-2 tests (never mind AES(NSA_KEY, CNTR++)
also passes the FIPS 140-2 statistical tests)'' --- which I normally
have trouble believing --- are even harder for me to believe.

At the end of the day, with these devices you really badly need a
hardware RNG.  We can't generate randomness out of thin air.  The only
thing you really can do requires user space help, which is to generate
keys lazily, or as late as possible, so you can gather as much entropy
as you can --- and to feed in measurements from the WiFi (RSSI
measurements, MAC addresses seen, etc.)  This won't help much if you
have an FBI van parked outside your house trying to carry out a
TEMPEST attack, but hopefully it provides some protection against a
remote attacker who isn't trying to carry out an on-premises attack.

Cheers,

- Ted


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-18 Thread Stephan Mueller
Am Freitag, 17. Juni 2016, 11:26:23 schrieb Sandy Harris:

Hi Sandy,

> David Jaša  wrote:
> > BTW when looking at an old BSI's issue with Linux urandom that Jarod
> > Wilson tried to solve with this series:
> > https://www.spinics.net/lists/linux-crypto/msg06113.html
> > I was thinking:
> > 1) wouldn't it help large urandom consumers if the kernel created a DRBG
> > instance for each of them? It would likely enhance performance and solve
> > BSI's concern about predicting what numbers other urandom consumers
> > obtain, at the cost of memory footprint
> > and then, after reading the paper associated with this series:
> > 2) did you evaluate the use of an intermediate DRBG, fed by the primary
> > generator, to instantiate per-node DRBGs? It would allow initialization
> > of all secondary DRBGs right after primary generator initialization.
> 
> Theodore Ts'o, the random maintainer, already has a patch that
> seems to deal with this issue. He has posted more than one
> version & I'm not sure this is the best or latest, but ...
> https://lkml.org/lkml/2016/5/30/22

His latest patch set, which he mentioned will appear in 4.8, covers a per-NUMA 
DRNG design: there is a "primary" /dev/urandom DRNG from which secondary DRNGs 
for the NUMA nodes are spawned.


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-18 Thread Stephan Mueller
Am Freitag, 17. Juni 2016, 15:56:13 schrieb David Jaša:

Hi David,

> Hi Stephan,
> 
> thank you for your thorough reply,
> 
> On St, 2016-06-15 at 18:58 +0200, Stephan Mueller wrote:
> > Am Mittwoch, 15. Juni 2016, 18:17:43 schrieb David Jaša:
> > 
> > Hi David,
> > 
> > > Hello Stephan,
> > > 
> > > Did you consider blocking urandom output or returning an error until
> > > initialized? Given the speed of initialization you report, it shouldn't
> > > break any userspace apps while making sure that nobody uses predictable
> > > pseudorandom numbers.
> > 
> > My LRNG will definitely touch the beginning of the initramfs booting until
> > it is fully seeded. As these days the initramfs is driven by systemd
> > which always pulls from /dev/urandom, we cannot block as this would block
> > systemd. In Ted's last patch, he mentioned that he tried to make
> > /dev/urandom block which caused user space pain.
> 
> I was thinking along the lines that "almost every important package
> supports FreeBSD as well where they have to handle the condition so
> option to switch to Rather Break Than Generate Weak Keys would be nice"
> - but I didn't expect that systemd could be a roadblock here. :-/
> 
> I was also thinking of little devices where OpenWRT or proprietary
> Linux-based systems run that ended up with predictable keys way too
> often (or as in OpenWRT's case, with cumbersome tutorials on how to
> generate keys elsewhere).

I have some ideas on how to handle that issue -- let me run some tests and I 
will report back.
> 
> > But if you use the getrandom system call, it works like /dev/urandom but
> > blocks until the DRBG behind /dev/urandom is fully initialized.
> > 
> > > I was considering asking for a patch (or even trying to write it myself)
> > > to make the current urandom block/fail when not initialized, but that would
> > > surely have to be off by default under the "never break userspace" rule (even
> > > if it means a way too easy security problem with both random and urandom).
> > > The properties of your urandom implementation make this point moot and it
> > > could end the random/urandom wars.
> > 
> > That patch unfortunately will not work. But if you are interested in that
> > blocking /dev/urandom behavior for your application, use getrandom.
> 
> I'm QA with a touch of sysadmin, so the number of apps to fix is large
> and I have neither control over the projects nor the ability to patch
> them all myself. :)

Sure, I can understand that :-)
> 
> > > Best Regards,
> > > 
> > > David Jaša
> > 
> > Ciao
> > Stephan
> 
> BTW when looking at an old BSI's issue with Linux urandom that Jarod
> Wilson tried to solve with this series:
> https://www.spinics.net/lists/linux-crypto/msg06113.html
> I was thinking:
> 1) wouldn't it help large urandom consumers if the kernel created a DRBG
> instance for each of them? It would likely enhance performance and solve
> BSI's concern about predicting what numbers other urandom consumers
> obtain, at the cost of memory footprint

That issue is partly solved with my patch set: I have one DRBG per NUMA node 
where all DRBG instances are treated equally. Surely that patch could be 
expanded to per-CPU instances. But let us first try the per-NUMA 
implementation and see whether that helps.

Besides, the legacy /dev/urandom delivers about 12 MB/s on my system, whereas 
the DRBG delivers more than 800 MB/s. So we have quite a performance 
improvement.

Note, Ted's patch has a similar implementation.

> and then, after reading the paper associated with this series:
> 2) did you evaluate the use of an intermediate DRBG, fed by the primary
> generator, to instantiate per-node DRBGs? It would allow initialization
> of all secondary DRBGs right after primary generator initialization.

That is exactly what I do.
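The layered instantiation described above can be sketched with a toy hash-based generator standing in for the kernel's DRBG (the class and node count are illustrative assumptions, not the patch-set code):

```python
# Toy sketch of primary/secondary DRBG layering (stand-in, not the kernel
# crypto API): the primary seeds one secondary per NUMA node, so all
# secondaries are usable right after the primary is initialized.
import hashlib
import os

class ToyDRBG:
    """Minimal deterministic hash-based generator (illustration only)."""
    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(b"init" + seed).digest()

    def generate(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            out += hashlib.sha256(b"out" + self.state).digest()
            self.state = hashlib.sha256(b"next" + self.state).digest()
        return out[:n]

primary = ToyDRBG(os.urandom(32))   # seeded once from the entropy source
secondaries = [ToyDRBG(primary.generate(32)) for _ in range(4)]
# Consumers on node i draw from secondaries[i]; the primary is consulted
# again only when a secondary reseeds.
```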
> 
> Cheers,
> 
> David


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-17 Thread Sandy Harris
David Jaša  wrote:

>
> BTW when looking at an old BSI's issue with Linux urandom that Jarod
> Wilson tried to solve with this series:
> https://www.spinics.net/lists/linux-crypto/msg06113.html
> I was thinking:
> 1) wouldn't it help large urandom consumers if the kernel created a DRBG
> instance for each of them? It would likely enhance performance and solve
> BSI's concern about predicting what numbers other urandom consumers
> obtain, at the cost of memory footprint
> and then, after reading the paper associated with this series:
> 2) did you evaluate the use of an intermediate DRBG, fed by the primary
> generator, to instantiate per-node DRBGs? It would allow initialization
> of all secondary DRBGs right after primary generator initialization.

Theodore Ts'o, the random maintainer, already has a patch that
seems to deal with this issue. He has posted more than one
version & I'm not sure this is the best or latest, but ...
https://lkml.org/lkml/2016/5/30/22


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-17 Thread David Jaša
Hi Stephan,

thank you for your thorough reply,

On St, 2016-06-15 at 18:58 +0200, Stephan Mueller wrote:
> Am Mittwoch, 15. Juni 2016, 18:17:43 schrieb David Jaša:
> 
> Hi David,
> 
> > Hello Stephan,
> > 
> > Did you consider blocking urandom output or returning an error until
> > initialized? Given the speed of initialization you report, it shouldn't
> > break any userspace apps while making sure that nobody uses predictable
> > pseudorandom numbers.
> 
> My LRNG will definitely touch the beginning of the initramfs booting until it
> is fully seeded. As these days the initramfs is driven by systemd which always
> pulls from /dev/urandom, we cannot block as this would block systemd. In Ted's
> last patch, he mentioned that he tried to make /dev/urandom block which caused
> user space pain.

I was thinking along the lines that "almost every important package
supports FreeBSD as well where they have to handle the condition so
option to switch to Rather Break Than Generate Weak Keys would be nice"
- but I didn't expect that systemd could be a roadblock here. :-/

I was also thinking of little devices where OpenWRT or proprietary
Linux-based systems run that ended up with predictable keys way too
often (or as in OpenWRT's case, with cumbersome tutorials on how to
generate keys elsewhere).

> 
> But if you use the getrandom system call, it works like /dev/urandom but 
> blocks until the DRBG behind /dev/urandom is fully initialized.
> > 
> > I was considering asking for a patch (or even trying to write it myself)
> > to make the current urandom block/fail when not initialized, but that would
> > surely have to be off by default under the "never break userspace" rule (even
> > if it means a way too easy security problem with both random and urandom).
> > The properties of your urandom implementation make this point moot and it
> > could end the random/urandom wars.
> 
> That patch unfortunately will not work. But if you are interested in that 
> blocking /dev/urandom behavior for your application, use getrandom.
> 

I'm QA with a touch of sysadmin, so the number of apps to fix is large
and I have neither control over the projects nor the ability to patch
them all myself. :)

> > 
> > Best Regards,
> > 
> > David Jaša
> 
> 
> Ciao
> Stephan

BTW when looking at an old BSI's issue with Linux urandom that Jarod
Wilson tried to solve with this series:
https://www.spinics.net/lists/linux-crypto/msg06113.html
I was thinking:
1) wouldn't it help large urandom consumers if the kernel created a DRBG
instance for each of them? It would likely enhance performance and solve
BSI's concern about predicting what numbers other urandom consumers
obtain, at the cost of memory footprint
and then, after reading the paper associated with this series:
2) did you evaluate the use of an intermediate DRBG, fed by the primary
generator, to instantiate per-node DRBGs? It would allow initialization
of all secondary DRBGs right after primary generator initialization.

Cheers,

David



Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-15 Thread Stephan Mueller
Am Mittwoch, 15. Juni 2016, 18:17:43 schrieb David Jaša:

Hi David,

> Hello Stephan,
> 
> Did you consider blocking urandom output or returning an error until
> initialized? Given the speed of initialization you report, it shouldn't
> break any userspace apps while making sure that nobody uses predictable
> pseudorandom numbers.

My LRNG will definitely touch the beginning of the initramfs booting until it 
is fully seeded. As these days the initramfs is driven by systemd which always 
pulls from /dev/urandom, we cannot block as this would block systemd. In Ted's 
last patch, he mentioned that he tried to make /dev/urandom block which caused 
user space pain.

But if you use the getrandom system call, it works like /dev/urandom but 
blocks until the DRBG behind /dev/urandom is fully initialized.
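On Linux, that blocking behavior is reachable from user space directly; a sketch using Python's binding of the getrandom(2) syscall (`os.getrandom`, available since Python 3.6; the helper name is my own):

```python
# getrandom(2) blocks until the kernel's urandom pool is initialized;
# with GRND_NONBLOCK it fails immediately instead of waiting.
import os

def get_key_or_fail(n: int) -> bytes:
    """Fetch n random bytes, failing fast if the pool is not yet seeded."""
    try:
        return os.getrandom(n, os.GRND_NONBLOCK)
    except BlockingIOError:
        raise RuntimeError("entropy pool not initialized yet")

key = get_key_or_fail(32)   # succeeds on any normally booted system
```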
> 
> I was considering asking for a patch (or even trying to write it myself)
> to make the current urandom block/fail when not initialized, but that would
> surely have to be off by default under the "never break userspace" rule (even
> if it means a way too easy security problem with both random and urandom).
> The properties of your urandom implementation make this point moot and it
> could end the random/urandom wars.

That patch unfortunately will not work. But if you are interested in that 
blocking /dev/urandom behavior for your application, use getrandom.

> 
> Best Regards,
> 
> David Jaša


Ciao
Stephan


Re: [PATCH v4 0/5] /dev/random - a new approach

2016-06-15 Thread David Jaša
Hello Stephan,

Did you consider blocking urandom output or returning an error until
initialized? Given the speed of initialization you report, it shouldn't
break any userspace apps while making sure that nobody uses predictable
pseudorandom numbers.

I was considering asking for a patch (or even trying to write it myself)
to make the current urandom block/fail when not initialized, but that would
surely have to be off by default due to the "never break userspace" rule
(even if it means a way too easy security problem with both random and
urandom). The properties of your urandom implementation make this point
moot and could end the random/urandom wars.

Best Regards,

David Jaša



Re: [PATCH v4 0/5] /dev/random - a new approach

2016-05-31 Thread George Spelvin
I'll be a while going through this.

I was thinking about our earlier discussion where I was hammering on
the point that compressing entropy too early is a mistake, and just
now realized that I should have given you credit for my recent 4.7-rc1
patch 2a18da7a.  The hash function ("good, fast AND cheap!") introduced
there exploits that point: using a larger hash state (and postponing
compression to the final size) dramatically reduces the requirements on
the hash mixing function.

I wasn't conscious of it at the time, but I just now realized that
explaining it clarified the point in my mind, which led to applying
the principle in other situations.

So thank you!


[PATCH v4 0/5] /dev/random - a new approach

2016-05-31 Thread Stephan Mueller
Hi Herbert, Ted,

The following patch set provides a different approach to /dev/random, which
I call the Linux Random Number Generator (LRNG), to collect entropy within the
Linux kernel. The main improvements compared to the legacy /dev/random are the
provision of sufficient entropy during boot time as well as in virtual
environments and when using SSDs. A secondary design goal is to limit the
impact of the entropy collection on massively parallel systems and also to
allow the use of accelerated cryptographic primitives. Also, all steps of the
entropic data processing are testable. Finally, massive performance
improvements are visible at /dev/urandom and get_random_bytes.

The design and implementation is driven by a set of goals described in [1]
that the LRNG completely implements. Furthermore, [1] includes a
comparison with RNG design suggestions such as SP800-90B, SP800-90C, and
AIS20/31.

Changes v4:
* port to 4.7-rc1
* Use classical twisted LFSR approach to collect entropic data as requested by
  George Spelvin. The LFSR is based on a primitive and irreducible polynomial
  whose taps are not too close to the location where the current byte is mixed
  in. Primitive polynomials for other entropy pool sizes are offered in the
  code.
* The reading of the entropy pool is performed with a hash. The hash can be
  specified at compile time. The pre-defined hashes are the same as used for
  the DRBG type (e.g. a SHA256 Hash DRBG implies the use of SHA-256, an AES256
  CTR DRBG implies the use of CMAC-AES).
* Addition of the example defines for a CTR DRBG with AES128 which can be
  enabled during compile time.
* Entropy estimate: one bit of entropy per interrupt. In case a system does
  not have a high-resolution timer, apply 1/10th bit of entropy per interrupt.
  The interrupt estimates can be changed arbitrarily at compile time.
* Use kmalloc_node for the per-NUMA node secondary DRBGs.
* Add boot time entropy tests discussed in section 3.4.3 [1].
* Align all buffers that are processed by the kernel crypto API to an 8 byte
  boundary. This boundary covers all currently existing cipher implementations.

Changes v3:
* Convert debug printk to pr_debug as suggested by Joe Perches
* Add missing \n as suggested by Joe Perches
* Do not mix in stuck IRQ measurements as requested by Pavel Machek
* Add handling logic for systems without a high-res timer as suggested by Pavel
  Machek -- it uses ideas from the add_interrupt_randomness of the legacy
  /dev/random implementation
* add per NUMA node secondary DRBGs as suggested by Andi Kleen -- the
  explanation of how the logic works is given in section 2.1.1 of my
  documentation [1], especially how the initial seeding is performed.

Changes v2:
* Removal of the Jitter RNG fast noise source as requested by Ted
* Addition of processing of add_input_randomness as suggested by Ted
* Update documentation and testing in [1] to cover the updates
* Addition of a SystemTap script to test add_input_randomness
* To clarify the question whether sufficient entropy is present during boot,
  I added one more test in 3.3.1 [1] which demonstrates that sufficient
  entropy is provided during initialization. Even in the worst case of no fast
  noise sources on a virtual machine with only very few hardware devices, the
  testing shows that the secondary DRBG is fully seeded with 256
  bits of entropy before user space injects the random data obtained
  during shutdown of the previous boot (i.e. the requirement phrased by the
  legacy /dev/random implementation). As the writing of the random data into
  /dev/random by user space will happen before any cryptographic service
  is initialized in user space, this test demonstrates that sufficient
  entropy is already present in the LRNG at the time user space requires it
  for seeding cryptographic daemons. Note, this test result was obtained
  for different architectures, such as x86 64 bit, x86 32 bit, ARM 32 bit and
  MIPS 32 bit.

[1] http://www.chronox.de/lrng/doc/lrng.pdf

[2] http://www.chronox.de/lrng.html

Stephan Mueller (5):
  crypto: DRBG - externalize DRBG functions for LRNG
  random: conditionally compile code depending on LRNG
  crypto: Linux Random Number Generator
  crypto: LRNG - enable compile
  random: add interrupt callback to VMBus IRQ handler

 crypto/Kconfig |   10 +
 crypto/Makefile|1 +
 crypto/drbg.c  |   11 +-
 crypto/lrng.c  | 1981 
 drivers/char/random.c  |9 +
 drivers/hv/vmbus_drv.c |3 +
 include/crypto/drbg.h  |7 +
 include/linux/genhd.h  |5 +
 include/linux/random.h |7 +-
 9 files changed, 2027 insertions(+), 7 deletions(-)
 create mode 100644 crypto/lrng.c

-- 
2.7.2

