On Aug 27, 2010, at 2:34 AM, Thomas wrote:
> On Thursday, 26 August 2010, at 12:25:55, Jerry Leichter wrote:
>> RNGs in VMs are a big problem because the "unpredictable" values
>> used in the non-deterministic parts of the algorithms - whether you
>> use them just for seeding or during updating as well - are often much
>> more predictable in a VM than on a "real" machine. (For example, disk
>> timings on real hardware have some real entropy, but in a VM with an
>> emulated disk, that's open to question.)
> I really doubt it. Are there papers about it? It does not matter
> whether one physical disk is shared between 1000 processes or
> between 10 VMs each running 100 processes (assuming a shared
> random pool).

You have this precisely, and dangerously, backwards.

> The entropy is not generated by the disk but by the processes
> accessing it in a (hopefully) non-deterministic way. The HDD
> interrupts are just the sampling point. Therefore the entropy
> gained depends on the level of abstraction at which the sampling
> point is placed. It can be assumed that buffered HDD writes and
> reads on the host of a VM produce less entropy than the real
> read(2) and write(2) calls within the VM itself.
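
To make the mechanism under discussion concrete, here is a minimal
sketch of harvesting timing at such a sampling point - Python purely
for readability (a real collector lives in the kernel, in C), and the
names are illustrative, not any particular kernel's API:

    import hashlib
    import time

    POOL = hashlib.sha256()  # entropy pool, modeled as a running hash
    _last = time.perf_counter_ns()

    def on_disk_interrupt():
        # The interrupt itself carries no entropy; what gets harvested
        # is the *timing* of the event as seen by a high-resolution
        # clock. The low-order bits of the delta are where any
        # unpredictability lives.
        global _last
        now = time.perf_counter_ns()
        delta = now - _last
        _last = now
        POOL.update(delta.to_bytes(8, "little"))

Note that the code is identical on bare metal and in a VM; whether
those low-order bits are actually unpredictable is exactly what is at
issue.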
*Nothing* of this sort can be assumed. You've given a plausibility
argument, based on your understanding of the behavior of a very
complex system. Such arguments often go wrong. You need to do some
experimentation to justify your conclusions. You'll never *prove*
that you're getting unpredictability, since you aren't starting from a
sound theoretical base. The best you can get is a combination of
broad analysis of the system and experimental confirmation that it
seems to behave according to your model - which, mind you, is all we
ever get for real, complex physical systems. It's enough to build
everything from bridges to semiconductor devices.
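
As a rough sketch of what such experimental confirmation might look
like: collect the timing deltas seen at the collection point on each
platform you care about and compare a min-entropy estimate. The
plug-in estimator below is crude - it ignores correlations between
samples, so it can debunk but never certify - yet even it will expose
gross collapses like the scenarios that follow:

    import math
    from collections import Counter

    def plugin_min_entropy(samples):
        # -log2 of the most frequent symbol's observed frequency.
        # Optimistic: correlations make the true min-entropy lower,
        # never higher.
        p_max = max(Counter(samples).values()) / len(samples)
        return -math.log2(p_max)

    # E.g., feed it the low 8 bits of successive interrupt deltas,
    # gathered once on bare metal and once inside a VM, and compare.
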
To move beyond the purely abstract ... here are a few plausible
ways in which disk I/O in a VM might not deliver the same amount of
entropy as on real hardware:
1. Multiple VMs boot the same image at the same time, reading
exactly the same disk blocks in the same order. During the relevant
time period, the various VMs end up in a queue and run in a set
order. They all then see (pretty much) the *same* delays - once the
heads are positioned, the disk satisfies all the reads in turn.
2. Multiple VMs boot the same image at the same time, as in 1, but
the VM scheduler gives one VM enough time to run through the entire
sequence in which it gathers startup entropy. Subsequent VMs find their
disk requests satisfied from the VM-level buffer with essentially no
delay.
3. The VM scheduler uses a large (relative to disk seek times) clock
tick. Any disk interrupts due for a given VM are delayed until the VM
is next scheduled to run. Because the clock tick is large, there is
essentially no variation in how long a disk I/O appears to take to
complete: it's always the period up to the next tick. (The sketch
after this list simulates the effect.)
4. SSDs have entirely different timing characteristics than spinning
disks, and so require different measurement approaches. (A collection
point tuned to disk-seek time scales will miss any variation on SSD
time scales.) The implementation is smart enough to look at the disk
type to determine its strategy. However, the VMM hides the SSD
present on the real hardware from the VM and instead presents a
standard spinning disk. The VM then uses the wrong strategy.
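
Scenario 3 is easy to simulate. The numbers below are invented purely
for illustration (seek delays uniform over 0-10 ms, a hypothetical
10 ms scheduler tick); plugin_min_entropy is the estimator sketched
above:

    import math
    import random
    from collections import Counter

    def plugin_min_entropy(samples):
        p_max = max(Counter(samples).values()) / len(samples)
        return -math.log2(p_max)

    random.seed(1)
    raw = [random.randrange(10_000) for _ in range(100_000)]  # usec
    TICK = 10_000  # 10 ms scheduler tick, in usec

    # The guest only sees the interrupt at the next tick boundary, so
    # every sub-tick delay is observed as exactly one tick.
    observed = [(t // TICK + 1) * TICK for t in raw]

    print(plugin_min_entropy(raw))       # roughly 12 bits per sample
    print(plugin_min_entropy(observed))  # 0.0 - all variation is gone

In that scenario the VM's clock granularity, not the disk, bounds
what the sampling point can see.
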
Do any of these actually occur in real VM/OS combinations? I don't
know, *and neither do you*. Even if you know all the details of the
OS disk strategies, and have measured how your collection point works
on real hardware ... you should not believe that it will work
correctly on a VM until you test it there. (In fact, you shouldn't
believe it will work correctly on significantly different hardware
until you test it: Simply replacing the disk with an SSD may change
the behavior enough to require adjustment.)
-- Jerry
---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majord...@metzdowd.com