Entropy estimation in /dev/random (and /dev/urandom)

Linux /dev/random adds randomness from the environment and estimates the
entropy it gains by doing so.  Entropy is a measure of the uncertainty
which an outside observer would have over the internal RNG state (called
the "random pool" in /dev/random).  It is assumed that the observer
does not have detailed knowledge of the events which are occuring,
but he may have some general idea of system activity and characteristics.

The entropy estimation is done on the basis of event timing.  There are
separate timers maintained for mouse, keyboard, block device, and
other interrupts, respectivelly.  When an event occurs the entropy is
estimated based on the timing of that event compared to the timings of
earlier events of that same class.  When a key is pressed, for example,
the entropy is estimated based on the timing of that keyboard press
compared to the timing of earlier keyboard presses.

The actual data of the event does not contribute to the estimate.
The data is added to the random pool (along with the event time), but
it is not taken into consideration in estimating the entropy.  For a
mouse interrupt, the mouse position and interrupt time is added to the
pool, but only the time is used to estimate the entropy contribution,
and similarly for the other interrupt classes.

The entropy estimate is done on the basis of first, second, and third
differences from the timing of previous events of the same class.  The
first difference is the delta of the new event time minus the previous
event time.  The second difference is the delta of first differences of
this event and the previous; the third difference is the delta of second
differences.  The code actually uses the absolute values of the deltas.

An example is as follows:

Mouse event times       1004   1012    1024    1025    1030    1041

1st differences              8      12       1       5      11

2nd differences                  4       11      4       6

3rd differences                      7       7       2

For the latest event being added, at time 1041 above, the first difference
is 11, the second difference is 6, and the third difference is 2.

The entropy estimator takes the minimum delta value, which in this case
would be 2, as its measure of how unexpected the new data point is.
Given this minimum delta, the log base 2 of that value is then used as
the entropy estimate.  In this case the new value would be credited as
adding 1 bit of entropy.

This use of deltas is approximately the same as attempting to fit an n
degree polynomial to the previous n+1 points, then looking to see how far
the new point is from the best prediction based on the previous n points.
The minimum of the deltas is used, which has the effect of taking the best
fit of a 0th, 1st or 2nd degree polynomial, and using that one.

With this model of how the entropy estimation works, we can evaluate how
suitable it is for the purposes in which it is used.

The estimator seems to be implicitly based on a physical model in which
the event timings would be expected to occur on a low degree polynomial.
It is conceivable that an outside observer might be able to guess this
fact and, making some simple assumptions about the initial state and
polynomial coefficients, he could predict a series of timing values.
The entropy estimator takes that into consideration by only including
departures from low degree polynomials as a source of entropy.
The assumption is that such departures are unpredictable and random and
could not be anticipated by the observer.

This model would seem better suited to some kinds of data than others.
For example, mouse positions (measured at constant time intervals) would
produce data series which are likely to be on low degree polynomials.
The mouse is at rest for much of the time, then it is accelerated by hand
and moved at constant or slowly varying speed, then it is decelerated.
If the mouse is being moved around vigorously it will often trace out
something like a Lisajous pattern, where the two coordinates are simple
trig functions, which can be well approximated locally by second degree

However, the model is not applied to mouse positions.  It is only applied
to event timings.  Here it seems less certain that the assumptions
are valid.  The timing of mouse interrupts, for example, does depend
to some extent on mouse speed, but not necessarily in a simple way.
It is not clear that mouse interrupt timing would be expected to fit on
a low degree polynomial.

In fact the model can be seen to be deficient in one significant respect,
in that it does not take into consideration quantization effects.
Event times may, because of various system characteristics, all occur
at some multiple of a high speed timer.  Consider the example above,
and suppose that all event times were multiplied by 10:

Mouse event times       10040   10120    10240    10250    10300    10410

1st differences              80      120       10       50      110

2nd differences                  40       110      40       60

3rd differences                      70       70       20

Now the minimum difference is 20 instead of 2, and the entropy added
will be 4+ bits rather than 1.  But the actual randomness of the data
has not increased, from the point of view of an outside observer who
is aware of this quantization effect.  The system is overestimating the
entropy because it does not detect the quantization.

A better model for timing based entropy would be designed to look
specifically for quantization effects.  If all the deltas and the new
value are close to a multiple of some common factor, this factor should
be removed before the entropy is estimated.

The polynomial fitting can also become confused by some simple variations.
Suppose that due to either hardware or software characteristics, all
events are passed to the entropy estimator twice.  This leads to a
pattern like:

Mouse event times       1024   1025    1025    1030    1030    1041

1st differences              1       0       5       0      11

2nd differences                  1        5      5       11

3rd differences                      4       0       6

This now adds log(6) or 2+ bits of entropy for this event.  But if the
observer knows this characteristic of the system, there is no more entropy
in this series than in the original one.

There may be circumstances where the event code (key press, mouse
position, disk or other event data) is useful in estimating entropy
contributions.  Auto-repeat key presses, for example, might have timing
characteristics sufficiently different from user typing that it is
hard to have one model capture both, and the event code could help to
detect those.  Presently this data is ignored in the entropy estimate.

A few other problems can be mentioned briefly.  All classes of events
are being dealt with by the same model.  It is not clear that it is
reasonable to assume that such disparate kinds of events can be dealt
with in this one framework.  The use of exactly three levels of deltas
is also questionable.  Has it been determined that a fourth level is
not necessary, for any of the event classes handled by /dev/random?
Do all classes of events need exactly the same three levels?

In summary, the /dev/random entropy estimates are based on fitting low
degree polynomials to event times.  All classes of events are modelled
using the same approach.  More justification is needed to show that this
is a reasonable method for estimating entropy for distinctly different
classes of events.  Some specific problems (quantization, repeated events)
will cause this model to overestimate entropy.

Reply via email to