Hi Edward,
Thank you for your informative comments (and to others for contributing
their thoughts). I do think there is room for improvement in the BC RNG
code, particularly around entropy collection.
Release 1.8 is currently pending the completion of the port of the
latest TLS code from the Java version. It is still several weeks away,
so there is plenty of time if people would like to put their heads
together and contribute some concrete ideas/patches - ideally in the
form of smaller tweaks rather than re-architecting (there may be more
room for that in the 2.0 to follow, which will likely include many
sweeping changes). Tools/tests to assess RNG quality are also welcome,
and perhaps they should come first.
There is a beta version available at
http://www.downloads.bouncycastle.org/betas/, so please refer to that or
the latest git code, as much is likely to have changed since 1.7.
Regards,
Pete Dettman
On 29/07/2014 1:55 pm, Edward Ned Harvey (bouncycastle) wrote:
Just FYI, I've been doing some statistical analysis on random numbers
generated from various entropy sources. Here is a really simple test
that has produced some illuminating results: Generate a bunch of
random bytes. Then split them out, one bit at a time (so if there's
a pattern, it will be more easily recognizable). Compress the result
and see how compressible it is. (I'm using LZMA from SharpCompress.)
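Roughly, the test looks like this (a minimal sketch in C#; the names
are mine, and I've substituted GZipStream purely to keep the example
self-contained - the actual measurement used LZMA from SharpCompress):

    using System.IO;
    using System.IO.Compression;

    static class BitCompressionTest
    {
        // Expand each bit of 'data' into its own byte (0x00 or 0x01),
        // so any bit-level pattern is laid bare for the compressor.
        static byte[] SplitBits(byte[] data)
        {
            byte[] bits = new byte[data.Length * 8];
            for (int i = 0; i < data.Length; i++)
                for (int b = 0; b < 8; b++)
                    bits[i * 8 + b] = (byte)((data[i] >> b) & 1);
            return bits;
        }

        // Compressed size of the bit-expanded data; smaller means less
        // apparent entropy in the input.
        static long CompressedSize(byte[] data)
        {
            byte[] bits = SplitBits(data);
            using (var ms = new MemoryStream())
            {
                // GZipStream stands in here for the LZMA compressor
                // used in the actual test.
                using (var gz = new GZipStream(ms, CompressionMode.Compress))
                {
                    gz.Write(bits, 0, bits.Length);
                }
                // ToArray remains usable after GZipStream closes ms.
                return ms.ToArray().Length;
            }
        }
    }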
So I create a list of RNGs. One of the RNGs is the zero RNG, which
just produces an endless stream of zeros. This is in the list for
the sake of calibration and as a test control. I create a byte array,
say 64KB, and I populate it with random bytes from the first RNG.
Split each bit out (now I have an array of 512KB), and compress it.
Keep track of its compressed size. Repeat for each RNG in turn.
After repeating with a dozen or so RNGs, I use the maximum compressed
size as the calibration point for assumed pure randomness, and the
minimum (the all-zero RNG) as the calibration point for completely
worthless non-randomness. Then I linearly scale each RNG's result
between these two points, to estimate the number of entropy bits per
bit of output.
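To put the scaling step in code (again my own sketch, reusing the
CompressedSize routine above):

    // 0.0 = compresses like the all-zero control, 1.0 = compresses
    // like the best (assumed fully random) RNG in the batch.
    static double EntropyPerBit(long compressedSize, long zeroSize, long maxSize)
    {
        return (double)(compressedSize - zeroSize) / (maxSize - zeroSize);
    }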
ThreadedSeedGenerator (with fast=false) is producing approx 0.7 bits
of entropy per bit.
ThreadedSeedGenerator (with fast=true) is producing approx 0.5 bits of
entropy per bit.
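For reference, these are the two modes being compared, exercised
through the normal API (the byte count here is arbitrary - just
however much you want to feed the test):

    using Org.BouncyCastle.Crypto.Prng;

    ThreadedSeedGenerator tsg = new ThreadedSeedGenerator();
    byte[] slowSeed = tsg.GenerateSeed(65536, false); // fast=false: ~0.7 bits of entropy per bit
    byte[] fastSeed = tsg.GenerateSeed(65536, true);  // fast=true:  ~0.5 bits of entropy per bit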
This is not a fatal flaw, as long as you're compensating for it. By
default, SecureRandom seeds itself with one sample of Ticks plus 24
bytes (192 bits) from ThreadedSeedGenerator (with fast=true). By my
estimation, that is approx 100-104 bits of entropy (192 x 0.5 = 96
bits from the seed generator, plus a few bits from Ticks). Each
subsequent call to SecureRandom adds another sample of Ticks as seed
material, which is approx 8 bits of entropy at most.
I really think each call to SecureRandom should pull another 256 bits
from ThreadedSeedGenerator (effectively adding another 128-bit seed),
but that's just my opinion.
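In the meantime, a caller can approximate that from the outside. This
is only a sketch of a workaround, not how SecureRandom behaves today,
and it assumes (as I understand the BC code) that SetSeed supplements
the existing state rather than replacing it:

    using Org.BouncyCastle.Crypto.Prng;
    using Org.BouncyCastle.Security;

    SecureRandom random = new SecureRandom();
    ThreadedSeedGenerator seeder = new ThreadedSeedGenerator();

    // Mix in 32 bytes (256 bits) of ThreadedSeedGenerator output
    // before drawing any output.
    random.SetSeed(seeder.GenerateSeed(32, false));
    byte[] buf = new byte[32];
    random.NextBytes(buf);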
Additionally, I measured the statistical randomness of each individual
bit of Ticks, sampled thousands of times with a Sleep(1) between
samples. The 8 least significant bits were indistinguishable from
random. The 9th and 10th bits started deviating measurably, and after
that the deviation was very clear, though there was nonzero entropy up
to maybe the 29th bit or so. The total entropy estimate across all the
bits of a single sample of Ticks was about 14 bits, but realistically
only 8 bits looked random, so I wouldn't be comfortable trusting more
than 2 or 4 bits of entropy from a single sample of Ticks.
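The measurement itself was along these lines (a rough sketch; a real
estimate would also need to account for correlation between successive
samples, which simple per-bit frequencies don't capture):

    using System;
    using System.Threading;

    // Sample DateTime.Now.Ticks repeatedly, with Sleep(1) between
    // samples, and record how often each of the 64 bits is set. Bits
    // whose frequency is far from 0.5 clearly carry much less than
    // one bit of entropy.
    static double[] TickBitFrequencies(int samples)
    {
        int[] ones = new int[64];
        for (int i = 0; i < samples; i++)
        {
            long ticks = DateTime.Now.Ticks;
            for (int b = 0; b < 64; b++)
            {
                if (((ticks >> b) & 1L) != 0)
                    ones[b]++;
            }
            Thread.Sleep(1);
        }

        double[] freq = new double[64];
        for (int b = 0; b < 64; b++)
            freq[b] = (double)ones[b] / samples;
        return freq;
    }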