On randomness

2008-07-31 Thread John Denker
In 1951, John von Neumann wrote:

 Any one who considers arithmetical methods of producing random digits
 is, of course, in a state of sin.

That may or may not be an overstatement.

IMHO it all depends on what is meant by random.  The only notion 
of randomness that I have found worthwhile is the notion of being 
_random enough for the purpose_.

  *) For some benign purposes, such as a Monte Carlo integration
   routine, a simple linear congruential generator might suffice
   (see the sketch following this list).  Such a generator would
   not withstand cryptanalysis for even a moment, but it will not
   be subjected to cryptanalysis, so we don't care.

  *) At the other extreme, there are many high-stakes business,
   military, and gambling applications where I would agree with 
   von Neumann, and would shun absolutely all PRNGs.  I would 
   rely exclusively on _hardware_ randomness generators, as
   detailed at:
  http://www.av8n.com/turbid/

   The /seeding/ of PRNGs is a notorious problem;  the idea of
   seeding one PRNG with another often reduces to the problem
   previously _not_ solved.  Sooner or later you need a source
   of high-grade randomness, not pseudo randomness, and sooner
   is better than later.

   For this reason, most so-called PRNGs are not really PRNGs
   after all, since their foundations are seen to rest on a 
   hardware randomness generator.  They are more usefully
   considered schemes for _stretching_ the randomness of the 
   underlying hardware.  I call them SRNGs, for stretched 
   randomness generators.
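
As a concrete illustration of the first bullet above, here is a minimal
sketch in Python (mine, not John's): the widely cited Numerical Recipes
LCG constants driving a Monte Carlo estimate of pi.  It is adequate for
that benign purpose, and trivially predictable for any adversarial one.

  # A classic 32-bit linear congruential generator feeding a Monte Carlo
  # estimate of pi.  Fine for benign numerical work; useless the moment
  # an adversary cares about your outputs.
  def lcg(seed, a=1664525, c=1013904223, m=2**32):
      x = seed
      while True:
          x = (a * x + c) % m
          yield x / m                      # roughly uniform in [0, 1)

  def estimate_pi(samples=1_000_000, seed=12345):
      rng = lcg(seed)
      inside = 0
      for _ in range(samples):
          x, y = next(rng), next(rng)
          if x * x + y * y < 1.0:
              inside += 1
      return 4.0 * inside / samples

  print(estimate_pi())                     # converges toward 3.14159...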
   

On 07/30/2008 12:22 PM, Pierre-Evariste Dagand wrote:

 I fear I was not clear: I don't see what is wrong in evaluating the
 quality of a random number generator with (an extensive set of)
 statistical tests.

How extensive?

To paraphrase Dijkstra:  Testing may prove the absence of randomness,
but it cannot possibly prove the presence of randomness.

Testing for high-grade randomness is not just a hard problem; it is 
formally undecidable, in the same category as the halting problem.  
Reference: Chaitin.  
See also:
  http://www.av8n.com/turbid/paper/turbid.htm#sec-limitations

On 07/30/2008 01:33 PM, Ben Laurie replied:
 
 SHA-1(1), SHA-1(2), SHA-1(3), ... SHA-1(N) will look random, but clearly
 is not.

Quite so.
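
A sketch of the construction Ben describes, in Python (the exact
encoding of the counter is my guess): the output will sail through
casual statistical inspection, yet anyone who knows or guesses the
construction can reproduce every bit of it.

  import hashlib

  def sha1_counter_stream(n_blocks):
      # SHA-1(1) || SHA-1(2) || ... || SHA-1(N): looks random, isn't.
      out = bytearray()
      for i in range(1, n_blocks + 1):
          out += hashlib.sha1(str(i).encode()).digest()   # 20 bytes per counter
      return bytes(out)

  stream = sha1_counter_stream(1000)   # ~20 kB of "random-looking" data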

 But, then, there is the chicken-or-egg problem: how would you
 ensure that a *new* RNG is a good source of randomness?  (It's not a
 rhetorical question; I'm curious about other approaches.)
 
 By reviewing the algorithm and thinking hard.

Sometimes that's good enough, but sometimes it's not.  Or more to the
point, often the cost of thinking hard enough exceeds the cost of 
implementing a _hardware_ randomness generator that has a _provable_
_lower bound_ on its entropy(*).

To paraphrase the previous paraphrase:  Testing may provide an upper 
bound on the randomness, but it cannot possibly provide a useful lower 
bound.  In contrast, physics can provide a useful lower bound.

Saying that this-or-that test measures the randomness is highly 
misleading if you don't distinguish measuring an upper bound from 
measuring a lower bound.  The test that judged a DNS server to be
GREAT was making precisely this mistake.



  *) NB: Whereas I mean something rather vague by randomness (i.e. 
  random enough for the application) I mean something very specific 
  by entropy.

For details on all this, see
  http://www.av8n.com/turbid/
and in particular
  http://www.av8n.com/turbid/paper/turbid.htm



Re: On the randomness of DNS

2008-07-31 Thread Pierre-Evariste Dagand
  SHA-1(1), SHA-1(2), SHA-1(3), ... SHA-1(N) will look random, but clearly is
 not.

Just out of curiosity, I ran the Diehard tests on /dev/random (FreeBSD
7.0) and on a SHA-1 sequence of [ 1 ... N ].  Both files are 63 MB.
I know that there has been some controversy about FreeBSD's /dev/random
on this list in the past; I don't know whether those issues have been
resolved, though.  I do not have any other OS at hand right now.
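
For concreteness, here is one way such input files could be produced,
in Python; Pierre does not show his method, so the file names, the
counter encoding, and the use of /dev/random versus /dev/urandom are
assumptions on my part.

  import hashlib

  SIZE = 63 * 1024 * 1024          # 63 MB, as in the test above

  # File 1: raw kernel RNG output.  (On some systems /dev/random blocks;
  # substitute /dev/urandom if so.)
  with open("/dev/random", "rb") as src, open("devrandom.bin", "wb") as dst:
      remaining = SIZE
      while remaining > 0:
          chunk = src.read(min(65536, remaining))
          if not chunk:
              break
          dst.write(chunk)
          remaining -= len(chunk)

  # File 2: SHA-1(1) || SHA-1(2) || ... truncated to the same size.
  with open("sha1seq.bin", "wb") as dst:
      written, i = 0, 1
      while written < SIZE:
          take = hashlib.sha1(str(i).encode()).digest()[:SIZE - written]
          dst.write(take)
          written += len(take)
          i += 1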

To interpret the results, quoting the DIEHARD documentation:

Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits.  Those p-values are obtained by
p = 1 - F(X), where F is the assumed distribution of the sample
random variable X (often normal).  But that assumed F is often just
an asymptotic approximation, for which the fit will be worst
in the tails.  Thus you should not be surprised with occasional
p-values near 0 or 1, such as .0012 or .9983.  When a bit
stream really FAILS BIG, you will get p's of 0 or 1 to six
or more places.  By all means, do not, as a Statistician
might, think that a p < .025 or p > .975 means that the RNG
has failed the test at the .05 level.  Such p's happen
among the hundreds that DIEHARD produces, even with good RNGs.
So keep in mind that "p happens".

For /dev/random, I get:

Birthday spacing:           0.323220
Overlapping 5-permutations: 0.483655, 0.215005
Binary rank:                0.405667
Count the 1's:              0.379616
Parking lot:                0.993189
Minimum distance:           0.580824
3D-spheres:                 0.616398
Squeeze:                    0.195228
Overlapping sums:           0.010507
Runs:                       0.233353, 0.274341, 0.719427, 0.749529
Craps:                      0.480129, 0.650224

Sum-up for /dev/random:
Abnormally high value: 0.993189 [1]
Abnormally low value: 0.010507 [1]
Total: 2

For sha1(n), I get:

Birthday spacing:           0.810196
Overlapping 5-permutations: 0.717577, 0.645166
Binary rank:                0.891962
Count the 1's:              0.377828
Parking lot:                0.188993
Minimum distance:           0.138668
3D-spheres:                 0.087107
Squeeze:                    0.377509
Overlapping sums:           0.091750
Runs:                       0.938376, 0.060212, 0.050921, 0.210624
Craps:                      0.927501, 0.827696

Sum up for Sha1(n):
Abnormally high values: 0.938376, 0.927501 [2]
Abnormally low values: 0.087107, 0.091750, 0.060212, 0.050921 [4]
Total: 6

So, I would say that SHA-1(n) does not pass Diehard (while /dev/random
does).  But this would require further examination, in particular to
understand why some tests failed.  And, in fact, I have no clue why
they failed...
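
One possible form of that further examination (my own rough check, not
from the thread): under the uniformity assumption quoted from the
Diehard documentation above, see how many p-values outside [0.1, 0.9]
one would expect among 16 results purely by chance.

  from math import comb

  n, p_extreme = 16, 0.2     # P(p < 0.1 or p > 0.9) = 0.2 if p is uniform
  prob_ge_6 = sum(comb(n, k) * p_extreme**k * (1 - p_extreme)**(n - k)
                  for k in range(6, n + 1))
  print(n * p_extreme)       # 3.2 extreme values expected by chance
  print(round(prob_ge_6, 3)) # ~0.08 chance of seeing 6 or more

On that crude reckoning, 6 extremes out of 16 is suggestive but not by
itself damning.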


Regards,

-- 
Pierre-Evariste DAGAND
http://perso.eleves.bretagne.ens-cachan.fr/~dagand/



Re: On the randomness of DNS

2008-07-31 Thread Bill Stewart

 Ben wrote:

  But just how GREAT is that, really?  Well, we don't know.  Why?  Because
 there isn't actually a way to test for randomness.  Your DNS resolver
 could be using some easily predicted random number generator like, say,
 a linear congruential one, as is common in the rand() library function,
 but DNS-OARC would still say it was GREAT.


At 11:57 AM 7/30/2008, Pierre-Evariste Dagand wrote:

Well, there are some tests to judge the quality of a random number
generator, the best known being the Diehard tests.


Random number quality is contextual.
In this case, for 95-99% of the market, the real test is between
"Patched", "Badly broken", "Not patched yet", and "Didn't need patching",
and if you'd prefer the term "Best we can do until DNSSEC"
instead of GREAT, I won't be the one to argue with you.

There are some other possible conditions, like
"Rolled their own with open source, badly"
or "Maliciously subtle malware DNS resolver".
The latter is way too much work compared to cruder approaches
(like targeting queries directly to your evil DNS server).
The former is not too common, though it probably exists,
but once most systems get patched,
it may not be a big enough target to interest crackers.





Re: On the randomness of DNS

2008-07-31 Thread Ben Laurie

Dirk-Willem van Gulik wrote:
I fail to see how you could evaluate this without seeing the code (and
even then I doubt that one can properly do this: the 'old' NSA habit
of tweaking your random generator rather than your protocol/algorithm
when they wanted your product 'upgraded' to export quality is terribly
effective and very hard to spot).


Or am I missing something ?


I think that, in general, you are correct. However, in the case of NAT 
your adversary is not someone who is trying to guess your randomness, 
but someone who is trying to sell you their NAT gateway. In this case, 
code/silicon inspection probably suffices.


Cheers,

Ben.

--
http://www.apache-ssl.org/ben.html   http://www.links.org/

There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit. - Robert Woodruff



Re: On the randomness of DNS

2008-07-31 Thread Philipp Gühring

Hi,

I would suggest using http://www.cacert.at/random/ to test the 
randomness of the DNS source ports.  Due to the large variety of 
random-number sources that have been tested there already, it's useful 
as a classification service for unknown random-looking numbers.
You just have to collect 12 MB of numbers from a DNS server and upload 
them there.  (If you get 2 bytes per request, that's 6 million requests 
you have to do.)
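
Philipp does not say how the numbers should be gathered or packed; as a
sketch in Python, assuming you have already extracted the observed
16-bit source ports (e.g. from a packet capture of the resolver's
outgoing queries), two bytes per observation could be written out like
this:

  import struct

  def write_port_file(ports, path="ports.bin"):
      # Pack each observed source port as a big-endian unsigned 16-bit value.
      with open(path, "wb") as f:
          for port in ports:
              f.write(struct.pack(">H", port))

  # 12 MB at 2 bytes per observation = ~6 million observed ports, as noted above.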



I don't see the point of evaluating the quality of a random number
generator by statistical tests.


We successfully used statistical tests to detect broken random number 
generators, we informed the vendors and they fixed them.

http://www.cacert.at/cgi-bin/rngresults

Best regards,
Philipp Gühring



Re: On the unpredictability of DNS

2008-07-31 Thread William Allen Simpson

I've changed the subject.  Some of my own rants are about mathematical
cryptographers who look for the perfect solution instead of a practical
security solution.  Always think about the threat first!

In this threat environment, the attacker is unlikely to have perfect
knowledge of the sequence.  Shared resolvers are the most critical
vulnerability, but the attacker isn't necessarily in the packet path, and
cannot discern more than a few scattered numbers in the sequence.  The
more sharing (and greater impact), the more sparse the information.

In any case, the only perfect solution is DNS-security.  Over many
years, I've given *many* lectures to local university, network, and
commercial institutions about the need to upgrade and secure our zones.
But the standards kept changing, and the roots and TLDs were not secured.

Now, the lack of collective attention to known security problems has
bitten us collectively.

Nevertheless, with rephrasing, Ben has some good points...

Ben Laurie wrote:
But just how GREAT is that, really? Well, we don't know. Why? 
Because there isn't actually a way to test for randomness. ...


While randomness is sufficient for perfect unpredictability, it isn't
necessary in this threat environment.

Keep in mind that the likely unpredictability is about 2**24.  In many
or most cases, that will be implementation limited to 2**18 or less.


Your DNS resolver could be using some easily predicted random number 
generator like, say, a linear congruential one, as is common in the 
rand() library function, but DNS-OARC would still say it was GREAT.


In this threat environment, a better test would be to determine a
possible seed for any of several common PRNGs, or the absence of any PRNG.

How many samples would be needed?  That's the mathematical limitation.

Is it less than 2**9 (birthday attack on 2**18)?
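
A quick check of the birthday arithmetic behind that 2**9 figure (my
own calculation, using the standard approximation):

  from math import exp

  N = 2**18                          # assumed effective unpredictability
  for k in (2**8, 2**9, 2**10):
      p = 1 - exp(-k * (k - 1) / (2 * N))
      print(k, round(p, 2))          # 256: ~0.12,  512: ~0.39,  1024: ~0.86

So on the order of 2**9 observed values already gives a fair chance of
a repeat.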


It is an issue because of NAT. If your resolver lives behind NAT (which 
is probably way more common since this alert, as many people's reactions 
[mine included] was to stop using their ISP's nameservers and stand up 
their own to resolve directly for them) and the NAT is doing source port 
translation (quite likely), then you are relying on the NAT gateway to 
provide your randomness. But random ports are not the best strategy for 
NAT. They want to avoid re-using ports too soon, so they tend to use an 
LRU queue instead. Pretty clearly an LRU queue can be probed and 
manipulated into predictability.



Agreed!  All my tests of locally accessible NATs (D-Link and Linksys) show
that the sequence is fairly predictable.  And no code updates are available...
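
To make Ben's LRU point concrete, here is a toy model in Python (not
any particular vendor's code) of LRU source-port allocation; once an
observer has seen a few allocations, the next ones follow in order.

  from collections import deque

  class LruPortAllocator:
      # Ports are handed out from the front of a queue and, when released,
      # pushed onto the back, so reuse is maximally delayed and the
      # allocation order is trivially predictable.
      def __init__(self, low=1024, high=65535):
          self.free = deque(range(low, high + 1))

      def allocate(self):
          return self.free.popleft()

      def release(self, port):
          self.free.append(port)

  nat = LruPortAllocator()
  print([nat.allocate() for _ in range(5)])   # [1024, 1025, 1026, 1027, 1028]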


Incidentally, I'm curious how much this has impacted the DNS 
infrastructure in terms of traffic - anyone out there got some statistics?



Some are coming in on another private security list where I and some
others here are vetted, but everything is very preliminary.

In addition to the publicized attacks on major ISP infrastructure, there
are verified scans and attacks against end user home NATs.


Oh, and I should say that number of ports and standard deviation are not 
a GREAT way to test for randomness. For example, the sequence 1000, 
2000, ..., 27000 has 27 ports and a standard deviation of over 7500, 
which looks pretty GREAT to me. But not very random.



Again, the question is not randomness, but unpredictability.
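
For what it's worth, the arithmetic in Ben's example checks out; a
two-line verification in Python:

  from statistics import pstdev

  ports = list(range(1000, 27001, 1000))
  print(len(set(ports)), round(pstdev(ports)))   # 27 ports, stddev ~7789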
