At 17:05  -0400 2006/10/12, Steven M. Bellovin wrote:
This is a very interesting suggestion, but I suspect people need to be
cautious about false positives.  MP3 and JPG files will, I think, have
similar entropy statistics to encrypted files; so will many compressed

Actually, no. I have a general-purpose stats program that I often use for cryptanalysis as part of my toolkit. I pointed it at my photos folder, and every single jpeg file had results like this:
samples:      88246
unique:       256
sum:          11413854
sum squares:  1943201034
maximum:      255
minimum:      0
range:        255
mean:         129.34132
variance:     5291.1565
std dev:      72.740336
median:       130
exp freq:     344.71094
max freq:     623
mode count:   1
mode:         0
min freq:     109
unmode count: 1
unmode:       192
chi^2:        4375.0414
chi^2 df:     255
pr(chi^2):    1.00000 (*** certainly non-uniform distribution ***)
bad buckets:  96
KS+:          1.0002392
pr(KS+):      0.86510
KS-:          6.6097712
pr(KS-):      1.00000 (*** certainly non-uniform distribution ***)
KS(both):     3.8050052
pr(KS_BOTH):  1.00000

The simplest test is just the chi-squared test on the frequency of bytes, and it's way out of range on even fairly small jpegs. The Kolmogorov-Smirnov test almost always bingos too. And even if the chi-squared passes, the binomial test on individual byte-value frequencies will flag the data as non-random; note the "bad buckets" count above. (The detailed per-bucket output is suppressed when the chi-squared test fails, since there will generally be too much of it.)
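For anyone who wants to try this at home, here is a minimal Python sketch of the two byte-frequency checks described above. It is not my stats program; the function names, the normal approximation to the binomial, and the 3-sigma cutoff for a "bad" bucket are all assumptions made for illustration.

```python
import math
from collections import Counter


def byte_chi_squared(data):
    """Chi-squared statistic of byte frequencies against a uniform
    distribution over the 256 possible byte values (255 degrees of freedom)."""
    expected = len(data) / 256          # the "exp freq" line in the output above
    counts = Counter(data)              # iterating bytes yields ints 0..255
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(256))


def bad_buckets(data, z_limit=3.0):
    """Count byte values whose frequency lies more than z_limit standard
    deviations from the binomial expectation (normal approximation).
    The z_limit cutoff is an illustrative assumption."""
    n = len(data)
    p = 1.0 / 256
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    counts = Counter(data)
    return sum(1 for b in range(256)
               if abs(counts.get(b, 0) - mean) > z_limit * sd)


if __name__ == "__main__":
    import os
    import sys

    # With a file argument, test that file; otherwise test OS randomness.
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            sample = f.read()
    else:
        sample = os.urandom(88246)
    print("chi^2:      ", byte_chi_squared(sample))
    print("bad buckets:", bad_buckets(sample))
```

For uniformly distributed bytes the statistic should land near 255 (a chi-squared variable with 255 degrees of freedom has mean 255 and standard deviation of about 22.6), so a value like the 4375 above is far outside anything plausible.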

The only things it usually passes as good are the outputs of purpose-built random number generators or ciphers. Everything else that I've pointed it at (including a terabyte of RC4 output, executables, zip archives, jpegs, mpegs, mp3s, ...) fails one or more of the tests.

True random-looking-ness is hard to find... :-)


The Cryptography Mailing List
