Re: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]

2006-08-30 Thread Ondrej Mikle

On 8/28/06, Dave Korn [EMAIL PROTECTED] wrote:

  The author has made the *exact* same error as when someone comes up with a
magical compression algorithm that they say can compress absolutely any data
down to a tiny size.  They always get the data to compress, sure, but they
always have problems with the decompression stage.  They tend to think that
this is just a minor bug in their decompressor they need more time to work on,
but no amount of time will let them work around the fundamental mathematical
fact that they've not got sufficient information in their compressed file to
reconstruct the original.


Thanks, and sorry for the hassle; I (and others) should have checked it
before asking.

Ad. compression algorithm: I conjecture there exists an algorithm (not
necessarily *finite*) that can compress large numbers
(strings/files/...) into small space; more precisely, it can
compress a number that is N bytes long into O(P(log N)) bytes, for some
polynomial P. Here's a lead:

Take as an example the group Z_p* with p prime (in other words, the DLP
setting). The triple (Z_p*, p, generator g) is a compression of a string of
p-1 numbers, each about log2(p) bits.

(legend: DTM - deterministic Turing machine, NTM - nondeterministic
Turing machine)

There exists a way (not necessarily fast/polynomial on a DTM) by which a
lot of strings can be compressed into the mentioned triples. There
are \phi(p-1) different strings that can be compressed with these
triples; the exact number of course depends on the factorization of p-1.

Decompression is simple: take generator g and compute g, g^2, g^3,
g^4, ... in Z_p*
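
For concreteness, here is a minimal sketch of that decompression step, with
tiny toy parameters of my own choosing (p=23, g=5; any generator of Z_p*
would do):

# Expand the "compressed" triple (Z_p*, p, g) back into the length-(p-1)
# sequence g, g^2, g^3, ... mod p.  Toy parameters, purely illustrative.
p, g = 23, 5                      # 5 generates Z_23*
sequence = []
x = 1
for _ in range(p - 1):
    x = (x * g) % p
    sequence.append(x)
print(sequence)                   # a permutation of 1..22, described by just (23, 5)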

I conjecture that for every permutation on 1..N there exists a
function that compresses the permutation into a short
representation. It is possible that only an NTM, possibly with an infinite
algorithm (e.g. a human), can do it in some short finite time. Again,
I could've/should've proven or disproven the conjecture, I just don't
want to do it - yet ;-)

Thanks
 OM

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Dave Korn
On 28 August 2006 15:30, Ondrej Mikle wrote:

 Ad. compression algorithm: I conjecture there exists an algorithm (not
 necessarily *finite*) that can compress large numbers
 (strings/files/...) into small space; more precisely, it can
 compress a number that is N bytes long into O(P(log N)) bytes, for some
 polynomial P. 

[  Is it *necessarily* the case that /any/ polynomial of log N grows slower
than N?  If not, there would be nothing to discuss, because this would then be
a conventional compression scheme in which some inputs lead to larger, not
smaller, outputs.  In fact the answer is yes: for a fixed polynomial P of
degree k, P(log N) is O((log N)^k), which grows more slowly than N whatever
base of logarithm is used.  (Something like P(x)=e^(2x) would outgrow N, but
that is an exponential, not a polynomial.)  So this really is a claim of
impossible compression, and I'll discuss the problem with it anyway.  ]


  There are many, but there are no algorithms that can compress *all* large
numbers into small space; for all compression algorithms, some sets of input
data must result in *larger* output than input.

  There is *no* way round the sheer counting theory aspect of this.  There are
only 2^N unique files of N bits.  If you compress large files of M bits into
small files of N bits, and you decompression algorithm produces deterministic
output, then you can only possibly generate 2^N files from the compressed
ones.
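
To make the counting explicit, here is a trivial check (my own illustration,
and Python just for concreteness):

# There are 2^N strings of exactly N bits, but only 2^N - 1 strings of
# length strictly less than N bits, so no injective "compressor" can map
# every N-bit input to a shorter output.
N = 16
exactly_n = 2**N
shorter = sum(2**k for k in range(N))    # lengths 0 .. N-1
assert shorter == exactly_n - 1
print(exactly_n, "inputs, but only", shorter, "shorter outputs available")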

 Take as an example the group Z_p* with p prime (in other words, the DLP
 setting). The triple (Z_p*, p, generator g) is a compression of a string of
 p-1 numbers, each about log2(p) bits.
 
 (legend: DTM - deterministic Turing machine, NTM - nondeterministic
 Turing machine)
 
 There exists a way (not necessarily fast/polynomial on a DTM) by which a
 lot of strings can be compressed into the mentioned triples. There
 are \phi(p-1) different strings that can be compressed with these
 triples; the exact number of course depends on the factorization of p-1.
 
 Decompression is simple: take generator g and compute g, g^2, g^3,
 g^4, ... in Z_p*

  This theory has been advanced many times before: the "Oh, if I can just
find the right parameters for a PRNG, maybe I can get it to reconstruct my
file as if by magic" idea.  (Or substitute FFT, or wavelet transform, or
key-expansion algorithm, or ... etc.)

  However, there are only as many unique generators as (2 to the power of the
number of bits it takes to specify your generator) in this scheme.  And that
is the maximum number of unique output files you can generate.

  There is ***NO*** way round the counting theory.

 I conjecture that for every permutation on 1..N there exists a
 function that compresses the permutation into a short
 representation.

  I'm afraid to tell you that, as always, you will find the compression stage
easy and the decompression impossible.  There are many many many more
possible permutations of 1..N than the number of unique short
representations in the scheme.  There is no way that the smaller number of
unique representations can possibly produce any more than the same (smaller)
number of permutations once expanded.  There is no way to represent the other
(vast majority) of permutations.

  It is possible that only an NTM, possibly with an infinite
 algorithm (e.g. a human), can do it in some short finite time.

  Then it's not really an algorithm, it's an ad-hoc collection of different
schemes.  If you're allowed to use a completely different compression scheme
for every file, I can do better than that: for every file, define a
compression scheme where the bit '1' stands for the content of that file, and
everything else is represented by a '0' bit followed by the file itself.
Sure, most files grow 1 bit bigger, but the only file we care about is
compressed to just a single bit!  Great!

  Unfortunately, all you've done is moved information around.  The amount of
information you'd have to have in the decompressor to know which file to
expand any particular '1' bit into is equal to  the amount you've saved by
compressing the original to a single bit.
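
To belabour the point with a toy rendering of that "scheme" (byte-granular
rather than bit-granular, purely for illustration):

# The decompressor has to carry the favoured file inside itself, so no
# information has actually been saved, only moved.
FAVOURED = b"the one file we care about"

def compress(data: bytes) -> bytes:
    return b"\x01" if data == FAVOURED else b"\x00" + data

def decompress(blob: bytes) -> bytes:
    return FAVOURED if blob == b"\x01" else blob[1:]

assert decompress(compress(FAVOURED)) == FAVOURED
assert decompress(compress(b"anything else")) == b"anything else"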

  There is ***NO*** way round the counting theory.

 Again,
 I could've/should've proven or disproven the conjecture, I just don't
 want to do it - yet ;-)

  I seriously advise you not to waste much time on it.  Because ...






 There is ***NO*** way round the counting theory.







cheers,
  DaveK
-- 
Can't think of a witty .sigline today


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Ondrej Mikle

We are both talking about the same thing :-)

I am not saying there is a finite deterministic algorithm to compress
every string into small space; there isn't. BTW, thanks for "There
is ***NO*** way round the counting theory." :-)

All I wanted to say is:
For a specific structure (e.g. movie, picture, sound) there is some
good compression algorithm.

E.g.: if you take a 65536x65536 GIF, all white with just one black
pixel, it can be compressed into 35 bytes; see here:
http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif
If you wanted to compress the same picture using JPEG (i.e. the discrete
cosine transform), then two things would happen:

The compressed JPEG file would be (a) much bigger and (b) the decompressed
image would have artifacts, because the Fourier transform of a pulse is a
sinc (infinitely many frequencies). Sure, JPEG is lossy compression, but
it is good enough for photos and images that don't have a lot of high
frequencies.

Cheers,
 OM


RE: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Dave Korn
On 28 August 2006 17:12, Ondrej Mikle wrote:

 We are both talking about the same thing :-)

  Oh!
 
 I am not saying there is a finite deterministic algorithm to compress
 every string into small space; there isn't. BTW, thanks for "There
 is ***NO*** way round the counting theory." :-)
 
 All I wanted to say is:
 For a specific structure (e.g. movie, picture, sound) there is some
 good compression algorithm.
 
 E.g.: if you take a 65536x65536 GIF, all white with just one black
 pixel, it can be compressed into 35 bytes; see here:
 http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif
 If you wanted to compress the same picture using JPEG (i.e. the discrete
 cosine transform), then two things would happen:

 The compressed JPEG file would be (a) much bigger and (b) the decompressed
 image would have artifacts, because the Fourier transform of a pulse is a
 sinc (infinitely many frequencies). Sure, JPEG is lossy compression, but
 it is good enough for photos and images that don't have a lot of high
 frequencies.

  Absolutely, absolutely!  

  A compression algorithm achieves the best results if it is designed with
statistical knowledge of the target domain taken into account.  In any
compression scheme you're balancing the set of inputs that grow smaller on
compression against the necessary counterpart of inputs that grow larger.
Whatever you gain in the first set, you lose in the second.  The secret is to
arrange that the inputs that tend to grow larger are the ones that are
less-common in real-world usage, and thus that the ones that are more common
tend to grow smaller.  In practice, this means eliminating 'redundancy', where
'redundancy' is defined as 'whatever similar properties you can tease out from
the more-common-real-world cases'.

  Of course, I could point out that there is precisely *1* bit of information
in that huge GIF, so even compressing it to 35 bytes isn't a great
achievement... it's one of the set of less-common inputs that grow bigger as a
compromise so that real pictures, which tend to have at least *some*
variation, grow smaller.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today


-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Bleichenbacher's RSA signature forgery based on implementation error

2006-08-30 Thread Hal Finney
At the evening rump session at Crypto last week, Daniel Bleichenbacher
gave a talk showing how it is possible under some circumstances to
easily forge an RSA signature, so easily that it could almost be done
with just pencil and paper.  This depends on an implementation error,
a failure to check a certain condition while verifying the RSA signature.
Daniel found at least one implementation (I think it was some Java crypto
code) which had this flaw.  I wanted to report on his result here so that
other implementers can make sure they are not vulnerable.  Be aware that
my notes were hurried as Daniel had only a few minutes to talk.

The attack is only good against keys with exponent of 3.  There are
not too many of these around any more but you still run into them
occasionally.  It depends on an error in verifying the PKCS-1 padding
of the signed hash.

An RSA signature is created in several steps.  First the data to be
signed is hashed.  Then the hash gets a special string of bytes in ASN.1
format prepended, which indicates what hash algorithm is used.  This data
is then PKCS-1 padded to be the width of the RSA modulus.  The PKCS-1
padding consists of a byte of 0, then 1, then a string of 0xFF bytes,
then a byte of zero, then the payload which is the ASN.1+hash data.
Graphically:

00 01 FF FF FF ... FF 00  ASN.1  HASH

The signature verifier first applies the RSA public exponent to reveal
this PKCS-1 padded data, checks and removes the PKCS-1 padding, then
compares the hash with its own hash value computed over the signed data.
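
For reference, here is a minimal sketch of that encoding step for SHA-1 and a
3072-bit modulus (the ASN.1 prefix is the standard PKCS-1 constant; the
message is just my own example, not anything from Daniel's talk):

import hashlib

# Standard ASN.1 DigestInfo prefix for SHA-1 (15 bytes)
SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")

def emsa_pkcs1_v15_encode(msg, em_len=384):
    # 00 01 FF..FF 00 | ASN.1 | HASH, padded out to the modulus width
    t = SHA1_PREFIX + hashlib.sha1(msg).digest()
    ps = b"\xff" * (em_len - len(t) - 3)
    return b"\x00\x01" + ps + b"\x00" + t

block = emsa_pkcs1_v15_encode(b"example message")
assert len(block) == 384 and block[:2] == b"\x00\x01"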

The error that Bleichenbacher exploits is if the implementation does
not check that the hash+ASN.1 data is right-justified within the PKCS-1
padding.  Some implementations apparently remove the PKCS-1 padding by
looking for the high bytes of 0 and 1, then the 0xFF bytes, then
the zero byte; and then they start parsing the ASN.1 data and hash.
The ASN.1 data encodes the length of the hash within it, so this tells
them how big the hash value is.  These broken implementations go ahead
and use the hash, without verifying that there is no more data after it.
Failing to add this extra check makes implementations vulnerable to a
signature forgery, as follows.

Daniel forges the RSA signature for an exponent of 3 by constructing a
value which is a perfect cube.  Then he can use its cube root as the
RSA signature.  He starts by putting the ASN.1+hash in the middle of
the data field instead of at the right side as it should be.  Graphically:

00 01 FF FF ... FF 00  ASN.1  HASH  GARBAGE

This gives him complete freedom to put anything he wants to the right
of the hash.  This gives him enough flexibility that he can arrange for
the value to be a perfect cube.

In more detail, let D represent the numeric value of the 00 byte, the
ASN.1 data, and the hash, considered as a byte string.  In the case
of SHA-1 this will be 36 bytes or 288 bits long.  Define N as 2^288-D.
We will assume that N is a multiple of 3, which can easily be arranged
by slightly tweaking the message if necessary.

Bleichenbacher uses an example of a 3072 bit key, and he will position
the hash 2072 bits over from the right.  This improperly padded version
can be expressed numerically as 2^3057 - 2^2360 + D * 2^2072 + garbage.
This is equivalent to 2^3057 - N*2^2072 + garbage.  Then, it turns out
that a cube root of this is simply 2^1019 - (N * 2^34 / 3), and that is
a value which broken implementations accept as an RSA signature.

You can cube this mentally, remembering that the cube of (A-B) is A^3 -
3(A^2)B + 3A(B^2) - B^3.  Applying that rule gives 2^3057 - N*2^2072
+ (N^2 * 2^1087 / 3) - (N^3 * 2^102 / 27), and this fits the pattern
above of 2^3057 - N*2^2072 + garbage.  This is what Daniel means when
he says that this attack is simple enough that it could be carried out
by pencil and paper (except for the hash calculation itself).
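
As a sanity check of the arithmetic above (my own toy script, not code from
the talk; it assumes a 3072-bit modulus, so the cube never wraps around the
modulus), the following constructs the forged value and shows that a verifier
of the sloppy kind described earlier accepts it:

import hashlib

SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")

msg = b"hello world"
while True:
    h = hashlib.sha1(msg).digest()
    D = int.from_bytes(b"\x00" + SHA1_PREFIX + h, "big")   # the 36-byte field
    N = 2**288 - D
    if N % 3 == 0:        # tweak the message until 3 divides N, as above
        break
    msg += b" "

s = 2**1019 - (N * 2**34) // 3       # the forged "signature"
forged = s**3                        # s^3 is below 2^3058, far smaller than any
                                     # 3072-bit modulus, so s^3 mod n == s^3

block = forged.to_bytes(384, "big")  # what the verifier sees after applying e=3
assert block[0:2] == b"\x00\x01"
i = 2
while block[i] == 0xFF:              # a sloppy verifier skips the FF padding,
    i += 1
assert block[i] == 0x00              # finds the 00 separator,
assert block[i+1:i+16] == SHA1_PREFIX                     # parses the ASN.1,
assert block[i+16:i+36] == hashlib.sha1(msg).digest()     # checks the hash,
print("sloppy verifier accepts")     # ...and never looks at the trailing garbage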

Implementors should review their RSA signature verification carefully to
make sure that they are not being sloppy here.  Remember the maxim that in
cryptography, verification checks should err on the side of thoroughness.
This is no place for laxity or permissiveness.

Daniel also recommends that people stop using RSA keys with exponents
of 3.  Even if your own implementation is not vulnerable to this attack,
there's no telling what the other guy's code may do.  And he is the one
relying on your signature.

Hal Finney

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor

2006-08-30 Thread Ondrej Mikle

Dave Korn wrote:

  Of course, I could point out that there is precisely *1* bit of information
in that huge GIF, so even compressing it to 35 bytes isn't a great
achievement... it's one of the set of less-common inputs that grow bigger as a
compromise so that real pictures, which tend to have at least *some*
variation, grow smaller.



OK, according to Shannon's definition of entropy, how much information
is there in \pi or e or another transcendental number? AFAIK all finite
strings appear in the fractional expansion of \pi (take a positive integer
base other than 10 if you want); this is widely believed, though unproven.
Sure, the digits are correlated, because we can express \pi or e as an
infinite series, so only a couple of bytes of description are necessary.


But: if you take any finite string, there (conjecturally) exists a finite
natural number as an index at which the string can be found in \pi. So are
there any random strings at all? What if I can express the digits starting
at the 2^(2^k)-th index analytically, with a short, fast algorithm? If no
one else can, then I have a perfect one-time pad, at least for some time,
because conventional computers will not reach that place in any reasonable
time (yes, I know, there may be other correlations). The point I am trying
to make: if I take e.g. a transcendental number and can analytically express
a part of its fractional expansion, I *might* have a strong key.


The point of this: I am researching an authentication scheme where keys
are *programs*, not data. It means that a key can change itself over time,
for example. The strongest keys would therefore be somewhat recursive,
polymorphic code that enumerates all possible permutations of some
finite set in a deterministic, though seemingly random, order.


I know that there are short descriptions for lots of infinite
structures: the mentioned transcendental numbers, the Mandelbrot
set, various IFSs (iterated function systems), quaternions, non-measurable
sets - there are lots of possibilities. What I know for sure is that for
many mathematical structures short descriptions (programs or (partially)
recursive functions) exist. What I conjecture is that for each finite
ordered set a short compression function exists. The whole trick is
that it is almost impossible (meaning computationally infeasible) to
look for the compression functions using a finite deterministic
algorithm. Though, perhaps, a human can do it.


It will still take some time until I have more results about the
categories of key programs.


Cheers,
  OM

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: A security bug in PGP products?

2006-08-30 Thread Travis H.

On 8/23/06, Dave Korn [EMAIL PROTECTED] wrote:

  Given that, whatever passphrase you use, you will decrypt the EDK block and
get /something/ that looks like a key, this comparison of hashes is a sanity
test.  If you bypass it but enter the wrong passphrase, you'll get an
incorrectly-decrypted EDK, which will lead your disk to look like every sector
is full of random garbage.  Rather than decrypt the entire disk and run chkdsk
to see if it looks sane, comparing the hashes of the passphrase is a quick and
dirty way of testing if the resulting EDK is going to be the correct one.


PGP email encryption has two known-plaintext bytes for that purpose.
That quick check accepts a bad key only about 1 time in 2^16, so a
brute-force search has to fall back on a more expensive unknown-plaintext
test for roughly that fraction of candidate keys.

This reminds me a little of the suggestions that MACs should be truncated,
although it seems to me that it's better to encrypt a hash of the plaintext.
--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066  151D 0A6B 4098 0C55 1484

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: Hypothesis: PGP backdoor (was: A security bug in PGP products?)

2006-08-30 Thread Travis H.

On 8/23/06, Ondrej Mikle [EMAIL PROTECTED] wrote:

We discussed with V. Klima about the recent bug in PGPdisk that
allowed extraction of key and data without the knowledge of passphrase.


I skimmed the URL and it appears this claim was answered several times
in the original thread.  Did you not read it, or not understand it?

You have to have a valid passphrase from before the change, because the
passphrase unlocks the disk key which doesn't change, unless you explicitly
tell it to.
--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066  151D 0A6B 4098 0C55 1484

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: compressing randomly-generated numbers

2006-08-30 Thread Alexander Klimov
On Mon, 28 Aug 2006, Travis H. wrote:
 On 8/23/06, Alexander Klimov [EMAIL PROTECTED] wrote:
  A random bit stream should have two properties: no bias and no
  dependency between bits. If one has biased but independent bits he
  can use the von Neumann algorithm to remove the bias, but if there is
  a dependency no algorithm will be able to `strip' it.

 Interesting claim;

Well, it's not really a claim, since there was no definition; here it is:
A ``dependency stripping'' algorithm is a deterministic algorithm that
gets a stream of unbiased (but not necessarily independent) bits and
produces a stream of several independent bits.
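
(As an aside, a minimal sketch of the von Neumann debiasing step mentioned in
the quote above -- my own illustration; it removes bias only when the input
bits are independent:)

def von_neumann(bits):
    # Consume bits in pairs: emit 0 for (0,1), 1 for (1,0), drop (0,0)/(1,1).
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out

print(von_neumann([1, 1, 0, 1, 1, 0, 0, 0, 1, 0]))   # -> [0, 1, 1]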

 what if I XOR it with a truly random stream,

That is not a deterministic algorithm.

 or a computationally-pseudorandom stream,

This requires a random key.

Recall that the whole point was to extract random bits -- if we
already have useful random bits we can simply output them and discard
the input stream.

 or if I filter out every other bit?

Consider ``x a x b x c x ...''; sampling every other bit throws away
most of the entropy.

In general, there is no ``dependency stripping'' algorithm because an
input x,x,x,..., where all x are the same (either all 0 or all 1), is
possible. There is simply not enough entropy in the input to extract at
least two bits.

Interestingly, an additional requirement that the input stream has at
least two bits of entropy does not change the situation: Consider the
first two bits of the output. There are only four possibilities (00,
01, 10, and 11), so if the input is long enough then it is possible to
find four different inputs that give the same bits, e.g.,

 A(0001) = 00...
 A(0010) = 00...
 A(0100) = 00...
 A(1000) = 00...

An input that contains 0001, 0010, 0100, and 1000 with the same
probability has two bits of entropy, but it is always transformed by
the algorithm into 00... and thus the first two bits are not
independent. (Strictly speaking, if we recall the requirement that the
input bits are unbiased, we should also include in the set of inputs
1110, 1101, ..., but this does not change the result.)

 Seems like these would, if not eliminate the dependency, at least
 weaken its effect. The last is equivalent to sampling at a slower
 rate.

See above about ``sampling at a slower rate.''

-- 
Regards,
ASK

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]


Re: compressing randomly-generated numbers

2006-08-30 Thread Travis H.

On 8/29/06, Alexander Klimov [EMAIL PROTECTED] wrote:

Well, it's not really a claim, since there was no definition; here it is:
A ``dependency stripping'' algorithm is a deterministic algorithm that
gets a stream of unbiased (but not necessarily independent) bits and
produces a stream of several independent bits.


Well, this is a fairly strict definition; I think uniform and e-close to
independent is still potentially suitable for use in cryptography.


 or a computationally-pseudorandom stream,
This requires a random key.


I can't use e-close to (uniform and independent)?


Recall that the whole point was to extract random bits -- if we
already have useful random bits we can simply output them and discard
the input stream.


I hear this argument often, and it appears that the people who say it
don't care about the rate of random bits, nor the desirability of not having
to trust any one source to be truly unpredictable, nor have they understood
the point of an extractor or hashing the output of multiple sources.

For example, I'd like to use the Quantis HWRNG, but since it is an
opaque container, I cannot trust it fully.  If I had another source
that I could combine with it, I would not have to trust it blindly;
the opponent would have to compromise both in order to be able to
predict the combined output.


Consider ``x a x b x c x ...''; sampling every other bit throws away
most of the entropy.
In general, there is no ``dependency stripping'' algorithm because an
input x,x,x,..., where all x are the same (either all 0 or all 1), is
possible. There is simply not enough entropy in the input to extract at
least two bits.


No, you have no idea of the unpredictability of that source, because
it is unspecified and unpredictability is untestable.  That could very well
be the output of a perfect HWRNG.  So could 01010101..., or all 0s, or
all 1s.  Each output is equally likely, and no amount of testing the
output can say whether the source is imperfectly random or truly random.

This was stated several times on this list; entropy depends on the
model of the source, not on the output.  The sequence:
0, 1, 2, 3, 4... 255, 0, 1, 2, 3, 4... 255 ...
has zero entropy if the source is defined as an 8-bit counter starting
at zero, and it has full entropy (8 bits per output) if the source is
defined as a HWRNG that generates 8-bit outputs in a uniform and
independent manner.  Re-read the definition of entropy, and pay particular
attention to the calculation of probability for a given event.
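
To spell that out with a toy calculation (my own numbers, nothing more):

from math import log2

def entropy(probs):
    # Shannon entropy (bits) of a distribution over the next output symbol
    return sum(-p * log2(p) for p in probs if p > 0)

# Model A: an 8-bit counter.  Given the model, the next byte is certain.
print(entropy([1.0]))           # 0.0 bits per byte

# Model B: uniform, independent 8-bit outputs.
print(entropy([1/256] * 256))   # 8.0 bits per byte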


Interestingly, an additional requirement that the input stream has at
least two bits of entropy does not change the situation


I didn't understand this example.


See above about ``sampling at a slower rate.''


I can take an analog random source, and sample it at one rate, and have
a nearly independent stream.  If I increase the sample rate, the bits will start
to become more and more dependent upon the prior bit.  So, if that is true,
then logically a serially-correlated stream will become less and less correlated
as I decrease the sample rate.  Taking every other bit corresponds to sampling
at half the speed, but doesn't require modifying the hardware.

It seems that you are saying there is no general solution, given a total lack
of knowledge about the source other than the fact that there is some
dependency between the bits.  I can agree with that.  However, if you
understand the physics of the source, then you do know something about
the random phenomenon used as a source, and in that case you can eliminate
specific kinds of dependencies that it may exhibit.

The easiest way to eliminate (computationally) bias and dependency in one
step is to combine with a CSPRNG.  You can reseed it periodically with
the combined output.
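
Something like the following rough sketch is what I have in mind (my own toy
construction, with a hash-counter keystream standing in for a real CSPRNG;
not a vetted design):

import hashlib, os

seed = os.urandom(32)        # initial seed for the stand-in CSPRNG
counter = 0

def keystream(n):
    # Hash-counter keystream: SHA-256(seed || counter), block by block.
    global counter
    out = b""
    while len(out) < n:
        counter += 1
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
    return out[:n]

def whiten(raw):
    # XOR the raw source bytes with the keystream, then fold the combined
    # output back in as the next seed (the periodic reseed mentioned above).
    global seed
    out = bytes(a ^ b for a, b in zip(raw, keystream(len(raw))))
    seed = hashlib.sha256(seed + out).digest()
    return out

print(whiten(b"\x00" * 16).hex())    # raw source bytes would go here
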
--
If you're not part of the solution, you're part of the precipitate.
Unix guru for rent or hire -- http://www.lightconsulting.com/~travis/
GPG fingerprint: 9D3F 395A DAC5 5CCC 9066  151D 0A6B 4098 0C55 1484

-
The Cryptography Mailing List
Unsubscribe by sending unsubscribe cryptography to [EMAIL PROTECTED]