Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-09-03 Thread John Denker
Dave Korn asked:

   Is it *necessarily* the case that /any/
 polynomial of log N /necessarily/ grows slower than N?

Yes.

Hint:  L'Hôpital's rule.

 if P(x)==e^(2x)

That's not a polynomial.

x^Q is a polynomial.  Q^x is not.
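
A sketch of the limit, taking P(x) = x^k for a fixed positive integer k and
applying L'Hôpital's rule k times:

    \lim_{N\to\infty} \frac{(\log N)^k}{N}
      = \lim_{N\to\infty} \frac{k\,(\log N)^{k-1}}{N}
      = \cdots
      = \lim_{N\to\infty} \frac{k!}{N}
      = 0

So every fixed power of log N is eventually dwarfed by N, and hence so is any
finite sum of such powers, i.e. any polynomial P(log N).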



Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Dave Korn
On 28 August 2006 15:30, Ondrej Mikle wrote:

 Ad. compression algorithm: I conjecture there exists an algorithm (not
 necessarily *finite*) that can compress large numbers
 (strings/files/...) into small space, more precisely, it can
 compress number that is N bytes long into O(P(log N)) bytes, for some
 polynomial P. 

[  My maths isn't quite up to this.  Is it *necessarily* the case that /any/
polynomial of log N /necessarily/ grows slower than N?  If not, there's
nothing to discuss, because this is then a conventional compression scheme in
which some inputs lead to larger, not smaller, outputs.  Hmm.  It would seem
to me that if P(x)==e^(2x), then P(log N)==N^2, which grows faster than N.
I'm using log to mean natural log here; substitute 10 for e in that formula if
we're talking about log10, the base isn't important.  However, assuming that
we're only talking about P that *do* grow more slowly, I'll discuss the
problem with this theory anyway.  ]


  There are many, but there are no algorithms that can compress *all* large
numbers into small space; for all compression algorithms, some sets of input
data must result in *larger* output than input.

  There is *no* way round the sheer counting theory aspect of this.  There are
only 2^N unique files of N bits.  If you compress large files of M bits into
small files of N bits, and your decompression algorithm produces deterministic
output, then you can only possibly generate 2^N files from the compressed
ones.
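
A minimal sketch of that count (the sizes here are purely illustrative):

    # 2**M distinct inputs of M bits, but only 2**N distinct outputs of N bits.
    # A deterministic decompressor can therefore reproduce at most 2**N of the
    # original files.
    M, N = 20, 10
    inputs = 2 ** M           # 1,048,576 possible M-bit files
    outputs = 2 ** N          # 1,024 possible N-bit compressed files
    print(inputs, outputs)    # at most 1 input in 1024 can survive the round trip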

 Take as an example the group Z_p* with p prime (in other words: DLP).
 The triplet (Z, p, generator g) is a compression of a string of p-1
 numbers, each number about log2(p) bits.
 
 (legend: DTM - deterministic Turing machine, NTM - nondeterministic
 Turing machine)
 
 There exists a way (not necessarily fast/polynomial with a DTM) by which a
 lot of strings can be compressed into the mentioned triplets. There
 are \phi(p-1) different strings that can be compressed with these
 triplets. The exact number of course depends on the factorization of p-1.
 
 Decompression is simple: take generator g and compute g, g^2, g^3,
 g^4, ... in Z_p*

  This theory has been advanced many times before.  The "Oh, if I can just
find the right parameters for a PRNG, maybe I can get it to reconstruct my
file as if by magic" idea.  (Or substitute FFT, or wavelet transform, or
key-expansion algorithm, or ... etc.)

  However, there are only as many unique generators as (2 to the power of the
number of bits it takes to specify your generator) in this scheme.  And that
is the maximum number of unique output files you can generate.
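
For concreteness, a toy sketch of the quoted decompress-by-exponentiation idea
(parameters are illustrative), which also shows where the counting bites:

    # "Decompress" the pair (p, g) by listing g, g^2, g^3, ... mod p.
    # With p prime and g a generator this walks through all p-1 nonzero
    # residues -- but a B-bit encoding of (p, g) can name at most 2**B
    # distinct pairs, hence at most 2**B distinct output strings.
    def decompress(p, g):
        x = 1
        for _ in range(p - 1):
            x = (x * g) % p
            yield x

    print(list(decompress(11, 2)))    # [2, 4, 8, 5, 10, 9, 7, 3, 6, 1]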

  There is ***NO*** way round the counting theory.

 I conjecture that for every permutation on 1..N there exists a
 function that compresses the permutation into a short
 representation.

  I'm afraid to tell you that, as always, you will find the compression stage
easy and the decompression impossible.  There are many many many more
possible permutations of 1..N than the number of unique short
representations in the scheme.  There is no way that the smaller number of
unique representations can possibly produce any more than the same (smaller)
number of permutations once expanded.  There is no way to represent the
remaining (vast) majority of permutations.
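
To put a number on it (a small sketch; N is illustrative and the
representation is assumed to be just a string of bits):

    import math
    # log2(N!) bits are needed to tell all N! permutations of 1..N apart.
    N = 256
    bits = math.lgamma(N + 1) / math.log(2)    # log2(N!) via the log-gamma function
    print(round(bits))    # ~1684 bits, i.e. ~210 bytes, growing like N*log2(N)

Any scheme whose representations are shorter than that simply has too few of
them.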

  It is possible that only an NTM, possibly with an infinite
 algorithm (e.g. a human), can do it in some short finite time. 

  Then it's not really an algorithm; it's an ad-hoc collection of different
schemes.  If you're allowed to use a completely different compression scheme
for every file, I can do better than that: for every file, define a
compression scheme where the bit '1' stands for the content of that file, and
everything else is represented by a '0' bit followed by the file itself.
Sure, most files grow 1 bit bigger, but the only file we care about is
compressed to just a single bit!  Great!
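
Spelled out as code, just to make the bookkeeping explicit (a sketch only:
whole bytes instead of single bits, and the chosen file is an illustrative
constant baked into the codec):

    # One chosen file maps to '1'; everything else maps to '0' + itself.
    # TARGET *is* the information the decompressor has to carry around.
    TARGET = b"the one file we care about"

    def compress(data):
        return b"\x01" if data == TARGET else b"\x00" + data

    def decompress(blob):
        return TARGET if blob == b"\x01" else blob[1:]

    assert decompress(compress(TARGET)) == TARGET
    assert decompress(compress(b"anything else")) == b"anything else"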

  Unfortunately, all you've done is moved information around.  The amount of
information you'd have to have in the decompressor to know which file to
expand any particular '1' bit into is equal to  the amount you've saved by
compressing the original to a single bit.

  There is ***NO*** way round the counting theory.

 Again,
 I could've/should've proven or disproven the conjecture, I just don't
 want to do it - yet ;-)

  I seriously advise you not to waste much time on it.  Because ...






 There is ***NO*** way round the counting theory.







cheers,
  DaveK
-- 
Can't think of a witty .sigline today




Re: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Ondrej Mikle

We are both talking about the same thing :-)

I am not saying there is a finite deterministic algorithm to compress
every string into small space; there isn't. BTW, thanks for "There
is ***NO*** way round the counting theory." :-)

All I wanted to say is:
For a specific structure (e.g. movie, picture, sound) there is some
good compression algorithm.

E.g.: if you take a 65536x65536 GIF, all white with just one pixel
black, it can be compressed into 35 bytes; see here:
http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif
If you wanted to compress the same picture using JPEG (i.e. the discrete
cosine transform), then two things would happen:

The compressed JPEG file would be a) much bigger, and b) the decompressed
image would have artifacts, because the Fourier transform of a pulse is a
sinc (infinitely many frequencies). Sure, JPEG is lossy compression, but
good enough for photos and images that don't have a lot of high
frequencies.
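
For scale, a back-of-the-envelope sketch of what the same picture costs stored
as a raw two-colour bitmap:

    pixels = 65536 * 65536       # 4,294,967,296 pixels
    raw_bytes = pixels // 8      # one bit per pixel: 536,870,912 bytes = 512 MiB
    print(raw_bytes)             # versus the 35-byte GIF above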

Cheers,
 OM

RE: Impossible compression still not possible. [was RE: Debunking the PGP backdoor myth for good. [was RE: Hypothesis: PGP backdoor (was: A security bug in PGP products?)]]

2006-08-30 Thread Dave Korn
On 28 August 2006 17:12, Ondrej Mikle wrote:

 We are both talking about the same thing :-)

  Oh!
 
 I am not saying there is a finite deterministic algorithm to compress
 every string into small space; there isn't. BTW, thanks for "There
 is ***NO*** way round the counting theory." :-)
 
 All I wanted to say is:
 For a specific structure (e.g. movie, picture, sound) there is some
 good compression algorithm.
 
 E.g.: if you take a 65536x65536 GIF, all white with just one pixel
 black, it can be compressed into 35 bytes; see here:
 http://i.iinfo.cz/urs-att/gif3_6-115626056883166.gif
 If you wanted to compress the same picture using JPEG (i.e. the discrete
 cosine transform), then two things would happen:
 
 The compressed JPEG file would be a) much bigger, and b) the decompressed
 image would have artifacts, because the Fourier transform of a pulse is a
 sinc (infinitely many frequencies). Sure, JPEG is lossy compression, but
 good enough for photos and images that don't have a lot of high
 frequencies.

  Absolutely, absolutely!  

  A compression algorithm achieves the best results if it is designed with
statistical knowledge of the target domain taken into account.  In any
compression scheme you're balancing the set of inputs that grow smaller on
compression against the necessary counterpart of inputs that grow larger.
Whatever you gain in the first set, you lose in the second.  The secret is to
arrange that the inputs that tend to grow larger are the ones that are
less-common in real-world usage, and thus that the ones that are more common
tend to grow smaller.  In practice, this means eliminating 'redundancy', where
'redundancy' is defined as 'whatever similar properties you can tease out from
the more-common-real-world cases'.
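
A toy illustration of that balance (a sketch, not a real codec): a naive
run-length coder wins big on redundant input and necessarily pays for it on
incompressible input.

    import os

    def rle(data):
        # Each run of equal bytes becomes (run_length - 1, value), runs capped at 256.
        out, i = bytearray(), 0
        while i < len(data):
            run = 1
            while run < 256 and i + run < len(data) and data[i + run] == data[i]:
                run += 1
            out += bytes([run - 1, data[i]])
            i += run
        return bytes(out)

    flat = bytes(512 * 1024)          # redundant: half a MiB of zero bytes
    noise = os.urandom(512 * 1024)    # incompressible: random bytes
    print(len(rle(flat)))             # 4096 bytes -- a huge win
    print(len(rle(noise)))            # ~1 MiB -- roughly doubled; the necessary loss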

  Of course, I could point out that there are only a few bytes of actual
information in that huge GIF (the position of the single black pixel), so even
compressing it to 35 bytes isn't a great achievement... it's one of the set of
less-common inputs that end up bigger than they strictly need to be, as a
compromise so that real pictures, which tend to have at least *some*
variation, grow smaller.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today

