Re: Compression theory reference?
It's a sad situation when you have to get a non-technical judge to resolve academic conflicts like this, but it's your head that you're banging against the wall, not mine. If you want to appeal to authority, there's the FAQ, which of course requires explaining the Usenet FAQ traditions; perhaps you can find Lempel, Ziv, or Welch? Alternatively, you could show an algorithm for which any input gets at most _one_ bit longer, rather than arbitrarily longer. And of course most compression algorithms work because real data almost always has structure which reduces the entropy.

My information theory books from grad school have long since vanished into some closet, and were written just about the time LZ came out, so they mainly discuss Huffman coding in the discrete-message sections, but you should be able to find a modern book on the topic.

Matt Crawford's inductive argument is very strong: it gives you a constructive way to say that, for any integer N, you can produce a proof for that N, starting at 1 and working your way up, showing that if there's a lossless coding that doesn't make any message of length N any longer, then it doesn't make any shorter either, so it's not a compression method, just a permutation.

The "you could compress any message down to 1 bit" argument is a great throwaway line, but it isn't rigorously valid. (And if it were, you might as well compress down to zero bits while you're at it.) The problem is that for most lossless compression algorithms, some strings will get shorter (maybe even much shorter), but some will stay the same length. So even if you had a hypothetical "never gets longer" compression algorithm, you can't guarantee that your starting message would be one that got shorter as opposed to staying the same, so you can't say that all messages would compress to zero.

If your judge doesn't like inductions that count up, or your academic opponents insist on examining methods that count down, Bear's argument gets you most of the way there, by emphasizing the 1-1 mapping aspect, but you could easily get tangled. (To do this properly, you need to do n and 2**n, but I'll use 10 for concreteness.) There are 1024 10-bit messages and only 512 9-bit messages, so something obviously happened to the (at least) 512 that didn't compress to 9 bits. Maybe 512 of them didn't compress further and stayed as 10-bit; almost certainly some of them became 8 bits or shorter. At least one message didn't get shorter, because the number of strings shorter than 10 bits is

    2**9 + 2**8 + 2**7 + ... + 2**1  =  2**10 - 2  <  2**10

So if you want to recurse downwards through repeated compression, you need to be sure your mapping keeps track of the ones that didn't compress the first time (maybe they'll compress the second time?), the ones that compressed by one bit, and the ones that compressed by more than one bit, and avoid wandering around in a maze of twisty little passages. So working your way up is probably cleaner.
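In case anyone wants arithmetic they can rerun, here's a minimal Python sketch of that counting (nothing is assumed beyond the standard library; the numbers are the whole content):

    # Count the bit-strings strictly shorter than N bits
    # (lengths 1 through N-1; the empty string is excluded here).
    N = 10
    shorter = sum(2**k for k in range(1, N))    # 2 + 4 + ... + 2**9
    print(shorter)                              # 1022
    print(2**N)                                 # 1024 ten-bit messages
    print(2**N - shorter)                       # 2 messages with no shorter string left

At least two messages necessarily have no shorter codeword available, which is exactly why the downward bookkeeping gets messy.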
Bill Stewart
Re: Compression theory reference?
Matt Crawford wrote:
> Plus a string of log(N) bits telling you how many times to apply the decompression function! Uh-oh, now it goes over the judge's head ...

Hadmut Danisch wrote:
> The problem is that if you ask for a string of log(N) bits, then someone else could take this as a proof that this actually works, because a string of log(N) bits is obviously shorter than the message of N bits, thus the compression scheme is working. Hooray!

That misses the point of the construction that was the subject of Matt's remark. The point was (and remains) that the compressed output (whether it be 1 bit, or 1+log(N) bits, or 1+log^*(N) bits) is ridiculous because it is manifestly undecodeable. It is far, far too good to be true. The only question is whether the construction is simple enough for the judge to understand. There is no question whether the construction is a valid _reductio ad absurdum_.

While we are on the subject, I recommend the clean and elegant argument submitted by Victor Duchovni (08/31/2004 03:50 PM) and also in more detail by Matt Crawford (08/31/2004 06:04 PM). It uses mathematical induction rather than proof-by-contradiction. It is a less showy argument, but probably it is understandable by a wider audience. It proves a less-spectacular point, but it is quite sufficient to show that the we-can-compress-anything claim is false. (Although with either approach, at least *some* mathematical sophistication is required. Neither PbC nor MI will give you any traction with ten-year-olds.)

So it appears we have many different ways of approaching things:
 1) The pigeon-hole argument. (This disproves the claim that all N-bit strings are compressible ... even if the claim is restricted to a particular fixed N.)
 2) The mathematical induction argument. (Requires the claimed algorithm to be non-expansive for a range of N.)
 3) The proof-by-contradiction. (Requires the claimed algorithm to be compressive -- not just non-expansive -- for a range of N.)
 4) Readily-demonstrable failure of *any* particular claimed example, including Lempel-Ziv and all the rest.
 *) Others?

Harumph. That really ought to be enough. Indeed *one* disproof should have been enough.

> The problem is that the number of iterations is not on the order of N, but on the order of 2^N, so it takes log2(around 2^N) = around N bits to store the number of iterations.

I don't see why the number of iterations should be exponential in the length (N) of the input string. A compression function is supposed to decrease N. It is not supposed to decrement the number represented by some N-bit numeral; after all, the string might not represent a number at all. Also I repeat that there exist special cases (e.g. inputs of known fixed length) for which no extra bits need be represented, as I explained previously.

> The recursion converts a message of N bits recursively into a message of one or zero bits (to your taste), *and* a number which takes around N bits to store. Nothing is won. But prove that.

I disagree, for the reasons given above. In the worst case, you need log^*(N) extra bits, not N bits. In special cases, you don't need any extra bits at all. The win is very substantial. The win is extreme.

> This recursion game is far more complicated than it appears to be.

Maybe. But there's no need to make it more complicated than it really is.

> Note also that storing a number in reality takes more than log(N) bits. Why? Because you don't know N in advance. We don't have any limit on the message length.

For general N, that's true.

> So your counting register theoretically needs infinitely many bits.

Maybe. For many practical purposes, the required number of bits is considerably less than infinity.

> When you're finished you know how many bits your number took. But storing your number needs an end symbol or a tristate bit (0, 1, void). That's a common mistake.

We agree that there are many common mistakes. We agree that it is a mistake to have undelimited strings. But it is also a mistake to think that you need to reserve a special symbol to mark the end of the string. Yes, that is one option, but from a data-compression point of view it is an inefficient option.

Anybody who is interested in this stuff reeeally ought to read Chaitin's work. He makes a big fuss about the existence of self-delimiting strings and self-delimiting programs. There are many examples of such:
 -- The codewords of many commonly-used compression algorithms are self-delimiting. This is related to the property of being instantaneously decodable.
 -- As Chaitin points out, you can set up a dialect of Lisp such that Lisp programs are self-delimiting.
 -- If you want to be able to represent M, where M is *any* N-bit number, you need more than log(M) bits (i.e. more than N bits). That's because you also need to specify how many bits are used to represent M, which adds roughly another log(log(M)) bits.
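As an illustration of self-delimiting codewords, here is a sketch of Elias gamma coding in Python (my example, not Chaitin's own construction; the function names are arbitrary). Each codeword announces its own length, so concatenated codewords need no end symbol and no tristate bit:

    def gamma_encode(n: int) -> str:
        """Elias gamma code for a positive integer: self-delimiting."""
        assert n >= 1
        b = bin(n)[2:]                  # binary digits, no '0b' prefix
        return '0' * (len(b) - 1) + b   # leading zeros announce the length

    def gamma_decode(stream: str, pos: int = 0):
        """Decode one codeword starting at pos; return (value, new_pos)."""
        z = 0
        while stream[pos + z] == '0':   # count the leading zeros
            z += 1
        value = int(stream[pos + z : pos + 2*z + 1], 2)
        return value, pos + 2*z + 1

    # Concatenate several codewords; the decoder finds the boundaries itself.
    stream = gamma_encode(9) + gamma_encode(1) + gamma_encode(42)
    pos, out = 0, []
    while pos < len(stream):
        v, pos = gamma_decode(stream, pos)
        out.append(v)
    print(out)   # [9, 1, 42]

The decoder recovers the boundaries purely from the leading zeros, which is exactly the "instantaneously decodable" property mentioned above.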
Re: Compression theory reference?
On Aug 31, 2004, at 15:56, John Denker wrote:
> 4) Don't forget the _recursion_ argument. Take their favorite algorithm (call it XX). If their claims are correct, XX should be able to compress _anything_. That is, the output of XX should _always_ be at least one bit shorter than the input. Then the compound operation XX(XX(...)) should produce something two bits shorter than the original input. If you start with an N-bit message and apply the XX function N-1 times, you should be able to compress each and every message down to a single bit.

Plus a string of log(N) bits telling you how many times to apply the decompression function! Uh-oh, now it goes over the judge's head ...
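To put rough numbers on that counter (a back-of-the-envelope sketch, assuming at most N-1 iterations and a plain binary count), the claimed total output stays absurdly small, which is the "too good to be true" point:

    import math

    # Claimed output: 1 final bit plus a counter for up to N-1
    # applications of the decompression function.
    for N in (10, 100, 1_000_000):
        counter_bits = math.ceil(math.log2(N - 1))
        print(N, 1 + counter_bits)   # e.g. 10 -> 5, 100 -> 8, 1000000 -> 21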
Re: Compression theory reference?
On Tue, Aug 31, 2004 at 05:07:30PM -0500, Matt Crawford wrote:
> Plus a string of log(N) bits telling you how many times to apply the decompression function! Uh-oh, now it goes over the judge's head ...

Yeah, I just posted a lengthy description of why I think that this counterexample is not a counterexample. The problem is that if you ask for a string of log(N) bits, then someone else could take this as a proof that this actually works, because a string of log(N) bits is obviously shorter than the message of N bits, thus the compression scheme is working. Hooray!

The problem is that the number of iterations is not on the order of N, but on the order of 2^N, so it takes log2(around 2^N) = around N bits to store the number of iterations. The recursion converts a message of N bits recursively into a message of one or zero bits (to your taste), *and* a number which takes around N bits to store. Nothing is won. But prove that. This recursion game is far more complicated than it appears to be.

Note also that storing a number in reality takes more than log(N) bits. Why? Because you don't know N in advance. We don't have any limit on the message length. So your counting register theoretically needs infinitely many bits. When you're finished you know how many bits your number took. But storing your number needs an end symbol or a tristate bit (0, 1, void). That's a common mistake. When determining the compression rate for a file, people often forget that some information is not stored in the file itself but in the file system, e.g. the file length (telling you where the compressed data stops) and the file name (telling you that the file was compressed). That's basically the same problem.

thanks and regards
Hadmut
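The 2^N point can be acted out with a deliberately silly reversible function (a toy of my own choosing, not something anyone has proposed as a compressor): treat the N-bit message as a number and subtract 1 until you reach zero. The "result" is always the same, but the iteration count *is* the message, and storing it takes about N bits:

    import os

    N = 10
    m = int.from_bytes(os.urandom(2), 'big') % 2**N   # a random 10-bit message

    # "Compress" by iterating the reversible step x -> x-1 until we
    # reach the fixed 1-bit result 0.
    iterations, x = 0, m
    while x != 0:
        x -= 1
        iterations += 1

    # Decompression needs the iteration count, which is just m itself.
    print(f"message = {m:010b}")
    print(f"result  = 0, after {iterations} iterations")
    print(f"count needs {max(iterations.bit_length(), 1)} bits: nothing is won")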
Re: Compression theory reference?
I wrote:
> 4) Don't forget the _recursion_ argument. Take their favorite algorithm (call it XX). If their claims are correct, XX should be able to compress _anything_. That is, the output of XX should _always_ be at least one bit shorter than the input. Then the compound operation XX(XX(...)) should produce something two bits shorter than the original input. If you start with an N-bit message and apply the XX function N-1 times, you should be able to compress each and every message down to a single bit.

Matt Crawford wrote:
> Plus a string of log(N) bits telling you how many times to apply the decompression function! Uh-oh, now it goes over the judge's head ...

Actually you don't need to adjoin log(N) bits. But perhaps my assertion would benefit from some clarification.

I emphasize that I am only discussing messages of length N, where N is some pre-chosen number. For concreteness, let's choose N=10. I repeat my assertion that _if_ XX can compress any string, shortening it by one bit, and _if_ you know that the original messages each have exactly 10 bits, _then_ any 10-bit message can be compressed down to a single bit. I have proved that XX is ridiculous in this one case. My function YY := XX^9 is less general than XX. XX works on any input, whereas YY by its definition only applies to 10-bit messages.

The fact remains that we have a proof by contradiction. We assume by way of hypothesis that the bad guys are right, namely that XX exists and has the properties they assign to it. Then I can construct YY. But YY is ridiculous, through no fault of mine. Ergo the bad guys are wrong, i.e. no such XX can exist.

With a little more work I could construct a more powerful and/or more general version of YY ... but that would be doing more work than is required. Their XX stuck its neck out; it is not required for my YY to stick its neck out in the same way. The requirement, as I understood it, was to prove the bad guys wrong. Well, the bad guys have been proved wrong. If something more is required, please explain the requirements in more detail.

(BTW I suppose it would be better to call this the "iterated composition" argument rather than the recursion argument.)

Hadmut wrote:
> I found some outside Germany. But they didn't give me a paper with a signature, just e-mails. Will see whether the court will accept that.

Ask your legal advisor. In the US, getting such emails admitted as evidence would be problematic. Each jurisdiction (I assume) has its own standards for how affidavits should be prepared. Figure out the rules, and play by the rules.
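For completeness, here is a sketch of why YY is manifestly undecodeable. Since no real XX can exist, the code below substitutes plain truncation as a stand-in shortening function; any other choice gives the same count, because nine 1-bit-shortening steps squeeze 1024 inputs into at most two outputs:

    from itertools import product

    def xx(s: str) -> str:
        # Stand-in for the hypothetical XX: returns a string one bit
        # shorter. (Truncation is of course lossy; that is the point.)
        return s[:-1]

    messages = [''.join(bits) for bits in product('01', repeat=10)]
    outputs = set()
    for m in messages:
        y = m
        for _ in range(9):          # YY := XX composed 9 times
            y = xx(y)
        outputs.add(y)

    print(len(messages))            # 1024 distinct 10-bit inputs
    print(sorted(outputs))          # ['0', '1']: decoding is hopeless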
Re: Compression theory reference?
Hadmut Danisch [EMAIL PROTECTED] writes:
> I need a literature reference for a simple problem of encoding/compression theory:

comp.compression FAQ, probably question #1 given the number of times this comes up in the newsgroup. (I've just checked, it's question #9 in part 1. Question #73 in part 2 may also be useful.)

Peter.
Re: Compression theory reference?
On Wed, Sep 01, 2004 at 04:02:02PM +1200, Peter Gutmann wrote:
> comp.compression FAQ, probably question #1 given the number of times this comes up in the newsgroup. (I've just checked, it's question #9 in part 1. Question #73 in part 2 may also be useful.)

Thanks, that's a pretty good hint, especially because it contains an explicit statement, and it's an FAQ, making it easy to show that the university's claim is not just wrong, but silly. :-)

regards
Hadmut
RE: Compression theory reference?
On Tue, Aug 31, 2004 at 02:48:00PM +0200, Hadmut Danisch wrote:
> It can be easily shown that there is no lossless compression method which can effectively compress every possible input.

Even more simply, if such a method existed, you could recursively apply it to its own output and compress every message down to one bit.
Re: Compression theory reference?
On Wed, 1 Sep 2004, Hadmut Danisch wrote:
> I have often heard that example and I used it myself several times. I do not use it anymore, because it is not that easy. There's a major flaw in this example, and therefore it is not really a counterexample. If the faculty found that flaw I'd be in a bad position. You could define some reversible bizarre function f that does exactly that job, i.e. for a given message m you need to apply f again and again and after some finite number of calculations you'll find that f(...f(m)...) = x where x = 0 or 1

For example, loading the message into a Linear Feedback Shift Register and iterating until it produces one of two predetermined messages, 0 or 1? For a nontrivial message, the last star will burn out before you get that many iterations. Also, as you say, in order to decode the message you have to know how many iterations you made - a number which will, on average, be the same length as the message. It hardly seems a worthwhile example.

> They say LZW and MTF. I have already given an example for LZW. They don't care. I've told them to take any random string taken from /dev/random under Linux. They don't care. The German principle is that a faculty is always right by definition.

That is inconsistent with the advancement of knowledge. Any university relying on such a principle has abandoned its duty.

Bear
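The LFSR scenario is easy to act out at small scale. The sketch below (mine; it assumes the feedback polynomial x^10 + x^7 + 1, the classic PRBS10 trinomial, is primitive, so the register cycles through all 1023 nonzero states) iterates until the register reaches the agreed state 1 and reports how large the step count gets:

    import os

    def lfsr_step(s: int) -> int:
        # One step of a 10-bit Fibonacci LFSR, taps at bits 10 and 7
        # (feedback polynomial x^10 + x^7 + 1, assumed primitive).
        bit = ((s >> 9) ^ (s >> 6)) & 1
        return ((s << 1) | bit) & 0x3FF

    m = 0
    while m == 0:                       # the all-zero state never leaves 0
        m = int.from_bytes(os.urandom(2), 'big') % 1024

    steps, s = 0, m
    while s != 1:                       # iterate until the fixed state 1
        s = lfsr_step(s)
        steps += 1

    print(f"{m:010b} reached state 1 after {steps} steps "
          f"(storing the count takes ~{max(steps.bit_length(), 1)} bits)")

The step count ranges up to 1022, so recording it costs about ten bits - the length of the original message.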
Re: Compression theory reference?
On Tue, 31 Aug 2004, John Denker wrote:
> I emphasize that I am only discussing messages of length N, where N is some pre-chosen number. For concreteness, let's choose N=10. I repeat my assertion that _if_ XX can compress any string, shortening it by one bit, and _if_ you know that the original messages each have exactly 10 bits, _then_ any 10-bit message can be compressed down to a single bit.

Actually you don't need to take it all the way to the reductio ad absurdum here. There are 1024 10-bit messages. There are 512 9-bit messages. You need to point out that since a one-to-one mapping between sets of different cardinality cannot exist, it is not possible to create something that will compress every ten-bit message by one bit.

Or, slightly more formally: assume that a function C can reversibly compress any ten-bit message by one bit. Since there are 1024 distinct ten-bit messages, there must be at least 1024 distinct nine-bit messages which, when the reversal is applied, result in these 1024 messages. There are exactly 512 distinct nine-bit messages. Therefore 512 = 1024, which is absurd.

Bear
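The pigeonhole can even be demonstrated mechanically. The sketch below (mine) builds an arbitrary candidate for C - here just a deterministic random assignment, since the argument holds for *any* function from ten bits to nine - and exhibits the inevitable collision:

    import random
    from itertools import product

    def candidate_c(m: str) -> str:
        # An arbitrary deterministic map from 10-bit to 9-bit strings,
        # standing in for the claimed compressor C.
        rnd = random.Random(m)
        return ''.join(rnd.choice('01') for _ in range(9))

    seen = {}
    for m in (''.join(b) for b in product('01', repeat=10)):
        c = candidate_c(m)
        if c in seen:               # guaranteed: 1024 inputs, 512 outputs
            print(f"C({seen[c]}) == C({m}) == {c}")
            print("C cannot be reversed, so it is not lossless")
            break
        seen[c] = m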
Compression theory reference?
Hi,

I need a literature reference for a simple problem of encoding/compression theory:

It can be easily shown that there is no lossless compression method which can effectively compress every possible input. Proof is easy: In a first step, consider all possible messages of length n bits, n > 0. There are 2^n different ones. But there are only (2^n)-1 shorter messages, so there is no injective encoding that encodes all messages into shorter ones. And then all codewords of length less than n are occupied, so it is impossible to compress messages shorter than n bits. So when trying to compress a message of length m, m < n, it must be encoded into a codeword of at least n bits, thus longer than m and not effectively compressed. (And you'd even have to consider the entropy of the end-of-message sign or the bit counter.)

Or in other words: For every input word which is compressed into a shorter codeword, there must be another, shorter input word which cannot be effectively compressed, but gets longer - if the algorithm/function should be able to accept any input and should be lossless, i.e. for any input a, decompress(compress(a)) = a. Thus, for every lossless compression method which can accept any input message and is not completely useless (i.e. there is at least one message whose codeword is shorter than the message), there is at least one input whose codeword is longer than the message.

As far as I know, that's stuff students learn in the early semesters of computer science: that in theory there is no universal lossless method compressing everything. Lossless compression is the idea of encoding those messages with higher probabilities into shorter codewords, and those with lower probability into longer codewords, thus reducing the average length of the messages, not the length of every single message.

As I mentioned earlier, I have some trouble with a computer science department. They do not want to believe that there is no such algorithm, and they claim that there are such algorithms which can compress everything without loss and without any input resulting in a longer codeword. They claimed that Lempel-Ziv and MTF (Move To Front) would do the job. I've given counterexamples for LZ, showed that gzip on a file filled with random numbers results in a bigger file, and showed that MTF is not a compression at all, since it does not change the length of a message. They don't understand.

Therefore, I need a book about computer science or encoding theory which explicitly says that this is impossible, in a way that a person inexperienced in computer science (judges at a court) can read and at least see that this is not just my personal fantasy and at least somehow reasonable. I have checked several books about information and coding theory, but didn't find any suitable for readers without a computer science background. The statement is always hidden in formulas about entropy, average code lengths, code efficiency inequations, etc. If you had such an encoding scheme, you could easily show that the average code length is below the entropy, or that the efficiency is above 100%. But a non-computer-science person does not understand that.

Does anybody know a book about coding theory which explicitly states the impossibility of such a compression method in plain language?

regards
Hadmut
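The gzip experiment is easy for anyone to reproduce. Here is a sketch using Python's zlib, which implements the same DEFLATE compression as gzip (the exact sizes vary slightly from run to run):

    import os, zlib

    data = os.urandom(100_000)            # 100 kB from a random source
    packed = zlib.compress(data, level=9)

    print(len(data))      # 100000
    print(len(packed))    # slightly MORE than 100000: random data has
                          # no structure, so "compression" only adds
                          # header and block overhead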
Re: Compression theory reference?
Hadmut Danisch wrote:
> It can be easily shown that there is no lossless compression method which can effectively compress every possible input.

OK.

> ... I need a book about computer science or encoding theory which explicitly says that this is impossible, in a way that a person inexperienced in computer science (judges at a court) can read and at least see that this is not just my personal fantasy and at least somehow reasonable.

There are two conflicting requirements there:
 -- authoritative and correct, and
 -- understandable to a judge with no technical expertise.
As you know, it is easy to satisfy either requirement separately.

My suggestions:

1) Get a few expert witnesses to certify that your position is certainly not a personal fantasy. (I assume German jurisprudence has something akin to the US notion of expert witnesses, right?)

2) Try to get the burden of proof reversed. The opposition are claiming that known algorithms do the job. Get them to be specific about which algorithms they are referring to. Then file with the court some disks, each containing 1.44 megabytes of data. If the opposition are telling the truth, they should be able to compress these using one of their chosen algorithms. Offer to concede the entire case if they can shorten the text by more than, say, 0.1 percent (i.e. 1.44 kilobytes shorter). Of course you fill your disks using a good hardware random number generator, e.g. http://www.av8n.com/turbid/

Be sure to arrange suitable precautions so that the opposition doesn't get a chance to peek at your disks and build a special-purpose rumplestiltskin encoder/decoder, i.e. one that contains quotations of your data. One precaution would be to require that they use an open-source implementation, or at least an impartial implementation, of a published algorithm.

3) Diagram the pigeon-hole argument for the judge. See diagrams below.

4) Don't forget the _recursion_ argument. Take their favorite algorithm (call it XX). If their claims are correct, XX should be able to compress _anything_. That is, the output of XX should _always_ be at least one bit shorter than the input. Then the compound operation XX(XX(...)) should produce something two bits shorter than the original input. If you start with an N-bit message and apply the XX function N-1 times, you should be able to compress each and every message down to a single bit.

Here's how to diagram the pigeon-hole argument. Write down all the three-bit numbers, and demonstrate a lossless non-compressive encoding:

    plaintext      codetext
      000  --------  000
      001  --------  001
      010  --------  010
      011  ---\ /--  011
               X
      100  ---/ \--  100
      101  --------  101
      110  --------  110
      111  --------  111

Then give an example of a lossy compressive code. Make the point that the codetext words are shorter than the plaintext words. However, there are necessarily fewer of them, so the coding scheme is necessarily lossy and non-invertible:

    plaintext      codetext
      000  --------  00   zero
      001  --------  01   one
      010  --------  10   two
      011  ---\
      100  ----\
      101  -----+--  11   many
      110  ----/
      111  ---/

Then give an example of a code that might or might not be compressive, depending on statistical assumptions. The following code is compressive if zero, one, and two are considerably more probable than the other three-bit numbers, but is anti-compressive if all eight three-bit numbers are equally likely, or nearly equally likely:

      000  --------  00     zero
      001  --------  01     one
      010  --------  10     two
      011  --------  11000
      100  --------  11001
      101  --------  11010
      110  --------  11011
      111  --------  11100

Average length, for equiprobable plaintext words:

      (2+2+2+5+5+5+5+5) / 8  =  3.875

which is an expansion, not a compression.

Judges like fairness. Certainly an equiprobable distribution of input words is fair. It reflects the bit-strings that would be generated by tossing fair coins (three at a time), or tossing a fair eight-sided die. This fairest of distributions is incompressible in the usual sense.

Finally, offer a challenge. Write down all eight three-bit plaintext words, and all four two-bit codetext words. Ask the judge to conclude for himself that it is obviously impossible to draw lines connecting corresponding words in a one-to-one fashion:

      000        00
      001        01
      010        10
      011        11
      100
      101
      110
      111

Note that in the last example, the codetext words were only one bit shorter than the plaintext words. If you want to pound on the point, present the same thing with four-bit plaintext and two-bit codetext words.
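The third example can also be checked mechanically. A short Python sketch (mine) verifies that the code is prefix-free and reproduces the 3.875-bit average:

    code = {
        '000': '00',    '001': '01',    '010': '10',
        '011': '11000', '100': '11001', '101': '11010',
        '110': '11011', '111': '11100',
    }

    # Prefix-free: no codeword is a prefix of another, so concatenated
    # codetext is uniquely decodable with no end markers.
    words = list(code.values())
    assert not any(a != b and b.startswith(a) for a in words for b in words)

    # Average codeword length under the equiprobable distribution:
    print(sum(len(w) for w in words) / len(words))   # 3.875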