Aram Perez <[EMAIL PROTECTED]> writes: > On Mar 22, 2006, at 9:04 AM, Perry E. Metzger wrote: > >> >> Aram Perez <[EMAIL PROTECTED]> writes: >>>> Entropy is a highly discussed unit of measure. >>> >>> And very often confused. >> >> Apparently. >> >>> While you do want maximum entropy, maximum >>> entropy is not sufficient. The sequence of the consecutive numbers 0 >>> - 255 have maximum entropy but have no randomness (although there is >>> finite probability that a RNG will produce the sequence). >> >> One person might claim that the sequence of numbers 0 to 255 has 256 >> bytes of entropy. > > It could be, but Shannon would not.
No, he wouldn't. You did, however. The maximum entropy a string of 256 bytes could have would be 256*8 bits. Since we're talking about a sequence with far less entropy than 256*8 bits, it is not a sequence of maximum entropy. There are, of course, trivially produced sequences of maximum entropy. >> Another person will note "the sequence of numbers 0-255" completely >> describes that sequence and is only 30 bytes long. > > I'm not sure I see how you get 30 bytes long. $ echo "the sequence of numbers 0-255" | wc -c 30 Now, of course, there are probably not more than 1.5 bits of entropy per letter in that sentence fragment, so really we're probably talking about ~6 bytes of information. Doubtless, though, cleverer people could do better. >> Indeed, more >> compact ways yet of describing that sequence probably >> exist. Therefore, we know that the sequence 0-255 does not, in fact, >> have "maximum entropy" in the sense that the entropy of the sequence >> is far lower than 256 bytes and probably far lower than even 30 bytes. > > Let me rephrase my sequence. Create a sequence of 256 consecutive > bytes, with the first byte having the value of 0, the second byte the > value of 1, ... and the last byte the value of 255. We heard you the first time. The Shannon information of a message is the negative of the log (base 2) of the probability of the message. Of course, that definition is only really useful if you're talking about a sequence of messages. The Kolmogorov-Chaitin information of a text is (roughly) the smallest program that can generate the text. Both of these definitions are getting at can be informally described as the smallest "representation" of a piece of information. If Alice asks Bob "what color are your eyes", Bob could send a 10M high resolution image of his eye, a precise measurement of the spectrum reflected by his eyes, the word "blue", or perhaps even something shorter, like a couple of bits that, according to a pre-agreed table, represent an eye color from a table of eye colors. The smallest possible representation of just the eye color, the couple of bits from a table of eye color codes, is likely closest to the information content of someone's eye color, though a precise measurement is impossible since it is highly context dependent. > If you measure the entropy (according to Shannon) of that sequence > of 256 bytes, you have maximum entropy. Clearly you don't, since the sequence can be described with far less information than 256 bytes. A completely random and incompressible sequence of 256 bytes would have maximum entropy, since it is impossible to compress to less than 256*8 bits, but the sequence 0..255 has very little entropy because it is easily compressed to a smaller size. This should be obvious. If I start reciting to you "zero. one. two..." for a few iterations the probability of the next byte will be 1/256 or close to it. Shortly, though, anyone who isn't an idiot will guess what the next value (with nearly probability 1) in the sequence is, and the information content of subsequent bytes falls to far less than one bit per byte. This is just another way of saying that the smallest program that generates 0..255 is quite small, or that you can easily compress 0..255 into a description that fits in far less than 256 bytes. >> Entropy is indeed often confusing. Perhaps that is because both the >> Shannon and the Kolmogorov-Chaitin definitions do not provide a good >> way of determining the lower bound of the entropy of a datum, and >> indeed no such method can exist. > > No argument from me. I thought you were, indeed, still arguing. Perry --------------------------------------------------------------------- The Cryptography Mailing List Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]