On 01/17/2007 06:07 PM, Allen wrote:

> The whole issue of entropy is a bit vague for me - I don't normally work
> at that end of things - so could you point to a good tutorial on the
> subject, or barring having a reference handy, could you give an overview?
Entropy is defined in terms of probability. The /definition/ of entropy is

   sum_i P_i log(1/P_i)                [1]

where the sum runs over all symbols (i) in the probability distribution, i.e. over all symbols in the ensemble.

Equation [1] is the gold standard. It is always correct. Any other expression for entropy is:
 a) equivalent to [1],
 b) a corollary, valid under some less-than-general conditions, or
 c) wrong.

We have not yet specified the _base_ of the log in equation [1]. If the log is base 2, the entropy is measured in bits. If the log is base e, the entropy is measured in nats. For more on this, including the connection to macroscopic thermodynamics, see
  http://www.av8n.com/physics/thermo-laws.htm#sec-s-units

One toss of a fair coin is 1 bit of entropy. Two tosses of a fair coin is 2 bits of entropy. Entropy can be measured in bits, but not everything measured in bits is entropy (just as milk can be measured in gallons, but sewage can also be measured in gallons). In particular, a 32-bit word may hold less than 32 bits of entropy, but it cannot hold more.

A related notion is the /surprise value/ of a given symbol, namely

   $_i = log(1/P_i)

We immediately see that the entropy is the appropriately weighted average of the surprise value. It must be emphasized that entropy is a property of the whole ensemble (in contrast to the surprise value, which is a property of an individual symbol). There is no such thing as the "average entropy"; entropy is already an average.

There are two branches of cryptography.
 -- One branch depends on entropy for security.
 -- The other branch depends on computational complexity for security.

The term /random/ is used in multiple inconsistent ways; if you ask 5 experts you can get 10 different definitions. As for me, I say something is random if it is _random enough for the application_.

I am *not* correspondingly tolerant about the term entropy. Entropy is entropy.
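Equation [1] and the surprise-value relation can be checked numerically. Here is an illustrative sketch (not part of the original post), using log base 2 so the answers come out in bits:

```python
import math

def surprisal(p):
    # Surprise value of one symbol: log2(1/P_i), measured in bits.
    return math.log2(1.0 / p)

def entropy(dist):
    # Equation [1]: the probability-weighted average of the surprisal,
    # summed over every symbol in the ensemble (P_i = 0 terms contribute 0).
    return sum(p * surprisal(p) for p in dist if p > 0)

# One toss of a fair coin: 1 bit of entropy.
print(entropy([0.5, 0.5]))                # 1.0

# Two tosses of a fair coin (four equiprobable outcomes): 2 bits.
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0

# A biased coin carries less than 1 bit of entropy,
# even though the outcome occupies a 1-bit word.
print(entropy([0.9, 0.1]))                # about 0.469
```

The biased-coin case makes the "32-bit word" point in miniature: the storage is 1 bit wide, but the ensemble holds only about 0.47 bits of entropy.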
There is only one definition of entropy, used uniformly across a wide range of disciplines including
 1. cryptography and cryptanalysis (secret codes)
 2. communications (error-correcting codes, as part of electronic engineering)
 3. computer science, including data-compression codes, machine learning, speech recognition, etc.
 4. librarianship
 5. the physics of computation
 6. the design of refrigerators, heat pumps, and engines (including piston, turbine, and rocket engines)
 7. nuclear engineering (reactors and weapons)
 8. fluid dynamics
 9. astrophysics and cosmology
 10. chemistry and chemical engineering

If you're not defining it in terms of equation [1] or something provably equivalent, don't call it entropy. In particular, 200 bits of pseudorandomness is not the same as 200 bits of entropy. If you mean entropy, say entropy; if you mean randomness, say randomness. Sometimes pseudorandomness based on computational complexity is suitable for the application, and sometimes it doesn't make sense to use anything less than genuine entropy.

For more on how to generate usable entropy, see
  http://www.av8n.com/turbid/paper/turbid.htm

For more about the definition of entropy, including the connection to thermodynamics, see
  http://www.av8n.com/physics/thermo-laws.htm#sec-entropy

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to [EMAIL PROTECTED]
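The point that N bits of pseudorandomness is not N bits of entropy can be illustrated with a toy example (my own sketch, not from the post above): deterministically stretching a small seed produces long output strings, but by equation [1] the output ensemble has no more entropy than the seed did.

```python
import hashlib
import math

def entropy_bits(dist):
    # Equation [1] with log base 2, so the answer is in bits.
    return sum(p * math.log2(1.0 / p) for p in dist if p > 0)

# Toy PRNG (hypothetical): stretch an 8-bit seed into 256 bits of output.
def prng(seed):
    return hashlib.sha256(bytes([seed])).hexdigest()

# Enumerate the entire ensemble: 256 equiprobable seeds -> 256 outputs.
outputs = {prng(s) for s in range(256)}
dist = [1.0 / len(outputs)] * len(outputs)

# Each output string is 256 bits of "randomness"...
print(len(outputs))        # 256 distinct 256-bit outputs
# ...but the ensemble holds only 8 bits of entropy, inherited from the seed.
print(entropy_bits(dist))  # 8.0
```

Whether those 8 bits are "random enough for the application" is exactly the computational-complexity question; the entropy accounting itself is not negotiable.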