[cryptography] Entropy is forever ...

Thierry Moreau Fri, 17 Apr 2015 10:27:00 -0700

Dear all:

Quoting the basic definition of entropy from Wikipedia, "In information theory, entropy is the average amount of information contained in each message received. Here, message stands for an event, sample or character drawn from a distribution or data stream." In applied cryptography, the entropy of a truly random source of "messages" is an important characteristic to ascertain. There are significant challenges when applying the concepts of information theory, probability, and statistics to applied cryptography. The truly random source (and computations using random message data) must be kept secret. Also, the following dilemma should be noted: the truly random source is needed in a digital processor system typically engineered with determinism as a design goal derived from the basic reliability requirement. Quantitatively, the entropy measure for applied cryptography, on the order of hundreds of bits, is way beyond any workable statistical analysis process. In practice, a truly random source usable for applied cryptography is a system procurement issue that can seldom be blindly delegated as an ordinary operating system service. Thus, one wants a reliable source of uncertainty, a trustworthy one that can barely be tested, as a universal operating system service totally dependent on hardware configuration.
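
As a minimal illustration of the quoted definition, assuming a source fully described by a known list of message probabilities, the average information per message can be computed as below. The function name and the two example sources are hypothetical, chosen only for illustration.

```python
import math

def shannon_entropy(probabilities):
    """Average information, in bits, per message drawn from the source."""
    # Messages with probability 0 never occur and contribute nothing.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical source 1: one uniformly random byte (256 equally likely messages).
uniform_byte = [1 / 256] * 256

# Hypothetical source 2: a heavily biased coin.
biased_coin = [0.9, 0.1]

print(shannon_entropy(uniform_byte))  # entropy of the uniform byte source
print(shannon_entropy(biased_coin))   # strictly less than 1 bit
```

The same formula applies in principle to a source with hundreds of bits of entropy, but its probability distribution can then neither be enumerated nor estimated from samples, which is the quantitative difficulty noted above.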

Applying information theory to actual situations is error-prone. Is there a lower entropy in "Smith-725" than in "gg1jXWXh" as a password string? This question makes no sense, as the entropy assessment applies to the message source. A password management policy that rejects "Smith-725" as a message originating from the end-user population actually constrains this message source, with the hope of a higher average amount of information in user-selected passwords. From the perspective of a single end-user having to deal with an ever growing number of passwords, the entropy concept appears as a formalization of the impossible task he/she faces.

Significant conceptual deviation may occur from the common (and correct) system arrangement where a software-based pseudo-random number generator (PRNG) of a suitable type for cryptography is initially seeded from a secret true random source and then used for drawing secret random numbers. It is often inconvenient for statistical testing to apply directly to the true random source messages, but statistical testing of the PRNG output gives no clue about the true random source. The design of PRNG seeding logic is an involved task dependent on the true random source, which may be hard to model in the first place. In actual system operations, inadequate seeding may have catastrophic indirect consequences but it may be difficult to detect, and it is certainly a challenging error condition for service continuity (programmers may be inclined to revert to insecure PRNG seeding when the proper true random source breaks down).

Despite these pitfalls, I assume my readers share my endorsement of the true random seeding of a cryptographic PRNG as the main source of random secrets for a digital processor system dedicated to cryptographic processing. As this PRNG output is being used in various ways, chunks of the output sequence may be disclosed to remote parties. It is an essential requirement for a cryptographic PRNG that no output chunk may allow the recovery of its internal state (i.e. some data equivalent to PRNG seed data leading to the same PRNG output sequence as the secret PRNG).

In this note, I challenge the view that an entropy pool maintained by an operating system ought to be depleted as it is used. I am referring here to the Linux "entropy pool." My challenge does not come through a review of the theory applied to the implementation. Instead, I propose a counter-example in the form of the above arrangement and a very specific example of its use.

The central question is this problem. A system is booted and receives 2000 bits of true randomness (i.e. a 2000-bit message from a source with 2000 bits of entropy) that are used to seed a cryptographic PRNG having an internal state of 2000 bits. This PRNG is used to generate 4 RSA key pairs with a modulus size of 2400 bits. The private keys are kept secret until their use in their respective usage contexts. No data leak occurred during the system operation. After the key generation, the system memory is erased. What is the proper entropy assessment for each of the RSA key pairs (assume there are 2^2000 valid RSA moduli for a modulus size of 2400 bits, a number-theoretic assumption orthogonal to the entropy question)?

My answer is that each of the 4 RSA key pairs is independently backed by 2000 bits of entropy assurance. The entropy characterization (assessment) of a data element is a meta-data element indicating the entropy of a data source at the origin of the data, plus the implicit statement that no information loss occurred in the transformation of the original message into the assessed data element. Accordingly, my answer should be made more precise by referring to an unbiased RSA key generation process (which should not be considered a reasonable assumption for the endorsement of lower ranges of entropy assessments).
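
The "no information loss" condition can be checked exhaustively on a scaled-down toy version of the problem. This is only a sketch under loud assumptions: the seed is 8 bits instead of 2000, SHA-256 with a per-key label stands in for the PRNG plus RSA key generation, and all names (`derive`, `labels`, `per_key_entropy`) are hypothetical rather than taken from any real system.

```python
import hashlib
import math
from collections import Counter

def shannon_entropy_bits(outcomes):
    """Entropy, in bits, of the empirical distribution of `outcomes`."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def derive(seed, label):
    """Toy stand-in for 'PRNG output turned into key number `label`'."""
    return hashlib.sha256(label + seed.to_bytes(1, "big")).digest()

SEED_BITS = 8                  # assumption: scaled down from 2000 bits
seeds = range(2 ** SEED_BITS)  # uniform source: every seed equally likely
labels = [b"key-1", b"key-2", b"key-3", b"key-4"]

# Enumerate every possible seed and measure the distribution of each key.
per_key_entropy = [
    shannon_entropy_bits([derive(s, label) for s in seeds])
    for label in labels
]
print(per_key_entropy)
```

Since each toy derivation maps distinct seeds to distinct outputs, the distribution of each derived value carries the full entropy of the seed source, which is the sense of the assessment given above.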

To summarize, the entropy assessment is a characterization of the data source being used as a secret true random source. It also refers to the probability distribution of messages from the data source and the quantitative measure of information content derived from the probability distribution according to information theory. This mathematical formalism is difficult to apply to actual arrangements useful for cryptography, notably because the probability distribution is not reflected in any message. Information theory is silent about the secrecy requirement essential for cryptographic applications. Maybe there is confusion by assuming that entropy is lost when part of the random message is disclosed, while only (!) data suitability for cryptographic usage is being lost. In applying information theory to the solution of actual difficulties in applied cryptography, we should address secrecy requirements independently. The probability distribution preservation through random message transformations is an important lesson from the theory that might have been overlooked (at least as an explicit requirement).

A note about the genesis of the ideas put forward. In my efforts to design applied cryptography key management schemes without taking anything for granted and paying attention to the lessons from academia and its theories, I came upon a situation very similar to the above problem statement. The 2000-bit random message from a truly random source with 2000 bits of entropy is a simplification of the actual situation, in which a first message transformation preserves the probability distribution of random dice shuffling. In the above problem statement, the PRNG seeding is another distribution-preserving transformation. The actual PRNG is based on the Blum-Blum-Shub x^2 mod N generator, which comes with two bits of entropy loss upon seeding. The above problem statement is thus concrete.
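
One plausible reading of the two-bit loss can be checked with a toy Blum-Blum-Shub generator. This is a sketch under assumptions, not the generator actually used: the primes 11 and 19 are illustrative only (real moduli are thousands of bits long), the seed is assumed to be a uniformly drawn value coprime to N that is squared once to obtain the initial state, and the names `bbs_bits`, `units` and `initial_states` are hypothetical.

```python
import math

P, Q = 11, 19   # toy primes, both congruent to 3 mod 4 (assumption)
N = P * Q       # toy Blum integer

def bbs_bits(seed, count):
    """Return `count` output bits of a toy x^2 mod N generator."""
    if math.gcd(seed, N) != 1:
        raise ValueError("seed must be coprime to N")
    x = seed * seed % N          # seeding step: square the seed once
    bits = []
    for _ in range(count):
        x = x * x % N            # state update
        bits.append(x & 1)       # output the least significant bit
    return bits

# Every seed coprime to N, and the set of initial states they map to.
units = [s for s in range(1, N) if math.gcd(s, N) == 1]
initial_states = {s * s % N for s in units}

# Squaring is 4-to-1 on the units modulo N, so seeding loses log2(4) bits.
seeding_loss_bits = math.log2(len(units) / len(initial_states))
print(len(units), len(initial_states), seeding_loss_bits)
```

After seeding, squaring permutes the set of quadratic residues modulo a Blum integer, so in this toy model the state update itself loses no further information; the loss is confined to the seeding step.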

Maybe the term entropy is used, more or less by consensus, with a definition departing from information theory. Indeed, NIST documents covering the topic of secret random numbers for cryptography use conflicting definitions surrounding the notion of entropy.

Although my own answer to the stated problem puts into question the Linux "entropy pool" depletion on usage, I do not feel competent to make suggestions. For instance, my note hints that a PRNG algorithm selection should be part of the operating system service definition for /dev/?random offered for cryptographic purposes, but I have just a vague idea of whether and how the open source community might move in this direction.


Entropy is forever ... until a data leak occurs.
A diamond is forever ... until burglars break in.

Regards,

- Thierry Moreau
_______________________________________________
cryptography mailing list
[email protected]
http://lists.randombit.net/mailman/listinfo/cryptography
