Cryptography-Digest Digest #842, Volume #8 Tue, 5 Jan 99 06:13:06 EST
Contents:
Re: On leaving the 56-bit key length limitation ([EMAIL PROTECTED])
----------------------------------------------------------------------------
From: [EMAIL PROTECTED]
Subject: Re: On leaving the 56-bit key length limitation
Date: Tue, 05 Jan 1999 08:24:06 GMT
I would like to answer some private messages which have asked me to
discuss what should be revisited about "unicity distance" and to
quantify my comments on DES and on DES unicity. I will also advance
some topics on unicity, to motivate discussion. Especially, I
demonstrate by concrete examples the usefulness of "open trust" to
increase security -- in contrast to only using "closed trust",
represented by secret keys.
1. THE ORIGINAL CONCEPT OF UNICITY
Shannon [Sha49] defined "unicity distance" (hereafter, "n") as the
least amount of plaintext which can be uniquely deciphered from the
corresponding ciphertext -- given an attacker with unbounded
resources. The "amount" of plaintext (i.e., "n") can be measured in
any units the user may find convenient, such as bits, bytes, letters,
symbols, etc. Shannon himself used "letters" in his paper.
NOTE: As I commented before, "unicity distance" is not a "distance".
It is not a metric function and does not satisfy the intuitive
properties we ascribe to distance. Thus, to reduce confusion, from
now on I will only use the term "unicity".
In a few words, "unicity" is the least message length that can be
uniquely deciphered. As we will see, this number depends on several
factors -- some explicit, most implicit.
2. CALCULATING UNICITY
Under some restrictive assumptions (e.g., "random cipher" as
explained in Section 3 item d), Shannon's mathematical expression for
unicity is given by [Sha49]:
n = H(K)/[|M| - H(M)]
for:
n = unicity; least message length that can be uniquely deciphered
H(K) = entropy of keys
|M| = maximum entropy of message
H(M) = entropy of message
where the entropies are calculated according to the desired units
(bits, bytes, letters, symbols, etc.).
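For concreteness, the formula can be computed directly. The sketch
below is my own illustration (not from any standard library); it also
guards against the zero-denominator case discussed further on:

    # Sketch: Shannon's unicity formula, n = H(K) / (|M| - H(M)).
    # All three entropies must be expressed in the same units.

    def unicity(h_key, max_h_msg, h_msg):
        """Return the unicity: the least message length (in the
        chosen units) that can be uniquely deciphered.

        h_key     -- H(K), entropy of the keys, in bits
        max_h_msg -- |M|, maximum entropy of the message, per unit
        h_msg     -- H(M), actual entropy of the message, per unit
        """
        redundancy = max_h_msg - h_msg
        if redundancy <= 0:
            # The denominator can be zero in some cases of interest;
            # the formula then fails to provide an answer (see below).
            raise ValueError("no redundancy: formula gives no answer")
        return h_key / redundancy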
Please note that it does not matter how the attacker may try to
decipher -- he can of course use brute force and try out all keys, or
he can use short-cuts; he is entirely free to use any method he
desires.
The main point is that if the intercepted message has less than "n"
(bits, bytes, letters, etc.) of plaintext, then it cannot be uniquely
deciphered from an analysis of its ciphertext -- even by trying all
keys. Conversely, any message that is larger than "n" units could
theoretically be broken -- albeit that may require some large
(actually, unspecified) amount of work.
I call attention to the fact that the denominator of the formula
above may be zero in some cases of interest -- currently, at least of
theoretical interest. This means that the formula fails to provide an
answer in such cases. I expect to address this situation in a new and
expanded formulation of unicity, which will follow several aspects to
be advanced below -- especially in regard to the Sections on
identification of unicity, different unicity definitions and "added
trust". The interplay between unicity and trust will also be further
explored in later messages.
3. TWO EXAMPLES OF UNICITY: PRINTABLE AND ENGLISH MESSAGES
To exemplify the concept and also to unravel some subtle points on
unicity, I will use an example in two cases. In both cases, we are
intercepting an enciphered message. In the first case we trust that
the text is in printable ASCII characters, while in the second case
we *also* trust that it is in English. In each case, how much of the
enciphered message do we need to intercept before we can calculate
its key? In case two, will the additional trust make a difference?
In the first case we (the attacker) trust that:
a. the ciphertext is formed by 8-bit characters,
b. the ciphertext has the same length as the plaintext,
c. encryption used a fully random 56-bit key,
d. any decipherment gives a random flat distribution of 8-bit
characters (this is the "random cipher" assumption made by
Shannon, when deriving the unicity formula),
e. the actual plaintext only uses printable ASCII 7-bit characters,
which provides a total of 95 characters (including the space) as
its alphabet,
f. the actual plaintext follows a random flat distribution for its
alphabet characters,
and we ask "How many characters of ciphertext, at a minimum, do we
need to intercept so that we can uniquely decipher the message and
thus discover its key?" Or, equivalently, "How many ciphertext
characters are needed to break the system?"
The answer to this question is the unicity (since the ciphertext has
the same length as the plaintext):
H(K) = log2(2^56) = 56 bits
|M| = log2(2^8) = 8 bits/character
H(M) = log2(95) = 6.57 bits/character
which leads to n = 56/(8 - 6.57) = 39.2 characters. As given in a and
b, the plaintext encoding uses 8-bit characters, i.e., one byte per
character. This means that the system above has a unicity of 39.2
bytes -- the least length of ciphertext that I need to intercept in
order to break it.
In the second case, we add the trust that the plaintext is in
English. What changes? Our trust on f -- which is now:
f'. the actual plaintext contains a distribution of its alphabet
characters that follows the trusted statistics for English,
in its letter frequency, digram frequency, trigram frequency,
word length frequency, etc.
This changes H(M) -- which is now the entropy of English text, a
quantity which has been measured by various authors and lies between
1.0 and 1.5 bits/character; for an example see [Sha49]. I will use
the close-to-average value of 1.2. The calculation is:
H(K) = log2(2^56) = 56 bits
|M| = log2(2^8) = 8 bits/character
H(M) = H(English) = 1.2 bits/character
which leads to n = 56/(8 - 1.2) = 56/6.8 = 8.2 characters. If I
suppose an encoding of one byte per English letter then the system
above has a unicity of 8.2 bytes -- as the least length of ciphertext
that I need to intercept in order to break it.
Thus, the added trust in f' (vis-a-vis f) has *decreased* the unicity
by a factor of approximately 5.
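Both cases are easy to verify numerically. A short sketch, using the
figures assumed above:

    # Sketch: the two unicity calculations of Section 3.
    import math

    H_K = 56.0                      # bits; random 56-bit key
    MAX_H = 8.0                     # bits/character; 8-bit characters

    # Case 1: printable ASCII, flat distribution over 95 symbols.
    h_printable = math.log2(95)     # ~6.57 bits/character
    n1 = H_K / (MAX_H - h_printable)
    print(round(n1, 1))             # 39.2 characters

    # Case 2: English text, with H(M) taken as 1.2 bits/character.
    h_english = 1.2
    n2 = H_K / (MAX_H - h_english)
    print(round(n2, 1))             # 8.2 characters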
4. ADDING COMPRESSION
The question of compression is also interesting. If the plaintext is
compressed before encipherment, then we rightly expect its entropy
*per compressed character* to increase -- even though its entropy
*per English character* does not increase. This is often confusing
and may give the impression that nothing is gained by compression, or
that we need to "hide" the compression algorithm from the attacker.
Here, an example may help clarify the difference, and the benefit of
compression -- albeit 100% known to the attacker. I will use the last
case of the example in Section 3. and include a new trust-point after
f' -- i.e., added information which the attacker can rely upon:
g. the plaintext is compressed by a factor of two (2:1) before
encipherment, by a known algorithm.
The new unicity calculation begins as the former one:
H(K) = log2(2^56) = 56 bits
|M| = log2(2^8) = 8 bits/compressed-character
while for H(M) we now need to transform units:
H(M) = H(English) = 1.2 bits/character (the same as given in 3.)
1 character = 1/2 compressed-characters
H(M) = 1.2*2 = 2.4 bits/compressed-character
which leads to n = 56/(8 - 2.4) = 10 compressed-characters. This
means that I can now transport up to 10*2 = 20 characters of English
when compression is used. If I suppose an encoding of one byte per
English character then the system above has a unicity of 20 bytes --
as the least length of ciphertext that I need to intercept in order
to break it.
Thus, 2:1 compression has *increased* the English-text unicity by
more than a factor of two -- actually, 22% more than the compression
ratio -- vis-a-vis the second case.
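Again, the arithmetic is simple to check. A sketch of the compressed
case, under assumption g:

    # Sketch: unicity under 2:1 compression (Section 4).
    H_K = 56.0                       # bits; random 56-bit key
    MAX_H = 8.0                      # bits per compressed character
    h_english = 1.2                  # bits per English character
    ratio = 2.0                      # 2:1 compression

    h_per_compressed = h_english * ratio   # 2.4 bits/compressed char
    n = H_K / (MAX_H - h_per_compressed)   # 10 compressed characters
    print(n, n * ratio)              # 10.0 -> 20.0 English characters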
5. THE EFFECT OF ADDED OPEN TRUST
It may seem that the more information about the intercepted message
the attacker trusts, the lower the unicity. It seems intuitive that
the more the attacker knows, the less he would need to intercept.
This is, however, provably wrong.
Added open trust -- publicly known to all, including the attacker --
can either increase or decrease unicity, as demonstrated in Sections
3. and 4. above.
Of course, to measure and understand this effect one needs to refer
unicity to the same units of plaintext -- for example, in characters
of English text. Otherwise we would be comparing apples with
speedboats.
There is much more to the subject of "added open trust" than this
brief introduction can contain, when one wants to use it to increase
the security of cipher systems. However, for the moment, it suffices
to say that, demonstrably, added open trust can increase unicity --
even when publicly known to all.
Or, in other words, adequately adding open trust decreases the most
that the cipher system may disclose for a given length of intercepted
messages, when used within its unicity -- notwithstanding the
public's (and the attacker's) full knowledge of such trust.
Of course, the same feature is trivially true for "closed trust" --
i.e., that which is trusted but is not publicly known, such as the
secret symmetric key. This is also the basis for the current WA
(Wassenaar Arrangement) limitations -- which limit the amount of
"closed trust" one may have or require. That this feature, but not
this limitation, applies also to "open trust" is very noteworthy --
open trust is not controllable in that way.
I also wish to remark that the usefulness of unicity is perhaps not
so much as a measure to help "grade" cipher systems (where it may
fail miserably even if used as defined -- see Section 2.); rather, it
seems more interesting as an inverse measure of the most a cipher
system may disclose -- when any number of its cipher messages are
attacked with unlimited computational resources, including time. In
this, open trust was shown to play an essential role.
6. IDENTIFICATION OF UNICITY
Now, I will comment on the identification of unicity, in terms of
Identification Theory [Ger98].
First, I note that it is hard to define what it means "to look valid"
-- we do not know what a valid plaintext is. So, we must expect
difficulties in distinguishing between a false plaintext and the
correct one, since it is axiomatic that we do not know the plaintext.
In other words, we have an identification problem.
Using the Identification Theory outlined in [Ger98], I can consider
the four possible identifications of all possible decryptions,
defined by identification level I-2, as:
D - Distinguished -- when an observer can distinguish one
plaintext.
A - Ambiguous -- when an observer can distinguish different
plaintexts but is uncertain which one to take.
O - Obscure -- when an observer cannot distinguish a plaintext
but can perceive information.
F - Formless -- when an observer cannot detect any form of
information exchange.
The list above illustrates the difficulties posed by the question --
what is it "to look valid"?
It also serves to exemplify the generalized concept of identification
-- "to identify is to look for coherence" -- which allows
identification to be defined *without* a previously required
identity. This removes the circularity implied by the usual
definition (see [Ger98]) and allows an intrinsic treatment of
identification -- as needed here.
To start, I can rephrase Shannon's unicity (hereafter, Unicity-1) as:
"Unicity-1 is the least amount of Distinguished plaintext which
can be deciphered from the corresponding ciphertext -- given
unbounded resources by the attacker"
Now, if we view the above D, A, O, F classification of identification
level I-2 as cardinal numbers ordered in the sequence D > A > O > F,
then Shannon's unicity is the "greatest lower bound" of D -- i.e., as
close as possible to Ambiguous, but still Distinguished.
This immediately allows us to define another unicity, as the "least
upper bound" of Ambiguous:
"Unicity-2 is the maximum amount of Ambiguous plaintext which can be
deciphered from the corresponding ciphertext -- given unbounded
resources by the attacker"
Thus, Unicity-2 gives us the maximum amount of plaintext which can
contain *one* false message -- plus the right (and, a priori unknown)
one.
It is easy to prove that Unicity-1 = Unicity-2 for a random cipher.
This resolves a difference in the literature, by showing that *for
random ciphers* it is equivalent to consider "the least amount of
plaintext that can be deciphered" or "the maximum amount of plaintext
that gives just one expected false decipherment". Since the latter
may be easier to calculate for some systems, one may choose either.
It is also possible (and useful for security) to define Unicity-3
through Unicity-6 by following the DAOF scale downward:
"Unicity-3 is the least amount of Ambiguous plaintext which
can be deciphered from the corresponding ciphertext -- given
unbounded resources by the attacker"
"Unicity-4 is the maximum amount of Obscure plaintext which can be
deciphered from the corresponding ciphertext -- given unbounded
resources by the attacker"
"Unicity-5 is the least amount of Obscure plaintext which
can be deciphered from the corresponding ciphertext -- given
unbounded resources by the attacker"
"Unicity-6 is the maximum amount of Formless plaintext which can be
deciphered from the corresponding ciphertext -- given unbounded
resources by the attacker"
I will leave for future discussions the non-trivial consequences of
such definitions -- except Unicity-5, which will be considered in the
next Section.
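Before moving on, the structure of these six definitions can be made
concrete with a small sketch -- my own illustrative encoding, not a
formalism from [Ger98] -- of the DAOF scale as ordered values, with
each unicity variant recorded as a bound taken at one identification
level:

    # Sketch: the DAOF scale of identification level I-2, ordered
    # D > A > O > F, and the six unicity variants defined above.
    from enum import IntEnum

    class Ident(IntEnum):
        FORMLESS = 0
        OBSCURE = 1
        AMBIGUOUS = 2
        DISTINGUISHED = 3

    # Each unicity is either a least amount (greatest lower bound)
    # or a maximum amount (least upper bound) of plaintext at a
    # given identification level.
    UNICITY = {
        1: ("least",   Ident.DISTINGUISHED),
        2: ("maximum", Ident.AMBIGUOUS),
        3: ("least",   Ident.AMBIGUOUS),
        4: ("maximum", Ident.OBSCURE),
        5: ("least",   Ident.OBSCURE),
        6: ("maximum", Ident.FORMLESS),
    }

    assert Ident.DISTINGUISHED > Ident.AMBIGUOUS > Ident.OBSCURE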
7. THE UNICITY OF DES
Regarding the unicity of DES when one uses compression (the best
case, as seen in Section 4.), different authors quote DES unicity as
16 to 20 bytes of English. For uncompressed English, a value of 8.2
bytes is often quoted. These numbers are indeed easy to obtain. For
example, if I directly use the calculation given in Sections 3. and
4. above, DES unicity would indeed be 8.2 bytes for uncompressed
English and 20 bytes for 2:1 compressed English -- since DES uses
56-bit keys and enciphers bytes.
As another example, quoting from Schneier [Sch98]:
"For a standard English message, the unicity distance is K/6.8,
where K is the key length. (The 6.8 is a measure of the redundancy
of English in ASCII. For other plaintexts it will be more or less,
but not that much more or less.) For DES, the unicity distance is
8.2 bytes."
"This means that if you are trying to brute-force DES you need two
ciphertext blocks. (DES's block length is 8 bytes.) Decrypt the
first ciphertext block with one key after another. If the resulting
plaintext looks like English, then decrypt the second block with the
same key. If the second plaintext block also looks like English,
you've found the correct key."
"The unicity distance grows as the redundancy of the plaintext
shrinks. For compressed files, the redundancy might be 2.5, or
three blocks of DES ciphertext".
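As an aside, the procedure Schneier describes can be sketched in a
few lines. The fragment below is my illustration, assuming the
PyCryptodome DES module and a crude printable-ASCII test for "looks
like English"; actually enumerating all 2^56 keys is, of course, not
shown:

    # Sketch of the quoted brute-force procedure (assumes the
    # PyCryptodome library for the DES primitive).
    from Crypto.Cipher import DES

    def looks_like_english(block):
        # Crude test: every byte is printable ASCII or whitespace.
        return all(32 <= b <= 126 or b in (9, 10, 13) for b in block)

    def brute_force(c1, c2, candidate_keys):
        """c1, c2: two 8-byte ciphertext blocks.
        candidate_keys: iterable of 8-byte DES keys (a real attack
        would enumerate all 2^56 possibilities)."""
        for key in candidate_keys:
            cipher = DES.new(key, DES.MODE_ECB)
            if looks_like_english(cipher.decrypt(c1)):
                # First block passed; confirm with the second block.
                if looks_like_english(cipher.decrypt(c2)):
                    return key
        return None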
However, I will show that DES unicity is actually close to 3 bytes of
English -- and could even be 0 bytes in some systems, such as SSL,
which contains a large fraction of known plaintext.
My calculation is simple to explain and can be summarized as follows
(see the forthcoming paper for the full text):
When one tries to decrypt the encipherment of plaintext with defined
statistics (such as English text or 2:1 compressed English text),
DES has a "yes/no" feature on decryption such that using the wrong
key almost 100% guarantees (due to the complex processes used and
the fact that 56 bits encode blocks of 64 bits) that one obtains
"garbage" as a result. Thus, only the correct key is expected to
reproduce statistics other than "random" from ciphertext -- DES
works as a "hash function" that forces a dimensional reduction and
thus almost guarantees that inverses calculated with a wrong key
will have washed-out statistics. In other words, with DES it is
perhaps much less probable to obtain readable text by decrypting
with the wrong key than it is to obtain readable text when
encrypting readable plaintext -- since ciphertext has less
recognizable structure than plaintext. I suppose that this applies
to a broad range of plaintext statistics, not just English and not
just natural languages.
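This "garbage under a wrong key" behavior is easy to observe
directly. A minimal demonstration, again assuming PyCryptodome (the
keys here are arbitrary examples):

    # Sketch: a wrong DES key washes out the plaintext statistics.
    from Crypto.Cipher import DES

    plaintext = b"ATTACK A"          # one 8-byte block of English
    right_key = b"8bytekey"          # arbitrary example key
    wrong_key = b"8bytekeX"          # differs from right_key

    ciphertext = DES.new(right_key, DES.MODE_ECB).encrypt(plaintext)

    good = DES.new(right_key, DES.MODE_ECB).decrypt(ciphertext)
    bad = DES.new(wrong_key, DES.MODE_ECB).decrypt(ciphertext)

    print(good)   # b'ATTACK A' -- recognizable English
    print(bad)    # almost certainly unprintable "garbage" bytes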
Thus, the first point is that the current unicity formula (as derived
by Shannon) does not apply to DES -- Shannon's "random cipher"
assumption, given in Section 3 item d, breaks down completely for
DES. Of course, this may have been a design goal, since it also aids
decryption attacks in some other aspects and for random plaintext
(both to be commented on in the paper).
This means that for DES one must use the unicity of English alone,
without any key but that provided by the English alphabet itself.
This is a strong reduction from the case with 2^56 keys. In the case
of Unicity-1 (see Section 6) this corresponds to approximately 5
letters of English text -- the least amount of Distinguished English
text that can be uniquely understood, as Shannon exemplified and
discussed [Sha49]. The Unicity-1 of English is thus approximately
5 bytes -- supposing one byte for each English letter, as we have
been using in our comparisons here.
However, there is a second reduction that must be taken into account.
There is no expected plaintext ambiguity when inverting DES to
tentatively decipher a ciphertext -- one observes either
Distinguished garbage or Distinguished readable and valid text.
Thus, to obtain a DES key one does not need the least amount of
"Distinguished readable and valid text" -- perhaps the least amount
of "Obscure readable and valid text" would already do, when compared
with "Obscure garbage". One just needs to identify, with some degree
of success (which can then be confirmed by a second check with more
bits in the same block, or with another block), whether English is
probably present or not.
Therefore, the second question is which unicity concept one should
use with DES -- Unicity-5, as given in Section 6, seems more
appropriate than Unicity-1.
Since DES works on 64-bit blocks, this issue is not so important for
defining the least amount of ciphertext that needs to be intercepted
-- 64 bits is certainly more than enough. However, it is crucial to
the system's security to know how many English characters one is
allowed to transmit under theoretical security -- i.e., such that the
message cannot be broken by a (given and possible) brute-force
attack. Of course, a low unicity also strongly reduces the workload
of the attack, since less ciphertext is needed, and this may lead to
simplifications in the computations.
This needs to be emphasized. Under current technology, 56-bit DES is
ONLY secure if used within its unicity -- which is expected to be
much lower than calculated by Shannon's formula, on two counts: (i)
the cryptographic formula does not apply and the plaintext unicity
must be used; (ii) the plaintext unicity is not Unicity-1 but the
much lower Unicity-5. Also, the attack workload could be much less
than hitherto imagined.
To estimate Unicity-5, I will use the already given values of
H(English) = 1 to 1.5 bits/character and define Unicity-5 = 2^(1.0)
to 2^(1.5) = 2 to 3 characters. I will use the largest value, as a
worst case, to define the detection threshold:
Unicity-5(English) = 3 characters
This corresponds to the least amount of information one expects to
need in order to identify English text, and it thus expresses the
concept of Unicity-5 (see Section 6). Under the same
one-character-per-byte encoding used for the other examples, this
gives 3 bytes.
NOTE: To exemplify how only three characters are already significant
enough to identify English text, the reader can test "the", "cha",
"exe", "how", "ide", "two", "for", "and", "inc", etc.
But it is well known that 3 bytes of English do not really compress
-- since they are already close to the entropy-coding limit (the best
possible) of 2 bytes -- even when using Huffman or Lempel-Ziv
compression.
Thus, the unicity of DES for English text is approximately 3 bytes --
compressed or not. Not 20 bytes. The difference between a name ...
and just the initials.
This also means that if you are trying to brute-force DES you need
just one ciphertext block -- not three, as given in [Sch98]. This
reduces the expected attack cost in time and complexity. I also
suppose that one could use some significant simplifications in the
calculations -- since only 3 of the 8 bytes in the 64-bit input block
are actually needed (i.e., the unicity is all one needs).
Further, and on a different slant: since 56-bit DES can be attacked
by brute force without much expense, as is well known -- and it seems
to be even easier, as given above -- DES well exemplifies the waste
of using a WA-controlled random 56-bit key in order to protect at
most 24 bits (3 bytes x 8 = 24). A simple XOR encoder (a Vernam
cipher) with 56 bits of key would protect 56 bits of English text --
an increase of about 133% over DES.
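For comparison, here is a minimal sketch of such a Vernam encoder (an
illustration; os.urandom stands in for a true random key source):

    # Sketch: a Vernam (XOR) cipher; a 56-bit key protects 56 bits.
    import os

    def vernam(data, key):
        # XOR is its own inverse: the same call enciphers and
        # deciphers, provided the key is at least as long as the data.
        assert len(key) >= len(data)
        return bytes(d ^ k for d, k in zip(data, key))

    key = os.urandom(7)              # 7 bytes = 56 random key bits
    msg = b"attacks"                 # 7 bytes = 56 bits of English
    ct = vernam(msg, key)
    assert vernam(ct, key) == msg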
8. COMMENTS
This message discusses several aspects of unicity, and new unicity
types defined as a function of identification levels taken as
cardinal numbers. It also provides examples and motivates revisiting
the concept of unicity, in order to extend its usefulness as a design
tool.
In particular, the idea that added open trust can increase security
-- even if known to the attacker -- was shown and calculated. This
leads to a new security paradigm and motivates the existence of a
noteworthy new class of systems, for which the more the attacker
knows, the safer the system is.
This message also exemplifies the need to develop better ciphers
(than DES), which can have a larger unicity for English (and other
natural languages) than their own key lengths.
Comments?
Cheers,
Ed Gerck
================================
REFERENCES:
[Sha49] Shannon, C. Communication Theory of Secrecy Systems. Bell
Syst. Tech. J., vol. 28, pp. 656-715, 1949. See also
http://www3.edgenet.net/dcowley/docs.html for readable
scanned images of the complete original paper and Shannon's
definition of "unicity distance" on page 693.
[Ger98] Gerck, E., "Identification Theory", in two parts:
http://www.mcg.org.br/coherence.txt and
http://www.mcg.org.br/coherence2.txt
[Sch98] Schneier B., CRYPTO-GRAM, December 15, 1998, e-mail
newsletter, http://www.counterpane.com
______________________________________________________________________
Dr.rer.nat. E. Gerck [EMAIL PROTECTED]
http://novaware.com.br
--- Meta-Certificate Group member -- http://www.mcg.org.br ---
------------------------------
End of Cryptography-Digest Digest
******************************