Cryptography-Digest Digest #877, Volume #8       Sun, 10 Jan 99 15:13:04 EST

Contents:
  Re: On the Generation of Pseudo-OTP (R. Knauer)
  Re: On the Generation of Pseudo-OTP (R. Knauer)
  Re: SCOTT19U ENTROPY (Horst Ossifrage)
  Re: coNP=NP Made Easier? (Matt Brubeck)
  Re: On the Generation of Pseudo-OTP (R. Knauer)
  Re: Practical True Random Number Generator ("hapticz")
  Re: Help: a logical difficulty (Ken Starks)
  Re: On the Generation of Pseudo-OTP (R. Knauer)

----------------------------------------------------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: On the Generation of Pseudo-OTP
Date: Sun, 10 Jan 1999 18:21:28 GMT
Reply-To: [EMAIL PROTECTED]

On Sat, 09 Jan 1999 16:26:00 -0500, "Trevor Jackson, III"
<[EMAIL PROTECTED]> wrote:

>There are
>algorithmic procedures that introduce zero periodicity and correlation.  I.e.,
>there are algorithms that do not reduce the entropy of the data.

Please name those which are provably secure for purposes of
cryptography.

The problem I have with that statement is that once you say
"algorithmic procedures", I envision finite state machines which can
reproduce the stream any number of times in the future. That means
there is a fixed relationship between one bit and the next, and that
means correlation is present. Even the digit expansion of Pi has
strong correlation, since each digit is calculated, and therefore
there is a precise correlation between successive digits.

If the nth digit of the expansion is f(n), and the (n+1)th is f(n+1),
then the correlation between the two is f(n)*f(n+1). That is a
deterministic relationship that in principle could be discovered and
used to break any cipher based on it.
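
To see the determinism concretely: the same short program regenerates
the digit stream identically on every run. A minimal Python sketch
using Machin's formula (illustrative only; any fixed algorithm makes
the same point):

    def atan_inv(x, prec):
        """Integer arctan(1/x), scaled by 10**prec (Taylor series)."""
        scale = 10 ** prec
        total = term = scale // x
        x2, n, sign = x * x, 3, -1
        while term:
            term //= x2
            total += sign * (term // n)
            n, sign = n + 2, -sign
        return total

    def pi_digits(num):
        """First num decimal digits of pi: pi = 16*atan(1/5) - 4*atan(1/239)."""
        prec = num + 10                    # guard digits against rounding
        pi = 4 * (4 * atan_inv(5, prec) - atan_inv(239, prec))
        return str(pi)[:num]

    # pi_digits(10) -> '3141592653', today and every time hereafter.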

Only if you can come up with an algorithmic procedure which
completely removes all footprints of the original calculation will
you have a provably secure cryptosystem.

>That depends on the quality of the hashing.  There is no fundamental reasons
>why a biased  TRNG cannot be massaged into a good one.

No one seems to have a problem with the known procedures for removing
bias, such as those discussed in RFC1750. It is correlation that is
the sticking point. It is correlation that "leaks information" and
gives the cryptanalyst a means to attack the cipher.
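
For example, here is a minimal Python sketch of the classic von
Neumann de-skewing procedure described in RFC 1750. Note what it does
and does not do: it removes bias from independent bits, but does
nothing about correlation.

    def von_neumann(bits):
        """Examine non-overlapping pairs of bits: output 0 for the pair
        01, output 1 for the pair 10, and discard 00 and 11."""
        out = []
        for i in range(0, len(bits) - 1, 2):
            a, b = bits[i], bits[i + 1]
            if a != b:
                out.append(a)      # a is 0 for "01" and 1 for "10"
        return out

    # von_neumann([1, 1, 0, 1, 1, 0, 0, 0]) -> [0, 1]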

>Given these assumptions, it is certainly possible to use the raw output of a
>flawed TRNG and produce a smaller amount of perfectly entropic (unpredictable)
>data.

But is that stream the same as the output of a TRNG? If you really
believe that it is, please show us how you arrive at that conclusion
in terms of the requirements for a crypto-grade random number
generator.

>This is related to the problem of
>proving randomness, which can only be accomplished in the negative.

That problem is of most fundamental significance, even in number
theory (see Chaitin, op. cit.).

Bob Knauer

"We hold that each man is the best judge of his own interest."
--John Adams


------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: On the Generation of Pseudo-OTP
Date: Sun, 10 Jan 1999 17:10:53 GMT
Reply-To: [EMAIL PROTECTED]

On Sun, 10 Jan 1999 14:25:18 +0100, fungus
<[EMAIL PROTECTED]> wrote:

>> So use the RNG convention.  POTP for the fake and TOTP for the real
>> thing. 

>Whose convention is that?
>A POTP is a "stream cipher".

Finally, there's a manifestation of intelligent life out there.

Bob Knauer

"We hold that each man is the best judge of his own interest."
--John Adams


------------------------------

From: Horst Ossifrage <[EMAIL PROTECTED]>
Subject: Re: SCOTT19U ENTROPY
Date: Sun, 10 Jan 1999 10:02:04 -1000

Tim Redburn wrote:
> 
> I've been looking at the source for
> scott19u and at first glance I can see where
> the value 9205076 could come from.
> 
> Scott19u takes each 19bit word in turn and then
> mod's it with it's position. If you take the entropy of this result
> to simply be the entropy of log<base 2>(position) then
> you get an entropy of 9205076.
> 
> However, when performing  x mod n, unless n is a factor
> of x, then the resulting entropy will be slightly less than
> log<base 2>(n).
> 
> If I've worked it out correctly then the entropy of y mod n, where
> y is a perfectly random number between 0 and x, can be
> calculated as follows :
> 
> assumptions : x is the maximum number that is going to be used -
>                       (in the case of scott19u this
>                        is 524288 ( = 2^19) the value in T19L in the
> 
>                        source)
> 
> a = x mod n
> b = (x - a) / n
> 
> entropy = (n - a) * ((b/x) log<base2> (b/x))
>                  + a * (((b+1)/x) log<base2>((b+1)/x))
> 
> using this formula in the code below, I get a value of 9187574 bits
> for the entropy of the key before being used for the generation
> of the S-Box, so I beleive that (assuming my maths is correct) this
> value is a maximum for the S-Box entropy.
> 
> Please note that I have only checked as far as the key setup -
> i.e. what happens to the key before it is used to generate the
> permutation. I have not yet checked if the generation of the
> permutation removes any more entropy.

When I wrote David's documentation page, I wrote the following
description of the S-Box generation method for scott16u.zip, which has
a DECREASING MOD operation, so the modulus starts at 2^16 and
decreases linearly to mod 1:


Here is how the S-Box is formed for scott16u.zip:

Step 1: Create a memory array of 64K words (addresses i from 0 to FFFF
hex). Call this array List1[i]. The values initially stored in these
locations are simply 0, 1, 2, 3, ..., 9, a, b, c, d, e, f, 10, 11, 12,
... etc. up to FFFF hex. Each of these values will be selected only
once by the algorithm, and the value will be put in a location of the
S-Box memory array that is chosen by the Rules defined below. After a
value from location i is put in the S-Box, location i is written with
a new value, again according to the Rules defined below.

Step 2: Use the keyraw.key file, with the current location pointed to
by j. This keyraw.key file may have fewer than 64K words. Call each
word key[j]. Start at location j=0 and use the value at that location,
according to the Rules defined below, to make an entry in the S-Box.
Then j is incremented through the whole keyraw.key file, wrapping
around as many times as needed to finish making the S-Box.

Step 3: Set x=1. Take the key value at j=0, add 1 to it, mod
((2^16)-1-x), and put that number in S-Box location 0. Place key[j]+1
in List1[S-Box[j]].

Step 4: Set x=2. Take the key value at j=1, add 1+j to it, mod
((2^16)-1-x), and place that value in S-Box[S-Box[j-1]]. Place
key[j]+1 in List1[S-Box[j]].

Step 5: Set x=3. Take the key value at j=2, add 1+j to it, mod
((2^16)-1-x), and place that value in S-Box[S-Box[j-1]]. Place
key[j]+1 in List1[S-Box[j]].
.
.
.
Step Y: Increment x and j. Take key[j], add 1+j to it, mod
((2^16)-1-x), and place that value in S-Box[S-Box[j-1]]. Place
key[j]+1 in List1[S-Box[j]].

=============================================

So you see the line where it has

    mod ((2^16)-1-x)

It is a decreasing modulus: it decreases to near zero for the last
entries of the S-Box.

This reduces the entropy for scott19u.zip. Half of the S-Box entries
use less than mod 2^10 and half use more than mod 2^9. This linearly
changing modulus can be used to re-calculate an entropy ESTIMATE.
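
A hedged sketch of such an estimate in Python (an upper bound only: it
ignores the mod-induced non-uniformity that Tim Redburn's formula
accounts for, and the loop range is my reading of the description
above):

    import math

    # Each step reduces a word mod (2^16 - 1 - x), so that step can
    # contribute at most log2 of the current modulus.
    T16 = 1 << 16
    bound = sum(math.log2(T16 - 1 - x) for x in range(1, T16 - 2))
    print(bound)                   # upper bound, in bits, for the S-Box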

Horst Ossifrage

------------------------------

From: [EMAIL PROTECTED] (Matt Brubeck)
Crossposted-To: sci.math,comp.theory
Subject: Re: coNP=NP Made Easier?
Date: Sun, 10 Jan 1999 11:00:59 -0800

[EMAIL PROTECTED] (Lasse Reichstein Nielsen) wrote:

>> Interesting. I have used virtual NDTMs for solving real life problems.
>> In fact, I have written (and use) LEX and GREP type programs for
>> pattern matching which implement NDTMs.

> I think you might be confuzing some things here. Regular expressions
> are usually recognized by Finite Automatons(FAs), and nondeterministic
> FAs(NFAs) are no more powerfull than FAs, so you can translate NFAs to
> FAs, that can be implemented "easily". Ofcourse TMs are MORE powerful
> than FAs (as language acceptors), so you could implement LEX and GREP
> with TM's, but why bother?

This is correct. Regexp matching is usually implemented with a
nondeterministic finite automaton (NFA) model, not with NDTMs. And the
implementation usually involves a "translation" from the NFA to a
deterministic FA.
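
For illustration, here is a minimal subset-construction sketch in
Python (not the code LEX or GREP actually uses). The NFA is given as a
dict mapping state -> symbol -> set of states, with '' marking epsilon
moves; each DFA state is a frozenset of NFA states.

    from collections import deque

    def epsilon_closure(nfa, states):
        """All states reachable from `states` via epsilon ('') moves."""
        stack, closure = list(states), set(states)
        while stack:
            s = stack.pop()
            for t in nfa.get(s, {}).get('', set()):
                if t not in closure:
                    closure.add(t)
                    stack.append(t)
        return frozenset(closure)

    def nfa_to_dfa(nfa, start, alphabet):
        """Standard subset construction: DFA states are sets of NFA states."""
        start_set = epsilon_closure(nfa, {start})
        dfa, queue = {}, deque([start_set])
        while queue:
            cur = queue.popleft()
            if cur in dfa:
                continue
            dfa[cur] = {}
            for sym in alphabet:
                nxt = set()
                for s in cur:
                    nxt |= nfa.get(s, {}).get(sym, set())
                nxt = epsilon_closure(nfa, nxt)
                dfa[cur][sym] = nxt
                queue.append(nxt)
        return start_set, dfa

    # NFA for (a|b)*b : state 0 is the start, state 1 accepts.
    start, dfa = nfa_to_dfa({0: {'a': {0}, 'b': {0, 1}}}, 0, 'ab')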

I also wasn't going to bring up issues such as virtual NDTMs, since
they would obscure my main point, which was that Turing machines,
whether deterministic or nondeterministic, are theoretical constructs.
The original poster seemed a bit confused about the terminology and
concepts, so I wanted to make it as clear as possible.

Mr. McCarty pointed out that we can implement NDTM-like behavior on
real computing machines. However, my main point still holds: once an
NDTM is made deterministic, we can't guarantee the same run-time
bounds.

------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: On the Generation of Pseudo-OTP
Date: Sun, 10 Jan 1999 19:00:26 GMT
Reply-To: [EMAIL PROTECTED]

On Sat, 9 Jan 1999 21:26:27 +0000, [EMAIL PROTECTED] (Paul L.
Allen) wrote:

>I'd be looking for a *lot* more indicators of malfunction than just all ones
>or all zeros.

I agree - if the designer failed to account for such possibilities.

The assumption, however, is that the designer is a cryptographer who
makes provision for every failure mode he can anticipate. That leaves
two failure modes he cannot handle - an open output and a shorted
output.

>Open output into high-Z inputs without pull-ups is going to
>pick up all sorts of radiated crap and will probably look random.

It is assumed that the designer is well aware of that and takes
adequate measures to prevent it. Even the crudest of digital circuits
has pullups to make sure an open output does not float.

>I'd also
>be looking for bad screening resulting in power-line and/or compuer digital
>signal pick-up by internal stages.

Likewise it is assumed that the designer adequately shielded the
device or otherwise designed it to be immune to such ambient noise,
even if that means he has to use battery power. Anyway, a physical
process such as timing differences in radioactive decay is not all
that sensitive to electronic noise, so that is one way around the
problem.

>Something as simple as all zeros or all
>ones is just one of many failure modes.  An integrating A-D converter that
>fails could conceivable produce nice ramps.  Or aperiodic samples of a
>sawtooth that looks random.

It is assumed that the designer has taken all that into account by
doing a careful audit of his design.

>From what I could make out of your post, it looked like he used
>compressibility as one of his tests.

Chaitin does not specifically mention text compression techniques -
that was my doing, to get that topic opened for discussion. Chaitin
only discusses algorithmic reduction in general terms.

>For dealing with output from a TRNG you're just
>fooling yourself - make the sample a little longer and your compression
>table would be less efficient than leaving the random data alone.

I fully agree. And as stated by posters here a year ago, other forms
of algorithmic reduction like hashing suffer from similar problems.

>A million zeroes is possible, valid output from a TRNG.  I'd suspect
>the TRNG of being broken, but the problem is where you draw the line -
>10, 100, 1,000, a million, a googolplex?

You would set it at some realistic value which takes into account the
size of your output buffer. I would imagine that if you looked at
20,000 bits and they were all 0s or all 1s, you would have sufficient
reason to shut the TRNG off and find the malfunction, especially if
such sequences were designated as alarms of malfunction in the TRNG.

FIPS 140-1 discusses sequence lengths for testing. See the section
"Statistical random number generator tests" in:

http://csrc.ncsl.nist.gov/fips/fips1401.htm
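
For example, two of the FIPS 140-1 tests on a single 20,000-bit sample
are easy to sketch in Python (the thresholds below are my reading of
the document - check it before relying on them; the poker and runs
tests are omitted):

    def fips_monobit(bits):
        """FIPS 140-1 monobit test: pass if the count of ones in the
        20,000-bit sample lies strictly between 9654 and 10346."""
        assert len(bits) == 20000
        return 9654 < sum(bits) < 10346

    def fips_long_run(bits):
        """FIPS 140-1 long-run test: fail if any run of identical bits
        reaches length 34."""
        run, prev = 0, None
        for b in bits:
            run = run + 1 if b == prev else 1
            prev = b
            if run >= 34:
                return False
        return True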

>> I do not believe I meant to imply anything like that. My use of the
>> term "unlikely" is consistent with the meaning I intended -
>> vanishingly small probability of occurance.

>Nope.

Nope what?

>Either you're saying that you're going to decide your TRNG whatever
>million-digit sequence it produces (because they're all vanishingly
>unlikely) or you meant to say that you considered a million zeroes
>less likely than any other million-digit sequence.  Either view
>is wrong.  The million-zero sequence is no more or less likely than
>any other sequence of the same length.  Calling it an "unlikely sequence"
>is misleading and incorrect in the context you use it.

The choice of setting the alarm with all 0s or all 1s is based on a
design analysis. Those two bit sequences are special in that they
represent two common malfunctions of digital circuits. I am not trying
to say that they are special in any other way.

>Many people here don't have a clue.

So what? That does not change the fact that we knew what we meant by
the term "unlikely" a year ago in our discussions, because we spelled
out in clear examples what we meant.

One person even went so far as to use the term "unlucky" to mean the
same thing. If your TRNG outputs a sequence like 101010...10, you are
one "unlucky" cryptographer, since such a sequence is highly unlikely.

Chaitin does some simple calculations on the unlikelihood of certain
sequences in terms of algorithmic reduction. If you are willing to
accept N-10 bits as the greatest level of reduction, then 999 out of
1,000 sequences will be of greater complexity than that. For large N,
that means that for all practical purposes, the probability of any
sequence that is less complex is vanishingly small, i.e., extremely
unlikely.
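
The counting behind a figure like that takes one line. Programs
shorter than N-10 bits number at most

    \sum_{k=0}^{N-11} 2^k \;=\; 2^{N-10} - 1 \;<\; 2^{N-10},

so the fraction of the 2^N sequences of length N that can be reduced
to fewer than N-10 bits is less than 2^{N-10}/2^N = 2^{-10}, about 1
in 1024 - consistent with the round "999 out of 1,000" figure above.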

>> If the run of all 0s of any practical length were eliminated it would
>> not bias the TRNG in any pracitcal way.

>Of course it would.  You bugger up the runs property of the TRNG.  It
>is no longer statistically random.

The emphasis is on the word "practical". Eliminating two specific
sequences, all 0s and all 1s, is not going to change the total
security of your TRNG-based OTP system for any practical length
cipher.

>You hope.  It does impose a statistical bias on the generator.  Probably
>not enough to cause a significant problem if you set your limit on
>all-ones or all-zeroes high enough.  You hope.

I would like to see exactly how much reduction in security you are
claiming for large N. There are undoubtedly other factors in a
practical TRNG which would cause far more insecurity than filtering
out just those two sequences.

>> And, as mentioned several times in this discussion, you cannot filter
>> any other biased sequences or you WILL mess up the TRNG.

>You can't filter *any* sequence or you mess up the TRNG.

In principle, yes - but in practice I do not see how filtering out two
pathological sequences from 2^N others is going to cause any
*measurable* loss of security for large N.
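
To put a number on "vanishingly small" (my own back-of-envelope bound,
not from the thread): removing the two pathological pads leaves
2^N - 2 equiprobable pads, and the cryptanalyst's information gain
over the uniform case is

    \log_2 \frac{2^N}{2^N - 2} \;\approx\; \frac{2}{2^N \ln 2} \text{ bits},

which for a 20,000-bit pad is on the order of 2^{-20000} bits -
immeasurably small by any standard.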

So what if the cryptanalyst knows that two ciphertexts are not
possible - the one identical to the actual message (for the key
000...0) and the one that is its bitwise inverse (for the key
111...1)?

What yoyo would send the actual plaintext message or its inverse as
his ciphertext anyway? <jeez>

>> I think we are in complete agreement on all issues here

>I don't think so.

Then you are just nitpicking, without offering anything of substance.

>II'd say it's a possible indicator of brokenness.  There's no way of
>filtering them out without compromising the statistical properties of your
>TRNG to some extent.

Yep, but the extent is vanishingly small - and irrelevant on top of
that. Filtering out ciphertexts that are identical to the actual
plaintext message or its inverse is hardly giving the cryptanalyst an
advantage.

Bob Knauer

"We hold that each man is the best judge of his own interest."
--John Adams


------------------------------

From: "hapticz" <[EMAIL PROTECTED]>
Subject: Re: Practical True Random Number Generator
Date: Sun, 10 Jan 1999 14:09:20 -0500

The common element here, then, is "time" - or in other words, "the
repetitive application of a stable constant against a man-made
perception".

Rather than make something random, remove the time element entirely,
thus removing any periodic parameters. Is this feasible?

--
best regards
[EMAIL PROTECTED]





------------------------------

From: [EMAIL PROTECTED] (Ken Starks)
Crossposted-To: sci.math
Subject: Re: Help: a logical difficulty
Date: Sun, 10 Jan 1999 12:28:59 GMT

[EMAIL PROTECTED] (Dik T. Winter) wrote:

>> 

> Depends on the language I would say.  In our phonebook you would follow
> your friend, sorry.  But dictionary order is also a specific Computer
> Science term and means first ordering by first element, next by the
> second element.  Even the collating sequence for the elements is
> unspecified.

I would side with Mike on this. Computer scientists should serve the
wider community, not dictate to it. If this makes their sorting
algorithms a bit harder, so be it.

One of the first programs I wrote included proper sorting for Mc and
Mac according to Mike's scheme, I think. It offered two choices: Mc
and Mac both put in the M's where Mc would go; or Mc and Mac put in a
separate section of their own, as if a separate letter of the alphabet
between L and M.
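
A minimal Python sketch of a collation key along those lines (my own
illustration, not the original program): build a sort key under which
Mc and Mac names collate together among the M's.

    def phonebook_key(name: str) -> str:
        """Sort key that treats 'Mc' as if it were spelled 'Mac'."""
        n = name.lower()
        return "mac" + n[2:] if n.startswith("mc") else n

    # sorted(["Mabbott", "McDonald", "MacDonald", "Mzz"], key=phonebook_key)
    # -> ['Mabbott', 'McDonald', 'MacDonald', 'Mzz']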

The English and Scottish telephone directories at the time differed in
that respect, if I remember correctly. (Locally to me, in Scotland, I
note that the British Telecom directory has reverted to the English
method, but a locally published index uses the traditional Scottish
ordering.)


Ken,     __O 
       _-\<,_ 
      (_)/ (_)                                                                 
    Virtuale Saluton.


------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: On the Generation of Pseudo-OTP
Date: Sun, 10 Jan 1999 17:58:07 GMT
Reply-To: [EMAIL PROTECTED]

On Sat, 09 Jan 1999 16:07:25 -0500, "Trevor Jackson, III"
<[EMAIL PROTECTED]> wrote:

>> It can, on a fundamental basis. Once you grasp the fundamental
>> difference between a strong stream cipher and an OTP you will see why
>> that is the case.

>This is a strong statement.  Can you describe the distinction you see
>between the concepts?

I will try. See below, after I finish commenting on the rest of your
post.

> As far as I can tell the latter is a subset of the
>former; a OTP is a strong stream cipher.

Not only is it "strong", it is provably secure. That makes it more
than just "strong": the fact that you can formally establish its
security makes it secure for certain, which is more than can be said
of any merely "strong" cipher.

>The reverse is completely wrong
>though, some stream ciphers being totally unrelated to the OTP concept.

All other stream ciphers are fundamentally different from the OTP
system, because they are not provably secure.

>You seem to be making the case than an OTP is a distinct entity from all
>other forms of stream cipher.  On what is this distinction based?

Here is my thinking, which developed from those thousands of posts a
year ago when this discussion was raging in its full-blown glory.

Consider two extremes - the OTP and the Book Cipher. The former is
produced by taking the output stream of a TRNG, and the latter by
taking the unadulterated output of a source of ordinary text, like a
book. Each stream is XORed with a given plaintext. Let's imagine that
the plaintext is known by the cryptanalyst.

Since the cryptanalyst knows the plaintext, he can XOR it against the
two ciphertexts to recover the two keys. Now here is where the
fundamental difference between the OTP and the Book Cipher emerges.
The key K1 from the OTP cipher will be random and the key K2 from the
Book Cipher will not: K2 will be intelligible text, whereas K1 will
not. For the Book Cipher, all candidate plaintexts which result in
unintelligible keys are excluded a priori. This narrows down the
possibilities such that there can be only one intelligible message
contained in the ciphertext - the one that is intended to be the
message.

With the OTP system, all possible messages of length N are "contained"
in a given ciphertext, including all possible intelligible messages.
Each possible message, including the intelligible ones, is
equiprobable - that is what makes the OTP system provably secure:
the cryptanalyst has no rationale to pick any one particular
intelligible message over another. "Attack at dawn" is just as likely
to be the intended message as "Attack at dusk".
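
A minimal Python sketch of that equiprobability argument (the messages
and the 14-byte pad are my own illustrative choices):

    import os

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    pad = os.urandom(14)                   # stand-in for TRNG output
    ct = xor(b"attack at dawn", pad)

    # For ANY candidate plaintext of the same length there exists a key
    # consistent with the ciphertext, so the ciphertext "contains" both:
    for guess in (b"attack at dawn", b"attack at dusk"):
        key = xor(guess, ct)
        assert xor(key, ct) == guess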

[NB: To make matters worse for the cryptanalyst, you would always
pre-encrypt your message, making *any* of the possible plaintexts your
message, even unintelligible ones.]

This is in contrast to any other stream cipher, where only one
intelligible message is possible, because the key was not produced by
a TRNG. The limited keyspace of any stream not produced by a TRNG
means that it is exceedingly unlikely (with vanishingly small
probability as N increases) that any intelligible message other than
the one you intended would be present in the cipher - that is, would
be an allowed plaintext.

IOW, in the extreme case of a Book Cipher, there is only one
intelligible message that has a corresponding intelligible key. That
means that breaking such a cipher is simple in principle - all the
cryptanalyst has to do is search for intelligible messages and, when
one comes up, see if the key is also intelligible. If so, then he has
found your intended message.
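
A toy version of that search in Python (the printable-letters test is
a crude stand-in for "intelligible", purely my own illustration):

    def crib_drag(ct: bytes, crib: bytes):
        """Slide a guessed plaintext along the ciphertext; keep offsets
        where the implied key looks like English text."""
        hits = []
        for i in range(len(ct) - len(crib) + 1):
            key = bytes(c ^ p for c, p in zip(ct[i:i + len(crib)], crib))
            if all(65 <= k <= 90 or 97 <= k <= 122 or k == 32 for k in key):
                hits.append((i, key))
        return hits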

In between the OTP on one extreme and the Book Cipher on the other
there is a grey zone, as you go from totally secure with the
TRNG-based OTP system to totally insecure with the Book Cipher. But
regardless of how close any particular stream cipher gets to the total
security of the OTP system, it is still not provably secure.

Here's the fundamental difference you asked me to discuss: it is
possible that a particular stream cipher looks extremely secure, but
unless it is *provably secure*, it is not an OTP cipher. The emphasis
is on *provably secure*.

Based on the considerations of people like Greg Chaitin, it is not
possible to prove the randomness of a number by formal procedures.
Only a number produced by a TRNG can be proved to be random, and
therefore suitable for the OTP. Therefore, no matter how you try, you
will never be able to prove the total security of a stream cipher
unless it is the OTP system. There is always the possibility, however
small it may seem at the time, that a non-OTP stream cipher can be
broken in principle. And even in practical terms too - since quantum
computers may one day be able to analyze non-OTP ciphers and ferret
out the intended message.

So, as long as you use anything other than an OTP, you are using a
stream cipher that is not provably secure, no matter how practically
secure you may think it is. That is the fundamental difference between
the two - a difference that goes to the heart of algorithmic
information theory, like Godel's incompleteness theorem and Turing's
halting problem - IOW, a difference that goes to the heart of number
theory itself.

Bob Knauer

"We hold that each man is the best judge of his own interest."
--John Adams


------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
