Cryptography-Digest Digest #616, Volume #12       Tue, 5 Sep 00 14:13:01 EDT

Contents:
  Re: R: R: R: R: R: R: R: Test on pseudorandom number generator. (Terry Ritter)
  Re: RSA Patent. (Roger Schlafly)
  Re: RSA Patent. (Rich Wales)
  Re: For those working on the next RSA factoring challenge... (Jerry Coffin)
  Re: Blum Blum Shub question (Mok-Kong Shen)
  Re: Validating problems (Mike Rosing)
  Re: Secret Journal ("CMan")
  Truncating SHA-1 digest to 128 bits of size (Sami Mäkinen)
  Re: RSA Patent. (Roger Schlafly)

----------------------------------------------------------------------------

From: [EMAIL PROTECTED] (Terry Ritter)
Subject: Re: R: R: R: R: R: R: R: Test on pseudorandom number generator.
Date: Tue, 05 Sep 2000 17:03:20 GMT


On Mon, 4 Sep 2000 18:20:33 +0200, in <8p0go3$e2s$[EMAIL PROTECTED]>,
in sci.crypt "Cristiano" <[EMAIL PROTECTED]> wrote:

>[...]
>I'm not able to choose a PRNG by viewing 234 p-values from Diehard, about
>50-100 p-values from autocorrelation, and other p-values from 6 other
>tests. So I am trying to collect all the p-values into a human-readable
>summary.

It is not useful to consider all data with the same weight.  What we
are looking for are repeatable violations of the expected
distribution.  When we have a function to compute p-values, the
distribution of p-values we expect is flat.  In reality, of course,
the measured value will be sometimes high, sometimes low; we want to
assure ourselves that it is not *usually* high or low, and not by an
*unusual* amount.  
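As a concrete sketch of what "flat" means, one can run a single test
many times and bin the resulting p-values; the bin counts should come
out roughly equal.  This is illustrative Python only (the monobit
frequency test in its erfc form), not any particular test package:

    import math, random

    def monobit_pvalue(bits):
        # Frequency test: S is the +/-1 sum of the bits.  For random
        # data, erfc(|S| / sqrt(2n)) is a p-value that should be
        # uniform on [0, 1).
        s = sum(1 if b else -1 for b in bits)
        return math.erfc(abs(s) / math.sqrt(2.0 * len(bits)))

    rng = random.Random(1)
    bins = [0] * 10
    for _ in range(500):
        bits = [rng.getrandbits(1) for _ in range(10000)]
        bins[min(int(monobit_pvalue(bits) * 10), 9)] += 1
    print(bins)   # expect roughly 50 per bin from a good generator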

We are looking to the data to reveal patterns, since repeatable
patterns should not exist.  Consequently, we fasten on the indications
that patterns might have been found, and then use other trials to
confirm or refute those possibilities.  If we have a large or small
value but can't repeat it after several tries, we generally ignore it,
unless one of the values is so excessive as to require detailed
investigation into how we got that lucky.  

>[...]
>Now I'll be more precise. URAND passes very well: Frequency (p= .95489), 

Actually, that's high enough to wonder about.  A value that high
should occur about 1 time in 20:  Are we just that lucky?

>Serial
>(p= .78908), Poker (p= .39018) and Runs (p= .91267); it fails Autocorrelation
>for distances of 32 (p= 0) 

That seems like a pretty clear result, but does it make sense?  I
don't know.  I would not be surprised to see a RNG fail this test, but
from my remote location, I see no reason why we would have such strong
correlation at this particular distance (and no shorter).  Could there
be something in the implementation which repeats at a period of 32,
instead of the expected large period?

If tests could verify randomness independent of the structure of the
data, we might just apply the tests and take a yes or no answer.  But
tests cannot verify randomness, they do depend upon structure (the
amount of state sampled from each RNG step), they can only expose
particular patterns, and only do that in a probabilistic sense.
Accordingly, statistical analysis is not a spectator sport.  We get
indications, then follow them up until we have answers.  


>and 96 (p=.00035) bits, Maurer (p= 1.4641e-17) and
>Diehard (KS test of 234 p-values= 1, 50 p-values=0 and 74=1). These p-values
>are only mean values.

Since we know something is wrong with the sequence, it should be no
surprise that the K-S p-values are extreme.  Before we start thinking
about K-S being bad, we need a *good* sequence, with *good* results,
and *bad* K-S values.  


>>[...]
>> There is a great deal more to understand before we can even accept
>> that Maurer is not confused by that trick.  For example, we would like
>> to know whether Maurer produces the same result for various levels of
>> trickiness; and if not, why not?  Surely taking a byte each from 4 RNG
>> steps must in some sense appear "more random" than not doing so.
>>
>> When we look at the details of Maurer, we find that it does not do
>> well exploring full 32-bit values.  I don't know what you are doing,
>> but if you are applying Maurer on bytes, while applying the other
>> tests on 32-bit values, the results simply are not comparable.  No
>> wonder the "trick" of taking a byte from each step does not confuse
>> Maurer!  If the other tests were applied on bytes, they might do as
>> well or better.  There is a great deal more that it is necessary to
>> know than one apparent success.
>
>When I generate a sequence, it is the same for all tests. Autocorrelation
>works by finding weaknesses at a distance of x bits; Poker and Maurer work
>by considering blocks of x bits, and so on.

The Maurer test can partition data into various lengths.  If the test
only deals in byte values, partitioning a 32-bit value composed of a
byte from each of 4 sequential RNG steps is good analysis.  But using
those same 4 byte values as one 32-bit value in other tests is not
good analysis.  So even though the sequence is the same, if it is not
coordinated with the testing, the results can have little meaning.  
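As a sketch of coordinated testing, in illustrative Python: derive the
byte view and the 32-bit view from one fixed partitioning, so that
byte-level tests such as Maurer with L=8 and word-level tests analyze
exactly the same data:

    import struct

    def byte_view(words):
        # Each 32-bit RNG output becomes its 4 bytes, in fixed order,
        # for the byte-oriented tests.
        out = bytearray()
        for w in words:
            out += struct.pack(">I", w & 0xFFFFFFFF)
        return bytes(out)

    def word_view(data):
        # The inverse partitioning, for the 32-bit tests.
        return [struct.unpack(">I", data[i:i+4])[0]
                for i in range(0, len(data) - 3, 4)]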

>What you say seems to be a disquisition more theoretical than practical.

My view is practical reality, based on tests which I have personally
designed, implemented, executed, interpreted and published over the
last decade.  This is not theory.  


>I'm not a mathematician, I'm a programmer.
>What it is important for me is to have tools to help me in my job. Now my
>job is to find a PRNG and a cryptographically secure PRNG.
>So, if Maurer can help me (even if only in some cases) to find weaknesses,
>why not use it? 

Using or not using the Maurer test is not the issue.  But when Maurer
picks up something the others do not, my first thought is that
something is strange about the test environment.  


>It is not expensive; Diehard is very bad in this sense.
>If somebody one day proves that Maurer's test is not good for pseudorandom
>sequences, my only concern will be to remove the few lines of code for
>Maurer's test.

That is not my issue.  


>> >[...]
>> >What you say seems to be what is in HAC.
>>
>> And where would that be, exactly?
>
>Page 182:
>"(iv) Runs test
>The purpose of the runs test is to determine whether the number of runs (of
>either zeros or
>ones; see Definition 5.26) of various lengths in the sequence s is as
>expected for a random
>sequence.".

First of all, the runs test described in HAC only works on bits, not
larger values.  In contrast, Knuth II, section 3.3.2 describes the Run
test (runs up / runs down) which is valid for large integers and
reals.  I have addressed the distribution for small integers.  The
Diehard implementation is in fact runs up / runs down, based on 32-bit
integers, and not the HAC definition.  
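As a sketch, counting "runs up" over integers or reals looks like this
(illustrative Python; the full Knuth test then weights these counts
with his covariance matrix before forming a statistic):

    def runs_up_counts(seq, max_len=6):
        # Tally maximal ascending runs; runs of length >= max_len
        # share the last bucket, as in Knuth II, 3.3.2.
        counts = [0] * max_len
        run = 1
        for prev, cur in zip(seq, seq[1:]):
            if cur > prev:
                run += 1
            else:
                counts[min(run, max_len) - 1] += 1
                run = 1
        counts[min(run, max_len) - 1] += 1
        return counts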

Restricting runs to bits only is a serious loss.  Many RNG's produce
integer values, not individual bits, so analyzing the values as bit
sequences may not have much meaning.  

Next, the experimental process described in HAC (which is essentially
a chi-square comparison) in no way encourages the user to *look* at
the data in any way which is meaningful.  It delivers a single
statistical result, which, of course, is inherently probabilistic: it
may not represent the general reality.  Looking at that value does not
encourage identification of problems, nor experimentation to confirm
problems or their solutions, because a single value doesn't
distinguish among places to start.  


>> >[...]
>> >(?)  Do you think that a modulus of 1024 bits it is small?
>>
>> For what?  As I understand things, 1024 bits is too small for the
>> theoretical guarantees to kick in.
>
>(Sorry I don't understand "kick in")
>You said: "Perhaps.  But a BB&S generator with small N is insecure.".
>My N (modulus) is a 1024 bits number. So, what do you mean?

I mean that 1024 bits may not be large enough to gain the theoretical
proof properties of BB&S.  


>> >> And a BB&S
>> >> generator with an error in implementation is not really BB&S at all.
>> >> Is a BB&S generator which does not function as one might expect also a
>> >> BB&S generator?
>> >
>> >Sure, but its implementation is very simple!
>>
>> If indeed the implementation is so simple, why do you report problems
>> in the sequence which it generates?  Why is there a need to "de-skew"
>> with SHA-1?  I would find it ironic indeed if BB&S produced biased
>> sequences.
>
>I have already answered this:
>"The de-skew is not needed. I only found that the sequences are a little
>better by de-skewing them; that is, the tests give a better result (not
>always). Probably in my final version I'll use "standard" BBS (without any
>"trick")."

But that is no answer, because if the sequences are *any* better after
de-skewing, I would still say that something about the implementation
or the testing is wrong.  I would not expect to be able to find any
repeatable statistical patterns in a properly-operating and
properly-tested BB&S implementation.  

Consequently, I see this as an indication of trouble that needs to be
tracked down.  Maybe that's the way BB&S works, but I doubt it.
Indications of this sort often reveal problems in the test fixture.


>
>> >If the bits to hash are 2, 10 or 20 bits of a BBS output I think this is
>> >not a big problem.
>> >If I take more bits I violate the strength of BBS if somebody can examine
>> >the output of BBS. But if I hash its output nobody can learn about BBS.
>> >I don't use the whole BBS state because the result is very bad.
>>
>> Really?  That would be an interesting result.  If BB&S generates a
>> poor sequence when the whole state is used (that is, when treated like
>> a normal statistical RNG), the main effect of the required few-bit
>> sampling may be to hide this situation from analysis.
>
>This is a delicate question. I'll try to explain it.
>
>Only for test purposes, I use a BBS modulus of no more than 32 bits, so that
>x*x will be no more than 64 bits (my compiler can handle 64 bits in an
>efficient way). In my tests I have seen that a small n doesn't affect the
>results (n is big only for security reasons). That is, by using a 512-bit n
>or more, the tests give about the same results.

Actually, that claim is known to be false with respect to the
probability of short cycles.  We just discussed that on sci.crypt with
considerable enthusiasm over a period of weeks.  


>I choose two primes p and q, each congruent to 3 modulo 4, such that the
>following numbers are all prime: (p-1)/2, (p-3)/4, (q-1)/2, (q-3)/4. For
>example I can get: p= 0xDAE7, q= 0xFA07, n= 0xD5CB9251 and x0= 0x1674FA67.

BB&S is unlike the usual statistical RNG in that it does not consist
of one long cycle of values.  On the contrary, BB&S has multiple
cycles of different length.  That makes BB&S initialization (that is,
the initial value of x, x0) an issue.  On very small implementations,
the probability of using a short cycle cannot be dismissed.  It
probably is necessary to try various values for x0.  
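A sketch of such a probe, in illustrative Python, using the toy
parameters quoted above (if the cycle found is short, that x0 is
unusable):

    def bbs_cycle_length(x0, n, limit=1 << 22):
        # Iterate x <- x^2 mod n until a state repeats; the gap
        # between the two visits is the length of the cycle entered.
        seen = {}
        x = (x0 * x0) % n
        for i in range(limit):
            if x in seen:
                return i - seen[x]
            seen[x] = i
            x = (x * x) % n
        return None   # no repeat found within the limit

    print(bbs_cycle_length(0x1674FA67, 0xD5CB9251))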


>To obtain the sequence to test I collect each bit of x in a buffer (the
>sequence). When I have collected all bits of x, I recompute (x*x)%n and
>continue the collection.

That is wrong:  You should be doing the modular reduction before
collecting data.  Many or most RNG's require modular reduction, and
that should always be fully completed before data are taken.  
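In other words (a sketch in Python, not the poster's code):

    def bbs_state_bits(x0, n, steps):
        # Square and fully reduce mod n *first*; only then read the
        # bits of the state into the test sequence.
        bits = []
        x = x0 % n
        for _ in range(steps):
            x = (x * x) % n
            bits.extend((x >> i) & 1 for i in range(n.bit_length()))
        return bits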


>In this way many tests fail. For example, with the numbers above, these are
>the results (X² is chi-square):
>
>Frequency:  X²= 12.5         P= 0.041%
>Serial:     X²= 12.5054      P= 0.193%
>Poker:      X²= 37.6896      P= 0.1%
>Runs test:  X²= 23.741581    P= 9.53%
>Autocorrelation is good for distances from 1 up to 100 bits.
>Maurer for L=8: Xu= 7.168423   std. dev.= 0.000372   X²= 40.941485
>P= 1.9845e-15%
>Diehard: KS test of p-values= 1
>Histogram of the 234 Diehard p-values (bin upper edge, count):
>0.05   14
>0.15   20
>0.25   26
>0.35   13
>0.45   19
>0.55   21
>0.65   23
>0.75   21
>0.85   20
>0.95   34
>1.05   23 (only p-values= 1)
>
>total p-values= 234
>Mean= 0.592714011151981
>Min=  0.00929386331507939
>Max=  1
>
>It is important for me to emphasize that the BBS generator used in this way
>*always* fails many tests.


>Now I'd like to make an observation.
>If we use a PRNG which has a modulus of the form 2^m (n= 0x10, 0x100,
>0x1000 and so on), the mean of its output will be an (m-1)-bit number
>(2^m/2) and all bits have the same probability of being 0 or 1.
>But if we have a modulus like 2^m-x (e.g. 0x12345678), the bits are not
>equally distributed.

It is very common for many RNG's to not have a 2^m modulus, in which
case the integer results are biased for usage based on bit-width.  One
way to address this is to convert integer range 0..N-1 to floating
0.0..0.999999 (etc.) by dividing each result by N.  

Another way is to ignore values above 2^(m-1) and just step until we
get an in-range value.  In some cases this may hinder analysis, but we
can't expect to use standard 32-bit statistical tests on an irregular
range.  We might well convert a "32-bit" RNG with bias to 31-bit
range, then multiply by 2 so as to make the significant bits highest.
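Both remedies as sketches, in illustrative Python (next_value stands
in for one step of whatever RNG is under test):

    def to_unit_float(value, N):
        # Map the integer range 0..N-1 onto [0, 1) by dividing by N.
        return value / float(N)

    def to_31_bit_range(next_value):
        # Rejection: discard outputs >= 2^31, then double the
        # survivor so the significant bits sit highest.
        while True:
            v = next_value()
            if v < (1 << 31):
                return v * 2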

In many cases, the bias is small, and tests will not find it.  What
that really shows is the limitations of the tests.  


>I have found the same problem with the FIPS 186 generator (please see page
>174 of HAC). When in step 4.3 of algorithm 5.12 we compute ai= G(t,zi) mod q
>(q is a prime number), the bits are not equally distributed.

Not having a 2^m modulus is very common.  


>For that reason I think BBS is "only" a bit generator and cannot be used
>as a number generator; that is, we can take only a few bits from its state.

We *can* analyze the full state value, we just have to treat it
correctly.  

---
Terry Ritter   [EMAIL PROTECTED]   http://www.io.com/~ritter/
Crypto Glossary   http://www.io.com/~ritter/GLOSSARY.HTM


------------------------------

From: Roger Schlafly <[EMAIL PROTECTED]>
Subject: Re: RSA Patent.
Date: Tue, 05 Sep 2000 10:06:33 -0700

DJohn37050 wrote:
> P. 2 of the Doc that Roger mentions says anyone can use the term "RSA
> algorithm" but "RSA" should not be used in a way that MIGHT be confused with a
> product of RSA Security.

Sure, but one should not use the term to create confusion with
products of IBM, Microsoft, or anyone else. The caveat says
nothing.

------------------------------

From: [EMAIL PROTECTED] (Rich Wales)
Crossposted-To: talk.politics.crypto
Subject: Re: RSA Patent.
Date: 5 Sep 2000 17:03:36 -0000

Earlier, I wrote:

        > > My understanding is that R, S, and A publicized the
        > > details of their algorithm before applying for a
        > > patent, so that the US government couldn't impose
        > > a secrecy order on their invention . . . .

Paul Rubin replied:

        > This is a myth.  Adi Shamir . . . came right out and
        > said that they weren't thinking about patents at the
        > time they published.

Thanks for the correction.

I assume, then, that a more accurate statement would be that, when they
did decide to file, they discovered that they couldn't get a patent in
any country other than the US because of the prior publication?  Also,
that the US gov't people who might otherwise have slapped a secrecy
order on the RSA algorithm realized there was no point in doing so?

Also, would you have any insights as to why it took so unusually long
(almost six years) for the RSA patent to be issued?

Rich Wales         [EMAIL PROTECTED]         http://www.webcom.com/richw/
PGP 2.6+ key generated 2000-08-26; all previous encryption keys REVOKED.
RSA, 2048 bits, ID 0xFDF8FC65, print 2A67F410 0C740867 3EF13F41 528512FA

------------------------------

From: Jerry Coffin <[EMAIL PROTECTED]>
Subject: Re: For those working on the next RSA factoring challenge...
Date: Tue, 5 Sep 2000 11:25:01 -0600

In article <[EMAIL PROTECTED]>, ed-no@spam-
eepatents.com says...
> For those working on the next RSA factoring challenge...
> 
> "Cray supercomputer for sale on eBay"

[ ... ]

> But wouldn't a network of about 10-20 PIII-1000 MHz PC's do just as well,
> for a fraction of the cost?

No -- the matrix reduction phase of a GNFS doesn't distribute well at 
all.  To do the job at all efficiently, you need a single machine 
with enough RAM to hold the entire matrix.  In theory a Pentium III 
can address 64 GB of physical RAM through PAE, but in reality most 
motherboards will only hold 768 MB or so.  Worse, bandwidth to that 
memory is likely to be one of the major bottlenecks; the Cray uses 
SRAM for all of its main memory, giving high bandwidth even with the 
unpredictable memory usage pattern of matrix reduction.  PCs reduce 
costs partly by using SDRAM instead, which won't give nearly the 
bandwidth in this situation.

Loosely coupled parallel machines can do some jobs extremely well, 
but this job doesn't fall into that category.  There are factoring 
algorithms that can be distributed, but they're all sufficiently less 
efficient that a Cray running the GNFS would normally be faster for 
the sizes of numbers you'd think of for an RSA key.

-- 
    Later,
    Jerry.

The Universe is a figment of its own imagination.

------------------------------

From: Mok-Kong Shen <[EMAIL PROTECTED]>
Subject: Re: Blum Blum Shub question
Date: Tue, 05 Sep 2000 19:47:38 +0200



Mark Wooding wrote:
> 
> Mok-Kong Shen <[EMAIL PROTECTED]> wrote:
> 
> > I suppose that I have not stated my point clearly. I meant that, with
> > the same data with which the recipient can get the key to decipher,
> > the opponent can do the same. (They have exactly the same stuff
> > available, don't they?)
> 
> No.  The recipient has the factors of the modulus, which enables him to
> compute square roots.

So these factors are a shared secret of the communication
partners. But that fact couldn't be gleaned from your
original post at all, if I am not mistaken.
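For reference, the trapdoor in question: with p and q both congruent
to 3 mod 4, knowing the factors lets the recipient take square roots
mod n = p*q directly.  A sketch in Python:

    def sqrt_mod_blum(c, p, q):
        # Root of the residue c mod each prime, then CRT-combine.
        rp = pow(c, (p + 1) // 4, p)
        rq = pow(c, (q + 1) // 4, q)
        q_inv = pow(q, p - 2, p)    # inverse of q mod p; p is prime
        return (rq + q * ((rp - rq) * q_inv % p)) % (p * q)

This returns one of the four roots; the BBS construction wants the
one that is itself a quadratic residue.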

M. K. Shen

------------------------------

From: Mike Rosing <[EMAIL PROTECTED]>
Subject: Re: Validating problems
Date: Tue, 05 Sep 2000 07:33:37 -0500

Miki Watts wrote:
> 
> Hi!
> I don't know if this is the right place, but I couldn't find anything
> better.
> Here's my problem/question:
> How can I authenticate that Bob is Bob, by replying to e-mail, and not
> Mallory who is tapping the line for incoming mail?
> In other words, is there some sort of authentication scheme that works over
> e-mail?

You need at least one out of band exchange, otherwise Mallory wins.
Use phone lines (voice or fax) to send a PGP fingerprint.  Then when
you get a signature, you can authenticate it.  A face-to-face meeting
is optimal, but not always possible.  If you don't go outside of the
net, you don't know if Mallory substituted anything the first time.
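A sketch of the fingerprint idea in Python (illustrative only; real
PGP defines its own exact fingerprint encoding):

    import hashlib

    def fingerprint(pubkey_bytes):
        # Short, human-readable digest of the public key material,
        # suitable for reading aloud over the phone.
        h = hashlib.sha1(pubkey_bytes).hexdigest().upper()
        return " ".join(h[i:i+4] for i in range(0, len(h), 4))

Both sides compute this locally and compare over the voice line; only
after the fingerprints match is a signature from that key worth
trusting.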

Patience, persistence, truth,
Dr. mike

------------------------------

From: "CMan" <[EMAIL PROTECTED]>
Subject: Re: Secret Journal
Date: Tue, 5 Sep 2000 10:49:10 -0700

No wonder of it: sheer plod makes plough down sillion
Shine, and blue-bleak embers, ah my dear,
Fall, gall themselves, and gash gold-vermilion.

JK
--
CRAK Software
http://www.crak.com
Password Recovery Software
QuickBooks, Quicken, Access...More
Spam bait (credit E. Needham):
 root@localhost
 postmaster@localhost
 admin@localhost
 abuse@localhost
 webmaster@localhost
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]

"Melinda Harris" <[EMAIL PROTECTED]> wrote in message
news:e2Qs5.17267$[EMAIL PROTECTED]...
> Have any cryptographers, hackers, or computer programmers reviewed and
> analyzed the Secret Journal disclosure? I need all the responses I can get
> regarding this unprecedented virus.
> EIA
> Encryption Intelligence Agency


------------------------------

Subject: Truncating SHA-1 digest to 128 bits of size
From: [EMAIL PROTECTED] (Sami Mäkinen)
Date: Tue, 05 Sep 2000 18:04:27 GMT

I need a 128-bit hash from a source. Could anybody suggest 
a better method for truncating a SHA-1 digest to 128 bits 
than taking the digest as 32-bit words A, B, C, D, E and 
simply, for example, XORing D and E? If there is something to 
comment on in this method, is there an obviously better 
one, or should I just give up on SHA-1 and use some other hash 
algorithm with a 128-bit digest?
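For concreteness, the two variants side by side, as a sketch in
Python (hashlib assumed; neither is being endorsed here):

    import hashlib

    def sha1_128_truncated(data):
        # Plain truncation: keep the first 128 bits of the digest.
        return hashlib.sha1(data).digest()[:16]

    def sha1_128_folded(data):
        # The folding described above: 32-bit words A..E, with
        # D xor E replacing the last word.
        d = hashlib.sha1(data).digest()
        return d[:12] + bytes(x ^ y for x, y in zip(d[12:16], d[16:20]))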


Thanks,

Sami Mäkinen (email: [EMAIL PROTECTED])

------------------------------

From: Roger Schlafly <[EMAIL PROTECTED]>
Crossposted-To: talk.politics.crypto
Subject: Re: RSA Patent.
Date: Tue, 05 Sep 2000 11:04:14 -0700

Rich Wales wrote:
> I assume, then, that a more accurate statement would be that, when they
> did decide to file, they discovered that they couldn't get a patent in
> any country other than the US because of the prior publication?  Also,

Yes.

> that the US gov't people who might otherwise have slapped a secrecy
> order on the RSA algorithm realized there was no point in doing so?

Yes, it would have been pointless, but not because of any RSA
publication. The much more crucial theory and algorithm had
already been published by Diffie and Hellman. Diffie-Hellman
would have been perfectly acceptable to everyone if RSA had
never been invented. Others would have published RSA anyway.
Hellman-Pohlig and Rabin each independently discovered cryptosystems
that were nearly identical to RSA, and RSA would have soon become
obvious to everyone.

> Also, would you have any insights as to why it took so unusually long
> (almost six years) for the RSA patent to be issued?

Yes, 2 reasons: (1) A 1981 Supreme Court decision upholding a
software patent caused a change of thinking at the PTO towards
software patents; and (2) There was an interference with a
Hellman patent application that claimed essentially the same
thing but with a prime modulus.

You can find some more info about the legalities of the RSA patent
at my web site:
http://bbs.cruzio.com/~schlafly/pkp/pkp.htm

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
