Cryptography-Digest Digest #939, Volume #8       Wed, 20 Jan 99 23:13:03 EST

Contents:
  Random numbers for C: End, at last? (George Marsaglia)
  Re: Metaphysics Of Randomness (R. Knauer)
  Re: General (but detailed enough) description of RC4... (Doug Stell)
  Re: Pentium III... (Brad Aisa)
  Re: Java speed vs 'C' (was Re: New Twofish Source Code Available) 
([EMAIL PROTECTED])

----------------------------------------------------------------------------

Crossposted-To: 
sci.stat.math,sci.math,sci.math.num-analysis,sci.physics.research,comp.os.msdos.djgpp
From: George Marsaglia <[EMAIL PROTECTED]>
Subject: Random numbers for C: End, at last?
Date: Thu, 21 Jan 1999 03:08:52 GMT

My offer of RNG's for C was an invitation to dance;
I did not expect the Tarantella.  I hope this post will
stop the music, or at least slow it to a stately dance
for language chauvinists and software police---under
a different heading.

In response to a number of requests for good RNG's in
C, and mindful of the desirability of having a variety
of methods readily available, I offered several. They
were implemented as in-line functions using the #define
feature of C.

Numerous responses have led to improvements; the result
is the listing below, with comments describing the
generators.

I thank all the experts who contributed suggestions, either
directly to me or as part of the numerous threads.

It seems necessary to use a (circular) table in order
to get extremely long periods for some RNG's. Each new
number is some combination of the previous r numbers, kept
in the circular table.  The circular table has to keep
at least the last r, but possibly more than r, numbers.

For speed, an  8-bit index seems best for accessing
members of the table---at least for Fortran, where an
8-bit integer is readily  available via integer*1, and
arithmetic on the index is automatically mod 256
(least-absolute-residue).

Having little experience with C, I got out my little
(but BIG) Kernighan and Ritchie book to see if there
were an 8-bit integer type. I found none, but I did
find char and unsigned char: one byte. Furthermore, K&R
said arithmetic on characters was ok. That, and a study
of the #define examples, led me to propose #define's
for in-line generators LFIB4 and SWB, with monster
periods. But it turned out that char arithmetic jumps
"out of character", other than for simple cases such as
c++ or c+=1.   So, for safety, the index arithmetic
below is kept in character by the UC definition.
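Here is a small illustrative fragment (not part of the generator
listing below; the uc_demo name is arbitrary) showing both points:
assignment to an unsigned char wraps mod 256, which is what makes a
256-entry table circular for free, while an expression such as c+58
is computed as an int and so needs the cast to stay in range.

void uc_demo(void)                        /* illustration only */
{  unsigned char c = 250;
   int raw     = c + 58;                  /* 308: c is promoted to int   */
   int wrapped = (unsigned char)(c + 58); /* 52, that is 308 mod 256     */
   c = c + 10;                            /* 4: assignment wraps mod 256 */
}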

Another improvement on the original version  takes
advantage of the comma operator, which, to my chagrin,
I had not seen in K&R.  It is there, but only with an
example of (expression,expression).  From the advice of
contributors, I found that the comma operator allows
(expression,...,expression,expression) with the
last expression determining the value.   That makes it
much easier to create in-line functions via #define
(see SHR3, LFIB4, SWB and FIB below).
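A tiny illustration of that behaviour (not part of the listing below;
the names are arbitrary): the operands are evaluated left to right,
and the value of the whole parenthesized expression is the last one.

void comma_demo(void)
{  int p=1, q=2, v;
   v = (p+=10, q+=20, p+q);   /* p is 11, q is 22, and v gets 33 */
}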

The improved #define's are listed below, with a
function to initialize the table and a main program
that calls each of the in-line functions one million
times and then  compares the result to what I got with
a DOS version of gcc.   That main program can serve
as a test to see if your system produces the same
results as mine.
   _________________________________________
  |If you run the program below, your output|
  | should be  seven lines, each a 0 (zero).|
   -----------------------------------------

Some readers of the threads are not much interested
in the philosophical aspects of computer languages,
but want to know: what is the use of this stuff?
Here are simple examples of the use of the in-line
functions:  Include the #define's in your program, with
the accompanying static variable declarations, and a
procedure, such as the example, for initializing
the static variable (seeds) and the table.

Then any one of those in-line functions, inserted
in a C expression, will provide a random 32-bit
integer, or a random float if UNI or VNI is used.
For example, KISS&255; would provide a random byte,
while 5.+2.*UNI; would provide a random real (float)
from 5 to 7. Or 1+MWC%10; would provide the
proverbial "take a number from 1 to 10"
(with probabilities not quite, but virtually, equal).
More generally, something such as 1+KISS%n; would
provide a practical uniform random choice from 1 to n,
if n is not too big.
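Put together, a minimal sketch of such a program fragment (the demo
function and the value of n are only for illustration; it assumes the
#define's, the static seed variables and the settable() procedure of
the listing below are already in your file):

void demo(void)
{  unsigned long byte; float u; int pick, choice, n=52;
   settable(12345,65435,34221,12345,9983651,95746118);
   byte   = KISS&255;       /* a random byte                    */
   u      = 5.+2.*UNI;      /* a random float from 5 to 7       */
   pick   = 1+MWC%10;       /* "take a number from 1 to 10"     */
   choice = 1+KISS%n;       /* near-uniform choice from 1 to n  */
}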

A key point is: a wide variety of very fast, high-
quality, easy-to-use RNG's are available by means of
the nine in-line functions below, used individually or
in combination.

The comments after the main test program describe the
generators. These descriptions are much as in the first
post, for those who missed them. Some of the
generators (KISS, MWC, LFIB4) seem to pass all tests of
randomness, particularly the DIEHARD battery of tests,
and combining virtually any two or more of them should
provide fast, reliable, long period generators. (CONG
or FIB alone and CONG+FIB are suspect, but quite useful
in combinations.)
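For example, a combination drops into an expression exactly like a
single generator does (a sketch only; the function name is arbitrary,
the #define's and the UL typedef of the listing below are assumed to
be in scope, and settable() must have been called first):

void combo_demo(void)
{  UL k, m;
   k = KISS + LFIB4;        /* combined generator, period about 2^410   */
   m = SWB  + CONG;         /* one of the combinations suggested below  */
}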

Serious users of random numbers may want to
run their simulations with several different
generators, to see if they get consistent results.
That should be easy to do.
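One convenient way to arrange that (a sketch; the RNG and URNG names
and the little pi estimator are only illustrative): route every draw
through a single macro, then change that one line and re-run.

#define RNG   KISS                  /* swap in MWC, SWB, LFIB4, (KISS+SWB), ... */
#define URNG  (RNG*2.328306e-10)    /* uniform in (0,1), as UNI does for KISS   */
double estimate_pi(long trials)     /* assumes settable() has been called       */
{  long i, hits=0; double u,v;
   for(i=0;i<trials;i++){ u=URNG; v=URNG; if(u*u+v*v<1.) hits++; }
   return 4.*hits/trials;
}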
Bonne chance,
George Marsaglia

The C code follows---------------------------------:

#include <stdio.h>
#define znew   (z=36969*(z&65535)+(z>>16))
#define wnew   (w=18000*(w&65535)+(w>>16))
#define MWC    ((znew<<16)+wnew )
#define SHR3  (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define FIB   ((b=a+b),(a=b-a))
#define KISS  ((MWC^CONG)+SHR3)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB   (c++,bro=(x<y),t[c]=(x=t[UC(c+34)])-(y=t[UC(c+19)]+bro))
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
#define UC    (unsigned char)  /*a cast operation*/
typedef unsigned long UL;

/*  Global static variables: */
 static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
 static UL a=224466889, b=7584631, t[256];
/* Use random seeds to reset z,w,jsr,jcong,a,b, and the table t[256]*/

 static UL x=0,y=0,bro; static unsigned char c=0;

/* Example procedure to set the table, using KISS: */
 void settable(UL i1,UL i2,UL i3,UL i4,UL i5, UL i6)
 { int i; z=i1; w=i2; jsr=i3; jcong=i4; a=i5; b=i6;
 for(i=0;i<256;i=i+1)  t[i]=KISS;
 }

/* This is a test main program.  It should compile and print 7  0's. */
int main(void){
int i; UL k;
settable(12345,65435,34221,12345,9983651,95746118);

for(i=1;i<1000001;i++){k=LFIB4;} printf("%lu\n", k-1064612766U);
for(i=1;i<1000001;i++){k=SWB  ;} printf("%lu\n", k- 627749721U);
for(i=1;i<1000001;i++){k=KISS ;} printf("%lu\n", k-1372460312U);
for(i=1;i<1000001;i++){k=CONG ;} printf("%lu\n", k-1529210297U);
for(i=1;i<1000001;i++){k=SHR3 ;} printf("%lu\n", k-2642725982U);
for(i=1;i<1000001;i++){k=MWC  ;} printf("%lu\n", k- 904977562U);
for(i=1;i<1000001;i++){k=FIB  ;} printf("%lu\n", k-3519793928U);
return 0;
}
/*-----------------------------------------------------
   Write your own calling program and try one or more of
   the above, singly or in combination, when you run a
   simulation. You may want to change the simple 1-letter
   names, to avoid conflict with your own choices.        */

/* All that follows is comment, mostly from the initial
   post. You may want to remove it */

/* Any one of KISS, MWC, FIB, LFIB4, SWB, SHR3, or CONG
   can be used in an expression to provide a random 32-bit
   integer.

   The KISS generator, (Keep It Simple Stupid), is
   designed to combine the two multiply-with-carry
   generators in MWC with the 3-shift register SHR3 and
   the congruential generator CONG, using addition and
   exclusive-or. Period about 2^123.
   It is one of my favorite generators.

   The  MWC generator concatenates two 16-bit multiply-
   with-carry generators, x(n)=36969x(n-1)+carry,
   y(n)=18000y(n-1)+carry  mod 2^16, has period about
   2^60 and seems to pass all tests of randomness. A
   favorite stand-alone generator---faster than KISS,
   which contains it.

   FIB is the classical Fibonacci sequence
   x(n)=x(n-1)+x(n-2),but taken modulo 2^32.
   Its period is 3*2^31 if one of its two seeds is odd
   and not 1 mod 8. It has little worth as a RNG by
   itself, but provides a simple and fast component for
   use in combination generators.

   SHR3 is a 3-shift-register generator with period
   2^32-1. It uses y(n)=y(n-1)(I+L^17)(I+R^13)(I+L^5),
   with the y's viewed as binary vectors, L the 32x32
   binary matrix that shifts a vector left 1, and R its
   transpose.  SHR3 seems to pass all tests except those
   related to the binary rank test, since 32 successive
   values, as binary vectors, must be linearly
   independent, while 32 successive truly random 32-bit
   integers, viewed as binary vectors, will be linearly
   independent only about 29% of the time.

   CONG is a congruential generator with the widely used 69069
   multiplier: x(n)=69069x(n-1)+1234567.  It has period
   2^32. The leading half of its 32 bits seem to pass
   tests, but bits in the last half are too regular.

   LFIB4 is an extension of what I have previously
   defined as a lagged Fibonacci generator:
   x(n)=x(n-r) op x(n-s), with the x's in a finite
   set over which there is a binary operation op, such
   as +,- on integers mod 2^32, * on odd such integers,
   exclusive-or(xor) on binary vectors. Except for
   those using multiplication, lagged Fibonacci
   generators fail various tests of randomness, unless
   the lags are very long. (See SWB below).
   To see if more than two lags would serve to overcome
   the problems of 2-lag generators using +,- or xor, I
   have developed the 4-lag generator LFIB4 using
   addition: x(n)=x(n-256)+x(n-179)+x(n-119)+x(n-55)
   mod 2^32. Its period is 2^31*(2^256-1), about 2^287,
   and it seems to pass all tests---in particular,
   those of the kind for which 2-lag generators using
   +,-,xor seem to fail.  For even more confidence in
   its suitability,  LFIB4 can be combined with KISS,
   with a resulting period of about 2^410: just use
   (KISS+LFIB4) in any C expression.

   SWB is a subtract-with-borrow generator that I
   developed to give a simple method for producing
   extremely long periods:
      x(n)=x(n-222)-x(n-237)- borrow mod 2^32.
   The 'borrow' is 0, or set to 1 if computing x(n-1)
   caused overflow in 32-bit integer arithmetic. This
   generator has a very long period, 2^7098(2^480-1),
   about 2^7578.   It seems to pass all tests of
   randomness, except for the Birthday Spacings test,
   which it fails badly, as do all lagged Fibonacci
   generators using +,- or xor. I would suggest
   combining SWB with KISS, MWC, SHR3, or CONG.
   KISS+SWB has period >2^7700 and is highly
   recommended.
   Subtract-with-borrow has the same local behaviour
   as lagged Fibonacci using +,-,xor---the borrow
   merely provides a much longer period.
   SWB fails the birthday spacings test, as do all
   lagged Fibonacci and other generators that merely
   combine two previous values by means of +,- or xor.
   Those failures are for a particular case: m=512
   birthdays in a year of n=2^24 days. There are
   choices of m and n for which lags >1000 will also
   fail the test.  A reasonable precaution is to always
   combine a 2-lag Fibonacci or SWB generator with
   another kind of generator, unless the generator uses
   *, for which a very satisfactory sequence of odd
   32-bit integers results.

   The classical Fibonacci sequence mod 2^32 from FIB
   fails several tests.  It is not suitable for use by
   itself, but is quite suitable for combining with
   other generators.

   The last half of the bits of CONG are too regular,
   and it fails tests for which those bits play a
   significant role. CONG+FIB will also have too much
   regularity in trailing bits, as each does. But keep
   in mind that it is a rare application for which
   the trailing bits play a significant role.  CONG
   is one of the most widely used generators of the
   last 30 years, as it was the system generator for
   VAX and was incorporated in several popular
   software packages, all seemingly without complaint.

   Finally, because many simulations call for uniform
   random variables in 0<x<1 or -1<x<1, I use #define
   statements that permit inclusion of such variates
   directly in expressions:  using UNI will provide a
   uniform random real (float) in (0,1), while VNI will
   provide one in (-1,1).

   All of these: MWC, SHR3, CONG, KISS, LFIB4, SWB, FIB,
   UNI and VNI, permit direct insertion of the desired
   random quantity into an expression, avoiding the
   time and space costs of a function call. I call
   these in-line-define functions.  To use them, static
   variables z,w,jsr,jcong,a and b should be assigned
   seed values other than their initial values.  If
   LFIB4 or SWB are used, the static table t[256] must
   be initialized.

   A note on timing:  It is difficult to provide exact
   time costs for inclusion of one of these in-line-
   define functions in an expression.  Times may differ
   widely for different compilers, as the C operations
   may be deeply nested and tricky. I suggest these
   rough comparisons, based on averaging ten runs of a
   routine that is essentially a long loop:
   for(i=1;i<10000000;i++) L=KISS; then with KISS
   replaced with SHR3, CONG,... or KISS+SWB, etc. The
   times on my home PC, a Pentium 300MHz, in nanoseconds:
   FIB 49;LFIB4 77;SWB 80;CONG 80;SHR3 84;MWC 93;KISS 157;
   VNI 417;UNI 450;
 */




------------------------------

From: [EMAIL PROTECTED] (R. Knauer)
Subject: Re: Metaphysics Of Randomness
Date: Thu, 21 Jan 1999 02:30:47 GMT
Reply-To: [EMAIL PROTECTED]

On Wed, 20 Jan 1999 16:22:05 -0500, "Trevor Jackson, III"
<[EMAIL PROTECTED]> wrote:

>This is getting off topic, but in searching for the "root of
>randomness" you may want to follow up some concepts from the late
>'70s in which (then) modern cosmology was recast by a theorist
>(whose name I am searching for) into a theory of interaction events
>in which the only exchange was that of information.  The
>reformulation had a number of interesting insights and provided
>simplification of (then) modern particle theory.
>
>One of the most interesting aspects of "cosmological information
>theory", for want of the original name, was that, like all
>predictive theories, it contained a number of critical constants.
>The entrancing aspect was that all the constants could be taken as
>one (1).

>I don't think the theory went very far.  It may be one of those
>things we cannot appreciate for another 100 years.

I recall reading his book a long time ago. He had all sorts of
esoteric equations that dealt with information transfer only.

The problem with that theory is that we know that information is only
(possibly) one of several so-called "fundamentals" like mass and
energy.

But who knows - you are never a prophet in your own time.

Bob Knauer

"A man with his heart in his profession imagines and finds
resources where the worthless and lazy despair."
--Frederic the Great, in instructions to his Generals


------------------------------

From: [EMAIL PROTECTED] (Doug Stell)
Subject: Re: General (but detailed enough) description of RC4...
Date: 21 Jan 1999 03:32:50 GMT

On Mon, 18 Jan 1999 16:55:05 GMT, [EMAIL PROTECTED] (Doug Stell)
wrote:

>> - key size
>
>Variable, up to 2048 octets. Any key you provide is simply used
>repeatedly until 2048 octets are input to the key setup function.

Oops, I goofed. That is 2048 bits, 256 octets.
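For readers who want to see where that comes from, here is a sketch
of the key-setup loop as it appears in the widely circulated
alleged-RC4 code (the function and variable names are illustrative):
the supplied key bytes are reused cyclically while the 256-entry
state table is permuted, so at most 256 octets of key ever matter.

static unsigned char S[256];

void rc4_setup(const unsigned char *key, int keylen) /* keylen in octets, 1..256 */
{  int i, j = 0; unsigned char tmp;
   for (i = 0; i < 256; i++) S[i] = (unsigned char)i;
   for (i = 0; i < 256; i++) {
      j = (j + S[i] + key[i % keylen]) & 255;        /* key repeats cyclically   */
      tmp = S[i]; S[i] = S[j]; S[j] = tmp;
   }
}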

doug



------------------------------

From: Brad Aisa <[EMAIL PROTECTED]>
Subject: Re: Pentium III...
Date: Wed, 20 Jan 1999 22:39:30 -0500

fungus wrote:
> 
> Intel has announced that the Pentium III will have a built in hardware
> random number generator, and individual serial number on each chip.

I don't quite understand how a unique serial number in the chip is
supposed to be helpful for anything cryptographic.

...and if the chip dies? 

...and if you switch between computers?

__
Brad Aisa

------------------------------

From: [EMAIL PROTECTED]
Subject: Re: Java speed vs 'C' (was Re: New Twofish Source Code Available)
Date: Thu, 21 Jan 1999 02:36:48 GMT

In article <[EMAIL PROTECTED]>,
  "Trevor Jackson, III" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
>
> > In article <[EMAIL PROTECTED]>,
> >   [EMAIL PROTECTED] (Ian Miller) wrote:
> > > In article <781vl4$qrv$[EMAIL PROTECTED]>,
> > > [EMAIL PROTECTED] wrote:
> > > > Actually this is not true. Good assembly language is damn hard to beat.
> > > In the short term, only.
> > >
> > > >I have seen old code written on old univacs fly circles around the
> > > >newer compiled stuff even though the newer univacs have a larger
> > > >instruction set.
> > > This is entirely predictable if the compiler code generators have not been
> > > brought up to date.  Once they have been, the old assembly code will lose
> > > out.
> > >
> > > >Also, with the newer compilers the designers have not
> > > >given much thought to good design.
> > > On this I disagree radically.  The compiler design is now a very well
> > > understood art, and modern processors are designed (among other criteria)
> > > to be easy to optimise for.  The best modern compilers generate first rate
> > > code.
> > >
> >
> >   Actually I can give a modern example: I was using a real key
> > for my original scott16u but I never could get the C code to
> > come within a factor of ten of the speed of good assembly.
> >  The best compilers do not come close to good assembly if
> > one wants speed, period.
> >
> >  C is nice and portable but it still is a dog compared to
> > assembly code.
>
> The effect you describe is real, but restricted to Intel architectures.  The
> Intel instruction set is so bad that one major language vendor (starts with an
> M) actually embedded a p-code interpreter in their C compiler's runtime.  The
> compiler translated the source into p-code which was interpreted at run-time.
>
> Case in point: 11 generations after the original, the Intel instruction set
> still does not support position-independent code.
>
> Computers that were designed instead of "jest growed" are heavily optimized for
> good compilation space and time efficiency.
>
> If you've never worked on anything but Intel you've been damaged by that
> experience.  Dijkstra wrote a famous paper about this kind of thing in the '70s
> called "GoTo considered harmful".
>

 Actually I have worked assembly on many systems. I picked Beta
over VHS and was wrong there. I liked the 68000 instruction set,
so I bought an AMIGA, and was wrong there. The best stuff in theory
does not always win in the marketplace. Inventing a better mouse
trap is nothing. Coming up with a better Bull Shit sales line is
everything. Compilers will never beat humans for speed, since the
compiler can never really know what we humans intend to do with
the code, so there are certain compromises the compiler can't
make. But if we ever get an intelligent machine, programming will
instantly become so much better it will be like night and day.
 I hope I am still here when machines finally become self aware.

David A. Scott

http://cryptography.org/cgi-bin/crypto.cgi/Misc/scott19u.zip
http://members.xoom.com/ecil/index.htm

============= Posted via Deja News, The Discussion Network ============
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and sci.crypt) via:

    Internet: [EMAIL PROTECTED]

End of Cryptography-Digest Digest
******************************
