Hi Jim, The discussion following my most recent patch has been a rehash of things we've already covered, or complaints (Andreas') about existing coreutils code which I just moved from one place to another. I still think what I sent is ready to apply. The other issues can be addressed in separate patches if necessary.
Best regards, Frederik On Thu, Dec 01, 2005 at 03:32:09AM +0000, Frederik Eaton wrote: > On Wed, Nov 30, 2005 at 11:55:29AM +0100, Jim Meyering wrote: > > Frederik Eaton <[EMAIL PROTECTED]> wrote: > > > On Tue, Nov 29, 2005 at 07:31:45AM +0100, Jim Meyering wrote: > > >> Frederik Eaton <[EMAIL PROTECTED]> wrote: > > >> >> Maybe we need an option to trade off speed for quality, > > >> >> if it makes such a big difference. > > >> > > > >> > Hmm. Maybe there will be a difference. Well, why don't I make > > >> > ISAAC_LOG or ISAAC_WORDS part of isaac_state so that shred and sort > > >> > can use different values? I don't think it will be important for > > >> > 'sort' to use a very strong hash, since crackers only have the order > > >> > of the hash values to go by, and not the values themselves. > > >> > > >> This seems like a very good point. > > >> How about having this in rand-isaac.h: > > >> > > >> #ifndef ISAAC_LOG > > >> # define ISAAC_LOG 8 > > >> #endif > > >> > > >> then define ISAAC_LOG to 3 just before including it from sort.c? > > > > > > Then I would also have to include rand-isaac.c in sort.c, since > > > rand-isaac.c itself includes rand-isaac.h, and is normally compiled > > > separately. What's wrong with making it a parameter of the state > > > structure? > > > > If it's compiled separately with a variable state array size, > > then the state array will have to be allocated from the heap and > > freed when done. Or you could avoid using the heap at the expense > > of limiting the size to some fixed maximum. That would make this > > code more like random(3). > > > > It's probably not worth much more effort to make the code > > stand-alone, since we'll probably remove it from shred -- then > > sort will be the only client. Considering that, it might be best > > to simply include the .c file. Either approach is fine with me. > > I decided to limit the size of isaac_state, but make the actual state > array size configurable at runtime. It wasn't immediately trivial to > include rand-isaac.c in sort.c and I figured this would be cleaner. > Tell me if you like what I've done. > > With these changes shred is about 0.007 slower. I fixed a bug in > sort.c as well. Either because of that or the ISAAC changes, random > sort is now about 70% slower than normal sort (vs. 50% for my Nov 26 > patch). > > Frederik > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi coreutils-cvs/src/Makefile.am > coreutils-cvs-1-shred-isaac-split/src/Makefile.am > --- coreutils-cvs/src/Makefile.am 2005-10-23 16:23:56.000000000 +0100 > +++ coreutils-cvs-1-shred-isaac-split/src/Makefile.am 2005-11-26 > 22:48:20.000000000 +0000 > @@ -166,6 +166,7 @@ > ls_SOURCES = ls.c ls-ls.c > chown_SOURCES = chown.c chown-core.c > chgrp_SOURCES = chgrp.c chown-core.c > +shred_SOURCES = shred.c rand-isaac.c > > mv_SOURCES = mv.c copy.c cp-hash.c remove.c > rm_SOURCES = rm.c remove.c > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi coreutils-cvs/src/rand-isaac.c > coreutils-cvs-1-shred-isaac-split/src/rand-isaac.c > --- coreutils-cvs/src/rand-isaac.c 1970-01-01 01:00:00.000000000 +0100 > +++ coreutils-cvs-1-shred-isaac-split/src/rand-isaac.c 2005-11-26 > 22:51:05.000000000 +0000 > @@ -0,0 +1,350 @@ > +#ifdef HAVE_CONFIG_H > +# include <config.h> > +#endif > + > +#include "system.h" > +#include "gethrxtime.h" > + > +#include "rand-isaac.h" > + > +/* > + * -------------------------------------------------------------------- > + * Bob Jenkins' cryptographic random number generator, ISAAC. > + * Hacked by Colin Plumb. > + * > + * We need a source of random numbers for some of the overwrite data > + * for shred. Cryptographically secure is desirable, but it's not > + * life-or-death so I can be a little bit experimental in the choice > + * of RNGs here. > + * > + * This generator is based somewhat on RC4, but has analysis > + * (http://burtleburtle.net/bob/rand/isaacafa.html) > + * pointing to it actually being better. I like it because it's nice > + * and fast, and because the author did good work analyzing it. > + * -------------------------------------------------------------------- > + */ > + > +/* > + * Refill the entire R array, and update S. > + */ > +void > +isaac_refill (struct isaac_state *s, uint32_t r[/* ISAAC_WORDS */]) > +{ > + uint32_t a, b; /* Caches of a and b */ > + uint32_t x, y; /* Temps needed by isaac_step macro */ > + uint32_t *m = s->mm; /* Pointer into state array */ > + > + a = s->a; > + b = s->b + (++s->c); > + > + do > + { > + isaac_step (a << 13, a, b, s->mm, m, ISAAC_WORDS / 2, r); > + isaac_step (a >> 6, a, b, s->mm, m + 1, ISAAC_WORDS / 2, r + 1); > + isaac_step (a << 2, a, b, s->mm, m + 2, ISAAC_WORDS / 2, r + 2); > + isaac_step (a >> 16, a, b, s->mm, m + 3, ISAAC_WORDS / 2, r + 3); > + r += 4; > + } > + while ((m += 4) < s->mm + ISAAC_WORDS / 2); > + do > + { > + isaac_step (a << 13, a, b, s->mm, m, -ISAAC_WORDS / 2, r); > + isaac_step (a >> 6, a, b, s->mm, m + 1, -ISAAC_WORDS / 2, r + 1); > + isaac_step (a << 2, a, b, s->mm, m + 2, -ISAAC_WORDS / 2, r + 2); > + isaac_step (a >> 16, a, b, s->mm, m + 3, -ISAAC_WORDS / 2, r + 3); > + r += 4; > + } > + while ((m += 4) < s->mm + ISAAC_WORDS); > + s->a = a; > + s->b = b; > +} > + > +/* > + * The basic seed-scrambling step for initialization, based on Bob > + * Jenkins' 256-bit hash. > + */ > +#define mix(a,b,c,d,e,f,g,h) \ > + ( a ^= b << 11, d += a, \ > + b += c, b ^= c >> 2, e += b, \ > + c += d, c ^= d << 8, f += c, \ > + d += e, d ^= e >> 16, g += d, \ > + e += f, e ^= f << 10, h += e, \ > + f += g, f ^= g >> 4, a += f, \ > + g += h, g ^= h << 8, b += g, \ > + h += a, h ^= a >> 9, c += h, \ > + a += b ) > + > +/* The basic ISAAC initialization pass. */ > +void > +isaac_mix (struct isaac_state *s, uint32_t const seed[/* ISAAC_WORDS */]) > +{ > + int i; > + uint32_t a = s->iv[0]; > + uint32_t b = s->iv[1]; > + uint32_t c = s->iv[2]; > + uint32_t d = s->iv[3]; > + uint32_t e = s->iv[4]; > + uint32_t f = s->iv[5]; > + uint32_t g = s->iv[6]; > + uint32_t h = s->iv[7]; > + > + for (i = 0; i < ISAAC_WORDS; i += 8) > + { > + a += seed[i]; > + b += seed[i + 1]; > + c += seed[i + 2]; > + d += seed[i + 3]; > + e += seed[i + 4]; > + f += seed[i + 5]; > + g += seed[i + 6]; > + h += seed[i + 7]; > + > + mix (a, b, c, d, e, f, g, h); > + > + s->mm[i] = a; > + s->mm[i + 1] = b; > + s->mm[i + 2] = c; > + s->mm[i + 3] = d; > + s->mm[i + 4] = e; > + s->mm[i + 5] = f; > + s->mm[i + 6] = g; > + s->mm[i + 7] = h; > + } > + > + s->iv[0] = a; > + s->iv[1] = b; > + s->iv[2] = c; > + s->iv[3] = d; > + s->iv[4] = e; > + s->iv[5] = f; > + s->iv[6] = g; > + s->iv[7] = h; > +} > + > +#if 0 /* Provided for reference only; not used in this code */ > +/* > + * Initialize the ISAAC RNG with the given seed material. > + * Its size MUST be a multiple of ISAAC_BYTES, and may be > + * stored in the s->mm array. > + * > + * This is a generalization of the original ISAAC initialization code > + * to support larger seed sizes. For seed sizes of 0 and ISAAC_BYTES, > + * it is identical. > + */ > +void > +isaac_init (struct isaac_state *s, uint32_t const *seed, size_t seedsize) > +{ > + static uint32_t const iv[8] = > + { > + 0x1367df5a, 0x95d90059, 0xc3163e4b, 0x0f421ad8, > + 0xd92a4a78, 0xa51a3c49, 0xc4efea1b, 0x30609119}; > + int i; > + > +# if 0 > + /* The initialization of iv is a precomputed form of: */ > + for (i = 0; i < 7; i++) > + iv[i] = 0x9e3779b9; /* the golden ratio */ > + for (i = 0; i < 4; ++i) /* scramble it */ > + mix (iv[0], iv[1], iv[2], iv[3], iv[4], iv[5], iv[6], iv[7]); > +# endif > + s->a = s->b = s->c = 0; > + > + for (i = 0; i < 8; i++) > + s->iv[i] = iv[i]; > + > + if (seedsize) > + { > + /* First pass (as in reference ISAAC code) */ > + isaac_mix (s, seed); > + /* Second and subsequent passes (extension to ISAAC) */ > + while (seedsize -= ISAAC_BYTES) > + { > + seed += ISAAC_WORDS; > + for (i = 0; i < ISAAC_WORDS; i++) > + s->mm[i] += seed[i]; > + isaac_mix (s, s->mm); > + } > + } > + else > + { > + /* The no seed case (as in reference ISAAC code) */ > + for (i = 0; i < ISAAC_WORDS; i++) > + s->mm[i] = 0; > + } > + > + /* Final pass */ > + isaac_mix (s, s->mm); > +} > +#endif > + > +/* Start seeding an ISAAC structire */ > +void > +isaac_seed_start (struct isaac_state *s) > +{ > + static uint32_t const iv[8] = > + { > + 0x1367df5a, 0x95d90059, 0xc3163e4b, 0x0f421ad8, > + 0xd92a4a78, 0xa51a3c49, 0xc4efea1b, 0x30609119 > + }; > + int i; > + > +#if 0 > + /* The initialization of iv is a precomputed form of: */ > + for (i = 0; i < 7; i++) > + iv[i] = 0x9e3779b9; /* the golden ratio */ > + for (i = 0; i < 4; ++i) /* scramble it */ > + mix (iv[0], iv[1], iv[2], iv[3], iv[4], iv[5], iv[6], iv[7]); > +#endif > + for (i = 0; i < 8; i++) > + s->iv[i] = iv[i]; > + > + /* Enable the following memset if you're worried about used-uninitialized > + warnings involving code in isaac_refill from tools like valgrind. > + Since this buffer is used to accumulate pseudo-random data, there's > + no harm, and maybe even some benefit, in using it uninitialized. */ > +#if AVOID_USED_UNINITIALIZED_WARNINGS > + memset (s->mm, 0, sizeof s->mm); > +#endif > + > + /* s->c gets used for a data pointer during the seeding phase */ > + s->a = s->b = s->c = 0; > +} > + > +/* Add a buffer of seed material */ > +void > +isaac_seed_data (struct isaac_state *s, void const *buffer, size_t size) > +{ > + unsigned char const *buf = buffer; > + unsigned char *p; > + size_t avail; > + size_t i; > + > + avail = sizeof s->mm - (size_t) s->c; /* s->c is used as a write > pointer */ > + > + /* Do any full buffers that are necessary */ > + while (size > avail) > + { > + p = (unsigned char *) s->mm + s->c; > + for (i = 0; i < avail; i++) > + p[i] ^= buf[i]; > + buf += avail; > + size -= avail; > + isaac_mix (s, s->mm); > + s->c = 0; > + avail = sizeof s->mm; > + } > + > + /* And the final partial block */ > + p = (unsigned char *) s->mm + s->c; > + for (i = 0; i < size; i++) > + p[i] ^= buf[i]; > + s->c = size; > +} > + > + > +/* End of seeding phase; get everything ready to produce output. */ > +void > +isaac_seed_finish (struct isaac_state *s) > +{ > + isaac_mix (s, s->mm); > + isaac_mix (s, s->mm); > + /* Now reinitialize c to start things off right */ > + s->c = 0; > +} > +#define ISAAC_SEED(s,x) isaac_seed_data (s, &(x), sizeof (x)) > + > +/* > + * Get seed material. 16 bytes (128 bits) is plenty, but if we have > + * /dev/urandom, we get 32 bytes = 256 bits for complete overkill. > + */ > +void > +isaac_seed (struct isaac_state *s) > +{ > + isaac_seed_start (s); > + > + { pid_t t = getpid (); ISAAC_SEED (s, t); } > + { pid_t t = getppid (); ISAAC_SEED (s, t); } > + { uid_t t = getuid (); ISAAC_SEED (s, t); } > + { gid_t t = getgid (); ISAAC_SEED (s, t); } > + > + { > + xtime_t t = gethrxtime (); > + ISAAC_SEED (s, t); > + } > + > + { > + char buf[32]; > + int fd = open ("/dev/urandom", O_RDONLY | O_NOCTTY); > + if (fd >= 0) > + { > + read (fd, buf, 32); > + close (fd); > + isaac_seed_data (s, buf, 32); > + } > + else > + { > + fd = open ("/dev/random", O_RDONLY | O_NONBLOCK | O_NOCTTY); > + if (fd >= 0) > + { > + /* /dev/random is more precious, so use less */ > + read (fd, buf, 16); > + close (fd); > + isaac_seed_data (s, buf, 16); > + } > + } > + } > + > + isaac_seed_finish (s); > +} > + > +void > +irand_init (struct irand_state *r, struct isaac_state *s) > +{ > + r->numleft = 0; > + r->s = s; > +} > + > +/* > + * We take from the end of the block deliberately, so if we need > + * only a small number of values, we choose the final ones which are > + * marginally better mixed than the initial ones. > + */ > +uint32_t > +irand32 (struct irand_state *r) > +{ > + if (!r->numleft) > + { > + isaac_refill (r->s, r->r); > + r->numleft = ISAAC_WORDS; > + } > + return r->r[--r->numleft]; > +} > + > +/* > + * Return a uniformly distributed random number between 0 and n, > + * inclusive. Thus, the result is modulo n+1. > + * > + * Theory of operation: as x steps through every possible 32-bit number, > + * x % n takes each value at least 2^32 / n times (rounded down), but > + * the values less than 2^32 % n are taken one additional time. Thus, > + * x % n is not perfectly uniform. To fix this, the values of x less > + * than 2^32 % n are disallowed, and if the RNG produces one, we ask > + * for a new value. > + */ > +uint32_t > +irand_mod (struct irand_state *r, uint32_t n) > +{ > + uint32_t x; > + uint32_t lim; > + > + if (!++n) > + return irand32 (r); > + > + lim = -n % n; /* == (2**32-n) % n == 2**32 % n */ > + do > + { > + x = irand32 (r); > + } > + while (x < lim); > + return x % n; > +} > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi coreutils-cvs/src/rand-isaac.h > coreutils-cvs-1-shred-isaac-split/src/rand-isaac.h > --- coreutils-cvs/src/rand-isaac.h 1970-01-01 01:00:00.000000000 +0100 > +++ coreutils-cvs-1-shred-isaac-split/src/rand-isaac.h 2005-10-24 > 09:49:19.000000000 +0100 > @@ -0,0 +1,62 @@ > +/* Size of the state tables to use. (You may change ISAAC_LOG) */ > +#define ISAAC_LOG 8 > +#define ISAAC_WORDS (1 << ISAAC_LOG) > +#define ISAAC_BYTES (ISAAC_WORDS * sizeof (uint32_t)) > + > +/* RNG state variables */ > +struct isaac_state > + { > + uint32_t mm[ISAAC_WORDS]; /* Main state array */ > + uint32_t iv[8]; /* Seeding initial vector */ > + uint32_t a, b, c; /* Extra index variables */ > + }; > + > +/* This index operation is more efficient on many processors */ > +#define ind(mm, x) \ > + (* (uint32_t *) ((char *) (mm) \ > + + ((x) & (ISAAC_WORDS - 1) * sizeof (uint32_t)))) > + > +/* > + * The central step. This uses two temporaries, x and y. mm is the > + * whole state array, while m is a pointer to the current word. off is > + * the offset from m to the word ISAAC_WORDS/2 words away in the mm array, > + * i.e. +/- ISAAC_WORDS/2. > + */ > +#define isaac_step(mix, a, b, mm, m, off, r) \ > +( \ > + a = ((a) ^ (mix)) + (m)[off], \ > + x = *(m), \ > + *(m) = y = ind (mm, x) + (a) + (b), \ > + *(r) = b = ind (mm, (y) >> ISAAC_LOG) + x \ > +) > + > +void > +isaac_refill (struct isaac_state *s, uint32_t r[/* ISAAC_WORDS */]); > +void > +isaac_mix (struct isaac_state *s, uint32_t const seed[/* ISAAC_WORDS */]); > +void > +isaac_init (struct isaac_state *s, uint32_t const *seed, size_t seedsize); > +void > +isaac_seed_start (struct isaac_state *s); > +void > +isaac_seed_data (struct isaac_state *s, void const *buffer, size_t size); > +void > +isaac_seed_finish (struct isaac_state *s); > +#define ISAAC_SEED(s,x) isaac_seed_data (s, &(x), sizeof (x)) > +void > +isaac_seed (struct isaac_state *s); > + > +/* Single-word RNG built on top of ISAAC */ > +struct irand_state > +{ > + uint32_t r[ISAAC_WORDS]; > + unsigned int numleft; > + struct isaac_state *s; > +}; > + > +void > +irand_init (struct irand_state *r, struct isaac_state *s); > +uint32_t > +irand32 (struct irand_state *r); > +uint32_t > +irand_mod (struct irand_state *r, uint32_t n); > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi coreutils-cvs/src/shred.c > coreutils-cvs-1-shred-isaac-split/src/shred.c > --- coreutils-cvs/src/shred.c 2005-09-24 14:40:37.000000000 +0100 > +++ coreutils-cvs-1-shred-isaac-split/src/shred.c 2005-11-26 > 22:48:57.000000000 +0000 > @@ -109,6 +109,7 @@ > #include "inttostr.h" > #include "quotearg.h" /* For quotearg_colon */ > #include "quote.h" /* For quotearg_colon */ > +#include "rand-isaac.h" /* Random number stuff */ > > #define DEFAULT_PASSES 25 /* Default */ > > @@ -227,387 +228,6 @@ > } > > /* > - * -------------------------------------------------------------------- > - * Bob Jenkins' cryptographic random number generator, ISAAC. > - * Hacked by Colin Plumb. > - * > - * We need a source of random numbers for some of the overwrite data. > - * Cryptographically secure is desirable, but it's not life-or-death > - * so I can be a little bit experimental in the choice of RNGs here. > - * > - * This generator is based somewhat on RC4, but has analysis > - * (http://ourworld.compuserve.com/homepages/bob_jenkins/randomnu.htm) > - * pointing to it actually being better. I like it because it's nice > - * and fast, and because the author did good work analyzing it. > - * -------------------------------------------------------------------- > - */ > - > -/* Size of the state tables to use. (You may change ISAAC_LOG) */ > -#define ISAAC_LOG 8 > -#define ISAAC_WORDS (1 << ISAAC_LOG) > -#define ISAAC_BYTES (ISAAC_WORDS * sizeof (uint32_t)) > - > -/* RNG state variables */ > -struct isaac_state > - { > - uint32_t mm[ISAAC_WORDS]; /* Main state array */ > - uint32_t iv[8]; /* Seeding initial vector */ > - uint32_t a, b, c; /* Extra index variables */ > - }; > - > -/* This index operation is more efficient on many processors */ > -#define ind(mm, x) \ > - (* (uint32_t *) ((char *) (mm) \ > - + ((x) & (ISAAC_WORDS - 1) * sizeof (uint32_t)))) > - > -/* > - * The central step. This uses two temporaries, x and y. mm is the > - * whole state array, while m is a pointer to the current word. off is > - * the offset from m to the word ISAAC_WORDS/2 words away in the mm array, > - * i.e. +/- ISAAC_WORDS/2. > - */ > -#define isaac_step(mix, a, b, mm, m, off, r) \ > -( \ > - a = ((a) ^ (mix)) + (m)[off], \ > - x = *(m), \ > - *(m) = y = ind (mm, x) + (a) + (b), \ > - *(r) = b = ind (mm, (y) >> ISAAC_LOG) + x \ > -) > - > -/* > - * Refill the entire R array, and update S. > - */ > -static void > -isaac_refill (struct isaac_state *s, uint32_t r[/* ISAAC_WORDS */]) > -{ > - uint32_t a, b; /* Caches of a and b */ > - uint32_t x, y; /* Temps needed by isaac_step macro */ > - uint32_t *m = s->mm; /* Pointer into state array */ > - > - a = s->a; > - b = s->b + (++s->c); > - > - do > - { > - isaac_step (a << 13, a, b, s->mm, m, ISAAC_WORDS / 2, r); > - isaac_step (a >> 6, a, b, s->mm, m + 1, ISAAC_WORDS / 2, r + 1); > - isaac_step (a << 2, a, b, s->mm, m + 2, ISAAC_WORDS / 2, r + 2); > - isaac_step (a >> 16, a, b, s->mm, m + 3, ISAAC_WORDS / 2, r + 3); > - r += 4; > - } > - while ((m += 4) < s->mm + ISAAC_WORDS / 2); > - do > - { > - isaac_step (a << 13, a, b, s->mm, m, -ISAAC_WORDS / 2, r); > - isaac_step (a >> 6, a, b, s->mm, m + 1, -ISAAC_WORDS / 2, r + 1); > - isaac_step (a << 2, a, b, s->mm, m + 2, -ISAAC_WORDS / 2, r + 2); > - isaac_step (a >> 16, a, b, s->mm, m + 3, -ISAAC_WORDS / 2, r + 3); > - r += 4; > - } > - while ((m += 4) < s->mm + ISAAC_WORDS); > - s->a = a; > - s->b = b; > -} > - > -/* > - * The basic seed-scrambling step for initialization, based on Bob > - * Jenkins' 256-bit hash. > - */ > -#define mix(a,b,c,d,e,f,g,h) \ > - ( a ^= b << 11, d += a, \ > - b += c, b ^= c >> 2, e += b, \ > - c += d, c ^= d << 8, f += c, \ > - d += e, d ^= e >> 16, g += d, \ > - e += f, e ^= f << 10, h += e, \ > - f += g, f ^= g >> 4, a += f, \ > - g += h, g ^= h << 8, b += g, \ > - h += a, h ^= a >> 9, c += h, \ > - a += b ) > - > -/* The basic ISAAC initialization pass. */ > -static void > -isaac_mix (struct isaac_state *s, uint32_t const seed[/* ISAAC_WORDS */]) > -{ > - int i; > - uint32_t a = s->iv[0]; > - uint32_t b = s->iv[1]; > - uint32_t c = s->iv[2]; > - uint32_t d = s->iv[3]; > - uint32_t e = s->iv[4]; > - uint32_t f = s->iv[5]; > - uint32_t g = s->iv[6]; > - uint32_t h = s->iv[7]; > - > - for (i = 0; i < ISAAC_WORDS; i += 8) > - { > - a += seed[i]; > - b += seed[i + 1]; > - c += seed[i + 2]; > - d += seed[i + 3]; > - e += seed[i + 4]; > - f += seed[i + 5]; > - g += seed[i + 6]; > - h += seed[i + 7]; > - > - mix (a, b, c, d, e, f, g, h); > - > - s->mm[i] = a; > - s->mm[i + 1] = b; > - s->mm[i + 2] = c; > - s->mm[i + 3] = d; > - s->mm[i + 4] = e; > - s->mm[i + 5] = f; > - s->mm[i + 6] = g; > - s->mm[i + 7] = h; > - } > - > - s->iv[0] = a; > - s->iv[1] = b; > - s->iv[2] = c; > - s->iv[3] = d; > - s->iv[4] = e; > - s->iv[5] = f; > - s->iv[6] = g; > - s->iv[7] = h; > -} > - > -#if 0 /* Provided for reference only; not used in this code */ > -/* > - * Initialize the ISAAC RNG with the given seed material. > - * Its size MUST be a multiple of ISAAC_BYTES, and may be > - * stored in the s->mm array. > - * > - * This is a generalization of the original ISAAC initialization code > - * to support larger seed sizes. For seed sizes of 0 and ISAAC_BYTES, > - * it is identical. > - */ > -static void > -isaac_init (struct isaac_state *s, uint32_t const *seed, size_t seedsize) > -{ > - static uint32_t const iv[8] = > - { > - 0x1367df5a, 0x95d90059, 0xc3163e4b, 0x0f421ad8, > - 0xd92a4a78, 0xa51a3c49, 0xc4efea1b, 0x30609119}; > - int i; > - > -# if 0 > - /* The initialization of iv is a precomputed form of: */ > - for (i = 0; i < 7; i++) > - iv[i] = 0x9e3779b9; /* the golden ratio */ > - for (i = 0; i < 4; ++i) /* scramble it */ > - mix (iv[0], iv[1], iv[2], iv[3], iv[4], iv[5], iv[6], iv[7]); > -# endif > - s->a = s->b = s->c = 0; > - > - for (i = 0; i < 8; i++) > - s->iv[i] = iv[i]; > - > - if (seedsize) > - { > - /* First pass (as in reference ISAAC code) */ > - isaac_mix (s, seed); > - /* Second and subsequent passes (extension to ISAAC) */ > - while (seedsize -= ISAAC_BYTES) > - { > - seed += ISAAC_WORDS; > - for (i = 0; i < ISAAC_WORDS; i++) > - s->mm[i] += seed[i]; > - isaac_mix (s, s->mm); > - } > - } > - else > - { > - /* The no seed case (as in reference ISAAC code) */ > - for (i = 0; i < ISAAC_WORDS; i++) > - s->mm[i] = 0; > - } > - > - /* Final pass */ > - isaac_mix (s, s->mm); > -} > -#endif > - > -/* Start seeding an ISAAC structire */ > -static void > -isaac_seed_start (struct isaac_state *s) > -{ > - static uint32_t const iv[8] = > - { > - 0x1367df5a, 0x95d90059, 0xc3163e4b, 0x0f421ad8, > - 0xd92a4a78, 0xa51a3c49, 0xc4efea1b, 0x30609119 > - }; > - int i; > - > -#if 0 > - /* The initialization of iv is a precomputed form of: */ > - for (i = 0; i < 7; i++) > - iv[i] = 0x9e3779b9; /* the golden ratio */ > - for (i = 0; i < 4; ++i) /* scramble it */ > - mix (iv[0], iv[1], iv[2], iv[3], iv[4], iv[5], iv[6], iv[7]); > -#endif > - for (i = 0; i < 8; i++) > - s->iv[i] = iv[i]; > - > - /* Enable the following memset if you're worried about used-uninitialized > - warnings involving code in isaac_refill from tools like valgrind. > - Since this buffer is used to accumulate pseudo-random data, there's > - no harm, and maybe even some benefit, in using it uninitialized. */ > -#if AVOID_USED_UNINITIALIZED_WARNINGS > - memset (s->mm, 0, sizeof s->mm); > -#endif > - > - /* s->c gets used for a data pointer during the seeding phase */ > - s->a = s->b = s->c = 0; > -} > - > -/* Add a buffer of seed material */ > -static void > -isaac_seed_data (struct isaac_state *s, void const *buffer, size_t size) > -{ > - unsigned char const *buf = buffer; > - unsigned char *p; > - size_t avail; > - size_t i; > - > - avail = sizeof s->mm - (size_t) s->c; /* s->c is used as a write > pointer */ > - > - /* Do any full buffers that are necessary */ > - while (size > avail) > - { > - p = (unsigned char *) s->mm + s->c; > - for (i = 0; i < avail; i++) > - p[i] ^= buf[i]; > - buf += avail; > - size -= avail; > - isaac_mix (s, s->mm); > - s->c = 0; > - avail = sizeof s->mm; > - } > - > - /* And the final partial block */ > - p = (unsigned char *) s->mm + s->c; > - for (i = 0; i < size; i++) > - p[i] ^= buf[i]; > - s->c = size; > -} > - > - > -/* End of seeding phase; get everything ready to produce output. */ > -static void > -isaac_seed_finish (struct isaac_state *s) > -{ > - isaac_mix (s, s->mm); > - isaac_mix (s, s->mm); > - /* Now reinitialize c to start things off right */ > - s->c = 0; > -} > -#define ISAAC_SEED(s,x) isaac_seed_data (s, &(x), sizeof (x)) > - > -/* > - * Get seed material. 16 bytes (128 bits) is plenty, but if we have > - * /dev/urandom, we get 32 bytes = 256 bits for complete overkill. > - */ > -static void > -isaac_seed (struct isaac_state *s) > -{ > - isaac_seed_start (s); > - > - { pid_t t = getpid (); ISAAC_SEED (s, t); } > - { pid_t t = getppid (); ISAAC_SEED (s, t); } > - { uid_t t = getuid (); ISAAC_SEED (s, t); } > - { gid_t t = getgid (); ISAAC_SEED (s, t); } > - > - { > - xtime_t t = gethrxtime (); > - ISAAC_SEED (s, t); > - } > - > - { > - char buf[32]; > - int fd = open ("/dev/urandom", O_RDONLY | O_NOCTTY); > - if (fd >= 0) > - { > - read (fd, buf, 32); > - close (fd); > - isaac_seed_data (s, buf, 32); > - } > - else > - { > - fd = open ("/dev/random", O_RDONLY | O_NONBLOCK | O_NOCTTY); > - if (fd >= 0) > - { > - /* /dev/random is more precious, so use less */ > - read (fd, buf, 16); > - close (fd); > - isaac_seed_data (s, buf, 16); > - } > - } > - } > - > - isaac_seed_finish (s); > -} > - > -/* Single-word RNG built on top of ISAAC */ > -struct irand_state > -{ > - uint32_t r[ISAAC_WORDS]; > - unsigned int numleft; > - struct isaac_state *s; > -}; > - > -static void > -irand_init (struct irand_state *r, struct isaac_state *s) > -{ > - r->numleft = 0; > - r->s = s; > -} > - > -/* > - * We take from the end of the block deliberately, so if we need > - * only a small number of values, we choose the final ones which are > - * marginally better mixed than the initial ones. > - */ > -static uint32_t > -irand32 (struct irand_state *r) > -{ > - if (!r->numleft) > - { > - isaac_refill (r->s, r->r); > - r->numleft = ISAAC_WORDS; > - } > - return r->r[--r->numleft]; > -} > - > -/* > - * Return a uniformly distributed random number between 0 and n, > - * inclusive. Thus, the result is modulo n+1. > - * > - * Theory of operation: as x steps through every possible 32-bit number, > - * x % n takes each value at least 2^32 / n times (rounded down), but > - * the values less than 2^32 % n are taken one additional time. Thus, > - * x % n is not perfectly uniform. To fix this, the values of x less > - * than 2^32 % n are disallowed, and if the RNG produces one, we ask > - * for a new value. > - */ > -static uint32_t > -irand_mod (struct irand_state *r, uint32_t n) > -{ > - uint32_t x; > - uint32_t lim; > - > - if (!++n) > - return irand32 (r); > - > - lim = -n % n; /* == (2**32-n) % n == 2**32 % n */ > - do > - { > - x = irand32 (r); > - } > - while (x < lim); > - return x % n; > -} > - > -/* > * Fill a buffer with a fixed pattern. > * > * The buffer must be at least 3 bytes long, even if > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-1-shred-isaac-split/ChangeLog > coreutils-cvs-2-sort-with-isaac/ChangeLog > --- coreutils-cvs-1-shred-isaac-split/ChangeLog 2005-11-24 > 20:10:35.000000000 +0000 > +++ coreutils-cvs-2-sort-with-isaac/ChangeLog 2005-12-01 03:07:15.000000000 > +0000 > @@ -1,3 +1,22 @@ > +2005-11-25 Frederik Eaton <[EMAIL PROTECTED]> > + > + * src/shred.c: > + * src/isaac.h: > + * src/isaac.c: transfer ISAAC code (for cryptographic random > + number generation) from shred.c to new files isaac.c, isaac.h > + > + * src/Makefile.am (shred_SOURCES): isaac.c is now a source of shred > + > + * src/Makefile.am (sort_SOURCES, sort_LDADD): changes to reflect > + that sort uses ISAAC > + > + * src/sort.c (short_options, long_options, rand_state, HASH_SIZE, > + HASH_WORDS, get_hash, keycompare, main): add options --random-sort > + and --seed to implement a random shuffle > + > + * doc/coreutils.texi: document new --random-hash option, provide > + an example > + > 2005-11-24 Jim Meyering <[EMAIL PROTECTED]> > > * Version 6.0-cvs. > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-1-shred-isaac-split/doc/coreutils.texi > coreutils-cvs-2-sort-with-isaac/doc/coreutils.texi > --- coreutils-cvs-1-shred-isaac-split/doc/coreutils.texi 2005-11-24 > 20:10:37.000000000 +0000 > +++ coreutils-cvs-2-sort-with-isaac/doc/coreutils.texi 2005-12-01 > 03:08:49.000000000 +0000 > @@ -3396,6 +3396,15 @@ > Reverse the result of comparison, so that lines with greater key values > appear earlier in the output instead of later. > > [EMAIL PROTECTED] -R > [EMAIL PROTECTED] --random-sort > [EMAIL PROTECTED] -R > [EMAIL PROTECTED] --random-sort > [EMAIL PROTECTED] random sort > + > +Sort by random hash, i.e. perform a shuffle. This is done by hashing > +the input keys and sorting based on the results. > + > @end table > > Other options are: > @@ -3529,6 +3538,11 @@ > reliably handle arbitrary file names (even those containing blanks > or other special characters). > > [EMAIL PROTECTED] --seed @var{tempdir} > [EMAIL PROTECTED] --seed > [EMAIL PROTECTED] specify seed for random hash > +Specify a seed for the @option{--random-sort} option. > + > @end table > > Historical (BSD and System V) implementations of @command{sort} have > @@ -3695,6 +3709,20 @@ > @c printf 'c\n\nb\n\na\n'|perl -0pe 's/\n\n/\n\0/g'|sort -z|perl -0pe > 's/\0/\n/g' > @c @end example > > [EMAIL PROTECTED] > + > +Shuffle a list of directories, but preserve the order of files within > +each directory. (For instance, one could use this to generate a music > +playlist in which albums are shuffled but the songs of each album are > +played in order) > + > +Assumes that each file is nested two levels deep, i.e. that the paths > +output by @command{find} are are of the form @samp{./a/b/c}. > + > [EMAIL PROTECTED] > +find . -type f | ./sort -t/ -k1,3R -k4 > [EMAIL PROTECTED] example > + > @end itemize > > > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-1-shred-isaac-split/src/Makefile.am > coreutils-cvs-2-sort-with-isaac/src/Makefile.am > --- coreutils-cvs-1-shred-isaac-split/src/Makefile.am 2005-11-26 > 22:48:20.000000000 +0000 > +++ coreutils-cvs-2-sort-with-isaac/src/Makefile.am 2005-11-30 > 12:48:37.000000000 +0000 > @@ -69,7 +69,7 @@ > vdir_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) > > ## If necessary, add -lm to resolve use of pow in lib/strtod.c. > -sort_LDADD = $(LDADD) $(POW_LIB) > +sort_LDADD = $(LDADD) $(POW_LIB) $(LIB_GETHRXTIME) > > # for get_date and gettime > date_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) > @@ -167,6 +167,7 @@ > chown_SOURCES = chown.c chown-core.c > chgrp_SOURCES = chgrp.c chown-core.c > shred_SOURCES = shred.c rand-isaac.c > +sort_SOURCES = sort.c rand-isaac.c > > mv_SOURCES = mv.c copy.c cp-hash.c remove.c > rm_SOURCES = rm.c remove.c > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-1-shred-isaac-split/src/sort.c > coreutils-cvs-2-sort-with-isaac/src/sort.c > --- coreutils-cvs-1-shred-isaac-split/src/sort.c 2005-10-07 > 19:48:28.000000000 +0100 > +++ coreutils-cvs-2-sort-with-isaac/src/sort.c 2005-11-30 > 12:48:44.000000000 +0000 > @@ -30,8 +30,10 @@ > #include "error.h" > #include "hard-locale.h" > #include "inttostr.h" > +#include "md5.h" > #include "physmem.h" > #include "posixver.h" > +#include "rand-isaac.h" > #include "quote.h" > #include "stdlib--.h" > #include "stdio--.h" > @@ -147,6 +149,7 @@ > bool numeric; /* Flag for numeric comparison. Handle > strings of digits with optional decimal > point, but no exponential notation. */ > + bool random_hash; /* Shuffle by sorting on random hash of key */ > bool general_numeric; /* Flag for general, numeric comparison. > Handle numbers in exponential notation. */ > bool month; /* Flag for comparison by month name. */ > @@ -303,6 +306,7 @@ > -i, --ignore-nonprinting consider only printable characters\n\ > -M, --month-sort compare (unknown) < `JAN' < ... < `DEC'\n\ > -n, --numeric-sort compare according to string numerical value\n\ > + -R, --random-sort sort by random hash of keys\n\ > -r, --reverse reverse the result of comparisons\n\ > \n\ > "), stdout); > @@ -325,6 +329,7 @@ > "), DEFAULT_TMPDIR); > fputs (_("\ > -z, --zero-terminated end lines with 0 byte, not newline\n\ > + --seed use specified seed for random number generator\n\ > "), stdout); > fputs (HELP_OPTION_DESCRIPTION, stdout); > fputs (VERSION_OPTION_DESCRIPTION, stdout); > @@ -353,7 +358,11 @@ > exit (status); > } > > -static char const short_options[] = "-bcdfgik:mMno:rsS:t:T:uy:z"; > +static char const short_options[] = "-bcdfgik:mMno:rRsS:t:T:uy:z"; > +enum > +{ > + SEED_OPTION = CHAR_MAX + 1 > +}; > > static struct option const long_options[] = > { > @@ -367,6 +376,7 @@ > {"merge", no_argument, NULL, 'm'}, > {"month-sort", no_argument, NULL, 'M'}, > {"numeric-sort", no_argument, NULL, 'n'}, > + {"random-sort", no_argument, NULL, 'R'}, > {"output", required_argument, NULL, 'o'}, > {"reverse", no_argument, NULL, 'r'}, > {"stable", no_argument, NULL, 's'}, > @@ -375,6 +385,7 @@ > {"temporary-directory", required_argument, NULL, 'T'}, > {"unique", no_argument, NULL, 'u'}, > {"zero-terminated", no_argument, NULL, 'z'}, > + {"seed", required_argument, NULL, SEED_OPTION}, > {GETOPT_HELP_OPTION_DECL}, > {GETOPT_VERSION_OPTION_DECL}, > {NULL, 0, NULL, 0}, > @@ -1152,6 +1163,28 @@ > return 0; > } > > +static struct isaac_state *rand_state; > + > +#define HASH_WORDS 1 > +#define HASH_SIZE (HASH_WORDS * sizeof (uint32_t)) > + > +static void > +get_hash (char const* text, size_t len, uint32_t resbuf[/*HASH_WORDS*/]) > +{ > + struct isaac_state s; > + int i; > + memcpy (&s, rand_state, sizeof s); > + isaac_seed_data (&s, text, len); > + /* we can call isaac_seed_finish multiple times, but should only > + call isaac_seed_start once */ > + isaac_seed_finish (&s); > + /* getting an irand_state and drawing random numbers would be more > + kosher here, but also slower. so we just peek at the ISAAC state > + array instead */ > + for (i = 0; i < HASH_WORDS; i++) > + resbuf[i] = s.mm[ISAAC_WORDS - 1 - i]; > +} > + > /* Compare two lines A and B trying every key in sequence until there > are no more keys or a difference is found. */ > > @@ -1188,6 +1221,14 @@ > (texta, textb)); > *lima = savea, *limb = saveb; > } > + else if (key->random_hash) > + { > + uint32_t diga[HASH_SIZE]; > + uint32_t digb[HASH_SIZE]; > + get_hash (texta, lena, diga); > + get_hash (textb, lenb, digb); > + diff = memcmp (diga, digb, sizeof (diga)); > + } > else if (key->month) > diff = getmonth (texta, lena) - getmonth (textb, lenb); > /* Sorting like this may become slow, so in a simple locale the user > @@ -1989,6 +2030,7 @@ > { > error (SORT_FAILURE, 0, _("%s: invalid field specification %s"), > _(msgid), quote (spec)); > + /* added to satisfy compiler for NORETURN: */ > abort (); > } > > @@ -2081,6 +2123,9 @@ > case 'n': > key->numeric = true; > break; > + case 'R': > + key->random_hash = true; > + break; > case 'r': > key->reverse = true; > break; > @@ -2109,6 +2154,8 @@ > int c = 0; > bool checkonly = false; > bool mergeonly = false; > + char const *randseed = 0; > + bool need_rand = false; > size_t nfiles = 0; > bool posixly_correct = (getenv ("POSIXLY_CORRECT") != NULL); > bool obsolete_usage = (posix2_version () < 200112); > @@ -2187,7 +2234,7 @@ > gkey.sword = gkey.eword = SIZE_MAX; > gkey.ignore = NULL; > gkey.translate = NULL; > - gkey.numeric = gkey.general_numeric = gkey.month = gkey.reverse = false; > + gkey.numeric = gkey.general_numeric = gkey.random_hash = gkey.month = > gkey.reverse = false; > gkey.skipsblanks = gkey.skipeblanks = false; > > files = xnmalloc (argc, sizeof *files); > @@ -2263,6 +2310,7 @@ > case 'M': > case 'n': > case 'r': > + case 'R': > { > char str[2]; > str[0] = c; > @@ -2404,6 +2452,10 @@ > eolchar = 0; > break; > > + case SEED_OPTION: > + randseed = xstrdup (optarg); > + break; > + > case_GETOPT_HELP_CHAR; > > case_GETOPT_VERSION_CHAR (PROGRAM_NAME, AUTHORS); > @@ -2413,26 +2465,46 @@ > } > } > > + need_rand |= gkey.random_hash; > /* Inheritance of global options to individual keys. */ > for (key = keylist; key; key = key->next) > - if (! (key->ignore || key->translate > - || (key->skipsblanks | key->reverse > - | key->skipeblanks | key->month | key->numeric > - | key->general_numeric))) > - { > - key->ignore = gkey.ignore; > - key->translate = gkey.translate; > - key->skipsblanks = gkey.skipsblanks; > - key->skipeblanks = gkey.skipeblanks; > - key->month = gkey.month; > - key->numeric = gkey.numeric; > - key->general_numeric = gkey.general_numeric; > - key->reverse = gkey.reverse; > - } > + { > + need_rand |= key->random_hash; > + if (! (key->ignore || key->translate > + || (key->skipsblanks | key->reverse > + | key->skipeblanks | key->month | key->numeric > + | key->general_numeric > + | key->random_hash))) > + { > + key->ignore = gkey.ignore; > + key->translate = gkey.translate; > + key->skipsblanks = gkey.skipsblanks; > + key->skipeblanks = gkey.skipeblanks; > + key->month = gkey.month; > + key->numeric = gkey.numeric; > + key->general_numeric = gkey.general_numeric; > + key->random_hash = gkey.random_hash; > + key->reverse = gkey.reverse; > + } > + } > + > + if (need_rand) > + { > + rand_state = xzalloc (sizeof *rand_state); > + if (randseed) > + { > + isaac_seed_start (rand_state); > + isaac_seed_data (rand_state, randseed, strlen (randseed)); > + isaac_seed_finish (rand_state); > + } > + else > + isaac_seed (rand_state); > + } > > if (!keylist && (gkey.ignore || gkey.translate > || (gkey.skipsblanks | gkey.skipeblanks | gkey.month > - | gkey.numeric | gkey.general_numeric))) > + | gkey.numeric | gkey.general_numeric > + | gkey.random_hash))) > insertkey (&gkey); > reverse = gkey.reverse; > > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-2-sort-with-isaac/ChangeLog > coreutils-cvs-3-sort-with-param-isaac/ChangeLog > --- coreutils-cvs-2-sort-with-isaac/ChangeLog 2005-12-01 03:07:15.000000000 > +0000 > +++ coreutils-cvs-3-sort-with-param-isaac/ChangeLog 2005-12-01 > 03:06:55.000000000 +0000 > @@ -1,9 +1,11 @@ > 2005-11-25 Frederik Eaton <[EMAIL PROTECTED]> > > * src/shred.c: > - * src/isaac.h: > - * src/isaac.c: transfer ISAAC code (for cryptographic random > - number generation) from shred.c to new files isaac.c, isaac.h > + * src/rand-isaac.h: > + * src/rand-isaac.c (isaac_new, isaac_copy): transfer ISAAC code > + (for cryptographic random number generation) from shred.c to new > + files rand-isaac.c, rand-isaac.h; make state size > + runtime-configurable and add functions isaac_new and isaac_copy > > * src/Makefile.am (shred_SOURCES): isaac.c is now a source of shred > > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-2-sort-with-isaac/src/rand-isaac.c > coreutils-cvs-3-sort-with-param-isaac/src/rand-isaac.c > --- coreutils-cvs-2-sort-with-isaac/src/rand-isaac.c 2005-12-01 > 00:04:51.000000000 +0000 > +++ coreutils-cvs-3-sort-with-param-isaac/src/rand-isaac.c 2005-12-01 > 03:15:06.000000000 +0000 > @@ -24,37 +24,72 @@ > * -------------------------------------------------------------------- > */ > > +#define ISAAC_SIZE(words) \ > + (sizeof (struct isaac_state) - \ > + (ISAAC_MAX_WORDS - words) * sizeof (uint32_t)) > + > +/* > + * Create a new state, optionally at the location specified. This > + * should be called to create/initialize each new isaac_state. > + */ > +struct isaac_state * > +isaac_new (struct isaac_state *s, int words) > +{ > + size_t w; > + size_t ss = ISAAC_SIZE (words); > + if (!s) > + { > + s = xmalloc (ss); > + } > + memset (s, 0, ss); > + s->words = words; > + s->log = 0; > + for (w=1; w<words; w<<=1, s->log++) {} > + return s; > +} > + > +/* > + * Copy a state. Destination block must be at least as large as > + * source. > + */ > +void > +isaac_copy (struct isaac_state *dst, struct isaac_state *src) > +{ > + memcpy(dst, src, ISAAC_SIZE (src->words)); > +} > + > /* > * Refill the entire R array, and update S. > */ > void > -isaac_refill (struct isaac_state *s, uint32_t r[/* ISAAC_WORDS */]) > +isaac_refill (struct isaac_state *s, uint32_t r[/* s>-words */]) > { > uint32_t a, b; /* Caches of a and b */ > uint32_t x, y; /* Temps needed by isaac_step macro */ > uint32_t *m = s->mm; /* Pointer into state array */ > + uint32_t w = s->words; > > a = s->a; > b = s->b + (++s->c); > > do > { > - isaac_step (a << 13, a, b, s->mm, m, ISAAC_WORDS / 2, r); > - isaac_step (a >> 6, a, b, s->mm, m + 1, ISAAC_WORDS / 2, r + 1); > - isaac_step (a << 2, a, b, s->mm, m + 2, ISAAC_WORDS / 2, r + 2); > - isaac_step (a >> 16, a, b, s->mm, m + 3, ISAAC_WORDS / 2, r + 3); > + isaac_step (s, a << 13, a, b, m, w / 2, r); > + isaac_step (s, a >> 6, a, b, m + 1, w / 2, r + 1); > + isaac_step (s, a << 2, a, b, m + 2, w / 2, r + 2); > + isaac_step (s, a >> 16, a, b, m + 3, w / 2, r + 3); > r += 4; > } > - while ((m += 4) < s->mm + ISAAC_WORDS / 2); > + while ((m += 4) < s->mm + w / 2); > do > { > - isaac_step (a << 13, a, b, s->mm, m, -ISAAC_WORDS / 2, r); > - isaac_step (a >> 6, a, b, s->mm, m + 1, -ISAAC_WORDS / 2, r + 1); > - isaac_step (a << 2, a, b, s->mm, m + 2, -ISAAC_WORDS / 2, r + 2); > - isaac_step (a >> 16, a, b, s->mm, m + 3, -ISAAC_WORDS / 2, r + 3); > + isaac_step (s, a << 13, a, b, m, -w / 2, r); > + isaac_step (s, a >> 6, a, b, m + 1, -w / 2, r + 1); > + isaac_step (s, a << 2, a, b, m + 2, -w / 2, r + 2); > + isaac_step (s, a >> 16, a, b, m + 3, -w / 2, r + 3); > r += 4; > } > - while ((m += 4) < s->mm + ISAAC_WORDS); > + while ((m += 4) < s->mm + w); > s->a = a; > s->b = b; > } > @@ -76,7 +111,7 @@ > > /* The basic ISAAC initialization pass. */ > void > -isaac_mix (struct isaac_state *s, uint32_t const seed[/* ISAAC_WORDS */]) > +isaac_mix (struct isaac_state *s, uint32_t const seed[/* s->words */]) > { > int i; > uint32_t a = s->iv[0]; > @@ -87,8 +122,9 @@ > uint32_t f = s->iv[5]; > uint32_t g = s->iv[6]; > uint32_t h = s->iv[7]; > + uint32_t w = s->words; > > - for (i = 0; i < ISAAC_WORDS; i += 8) > + for (i = 0; i < w; i += 8) > { > a += seed[i]; > b += seed[i + 1]; > @@ -123,13 +159,13 @@ > > #if 0 /* Provided for reference only; not used in this code */ > /* > - * Initialize the ISAAC RNG with the given seed material. > - * Its size MUST be a multiple of ISAAC_BYTES, and may be > + * Initialize the ISAAC RNG with the given seed material. Its size > + * MUST be a multiple of s->words * sizeof (uint32_t), and may be > * stored in the s->mm array. > * > * This is a generalization of the original ISAAC initialization code > - * to support larger seed sizes. For seed sizes of 0 and ISAAC_BYTES, > - * it is identical. > + * to support larger seed sizes. For seed sizes of 0 and s->words * > + * sizeof (uint32_t), it is identical. > */ > void > isaac_init (struct isaac_state *s, uint32_t const *seed, size_t seedsize) > @@ -157,10 +193,10 @@ > /* First pass (as in reference ISAAC code) */ > isaac_mix (s, seed); > /* Second and subsequent passes (extension to ISAAC) */ > - while (seedsize -= ISAAC_BYTES) > + while (seedsize -= s->words * sizeof (uint32_t)) > { > - seed += ISAAC_WORDS; > - for (i = 0; i < ISAAC_WORDS; i++) > + seed += s->words; > + for (i = 0; i < s->words; i++) > s->mm[i] += seed[i]; > isaac_mix (s, s->mm); > } > @@ -168,7 +204,7 @@ > else > { > /* The no seed case (as in reference ISAAC code) */ > - for (i = 0; i < ISAAC_WORDS; i++) > + for (i = 0; i < s->words; i++) > s->mm[i] = 0; > } > > @@ -219,7 +255,7 @@ > size_t avail; > size_t i; > > - avail = sizeof s->mm - (size_t) s->c; /* s->c is used as a write > pointer */ > + avail = s->words - (size_t) s->c; /* s->c is used as a write pointer */ > > /* Do any full buffers that are necessary */ > while (size > avail) > @@ -231,7 +267,7 @@ > size -= avail; > isaac_mix (s, s->mm); > s->c = 0; > - avail = sizeof s->mm; > + avail = s->words; > } > > /* And the final partial block */ > @@ -302,6 +338,7 @@ > { > r->numleft = 0; > r->s = s; > + memset (r->r, 0, s->words * sizeof (uint32_t)); > } > > /* > @@ -315,7 +352,7 @@ > if (!r->numleft) > { > isaac_refill (r->s, r->r); > - r->numleft = ISAAC_WORDS; > + r->numleft = r->s->words; > } > return r->r[--r->numleft]; > } > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-2-sort-with-isaac/src/rand-isaac.h > coreutils-cvs-3-sort-with-param-isaac/src/rand-isaac.h > --- coreutils-cvs-2-sort-with-isaac/src/rand-isaac.h 2005-12-01 > 02:56:41.000000000 +0000 > +++ coreutils-cvs-3-sort-with-param-isaac/src/rand-isaac.h 2005-12-01 > 01:49:30.000000000 +0000 > @@ -1,39 +1,49 @@ > -/* Size of the state tables to use. (You may change ISAAC_LOG) */ > +/* Size of the state tables to use. */ > +/* (You may change ISAAC_LOG but it should be >= 3, and smaller values > + * give less security) */ > +#ifndef ISAAC_LOG > #define ISAAC_LOG 8 > -#define ISAAC_WORDS (1 << ISAAC_LOG) > -#define ISAAC_BYTES (ISAAC_WORDS * sizeof (uint32_t)) > +#endif > +#define ISAAC_MAX_WORDS (1 << ISAAC_LOG) > +#define ISAAC_MAX_BYTES (ISAAC_MAX_WORDS * sizeof (uint32_t)) > > /* RNG state variables */ > struct isaac_state > { > - uint32_t mm[ISAAC_WORDS]; /* Main state array */ > - uint32_t iv[8]; /* Seeding initial vector */ > - uint32_t a, b, c; /* Extra index variables */ > + uint32_t iv[8]; /* Seeding initial vector */ > + uint32_t a, b, c; /* Extra index variables */ > + uint32_t words; /* Words in main state array */ > + uint32_t log; /* Log of words */ > + uint32_t mm[ISAAC_MAX_WORDS]; /* Main state array */ > }; > > /* This index operation is more efficient on many processors */ > -#define ind(mm, x) \ > +#define ind(words, mm, x) \ > (* (uint32_t *) ((char *) (mm) \ > - + ((x) & (ISAAC_WORDS - 1) * sizeof (uint32_t)))) > + + ((x) & (words - 1) * sizeof (uint32_t)))) > > /* > - * The central step. This uses two temporaries, x and y. mm is the > - * whole state array, while m is a pointer to the current word. off is > - * the offset from m to the word ISAAC_WORDS/2 words away in the mm array, > - * i.e. +/- ISAAC_WORDS/2. > + * The central step. This uses two temporaries, x and y. s is the > + * state, while m is a pointer to the current word. off is the offset > + * from m to the word s->words/2 words away in the mm array, i.e. > + * +/- s->words/2. > */ > -#define isaac_step(mix, a, b, mm, m, off, r) \ > +#define isaac_step(s, mix, a, b, m, off, r) \ > ( \ > a = ((a) ^ (mix)) + (m)[off], \ > x = *(m), \ > - *(m) = y = ind (mm, x) + (a) + (b), \ > - *(r) = b = ind (mm, (y) >> ISAAC_LOG) + x \ > + *(m) = y = ind (s->words, s->mm, x) + (a) + (b), \ > + *(r) = b = ind (s->words, s->mm, (y) >> s->log) + x \ > ) > > +struct isaac_state * > +isaac_new (struct isaac_state *s, int words); > void > -isaac_refill (struct isaac_state *s, uint32_t r[/* ISAAC_WORDS */]); > +isaac_copy (struct isaac_state *dst, struct isaac_state *src); > void > -isaac_mix (struct isaac_state *s, uint32_t const seed[/* ISAAC_WORDS */]); > +isaac_refill (struct isaac_state *s, uint32_t r[/* s->words */]); > +void > +isaac_mix (struct isaac_state *s, uint32_t const seed[/* s->words */]); > void > isaac_init (struct isaac_state *s, uint32_t const *seed, size_t seedsize); > void > @@ -49,7 +59,7 @@ > /* Single-word RNG built on top of ISAAC */ > struct irand_state > { > - uint32_t r[ISAAC_WORDS]; > + uint32_t r[ISAAC_MAX_WORDS]; > unsigned int numleft; > struct isaac_state *s; > }; > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-2-sort-with-isaac/src/shred.c > coreutils-cvs-3-sort-with-param-isaac/src/shred.c > --- coreutils-cvs-2-sort-with-isaac/src/shred.c 2005-11-26 > 22:49:12.000000000 +0000 > +++ coreutils-cvs-3-sort-with-param-isaac/src/shred.c 2005-12-01 > 00:33:14.000000000 +0000 > @@ -256,18 +256,19 @@ > > /* > * Fill a buffer, R (of size SIZE_MAX), with random data. > - * SIZE is rounded UP to a multiple of ISAAC_BYTES. > + * SIZE is rounded UP to a multiple of s->words * sizeof (uint32_t). > */ > static void > fillrand (struct isaac_state *s, uint32_t *r, size_t size_max, size_t size) > { > - size = (size + ISAAC_BYTES - 1) / ISAAC_BYTES; > + size_t bytes = s->words * sizeof (uint32_t); > + size = (size + bytes - 1) / bytes; > assert (size <= size_max); > > while (size--) > { > isaac_refill (s, r); > - r += ISAAC_WORDS; > + r += s->words; > } > } > > @@ -369,7 +370,7 @@ > size_t soff; /* Offset into buffer for next write */ > ssize_t ssize; /* Return value from write */ > uint32_t *r; /* Fill pattern. */ > - size_t rsize = 3 * MAX (ISAAC_WORDS, 1024) * sizeof *r; /* Fill size. */ > + size_t rsize = 3 * MAX (s->words, 1024) * sizeof *r; /* Fill size. */ > size_t ralign = lcm (getpagesize (), sizeof *r); /* Fill alignment. */ > char pass_string[PASS_NAME_SIZE]; /* Name of current pass */ > bool write_error = false; > @@ -1103,6 +1104,7 @@ > > atexit (close_stdout); > > + isaac_new (&s, ISAAC_MAX_WORDS); > isaac_seed (&s); > > memset (&flags, 0, sizeof flags); > diff -N -ur -x '*.in' -x Makefile -x configure -x dircolors.h -x groups -x > '*~' -x wheel.h -x 'localedir.*' -x 'stamp-*' -x wheel-size.h -x alloca.h -x > .deps -x charset.alias -x getdate.c -x ref-add.sed -x ref-del.sed -x po -x > config.h -x config.log -x config.status -x coreutils.info -x autom4te.cache > -x 'output.*' -x confdefs.h -x '.#*' -x CVS -x tests -x '*.1' -x config.hin > -x '*.orig' -x '*.rej' -x version.texi > coreutils-cvs-2-sort-with-isaac/src/sort.c > coreutils-cvs-3-sort-with-param-isaac/src/sort.c > --- coreutils-cvs-2-sort-with-isaac/src/sort.c 2005-11-30 > 12:48:44.000000000 +0000 > +++ coreutils-cvs-3-sort-with-param-isaac/src/sort.c 2005-12-01 > 02:35:23.000000000 +0000 > @@ -79,6 +79,8 @@ > # define DEFAULT_TMPDIR "/tmp" > #endif > > +#define SORT_ISAAC_WORDS 8 > + > /* Exit statuses. */ > enum > { > @@ -1173,16 +1175,17 @@ > { > struct isaac_state s; > int i; > - memcpy (&s, rand_state, sizeof s); > + isaac_copy (&s, rand_state); > isaac_seed_data (&s, text, len); > /* we can call isaac_seed_finish multiple times, but should only > call isaac_seed_start once */ > isaac_seed_finish (&s); > + > /* getting an irand_state and drawing random numbers would be more > kosher here, but also slower. so we just peek at the ISAAC state > array instead */ > for (i = 0; i < HASH_WORDS; i++) > - resbuf[i] = s.mm[ISAAC_WORDS - 1 - i]; > + resbuf[i] = s.mm[s.words - 1 - i]; > } > > /* Compare two lines A and B trying every key in sequence until there > @@ -1223,8 +1226,8 @@ > } > else if (key->random_hash) > { > - uint32_t diga[HASH_SIZE]; > - uint32_t digb[HASH_SIZE]; > + uint32_t diga[HASH_WORDS]; > + uint32_t digb[HASH_WORDS]; > get_hash (texta, lena, diga); > get_hash (textb, lenb, digb); > diff = memcmp (diga, digb, sizeof (diga)); > @@ -2490,7 +2493,7 @@ > > if (need_rand) > { > - rand_state = xzalloc (sizeof *rand_state); > + rand_state = isaac_new (NULL, SORT_ISAAC_WORDS); > if (randseed) > { > isaac_seed_start (rand_state); > _______________________________________________ > Bug-coreutils mailing list > Bug-coreutils@gnu.org > http://lists.gnu.org/mailman/listinfo/bug-coreutils _______________________________________________ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils