On Sat, Jun 07, 2014 at 12:03:29AM +0300, Matti Nykyri wrote:
> On Thu, Jun 05, 2014 at 10:58:51PM -0500, Canek Peláez Valdés wrote:
> > On Thu, Jun 5, 2014 at 9:56 PM, <[email protected]> wrote:
> > > Hi,
> > >
> > > I am experimenting with the C code of the ISAAC pseudo random number
> > > generator
> > > (http://burtleburtle.net/bob/rand/isaacafa.html).
> > >
> > > Currently the implementation creates (on my embedded linux) 32 bit
> > > hexadecimal output.
> >
> > So it's a 32 bit integer.
> >
> > > From this I want to create random numbers in the range of [a-Za-z0-9]
> > > *without violating randomness* and (if possible) without throwing
> > > away bits of the output.
> >
> > You mean *characters* in the range [A-Za-z0-9]?
>
> Well, this isn't as simple a problem as it sounds. A random 32 bit integer
> has 32 bits of randomness. If you take the division remainder of 62 from
> this integer you will get only 5,95419631039 bits of randomness
> (log(62)/log(2)). So you are wasting 81,4% of your random data. That is
> quite a lot, and random data is usually quite expensive. You can save your
> precious random data by taking only 6 bits from your 32 bit integer and
> taking the remainder of dividing them by 62. Then you will be wasting only
> 0,8% of random data. Another problem is alignment, but that is about
> mathematical correctness.
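To spell out the arithmetic behind those percentages (only a worked restatement of the figures above; H is my own shorthand for the bits of randomness carried by one character out of 62):

```latex
H = \log_2 62 = \frac{\log 62}{\log 2} \approx 5.954 \text{ bits}
\\
1 - \frac{H}{32} \approx 1 - 0.186 = 0.814 \quad (81.4\% \text{ wasted per 32 bit word})
\\
1 - \frac{H}{6} \approx 1 - 0.992 = 0.008 \quad (0.8\% \text{ wasted per 6 bit chunk})
```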
>
> > > How can I do this mathemtically (in concern of the quality of output)
> > > correct?
> >
> > The easiest thing to do would be:
>
> The easiest is not mathematically correct though. Random data will stay
> random only if you select and modify it so that randomness is preserved.
> If you take the division remainder of 62 from a 32 bit integer, there are
> 69 273 667 possibilities for each remainder of 3 or less. For each
> remainder of 4 or more the number of possibilities is 69 273 666. In the
> mathematically ideal case the probability for every index of the list
> should be the same: 1/62 = 1,61290322581%. But the modulo 62 modifies this
> probability: for indexes 0-3 the probability is 69 273 667/2^32 =
> 1,61290324759%. And for indexes 4-61 the probability will be
> 69 273 666/2^32 = 1,6129032243%.
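Those counts can be checked with plain integer arithmetic. A minimal sketch, assuming a C99 compiler; residue_count is a name I made up for illustration and has nothing to do with the ISAAC code itself:

```c
#include <assert.h>
#include <stdint.h>

// Number of values v in [0, range) for which v % n == r.
// Hypothetical helper, only meant for checking the modulo bias by hand.
static uint64_t residue_count(uint64_t range, uint64_t n, uint64_t r)
{
    assert(n > 0 && r < n);
    if (r >= range)
        return 0;                   // r itself is already out of range
    // r, r + n, r + 2n, ... up to the largest such value below range
    return (range - 1 - r) / n + 1;
}
```

With range = 2^32 and n = 62, residue_count(UINT64_C(1) << 32, 62, 3) gives 69273667 and residue_count(UINT64_C(1) << 32, 62, 4) gives 69273666, which matches the figures above.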
>
> If you wish not to waste those random bits the probabilities will get
> worse. With 6 bits of random data and modulo 62 the probability for
> indexes 0-1 will be 2/64 and for indexes 2-61 it will be 1/64. This is a
> very significant change because the first and second index will appear
> twice as often as the rest. If you add 2 characters to your list you will
> have perfect alignment and you can take 6 bits of data without modifying
> the probabilities.
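The 6 bit case is small enough to enumerate completely. A rough sketch (tally_6bit_mod is a hypothetical helper, not part of any library) that counts how many of the 64 possible 6 bit values land on each index:

```c
#include <assert.h>
#include <stddef.h>

// Reduce every 6 bit value (0..63) modulo n and count the hits per index.
// counts must have room for at least n entries; len is its real size.
static void tally_6bit_mod(unsigned n, unsigned counts[], size_t len)
{
    assert(n > 0 && len >= n);
    for (size_t i = 0; i < len; i++)
        counts[i] = 0;
    for (unsigned v = 0; v < 64; v++)
        counts[v % n]++;
}
```

With n = 62, counts[0] and counts[1] end up as 2 and counts[2] through counts[61] as 1. With n = 64 every count is 1, which is the perfect alignment you get from a 64 character list.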
>
> If you are looking for a mathematically perfect solution there is a simple
> one even if the length of your list is not a power of 2! Take 6 bits at a
> time of the random data. If the result is 62 or 63 you discard the data
> and get the next 6 bits. This selectively modifies the random data but
> keeps the probabilities in correct balance. Now the probability for each
> index of 0-61 is 1/62 because the probability of getting 62-63 out of 64
> is 0.
>
> > -------------------------------------------------------------------------------
> > #include <time.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > #define N (26+26+10)
> >
> > static char S[] = {
> >     'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
> >     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
> >     'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
> >     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
> >     '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
> >
> > int
> > next_character()
> > {
> >     // Use the correct call for ISAAC instead of rand()
> >     unsigned int idx = rand() % N;
> >     return S[idx];
> > }
>
> So modify the next_character function:
>
> char next_character()
> {
>     static unsigned int pool = 0; // buffered random bits (assumes 32 bit int)
>     static char bits_avail = 0;   // unused bits left in pool
>     unsigned char result = 0;
>     char move_bits = 0;
>     char bits_moved = 0;
>
>     do {
>         if (bits_moved == 6) {
>             // the previous 6 bit value was 62 or 63: discard it
>             result = 0;
>             bits_moved = 0;
>         }
>
>         if (!bits_avail) {
>             // Use the correct call for ISAAC instead of rand()
>             pool = rand();
>             bits_avail = 32;
>         }
>
>         // take as many of the missing bits as the pool can supply
>         move_bits = 6 - bits_moved;
>         if (move_bits > bits_avail)
>             move_bits = bits_avail;
>         result <<= move_bits;
>         result = (result | (pool & (0xFF >> (8 - move_bits)))) & 0x3F;
>         bits_avail -= move_bits;
>         bits_moved += move_bits;
>         pool >>= move_bits;
>
>     } while (bits_moved != 6 || result > 61);
>
>     return S[result];
> }
Well actually it looks simpler if you break it up like this:

unsigned char get_6bits()
{
    static unsigned int pool = 0;  // buffered random bits (assumes 32 bit int)
    static char bits_avail = 0;    // unused bits left in pool
    unsigned char result = 0;

    // get 2 bits 3 times: 32 is divisible by 2
    for (int i = 0; i < 3; i++) { // --std=c99
        // Fill the buffer if it is empty! (bits_avail < 0 would be an error)
        if (bits_avail <= 0) {
            // Use the correct call for ISAAC instead of rand()
            pool = rand();
            bits_avail = 32;
        }

        result <<= 2;                    // move two bits to the left
        result = result | (pool & 0x3);  // add the two least significant bits
        pool >>= 2;                      // move two bits to the right
        bits_avail -= 2;
    }

    return result; // result has 6 bits of random data...
}

char next_character()
{
    unsigned char idx = 0;

    do {
        idx = get_6bits();
    } while (idx > 61);

    return S[idx];
}
Very simple :)
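If you want to try the same idea without static variables, here is a minimal sketch with the random source passed in as a function pointer, so the ISAAC call can be plugged in later. All names here (u32_source, bit_pool, take6, pool_next_char, counter_source) are my own placeholders, and counter_source is only a deterministic stand-in for experiments, NOT a random generator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical source type: any function returning 32 fresh random bits.
typedef uint32_t (*u32_source)(void);

struct bit_pool {
    u32_source next_u32;  // where fresh bits come from (e.g. an ISAAC wrapper)
    uint32_t bits;        // unused bits, consumed from the low end
    int avail;            // number of unused bits left (0..32)
};

static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789";

// Collect 6 bits, two at a time, refilling the pool when it runs empty.
static unsigned take6(struct bit_pool *p)
{
    unsigned v = 0;
    for (int i = 0; i < 3; i++) {
        if (p->avail == 0) {
            p->bits = p->next_u32();
            p->avail = 32;
        }
        v = (v << 2) | (p->bits & 0x3u);
        p->bits >>= 2;
        p->avail -= 2;
    }
    return v;
}

// Rejection step: values 62 and 63 are thrown away and redrawn.
static char pool_next_char(struct bit_pool *p)
{
    unsigned idx;
    assert(p != NULL && p->next_u32 != NULL);
    do {
        idx = take6(p);
    } while (idx > 61);
    return ALPHABET[idx];
}

// Stand-in source for experiments only: a plain counter, not random at all.
static uint32_t counter_source(void)
{
    static uint32_t n = 0;
    return n++;
}
```

You would start with struct bit_pool p = { counter_source, 0, 0 }; and call pool_next_char(&p) in a loop, then swap counter_source for a wrapper around the real generator. Keeping the pool in a struct also makes it possible to run several independent streams.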
> This function will give perfect distribution of 1/62 probability for
> every index. It will waste 6 bits with the probability of 1/32 (2/64).
>
> > int
> > main(int argc, char* argv[])
> > {
> >     // Use the correct call for initializing the ISAAC seed
> >     srand((unsigned int)time(NULL));
> >     for (int i = 0; i < 20; i++) // --std=c99
> >         printf("%c\n", next_character());
> >     return 0;
> > }
> > -------------------------------------------------------------------------------
> >
> > If the ISAAC RNG has a good distribution, then the next_character()
> > function will give a good distribution among the set [A-Za-z0-9].
> >
> > Unless I misunderstood what you meant with "create random numbers in
> > the range of [a-Za-z0-9]".
--
-Matti