On Fri, Jun 6, 2014 at 1:39 PM,  <meino.cra...@gmx.de> wrote:
> Canek Peláez Valdés <can...@gmail.com> [14-06-06 17:36]:
>> On Thu, Jun 5, 2014 at 9:56 PM,  <meino.cra...@gmx.de> wrote:
>> > Hi,
>> >
>> > I am experimenting with the C code of the ISAAC pseudo random number 
>> > generator
>> > (http://burtleburtle.net/bob/rand/isaacafa.html).
>> >
>> > Currently the implementation creates (on my embedded linux) 32 bit
>> > hexadecimal output.
>>
>> So it's a 32 bit integer.
>>
>> > From this I want to create random numbers in the range of [a-Za-z0-9]
>> > *without violating randomness* and (if possible) without throwing
>> > away bits of the output.
>>
>> You mean *characters* int the range [A-Za-z0-9]?
>>
>> > How can I do this mathemtically (in concern of the quality of output)
>> > correct?
>>
>> The easiest thing to do would be:
>>
>> -------------------------------------------------------------------------------
>> #include <time.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>>
>> #define N (26+26+10)
>>
>> static char S[] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J',
>> 'K', 'L', 'M',
>>                     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
>> 'X', 'Y', 'Z',
>>                     'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
>> 'k', 'l', 'm',
>>                     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w',
>> 'x', 'y', 'z',
>>                     '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
>>
>> int
>> next_character()
>> {
>>         // Use the correct call for ISAAC instead of rand()
>>         unsigned int idx = rand() % N;
>>         return S[idx];
>> }
>>
>> int
>> main(int argc, char* argv[])
>> {
>>         // Use the correct call for initializing the ISAAC seed
>>         srand((unsigned int)time(NULL));
>>         for (int i = 0; i < 20; i++) // --std=c99
>>                 printf("%c\n", next_character());
>>         return 0;
>> }
>> -------------------------------------------------------------------------------
>>
>> If the ISAAC RNG has a good distribution, then the next_character()
>> function will give a good distribution among the set [A-Za-z0-9].
>>
>> Unless I missunderstood what you meant with "create random numbers in
>> the range of [a-Za-z0-9]".
>>
>> Regards.
>> --
>> Canek Peláez Valdés
>> Profesor de asignatura, Facultad de Ciencias
>> Universidad Nacional Autónoma de México
>>
>
> Hi,
>
> Thank you very much for the input! :)
>
> I have a question about the algorithm:
> Suppose rand() has an equal distribution of numbers and furthermore
> one has a count of 2^32 random numbers listed in numerical sort
> order.
> In this list each number would appear (nearly) with the same count: 1
>
> To get an better imagination of that...suppose the rand() would only
> return numbers in the range of 1...12 and the alphabet has only 8
> characters (as 2^32 is not devideable by 62)
>
> rand():
> 1 2 3 4 5 6 7 8 9 10 11 12
>
> rand()%N : rand()%7
> 1 2 3 4 5 6 7 0 1  2  3  4
>
> or in other words: An even distribution of numbers of rand()
> would result in a unevenly distributed sequence of characters...or?
> This would break the quality of ISAACs output.
>
> I am sure I did something wrong here...but where is the logic trap?

In theory, it doesn't matter if the number of random characters you
want is not multiple of the size of the set of characters. In
practice, it doesn't matter if the number of random characters you
want is big enough.

Consider the following modification to my program (I changed the order
of the array S so it's sorted and I can do binary search):

----------------------------------------------------------------------
#include <time.h>
#include <stdio.h>
#include <stdlib.h>

#define N (10+26+26)

static char S[] = {
        '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
        'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
        'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
        'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
        'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'
};

static int
index_of_aux(char c, int a, int b)
{
        if (a > b)
                return -1;
        int m = (a + b) / 2;
        if (S[m] == c)
                return m;
        if (S[m] > c)
                return index_of_aux(c, a, m-1);
        return index_of_aux(c, m+1, b);
}

static int
index_of(char c)
{
        return index_of_aux(c, 0, N-1);
}

int
next_character()
{
        // Use the correct call for isaac instead of rand()
        unsigned int idx = rand() % N;
        return S[idx];
}

int
main(int argc, char* argv[])
{
        srand((unsigned int)time(NULL));
        int total = 1000000;

        // Size = 62
        int count[] = {
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
        };

        for (int i = 0; i < total; i++) {
                char c = next_character();
                count[index_of(c)]++;
        }

        int min = total, max = 0;
        for (int i = 0; i < N; i++) {
                printf("Count %d: %d\n", i, count[i]);
                min = (count[i] < min) ? count[i] : min;
                max = (count[i] > max) ? count[i] : max;
        }
        printf("Min: %d\n", min);
        printf("Max: %d\n", max);
        printf("Diff: %d\n", (max-min));
        return 0;
}
----------------------------------------------------------------------

The output is the following:

Count 0: 16060
Count 1: 16102
Count 2: 15968
Count 3: 16075
...
Count 59: 16220
Count 60: 16143
Count 61: 16177
Min: 15819
Max: 16366
Diff: 547

As you can see, all the characters appear  around 16,000 times, so
it's pretty well distributed. If the RNG has a good distribution, it
will *always* look something like this. The difference will *never* be
0, since no RNG is perfectly distributed.

So, TL;DR: it doesn't matter if total is not multiple of N.

Regards.
-- 
Canek Peláez Valdés
Profesor de asignatura, Facultad de Ciencias
Universidad Nacional Autónoma de México

Reply via email to