Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG)
On Jun 28, 2014, at 0:13, Matti Nykyri matti.nyk...@iki.fi wrote: On Jun 27, 2014, at 0:00, Kai Krakow hurikha...@gmail.com wrote: Matti Nykyri matti.nyk...@iki.fi schrieb: If you are looking a mathematically perfect solution there is a simple one even if your list is not in the power of 2! Take 6 bits at a time of the random data. If the result is 62 or 63 you will discard the data and get the next 6 bits. This selectively modifies the random data but keeps the probabilities in correct balance. Now the probability for index of 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. Why not do just something like this? index = 0; while (true) { index = (index + get_6bit_random()) % 62; output char_array[index]; } Done, no bits wasted. Should have perfect distribution also. We also don't have to throw away random data just to stay within unaligned boundaries. The unalignment is being taken over into the next loop so the error corrects itself over time (it becomes distributed over the whole set). Distribution will not be perfect. The same original problem persists. Probability for index 0 to 1 will be 2/64 and for 2 to 61 it will be 1/64. Now the addition changes this so that index 0 to 1 reflects to previous character and not the original index. The distribution of like 10GB of data should be quite even but not on a small scale. The next char will depend on previous char. It is 100% more likely that the next char is the same or one index above the previous char then any of the other ones in the series. So it is likely that you will have long sets of same character. Random means that for next char the probability is always even, 1/62. And like mentioned in Dilbert it is impossible to say that something is random but possible to say that it isn't. If wasting 6bit of data seems large, do this: index = get_6bit_random(); while (index 61) { index = 1; index |= get_1bit_random(); index = 0x3F; } return index; It will waste 1 bit at a time until result is less than 62. This will slightly change probabilities though :/ Sorry this example is really flawed :( If next6bit is over 61 there are only two possible values for it: 62 or 63 - that is 0x3E and 0x3F. So you see that only one bit changes. But that bit is random! So least significant bit is random and does not need to be discarded :) index = get_6bit_random(); while (index 61) { index = 5; index |= get_5bit_random(); index = 0x3F; } return index;
[gentoo-user] Re: OT: Mapping random numbers (PRNG)
On Sat, 28 Jun 2014 19:53:08 -0500 Canek Peláez Valdés can...@gmail.com wrote: On Sat, Jun 28, 2014 at 7:37 PM, gottl...@nyu.edu wrote: On Sat, Jun 28 2014, Canek Peláez Valdés wrote: That doesn't matter. Take a non-negative integer N; if you flip a coin an infinite number of times, then the probability of the coin landing on the same face N times in a row is 1. This is certainly true. This means that it is *guaranteed* to happen That is not as clear. Let me be more precise (and please correct me if I'm wrong): It is guaranteed to happen at some point in the infinite sequence of random flip coins, but we cannot know when it will happen, only that it will happen. That's the way I got it when I took my probability courses, admittedly many years ago. The probability is 1 in the sense that the as the number of flips M increases, so does the probability of getting N heads (or tails) in a row also increases, and the upper bound for the sequence of probabilities is 1. It's not a probability about something which actually happens; no one so far has been able to flip a coin an infinite number of times, not even a computer. In any way, even if I'm wrong and it is not guaranteed, the main point remains true: the probability of getting a large sequence of the same number from a RNG is 1 for every true random RNG, and therefore seeing a large sequence of the same number form a RNG doesn't (technically) means that it is broken. It's true that that wouldn't *prove* the generator is broken. But it might be a good reason to take another look at the algorithm. Regards.
Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG)
On Sat, Jun 28, 2014 at 8:46 PM, »Q« boxc...@gmx.net wrote: On Sat, 28 Jun 2014 19:53:08 -0500 Canek Peláez Valdés can...@gmail.com wrote: On Sat, Jun 28, 2014 at 7:37 PM, gottl...@nyu.edu wrote: On Sat, Jun 28 2014, Canek Peláez Valdés wrote: That doesn't matter. Take a non-negative integer N; if you flip a coin an infinite number of times, then the probability of the coin landing on the same face N times in a row is 1. This is certainly true. This means that it is *guaranteed* to happen That is not as clear. Let me be more precise (and please correct me if I'm wrong): It is guaranteed to happen at some point in the infinite sequence of random flip coins, but we cannot know when it will happen, only that it will happen. That's the way I got it when I took my probability courses, admittedly many years ago. The probability is 1 in the sense that the as the number of flips M increases, so does the probability of getting N heads (or tails) in a row also increases, and the upper bound for the sequence of probabilities is 1. It's not a probability about something which actually happens; no one so far has been able to flip a coin an infinite number of times, not even a computer. And no one will. Ever. In any way, even if I'm wrong and it is not guaranteed, the main point remains true: the probability of getting a large sequence of the same number from a RNG is 1 for every true random RNG, and therefore seeing a large sequence of the same number form a RNG doesn't (technically) means that it is broken. It's true that that wouldn't *prove* the generator is broken. But it might be a good reason to take another look at the algorithm. Agreed. Regards. -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG)
On 06/26/2014 11:07 PM, Kai Krakow wrote: It is worth noting that my approach has the tendency of generating random characters in sequence. sorry but had to share this http://dilbert.com/strips/comic/2001-10-25/
Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG)
On Jun 27, 2014, at 11:55, thegeezer thegee...@thegeezer.net wrote: On 06/26/2014 11:07 PM, Kai Krakow wrote: It is worth noting that my approach has the tendency of generating random characters in sequence. sorry but had to share this http://dilbert.com/strips/comic/2001-10-25/ This is a good one :) have really been thinking this same comic previosly when writing to this thread...
Re: [gentoo-user] Re: OT: Mapping random numbers (PRNG)
On Jun 27, 2014, at 0:00, Kai Krakow hurikha...@gmail.com wrote: Matti Nykyri matti.nyk...@iki.fi schrieb: If you are looking a mathematically perfect solution there is a simple one even if your list is not in the power of 2! Take 6 bits at a time of the random data. If the result is 62 or 63 you will discard the data and get the next 6 bits. This selectively modifies the random data but keeps the probabilities in correct balance. Now the probability for index of 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. Why not do just something like this? index = 0; while (true) { index = (index + get_6bit_random()) % 62; output char_array[index]; } Done, no bits wasted. Should have perfect distribution also. We also don't have to throw away random data just to stay within unaligned boundaries. The unalignment is being taken over into the next loop so the error corrects itself over time (it becomes distributed over the whole set). Distribution will not be perfect. The same original problem persists. Probability for index 0 to 1 will be 2/64 and for 2 to 61 it will be 1/64. Now the addition changes this so that index 0 to 1 reflects to previous character and not the original index. The distribution of like 10GB of data should be quite even but not on a small scale. The next char will depend on previous char. It is 100% more likely that the next char is the same or one index above the previous char then any of the other ones in the series. So it is likely that you will have long sets of same character. Random means that for next char the probability is always even, 1/62. And like mentioned in Dilbert it is impossible to say that something is random but possible to say that it isn't. If wasting 6bit of data seems large, do this: index = get_6bit_random(); while (index 61) { index = 1; index |= get_1bit_random(); index = 0x3F; } return index; It will waste 1 bit at a time until result is less than 62. This will slightly change probabilities though :/
[gentoo-user] Re: OT: Mapping random numbers (PRNG)
Matti Nykyri matti.nyk...@iki.fi schrieb: If you are looking a mathematically perfect solution there is a simple one even if your list is not in the power of 2! Take 6 bits at a time of the random data. If the result is 62 or 63 you will discard the data and get the next 6 bits. This selectively modifies the random data but keeps the probabilities in correct balance. Now the probability for index of 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. Why not do just something like this? index = 0; while (true) { index = (index + get_6bit_random()) % 62; output char_array[index]; } Done, no bits wasted. Should have perfect distribution also. We also don't have to throw away random data just to stay within unaligned boundaries. The unalignment is being taken over into the next loop so the error corrects itself over time (it becomes distributed over the whole set). -- Replies to list only preferred.
[gentoo-user] Re: OT: Mapping random numbers (PRNG)
Kai Krakow hurikha...@gmail.com schrieb: Matti Nykyri matti.nyk...@iki.fi schrieb: If you are looking a mathematically perfect solution there is a simple one even if your list is not in the power of 2! Take 6 bits at a time of the random data. If the result is 62 or 63 you will discard the data and get the next 6 bits. This selectively modifies the random data but keeps the probabilities in correct balance. Now the probability for index of 0-61 is 1/62 because the probability to get 62-63 out of 64 if 0. Why not do just something like this? index = 0; while (true) { index = (index + get_6bit_random()) % 62; output char_array[index]; } Done, no bits wasted. Should have perfect distribution also. We also don't have to throw away random data just to stay within unaligned boundaries. The unalignment is being taken over into the next loop so the error corrects itself over time (it becomes distributed over the whole set). It is worth noting that my approach has the tendency of generating random characters in sequence. I think there are several possibilities to fix this, one being swapping two elements in the char array on each access. That fixed this particular problem in my tests. Here's are quick ruby snippet I played with: $ cat random.rb # danger: ugly code ahead dist = [0] * 62 seq = [0, 0] chars = [*('A'..'Z'), *('a'..'z'), *('0'..'9')] orig_chars = chars.dup index = 0 old = nil (62 * 10).times do # generate random wrapped index index = (index + rand(64)) % 62 # swap two indexed positions and output the char chars[0], chars[index] = chars[index], chars[0] $stdout.write chars[0] # count the generated indexes dist[index] = dist[index] + 1 # count if indexes were generated in sequence by looking # them up in the original sorted chars array unless old.nil? seq[0] = seq[0] + 1 if old orig_chars.index(chars[0]) seq[1] = seq[1] + 1 if old orig_chars.index(chars[0]) end old = orig_chars.index(chars[0]) end puts # check the average, it should always equal the loop count # multiplicator avg = dist.inject(:+) / 62.0 # calculate if the algorithm preferred sequences seq = [*seq, seq[0] - seq[1]] # output some stats puts avg puts dist.inspect puts dist.map {|v| avg - v }.inspect puts seq.inspect -- Replies to list only preferred.