> > While a good shuffle should be able to start from the initial state and
> > generate a good shuffled result, it is slightly better to progressively
> > shuffle the previous result array into new positions than to start from
> > an initial state and compute a 1-off.
>
> Evidence? (BTW, the Fisher-Yates shuffle is theoretically as biased as
> the RNG that powers it.)

Specifically re: 1-off vs. progressive shuffling? Not on hand; it was an
evaluation I did 10 years ago, and while the difference wasn't that much, it
was notable: the rate at which any single ball reached a specific position
took longer (by iteration count) when resetting the array to initial
conditions (1-N) before each shuffle.
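The harness isn't on hand any more, but the idea was along these lines (a
rough sketch, not the original code; it assumes the textbook in-place
Fisher-Yates driven by `Math.random`, and a single run is noisy, so you'd
average both counts over many trials):

```js
// Sketch: count how many shuffle iterations it takes until every value has
// been observed in every position at least once, under the two regimes.
function fisherYates(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

function iterationsUntilFullCoverage(n, progressive) {
  const seen = new Set(); // "value@position" pairs observed so far
  let arr = Array.from({ length: n }, (_, i) => i);
  let iterations = 0;
  while (seen.size < n * n) {
    // "1-off" regime: reset to initial conditions (0..n-1) before every shuffle
    if (!progressive) arr = Array.from({ length: n }, (_, i) => i);
    fisherYates(arr);
    arr.forEach((value, pos) => seen.add(value + '@' + pos));
    iterations++;
  }
  return iterations;
}

console.log('reset each time:', iterationsUntilFullCoverage(10, false));
console.log('progressive    :', iterationsUntilFullCoverage(10, true));
```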
Fisher-Yates is biased by the usage of the number generator, not just by the
generator itself. I've been considering how to demonstrate it
graphically/visually, but I've failed so far. Take a list of 10 numbers and a
'perfect' RNG. For the last number, it's 90% guaranteed to have its shuffle
evaluated twice (10% of the time it won't move). It has a 10% chance to be in
the last position, but only a 1.1% chance to be in the second-to-last
position (1/10 * 1/9 = 1/90 = 0.011...), so it will NOT be in the last
position 90% of the time, but it will NOT be in the second-to-last position
98.9% of the time. For an ideal shuffle, every original position will end up
in every other position 10% of the time.
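One way to at least tabulate it, if not graph it (a sketch only): run the
textbook in-place Fisher-Yates a large number of times on a 10-element array
and record where each original index lands.

```js
// Sketch: tally, over many runs, how often each original index ends up in
// each final position. An ideal shuffle would show ~10% in every cell.
function fisherYates(arr) {
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr;
}

const N = 10;
const RUNS = 1e6;
// counts[original][finalPos] = number of runs that moved `original` there
const counts = Array.from({ length: N }, () => new Array(N).fill(0));

for (let run = 0; run < RUNS; run++) {
  const arr = fisherYates(Array.from({ length: N }, (_, i) => i));
  arr.forEach((original, finalPos) => counts[original][finalPos]++);
}

// Print each cell as a percentage of runs.
console.table(
  counts.map(row => row.map(c => (100 * c / RUNS).toFixed(2) + '%'))
);
```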
> > On Sun, Apr 29, 2018 at 3:27 PM, Isiah Meadows <[email protected]> wrote:
> >>
> >> BTW, I added this to my list of various proposed array additions (as a
> >> weak one) [1].
> >>
> >> I did do a little reading up and found that in general, there's a
> >> major glitch that makes shuffling very hard to get right (issues even
> >> Underscore's `_.shuffle` doesn't address), specifically that of the
> >> size of the random number generator vs how many permutations it can
> >> hit. Specifically, if you want any properly unbiased shuffles covering
> >> all permutations of arrays larger than 34 entries (and that's not a
> >> lot), you can't use any major engines' `Math.random` (which typically
> >> have a max seed+period of 128 bits). You have to roll your own
> >> implementation, and you need at least something like xorshift1024* [2]
> >> (1024-bit max seed/period size) for up to 170 entries, MT19937
> >> [3]/WELL19937 [4] (seed/period up to 2^19937-1) for up to 2080
> >> entries, or MT44497/WELL44497 for up to 4199 entries. Keep in mind the
> >> minimum seed/period size in bits grows roughly `ceil(log2(fact(N)))`
> >> where `N` is the length of the array [5], and the only way you can
> >> guarantee you can even generate all possible permutations of all
> >> arrays (ignoring potential bias) is through a true hardware generator
> >> (with its potentially infinite number of possibilities). Also, another
> >> concern is that the loop's bottleneck is specifically the random
> >> number generation, so you can't get too slow without resulting in a
> >> *very* slow shuffle.
> >>
> >> If you want anything minimally biased and much larger than that,
> >> you'll need a cryptographically secure pseudorandom number generator
> >> just to cover the possible states. (This is about where the
> >> intersection meets between basic statistics and cryptography, since
> >> cipher blocks are frequently that long.) But I'm leaving that as out
> >> of scope of that proposal.
> >>
> >> [1]: https://github.com/isiahmeadows/array-additions-proposal#arrayprototypeshuffle
> >> [2]: https://en.wikipedia.org/wiki/Xorshift#xorshift*
> >> [3]: https://en.wikipedia.org/wiki/Mersenne_Twister
> >> [4]: https://en.wikipedia.org/wiki/Well_equidistributed_long-period_linear
> >> [5]: http://www.wolframalpha.com/input/?i=plot+ceiling(log2(x!))+where+0+%3C+x+%3C+10000
> >>
> >> -----
> >>
> >> Isiah Meadows
> >> [email protected]
> >>
> >> Looking for web consulting? Or a new website?
> >> Send me an email and we can get started.
> >> www.isiahmeadows.com
> >>
> >> On Sun, Apr 29, 2018 at 4:01 PM, Alexander Lichter <[email protected]> wrote:
> >> > On 29.04.2018 18:34, Naveen Chawla wrote:
> >> >>
> >> >> I don't think there's such a thing as "real random" in digital algos,
> >> >> just "pseudo random".
> >> >
> >> > You are right. I meant 'unbiased' pseudo randomness here.
> >> >
> >> >> Apart from card games, what's the use case?
> >> >
> >> > There are a lot of potential use cases. The best that comes into my
> >> > mind is sampling test data.
> >> >
> >> > On 29.04.2018 19:01, Isiah Meadows wrote:
> >> >>
> >> >> I think this would be better suited for a library function rather
> >> >> than a language feature. I could see this also being useful also for
> >> >> randomized displays, but that's about it. And I'm not sure what an
> >> >> engine could provide here that a library couldn't - you can't really
> >> >> get much faster than what's in the language (minus bounds checking,
> >> >> but the likely frequent cache misses will eclipse that greatly), and
> >> >> it's not unlocking any real new possibilities.
> >> >
> >> > As Tab Atkins Jr. already pointed out it's not about performance
> >> > benefits. It's about how error-prone custom shuffle implementations
> >> > are/can be.
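As an aside, the `ceil(log2(fact(N)))` bound quoted above is easy to
sanity-check. A quick sketch (approximate, since it just sums floating-point
`Math.log2` terms) that finds the largest array length whose permutations a
generator with a given amount of state can even reach:

```js
// Minimum bits of generator state needed to reach all N! permutations.
function bitsForAllPermutations(n) {
  let bits = 0;
  for (let i = 2; i <= n; i++) bits += Math.log2(i);
  return Math.ceil(bits);
}

// Largest array length coverable by a generator with `bits` of state.
function maxLengthForStateBits(bits) {
  let n = 1;
  while (bitsForAllPermutations(n + 1) <= bits) n++;
  return n;
}

console.log(maxLengthForStateBits(128));   // ~34   (typical Math.random)
console.log(maxLengthForStateBits(1024));  // ~170  (xorshift1024*)
console.log(maxLengthForStateBits(19937)); // ~2080 (MT19937/WELL19937)
console.log(maxLengthForStateBits(44497)); // ~4199 (MT44497/WELL44497)
```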
_______________________________________________
es-discuss mailing list
[email protected]
https://mail.mozilla.org/listinfo/es-discuss

