The algorithm was fine, the issue was that sampling from a range of integers was heavily optimized for the case of sampling repeatedly from the same interval, rather than from different intervals. Since the Fisher-Yates shuffle changes the interval for every sample, it was effectively a worst case scenario. Rafael posted a change that gets around that worst case by not calling the version that's optimized for sampling from the same interval repeatedly.
On Sunday, January 24, 2016, Dan <[email protected]> wrote: > As requested I deleted the post. But algorithms for creating permutations > are standard and very much in the public domain (what does > TheArtOfProgramming say?). If someone really invests a little effort into > it, a more formally correct algorithm can be implemented. > > On Saturday, January 23, 2016 at 4:43:36 PM UTC+2, Kevin Squire wrote: >> >> Hi Dan, >> >> It would also be good if you deleted that post (or edited it and removed >> the code), for the same reason Viral mentioned: if someone reads the post >> and then creates a pull request for changing the Julia implementation, it >> could be argued that that implementation was derived from GPL code, which >> makes it GPL. A clean-room implementation (such as creating the patch from >> a higher level description of the code) is okay. >> >> (Viral, it would probably be good for you to remove the code from your >> post as well. I did so in this post below.) >> >> Cheers, >> Kevin >> >> On Saturday, January 23, 2016 at 6:30:07 AM UTC-8, Viral Shah wrote: >>> >>> Unfortunately, this makes the implementation GPL. If you can describe >>> the algorithm in an issue on github, someone can do a cleanroom >>> implementation. >>> >>> -viral >>> >>> On Saturday, January 23, 2016 at 3:17:19 PM UTC+5:30, Dan wrote: >>>> >>>> Another way to go about it, is to look at R's implementation of a >>>> random permutation and recreate it in Julia, hoping for similar >>>> performance. Having done so, the resulting Julia code is: >>>> >>>> <GPL code removed> >>>> A size 1,000,000 permutation was generated x2 faster with this method. >>>> But, this method uses uniform floating random numbers, which might not >>>> be uniform on integers for large enough numbers. In general, it should be >>>> possible to optimize a more correct implementation. But if R envy is a >>>> driver, R performance is recovered with a pure Julia implementation (the R >>>> implementation is in C). >>>> >>>> On Saturday, January 23, 2016 at 7:26:49 AM UTC+2, Rafael Fourquet >>>> wrote: >>>>> >>>>> > Let's capture this as a Julia performance issue on github, >>>>> > if we can't figure out an easy way to speed this up right away. >>>>> >>>>> I think I remember having identified a potentially sub-optimal >>>>> implementation of this function few weeks back (perhaps no more than >>>>> what Tim suggested) and had planned to investigate further (when time >>>>> permits...) >>>>> >>>>
