Fixed-size arrays and randomShuffle()
May be that this works as intended, but it fooled me: --- import std.random; import std.stdio; void main() { int[5] a = 0; a[0] = 1; int[] b = [1, 0, 0, 0, 0]; randomShuffle(a); writeln(a); randomShuffle(b); writeln(b); } --- In DMD 2.0.59 the fixed-size array a won't be shuffled (the dynamic array b will), and you won't get any warning about it. I'm not sure whether this counts as something that should be reported as a bug/improvement, nor if only randomShuffle() displays this behaviour, perhaps you could enlighten me.
Re: Fixed-size arrays and randomShuffle()
On 05/03/2012 06:30 AM, Vidar Wahlberg wrote: May be that this works as intended, but it fooled me: --- import std.random; import std.stdio; void main() { int[5] a = 0; a[0] = 1; int[] b = [1, 0, 0, 0, 0]; randomShuffle(a); Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]); writeln(a); randomShuffle(b); writeln(b); } --- In DMD 2.0.59 the fixed-size array a won't be shuffled (the dynamic array b will), and you won't get any warning about it. I'm not sure whether this counts as something that should be reported as a bug/improvement, nor if only randomShuffle() displays this behaviour, perhaps you could enlighten me. Ali
Re: Fixed-size arrays and randomShuffle()
On 2012-05-03 15:34, Ali Çehreli wrote: Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]); True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of does it have to be this way? rather than how do you do this?, even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense? A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's Collection.shuffle() method. Perhaps someone can shed some light on this?
Re: Fixed-size arrays and randomShuffle()
On 05/03/2012 06:55 AM, Vidar Wahlberg wrote: On 2012-05-03 15:34, Ali Çehreli wrote: Fixed-length arrays are value types. 'a' is copied to randomShuffle() so its copy is shuffled. Passing a slice of the whole array works: randomShuffle(a[]); True, it is however still not exceptionally newbie (or perhaps even user?) friendly (my question was more of does it have to be this way? rather than how do you do this?, even though I appreciate the answer on how to do it). Is it not possible for the compilator to let you know that what you're doing doesn't make any sense? Random shuffle can work on a fixed-length array and there is a way for the implementation to know: import std.traits; // ... __traits(isStaticArray, a) That can be used in a template constraint. A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's Collection.shuffle() method. Perhaps someone can shed some light on this? I have no idea with that one. Ali
Re: Fixed-size arrays and randomShuffle()
On 03.05.2012 18:02, Ali Çehreli wrote: A quick follow-up: I've tried some various random number engines, but neither come even close to the performance of whatever is used for Java's Collection.shuffle() method. Perhaps someone can shed some light on this? I have no idea with that one. It's all about RNG used behind the scenes. Default one is Mersane Twister which (AFAIK) is not particularly fast. But has a period of 2^19937 elements. You should probably use XorShift or MinstdRand generator and a version of shuffle with 2nd parameter. -- Dmitry Olshansky
Re: Fixed-size arrays and randomShuffle()
On 2012-05-03 16:26, Dmitry Olshansky wrote: It's all about RNG used behind the scenes. Default one is Mersane Twister which (AFAIK) is not particularly fast. But has a period of 2^19937 elements. You should probably use XorShift or MinstdRand generator and a version of shuffle with 2nd parameter. I tried those two as well. Still significantly slower than what I can achieve in Java.
Re: Fixed-size arrays and randomShuffle()
On Thursday, 3 May 2012 at 14:41:20 UTC, Vidar Wahlberg wrote: I tried those two as well. Still significantly slower than what I can achieve in Java. You might want to post your code... I wrote this code in D: -=-=-=- import std.random, std.stdio, std.datetime; void main() { int[] arr = new int[5_000_000]; foreach(i, ref e; arr) e = i; StopWatch sw = AutoStart.yes; arr.randomShuffle(); sw.stop(); writeln(Took , sw.peek().to!(msecs, double)(), ms); } -=-=-=- And it performed _identically_ to this in Java: -=-=-=- import java.util.ArrayList; import java.util.Collections; public class Main { public static void main(String[] args) { ArrayListInteger ints = new ArrayList(5000); for(int i = 0; i 5_000_000; ++i) ints.add(i); long startTime = System.currentTimeMillis(); Collections.shuffle(ints); long endTime = System.currentTimeMillis(); System.out.println(Took + (endTime - startTime) + ms); } } -=-=-=-
Re: Fixed-size arrays and randomShuffle()
On 2012-05-03 17:31, Chris Cain wrote: You might want to post your code... Sure! D: --- import std.random; import std.stdio; void main() { auto iterations = 1000; int[] a; for (int i = 0; i 42; ++i) a ~= i; for (int i = 0; i iterations; ++i) randomShuffle(a); } naushika:~/projects dmd random.d time ./random ./random 38,35s user 0,05s system 99% cpu 38,420 total --- Java (7): --- import java.util.ArrayList; import java.util.Collections; public class Rnd { public static void main(String... args) { int iterations = 1000; ArrayListInteger a = new ArrayListInteger(); for (int i = 0; i 42; ++i) a.add(i); for (int i = 0; i iterations; ++i) Collections.shuffle(a); } } naushika:~/projects javac Rnd.java time java Rnd java Rnd 9,92s user 0,03s system 100% cpu 9,922 total ---
Re: Fixed-size arrays and randomShuffle()
On a related note, how did you get the other random generators working? I tried to compile this and it gives me an error: -=-=-=- import std.random, std.stdio, std.datetime; void main() { int[] arr = new int[5_000_000]; foreach(i, ref e; arr) e = i; auto rng = MinstdRand0(1); rng.seed(unpredictableSeed); StopWatch sw = AutoStart.yes; randomShuffle(arr, rng); sw.stop(); writeln(Took , sw.peek().to!(msecs, double)(), ms); } -=-=-=- The error: -=-=-=- C:\D\dmd2\windows\bin\..\..\src\phobos\std\random.d(1263): Error: cannot implicitly convert expression (rndGen()) of type MersenneTwisterEngine!(uint,32,624,397,31,-1727483681u,11,7,-1658038656u,15,-272236544u,18) to LinearCongruentialEngine!(uint,16807,0,2147483647) -=-=-=- Is this a bug in 2.059? Or am I doing something wrong?
Re: Fixed-size arrays and randomShuffle()
OK, I took a look at your example, and I saw the kind of performance you were seeing. I performed an investigation on what could cause such a disparity and came up with a conclusion. The answer is two things: First of all, as noted above, D's default generator is a mersenne twister RNG. You can emulate Java's RNG like so: auto rng = LinearCongruentialEngine!(ulong, 25214903917uL, 11uL, 2uL^^48uL)(); Once I equalized that, I looked into the various methods that are called and settled in on uniform. https://github.com/D-Programming-Language/phobos/blob/master/std/random.d#L1154 As you can see, there's a division at line 1154 and another at line 1158. This means there's a minimum of two division operations every time uniform is called. Now, normally this isn't a big deal, but if we really want maximum performance, we need to eliminate at least one. If you replace lines 1154-1161 (auto bucketSize ... to return...) with: CountType rnum, result; do { rnum = cast(CountType) uniform!CountType(urng); result = rnum % count; } while (rnum count (rnum - result + (count - 1)) (rnum - result - 1)); return cast(typeof(return)) (min + result); Then the time taken shrinks down to roughly the same (within a tenth of a second) as Java. I'll probably clean this up (and write some comments on how this works) and see about submitting it as a patch unless anyone sees anything wrong with this approach.