Fixed-size arrays and randomShuffle()

2012-05-03 Thread Vidar Wahlberg

Maybe this works as intended, but it fooled me:
---
import std.random;
import std.stdio;
void main() {
    int[5] a = 0;
    a[0] = 1;
    int[] b = [1, 0, 0, 0, 0];
    randomShuffle(a);
    writeln(a);
    randomShuffle(b);
    writeln(b);
}
---

In DMD 2.059 the fixed-size array a is not shuffled (the dynamic
array b is), and the compiler gives no warning about it.

I'm not sure whether this should be reported as a bug or an
improvement, or whether randomShuffle() is the only function that
behaves this way. Perhaps you could enlighten me.


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Ali Çehreli

On 05/03/2012 06:30 AM, Vidar Wahlberg wrote:

> Maybe this works as intended, but it fooled me:
> ---
> import std.random;
> import std.stdio;
> void main() {
>     int[5] a = 0;
>     a[0] = 1;
>     int[] b = [1, 0, 0, 0, 0];
>     randomShuffle(a);


Fixed-length arrays are value types. 'a' is copied to randomShuffle() so 
its copy is shuffled. Passing a slice of the whole array works:


randomShuffle(a[]);
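
The copy-versus-slice distinction can be seen without any randomness. The following is only a minimal sketch, not code from Phobos: zeroFirst is a hypothetical helper that takes its argument by value, which is the behaviour described above for the randomShuffle() call in this thread.

```d
import std.stdio;

// Hypothetical helper (not part of Phobos): takes its argument by value.
void zeroFirst(T)(T arr)
{
    arr[0] = 0;
}

// Returns the first element after passing the whole fixed-size array by value.
int afterPassingArray()
{
    int[3] fixed = [1, 2, 3];
    zeroFirst(fixed);   // T is int[3]: the callee modifies a copy
    return fixed[0];
}

// Returns the first element after passing a slice of the same array.
int afterPassingSlice()
{
    int[3] fixed = [1, 2, 3];
    zeroFirst(fixed[]); // T is int[]: the slice refers to fixed's own memory
    return fixed[0];
}

void main()
{
    writeln(afterPassingArray()); // the original array is unchanged
    writeln(afterPassingSlice()); // the original array was modified
}
```

The only difference between the two calls is the trailing [], which is why the mistake is easy to overlook.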


>     writeln(a);
>     randomShuffle(b);
>     writeln(b);
> }
> ---
>
> In DMD 2.059 the fixed-size array a is not shuffled (the dynamic
> array b is), and the compiler gives no warning about it.
>
> I'm not sure whether this should be reported as a bug or an
> improvement, or whether randomShuffle() is the only function that
> behaves this way. Perhaps you could enlighten me.


Ali


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Vidar Wahlberg

On 2012-05-03 15:34, Ali Çehreli wrote:

> Fixed-length arrays are value types. 'a' is copied to randomShuffle() so
> its copy is shuffled. Passing a slice of the whole array works:
>
> randomShuffle(a[]);


True, but it is still not particularly newbie-friendly (or perhaps even
user-friendly). My question was more "does it have to be this way?"
than "how do you do this?", although I appreciate the answer on how
to do it.
Is it not possible for the compiler to let you know that what you're
doing doesn't make any sense?


A quick follow-up:
I've tried various random number engines, but none comes even
close to the performance of whatever is used for Java's
Collections.shuffle() method. Perhaps someone can shed some light on this?


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Ali Çehreli

On 05/03/2012 06:55 AM, Vidar Wahlberg wrote:
> On 2012-05-03 15:34, Ali Çehreli wrote:
>> Fixed-length arrays are value types. 'a' is copied to randomShuffle() so
>> its copy is shuffled. Passing a slice of the whole array works:
>>
>> randomShuffle(a[]);
>
> True, but it is still not particularly newbie-friendly (or perhaps even
> user-friendly). My question was more "does it have to be this way?"
> than "how do you do this?", although I appreciate the answer on how
> to do it.
> Is it not possible for the compiler to let you know that what you're
> doing doesn't make any sense?

Random shuffle can work on a fixed-length array and there is a way for 
the implementation to know:


import std.traits;
// ...
  __traits(isStaticArray, a)

That can be used in a template constraint.
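
As a sketch of what such a constraint could look like: the wrapper below is hypothetical (shuffleInPlace is not a Phobos function) and uses std.traits.isStaticArray, the template form of the same check. The overload for fixed-size arrays takes its argument by ref and shuffles a slice of it, and the other overload forwards everything else unchanged.

```d
import std.random : randomShuffle;
import std.stdio : writeln;
import std.traits : isStaticArray;

// Hypothetical wrapper, not a Phobos function.
// Fixed-size arrays are taken by reference and shuffled through a slice.
void shuffleInPlace(T)(ref T arr)
    if (isStaticArray!T)
{
    randomShuffle(arr[]);
}

// Everything else (for example a dynamic array) is forwarded unchanged.
void shuffleInPlace(T)(T arr)
    if (!isStaticArray!T)
{
    randomShuffle(arr);
}

void main()
{
    int[5] a = [0, 1, 2, 3, 4];
    int[] b = [0, 1, 2, 3, 4];
    shuffleInPlace(a); // picks the ref overload, so a itself is shuffled
    shuffleInPlace(b); // picks the by-value overload; the slice still refers to b's data
    writeln(a);
    writeln(b);
}
```

Because the two constraints are mutually exclusive, exactly one overload matches any given call.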

> A quick follow-up:
> I've tried various random number engines, but none comes even
> close to the performance of whatever is used for Java's
> Collections.shuffle() method. Perhaps someone can shed some light on
> this?


I have no idea about that one.

Ali



Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Dmitry Olshansky

On 03.05.2012 18:02, Ali Çehreli wrote:



>> A quick follow-up:
>> I've tried various random number engines, but none comes even
>> close to the performance of whatever is used for Java's
>> Collections.shuffle() method. Perhaps someone can shed some light on
>> this?
>
> I have no idea about that one.



It's all about the RNG used behind the scenes. The default one is the
Mersenne Twister, which (AFAIK) is not particularly fast, but has a
period of 2^19937 - 1.
You should probably use the Xorshift or MinstdRand generator and the
version of randomShuffle that takes the generator as a 2nd parameter.
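
A minimal sketch of that suggestion, assuming a Phobos release in which the two-argument randomShuffle overload compiles with a non-default generator (a later post in this thread reports a compile error with it on 2.059). shuffledCopy is a hypothetical helper name, and the fixed seed is used only to make the result repeatable.

```d
import std.random : randomShuffle, Xorshift;
import std.stdio : writeln;

// Hypothetical helper: shuffles a copy of `input` using a Xorshift
// generator seeded with `seed`, leaving `input` untouched.
int[] shuffledCopy(const int[] input, uint seed)
{
    auto rng = Xorshift(seed);   // must be an lvalue: randomShuffle takes it by ref
    int[] result = input.dup;
    randomShuffle(result, rng);  // generator passed as the 2nd parameter
    return result;
}

void main()
{
    int[] data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    writeln(shuffledCopy(data, 42));
}
```

For real use the seed would normally come from unpredictableSeed instead of a constant.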


--
Dmitry Olshansky


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Vidar Wahlberg

On 2012-05-03 16:26, Dmitry Olshansky wrote:

> It's all about the RNG used behind the scenes. The default one is the
> Mersenne Twister, which (AFAIK) is not particularly fast, but has a
> period of 2^19937 - 1.
> You should probably use the Xorshift or MinstdRand generator and the
> version of randomShuffle that takes the generator as a 2nd parameter.


I tried those two as well. Still significantly slower than what I can 
achieve in Java.


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Chris Cain

On Thursday, 3 May 2012 at 14:41:20 UTC, Vidar Wahlberg wrote:
> I tried those two as well. Still significantly slower than what
> I can achieve in Java.


You might want to post your code... I wrote this code in D:
-=-=-=-

import std.random, std.stdio, std.datetime;

void main() {
    int[] arr = new int[5_000_000];
    foreach(i, ref e; arr)
        e = cast(int) i; // i is a size_t; the cast is needed on 64-bit targets

    StopWatch sw = AutoStart.yes;
    arr.randomShuffle();
    sw.stop();

    writeln("Took ", sw.peek().to!("msecs", double)(), "ms");
}

-=-=-=-

And it performed _identically_ to this in Java:

-=-=-=-

import java.util.ArrayList;
import java.util.Collections;

public class Main {
    public static void main(String[] args) {
        ArrayList<Integer> ints = new ArrayList<Integer>(5000);
        for(int i = 0; i < 5_000_000; ++i)
            ints.add(i);

        long startTime = System.currentTimeMillis();
        Collections.shuffle(ints);
        long endTime = System.currentTimeMillis();

        System.out.println("Took " + (endTime - startTime) + "ms");
    }
}

-=-=-=-
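
For readers on a newer toolchain than the one used in this thread: the D timing harness above relies on the StopWatch that lived in std.datetime at the time. The sketch below shows the same measurement, assuming a Phobos release that provides std.datetime.stopwatch. timeShuffleMsecs is a hypothetical helper name, and no timing results are claimed for it.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : randomShuffle;
import std.stdio : writeln;

// Hypothetical helper: fills an array with 0 .. n, shuffles it, and
// returns the time spent in the shuffle in whole milliseconds.
long timeShuffleMsecs(size_t n)
{
    auto arr = new int[n];
    foreach (i, ref e; arr)
        e = cast(int) i;

    auto sw = StopWatch(AutoStart.yes); // starts timing immediately
    arr.randomShuffle();
    sw.stop();
    return sw.peek.total!"msecs";
}

void main()
{
    writeln("Took ", timeShuffleMsecs(5_000_000), "ms");
}
```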


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Vidar Wahlberg

On 2012-05-03 17:31, Chris Cain wrote:

> You might want to post your code...


Sure!
D:
---
import std.random;
import std.stdio;
void main() {
    auto iterations = 1000;
    int[] a;
    for (int i = 0; i < 42; ++i)
        a ~= i;
    for (int i = 0; i < iterations; ++i)
        randomShuffle(a);
}

naushika:~/projects> dmd random.d && time ./random
./random  38,35s user 0,05s system 99% cpu 38,420 total
---


Java (7):
---
import java.util.ArrayList;
import java.util.Collections;
public class Rnd {
    public static void main(String... args) {
        int iterations = 1000;
        ArrayList<Integer> a = new ArrayList<Integer>();
        for (int i = 0; i < 42; ++i)
            a.add(i);
        for (int i = 0; i < iterations; ++i)
            Collections.shuffle(a);
    }
}

naushika:~/projects> javac Rnd.java && time java Rnd
java Rnd  9,92s user 0,03s system 100% cpu 9,922 total
---


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Chris Cain


On a related note, how did you get the other random generators 
working? I tried to compile this and it gives me an error:

-=-=-=-
import std.random, std.stdio, std.datetime;

void main() {
    int[] arr = new int[5_000_000];
    foreach(i, ref e; arr)
        e = cast(int) i; // i is a size_t; the cast is needed on 64-bit targets
    auto rng = MinstdRand0(1);
    rng.seed(unpredictableSeed);
    StopWatch sw = AutoStart.yes;
    randomShuffle(arr, rng);
    sw.stop();

    writeln("Took ", sw.peek().to!("msecs", double)(), "ms");
}
-=-=-=-

The error:
-=-=-=-
C:\D\dmd2\windows\bin\..\..\src\phobos\std\random.d(1263): Error: 
cannot implicitly convert expression (rndGen()) of type 
MersenneTwisterEngine!(uint,32,624,397,31,-1727483681u,11,7,-1658038656u,15,-272236544u,18) 
to LinearCongruentialEngine!(uint,16807,0,2147483647)

-=-=-=-

Is this a bug in 2.059? Or am I doing something wrong?


Re: Fixed-size arrays and randomShuffle()

2012-05-03 Thread Chris Cain
OK, I took a look at your example and saw the kind of
performance you were seeing.

I investigated what could cause such a disparity and came to a
conclusion. The answer is two things. First of all, as noted
above, D's default generator is a Mersenne Twister RNG. You can
emulate Java's RNG like so:


auto rng = LinearCongruentialEngine!(ulong,
25214903917uL, 11uL, 2uL^^48uL)();
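
For reference, those template arguments are the multiplier, increment and modulus of the recurrence documented for java.util.Random (25214903917 is 0x5DEECE66D). The sketch below writes that recurrence out by hand to show what one step computes. It is an illustration with hypothetical names, not the Phobos engine. It also leaves out two things Java does: scrambling the initial seed and returning only the upper bits of each state.

```d
import std.stdio : writeln;

enum ulong multiplier = 25_214_903_917UL; // 0x5DEECE66D
enum ulong increment = 11UL;
enum ulong mask = (1UL << 48) - 1;        // keeps the low 48 bits, i.e. reduces modulo 2^^48

// One step of the recurrence: next = (state * multiplier + increment) mod 2^^48.
// The 64-bit multiplication may wrap around, which is harmless here
// because 2^^48 divides 2^^64.
ulong nextState(ulong state) pure nothrow @safe @nogc
{
    return (state * multiplier + increment) & mask;
}

void main()
{
    ulong state = 1;
    foreach (_; 0 .. 3)
    {
        state = nextState(state);
        writeln(state);
    }
}
```

Each step costs one multiplication, one addition and one mask, which is much less work per number than a Mersenne Twister step.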

Once I equalized that, I looked into the various functions that are
called and settled on uniform.


https://github.com/D-Programming-Language/phobos/blob/master/std/random.d#L1154

As you can see, there's a division at line 1154 and another at 
line 1158. This means there's a minimum of two division 
operations every time uniform is called. Now, normally this isn't 
a big deal, but if we really want maximum performance, we need to 
eliminate at least one.


If you replace lines 1154-1161 (auto bucketSize ... to return...) 
with:

CountType rnum, result;
do
{
    rnum = cast(CountType) uniform!CountType(urng);
    result = rnum % count;
}
while (rnum > count &&
       (rnum - result + (count - 1)) < (rnum - result - 1));
return cast(typeof(return)) (min + result);

Then the time taken shrinks down to roughly the same (within a 
tenth of a second) as Java.
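
The idea behind the replacement loop can be tried outside Phobos. The sketch below is a standalone restatement for uint, not the Phobos source, and boundedRandom is a hypothetical name. Each draw costs a single modulo. The range of uint is treated as consecutive buckets of count values, and a draw is rejected only when it lands in the incomplete last bucket, which is detected by the end of the bucket wrapping around.

```d
import std.random : Random, uniform;
import std.stdio : writeln;

// Hypothetical helper: returns an unbiased value in [0, count) using
// one modulo per draw. `count` must be greater than zero.
uint boundedRandom(Rng)(ref Rng rng, uint count)
{
    assert(count > 0);
    uint rnum, result;
    do
    {
        rnum = uniform!uint(rng); // uniform over the whole range of uint
        result = rnum % count;    // position of rnum inside its bucket
    }
    // rnum - result is the start of the bucket. If the bucket's last
    // element (start + count - 1) wraps around, the bucket is incomplete
    // and the draw is rejected to avoid bias.
    while (rnum > count &&
           (rnum - result + (count - 1)) < (rnum - result - 1));
    return result;
}

void main()
{
    auto rng = Random(42);
    foreach (_; 0 .. 5)
        writeln(boundedRandom(rng, 6));
}
```

For small values of count, rejections are rare, so the loop body almost always runs exactly once.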


I'll probably clean this up (and write some comments on how this 
works) and see about submitting it as a patch unless anyone sees 
anything wrong with this approach.