[Issue 3738] MinstdRand0 and MinstdRand very poor performance

2015-06-09 Thread via Digitalmars-d-bugs
https://issues.dlang.org/show_bug.cgi?id=3738

Andrei Alexandrescu and...@erdani.com changed:

   What|Removed |Added

Version|2.039   |D2

--


[Issue 3738] MinstdRand0 and MinstdRand very poor performance

2010-01-24 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3738


Witold Baryluk bary...@smp.if.uj.edu.pl changed:

   What|Removed |Added

 Status|ASSIGNED|NEW


--- Comment #3 from Witold Baryluk bary...@smp.if.uj.edu.pl 2010-01-24 
08:35:08 PST ---
One can observe that ?: line (branch2)
_x = (y  v || y = 2147483647) ? ((y+1)  0x7fff_u) : y;
can be changed , first to not contain not needed yv, to  (branch1)
_x = (y = 2147483647) ? ((y+1)  0x7fff_u) : y;
or even remove branch completly (branchlessA)
_x = (y  0x7fff_) + (y  31);
and possibly rearange terms (branchlessB)
_x = (y + (y  31))  0x7fff_;


MinstdRnd0
  standard: 44.65 s
  inline: 41.11 s
  standard with tricks (2 branches): 17.08 s
  inline with tricks (2 branches): 17.04 s
  standard with tricks (1 branch): 15.24 s
  inline with tricks (1 branch): 13.59 s
  standard with tricks (branchlesA): 16.50 s
  inline with tricks (branchelesA): 15.82 s
  standard with tricks (branchlesB):  17.74 s
  inline with tricks (branchelesB): 15.30 s

MinstdRnd
  standard: 44.95 s
  inline: 41.17 s
  standard with tricks (2 branch): 17.04 s
  inline with tricks (2 branche): 17.03 s
  standard with tricks (1 branch): 15.29 s
  inline with tricks (1 branche): 13.54 s
  standard with tricks (branchlesA): 16.46 s
  inline with tricks (branchelessA): 15.88 s
  standard with tricks (branchlesB): 18.10 s
  inline with tricks (branchelessB): 15.31 s

So fastest is currently branch1 version. 

Updated patch:

 void popFront()
 {
 static if (m)
-_x = cast(UIntType) ((cast(ulong) a * _x + c) % m);
+static if (is(UIntType == uint)  m == 4294967295uL) {
+ const ulong x = (cast(ulong) a * _x + c);
+ const ulong v = x  32;
+ const ulong w = (x  0x_uL);
+ const uint y = cast(uint)(v+w);
+ _x = (y  v || y = 4294967295u) ? (y+1) : y;
+} else static if (is(UIntType == uint)  m == 2147483647u) {
+ const ulong x = (cast(ulong) a * _x + c);
+ const ulong v = x  31;
+ const ulong w = (x  0x7fff_uL);
+ const uint y = cast(uint)(v+w);
+ _x = (y = 2147483647u) ? ((y+1)  0x7fff_u) : y;
+} else {
+_x = cast(UIntType) ((cast(ulong) a * _x + c) % m);
+}
 else
 _x = a * _x + c;
 }

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---


[Issue 3738] MinstdRand0 and MinstdRand very poor performance

2010-01-24 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3738



--- Comment #5 from Witold Baryluk bary...@smp.if.uj.edu.pl 2010-01-24 
12:03:20 PST ---
All my tests pases on your version.
10_000_000_000 (more than all possibilities) passes (for MinstdRand0 and
MinstdRand, it is possible that for other a it will fail, but i tested and done
some calculation.  and it should work for all a,cm.
I also tested versions with slightly changed constants (like 30 instad of 31,
or int.max-1, int.max-2, int.max-3, int.max+1 in multiple comibinations an
allone), and all them fail.


Shouldn't const uint y by immucatable y, just like in first static if?


I think that our pretty optimised line, can be further optimalised: (1 branch
sub)

- _x = (y = int.max) ? ((y + 1)  int.max) : y;
+ _x = (y = int.max) ? (y - int.max) : y;

:)

This gives:
MinstdRand0:
  standard with tricks (1 branch sub): 15.25
  inline with tricks (1 branch sub): 13.53 s
MinstdRand:
  standard with tricks (1 branch sub): 15.26
  inline with tricks (1 branch sub): 13.52 s

It doesn't give much difference (maybe 2%). But i thinkg it is better to have
one substraction, than one add and one binary and.



I found few interesting materials about this m=2^^31-1 operation.
One is http://www.firstpr.com.au/dsp/rand31/

I tested this Carta's versions: But they are only relevant for a=16807 (out
Park-Miller's StdminRand0 generator), so pretty limited:

Carta1 (original):
  uint lo = 16807*(_x  0xu);
  immutable uint hi = 16807*(_x  16);
  lo += (hi  0x7FFF)  16;
  lo += (hi  15);
  _x = (lo  int.max) ? (lo - int.max) : lo;

Carta3:
  uint lo = 16807*(_x  0xu);
  immutable uint hi = 16807*(_x  16);
  lo += ((hi  0x7FFF)  16) + (hi  15);
  _x = (lo  int.max) ? (lo - int.max) : lo;

Carta2: 
immutable uint
   hi = 16807*(_x  16),
   lo 16807*(_x  0xu) + ((hi  0x7FFF)  16) + (hi  15);
   _x = (lo  int.max) ? (lo - int.max) : lo;

Last lines can also be changed to branchles:
   _x = (lo  int.max) ? (lo - int.max) : lo;

Timings:
Carta code1: 22.30 s
Carta code1 branchless: 18.42
Carta code3: 21.84 s
Carta code3 branchless: 18.19 s
Carta code2: 23.84 s
Carta code2 branchless: 19.84 s

So it is slower in all cases, and it is only for speciall cases (c=0, a 
2^^15). Of course answers are still correct.


Performing uint*uint+uint - ulong is just faster than their tricks, and more
general, and can be reused in many other places.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---


[Issue 3738] MinstdRand0 and MinstdRand very poor performance

2010-01-23 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3738



--- Comment #1 from Witold Baryluk bary...@smp.if.uj.edu.pl 2010-01-23 
18:29:30 PST ---
Ok, sorry. Not all are better than Mt19937. But definitly better than
MinstdRand{,0}. And few are acutally better (empirically).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---


[Issue 3738] MinstdRand0 and MinstdRand very poor performance

2010-01-23 Thread d-bugmail
http://d.puremagic.com/issues/show_bug.cgi?id=3738


Andrei Alexandrescu and...@metalanguage.com changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 CC||and...@metalanguage.com


--- Comment #2 from Andrei Alexandrescu and...@metalanguage.com 2010-01-23 
23:09:46 PST ---
Thanks for the fix. I'll incorporate the patch into Phobos. Walter will decide
whether he wants to pursue a compiler change.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
--- You are receiving this mail because: ---