[ https://issues.apache.org/jira/browse/MATH-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065450#comment-15065450 ]
Rostislav Krasny edited comment on MATH-1300 at 12/19/15 5:50 PM:
------------------------------------------------------------------
My first statement, that the same byte sequence is generated no matter how it
is divided into chunks, was too optimistic. Indeed, even java.util.Random
doesn't guarantee this. But neither java.util.Random nor
org.spaceroots.mantissa.random.MersenneTwister makes unneeded calls to the
nextInt() method (which just calls next(32)).
Obviously, a same-byte-sequence guarantee can't be provided without a
significant performance degradation.
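To make both points concrete, here is a minimal sketch using only java.util.Random. The class name ChunkDemo and its helpers are made up for illustration; the behaviour relies on the algorithm documented in the Random#nextBytes Javadoc (one nextInt() per group of up to four bytes).

```java
import java.util.Arrays;
import java.util.Random;

public class ChunkDemo {
    /** java.util.Random that counts how often next(int) is invoked (illustrative helper). */
    static class CountingRandom extends Random {
        int calls;
        CountingRandom(long seed) { super(seed); }
        @Override protected int next(int bits) { calls++; return super.next(bits); }
    }

    /** Fills one chunk per requested size and returns all bytes concatenated. */
    static byte[] draw(Random rng, int... chunkSizes) {
        int total = 0;
        for (int c : chunkSizes) total += c;
        byte[] out = new byte[total];
        int pos = 0;
        for (int c : chunkSizes) {
            byte[] chunk = new byte[c];
            rng.nextBytes(chunk);
            System.arraycopy(chunk, 0, out, pos, c);
            pos += c;
        }
        return out;
    }

    /** Number of next(int) calls needed to fill chunks of the given sizes. */
    static int callsFor(int... chunkSizes) {
        CountingRandom rng = new CountingRandom(1234567L);
        draw(rng, chunkSizes);
        return rng.calls;
    }

    public static void main(String[] args) {
        long seed = 1234567L;
        // Chunk sizes that are multiples of 4 give the same stream.
        System.out.println(Arrays.equals(draw(new Random(seed), 8),
                                         draw(new Random(seed), 4, 4))); // true
        // With 3-byte chunks the top byte of each int is thrown away, so the
        // second chunk starts where the single 6-byte call wrote its 5th byte.
        byte[] whole = draw(new Random(seed), 6);
        byte[] split = draw(new Random(seed), 3, 3);
        System.out.println(split[3] == whole[4]); // true
        // No extra draw when the length is a multiple of 4.
        System.out.println(callsFor(8));   // 2
        System.out.println(callsFor(123)); // 31
    }
}
```

So the stream depends on the chunk size only when a chunk length is not a multiple of 4, and no call to next(32) is ever wasted on an empty tail.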
Also, MersenneTwister#next() always generates a full 32-bit int but returns
only the requested number of bits:
{code:java} return y >>> (32 - bits);{code}
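As a small illustration of that return statement, the sketch below applies the same unsigned shift to a fixed word. The class and method names (TopBits, topBits) are hypothetical and not part of any of the libraries discussed.

```java
public class TopBits {
    /**
     * Keeps the `bits` most significant bits of a 32-bit word.
     * Intended for 1 <= bits <= 32; Java masks shift counts to 5 bits,
     * so bits == 0 would shift by 0 and return y unchanged.
     */
    static int topBits(int y, int bits) {
        return y >>> (32 - bits);
    }

    public static void main(String[] args) {
        int y = 0xCAFEBABE;
        System.out.println(Integer.toHexString(topBits(y, 8)));  // ca
        System.out.println(Integer.toHexString(topBits(y, 16))); // cafe
        System.out.println(Integer.toHexString(topBits(y, 32))); // cafebabe
    }
}
```

The full word is always produced; asking for fewer bits only discards the low-order ones.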
BTW, the spaceroots implementation of MersenneTwister was my reference.
You can download it (including source code) from
www.spaceroots.org/downloads.html as part of their mantissa library.
I still propose changing the nextBytes() code to my version for performance
reasons. I tested the performance of the four implementations, and mine was
the fastest in both runs. I ran the following test code:
{code:java}
@Test
public void test4() {
    long start;
    long end;
    int iterations = 20000000;
    int chunkSize = 123;

    // The four implementations under test. MersenneTwister2 is the attached
    // version; MersenneTwister3 is the one labelled "Gilles" in the output.
    org.spaceroots.mantissa.random.MersenneTwister referenceMt =
        new org.spaceroots.mantissa.random.MersenneTwister();
    org.apache.commons.math3.random.MersenneTwister cmMt =
        new org.apache.commons.math3.random.MersenneTwister();
    MersenneTwister2 mt2 = new MersenneTwister2();
    MersenneTwister3 mt3 = new MersenneTwister3();
    byte[] buf = new byte[chunkSize];

    // Same seed for every generator.
    referenceMt.setSeed(1234567L);
    cmMt.setSeed(1234567L);
    mt2.setSeed(1234567L);
    mt3.setSeed(1234567L);

    // Time each implementation filling the same buffer repeatedly.
    start = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        referenceMt.nextBytes(buf);
    }
    end = System.currentTimeMillis();
    System.err.printf("Spaceroots MersenneTwister %d iterations:\t%8d ms.\n",
                      iterations, (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        cmMt.nextBytes(buf);
    }
    end = System.currentTimeMillis();
    System.err.printf("CM 3.5 MersenneTwister %d iterations:\t%8d ms.\n",
                      iterations, (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        mt2.nextBytes(buf);
    }
    end = System.currentTimeMillis();
    System.err.printf("Rostislav MersenneTwister %d iterations:\t%8d ms.\n",
                      iterations, (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        mt3.nextBytes(buf);
    }
    end = System.currentTimeMillis();
    System.err.printf("Gilles MersenneTwister %d iterations:\t%8d ms.\n",
                      iterations, (end - start));
}
{code}
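The four timing loops above share one shape, so the measurement could be factored into a helper. The following is only a sketch under assumptions: the names NextBytesTiming and timeMillis are made up, java.util.Random stands in for the generators compared here, and the warm-up pass is an addition that the test above does not have (it is meant to reduce the effect of JIT compilation on whichever generator is timed first).

```java
import java.util.Random;
import java.util.function.Consumer;

public class NextBytesTiming {
    /**
     * Times `iterations` calls of a nextBytes-like method on a buffer of
     * `chunkSize` bytes, after `warmup` untimed calls. Returns milliseconds.
     */
    static long timeMillis(Consumer<byte[]> nextBytes, int chunkSize,
                           int warmup, int iterations) {
        byte[] buf = new byte[chunkSize];
        for (int i = 0; i < warmup; i++) {
            nextBytes.accept(buf); // untimed warm-up
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            nextBytes.accept(buf);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        int iterations = 2_000_000;
        Random rng = new Random(1234567L); // stand-in generator
        long ms = timeMillis(rng::nextBytes, 123, 100_000, iterations);
        System.err.printf("java.util.Random %d iterations:\t%8d ms.%n", iterations, ms);
    }
}
```

Any of the generators in the test could be passed the same way, for example mt2::nextBytes.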
On my (pretty old) computer (Pentium 4 Prescott2M 3.2GHz, 2GB RAM, JDK 7u80
32-bit) I got the following results (two runs):
{code}
Spaceroots MersenneTwister 20000000 iterations: 22937 ms.
CM 3.5 MersenneTwister 20000000 iterations: 17532 ms.
Rostislav MersenneTwister 20000000 iterations: 15812 ms.
Gilles MersenneTwister 20000000 iterations: 24235 ms.
{code}
{code}
Spaceroots MersenneTwister 20000000 iterations: 27937 ms.
CM 3.5 MersenneTwister 20000000 iterations: 16735 ms.
Rostislav MersenneTwister 20000000 iterations: 15547 ms.
Gilles MersenneTwister 20000000 iterations: 23953 ms.
{code}
> BitsStreamGenerator#nextBytes(byte[]) is wrong
> ----------------------------------------------
>
> Key: MATH-1300
> URL: https://issues.apache.org/jira/browse/MATH-1300
> Project: Commons Math
> Issue Type: Bug
> Affects Versions: 3.5
> Reporter: Rostislav Krasny
> Attachments: MersenneTwister2.java, TestMersenneTwister.java
>
>
> Sequential calls to BitsStreamGenerator#nextBytes(byte[]) must generate the
> same sequence of bytes, no matter what chunk sizes the sequence is divided
> into. This is also how java.util.Random#nextBytes(byte[]) works.
> When nextBytes(byte[]) is called with a byte array whose length is a multiple
> of 4, it makes one unneeded call to the next(int) method. This is wrong and
> produces inconsistent behavior in classes like MersenneTwister.
> I made a new implementation of BitsStreamGenerator#nextBytes(byte[]); see the
> attached code.
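For illustration only, a byte-filling loop that never draws more 32-bit words than it needs might look like the sketch below. It is not the attached MersenneTwister2 code; the class FillBytes, the method fill and the IntSupplier parameter are assumptions made for this example, with java.util.Random#nextInt standing in for next(32).

```java
import java.util.Random;
import java.util.function.IntSupplier;

public class FillBytes {
    /**
     * Fills `bytes` from 32-bit words supplied by `next32`, low byte first,
     * drawing a new word only when at least one more byte is needed.
     * Returns the number of words drawn, i.e. ceil(bytes.length / 4).
     */
    static int fill(byte[] bytes, IntSupplier next32) {
        int draws = 0;
        int i = 0;
        while (i < bytes.length) {
            int word = next32.getAsInt();
            draws++;
            for (int n = Math.min(bytes.length - i, 4); n > 0; n--) {
                bytes[i++] = (byte) word;
                word >>>= 8;
            }
        }
        return draws;
    }

    public static void main(String[] args) {
        Random rng = new Random(1234567L); // stand-in for next(32)
        System.out.println(fill(new byte[8], rng::nextInt));   // 2
        System.out.println(fill(new byte[123], rng::nextInt)); // 31
        System.out.println(fill(new byte[0], rng::nextInt));   // 0
    }
}
```

A variant that unconditionally draws one more word after the whole-word loop would consume 3 words for 8 bytes, which is the extra call described above.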