[ 
https://issues.apache.org/jira/browse/MATH-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065450#comment-15065450
 ] 

Rostislav Krasny edited comment on MATH-1300 at 12/19/15 5:50 PM:
------------------------------------------------------------------

My first statement, that the same byte sequence is generated regardless of the 
chunk size, was too optimistic. Indeed, even java.util.Random doesn't guarantee 
this. But neither java.util.Random nor 
org.spaceroots.mantissa.random.MersenneTwister makes unneeded calls to the 
nextInt() method (which just calls next(32)). Apparently the same-byte-sequence 
guarantee can't be provided without a significant performance degradation.
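
To illustrate the chunk-size point with java.util.Random only, here is a minimal sketch. The class name ChunkSizeDemo and its helper methods are made up for this example and are not part of any library or of the attached code. It fills 8 bytes once with a single call and once with a 2-byte call followed by a 6-byte call, both from generators created with the same seed.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class (an assumption for illustration only).
class ChunkSizeDemo {
    /** Fills 8 bytes with a single nextBytes() call. */
    static byte[] oneCall(long seed) {
        byte[] out = new byte[8];
        new Random(seed).nextBytes(out);
        return out;
    }

    /** Fills 8 bytes with two calls (2 bytes, then 6 bytes) on one generator. */
    static byte[] twoCalls(long seed) {
        Random rng = new Random(seed);
        byte[] head = new byte[2];
        byte[] tail = new byte[6];
        rng.nextBytes(head);   // consumes a whole int, keeps only 2 of its bytes
        rng.nextBytes(tail);   // starts from a fresh int
        byte[] out = new byte[8];
        System.arraycopy(head, 0, out, 0, 2);
        System.arraycopy(tail, 0, out, 2, 6);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(oneCall(1234567L)));
        System.out.println(Arrays.toString(twoCalls(1234567L)));
    }
}
```

The first two bytes agree, but the 2-byte call throws away the remaining two bytes of its int, so bytes 2..5 of the chunked result correspond to bytes 4..7 of the single-call result and the two arrays differ.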

Also, MersenneTwister#next() always generates a full int but returns only the 
requested number of bits:
{code:java}        return y >>> (32 - bits);{code}
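
As a small worked example of that shift, the sketch below applies the same expression to a fixed value. The class TopBitsDemo and the method topBits are hypothetical names used only here. It assumes 1 <= bits <= 32, since a shift by 32 would wrap around to a shift by 0 in Java.

```java
// Hypothetical demo class (an assumption for illustration only).
class TopBitsDemo {
    /** Keeps the `bits` most significant bits of y; valid for 1 <= bits <= 32. */
    static int topBits(int y, int bits) {
        return y >>> (32 - bits);   // unsigned shift, so no sign extension
    }

    public static void main(String[] args) {
        int y = 0xABCD1234;
        System.out.println(Integer.toHexString(topBits(y, 8)));   // prints "ab"
        System.out.println(Integer.toHexString(topBits(y, 16)));  // prints "abcd"
        System.out.println(Integer.toHexString(topBits(y, 32)));  // prints "abcd1234"
    }
}
```

The full 32-bit word is produced in every case. Asking for fewer bits only discards the low-order part of it.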

BTW, the spaceroots implementation of MersenneTwister was my reference. 
You can download it (including source code) from 
www.spaceroots.org/downloads.html as part of their mantissa library.

I still propose changing the nextBytes() code to my version because of the 
performance. I tested the four implementations and mine is the fastest, taking 
roughly 7 to 10% less time than CM 3.5 in the runs below. I measured it by 
running the following test code:
{code:java}
        @Test
        public void test4() {
                long start;
                long end;
                int iterations = 20000000;
                int chunkSize = 123;

                // The four implementations under comparison.
                org.spaceroots.mantissa.random.MersenneTwister referenceMt =
                        new org.spaceroots.mantissa.random.MersenneTwister();
                org.apache.commons.math3.random.MersenneTwister cmMt =
                        new org.apache.commons.math3.random.MersenneTwister();
                MersenneTwister2 mt2 = new MersenneTwister2();
                MersenneTwister3 mt3 = new MersenneTwister3();
                byte[] buf = new byte[chunkSize];

                // Same seed for every generator.
                referenceMt.setSeed(1234567L);
                cmMt.setSeed(1234567L);
                mt2.setSeed(1234567L);
                mt3.setSeed(1234567L);

                start = System.currentTimeMillis();
                for (int i = 0; i < iterations; i++) {
                        referenceMt.nextBytes(buf);
                }
                end = System.currentTimeMillis();
                System.err.printf("Spaceroots MersenneTwister %d iterations:\t%8d ms.\n",
                                  iterations, (end - start));

                start = System.currentTimeMillis();
                for (int i = 0; i < iterations; i++) {
                        cmMt.nextBytes(buf);
                }
                end = System.currentTimeMillis();
                System.err.printf("CM 3.5 MersenneTwister %d iterations:\t%8d ms.\n",
                                  iterations, (end - start));

                start = System.currentTimeMillis();
                for (int i = 0; i < iterations; i++) {
                        mt2.nextBytes(buf);
                }
                end = System.currentTimeMillis();
                System.err.printf("Rostislav MersenneTwister %d iterations:\t%8d ms.\n",
                                  iterations, (end - start));

                start = System.currentTimeMillis();
                for (int i = 0; i < iterations; i++) {
                        mt3.nextBytes(buf);
                }
                end = System.currentTimeMillis();
                System.err.printf("Gilles MersenneTwister %d iterations:\t%8d ms.\n",
                                  iterations, (end - start));
        }
{code}
On my (pretty old) computer (Pentium 4 Prescott2M 3.2GHz, 2GB RAM, JDK 7u80 
32-bit) I got the following results (two runs):
{code}
Spaceroots MersenneTwister 20000000 iterations:    22937 ms.
CM 3.5 MersenneTwister 20000000 iterations:        17532 ms.
Rostislav MersenneTwister 20000000 iterations:     15812 ms.
Gilles MersenneTwister 20000000 iterations:        24235 ms.
{code}
{code}
Spaceroots MersenneTwister 20000000 iterations:    27937 ms.
CM 3.5 MersenneTwister 20000000 iterations:        16735 ms.
Rostislav MersenneTwister 20000000 iterations:     15547 ms.
Gilles MersenneTwister 20000000 iterations:        23953 ms.
{code}



> BitsStreamGenerator#nextBytes(byte[]) is wrong
> ----------------------------------------------
>
>                 Key: MATH-1300
>                 URL: https://issues.apache.org/jira/browse/MATH-1300
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Rostislav Krasny
>         Attachments: MersenneTwister2.java, TestMersenneTwister.java
>
>
> Sequential calls to BitsStreamGenerator#nextBytes(byte[]) must generate 
> the same sequence of bytes, no matter how the output is divided into chunks. 
> This is also how java.util.Random#nextBytes(byte[]) works.
> When nextBytes(byte[]) is called with a byte array whose length is a multiple 
> of 4, it makes one unneeded call to the next(int) method. This is wrong and 
> produces inconsistent behavior in classes like MersenneTwister.
> I made a new implementation of BitsStreamGenerator#nextBytes(byte[]); see 
> the attached code.
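
For readers without the attachment, the "no unneeded call" idea from the description above can be sketched as follows. This is not the attached MersenneTwister2.java and not the Commons Math code. NextBytesSketch, next32() and the calls counter are hypothetical names, and java.util.Random is used only as a stand-in source of 32-bit words. The sketch draws a new word only when at least one byte is still missing, so an array of length n costs exactly ceil(n / 4) words.

```java
import java.util.Random;

// Hypothetical sketch (an assumption for illustration only).
class NextBytesSketch {
    private final Random source;   // stand-in bit source, not a Mersenne Twister
    int calls;                     // number of 32-bit words drawn so far

    NextBytesSketch(long seed) {
        source = new Random(seed);
    }

    /** Draws one 32-bit word and counts the draw. */
    private int next32() {
        calls++;
        return source.nextInt();
    }

    /** Fills the array, drawing a word only while bytes remain to be written. */
    void nextBytes(byte[] bytes) {
        int i = 0;
        while (i < bytes.length) {
            int word = next32();
            // Copy up to 4 bytes from the word, low-order byte first.
            for (int n = Math.min(bytes.length - i, 4); n > 0; n--) {
                bytes[i++] = (byte) word;
                word >>>= 8;
            }
        }
    }

    public static void main(String[] args) {
        NextBytesSketch g = new NextBytesSketch(1234567L);
        g.nextBytes(new byte[8]);
        System.out.println(g.calls);   // prints 2: length 8 is a multiple of 4, no extra draw
        g.nextBytes(new byte[123]);
        System.out.println(g.calls);   // prints 33: 31 more words for 123 bytes
    }
}
```

Like java.util.Random, this sketch still discards the unused bytes of the last word, so it avoids the extra draw but does not make the output independent of the chunk size.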



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
