Re: Optimising a Groovy script

John Wagenleitner Tue, 28 Mar 2017 14:41:16 -0700

Hi Paul,

The milliseconds to seconds conversion was off, so that puts the real time
at ~0.5 seconds.



> println "Took ${time/100} sec"


Using the following I get somewhere close to 0.15.  Using an int array may
be worth it for higher values of N to avoid the boxing/unboxing of the ints.

@Grapes(
    @Grab(group='org.apache.commons', module='commons-math3', version=
'3.6.1')
)
import org.apache.commons.math3.random.MersenneTwister

def benchmark = { closure ->
  start = System.currentTimeMillis()
  closure.call()
  now = System.currentTimeMillis()
  now - start
}

@groovy.transform.CompileStatic
class Twister {
    static MersenneTwister rng = new MersenneTwister()

    static int roll() {
        rng.nextInt(6) + rng.nextInt(6) + rng.nextInt(6) + 3
    }

    static Map<Integer,Integer> rolls(int num) {
        Map<Integer,Integer> results = [:]
        num.times {
            int n = roll()
            results[n] = results.containsKey(n) ? results[n] + 1 : 1
        }
        return results
    }
}

int N = 1000000
def results

def time = benchmark {
    results = Twister.rolls(N)
}

println "Took ${time/1000} sec"
for (e in results.sort()) {
    println "$e.key: $e.value"
}



On Tue, Mar 28, 2017 at 1:25 PM, Paul Moore <p.f.mo...@gmail.com> wrote:

> I'm very much a newbie with Groovy, so I apologise in advance if this
> is not the right place for questions like this. I couldn't find
> anywhere else that looked like a better option - if there is somewhere
> I should have asked, feel free to redirect me.
>
> I want to write a simulation script using Groovy - this is something
> of a hobby challenge for me, I have a friend who has done a similar
> task in C++, and I'm looking for a more user-friendly language to
> write the code, while not losing too much performance over the C
> version.
>
> The code is basically to simulate "a game". The particular game is
> defined by the user, as a function that generates scores. The program
> runs the game many times, and summarises the distribution of the
> results. Basically, a Monte Carlo simulation. My current code in
> Groovy for this, using "roll 3 dice and add up the results" as the
> target game, looks as follows:
>
> @Grapes(
>     @Grab(group='org.apache.commons', module='commons-math3',
> version='3.6.1')
> )
> import org.apache.commons.math3.random.MersenneTwister
>
> def benchmark = { closure ->
>   start = System.currentTimeMillis()
>   closure.call()
>   now = System.currentTimeMillis()
>   now - start
> }
>
> rng = new MersenneTwister()
>
> int roll() {
>     rng.nextInt(6) + rng.nextInt(6) + rng.nextInt(6) + 3
> }
>
> int N = 1000000
> def results = [:]
>
> def time = benchmark {
>     N.times {
>         int n = roll()
>         results[n] = results.containsKey(n) ? results[n] + 1 : 1
>     }
> }
>
> println "Took ${time/100} sec"
> for (e in results.sort()) {
>     println "$e.key: $e.value"
> }
>
> This does exactly what I want, but takes about 5 seconds to run the
> simulation on my PC. My friend's C++ code runs a similar simulation in
> about 0.1 second. That's a massive penalty for Groovy, and likely
> means that for more realistic simulations (which would be a lot more
> complex than 3d6!) I wouldn't even be close to competitive.
>
> The code above is completely unoptimised. I know that Groovy's dynamic
> programming features can introduce some overhead, but I also get the
> impression from the documentation that by careful use of exact types,
> and similar techniques, this can be speeded up a lot (the docs claim
> potentially better than C performance in some cases).
>
> What should I be looking at to optimise the above code? The areas I
> can think of are:
>
> 1. The RNG. I assume that the apache commons code is pretty efficient,
> though. I do want a reasonably decent RNG, and I'd heard that the JVM
> RNG is not sufficient for simulation. For now I'm assuming that this
> is sufficiently good.
> 2. The roll() function. This is the core of the inner loop, and likely
> the big bottleneck. I've declared the type, which I guess is the first
> step, but I don't know what else I can do here. I tried a
> CompileStatic annotation, but that gave me errors about referencing
> the rng variable. I'm not sure what that implies - is my code doing
> something wrong in how it references the global rng variable?
> 3. Collecting the results in a map is likely not ideal. Is there a
> better data structure I should be using? I basically want to be able
> to count how many times each result appears - results will be integers
> (in certain cases I might want non-integers but I can handle them
> exceptionally) but I don't necessarily know the range in advance (so
> I'd avoid a static array unless it gives significant performance
> benefits - when I did a quick test, I got about a second faster
> runtime, noticeable, but not enough to get me anywhere near my sub-1
> second target)
>
> I tried using the GProf profiler to see if that shed any light on what
> I should do. When I ran with a realistic number of iterations it took
> forever and then failed with an out of memory error. Dropping it to
> 10000 iterations, I got
>
>  %    cumulative   self            self     total    self    total
> self    total
> time   seconds    seconds  calls  ms/call  ms/call  min ms  min ms
> max ms  max ms  name
> 34.5        0.04     0.04  10000     0.00     0.00    0.00    0.00
> 0.91    1.13  blackpool$_run_closure2$_closure3.roll
> 18.5        0.06     0.02  39984     0.00     0.00    0.00    0.00
> 0.53    0.53  java.lang.Integer.plus
> 15.9        0.08     0.02  10000     0.00     0.01    0.00    0.00
> 0.28    1.26  blackpool$_run_closure2$_closure3.doCall
> 11.0        0.10     0.01  30000     0.00     0.00    0.00    0.00
> 0.36    0.36  org.apache.commons.math3.random.MersenneTwister.nextInt
>  5.1        0.10     0.00  10000     0.00     0.00    0.00    0.00
> 0.19    0.19  java.util.LinkedHashMap.containsKey
>  4.4        0.11     0.00  10000     0.00     0.00    0.00    0.00
> 0.02    0.02  java.util.LinkedHashMap.putAt
>  4.0        0.12     0.00      1     5.25   125.62    5.25  125.62
> 5.25  125.62  java.lang.Integer.times
>  3.9        0.12     0.00   9984     0.00     0.00    0.00    0.00
> 0.02    0.02  java.util.LinkedHashMap.getAt
>  2.2        0.12     0.00      1     2.85   128.48    2.85  128.48
> 2.85  128.48  blackpool$_run_closure2.doCall
>
> But I don't really know how to interpret that. (Also, am I somehow
> using GProf wrongly? It seems like it shouldn't run out of memory
> profiling a 5-second program run...)
>
> Can anyone offer any advice on what I should be looking at here?
>
> Thanks,
> Paul
>

Re: Optimising a Groovy script

Reply via email to