I'm very much a newbie with Groovy, so I apologise in advance if this is not the right place for questions like this. I couldn't find anywhere else that looked like a better option - if there is somewhere I should have asked, feel free to redirect me.
I want to write a simulation script using Groovy - this is something of a hobby challenge for me, I have a friend who has done a similar task in C++, and I'm looking for a more user-friendly language to write the code, while not losing too much performance over the C version. The code is basically to simulate "a game". The particular game is defined by the user, as a function that generates scores. The program runs the game many times, and summarises the distribution of the results. Basically, a Monte Carlo simulation. My current code in Groovy for this, using "roll 3 dice and add up the results" as the target game, looks as follows: @Grapes( @Grab(group='org.apache.commons', module='commons-math3', version='3.6.1') ) import org.apache.commons.math3.random.MersenneTwister def benchmark = { closure -> start = System.currentTimeMillis() closure.call() now = System.currentTimeMillis() now - start } rng = new MersenneTwister() int roll() { rng.nextInt(6) + rng.nextInt(6) + rng.nextInt(6) + 3 } int N = 1000000 def results = [:] def time = benchmark { N.times { int n = roll() results[n] = results.containsKey(n) ? results[n] + 1 : 1 } } println "Took ${time/100} sec" for (e in results.sort()) { println "$e.key: $e.value" } This does exactly what I want, but takes about 5 seconds to run the simulation on my PC. My friend's C++ code runs a similar simulation in about 0.1 second. That's a massive penalty for Groovy, and likely means that for more realistic simulations (which would be a lot more complex than 3d6!) I wouldn't even be close to competitive. The code above is completely unoptimised. I know that Groovy's dynamic programming features can introduce some overhead, but I also get the impression from the documentation that by careful use of exact types, and similar techniques, this can be speeded up a lot (the docs claim potentially better than C performance in some cases). What should I be looking at to optimise the above code? The areas I can think of are: 1. The RNG. I assume that the apache commons code is pretty efficient, though. I do want a reasonably decent RNG, and I'd heard that the JVM RNG is not sufficient for simulation. For now I'm assuming that this is sufficiently good. 2. The roll() function. This is the core of the inner loop, and likely the big bottleneck. I've declared the type, which I guess is the first step, but I don't know what else I can do here. I tried a CompileStatic annotation, but that gave me errors about referencing the rng variable. I'm not sure what that implies - is my code doing something wrong in how it references the global rng variable? 3. Collecting the results in a map is likely not ideal. Is there a better data structure I should be using? I basically want to be able to count how many times each result appears - results will be integers (in certain cases I might want non-integers but I can handle them exceptionally) but I don't necessarily know the range in advance (so I'd avoid a static array unless it gives significant performance benefits - when I did a quick test, I got about a second faster runtime, noticeable, but not enough to get me anywhere near my sub-1 second target) I tried using the GProf profiler to see if that shed any light on what I should do. When I ran with a realistic number of iterations it took forever and then failed with an out of memory error. Dropping it to 10000 iterations, I got % cumulative self self total self total self total time seconds seconds calls ms/call ms/call min ms min ms max ms max ms name 34.5 0.04 0.04 10000 0.00 0.00 0.00 0.00 0.91 1.13 blackpool$_run_closure2$_closure3.roll 18.5 0.06 0.02 39984 0.00 0.00 0.00 0.00 0.53 0.53 java.lang.Integer.plus 15.9 0.08 0.02 10000 0.00 0.01 0.00 0.00 0.28 1.26 blackpool$_run_closure2$_closure3.doCall 11.0 0.10 0.01 30000 0.00 0.00 0.00 0.00 0.36 0.36 org.apache.commons.math3.random.MersenneTwister.nextInt 5.1 0.10 0.00 10000 0.00 0.00 0.00 0.00 0.19 0.19 java.util.LinkedHashMap.containsKey 4.4 0.11 0.00 10000 0.00 0.00 0.00 0.00 0.02 0.02 java.util.LinkedHashMap.putAt 4.0 0.12 0.00 1 5.25 125.62 5.25 125.62 5.25 125.62 java.lang.Integer.times 3.9 0.12 0.00 9984 0.00 0.00 0.00 0.00 0.02 0.02 java.util.LinkedHashMap.getAt 2.2 0.12 0.00 1 2.85 128.48 2.85 128.48 2.85 128.48 blackpool$_run_closure2.doCall But I don't really know how to interpret that. (Also, am I somehow using GProf wrongly? It seems like it shouldn't run out of memory profiling a 5-second program run...) Can anyone offer any advice on what I should be looking at here? Thanks, Paul