Are you benchmarking on a multi-socket/NUMA server?

On Tue, Aug 1, 2017, 1:48 PM Wojciech Kudla <[email protected]> wrote:
> It definitely makes sense to have a look at GC activity, but I would
> suggest looking at safepoints from a broader perspective. Just use
> -XX:+PrintGCApplicationStoppedTime to see what's going on. If it's
> safepoints, you can get more detail from safepoint statistics.
> Also, benchmark runs in Java may appear non-deterministic simply because
> compilation happens in background threads by default, and some runs may
> exhibit a different runtime profile since the compilation threads receive
> their time slices at different moments throughout the benchmark.
> Are the results also jittery when run entirely in interpreted mode? It may
> be worth experimenting with various compilation settings (i.e. disable
> tiered compilation, employ different warmup strategies, play around with
> compiler control).
> Are you affinitizing threads to CPUs in any way?
> Are you running on a multi-socket setup?
>
> On Tue, 1 Aug 2017, 19:27 Roger Alsing, <[email protected]> wrote:
>
>> Some context: I'm building an actor framework, similar to Akka but
>> polyglot/cross-platform.
>> For each platform we have the same benchmarks, one of which is an
>> in-process ping-pong benchmark.
>>
>> On .NET and Go, we can spin up pairs of ping-pong actors equal to the
>> number of cores in the CPU, and even if we spin up more pairs, the
>> total throughput remains roughly the same.
>> But on the JVM, if we do this, I can see that we max out at 100% CPU, as
>> expected; yet if I instead spin up many more pairs, e.g. 20 * core_count,
>> the total throughput triples.
>>
>> I suspect this is due to the system running in a more steady-state
>> fashion in the latter case: mailboxes are never completely drained, and
>> actors don't have to switch between processing and idle.
>> Would this be fair to assume?
>> This is the reason why I believe this is a question for this specific
>> forum.
>>
>> Now to the real question. Roughly 60% of the time when the benchmark is
>> started, it runs steadily at 250 mil msg/sec; the other ~40% of the
>> time it runs at 350 mil msg/sec.
>> The reason I find this strange is that each mode is stable over time: if
>> I don't stop the benchmark, it will continue at the same pace.
>>
>> If anyone is bored and would like to try it out, the repo is here:
>> https://github.com/AsynkronIT/protoactor-kotlin
>> and the actual benchmark is here:
>> https://github.com/AsynkronIT/protoactor-kotlin/blob/master/examples/src/main/kotlin/actor/proto/examples/inprocessbenchmark/InProcessBenchmark.kt
>>
>> This is consistent with or without various VM arguments.
>>
>> I'm very interested to hear if anyone has any theories about what could
>> cause this behavior.
>>
>> One factor that seems to be involved is GC, but not in the obvious way;
>> rather the reverse.
>> In the beginning, when the framework allocated more memory, it more
>> often ran at the high speed.
>> And the fewer allocations I've managed to get it down to without
>> touching the hot path, the more the benchmark has started to toggle
>> between these two numbers.
>>
>> Thoughts?

--
You received this message because you are subscribed to the Google Groups
"mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
For more options, visit https://groups.google.com/d/optout.
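[Editor's note] The diagnostics suggested in the thread map to concrete HotSpot flags (JDK 8 era, matching the 2017 date of this thread; `benchmark.jar` is a placeholder for the actual benchmark entry point). A sketch of how one might run them:

```shell
# Log every stop-the-world pause (safepoints), not just GC pauses,
# and print per-safepoint statistics as they occur.
java -XX:+PrintGCApplicationStoppedTime \
     -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 \
     -jar benchmark.jar

# Rule out background-JIT jitter: disable tiered compilation...
java -XX:-TieredCompilation -jar benchmark.jar

# ...or run fully interpreted (much slower, but no compiler threads at all).
java -Xint -jar benchmark.jar
```

If the two throughput modes persist even under `-Xint`, compilation timing is unlikely to be the cause, which would point back at scheduling, NUMA placement, or allocation/GC effects.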
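[Editor's note] For readers without an actor runtime at hand, the shape of the ping-pong benchmark discussed above can be sketched with plain threads and bounded queues standing in for actor mailboxes. This is a hypothetical illustration, not the protoactor-kotlin code; the message count and queue capacity are arbitrary:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// One "ping-pong pair": a pinger thread sends messages to a ponger,
// which echoes each one back. Bounded queues play the role of mailboxes.
public class PingPong {
    static long runPair(int messages) throws InterruptedException {
        BlockingQueue<Integer> toPonger = new ArrayBlockingQueue<>(1024);
        BlockingQueue<Integer> toPinger = new ArrayBlockingQueue<>(1024);

        Thread ponger = new Thread(() -> {
            try {
                for (int i = 0; i < messages; i++) {
                    toPinger.put(toPonger.take()); // echo each message back
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        ponger.start();

        long received = 0;
        for (int i = 0; i < messages; i++) {
            toPonger.put(i);   // "ping"
            toPinger.take();   // wait for the "pong"
            received++;
        }
        ponger.join();
        return received;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        long n = runPair(100_000);
        double secs = (System.nanoTime() - start) / 1e9;
        System.out.println(n + " round trips in " + secs + " s");
    }
}
```

Running many such pairs on an executor is how the core_count vs. 20 * core_count comparison would be made. Note that with strict request/response round trips like this, each mailbox is drained after every message, which is exactly the processing/idle switching Roger suspects costs throughput at low pair counts.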
