Are you benchmarking on a multi-socket/NUMA server?

On Tue, Aug 1, 2017, 1:48 PM Wojciech Kudla <[email protected]>
wrote:

> It definitely makes sense to have a look at gc activity, but I would
> suggest looking at safepoints from a broader perspective. Just use
>  -XX:+PrintGCApplicationStoppedTime to see what's going on. If it's
> safepoints, you could get more details with safepoint statistics.
> Also, benchmark runs in Java may appear nondeterministic simply because
> compilation happens in background threads by default, and some runs may
> exhibit a different runtime profile since the compilation threads receive
> their time slices at different moments throughout the benchmark.
> Are the results also jittery when run entirely in interpreted mode? It may
> be worth experimenting with various compilation settings (e.g. disable
> tiered compilation, employ different warmup strategies, play around with
> compiler control).
> Are you pinning (affinitizing) threads to CPUs in any way?
> Are you running on a multi-socket setup?
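> For reference, the safepoint and compilation diagnostics above can be
> enabled roughly as follows on a JDK 8-era HotSpot (flag names assume
> Oracle/OpenJDK HotSpot, and InProcessBenchmarkKt is just a placeholder
> main class):

```
# Sketch only, assuming JDK 8 HotSpot; substitute your own main class.
#
#   -XX:+PrintGCApplicationStoppedTime
#       print how long application threads were stopped (safepoints, incl. GC)
#   -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
#       break safepoints down by the VM operation that triggered them
#   -XX:-TieredCompilation
#       disable tiered compilation to reduce compiler-induced run-to-run variance
#   -Xint
#       (alternative) run fully interpreted to rule the JIT out entirely

java -XX:+PrintGCApplicationStoppedTime \
     -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 \
     -XX:-TieredCompilation \
     InProcessBenchmarkKt
```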
>
> On Tue, 1 Aug 2017, 19:27 Roger Alsing, <[email protected]> wrote:
>
>> Some context: I'm building an actor framework, similar to Akka but
>> polyglot/cross-platform.
>> For each platform we have the same benchmarks, where one of them is an in
>> process ping-pong benchmark.
>>
>> On .NET and Go, we can spin up pairs of ping-pong actors equal to the
>> number of cores in the CPU and no matter if we spin up more pairs, the
>> total throughput remains roughly the same.
>> But on the JVM, if we do this, I can see that we max out at 100% CPU, as
>> expected; yet if I instead spin up many more pairs, e.g. 20 * core_count,
>> the total throughput triples.
>>
>> I suspect this is because the system runs in a more steady-state fashion
>> in the latter case: mailboxes are never completely drained, so actors
>> don't have to switch between processing and idle.
>> Would that be a fair assumption?
>> This is the reason why I believe this is a question for this specific
>> forum.
>>
>> Now to the real question: in roughly a 60-40 split, when the benchmark is
>> started it either runs steadily at 250 mil msg/sec or at 350 mil msg/sec.
>> The reason I find this strange is that each mode is stable over time: if
>> I don't stop the benchmark, it continues at the same pace.
>>
>> If anyone is bored and would like to try it out, the repo is here:
>> https://github.com/AsynkronIT/protoactor-kotlin
>> and the actual benchmark here:
>> https://github.com/AsynkronIT/protoactor-kotlin/blob/master/examples/src/main/kotlin/actor/proto/examples/inprocessbenchmark/InProcessBenchmark.kt
>>
>> This behavior is consistent with or without various VM arguments.
>>
>> I'm very interested to hear if anyone has theories about what could cause
>> this behavior.
>>
>> One factor that seems to be involved is GC, but not in the obvious way;
>> rather the reverse.
>> In the beginning, when the framework allocated more memory, it more often
>> ran at the high speed.
>> And the more allocations I've managed to eliminate without touching the
>> hot path, the more the benchmark has started to toggle between these two
>> numbers.
>>
>> Thoughts?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "mechanical-sympathy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
