Hmm... "Justify/explain your beliefs." Fair, I'm game:
I'm fairly sure both of my methods are JIT'd, because perf-java-flames
shows them both in green, and I believe that colour is derived from a
prefix on the symbol name by which the JVM (via perf) marks a symbol as
JIT'd. If they weren't JIT'd, I'd see them listed in red as "Interpreter".
I waited for the system to warm up before grabbing the flame chart, and
on one run I was too early, so I actually saw one symbol shift from
interpreted to JIT'd. I have not used jitwatch, but the signs I am
looking at seem reliable. The JIT compiled the inner method first, then
the outer one, as expected based on call frequency.
I've never seen this resolve_static_call method show up before in any of
my profiling of Java code.
From that PoV, it looks as if the JVM is doing something odd with the
method call between the two methods, even though it should just be a
static linkage. Like it JIT'd both methods but didn't JIT the call, or
it's using some interpreter-path code in the call? So is there a linkage
limit based on the number of methods in the class, or something? Because
making the methods non-static ALSO moved them into a new class which has
only 2 methods... OK, new experiment...
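The "new experiment" I have in mind is roughly this shape: time the same trivial work once through a static method and once through an instance method on a tiny wrapper class, to separate "static vs. instance dispatch" from "big class vs. small class". All names below are hypothetical, not the original Parser code; timings are printed only as a rough indication.

```java
// Sketch: compare static-call and instance-call cost on identical work.
public class CallShapeExperiment {
    static int staticWork(int x) { return (x & 0x7f) | 1; }

    static final class Wrapper {
        int instanceWork(int x) { return (x & 0x7f) | 1; }
    }

    public static void main(String[] args) {
        final int N = 5_000_000;
        Wrapper w = new Wrapper();
        long s1 = 0, s2 = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) s1 += staticWork(i);
        long t1 = System.nanoTime();
        for (int i = 0; i < N; i++) s2 += w.instanceWork(i);
        long t2 = System.nanoTime();

        // Same arithmetic either way, so the sums must agree.
        System.out.println("sumsEqual=" + (s1 == s2));
        System.out.println("staticMs=" + (t1 - t0) / 1_000_000
                + " instanceMs=" + (t2 - t1) / 1_000_000);
    }
}
```

This is not a rigorous benchmark (no JMH, no warmup isolation), just a quick way to see whether the two call shapes diverge wildly on the affected hardware.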
For what it's worth, I seem to be really struggling with JDK 1.8.0_191
performance on the new Xeon hardware for other reasons, too. I'm seeing
perf report pthread_cond_wait -> native_write_msr taking 50% of runtime,
and I'm not even sure where to start with that, except to limit the life
of any JVM to 6 hours and restart it. I'd like to blame a kernel / PMU
change, but it only affects the JVM.
Caveat: I don't do JVM internals, I'm mostly a JLS-layer muggle.
S.
On 2/4/19 10:53 PM, Todd Lipcon wrote:
On Mon, Feb 4, 2019 at 9:13 PM Shevek <[email protected]> wrote:
This isn't a JIT issue. According to perf-java-flames, all my code DID
get jitted. The overhead is entirely calls to this mystery
resolve_static_call function, so it looks like a static method lookup
issue in the JVM. The shape of the stack profile makes it look as if
something is recursive, too.
Are you sure? From the code it certainly looks like
'resolve_static_call' is part of the interpreter code path.
-Todd
On 2/4/19 8:01 PM, Todd Lipcon wrote:
> Tried looking at LogCompilation output with jitwatch? It's been helpful
> for me in the past to understand why something wouldn't get jitted.
>
> Todd
>
> On Mon, Feb 4, 2019, 7:54 PM Shevek <[email protected]> wrote:
>
> Update: I now think this is slow (but not AS slow) on the Core i7-5600U,
> so this may be a regression from _181 to _191, and not entirely
> CPU-dependent?
>
> Wrapping the two static methods in an otherwise-pointless class, and
> calling them as instance methods, made the code much faster.
>
> Is it relevant that the class in question is 522419 Kb in size and
> contains 1696 (mostly instance) methods? No individual method in it is
> larger than 8K, so they all JIT.
>
> The outer readVarintTable method is called about 100K-500K times, so
> there's plenty of chance to replace it.
>
> No synchronization is used.
>
> I'm still in "WAT?" territory.
>
> S.
>
> On 2/4/19 6:26 PM, Shevek wrote:
> > Hi,
> >
> > I have a very simple routine which, on some JVMs/systems which I have
> > not yet entirely narrowed down, suffers a 50x slowdown. The code is
> > included below.
> >
> > In perf-java-flames, I see:
> >
> > <clinit> -> readVarintTable (90%), of which:
> > readVarintTable -> readVarint (4%)
> > readVarintTable -> resolve_static_call -> libjvm.so (86%) <-- THE WAT?
> >
> > So what is a perfectly trivial method doing spending 90% of its time
> > inside resolve_static_call? What's going on?
> >
> > My google searches turned up a note to do with loop unrolling and some
> > optimizations breaking for static methods, so I will try this with
> > non-static methods.
> >
> > Slow on openjdk 1.8.0_191, Xeon E-2176M (Lenovo P1 laptop, 12-thread)
> > Fast on openjdk 1.8.0_191, Core i7-5600U (Lenovo T550 laptop, 4-thread)
> > I think Fast on Xeon E5620 (Supermicro rack, 8-thread).
> > I think Slow on AMD Epyc 7301 (16-core, 64-thread). Will investigate more.
> >
> > Knocking off the obvious:
> > * It's not doing meaningful amounts of allocation, and no GC.
> > * Total data size is 100M-1G.
> > * Both machines running same code, same dataset, ...
> > * This is single-threaded, and runs early in the JVM startup.
> > * It's doing I/O over JAR-resource -> BufferedInputStream ->
> >   DataInputStream, but it's not I/O contended, based on the calls in
> >   the flamegraph.
> >
> > But I feel that a 50x slowdown in an unexplained native call because of
> > ... what, the number of cores ... bears some explanation. I can post
> > the flamegraphs if that helps.
> >
> > And here is the code, which is as boring as anything, so what gives:
> >
> >
> >     /** Reads a little-endian varint with no optimization for
> >      * negative numbers. */
> >     private static int readVarint(@Nonnull DataInputStream in)
> >             throws IOException {
> >         int result = 0;
> >         for (int shift = 0; shift < 32; shift += 7) {
> >             int b = in.read();
> >             if (b == -1)
> >                 throw new EOFException("Truncated varint in stream.");
> >             result |= (b & 0x7f) << shift;
> >             if ((b & 0x80) == 0)
> >                 return result;
> >         }
> >         throw new IOException("Malformed varint in stream.");
> >     }
> >
> >     @Nonnull
> >     private static int[] readVarintTable(@Nonnull DataInputStream in,
> >             @Nonnegative int sublength) throws IOException {
> >         int[] out = new int[readVarint(in) * sublength];
> >         for (int i = 0; i < out.length; i++)
> >             out[i] = readVarint(in);
> >         return out;
> >     }
> >
> >     static {
> >         try {
> >             DataInputStream in = new DataInputStream(
> >                     new BufferedInputStream(
> >                             Parser.class.getResourceAsStream("Parser.dat")));
> >             table = readVarintTable(in);
> >         } // etc
> >
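[Editor's aside: for readers wanting to poke at the quoted varint logic in isolation, here is a self-contained round-trip of readVarint, copied from the message above so it compiles on its own. The writeVarint encoder and the VarintDemo class are additions for this demo only, not part of the original Parser class.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;

public class VarintDemo {
    // Verbatim logic from the quoted readVarint (annotations dropped).
    static int readVarint(DataInputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b == -1)
                throw new EOFException("Truncated varint in stream.");
            result |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new IOException("Malformed varint in stream.");
    }

    // Matching little-endian base-128 encoder, added for the round trip.
    static void writeVarint(OutputStream out, int v) throws IOException {
        while ((v & ~0x7f) != 0) {
            out.write((v & 0x7f) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int[] values = {0, 1, 127, 128, 300, 1 << 20};
        for (int v : values)
            writeVarint(buf, v);
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        for (int v : values)
            System.out.println(v + " -> " + readVarint(in));
    }
}
```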
>
> --
> You received this message because you are subscribed to the Google
> Groups "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>