Hmm... "Justify/explain your beliefs." Fair, I'm game:
I'm fairly sure both of my methods are JIT'd, because perf-java-flames
shows them both in green, and I believe that colour is derived from a
prefix on the symbol name by which the JVM (via perf) marks a symbol as
JIT'd. If they weren't JIT'd, I'd see them listed in red as "Interpreter".
I waited for the system to warm up before grabbing the flame chart, and
on one run I was too early, so I actually saw one symbol shift from
interpreted to JIT'd. I have not used jitwatch, but the signs I am
looking at seem reliable. The JIT compiled the inner method first, then
the outer one, as expected based on call frequency.
I've never seen this resolve_static_call method show up before in any of
my profiling of Java code.
From that PoV, it looks as if the JVM is doing something odd with the
method call between the two methods, even though it should just be a
static linkage. Like it JIT'd both methods but didn't JIT the call, or
it's using some interpreter-path code in the call? So is there a linkage
limit based on the number of methods in the class, or something? Because
making the methods non-static ALSO moved them into a new class which has
only 2 methods... OK, new experiment...
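The "new experiment" I have in mind is roughly this shape: time the same trivial work once through a static method and once through an instance method on a tiny wrapper class, to separate "static vs. instance dispatch" from "big class vs. small class". All names below are hypothetical, not the original Parser code; timings are printed only as a rough indication.

```java
// Sketch: compare static-call and instance-call cost on identical work.
public class CallShapeExperiment {
    static int staticWork(int x) { return (x & 0x7f) | 1; }

    static final class Wrapper {
        int instanceWork(int x) { return (x & 0x7f) | 1; }
    }

    public static void main(String[] args) {
        final int N = 5_000_000;
        Wrapper w = new Wrapper();
        long s1 = 0, s2 = 0;

        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) s1 += staticWork(i);
        long t1 = System.nanoTime();
        for (int i = 0; i < N; i++) s2 += w.instanceWork(i);
        long t2 = System.nanoTime();

        // Same arithmetic either way, so the sums must agree.
        System.out.println("sumsEqual=" + (s1 == s2));
        System.out.println("staticMs=" + (t1 - t0) / 1_000_000
                + " instanceMs=" + (t2 - t1) / 1_000_000);
    }
}
```

This is not a rigorous benchmark (no JMH, no warmup isolation), just a quick way to see whether the two call shapes diverge wildly on the affected hardware.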
For what it's worth, I seem to be really struggling with JDK 1.8.0_191
performance on the new Xeon hardware for other reasons, too. I'm seeing
perf report pthread_cond_wait -> native_write_msr taking 50% of runtime,
and I'm not even sure where to start with that, except to limit the life
of any JVM to 6 hours and restart it. I'd like to blame a kernel / PMU
change, but it only affects the JVM.
Caveat: I don't do JVM internals, I'm mostly a JLS-layer muggle.
S.
On 2/4/19 10:53 PM, Todd Lipcon wrote:
On Mon, Feb 4, 2019 at 9:13 PM Shevek <[email protected]> wrote:
This isn't a JIT issue. According to perf-java-flames, all my code DID
get jitted. The overhead is entirely calls to this mystery
resolve_static_call function, so it looks like a static method lookup
issue in the JVM. The shape of the stack profile makes it look as if
something is recursive, too.
Are you sure? From the code it certainly looks like
'resolve_static_call' is part of the interpreter code path.
-Todd
On 2/4/19 8:01 PM, Todd Lipcon wrote:
> Tried looking at LogCompilation output with jitwatch? It's been helpful
> for me in the past to understand why something wouldn't get jitted.
>
> Todd
>
> On Mon, Feb 4, 2019, 7:54 PM Shevek <[email protected]> wrote:
>
> Update: I now think this is slow (but not AS slow) on the Core i7-5600U,
> so this may be a regression from _181 to _191, and not entirely
> CPU-dependent?
>
> Wrapping the two static methods in an otherwise-pointless class, and
> calling them as instance methods, made the code much faster.
>
> Is it relevant that the class in question is 522419 Kb in size and
> contains 1696 (mostly instance) methods? No individual method in it is
> larger than 8K, so they all JIT.
>
> The outer readVarintTable method is called about 100K-500K times, so
> there's plenty of chance to replace it.
>
> No synchronization is used.
>
> I'm still in "WAT?" territory.
>
> S.
>
> On 2/4/19 6:26 PM, Shevek wrote:
> > Hi,
> >
> > I have a very simple routine which, on some JVMs/systems which I have
> > not yet entirely narrowed down, suffers a 50x slowdown. The code is
> > included below.
> >
> > In perf-java-flames, I see:
> >
> > <clinit> -> readVarintTable (90%), of which:
> > readVarintTable -> readVarint (4%)
> > readVarintTable -> resolve_static_call -> libjvm.so (86%) <-- THE WAT?
> >
> > So what is a perfectly trivial method doing spending 90% of its time
> > inside resolve_static_call? What's going on?
> >
> > My google searches turned up a note to do with loop unrolling and some
> > optimizations breaking for static methods, so I will try this with
> > non-static methods.
> >
> > Slow on openjdk 1.8.0_191, Xeon E-2176M (Lenovo P1 laptop, 12-thread)
> > Fast on openjdk 1.8.0_191, Core i7-5600U (Lenovo T550 laptop, 4-thread)
> > I think Fast on Xeon E5620 (Supermicro rack, 8-thread).
> > I think Slow on AMD Epyc 7301 (16-core, 64-thread). Will investigate more.
> >
> > Knocking off the obvious:
> > * It's not doing meaningful amounts of allocation, and no GC.
> > * Total data size is 100M-1G.
> > * Both machines running same code, same dataset, ...
> > * This is single-threaded, and runs early in the JVM startup.
> > * It's doing I/O over JAR-resource -> BufferedInputStream ->
> >   DataInputStream, but it's not I/O contended, based on the calls in
> >   the flamegraph.
> >
> > But I feel that a 50x slowdown in an unexplained native call because of
> > ... what, the number of cores ... bears some explanation. I can post
> > the flamegraphs if that helps.
> >
> > And here is the code, which is as boring as anything, so what gives:
> >
> >
> >     /** Reads a little-endian varint with no optimization for
> >      * negative numbers. */
> >     private static int readVarint(@Nonnull DataInputStream in)
> >             throws IOException {
> >         int result = 0;
> >         for (int shift = 0; shift < 32; shift += 7) {
> >             int b = in.read();
> >             if (b == -1)
> >                 throw new EOFException("Truncated varint in stream.");
> >             result |= (b & 0x7f) << shift;
> >             if ((b & 0x80) == 0)
> >                 return result;
> >         }
> >         throw new IOException("Malformed varint in stream.");
> >     }
> >
> >     @Nonnull
> >     private static int[] readVarintTable(@Nonnull DataInputStream in,
> >             @Nonnegative int sublength) throws IOException {
> >         int[] out = new int[readVarint(in) * sublength];
> >         for (int i = 0; i < out.length; i++)
> >             out[i] = readVarint(in);
> >         return out;
> >     }
> >
> >     static {
> >         try {
> >             DataInputStream in = new DataInputStream(
> >                     new BufferedInputStream(
> >                             Parser.class.getResourceAsStream("Parser.dat")));
> >             table = readVarintTable(in);
> >         } // etc
> >
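[Editor's aside: for readers wanting to poke at the quoted varint logic in isolation, here is a self-contained round-trip of readVarint, copied from the message above so it compiles on its own. The writeVarint encoder and the VarintDemo class are additions for this demo only, not part of the original Parser class.]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;

public class VarintDemo {
    // Verbatim logic from the quoted readVarint (annotations dropped).
    static int readVarint(DataInputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b == -1)
                throw new EOFException("Truncated varint in stream.");
            result |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new IOException("Malformed varint in stream.");
    }

    // Matching little-endian base-128 encoder, added for the round trip.
    static void writeVarint(OutputStream out, int v) throws IOException {
        while ((v & ~0x7f) != 0) {
            out.write((v & 0x7f) | 0x80);
            v >>>= 7;
        }
        out.write(v);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int[] values = {0, 1, 127, 128, 300, 1 << 20};
        for (int v : values)
            writeVarint(buf, v);
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray()));
        for (int v : values)
            System.out.println(v + " -> " + readVarint(in));
    }
}
```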
>
> --
> You received this message because you are subscribed to the Google
> Groups "mechanical-sympathy" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>