I suspect that the build phase of an app using Graal AOT could suffer from the same limitation, but it is true that static initializers *shouldn't* be used for heavyweight computations, given that they are guaranteed to be executed serially...
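For concreteness, here is a minimal sketch of the pattern Andrei recommends below: move the heavy loop into a separate class whose own initialization completes before the work starts. All class and method names here are invented for illustration, and how much the first version actually suffers depends on the JDK version, as discussed in the thread.

    // BAD: buildTable() runs while Slow is still being initialized, so on
    // affected JDK versions each inner() call may take the
    // resolve_static_call slow path.
    class Slow {
        static final int[] TABLE = buildTable();

        private static int[] buildTable() {
            int[] t = new int[1_000_000];
            for (int i = 0; i < t.length; i++)
                t[i] = inner(i);    // hot static call on an uninitialized class
            return t;
        }

        private static int inner(int i) {
            return i * 31;          // stand-in for the real per-element work
        }
    }

    // BETTER: Fast's <clinit> only delegates. Helper is fully initialized
    // by the time buildTable() executes, so the static calls inside it
    // link and optimize normally.
    class Fast {
        static final int[] TABLE = Helper.buildTable();
    }

    class Helper {
        static int[] buildTable() {
            int[] t = new int[1_000_000];
            for (int i = 0; i < t.length; i++)
                t[i] = inner(i);    // static call on a fully initialized class
            return t;
        }

        private static int inner(int i) {
            return i * 31;
        }
    }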
On Fri, Feb 8, 2019, 02:36 Andrei Pangin <[email protected]> wrote:

> Looks like you run a long computation inside the static initializer
> <clinit>, right?
> The problem is that HotSpot cannot fully optimize static calls when the
> holder class is not completely initialized. Moving the computation to a
> separate class is the right way to deal with such cases.
>
> The situation got even worse in recent JDK updates. There was a zero-day
> bug, https://bugs.openjdk.java.net/browse/JDK-8215634, that could result
> in the invocation of a static method of an uninitialized class, in
> violation of the JVMS. The bug was fixed in JDK 8u201 and JDK 11.0.2.
> However, the fix is terrible from a performance perspective: the
> invocation of a static method of an uninitialized class now goes through
> the slow path, in particular through the resolve_static_call() JVM
> runtime function.
>
> So the recommendation is simple: move heavy logic out of a class that is
> being initialized.
>
> - Andrei
>
> On Thursday, February 7, 2019, at 1:54:29 UTC+3, Shevek wrote:
>>
>> Hmm... "Justify/explain your beliefs." Fair, I'm game:
>>
>> I think I'm sure that my methods are both JIT'd, because perf-java-flames
>> shows them both in green, and I think that's derived from some prefix on
>> the symbol name where the JVM (via perf) says it's JIT'd. If they weren't
>> JIT'd, I'd see them listed in red as "Interpreter".
>>
>> I waited for the system to warm up before grabbing the flame chart; on
>> one run I was too early, so I actually saw one symbol shift from
>> interpreted to JIT'd. I have not used jitwatch, but the signs I am
>> looking at seem reliable. The JVM JIT'd the inner method first, then the
>> outer one, as expected based on call frequency.
>>
>> I've never seen this resolve_static_call method show up before in any of
>> my profiling of Java code.
>>
>> From that PoV, it looks as if the interpreter is doing something odd
>> with the method call between the two methods, despite the fact that it
>> should just be a static linkage. Like it JIT'd both methods but didn't
>> JIT the call, or it's using some interpreter-path code in the call? So
>> is there a linkage limit based on the number of methods in the class, or
>> something? Because making the methods non-static ALSO moved them into a
>> new class which has only 2 methods... OK, new experiment...
>>
>> For what it's worth, I seem to be really struggling with JDK 1.8.0_191
>> performance on the new Xeon hardware for other reasons, too. I'm seeing
>> perf say that pthread_cond_wait -> native_write_msr is taking 50% of
>> runtime, and I'm not even sure where to start with that, except to limit
>> the life of any JVM to 6 hours and restart it. I kind of want to blame a
>> kernel / PMU change, but it only affects the JVM.
>>
>> Caveat: I don't do JVM internals; I'm mostly a JLS-layer muggle.
>>
>> S.
>>
>> On 2/4/19 10:53 PM, Todd Lipcon wrote:
>> > On Mon, Feb 4, 2019 at 9:13 PM Shevek <[email protected]> wrote:
>> >
>> >     This isn't a JIT issue. According to perf-java-flames, all my code
>> >     DID get JIT'd. The overhead is entirely calls to this mystery
>> >     resolve_static_call function, so it looks like a static method
>> >     lookup issue in the JVM. The shape of the stack profile makes it
>> >     look as if something is recursive, too.
>> >
>> > Are you sure? From the code it certainly looks like
>> > 'resolve_static_call' is part of the interpreter code path.
>> >
>> > -Todd
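As an aside, the effect Andrei describes is straightforward to observe directly. Here is a hedged micro-repro sketch (class and method names invented) that times the same hot static-to-static call once from inside <clinit> and once after initialization has completed; on the JDK versions discussed above, the first loop can be dramatically slower:

    // Hypothetical repro: the same hot loop, run while the class is still
    // initializing and again afterwards from main().
    class ClinitRepro {
        static final long IN_CLINIT_NANOS;
        static long sink;   // consumed below so the loops are not optimized away

        static {
            long t0 = System.nanoTime();
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++)
                sum += inner(i);    // static call while ClinitRepro is uninitialized
            sink = sum;
            IN_CLINIT_NANOS = System.nanoTime() - t0;
        }

        static int inner(int i) {
            return i & 0x7f;        // trivial body, so call overhead dominates
        }

        public static void main(String[] args) {
            long t0 = System.nanoTime();
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++)
                sum += inner(i);    // same call, class now fully initialized
            long afterNanos = System.nanoTime() - t0;
            System.out.println("in <clinit>: " + IN_CLINIT_NANOS + " ns, "
                    + "after init: " + afterNanos + " ns, sink=" + (sink + sum));
        }
    }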
>> >
>> > On 2/4/19 8:01 PM, Todd Lipcon wrote:
>> > > Tried looking at LogCompilation output with jitwatch? It's been
>> > > helpful for me in the past to understand why something wouldn't get
>> > > JIT'd.
>> > >
>> > > Todd
>> > >
>> > > On Mon, Feb 4, 2019, 7:54 PM Shevek <[email protected]> wrote:
>> > >
>> > > Update: I now think this is slow (but not AS slow) on the Core
>> > > i7-5600U, so this may be a regression from _181 to _191, and not
>> > > entirely CPU-dependent?
>> > >
>> > > Wrapping the two static methods in an otherwise-pointless class, and
>> > > calling them as instance methods, made the code much faster.
>> > >
>> > > Is it relevant that the class in question is 522419 Kb in size and
>> > > contains 1696 (mostly instance) methods? No individual method in it
>> > > is larger than 8K, so they all JIT.
>> > >
>> > > The outer readVarintTable method is called about 100K-500K times, so
>> > > there's plenty of chance to replace it.
>> > >
>> > > No synchronization is used.
>> > >
>> > > I'm still in "WAT?" territory.
>> > >
>> > > S.
>> > >
>> > > On 2/4/19 6:26 PM, Shevek wrote:
>> > > > Hi,
>> > > >
>> > > > I have a very simple routine which, on some JVMs/systems I have
>> > > > not yet entirely narrowed down, suffers a 50x slowdown. The code
>> > > > is included below.
>> > > >
>> > > > In perf-java-flames, I see:
>> > > >
>> > > >     <clinit> -> readVarintTable (90%), of which:
>> > > >       readVarintTable -> readVarint (4%)
>> > > >       readVarintTable -> resolve_static_call -> libjvm.so (86%)  <-- THE WAT?
>> > > >
>> > > > So what is a perfectly trivial method doing spending 90% of its
>> > > > time inside resolve_static_call? What's going on?
>> > > >
>> > > > My google searches turned up a note to do with loop unrolling and
>> > > > some optimizations breaking for static methods, so I will try this
>> > > > with non-static methods.
>> > > >
>> > > > Slow on openjdk 1.8.0_191, Xeon E-2176M (Lenovo P1 laptop, 12-thread).
>> > > > Fast on openjdk 1.8.0_191, Core i7-5600U (Lenovo T550 laptop, 4-thread).
>> > > > I think Fast on Xeon E5620 (Supermicro rack, 8-thread).
>> > > > I think Slow on AMD Epyc 7301 (16-core, 64-thread). Will investigate more.
>> > > >
>> > > > Knocking off the obvious:
>> > > > * It's not doing meaningful amounts of allocation, and no GC.
>> > > > * Total data size is 100M-1G.
>> > > > * Both machines are running the same code, same dataset, ...
>> > > > * This is single-threaded, and runs early in JVM startup.
>> > > > * It's doing I/O over JAR resource -> BufferedInputStream ->
>> > > >   DataInputStream, but it's not I/O-contended, based on the calls
>> > > >   in the flamegraph.
>> > > >
>> > > > But I feel that a 50x slowdown in an unexplained native call
>> > > > because of ... what, the number of cores ... bears some
>> > > > explanation. I can post the flamegraphs if that helps.
>> > > >
>> > > > And here is the code, which is as boring as anything, so what
>> > > > gives:
>> > > >
>> > > >     /** Reads a little-endian varint with no optimization
>> > > >      *  for negative numbers. */
>> > > >     private static int readVarint(@Nonnull DataInputStream in)
>> > > >             throws IOException {
>> > > >         int result = 0;
>> > > >         for (int shift = 0; shift < 32; shift += 7) {
>> > > >             int b = in.read();
>> > > >             if (b == -1)
>> > > >                 throw new EOFException("Truncated varint in stream.");
>> > > >             result |= (b & 0x7f) << shift;
>> > > >             if ((b & 0x80) == 0)
>> > > >                 return result;
>> > > >         }
>> > > >         throw new IOException("Malformed varint in stream.");
>> > > >     }
>> > > >
>> > > >     @Nonnull
>> > > >     private static int[] readVarintTable(@Nonnull DataInputStream in,
>> > > >             @Nonnegative int sublength) throws IOException {
>> > > >         int[] out = new int[readVarint(in) * sublength];
>> > > >         for (int i = 0; i < out.length; i++)
>> > > >             out[i] = readVarint(in);
>> > > >         return out;
>> > > >     }
>> > > >
>> > > >     static {
>> > > >         try {
>> > > >             DataInputStream in = new DataInputStream(
>> > > >                     new BufferedInputStream(
>> > > >                             Parser.class.getResourceAsStream("Parser.dat")));
>> > > >             table = readVarintTable(in);
>> > > >         } // etc
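For completeness, here is a sketch of the workaround Shevek reports above: wrapping the two static methods in an otherwise-pointless class and calling them as instance methods. The VarintReader class name is invented for illustration, and the call-site plumbing in the static initializer is elided just as it is in the original.

    import java.io.DataInputStream;
    import java.io.EOFException;
    import java.io.IOException;
    import javax.annotation.Nonnegative;
    import javax.annotation.Nonnull;

    // Hypothetical wrapper class: its own <clinit> is trivial and completes
    // before the methods are called, so the readVarintTable -> readVarint
    // calls no longer resolve against an uninitialized class during the
    // caller's static initializer.
    class VarintReader {

        /** Reads a little-endian varint with no optimization
         *  for negative numbers. */
        private int readVarint(@Nonnull DataInputStream in) throws IOException {
            int result = 0;
            for (int shift = 0; shift < 32; shift += 7) {
                int b = in.read();
                if (b == -1)
                    throw new EOFException("Truncated varint in stream.");
                result |= (b & 0x7f) << shift;
                if ((b & 0x80) == 0)
                    return result;
            }
            throw new IOException("Malformed varint in stream.");
        }

        @Nonnull
        public int[] readVarintTable(@Nonnull DataInputStream in,
                @Nonnegative int sublength) throws IOException {
            int[] out = new int[readVarint(in) * sublength];
            for (int i = 0; i < out.length; i++)
                out[i] = readVarint(in);
            return out;
        }
    }

The static initializer would then construct a VarintReader once and call readVarintTable on it, with whatever sublength the elided call site supplies. This is behaviorally the same fix as moving the logic to a separate class of static methods: what matters is that the hot calls are made against a class whose initialization has already completed.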
