John, Chris, thanks for the feedback.

I don't think it is only about microbenchmarks. Long-running large applications with lots of mutable call sites should also benefit from this change. The current JVM behavior counts invalidations against the root method, so nmethods with multiple mutable call sites (from the root and all inlined callees) are more likely to hit the limit, even if there are no mega-mutable sites. The counts simply add up, and PerMethodRecompilationCutoff (= 400 by default) doesn't look like a huge number.
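To make the arithmetic concrete, here is a minimal Java sketch. It only counts target changes at the Java level; it does not observe real nmethod invalidations, and whether a given change actually invalidates compiled code depends on what the JIT has inlined. The class name, the helper methods, and the choice of 8 sites with 50 changes each are assumptions made for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MutableCallSite;

class MutableSitesSketch {
    // Default value of PerMethodRecompilationCutoff mentioned above.
    static final int CUTOFF = 400;

    // Hypothetical helper: creates n mutable call sites of type ()int.
    static MutableCallSite[] newSites(int n) {
        MutableCallSite[] sites = new MutableCallSite[n];
        for (int i = 0; i < n; i++) {
            sites[i] = new MutableCallSite(MethodHandles.constant(int.class, 0));
        }
        return sites;
    }

    // Retargets every site `rounds` times and returns the number of target changes.
    // Assumption: if all sites are inlined under one root method, each change
    // may be charged to that root, so the per-site counts add up.
    static int retargetAll(MutableCallSite[] sites, int rounds) {
        int changes = 0;
        for (int r = 1; r <= rounds; r++) {
            for (MutableCallSite cs : sites) {
                cs.setTarget(MethodHandles.constant(int.class, r));
                changes++;
            }
        }
        return changes;
    }

    // Invokes a site through its dynamic invoker and returns the current constant.
    static int read(MutableCallSite cs) {
        try {
            return (int) cs.dynamicInvoker().invokeExact();
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        MutableCallSite[] sites = newSites(8);
        int changes = retargetAll(sites, 50);
        System.out.println(changes + " target changes, cutoff " + CUTOFF);
        System.out.println("site 0 now returns " + read(sites[0]));
    }
}
```

No single site here is mega-mutable (50 changes each), yet the total charged to one root would already equal the default cutoff.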

Also, LambdaForm sharing somewhat worsens the situation. When LambdaForms were mostly customized, different method handle chains were compiled into a single nmethod. Right now, not only is the root method always interpreted, but all bound method handle chains are broken into numerous per-LF nmethods (see JDK-8069591 for some details).

MLVM folks, I'd like to hear your opinion about what kind of behavior you expect from the JVM w.r.t. mutable call sites.

There are valid use cases where the JVM shouldn't throttle recompilation (e.g., a long-running application with indy-based dynamic tracing). Maybe there's a place for a new CallSite flavor to clearly communicate application expectations to the JVM? Either always recompile (thus eventually reaching peak performance) or give up and generate less efficient machine code, but save on possible recompilations.

Best regards,
Vladimir Ivanov

On 1/20/16 2:37 AM, John Rose wrote:
On Jan 18, 2016, at 4:54 AM, Vladimir Ivanov wrote:

The fix is to avoid updating the recompilation count when the corresponding
nmethod is invalidated due to a call site target change.

Although I'm not vetoing it (since it seems it will help customers in
the short term), I'm uncomfortable with this fix because it doesn't
scale to large dyn. lang. applications with many unstable call sites.
  Put another way, it feels like we are duct-taping down a failsafe
switch (against infinite recompilation) in order to spam a
micro-benchmark:  a small number of mega-mutable call sites for which we
are willing to spend (potentially) all of the JIT resources, including
those usually allocated to application performance in the steady state.
  Put a third way:  I am not comfortable with unthrottled infinite
recompilation as a performance strategy.

I've commented on the new RFE (JDK-8147550) where to go next, including
the following sentiments:

There is a serious design tension here, though: Some users apparently
are willing to endure an infinite series of recompilations as part of
the cost of doing business; JDK-7177745 addresses this need by turning
off the fail-safe against (accidental, buggy) infinite recompilation
for unstable CSs. Other users might find that having a percentage of
machine time devoted to recompilation is a problem. (This has been the
case in the past with non-dynamic languages, at least.) The code shape
proposed in this bug report would cover all simple unstable call
sites (bi-stable, for example, would compile to a bi-morphic call),
but, in pathological cases (infinite sequence of distinct CS targets)
would "settle down" into a code shape that would be sub-optimal for
any single target, but (as an indirect MH call) reasonable for all the
targets together.
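
As a rough illustration of that code shape, the Java-level model below guards two remembered targets and falls back to a generic indirect method handle call for anything else. It is only a sketch of the idea, not what HotSpot generates; the class, its fields, and the counters are hypothetical names introduced for this example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class BimorphicShapeSketch {
    static int inc(int x) { return x + 1; }
    static int dbl(int x) { return x * 2; }
    static int neg(int x) { return -x; }

    // The two targets the modeled "compiled code" is specialized for.
    final MethodHandle first;
    final MethodHandle second;
    int fastHits;   // calls that matched one of the two guarded targets
    int slowHits;   // calls that fell through to the indirect path

    BimorphicShapeSketch(MethodHandle first, MethodHandle second) {
        this.first = first;
        this.second = second;
    }

    // Hypothetical helper: looks up one of the static (int)int methods above.
    static MethodHandle handle(String name) {
        try {
            return MethodHandles.lookup().findStatic(
                    BimorphicShapeSketch.class, name,
                    MethodType.methodType(int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new AssertionError(e);
        }
    }

    // Models one call through the site with its current target.
    int call(MethodHandle current, int x) {
        if (current == first || current == second) {
            fastHits++;   // stands in for a guarded, inlined copy of the target
        } else {
            slowHits++;   // stands in for the generic indirect MH call
        }
        try {
            return (int) current.invokeExact(x);
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        MethodHandle inc = handle("inc"), dbl = handle("dbl"), neg = handle("neg");
        BimorphicShapeSketch site = new BimorphicShapeSketch(inc, dbl);
        System.out.println(site.call(inc, 3));
        System.out.println(site.call(dbl, 3));
        System.out.println(site.call(neg, 3));
        System.out.println("fast=" + site.fastHits + " slow=" + site.slowHits);
    }
}
```

In this model a bi-stable site stays on the guarded paths, while an endless stream of new targets costs one indirect call each and never triggers a recompilation.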

In the absence of clear direction from the user or the profile, the
JVM has to choose infinite recompilation or a good-enough final
compilation. The latter choice is safer. And the
infinite recompilation is less safe because there is no intrinsic
bound on the amount of machine cycles that could be diverted to
recompilation, given a dynamic language application with
enough mega-mutable CSs. Settling down to a network of indirect calls
has a bounded cost.

Yes, one size-fits-all tactics never please everybody. But the JVM
should not choose tactics with unlimited downsides.

— John
mlvm-dev mailing list