Re: The Great Startup Problem

Marcus Lagergren Sat, 23 Aug 2014 06:12:07 -0700

I agree completely with Charlie’s assessment that LambdaForms are a 
problematic mechanism for indy call site linking due to their

* Lack of scalability (explosion of byte code)
* Metaspace usage

and everything else that has been described below.

I’m currently recovering after surgery and a bit disoriented and confused, but 
I’ll try to write a longer reply on Monday or Tuesday. 

This post illustrates perfectly to me why it’s important to replace 
LambdaForms as they are currently implemented with something else, probably 
in the 9 timeframe at the latest. This may be a market window of opportunity 
that is slowly sliding away from us. (PM and management might of course have 
things to say here, but for me this thing keeps bubbling up behind me as soon 
as I look away. It certainly has for the last 12-18 months.) 

We have identical issues with LambdaForms in Nashorn. Some of them can be 
solved with AOT, and have been in 8u20 (the second run of any Nashorn 
application can use a persistent code cache complete with optimizations and 
known types), but for the first run we have similar issues. Some of the warmup 
problem can be addressed by interpreting the JavaScript AST (we currently 
don’t do that) in a profiling pass, but once we lay out indys we are still 
going to have the scalability problem you described, just with a lowered 
constant factor. (But who wants to write an interpreter on top of an 
interpreter? There are debugging issues, security issues and such.) Also, 
interpreters are not particularly known to be fast either. We are exploring 
various other ways to avoid laying out so many indys at warmup, but I haven’t 
got enough on my feet to want to talk about it yet.
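
For readers who have not laid out an indy call site by hand, here is a minimal 
sketch of the kind of method handle chain this thread is talking about: a 
guardWithTest-based inline cache on a MutableCallSite. It is not how Nashorn or 
JRuby actually link call sites. The names InlineCacheSketch, CachingSite, 
sameClass, describe and the relinks counter are all made up for illustration. 
Each cache miss wraps the previous target in one more guard, which gives a 
rough feel for why the number of handles (and the forms behind them) grows 
with the number of call sites and with cache depth.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class InlineCacheSketch {
    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    // Every call site in this sketch has the shape (Object)String.
    static final MethodType SITE_TYPE =
            MethodType.methodType(String.class, Object.class);
    static final MethodHandle SAME_CLASS;
    static final MethodHandle DESCRIBE;

    static {
        try {
            SAME_CLASS = LOOKUP.findStatic(InlineCacheSketch.class, "sameClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            DESCRIBE = LOOKUP.findStatic(InlineCacheSketch.class, "describe",
                    MethodType.methodType(String.class, String.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Guard: does the receiver have exactly the cached class?
    static boolean sameClass(Class<?> cached, Object receiver) {
        return receiver.getClass() == cached;
    }

    // Stand-in for the "real" method a dynamic language would dispatch to.
    static String describe(String typeName, Object receiver) {
        return typeName + ":" + receiver;
    }

    // Hypothetical call site: one guardWithTest layer per receiver class seen.
    static final class CachingSite extends MutableCallSite {
        int relinks = 0; // how many times the chain had to be rebuilt

        CachingSite() throws ReflectiveOperationException {
            super(SITE_TYPE);
            MethodHandle miss = LOOKUP.findVirtual(CachingSite.class, "miss",
                    MethodType.methodType(String.class, Object.class));
            setTarget(miss.bindTo(this)); // initially every call is a miss
        }

        // Slow path: build a guarded target and put it in front of the chain.
        String miss(Object receiver) throws Throwable {
            relinks++;
            Class<?> type = receiver.getClass();
            MethodHandle test = SAME_CLASS.bindTo(type);
            MethodHandle target = DESCRIBE.bindTo(type.getSimpleName());
            setTarget(MethodHandles.guardWithTest(test, target, getTarget()));
            return (String) target.invoke(receiver);
        }
    }

    // Calls through the site the way an invokedynamic instruction would.
    static String call(CachingSite site, Object arg) throws Throwable {
        return (String) site.dynamicInvoker().invoke(arg);
    }

    public static void main(String[] args) throws Throwable {
        CachingSite site = new CachingSite();
        System.out.println(call(site, "a")); // miss, links String
        System.out.println(call(site, "b")); // hit
        System.out.println(call(site, 1));   // miss, links Integer
        System.out.println(call(site, "c")); // hit, one guard deeper
        System.out.println("relinks = " + site.relinks);
    }
}
```

Even this toy site needs a bound guard, a bound target and a guardWithTest 
combinator for every receiver type it sees, and a real language runtime adds 
argument adaptation, invalidation and fallback logic on top of that.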

LambdaForm caching (JEP 210) will at least be a decent band aid for 8u40 (but 
even then, the problem of profile pollution when reusing a LambdaForm for two 
different indy callsites is not trivial and has to be solved. I’m not sure it 
has yet - I know Vladimir is working hard right now on this and is applying his 
excellent brain to the problem). However, I also don’t think that LambdaForm 
caching is enough for the long-term solution.

We had some discussions about how to implement indy call sites without LambdaForms 
after JVMLS in Santa Clara. Maybe John, Rickard or Vladimir can summarize some 
of the things we talked about, as I am just a code generation amateur in 
HotSpot and don’t want to embarrass myself in front of you guys. (Or I’ll post 
it later, when my head is clearer)

When it comes to putting resources on this, I can only say that I would love 
for this to happen and think it’s tremendously important for dynamic languages 
on the JVM.

Regards
Marcus

P.S. I agree with the tiered stuff too, but LambdaForms is the thing that 
really burns us in the warmup department right now. (and in the Metaspace 
department. Let’s not forget about that one).

P.P.S. Fredrik’s old post about how we did this in JRockit by inlining the indy 
callsites is worth a read again. The approach is, however, probably also 
subject to some profile pollution when you think about it. We never got far 
enough to really suffer from it, but one would expect it would crop up. No 
extra byte code though. No extra classes. No extra metaspace.  
(https://blogs.oracle.com/ohrstrom/entry/pulling_a_machine_code_rabbit)… Maybe 
Fredrik himself can tell us something here? I

On 22 Aug 2014, at 22:08, Charles Oliver Nutter <head...@headius.com> wrote:

> Marcus coaxed me into making a post about our indy issues. Our indy
> issues mostly surround startup and warmup time, so I'm making this a
> general post about startup and warmup.
> 
> When I started working on JRuby 7 years ago, I hoped we'd have a good
> answer for poor startup time and long warmup times. Today, the answers
> are no better -- and in many cases much worse -- than when I started.
> 
> Here's a summary of our experience over the years...
> 
> * client versus server
> 
> Early on, we made JRuby's launcher use client mode by default. This
> was by far the best way to get good startup performance, but it led to
> us perpetuating the old question "which mode are you running in" when
> people reported poor steady-state performance.
> 
> * Tiered compiler
> 
> The promise of the tiered compiler was great: client-fast startup with
> server-fast steady state. In practice, tiered has failed to meet
> expectations for us. The situation is aggravated by the loss of
> -client and -server flags.
> 
> On the startup side, we have found that the tiered compiler never even
> comes close to the startup time of -client. For a nontrivial app
> startup, like a Rails app, we see a 50% reduction in startup time by
> forcing tier 1 (which is C1, the old -client mode) rather than letting
> the tiered compiler work normally.
> 
> Obviously limiting ourselves to tier 1 means performance is reduced,
> but these days our #1 user complaint is startup time. So, we have AGAIN
> taken the step of putting startup-improving flags into our launchers:
> jruby --dev forces tier 1 + client mode.
> 
> On the steady-state side, the tiered compiler is rather unpredictable.
> Some cases will be faster (presumably from better profiling in earlier
> tiers), while others will be much slower. And it can vary from run to
> run...tiered steady-state performance is even harder to predict than
> C2 (-server). We have done no investigation here.
> 
> * Invokedynamic
> 
> We love indy. We love it more than just about anyone. But we have
> again had to make indy support OFF by default in JRuby 1.7.14 and may
> have to do the same for JRuby 9000.
> 
> Originally, we had indy off because of the NCDFE bugs in the old
> implementation. LambdaForms have fixed all that, and with JIT
> improvements in the past year they generally (eventually) reach the
> same steady-state performance.
> 
> Unfortunately, LambdaForms have an enormous startup-time cost. I
> believe there are two reasons for this:
> 
> 1. Method handle chains can now result in dozens of lambda forms,
> making the initial bootstrapping cost much higher. Multiply this by
> thousands of call sites, all getting hit for the first time. Multiply
> that by PIC depth. And then remember that many boot-time operations
> will blow out those caches, so you'll start over repeatedly. Some of
> this can be mitigated in JRuby, but much of it cannot.
> 
> 2. Lambda forms are too slow to execute and take too long to optimize
> down to native code. Lambda forms work sorta like the tiered compiler.
> They'll be interpreted for a while, then they'll become JVM bytecode
> for a while, which interprets for a while, then the tiered compiler's
> first phase will pick it up.... There's no way to "commit" a lambda
> form you know you're going to be hitting hard, so it takes FOREVER to
> get from a newly-bootstrapped call site to the 5 assembly instructions
> that *actually* need to run.
> 
> I do want to emphasize that for us, LambdaForms usually do get to the
> same peak performance we saw with the old implementation. It's just
> taking way, way too long to get there.
> 
> Because of these issues, JRuby's new --dev flag turns invokedynamic
> off, and JRuby 1.7.14 will once again turn indy off by default on all
> JVM versions.
> 
> * Other ways of mitigating startup time
> 
> We have recommended Nailgun in the past. Nailgun keeps a JVM running
> in the background, and you toss it commands to run. It works well as
> long as the commands are actually self-contained, self-cleaning units
> of work; spin up one thread or leave resources open, and the Nailgun
> server eventually becomes unusable.
> 
> We now recommend Drip as a similar solution. For each command you run,
> Drip attempts to start additional larval JVMs in the background in
> preparation for future commands. You can configure those instances to
> pre-boot libraries or application resources, to reduce the work done
> at startup for the next command (e.g. preboot your Rails application,
> and then the next command just has to utilize it). Drip is cleaner
> than Nailgun, but never quite achieves the same startup time without a
> lot of configuration. It is also a bit of a hack...you can easily
> preboot something in the "next JVM" that is out of date by the time
> you use it.
> 
> CONCLUSION...
> 
> We obviously still love working with OpenJDK, and it remains the best
> platform for building JRuby (and other languages). However, our
> failure as a community to address these startup/warmup issues is
> eventually going to kill us. Startup time remains the #1 complaint
> about JRuby, and warmup time may be a close second.
> 
> What are the rest of you doing to deal with these issues?
> 
> - Charlie
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev@openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
