Re: [jvm-l] slow downs in invokedynamic native code

2016-03-19 Thread Benjamin Sieffert
Hi,

I did run into this issue with Nashorn as well. That was back in 2013, but
a certain Nick Houghton (cc, who will be glad to here there’s finally a bug
filed) contacted me about that in Dec'15, asking whether I had made any
progress, because he had discovered the same issue.
Here’s my original thread:
http://mail.openjdk.java.net/pipermail/mlvm-dev/2013-September/005489.html
Maybe Jochen or Remi will still remember.

Anyways, I think we did solve this issue by eliminating a pattern where we
would globally define objects in our script environment via calls to
NashornScriptEngine.put. I suspect that overriding the same global variable
repeatedly and hence, I suppose, invalidating corresponding call sites,
might’ve triggered the pathologic behaviour.

Just wanted to let you know, in case it help narrowing down the issue /
understanding the impact.

Kind Regards,
Benjamin

On 16 March 2016 at 12:41, Remi Forax <fo...@univ-mlv.fr> wrote:

> The symptoms are really like a deoptimization storm,
> setCallSiteTargetNormal goes to a safepoint (which is worst that only
> having the compiler/JIT lock because all threads are stopped),
> when either a code calls setTarget or a SwithPoint is invalidated.
>
> You have a deopt storm when the JIT compiles a code that contains a
> callsite that is always invalid, so the VM enters in loop like this,
> JIT compile a blob
> execute the blob
> deopt
> jump back in the interpreter
> rinse and repeat
>
> The root cause is a bug in the invalidation logic of the language runtime
> (not the VM) but it's hard to spot without a reproducible test case because
> when the JIT compiles a blob of codes there are several callsites inside
> that blob and usually only one is the faulty one.
>
> We already have discussed about that point several times,
> John is a proponent of marking the callsite has should never be optimized
> again,
> which at least stop the storm issue but it sweeps the real cause of the
> bug under carpet,
> I would prefer, consider these kind of bugs as a language runtime bugs
> that should be investigated by the runtime developers.
>
> Perhaps a middle ground is to mark the callsite as not compilable anymore
> *and* emit a warning (like when the code cache is full) to not hide the
> root cause of the bug.
>
> Rémi
>
> --
>
> *De: *"Hannes Wallnöfer" <hann...@gmail.com>
> *À: *jvm-langua...@googlegroups.com
> *Envoyé: *Mercredi 16 Mars 2016 11:52:42
> *Objet: *Re: [jvm-l] slow downs in invokedynamic native code
>
> I've filed a bug for this:
>
> https://bugs.openjdk.java.net/browse/JDK-8151981
>
> For the Nashorn report, the only thing we know is that it involves pretty
> large scripts that are being re-evaluated in new ScriptEngines, with 8
> engines at a time. So it seems quite possible that some implementation
> detail is stressed beyond the point where it performs efficiently.
>
> Hannes
>
>
> 2016-03-16 11:44 GMT+01:00 Duncan MacGregor <duncan.macgre...@gmail.com>:
>
>> I haven't seen this, but setCallSiteTargetNormal does have to get the
>> compiler lock, so contention can definitely cause problems. Is there a
>> chance you're repeatedly invalidating and setting targets? Or generating
>> lots of new mutable call sites?
>>
>> The other possibility is that the data structures that store the target
>> information aren't scaling, but j have seen a big problem there before, and
>> Magik on Java apps tend to be large, so I'd expect to have hit any problems
>> by now.
>>
>> Duncan.
>>
>> On 16 Mar 2016, at 10:28, Hannes Wallnöfer <hann...@gmail.com> wrote:
>>
>> Hi Jochen,
>>
>> we recently had a report on nashorn-dev that could be related. A user is
>> re-evaluating the same or similar code again and seeing more than 20x
>> slowdown compared to the fist evaluation.
>>
>> http://mail.openjdk.java.net/pipermail/nashorn-dev/2016-March/006024.html
>>
>> The thing is that he is using fresh ScriptEngines for the second
>> evaluation, so the Nashorn engines should not share anything. As with your
>> case, Jvisualvm shows that 80% of time is spent in
>> java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal().
>>
>> Hannes
>>
>>
>> 2016-03-15 10:28 GMT+01:00 Jochen Theodorou <blackd...@gmx.org>:
>>
>>> Hi,
>>>
>>>
>>> One of our users has a web application using Groovy with indy activated
>>> and describes the following problem:
>>>
>>> At random intervals and a random times our web servers will go from
>>>> serving responses in the 300 ms range to taking 30 

Re: [jvm-l] slow downs in invokedynamic native code

2016-03-18 Thread Remi Forax
Hi Benjamin, 

- Mail original -

> De: "Benjamin Sieffert" <benjamin.sieff...@metrigo.de>
> À: "Da Vinci Machine Project" <mlvm-dev@openjdk.java.net>
> Cc: "Nick Houghton" <nhoug...@gmail.com>, jvm-langua...@googlegroups.com
> Envoyé: Mercredi 16 Mars 2016 13:59:58
> Objet: Re: [jvm-l] slow downs in invokedynamic native code

> Hi,

> I did run into this issue with Nashorn as well. That was back in 2013, but a
> certain Nick Houghton (cc, who will be glad to here there’s finally a bug
> filed) contacted me about that in Dec'15, asking whether I had made any
> progress, because he had discovered the same issue.
> Here’s my original thread:
> http://mail.openjdk.java.net/pipermail/mlvm-dev/2013-September/005489.html
> Maybe Jochen or Remi will still remember.

I don't remember ... i'm like a goldfish ... but the internet saves me (thanks 
for the link) 

> Anyways, I think we did solve this issue by eliminating a pattern where we
> would globally define objects in our script environment via calls to
> NashornScriptEngine.put. I suspect that overriding the same global variable
> repeatedly and hence, I suppose, invalidating corresponding call sites,
> might’ve triggered the pathologic behaviour.

> Just wanted to let you know, in case it help narrowing down the issue /
> understanding the impact.

It's a perfect example of a Nashorn bug (not a VM bug), any global (maybe 
constant?) should have a counter counting the number of time the global is 
changed, and if the counter is above a threshold, the runtime should stop 
trying to see it as constant. 

Perhaps we should have a section about deopt storm in the documentation saying 
that the code of the runtime should always have a backup strategy in case the 
method handle chain is constant enough for the JIT but not constant if you take 
a look to the whole program. 

> Kind Regards,
> Benjamin

regards, 
Rémi 

> On 16 March 2016 at 12:41, Remi Forax < fo...@univ-mlv.fr > wrote:

> > The symptoms are really like a deoptimization storm,
> 
> > setCallSiteTargetNormal goes to a safepoint (which is worst that only
> > having
> > the compiler/JIT lock because all threads are stopped),
> 
> > when either a code calls setTarget or a SwithPoint is invalidated.
> 

> > You have a deopt storm when the JIT compiles a code that contains a
> > callsite
> > that is always invalid, so the VM enters in loop like this,
> 
> > JIT compile a blob
> 
> > execute the blob
> 
> > deopt
> 
> > jump back in the interpreter
> 
> > rinse and repeat
> 

> > The root cause is a bug in the invalidation logic of the language runtime
> > (not the VM) but it's hard to spot without a reproducible test case because
> 
> > when the JIT compiles a blob of codes there are several callsites inside
> > that
> > blob and usually only one is the faulty one.
> 

> > We already have discussed about that point several times,
> 
> > John is a proponent of marking the callsite has should never be optimized
> > again,
> 
> > which at least stop the storm issue but it sweeps the real cause of the bug
> > under carpet,
> 
> > I would prefer, consider these kind of bugs as a language runtime bugs that
> > should be investigated by the runtime developers.
> 

> > Perhaps a middle ground is to mark the callsite as not compilable anymore
> > *and* emit a warning (like when the code cache is full) to not hide the
> > root
> > cause of the bug.
> 

> > Rémi
> 

> > > De: "Hannes Wallnöfer" < hann...@gmail.com >
> > 
> 
> > > À: jvm-langua...@googlegroups.com
> > 
> 
> > > Envoyé: Mercredi 16 Mars 2016 11:52:42
> > 
> 
> > > Objet: Re: [jvm-l] slow downs in invokedynamic native code
> > 
> 

> > > I've filed a bug for this:
> > 
> 

> > > https://bugs.openjdk.java.net/browse/JDK-8151981
> > 
> 

> > > For the Nashorn report, the only thing we know is that it involves pretty
> > > large scripts that are being re-evaluated in new ScriptEngines, with 8
> > > engines at a time. So it seems quite possible that some implementation
> > > detail is stressed beyond the point where it performs efficiently.
> > 
> 

> > > Hannes
> > 
> 

> > > 2016-03-16 11:44 GMT+01:00 Duncan MacGregor < duncan.macgre...@gmail.com
> > > >
> > > :
> > 
> 

> > > > I haven't seen this, but setCallSiteTargetNormal does have to get the
> > > > compiler lock, so contention can definitely cause problems. Is there a
> > > >