Re: A certain (type of?) callsite seems to always require relinking

Remi Forax Wed, 19 Nov 2014 08:27:01 -0800

Hi Attila,
I've had the very same issue with a runtime of a proprietary language.

What I've proposed in the JSR 292 cookbook for this case is to use a dispatch table:

https://code.google.com/p/jsr292-cookbook/source/browse/trunk/bimorphic-cache/src/jsr292/cookbook/bicache/RT.java
https://code.google.com/p/jsr292-cookbook/source/browse/trunk/bimorphic-cache/src/jsr292/cookbook/bicache/DispatchMap.java

It's a hash table which associates a Class with a method handle,

so the code avoids the linear check that you have with a tree of GWTs and can share method handle trees between subclasses.

Obviously, there is no inlining from the callsite to a method handle chain
(the JIT can not see through the hash table).
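
A minimal sketch of the dispatch table idea, for illustration only. It is not the cookbook's DispatchMap: the class names (DispatchTableSketch, A, B1, B2) and the resolve() helper are hypothetical, and a plain ConcurrentHashMap stands in for a specialised hash table. It shows one method handle being shared by several receiver classes, with a single hash lookup in place of a chain of guards.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DispatchTableSketch {
    public static class A {
        public String foo(Object o) { return "A.foo(" + o + ")"; }
    }
    public static class B1 extends A { }
    public static class B2 extends A { }

    // Receiver class -> method handle to invoke for that class.
    private static final Map<Class<?>, MethodHandle> TABLE = new ConcurrentHashMap<>();
    private static final MethodHandle A_FOO;
    static {
        try {
            A_FOO = MethodHandles.lookup().findVirtual(
                    A.class, "foo", MethodType.methodType(String.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Hypothetical slow path: a real runtime would do method selection here.
    // In this sketch every subclass of A resolves to the same shared handle.
    static MethodHandle resolve(Class<?> receiverClass) {
        return A_FOO;
    }

    static MethodHandle handleFor(Class<?> receiverClass) {
        return TABLE.computeIfAbsent(receiverClass, DispatchTableSketch::resolve);
    }

    static int tableSize() { return TABLE.size(); }

    // One hash lookup per call, however many receiver classes were seen.
    public static String call(Object receiver, Object arg) {
        try {
            return (String) handleFor(receiver.getClass()).invoke(receiver, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(new B1(), "Hello")); // A.foo(Hello)
        System.out.println(call(new B2(), "Hello")); // A.foo(Hello)
        System.out.println(handleFor(B1.class) == handleFor(B2.class)); // true
    }
}
```

The trade-off is the one noted above: the map lookup is opaque to the JIT, so the target is not inlined into the call site.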

I also think Groovy uses something similar, because Jochen found a bug in the dispatch table code last year.


cheers,
Rémi

On 11/19/2014 03:41 PM, Attila Szegedi wrote:

Hi Benjamin,

I've been thinking about this, and I believe I know what the issue might be. 
Unfortunately, I don't currently have a good solution for it (although I'll be 
thinking some more about it).

Basically, call sites are linked with method handles that are guarded with a 
test for exact receiver type (basically obj.getClass() == X.class). Call sites 
further can have up to 8 methods linked into them (in a LRU fashion) in a 
waterfall cascade of guard-with-tests. If your call site sees more than 8 
receiver types (this number is fixed right now), it'll keep relinking as it'll 
only remember the most recent 8.
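
The linking scheme described above could be sketched roughly as follows. This is not Dynalink's actual code: GuardCascadeSketch, its relink() fallback and the maxChain parameter are hypothetical names, and eviction is simplified to dropping the entry that was linked longest ago. The demo uses a limit of 2 and three receiver classes so that the constant relinking is visible without declaring nine classes.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

public class GuardCascadeSketch {
    public static class A { public String foo(Object o) { return "A.foo(" + o + ")"; } }
    public static class B1 extends A { }
    public static class B2 extends A { }
    public static class B3 extends A { }

    private static final MethodType SITE_TYPE =
            MethodType.methodType(String.class, Object.class, Object.class);
    private static final MethodHandle IS_EXACT, A_FOO, RELINK;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            IS_EXACT = l.findStatic(GuardCascadeSketch.class, "isExactClass",
                    MethodType.methodType(boolean.class, Class.class, Object.class));
            A_FOO = l.findVirtual(A.class, "foo",
                    MethodType.methodType(String.class, Object.class)).asType(SITE_TYPE);
            RELINK = l.findVirtual(GuardCascadeSketch.class, "relink", SITE_TYPE);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    private final int maxChain;
    private final Deque<Class<?>> linked = new ArrayDeque<>(); // most recently linked first
    private final MutableCallSite site = new MutableCallSite(SITE_TYPE);
    private final MethodHandle invoker = site.dynamicInvoker();
    private int relinks;

    public GuardCascadeSketch(int maxChain) {
        this.maxChain = maxChain;
        site.setTarget(RELINK.bindTo(this)); // unlinked: every call is a miss
    }

    // The guard: receiver.getClass() == X.class
    static boolean isExactClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    // Fallback on a miss: remember the new class, evict the oldest, rebuild the cascade.
    String relink(Object receiver, Object arg) throws Throwable {
        relinks++;
        linked.addFirst(receiver.getClass());
        if (linked.size() > maxChain) linked.removeLast();
        MethodHandle chain = RELINK.bindTo(this);
        Iterator<Class<?>> it = linked.descendingIterator();
        while (it.hasNext()) { // wrap oldest first, so the newest guard is tested first
            chain = MethodHandles.guardWithTest(IS_EXACT.bindTo(it.next()), A_FOO, chain);
        }
        site.setTarget(chain);
        return (String) A_FOO.invoke(receiver, arg);
    }

    public String call(Object receiver, Object arg) {
        try {
            return (String) invoker.invoke(receiver, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public int relinks() { return relinks; }

    public static void main(String[] args) {
        Object[] receivers = { new B1(), new B2(), new B3() };
        GuardCascadeSketch tooSmall = new GuardCascadeSketch(2);
        GuardCascadeSketch bigEnough = new GuardCascadeSketch(3);
        for (int i = 0; i < 6; i++) {
            tooSmall.call(receivers[i % 3], "Hello");
            bigEnough.call(receivers[i % 3], "Hello");
        }
        System.out.println(tooSmall.relinks());  // 6: every call misses
        System.out.println(bigEnough.relinks()); // 3: one miss per receiver class
    }
}
```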

Even if you don't override the method in subclasses, we can't use a more 
generic guard because we can't prove that there won't ever be a new subclass 
that overloads the method. Note I said overload, not override: that's not 
a mistake. Here's a scenario:

public class A {
     public void foo(Object o) { ... }
}

public class B extends A {
}

Now imagine a script call site "a.foo('Hello')". When it's hit with an instance of B, we'll use 
"a.getClass() == B.class" as the guard. Now, if you have a bunch of subclasses B1…B12 all extending 
A, you'll end up with 12 linkages to the same method, but all guarded with a different "a.getClass() == 
Bn.class" guard. Actually, as I said above, you'll end up with a call site incorporating the 8 most 
recently used ones, and force relinking when the 9th comes along.

You could ask "what'd be the harm in linking to the 'A.foo(Object)' method just once with 
an 'a instanceof A' guard?" The harm becomes apparent if we now define

public class C extends A {
     public void foo(String s) { ... }
}

With instanceof linkage, invocation at the call site with an instance of C would pass the 
guard, and invoke A.foo(Object), which is incorrect as it'd be expected to invoke 
C.foo(String) instead. As you can see, this is not a matter of a subclass overriding 
foo(Object), but rather it's a matter of the subclass overloading the "foo" 
name with a new signature.
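
The scenario can be reproduced in plain Java. In the sketch below, select() is a hypothetical and much simplified stand-in for a dynamic linker's overload selection (single-argument methods named "foo" only), and linkedWithInstanceofGuard() mimics a call site that was linked once to A.foo(Object) behind an "instanceof A" guard.

```java
import java.lang.reflect.Method;

public class OverloadScenario {
    public static class A { public String foo(Object o) { return "A.foo(Object)"; } }
    public static class B extends A { }
    public static class C extends A { public String foo(String s) { return "C.foo(String)"; } }

    // Picks the most specific applicable one-argument "foo" for the runtime
    // receiver class and argument. Returns null if nothing is applicable.
    static Method select(Class<?> receiverClass, Object arg) {
        Method best = null;
        for (Method m : receiverClass.getMethods()) {
            if (!m.getName().equals("foo") || m.getParameterCount() != 1) continue;
            Class<?> param = m.getParameterTypes()[0];
            if (!param.isInstance(arg)) continue;
            if (best == null || best.getParameterTypes()[0].isAssignableFrom(param)) {
                best = m; // param is at least as specific as the current best
            }
        }
        return best;
    }

    // What a script call "a.foo('Hello')" is expected to do.
    public static String dynamicCall(Object receiver, Object arg) {
        try {
            return (String) select(receiver.getClass(), arg).invoke(receiver, arg);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // What a call site linked to A.foo(Object) with an instanceof guard would do.
    public static String linkedWithInstanceofGuard(Object receiver, Object arg) {
        if (receiver instanceof A) {
            return ((A) receiver).foo(arg); // statically bound to foo(Object)
        }
        throw new IllegalStateException("guard failed, relink needed");
    }

    public static void main(String[] args) {
        System.out.println(dynamicCall(new B(), "Hello"));               // A.foo(Object)
        System.out.println(dynamicCall(new C(), "Hello"));               // C.foo(String)
        System.out.println(linkedWithInstanceofGuard(new C(), "Hello")); // A.foo(Object), the wrong method
    }
}
```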

The only strategy we have for avoiding this at the moment is almost always 
linking with exact receiver class guards :-(

On a sidenote, I said "almost always" above as there's a special class of methods we can, in fact, 
link with "instanceof" guards on the most generic declaring superclass: methods taking zero 
arguments (e.g. all property getters). Since overload choice is actually per-arity, zero-argument methods 
can't effectively be overloaded, so for them we actually use "instanceof" guards. But, sadly, we 
can't use them for any other methods.

One strategy to cope with the issue would be to check during linking if none of 
the currently known subclasses add new overloads to the method (or even, not 
overload it in a manner where a different method would be chosen for the static 
type at the call site), and if they don't, then link with a switch point 
representing this invariant. Then, whenever a subclass is loaded into the VM 
that invalidates the assumption, invalidate the switch points. Unfortunately, 
this strategy requires whole-VM knowledge of loaded classes, and we could only 
do it if we added a java.lang.instrument agent as a mandatory component in 
Nashorn.
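
A rough sketch of what the switch point strategy might look like, using java.lang.invoke.SwitchPoint. The class name SwitchPointSketch, the slowPath() method and the overloadingSubclassLoaded() hook are hypothetical; in particular, nothing here detects class loading, which is exactly the part that would need the instrumentation agent.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class SwitchPointSketch {
    public static class A { public String foo(Object o) { return "A.foo(Object)"; } }
    public static class B extends A { }

    private static final MethodType SITE_TYPE =
            MethodType.methodType(String.class, Object.class, Object.class);
    private static final MethodHandle IS_A, A_FOO, SLOW_PATH;
    static {
        try {
            MethodHandles.Lookup l = MethodHandles.lookup();
            IS_A = l.findVirtual(Class.class, "isInstance",
                    MethodType.methodType(boolean.class, Object.class)).bindTo(A.class);
            A_FOO = l.findVirtual(A.class, "foo",
                    MethodType.methodType(String.class, Object.class)).asType(SITE_TYPE);
            SLOW_PATH = l.findStatic(SwitchPointSketch.class, "slowPath", SITE_TYPE);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Stand-in for relinking with exact receiver class guards.
    static String slowPath(Object receiver, Object arg) {
        return "slow path";
    }

    // Represents the invariant "no loaded subclass of A overloads foo".
    private final SwitchPoint noOverloadsOfFoo = new SwitchPoint();

    // Generic "instanceof A" guard, valid only while the invariant holds.
    private final MethodHandle target = noOverloadsOfFoo.guardWithTest(
            MethodHandles.guardWithTest(IS_A, A_FOO, SLOW_PATH), SLOW_PATH);

    // Hypothetical hook: would be called when an overloading subclass is loaded.
    public void overloadingSubclassLoaded() {
        SwitchPoint.invalidateAll(new SwitchPoint[] { noOverloadsOfFoo });
    }

    public String call(Object receiver, Object arg) {
        try {
            return (String) target.invoke(receiver, arg);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPointSketch site = new SwitchPointSketch();
        System.out.println(site.call(new B(), "Hello")); // A.foo(Object)
        site.overloadingSubclassLoaded();
        System.out.println(site.call(new B(), "Hello")); // slow path
    }
}
```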

(Another sidenote: we're trying very hard to keep Nashorn from relying on any 
implementation-specific or undocumented platform or VM features; so far we have 
always managed to rely solely on public Java APIs also because we'd like to 
prove that they're sufficient for a dynamic language implementation on the JVM.)

Alternatively, we could also try to prove a weaker assumption that the chosen 
method would always be the one invoked at the call site (e.g. the method type 
of the call site guarantees that there can't ever be a more specific method to 
invoke), but in reality this'd be quite hard and since Nashorn internally 
mostly only uses boolean, int, long, double, and Object as the call site 
signatures, it probably also wouldn't be effective (e.g. we could nearly never 
prove this invariant). So that's probably not worth it.

As a yet another solution, we might give you a system property or other 
configuration means of allowing link chains longer than 8 (in your case, if you 
have 12 subclasses, then 12 should be enough).

Sorry for not having a better answer…
   Attila.

On Nov 19, 2014, at 10:40 AM, Benjamin Sieffert wrote:

Hello everyone,

it started with a peculiar observation about our nashorn-utilising
application, that I made: It continues to load around a hundred new
anonymous classes *per second*, even without new scripts being introduced –
i.e. we are just running the same javascripts over and over again, with
different arguments.
So I ran the application with -tcs=miss and from what I see, eventually
there will be only a single call left that is producing all the output and
therefore, I believe, all the memory load. (Am I correct in this assumption?)

What I can say about the call is the following:

- return type is an array of differing length (but always of the same type)
- there are two arguments, of which the first one will always exactly match
the declaration, the second one is a subclass of the one used in the
declaration – but always the same subclass
- method is implemented in an abstract class
- receiver is one of about a dozen classes that inherit from this abstract
class
- none of the receivers override the original implementation or overload
the method

When I look into the trace output, there's often a bunch of
"TAG MISS library:212 dyn:getMethod|getProp|getElem:<methodname> …"
in a row, then a whole lot of
"TAG MISS library:212
dyn:call([jdk.internal.dynalink.beans.SimpleDynamicMethod …"
with a bit of the first one in between.

Is this a known issue? Is there something I can do to alleviate the
problem? As it is, I might just end up implementing the whole chunk in Java
and be done with it, but I thought this might be worthy of some discussion.
If there's some important information that I have left out, I'll be glad to
follow up with it.

Regards,
Benjamin

--
Benjamin Sieffert
metrigo GmbH
Sternstr. 106
20357 Hamburg

Geschäftsführer: Christian Müller, Tobias Schlottke, Philipp Westermeyer,
Martin Rieß
Die Gesellschaft ist eingetragen beim Registergericht Hamburg
Nr. HRB 120447.
