Hi,

I've hit a very annoying performance problem with invoke dynamic/method handles that makes certain benchmarks about 3 times slower for identical operations. This code is related to variable lookup. The basic idea is that I have a LexicalScope class which contains a parent pointer. It has a LexicalScope.One subclass that extends LexicalScope, a LexicalScope.Two that extends LexicalScope.One, etc., and each of them adds a field that holds the variable at that index.
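To make the layout concrete, the hierarchy described above could look roughly like the following. This is only a sketch under assumptions: the field names (parent, value0, value1, value2) and the constructors are invented for illustration, and SephObject is reduced to an empty placeholder. Only the class names and getValueThree come from the description in this mail.

```java
// Placeholder for the real SephObject type (assumption: contents irrelevant here).
class SephObject {}

// Base scope: holds only the pointer to the enclosing scope.
class LexicalScope {
    final LexicalScope parent;   // hypothetical field name

    LexicalScope(LexicalScope parent) { this.parent = parent; }

    // Each subclass adds exactly one slot, so Three has slots 0, 1 and 2.
    static class One extends LexicalScope {
        SephObject value0;       // hypothetical field name
        One(LexicalScope parent) { super(parent); }
    }

    static class Two extends One {
        SephObject value1;       // hypothetical field name
        Two(LexicalScope parent) { super(parent); }
    }

    static class Three extends Two {
        SephObject value2;       // hypothetical field name
        Three(LexicalScope parent) { super(parent); }

        // Final getter for index 2, the kind of method a getter handle points at.
        final SephObject getValueThree() { return value2; }
    }
}
```

With this shape, a variable at index 2 can be read from any scope of size three or larger after a cast to LexicalScope.Three.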
At compile time, I know what lexical depth and index a variable maps to. The original code generates straight bytecode for this. With that, my benchmarks range from 2.1s to 4.1s, depending on the depth and breadth of the lexical scope. The bytecode just does this:

- get the current scope
- get the parent of the scope (by repeatedly getting the parent field)
- cast to the specific scope size we are interested in
- get the field for the index we are interested in
- do regular return/invocation on this value (this is the same process as the other call paths, so should be fine)

However, when I try to do the same thing with MethodHandles, the best I can get is 8.1s to 15s, which is pretty terrible. (It was even worse before I stopped using method handles directly to fields; method handles to a getter method gave me 10%.) The actual method handle creation looks a bit like this:

    MethodHandle current = identity(LexicalScope.class);
    int currentDepth = lexicalDepth;
    while(currentDepth-- > 0) {
        current = filterArguments(current, 0, PARENT_SCOPE_METHOD);
    }

    MethodHandle valueMH = null;
    switch(lexicalIndex) {
    case 0:
        valueMH = filterArguments(SCOPE_0_GETTER_M, 0, current);
        break;
    case 1:
        valueMH = filterArguments(SCOPE_1_GETTER_M, 0, current);
        break;
    case 2:
        valueMH = filterArguments(SCOPE_2_GETTER_M, 0, current);
        break;
    case 3:
        valueMH = filterArguments(SCOPE_3_GETTER_M, 0, current);
        break;
    case 4:
        valueMH = filterArguments(SCOPE_4_GETTER_M, 0, current);
        break;
    case 5:
        valueMH = filterArguments(SCOPE_5_GETTER_M, 0, current);
        break;
    default:
        valueMH = filterArguments(insertArguments(SCOPE_N_GETTER_M, 0, lexicalIndex-6), 0, current);
        break;
    }

The rest just applies the same method handles for invocation/return as the rest of the call site is using. SCOPE_2_GETTER_M is defined as

    findVirtual(LexicalScope.Three.class, "getValueThree", methodType(SephObject.class)).asType(SCOPE_GETTER_M_TYPE)

where getValueThree is just a final getter method.

I tried switching out asType to explicitCastArguments.
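For anyone who wants to try the comparison outside the real code base, the two paths can be boiled down to something like the following. This is a cut-down sketch under assumptions, not the actual code: Scope and Two stand in for LexicalScope and LexicalScope.Two, and the names value1, getValueTwo, direct, viaHandles and lookup are all invented here. Only the java.lang.invoke calls (identity, filterArguments, findVirtual, asType) are the real API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class ScopeLookupSketch {
    // Hypothetical stand-ins for LexicalScope and LexicalScope.Two.
    static class Scope {
        final Scope parent;
        Scope(Scope parent) { this.parent = parent; }
        Scope getParent() { return parent; }
    }

    static class Two extends Scope {
        Object value0, value1;
        Two(Scope parent) { super(parent); }
        final Object getValueTwo() { return value1; }
    }

    // Roughly what the straight bytecode does for depth 2, index 1.
    static Object direct(Scope current) {
        return ((Two) current.parent.parent).value1;
    }

    // The same lookup built as a method handle of type (Scope)Object.
    static MethodHandle viaHandles(int depth) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle parent = lookup.findVirtual(
                Scope.class, "getParent", MethodType.methodType(Scope.class));
            // asType turns (Two)Object into (Scope)Object by inserting a cast.
            MethodHandle getter = lookup.findVirtual(
                Two.class, "getValueTwo", MethodType.methodType(Object.class))
                .asType(MethodType.methodType(Object.class, Scope.class));

            // Walk `depth` parent links, starting from the identity function.
            MethodHandle walk = MethodHandles.identity(Scope.class);
            for (int i = 0; i < depth; i++) {
                walk = MethodHandles.filterArguments(walk, 0, parent);
            }
            return MethodHandles.filterArguments(getter, 0, walk);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invokes the handle, wrapping the checked Throwable from invoke().
    static Object lookup(MethodHandle handle, Scope current) {
        try {
            return handle.invoke(current);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Two outer = new Two(null);
        outer.value1 = "hello";
        Scope current = new Scope(new Scope(outer));
        System.out.println(direct(current));
        System.out.println(lookup(viaHandles(2), current));
    }
}
```

Both paths return the same value; the sketch says nothing about timing, which would need a proper benchmark harness and a warmed-up JIT to measure.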
The explicitCastArguments version ended up being about 5% slower. I also tried removing the asType by defining all the methods on LexicalScope and overriding them (in practice the base methods would never be called). This didn't give any performance change at all.

So now I'm a bit lost. I have no idea why this is so much slower than the explicit bytecode. Any thoughts? My next line of attack will be to compare the generated assembly.

Cheers
-- 
 Ola Bini (http://olabini.com)
 Ioke - JRuby - ThoughtWorks

 "Yields falsehood when quined" yields falsehood when quined.