Nice work, Claes! Looks good.

Best regards,
Vladimir Ivanov

On 07/11/2018 06:57, Claes Redestad wrote:
Hi,

by providing a LambdaFormEditor transform which applies the same filter repeatedly on some selected arguments, we can optimize for cases where there are repeated conversions. This allows the runtime to optimize the internal setup of public API methods such as MethodHandles.filterArguments(MethodHandle, int, MethodHandle...) and MethodHandle.asType(MethodType)

Webrev: http://cr.openjdk.java.net/~redestad/8213478/jdk.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8213478

Example: mh.asType(mt1) where the mh.type is (JJJJ)I and mt1 is (IIII)I would need to add I->J widening conversions to each argument. Each application of such a conversion C creates a new BMH with a new species, first (JIII)I, with C bound in as a 5th (or 6th, counting the MH) argument, then from that we create (JJII)I with C bound in again as the 6th argument.. and so on.

With the optimization in place we can go from (IIII)I to (JJJJ)I in one step: bind in C as the 5th argument, apply expressions to the LF to call it on each parameter in order. In this example we then end up with only one additional BMH species rather than four.

In synthetic examples invoking methods with maximal number of arguments where all arguments need the same filter to be applied, the number of intermediary species and LF classes generated can be reduced from up to ~290 to 6 (98% reduction)[1], while in less contrived examples the reduction is more modest, down to no difference in number of shapes generated in completely randomized tests. Repeated filters and conversions are common in practice, for example in the ISC implementation, where this patch modestly reduce the number of shapes and LFs generated in a range of tests - from substantial reductions down to around 2% for mostly random argument sequences[2]. Without a formal proof I posit that the upper bound of generated species and LFs is the same with both implementations, just that this reduces the need to generate certain intermediaries.

The proposed patch could surely be micro-optimized, e.g., there are cases where I'm copying arrays and creating TreeMaps around. I do this to simplify insertion of arguments into the right place and ensure execution happens in the expected order etc. Some of that could perhaps be specialized, but these overheads are tiny compared to the class generation overheads we're avoiding.

Thanks!

/Claes

[1] http://cr.openjdk.java.net/~redestad/8213478/BadInvoke.java

Before:
time java -Xlog:class+load BadInvoke | wc -l
819

real    0m0.753s
user    0m4.120s
sys    0m0.164s

After:
time java -Xlog:class+load BadInvoke | wc -l
540

real    0m0.267s
user    0m0.580s
sys    0m0.044s

[2] Generated ISC stress tests such as: http://cr.openjdk.java.net/~redestad/8213478/MixedStringCombinations.java - which loads and creates 37936 classes before, 37114 after, or -2.2%

Reply via email to