On Mon, 5 Apr 2021 11:57:08 GMT, Jorn Vernee <[email protected]> wrote:
>> This patch speeds up MethodHandle.asCollector handles where the array type
>> is not Object[], as well as speeding up all collectors where the arity is
>> greater than 10.
>>
>> The old code is creating a collector handle by combining a set of hard coded
>> methods for collecting arguments into an Object[], up to a maximum of ten
>> elements, and then copying this intermediate array into a final array.
>>
>> In principle, it shouldn't matter how slow (or fast) this handle is, because
>> it should be replaced by existing bytecode intrinsification which does the
>> right thing. But, through investigation it turns out that the
>> intrinsification is only being applied in a very limited amount of cases:
>> Object[] with max 10 elements only, only for the intermediate collector
>> handles. Every other collector shape incurs overhead because it essentially
>> ends up calling the ineffecient fallback handle.
>>
>> Rather than sticking on a band aid (I tried this, but it turned out to be
>> very tricky to untangle the existing code), the new code replaces the
>> existing implementation with a collector handle implemented using a
>> LambdaForm, which removes the need for intrinsification, and also greatly
>> reduces code-complexity of the implementation. (I plan to expose this
>> collector using a public API in the future as well, so users don't have to
>> go through MHs::identity to make a collector).
>>
>> The old code was also using a special lambda form transform for collecting
>> arguments into an array. I believe this was done to take advantage of the
>> existing-but-limited bytecode intrinsification, at least for Object[] with
>> less than 10 elements.
>>
>> The new code just uses the existing collect arguments transform with the
>> newly added collector handle as filter, and this works just as well for the
>> existing case, but as a bonus is also much simpler, since no separate
>> transform is needed. Using the collect arguments transform should also
>> improve sharing.
>>
>> As a result of these changes a lot of code was unused and has been removed
>> in this patch.
>>
>> Testing: tier 1-3, benchmarking using TypedAsCollector (part of the patch),
>> as well as another variant of the benchmark that used a declared static
>> identity method instead of MHs::identity (not included). Before/after
>> comparison of MethodHandleAs* benchmarks (no differences there).
>>
>> Here are some numbers from the added benchmark:
>>
>> Before:
>> Benchmark Mode Cnt Score Error
>> Units
>> TypedAsCollector.testIntCollect avgt 30 189.156 � 1.796
>> ns/op
>> TypedAsCollector.testIntCollectHighArity avgt 30 660.549 � 10.159
>> ns/op
>> TypedAsCollector.testObjectCollect avgt 30 7.092 � 0.042
>> ns/op
>> TypedAsCollector.testObjectCollectHighArity avgt 30 65.225 � 0.546
>> ns/op
>> TypedAsCollector.testStringCollect avgt 30 28.511 � 0.243
>> ns/op
>> TypedAsCollector.testStringCollectHighArity avgt 30 57.054 � 0.635
>> ns/op
>> (as you can see, just the Object[] with arity less than 10 case is fast here)
>> After:
>> Benchmark Mode Cnt Score Error Units
>> TypedAsCollector.testIntCollect avgt 30 6.569 � 0.131 ns/op
>> TypedAsCollector.testIntCollectHighArity avgt 30 8.923 � 0.066 ns/op
>> TypedAsCollector.testObjectCollect avgt 30 6.813 � 0.035 ns/op
>> TypedAsCollector.testObjectCollectHighArity avgt 30 9.718 � 0.066 ns/op
>> TypedAsCollector.testStringCollect avgt 30 6.737 � 0.016 ns/op
>> TypedAsCollector.testStringCollectHighArity avgt 30 9.618 � 0.052 ns/op
>>
>> Thanks,
>> Jorn
>
> Jorn Vernee has updated the pull request incrementally with one additional
> commit since the last revision:
>
> - Revert back to injected constructor handle
> - Add lambda form sharing
> - Add test case for collecting a custom class
src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 1909:
> 1907: boolean isSharedLambdaForm = parameterCount == 0 ||
> basicType.parameterType(0) == Object.class;
> 1908: if (isSharedLambdaForm) {
> 1909: LambdaForm lform =
> basicType.form().cachedLambdaForm(MethodTypeForm.LF_COLLECTOR);
How does sharing work w.r.t. array store check? Unless you put `storeFunc` on
the BMH, I don't see how sharing preserves proper check.
-------------
PR: https://git.openjdk.java.net/jdk/pull/3306