On Mon, 17 May 2021 17:19:16 GMT, Jorn Vernee <[email protected]> wrote:
>> This patch adds a `tableSwitch` combinator that can be used to switch over a
>> set of method handles given an index, with a fallback in case the index is
>> out of bounds, much like the `tableswitch` bytecode. Here is a description
>> of how it works (copied from the javadoc):
>>
>> Creates a table switch method handle, which can be used to switch over
>> a set of target
>> method handles, based on a given target index, called selector.
>>
>> For a selector value of {@code n}, where {@code n} falls in the range
>> {@code [0, N)},
>> and where {@code N} is the number of target method handles, the table
>> switch method
>> handle will invoke the n-th target method handle from the list of
>> target method handles.
>>
>> For a selector value that does not fall in the range {@code [0, N)},
>> the table switch
>> method handle will invoke the given fallback method handle.
>>
>> All method handles passed to this method must have the same type, with
>> the additional
>> requirement that the leading parameter be of type {@code int}. The
>> leading parameter
>> represents the selector.
>>
>> Any trailing parameters present in the type will appear on the returned
>> table switch
>> method handle as well. Any arguments assigned to these parameters will
>> be forwarded,
>> together with the selector value, to the selected method handle when
>> invoking it.
>>
>> The combinator does not support specifying the starting index, so the switch
>> cases always run from 0 to however many target handles are specified. A
>> starting index can be added manually with another combination step that
>> filters the input index by adding or subtracting a constant from it, which
>> does not affect performance. One of the reasons for not supporting a
>> starting index is that it allows for more lambda form sharing, but also
>> simplifies the implementation somewhat. I guess an open question is if a
>> convenience overload should be added for that case?
>>
>> Lookup switch can also be simulated by filtering the input through an
>> injection function that translates it into a case index, which has also
>> proven to have the ability to have comparable performance to, or even better
>> performance than, a bytecode-native `lookupswitch` instruction. I plan to
>> add such an injection function to the runtime libraries in the future as
>> well. Maybe at that point it could be evaluated if it's worth it to add a
>> lookup switch combinator as well, but I don't see an immediate need to
>> include it in this patch.
>>
>> The current bytecode intrinsification generates a call for each switch case,
>> which guarantees full inlining of the target method handles. Alternatively
>> we could only have 1 callsite at the end of the switch, where each case just
>> loads the target method handle, but currently this does not allow for
>> inlining of the handles, since they are not constant.
>>
>> Maybe a future C2 optimization could look at the receiver input for
>> invokeBasic call sites, and if the input is a phi node, clone the call for
>> each constant input of the phi. I believe that would allow simplifying the
>> bytecode without giving up on inlining.
>>
>> Some numbers from the added benchmarks:
>>
>> Benchmark (numCases) (offset)
>> (sorted) Mode Cnt Score Error Units
>> MethodHandlesTableSwitchConstant.testSwitch 5 0
>> N/A avgt 30 4.186 � 0.054 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 5 150
>> N/A avgt 30 4.164 � 0.057 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 10 0
>> N/A avgt 30 4.124 � 0.023 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 10 150
>> N/A avgt 30 4.126 � 0.025 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 25 0
>> N/A avgt 30 4.137 � 0.042 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 25 150
>> N/A avgt 30 4.113 � 0.016 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 50 0
>> N/A avgt 30 4.118 � 0.028 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 50 150
>> N/A avgt 30 4.127 � 0.019 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 100 0
>> N/A avgt 30 4.116 � 0.013 ms/op
>> MethodHandlesTableSwitchConstant.testSwitch 100 150
>> N/A avgt 30 4.121 � 0.020 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 0
>> N/A avgt 30 4.113 � 0.009 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 150
>> N/A avgt 30 4.149 � 0.041 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 0
>> N/A avgt 30 4.121 � 0.026 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 150
>> N/A avgt 30 4.113 � 0.021 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 0
>> N/A avgt 30 4.129 � 0.028 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 150
>> N/A avgt 30 4.105 � 0.019 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 0
>> N/A avgt 30 4.097 � 0.021 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 150
>> N/A avgt 30 4.131 � 0.037 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 0
>> N/A avgt 30 4.135 � 0.025 ms/op
>> MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 150
>> N/A avgt 30 4.139 � 0.145 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 5 0
>> true avgt 30 4.894 � 0.028 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 5 0
>> false avgt 30 11.526 � 0.194 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 5 150
>> true avgt 30 4.882 � 0.025 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 5 150
>> false avgt 30 11.532 � 0.034 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 10 0
>> true avgt 30 5.065 � 0.076 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 10 0
>> false avgt 30 13.016 � 0.020 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 10 150
>> true avgt 30 5.103 � 0.051 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 10 150
>> false avgt 30 12.984 � 0.102 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 25 0
>> true avgt 30 8.441 � 0.165 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 25 0
>> false avgt 30 13.371 � 0.060 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 25 150
>> true avgt 30 8.628 � 0.032 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 25 150
>> false avgt 30 13.542 � 0.020 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 50 0
>> true avgt 30 4.701 � 0.015 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 50 0
>> false avgt 30 13.562 � 0.063 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 50 150
>> true avgt 30 7.991 � 3.111 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 50 150
>> false avgt 30 13.543 � 0.088 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 100 0
>> true avgt 30 4.712 � 0.020 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 100 0
>> false avgt 30 13.600 � 0.085 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 100 150
>> true avgt 30 4.676 � 0.011 ms/op
>> MethodHandlesTableSwitchRandom.testSwitch 100 150
>> false avgt 30 13.476 � 0.043 ms/op
>>
>>
>> Testing:
>> - [x] Running of included benchmarks
>> - [x] Inspecting inlining trace and verifying method handle targets are
>> inlined
>> - [x] Running TestTableSwitch test (currently the only user of the new code)
>> - [x] Running java/lang/invoke tests (just in case)
>> - [x] Some manual testing
>>
>> Thanks,
>> Jorn
>
> Jorn Vernee has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes the unrelated changes
> brought in by the merge/rebase. The pull request contains two additional
> commits since the last revision:
>
> - - Formatting
> - Reduce benchmark cases
> - Remove NamedFunction::intrinsicName
> - All changes prior to review
Thanks for working through my comments, especially cleaning up how intrinsic
data is handled which now looks much more like a natural fit!
-------------
Marked as reviewed by redestad (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3401