On Thu, 8 Apr 2021 18:51:21 GMT, Jorn Vernee <[email protected]> wrote:
> This patch adds a `tableSwitch` combinator that can be used to switch over a
> set of method handles given an index, with a fallback in case the index is
> out of bounds, much like the `tableswitch` bytecode. Here is a description of
> how it works (copied from the javadoc):
>
> Creates a table switch method handle, which can be used to switch over a
> set of target
> method handles, based on a given target index, called selector.
>
> For a selector value of {@code n}, where {@code n} falls in the range
> {@code [0, N)},
> and where {@code N} is the number of target method handles, the table
> switch method
> handle will invoke the n-th target method handle from the list of target
> method handles.
>
> For a selector value that does not fall in the range {@code [0, N)}, the
> table switch
> method handle will invoke the given fallback method handle.
>
> All method handles passed to this method must have the same type, with
> the additional
> requirement that the leading parameter be of type {@code int}. The
> leading parameter
> represents the selector.
>
> Any trailing parameters present in the type will appear on the returned
> table switch
> method handle as well. Any arguments assigned to these parameters will
> be forwarded,
> together with the selector value, to the selected method handle when
> invoking it.
>
> The combinator does not support specifying the starting index, so the switch
> cases always run from 0 to however many target handles are specified. A
> starting index can be added manually with another combination step that
> filters the input index by adding or subtracting a constant from it, which
> does not affect performance. One of the reasons for not supporting a starting
> index is that it allows for more lambda form sharing, but also simplifies the
> implementation somewhat. I guess an open question is if a convenience
> overload should be added for that case?
>
> Lookup switch can also be simulated by filtering the input through an
> injection function that translates it into a case index, which has also
> proven to have the ability to have comparable performance to, or even better
> performance than, a bytecode-native `lookupswitch` instruction. I plan to add
> such an injection function to the runtime libraries in the future as well.
> Maybe at that point it could be evaluated if it's worth it to add a lookup
> switch combinator as well, but I don't see an immediate need to include it in
> this patch.
>
> The current bytecode intrinsification generates a call for each switch case,
> which guarantees full inlining of the target method handles. Alternatively we
> could only have 1 callsite at the end of the switch, where each case just
> loads the target method handle, but currently this does not allow for
> inlining of the handles, since they are not constant.
>
> Maybe a future C2 optimization could look at the receiver input for
> invokeBasic call sites, and if the input is a phi node, clone the call for
> each constant input of the phi. I believe that would allow simplifying the
> bytecode without giving up on inlining.
>
> Some numbers from the added benchmarks:
>
> Benchmark (numCases) (offset)
> (sorted) Mode Cnt Score Error Units
> MethodHandlesTableSwitchConstant.testSwitch 5 0
> N/A avgt 30 4.186 � 0.054 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 5 150
> N/A avgt 30 4.164 � 0.057 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 10 0
> N/A avgt 30 4.124 � 0.023 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 10 150
> N/A avgt 30 4.126 � 0.025 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 25 0
> N/A avgt 30 4.137 � 0.042 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 25 150
> N/A avgt 30 4.113 � 0.016 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 50 0
> N/A avgt 30 4.118 � 0.028 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 50 150
> N/A avgt 30 4.127 � 0.019 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 100 0
> N/A avgt 30 4.116 � 0.013 ms/op
> MethodHandlesTableSwitchConstant.testSwitch 100 150
> N/A avgt 30 4.121 � 0.020 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 0
> N/A avgt 30 4.113 � 0.009 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 5 150
> N/A avgt 30 4.149 � 0.041 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 0
> N/A avgt 30 4.121 � 0.026 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 10 150
> N/A avgt 30 4.113 � 0.021 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 0
> N/A avgt 30 4.129 � 0.028 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 25 150
> N/A avgt 30 4.105 � 0.019 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 0
> N/A avgt 30 4.097 � 0.021 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 50 150
> N/A avgt 30 4.131 � 0.037 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 0
> N/A avgt 30 4.135 � 0.025 ms/op
> MethodHandlesTableSwitchOpaqueSingle.testSwitch 100 150
> N/A avgt 30 4.139 � 0.145 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 5 0
> true avgt 30 4.894 � 0.028 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 5 0
> false avgt 30 11.526 � 0.194 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 5 150
> true avgt 30 4.882 � 0.025 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 5 150
> false avgt 30 11.532 � 0.034 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 10 0
> true avgt 30 5.065 � 0.076 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 10 0
> false avgt 30 13.016 � 0.020 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 10 150
> true avgt 30 5.103 � 0.051 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 10 150
> false avgt 30 12.984 � 0.102 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 25 0
> true avgt 30 8.441 � 0.165 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 25 0
> false avgt 30 13.371 � 0.060 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 25 150
> true avgt 30 8.628 � 0.032 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 25 150
> false avgt 30 13.542 � 0.020 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 50 0
> true avgt 30 4.701 � 0.015 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 50 0
> false avgt 30 13.562 � 0.063 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 50 150
> true avgt 30 7.991 � 3.111 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 50 150
> false avgt 30 13.543 � 0.088 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 100 0
> true avgt 30 4.712 � 0.020 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 100 0
> false avgt 30 13.600 � 0.085 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 100 150
> true avgt 30 4.676 � 0.011 ms/op
> MethodHandlesTableSwitchRandom.testSwitch 100 150
> false avgt 30 13.476 � 0.043 ms/op
>
>
> Testing:
> - [x] Running of included benchmarks
> - [x] Inspecting inlining trace and verifying method handle targets are
> inlined
> - [x] Running TestTableSwitch test (currently the only user of the new code)
> - [x] Running java/lang/invoke tests (just in case)
> - [x] Some manual testing
>
> Thanks,
> Jorn
Overall this looks great!
I have a number of nits and a request to try and remodel so that
`NamedFunction` isn't responsible for holding the arbitrary intrinsic data that
you want to add here.
src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java line
1404:
> 1402: private Name emitTableSwitch(int pos, int numCases) {
> 1403: Name args = lambdaForm.names[pos];
> 1404: Name invoker = lambdaForm.names[pos+1];
Add spaces around `+`
src/java.base/share/classes/java/lang/invoke/InvokerBytecodeGenerator.java line
1409:
> 1407: Class<?> returnType =
> result.function.resolvedHandle().type().returnType();
> 1408: MethodType caseType = args.function.resolvedHandle().type()
> 1409: .dropParameterTypes(0,1) // drop collector
Space after comma
src/java.base/share/classes/java/lang/invoke/LambdaForm.java line 731:
> 729: collectArgs.isInvokeBasic() &&
> 730: unboxResult.isInvokeBasic() &&
> 731: tableSwitch.lastUseIndex(collectArgs) == 3 && //
> t_{n+1}:L=MethodHandleImpl.<invoker>(*, *, *, t_{n});
Extraneous space
src/java.base/share/classes/java/lang/invoke/LambdaForm.java line 1098:
> 1096: @Stable MethodHandle invoker;
> 1097: private final MethodHandleImpl.Intrinsic intrinsicName;
> 1098: private final Object intrinsicData;
This seems like the wrong place to add arbitrary data related only to
intrinsics. Could this be moved either into
`MethodHandleImpl$IntrinsicMethodHandle` or - perhaps less appealingly -
rewrite the `MethodHandleImpl.Intrinsic` to a regular class which can then be
instantiated with extra data? That `NamedFunction` holds a field of type
`MethodHandleImpl.Intrinsic` instead of delegating to
`resolvedHandle.intrinsicName()` seem like a pre-existing kludge that would be
good to sort out why/if it's really necessary.
src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java line 2015:
> 2013: }
> 2014:
> 2015: private static final Map<TableSwitchCacheKey, LambdaForm>
> TABLE_SWITCH_LAMBDA_FORM_CACHE = new ConcurrentHashMap<>();
This could be moved to `class TableSwitchCacheKey` to reduce eagerly
initialization on bootstrap.
test/micro/org/openjdk/bench/java/lang/invoke/MethodHandlesTableSwitchOpaqueSingle.java
line 90:
> 88: "50",
> 89: "100"
> 90: })
To reduce default runtime I think it'd be good to slim down the number of sizes
we test here to 2 or 3 different ones.
test/micro/org/openjdk/bench/java/lang/invoke/MethodHandlesTableSwitchRandom.java
line 90:
> 88: "50",
> 89: "100"
> 90: })
To reduce default runtime I think it'd be good to slim down the number of sizes
we test here to 2 or 3 different ones.
-------------
Changes requested by redestad (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3401