On Wed, 24 Apr 2024 10:08:42 GMT, Claes Redestad <[email protected]> wrote:
>> This patch suggests a workaround to an issue with huge SCF MH expression
>> trees taking excessive JIT compilation resources by reviving (part of) the
>> simple bytecode-generating strategy that was originally available as an
>> all-or-nothing strategy choice.
>>
>> Instead of reintroducing a binary strategy choice I propose a threshold
>> parameter, controlled by
>> `-Djava.lang.invoke.StringConcat.highArityThreshold=<val>`: For expressions
>> below or at this threshold there's no change, for expressions with an arity
>> above it we use the `StringBuilder`-chain bytecode generator.
>>
>> There are a few trade-offs at play here which influence the choice of
>> threshold. The simple high arity strategy will for example not see any reuse
>> of LambdaForms but strictly always generate a class per indy callsite, which
>> means we might end up with a higher total number of classes generated and
>> loaded in applications if we set this value too low. It may also produce
>> worse performance on average. On the other hand there is the observed
>> increase in C2 resource usage as expressions grow unwieldy. On the other
>> other hand high arity expressions are likely rare to begin with, with less
>> opportunities for sharing than the more common low-arity expressions.
>>
>> I turned the submitted test case into a few JMH benchmarks and did some
>> experiments with `-XX:CompileCommand=MemStat,StringConcat::concat,print`:
>>
>> Baseline strategy:
>> 13 args: 6.3M
>> 23 args: 18M
>> 123 args: 868M
>>
>> `-Djava.lang.invoke.StringConcat.highArityThreshold=0`:
>> 13 args: 2.11M
>> 23 args: 3.67M
>> 123 args: 4.75M
>>
>> For 123 args the memory overhead of the baseline strategy is 180x, but for
>> 23 args we're down to a 5x memory overhead, and down to a 3x overhead for 13
>> args. Since the absolute overhead for 23 is borderline acceptable (+15Mb)
>> I've conservatively chosen a threshold at arity 20. This keeps C2 resource
>> pressure at a reasonable level (< 18M) while avoiding perturbing performance
>> at the vast majority of call sites.
>>
>> I was asked to use the new class file API for mainline. There's a version of
>> this patch implemented using ASM in 7c52a9f which might be a reasonable
>> basis for a backport.
>
> Claes Redestad has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Make Set.of(STRONG) a constant, fix compilation, minor code improvements
Looks fine to me. Indeed, splitting this with ASM and then convert it to
ClassFile API would help the backport.
src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1077:
> 1075:
> 1076: /**
> 1077: * Ensure a capacity in the initial StringBuilder to
> accommodate all constants plus this factor times the number
Nit: wrap long line.
src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1085:
> 1083:
> 1084: static {
> 1085: DUMPER =
> ClassFileDumper.getInstance("java.lang.invoke.StringConcatFactory.dump",
> "stringConcatClasses");
Nit: this static block isn't strictly needed. Can assign at the declaration of
the static variable.
src/java.base/share/classes/java/lang/invoke/StringConcatFactory.java line 1112:
> 1110: return hiddenLookup.findStatic(innerClass, METHOD_NAME,
> args);
> 1111: } catch (Exception e) {
> 1112: DUMPER.dumpFailedClass(className, classBytes);
This line is no longer needed. The bytes will be dumped if it's enabled for
both success and failing case.
-------------
Marked as reviewed by mchung (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18690#pullrequestreview-2020345792
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1578178759
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1578176295
PR Review Comment: https://git.openjdk.org/jdk/pull/18690#discussion_r1578173742