On Mon, 14 Nov 2022 18:28:53 GMT, Vladimir Ivanov <vliva...@openjdk.org> wrote:

>>> Also, I'd like to note that C2 auto-vectorization support is not too far 
>>> away from being able to optimize hash code computations. At some point, I 
>>> was able to achieve some promising results with modest tweaking of 
>>> SuperWord pass: https://github.com/iwanowww/jdk/blob/superword/notes.txt 
>>> http://cr.openjdk.java.net/~vlivanov/superword.reduction/webrev.00/
>> 
>> Intriguing. How far off is this - and do you think it'll be able to match 
>> the efficiency we see here with a memoized coefficient table etc?
>> 
>> If we turn this intrinsic into a stub we might also be able to reuse the 
>> optimization in other places, including from within the VM (calculating 
>> String hashCodes happens in a couple of places, including String 
>> deduplication). So I think there are still a few compelling reasons to go 
>> the manual route and continue on this path.
>
>> How far off is this ...?
> 
> Back then it looked way too constrained (tight constraints on code shapes). 
> But I considered it as a generally applicable optimization. 
> 
>>  ... do you think it'll be able to match the efficiency we see here with a 
>> memoized coefficient table etc?
> 
> Yes, it is able to build the constant table at runtime when folding 
> multiplications of constant coefficients produced during loop unrolling and 
> then packing scalars into a constant vector.
> 
> Moreover, briefly looking at the code shape, the vectorizer would produce a 
> more optimal loop shape (pre-loop would align vector accesses and would use 
> 512-bit vectors when available; vector post-loop could help as well).

Passing the constant node through as an input, as suggested by @iwanowww and 
@sviswa7, meant we could eliminate most of the `instruct` blocks. That removes 
a significant chunk of code and a little bit of complexity from the proposed 
patch.
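
For reference, here is a rough Java sketch of the coefficient folding discussed 
in the quoted text. It is not the intrinsic and not what C2 emits; the class 
and method names are made up for illustration. Unrolling the `31 * h + c` loop 
four times folds the multipliers into the constants 31^4, 31^3, 31^2 and 31, 
which is the kind of coefficient table a vectorizer could pack into a constant 
vector.

```java
// Hypothetical illustration only: names are not taken from the JDK or the patch.
final class UnrolledHashSketch {
    // Powers of 31 folded into compile-time constants (int arithmetic wraps on overflow).
    private static final int P1 = 31;
    private static final int P2 = 31 * 31;
    private static final int P3 = 31 * 31 * 31;
    private static final int P4 = 31 * 31 * 31 * 31;

    // Straightforward polynomial hash, same recurrence as String.hashCode().
    static int scalarHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same hash with the loop body unrolled four times and the multipliers folded.
    static int unrolledHash(String s) {
        int h = 0;
        int i = 0;
        int n = s.length();
        // Main loop: four characters per iteration, one coefficient per lane.
        for (; i + 3 < n; i += 4) {
            h = P4 * h
              + P3 * s.charAt(i)
              + P2 * s.charAt(i + 1)
              + P1 * s.charAt(i + 2)
              + s.charAt(i + 3);
        }
        // Scalar post-loop for the remaining 0 to 3 characters.
        for (; i < n; i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }
}
```

Both methods follow the same recurrence as `String.hashCode()`, so they should 
agree with it for any input; the unrolled form just makes the per-lane 
constants explicit.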

-------------

PR: https://git.openjdk.org/jdk/pull/10847
