On Tue, 6 Jan 2026 23:29:52 GMT, Volodymyr Paprotski <[email protected]>
wrote:
>> Shawn M Emery has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Update copyright year
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 862:
>
>> 860: __ addptr(condensed, condensedOffs);
>> 861:
>> 862: if (VM_Version::supports_avx512_vbmi2()) {
>
> Which instruction needs vbmi2? All I could spot was that `evpermb` that needs
> vbmi. Relax the restriction slightly?
Good catch! Initially the code was using `vpshldvw`, but was changed to just
use `vpsrlvw`. Fixed in next commit.
I should probably update the bug synopsis to exclude VBMI2?
> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 906:
>
>> 904: __ addptr(condensed, 192);
>> 905: __ addptr(parsed, 256);
>> 906: __ subl(parsedLength, 128);
>
> (128 instead of 256 here because `parsedLength` is an index to an `short`
> array..)
>
> I am confused by the stride. The `twelve2Sixteen()` seems to (almost)
> guarantee that the parsed length is a multiple of 64 (last block can be 48
> bytes). This would imply a stride of 128 bytes for `parsed`. And 96 for
> `condensed`.
>
> This is exactly how the existing code already behaves so I am less concerned,
> but I would like an explanation why it works?
I believe the numbers are right: with each pass 256 bytes of coefficients are
`parsed` into the parse buffer. This means that half of the coefficients have
been processed (`parseLength` = 128). Would having a comment stating as such
be sufficient for your concerns?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28815#discussion_r2667206396
PR Review Comment: https://git.openjdk.org/jdk/pull/28815#discussion_r2667206828