On Fri, 24 Apr 2026 16:28:50 GMT, Jatin Bhateja <[email protected]> wrote:
> Patch optimizes Float16 to integral conversion operations. Currently, it is a
> two-step process whereby a Float16 value is first converted to a
> single-precision floating-point value, followed by a conversion to an
> integral value.
>
> x86 targets supporting the AVX512-FP16 feature (Intel Sapphire Rapids+ and
> upcoming AMD Zen6) provide a direct instruction to convert a Float16 value to
> an integral value.
>
> Following are the performance numbers of the micro benchmark included with the
> patch on Granite Rapids with and without auto-vectorization.
>
> <img width="1125" height="636" alt="image" src="https://github.com/user-attachments/assets/ca6e6757-1579-475f-8307-9454c7c025c1" />
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
>
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK Interim AI Policy](https://openjdk.org/legal/ai).

Changes requested by galder (Committer).

src/hotspot/cpu/x86/x86.ad line 14734:

> 14732:   format %{ "convert_hf2l $dst, $src !\t using $xtmp as TEMP" %}
> 14733:   ins_encode %{
> 14734:     __ convertHF2I(T_LONG, $dst$$Register, $src$$Register, $xtmp$$XMMRegister);

Minor comment: isn't it a bit confusing to call `convertHF2I` with a `T_LONG`? Maybe `convertHF2I` could be renamed to `convertHF2X` to not commit to the type?

-------------

PR Review: https://git.openjdk.org/jdk/pull/30928#pullrequestreview-4242968982
PR Review Comment: https://git.openjdk.org/jdk/pull/30928#discussion_r3200551189
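
For readers following the thread, the two-step conversion that the quoted description refers to can be sketched in plain Java. This is only an illustration and not code from the patch: the class and method names are hypothetical, the Float16 value is modeled as raw `short` bits, and it assumes JDK 20 or later for `Float.float16ToFloat` and `Float.floatToFloat16`.

```java
// Hypothetical sketch of the two-step Float16 -> integral conversion.
// Not taken from the patch; names are illustrative only.
public class Float16ToIntSketch {

    // Step 1: widen the Float16 bits to a single-precision float.
    // Step 2: narrow the float to int using Java's cast semantics
    // (round toward zero, NaN becomes 0, out-of-range values saturate).
    static int toInt(short float16Bits) {
        float widened = Float.float16ToFloat(float16Bits);
        return (int) widened;
    }

    // Same two steps, with long as the integral target type.
    static long toLong(short float16Bits) {
        float widened = Float.float16ToFloat(float16Bits);
        return (long) widened;
    }

    public static void main(String[] args) {
        short h = Float.floatToFloat16(3.75f); // 3.75 is exactly representable in Float16
        System.out.println(toInt(h));
        System.out.println(toLong(h));
    }
}
```

The sketch only mirrors the semantics of the existing path; a direct hardware conversion would be expected to produce the same results while skipping the intermediate `float`.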
