Re: RFR: 8370691: Add new Float16Vector type and enable intrinsification of vector operations supported by auto-vectorizer [v44]

Xueming Shen Thu, 28 May 2026 13:41:04 -0700

On Thu, 28 May 2026 09:48:08 GMT, Jatin Bhateja <[email protected]> wrote:


>> Add a new Float16Vector type and corresponding concrete vector classes, in 
>> addition to existing primitive vector types, maintaining operation parity 
>> with the FloatVector type.
>> - Add necessary inline expander support.
>>    - Enable intrinsification for a few vector operations, namely 
>> ADD/SUB/MUL/DIV/MAX/MIN/SQRT/FMA.
>> - Use existing Float16 vector IR and backend support.
>> - Extended the existing VectorAPI JTREG test suite for the newly added 
>> Float16Vector operations.
>>  
>> The idea here is to first reach parity with Float16 auto-vectorization 
>> support before intrinsifying new operations (conversions, reductions, etc.).
>> 
>> The following are the performance numbers for some of the selected 
>> Float16Vector benchmarking kernels compared to equivalent auto-vectorized 
>> Float16OperationsBenchmark kernels.
>> 
>> <img width="1344" height="532" alt="image" 
>> src="https://github.com/user-attachments/assets/c8157c3c-22b0-4bc1-9de9-7a68cadb7b2a"
>>  />
>> 
>> Initial RFP[1] was floated on the panama-dev mailing list.
>> 
>> Kindly review the draft PR and share your feedback.
>> 
>> Best Regards,
>> Jatin
>> 
>> [1] https://mail.openjdk.org/pipermail/panama-dev/2025-August/021100.html
>> 
>> ---------
>> - [x] I confirm that I make this contribution in accordance with the 
>> [OpenJDK Interim AI Policy](https://openjdk.org/legal/ai).
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Review comments resolutions

This is a nit, and it came from the bot screening :-)  

There is no real issue with the public VectorOperators.OR, because it is 
flagged VO_NOFP, so this is probably not a blocker. The potential risk is the 
private op OR_UNCHECKED, which ignores the NOFP flag (?).

For Float16Vector we currently have


case VECTOR_OP_OR: return (v0, v1, vm) ->
        v0.bOp(v1, vm, (i, a, b) ->
                FloatVector.fromBits(FloatVector.toBits(a)
                                     | FloatVector.toBits(b)));


which converts Float16 => float32, performs the bitwise OR on the float32 
bits, and then converts the result back to Float16.

That seems wrong for Float16. Should this instead perform the bitwise OR 
directly on the raw short bits?

For example

**Float16 bits 0x0001**: fraction = 1, value = 1 * 2^-24 = **5.960464477539063e-8**
**Float16 bits 0x0002**: fraction = 2, value = 2 * 2^-24 = **2^-23 = 1.1920928955078125e-7**

The expected raw bitwise OR is: 0x0001 | 0x0002 = 0x0003
which as Float16 value would be:  3 * 2^-24 = **1.7881393432617188e-7**

However the current implementation:

0x0001 Float16 -> float32 value 2^-24 -> float32 bits 0x33800000
0x0002 Float16 -> float32 value 2^-23 -> float32 bits 0x34000000

Then it ORs the float32 encodings:  0x33800000 | 0x34000000 = 0x37800000
That is float32 value:  **2^-16 = 1.52587890625e-5**

Converted back to Float16, that becomes: 0x0100

So the mismatch is:
expected raw Float16 OR: 0x0003
actual current fallback: 0x0100
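
The numbers above can be reproduced with a small standalone sketch. This is not the Float16Vector code. The class and the helpers `halfToFloat`, `floatToHalf`, `orRawBits` and `orViaFloat32` are hypothetical names made up for illustration, and the conversions are hand-rolled so the sketch does not depend on a particular JDK release (on JDK 20 or later, `Float.float16ToFloat` and `Float.floatToFloat16` could presumably be used instead).

```java
class Float16OrDemo {
    // Decode IEEE 754 binary16 bits to a float (exact, no rounding needed).
    static float halfToFloat(short h) {
        int sign = (h >> 15) & 1;
        int exp = (h >> 10) & 0x1F;
        int frac = h & 0x3FF;
        float mag;
        if (exp == 0) {
            mag = Math.scalb((float) frac, -24);          // zero or subnormal
        } else if (exp == 31) {
            mag = (frac == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            mag = Math.scalb((float) (frac | 0x400), exp - 25); // normal
        }
        return sign == 0 ? mag : -mag;
    }

    // Encode a float as binary16 bits, rounding to nearest even.
    static short floatToHalf(float f) {
        int sign = (Float.floatToRawIntBits(f) >>> 16) & 0x8000;
        if (Float.isNaN(f)) return (short) (sign | 0x7E00);
        float a = Math.abs(f);
        if (a >= 65520f) return (short) (sign | 0x7C00);   // rounds to infinity
        if (a < 0x1.0p-14f) {
            // Subnormal range: count multiples of 2^-24 (1024 carries to min normal).
            int q = (int) Math.rint(Math.scalb(a, 24));
            return (short) (sign | q);
        }
        int e = Math.getExponent(a);                        // -14 .. 15
        int q = (int) Math.rint(Math.scalb(a, 10 - e));     // 1024 .. 2048
        return (short) (sign | (((e + 15) << 10) + (q - 1024)));
    }

    // OR applied directly to the 16-bit encodings.
    static short orRawBits(short a, short b) {
        return (short) (a | b);
    }

    // OR applied to the float32 encodings of the widened values,
    // mirroring the fallback path described above.
    static short orViaFloat32(short a, short b) {
        int bits = Float.floatToRawIntBits(halfToFloat(a))
                 | Float.floatToRawIntBits(halfToFloat(b));
        return floatToHalf(Float.intBitsToFloat(bits));
    }

    public static void main(String[] args) {
        short a = 0x0001, b = 0x0002;
        System.out.printf("raw OR:      0x%04X%n", orRawBits(a, b) & 0xFFFF);    // 0x0003
        System.out.printf("via float32: 0x%04X%n", orViaFloat32(a, b) & 0xFFFF); // 0x0100
    }
}
```

The two paths agree only when the float32 OR happens to map back onto the same Float16 bit pattern, which is not the case for these subnormal inputs.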

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28002#issuecomment-4568001440
