On Fri, 3 Mar 2023 21:41:35 GMT, Vladimir Kozlov <k...@openjdk.org> wrote:
> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in > Interpreter and C1 compiler to produce the same results as C2 intrinsics on > x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java > methods were implemented originally. > > Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls > to runtime stubs which use the same HW instructions as C2 intrinsics. Only > for 64-bit x64 because 32-bit x86 stub does not work: result is passed > through FPU register and NaN values become different from C2 intrinsic. This > runtime stub is only used to calculate constant values during C2 compilation > and can be skipped. > > I added new tests based on Tobias's `TestAll.java` And copied > `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to > make sure code is compiled by C1 or C2. I modified > `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2. > > Tested tier1-5, Xcomp, stress src/hotspot/cpu/x86/macroAssembler_x86.hpp line 199: > 197: void flt_to_flt16(Register dst, XMMRegister src, XMMRegister tmp) { > 198: // Instruction requires different XMM registers > 199: vcvtps2ph(tmp, src, 0x04, Assembler::AVX_128bit); vcvtps2ph can have source and destination as same. Did you mean to say here in the comment that "Instruction requires XMM register as destination"? src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3928: > 3926: } > 3927: > 3928: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) { We could check for VM_Version::supports_float16() here instead. src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3931: > 3929: // For results consistency both intrinsics should be enabled. > 3930: if > (vmIntrinsics::is_intrinsic_available(vmIntrinsics::_float16ToFloat) && > 3931: > vmIntrinsics::is_intrinsic_available(vmIntrinsics::_floatToFloat16)) { Should this also check for InlineIntrinsics? src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 346: > 344: } > 345: // For AVX CPUs only. f16c support is disabled if UseAVX == 0. > 346: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) { We could check for VM_Version::supports_float16() here instead. ------------- PR: https://git.openjdk.org/jdk/pull/12869