Hi all,

This patch adds the vcvt_f16_f32 and vcvt_f32_f16 NEON intrinsics to arm_neon.h through the generator ML scripts, and also adds the built-ins to which the intrinsics map. The generator ML scripts are updated and used to generate the relevant .texi documentation, arm_neon.h, and the tests in gcc.target/arm/neon.
The new intrinsics are guarded by checking the __ARM_FP predefine as described in ACLE. Bit 1 of the macro (mask 0x2) indicates half-precision floating-point support, so the intrinsics are guarded by:

	#if ((__ARM_FP & 0x2) != 0)

In arm.c I had to add handling of half-precision floats (and their vector forms) in quite a few places. I hope I haven't missed any.

Testing on arm-none-eabi with qemu showed no regressions.

Ok for trunk?

Thanks,
Kyrill

gcc/ChangeLog

2013-04-12  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

	* config/arm/arm.c (neon_builtin_type_mode): Add T_V4HF.
	(TB_DREG): Add T_V4HF.
	(v4hf_UP): New macro.
	(neon_itype): Add NEON_FLOAT_WIDEN, NEON_FLOAT_NARROW.
	(arm_init_neon_builtins): Handle NEON_FLOAT_WIDEN,
	NEON_FLOAT_NARROW.  Handle initialisation of V4HF.  Adjust
	initialisation of reinterpret built-ins.
	(arm_expand_neon_builtin): Handle NEON_FLOAT_WIDEN,
	NEON_FLOAT_NARROW.
	(arm_vector_mode_supported_p): Handle V4HF.
	(arm_mangle_map): Handle V4HFmode.
	* config/arm/arm.h (VALID_NEON_DREG_MODE): Add V4HF.
	* config/arm/arm_neon_builtins.def: Add entries for
	vcvtv4hfv4sf, vcvtv4sfv4hf.
	* config/arm/neon.md (neon_vcvtv4sfv4hf): New pattern.
	(neon_vcvtv4hfv4sf): Likewise.
	* config/arm/neon-gen.ml: Handle half-precision floating point
	features.
	* config/arm/neon-testgen.ml: Handle Requires_FP_bit feature.
	* config/arm/arm_neon.h: Regenerate.
	* config/arm/neon.ml (type elts): Add F16.
	(type vectype): Add T_float16x4, T_floatHF.
	(type vecmode): Add V4HF.
	(string_of_mode): Move earlier in the file.
	(type features): Add Requires_FP_bit feature.
	(elt_width): Handle F16.
	(elt_class): Likewise.
	(elt_of_class_width): Likewise.
	(mode_of_elt_str): New function.
	(type_for_elt): Handle F16, fix error messages.
	(vectype_size): Handle T_float16x4.
	(vcvt_sh): New function.
	(ops): Add entries for vcvt_f16_f32, vcvt_f32_f16.
	(string_of_vectype): Handle T_floatHF, T_float16, T_float16x4.
	* doc/arm-neon-intrinsics.texi: Regenerate.
gcc/testsuite/ChangeLog

2013-04-12  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

	* gcc.target/arm/neon/vcvtf16_f32.c: New test.  Generated.
	* gcc.target/arm/neon/vcvtf32_f16.c: Likewise.
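
For illustration only, here is a rough sketch of how user code might apply the __ARM_FP guard and call the two new intrinsics. It is not part of the patch. The names FP_HALF_MASK, fp_has_half, HAVE_F16_CVT and round_trip are hypothetical, the extra __ARM_NEON check is an assumption about the user's toolchain, and the fallback branch exists only so the sketch builds on targets without half-precision support.

```c
#include <assert.h>

/* Hypothetical name: mask for bit 1 of __ARM_FP, which ACLE uses to
   signal half-precision floating-point support. */
#define FP_HALF_MASK 0x2u

/* Hypothetical helper: a runtime mirror of the preprocessor guard, so
   the bit test itself can be exercised on any host. */
static int fp_has_half(unsigned arm_fp)
{
  return (arm_fp & FP_HALF_MASK) != 0;
}

#if defined(__ARM_NEON) && defined(__ARM_FP) && ((__ARM_FP & 0x2) != 0)
#include <arm_neon.h>
#define HAVE_F16_CVT 1

/* Narrow four floats to half precision, then widen them back. */
static void round_trip(const float in[4], float out[4])
{
  assert(in && out);
  float32x4_t wide = vld1q_f32(in);
  float16x4_t half = vcvt_f16_f32(wide);  /* float32x4_t -> float16x4_t */
  vst1q_f32(out, vcvt_f32_f16(half));     /* float16x4_t -> float32x4_t */
}
#else
#define HAVE_F16_CVT 0

/* Fallback (assumption, for portability of this sketch only): a plain
   copy with no half-precision rounding. */
static void round_trip(const float in[4], float out[4])
{
  assert(in && out);
  for (int i = 0; i < 4; i++)
    out[i] = in[i];
}
#endif
```

Values that are exactly representable in half precision (for example 1.0, -2.5, 0.5, 1024.0) should survive the round trip unchanged on either path. Other values would be rounded on the NEON path only.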
neon-vcvt-intrinsics-temp.patch