Hello!

> This patch adds intrinsics for FMA instruction set along with tests for them.
> Bootstraps and passes make check (including make check on simulator
> for new runtime tests).

>        * config/i386/fmaintrin.h: New.

It is not included in the patch.

>        * config.gcc: Add fmaintrin.h.
>        * config/i386/i386.c
>        * <ix86_builtins> (IX86_BUILTIN_VFMADDSS3): New.
>        (IX86_BUILTIN_VFMADDSD3): Likewise.
>        (X86_BUILTIN_VFNMADDSS3): Likewise.
>        (X86_BUILTIN_VFNMADDSD3): Likewise.
>        (X86_BUILTIN_VFMSUBSS3): Likewise.
>        (X86_BUILTIN_VFMSUBSD3): Likewise.
>        (X86_BUILTIN_VFNMSUBSS3): Likewise.
>        (X86_BUILTIN_VFNMSUBSD3): Likewise.
>        (X86_BUILTIN_VFMSUBPS): Likewise.
>        (X86_BUILTIN_VFMSUBPD): Likewise.
>        (X86_BUILTIN_VFMSUBPS256): Likewise.
>        (X86_BUILTIN_VFMSUBPD256): Likewise.
>        (X86_BUILTIN_VFNMADDPS): Likewise.
>        (X86_BUILTIN_VFNMADDPD): Likewise.
>        (X86_BUILTIN_VFNMADDPS256): Likewise.
>        (X86_BUILTIN_VFNMADDPD256): Likewise.
>        (X86_BUILTIN_VFNMSUBPS): Likewise.
>        (X86_BUILTIN_VFNMSUBPD): Likewise.
>        (X86_BUILTIN_VFNMSUBPS256): Likewise.
>        (X86_BUILTIN_VFNMSUBPD256): Likewise.
>        (X86_BUILTIN_VFMSUBADDPS): Likewise.
>        (X86_BUILTIN_VFMSUBADDPD): Likewise.
>        (X86_BUILTIN_VFMSUBADDPS256): Likewise.
>        (X86_BUILTIN_VFMSUBADDPD256): Likewise.

You don't need to add "negated" versions; one FMA builtin per mode is
enough. Just put a unary minus sign in the intrinsics header for the
"negated" operand and let GCC do its job. Please see the existing FMA4
descriptions and the existing FMA4 intrinsics header.

>        * config/i386/sse.md (fmai_fnmadd_<mode>): New.
>        (fmai_fmsub_<mode>): Likewise.
>        (fmai_fnmsub_<mode>): Likewise.
>        (fmai_fmadd_s_<mode>): Likewise.
>        (fmai_vmfmadd_s_<mode>): Likewise.
>        (fmai_vmfmsub_s_<mode>): Likewise.
>        (fmai_vmfnmadd_s_<mode>): Likewise.
>        (fmai_vmfnmsub_s_<mode>): Likewise.
>        (*fmai_fmadd_s_<mode>): Likewise.
>        (*fmai_fmsub_s_<mode>): Likewise.
>        (*fmai_fnmadd_s_<mode>): Likewise.
>        (*fmai_fnmsub_s_<mode>): Likewise.
>        (fmsubadd_<mode>): Likewise.

Also here. All your FMAMODE patterns should be expanded through the
existing "fma4i_fmadd_<mode>" expander (you can rename it to
"fmai_fmadd..." to make its name more generic). This includes the new
"fmsubadd_<mode>" pattern, which should be expanded through the existing
"fmaddsub_<mode>" expander. The vec_merge scalar versions also need only
one expander; again, follow the existing FMA4 version.

Also, there is no need to include "_s_" in the name. We know that these
are scalar versions.

>        * gcc.target/i386/fma-check.h: New.
>        * gcc.target/i386/fma-256-fmaddXX.c: New testcase.
>        * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
>        * gcc.target/i386/fma-256-fmsubXX.c: Likewise.
>        * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
>        * gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
>        * gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmaddXX.c: Likewise.
>        * gcc.target/i386/fma-fmaddsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmsubaddXX.c: Likewise.
>        * gcc.target/i386/fma-fnmaddXX.c: Likewise.
>        * gcc.target/i386/fma-fnmsubXX.c: Likewise.
>        * gcc.target/i386/fma-compile.c: Likewise.
>        * gcc.target/i386/i386.exp (check_effective_target_fma): New.

Is there a reason that all runtime tests are compiled with -O0, other
than that some existing FMA tests in the testsuite also use -O0?
Usually, these kinds of tests are compiled using -O2, so that
optimizations are also applied to the builtins.

Uros.
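
To illustrate the point about "negated" versions, here is a minimal sketch of how the fmsub, fnmadd and fnmsub forms can all be written on top of a single multiply-add by negating operands. Everything named sketch_* is hypothetical and is not taken from the patch or from the FMA4 header. A plain, unfused a * b + c on a GCC vector type stands in for the one real builtin, so the sketch builds without -mfma; a real header would call the builtin instead.

```c
#include <assert.h>

/* Hypothetical stand-in type, for illustration only.  A real header would
   use the vector types from the intrinsics headers.  */
typedef double sketch_v2df __attribute__ ((vector_size (16)));

/* The one "base" operation per mode: a * b + c.  Assumption: this plain
   multiply-add stands in for the single FMA builtin and is not fused.  */
static inline sketch_v2df
sketch_fmadd_pd (sketch_v2df a, sketch_v2df b, sketch_v2df c)
{
  return a * b + c;
}

/* a * b - c: negate the addend.  */
static inline sketch_v2df
sketch_fmsub_pd (sketch_v2df a, sketch_v2df b, sketch_v2df c)
{
  return sketch_fmadd_pd (a, b, -c);
}

/* -(a * b) + c: negate one multiplicand.  */
static inline sketch_v2df
sketch_fnmadd_pd (sketch_v2df a, sketch_v2df b, sketch_v2df c)
{
  return sketch_fmadd_pd (-a, b, c);
}

/* -(a * b) - c: negate one multiplicand and the addend.  */
static inline sketch_v2df
sketch_fnmsub_pd (sketch_v2df a, sketch_v2df b, sketch_v2df c)
{
  return sketch_fmadd_pd (-a, b, -c);
}

/* Small self-check using exactly representable values.  */
void
sketch_self_check (void)
{
  sketch_v2df a = { 2.0, 3.0 };
  sketch_v2df b = { 4.0, 5.0 };
  sketch_v2df c = { 1.0, 2.0 };

  sketch_v2df r = sketch_fmsub_pd (a, b, c);   /* { 7, 13 } */
  assert (r[0] == 7.0 && r[1] == 13.0);

  r = sketch_fnmadd_pd (a, b, c);              /* { -7, -13 } */
  assert (r[0] == -7.0 && r[1] == -13.0);

  r = sketch_fnmsub_pd (a, b, c);              /* { -9, -17 } */
  assert (r[0] == -9.0 && r[1] == -17.0);
}
```

With this shape only the base operation needs a builtin and a pattern; the negations are ordinary expressions that the compiler is free to fold into the matching instruction form.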