https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124315
Uroš Bizjak <ubizjak at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |liuhongt at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Ever confirmed|0 |1
Last reconfirmed| |2026-03-02
--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Zdenek Sojka from comment #0)
> Created attachment 63799 [details]
> reduced testcase
>
> Output:
> $ x86_64-pc-linux-gnu-gcc -mavx512fp16 -c testcase.c
> $ objdump -S testcase.o > att.S
> $ x86_64-pc-linux-gnu-gcc -mavx512fp16 -c testcase.c -masm=intel
> $ objdump -S testcase.o > intel.S
> $ diff -u att.S intel.S
> --- att.S 2026-03-02 07:54:30.144550317 +0100
> +++ intel.S 2026-03-02 07:54:35.554550344 +0100
> @@ -15,7 +15,7 @@
> 1b: 00
> 1c: b8 00 00 00 00 mov $0x0,%eax
> 21: c5 f9 92 c8 kmovb %eax,%k1
> - 25: 62 f6 75 59 bd c2 vfnmadd231sh {ru-sae},%xmm2,%xmm1,%xmm0{%k1}
> + 25: 62 f6 7d 59 bd c2 vfnmadd231sh {ru-sae},%xmm2,%xmm0,%xmm0{%k1}
> 2b: 5d pop %rbp
> 2c: c3 ret
(define_insn "avx512f_vmfnmadd_<mode>_mask3<round_name>"
...
"vfnmadd231<ssescalarmodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %<iptr>3, %<iptr>2<round_op5>}"
The Intel-dialect alternative uses %<iptr>3 as its middle operand. That is
wrong; it should be %1, the operand the AT&T alternative uses in that position.
> @@ -30,6 +30,6 @@
> 48: 00
> 49: b8 00 00 00 00 mov $0x0,%eax
> 4e: c5 f9 92 c8 kmovb %eax,%k1
> - 52: 62 f2 ed 59 bb c1 vfmsub231sd {ru-sae},%xmm1,%xmm2,%xmm0{%k1}
> + 52: 62 f2 fd 59 bb c1 vfmsub231sd {ru-sae},%xmm1,%xmm0,%xmm0{%k1}
> 58: 5d pop %rbp
> 59: c3 ret
(define_insn "avx512f_vmfmsub_<mode>_mask3<round_name>"
...
"vfmsub231<ssescalarmodesuffix>\t{<round_op5>%2, %1, %0%{%4%}|%0%{%4%}, %<iptr>3, %<iptr>2<round_op5>}"
Same problem as above: %<iptr>3 in the Intel-dialect alternative should be %1.
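For illustration only, the operand mismatch in the two templates quoted above
can be checked mechanically. The sketch below is not part of GCC; the helper
names split_dialects and operand_numbers are hypothetical. It assumes a
template with a single two-way {att|intel} group, treats %{, %}, %| and %% as
escaped literals, and ignores <...> .md attributes, so an operand hidden inside
an attribute such as <round_op5> is not counted.

```python
import re

# Template as quoted in this comment (raw string, so \t stays two characters).
TEMPLATE = (r"vfnmadd231<ssescalarmodesuffix>\t{<round_op5>%2, %1, %0%{%4%}"
            r"|%0%{%4%}, %<iptr>3, %<iptr>2<round_op5>}")


def split_dialects(template):
    """Return (att, intel) text of the {att|intel} group in a template.

    Assumption: exactly two dialect alternatives, as in the i386 backend.
    """
    parts = [[], []]   # collected text per dialect
    which = None       # None outside braces, 0 = AT&T, 1 = Intel
    i = 0
    while i < len(template):
        ch = template[i]
        # %{ %} %| %% are escaped literals, not dialect delimiters.
        if ch == "%" and i + 1 < len(template) and template[i + 1] in "{|}%":
            if which is not None:
                parts[which].append(template[i:i + 2])
            i += 2
            continue
        if ch == "{" and which is None:
            which = 0
        elif ch == "|" and which == 0:
            which = 1
        elif ch == "}" and which is not None:
            which = None
        elif which is not None:
            parts[which].append(ch)
        i += 1
    return "".join(parts[0]), "".join(parts[1])


def operand_numbers(text):
    """Return the sorted operand numbers referenced in one dialect alternative."""
    text = re.sub(r"%[{|}%]", "", text)   # drop escaped literals
    text = re.sub(r"<[^>]*>", "", text)   # drop .md attributes such as <iptr>
    return sorted(int(n) for n in re.findall(r"%[A-Za-z]?(\d+)", text))


att, intel = split_dialects(TEMPLATE)
att_ops, intel_ops = operand_numbers(att), operand_numbers(intel)
if att_ops != intel_ops:
    print("operand mismatch: AT&T", att_ops, "vs Intel", intel_ops)
```

A difference between the two operand sets is only a hint that a template
deserves a closer look; some patterns may legitimately reference different
operands per dialect.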
> The -masm=intel output has xmm0 twice as the operand.
BTW: Adding -dp to the compile flags annotates each instruction in the
assembler output with the name of the insn pattern that produced it, which
identifies the problematic pattern. -dP also writes out the whole RTL pattern.