| Field | Value |
|---|---|
| Issue | 63877 |
| Summary | LLVM MCA does not match uops.info/Agner measurements |
| Labels | new issue |
| Assignees | |
| Reporter | SeeSpring |

According to uops.info and Agner Fog's instruction tables, each of these instructions has a reciprocal throughput of 0.50 cycles on Zen 4:
```
vminps %ymm0, %ymm1, %ymm2
vfmadd213ps %ymm0, %ymm1, %ymm2
vcmpltps %ymm0, %ymm1, %ymm2
```
llvm-mca, run with `--mcpu=znver4`, reports an RThroughput of 1.00 for each of them.
[Godbolt of llvm-mca](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:analysis,selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:'vminps++++++++%25ymm0,+%25ymm1,+%25ymm2%0Avfmadd213ps+++%25ymm0,+%25ymm1,+%25ymm2%0Avcmpltps++++++%25ymm0,+%25ymm1,+%25ymm2'),l:'5',n:'0',o:'Analysis+source+%231',t:'0')),k:37.5,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:llvm-mcatrunk,deviceViewOpen:'1',filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'1',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:2,lang:analysis,libs:!(),options:'--mcpu%3Dznver4',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+llvm-mca+(trunk)+(Editor+%231)',t:'0')),k:62.5,l:'4',m:100,n:'0',o:'',s:1,t:'0')),l:'2',m:100,n:'0',o:'',t:'0')),version:4)
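
To make the size of the gap concrete, here is a minimal sketch that tabulates the two sets of numbers quoted in this report and converts reciprocal throughput into instructions per cycle. The table name `REPORTED` and the helper functions are hypothetical, introduced only for illustration; the values are the ones stated above, not new measurements.

```python
# Reciprocal throughput (cycles per instruction) as quoted in this report.
# "measured" is the uops.info / Agner figure; "llvm_mca" is the RThroughput
# printed by llvm-mca with --mcpu=znver4. Names here are illustrative only.
REPORTED = {
    "vminps":      {"measured": 0.50, "llvm_mca": 1.00},
    "vfmadd213ps": {"measured": 0.50, "llvm_mca": 1.00},
    "vcmpltps":    {"measured": 0.50, "llvm_mca": 1.00},
}


def instructions_per_cycle(rthroughput):
    """Convert reciprocal throughput (cycles/instr) to instr/cycle."""
    return 1.0 / rthroughput


def mismatches(table, tolerance=0.05):
    """Return {mnemonic: llvm_mca / measured} where the two values disagree."""
    result = {}
    for name, values in table.items():
        ratio = values["llvm_mca"] / values["measured"]
        if abs(ratio - 1.0) > tolerance:
            result[name] = ratio
    return result


if __name__ == "__main__":
    for name, ratio in mismatches(REPORTED).items():
        measured_ipc = instructions_per_cycle(REPORTED[name]["measured"])
        model_ipc = instructions_per_cycle(REPORTED[name]["llvm_mca"])
        print(f"{name}: measured {measured_ipc:.1f} instr/cycle, "
              f"llvm-mca {model_ipc:.1f} instr/cycle ({ratio:.1f}x pessimistic)")
```

In other words, for all three instructions the znver4 model predicts one instruction per cycle where the reference measurements show two.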
<details>
<summary>Version</summary>
```
LLVM (http://llvm.org/):
LLVM version 17.0.0git
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake-avx512
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_32 - AArch64 (little endian ILP32)
aarch64_be - AArch64 (big endian)
amdgcn - AMD GCN GPUs
arm - ARM
arm64 - ARM64 (little endian)
arm64_32 - ARM64 (little endian ILP32)
armeb - ARM (big endian)
avr - Atmel AVR Microcontroller
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
hexagon - Hexagon
lanai - Lanai
loongarch32 - 32-bit LoongArch
loongarch64 - 64-bit LoongArch
mips - MIPS (32-bit big endian)
mips64 - MIPS (64-bit big endian)
mips64el - MIPS (64-bit little endian)
mipsel - MIPS (32-bit little endian)
msp430 - MSP430 [experimental]
nvptx - NVIDIA PTX 32-bit
nvptx64 - NVIDIA PTX 64-bit
ppc32 - PowerPC 32
ppc32le - PowerPC 32 LE
ppc64 - PowerPC 64
ppc64le - PowerPC 64 LE
r600 - AMD GPUs HD2XXX-HD6XXX
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V
sparc - Sparc
sparcel - Sparc LE
sparcv9 - Sparc V9
systemz - SystemZ
thumb - Thumb
thumbeb - Thumb (big endian)
ve - VE
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
xcore - XCore
Compiler returned: 0
```
</details>