Issue 63877
Summary LLVM MCA does not match uops.info/Agner measurements
Labels new issue
Assignees
Reporter SeeSpring
According to uops.info and Agner Fog's instruction tables, these instructions all have a throughput of 0.50 on Zen 4:
```
vminps %ymm0, %ymm1, %ymm2
vfmadd213ps   %ymm0, %ymm1, %ymm2
vcmpltps %ymm0, %ymm1, %ymm2
```
llvm-mca with `--mcpu=znver4` reports an RThroughput of 1.00 for each of them.
[Godbolt of llvm-mca](https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(filename:'1',fontScale:14,fontUsePx:'0',j:1,lang:analysis,selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:'vminps++++++++%25ymm0,+%25ymm1,+%25ymm2%0Avfmadd213ps+++%25ymm0,+%25ymm1,+%25ymm2%0Avcmpltps++++++%25ymm0,+%25ymm1,+%25ymm2'),l:'5',n:'0',o:'Analysis+source+%231',t:'0')),k:37.5,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:llvm-mcatrunk,deviceViewOpen:'1',filters:(b:'0',binary:'1',binaryObject:'1',commentOnly:'0',debugCalls:'1',demangle:'0',directives:'0',execute:'1',intel:'1',libraryCode:'0',trim:'1'),flagsViewOpen:'1',fontScale:14,fontUsePx:'0',j:2,lang:analysis,libs:!(),options:'--mcpu%3Dznver4',overrides:!(),selection:(endColumn:1,endLineNumber:1,positionColumn:1,positionLineNumber:1,selectionStartColumn:1,selectionStartLineNumber:1,startColumn:1,startLineNumber:1),source:1),l:'5',n:'0',o:'+llvm-mca+(trunk)+(Editor+%231)',t:'0')),k:62.5,l:'4',m:100,n:'0',o:'',s:1,t:'0')),l:'2',m:100,n:'0',o:'',t:'0')),version:4)
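
To make the size of the gap concrete, here is a small sketch of the arithmetic. It assumes both figures are reciprocal throughputs (average cycles per instruction for a stream of independent instructions), which is how llvm-mca defines RThroughput and how the 0.50 figure is read here. The function name is illustrative only.

```python
# Assumption: both numbers are reciprocal throughputs, i.e. cycles per
# instruction when many independent copies are issued back to back.
measured_rthroughput = 0.50  # uops.info / Agner figure quoted in this report
modeled_rthroughput = 1.00   # llvm-mca RThroughput quoted in this report


def instructions_per_cycle(rthroughput: float) -> float:
    """Invert a reciprocal throughput to get sustained instructions per cycle."""
    return 1.0 / rthroughput


measured_ipc = instructions_per_cycle(measured_rthroughput)
modeled_ipc = instructions_per_cycle(modeled_rthroughput)
# How many times slower the model is than the measurement.
slowdown = modeled_rthroughput / measured_rthroughput

print(f"measured: {measured_ipc:.1f} instr/cycle")
print(f"modeled:  {modeled_ipc:.1f} instr/cycle")
print(f"model is pessimistic by a factor of {slowdown:.1f}")
```

Under that reading, the measurements imply two of these instructions per cycle, while the znver4 model predicts one.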

<details>
<summary>Version</summary>

```
LLVM (http://llvm.org/):
  LLVM version 17.0.0git
  Optimized build.
  Default target: x86_64-unknown-linux-gnu
  Host CPU: skylake-avx512

  Registered Targets:
    aarch64     - AArch64 (little endian)
    aarch64_32  - AArch64 (little endian ILP32)
    aarch64_be  - AArch64 (big endian)
    amdgcn      - AMD GCN GPUs
    arm         - ARM
    arm64       - ARM64 (little endian)
    arm64_32    - ARM64 (little endian ILP32)
    armeb       - ARM (big endian)
    avr         - Atmel AVR Microcontroller
    bpf         - BPF (host endian)
    bpfeb       - BPF (big endian)
    bpfel       - BPF (little endian)
    hexagon     - Hexagon
    lanai       - Lanai
    loongarch32 - 32-bit LoongArch
    loongarch64 - 64-bit LoongArch
    mips        - MIPS (32-bit big endian)
    mips64      - MIPS (64-bit big endian)
    mips64el    - MIPS (64-bit little endian)
    mipsel      - MIPS (32-bit little endian)
    msp430      - MSP430 [experimental]
    nvptx       - NVIDIA PTX 32-bit
    nvptx64     - NVIDIA PTX 64-bit
    ppc32       - PowerPC 32
    ppc32le     - PowerPC 32 LE
    ppc64       - PowerPC 64
    ppc64le     - PowerPC 64 LE
    r600        - AMD GPUs HD2XXX-HD6XXX
    riscv32     - 32-bit RISC-V
    riscv64     - 64-bit RISC-V
    sparc       - Sparc
    sparcel     - Sparc LE
    sparcv9     - Sparc V9
    systemz     - SystemZ
    thumb       - Thumb
    thumbeb     - Thumb (big endian)
    ve          - VE
    wasm32      - WebAssembly 32-bit
    wasm64      - WebAssembly 64-bit
    x86         - 32-bit X86: Pentium-Pro and above
    x86-64      - 64-bit X86: EM64T and AMD64
    xcore       - XCore
Compiler returned: 0
```
</details>