Issue 174871
Summary [X86] X86CompressEVEX: Incorrect VPMOVB2M + KMOV -> VPMOVMSKB transformation causes incorrect results
Labels new issue
Assignees
Reporter aneshlya
Commit 1caf2704dd6791baa4b958d6a666ea64ec24795d ("[X86] Allow EVEX compression for mask registers (#171980)") introduces a regression that causes incorrect code generation for AVX-512 vectorized code.

The transformation attempts to compress the following pattern:
```asm
vpmov*2m %xmm0, %k0     ->  (erase)
kmov* %k0, %eax         ->  vmovmsk* %xmm0, %eax
```
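As a reference point for triage, here is a minimal scalar sketch of what each sequence writes to `%eax` for a 16-byte XMM input, following the documented instruction semantics. The function names are hypothetical, and the sketch models only the general-purpose register result: the mask register `%k0` and any later readers of it are deliberately left out. Under that assumption the two sequences agree, so the sketch does not reproduce the failure by itself; it only narrows down where the difference could come from.

```python
# Scalar model of the byte-mask instructions on a 128-bit (16-byte) input.
# Assumption: only the value that ends up in %eax is modeled. The mask
# register %k0, and anything that reads it later, is outside this sketch.

def vpmovb2m_xmm(src: bytes) -> int:
    """vpmovb2m %xmm, %k: mask bit i is the most significant bit of byte i."""
    assert len(src) == 16
    mask = 0
    for i, b in enumerate(src):
        mask |= (b >> 7) << i
    return mask

def kmovd_to_gpr(k: int) -> int:
    """kmovd %k, %r32: copy the low 32 mask bits, zero-extended."""
    return k & 0xFFFFFFFF

def vpmovmskb_xmm(src: bytes) -> int:
    """vpmovmskb %xmm, %r32: bit i is the MSB of byte i, upper bits are zero."""
    assert len(src) == 16
    return sum((b >> 7) << i for i, b in enumerate(src))

def before(src: bytes) -> int:
    """Uncompressed sequence: vpmovb2m followed by kmovd."""
    return kmovd_to_gpr(vpmovb2m_xmm(src))

def after(src: bytes) -> int:
    """Compressed sequence: a single vpmovmskb."""
    return vpmovmskb_xmm(src)

# Compare the two models on inputs with exactly one byte's MSB set.
for lane in range(16):
    src = bytes(0x80 if i == lane else 0x00 for i in range(16))
    assert before(src) == after(src) == 1 << lane
```

Note that the `andl $65534, %eax` in the listings below is a mask of `0xFFFE`, which clears bit 0 of this result before the branch.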

However, this transformation produces incorrect results in certain code patterns involving masked operations in loops on AVX-512 targets (SKX and newer).

## Assembly Difference

**Before (correct) - commit b8f5cbba2abb:**
```asm
switchit___vyi:
# %bb.0:
    vpsllw      $7, %xmm1, %xmm1
    vpmovb2m    %xmm1, %k0          ; <-- AVX-512 instruction
    kmovd       %k0, %eax           ; <-- separate move from mask reg
    andl        $65534, %eax
    je          .LBB3_1
    ...
```

**After (incorrect) - commit 1caf2704dd67:**
```asm
switchit___vyi:
# %bb.0:
    vpsllw      $7, %xmm1, %xmm1
    vpmovmskb   %xmm1, %eax         ; <-- Compressed to single instruction
    andl        $65534, %eax
    je          .LBB3_1
    ...
```

Compiler Explorer link: https://ispc.godbolt.org/z/59Mnnvavf

The test used in the reproducer produces incorrect results at runtime:

| Lane | Expected | Actual |
|------|----------|--------|
| 2 | 4.0 | 2.0 |
| 3 | 9.0 | 3.0 |
| 4 | 24.0 | 6.0 |
| 5 | 35.0 | 7.0 |
| 6 | 48.0 | 8.0 |
| 7 | 63.0 | 9.0 |
| 8 | 144.0 | 18.0 |
| ... | ... | ... |
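
The listed rows share a pattern: in each one, the expected value equals the actual value multiplied by the lane number. The short check below uses only the rows shown above (the elided rows are not included). Reading this as a per-lane multiply that was skipped is an assumption, not something the reproducer confirms.

```python
# Rows copied from the table above: (lane, expected, actual).
rows = [
    (2, 4.0, 2.0),
    (3, 9.0, 3.0),
    (4, 24.0, 6.0),
    (5, 35.0, 7.0),
    (6, 48.0, 8.0),
    (7, 63.0, 9.0),
    (8, 144.0, 18.0),
]

def ratios(table):
    """Return (lane, expected / actual) for each row."""
    return [(lane, expected / actual) for lane, expected, actual in table]

# For every listed row the ratio equals the lane number.
for lane, ratio in ratios(rows):
    assert ratio == lane
    print(f"lane {lane}: expected/actual = {ratio}")
```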