Issue 114959
Summary Missing optimization in zip_float
Labels new issue
Assignees
Reporter junaire
Source:
```c
#include <immintrin.h>

void zip_float(const double *src, double *dst) {
    __m256d s0 = _mm256_broadcast_pd((__m128d*)src);
    __m256d s1 = _mm256_broadcast_pd((__m128d*)src + 2);
    __m256d s = _mm256_shuffle_pd(s0, s1, 0xc);
    s = _mm256_mul_pd(s, s);
    _mm256_store_pd(dst, s);
}
```
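
For readers comparing the two listings below, here is a scalar sketch of what the intrinsics above appear to compute. It assumes the documented semantics of `_mm256_broadcast_pd` and `_mm256_shuffle_pd` with immediate `0xc`, and that `(__m128d*)src + 2` addresses `src[4]` and `src[5]`. The name `zip_float_scalar` is hypothetical and not part of the original report.

```c
/* Hypothetical scalar reference, not from the original report.
 * s0 = {src[0], src[1], src[0], src[1]}
 * s1 = {src[4], src[5], src[4], src[5]}
 * shuffle with imm 0xc (0b1100) selects {s0[0], s1[0], s0[3], s1[3]},
 * i.e. {src[0], src[4], src[1], src[5]}; each lane is then squared. */
void zip_float_scalar(const double *src, double *dst) {
    double lanes[4] = { src[0], src[4], src[1], src[5] };
    for (int i = 0; i < 4; i++) {
        dst[i] = lanes[i] * lanes[i];
    }
}
```

Under these assumptions, both compiler outputs should produce the same four values. The report concerns the instruction sequence used to build the interleaved vector, not its result.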

LLVM:
```
zip_float:
        vmovupd xmm0, xmmword ptr [rdi]
        vmovupd xmm1, xmmword ptr [rdi + 32]
        vunpcklpd       xmm2, xmm0, xmm1
        vunpckhpd       xmm0, xmm0, xmm1
        vinsertf128     ymm0, ymm2, xmm0, 1
        vmulpd  ymm0, ymm0, ymm0
        vmovapd ymmword ptr [rsi], ymm0
        vzeroupper
        ret
```

GCC:
```
zip_float:
        vbroadcastf128  ymm0, XMMWORD PTR [rdi]
        vbroadcastf128  ymm1, XMMWORD PTR [rdi+32]
        vshufpd ymm0, ymm0, ymm1, 12
        vmulpd  ymm0, ymm0, ymm0
        vmovapd YMMWORD PTR [rsi], ymm0
        vzeroupper
        ret
```

Godbolt: https://godbolt.org/z/ffz1YEhPE
Tweeted by FFmpeg: https://x.com/FFmpeg/status/1853326818008514900