| Issue |
109186
|
| Summary |
-mvzeroupper
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
fbarchard
|
In XNNPack we have a mix of SSE2 and AVX2 code. Is _mm256_zeroupper() still required in all AVX and AVX512 code?
According to this review (Nov 2019, clang 9.01), vzeroupper is added automatically:
https://reviews.llvm.org/D69786
I found 7 microkernels in XNNPack that call _mm256_zeroupper() and require it. I removed the call and ran perf stat on Skylake.
It showed no sse-avx-assists.
Disassembling one of the AVX kernels shows that clang added the vzeroupper itself.
But on godbolt, a simple AVX function does not generate a vzeroupper:
https://godbolt.org/z/5E4KKc6Kv
I wanted to test different compilers and versions to see which ones require an explicit _mm256_zeroupper().
Is this a bug?
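
For reference, here is a minimal sketch of the kind of function being compared. It is a hypothetical kernel, not one of the XNNPack microkernels and not the code behind the godbolt link. The name add_f32_avx and its signature are assumptions made for illustration. The target attribute assumes GCC or Clang and lets the file build without -mavx.

```c
#include <immintrin.h>
#include <stddef.h>

/* Hypothetical AVX kernel (illustration only, not from XNNPack):
 * out[i] = a[i] + b[i] for n floats. */
__attribute__((target("avx")))
void add_f32_avx(const float* a, const float* b, float* out, size_t n) {
  size_t i = 0;
  /* Main loop: 8 floats per iteration in 256-bit YMM registers. */
  for (; i + 8 <= n; i += 8) {
    const __m256 va = _mm256_loadu_ps(a + i);
    const __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
  /* Scalar tail for the remaining 0..7 elements. */
  for (; i < n; i++) {
    out[i] = a[i] + b[i];
  }
  /* Explicit request to clear the upper halves of the YMM registers
   * before returning to a caller that may run legacy SSE code.
   * Whether the compiler would emit vzeroupper without this call is
   * exactly the open question in this report. */
  _mm256_zeroupper();
}
```

Compiling this with and without the _mm256_zeroupper() line and searching the generated assembly for vzeroupper is one way to compare compilers and versions.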