Issue 109186
Summary -mvzeroupper
Labels new issue
Assignees
Reporter fbarchard
In XNNPack we have a mix of SSE2 and AVX2 code. Is _mm256_zeroupper() still required in all AVX and AVX512 code?

In this review, vzeroupper is added automatically (Nov 2019, clang 9.01):
https://reviews.llvm.org/D69786

I found 7 microkernels in XNNPack that call _mm256_zeroupper() and require it. I removed the call and ran perf stat on Skylake.
It shows no sse-avx-assists.
Disassembling one of the AVX kernels shows that clang added the vzeroupper itself.

But on godbolt, a simple AVX function does not get a vzeroupper:
https://godbolt.org/z/5E4KKc6Kv
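
For reference, here is a minimal sketch of the kind of kernel in question. It is not the code behind the godbolt link and not an XNNPack microkernel; the names add_f32 and add_f32_avx are made up for illustration. It assumes GCC or clang, since it relies on the target("avx") function attribute and __builtin_cpu_supports, and it falls back to scalar code elsewhere.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical AVX kernel: out[i] = a[i] + b[i].
   target("avx") enables AVX for this one function (GCC/clang extension),
   so the file does not need to be built with -mavx. */
__attribute__((target("avx")))
static void add_f32_avx(const float* a, const float* b, float* out, size_t n) {
  size_t i = 0;
  for (; i + 8 <= n; i += 8) {
    const __m256 va = _mm256_loadu_ps(a + i);
    const __m256 vb = _mm256_loadu_ps(b + i);
    _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
  }
  /* Scalar tail for the remaining n % 8 elements. */
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
  /* Explicit request to clear the upper halves of the YMM registers before
     returning to code that may use legacy SSE encodings. This is the call
     whose necessity is being asked about. */
  _mm256_zeroupper();
}
#endif

/* Dispatcher: uses the AVX kernel when the CPU reports AVX support,
   otherwise plain scalar code. */
void add_f32(const float* a, const float* b, float* out, size_t n) {
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx")) {
    add_f32_avx(a, b, out, n);
    return;
  }
#endif
  for (size_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```

Deleting the _mm256_zeroupper() line and searching the disassembly for vzeroupper (for example with objdump -d) is one way to check what a given compiler and version does on its own.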

I want to test different compilers and versions to see which ones require an explicit _mm256_zeroupper().
Is this a bug?
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
