https://bugs.llvm.org/show_bug.cgi?id=51854
Bug ID: 51854
Summary: memset with length 2^N where N=2..7 is vectorized even
with -Oz enabled
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedb...@nondot.org
Reporter: vdse...@gmail.com
CC: craig.top...@gmail.com, llvm-bugs@lists.llvm.org,
llvm-...@redking.me.uk, pengfei.w...@intel.com,
spatel+l...@rotateright.com
memset is vectorized even with -Oz and -Os when the length is 2^N for N=2..7
(that is, 4 to 128 bytes). gcc does not do this. Vectorizing this code is fine
at -O3, but it should not happen at -Oz.
Source:
#include <string.h>

void func(int *P) {
  memset(P, 0, 128);
}
Clang's output with -Oz (trunk, https://godbolt.org/z/a6vjjxKhz):
func(int*, int): # @func(int*, int)
  xorps xmm0, xmm0
  movups xmmword ptr [rdi + 112], xmm0
  movups xmmword ptr [rdi + 96], xmm0
  movups xmmword ptr [rdi + 80], xmm0
  movups xmmword ptr [rdi + 64], xmm0
  movups xmmword ptr [rdi + 48], xmm0
  movups xmmword ptr [rdi + 32], xmm0
  movups xmmword ptr [rdi + 16], xmm0
  movups xmmword ptr [rdi], xmm0
  ret
If the length is greater than 128, clang with -Oz/-Os generates a tail call to
memset instead (shown here for length 256):
func(int*, int): # @func(int*, int)
  mov edx, 256
  xor esi, esi
  jmp memset@PLT # TAILCALL
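To compare every length named in the summary against the larger case, the reproducer could be extended along the following lines. This is only a sketch: the `DEFINE_ZERO` macro and the `zero_N` function names are hypothetical and do not appear in the original test case. Compiling it with `clang -Oz -S` should show which lengths are expanded inline and which become a call to memset.

```c
#include <string.h>

/* Hypothetical helper macro (not from the report): defines one function per
   constant length so that each memset expansion can be inspected separately. */
#define DEFINE_ZERO(N) void zero_##N(void *p) { memset(p, 0, N); }

DEFINE_ZERO(4)    /* 2^2, smallest length named in the summary */
DEFINE_ZERO(8)
DEFINE_ZERO(16)
DEFINE_ZERO(32)
DEFINE_ZERO(64)
DEFINE_ZERO(128)  /* 2^7, largest length reported as expanded inline */
DEFINE_ZERO(256)  /* reported to become a tail call to memset */
```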
For gcc with -Os the output has the same shape for any length (see
https://godbolt.org/z/1shqe319r):
func(int*, int):
  mov ecx, X <-- X is the length in dwords (rep stosd stores ECX doublewords)
  xor eax, eax
  rep stosd
  ret
So with -Os and -Oz we expect clang not to vectorize here and to generate the
same code as for the case with length > 128.
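The size difference between the three sequences can be estimated with a short calculation. The sketch below is not part of the original report: the per-instruction byte counts are assumptions based on the standard x86-64 encodings, not measurements of the compiler output, and the function names are hypothetical.

```c
/* Byte counts are assumptions from the standard x86-64 instruction
   encodings; they were not measured from the compilers' actual output. */

/* movups xmmword ptr [rdi + disp], xmm0 is encoded as 0F 11 /r,
   followed by a displacement when disp != 0. */
static int movups_store_size(int disp) {
    if (disp == 0) return 3;                    /* 2 opcode bytes + ModRM */
    if (disp >= -128 && disp <= 127) return 4;  /* + 1-byte displacement */
    return 7;                                   /* + 4-byte displacement */
}

/* Inline expansion of memset(P, 0, len) for len a multiple of 16:
   one xorps, one movups per 16 bytes, and a ret. */
int inline_expansion_size(int len) {
    int size = 3;                               /* xorps xmm0, xmm0 */
    for (int disp = len - 16; disp >= 0; disp -= 16)
        size += movups_store_size(disp);
    return size + 1;                            /* ret */
}

/* mov edx, imm32 (5) + xor esi, esi (2) + jmp rel32 (5) */
int libcall_size(void) { return 5 + 2 + 5; }

/* mov ecx, imm32 (5) + xor eax, eax (2) + rep stosd (2) + ret (1) */
int rep_stos_size(void) { return 5 + 2 + 2 + 1; }
```

Under these assumptions, `inline_expansion_size(128)` comes to 35 bytes, against 12 bytes for the memset tail call and 10 bytes for the rep stosd sequence. The tail-call figure leaves out the body of memset itself, which is shared library code.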