http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231
--- Comment #3 from Thiago Macieira <thiago at kde dot org> 2012-08-11 22:36:20 UTC ---
Another note: it appears the Intel compiler has the same bug. It produces the
following code when compiling with -O2 -ipo:

0000000000000340 <my_bzero>:
 340:   dec    %rsi
 343:   mov    0x2001ae(%rip),%rax        # 2004f8 <_DYNAMIC+0xe0>
 34a:   vpxor  %xmm0,%xmm0,%xmm0
 34e:   cmpl   $0x0,(%rax)
 351:   je     36c <my_bzero+0x2c>
 353:   cmp    $0xffffffffffffffff,%rsi
 357:   je     383 <my_bzero+0x43>
 359:   dec    %rsi
 35c:   vmovntdq %xmm0,(%rdi)
 360:   add    $0x10,%rdi
 364:   cmp    $0xffffffffffffffff,%rsi
 368:   jne    359 <my_bzero+0x19>
 36a:   jmp    383 <my_bzero+0x43>
 36c:   cmp    $0xffffffffffffffff,%rsi
 370:   je     383 <my_bzero+0x43>
 372:   dec    %rsi
 375:   vmovntdq %xmm0,(%rdi)
 379:   add    $0x10,%rdi
 37d:   cmp    $0xffffffffffffffff,%rsi
 381:   jne    372 <my_bzero+0x32>
 383:   retq
 384:   nopl   0x0(%rax,%rax,1)
 389:   nopl   0x0(%rax)

Note, additionally, that there's an instruction-scheduling issue: a VPXOR
instruction was scheduled before the test of the CPU features, so it executes
unconditionally, even on a CPU without AVX.
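For readers without the test case at hand, here is a rough sketch of the kind of runtime dispatch the disassembly above suggests. It is not the original source from this report. The helper names bzero_sse2 and bzero_avx are hypothetical, and the use of __builtin_cpu_supports and the per-function target attribute assumes a reasonably recent GCC or Clang targeting x86-64. The intent is that VEX-encoded instructions (vpxor, vmovntdq) appear only behind the feature check.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics: _mm_setzero_si128, _mm_stream_si128 */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical baseline path: built with the default ISA (SSE2 on x86-64). */
static void bzero_sse2(__m128i *dst, size_t n)
{
    const __m128i zero = _mm_setzero_si128();
    while (n--)
        _mm_stream_si128(dst++, zero);  /* non-temporal 16-byte store */
}

/* Hypothetical AVX path: the target attribute allows VEX encodings
   inside this function only. */
__attribute__((target("avx")))
static void bzero_avx(__m128i *dst, size_t n)
{
    const __m128i zero = _mm_setzero_si128();
    while (n--)
        _mm_stream_si128(dst++, zero);  /* non-temporal 16-byte store */
}

/* Zero n 16-byte blocks starting at dst; dst must be 16-byte aligned. */
void my_bzero(void *dst, size_t n)
{
    assert(((uintptr_t)dst & 15) == 0);
    if (__builtin_cpu_supports("avx"))  /* runtime CPU feature test */
        bzero_avx((__m128i *)dst, n);
    else
        bzero_sse2((__m128i *)dst, n);
    _mm_sfence();  /* order the streaming stores before later stores */
}
```

Running objdump -d on the resulting object file shows whether any VEX-encoded instruction ended up ahead of the feature test, which is the symptom described in this comment.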