--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Katsunori Kumatani from comment #0)

> Things to note:
> This happens on GCC 6 and up to 7 only, GCC 5.4 generates correct output.
> This happens once you turn on the -fschedule-insns option. So it's a bug
> there.
> If you remove the __restrict__ from the pointer in foo's parameter, the
> problem is gone.
> Using "asm volatile" instead of "asm" in memset_test generates correct code.
> Using "memory" clobber in that asm also generates correct code.
> Most of these workarounds are not valid in this context because they DISABLE
> the optimizations, so it's like preventing the problem from popping up
> instead of solving it. "memory" clobber is obviously the worst solution by
> far as it will kill any cached memory in registers. "asm volatile" is
> probably the least bad workaround, __restrict__ is definitely useful for
> same types the compiler can't otherwise know they won't alias.

"memory" clobber is the correct solution here, as it represents an implied
compiler barrier. Without it, the compiler is free to schedule loads and stores
around the "rep stosb".

IOW, it is exactly the loads and stores that keep "cached memory in registers"
that can be scheduled around the "rep stosb" when the "memory" clobber is
missing.
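For illustration, a minimal sketch of what such a wrapper could look like with
the "memory" clobber in place. This is not the testcase from comment #0: the
names fill_bytes and store_fill_load are hypothetical, the non-x86 fallback
loop exists only so the sketch builds elsewhere, and the "volatile" qualifier
is an extra precaution of this sketch (so the asm is not discarded when its
outputs are unused); the clobber is what tells the compiler that memory is
written.

```c
#include <stddef.h>

/* Hypothetical helper (not the reporter's memset_test): fill count bytes
   at dst with value using "rep stosb". */
static inline void fill_bytes(void *dst, unsigned char value, size_t count)
{
#if defined(__x86_64__) || defined(__i386__)
    /* "+D" and "+c": the destination pointer and the counter are both read
       and modified by the instruction.  "a": the fill byte goes in AL.
       "memory": the asm writes memory not named in its operands, so the
       compiler must not keep values cached in registers across it or move
       loads and stores over it. */
    __asm__ volatile ("rep stosb"
                      : "+D" (dst), "+c" (count)
                      : "a" (value)
                      : "memory");
#else
    /* Fallback so the sketch also builds on non-x86 targets. */
    unsigned char *p = dst;
    while (count--)
        *p++ = value;
#endif
}

/* Hypothetical caller: store, fill, then read back through a
   __restrict__ pointer. */
static int store_fill_load(int *__restrict__ p)
{
    p[0] = 42;                        /* store before the fill */
    fill_bytes(p, 0, sizeof(int));    /* overwrites p[0] with zero bytes */
    return p[0];                      /* has to be reloaded after the asm */
}
```

With the clobber, the store to p[0] cannot sink below the asm and the final
load cannot be satisfied from a register that still holds 42.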
