On Sun, Jun 22, 2025 at 2:57 PM Jan Hubicka <hubi...@ucw.cz> wrote:
>
> > This contradicts
> >
> > /* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
> >    such as "add $1, mem".  */
> > DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
> >           ~(m_PENT | m_LAKEMONT))
> >
> > which enables "andl $0, (%edx)" for PentiumPro.   "andl $0, (%edx)" works
> > well on PentiumPro.
>
> It is also enabled for Zen, but that does not mean that andl $0, (%edx)
> is a good way of clearing memory when optimizing for speed.
>
> jan@padlo:/tmp> cat t.c
> int mem;
> int
> main()
> {
>         for (int i = 0; i < 1000000000; i++)
> #ifdef AND
>                 asm volatile ("andl $0, %0":"=m"(mem));
> #else
> #ifdef SPLIT
>                 asm volatile ("xorl %%eax, %%eax; movl $0, %0":"=m"(mem)::"eax");
> #else
>                 asm volatile ("movl $0, %0":"=m"(mem));
> #endif
> #endif
>         return 0;
> }
> jan@padlo:/tmp> gcc  -O2 t.c ; time ./a.out
>
> real    0m0.405s
> user    0m0.403s
> sys     0m0.002s
> jan@padlo:/tmp> gcc  -O2 -DSPLIT t.c ; time ./a.out
>
> real    0m0.406s
> user    0m0.404s
> sys     0m0.001s
> jan@padlo:/tmp> gcc  -O2 -DAND t.c ; time ./a.out
>
> real    0m2.824s
> user    0m2.822s
> sys     0m0.001s
>
> andl is slower than movl because it introduces an unnecessary memory read.
> I don't have a PentiumPro to test, but there the -DSPLIT variant should be
> a bit better, since the instruction exceeds 7 bytes.
>
> Looking into the history of that knob, it was added by me in
> https://gcc.gnu.org/pipermail/gcc-patches/1999-July/014219.html
>
> to control the behaviour of the splitter that split the move if it was
> longer than 7 bytes, which implemented the following recommendation of
> the Intel optimization manual:
>
> "Avoid instructions that contain four or more micro-ops or instructions
> that are more than seven bytes long. If possible, use instructions that
> require one micro-op"
>
> So the comment on SPLIT_LONG_MOVES is a bit incorrect: it does not mention
> that the move needs to exceed the long_insn threshold.
>
> I am not sure how much we need to care about PPro performance these days
> though.
>
> Honza

You are right.  I withdrew this patch.

-- 
H.J.
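
For anyone who wants to repeat the measurement quoted above without
recompiling per variant, here is a minimal sketch in C. It is not the
test from the thread: the names time_clear, clear_kind, CLEAR_MOV and
CLEAR_AND are hypothetical, the andl operand uses "+m" rather than "=m"
on the assumption that the read of the old value should be visible to
the compiler, and non-x86 targets fall back to a plain volatile store.

```c
/* Sketch only: time ITERS stores of zero to a global, either as a plain
   store or as a read-modify-write AND.  All names are hypothetical.  */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <time.h>

int mem;

enum clear_kind { CLEAR_MOV, CLEAR_AND };

/* Difference between two timestamps, in seconds.  */
static double
elapsed_seconds (struct timespec a, struct timespec b)
{
  return (double) (b.tv_sec - a.tv_sec)
         + (double) (b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Clear MEM ITERS times with the chosen instruction and return the
   wall-clock time spent in the loop.  */
double
time_clear (enum clear_kind kind, long iters)
{
  struct timespec start, end;

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (long i = 0; i < iters; i++)
    {
#if defined (__i386__) || defined (__x86_64__)
      if (kind == CLEAR_AND)
        /* "+m": AND reads the old value before writing the result.  */
        __asm__ volatile ("andl $0, %0" : "+m" (mem) : : "cc");
      else
        /* "=m": a plain store never reads the destination.  */
        __asm__ volatile ("movl $0, %0" : "=m" (mem));
#else
      /* Assumption: no x86 asm available, so only the store is timed.  */
      (void) kind;
      *(volatile int *) &mem = 0;
#endif
    }
  clock_gettime (CLOCK_MONOTONIC, &end);
  return elapsed_seconds (start, end);
}
```

Calling time_clear (CLEAR_MOV, n) and time_clear (CLEAR_AND, n) from
main with the same n and printing both results gives the comparison in
one run. The per-iteration branch on kind costs the same in both cases,
so it should not change which variant comes out slower.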
