On Sun, Jun 22, 2025 at 1:32 PM Jan Hubicka <hubi...@ucw.cz> wrote:
>
> > Since there is
> >
> > /* X86_TUNE_SPLIT_LONG_MOVES: Avoid instructions moving immediates
> >    directly to memory.  */
> > DEF_TUNE (X86_TUNE_SPLIT_LONG_MOVES, "split_long_moves", m_PPRO)
>
> If I recall correctly, this tune was added for PentiumPro which had
> problem decoding moves with long immediate and is a performance
> optimization rather than code size one.
>
> As you discuss in the PR, i686 prefers
>   xorl %eax, %eax
>   movl %eax, aligned_heap_area
> over
>   movl $0, aligned_heap_area
> which has too long encoding.
> >
> > to avoid long immediate store instructions, like
> >
> >    c7 02 00 00 00 00    movl   $0x0,(%rdx)
> >    c7 02 ff ff ff ff    movl   $0xffffffff,(%rdx)
> >
> > add TARGET_USE_AND0_ORM1_STORE and enable *mov<mode>_(and|or) for
> > TARGET_USE_AND0_ORM1_STORE, which is true for TARGET_SPLIT_LONG_MOVES or
> > -Oz, to also generate:
> >
> >    83 22 00             andl   $0x0,(%rdx)
> >    83 0a ff             orl    $0xffffffff,(%rdx)
>
> I think this will not work well on PPro hardware since it will do it as
> read-modify-write, while

Since read-modify-write is enabled for PentiumPro:

/* X86_TUNE_READ_MODIFY_WRITE: Enable use of read modify write instructions
   such as "add $1, mem".  */
DEF_TUNE (X86_TUNE_READ_MODIFY_WRITE, "read_modify_write",
          ~(m_PENT | m_LAKEMONT))

should this

/* Generate "and $0,mem" and "or $-1,mem", instead of "mov $0,mem" and
   "mov $-1,mem" with shorter encoding for TARGET_SPLIT_LONG_MOVES with
   TARGET_READ_MODIFY_WRITE or -Oz.  */
#define TARGET_USE_AND0_ORM1_STORE \
  ((TARGET_SPLIT_LONG_MOVES && TARGET_READ_MODIFY_WRITE) \
   || (optimize_insn_for_size_p () && optimize_size > 1))

work?

>   xorl %eax, %eax
>   movl %eax, (%rdx)
> is executed as store saving the useless load.
>
> Honza

--
H.J.