Hi!

As mentioned in that PR, we have an SI->DImode zero extension, and RA happens to choose to zero extend from a SImode memory slot that is the low part of the DImode memory slot into which the zero extension is to be stored. Unfortunately, the RTL DSE part has no infrastructure to remember and, if needed, invalidate loads; it only remembers stores. Handling this generically is therefore quite unlikely, at least for GCC 9.
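To make the redundancy concrete, here is a toy C model of the situation described above. It is only a sketch, not GCC code: the `slot64` type and `zext_low_half_in_place` are hypothetical names, and it assumes the extension ends up as one SImode load followed by SImode stores to the low and high halves of the slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a DImode stack slot as two SImode halves,
   low half first as on little-endian x86.  Not GCC code.  */
typedef struct { uint32_t lo, hi; } slot64;

/* Zero extend the low half of *s into the whole slot, assuming the
   extension is carried out as one SImode load plus two SImode stores.
   Returns nonzero if the low-half store changed memory.  */
static int
zext_low_half_in_place (slot64 *s)
{
  assert (s);
  uint32_t reg = s->lo;     /* reg = mem */
  uint32_t before = s->lo;  /* remember the old contents for comparison */
  s->lo = reg;              /* mem = reg: writes back what was just read */
  s->hi = 0;                /* the only store that does real work */
  return s->lo != before;
}
```

Whatever the slot held beforehand, the function returns 0 and leaves `lo` untouched, so the low-half store is the one that can be dropped.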
This patch just handles that through a peephole2 (other option would be to handle it in the define_split for the zero extension, but the peephole2 is likely to catch more things).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-01-07  Jakub Jelinek  <ja...@redhat.com>

	PR rtl-optimization/79593
	* config/i386/i386.md (reg = mem; mem = reg): New define_peephole2.

--- gcc/config/i386/i386.md.jj	2019-01-01 12:37:31.564738571 +0100
+++ gcc/config/i386/i386.md	2019-01-07 17:11:21.056392168 +0100
@@ -18740,6 +18740,21 @@ (define_peephole2
 				       const0_rtx);
 })
 
+;; Attempt to optimize away memory stores of values the memory already
+;; has.  See PR79593.
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+        (match_operand 1 "memory_operand"))
+   (set (match_dup 1) (match_dup 0))]
+  "REG_P (operands[0])
+   && !STACK_REGNO_P (operands[0])
+   && !MEM_VOLATILE_P (operands[1])"
+  [(set (match_dup 0) (match_dup 1))]
+{
+  if (peep2_reg_dead_p (1, operands[0]))
+    DONE;
+})
+
 ;; Attempt to always use XOR for zeroing registers (including FP modes).
 (define_peephole2
   [(set (match_operand 0 "general_reg_operand")

	Jakub