Re: [PATCH] dse: Use SUBREG_REG for copy_to_mode_reg in DSE replace_read for WORD_REGISTER_OPERATIONS targets [PR109040]
On 4/18/23 07:47, Jakub Jelinek wrote: On Tue, Apr 18, 2023 at 07:35:44AM -0600, Jeff Law wrote: On 4/18/23 03:06, Jakub Jelinek wrote: Hi! While we've agreed this is not the right fix for the PR109040 bug, the patch clearly improves generated code (at least on the testcase from the PR), so I'd like to propose this as optimization heuristics improvement for GCC 14. Ok for trunk? 2023-04-18 Jakub Jelinek PR target/109040 * dse.cc (replace_read): If read_reg is a SUBREG of a word mode REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into a new REG rather than the SUBREG. Doesn't the new behavior need to be conditional on can_create_pseudos_p since the call to copy_to_mode_reg can ultimately call gen_reg_rtx. Why? copy_to_mode_reg was used before as well and it unconditionally does rtx temp = gen_reg_rtx (mode); as the first thing in the function. So something must be guarding earlier, which is fine. It was a general question as it wasn't obvious what would happen post-reload. OK for the trunk. jeff
Re: [PATCH] dse: Use SUBREG_REG for copy_to_mode_reg in DSE replace_read for WORD_REGISTER_OPERATIONS targets [PR109040]
On Tue, Apr 18, 2023 at 07:35:44AM -0600, Jeff Law wrote: > > > On 4/18/23 03:06, Jakub Jelinek wrote: > > Hi! > > > > While we've agreed this is not the right fix for the PR109040 bug, > > the patch clearly improves generated code (at least on the testcase from the > > PR), so I'd like to propose this as optimization heuristics improvement > > for GCC 14. > > > > Ok for trunk? > > > > 2023-04-18 Jakub Jelinek > > > > PR target/109040 > > * dse.cc (replace_read): If read_reg is a SUBREG of a word mode > > REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into > > a new REG rather than the SUBREG. > Doesn't the new behavior need to be conditional on can_create_pseudos_p > since the call to copy_to_mode_reg can ultimately call gen_reg_rtx. Why? copy_to_mode_reg was used before as well and it unconditionally does rtx temp = gen_reg_rtx (mode); as the first thing in the function. So, if replace_read is used during post-RA DSE instance, it would already ICE before. All the patch changes is it will in some cases do copy_to_mode_reg on SUBREG_REG and create a new SUBREG instead of doing copy_to_mode_reg on the original. I think /* No place to keep the value after ra. */ && !reload_completed in record_store prevents recording the rhs values in DSE2 and so replace_read should never trigger there. Jakub
Re: [PATCH] dse: Use SUBREG_REG for copy_to_mode_reg in DSE replace_read for WORD_REGISTER_OPERATIONS targets [PR109040]
On 4/18/23 03:06, Jakub Jelinek wrote: Hi! While we've agreed this is not the right fix for the PR109040 bug, the patch clearly improves generated code (at least on the testcase from the PR), so I'd like to propose this as optimization heuristics improvement for GCC 14. Ok for trunk? 2023-04-18 Jakub Jelinek PR target/109040 * dse.cc (replace_read): If read_reg is a SUBREG of a word mode REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into a new REG rather than the SUBREG. And OK after updating the conditional. jeff
Re: [PATCH] dse: Use SUBREG_REG for copy_to_mode_reg in DSE replace_read for WORD_REGISTER_OPERATIONS targets [PR109040]
On 4/18/23 03:06, Jakub Jelinek wrote: Hi! While we've agreed this is not the right fix for the PR109040 bug, the patch clearly improves generated code (at least on the testcase from the PR), so I'd like to propose this as optimization heuristics improvement for GCC 14. Ok for trunk? 2023-04-18 Jakub Jelinek PR target/109040 * dse.cc (replace_read): If read_reg is a SUBREG of a word mode REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into a new REG rather than the SUBREG. Doesn't the new behavior need to be conditional on can_create_pseudos_p since the call to copy_to_mode_reg can ultimately call gen_reg_rtx. jeff
[PATCH] dse: Use SUBREG_REG for copy_to_mode_reg in DSE replace_read for WORD_REGISTER_OPERATIONS targets [PR109040]
Hi! While we've agreed this is not the right fix for the PR109040 bug, the patch clearly improves generated code (at least on the testcase from the PR), so I'd like to propose this as optimization heuristics improvement for GCC 14. Ok for trunk? 2023-04-18 Jakub Jelinek PR target/109040 * dse.cc (replace_read): If read_reg is a SUBREG of a word mode REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into a new REG rather than the SUBREG. --- gcc/dse.cc.jj 2023-01-02 09:32:50.369880943 +0100 +++ gcc/dse.cc 2023-04-04 22:17:22.906347794 +0200 @@ -2012,7 +2012,19 @@ replace_read (store_info *store_info, in } /* Force the value into a new register so that it won't be clobbered between the store and the load. */ - read_reg = copy_to_mode_reg (read_mode, read_reg); + if (WORD_REGISTER_OPERATIONS + && GET_CODE (read_reg) == SUBREG + && REG_P (SUBREG_REG (read_reg)) + && GET_MODE (SUBREG_REG (read_reg)) == word_mode) +{ + /* For WORD_REGISTER_OPERATIONS with subreg of word_mode register +force SUBREG_REG into a new register rather than the SUBREG. */ + rtx r = copy_to_mode_reg (word_mode, SUBREG_REG (read_reg)); + read_reg = shallow_copy_rtx (read_reg); + SUBREG_REG (read_reg) = r; +} + else +read_reg = copy_to_mode_reg (read_mode, read_reg); insns = get_insns (); end_sequence (); Jakub