Re: rs6000 toc reference rtl again
On Tue, May 1, 2012 at 12:17 AM, Alan Modra amo...@gmail.com wrote: * config/rs6000/predicates.md (input_operand): Don't match constant pool addresses. Remove label_ref, high and plus from match_code list. Remove redundant CONSTANT_P test. (splat_input_operand): Similarly update match_code list. (small_toc_ref): New predicate. * config/rs6000/rs6000-protos.h (toc_relative_expr_p): Update prototype. * config/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const. (legitimate_constant_pool_address_p): Move TARGET_TOC test and register checks to.. (toc_relative_expr_p): ..here. Add strict param. Match new rtl generated by create_TOC_reference. (rs6000_legitimize_address): Update cerate_TOC_reference call. (rs6000_delegitimize_address): Handle new rtl for toc refs. (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise. (use_toc_relative_ref): New function, split out from.. (rs6000_emit_move): ..here. Remove redundant tests. Update create_TOC_reference calls. (rs6000_legitimize_reload_address): Formatting. Handle splitting of medium/large model toc addresses. Use use_toc_relative_ref. (print_operand): Formatting, style. Adjust for toc changes. (print_operand_address): Likewise. (rs6000_output_addr_const_extra): Likewise. (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather than a PLUS. Use this formulation for both high and low part of -mcmodel=medium/large toc reference too. Before reload, always use the small model formulation. * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid a PLUS in high part of addresses here. (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise. (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise. (largetoc_high, largetoc_low): Move earlier. Cope when no base reg available. (largetoc_high_plus): New insn. (movsi_internal1, movsi_internal1_single, movsf_softfloat, movdi_mfpgpr, movdi_internal64): Don't handle 'R' constraint here.. (tocref): ..instead do so here, new insn and split. Okay. But please add a comment explaining toc_relative_expr_p(), especially the meaning of the new strict argument. And please wait for bootstrap to be unbroken on PPC64. Thanks, David
Re: rs6000 toc reference rtl again
On Tue, May 01, 2012 at 01:47:51PM +0930, Alan Modra wrote: This revision splits the medium/large code model toc reference after reload. Ping http://gcc.gnu.org/ml/gcc-patches/2012-05/msg6.html -- Alan Modra Australia Development Lab, IBM
Re: rs6000 toc reference rtl again
This revision splits the medium/large code model toc reference after reload. I expected this to be more difficult, but it turned out surprisingly easy. Besides creating the rtl that way, and tweaking toc_relative_expr_p to match, it was just: Move the 'R' constraint handling out of various movsi/di instructions into a new instruction that is split after reload, and modify legitimize_reload_address to handle all the other splits inside MEMs. Bootstrapped and regression tested powerpc-linux. I also intend to spec test this change. Generated code has the expected improvements: libstdc++.so saw a small reduction in code size due to removing silly uses of call-saved regs to hold the high part of a toc reference. OK to apply? * config/rs6000/predicates.md (input_operand): Don't match constant pool addresses. Remove label_ref, high and plus from match_code list. Remove redundant CONSTANT_P test. (splat_input_operand): Similarly update match_code list. (small_toc_ref): New predicate. * config/rs6000/rs6000-protos.h (toc_relative_expr_p): Update prototype. * config/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const. (legitimate_constant_pool_address_p): Move TARGET_TOC test and register checks to.. (toc_relative_expr_p): ..here. Add strict param. Match new rtl generated by create_TOC_reference. (rs6000_legitimize_address): Update cerate_TOC_reference call. (rs6000_delegitimize_address): Handle new rtl for toc refs. (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise. (use_toc_relative_ref): New function, split out from.. (rs6000_emit_move): ..here. Remove redundant tests. Update create_TOC_reference calls. (rs6000_legitimize_reload_address): Formatting. Handle splitting of medium/large model toc addresses. Use use_toc_relative_ref. (print_operand): Formatting, style. Adjust for toc changes. (print_operand_address): Likewise. (rs6000_output_addr_const_extra): Likewise. (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather than a PLUS. Use this formulation for both high and low part of -mcmodel=medium/large toc reference too. Before reload, always use the small model formulation. * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid a PLUS in high part of addresses here. (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise. (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise. (largetoc_high, largetoc_low): Move earlier. Cope when no base reg available. (largetoc_high_plus): New insn. (movsi_internal1, movsi_internal1_single, movsf_softfloat, movdi_mfpgpr, movdi_internal64): Don't handle 'R' constraint here.. (tocref): ..instead do so here, new insn and split. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 187010) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -5280,15 +5280,33 @@ constant_pool_expr_p (rtx op) ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (base), Pmode)); } -static rtx tocrel_base, tocrel_offset; +static const_rtx tocrel_base, tocrel_offset; bool -toc_relative_expr_p (rtx op) +toc_relative_expr_p (const_rtx op, bool strict) { - if (GET_CODE (op) != CONST) + if (!TARGET_TOC) return false; - split_const (op, tocrel_base, tocrel_offset); + if (TARGET_CMODEL != CMODEL_SMALL) +{ + /* Only match the low part. */ + if (GET_CODE (op) == LO_SUM + REG_P (XEXP (op, 0)) + INT_REG_OK_FOR_BASE_P (XEXP (op, 0), strict)) + op = XEXP (op, 1); + else if (strict) + return false; +} + + tocrel_base = op; + tocrel_offset = const0_rtx; + if (GET_CODE (op) == PLUS CONST_INT_P (XEXP (op, 1))) +{ + tocrel_base = XEXP (op, 0); + tocrel_offset = XEXP (op, 1); +} + return (GET_CODE (tocrel_base) == UNSPEC XINT (tocrel_base, 1) == UNSPEC_TOCREL); } @@ -5300,14 +5318,7 @@ bool legitimate_constant_pool_address_p (const_rtx x, enum machine_mode mode, bool strict) { - return (TARGET_TOC - (GET_CODE (x) == PLUS || GET_CODE (x) == LO_SUM) - GET_CODE (XEXP (x, 0)) == REG - (REGNO (XEXP (x, 0)) == TOC_REGISTER - || ((TARGET_MINIMAL_TOC - || TARGET_CMODEL != CMODEL_SMALL) - INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict))) - toc_relative_expr_p (XEXP (x, 1)) + return (toc_relative_expr_p (x, strict) (TARGET_CMODEL != CMODEL_MEDIUM || constant_pool_expr_p (XVECEXP (tocrel_base, 0, 0)) || mode == QImode @@ -5687,10 +5698,7 @@ rs6000_legitimize_address (rtx x, rtx oldx ATTRIBU GET_CODE
Re: rs6000 toc reference rtl again
Alan Modra amo...@gmail.com writes: On Tue, Apr 03, 2012 at 07:49:04PM +0100, Richard Sandiford wrote: Alan Modra amo...@gmail.com writes: Now that we are back in stage1, I'd like to apply http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to toc reference rtl in order to properly specify r2 dependencies. More commentary in that url. I'm reposting the patch here since the old one no longer applies cleanly, and I've added some ENABLE_CHECKING code in rs6000_delegitimize_address. Sorry to be a pain, but I don't think HIGH is supposed contain regs either. Both HIGH and CONST are supposed to be true constants. Eh, so the existing use of CONST is wrong then. ;-) Right :-) Sorry, I meant: although you're fixing the CONST, the same problem really applies to the HIGH too. I'm proposing (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL) for the small model, and (high (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))) (lo_sum (reg hi) (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)) for medium/large model. You can see why I'd like to keep it this way; The medium/large rtl is a natural split of the small rtl. (I'm going to experiment with splitting the small rtl after reload for medium/large to see whether that helps our usage of call-saved regs in loops.) Yeah. FWIW, MIPS keeps both the small and large forms as constants before reload, then splits them into (non-constant) forms that reference the global pointer after reload. So the small version is simply: (set (match_operand 0 register_operand =...) (match_operand 1 symbolic_constant )) before reload, while the large form is a normal HIGH/LO_SUM pair. That HIGH then gets split after reload too. It seems to work well, although these days it does require the split to happen before prologue/epilogue generation. (Hmm, the *lea_high64 patterns are probably wrong now that we have shrink-wrapping...) With the loop thing, do you mean that you're seeing too many HIGHs being hoisted? Would be nice to fix that in the loop optimisers if so. I'm not wedded to the representation, *but* we do want gcc to treat the high part as a constant. That's important because we don't ever want reload saving the high part to a stack slot! Which is what does happen if you don't somehow tell gcc it is a constant. Out of curiosity, does that still happen if you have a HIGH REG_EQUAL note attached to the addition? I'd have expected reload to convert the note into a REG_EQUIV and treat the source as a function invariant. Besides, the high part *is* a constant within any given function. So is the low part for that matter. The only reason I want r2 mentioned in this rtl is for register liveness, eg. so that a load of a function pointer (which loads r2) for an indirect call doesn't get scheduled before any uses of the old r2. Right. But that's also true of, say, a constant that needs to be split into a load-high and add. The result of the add is a function constant, but the add itself is still not a constant from an rtl perspective. I think it'd get too confusing if constants were allowed to reference registers. That sort of thing is usually handled with REG_EQUAL notes instead. But in the specific case of GOT references, where the GOT register isn't really around until prologue/epilogue generation anyway, there's less point exposing it before reload. The alternative of removing r2 from the unspec and attaching a (use (reg r2)) to all instructions that have this addressing form might be clean but will require major duplication of patterns in rs6000.md, won't it? Yeah. It'd probably also not be as effective as splitting after reload. Passes that want to optimise the load high are likely to be put off by a (use ...). Richard
Re: rs6000 toc reference rtl again
On Wed, Apr 04, 2012 at 10:25:39AM +0100, Richard Sandiford wrote: With the loop thing, do you mean that you're seeing too many HIGHs being hoisted? No, nothing as complicated as that. In a lot of cases, any hoisting of the high part is bad, because the linker nops out the high part and edits the low part insn when the r2 offset can be done in 16 bits. If the high part insn was the only use of a call-saved reg, then the function saves and restores a reg to no purpose. I'm not wedded to the representation, *but* we do want gcc to treat the high part as a constant. That's important because we don't ever want reload saving the high part to a stack slot! Which is what does happen if you don't somehow tell gcc it is a constant. Out of curiosity, does that still happen if you have a HIGH REG_EQUAL note attached to the addition? I'd have expected reload to convert the note into a REG_EQUIV and treat the source as a function invariant. I saw that possibility in reload, and did try a REG_EQUIV note before I wrapped the ADD in a CONST as we have currently. Some early pass deleted the notes for me. I forget which one. Oh well, I was going to try splitting after reload anyway. -- Alan Modra Australia Development Lab, IBM
Re: rs6000 toc reference rtl again
On Tue, Apr 03, 2012 at 07:49:04PM +0100, Richard Sandiford wrote: Alan Modra amo...@gmail.com writes: Now that we are back in stage1, I'd like to apply http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to toc reference rtl in order to properly specify r2 dependencies. More commentary in that url. I'm reposting the patch here since the old one no longer applies cleanly, and I've added some ENABLE_CHECKING code in rs6000_delegitimize_address. Sorry to be a pain, but I don't think HIGH is supposed contain regs either. Both HIGH and CONST are supposed to be true constants. Eh, so the existing use of CONST is wrong then. ;-) I'm proposing (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL) for the small model, and (high (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))) (lo_sum (reg hi) (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)) for medium/large model. You can see why I'd like to keep it this way; The medium/large rtl is a natural split of the small rtl. (I'm going to experiment with splitting the small rtl after reload for medium/large to see whether that helps our usage of call-saved regs in loops.) I'm not wedded to the representation, *but* we do want gcc to treat the high part as a constant. That's important because we don't ever want reload saving the high part to a stack slot! Which is what does happen if you don't somehow tell gcc it is a constant. Besides, the high part *is* a constant within any given function. So is the low part for that matter. The only reason I want r2 mentioned in this rtl is for register liveness, eg. so that a load of a function pointer (which loads r2) for an indirect call doesn't get scheduled before any uses of the old r2. The alternative of removing r2 from the unspec and attaching a (use (reg r2)) to all instructions that have this addressing form might be clean but will require major duplication of patterns in rs6000.md, won't it? -- Alan Modra Australia Development Lab, IBM
Re: rs6000 toc reference rtl again
On Tue, Mar 27, 2012 at 3:54 AM, Alan Modra amo...@gmail.com wrote: Now that we are back in stage1, I'd like to apply http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to toc reference rtl in order to properly specify r2 dependencies. More commentary in that url. I'm reposting the patch here since the old one no longer applies cleanly, and I've added some ENABLE_CHECKING code in rs6000_delegitimize_address. Bootstrapped and regression tested powerpc64-linux. OK for mainline? * cselib.c (preserve_only_constants): Remove HIGH rtx containing value references. * cprop.c (cprop_constant_p): Return false for HIGH rtx containing value references. * config/rs6000/predicates.md (input_operand): Match unspec. Remove redundant tests. * rs6000-protos.h (toc_relative_expr_p): Update prototype. * const/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const. (legitimate_constant_pool_address_p): Move TARGET_TOC test and register checks to.. (toc_relative_expr_p): ..here. Add strict param. Match new rtl generated by create_TOC_reference. (rs6000_delegitimize_address): Handle new rtl for toc refs. (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise. (use_toc_relative_ref): New function, split out from.. (rs6000_emit_move): ..here. Remove redundant tests. (rs6000_legitimize_reload_address): Formatting. Remove redundant code. Use use_toc_relative_ref. (print_operand): Formatting, style. Adjust for toc changes. (print_operand_address): Likewise. (rs6000_output_addr_const_extra): Likewise. (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather than a PLUS. Use this formulation for both high and low part of -mcmodel=medium/large toc reference too. * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid a PLUS in high part of addresses here. (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise. (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise. (largetoc_high, largetoc_low): Move earlier. Cope when no base reg available. Now that PPC bootstrap appears to be fixed, this patch is okay to apply. However, I cannot approve the cselib.c and cprop.c changes. Thanks, David
Re: rs6000 toc reference rtl again
Alan Modra amo...@gmail.com writes: Now that we are back in stage1, I'd like to apply http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to toc reference rtl in order to properly specify r2 dependencies. More commentary in that url. I'm reposting the patch here since the old one no longer applies cleanly, and I've added some ENABLE_CHECKING code in rs6000_delegitimize_address. Sorry to be a pain, but I don't think HIGH is supposed contain regs either. Both HIGH and CONST are supposed to be true constants. It's fine to have a non-HIGH expression as the first operand to a LO_SUM though. Alpha and MIPS use this for small data, where the first LO_SUM operand is the small data pointer. It can also be an arbitrary rtx like a PLUS. Richard
Re: rs6000 toc reference rtl
On Tue, Sep 06, 2011 at 01:11:26AM +0930, Alan Modra wrote: Consequently, Mike's change to split rtl for indirect calls sometimes sees the scheduler moving the r2 load in the indirect call sequence before a toc reference. Actually, this isn't correct. Mike's change adding rs6000.c rs6000_call_indirect_aix just made it more likely. Even before this post-reload scheduling could move the r2 load around, since rs6000.md call_indirect_aix patterns were (and still are) split post-reload. Here's an example I was shown today of such damage (qemu compiled with gcc-4.6-redhat). .LVL57151: ld 0,0(31) # load opd+0, function addr addis 4,2,.LC4758@toc@ha ld 11,16(31) mr 7,3 std 2,40(1) # save r2 mr 5,25 addi 4,4,.LC4758@toc@l mtctr 0 # mr 6,26 ld 2,8(31) # load opd+8, new toc ptr in r2 mr 3,28 .LBB255670: .LBB255668: .loc 8 98 0 addis 27,2,.LC4761@toc@ha # oops, should be using old r2 .LVL57152: addi 27,27,.LC4761@toc@l .LBE255668: .LBE255670: .loc 3 9212 0 addis 25,2,.LC4762@toc@ha # oops again .loc 3 9198 0 bctrl # make the call ld 2,40(1) # restore r2 r27 and r25 set up here for later use now contain bogus values. The blame rests on my 2011-06-20 change. -- Alan Modra Australia Development Lab, IBM