Re: rs6000 toc reference rtl again

2012-05-16 Thread David Edelsohn
On Tue, May 1, 2012 at 12:17 AM, Alan Modra amo...@gmail.com wrote:

        * config/rs6000/predicates.md (input_operand): Don't match
        constant pool addresses.  Remove label_ref, high and plus from
        match_code list.  Remove redundant CONSTANT_P test.
        (splat_input_operand): Similarly update match_code list.
        (small_toc_ref): New predicate.
        * config/rs6000/rs6000-protos.h (toc_relative_expr_p): Update 
 prototype.
        * config/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const.
        (legitimate_constant_pool_address_p): Move TARGET_TOC test and
        register checks to..
        (toc_relative_expr_p): ..here.  Add strict param.  Match new rtl
        generated by create_TOC_reference.
        (rs6000_legitimize_address): Update cerate_TOC_reference call.
        (rs6000_delegitimize_address): Handle new rtl for toc refs.
        (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise.
        (use_toc_relative_ref): New function, split out from..
        (rs6000_emit_move): ..here.  Remove redundant tests.  Update
        create_TOC_reference calls.
        (rs6000_legitimize_reload_address): Formatting.  Handle splitting
        of medium/large model toc addresses.  Use use_toc_relative_ref.
        (print_operand): Formatting, style.  Adjust for toc changes.
        (print_operand_address): Likewise.
        (rs6000_output_addr_const_extra): Likewise.
        (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather
        than a PLUS.  Use this formulation for both high and low part
        of -mcmodel=medium/large toc reference too.  Before reload,
        always use the small model formulation.
        * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid
        a PLUS in high part of addresses here.
        (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise.
        (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise.
        (largetoc_high, largetoc_low): Move earlier.  Cope when no
        base reg available.
        (largetoc_high_plus): New insn.
        (movsi_internal1, movsi_internal1_single, movsf_softfloat,
        movdi_mfpgpr, movdi_internal64): Don't handle 'R' constraint here..
        (tocref): ..instead do so here, new insn and split.

Okay.

But please add a comment explaining toc_relative_expr_p(), especially
the meaning of the new strict argument.

And please wait for bootstrap to be unbroken on PPC64.

Thanks, David


Re: rs6000 toc reference rtl again

2012-05-15 Thread Alan Modra
On Tue, May 01, 2012 at 01:47:51PM +0930, Alan Modra wrote:
 This revision splits the medium/large code model toc reference after
 reload.

Ping http://gcc.gnu.org/ml/gcc-patches/2012-05/msg6.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: rs6000 toc reference rtl again

2012-04-30 Thread Alan Modra
This revision splits the medium/large code model toc reference after
reload.  I expected this to be more difficult, but it turned out
surprisingly easy.  Besides creating the rtl that way, and tweaking
toc_relative_expr_p to match, it was just:  Move the 'R' constraint
handling out of various movsi/di instructions into a new instruction
that is split after reload, and modify legitimize_reload_address to
handle all the other splits inside MEMs.

Bootstrapped and regression tested powerpc-linux.  I also intend to
spec test this change.  Generated code has the expected improvements:
libstdc++.so saw a small reduction in code size due to removing silly
uses of call-saved regs to hold the high part of a toc reference.
OK to apply?

* config/rs6000/predicates.md (input_operand): Don't match
constant pool addresses.  Remove label_ref, high and plus from
match_code list.  Remove redundant CONSTANT_P test.
(splat_input_operand): Similarly update match_code list.
(small_toc_ref): New predicate.
* config/rs6000/rs6000-protos.h (toc_relative_expr_p): Update prototype.
* config/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const.
(legitimate_constant_pool_address_p): Move TARGET_TOC test and
register checks to..
(toc_relative_expr_p): ..here.  Add strict param.  Match new rtl
generated by create_TOC_reference.
(rs6000_legitimize_address): Update cerate_TOC_reference call.
(rs6000_delegitimize_address): Handle new rtl for toc refs.
(rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise.
(use_toc_relative_ref): New function, split out from..
(rs6000_emit_move): ..here.  Remove redundant tests.  Update
create_TOC_reference calls.
(rs6000_legitimize_reload_address): Formatting.  Handle splitting
of medium/large model toc addresses.  Use use_toc_relative_ref.
(print_operand): Formatting, style.  Adjust for toc changes.
(print_operand_address): Likewise.
(rs6000_output_addr_const_extra): Likewise.
(create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather
than a PLUS.  Use this formulation for both high and low part
of -mcmodel=medium/large toc reference too.  Before reload,
always use the small model formulation.
* config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid
a PLUS in high part of addresses here.
(tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise.
(tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise.
(largetoc_high, largetoc_low): Move earlier.  Cope when no
base reg available.
(largetoc_high_plus): New insn.
(movsi_internal1, movsi_internal1_single, movsf_softfloat,
movdi_mfpgpr, movdi_internal64): Don't handle 'R' constraint here..
(tocref): ..instead do so here, new insn and split.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 187010)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5280,15 +5280,33 @@ constant_pool_expr_p (rtx op)
   ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (base), Pmode));
 }
 
-static rtx tocrel_base, tocrel_offset;
+static const_rtx tocrel_base, tocrel_offset;
 
 bool
-toc_relative_expr_p (rtx op)
+toc_relative_expr_p (const_rtx op, bool strict)
 {
-  if (GET_CODE (op) != CONST)
+  if (!TARGET_TOC)
 return false;
 
-  split_const (op, tocrel_base, tocrel_offset);
+  if (TARGET_CMODEL != CMODEL_SMALL)
+{
+  /* Only match the low part.  */
+  if (GET_CODE (op) == LO_SUM
+  REG_P (XEXP (op, 0))
+  INT_REG_OK_FOR_BASE_P (XEXP (op, 0), strict))
+   op = XEXP (op, 1);
+  else if (strict)
+   return false;
+}
+
+  tocrel_base = op;
+  tocrel_offset = const0_rtx;
+  if (GET_CODE (op) == PLUS  CONST_INT_P (XEXP (op, 1)))
+{
+  tocrel_base = XEXP (op, 0);
+  tocrel_offset = XEXP (op, 1);
+}
+
   return (GET_CODE (tocrel_base) == UNSPEC
   XINT (tocrel_base, 1) == UNSPEC_TOCREL);
 }
@@ -5300,14 +5318,7 @@ bool
 legitimate_constant_pool_address_p (const_rtx x, enum machine_mode mode,
bool strict)
 {
-  return (TARGET_TOC
-  (GET_CODE (x) == PLUS || GET_CODE (x) == LO_SUM)
-  GET_CODE (XEXP (x, 0)) == REG
-  (REGNO (XEXP (x, 0)) == TOC_REGISTER
- || ((TARGET_MINIMAL_TOC
-  || TARGET_CMODEL != CMODEL_SMALL)
-  INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict)))
-  toc_relative_expr_p (XEXP (x, 1))
+  return (toc_relative_expr_p (x, strict)
   (TARGET_CMODEL != CMODEL_MEDIUM
  || constant_pool_expr_p (XVECEXP (tocrel_base, 0, 0))
  || mode == QImode
@@ -5687,10 +5698,7 @@ rs6000_legitimize_address (rtx x, rtx oldx ATTRIBU
GET_CODE 

Re: rs6000 toc reference rtl again

2012-04-04 Thread Richard Sandiford
Alan Modra amo...@gmail.com writes:
 On Tue, Apr 03, 2012 at 07:49:04PM +0100, Richard Sandiford wrote:
 Alan Modra amo...@gmail.com writes:
  Now that we are back in stage1, I'd like to apply
  http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to
  toc reference rtl in order to properly specify r2 dependencies.  More
  commentary in that url.  I'm reposting the patch here since the old
  one no longer applies cleanly, and I've added some ENABLE_CHECKING
  code in rs6000_delegitimize_address.
 
 Sorry to be a pain, but I don't think HIGH is supposed contain
 regs either.  Both HIGH and CONST are supposed to be true constants.

 Eh, so the existing use of CONST is wrong then.  ;-)

Right :-)  Sorry, I meant: although you're fixing the CONST,
the same problem really applies to the HIGH too.

 I'm proposing
   (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)
 for the small model, and
   (high (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)))
   (lo_sum (reg hi) (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))
 for medium/large model.

 You can see why I'd like to keep it this way;  The medium/large rtl is
 a natural split of the small rtl.  (I'm going to experiment with
 splitting the small rtl after reload for medium/large to see whether
 that helps our usage of call-saved regs in loops.)

Yeah.  FWIW, MIPS keeps both the small and large forms as constants
before reload, then splits them into (non-constant) forms that
reference the global pointer after reload.  So the small version
is simply:

(set (match_operand 0 register_operand =...)
 (match_operand 1 symbolic_constant ))

before reload, while the large form is a normal HIGH/LO_SUM pair.
That HIGH then gets split after reload too.

It seems to work well, although these days it does require the split
to happen before prologue/epilogue generation.  (Hmm, the *lea_high64
patterns are probably wrong now that we have shrink-wrapping...)

With the loop thing, do you mean that you're seeing too many HIGHs
being hoisted?  Would be nice to fix that in the loop optimisers
if so.

 I'm not wedded to the representation, *but* we do want gcc to treat
 the high part as a constant.  That's important because we don't ever
 want reload saving the high part to a stack slot!  Which is what does
 happen if you don't somehow tell gcc it is a constant.

Out of curiosity, does that still happen if you have a HIGH REG_EQUAL
note attached to the addition?  I'd have expected reload to convert
the note into a REG_EQUIV and treat the source as a function invariant.

 Besides, the high part *is* a constant within any given function.  So
 is the low part for that matter.  The only reason I want r2 mentioned
 in this rtl is for register liveness, eg. so that a load of a function
 pointer (which loads r2) for an indirect call doesn't get scheduled
 before any uses of the old r2.

Right.  But that's also true of, say, a constant that needs to be split
into a load-high and add.  The result of the add is a function constant,
but the add itself is still not a constant from an rtl perspective.

I think it'd get too confusing if constants were allowed to reference
registers.  That sort of thing is usually handled with REG_EQUAL notes
instead.  But in the specific case of GOT references, where the GOT
register isn't really around until prologue/epilogue generation anyway,
there's less point exposing it before reload.

 The alternative of removing r2 from the unspec and attaching a
 (use (reg r2)) to all instructions that have this addressing form
 might be clean but will require major duplication of patterns in
 rs6000.md, won't it?

Yeah.  It'd probably also not be as effective as splitting after reload.
Passes that want to optimise the load high are likely to be put off by
a (use ...).

Richard


Re: rs6000 toc reference rtl again

2012-04-04 Thread Alan Modra
On Wed, Apr 04, 2012 at 10:25:39AM +0100, Richard Sandiford wrote:
 With the loop thing, do you mean that you're seeing too many HIGHs
 being hoisted?

No, nothing as complicated as that.  In a lot of cases, any hoisting
of the high part is bad, because the linker nops out the high part and
edits the low part insn when the r2 offset can be done in 16 bits.  If
the high part insn was the only use of a call-saved reg, then the
function saves and restores a reg to no purpose.

  I'm not wedded to the representation, *but* we do want gcc to treat
  the high part as a constant.  That's important because we don't ever
  want reload saving the high part to a stack slot!  Which is what does
  happen if you don't somehow tell gcc it is a constant.
 
 Out of curiosity, does that still happen if you have a HIGH REG_EQUAL
 note attached to the addition?  I'd have expected reload to convert
 the note into a REG_EQUIV and treat the source as a function invariant.

I saw that possibility in reload, and did try a REG_EQUIV note before
I wrapped the ADD in a CONST as we have currently.  Some early pass
deleted the notes for me.  I forget which one.

Oh well, I was going to try splitting after reload anyway.

-- 
Alan Modra
Australia Development Lab, IBM


Re: rs6000 toc reference rtl again

2012-04-03 Thread Alan Modra
On Tue, Apr 03, 2012 at 07:49:04PM +0100, Richard Sandiford wrote:
 Alan Modra amo...@gmail.com writes:
  Now that we are back in stage1, I'd like to apply
  http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to
  toc reference rtl in order to properly specify r2 dependencies.  More
  commentary in that url.  I'm reposting the patch here since the old
  one no longer applies cleanly, and I've added some ENABLE_CHECKING
  code in rs6000_delegitimize_address.
 
 Sorry to be a pain, but I don't think HIGH is supposed contain
 regs either.  Both HIGH and CONST are supposed to be true constants.

Eh, so the existing use of CONST is wrong then.  ;-)

I'm proposing
  (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)
for the small model, and
  (high (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL)))
  (lo_sum (reg hi) (unspec [(symbol_ref sym) (reg r2)] UNSPEC_TOCREL))
for medium/large model.

You can see why I'd like to keep it this way;  The medium/large rtl is
a natural split of the small rtl.  (I'm going to experiment with
splitting the small rtl after reload for medium/large to see whether
that helps our usage of call-saved regs in loops.)

I'm not wedded to the representation, *but* we do want gcc to treat
the high part as a constant.  That's important because we don't ever
want reload saving the high part to a stack slot!  Which is what does
happen if you don't somehow tell gcc it is a constant.

Besides, the high part *is* a constant within any given function.  So
is the low part for that matter.  The only reason I want r2 mentioned
in this rtl is for register liveness, eg. so that a load of a function
pointer (which loads r2) for an indirect call doesn't get scheduled
before any uses of the old r2.

The alternative of removing r2 from the unspec and attaching a
(use (reg r2)) to all instructions that have this addressing form
might be clean but will require major duplication of patterns in
rs6000.md, won't it?

-- 
Alan Modra
Australia Development Lab, IBM


Re: rs6000 toc reference rtl again

2012-04-03 Thread David Edelsohn
On Tue, Mar 27, 2012 at 3:54 AM, Alan Modra amo...@gmail.com wrote:
 Now that we are back in stage1, I'd like to apply
 http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to
 toc reference rtl in order to properly specify r2 dependencies.  More
 commentary in that url.  I'm reposting the patch here since the old
 one no longer applies cleanly, and I've added some ENABLE_CHECKING
 code in rs6000_delegitimize_address.

 Bootstrapped and regression tested powerpc64-linux.  OK for mainline?

        * cselib.c (preserve_only_constants): Remove HIGH rtx containing
        value references.
        * cprop.c (cprop_constant_p): Return false for HIGH rtx containing
        value references.
        * config/rs6000/predicates.md (input_operand): Match unspec.  Remove
        redundant tests.
        * rs6000-protos.h (toc_relative_expr_p): Update prototype.
        * const/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const.
        (legitimate_constant_pool_address_p): Move TARGET_TOC test and
        register checks to..
        (toc_relative_expr_p): ..here.  Add strict param.  Match new rtl
        generated by create_TOC_reference.
        (rs6000_delegitimize_address): Handle new rtl for toc refs.
        (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise.
        (use_toc_relative_ref): New function, split out from..
        (rs6000_emit_move): ..here.  Remove redundant tests.
        (rs6000_legitimize_reload_address): Formatting.  Remove redundant
        code.  Use use_toc_relative_ref.
        (print_operand): Formatting, style.  Adjust for toc changes.
        (print_operand_address): Likewise.
        (rs6000_output_addr_const_extra): Likewise.
        (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather
        than a PLUS.  Use this formulation for both high and low part
        of -mcmodel=medium/large toc reference too.
        * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid
        a PLUS in high part of addresses here.
        (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise.
        (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise.
        (largetoc_high, largetoc_low): Move earlier.  Cope when no
        base reg available.

Now that PPC bootstrap appears to be fixed, this patch is okay to apply.

However, I cannot approve the cselib.c and cprop.c changes.

Thanks, David


Re: rs6000 toc reference rtl again

2012-04-03 Thread Richard Sandiford
Alan Modra amo...@gmail.com writes:
 Now that we are back in stage1, I'd like to apply
 http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00304.html, a change to
 toc reference rtl in order to properly specify r2 dependencies.  More
 commentary in that url.  I'm reposting the patch here since the old
 one no longer applies cleanly, and I've added some ENABLE_CHECKING
 code in rs6000_delegitimize_address.

Sorry to be a pain, but I don't think HIGH is supposed contain
regs either.  Both HIGH and CONST are supposed to be true constants.

It's fine to have a non-HIGH expression as the first operand to
a LO_SUM though.  Alpha and MIPS use this for small data, where
the first LO_SUM operand is the small data pointer.  It can also
be an arbitrary rtx like a PLUS.

Richard


Re: rs6000 toc reference rtl

2011-09-07 Thread Alan Modra
On Tue, Sep 06, 2011 at 01:11:26AM +0930, Alan Modra wrote:
 Consequently, Mike's change to split rtl for
 indirect calls sometimes sees the scheduler moving the r2 load in the
 indirect call sequence before a toc reference.

Actually, this isn't correct.  Mike's change adding rs6000.c
rs6000_call_indirect_aix just made it more likely.  Even before this
post-reload scheduling could move the r2 load around, since rs6000.md
call_indirect_aix patterns were (and still are) split post-reload.

Here's an example I was shown today of such damage (qemu compiled
with gcc-4.6-redhat).

.LVL57151:
ld 0,0(31)  # load opd+0, function addr
addis 4,2,.LC4758@toc@ha
ld 11,16(31)
mr 7,3
std 2,40(1) # save r2
mr 5,25
addi 4,4,.LC4758@toc@l
mtctr 0 #
mr 6,26
ld 2,8(31)  # load opd+8, new toc ptr in r2
mr 3,28
.LBB255670:
.LBB255668:
.loc 8 98 0
addis 27,2,.LC4761@toc@ha   # oops, should be using old r2
.LVL57152:
addi 27,27,.LC4761@toc@l
.LBE255668:
.LBE255670:
.loc 3 9212 0
addis 25,2,.LC4762@toc@ha   # oops again
.loc 3 9198 0
bctrl   # make the call
ld 2,40(1)  # restore r2

r27 and r25 set up here for later use now contain bogus values.
The blame rests on my 2011-06-20 change.

-- 
Alan Modra
Australia Development Lab, IBM