Re: [Patch, fortran] PR50981 correctly handle absent arrays as actual argument to elemental procedures

2012-02-13 Thread Tobias Burnus
Mikael Morin wrote:
 there was no specific handling for absent arrays passed as argument
 to elemental procedures. So, because of scalarisation, we were passing
 an array element reference of a NULL pointer which was failing.

 These patches add a conditional to pass NULL when the data pointer
 is NULL.

 Regression tested on x86_64-unknown-freebsd9.0. OK for trunk?

OK. Thanks for the patch.

Tobias
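
The guard described in the quoted text can be pictured in C. The following is only a sketch of the idea, not the gfortran code the patch generates: `elem` and `apply_elemental` are hypothetical names, and the optional array is modelled as a plain pointer that is NULL when the argument is absent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical elemental callee: OPT may be absent (NULL).  */
static int
elem (int x, const int *opt)
{
  return opt ? x + *opt : x;
}

/* Scalarized call over N elements.  OPT_DATA stands for the data
   pointer of the optional array; NULL means the array is absent.  */
static void
apply_elemental (int *out, const int *in, const int *opt_data, size_t n)
{
  assert (out != NULL && in != NULL);
  for (size_t i = 0; i < n; i++)
    {
      /* The guard: forward NULL instead of forming an element
         reference from a NULL base.  */
      const int *arg = opt_data ? &opt_data[i] : NULL;
      out[i] = elem (in[i], arg);
    }
}
```

Without the conditional, the loop would hand `&opt_data[i]` computed from a NULL base to the callee, which is the failure described above.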


Re: trans-mem: virtual ops for gimple_transaction

2012-02-13 Thread Richard Guenther
On Fri, 10 Feb 2012, Richard Henderson wrote:

 On 02/10/2012 01:44 AM, Richard Guenther wrote:
  What is the reason to keep a GIMPLE_TRANSACTION stmt after
  TM lowering and not lower it to a builtin function call?
 
 Because real optimization hasn't happened yet, and we hold
 out hope that we'll be able to delete stuff as unreachable.
 Especially all instances of transaction_cancel.
 
  It seems the body is empty after lowering (what's the label thing?)
 
 The label is the transaction cancel label.
 
 When we finally convert GIMPLE_TRANSACTION to a builtin, we'll
 generate different code layouts with and without a cancel.

Ah, I see.  But wouldn't a placeholder builtin function be
effectively the same as using a new GIMPLE stmt kind?

Richard.


RE: [PATCH] Improve SCEV for array element

2012-02-13 Thread Jiangning Liu


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Jiangning Liu
 Sent: Friday, January 20, 2012 5:07 PM
 To: 'Richard Guenther'
 Cc: gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH] Improve SCEV for array element
 
  It's definitely not ok at this stage but at most for next stage1.
 
 OK. I may wait until next stage1.
 
  This is a very narrow pattern-match.  It doesn't allow for a[i].x,
  for example, even if a[i] is a one-element structure.  I think the
  canonical way of handling ADDR_EXPR is to use something like
 
  base = get_inner_reference (TREE_OPERAND (rhs1, 0), ..., &offset, ...);
  base = build1 (ADDR_EXPR, TREE_TYPE (rhs1), base);
  chrec1 = analyze_scalar_evolution (loop, base);
  chrec2 = analyze_scalar_evolution (loop, offset);
  chrec1 = chrec_convert (type, chrec1, at_stmt);
  chrec2 = chrec_convert (TREE_TYPE (offset), chrec2, at_stmt);
  res = chrec_fold_plus (type, chrec1, chrec2);
 
  where you probably need to handle scev_not_known when analyzing
  offset (which might be NULL).  You also need to add bitpos to the
  base address (in bytes, of course).  Note that the MEM_REF case
  would naturally work with this as well.
 
 OK. New patch is like below, and bootstrapped on x86-32.
 
 ChangeLog:
 
 2012-01-20  Jiangning Liu  jiangning@arm.com
 
 * tree-scalar-evolution.c (interpret_rhs_expr): Generate chrec for
 array reference and component reference.
 
 
 ChangeLog for testsuite:
 
 2012-01-20  Jiangning Liu  jiangning@arm.com
 
 * gcc.dg/tree-ssa/scev-3.c: New.
 * gcc.dg/tree-ssa/scev-4.c: New.
 

Richard,

PING... Is this patch OK after branch 4.7 is created and trunk is open
again?

Thanks,
-Jiangning
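
As a rough illustration of what the patch is after, the address `&a[i].y` in a loop `for (i = k; i < 1000; i += k)` evolves as an affine function of the iteration count. The sketch below is not GCC code: `struct chrec`, `chrec_apply` and `addr_chrec` are hypothetical helpers that model an evolution {base, +, step} as byte offsets from `&a[0]`, with the constant field offset playing the role of bitpos in bytes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int y; } S;

/* Toy affine evolution {base, +, step}: the value at iteration N is
   base + N * step.  */
struct chrec { ptrdiff_t base; ptrdiff_t step; };

static ptrdiff_t
chrec_apply (struct chrec c, ptrdiff_t n)
{
  assert (n >= 0);
  return c.base + n * c.step;
}

/* Byte offset of &a[i].field from &a[0] for the loop i = K; i += K.
   The index evolves as {K, +, K}; scaling by the element size and
   adding the constant field offset gives the address evolution.  */
static struct chrec
addr_chrec (ptrdiff_t k, ptrdiff_t elt_size, ptrdiff_t field_off)
{
  struct chrec idx = { k, k };
  struct chrec res = { idx.base * elt_size + field_off,
                       idx.step * elt_size };
  return res;
}
```

With a known evolution for the address, the pointer no longer needs to be treated as unknown in each iteration.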

 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
 b/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
 new file mode 100644
 index 000..28d5c93
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-3.c
 @@ -0,0 +1,18 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fdump-tree-optimized" } */
 +
 +int *a_p;
 +int a[1000];
 +
 +f(int k)
 +{
 +     int i;
 +
 +     for (i=k; i<1000; i+=k) {
 +             a_p = &a[i];
 +             *a_p = 100;
 +        }
 +}
 +
 +/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
 +/* { dg-final { cleanup-tree-dump "optimized" } } */
 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
 b/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
 new file mode 100644
 index 000..6c1e530
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-4.c
 @@ -0,0 +1,23 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O2 -fdump-tree-optimized" } */
 +
 +typedef struct {
 +     int x;
 +     int y;
 +} S;
 +
 +int *a_p;
 +S a[1000];
 +
 +f(int k)
 +{
 +     int i;
 +
 +     for (i=k; i<1000; i+=k) {
 +             a_p = &a[i].y;
 +             *a_p = 100;
 +        }
 +}
 +
 +/* { dg-final { scan-tree-dump-times "&a" 1 "optimized" } } */
 +/* { dg-final { cleanup-tree-dump "optimized" } } */
 diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
 index 2077c8d..4e06b75
 --- a/gcc/tree-scalar-evolution.c
 +++ b/gcc/tree-scalar-evolution.c
 @@ -1712,16 +1712,61 @@ interpret_rhs_expr (struct loop *loop, gimple
 at_stmt,
switch (code)
  {
  case ADDR_EXPR:
 -  /* Handle MEM[ptr + CST] which is equivalent to
 POINTER_PLUS_EXPR.
 */
 -  if (TREE_CODE (TREE_OPERAND (rhs1, 0)) != MEM_REF)
 - {
 -   res = chrec_dont_know;
 -   break;
 - }
 +  if (TREE_CODE (TREE_OPERAND (rhs1, 0)) == ARRAY_REF
 +  || TREE_CODE (TREE_OPERAND (rhs1, 0)) == MEM_REF
 +  || TREE_CODE (TREE_OPERAND (rhs1, 0)) == COMPONENT_REF)
 +{
 +   enum machine_mode mode;
 +   HOST_WIDE_INT bitsize, bitpos;
 +   int unsignedp;
 +   int volatilep = 0;
 +   tree base, offset;
 +   tree chrec3;
 +
 +   base = get_inner_reference (TREE_OPERAND (rhs1, 0),
 +                               &bitsize, &bitpos, &offset,
 +                               &mode, &unsignedp, &volatilep, false);
 +
 +   if (TREE_CODE (base) == MEM_REF)
 + {
 +   rhs2 = TREE_OPERAND (base, 1);
 +   rhs1 = TREE_OPERAND (base, 0);
 +
 +   chrec1 = analyze_scalar_evolution (loop, rhs1);
 +   chrec2 = analyze_scalar_evolution (loop, rhs2);
 +   chrec1 = chrec_convert (type, chrec1, at_stmt);
 +   chrec2 = chrec_convert (TREE_TYPE (rhs2), chrec2, at_stmt);
 +   res = chrec_fold_plus (type, chrec1, chrec2);
 + }
 +   else
 + {
 +   base = build1 (ADDR_EXPR, TREE_TYPE (rhs1), base);
 +   chrec1 = analyze_scalar_evolution (loop, base);
 +   chrec1 = chrec_convert (type, chrec1, at_stmt);
 +   res = chrec1;
 + }
 
 -  rhs2 = TREE_OPERAND (TREE_OPERAND (rhs1, 0), 1);
 -  rhs1 = TREE_OPERAND (TREE_OPERAND (rhs1, 0), 0);
 -  /* Fall through.  */
 +   if (offset != NULL_TREE)
 + 

Re: [PATCH] [RFC, GCC 4.8] Optimize conditional moves from adjacent memory locations

2012-02-13 Thread Richard Guenther
On Fri, Feb 10, 2012 at 10:02 PM, Andrew Pinski pins...@gmail.com wrote:
 On Fri, Feb 10, 2012 at 12:46 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
 I was looking at the routelookup EEMBC benchmark and it has code of the form:

   while ( this_node->cmpbit > next_node->cmpbit )
    {
      this_node = next_node;

      if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
         next_node = this_node->rlink;
      else
         next_node = this_node->llink;
    }

 Hmm, this looks like we could do this on the tree level better as we
 have more information about this_node there.  Like we know that we
 load from this_node->cmpbit before we do either of the branches.  So
 we can move both of those loads before the branch and then we get the
 ifcvt for free.

Indeed.  But note that the transform is not valid as *this_node may cross
a page boundary and thus either pointer load may trap if the other does not
(well, unless the C standard (and thus our middle-end) would require that
iff ptr->component does not trap then *ptr does not trap either - we would
require an operand_equal_p (get_base_address ()) for both addresses).

Joseph, can you clarify what the C standard specifies here?

Thanks,
Richard.
 Thanks,
 Andrew Pinski




 This is where you have a binary tree/trie and you are iterating going down
 either the right link or left link until you find a stopping condition.  The
 code in ifcvt.c does not handle optimizing these cases for conditional move
 since the load might trap, and generates code that does if-then-else with
 loads and jumps.

 However, since the two elements are next to each other in memory, they are
 likely in the same cache line, particularly with aligned stacks and malloc
 returning aligned pointers.  Except in unusual circumstances where the
 pointer is not aligned, this means it is much faster to optimize it as:

   while ( this_node->cmpbit > next_node->cmpbit )
    {
      this_node = next_node;

      rtmp = this_node->rlink;
      ltmp = this_node->llink;
      if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
         next_node = rtmp;
      else
         next_node = ltmp;
    }

 So I wrote some patches to do this optimization.  In ifcvt.c I added a new
 hook that allows the backend to try and do conditional moves if the machine
 independent code doesn't handle the special cases that the machine might
 have.

 Then in rs6000.c I used that hook to see if the conditional moves are
 adjacent, and do the optimization.

 I will note that this type of code comes up quite frequently since binary
 trees and tries are common data structures.  The file splay-tree.c in
 libiberty is one place in the compiler tree that has conditional adjacent
 memory moves.

 So I would like comments on the patch before the 4.8 tree opens up.  I feel
 even if we decide not to add the adjacent memory move patch, the hook is
 useful, and I have some other ideas for using it for the powerpc.

 I was thinking about rewriting the rs6000 dependent parts to make it a normal
 optimization available to all ports.  Is this something we want as a normal
 option?  At the moment, I'm not sure it should be part of -O3 because it is
 possible for a trap to occur if the pointer straddles a page boundary and the
 test condition would guard against loading up the second value.  However,
 -Ofast might be an appropriate place to do this optimization.

 At this time I don't have test cases, but I would add them for the real
 submission.  I have bootstrapped the compiler on powerpc with this option
 enabled and it passed the bootstrap and had no regressions in make check.  I
 will do a spec run over the weekend as well.

 In addition to libiberty/splay-tree.c the following files in gcc have adjacent
 conditional moves that this code would optimize:

            cfg.c
            c-typeck.c
            df-scan.c
            fold-const.c
            graphds.c
            ira-emit.c
            omp-low.c
            rs6000.c
            tree-cfg.c
            tree-ssa-dom.c
            tree-ssa-loop-ivops.c
            tree-ssa-phiopt.c
            tree-ssa-uncprop.c

 2012-02-10  Michael Meissner  meiss...@linux.vnet.ibm.com

        * target.def (cmove_md_extra): New hook that is called from
        ifcvt.c to allow the backend to generate additional conditional
        moves that aren't handled by the machine independent code.  Add
        support to call the hook at the appropriate places.
        * targhooks.h (default_cmove_md_extra): Likewise.
        * targhooks.c (default_cmove_md_extra): Likewise.
        * target.h (enum ifcvt_pass): Likewise.
        * ifcvt.c (find_if_header): Likewise.
        (noce_find_if_block): Likewise.
        (struct noce_if_info): Likewise.
        (noce_process_if_block): Likewise.
        (cond_move_process_if_block): Likewise.
        (if_convert): Likewise.
        (rest_of_handle_if_conversion): Likewise.
        (rest_of_handle_if_after_combine): Likewise.
        

Re: [PATCH] Fix signed bitfield BIT_NOT_EXPR expansion (PR middle-end/52209)

2012-02-13 Thread Richard Guenther
On Sat, Feb 11, 2012 at 12:49 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 In July Richard changed reduce_bit_field BIT_NOT_EXPR expansion from
 NOT unop to XOR with all the bits in the bitfield's precision set.
 Unfortunately that is correct for unsigned bitfields only, for signed
 bitfields, where op0 is already sign-extended to its mode before this,
 expanding this as NOT is the right thing.

 Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
 trunk?

Ok.

Thanks,
Richard.

 2012-02-11  Jakub Jelinek  ja...@redhat.com

        PR middle-end/52209
        * expr.c (expand_expr_real_2) case BIT_NOT_EXPR: Only expand using
        XOR for reduce_bit_field if type is unsigned.

        * gcc.c-torture/execute/pr52209.c: New test.

 --- gcc/expr.c.jj       2012-02-07 16:05:51.0 +0100
 +++ gcc/expr.c  2012-02-11 10:08:44.162924423 +0100
 @@ -8582,8 +8582,9 @@ expand_expr_real_2 (sepops ops, rtx targ
       if (modifier == EXPAND_STACK_PARM)
        target = 0;
       /* In case we have to reduce the result to bitfield precision
 -        expand this as XOR with a proper constant instead.  */
 -      if (reduce_bit_field)
 +        for unsigned bitfield expand this as XOR with a proper constant
 +        instead.  */
 +      if (reduce_bit_field && TYPE_UNSIGNED (type))
        temp = expand_binop (mode, xor_optab, op0,
                             immed_double_int_const
                               (double_int_mask (TYPE_PRECISION (type)), mode),
 --- gcc/testsuite/gcc.c-torture/execute/pr52209.c.jj    2012-02-11 10:09:46.080571803 +0100
 +++ gcc/testsuite/gcc.c-torture/execute/pr52209.c       2012-02-11 10:09:28.0 +0100
 @@ -0,0 +1,14 @@
 +/* PR middle-end/52209 */
 +
 +extern void abort (void);
 +struct S0 { int f2 : 1; } c;
 +int b;
 +
 +int
 +main ()
 +{
 +  b = -1 ^ c.f2;
 +  if (b != -1)
 +    abort ();
 +  return 0;
 +}

        Jakub
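
The difference between the two expansions can be checked with plain integers. The sketch below is an illustration only, not the code in expr.c: `low_mask`, `not_as_xor` and `not_as_not` are hypothetical helpers, and the operand is assumed to be already extended to a full `int`, as the message describes.

```c
#include <assert.h>

/* Mask with the low PREC bits set, for 0 < PREC < 32.  */
static unsigned int
low_mask (int prec)
{
  assert (prec > 0 && prec < 32);
  return (1u << prec) - 1u;
}

/* NOT expanded as XOR with the precision mask.  This keeps a
   zero-extended (unsigned) bitfield value zero-extended.  */
static unsigned int
not_as_xor (unsigned int op0, int prec)
{
  return op0 ^ low_mask (prec);
}

/* NOT expanded as a plain bitwise complement.  This keeps a
   sign-extended (signed) bitfield value sign-extended.  */
static int
not_as_not (int op0)
{
  return ~op0;
}
```

A signed 1-bit field can only hold 0 or -1. Complementing a sign-extended 0 gives -1, which is still a valid sign-extended value, whereas XOR with the mask gives 1, which such a field cannot represent. That is why the test above expects `b` to be -1.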


Re: [PATCH] Improve SCEV for array element

2012-02-13 Thread Richard Guenther
On Mon, Feb 13, 2012 at 10:54 AM, Jiangning Liu jiangning@arm.com wrote:


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Jiangning Liu
 Sent: Friday, January 20, 2012 5:07 PM
 To: 'Richard Guenther'
 Cc: gcc-patches@gcc.gnu.org
 Subject: RE: [PATCH] Improve SCEV for array element

  It's definitely not ok at this stage but at most for next stage1.

 OK. I may wait until next stage1.

  This is a very narrow pattern-match.  It doesn't allow for a[i].x,
  for example, even if a[i] is a one-element structure.  I think the
  canonical way of handling ADDR_EXPR is to use something like
 
  base = get_inner_reference (TREE_OPERAND (rhs1, 0), ..., &offset, ...);
  base = build1 (ADDR_EXPR, TREE_TYPE (rhs1), base);
          chrec1 = analyze_scalar_evolution (loop, base);
          chrec2 = analyze_scalar_evolution (loop, offset);
          chrec1 = chrec_convert (type, chrec1, at_stmt);
          chrec2 = chrec_convert (TREE_TYPE (offset), chrec2, at_stmt);
          res = chrec_fold_plus (type, chrec1, chrec2);
 
  where you probably need to handle scev_not_known when analyzing
  offset (which might be NULL).  You also need to add bitpos to the
  base address (in bytes, of course).  Note that the MEM_REF case
  would naturally work with this as well.

 OK. New patch is like below, and bootstrapped on x86-32.

 ChangeLog:

 2012-01-20  Jiangning Liu  jiangning@arm.com

         * tree-scalar-evolution.c (interpret_rhs_expr): Generate chrec for
         array reference and component reference.


 ChangeLog for testsuite:

 2012-01-20  Jiangning Liu  jiangning@arm.com

         * gcc.dg/tree-ssa/scev-3.c: New.
         * gcc.dg/tree-ssa/scev-4.c: New.


 Richard,

 PING... Is this patch OK after branch 4.7 is created and trunk is open
 again?

It's on my (rather large) list of things to review for 4.8.  Be patient ...

Richard.

 Thanks,
 -Jiangning


[PATCH] Fix PR52211

2012-02-13 Thread Richard Guenther

Committed as obvious.

Richard.

2012-02-13  Richard Guenther  rguent...@suse.de

PR translation/52211
* passes.c (enable_disable_pass): Fix typo.

Index: gcc/passes.c
===
--- gcc/passes.c(revision 184151)
+++ gcc/passes.c(working copy)
@@ -709,7 +709,7 @@ enable_disable_pass (const char *arg, bo
   if (is_enable)
     error ("unknown pass %s specified in -fenable", phase_name);
   else
-    error ("unknown pass %s specified in -fdisble", phase_name);
+    error ("unknown pass %s specified in -fdisable", phase_name);
   free (argstr);
   return;
 }


Re: [PATCH ARM] backport r174803 from trunk to 4.6 branch

2012-02-13 Thread Richard Earnshaw
On 08/02/12 08:29, Bin Cheng wrote:
 Hi,
 Julian Brown once posted a patch fixing an ARM EABI violation, which I think
 is also essential for the 4.6 branch.
 I created a patch against the 4.6 branch as attached. Is it OK to backport?
 
 You can refer to the following link for the original patch.
 http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00260.html
 
 Thanks
 
 gcc/ChangeLog:
 2012-02-08  Bin Cheng  bin.ch...@arm.com
 
   Backport from mainline
   2011-06-08  Julian Brown  jul...@codesourcery.com
 
   * config/arm/arm.c (arm_libcall_uses_aapcs_base): Use correct ABI
   for double-precision helper functions in hard-float mode if only
   single-precision arithmetic is supported in hardware.
 
 
 
 
 

OK.

Can you back-port it to 4.5 as well, please?

R.



Re: [PATCH] Fix for PR52081 - Missed tail merging with pure calls

2012-02-13 Thread Richard Guenther
On Thu, Feb 2, 2012 at 11:44 AM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,

 this patch fixes PR52081.

 Consider test-case pr51879-12.c:
 ...
 __attribute__((pure)) int bar (int);
 __attribute__((pure)) int bar2 (int);
 void baz (int);

 int x, z;

 void
 foo (int y)
 {
  int a = 0;
  if (y == 6)
    {
      a += bar (7);
      a += bar2 (6);
    }
  else
    {
      a += bar2 (6);
      a += bar (7);
    }
  baz (a);
 }
 ...

 When compiling at -O2, pr51879-12.c.094t.pre looks like this:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1717_4 = barD.1703 (7);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1718_6 = bar2D.1705 (6);
  aD.1713_7 = D.1717_4 + D.1718_6;
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1720_8 = bar2D.1705 (6);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1721_10 = barD.1703 (7);
  aD.1713_11 = D.1720_8 + D.1721_10;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1713_1 = PHI <aD.1713_7(3), aD.1713_11(4)>
  # .MEMD.1722_13 = VDEF <.MEMD.1722_12(D)>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1707 (aD.1713_1);
  # VUSE <.MEMD.1722_13>
  return;
 ...
 block 3 and 4 can be tail-merged.

 Value numbering numbers the two phi arguments a_7 and a_11 the same so the
 problem is not in value numbering:
 ...
 Setting value number of a_11 to a_7 (changed)
 ...

 There are 2 reasons that tail_merge_optimize doesn't optimize this:

 1.
 The clause
  is_gimple_assign (stmt) && local_def (gimple_get_lhs (stmt))
  && !gimple_has_side_effects (stmt)
 used in both same_succ_hash and gsi_advance_bw_nondebug_nonlocal evaluates to
 false for pure calls.
 This is fixed by replacing is_gimple_assign with gimple_has_lhs.

 2.
 In same_succ_equal we check gimples from the 2 bbs side-by-side:
 ...
  gsi1 = gsi_start_nondebug_bb (bb1);
  gsi2 = gsi_start_nondebug_bb (bb2);
  while (!(gsi_end_p (gsi1) || gsi_end_p (gsi2)))
    {
      s1 = gsi_stmt (gsi1);
      s2 = gsi_stmt (gsi2);
      if (gimple_code (s1) != gimple_code (s2))
        return 0;
      if (is_gimple_call (s1) && !gimple_call_same_target_p (s1, s2))
        return 0;
      gsi_next_nondebug (&gsi1);
      gsi_next_nondebug (&gsi2);
    }
 ...
 and we'll be comparing 'bar (7)' and 'bar2 (6)', and gimple_call_same_target_p
 will return false.
 This is fixed by ignoring local defs in this comparison, by using
 gsi_advance_fw_nondebug_nonlocal on the iterators.

 bootstrapped and reg-tested on x86_64.

 ok for stage1?

Sorry for responding so late ... I think these fixes hint that we should
use structural equality as a fallback if value-numbering doesn't equate
two stmt effects.  That is, treat two stmts with exactly the same operands
and flags as equal, using value-numbering to canonicalize operands
(when they are SSA names) for that comparison, or use VN entirely
if there are no side-effects on the stmt.

Changing value-numbering of virtual operands, even if it looks correct in the
simple cases you change, doesn't look like a general solution for the missed
tail merging opportunities.

Richard.

 Thanks,
 - Tom

 2012-02-02  Tom de Vries  t...@codesourcery.com

        * tree-ssa-tail-merge.c (local_def): Move up.
        (stmt_local_def): New function, factored out of same_succ_hash.  Use
        gimple_has_lhs instead of is_gimple_assign.
        (gsi_advance_nondebug_nonlocal): New function, factored out of
        gsi_advance_bw_nondebug_nonlocal.  Use stmt_local_def.
        (gsi_advance_fw_nondebug_nonlocal): New function.
        (gsi_advance_bw_nondebug_nonlocal): Use gsi_advance_nondebug_nonlocal.
        Move up.
        (same_succ_hash): Use stmt_local_def.
        (same_succ_equal): Use gsi_advance_fw_nondebug_nonlocal.

        * gcc.dg/pr51879-12.c: New test.
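
The second fix, skipping local defs while walking two blocks side by side, can be modelled outside GCC. The following is a toy sketch under loud assumptions: `struct stmt`, `skip_local_defs` and `blocks_equal` are hypothetical and only mimic the shape of the comparison loop quoted above. They are not the tree-ssa-tail-merge.c data structures, and the end-of-block handling is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum kind { ASSIGN, CALL };

/* Toy statement: LOCAL_DEF marks a side-effect-free statement whose
   result is only used inside its own block.  */
struct stmt { enum kind kind; const char *target; bool local_def; };

/* Advance I past statements that define purely local values.  */
static size_t
skip_local_defs (const struct stmt *s, size_t i, size_t n)
{
  while (i < n && s[i].local_def)
    i++;
  return i;
}

/* Compare two blocks side by side, ignoring local defs.  */
static bool
blocks_equal (const struct stmt *b1, size_t n1,
              const struct stmt *b2, size_t n2)
{
  assert (b1 != NULL && b2 != NULL);
  size_t i = skip_local_defs (b1, 0, n1);
  size_t j = skip_local_defs (b2, 0, n2);
  while (i < n1 && j < n2)
    {
      if (b1[i].kind != b2[j].kind)
        return false;
      if (b1[i].kind == CALL && strcmp (b1[i].target, b2[j].target) != 0)
        return false;
      i = skip_local_defs (b1, i + 1, n1);
      j = skip_local_defs (b2, j + 1, n2);
    }
  return i == n1 && j == n2;
}
```

In this model the two pure calls in blocks 3 and 4 are local defs, so only the final assignment is compared and the differing call order no longer blocks the merge.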


Re: [committed] Remove myself as vectorizer maintainer

2012-02-13 Thread Julian Brown
On Tue, 7 Feb 2012 15:44:04 +0200
Ira Rosen i...@il.ibm.com wrote:

 
 Hi,
 
 I am starting to work on a new project and won't be able to continue
 with vectorizer maintenance.
 
 I'd like to thank all the people I had a chance to work with for
 making my GCC experience so enjoyable.

Thanks for all the hard work on auto-vectorization over the years! I'm
sure your contributions will be missed.

Cheers,

Julian


Re: [PATCH] Fix for PR52081 - Missed tail merging with pure calls

2012-02-13 Thread Tom de Vries
On 13/02/12 12:54, Richard Guenther wrote:
 On Thu, Feb 2, 2012 at 11:44 AM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,

 this patch fixes PR52081.

 Consider test-case pr51879-12.c:
 ...
 __attribute__((pure)) int bar (int);
 __attribute__((pure)) int bar2 (int);
 void baz (int);

 int x, z;

 void
 foo (int y)
 {
  int a = 0;
  if (y == 6)
{
  a += bar (7);
  a += bar2 (6);
}
  else
{
  a += bar2 (6);
  a += bar (7);
}
  baz (a);
 }
 ...

 When compiling at -O2, pr51879-12.c.094t.pre looks like this:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1717_4 = barD.1703 (7);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1718_6 = bar2D.1705 (6);
  aD.1713_7 = D.1717_4 + D.1718_6;
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1720_8 = bar2D.1705 (6);
  # VUSE <.MEMD.1722_12(D)>
  # USE = nonlocal escaped
  D.1721_10 = barD.1703 (7);
  aD.1713_11 = D.1720_8 + D.1721_10;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1713_1 = PHI <aD.1713_7(3), aD.1713_11(4)>
  # .MEMD.1722_13 = VDEF <.MEMD.1722_12(D)>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1707 (aD.1713_1);
  # VUSE <.MEMD.1722_13>
  return;
 ...
 block 3 and 4 can be tail-merged.

 Value numbering numbers the two phi arguments a_7 and a_11 the same so the
 problem is not in value numbering:
 ...
 Setting value number of a_11 to a_7 (changed)
 ...

 There are 2 reasons that tail_merge_optimize doesn't optimize this:

 1.
 The clause
  is_gimple_assign (stmt) && local_def (gimple_get_lhs (stmt))
  && !gimple_has_side_effects (stmt)
 used in both same_succ_hash and gsi_advance_bw_nondebug_nonlocal evaluates to
 false for pure calls.
 This is fixed by replacing is_gimple_assign with gimple_has_lhs.

 2.
 In same_succ_equal we check gimples from the 2 bbs side-by-side:
 ...
  gsi1 = gsi_start_nondebug_bb (bb1);
  gsi2 = gsi_start_nondebug_bb (bb2);
  while (!(gsi_end_p (gsi1) || gsi_end_p (gsi2)))
    {
      s1 = gsi_stmt (gsi1);
      s2 = gsi_stmt (gsi2);
      if (gimple_code (s1) != gimple_code (s2))
        return 0;
      if (is_gimple_call (s1) && !gimple_call_same_target_p (s1, s2))
        return 0;
      gsi_next_nondebug (&gsi1);
      gsi_next_nondebug (&gsi2);
    }
 ...
 and we'll be comparing 'bar (7)' and 'bar2 (6)', and gimple_call_same_target_p
 will return false.
 This is fixed by ignoring local defs in this comparison, by using
 gsi_advance_fw_nondebug_nonlocal on the iterators.

 bootstrapped and reg-tested on x86_64.

 ok for stage1?
 
 Sorry for responding so late ... 

no problem :)

 I think these fixes hint that we should
 use structural equality as a fallback if value-numbering doesn't equate
 two stmt effects.  That is, treat two stmts with exactly the same operands
 and flags as equal, using value-numbering to canonicalize operands
 (when they are SSA names) for that comparison, or use VN entirely
 if there are no side-effects on the stmt.
 
 Changing value-numbering of virtual operands, even if it looks correct in the
 simple cases you change, doesn't look like a general solution for the missed
 tail merging opportunities.
 

Your comment is relevant for the other recent tail-merge related fixes I
submitted, but I think not for this one.

In this case, value-numbering manages to value number the 2 phi-alternatives
equal. It's tail-merging that doesn't take advantage of this, by treating pure
function calls the same as non-pure function calls. The fixes are therefore in
tail-merging, not in value numbering.

So, ok for stage1?

Thanks,
- Tom

 Richard.
 
 Thanks,
 - Tom

 2012-02-02  Tom de Vries  t...@codesourcery.com

* tree-ssa-tail-merge.c (local_def): Move up.
(stmt_local_def): New function, factored out of same_succ_hash.  Use
gimple_has_lhs instead of is_gimple_assign.
(gsi_advance_nondebug_nonlocal): New function, factored out of
gsi_advance_bw_nondebug_nonlocal.  Use stmt_local_def.
(gsi_advance_fw_nondebug_nonlocal): New function.
(gsi_advance_bw_nondebug_nonlocal): Use gsi_advance_nondebug_nonlocal.
Move up.
(same_succ_hash): Use stmt_local_def.
(same_succ_equal): Use gsi_advance_fw_nondebug_nonlocal.

* gcc.dg/pr51879-12.c: New test.



RE: [Patch,wwwdocs,AVR]: AVR release notes

2012-02-13 Thread Weddington, Eric

 -Original Message-
 From: Gerald Pfeifer
 Sent: Sunday, February 12, 2012 3:17 PM
 To: Georg-Johann Lay
 Cc: gcc-patches@gcc.gnu.org; Denis Chertykov; Weddington, Eric
 Subject: Re: [Patch,wwwdocs,AVR]: AVR release notes
 
 
 This looks like an impressive release for AVR!
 
 Gerald

Johann has been doing some excellent work for the AVR backend. It's been
very much appreciated.

Eric Weddington



[PR52001] too many cse reverse equiv exprs (take2)

2012-02-13 Thread Alexandre Oliva
Jakub asked to have a closer look at the problem, and I found we could
do somewhat better.  The first thing I noticed was that the problem was
that, in each block that computed a (base+const), we created a new VALUE
for the expression (with the same const and global base), and a new
reverse operation.

This was wrong.  Clearly we should reuse the same expression.  I had to
arrange for the expression to be retained across basic blocks, for it
was function invariant.  I split out the code to detect invariants from
the function that removes entries from the cselib hash table across
blocks, and made it recursive so that a VALUE equivalent to (plus
(value) (const_int)) will be retained, if the base value fits (maybe
recursively) the definition of invariant.

An earlier attempt to address this issue remained in cselib: using the
canonical value to build the reverse expression.  I believe it has a
potential of avoiding the creation of redundant reverse expressions, for
expressions involving equivalent but different VALUEs will evaluate to
different hashes.  I haven't observed effects WRT the given testcase,
before or after the change that actually fixed the problem, because we
now find the same base expression and thus reuse the reverse_op as well,
but I figured I'd keep it in for it is very cheap and possibly useful.

Regstrapped on x86_64-linux-gnu and i686-pc-linux-gnu.  Ok to install?

for gcc/ChangeLog
from  Alexandre Oliva  aol...@redhat.com

	PR debug/52001
	* cselib.c (invariant_p): Split out of...
	(preserve_only_constants): ... this.  Preserve plus expressions
	of invariant values and constants.
	* var-tracking.c (reverse_op): Don't drop equivs of constants.
	Use canonical value to build reverse operation.
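
The recursive part of the change can be sketched in isolation. This is a toy model only, assuming a much simpler representation than cselib's: `struct value`, `enum loc_kind` and `toy_invariant_p` are hypothetical, and each value has a single location that is a constant, a register, or (plus (value) (const_int)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum loc_kind { LOC_CONST, LOC_PLUS_CONST, LOC_REG };

/* Toy value: for LOC_PLUS_CONST, BASE is the inner value and ADDEND
   the constant.  IS_CFA_BASE marks the preserved CFA base value.  */
struct value
{
  enum loc_kind kind;
  const struct value *base;
  long addend;
  bool is_cfa_base;
};

/* Return true if V is a constant or a function invariant.  A
   base-plus-constant value is invariant iff its base is, which is
   checked recursively.  */
static bool
toy_invariant_p (const struct value *v)
{
  assert (v != NULL);
  if (v->is_cfa_base)
    return true;
  switch (v->kind)
    {
    case LOC_CONST:
      return true;
    case LOC_PLUS_CONST:
      return v->base != NULL && toy_invariant_p (v->base);
    default:
      return false;
    }
}
```

Because the check recurses, a chain such as (cfa_base + 8) + 4 is retained across blocks, while an offset from an ordinary register value is not.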

Index: gcc/cselib.c
===
--- gcc/cselib.c.orig	2012-02-12 06:13:40.676385499 -0200
+++ gcc/cselib.c	2012-02-12 09:07:00.653579375 -0200
@@ -383,22 +383,29 @@ cselib_clear_table (void)
   cselib_reset_table (1);
 }
 
-/* Remove from hash table all VALUEs except constants
-   and function invariants.  */
+/* Return TRUE if V is a constant or a function invariant, FALSE
+   otherwise.  */
 
-static int
-preserve_only_constants (void **x, void *info ATTRIBUTE_UNUSED)
+static bool
+invariant_p (cselib_val *v)
 {
-  cselib_val *v = (cselib_val *)*x;
   struct elt_loc_list *l;
 
+  if (v == cfa_base_preserved_val)
+return true;
+
+  /* Keep VALUE equivalences around.  */
+  for (l = v-locs; l; l = l-next)
+if (GET_CODE (l-loc) == VALUE)
+  return true;
+
   if (v-locs != NULL
v-locs-next == NULL)
 {
   if (CONSTANT_P (v-locs-loc)
 	   (GET_CODE (v-locs-loc) != CONST
 	  || !references_value_p (v-locs-loc, 0)))
-	return 1;
+	return true;
   /* Although a debug expr may be bound to different expressions,
 	 we can preserve it as if it was constant, to get unification
 	 and proper merging within var-tracking.  */
@@ -406,24 +413,29 @@ preserve_only_constants (void **x, void 
           || GET_CODE (v->locs->loc) == DEBUG_IMPLICIT_PTR
           || GET_CODE (v->locs->loc) == ENTRY_VALUE
           || GET_CODE (v->locs->loc) == DEBUG_PARAMETER_REF)
-        return 1;
-      if (cfa_base_preserved_val)
-        {
-          if (v == cfa_base_preserved_val)
-            return 1;
-          if (GET_CODE (v->locs->loc) == PLUS
-              && CONST_INT_P (XEXP (v->locs->loc, 1))
-              && XEXP (v->locs->loc, 0) == cfa_base_preserved_val->val_rtx)
-            return 1;
-        }
+        return true;
+
+      /* (plus (value V) (const_int C)) is invariant iff V is invariant.  */
+      if (GET_CODE (v->locs->loc) == PLUS
+          && CONST_INT_P (XEXP (v->locs->loc, 1))
+          && GET_CODE (XEXP (v->locs->loc, 0)) == VALUE
+          && invariant_p (CSELIB_VAL_PTR (XEXP (v->locs->loc, 0))))
+        return true;
     }
 
-  /* Keep VALUE equivalences around.  */
-  for (l = v->locs; l; l = l->next)
-    if (GET_CODE (l->loc) == VALUE)
-      return 1;
+  return false;
+}
+
+/* Remove from hash table all VALUEs except constants
+   and function invariants.  */
+
+static int
+preserve_only_constants (void **x, void *info ATTRIBUTE_UNUSED)
+{
+  cselib_val *v = (cselib_val *)*x;
 
-  htab_clear_slot (cselib_hash_table, x);
+  if (!invariant_p (v))
+    htab_clear_slot (cselib_hash_table, x);
   return 1;
 }
 
Index: gcc/var-tracking.c
===
--- gcc/var-tracking.c.orig	2012-02-12 06:13:38.633412886 -0200
+++ gcc/var-tracking.c	2012-02-12 10:09:49.0 -0200
@@ -5298,7 +5298,6 @@ reverse_op (rtx val, const_rtx expr, rtx
 {
   rtx src, arg, ret;
   cselib_val *v;
-  struct elt_loc_list *l;
   enum rtx_code code;
 
   if (GET_CODE (expr) != SET)
@@ -5334,13 +5333,9 @@ reverse_op (rtx val, const_rtx expr, rtx
   if (!v || !cselib_preserved_value_p (v))
 return;
 
-  /* Adding a reverse op isn't useful if V already has an always valid
- location.  Ignore ENTRY_VALUE, while it is always constant, we should
- prefer non-ENTRY_VALUE locations whenever possible.  */
-  for (l = v->locs; 

Re: [PR52001] too many cse reverse equiv exprs (take2)

2012-02-13 Thread Jakub Jelinek
On Mon, Feb 13, 2012 at 12:27:35PM -0200, Alexandre Oliva wrote:
 Jakub asked to have a closer look at the problem, and I found we could
 do somewhat better.  The first thing I noticed was that the problem was
 that, in each block that computed a (base+const), we created a new VALUE
 for the expression (with the same const and global base), and a new
 reverse operation.

I'm not convinced you want the
 +  /* Keep VALUE equivalences around.  */
 +  for (l = v->locs; l; l = l->next)
 +    if (GET_CODE (l->loc) == VALUE)
 +      return true;
hunk in invariant_p; I'd say it should stay in preserve_only_constants,
since a value equivalence isn't necessarily invariant.
Otherwise the cselib.c changes look ok to me, but I don't understand why you
are removing the var-tracking.c loop.  While cselib will handle the situation
better with your changes, for values that are already invariant (I guess
canonical_cselib_val should be called before that loop, and perhaps instead
of testing CONSTANT_P it could test invariant_p, if you rename it to
cselib_invariant_p and export it) adding any reverse ops is really
just a waste of resources, because we already have a better location in the
list.  Adding the extra loc doesn't improve it in any way.
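
To make the rule under discussion concrete, here is a minimal stand-alone
sketch of the recursion "(plus (value V) (const_int C)) is invariant iff V
is invariant".  Everything in it (struct demo_val, the DEMO_* kinds,
demo_invariant_p) is a hypothetical simplification for illustration and not
cselib's real data structures; it deliberately leaves out the VALUE
equivalence case questioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the single location of a cselib VALUE.  */
enum demo_kind
{
  DEMO_CONSTANT,    /* e.g. a const_int or a symbol */
  DEMO_REGISTER,    /* an ordinary register: not invariant */
  DEMO_PLUS_CONST   /* models (plus (value BASE) (const_int C)) */
};

struct demo_val
{
  enum demo_kind kind;
  const struct demo_val *base;  /* used by DEMO_PLUS_CONST only */
  bool is_cfa_base;             /* models v == cfa_base_preserved_val */
};

/* Return true if V is a constant, the CFA base, or a constant offset
   from something that is itself invariant.  */
static bool
demo_invariant_p (const struct demo_val *v)
{
  if (v->is_cfa_base)
    return true;

  switch (v->kind)
    {
    case DEMO_CONSTANT:
      return true;
    case DEMO_PLUS_CONST:
      return v->base != NULL && demo_invariant_p (v->base);
    default:
      return false;
    }
}
```

With this model, an offset from an offset from the CFA base is still
invariant, while an offset from an ordinary register is not.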

Jakub


[PATCH] Fix PR52178

2012-02-13 Thread Richard Guenther

This fixes PR52178, the failure to bootstrap Ada with LTO (well,
until you hit the next problem).  A self-referential DECL_QUALIFIER
makes us think that a QUAL_UNION_TYPE is of variable size, which
makes us stream that type locally, wrecking type merging and later
ICEing in the type verifier.  While it looks like variably_modified_type_p
should not inspect DECL_QUALIFIER, a less intrusive patch for 4.7
relies on DECL_QUALIFIER being unused after gimplification, and thus
clears it and does not stream it.

LTO bootstrapped until I hit an optimization ICE when optimizing gnat1;
a regular bootstrap & regtest is pending on x86_64-unknown-linux-gnu.

Richard.

2012-02-13  Richard Guenther  rguent...@suse.de

PR lto/52178
* tree-streamer-in.c (lto_input_ts_field_decl_tree_pointers):
Do not stream DECL_QUALIFIER.
* tree-streamer-out.c (write_ts_field_decl_tree_pointers): Likewise.
* tree.c (free_lang_data_in_decl): Free DECL_QUALIFIER.
(find_decls_types_r): Do not walk DECL_QUALIFIER.

Index: gcc/tree-streamer-in.c
===
--- gcc/tree-streamer-in.c  (revision 184151)
+++ gcc/tree-streamer-in.c  (working copy)
@@ -640,7 +640,7 @@ lto_input_ts_field_decl_tree_pointers (s
 {
   DECL_FIELD_OFFSET (expr) = stream_read_tree (ib, data_in);
   DECL_BIT_FIELD_TYPE (expr) = stream_read_tree (ib, data_in);
-  DECL_QUALIFIER (expr) = stream_read_tree (ib, data_in);
+  /* Do not stream DECL_QUALIFIER, it is useless after gimplification.  */
   DECL_FIELD_BIT_OFFSET (expr) = stream_read_tree (ib, data_in);
   DECL_FCONTEXT (expr) = stream_read_tree (ib, data_in);
 }
Index: gcc/tree-streamer-out.c
===
--- gcc/tree-streamer-out.c (revision 184151)
+++ gcc/tree-streamer-out.c (working copy)
@@ -552,7 +552,7 @@ write_ts_field_decl_tree_pointers (struc
 {
   stream_write_tree (ob, DECL_FIELD_OFFSET (expr), ref_p);
   stream_write_tree (ob, DECL_BIT_FIELD_TYPE (expr), ref_p);
-  stream_write_tree (ob, DECL_QUALIFIER (expr), ref_p);
+  /* Do not stream DECL_QUALIFIER, it is useless after gimplification.  */
   stream_write_tree (ob, DECL_FIELD_BIT_OFFSET (expr), ref_p);
   stream_write_tree (ob, DECL_FCONTEXT (expr), ref_p);
 }
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 184151)
+++ gcc/tree.c  (working copy)
@@ -4596,7 +4596,10 @@ free_lang_data_in_decl (tree decl)
   free_lang_data_in_one_sizepos (DECL_SIZE (decl));
   free_lang_data_in_one_sizepos (DECL_SIZE_UNIT (decl));
   if (TREE_CODE (decl) == FIELD_DECL)
-free_lang_data_in_one_sizepos (DECL_FIELD_OFFSET (decl));
+{
+  free_lang_data_in_one_sizepos (DECL_FIELD_OFFSET (decl));
+  DECL_QUALIFIER (decl) = NULL_TREE;
+}
 
  if (TREE_CODE (decl) == FUNCTION_DECL)
 {
@@ -4800,7 +4803,6 @@ find_decls_types_r (tree *tp, int *ws, v
{
  fld_worklist_push (DECL_FIELD_OFFSET (t), fld);
  fld_worklist_push (DECL_BIT_FIELD_TYPE (t), fld);
- fld_worklist_push (DECL_QUALIFIER (t), fld);
  fld_worklist_push (DECL_FIELD_BIT_OFFSET (t), fld);
  fld_worklist_push (DECL_FCONTEXT (t), fld);
}


Re: [PATCH] Re: New atomics not mentioned in /gcc-4.7/changes.html

2012-02-13 Thread Andrew MacLeod

On 02/12/2012 04:48 PM, Gerald Pfeifer wrote:

On Wed, 8 Feb 2012, Andrew MacLeod wrote:

Checked in the shortened version and the <code> changes.  How's that?
seems better :-)

Yep, thanks!  There was just a minor grammar issue, which I went ahead and fixed.

On the title page, I was thinking to refer to the release notes
entry (gcc-4.7/changes.html), and would make this change for you
if you agree.  If not, we can leave it as is.


I'm happy with your changes  :-)

Andrew


Re: [PING] New port resubmission for TILEPro and TILE-Gx

2012-02-13 Thread Walter Lee
Ping.  Can someone please review these ports?  Here is a summary of
the submission.

Summary of changes in latest submit:

http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01854.html

Latest submit:

1/6 toplevel: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01860.html
2/6 contrib: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01855.html
3/6 gcc: http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01494.html
4/6 libcpp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01857.html
5/6 libgcc: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01858.html
6/6 libgomp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01859.html

1st round review comments:

http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01385.html
http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01387.html

2nd round review comments:

http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01232.html
http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01247.html

Thanks,

Walter Lee



[Committed] S/390: Adjust gcc.c-torture/execute/pr51933.c testcase

2012-02-13 Thread Andreas Krebbel
Committed to mainline.

2012-02-13  Andreas Krebbel  andreas.kreb...@de.ibm.com

* gcc.c-torture/execute/pr51933.c: Modify for s390 31 bit.
---
 gcc/testsuite/gcc.c-torture/execute/pr51933.c |8 
 1 file changed, 8 insertions(+)

Index: gcc/testsuite/gcc.c-torture/execute/pr51933.c
===
*** gcc/testsuite/gcc.c-torture/execute/pr51933.c.orig
--- gcc/testsuite/gcc.c-torture/execute/pr51933.c
*** static unsigned char v2[256], v3[256];
*** 6,12 
--- 6,20 
  __attribute__((noclone, noinline)) void
  foo (void)
  {
+ #if defined(__s390__) && !defined(__zarch__)
+   /* S/390 31 bit cannot deal with more than one literal pool
+      reference per insn.  */
+   asm volatile ("" : : "g" (&v1) : "memory");
+   asm volatile ("" : : "g" (&v2[0]));
+   asm volatile ("" : : "g" (&v3[0]));
+ #else
    asm volatile ("" : : "g" (&v1), "g" (&v2[0]), "g" (&v3[0]) : "memory");
+ #endif
  }
  
  __attribute__((noclone, noinline)) int



[Patch,AVR]: Built-in for non-contiguous port layouts

2012-02-13 Thread Georg-Johann Lay
This patch set removes the __builtin_avr_map8 and __builtin_avr_map16
built-ins and implements a built-in __builtin_avr_insert_bits instead.

There are several reasons for this:

* From user feedback I learned that speed matters more than size here.

* I found that the new built-in has better usability and is a better fit
  for the intended use cases.

* Better code is generated by implementing hook TARGET_FOLD_BUILTIN.

* The implementation is simpler (except for the new folding part).

* There were issues with __builtin_avr_map*.  Instead of fixing these
  I went ahead and removed them altogether.

* The new built-in is generic enough to provide the old ones'
  functionality easily.

There are 2 new test programs for this built-in; both pass.

Ok for trunk?

Johann


gcc/doc/
* extend.texi (AVR Built-in Functions): Remove doc for
__builtin_avr_map8, __builtin_avr_map16.
Document __builtin_avr_insert_bits.

gcc/testsuite/
* gcc.target/avr/torture/builtin_insert_bits-1.c: New test.
* gcc.target/avr/torture/builtin_insert_bits-2.c: New test.

gcc/
* config/avr/avr.md (map_bitsqi, map_bitshi): Remove.
(insert_bits): New insn.
(adjust_len.map_bits): Rename to insert_bits.
(UNSPEC_MAP_BITS): Rename to UNSPEC_INSERT_BITS.

* avr-protos.h (avr_out_map_bits): Remove.
(avr_out_insert_bits, avr_has_nibble_0xf): New.

* config/avr/constraints.md (Cxf,C0f): New.

* config/avr/avr.c (avr_cpu_cpp_builtins): Remove built-in
defines __BUILTIN_AVR_MAP8, __BUILTIN_AVR_MAP16.
New built-in define __BUILTIN_AVR_INSERT_BITS.

* config/avr/avr.c (TARGET_FOLD_BUILTIN): New define.
(enum avr_builtin_id): Add AVR_BUILTIN_INSERT_BITS.
(avr_move_bits): Rewrite.
(avr_fold_builtin, avr_map_metric, avr_map_decompose): New static
functions.
(avr_map_op_t): New typedef.
(avr_map_op): New static variable.
(avr_out_insert_bits, avr_has_nibble_0xf): New functions.
(adjust_insn_length): Handle ADJUST_LEN_INSERT_BITS.
(avr_init_builtins): Add definition for __builtin_avr_insert_bits.
(bdesc_3arg, avr_expand_triop_builtin): New.
(avr_expand_builtin): Use them. And handle AVR_BUILTIN_INSERT_BITS.

(avr_revert_map, avr_swap_map, avr_id_map, avr_sig_map): Remove.
(avr_map_hamming_byte, avr_map_hamming_nonstrict): Remove.
(avr_map_equal_p, avr_map_sig_p): Remove.
(avr_out_swap_bits, avr_out_revert_bits, avr_out_map_bits): Remove.
(bdesc_2arg): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
(adjust_insn_length): Remove handling for ADJUST_LEN_MAP_BITS.
(enum avr_builtin_id): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
(avr_init_builtins): Remove __builtin_avr_map8, __builtin_avr_map16.
(avr_expand_builtin): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
Index: doc/extend.texi
===
--- doc/extend.texi	(revision 184156)
+++ doc/extend.texi	(working copy)
@@ -8810,33 +8810,53 @@ might increase delay time. @code{ticks}
 integer constant; delays with a variable number of cycles are not supported.
 
 @smallexample
- unsigned char __builtin_avr_map8 (unsigned long map, unsigned char val)
+ unsigned char __builtin_avr_insert_bits (unsigned long map, unsigned char bits, unsigned char val)
 @end smallexample
 
 @noindent
-Each bit of the result is copied from a specific bit of @code{val}.
-@code{map} is a compile time constant that represents a map composed
-of 8 nibbles (4-bit groups):
-The @var{n}-th nibble of @code{map} specifies which bit of @code{val}
-is to be moved to the @var{n}-th bit of the result.
-For example, @code{map = 0x76543210} represents identity: The MSB of
-the result is read from the 7-th bit of @code{val}, the LSB is
-read from the 0-th bit to @code{val}, etc.
-Two more examples: @code{0x01234567} reverses the bit order and
-@code{0x32107654} is equivalent to a @code{swap} instruction.
+Insert bits from @var{bits} into @var{val} and return the resulting
+value. The nibbles of @var{map} determine how the insertion is
+performed: Let @var{X} be the @var{n}-th nibble of @var{map}
+@enumerate
+@item If @var{X} is @code{0xf},
+then the @var{n}-th bit of @var{val} is returned unaltered.
+
+@item If X is in the range 0@dots{}7,
+then the @var{n}-th result bit is set to the @var{X}-th bit of @var{bits}
+
+@item If X is in the range 8@dots{}@code{0xe},
+then the @var{n}-th result bit is undefined.
+@end enumerate
 
 @noindent
-One typical use case for this and the following built-in is adjusting input and
-output values to non-contiguous port layouts.
+One typical use case for this built-in is adjusting input and
+output values to non-contiguous port layouts. Some examples:
 
 @smallexample
- unsigned int __builtin_avr_map16 (unsigned long long map, unsigned int val)
+// same as val, bits is unused

Re: trans-mem: virtual ops for gimple_transaction

2012-02-13 Thread Richard Henderson
On 02/13/2012 01:35 AM, Richard Guenther wrote:
 On Fri, 10 Feb 2012, Richard Henderson wrote:
 
 On 02/10/2012 01:44 AM, Richard Guenther wrote:
 What is the reason to keep a GIMPLE_TRANSACTION stmt after
 TM lowering and not lower it to a builtin function call?

 Because real optimization hasn't happened yet, and we hold
 out hope that we'll be able to delete stuff as unreachable.
 Especially all instances of transaction_cancel.

 It seems the body is empty after lowering (what's the label thing?)

 The label is the transaction cancel label.

 When we finally convert GIMPLE_TRANSACTION to a builtin, we'll
 generate different code layouts with and without a cancel.
 
 Ah, I see.  But wouldn't a placeholder builtin function be
 effectively the same as using a new GIMPLE stmt kind?

Except for the whole need to hold on to a label thing.

Honestly, think about that for 10 seconds and tell me that
a builtin is better than simply re-tasking the gimple code
that we already have around.


r~


New German PO file for 'gcc' (version 4.7-b20120128)

2012-02-13 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

http://translationproject.org/latest/gcc/de.po

(This file, 'gcc-4.7-b20120128.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.
coordina...@translationproject.org



Re: [PATCH] [RFC, GCC 4.8] Optimize conditional moves from adjacent memory locations

2012-02-13 Thread Joseph S. Myers
On Mon, 13 Feb 2012, Richard Guenther wrote:

 Indeed.  But note that the transform is not valid as *this_node may cross
 a page boundary and thus either pointer load may trap if the other does not
 (well, unless the C standard (and thus our middle-end) would require that
 iff ptr->component does not trap that *ptr does not trap either - we would
 require an operand_equal_p (get_base_address ()) for both addresses).
 
 Joseph, can you clarify what the C standard specifies here?

The question of what the relevant objects for an access are isn't 
well-defined in general, but it seems doubtful that accessing via a 
structure type is valid if the whole structure isn't in accessible memory.  
(Whereas you can't speculatively load from x[1] just because x[0] was 
accessed - x might point to an array of size 1.  And of course this 
applies with flexible array members - access to any bit of the structure 
means the part before the flexible array member is available, but the 
flexible array member may not extend beyond the part accessed.)
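
A small sketch of the three situations distinguished above, under the
reading given in this message (which itself calls the question not
well-defined in general).  All names are hypothetical and chosen only for
illustration.

```c
#include <stddef.h>

struct demo_node { int key; int value; };

struct demo_packet { size_t len; unsigned char payload[]; };

/* Both fields are reached through the same struct object, so loading
   both and selecting afterwards touches nothing beyond *N.  */
static int
demo_pick_field (const struct demo_node *n, int want_value)
{
  return want_value ? n->value : n->key;
}

/* X may point to an array of size 1, so x[1] must only be read when it
   is actually requested; it cannot be loaded speculatively.  */
static int
demo_pick_element (const int *x, int want_second)
{
  return want_second ? x[1] : x[0];
}

/* Only the part before the flexible array member is known to exist;
   the payload is read strictly within the recorded length.  */
static unsigned
demo_payload_sum (const struct demo_packet *p)
{
  unsigned sum = 0;

  for (size_t i = 0; i < p->len; i++)
    sum += p->payload[i];
  return sum;
}
```

The first function is the shape the proposed transform targets; the other
two are shapes where an unconditional load of both candidates would read
memory that may not exist.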

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Documenting the MIPS changes in 4.7

2012-02-13 Thread Richard Sandiford
Gerald Pfeifer ger...@pfeifer.com writes:
 On Sun, 5 Feb 2012, Richard Sandiford wrote:
 I've committed this patch to describe the MIPS changes in GCC 4.7.
 Corrections, comments, and help with wordsmithing are all welcome.

 Nice!  How about the small follow-up below?

The first definitely looks good, thanks.  Not sure either way about
the second; I'll leave it up to you.

Richard


[Patch, libfortran] RFC: Shared vtables, constification

2012-02-13 Thread Janne Blomqvist
Hi,

the attached patch changes the low-level libgfortran IO dispatching
mechanism to use shared vtables for each stream type, instead of all
the function pointers being replicated for each unit. This is similar
to e.g. how the C++ frontend implements vtables. The benefits are:

- Slightly smaller heap memory overhead for each unit as only the
vtable pointer needs to be stored, and slightly faster unit
initialization as only the vtable pointer needs to be setup instead of
all the function pointers in the stream struct.

- Looking at unix.o with readelf, one sees

Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset
0x15550 contains 8 entries:

and similarly for the other vtables; according to
http://www.airs.com/blog/archives/189 this means that after relocation
the page where this data resides may be marked read-only.
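
As a rough illustration of the layout described above, here is a minimal
stand-alone sketch of a shared, const vtable with one pointer per stream.
The demo_* names and the two-entry table are invented for the example; the
real libgfortran structs have more members and different signatures.

```c
#include <stddef.h>

struct demo_stream;

/* One table per stream type, shared by all streams of that type.  */
struct demo_stream_vtable
{
  long (*read) (struct demo_stream *, void *, long);
  int (*close) (struct demo_stream *);
};

/* Each stream stores a single pointer instead of a copy of every
   function pointer.  */
struct demo_stream
{
  const struct demo_stream_vtable *vptr;
  long position;
};

static long
demo_mem_read (struct demo_stream *s, void *buf, long n)
{
  (void) buf;           /* the sketch only tracks the position */
  s->position += n;
  return n;
}

static int
demo_mem_close (struct demo_stream *s)
{
  s->position = -1;
  return 0;
}

/* Being const, the table can be placed in relocated read-only data.  */
static const struct demo_stream_vtable demo_mem_vtable = {
  .read = demo_mem_read,
  .close = demo_mem_close
};

static void
demo_mem_init (struct demo_stream *s)
{
  s->vptr = &demo_mem_vtable;
  s->position = 0;
}

/* Dispatch wrappers: one extra pointer dereference per call.  */
static inline long
demo_sread (struct demo_stream *s, void *buf, long n)
{
  return s->vptr->read (s, buf, n);
}

static inline int
demo_sclose (struct demo_stream *s)
{
  return s->vptr->close (s);
}
```

Initializing a stream now writes one pointer, and every call goes through
s->vptr, which is the extra dereference mentioned as the likely cause of
the .text growth.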

The downside is that the sizes of the .text and .data sections are
increased. Before:

   text    data     bss     dec     hex filename
1116991    6664     592 1124247  112797 ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so

After:

   text    data     bss     dec     hex filename
1117487    6936     592 1125015  112a97 ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so


The data section increase is due to the vtables, the text increase is,
I guess, due to the extra pointer dereference when calling the IO
functions.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8?

2012-02-13  Janne Blomqvist  j...@gcc.gnu.org

* io/unix.h (struct stream): Rename to stream_vtable.
(struct stream): New struct definition.
(sread): Dereference vtable pointer.
(swrite): Likewise.
(sseek): Likewise.
(struncate): Likewise.
(sflush): Likewise.
(sclose): Likewise.
* io/unix.c (raw_vtable): New variable.
(buf_vtable): Likewise.
(mem_vtable): Likewise.
(mem4_vtable): Likewise.
(raw_init): Assign vtable pointer.
(buf_init): Likewise.
(open_internal): Likewise.
(open_internal4): Likewise.



-- 
Janne Blomqvist
diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index 6eef3f9..978c3ff 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -401,17 +401,21 @@ raw_close (unix_stream * s)
   return retval;
 }
 
+static const struct stream_vtable raw_vtable = {
+  .read = (void *) raw_read,
+  .write = (void *) raw_write,
+  .seek = (void *) raw_seek,
+  .tell = (void *) raw_tell,
+  .size = (void *) raw_size,
+  .trunc = (void *) raw_truncate,
+  .close = (void *) raw_close,
+  .flush = (void *) raw_flush 
+};
+
 static int
 raw_init (unix_stream * s)
 {
-  s->st.read = (void *) raw_read;
-  s->st.write = (void *) raw_write;
-  s->st.seek = (void *) raw_seek;
-  s->st.tell = (void *) raw_tell;
-  s->st.size = (void *) raw_size;
-  s->st.trunc = (void *) raw_truncate;
-  s->st.close = (void *) raw_close;
-  s->st.flush = (void *) raw_flush;
+  s->st.vptr = &raw_vtable;
 
   s->buffer = NULL;
   return 0;
@@ -619,17 +623,21 @@ buf_close (unix_stream * s)
   return raw_close (s);
 }
 
+static const struct stream_vtable buf_vtable = {
+  .read = (void *) buf_read,
+  .write = (void *) buf_write,
+  .seek = (void *) buf_seek,
+  .tell = (void *) buf_tell,
+  .size = (void *) buf_size,
+  .trunc = (void *) buf_truncate,
+  .close = (void *) buf_close,
+  .flush = (void *) buf_flush 
+};
+
 static int
 buf_init (unix_stream * s)
 {
-  s->st.read = (void *) buf_read;
-  s->st.write = (void *) buf_write;
-  s->st.seek = (void *) buf_seek;
-  s->st.tell = (void *) buf_tell;
-  s->st.size = (void *) buf_size;
-  s->st.trunc = (void *) buf_truncate;
-  s->st.close = (void *) buf_close;
-  s->st.flush = (void *) buf_flush;
+  s->st.vptr = &buf_vtable;
 
   s->buffer = get_mem (BUFFER_SIZE);
   return 0;
@@ -872,6 +880,31 @@ mem_close (unix_stream * s)
   return 0;
 }
 
+static const struct stream_vtable mem_vtable = {
+  .read = (void *) mem_read,
+  .write = (void *) mem_write,
+  .seek = (void *) mem_seek,
+  .tell = (void *) mem_tell,
+  /* buf_size is not a typo, we just reuse an identical
+ implementation.  */
+  .size = (void *) buf_size,
+  .trunc = (void *) mem_truncate,
+  .close = (void *) mem_close,
+  .flush = (void *) mem_flush 
+};
+
+static const struct stream_vtable mem4_vtable = {
+  .read = (void *) mem_read4,
+  .write = (void *) mem_write4,
+  .seek = (void *) mem_seek,
+  .tell = (void *) mem_tell,
+  /* buf_size is not a typo, we just reuse an identical
+ implementation.  */
+  .size = (void *) buf_size,
+  .trunc = (void *) mem_truncate,
+  .close = (void *) mem_close,
+  .flush = (void *) mem_flush 
+};
 
 /*
   Public functions -- A reimplementation of this module needs to
@@ -895,16 +928,7 @@ open_internal (char *base, int length, gfc_offset offset)
   s->logical_offset = 0;
   s->active = s->file_length = length;
 
-  s->st.close = (void *) mem_close;
-  s->st.seek = 

[PATCH] Fix up vectorizer cost model use of uninitialized value (PR tree-optimization/52210)

2012-02-13 Thread Jakub Jelinek
Hi!

The PR50912 fix changed dt in vect_get_and_check_slp_defs from an
array into a scalar, which breaks the call to vect_model_simple_cost,
as that looks at two array members.  I believe even 4.6 checked just
the first operand, as it called it when processing the first operand,
so IMHO this patch doesn't regress (the very incomplete) cost model
handling and doesn't introduce undefined behavior.
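
The shape of the problem and of the fix can be sketched outside the
vectorizer.  The names below are invented for illustration, and the cost
rule inside demo_simple_cost is an assumption; the only point is that the
callee reads two array elements, so the caller must hand it a real
two-entry array rather than the address of one scalar.

```c
enum demo_def_type
{
  DEMO_UNINITIALIZED_DEF,
  DEMO_CONSTANT_DEF,
  DEMO_INTERNAL_DEF
};

/* Reads both dt[0] and dt[1], like the cost routine described above.  */
static int
demo_simple_cost (const enum demo_def_type dt[2])
{
  int cost = 1;

  for (int i = 0; i < 2; i++)
    if (dt[i] == DEMO_CONSTANT_DEF)
      cost++;
  return cost;
}

/* Passing the address of DT itself would let the callee read one
   element past a scalar.  Copying into a two-entry array with a
   well-defined second entry avoids that.  */
static int
demo_cost_for_operand (enum demo_def_type dt)
{
  enum demo_def_type dts[2];

  dts[0] = dt;
  dts[1] = DEMO_UNINITIALIZED_DEF;
  return demo_simple_cost (dts);
}
```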

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-02-13  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/52210
* tree-vect-slp.c (vect_get_and_check_slp_defs): Call
vect_model_simple_cost with two entry vect_def_type array instead
of an address of dt.

* gcc.dg/pr52210.c: New test.

--- gcc/tree-vect-slp.c.jj  2012-02-07 16:05:51.0 +0100
+++ gcc/tree-vect-slp.c 2012-02-13 10:14:28.017357662 +0100
@@ -321,10 +321,15 @@ vect_get_and_check_slp_defs (loop_vec_in
 vect_model_store_cost (stmt_info, ncopies_for_cost, false,
 &dt, slp_node);
  else
-   /* Not memory operation (we don't call this function for
-  loads).  */
-   vect_model_simple_cost (stmt_info, ncopies_for_cost, &dt,
-   slp_node);
+   {
+ enum vect_def_type dts[2];
+ dts[0] = dt;
+ dts[1] = vect_uninitialized_def;
+ /* Not memory operation (we don't call this function for
+loads).  */
+ vect_model_simple_cost (stmt_info, ncopies_for_cost, dts,
+ slp_node);
+   }
}
}
   else
--- gcc/testsuite/gcc.dg/pr52210.c.jj   2012-02-13 10:27:46.692809216 +0100
+++ gcc/testsuite/gcc.dg/pr52210.c  2012-02-13 10:25:31.0 +0100
@@ -0,0 +1,12 @@
+/* PR tree-optimization/52210 */
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+void
+foo (long *x, long y, long z)
+{
+  long a = x[0];
+  long b = x[1];
+  x[0] = a & ~y;
+  x[1] = b & ~z;
+}

Jakub


[PATCH] Fix __atomic_compare_exchange handling (PR c++/52215)

2012-02-13 Thread Jakub Jelinek
Hi!

As the testcase shows, deciding on whether to convert an argument or not
based on TYPE_SIZE is wrong.  While the old __sync_* builtins in the
_[1248]/_16 variants only had a VPTR as first argument and optionally
I[1248]/I16 argument or arguments that should be converted, the new
__atomic_* builtins also have PTR arguments (e.g. the expected pointer),
BOOL (e.g. weak argument) or INT (e.g. the *memmodel arguments).
Those have invariant types that shouldn't be adjusted based on what type the
first pointer points to.  I[1248]/I16 arguments are unsigned integers,
the arguments that we don't want to adjust are BOOLEAN_TYPE/POINTER_TYPE
or signed integers, so I think we should convert only unsigned
INTEGER_TYPEs.
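
To illustrate why the size comparison goes wrong, here is a toy model of
the two decision rules.  The demo_* types are hypothetical stand-ins for
GCC's tree types, and the old rule is reduced to the size check alone.

```c
#include <stdbool.h>

enum demo_type_code
{
  DEMO_INTEGER_TYPE,
  DEMO_BOOLEAN_TYPE,
  DEMO_POINTER_TYPE
};

struct demo_type
{
  enum demo_type_code code;
  bool is_unsigned;
  unsigned size_bits;
};

/* Old rule (simplified): convert whenever the argument happens to have
   the same size as the type the first pointer points to.  */
static bool
demo_convert_by_size (const struct demo_type *pointee,
                      const struct demo_type *arg)
{
  return pointee->size_bits == arg->size_bits;
}

/* New rule: convert only arguments declared as unsigned integers.  */
static bool
demo_convert_by_kind (const struct demo_type *arg)
{
  return arg->code == DEMO_INTEGER_TYPE && arg->is_unsigned;
}
```

With a 32-bit pointee, a signed 32-bit memmodel argument matches by size
and would be converted under the old rule, whereas the kind-based rule
leaves it, the pointer and the bool arguments alone.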

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-02-13  Jakub Jelinek  ja...@redhat.com

PR c++/52215
* c-common.c (sync_resolve_params): Don't decide whether to convert
or not based on TYPE_SIZE comparison, convert whenever arg_type
is unsigned INTEGER_TYPE.

* g++.dg/ext/atomic-1.C: New test.

--- gcc/c-family/c-common.c.jj  2012-01-26 09:22:17.0 +0100
+++ gcc/c-family/c-common.c 2012-02-13 14:49:15.204685590 +0100
@@ -9336,10 +9336,12 @@ sync_resolve_params (location_t loc, tre
  return false;
}
 
-  /* Only convert parameters if the size is appropriate with new format
-sync routines.  */
-  if (orig_format
- || tree_int_cst_equal (TYPE_SIZE (ptype), TYPE_SIZE (arg_type)))
+  /* Only convert parameters if arg_type is unsigned integer type with
+new format sync routines, i.e. don't attempt to convert pointer
+arguments (e.g. EXPECTED argument of __atomic_compare_exchange_n),
+bool arguments (e.g. WEAK argument) or signed int arguments (memmodel
+kinds).  */
+  if (TREE_CODE (arg_type) == INTEGER_TYPE && TYPE_UNSIGNED (arg_type))
{
  /* Ideally for the first conversion we'd use convert_for_assignment
 so that we get warnings for anything that doesn't match the pointer
--- gcc/testsuite/g++.dg/ext/atomic-1.C.jj  2012-02-13 14:54:33.337864794 +0100
+++ gcc/testsuite/g++.dg/ext/atomic-1.C 2012-02-13 14:53:13.0 +0100
@@ -0,0 +1,12 @@
+// PR c++/52215
+// { dg-do compile }
+
+enum E { ZERO };
+
+int
+main ()
+{
+  E e = ZERO;
+  __atomic_compare_exchange_n (&e, &e, e, true, __ATOMIC_ACQ_REL,
+                               __ATOMIC_RELAXED);
+}

Jakub


[ping 5] [patch] attribute to reverse bitfield allocations

2012-02-13 Thread DJ Delorie

Ping 5...

 Ping 4...
 
  Ping 3?  It's been months with no feedback...
  
   Ping 2 ?
   
   http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
   http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html
 http://gcc.gnu.org/ml/gcc-patches/2012-01/msg00529.html
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01246.html


[PATCH] Fix cselib dump ICE

2012-02-13 Thread Jakub Jelinek
Hi!

While debugging PR52172, I noticed an ICE when dumping RTL: the rest of
cselib seems to test setting_insn for NULL, but this spot doesn't.

Ok for trunk?

2012-02-13  Jakub Jelinek  ja...@redhat.com

* cselib.c (dump_cselib_val): Don't assume l->setting_insn is
non-NULL.

--- gcc/cselib.c.jj 2012-01-26 09:22:21.0 +0100
+++ gcc/cselib.c2012-02-13 11:07:15.109023769 +0100
@@ -2688,8 +2688,11 @@ dump_cselib_val (void **x, void *info)
   fputs (" locs:", out);
   do
     {
-      fprintf (out, "\n  from insn %i ",
-               INSN_UID (l->setting_insn));
+      if (l->setting_insn)
+        fprintf (out, "\n  from insn %i ",
+                 INSN_UID (l->setting_insn));
+      else
+        fprintf (out, "\n   ");
       print_inline_rtx (out, l->loc, 4);
     }
   while ((l = l->next));

Jakub


Re: [PATCH] Prevent cselib substitution of FP, SP, SFP

2012-02-13 Thread Jakub Jelinek
On Wed, Jan 04, 2012 at 05:21:38PM +, Marcus Shawcroft wrote:
 Alias analysis by DSE based on CSELIB expansion assumes that
 references to the stack frame from different base registers (ie FP, SP)
 never alias.
 
 The comment block in cselib explains that cselib does not allow
 substitution of FP, SP or SFP specifically in order not to break DSE.

Looks reasonable, apart from coding style (no spaces around -> and
no {} around return p->loc;), I just wonder if having a separate
loop in expand_loc just for this isn't too expensive.  On sane targets
IMHO hard frame pointer in the prologue should be initialized from sp, not
the other way around, thus hard frame pointer based VALUEs should have
hard frame pointer earlier in the locs list (when there is
hfp = sp (+ optionally some const)
insn, we first cselib_lookup_from_insn the rhs and add to locs
of the new VALUE (plus (VALUE of sp) (const_int)), then process the
lhs and add it to locs, moving the plus to locs->next).
So I think the following patch could be enough (bootstrapped/regtested
on x86_64-linux and i686-linux).
There is AVR though, which has really weirdo prologue - PR50063,
but I think it should just use UNSPEC for that or something similar,
setting sp from hfp seems unnecessary and especially for values with long
locs chains could make cselib more expensive.

Richard, what do you think about this?

2012-02-13  Jakub Jelinek  ja...@redhat.com

* cselib.c (expand_loc): Return sp, fp, hfp or cfa base reg right
away if seen.

--- gcc/cselib.c.jj 2012-02-13 11:07:15.0 +0100
+++ gcc/cselib.c2012-02-13 18:15:17.531776145 +0100
@@ -1372,8 +1372,18 @@ expand_loc (struct elt_loc_list *p, stru
   unsigned int regno = UINT_MAX;
   struct elt_loc_list *p_in = p;
 
-  for (; p; p = p -> next)
+  for (; p; p = p->next)
     {
+      /* Return these right away to avoid returning stack pointer based
+         expressions for frame pointer and vice versa, which is something
+         that would confuse DSE.  See the comment in cselib_expand_value_rtx_1
+         for more details.  */
+      if (REG_P (p->loc)
+          && (REGNO (p->loc) == STACK_POINTER_REGNUM
+              || REGNO (p->loc) == FRAME_POINTER_REGNUM
+              || REGNO (p->loc) == HARD_FRAME_POINTER_REGNUM
+              || REGNO (p->loc) == cfa_base_preserved_regno))
+        return p->loc;
       /* Avoid infinite recursion trying to expand a reg into a
          the same reg.  */
       if ((REG_P (p->loc))


Jakub


[PATCH, go]: Disable TestListenMulticastUDP on alpha linux

2012-02-13 Thread Uros Bizjak
Hello!

Alpha Linux does not have the expected /proc/net/igmp and
/proc/net/igmp6 files, so func interfaceMulticastAddrTable(ifindex
int) from interface_linux.go always returns (nil, nil), failing
net/test with:

--- FAIL: net.TestListenMulticastUDP (4.71 seconds)
???:1: IPv4 multicast interface: nil
???:1: IPv4 multicast TTL: 1
???:1: IPv4 multicast loopback: false
???:1: 224.0.0.254:12345 not found in RIB
FAIL

The attached patch skips this sub-test in the same way as is done for ARM.

Tested on alphaev6-pc-linux-gnu, where it fixes the failing net test.

Uros.
Index: go/net/multicast_test.go
===
--- go/net/multicast_test.go(revision 184156)
+++ go/net/multicast_test.go(working copy)
@@ -33,7 +33,7 @@
        case "netbsd", "openbsd", "plan9", "windows":
                return
        case "linux":
-               if runtime.GOARCH == "arm" {
+               if runtime.GOARCH == "arm" || runtime.GOARCH == "alpha" {
return
}
}


[committed] Fix invalid GOMP_loop_static_start call (PR middle-end/52230)

2012-02-13 Thread Jakub Jelinek
Hi!

If the omp for loop body doesn't fall through (which doesn't make much
sense), we would call GOMP_loop_static_start with the wrong number of
arguments when collapse is 1, with static scheduling without a chunk size
and no ordered clause.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk.

2012-02-13  Jakub Jelinek  ja...@redhat.com

PR middle-end/52230
* omp-low.c (expand_omp_for): If a static schedule without
chunk size has NULL region-cont, force fd.chunk_size to be
integer_zero_node.

--- gcc/omp-low.c.jj2012-01-13 21:47:35.0 +0100
+++ gcc/omp-low.c   2012-02-13 12:54:55.137590443 +0100
@@ -4664,6 +4664,9 @@ expand_omp_for (struct omp_region *regio
 {
   int fn_index, start_ix, next_ix;
 
+  if (fd.chunk_size == NULL
+      && fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC)
+    fd.chunk_size = integer_zero_node;
   gcc_assert (fd.sched_kind != OMP_CLAUSE_SCHEDULE_AUTO);
   fn_index = (fd.sched_kind == OMP_CLAUSE_SCHEDULE_RUNTIME)
  ? 3 : fd.sched_kind;

Jakub


Re: [PING] New port resubmission for TILEPro and TILE-Gx

2012-02-13 Thread Richard Henderson
On 02/13/2012 07:42 AM, Walter Lee wrote:
 1/6 toplevel: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01860.html
 2/6 contrib: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01855.html
 3/6 gcc: http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01494.html
 4/6 libcpp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01857.html
 5/6 libgcc: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01858.html
 6/6 libgomp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01859.html

Ok.


r~


Re: [PATCH] Fix cselib dump ICE

2012-02-13 Thread Richard Henderson
On 02/13/2012 11:43 AM, Jakub Jelinek wrote:
   * cselib.c (dump_cselib_val): Don't assume l->setting_insn is
   non-NULL.

Ok.

r~


Re: [PATCH] Prevent cselib substitution of FP, SP, SFP

2012-02-13 Thread Richard Henderson
On 02/13/2012 11:54 AM, Jakub Jelinek wrote:
   * cselib.c (expand_loc): Return sp, fp, hfp or cfa base reg right
   away if seen.

Looks good.


r~


Re: [PATCH] Fix __atomic_compare_exchange handling (PR c++/52215)

2012-02-13 Thread Richard Henderson
On 02/13/2012 11:42 AM, Jakub Jelinek wrote:
 2012-02-13  Jakub Jelinek  ja...@redhat.com
 
   PR c++/52215
   * c-common.c (sync_resolve_params): Don't decide whether to convert
   or not based on TYPE_SIZE comparison, convert whenever arg_type
   is unsigned INTEGER_TYPE.
 
   * g++.dg/ext/atomic-1.C: New test.

Ok.


r~


Re: [libitm] Add SPARC bits

2012-02-13 Thread Richard Henderson
On 02/12/2012 12:15 PM, Eric Botcazou wrote:
 2012-02-12 Eric Botcazou  ebotca...@adacore.com
 
   * configure.tgt (target_cpu): Handle sparc and sparc64 & sparcv9.
   * config/sparc/cacheline.h: New file.
   * config/sparc/target.h: Likewise.
   * config/sparc/sjlj.S: Likewise.
   * config/linux/sparc/futex_bits.h: Likewise.

Ok.

Thanks for this.


r~


[patch, testsuite] PR 52229, testsuite failure

2012-02-13 Thread Thomas Koenig

Hello world,

the attached patch xfails the offending test case on architectures
which do not allow unaligned access for vectorization.
Any other architectures which should be XFAILed?

Regression-tested on powerpc64-unknown-linux-gnu.  OK for trunk?

Thomas

2012-02-13  Thomas Koenig  tkoe...@gcc.gnu.org

PR testsuite/52229
PR fortran/32380
* gfortran.dg/vect/pr32380.f:  XFAIL on PowerPC and ia-64.
Index: pr32380.f
===
--- pr32380.f	(revision 184166)
+++ pr32380.f	(working copy)
@@ -259,5 +259,5 @@
   return
   end
 
-! { dg-final { scan-tree-dump-times "vectorized 7 loops" 1 "vect" } }
+! { dg-final { scan-tree-dump-times "vectorized 7 loops" 1 "vect" { xfail powerpc*-*-* ia64-*-*-* } } }
 ! { dg-final { cleanup-tree-dump "vect" } }


Re: [PR52001] too many cse reverse equiv exprs (take2)

2012-02-13 Thread Richard Sandiford
Alexandre Oliva aol...@redhat.com writes:
 Jakub asked to have a closer look at the problem, and I found we could
 do somewhat better.  The first thing I noticed was that the problem was
 that, in each block that computed a (base+const), we created a new VALUE
 for the expression (with the same const and global base), and a new
 reverse operation.

 This was wrong.  Clearly we should reuse the same expression.  I had to
 arrange for the expression to be retained across basic blocks, for it
 was function invariant.  I split out the code to detect invariants from
 the function that removes entries from the cselib hash table across
 blocks, and made it recursive so that a VALUE equivalent to (plus
 (value) (const_int)) will be retained, if the base value fits (maybe
 recursively) the definition of invariant.

 An earlier attempt to address this issue remained in cselib: using the
 canonical value to build the reverse expression.  I believe it has a
 potential of avoiding the creation of redundant reverse expressions, for
 expressions involving equivalent but different VALUEs will evaluate to
 different hashes.  I haven't observed effects WRT the given testcase,
 before or after the change that actually fixed the problem, because we
 now find the same base expression and thus reuse the reverse_op as well,
 but I figured I'd keep it in for it is very cheap and possibly useful.

Thanks for looking at this.  Just to be sure: does this avoid the kind
of memrefs_conflict_p cycle I was seeing in:

http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01051.html

(in theory, I mean).

Richard
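
For readers following the thread, the recursive invariant test that Alexandre describes can be pictured roughly as below. This is only a sketch under assumptions: `expr_sketch`, its codes and `invariant_p_sketch` are made-up names for illustration and are not the cselib data structures or the code in the patch.

```cpp
#include <cassert>

// Hypothetical expression node; this is not GCC's rtx or cselib_val.
struct expr_sketch
{
  enum code_t { INVARIANT_BASE, CONST_INT, PLUS, OTHER } code;
  const expr_sketch *op0;
  const expr_sketch *op1;
};

// An expression is treated as invariant if it is a function-invariant base,
// or (plus BASE const_int) where BASE is itself invariant, applied
// recursively.
bool invariant_p_sketch (const expr_sketch *e)
{
  assert (e != nullptr);
  switch (e->code)
    {
    case expr_sketch::INVARIANT_BASE:
      return true;
    case expr_sketch::PLUS:
      return e->op0 != nullptr
             && e->op1 != nullptr
             && e->op1->code == expr_sketch::CONST_INT
             && invariant_p_sketch (e->op0);
    default:
      return false;
    }
}
```

Under this reading, a value such as (plus (plus base 8) 8) would be kept across basic blocks, while a sum built on a non-invariant value would still be discarded.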


Re: [Patch, libfortran] RFC: Shared vtables, constification

2012-02-13 Thread Steven Bosscher
On Mon, Feb 13, 2012 at 7:20 PM, Janne Blomqvist
blomqvist.ja...@gmail.com wrote:
 Hi,

 the attached patch changes the low-level libgfortran IO dispatching
 mechanism to use shared vtables for each stream type, instead of all
 the function pointers being replicated for each unit. This is similar
 to e.g. how the C++ frontend implements vtables. The benefits are:

 - Slightly smaller heap memory overhead for each unit as only the
 vtable pointer needs to be stored, and slightly faster unit
 initialization as only the vtable pointer needs to be setup instead of
 all the function pointers in the stream struct.

 - Looking at unix.o with readelf, one sees

 Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset
 0x15550 contains 8 entries:

 and similarly for the other vtables; according to
 http://www.airs.com/blog/archives/189 this means that after relocation
 the page where this data resides may be marked read-only.

 The downside is that the sizes of the .text and .data sections are
 increased. Before:

    text    data     bss     dec     hex filename
 1116991    6664     592 1124247  112797 ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so

 After:

    text    data     bss     dec     hex filename
 1117487    6936     592 1125015  112a97 ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so


 The data section increase is due to the vtables, the text increase is,
 I guess, due to the extra pointer dereference when calling the IO
 functions.

 Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8?

Certainly not for trunk at this stage.

For 4.8: So the trade-off is between faster initialization and smaller
heap vs. fewer pointer dereferences? Does this patch fix an actual
problem? Does it bring a killer feature? Otherwise, I'd say if it
ain't broke, don't fix it!

Ciao!
Steven
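
To make the trade-off in this thread concrete, here is a minimal sketch of the shared-vtable layout being discussed, written as standalone C++. All names (`stream_sketch`, `stream_vtable_sketch`, `mem_vtable_sketch`, `sread_sketch`) are hypothetical and chosen for illustration; the real libgfortran stream struct has more entries and different signatures.

```cpp
#include <cassert>
#include <cstddef>

struct stream_sketch;

// One table per stream *type*.  It is const, so it can live in a section
// that is made read-only after relocation, and it is shared by all units.
struct stream_vtable_sketch
{
  std::size_t (*read) (stream_sketch *, void *, std::size_t);
  int (*close) (stream_sketch *);
};

// Each unit stores a single pointer instead of one pointer per operation.
struct stream_sketch
{
  const stream_vtable_sketch *vptr;
  std::size_t bytes_read;
};

static std::size_t mem_read_sketch (stream_sketch *s, void *, std::size_t n)
{
  s->bytes_read += n;  // pretend we copied N bytes
  return n;
}

static int mem_close_sketch (stream_sketch *)
{
  return 0;
}

static const stream_vtable_sketch mem_vtable_sketch
  = { mem_read_sketch, mem_close_sketch };

// Dispatch costs one extra dereference (s->vptr) compared with calling a
// function pointer stored directly in the stream.
inline std::size_t sread_sketch (stream_sketch *s, void *buf, std::size_t n)
{
  assert (s != nullptr && s->vptr != nullptr);
  return s->vptr->read (s, buf, n);
}

// Initialization only has to set the vtable pointer.
stream_sketch open_mem_sketch ()
{
  stream_sketch s = { &mem_vtable_sketch, 0 };
  return s;
}
```

Two streams opened this way share the same table, which is where the smaller per-unit footprint comes from; the extra load in `sread_sketch` is the cost Steven asks about.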


[pph] Re-factor streaming of binding levels (issue5663043)

2012-02-13 Thread Diego Novillo
This patch re-writes the streaming of binding levels to guarantee that
the whole tree of binding levels in each file is written and merged-in
before anything else.

With this re-factoring, we now write all the binding levels, the merge
keys for symbols/types and their other contents at the start of the
PPH image.  Additionally, we do not skip any namespaces when
traversing the binding level tree (we used to skip over builtin
namespaces, which causes problems when looking up things like
std::ptrdiff_t).

After all the binding levels have been merged-in, every other read of
a binding level is expected to be read as a reference (so that we
don't materialize a new binding level that has not been merged in).

With this change, I get significantly fewer name lookup failures in
our internal code base.  But this is still incomplete.  In chasing
down other failures, I found out that we should be also writing out
the table of canonical types (type_hash_table).  I'm getting a new
ICE, because two different types that compare the same fail the
TYPE_CANONICAL identity test.

Lawrence, to avoid too many merge conflicts with the patch you are
working with, I will be fixing this new failure in a subsequent patch.

Most of this patch is moving code around.  The old
pph_out/in_binding_level is now simply expecting a reference to a
binding level.

The merging into existing binding levels (e.g., the global binding
scope or binding levels for already existing namespaces) is done by
pph_in_binding_level_start.  This routine will take an existing
binding level as parameter and use it in two ways:

   1- If the record read from STREAM is a reference, the binding level
  in that reference must be identical to EXISTING_BL.

   2- If the record read from STREAM is a new instance, the binding
  level given in EXISTING_BL is registered in the cache at the
  slot location given by this record.  This way, subsequent
  internal references to EXISTING_BL will resolve to EXISTING_BL.
  This is used for binding levels that are already set in the
  compilation (e.g., scope_chain->bindings).
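
The two cases above can be sketched as follows. This is an illustration under assumptions, not the pph code: the record kinds, `cache_sketch` and `in_binding_level_start_sketch` are hypothetical names, and the EXISTED_P argument from the ChangeLog is left out because its exact meaning is not spelled out here.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct binding_level_sketch { int id; };

enum record_kind_sketch { RECORD_REF, RECORD_START };

struct record_sketch
{
  record_kind_sketch kind;
  std::size_t slot;  // cache slot named by the record
};

struct cache_sketch
{
  std::vector<binding_level_sketch *> slots;
  std::deque<binding_level_sketch> owned;  // storage for fresh instances
};

binding_level_sketch *
in_binding_level_start_sketch (cache_sketch &cache, const record_sketch &rec,
                               binding_level_sketch *existing_bl)
{
  if (rec.kind == RECORD_REF)
    {
      // Case 1: a reference must resolve to EXISTING_BL when one is given.
      binding_level_sketch *bl = cache.slots.at (rec.slot);
      assert (existing_bl == nullptr || bl == existing_bl);
      return bl;
    }

  // Case 2: a new instance.  Reuse EXISTING_BL instead of allocating, and
  // register it so later references to this slot resolve to it.
  binding_level_sketch *bl = existing_bl;
  if (bl == nullptr)
    {
      cache.owned.emplace_back ();
      bl = &cache.owned.back ();
    }
  if (cache.slots.size () <= rec.slot)
    cache.slots.resize (rec.slot + 1, nullptr);
  cache.slots[rec.slot] = bl;
  return bl;
}
```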

2012-02-13   Diego Novillo  dnovi...@google.com

cp/ChangeLog.pph
* pph-in.c (pph_in_binding_level_start): Move earlier into the
file.
Change return type to cp_binding_level *.
Add argument EXISTING_BL and EXISTED_P.
If EXISTING_BL is given, and a reference is read from STREAM,
the reference read should be the same as EXISTING_BL.
If EXISTING_BL is given and a new reference is started,
do not allocate a new instance.  Rather, register EXISTING_BL
in the cache.
(pph_in_binding_level_ref): Rename from pph_in_binding_level.
Assert that it always reads a reference record.
Update all users.
(pph_in_binding_level_1): Move body inside
pph_in_merge_body_binding_level.
(pph_in_merge_key_binding_level): Move earlier in the file.
(pph_in_merge_body_binding_level): Move earlier in the file.
(pph_in_merge_body_binding_level_1): Move body into
pph_in_merge_body_binding_level.
(pph_in_ld_ns): Call pph_in_binding_level_ref.
(pph_ensure_namespace_binding_level): Move body into
pph_in_merge_key_namespace_decl.  Update all users.
(pph_in_merge_key_namespace_decl): Fix comment.
(pph_in_global_binding): Call pph_in_binding_level_start.
* pph-out.c (pph_out_tree_vec_unchain): Remove.
(pph_out_chain_filtered): Remove.
(pph_out_binding_level_ref): Rename from
pph_out_binding_level.  Always expect to write a reference
record.  Update all users.
(pph_out_cxx_binding_1): Embed inside pph_out_merge_body_binding_level.
Update all users.
(pph_out_merge_key_binding_level): Do not filter
BL->NAMESPACES.
(pph_out_merge_body_binding_level): Likewise.

testsuite/ChangeLog.pph
* g++.dg/pph/x7dynarray5.cc: Add expected failure due to
type canonical mismatch.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/pph@184170 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/cp/ChangeLog.pph|   36 
 gcc/cp/pph-in.c |  323 ++-
 gcc/cp/pph-out.c|  183 +++---
 gcc/testsuite/ChangeLog.pph |5 +
 gcc/testsuite/g++.dg/pph/x7dynarray5.cc |2 +
 5 files changed, 260 insertions(+), 289 deletions(-)

diff --git a/gcc/cp/ChangeLog.pph b/gcc/cp/ChangeLog.pph
index b078607..2fa3153 100644
--- a/gcc/cp/ChangeLog.pph
+++ b/gcc/cp/ChangeLog.pph
@@ -1,3 +1,39 @@
+2012-02-13   Diego Novillo  dnovi...@google.com
+
+   * pph-in.c (pph_in_binding_level_start): Move earlier into the
+   file.
+   Change return type to cp_binding_level *.
+   Add argument EXISTING_BL and EXISTED_P.
+   If EXISTING_BL is given, and a reference is read from STREAM,
+   the reference read 

Re: [libitm] Link with -litm and -pthread

2012-02-13 Thread Richard Henderson
On 02/11/2012 06:14 AM, Eric Botcazou wrote:
 2012-02-11  Eric Botcazou  ebotca...@adacore.com
 
   * gcc.c (LINK_COMMAND_SPEC): Deal with -fgnu-tm.
   (GTM_SELF_SPECS): Define if not already defined.
   (driver_self_specs): Add GTM_SELF_SPECS.
   * config/darwin.h (GTM_SELF_SPECS): Define.
   * config/i386/cygwin.h (GTM_SELF_SPECS): Likewise.
   * config/i386/mingw32.h (GTM_SELF_SPECS): Likewise.
 
 
 2012-02-11  Eric Botcazou  ebotca...@adacore.com
 
   * configure.ac (link_itm): Fix comment.
   * configure: Regenerate.
   * testsuite/lib/libitm.exp: Do not pass -litm for the link.

Ok with the darwin followup-patch.


r~


Re: [PING] New port resubmission for TILEPro and TILE-Gx

2012-02-13 Thread Walter Lee

On 2/13/2012 3:02 PM, Richard Henderson wrote:

On 02/13/2012 07:42 AM, Walter Lee wrote:

1/6 toplevel: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01860.html
2/6 contrib: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01855.html
3/6 gcc: http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01494.html
4/6 libcpp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01857.html
5/6 libgcc: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01858.html
6/6 libgomp: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01859.html


Ok.

r~


Hi Richard,

Thanks for the review.  Do I have permission to commit, or is there
anything else I need to do?  I will update the copyright notices with
the year 2012.

The assignment of copyright paperwork was filed on May 26, 2011 by
Tilera Corporation.  The gcc steering committee has approved my
maintainership: http://gcc.gnu.org/ml/gcc/2012-02/msg00123.html.  I
have an account at sourceware.org.  Can I use your name to get commit
rights to gcc?

Thanks,

Walter


Re: [wwwdocs] deprecation of access declarations

2012-02-13 Thread Fabien Chêne
2012/2/12 Gerald Pfeifer ger...@pfeifer.com:
 On Fri, 27 Jan 2012, Fabien Chêne wrote:
 I get back to you for the snippet about deprecated access
 declarations. I would also find it sensible to advertise about the fix
 of c++/14258, a popular bug I have hit myself many times. OK to commit
 the below ?

 Yes, thank you.

 One suggestion: where it reads c++/14258, how about making this
 bug c++/14258, for those who are less familiar how we name things?

I have committed it with the change that you have suggested.

 Do we need an update for http://gcc.gnu.org/gcc-4.7/porting_to.html
 as well?

I don't know. The deprecation of access declarations only raises a
warning -- unless -Werror is used.
Is porting_to.html appropriate to describe the way to fix the warning ?

Concerning the other changes related to using declarations, I don't
expect them to (massively) break some existing code.
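
For anyone who hits the new warning, a small example of the difference may help. The class names are made up; the point is only that the access declaration is the deprecated spelling and the using-declaration is its replacement.

```cpp
#include <cassert>

struct Base
{
  int value () const { return 42; }
};

struct Derived : private Base
{
public:
  // Base::value;     // access declaration: the deprecated spelling
  using Base::value;  // using-declaration: the replacement
};

// Helper showing that the member is reachable through Derived again.
int read_value (const Derived &d)
{
  int v = d.value ();
  assert (v == 42);
  return v;
}
```

Both spellings re-expose `Base::value` in a privately derived class; only the commented-out one triggers the deprecation warning.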

-- 
Fabien


[patch] libitm: Add multi-lock, write-through TM method.

2012-02-13 Thread Torvald Riegel
This patch adds a new TM method, ml_wt, which uses an array of locks
with version numbers and runs a write-through algorithm with time-based
validations and snapshot time extensions.

patch1 adds xcalloc as a helper function for allocations (used in the
new TM method).

patch2 improves TM method reinitialization (helps ml_wt avoid
reallocation of the lock array) and adds a hook to TM methods so that
they can report back whether they can deal with the current runtime
situation (e.g., the current number of threads).

patch3 is the actual TM method.
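
As a rough picture of the patch1 helper, here is a standalone sketch of a zeroing allocator that terminates on failure. The name `xcalloc_sketch` and the use of `std::abort` in place of libitm's `GTM_fatal` are assumptions made so the example is self-contained; the cache-line separation argument is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the helper; not the libitm implementation.
// Returns SIZE zero-initialized bytes or terminates the process.
void *xcalloc_sketch (std::size_t size)
{
  // calloc (1, 0) may legally return a null pointer, which this sketch
  // would misreport as out-of-memory, so require a non-zero size.
  assert (size != 0);
  void *r = std::calloc (1, size);
  if (r == nullptr)
    {
      std::fprintf (stderr, "Out of memory allocating %lu bytes\n",
                    (unsigned long) size);
      std::abort ();
    }
  return r;
}
```

Callers get memory that is already zeroed, which is presumably what the lock array in the new method wants, and they never have to check for a null return.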

Tested on ppc64 with up to 64 threads with both microbenchmarks and
STAMP.  OK for trunk?
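
The two hooks from patch2 can be sketched like this. It is a simplification under assumptions: the sketch folds `reinit()` and `supports()` into one hypothetical class, whereas the patch puts them on `method_group` and `abi_dispatch` respectively, and the 64-thread limit is an invented number.

```cpp
#include <cassert>

// Hypothetical miniature of a method group; not the libitm class.
struct method_group_sketch
{
  int inits = 0;
  int finis = 0;
  virtual ~method_group_sketch () {}
  virtual void init () { ++inits; }
  virtual void fini () { ++finis; }
  // Default re-initialization: tear everything down and set it up again.
  virtual void reinit ()
  {
    fini ();
    init ();
  }
  // By default a method claims to handle any number of threads.
  virtual bool supports (unsigned /* number_of_threads */) { return true; }
};

// A method with an expensive lock array overrides reinit() to reset state
// in place, and can refuse configurations it cannot handle.
struct lock_array_group_sketch : method_group_sketch
{
  static constexpr unsigned max_threads = 64;  // made-up limit
  int resets = 0;
  void reinit () override { ++resets; }
  bool supports (unsigned n) override { return n <= max_threads; }
};

// Re-initialize GROUP if it can handle N threads; report whether it did.
bool reinit_if_supported (method_group_sketch &group, unsigned n)
{
  assert (n > 0);  // at least the calling thread exists
  if (!group.supports (n))
    return false;
  group.reinit ();
  return true;
}
```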
commit c0d1d1778b18f3dfc4a136e5a807c2fecbeb64e4
Author: Torvald Riegel trie...@redhat.com
Date:   Thu Feb 9 13:44:38 2012 +0100

libitm: Add xcalloc.

libitm/
* util.cc (GTM::xcalloc): New.
* common.h (GTM::xcalloc): Declare.

diff --git a/libitm/common.h b/libitm/common.h
index 14d0efb..b1ef2d4 100644
--- a/libitm/common.h
+++ b/libitm/common.h
@@ -54,6 +54,8 @@ namespace GTM HIDDEN {
 // cache lines that are not shared with any object used by another thread.
 extern void * xmalloc (size_t s, bool separate_cl = false)
   __attribute__((malloc, nothrow));
+extern void * xcalloc (size_t s, bool separate_cl = false)
+  __attribute__((malloc, nothrow));
 extern void * xrealloc (void *p, size_t s, bool separate_cl = false)
   __attribute__((malloc, nothrow));
 
diff --git a/libitm/util.cc b/libitm/util.cc
index afd37e4..48a1bf8 100644
--- a/libitm/util.cc
+++ b/libitm/util.cc
@@ -71,6 +71,18 @@ xmalloc (size_t size, bool separate_cl)
 }
 
 void *
+xcalloc (size_t size, bool separate_cl)
+{
+  // TODO Use posix_memalign if separate_cl is true, or some other allocation
+  // method that will avoid sharing cache lines with data used by other
+  // threads.
+  void *r = calloc (1, size);
+  if (r == 0)
+    GTM_fatal ("Out of memory allocating %lu bytes", (unsigned long) size);
+  return r;
+}
+
+void *
 xrealloc (void *old, size_t size, bool separate_cl)
 {
   // TODO Use posix_memalign if separate_cl is true, or some other allocation
commit 3b486db323b51ea87e1f64cd3abb9402f7c7307a
Author: Torvald Riegel trie...@redhat.com
Date:   Thu Feb 9 13:50:10 2012 +0100

libitm: Improve method reinit and choice.

libitm/
* dispatch.h (GTM::abi_dispatch::supports): New.
(GTM::method_group::reinit): New.
* retry.cc (GTM::gtm_thread::decide_retry_strategy): Use reinit().
(GTM::gtm_thread::number_of_threads_changed): Check that the method
supports the current situation.

diff --git a/libitm/dispatch.h b/libitm/dispatch.h
index dbf05e4..d059c49 100644
--- a/libitm/dispatch.h
+++ b/libitm/dispatch.h
@@ -245,6 +245,12 @@ struct method_group
   // Stop using any method from this group for now. This can be used to
   // destruct meta data as soon as this method group is not used anymore.
   virtual void fini() = 0;
+  // This can be overridden to implement more light-weight re-initialization.
+  virtual void reinit()
+  {
+fini();
+init();
+  }
 };
 
 
@@ -290,6 +296,10 @@ public:
   // method on begin of a nested transaction without committing or restarting
   // the parent method.
   virtual abi_dispatch* closed_nesting_alternative() { return 0; }
+  // Returns true iff this method group supports the current situation.
+  // NUMBER_OF_THREADS is the current number of threads that might execute
+  // transactions.
+  virtual bool supports(unsigned number_of_threads) { return true; }
 
   bool read_only () const { return m_read_only; }
   bool write_through() const { return m_write_through; }
diff --git a/libitm/retry.cc b/libitm/retry.cc
index decd773..6e05f5f 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -58,11 +58,8 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
  serial_lock.read_unlock(this);
  serial_lock.write_lock();
  if (disp->get_method_group() == default_dispatch->get_method_group())
-   {
- // Still the same method group.
- disp->get_method_group()->fini();
- disp->get_method_group()->init();
-   }
+   // Still the same method group.
+   disp->get_method_group()->reinit();
  serial_lock.write_unlock();
  serial_lock.read_lock(this);
  if (disp->get_method_group() != default_dispatch->get_method_group())
@@ -72,11 +69,8 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
}
}
   else
-   {
- // We are a serial transaction already, which makes things simple.
- disp->get_method_group()->fini();
- disp->get_method_group()->init();
-   }
+   // We are a serial transaction already, which makes things simple.
+   disp->get_method_group()->reinit();
 }
 
   bool retry_irr = (r == RESTART_SERIAL_IRR);
@@ -249,7 +243,7 @@ GTM::gtm_thread::number_of_threads_changed(unsigned previous, unsigned now)

[lra] fixing x86 gcc testsuite regressions

2012-02-13 Thread Vladimir Makarov
The following tiny patch fixes testsuite regressions on x86-64 that occurred
after the latest merge (this weekend).


The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 184173.

2012-02-13  Vladimir Makarov vmaka...@redhat.com

* lra.c (check_rtl): Ignore addr with UNSPEC.

Index: lra.c
===
--- lra.c	(revision 184156)
+++ lra.c	(working copy)
@@ -1940,6 +1940,7 @@ check_rtl (bool final_p)
 		   legitimate if they satisfies the constraints and
 		   will be checked by insn constraints which we
 		   ignore here.  */
+		  && GET_CODE (XEXP (op, 0)) != UNSPEC
 		  && GET_CODE (XEXP (op, 0)) != PRE_DEC
 		  && GET_CODE (XEXP (op, 0)) != PRE_INC
 		  && GET_CODE (XEXP (op, 0)) != POST_DEC


Re: [lra] fixing x86 gcc testsuite regressions

2012-02-13 Thread Steven Bosscher
On Mon, Feb 13, 2012 at 10:50 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 The following tiny patch fixes testsuite regressions on x86-64 occurred
 after latest merge (this weekend).

Hello Vladimir,

Could you please also update http://gcc.gnu.org/svn.html#devbranches ?
It still mentions ira as an active development branch, but lra isn't
mentioned yet.

Ciao!
Steven


[wwwdocs] Use dependent instead of dependant

2012-02-13 Thread Gerald Pfeifer
Per http://gcc.gnu.org/codingconventions.html we should use
dependent, not dependant.  This fixes this for the new GCC 4.7
porting notes as well as one old news entry.

Committed.

Gerald

Index: gcc-4.7/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/porting_to.html,v
retrieving revision 1.7
diff -u -3 -p -r1.7 porting_to.html
--- gcc-4.7/porting_to.html 27 Jan 2012 01:04:13 -  1.7
+++ gcc-4.7/porting_to.html 13 Feb 2012 22:17:27 -
@@ -106,7 +106,7 @@ Instead, use the POSIX macro <code>_REEN
 <p>
 The C++ compiler no longer performs some extra unqualified lookups it
 had performed in the past, namely
-<a href="http://gcc.gnu.org/PR24163">dependant base class scope lookups</a>
+<a href="http://gcc.gnu.org/PR24163">dependent base class scope lookups</a>
 and <a href="http://gcc.gnu.org/PR29131">unqualified template function</a>
 lookups.
 </p>
Index: news/ia32.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/news/ia32.html,v
retrieving revision 1.6
diff -u -3 -p -r1.6 ia32.html
--- news/ia32.html  21 Jan 2002 10:24:45 -  1.6
+++ news/ia32.html  13 Feb 2012 22:17:27 -
@@ -36,7 +36,7 @@ and focused on better optimization for t
 free registers and allocate them as scratches.  This is a generalization
 of the PGCC -friscify pass.</li>
 
-<li>Recognition of extension-dependant GIVs.  This shows up in a loop like
+<li>Recognition of extension-dependent GIVs.  This shows up in a loop like
 <pre>
 short s;
 for (s = 0; s &lt; 10; ++s)
@@ -48,7 +48,7 @@ and focused on better optimization for t
 
 <li>Recognition of certain forms of loop-carried post-decrement.  Primarily,
 <pre>
-while (a--) { /* nothing dependant on a */ }
+while (a--) { /* nothing dependent on a */ }
 </pre>
 becomes
 <pre>


Re: [lra] fixing x86 gcc testsuite regressions

2012-02-13 Thread Vladimir Makarov

On 02/13/2012 05:07 PM, Steven Bosscher wrote:

On Mon, Feb 13, 2012 at 10:50 PM, Vladimir Makarov vmaka...@redhat.com wrote:

The following tiny patch fixes testsuite regressions on x86-64 occurred
after latest merge (this weekend).

Hello Vladimir,

Could you please also update http://gcc.gnu.org/svn.html#devbranches ?
It still mentions ira as an active development branch, but lra isn't
mentioned yet.


Ok.  I've just done that.  The patch is in the attachment.

Index: svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.167
diff -u -r1.167 svn.html
--- svn.html	1 Feb 2012 19:55:33 -	1.167
+++ svn.html	13 Feb 2012 22:27:18 -
@@ -210,30 +210,6 @@
   is maintained by <a href="mailto:berg...@vnet.ibm.com">Peter
   Bergner</a>.</dd>
 
-  <dt>ira</dt>
-  <dd>This branch contains the Integrated Register Allocator (IRA).  It is
-  based on work done on yara-branch.  The latter is more of a research
-  branch because one of its goals (removing reload) is too remote.  The
-  ira branch is focused to prepare some code for GCC mainline, hopefully
-  in time for GCC 4.4.  IRA still uses reload; it is called integrated
-  because register coalescing and register live range splitting are done
-  on-the-fly during coloring.  The branch is maintained by Vladimir
-  Makarov &lt; <a
-  href="mailto:vmaka...@redhat.com">vmaka...@redhat.com</a>&gt; and
-  will be merged with mainline from time to time.  Patches will be
-  marked with the tag <code>[ira]</code> in the subject line.</dd>
-
-  <dt>ira-merge</dt>
-  <dd>This branch contains bug fixes for the Integrated Register Allocator
-  (IRA).  It is branched from trunk at revision 139590 when IRA was
-  merged into trunk.  It is used to track IRA related regressions.
-  Only IRA fixes from trunk will be applied to this branch. Its goal is
-  there should be no "make check" and performance regressions against
-  trunk at revision 139589.  The branch is maintained by H.J. Lu &lt;<a
-  href="mailto:hjl.to...@gmail.com">hjl.to...@gmail.com</a>&gt; and
-  Vladimir Makarov &lt;
-  <a href="mailto:vmaka...@redhat.com">vmaka...@redhat.com</a>&gt;.</dd>
-
   <dt>sel-sched-branch</dt>
   <dd>This branch contains the implementation of the selective scheduling
   approach.  The goal of the branch is to provide more aggressive scheduler
@@ -336,6 +312,15 @@
   maintained by Richard Guenther and H.J. Lu.  Patches should be marked
   with the tag
   <code>[vect256]</code> in the subject line.</dd>
+
+  <dt>lra</dt>
+  <dd>This branch contains the Local Register Allocator (LRA).  LRA is
+  focused to replace GCC reload pass.  The branch is maintained by
+  Vladimir Makarov
+  &lt; <a href="mailto:vmaka...@redhat.com">vmaka...@redhat.com</a>&gt;
+  and will be merged with mainline from time to time.  Patches will be
+  marked with the tag <code>[lra]</code> in the subject line.</dd>
+
 </dl>
 
 <h4>Architecture-specific</h4>


Re: [Patch, fortran] PR50981 absent polymorphic scalar actual arguments

2012-02-13 Thread Paul Richard Thomas
Mikael,

This is OK for trunk with one proviso; could you move
is_class_container_ref to gfc_is_class_container_ref in class.c?

Thanks for the patch

Paul

On Sun, Feb 12, 2012 at 10:11 PM, Mikael Morin mikael.mo...@sfr.fr wrote:
 Hello,

 this is the next PR50981 fix:
 when passing polymorphic scalar actual arguments to elemental procedures, we
 were not adding the _data component reference.
 The fix is straightforward; checking that the expression's type is BT_CLASS
 was introducing regressions, so this patch uses a helper function to check
 the type without impacting the testsuite.

 Regression tested on x86_64-unknown-freebsd9.0. OK for trunk?

 Mikael





-- 
The knack of flying is learning how to throw yourself at the ground and miss.
       --Hitchhikers Guide to the Galaxy


Re: [wwwdocs] deprecation of access declarations

2012-02-13 Thread Gerald Pfeifer
On Mon, 13 Feb 2012, Fabien Chêne wrote:
 Do we need an update for http://gcc.gnu.org/gcc-4.7/porting_to.html
 as well?
 I don't know. The deprecation of access declarations only raises a
 warning -- unless -Werror is used.
 Is porting_to.html appropriate to describe the way to fix the warning ?

Hmm, I guess if it's only a warning we do not need to document it
yet.  Let's see whether Jason or other C++ affine developers think
differently.

Gerald

Re: [patch] libitm: Add multi-lock, write-through TM method.

2012-02-13 Thread Richard Henderson
On 02/13/2012 01:47 PM, Torvald Riegel wrote:
 +  else {

Watch the formatting.

 +  // Location-to-orec mapping.  Stripes of 16B mapped to 2^19 orecs.
 +  static const gtm_word L2O_ORECS = 1 << 19;
 +  static const gtm_word L2O_SHIFT = 4;

Is it just easier to say 16B or did we really want CACHELINE_SIZE?

Otherwise ok.
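
For context on the stripe question, a location-to-orec mapping with these two constants might look like the sketch below. The mask-based index is an assumption for illustration only; the quoted lines show just the constants, and the patch may compute the index differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Constants as quoted above: 2^19 orecs, 16-byte stripes (shift by 4).
static const std::uintptr_t L2O_ORECS = std::uintptr_t (1) << 19;
static const unsigned L2O_SHIFT = 4;

// Assumed mapping: drop the low 4 address bits so that every 16-byte
// stripe shares one orec, then wrap around the orec array.
inline std::size_t orec_index_sketch (std::uintptr_t addr)
{
  std::size_t idx = (addr >> L2O_SHIFT) & (L2O_ORECS - 1);
  assert (idx < L2O_ORECS);
  return idx;
}
```

With a shift of 4, addresses 0x1000 through 0x100f map to one orec and 0x1010 starts the next. Using the cache-line size instead would only change the shift, at the price of more false conflicts per orec.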


r~


[PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Uros Bizjak
Hello!

We can jump indirect from memory address, sparing a couple of cycles.

2012-02-14  Uros Bizjak  ubiz...@gmail.com

* config/x86/sjlj.S (GTM_longjmp): Jump indirect from memory address.

Tested on x86_64-pc-linux-gnu {,-m32}.

OK for mainline?

Uros.
Index: config/x86/sjlj.S
===
--- config/x86/sjlj.S   (revision 184177)
+++ config/x86/sjlj.S   (working copy)
@@ -119,23 +119,19 @@ SYM(GTM_longjmp):
        movq    32(%rsi), %r13
        movq    40(%rsi), %r14
        movq    48(%rsi), %r15
-       movq    56(%rsi), %rdx
        movl    %edi, %eax
        cfi_def_cfa(%rcx, 0)
-       cfi_register(%rip, %rdx)
        movq    %rcx, %rsp
-       jmp     *%rdx
+       jmp     *56(%rsi)
 #else
        movl    (%edx), %ecx
        movl    4(%edx), %ebx
        movl    8(%edx), %esi
        movl    12(%edx), %edi
        movl    16(%edx), %ebp
-       movl    20(%edx), %edx
        cfi_def_cfa(%ecx, 0)
-       cfi_register(%eip, %edx)
        movl    %ecx, %esp
-       jmp     *%edx
+       jmp     *20(%edx)
 #endif
        cfi_endproc
 


Re: [PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Richard Henderson
On 02/13/2012 02:54 PM, Uros Bizjak wrote:
 - movq    56(%rsi), %rdx
   movl    %edi, %eax
   cfi_def_cfa(%rcx, 0)
 - cfi_register(%rip, %rdx)
   movq    %rcx, %rsp
 - jmp     *%rdx
 + jmp     *56(%rsi)

If you're going to do that, the correct fix for the unwind info is 

- cfi_register(%rip, %rdx)
+ cfi_offset(%rip, 56)

Otherwise ok.


r~


Re: [PATCH 4.8 v2, i386]: Make CCZ mode compatible with CCGOC and CCGO modes

2012-02-13 Thread Richard Henderson
On 02/11/2012 12:56 AM, Uros Bizjak wrote:
 FWIW, the mode of flags in users doesn't matter at all on x86, but
 which way is correct?

As far as I know, it doesn't matter anywhere.  We don't even bother to have 
perfect harmony between integer modes in hard registers -- think about what 
happens when we drop all the subregs on the floor post-reload.  Yes, it's 
probably an error if we don't have compatible modes between def and use, but 
nothing is going to check for that.


r~


[patch] libitm: Fix race condition in dispatch choice at transaction begin.

2012-02-13 Thread Torvald Riegel
This patch fixes a race condition in how transactions previously chose
the dispatch at transaction begin:  default_dispatch in retry.cc was
read by transactions before they became either serial or nonserial
transactions (with the serial_lock).  A concurrent change of
default_dispatch could lead to a transaction starting with
an out-of-date dispatch that was potentially incompatible with the
actual current dispatch, leading in turn to all sorts of synchronization
failures.
This patch fixes that by moving the serial_lock acquisition into
the dispatch choice code.

Tested on ppc64 with microbenchmarks.  OK for trunk?
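
The idea of choosing the dispatch and taking the serial lock in one step can be sketched as follows. This is an illustration under assumptions rather than the patch itself: `std::shared_mutex` stands in for libitm's serial lock, the retry loop is one possible way to handle a concurrent change, and every name ending in `_sketch` is made up.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for a TM method; not a libitm type.
struct dispatch_sketch
{
  bool serial;  // true if the method needs exclusive (serial) execution
};

static dispatch_sketch serial_method_sketch = { true };
static dispatch_sketch parallel_method_sketch = { false };

static std::atomic<dispatch_sketch *>
  default_dispatch_sketch (&parallel_method_sketch);
static std::shared_mutex serial_lock_sketch;

static void lock_for_sketch (dispatch_sketch *d)
{
  if (d->serial)
    serial_lock_sketch.lock ();
  else
    serial_lock_sketch.lock_shared ();
}

static void unlock_for_sketch (dispatch_sketch *d)
{
  if (d->serial)
    serial_lock_sketch.unlock ();
  else
    serial_lock_sketch.unlock_shared ();
}

// Pick the default method and acquire the lock in the mode it needs.
// The default only changes under the write lock, so once the second load
// agrees with the first, the choice stays valid while the lock is held.
dispatch_sketch *begin_sketch ()
{
  for (;;)
    {
      dispatch_sketch *dd
        = default_dispatch_sketch.load (std::memory_order_relaxed);
      lock_for_sketch (dd);
      if (dd == default_dispatch_sketch.load (std::memory_order_relaxed))
        return dd;
      // The default changed between the first load and the lock: retry.
      unlock_for_sketch (dd);
    }
}

// Release the lock taken by begin_sketch for method D.
void end_sketch (dispatch_sketch *d)
{
  assert (d != nullptr);
  unlock_for_sketch (d);
}

// Changing the default requires exclusive access.
void set_default_sketch (dispatch_sketch *d)
{
  assert (d != nullptr);
  std::unique_lock<std::shared_mutex> guard (serial_lock_sketch);
  default_dispatch_sketch.store (d, std::memory_order_relaxed);
}
```

The relaxed loads and stores are enough in this sketch because the lock's acquire and release operations order the store in `set_default_sketch` before the re-check in `begin_sketch`.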
commit ce52924dedca632b24ea931329e060959782f89a
Author: Torvald Riegel trie...@redhat.com
Date:   Mon Feb 13 23:49:55 2012 +0100

libitm: Fix race condition in dispatch choice at transaction begin.

libitm/
* beginend.cc (GTM::gtm_thread::begin_transaction): Move serial lock
acquisition to ...
* retry.cc (GTM::gtm_thread::decide_begin_dispatch): ... here.
(default_dispatch): Make atomic.
(GTM::gtm_thread::decide_retry_strategy,
GTM::gtm_thread::set_default_dispatch): Access atomically.
(GTM::gtm_thread::number_of_threads_changed): Initialize
default_dispatch here.

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index 08c2174..e6a84de 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -233,16 +233,6 @@ GTM::gtm_thread::begin_transaction (uint32_t prop, const gtm_jmpbuf *jb)
 {
   // Outermost transaction
   disp = tx->decide_begin_dispatch (prop);
-  if (disp == dispatch_serialirr() || disp == dispatch_serial())
-   {
- tx->state = STATE_SERIAL;
- if (disp == dispatch_serialirr())
-   tx->state |= STATE_IRREVOCABLE;
- serial_lock.write_lock ();
-   }
-  else
-   serial_lock.read_lock (tx);
-
   set_abi_disp (disp);
 }
 
diff --git a/libitm/retry.cc b/libitm/retry.cc
index d57bba0..08c5d80 100644
--- a/libitm/retry.cc
+++ b/libitm/retry.cc
@@ -27,8 +27,9 @@
 #include <ctype.h>
 #include "libitm_i.h"
 
-// The default TM method used when starting a new transaction.
-static GTM::abi_dispatch* default_dispatch = 0;
+// The default TM method used when starting a new transaction.  Initialized
+// in number_of_threads_changed() below.
+static std::atomic<GTM::abi_dispatch*> default_dispatch;
 // The default TM method as requested by the user, if any.
 static GTM::abi_dispatch* default_dispatch_user = 0;
 
@@ -57,14 +58,18 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
  // given that re-inits should be very infrequent.
  serial_lock.read_unlock(this);
  serial_lock.write_lock();
- if (disp->get_method_group() == default_dispatch->get_method_group())
+ if (disp->get_method_group()
+ == default_dispatch.load(memory_order_relaxed)
+ ->get_method_group())
// Still the same method group.
disp->get_method_group()->reinit();
  serial_lock.write_unlock();
  serial_lock.read_lock(this);
- if (disp->get_method_group() != default_dispatch->get_method_group())
+ if (disp->get_method_group()
+ != default_dispatch.load(memory_order_relaxed)
+ ->get_method_group())
{
- disp = default_dispatch;
+ disp = default_dispatch.load(memory_order_relaxed);
  set_abi_disp(disp);
}
}
@@ -124,48 +129,81 @@ GTM::gtm_thread::decide_retry_strategy (gtm_restart_reason r)
 
 
 // Decides which TM method should be used on the first attempt to run this
-// transaction.
+// transaction.  Acquires the serial lock and sets transaction state
+// according to the chosen TM method.
 GTM::abi_dispatch*
 GTM::gtm_thread::decide_begin_dispatch (uint32_t prop)
 {
+  abi_dispatch* dd;
   // TODO Pay more attention to prop flags (eg, *omitted) when selecting
   // dispatch.
+  // ??? We go irrevocable eagerly here, which is not always good for
+  // performance.  Don't do this?
   if ((prop & pr_doesGoIrrevocable) || !(prop & pr_instrumentedCode))
-return dispatch_serialirr();
+dd = dispatch_serialirr();
 
-  // If we might need closed nesting and the default dispatch has an
-  // alternative that supports closed nesting, use it.
-  // ??? We could choose another TM method that we know supports closed
-  // nesting but isn't the default (e.g., dispatch_serial()). However, we
-  // assume that aborts that need closed nesting are infrequent, so don't
-  // choose a non-default method until we have to actually restart the
-  // transaction.
-  if (!(prop & pr_hasNoAbort) && !default_dispatch->closed_nesting()
-      && default_dispatch->closed_nesting_alternative())
-return default_dispatch->closed_nesting_alternative();
+  else
+{
+  // Load the default dispatch.  We're not an active transaction and so it
+  // can change concurrently but will 

Re: Ping: Fix MIPS va_arg regression

2012-02-13 Thread Richard Henderson
On 02/02/2012 11:01 AM, Richard Sandiford wrote:
 Ping for:
 
 http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01564.html
 
 which fixes a MIPS va_arg regression (admittedly a long-standing one)
 on zero-sized types.  There are no functional changes to other targets
 and I'm as confident as I can be that it's safe for MIPS.

Ok.


r~


Re: [PATCH 4.8 v2, i386]: Make CCZ mode compatible with CCGOC and CCGO modes

2012-02-13 Thread Uros Bizjak
On Tue, Feb 14, 2012 at 12:00 AM, Richard Henderson r...@redhat.com wrote:
 On 02/11/2012 12:56 AM, Uros Bizjak wrote:
 FWIW, the mode of flags in users doesn't matter at all on x86, but
 which way is correct?

 As far as I know, it doesn't matter anywhere.  We don't even bother to have 
 perfect harmony between  integer modes in hard registers -- think about what 
 happens when we drop all the subregs on the floor post-reload.  Yes, it's 
 probably an error if we don't have compatible modes between def and use, but 
 nothing is going to check for that.

cse.c says some relaxing words related to this issue:

  /* If the following assertion was triggered, there is most probably
 something wrong with the cc_modes_compatible back end function.
 CC modes only can be considered compatible if the insn - with the mode
 replaced by any of the compatible modes - can still be recognized.  */

It looks to me that correct definition of cc_modes_compatible
guarantees that insn is still valid, no matter if the mode of flags
remains in the wrong mode.

In any case, I will add the comment to avoid confusion.

Thanks,
Uros.


Re: [patch] libitm: Fix race condition in dispatch choice at transaction begin.

2012-02-13 Thread Richard Henderson
On 02/13/2012 03:03 PM, Torvald Riegel wrote:
 -// The default TM method used when starting a new transaction.
 -static GTM::abi_dispatch* default_dispatch = 0;
 +// The default TM method used when starting a new transaction.  Initialized
 +// in number_of_threads_changed() below.
 +static std::atomic<GTM::abi_dispatch*> default_dispatch;

I see nothing but memory_order_relaxed uses of default_dispatch?


r~
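
[Editorial sketch, not part of the thread.]  For readers following the
question above, here is a minimal sketch of what a relaxed access to an
atomic pointer provides.  The names Dispatch, default_dispatch_sketch,
load_default and store_default are hypothetical stand-ins, not libitm
code.  A relaxed load is atomic, so the pointer value is never torn,
but it imposes no ordering on other memory accesses; any ordering has
to come from something else, such as a lock held around the access.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for GTM::abi_dispatch; not the libitm type.
struct Dispatch { int method_group; };

static Dispatch serial_sketch{1};
static Dispatch default_sketch{2};

// Starts out null; some initialization routine is assumed to set it later.
static std::atomic<Dispatch*> default_dispatch_sketch{nullptr};

// Relaxed load: atomic with respect to concurrent stores, but it does not
// order any other reads or writes around it.
Dispatch* load_default()
{
  return default_dispatch_sketch.load(std::memory_order_relaxed);
}

// Relaxed store: publishes the pointer value only, with no ordering
// guarantee for data written before the store.
void store_default(Dispatch* d)
{
  default_dispatch_sketch.store(d, std::memory_order_relaxed);
}
```

Whether relaxed ordering is sufficient depends on what else synchronizes
the readers and writers, which is exactly what the question is probing.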


[PATCH] Fix cselib -fcompare-debug problem (PR bootstrap/52172)

2012-02-13 Thread Jakub Jelinek
Hi!

To avoid -fcompare-debug failures, we promote_debug_loc VALUEs looked
up from DEBUG_INSNs when they are looked up from some other insns.
Unfortunately, the scheduler after cselib_lookup_from_insn from
DEBUG_INSN calls cselib_subst_to_values, which may e.g. call cselib_lookup_mem
(with create=0).  As that is with cselib_current_insn == NULL,
promote_debug_loc considers it a non-DEBUG_INSN lookup and promotes
it to non-debug, which on the testcase in the PR (too large and too hard
to further reduce) results in different n_useless_values and
remove_useless_values being triggered at different times between
-g and -g0 compilations.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
tested on the testcase with ia64-linux cross.  Ok for trunk?

2012-02-13  Jakub Jelinek  ja...@redhat.com

PR bootstrap/52172
* cselib.h (cselib_subst_to_values_from_insn): New prototype.
* cselib.c (cselib_subst_to_values_from_insn): New function.
* sched-deps.c (add_insn_mem_dependence,
sched_analyze_1, sched_analyze_2): Use it.

--- gcc/cselib.h.jj 2012-01-01 19:54:46.0 +0100
+++ gcc/cselib.h2012-02-13 21:29:21.792483236 +0100
@@ -88,6 +88,7 @@ extern rtx cselib_expand_value_rtx_cb (r
 extern bool cselib_dummy_expand_value_rtx_cb (rtx, bitmap, int,
  cselib_expand_callback, void *);
 extern rtx cselib_subst_to_values (rtx, enum machine_mode);
+extern rtx cselib_subst_to_values_from_insn (rtx, enum machine_mode, rtx);
 extern void cselib_invalidate_rtx (rtx);
 
 extern void cselib_reset_table (unsigned int);
--- gcc/cselib.c.jj 2012-02-13 18:15:17.0 +0100
+++ gcc/cselib.c2012-02-13 21:33:37.019088486 +0100
@@ -1905,6 +1905,19 @@ cselib_subst_to_values (rtx x, enum mach
   return copy;
 }
 
+/* Wrapper for cselib_subst_to_values, that indicates X is in INSN.  */
+
+rtx
+cselib_subst_to_values_from_insn (rtx x, enum machine_mode memmode, rtx insn)
+{
+  rtx ret;
+  gcc_assert (!cselib_current_insn);
+  cselib_current_insn = insn;
+  ret = cselib_subst_to_values (x, memmode);
+  cselib_current_insn = NULL;
+  return ret;
+}
+
 /* Look up the rtl expression X in our tables and return the value it
has.  If CREATE is zero, we return NULL if we don't know the value.
Otherwise, we create a new one if possible, using mode MODE if X
--- gcc/sched-deps.c.jj 2012-01-26 09:22:21.0 +0100
+++ gcc/sched-deps.c2012-02-13 21:30:40.235054596 +0100
@@ -1728,7 +1728,8 @@ add_insn_mem_dependence (struct deps_des
   if (sched_deps_info->use_cselib)
 {
   mem = shallow_copy_rtx (mem);
-  XEXP (mem, 0) = cselib_subst_to_values (XEXP (mem, 0), GET_MODE (mem));
+  XEXP (mem, 0) = cselib_subst_to_values_from_insn (XEXP (mem, 0),
+   GET_MODE (mem), insn);
 }
   link = alloc_EXPR_LIST (VOIDmode, canon_rtx (mem), *mem_list);
   *mem_list = link;
@@ -2449,7 +2450,9 @@ sched_analyze_1 (struct deps_desc *deps,
  t = shallow_copy_rtx (dest);
  cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1,
   GET_MODE (t), insn);
- XEXP (t, 0) = cselib_subst_to_values (XEXP (t, 0), GET_MODE (t));
+ XEXP (t, 0)
+   = cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t),
+   insn);
}
   t = canon_rtx (t);
 
@@ -2609,7 +2612,9 @@ sched_analyze_2 (struct deps_desc *deps,
t = shallow_copy_rtx (t);
cselib_lookup_from_insn (XEXP (t, 0), address_mode, 1,
 GET_MODE (t), insn);
-   XEXP (t, 0) = cselib_subst_to_values (XEXP (t, 0), GET_MODE (t));
+   XEXP (t, 0)
+ = cselib_subst_to_values_from_insn (XEXP (t, 0), GET_MODE (t),
+ insn);
  }
 
if (!DEBUG_INSN_P (insn))

Jakub
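
[Editorial sketch, not part of the thread.]  The wrapper in the patch
above follows a common pattern: set a global "current context" for the
duration of one call, then reset it.  The following is a simplified
model under stated assumptions.  Insn, current_insn_sketch,
promotions_sketch, subst_sketch and subst_from_insn_sketch are all
hypothetical names; the real cselib data structures are far richer.

```cpp
#include <cassert>

// Hypothetical stand-in for an insn; only records whether it is a debug insn.
struct Insn { bool is_debug; };

// Global context consulted by the inner routine, modelled on
// cselib_current_insn.
static const Insn* current_insn_sketch = nullptr;

// Counts how often a lookup was treated as a non-debug lookup.
static int promotions_sketch = 0;

// Inner routine: with no current insn, or with a non-debug current insn,
// the lookup is treated as non-debug (the behavior described in the mail).
int subst_sketch(int x)
{
  if (current_insn_sketch == nullptr || !current_insn_sketch->is_debug)
    ++promotions_sketch;
  return x;
}

// Wrapper: tells the inner routine which insn the value belongs to, then
// restores the "no current insn" state.
int subst_from_insn_sketch(int x, const Insn* insn)
{
  assert(current_insn_sketch == nullptr);
  current_insn_sketch = insn;
  int ret = subst_sketch(x);
  current_insn_sketch = nullptr;
  return ret;
}
```

In this model a call made on behalf of a debug insn through the wrapper
does not bump the counter, while the same call made without context
does, which mirrors the -g versus -g0 divergence described above.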


Re: [PATCH] Fix cselib -fcompare-debug problem (PR bootstrap/52172)

2012-02-13 Thread Richard Henderson
On 02/13/2012 03:17 PM, Jakub Jelinek wrote:
 2012-02-13  Jakub Jelinek  ja...@redhat.com
 
   PR bootstrap/52172
   * cselib.h (cselib_subst_to_values_from_insn): New prototype.
   * cselib.c (cselib_subst_to_values_from_insn): New function.
   * sched-deps.c (add_insn_mem_dependence,
   sched_analyze_1, sched_analyze_2): Use it.

Ok.


r~


Re: [PING] New port resubmission for TILEPro and TILE-Gx

2012-02-13 Thread Mike Stump
On Feb 13, 2012, at 1:43 PM, Walter Lee wrote:
 Thanks for the review.  Do I have permission to commit,

Yes, you do.  Richard can approve this, and when he says "Ok.", you're good to
go.

 or is there anything else I need to do?

Nope.  (Assuming you have write after approval to the tree.)


PR middle-end/52214

2012-02-13 Thread Jan Hubicka
Hi,
this patch fixes a typo I introduced in my patch fixing infinite recursion of
predict_paths_for_bb.
While converting the check from aux pointers to bitmaps, I got bitmap_set_bit
wrong.

Bootstrapped/regtested x86_64-linux, committed.

PR middle-end/52214
* predict.c (predict_paths_for_bb): Fix thinko in previous patch.

Index: predict.c
===
*** predict.c   (revision 184179)
--- predict.c   (working copy)
*** predict_paths_for_bb (basic_block cur, b
*** 1869,1875 
 prevent visiting given BB twice.  */
if (found)
  predict_edge_def (e, pred, taken);
!   else if (!bitmap_set_bit (visited, e->src->index))
      predict_paths_for_bb (e->src, e->src, pred, taken, visited);
  }
for (son = first_dom_son (CDI_POST_DOMINATORS, cur);
--- 1869,1875 
 prevent visiting given BB twice.  */
if (found)
  predict_edge_def (e, pred, taken);
!   else if (bitmap_set_bit (visited, e->src->index))
      predict_paths_for_bb (e->src, e->src, pred, taken, visited);
  }
for (son = first_dom_son (CDI_POST_DOMINATORS, cur);
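
[Editorial sketch, not part of the thread.]  The fix above hinges on
bitmap_set_bit reporting whether the bit actually changed, so recursion
should happen only when it returns true.  A minimal sketch of the same
idea with a std::set follows; set_bit_sketch and walk_preds_sketch are
hypothetical names and the graph representation is an assumption made
for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns true only if INDEX was not yet in VISITED, analogous to a
// set-bit operation that reports whether the bitmap changed.
bool set_bit_sketch(std::set<int>& visited, int index)
{
  return visited.insert(index).second;
}

// Walks predecessors of NODE, entering each block at most once.
// Returns the number of calls made, including this one.
int walk_preds_sketch(const std::vector<std::vector<int>>& preds,
                      int node, std::set<int>& visited)
{
  int calls = 1;
  for (int p : preds[node])
    if (set_bit_sketch(visited, p))   // recurse only on the first visit
      calls += walk_preds_sketch(preds, p, visited);
  return calls;
}
```

With the test inverted, as in the mistaken version, the walk would skip
blocks it has not seen yet and would recurse only into blocks that are
already marked.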


Re: [RFC, 4.8] Magic matching for flags clobbering and setting

2012-02-13 Thread Steven Bosscher
On Sat, Feb 11, 2012 at 1:12 AM, Richard Henderson r...@redhat.com wrote:
 Seeing as how Uros is starting to go down the path of cleaning up the
 flags handling for x86, I thought I'd go ahead and knock up the idea
 that I've been tossing around to help automate the process of building
 patterns that match both clobbering the flags and setting the flags to
 a comparison.

 This is far from complete, but it at least shows the direction.

 What I know is missing off the top of my head are:

  (0) Documentation in some .texi file; atm there's only what's in rtl.def.

  (1) Generate (clobber (reg flags)) from genemit, should this construct
     be used in a named insn pattern.

  (2) Can't be usefully used with define_insn_and_split, and no way to tell.
     This problem should simply be documented in the .texi file as user error.

  (3) Can't be used for x86 add patterns, as the clobber version wants the
     freedom to use lea and the set flags version cannot.  And there are
     different sets of constraints if lea may be used or not.

     What would be nice, however, is exposing the targetm.cc_modes_compatible
     thing in such a way that the x86 add patterns could use that, for the
     separate insn that does do the set flags.

     Exposing the targetm.cc_modes_compatible thing separately might also
     clean up some of the evil magic in genrecog.c too.

 Comments?

To see if I understand what a cc0 port conversion would look like with
match_flags, I tried to apply this magic to convert one of the pet
ports, the mighty pdp11.

The pdp11 port doesn't have any define_insn_and_splits, so I didn't
run into the problem you mentioned in (2). What is the problem here? I
suppose it has to do with finding out what the flags setter is after
the split? If so, then couldn't that be resolved with some rules about
how the post-split patterns should be constructed?

Other than that: To convert a port, there is still a lot of work to be
done to define and handle the various CC modes properly (well, not for
the pdp11, because it writes out 1 insn for most define_insns), but
it is great not having to define all the pairs of clobber-flags and
set-flags insns.  At least, I didn't end up rewriting the complete .md
file. It was relatively easy. Less book-keeping involved, etc.

Hope this goes in for GCC 4.8.

Ciao!
Steven


Re: [PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Uros Bizjak
On Mon, Feb 13, 2012 at 11:57 PM, Richard Henderson r...@redhat.com wrote:
 On 02/13/2012 02:54 PM, Uros Bizjak wrote:
 -     movq    56(%rsi), %rdx
       movl    %edi, %eax
       cfi_def_cfa(%rcx, 0)
 -     cfi_register(%rip, %rdx)
       movq    %rcx, %rsp
 -     jmp     *%rdx
 +     jmp     *56(%rsi)

 If you're going to do that, the correct fix for the unwind info is

 - cfi_register(%rip, %rdx)
 + cfi_offset(%rip, 56)

Hm, we just defined new CFA as rcx+0, so we should define location of
rip relative to new CFA. Since CFA points to stack slot just before
return address was pushed, new rip lies at CFA-8 for 64bit resp. CFA-4
for x86_32. Did I get these .cfi directives correctly?

SYM(GTM_longjmp):
        cfi_startproc
#ifdef __x86_64__
        movq    (%rsi), %rcx
        movq    8(%rsi), %rbx
        movq    16(%rsi), %rbp
        movq    24(%rsi), %r12
        movq    32(%rsi), %r13
        movq    40(%rsi), %r14
        movq    48(%rsi), %r15
        movl    %edi, %eax
        cfi_def_cfa(%rcx, 0)
        cfi_offset(%rip, -8)
        movq    %rcx, %rsp
        jmp     *56(%rsi)
#else
        movl    (%edx), %ecx
        movl    4(%edx), %ebx
        movl    8(%edx), %esi
        movl    12(%edx), %edi
        movl    16(%edx), %ebp
        cfi_def_cfa(%ecx, 0)
        cfi_offset(%eip, -4)
        movl    %ecx, %esp
        jmp     *20(%edx)
#endif
        cfi_endproc

Uros.


Re: [PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Richard Henderson
On 02/13/2012 04:09 PM, Uros Bizjak wrote:
 On Mon, Feb 13, 2012 at 11:57 PM, Richard Henderson r...@redhat.com wrote:
 On 02/13/2012 02:54 PM, Uros Bizjak wrote:
 - movq    56(%rsi), %rdx
   movl    %edi, %eax
   cfi_def_cfa(%rcx, 0)
 - cfi_register(%rip, %rdx)
   movq    %rcx, %rsp
 - jmp     *%rdx
 + jmp     *56(%rsi)

 If you're going to do that, the correct fix for the unwind info is

 - cfi_register(%rip, %rdx)
 + cfi_offset(%rip, 56)
 
 Hm, we just defined new CFA as rcx+0, so we should define location of
 rip relative to new CFA. Since CFA points to stack slot just before
 return address was pushed, new rip lies at CFA-8 for 64bit resp. CFA-4
 for x86_32. Did I get these .cfi directives correctly?

No.  The value at %rcx-8 is total garbage.  There is no guarantee that
the call stack leading to this abort has anything in common with the
call stack that created the jmpbuf, except *above* %rcx, the new CFA.

The new rip is at rsi+56.  You can see that in that you jump to it.


r~


Re: [PATCH, go]: Disable TestListenMulticastUDP on alpha linux

2012-02-13 Thread Ian Lance Taylor
Uros Bizjak ubiz...@gmail.com writes:

 alpha linux does not have expected /proc/net/igmp and
 /proc/net/igmp6 files, so func interfaceMulticastAddrTable(ifindex
 int) from interface_linux.go always returns (nil, nil), failing
 net/test with:

 --- FAIL: net.TestListenMulticastUDP (4.71 seconds)
 ???:1: IPv4 multicast interface: nil
 ???:1: IPv4 multicast TTL: 1
 ???:1: IPv4 multicast loopback: false
 ???:1: 224.0.0.254:12345 not found in RIB
 FAIL

 Attached patch skips this sub-test in the same way as for ARM arch.

 Tested on alphaev6-pc-linux-gnu, where it fixes failing net test.

Thanks.

Committed.

Ian


libgo patch committed: Reload m and g if necessary

2012-02-13 Thread Ian Lance Taylor
PR 50654 points out that many Go tests fail on systems that use emutls.
This turns out to be a subtle issue involving the use of setcontext and
getcontext.  When a particular invocation is moved to run on a different
thread via getcontext and setcontext, it must reload the thread-local
variables m and g.  This happens naturally, because the function call
makes gcc think that they might have changed (as indeed they might
have).  However, gcc knows that the address of the thread-local
variables can not change.  Or at least it thinks it does; if setcontext
causes the function to start running on a different thread, then the
address actually does change.  This means that gcc may cache the address
on the stack in some cases where it must not.

The same issue arises for ordinary TLS, of course, and I have already
fixed most cases.  However, I missed one case.  That case was working
for ordinary TLS because the function refers to both m and g, and gcc
compiles the code such that it holds a pointer to the thread-specific
area and references m and g off that pointer.  This happens to work even
if the function starts running on a different thread.  However, it does
not work when using emutls, for which gcc uses a different compilation
strategy.

This patch fixes the problem.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu, with both regular TLS and emutls.  Committed
to mainline.

Ian

diff -r 5b77b481d6f9 libgo/runtime/proc.c
--- a/libgo/runtime/proc.c	Mon Feb 13 16:29:13 2012 -0800
+++ b/libgo/runtime/proc.c	Mon Feb 13 16:30:31 2012 -0800
@@ -309,6 +309,8 @@
 static void
 runtime_mcall(void (*pfn)(G*))
 {
+	M *mp;
+	G *gp;
 #ifndef USING_SPLIT_STACK
 	int i;
 #endif
@@ -317,28 +319,45 @@
 	// collector.
 	__builtin_unwind_init();
 
-	if(g == m->g0)
+	mp = m;
+	gp = g;
+	if(gp == mp->g0)
 		runtime_throw("runtime: mcall called on m->g0 stack");
 
-	if(g != nil) {
+	if(gp != nil) {
 
 #ifdef USING_SPLIT_STACK
 		__splitstack_getcontext(&g->stack_context[0]);
 #else
-		g->gcnext_sp = &i;
+		gp->gcnext_sp = &i;
 #endif
-		g->fromgogo = false;
-		getcontext(&g->context);
+		gp->fromgogo = false;
+		getcontext(&gp->context);
+
+		// When we return from getcontext, we may be running
+		// in a new thread.  That means that m and g may have
+		// changed.  They are global variables so we will
+		// reload them, but the addresses of m and g may be
+		// cached in our local stack frame, and those
+		// addresses may be wrong.  Call functions to reload
+		// the values for this thread.
+		mp = runtime_m();
+		gp = runtime_g();
 	}
-	if (g == nil || !g->fromgogo) {
+	if (gp == nil || !gp->fromgogo) {
 #ifdef USING_SPLIT_STACK
-		__splitstack_setcontext(&m->g0->stack_context[0]);
+		__splitstack_setcontext(&mp->g0->stack_context[0]);
 #endif
-		m->g0->entry = (byte*)pfn;
-		m->g0->param = g;
-		g = m->g0;
-		fixcontext(&m->g0->context);
-		setcontext(&m->g0->context);
+		mp->g0->entry = (byte*)pfn;
+		mp->g0->param = gp;
+
+		// It's OK to set g directly here because this case
+		// can not occur if we got here via a setcontext to
+		// the getcontext call just above.
+		g = mp->g0;
+
+		fixcontext(&mp->g0->context);
+		setcontext(&mp->g0->context);
 		runtime_throw("runtime: mcall function returned");
 	}
 }
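
[Editorial sketch, not part of the thread.]  The underlying fact in the
explanation above is that a thread-local variable has a different
address on each thread, so an address computed on one thread is wrong
once the code continues on another.  The sketch below only demonstrates
the per-thread addresses with std::thread; it does not reproduce the
getcontext/setcontext migration itself.  tls_slot_sketch,
current_slot_sketch and addresses_differ_sketch are hypothetical names.

```cpp
#include <cassert>
#include <thread>

// One instance of this variable exists per thread.
static thread_local int tls_slot_sketch = 0;

// Computes the address for the calling thread each time it is called,
// in the spirit of reloading through runtime_m()/runtime_g().
int* current_slot_sketch()
{
  return &tls_slot_sketch;
}

// Compares this thread's slot address with the one seen by a second
// thread while both threads are alive.
bool addresses_differ_sketch()
{
  int* mine = current_slot_sketch();
  bool differ = false;
  std::thread t([&differ, mine] {
    differ = (mine != current_slot_sketch());
  });
  t.join();
  return differ;
}
```

A pointer such as `mine` that is kept across a thread switch keeps
referring to the first thread's copy, which is the stale-address
situation the patch avoids by calling functions to reload the values.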


Re: [C/C++ PATCH] Fix merge_decls/duplicate_decls DECL_USER_ALIGN/DECL_ALIGN handling (PR c/52181)

2012-02-13 Thread Jason Merrill

OK.

Jason


Re: [C++ Patch] PR 51494 (and 52183)

2012-02-13 Thread Jason Merrill

This patch fixes this particular bug, but there are some issues.

First, non_static_member_function_p only checks the first function in 
the overload set, which may not be representative of all of them.  It 
really shouldn't look through OVERLOADs, we need to defer this decision 
until build_over_call.


Second, the uses of maybe_dummy_object in build_offset_ref, 
finish_qualified_id_expr and finish_id_expression could also be dealing 
with static member functions.


The underlying problem here is that we're only supposed to capture a 
variable/this when it is odr-used, which we can't know until we finish 
overload resolution.


Jason


Re: [libitm] Add SPARC bits

2012-02-13 Thread David Miller
From: Eric Botcazou ebotca...@adacore.com
Date: Sun, 12 Feb 2012 21:15:26 +0100

 + load [%o1 + OFFSET (JB_CFA)], %fp
 + cfi_def_cfa(%fp, 0)
 +#if STACK_BIAS
 + sub %fp, STACK_BIAS, %fp
 + cfi_def_cfa_offset(STACK_BIAS)
 +#endif

I think you really need to put the proper value into the %fp register
atomically here.

If an interrupt comes in before you STACK_BIAS adjust the %fp, a
debugger or similar could see a corrupt frame pointer.


Re: [libitm] Add SPARC bits

2012-02-13 Thread David Miller
From: Eric Botcazou ebotca...@adacore.com
Date: Sun, 12 Feb 2012 21:15:26 +0100

 +static inline void
 +cpu_relax (void)
 +{
 +  __asm volatile ("" : : : "memory");
 +}

We probably want to do some nop'ish thing here which will yield the
cpu thread on Niagara cpus, I'd recommend something along the lines of
"rd %ccr, %g0" or "rd %y, %g0".



Re: [libitm] Link with -litm and -pthread

2012-02-13 Thread Hans-Peter Nilsson
On Sat, 11 Feb 2012, Eric Botcazou wrote:
 Hi,

 this completes the half-implemented linking scheme of libitm and makes it mimic
 that of libgomp entirely.  We need the -pthread thing on Solaris 8.

It broke all targets that don't implement threads and as such
don't support -pthread.  And you need to gate *all* tm-related
tests on something like check_effective_target_pthread.
I see regress-155 for cris-elf.

Can't you just limit adding -pthread to Solaris 8 or something?

brgds, H-P


[google/integration] Add support for powerpc64-grtev2-linux-gnu (issue5659050)

2012-02-13 Thread Doug Kwan
Hi,

This patch adds support for powerpc*-grtev2-linux-gnu.  The changes
include:

1. Relocating the dynamic linker using a run-time root prefix.
2. Using different library settings for static linking.

This was tested by building PowerPC64 and PowerPC toolchains and running
some tests with the resulting toolchains.

This is used by Google and is not meant to be sent to trunk.

-Doug
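
[Editorial sketch, not part of the thread.]  Item 1 above relies on
adjacent string literals being concatenated by the compiler, with the
prefix defaulting to an empty string when nothing defines it.  A
minimal sketch, assuming the prefix and the path are both string
literals; SKETCH_ROOT_PREFIX, SKETCH_DYNAMIC_LINKER and
dynamic_linker_sketch are hypothetical names, not the GCC macros.

```cpp
#include <cassert>
#include <cstring>

// Default to an empty prefix unless a configuration header defined one.
#ifndef SKETCH_ROOT_PREFIX
#define SKETCH_ROOT_PREFIX ""
#endif

// Adjacent string literals are joined at compile time, so this expands
// to a single literal: the prefix followed by the loader path.
#define SKETCH_DYNAMIC_LINKER SKETCH_ROOT_PREFIX "/lib/ld.so.1"

const char* dynamic_linker_sketch()
{
  return SKETCH_DYNAMIC_LINKER;
}
```

Compiling with a definition such as -DSKETCH_ROOT_PREFIX='"/some/root"'
would relocate the path without touching the second macro.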

2012-02-13   Doug Kwan  dougk...@google.com

* gcc/config.gcc (powerpc*-*-linux): Pull in GRTEv2 spec changes if
target matches *-grtev2-*.
* gcc/config/rs6000/linux64.h (GLIBC_DYNAMIC_LINKER{32,64}): Add
runtime root prefix to glibc's dynamic linker.
* gcc/config/rs6000/linux-grtev2.h: New file.
* gcc/config/rs6000/sysv4.h (GLIBC_DYNAMIC_LINKER): Add
runtime root prefix to glibc's dynamic linker.
(LINUX_GRTE_EXTRA_SPECS): Define to be empty if no definition found.
(SUBTARGET_EXTRA_SPECS): Include LINUX_GRTE_EXTRA_SPECS.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 184150)
+++ gcc/config.gcc  (working copy)
@@ -2040,6 +2040,12 @@ powerpc-*-linux* | powerpc64-*-linux*)
        if test x${enable_secureplt} = xyes; then
                tm_file="rs6000/secureplt.h ${tm_file}"
        fi
+       # Pull in spec changes for GRTEv2 configurations.
+       case ${target} in
+           *-grtev2-*)
+               tm_file="${tm_file} rs6000/linux-grtev2.h"
+               ;;
+       esac
        ;;
 powerpc-wrs-vxworks|powerpc-wrs-vxworksae)
        tm_file="${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h"
Index: gcc/config/rs6000/linux64.h
===
--- gcc/config/rs6000/linux64.h (revision 184150)
+++ gcc/config/rs6000/linux64.h (working copy)
@@ -367,8 +367,8 @@ extern int dot_symbols;
 #undef LINK_OS_DEFAULT_SPEC
 #define LINK_OS_DEFAULT_SPEC "%(link_os_linux)"
 
-#define GLIBC_DYNAMIC_LINKER32 "/lib/ld.so.1"
-#define GLIBC_DYNAMIC_LINKER64 "/lib64/ld64.so.1"
+#define GLIBC_DYNAMIC_LINKER32 RUNTIME_ROOT_PREFIX "/lib/ld.so.1"
+#define GLIBC_DYNAMIC_LINKER64 RUNTIME_ROOT_PREFIX "/lib64/ld64.so.1"
 #define UCLIBC_DYNAMIC_LINKER32 "/lib/ld-uClibc.so.0"
 #define UCLIBC_DYNAMIC_LINKER64 "/lib/ld64-uClibc.so.0"
 #if DEFAULT_LIBC == LIBC_UCLIBC
Index: gcc/config/rs6000/linux-grtev2.h
===
--- gcc/config/rs6000/linux-grtev2.h(revision 0)
+++ gcc/config/rs6000/linux-grtev2.h(revision 0)
@@ -0,0 +1,43 @@
+/* Definitions for Linux-based GRTE (Google RunTime Environment) version 2.
+   Copyright (C) 2009,2010,2011,2012 Free Software Foundation, Inc.
+   Contributed by Chris Demetriou and Ollie Wild.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Overrides LIB_LINUX_SPEC from sysv4.h.  */
+#undef LIB_LINUX_SPEC
+#define LIB_LINUX_SPEC \
+  "%{pthread:-lpthread} \
+   %{shared:-lc} \
+   %{!shared:%{mieee-fp:-lieee} %{profile:%(libc_p)}%{!profile:%(libc)}}"
+
+/* When GRTE links statically, it needs its NSS and resolver libraries
+   linked in as well.  Note that when linking statically, these are
+   enclosed in a group by LINK_GCC_C_SEQUENCE_SPEC.  */
+#undef LINUX_GRTE_EXTRA_SPECS
+#define LINUX_GRTE_EXTRA_SPECS \
+  { "libc", "%{static:%(libc_static);:-lc}" }, \
+  { "libc_p", "%{static:%(libc_p_static);:-lc_p}" }, \
+  { "libc_static", \
+    "-lc -lnss_borg -lnss_cache -lnss_dns -lnss_files -lresolv" }, \
+  { "libc_p_static", \
+    "-lc_p -lnss_borg_p -lnss_cache_p -lnss_dns_p -lnss_files_p -lresolv_p" },
Index: gcc/config/rs6000/sysv4.h
===
--- gcc/config/rs6000/sysv4.h   (revision 184150)
+++ gcc/config/rs6000/sysv4.h   (working copy)
@@ -803,7 +803,10 @@ extern int fixuplabelno;
 
 #define LINK_START_LINUX_SPEC ""
 
-#define GLIBC_DYNAMIC_LINKER "/lib/ld.so.1"
+#ifndef RUNTIME_ROOT_PREFIX
+#define RUNTIME_ROOT_PREFIX ""
+#endif
+#define GLIBC_DYNAMIC_LINKER RUNTIME_ROOT_PREFIX "/lib/ld.so.1"
 #define UCLIBC_DYNAMIC_LINKER "/lib/ld-uClibc.so.0"
 #if DEFAULT_LIBC == LIBC_UCLIBC
 #define 

Re: [google/integration] Add support for powerpc64-grtev2-linux-gnu (issue5659050)

2012-02-13 Thread Andrew Pinski
On Mon, Feb 13, 2012 at 6:41 PM, Doug Kwan dougk...@google.com wrote:
 Hi,

 This patch adds support for powerpc*-grtev2-linux-gnu.  The changes
 include:

 1. Relocating the dynamic linker using a run-time root prefix.
 2. Using different library setting in static linking.

 This is tested by building PowerPC64 and PowerPC toolchains and ran
 some tests with the resulting toolchain.

 This is used by Google and is not meant to be sent to trunk.

 -Doug

 2012-02-13   Doug Kwan  dougk...@google.com

        * gcc/config.gcc (powerpc*-*-linux): Pull in GRTEv2 spec changes if
        target matches *-grtev2-*.
        * gcc/config/rs6000/linux64.h (GLIB_DYNAMIC_LINKER{32,64}): Add
        runtime root prefix to glibc's dynamic linker.
        * gcc/config/rs6000/linux-grtev2.h: New file.
        * gcc/config/rs6000/sysv4.h (GLIB_DYNAMIC_LINKER): Add
        runtime root prefix to glibc's dynamic linker.
        (LINUX_GRTE_EXTRA_SPECS): Define to be empty if no definition found.
        (SUBTARGET_EXTRA_SPECS): Include LINUX_GRTE_EXTRA_SPECS.

 Index: gcc/config.gcc
 ===
 --- gcc/config.gcc      (revision 184150)
 +++ gcc/config.gcc      (working copy)
 @@ -2040,6 +2040,12 @@ powerpc-*-linux* | powerpc64-*-linux*)
        if test x${enable_secureplt} = xyes; then
                tm_file=rs6000/secureplt.h ${tm_file}
        fi
 +       # Pull in spec changes for GRTEv2 configurations.
 +       case ${target} in
 +       *-grtev2-*)
 +           tm_file=${tm_file} rs6000/linux-grtev2.h
 +           ;;
 +       esac
        ;;
  powerpc-wrs-vxworks|powerpc-wrs-vxworksae)
        tm_file=${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h
 Index: gcc/config/rs6000/linux64.h
 ===
 --- gcc/config/rs6000/linux64.h (revision 184150)
 +++ gcc/config/rs6000/linux64.h (working copy)
 @@ -367,8 +367,8 @@ extern int dot_symbols;
  #undef LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC %(link_os_linux)

 -#define GLIBC_DYNAMIC_LINKER32 /lib/ld.so.1
 -#define GLIBC_DYNAMIC_LINKER64 /lib64/ld64.so.1
 +#define GLIBC_DYNAMIC_LINKER32 RUNTIME_ROOT_PREFIX /lib/ld.so.1
 +#define GLIBC_DYNAMIC_LINKER64 RUNTIME_ROOT_PREFIX /lib64/ld64.so.1
  #define UCLIBC_DYNAMIC_LINKER32 /lib/ld-uClibc.so.0
  #define UCLIBC_DYNAMIC_LINKER64 /lib/ld64-uClibc.so.0
  #if DEFAULT_LIBC == LIBC_UCLIBC
 Index: gcc/config/rs6000/linux-grtev2.h
 ===
 --- gcc/config/rs6000/linux-grtev2.h    (revision 0)
 +++ gcc/config/rs6000/linux-grtev2.h    (revision 0)
 @@ -0,0 +1,43 @@
 +/* Definitions for Linux-based GRTE (Google RunTime Environment) version 2.
 +   Copyright (C) 2009,2010,2011,2012 Free Software Foundation, Inc.
 +   Contributed by Chris Demetriou and Ollie Wild.
 +
 +This file is part of GCC.
 +
 +GCC is free software; you can redistribute it and/or modify
 +it under the terms of the GNU General Public License as published by
 +the Free Software Foundation; either version 3, or (at your option)
 +any later version.
 +
 +GCC is distributed in the hope that it will be useful,
 +but WITHOUT ANY WARRANTY; without even the implied warranty of
 +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +GNU General Public License for more details.
 +
 +Under Section 7 of GPL version 3, you are granted additional
 +permissions described in the GCC Runtime Library Exception, version
 +3.1, as published by the Free Software Foundation.
 +
 +You should have received a copy of the GNU General Public License and
 +a copy of the GCC Runtime Library Exception along with this program;
 +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 +http://www.gnu.org/licenses/.  */
 +
 +/* Overrides LIB_LINUX_SPEC from sysv4.h.  */
 +#undef LIB_LINUX_SPEC
 +#define LIB_LINUX_SPEC \
 +  %{pthread:-lpthread} \
 +   %{shared:-lc} \
 +   %{!shared:%{mieee-fp:-lieee} %{profile:%(libc_p)}%{!profile:%(libc)}}
 +
 +/* When GRTE links statically, it needs its NSS and resolver libraries
 +   linked in as well.  Note that when linking statically, these are
 +   enclosed in a group by LINK_GCC_C_SEQUENCE_SPEC.  */
 +#undef LINUX_GRTE_EXTRA_SPECS
 +#define LINUX_GRTE_EXTRA_SPECS \
 +  { libc, %{static:%(libc_static);:-lc} }, \
 +  { libc_p, %{static:%(libc_p_static);:-lc_p} }, \
 +  { libc_static, \
 +    -lc -lnss_borg -lnss_cache -lnss_dns -lnss_files -lresolv }, \
 +  { libc_p_static, \
 +    -lc_p -lnss_borg_p -lnss_cache_p -lnss_dns_p -lnss_files_p -lresolv_p 
 },

Really, can't you fix glibc so that libnss.a is not needed?
See http://sourceware.org/bugzilla/show_bug.cgi?id=6528 for those fixes.

Thanks,
Andrew Pinski


 Index: gcc/config/rs6000/sysv4.h
 ===
 --- gcc/config/rs6000/sysv4.h   (revision 184150)
 +++ gcc/config/rs6000/sysv4.h   (working copy)
 @@ -803,7 +803,10 @@ extern int fixuplabelno;

 

Re: [google/integration] Add support for powerpc64-grtev2-linux-gnu (issue5659050)

2012-02-13 Thread 關振德
Thanks Andrew.  I will take a look at that.

-Doug

On Mon, Feb 13, 2012 at 6:45 PM, Andrew Pinski pins...@gmail.com wrote:
 On Mon, Feb 13, 2012 at 6:41 PM, Doug Kwan dougk...@google.com wrote:
 Hi,

 This patch adds support for powerpc*-grtev2-linux-gnu.  The changes
 include:

 1. Relocating the dynamic linker using a run-time root prefix.
 2. Using different library setting in static linking.

 This is tested by building PowerPC64 and PowerPC toolchains and ran
 some tests with the resulting toolchain.

 This is used by Google and is not meant to be sent to trunk.

 -Doug

 2012-02-13   Doug Kwan  dougk...@google.com

        * gcc/config.gcc (powerpc*-*-linux): Pull in GRTEv2 spec changes if
        target matches *-grtev2-*.
        * gcc/config/rs6000/linux64.h (GLIB_DYNAMIC_LINKER{32,64}): Add
        runtime root prefix to glibc's dynamic linker.
        * gcc/config/rs6000/linux-grtev2.h: New file.
        * gcc/config/rs6000/sysv4.h (GLIB_DYNAMIC_LINKER): Add
        runtime root prefix to glibc's dynamic linker.
        (LINUX_GRTE_EXTRA_SPECS): Define to be empty if no definition found.
        (SUBTARGET_EXTRA_SPECS): Include LINUX_GRTE_EXTRA_SPECS.

 Index: gcc/config.gcc
 ===
 --- gcc/config.gcc      (revision 184150)
 +++ gcc/config.gcc      (working copy)
 @@ -2040,6 +2040,12 @@ powerpc-*-linux* | powerpc64-*-linux*)
        if test x${enable_secureplt} = xyes; then
                tm_file=rs6000/secureplt.h ${tm_file}
        fi
 +       # Pull in spec changes for GRTEv2 configurations.
 +       case ${target} in
 +       *-grtev2-*)
 +           tm_file=${tm_file} rs6000/linux-grtev2.h
 +           ;;
 +       esac
        ;;
  powerpc-wrs-vxworks|powerpc-wrs-vxworksae)
        tm_file=${tm_file} elfos.h freebsd-spec.h rs6000/sysv4.h
 Index: gcc/config/rs6000/linux64.h
 ===
 --- gcc/config/rs6000/linux64.h (revision 184150)
 +++ gcc/config/rs6000/linux64.h (working copy)
 @@ -367,8 +367,8 @@ extern int dot_symbols;
  #undef LINK_OS_DEFAULT_SPEC
  #define LINK_OS_DEFAULT_SPEC %(link_os_linux)

 -#define GLIBC_DYNAMIC_LINKER32 /lib/ld.so.1
 -#define GLIBC_DYNAMIC_LINKER64 /lib64/ld64.so.1
 +#define GLIBC_DYNAMIC_LINKER32 RUNTIME_ROOT_PREFIX /lib/ld.so.1
 +#define GLIBC_DYNAMIC_LINKER64 RUNTIME_ROOT_PREFIX /lib64/ld64.so.1
  #define UCLIBC_DYNAMIC_LINKER32 /lib/ld-uClibc.so.0
  #define UCLIBC_DYNAMIC_LINKER64 /lib/ld64-uClibc.so.0
  #if DEFAULT_LIBC == LIBC_UCLIBC
 Index: gcc/config/rs6000/linux-grtev2.h
 ===
 --- gcc/config/rs6000/linux-grtev2.h    (revision 0)
 +++ gcc/config/rs6000/linux-grtev2.h    (revision 0)
 @@ -0,0 +1,43 @@
 +/* Definitions for Linux-based GRTE (Google RunTime Environment) version 2.
 +   Copyright (C) 2009,2010,2011,2012 Free Software Foundation, Inc.
 +   Contributed by Chris Demetriou and Ollie Wild.
 +
 +This file is part of GCC.
 +
 +GCC is free software; you can redistribute it and/or modify
 +it under the terms of the GNU General Public License as published by
 +the Free Software Foundation; either version 3, or (at your option)
 +any later version.
 +
 +GCC is distributed in the hope that it will be useful,
 +but WITHOUT ANY WARRANTY; without even the implied warranty of
 +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +GNU General Public License for more details.
 +
 +Under Section 7 of GPL version 3, you are granted additional
 +permissions described in the GCC Runtime Library Exception, version
 +3.1, as published by the Free Software Foundation.
 +
 +You should have received a copy of the GNU General Public License and
 +a copy of the GCC Runtime Library Exception along with this program;
 +see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 +http://www.gnu.org/licenses/.  */
 +
 +/* Overrides LIB_LINUX_SPEC from sysv4.h.  */
 +#undef LIB_LINUX_SPEC
 +#define LIB_LINUX_SPEC \
 +  %{pthread:-lpthread} \
 +   %{shared:-lc} \
 +   %{!shared:%{mieee-fp:-lieee} %{profile:%(libc_p)}%{!profile:%(libc)}}
 +
 +/* When GRTE links statically, it needs its NSS and resolver libraries
 +   linked in as well.  Note that when linking statically, these are
 +   enclosed in a group by LINK_GCC_C_SEQUENCE_SPEC.  */
 +#undef LINUX_GRTE_EXTRA_SPECS
 +#define LINUX_GRTE_EXTRA_SPECS \
 +  { libc, %{static:%(libc_static);:-lc} }, \
 +  { libc_p, %{static:%(libc_p_static);:-lc_p} }, \
 +  { libc_static, \
 +    -lc -lnss_borg -lnss_cache -lnss_dns -lnss_files -lresolv }, \
 +  { libc_p_static, \
 +    -lc_p -lnss_borg_p -lnss_cache_p -lnss_dns_p -lnss_files_p -lresolv_p 
 },

 Really, can't you fix glibc so that libnss.a is not needed?
 See http://sourceware.org/bugzilla/show_bug.cgi?id=6528 for those fixes.

 Thanks,
 Andrew Pinski
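
The dynamic-linker relocation in the quoted linux64.h hunk appears to rely on
C's compile-time concatenation of adjacent string literals. The archive seems
to have dropped the string quotes, so the sketch below assumes the macros are
ordinary string literals. The prefix value "/example/root" and the helper
dynamic_linker_path() are hypothetical and exist only for illustration; the
real RUNTIME_ROOT_PREFIX is defined elsewhere and is not shown in this thread.

```c
#include <stddef.h>

/* Hypothetical prefix, for illustration only.  */
#ifndef RUNTIME_ROOT_PREFIX
#define RUNTIME_ROOT_PREFIX "/example/root"
#endif

/* Adjacent string literals are merged by the compiler, so the prefix is
   glued onto the stock glibc loader paths at compile time.  */
#define GLIBC_DYNAMIC_LINKER32 RUNTIME_ROOT_PREFIX "/lib/ld.so.1"
#define GLIBC_DYNAMIC_LINKER64 RUNTIME_ROOT_PREFIX "/lib64/ld64.so.1"

/* Hypothetical helper: return the loader path for the requested word size.  */
const char *
dynamic_linker_path (int is_64bit)
{
  return is_64bit ? GLIBC_DYNAMIC_LINKER64 : GLIBC_DYNAMIC_LINKER32;
}
```

With an empty prefix the macros collapse back to the unpatched paths, which is
presumably why the change is harmless for non-GRTE configurations.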


 Index: gcc/config/rs6000/sysv4.h
 ===
 --- 

Re: [v3] libstdc++/51798

2012-02-13 Thread Benjamin Kosnik
 
 The patch uses the weak version of compare_exchange universally, which
 is incorrect in a number of cases.  You wouldn't see this on x86_64;
 you'd have to use a ll/sc target such as powerpc.
 
 In addition to changing several uses to strong compare_exchange, I
 also optimize the idiom
 
   do
     {
       var = *m;
       newval = ...;
     }
   while (!atomic_compare_exchange(m, var, newval, ...));
 
 With the new builtins, VAR is updated with the current value of the 
 memory (regardless of the weak setting), so the initial read from *M
 can be hoisted outside the loop.
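
A minimal sketch of the two points above, written against C11 <stdatomic.h>
rather than the GCC builtins the patch itself uses. The helper names
fetch_multiply() and claim_once() are hypothetical. The first shows the
hoisted read: a failed compare-exchange writes the current value back into
VAR, so the loop never has to reload *M. The second shows a one-shot attempt,
where the strong form is the safer choice because a spurious failure of the
weak form would be reported as a real one.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically replace *M with *M * FACTOR and return the previous value.  */
int
fetch_multiply (atomic_int *m, int factor)
{
  /* Single initial read, hoisted out of the loop.  */
  int var = atomic_load (m);
  int newval;

  do
    newval = var * factor;
  /* On failure, including a spurious failure of the weak form, VAR is
     refreshed with the current contents of *M.  */
  while (!atomic_compare_exchange_weak (m, &var, newval));

  return var;
}

/* Try exactly once to move *FLAG from 0 to 1.  There is no retry loop,
   so the strong form is used: it fails only if *FLAG was not 0.  */
bool
claim_once (atomic_int *flag)
{
  int expected = 0;
  return atomic_compare_exchange_strong (flag, &expected, 1);
}
```

The weak form is fine inside the retry loop because a spurious failure only
costs one more iteration.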

nice!

 
 Ok?
 
cool, thanks for reviewing this. 

I fixed up the line numbers for the header file edits.

-benjamin
2012-02-13  Benjamin Kosnik  b...@redhat.com

	* testsuite/20_util/shared_ptr/cons/43820_neg.cc: Adjust line numbers.
	* testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc: Same.

diff --git a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
index 0d51663..39f9ce3 100644
--- a/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/shared_ptr/cons/43820_neg.cc
@@ -32,9 +32,9 @@ void test01()
 {
   X* px = 0;
   std::shared_ptr<X> p1(px);   // { dg-error here }
-  // { dg-error incomplete  { target *-*-* } 773 }
+  // { dg-error incomplete  { target *-*-* } 771 }
 
   std::shared_ptr<X> p9(ap());  // { dg-error here }
-  // { dg-error incomplete  { target *-*-* } 867 }
+  // { dg-error incomplete  { target *-*-* } 865 }
 
 }
diff --git a/libstdc++-v3/testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc b/libstdc++-v3/testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc
index ae902dc..0309f8f 100644
--- a/libstdc++-v3/testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc
+++ b/libstdc++-v3/testsuite/tr1/2_general_utilities/shared_ptr/cons/43820_neg.cc
@@ -1,6 +1,6 @@
 // { dg-do compile }
 
-// Copyright (C) 2010 Free Software Foundation
+// Copyright (C) 2010, 2012 Free Software Foundation
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
@@ -30,9 +30,9 @@ void test01()
 {
   X* px = 0;
   std::tr1::shared_ptr<X> p1(px);   // { dg-error here }
-  // { dg-error incomplete  { target *-*-* } 565 }
+  // { dg-error incomplete  { target *-*-* } 563 }
 
   std::tr1::shared_ptr<X> p9(ap());  // { dg-error here }
-  // { dg-error incomplete  { target *-*-* } 604 }
+  // { dg-error incomplete  { target *-*-* } 602 }
 
 }


RE: [PATCH ARM] backport r174803 from trunk to 4.6 branch

2012-02-13 Thread Bin Cheng


 -Original Message-
 From: Richard Earnshaw
 Sent: Monday, February 13, 2012 7:37 PM
 To: Bin Cheng
 Cc: gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH ARM] backport r174803 from trunk to 4.6 branch
 
 On 08/02/12 08:29, Bin Cheng wrote:
  Hi,
  Julian Brown once posted a patch fixing an ARM EABI violation, which I
  think is also essential to the 4.6 branch.
  I created a patch against the 4.6 branch as attached. Is it OK to backport?
 
  You can refer to the following link for the original patch.
  http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00260.html
 
  Thanks
 
  gcc/ChangeLog:
  2012-02-08  Bin Cheng  bin.ch...@arm.com
 
  Backport from mainline
  2011-06-08  Julian Brown  jul...@codesourcery.com
 
  * config/arm/arm.c (arm_libcall_uses_aapcs_base): Use correct ABI
  for double-precision helper functions in hard-float mode if only
  single-precision arithmetic is supported in hardware.
 
 
 
 
 
 
 OK.
 
 Can you back-port it to 4.5 as well, please?


Hi,

Thanks for approving. I will backport this and r183734 (from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51835) to the 4.5 branch.

Thanks.





RE: [Ping] RE: CR16 Port addition

2012-02-13 Thread Jayant R. Sonar
Hello Gerald,

Thank you for this suggestion.

I have not worked on these changes before, so could you please review the
attached patch and let me know if it needs any changes?

Thanks and Regards,
Jayant Sonar
[KPIT Cummins, Pune]



cr16-htdocs2.diff
Description: cr16-htdocs2.diff


MAINTAINERS: add myself

2012-02-13 Thread Walter Lee
Committed.

2012-02-14  Walter Lee  w...@tilera.com

* MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 184193)
+++ MAINTAINERS (working copy)
@@ -428,6 +428,7 @@ Asher Langton   langt...@llnl.gov
 Chris Lattner  sa...@nondot.org
 Terry Laurenzo tlaure...@gmail.com
 Georg-Johann Lay   a...@gjlay.de
+Walter Lee w...@tilera.com
 Marc Lehmann   p...@goof.com
 James Lemke    jwle...@codesourcery.com
 Kriang Lerdsuwanakij   lerds...@users.sourceforge.net


[PATCH] Prefer reg as first operand in commutative operator

2012-02-13 Thread Paulo J. Matos

Hi,

This patch was submitted as part of PR 52235.
It increases the preference of a register as the first operand of a
commutative operator.


2012-02-13 Paulo Matos paulo.ma...@csr.com
* gcc/rtlanal.c: Increase preference of a register for the
first operand in a commutative operator.

--- gcc46/gcc/rtlanal.c (gcc 4.6.2)
+++ gcc46/gcc/rtlanal.c (working copy)
@@ -3047,11 +3047,11 @@

   /* Constants always come the second operand.  Prefer nice constants.  */

   if (code == CONST_INT)
+return -9;
+  if (code == CONST_DOUBLE)
 return -8;
-  if (code == CONST_DOUBLE)
-return -7;
   if (code == CONST_FIXED)
-return -7;
+return -8;
   op = avoid_constant_pool_reference (op);
   code = GET_CODE (op);

@@ -3059,26 +3059,28 @@
 {
 case RTX_CONST_OBJ:
   if (code == CONST_INT)
+return -7;
+  if (code == CONST_DOUBLE)
 return -6;
-  if (code == CONST_DOUBLE)
-return -5;
   if (code == CONST_FIXED)
-return -5;
-  return -4;
+return -6;
+  return -5;

 case RTX_EXTRA:
   /* SUBREGs of objects should come second.  */
   if (code == SUBREG && OBJECT_P (SUBREG_REG (op)))
-return -3;
+return -4;
   return 0;

 case RTX_OBJ:
   /* Complex expressions should be the first, so decrease priority
  of objects.  Prefer pointer objects over non pointer objects.  */
-  if ((REG_P (op) && REG_POINTER (op))
- || (MEM_P (op) && MEM_POINTER (op)))
-   return -1;
-  return -2;
+  if(REG_P(op))
+  return -1;
+  else if ((REG_P (op) && REG_POINTER (op))
+   || (MEM_P (op) && MEM_POINTER (op)))
+  return -2;
+  return -3;

 case RTX_COMM_ARITH:
   /* Prefer operands that are themselves commutative to be first.

--
PMatos
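
As a rough illustration of what the renumbering does, here is a toy model of
the ordering. The enum and the functions toy_precedence() and
toy_should_swap() are hypothetical stand-ins, not GCC's rtx codes or the real
functions in rtlanal.c; the numeric values are copied from the '+' side of
the patch above. It assumes, as in GCC, that two operands of a commutative
operator are swapped when the first has lower precedence than the second.

```c
#include <stdbool.h>

/* Toy operand kinds, loosely mirroring the cases in the patch.  */
enum operand_kind
{
  OP_CONST_INT,
  OP_CONST_DOUBLE,
  OP_SUBREG_OF_OBJECT,
  OP_REG,
  OP_MEM_POINTER,
  OP_OTHER_OBJECT,
  OP_OTHER
};

/* Higher values are preferred as the first operand.  */
int
toy_precedence (enum operand_kind kind)
{
  switch (kind)
    {
    case OP_CONST_INT:        return -9;
    case OP_CONST_DOUBLE:     return -8;
    case OP_SUBREG_OF_OBJECT: return -4;
    case OP_OTHER_OBJECT:     return -3;
    case OP_MEM_POINTER:      return -2;
    case OP_REG:              return -1;
    default:                  return 0;
    }
}

/* Swap when the first operand ranks below the second.  */
bool
toy_should_swap (enum operand_kind first, enum operand_kind second)
{
  return toy_precedence (first) < toy_precedence (second);
}
```

In this model a plain memory object followed by a register is swapped so that
the register comes first, while a register followed by a constant is left
alone.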



Re: [PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Uros Bizjak
On Tue, Feb 14, 2012 at 1:15 AM, Richard Henderson r...@redhat.com wrote:

 -     movq    56(%rsi), %rdx
       movl    %edi, %eax
       cfi_def_cfa(%rcx, 0)
 -     cfi_register(%rip, %rdx)
       movq    %rcx, %rsp
 -     jmp     *%rdx
 +     jmp     *56(%rsi)

 If you're going to do that, the correct fix for the unwind info is

 - cfi_register(%rip, %rdx)
 + cfi_offset(%rip, 56)

 Hm, we just defined new CFA as rcx+0, so we should define location of
 rip relative to new CFA. Since CFA points to stack slot just before
 return address was pushed, new rip lies at CFA-8 for 64bit resp. CFA-4
 for x86_32. Did I get these .cfi directives correctly?

 No.  The value at %rcx-8 is total garbage.  There is no guarantee that
 the call stack leading to this abort has anything in common with the
 call stack that created the jmpbuf, except *above* %rcx, the new CFA.

 The new rip is at rsi+56.  You can see that in that you jump to it.

Thanks for the explanation, I will commit the patch with your suggested change.

Uros.


Re: [PATCH, libitm]: GTM_longjmp: Jump indirect from memory address

2012-02-13 Thread Uros Bizjak
On Tue, Feb 14, 2012 at 8:39 AM, Uros Bizjak ubiz...@gmail.com wrote:

 - cfi_register(%rip, %rdx)
 + cfi_offset(%rip, 56)

 Hm, we just defined new CFA as rcx+0, so we should define location of
 rip relative to new CFA. Since CFA points to stack slot just before
 return address was pushed, new rip lies at CFA-8 for 64bit resp. CFA-4
 for x86_32. Did I get these .cfi directives correctly?

 No.  The value at %rcx-8 is total garbage.  There is no guarantee that
 the call stack leading to this abort has anything in common with the
 call stack that created the jmpbuf, except *above* %rcx, the new CFA.

 The new rip is at rsi+56.  You can see that in that you jump to it.

 Thanks for the explanation, I will commit the patch with your suggested 
 change.

Now with the patch attached... (please also note that rip is now
defined with offset to old CFA, before CFA is updated to new
register).

Uros.
Index: ChangeLog
===
--- ChangeLog   (revision 184197)
+++ ChangeLog   (working copy)
@@ -1,3 +1,7 @@
+2012-02-15  Uros Bizjak  ubiz...@gmail.com
+
+   * config/x86/sjlj.S (GTM_longjmp): Jump indirect from memory address.
+
 2012-02-13  Eric Botcazou  ebotca...@adacore.com
 
* configure.tgt (target_cpu): Handle sparc and sparc64  sparcv9.
Index: config/x86/sjlj.S
===
--- config/x86/sjlj.S   (revision 184150)
+++ config/x86/sjlj.S   (working copy)
@@ -119,23 +119,21 @@
        movq    32(%rsi), %r13
        movq    40(%rsi), %r14
        movq    48(%rsi), %r15
-       movq    56(%rsi), %rdx
        movl    %edi, %eax
+       cfi_offset(%rip, 56)
        cfi_def_cfa(%rcx, 0)
-       cfi_register(%rip, %rdx)
        movq    %rcx, %rsp
-       jmp     *%rdx
+       jmp     *56(%rsi)
 #else
        movl    (%edx), %ecx
        movl    4(%edx), %ebx
        movl    8(%edx), %esi
        movl    12(%edx), %edi
        movl    16(%edx), %ebp
-       movl    20(%edx), %edx
+       cfi_offset(%eip, 20)
        cfi_def_cfa(%ecx, 0)
-       cfi_register(%eip, %edx)
        movl    %ecx, %esp
-       jmp     *%edx
+       jmp     *20(%edx)
 #endif
        cfi_endproc
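
For readers following the offsets, the 64-bit half of the hunk above implies
a jump buffer shaped roughly like the sketch below. This is an assumption
inferred only from the offsets visible in the assembly (32 for r13, 40 for
r14, 48 for r15, 56 for the resume address); the struct name, the names of
the first four fields, and the helpers rip_offset() and resume_address() are
hypothetical, and the real definition may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the buffer addressed by %rsi.  */
struct jmpbuf_sketch
{
  uint64_t cfa;   /* offset 0: presumably loaded into %rcx as the new CFA */
  uint64_t rbx;   /* offset 8 (guess) */
  uint64_t rbp;   /* offset 16 (guess) */
  uint64_t r12;   /* offset 24 (guess) */
  uint64_t r13;   /* offset 32, as in the assembly */
  uint64_t r14;   /* offset 40, as in the assembly */
  uint64_t r15;   /* offset 48, as in the assembly */
  uint64_t rip;   /* offset 56: target of the indirect jump */
};

/* Byte offset of the saved resume address within the buffer.  */
size_t
rip_offset (void)
{
  return offsetof (struct jmpbuf_sketch, rip);
}

/* The address that "jmp *56(%rsi)" would transfer control to.  */
uint64_t
resume_address (const struct jmpbuf_sketch *jb)
{
  return jb->rip;
}
```

Read this way, the resume address never needs to pass through a scratch
register: it stays in the buffer and the jump reads it from there.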