date:20120110

Re: [PATCH, ARM] Fix stack red zone bug (PR38644) for GCC 4.6

2012-01-10 Thread Sebastian Huber


On 01/09/2012 05:58 PM, Ramana Radhakrishnan wrote:

On 9 January 2012 08:49, Sebastian Huber
sebastian.hu...@embedded-brains.de  wrote:

What is missing to get this ported back to the GCC 4.6 branch?


I ran a sanity check again today and backported them to the 4.6
branch. Sorry about the delay.


Thanks.


I prefer to do the same for the 4.4 and
4.5 branches. If you are happy to test them then I'm ok with doing the
backports.


The problem with the 4.4 and 4.5 branches is that I have no RTEMS EABI 
configuration for these branches (it was added in 4.6) and thus I cannot run 
the tests on a simulator.  How do you run the test suite?


--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone   : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

This message is not a business communication within the meaning of the EHUG.

Re: [PATCH, i386]: Optimize AND with 0xffffffff

2012-01-10 Thread Richard Guenther

On Mon, 9 Jan 2012, Uros Bizjak wrote:

 Hello!
 
 The attached patch fixes an oversight with the AND pattern and the
 0xffffffff immediate. While ANDs with 0xff and 0xffff are converted to the
 equivalent zero_extend pattern, AND with 0xffffffff isn't. This problem
 leaves ineffective an important optimization that would replace the
 "movq %rdi, %rax; andl $4294967295, %eax" sequence with "movl %edi, %eax".
 
 This optimization happens ~100 times in cc1.
 
 The move to stage4 caught me a bit by surprise (I was away from the
 keyboard for the weekend), so I will leave it to the RMs to decide whether
 this (otherwise fairly safe) patch is OK for mainline.
 
 2012-01-09  Uros Bizjak  ubiz...@gmail.com
 
   PR target/51681
   * config/i386/constraints.md (L): Return true for 0xffffffff.
   * config/i386/i386.c (*anddi_1): Emit AND with 0xffffffff as MOV.
 
 So, OK for  mainline?

Ok from a RM perspective.

Richard.
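
For readers following the thread, the equivalence this optimization relies on can be checked in plain C. This is only an illustrative sketch and not part of the patch; the function names and_low32 and zext_low32 are made up for the example.

```c
#include <stdint.h>

/* Mask form: what an AND with 0xffffffff computes on a 64-bit value.  */
uint64_t
and_low32 (uint64_t x)
{
  return x & 0xffffffffULL;
}

/* Zero-extend form: truncate to 32 bits, then widen again.  On x86-64 a
   32-bit register move (movl) clears the upper half of the destination,
   which is why a single move can stand in for the mask.  */
uint64_t
zext_low32 (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}
```

Both functions return the same value for every input, so a compiler is free to pick whichever instruction sequence is shorter.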

Re: [PATCH][Graphite]

2012-01-10 Thread Richard Guenther

On Mon, 9 Jan 2012, Tobias Grosser wrote:

 On 01/09/2012 04:34 PM, Richard Guenther wrote:
  
  This fixes the 2nd P1 ICE.
  
  There is a disconnect between how we analyze data-references during SCOP
  detection (outermost_loop is the root of the loop tree) and during
  SESE-to-poly, where outermost is determined by
  outermost_loop_in_sese_1 ().  That influences the SCEV result, and thus
  we do not break the SCOP at a stmt where we have to break it.
  
  The following patch fixes this using a sledgehammer - require the
  data-ref to be representable if analyzed with respect to all loops
  it can nest in.
  
  Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
  
  Ok?
 
 This one looks good to me. The new region based scop detection should fix this
 issue, but until we can get the relevant patches in, this looks like a good
 fix. Thanks for fixing the graphite PRs.

Btw, the patch requires me to XFAIL the following tests:

FAIL: gcc.dg/graphite/scop-20.c scan-tree-dump-times graphite number of 
SCoPs: 2 1

huh, we now only detect one SCoP (I suppose not all stmts of a
function will end up in SCoPs?  So we have an unhandled loop here
now.)

FAIL: gfortran.dg/graphite/interchange-1.f  -O  scan-tree-dump-times 
graphite will be interchanged 1

No SCoPs detected anymore.

FAIL: gfortran.dg/graphite/block-1.f90  -O  scan-tree-dump-times graphite 
number of SCoPs: 1 1

Likewise.

FAIL: gfortran.dg/graphite/block-2.f  -O  scan-tree-dump-times graphite 
number of SCoPs: 2 1

Likewise.

But I also saw no easy way to use the proper outermost loop during
the analysis phase, so we have to live with this for 4.7 I believe.

Thus, applied.

Thanks,
Richard.

Re: [PATCH] Adjust 'malloc' attribute documentation to match implementation

2012-01-10 Thread Richard Guenther

On Mon, 9 Jan 2012, Xinliang David Li wrote:

 It looks non-ambiguous to me.

The new proposed version or the old?

Richard.

 David
 
 On Mon, Jan 9, 2012 at 1:05 AM, Richard Guenther rguent...@suse.de wrote:
 
  Since GCC 4.4 applying the malloc attribute to realloc-like
  functions does not work under the documented constraints because
  the contents of the memory pointed to are not properly transferred
  from the realloc argument (or treated as pointing to anything,
  like 4.3 behaved).
 
  The following adjusts documentation to reflect implementation
  reality (we do have an implementation detail that treats the
  memory blob returned for non-builtins as pointing to any global
  variable, but that is neither documented nor do I plan to do
  so - I presume it is to allow allocation + initialization
  routines to be marked with malloc, but even that area looks
  susceptible to misinterpretation to me).
 
  Any comments?
 
  Thanks,
  Richard.
 
  2012-01-09  Richard Guenther  rguent...@suse.de
 
         * doc/extend.texi (malloc attribute): Adjust according to
         implementation.
 
  Index: gcc/doc/extend.texi
  ===
  --- gcc/doc/extend.texi (revision 183001)
  +++ gcc/doc/extend.texi (working copy)
  @@ -2771,13 +2771,12 @@ efficient @code{jal} instruction.
   @cindex @code{malloc} attribute
   The @code{malloc} attribute is used to tell the compiler that a function
   may be treated as if any non-@code{NULL} pointer it returns cannot
  -alias any other pointer valid when the function returns.
  +alias any other pointer valid when the function returns and that the memory
  +has undefined content.
   This will often improve optimization.
   Standard functions with this property include @code{malloc} and
  -@code{calloc}.  @code{realloc}-like functions have this property as
  -long as the old pointer is never referred to (including comparing it
  -to the new pointer) after the function returns a non-@code{NULL}
  -value.
  +@code{calloc}.  @code{realloc}-like functions do not have this
  +property as the memory pointed to does not have undefined content.
 
   @item mips16/nomips16
   @cindex @code{mips16} attribute
 
 

-- 
Richard Guenther rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
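
To make the documentation change above concrete, here is a small C sketch of where the attribute fits under the adjusted wording and where it does not. The helper names xalloc and xgrow are hypothetical and exist only for this illustration; the attribute syntax is the GNU one discussed in the thread.

```c
#include <stdlib.h>

/* Hypothetical allocation wrapper.  It returns fresh storage whose
   content is undefined, so marking it with the malloc attribute matches
   the adjusted documentation.  */
__attribute__ ((malloc)) void *
xalloc (size_t n)
{
  void *p = malloc (n ? n : 1);
  if (p == NULL)
    abort ();
  return p;
}

/* Hypothetical realloc-like helper.  It is deliberately NOT marked with
   the malloc attribute: the returned block carries over the contents of
   the old one, so its content is not undefined.  */
void *
xgrow (void *old, size_t n)
{
  void *p = realloc (old, n ? n : 1);
  if (p == NULL)
    abort ();
  return p;
}
```

A caller may rely on data written before xgrow still being present afterwards, which is exactly the property the attribute would tell the compiler to ignore.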

Re: [PATCH, AVR] Fix PR target/50925, use hard_frame_pointer_rtx

2012-01-10 Thread Georg-Johann Lay

Denis Chertykov wrote:
 Hi Georg.
 
 I have found that converting the AVR port to use hard_frame_pointer
 resolves PR 50925.
 I have tested the patch without regressions, but I'm worried about it.
 Can you test it with your testsuite for regressions?
 Maybe you have your own special, difficult tests (especially for addressing)?

Idea:

In avr.c:avr_option_override() there is

  /* caller-save.c looks for call-clobbered hard registers that are assigned
     to pseudos that cross calls and tries to save-restore them around calls
     in order to reduce the number of stack slots needed.

     This might lead to situations where reload is no more able to cope
     with the challenge of AVR's very few address registers and fails to
     perform the requested spills.  */

  if (avr_strict_X)
    flag_caller_saves = 0;

i.e. with -mstrict-X, -fcaller-saves is turned off because I saw spill
failures in the test suite and when building libs. This is a kludge, of
course, to quickly work around these spill failures.

You can increase register pressure and register allocation stress considerably
by turning on -mstrict-X by default:

  avr_strict_X = 1;

Johann

Ping: Re: [patch middle-end]: Fix PR/48814 - [4.4/4.5/4.6/4.7 Regression] Incorrect scalar increment result

2012-01-10 Thread Kai Tietz

Ping

2012/1/8 Kai Tietz ktiet...@googlemail.com:
 Hi,

 this patch makes sure that for the increment of a
 postfix-increment/decrement we also use the original lvalue instead of the
 temporary lhs value.  This fixes the reported sequence-point issue
 in PR/48814.

 ChangeLog

 2012-01-08  Kai Tietz  kti...@redhat.com

          PR middle-end/48814
          * gimplify.c (gimplify_self_mod_expr): Use for
 postfix-inc/dec lvalue instead of temporary
          lhs.

 Regression tested for x86_64-unknown-linux-gnu for all languages
 (including Ada and Obj-C++).  Ok for apply?

 Regards,
 Kai

 Index: gimplify.c
 ===
 --- gimplify.c  (revision 182720)
 +++ gimplify.c  (working copy)
 @@ -2258,7 +2258,7 @@
       arith_code = POINTER_PLUS_EXPR;
     }

 -  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lhs, rhs);
 +  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lvalue, rhs);

   if (postfix)
     {



-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| ()_() him gain world domination

Re: Ping: Re: [patch middle-end]: Fix PR/48814 - [4.4/4.5/4.6/4.7 Regression] Incorrect scalar increment result

2012-01-10 Thread Richard Guenther

On Tue, Jan 10, 2012 at 10:58 AM, Kai Tietz ktiet...@googlemail.com wrote:
 Ping

 2012/1/8 Kai Tietz ktiet...@googlemail.com:
 Hi,

 this patch makes sure that for the increment of a
 postfix-increment/decrement we also use the original lvalue instead of the
 temporary lhs value.  This fixes the reported sequence-point issue
 in PR/48814.

 ChangeLog

 2012-01-08  Kai Tietz  kti...@redhat.com

          PR middle-end/48814
          * gimplify.c (gimplify_self_mod_expr): Use for
 postfix-inc/dec lvalue instead of temporary
          lhs.

 Regression tested for x86_64-unknown-linux-gnu for all languages
 (including Ada and Obj-C++).  Ok for apply?

 Regards,
 Kai

 Index: gimplify.c
 ===
 --- gimplify.c  (revision 182720)
 +++ gimplify.c  (working copy)
 @@ -2258,7 +2258,7 @@
       arith_code = POINTER_PLUS_EXPR;
     }

 -  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lhs, rhs);
 +  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lvalue, rhs);

   if (postfix)
     {

Please add testcases.  Why does your patch make a difference?
lhs is just the gimplified lvalue.

Richard.



 --
 |  (\_/) This is Bunny. Copy and paste
 | (='.'=) Bunny into your signature to help
 | ()_() him gain world domination

[PATCH, SMS] Fix PR51794

2012-01-10 Thread Revital1 Eres


Hello,

The patch below fixes the ICE reported in PR51794.
It avoids creating DDG edges for register uses of class DF_REF_ARTIFICIAL,
as the latter do not have real instructions associated with them and thus
calling BLOCK_FOR_INSN fails.

Tested and bootstrap on ppc64-redhat-linux, enabling SMS on loops with SC
1.

OK for mainline?

Thanks,
Revital

ChangeLog:

PR rtl-optimization/51794
* ddg.c (add_cross_iteration_register_deps): Avoid
creating edges for uses of class DF_REF_ARTIFICIAL.

testsuite/
PR rtl-optimization/51794
* gcc.dg/sms-12.c: New test.

(See attached file: patch_fix_pr51794.txt)

Index: ddg.c
===
--- ddg.c   (revision 183001)
+++ ddg.c   (working copy)
@@ -315,7 +315,12 @@ add_cross_iteration_register_deps (ddg_p
   /* Create inter-loop true dependences and anti dependences.  */
   for (r_use = DF_REF_CHAIN (last_def); r_use != NULL; r_use = r_use->next)
     {
-      rtx use_insn = DF_REF_INSN (r_use->ref);
+      rtx use_insn;
+
+      if (r_use->ref == NULL || DF_REF_CLASS (r_use->ref) == DF_REF_ARTIFICIAL)
+        continue;
+
+      use_insn = DF_REF_INSN (r_use->ref);
 
       if (BLOCK_FOR_INSN (use_insn) != g->bb)
         continue;
Index: testsuite/gcc.dg/sms-12.c
===
--- testsuite/gcc.dg/sms-12.c   (revision 0)
+++ testsuite/gcc.dg/sms-12.c   (revision 0)
@@ -0,0 +1,13 @@
+ /* { dg-do compile } */
+ /* { dg-options "-O -fmodulo-sched" } */
+
+
+int
+res_inverse (int a)
+{
+  int i;
+  char **b = (char **) __builtin_alloca (a * sizeof (char *));
+  for (i = 0; i < a; i++)
+    b[i] = (char *) __builtin_alloca (sizeof (*b[i]));
+  return 0;
+}

[Ada] Check for correct Size for shift and rotate intrinsics

2012-01-10 Thread Arnaud Charlet

The parameter type for Shift_Left (etc.) intrinsics is required to have 'Size =
8, 16, 32, or 64 bits, but the compiler failed to check this rule. This patch
checks the rule.

The following test should get a compile-time error.

gnatmake -f -q shift_test.adb

shift_test.adb:4:27: first argument for shift must have size 8, 16, 32 or 64
shift_test.adb:5:29: incorrect intrinsic subprogram, see spec
gnatmake: shift_test.adb compilation error

with Text_IO; use Text_IO;
procedure Shift_Test is
   type M is mod 2**12; -- Illegal!
   function Shift_Left(X: M; Amount: Natural) return M;
   pragma Import(Intrinsic, Shift_Left);
   X : M := M'Last;
begin
   Put_Line(M'Size'Img);
   for Amount in Natural range 0..100 loop
  Put_Line(Shift_Left(X, Amount)'Img);
   end loop;
end Shift_Test;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-01-10  Bob Duff  d...@adacore.com

* sem_intr.adb (Check_Shift): Use RM_Size instead of Esize, when
checking that the 'Size is correct. If the type is mod 2**12,
for example, it's illegal, but Esize is the 'Object_Size, which
will be something like 16 or 32, so the error ('Size = 12) was
not detected.
* gnat_rm.texi: Improve documentation of shift
and rotate intrinsics.
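
The difference between the two sizes named in the ChangeLog entry above can be illustrated with a toy model in C. This is only a sketch under stated assumptions, not GNAT's implementation: value_bits stands in for 'Size (RM_Size) of a type "mod 2**value_bits", and object_size_bits is a hypothetical rounding rule standing in for 'Object_Size (Esize).

```c
#include <stdbool.h>

/* Toy stand-in for 'Object_Size: round the value size up to the next
   power-of-two storage size, starting at 8 bits.  Assumed rule, for
   illustration only.  */
unsigned
object_size_bits (unsigned value_bits)
{
  unsigned size = 8;

  while (size < value_bits)
    size *= 2;
  return size;
}

/* The rule for shift and rotate intrinsics: only these sizes are legal.  */
bool
shift_size_ok (unsigned bits)
{
  return bits == 8 || bits == 16 || bits == 32 || bits == 64;
}
```

In this model a type "mod 2**12" has a value size of 12, which shift_size_ok rejects, while its rounded object size of 16 would be accepted. Checking the rounded size is how the illegal declaration slipped through before the fix.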

Index: gnat_rm.texi
===
--- gnat_rm.texi(revision 183051)
+++ gnat_rm.texi(working copy)
@@ -10385,11 +10385,7 @@
 * Exception_Name::
 * File::
 * Line::
-* Rotate_Left::
-* Rotate_Right::
-* Shift_Left::
-* Shift_Right::
-* Shift_Right_Arithmetic::
+* Shifts and Rotates::
 * Source_Location::
 @end menu
 
@@ -10506,62 +10502,36 @@
 @code{GNAT.Source_Info.Line} to obtain the number of the current
 source line.
 
-@node Rotate_Left
-@section Rotate_Left
+@node Shifts and Rotates
+@section Shifts and Rotates
+@cindex Shift_Left
+@cindex Shift_Right
+@cindex Shift_Right_Arithmetic
 @cindex Rotate_Left
+@cindex Rotate_Right
 @noindent
-In standard Ada, the @code{Rotate_Left} function is available only
+In standard Ada, the shift and rotate functions are available only
 for the predefined modular types in package @code{Interfaces}.  However, in
-GNAT it is possible to define a Rotate_Left function for a user
-defined modular type or any signed integer type as in this example:
+GNAT it is possible to define these functions for any integer
+type (signed or modular), as in this example:
 
 @smallexample @c ada
function Shift_Left
- (Value  : My_Modular_Type;
+ (Value  : T;
   Amount : Natural)
-  return   My_Modular_Type;
+  return   T;
 @end smallexample
 
 @noindent
-The requirements are that the profile be exactly as in the example
-above.  The only modifications allowed are in the formal parameter
-names, and in the type of @code{Value} and the return type, which
-must be the same, and must be either a signed integer type, or
-a modular integer type with a binary modulus, and the size must
-be 8.  16, 32 or 64 bits.
+The function name must be one of
+Shift_Left, Shift_Right, Shift_Right_Arithmetic, Rotate_Left, or
+Rotate_Right. T must be an integer type. T'Size must be
+8, 16, 32 or 64 bits; if T is modular, the modulus
+must be 2**8, 2**16, 2**32 or 2**64.
+The result type must be the same as the type of @code{Value}.
+The shift amount must be Natural.
+The formal parameter names can be anything.
 
-@node Rotate_Right
-@section Rotate_Right
-@cindex Rotate_Right
-@noindent
-A @code{Rotate_Right} function can be defined for any user defined
-binary modular integer type, or signed integer type, as described
-above for @code{Rotate_Left}.
-
-@node Shift_Left
-@section Shift_Left
-@cindex Shift_Left
-@noindent
-A @code{Shift_Left} function can be defined for any user defined
-binary modular integer type, or signed integer type, as described
-above for @code{Rotate_Left}.
-
-@node Shift_Right
-@section Shift_Right
-@cindex Shift_Right
-@noindent
-A @code{Shift_Right} function can be defined for any user defined
-binary modular integer type, or signed integer type, as described
-above for @code{Rotate_Left}.
-
-@node Shift_Right_Arithmetic
-@section Shift_Right_Arithmetic
-@cindex Shift_Right_Arithmetic
-@noindent
-A @code{Shift_Right_Arithmetic} function can be defined for any user
-defined binary modular integer type, or signed integer type, as described
-above for @code{Rotate_Left}.
-
 @node Source_Location
 @section Source_Location
 @cindex Source_Location
Index: sem_intr.adb
===
--- sem_intr.adb(revision 183051)
+++ sem_intr.adb(working copy)
@@ -455,12 +455,14 @@
  return;
   end if;
 
-  Size := UI_To_Int (Esize (Typ1));
+  --  type'Size (not 'Object_Size!) must be one of the allowed values
 
-  if Size /= 8
-and then Size /= 16
-and then Size /= 32
-and then Size /= 64
+  Size := UI_To_Int (RM_Size (Typ1));
+
+

Re: [Patch, Fortran] PR51652 - alloc with type-spec: check that char len matches declaration

2012-01-10 Thread Paul Richard Thomas

Dear All,

Sorry for breaking the thread on this one.

The patch below is OK for trunk, minus the fragment in
'get_declared_from_expr' from one of my patches :-)

Cheers

Paul

---

Rather simple patch.

Build and regtested on x86-64-linux.
OK for the trunk?


Tobias

2012-01-08  Tobias Burnus  bur...@net-b.de

PR fortran/51652
* resolve.c (resolve_allocate_expr): For non-deferred char lengths,
check whether type-spec matches declaration.

2012-01-08  Tobias Burnus  bur...@net-b.de

PR fortran/51652
* gfortran.dg/allocate_with_typespec_5.f90: New.


Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c   (revision 182995)
+++ gcc/fortran/resolve.c   (working copy)
@@ -5683,8 +5683,10 @@ get_declared_from_expr (gfc_ref **class_ref, gfc_r
}
 }

-  if (declared == NULL)
+  if (declared == NULL && e->expr_type == EXPR_VARIABLE)
     declared = e->symtree->n.sym->ts.u.derived;
+  else
+    declared = e->ts.u.derived;

   return declared;
 }
@@ -6989,6 +6991,19 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code
   goto failure;
 }

+  if (code->ext.alloc.ts.type == BT_CHARACTER && !e->ts.deferred)
+    {
+      int cmp = gfc_dep_compare_expr (e->ts.u.cl->length,
+                                      code->ext.alloc.ts.u.cl->length);
+      if (cmp == 1 || cmp == -1 || cmp == -3)
+        {
+          gfc_error ("Allocating %s at %L with type-spec requires the same "
+                     "character-length parameter as in the declaration",
+                     sym->name, &e->where);
+          goto failure;
+        }
+    }
+
   /* In the variable definition context checks, gfc_expr_attr is used
  on the expression.  This is fooled by the array specification
  present in e, thus we have to eliminate that one temporarily.  */
Index: gcc/testsuite/gfortran.dg/allocate_with_typespec_5.f90
===
--- gcc/testsuite/gfortran.dg/allocate_with_typespec_5.f90  (revision 0)
+++ gcc/testsuite/gfortran.dg/allocate_with_typespec_5.f90  (working copy)
@@ -0,0 +1,26 @@
+! { dg-do compile }
+!
+! PR fortran/51652
+!
+! Contributed by David Kinniburgh
+!
+module settings
+
+type keyword
+  character(60), allocatable :: c(:)
+end type keyword
+
+type(keyword) :: kw(10)
+
+contains
+
+subroutine save_kw
+  allocate(character(80) :: kw(1)%c(10)) ! { dg-error "with type-spec requires the same character-length parameter" }
+end subroutine save_kw
+
+subroutine foo(n)
+  character(len=n+2), allocatable :: x
+  allocate (character(len=n+3) :: x) ! { dg-error "type-spec requires the same character-length parameter" }
+end subroutine foo
+
+end module settings

Re: Ping: Re: [patch middle-end]: Fix PR/48814 - [4.4/4.5/4.6/4.7 Regression] Incorrect scalar increment result

2012-01-10 Thread Kai Tietz

2012/1/10 Richard Guenther richard.guent...@gmail.com:
 On Tue, Jan 10, 2012 at 10:58 AM, Kai Tietz ktiet...@googlemail.com wrote:
 Ping

 2012/1/8 Kai Tietz ktiet...@googlemail.com:
 Hi,

 this patch makes sure that for the increment of a
 postfix-increment/decrement we also use the original lvalue instead of the
 temporary lhs value.  This fixes the reported sequence-point issue
 in PR/48814.

 ChangeLog

 2012-01-08  Kai Tietz  kti...@redhat.com

          PR middle-end/48814
          * gimplify.c (gimplify_self_mod_expr): Use for
 postfix-inc/dec lvalue instead of temporary
          lhs.

 Regression tested for x86_64-unknown-linux-gnu for all languages
 (including Ada and Obj-C++).  Ok for apply?

 Regards,
 Kai

 Index: gimplify.c
 ===
 --- gimplify.c  (revision 182720)
 +++ gimplify.c  (working copy)
 @@ -2258,7 +2258,7 @@
       arith_code = POINTER_PLUS_EXPR;
     }

 -  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lhs, rhs);
 +  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lvalue, rhs);

   if (postfix)
     {

Hi Richard,

 Please add testcases.  Why does your patch make a difference?
 lhs is just the gimplified lvalue.

Yes, and exactly this makes a big difference for post-inc/dec.  Let me show
you the gimple dump to illustrate this in more detail.  I used the -O2
option here, together with the attached patch.

gcc without that patch produces the following gimple for main:

main ()
{
  int count.0;
  int count.1;
  int D.2721;
  int D.2725;
  int D.2726;

  count.0 = count; <-- here we store the original value of 'count' for use
as the array-access index
  D.2721 = incr (); <-- within that function count gets modified
  arr[count.0] = D.2721;
  count.1 = count.0 + 1; <-- Here the issue happens.  We increment the
saved value of 'count'
  count = count.1; <-- By this the modification of count in incr() is lost.
  ...

With the change we make sure to use count's current value instead of its
saved temporary.

Patched gcc produces this gimple:

main ()
{
  int count.0;
  int count.1;
  int D.1718;
  int D.1722;
  int D.1723;

  count.0 = count;
  D.1718 = incr ();
  arr[count.0] = D.1718;
  count.0 = count; <-- Reload count.0 for post-inc/dec to use count's
current value
  count.1 = count.0 + 1;
  count = count.1;
  count.0 = count;

OK, here is the patch with the adjusted testcase from the PR.

ChangeLog

2012-01-10  Kai Tietz  kti...@redhat.com

PR middle-end/48814
* gimplify.c (gimplify_self_mod_expr): Use for
postfix-inc/dec lvalue instead of temporary lhs.

Regression tested for all languages (including Ada and Obj-C++).  Ok for apply?

Regards,
Kai

2012-01-10  Kai Tietz  kti...@redhat.com

* gcc.c-torture/execute/pr48814.c: New test.

Index: gcc/gcc/gimplify.c
===
--- gcc.orig/gcc/gimplify.c
+++ gcc/gcc/gimplify.c
@@ -2258,7 +2258,7 @@ gimplify_self_mod_expr (tree *expr_p, gi
   arith_code = POINTER_PLUS_EXPR;
 }

-  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lhs, rhs);
+  t1 = build2 (arith_code, TREE_TYPE (*expr_p), lvalue, rhs);

   if (postfix)
 {
Index: gcc/gcc/testsuite/gcc.c-torture/execute/pr48814.c
===
--- /dev/null
+++ gcc/gcc/testsuite/gcc.c-torture/execute/pr48814.c
@@ -0,0 +1,18 @@
+extern void abort (void);
+
+int arr[] = {1,2,3,4};
+int count = 0;
+
+int __attribute__((noinline))
+incr (void)
+{
+  return ++count;
+}
+
+int main()
+{
+  arr[count++] = incr ();
+  if (count != 2 || arr[count] != 3)
+abort ();
+  return 0;
+}
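
The two GIMPLE dumps discussed in this message can also be replayed as ordinary C statements. The sketch below is only a model of those dumps, with made-up function names (final_count_unpatched, final_count_patched); it says nothing about what the C standard requires for the original expression.

```c
/* Global counter, modified both by incr () and by the caller.  */
static int count;
static int arr[4];

static int
incr (void)
{
  return ++count;
}

/* Mirrors the unpatched dump: the increment uses the stale copy.  */
int
final_count_unpatched (void)
{
  int saved, value;

  count = 0;
  saved = count;        /* count.0 = count */
  value = incr ();      /* D = incr (); count is now 1 */
  arr[saved] = value;
  count = saved + 1;    /* stale copy + 1 overwrites incr's update */
  return count;
}

/* Mirrors the patched dump: count is reloaded before the increment.  */
int
final_count_patched (void)
{
  int saved, value;

  count = 0;
  saved = count;        /* count.0 = count */
  value = incr ();      /* D = incr (); count is now 1 */
  arr[saved] = value;
  saved = count;        /* reload: count.0 = count */
  count = saved + 1;    /* increment the current value */
  return count;
}
```

In this model the unpatched sequence ends with count equal to 1, while the patched sequence ends with 2, which is the value the new testcase checks for.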

Re: [Patch, Fortran] gfortran.texi: Update (C) year and F2003 status

2012-01-10 Thread Gerald Pfeifer

On Mon, 9 Jan 2012, Steve Kargl wrote:
 @item Generic interface name, which have the same name as derived types,

interface names, perhaps?

Gerald

Re: [Patch, Fortran] gfortran.texi: Update (C) year and F2003 status

2012-01-10 Thread Tobias Burnus


Hi Gerald,

On 01/10/2012 12:31 PM, Gerald Pfeifer wrote:

@item Generic interface name, which have the same name as derived types,

interface names, perhaps?


Well spotted - and corrected with the patched patch (Rev. 183062).

Thanks,

Tobias
Index: gcc/fortran/gfortran.texi
===
--- gcc/fortran/gfortran.texi	(revision 183060)
+++ gcc/fortran/gfortran.texi	(working copy)
@@ -797,7 +797,7 @@ override type-bound procedures or to have deferred
 @code{SAME_TYPE_AS}, @code{EXTENDS_TYPE_OF} and @code{SELECT TYPE}.
 Note that unlimited polymophism is currently not supported.
 
-@item Generic interface name, which have the same name as derived types,
+@item Generic interface names, which have the same name as derived types,
 are now supported. This allows one to write constructor functions.  Note
 that Fortran does not support static constructor functions.  For static
 variables, only default initialization or structure-constructor
Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog	(revision 183061)
+++ gcc/fortran/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2012-01-10  Gerald Pfeifer  ger...@pfeifer.com
+
+	* gfortran.texi (Fortran 2003 Status): Fix grammar.
+
 2012-01-10  Tobias Burnus  bur...@net-b.de
 
 	PR fortran/51652

[PATCH] Fix PR51806

2012-01-10 Thread Richard Guenther


This fixes LTO not honoring -Werror (similar for all other
non-C-family frontends), despite handling -Werror= just fine.
The issue is that the diagnostic context is only adjusted from
the c-family handle-options routine, not from the common
one (which does process -Werror= though).

Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?

Thanks,
Richard.

2012-01-10  Richard Guenther  rguent...@suse.de

PR middle-end/51806
c-family/
* c-opts.c (c_common_handle_option): Move -Werror handling
to language independent code.

* opts.c (common_handle_option): Handle -Werror.
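
The effect of moving the handling can be pictured with a toy dispatch model in C. Everything here is hypothetical (the types, the enum, and the functions old_dispatch and new_dispatch); it only mimics the shape of the problem described above and is not GCC's option machinery.

```c
#include <stdbool.h>

/* Stand-in for the diagnostic context flag named in the patch.  */
struct diag_context
{
  bool warning_as_error_requested;
};

enum opt_code { OPT_WERROR, OPT_OTHER };

/* Old arrangement: only the C-family handler sets the flag, so front
   ends that never run it (for example LTO) leave it unset.  */
void
old_dispatch (struct diag_context *dc, enum opt_code code, bool value,
              bool c_family_frontend)
{
  if (c_family_frontend && code == OPT_WERROR)
    dc->warning_as_error_requested = value;
}

/* New arrangement: the common handler sets the flag for every front end.  */
void
new_dispatch (struct diag_context *dc, enum opt_code code, bool value,
              bool c_family_frontend)
{
  (void) c_family_frontend;     /* no longer relevant */
  if (code == OPT_WERROR)
    dc->warning_as_error_requested = value;
}
```

With the old arrangement a non-C-family front end passing OPT_WERROR leaves the flag false; with the new one the flag is set regardless of the front end.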

Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c   (revision 183054)
+++ gcc/c-family/c-opts.c   (working copy)
@@ -449,10 +449,6 @@ c_common_handle_option (size_t scode, co
   cpp_opts->warn_endif_labels = value;
   break;
 
-case OPT_Werror:
-  global_dc->warning_as_error_requested = value;
-  break;
-
 case OPT_Wformat:
   set_Wformat (value);
   break;
Index: gcc/opts.c
===
--- gcc/opts.c  (revision 183054)
+++ gcc/opts.c  (working copy)
@@ -1420,6 +1420,10 @@ common_handle_option (struct gcc_options
   /* Currently handled in a prescan.  */
   break;
 
+case OPT_Werror:
+  dc->warning_as_error_requested = value;
+  break;
+
 case OPT_Werror_:
   if (lang_mask == CL_DRIVER)
break;

Re: [PATCH][Graphite]

2012-01-10 Thread Tobias Grosser


On 01/10/2012 10:14 AM, Richard Guenther wrote:

On Mon, 9 Jan 2012, Tobias Grosser wrote:


On 01/09/2012 04:34 PM, Richard Guenther wrote:


This fixes the 2nd P1 ICE.

There is a disconnect between how we analyze data-references during SCOP
detection (outermost_loop is the root of the loop tree) and during
SESE-to-poly, where outermost is determined by
outermost_loop_in_sese_1 ().  That influences the SCEV result, and thus
we do not break the SCOP at a stmt where we have to break it.

The following patch fixes this using a sledgehammer - require the
data-ref to be representable if analyzed with respect to all loops
it can nest in.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok?


This one looks good to me. The new region based scop detection should fix this
issue, but until we can get the relevant patches in, this looks like a good
fix. Thanks for fixing the graphite PRs.


Btw, the patch requires me to XFAIL the following tests:

FAIL: gcc.dg/graphite/scop-20.c scan-tree-dump-times graphite number of
SCoPs: 2 1

huh, we now only detect one SCoP (I suppose not all stmts of a
function will end up in SCoPs?  So we have an unhandled loop here
now.)

FAIL: gfortran.dg/graphite/interchange-1.f  -O  scan-tree-dump-times
graphite will be interchanged 1

No SCoPs detected anymore.

FAIL: gfortran.dg/graphite/block-1.f90  -O  scan-tree-dump-times graphite
number of SCoPs: 1 1

Likewise.

FAIL: gfortran.dg/graphite/block-2.f  -O  scan-tree-dump-times graphite
number of SCoPs: 2 1

Likewise.

But I also saw no easy way to use the proper outermost loop during
the analysis phase, so we have to live with this for 4.7 I believe.

Thus, applied.

Alright.

Thanks
Tobi

Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets

2012-01-10 Thread Richard Guenther

On Fri, 6 Jan 2012, Martin Jambor wrote:

 Hi,
 
 I'm trying to teach our expander how to deal with misaligned MEM_REFs
 on strict alignment targets.  We currently generate code which leads
 to bus error signals due to misaligned accesses.
 
 I admit my motivation is not any target in particular but simply being
 able to produce misaligned MEM_REFs in SRA, currently we work-around
 that by producing COMPONENT_REFs which causes quite a few headaches.
 Nevertheless, I started by following Richi's advice and set out to fix
 the following two simple testcases on a strict-alignment platform, a
 sparc64 in the compile farm.  If I understood him correctly, Richi
 claimed they have been failing since forever:
 
 - test case 1: -
 
 extern void abort ();
 
 typedef unsigned int myint __attribute__((aligned(1)));
 
 /* even without the attributes we get bus error */
 unsigned int __attribute__((noinline, noclone))
 foo (myint *p)
 {
   return *p;
 }
 
 struct blah
 {
   char c;
   myint i;
 };
 
 struct blah g;
 
 #define cst 0xdeadbeef
 
 int
 main (int argc, char **argv)
 {
   int i;
   g.i = cst;
   i = foo (&g.i);
   if (i != cst)
 abort ();
   return 0;
 }
 
 - test case 2: -
 
 extern void abort ();
 
 typedef unsigned int myint __attribute__((aligned(1)));
 
 void __attribute__((noinline, noclone))
 foo (myint *p, unsigned int i)
 {
   *p = i;
 }
 
 struct blah
 {
   char c;
   myint i;
 };
 
 struct blah g;
 
 #define cst 0xdeadbeef
 
 int
 main (int argc, char **argv)
 {
   foo (&g.i, cst);
   if (g.i != cst)
 abort ();
   return 0;
 }
 
 
 
 I dug in expr.c and found two places which handle misaligned MEM_REFs
 loads and stores respectively but only if there is a special
 movmisalign_optab operation available for the given mode.  My approach
 therefore was to append calls to extract_bit_field and store_bit_field
 which seem to be the part of expander capable of dealing with
 misaligned memory accesses.  The patch is below, it fixes both
 testcases on sparc64--linux-gnu.
 
 Is this approach generally the right thing to do?  And of course,
 since my knowledge of RTL and the expander is very limited, I expect that
 there will be quite a few suggestions about its various particular
 aspects.  I have run the c and c++ testsuite with the second hunk in
 place without any problems, the same test of the whole patch is under
 way right now but it takes quite a lot of time, therefore most
 probably I won't have the results today.  Of course I plan to do a
 bootstrap and at least Fortran checking on this platform too but that
 is really going to take some time and I'd like to hear any comments
 before that.

The idea is good (well, I suppose I suggested it ... ;))
 
 One more question: I'd like to be able to handle misaligned loads and
 stores of SSE vectors this way too but then of course I cannot use
 STRICT_ALIGNMENT as the guard but need a more elaborate predicate.  I
 assume it must already exist, which one is it?

There is none :/  STRICT_ALIGNMENT would need to get a mode argument,
but reality is that non-STRICT_ALIGNMENT targets at the moment
need to have their movmisalign optab trigger for all cases that
will not work when misaligned.
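
Separately from the expander work discussed here, the failure mode in the test cases can be avoided at the source level by copying bytes instead of dereferencing a misaligned pointer. The sketch below shows that user-level idiom only; it is not the approach of the patch, and the helper names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee.
   memcpy makes no alignment assumption about its arguments.  */
uint32_t
load_u32_unaligned (const void *p)
{
  uint32_t v;

  memcpy (&v, p, sizeof v);
  return v;
}

/* Write a 32-bit value to an address with no alignment guarantee.  */
void
store_u32_unaligned (void *p, uint32_t v)
{
  memcpy (p, &v, sizeof v);
}
```

A round trip through an odd offset in a byte buffer returns the stored value on strict-alignment and non-strict targets alike.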

 --- 4572,4653 
  || TREE_CODE (to) == TARGET_MEM_REF)
  mode != BLKmode
  ((align = get_object_or_type_alignment (to))
 !GET_MODE_ALIGNMENT (mode)))
   {
 enum machine_mode address_mode;
 !   rtx reg;
   
 reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
 reg = force_not_mem (reg);
   
 !   if ((icode = optab_handler (movmisalign_optab, mode))
 !   != CODE_FOR_nothing)
   {
 +   struct expand_operand ops[2];
 +   rtx op0, mem;
 + 
 +   if (TREE_CODE (to) == MEM_REF)
 + {
 +   addr_space_t as
 + = TYPE_ADDR_SPACE (TREE_TYPE (TREE_TYPE (TREE_OPERAND (to, 
 1;
 +   tree base = TREE_OPERAND (to, 0);
 +   address_mode = targetm.addr_space.address_mode (as);
 +   op0 = expand_expr (base, NULL_RTX, VOIDmode, EXPAND_NORMAL);
 +   op0 = convert_memory_address_addr_space (address_mode, op0, as);
 +   if (!integer_zerop (TREE_OPERAND (to, 1)))
 + {
 +   rtx off
 + = immed_double_int_const (mem_ref_offset (to), 
 address_mode);
 +   op0 = simplify_gen_binary (PLUS, address_mode, op0, off);
 + }
 +   op0 = memory_address_addr_space (mode, op0, as);
 +   mem = gen_rtx_MEM (mode, op0);
 +   set_mem_attributes (mem, to, 0);
 +   set_mem_addr_space (mem, as);
 + }
 +   else if (TREE_CODE (to) == TARGET_MEM_REF)
 + {
 +   addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (to));
 +   struct mem_address addr;
 + 
 +   get_address_description (to, addr);
 +   op0 = addr_for_mem_ref (addr, as, true);
 +   op0 = memory_address_addr_space

[PATCH] Add testcase for PR51801

2012-01-10 Thread Richard Guenther


Committed.

Richard.

2012-01-10  Richard Guenther  rguent...@suse.de

PR tree-optimization/51801
* gcc.dg/torture/pr51801.c: New testcase.

Index: gcc/testsuite/gcc.dg/torture/pr51801.c
===
--- gcc/testsuite/gcc.dg/torture/pr51801.c  (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr51801.c  (revision 0)
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+
+typedef struct { char Vshow; } TScreen;
+typedef struct _Misc { char Tshow; } Misc;
+typedef struct _XtermWidgetRec { TScreen screen; Misc misc; } XtermWidgetRec, *XtermWidget;
+extern XtermWidget term;
+
+void
+handle_tekshow (void *gw, int allowswitch)
+{
+  XtermWidget xw = term;
+  if (!((xw)->misc.Tshow))
+set_tek_visibility (1);
+}
+
+void
+do_tekonoff (void *gw, void *closure, void *data)
+{
+  handle_tekshow (gw, 0);
+}
+
+void
+do_vtonoff (void *gw, void *closure, void *data)
+{
+}
+
+void
+handle_toggle (void (*proc) (void *gw, void *closure, void *data),
+  int var, char **params, unsigned int nparams, void *w,
+  void *closure, void *data)
+{
+  XtermWidget xw = term;
+  int dir = -2;
+  switch (nparams)
+{
+case 0:
+  dir = -1;
+}
+  switch (dir)
+{
+case 1:
+  (*proc) (w, closure, data);
+  Bell (xw, 2, 0);
+}
+}
+
+void
+HandleVisibility (void *w, char **params, unsigned int *param_count)
+{
+  XtermWidget xw = term;
+  if (*param_count == 2)
+switch (params[0][0])
+  {
+  case 'v':
+   handle_toggle (do_vtonoff, (int) ((int) (&(xw)->screen)->Vshow),
+  params + 1, (*param_count) - 1, w, (void *) 0,
+  (void *) 0);
+   handle_toggle (do_tekonoff, (int) ((int) ((xw)->misc.Tshow)),
+  params + 1, (*param_count) - 1, w, (void *) 0,
+  (void *) 0);
+  }
+}

[PATCH] Fix PR49642 in 4.6, questions about 4.7

2012-01-10 Thread William J. Schmidt

Greetings,

This patch follows Richard Guenther's suggestion of 2011-07-05 in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49642 to fix the problem in
gcc 4.6.  It prevents choosing a function split point that is dominated
by a builtin call to __builtin_constant_p.
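
For readers unfamiliar with the idiom exercised by the test case, here is a
minimal sketch of how __builtin_constant_p is commonly used to pick a
foldable expression for constant arguments and a runtime routine otherwise.
It is not the code from the PR: the names ILOG2_SMALL and ilog2_runtime are
made up for illustration, and only four bits are spelled out.

```c
#include <stdint.h>

/* Runtime fallback: index of the highest set bit, or -1 for 0. */
static int ilog2_runtime(uint64_t n)
{
    int bit = -1;
    while (n != 0) {
        n >>= 1;
        bit++;
    }
    return bit;
}

/* Hypothetical macro: for a compile-time constant below 16 the
   conditional chain can fold to a constant; anything else goes
   through the runtime loop.  The argument is evaluated more than
   once, as in the kernel-style macro the test case expands.  */
#define ILOG2_SMALL(n)                                          \
    ((__builtin_constant_p(n) && (uint64_t)(n) < 16)            \
         ? ((n) & (1ULL << 3) ? 3                               \
            : (n) & (1ULL << 2) ? 2                             \
            : (n) & (1ULL << 1) ? 1                             \
            : (n) & (1ULL << 0) ? 0                             \
            : -1)                                               \
         : ilog2_runtime(n))
```

If function splitting moves the constant branch into a separate function
before __builtin_constant_p has been resolved, the chain can no longer be
folded away, which is the situation the patch tries to avoid.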

The bug was marked fixed in 4.7 since the extra FRE pass allows the
correct optimization to be done even in the presence of
__builtin_constant_p.  However, 4.7 still fails in the presence of
-fno-tree-fre.  I think we should probably include a variation of this
patch in 4.7 that only kicks in when FRE has been disabled at the
command line.  The test case would also be modified slightly to include
-fno-tree-fre in the dg-compile statement.  Thoughts?

The 4.6 patch was bootstrapped and tests cleanly on powerpc64-linux-gnu.
OK for 4.6 branch?

Thanks,
Bill


gcc:

2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/49642
* ipa-split.c (forbidden_dominators): New variable.
(check_forbidden_calls): New function.
(dominated_by_forbidden): Likewise.
(consider_split): Check for forbidden calls.
(execute_split_functions): Initialize and free forbidden
dominators info; call check_forbidden_calls.

gcc/testsuite:

2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/49642
* gcc.dg/tree-ssa/pr49642.c: New test.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr49642.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
@@ -0,0 +1,49 @@
+/* Verify that ipa-split is disabled following __builtin_constant_p.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+static inline __attribute__((always_inline)) __attribute__((const))
+int __ilog2_u32(u32 n)
+{
+ int bit;
+ asm ("cntlzw %0,%1" : "=r" (bit) : "r" (n));
+ return 31 - bit;
+}
+
+
+static inline __attribute__((always_inline)) __attribute__((const))
+int __ilog2_u64(u64 n)
+{
+ int bit;
+ asm ("cntlzd %0,%1" : "=r" (bit) : "r" (n));
+ return 63 - bit;
+}
+
+
+
+static u64 ehca_map_vaddr(void *caddr);
+
+struct ehca_shca {
+u32 hca_cap_mr_pgsize;
+};
+
+static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca)
+{
+ return 1UL << ( __builtin_constant_p(shca->hca_cap_mr_pgsize) ? ( 
(shca->hca_cap_mr_pgsize) < 1 ? ilog2_NaN() : (shca->hca_cap_mr_pgsize) & 
(1ULL << 63) ? 63 : (shca->hca_cap_mr_pgsize) & (1ULL << 62) ? 62 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 61) ? 61 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 60) ? 60 : (shca->hca_cap_mr_pgsize) & (1ULL << 59) ? 59 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 58) ? 58 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 57) ? 57 : (shca->hca_cap_mr_pgsize) & (1ULL << 56) ? 56 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 55) ? 55 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 54) ? 54 : (shca->hca_cap_mr_pgsize) & (1ULL << 53) ? 53 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 52) ? 52 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 51) ? 51 : (shca->hca_cap_mr_pgsize) & (1ULL << 50) ? 50 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 49) ? 49 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 48) ? 48 : (shca->hca_cap_mr_pgsize) & (1ULL << 47) ? 47 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 46) ? 46 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 45) ? 45 : (shca->hca_cap_mr_pgsize) & (1ULL << 44) ? 44 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 43) ? 43 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 42) ? 42 : (shca->hca_cap_mr_pgsize) & (1ULL << 41) ? 41 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 40) ? 40 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 39) ? 39 : (shca->hca_cap_mr_pgsize) & (1ULL << 38) ? 38 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 37) ? 37 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 36) ? 36 : (shca->hca_cap_mr_pgsize) & (1ULL << 35) ? 35 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 34) ? 34 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 33) ? 33 : (shca->hca_cap_mr_pgsize) & (1ULL << 32) ? 32 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 31) ? 31 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 30) ? 30 : (shca->hca_cap_mr_pgsize) & (1ULL << 29) ? 29 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 28) ? 28 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 27) ? 27 : (shca->hca_cap_mr_pgsize) & (1ULL << 26) ? 26 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 25) ? 25 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 24) ? 24 : (shca->hca_cap_mr_pgsize) & (1ULL << 23) ? 23 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 22) ? 22 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 21) ? 21 : (shca->hca_cap_mr_pgsize) & (1ULL << 20) ? 20 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 19) ? 19 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 18) ? 18 : (shca->hca_cap_mr_pgsize) & (1ULL << 17) ? 17 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 16) ? 16 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 15) ? 15 : (shca->hca_cap_mr_pgsize) & (1ULL << 14) ? 14 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 13) ? 13 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 12) ? 12 : (shca->hca_cap_mr_pgsize) & (1ULL << 11) ? 11 : 
(shca->hca_cap_mr_pgsize) & (1ULL << 10) ? 10 : (shca->hca_cap_mr_pgsize) & 
(1ULL << 9) ? 9 : (shca->hca_cap_mr_pgsize) & (1ULL << 8) ? 8 :

Re: [PATCH] Fix PR49642 in 4.6, questions about 4.7

2012-01-10 Thread Richard Guenther

On Tue, Jan 10, 2012 at 2:43 PM, William J. Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Greetings,

 This patch follows Richard Guenther's suggestion of 2011-07-05 in
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49642 to fix the problem in
 gcc 4.6.  It prevents choosing a function split point that is dominated
 by a builtin call to __builtin_constant_p.

 The bug was marked fixed in 4.7 since the extra FRE pass allows the
 correct optimization to be done even in the presence of
 __builtin_constant_p.  However, 4.7 still fails in the presence of
 -fno-tree-fre.  I think we should probably include a variation of this
 patch in 4.7 that only kicks in when FRE has been disabled at the
 command line.  The test case would also be modified slightly to include
 -fno-tree-fre in the dg-compile statement.  Thoughts?

I think it should unconditionally restrict splitting (I suppose on the
trunk the __builtin_constant_p is optimized away already).

Btw, this will also disqualify any point below

 if (__builtin_constant_p (...))
   {
 ...
   }

because after the if join all BBs are dominated by the __builtin_constant_p
call.  What we want to disallow is splitting at a block that is dominated
by the true edge of the condition fed by the __builtin_constant_p result ...

Honza?

 The 4.6 patch was bootstrapped and tests cleanly on powerpc64-linux-gnu.
 OK for 4.6 branch?

 Thanks,
 Bill


 gcc:

 2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/49642
        * ipa-split.c (forbidden_dominators): New variable.
        (check_forbidden_calls): New function.
        (dominated_by_forbidden): Likewise.
        (consider_split): Check for forbidden calls.
        (execute_split_functions): Initialize and free forbidden
        dominators info; call check_forbidden_calls.

 gcc/testsuite:

 2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

        PR tree-optimization/49642
        * gcc.dg/tree-ssa/pr49642.c: New test.


 Index: gcc/testsuite/gcc.dg/tree-ssa/pr49642.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/pr49642.c     (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/pr49642.c     (revision 0)
 @@ -0,0 +1,49 @@
 +/* Verify that ipa-split is disabled following __builtin_constant_p.  */
 +
 +/* { dg-do compile } */
 +/* { dg-options -O2 -fdump-tree-optimized } */
 +
 +typedef unsigned int u32;
 +typedef unsigned long long u64;
 +
 +static inline __attribute__((always_inline)) __attribute__((const))
 +int __ilog2_u32(u32 n)
 +{
 + int bit;
 + asm (cntlzw %0,%1 : =r (bit) : r (n));
 + return 31 - bit;
 +}
 +
 +
 +static inline __attribute__((always_inline)) __attribute__((const))
 +int __ilog2_u64(u64 n)
 +{
 + int bit;
 + asm (cntlzd %0,%1 : =r (bit) : r (n));
 + return 63 - bit;
 +}
 +
 +
 +
 +static u64 ehca_map_vaddr(void *caddr);
 +
 +struct ehca_shca {
 +        u32 hca_cap_mr_pgsize;
 +};
 +
 +static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca)
 +{
 + return 1UL  ( __builtin_constant_p(shca-hca_cap_mr_pgsize) ? ( 
 (shca-hca_cap_mr_pgsize)  1 ? ilog2_NaN() : (shca-hca_cap_mr_pgsize)  
 (1ULL  63) ? 63 : (shca-hca_cap_mr_pgsize)  (1ULL  62) ? 62 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  61) ? 61 : (shca-hca_cap_mr_pgsize)  
 (1ULL  60) ? 60 : (shca-hca_cap_mr_pgsize)  (1ULL  59) ? 59 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  58) ? 58 : (shca-hca_cap_mr_pgsize)  
 (1ULL  57) ? 57 : (shca-hca_cap_mr_pgsize)  (1ULL  56) ? 56 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  55) ? 55 : (shca-hca_cap_mr_pgsize)  
 (1ULL  54) ? 54 : (shca-hca_cap_mr_pgsize)  (1ULL  53) ? 53 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  52) ? 52 : (shca-hca_cap_mr_pgsize)  
 (1ULL  51) ? 51 : (shca-hca_cap_mr_pgsize)  (1ULL  50) ? 50 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  49) ? 49 : (shca-hca_cap_mr_pgsize)  
 (1ULL  48) ? 48 : (shca-hca_cap_mr_pgsize)  (1ULL  47) ? 47 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  46) ? 46 : (shca-hca_cap_mr_pgsize)  
 (1ULL  45) ? 45 : (shca-hca_cap_mr_pgsize)  (1ULL  44) ? 44 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  43) ? 43 : (shca-hca_cap_mr_pgsize)  
 (1ULL  42) ? 42 : (shca-hca_cap_mr_pgsize)  (1ULL  41) ? 41 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  40) ? 40 : (shca-hca_cap_mr_pgsize)  
 (1ULL  39) ? 39 : (shca-hca_cap_mr_pgsize)  (1ULL  38) ? 38 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  37) ? 37 : (shca-hca_cap_mr_pgsize)  
 (1ULL  36) ? 36 : (shca-hca_cap_mr_pgsize)  (1ULL  35) ? 35 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  34) ? 34 : (shca-hca_cap_mr_pgsize)  
 (1ULL  33) ? 33 : (shca-hca_cap_mr_pgsize)  (1ULL  32) ? 32 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  31) ? 31 : (shca-hca_cap_mr_pgsize)  
 (1ULL  30) ? 30 : (shca-hca_cap_mr_pgsize)  (1ULL  29) ? 29 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  28) ? 28 : (shca-hca_cap_mr_pgsize)  
 (1ULL  27) ? 27 : (shca-hca_cap_mr_pgsize)  (1ULL  26) ? 26 : 
 (shca-hca_cap_mr_pgsize)  (1ULL  25) ? 25 : (shca-hca_cap_mr_pgsize)  
 (1ULL  24) ? 24 : (shca-hca_cap_mr_pgsize)  (1ULL  23) ? 23 :

Add -lssp_nonshared to LINK_SSP_SPEC

2012-01-10 Thread Tijl Coosemans

On targets where libc implements stack protector functions (GNU libc,
FreeBSD libc), and where gcc (as an optimisation) generates calls to
a locally defined __stack_chk_fail_local instead of directly calling
the global function __stack_chk_fail (e.g. -fpic code on i386), one
must explicitly specify -lssp_nonshared or -lc -lc_nonshared on the
command line to statically link in __stack_chk_fail_local.

It would be more convenient if the compiler kept the details of this
target specific optimisation hidden by passing -lssp_nonshared to the
linker internally.
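
To make the mechanism concrete, here is a rough sketch of what the
nonshared archive contributes: a hidden, per-object stub that forwards to
the global failure routine.  The demo_ names are placeholders of mine (the
real symbols are __stack_chk_fail_local and __stack_chk_fail, and the real
global routine does not return); this only illustrates the shape.

```c
/* Counts calls so that the sketch has an observable effect. */
static int demo_failures;

/* Stand-in for the global, libc-provided __stack_chk_fail.  The
   real function aborts; this one only records the call.  */
void demo_stack_chk_fail(void)
{
    demo_failures++;
}

/* Stand-in for __stack_chk_fail_local.  Hidden visibility means the
   symbol cannot be resolved from a shared library, so every
   executable or shared object needs its own statically linked copy,
   which is what a "nonshared" archive supplies.  */
__attribute__((visibility("hidden")))
void demo_stack_chk_fail_local(void)
{
    demo_stack_chk_fail();
}
```

Because the stub is local to the object being linked, -fpic code on i386
can call it directly, but the link fails when no archive provides it.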

Here's a simple test case that shows the problem on i386-freebsd, but
works just fine on e.g. x86_64 targets:

% cat test.c
int
main( void ) {
return( 0 );
}
% gcc46 -o test test.c -fstack-protector-all -fPIE
/var/tmp//ccjYQxKu.o: In function `main':
test.c:(.text+0x37): undefined reference to `__stack_chk_fail_local'
/usr/local/bin/ld: test: hidden symbol `__stack_chk_fail_local' isn't defined
/usr/local/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status


I don't have commit access, so please commit when approved.

2012-01-10  Tijl Coosemans  t...@coosemans.org

* gcc.c [TARGET_LIBC_PROVIDES_SSP] (LINK_SSP_SPEC): Add -lssp_nonshared.

--- gcc/gcc.c.orig
+++ gcc/gcc.c
@@ -602,7 +602,7 @@ proper position among the other output f
 
 #ifndef LINK_SSP_SPEC
 #ifdef TARGET_LIBC_PROVIDES_SSP
-#define LINK_SSP_SPEC "%{fstack-protector:}"
+#define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared}"
 #else
 #define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared -lssp}"
 #endif

Re: C++/libiberty PATCH for many mangling fixes (6057, 48051, 50855, 51322 and more)

2012-01-10 Thread Jason Merrill

bkoz pointed out that I forgot to update invoke.texi about 
-fabi-version=6.  Applying to trunk
commit f94b7ea86ad3146e81a46a141ac23b10048b7fbf
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jan 9 23:03:15 2012 -0500

	* doc/invoke.texi (C++ Dialect Options): Update -fabi-version=6
	information.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a13ddfa..c0812fb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1891,13 +1891,18 @@ The default is version 2.
 Version 3 corrects an error in mangling a constant address as a
 template argument.
 
-Version 4 implements a standard mangling for vector types.
+Version 4, which first appeared in G++ 4.5, implements a standard
+mangling for vector types.
 
-Version 5 corrects the mangling of attribute const/volatile on
-function pointer types, decltype of a plain decl, and use of a
-function parameter in the declaration of another parameter.
+Version 5, which first appeared in G++ 4.6, corrects the mangling of
+attribute const/volatile on function pointer types, decltype of a
+plain decl, and use of a function parameter in the declaration of
+another parameter.
 
-Version 6 corrects the promotion behavior of C++11 scoped enums.
+Version 6, which first appeared in G++ 4.7, corrects the promotion
+behavior of C++11 scoped enums and the mangling of template argument
+packs, const/static_cast, prefix ++ and --, and a class scope function
+used as a template argument.
 
 See also @option{-Wabi}.

Re: [RFC, patch] libitm: Filter out undo writes that overlap with the libitm stack.

2012-01-10 Thread Torvald Riegel

On Tue, 2012-01-10 at 16:35 +1100, Richard Henderson wrote:
> On 01/09/2012 06:26 AM, Torvald Riegel wrote:
> > libitm: Filter out undo writes that overlap with the libitm stack.
> >
> > libitm/
> > * config/generic/tls.h (GTM::mask_stack_top,
> > GTM::mask_stack_bottom): New.
> > * local.cc (gtm_undolog::rollback): Filter out any updates that
> > overlap the libitm stack.  Add current transaction as parameter.
> > * libitm_i.h (GTM::gtm_undolog::rollback): Adapt.
> > * beginend.cc (GTM::gtm_thread::rollback): Adapt.
> > * testsuite/libitm.c/stackundo.c: New test.
>
> One could steal code from boehm-gc for this.
> See GC_get_stack_base in os_dep.c.

Thanks for the pointer.  I looked at this code, and it seems fairly
complex given the dependencies on OS/libc and OS/libc behavior.  From a
maintenance point-of-view, does it make sense to copy that complexity
into libitm?  boehm-gc is used in GCC, so perhaps that's not much of a
problem, however.  I also looked at glibc's memcpy implementations, and
copying those plus a simple byte-wise copy for the generic case could be
also a fairly clean solution.
Also, is the license compatible with the GPL wrt. mixing sources?

What about keeping the patch/hack that I posted for now, creating a PR,
and looking at this again for another release?
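
For reference, the filtering idea itself can be sketched in a few lines of
C.  This is not the libitm code: struct undo_entry, overlaps and
rollback_filtered are names invented for the sketch, and the masked range
is simply passed in as [bottom, top) rather than derived from the CFA.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One undo-log entry: where to restore, and the saved old bytes. */
struct undo_entry {
    void *addr;
    const void *saved;
    size_t len;
};

/* Nonzero if [addr, addr+len) overlaps [bottom, top). */
static int overlaps(const void *addr, size_t len,
                    const void *bottom, const void *top)
{
    uintptr_t a = (uintptr_t)addr;
    return a < (uintptr_t)top && a + len > (uintptr_t)bottom;
}

/* Replay the log newest-first, skipping every entry that touches
   the masked range.  Returns the number of entries restored.  */
static size_t rollback_filtered(const struct undo_entry *log, size_t n,
                                const void *bottom, const void *top)
{
    size_t restored = 0;
    for (size_t i = n; i-- > 0; ) {
        if (overlaps(log[i].addr, log[i].len, bottom, top))
            continue;
        memcpy(log[i].addr, log[i].saved, log[i].len);
        restored++;
    }
    return restored;
}
```

The open question from the patch remains visible here: the sketch is only
safe if memcpy's own stack usage stays inside the masked range.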

Attached a slightly updated version with just comments in local.cc
changed.
commit 02357c5f11138f512e714d1740491abc86c61388
Author: Torvald Riegel trie...@redhat.com
Date:   Sun Jan 8 20:12:33 2012 +0100

libitm: Filter out undo writes that overlap with the libitm stack.

libitm/
* config/generic/tls.h (GTM::mask_stack_top,
GTM::mask_stack_bottom): New.
* local.cc (gtm_undolog::rollback): Filter out any updates that
overlap the libitm stack.  Add current transaction as parameter.
* libitm_i.h (GTM::gtm_undolog::rollback): Adapt.
* beginend.cc (GTM::gtm_thread::rollback): Adapt.
* testsuite/libitm.c/stackundo.c: New test.

diff --git a/libitm/beginend.cc b/libitm/beginend.cc
index fe14f32..08c2174 100644
--- a/libitm/beginend.cc
+++ b/libitm/beginend.cc
@@ -327,7 +327,7 @@ GTM::gtm_thread::rollback (gtm_transaction_cp *cp, bool 
aborting)
   // data. Because of the latter, we have to roll it back before any
   // dispatch-specific rollback (which handles synchronization with other
   // transactions).
-  undolog.rollback (cp ? cp->undolog_size : 0);
+  undolog.rollback (this, cp ? cp->undolog_size : 0);
 
   // Perform dispatch-specific rollback.
   abi_disp()->rollback (cp);
diff --git a/libitm/config/generic/tls.h b/libitm/config/generic/tls.h
index 6bbdccf..07efef3 100644
--- a/libitm/config/generic/tls.h
+++ b/libitm/config/generic/tls.h
@@ -60,6 +60,25 @@ static inline abi_dispatch * abi_disp() { return 
_gtm_thr_tls.disp; }
 static inline void set_abi_disp(abi_dispatch *x) { _gtm_thr_tls.disp = x; }
 #endif
 
+#ifndef HAVE_ARCH_GTM_MASK_STACK
+// To filter out any updates that overlap the libitm stack, we define
+// gtm_mask_stack_top to the entry point to the library and
+// gtm_mask_stack_bottom to below current function.  This
+// definition should be fine for all stack-grows-down architectures.
+// FIXME We fake the bottom to be lower so that we are safe even if we might
+// call further functions (compared to where we called gtm_mask_stack_bottom
+// in the call hierarchy) to actually undo or redo writes (e.g., memcpy).
+// This is a completely arbitrary value; can we instead ensure that there are
+// no such calls, or can we determine a future-proof value otherwise?
+static inline void *
+mask_stack_top(gtm_thread *tx) { return tx->jb.cfa; }
+static inline void *
+mask_stack_bottom(gtm_thread *tx)
+{
+  return (uint8_t*)__builtin_dwarf_cfa() - 128;
+}
+#endif
+
 } // namespace GTM
 
 #endif // LIBITM_TLS_H
diff --git a/libitm/libitm_i.h b/libitm/libitm_i.h
index f922d22..f849654 100644
--- a/libitm/libitm_i.h
+++ b/libitm/libitm_i.h
@@ -138,7 +138,7 @@ struct gtm_undolog
   size_t size() const { return undolog.size(); }
 
   // In local.cc
-  void rollback (size_t until_size = 0);
+  void rollback (gtm_thread* tx, size_t until_size = 0);
 };
 
 // Contains all thread-specific data required by the entire library.
diff --git a/libitm/local.cc b/libitm/local.cc
index 39b6da3..8123063 100644
--- a/libitm/local.cc
+++ b/libitm/local.cc
@@ -26,11 +26,20 @@
 
 namespace GTM HIDDEN {
 
-
-void
-gtm_undolog::rollback (size_t until_size)
+// This function needs to be noinline because we need to prevent that it gets
+// inlined into another function that calls further functions. This could
+// break our assumption that we only call memcpy and thus only need to
+// additionally protect the memcpy stack (see the hack in mask_stack_bottom()).
+// Even if that isn't an issue because those other calls don't happen during
+// copying, we still need mask_stack_bottom() to be called close

C++ PATCH for c++/51433 (constexpr caching)

2012-01-10 Thread Jason Merrill

Sometimes an expression that is non-constant at one point in the 
translation unit can become constant later, when a constexpr function is 
defined.  So let's not cache failure.


Tested x86_64-pc-linux-gnu, applying to trunk.  This isn't a regression, 
but as before, since C++11 support is still experimental I consider 
patches that fix C++11 bugs and don't affect C++98 code to be acceptable.
commit 639cadeb21c26f4d5e98e9bf2b211a4c39a61b9c
Author: Jason Merrill ja...@redhat.com
Date:   Mon Jan 9 09:40:34 2012 -0500

	PR c++/51433
	* semantics.c (cxx_eval_call_expression): Always retry previously
	non-constant expressions.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index fbb74e1..2c351be 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6576,7 +6576,7 @@ cxx_eval_call_expression (const constexpr_call *old_call, tree t,
   else
 {
   result = entry->result;
-  if (!result || (result == error_mark_node && !allow_non_constant))
+  if (!result || result == error_mark_node)
 	result = (cxx_eval_constant_expression
 		  (new_call, new_call.fundef->body,
 		   allow_non_constant, addr,
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-cache1.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-cache1.C
new file mode 100644
index 000..b6d7b64
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-cache1.C
@@ -0,0 +1,9 @@
+// PR c++/51433
+// { dg-options "-std=c++0x" }
+
+constexpr int f();
+constexpr int g() { return f(); }
+extern const int n = g();	// dynamic initialization
+constexpr int f() { return 42; }
+extern const int m = g();
+static_assert(m == 42, "m == 42");

Re: Add -lssp_nonshared to LINK_SSP_SPEC

2012-01-10 Thread Richard Guenther

On Tue, Jan 10, 2012 at 3:14 PM, Tijl Coosemans t...@coosemans.org wrote:
 On targets where libc implements stack protector functions (GNU libc,
 FreeBSD libc), and where gcc (as an optimisation) generates calls to
 a locally defined __stack_chk_fail_local instead of directly calling
 the global function __stack_chk_fail (e.g. -fpic code on i386), one
 must explicitly specify -lssp_nonshared or -lc -lc_nonshared on the
 command line to statically link in __stack_chk_fail_local.

 It would be more convenient if the compiler kept the details of this
 target specific optimisation hidden by passing -lssp_nonshared to the
 linker internally.

 Here's a simple test case that shows the problem on i386-freebsd, but
 works just fine on e.g. x86_64 targets:

 % cat test.c
 int
 main( void ) {
    return( 0 );
 }
 % gcc46 -o test test.c -fstack-protector-all -fPIE
 /var/tmp//ccjYQxKu.o: In function `main':
 test.c:(.text+0x37): undefined reference to `__stack_chk_fail_local'
 /usr/local/bin/ld: test: hidden symbol `__stack_chk_fail_local' isn't defined
 /usr/local/bin/ld: final link failed: Bad value
 collect2: ld returned 1 exit status


 I don't have commit access, so please commit when approved.

Works fine for me on i?86-linux without -lssp_nonshared (which I do not
have, so linking would fail).

Richard.

 2011-01-10  Tijl Coosemans  t...@coosemans.org

        * gcc.c [TARGET_LIBC_PROVIDES_SSP] (LINK_SSP_SPEC): Add 
 -lssp_nonshared.

 --- gcc/gcc.c.orig
 +++ gcc/gcc.c
 @@ -602,7 +602,7 @@ proper position among the other output f

  #ifndef LINK_SSP_SPEC
  #ifdef TARGET_LIBC_PROVIDES_SSP
 -#define LINK_SSP_SPEC %{fstack-protector:}
 +#define LINK_SSP_SPEC 
 %{fstack-protector|fstack-protector-all:-lssp_nonshared}
  #else
  #define LINK_SSP_SPEC 
 %{fstack-protector|fstack-protector-all:-lssp_nonshared -lssp}
  #endif

Re: [PATCH] Fix PR49642 in 4.6, questions about 4.7

2012-01-10 Thread William J. Schmidt



On Tue, 2012-01-10 at 14:53 +0100, Richard Guenther wrote:
 On Tue, Jan 10, 2012 at 2:43 PM, William J. Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  Greetings,
 
  This patch follows Richard Guenther's suggestion of 2011-07-05 in
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49642 to fix the problem in
  gcc 4.6.  It prevents choosing a function split point that is dominated
  by a builtin call to __builtin_constant_p.
 
  The bug was marked fixed in 4.7 since the extra FRE pass allows the
  correct optimization to be done even in the presence of
  __builtin_constant_p.  However, 4.7 still fails in the presence of
  -fno-tree-fre.  I think we should probably include a variation of this
  patch in 4.7 that only kicks in when FRE has been disabled at the
  command line.  The test case would also be modified slightly to include
  -fno-tree-fre in the dg-compile statement.  Thoughts?
 
 I think it should unconditionally restrict splitting (I suppose on the
 trunk the __builtin_constant_p is optimized away already).

OK.  Yes, on trunk it is optimized away (when FRE is not disabled).
Having the logic unconditional is fine with me; I'd like to use
-fno-tree-fre in the test case so it actually gets tested, though.  Or
have two variants, one with, one without.

 
 Btw, this will also disqualify any point below
 
  if (__builtin_constant_p (...))
{
  ...
}
 
 because after the if join all BBs are dominated by the __builtin_constant_p
 call.  What we want to disallow is splitting at a block that is dominated
 by the true edge of the condition fed by the __builtin_constant_p result ...

True.  What we have is:

  D.1899_68 = __builtin_constant_p (D.1898_67);
  if (D.1899_68 != 0)
    goto <bb 3>;
  else
    goto <bb 133>;

So I suppose we have to walk the immediate uses of the LHS of the call,
find all that are part of a condition, and mark the target block for
nonzero (in this case bb 3) as a forbidden dominator.  I can tighten
this up.
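
As a sanity check on the dominance argument, here is a small standalone
sketch (not ipa-split.c code; the four-block CFG and all names are made up)
that computes dominator sets for a diamond: block 0 holds the call and the
condition, block 1 is the true target, block 2 the false target, block 3
the join.

```c
#include <stdint.h>

#define NBLOCKS 4

/* pred[b] is a bitmask of the predecessors of block b.
   CFG: 0 -> 1 (true edge), 0 -> 2 (false edge), 1 -> 3, 2 -> 3.  */
static const uint32_t pred[NBLOCKS] = { 0x0, 0x1, 0x1, 0x6 };

/* Iterative dataflow: dom[b] is the set of blocks dominating b,
   as a bitmask.  Block 0 is the entry block.  */
static void compute_dominators(uint32_t dom[NBLOCKS])
{
    const uint32_t all = (1u << NBLOCKS) - 1;
    dom[0] = 1u << 0;
    for (int b = 1; b < NBLOCKS; b++)
        dom[b] = all;

    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 1; b < NBLOCKS; b++) {
            /* Intersect the dominator sets of all predecessors. */
            uint32_t meet = all;
            for (int p = 0; p < NBLOCKS; p++)
                if (pred[b] & (1u << p))
                    meet &= dom[p];
            uint32_t next = meet | (1u << b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = 1;
            }
        }
    }
}

/* Nonzero if block a dominates block b. */
static int dominates(const uint32_t dom[NBLOCKS], int a, int b)
{
    return (dom[b] >> a) & 1;
}
```

With block 0 as the forbidden dominator the join block 3 is excluded as
well; with block 1 (the nonzero target) only the guarded region is.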

 
 Honza?
 
  The 4.6 patch was bootstrapped and tests cleanly on powerpc64-linux-gnu.
  OK for 4.6 branch?
 
  Thanks,
  Bill
 
 
  gcc:
 
  2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
 PR tree-optimization/49642
 * ipa-split.c (forbidden_dominators): New variable.
 (check_forbidden_calls): New function.
 (dominated_by_forbidden): Likewise.
 (consider_split): Check for forbidden calls.
 (execute_split_functions): Initialize and free forbidden
 dominators info; call check_forbidden_calls.
 
  gcc/testsuite:
 
  2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com
 
 PR tree-optimization/49642
 * gcc.dg/tree-ssa/pr49642.c: New test.
 
 
  Index: gcc/testsuite/gcc.dg/tree-ssa/pr49642.c
  ===
  --- gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
  +++ gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
  @@ -0,0 +1,49 @@
  +/* Verify that ipa-split is disabled following __builtin_constant_p.  */
  +
  +/* { dg-do compile } */
  +/* { dg-options -O2 -fdump-tree-optimized } */
  +
  +typedef unsigned int u32;
  +typedef unsigned long long u64;
  +
  +static inline __attribute__((always_inline)) __attribute__((const))
  +int __ilog2_u32(u32 n)
  +{
  + int bit;
  + asm (cntlzw %0,%1 : =r (bit) : r (n));
  + return 31 - bit;
  +}
  +
  +
  +static inline __attribute__((always_inline)) __attribute__((const))
  +int __ilog2_u64(u64 n)
  +{
  + int bit;
  + asm (cntlzd %0,%1 : =r (bit) : r (n));
  + return 63 - bit;
  +}
  +
  +
  +
  +static u64 ehca_map_vaddr(void *caddr);
  +
  +struct ehca_shca {
  +u32 hca_cap_mr_pgsize;
  +};
  +
  +static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca)
  +{
  + return 1UL  ( __builtin_constant_p(shca-hca_cap_mr_pgsize) ? ( 
  (shca-hca_cap_mr_pgsize)  1 ? ilog2_NaN() : (shca-hca_cap_mr_pgsize) 
   (1ULL  63) ? 63 : (shca-hca_cap_mr_pgsize)  (1ULL  62) ? 62 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  61) ? 61 : (shca-hca_cap_mr_pgsize)  
  (1ULL  60) ? 60 : (shca-hca_cap_mr_pgsize)  (1ULL  59) ? 59 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  58) ? 58 : (shca-hca_cap_mr_pgsize)  
  (1ULL  57) ? 57 : (shca-hca_cap_mr_pgsize)  (1ULL  56) ? 56 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  55) ? 55 : (shca-hca_cap_mr_pgsize)  
  (1ULL  54) ? 54 : (shca-hca_cap_mr_pgsize)  (1ULL  53) ? 53 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  52) ? 52 : (shca-hca_cap_mr_pgsize)  
  (1ULL  51) ? 51 : (shca-hca_cap_mr_pgsize)  (1ULL  50) ? 50 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  49) ? 49 : (shca-hca_cap_mr_pgsize)  
  (1ULL  48) ? 48 : (shca-hca_cap_mr_pgsize)  (1ULL  47) ? 47 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  46) ? 46 : (shca-hca_cap_mr_pgsize)  
  (1ULL  45) ? 45 : (shca-hca_cap_mr_pgsize)  (1ULL  44) ? 44 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  43) ? 43 : (shca-hca_cap_mr_pgsize)  
  (1ULL  42) ? 42 : (shca-hca_cap_mr_pgsize)  (1ULL  41) ? 41 : 
  (shca-hca_cap_mr_pgsize)  (1ULL  40) ? 40 :

RE: [Ping] RE: CR16 Port addition

2012-01-10 Thread Joseph S. Myers

On Tue, 10 Jan 2012, Jayant R. Sonar wrote:

> PING 9: For reviewing the modified CR16 port.
>
> Hello,
>
> Can someone please review the updated patch and let me know if any more
> changes are required?
>
> Rainer had suggested a few important changes last time. After making those
> changes, the modified patch was posted at the following URL:
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02625.html

Richard, any comments on this version?  I've looked over it and don't have 
any comments myself.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [RFC] Fixing expansion of misaligned MEM_REFs on strict-alignment targets

2012-01-10 Thread Joseph S. Myers

On Tue, 10 Jan 2012, Richard Guenther wrote:

> There is none :/  STRICT_ALIGNMENT would need to get a mode argument,

The version of STRICT_ALIGNMENT with a mode argument is 
SLOW_UNALIGNED_ACCESS (from GCC's perspective, there isn't much difference 
between "unaligned accesses don't work at all" and "unaligned accesses are 
very slow because they trap" - we don't want to generate them in either 
case).  Probably we should migrate STRICT_ALIGNMENT uses to call 
SLOW_UNALIGNED_ACCESS (or a version thereof converted to a target hook) 
with appropriate arguments, with a view to poisoning STRICT_ALIGNMENT.
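
As a source-level analogue only (this is user code, not the expander, and
load_u32_unaligned is a name chosen for the sketch): going through memcpy
leaves the choice of access sequence to the compiler, which is the same
"never emit a plain misaligned access where it is unsafe or slow"
policy seen from the other side.

```c
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from an address with no alignment guarantee.
   The compiler may turn the memcpy into a single load where that is
   valid and cheap for the target, or into byte loads where it is
   not; the source never performs a misaligned dereference.  */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```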

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: RFC: An alternative -fsched-pressure implementation

2012-01-10 Thread Vladimir Makarov


On 12/28/2011 08:51 AM, Richard Sandiford wrote:

Vladimir Makarov vmaka...@redhat.com writes:

In the end I tried an ad-hoc approach in an attempt to do something
about (2), (3) and (4b).  The idea was to construct a preliminary
model schedule in which the only objective is to keep register
pressure to a minimum.  This schedule ignores pipeline characteristics,
latencies, and the number of available registers.  The maximum pressure
seen in this initial model schedule (MP) is then the benchmark for ECC(X).


I always had the impression that the code before the scheduler is close to
minimal register pressure because of the specific expression generation.
Maybe I was wrong and some optimizations (global ones like PRE) change
this a lot.

One of the examples I was looking at was:

-
#include <stdint.h>

#define COUNT 8

void
loop (uint8_t *__restrict dst, uint8_t *__restrict src, uint8_t *__restrict 
ff_cropTbl, int dstStride, int srcStride)
{
   const int w = COUNT;
   uint8_t *cm = ff_cropTbl + 1024;
   for(int i=0; i<w; i++)
 {
   const int srcB = src[-2*srcStride];
   const int srcA = src[-1*srcStride];
   const int src0 = src[0 *srcStride];
   const int src1 = src[1 *srcStride];
   const int src2 = src[2 *srcStride];
   const int src3 = src[3 *srcStride];
   const int src4 = src[4 *srcStride];
   const int src5 = src[5 *srcStride];
   const int src6 = src[6 *srcStride];
   const int src7 = src[7 *srcStride];
   const int src8 = src[8 *srcStride];
   const int src9 = src[9 *srcStride];
   const int src10 = src[10*srcStride];

   dst[0*dstStride] = cm[(((src0+src1)*20 - (srcA+src2)*5 + (srcB+src3)) + 16)>>5];
   dst[1*dstStride] = cm[(((src1+src2)*20 - (src0+src3)*5 + (srcA+src4)) + 16)>>5];
   dst[2*dstStride] = cm[(((src2+src3)*20 - (src1+src4)*5 + (src0+src5)) + 16)>>5];
   dst[3*dstStride] = cm[(((src3+src4)*20 - (src2+src5)*5 + (src1+src6)) + 16)>>5];
   dst[4*dstStride] = cm[(((src4+src5)*20 - (src3+src6)*5 + (src2+src7)) + 16)>>5];
   dst[5*dstStride] = cm[(((src5+src6)*20 - (src4+src7)*5 + (src3+src8)) + 16)>>5];
   dst[6*dstStride] = cm[(((src6+src7)*20 - (src5+src8)*5 + (src4+src9)) + 16)>>5];
   dst[7*dstStride] = cm[(((src7+src8)*20 - (src6+src9)*5 + (src5+src10)) + 16)>>5];
   dst++;
   src++;
 }
}
-

(based on the libav h264 code).  In this example the loads from src and
stores to dst are still in their original order by the time we reach sched1,
so src, dst, srcA, srcB, and src0..10 are all live at once.  There's no
aliasing reason why they can't be reordered, and we do that during
scheduling.
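
To make the "all live at once" point concrete, here is a minimal sketch (not
GCC code) that counts the peak number of simultaneously live values for a
given linear instruction order.  The struct live_range type and the
max_pressure function are hypothetical names invented for this illustration,
and a value is assumed to be live from the position of its definition through
the position of its last use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical live range: a value is live from position DEF (where it
   is computed) through position LAST_USE (its final use), inclusive.  */
struct live_range
{
  int def;
  int last_use;
};

/* Return the maximum number of ranges that are live at any single
   position in 0..N_POSITIONS-1.  Quadratic, which is fine for a sketch.  */
int
max_pressure (const struct live_range *ranges, size_t n_ranges,
              int n_positions)
{
  int best = 0;

  assert (n_positions >= 0);
  for (int pos = 0; pos < n_positions; pos++)
    {
      int live = 0;
      for (size_t i = 0; i < n_ranges; i++)
        if (ranges[i].def <= pos && pos <= ranges[i].last_use)
          live++;
      if (live > best)
        best = live;
    }
  return best;
}
```

With four loads issued up front and consumed afterwards, the peak is four
values; pairing every two loads with their use brings the peak down to two.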


Thanks for the example.




As for the patch itself, I will look at it more carefully at the
beginning of next year.  As I understand it, there is no rush
because we are not yet in stage 1.

Thanks, appreciate it.  And yeah, there's definitely no rush: there's
no way this can go in 4.7.


Ok.  Thanks, Richard.

Re: RFC: An alternative -fsched-pressure implementation

2012-01-10 Thread Vladimir Makarov


On 01/09/2012 07:45 AM, Bernd Schmidt wrote:

On 12/23/2011 12:46 PM, Richard Sandiford wrote:

In the end I tried an ad-hoc approach in an attempt to do something
about (2), (3) and (4b).  The idea was to construct a preliminary
model schedule in which the only objective is to keep register
pressure to a minimum.  This schedule ignores pipeline characteristics,
latencies, and the number of available registers.  The maximum pressure
seen in this initial model schedule (MP) is then the benchmark for ECC(X).

Interesting. I had also thought about trying to address the problem in
the scheduler, so I'll just share some of my thoughts (not intended as a
negative comment on your approach).

Essentially the problem I usually see is that the wider your machine
gets, the happier sched1 will be to fill unused slots with instructions
that could just as well be scheduled 200 cycles later. That's really an
artifact of forward scheduling, so why not schedule both forwards and
then backwards (or the other way round)? Produce an initial schedule,
then perform a fixup phase: start at the other end and look for
instructions that can be scheduled in a wide range of cycles, and move
them if doing so can be shown to reduce register pressure, and we retain
a correct schedule. This can be done without trying to have a global
pressure estimate which has the problems you noted.

I saw such approaches in the literature.  It would be interesting to see 
how it works in GCC.


That said, I believe that a register pressure minimization pass before RA,
with the selective insn scheduler used only after RA, could solve all these
problems.  But it needs a lot of investigation to confirm this.  It is
also hard for me to say which approach would require more effort to
implement.

Doing this would require constructing a backwards DFA, and I gave up on
this for the moment after a day or so spent with genautomata, but it
should be possible.

Long ago (10 years ago) I considered doing exactly this in order to remove
the 2nd separate insn scheduler, by inserting spill code into an already
existing insn schedule.  But I decided not to do it.


I also considered generating DFAs with repeated insn issues to better
serve the modulo scheduler.


There are many directions for developing automatically generated
pipeline hazard recognizers beyond (N)DFAs, for example to simulate hidden
register renaming or the look-ahead buffers/queues of different OOO processors.
All of these directions could be interesting research projects.  But
even in its current state, IMHO the GCC (N)DFA pipeline hazard recognizer,
after 10 years of existence, is the most powerful one used in
industrial compilers.

Re: [PATCH] Don't ICE on >= 64KB expressions in dwarf2out (PR debug/51695)

2012-01-10 Thread Alexandre Oliva

On Jan  4, 2012, Jakub Jelinek ja...@redhat.com wrote:

 Unfortunately from time to time we do generate them, I hope Alex will
 look at how to prevent that from happening at var-tracking time, but
 still this isn't something we should assert on.

I've spent some time looking into this.  I could avoid the huge
expression by reducing the default max-vartrack-expr-depth from 12 to 8,
and AFAICT this didn't change at all the debug info generated for any of
the host parts of GCC, so we could do that.  I haven't regtested that,
though, for I'm not convinced that's the way to go.

FWIW, that wouldn't solve the underlying problem, which is that
expressions may get arbitrarily large, and debug info formats may not be
able to deal with that.  I think cutting them off within the debug info
format logic is the best we can do for now.  Using dwarf procedures is a
longer-term project that, if done in var-tracking where IMHO it belongs,
will require some additional interfaces between var-tracking and debug
info formats (are procedures supported, what's the expr size limit,
what's the size of a certain expr, etc).

One shorter-term way to avoid too-big expressions is to change the expr
depth computation logic.  Currently, if we find something like (op
(value) (const_int)) we don't regard it as any deeper than value by
itself.  Only operations with two or more value operands increase the
depth.  E.g., in the testcase for this bug, we only increase the depth
of the expression for the already-dead variable o at the if_then_else
expressions, for everything else (ands, xors, etc) involve constants.  I
don't think that'd be a positive change overall, though.
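
As a rough illustration of the two depth rules discussed above, here is a toy
sketch.  It is not var-tracking code: the node type and both function names
are hypothetical, operations are restricted to two operands, and a lone value
is assumed to have depth 1.

```c
#include <assert.h>
#include <stddef.h>

enum node_kind { NODE_CONST, NODE_VALUE, NODE_OP };

/* Toy expression node; NODE_OP nodes have exactly two operands here.  */
struct node
{
  enum node_kind kind;
  const struct node *op0;
  const struct node *op1;
};

/* Depth in the spirit of the current rule: an operation is deeper than
   its operands only when both operands are non-constant.  */
int
depth_values_only (const struct node *n)
{
  assert (n != NULL);
  if (n->kind == NODE_CONST)
    return 0;
  if (n->kind == NODE_VALUE)
    return 1;

  int d0 = depth_values_only (n->op0);
  int d1 = depth_values_only (n->op1);
  int deepest = d0 > d1 ? d0 : d1;
  int both_non_const = (n->op0->kind != NODE_CONST
                        && n->op1->kind != NODE_CONST);
  return deepest + (both_non_const ? 1 : 0);
}

/* Alternative rule: every operation adds one level, constants or not.  */
int
depth_every_op (const struct node *n)
{
  assert (n != NULL);
  if (n->kind == NODE_CONST)
    return 0;
  if (n->kind == NODE_VALUE)
    return 1;

  int d0 = depth_every_op (n->op0);
  int d1 = depth_every_op (n->op1);
  return (d0 > d1 ? d0 : d1) + 1;
}
```

Under the first rule, a chain of ands and xors with constants stays as shallow
as the value it starts from, so it never trips the depth limit.  Under the
second rule, each link in the chain counts.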

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer

Re: C++/libiberty PATCH for many mangling fixes (6057, 48051, 50855, 51322 and more)

2012-01-10 Thread Jason Merrill

Keith pointed out that my demangler changes changed the demangling of 
overloaded operator delete; this patch corrects that.


Tested x86_64-pc-linux-gnu, applying to trunk.
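
For readers who do not want to read the diff, the printing rule can be
sketched standalone as follows.  This is not libiberty code: the function
format_operator_name and its buffer interface are hypothetical, and it assumes
the operator table stores some names with a trailing space (for example
"delete "), as the patch comment suggests.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Write "operator" followed by NAME (LEN bytes) into OUT, which has room
   for OUT_SIZE bytes.  A space is inserted before names that start with
   a lowercase letter (new, delete), and one trailing space in NAME is
   dropped.  Returns OUT.  */
char *
format_operator_name (char *out, size_t out_size,
                      const char *name, size_t len)
{
  static const char prefix[] = "operator";
  size_t pos = sizeof prefix - 1;

  /* Room for the prefix, an optional space, the name, and the NUL.  */
  assert (len > 0 && out_size >= pos + len + 2);
  memcpy (out, prefix, pos);
  if (islower ((unsigned char) name[0]))
    out[pos++] = ' ';
  if (name[len - 1] == ' ')
    --len;
  memcpy (out + pos, name, len);
  out[pos + len] = '\0';
  return out;
}
```

A name such as "delete " then prints as "operator delete", while a symbolic
name such as "+" still prints as "operator+".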

commit ee8af40f38391d44549cf96b159dcb00821c2074
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 10 11:16:46 2012 -0500

	* cp-demangle.c (d_print_comp) [DEMANGLE_COMPONENT_OPERATOR]:
	Omit a trailing space in the operator name.

diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 2dfd67c..18b84a1 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4357,14 +4357,17 @@ d_print_comp (struct d_print_info *dpi, int options,
 
 case DEMANGLE_COMPONENT_OPERATOR:
   {
-	char c;
+	const struct demangle_operator_info *op = dc->u.s_operator.op;
+	int len = op->len;
 
 	d_append_string (dpi, "operator");
-	c = dc->u.s_operator.op->name[0];
-	if (IS_LOWER (c))
+	/* Add a space before new/delete.  */
+	if (IS_LOWER (op->name[0]))
 	  d_append_char (dpi, ' ');
-	d_append_buffer (dpi, dc->u.s_operator.op->name,
-			 dc->u.s_operator.op->len);
+	/* Omit a trailing space.  */
+	if (op->name[len-1] == ' ')
+	  --len;
+	d_append_buffer (dpi, op->name, len);
 	return;
   }
 
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 3f3960a..408c4f4 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4058,6 +4058,9 @@ decltype ((::delete {parm#1}),(+{parm#1})) f<int>(int*)
 _Z1fIiEDTcmdafp_psfp_EPT_
 decltype ((delete[] {parm#1}),(+{parm#1})) f<int>(int*)
 --format=gnu-v3
+_ZN1AdlEPv
+A::operator delete(void*)
+--format=gnu-v3
 _Z2f1IiEDTppfp_ET_
 decltype ({parm#1}++) f1<int>(int)
 --format=gnu-v3

Re: [PATCH] Adjust 'malloc' attribute documentation to match implementation

2012-01-10 Thread Xinliang David Li

of course your new version.

thanks,

David

On Tue, Jan 10, 2012 at 1:31 AM, Richard Guenther rguent...@suse.de wrote:
 On Mon, 9 Jan 2012, Xinliang David Li wrote:

 It looks non-ambiguous to me.

 The new proposed version or the old?

 Richard.

 David

 On Mon, Jan 9, 2012 at 1:05 AM, Richard Guenther rguent...@suse.de wrote:
 
  Since GCC 4.4 applying the malloc attribute to realloc-like
  functions does not work under the documented constraints because
  the contents of the memory pointed to are not properly transferred
  from the realloc argument (or treated as pointing to anything,
  like 4.3 behaved).
 
  The following adjusts documentation to reflect implementation
  reality (we do have an implementation detail that treats the
  memory blob returned for non-builtins as pointing to any global
  variable, but that is neither documented nor do I plan to do
  so - I presume it is to allow allocation + initialization
  routines to be marked with malloc, but even that area looks
  susceptible to misinterpretation to me).
 
  Any comments?
 
  Thanks,
  Richard.
 
  2012-01-09  Richard Guenther  rguent...@suse.de
 
         * doc/extend.texi (malloc attribute): Adjust according to
         implementation.
 
  Index: gcc/doc/extend.texi
  ===
  --- gcc/doc/extend.texi (revision 183001)
  +++ gcc/doc/extend.texi (working copy)
  @@ -2771,13 +2771,12 @@ efficient @code{jal} instruction.
   @cindex @code{malloc} attribute
   The @code{malloc} attribute is used to tell the compiler that a function
   may be treated as if any non-@code{NULL} pointer it returns cannot
  -alias any other pointer valid when the function returns.
  +alias any other pointer valid when the function returns and that the memory
  +has undefined content.
   This will often improve optimization.
   Standard functions with this property include @code{malloc} and
  -@code{calloc}.  @code{realloc}-like functions have this property as
  -long as the old pointer is never referred to (including comparing it
  -to the new pointer) after the function returns a non-@code{NULL}
  -value.
  +@code{calloc}.  @code{realloc}-like functions do not have this
  +property as the memory pointed to does not have undefined content.
 
   @item mips16/nomips16
   @cindex @code{mips16} attribute



 --
 Richard Guenther rguent...@suse.de
 SUSE / SUSE Labs
 SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
 GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer
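
To illustrate the distinction drawn in the adjusted documentation text, here
is a small sketch.  The names xalloc and xgrow are hypothetical helpers made
up for this example; only __attribute__ ((malloc)) itself is the GCC feature
under discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator wrapper.  The result is fresh storage that
   aliases no other live pointer, and its contents are unspecified, so
   it fits the adjusted wording of the malloc attribute.  */
__attribute__ ((malloc)) void *
xalloc (size_t size)
{
  assert (size > 0);
  void *p = malloc (size);
  if (p == NULL)
    abort ();
  return p;
}

/* Hypothetical realloc-like helper.  The returned block carries over
   the old contents, so under the adjusted wording it is deliberately
   NOT marked with the malloc attribute.  */
void *
xgrow (void *old, size_t new_size)
{
  assert (new_size > 0);
  void *p = realloc (old, new_size);
  if (p == NULL)
    abort ();
  return p;
}
```

A caller can rely on xgrow preserving the old bytes, which is exactly the
property that makes it unsuitable for the attribute.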

Re: FW: patch to fix PR21617

2012-01-10 Thread Vladimir Makarov


On 12/29/2011 06:41 AM, Igor Zamyatin wrote:

Ilya is on vacation, so I'll answer.

Overall score became worse by 0.3%.

Ok, thanks.  It is within the range of measurement error for some processors.
But that range is pretty small for Intel processors.


Did you use Atom for measuring?

I'll try to find another solution for the problem.  But I need help
with benchmarking the patches, as I have no access to EEMBC.

Re: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule (re-submission)

2012-01-10 Thread Vladimir Makarov


On 01/03/2012 04:25 AM, Revital1 Eres wrote:


Attached is an updated version with the two changes mentioned above taken
from the previous patch.

Tested and bootstrapped with the other patch in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Thanks again,
Revital



IRA changes are ok for me.
Thanks, Revital.

2012-01-03  Richard Sandiford  richard.sandif...@linaro.org
            Revital Eres  revital.e...@linaro.org

 * loop-invariant.c (get_regno_pressure_class): Move function to...
 * ira.c: Here.
 * common.opt (fmodulo-sched-reg-pressure, -fmodulo-sched-verbose):
 New flags.
 * doc/invoke.texi (fmodulo-sched-reg-pressure,
 -fmodulo-sched-verbose): Document the flags.
 * ira.h (get_regno_pressure_class,
 reset_pseudo_classes_defined_p): Declare.
 * ira-costs.c (reset_pseudo_classes_defined_p): New function.
 * Makefile.in (modulo-sched.o): Include ira.h and modulo-sched.h.
 (modulo-sched-pressure.o): New.
 * modulo-sched.c (ira.h, modulo-sched.h): New includes.
 (partial_schedule_ptr, ps_insn_ptr, struct ps_insn,
 struct ps_reg_move_info, struct partial_schedule): Move to
 modulo-sched.h.
 (ps_rtl_insn, ps_reg_move): Remove static.
 (apply_reg_moves): Remove static and call df_insn_rescan only
 if PS is final.
 (undo_reg_moves): New function.
 (sms_schedule): Call register pressure estimation.
 * modulo-sched.h: New file.
 * modulo-sched-pressure.c: New file.

(See attached file: patch_pressure_3_1_12.txt)

[PATCH] Fix ICE in distribute_notes (PR bootstrap/51796)

2012-01-10 Thread Jakub Jelinek

Hi!

My recent patch which adds REG_ARGS_SIZE note to all
!ACCUMULATE_OUTGOING_ARGS noreturn calls introduced a regression,
the checking code in distribute_notes can ICE if something
is combined with the noreturn call and the noreturn call has
the same REG_ARGS_SIZE value as before.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2012-01-10  Jakub Jelinek  ja...@redhat.com

PR bootstrap/51796
* combine.c (distribute_notes): If i3 is a noreturn call,
allow old_size to be equal to args_size.

--- gcc/combine.c   2011-12-09 15:21:20.548649331 +0100
+++ gcc/combine.c   2012-01-10 00:10:59.303836653 +0100
@@ -13282,7 +13282,21 @@ distribute_notes (rtx notes, rtx from_in
{
  int old_size, args_size = INTVAL (XEXP (note, 0));
  old_size = fixup_args_size_notes (PREV_INSN (i3), i3, args_size);
- gcc_assert (old_size != args_size);
+ if (old_size == args_size)
+   {
+ /* emit_call_1 adds for !ACCUMULATE_OUTGOING_ARGS
+REG_ARGS_SIZE note to all noreturn calls, so allow those
+here.  */
+ gcc_assert (CALL_P (i3) && !ACCUMULATE_OUTGOING_ARGS);
+ if (find_reg_note (i3, REG_NORETURN, NULL_RTX) == NULL_RTX)
+   {
+ rtx n;
+ for (n = next_note; n; n = XEXP (n, 1))
+   if (REG_NOTE_KIND (n) == REG_NORETURN)
+ break;
+ gcc_assert (n);
+   }
+   }
}
  break;
 
--- gcc/testsuite/gcc.dg/pr51796.c.jj   2012-01-10 16:35:53.917661308 +0100
+++ gcc/testsuite/gcc.dg/pr51796.c  2012-01-10 16:36:36.000880045 +0100
@@ -0,0 +1,16 @@
+/* PR bootstrap/51796 */
+/* { dg-do compile } */
+/* { dg-options "-Os -fno-omit-frame-pointer -fno-tree-dominator-opts -fno-tree-fre -fno-tree-pre" } */
+
+typedef void (*entry_func) (void) __attribute__ ((noreturn));
+extern entry_func entry_addr;
+static void bsd_boot_entry (void)
+{
+  stop ();
+}   
+void bsd_boot (void)
+{
+  entry_addr = (entry_func) bsd_boot_entry;
+  (*entry_addr) ();
+}
+

Jakub

[patch] Fix ICEs with functions returning variable-sized array

2012-01-10 Thread Eric Botcazou

This is a couple of regressions present on the mainline.  For the first
testcase at -O2 -gnatn:

+===GNAT BUG DETECTED==+
| 4.7.0 20120102 (experimental) [trunk revision 182780] (i586-suse-linux) GCC error:|
| in assign_stack_temp_for_type, at function.c:796 |
| Error detected around p1.adb:3:4   

For the second testcase:

+===GNAT BUG DETECTED==+
| 4.7.0 20120102 (experimental) [trunk revision 182780] (i586-suse-linux) GCC error:|
| in declare_return_variable, at tree-inline.c:2904|
| Error detected around p2.adb:3:4

Both are caused by the fnsplit IPA pass being run on a function returning a 
variable-sized array.  In both cases, the part that isn't inlined is made 
up of a single raise statement, i.e. a no-return call.  So fnsplit rewrites 
the call statement into just:

  f.part (arguments);

In the first case, the compilation aborts when the RTL expander attempts to 
create a temporary for the return value (which would have variable size) 
while, in the second case, it aborts on the assertion:

  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (callee_type)) == INTEGER_CST);

when the inliner attempts to inline the part that wasn't inlined(!).

The proposed fix is to turn the part that isn't inlined into a function that
returns void.  This involves straightforward adjustments to the two versioning 
machineries (cgraph and tree).

Tested on i586-suse-linux, OK for the mainline?


2012-01-10  Eric Botcazou  ebotca...@adacore.com

* tree.h (build_function_decl_skip_args): Add boolean parameter.
(build_function_type_skip_args): Delete.
* tree.c (build_function_type_skip_args): Make static and add
SKIP_RETURN parameter.  Fix thinko in the handling of variants.
(build_function_decl_skip_args): Add SKIP_RETURN parameter and
pass it to build_function_type_skip_args.
* cgraph.h (cgraph_function_versioning): Add boolean parameter.
(tree_function_versioning): Likewise.
* cgraph.c (cgraph_create_virtual_clone): Adjust call to
build_function_decl_skip_args.
* cgraphunit.c (cgraph_function_versioning): Add SKIP_RETURN parameter
and pass it to build_function_decl_skip_args/tree_function_versioning.
(cgraph_materialize_clone): Adjust call to tree_function_versioning.
* ipa-inline-transform.c (save_inline_function_body): Likewise.
* trans-mem.c (ipa_tm_create_version): Likewise.
* tree-sra.c (modify_function): Likewise for cgraph_function_versioning.
* tree-inline.c (declare_return_variable): Remove always-true test.
(tree_function_versioning): Add SKIP_RETURN parameter.  If the function
returns non-void and SKIP_RETURN, create a void-typed RESULT_DECL.
* ipa-split.c (split_function): Skip the return value for the split
part if it doesn't return.


2012-01-10  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/opt23.ad[sb]: New test.
* gnat.dg/opt23_pkg.ad[sb]: New helper.
* gnat.dg/opt24.ad[sb]: New test.


-- 
Eric Botcazou
Index: tree.h
===
--- tree.h	(revision 182780)
+++ tree.h	(working copy)
@@ -4386,8 +4386,7 @@ extern tree build_nonshared_array_type (
 extern tree build_array_type_nelts (tree, unsigned HOST_WIDE_INT);
 extern tree build_function_type (tree, tree);
 extern tree build_function_type_list (tree, ...);
-extern tree build_function_type_skip_args (tree, bitmap);
-extern tree build_function_decl_skip_args (tree, bitmap);
+extern tree build_function_decl_skip_args (tree, bitmap, bool);
 extern tree build_varargs_function_type_list (tree, ...);
 extern tree build_function_type_array (tree, int, tree *);
 extern tree build_varargs_function_type_array (tree, int, tree *);
Index: tree.c
===
--- tree.c	(revision 182780)
+++ tree.c	(working copy)
@@ -7556,10 +7556,12 @@ build_function_type (tree value_type, tr
   return t;
 }
 
-/* Build variant of function type ORIG_TYPE skipping ARGS_TO_SKIP.  */
+/* Build variant of function type ORIG_TYPE skipping ARGS_TO_SKIP and the
+   return value if SKIP_RETURN is true.  */
 
-tree
-build_function_type_skip_args (tree orig_type, bitmap args_to_skip)
+static tree
+build_function_type_skip_args (tree orig_type, bitmap args_to_skip,
+			   bool skip_return)
 {
   tree new_type = NULL;
   tree args, new_args = NULL, t;
@@ -7599,11 +7601,15 @@ build_function_type_skip_args (tree orig
   TYPE_CONTEXT (new_type) = TYPE_CONTEXT (orig_type);
 }
 
+  if (skip_return)
+TREE_TYPE (new_type) = void_type_node;
+
   /* This is a new type, not a copy of an old type.  Need to reassociate
  variants.  We can handle everything except the main variant lazily.  */
   t =

[patch] Fix crash on function returning variable-sized array

2012-01-10 Thread Eric Botcazou

This is a regression present on the mainline.  The compiler crashes during the 
function unnesting pass because of an out-of-context temporary, but the root 
cause of the problem is incorrect sharing of a tree node.  The problem has 
probably been latent since gimplification was devised: while the DECL_SIZE and 
DECL_SIZE_UNIT trees of VAR_DECLs are visited by walk_tree (via BIND_EXPR), 
this isn't the case for RESULT_DECL.  As a result, if it has variable size 
(this now happens much more often in Ada), its subtrees aren't unshared and 
this is problematic.

The proposed fix is to unshare them manually in unshare_body.  To implement 
this, the awkward interface to gimplify_body/unshare_body, where you pass both 
a pointer to the body and the decl itself and later need to test whether they 
are related, is simplified as the 2 calls of gimplify_body are idiomatic.
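
The sharing problem can be shown with a toy model.  This is only a sketch
under simplifying assumptions, not the gimplifier: struct toy_node and the
toy_* functions are hypothetical, nodes have two children, and a plain
visited flag replaces GCC's own marking.  The point it illustrates is that a
root which is never handed to the walker keeps pointing at a shared node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy tree node, standing in for a GENERIC tree.  */
struct toy_node
{
  bool visited;
  struct toy_node *kid[2];
};

/* Allocate an unvisited node with the given children (may be NULL).  */
struct toy_node *
toy_build (struct toy_node *k0, struct toy_node *k1)
{
  struct toy_node *n = calloc (1, sizeof *n);
  assert (n != NULL);
  n->kid[0] = k0;
  n->kid[1] = k1;
  return n;
}

/* Return a fresh deep copy of N; every copied node is marked visited.  */
static struct toy_node *
toy_deep_copy (const struct toy_node *n)
{
  if (n == NULL)
    return NULL;
  struct toy_node *c = toy_build (toy_deep_copy (n->kid[0]),
                                  toy_deep_copy (n->kid[1]));
  c->visited = true;
  return c;
}

/* Walk the tree rooted at *NP.  The first time a node is reached it is
   marked; any later reference to a marked node is replaced by a copy.  */
void
toy_copy_if_shared (struct toy_node **np)
{
  struct toy_node *n = *np;
  if (n == NULL)
    return;
  if (n->visited)
    {
      *np = toy_deep_copy (n);
      return;
    }
  n->visited = true;
  toy_copy_if_shared (&n->kid[0]);
  toy_copy_if_shared (&n->kid[1]);
}
```

If a function body and a separate size expression both point at the same
node, walking only the body leaves the size expression shared; walking the
size expression as an additional root gives it its own copy.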

Tested on i586-suse-linux, OK for the mainline?


2012-01-10  Eric Botcazou  ebotca...@adacore.com

* gimple.h (gimplify_body): Remove first argument.
* gimplify.c (copy_if_shared): Add DATA argument.  Do not create the
pointer set here, instead just pass DATA to walk_tree.
(unshare_body): Remove BODY_P argument and adjust.  Create the pointer
set here and invoke copy_if_shared on the size trees of DECL_RESULT.
(unvisit_body): Likewise, but with unmark_visited.
(gimplify_body): Remove BODY_P argument and adjust.
(gimplify_function_tree): Adjust call to gimplify_body.
* omp-low.c (finalize_task_copyfn): Likewise.


2012-01-10  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/array19.ad[sb]: New test.


-- 
Eric Botcazou
Index: gimple.h
===
--- gimple.h	(revision 182780)
+++ gimple.h	(working copy)
@@ -1099,7 +1099,7 @@ extern enum gimplify_status gimplify_exp
 extern void gimplify_type_sizes (tree, gimple_seq *);
 extern void gimplify_one_sizepos (tree *, gimple_seq *);
 extern bool gimplify_stmt (tree *, gimple_seq *);
-extern gimple gimplify_body (tree *, tree, bool);
+extern gimple gimplify_body (tree, bool);
 extern void push_gimplify_context (struct gimplify_ctx *);
 extern void pop_gimplify_context (gimple);
 extern void gimplify_and_add (tree, gimple_seq *);
Index: gimplify.c
===
--- gimplify.c	(revision 182780)
+++ gimplify.c	(working copy)
@@ -951,31 +951,33 @@ copy_if_shared_r (tree *tp, int *walk_su
 /* Unshare most of the shared trees rooted at *TP. */
 
 static inline void
-copy_if_shared (tree *tp)
+copy_if_shared (tree *tp, void *data)
 {
-  /* If the language requires deep unsharing, we need a pointer set to make
- sure we don't repeatedly unshare subtrees of unshareable nodes.  */
-  struct pointer_set_t *visited
-= lang_hooks.deep_unsharing ? pointer_set_create () : NULL;
-  walk_tree (tp, copy_if_shared_r, visited, NULL);
-  if (visited)
-pointer_set_destroy (visited);
+  walk_tree (tp, copy_if_shared_r, data, NULL);
 }
 
-/* Unshare all the trees in BODY_P, a pointer into the body of FNDECL, and the
-   bodies of any nested functions if we are unsharing the entire body of
-   FNDECL.  */
+/* Unshare all the trees in the body of FNDECL, as well as in the bodies of
+   any nested functions.  */
 
 static void
-unshare_body (tree *body_p, tree fndecl)
+unshare_body (tree fndecl)
 {
   struct cgraph_node *cgn = cgraph_get_node (fndecl);
+  /* If the language requires deep unsharing, we need a pointer set to make
+ sure we don't repeatedly unshare subtrees of unshareable nodes.  */
+  struct pointer_set_t *visited
+= lang_hooks.deep_unsharing ? pointer_set_create () : NULL;
 
-  copy_if_shared (body_p);
+  copy_if_shared (&DECL_SAVED_TREE (fndecl), visited);
+  copy_if_shared (&DECL_SIZE (DECL_RESULT (fndecl)), visited);
+  copy_if_shared (&DECL_SIZE_UNIT (DECL_RESULT (fndecl)), visited);
+
+  if (visited)
+    pointer_set_destroy (visited);
 
-  if (cgn && body_p == &DECL_SAVED_TREE (fndecl))
+  if (cgn)
     for (cgn = cgn->nested; cgn; cgn = cgn->next_nested)
-      unshare_body (&DECL_SAVED_TREE (cgn->decl), cgn->decl);
+      unshare_body (cgn->decl);
 }
 
 /* Callback for walk_tree to unmark the visited trees rooted at *TP.
@@ -1008,15 +1010,17 @@ unmark_visited (tree *tp)
 /* Likewise, but mark all trees as not visited.  */
 
 static void
-unvisit_body (tree *body_p, tree fndecl)
+unvisit_body (tree fndecl)
 {
   struct cgraph_node *cgn = cgraph_get_node (fndecl);
 
-  unmark_visited (body_p);
+  unmark_visited (&DECL_SAVED_TREE (fndecl));
+  unmark_visited (&DECL_SIZE (DECL_RESULT (fndecl)));
+  unmark_visited (&DECL_SIZE_UNIT (DECL_RESULT (fndecl)));
 
-  if (cgn && body_p == &DECL_SAVED_TREE (fndecl))
+  if (cgn)
     for (cgn = cgn->nested; cgn; cgn = cgn->next_nested)
-      unvisit_body (&DECL_SAVED_TREE (cgn->decl), cgn->decl);
+      unvisit_body (cgn->decl);
 }
 
 /* Unconditionally make an unshared copy of EXPR.

Re: Add -lssp_nonshared to LINK_SSP_SPEC

2012-01-10 Thread Tijl Coosemans

On Tuesday 10 January 2012 15:40:15 Richard Guenther wrote:
 On Tue, Jan 10, 2012 at 3:38 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Tue, Jan 10, 2012 at 3:14 PM, Tijl Coosemans t...@coosemans.org wrote:
 On targets where libc implements stack protector functions (GNU libc,
 FreeBSD libc), and where gcc (as an optimisation) generates calls to
 a locally defined __stack_chk_fail_local instead of directly calling
 the global function __stack_chk_fail (e.g. -fpic code on i386), one
 must explicitly specify -lssp_nonshared or -lc -lc_nonshared on the
 command line to statically link in __stack_chk_fail_local.

 It would be more convenient if the compiler kept the details of this
 target specific optimisation hidden by passing -lssp_nonshared to the
 linker internally.

 Here's a simple test case that shows the problem on i386-freebsd, but
 works just fine on e.g. x86_64 targets:

 % cat test.c
 int
 main( void ) {
return( 0 );
 }
 % gcc46 -o test test.c -fstack-protector-all -fPIE
 /var/tmp//ccjYQxKu.o: In function `main':
 test.c:(.text+0x37): undefined reference to `__stack_chk_fail_local'
 /usr/local/bin/ld: test: hidden symbol `__stack_chk_fail_local' isn't defined
 /usr/local/bin/ld: final link failed: Bad value
 collect2: ld returned 1 exit status


 I don't have commit access, so please commit when approved.

 Works fine for me on i?86-linux without -lssp_nonshared (which I do not
 have, so linking would fail).

So my patch would actually break Linux?

 Probably because libc.so is a linker script:
 
 /* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily.  */
 OUTPUT_FORMAT(elf64-x86-64)
 GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a  AS_NEEDED (
 /lib64/ld-linux-x86-64.so.2 ) )
 
 and /usr/lib64/libc_nonshared.a provides the symbol.  Why not fix it that
 way on *BSD?

I'll discuss it with some FreeBSD developers. Currently FreeBSD doesn't
use linker scripts anywhere. Would a FreeBSD specific version of the
patch be acceptable? For instance, the version of GCC shipped with
FreeBSD has been patched like this:
http://svnweb.freebsd.org/base/head/contrib/gcc/config/freebsd-spec.h?r1=195697&r2=195696

 2011-01-10  Tijl Coosemans  t...@coosemans.org

* gcc.c [TARGET_LIBC_PROVIDES_SSP] (LINK_SSP_SPEC): Add 
 -lssp_nonshared.

 --- gcc/gcc.c.orig
 +++ gcc/gcc.c
 @@ -602,7 +602,7 @@ proper position among the other output f

  #ifndef LINK_SSP_SPEC
  #ifdef TARGET_LIBC_PROVIDES_SSP
 -#define LINK_SSP_SPEC "%{fstack-protector:}"
 +#define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared}"
  #else
  #define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all:-lssp_nonshared -lssp}"
  #endif

Re: [PATCH] Fix ICE in distribute_notes (PR bootstrap/51796)

2012-01-10 Thread Eric Botcazou

 2012-01-10  Jakub Jelinek  ja...@redhat.com

   PR bootstrap/51796
   * combine.c (distribute_notes): If i3 is a noreturn call,
   allow old_size to be equal to args_size.

Wouldn't all the (potential) callers of fixup_args_size_notes need to do the 
same kind of scanning?  IOW, shouldn't this be fixed in fixup_args_size_notes?

-- 
Eric Botcazou

Re: [RFC, patch] libitm: Filter out undo writes that overlap with the libitm stack.

2012-01-10 Thread Richard Henderson

On 01/11/2012 12:43 AM, Torvald Riegel wrote:
 One could steal code from boehm-gc for this.
 See GC_get_stack_base in os_dep.c.
 
 Thanks for the pointer.  I looked at this code, and it seems fairly
 complex given the dependencies on OS/libc and OS/libc behavior.  From a
 maintenance point-of-view, does it make sense to copy that complexity
 into libitm?  boehm-gc is used in GCC, so perhaps that's not much of a
 problem, however.  I also looked at glibc's memcpy implementations, and
 copying those plus a simple byte-wise copy for the generic case could be
 also a fairly clean solution.
 Also, is the license compatible with the GPL wrt. mixing sources?

From the maintenance point of view, I do think it makes sense to copy.
As for the license, I expect we'd want to copy into a separate file so
that we can keep things vaguely separated.

 What about keeping the patch/hack that I posted for now, creating a PR,
 and looking at this again for another release?

I suppose that's not unreasonable.  Ok with...

 +static inline void *
 +mask_stack_bottom(gtm_thread *tx)
 +{
 +  return (uint8_t*)__builtin_dwarf_cfa() - 128;
 +}

Not only can this not be inline, it must be out-of-line.  Otherwise you're not 
including the stack frame of gtm_undolog::rollback much less memcpy.  You could 
get this result inline if you specialized for the arch by looking at the hard 
stack pointer register, but __builtin_dwarf_cfa is at the wrong end of the 
stack.

You might as well make the fudge factor a lot larger.  Like 4-8k.

 +  if (likely(ptr > top || (uint8_t*)ptr + len <=bot))

Missing space before bot.
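
For reference, the kind of range test under review can be written as a small
standalone predicate.  This is a sketch only: write_avoids_stack is a
hypothetical name, the bounds are treated as an inclusive [bot, top] range,
and addresses are passed as integers.  How bot and top are obtained is the
open question in this thread and is not addressed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true when the LEN bytes starting at PTR lie entirely outside
   the address range [BOT, TOP].  Addresses are plain integers here so
   that the comparison is well defined for unrelated objects.  */
bool
write_avoids_stack (uintptr_t ptr, size_t len, uintptr_t bot, uintptr_t top)
{
  assert (bot <= top);
  return ptr > top || ptr + len <= bot;
}
```

A write that ends exactly at bot is accepted, while one that starts at top or
straddles bot is rejected, so an undo entry is only replayed when it cannot
touch the protected region.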


r~

[google][4.6]Add new target builtin to check for amdfam15h processors (issue5535046)

2012-01-10 Thread Sriraman Tallam

This patch adds a new target builtin, __builtin_cpu_is_amdfam15, to check for 
AMD Family 15h processors.
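
As background for the family check, here is a minimal sketch of how a display
family and model can be derived from the CPUID leaf 1 EAX value.  The struct
and function names are hypothetical, and the field positions follow the usual
CPUID leaf 1 layout, which is an assumption on my part rather than something
taken from the patch.  The sample values in the note below are illustrative
signatures, not measurements.

```c
#include <stdint.h>

/* Decoded processor signature (hypothetical helper type).  */
struct cpu_signature
{
  unsigned int family;
  unsigned int model;
};

/* Decode CPUID leaf 1 EAX.  When the base family field is 0xf, the
   extended family is added to it and the extended model supplies the
   high four bits of the model.  */
struct cpu_signature
decode_cpuid_eax (uint32_t eax)
{
  struct cpu_signature sig;
  unsigned int ext_model = (eax >> 16) & 0x0f;
  unsigned int ext_family = (eax >> 20) & 0xff;

  sig.model = (eax >> 4) & 0x0f;
  sig.family = (eax >> 8) & 0x0f;
  if (sig.family == 0x0f)
    {
      sig.family += ext_family;
      sig.model += ext_model << 4;
    }
  return sig;
}
```

With this decoding, an EAX value of 0x00600F12 yields family 0x15, and
0x00100F80 yields family 0x10 with model 0x8.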

* i386-cpuinfo.c (__cpu_is_amdfam15): New member in __cpu_model struct.
(get_amd_cpu): Check for family 15h processors.
(cpu_indicator_init): Adjust model and family for AMD processors.
Refactor code.
 
* i386.c (IX86_BUILTIN_CPU_IS_AMDFAM15): New enum value.
(fold_builtin_cpu): Process IX86_BUILTIN_CPU_IS_AMDFAM15.
(ix86_init_platform_type_builtins): Make new builtin 
__builtin_cpu_is_amdfam15.
(ix86_expand_builtin): Expand IX86_BUILTIN_CPU_IS_AMDFAM15.
* testsuite/gcc.target/builtin_target.c (fn1): Call 
__builtin_cpu_is_amdfam15.

Index: libgcc/config/i386/i386-cpuinfo.c
===
--- libgcc/config/i386/i386-cpuinfo.c   (revision 183073)
+++ libgcc/config/i386/i386-cpuinfo.c   (working copy)
@@ -63,6 +63,7 @@ struct __processor_model
   unsigned int __cpu_is_amdfam10_barcelona : 1;
   unsigned int __cpu_is_amdfam10_shanghai : 1;
   unsigned int __cpu_is_amdfam10_istanbul : 1;
+  unsigned int __cpu_is_amdfam15 : 1;
 } __cpu_model;
 
 /* Get the specific type of AMD CPU.  */
@@ -72,18 +73,22 @@ get_amd_cpu (unsigned int family, unsigned int mod
 {
   switch (family)
 {
+/* AMD Family 10h.  */
 case 0x10:
   switch (model)
{
case 0x2:
+ /* Barcelona.  */
  __cpu_model.__cpu_is_amdfam10 = 1;
  __cpu_model.__cpu_is_amdfam10_barcelona = 1;
  break;
case 0x4:
+ /* Shanghai.  */
  __cpu_model.__cpu_is_amdfam10 = 1;
  __cpu_model.__cpu_is_amdfam10_shanghai = 1;
  break;
case 0x8:
+ /* Istanbul.  */
  __cpu_model.__cpu_is_amdfam10 = 1;
  __cpu_model.__cpu_is_amdfam10_istanbul = 1;
  break;
@@ -91,6 +96,10 @@ get_amd_cpu (unsigned int family, unsigned int mod
  break;
}
   break;
+/* AMD Family 15h.  */
+case 0x15:
+  __cpu_model.__cpu_is_amdfam15 = 1;
+  break;
 default:
   break;
 }
@@ -223,6 +232,7 @@ __cpu_indicator_init (void)
   int max_level = 5;
   unsigned int vendor;
   unsigned int model, family, brand_id;
+  unsigned int extended_model, extended_family;
   static int called = 0;
 
   /* This function needs to run just once.  */
@@ -247,14 +257,12 @@ __cpu_indicator_init (void)
   model = (eax >> 4) & 0x0f;
   family = (eax >> 8) & 0x0f;
   brand_id = ebx & 0xff;
+  extended_model = (eax >> 12) & 0xf0;
+  extended_family = (eax >> 20) & 0xff;
 
-  /* Adjust model and family for Intel CPUS. */
   if (vendor == SIG_INTEL)
 {
-  unsigned int extended_model, extended_family;
-
-  extended_model = (eax >> 12) & 0xf0;
-  extended_family = (eax >> 20) & 0xff;
+  /* Adjust model and family for Intel CPUS. */
   if (family == 0x0f)
{
  family += extended_family;
@@ -262,20 +270,25 @@ __cpu_indicator_init (void)
}
   else if (family == 0x06)
model += extended_model;
+
+  /* Get CPU type.  */
+  __cpu_model.__cpu_is_intel = 1;
+  get_intel_cpu (family, model, brand_id);
 }
 
-  /* Find CPU model. */
-
   if (vendor == SIG_AMD)
 {
+  /* Adjust model and family for AMD CPUS. */
+  if (family == 0x0f)
+   {
+ family += extended_family;
+ model += (extended_model << 4);
+   }
+
+  /* Get CPU type.  */
   __cpu_model.__cpu_is_amd = 1;
   get_amd_cpu (family, model);
 }
-  else if (vendor == SIG_INTEL)
-{
-  __cpu_model.__cpu_is_intel = 1;
-  get_intel_cpu (family, model, brand_id);
-}
 
   /* Find available features. */
   get_available_features (ecx, edx);
Index: gcc/testsuite/gcc.target/i386/builtin_target.c
===
--- gcc/testsuite/gcc.target/i386/builtin_target.c  (revision 183073)
+++ gcc/testsuite/gcc.target/i386/builtin_target.c  (working copy)
@@ -47,6 +47,8 @@ fn1 ()
 return -1;
   if (__builtin_cpu_is_amdfam10_istanbul () < 0)
     return -1;
+  if (__builtin_cpu_is_amdfam15 () < 0)
+return -1;
 
   return 0;
 }
Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 183073)
+++ gcc/config/i386/i386.c  (working copy)
@@ -24545,6 +24545,7 @@ enum ix86_builtins
   IX86_BUILTIN_CPU_IS_AMDFAM10_BARCELONA,
   IX86_BUILTIN_CPU_IS_AMDFAM10_SHANGHAI,
   IX86_BUILTIN_CPU_IS_AMDFAM10_ISTANBUL,
+  IX86_BUILTIN_CPU_IS_AMDFAM15,
 
   IX86_BUILTIN_MAX
 };
@@ -26028,6 +26029,7 @@ fold_builtin_cpu (enum ix86_builtins fn_code)
 M_AMDFAM10_BARCELONA,
 M_AMDFAM10_SHANGHAI,
 M_AMDFAM10_ISTANBUL,
+M_AMDFAM15,
 M_MAX
   };
 
@@ -26149,6 +26151,11 @@ fold_builtin_cpu (enum ix86_builtins fn_code)
 M_AMDFAM10_ISTANBUL);
   which_struct = __cpu_model_var;
   break;
+
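
For readers following the family/model adjustment in the i386-cpuinfo.c hunk above, here is a minimal standalone sketch of the decoding. It reuses the shifts and masks from the patch; the struct and function names are hypothetical and not part of libgcc. The extended model is already masked into the high nibble, so this sketch adds it directly, which is a choice made for the illustration rather than a copy of the patch. The EAX values in the usage note are only sample signatures.

```c
/* Hypothetical result type for this sketch; not part of libgcc.  */
struct cpu_signature
{
  unsigned int family;
  unsigned int model;
};

/* Decode family and model from CPUID leaf 1 EAX.  IS_INTEL selects the
   extra Intel family-6 adjustment; otherwise only the family-0x0f
   adjustment is applied.  */
static struct cpu_signature
decode_signature (unsigned int eax, int is_intel)
{
  struct cpu_signature sig;
  /* Bits 19:16, left in the high nibble by the 0xf0 mask.  */
  unsigned int extended_model = (eax >> 12) & 0xf0;
  /* Bits 27:20.  */
  unsigned int extended_family = (eax >> 20) & 0xff;

  sig.model = (eax >> 4) & 0x0f;
  sig.family = (eax >> 8) & 0x0f;

  if (sig.family == 0x0f)
    {
      sig.family += extended_family;
      sig.model += extended_model;
    }
  else if (is_intel && sig.family == 0x06)
    sig.model += extended_model;

  return sig;
}
```

With an EAX of 0x00600F12 and is_intel == 0 this yields family 0x15 and model 0x01; with 0x000206A7 and is_intel == 1 it yields family 0x06 and model 0x2A.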

Re: [PATCH] Fix PR49642 in 4.6, questions about 4.7

2012-01-10 Thread William J. Schmidt

On Tue, 2012-01-10 at 09:42 -0600, William J. Schmidt wrote:
 
> On Tue, 2012-01-10 at 14:53 +0100, Richard Guenther wrote:
> >
> > Btw, this will also disqualify any point below
> >
> >   if (__builtin_constant_p (...))
> >     {
> >       ...
> >     }
> >
> > because after the if join all BBs are dominated by the __builtin_constant_p
> > call.  What we want to disallow is splitting at a block that is dominated
> > by the true edge of the condition fed by the __builtin_constant_p result ...
>
> True.  What we have is:
>
>   D.1899_68 = __builtin_constant_p (D.1898_67);
>   if (D.1899_68 != 0)
>     goto <bb 3>;
>   else
>     goto <bb 133>;
>
> So I suppose we have to walk the immediate uses of the LHS of the call,
> find all that are part of a condition, and mark the target block for
> nonzero (in this case bb 3) as a forbidden dominator.  I can tighten
> this up.

Here's a revised patch for 4.6, following the above.  The same patch
applies to 4.7, if desired, optionally with an additional variation on
the test case to add -fno-tree-fre to the compile step.
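
The forbidden-dominator idea can be illustrated outside GCC with a toy dominator tree. This is only a sketch under stated assumptions: the array representation, the function name and the sample CFG are hypothetical and are not what ipa-split.c does internally. Block 2 plays the role of the true-edge target (bb 3 in the dump above), block 4 is the join after the if.

```c
#include <assert.h>
#include <stdbool.h>

#define ENTRY_BLOCK 0

/* Sample CFG: 0 = entry, 1 = block holding the condition, 2 = true-edge
   target, 3 = else target, 4 = join, 5 = block nested in the true arm.
   sample_idom[b] is the immediate dominator of block b.  */
const int sample_idom[] = { 0, 0, 1, 1, 1, 2 };
const bool sample_forbidden[] = { false, false, true, false, false, false };

/* Return true if BB is dominated by a forbidden block.  A block
   dominates itself, so BB is tested before walking up the tree.  */
static bool
dominated_by_forbidden_block (const int *idom, const bool *forbidden,
                              int n_blocks, int bb)
{
  assert (bb >= 0 && bb < n_blocks);
  for (;;)
    {
      if (forbidden[bb])
        return true;
      if (bb == ENTRY_BLOCK)
        return false;
      bb = idom[bb];
    }
}
```

In this toy CFG, blocks 2 and 5 are rejected as split points, while the else arm (3) and the join (4) stay eligible, which matches the tightening discussed above.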

Bootstrapped and regression tested on powerpc64-linux-gnu.  OK for
4.6/trunk?

Thanks,
Bill


gcc:

2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/49642
* ipa-split.c (forbidden_dominators): New variable.
(check_forbidden_calls): New function.
(dominated_by_forbidden): Likewise.
(consider_split): Check for forbidden dominators.
(execute_split_functions): Initialize and free forbidden
dominators info; call check_forbidden_calls.

gcc/testsuite:

2012-01-10  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/49642
* gcc.dg/tree-ssa/pr49642.c: New test.


Index: gcc/testsuite/gcc.dg/tree-ssa/pr49642.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr49642.c (revision 0)
@@ -0,0 +1,49 @@
+/* Verify that ipa-split is disabled following __builtin_constant_p.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+static inline __attribute__((always_inline)) __attribute__((const))
+int __ilog2_u32(u32 n)
+{
+ int bit;
+ asm ("cntlzw %0,%1" : "=r" (bit) : "r" (n));
+ return 31 - bit;
+}
+
+
+static inline __attribute__((always_inline)) __attribute__((const))
+int __ilog2_u64(u64 n)
+{
+ int bit;
+ asm ("cntlzd %0,%1" : "=r" (bit) : "r" (n));
+ return 63 - bit;
+}
+
+
+
+static u64 ehca_map_vaddr(void *caddr);
+
+struct ehca_shca {
+u32 hca_cap_mr_pgsize;
+};
+
+static u64 ehca_get_max_hwpage_size(struct ehca_shca *shca)
+{
+ return 1UL << ( __builtin_constant_p(shca->hca_cap_mr_pgsize) ? (
(shca->hca_cap_mr_pgsize) < 1 ? ilog2_NaN() : (shca->hca_cap_mr_pgsize) &
(1ULL << 63) ? 63 : (shca->hca_cap_mr_pgsize) & (1ULL << 62) ? 62 :
(shca->hca_cap_mr_pgsize) & (1ULL << 61) ? 61 : (shca->hca_cap_mr_pgsize) &
(1ULL << 60) ? 60 : (shca->hca_cap_mr_pgsize) & (1ULL << 59) ? 59 :
(shca->hca_cap_mr_pgsize) & (1ULL << 58) ? 58 : (shca->hca_cap_mr_pgsize) &
(1ULL << 57) ? 57 : (shca->hca_cap_mr_pgsize) & (1ULL << 56) ? 56 :
(shca->hca_cap_mr_pgsize) & (1ULL << 55) ? 55 : (shca->hca_cap_mr_pgsize) &
(1ULL << 54) ? 54 : (shca->hca_cap_mr_pgsize) & (1ULL << 53) ? 53 :
(shca->hca_cap_mr_pgsize) & (1ULL << 52) ? 52 : (shca->hca_cap_mr_pgsize) &
(1ULL << 51) ? 51 : (shca->hca_cap_mr_pgsize) & (1ULL << 50) ? 50 :
(shca->hca_cap_mr_pgsize) & (1ULL << 49) ? 49 : (shca->hca_cap_mr_pgsize) &
(1ULL << 48) ? 48 : (shca->hca_cap_mr_pgsize) & (1ULL << 47) ? 47 :
(shca->hca_cap_mr_pgsize) & (1ULL << 46) ? 46 : (shca->hca_cap_mr_pgsize) &
(1ULL << 45) ? 45 : (shca->hca_cap_mr_pgsize) & (1ULL << 44) ? 44 :
(shca->hca_cap_mr_pgsize) & (1ULL << 43) ? 43 : (shca->hca_cap_mr_pgsize) &
(1ULL << 42) ? 42 : (shca->hca_cap_mr_pgsize) & (1ULL << 41) ? 41 :
(shca->hca_cap_mr_pgsize) & (1ULL << 40) ? 40 : (shca->hca_cap_mr_pgsize) &
(1ULL << 39) ? 39 : (shca->hca_cap_mr_pgsize) & (1ULL << 38) ? 38 :
(shca->hca_cap_mr_pgsize) & (1ULL << 37) ? 37 : (shca->hca_cap_mr_pgsize) &
(1ULL << 36) ? 36 : (shca->hca_cap_mr_pgsize) & (1ULL << 35) ? 35 :
(shca->hca_cap_mr_pgsize) & (1ULL << 34) ? 34 : (shca->hca_cap_mr_pgsize) &
(1ULL << 33) ? 33 : (shca->hca_cap_mr_pgsize) & (1ULL << 32) ? 32 :
(shca->hca_cap_mr_pgsize) & (1ULL << 31) ? 31 : (shca->hca_cap_mr_pgsize) &
(1ULL << 30) ? 30 : (shca->hca_cap_mr_pgsize) & (1ULL << 29) ? 29 :
(shca->hca_cap_mr_pgsize) & (1ULL << 28) ? 28 : (shca->hca_cap_mr_pgsize) &
(1ULL << 27) ? 27 : (shca->hca_cap_mr_pgsize) & (1ULL << 26) ? 26 :
(shca->hca_cap_mr_pgsize) & (1ULL << 25) ? 25 : (shca->hca_cap_mr_pgsize) &
(1ULL << 24) ? 24 : (shca->hca_cap_mr_pgsize) & (1ULL << 23) ? 23 :
(shca->hca_cap_mr_pgsize) & (1ULL << 22) ? 22 : (shca->hca_cap_mr_pgsize) &
(1ULL << 21) ? 21 : (shca->hca_cap_mr_pgsize) & (1ULL << 20) ? 20 :
(shca->hca_cap_mr_pgsize) & (1ULL << 19) ? 19 : (shca->hca_cap_mr_pgsize) &
(1ULL << 18) ? 18 : (shca->hca_cap_mr_pgsize) & (1ULL << 17) ? 17 :
(shca->hca_cap_mr_pgsize) & (1ULL << 16) ? 16 : (shca->hca_cap_mr_pgsize) &
(1ULL << 15) ? 15 :

Re: [Ping] RE: CR16 Port addition

2012-01-10 Thread Richard Henderson

On 01/10/2012 11:55 PM, Jayant R. Sonar wrote:
> PING 9: For reviewing the modified CR16 port.
>
> Hello,
>
> Can some one please review the updated patch and let me know if any more
> changes are required to be done in it?
>
> Rainer had suggested few important changes last time. After making those
> changes, the modified patch was posted at following URL:
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg02625.html
>
> Reference links to the past discussions about this port are also available
> at the above mentioned URL.
>
> I am hoping that we will be able to see this port in GCC 4.7.
...
> The CR16's programming memory is 2-byte aligned and the least significant
> bit of PC is always zero. The Return Address (RA) register always saves the
> value of PC right shifted by 1 (PC >> 1). This conversion seems broken at
> some places during unwinding which results in terminate() function being
> called.

This sounds like a job for __builtin_frob_return_addr.  At present we only
have hooks for Sparc's RETURN_ADDR_OFFSET, but that could easily be changed
to perform any target-specific transformation.
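
As a rough illustration of the target-specific transformation such a hook would have to perform, here is a minimal sketch of the PC/RA conversion described in the quoted text. The helper names are hypothetical and are not taken from the CR16 port or from GCC's unwinder; the sketch also assumes addresses small enough that the left shift does not overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a byte address into the form held in RA (PC >> 1).
   Code addresses are 2-byte aligned, so bit 0 must be clear.  */
static uint32_t
cr16_pc_to_ra (uint32_t pc)
{
  assert ((pc & 1u) == 0);
  return pc >> 1;
}

/* Convert a saved RA value back into a byte address.  */
static uint32_t
cr16_ra_to_pc (uint32_t ra)
{
  return ra << 1;
}
```

The unwinder has to apply the same direction consistently: comparing an RA-encoded value against byte addresses without converting one of them is the kind of mismatch that would make the FDE lookup fail.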



Minor errors:

> +#define MUST_SAVE_REGS_P() \
> +  (flag_unwind_tables || (flag_exceptions && !UI_SJLJ))

UI_SJLJ is an enumeration constant.  You wanted to test the result of
targetm_common.except_unwind_info().  It might be better to fold this
macro into its only user.
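
To make the problem concrete: negating a nonzero enumeration constant always gives 0, so the second half of the condition can never be true. The sketch below uses a stand-in enum and hypothetical function names; in GCC the active value would come from targetm_common.except_unwind_info(), which is not modelled here.

```c
/* Stand-in for GCC's unwind info enumeration; values are illustrative.  */
enum unwind_kind { UW_NONE, UW_SJLJ, UW_DWARF2 };

/* Shape of the macro as posted: !UW_SJLJ is the constant 0, so the
   exceptions flag never contributes to the result.  */
static int
must_save_regs_as_posted (int unwind_tables, int exceptions)
{
  return unwind_tables || (exceptions && !UW_SJLJ);
}

/* Intended shape: compare the unwind kind actually in use.  */
static int
must_save_regs_checked (int unwind_tables, int exceptions,
                        enum unwind_kind active)
{
  return unwind_tables || (exceptions && active != UW_SJLJ);
}
```

With exceptions enabled, no unwind tables and DWARF2 unwinding active, the first function returns 0 and the second returns 1.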

That said, this optimization also affects debugging, in that you may
no longer be able to examine variables in stack frames higher up.
E.g. after you've stopped at a breakpoint set on abort().  Do you 
really want this at lower optimization levels?

> +  /* If -fpic option, data_model == DM_FAR.  */
> +  if (flag_pic == NEAR_PIC)
> +    {
> +      data_model = DM_FAR;
> +    }
> +
> +  /* The only option we want to examine is data model option.  */
> +  if (cr16_data_model)
> +    {
> +      if (strcmp (cr16_data_model, "medium") == 0)
> +       data_model = DM_DEFAULT;
> +      else if (strcmp (cr16_data_model, "near") == 0)
> +       data_model = DM_NEAR;
> +      else if (strcmp (cr16_data_model, "far") == 0)
> +       {
> +         if (TARGET_CR16CP)
> +           data_model = DM_FAR;
> +         else
> +           error ("data-model=far not valid for cr16c architecture");
> +       }
> +      else
> +       error ("invalid data model option -mdata-model=%s", cr16_data_model);
> +    }
> +  else
> +    data_model = DM_DEFAULT;

The first IF is ineffective because the second IF always overwrites data_model.
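
One possible reordering, sketched as a standalone function. This assumes the intent is that PIC forces the far model regardless of -mdata-model; if the intent was only to change the default, the PIC test belongs in the final else instead. The enum, the DM_INVALID value and the function name are hypothetical, and error reporting is reduced to a return value.

```c
#include <string.h>

enum data_model { DM_DEFAULT, DM_NEAR, DM_FAR, DM_INVALID };

/* OPTION is the -mdata-model= argument or NULL when it was not given.  */
static enum data_model
select_data_model (const char *option, int pic_enabled, int is_cr16cplus)
{
  enum data_model model = DM_DEFAULT;

  if (option != NULL)
    {
      if (strcmp (option, "medium") == 0)
        model = DM_DEFAULT;
      else if (strcmp (option, "near") == 0)
        model = DM_NEAR;
      else if (strcmp (option, "far") == 0)
        model = is_cr16cplus ? DM_FAR : DM_INVALID;
      else
        model = DM_INVALID;
    }

  /* Assumption: PIC overrides the selected model.  Doing this last
     means the option handling above can no longer overwrite it.  */
  if (pic_enabled && model != DM_INVALID)
    model = DM_FAR;

  return model;
}
```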

> +(define_insn "push_for_prologue"
...
> +(define_insn "pop_and_popret_return"

I'm not keen on the fact that the integral argument is totally ignored.
E.g. the post-reload pass_stack_adjustments could adjust the amount from
the value you computed in cr16_compute_frame.

> +(define_insn "set_bitmode_mem"
> +  [(set (match_operand:SHORT 0 "bit_operand" "=m")
...
> +(define_insn "clear_bitmode_mem"
> +  [(set (match_operand:SHORT 0 "bit_operand" "=m")

"+m"


As far as I'm concerned these problems could even be fixed post-commit.



r~

Re: [google][4.6]Add new target builtin to check for amdfam15h processors (issue5535046)

2012-01-10 Thread Diego Novillo


On 12-01-10 17:11 , Sriraman Tallam wrote:

This patch adds a new target builtin, __builtin_cpu_is_amdfam15, to check for 
AMD Family 15h processors.

* i386-cpuinfo.c (__cpu_is_amdfam15): New member in __cpu_model struct.
(get_amd_cpu): Check for family 15h processors.
(cpu_indicator_init): Adjust model and family for AMD processors.
Refactor code.

* i386.c (IX86_BUILTIN_CPU_IS_AMDFAM15): New enum value.
(fold_builtin_cpu): Process IX86_BUILTIN_CPU_IS_AMDFAM15.
(ix86_init_platform_type_builtins): Make new builtin 
_builtin_cpu_is_amdfam15.
(ix86_expand_builtin): Expand IX86_BUILTIN_CPU_IS_AMDFAM15.
* testsuite/gcc.target/builtin_target.c (fn1): Call 
__builtin_cpu_is_amdfam15.


Any reason why this is not applicable for upstream 4.6?


Diego.

Re: [google][4.6]Add new target builtin to check for amdfam15h processors (issue5535046)

2012-01-10 Thread Andrew Pinski

On Tue, Jan 10, 2012 at 3:31 PM, Diego Novillo dnovi...@google.com wrote:
 On 12-01-10 17:11 , Sriraman Tallam wrote:

 This patch adds a new target builtin, __builtin_cpu_is_amdfam15, to check
 for AMD Family 15h processors.

        * i386-cpuinfo.c (__cpu_is_amdfam15): New member in __cpu_model
 struct.
        (get_amd_cpu): Check for family 15h processors.
        (cpu_indicator_init): Adjust model and family for AMD processors.
        Refactor code.

        * i386.c (IX86_BUILTIN_CPU_IS_AMDFAM15): New enum value.
        (fold_builtin_cpu): Process IX86_BUILTIN_CPU_IS_AMDFAM15.
        (ix86_init_platform_type_builtins): Make new builtin
 _builtin_cpu_is_amdfam15.
        (ix86_expand_builtin): Expand IX86_BUILTIN_CPU_IS_AMDFAM15.
        * testsuite/gcc.target/builtin_target.c (fn1): Call
 __builtin_cpu_is_amdfam15.


 Any reason why this is not applicable for upstream 4.6?

Yes because the upstream GCC does not have those builtins yet.

Thanks,
Andrew Pinski

Re: [google][4.6]Add new target builtin to check for amdfam15h processors (issue5535046)

2012-01-10 Thread Sriraman Tallam

On Tue, Jan 10, 2012 at 3:33 PM, Andrew Pinski pins...@gmail.com wrote:
 On Tue, Jan 10, 2012 at 3:31 PM, Diego Novillo dnovi...@google.com wrote:
 On 12-01-10 17:11 , Sriraman Tallam wrote:

 This patch adds a new target builtin, __builtin_cpu_is_amdfam15, to check
 for AMD Family 15h processors.

        * i386-cpuinfo.c (__cpu_is_amdfam15): New member in __cpu_model
 struct.
        (get_amd_cpu): Check for family 15h processors.
        (cpu_indicator_init): Adjust model and family for AMD processors.
        Refactor code.

        * i386.c (IX86_BUILTIN_CPU_IS_AMDFAM15): New enum value.
        (fold_builtin_cpu): Process IX86_BUILTIN_CPU_IS_AMDFAM15.
        (ix86_init_platform_type_builtins): Make new builtin
 _builtin_cpu_is_amdfam15.
        (ix86_expand_builtin): Expand IX86_BUILTIN_CPU_IS_AMDFAM15.
        * testsuite/gcc.target/builtin_target.c (fn1): Call
 __builtin_cpu_is_amdfam15.


 Any reason why this is not applicable for upstream 4.6?

 Yes because the upstream GCC does not have those builtins yet.

Please see this discussion too:
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01355.html



 Thanks,
 Andrew Pinski

Re: [PATCH] Fix ICE in distribute_notes (PR bootstrap/51796)

2012-01-10 Thread Jakub Jelinek

On Tue, Jan 10, 2012 at 09:19:54PM +0100, Eric Botcazou wrote:
  2012-01-10  Jakub Jelinek  ja...@redhat.com
 
  PR bootstrap/51796
  * combine.c (distribute_notes): If i3 is a noreturn call,
  allow old_size to be equal to args_size.
 
 Wouldn't all the (potential) callers of fixup_args_size_notes need to do the 
 same kind of scanning?  IOW, shouldn't this be fixed in fixup_args_size_notes?

combine.c is the only one that asserts something on fixup_args_size_notes
result.  And for noreturn calls, it doesn't do anything wrong, the problem
is that it returns the old size for them.

Jakub

Re: [patch] Fix crash on function returning variable-sized array

2012-01-10 Thread Richard Henderson

On 01/11/2012 06:28 AM, Eric Botcazou wrote:
 2012-01-10  Eric Botcazou  ebotca...@adacore.com
 
   * gimple.h (gimplify_body): Remove first argument.
   * gimplify.c (copy_if_shared): Add DATA argument.  Do not create the
   pointer set here, instead just pass DATA to walk_tree.
   (unshare_body): Remove BODY_P argument and adjust.  Create the pointer
   set here and invoke copy_if_shared on the size trees of DECL_RESULT.
   (unvisit_body): Likewise, but with unmark_visited.
   (gimplify_body): Remove BODY_P argument and adjust.
   (gimplify_function_tree): Adjust call to gimplify_body.
   * omp-low.c (finalize_task_copyfn): Likewise.

Ok.

Nice cleanup.


r~

Re: [google][4.6]Add new target builtin to check for amdfam15h processors (issue 5535046)

2012-01-10 Thread davidxl



http://codereview.appspot.com/5535046/diff/1/gcc/config/i386/i386.c
File gcc/config/i386/i386.c (right):

http://codereview.appspot.com/5535046/diff/1/gcc/config/i386/i386.c#newcode26032
gcc/config/i386/i386.c:26032: +M_AMDFAM15,
Maybe better to change 10 to 10H, and 15 to 15H in the name as the
number is hex.

Why not split the enum for family 15h into M_AMDFAM15H_BDVER1 and
.._BDVER2?

http://codereview.appspot.com/5535046/diff/1/libgcc/config/i386/i386-cpuinfo.c
File libgcc/config/i386/i386-cpuinfo.c (right):

http://codereview.appspot.com/5535046/diff/1/libgcc/config/i386/i386-cpuinfo.c#newcode102
libgcc/config/i386/i386-cpuinfo.c:102: break;
No family15h model encoding?

http://codereview.appspot.com/5535046/

Re: C++ PATCH for c++/51433 (constexpr caching)

2012-01-10 Thread Gabriel Dos Reis

On Tue, Jan 10, 2012 at 8:33 AM, Jason Merrill ja...@redhat.com wrote:
 Sometimes an expression that is non-constant at one point in the translation
 unit can become constant later, when a constexpr function is defined.  So
 let's not cache failure.

 Tested x86_64-pc-linux-gnu, applying to trunk.  This isn't a regression, but
 as before, since C++11 support is still experimental I consider patches that
 fix C++11 bugs and don't affect C++98 code to be acceptable.

Thanks!
with hindsight, it is obvious :-)

-- Gaby

Re: [patch] Fix crash on function returning variable-sized array

2012-01-10 Thread Geert Bosch


On Jan 10, 2012, at 14:28, Eric Botcazou wrote:

 2012-01-10  Eric Botcazou  ebotca...@adacore.com
 
   * gimple.h (gimplify_body): Remove first argument.
   * gimplify.c (copy_if_shared): Add DATA argument.  Do not create the
   pointer set here, instead just pass DATA to walk_tree.

The new void *data pointer could use a comment on what it is and how it's used.

  -Geert

C++ PATCH for c++/51614 (ICE with ambiguous base)

2012-01-10 Thread Jason Merrill

Here build_base_path expects that the binfo argument will designate a 
subobject of the expression argument, but that isn't the case here 
because the base is ambiguous.  So let's complain about that instead of 
aborting.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 640c9c1f2824490323a8deb32170379ffeb2c399
Author: Jason Merrill ja...@redhat.com
Date:   Tue Jan 10 14:45:48 2012 -0500

	PR c++/51614
	* class.c (build_base_path): Diagnose ambiguous base.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 79686a2..58c89d3 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -266,10 +266,25 @@ build_base_path (enum tree_code code,
   if (want_pointer)
 probe = TYPE_MAIN_VARIANT (TREE_TYPE (probe));
 
+  if (code == PLUS_EXPR
+      && !SAME_BINFO_TYPE_P (BINFO_TYPE (d_binfo), probe))
+{
+  /* This can happen when adjust_result_of_qualified_name_lookup can't
+	 find a unique base binfo in a call to a member function.  We
+	 couldn't give the diagnostic then since we might have been calling
+	 a static member function, so we do it now.  */
+  if (complain & tf_error)
+	{
+	  tree base = lookup_base (probe, BINFO_TYPE (d_binfo),
+   ba_unique, NULL);
+	  gcc_assert (base == error_mark_node);
+	}
+  return error_mark_node;
+}
+
   gcc_assert ((code == MINUS_EXPR
 	&& SAME_BINFO_TYPE_P (BINFO_TYPE (binfo), probe))
-	  || (code == PLUS_EXPR
-		  && SAME_BINFO_TYPE_P (BINFO_TYPE (d_binfo), probe)));
+	  || code == PLUS_EXPR);
 
   if (binfo == d_binfo)
 /* Nothing to do.  */
diff --git a/gcc/testsuite/g++.dg/inherit/ambig1.C b/gcc/testsuite/g++.dg/inherit/ambig1.C
new file mode 100644
index 000..3596bb5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/inherit/ambig1.C
@@ -0,0 +1,14 @@
+// PR c++/51614
+
+struct A
+{
+  void foo();
+};
+
+struct B : A {};
+struct C : A {};
+
+struct D : B, C
+{
+  D() { A::foo(); }		// { dg-error "ambiguous" }
+};

Go patch committed: Use backend interface for type size and align

2012-01-10 Thread Ian Lance Taylor

This patch to the Go frontend changes it to use the backend interface to
determine type size and alignment information.  This is a preliminary to
a patch adjusting the handling of struct comparison, which will follow.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.
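
As a loose analogy for the three kinds of queries the new backend functions answer (size, alignment, field offset), here is how the same layout facts look for a plain C struct. This is not Go frontend or GCC backend code; the struct and helper names are made up for the illustration, and C11 is assumed for _Alignof.

```c
#include <assert.h>
#include <stddef.h>

struct sample
{
  char tag;
  int value;
  double weight;
};

/* Byte offsets of the fields, in declaration order.  */
static const size_t sample_offsets[] =
{
  offsetof (struct sample, tag),
  offsetof (struct sample, value),
  offsetof (struct sample, weight)
};

static size_t
sample_size (void)
{
  return sizeof (struct sample);
}

static size_t
sample_alignment (void)
{
  return _Alignof (struct sample);
}

/* Offset of field INDEX, counted from zero.  */
static size_t
sample_field_offset (size_t index)
{
  assert (index < sizeof sample_offsets / sizeof sample_offsets[0]);
  return sample_offsets[index];
}
```

The exact numbers depend on the target ABI, which is why the frontend asks the backend instead of computing them itself.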

Ian


2012-01-10  Ian Lance Taylor  i...@google.com

* go-gcc.cc (Gcc_backend::type_size): New function.
(Gcc_backend::type_alignment): New function.
(Gcc_backend::type_field_alignment): New function.
(Gcc_backend::type_field_offset): New function.
* go-backend.c (go_type_alignment): Remove.
* go-c.h (go_type_alignment): Don't declare.


Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 182698)
+++ gcc/go/go-gcc.cc	(working copy)
@@ -1,5 +1,5 @@
 // go-gcc.cc -- Go frontend to gcc IR.
-// Copyright (C) 2011 Free Software Foundation, Inc.
+// Copyright (C) 2011, 2012 Free Software Foundation, Inc.
 // Contributed by Ian Lance Taylor, Google.
 
 // This file is part of GCC.
@@ -195,6 +195,18 @@ class Gcc_backend : public Backend
   bool
   is_circular_pointer_type(Btype*);
 
+  size_t
+  type_size(Btype*);
+
+  size_t
+  type_alignment(Btype*);
+
+  size_t
+  type_field_alignment(Btype*);
+
+  size_t
+  type_field_offset(Btype*, size_t index);
+
   // Expressions.
 
   Bexpression*
@@ -755,6 +767,56 @@ Gcc_backend::is_circular_pointer_type(Bt
   return btype->get_tree() == ptr_type_node;
 }
 
+// Return the size of a type.
+
+size_t
+Gcc_backend::type_size(Btype* btype)
+{
+  tree t = TYPE_SIZE_UNIT(btype->get_tree());
+  gcc_assert(TREE_CODE(t) == INTEGER_CST);
+  gcc_assert(TREE_INT_CST_HIGH(t) == 0);
+  unsigned HOST_WIDE_INT val_wide = TREE_INT_CST_LOW(t);
+  size_t ret = static_cast<size_t>(val_wide);
+  gcc_assert(ret == val_wide);
+  return ret;
+}
+
+// Return the alignment of a type.
+
+size_t
+Gcc_backend::type_alignment(Btype* btype)
+{
+  return TYPE_ALIGN_UNIT(btype->get_tree());
+}
+
+// Return the alignment of a struct field of type BTYPE.
+
+size_t
+Gcc_backend::type_field_alignment(Btype* btype)
+{
+  return go_field_alignment(btype->get_tree());
+}
+
+// Return the offset of a field in a struct.
+
+size_t
+Gcc_backend::type_field_offset(Btype* btype, size_t index)
+{
+  tree struct_tree = btype->get_tree();
+  gcc_assert(TREE_CODE(struct_tree) == RECORD_TYPE);
+  tree field = TYPE_FIELDS(struct_tree);
+  for (; index > 0; --index)
+    {
+      field = DECL_CHAIN(field);
+      gcc_assert(field != NULL_TREE);
+    }
+  HOST_WIDE_INT offset_wide = int_byte_position(field);
+  gcc_assert(offset_wide >= 0);
+  size_t ret = static_cast<size_t>(offset_wide);
+  gcc_assert(ret == static_cast<unsigned HOST_WIDE_INT>(offset_wide));
+  return ret;
+}
+
 // Return the zero value for a type.
 
 Bexpression*
Index: gcc/go/gofrontend/types.h
===
--- gcc/go/gofrontend/types.h	(revision 182971)
+++ gcc/go/gofrontend/types.h	(working copy)
@@ -861,6 +861,27 @@ class Type
   std::string
   mangled_name(Gogo*) const;
 
+  // If the size of the type can be determined, set *PSIZE to the size
+  // in bytes and return true.  Otherwise, return false.  This queries
+  // the backend.
+  bool
+  backend_type_size(Gogo*, unsigned int* psize);
+
+  // If the alignment of the type can be determined, set *PALIGN to
+  // the alignment in bytes and return true.  Otherwise, return false.
+  bool
+  backend_type_align(Gogo*, unsigned int* palign);
+
+  // If the alignment of a struct field of this type can be
+  // determined, set *PALIGN to the alignment in bytes and return
+  // true.  Otherwise, return false.
+  bool
+  backend_type_field_align(Gogo*, unsigned int* palign);
+
+  // Whether the backend size is known.
+  bool
+  is_backend_type_size_known(Gogo*) const;
+
   // Get the hash and equality functions for a type.
   void
   type_functions(Gogo*, Named_type* name, Function_type* hash_fntype,
@@ -2013,6 +2034,12 @@ class Struct_type : public Type
   traverse_field_types(Traverse* traverse)
   { return this->do_traverse(traverse); }
 
+  // If the offset of field INDEX in the backend implementation can be
+  // determined, set *POFFSET to the offset in bytes and return true.
+  // Otherwise, return false.
+  bool
+  backend_field_offset(Gogo*, unsigned int index, unsigned int* poffset);
+
   // Import a struct type.
   static Struct_type*
   do_import(Import*);
@@ -2507,8 +2534,9 @@ class Named_type : public Type
   local_methods_(NULL), all_methods_(NULL),
   interface_method_tables_(NULL), pointer_interface_method_tables_(NULL),
   location_(location), named_btype_(NULL), dependencies_(),
-  is_visible_(true), is_error_(false), is_converted_(false),
-  is_circular_(false), seen_(false), seen_in_get_backend_(false)
+  is_visible_(true), is_error_(false), is_placeholder_(false),
+  is_converted_(false),

[ping 3] [patch] attribute to reverse bitfield allocations

2012-01-10 Thread DJ Delorie


Ping 3?  It's been months with no feedback...

 Ping 2 ?
 
 http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01889.html
 http://gcc.gnu.org/ml/gcc-patches/2011-07/msg02555.html

[Patch libfortran] PR 51803 getcwd() failure

2012-01-10 Thread Janne Blomqvist

Hi,

I committed the attached patch as obvious to trunk after the RM
considered it OK in the PR.
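
For anyone who wants to see the fallback in isolation, here is a minimal sketch of the same idea as a standalone helper. The function name is hypothetical and this is not the libgfortran code; it only assumes a POSIX getcwd.

```c
#include <stddef.h>
#include <unistd.h>

/* Return the current working directory stored in BUF, or "." when
   getcwd fails (for example because BUF is too small or the
   directory is not accessible).  */
static const char *
cwd_or_dot (char *buf, size_t size)
{
  const char *cwd = getcwd (buf, size);
  return cwd != NULL ? cwd : ".";
}
```

A one-byte buffer is an easy way to exercise the failure path, since even "/" needs two bytes including the terminator.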

Index: runtime/main.c
===
--- runtime/main.c  (revision 183089)
+++ runtime/main.c  (working copy)
@@ -116,8 +116,10 @@ store_exe_path (const char * argv0)
   memset (buf, 0, sizeof (buf));
 #ifdef HAVE_GETCWD
   cwd = getcwd (buf, sizeof (buf));
+  if (!cwd)
+    cwd = ".";
 #else
-  cwd = "";
+  cwd = ".";
 #endif

   /* exe_path will be cwd + "/" + argv[0] + "\0".  This will not work
Index: ChangeLog
===
--- ChangeLog   (revision 183089)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2012-01-11  Janne Blomqvist  j...@gcc.gnu.org
+Mike Stump  mikest...@comcast.net
+   PR libfortran/51803
+   * runtime/main.c (store_exe_path): Handle getcwd failure and lack
+   of the function better.
+
 2012-01-10  Tobias Burnus  bur...@net-b.de

PR fortran/51197


-- 
Janne Blomqvist
