Re: PR 53889: Add __gthread_recursive_mutex_destroy

2012-10-02 Thread Ian Lance Taylor
On Mon, Oct 1, 2012 at 5:46 PM, Jonathan Wakely jwakely@gmail.com wrote:

 static inline int
 __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
* UNUSED(__mutex))
 {
   return 0;
 }

 Is that indentation right?  (the asterisk is in the same column as the
 parameter type in a fixed-width font.)

When I see a single parameter that pushes past 80 columns, I normally
start a new line after the left parenthesis and indent the next line 4
spaces.  E.g.:

static inline int
__gthread_recursive_mutex_destroy (
__gthread_recursive_mutex_t * UNUSED(__mutex))

But I don't think there is any solid standard for this.

 PR other/53889
 * gthr.h (__gthread_recursive_mutex_destroy): Document new required
 function.
 * gthr-posix.h (__gthread_recursive_mutex_destroy): Define.
 * gthr-single.h (__gthread_recursive_mutex_destroy): Likewise.
 * config/gthr-rtems.h (__gthread_recursive_mutex_destroy): Likewise.
 * config/gthr-vxworks.h (__gthread_recursive_mutex_destroy): Likewise.
 * config/i386/gthr-win32.h (__gthread_recursive_mutex_destroy):
 Likewise.
 * config/mips/gthr-mipssde.h (__gthread_recursive_mutex_destroy):
 Likewise.
 * config/pa/gthr-dce.h (__gthread_recursive_mutex_destroy): Likewise.
 * config/s390/gthr-tpf.h (__gthread_recursive_mutex_destroy): 
 Likewise.

The libgcc part of this is OK.

Thanks.

Ian


Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov vmaka...@redhat.com wrote:
   My experience shows that these lists usually have 1-2 elements, although in
 this case there are pseudos with a huge number of elements (hundreds).  I tried
 -fweb for this test because it can decrease the number of elements, but GCC (I
 don't know which pass) scales even worse: after 20 minutes of waiting, when
 virtual memory had reached 20GB, I stopped it.

Ouch :-)

The webizer itself never even runs, the compiler blows up somewhere
during the df_analyze call from web_main. The issue here is probably
in the DF_UD_CHAIN problem or in the DF_RD problem.

Ciao!
Steven
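
A back-of-the-envelope sketch of why the chain problems can blow up on such input, assuming the worst case where every definition of a pseudo reaches every use of it.  This is not taken from GCC's df code; worst_case_ud_chain_entries is a name made up for the illustration.

```c
#include <assert.h>

/* Worst case for one pseudo: every definition reaches every use, so a
   use-def chain problem needs one entry per (use, def) pair.  */
static unsigned long
worst_case_ud_chain_entries (unsigned long n_defs, unsigned long n_uses)
{
  return n_defs * n_uses;
}
```

With the usual 1-2 elements per list this stays tiny, while a pseudo with hundreds of definitions and uses already needs tens of thousands of entries on its own.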


[PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 12:58 AM, Ian Lance Taylor i...@google.com wrote:

 Without -fasynchronous-unwind-tables, FDE is not generated for
 backtrace_full and backtrace_simple wrappers. Without FDE, unwinding
 terminates at these functions.

 I'm not opposed to -fasynchronous-unwind-tables, but now that you
 bring it up I'm fairly certain that it would suffice to use
 -funwind-tables.  I've been testing mainly on x86_64, and I forgot
 that on x86_64 -funwind-tables is the default.  Sorry about that.  And
 -fasynchronous-unwind-tables is the default also, so I could be wrong
 that -funwind-tables is all that is needed.

Yes, you are correct. -funwind-tables works as well.

 Attached patch fixes this problem by adding
 -fasynchronous-unwind-tables, and this way forcing FDEs for all
 functions. With this change, btest passes OK, failing log and
 runtime/pprof from libgo testsuite also pass OK.

 This is basically fine but libbacktrace may be compiled by the host
 compiler and that may not be GCC, so please add a configure test to
 see if the compiler accepts the -fasynchronous-unwind-tables option.

I have simplified the check for -funwind-tables to just test whether the
library is compiled with GCC.  This option is supported by gcc-2.96
(and probably by earlier versions too).

2012-10-02  Uros Bizjak  ubiz...@gmail.com

PR other/54761
* configure.ac (CFLAGS): Add -funwind-tables when compiling with GCC.
* configure: Regenerate.

The patch is re-tested on x86_64-linux-gnu and alphaev68-linux-gnu.

OK for mainline?

Uros.
Index: configure
===
--- configure   (revision 191953)
+++ configure   (working copy)
@@ -4872,8 +4872,12 @@
 
 
 
+if test "x$GCC" = "xyes"; then
+  CFLAGS="$CFLAGS -funwind-tables"
+fi
 
 
+
 if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args.
 set dummy ${ac_tool_prefix}ranlib; ac_word=$2
@@ -11080,7 +11084,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11083 "configure"
+#line 11087 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11186,7 +11190,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11189 "configure"
+#line 11193 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: configure.ac
===
--- configure.ac(revision 191953)
+++ configure.ac(working copy)
@@ -66,6 +66,10 @@
 AC_PROG_CC
 m4_rename_force([backtrace_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])
 
+if test "x$GCC" = "xyes"; then
+  CFLAGS="$CFLAGS -funwind-tables"
+fi
+
 AC_SUBST(CFLAGS)
 
 AC_PROG_RANLIB
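
As a rough illustration of why the FDEs matter: a backtrace is produced by asking the unwinder to step through the frames, and a frame without unwind information ends the walk.  The sketch below is not libbacktrace code; it only counts frames using _Unwind_Backtrace and _Unwind_GetIP from GCC's <unwind.h>.  The names count_frame and count_stack_frames are made up for the example.

```c
#include <assert.h>
#include <unwind.h>

/* Callback invoked by the unwinder once per frame it can step through.  */
static _Unwind_Reason_Code
count_frame (struct _Unwind_Context *context, void *data)
{
  int *count = data;

  /* A real backtracer would record this PC; here it is only read.  */
  (void) _Unwind_GetIP (context);
  ++*count;
  return _URC_NO_REASON;
}

/* Walk the stack outwards from this function and return how many frames
   the unwinder managed to visit.  */
__attribute__ ((noinline)) static int
count_stack_frames (void)
{
  int count = 0;

  _Unwind_Backtrace (count_frame, &count);
  return count;
}
```

If one of the functions on the stack was compiled without unwind tables, the count stops short at that function, which matches the truncated backtraces described in this message.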


[PATCH, libitm]: A couple of trivial x86 changes

2012-10-02 Thread Uros Bizjak
Hello!

2012-10-02  Uros Bizjak  ubiz...@gmail.com

* config/x86/target.h (struct gtm_jmpbuf): Merge x86_64
and ia32 declarations some more.
* config/x86/sjlj.S (_ITM_beginTransaction): Move ret to common code.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN.

Uros.
Index: config/x86/sjlj.S
===
--- config/x86/sjlj.S   (revision 191953)
+++ config/x86/sjlj.S   (working copy)
@@ -74,7 +74,6 @@
 	call	SYM(GTM_begin_transaction)
 	addq	$56, %rsp
 	cfi_def_cfa_offset(8)
-	ret
 #else
 	leal	4(%esp), %ecx
 	movl	4(%esp), %eax
@@ -99,8 +98,8 @@
 #endif
 	addl	$28, %esp
 	cfi_def_cfa_offset(4)
-	ret
 #endif
+	ret
 	cfi_endproc
 
 	TYPE(_ITM_beginTransaction)
Index: config/x86/target.h
===
--- config/x86/target.h (revision 191953)
+++ config/x86/target.h (working copy)
@@ -24,11 +24,11 @@
 
 namespace GTM HIDDEN {
 
-#ifdef __x86_64__
 /* ??? This doesn't work for Win64.  */
 typedef struct gtm_jmpbuf
 {
   void *cfa;
+#ifdef __x86_64__
   unsigned long long rbx;
   unsigned long long rbp;
   unsigned long long r12;
@@ -36,18 +36,14 @@
   unsigned long long r14;
   unsigned long long r15;
   unsigned long long rip;
-} gtm_jmpbuf;
 #else
-typedef struct gtm_jmpbuf
-{
-  void *cfa;
   unsigned long ebx;
   unsigned long esi;
   unsigned long edi;
   unsigned long ebp;
   unsigned long eip;
-} gtm_jmpbuf;
 #endif
+} gtm_jmpbuf;
 
 /* x86 doesn't require strict alignment for the basic types.  */
 #define STRICT_ALIGNMENT 0
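
For readers skimming the diff, the shape of the merged declaration can be sketched as below: one struct with the common member first and only the register fields under the preprocessor conditional.  This is an abbreviated illustration, not the libitm header; demo_jmpbuf is a made-up name and only a few of the register fields are shown.

```c
#include <assert.h>
#include <stddef.h>

/* One declaration for both ABIs; only the saved registers differ.  */
typedef struct demo_jmpbuf
{
  void *cfa;                    /* common to x86_64 and ia32 */
#ifdef __x86_64__
  unsigned long long rbx;
  unsigned long long rbp;
  unsigned long long rip;
#else
  unsigned long ebx;
  unsigned long ebp;
  unsigned long eip;
#endif
} demo_jmpbuf;
```

Keeping cfa outside the conditional means code that only touches the common member does not need its own #ifdef.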


[Ada] Avoid unnecessary use of Bignums for ELIMINATED mode

2012-10-02 Thread Arnaud Charlet
Previously there were cases where the result of an operator was
converted to Bignum, only to be immediately converted back to
Long_Long_Integer with an overflow check. This patch removes
this unnecessary inefficiency.

The following program:

procedure toplevov
  (a : in out long_long_integer;
   b : long_long_integer)
is
begin
   a := b * b;
end;

now generates the following output when compiled
with -gnatG -gnato3:

procedure toplevov
 (a : in out long_long_integer;
  b : long_long_integer) is
begin
   a := long_long_integer(b) {*} long_long_integer(b);
   return;
end toplevov;

Previously it generated Bignum operations.
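
A loose C analogue of the cheaper path, for readers who do not follow Ada: when the result has to fit the 64-bit type anyway, the product can be computed in that type with a single overflow check instead of going through arbitrary precision and converting back.  This is only a sketch of the idea, not what GNAT emits; checked_square is a made-up name, and __builtin_mul_overflow is the GCC/Clang builtin.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Compute b * b directly in long long.  Returns true and stores the
   product when it fits, false when the multiplication overflows.  */
static bool
checked_square (long long b, long long *result)
{
  return !__builtin_mul_overflow (b, b, result);
}
```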

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

* checks.ads, exp_ch4.adb, checks.adb
(Minimize_Eliminate_Overflow_Checks): Add Top_Level parameter to avoid
unnecessary conversions to Bignum.
Minor reformatting.

Index: checks.adb
===
--- checks.adb  (revision 191921)
+++ checks.adb  (working copy)
@@ -1113,8 +1113,11 @@
 
   --  Otherwise, we have a top level arithmetic operator node, and this
   --  is where we commence the special processing for minimize/eliminate.
+  --  This is the case where we tell the machinery not to move into Bignum
+  --  mode at this top level (of course the top level operation will still
+  --  be in Bignum mode if either of its operands are of type Bignum).
 
-  Minimize_Eliminate_Overflow_Checks (Op, Lo, Hi);
+  Minimize_Eliminate_Overflow_Checks (Op, Lo, Hi, Top_Level => True);
 
   --  That call may but does not necessarily change the result type of Op.
   --  It is the job of this routine to undo such changes, so that at the
@@ -2333,23 +2336,24 @@
 Error_Msg_N
   ("\this will result in infinite recursion?", Parent (N));
 Insert_Action (N,
-   Make_Raise_Storage_Error
- (Sloc (N), Reason => SE_Infinite_Recursion));
+  Make_Raise_Storage_Error (Sloc (N),
+Reason => SE_Infinite_Recursion));
 
+ --  Here for normal case of predicate active.
+
  else
-
 --  If the predicate is a static predicate and the operand is
 --  static, the predicate must be evaluated statically. If the
 --  evaluation fails this is a static constraint error.
 
 if Is_OK_Static_Expression (N) then
-   if  Present (Static_Predicate (Typ)) then
+   if Present (Static_Predicate (Typ)) then
   if Eval_Static_Predicate_Check (N, Typ) then
  return;
   else
  Error_Msg_NE
(static expression fails static predicate check on,
-  N, Typ);
+N, Typ);
   end if;
end if;
 end if;
@@ -6549,9 +6553,10 @@

 
procedure Minimize_Eliminate_Overflow_Checks
- (N  : Node_Id;
-  Lo : out Uint;
-  Hi : out Uint)
+ (N : Node_Id;
+  Lo: out Uint;
+  Hi: out Uint;
+  Top_Level : Boolean)
is
   pragma Assert (Is_Signed_Integer_Type (Etype (N)));
 
@@ -6578,6 +6583,11 @@
   OK : Boolean;
   --  Used in call to Determine_Range
 
+  Bignum_Operands : Boolean;
+  --  Set True if one or more operands is already of type Bignum, meaning
+  --  that for sure (regardless of Top_Level setting) we are committed to
+  --  doing the operation in Bignum mode.
+
   procedure Max (A : in out Uint; B : Uint);
   --  If A is No_Uint, sets A to B, else to UI_Max (A, B);
 
@@ -6609,7 +6619,7 @@
--  Start of processing for Minimize_Eliminate_Overflow_Checks
 
begin
-  --  Case where we do not have an arithmetic operator.
+  --  Case where we do not have an arithmetic operator
 
   if not Is_Signed_Integer_Arithmetic_Op (N) then
 
@@ -6638,10 +6648,12 @@
   --  that lies below us!)
 
   else
- Minimize_Eliminate_Overflow_Checks (Right_Opnd (N), Rlo, Rhi);
+ Minimize_Eliminate_Overflow_Checks
+   (Right_Opnd (N), Rlo, Rhi, Top_Level => False);
 
  if Binary then
-Minimize_Eliminate_Overflow_Checks (Left_Opnd (N), Llo, Lhi);
+Minimize_Eliminate_Overflow_Checks
+  (Left_Opnd (N), Llo, Lhi, Top_Level => False);
  end if;
   end if;
 
@@ -6650,10 +6662,13 @@
   if Rlo = No_Uint or else (Binary and then Llo = No_Uint) then
  Lo := No_Uint;
  Hi := No_Uint;
+ Bignum_Operands := True;
 
   --  Otherwise compute result range
 
   else
+ Bignum_Operands := False;
+
  case Nkind (N) is
 
 --  Absolute value
@@ -7007,15 +7022,34 @@
 
   if Lo = 

Re: PR 53889: Add __gthread_recursive_mutex_destroy

2012-10-02 Thread Jakub Jelinek
On Mon, Oct 01, 2012 at 11:02:27PM -0700, Ian Lance Taylor wrote:
 On Mon, Oct 1, 2012 at 5:46 PM, Jonathan Wakely jwakely@gmail.com wrote:
 
  static inline int
  __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
 * UNUSED(__mutex))
  {
return 0;
  }
 
  Is that indentation right?  (the asterisk is in the same column as the
  parameter type in a fixed-width font.)
 
 When I see a single parameter that pushes past 80 columns, I normally
 start a new line after the left parenthesis and indent the next line 4
 spaces.  E.g.:
 
 static inline int
 __gthread_recursive_mutex_destroy (
 __gthread_recursive_mutex_t * UNUSED(__mutex))
 
 But I don't think there is any solid standard for this.

I believe the GNU coding standard way (as shown e.g. by what indent does by
default) is to split the single argument onto multiple lines if that still
fits (i.e.
static inline int
__gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
   * UNUSED(__mutex))
{
  return 0;
}
should be fine), and if even that wouldn't fit, then place ( on the
following line indented by two spaces:
int
foo123456789012345678901234567890123456789012345678901234567890123456789012
  (int x, int y)
{
  return x + y;
}

I have never seen ( at the end of a line in GNU code and find it ugly, but
sure that is a bikeshed thing.

Jakub
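
For reference, a small compilable sketch of the first layout described above, with the continuation line aligned just after the opening parenthesis.  The definitions are stand-ins: DEMO_UNUSED and demo_recursive_mutex_t are hypothetical names for illustration, not the ones from libgcc's gthr headers, and the attribute syntax assumes GCC or a compatible compiler.

```c
#include <assert.h>

/* Hypothetical stand-ins; the real UNUSED macro and mutex type live in
   libgcc's gthr headers.  */
#define DEMO_UNUSED(x) x __attribute__ ((__unused__))
typedef int demo_recursive_mutex_t;

/* The single parameter is split over two lines and the continuation is
   aligned under the first character after the opening parenthesis.  */
static inline int
demo_recursive_mutex_destroy (demo_recursive_mutex_t
                              * DEMO_UNUSED(mutex))
{
  return 0;
}
```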


[Ada] Ada 2012 invariant checks on access values and components

2012-10-02 Thread Arnaud Charlet
This patch completes the generation of invariant checks, for the case of
return values or in-out parameters that involve access types whose designated
type has invariants.

Executing:

  gnatmake -q -gnat12 -gnata main
  main

must yield:

 1
TEST 0
 1
TEST 1
 2
TEST 2
 2
TEST 3
TEST 4
 3
 4
TEST 5
 3
 4
END

---
with P; use P;
with Ada.Text_IO; use Ada.Text_IO;

procedure Main is
   O : T;   --  value = 1
   V : T_Access := new T;   --  value = 2
   W : aliased X;
begin
   W.V1 := new T;   --  value = 3
   W.V2 := new T;   --  value = 4

   Put_Line ("TEST 0");
   Test_0 (O);

   Put_Line ("TEST 1");
   Test_1 (V);

   Put_Line ("TEST 2");
   Test_2 (V);

   Put_Line ("TEST 3");
   Test_3 (W);

   Put_Line ("TEST 4");
   Test_4 (W);

   Put_Line ("TEST 5");
   Test_5 (W'Access);

   Put_Line ("END");
end Main;
---
package P is

   type T is private
   with Type_Invariant => Check (T);

   type T_Access is access all T;

   type X is record
  V1 : access T;
  V2 : T_Access;
   end record;

   function Make (X : integer) return T;
   function Make (X : integer) return access T;

   procedure Test_0 (Obj : in out T);

   function Check (O : T) return Boolean;

   procedure Test_1 (V : access T);

   procedure Test_2 (V : T_Access);

   procedure Test_3 (V : X);

   procedure Test_4 (V : in out X);

   procedure Test_5 (V : access X);

private
   Counter : Integer := 0;
   function Incr return Integer;

   type T is record
  Value : Integer := Incr;
   end record;

end P;
---
with Ada.Text_IO; use Ada.Text_IO;
package body P is
   function Incr return Integer is
   begin
  Counter := Counter + 1;
  return Counter;
   end;

   Root : aliased T := (others => 15);

   function Check (O : T) return Boolean is
   begin
  Put_Line (Integer'Image (O.Value));
  return True;
   end Check;

   function Make (X : Integer) return T is
   begin
      return (Value => X);
   end;

   function Make (X : Integer) return access T is
   begin
  return Root'access;
   end;

   procedure Test_0 (Obj : in out T) is
   begin
  null;
   end;

   procedure Test_1 (V : access T) is
   begin
  null;
   end Test_1;

   procedure Test_2 (V : T_Access) is
   begin
  null;
   end Test_2;

   procedure Test_3 (V : X) is
   begin
  null;
   end Test_3;

   procedure Test_4 (V : in out X) is
   begin
  null;
   end Test_4;

   procedure Test_5 (V : access X) is
   begin
  null;
   end Test_5;

end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Ed Schonberg  schonb...@adacore.com

* sem_ch6.adb (Process_PPCs): Generate invariant checks for a
return value whose type is an access type and whose designated
type has invariants. Ditto for in-out parameters and in-parameters
of an access type.
* exp_ch3.adb (Build_Component_Invariant_Call): Add invariant check
for an access component whose designated type has invariants.
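
The visible part of the patch below suggests that the generated check is guarded by a null test on the access value before the designated object is checked.  A loose C analogue of that shape, under that assumption, could look as follows.  All names here (demo_t, demo_check_invariant, demo_check_access_invariant) are made up for the illustration and do not correspond to compiler internals.

```c
#include <assert.h>
#include <stddef.h>

struct demo_t
{
  int value;
};

/* Counts how often the invariant was evaluated, so the effect is visible.  */
static int demo_invariant_calls;

/* Stand-in for the invariant procedure of the designated type.  */
static void
demo_check_invariant (const struct demo_t *obj)
{
  (void) obj;
  ++demo_invariant_calls;
}

/* Check the designated object only when the access value is not null.  */
static void
demo_check_access_invariant (const struct demo_t *ptr)
{
  if (ptr != NULL)
    demo_check_invariant (ptr);
}
```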

Index: sem_ch6.adb
===
--- sem_ch6.adb (revision 191911)
+++ sem_ch6.adb (working copy)
@@ -11078,6 +11078,12 @@
   Plist : List_Id := No_List;
   --  List of generated postconditions
 
+  procedure Check_Access_Invariants (E : Entity_Id);
+  --  If the subprogram returns an access to a type with invariants, or
+  --  has access parameters whose designated type has an invariant, then
+  --  under the same visibility conditions as for other invariant checks,
+  --  the type invariant must be applied to the returned value.
+
   function Grab_CC return Node_Id;
   --  Prag contains an analyzed contract case pragma. This function copies
   --  relevant components of the pragma, creates the corresponding Check
@@ -11108,6 +4,43 @@
   --  that an invariant check is required (for an IN OUT parameter, or
   --  the returned value of a function.
 
+  -----------------------------
+  -- Check_Access_Invariants --
+  -----------------------------
+
+  procedure Check_Access_Invariants (E : Entity_Id) is
+ Call : Node_Id;
+ Obj  : Node_Id;
+ Typ  : Entity_Id;
+
+  begin
+ if Is_Access_Type (Etype (E))
+   and then not Is_Access_Constant (Etype (E))
+ then
+Typ := Designated_Type (Etype (E));
+
+if Has_Invariants (Typ)
+  and then Present (Invariant_Procedure (Typ))
+  and then Is_Public_Subprogram_For (Typ)
+then
+   Obj :=
+ Make_Explicit_Dereference (Loc,
+   Prefix => New_Occurrence_Of (E, Loc));
+   Set_Etype (Obj, Typ);
+
+   Call := Make_Invariant_Call (Obj);
+
+   Append_To (Plist,
+ Make_If_Statement (Loc,
+   Condition =>
+ Make_Op_Ne (Loc,
+   Left_Opnd   => Make_Null (Loc),
+   

Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Andreas Schwab
Uros Bizjak ubiz...@gmail.com writes:

 Index: configure.ac
 ===
 --- configure.ac  (revision 191953)
 +++ configure.ac  (working copy)
 @@ -66,6 +66,10 @@
  AC_PROG_CC
  m4_rename_force([backtrace_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])
  
 +if test "x$GCC" = "xyes"; then
 +  CFLAGS="$CFLAGS -funwind-tables"
 +fi
 +

Don't modify CFLAGS, instead you should substitute a new variable that
is added to AM_CFLAGS.  CFLAGS is reserved for the user to override.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


[Ada] Add extended overflow -gnato switch to usage

2012-10-02 Thread Arnaud Charlet
This patch adds documentation on the -gnato? and -gnato?? switches
to the usage information. Documentation only, no functional effect
but gnatmake output (with no switches) should have the following
three lines for -gnato:

  -gnato    Enable overflow checking mode to CHECKED (off by default)
  -gnato?   Set SUPPRESSED/CHECKED/MINIMIZED/ELIMINATED (?=0/1/2/3) mode
  -gnato??  Set mode for general/assertion expressions separately
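
To make the digit encoding in the three lines above concrete, here is a small decoding sketch.  It is a hypothetical helper written for this note, not GNAT's switch processing: decode_gnato_suffix and the ovf_* names are assumptions.  It takes the characters that follow "-gnato", treats the plain switch as CHECKED, and reads two digits in the general/assertion order given in the usage text.

```c
#include <assert.h>
#include <string.h>

/* Mode numbering from the usage text: ?=0/1/2/3.  */
enum ovf_mode { OVF_SUPPRESSED, OVF_CHECKED, OVF_MINIMIZED, OVF_ELIMINATED };

struct ovf_modes
{
  enum ovf_mode general;     /* expressions outside assertions */
  enum ovf_mode assertions;  /* expressions within assertions */
};

/* Decode the characters after "-gnato".  Returns 0 on success and -1
   for a malformed suffix.  */
static int
decode_gnato_suffix (const char *suffix, struct ovf_modes *out)
{
  size_t len = strlen (suffix);
  size_t i;

  if (len == 0)
    {
      /* Plain -gnato selects CHECKED.  */
      out->general = out->assertions = OVF_CHECKED;
      return 0;
    }
  if (len > 2)
    return -1;
  for (i = 0; i < len; i++)
    if (suffix[i] < '0' || suffix[i] > '3')
      return -1;

  /* One digit applies to both modes; with two digits the second one is
     the mode for assertions.  */
  out->general = (enum ovf_mode) (suffix[0] - '0');
  out->assertions = (enum ovf_mode) (suffix[len - 1] - '0');
  return 0;
}
```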

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

* usage.adb, gnat_rm.texi, vms_data.ads: Add entry for
/OVERFLOW_CHECKS=?? generating -gnato?? for control
of extended overflow checking.
* ug_words: Add entry for -gnato?? for /OVERFLOW_CHECKS=??
* gnat_ugn.texi: Add documentation for -gnato?? for control of overflow
checking mode.

Index: gnat_rm.texi
===
--- gnat_rm.texi(revision 191888)
+++ gnat_rm.texi(working copy)
@@ -179,6 +179,7 @@
 * Pragma Obsolescent::
 * Pragma Optimize_Alignment::
 * Pragma Ordered::
+* Pragma Overflow_Checks::
 * Pragma Passive::
 * Pragma Persistent_BSS::
 * Pragma Polling::
@@ -916,6 +917,7 @@
 * Pragma Obsolescent::
 * Pragma Optimize_Alignment::
 * Pragma Ordered::
+* Pragma Overflow_Checks::
 * Pragma Passive::
 * Pragma Persistent_BSS::
 * Pragma Polling::
@@ -4127,6 +4129,53 @@
 For additional information please refer to the description of the
 @option{-gnatw.u} switch in the @value{EDITION} User's Guide.
 
+@node Pragma Overflow_Checks
+@unnumberedsec Pragma Overflow_Checks
+@findex Overflow checks
+@findex pragma @code{Overflow_Checks}
+@noindent
+Syntax:
+
+@smallexample @c ada
+pragma Overflow_Checks
+ (  [General=] MODE
+  [,[Assertions =] MODE]);
+
+MODE ::= SUPPRESSED | CHECKED | MINIMIZED | ELIMINATED
+@end smallexample
+
+@noindent
+This pragma sets the current overflow mode to the given mode. For details
+of the meaning of these modes, see section on overflow checking in the
+GNAT users guide. If only the @code{General} parameter is present, the
+given mode applies to all expressions. If both parameters are present,
+the @code{General} mode applies to expressions outside assertions, and
+the @code{Assertions} mode applies to expressions within assertions.
+
+The case of the @code{MODE} parameter is ignored,
+so @code{MINIMIZED}, @code{Minimized} and
+@code{minimized} all have the same effect.
+
+The @code{Overflow_Checks} pragma has the same scoping and placement
+rules as pragma @code{Suppress}, so it can occur either as a
+configuration pragma, specifying a default for the whole
+program, or in a declarative scope, where it applies to the
+remaining declarations and statements in that scope.
+
+The pragma @code{Suppress (Overflow_Check)} sets mode
+
+   General = Suppressed
+
+suppressing all overflow checking within and outside
+assertions.
+
+The pragma @code{Unsuppress (Overflow_Check)} sets mode
+
+   General = Checked
+
+which causes overflow checking of all intermediate overflows.
+This applies both inside and outside assertions.
+
 @node Pragma Passive
 @unnumberedsec Pragma Passive
 @findex Passive
Index: gnat_ugn.texi
===
--- gnat_ugn.texi   (revision 191910)
+++ gnat_ugn.texi   (working copy)
@@ -4325,11 +4325,28 @@
 Historically front end inlining was more extensive than the gcc back end
 inlining, but that is no longer the case.
 
+@item -gnato??
+@cindex @option{-gnato??} (@command{gcc})
+Set default overflow checking mode. If ?? is a single digit, in the
+range 0-3, it sets the overflow checking mode for all expressions,
+including those outside and within assertions. The meaning of nnn is:
+
+  0   suppress overflow checks (SUPPRESSED)
+  1   all intermediate overflows checked (CHECKED)
+  2   minimize intermediate overflows (MINIMIZED)
+  3   eliminate intermediate overflows (ELIMINATED)
+
+Otherwise ?? can be two digits, both 0-3, and in this case the first
+digit sets the mode (using the above code) for expressions outside an
+assertion, and the second digit sets the mode for expressions within
+an assertion.
+
 @item -gnato
 @cindex @option{-gnato} (@command{gcc})
 Enable numeric overflow checking (which is not normally enabled by
 default). Note that division by zero is a separate check that is not
 controlled by this switch (division by zero checking is on by default).
+The checking mode is set to CHECKED (equivalent to @option{-gnato11}).
 
 @item -gnatp
 @cindex @option{-gnatp} (@command{gcc})
Index: ug_words
===
--- ug_words(revision 191888)
+++ ug_words(working copy)
@@ -88,6 +88,7 @@
 -gnatn2 ^ /INLINE=PRAGMA_LEVEL_2
 -gnatN  ^ /INLINE=FULL
 -gnato  ^ /CHECKS=OVERFLOW
+-gnato??^ /OVERFLOW_CHECKS=??
 -gnatp  ^ /CHECKS=SUPPRESS_ALL
 -gnat-p ^ 

Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2c

2012-10-02 Thread Gunther Nikl
Michael Meissner wrote:
 Segher Boessenkool asked me on IRC to break out the fix in the last change.
 This patch is just the change to set the default options if the user did not
 use -mcpu=xxx and the compiler was not configured with --with-cpu=xxx.
 Here are the patches.

Which GCC releases are affected by this bug?

Regards,
Gunther

 I can submit this patch first if David desires, and then resubmit the first of
 the infrastructure patches again, or commit both together.
 
 2012-09-28  Michael Meissner  meiss...@linux.vnet.ibm.com
 
   * config/rs6000/rs6000.c (rs6000_option_override_internal): If
   -mcpu=xxx is not specified and the compiler is not configured
   using --with-cpu=xxx, use the bits from the TARGET_DEFAULT to
   set the initial options.
 
 Index: gcc/config/rs6000/rs6000.c
 ===
 --- gcc/config/rs6000/rs6000.c(revision 191831)
 +++ gcc/config/rs6000/rs6000.c(working copy)
 @@ -2461,6 +2461,11 @@ rs6000_option_override_internal (bool gl
 	target_flags |= (processor_target_table[cpu_index].target_enable
 			 & set_masks);
  
 +  /* If no -mcpu=xxx, inherit any default options that were cleared via
 +     POWERPC_MASKS.  */
 +  if (!have_cpu)
 +    target_flags |= (TARGET_DEFAULT & ~target_flags_explicit);
 +
    if (rs6000_tune_index >= 0)
      tune_index = rs6000_tune_index;
    else if (have_cpu)
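
The masking step in the quoted patch can be read as: when no CPU was selected, OR in the default option bits, except for bits the user set or cleared explicitly.  A standalone sketch of that reading follows.  inherit_default_flags and its parameters are illustrative names, not the rs6000 back end's variables.

```c
#include <assert.h>

/* Return target_flags with the default bits added, unless a CPU was
   selected.  Bits the user touched explicitly are never inherited.  */
static unsigned int
inherit_default_flags (unsigned int target_flags,
                       unsigned int default_flags,
                       unsigned int explicit_flags,
                       int have_cpu)
{
  if (!have_cpu)
    target_flags |= default_flags & ~explicit_flags;
  return target_flags;
}
```

For example, with defaults 0xB and an explicit mask of 0x2, only bits 0x9 are inherited, so an option the user turned off on the command line stays off.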



[Ada] Indexing aspects and indexable containers

2012-10-02 Thread Arnaud Charlet
This patch refines several tests on the legality of indexing aspects:
a) Constant_Indexing functions do not have to return a reference type;
b) given an indexing aspect Func, not all overloadings of Func in the current
scope need to be indexing functions.

The command:

   gnatmake -gnat12 -q main
   main

must yield:

   Wow Yeah
   Rah Rah Rah 

---
with indexing; use indexing;
with Text_IO; use Text_IO;
procedure Main is
   Box : Holder;
   Carton : Holder2;

begin
   Put_Line (Box.Get ("Yeah"));
   Put_Line (Carton.Get ("Rah "));
end Main;
---
package Indexing is
   type Holder is tagged null record
     with Constant_Indexing => Get,
     Iterator_Element => String;  --  iterable container

   function Get (V : Holder; W : String) return String;   -- indexing function
   function Get (V : Holder; W : String) return Integer;  -- indexing function

   type Holder2 is tagged null record
   with Constant_Indexing => Get;   --  indexable container

   function Get (V : Holder2; W : String) return String;  -- indexing function
end Indexing;
---
package body Indexing is
   function Get (V : Holder; W : String) return String is
   begin
      return "Wow " & W;
   end Get;

   function Get (V : Holder; W : String) return Integer is
   begin
  return 42;
   end Get;

   function Get (V : Holder2; W : String) return String is
   begin
      return W & W & W;
   end Get;
end Indexing;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Ed Schonberg  schonb...@adacore.com

* sem_ch13.adb (Check_Indexing_Functions): Refine several tests
on the legality of indexing aspects: Constant_Indexing functions
do not have to return a reference type, and given an indexing
aspect Func, not all overloadings of Func in the current scope
need to be indexing functions.

Index: sem_ch13.adb
===
--- sem_ch13.adb(revision 191902)
+++ sem_ch13.adb(working copy)
@@ -1919,7 +1919,7 @@
   procedure Check_Indexing_Functions;
   --  Check that the function in Constant_Indexing or Variable_Indexing
   --  attribute has the proper type structure. If the name is overloaded,
-  --  check that all interpretations are legal.
+  --  check that some interpretation is legal.
 
   procedure Check_Iterator_Functions;
   --  Check that there is a single function in Default_Iterator attribute
@@ -2070,6 +2070,7 @@
   --
 
   procedure Check_Indexing_Functions is
+ Indexing_Found : Boolean;
 
  procedure Check_One_Function (Subp : Entity_Id);
  --  Check one possible interpretation
@@ -2085,29 +2086,38 @@
Aspect_Iterator_Element);
 
  begin
-if not Check_Primitive_Function (Subp) then
+if not Check_Primitive_Function (Subp)
+  and then not Is_Overloaded (Expr)
+then
Error_Msg_NE
  (aspect Indexing requires a function that applies to type,
-   Subp, Ent);
+Subp, Ent);
 end if;
 
 --  An indexing function must return either the default element of
---  the container, or a reference type.
+--  the container, or a reference type. For variable indexing it
+--  must be latter.
 
 if Present (Default_Element) then
Analyze (Default_Element);
if Is_Entity_Name (Default_Element)
  and then Covers (Entity (Default_Element), Etype (Subp))
then
+  Indexing_Found := True;
   return;
end if;
 end if;
 
---  Otherwise the return type must be a reference type.
+--  For variable_indexing the return type must be a reference type.
 
-if not Has_Implicit_Dereference (Etype (Subp)) then
+if Attr = Name_Variable_Indexing
+  and then not Has_Implicit_Dereference (Etype (Subp))
+then
Error_Msg_N
  (function for indexing must return a reference type, Subp);
+
+else
+   Indexing_Found := True;
 end if;
  end Check_One_Function;
 
@@ -2129,6 +2139,7 @@
It : Interp;
 
 begin
+   Indexing_Found := False;
Get_First_Interp (Expr, I, It);
while Present (It.Nam) loop
 
@@ -2142,6 +2153,11 @@
 
   Get_Next_Interp (I, It);
end loop;
+   if not Indexing_Found then
+  Error_Msg_NE (
+   aspect Indexing requires a function that applies to type,
+ Expr, Ent);
+   end if;
 end;
  end if;
   end Check_Indexing_Functions;


[Ada] Project in limited withed chain reported as duplicate

2012-10-02 Thread Arnaud Charlet
This patch ensures that if a project is in a limited with import chain,
it is not reported as a duplicate project.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Vincent Celier  cel...@adacore.com

* prj-part.adb (Post_Parse_Context_Clause): Resurrect Boolean
parameter In_Limited.  Check for circularity also if In_Limited
is True.
(Parse_Single_Project): Call Post_Parse_Context_Clause with
In_Limited parameter.

Index: prj-part.adb
===
--- prj-part.adb(revision 191895)
+++ prj-part.adb(working copy)
@@ -216,6 +216,7 @@
procedure Post_Parse_Context_Clause
  (Context_Clause: With_Id;
   In_Tree   : Project_Node_Tree_Ref;
+  In_Limited: Boolean;
   Limited_Withs : Boolean;
   Imported_Projects : in out Project_Node_Id;
   Project_Directory : Path_Name_Type;
@@ -827,6 +828,7 @@
procedure Post_Parse_Context_Clause
  (Context_Clause: With_Id;
   In_Tree   : Project_Node_Tree_Ref;
+  In_Limited: Boolean;
   Limited_Withs : Boolean;
   Imported_Projects : in out Project_Node_Id;
   Project_Directory : Path_Name_Type;
@@ -941,7 +943,9 @@
   --  If we have one, get the project id of the limited
   --  imported project file, and do not parse it.
 
-  if Limited_Withs and then Project_Stack.Last > 1 then
+  if (In_Limited or else Limited_Withs) and then
+ Project_Stack.Last > 1
+  then
  declare
 Canonical_Path_Name : Path_Name_Type;
 
@@ -975,7 +979,7 @@
 Path_Name_Id      => Imported_Path_Name_Id,
 Extended          => False,
 From_Extended     => From_Extended,
-In_Limited        => Limited_Withs,
+In_Limited        => In_Limited or else Limited_Withs,
 Packages_To_Check => Packages_To_Check,
 Depth             => Depth,
 Current_Dir       => Current_Dir,
@@ -1577,6 +1581,7 @@
 Post_Parse_Context_Clause
   (In_Tree           => In_Tree,
    Context_Clause    => First_With,
+   In_Limited        => In_Limited,
    Limited_Withs     => False,
    Imported_Projects => Imported_Projects,
    Project_Directory => Project_Directory,
@@ -1936,6 +1941,7 @@
  Post_Parse_Context_Clause
 (In_Tree           => In_Tree,
  Context_Clause    => First_With,
+ In_Limited        => In_Limited,
  Limited_Withs     => True,
  Imported_Projects => Imported_Projects,
  Project_Directory => Project_Directory,


[Ada] Add style check for NOT IN

2012-10-02 Thread Arnaud Charlet
This patch adds a new style check for the layout of the NOT IN operation.
If the token check style flag is set, then there must be exactly one space
(and no other white space) between the NOT and the IN. The following is
compiled with -gnaty:

 1. package StyleNotIn is
 2.x : Integer := 4;
 3.y : Boolean := x not  in 1 .. 10;
|
 (style) single space must separate not and in

 4. end StyleNotIn;

2012-10-02  Robert Dewar  de...@adacore.com

* stylesw.ads, gnat_ugn.texi: Document new style rule for NOT IN.
* par-ch4.adb (P_Relational_Operator): Add style check for NOT IN.
* style.ads, styleg.adb, styleg.ads (Check_Not_In): New procedure.
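
The condition tested by the new Check_Not_In procedure in the patch below can be paraphrased in C as follows: the layout is accepted only when IN starts exactly four characters after NOT and the character just before IN is a space.  This is a sketch with made-up names (not_in_layout_ok, not_pos, in_pos), using zero-based offsets into a source buffer rather than the scanner's own pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* SOURCE is the text being scanned, NOT_POS the offset of the NOT token
   and IN_POS the offset of the following IN token.  Returns true when
   exactly one space, and nothing else, separates the two tokens.  */
static bool
not_in_layout_ok (const char *source, size_t not_pos, size_t in_pos)
{
  /* IN cannot start before the three letters of NOT have ended.  */
  assert (in_pos >= not_pos + 3);

  return in_pos - not_pos == 4 && source[in_pos - 1] == ' ';
}
```

Checking the distance alone would accept a tab between the tokens, which is why the preceding character is tested as well.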

Index: gnat_ugn.texi
===
--- gnat_ugn.texi   (revision 191960)
+++ gnat_ugn.texi   (working copy)
@@ -6730,6 +6730,10 @@
 A vertical bar must be surrounded by spaces.
 @end itemize
 
+@item
+Exactly one blank (and no other white space) must appear between
+a @code{not} token and a following @code{in} token.
+
 @item ^u^UNNECESSARY_BLANK_LINES^
 @emph{Check unnecessary blank lines.}
 Unnecessary blank lines are not allowed. A blank line is considered
Index: par-ch4.adb
===
--- par-ch4.adb (revision 191888)
+++ par-ch4.adb (working copy)
@@ -2706,7 +2706,16 @@
 
   Scan; -- past operator token
 
+  --  Deal with NOT IN, if previous token was NOT, we must have IN now
+
   if Prev_Token = Tok_Not then
+
+ --  Style check, for NOT IN, we require one space between NOT and IN
+
+ if Style_Check and then Token = Tok_In then
+Style.Check_Not_In;
+ end if;
+
  T_In;
   end if;
 
Index: style.ads
===
--- style.ads   (revision 191888)
+++ style.ads   (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  --
---  Copyright (C) 1992-2010, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -155,6 +155,11 @@
--  check the line length (Len is the length of the current line). Note that
--  the terminator may be the EOF character.
 
+   procedure Check_Not_In
+ renames Style_Inst.Check_Not_In;
+   --  Called with Scan_Ptr pointing to an IN token, and Prev_Token_Ptr
+   --  pointing to a NOT token. Used to check proper layout of NOT IN.
+
procedure Check_Pragma_Name
  renames Style_Inst.Check_Pragma_Name;
--  The current token is a pragma identifier. Check that it is spelled
Index: styleg.adb
===
--- styleg.adb  (revision 191888)
+++ styleg.adb  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- B o d y  --
 --  --
---  Copyright (C) 1992-2011, Free Software Foundation, Inc. --
+--  Copyright (C) 1992-2012, Free Software Foundation, Inc. --
 --  --
 -- GNAT is free software;  you can  redistribute it  and/or modify it under --
 -- terms of the  GNU General Public License as published  by the Free Soft- --
@@ -764,6 +764,24 @@
   end if;
end Check_Line_Terminator;
 
+   --
+   -- Check_Not_In --
+   --
+
+   --  In check tokens mode, only one space between NOT and IN
+
+   procedure Check_Not_In is
+   begin
+  if Style_Check_Tokens then
+ if Source (Token_Ptr - 1) /= ' '
+   or else Token_Ptr - Prev_Token_Ptr /= 4
+ then -- CODEFIX?
+Error_Msg
+  ("(style) single space must separate NOT and IN", Token_Ptr - 1);
+ end if;
+  end if;
+   end Check_Not_In;
+
--
-- Check_No_Space_After --
--
Index: styleg.ads
===
--- styleg.ads  (revision 191888)
+++ styleg.ads  (working copy)
@@ -6,7 +6,7 @@
 --  --
 -- S p e c  --
 --  

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Paolo Bonzini
Il 02/10/2012 09:28, Steven Bosscher ha scritto:
   My experience shows that these lists are usually 1-2 elements. Although in
  this case, there are pseudos with a huge number of elements (hundreds).  I tried
  -fweb for this test because it can decrease the number of elements but GCC (I
  don't know what pass) scales even worse: after 20 min of waiting and when
  virt memory reached 20GB I stopped it.
 Ouch :-)
 
 The webizer itself never even runs, the compiler blows up somewhere
 during the df_analyze call from web_main. The issue here is probably
 in the DF_UD_CHAIN problem or in the DF_RD problem.

/me is glad to have fixed fwprop when his GCC contribution time was more
than 1-2 days per year...

Unfortunately, the fwprop solution (actually a rewrite) was very
specific to the problem and cannot be reused in other parts of the compiler.

I guess here it is where we could experiment with region-based
optimization.  If a loop (including the parent dummy loop) is too big,
ignore it and only do LRS on smaller loops inside it.  Reaching
definitions is insanely expensive on an entire function, but works well
on smaller loops.

Perhaps something similar could be applied also to IRA/LRA.

Paolo


[Ada] References to the formals of child subprograms without specs

2012-10-02 Thread Arnaud Charlet
If a child subprogram has no previous spec, treat a reference to its formals
(such as a parameter association) as coming from source, in order to generate
the proper references and enable gps navigation between reference and
declaration.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Ed Schonberg  schonb...@adacore.com

* lib-xref.adb (Generate_Reference): If a child subprogram
has no previous spec, treat a reference to its formals (such
as a parameter association) as coming from source in order to
generate the proper references and enable gps navigation between
reference and declaration.

Index: lib-xref.adb
===
--- lib-xref.adb(revision 191888)
+++ lib-xref.adb(working copy)
@@ -945,6 +945,13 @@
  then
 Ent := E;
 
+ --  Ditto for the formals of such a subprogram
+
+ elsif Is_Overloadable (Scope (E))
+   and then Is_Child_Unit (Scope (E))
+ then
+Ent := E;
+
  --  Record components of discriminated subtypes or derived types must
  --  be treated as references to the original component.
 


abs(long long)

2012-10-02 Thread Marc Glisse

Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to 
do, but still) is meant to return a double...
* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it 
should use defined(__SIZEOF_INT128__), which would help other compilers.
* newlib has llabs, according to the doc. It would be good to know what 
newlib is missing for libstdc++ to detect it as C99-ready.


I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu 
and Oleg Endo did a basic check on sh/newlib. I'll do a last check after 
the review (no point if the patch needs changing again).
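
As a rough illustration of the notes above (not the patch itself), the two ideas can be sketched in a standalone file: a comparison-based fallback for when llabs may be missing, and an __int128 overload guarded by the compiler-provided __SIZEOF_INT128__ rather than a configure-time macro. The namespace and function names below are made up for the example.

```cpp
#include <cstdlib>

namespace sketch {  // hypothetical namespace, not libstdc++

// Fallback when the C library may lack llabs: compare and negate.
// As with llabs, the most negative value has no representable result.
inline long long abs_ll(long long x) { return x >= 0 ? x : -x; }

// When llabs is known to exist, forwarding to it is enough.
inline long long abs_via_llabs(long long x) { return std::llabs(x); }

#if defined(__SIZEOF_INT128__)
// Guarded by a macro the compiler itself defines, so the same installed
// header would behave sensibly under a different compiler.
inline __int128 abs_i128(__int128 x) { return x >= 0 ? x : -x; }
#endif

}  // namespace sketch
```

The separate function names avoid the overload ambiguity that a call with a plain int argument would hit if both were named the same.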


2012-10-02  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever
we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc Glisse

Index: include/c_std/cstdlib
===
--- include/c_std/cstdlib   (revision 191941)
+++ include/c_std/cstdlib   (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
   abs(long __i) { return labs(__i); }
 
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+  && (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+  // Fallback version if we don't have llabs but still allow long long.
+  inline long long
+  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+  abs(long long __x) { return ::llabs (__x); }
 
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+  using ::__gnu_cxx::abs;
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <cmath>
+#include <cstdlib>
+#include <type_traits>
+#include <utility>
+
+#ifdef 

Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Steven Bosscher
On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 02/10/2012 09:28, Steven Bosscher ha scritto:
   My experience shows that these lists are usually 1-2 elements. Although in
  this case, there are pseudos with a huge number of elements (hundreds).  I tried
  -fweb for this test because it can decrease the number of elements but GCC (I
  don't know what pass) scales even worse: after 20 min of waiting and when
  virt memory reached 20GB I stopped it.
 Ouch :-)

 The webizer itself never even runs, the compiler blows up somewhere
 during the df_analyze call from web_main. The issue here is probably
 in the DF_UD_CHAIN problem or in the DF_RD problem.

 /me is glad to have fixed fwprop when his GCC contribution time was more
 than 1-2 days per year...

I thought you spent more time on GCC nowadays, working for RedHat?
Who's your manager, perhaps we can coerce him/her into letting you
spend more time on GCC :-P


 Unfortunately, the fwprop solution (actually a rewrite) was very
 specific to the problem and cannot be reused in other parts of the compiler.

That'd be too bad... But is this really true? I thought you had
something done that builds chains only for USEs reached by multiple
DEFs? That's the only interesting kind for web, too.


 I guess here it is where we could experiment with region-based
 optimization.  If a loop (including the parent dummy loop) is too big,
 ignore it and only do LRS on smaller loops inside it.  Reaching
 definitions is insanely expensive on an entire function, but works well
 on smaller loops.

Heh, yes. In fact I have been working on a region-based version of web
because it is (or at least: used to be) a useful pass that only isn't
enabled by default because the underlying RD problem scales so badly.
My current collection of hacks doesn't bootstrap, doesn't even build
libgcc yet, but I plan to finish it for GCC 4.9. It's based on
identifying SEME regions using structural analysis, and DF's partial
CFG analysis (the latter is currently the problem).

FWIW: part of the problem for this particular test case is that there
are many registers with partial defs (vector registers) and the RD
problem doesn't (and probably cannot) keep track of one partial
def/use killing another partial def/use. This handling of vector regs
appears to be a general problem with much of the RTL infrastructure.

Ciao!
Steven


[AARCH64] Merge from upstream trunk r191882

2012-10-02 Thread Sofiane Naci
Hi,

I have just merged upstream trunk on the aarch64-branch up to r191882.

Thanks
Sofiane






[patch] Introduce DECL_NONLOCAL_FRAME

2012-10-02 Thread Eric Botcazou
Hi,

this is the seemingly non-controversial part of the FRAME splitting patch.
It introduces the DECL_NONLOCAL_FRAME flag, sets it during nested function 
lowering and... that's pretty much it.

Tested on x86_64-suse-linux, OK for mainline?


2012-10-02  Eric Botcazou  ebotca...@adacore.com

* tree.h (DECL_NONLOCAL_FRAME): New macro.
* tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in
DECL_NONLOCAL_FRAME flag.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out
DECL_NONLOCAL_FRAME flag.


-- 
Eric Botcazou

Index: tree.h
===
--- tree.h	(revision 191924)
+++ tree.h	(working copy)
@@ -712,6 +712,9 @@ struct GTY(()) tree_base {
 
SSA_NAME_IS_DEFAULT_DEF in
SSA_NAME
+
+   DECL_NONLOCAL_FRAME in
+	   VAR_DECL
 */
 
 struct GTY(()) tree_typed {
@@ -3270,9 +3273,14 @@ extern void decl_fini_priority_insert (t
libraries.  */
 #define MAX_RESERVED_INIT_PRIORITY 100
 
+/* In a VAR_DECL, nonzero if this is a global variable for VOPs.  */
 #define VAR_DECL_IS_VIRTUAL_OPERAND(NODE) \
   (VAR_DECL_CHECK (NODE)->base.u.bits.saturating_flag)
 
+/* In a VAR_DECL, nonzero if this is a non-local frame structure.  */
+#define DECL_NONLOCAL_FRAME(NODE)  \
+  (VAR_DECL_CHECK (NODE)->base.default_def_flag)
+
 struct GTY(()) tree_var_decl {
   struct tree_decl_with_vis common;
 };
Index: tree-streamer-out.c
===
--- tree-streamer-out.c	(revision 191909)
+++ tree-streamer-out.c	(working copy)
@@ -181,6 +181,9 @@ pack_ts_decl_common_value_fields (struct
   bp_pack_value (bp, expr->decl_common.off_align, 8);
 }
 
+  if (TREE_CODE (expr) == VAR_DECL)
+bp_pack_value (bp, DECL_NONLOCAL_FRAME (expr), 1);
+
   if (TREE_CODE (expr) == RESULT_DECL
   || TREE_CODE (expr) == PARM_DECL
   || TREE_CODE (expr) == VAR_DECL)
Index: tree-nested.c
===
--- tree-nested.c	(revision 191909)
+++ tree-nested.c	(working copy)
@@ -235,6 +235,7 @@ get_frame_type (struct nesting_info *inf
 
   info->frame_type = type;
   info->frame_decl = create_tmp_var_for (info, type, "FRAME");
+  DECL_NONLOCAL_FRAME (info->frame_decl) = 1;
 
   /* ??? Always make it addressable for now, since it is meant to
 	 be pointed to by the static chain pointer.  This pessimizes
Index: tree-streamer-in.c
===
--- tree-streamer-in.c	(revision 191909)
+++ tree-streamer-in.c	(working copy)
@@ -216,6 +216,9 @@ unpack_ts_decl_common_value_fields (stru
   expr->decl_common.off_align = bp_unpack_value (bp, 8);
 }
 
+  if (TREE_CODE (expr) == VAR_DECL)
+DECL_NONLOCAL_FRAME (expr) = (unsigned) bp_unpack_value (bp, 1);
+
   if (TREE_CODE (expr) == RESULT_DECL
   || TREE_CODE (expr) == PARM_DECL
   || TREE_CODE (expr) == VAR_DECL)


Re: [patch] Introduce DECL_NONLOCAL_FRAME

2012-10-02 Thread Jakub Jelinek
On Tue, Oct 02, 2012 at 10:49:31AM +0200, Eric Botcazou wrote:
 this is the seemingly non-controversial part of the FRAME splitting patch.
 It introduces the DECL_NONLOCAL_FRAME flag, sets it during nested function 
 lowering and... that's pretty much it.
 
 Tested on x86_64-suse-linux, OK for mainline?

Yes, thanks.

 2012-10-02  Eric Botcazou  ebotca...@adacore.com
 
 * tree.h (DECL_NONLOCAL_FRAME): New macro.
 * tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME.
 * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in
 DECL_NONLOCAL_FRAME flag.
 * tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out
 DECL_NONLOCAL_FRAME flag.

Jakub


abs(long long)

2012-10-02 Thread Marc Glisse

(Forgot libstdc++...)

Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but 
still) is meant to return a double...
* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should 
use defined(__SIZEOF_INT128__), which would help other compilers.
* newlib has llabs, according to the doc. It would be good to know what newlib 
is missing for libstdc++ to detect it as C99-ready.


I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and 
Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review 
(no point if the patch needs changing again).


2012-10-02  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever
we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc Glisse

Index: include/c_std/cstdlib
===
--- include/c_std/cstdlib   (revision 191941)
+++ include/c_std/cstdlib   (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
   abs(long __i) { return labs(__i); }
 
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+  && (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+  // Fallback version if we don't have llabs but still allow long long.
+  inline long long
+  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
+  abs(long long __x) { return ::llabs (__x); }
 
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+  using ::__gnu_cxx::abs;
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <cmath>
+#include <cstdlib>
+#include <type_traits>

Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 10:08 AM, Andreas Schwab sch...@linux-m68k.org wrote:

 +if test x$GCC = xyes; then
 +  CFLAGS="$CFLAGS -funwind-tables"
 +fi
 +

 Don't modify CFLAGS, instead you should substitute a new variable that
 is added to AM_CFLAGS.  CFLAGS is reserved for the user to override.

Thanks, attached is a version that introduces EXTRA_FLAGS instead.

2012-10-02  Uros Bizjak  ubiz...@gmail.com

PR other/54761
* configure.ac (EXTRA_FLAGS): New.
* Makefile.am (AM_CFLAGS): Add $(EXTRA_FLAGS).
* configure, Makefile.in: Regenerate.

Testing on {x86_64,alphaev68}-linux-gnu in progress.

Uros.
Index: configure
===
--- configure   (revision 191955)
+++ configure   (working copy)
@@ -612,6 +612,7 @@
 BACKTRACE_SUPPORTS_THREADS
 PIC_FLAG
 WARN_FLAGS
+EXTRA_FLAGS
 BACKTRACE_FILE
 multi_basedir
 OTOOL64
@@ -11080,7 +11081,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11083 "configure"
+#line 11084 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11186,7 +11187,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11189 "configure"
+#line 11190 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11488,6 +11489,12 @@
 fi
 
 
+EXTRA_FLAGS=
+if test x$GCC = xyes; then
+  EXTRA_FLAGS=-funwind-tables
+fi
+
+
 WARN_FLAGS=
 save_CFLAGS=$CFLAGS
 for real_option in -W -Wall -Wwrite-strings -Wstrict-prototypes \
Index: Makefile.in
===
--- Makefile.in (revision 191955)
+++ Makefile.in (working copy)
@@ -152,6 +152,7 @@
 ECHO_T = @ECHO_T@
 EGREP = @EGREP@
 EXEEXT = @EXEEXT@
+EXTRA_FLAGS = @EXTRA_FLAGS@
 FGREP = @FGREP@
 FORMAT_FILE = @FORMAT_FILE@
 GREP = @GREP@
@@ -253,7 +254,7 @@
 AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \
-I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include
 
-AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG)
+AM_CFLAGS = $(EXTRA_FLAGS) $(WARN_FLAGS) $(PIC_FLAG)
 noinst_LTLIBRARIES = libbacktrace.la
 libbacktrace_la_SOURCES = \
backtrace.h \
Index: configure.ac
===
--- configure.ac(revision 191955)
+++ configure.ac(working copy)
@@ -96,6 +96,12 @@
 fi
 AC_SUBST(BACKTRACE_FILE)
 
+EXTRA_FLAGS=
+if test x$GCC = xyes; then
+  EXTRA_FLAGS=-funwind-tables
+fi
+AC_SUBST(EXTRA_FLAGS)
+
 ACX_PROG_CC_WARNING_OPTS([-W -Wall -Wwrite-strings -Wstrict-prototypes \
  -Wmissing-prototypes -Wold-style-definition \
  -Wmissing-format-attribute -Wcast-qual],
Index: Makefile.am
===
--- Makefile.am (revision 191955)
+++ Makefile.am (working copy)
@@ -34,7 +34,7 @@
 AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \
-I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include
 
-AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG)
+AM_CFLAGS = $(EXTRA_FLAGS) $(WARN_FLAGS) $(PIC_FLAG)
 
 noinst_LTLIBRARIES = libbacktrace.la
 


Re: abs(long long)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote:
 (Forgot libstdc++...)


 Hello,

 here is the patch from PR54686. Several notes:

 * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do,
 but still) is meant to return a double...

don't we have a core issue about preferring unsigned -> long or long long?

 * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it
 should use defined(__SIZEOF_INT128__), which would help other compilers.

Why would that be a problem with the appropriate #define?

 * newlib has llabs, according to the doc. It would be good to know what
 newlib is missing for libstdc++ to detect it as C99-ready.

 I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu
 and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the
 review (no point if the patch needs changing again).

In general, I think I have a bias toward using compiler intrinsics,
about which the compiler already has a lot of knowledge.



 2012-10-02  Marc Glisse  marc.gli...@inria.fr

 PR libstdc++/54686
 * include/c_std/cstdlib (abs(long long)): Define fallback whenever
 we have long long but possibly not llabs.
 (abs(long long)): Use llabs when available.
 (abs(__int128)): Define when we have __int128.
 (div(long long, long long)): Use lldiv.
 * testsuite/26_numerics/headers/cstdlib/54686.c: New file.

 --
 Marc Glisse

 Index: include/c_std/cstdlib
 ===
 --- include/c_std/cstdlib   (revision 191941)
 +++ include/c_std/cstdlib   (working copy)
 @@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
using ::strtoul;
using ::system;
  #ifdef _GLIBCXX_USE_WCHAR_T
using ::wcstombs;
using ::wctomb;
  #endif // _GLIBCXX_USE_WCHAR_T

inline long
abs(long __i) { return labs(__i); }

 +#if defined (_GLIBCXX_USE_LONG_LONG) \
 +  && (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
 +  // Fallback version if we don't have llabs but still allow long long.
 +  inline long long
 +  abs(long long __x) { return __x >= 0 ? __x : -__x; }
 +#endif
 +
 +#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
 +  inline __int128
 +  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
 +#endif
 +
inline ldiv_t
div(long __i, long __j) { return ldiv(__i, __j); }

  _GLIBCXX_END_NAMESPACE_VERSION
  } // namespace

  #if _GLIBCXX_USE_C99

  #undef _Exit
  #undef llabs
 @@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
using ::lldiv_t;
  #endif
  #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
  #endif
  #if !_GLIBCXX_USE_C99_DYNAMIC
using ::_Exit;
  #endif

 +#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
inline long long
 -  abs(long long __x) { return __x >= 0 ? __x : -__x; }
 +  abs(long long __x) { return ::llabs (__x); }

 -#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
using ::llabs;

inline lldiv_t
div(long long __n, long long __d)
 -  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
 +  { return ::lldiv (__n, __d); }

using ::lldiv;
  #endif

  #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
extern "C" long long int (atoll)(const char *) throw ();
extern "C" long long int
  (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
extern "C" unsigned long long int
  (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
 @@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  _GLIBCXX_END_NAMESPACE_VERSION
  } // namespace __gnu_cxx

  namespace std
  {
  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
using ::__gnu_cxx::lldiv_t;
  #endif
using ::__gnu_cxx::_Exit;
 -  using ::__gnu_cxx::abs;
  #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 +  using ::__gnu_cxx::abs;
using ::__gnu_cxx::llabs;
using ::__gnu_cxx::div;
using ::__gnu_cxx::lldiv;
  #endif
using ::__gnu_cxx::atoll;
using ::__gnu_cxx::strtof;
using ::__gnu_cxx::strtoll;
using ::__gnu_cxx::strtoull;
using ::__gnu_cxx::strtold;
  } // namespace std
 Index: testsuite/26_numerics/headers/cstdlib/54686.c
 ===
 --- testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
 +++ testsuite/26_numerics/headers/cstdlib/54686.c   (revision 0)
 @@ -0,0 +1,32 @@
 +// { dg-do compile }
 +// { dg-options "-std=c++11" }
 +
 +// Copyright (C) 2012 Free Software Foundation, Inc.
 +//
 +// This file is part of the GNU ISO C++ Library.  This library is free
 +// software; you can redistribute it and/or modify it under the
 +// terms of the GNU General Public License as published by the
 +// Free Software Foundation; either version 3, or (at your option)
 +// any later version.
 +//
 +// This library is distributed in the hope 

Re: [RFC] Make vectorizer to skip loops with small iteration estimate

2012-10-02 Thread Richard Guenther
On Mon, 1 Oct 2012, Jan Hubicka wrote:

   
So the unvectorized cost is
SIC * niters
   
The vectorized path is
SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
The scalar path of vectorizer loop is
SIC * niters + SOC
  
  Note that 'th' is used for the runtime profitability check which is
  done at the time the setup cost has already been taken (yes, we
 
 Yes, I understand that.
  probably should make it more conservative but then guard the whole
  set of loops by the check, not only the vectorized path).
  See PR53355 for the general issue.
 
 Yep, we may reduce the cost of SOC by outputting early guard for non-vectorized
 path better than we do now. However...
  Of course this is very simple benchmark, in reality the vectorization can be
  a lot more harmful by complicating more complex control flows.
  
  So I guess we have two options
   1) go with the new formula and try to make cost model a bit more realistic.
   2) stay with original formula that is quite close to reality, but I think
  more by an accident.
  
  I think we need to improve it as whole, thus I'd prefer 2).
 
 ... I do not see why.
 Even if we make the check cheaper we will only distribute part of SOC to vector
 prologues/epilogues.
 
 Still I think the formula is wrong, I.e. accounting SOC where it should not.
 
 The cost of scalar path without vectorization is 
   niters * SIC
 while with vectorization we have scalar path
   niters * SIC + SOC
 and vector path
   SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
 
 So SOC cancels out in the runtime check.
 I still think we need two formulas - one determining if vectorization is
 profitable, other specifying the threshold for scalar path at runtime (that
 will generally give lower values).

True, we want two values.  But part of the scalar path right now
is all the computation required for alias and alignment runtime checks
(because the way all the conditions are combined).

I'm not much into the details of what we account for in SOC (I suppose
it's everything we insert in the preheader of the vector loop).
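
To make the "two values" point concrete, the formulas quoted above can be transcribed into a small C++ sketch. It follows the formulas exactly as written in this thread, not the vectorizer source; the struct, the function names and the brute-force search are assumptions made for illustration, and the cost numbers used with it are invented.

```cpp
#include <algorithm>

// Abstract cost units; field names mirror the abbreviations in the mail.
struct LoopCosts {
  long sic;       // SIC: cost of one scalar iteration
  long vic;       // VIC: cost of one vector iteration
  long soc;       // SOC: outside cost charged for the runtime check
  long voc;       // VOC: outside cost of the vector loop
  long vf;        // VF: vectorization factor
  long pl_iters;  // PL_ITERS: prologue iterations
  long ep_iters;  // EP_ITERS: epilogue iterations
};

// Loop left unvectorized: SIC * niters.
long unvectorized_cost(const LoopCosts& c, long n) { return c.sic * n; }

// Vector path: SOC + VIC * ((niters - PL_ITERS - EP_ITERS) / VF) + VOC.
long vector_path_cost(const LoopCosts& c, long n) {
  long vec_iters = std::max(0L, n - c.pl_iters - c.ep_iters) / c.vf;
  return c.soc + c.vic * vec_iters + c.voc;
}

// Scalar path of the vectorized loop: SIC * niters + SOC.
long scalar_path_cost(const LoopCosts& c, long n) { return c.sic * n + c.soc; }

// Smallest niters (up to limit) where vectorizing beats not vectorizing.
long min_profitable_iters(const LoopCosts& c, long limit) {
  for (long n = 1; n <= limit; ++n)
    if (vector_path_cost(c, n) < unvectorized_cost(c, n)) return n;
  return -1;
}

// Smallest niters where, SOC already paid, the vector path beats the
// scalar path; SOC appears on both sides and cancels.
long runtime_threshold(const LoopCosts& c, long limit) {
  for (long n = 1; n <= limit; ++n)
    if (vector_path_cost(c, n) < scalar_path_cost(c, n)) return n;
  return -1;
}
```

With invented costs SIC=4, VIC=6, SOC=10, VOC=8, VF=4 and no peeling, the first search yields 7 iterations and the second yields 3, so the runtime threshold comes out lower than the compile-time profitability bound.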

+  if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+    fprintf (vect_dump, "not vectorized: estimated iteration count too small.");
+  if (vect_print_dump_info (REPORT_DETAILS))
+    fprintf (vect_dump, "not vectorized: estimated iteration count smaller than "
+             "user specified loop bound parameter or minimum "
+             "profitable iterations (whichever is more conservative).");

this won't work anymore btw - dumping infrastructure changed.

I suppose your patch is a step in the right direction, but to really
make progress we need to re-organize the loop and predicate structure
produced by the vectorizer.

So, please update your patch, re-test and then it's ok.

   2) Even when loop iterates 2 times, it is estimated to 4 iterations by
  estimated_stmt_executions_int with the profile feedback.
  The reason is loop_ch pass.  Given a rolled loop with exit probability
  30%, proceeds by duplicating the header with original probabilities.
  This makes the loop to be executed with 60% probability.  Because the
  loop body counts remain the same (and they should), the expected number
  of iterations increase by the decrease of entry edge to the header.
   
  I wonder what to do about this.  Obviously without path profiling
  loop_ch can not really do a good job.  We can artificially make
  header to succeed more likely, that is the reality, but that requires
  non-trivial loop profile updating.
   
  We can also simply record the iteration bound into loop structure 
  and ignore that the profile is not realistic
  
  But we don't preserve loop structure from header copying ...
 
 From what time we keep loop structure? In general I would like to eventually
 drop value histograms to loop structure specifying number of iterations with
 profile feedback.

We preserve it from the start of the tree loop optimizers (it's easy
to preserve them from earlier points as long as you don't cross inlining,
but to lower the impact of the change I placed it where it was enough
to prevent the excessive unrolling/peeling done by RTL)

  
  Finally we can duplicate loop headers before profiling.  I implemented
  that via early_ch pass executed only with profile generation or feedback.
  I guess it makes sense to do, even if it breaks the assumption that
  we should do strictly -Os generation on paths where
  
  Well, there are CH cases that do not increase code size and I doubt
  that loop header copying is generally bad for -Os ... we are not
  good at handling non-copied loop headers.
 
 There is comment saying 
   /* Loop header copying usually increases size of the code.  This used not to
  be true, since quite often it is possible to verify that the condition is
  satisfied in the first 

[AARCH64-4.7] Merge from upstream gcc-4_7-branch r191881

2012-10-02 Thread Sofiane Naci
Hi,

I have just merged upstream gcc-4_7-branch on the aarch64-4.7-branch up to
r191881.

Thanks
Sofiane






Re: abs(long long)

2012-10-02 Thread Marc Glisse

On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote:

(Forgot libstdc++...)


Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do,
but still) is meant to return a double...


don't we have a core issue about preferring unsigned -> long or long long?


Here I am talking of a library issue: the wording that says that there are 
sufficient overloads such that integer types call the double version of 
math functions. It is fairly obvious that it doesn't apply to abs(long) 
for instance which has an explicit overload. For short or unsigned, I 
still read it as saying that it converts to double...
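For illustration only (this is not part of the patch): a small sketch of the uncontroversial half of that statement, assuming a C++11 compiler and library. The explicit integer overloads keep the argument type and never go through double. The helper name abs_ll_demo is made up for this note, and the sketch deliberately says nothing about short or unsigned, since that is the open question.

```cpp
#include <cstdlib>
#include <type_traits>

// The explicit overloads keep the argument's type: no conversion to double.
static_assert(std::is_same<decltype(std::abs(0L)), long>::value,
              "abs(long) returns long");
static_assert(std::is_same<decltype(std::abs(0LL)), long long>::value,
              "abs(long long) returns long long");

// Hypothetical helper: a value that does not fit in 32 bits must survive
// intact, which it does when the long long overload is selected.
inline long long abs_ll_demo(long long v) { return std::abs(v); }
```

If the long long overload were missing and the call fell back to the double version, the static_assert on the return type would be the first thing to fail.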



* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it
should use defined(__SIZEOF_INT128__), which would help other compilers.


Why would that be a problem with the appropriate #define?


The library installed by the system was compiled with g++, and is then 
used with clang++. If we can avoid installing 2 config.h files to make 
that work...



* newlib has llabs, according to the doc. It would be good to know what
newlib is missing for libstdc++ to detect it as C99-ready.

I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu
and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the
review (no point if the patch needs changing again).


In general, I think I have a bias toward using compiler intrinsics,
about which the compiler already has a lot of knowledge.


More precisely, does that mean you want __builtin_llabs instead of 
::llabs? I thought the compiler knew they were the same.


--
Marc Glisse


Re: Convert more non-GTY htab_t to hash_table.

2012-10-02 Thread Richard Guenther
On Mon, 1 Oct 2012, Lawrence Crowl wrote:

 Change more non-GTY hash tables to use the new type-safe template hash table.
 Constify member function parameters that can be const.
 Correct a couple of expressions in formerly uninstantiated templates.
 
 The new code is 0.362% faster in bootstrap, with a 99.5% confidence of
 being faster.
 
 Tested on x86-64.
 
 Okay for trunk?

You are changing a hashtable used by fold checking, did you test
with fold checking enabled?

+/* Data structures used to maintain mapping between basic blocks and
+   copies.  */
+static hash_table <bb_copy_hasher> bb_original;
+static hash_table <bb_copy_hasher> bb_copy;

note that because hash_table has a constructor we now get global
CTORs for all statics :(  (and mx-protected local inits ...)
Can you please try to remove the constructor from hash_table to
avoid this overhead?  (as a followup - that is, don't initialize htab)
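To illustrate the concern (a sketch with made-up types, not GCC's hash_table): a class with a user-provided default constructor is not trivially default constructible, so a static object of that type may need a dynamic initializer, and a function-local static may need a guarded initialization. A plain struct with an explicit create step can instead rely on static zero-initialization. All names ending in _demo are hypothetical.

```cpp
#include <type_traits>

// Hypothetical stand-in for a table type with a constructor.
struct table_with_ctor_demo {
  void *entries;
  table_with_ctor_demo() : entries(nullptr) {}  // user-provided constructor
};

// Hypothetical stand-in without a constructor: statics of this type are
// zero-initialized at load time and set up later by an explicit call.
struct table_plain_demo {
  void *entries;
  void create() { entries = this; }  // placeholder for real allocation
  bool is_created() const { return entries != nullptr; }
};

static_assert(!std::is_trivially_default_constructible<table_with_ctor_demo>::value,
              "a user-provided constructor makes the type non-trivial");
static_assert(std::is_trivially_default_constructible<table_plain_demo>::value,
              "no constructor: trivial default construction");

// Zero-initialized static; no constructor call is required for it.
static table_plain_demo plain_table_demo;
```

The trade-off is that every user must remember to call the create step before first use, which the constructor would otherwise guarantee.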

The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll leave the
rest to the respective maintainers of the pieces of the compiler.

Thanks,
Richard.

 
 Index: gcc/java/ChangeLog
 
 2012-10-01  Lawrence Crowl  cr...@google.com
 
   * Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
   (JCFDUMP_OBJS): Add dependence on hash-table.o.
   (jcf-io.o): Add dependence on hash-table.h.
   * jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.
 
 Index: gcc/c/ChangeLog
 
 2012-10-01  Lawrence Crowl  cr...@google.com
 
   * Make-lang.in (c-decl.o): Add dependence on hash-table.h.
   * c-decl.c (detect_field_duplicates_hash): Change to new type-safe
   hash table.
 
 Index: gcc/objc/ChangeLog
 
 2012-10-01  Lawrence Crowl  cr...@google.com
 
   * Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
   (objc-act.o): Add dependence on hash-table.h.
   * objc-act.c (objc_detect_field_duplicates): Change to new type-safe
   hash table.
 
 Index: gcc/ChangeLog
 
 2012-10-01  Lawrence Crowl  cr...@google.com
 
   * Makefile.in (fold-const.o): Add dependence on hash-table.h.
   (dse.o): Likewise.
   (cfg.o): Likewise.
   * fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
   * (print_fold_checksum): Likewise.
   * cfg.c (var bb_original): Likewise.
   * (var bb_copy): Likewise.
   * (var loop_copy): Likewise.
   * hash-table.h (template hash_table): Constify parameters for find...
   and remove_elt... member functions.
 (hash_table::empty) Correct size expression.
 (hash_table::clear_slot) Correct deleted entry assignment.
   * dse.c (var rtx_group_table): Change to new type-safe hash table.
 
 Index: gcc/cp/ChangeLog
 
 2012-10-01  Lawrence Crowl  cr...@google.com
 
   * Make-lang.in (class.o): Add dependence on hash-table.h.
   (tree.o): Likewise.
   (semantics.o): Likewise.
   * class.c (fixed_type_or_null): Change to new type-safe hash table.
   * tree.c (verify_stmt_tree): Likewise.
   (verify_stmt_tree_r): Likewise.
   * semantics.c (struct nrv_data): Likewise.
 
 
 Index: gcc/java/Make-lang.in
 ===
 --- gcc/java/Make-lang.in (revision 191941)
 +++ gcc/java/Make-lang.in (working copy)
 @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o
 java/mangle.o \
java/mangle_name.o java/builtins.o java/resource.o \
java/jcf-depend.o \
 -  java/jcf-path.o java/boehm.o java/java-gimplify.o
 +  java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o
 
  JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o
 java/jcf-path.o \
 - java/win32-host.o java/zextract.o ggc-none.o
 + java/win32-host.o java/zextract.o ggc-none.o hash-table.o
 
  JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o
 
 @@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify
  # jcf-io.o needs $(ZLIBINC) added to cflags.
  CFLAGS-java/jcf-io.o += $(ZLIBINC)
  java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 -  $(JAVA_TREE_H) java/zipfile.h
 +  $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H)
 
  # jcf-path.o needs a -D.
  CFLAGS-java/jcf-path.o += \
 Index: gcc/java/jcf-io.c
 ===
 --- gcc/java/jcf-io.c (revision 191941)
 +++ gcc/java/jcf-io.c (working copy)
 @@ -31,7 +31,7 @@ The Free Software Foundation is independ
  #include jcf.h
  #include tree.h
  #include java-tree.h
 -#include hashtab.h
 +#include hash-table.h
  #include dirent.h
 
  #include zlib.h
 @@ -271,20 +271,34 @@ find_classfile (char *filename, JCF *jcf
return open_class (filename, jcf, fd, dep_name);
  }
 
 -/* Returns 1 if the CLASSNAME (really a char *) matches the name
 -   stored in TABLE_ENTRY (also a char *).  */
 
 -static int
 -memoized_class_lookup_eq (const void *table_entry, const void *classname)
 +/* Hash table 

Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-02 Thread Richard Guenther
On Tue, Oct 2, 2012 at 1:11 AM, Xinliang David Li davi...@google.com wrote:
 On Mon, Oct 1, 2012 at 4:05 PM, Sharad Singhai sing...@google.com wrote:
 Thanks for tracking down and fixing the powerpc port.

 The dump_kind_p () check is redundant but canonical form here. I
 think blocks of dump code guarded by if dump_kind_p (...) might be
 easier to read/maintain.


 I find it confusing to be honest. The redundant check serves no purpose.

The check should be inlined and avoid the call to the diagnostic routine,
thus speeding up compile time.  We should use this pattern, especially
if it guards multiple calls.
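A sketch of that pattern, with hypothetical names throughout (these are stand-ins, not the real dumpfile.h interface): a cheap inline predicate guards one or more calls to an out-of-line varargs routine, so that when dumping is disabled no call is made and no arguments are marshalled.

```cpp
#include <cstdarg>
#include <cstdio>

enum { MSG_NOTE_DEMO = 1 << 0 };     // hypothetical message kind
static int enabled_kinds_demo = 0;   // which kinds are currently dumped
static int slow_calls_demo = 0;      // counts entries into the slow routine

// Cheap predicate, meant to be inlined at every use.
inline bool dump_enabled_demo(int kind) {
  return (enabled_kinds_demo & kind) != 0;
}

// Out-of-line varargs routine: comparatively expensive to call.
int dump_printf_demo(int kind, const char *fmt, ...) {
  ++slow_calls_demo;
  if (!dump_enabled_demo(kind))
    return 0;
  va_list ap;
  va_start(ap, fmt);
  int n = std::vfprintf(stderr, fmt, ap);
  va_end(ap);
  return n;
}

// The guard looks redundant, but it skips both calls when dumping is off.
void report_density_demo(int density_pct, int cost) {
  if (dump_enabled_demo(MSG_NOTE_DEMO)) {
    dump_printf_demo(MSG_NOTE_DEMO, "density %d%%, ", density_pct);
    dump_printf_demo(MSG_NOTE_DEMO, "cost %d exceeds threshold\n", cost);
  }
}
```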

Richard.

 David

 Sharad
 Sharad


 On Mon, Oct 1, 2012 at 3:45 PM, Xinliang David Li davi...@google.com wrote:
 On Mon, Oct 1, 2012 at 2:37 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
 I tracked down some of the other code that previously used REPORT_DETAILS, and
 MSG_NOTE is the new way to do the same thing.  This bootstraps and no
 unexpected errors occur during make check.  Is it ok to install?

 2012-10-01  Michael Meissner  meiss...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (toplevel): Include dumpfile.h.
 (rs6000_density_test): Rework to accommodate 09-30 change by Sharad
 Singhai.

 * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency.

 Index: gcc/config/rs6000/rs6000.c
 ===
 --- gcc/config/rs6000/rs6000.c  (revision 191932)
 +++ gcc/config/rs6000/rs6000.c  (working copy)
 @@ -58,6 +58,7 @@
  #include tm-constrs.h
  #include opts.h
  #include tree-vectorizer.h
 +#include dumpfile.h
  #if TARGET_XCOFF
  #include xcoffout.h  /* get declarations of xcoff_*_section_name */
  #endif
 @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d
 && vec_cost + not_vec_cost > DENSITY_SIZE_THRESHOLD)
  {
data->cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100;
 -  if (vect_print_dump_info (REPORT_DETAILS))
 -   fprintf (vect_dump,
 -"density %d%%, cost %d exceeds threshold, penalizing "
 -"loop body cost by %d%%", density_pct,
 -vec_cost + not_vec_cost, DENSITY_PENALTY);
 +  if (dump_kind_p (MSG_NOTE))

 Is this check needed? Seems redundant.

 David


 +   dump_printf_loc (MSG_NOTE, vect_location,
 +"density %d%%, cost %d exceeds threshold, penalizing "
 +"loop body cost by %d%%", density_pct,
 +vec_cost + not_vec_cost, DENSITY_PENALTY);
  }
  }

 Index: gcc/config/rs6000/t-rs6000
 ===
 --- gcc/config/rs6000/t-rs6000  (revision 191932)
 +++ gcc/config/rs6000/t-rs6000  (working copy)
 @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety
$(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \
output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \
$(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \
 -  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H)
 +  cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h

  rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \
  $(srcdir)/config/rs6000/rs6000-protos.h \

 --
 Michael Meissner, IBM
 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
 meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899



Re: [PATCH] Fix test breakage, was: Add option for dumping to stderr (issue6190057)

2012-10-02 Thread Richard Guenther
On Tue, Oct 2, 2012 at 1:31 AM, Sharad Singhai sing...@google.com wrote:
 Here is a patch to fix test breakage caused by r191883. Bootstrapped
 on x86_64 and tested with
 make -k check RUNTESTFLAGS=--target_board=unix/\{,-m32\}.

 Okay for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Sharad

 2012-10-01  Sharad Singhai  sing...@google.com

 * tree-vect-stmts.c (vectorizable_operation): Add missing return.

 testsuite/Changelog

 * gfortran.dg/vect/vect.exp: Change verbose vectorizer dump options
 to fix test failures caused by r191883.
 * gcc.dg/tree-ssa/gen-vect-11.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-2.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-32.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-11a.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-26.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-11b.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-11c.c: Likewise.
 * gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
 * testsuite/gcc.target/i386/vect-double-1.c: Fix test. Missing entry
 from r191883.


 Index: testsuite/gfortran.dg/vect/vect.exp
 ===
 --- testsuite/gfortran.dg/vect/vect.exp (revision 191883)
 +++ testsuite/gfortran.dg/vect/vect.exp (working copy)
 @@ -26,7 +26,7 @@ set DEFAULT_VECTCFLAGS 

  # These flags are used for all targets.
  lappend DEFAULT_VECTCFLAGS -O2 -ftree-vectorize -fno-vect-cost-model \
 -  -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats
 +  -fdump-tree-vect-details

  # If the target system supports vector instructions, the default action
  # for a test is 'run', otherwise it's 'compile'.  Save current default.
 Index: testsuite/gcc.dg/tree-ssa/gen-vect-11.c
 ===
 --- testsuite/gcc.dg/tree-ssa/gen-vect-11.c (revision 191883)
 +++ testsuite/gcc.dg/tree-ssa/gen-vect-11.c (working copy)
 @@ -1,6 +1,6 @@
  /* { dg-do run { target vect_cmdline_needed } } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=3
 -fwrapv -fdump-tree-vect-stats } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=3
 -fwrapv -fdump-tree-vect-stats -mno-sse { target { i?86-*-*
 x86_64-*-* } } } */
 +/* { dg-options -O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details } */
 +/* { dg-options -O2 -ftree-vectorize -fwrapv
 -fdump-tree-vect-details -mno-sse { target { i?86-*-* x86_64-*-* } }
 } */

  #include stdlib.h

 Index: testsuite/gcc.dg/tree-ssa/gen-vect-2.c
 ===
 --- testsuite/gcc.dg/tree-ssa/gen-vect-2.c (revision 191883)
 +++ testsuite/gcc.dg/tree-ssa/gen-vect-2.c (working copy)
 @@ -1,6 +1,6 @@
  /* { dg-do run { target vect_cmdline_needed } } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } }
 */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details
 -mno-sse { target { i?86-*-* x86_64-*-* } } } */

  #include stdlib.h

 Index: testsuite/gcc.dg/tree-ssa/gen-vect-32.c
 ===
 --- testsuite/gcc.dg/tree-ssa/gen-vect-32.c (revision 191883)
 +++ testsuite/gcc.dg/tree-ssa/gen-vect-32.c (working copy)
 @@ -1,6 +1,6 @@
  /* { dg-do run { target vect_cmdline_needed } } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } }
 */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details
 -mno-sse { target { i?86-*-* x86_64-*-* } } } */

  #include stdlib.h

 Index: testsuite/gcc.dg/tree-ssa/gen-vect-25.c
 ===
 --- testsuite/gcc.dg/tree-ssa/gen-vect-25.c (revision 191883)
 +++ testsuite/gcc.dg/tree-ssa/gen-vect-25.c (working copy)
 @@ -1,6 +1,6 @@
  /* { dg-do run { target vect_cmdline_needed } } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats } */
 -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4
 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } }
 */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */
 +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details
 -mno-sse { target { i?86-*-* x86_64-*-* } } } */

  #include stdlib.h

 Index: testsuite/gcc.dg/tree-ssa/gen-vect-11a.c
 ===
 --- testsuite/gcc.dg/tree-ssa/gen-vect-11a.c (revision 191883)
 +++ 

Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-02 Thread Richard Guenther
On Mon, Oct 1, 2012 at 8:39 PM, Gabriel Dos Reis
g...@integrable-solutions.net wrote:
 On Mon, Oct 1, 2012 at 1:27 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
 On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote:
 Your change on September 30th, breaks the powerpc port because the
 REPORT_DETAILS value in the enumeration is no longer there, and the
 rs6000_density_test function was using that.  Please in the future, when you
 are making global changes, grep for uses of enum values in all of the 
 machine
 dependent directories so we can avoid breakage like this.

 Also, in looking at the changes, given we are already up to 28 TDF_ flags, I
 would recommend immediately adding a new type that is the TDF flagword type.
 Thus it will be a lot simpler when we add 4 more TDF flags and have to change
 the type from int to HOST_WIDE_INT.

 Agreed that we need an abstraction here.

Some TLC as well - the flags have various meanings (some control dumping,
some, like TDF_TREE, seem to be unrelated - the MSG ones probably don't
need the same number-space as well, not all flags are used anymore -
TDF_MEMSYMS?).

But yes, an abstraction is needed.  But I wouldn't suggest HOST_WIDE_INT
but int -> uint32_t instead (possibly going uint64_t).
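A minimal sketch of such an abstraction, with hypothetical names (these are not the real TDF_ definitions): one typedef names the flag word, flags are built from that type, and widening later means changing a single line.

```cpp
#include <cstdint>

// Hypothetical flag-word type; switch to std::uint64_t here when the
// 32 available bits run out.
typedef std::uint32_t dump_flags_demo_t;

// Flags are built from the typedef, so bit 31 is well defined, which is
// not portable for a shift on a signed 32-bit int.
const dump_flags_demo_t FLAG_DETAILS_DEMO = dump_flags_demo_t(1) << 3;
const dump_flags_demo_t FLAG_TOP_DEMO = dump_flags_demo_t(1) << 31;

inline bool has_flag_demo(dump_flags_demo_t word, dump_flags_demo_t flag) {
  return (word & flag) != 0;
}
```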

Richard.

 -- Gaby


Re: abs(long long)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 4:21 AM, Marc Glisse marc.gli...@inria.fr wrote:
 On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:

 On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote:

 (Forgot libstdc++...)


 Hello,

 here is the patch from PR54686. Several notes:

 * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to
 do,
 but still) is meant to return a double...


 don't we have a core issue about preferring unsigned -> long or long long?


 Here I am talking of a library issue: the wording that says that there are
 sufficient overloads such that integer types call the double version of math
 functions. It is fairly obvious that it doesn't apply to abs(long) for
 instance which has an explicit overload. For short or unsigned, I still read
 it as saying that it converts to double...

I understand that it is originally a library issue, but I don't think
it makes sense to resolve it in isolation of that core issue.




 * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it
 should use defined(__SIZEOF_INT128__), which would help other compilers.


 Why would that be a problem with the appropriate #define?


 The library installed by the system was compiled with g++, and is then used
 with clang++. If we can avoid installing 2 config.h files to make that
 work...

Two things:
  1. that is clearly a clang problem.  I don't think it is libstdc++'s job
  to try to solve clang's misguided configuration and installation.

  2. I am not sure you understand what I wrote: you can leave the
  use of the current macro the way it is if you appropriately
  define it in terms of what you want to change it to.
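One way to read point 2, as a sketch: MYLIB_USE_INT128 below is a made-up stand-in for the configure-time macro, defined from the compiler's predefined __SIZEOF_INT128__, so that existing uses of the macro do not change. The function abs_wide_demo is likewise hypothetical.

```cpp
// MYLIB_USE_INT128 is a hypothetical stand-in for the library's
// configure-time macro; it is not a real libstdc++ name.
#if defined(__SIZEOF_INT128__)
# define MYLIB_USE_INT128 1
#endif

// Code that tests the macro stays exactly as it was written before.
#ifdef MYLIB_USE_INT128
inline unsigned long long abs_wide_demo(long long v) {
  __int128 w = v;  // widen first, then take the magnitude
  if (w < 0)
    w = -w;
  return static_cast<unsigned long long>(w);
}
#else
inline unsigned long long abs_wide_demo(long long v) {
  // Fallback without __int128: unsigned negation avoids signed overflow.
  return v < 0 ? 0ULL - static_cast<unsigned long long>(v)
               : static_cast<unsigned long long>(v);
}
#endif
```

With this arrangement the decision is made by whichever compiler includes the header, not by the compiler that built the library.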





 * newlib has llabs, according to the doc. It would be good to know what
 newlib is missing for libstdc++ to detect it as C99-ready.

 I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu
 and Oleg Endo did a basic check on sh/newlib. I'll do a last check after
 the
 review (no point if the patch needs changing again).


 In general, I think I have a bias toward using compiler intrinsics,
 about which the compiler already has a lot of knowledge.


 More precisely, does that mean you want __builtin_llabs instead of ::llabs?
 I thought the compiler knew they were the same.

Yes. Another reason is that it simplifies the implementation AND if
people want to do something with the intrinsics' fallback
libstdc++ will gracefully deliver that.
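The two spellings under discussion, side by side, as a sketch that assumes GCC or Clang (for the built-in) and a C99-capable C library (for llabs). The wrapper names are made up for this note.

```cpp
#include <cstdlib>

// __builtin_llabs is a compiler built-in in GCC and Clang; it does not
// depend on the C library headers declaring llabs.
inline long long abs_builtin_demo(long long v) { return __builtin_llabs(v); }

// std::llabs is the library spelling; it is only usable when the headers
// provide it (C99 / C++11).
inline long long abs_library_demo(long long v) { return std::llabs(v); }
```

Both are expected to give the same result; the difference is which one remains available when the C library is not detected as C99-ready.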

-- Gaby


Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Paolo Bonzini
Il 02/10/2012 10:49, Steven Bosscher ha scritto:
 On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini bonz...@gnu.org wrote:
 Il 02/10/2012 09:28, Steven Bosscher ha scritto:
   My experience shows that these lists are usually 1-2 elements. Although in
 this case, there are pseudos with huge number elements (hundreds).  I tried
 -fweb for this tests because it can decrease the number elements but GCC (I
 don't know what pass) scales even worse: after 20 min of waiting and when
 virt memory achieved 20GB I stopped it.
 Ouch :-)

 The webizer itself never even runs, the compiler blows up somewhere
 during the df_analyze call from web_main. The issue here is probably
 in the DF_UD_CHAIN problem or in the DF_RD problem.

 /me is glad to have fixed fwprop when his GCC contribution time was more
 than 1-2 days per year...
 
 I thought you spent more time on GCC nowadays, working for Red Hat?

No, I work on QEMU most of the time. :)  Knowing myself, if I had
GCC-related assignments you'd see me _a lot_ on upstream mailing lists!

 Unfortunately, the fwprop solution (actually a rewrite) was very
 specific to the problem and cannot be reused in other parts of the compiler.
 
 That'd be too bad... But is this really true? I thought you had
 something done that builds chains only for USEs reached by multiple
 DEFs? That's the only interesting kind for web, too.

No, it's the other way round.  I have a dataflow problem that recognizes
USEs reached by multiple DEFs, so that I can use a dominator walk to
build singleton def-use chains.  It's very similar to how you build SSA,
but punting instead of inserting phis.
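A rough sketch of that idea, not fwprop's actual code: for a single register, a forward dataflow pass computes for each block whether exactly one definition reaches its entry. Where two different definitions meet, the state becomes "multiple" and the use is given up on, where SSA construction would insert a phi. The CFG representation and every name here are assumptions made for the example, and the sketch treats "no definition yet" as neutral at joins.

```cpp
#include <cstddef>
#include <vector>

const int TOP_DEMO = -1;    // no definition seen yet
const int MULTI_DEMO = -2;  // more than one definition reaches: punt

struct block_demo {
  std::vector<int> preds;  // indices of predecessor blocks
  int def;                 // id (>= 0) of the definition in this block, or TOP_DEMO
};

// Join of two states: equal definitions stay single, different ones punt.
inline int meet_demo(int a, int b) {
  if (a == TOP_DEMO) return b;
  if (b == TOP_DEMO) return a;
  return a == b ? a : MULTI_DEMO;
}

// Returns, per block, the unique definition reaching the block entry,
// MULTI_DEMO if several reach it, or TOP_DEMO if none does.
std::vector<int> reaching_single_def_demo(const std::vector<block_demo> &cfg) {
  std::vector<int> in(cfg.size(), TOP_DEMO), out(cfg.size(), TOP_DEMO);
  bool changed = true;
  while (changed) {  // iterate to a fixed point
    changed = false;
    for (std::size_t b = 0; b < cfg.size(); ++b) {
      int new_in = TOP_DEMO;
      for (int p : cfg[b].preds)
        new_in = meet_demo(new_in, out[p]);
      // A definition in the block kills whatever reached its entry.
      int new_out = cfg[b].def != TOP_DEMO ? cfg[b].def : new_in;
      if (new_in != in[b] || new_out != out[b]) {
        in[b] = new_in;
        out[b] = new_out;
        changed = true;
      }
    }
  }
  return in;
}
```

A use in a block whose entry state is a single definition id gets a singleton def-use chain; a use under MULTI_DEMO is simply left alone.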

Another solution is to build factored use-def chains for web, and use
them instead of RD.  In the end it's not very different from regional
live range splitting, since the phi functions factor out the state of
the pass at loop (that is region) boundaries.  I thought you had looked
at FUD chains years ago?

 FWIW: part of the problem for this particular test case is that there
 are many registers with partial defs (vector registers) and the RD
 problem doesn't (and probably cannot) keep track of one partial
 def/use killing another partial def/use.

So they are subregs of regs?  Perhaps they could be represented with
VEC_MERGE to break the live range:

 (set (reg:V4SI 94) (vec_merge:V4SI (reg:V4SI 94)
(const_vector:V4SI [(const_int 0)
(const_int 0)
(const_int 0)
(reg:SI 95)])
(const_int 7)))

And then reload, or something after reload, would know how to split
these when spilling V4SI to memory.

Paolo


Re: [PATCH] Changes in mode switching

2012-10-02 Thread Vladimir Yakovlev
2012/9/30 Uros Bizjak ubiz...@gmail.com:
 On Thu, Sep 20, 2012 at 8:35 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Thu, Sep 20, 2012 at 8:06 AM, Vladimir Yakovlev vbyakov...@gmail.com 
 wrote:
 The compiler with the patch and without post_reload.patch is built and works
 successfully. It has the only failure with avx-vzeroupper-3 test because of
 post reload problem.

 Ok, can you please elaborate a bit on this failure? Perhaps someone has
 an idea why reload moves unspec_volatile around?

 LRA will eventually replace reload in the near future [1], does LRA
 also move unspec_volatile vzeroupper around?

 [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html

 Uros.

I tried my patch with LRA. It works fine. The test avx-vzeroupper-3
runs successfully, unspec_volatile vzeroupper is not moved around in
LRA.

Vladimir


Re: [PATCH] Add option for dumping to stderr (issue6190057)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 4:31 AM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Mon, Oct 1, 2012 at 8:39 PM, Gabriel Dos Reis
 g...@integrable-solutions.net wrote:
 On Mon, Oct 1, 2012 at 1:27 PM, Michael Meissner
 meiss...@linux.vnet.ibm.com wrote:
 On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote:
 Your change on September 30th, breaks the powerpc port because the
 REPORT_DETAILS value in the enumeration is no longer there, and the
 rs6000_density_test function was using that.  Please in the future, when 
 you
 are making global changes, grep for uses of enum values in all of the 
 machine
 dependent directories so we can avoid breakage like this.

 Also, in looking at the changes, given we are already up to 28 TDF_ flags, I
 would recommend immediately adding a new type that is the TDF flagword type.
 Thus it will be a lot simpler when we add 4 more TDF flags and have to 
 change
 the type from int to HOST_WIDE_INT.

 Agreed that we need an abstraction here.

 Some TLC as well - the flags have various meanings (some control dumping,
 some, like TDF_TREE, seem to be unrelated - the MSG ones probably don't
 need the same number-space as well, not all flags are used anymore -
 TDF_MEMSYMS?).

TDF_* flags weren't originally designed for those :-/


 But yes, an abstraction is needed.  But I wouldn't suggest HOST_WIDE_INT
 but int -> uint32_t instead (possibly going uint64_t).

That makes sense.

-- Gaby


Re: [PATCH] Changes in mode switching

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 11:35 AM, Vladimir Yakovlev vbyakov...@gmail.com wrote:
 The compiler with the patch and without post_reload.patch is built and 
 works
 successfully. It has the only failure with avx-vzeroupper-3 test because of
 post reload problem.

 Ok, can you please elaborate a bit on this failure? Perhaps someone has
 an idea why reload moves unspec_volatile around?

 LRA will eventually replace reload in the near future [1], does LRA
 also move unspec_volatile vzeroupper around?

 [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html

 I tried my patch with LRA. It works fine. The test avx-vzeroupper-3
 runs successfully, unspec_volatile vzeroupper is not moved around in
 LRA.

Great!

This also means +1 to include LRA in 4.8 from the x86 maintainer. We also
expect spill failure fixes and other improvements for pre-reload
scheduling from LRA.

Uros.


Re: [PATCH] Changes in mode switching

2012-10-02 Thread Vladimir Yakovlev
Will we wait for the LRA commit, or is it possible to commit the
vzeroupper patch to trunk now?

2012/10/2 Uros Bizjak ubiz...@gmail.com:
 On Tue, Oct 2, 2012 at 11:35 AM, Vladimir Yakovlev vbyakov...@gmail.com 
 wrote:
 The compiler with the patch and without post_reload.patch is built and 
 works
 successfully. It has the only failure with avx-vzeroupper-3 test because 
 of
 post reload problem.

 Ok, can you please elaborate a bit on this failure? Perhaps someone has
 an idea why reload moves unspec_volatile around?

 LRA will eventually replace reload in the near future [1], does LRA
 also move unspec_volatile vzeroupper around?

 [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html

 I tried my patch with LRA. It works fine. The test avx-vzeroupper-3
 runs successfully, unspec_volatile vzeroupper is not moved around in
 LRA.

 Great!

 This also means +1 to include LRA in 4.8 from the x86 maintainer. We also
 expect spill failure fixes and other improvements for pre-reload
 scheduling from LRA.

 Uros.


Re: [PATCH] Changes in mode switching

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 12:08 PM, Vladimir Yakovlev vbyakov...@gmail.com wrote:
 Will we wait for the LRA commit, or is it possible to commit the
 vzeroupper patch to trunk now?

Since we can emit vzeroupper now, we will wait for LRA.

Uros.


[SH] PR 50457 - Cleanup linux-atomic

2012-10-02 Thread Oleg Endo
Hello,

This is the patch as proposed in the PR to make
libgcc/config/sh/linux-atomic use the appropriate compiler generated
atomic built-in functions depending on the currently selected
atomic-model.
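The new linux-atomic.c is not shown in this message, so the following is only a sketch of the general approach under that caveat: library entry points written as thin wrappers around the compiler's __sync built-ins, leaving the instruction sequence to whatever the compiler generates for the selected model. The demo_ function names are made up; the real file exports the __sync_*_N symbols.

```cpp
#include <cstdint>

// Hypothetical wrapper: returns the old value and stores 'desired' only
// if the old value equals 'expected'.
extern "C" std::uint32_t demo_val_compare_and_swap_4(std::uint32_t *ptr,
                                                     std::uint32_t expected,
                                                     std::uint32_t desired) {
  return __sync_val_compare_and_swap(ptr, expected, desired);
}

// Hypothetical wrapper: atomically stores 'value' and returns the old one.
extern "C" std::uint32_t demo_lock_test_and_set_4(std::uint32_t *ptr,
                                                  std::uint32_t value) {
  return __sync_lock_test_and_set(ptr, value);
}
```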

Tested on 191894 with 'make all-gcc' and by compiling code to see if the
__SH_ATOMIC_MODEL_*__ defines work as expected.  The new file
linux-atomic.c was tested by compiling it separately and eyeballing the
generated code.

OK?

Cheers,
Oleg

gcc/ChangeLog:

PR target/50457
* config/sh/sh.c (parse_validate_atomic_model_option): Handle 
name strings in sh_atomic_model.
* config/sh/sh.h (TARGET_CPU_CPP_BUILTINS): Move macro 
implementation to ...
* config/sh/sh-c.c (sh_cpu_cpp_builtins): ... this new function.
Add __SH1__ and __SH2__ defines.  Add __SH_ATOMIC_MODEL_*__ 
define.
* config/sh/sh-protos.h (sh_atomic_model): Add name and 
cdef_name variables.
(sh_cpu_cpp_builtins): Declare new function.

libgcc/ChangeLog:

PR target/50457
* config/sh/linux-atomic.S: Delete.
* config/sh/linux-atomic.c: New.
* config/sh/t-linux (LIB2ADD): Replace linux-atomic.S with 
linux-atomic.c.  Add cflags to disable warnings.
Index: libgcc/config/sh/t-linux
===
--- libgcc/config/sh/t-linux	(revision 191894)
+++ libgcc/config/sh/t-linux	(working copy)
@@ -1,9 +1,13 @@
 LIB1ASMFUNCS_CACHE = _ic_invalidate _ic_invalidate_array
 
-LIB2ADD = $(srcdir)/config/sh/linux-atomic.S
+LIB2ADD = $(srcdir)/config/sh/linux-atomic.c
 
 HOST_LIBGCC2_CFLAGS += -mieee -DNO_FPSCR_VALUES
 
+# Silence atomic built-in related warnings in linux-atomic.c.
+# Unfortunately the conflicting types warning can't be disabled selectively.
+HOST_LIBGCC2_CFLAGS += -w -Wno-sync-nand
+
 # Override t-slibgcc-elf-ver to export some libgcc symbols with
 # the symbol versions that glibc used, and hide some lib1func
 # routines which should not be called via PLT.  We have to create
Index: libgcc/config/sh/linux-atomic.S
===
--- libgcc/config/sh/linux-atomic.S	(revision 191894)
+++ libgcc/config/sh/linux-atomic.S	(working copy)
@@ -1,223 +0,0 @@
-/* Copyright (C) 2006, 2008, 2009 Free Software Foundation, Inc.
-
-   This file is part of GCC.
-
-   GCC is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   GCC is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   Under Section 7 of GPL version 3, you are granted additional
-   permissions described in the GCC Runtime Library Exception, version
-   3.1, as published by the Free Software Foundation.
-
-   You should have received a copy of the GNU General Public License and
-   a copy of the GCC Runtime Library Exception along with this program;
-   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
-   http://www.gnu.org/licenses/.  */
-
-
-!! Linux specific atomic routines for the Renesas / SuperH SH CPUs.
-!! Linux kernel for SH3/4 has implemented the support for software
-!! atomic sequences.
-
-#define FUNC(X)		.type X,@function
-#define HIDDEN_FUNC(X)	FUNC(X); .hidden X
-#define ENDFUNC0(X)	.Lfe_##X: .size X,.Lfe_##X-X
-#define ENDFUNC(X)	ENDFUNC0(X)
-
-#if ! __SH5__
-
-#define ATOMIC_TEST_AND_SET(N,T,EXT) \
-	.global	__sync_lock_test_and_set_##N; \
-	HIDDEN_FUNC(__sync_lock_test_and_set_##N); \
-	.align	2; \
-__sync_lock_test_and_set_##N:; \
-	mova	1f, r0; \
-	nop; \
-	mov	r15, r1; \
-	mov	#(0f-1f), r15; \
-0:	mov.##T	@r4, r2; \
-	mov.##T	r5, @r4; \
-1:	mov	r1, r15; \
-	rts; \
-	 EXT	r2, r0; \
-	ENDFUNC(__sync_lock_test_and_set_##N)
-
-ATOMIC_TEST_AND_SET (1,b,extu.b)
-ATOMIC_TEST_AND_SET (2,w,extu.w)
-ATOMIC_TEST_AND_SET (4,l,mov)
-
-#define ATOMIC_COMPARE_AND_SWAP(N,T,EXTS,EXT) \
-	.global	__sync_val_compare_and_swap_##N; \
-	HIDDEN_FUNC(__sync_val_compare_and_swap_##N); \
-	.align	2; \
-__sync_val_compare_and_swap_##N:; \
-	mova	1f, r0; \
-	EXTS	r5, r5; \
-	mov	r15, r1; \
-	mov	#(0f-1f), r15; \
-0:	mov.##T	@r4, r2; \
-	cmp/eq	r2, r5; \
-	bf	1f; \
-	mov.##T	r6, @r4; \
-1:	mov	r1, r15; \
-	rts; \
-	 EXT	r2, r0; \
-	ENDFUNC(__sync_val_compare_and_swap_##N)
-
-ATOMIC_COMPARE_AND_SWAP (1,b,exts.b,extu.b)
-ATOMIC_COMPARE_AND_SWAP (2,w,exts.w,extu.w)
-ATOMIC_COMPARE_AND_SWAP (4,l,mov,mov)
-
-#define ATOMIC_BOOL_COMPARE_AND_SWAP(N,T,EXTS) \
-	.global	__sync_bool_compare_and_swap_##N; \
-	HIDDEN_FUNC(__sync_bool_compare_and_swap_##N); \
-	.align	2; \
-__sync_bool_compare_and_swap_##N:; \
-	mova	1f, r0; \
-	EXTS	r5, r5; \
-	mov	r15, r1; \
-	mov	#(0f-1f), r15; \
-0:	mov.##T	@r4, r2; \
-	cmp/eq	r2, 

[Ada] Couple of minor tweaks

2012-10-02 Thread Eric Botcazou
This avoids applying the NRV optimization for small structures and creating 
useless elaboration variables for loops.

Tested on x86_64-suse-linux, applied on the mainline and 4.7 branch.


2012-10-02  Eric Botcazou  ebotca...@adacore.com

* gcc-interface/decl.c (elaborate_expression_1): Use the variable for
bounds of loop iteration scheme only for locally defined subtypes.

* gcc-interface/trans.c (gigi): Fix formatting.
(build_return_expr): Apply the NRV optimization only for BLKmode.


-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 191953)
+++ gcc-interface/decl.c	(working copy)
@@ -6165,6 +6165,7 @@ elaborate_expression_1 (tree gnu_expr, E
   use_variable = expr_variable_p
 		 && (expr_global_p
 		     || (!optimize
+		         && definition
 			 && Is_Itype (gnat_entity)
 			 && Nkind (Associated_Node_For_Itype (gnat_entity))
 			    == N_Loop_Parameter_Specification));
Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 191953)
+++ gcc-interface/trans.c	(working copy)
@@ -332,7 +332,7 @@ gigi (Node_Id gnat_root, int max_gnat_no
 #ifdef ORDINARY_MAP_INSTANCE
   map = LINEMAPS_ORDINARY_MAP_AT (line_table, i);
   if (flag_debug_instances)
-ORDINARY_MAP_INSTANCE(map) = file_info_ptr[i].Instance;
+	ORDINARY_MAP_INSTANCE (map) = file_info_ptr[i].Instance;
 #endif
   linemap_line_start (line_table, file_info_ptr[i].Num_Source_Lines, 252);
   linemap_position_for_column (line_table, 252 - 1);
@@ -3158,6 +3158,7 @@ build_return_expr (tree ret_obj, tree re
   if (optimize
 	  && AGGREGATE_TYPE_P (operation_type)
 	  && !TYPE_IS_FAT_POINTER_P (operation_type)
+	  && TYPE_MODE (operation_type) == BLKmode
 	  && aggregate_value_p (operation_type, current_function_decl))
 	{
 	  /* Recognize the temporary created for a return value with variable


PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s

2012-10-02 Thread Basile Starynkevitch
Hello All,

As I observed in http://gcc.gnu.org/ml/gcc/2010-07/msg00248.html and in
http://gcc.gnu.org/ml/gcc/2012-10/msg3.html the mark_hook GTY annotation is 
sometimes incorrectly ignored by gengtype.

The example in http://gcc.gnu.org/ml/gcc/2012-10/msg3.html demonstrates
that incorrect behavior of gengtype (both with gengtype from GCC 4.7, 
and with the current trunk's gengtype). For simplicity, here it is again:

   /* file tmarkh.h */
   #define MYUTAG 1
   union GTY ((desc("%0.u_int"))) myutest_un {
     int GTY((skip)) u_int;
     struct mytest_st GTY ((tag("MYUTAG"))) u_mytest;
   };

   static GTY(()) union myutest_un *myutestptr;

   static inline void mymarker(struct mytest_st*s)
   {
     s->myflag = 1;
   }
   /* eof tmarkh.h */
   
When running gengtype (the one from the trunk, or the gcc-4.7 one) with
  gengtype  -D -v -r gtype.state -P _g-tmarkh.h tmarkh.h
you can observe that the generated _g-tmarkh.h doesn't contain any call to 
mymarker. If the static variable (here myutestptr) is declared with the 
struct mytest_st* type, the marker is emitted.

The reason for that bug is that for GTY-ed union members which are themselves 
GTY-ed structs, the marking of the nested struct is generated inline (for the 
union), and in that case the mark_hook annotation was not used.
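
A toy model of the behaviour described above, written as plain C rather than as gengtype output. Everything here except mymarker, MYUTAG and the two type names is hypothetical: the explicit tag argument and the call_hook switch are assumptions made only to show the difference between "nested struct marked inline without the hook" and "nested struct marked inline with the hook".

```c
#define MYUTAG 1

struct mytest_st { int myflag; };

union myutest_un {
  int u_int;
  struct mytest_st u_mytest;
};

/* The user-supplied hook, as in tmarkh.h.  */
void
mymarker (struct mytest_st *s)
{
  s->myflag = 1;
}

/* Hypothetical marker for the union.  The nested struct is handled
   inline here instead of through a separate struct marker, which is
   the situation in which the hook was being lost.  CALL_HOOK == 0
   models the old behaviour, CALL_HOOK != 0 the patched one.  */
void
mark_myutest_un (union myutest_un *u, int tag, int call_hook)
{
  if (tag == MYUTAG)
    {
      if (call_hook)
	mymarker (&u->u_mytest);
      /* ... inline marking of the fields of u->u_mytest ... */
    }
}
```

With call_hook set to 0 the flag is left untouched, which corresponds to the missing mymarker call in the generated file.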

The attached patch to trunk svn rev 191972 solves this issue
(with it, the generated _g-tmarkh.h is correctly calling mymarker).

The gcc/ChangeLog entry is:


2012-10-02  Basile Starynkevitch  bas...@starynkevitch.net

* gengtype.c (walk_type): Emit mark_hook when inside a
  struct of a union member.



Ok for trunk?

Regards.

-- 
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
Index: gcc-trunk-bstarynk/gcc/gengtype.c
===
--- gcc-trunk-bstarynk/gcc/gengtype.c	(revision 191972)
+++ gcc-trunk-bstarynk/gcc/gengtype.c	(working copy)
@@ -2810,6 +2810,7 @@ walk_type (type_p t, struct walk_type_data *d)
 	const char *oldval = d->val;
 	const char *oldprevval1 = d->prev_val[1];
 	const char *oldprevval2 = d->prev_val[2];
+	const char *structmarkhook = NULL;
 	const int union_p = t->kind == TYPE_UNION;
 	int seen_default_p = 0;
 	options_p o;
@@ -2833,7 +2834,14 @@ walk_type (type_p t, struct walk_type_data *d)
 	  if (!desc && strcmp (o->name, "desc") == 0
 	      && o->kind == OPTION_STRING)
 	    desc = o->info.string;
+	  else if (!structmarkhook && strcmp(o->name, "mark_hook") == 0
+		   && o->kind == OPTION_STRING)
+	structmarkhook = o->info.string;
 
+	if (structmarkhook) 
+	oprintf (d->of, "%*s/*structmarkhook %s */ %s (%s));\n",
+		 d->indent, "", t->u.s.tag,  structmarkhook, oldval);
+	  
 	d->prev_val[2] = oldval;
 	d->prev_val[1] = oldprevval2;
 	if (union_p)


Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-02 Thread Richard Sandiford
Hi Vlad,

Vladimir Makarov vmaka...@redhat.com writes:
 This is the major patch containing all new files.  The patch also adds 
 necessary calls to LRA from IRA.  As the patch is too big, it continues in 
 the next email.

 2012-09-27  Vladimir Makarov  vmaka...@redhat.com

  * Makefile.in (LRA_INT_H): New.
  (OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o,
  lra-constraints.o, lra-eliminations.o, lra-lives.o, and lra-spills.o.
  (ira.o): Add dependence on lra.h.
  (lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New entries.
  (lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto.
  * ira.c: Include lra.h.
  (ira_init_once, ira_init, ira_finish_once): Call lra_start_once,
  lra_init, lra_finish_once in anyway.
  (lra_in_progress): Remove.
  (do_reload): Call LRA.
  * lra.h: New.
  * lra-int.h: Ditto.
  * lra.c: Ditto.
  * lra-assigns.c: Ditto.
  * lra-constraints.c: Ditto.
  * lra-coalesce.c: Ditto.
  * lra-eliminations.c: Ditto.
  * lra-lives.c: Ditto.
  * lra-spills.c: Ditto.
  * doc/passes.texi: Describe LRA pass.

A non-authoritative review of the documentation and lra-eliminations.c:

 +LRA is different from the reload pass in LRA division on small,
 +manageable, and separated sub-tasks.  All LRA transformations and
 +decisions are reflected in RTL as more as possible.  Instruction
 +constraints as a primary source of the info and that minimizes number
 +of target-depended macros/hooks.

 +LRA is run for the targets it were ported.

Suggest something like:

  Unlike the reload pass, intermediate LRA decisions are reflected in
  RTL as much as possible.  This reduces the number of target-dependent
  macros and hooks, leaving instruction constraints as the primary
  source of control.

  LRA is run on targets for which TARGET_LRA_P returns true.

 +/* The virtual registers (like argument and frame pointer) are widely
 +   used in RTL.   Virtual registers should be changed by real hard
 +   registers (like stack pointer or hard frame pointer) plus some
 +   offset.  The offsets are changed usually every time when stack is
 +   expanded.  We know the final offsets only at the very end of LRA.

I always think of "virtual" as [FIRST_VIRTUAL_REGISTER, LAST_VIRTUAL_REGISTER].
Maybe "eliminable" would be better?  E.g.

/* Eliminable registers (like a soft argument or frame pointer) are widely
   used in RTL.  These eliminable registers should be replaced by real hard
   registers (like the stack pointer or hard frame pointer) plus some offset.
   The offsets usually change whenever the stack is expanded.  We know the
   final offsets only at the very end of LRA.

 +   We keep RTL code at most time in such state that the virtual
 +   registers can be changed by just the corresponding hard registers
 +   (with zero offsets) and we have the right RTL code.   To achieve this
 +   we should add initial offset at the beginning of LRA work and update
 +   offsets after each stack expanding.   But actually we update virtual
 +   registers to the same virtual registers + corresponding offsets
 +   before every constraint pass because it affects constraint
 +   satisfaction (e.g. an address displacement became too big for some
 +   target).

Suggest:

   Within LRA, we usually keep the RTL in such a state that the eliminable
   registers can be replaced by just the corresponding hard register
   (without any offset).  To achieve this we should add the initial
   elimination offset at the beginning of LRA and update the offsets
   whenever the stack is expanded.  We need to do this before every
   constraint pass because the choice of offset often affects whether
   a particular address or memory constraint is satisfied.

 +   The final change of virtual registers to the corresponding hard
 +   registers are done at the very end of LRA when there were no change
 +   in offsets anymore:
 +
 +  fp + 42 => sp + 42

"virtual" => "eliminable" if the above is OK.

 +   Such approach requires a few changes in the rest GCC code because
 +   virtual registers are not recognized as real ones in some
 +   constraints and predicates.   Fortunately, such changes are
 +   small.  */

Not sure whether the last paragraph really belongs in the code,
since it's more about the reload->LRA transition.

 +  /* Nonzero if this elimination can be done.  */
 +  bool can_eliminate;
 +  /* CAN_ELIMINATE since the last check.  */
 +  bool prev_can_eliminate;

AFAICT, these two fields are (now) only ever assigned at the same time,
via init_elim_table and setup_can_eliminate.  Looks like we can do
without prev_can_eliminate.  (And the way that the pass doesn't
need to differentiate between the raw CAN_ELIMINATE value and
the processed value feels nice and reassuring.)

 +/* Map: 'from regno' -> to the current elimination, NULL otherwise.
 +   The elimination table may contains more one elimination of a hard
 

Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-02 Thread Bernd Schmidt
On 09/28/2012 12:59 AM, Vladimir Makarov wrote:
 +   We keep RTL code at most time in such state that the virtual
 +   registers can be changed by just the corresponding hard registers
 +   (with zero offsets) and we have the right RTL code.   To achieve this
 +   we should add initial offset at the beginning of LRA work and update
 +   offsets after each stack expanding.   But actually we update virtual
 +   registers to the same virtual registers + corresponding offsets
 +   before every constraint pass because it affects constraint
 +   satisfaction (e.g. an address displacement became too big for some
 +   target).
 +
 +   The final change of virtual registers to the corresponding hard
 +   registers are done at the very end of LRA when there were no change
 +   in offsets anymore:
 +
 +  fp + 42 => sp + 42

Let me try to understand this.  We have (mem (fp)), which we rewrite to
(mem (fp + 42)), but this is intended to represent (mem (sp + 42))?

Wouldn't this fail on any target which has different addressing ranges
for SP and FP?
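
A minimal sketch of the concern, assuming a made-up target: the displacement ranges below are invented for illustration and do not describe any real port, and the function names are hypothetical.

```c
#include <stdbool.h>

enum base_reg { BASE_FP, BASE_SP };

/* Hypothetical addressing ranges: FP-relative addresses take a signed
   8-bit displacement, SP-relative ones an unsigned displacement up to
   1020.  Both ranges are assumptions.  */
bool
disp_ok_p (enum base_reg base, long disp)
{
  if (base == BASE_FP)
    return disp >= -128 && disp <= 127;
  return disp >= 0 && disp <= 1020;
}

/* True if (mem (fp + DISP)) satisfies the FP constraint but would stop
   being a valid address once fp is textually replaced by sp.  */
bool
breaks_on_fp_to_sp_p (long disp)
{
  return disp_ok_p (BASE_FP, disp) && !disp_ok_p (BASE_SP, disp);
}
```

On this imaginary target fp + 42 survives the substitution, while fp - 8 passes the constraint check as written and is no longer valid as sp - 8.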


Bernd


Re: [SH] PR 51244 - Handle T bit -> 0x7FFFFFFF / 0x80000000

2012-10-02 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 This handles the case where the T bit is stored to a reg as the value
 0x7FFFFFFF or 0x80000000.
 Tested on rev 191894 with
 make -k check RUNTESTFLAGS="--target_board=sh-sim
 \{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
 
 and no new failures.
 OK?

OK.

Regards,
kaz


Re: [SH] PR 50457 - Cleanup linux-atomic

2012-10-02 Thread Kaz Kojima
Oleg Endo oleg.e...@t-online.de wrote:
 This is the patch as proposed in the PR to make
 libgcc/config/sh/linux-atomic use the appropriate compiler generated
 atomic built-in functions depending on the currently selected
 atomic-model.
 
 Tested on 191894 with 'make all-gcc' and by compiling code to see if the
 __SH_ATOMIC_MODEL_*__ defines work as expected.  The new file
 linux-atomic.c was tested by compiling it separately and eyeballing the
 generated code.
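
For readers who have not met the two flavours of entry point involved (__sync_val_compare_and_swap_N and __sync_bool_compare_and_swap_N), here is a rough sketch of how the bool form relates to the val form, using the generic GCC built-in. It is not the contents of the new linux-atomic.c; the wrapper name is hypothetical.

```c
#include <stdbool.h>

/* Returns true if *PTR held OLDVAL and was replaced by NEWVAL.
   __sync_val_compare_and_swap is the GCC built-in that returns the
   value *PTR had before the operation.  */
bool
bool_cas_int (int *ptr, int oldval, int newval)
{
  return __sync_val_compare_and_swap (ptr, oldval, newval) == oldval;
}
```

How the built-in is expanded (inline sequence or library call) is what depends on the selected atomic model.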
 
 OK?

OK.

Regards,
kaz


Re: [Patch] Fix PR53397

2012-10-02 Thread Richard Guenther
On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote:

 Hi, 
 
 The below patch fixes the FFT/Scimark regression caused by useless prefetch
 generation.
 
 This fix tries to make prefetch less aggressive by prefetching arrays in the
 inner loop, when the step is invariant in the entire loop nest.
 
 GCC currently tries to prefetch invariant steps when they are in the inner
 loop, but does not check whether the step is variant in outer loops.
 
 In the SciMark FFT case, the trip count of the inner loop varies by a
 non-constant step, which is invariant in the inner loop but varies in the
 outer loop.  This makes the inner loop trip count small (at run time it is
 sometimes as small as 1 iteration).
 
 Prefetching ahead x iterations when the inner loop trip count is smaller than x
 leads to useless prefetches. 
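
The arithmetic behind that last statement can be sketched as follows. This is a simplified model, assuming each iteration i issues exactly one prefetch for the data of iteration i + ahead and that only the current execution of the inner loop counts; the function name is hypothetical and this is not code from tree-ssa-loop-prefetch.c.

```c
/* Number of prefetches issued during one execution of a loop with
   TRIP_COUNT iterations that target an iteration the loop never
   reaches, when prefetching AHEAD iterations in advance.  */
unsigned
useless_prefetches (unsigned trip_count, unsigned ahead)
{
  /* The prefetch issued in iteration i is only consumed by this loop
     if i + ahead < trip_count.  */
  unsigned useful = trip_count > ahead ? trip_count - ahead : 0;
  return trip_count - useful;
}
```

Under this model, a loop that runs once with ahead = 8 issues one prefetch and none of them is used, whereas a 1024-iteration loop wastes only 8 of its 1024 prefetches.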
 
 Flag used: -O3 -march=amdfam10 
 
 Before 
 **  **
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov) **
 **  **
 Using   2.00 seconds min time per kenel.
 Composite Score:  550.50
 FFT Mflops:38.66(N=1024)
 SOR Mflops:   617.61(100 x 100)
 MonteCarlo: Mflops:   173.74
 Sparse matmult  Mflops:   675.63(N=1000, nz=5000)
 LU  Mflops:  1246.88(M=100, N=100)
 
 
 After 
 **  **
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov) **
 **  **
 Using   2.00 seconds min time per kenel.
 Composite Score:  639.20
 FFT Mflops:   479.19(N=1024)
 SOR Mflops:   617.61(100 x 100)
 MonteCarlo: Mflops:   173.18
 Sparse matmult  Mflops:   679.13(N=1000, nz=5000)
 LU  Mflops:  1246.88(M=100, N=100)
 
 GCC regression make check -k passes with x86_64-unknown-linux-gnu
 New tests that PASS:
 
 gcc.dg/pr53397-1.c scan-assembler prefetcht0
 gcc.dg/pr53397-1.c scan-tree-dump aprefetch Issued prefetch
 gcc.dg/pr53397-1.c (test for excess errors)
 gcc.dg/pr53397-2.c scan-tree-dump aprefetch loop variant step
 gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching
 gcc.dg/pr53397-2.c (test for excess errors)
 
 
 Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted.
 
 Ok to commit in trunk?
 
 regards,
 Venkat
 
 gcc/ChangeLog
 +2012-10-01  Venkataramanan Kumar  venkataramanan.ku...@amd.com
 +
 +   * tree-ssa-loop-prefetch.c (gather_memory_references_ref):
 +   Perform non constant step prefetching in inner loop, only
 +   when it is invariant in the entire loop nest.
 +   * testsuite/gcc.dg/pr53397-1.c: New test case.
 +   Checks we are prefetching for loop invariant steps.
 +   * testsuite/gcc.dg/pr53397-2.c: New test case.
 +   Checks we are not prefetching for loop variant steps.
 +
 
 
 Index: gcc/testsuite/gcc.dg/pr53397-1.c
 ===
 --- gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 @@ -0,0 +1,28 @@
 +/* Prefetching when the step is loop invariant.  */
 +
 +/* { dg-do compile } */
 +/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details 
 --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 
 -fdump-tree-aprefetch-details" } */
 +
 +
 +double data[16384];
 +void prefetch_when_non_constant_step_is_invariant(int step, int n)
 +{
 + int a;
 + int b;
 + for (a = 1; a < step; a++) {
 +for (b = 0; b < n; b += 2 * step) {
 +
 +  int i = 2*(b + a);
 +  int j = 2*(b + a + step);
 +
 +
 +  data[j]   = data[i];
 +  data[j+1] = data[i+1];
 +}
 + }
 +}
 +
 +/* { dg-final { scan-tree-dump "Issued prefetch" "aprefetch" } } */
 +/* { dg-final { scan-assembler "prefetcht0" } } */

This (and the case below) needs to be adjusted to only run on the
appropriate hardware.  See for example gcc.dg/tree-ssa/prefetch-8.c
for how to do this.

 +/* { dg-final { cleanup-tree-dump aprefetch } } */
 Index: gcc/testsuite/gcc.dg/pr53397-2.c
 ===
 --- gcc/testsuite/gcc.dg/pr53397-2.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/pr53397-2.c  (revision 0)
 @@ -0,0 +1,29 @@
 +/* Not prefetching when the step is loop variant.  */
 +
 +/* { dg-do compile } */
 +/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details 
 --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 
 -fdump-tree-aprefetch-details" } */
 +
 +
 +double data[16384];
 +void donot_prefetch_when_non_constant_step_is_variant(int step, int n)
 +{ 
 + int a;
 + int b;
 + for (a = 1; a < step; a++,step*=2) {
 

[Ada] Get rid of internal use of N_Return_Statement

2012-10-02 Thread Arnaud Charlet
This patch goes almost all the way in removing N_Return_Statement,
and replacing it by N_Simple_Return_Statement. No test, since no
functional effect.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

* sinfo.adb, sinfo.ads, sem_util.adb, sem_util.ads, types.h,
exp_ch4.adb, exp_ch6.adb: Get rid of internal use of N_Return_Statement.

Index: sinfo.adb
===
--- sinfo.adb   (revision 191972)
+++ sinfo.adb   (working copy)
@@ -370,7 +370,7 @@
begin
   pragma Assert (False
 or else NT (N).Nkind = N_Extended_Return_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   return Flag5 (N);
end By_Ref;
 
@@ -427,7 +427,7 @@
  (N : Node_Id) return Boolean is
begin
   pragma Assert (False
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   return Flag18 (N);
end Comes_From_Extended_Return_Statement;
 
@@ -958,7 +958,7 @@
 or else NT (N).Nkind = N_Extended_Return_Statement
 or else NT (N).Nkind = N_Function_Call
 or else NT (N).Nkind = N_Procedure_Call_Statement
-or else NT (N).Nkind = N_Return_Statement
+or else NT (N).Nkind = N_Simple_Return_Statement
 or else NT (N).Nkind = N_Type_Conversion);
   return Flag13 (N);
end Do_Tag_Check;
@@ -1234,7 +1234,7 @@
 or else NT (N).Nkind = N_Pragma_Argument_Association
 or else NT (N).Nkind = N_Qualified_Expression
 or else NT (N).Nkind = N_Raise_Statement
-or else NT (N).Nkind = N_Return_Statement
+or else NT (N).Nkind = N_Simple_Return_Statement
 or else NT (N).Nkind = N_Type_Conversion
 or else NT (N).Nkind = N_Unchecked_Expression
 or else NT (N).Nkind = N_Unchecked_Type_Conversion);
@@ -2537,7 +2537,7 @@
 or else NT (N).Nkind = N_Allocator
 or else NT (N).Nkind = N_Extended_Return_Statement
 or else NT (N).Nkind = N_Free_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   return Node2 (N);
end Procedure_To_Call;
 
@@ -2670,7 +2670,7 @@
begin
   pragma Assert (False
 or else NT (N).Nkind = N_Extended_Return_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   return Node5 (N);
end Return_Statement_Entity;
 
@@ -2862,7 +2862,7 @@
 or else NT (N).Nkind = N_Allocator
 or else NT (N).Nkind = N_Extended_Return_Statement
 or else NT (N).Nkind = N_Free_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   return Node1 (N);
end Storage_Pool;
 
@@ -3443,7 +3443,7 @@
begin
   pragma Assert (False
 or else NT (N).Nkind = N_Extended_Return_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   Set_Flag5 (N, Val);
end Set_By_Ref;
 
@@ -3500,7 +3500,7 @@
  (N : Node_Id; Val : Boolean := True) is
begin
   pragma Assert (False
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   Set_Flag18 (N, Val);
end Set_Comes_From_Extended_Return_Statement;
 
@@ -4031,7 +4031,7 @@
 or else NT (N).Nkind = N_Extended_Return_Statement
 or else NT (N).Nkind = N_Function_Call
 or else NT (N).Nkind = N_Procedure_Call_Statement
-or else NT (N).Nkind = N_Return_Statement
+or else NT (N).Nkind = N_Simple_Return_Statement
 or else NT (N).Nkind = N_Type_Conversion);
   Set_Flag13 (N, Val);
end Set_Do_Tag_Check;
@@ -4298,7 +4298,7 @@
 or else NT (N).Nkind = N_Pragma_Argument_Association
 or else NT (N).Nkind = N_Qualified_Expression
 or else NT (N).Nkind = N_Raise_Statement
-or else NT (N).Nkind = N_Return_Statement
+or else NT (N).Nkind = N_Simple_Return_Statement
 or else NT (N).Nkind = N_Type_Conversion
 or else NT (N).Nkind = N_Unchecked_Expression
 or else NT (N).Nkind = N_Unchecked_Type_Conversion);
@@ -5601,7 +5601,7 @@
 or else NT (N).Nkind = N_Allocator
 or else NT (N).Nkind = N_Extended_Return_Statement
 or else NT (N).Nkind = N_Free_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = N_Simple_Return_Statement);
   Set_Node2 (N, Val); -- semantic field, no parent set
end Set_Procedure_To_Call;
 
@@ -5734,7 +5734,7 @@
begin
   pragma Assert (False
 or else NT (N).Nkind = N_Extended_Return_Statement
-or else NT (N).Nkind = N_Return_Statement);
+or else NT (N).Nkind = 

[Ada] Ada/C++ missing call to constructor with defaults

2012-10-02 Thread Arnaud Charlet
When the type of an object is a CPP type and the object initialization
requires calling its default C++ constructor, the Ada compiler did not
generate the call to a C++ constructor which has all parameters with
defaults (and hence it covers the default C++ constructor). The
following test must now compile and execute well.

// c_class.h
class Tester {
  public:
Tester(unsigned int a_num = 5, char* a_className = 0);
virtual int dummy();
};

// c_class.cc
#include "c_class.h"
#include <iostream>

Tester::Tester(unsigned int a_num, char* a_className) {
  std::cout << " ctor Tester called " << a_num << ":";

  if (a_className == 0) {
     std::cout << "null";
  }

  std::cout << std::endl;
}
}

int Tester::dummy() {
}

--  c_class_h.ads
pragma Ada_2005;
pragma Style_Checks (Off);

with Interfaces.C; use Interfaces.C;
with Interfaces.C.Strings;

package c_class_h is

   package Class_Tester is
  type Tester is tagged limited record
 null;
  end record;
  pragma Import (CPP, Tester);

  function New_Tester
(a_num : unsigned := 5;
 a_className : Interfaces.C.Strings.chars_ptr
 := Interfaces.C.Strings.Null_Ptr)
 return Tester;
  pragma CPP_Constructor (New_Tester, "_ZN6TesterC1EjPc");

  function dummy (this : access Tester) return int;
  pragma Import (CPP, dummy, "_ZN6Tester5dummyEv");
   end;
   use Class_Tester;
end c_class_h;


--  main.adb
with c_class_h; use c_class_h;
procedure Main is
   use Class_Tester;

   Obj : Tester; --  Test
   pragma Unreferenced (Obj);
begin
   null;
end main;


project Ada2Cppc is
   for Languages use ("Ada", "C++");
   for Main use ("main.adb");

   package Naming is
     for Implementation_Suffix ("C++") use ".cc";
   end Naming;

   for Source_Dirs use (".");
   for Object_Dir use "obj";

   package Compiler is
      for Default_Switches ("ada") use ("-gnat05");
   end Compiler;

   package Builder is
      for Default_Switches ("ada") use ("-g");
   end Builder;

   package Ide is
      for Compiler_Command ("ada") use "gnatmake";
      for Compiler_Command ("c") use "gcc";
   end Ide;

end Ada2Cppc;

Command:
  mkdir obj
  gprclean -q -P ada2cppc.gpr
  gprbuild -q -P ada2cppc.gpr
  obj/main

Output:
 ctor Tester called 5:null

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Javier Miranda  mira...@adacore.com

* exp_disp.adb (Set_CPP_Constructors): Handle constructor with default
parameters that covers the default constructor.

Index: exp_disp.adb
===
--- exp_disp.adb(revision 191972)
+++ exp_disp.adb(working copy)
@@ -8537,6 +8537,10 @@
   Body_Stmts: List_Id;
   Init_Tags_List: List_Id;
 
+  Covers_Default_Constructor : Entity_Id := Empty;
+
+   --  Start of processing for Set_CPP_Constructor
+
begin
   pragma Assert (Is_CPP_Class (Typ));
 
@@ -8622,7 +8626,9 @@
   Defining_Identifier =
 Make_Defining_Identifier (Loc,
   Chars (Defining_Identifier (P))),
-  Parameter_Type = New_Copy_Tree (Parameter_Type (P;
+  Parameter_Type  =
+New_Copy_Tree (Parameter_Type (P)),
+  Expression  = New_Copy_Tree (Expression (P;
   Next (P);
end loop;
 end if;
@@ -8713,6 +8719,17 @@
 
 Discard_Node (Wrapper_Body_Node);
 Set_Init_Proc (Typ, Wrapper_Id);
+
+--  If this constructor has parameters and all its parameters
+--  have defaults then it covers the default constructor. The
+--  semantic analyzer ensures that only one constructor with
+--  defaults covers the default constructor.
+
+if Present (Parameter_Specifications (Parent (E)))
+  and then Needs_No_Actuals (E)
+then
+   Covers_Default_Constructor := Wrapper_Id;
+end if;
  end if;
 
  Next_Entity (E);
@@ -8725,6 +8742,46 @@
  Set_Is_Abstract_Type (Typ);
   end if;
 
+  --  Handle constructor that has all its parameters with defaults and
+  --  hence it covers the default constructor. We generate a wrapper IP
+  --  which calls the covering constructor.
+
+  if Present (Covers_Default_Constructor) then
+ Loc := Sloc (Covers_Default_Constructor);
+
+ Body_Stmts := New_List (
+   Make_Procedure_Call_Statement (Loc,
+ Name   =
+   New_Reference_To (Covers_Default_Constructor, Loc),
+ Parameter_Associations = New_List (
+   Make_Identifier (Loc, Name_uInit;
+
+ Wrapper_Id :=
+   Make_Defining_Identifier (Loc, Make_Init_Proc_Name (Typ));
+
+ Wrapper_Body_Node :=
+   Make_Subprogram_Body (Loc,
+ Specification  

Re: vec_cond_expr adjustments

2012-10-02 Thread Richard Guenther
On Mon, Oct 1, 2012 at 5:57 PM, Marc Glisse marc.gli...@inria.fr wrote:
 [merging both threads, thanks for the answers]


 On Mon, 1 Oct 2012, Richard Guenther wrote:

 optabs should be fixed instead, an is_gimple_val condition is implicitly
 val != 0.


 For vectors, I think it should be val < 0 (with an appropriate cast of val
 to a signed integer vector type if necessary). Or (val & highbit) != 0, but
 that's longer.


 I don't think so.  Throughout the compiler we generally assume false == 0
 and anything else is true.  (yes, for FP there is STORE_FLAG_VALUE, but
 its scope is quite limited - if we want sth similar for vectors we'd have
 to invent it).


 See below.


 If we for example have

 predicate = a < b;
 x = predicate ? d : e;
 y = predicate ? f : g;

 we ideally want to re-use the predicate computation on targets where
 that would be optimal (and combine should be able to recover the
 case where it is not).


 That I don't understand. The vcond instruction implemented by targets takes
 as arguments d, e, cmp, a, b and emits the comparison itself. I don't see
 how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b).
 They will notice eventually that a<b is computed twice and remove one of the
 two, but I don't see how to do that in optabs.c. Or I can compute x = a < b,
 use x < 0 as the comparison passed to the targets, and expect targets (those
 for which it is true) to recognize that < 0 is useless in a vector condition
 (PR54700), or is useless on a comparison result.


 But that's a limitation of how vcond works.  ISTR there is/was a vselect
 instruction as well, taking a mask and two vectors to select from.  At least
 that's how vcond works internally for some sub-targets.


 vselect seems to only appear in config/. Would it be defined as:
 vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a
 pattern in .md files and let combine handle it, although it might be one
 instruction too long for that (and if m is x<y, ~m might look like x>=y).
 Or would it match the OpenCL select: "For each component of a vector type,
 result[i] = if MSB of c[i] is set ? b[i] : a[i]."? Or the pattern with &
 and | but with a precondition that the value of each element of the mask
 must be 0 or ±1?
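
To make the two candidate definitions concrete, here is a scalar-loop sketch of each. The function names and the fixed element count are assumptions for illustration; neither is an existing GCC pattern or optab.

```c
#include <stdint.h>

#define NELTS 4

/* Bitwise definition: r = (a & m) | (b & ~m).  Selects whole elements
   only if every mask element is 0 or -1 (all bits set).  */
void
vselect_bitwise (const int32_t *m, const int32_t *a, const int32_t *b,
		 int32_t *r)
{
  for (int i = 0; i < NELTS; i++)
    r[i] = (a[i] & m[i]) | (b[i] & ~m[i]);
}

/* OpenCL-style definition: only the most significant bit of each
   element of C matters; b is taken when it is set, a otherwise.  */
void
vselect_msb (const int32_t *c, const int32_t *a, const int32_t *b,
	     int32_t *r)
{
  for (int i = 0; i < NELTS; i++)
    r[i] = c[i] < 0 ? b[i] : a[i];
}
```

With a mask element of 1 instead of -1 the bitwise form mixes bits of both inputs rather than picking one of them, which is why the encoding of "true" matters for this definition.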

 I don't find vcond that bad, as long as targets check for trivial
 comparisons in the expansion (what trivial means may depend on the
 platform). It is quite flexible for targets.

Well, ok.


 On Mon, 1 Oct 2012, Richard Guenther wrote:

 tmp = fold_build2_loc (gimple_location (def_stmt),
code,
 -  boolean_type_node,
 +  TREE_TYPE (cond),


 That's obvious.


 Ok, I'll test and commit that line separately.

 +  if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
 +{
 +  int count = VECTOR_CST_NELTS (op0);
 +  tree *elts =  XALLOCAVEC (tree, count);
 +  gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
 +
 +  for (int i = 0; i < count; i++)
 +   {
 + tree elem_type = TREE_TYPE (type);
 + tree elem0 = VECTOR_CST_ELT (op0, i);
 + tree elem1 = VECTOR_CST_ELT (op1, i);
 +
 + elts[i] = fold_relational_const (code, elem_type,
 +  elem0, elem1);
 +
 + if(elts[i] == NULL_TREE)
 +   return NULL_TREE;
 +
 + elts[i] = fold_negate_const (elts[i], elem_type);


 I think you need to invent something new similar to STORE_FLAG_VALUE
 or use STORE_FLAG_VALUE here.  With the above you try to map
 {0, 1} to {0, -1} which is only true if the operation on the element types
 returns {0, 1} (thus, STORE_FLAG_VALUE is 1).
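
What the quoted hunk does can be sketched on plain arrays: fold the scalar comparison per element, then negate. This is an illustration in ordinary C, not the tree-level code; it assumes the scalar comparison yields 0 or 1, as it does in C, and the function name is hypothetical.

```c
#include <stdint.h>

/* Fold a < b element-wise into the {0, -1} encoding.  */
void
fold_vec_lt (const int32_t *a, const int32_t *b, int32_t *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      int32_t scalar = a[i] < b[i];	/* 0 or 1 */
      out[i] = -scalar;			/* 0 stays 0, 1 becomes -1 */
    }
}
```

If the scalar fold returned some other "true" value, the negation would no longer produce an all-ones element, which is the dependency being pointed out.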


 Er, seems to me that constant folding of a scalar comparison in the
 front/middle-end only returns {0, 1}.

The point is we need to define some semantics for vector comparison
results.  One variant is to make it target independent which in turn
would inhibit (or make it more difficult) to exploit some target features.
You for example use {0, -1} for truth values - probably to exploit target
features - even though the most natural middle-end way would be to
use {0, 1} as for everything else (caveat: there may be both signed
and unsigned bools, we don't allow vector components with non-mode precision,
thus you could argue that a signed bool : 1 is just sign-extended
for your solution).  A different variant is to make it target dependent
to leverage optimization opportunities - that's why STORE_FLAG_VALUE
exists.  For example with vector comparisons a < v result, when
performing bitwise operations on it, you either have to make the target
expand code to produce {0, -1} even if the natural compare instruction
would, say, produce {0, 0x8} - or not constrain the possible values
of its result (like forwprop would do with your patch).  In general we
want constant folding to yield the same results as if the HW carried
out the operation to make -O0 code not diverge from 

Re: [PATCH RFA] Implement register pressure directed hoist pass

2012-10-02 Thread Jeff Law

On 09/29/2012 12:37 AM, Bin Cheng wrote:

Hi Steven,

This is the updated patch according to your comments. Please review.
I also re-collected code size data and found it is improved by about 0.24%
for mips, which is better than previous data. I believe this should be
caused by recent changes in trunk, rather than by using DF caches to
calculate register pressure.

Thanks.

2012-09-29  Bin Cheng  bin.ch...@arm.com

* common.opt (flag_ira_hoist_pressure): New.
* doc/invoke.texi (-fira-hoist-pressure): Describe.
* ira-costs.c (ira_set_pseudo_classes): New parameter.
* ira.h (ira_set_pseudo_classes): Update prototype.
* haifa-sched.c (sched_init): Update call.
* ira.c (ira): Update call.
* regmove.c (regmove_optimize): Update call.
* loop-invariant.c (move_loop_invariants): Update call.
* gcse.c (struct bb_data): New structure.
(BB_DATA): New macro.
(curr_bb, curr_regs_live, curr_reg_pressure, regs_set)
(n_regs_set): New static variables.
(hoist_expr_reaches_here_p): Use reg pressure to determine the
distance expr can be hoisted.
(hoist_code): Use reg pressure to direct the hoist process.
(get_regno_pressure_class, get_pressure_class_and_nregs)
(change_pressure, mark_regno_live, mark_regno_death)
(mark_reg_death, mark_reg_store, calculate_bb_reg_pressure): New.
(one_code_hoisting_pass): Calculate register pressure. Free data.
* config/arm/arm.c (arm_option_override): Set
flag_ira_hoist_pressure
on Thumb1 when optimizing for size.


hoist-reg-pressure-20120929.txt
+@item -fira-hoist-pressure
+@opindex fira-hoist-pressure
+Use IRA to evaluate register pressure in hoist pass for decisions to hoist
+expressions.  This option usually results in generation of smaller code on
+RISC machines, but it can slow the compiler down.
I wouldn't use CISC/RISC here; I'd just say it usually results in 
smaller code.


You need to update the copyright year in gcse.c, ira.h, regmove.c, and 
loop-invariant.c.



+  /* Only decrease distance if bb has high register pressure or EXPR
+is const expr, otherwise EXPR can be hoisted through bb without
+cost.  */
?!?  This comment makes no sense to me.  To accurately know how hoisting 
an expression affects pressure you have to look at the inputs and output 
and see how their lifetime has changed.


In general:

For inputs, hoisting *may* reduce pressure.  You really have to look at 
how the life of the input changes based on the new location of the insn. 
 For example, if the input's lifetime is unchanged (say perhaps because 
it was live after the insn we want to hoist), then hoisting will have no 
impact. Otherwise the input's life is shortened, but to know by how much 
you have to determine where the new death of the input occurs (it may 
still die in the hoisted insn or it may die elsewhere).


For an output, hoisting will (effectively) always extend the lifetime.
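
The input/output cases above can be written down as a small model. It assumes straight-line code with insns numbered by position, an insn hoisted from position FROM up to an earlier position TO, and an input that is already live at TO; all names are hypothetical and this is not code from gcse.c.

```c
static int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* Change in the length of an input's live range.  OTHER_LAST_USE is
   the position of the last use of the input by any other insn, or -1
   if the hoisted insn is its only user.  The result is never positive.  */
int
input_life_delta (int from, int to, int other_last_use)
{
  int death_before = max_int (from, other_last_use);
  int death_after = max_int (to, other_last_use);
  return death_after - death_before;
}

/* Change in the length of the output's live range: it is now born at
   TO instead of FROM, so it always grows.  */
int
output_life_delta (int from, int to)
{
  return from - to;
}
```

Hoisting from position 10 to position 2 lengthens the output's life by 8; the input's life is unchanged if it is still used at 15, shrinks by 8 if the hoisted insn was its only user, and shrinks by 4 if its last other use is at 6.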

I've speculated that the right way to deal with register pressure in 
code motion is to actually build the dependency graph and use that to 
guide the code motions.  I've never cobbled together any real code to do 
this though.


Can we find a better name for hoist_expr_reaches_here_p since it's no 
longer just dealing with reachability -- it has heuristics now for 
profitability as well?



@@ -2863,7 +2909,8 @@ static int
if (visited == NULL)
  {
visited_allocated_locally = 1;
-  visited = XCNEWVEC (char, last_basic_block);
+  visited = sbitmap_alloc (last_basic_block);
+  sbitmap_zero (visited);
  }
What's the purpose behind changing visited from a simple array to an 
sbitmap?  I'm not objecting, but would like to hear the rationale behind 
that change.  I'll also note it wasn't mentioned in the ChangeLog.
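
For readers unfamiliar with the trade-off being asked about, here is a minimal standalone sketch of a bit-per-entry visited set.  The thread does not state the patch author's rationale; footprint is only one plausible motivation.  The bitset type and helpers below are hypothetical stand-ins and are not GCC's sbitmap API.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define WORD_BITS (sizeof (unsigned long) * CHAR_BIT)

/* Hypothetical minimal bit set: one bit per basic block index.  */
typedef struct
{
  size_t n_bits;
  unsigned long *words;
} bitset;

/* Bytes needed to hold N_BITS flags (at least one word).  */
static size_t
bitset_bytes (size_t n_bits)
{
  size_t n_words = (n_bits + WORD_BITS - 1) / WORD_BITS;
  return (n_words ? n_words : 1) * sizeof (unsigned long);
}

/* Allocate a set with every bit clear.  */
static bitset
bitset_alloc_zeroed (size_t n_bits)
{
  bitset s;
  s.n_bits = n_bits;
  s.words = calloc (1, bitset_bytes (n_bits));
  assert (s.words != NULL);
  return s;
}

/* Mark entry I as visited.  */
static void
bitset_set (bitset *s, size_t i)
{
  assert (i < s->n_bits);
  s->words[i / WORD_BITS] |= 1UL << (i % WORD_BITS);
}

/* Return nonzero if entry I has been visited.  */
static int
bitset_test (const bitset *s, size_t i)
{
  assert (i < s->n_bits);
  return (int) ((s->words[i / WORD_BITS] >> (i % WORD_BITS)) & 1UL);
}
```

A char array of the same length spends one byte per block, while this layout spends one bit, at the cost of a shift and mask on each access.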


Similarly what's the rationale behind passing the expression itself 
rather than just its index?  I don't see where we need to use anything 
other than the index in this code.  And again, this change isn't 
mentioned in the ChangeLog.



+  /* Considered invariant insns have only one set.  */
+  gcc_assert (set != NULL_RTX);
+  reg = SET_DEST (set);
+  if (GET_CODE (reg) == SUBREG)
+reg = SUBREG_REG (reg);
+  if (MEM_P (reg))
+{
+  *nregs = 0;
+  pressure_class = NO_REGS;
+}

Don't you need to look at the addresses within the MEM?




Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 191816)
+++ gcc/config/arm/arm.c(working copy)
@@ -2021,6 +2021,11 @@ arm_option_override (void)
       && current_tune->num_prefetch_slots > 0)
     flag_prefetch_loop_arrays = 1;
 
+  /* Enable register pressure hoist when optimizing for size on Thumb1 set.  */
+  if (TARGET_THUMB1 && optimize_function_for_size_p (cfun)
+      && flag_ira_hoist_pressure == -1)
+    flag_ira_hoist_pressure = 1;
I'd rather see 

[Ada] Small fixes to Eliminated overflow mode

2012-10-02 Thread Arnaud Charlet
This patch cleans up some documentation issues for eliminated mode, and
fixes some errors for marginal cases. Not worth trying to concoct tests
for these cases, which were found by code review, not from any reported
bugs. Also forbid use of Eliminated mode if Long_Long_Integer'Size is
not 64. Also no tests for that, since on pretty much all targets (maybe
all) this condition is met. Also add more extensive doc on this feature.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

* s-bignum.adb (Big_Exp): 0**0 should be 1, not 0.
(Big_Exp): Fix possible error for (-1)**0.
(Big_Exp): Fix error in computing 2**K for small K.
(Big_Mod): Fix wrong sign for negative operands.
(Div_Rem): Fix bad results for operands close to 2**63.
* s-bignum.ads: Add documentation and an assertion to require
LLI size to be 64 bits.
* sem_prag.adb (Analyze_Pragma, case Overflow_Checks): Do not
allow ELIMINATED if LLI'Size is other than 64 bits.
* switch-c.adb (Scan_Switches): Do not allow -gnato3 if LLI'Size
is not 64 bits.
* switch.ads (Bad_Switch): Add missing pragma No_Return.
* gnat_ugn.texi: Added appendix on Overflow Check Handling in GNAT.

Index: switch-c.adb
===
--- switch-c.adb(revision 191972)
+++ switch-c.adb(working copy)
@@ -33,6 +33,7 @@
 with Opt;  use Opt;
 with Validsw;  use Validsw;
 with Stylesw;  use Stylesw;
+with Ttypes;   use Ttypes;
 with Warnsw;   use Warnsw;
 
 with Ada.Unchecked_Deallocation;
@@ -50,6 +51,10 @@
   new Ada.Unchecked_Deallocation (String_List, String_List_Access);
--  Avoid using System.Strings.Free, which also frees the designated strings
 
+   function Get_Overflow_Mode (C : Character) return Overflow_Check_Type;
+   --  Given a digit in the range 0 .. 3, returns the corresponding value of
+   --  Overflow_Check_Type. Raises program error if C is outside this range.
+
function Switch_Subsequently_Cancelled
  (C: String;
   Args : String_List;
@@ -72,7 +77,6 @@
  declare
 New_Symbol_Definitions : constant String_List_Access :=
   new String_List (1 .. 2 * Preprocessing_Symbol_Last);
-
  begin
 New_Symbol_Definitions (Preprocessing_Symbol_Defs'Range) :=
   Preprocessing_Symbol_Defs.all;
@@ -86,6 +90,37 @@
 new String'(Def);
end Add_Symbol_Definition;
 
+   -----------------------
+   -- Get_Overflow_Mode --
+   -----------------------
+
+   function Get_Overflow_Mode (C : Character) return Overflow_Check_Type is
+   begin
+      case C is
+         when '0' =>
+            return Suppressed;
+
+         when '1' =>
+            return Checked;
+
+         when '2' =>
+            return Minimized;
+
+         --  Eliminated allowed only if Long_Long_Integer is 64 bits (since
+         --  the current implementation of System.Bignums assumes this).
+
+         when '3' =>
+            if Standard_Long_Long_Integer_Size /= 64 then
+               Bad_Switch ("-gnato3 not implemented for this configuration");
+            else
+               return Eliminated;
+            end if;
+
+         when others =>
+            raise Program_Error;
+      end case;
+   end Get_Overflow_Mode;
+
    -----------------------------
    -- Scan_Front_End_Switches --
    -----------------------------
@@ -778,27 +813,8 @@
else
   --  Handle first digit after -gnato
 
-  case Switch_Chars (Ptr) is
- when '0' =>
-Suppress_Options.Overflow_Checks_General :=
-  Suppressed;
-
- when '1' =>
-Suppress_Options.Overflow_Checks_General :=
-  Checked;
-
- when '2' =>
-Suppress_Options.Overflow_Checks_General :=
-  Minimized;
-
- when '3' =>
-Suppress_Options.Overflow_Checks_General :=
-  Eliminated;
-
- when others =>
-raise Program_Error;
-  end case;
-
+  Suppress_Options.Overflow_Checks_General :=
+Get_Overflow_Mode (Switch_Chars (Ptr));
   Ptr := Ptr + 1;
 
   --  Only one digit after -gnato, set assertions mode to
@@ -813,27 +829,8 @@
   --  Process second digit after -gnato
 
   else
- case Switch_Chars (Ptr) is
-when '0' =>
-   Suppress_Options.Overflow_Checks_Assertions :=
- Suppressed;
-
-when '1' =>
-   Suppress_Options.Overflow_Checks_Assertions :=
-

[PATCH] Vector CONSTRUCTOR verifier

2012-10-02 Thread Jakub Jelinek
Hi!

As discussed in the PR and on IRC, this patch verifies that vector
CONSTRUCTOR in GIMPLE is either empty CONSTRUCTOR, or contains scalar
elements of type compatible with vector element type (then the verification
is less strict, allows less than TYPE_VECTOR_SUBPARTS elements and allows
non-NULL indexes if they are consecutive (no holes); this is because
from FEs often CONSTRUCTORs with those properties leak into the IL, and
a change in the gimplifier to canonicalize them wasn't enough, they keep
leaking even from non-gimplified DECL_INITIAL values etc.), or
contains vector elements (element types must be compatible, the vector
elements must be of the same type and their number must fill the whole
wider vector - these are created/used by tree-vect-generic lowering if
HW supports only shorter vectors than what is requested in source).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2012-10-02  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/54713
* expr.c (categorize_ctor_elements_1): Don't assume purpose is
non-NULL.
* tree-cfg.c (verify_gimple_assign_single): Add verification of
vector CONSTRUCTORs.
* tree-ssa-sccvn.c (vn_reference_lookup_3): For VECTOR_TYPE
CONSTRUCTORs, don't do anything if element type is VECTOR_TYPE,
and don't check index.
* tree-vect-slp.c (vect_get_constant_vectors): VIEW_CONVERT_EXPR
ctor elements first if their type isn't compatible with vector
element type.

--- gcc/expr.c.jj   2012-09-27 12:45:53.0 +0200
+++ gcc/expr.c  2012-10-01 18:21:40.885122833 +0200
@@ -5491,7 +5491,7 @@ categorize_ctor_elements_1 (const_tree c
 {
   HOST_WIDE_INT mult = 1;
 
-  if (TREE_CODE (purpose) == RANGE_EXPR)
+  if (purpose && TREE_CODE (purpose) == RANGE_EXPR)
{
  tree lo_index = TREE_OPERAND (purpose, 0);
  tree hi_index = TREE_OPERAND (purpose, 1);
--- gcc/tree-cfg.c.jj   2012-10-01 17:28:17.469921927 +0200
+++ gcc/tree-cfg.c  2012-10-02 11:24:11.686155889 +0200
@@ -4000,6 +4000,80 @@ verify_gimple_assign_single (gimple stmt
   return res;
 
 case CONSTRUCTOR:
+  if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
+   {
+ unsigned int i;
+ tree elt_i, elt_v, elt_t = NULL_TREE;
+
+ if (CONSTRUCTOR_NELTS (rhs1) == 0)
+   return res;
+ /* For vector CONSTRUCTORs we require that either it is empty
+CONSTRUCTOR, or it is a CONSTRUCTOR of smaller vector elements
+(then the element count must be correct to cover the whole
+outer vector and index must be NULL on all elements, or it is
+a CONSTRUCTOR of scalar elements, where we as an exception allow
+smaller number of elements (assuming zero filling) and
+consecutive indexes as compared to NULL indexes (such
+CONSTRUCTORs can appear in the IL from FEs).  */
+ FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (rhs1), i, elt_i, elt_v)
+   {
+ if (elt_t == NULL_TREE)
+   {
+ elt_t = TREE_TYPE (elt_v);
+ if (TREE_CODE (elt_t) == VECTOR_TYPE)
+   {
+ tree elt_t = TREE_TYPE (elt_v);
+ if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
+ TREE_TYPE (elt_t)))
+   {
+ error ("incorrect type of vector CONSTRUCTOR"
+ " elements");
+ debug_generic_stmt (rhs1);
+ return true;
+   }
+ else if (CONSTRUCTOR_NELTS (rhs1)
+  * TYPE_VECTOR_SUBPARTS (elt_t)
+  != TYPE_VECTOR_SUBPARTS (rhs1_type))
+   {
+ error ("incorrect number of vector CONSTRUCTOR"
+ " elements");
+ debug_generic_stmt (rhs1);
+ return true;
+   }
+   }
+ else if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
+  elt_t))
+   {
+ error ("incorrect type of vector CONSTRUCTOR elements");
+ debug_generic_stmt (rhs1);
+ return true;
+   }
+ else if (CONSTRUCTOR_NELTS (rhs1)
+   > TYPE_VECTOR_SUBPARTS (rhs1_type))
+   {
+ error ("incorrect number of vector CONSTRUCTOR elements");
+ debug_generic_stmt (rhs1);
+ return true;
+   }
+   }
+ else if (!useless_type_conversion_p (elt_t, TREE_TYPE (elt_v)))
+   {
+ error (incorrect type of vector 

Re: abs(long long)

2012-10-02 Thread Marc Glisse

On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


I understand that it is originally a library issue, but I don't think
it makes sense to resolve it in isolation of that core issue.


They seem mostly orthogonal to me, since the library only uses an informal 
language describing the desired outcome and not the actual overloads 
necessary to achieve it, whereas the core issue is about determining 
priorities for a non-ambiguous overload resolution (if we are talking 
about the same, where Jens Maurer has a proposal).



The library installed by the system was compiled with g++, and is then used
with clang++. If we can avoid installing 2 config.h files to make that
work...


Two things:
 1. that is clearly a clang problem.  I don't think it is libstdc++'s job
 to try to solve clang's misguided configuration and installation.


Translated: libstdc++ should only ever be used with the very version of 
g++ that was used to compile it. clang++, icpc, sunCC, etc should never 
try to use a libstdc++ compiled with another compiler.


I am not saying libstdc++ should go to great lengths to support other 
compilers, but when it is actually easier to support them than not to...

(testing a macro is easier than a configure test)


 2. I am not sure you understand what I wrote: you can leave the
 use of the current macro the way it is if you appropriately
 define it in terms of what you want to change it to.


I was complaining about the configure-time nature of the macro. If it is 
defined at each compiler run based on __SIZEOF_INT128__, I am happy.
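
To make the distinction concrete, here is a minimal sketch of a feature macro that is derived on every compile from the compiler's predefined __SIZEOF_INT128__, rather than being frozen into a config header at configure time.  HAVE_WIDE_INT128, widest_int and widest_abs are hypothetical names chosen for illustration; they are not the libstdc++ macro or any libstdc++ interface.

```c
#include <assert.h>   /* for the checks mentioned in the note below */

/* Decided by whichever compiler is reading this header right now, so a
   library built with one compiler and used with another still gets the
   right answer.  */
#if defined (__SIZEOF_INT128__)
# define HAVE_WIDE_INT128 1
typedef __int128 widest_int;
#else
# define HAVE_WIDE_INT128 0
typedef long long widest_int;
#endif

/* Absolute value for the widest integer type selected above.  */
static widest_int
widest_abs (widest_int x)
{
  return x < 0 ? -x : x;
}
```

With either definition, widest_abs (-7) compares equal to 7, so a check such as assert (widest_abs (-7) == 7) holds whether or not the compiler offers a 128-bit type.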



More precisely, does that mean you want __builtin_llabs instead of ::llabs?
I thought the compiler knew they were the same.


Yes. Another reason is that it simplifies the implementation AND if
people want to do something with the intrinsics' fallback
libstdc++ will gracefully deliver that.


I don't see how that simplifies the implementation: it is several 
characters longer than ::llabs, and we still need to handle llabs. Or do 
you mean: always call __builtin_llabs (whether we have an llabs or not), 
and let the compiler replace it with either (x<0)?-x:x or a library call 
(I assume it never does that unless it has seen a corresponding 
declaration)?
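
The two alternatives under discussion can be written side by side as a small sketch.  Both helper names are hypothetical.  __builtin_llabs is a GCC builtin (Clang accepts it too); other compilers may not provide it, and whether the compiler expands it inline or emits a library call is up to the compiler.

```c
#include <assert.h>   /* for the checks mentioned in the note below */

/* Always go through the builtin and let the compiler choose between an
   inline expansion and a call to llabs.  */
static long long
abs_via_builtin (long long x)
{
  return __builtin_llabs (x);
}

/* The open-coded form that needs no library support at all.  */
static long long
abs_open_coded (long long x)
{
  return x < 0 ? -x : x;
}
```

Both forms agree on ordinary inputs such as -5, 0 and 9.  Neither is defined for the most negative long long value, whose magnitude is not representable.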


Note that I am happy to let you take over this PR if you like.

--
Marc Glisse


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Ian Lance Taylor
On Tue, Oct 2, 2012 at 2:01 AM, Uros Bizjak ubiz...@gmail.com wrote:

 2012-10-02  Uros Bizjak  ubiz...@gmail.com

 PR other/54761
 * configure.ac (EXTRA_FLAGS): New.
 * Makefile.am (AM_FLAGS): Add $(EXTRA_FLAGS).
 * configure, Makefile.in: Regenerate.

This is OK.

Thanks.

Ian


Re: [PATCH] Vector CONSTRUCTOR verifier

2012-10-02 Thread Richard Guenther
On Tue, Oct 2, 2012 at 3:01 PM, Jakub Jelinek ja...@redhat.com wrote:
 Hi!

 As discussed in the PR and on IRC, this patch verifies that vector
 CONSTRUCTOR in GIMPLE is either empty CONSTRUCTOR, or contains scalar
 elements of type compatible with vector element type (then the verification
 is less strict, allows less than TYPE_VECTOR_SUBPARTS elements and allows
 non-NULL indexes if they are consecutive (no holes); this is because
 from FEs often CONSTRUCTORs with those properties leak into the IL, and
 a change in the gimplifier to canonicalize them wasn't enough, they keep
 leaking even from non-gimplified DECL_INITIAL values etc.), or
 contains vector elements (element types must be compatible, the vector
 elements must be of the same type and their number must fill the whole
 wider vector - these are created/used by tree-vect-generic lowering if
 HW supports only shorter vectors than what is requested in source).

 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok with ...

 2012-10-02  Jakub Jelinek  ja...@redhat.com

 PR tree-optimization/54713
 * expr.c (categorize_ctor_elements_1): Don't assume purpose is
 non-NULL.
 * tree-cfg.c (verify_gimple_assign_single): Add verification of
 vector CONSTRUCTORs.
 * tree-ssa-sccvn.c (vn_reference_lookup_3): For VECTOR_TYPE
 CONSTRUCTORs, don't do anything if element type is VECTOR_TYPE,
 and don't check index.
 * tree-vect-slp.c (vect_get_constant_vectors): VIEW_CONVERT_EXPR
 ctor elements first if their type isn't compatible with vector
 element type.

 --- gcc/expr.c.jj   2012-09-27 12:45:53.0 +0200
 +++ gcc/expr.c  2012-10-01 18:21:40.885122833 +0200
 @@ -5491,7 +5491,7 @@ categorize_ctor_elements_1 (const_tree c
  {
HOST_WIDE_INT mult = 1;

 -  if (TREE_CODE (purpose) == RANGE_EXPR)
 +  if (purpose && TREE_CODE (purpose) == RANGE_EXPR)
 {
   tree lo_index = TREE_OPERAND (purpose, 0);
   tree hi_index = TREE_OPERAND (purpose, 1);
 --- gcc/tree-cfg.c.jj   2012-10-01 17:28:17.469921927 +0200
 +++ gcc/tree-cfg.c  2012-10-02 11:24:11.686155889 +0200
 @@ -4000,6 +4000,80 @@ verify_gimple_assign_single (gimple stmt
return res;

  case CONSTRUCTOR:
 +  if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
 +   {
 + unsigned int i;
 + tree elt_i, elt_v, elt_t = NULL_TREE;
 +
 + if (CONSTRUCTOR_NELTS (rhs1) == 0)
 +   return res;
 + /* For vector CONSTRUCTORs we require that either it is empty
 +CONSTRUCTOR, or it is a CONSTRUCTOR of smaller vector elements
 +(then the element count must be correct to cover the whole
 +outer vector and index must be NULL on all elements, or it is
 +a CONSTRUCTOR of scalar elements, where we as an exception allow
 +smaller number of elements (assuming zero filling) and
 +consecutive indexes as compared to NULL indexes (such
 +CONSTRUCTORs can appear in the IL from FEs).  */
 + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (rhs1), i, elt_i, elt_v)
 +   {
 + if (elt_t == NULL_TREE)
 +   {
 + elt_t = TREE_TYPE (elt_v);
 + if (TREE_CODE (elt_t) == VECTOR_TYPE)
 +   {
 + tree elt_t = TREE_TYPE (elt_v);
 + if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
 + TREE_TYPE (elt_t)))
 +   {
 + error ("incorrect type of vector CONSTRUCTOR"
 + " elements");
 + debug_generic_stmt (rhs1);
 + return true;
 +   }
 + else if (CONSTRUCTOR_NELTS (rhs1)
 +  * TYPE_VECTOR_SUBPARTS (elt_t)
 +  != TYPE_VECTOR_SUBPARTS (rhs1_type))
 +   {
 + error ("incorrect number of vector CONSTRUCTOR"
 + " elements");
 + debug_generic_stmt (rhs1);
 + return true;
 +   }
 +   }
 + else if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
 +  elt_t))
 +   {
 + error ("incorrect type of vector CONSTRUCTOR elements");
 + debug_generic_stmt (rhs1);
 + return true;
 +   }
 + else if (CONSTRUCTOR_NELTS (rhs1)
 +   > TYPE_VECTOR_SUBPARTS (rhs1_type))
 +   {
 + error ("incorrect number of vector CONSTRUCTOR elements");
 + debug_generic_stmt (rhs1);
 + return true;
 +   

[PATCH] Fix PR54735

2012-10-02 Thread Richard Guenther

This fixes PR54735 - a bad interaction of non-up-to-date virtual
SSA form, update-SSA and cfg cleanup.  Moral of the story:
cfg cleanup can remove blocks and thus release SSA names - SSA
update is rightfully confused when such a released SSA name is
still used at update time.

The following patch fixes the case in question simply by making
sure to run update-SSA before cfg cleanup.

[eventually release_ssa_name could treat virtual operands the
same as regular SSA names when they are scheduled for a re-write,
that would make this whole mess more robust - I am thinking of this]

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2012-10-02  Richard Guenther  rguent...@suse.de

PR middle-end/54735
* tree-ssa-pre.c (do_pre): Make sure to update virtual SSA form before
cleaning up the CFG.

* g++.dg/torture/pr54735.C: New testcase.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 191969)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4820,6 +4820,13 @@ do_pre (void)
 
   free_scc_vn ();
 
+  /* Tail merging invalidates the virtual SSA web, together with
+ cfg-cleanup opportunities exposed by PRE this will wreck the
+ SSA updating machinery.  So make sure to run update-ssa
+ manually, before eventually scheduling cfg-cleanup as part of
+ the todo.  */
+  update_ssa (TODO_update_ssa_only_virtuals);
+
   return todo;
 }
 
@@ -4845,8 +4852,7 @@ struct gimple_opt_pass pass_pre =
   0,   /* properties_provided */
   0,   /* properties_destroyed */
   TODO_rebuild_alias,  /* todo_flags_start */
-  TODO_update_ssa_only_virtuals  | TODO_ggc_collect
-  | TODO_verify_ssa /* todo_flags_finish */
+  TODO_ggc_collect | TODO_verify_ssa   /* todo_flags_finish */
  }
 };
 
Index: gcc/testsuite/g++.dg/torture/pr54735.C
===
--- gcc/testsuite/g++.dg/torture/pr54735.C  (revision 0)
+++ gcc/testsuite/g++.dg/torture/pr54735.C  (working copy)
@@ -0,0 +1,179 @@
+// { dg-do compile }
+
+class Gmpfr
+{};
+class M : Gmpfr
+{
+public:
+  Gmpfr infconst;
+  M(int);
+};
+template<typename>struct A;
+template<typename, int, int, int = 0 ? : 0, int = 0, int = 0>class N;
+template<typename>class O;
+template<typename>struct B;
+struct C
+{
+  enum
+  { value };
+};
+class D
+{
+public:
+  enum
+  { ret };
+};
+struct F
+{
+  enum
+  { ret = 0 ? : 0 };
+};
+template<typename Derived>struct G
+{
+  typedef O<Derived>type;
+};
+struct H
+{
+  void operator * ();
+};
+struct I
+{
+  enum
+  { RequireInitialization = C::value ? : 0, ReadCost };
+};
+template<typename Derived>struct J
+{
+  enum
+  { ret = A<Derived>::InnerStrideAtCompileTime };
+};
+template<typename Derived>struct K
+{
+  enum
+  { ret = A<Derived>::OuterStrideAtCompileTime };
+};
+template<typename Derived>class P : H
+{
+public:
+  using H::operator *;
+  typedef typename A<Derived>::Scalar Scalar;
+  enum
+  { RowsAtCompileTime=
+  A<Derived>::RowsAtCompileTime, ColsAtCompileTime   =
+  A<Derived>::ColsAtCompileTime, SizeAtCompileTime   =
+  F::ret, MaxRowsAtCompileTime   =
+  A<Derived>::MaxRowsAtCompileTime, MaxColsAtCompileTime =
+  A<Derived>::MaxColsAtCompileTime, MaxSizeAtCompileTime =
+  F::ret, Flags  =
+  A<Derived>::Flags ? : 0 ? : 0, CoeffReadCost   =
+  A<Derived>::CoeffReadCost, InnerStrideAtCompileTime=
+  J<Derived>::ret, OuterStrideAtCompileTime  = K<Derived>::ret };
+  B<Derived> operator  (const Scalar);
+};
+
+template<typename Derived>class O : public P<Derived>
+{};
+
+template<int _Cols>class L
+{
+public:
+
+  int cols()
+  {
+return _Cols;
+  }
+};
+template<typename Derived>class Q : public G<Derived>::type
+{
+public:
+  typedef typename G<Derived>::type   Base;
+  typedef typename A<Derived>::Index  Index;
+  typedef typename A<Derived>::Scalar Scalar;
+  L<Base::ColsAtCompileTime> m_storage;
+  Index cols()
+  {
+return m_storage.cols();
+  }
+
+  Scalar coeffRef(Index,
+   Index);
+};
+
+template<typename _Scalar, int _Rows, int _Cols, int _Options, int _MaxRows,
+ int _MaxCols>struct A<N<_Scalar, _Rows, _Cols, _Options, _MaxRows,
+ _MaxCols> >
+{
+  typedef _Scalar Scalar;
+  typedef int Index;
+  enum
+  { RowsAtCompileTime, ColsAtCompileTime  =
+  _Cols, MaxRowsAtCompileTime, MaxColsAtCompileTime, Flags=
+  D::ret, CoeffReadCost   =
+  I::ReadCost, InnerStrideAtCompileTime, OuterStrideAtCompileTime =
+  0 ? : 0 };
+};
+template<typename _Scalar, int, int _Cols, int, int,
+ int>class N : public Q<N<_Scalar, 0, _Cols> >
+{
+public:
+  Q<N> Base;
+  templatetypename T0, typename T1N(const T0,
+ 

Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-02 Thread Richard Sandiford
Vladimir Makarov vmaka...@redhat.com writes:
 This is the major patch containing all new files.  The patch also adds 
 necessary calls to LRA from IRA.  As the patch is too big, it continues in 
 the next email.

 2012-09-27  Vladimir Makarov  vmaka...@redhat.com

  * Makefile.in (LRA_INT_H): New.
  (OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o,
  lra-constraints.o, lra-eliminations.o, lra-lives.o, and lra-spills.o.
  (ira.o): Add dependence on lra.h.
  (lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New entries.
  (lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto.
  * ira.c: Include lra.h.
  (ira_init_once, ira_init, ira_finish_once): Call lra_start_once,
  lra_init, lra_finish_once in anyway.
  (lra_in_progress): Remove.
  (do_reload): Call LRA.
  * lra.h: New.
  * lra-int.h: Ditto.
  * lra.c: Ditto.
  * lra-assigns.c: Ditto.
  * lra-constraints.c: Ditto.
  * lra-coalesce.c: Ditto.
  * lra-eliminations.c: Ditto.
  * lra-lives.c: Ditto.
  * lra-spills.c: Ditto.
  * doc/passes.texi: Describe LRA pass.

Comments on lra-lives.c.  (Sorry for the split, had more time to look
at this than expected)

 +/* Copy live range list given by its head R and return the result.  */
 +lra_live_range_t
 +lra_copy_live_range_list (lra_live_range_t r)
 +{
 +  lra_live_range_t p, first, last;
 +
 +  if (r == NULL)
 +    return NULL;
 +  for (first = last = NULL; r != NULL; r = r->next)
 +    {
 +      p = copy_live_range (r);
 +      if (first == NULL)
 +        first = p;
 +      else
 +        last->next = p;
 +      last = p;
 +    }
 +  return first;
 +}

Maybe simpler as:

  lra_live_range_t p, first, *chain;

  first = NULL;
  for (chain = &first; r != NULL; r = r->next)
    {
      p = copy_live_range (r);
      *chain = p;
      chain = &p->next;
    }
  return first;
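
As a standalone illustration of the chain-pointer idiom suggested above, here is a minimal sketch.  It is not LRA code: struct range and copy_range are hypothetical stand-ins for lra_live_range_t and copy_live_range, and malloc replaces the pool allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a live range node.  */
struct range
{
  int start, finish;
  struct range *next;
};

/* Duplicate one node and clear its link.  */
static struct range *
copy_range (const struct range *r)
{
  struct range *p = malloc (sizeof *p);
  assert (p != NULL);
  *p = *r;
  p->next = NULL;
  return p;
}

/* Copy the list headed by R.  CHAIN always points at the link that the
   next copied node must be stored into, so neither the empty list nor
   the first node needs a special case.  */
static struct range *
copy_range_list (const struct range *r)
{
  struct range *first = NULL;
  struct range **chain = &first;

  for (; r != NULL; r = r->next)
    {
      struct range *p = copy_range (r);
      *chain = p;
      chain = &p->next;
    }
  return first;
}
```

Because chain starts at &first, the first copied node lands in first and every later node lands in the previous node's next field, which is what removes the first == NULL test from the loop body.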

 +/* Merge ranges R1 and R2 and returns the result.  The function
 +   maintains the order of ranges and tries to minimize size of the
 +   result range list.  */
 +lra_live_range_t 
 +lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2)
 +{
 +  lra_live_range_t first, last, temp;
 +
 +  if (r1 == NULL)
 +    return r2;
 +  if (r2 == NULL)
 +    return r1;
 +  for (first = last = NULL; r1 != NULL && r2 != NULL;)
 +    {
 +      if (r1->start < r2->start)
 +        {
 +          temp = r1;
 +          r1 = r2;
 +          r2 = temp;
 +        }
 +      if (r1->start <= r2->finish + 1)
 +        {
 +          /* Intersected ranges: merge r1 and r2 into r1.  */
 +          r1->start = r2->start;
 +          if (r1->finish < r2->finish)
 +            r1->finish = r2->finish;
 +          temp = r2;
 +          r2 = r2->next;
 +          pool_free (live_range_pool, temp);
 +          if (r2 == NULL)
 +            {
 +              /* To try to merge with subsequent ranges in r1.  */
 +              r2 = r1->next;
 +              r1->next = NULL;
 +            }
 +        }
 +      else
 +        {
 +          /* Add r1 to the result.  */
 +          if (first == NULL)
 +            first = last = r1;
 +          else
 +            {
 +              last->next = r1;
 +              last = r1;
 +            }
 +          r1 = r1->next;
 +          if (r1 == NULL)
 +            {
 +              /* To try to merge with subsequent ranges in r2.  */
 +              r1 = r2->next;
 +              r2->next = NULL;
 +            }
 +        }

I might be misreading, but I'm not sure whether this handles merges like:

  r1 = [6,7], [3,4]
  r2 = [3,8], [0,1]

After the first iteration, it looks like we'll have:

  r1 = [3,8], [3,4]
  r2 = [0,1]

Then we'll add both [3,8] and [3,4] to the result.

Same chain pointer comment as for lra_merge_live_ranges.

 +/* Return TRUE if live range R1 is in R2.  */
 +bool
 +lra_live_range_in_p (lra_live_range_t r1, lra_live_range_t r2)
 +{
 +  /* Remember the live ranges are always kept ordered.   */
 +  while (r1 != NULL && r2 != NULL)
 +    {
 +      /* R1's element is in R2's element.  */
 +      if (r2->start <= r1->start && r1->finish <= r2->finish)
 +        r1 = r1->next;
 +      /* Intersection: R1's start is in R2.  */
 +      else if (r2->start <= r1->start && r1->start <= r2->finish)
 +        return false;
 +      /* Intersection: R1's finish is in R2.  */
 +      else if (r2->start <= r1->finish && r1->finish <= r2->finish)
 +        return false;
 +      else if (r1->start > r2->finish)
 +        return false; /* No covering R2's element for R1's one.  */
 +      else
 +        r2 = r2->next;
 +    }
 +  return r1 == NULL;

Does the inner bit reduce to:

  /* R1's element is in R2's element.  */
  if (r2->start <= r1->start && r1->finish <= r2->finish)
    r1 = r1->next;
  /* All of R2's element comes after R1's element.  */
  else if (r2->start > r1->finish)
    r2 = r2->next;
  else
    return false;

(Genuine question)
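
A self-contained sketch of the reduced three-way test, for experimenting with the question.  Assumptions: struct range is a hypothetical stand-in for the LRA node type, and each list is sorted by descending start with no overlap between elements of the same list.  The sketch implements only the reduced form; it does not by itself prove that the reduced form matches the original five-way test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a live range node.  */
struct range
{
  int start, finish;
  struct range *next;
};

/* Return true if every element of R1 lies inside some element of R2.  */
static bool
range_list_in_p (const struct range *r1, const struct range *r2)
{
  while (r1 != NULL && r2 != NULL)
    {
      /* R1's element is in R2's element.  */
      if (r2->start <= r1->start && r1->finish <= r2->finish)
        r1 = r1->next;
      /* All of R2's element comes after R1's element, so an earlier
         element of R2 may still cover it.  */
      else if (r2->start > r1->finish)
        r2 = r2->next;
      /* Partial overlap, or R1's element is later than anything left
         in R2: nothing can cover it.  */
      else
        return false;
    }
  return r1 == NULL;
}
```

With R2 = [5,9], [0,3], the list [6,7], [1,2] is reported as contained, while [4,6] (straddling an edge) and [11,12] (later than all of R2) are not.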

 +/* Process the death of hard register REGNO.  This updates
 +   hard_regs_live and START_DYING.  */
 +static void
 +make_hard_regno_dead (int regno)
 +{
 +  if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)
 +  || ! TEST_HARD_REG_BIT 

Re: [PATCH] limited C++ parsing support for gengtype

2012-10-02 Thread Diego Novillo
Aaron,

I'm currently fixing other issues with gengtype and I needed this
patch on top of them.  I will be rolling both patches into a single
one and commit them today/tomorrow.  If you were working on further
fixes to this, please give me a chance to commit this one first.


Thanks.  Diego.


Re: RFC: LRA for x86/x86-64 [7/9]

2012-10-02 Thread Richard Sandiford
Richard Sandiford rdsandif...@googlemail.com writes:
 +/* Merge ranges R1 and R2 and returns the result.  The function
 +   maintains the order of ranges and tries to minimize size of the
 +   result range list.  */
 +lra_live_range_t 
 +lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2)
 +{
 +  lra_live_range_t first, last, temp;
 +
 +  if (r1 == NULL)
 +    return r2;
 +  if (r2 == NULL)
 +    return r1;
 +  for (first = last = NULL; r1 != NULL && r2 != NULL;)
 +    {
 +      if (r1->start < r2->start)
 +        {
 +          temp = r1;
 +          r1 = r2;
 +          r2 = temp;
 +        }
 +      if (r1->start <= r2->finish + 1)
 +        {
 +          /* Intersected ranges: merge r1 and r2 into r1.  */
 +          r1->start = r2->start;
 +          if (r1->finish < r2->finish)
 +            r1->finish = r2->finish;
 +          temp = r2;
 +          r2 = r2->next;
 +          pool_free (live_range_pool, temp);
 +          if (r2 == NULL)
 +            {
 +              /* To try to merge with subsequent ranges in r1.  */
 +              r2 = r1->next;
 +              r1->next = NULL;
 +            }
 +        }
 +      else
 +        {
 +          /* Add r1 to the result.  */
 +          if (first == NULL)
 +            first = last = r1;
 +          else
 +            {
 +              last->next = r1;
 +              last = r1;
 +            }
 +          r1 = r1->next;
 +          if (r1 == NULL)
 +            {
 +              /* To try to merge with subsequent ranges in r2.  */
 +              r1 = r2->next;
 +              r2->next = NULL;
 +            }
 +        }

 I might be misreading, but I'm not sure whether this handles merges like:

   r1 = [6,7], [3,4]
   r2 = [3,8], [0,1]

 After the first iteration, it looks like we'll have:

   r1 = [3,8], [3,4]
   r2 = [0,1]

 Then we'll add both [3,8] and [3,4] to the result.

OK, so I start to read patch b and realise that this is only supposed to
handle non-overlapping live ranges.  It might be worth having a comment
and assert to that effect, for slow readers like me.

Although in that case the function feels a little more complicated than
it needs to be.  When we run out of R1 or R2, why not just use the other
one as the rest of the live range list?  Why is:

 +      if (r1 == NULL)
 +        {
 +          /* To try to merge with subsequent ranges in r2.  */
 +          r1 = r2->next;
 +          r2->next = NULL;
 +        }

needed?
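
To make the simpler shape concrete, here is a sketch of a merge that appends the leftover list as soon as one input runs out.  It is not the LRA implementation.  Assumptions: struct range is a hypothetical stand-in for the node type; each input is sorted by descending start; no element of one input overlaps an element of the other (asserted); and no input has two adjacent elements of its own.  A node that gets coalesced away is simply unlinked here, where real code would return it to its pool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a live range node.  */
struct range
{
  int start, finish;
  struct range *next;
};

/* Merge R1 and R2 into one descending list, coalescing ranges that are
   directly adjacent across the two inputs.  */
static struct range *
merge_ranges (struct range *r1, struct range *r2)
{
  struct range *first = NULL, *last = NULL, *pick, *rest;

  while (r1 != NULL && r2 != NULL)
    {
      /* Take whichever head starts later.  */
      if (r1->start > r2->start)
        {
          pick = r1;
          r1 = r1->next;
        }
      else
        {
          pick = r2;
          r2 = r2->next;
        }
      /* The inputs must not overlap each other.  */
      assert (last == NULL || pick->finish < last->start);
      if (last != NULL && pick->finish + 1 == last->start)
        last->start = pick->start;   /* Adjacent: grow LAST, drop PICK.  */
      else if (last == NULL)
        first = last = pick;
      else
        {
          last->next = pick;
          last = pick;
        }
    }

  /* One input is exhausted.  The other is already sorted, so it is
     appended as is after a single adjacency check at the seam.  */
  rest = r1 != NULL ? r1 : r2;
  if (last == NULL)
    return rest;
  assert (rest == NULL || rest->finish < last->start);
  if (rest != NULL && rest->finish + 1 == last->start)
    {
      last->start = rest->start;
      rest = rest->next;
    }
  last->next = rest;
  return first;
}
```

Merging [8,9], [3,4] with [5,6], [0,1] gives [8,9], [3,6], [0,1]: the ranges [5,6] and [3,4] are adjacent and collapse into one, and [0,1] is appended by the tail step without re-entering the loop.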

Richard


Re: [PATCH] Fix PR47799 - debug info for early-inlining with LTO

2012-10-02 Thread Jakub Jelinek
On Mon, Oct 01, 2012 at 02:05:50PM +0200, Richard Guenther wrote:
 2012-10-01  Richard Guenther  rguent...@suse.de
 
   PR lto/47788
   * tree-streamer-out.c (write_ts_block_tree_pointers): For
   inlined functions outer scopes write the ultimate origin
   as BLOCK_ABSTRACT_ORIGIN and BLOCK_SOURCE_LOCATION.
   Do not stream the fragment chains.
   (lto_input_ts_block_tree_pointers): Likewise.
   * dwarf2out.c (gen_subprogram_die): Handle NULL DECL_INITIAL.
   (dwarf2out_decl): Always output DECL_ABSTRACT function decls.

Ok.

Jakub


Re: abs(long long)

2012-10-02 Thread Daniel Krügler
2012/10/2 Marc Glisse marc.gli...@inria.fr:
 Here I am talking of a library issue: the wording that says that there are
 sufficient overloads such that integer types call the double version of math
 functions. It is fairly obvious that it doesn't apply to abs(long) for
 instance which has an explicit overload. For short or unsigned, I still read
 it as saying that it converts to double...

This really looks like a problem of the Standard Library specification
to me, and a corresponding issue should be submitted.  In fact the
wording can be interpreted such that mixing <cstdlib> with <cmath>
would imply two different versions of std::abs(int) because of
different required return types.  I will prepare a corresponding
submission to the LWG.

- Daniel


Re: RFC: LRA for x86/x86-64 [0/9]

2012-10-02 Thread Vladimir Makarov

On 10/02/2012 12:22 AM, Jeff Law wrote:

On 10/01/2012 07:14 PM, Vladimir Makarov wrote:


   Analogous live ranges are used in IRA as an intermediate step to build a
conflict graph.  Actually, the first approach was to use IRA code to
assign hard registers to pseudos (e.g.  Jeff Law tried this approach)
but it was rejected as requiring much more compilation time.  In some
way, one can look at the assignment in LRA as a compromise between
quality (which could be achieved by repeatedly building conflict graphs
and using graph coloring) and compilation speed.
Not only was it slow (iterating IRA), guaranteeing termination was a 
major problem.  There's some algorithmic games that have to be played 
(they're at least discussed in literature, but not under the heading 
of termination) and there's some issues specific to the IRA 
implementation which make ensuring termination difficult.


The Chaitin-Briggs literature does not discuss termination; it just says 
that shortening live ranges will result in hard regs being assigned to all 
necessary pseudos, which is not clearly guaranteed.  There is the same 
problem in LRA.  So LRA checks whether too many passes are done or too many 
reloads are made for one insn, and aborts if so.  Porting LRA is mostly 
fixing such aborts.


Another thing omitted by the literature is inheritance, which is very 
important for performance, although it could be considered a special 
case of live-range splitting.  There are also a lot of small important 
details (e.g. what to do in case of displacement constraints, or when 
non-load/store insns permit memory and registers, etc.) not discussed 
well or at all in the literature I read.
I got nearly as good of results by conservative updates of the 
conflicts after splitting ranges and (ab)using ira's reload hooks to 
give the new pseudos for the split range a chance to be allocated again.


The biggest problem with that approach was getting the costing right 
for the new pseudos.  That requires running a fair amount of IRA a 
second time.  I'd still like to return to some of the ideas from that 
work as I think some of the bits are still relevant in the IRA+LRA world.



   My experience shows that these lists are usually 1-2 elements.
That's been my experience as well.  The vast majority of the time the 
range lists are very small.






Re: RFC: LRA for x86/x86-64 [8/9]

2012-10-02 Thread Vladimir Makarov

On 10/02/2012 01:01 AM, Jeff Law wrote:

On 09/27/2012 04:59 PM, Vladimir Makarov wrote:

   The following patch adds code necessary for the correct operation of LRA
(function ira_setup_eliminable_regset) and of the compiler when LRA is
used (see file dwarf2out.c).

2012-09-27  Vladimir Makarov  vmaka...@redhat.com

 * loop-invariant.c (calculate_loop_reg_pressure): Pass new
 argument to ira_setup_eliminable_regset.
 * haifa-sched.c (sched_init): Pass new argument to
 ira_setup_eliminable_regset.
 * dwarf2out.c: Include lra.h.
 (based_loc_descr, compute_frame_pointer_to_fb_displacement): Use
 lra_eliminate_regs for LRA instead of eliminate_regs.
 * ira.c: (ira_setup_eliminable_regset): Add parameter. Remove
 need_fp.  Call lra_init_elemination and mark
 HARD_FRAME_POINTER_REGNUM as living forever if
 frame_pointer_needed.
 (ira): Call ira_setup_eliminable_regset with a new
 argument.
 * ira.h (ira_setup_eliminable_regset): Add an argument.
 * Makefile.in (dwarf2out.o): Add dependence on ira.h and lra.h.

This is OK.  Obviously it's useless without 7a/7b.  But I just wanted 
to go ahead and review the ancillary bits before going to the meat of 
the submission.


So it's just the 7a/7b patch that needs review, right?

Right.  I'll commit your and Richard Sandiford's proposals into the 
branch.  I'll commit some patches (which could be useful without LRA) 
into the trunk too.


Thanks for reviewing all of this.




Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 3:08 PM, Ian Lance Taylor i...@google.com wrote:
 2012-10-02  Uros Bizjak  ubiz...@gmail.com

 PR other/54761
 * configure.ac (EXTRA_FLAGS): New.
 * Makefile.am (AM_FLAGS): Add $(EXTRA_FLAGS).
 * configure, Makefile.in: Regenerate.

 This is OK.

Thanks, committed.

On a related issue, it looks to me that the compiler itself should be
compiled with -funwind-tables, otherwise there are no backtraces
generated, even if libbacktrace is linked in and operational. Again,
x86_64-linux-gnu host defaults to this flag, but other hosts are left
behind.

Uros.


RFA: Fix OP_INOUT handling of web.c:union_match_dups

2012-10-02 Thread Joern Rennecke

Similar to PR43742, the ARCompact port gets an ICE from the current mainline
version of web.c:union_match_dups for its zero overhead loop pattern.
My first attempt at rectifying this was equivalent in effect to the patch
from comment #1 from this patch; that seemed to work well enough.
Later I stumbled across PR43742, which made me take a second look at the
problem.

Unlike the SH, the ARCompact architecture has an actual zero overhead loop
mechanism, which uses a dedicated loop counter register.  Although it can
be used in most contexts that a general-purpose register can, this causes
a lot of pipeline stalls, so if we changed the match_dup into a matching
constraint, and reload inserted reg-reg copies to fix up matching constraints
for just a small fraction of the zero overhead loops, the performance penalty
of these stalls would wipe out any benefit gained from having any compiler
generated zero overhead loops.

Looking at md.texi, you could be excused for thinking that match_dups have to
follow the operand that they match only for define_expand.
However, when you try to scramble the order in a define_insn_and_split, you
get an error:

/home/amylaar/synopsys/arc_gnu_4.8/unisrc/gcc/config/arc/arc.md:5516:  
operand 0 duplicated before defined


which is emitted by validate_pattern in genrecog.c.  This code is from 2004,
so I'd say there is a good chance that there is more code that actually relies on this.

Some patterns might be made to conform both to the match_dup ordering
constraint and avoid the web.c SEGV by reordering the pattern, although
at times at a cost to other infrastructure, e.g. when every place that tries to
recognize zero overhead loop patterns has to be amended to look for multiple
forms.  But other patterns intrinsically need one of these strictures
removed.
Consider an instruction that atomically exchanges the contents of
two registers and/or memory locations:  The source of the first set must
match the destination of the second set.  So, if we have to make the first
occurrence a match_operand, we must put the + constraint on this input.

Therefore, web.c:union_match_dups should handle + constraints on inputs
tied with a match_dup to a later mentioned output.

The problem here is that the current version of this function only searches
for the match_dup location in the use_link array, but for an OP_INOUT operand,
the location will be in the def_link array.
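
The lookup order described above can be modelled in isolation.  The
following is only a sketch with made-up types: struct ref, find_ref and
find_dup are hypothetical stand-ins, not the df_ref machinery or the code in
web.c.  It shows a location being searched for among the uses first, and
among the defs only when the operand is in/out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a df_ref: it only records the location
   it refers to.  */
struct ref { void **loc; };

/* Return the index of the entry in the NULL-terminated array LINK whose
   location equals LOC, or -1 if there is none.  */
static int
find_ref (struct ref **link, void **loc)
{
  int i;

  for (i = 0; link[i] != NULL; i++)
    if (link[i]->loc == loc)
      return i;
  return -1;
}

enum found_in { FOUND_NOWHERE, FOUND_IN_USES, FOUND_IN_DEFS };

/* Search the uses first; fall back to the defs only for in/out operands.  */
static enum found_in
find_dup (struct ref **uses, struct ref **defs, void **loc, int is_inout)
{
  if (find_ref (uses, loc) >= 0)
    return FOUND_IN_USES;
  if (is_inout && find_ref (defs, loc) >= 0)
    return FOUND_IN_DEFS;
  return FOUND_NOWHERE;
}
```

A location that appears in neither array yields FOUND_NOWHERE, which
corresponds to the case where the loop simply continues.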

When I originally implemented this, I put some asserts there to make sure
we now handle all the *dupref == NULL cases; however, this led to ICEs for
i686-pc-linux-gnu.  As mentioned in the new comment, the
DF_REF_LOC (use_link[n]) points to the register part of a memory address,
whereas recog_data.dup_loc[m] points to an enclosing MEM.
This is really a separate problem, so I chose to leave the behaviour in this
case alone, i.e. just continue in the loop without creating the
def/use union, even though the comment at the top of the function says
that it should create that.

Bootstrapped and regtested on i686-pc-linux-gnu (baseline: revision 191817) .

2012-10-02  Joern Rennecke  joern.renne...@embecosm.com

* web.c (union_match_dups): Properly handle OP_INOUT match_dups.

Index: web.c
===
--- web.c   (revision 191817)
+++ web.c   (working copy)
@@ -96,6 +96,7 @@ union_match_dups (rtx insn, struct web_e
   struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
   df_ref *use_link = DF_INSN_INFO_USES (insn_info);
   df_ref *def_link = DF_INSN_INFO_DEFS (insn_info);
+  struct web_entry *dup_entry;
   int i;
 
   extract_insn (insn);
@@ -107,10 +108,24 @@ union_match_dups (rtx insn, struct web_e
   df_ref *ref, *dupref;
   struct web_entry *entry;
 
-  for (dupref = use_link; *dupref; dupref++)
+  for (dup_entry = use_entry, dupref = use_link; *dupref; dupref++)
if (DF_REF_LOC (*dupref) == recog_data.dup_loc[i])
  break;
 
+  if (*dupref == NULL && type == OP_INOUT)
+   {
+
+ for (dup_entry = def_entry, dupref = def_link; *dupref; dupref++)
+   if (DF_REF_LOC (*dupref) == recog_data.dup_loc[i])
+ break;
+   }
+  /* ??? *DUPREF can still be zero, because when an operand matches
+a memory, DF_REF_LOC (use_link[n]) points to the register part
+of the address, whereas recog_data.dup_loc[m] points to the
+entire memory ref, thus we fail to find the duplicate entry,
+ even though it is there.
+ Example: i686-pc-linux-gnu gcc.c-torture/compile/950607-1.c
+ -O3 -fomit-frame-pointer -funroll-loops  */
   if (*dupref == NULL
  || DF_REF_REGNO (*dupref) < FIRST_PSEUDO_REGISTER)
continue;
@@ -121,7 +136,15 @@ union_match_dups (rtx insn, struct web_e
if (DF_REF_LOC (*ref) == recog_data.operand_loc[op])
  break;
 
-  (*fun) (use_entry + DF_REF_ID (*dupref), entry + DF_REF_ID (*ref));
+  if (!*ref && type == OP_INOUT)

Re: RFC: LRA for x86/x86-64 [4/9]

2012-10-02 Thread Vladimir Makarov

On 10/01/2012 02:51 PM, Richard Sandiford wrote:

Vladimir Makarov vmaka...@redhat.com writes:

+/* Return register bank of given hard regno for the current target.  */
+DEFHOOK
+(register_bank,
+ A target hook which returns the register bank number to which the\
+  register @var{hard_regno} belongs to.  The smaller the number, the\
+  more preferable the hard register usage (when all other conditions are\
+  the same).  This hook can be used to prefer some hard register over\
+  others in LRA.  For example, some x86-64 register usage needs\
+  additional prefix which makes instructions longer.  The hook can\
+  return bigger bank number for such registers make them less favorable\
+  and as result making the generated code smaller.\
+  \
+  The default version of this target hook returns always zero.,
+ int, (int),
+ default_register_bank)

This is a horribly bikeshed-level comment, sorry, but I wonder if
something like register_priority would be better.  Register classes
are in some ways an extension of register banks, so it wasn't obvious
from the name why we needed both.
Ok.  I agree that it is not a good term.  A register bank in hardware 
(especially in DSPs) means something a bit different.


Actually, at the Cauldron Ian asked me why it is different from register 
allocation order.  I should say that the order usually takes 
caller-saves info into account.  In x86-64, a register that needs a REX 
prefix can be caller-saved or not.
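
As an illustration of the hook's intent only, here is a sketch in which
registers with a larger number are less preferred when everything else is
equal.  The names example_register_priority and pick_register are
hypothetical, and the register numbers are the hardware encodings 0-15, not
GCC's internal hard register numbers; this is not the x86 port's
implementation.

```c
#include <limits.h>

/* Hypothetical priority function: on x86-64, the registers encoded as
   8..15 (r8-r15) need a REX prefix, so give them a larger (worse) value.  */
static int
example_register_priority (int hard_regno)
{
  return (hard_regno >= 8 && hard_regno <= 15) ? 1 : 0;
}

/* Among the N candidate registers in AVAILABLE, return the one with the
   smallest priority value; the earliest candidate wins ties.  Returns -1
   if there are no candidates.  */
static int
pick_register (const int *available, int n)
{
  int best = -1;
  int best_prio = INT_MAX;
  int i;

  for (i = 0; i < n; i++)
    {
      int prio = example_register_priority (available[i]);

      if (prio < best_prio)
        {
          best = available[i];
          best_prio = prio;
        }
    }
  return best;
}
```

Because ties keep the earlier candidate, an existing allocation order is
preserved among registers of equal priority.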

+/* Return true if maximal address displacement can be different.  */
+DEFHOOK
+(different_addr_displacement_p,
+ A target hook which returns true if an address with the same structure\
+  can have different maximal legitimate displacement.  For example, the\
+  displacement can depend on memory mode or on operand combinations in\
+  the insn.\
+  \
+  The default version of this target hook returns always false.,
+ bool, (void),
+ default_different_addr_displacement_p)

If I read the patch correctly, this is only used in:

+   if (lra_reg_spill_p || targetm.different_addr_displacement_p ())
+ lra_set_used_insn_alternative (insn, -1);

and so we keep the current alternative when neither spill_class_mode
nor different_addr_displacement_p is defined.  How many targets on the
LRA branch are like that?  I would have expected most targets with limited
address displacements would have to return true for the above hook,
because multiword loads and stores typically have to be split into word
loads and stores.  Same goes for strict-alignment targets, where wider
modes often have slightly lower maximal displacements.

E.g. for MIPS, SImode loads and stores have a displacement range of
[-32768, 32764], but DImode loads and stores only accept [-32768, 32760].
So the maximal displacement depends on mode, even though the instruction set
is pretty regular.
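
The two ranges quoted above follow from simple arithmetic, sketched here
under stated assumptions: a signed 16-bit offset field (maximum 32767),
4-byte words, 4-byte alignment, and a DImode access split into two word
accesses.  The helper max_displacement is hypothetical and is not part of
the patch.

```c
#include <assert.h>

/* Largest ALIGN-aligned displacement D such that every WORD_SIZE-byte
   piece of a MODE_SIZE-byte access at D, D + WORD_SIZE, ... still has an
   offset no larger than MAX_OFFSET.  ALIGN must be a power of two.  */
static int
max_displacement (int mode_size, int word_size, int align, int max_offset)
{
  int last_word;

  assert (word_size > 0 && mode_size >= word_size);
  assert (align > 0 && (align & (align - 1)) == 0);

  /* Offset of the final piece relative to the start of the access.  */
  last_word = mode_size - word_size;

  /* Round the remaining headroom down to the required alignment.  */
  return (max_offset - last_word) & ~(align - 1);
}
```

With these assumptions a 4-byte access gives 32764 and an 8-byte access
gives 32760, matching the SImode and DImode upper bounds in the example.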

Targets with full address-size displacements can use the default false return,
but it looks like the x86 port defines spill_class_mode instead, so AIUI
the value isn't really tested on Core i7.  What's the impact of that compared
to the other x86 targets that don't set X86_TUNE_GENERAL_REGS_SSE_SPILL?
Is LRA just quicker for them, or will it make different decisions
(compared to Core i7) even for non-SSE insns?
It is mostly done for LRA speed.  We could remove the if-statement and 
everything would still be all right; only the lra-constraints pass would go 
over all alternatives again.


Currently, there are two targets for which the if-condition is true.  One is 
x86-64 (more exactly, when corei7 tuning is used) and the other is PARISC, 
for which different_addr_displacement_p is true.


There might be more in the future.  For future ARMs and Powers it might be 
profitable to spill general registers into vector/floating point 
registers.  Other new targets might need 
different_addr_displacement_p.


We could get rid of the hook, but it is important for speeding LRA up.  I 
felt that I should make LRA speed competitive with reload even though it is 
hard, because LRA works on RTL and not on internal structures as reload does.  
I see now I was right to worry about the speed.  I guess if we could 
sacrifice 2% of compilation time, LRA code would be smaller and 
clearer.  If we could sacrifice 10% of compilation time, the code 
would be even smaller and clearer because we could use the DF infrastructure.

+/* Determine class of registers which could be used for spilled
+   pseudos instead of memory.  */
+DEFHOOK
+(spill_class,
+ This hook defines a class of registers which could be used for spilled pseudos\
+  of given class instead of memory,
+ reg_class_t, (reg_class_t),
+ NULL)

Should probably say that NO_REGS means none.

+/* Determine mode for spilling pseudos into registers instead of memory.  */
+DEFHOOK
+(spill_class_mode,
+ This hook defines mode in which a pseudo of given mode and of the first\
+  register class can be spilled into the second register class,
+ enum machine_mode, (reg_class_t, reg_class_t, enum machine_mode),
+ NULL)

It looks like the only use is in:


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2c

2012-10-02 Thread Michael Meissner
On Tue, Oct 02, 2012 at 10:13:25AM +0200, Gunther Nikl wrote:
 Michael Meissner wrote:
  Segher Boessenkool asked me on IRC to break out the fix in the last change.
  This patch is just the change to set the default options if the user did not
  use -mcpu=xxx and the compiler was not configured with --with-cpu=xxx.
  Here are the patches.
 
 Which GCC releases are affected by this bug?

All of them.  Now, in general users don't see this bug, because distribution
maintainers usually build with an explicit --with-cpu= option, which sets the
default CPU in case the user did not use -mcpu=xxx on the command line.  If
neither option was used, the default powerpc or powerpc64 is usually good
enough.
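
The precedence described here can be sketched as follows.  All names
(struct cpu_entry, initial_isa_flags) are hypothetical and chosen only for
illustration; this is not the code in rs6000_option_override_internal.

```c
#include <stddef.h>

/* Hypothetical description of a CPU and the ISA bits it implies.  */
struct cpu_entry
{
  const char *name;
  unsigned isa_flags;
};

/* Pick the initial ISA flags: an explicit -mcpu= wins, then the configured
   --with-cpu= default, and only if neither exists the TARGET_DEFAULT bits.
   A NULL pointer means the corresponding option was not given.  */
static unsigned
initial_isa_flags (const struct cpu_entry *mcpu,
                   const struct cpu_entry *with_cpu,
                   unsigned target_default)
{
  if (mcpu != NULL)
    return mcpu->isa_flags;
  if (with_cpu != NULL)
    return with_cpu->isa_flags;
  return target_default;
}
```

The bug discussed in this thread concerns the last branch, which is only
reached when both the command-line and the configure-time CPU are absent.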

David noticed it when building AIX compilers, because he wanted to add a
default option (-mmfcrf) to the aix*.h definitions to ensure that the new get
timebase builtin would generate the correct instructions by default (the
original PowerPCs had a different SPR for the time base than the newer
server machines starting with power4).  He asked me to fix this bug before we
tackle the infrastructure changes.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899



[AARCH64-4.7][PATCH] Reload fix backported to aarch64-4.7-branch.

2012-10-02 Thread Tejas Belagod


Hi,

I've backported Ulrich's reload fix (attached)

http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01421.html

to aarch64-4.7-branch and committed it.

Sending        ChangeLog.aarch64
Sending        reload.c
Transmitting file data ..
Committed revision 191987.

Thanks,
Tejas.

diff --git a/gcc/reload.c b/gcc/reload.c
index 8420c80..a462419 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -283,7 +283,7 @@ static int find_reloads_address_1 (enum machine_mode, 
addr_space_t, rtx, int,
 static void find_reloads_address_part (rtx, rtx *, enum reg_class,
   enum machine_mode, int,
   enum reload_type, int);
-static rtx find_reloads_subreg_address (rtx, int, int, enum reload_type,
+static rtx find_reloads_subreg_address (rtx, int, enum reload_type,
int, rtx, int *);
 static void copy_replacements_1 (rtx *, rtx *, int);
 static int find_inc_amount (rtx, rtx);
@@ -4745,31 +4745,19 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type 
type,
}
 
   /* If the subreg contains a reg that will be converted to a mem,
-convert the subreg to a narrower memref now.
-Otherwise, we would get (subreg (mem ...) ...),
-which would force reload of the mem.
-
-We also need to do this if there is an equivalent MEM that is
-not offsettable.  In that case, alter_subreg would produce an
-invalid address on big-endian machines.
-
-For machines that extend byte loads, we must not reload using
-a wider mode if we have a paradoxical SUBREG.  find_reloads will
-force a reload in that case.  So we should not do anything here.  */
+attempt to convert the whole subreg to a (narrower or wider)
+memory reference instead.  If this succeeds, we're done --
+otherwise fall through to check whether the inner reg still
+needs address reloads anyway.  */
 
   if (regno >= FIRST_PSEUDO_REGISTER
-#ifdef LOAD_EXTEND_OP
-  && !paradoxical_subreg_p (x)
-#endif
-  && (reg_equiv_address (regno) != 0
- || (reg_equiv_mem (regno) != 0
-  && (! strict_memory_address_addr_space_p
- (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
-  MEM_ADDR_SPACE (reg_equiv_mem (regno)))
- || ! offsettable_memref_p (reg_equiv_mem (regno))
- || num_not_at_initial_offset
-   x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
-  insn, address_reloaded);
+  && reg_equiv_memory_loc (regno) != 0)
+   {
+ tem = find_reloads_subreg_address (x, opnum, type, ind_levels,
+insn, address_reloaded);
+ if (tem)
+   return tem;
+   }
 }
 
   for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
@@ -6007,12 +5995,31 @@ find_reloads_address_1 (enum machine_mode mode, 
addr_space_t as,
  if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))]
   > reg_class_size[(int) rclass])
{
- x = find_reloads_subreg_address (x, 0, opnum,
-  ADDR_TYPE (type),
-  ind_levels, insn, NULL);
- push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
-  GET_MODE (x), VOIDmode, 0, 0, opnum, type);
- return 1;
+ /* If the inner register will be replaced by a memory
+reference, we can do this only if we can replace the
+whole subreg by a (narrower) memory reference.  If
+this is not possible, fall through and reload just
+the inner register (including address reloads).  */
+ if (reg_equiv_memory_loc (REGNO (SUBREG_REG (x))) != 0)
+   {
+ rtx tem = find_reloads_subreg_address (x, opnum,
+ADDR_TYPE (type),
+ind_levels, insn,
+NULL);
+ if (tem)
+   {
+ push_reload (tem, NULL_RTX, loc, (rtx*) 0, rclass,
+  GET_MODE (tem), VOIDmode, 0, 0,
+  opnum, type);
+ return 1;
+   }
+   }
+ else
+   {
+ push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass,
+  GET_MODE (x), VOIDmode, 0, 0, opnum, type);
+ return 1;
+   }
}
}
}
@@ -6089,17 +6096,12 @@ 

Re: abs(long long)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote:

 The library installed by the system was compiled with g++, and is then
 used with clang++. If we can avoid installing 2 config.h files to make that
 work...


 Two things:
  1. that is clearly a clang problem.  I don't think it is libstdc++'s job
  to try to solve clang's misguided configuration and installation.


 Translated: libstdc++ should only ever be used with the very version of g++
 that was used to compile it. clang++, icpc, sunCC, etc should never try to
 use a libstdc++ compiled with another compiler.

Obviously, I cannot require you to exercise common sense
and keep nonsensical stretches in check.

libstdc++ was and is developed for GCC/g++.  If you have
a 3rd party compiler that you would like to use with g++/libstdc++, you
should
   (a) either convince your 3rd party compiler supplier
 to understand the library you already have (libstdc++), or
   (b) supply yourself the glue between libstdc++ and your compiler.
Many compilers have done that in the past; I don't see anything
special with clang++

Whining on this list about libstdc++ internal macros and your dislike
of them is not going to produce anything today or tomorrow.


 I am not saying libstdc++ should go to great lengths to support other
 compilers, but when it is actually easier to support them than not to...
 (testing a macro is easier than a configure test)


  2. I am not sure you understand what I wrote: you can leave the
  use of the current macro the way it is if you appropriately
  define it in terms of what you want to change it to.


 I was complaining about the configure-time nature of the macro. If it is
 defined at each compiler run based on __SIZEOF_INT128__, I am happy.

I am saying you can arrange to supply the appropriate definition
without having to change the uses.


 More precisely, does that mean you want __builtin_llabs instead of
 ::llabs?
 I thought the compiler knew they were the same.


 Yes. Another reason is that it simplifies the implementation AND if
 people want to do something with the intrinsics' fallback
 libstdc++ will gracefully deliver that.


 I don't see how that simplifies the implementation, it is several characters
 longer than ::llabs, and we still need to handle llabs.

You are on the wrong track if you are counting the number of characters
used in the implementation.


 Or do you mean:
 always call __builtin_llabs (whether we have an llabs or not), and let the
 compiler replace it with either (x<0)?-x:x or a library call (I assume it
 never does that unless it has seen a corresponding declaration)?

 Note that I am happy to let you take over this PR if you like.

 --
 Marc Glisse


Re: abs(long long)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 9:34 AM, Daniel Krügler
daniel.krueg...@gmail.com wrote:
 2012/10/2 Marc Glisse marc.gli...@inria.fr:
 Here I am talking of a library issue: the wording that says that there are
 sufficient overloads such that integer types call the double version of math
 functions. It is fairly obvious that it doesn't apply to abs(long) for
 instance which has an explicit overload. For short or unsigned, I still read
 it as saying that it converts to double...

 This really looks like a problem of the Standard Library specification
 to me and a corresponding issue should be submitted. In fact the wording
 can be interpreted to mean that mixing cstdlib with cmath would imply two
 different versions of std::abs(int) because of different required return
 types. I will prepare a corresponding submission to the LWG.

This was already an issue I reported to LWG when C++98 came out.
Now that you hold write access to the issue document, you can
make sure it won't slip through the cracks this time :-p

-- Gaby


RE: [Patch] Fix PR53397

2012-10-02 Thread Kumar, Venkataramanan
Hi Richi,

(Snip)
 + (!cst_and_fits_in_hwi (step))
 +{
 +  if (loop->inner != NULL)
 +{
 +  if (dump_file && (dump_flags & TDF_DETAILS))
 +{
 +  fprintf (dump_file, "Reference %p:\n", (void *) ref);
 +  fprintf (dump_file, "(base ");
 +  print_generic_expr (dump_file, base, TDF_SLIM);
 +  fprintf (dump_file, ", step ");
 +  print_generic_expr (dump_file, step, TDF_TREE);
 +  fprintf (dump_file, ")\n");

No need to repeat this - all references are dumped when we gather them.
(Snip)

The dumping happens in record_ref, which is called after these statements to 
record these references.

When the step is invariant we return from the function without recording the 
references, so I thought of dumping the references here.

Is there a cleaner way to dump the references in one place?

Regards,
Venkat.



-Original Message-
From: Richard Guenther [mailto:rguent...@suse.de] 
Sent: Tuesday, October 02, 2012 5:42 PM
To: Kumar, Venkataramanan
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [Patch] Fix PR53397

On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote:

 Hi,
 
 The below patch fixes the FFT/Scimark regression caused by useless 
 prefetch generation.
 
 This fix tries to make prefetch less aggressive by prefetching arrays 
 in the inner loop, when the step is invariant in the entire loop nest.
 
 GCC currently tries to prefetch invariant steps when they are in the 
 inner loop, but does not check whether the step is variant in outer loops.
 
 In the scimark FFT case, the trip count of the inner loop varies by a 
 non-constant step, which is invariant in the inner loop.
 But the step variable varies in the outer loop. This makes the inner loop 
 trip count small (at run time it is sometimes as small as 1 iteration).
 
 Prefetching ahead x iterations when the inner loop trip count is 
 smaller than x leads to useless prefetches.
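
The last point can be put into numbers with a small sketch.  It assumes a
simplified model in which each iteration issues one prefetch for the data
needed AHEAD iterations later; useful_prefetches is a hypothetical helper,
not code from tree-ssa-loop-prefetch.c.

```c
/* Number of prefetches whose target iteration is actually executed, when
   each of TRIP_COUNT iterations prefetches for the iteration AHEAD steps
   later.  Prefetches issued in the last AHEAD iterations point past the
   end of the loop and are wasted.  */
static int
useful_prefetches (int trip_count, int ahead)
{
  return trip_count > ahead ? trip_count - ahead : 0;
}
```

In this model a loop with a trip count of 1 and a prefetch distance of 8
issues one prefetch and none of them is useful, while a loop of 1024
iterations wastes only the last 8.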
 
 Flag used: -O3 -march=amdfam10
 
 Before
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov)     **
 Using   2.00 seconds min time per kenel.
 Composite Score:           550.50
 FFT             Mflops:     38.66    (N=1024)
 SOR             Mflops:    617.61    (100 x 100)
 MonteCarlo:     Mflops:    173.74
 Sparse matmult  Mflops:    675.63    (N=1000, nz=5000)
 LU              Mflops:   1246.88    (M=100, N=100)
 
 
 After
 ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
 ** for details. (Results can be submitted to p...@nist.gov)     **
 Using   2.00 seconds min time per kenel.
 Composite Score:           639.20
 FFT             Mflops:    479.19    (N=1024)
 SOR             Mflops:    617.61    (100 x 100)
 MonteCarlo:     Mflops:    173.18
 Sparse matmult  Mflops:    679.13    (N=1000, nz=5000)
 LU              Mflops:   1246.88    (M=100, N=100)
 
 GCC regression make check -k passes with x86_64-unknown-linux-gnu 
 New tests that PASS:
 
 gcc.dg/pr53397-1.c scan-assembler prefetcht0 gcc.dg/pr53397-1.c 
 scan-tree-dump aprefetch Issued prefetch
 gcc.dg/pr53397-1.c (test for excess errors) gcc.dg/pr53397-2.c 
 scan-tree-dump aprefetch loop variant step
 gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching
 gcc.dg/pr53397-2.c (test for excess errors)
 
 
 Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted.
 
 Ok to commit in trunk?
 
 regards,
 Venkat
 
 gcc/ChangeLog
 +2012-10-01  Venkataramanan Kumar  venkataramanan.ku...@amd.com
 +
 +   * tree-ssa-loop-prefetch.c (gather_memory_references_ref):
 +   Perform non constant step prefetching in inner loop, only
 +   when it is invariant in the entire loop nest.
 +   * testsuite/gcc.dg/pr53397-1.c: New test case.
 +   Checks we are prefetching for loop invariant steps.
 +   * testsuite/gcc.dg/pr53397-2.c: New test case.
 +   Checks we are not prefetching for loop variant steps.
 +
 
 
 Index: gcc/testsuite/gcc.dg/pr53397-1.c 
 ===
 --- gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 +++ gcc/testsuite/gcc.dg/pr53397-1.c  (revision 0)
 @@ -0,0 +1,28 @@
 +/* Prefetching when the step is loop invariant.  */
 +
 +/* { dg-do compile } */
 +/* { dg-options -O3 -fprefetch-loop-arrays 
 +-fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 
 +--param simultaneous-prefetches=10 -fdump-tree-aprefetch-details } 
 +*/
 +
 +
 +double data[16384];
 +void prefetch_when_non_constant_step_is_invariant(int step, int n) {
 + int a;
 + int b;
 + for (a = 1; a < step; a++) {
 +for (b = 0; b < n; b += 2 * step) {
 +
 +  int i = 2*(b + a);
 +  int j = 2*(b 

Re: abs(long long)

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote:

 Or do you mean:
 always call __builtin_llabs (whether we have an llabs or not), and let the
 compiler replace it with either (x<0)?-x:x or a library call (I assume it
 never does that unless it has seen a corresponding declaration)?

See what we did in c/cmath and c_global/cmath.
What you find there is the result of years of several iterations
(including something similar to your earlier patch) all having issues
in one way or another until we settled on the builtin functions approach.
I have no appetite to go back to those days full of headaches.
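
For readers following the thread, the two alternatives under discussion can
be placed side by side.  This is a sketch only: cond_llabs and builtin_llabs
are hypothetical wrapper names, and __builtin_llabs is assumed to be
available, as it is in GCC and Clang.  It is not the code in c_global/cmath.

```c
#include <stdlib.h>

/* The open-coded form mentioned in the question.  Like llabs itself, it
   has undefined behaviour for the most negative value.  */
static long long
cond_llabs (long long x)
{
  return x < 0 ? -x : x;
}

/* The builtin form: no declaration of llabs needs to be visible, and the
   compiler decides whether to expand it inline or call the library.  */
static long long
builtin_llabs (long long x)
{
  return __builtin_llabs (x);
}
```

Both wrappers return the same value as the C library's llabs for any
argument whose negation is representable.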

-- Gaby


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Ian Lance Taylor
On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote:

 On a related issue, it looks to me that the compiler itself should be
 compiled with -funwind-tables, otherwise there are no backtraces
 generated, even if libbacktrace is linked in and operational. Again,
 x86_64-linux-gnu host defaults to this flag, but other hosts are left
 behind.

Compiling with C++ should always give us -funwind-tables.

If it doesn't for some reason, then I agree.  We might even want
-fasynchronous-unwind-tables for the compiler.

Ian


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Jakub Jelinek
On Tue, Oct 02, 2012 at 10:12:38AM -0700, Ian Lance Taylor wrote:
 On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote:
 
  On a related issue, it looks to me that the compiler itself should be
  compiled with -funwind-tables, otherwise there are no backtraces
  generated, even if libbacktrace is linked in and operational. Again,
  x86_64-linux-gnu host defaults to this flag, but other hosts are left
  behind.
 
 Compiling with C++ should always give us -funwind-tables.

It doesn't give that, because the compiler is compiled with
-fno-exceptions -fno-rtti.

Jakub


Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2d

2012-10-02 Thread David Edelsohn
On Mon, Oct 1, 2012 at 7:11 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 2012-10-01  Michael Meissner  meiss...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (rs6000_option_override_internal): If
 -mcpu=xxx is not specified and the compiler is not configured
 using --with-cpu=xxx, use the bits from the TARGET_DEFAULT to
 set the initial options.

 I reworked the patch to allow TARGET_DEFAULT bits to be set if there is no
 -mcpu=xxx and the compiler was not configured using --with-cpu=xxx, so 
 that
 we don't first clear all of the ISA bits, set them from the cpu, and then 
 merge
 back in the TARGET_DEFAULT bits.

This version of the patch is good.

Thanks, David


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Gabriel Dos Reis
On Tue, Oct 2, 2012 at 12:14 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Oct 02, 2012 at 10:12:38AM -0700, Ian Lance Taylor wrote:
 On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote:
 
  On a related issue, it looks to me that the compiler itself should be
  compiled with -funwind-tables, otherwise there are no backtraces
  generated, even if libbacktrace is linked in and operational. Again,
  x86_64-linux-gnu host defaults to this flag, but other hosts are left
  behind.

 Compiling with C++ should always give us -funwind-tables.

 It doesn't give that, because the compiler is compiled with
 -fno-exceptions -fno-rtti.

I believe in the long term we would want to drop either of those.

-- Gaby


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Uros Bizjak
On Tue, Oct 2, 2012 at 7:44 PM, Gabriel Dos Reis
g...@integrable-solutions.net wrote:

  On a related issue, it looks to me that the compiler itself should be
  compiled with -funwind-tables, otherwise there are no backtraces
  generated, even if libbacktrace is linked in and operational. Again,
  x86_64-linux-gnu host defaults to this flag, but other hosts are left
  behind.

 Compiling with C++ should always give us -funwind-tables.

 It doesn't give that, because the compiler is compiled with
 -fno-exceptions -fno-rtti.

 I believe in the long term we would want to drop either of those.

For the short term, I am bootstrapping the attached patch, which adds
-funwind-tables to the other noexcept flags.

Uros.
Index: configure
===
--- configure   (revision 191991)
+++ configure   (working copy)
@@ -6636,7 +6636,7 @@
 # Disable exceptions and RTTI if building with g++
 noexception_flags=
 save_CFLAGS=$CFLAGS
-for real_option in -fno-exceptions -fno-rtti; do
+for real_option in -fno-exceptions -fno-rtti -funwind-tables; do
   # Do the check with the no- prefix removed since gcc silently
   # accepts any -Wno-* option on purpose
   case $real_option in
Index: configure.ac
===
--- configure.ac(revision 191991)
+++ configure.ac(working copy)
@@ -365,7 +365,8 @@
 
 # Disable exceptions and RTTI if building with g++
 ACX_PROG_CC_WARNING_OPTS(
-   m4_quote(m4_do([-fno-exceptions -fno-rtti])), [noexception_flags])
+   m4_quote(m4_do([-fno-exceptions -fno-rtti -funwind-tables])),
+  [noexception_flags])

 # Enable expensive internal checks
 is_release=


Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

On 9/23/2012 7:19 PM, Bruce Korb wrote:


The attached patch needs to be split into two and I will do that before
I actually push the thing.  Since I have run out of play time this weekend
and since I will be in the Ukraine in two weeks for two weeks, this patch
is unlikely to get pushed before the end of October.  Sorry about that.



I've tried to do some of this work since Bruce is out.  I ended up 
splitting it into four patches.


Patches to follow,

Robert Mason



Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Patch 1:  [fixincludes] Fixes for VxWorks

TODO Prior to commit:

* fixincl.x: Regenerate

ChangeLog [fixincludes]:

2012-06-19  Robert Mason  r...@verizon.net

* fixinc.in: Check to see if the machine_name fix needs to be disabled.
viz. vxworks must not check the machine name for fix applicability.
* inclhack.def (AAB_vxworks_assert): New replacement fix
(AAB_vxworks_regs_vxtypes): likewise
(AAB_vxworks_stdint): and again
(AAB_vxworks_unistd) and yet again
(vxworks_ioctl_macro): wrap ioctl function in macro
(vxworks_mkdir_macro): remove mkdir() args vxworks doesn't support
(vxworks_regs): make sure regs.h comes from above arch directory.
(vxworks_write_const): add const attribute to data argument
* mkfixinc.sh: remove vxworks from list of platforms skipped by
fixincludes

2012-09-23  Bruce Korb  bk...@gnu.org

* tests/base/ioLib.h: new test header for new vxworks fix.
* tests/base/math.h: fix results movement
* tests/base/sys/stat.h: vxworks test
* tests/base/testing.h: vxworks test



Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Forgot to attach.




From 5da04a0758548288d5f004ed294ac3e903e229a8 Mon Sep 17 00:00:00 2001
From: rbmj r...@verizon.net
Date: Tue, 2 Oct 2012 13:51:18 -0400
Subject: [PATCH 1/4] [fixincludes] Add fixes for VxWorks

---
 fixincludes/fixinc.in  |   16 ++
 fixincludes/inclhack.def   |  266 
 fixincludes/mkfixinc.sh|1 -
 fixincludes/tests/base/ioLib.h |   19 ++
 fixincludes/tests/base/math.h  |   10 +-
 fixincludes/tests/base/sys/stat.h  |7 +
 fixincludes/tests/base/testing.h   |6 +
 8 files changed, 324 insertions(+), 85 deletions(-)
 create mode 100644 fixincludes/tests/base/ioLib.h

diff --git a/fixincludes/fixinc.in b/fixincludes/fixinc.in
index e73aed9..f7b8d8f 100755
--- a/fixincludes/fixinc.in
+++ b/fixincludes/fixinc.in
@@ -128,6 +128,22 @@ fi
 
 # # # # # # # # # # # # # # # # # # # # #
 #
+#  Check to see if the machine_name fix needs to be disabled.
+#
+#  On some platforms, machine_name doesn't work properly and
+#  breaks some of the header files.  Since everything works
+#  properly without it, just wipe the macro list to
+#  disable the fix.
+
+case ${target_canonical} in
+*-*-vxworks*)
+	test -f ${MACRO_LIST} && echo > ${MACRO_LIST}
+;;
+esac
+
+
+# # # # # # # # # # # # # # # # # # # # #
+#
 #  In the file macro_list are listed all the predefined
 #  macros that are not in the C89 reserved namespace (the reserved
 #  namespace is all identifiers beginnning with two underscores or one
diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def
index 82792af..c5ae854 100644
--- a/fixincludes/inclhack.def
+++ b/fixincludes/inclhack.def
@@ -354,6 +354,206 @@ fix = {
 	_EndOfHeader_;
 };
 
+/*
+ * Fix assert.h on VxWorks:
+ */
+fix = {
+hackname= AAB_vxworks_assert;
+files   = assert.h;
+mach= *-*-vxworks*;
+
+replace = <<- _EndOfHeader_
+	#ifndef _ASSERT_H
+	#define _ASSERT_H
+
+	#ifdef assert
+	#undef assert
+	#endif
+
+	#if defined(__STDC__) || defined(__cplusplus)
+	extern void __assert (const char*);
+	#else
+	extern void __assert ();
+	#endif
+
+	#ifdef NDEBUG
+	#define assert(ign) ((void)0)
+	#else
+
+	#define ASSERT_STRINGIFY(str) ASSERT_STRINGIFY_HELPER(str)
+	#define ASSERT_STRINGIFY_HELPER(str) #str
+
+	#define assert(test) ((void) \
+	((test) ? ((void)0) : \
+	__assert("Assertion failed: " ASSERT_STRINGIFY(test) ", file " \
+	__FILE__ ", line " ASSERT_STRINGIFY(__LINE__) "\n")))
+
+	#endif
+
+	#endif
+	_EndOfHeader_;
+};
+
+/*
+ * Add needed include to regs.h (NOT the gcc header) on VxWorks
+ */
+
+fix = {
+hackname= AAB_vxworks_regs_vxtypes;
+files   = regs.h;
+mach= *-*-vxworks*;
+
+replace = <<- _EndOfHeader_
+	#ifndef _REGS_H
+	#define _REGS_H
+	#include <types/vxTypesOld.h>
+	#include_next <arch/../regs.h>
+	#endif
+	_EndOfHeader_;
+};
+
+/*
+ * Make VxWorks stdint.h a bit more compliant - add typedefs
+ */
+fix = {
+hackname= AAB_vxworks_stdint;
+files   = stdint.h;
+mach= *-*-vxworks*;
+
+replace = <<- _EndOfHeader_
+	#ifndef _STDINT_H
+	#define _STDINT_H
+	/* get int*_t, uint*_t */
+	#include <types/vxTypes.h>
+	
+	/* get legacy vxworks types for compatibility */
+	#include <types/vxTypesOld.h>
+	
+	typedef long intptr_t;
+	typedef unsigned long uintptr_t;
+	
+	typedef int64_t intmax_t;
+	typedef uint64_t uintmax_t;
+	
+	typedef int8_t int_least8_t;
+	typedef int16_t int_least16_t;
+	typedef int32_t int_least32_t;
+	typedef int64_t int_least64_t;
+	
+	typedef uint8_t uint_least8_t;
+	typedef uint16_t uint_least16_t;
+	typedef uint32_t uint_least32_t;
+	typedef uint64_t uint_least64_t;
+	
+	typedef int8_t int_fast8_t;
+	typedef int int_fast16_t;
+	typedef int32_t int_fast32_t;
+	typedef 

Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Patch 2: [fixincludes] Clean up fixincludes test machinery

TODO Prior to commit:

* fixincl.x: Regenerate

ChangeLog

2012-09-23  Bruce Korb  bk...@gnu.org

* check.tpl: export TEST_MODE=true for testing
* fixincl.c (te_verbose): extract to fixlib.h
(run_compiles): in test mode, if the fix is a replacement,
then skip the test.  The fix will not be applied.
* fixlib.h (fixinc_mode): new global variable that defaults to
TESTING_OFF but is set to TESTING_ON when TEST_MODE is true.
* fixopts.c: define this global variable
(initialize_opts): set it to TESTING_ON under proper conditions
* inclhack.def (AAB_darwin7_9_long_double_funcs_2): this is *NOT*
a replacement fix.  Rename it and move it where it belongs as
(darwin_9_long_double_funcs_2): renamed fix
(broken_nan): this had a broken selection regex.  Could never work.
* tests/base/architecture/ppc/math.h:  replacement fixes are not tested,
so remove all the replacement text.  Add in the broken_nan test
that used to never, ever fire.
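
The test-mode behaviour described in this ChangeLog can be sketched roughly as follows. This is only an illustration under stated assumptions: the names TESTING_OFF, TESTING_ON, FD_REPLACEMENT and FD_SKIP_TEST come from the patch, but the bit values, the fixinc_mode_t typedef, the helper functions, and the rule that the string "true" turns test mode on are made up here and are not taken from fixopts.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mode names follow the ChangeLog; the typedef name is hypothetical.  */
typedef enum { TESTING_OFF = 0, TESTING_ON } fixinc_mode_t;

/* Flag names appear in the patch; these bit values are illustrative.  */
#define FD_REPLACEMENT 0x1u
#define FD_SKIP_TEST   0x2u

/* Assumed parsing rule: only the exact string "true" enables testing.  */
static fixinc_mode_t mode_from_value(const char *v)
{
  return (v != NULL && strcmp(v, "true") == 0) ? TESTING_ON : TESTING_OFF;
}

/* Read the mode from the TEST_MODE environment variable, which
   check.tpl exports before running fixincl.  */
static fixinc_mode_t mode_from_env(void)
{
  return mode_from_value(getenv("TEST_MODE"));
}

/* In test mode a replacement fix is marked as skipped, because its
   output is fixed text and there is nothing to select on.  Returns
   the updated flag word.  */
static unsigned int apply_test_mode(fixinc_mode_t mode, unsigned int fd_flags)
{
  assert(mode == TESTING_OFF || mode == TESTING_ON);
  if (mode == TESTING_ON && (fd_flags & FD_REPLACEMENT))
    fd_flags |= FD_SKIP_TEST;
  return fd_flags;
}
```

Keeping the environment lookup separate from the string check makes the rule easy to exercise without touching the process environment.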



Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Patch 3: Add --enable-libstdcxx option at top level configure

TODO prior to commit:

* configure: regenerate

ChangeLog:

* configure.ac: Add --enable-libstdcxx option

From 3f0d38b7b7b70659a57ac4266701a71a5f948860 Mon Sep 17 00:00:00 2001
From: rbmj r...@verizon.net
Date: Tue, 2 Oct 2012 13:54:21 -0400
Subject: [PATCH 3/4] Add --enable-libstdcxx option at top level configure

---
 configure.ac |   38 +-
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/configure.ac b/configure.ac
index f0d86d9..5325695 100644
--- a/configure.ac
+++ b/configure.ac
@@ -427,6 +427,15 @@ AC_ARG_ENABLE(libssp,
 ENABLE_LIBSSP=$enableval,
 ENABLE_LIBSSP=yes)
 
+AC_ARG_ENABLE(libstdcxx,
+AS_HELP_STRING([--disable-libstdcxx],
+  [do not build libstdc++-v3 directory]),
+ENABLE_LIBSTDCXX=$enableval,
+ENABLE_LIBSTDCXX=default)
+[if test ${ENABLE_LIBSTDCXX} = no ; then
+  noconfigdirs="$noconfigdirs libstdc++-v3"
+fi]
+
 # Save it here so that, even in case of --enable-libgcj, if the Java
 # front-end isn't enabled, we still get libgcj disabled.
 libgcj_saved=$libgcj
@@ -562,19 +571,22 @@ case ${target} in
 esac
 
 # Disable libstdc++-v3 for some systems.
-case ${target} in
-  *-*-vxworks*)
-# VxWorks uses the Dinkumware C++ library.
-noconfigdirs="$noconfigdirs target-libstdc++-v3"
-;;
-  arm*-wince-pe*)
-# the C++ libraries don't build on top of CE's C libraries
-noconfigdirs="$noconfigdirs target-libstdc++-v3"
-;;
-  avr-*-*)
-noconfigdirs="$noconfigdirs target-libstdc++-v3"
-;;
-esac
+# Allow user to override this if they pass --enable-libstdc++-v3
+if test ${ENABLE_LIBSTDCXX} = default ; then
+  case ${target} in
+*-*-vxworks*)
+  # VxWorks uses the Dinkumware C++ library.
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
+arm*-wince-pe*)
+  # the C++ libraries don't build on top of CE's C libraries
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
+avr-*-*)
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
+  esac
+fi
 
 # Disable Fortran for some systems.
 case ${target} in
-- 
1.7.10.4



Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Patch 4: Minor changes to fix compilation on VxWorks

ChangeLog [gcc]:
* gcov-io.c (gcov_open): Pass third argument to open() unconditionally

ChangeLog [libstdc++-v3]:
* libstdc++-v3/config/os/vxworks/os_defines.h: Define NOMINMAX
From 420bf6c2b0bde5f1689663b477add8fc9df2a6f0 Mon Sep 17 00:00:00 2001
From: rbmj r...@verizon.net
Date: Tue, 2 Oct 2012 13:55:02 -0400
Subject: [PATCH 4/4] Minor source changes to allow compilation on VxWorks

---
 gcc/gcov-io.c   |3 ++-
 libstdc++-v3/config/os/vxworks/os_defines.h |6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c
index d64fb42..f562654 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -92,7 +92,8 @@ gcov_open (const char *name, int mode)
 {
   /* Read-only mode - acquire a read-lock.  */
   s_flock.l_type = F_RDLCK;
-  fd = open (name, O_RDONLY);
+  /* pass mode (ignored) for compatibility */
+  fd = open (name, O_RDONLY, S_IRUSR | S_IWUSR);
 }
   else
 {
diff --git a/libstdc++-v3/config/os/vxworks/os_defines.h b/libstdc++-v3/config/os/vxworks/os_defines.h
index c66063e..93ad1d4 100644
--- a/libstdc++-v3/config/os/vxworks/os_defines.h
+++ b/libstdc++-v3/config/os/vxworks/os_defines.h
@@ -33,4 +33,10 @@
 // System-specific #define, typedefs, corrections, etc, go here.  This
 // file will come before all others.
 
+//Keep vxWorks from defining min()/max() as macros
+#ifdef NOMINMAX
+#undef NOMINMAX
+#endif
+#define NOMINMAX 1
+
 #endif
-- 
1.7.10.4
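
For readers wondering about the gcov-io.c change: on POSIX systems the mode argument of open() is only consulted when a file is created, so passing it on a read-only open is harmless. A minimal sketch, assuming a POSIX host; open_read_only is a hypothetical helper, and the reason VxWorks needs the argument (presumably a non-variadic three-parameter declaration of open) is an assumption rather than something stated in the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open NAME read-only, always passing a mode argument.  Without
   O_CREAT the mode is ignored on POSIX, so this behaves like the
   two-argument call there.  Returns a file descriptor or -1.  */
static int open_read_only(const char *name)
{
  assert(name != NULL);
  return open(name, O_RDONLY, S_IRUSR | S_IWUSR);
}
```

The caller still owns the descriptor and should close() it when done.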



Re: [PATCH] Add a new option -fstack-protector-strong (patch / doc inside)

2012-10-02 Thread 沈涵
Hi, has anyone had a chance to take a look at this patch?

Others seem to be interested in it as well: a clang developer is also
proposing to implement this -fstack-protector-strong option.

The patch has just been merged with the newest trunk, and a bug reported
by Kees has been fixed.

Tested for x86_64 and arm.

The patches below are also uploaded as patchset #4 at
https://codereview.appspot.com/6303078/

==

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 299150e..6eb18d6 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1244,6 +1244,11 @@ clear_tree_used (tree block)
 #define SPCT_HAS_ARRAY 4
 #define SPCT_HAS_AGGREGATE 8

+/* Constants for flag_stack_protect. */
+#define SPCT_ALL 3
+#define SPCT_STRONG 2
+#define SPCT_DEFAULT 1
+
 static unsigned int
 stack_protect_classify_type (tree type)
 {
@@ -1306,7 +1311,8 @@ stack_protect_decl_phase (tree decl)
   if (bits & SPCT_HAS_SMALL_CHAR_ARRAY)
 has_short_buffer = true;

-  if (flag_stack_protect == 2)
+  if (flag_stack_protect == SPCT_ALL ||
+  flag_stack_protect == SPCT_STRONG)
 {
   if ((bits & (SPCT_HAS_SMALL_CHAR_ARRAY | SPCT_HAS_LARGE_CHAR_ARRAY))
       && !(bits & SPCT_HAS_AGGREGATE))
@@ -1444,6 +1450,29 @@ estimated_stack_frame_size (struct cgraph_node *node)
   return size;
 }

+/* Helper routine to check if a record or union contains an array field. */
+
+static int
+record_or_union_type_has_array_p (const_tree tree_type)
+{
+  tree fields = TYPE_FIELDS (tree_type);
+  tree f;
+
+  for (f = fields; f; f = DECL_CHAIN (f))
+{
+  if (TREE_CODE (f) == FIELD_DECL)
+ {
+  tree field_type = TREE_TYPE (f);
+  if (RECORD_OR_UNION_TYPE_P (field_type) &&
+  record_or_union_type_has_array_p (field_type))
+return 1;
+  if (TREE_CODE (field_type) == ARRAY_TYPE)
+return 1;
+ }
+}
+  return 0;
+}
+
 /* Expand all variables used in the function.  */

 static void
@@ -1454,6 +1483,7 @@ expand_used_vars (void)
   struct pointer_map_t *ssa_name_decls;
   unsigned i;
   unsigned len;
+  int gen_stack_protect_signal = 0;

   /* Compute the phase of the stack frame for this function.  */
   {
@@ -1505,6 +1535,23 @@ expand_used_vars (void)
 }
   pointer_map_destroy (ssa_name_decls);

+  FOR_EACH_LOCAL_DECL (cfun, i, var)
+{
+  tree var_type = TREE_TYPE (var);
+  /* Examine local referenced variables that have their addresses taken,
+ contain an array, or are arrays.  */
+  if (TREE_CODE (var) == VAR_DECL
+   && (TREE_CODE (var_type) == ARRAY_TYPE
+  || TREE_ADDRESSABLE (var)
+  || (RECORD_OR_UNION_TYPE_P (var_type)
+   && record_or_union_type_has_array_p (var_type))))
+ {
+  ++gen_stack_protect_signal;
+  break;
+ }
+}
+
+
   /* At this point all variables on the local_decls with TREE_USED
  set are not associated with any block scope.  Lay them out.  */

@@ -1591,11 +1638,18 @@ expand_used_vars (void)
  dump_stack_var_partition ();
 }

-  /* There are several conditions under which we should create a
- stack guard: protect-all, alloca used, protected decls present.  */
-  if (flag_stack_protect == 2
-  || (flag_stack_protect
-   && (cfun->calls_alloca || has_protected_decls)))
+  /* Create stack guard, if
+ a) -fstack-protector-all - always;
+ b) -fstack-protector-strong - if there are arrays, memory
+ references to local variables, alloca used, or protected decls present;
+ c) -fstack-protector - if alloca used, or protected decls present  */
+  if (flag_stack_protect == SPCT_ALL  /* -fstack-protector-all  */
+  || (flag_stack_protect == SPCT_STRONG  /* -fstack-protector-strong  */
+   && (gen_stack_protect_signal || cfun->calls_alloca
+  || has_protected_decls))
+  || (flag_stack_protect == SPCT_DEFAULT  /* -fstack-protector  */
+   && (cfun->calls_alloca
+  || has_protected_decls)))
 create_stack_guard ();

   /* Assign rtl to each variable based on these partitions.  */
@@ -1612,7 +1666,8 @@ expand_used_vars (void)
   expand_stack_vars (stack_protect_decl_phase_1);

   /* Phase 2 contains other kinds of arrays.  */
-  if (flag_stack_protect == 2)
+  if (flag_stack_protect == SPCT_ALL ||
+  flag_stack_protect == SPCT_STRONG)
 expand_stack_vars (stack_protect_decl_phase_2);
  }

diff --git a/gcc/common.opt b/gcc/common.opt
index f0e757c..942fbc0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1892,8 +1892,12 @@ fstack-protector
 Common Report Var(flag_stack_protect, 1)
 Use propolice as a stack protection method

-fstack-protector-all
+fstack-protector-strong
 Common Report RejectNegative Var(flag_stack_protect, 2)
+Use a smart stack protection method for certain functions
+
+fstack-protector-all
+Common Report RejectNegative Var(flag_stack_protect, 3)
 Use a stack protection method for every function

 fstack-usage
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7578dda..e1f2f2d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -406,7 +406,7 @@ Objective-C and 
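
The guard decision in the expand_used_vars hunk of this patch can be restated as a small standalone function. The sketch below is a simplified model, not GCC code: the SPCT_* values for 1 to 3 follow the patch, while SPCT_OFF, struct frame_facts and needs_stack_guard are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values of flag_stack_protect; 1..3 follow the SPCT_* constants in the
   patch, SPCT_OFF is a name made up here for "no option given".  */
enum { SPCT_OFF = 0, SPCT_DEFAULT = 1, SPCT_STRONG = 2, SPCT_ALL = 3 };

/* Hypothetical per-function summary of what expand_used_vars examines.  */
struct frame_facts
{
  bool has_array_or_address_taken;  /* models gen_stack_protect_signal */
  bool calls_alloca;
  bool has_protected_decls;
};

/* Returns true when a stack guard would be created for the function.  */
static bool needs_stack_guard(int flag, const struct frame_facts *f)
{
  assert(f != NULL);
  switch (flag)
    {
    case SPCT_ALL:      /* -fstack-protector-all: always */
      return true;
    case SPCT_STRONG:   /* -fstack-protector-strong */
      return f->has_array_or_address_taken || f->calls_alloca
             || f->has_protected_decls;
    case SPCT_DEFAULT:  /* -fstack-protector */
      return f->calls_alloca || f->has_protected_decls;
    default:
      return false;
    }
}
```

The only difference between the strong and default modes in this model is the extra trigger for local arrays and address-taken locals.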

Re: [PATCH] fix up fixincludes for VxWorks and fix testing

2012-10-02 Thread rbmj

Forgot to attach...




From 56861b9c45b43c1443f88e56e6fa46fde590a70f Mon Sep 17 00:00:00 2001
From: rbmj r...@verizon.net
Date: Tue, 2 Oct 2012 13:52:27 -0400
Subject: [PATCH 2/4] [fixincludes] Clean up fixincludes test machinery

---
 fixincludes/README   |3 +++
 fixincludes/check.tpl|1 +
 fixincludes/fixincl.c|   27 +++
 fixincludes/fixlib.h |   26 +-
 fixincludes/fixopts.c|   42 +++---
 fixincludes/fixtests.c   |2 +-
 fixincludes/inclhack.def |   42 +-
 fixincludes/tests/base/architecture/ppc/math.h |   84 +---
 7 files changed, 89 insertions(+), 54 deletions(-)

diff --git a/fixincludes/README b/fixincludes/README
index c7144a0..9b48210 100644
--- a/fixincludes/README
+++ b/fixincludes/README
@@ -44,6 +44,9 @@ To make your fix, you will need to do several things:
 Make sure it is now properly handled.  Add tests to the
 test_text entry(ies) that validate your fix.  This will
 help ensure that future fixes won't negate your work.
+Do *NOT* specify test text for wrap or replacement fixes.
+There is no real possibility that these fixes will fail.
+If they do, you will surely know straight away.
 
 5.  Go into the fixincludes build directory and type, make check.
 You are guaranteed to have issues printed out as a result.
diff --git a/fixincludes/check.tpl b/fixincludes/check.tpl
index a9810e2..0d1f444 100644
--- a/fixincludes/check.tpl
+++ b/fixincludes/check.tpl
@@ -99,6 +99,7 @@ ENDFOR  fix
 
 =]
 
+export TEST_MODE=true
 find . -type f | sed 's;^\./;;' | sort | ../../fixincl
 cd ${DESTDIR}
 
diff --git a/fixincludes/fixincl.c b/fixincludes/fixincl.c
index 1133534..fecfb19 100644
--- a/fixincludes/fixincl.c
+++ b/fixincludes/fixincl.c
@@ -53,22 +53,8 @@ static const char z_std_preamble[] =
 original, manufacturer supplied header file.  */\n\n;
 
 int find_base_len = 0;
-
-typedef enum {
-  VERB_SILENT = 0,
-  VERB_FIXES,
-  VERB_APPLIES,
-  VERB_PROGRESS,
-  VERB_TESTS,
-  VERB_EVERYTHING
-} te_verbose;
-
-te_verbose  verbose_level = VERB_PROGRESS;
 int have_tty = 0;
 
-#define VLEVEL(l)  ((unsigned int) verbose_level >= (unsigned int) l)
-#define NOT_SILENT VLEVEL(VERB_FIXES)
-
 pid_t process_chain_head = (pid_t) -1;
 
 char*  pz_curr_file;  /*  name of the current file under test/fix  */
@@ -412,8 +398,17 @@ run_compiles (void)
   /* FOR every fixup, ...  */
   do
 {
-  tTestDesc *p_test = p_fixd->p_test_desc;
-  int test_ct = p_fixd->test_ct;
+  tTestDesc *p_test;
+  int test_ct;
+
+  if (fixinc_mode && (p_fixd->fd_flags & FD_REPLACEMENT))
+{
+  p_fixd->fd_flags |= FD_SKIP_TEST;
+  continue;
+}
+
+  p_test = p_fixd->p_test_desc;
+  test_ct = p_fixd->test_ct;
 
   /*  IF the machine type pointer is not NULL (we are not in test mode)
  AND this test is for or not done on particular machines
diff --git a/fixincludes/fixlib.h b/fixincludes/fixlib.h
index 42d98b2..19df48a 100644
--- a/fixincludes/fixlib.h
+++ b/fixincludes/fixlib.h
@@ -140,7 +140,10 @@ typedef int apply_fix_p_t;  /* Apply Fix Predicate Type */
  amount of user entertainment )\
  \
   _ENV_( pz_find_base, BOOL_TRUE, FIND_BASE,   \
- leader to trim from file names )
+ leader to trim from file names )  \
+ \
+  _ENV_( pz_test_mode, BOOL_FALSE, TEST_MODE,  \
+ run fixincludes in test mode )
 
 #define _ENV_(v,m,n,t)   extern tCC* v;
 ENV_TABLE
@@ -211,6 +214,27 @@ typedef struct {
 
 extern int gnu_type_map_ct;
 
+typedef enum {
+  VERB_SILENT = 0,
+  VERB_FIXES,
+  VERB_APPLIES,
+  VERB_PROGRESS,
+  VERB_TESTS,
+  VERB_EVERYTHING
+} 

[wwwdocs] Buildstat update for 4.4

2012-10-02 Thread Tom G. Christensen
Latest results for 4.4.x

-tgc

Testresults for 4.4.7:
  alphaev68-dec-osf5.1a
  hppa2.0w-hp-hpux11.00
  hppa2.0w-hp-hpux11.11
  hppa64-hp-hpux11.00
  hppa64-hp-hpux11.11
  i386-pc-solaris2.8

Testresults for 4.4.1:
  alphaev68-dec-osf5.1a

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.4/buildstat.html,v
retrieving revision 1.26
diff -u -r1.26 buildstat.html
--- buildstat.html  4 Apr 2012 11:24:55 -   1.26
+++ buildstat.html  2 Oct 2012 18:17:11 -
@@ -34,10 +34,12 @@
 <td>alphaev68-dec-osf5.1a</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02456.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00586.html">4.4.6</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00074.html">4.4.6</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-12/msg01338.html">4.4.5</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-07/msg01437.html">4.4.4</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02455.html">4.4.1</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2009-07/msg03093.html">4.4.1</a>
 </td>
 </tr>
@@ -134,6 +136,7 @@
 <td>hppa2.0w-hp-hpux11.00</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00105.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg00081.html">4.4.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2009-11/msg01652.html">4.4.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2009-04/msg03163.html">4.4.0</a>
@@ -144,6 +147,7 @@
 <td>hppa2.0w-hp-hpux11.11</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02261.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00201.html">4.4.6</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg02383.html">4.4.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-01/msg02240.html">4.4.3</a>,
@@ -158,6 +162,7 @@
 <td>hppa64-hp-hpux11.00</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02861.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg02231.html">4.4.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2009-05/msg00296.html">4.4.0</a>
 </td>
@@ -167,6 +172,7 @@
 <td>hppa64-hp-hpux11.11</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02159.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00400.html">4.4.6</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg02914.html">4.4.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-01/msg02364.html">4.4.3</a>,
@@ -181,6 +187,7 @@
 <td>i386-pc-solaris2.8</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02185.html">4.4.7</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-07/msg00900.html">4.4.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2009-05/msg00105.html">4.4.0</a>
 </td>


Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables

2012-10-02 Thread Ian Lance Taylor
On Tue, Oct 2, 2012 at 10:48 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Oct 2, 2012 at 7:44 PM, Gabriel Dos Reis
 g...@integrable-solutions.net wrote:

  On a related issue, it looks to me that the compiler itself should be
  compiled with -funwind-tables, otherwise there are no backtraces
  generated, even if libbacktrace is linked in and operational. Again,
  x86_64-linux-gnu host defaults to this flag, but other hosts are left
  behind.

 Compiling with C++ should always give us -funwind-tables.

 It doesn't give that, because the compiler is compiled with
 -fno-exceptions -fno-rtti.

 I believe in the long term we would want to drop either of those.

 For the short term, I am bootstrapping attached patch, that adds
 -funwind-tables to other noexcept flags.

I think you should use -fasynchronous-unwind-tables here.  That way we
can get a backtrace if the compiler gets a segmentation violation.

I'll approve this patch with that change.  But you might want to check
whether you can see any change in bootstrap time or compiler size
(sorry).

Thanks.

Ian


Re: abs(long long)

2012-10-02 Thread Marc Glisse

On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


Whining on this list about libstdc++ internal macros and your dislike
of them is not going to produce anything today or tomorrow.


Other compilers using libstdc++ was just an extra argument. Even if g++ 
was the only compiler on earth, I would still consider a compile-time test 
superior to a configure test. The macro __SIZEOF_INT128__ was invented 
precisely for this purpose. Yes, that's just more whining ;-)



On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:


On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote:


Or do you mean:
always call __builtin_llabs (whether we have an llabs or not), and let the
compiler replace it with either (x0)?-x:x or a library call (I assume it
never does that unless it has seen a corresponding declaration)?


See what we did in c/cmath and c_global/cmath.


Note that llabs is quite different from asin. __builtin_llabs generates an 
ABS_EXPR, which will later be expanded either to a special instruction or 
to a condition. It never generates a call to llabs (I am not sure exactly 
if Paolo's instructions to use llabs meant he wanted an actual library 
call). __builtin_asin on the other hand is never expanded inline (except 
maybe for special constant input like 0) and expands to a call to the 
library function asin.
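
To make the two points above concrete, here is a small C sketch: a compile-time test on __SIZEOF_INT128__ instead of a configure check, and an absolute value written with __builtin_llabs, which (per the description above) GCC expands inline rather than calling llabs. The names widest_int, abs_ll and abs_widest are hypothetical and are not the libstdc++ overloads; the sketch assumes GCC or a compatible compiler such as Clang.

```c
#include <assert.h>
#include <limits.h>

/* Compile-time feature test: the compiler predefines __SIZEOF_INT128__
   on targets that support __int128, so no configure check is needed.  */
#ifdef __SIZEOF_INT128__
typedef __int128 widest_int;
#else
typedef long long widest_int;
#endif

/* Absolute value through the builtin.  */
static long long abs_ll(long long x)
{
  assert(x != LLONG_MIN);      /* the result would overflow */
  return __builtin_llabs(x);
}

/* There is no builtin for __int128, so use the conditional form.  */
static widest_int abs_widest(widest_int x)
{
  return x >= 0 ? x : -x;
}
```

The typedef keeps callers identical on targets with and without __int128; only the width of widest_int changes.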


Would the attached patch be better, assuming it passes testing? For lldiv, 
there is no builtin (for good reason).


* include/c_std/cstdlib (abs(long long)): Define with
__builtin_llabs when we have long long.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.


--
Marc Glisse

Index: cstdlib
===
--- cstdlib (revision 191941)
+++ cstdlib (working copy)
@@ -128,21 +128,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtod;
   using ::strtol;
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
-  abs(long __i) { return labs(__i); }
+  abs(long __i) { return __builtin_labs(__i); }
+
+#ifdef _GLIBCXX_USE_LONG_LONG
+  inline long long
+  abs(long long __x) { return __builtin_llabs (__x); }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
 
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
@@ -161,29 +171,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
-  inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
-
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
 (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
 (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,21 +205,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;


[PR54177] Deal with var_lowpart failure in function parameters

2012-10-02 Thread Alexandre Oliva
Uros has already taken care of the main patch for the problem, but I
feel it's appropriate to protect vt_add_function_parameter should
var_lowpart actually return NULL.

I'm checking this in as obvious.  Regstrapped on x86_64-linux-gnu and
i686-linux-gnu.


Deal with var_lowpart failure in function parameters.

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

	* var-tracking.c (vt_add_function_parameter): Bail if
	var_lowpart fails.
---

 gcc/var-tracking.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)


diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 9f5bc12..bbd2f4b 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9428,6 +9428,7 @@ vt_add_function_parameter (tree parm)
GET_CODE (incoming) != PARALLEL)
 {
   cselib_val *val;
+  rtx lowpart;
 
   /* ??? We shouldn't ever hit this, but it may happen because
 	 arguments passed by invisible reference aren't dealt with
@@ -9436,7 +9437,11 @@ vt_add_function_parameter (tree parm)
   if (offset)
 	return;
 
-  val = cselib_lookup_from_insn (var_lowpart (mode, incoming), mode, true,
+  lowpart = var_lowpart (mode, incoming);
+  if (!lowpart)
+	return;
+
+  val = cselib_lookup_from_insn (lowpart, mode, true,
  VOIDmode, get_insns ());
 
   /* ??? Float-typed values in memory are not handled by

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


[wwwdocs] Buildstat update for 4.5

2012-10-02 Thread Tom G. Christensen
Latest results for 4.5.x

-tgc

Testresults for 4.5.4:
  alphaev68-dec-osf5.1a
  hppa2.0w-hp-hpux11.00
  hppa64-hp-hpux11.00

Testresults for 4.5.3:
  alphaev68-dec-osf5.1a
  i386-pc-solaris2.8

Testresults for 4.5.2:
  alphaev68-dec-osf5.1a

Testresults for 4.5.1:
  alphaev68-dec-osf5.1a

Testresults for 4.5.0:
  alphaev68-dec-osf5.1a

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.5/buildstat.html,v
retrieving revision 1.14
diff -u -r1.14 buildstat.html
--- buildstat.html  4 Apr 2012 16:11:35 -   1.14
+++ buildstat.html  2 Oct 2012 18:27:49 -
@@ -56,6 +56,18 @@
 </tr>
 
 <tr>
+<td>alphaev68-dec-osf5.1a</td>
+<td>&nbsp;</td>
+<td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02460.html">4.5.4</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02459.html">4.5.3</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02458.html">4.5.2</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02457.html">4.5.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02452.html">4.5.0</a>
+</td>
+</tr>
+
+<tr>
 <td>armv7l-unknown-linux-gnueabi</td>
 <td>&nbsp;</td>
 <td>Test results:
@@ -75,6 +87,7 @@
 <td>hppa2.0w-hp-hpux11.00</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00598.html">4.5.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg01358.html">4.5.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg01008.html">4.5.1</a>
 </td>
@@ -93,6 +106,7 @@
 <td>hppa64-hp-hpux11.00</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00833.html">4.5.4</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg01736.html">4.5.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg01432.html">4.5.1</a>
 </td>
@@ -129,6 +143,7 @@
 <td>i386-pc-solaris2.8</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02309.html">4.5.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01359.html">4.5.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01215.html">4.5.3</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00795.html">4.5.3</a>,


[PR53135] Use block4 form for large debug expressions

2012-10-02 Thread Alexandre Oliva
This patch fixes a crash in dwarf2out because of a too-large debug
expression.  Jakub approved it for trunk and 4.7 branches in bugzilla.
I'm installing it in the trunk momentarily, and later today on 4.7 after
I give it a spin there.  Regstrapped on x86_64-linux-gnu and
i686-linux-gnu.

I'm keeping the testcase open because we still have an underlying
problem and other improvements to make.
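
As background on why a large expression needs a different form: DWARF block forms carry a length prefix of 1, 2 or 4 bytes, so the form has to be chosen from the size of the encoded expression. The sketch below illustrates that selection under stated assumptions. The DW_FORM_* codes are the values from the DWARF specification; length_prefix_size and block_form_for_size are hypothetical helpers and do not reproduce the dwarf2out.c implementation.

```c
#include <assert.h>

/* DWARF form codes for fixed-width length-prefixed blocks.  */
enum { DW_FORM_block2 = 0x03, DW_FORM_block4 = 0x04, DW_FORM_block1 = 0x0a };

/* Hypothetical helper: bytes needed for the length prefix of a block
   of SIZE bytes.  Sizes above 32 bits are not modelled.  */
static int length_prefix_size(unsigned long size)
{
  assert(size <= 0xffffffffUL);
  if (size <= 0xffUL)
    return 1;
  if (size <= 0xffffUL)
    return 2;
  return 4;
}

/* Pick the block form whose length prefix can hold SIZE.  */
static int block_form_for_size(unsigned long size)
{
  switch (length_prefix_size(size))
    {
    case 1:
      return DW_FORM_block1;
    case 2:
      return DW_FORM_block2;
    default:
      return DW_FORM_block4;
    }
}
```

An expression of 65536 bytes or more no longer fits a 2-byte prefix, which corresponds to the case the patch adds.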


Use block4 form for large debug expressions.

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

	PR debug/53135
	* dwarf2out.c (value_format): Use block4 for dw_val_class_loc
	when needed.
---

 gcc/dwarf2out.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)


diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index c776f68..25f57c0 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -7491,6 +7491,8 @@ value_format (dw_attr_ref a)
 	  return DW_FORM_block1;
 	case 2:
 	  return DW_FORM_block2;
+	case 4:
+	  return DW_FORM_block4;
 	default:
 	  gcc_unreachable ();
 	}


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist  Red Hat Brazil Compiler Engineer


Re: [PR54551] global dead debug pseudo tracking in fast-dce

2012-10-02 Thread Alexandre Oliva
On Sep 25, 2012, Jakub Jelinek ja...@redhat.com wrote:

 On Sun, Sep 23, 2012 at 07:59:37AM -0300, Alexandre Oliva wrote:
 This patch introduces a global mode of dead_debug tracking for use in
 fast DCE.  If a debug use reaches the top of a basic block before
 finding its death point, the pending and subsequent uses of the pseudo
 in debug insns will all be substituted with the same debug temp, and
 death points will get the value bound to the debug temp.

 Thanks for working on this.  The patch generally looks good, just some minor
 nits below.

Here's the revised version with all the nits fixed.

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  I'm checking it
in momentarily.
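
As a rough illustration of the global mode described above, here is a toy
model in which every basic block shares one debug temp per dead pseudo.
It is only a sketch under simplifying assumptions: the names
toy_global_debug, toy_global_init and toy_global_temp are hypothetical,
and a fixed-size array stands in for the hash table that the patch uses.

```c
#include <string.h>

#define TOY_MAX_PSEUDOS 64	/* Hypothetical capacity of the toy table.  */

/* Toy stand-in for the global tracking state: one debug temp per dead
   pseudo, shared by every basic block.  */
struct toy_global_debug
{
  int temp_for_reg[TOY_MAX_PSEUDOS];	/* 0 means no temp bound yet.  */
  int next_temp;			/* Next unused temp id, from 1.  */
};

static void
toy_global_init (struct toy_global_debug *g)
{
  memset (g->temp_for_reg, 0, sizeof g->temp_for_reg);
  g->next_temp = 1;
}

/* Return the temp that replaces REGNO in debug insns, creating it the
   first time a use of REGNO reaches the top of a block.  Returns 0 if
   REGNO is outside the toy table.  */
static int
toy_global_temp (struct toy_global_debug *g, int regno)
{
  if (regno < 0 || regno >= TOY_MAX_PSEUDOS)
    return 0;
  if (g->temp_for_reg[regno] == 0)
    g->temp_for_reg[regno] = g->next_temp++;
  return g->temp_for_reg[regno];
}
```

The point of the lookup-or-create step is that a later block asking about
the same pseudo gets the same temp back, so all pending and subsequent
debug uses end up referring to one value.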

Track dead pseudos used in debug insns globally in fast DCE.

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

	PR debug/54551
	* Makefile.in (VALTRACK_H): Add hash-table.h.
	* valtrack.h: Include hash-table.h.
	(struct dead_debug_global_entry): New.
	(struct dead_debug_hash_descr): New.
	(struct dead_debug_global): New.
	(struct dead_debug): Rename to...
	(struct dead_debug_local): ... this.  Adjust all uses.
	(dead_debug_global_init, dead_debug_global_finish): New.
	(dead_debug_init): Rename to...
	(dead_debug_local_init): ... this.  Adjust all callers.
	(dead_debug_finish): Rename to...
	(dead_debug_local_finish): ... this.  Adjust all callers.
	* valtrack.c (dead_debug_global_init): New.
	(dead_debug_init): Rename to...
	(dead_debug_local_init): ... this.  Take global parameter.
	Save it and initialize used bitmap from it.
	(dead_debug_global_find, dead_debug_global_insert): New.
	(dead_debug_global_replace_temp): New.
	(dead_debug_promote_uses): New.
	(dead_debug_finish): Rename to...
	(dead_debug_local_finish): ... this.  Promote remaining uses.
	(dead_debug_global_finish): New.
	(dead_debug_add): Try to replace global temps first.
	(dead_debug_insert_temp): Support global replacements.
	* dce.c (word_dce_process_block, dce_process_block): Add
	global_debug parameter.  Pass it on.
	(fast_dce): Initialize, pass on and finalize global_debug.
	* df-problems.c (df_set_unused_notes_for_mw): Adjusted.
	(df_create_unused_notes, df_note_bb_compute): Likewise.
	(df_note_compute): Justify local-only dead debug analysis.

for  gcc/testsuite/ChangeLog

	PR debug/54551
	* gcc.dg/guality/pr54551.c: New.
---

 gcc/Makefile.in|3 
 gcc/dce.c  |   35 +++--
 gcc/df-problems.c  |   15 +-
 gcc/testsuite/gcc.dg/guality/pr54551.c |   28 
 gcc/valtrack.c |  220 +---
 gcc/valtrack.h |   84 +++-
 6 files changed, 340 insertions(+), 45 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/guality/pr54551.c


diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 94ac3b5..77ba4df 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -888,7 +888,8 @@ CGRAPH_H = cgraph.h $(VEC_H) $(TREE_H) $(BASIC_BLOCK_H) $(FUNCTION_H) \
 	cif-code.def ipa-ref.h ipa-ref-inline.h $(LINKER_PLUGIN_API_H)
 DF_H = df.h $(BITMAP_H) $(REGSET_H) sbitmap.h $(BASIC_BLOCK_H) \
 	alloc-pool.h $(TIMEVAR_H)
-VALTRACK_H = valtrack.h $(BITMAP_H) $(DF_H) $(RTL_H) $(BASIC_BLOCK_H)
+VALTRACK_H = valtrack.h $(BITMAP_H) $(DF_H) $(RTL_H) $(BASIC_BLOCK_H) \
+	$(HASH_TABLE_H)
 RESOURCE_H = resource.h hard-reg-set.h $(DF_H)
 DDG_H = ddg.h sbitmap.h $(DF_H)
 GCC_H = gcc.h version.h $(DIAGNOSTIC_CORE_H)
diff --git a/gcc/dce.c b/gcc/dce.c
index c951865..11f8edb 100644
--- a/gcc/dce.c
+++ b/gcc/dce.c
@@ -806,15 +806,17 @@ struct rtl_opt_pass pass_ud_rtl_dce =
 /* Process basic block BB.  Return true if the live_in set has
changed. REDO_OUT is true if the info at the bottom of the block
needs to be recalculated before starting.  AU is the proper set of
-   artificial uses. */
+   artificial uses.  Track global substitution of uses of dead pseudos
+   in debug insns using GLOBAL_DEBUG.  */
 
 static bool
-word_dce_process_block (basic_block bb, bool redo_out)
+word_dce_process_block (basic_block bb, bool redo_out,
+			struct dead_debug_global *global_debug)
 {
   bitmap local_live = BITMAP_ALLOC (dce_tmp_bitmap_obstack);
   rtx insn;
   bool block_changed;
-  struct dead_debug debug;
+  struct dead_debug_local debug;
 
   if (redo_out)
 {
@@ -836,7 +838,7 @@ word_dce_process_block (basic_block bb, bool redo_out)
 }
 
   bitmap_copy (local_live, DF_WORD_LR_OUT (bb));
-  dead_debug_init (&debug, NULL);
+  dead_debug_local_init (&debug, NULL, global_debug);
 
   FOR_BB_INSNS_REVERSE (bb, insn)
 if (DEBUG_INSN_P (insn))
@@ -890,7 +892,7 @@ word_dce_process_block (basic_block bb, bool redo_out)
   if (block_changed)
 bitmap_copy (DF_WORD_LR_IN (bb), local_live);
 
-  dead_debug_finish (&debug, NULL);
+  dead_debug_local_finish (&debug, NULL);
   BITMAP_FREE (local_live);
   return block_changed;
 }
@@ -899,16 +901,18 @@ word_dce_process_block (basic_block bb, bool redo_out)
 /* Process basic block BB.  Return 

Re: Convert more non-GTY htab_t to hash_table.

2012-10-02 Thread Lawrence Crowl
On 10/2/12, Richard Guenther rguent...@suse.de wrote:
 On Mon, 1 Oct 2012, Lawrence Crowl wrote:
  Change more non-GTY hash tables to use the new type-safe
  template hash table.  Constify member function parameters that
  can be const.  Correct a couple of expressions in formerly
  uninstantiated templates.
 
  The new code is 0.362% faster in bootstrap, with a 99.5%
  confidence of being faster.
 
  Tested on x86-64.
 
  Okay for trunk?

 You are changing a hashtable used by fold checking, did you test
 with fold checking enabled?

I didn't know I had to do anything beyond the normal make check.
What do I do?

 +/* Data structures used to maintain mapping between basic blocks and
 +   copies.  */
 +static hash_table <bb_copy_hasher> bb_original;
 +static hash_table <bb_copy_hasher> bb_copy;

 note that because hash_table has a constructor we now get global
 CTORs for all statics :( (and mx-protected local inits ...)

The overhead for the global constructors isn't significant.
Only the function-local statics have mx-protection, and that can
be eliminated by making them global static.

 Can you please try to remove the constructor from hash_table to
 avoid this overhead?  (as a followup - that is, don't initialize
 htab)

The initialization avoids potential errors in calling dispose.
I can do it, but I don't think the overhead (after moving the
function-local statics to global) will matter, and so I prefer to
keep the safety.  So is the move of the statics sufficient or do
you still want to remove constructors?

 The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll
 leave the rest to respective maintainers of the pieces of the
 compiler.

 Thanks,
 Richard.


 Index: gcc/java/ChangeLog

 2012-10-01  Lawrence Crowl  cr...@google.com

  * Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
  (JCFDUMP_OBJS): Add dependence on hash-table.o.
  (jcf-io.o): Add dependence on hash-table.h.
  * jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.

 Index: gcc/c/ChangeLog

 2012-10-01  Lawrence Crowl  cr...@google.com

  * Make-lang.in (c-decl.o): Add dependence on hash-table.h.
  * c-decl.c (detect_field_duplicates_hash): Change to new type-safe
  hash table.

 Index: gcc/objc/ChangeLog

 2012-10-01  Lawrence Crowl  cr...@google.com

  * Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
  (objc-act.o): Add dependence on hash-table.h.
  * objc-act.c (objc_detect_field_duplicates): Change to new type-safe
  hash table.

 Index: gcc/ChangeLog

 2012-10-01  Lawrence Crowl  cr...@google.com

  * Makefile.in (fold-const.o): Add dependence on hash-table.h.
  (dse.o): Likewise.
  (cfg.o): Likewise.
  * fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
  * (print_fold_checksum): Likewise.
  * cfg.c (var bb_original): Likewise.
  * (var bb_copy): Likewise.
  * (var loop_copy): Likewise.
  * hash-table.h (template hash_table): Constify parameters for find...
  and remove_elt... member functions.
 (hash_table::empty) Correct size expression.
 (hash_table::clear_slot) Correct deleted entry assignment.
  * dse.c (var rtx_group_table): Change to new type-safe hash table.

 Index: gcc/cp/ChangeLog

 2012-10-01  Lawrence Crowl  cr...@google.com

  * Make-lang.in (class.o): Add dependence on hash-table.h.
  (tree.o): Likewise.
  (semantics.o): Likewise.
  * class.c (fixed_type_or_null): Change to new type-safe hash table.
  * tree.c (verify_stmt_tree): Likewise.
  (verify_stmt_tree_r): Likewise.
  * semantics.c (struct nrv_data): Likewise.


 Index: gcc/java/Make-lang.in
 ===
 --- gcc/java/Make-lang.in(revision 191941)
 +++ gcc/java/Make-lang.in(working copy)
 @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav
java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o
 java/mangle.o \
java/mangle_name.o java/builtins.o java/resource.o \
java/jcf-depend.o \
 -  java/jcf-path.o java/boehm.o java/java-gimplify.o
 +  java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o

  JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o
 java/jcf-path.o \
 -java/win32-host.o java/zextract.o ggc-none.o
 +java/win32-host.o java/zextract.o ggc-none.o hash-table.o

  JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o

 @@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify
  # jcf-io.o needs $(ZLIBINC) added to cflags.
  CFLAGS-java/jcf-io.o += $(ZLIBINC)
  java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
 -  $(JAVA_TREE_H) java/zipfile.h
 +  $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H)

  # jcf-path.o needs a -D.
  CFLAGS-java/jcf-path.o += \
 Index: gcc/java/jcf-io.c
 ===
 --- gcc/java/jcf-io.c(revision 191941)
 +++ 

[wwwdocs] Buildstat update for 4.6

2012-10-02 Thread Tom G. Christensen
Latest results for 4.6.x

-tgc

Testresults for 4.6.3
  alphaev68-dec-osf5.1a

Testresults for 4.6.2
  alphaev68-dec-osf5.1a

Testresults for 4.6.1
  alphaev68-dec-osf5.1a (2)

Testresults for 4.6.0
  alphaev68-dec-osf5.1a

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/buildstat.html,v
retrieving revision 1.12
diff -u -r1.12 buildstat.html
--- buildstat.html  7 Jun 2012 19:48:58 -   1.12
+++ buildstat.html  2 Oct 2012 19:07:07 -
@@ -35,7 +35,12 @@
 <td>alphaev68-dec-osf5.1a</td>
 <td>&nbsp;</td>
 <td>Test results:
-<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00587.html">4.6.1</a>
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02465.html">4.6.3</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02464.html">4.6.2</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02472.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02463.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00587.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02462.html">4.6.0</a>
 </td>
 </tr>
 


[wwwdocs] Buildstat update for 4.7

2012-10-02 Thread Tom G. Christensen
Latest results for 4.7.x

-tgc

Testresults for 4.7.2
  alphaev68-dec-osf5.1a (2)
  hppa2.0w-hp-hpux11.00
  hppa2.0w-hp-hpux11.11
  hppa64-hp-hpux11.11
  i386-apple-darwin10.8.0
  i686-pc-linux-gnu
  powerpc-apple-darwin8.11.0
  x86_64-apple-darwin10.8.0
  x86_64-apple-darwin12.2.0

Testresults for 4.7.1
  alphaev68-dec-osf5.1a (2)

Testresults for 4.7.0
  alphaev68-dec-osf5.1a

Index: buildstat.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/buildstat.html,v
retrieving revision 1.6
diff -u -r1.6 buildstat.html
--- buildstat.html  16 Jul 2012 00:06:41 -  1.6
+++ buildstat.html  2 Oct 2012 19:24:44 -
@@ -39,9 +39,30 @@
 </tr>
 
 <tr>
+<td>alphaev68-dec-osf5.1a</td>
+<td>&nbsp;</td>
+<td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02474.html">4.7.2</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02469.html">4.7.2</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02475.html">4.7.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02468.html">4.7.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02466.html">4.7.0</a>
+</td>
+</tr>
+
+<tr>
+<td>hppa2.0w-hp-hpux11.00</td>
+<td>&nbsp;</td>
+<td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02949.html">4.7.2</a>,
+</td>
+</tr>
+
+<tr>
 <td>hppa2.0w-hp-hpux11.11</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02311.html">4.7.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg00080.html">4.7.0</a>
 </td>
 </tr>
@@ -50,6 +71,7 @@
 <td>hppa64-hp-hpux11.11</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02408.html">4.7.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg00408.html">4.7.0</a>
 </td>
 </tr>
@@ -58,6 +80,7 @@
 <td>i386-apple-darwin10.8.0</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02291.html">4.7.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02742.html">4.7.0</a>
 </td>
 </tr>
@@ -102,6 +125,7 @@
 <td>i686-pc-linux-gnu</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00199.html">4.7.1</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01316.html">4.7.1</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01315.html">4.7.1</a>
 </td>
@@ -111,6 +135,7 @@
 <td>powerpc-apple-darwin8.11.0</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02736.html">4.7.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01566.html">4.7.1</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02890.html">4.7.0</a>
 </td>
@@ -145,6 +170,7 @@
 <td>x86_64-apple-darwin10.8.0</td>
 <td>&nbsp;</td>
 <td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02247.html">4.7.2</a>,
 <a href="http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02708.html">4.7.0</a>
 </td>
 </tr>
@@ -159,6 +185,14 @@
 </tr>
 
 <tr>
+<td>x86_64-apple-darwin12.2.0</td>
+<td>&nbsp;</td>
+<td>Test results:
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02248.html">4.7.2</a>
+</td>
+</tr>
+
+<tr>
 <td>x86_64-unknown-linux-gnu</td>
 <td>&nbsp;</td>
 <td>Test results:


Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant

2012-10-02 Thread Richard Sandiford
Andrew Pinski andrew.pin...@caviumnetworks.com writes:
 On Thu, Sep 27, 2012 at 11:13 AM, Uros Bizjak ubiz...@gmail.com wrote:
 2012-09-27  Uros Bizjak  ubiz...@gmail.com

 PR rtl-optimization/54457
 * simplify-rtx.c (simplify_subreg):
 Simplify (subreg:M (op:N ((x:N) (y:N)), 0)
  to (op:M (subreg:M (x:N) 0) (subreg:M (y:N) 0)), where
 the outer subreg is effectively a truncation to the original mode M.


 When I was doing something similar on our internal toolchain at
 Cavium, I found that doing this caused a regression on MIPS64 n32 in
 gcc.c-torture/execute/20040709-1.c, where:


 (insn 15 14 16 2 (set (reg/v:DI 200 [ y ])
 (reg:DI 2 $2)) t.c:16 301 {*movdi_64bit}
  (expr_list:REG_DEAD (reg:DI 2 $2)
 (nil)))

 (insn 16 15 17 2 (set (reg:DI 210)
 (zero_extract:DI (reg/v:DI 200 [ y ])
 (const_int 29 [0x1d])
 (const_int 0 [0]))) t.c:16 249 {extzvdi}
  (expr_list:REG_DEAD (reg/v:DI 200 [ y ])
 (nil)))

 (insn 17 16 23 2 (set (reg:SI 211)
 (truncate:SI (reg:DI 210))) t.c:16 175 {truncdisi2}
  (expr_list:REG_DEAD (reg:DI 210)
 (nil)))

 Gets converted to:
 (insn 23 17 26 2 (set (reg/i:SI 2 $2)
 (and:SI (reg:SI 2 $2 [+4 ])
 (const_int 536870911 [0x1fff]))) t.c:18 156 {*andsi3}
  (nil))

 This is considered an ext instruction.

 The Octeon simulator makes undefined arguments to 32-bit word
 operations come out as 0xDEADBEEF, and that exposed the regression.
 I fixed it by producing a TRUNCATE instead of the subreg.

 I did the simplification on ior/and rather than plus/minus/mult, so
 the issue only shows up when the result is expanded to and/ior.

Hmm, hadn't thought of that.  I think some of the existing subreg
optimisations suffer the same problem.  I.e. we can't assume that
subreg truncations of nested operands are OK just because the outer
subreg is OK.
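
The hazard can be modelled with plain integers.  This is only a sketch,
assuming the MIPS64 convention that a 32-bit value is kept in a 64-bit
register in sign-extended form and that a 32-bit operation expects its
inputs in that form; the helper names are hypothetical.

```c
#include <stdint.h>

/* Model of an explicit TRUNCATE from DImode to SImode: keep the low
   32 bits and sign-extend them back to 64 bits.  */
static uint64_t
truncate_di_to_si (uint64_t reg)
{
  uint64_t low = reg & UINT64_C (0xffffffff);
  return (low ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}

/* Nonzero if REG already holds a 32-bit value in sign-extended form,
   that is, bits 63..32 are copies of bit 31.  */
static int
si_is_canonical (uint64_t reg)
{
  return truncate_di_to_si (reg) == reg;
}
```

In the dump quoted above, the combined and:SI reads $2 while it still
holds the full DImode value, so nothing guarantees that its input passes
a check like si_is_canonical.  An explicit TRUNCATE restores that
property before the 32-bit operation.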

I've got a patch I'm testing.

BTW, I haven't forgotten about your other ext patch.  Was hoping
to see whether we could finally take the opportunity to parameterise
the ext* patterns by mode, but got distracted with other patches.
Maybe I'll just have to admit I won't get time to try it for 4.8...

Richard


PATCH: PR target/54741: Check SSE and YMM state support for -march=native

2012-10-02 Thread H.J. Lu
Hi,

This patch checks SSE and YMM state support for -march=native.  Tested
on Linux/x86-64.  OK to install?

Thanks.


H.J.
---
2012-10-02  H.J. Lu  hongjiu...@intel.com

PR target/54741
*  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
(XSTATE_FP): Likewise.
(XSTATE_SSE): Likewise.
(XSTATE_YMM): Likewise.
(host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
SSE and YMM states aren't supported.

diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index bda4e02..4dffc51 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
+  unsigned int has_osxsave = 0;
 
   bool arch;
 
@@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
   has_sse4_1 = ecx & bit_SSE4_1;
   has_sse4_2 = ecx & bit_SSE4_2;
   has_avx = ecx & bit_AVX;
+  has_osxsave = ecx & bit_OSXSAVE;
   has_cmpxchg16b = ecx & bit_CMPXCHG16B;
   has_movbe = ecx & bit_MOVBE;
   has_popcnt = ecx & bit_POPCNT;
@@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
   has_adx = ebx & bit_ADX;
 }
 
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
+#define XCR_XFEATURE_ENABLED_MASK  0x0
+#define XSTATE_FP  0x1
+#define XSTATE_SSE 0x2
+#define XSTATE_YMM 0x4
+  if (has_osxsave)
+asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
+: "=a" (eax), "=d" (edx)
+: "c" (XCR_XFEATURE_ENABLED_MASK));
+
+  /* Check if SSE and YMM states are supported.  */
+  if ((eax & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM))
+{
+  has_avx = 0;
+  has_avx2 = 0;
+  has_fma = 0;
+  has_fma4 = 0;
+  has_xop = 0;
+}
+
   /* Check cpuid level of extended features.  */
   __cpuid (0x80000000, ext_level, ebx, ecx, edx);
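
The rule stated in the ChangeLog entry (disable AVX and the features
built on it when the OS has not enabled SSE and YMM state) can be written
as a pure function.  This is a sketch only: avx_usable is a hypothetical
helper, its inputs are assumed to be ECX from CPUID leaf 1 and the low
half of XCR0 as returned by xgetbv, and it also treats a missing OSXSAVE
bit as "not usable", which is an assumption that goes beyond the hunk.

```c
/* CPUID leaf 1 ECX bits and XCR0 state bits.  */
#define BIT_OSXSAVE (1u << 27)
#define BIT_AVX     (1u << 28)
#define XSTATE_SSE  0x2u
#define XSTATE_YMM  0x4u

/* Hypothetical helper: nonzero if AVX code may be generated for the
   host, given CPUID.1:ECX and the low 32 bits of XCR0.  */
static int
avx_usable (unsigned int cpuid1_ecx, unsigned int xcr0_low)
{
  unsigned int need = XSTATE_SSE | XSTATE_YMM;

  /* The CPU itself must implement AVX.  */
  if (!(cpuid1_ecx & BIT_AVX))
    return 0;
  /* Without OSXSAVE, xgetbv is not available and XCR0 is unknown.  */
  if (!(cpuid1_ecx & BIT_OSXSAVE))
    return 0;
  /* Both the SSE and the YMM state must be enabled by the OS.  */
  return (xcr0_low & need) == need;
}
```

For example, an XCR0 value of 0x3 (x87 and SSE state only) yields 0 here
even when the CPU reports AVX.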
 


[MIPS] Adjust baddu patterns for recent simplify-rtx.c change

2012-10-02 Thread Richard Sandiford
As promised, here's the patch to adjust the MIPS BADDU patterns for
the new (subreg (plus)) simplification.  Tested on mipsisa32-elf
and mipsisa64-elf.  Applied.

Richard


gcc/
* config/mips/mips.md (*baddu_si_eb, *baddu_si_el): Merge into...
(*baddu_si): ...this new pattern.

Index: gcc/config/mips/mips.md
===
--- gcc/config/mips/mips.md 2012-09-29 16:57:31.0 +0100
+++ gcc/config/mips/mips.md 2012-10-01 21:33:39.358480799 +0100
@@ -1293,23 +1293,12 @@ (define_insn_and_split *addsi3_extended
 
 ;; Combiner patterns for unsigned byte-add.
 
-(define_insn "*baddu_si_eb"
+(define_insn "*baddu_si"
   [(set (match_operand:SI 0 "register_operand" "=d")
 (zero_extend:SI
-(subreg:QI
- (plus:SI (match_operand:SI 1 "register_operand" "d")
-  (match_operand:SI 2 "register_operand" "d")) 3)))]
-  "ISA_HAS_BADDU && BYTES_BIG_ENDIAN"
-  "baddu\\t%0,%1,%2"
-  [(set_attr "alu_type" "add")])
-
-(define_insn "*baddu_si_el"
-  [(set (match_operand:SI 0 "register_operand" "=d")
-(zero_extend:SI
-(subreg:QI
- (plus:SI (match_operand:SI 1 "register_operand" "d")
-  (match_operand:SI 2 "register_operand" "d")) 0)))]
-  "ISA_HAS_BADDU && !BYTES_BIG_ENDIAN"
+(plus:QI (match_operand:QI 1 "register_operand" "d")
+ (match_operand:QI 2 "register_operand" "d"))))]
+  "ISA_HAS_BADDU"
   "baddu\\t%0,%1,%2"
   [(set_attr "alu_type" "add")])
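
The old and new pattern shapes describe the same value, which can be
checked with ordinary integer arithmetic.  This is a sketch with
hypothetical function names; it models only the arithmetic, not the RTL
matching.

```c
#include <stdint.h>

/* Old pattern shape: add in 32 bits, take the low byte, zero-extend.  */
static uint32_t
baddu_via_subreg (uint32_t a, uint32_t b)
{
  return (a + b) & 0xffu;
}

/* New pattern shape: take the low byte of each operand, add in 8 bits,
   zero-extend.  */
static uint32_t
baddu_via_qi_plus (uint32_t a, uint32_t b)
{
  uint8_t qa = (uint8_t) a;
  uint8_t qb = (uint8_t) b;
  return (uint8_t) (qa + qb);
}
```

Because the low byte of a sum depends only on the low bytes of the
operands, both functions agree for all inputs, and the QImode form needs
no byte offset that differs between big-endian and little-endian targets.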
 


PATCH: PR target/54785: Document -mprefer-avx128

2012-10-02 Thread H.J. Lu
Hi,

This patch documents -mprefer-avx128.  OK for trunk and 4.7?

Thanks.


H.J.
---
2012-10-02  H.J. Lu  hongjiu...@intel.com

PR target/54785
* doc/invoke.texi: Document -mprefer-avx128.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7578dda..0e7e441 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -630,7 +630,7 @@ Objective-C and Objective-C++ Dialects}.
 -mincoming-stack-boundary=@var{num} @gol
 -mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
 -mrecip -mrecip=@var{opt} @gol
--mvzeroupper @gol
+-mvzeroupper -mprefer-avx128 @gol
 -mmmx  -msse  -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -mavx2 -maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfma @gol
 -msse4a -m3dnow -mpopcnt -mabm -mbmi -mtbm -mfma4 -mxop -mlzcnt @gol
@@ -13926,6 +13926,11 @@ before a transfer of control flow out of the function 
to minimize
 the AVX to SSE transition penalty as well as remove unnecessary 
@code{zeroupper}
 intrinsics.
 
+@item -mprefer-avx128
+@opindex mprefer-avx128
+This option instructs GCC to use 128-bit AVX instructions instead of
+256-bit AVX instructions in the auto-vectorizer.
+
 @item -mcx16
 @opindex mcx16
 This option enables GCC to generate @code{CMPXCHG16B} instructions.
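
A small loop is enough to try the option out.  This is a sketch: the file
name saxpy.c is made up, and whether the vectorizer picks a given width
for this loop depends on the GCC version and the tuning in effect.

```c
#include <stddef.h>

/* y[i] += a * x[i]: a simple loop that the auto-vectorizer can
   usually handle at -O3.

   Possible comparison (inspect the generated assembly for %ymm versus
   %xmm registers):
     gcc -O3 -mavx -S saxpy.c
     gcc -O3 -mavx -mprefer-avx128 -S saxpy.c  */
void
saxpy (size_t n, float a, const float *restrict x, float *restrict y)
{
  size_t i;

  for (i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

The option changes only which vector width the vectorizer prefers; the
result of the loop is the same either way.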


[Committed] Fix truncate of a memory for vector mode

2012-10-02 Thread Andrew Pinski
Hi,
  When I implemented the simplification of a truncate of a memory, I
did not think about the case where we would have a truncate of a
vector mode.  This fixes this case.
Committed as obvious after a bootstrap and test on x86_64-linux-gnu
and also a build and test for arm-linux-gnueabi.

Thanks,
Andrew Pinski
2012-10-02  Andrew Pinski  apin...@cavium.com

* simplify-rtx.c (simplify_unary_operation_1 case TRUNCATE):
Don't optimize a truncate of a mem if it is a vector mode.
Index: simplify-rtx.c
===
--- simplify-rtx.c  (revision 192004)
+++ simplify-rtx.c  (working copy)
@@ -873,6 +873,7 @@ simplify_unary_operation_1 (enum rtx_cod
   /* A truncate of a memory is just loading the low part of the memory
 if we are not changing the meaning of the address. */
   if (GET_CODE (op) == MEM
+  && !VECTOR_MODE_P (mode)
   && !MEM_VOLATILE_P (op)
   && !mode_dependent_address_p (XEXP (op, 0), MEM_ADDR_SPACE (op)))
return rtl_hooks.gen_lowpart_no_emit (mode, op);


[Patch, Fortran, committed] PR 54778: an ICE on invalid OO code

2012-10-02 Thread Janus Weil
Hi all,

I have just committed as obvious a one-line patch to fix an
ICE-on-invalid OOP problem:

http://gcc.gnu.org/viewcvs?view=revision&revision=192005

Cheers,
Janus

