Re: PR 53889: Add __gthread_recursive_mutex_destroy
On Mon, Oct 1, 2012 at 5:46 PM, Jonathan Wakely jwakely@gmail.com wrote:

    static inline int
    __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
                                       * UNUSED(__mutex))
    { return 0; }

Is that indentation right? (the asterisk is in the same column as the parameter type in a fixed-width font.)

When I see a single parameter that pushes past 80 columns, I normally start a new line after the left parenthesis and indent the next line 4 spaces. E.g.:

    static inline int
    __gthread_recursive_mutex_destroy (
        __gthread_recursive_mutex_t * UNUSED(__mutex))

But I don't think there is any solid standard for this.

        PR other/53889
        * gthr.h (__gthread_recursive_mutex_destroy): Document new required function.
        * gthr-posix.h (__gthread_recursive_mutex_destroy): Define.
        * gthr-single.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/gthr-rtems.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/gthr-vxworks.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/i386/gthr-win32.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/mips/gthr-mipssde.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/pa/gthr-dce.h (__gthread_recursive_mutex_destroy): Likewise.
        * config/s390/gthr-tpf.h (__gthread_recursive_mutex_destroy): Likewise.

The libgcc part of this is OK.

Thanks.

Ian
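For readers who have not seen the single-threaded port, the stub under discussion can be sketched as a self-contained C fragment. This is only an illustration: the `EXAMPLE_UNUSED` macro, the `example_recursive_mutex_t` type and the function name are assumptions made up for this sketch, not the definitions in gthr-single.h. The layout keeps the `(` on the function-name line and splits the single parameter across two lines.

```c
/* Hypothetical stand-ins for illustration only; the real macro and type
   are defined by the gthr headers.  */
#if defined(__GNUC__)
# define EXAMPLE_UNUSED(x) x __attribute__((__unused__))
#else
# define EXAMPLE_UNUSED(x) x
#endif

typedef int example_recursive_mutex_t;

/* Without threads there is nothing to release, so destroy just reports
   success.  */
static inline int
example_recursive_mutex_destroy (example_recursive_mutex_t
                                 * EXAMPLE_UNUSED(mutex))
{
  return 0;
}
```

A caller would simply pass the address of its mutex object and expect 0 back.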
Re: RFC: LRA for x86/x86-64 [0/9]
On Tue, Oct 2, 2012 at 3:14 AM, Vladimir Makarov vmaka...@redhat.com wrote:

My experience shows that these lists usually have 1-2 elements. In this case, though, there are pseudos with a huge number of elements (hundreds). I tried -fweb for this test because it can decrease the number of elements, but GCC (I don't know which pass) scales even worse: after 20 minutes of waiting, when virtual memory reached 20GB, I stopped it.

Ouch :-)

The webizer itself never even runs; the compiler blows up somewhere during the df_analyze call from web_main. The issue here is probably in the DF_UD_CHAIN problem or in the DF_RD problem.

Ciao!
Steven
[PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 12:58 AM, Ian Lance Taylor i...@google.com wrote:

Without -fasynchronous-unwind-tables, FDE is not generated for backtrace_full and backtrace_simple wrappers. Without FDE, unwinding terminates at these functions.

I'm not opposed to -fasynchronous-unwind-tables, but now that you bring it up I'm fairly certain that it would suffice to use -funwind-tables. I've been testing mainly on x86_64, and I forgot that on x86_64 -funwind-tables is the default. Sorry about that. And -fasynchronous-unwind-tables is the default also, so I could be wrong that -funwind-tables is all that is needed.

Yes, you are correct. -funwind-tables works as well.

Attached patch fixes this problem by adding -fasynchronous-unwind-tables, and this way forcing FDEs for all functions. With this change, btest passes OK, failing log and runtime/pprof from libgo testsuite also pass OK.

This is basically fine but libbacktrace may be compiled by the host compiler and that may not be GCC, so please add a configure test to see if the compiler accepts the -fasynchronous-unwind-tables option.

I have simplified the check for -funwind-tables to just look if the library is compiled with gcc. This option is supported by gcc-2.96 (and probably earlier versions too).

2012-10-02  Uros Bizjak  ubiz...@gmail.com

        PR other/54761
        * configure.ac (CFLAGS): Add -funwind-tables when compiling with GCC.
        * configure: Regenerate.

The patch is re-tested on x86_64-linux-gnu and alphaev68-linux-gnu.

OK for mainline?

Uros.

Index: configure
===================================================================
--- configure (revision 191953)
+++ configure (working copy)
@@ -4872,8 +4872,12 @@
+if test x$GCC = xyes; then
+  CFLAGS="$CFLAGS -funwind-tables"
+fi
+
 if test -n "$ac_tool_prefix"; then
   # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args.
 set dummy ${ac_tool_prefix}ranlib; ac_word=$2
@@ -11080,7 +11084,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11083 "configure"
+#line 11087 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11186,7 +11190,7 @@
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11189 "configure"
+#line 11193 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: configure.ac
===================================================================
--- configure.ac (revision 191953)
+++ configure.ac (working copy)
@@ -66,6 +66,10 @@
 AC_PROG_CC
 m4_rename_force([backtrace_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])
+if test x$GCC = xyes; then
+  CFLAGS="$CFLAGS -funwind-tables"
+fi
+
 AC_SUBST(CFLAGS)
 AC_PROG_RANLIB
[PATCH, libitm]: A couple of trivial x86 changes
Hello!

2012-10-02  Uros Bizjak  ubiz...@gmail.com

        * config/x86/target.h (struct gtm_jmpbuf): Merge x86_64 and ia32 declarations some more.
        * config/x86/sjlj.S (_ITM_beginTransaction): Move ret to common code.

Tested on x86_64-pc-linux-gnu, committed to mainline SVN.

Uros.

Index: config/x86/sjlj.S
===================================================================
--- config/x86/sjlj.S (revision 191953)
+++ config/x86/sjlj.S (working copy)
@@ -74,7 +74,6 @@
        call    SYM(GTM_begin_transaction)
        addq    $56, %rsp
        cfi_def_cfa_offset(8)
-       ret
 #else
        leal    4(%esp), %ecx
        movl    4(%esp), %eax
@@ -99,8 +98,8 @@
 #endif
        addl    $28, %esp
        cfi_def_cfa_offset(4)
-       ret
 #endif
+       ret
        cfi_endproc
 
        TYPE(_ITM_beginTransaction)
Index: config/x86/target.h
===================================================================
--- config/x86/target.h (revision 191953)
+++ config/x86/target.h (working copy)
@@ -24,11 +24,11 @@
 namespace GTM HIDDEN {
-#ifdef __x86_64__
 /* ??? This doesn't work for Win64.  */
 typedef struct gtm_jmpbuf
 {
   void *cfa;
+#ifdef __x86_64__
   unsigned long long rbx;
   unsigned long long rbp;
   unsigned long long r12;
@@ -36,18 +36,14 @@
   unsigned long long r14;
   unsigned long long r15;
   unsigned long long rip;
-} gtm_jmpbuf;
 #else
-typedef struct gtm_jmpbuf
-{
-  void *cfa;
   unsigned long ebx;
   unsigned long esi;
   unsigned long edi;
   unsigned long ebp;
   unsigned long eip;
-} gtm_jmpbuf;
 #endif
+} gtm_jmpbuf;
 
 /* x86 doesn't require strict alignment for the basic types.  */
 #define STRICT_ALIGNMENT 0
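The effect of the target.h change can be sketched in plain C: a single typedef whose first member is shared, with only the register members chosen by the target macro. The names `example_jmpbuf` and `example_cfa_offset` are hypothetical and exist only for this sketch; the member list mirrors the diff above.

```c
#include <stddef.h>

/* Hypothetical mirror of the merged layout: one struct definition, with
   the conditional limited to the register members.  */
typedef struct example_jmpbuf
{
  void *cfa;
#ifdef __x86_64__
  unsigned long long rbx;
  unsigned long long rbp;
  unsigned long long r12;
  unsigned long long r13;
  unsigned long long r14;
  unsigned long long r15;
  unsigned long long rip;
#else
  unsigned long ebx;
  unsigned long esi;
  unsigned long edi;
  unsigned long ebp;
  unsigned long eip;
#endif
} example_jmpbuf;

/* The shared member is the first one, so its offset is the same on
   either target.  */
static inline size_t
example_cfa_offset (void)
{
  return offsetof (example_jmpbuf, cfa);
}
```

Keeping the struct header and the `cfa` member outside the conditional means any later change to the common part is made once rather than in two copies.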
[Ada] Avoid unnecessary use of Bignums for ELIMINATED mode
Previously there were cases where the result of an operator was converted to Bignum, only to be immediately converted back to Long_Long_Integer with an overflow check. This patch removes this unnecessary inefficiency.

The following program:

   procedure toplevov
     (a : in out long_long_integer;
      b : long_long_integer)
   is
   begin
      a := b * b;
   end;

now generates the following output when compiled with -gnatG -gnato3:

   procedure toplevov (a : in out long_long_integer; b : long_long_integer) is
   begin
      a := long_long_integer(b) {*} long_long_integer(b);
      return;
   end toplevov;

Previously it generated Bignum operations.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

        * checks.ads, exp_ch4.adb, checks.adb (Minimize_Eliminate_Overflow_Checks): Add Top_Level parameter to avoid unnecessary conversions to Bignum. Minor reformatting.

Index: checks.adb
===================================================================
--- checks.adb (revision 191921)
+++ checks.adb (working copy)
@@ -1113,8 +1113,11 @@
       --  Otherwise, we have a top level arithmetic operator node, and this
       --  is where we commence the special processing for minimize/eliminate.
+      --  This is the case where we tell the machinery not to move into Bignum
+      --  mode at this top level (of course the top level operation will still
+      --  be in Bignum mode if either of its operands are of type Bignum).
-      Minimize_Eliminate_Overflow_Checks (Op, Lo, Hi);
+      Minimize_Eliminate_Overflow_Checks (Op, Lo, Hi, Top_Level => True);
       --  That call may but does not necessarily change the result type of Op.
       --  It is the job of this routine to undo such changes, so that at the
@@ -2333,23 +2336,24 @@
             Error_Msg_N ("\this will result in infinite recursion?", Parent (N));
             Insert_Action (N,
-              Make_Raise_Storage_Error
-                (Sloc (N), Reason => SE_Infinite_Recursion));
+              Make_Raise_Storage_Error (Sloc (N),
+                Reason => SE_Infinite_Recursion));
+         --  Here for normal case of predicate active.
+ else - -- If the predicate is a static predicate and the operand is -- static, the predicate must be evaluated statically. If the -- evaluation fails this is a static constraint error. if Is_OK_Static_Expression (N) then - if Present (Static_Predicate (Typ)) then + if Present (Static_Predicate (Typ)) then if Eval_Static_Predicate_Check (N, Typ) then return; else Error_Msg_NE (static expression fails static predicate check on, - N, Typ); +N, Typ); end if; end if; end if; @@ -6549,9 +6553,10 @@ procedure Minimize_Eliminate_Overflow_Checks - (N : Node_Id; - Lo : out Uint; - Hi : out Uint) + (N : Node_Id; + Lo: out Uint; + Hi: out Uint; + Top_Level : Boolean) is pragma Assert (Is_Signed_Integer_Type (Etype (N))); @@ -6578,6 +6583,11 @@ OK : Boolean; -- Used in call to Determine_Range + Bignum_Operands : Boolean; + -- Set True if one or more operands is already of type Bignum, meaning + -- that for sure (regardless of Top_Level setting) we are committed to + -- doing the operation in Bignum mode. + procedure Max (A : in out Uint; B : Uint); -- If A is No_Uint, sets A to B, else to UI_Max (A, B); @@ -6609,7 +6619,7 @@ -- Start of processing for Minimize_Eliminate_Overflow_Checks begin - -- Case where we do not have an arithmetic operator. + -- Case where we do not have an arithmetic operator if not Is_Signed_Integer_Arithmetic_Op (N) then @@ -6638,10 +6648,12 @@ -- that lies below us!) 
else - Minimize_Eliminate_Overflow_Checks (Right_Opnd (N), Rlo, Rhi); + Minimize_Eliminate_Overflow_Checks + (Right_Opnd (N), Rlo, Rhi, Top_Level = False); if Binary then -Minimize_Eliminate_Overflow_Checks (Left_Opnd (N), Llo, Lhi); +Minimize_Eliminate_Overflow_Checks + (Left_Opnd (N), Llo, Lhi, Top_Level = False); end if; end if; @@ -6650,10 +6662,13 @@ if Rlo = No_Uint or else (Binary and then Llo = No_Uint) then Lo := No_Uint; Hi := No_Uint; + Bignum_Operands := True; -- Otherwise compute result range else + Bignum_Operands := False; + case Nkind (N) is -- Absolute value @@ -7007,15 +7022,34 @@ if Lo =
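As a rough analogy in C (not how GNAT implements it), the point of the Top_Level parameter is that a top-level operation whose result goes straight back into a fixed-width type needs only an overflow check on that type, not arbitrary-precision arithmetic. The sketch below assumes the GCC/Clang extension `__builtin_mul_overflow` (GCC 5 or later); the function name `example_checked_square` is made up for the example.

```c
#include <stdbool.h>

/* Computes b * b in the native 64-bit type.  Returns true if the product
   overflowed; otherwise stores it in *result and returns false.
   __builtin_mul_overflow is a GCC/Clang extension, assumed available.  */
static inline bool
example_checked_square (long long b, long long *result)
{
  return __builtin_mul_overflow (b, b, result);
}
```

This corresponds to the `a := b * b` test case: the multiplication is done once in the result type, and only the final overflow check remains.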
Re: PR 53889: Add __gthread_recursive_mutex_destroy
On Mon, Oct 01, 2012 at 11:02:27PM -0700, Ian Lance Taylor wrote:

On Mon, Oct 1, 2012 at 5:46 PM, Jonathan Wakely jwakely@gmail.com wrote:

    static inline int
    __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
                                       * UNUSED(__mutex))
    { return 0; }

Is that indentation right? (the asterisk is in the same column as the parameter type in a fixed-width font.)

When I see a single parameter that pushes past 80 columns, I normally start a new line after the left parenthesis and indent the next line 4 spaces. E.g.:

    static inline int
    __gthread_recursive_mutex_destroy (
        __gthread_recursive_mutex_t * UNUSED(__mutex))

But I don't think there is any solid standard for this.

I believe the GNU coding standard way (as shown e.g. by what indent does by default) is to split the single argument onto multiple lines if that still fits (i.e.

    static inline int
    __gthread_recursive_mutex_destroy (__gthread_recursive_mutex_t
                                       * UNUSED(__mutex))
    { return 0; }

should be fine), and if even that wouldn't fit, then place ( on the following line indented by two spaces:

    int
    foo123456789012345678901234567890123456789012345678901234567890123456789012
      (int x, int y)
    {
      return x + y;
    }

I have never seen ( at the end of a line in GNU code and find it ugly, but sure, that is a bikeshed thing.

        Jakub
[Ada] Ada 2012 invariant checks on access values and components
This patch completes the generation of invariant checks, for the case of return values or in-out parameters that involve access types whose designated type has invariants.

Executing:

   gnatmake -q -gnat12 -gnata main
   main

must yield:

   1
   TEST 0
   1
   TEST 1
   2
   TEST 2
   2
   TEST 3
   TEST 4
   3
   4
   TEST 5
   3
   4
   END

---
with P; use P;
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
   O : T;                   -- value = 1
   V : T_Access := new T;   -- value = 2
   W : aliased X;
begin
   W.V1 := new T;   -- value = 3
   W.V2 := new T;   -- value = 4
   Put_Line ("TEST 0");
   Test_0 (O);
   Put_Line ("TEST 1");
   Test_1 (V);
   Put_Line ("TEST 2");
   Test_2 (V);
   Put_Line ("TEST 3");
   Test_3 (W);
   Put_Line ("TEST 4");
   Test_4 (W);
   Put_Line ("TEST 5");
   Test_5 (W'Access);
   Put_Line ("END");
end Main;
---
package P is
   type T is private
     with Type_Invariant => Check (T);
   type T_Access is access all T;
   type X is record
      V1 : access T;
      V2 : T_Access;
   end record;

   function Make (X : integer) return T;
   function Make (X : integer) return access T;
   procedure Test_0 (Obj : in out T);
   function Check (O : T) return Boolean;
   procedure Test_1 (V : access T);
   procedure Test_2 (V : T_Access);
   procedure Test_3 (V : X);
   procedure Test_4 (V : in out X);
   procedure Test_5 (V : access X);
private
   Counter : Integer := 0;
   function Incr return Integer;
   type T is record
      Value : Integer := Incr;
   end record;
end P;
---
with Ada.Text_IO; use Ada.Text_IO;
package body P is
   function Incr return Integer is
   begin
      Counter := Counter + 1;
      return Counter;
   end;

   Root : aliased T := (others => 15);

   function Check (O : T) return Boolean is
   begin
      Put_Line (Integer'Image (O.Value));
      return True;
   end Check;

   function Make (X : Integer) return T is
   begin
      return (Value => X);
   end;

   function Make (X : Integer) return access T is
   begin
      return Root'access;
   end;

   procedure Test_0 (Obj : in out T) is
   begin
      null;
   end;

   procedure Test_1 (V : access T) is
   begin
      null;
   end Test_1;

   procedure Test_2 (V : T_Access) is
   begin
      null;
   end Test_2;

   procedure Test_3 (V : X) is
   begin
      null;
   end Test_3;

   procedure Test_4 (V : in out X) is
   begin
      null;
   end Test_4;

   procedure Test_5 (V : access X) is
   begin
      null;
   end Test_5;
end P;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Ed Schonberg  schonb...@adacore.com

        * sem_ch6.adb (Process_PPCs): Generate invariant checks for a return value whose type is an access type and whose designated type has invariants. Ditto for in-out parameters and in-parameters of an access type.
        * exp_ch3.adb (Build_Component_Invariant_Call): Add invariant check for an access component whose designated type has invariants.

Index: sem_ch6.adb
===================================================================
--- sem_ch6.adb (revision 191911)
+++ sem_ch6.adb (working copy)
@@ -11078,6 +11078,12 @@
       Plist : List_Id := No_List;
       --  List of generated postconditions
+      procedure Check_Access_Invariants (E : Entity_Id);
+      --  If the subprogram returns an access to a type with invariants, or
+      --  has access parameters whose designated type has an invariant, then
+      --  under the same visibility conditions as for other invariant checks,
+      --  the type invariant must be applied to the returned value.
+
       function Grab_CC return Node_Id;
       --  Prag contains an analyzed contract case pragma. This function copies
       --  relevant components of the pragma, creates the corresponding Check
@@ -11108,6 4,43 @@
       --  that an invariant check is required (for an IN OUT parameter, or
       --  the returned value of a function.
+ - + -- Check_Access_Invariants -- + - + + procedure Check_Access_Invariants (E : Entity_Id) is + Call : Node_Id; + Obj : Node_Id; + Typ : Entity_Id; + + begin + if Is_Access_Type (Etype (E)) + and then not Is_Access_Constant (Etype (E)) + then +Typ := Designated_Type (Etype (E)); + +if Has_Invariants (Typ) + and then Present (Invariant_Procedure (Typ)) + and then Is_Public_Subprogram_For (Typ) +then + Obj := + Make_Explicit_Dereference (Loc, + Prefix = New_Occurrence_Of (E, Loc)); + Set_Etype (Obj, Typ); + + Call := Make_Invariant_Call (Obj); + + Append_To (Plist, + Make_If_Statement (Loc, + Condition = + Make_Op_Ne (Loc, + Left_Opnd = Make_Null (Loc), +
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
Uros Bizjak ubiz...@gmail.com writes:

Index: configure.ac
===================================================================
--- configure.ac (revision 191953)
+++ configure.ac (working copy)
@@ -66,6 +66,10 @@
 AC_PROG_CC
 m4_rename_force([backtrace_PRECIOUS],[_AC_ARG_VAR_PRECIOUS])
+if test x$GCC = xyes; then
+  CFLAGS="$CFLAGS -funwind-tables"
+fi
+

Don't modify CFLAGS, instead you should substitute a new variable that is added to AM_CFLAGS. CFLAGS is reserved for the user to override.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
[Ada] Add extended overflow -gnato switch to usage
This patch adds documentation on the -gnato? and -gnato?? switches to the usage information. Documentation only, no functional effect, but gnatmake output (with no switches) should have the following three lines for -gnato:

  -gnato    Enable overflow checking mode to CHECKED (off by default)
  -gnato?   Set SUPPRESSED/CHECKED/MINIMIZED/ELIMINATED (?=0/1/2/3) mode
  -gnato??  Set mode for general/assertion expressions separately

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Robert Dewar  de...@adacore.com

        * usage.adb, gnat_rm.texi, vms_data.ads: Add entry for /OVERFLOW_CHECKS=?? generating -gnato?? for control of extended overflow checking.
        * ug_words: Add entry for -gnato?? for /OVERFLOW_CHECKS=??
        * gnat_ugn.texi: Add documentation for -gnato?? for control of overflow checking mode.

Index: gnat_rm.texi
===================================================================
--- gnat_rm.texi (revision 191888)
+++ gnat_rm.texi (working copy)
@@ -179,6 +179,7 @@
 * Pragma Obsolescent::
 * Pragma Optimize_Alignment::
 * Pragma Ordered::
+* Pragma Overflow_Checks::
 * Pragma Passive::
 * Pragma Persistent_BSS::
 * Pragma Polling::
@@ -916,6 +917,7 @@
 * Pragma Obsolescent::
 * Pragma Optimize_Alignment::
 * Pragma Ordered::
+* Pragma Overflow_Checks::
 * Pragma Passive::
 * Pragma Persistent_BSS::
 * Pragma Polling::
@@ -4127,6 +4129,53 @@
 For additional information please refer to the description of the
 @option{-gnatw.u} switch in the @value{EDITION} User's Guide.
+@node Pragma Overflow_Checks
+@unnumberedsec Pragma Overflow_Checks
+@findex Overflow checks
+@findex pragma @code{Overflow_Checks}
+@noindent
+Syntax:
+
+@smallexample @c ada
+pragma Overflow_Checks
+ ( [General=>] MODE
+  [,[Assertions =>] MODE]);
+
+MODE ::= SUPPRESSED | CHECKED | MINIMIZED | ELIMINATED
+@end smallexample
+
+@noindent
+This pragma sets the current overflow mode to the given mode. For details
+of the meaning of these modes, see section on overflow checking in the
+GNAT users guide. If only the @code{General} parameter is present, the
+given mode applies to all expressions.
If both parameters are present, +the @code{General} mode applies to expressions outside assertions, and +the @code{Eliminated} mode applies to expressions within assertions. + +The case of the @code{MODE} parameter is ignored, +so @code{MINIMIZED}, @code{Minimized} and +@code{minimized} all have the same effect. + +The @code{Overflow_Checks} pragma has the same scoping and placement +rules as pragma @code{Suppress}, so it can occur either as a +configuration pragma, specifying a default for the whole +program, or in a declarative scope, where it applies to the +remaining declarations and statements in that scope. + +The pragma @code{Suppress (Overflow_Check)} sets mode + + General = Suppressed + +suppressing all overflow checking within and outside +assertions. + +The pragam @code{Unsuppress (Overflow_Check)} sets mode + + General = Checked + +which causes overflow checking of all intermediate overflows. +This applies both inside and outside assertions. + @node Pragma Passive @unnumberedsec Pragma Passive @findex Passive Index: gnat_ugn.texi === --- gnat_ugn.texi (revision 191910) +++ gnat_ugn.texi (working copy) @@ -4325,11 +4325,28 @@ Historically front end inlining was more extensive than the gcc back end inlining, but that is no longer the case. +@item -gnato?? +@cindex @option{-gnato??} (@command{gcc}) +Set default overflow cheecking mode. If ?? is a single digit, in the +range 0-3, it sets the overflow checking mode for all expressions, +including those outside and within assertions. The meaning of nnn is: + + 0 suppress overflow checks (SUPPRESSED) + 1 all intermediate overflows checked (CHECKED) + 2 minimize intermediate overflows (MINIMIZED) + 3 eliminate intermediate overflows (ELIMINATED) + +Otherwise ?? can be two digits, both 0-3, and in this case the first +digit sets the mode (using the above code) for expressions outside an +assertion, and the second digit sets the mode for expressions within +an assertion. 
+ @item -gnato @cindex @option{-gnato} (@command{gcc}) Enable numeric overflow checking (which is not normally enabled by default). Note that division by zero is a separate check that is not controlled by this switch (division by zero checking is on by default). +The checking mode is set to CHECKED (equivalent to @option{-gnato11}). @item -gnatp @cindex @option{-gnatp} (@command{gcc}) Index: ug_words === --- ug_words(revision 191888) +++ ug_words(working copy) @@ -88,6 +88,7 @@ -gnatn2 ^ /INLINE=PRAGMA_LEVEL_2 -gnatN ^ /INLINE=FULL -gnato ^ /CHECKS=OVERFLOW +-gnato??^ /OVERFLOW_CHECKS=?? -gnatp ^ /CHECKS=SUPPRESS_ALL -gnat-p ^
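The digit rules described for -gnato, -gnato? and -gnato?? can be sketched as a small C parser. This is an illustration of the documented behaviour only, not the code in the GNAT driver; `example_parse_gnato` and the enum names are hypothetical. No digits means CHECKED for both contexts (the documented equivalent of -gnato11), one digit sets both contexts, and two digits set the general mode and the assertion mode separately.

```c
#include <stddef.h>

/* Hypothetical mode codes matching the digits 0-3 in the documentation.  */
enum example_overflow_mode
{
  EX_SUPPRESSED = 0,
  EX_CHECKED = 1,
  EX_MINIMIZED = 2,
  EX_ELIMINATED = 3
};

/* Parses the characters that follow "-gnato".  On success stores the mode
   for general expressions and for assertion expressions and returns 0;
   returns -1 for anything other than zero, one or two digits in 0-3.  */
static int
example_parse_gnato (const char *digits, int *general, int *assertions)
{
  size_t i, n = 0;

  if (digits == NULL)
    return -1;
  while (digits[n] != '\0')
    n++;

  if (n == 0)
    {
      /* Plain -gnato: CHECKED everywhere, like -gnato11.  */
      *general = EX_CHECKED;
      *assertions = EX_CHECKED;
      return 0;
    }
  if (n > 2)
    return -1;
  for (i = 0; i < n; i++)
    if (digits[i] < '0' || digits[i] > '3')
      return -1;

  /* One digit applies to both contexts; with two digits the first is the
     general mode and the second the assertion mode.  */
  *general = digits[0] - '0';
  *assertions = digits[n - 1] - '0';
  return 0;
}
```

For example, the suffix "12" would select CHECKED outside assertions and MINIMIZED inside them.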
Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2c
Michael Meissner wrote:

Segher Boessenkool asked me on IRC to break out the fix in the last change. This patch is just the change to set the default options if the user did not use -mcpu=xxx and the compiler was not configured with --with-cpu=xxx. Here are the patches.

Which GCC releases are affected by this bug?

Regards,
Gunther

I can submit this patch first if David desires, and then resubmit the first of the infrastructure patches again, or commit both together.

2012-09-28  Michael Meissner  meiss...@linux.vnet.ibm.com

        * config/rs6000/rs6000.c (rs6000_option_override_internal): If -mcpu=xxx is not specified and the compiler is not configured using --with-cpu=xxx, use the bits from the TARGET_DEFAULT to set the initial options.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 191831)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -2461,6 +2461,11 @@ rs6000_option_override_internal (bool gl
   target_flags |= (processor_target_table[cpu_index].target_enable
                    & set_masks);
+  /* If no -mcpu=xxx, inherit any default options that were cleared via
+     POWERPC_MASKS.  */
+  if (!have_cpu)
+    target_flags |= (TARGET_DEFAULT & ~target_flags_explicit);
+
   if (rs6000_tune_index >= 0)
     tune_index = rs6000_tune_index;
   else if (have_cpu)
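The bit manipulation at the heart of this fix is a common idiom: OR in the default bits, but only those the user did not set or clear explicitly. A minimal C sketch, with the hypothetical name `example_apply_defaults` and plain `unsigned int` masks standing in for target_flags, TARGET_DEFAULT and target_flags_explicit:

```c
/* Returns FLAGS with every bit of DEFAULTS turned on, except the bits
   listed in EXPLICIT_MASK, which the user chose and must stay as they
   are.  */
static inline unsigned int
example_apply_defaults (unsigned int flags, unsigned int defaults,
                        unsigned int explicit_mask)
{
  return flags | (defaults & ~explicit_mask);
}
```

In the patch this step runs only when no -mcpu=xxx was given; the sketch leaves that condition to the caller.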
[Ada] Indexing aspects and indexable containers
This patch refines several tests on the legality of indexing aspects:

a) Constant_Indexing functions do not have to return a reference type,
b) given an indexing aspect Func, not all overloadings of Func in the current scope need to be indexing functions.

The command:

   gnatmake -gnat12 -q main
   main

must yield:

   Wow Yeah
   Rah Rah Rah

---
with indexing; use indexing;
with Text_IO; use Text_IO;
procedure Main is
   Box : Holder;
   Carton : Holder2;
begin
   Put_Line (Box.Get ("Yeah"));
   Put_Line (Carton.Get ("Rah "));
end Main;
---
package Indexing is
   type Holder is tagged null record
     with Constant_Indexing => Get,
          Iterator_Element => String;   -- iterable container

   function Get (V : Holder; W : String) return String;    -- indexing function
   function Get (V : Holder; W : String) return Integer;   -- indexing function

   type Holder2 is tagged null record
     with Constant_Indexing => Get;   -- indexable container

   function Get (V : Holder2; W : String) return String;   -- indexing function
end Indexing;
---
package body Indexing is
   function Get (V : Holder; W : String) return String is
   begin
      return "Wow " & W;
   end Get;

   function Get (V : Holder; W : String) return Integer is
   begin
      return 42;
   end Get;

   function Get (V : Holder2; W : String) return String is
   begin
      return W & W & W;
   end Get;
end Indexing;

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Ed Schonberg  schonb...@adacore.com

        * sem_ch13.adb (Check_Indexing_Functions): Refine several tests on the legality of indexing aspects: Constant_Indexing functions do not have to return a reference type, and given an indexing aspect Func, not all overloadings of Func in the current scope need to be indexing functions.

Index: sem_ch13.adb
===================================================================
--- sem_ch13.adb (revision 191902)
+++ sem_ch13.adb (working copy)
@@ -1919,7 +1919,7 @@
       procedure Check_Indexing_Functions;
       --  Check that the function in Constant_Indexing or Variable_Indexing
       --  attribute has the proper type structure. If the name is overloaded,
-      --  check that all interpretations are legal.
+ -- check that some interpretation is legal. procedure Check_Iterator_Functions; -- Check that there is a single function in Default_Iterator attribute @@ -2070,6 +2070,7 @@ -- procedure Check_Indexing_Functions is + Indexing_Found : Boolean; procedure Check_One_Function (Subp : Entity_Id); -- Check one possible interpretation @@ -2085,29 +2086,38 @@ Aspect_Iterator_Element); begin -if not Check_Primitive_Function (Subp) then +if not Check_Primitive_Function (Subp) + and then not Is_Overloaded (Expr) +then Error_Msg_NE (aspect Indexing requires a function that applies to type, - Subp, Ent); +Subp, Ent); end if; -- An indexing function must return either the default element of --- the container, or a reference type. +-- the container, or a reference type. For variable indexing it +-- must be latter. if Present (Default_Element) then Analyze (Default_Element); if Is_Entity_Name (Default_Element) and then Covers (Entity (Default_Element), Etype (Subp)) then + Indexing_Found := True; return; end if; end if; --- Otherwise the return type must be a reference type. +-- For variable_indexing the return type must be a reference type. -if not Has_Implicit_Dereference (Etype (Subp)) then +if Attr = Name_Variable_Indexing + and then not Has_Implicit_Dereference (Etype (Subp)) +then Error_Msg_N (function for indexing must return a reference type, Subp); + +else + Indexing_Found := True; end if; end Check_One_Function; @@ -2129,6 +2139,7 @@ It : Interp; begin + Indexing_Found := False; Get_First_Interp (Expr, I, It); while Present (It.Nam) loop @@ -2142,6 +2153,11 @@ Get_Next_Interp (I, It); end loop; + if not Indexing_Found then + Error_Msg_NE ( + aspect Indexing requires a function that applies to type, + Expr, Ent); + end if; end; end if; end Check_Indexing_Functions;
[Ada] Project in limited withed chain reported as duplicate
This patch ensures that if a project is in a limited with import chain, it is not reported as a duplicate project.

Tested on x86_64-pc-linux-gnu, committed on trunk

2012-10-02  Vincent Celier  cel...@adacore.com

        * prj-part.adb (Post_Parse_Context_Clause): Resurrect Boolean parameter In_Limited. Check for circularity also if In_Limited is True.
        (Parse_Single_Project): Call Post_Parse_Context_Clause with In_Limited parameter.

Index: prj-part.adb
===================================================================
--- prj-part.adb (revision 191895)
+++ prj-part.adb (working copy)
@@ -216,6 +216,7 @@
    procedure Post_Parse_Context_Clause
      (Context_Clause    : With_Id;
       In_Tree           : Project_Node_Tree_Ref;
+      In_Limited        : Boolean;
       Limited_Withs     : Boolean;
       Imported_Projects : in out Project_Node_Id;
       Project_Directory : Path_Name_Type;
@@ -827,6 +828,7 @@
    procedure Post_Parse_Context_Clause
      (Context_Clause    : With_Id;
       In_Tree           : Project_Node_Tree_Ref;
+      In_Limited        : Boolean;
       Limited_Withs     : Boolean;
       Imported_Projects : in out Project_Node_Id;
       Project_Directory : Path_Name_Type;
@@ -941,7 +943,9 @@
             --  If we have one, get the project id of the limited
             --  imported project file, and do not parse it.
- if Limited_Withs and then Project_Stack.Last 1 then + if (In_Limited or else Limited_Withs) and then + Project_Stack.Last 1 + then declare Canonical_Path_Name : Path_Name_Type; @@ -975,7 +979,7 @@ Path_Name_Id = Imported_Path_Name_Id, Extended = False, From_Extended = From_Extended, -In_Limited= Limited_Withs, +In_Limited= In_Limited or else Limited_Withs, Packages_To_Check = Packages_To_Check, Depth = Depth, Current_Dir = Current_Dir, @@ -1577,6 +1581,7 @@ Post_Parse_Context_Clause (In_Tree = In_Tree, Context_Clause= First_With, + In_Limited= In_Limited, Limited_Withs = False, Imported_Projects = Imported_Projects, Project_Directory = Project_Directory, @@ -1936,6 +1941,7 @@ Post_Parse_Context_Clause (In_Tree = In_Tree, Context_Clause= First_With, +In_Limited= In_Limited, Limited_Withs = True, Imported_Projects = Imported_Projects, Project_Directory = Project_Directory,
[Ada] Add style check for NOT IN
This patch adds a new style check for the layout of the NOT IN operation. If the token check style flag is set, then there must be exactly one space (and no other white space) between the NOT and the IN.

The following is compiled with -gnaty:

     1. package StyleNotIn is
     2.    x : Integer := 4;
     3.    y : Boolean := x not in 1 .. 10;
           |
        (style) single space must separate not and in
     4. end StyleNotIn;

2012-10-02  Robert Dewar  de...@adacore.com

        * stylesw.ads, gnat_ugn.texi: Document new style rule for NOT IN.
        * par-ch4.adb (P_Relational_Operator): Add style check for NOT IN.
        * style.ads, styleg.adb, styleg.ads (Check_Not_In): New procedure.

Index: gnat_ugn.texi
===================================================================
--- gnat_ugn.texi (revision 191960)
+++ gnat_ugn.texi (working copy)
@@ -6730,6 +6730,10 @@
 A vertical bar must be surrounded by spaces.
 @end itemize
+@item
+Exactly one blank (and no other white space) must appear between
+a @code{not} token and a following @code{in} token.
+
 @item ^u^UNNECESSARY_BLANK_LINES^
 @emph{Check unnecessary blank lines.}
 Unnecessary blank lines are not allowed. A blank line is considered
Index: par-ch4.adb
===================================================================
--- par-ch4.adb (revision 191888)
+++ par-ch4.adb (working copy)
@@ -2706,7 +2706,16 @@
             Scan; -- past operator token
+            --  Deal with NOT IN, if previous token was NOT, we must have IN now
+
             if Prev_Token = Tok_Not then
+
+               --  Style check, for NOT IN, we require one space between NOT and IN
+
+               if Style_Check and then Token = Tok_In then
+                  Style.Check_Not_In;
+               end if;
+
                T_In;
             end if;
Index: style.ads
===================================================================
--- style.ads (revision 191888)
+++ style.ads (working copy)
@@ -6,7 +6,7 @@
 --                                                                          --
 --                                 S p e c                                  --
 --                                                                          --
---          Copyright (C) 1992-2010, Free Software Foundation, Inc.         --
+--          Copyright (C) 1992-2012, Free Software Foundation, Inc.         --
 --                                                                          --
 -- GNAT is free software; you can redistribute it and/or modify it under    --
 -- terms of the GNU General Public License as published by the Free Soft-   --
@@ -155,6 +155,11 @@
    --  check the line length (Len is the length of the current line).
Note that -- the terminator may be the EOF character. + procedure Check_Not_In + renames Style_Inst.Check_Not_In; + -- Called with Scan_Ptr pointing to an IN token, and Prev_Token_Ptr + -- pointing to a NOT token. Used to check proper layout of NOT IN. + procedure Check_Pragma_Name renames Style_Inst.Check_Pragma_Name; -- The current token is a pragma identifier. Check that it is spelled Index: styleg.adb === --- styleg.adb (revision 191888) +++ styleg.adb (working copy) @@ -6,7 +6,7 @@ -- -- -- B o d y -- -- -- --- Copyright (C) 1992-2011, Free Software Foundation, Inc. -- +-- Copyright (C) 1992-2012, Free Software Foundation, Inc. -- -- -- -- GNAT is free software; you can redistribute it and/or modify it under -- -- terms of the GNU General Public License as published by the Free Soft- -- @@ -764,6 +764,24 @@ end if; end Check_Line_Terminator; + -- + -- Check_Not_In -- + -- + + -- In check tokens mode, only one space between NOT and IN + + procedure Check_Not_In is + begin + if Style_Check_Tokens then + if Source (Token_Ptr - 1) /= ' ' + or else Token_Ptr - Prev_Token_Ptr /= 4 + then -- CODEFIX? +Error_Msg + ((style) single space must separate NOT and IN, Token_Ptr - 1); + end if; + end if; + end Check_Not_In; + -- -- Check_No_Space_After -- -- Index: styleg.ads === --- styleg.ads (revision 191888) +++ styleg.ads (working copy) @@ -6,7 +6,7 @@ -- -- -- S p e c -- --
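The test performed by Check_Not_In comes down to two conditions on source positions: the character just before IN must be a blank, and IN must start exactly four characters after NOT (three letters plus one space). A minimal C rendering of that condition, with the hypothetical name `example_not_in_layout_ok` and zero-based offsets standing in for Prev_Token_Ptr and Token_Ptr:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when exactly one blank separates NOT and IN.  NOT_POS and
   IN_POS are the offsets in SOURCE of the first character of each token;
   IN_POS must be greater than zero.  */
static inline bool
example_not_in_layout_ok (const char *source, size_t not_pos, size_t in_pos)
{
  return source[in_pos - 1] == ' ' && in_pos - not_pos == 4;
}
```

Two blanks fail the distance test, and a single tab fails the character test, which matches the "exactly one blank and no other white space" wording of the rule.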
Re: RFC: LRA for x86/x86-64 [0/9]
On 02/10/2012 09:28, Steven Bosscher wrote:

My experience shows that these lists usually have 1-2 elements. In this case, though, there are pseudos with a huge number of elements (hundreds). I tried -fweb for this test because it can decrease the number of elements, but GCC (I don't know which pass) scales even worse: after 20 minutes of waiting, when virtual memory reached 20GB, I stopped it.

Ouch :-)

The webizer itself never even runs; the compiler blows up somewhere during the df_analyze call from web_main. The issue here is probably in the DF_UD_CHAIN problem or in the DF_RD problem.

/me is glad to have fixed fwprop when his GCC contribution time was more than 1-2 days per year...

Unfortunately, the fwprop solution (actually a rewrite) was very specific to the problem and cannot be reused in other parts of the compiler. I guess this is where we could experiment with region-based optimization. If a loop (including the parent dummy loop) is too big, ignore it and only do LRS on smaller loops inside it. Reaching definitions is insanely expensive on an entire function, but works well on smaller loops.

Perhaps something similar could also be applied to IRA/LRA.

Paolo
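The region-based idea suggested here can be sketched as a walk over a loop tree: a loop that fits under a size limit is processed as one region, and a loop that is too big is skipped in favour of the loops nested inside it. The C below is only a sketch under that reading; `struct example_loop`, its fields and `example_visit_regions` are hypothetical and are not GCC's loop data structures.

```c
#include <stddef.h>

/* Hypothetical loop-tree node, for illustration only.  */
struct example_loop
{
  size_t num_insns;            /* size of the loop, nested loops included */
  struct example_loop *inner;  /* first nested loop, or NULL */
  struct example_loop *next;   /* next sibling loop, or NULL */
};

/* Walks LOOP and its siblings.  A loop no larger than MAX_INSNS is handed
   to PROCESS (if non-NULL) as one region; a larger loop is ignored and
   its nested loops are tried instead.  Returns the number of regions
   selected.  */
static size_t
example_visit_regions (struct example_loop *loop, size_t max_insns,
                       void (*process) (struct example_loop *))
{
  size_t count = 0;

  for (; loop != NULL; loop = loop->next)
    {
      if (loop->num_insns <= max_insns)
        {
          if (process != NULL)
            process (loop);
          count++;
        }
      else
        count += example_visit_regions (loop->inner, max_insns, process);
    }
  return count;
}
```

With the whole function modelled as the outermost dummy loop, a generous limit selects the function as a single region, while a tight limit selects only the small inner loops and leaves the rest unoptimized.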
[Ada] References to the formals of child subprograms without specs
If a child subprogram has no previous spec, treat a reference to its formals (such as a parameter association) as coming from source, in order to generate the proper references and enable gps navigation between reference and declaration. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-02 Ed Schonberg schonb...@adacore.com * lib-xref.adb (Generate_Reference): If a child subprogram has no previous spec, treat a reference to its formals (such as a parameter association) as coming from source in order to generate the proper references and enable gps navigation between reference and declaration. Index: lib-xref.adb === --- lib-xref.adb(revision 191888) +++ lib-xref.adb(working copy) @@ -945,6 +945,13 @@ then Ent := E; + -- Ditto for the formals of such a subprogram + + elsif Is_Overloadable (Scope (E)) + and then Is_Child_Unit (Scope (E)) + then +Ent := E; + -- Record components of discriminated subtypes or derived types must -- be treated as references to the original component.
abs(long long)
Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double...
* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers.
* newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready.

I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again).

2012-10-02 Marc Glisse marc.gli...@inria.fr

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc Glisse

Index: include/c_std/cstdlib
===
--- include/c_std/cstdlib (revision 191941)
+++ include/c_std/cstdlib (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using ::strtoul;
 using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
 using ::wcstombs;
 using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 inline long
 abs(long __i) { return labs(__i); }
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+ && (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+ // Fallback version if we don't have llabs but still allow long long.
+ inline long long
+ abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+ inline __int128
+ abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
 inline ldiv_t
 div(long __i, long __j) { return ldiv(__i, __j); }
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 #if _GLIBCXX_USE_C99
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
 extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
 using ::_Exit;
 #endif
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 inline long long
- abs(long long __x) { return __x >= 0 ? __x : -__x; }
+ abs(long long __x) { return ::llabs (__x); }
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::llabs;
 inline lldiv_t
 div(long long __n, long long __d)
- { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+ { return ::lldiv (__n, __d); }
 using ::lldiv;
 #endif
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 extern "C" long long int (atoll)(const char *) throw ();
 extern "C" long long int (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
 extern "C" unsigned long long int (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::__gnu_cxx::lldiv_t;
 #endif
 using ::__gnu_cxx::_Exit;
- using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+ using ::__gnu_cxx::abs;
 using ::__gnu_cxx::llabs;
 using ::__gnu_cxx::div;
 using ::__gnu_cxx::lldiv;
 #endif
 using ::__gnu_cxx::atoll;
 using ::__gnu_cxx::strtof;
 using ::__gnu_cxx::strtoll;
 using ::__gnu_cxx::strtoull;
 using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <cmath>
+#include <cstdlib>
+#include <type_traits>
+#include <utility>
+
+#ifdef
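
For readers following the patch, here is a rough sketch of why swapping the hand-written bodies for llabs and lldiv should not change results. The names fallback_llabs and manual_lldiv are invented for this illustration and are not part of libstdc++; the sketch assumes a C++11 compiler, where integer division truncates toward zero.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the fallback abs(long long) in the patch.
// Like llabs, the result is undefined for the most negative value.
inline long long fallback_llabs(long long x) { return x >= 0 ? x : -x; }

// Hypothetical stand-in for the old inline body of div(long long, long long).
inline std::lldiv_t manual_lldiv(long long n, long long d)
{
  assert(d != 0);  // division by zero is undefined in both spellings
  std::lldiv_t q;
  q.quot = n / d;  // truncates toward zero in C++11
  q.rem = n % d;   // takes the sign of the dividend
  return q;
}
```

Comparing manual_lldiv(-7, 2) with std::lldiv(-7, 2) is a quick way to check that quotient and remainder agree for negative operands.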
Re: RFC: LRA for x86/x86-64 [0/9]
On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini bonz...@gnu.org wrote:

On 02/10/2012 09:28, Steven Bosscher wrote:

My experience shows that these lists are usually 1-2 elements. Although in this case, there are pseudos with a huge number of elements (hundreds). I tried -fweb for this test because it can decrease the number of elements, but GCC (I don't know what pass) scales even worse: after 20 min of waiting, and when virtual memory reached 20GB, I stopped it.

Ouch :-) The webizer itself never even runs, the compiler blows up somewhere during the df_analyze call from web_main. The issue here is probably in the DF_UD_CHAIN problem or in the DF_RD problem.

/me is glad to have fixed fwprop when his GCC contribution time was more than 1-2 days per year...

I thought you spent more time on GCC nowadays, working for Red Hat? Who's your manager, perhaps we can coerce him/her into letting you spend more time on GCC :-P

Unfortunately, the fwprop solution (actually a rewrite) was very specific to the problem and cannot be reused in other parts of the compiler.

That'd be too bad... But is this really true? I thought you had done something that builds chains only for USEs reached by multiple DEFs? That's the only interesting kind for web, too.

I guess this is where we could experiment with region-based optimization. If a loop (including the parent dummy loop) is too big, ignore it and only do LRS on smaller loops inside it. Reaching definitions is insanely expensive on an entire function, but works well on smaller loops.

Heh, yes. In fact I have been working on a region-based version of web because it is (or at least: used to be) a useful pass that isn't enabled by default only because the underlying RD problem scales so badly. My current collection of hacks doesn't bootstrap, doesn't even build libgcc yet, but I plan to finish it for GCC 4.9. It's based on identifying SEME regions using structural analysis, and DF's partial CFG analysis (the latter is currently the problem).
FWIW: part of the problem for this particular test case is that there are many registers with partial defs (vector registers) and the RD problem doesn't (and probably cannot) keep track of one partial def/use killing another partial def/use. This handling of vector regs appears to be a general problem with much of the RTL infrastructure. Ciao! Steven
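
The remark that only USEs reached by multiple DEFs are interesting for web can be illustrated with a small union-find sketch. This is not the code in GCC; WebSets, build_webs and the reaching table are names made up here, and the input is assumed to come from some reaching-definitions analysis. Each definition starts in its own web, and two definitions are merged only when they reach a common use.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Union-find over definition indices; each resulting set is one web.
struct WebSets {
  std::vector<int> parent;

  explicit WebSets(int ndefs) : parent(ndefs)
  {
    std::iota(parent.begin(), parent.end(), 0);  // every def is its own web
  }

  int find(int d)
  {
    assert(d >= 0 && static_cast<std::size_t>(d) < parent.size());
    while (parent[d] != d) {
      parent[d] = parent[parent[d]];  // path halving
      d = parent[d];
    }
    return d;
  }

  void unite(int a, int b) { parent[find(a)] = find(b); }
};

// reaching[u] lists the definitions that reach use u (hypothetical input).
inline WebSets build_webs(int ndefs,
                          const std::vector<std::vector<int> >& reaching)
{
  WebSets webs(ndefs);
  for (std::size_t u = 0; u < reaching.size(); ++u) {
    const std::vector<int>& defs = reaching[u];
    // A use reached by a single def never forces a merge, so only
    // uses with two or more reaching defs do any work here.
    for (std::size_t i = 1; i < defs.size(); ++i)
      webs.unite(defs[0], defs[i]);
  }
  return webs;
}
```

With four definitions and uses reached by {0}, {1, 2} and {3}, definitions 1 and 2 end up in one web while 0 and 3 stay separate.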
[AARCH64] Merge from upstream trunk r191882
Hi, I have just merged upstream trunk on the aarch64-branch up to r191882. Thanks Sofiane
[patch] Introduce DECL_NONLOCAL_FRAME
Hi,

this is the seemingly non-controversial part of the FRAME splitting patch. It introduces the DECL_NONLOCAL_FRAME flag, sets it during nested function lowering and... that's pretty much it.

Tested on x86_64-suse-linux, OK for mainline?

2012-10-02 Eric Botcazou ebotca...@adacore.com

* tree.h (DECL_NONLOCAL_FRAME): New macro.
* tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in DECL_NONLOCAL_FRAME flag.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out DECL_NONLOCAL_FRAME flag.

--
Eric Botcazou

Index: tree.h
===
--- tree.h (revision 191924)
+++ tree.h (working copy)
@@ -712,6 +712,9 @@ struct GTY(()) tree_base {
 SSA_NAME_IS_DEFAULT_DEF in
 SSA_NAME
+
+ DECL_NONLOCAL_FRAME in
+ VAR_DECL
 */
 struct GTY(()) tree_typed {
@@ -3270,9 +3273,14 @@ extern void decl_fini_priority_insert (t
 libraries. */
 #define MAX_RESERVED_INIT_PRIORITY 100
+/* In a VAR_DECL, nonzero if this is a global variable for VOPs. */
 #define VAR_DECL_IS_VIRTUAL_OPERAND(NODE) \
 (VAR_DECL_CHECK (NODE)->base.u.bits.saturating_flag)
+/* In a VAR_DECL, nonzero if this is a non-local frame structure. */
+#define DECL_NONLOCAL_FRAME(NODE) \
+ (VAR_DECL_CHECK (NODE)->base.default_def_flag)
+
 struct GTY(()) tree_var_decl {
 struct tree_decl_with_vis common;
 };
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 191909)
+++ tree-streamer-out.c (working copy)
@@ -181,6 +181,9 @@ pack_ts_decl_common_value_fields (struct
 bp_pack_value (bp, expr->decl_common.off_align, 8);
 }
+ if (TREE_CODE (expr) == VAR_DECL)
+ bp_pack_value (bp, DECL_NONLOCAL_FRAME (expr), 1);
+
 if (TREE_CODE (expr) == RESULT_DECL
 || TREE_CODE (expr) == PARM_DECL
 || TREE_CODE (expr) == VAR_DECL)
Index: tree-nested.c
===
--- tree-nested.c (revision 191909)
+++ tree-nested.c (working copy)
@@ -235,6 +235,7 @@ get_frame_type (struct nesting_info *inf
 info->frame_type = type;
 info->frame_decl = create_tmp_var_for (info, type, "FRAME");
+ DECL_NONLOCAL_FRAME (info->frame_decl) = 1;
 /* ??? Always make it addressable for now, since it is meant to be pointed to by the static chain pointer. This pessimizes
Index: tree-streamer-in.c
===
--- tree-streamer-in.c (revision 191909)
+++ tree-streamer-in.c (working copy)
@@ -216,6 +216,9 @@ unpack_ts_decl_common_value_fields (stru
 expr->decl_common.off_align = bp_unpack_value (bp, 8);
 }
+ if (TREE_CODE (expr) == VAR_DECL)
+ DECL_NONLOCAL_FRAME (expr) = (unsigned) bp_unpack_value (bp, 1);
+
 if (TREE_CODE (expr) == RESULT_DECL
 || TREE_CODE (expr) == PARM_DECL
 || TREE_CODE (expr) == VAR_DECL)
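
The streamer part of the patch relies on the writer and the reader packing the same bits in the same order under the same condition. A minimal sketch of that idea follows; BitPack, pack_value, unpack_value and ToyDecl are hypothetical names for this example and are only loosely modelled on bp_pack_value and bp_unpack_value. It assumes the reader already knows whether it is looking at a variable declaration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical single-word bit buffer.
struct BitPack {
  std::uint64_t word;
  unsigned pos;
};

inline void pack_value(BitPack& bp, std::uint64_t val, unsigned nbits)
{
  assert(nbits > 0 && nbits < 64 && bp.pos + nbits <= 64);
  const std::uint64_t mask = (std::uint64_t(1) << nbits) - 1;
  bp.word |= (val & mask) << bp.pos;
  bp.pos += nbits;
}

inline std::uint64_t unpack_value(BitPack& bp, unsigned nbits)
{
  assert(nbits > 0 && nbits < 64 && bp.pos + nbits <= 64);
  const std::uint64_t mask = (std::uint64_t(1) << nbits) - 1;
  const std::uint64_t val = (bp.word >> bp.pos) & mask;
  bp.pos += nbits;
  return val;
}

// Toy declaration record; the field names are illustrative only.
struct ToyDecl {
  bool is_var;          // stands in for TREE_CODE (expr) == VAR_DECL
  unsigned off_align;   // 8-bit field streamed unconditionally
  bool nonlocal_frame;  // stands in for DECL_NONLOCAL_FRAME
};

inline BitPack write_decl(const ToyDecl& d)
{
  BitPack bp = {0, 0};
  pack_value(bp, d.off_align, 8);
  if (d.is_var)  // the extra bit exists only for variables
    pack_value(bp, d.nonlocal_frame ? 1 : 0, 1);
  return bp;
}

// The reader must consume the same fields in the same order.
inline ToyDecl read_decl(BitPack bp, bool is_var)
{
  bp.pos = 0;  // rewind: reading starts at the first packed bit
  ToyDecl d = {is_var, 0, false};
  d.off_align = static_cast<unsigned>(unpack_value(bp, 8));
  if (is_var)
    d.nonlocal_frame = unpack_value(bp, 1) != 0;
  return d;
}
```

If only one side were guarded by the is_var test, every field packed after the flag would be read one bit off, which is why the in and out hunks mirror each other.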
Re: [patch] Introduce DECL_NONLOCAL_FRAME
On Tue, Oct 02, 2012 at 10:49:31AM +0200, Eric Botcazou wrote: this is the seemingly non-controversial part of the FRAME splitting patch. It introduces the DECL_NONLOCAL_FRAME flag, sets it during nested function lowering and... that's pretty much it. Tested on x86_64-suse-linux, OK for mainline? Yes, thanks. 2012-10-02 Eric Botcazou ebotca...@adacore.com * tree.h (DECL_NONLOCAL_FRAME): New macro. * tree-nested.c (get_frame_type): Set DECL_NONLOCAL_FRAME. * tree-streamer-in.c (unpack_ts_decl_common_value_fields): Stream in DECL_NONLOCAL_FRAME flag. * tree-streamer-out.c (pack_ts_decl_common_value_fields): Stream out DECL_NONLOCAL_FRAME flag. Jakub
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 10:08 AM, Andreas Schwab sch...@linux-m68k.org wrote: +if test x$GCC = xyes; then + CFLAGS=$CFLAGS -funwind-tables +fi + Don't modify CFLAGS, instead you should substitute a new variable that is added to AM_CFLAGS. CFLAGS is reserved for the user to override. Thanks, attached is a version that introduces EXTRA_FLAGS instead. 2012-10-02 Uros Bizjak ubiz...@gmail.com PR other/54761 * configure.ac (EXTRA_FLAGS): New. * Makefile.am (AM_FLAGS): Add $(EXTRA_FLAGS). * configure, Makefile.in: Regenerate. Testing on {x86_64,alphaev68}-linux-gnu in progress. Uros. Index: configure === --- configure (revision 191955) +++ configure (working copy) @@ -612,6 +612,7 @@ BACKTRACE_SUPPORTS_THREADS PIC_FLAG WARN_FLAGS +EXTRA_FLAGS BACKTRACE_FILE multi_basedir OTOOL64 @@ -11080,7 +11081,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 11083 configure +#line 11084 configure #include confdefs.h #if HAVE_DLFCN_H @@ -11186,7 +11187,7 @@ lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat conftest.$ac_ext _LT_EOF -#line 11189 configure +#line 11190 configure #include confdefs.h #if HAVE_DLFCN_H @@ -11488,6 +11489,12 @@ fi +EXTRA_FLAGS= +if test x$GCC = xyes; then + EXTRA_FLAGS=-funwind-tables +fi + + WARN_FLAGS= save_CFLAGS=$CFLAGS for real_option in -W -Wall -Wwrite-strings -Wstrict-prototypes \ Index: Makefile.in === --- Makefile.in (revision 191955) +++ Makefile.in (working copy) @@ -152,6 +152,7 @@ ECHO_T = @ECHO_T@ EGREP = @EGREP@ EXEEXT = @EXEEXT@ +EXTRA_FLAGS = @EXTRA_FLAGS@ FGREP = @FGREP@ FORMAT_FILE = @FORMAT_FILE@ GREP = @GREP@ @@ -253,7 +254,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \ -I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include -AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG) +AM_CFLAGS = $(EXTRA_FLAGS) $(WARN_FLAGS) $(PIC_FLAG) noinst_LTLIBRARIES = libbacktrace.la libbacktrace_la_SOURCES = \ backtrace.h \ Index: 
configure.ac === --- configure.ac(revision 191955) +++ configure.ac(working copy) @@ -96,6 +96,12 @@ fi AC_SUBST(BACKTRACE_FILE) +EXTRA_FLAGS= +if test x$GCC = xyes; then + EXTRA_FLAGS=-funwind-tables +fi +AC_SUBST(EXTRA_FLAGS) + ACX_PROG_CC_WARNING_OPTS([-W -Wall -Wwrite-strings -Wstrict-prototypes \ -Wmissing-prototypes -Wold-style-definition \ -Wmissing-format-attribute -Wcast-qual], Index: Makefile.am === --- Makefile.am (revision 191955) +++ Makefile.am (working copy) @@ -34,7 +34,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/../include -I $(top_srcdir)/../libgcc \ -I ../libgcc -I ../gcc/include -I $(MULTIBUILDTOP)../../gcc/include -AM_CFLAGS = $(WARN_FLAGS) $(PIC_FLAG) +AM_CFLAGS = $(EXTRA_FLAGS) $(WARN_FLAGS) $(PIC_FLAG) noinst_LTLIBRARIES = libbacktrace.la
Re: abs(long long)
On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote:

(Forgot libstdc++...)

Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double...

don't we have a core issue about preferring unsigned -> long or long long?

* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers.

Why would that be a problem with the appropriate #define?

* newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready.

I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again).

In general, I think I have a bias toward using compiler intrinsics, about which the compiler already has a lot of knowledge.

2012-10-02 Marc Glisse marc.gli...@inria.fr

PR libstdc++/54686
* include/c_std/cstdlib (abs(long long)): Define fallback whenever we have long long but possibly not llabs.
(abs(long long)): Use llabs when available.
(abs(__int128)): Define when we have __int128.
(div(long long, long long)): Use lldiv.
* testsuite/26_numerics/headers/cstdlib/54686.c: New file.

--
Marc Glisse

Index: include/c_std/cstdlib
===
--- include/c_std/cstdlib (revision 191941)
+++ include/c_std/cstdlib (working copy)
@@ -130,20 +130,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using ::strtoul;
 using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
 using ::wcstombs;
 using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 inline long
 abs(long __i) { return labs(__i); }
+#if defined (_GLIBCXX_USE_LONG_LONG) \
+ && (!_GLIBCXX_USE_C99 || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC)
+ // Fallback version if we don't have llabs but still allow long long.
+ inline long long
+ abs(long long __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+ inline __int128
+ abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
+
 inline ldiv_t
 div(long __i, long __j) { return ldiv(__i, __j); }
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 #if _GLIBCXX_USE_C99
 #undef _Exit
 #undef llabs
@@ -161,29 +173,29 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
 extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
 using ::_Exit;
 #endif
+#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 inline long long
- abs(long long __x) { return __x >= 0 ? __x : -__x; }
+ abs(long long __x) { return ::llabs (__x); }
-#if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::llabs;
 inline lldiv_t
 div(long long __n, long long __d)
- { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+ { return ::lldiv (__n, __d); }
 using ::lldiv;
 #endif
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 extern "C" long long int (atoll)(const char *) throw ();
 extern "C" long long int (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
 extern "C" unsigned long long int (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,22 +210,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
 using ::__gnu_cxx::lldiv_t;
 #endif
 using ::__gnu_cxx::_Exit;
- using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
+ using ::__gnu_cxx::abs;
 using ::__gnu_cxx::llabs;
 using ::__gnu_cxx::div;
 using ::__gnu_cxx::lldiv;
 #endif
 using ::__gnu_cxx::atoll;
 using ::__gnu_cxx::strtof;
 using ::__gnu_cxx::strtoll;
 using ::__gnu_cxx::strtoull;
 using ::__gnu_cxx::strtold;
 } // namespace std
Index: testsuite/26_numerics/headers/cstdlib/54686.c
===
--- testsuite/26_numerics/headers/cstdlib/54686.c (revision 0)
+++ testsuite/26_numerics/headers/cstdlib/54686.c (revision 0)
@@ -0,0 +1,32 @@
+// { dg-do compile }
+// { dg-options "-std=c++11" }
+
+// Copyright (C) 2012 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope
Re: [RFC] Make vectorizer to skip loops with small iteration estimate
On Mon, 1 Oct 2012, Jan Hubicka wrote:

So the unvectorized cost is
SIC * niters
The vectorized path is
SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
The scalar path of vectorizer loop is
SIC * niters + SOC

Note that 'th' is used for the runtime profitability check which is done at the time the setup cost has already been taken (yes, we

Yes, I understand that.

probably should make it more conservative but then guard the whole set of loops by the check, not only the vectorized path). See PR53355 for the general issue.

Yep, we may reduce the cost of SOC by outputting an early guard for the non-vectorized path better than we do now. However...

Of course this is a very simple benchmark; in reality the vectorization can be a lot more harmful by complicating more complex control flows. So I guess we have two options:
1) go with the new formula and try to make the cost model a bit more realistic.
2) stay with the original formula that is quite close to reality, but I think more by accident.

I think we need to improve it as a whole, thus I'd prefer 2).

... I do not see why. Even if we make the check cheaper we will only distribute part of SOC to vector prologues/epilogues. Still I think the formula is wrong, i.e. accounting SOC where it should not. The cost of the scalar path without vectorization is
niters * SIC
while with vectorization we have scalar path
niters * SIC + SOC
and vector path
SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC
So SOC cancels out in the runtime check. I still think we need two formulas - one determining if vectorization is profitable, the other specifying the threshold for the scalar path at runtime (that will generally give lower values).

True, we want two values. But part of the scalar path right now is all the computation required for alias and alignment runtime checks (because of the way all the conditions are combined). I'm not much into the details of what we account for in SOC (I suppose it's everything we insert in the preheader of the vector loop).
+ if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+ fprintf (vect_dump, "not vectorized: estimated iteration count too small.");
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "not vectorized: estimated iteration count smaller than "
+ "user specified loop bound parameter or minimum "
+ "profitable iterations (whichever is more conservative).");

this won't work anymore btw - dumping infrastructure changed.

I suppose your patch is a step in the right direction, but to really make progress we need to re-organize the loop and predicate structure produced by the vectorizer.

So, please update your patch, re-test and then it's ok.

2) Even when the loop iterates 2 times, it is estimated at 4 iterations by estimated_stmt_executions_int with the profile feedback. The reason is the loop_ch pass. Given a rolled loop with exit probability 30%, it proceeds by duplicating the header with the original probabilities. This makes the loop executed with 60% probability. Because the loop body counts remain the same (and they should), the expected number of iterations increases as the count of the entry edge to the header decreases. I wonder what to do about this. Obviously without path profiling loop_ch cannot really do a good job. We can artificially make the header succeed more often, which matches reality, but that requires non-trivial loop profile updating. We can also simply record the iteration bound in the loop structure and ignore that the profile is not realistic

But we don't preserve loop structure from header copying ...

From what time do we keep the loop structure? In general I would like to eventually drop value histograms into the loop structure, specifying the number of iterations with profile feedback.
We preserve it from the start of the tree loop optimizers (it's easy to preserve them from earlier points as long as you don't cross inlining, but to lower the impact of the change I placed it where it was enough to prevent the excessive unrolling/peeling done by RTL)

Finally we can duplicate loop headers before profiling. I implemented that via early_ch pass executed only with profile generation or feedback. I guess it makes sense to do, even if it breaks the assumption that we should do strictly -Os generation on paths where

Well, there are CH cases that do not increase code size and I doubt that loop header copying is generally bad for -Os ...

we are not good at handling non-copied loop headers. There is comment saying /* Loop header copying usually increases size of the code. This used not to be true, since quite often it is possible to verify that the condition is satisfied in the first
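
The argument about two thresholds can be made concrete with a small calculation. The sketch below is not the vectorizer's cost model; LoopCosts and the two helper functions are invented names, the division by VF is done in floating point for simplicity, and the cost numbers in the usage note are arbitrary. It only encodes the formulas quoted in the thread: the static check compares SIC * niters against SOC + VIC * ((niters-PL_ITERS-EP_ITERS)/VF) + VOC, while in the runtime check SOC has been paid on both paths and drops out.

```cpp
#include <cassert>

// SIC/VIC: scalar/vector cost per iteration.
// SOC/VOC: scalar/vector outside-of-loop costs.
struct LoopCosts {
  double sic, vic, soc, voc;
  int pl_iters, ep_iters;  // prologue and epilogue iterations
  int vf;                  // vectorization factor
};

// VIC * ((niters - PL_ITERS - EP_ITERS) / VF) + VOC, without SOC.
inline double vector_cost_without_soc(const LoopCosts& c, int niters)
{
  assert(c.vf > 0);
  return c.vic * (niters - c.pl_iters - c.ep_iters) / c.vf + c.voc;
}

// Smallest niters (up to limit) for which vectorizing beats a loop
// that was never vectorized; -1 if there is none.
inline int min_iters_static(const LoopCosts& c, int limit)
{
  for (int n = c.pl_iters + c.ep_iters; n <= limit; ++n)
    if (c.soc + vector_cost_without_soc(c, n) < c.sic * n)
      return n;
  return -1;
}

// Smallest niters for which the vector path wins at the runtime check,
// where SOC appears on both sides and cancels; -1 if there is none.
inline int min_iters_runtime(const LoopCosts& c, int limit)
{
  for (int n = c.pl_iters + c.ep_iters; n <= limit; ++n)
    if (vector_cost_without_soc(c, n) < c.sic * n)
      return n;
  return -1;
}
```

With SIC = 4, VIC = 6, SOC = 20, VOC = 10, VF = 4 and no prologue or epilogue iterations, the static threshold works out to 13 iterations and the runtime threshold to 5, which matches the remark that the runtime value is generally lower.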
[AARCH64-4.7] Merge from upstream gcc-4_7-branch r191881
Hi, I have just merged upstream gcc-4_7-branch on the aarch64-4.7-branch up to r191881. Thanks Sofiane
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote:

On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote:

(Forgot libstdc++...)

Hello,

here is the patch from PR54686. Several notes:

* I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double...

don't we have a core issue about preferring unsigned -> long or long long?

Here I am talking of a library issue: the wording that says that there are sufficient overloads such that integer types call the double version of math functions. It is fairly obvious that it doesn't apply to abs(long) for instance, which has an explicit overload. For short or unsigned, I still read it as saying that it converts to double...

* I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers.

Why would that be a problem with the appropriate #define?

The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work...

* newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready.

I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again).

In general, I think I have a bias toward using compiler intrinsics, about which the compiler already has a lot of knowledge.

More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same.

--
Marc Glisse
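
The reading Marc describes (explicit overloads win, every other integer type goes through double) can be tried out in a toy namespace. This is only a sketch of the overload-resolution mechanics; the namespace sketch and its functions are invented here, it does not reproduce libstdc++'s headers, and whether the standard really intends this behaviour for abs is the open question in the thread.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {
  // Explicit overloads, as for int, long and long long in <cstdlib>.
  inline int abs(int x) { return x >= 0 ? x : -x; }
  inline long abs(long x) { return x >= 0 ? x : -x; }
  inline long long abs(long long x) { return x >= 0 ? x : -x; }
  inline double abs(double x) { return x >= 0 ? x : -x; }

  // Catch-all for the remaining integer types: convert to double.
  // For int, long and long long the non-template overloads above are
  // preferred because they match exactly and are not templates.
  template <typename T>
  inline typename std::enable_if<std::is_integral<T>::value, double>::type
  abs(T x) { return abs(static_cast<double>(x)); }
}
```

Under this scheme sketch::abs(3u) and sketch::abs(short(-2)) both have type double, while sketch::abs(-5L) keeps type long.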
Re: Convert more non-GTY htab_t to hash_table.
On Mon, 1 Oct 2012, Lawrence Crowl wrote:

Change more non-GTY hash tables to use the new type-safe template hash table. Constify member function parameters that can be const. Correct a couple of expressions in formerly uninstantiated templates. The new code is 0.362% faster in bootstrap, with a 99.5% confidence of being faster. Tested on x86-64. Okay for trunk?

You are changing a hashtable used by fold checking, did you test with fold checking enabled?

+/* Data structures used to maintain mapping between basic blocks and
+ copies. */
+static hash_table <bb_copy_hasher> bb_original;
+static hash_table <bb_copy_hasher> bb_copy;

note that because hash_table has a constructor we now get global CTORs for all statics :( (and mx-protected local inits ...) Can you please try to remove the constructor from hash_table to avoid this overhead? (as a followup - that is, don't initialize htab)

The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll leave the rest to respective maintainers of the pieces of the compiler.

Thanks,
Richard.

Index: gcc/java/ChangeLog
2012-10-01 Lawrence Crowl cr...@google.com
* Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
(JCFDUMP_OBJS): Add dependence on hash-table.o.
(jcf-io.o): Add dependence on hash-table.h.
* jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.

Index: gcc/c/ChangeLog
2012-10-01 Lawrence Crowl cr...@google.com
* Make-lang.in (c-decl.o): Add dependence on hash-table.h.
* c-decl.c (detect_field_duplicates_hash): Change to new type-safe hash table.

Index: gcc/objc/ChangeLog
2012-10-01 Lawrence Crowl cr...@google.com
* Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
(objc-act.o): Add dependence on hash-table.h.
* objc-act.c (objc_detect_field_duplicates): Change to new type-safe hash table.

Index: gcc/ChangeLog
2012-10-01 Lawrence Crowl cr...@google.com
* Makefile.in (fold-const.o): Add dependence on hash-table.h.
(dse.o): Likewise.
(cfg.o): Likewise.
* fold-const.c (fold_checksum_tree): Change to new type-safe hash table. * (print_fold_checksum): Likewise. * cfg.c (var bb_original): Likewise. * (var bb_copy): Likewise. * (var loop_copy): Likewise. * hash-table.h (template hash_table): Constify parameters for find... and remove_elt... member functions. (hash_table::empty) Correct size expression. (hash_table::clear_slot) Correct deleted entry assignment. * dse.c (var rtx_group_table): Change to new type-safe hash table. Index: gcc/cp/ChangeLog 2012-10-01 Lawrence Crowl cr...@google.com * Make-lang.in (class.o): Add dependence on hash-table.h. (tree.o): Likewise. (semantics.o): Likewise. * class.c (fixed_type_or_null): Change to new type-safe hash table. * tree.c (verify_stmt_tree): Likewise. (verify_stmt_tree_r): Likewise. * semantics.c (struct nrv_data): Likewise. Index: gcc/java/Make-lang.in === --- gcc/java/Make-lang.in (revision 191941) +++ gcc/java/Make-lang.in (working copy) @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o java/mangle.o \ java/mangle_name.o java/builtins.o java/resource.o \ java/jcf-depend.o \ - java/jcf-path.o java/boehm.o java/java-gimplify.o + java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o java/jcf-path.o \ - java/win32-host.o java/zextract.o ggc-none.o + java/win32-host.o java/zextract.o ggc-none.o hash-table.o JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o @@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify # jcf-io.o needs $(ZLIBINC) added to cflags. CFLAGS-java/jcf-io.o += $(ZLIBINC) java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ - $(JAVA_TREE_H) java/zipfile.h + $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H) # jcf-path.o needs a -D. 
CFLAGS-java/jcf-path.o += \ Index: gcc/java/jcf-io.c === --- gcc/java/jcf-io.c (revision 191941) +++ gcc/java/jcf-io.c (working copy) @@ -31,7 +31,7 @@ The Free Software Foundation is independ #include jcf.h #include tree.h #include java-tree.h -#include hashtab.h +#include hash-table.h #include dirent.h #include zlib.h @@ -271,20 +271,34 @@ find_classfile (char *filename, JCF *jcf return open_class (filename, jcf, fd, dep_name); } -/* Returns 1 if the CLASSNAME (really a char *) matches the name - stored in TABLE_ENTRY (also a char *). */ -static int -memoized_class_lookup_eq (const void *table_entry, const void *classname) +/* Hash table
Re: [PATCH] Fix powerpc breakage, was: Add option for dumping to stderr (issue6190057)
On Tue, Oct 2, 2012 at 1:11 AM, Xinliang David Li davi...@google.com wrote: On Mon, Oct 1, 2012 at 4:05 PM, Sharad Singhai sing...@google.com wrote: Thanks for tracking down and fixing the powerpc port. The dump_kind_p () check is redundant but canonical form here. I think blocks of dump code guarded by if dump_kind_p (...) might be easier to read/maintain. I find it confusing to be honest. The redundant check serves no purpose. The check should be inlined and avoid the call to the diagnostic routine, thus speed up compile-time. We should use this pattern, especially if it guards multiple calls. Richard. David Sharad Sharad On Mon, Oct 1, 2012 at 3:45 PM, Xinliang David Li davi...@google.com wrote: On Mon, Oct 1, 2012 at 2:37 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: I tracked down some of the other code that previously used REPORT_DETAILS, and MSG_NOTE is the new way to do the same thing. This bootstraps and no unexpected errors occur during make check. Is it ok to install? 2012-10-01 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/rs6000.c (toplevel): Include dumpfile.h. (rs6000_density_test): Rework to accomidate 09-30 change by Sharad Singhai. * config/rs6000/t-rs6000 (rs6000.o): Add dumpfile.h dependency. 
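The trade-off debated in this thread (is an outer dump_kind_p () check redundant, or a cheap guard that skips the call into the printing routine?) can be sketched in isolation. The following is only an illustration under stated assumptions: every sketch_* name is a hypothetical stand-in invented here, not the dumpfile.h API, and the real dump_printf_loc may behave differently. It shows a printing routine that re-checks the kind internally, so calling it alone is safe, and a caller that hoists one cheap check over two dump calls.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message kinds; stand-ins for MSG_NOTE and friends.  */
enum { SKETCH_MSG_NOTE = 1 << 0, SKETCH_MSG_MISSED = 1 << 1 };

/* Which kinds are currently enabled (assumption: a simple bit mask).  */
static unsigned sketch_enabled_kinds = 0;

/* Counts how often the printing routine actually formatted output.  */
static int sketch_format_calls = 0;

/* Cheap predicate, small enough to be inlined at every call site.  */
static inline int
sketch_dump_kind_p (unsigned kind)
{
  return (sketch_enabled_kinds & kind) != 0;
}

/* Printing routine.  It re-checks the kind itself, which is why an
   outer guard around a single call is redundant for correctness.  */
static void
sketch_dump_printf (unsigned kind, const char *fmt, ...)
{
  va_list ap;

  if (!sketch_dump_kind_p (kind))
    return;
  sketch_format_calls++;
  va_start (ap, fmt);
  vfprintf (stderr, fmt, ap);
  va_end (ap);
}

/* One guard in front of several dump calls: when dumping is off, no
   varargs call is made at all.  */
static void
sketch_report_density (int density_pct, int cost, int penalty)
{
  if (sketch_dump_kind_p (SKETCH_MSG_NOTE))
    {
      sketch_dump_printf (SKETCH_MSG_NOTE,
                          "density %d%%, cost %d exceeds threshold, ",
                          density_pct, cost);
      sketch_dump_printf (SKETCH_MSG_NOTE,
                          "penalizing loop body cost by %d%%\n", penalty);
    }
}
```

With sketch_enabled_kinds left at 0, sketch_report_density returns after a single inlined test; with SKETCH_MSG_NOTE set, both calls format their output. Whether the guard is worth the extra line for a single call is the judgment call the thread is about.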
Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 191932) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -58,6 +58,7 @@ #include tm-constrs.h #include opts.h #include tree-vectorizer.h +#include dumpfile.h #if TARGET_XCOFF #include xcoffout.h /* get declarations of xcoff_*_section_name */ #endif @@ -3518,11 +3519,11 @@ rs6000_density_test (rs6000_cost_data *d vec_cost + not_vec_cost DENSITY_SIZE_THRESHOLD) { data-cost[vect_body] = vec_cost * (100 + DENSITY_PENALTY) / 100; - if (vect_print_dump_info (REPORT_DETAILS)) - fprintf (vect_dump, -density %d%%, cost %d exceeds threshold, penalizing -loop body cost by %d%%, density_pct, -vec_cost + not_vec_cost, DENSITY_PENALTY); + if (dump_kind_p (MSG_NOTE)) Is this check needed? Seems redundant. David + dump_printf_loc (MSG_NOTE, vect_location, +density %d%%, cost %d exceeds threshold, penalizing +loop body cost by %d%%, density_pct, +vec_cost + not_vec_cost, DENSITY_PENALTY); } } Index: gcc/config/rs6000/t-rs6000 === --- gcc/config/rs6000/t-rs6000 (revision 191932) +++ gcc/config/rs6000/t-rs6000 (working copy) @@ -26,7 +26,7 @@ rs6000.o: $(CONFIG_H) $(SYSTEM_H) corety $(OBSTACK_H) $(TREE_H) $(EXPR_H) $(OPTABS_H) except.h function.h \ output.h dbxout.h $(BASIC_BLOCK_H) toplev.h $(GGC_H) $(HASHTAB_H) \ $(TM_P_H) $(TARGET_H) $(TARGET_DEF_H) langhooks.h reload.h gt-rs6000.h \ - cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) + cfgloop.h $(OPTS_H) $(COMMON_TARGET_H) dumpfile.h rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c \ $(srcdir)/config/rs6000/rs6000-protos.h \ -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: [PATCH] Fix test breakage, was: Add option for dumping to stderr (issue6190057)
On Tue, Oct 2, 2012 at 1:31 AM, Sharad Singhai sing...@google.com wrote: Here is a patch to fix test breakage caused by r191883. Bootstrapped on x86_64 and tested with make -k check RUNTESTFLAGS=--target_board=unix/\{,-m32\}. Okay for trunk? Ok. Thanks, Richard. Thanks, Sharad 2012-10-01 Sharad Singhai sing...@google.com * tree-vect-stmts.c (vectorizable_operation): Add missing return. testsuite/Changelog * gfortran.dg/vect/vect.exp: Change verbose vectorizor dump options to fix test failures caused by r191883. * gcc.dg/tree-ssa/gen-vect-11.c: Likewise. * gcc.dg/tree-ssa/gen-vect-2.c: Likewise. * gcc.dg/tree-ssa/gen-vect-32.c: Likewise. * gcc.dg/tree-ssa/gen-vect-25.c: Likewise. * gcc.dg/tree-ssa/gen-vect-11a.c: Likewise. * gcc.dg/tree-ssa/gen-vect-26.c: Likewise. * gcc.dg/tree-ssa/gen-vect-11b.c: Likewise. * gcc.dg/tree-ssa/gen-vect-11c.c: Likewise. * gcc.dg/tree-ssa/gen-vect-28.c: Likewise. * testsuite/gcc.target/i386/vect-double-1.c: Fix test. Missing entry from r191883. Index: testsuite/gfortran.dg/vect/vect.exp === --- testsuite/gfortran.dg/vect/vect.exp (revision 191883) +++ testsuite/gfortran.dg/vect/vect.exp (working copy) @@ -26,7 +26,7 @@ set DEFAULT_VECTCFLAGS # These flags are used for all targets. lappend DEFAULT_VECTCFLAGS -O2 -ftree-vectorize -fno-vect-cost-model \ - -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats + -fdump-tree-vect-details # If the target system supports vector instructions, the default action # for a test is 'run', otherwise it's 'compile'. Save current default. 
Index: testsuite/gcc.dg/tree-ssa/gen-vect-11.c === --- testsuite/gcc.dg/tree-ssa/gen-vect-11.c (revision 191883) +++ testsuite/gcc.dg/tree-ssa/gen-vect-11.c (working copy) @@ -1,6 +1,6 @@ /* { dg-do run { target vect_cmdline_needed } } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=3 -fwrapv -fdump-tree-vect-stats } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=3 -fwrapv -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options -O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details } */ +/* { dg-options -O2 -ftree-vectorize -fwrapv -fdump-tree-vect-details -mno-sse { target { i?86-*-* x86_64-*-* } } } */ #include stdlib.h Index: testsuite/gcc.dg/tree-ssa/gen-vect-2.c === --- testsuite/gcc.dg/tree-ssa/gen-vect-2.c (revision 191883) +++ testsuite/gcc.dg/tree-ssa/gen-vect-2.c (working copy) @@ -1,6 +1,6 @@ /* { dg-do run { target vect_cmdline_needed } } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse { target { i?86-*-* x86_64-*-* } } } */ #include stdlib.h Index: testsuite/gcc.dg/tree-ssa/gen-vect-32.c === --- testsuite/gcc.dg/tree-ssa/gen-vect-32.c (revision 191883) +++ testsuite/gcc.dg/tree-ssa/gen-vect-32.c (working copy) @@ -1,6 +1,6 @@ /* { dg-do run { target vect_cmdline_needed } } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse { target { i?86-*-* 
x86_64-*-* } } } */ #include stdlib.h Index: testsuite/gcc.dg/tree-ssa/gen-vect-25.c === --- testsuite/gcc.dg/tree-ssa/gen-vect-25.c (revision 191883) +++ testsuite/gcc.dg/tree-ssa/gen-vect-25.c (working copy) @@ -1,6 +1,6 @@ /* { dg-do run { target vect_cmdline_needed } } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats } */ -/* { dg-options -O2 -ftree-vectorize -ftree-vectorizer-verbose=4 -fdump-tree-vect-stats -mno-sse { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details } */ +/* { dg-options -O2 -ftree-vectorize -fdump-tree-vect-details -mno-sse { target { i?86-*-* x86_64-*-* } } } */ #include stdlib.h Index: testsuite/gcc.dg/tree-ssa/gen-vect-11a.c === --- testsuite/gcc.dg/tree-ssa/gen-vect-11a.c (revision 191883) +++
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Mon, Oct 1, 2012 at 8:39 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Mon, Oct 1, 2012 at 1:27 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote: Your change on September 30th, breaks the powerpc port because the REPORT_DETAILS value in the enumeration is no longer there, and the rs6000_density_test function was using that. Please in the future, when you are making global changes, grep for uses of enum values in all of the machine dependent directories so we can avoid breakage like this. Also, in looking at the changes, given we are already up to 28 TDF_ flags, I would recommend immediately adding a new type that is the TDF flagword type. Thus it will be a lot simpler when we add 4 more TDF flags and have to change the type from int to HOST_WIDE_INT. Agreed that we need an abstraction here. Some TLC as well - the flags have various meanings (some control dumping, some, like TDF_TREE, seem to be unrelated - the MSG ones probably don't need the same number-space as well, not all flags are used anymore - TDF_MEMSYMS?). But yes, an abstraction is needed. But I wouldn't suggest HOST_WIDE_INT but int - uint32_t instead (possibly going uint64_t). Richard. -- Gaby
Re: abs(long long)
On Tue, Oct 2, 2012 at 4:21 AM, Marc Glisse marc.gli...@inria.fr wrote: On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: On Tue, Oct 2, 2012 at 3:57 AM, Marc Glisse marc.gli...@inria.fr wrote: (Forgot libstdc++...) Hello, here is the patch from PR54686. Several notes: * I'll have to ask experts if std::abs(unsigned) (yes, a weird thing to do, but still) is meant to return a double... don't we have a core issue about preferring unsigned - long or long long? Here I am talking of a library issue: the wording that says that there are sufficient overloads such that integer types call the double version of math functions. It is fairly obvious that it doesn't apply to abs(long) for instance which has an explicit overload. For short or unsigned, I still read it as saying that it converts to double... I understand that it is originally a library issue, but I don't think it makes sense to resolve it in isolation of that core issue. * I still don't like the configure-time _GLIBCXX_USE_INT128, I think it should use defined(__SIZEOF_INT128__), which would help other compilers. Why would that be a problem with the appropriate #define? The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work... Two things: 1. that is clearly a clang problem. I don't think it is libstdc++'s job to try to solve clang's misguided configuration and installation. 2. I am not sure you understand what I wrote: you can leave the use of the current macro the way it is if you appropriately define it in terms of what you want to change it to. * newlib has llabs, according to the doc. It would be good to know what newlib is missing for libstdc++ to detect it as C99-ready. I tested a previous version (without __STRICT_ANSI__) on x86_64-linux-gnu and Oleg Endo did a basic check on sh/newlib. I'll do a last check after the review (no point if the patch needs changing again). 
In general, I think I have a bias toward using compiler intrinsics, about which the compiler already has a lot of knowledge. More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same. Yes. Another reason is that it simplifies the implementation AND if people want to do something with the intrinsics' fallback, libstdc++ will gracefully deliver that. -- Gaby
Re: RFC: LRA for x86/x86-64 [0/9]
On 02/10/2012 10:49, Steven Bosscher wrote: On Tue, Oct 2, 2012 at 10:29 AM, Paolo Bonzini bonz...@gnu.org wrote: On 02/10/2012 09:28, Steven Bosscher wrote: My experience shows that these lists are usually 1-2 elements. Although in this case, there are pseudos with a huge number of elements (hundreds). I tried -fweb for this test because it can decrease the number of elements, but GCC (I don't know what pass) scales even worse: after 20 min of waiting and when virtual memory reached 20GB I stopped it. Ouch :-) The webizer itself never even runs, the compiler blows up somewhere during the df_analyze call from web_main. The issue here is probably in the DF_UD_CHAIN problem or in the DF_RD problem. /me is glad to have fixed fwprop when his GCC contribution time was more than 1-2 days per year... I thought you spent more time on GCC nowadays, working for Red Hat? No, I work on QEMU most of the time. :) Knowing myself, if I had GCC-related assignments you'd see me _a lot_ on upstream mailing lists! Unfortunately, the fwprop solution (actually a rewrite) was very specific to the problem and cannot be reused in other parts of the compiler. That'd be too bad... But is this really true? I thought you had something done that builds chains only for USEs reached by multiple DEFs? That's the only interesting kind for web, too. No, it's the other way round. I have a dataflow problem that recognizes USEs reached by multiple DEFs, so that I can use a dominator walk to build singleton def-use chains. It's very similar to how you build SSA, but punting instead of inserting phis. Another solution is to build factored use-def chains for web, and use them instead of RD. In the end it's not very different from regional live range splitting, since the phi functions factor out the state of the pass at loop (that is region) boundaries. I thought you had looked at FUD chains years ago? 
FWIW: part of the problem for this particular test case is that there are many registers with partial defs (vector registers) and the RD problem doesn't (and probably cannot) keep track of one partial def/use killing another partial def/use. So they are subregs of regs? Perhaps they could be represented with VEC_MERGE to break the live range: (set (reg:V4SI 94) (vec_merge:V4SI (reg:V4SI 94) (const_vector:V4SI [(const_int 0) (const_int 0) (const_int 0) (reg:SI 95)]) (const_int 7))) And then reload, or something after reload, would know how to split these when spilling V4SI to memory. Paolo
Re: [PATCH] Changes in mode switching
2012/9/30 Uros Bizjak ubiz...@gmail.com: On Thu, Sep 20, 2012 at 8:35 AM, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Sep 20, 2012 at 8:06 AM, Vladimir Yakovlev vbyakov...@gmail.com wrote: The compiler with the patch and without post_reload.patch is built and works successfully. Its only failure is the avx-vzeroupper-3 test, because of the post-reload problem. Ok, can you please elaborate a bit on this failure? Perhaps someone has an idea why reload moves unspec_volatile around? LRA will eventually replace reload in the near future [1], does LRA also move unspec_volatile vzeroupper around? [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html Uros. I tried my patch with LRA. It works fine. The test avx-vzeroupper-3 runs successfully, unspec_volatile vzeroupper is not moved around in LRA. Vladimir
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Tue, Oct 2, 2012 at 4:31 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Oct 1, 2012 at 8:39 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On Mon, Oct 1, 2012 at 1:27 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: On Mon, Oct 01, 2012 at 02:02:26PM -0400, Michael Meissner wrote: Your change on September 30th, breaks the powerpc port because the REPORT_DETAILS value in the enumeration is no longer there, and the rs6000_density_test function was using that. Please in the future, when you are making global changes, grep for uses of enum values in all of the machine dependent directories so we can avoid breakage like this. Also, in looking at the changes, given we are already up to 28 TDF_ flags, I would recommend immediately adding a new type that is the TDF flagword type. Thus it will be a lot simpler when we add 4 more TDF flags and have to change the type from int to HOST_WIDE_INT. Agreed that we need an abstraction here. Some TLC as well - the flags have various meanings (some control dumping, some, like TDF_TREE, seem to be unrelated - the MSG ones probably don't need the same number-space as well, not all flags are used anymore - TDF_MEMSYMS?). TDF_* flags weren't originally designed for those :-/ But yes, an abstraction is needed. But I wouldn't suggest HOST_WIDE_INT but int - uint32_t instead (possibly going uint64_t). That makes sense. -- Gaby
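The abstraction agreed on above (a dedicated flag-word type, so that outgrowing int means changing one typedef) might look roughly like the sketch below. All names are hypothetical and the bit positions are made up for illustration; the actual TDF_* values and whatever type GCC ends up using are not reproduced here. The sketch picks uint64_t, one of the two fixed-width options mentioned.

```c
#include <stdint.h>

/* Hypothetical flag-word type.  Widening or narrowing the flag set
   later means changing only this typedef.  */
typedef uint64_t sketch_dump_flags_t;

/* Build a single-bit flag.  The cast happens before the shift, so
   bits above 31 are representable.  */
#define SKETCH_FLAG(n) ((sketch_dump_flags_t) 1 << (n))

/* Made-up flags: one low bit, one bit that would not fit in 32 bits.  */
#define SKETCH_TDF_DETAILS SKETCH_FLAG (3)
#define SKETCH_TDF_FUTURE  SKETCH_FLAG (33)

/* Return FLAGS with BIT added.  */
static inline sketch_dump_flags_t
sketch_flags_set (sketch_dump_flags_t flags, sketch_dump_flags_t bit)
{
  return flags | bit;
}

/* Nonzero if BIT is set in FLAGS.  */
static inline int
sketch_flags_test (sketch_dump_flags_t flags, sketch_dump_flags_t bit)
{
  return (flags & bit) != 0;
}
```

Had the flag word been stored in a plain int, SKETCH_TDF_FUTURE would be silently lost on assignment; routing every use through the typedef is what makes the later widening a one-line change.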
Re: [PATCH] Changes in mode switching
On Tue, Oct 2, 2012 at 11:35 AM, Vladimir Yakovlev vbyakov...@gmail.com wrote: The compiler with the patch and without post_reload.patch is built and works successfully. Its only failure is the avx-vzeroupper-3 test, because of the post-reload problem. Ok, can you please elaborate a bit on this failure? Perhaps someone has an idea why reload moves unspec_volatile around? LRA will eventually replace reload in the near future [1], does LRA also move unspec_volatile vzeroupper around? [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html I tried my patch with LRA. It works fine. The test avx-vzeroupper-3 runs successfully, unspec_volatile vzeroupper is not moved around in LRA. Great! This also means +1 to include LRA in 4.8 from x86 maintainer. We also expect spill failure fixes and other improvements for pre-reload scheduling from LRA. Uros.
Re: [PATCH] Changes in mode switching
Will we wait for the LRA commit, or is it possible to commit the vzeroupper patch to trunk now? 2012/10/2 Uros Bizjak ubiz...@gmail.com: On Tue, Oct 2, 2012 at 11:35 AM, Vladimir Yakovlev vbyakov...@gmail.com wrote: The compiler with the patch and without post_reload.patch is built and works successfully. Its only failure is the avx-vzeroupper-3 test, because of the post-reload problem. Ok, can you please elaborate a bit on this failure? Perhaps someone has an idea why reload moves unspec_volatile around? LRA will eventually replace reload in the near future [1], does LRA also move unspec_volatile vzeroupper around? [1] http://gcc.gnu.org/ml/gcc-patches/2012-09/msg01862.html I tried my patch with LRA. It works fine. The test avx-vzeroupper-3 runs successfully, unspec_volatile vzeroupper is not moved around in LRA. Great! This also means +1 to include LRA in 4.8 from x86 maintainer. We also expect spill failure fixes and other improvements for pre-reload scheduling from LRA. Uros.
Re: [PATCH] Changes in mode switching
On Tue, Oct 2, 2012 at 12:08 PM, Vladimir Yakovlev vbyakov...@gmail.com wrote: Will we wait for the LRA commit, or is it possible to commit the vzeroupper patch to trunk now? Since we can emit vzeroupper now, we will wait for LRA. Uros.
[SH] PR 50457 - Cleanup linux-atomic
Hello, This is the patch as proposed in the PR to make libgcc/config/sh/linux-atomic use the appropriate compiler generated atomic built-in functions depending on the currently selected atomic-model. Tested on 191894 with 'make all-gcc' and by compiling code to see if the __SH_ATOMIC_MODEL_*__ defines work as expected. The new file linux-atomic.c was tested by compiling it separately and eyeballing the generated code. OK? Cheers, Oleg gcc/ChangeLog: PR target/50457 * config/sh/sh.c (parse_validate_atomic_model_option): Handle name strings in sh_atomic_model. * config/sh/sh.h (TARGET_CPU_CPP_BUILTINS): Move macro implementation to ... * config/sh/sh-c.c (sh_cpu_cpp_builtins): ... this new function. Add __SH1__ and __SH2__ defines. Add __SH_ATOMIC_MODEL_*__ define. * config/sh/sh-protos.h (sh_atomic_model): Add name and cdef_name variables. (sh_cpu_cpp_builtins): Declare new function. libgcc/ChangeLog: PR target/50457 * config/sh/linux-atomic.S: Delete. * config/sh/linux-atomic.c: New. * config/sh/t-linux (LIB2ADD): Replace linux-atomic.S with linux-atomic.c. Add cflags to disable warnings. Index: libgcc/config/sh/t-linux === --- libgcc/config/sh/t-linux (revision 191894) +++ libgcc/config/sh/t-linux (working copy) @@ -1,9 +1,13 @@ LIB1ASMFUNCS_CACHE = _ic_invalidate _ic_invalidate_array -LIB2ADD = $(srcdir)/config/sh/linux-atomic.S +LIB2ADD = $(srcdir)/config/sh/linux-atomic.c HOST_LIBGCC2_CFLAGS += -mieee -DNO_FPSCR_VALUES +# Silence atomic built-in related warnings in linux-atomic.c. +# Unfortunately the conflicting types warning can't be disabled selectively. +HOST_LIBGCC2_CFLAGS += -w -Wno-sync-nand + # Override t-slibgcc-elf-ver to export some libgcc symbols with # the symbol versions that glibc used, and hide some lib1func # routines which should not be called via PLT. 
We have to create Index: libgcc/config/sh/linux-atomic.S === --- libgcc/config/sh/linux-atomic.S (revision 191894) +++ libgcc/config/sh/linux-atomic.S (working copy) @@ -1,223 +0,0 @@ -/* Copyright (C) 2006, 2008, 2009 Free Software Foundation, Inc. - - This file is part of GCC. - - GCC is free software; you can redistribute it and/or modify - it under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - GCC is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - GNU General Public License for more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - http://www.gnu.org/licenses/. */ - - -!! Linux specific atomic routines for the Renesas / SuperH SH CPUs. -!! Linux kernel for SH3/4 has implemented the support for software -!! atomic sequences. - -#define FUNC(X) .type X,@function -#define HIDDEN_FUNC(X) FUNC(X); .hidden X -#define ENDFUNC0(X) .Lfe_##X: .size X,.Lfe_##X-X -#define ENDFUNC(X) ENDFUNC0(X) - -#if ! 
__SH5__ - -#define ATOMIC_TEST_AND_SET(N,T,EXT) \ - .global __sync_lock_test_and_set_##N; \ - HIDDEN_FUNC(__sync_lock_test_and_set_##N); \ - .align 2; \ -__sync_lock_test_and_set_##N:; \ - mova 1f, r0; \ - nop; \ - mov r15, r1; \ - mov #(0f-1f), r15; \ -0: mov.##T @r4, r2; \ - mov.##T r5, @r4; \ -1: mov r1, r15; \ - rts; \ - EXT r2, r0; \ - ENDFUNC(__sync_lock_test_and_set_##N) - -ATOMIC_TEST_AND_SET (1,b,extu.b) -ATOMIC_TEST_AND_SET (2,w,extu.w) -ATOMIC_TEST_AND_SET (4,l,mov) - -#define ATOMIC_COMPARE_AND_SWAP(N,T,EXTS,EXT) \ - .global __sync_val_compare_and_swap_##N; \ - HIDDEN_FUNC(__sync_val_compare_and_swap_##N); \ - .align 2; \ -__sync_val_compare_and_swap_##N:; \ - mova 1f, r0; \ - EXTS r5, r5; \ - mov r15, r1; \ - mov #(0f-1f), r15; \ -0: mov.##T @r4, r2; \ - cmp/eq r2, r5; \ - bf 1f; \ - mov.##T r6, @r4; \ -1: mov r1, r15; \ - rts; \ - EXT r2, r0; \ - ENDFUNC(__sync_val_compare_and_swap_##N) - -ATOMIC_COMPARE_AND_SWAP (1,b,exts.b,extu.b) -ATOMIC_COMPARE_AND_SWAP (2,w,exts.w,extu.w) -ATOMIC_COMPARE_AND_SWAP (4,l,mov,mov) - -#define ATOMIC_BOOL_COMPARE_AND_SWAP(N,T,EXTS) \ - .global __sync_bool_compare_and_swap_##N; \ - HIDDEN_FUNC(__sync_bool_compare_and_swap_##N); \ - .align 2; \ -__sync_bool_compare_and_swap_##N:; \ - mova 1f, r0; \ - EXTS r5, r5; \ - mov r15, r1; \ - mov #(0f-1f), r15; \ -0: mov.##T @r4, r2; \ - cmp/eq r2,
[Ada] Couple of minor tweaks
This avoids applying the NRV optimization for small structures and creating useless elaboration variables for loops. Tested on x86_64-suse-linux, applied on the mainline and 4.7 branch. 2012-10-02 Eric Botcazou ebotca...@adacore.com * gcc-interface/decl.c (elaborate_expression_1): Use the variable for bounds of loop iteration scheme only for locally defined subtypes. * gcc-interface/trans.c (gigi): Fix formatting. (build_return_expr): Apply the NRV optimization only for BLKmode. -- Eric Botcazou Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 191953) +++ gcc-interface/decl.c (working copy) @@ -6165,6 +6165,7 @@ elaborate_expression_1 (tree gnu_expr, E use_variable = expr_variable_p (expr_global_p || (!optimize + definition Is_Itype (gnat_entity) Nkind (Associated_Node_For_Itype (gnat_entity)) == N_Loop_Parameter_Specification)); Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 191953) +++ gcc-interface/trans.c (working copy) @@ -332,7 +332,7 @@ gigi (Node_Id gnat_root, int max_gnat_no #ifdef ORDINARY_MAP_INSTANCE map = LINEMAPS_ORDINARY_MAP_AT (line_table, i); if (flag_debug_instances) -ORDINARY_MAP_INSTANCE(map) = file_info_ptr[i].Instance; + ORDINARY_MAP_INSTANCE (map) = file_info_ptr[i].Instance; #endif linemap_line_start (line_table, file_info_ptr[i].Num_Source_Lines, 252); linemap_position_for_column (line_table, 252 - 1); @@ -3158,6 +3158,7 @@ build_return_expr (tree ret_obj, tree re if (optimize AGGREGATE_TYPE_P (operation_type) !TYPE_IS_FAT_POINTER_P (operation_type) + TYPE_MODE (operation_type) == BLKmode aggregate_value_p (operation_type, current_function_decl)) { /* Recognize the temporary created for a return value with variable
PATCH trunk: gengtype honoring mark_hook-s inside struct inside union-s
Hello All, As I observed in http://gcc.gnu.org/ml/gcc/2010-07/msg00248.html and in http://gcc.gnu.org/ml/gcc/2012-10/msg3.html the mark_hook GTY annotation is sometimes incorrectly ignored by gengtype. The example in http://gcc.gnu.org/ml/gcc/2012-10/msg3.html demonstrates that incorrect behavior of gengtype (both with gengtype from GCC 4.7, and with the current trunk's gengtype). For simplicity, here it is again:

  /* file tmarkh.h */
  #define MYUTAG 1
  union GTY ((desc("%0.u_int"))) myutest_un {
    int GTY((skip)) u_int;
    struct mytest_st GTY ((tag("MYUTAG"))) u_mytest;
  };
  static GTY(()) union myutest_un *myutestptr;
  static inline void mymarker(struct mytest_st*s) { s->myflag = 1; }
  /* eof tmarkh.h */

When running gengtype (the one from the trunk, or the gcc-4.7 one) with gengtype -D -v -r gtype.state -P _g-tmarkh.h tmarkh.h you can observe that the generated _g-tmarkh.h doesn't contain any call to mymarker. If the static variable (here myutestptr) is declared with the struct mytest_st* type, the marker is emitted. The reason for that bug is that for GTY-ed union members which are themselves GTY-ed structs, the marking of the nested struct is generated inline (for the union), and in that case the mark_hook annotation was not used. The attached patch to trunk svn rev 191972 solves this issue (with it, the generated _g-tmarkh.h correctly calls mymarker). The gcc/ChangeLog entry is: 2012-10-02 Basile Starynkevitch bas...@starynkevitch.net * gengtype.c (walk_type): Emit mark_hook when inside a struct of a union member. Ok for trunk? Regards. 
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/ email: basileatstarynkevitchdotnet mobile: +33 6 8501 2359 8, rue de la Faiencerie, 92340 Bourg La Reine, France *** opinions {are only mines, sont seulement les miennes} *** Index: gcc-trunk-bstarynk/gcc/gengtype.c === --- gcc-trunk-bstarynk/gcc/gengtype.c (revision 191972) +++ gcc-trunk-bstarynk/gcc/gengtype.c (working copy) @@ -2810,6 +2810,7 @@ walk_type (type_p t, struct walk_type_data *d) const char *oldval = d-val; const char *oldprevval1 = d-prev_val[1]; const char *oldprevval2 = d-prev_val[2]; + const char *structmarkhook = NULL; const int union_p = t-kind == TYPE_UNION; int seen_default_p = 0; options_p o; @@ -2833,7 +2834,14 @@ walk_type (type_p t, struct walk_type_data *d) if (!desc strcmp (o-name, desc) == 0 o-kind == OPTION_STRING) desc = o-info.string; + else if (!structmarkhook strcmp(o-name, mark_hook) == 0 + o-kind == OPTION_STRING) + structmarkhook = o-info.string; + if (structmarkhook) + oprintf (d-of, %*s/*structmarkhook %s */ %s (%s));\n, + d-indent, , t-u.s.tag, structmarkhook, oldval); + d-prev_val[2] = oldval; d-prev_val[1] = oldprevval2; if (union_p)
Re: RFC: LRA for x86/x86-64 [7/9]
Hi Vlad,

Vladimir Makarov vmaka...@redhat.com writes:
> This is the major patch containing all new files. The patch also adds
> necessary calls to LRA from IRA. As the patch is too big, it continues
> in the next email.
>
> 2012-09-27  Vladimir Makarov  vmaka...@redhat.com
>
>     * Makefile.in (LRA_INT_H): New.
>     (OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o,
>     lra-constraints.o, lra-eliminations.o, lra-lives.o, and
>     lra-spills.o.
>     (ira.o): Add dependence on lra.h.
>     (lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New
>     entries.
>     (lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto.
>     * ira.c: Include lra.h.
>     (ira_init_once, ira_init, ira_finish_once): Call lra_start_once,
>     lra_init, lra_finish_once in anyway.
>     (lra_in_progress): Remove.
>     (do_reload): Call LRA.
>     * lra.h: New.
>     * lra-int.h: Ditto.
>     * lra.c: Ditto.
>     * lra-assigns.c: Ditto.
>     * lra-constraints.c: Ditto.
>     * lra-coalesce.c: Ditto.
>     * lra-eliminations.c: Ditto.
>     * lra-lives.c: Ditto.
>     * lra-spills.c: Ditto.
>     * doc/passes.texi: Describe LRA pass.

A non-authoritative review of the documentation and lra-eliminations.c:

> +LRA is different from the reload pass in LRA division on small,
> +manageable, and separated sub-tasks. All LRA transformations and
> +decisions are reflected in RTL as more as possible. Instruction
> +constraints as a primary source of the info and that minimizes number
> +of target-depended macros/hooks.
> +LRA is run for the targets it were ported.

Suggest something like:

  Unlike the reload pass, intermediate LRA decisions are reflected in
  RTL as much as possible. This reduces the number of target-dependent
  macros and hooks, leaving instruction constraints as the primary
  source of control.

  LRA is run on targets for which TARGET_LRA_P returns true.

> +/* The virtual registers (like argument and frame pointer) are widely
> + used in RTL. Virtual registers should be changed by real hard
> + registers (like stack pointer or hard frame pointer) plus some
> + offset. The offsets are changed usually every time when stack is
> + expanded. We know the final offsets only at the very end of LRA.

I always think of virtual as [FIRST_VIRTUAL_REGISTER,
LAST_VIRTUAL_REGISTER]. Maybe eliminable would be better? E.g.

  /* Eliminable registers (like a soft argument or frame pointer) are
     widely used in RTL. These eliminable registers should be replaced
     by real hard registers (like the stack pointer or hard frame
     pointer) plus some offset. The offsets usually change whenever the
     stack is expanded. We know the final offsets only at the very end
     of LRA.

> + We keep RTL code at most time in such state that the virtual
> + registers can be changed by just the corresponding hard registers
> + (with zero offsets) and we have the right RTL code. To achieve this
> + we should add initial offset at the beginning of LRA work and update
> + offsets after each stack expanding. But actually we update virtual
> + registers to the same virtual registers + corresponding offsets
> + before every constraint pass because it affects constraint
> + satisfaction (e.g. an address displacement became too big for some
> + target).

Suggest:

  Within LRA, we usually keep the RTL in such a state that the
  eliminable registers can be replaced by just the corresponding hard
  register (without any offset). To achieve this we should add the
  initial elimination offset at the beginning of LRA and update the
  offsets whenever the stack is expanded. We need to do this before
  every constraint pass because the choice of offset often affects
  whether a particular address or memory constraint is satisfied.

> + The final change of virtual registers to the corresponding hard
> + registers are done at the very end of LRA when there were no change
> + in offsets anymore:
> +
> + fp + 42 = sp + 42

virtual => eliminable if the above is OK.

> + Such approach requires a few changes in the rest GCC code because
> + virtual registers are not recognized as real ones in some
> + constraints and predicates. Fortunately, such changes are
> + small. */

Not sure whether the last paragraph really belongs in the code, since
it's more about the reload-LRA transition.

> + /* Nonzero if this elimination can be done. */
> + bool can_eliminate;
> + /* CAN_ELIMINATE since the last check. */
> + bool prev_can_eliminate;

AFAICT, these two fields are (now) only ever assigned at the same time,
via init_elim_table and setup_can_eliminate. Looks like we can do
without prev_can_eliminate. (And the way that the pass doesn't need to
differentiate between the raw CAN_ELIMINABLE value and the processed
value feels nice and reassuring.)

> +/* Map: 'from regno' -> to the current elimination, NULL otherwise.
> + The elimination table may contains more one elimination of a hard
Re: RFC: LRA for x86/x86-64 [7/9]
On 09/28/2012 12:59 AM, Vladimir Makarov wrote:
> + We keep RTL code at most time in such state that the virtual
> + registers can be changed by just the corresponding hard registers
> + (with zero offsets) and we have the right RTL code. To achieve this
> + we should add initial offset at the beginning of LRA work and update
> + offsets after each stack expanding. But actually we update virtual
> + registers to the same virtual registers + corresponding offsets
> + before every constraint pass because it affects constraint
> + satisfaction (e.g. an address displacement became too big for some
> + target).
> +
> + The final change of virtual registers to the corresponding hard
> + registers are done at the very end of LRA when there were no change
> + in offsets anymore:
> +
> + fp + 42 = sp + 42

Let me try to understand this. We have (mem (fp)), which we rewrite to
(mem (fp + 42)), but this is intended to represent (mem (sp + 42))?
Wouldn't this fail on any target which has different addressing ranges
for SP and FP?

Bernd
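Bernd's concern can be made concrete with a small sketch. The displacement ranges below are made up and do not describe any real target, and `disp_ok`, `survives_elimination`, `FP_BASE` and `SP_BASE` are illustrative names rather than GCC interfaces. The only point is that an address validated with fp as its base need not stay valid once sp is substituted with the same displacement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical base registers; these are not GCC identifiers.  */
enum base_reg { FP_BASE, SP_BASE };

/* Assumed displacement ranges for an imaginary target:
   fp-relative addresses accept [-128, 127];
   sp-relative addresses accept [0, 1020] in multiples of 4.  */
bool
disp_ok (enum base_reg base, long disp)
{
  switch (base)
    {
    case FP_BASE:
      return disp >= -128 && disp <= 127;
    case SP_BASE:
      return disp >= 0 && disp <= 1020 && disp % 4 == 0;
    }
  return false;
}

/* Return true if an address that was checked as fp + DISP is still
   valid when the base is later replaced by sp with the same DISP.  */
bool
survives_elimination (long disp)
{
  assert (disp_ok (FP_BASE, disp));
  return disp_ok (SP_BASE, disp);
}
```

Under these assumed ranges, a displacement of 40 is valid for both bases, while 42 is accepted for fp and rejected for sp.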
Re: [SH] PR 51244 - Handle T bit - 0x7FFFFFFF / 0x80000000
Oleg Endo oleg.e...@t-online.de wrote:
> This handles the case where the T bit is stored to a reg as the value
> 0x7FFF or 0x8000.
> Tested on rev 191894 with
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb,-m2a/-mb,-m4/-ml,-m4/-mb,-m4a/-ml,-m4a/-mb}"
> and no new failures.
> OK?

OK.

Regards,
	kaz
Re: [SH] PR 50457 - Cleanup linux-atomic
Oleg Endo oleg.e...@t-online.de wrote:
> This is the patch as proposed in the PR to make
> libgcc/config/sh/linux-atomic use the appropriate compiler generated
> atomic built-in functions depending on the currently selected
> atomic-model.
> Tested on 191894 with 'make all-gcc' and by compiling code to see if
> the __SH_ATOMIC_MODEL_*__ defines work as expected. The new file
> linux-atomic.c was tested by compiling it separately and eyeballing
> the generated code.
> OK?

OK.

Regards,
	kaz
Re: [Patch] Fix PR53397
On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote:

> Hi,
>
> The below patch fixes the FFT/Scimark regression caused by useless
> prefetch generation.
>
> This fix tries to make prefetch less aggressive by prefetching arrays
> in the inner loop, when the step is invariant in the entire loop nest.
>
> GCC currently tries to prefetch invariant steps when they are in the
> inner loop. But does not check if the step is variant in outer loops.
>
> In the scimark FFT case, the trip count of the inner loop varies by a
> non constant step, which is invariant in the inner loop. But the step
> variable is varying in outer loop. This makes inner loop trip count
> small (at run time varies sometimes as small as 1 iteration)
>
> Prefetching ahead x iteration when the inner loop trip count is
> smaller than x leads to useless prefetches.
>
> Flag used: -O3 -march=amdfam10
>
> Before
> ** **
> ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
> ** for details. (Results can be submitted to p...@nist.gov) **
> ** **
> Using 2.00 seconds min time per kenel.
> Composite Score:          550.50
> FFT Mflops:                38.66  (N=1024)
> SOR Mflops:               617.61  (100 x 100)
> MonteCarlo: Mflops:       173.74
> Sparse matmult Mflops:    675.63  (N=1000, nz=5000)
> LU Mflops:               1246.88  (M=100, N=100)
>
> After
> ** **
> ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
> ** for details. (Results can be submitted to p...@nist.gov) **
> ** **
> Using 2.00 seconds min time per kenel.
> Composite Score:          639.20
> FFT Mflops:               479.19  (N=1024)
> SOR Mflops:               617.61  (100 x 100)
> MonteCarlo: Mflops:       173.18
> Sparse matmult Mflops:    679.13  (N=1000, nz=5000)
> LU Mflops:               1246.88  (M=100, N=100)
>
> GCC regression make check -k passes with x86_64-unknown-linux-gnu
> New tests that PASS:
>
> gcc.dg/pr53397-1.c scan-assembler prefetcht0
> gcc.dg/pr53397-1.c scan-tree-dump aprefetch Issued prefetch
> gcc.dg/pr53397-1.c (test for excess errors)
> gcc.dg/pr53397-2.c scan-tree-dump aprefetch loop variant step
> gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching
> gcc.dg/pr53397-2.c (test for excess errors)
>
> Checked CPU2006 and polyhedron on latest AMD processor, no regressions
> noted.
>
> Ok to commit in trunk?
>
> regards,
> Venkat
>
> gcc/ChangeLog
> +2012-10-01 Venkataramanan Kumar venkataramanan.ku...@amd.com
> +
> + * tree-ssa-loop-prefetch.c (gather_memory_references_ref):$
> + Perform non constant step prefetching in inner loop, only $
> + when it is invariant in the entire loop nest. $
> + * testsuite/gcc.dg/pr53397-1.c: New test case $
> + Checks we are prefecthing for loop invariant steps$
> + * testsuite/gcc.dg/pr53397-2.c: New test case$
> + Checks we are not prefecthing for loop variant steps
> +
>
> Index: gcc/testsuite/gcc.dg/pr53397-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr53397-1.c (revision 0)
> +++ gcc/testsuite/gcc.dg/pr53397-1.c (revision 0)
> @@ -0,0 +1,28 @@
> +/* Prefetching when the step is loop invariant. */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 -fdump-tree-aprefetch-details" } */
> +
> +
> +double data[16384];
> +void prefetch_when_non_constant_step_is_invariant(int step, int n)
> +{
> + int a;
> + int b;
> + for (a = 1; a < step; a++) {
> +for (b = 0; b < n; b += 2 * step) {
> +
> + int i = 2*(b + a);
> + int j = 2*(b + a + step);
> +
> +
> + data[j] = data[i];
> + data[j+1] = data[i+1];
> +}
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump "Issued prefetch" "aprefetch" } } */
> +/* { dg-final { scan-assembler "prefetcht0" } } */

This (and the case below) needs to be adjusted to only run on the
appropriate hardware. See for example gcc.dg/tree-ssa/prefetch-8.c for
how to do this.

> +/* { dg-final { cleanup-tree-dump "aprefetch" } } */
>
> Index: gcc/testsuite/gcc.dg/pr53397-2.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr53397-2.c (revision 0)
> +++ gcc/testsuite/gcc.dg/pr53397-2.c (revision 0)
> @@ -0,0 +1,29 @@
> +/* Not prefetching when the step is loop variant. */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fprefetch-loop-arrays -fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 --param simultaneous-prefetches=10 -fdump-tree-aprefetch-details" } */
> +
> +
> +double data[16384];
> +void donot_prefetch_when_non_constant_step_is_variant(int step, int n)
> +{
> + int a;
> + int b;
> + for (a = 1; a < step; a++,step*=2) {
[Ada] Get rid of internal use of N_Return_Statement
This patch goes almost all the way in removing N_Return_Statement, and replacing it by N_Simple_Return_Statement. No test, since no functional effect. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-02 Robert Dewar de...@adacore.com * sinfo.adb, sinfo.ads, sem_util.adb, sem_util.ads, types.h, exp_ch4.adb, exp_ch6.adb: Get rid of internal use of N_Return_Statement. Index: sinfo.adb === --- sinfo.adb (revision 191972) +++ sinfo.adb (working copy) @@ -370,7 +370,7 @@ begin pragma Assert (False or else NT (N).Nkind = N_Extended_Return_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); return Flag5 (N); end By_Ref; @@ -427,7 +427,7 @@ (N : Node_Id) return Boolean is begin pragma Assert (False -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); return Flag18 (N); end Comes_From_Extended_Return_Statement; @@ -958,7 +958,7 @@ or else NT (N).Nkind = N_Extended_Return_Statement or else NT (N).Nkind = N_Function_Call or else NT (N).Nkind = N_Procedure_Call_Statement -or else NT (N).Nkind = N_Return_Statement +or else NT (N).Nkind = N_Simple_Return_Statement or else NT (N).Nkind = N_Type_Conversion); return Flag13 (N); end Do_Tag_Check; @@ -1234,7 +1234,7 @@ or else NT (N).Nkind = N_Pragma_Argument_Association or else NT (N).Nkind = N_Qualified_Expression or else NT (N).Nkind = N_Raise_Statement -or else NT (N).Nkind = N_Return_Statement +or else NT (N).Nkind = N_Simple_Return_Statement or else NT (N).Nkind = N_Type_Conversion or else NT (N).Nkind = N_Unchecked_Expression or else NT (N).Nkind = N_Unchecked_Type_Conversion); @@ -2537,7 +2537,7 @@ or else NT (N).Nkind = N_Allocator or else NT (N).Nkind = N_Extended_Return_Statement or else NT (N).Nkind = N_Free_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); return Node2 (N); end Procedure_To_Call; @@ -2670,7 +2670,7 @@ begin pragma Assert (False or else NT 
(N).Nkind = N_Extended_Return_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); return Node5 (N); end Return_Statement_Entity; @@ -2862,7 +2862,7 @@ or else NT (N).Nkind = N_Allocator or else NT (N).Nkind = N_Extended_Return_Statement or else NT (N).Nkind = N_Free_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); return Node1 (N); end Storage_Pool; @@ -3443,7 +3443,7 @@ begin pragma Assert (False or else NT (N).Nkind = N_Extended_Return_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); Set_Flag5 (N, Val); end Set_By_Ref; @@ -3500,7 +3500,7 @@ (N : Node_Id; Val : Boolean := True) is begin pragma Assert (False -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); Set_Flag18 (N, Val); end Set_Comes_From_Extended_Return_Statement; @@ -4031,7 +4031,7 @@ or else NT (N).Nkind = N_Extended_Return_Statement or else NT (N).Nkind = N_Function_Call or else NT (N).Nkind = N_Procedure_Call_Statement -or else NT (N).Nkind = N_Return_Statement +or else NT (N).Nkind = N_Simple_Return_Statement or else NT (N).Nkind = N_Type_Conversion); Set_Flag13 (N, Val); end Set_Do_Tag_Check; @@ -4298,7 +4298,7 @@ or else NT (N).Nkind = N_Pragma_Argument_Association or else NT (N).Nkind = N_Qualified_Expression or else NT (N).Nkind = N_Raise_Statement -or else NT (N).Nkind = N_Return_Statement +or else NT (N).Nkind = N_Simple_Return_Statement or else NT (N).Nkind = N_Type_Conversion or else NT (N).Nkind = N_Unchecked_Expression or else NT (N).Nkind = N_Unchecked_Type_Conversion); @@ -5601,7 +5601,7 @@ or else NT (N).Nkind = N_Allocator or else NT (N).Nkind = N_Extended_Return_Statement or else NT (N).Nkind = N_Free_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind = N_Simple_Return_Statement); Set_Node2 (N, Val); -- semantic field, no parent set end 
Set_Procedure_To_Call; @@ -5734,7 +5734,7 @@ begin pragma Assert (False or else NT (N).Nkind = N_Extended_Return_Statement -or else NT (N).Nkind = N_Return_Statement); +or else NT (N).Nkind =
[Ada] Ada/C++ missing call to constructor with defaults
When the type of an object is a CPP type and the object initialization
requires calling its default C++ constructor, the Ada compiler did not
generate the call to a C++ constructor which has all parameters with
defaults (and hence it covers the default C++ constructor).

The following test must now compile and execute well.

// c_class.h
class Tester {
 public:
  Tester(unsigned int a_num = 5, char* a_className = 0);
  virtual int dummy();
};

// c_class.cc
#include "c_class.h"
#include <iostream>

Tester::Tester(unsigned int a_num, char* a_className) {
  std::cout << "ctor Tester called " << a_num << ":";
  if (a_className == 0) {
    std::cout << "null";
  }
  std::cout << std::endl;
}

int Tester::dummy() { }

-- c_class_h.ads
pragma Ada_2005;
pragma Style_Checks (Off);

with Interfaces.C; use Interfaces.C;
with Interfaces.C.Strings;

package c_class_h is
   package Class_Tester is
      type Tester is tagged limited record
         null;
      end record;
      pragma Import (CPP, Tester);

      function New_Tester
        (a_num       : unsigned := 5;
         a_className : Interfaces.C.Strings.chars_ptr :=
           Interfaces.C.Strings.Null_Ptr) return Tester;
      pragma CPP_Constructor (New_Tester, "_ZN6TesterC1EjPc");

      function dummy (this : access Tester) return int;
      pragma Import (CPP, dummy, "_ZN6Tester5dummyEv");
   end;
   use Class_Tester;
end c_class_h;

-- main.adb
with c_class_h; use c_class_h;
procedure Main is
   use Class_Tester;
   Obj : Tester;  -- Test
   pragma Unreferenced (Obj);
begin
   null;
end main;

project Ada2Cppc is
   for Languages use ("Ada", "C++");
   for Main use ("main.adb");
   package Naming is
      for Implementation_Suffix ("C++") use ".cc";
   end Naming;
   for Source_Dirs use (".");
   for Object_Dir use "obj";
   package Compiler is
      for Default_Switches ("ada") use ("-gnat05");
   end Compiler;
   package Builder is
      for Default_Switches ("ada") use ("-g");
   end Builder;
   package Ide is
      for Compiler_Command ("ada") use "gnatmake";
      for Compiler_Command ("c") use "gcc";
   end Ide;
end Ada2Cppc;

Command:
  mkdir obj
  gprclean -q -P ada2cppc.gpr
  gprbuild -q -P ada2cppc.gpr
  obj/main

Output:
  ctor Tester called 5:null

Tested on
x86_64-pc-linux-gnu, committed on trunk 2012-10-02 Javier Miranda mira...@adacore.com * exp_disp.adb (Set_CPP_Constructors): Handle constructor with default parameters that covers the default constructor. Index: exp_disp.adb === --- exp_disp.adb(revision 191972) +++ exp_disp.adb(working copy) @@ -8537,6 +8537,10 @@ Body_Stmts: List_Id; Init_Tags_List: List_Id; + Covers_Default_Constructor : Entity_Id := Empty; + + -- Start of processing for Set_CPP_Constructor + begin pragma Assert (Is_CPP_Class (Typ)); @@ -8622,7 +8626,9 @@ Defining_Identifier = Make_Defining_Identifier (Loc, Chars (Defining_Identifier (P))), - Parameter_Type = New_Copy_Tree (Parameter_Type (P; + Parameter_Type = +New_Copy_Tree (Parameter_Type (P)), + Expression = New_Copy_Tree (Expression (P; Next (P); end loop; end if; @@ -8713,6 +8719,17 @@ Discard_Node (Wrapper_Body_Node); Set_Init_Proc (Typ, Wrapper_Id); + +-- If this constructor has parameters and all its parameters +-- have defaults then it covers the default constructor. The +-- semantic analyzer ensures that only one constructor with +-- defaults covers the default constructor. + +if Present (Parameter_Specifications (Parent (E))) + and then Needs_No_Actuals (E) +then + Covers_Default_Constructor := Wrapper_Id; +end if; end if; Next_Entity (E); @@ -8725,6 +8742,46 @@ Set_Is_Abstract_Type (Typ); end if; + -- Handle constructor that has all its parameters with defaults and + -- hence it covers the default constructor. We generate a wrapper IP + -- which calls the covering constructor. + + if Present (Covers_Default_Constructor) then + Loc := Sloc (Covers_Default_Constructor); + + Body_Stmts := New_List ( + Make_Procedure_Call_Statement (Loc, + Name = + New_Reference_To (Covers_Default_Constructor, Loc), + Parameter_Associations = New_List ( + Make_Identifier (Loc, Name_uInit; + + Wrapper_Id := + Make_Defining_Identifier (Loc, Make_Init_Proc_Name (Typ)); + + Wrapper_Body_Node := + Make_Subprogram_Body (Loc, + Specification
Re: vec_cond_expr adjustments
On Mon, Oct 1, 2012 at 5:57 PM, Marc Glisse marc.gli...@inria.fr wrote:

[merging both threads, thanks for the answers]

On Mon, 1 Oct 2012, Richard Guenther wrote:

optabs should be fixed instead, an is_gimple_val condition is implicitly
val != 0.

For vectors, I think it should be val < 0 (with an appropriate cast of
val to a signed integer vector type if necessary). Or (val & highbit) != 0,
but that's longer.

I don't think so. Throughout the compiler we generally assume false == 0
and anything else is true. (yes, for FP there is STORE_FLAG_VALUE, but
its scope is quite limited - if we want sth similar for vectors we'd
have to invent it).

See below.

If we for example have

  predicate = a < b;
  x = predicate ? d : e;
  y = predicate ? f : g;

we ideally want to re-use the predicate computation on targets where
that would be optimal (and combine should be able to recover the case
where it is not).

That I don't understand. The vcond instruction implemented by targets
takes as arguments d, e, cmp, a, b and emits the comparison itself. I
don't see how I can avoid sending to the targets both (d,e,<,a,b) and
(f,g,<,a,b). They will notice eventually that a<b is computed twice and
remove one of the two, but I don't see how to do that in optabs.c. Or I
can compute x = a < b, use x < 0 as the comparison passed to the
targets, and expect targets (those for which it is true) to recognize
that < 0 is useless in a vector condition (PR54700), or is useless on a
comparison result.

But that's a limitation of how vcond works. ISTR there is/was a vselect
instruction as well, taking a mask and two vectors to select from. At
least that's how vcond works internally for some sub-targets.

vselect seems to only appear in config/. Would it be defined as:
vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a
pattern in .md files and let combine handle it, although it might be one
instruction too long for that (and if m is x<y, ~m might look like
x>=y). Or would it match the OpenCL select: For each component of a
vector type, result[i] = if MSB of c[i] is set ? b[i] : a[i].? Or the
pattern with & and | but with a precondition that the value of each
element of the mask must be 0 or ±1?

I don't find vcond that bad, as long as targets check for trivial
comparisons in the expansion (what trivial means may depend on the
platform). It is quite flexible for targets.

Well, ok.

On Mon, 1 Oct 2012, Richard Guenther wrote:

   tmp = fold_build2_loc (gimple_location (def_stmt), code,
 - boolean_type_node,
 + TREE_TYPE (cond),

That's obvious.

Ok, I'll test and commit that line separately.

 + if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
 +{
 + int count = VECTOR_CST_NELTS (op0);
 + tree *elts = XALLOCAVEC (tree, count);
 + gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
 +
 + for (int i = 0; i < count; i++)
 + {
 + tree elem_type = TREE_TYPE (type);
 + tree elem0 = VECTOR_CST_ELT (op0, i);
 + tree elem1 = VECTOR_CST_ELT (op1, i);
 +
 + elts[i] = fold_relational_const (code, elem_type,
 + elem0, elem1);
 +
 + if(elts[i] == NULL_TREE)
 + return NULL_TREE;
 +
 + elts[i] = fold_negate_const (elts[i], elem_type);

I think you need to invent something new similar to STORE_FLAG_VALUE or
use STORE_FLAG_VALUE here. With the above you try to map {0, 1} to
{0, -1} which is only true if the operation on the element types returns
{0, 1} (thus, STORE_FLAG_VALUE is 1).

Er, seems to me that constant folding of a scalar comparison in the
front/middle-end only returns {0, 1}.

The point is we need to define some semantics for vector comparison
results. One variant is to make it target independent which in turn
would inhibit (or make it more difficult) to exploit some target
features. You for example use {0, -1} for truth values - probably to
exploit target features - even though the most natural middle-end way
would be to use {0, 1} as for everything else (caveat: there may be both
signed and unsigned bools, we don't allow vector components with
non-mode precision, thus you could argue that a signed bool : 1 is just
sign-extended for your solution). A different variant is to make it
target dependent to leverage optimization opportunities - that's why
STORE_FLAG_VALUE exists. For example with vector comparisons a < v
result, when performing bitwise operations on it, you either have to
make the target expand code to produce {0, -1} even if the natural
compare instruction would, say, produce {0, 0x8} - or not constrain the
possible values of its result (like forwprop would do with your patch).
In general we want constant folding to yield the same results as if the
HW carried out the operation to make -O0 code not diverge from
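The two select semantics raised in this thread can be compared on plain arrays. This is an illustrative sketch, not GCC code: `select_bitwise` follows the `(a&m)|(b&~m)` definition, `select_msb` picks by the most significant bit of each mask element in the spirit of the OpenCL rule quoted above, and all three function names are made up here. Both helpers use the same operand order (mask, a, b), which is not OpenCL's. The sketch shows that the two forms agree when every mask element is 0 or -1 and can differ for other mask values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise select: each result bit comes from a where the mask bit is
   set and from b where it is clear.  */
void
select_bitwise (const int32_t *m, const int32_t *a, const int32_t *b,
                int32_t *out, size_t n)
{
  assert (m && a && b && out);
  for (size_t i = 0; i < n; i++)
    out[i] = (a[i] & m[i]) | (b[i] & ~m[i]);
}

/* Sign-bit select: element i comes from a if the most significant bit
   of m[i] is set, otherwise from b.  */
void
select_msb (const int32_t *m, const int32_t *a, const int32_t *b,
            int32_t *out, size_t n)
{
  assert (m && a && b && out);
  for (size_t i = 0; i < n; i++)
    out[i] = (m[i] < 0) ? a[i] : b[i];
}

/* Return 1 if both selects produce identical results for mask m,
   0 otherwise.  Limited to 16 elements to keep the sketch simple.  */
int
selects_agree (const int32_t *m, const int32_t *a, const int32_t *b,
               size_t n)
{
  int32_t r1[16], r2[16];

  assert (n <= 16);
  select_bitwise (m, a, b, r1, n);
  select_msb (m, a, b, r2, n);
  for (size_t i = 0; i < n; i++)
    if (r1[i] != r2[i])
      return 0;
  return 1;
}
```

A mask element with only the top bit set is enough to separate the two definitions, which is one reason the choice of truth-value representation matters for folding.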
Re: [PATCH RFA] Implement register pressure directed hoist pass
On 09/29/2012 12:37 AM, Bin Cheng wrote:
> Hi Steven,
> This is the updated patch according to your comments. Please review.
> I also re-collected code size data and found it is improved by about
> 0.24% for mips, which is better than previous data. I believe this
> should be caused by recent changes in trunk, rather than by using DF
> caches to calculate register pressure.
>
> Thanks.
>
> 2012-09-29  Bin Cheng  bin.ch...@arm.com
>
>     * common.opt (flag_ira_hoist_pressure): New.
>     * doc/invoke.texi (-fira-hoist-pressure): Describe.
>     * ira-costs.c (ira_set_pseudo_classes): New parameter.
>     * ira.h (ira_set_pseudo_classes): Update prototype.
>     * haifa-sched.c (sched_init): Update call.
>     * ira.c (ira): Update call.
>     * regmove.c (regmove_optimize): Update call.
>     * loop-invariant.c (move_loop_invariants): Update call.
>     * gcse.c (struct bb_data): New structure.
>     (BB_DATA): New macro.
>     (curr_bb, curr_regs_live, curr_reg_pressure, regs_set)
>     (n_regs_set): New static variables.
>     (hoist_expr_reaches_here_p): Use reg pressure to determine the
>     distance expr can be hoisted.
>     (hoist_code): Use reg pressure to direct the hoist process.
>     (get_regno_pressure_class, get_pressure_class_and_nregs)
>     (change_pressure, mark_regno_live, mark_regno_death)
>     (mark_reg_death, mark_reg_store, calculate_bb_reg_pressure): New.
>     (one_code_hoisting_pass): Calculate register pressure. Free data.
>     * config/arm/arm.c (arm_option_override): Set
>     flag_ira_hoist_pressure on Thumb1 when optimizing for size.
>
> hoist-reg-pressure-20120929.txt
>
> +@item -fira-hoist-pressure
> +@opindex fira-hoist-pressure
> +Use IRA to evaluate register pressure in hoist pass for decisions to hoist
> +expressions. This option usually results in generation of smaller code on
> +RISC machines, but it can slow the compiler down.

I wouldn't use CISC/RISC here; I'd just say it usually results in
smaller code.

You need to update the copyright year in gcse.c, ira.h, regmove.c, and
loop-invariant.c.

> + /* Only decrease distance if bb has high register pressure or EXPR
> +is const expr, otherwise EXPR can be hoisted through bb without
> +cost. */

?!? This comment makes no sense to me. To accurately know how hoisting
an expression affects pressure you have to look at the inputs and output
and see how their lifetime has changed.

In general:

For inputs, hoisting *may* reduce pressure. You really have to look at
how the life of the input changes based on the new location of the
insn. For example, if the input's lifetime is unchanged (say perhaps
because it was live after the insn we want to hoist), then hoisting will
have no impact. Otherwise the input's life is shortened, but to know by
how much you have to determine where the new death of the input occurs
(it may still die in the hoisted insn or it may die elsewhere).

For an output, hoisting will (effectively) always extend the lifetime.

I've speculated that the right way to deal with register pressure in
code motion is to actually build the dependency graph and use that to
guide the code motions. I've never cobbled together any real code to do
this though.

Can we find a better name for hoist_expr_reaches_here_p since it's no
longer just dealing with reachability -- it has heuristics now for
profitability as well.

> @@ -2863,7 +2909,8 @@ static int
> if (visited == NULL)
> {
> visited_allocated_locally = 1;
> - visited = XCNEWVEC (char, last_basic_block);
> + visited = sbitmap_alloc (last_basic_block);
> + sbitmap_zero (visited);
> }

What's the purpose behind changing visited from a simple array to a
sbitmap? I'm not objecting, but would like to hear the rationale behind
that change. I'll also note it wasn't mentioned in the ChangeLog.

Similarly what's the rationale behind passing the expression itself
rather than just its index? I don't see where we need to use anything
other than the index in this code. And again, this change isn't
mentioned in the ChangeLog.

> + /* Considered invariant insns have only one set. */
> + gcc_assert (set != NULL_RTX);
> + reg = SET_DEST (set);
> + if (GET_CODE (reg) == SUBREG)
> +reg = SUBREG_REG (reg);
> + if (MEM_P (reg))
> +{
> + *nregs = 0;
> + pressure_class = NO_REGS;
> +}

Don't you need to look at the addresses within the MEM?

> Index: gcc/config/arm/arm.c
> ===================================================================
> --- gcc/config/arm/arm.c(revision 191816)
> +++ gcc/config/arm/arm.c(working copy)
> @@ -2021,6 +2021,11 @@ arm_option_override (void)
> current_tune->num_prefetch_slots > 0)
> flag_prefetch_loop_arrays = 1;
>
> + /* Enable register pressure hoist when optimizing for size on Thumb1 set. */
> + if (TARGET_THUMB1 && optimize_function_for_size_p (cfun)
> + && flag_ira_hoist_pressure == -1)
> +flag_ira_hoist_pressure = 1;

I'd rather see
[Ada] Small fixes to Eliminated overflow mode
This patch cleans up some documentation issues for eliminated mode, and fixes some errors for marginal cases. Not worth trying to concoct tests for these cases, which were found by code review, not from any reported bugs. Also forbid use of Eliminated mode if Long_Long_Integer'Size is not 64. Also no tests for that, since on pretty much all targets (maybe all) this condition is met. Also add more extensive doc on this feature. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-10-02 Robert Dewar de...@adacore.com * s-bignum.adb (Big_Exp): 0**0 should be 1, not 0. (Big_Exp): Fix possible error for (-1)**0. (Big_Exp): Fix error in computing 2**K for small K. (Big_Mod): Fix wrong sign for negative operands. (Div_Rem): Fix bad results for operands close to 2**63. * s-bignum.ads: Add documentation and an assertion to require LLI size to be 64 bits. * sem_prag.adb (Analyze_Pragma, case Overflow_Checks): Do not allow ELIMINATED if LLI'Size is other than 64 bits. * switch-c.adb (Scan_Switches): Do not allow -gnato3 if LLI'Size is not 64 bits. * switch.ads (Bad_Switch): Add missing pragma No_Return. * gnat_ugn.texi: Added appendix on Overflow Check Handling in GNAT. Index: switch-c.adb === --- switch-c.adb(revision 191972) +++ switch-c.adb(working copy) @@ -33,6 +33,7 @@ with Opt; use Opt; with Validsw; use Validsw; with Stylesw; use Stylesw; +with Ttypes; use Ttypes; with Warnsw; use Warnsw; with Ada.Unchecked_Deallocation; @@ -50,6 +51,10 @@ new Ada.Unchecked_Deallocation (String_List, String_List_Access); -- Avoid using System.Strings.Free, which also frees the designated strings + function Get_Overflow_Mode (C : Character) return Overflow_Check_Type; + -- Given a digit in the range 0 .. 3, returns the corresponding value of + -- Overflow_Check_Type. Raises program error if C is outside this range. 
+ function Switch_Subsequently_Cancelled (C: String; Args : String_List; @@ -72,7 +77,6 @@ declare New_Symbol_Definitions : constant String_List_Access := new String_List (1 .. 2 * Preprocessing_Symbol_Last); - begin New_Symbol_Definitions (Preprocessing_Symbol_Defs'Range) := Preprocessing_Symbol_Defs.all; @@ -86,6 +90,37 @@ new String'(Def); end Add_Symbol_Definition; + --- + -- Get_Overflow_Mode -- + --- + + function Get_Overflow_Mode (C : Character) return Overflow_Check_Type is + begin + case C is + when '0' = +return Suppressed; + + when '1' = +return Checked; + + when '2' = +return Minimized; + + -- Eliminated allowed only if Long_Long_Integer is 64 bits (since + -- the current implementation of System.Bignums assumes this). + + when '3' = +if Standard_Long_Long_Integer_Size /= 64 then + Bad_Switch (-gnato3 not implemented for this configuration); +else + return Eliminated; +end if; + + when others = +raise Program_Error; + end case; + end Get_Overflow_Mode; + - -- Scan_Front_End_Switches -- - @@ -778,27 +813,8 @@ else -- Handle first digit after -gnato - case Switch_Chars (Ptr) is - when '0' = -Suppress_Options.Overflow_Checks_General := - Suppressed; - - when '1' = -Suppress_Options.Overflow_Checks_General := - Checked; - - when '2' = -Suppress_Options.Overflow_Checks_General := - Minimized; - - when '3' = -Suppress_Options.Overflow_Checks_General := - Eliminated; - - when others = -raise Program_Error; - end case; - + Suppress_Options.Overflow_Checks_General := +Get_Overflow_Mode (Switch_Chars (Ptr)); Ptr := Ptr + 1; -- Only one digit after -gnato, set assertions mode to @@ -813,27 +829,8 @@ -- Process second digit after -gnato else - case Switch_Chars (Ptr) is -when '0' = - Suppress_Options.Overflow_Checks_Assertions := - Suppressed; - -when '1' = - Suppress_Options.Overflow_Checks_Assertions := -
[PATCH] Vector CONSTRUCTOR verifier
Hi! As discussed in the PR and on IRC, this patch verifies that vector CONSTRUCTOR in GIMPLE is either empty CONSTRUCTOR, or contains scalar elements of type compatible with vector element type (then the verification is less strict, allows less than TYPE_VECTOR_SUBPARTS elements and allows non-NULL indexes if they are consecutive (no holes); this is because from FEs often CONSTRUCTORs with those properties leak into the IL, and a change in the gimplifier to canonicalize them wasn't enough, they keep leaking even from non-gimplified DECL_INITIAL values etc.), or contains vector elements (element types must be compatible, the vector elements must be of the same type and their number must fill the whole wider vector - these are created/used by tree-vect-generic lowering if HW supports only shorter vectors than what is requested in source). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-10-02 Jakub Jelinek ja...@redhat.com PR tree-optimization/54713 * expr.c (categorize_ctor_elements_1): Don't assume purpose is non-NULL. * tree-cfg.c (verify_gimple_assign_single): Add verification of vector CONSTRUCTORs. * tree-ssa-sccvn.c (vn_reference_lookup_3): For VECTOR_TYPE CONSTRUCTORs, don't do anything if element type is VECTOR_TYPE, and don't check index. * tree-vect-slp.c (vect_get_constant_vectors): VIEW_CONVERT_EXPR ctor elements first if their type isn't compatible with vector element type. 
--- gcc/expr.c.jj 2012-09-27 12:45:53.0 +0200 +++ gcc/expr.c 2012-10-01 18:21:40.885122833 +0200 @@ -5491,7 +5491,7 @@ categorize_ctor_elements_1 (const_tree c { HOST_WIDE_INT mult = 1; - if (TREE_CODE (purpose) == RANGE_EXPR) + if (purpose TREE_CODE (purpose) == RANGE_EXPR) { tree lo_index = TREE_OPERAND (purpose, 0); tree hi_index = TREE_OPERAND (purpose, 1); --- gcc/tree-cfg.c.jj 2012-10-01 17:28:17.469921927 +0200 +++ gcc/tree-cfg.c 2012-10-02 11:24:11.686155889 +0200 @@ -4000,6 +4000,80 @@ verify_gimple_assign_single (gimple stmt return res; case CONSTRUCTOR: + if (TREE_CODE (rhs1_type) == VECTOR_TYPE) + { + unsigned int i; + tree elt_i, elt_v, elt_t = NULL_TREE; + + if (CONSTRUCTOR_NELTS (rhs1) == 0) + return res; + /* For vector CONSTRUCTORs we require that either it is empty +CONSTRUCTOR, or it is a CONSTRUCTOR of smaller vector elements +(then the element count must be correct to cover the whole +outer vector and index must be NULL on all elements, or it is +a CONSTRUCTOR of scalar elements, where we as an exception allow +smaller number of elements (assuming zero filling) and +consecutive indexes as compared to NULL indexes (such +CONSTRUCTORs can appear in the IL from FEs). 
*/ + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (rhs1), i, elt_i, elt_v) + { + if (elt_t == NULL_TREE) + { + elt_t = TREE_TYPE (elt_v); + if (TREE_CODE (elt_t) == VECTOR_TYPE) + { + tree elt_t = TREE_TYPE (elt_v); + if (!useless_type_conversion_p (TREE_TYPE (rhs1_type), + TREE_TYPE (elt_t))) + { + error (incorrect type of vector CONSTRUCTOR + elements); + debug_generic_stmt (rhs1); + return true; + } + else if (CONSTRUCTOR_NELTS (rhs1) + * TYPE_VECTOR_SUBPARTS (elt_t) + != TYPE_VECTOR_SUBPARTS (rhs1_type)) + { + error (incorrect number of vector CONSTRUCTOR + elements); + debug_generic_stmt (rhs1); + return true; + } + } + else if (!useless_type_conversion_p (TREE_TYPE (rhs1_type), + elt_t)) + { + error (incorrect type of vector CONSTRUCTOR elements); + debug_generic_stmt (rhs1); + return true; + } + else if (CONSTRUCTOR_NELTS (rhs1) + TYPE_VECTOR_SUBPARTS (rhs1_type)) + { + error (incorrect number of vector CONSTRUCTOR elements); + debug_generic_stmt (rhs1); + return true; + } + } + else if (!useless_type_conversion_p (elt_t, TREE_TYPE (elt_v))) + { + error (incorrect type of vector
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: I understand that it is originally a library issue, but I don't think it makes sense to resolve it in isolation of that core issue. They seem mostly orthogonal to me, since the library only uses an informal language describing the desired outcome and not the actual overloads necessary to achieve it, whereas the core issue is about determining priorities for a non-ambiguous overload resolution (if we are talking about the same, where Jens Maurer has a proposal). The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work... Two things: 1. that is clearly a clang problem. I don't think it is libstdc++'s job tp try to solve clang's misguided configuration and installation. Translated: libstdc++ should only ever be used with the very version of g++ that was used to compile it. clang++, icpc, sunCC, etc should never try to use a libstdc++ compiled with another compiler. I am not saying libstdc++ should go to great lengths to support other compilers, but when it is actually easier to support them than not to... (testing a macro is easier than a configure test) 2. I am not sure you understand what I wrote: you can leave the use of the current macro the way it is if you appropriately define it in terms of what you want to change it to. I was complaining about the configure-time nature of the macro. If it is defined at each compiler run based on __SIZEOF_INT128__, I am happy. More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same. Yes. Another reason is that it simplifies the implementation AND if people want want to do something with the intrinsics' fallback libstdc++ will gracefully deliver that. I don't see how that simplifies the implementation, it is several characters longer than ::llabs, and we still need to handle llabs. 
Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? Note that I am happy to let you take over this PR if you like. -- Marc Glisse
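For readers following the thread, the two ideas discussed above (a feature macro derived from __SIZEOF_INT128__ on every compiler run instead of at configure time, and always calling __builtin_llabs) can be sketched in C roughly as follows. This is only an illustration under assumptions: the SKETCH_ macro and the sketch_ function names are made up here and are not libstdc++ names, and the code relies on the GCC/Clang builtin and predefined macro.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical feature macro (not a libstdc++ name): it is derived from
   a compiler-predefined macro on every compiler run, so nothing about
   it is recorded at configure time.  */
#ifdef __SIZEOF_INT128__
# define SKETCH_HAVE_INT128 1
#else
# define SKETCH_HAVE_INT128 0
#endif

/* Always go through the builtin and let the compiler decide whether to
   expand it inline or to fall back to a call to llabs.  */
static inline long long
sketch_llabs (long long x)
{
  assert (x != LLONG_MIN);	/* Negating LLONG_MIN would overflow.  */
  return __builtin_llabs (x);
}

#if SKETCH_HAVE_INT128
/* There is no C library function for this width, so the conditional
   form is written out by hand.  */
static inline __int128
sketch_abs128 (__int128 x)
{
  return x < 0 ? -x : x;
}
#endif
```

With this arrangement a header built by one compiler and consumed by another would re-evaluate SKETCH_HAVE_INT128 for the compiler actually in use, which is the property asked for above.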
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 2:01 AM, Uros Bizjak ubiz...@gmail.com wrote: 2012-10-02 Uros Bizjak ubiz...@gmail.com PR other/54761 * configure.ac (EXTRA_FLAGS): New. * Makefile.am (AM_FLAGS): Add $(EXTRA_FLAGS). * configure, Makefile.in: Regenerate. This is OK. Thanks. Ian
Re: [PATCH] Vector CONSTRUCTOR verifier
On Tue, Oct 2, 2012 at 3:01 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! As discussed in the PR and on IRC, this patch verifies that vector CONSTRUCTOR in GIMPLE is either empty CONSTRUCTOR, or contains scalar elements of type compatible with vector element type (then the verification is less strict, allows less than TYPE_VECTOR_SUBPARTS elements and allows non-NULL indexes if they are consecutive (no holes); this is because from FEs often CONSTRUCTORs with those properties leak into the IL, and a change in the gimplifier to canonicalize them wasn't enough, they keep leaking even from non-gimplified DECL_INITIAL values etc.), or contains vector elements (element types must be compatible, the vector elements must be of the same type and their number must fill the whole wider vector - these are created/used by tree-vect-generic lowering if HW supports only shorter vectors than what is requested in source). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok with ... 2012-10-02 Jakub Jelinek ja...@redhat.com PR tree-optimization/54713 * expr.c (categorize_ctor_elements_1): Don't assume purpose is non-NULL. * tree-cfg.c (verify_gimple_assign_single): Add verification of vector CONSTRUCTORs. * tree-ssa-sccvn.c (vn_reference_lookup_3): For VECTOR_TYPE CONSTRUCTORs, don't do anything if element type is VECTOR_TYPE, and don't check index. * tree-vect-slp.c (vect_get_constant_vectors): VIEW_CONVERT_EXPR ctor elements first if their type isn't compatible with vector element type. 
--- gcc/expr.c.jj 2012-09-27 12:45:53.0 +0200
+++ gcc/expr.c 2012-10-01 18:21:40.885122833 +0200
@@ -5491,7 +5491,7 @@ categorize_ctor_elements_1 (const_tree c
     {
       HOST_WIDE_INT mult = 1;
-      if (TREE_CODE (purpose) == RANGE_EXPR)
+      if (purpose && TREE_CODE (purpose) == RANGE_EXPR)
         {
           tree lo_index = TREE_OPERAND (purpose, 0);
           tree hi_index = TREE_OPERAND (purpose, 1);
--- gcc/tree-cfg.c.jj 2012-10-01 17:28:17.469921927 +0200
+++ gcc/tree-cfg.c 2012-10-02 11:24:11.686155889 +0200
@@ -4000,6 +4000,80 @@ verify_gimple_assign_single (gimple stmt
       return res;
     case CONSTRUCTOR:
+      if (TREE_CODE (rhs1_type) == VECTOR_TYPE)
+        {
+          unsigned int i;
+          tree elt_i, elt_v, elt_t = NULL_TREE;
+
+          if (CONSTRUCTOR_NELTS (rhs1) == 0)
+            return res;
+          /* For vector CONSTRUCTORs we require that either it is empty
+             CONSTRUCTOR, or it is a CONSTRUCTOR of smaller vector elements
+             (then the element count must be correct to cover the whole
+             outer vector and index must be NULL on all elements, or it is
+             a CONSTRUCTOR of scalar elements, where we as an exception allow
+             smaller number of elements (assuming zero filling) and
+             consecutive indexes as compared to NULL indexes (such
+             CONSTRUCTORs can appear in the IL from FEs).  */
+          FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (rhs1), i, elt_i, elt_v)
+            {
+              if (elt_t == NULL_TREE)
+                {
+                  elt_t = TREE_TYPE (elt_v);
+                  if (TREE_CODE (elt_t) == VECTOR_TYPE)
+                    {
+                      tree elt_t = TREE_TYPE (elt_v);
+                      if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
+                                                      TREE_TYPE (elt_t)))
+                        {
+                          error ("incorrect type of vector CONSTRUCTOR"
+                                 " elements");
+                          debug_generic_stmt (rhs1);
+                          return true;
+                        }
+                      else if (CONSTRUCTOR_NELTS (rhs1)
+                               * TYPE_VECTOR_SUBPARTS (elt_t)
+                               != TYPE_VECTOR_SUBPARTS (rhs1_type))
+                        {
+                          error ("incorrect number of vector CONSTRUCTOR"
+                                 " elements");
+                          debug_generic_stmt (rhs1);
+                          return true;
+                        }
+                    }
+                  else if (!useless_type_conversion_p (TREE_TYPE (rhs1_type),
+                                                       elt_t))
+                    {
+                      error ("incorrect type of vector CONSTRUCTOR elements");
+                      debug_generic_stmt (rhs1);
+                      return true;
+                    }
+                  else if (CONSTRUCTOR_NELTS (rhs1)
+                           > TYPE_VECTOR_SUBPARTS (rhs1_type))
+                    {
+                      error ("incorrect number of vector CONSTRUCTOR elements");
+                      debug_generic_stmt (rhs1);
+                      return true;
+
[PATCH] Fix PR54735
This fixes PR54735 - a bad interaction of non-up-to-date virtual SSA form, update-SSA and cfg cleanup. Moral of the story: cfg cleanup can remove blocks and thus release SSA names - SSA update is rightfully confused when such a released SSA name is still used at update time. The following patch fixes the case in question simply by making sure to run update-SSA before cfg cleanup. [eventually release_ssa_name could treat virtual operands the same as regular SSA names when they are scheduled for a re-write, that would make this whole mess more robust - I am thinking of this] Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2012-10-02 Richard Guenther rguent...@suse.de PR middle-end/54735 * tree-ssa-pre.c (do_pre): Make sure to update virtual SSA form before cleaning up the CFG. * g++.dg/torture/pr54735.C: New testcase. Index: gcc/tree-ssa-pre.c === --- gcc/tree-ssa-pre.c (revision 191969) +++ gcc/tree-ssa-pre.c (working copy) @@ -4820,6 +4820,13 @@ do_pre (void) free_scc_vn (); + /* Tail merging invalidates the virtual SSA web, together with + cfg-cleanup opportunities exposed by PRE this will wreck the + SSA updating machinery. So make sure to run update-ssa + manually, before eventually scheduling cfg-cleanup as part of + the todo. 
*/ + update_ssa (TODO_update_ssa_only_virtuals); + return todo; } @@ -4845,8 +4852,7 @@ struct gimple_opt_pass pass_pre = 0, /* properties_provided */ 0, /* properties_destroyed */ TODO_rebuild_alias, /* todo_flags_start */ - TODO_update_ssa_only_virtuals | TODO_ggc_collect - | TODO_verify_ssa /* todo_flags_finish */ + TODO_ggc_collect | TODO_verify_ssa /* todo_flags_finish */ } }; Index: gcc/testsuite/g++.dg/torture/pr54735.C === --- gcc/testsuite/g++.dg/torture/pr54735.C (revision 0) +++ gcc/testsuite/g++.dg/torture/pr54735.C (working copy) @@ -0,0 +1,179 @@ +// { dg-do compile } + +class Gmpfr +{}; +class M : Gmpfr +{ +public: + Gmpfr infconst; + M(int); +}; +templatetypenamestruct A; +templatetypename, int, int, int = 0 ? : 0, int = 0, int = 0class N; +templatetypenameclass O; +templatetypenamestruct B; +struct C +{ + enum + { value }; +}; +class D +{ +public: + enum + { ret }; +}; +struct F +{ + enum + { ret = 0 ? : 0 }; +}; +templatetypename Derivedstruct G +{ + typedef ODerivedtype; +}; +struct H +{ + void operator * (); +}; +struct I +{ + enum + { RequireInitialization = C::value ? : 0, ReadCost }; +}; +templatetypename Derivedstruct J +{ + enum + { ret = ADerived::InnerStrideAtCompileTime }; +}; +templatetypename Derivedstruct K +{ + enum + { ret = ADerived::OuterStrideAtCompileTime }; +}; +templatetypename Derivedclass P : H +{ +public: + using H::operator *; + typedef typename ADerived::Scalar Scalar; + enum + { RowsAtCompileTime= + ADerived::RowsAtCompileTime, ColsAtCompileTime = + ADerived::ColsAtCompileTime, SizeAtCompileTime = + F::ret, MaxRowsAtCompileTime = + ADerived::MaxRowsAtCompileTime, MaxColsAtCompileTime = + ADerived::MaxColsAtCompileTime, MaxSizeAtCompileTime = + F::ret, Flags = + ADerived::Flags ? : 0 ? 
: 0, CoeffReadCost = + ADerived::CoeffReadCost, InnerStrideAtCompileTime= + JDerived::ret, OuterStrideAtCompileTime = KDerived::ret }; + BDerived operator (const Scalar); +}; + +templatetypename Derivedclass O : public PDerived +{}; + +templateint _Colsclass L +{ +public: + + int cols() + { +return _Cols; + } +}; +templatetypename Derivedclass Q : public GDerived::type +{ +public: + typedef typename GDerived::type Base; + typedef typename ADerived::Index Index; + typedef typename ADerived::Scalar Scalar; + LBase::ColsAtCompileTime m_storage; + Index cols() + { +return m_storage.cols(); + } + + Scalar coeffRef(Index, + Index); +}; + +templatetypename _Scalar, int _Rows, int _Cols, int _Options, int _MaxRows, + int _MaxColsstruct AN_Scalar, _Rows, _Cols, _Options, _MaxRows, + _MaxCols +{ + typedef _Scalar Scalar; + typedef int Index; + enum + { RowsAtCompileTime, ColsAtCompileTime = + _Cols, MaxRowsAtCompileTime, MaxColsAtCompileTime, Flags= + D::ret, CoeffReadCost = + I::ReadCost, InnerStrideAtCompileTime, OuterStrideAtCompileTime = + 0 ? : 0 }; +}; +templatetypename _Scalar, int, int _Cols, int, int, + intclass N : public QN_Scalar, 0, _Cols +{ +public: + QN Base; + templatetypename T0, typename T1N(const T0, +
Re: RFC: LRA for x86/x86-64 [7/9]
Vladimir Makarov vmaka...@redhat.com writes: This is the major patch containing all new files. The patch also adds necessary calls to LRA from IRA.As the patch is too big, it continues in the next email. 2012-09-27 Vladimir Makarov vmaka...@redhat.com * Makefile.in (LRA_INT_H): New. (OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o, lra-eliminations.o, lra-lives.o, and lra-spills.o. (ira.o): Add dependence on lra.h. (lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New entries. (lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto. * ira.c: Include lra.h. (ira_init_once, ira_init, ira_finish_once): Call lra_start_once, lra_init, lra_finish_once in anyway. (lra_in_progress): Remove. (do_reload): Call LRA. * lra.h: New. * lra-int.h: Ditto. * lra.c: Ditto. * lra-assigns.c: Ditto. * lra-constraints.c: Ditto. * lra-coalesce.c: Ditto. * lra-eliminations.c: Ditto. * lra-lives.c: Ditto. * lra-spills.c: Ditto. * doc/passes.texi: Describe LRA pass. Comments on ira-lives.c. (Sorry for the split, had more time to look at this than expected) +/* Copy live range list given by its head R and return the result. */ +lra_live_range_t +lra_copy_live_range_list (lra_live_range_t r) +{ + lra_live_range_t p, first, last; + + if (r == NULL) +return NULL; + for (first = last = NULL; r != NULL; r = r-next) +{ + p = copy_live_range (r); + if (first == NULL) + first = p; + else + last-next = p; + last = p; +} + return first; +} Maybe simpler as: lra_live_range_t p, first, *chain; first = NULL; for (chain = first; r != NULL; r = r-next) { p = copy_live_range (r); *chain = p; chain = p-next; } return first; +/* Merge ranges R1 and R2 and returns the result. The function + maintains the order of ranges and tries to minimize size of the + result range list. 
*/ +lra_live_range_t +lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2) +{ + lra_live_range_t first, last, temp; + + if (r1 == NULL) +return r2; + if (r2 == NULL) +return r1; + for (first = last = NULL; r1 != NULL r2 != NULL;) +{ + if (r1-start r2-start) + { + temp = r1; + r1 = r2; + r2 = temp; + } + if (r1-start = r2-finish + 1) + { + /* Intersected ranges: merge r1 and r2 into r1. */ + r1-start = r2-start; + if (r1-finish r2-finish) + r1-finish = r2-finish; + temp = r2; + r2 = r2-next; + pool_free (live_range_pool, temp); + if (r2 == NULL) + { + /* To try to merge with subsequent ranges in r1. */ + r2 = r1-next; + r1-next = NULL; + } + } + else + { + /* Add r1 to the result. */ + if (first == NULL) + first = last = r1; + else + { + last-next = r1; + last = r1; + } + r1 = r1-next; + if (r1 == NULL) + { + /* To try to merge with subsequent ranges in r2. */ + r1 = r2-next; + r2-next = NULL; + } + } I might be misreading, but I'm not sure whether this handles merges like: r1 = [6,7], [3,4] r2 = [3,8], [0,1] After the first iteration, it looks like we'll have: r1 = [3,8], [3,4] r2 = [0,1] Then we'll add both [3,8] and [3,4] to the result. Same chain pointer comment as for lra_merge_live_ranges. +/* Return TRUE if live range R1 is in R2. */ +bool +lra_live_range_in_p (lra_live_range_t r1, lra_live_range_t r2) +{ + /* Remember the live ranges are always kept ordered. */ + while (r1 != NULL r2 != NULL) +{ + /* R1's element is in R2's element. */ + if (r2-start = r1-start r1-finish = r2-finish) + r1 = r1-next; + /* Intersection: R1's start is in R2. */ + else if (r2-start = r1-start r1-start = r2-finish) + return false; + /* Intersection: R1's finish is in R2. */ + else if (r2-start = r1-finish r1-finish = r2-finish) + return false; + else if (r1-start r2-finish) + return false; /* No covering R2's element for R1's one. */ + else + r2 = r2-next; +} + return r1 == NULL; Does the inner bit reduce to: /* R1's element is in R2's element. 
*/ if (r2-start = r1-start r1-finish = r2-finish) r1 = r1-next; /* All of R2's element comes after R1's element. */ else if (r2-start r1-finish) r2 = r2-next; else return false; (Genuine question) +/* Process the death of hard register REGNO. This updates + hard_regs_live and START_DYING. */ +static void +make_hard_regno_dead (int regno) +{ + if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno) + || ! TEST_HARD_REG_BIT
Re: [PATCH] limited C++ parsing support for gengtype
Aaron, I'm currently fixing other issues with gengtype and I needed this patch on top of them. I will be rolling both patches into a single one and commit them today/tomorrow. If you were working on further fixes to this, please give me a chance to commit this one first. Thanks. Diego.
Re: RFC: LRA for x86/x86-64 [7/9]
Richard Sandiford rdsandif...@googlemail.com writes:
+/* Merge ranges R1 and R2 and returns the result.  The function
+   maintains the order of ranges and tries to minimize size of the
+   result range list.  */
+lra_live_range_t
+lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2)
+{
+  lra_live_range_t first, last, temp;
+
+  if (r1 == NULL)
+    return r2;
+  if (r2 == NULL)
+    return r1;
+  for (first = last = NULL; r1 != NULL && r2 != NULL;)
+    {
+      if (r1->start < r2->start)
+        {
+          temp = r1;
+          r1 = r2;
+          r2 = temp;
+        }
+      if (r1->start <= r2->finish + 1)
+        {
+          /* Intersected ranges: merge r1 and r2 into r1.  */
+          r1->start = r2->start;
+          if (r1->finish < r2->finish)
+            r1->finish = r2->finish;
+          temp = r2;
+          r2 = r2->next;
+          pool_free (live_range_pool, temp);
+          if (r2 == NULL)
+            {
+              /* To try to merge with subsequent ranges in r1.  */
+              r2 = r1->next;
+              r1->next = NULL;
+            }
+        }
+      else
+        {
+          /* Add r1 to the result.  */
+          if (first == NULL)
+            first = last = r1;
+          else
+            {
+              last->next = r1;
+              last = r1;
+            }
+          r1 = r1->next;
+          if (r1 == NULL)
+            {
+              /* To try to merge with subsequent ranges in r2.  */
+              r1 = r2->next;
+              r2->next = NULL;
+            }
+        }
I might be misreading, but I'm not sure whether this handles merges like:
  r1 = [6,7], [3,4]
  r2 = [3,8], [0,1]
After the first iteration, it looks like we'll have:
  r1 = [3,8], [3,4]
  r2 = [0,1]
Then we'll add both [3,8] and [3,4] to the result.
OK, so I start to read patch b and realise that this is only supposed to handle non-overlapping live ranges. It might be worth having a comment and assert to that effect, for slow readers like me. Although in that case the function feels a little more complicated than it needs to be. When we run out of R1 or R2, why not just use the other one as the rest of the live range list? Why is:
+          if (r1 == NULL)
+            {
+              /* To try to merge with subsequent ranges in r2.  */
+              r1 = r2->next;
+              r2->next = NULL;
+            }
needed?
Richard
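As an illustration of the simplification asked about above, here is a minimal standalone sketch in C. It is not the LRA code: struct range, new_range and merge_ranges are hypothetical names, nodes come from malloc rather than an alloc pool, and it assumes that each input list is sorted by decreasing start, is already minimal, and that no range of one list overlaps a range of the other (checked by an assert, as suggested). When one list runs out, the other is simply used as the rest of the result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an LRA live range; not the real type.  */
struct range
{
  int start, finish;
  struct range *next;
};

/* Allocate one range node (helper for this sketch only).  */
static struct range *
new_range (int start, int finish, struct range *next)
{
  struct range *r = malloc (sizeof *r);
  assert (r != NULL);
  r->start = start;
  r->finish = finish;
  r->next = next;
  return r;
}

/* Merge R1 and R2, each sorted by decreasing start and already minimal.
   No range of R1 may overlap a range of R2; adjacent ranges are joined.
   Both input lists are consumed.  */
static struct range *
merge_ranges (struct range *r1, struct range *r2)
{
  struct range *first = NULL, *last = NULL, *r, *rest;
  struct range **chain = &first;

  while (r1 != NULL && r2 != NULL)
    {
      /* Make R1 the list whose head starts later.  */
      if (r1->start < r2->start)
        {
          r = r1;
          r1 = r2;
          r2 = r;
        }
      assert (r1->start > r2->finish);  /* Inputs must not overlap.  */
      r = r1;
      r1 = r1->next;
      if (last != NULL && r->finish + 1 == last->start)
        {
          /* R ends right below LAST: extend LAST instead of linking R.  */
          last->start = r->start;
          free (r);
        }
      else
        {
          *chain = r;
          chain = &r->next;
          last = r;
        }
    }

  /* One list ran out; the other one is the rest of the result.  */
  rest = r1 != NULL ? r1 : r2;
  if (rest != NULL && last != NULL && rest->finish + 1 == last->start)
    {
      last->start = rest->start;
      r = rest;
      rest = rest->next;
      free (r);
    }
  *chain = rest;
  return first;
}
```

The chain pointer always addresses the link that the next emitted range has to be stored into, so the first element needs no special case; that is the same idiom proposed earlier in the thread for copying a range list.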
Re: [PATCH] Fix PR47799 - debug info for early-inlining with LTO
On Mon, Oct 01, 2012 at 02:05:50PM +0200, Richard Guenther wrote: 2012-10-01 Richard Guenther rguent...@suse.de PR lto/47788 * tree-streamer-out.c (write_ts_block_tree_pointers): For inlined functions outer scopes write the ultimate origin as BLOCK_ABSTRACT_ORIGIN and BLOCK_SOURCE_LOCATION. Do not stream the fragment chains. (lto_input_ts_block_tree_pointers): Likewise. * dwarf2out.c (gen_subprogram_die): Handle NULL DECL_INITIAL. (dwarf2out_decl): Always output DECL_ABSTRACT function decls. Ok. Jakub
Re: abs(long long)
2012/10/2 Marc Glisse marc.gli...@inria.fr: Here I am talking of a library issue: the wording that says that there are sufficient overloads such that integer types call the double version of math functions. It is fairly obvious that it doesn't apply to abs(long) for instance which has an explicit overload. For short or unsigned, I still read it as saying that it converts to double... This really looks like a problem of the Standard Library specification to me and a corresponding issue should be submitted. In fact the wording can be interpreted that mixing cstdlib with cmath would imply two different versions of std::abs(int) because of different required return types. I will prepare a corresponding submission to the LWG. - Daniel
Re: RFC: LRA for x86/x86-64 [0/9]
On 10/02/2012 12:22 AM, Jeff Law wrote: On 10/01/2012 07:14 PM, Vladimir Makarov wrote: Analogous live ranges are used in IRA as an intermediate step to build a conflict graph. Actually, the first approach was to use IRA code to assign hard registers to pseudos (e.g. Jeff Law tried this approach) but it was rejected as requiring much more compilation time. In some way, one can look at the assignment in LRA as a compromise between quality (which could be achieved through repeated building of conflict graphs and using graph coloring) and compilation speed. Not only was it slow (iterating IRA), guaranteeing termination was a major problem. There's some algorithmic games that have to be played (they're at least discussed in literature, but not under the heading of termination) and there's some issues specific to the IRA implementation which make ensuring termination difficult. Chaitin-Briggs literature does not discuss termination, just saying that live-range shortening will result in assigning hard regs to all necessary pseudos, which is not clearly guaranteed. There is the same problem in LRA. So LRA checks whether too many passes are done or too many reloads for one insn are made, and aborts. Porting LRA is mostly fixing such aborts. Another thing omitted by literature is inheritance which is very important for performance. Although it could be considered as a special case of live-range splitting. There are also a lot of small important details (e.g. what to do in case of displacement constraints, or when non-load/store insns permit memory and registers etc) not discussed well or at all in the literature I read. I got nearly as good of results by conservative updates of the conflicts after splitting ranges and (ab)using ira's reload hooks to give the new pseudos for the split range a chance to be allocated again. The biggest problem with that approach was getting the costing right for the new pseudos. That requires running a fair amount of IRA a second time. 
I'd still like to return to some of the ideas from that work as I think some of the bits are still relevant in the IRA+LRA world. My experience shows that these lists are usually 1-2 elements. That's been my experience as well. The vast majority of the time the range lists are very small.
Re: RFC: LRA for x86/x86-64 [8/9]
On 10/02/2012 01:01 AM, Jeff Law wrote: On 09/27/2012 04:59 PM, Vladimir Makarov wrote: The following patch adds code necessary for the correct operation of LRA (function ira_setup_eliminable_regset) and for the correct operation of the compiler when LRA is used (see file dwarf2out.c). 2012-09-27 Vladimir Makarov vmaka...@redhat.com * loop-invariant.c (calculate_loop_reg_pressure): Pass new argument to ira_setup_eliminable_regset. * haifa-sched.c (sched_init): Pass new argument to ira_setup_eliminable_regset. * dwarf2out.c: Include lra.h. (based_loc_descr, compute_frame_pointer_to_fb_displacement): Use lra_eliminate_regs for LRA instead of eliminate_regs. * ira.c (ira_setup_eliminable_regset): Add parameter. Remove need_fp. Call lra_init_elemination and mark HARD_FRAME_POINTER_REGNUM as living forever if frame_pointer_needed. (ira): Call ira_setup_eliminable_regset with a new argument. * ira.h (ira_setup_eliminable_regset): Add an argument. * Makefile.in (dwarf2out.o): Add dependence on ira.h and lra.h. This is OK. Obviously it's useless without 7a/7b. But I just wanted to go ahead and review the ancillary bits before going to the meat of the submission. So it's just the 7a/7b patch that needs review, right? Right. I'll commit your and Richard Sandiford's proposals into the branch. I'll commit some patches (which could be useful without LRA) into the trunk too. Thanks for reviewing all of this.
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 3:08 PM, Ian Lance Taylor i...@google.com wrote: 2012-10-02 Uros Bizjak ubiz...@gmail.com PR other/54761 * configure.ac (EXTRA_FLAGS): New. * Makefile.am (AM_FLAGS): Add $(EXTRA_FLAGS). * configure, Makefile.in: Regenerate. This is OK. Thanks, committed. On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Uros.
RFA: Fix OP_INOUT handling of web.c:union_match_dups
Similar to PR43742, the ARCompact port gets an ICE from the current mainline version of web.c:union_match_dups for its zero overhead loop pattern. My first attempt at rectifying this was equivalent in effect to the patch from comment #1 from this patch; that seemed to work well enough. Later I stumbled across PR43742, which made me take a second look at the problem. Unlike the SH, the ARCompact architecture has an actual zero overhead loop mechanism, which uses a dedicated loop counter register. Although it can be used in most contexts that a general-purpose register can, this causes a lot of pipeline stalls, so if we changed the match_dup into a matching constraint, and reload inserted reg-reg copies to fix up matching constraints for just a small fraction of the zero overhead loops, the performance penalty of these stalls would wipe out any benefit gained from having any compiler generated zero overhead loops. Looking at md.texi, you could be excused for thinking that match_dups have to follow the operand that they match only for define_expand. However, when you try to scramble the order in a define_insn_and_split, you get an error: /home/amylaar/synopsys/arc_gnu_4.8/unisrc/gcc/config/arc/arc.md:5516: operand 0 duplicated before defined which is emitted by validate_pattern in genrecog.c. This code is from 2004, so I'd say there is a good chance that there is more code that actually relies on this. Some patterns might be made to conform both to the match_dup ordering constraint and avoid the web.c SEGV by reordering the pattern, although at times at the expense of other infrastructure, e.g. when every place that tries to recognize zero overhead loop patterns has to be amended to look for multiple forms. But other patterns intrinsically need one of these strictures removed. Consider an instruction that atomically exchanges the contents of two registers and/or memory locations: The source of the first set must match the destination of the second set. 
So, if we have to make the first occurrence a match_operand, we must tag the + constraint on this input. Therefore, web.c:union_match_dups should handle + constraints on inputs tied with a match_dup to a later mentioned output. The problem here is that the current version of this function only searches the match_dup location in the use_link array, but for an OP_INPUT operand, the location will be in the def_link array. When I originally implemented this, I put some asserts there to make sure we now handle all the *dupref == NULL cases; however, this led to ICEs for i686-pc-linux-gnu. As mentioned in the new comment, the DF_REF_LOC (use_link[n]) points to the register part of a memory address, whereas recog_data.dup_loc[m] points to an enclosing MEM. This is really a separate problem, so I chose to leave the behaviour in this case alone, i.e. just continue in the loop without creating the def/use union, even though the comment at the top of the function says that it should create that. Bootstrapped and regtested on i686-pc-linux-gnu (baseline: revision 191817). 2012-10-02 Joern Rennecke joern.renne...@embecosm.com * web.c (union_match_dups): Properly handle OP_INOUT match_dups. 
Index: web.c === --- web.c (revision 191817) +++ web.c (working copy) @@ -96,6 +96,7 @@ union_match_dups (rtx insn, struct web_e struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn); df_ref *use_link = DF_INSN_INFO_USES (insn_info); df_ref *def_link = DF_INSN_INFO_DEFS (insn_info); + struct web_entry *dup_entry; int i; extract_insn (insn); @@ -107,10 +108,24 @@ union_match_dups (rtx insn, struct web_e df_ref *ref, *dupref; struct web_entry *entry; - for (dupref = use_link; *dupref; dupref++) + for (dup_entry = use_entry, dupref = use_link; *dupref; dupref++) if (DF_REF_LOC (*dupref) == recog_data.dup_loc[i]) break; + if (*dupref == NULL type == OP_INOUT) + { + + for (dup_entry = def_entry, dupref = def_link; *dupref; dupref++) + if (DF_REF_LOC (*dupref) == recog_data.dup_loc[i]) + break; + } + /* ??? *DUPREF can still be zero, because when an operand matches +a memory, DF_REF_LOC (use_link[n]) points to the register part +of the address, whereas recog_data.dup_loc[m] points to the +entire memory ref, thus we fail to find the duplicate entry, + even though it is there. + Example: i686-pc-linux-gnu gcc.c-torture/compile/950607-1.c + -O3 -fomit-frame-pointer -funroll-loops */ if (*dupref == NULL || DF_REF_REGNO (*dupref) FIRST_PSEUDO_REGISTER) continue; @@ -121,7 +136,15 @@ union_match_dups (rtx insn, struct web_e if (DF_REF_LOC (*ref) == recog_data.operand_loc[op]) break; - (*fun) (use_entry + DF_REF_ID (*dupref), entry + DF_REF_ID (*ref)); + if (!*ref type == OP_INOUT)
Re: RFC: LRA for x86/x86-64 [4/9]
On 10/01/2012 02:51 PM, Richard Sandiford wrote: Vladimir Makarov vmaka...@redhat.com writes: +/* Return register bank of given hard regno for the current target. */ +DEFHOOK +(register_bank, + A target hook which returns the register bank number to which the\ + register @var{hard_regno} belongs to. The smaller the number, the\ + more preferable the hard register usage (when all other conditions are\ + the same). This hook can be used to prefer some hard register over\ + others in LRA. For example, some x86-64 register usage needs\ + additional prefix which makes instructions longer. The hook can\ + return bigger bank number for such registers make them less favorable\ + and as result making the generated code smaller.\ + \ + The default version of this target hook returns always zero., + int, (int), + default_register_bank) This is a horribly bikeshed-level comment, sorry, but I wonder if something like register_priority would be better. Register classes are in some ways an extension of register banks, so it wasn't obvious from the name why we needed both. Ok. I agree that is not a good term. Register bank in hardware (especially in DSP) means a bit different thing. Actually, on the Cauldron Ian asked me why it is different from register allocation order. I should say that the order usually takes caller-saves info into account. In x86-64, reg with REX flags can be caller-saved or not. +/* Return true if maximal address displacement can be different. */ +DEFHOOK +(different_addr_displacement_p, + A target hook which returns true if an address with the same structure\ + can have different maximal legitimate displacement. 
For example, the\ + displacement can depend on memory mode or on operand combinations in\ + the insn.\ + \ + The default version of this target hook returns always false., + bool, (void), + default_different_addr_displacement_p) If I read the patch correctly, this is only used in: + if (lra_reg_spill_p || targetm.different_addr_displacement_p ()) + lra_set_used_insn_alternative (insn, -1); and so we keep the current alternative when neither spill_class_mode nor different_addr_displacement_p is defined. How many targets on the LRA branch are like that? I would have expected most targets with limited address displacements would have to return true for the above hook, because multiword loads and stores typically have to be split into word loads and stores. Same goes for strict-alignment targets, where wider modes often have slightly lower maximal displacements. E.g. for MIPS, SImode loads and stores have a displacement range of [-32768, 32764], but DImode loads and stores only accept [-32768, 32760]. So the maximal displacement depends on mode, even though the instruction set is pretty regular. Targets with full address-size displacements can use the default false return, but it looks like the x86 port defines spill_class_mode instead, so AIUI the value isn't really tested on Core i7. What's the impact of that compared to the other x86 targets that don't set X86_TUNE_GENERAL_REGS_SSE_SPILL? Is LRA just quicker for them, or will it make different decisions (compared to Core i7) even for non-SSE insns? It is mostly done for the LRA speed. We could remove the if-stmt and everything would be all right; only the lra-constraints pass would go over all alternatives again. Currently, there are two targets for which the if-cond is true. One is x86-64 (more exactly when corei7 tune is used) and the other is PARISC, for which different_addr_displacement_p is true. There might be more in the future. 
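To make the MIPS numbers quoted above concrete, here is a toy model in C. It is not GCC code: the sketch_ names are hypothetical, and the model simply assumes a signed 16-bit offset field, word-aligned accesses, and that a wider access is split into word accesses which must all be addressable. Under those assumptions it reproduces the quoted ranges.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: a displacement is accepted if it is
   word-aligned and if every word of the access, including the last
   one, is still reachable with a signed 16-bit offset.  */
static inline bool
sketch_disp_ok_p (long disp, int mode_size)
{
  const int word_size = 4;

  assert (mode_size >= word_size && mode_size % word_size == 0);
  return (disp % word_size == 0
          && disp >= -32768
          && disp + (mode_size - word_size) <= 32767);
}

/* Largest displacement accepted for an access of MODE_SIZE bytes.  */
static inline long
sketch_max_disp (int mode_size)
{
  long disp = 32767;

  while (!sketch_disp_ok_p (disp, mode_size))
    disp--;
  return disp;
}
```

In this model sketch_max_disp (4) and sketch_max_disp (8) differ, which is the situation where a hook like different_addr_displacement_p would have to return true.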
Future ARMs and Powers might profit from spilling general registers into vector/floating point registers. Other new targets might need different_addr_displacement_p. We could get rid of the hook but it is important for speeding LRA up. I felt that I should make LRA speed competitive with reload even though it is hard because LRA works on RTL and not on internal structures as reload does. I see now I was right to worry about the speed. I guess if we could sacrifice 2% of compilation time, LRA code would be smaller and clearer. If we could sacrifice 10% of compilation time, the code would be even smaller and clearer because we could use the DF infrastructure. +/* Determine class of registers which could be used for spilled + pseudos instead of memory. */ +DEFHOOK +(spill_class, + This hook defines a class of registers which could be used for spilled pseudos\ + of given class instead of memory, + reg_class_t, (reg_class_t), + NULL) Should probably say that NO_REGS means none. +/* Determine mode for spilling pseudos into registers instead of memory. */ +DEFHOOK +(spill_class_mode, + This hook defines mode in which a pseudo of given mode and of the first\ + register class can be spilled into the second register class, + enum machine_mode, (reg_class_t, reg_class_t, enum machine_mode), + NULL) It looks like the only use is in:
Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2c
On Tue, Oct 02, 2012 at 10:13:25AM +0200, Gunther Nikl wrote: Michael Meissner wrote: Segher Boessenkool asked me on IRC to break out the fix in the last change. This patch is just the change to set the default options if the user did not use -mcpu=xxx and the compiler was not configured with --with-cpu=xxx. Here are the patches. Which GCC releases are affected by this bug? All of them. Now, in general users don't see this bug, because distribution maintainers usually build with an explicit --with-cpu= option, which sets the default CPU in case the user did not use -mcpu=xxx on the command line. If neither option was used, the default powerpc or powerpc64 is usually good enough. David noticed it when building AIX compilers, because he wanted to add a default option (-mmfcrf) to the aix*.h definitions to ensure that the new get timebase builtin would generate the correct instructions by default (the original PowerPCs had a different SPR for the time base than the newer server machines starting with power4). He asked me to fix this bug before we tackle the infrastructure changes. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
[AARCH64-4.7][PATCH] Reload fix backported to aarch64-4.7-branch.
Hi, I've backported Ulrich's reload fix (attached) http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01421.html to aarch64-4.7-branch and committed it. Sending ChangeLog.aarch64 Sending reload.c Transmitting file data .. Committed revision 191987. Thanks, Tejas. diff --git a/gcc/reload.c b/gcc/reload.c index 8420c80..a462419 100644 --- a/gcc/reload.c +++ b/gcc/reload.c @@ -283,7 +283,7 @@ static int find_reloads_address_1 (enum machine_mode, addr_space_t, rtx, int, static void find_reloads_address_part (rtx, rtx *, enum reg_class, enum machine_mode, int, enum reload_type, int); -static rtx find_reloads_subreg_address (rtx, int, int, enum reload_type, +static rtx find_reloads_subreg_address (rtx, int, enum reload_type, int, rtx, int *); static void copy_replacements_1 (rtx *, rtx *, int); static int find_inc_amount (rtx, rtx); @@ -4745,31 +4745,19 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type type, } /* If the subreg contains a reg that will be converted to a mem, -convert the subreg to a narrower memref now. -Otherwise, we would get (subreg (mem ...) ...), -which would force reload of the mem. - -We also need to do this if there is an equivalent MEM that is -not offsettable. In that case, alter_subreg would produce an -invalid address on big-endian machines. - -For machines that extend byte loads, we must not reload using -a wider mode if we have a paradoxical SUBREG. find_reloads will -force a reload in that case. So we should not do anything here. */ +attempt to convert the whole subreg to a (narrower or wider) +memory reference instead. If this succeeds, we're done -- +otherwise fall through to check whether the inner reg still +needs address reloads anyway. */ if (regno >= FIRST_PSEUDO_REGISTER -#ifdef LOAD_EXTEND_OP - && !paradoxical_subreg_p (x) -#endif - && (reg_equiv_address (regno) != 0 - || (reg_equiv_mem (regno) != 0 - && (! strict_memory_address_addr_space_p - (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0), - MEM_ADDR_SPACE (reg_equiv_mem (regno))) - || ! offsettable_memref_p (reg_equiv_mem (regno)) - || num_not_at_initial_offset)))) - x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels, - insn, address_reloaded); + && reg_equiv_memory_loc (regno) != 0) + { + tem = find_reloads_subreg_address (x, opnum, type, ind_levels, +insn, address_reloaded); + if (tem) + return tem; + } } for (copied = 0, i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) @@ -6007,12 +5995,31 @@ find_reloads_address_1 (enum machine_mode mode, addr_space_t as, if (ira_reg_class_max_nregs [rclass][GET_MODE (SUBREG_REG (x))] > reg_class_size[(int) rclass]) { - x = find_reloads_subreg_address (x, 0, opnum, - ADDR_TYPE (type), - ind_levels, insn, NULL); - push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass, - GET_MODE (x), VOIDmode, 0, 0, opnum, type); - return 1; + /* If the inner register will be replaced by a memory +reference, we can do this only if we can replace the +whole subreg by a (narrower) memory reference. If +this is not possible, fall through and reload just +the inner register (including address reloads). */ + if (reg_equiv_memory_loc (REGNO (SUBREG_REG (x))) != 0) + { + rtx tem = find_reloads_subreg_address (x, opnum, +ADDR_TYPE (type), +ind_levels, insn, +NULL); + if (tem) + { + push_reload (tem, NULL_RTX, loc, (rtx*) 0, rclass, + GET_MODE (tem), VOIDmode, 0, 0, + opnum, type); + return 1; + } + } + else + { + push_reload (x, NULL_RTX, loc, (rtx*) 0, rclass, + GET_MODE (x), VOIDmode, 0, 0, opnum, type); + return 1; + } } } } @@ -6089,17 +6096,12 @@
Re: abs(long long)
On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote: The library installed by the system was compiled with g++, and is then used with clang++. If we can avoid installing 2 config.h files to make that work... Two things: 1. that is clearly a clang problem. I don't think it is libstdc++'s job to try to solve clang's misguided configuration and installation. Translated: libstdc++ should only ever be used with the very version of g++ that was used to compile it. clang++, icpc, sunCC, etc. should never try to use a libstdc++ compiled with another compiler. Obviously, I cannot require you to exercise common sense and keep nonsensical stretches in check. libstdc++ was and is developed for GCC/g++. If you have a 3rd party compiler that you would like to use with g++/libstdc++, you should either (a) convince your 3rd party compiler supplier to understand the library you already have (libstdc++), or (b) supply the glue between libstdc++ and your compiler yourself. Many compilers have done that in the past; I don't see anything special with clang++. Whining on this list about libstdc++ internal macros and your dislike of them is not going to produce anything today or tomorrow. I am not saying libstdc++ should go to great lengths to support other compilers, but when it is actually easier to support them than not to... (testing a macro is easier than a configure test) 2. I am not sure you understand what I wrote: you can leave the use of the current macro the way it is if you appropriately define it in terms of what you want to change it to. I was complaining about the configure-time nature of the macro. If it is defined at each compiler run based on __SIZEOF_INT128__, I am happy. I am saying you can arrange to supply the appropriate definition without having to change the uses. More precisely, does that mean you want __builtin_llabs instead of ::llabs? I thought the compiler knew they were the same. Yes. 
Another reason is that it simplifies the implementation AND if people want to do something with the intrinsics' fallback, libstdc++ will gracefully deliver that. I don't see how that simplifies the implementation, it is several characters longer than ::llabs, and we still need to handle llabs. You are on the wrong track if you are counting the number of characters used in the implementation. Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? Note that I am happy to let you take over this PR if you like. -- Marc Glisse
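For readers following the __builtin_llabs versus ::llabs question, here is a minimal C sketch of the "always call the built-in" option. It assumes GCC or a compatible compiler that provides __builtin_llabs; the wrapper name abs_ll is hypothetical and is not taken from libstdc++.

```c
/* Hypothetical wrapper, not libstdc++ code: always call the built-in
   and leave it to the compiler to pick between inline code such as
   (x < 0) ? -x : x and a call to the library's llabs.  */
static long long
abs_ll (long long x)
{
  /* As with llabs, the result is undefined for LLONG_MIN.  */
  return __builtin_llabs (x);
}
```

A caller would simply write abs_ll (v) wherever ::llabs (v) was used; whether a library call is emitted depends on the compiler and optimization level, which this sketch does not try to show.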
Re: abs(long long)
On Tue, Oct 2, 2012 at 9:34 AM, Daniel Krügler daniel.krueg...@gmail.com wrote: 2012/10/2 Marc Glisse marc.gli...@inria.fr: Here I am talking of a library issue: the wording that says that there are sufficient overloads such that integer types call the double version of math functions. It is fairly obvious that it doesn't apply to abs(long) for instance which has an explicit overload. For short or unsigned, I still read it as saying that it converts to double... This really looks like a problem of the Standard Library specification to me and a corresponding issue should be submitted. In fact the wording can be interpreted that mixing cstdlib with cmath would imply two different versions of std::abs(int) because of different required return types. I will prepare a corresponding submission to the LWG. This was already an issue I reported to LWG when C++98 came out. Now that you hold write access to the issue document, you can make sure it won't slip through the cracks this time :-p -- Gaby
RE: [Patch] Fix PR53397
Hi Richi, (Snip) + (!cst_and_fits_in_hwi (step)) +{ + if( loop->inner != NULL) +{ + if (dump_file && (dump_flags & TDF_DETAILS)) +{ + fprintf (dump_file, "Reference %p:\n", (void *) ref); + fprintf (dump_file, "(base "); + print_generic_expr (dump_file, base, TDF_SLIM); + fprintf (dump_file, ", step "); + print_generic_expr (dump_file, step, TDF_TREE); + fprintf (dump_file, ")\n"); No need to repeat this - all references are dumped when we gather them. (Snip) The dumping happens at record_ref, which is called after these statements to record these references. When the step is invariant we return from the function without recording the references, so I thought of dumping the references here. Is there a cleaner way to dump the references in one place? Regards, Venkat. -Original Message- From: Richard Guenther [mailto:rguent...@suse.de] Sent: Tuesday, October 02, 2012 5:42 PM To: Kumar, Venkataramanan Cc: gcc-patches@gcc.gnu.org Subject: Re: [Patch] Fix PR53397 On Mon, 1 Oct 2012, venkataramanan.ku...@amd.com wrote: Hi, The below patch fixes the FFT/Scimark regression caused by useless prefetch generation. This fix tries to make prefetch less aggressive by prefetching arrays in the inner loop only when the step is invariant in the entire loop nest. GCC currently tries to prefetch invariant steps when they are in the inner loop, but does not check whether the step varies in outer loops. In the scimark FFT case, the trip count of the inner loop varies by a non-constant step, which is invariant in the inner loop. But the step variable varies in the outer loop. This makes the inner loop trip count small (at run time it is sometimes as small as 1 iteration). Prefetching x iterations ahead when the inner loop trip count is smaller than x leads to useless prefetches. Flag used: -O3 -march=amdfam10 Before ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to p...@nist.gov) ** ** ** Using 2.00 seconds min time per kenel. 
Composite Score: 550.50 FFT Mflops:38.66(N=1024) SOR Mflops: 617.61(100 x 100) MonteCarlo: Mflops: 173.74 Sparse matmult Mflops: 675.63(N=1000, nz=5000) LU Mflops: 1246.88(M=100, N=100) After ** ** ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark ** ** for details. (Results can be submitted to p...@nist.gov) ** ** ** Using 2.00 seconds min time per kenel. Composite Score: 639.20 FFT Mflops: 479.19(N=1024) SOR Mflops: 617.61(100 x 100) MonteCarlo: Mflops: 173.18 Sparse matmult Mflops: 679.13(N=1000, nz=5000) LU Mflops: 1246.88(M=100, N=100) GCC regression testing (make check -k) passes on x86_64-unknown-linux-gnu. New tests that PASS: gcc.dg/pr53397-1.c scan-assembler prefetcht0 gcc.dg/pr53397-1.c scan-tree-dump aprefetch Issued prefetch gcc.dg/pr53397-1.c (test for excess errors) gcc.dg/pr53397-2.c scan-tree-dump aprefetch loop variant step gcc.dg/pr53397-2.c scan-tree-dump aprefetch Not prefetching gcc.dg/pr53397-2.c (test for excess errors) Checked CPU2006 and polyhedron on latest AMD processor, no regressions noted. Ok to commit to trunk? regards, Venkat gcc/ChangeLog +2012-10-01 Venkataramanan Kumar venkataramanan.ku...@amd.com + + * tree-ssa-loop-prefetch.c (gather_memory_references_ref): + Perform non-constant step prefetching in inner loop, only + when it is invariant in the entire loop nest. + * testsuite/gcc.dg/pr53397-1.c: New test case. + Checks we are prefetching for loop invariant steps. + * testsuite/gcc.dg/pr53397-2.c: New test case. + Checks we are not prefetching for loop variant steps. + Index: gcc/testsuite/gcc.dg/pr53397-1.c === --- gcc/testsuite/gcc.dg/pr53397-1.c (revision 0) +++ gcc/testsuite/gcc.dg/pr53397-1.c (revision 0) @@ -0,0 +1,28 @@ +/* Prefetching when the step is loop invariant. 
*/ + +/* { dg-do compile } */ +/* { dg-options "-O3 -fprefetch-loop-arrays +-fdump-tree-aprefetch-details --param min-insn-to-prefetch-ratio=3 +--param simultaneous-prefetches=10 -fdump-tree-aprefetch-details" } +*/ + + +double data[16384]; +void prefetch_when_non_constant_step_is_invariant(int step, int n) { + int a; + int b; + for (a = 1; a < step; a++) { +for (b = 0; b < n; b += 2 * step) { + + int i = 2*(b + a); + int j = 2*(b
Re: abs(long long)
On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote: Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? See what we did in c/cmath and c_global/cmath. What you find there is the result of years of several iterations (including something similar to your earlier patch) all having issues in one way or another until we settled on the builtin functions approach. I have no appetite to go back to those days full of headache. -- Gaby
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote: On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Compiling with C++ should always give us -funwind-tables. If it doesn't for some reason, then I agree. We might even want -fasynchronous-unwind-tables for the compiler. Ian
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 02, 2012 at 10:12:38AM -0700, Ian Lance Taylor wrote: On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote: On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Compiling with C++ should always give us -funwind-tables. It doesn't give that, because the compiler is compiled with -fno-exceptions -fno-rtti. Jakub
Re: [PATCH] Rs6000 infrastructure cleanup (switches), revised patch #2d
On Mon, Oct 1, 2012 at 7:11 PM, Michael Meissner meiss...@linux.vnet.ibm.com wrote: 2012-10-01 Michael Meissner meiss...@linux.vnet.ibm.com * config/rs6000/rs6000.c (rs6000_option_override_internal): If -mcpu=xxx is not specified and the compiler is not configured using --with-cpu=xxx, use the bits from the TARGET_DEFAULT to set the initial options. I reworked the patch to allow TARGET_DEFAULT bits to be set if there is no -mcpu=xxx and the compiler was not configured using --with-cpu=xxx, so that we don't first clear all of the ISA bits, set them from the cpu, and then merge back in the TARGET_DEFAULT bits. This version of the patch is good. Thanks, David
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 12:14 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, Oct 02, 2012 at 10:12:38AM -0700, Ian Lance Taylor wrote: On Tue, Oct 2, 2012 at 8:22 AM, Uros Bizjak ubiz...@gmail.com wrote: On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Compiling with C++ should always give us -funwind-tables. It doesn't give that, because the compiler is compiled with -fno-exceptions -fno-rtti. I believe in the long term we would want to drop either of those. -- Gaby
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 7:44 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Compiling with C++ should always give us -funwind-tables. It doesn't give that, because the compiler is compiled with -fno-exceptions -fno-rtti. I believe in the long term we would want to drop either of those. For the short term, I am bootstrapping the attached patch, which adds -funwind-tables to the other noexception flags. Uros. Index: configure === --- configure (revision 191991) +++ configure (working copy) @@ -6636,7 +6636,7 @@ # Disable exceptions and RTTI if building with g++ noexception_flags= save_CFLAGS=$CFLAGS -for real_option in -fno-exceptions -fno-rtti; do +for real_option in -fno-exceptions -fno-rtti -funwind-tables; do # Do the check with the no- prefix removed since gcc silently # accepts any -Wno-* option on purpose case $real_option in Index: configure.ac === --- configure.ac (revision 191991) +++ configure.ac (working copy) @@ -365,7 +365,8 @@ # Disable exceptions and RTTI if building with g++ ACX_PROG_CC_WARNING_OPTS( - m4_quote(m4_do([-fno-exceptions -fno-rtti])), [noexception_flags]) + m4_quote(m4_do([-fno-exceptions -fno-rtti -funwind-tables])), + [noexception_flags]) # Enable expensive internal checks is_release=
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
On 9/23/2012 7:19 PM, Bruce Korb wrote: The attached patch needs to be split into two and I will do that before I actually push the thing. Since I have run out of play time this weekend and since I will be in the Ukraine in two weeks for two weeks, this patch is unlikely to get pushed before the end of October. Sorry about that. I've tried to do some of this work since Bruce is out. I ended up splitting it into four patches. Patches to follow, Robert Mason
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Patch 1: [fixincludes] Fixes for VxWorks TODO Prior to commit: * fixincl.x: Regenerate ChangeLog [fixincludes]: 2012-06-19 Robert Mason r...@verizon.net * fixinc.in: Check to see if the machine_name fix needs to be disabled. viz. vxworks must not check the machine name for fix applicability. * inclhack.def (AAB_vxworks_assert): New replacement fix (AAB_vxworks_regs_vxtypes): likewise (AAB_vxworks_stdint): and again (AAB_vxworks_unistd) and yet again (vxworks_ioctl_macro): wrap ioctl function in macro (vxworks_mkdir_macro): remove mkdir() args vxworks doesn't support (vxworks_regs): make sure regs.h comes from above arch directory. (vxworks_write_const): add const attribute to data argument * mkfixinc.sh: remove vxworks from list of platforms skipped by fixincludes 2012-09-23 Bruce Korb bk...@gnu.org * tests/base/ioLib.h: new test header for new vxworks fix. * tests/base/math.h: fix results movement * tests/base/sys/stat.h: vxworks test * tests/base/testing.h: vxworks test
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Forgot to attach. On 10/2/2012 2:09 PM, rbmj wrote: Patch 1: [fixincludes] Fixes for VxWorks TODO Prior to commit: * fixincl.x: Regenerate ChangeLog [fixincludes]: 2012-06-19 Robert Mason r...@verizon.net * fixinc.in: Check to see if the machine_name fix needs to be disabled. viz. vxworks must not check the machine name for fix applicability. * inclhack.def (AAB_vxworks_assert): New replacement fix (AAB_vxworks_regs_vxtypes): likewise (AAB_vxworks_stdint): and again (AAB_vxworks_unistd) and yet again (vxworks_ioctl_macro): wrap ioctl function in macro (vxworks_mkdir_macro): remove mkdir() args vxworks doesn't support (vxworks_regs): make sure regs.h comes from above arch directory. (vxworks_write_const): add const attribute to data argument * mkfixinc.sh: remove vxworks from list of platforms skipped by fixincludes 2012-09-23 Bruce Korb bk...@gnu.org * tests/base/ioLib.h: new test header for new vxworks fix. * tests/base/math.h: fix results movement * tests/base/sys/stat.h: vxworks test * tests/base/testing.h: vxworks test From 5da04a0758548288d5f004ed294ac3e903e229a8 Mon Sep 17 00:00:00 2001 From: rbmj r...@verizon.net Date: Tue, 2 Oct 2012 13:51:18 -0400 Subject: [PATCH 1/4] [fixincludes] Add fixes for VxWorks --- fixincludes/fixinc.in | 16 ++ fixincludes/inclhack.def | 266 fixincludes/mkfixinc.sh|1 - fixincludes/tests/base/ioLib.h | 19 ++ fixincludes/tests/base/math.h | 10 +- fixincludes/tests/base/sys/stat.h |7 + fixincludes/tests/base/testing.h |6 + 8 files changed, 324 insertions(+), 85 deletions(-) create mode 100644 fixincludes/tests/base/ioLib.h diff --git a/fixincludes/fixinc.in b/fixincludes/fixinc.in index e73aed9..f7b8d8f 100755 --- a/fixincludes/fixinc.in +++ b/fixincludes/fixinc.in @@ -128,6 +128,22 @@ fi # # # # # # # # # # # # # # # # # # # # # # +# Check to see if the machine_name fix needs to be disabled. +# +# On some platforms, machine_name doesn't work properly and +# breaks some of the header files. 
Since everything works +# properly without it, just wipe the macro list to +# disable the fix. + +case ${target_canonical} in +*-*-vxworks*) + test -f ${MACRO_LIST} && echo > ${MACRO_LIST} +;; +esac + + +# # # # # # # # # # # # # # # # # # # # # +# # In the file macro_list are listed all the predefined # macros that are not in the C89 reserved namespace (the reserved # namespace is all identifiers beginnning with two underscores or one diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def index 82792af..c5ae854 100644 --- a/fixincludes/inclhack.def +++ b/fixincludes/inclhack.def @@ -354,6 +354,206 @@ fix = { _EndOfHeader_; }; +/* + * Fix assert.h on VxWorks: + */ +fix = { +hackname= AAB_vxworks_assert; +files = assert.h; +mach= "*-*-vxworks*"; + +replace = <<- _EndOfHeader_ + #ifndef _ASSERT_H + #define _ASSERT_H + + #ifdef assert + #undef assert + #endif + + #if defined(__STDC__) || defined(__cplusplus) + extern void __assert (const char*); + #else + extern void __assert (); + #endif + + #ifdef NDEBUG + #define assert(ign) ((void)0) + #else + + #define ASSERT_STRINGIFY(str) ASSERT_STRINGIFY_HELPER(str) + #define ASSERT_STRINGIFY_HELPER(str) #str + + #define assert(test) ((void) \ + ((test) ? ((void)0) : \ + __assert("Assertion failed: " ASSERT_STRINGIFY(test) ", file " \ + __FILE__ ", line " ASSERT_STRINGIFY(__LINE__) "\n"))) + + #endif + + #endif + _EndOfHeader_; +}; + +/* + * Add needed include to regs.h (NOT the gcc header) on VxWorks + */ + +fix = { +hackname= AAB_vxworks_regs_vxtypes; +files = regs.h; +mach= "*-*-vxworks*"; + +replace = <<- _EndOfHeader_ + #ifndef _REGS_H + #define _REGS_H + #include <types/vxTypesOld.h> + #include_next <arch/../regs.h> + #endif + _EndOfHeader_; +}; + +/* + * Make VxWorks stdint.h a bit more compliant - add typedefs + */ +fix = { +hackname= AAB_vxworks_stdint; +files = stdint.h; +mach= "*-*-vxworks*"; + +replace = <<- _EndOfHeader_ + #ifndef _STDINT_H + #define _STDINT_H + /* get int*_t, uint*_t */ + #include <types/vxTypes.h> + + /* get legacy vxworks types for compatibility */ + #include <types/vxTypesOld.h> + + typedef long intptr_t; + typedef unsigned long uintptr_t; + + typedef int64_t intmax_t; + typedef uint64_t uintmax_t; + + typedef int8_t int_least8_t; + typedef int16_t int_least16_t; + typedef int32_t int_least32_t; + typedef int64_t int_least64_t; + + typedef uint8_t uint_least8_t; + typedef uint16_t uint_least16_t; + typedef uint32_t uint_least32_t; + typedef uint64_t uint_least64_t; + + typedef int8_t int_fast8_t; + typedef int int_fast16_t; + typedef int32_t int_fast32_t; + typedef
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Patch 2: [fixincludes] Clean up fixincludes test machinery TODO Prior to commit: * fixincl.x: Regenerate ChangeLog 2012-09-23 Bruce Korb bk...@gnu.org * check.tpl: export TEST_MODE=true for testing * fixincl.c (te_verbose): extract to fixlib.h (run_compiles): in test mode, if the fix is a replacement, then skip the test. The fix will not be applied. * fixlib.h (fixinc_mode): new global variable that defaults to TESTING_OFF but is set to TESTING_ON when TEST_MODE is true. * fixopts.c: define this global variable (initialize_opts): set it to TESTING_ON under proper conditions * inclhack.def (AAB_darwin7_9_long_double_funcs_2): this is *NOT* a replacement fix. Rename it and move it where it belongs as (darwin_9_long_double_funcs_2): renamed fix (broken_nan): this had a broken selection regex. Could never work. * tests/base/architecture/ppc/math.h: replacement fixes are not tested, so remove all the replacement text. Add in the broken_nan test that used to never, ever fire.
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Patch 3: Add --enable-libstdcxx option at top level configure TODO prior to commit: * configure: regenerate ChangeLog: * configure.ac: Add --enable-libstdcxx option From 3f0d38b7b7b70659a57ac4266701a71a5f948860 Mon Sep 17 00:00:00 2001 From: rbmj r...@verizon.net Date: Tue, 2 Oct 2012 13:54:21 -0400 Subject: [PATCH 3/4] Add --enable-libstdcxx option at top level configure --- configure.ac | 38 +- 1 file changed, 25 insertions(+), 13 deletions(-) diff --git a/configure.ac b/configure.ac index f0d86d9..5325695 100644 --- a/configure.ac +++ b/configure.ac @@ -427,6 +427,15 @@ AC_ARG_ENABLE(libssp, ENABLE_LIBSSP=$enableval, ENABLE_LIBSSP=yes) +AC_ARG_ENABLE(libstdcxx, +AS_HELP_STRING([--disable-libstdcxx], + [do not build libstdc++-v3 directory]), +ENABLE_LIBSTDCXX=$enableval, +ENABLE_LIBSTDCXX=default) +[if test ${ENABLE_LIBSTDCXX} = no ; then + noconfigdirs="$noconfigdirs libstdc++-v3" +fi] + # Save it here so that, even in case of --enable-libgcj, if the Java # front-end isn't enabled, we still get libgcj disabled. libgcj_saved=$libgcj @@ -562,19 +571,22 @@ case ${target} in esac # Disable libstdc++-v3 for some systems. -case ${target} in - *-*-vxworks*) -# VxWorks uses the Dinkumware C++ library. -noconfigdirs="$noconfigdirs target-libstdc++-v3" -;; - arm*-wince-pe*) -# the C++ libraries don't build on top of CE's C libraries -noconfigdirs="$noconfigdirs target-libstdc++-v3" -;; - avr-*-*) -noconfigdirs="$noconfigdirs target-libstdc++-v3" -;; -esac +# Allow user to override this if they pass --enable-libstdc++-v3 +if test ${ENABLE_LIBSTDCXX} = default ; then + case ${target} in +*-*-vxworks*) + # VxWorks uses the Dinkumware C++ library. + noconfigdirs="$noconfigdirs target-libstdc++-v3" + ;; +arm*-wince-pe*) + # the C++ libraries don't build on top of CE's C libraries + noconfigdirs="$noconfigdirs target-libstdc++-v3" + ;; +avr-*-*) + noconfigdirs="$noconfigdirs target-libstdc++-v3" + ;; + esac +fi # Disable Fortran for some systems. case ${target} in -- 1.7.10.4
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Patch 4: Minor changes to fix compilation on VxWorks ChangeLog [gcc]: * gcov-io.c (gcov_open): Pass third argument to open() unconditionally ChangeLog [libstdc++-v3]: * libstdc++-v3/config/os/vxworks/os_defines.h: Define NOMINMAX From 420bf6c2b0bde5f1689663b477add8fc9df2a6f0 Mon Sep 17 00:00:00 2001 From: rbmj r...@verizon.net Date: Tue, 2 Oct 2012 13:55:02 -0400 Subject: [PATCH 4/4] Minor source changes to allow compilation on VxWorks --- gcc/gcov-io.c |3 ++- libstdc++-v3/config/os/vxworks/os_defines.h |6 ++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c index d64fb42..f562654 100644 --- a/gcc/gcov-io.c +++ b/gcc/gcov-io.c @@ -92,7 +92,8 @@ gcov_open (const char *name, int mode) { /* Read-only mode - acquire a read-lock. */ s_flock.l_type = F_RDLCK; - fd = open (name, O_RDONLY); + /* pass mode (ignored) for compatibility */ + fd = open (name, O_RDONLY, S_IRUSR | S_IWUSR); } else { diff --git a/libstdc++-v3/config/os/vxworks/os_defines.h b/libstdc++-v3/config/os/vxworks/os_defines.h index c66063e..93ad1d4 100644 --- a/libstdc++-v3/config/os/vxworks/os_defines.h +++ b/libstdc++-v3/config/os/vxworks/os_defines.h @@ -33,4 +33,10 @@ // System-specific #define, typedefs, corrections, etc, go here. This // file will come before all others. +//Keep vxWorks from defining min()/max() as macros +#ifdef NOMINMAX +#undef NOMINMAX +#endif +#define NOMINMAX 1 + #endif -- 1.7.10.4
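To illustrate the gcov-io.c change above, here is a minimal C sketch, assuming a POSIX host. The helper name open_readonly is hypothetical and is not part of gcov-io.c. On POSIX systems the third argument to open is only consulted when a file may be created (O_CREAT), so passing it with O_RDONLY should be harmless there; the patch passes it unconditionally for the benefit of VxWorks.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper mirroring the gcov_open change: always supply a
   mode argument, even for a read-only open where POSIX ignores it.  */
static int
open_readonly (const char *name)
{
  /* Returns a file descriptor, or -1 with errno set on failure.  */
  return open (name, O_RDONLY, S_IRUSR | S_IWUSR);
}
```

The caller remains responsible for closing the returned descriptor with close.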
Re: [PATCH] Add a new option -fstack-protector-strong (patch / doc inside)
Hi, has anyone had a chance to take a look at this patch? It seems that others are also interested in it; the clang developer is also proposing to implement this -fstack-protector-strong option. The patch has just been merged with the newest trunk, and a bug reported by Kees has been fixed. Tested for x86_64 and arm. The patches below are also uploaded as patchset #4 at https://codereview.appspot.com/6303078/ == diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 299150e..6eb18d6 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -1244,6 +1244,11 @@ clear_tree_used (tree block) #define SPCT_HAS_ARRAY 4 #define SPCT_HAS_AGGREGATE 8 +/* Constants for flag_stack_protect. */ +#define SPCT_ALL 3 +#define SPCT_STRONG 2 +#define SPCT_DEFAULT 1 + static unsigned int stack_protect_classify_type (tree type) { @@ -1306,7 +1311,8 @@ stack_protect_decl_phase (tree decl) if (bits & SPCT_HAS_SMALL_CHAR_ARRAY) has_short_buffer = true; - if (flag_stack_protect == 2) + if (flag_stack_protect == SPCT_ALL || + flag_stack_protect == SPCT_STRONG) { if ((bits & (SPCT_HAS_SMALL_CHAR_ARRAY | SPCT_HAS_LARGE_CHAR_ARRAY)) && !(bits & SPCT_HAS_AGGREGATE)) @@ -1444,6 +1450,29 @@ estimated_stack_frame_size (struct cgraph_node *node) return size; } +/* Helper routine to check if a record or union contains an array field. */ + +static int +record_or_union_type_has_array_p (const_tree tree_type) +{ + tree fields = TYPE_FIELDS (tree_type); + tree f; + + for (f = fields; f; f = DECL_CHAIN (f)) +{ + if (TREE_CODE (f) == FIELD_DECL) + { + tree field_type = TREE_TYPE (f); + if (RECORD_OR_UNION_TYPE_P (field_type) + && record_or_union_type_has_array_p (field_type)) +return 1; + if (TREE_CODE (field_type) == ARRAY_TYPE) +return 1; + } +} + return 0; +} + /* Expand all variables used in the function. */ static void @@ -1454,6 +1483,7 @@ expand_used_vars (void) struct pointer_map_t *ssa_name_decls; unsigned i; unsigned len; + int gen_stack_protect_signal = 0; /* Compute the phase of the stack frame for this function. 
*/ { @@ -1505,6 +1535,23 @@ expand_used_vars (void) } pointer_map_destroy (ssa_name_decls); + FOR_EACH_LOCAL_DECL (cfun, i, var) +{ + tree var_type = TREE_TYPE (var); + /* Examine local referenced variables that have their addresses taken, + contain an array, or are arrays. */ + if (TREE_CODE (var) == VAR_DECL + && (TREE_CODE (var_type) == ARRAY_TYPE + || TREE_ADDRESSABLE (var) + || (RECORD_OR_UNION_TYPE_P (var_type) + && record_or_union_type_has_array_p (var_type)))) + { + ++gen_stack_protect_signal; + break; + } +} + + /* At this point all variables on the local_decls with TREE_USED set are not associated with any block scope. Lay them out. */ @@ -1591,11 +1638,18 @@ expand_used_vars (void) dump_stack_var_partition (); } - /* There are several conditions under which we should create a - stack guard: protect-all, alloca used, protected decls present. */ - if (flag_stack_protect == 2 - || (flag_stack_protect - && (cfun->calls_alloca || has_protected_decls))) + /* Create stack guard, if + a) -fstack-protector-all - always; + b) -fstack-protector-strong - if there are arrays, memory + references to local variables, alloca used, or protected decls present; + c) -fstack-protector - if alloca used, or protected decls present */ + if (flag_stack_protect == SPCT_ALL /* -fstack-protector-all */ + || (flag_stack_protect == SPCT_STRONG /* -fstack-protector-strong */ + && (gen_stack_protect_signal || cfun->calls_alloca + || has_protected_decls)) + || (flag_stack_protect == SPCT_DEFAULT /* -fstack-protector */ + && (cfun->calls_alloca + || has_protected_decls))) create_stack_guard (); /* Assign rtl to each variable based on these partitions. */ @@ -1612,7 +1666,8 @@ expand_used_vars (void) expand_stack_vars (stack_protect_decl_phase_1); /* Phase 2 contains other kinds of arrays. 
*/ - if (flag_stack_protect == 2) + if (flag_stack_protect == SPCT_ALL || + flag_stack_protect == SPCT_STRONG) expand_stack_vars (stack_protect_decl_phase_2); } diff --git a/gcc/common.opt b/gcc/common.opt index f0e757c..942fbc0 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1892,8 +1892,12 @@ fstack-protector Common Report Var(flag_stack_protect, 1) Use propolice as a stack protection method -fstack-protector-all +fstack-protector-strong Common Report RejectNegative Var(flag_stack_protect, 2) +Use a smart stack protection method for certain functions + +fstack-protector-all +Common Report RejectNegative Var(flag_stack_protect, 3) Use a stack protection method for every function fstack-usage diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7578dda..e1f2f2d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -406,7 +406,7 @@ Objective-C and
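The three-way guard decision described in the cfgexpand.c comment of this patch (cases a, b and c) can be sketched in stand-alone C as follows. This is only an illustration under stated assumptions: the function needs_stack_guard, its boolean parameters, and the SPCT_NONE value are hypothetical and do not appear in the patch; only SPCT_DEFAULT, SPCT_STRONG and SPCT_ALL and their values come from it.

```c
#include <stdbool.h>

/* Values of flag_stack_protect as defined in the patch; SPCT_NONE is
   a hypothetical name for "no stack protector option given".  */
enum { SPCT_NONE = 0, SPCT_DEFAULT = 1, SPCT_STRONG = 2, SPCT_ALL = 3 };

/* Hypothetical stand-alone version of the decision in
   expand_used_vars: should a stack guard be created?  */
static bool
needs_stack_guard (int flag, bool has_array_or_addr_taken,
                   bool calls_alloca, bool has_protected_decls)
{
  switch (flag)
    {
    case SPCT_ALL:      /* -fstack-protector-all: always.  */
      return true;
    case SPCT_STRONG:   /* -fstack-protector-strong.  */
      return has_array_or_addr_taken || calls_alloca || has_protected_decls;
    case SPCT_DEFAULT:  /* -fstack-protector.  */
      return calls_alloca || has_protected_decls;
    default:            /* No stack protection requested.  */
      return false;
    }
}
```

Here has_array_or_addr_taken stands in for the patch's gen_stack_protect_signal, which is set when a local variable is an array, contains an array, or has its address taken.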
Re: [PATCH] fix up fixincludes for VxWorks and fix testing
Forgot to attach... On 10/2/2012 2:11 PM, rbmj wrote: Patch 2: [fixincludes] Clean up fixincludes test machinery TODO Prior to commit: * fixincl.x: Regenerate ChangeLog 2012-09-23 Bruce Korb bk...@gnu.org * check.tpl: export TEST_MODE=true for testing * fixincl.c (te_verbose): extract to fixlib.h (run_compiles): in test mode, if the fix is a replacement, then skip the test. The fix will not be applied. * fixlib.h (fixinc_mode): new global variable that defaults to TESTING_OFF but is set to TESTING_ON when TEST_MODE is true. * fixopts.c: define this global variable (initialize_opts): set it to TESTING_ON under proper conditions * inclhack.def (AAB_darwin7_9_long_double_funcs_2): this is *NOT* a replacement fix. Rename it and move it where it belongs as (darwin_9_long_double_funcs_2): renamed fix (broken_nan): this had a broken selection regex. Could never work. * tests/base/architecture/ppc/math.h: replacement fixes are not tested, so remove all the replacement text. Add in the broken_nan test that used to never, ever fire. From 56861b9c45b43c1443f88e56e6fa46fde590a70f Mon Sep 17 00:00:00 2001 From: rbmj r...@verizon.net Date: Tue, 2 Oct 2012 13:52:27 -0400 Subject: [PATCH 2/4] [fixincludes] Clean up fixincludes test machinery --- fixincludes/README |3 +++ fixincludes/check.tpl|1 + fixincludes/fixincl.c| 27 +++ fixincludes/fixlib.h | 26 +- fixincludes/fixopts.c| 42 +++--- fixincludes/fixtests.c |2 +- fixincludes/inclhack.def | 42 +- fixincludes/tests/base/architecture/ppc/math.h | 84 +--- 7 files changed, 89 insertions(+), 54 deletions(-) diff --git a/fixincludes/README b/fixincludes/README index c7144a0..9b48210 100644 --- a/fixincludes/README +++ b/fixincludes/README @@ -44,6 +44,9 @@ To make your fix, you will need to do several things: Make sure it is now properly handled. Add tests to the test_text entry(ies) that validate your fix. This will help ensure that future fixes won't negate your work. +Do *NOT* specify test text for wrap or replacement fixes. 
+There is no real possibility that these fixes will fail. +If they do, you will surely know straight away. 5. Go into the fixincludes build directory and type, "make check". You are guaranteed to have issues printed out as a result. diff --git a/fixincludes/check.tpl b/fixincludes/check.tpl index a9810e2..0d1f444 100644 --- a/fixincludes/check.tpl +++ b/fixincludes/check.tpl @@ -99,6 +99,7 @@ ENDFOR fix =] +export TEST_MODE=true find . -type f | sed 's;^\./;;' | sort | ../../fixincl cd ${DESTDIR} diff --git a/fixincludes/fixincl.c b/fixincludes/fixincl.c index 1133534..fecfb19 100644 --- a/fixincludes/fixincl.c +++ b/fixincludes/fixincl.c @@ -53,22 +53,8 @@ static const char z_std_preamble[] = original, manufacturer supplied header file. */\n\n"; int find_base_len = 0; - -typedef enum { - VERB_SILENT = 0, - VERB_FIXES, - VERB_APPLIES, - VERB_PROGRESS, - VERB_TESTS, - VERB_EVERYTHING -} te_verbose; - -te_verbose verbose_level = VERB_PROGRESS; int have_tty = 0; -#define VLEVEL(l) ((unsigned int) verbose_level >= (unsigned int) l) -#define NOT_SILENT VLEVEL(VERB_FIXES) - pid_t process_chain_head = (pid_t) -1; char* pz_curr_file; /* name of the current file under test/fix */ @@ -412,8 +398,17 @@ run_compiles (void) /* FOR every fixup, ... */ do { - tTestDesc *p_test = p_fixd->p_test_desc; - int test_ct = p_fixd->test_ct; + tTestDesc *p_test; + int test_ct; + + if (fixinc_mode && (p_fixd->fd_flags & FD_REPLACEMENT)) +{ + p_fixd->fd_flags |= FD_SKIP_TEST; + continue; +} + + p_test = p_fixd->p_test_desc; + test_ct = p_fixd->test_ct; /* IF the machine type pointer is not NULL (we are not in test mode) AND this test is for or not done on particular machines diff --git a/fixincludes/fixlib.h b/fixincludes/fixlib.h index 42d98b2..19df48a 100644 --- a/fixincludes/fixlib.h +++ b/fixincludes/fixlib.h @@ -140,7 +140,10 @@ typedef int apply_fix_p_t; /* Apply Fix Predicate Type */ "amount of user entertainment" )\ \ _ENV_( pz_find_base, BOOL_TRUE, "FIND_BASE", \ - "leader to trim from file names" ) + "leader to trim from file names" ) \ + \ + _ENV_( pz_test_mode, BOOL_FALSE, "TEST_MODE", \ + "run fixincludes in test mode" ) #define _ENV_(v,m,n,t) extern tCC* v; ENV_TABLE @@ -211,6 +214,27 @@ typedef struct { extern int gnu_type_map_ct; +typedef enum { + VERB_SILENT = 0, + VERB_FIXES, + VERB_APPLIES, + VERB_PROGRESS, + VERB_TESTS, + VERB_EVERYTHING +}
[wwwdocs] Buildstat update for 4.4
Latest results for 4.4.x -tgc Testresults for 4.4.7: alphaev68-dec-osf5.1a hppa2.0w-hp-hpux11.00 hppa2.0w-hp-hpux11.11 hppa64-hp-hpux11.00 hppa64-hp-hpux11.11 i386-pc-solaris2.8 Testresults for 4.4.1: alphaev68-dec-osf5.1a Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.4/buildstat.html,v retrieving revision 1.26 diff -u -r1.26 buildstat.html --- buildstat.html 4 Apr 2012 11:24:55 - 1.26 +++ buildstat.html 2 Oct 2012 18:17:11 - @@ -34,10 +34,12 @@ tdalphaev68-dec-osf5.1a/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02456.html;4.4.7/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00586.html;4.4.6/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00074.html;4.4.6/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-12/msg01338.html;4.4.5/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-07/msg01437.html;4.4.4/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02455.html;4.4.1/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2009-07/msg03093.html;4.4.1/a /td /tr @@ -134,6 +136,7 @@ tdhppa2.0w-hp-hpux11.00/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00105.html;4.4.7/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg00081.html;4.4.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2009-11/msg01652.html;4.4.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2009-04/msg03163.html;4.4.0/a @@ -144,6 +147,7 @@ tdhppa2.0w-hp-hpux11.11/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02261.html;4.4.7/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00201.html;4.4.6/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg02383.html;4.4.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-01/msg02240.html;4.4.3/a, @@ -158,6 +162,7 @@ tdhppa64-hp-hpux11.00/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02861.html;4.4.7/a, a 
href=http://gcc.gnu.org/ml/gcc-testresults/2010-08/msg02231.html;4.4.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2009-05/msg00296.html;4.4.0/a /td @@ -167,6 +172,7 @@ tdhppa64-hp-hpux11.11/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-08/msg02159.html;4.4.7/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00400.html;4.4.6/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-05/msg02914.html;4.4.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-01/msg02364.html;4.4.3/a, @@ -181,6 +187,7 @@ tdi386-pc-solaris2.8/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02185.html;4.4.7/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-07/msg00900.html;4.4.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2009-05/msg00105.html;4.4.0/a /td
Re: [PATCH v2, libbacktrace]: Compile with -funwind-tables
On Tue, Oct 2, 2012 at 10:48 AM, Uros Bizjak ubiz...@gmail.com wrote: On Tue, Oct 2, 2012 at 7:44 PM, Gabriel Dos Reis g...@integrable-solutions.net wrote: On a related issue, it looks to me that the compiler itself should be compiled with -funwind-tables, otherwise there are no backtraces generated, even if libbacktrace is linked in and operational. Again, x86_64-linux-gnu host defaults to this flag, but other hosts are left behind. Compiling with C++ should always give us -funwind-tables. It doesn't give that, because the compiler is compiled with -fno-exceptions -fno-rtti. I believe in the long term we would want to drop either of those. For the short term, I am bootstrapping the attached patch, which adds -funwind-tables to the other noexcept flags.

I think you should use -fasynchronous-unwind-tables here. That way we can get a backtrace if the compiler gets a segmentation violation. I'll approve this patch with that change. But you might want to check whether you can see any change in bootstrap time or compiler size (sorry).

Thanks.

Ian
Re: abs(long long)
On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: Whining on this list about libstdc++ internal macros and your dislike of them is not going to produce anything today or tomorrow.

Other compilers using libstdc++ was just an extra argument. Even if g++ was the only compiler on earth, I would still consider a compile-time test superior to a configure test. The macro __SIZEOF_INT128__ was invented precisely for this purpose. Yes, that's just more whining ;-)

On Tue, 2 Oct 2012, Gabriel Dos Reis wrote: On Tue, Oct 2, 2012 at 8:07 AM, Marc Glisse marc.gli...@inria.fr wrote: Or do you mean: always call __builtin_llabs (whether we have an llabs or not), and let the compiler replace it with either (x<0)?-x:x or a library call (I assume it never does that unless it has seen a corresponding declaration)? See what we did in c/cmath and c_global/cmath.

Note that llabs is quite different from asin. __builtin_llabs generates an ABS_EXPR, which will later be expanded either to a special instruction or to a condition. It never generates a call to llabs (I am not sure exactly if Paolo's instructions to use llabs meant he wanted an actual library call). __builtin_asin on the other hand is never expanded inline (except maybe for special constant input like 0) and expands to a call to the library function asin.

Would the attached patch be better, assuming it passes testing? For lldiv, there is no builtin (for good reason).

	* include/c_std/cstdlib (abs(long long)): Define with
	__builtin_llabs when we have long long.
	(abs(__int128)): Define when we have __int128.
	(div(long long, long long)): Use lldiv.
-- 
Marc Glisse

Index: cstdlib
===================================================================
--- cstdlib (revision 191941)
+++ cstdlib (working copy)
@@ -128,21 +128,31 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   using ::strtod;
   using ::strtol;
   using ::strtoul;
   using ::system;
 #ifdef _GLIBCXX_USE_WCHAR_T
   using ::wcstombs;
   using ::wctomb;
 #endif // _GLIBCXX_USE_WCHAR_T
 
   inline long
-  abs(long __i) { return labs(__i); }
+  abs(long __i) { return __builtin_labs(__i); }
+
+#ifdef _GLIBCXX_USE_LONG_LONG
+  inline long long
+  abs(long long __x) { return __builtin_llabs (__x); }
+#endif
+
+#if !defined(__STRICT_ANSI__) && defined(_GLIBCXX_USE_INT128)
+  inline __int128
+  abs(__int128 __x) { return __x >= 0 ? __x : -__x; }
+#endif
 
   inline ldiv_t
   div(long __i, long __j) { return ldiv(__i, __j); }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #if _GLIBCXX_USE_C99
 
 #undef _Exit
@@ -161,29 +171,26 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::lldiv_t;
 #endif
 #if _GLIBCXX_USE_C99_CHECK || _GLIBCXX_USE_C99_DYNAMIC
   extern "C" void (_Exit)(int) throw () _GLIBCXX_NORETURN;
 #endif
 #if !_GLIBCXX_USE_C99_DYNAMIC
   using ::_Exit;
 #endif
 
-  inline long long
-  abs(long long __x) { return __x >= 0 ? __x : -__x; }
-
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::llabs;
 
   inline lldiv_t
   div(long long __n, long long __d)
-  { lldiv_t __q; __q.quot = __n / __d; __q.rem = __n % __d; return __q; }
+  { return ::lldiv (__n, __d); }
 
   using ::lldiv;
 #endif
 
 #if _GLIBCXX_USE_C99_LONG_LONG_CHECK || _GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   extern "C" long long int (atoll)(const char *) throw ();
   extern "C" long long int
     (strtoll)(const char * __restrict, char ** __restrict, int) throw ();
   extern "C" unsigned long long int
     (strtoull)(const char * __restrict, char ** __restrict, int) throw ();
@@ -198,21 +205,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace __gnu_cxx
 
 namespace std
 {
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::lldiv_t;
 #endif
   using ::__gnu_cxx::_Exit;
-  using ::__gnu_cxx::abs;
 #if !_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC
   using ::__gnu_cxx::llabs;
   using ::__gnu_cxx::div;
   using ::__gnu_cxx::lldiv;
 #endif
   using ::__gnu_cxx::atoll;
   using ::__gnu_cxx::strtof;
   using ::__gnu_cxx::strtoll;
   using ::__gnu_cxx::strtoull;
   using ::__gnu_cxx::strtold;
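For readers who want to try the two approaches discussed in this message, here is a minimal sketch in plain C (not the libstdc++ code itself). The function names abs_ll and abs_i128 are made up for illustration; __builtin_llabs and __SIZEOF_INT128__ are the GCC facilities named in the thread. It assumes a GCC-compatible compiler.

```c
/* Uses the builtin, which the message says GCC expands inline
   (as an ABS_EXPR) instead of emitting a call to llabs.  */
long long abs_ll(long long x)
{
  return __builtin_llabs(x);
}

#ifdef __SIZEOF_INT128__
/* The thread proposes no builtin for __int128, so the condition
   is written out by hand, as in the patch.  */
__int128 abs_i128(__int128 x)
{
  return x >= 0 ? x : -x;
}
#endif
```

The #ifdef is the compile-time test argued for above: the second function only exists when the compiler itself announces __int128 support, with no configure step involved.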
[PR54177] Deal with var_lowpart failure in function parameters
Uros has already taken care of the main patch for the problem, but I feel it's appropriate to protect vt_add_function_parameter should var_lowpart actually return NULL. I'm checking this in as obvious. Regstrapped on x86_64-linux-gnu and i686-linux-gnu.

Deal with var_lowpart failure in function parameters.

From: Alexandre Oliva aol...@redhat.com

for gcc/ChangeLog

	* var-tracking.c (vt_add_function_parameter): Bail if
	var_lowpart fails.
---

 gcc/var-tracking.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 9f5bc12..bbd2f4b 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9428,6 +9428,7 @@ vt_add_function_parameter (tree parm)
       && GET_CODE (incoming) != PARALLEL)
     {
       cselib_val *val;
+      rtx lowpart;
 
       /* ??? We shouldn't ever hit this, but it may happen because
          arguments passed by invisible reference aren't dealt with
@@ -9436,7 +9437,11 @@ vt_add_function_parameter (tree parm)
       if (offset)
         return;
 
-      val = cselib_lookup_from_insn (var_lowpart (mode, incoming), mode, true,
+      lowpart = var_lowpart (mode, incoming);
+      if (!lowpart)
+        return;
+
+      val = cselib_lookup_from_insn (lowpart, mode, true,
                                      VOIDmode, get_insns ());
 
       /* ??? Float-typed values in memory are not handled by

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
[wwwdocs] Buildstat update for 4.5
Latest results for 4.5.x -tgc Testresults for 4.5.4: alphaev68-dec-osf5.1a hppa2.0w-hp-hpux11.00 hppa64-hp-hpux11.00 Testresults for 4.5.3: alphaev68-dec-osf5.1a i386-pc-solaris2.8 Testresults for 4.5.2: alphaev68-dec-osf5.1a Testresults for 4.5.1: alphaev68-dec-osf5.1a Testresults for 4.5.0: alphaev68-dec-osf5.1a Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.5/buildstat.html,v retrieving revision 1.14 diff -u -r1.14 buildstat.html --- buildstat.html 4 Apr 2012 16:11:35 - 1.14 +++ buildstat.html 2 Oct 2012 18:27:49 - @@ -56,6 +56,18 @@ /tr tr +tdalphaev68-dec-osf5.1a/td +tdnbsp;/td +tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02460.html;4.5.4/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02459.html;4.5.3/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02458.html;4.5.2/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02457.html;4.5.1/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02452.html;4.5.0/a +/td +/tr + +tr tdarmv7l-unknown-linux-gnueabi/td tdnbsp;/td tdTest results: @@ -75,6 +87,7 @@ tdhppa2.0w-hp-hpux11.00/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00598.html;4.5.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg01358.html;4.5.3/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg01008.html;4.5.1/a /td @@ -93,6 +106,7 @@ tdhppa64-hp-hpux11.00/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00833.html;4.5.4/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg01736.html;4.5.3/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg01432.html;4.5.1/a /td @@ -129,6 +143,7 @@ tdi386-pc-solaris2.8/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02309.html;4.5.3/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01359.html;4.5.3/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg01215.html;4.5.3/a, a 
href=http://gcc.gnu.org/ml/gcc-testresults/2011-05/msg00795.html;4.5.3/a,
[PR53135] Use block4 form for large debug expressions
This patch fixes a crash in dwarf2out because of a too-large debug expression. Jakub approved it for trunk and 4.7 branches in bugzilla. I'm installing it in the trunk momentarily, and later today on 4.7 after I give it a spin there. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. I'm keeping the testcase open because we still have an underlying problem and other improvements to make.

Use block4 form for large debug expressions.

From: Alexandre Oliva aol...@redhat.com

for gcc/ChangeLog

	PR debug/53135
	* dwarf2out.c (value_format): Use block4 for dw_val_class_loc
	when needed.
---

 gcc/dwarf2out.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index c776f68..25f57c0 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -7491,6 +7491,8 @@ value_format (dw_attr_ref a)
           return DW_FORM_block1;
         case 2:
           return DW_FORM_block2;
+        case 4:
+          return DW_FORM_block4;
         default:
           gcc_unreachable ();
         }

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
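To illustrate why a third case is needed, the sketch below picks a DWARF block form from the byte size of an expression. It is a standalone approximation, not the dwarf2out.c code: length_prefix_bytes and block_form_for_size are hypothetical helpers, and only the three form codes come from the DWARF specification. Sizes above 4 GiB are not handled.

```c
/* Form codes as assigned by the DWARF specification.  */
enum { DW_FORM_block2 = 0x03, DW_FORM_block4 = 0x04, DW_FORM_block1 = 0x0a };

/* Hypothetical helper: width in bytes of the smallest length prefix
   that can encode SIZE.  */
unsigned length_prefix_bytes(unsigned long size)
{
  if (size <= 0xff)
    return 1;
  if (size <= 0xffff)
    return 2;
  return 4;
}

/* Map the prefix width to a block form, mirroring the switch that the
   patch extends.  Returns -1 for an unexpected width.  */
int block_form_for_size(unsigned long size)
{
  switch (length_prefix_bytes(size))
    {
    case 1:
      return DW_FORM_block1;
    case 2:
      return DW_FORM_block2;
    case 4:
      return DW_FORM_block4;
    default:
      return -1;
    }
}
```

Without the case for 4, any expression longer than 65535 bytes falls through to the default branch, which in the real function is gcc_unreachable ().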
Re: [PR54551] global dead debug pseudo tracking in fast-dce
On Sep 25, 2012, Jakub Jelinek ja...@redhat.com wrote: On Sun, Sep 23, 2012 at 07:59:37AM -0300, Alexandre Oliva wrote: This patch introduces a global mode of dead_debug tracking for use in fast DCE. If a debug use reaches the top of a basic block before finding its death point, the pending and subsequent uses of the pseudo in debug insns will all be substituted with the same debug temp, and death points will get the value bound to the debug temp. Thanks for working on this. The patch generally looks good, just some minor nits below. Here's the revised version with all the nits fixed. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. I'm checking it in momentarily. Track dead pseudos used in debug insns globally in fast DCE. From: Alexandre Oliva aol...@redhat.com for gcc/ChangeLog PR debug/54551 * Makefile.in (VALTRACK_H): Add hash-table.h. * valtrack.h: Include hash-table.h. (struct dead_debug_global_entry): New. (struct dead_debug_hash_descr): New. (struct dead_debug_global): New. (struct dead_debug): Rename to... (struct dead_debug_local): ... this. Adjust all uses. (dead_debug_global_init, dead_debug_global_finish): New. (dead_debug_init): Rename to... (dead_debug_local_init): ... this. Adjust all callers. (dead_debug_finish): Rename to... (dead_debug_local_finish): ... this. Adjust all callers. * valtrack.c (dead_debug_global_init): New. (dead_debug_init): Rename to... (dead_debug_local_init): ... this. Take global parameter. Save it and initialize used bitmap from it. (dead_debug_global_find, dead_debug_global_insert): New. (dead_debug_global_replace_temp): New. (dead_debug_promote_uses): New. (dead_debug_finish): Rename to... (dead_debug_local_finish): ... this. Promote remaining uses. (dead_debug_global_finish): New. (dead_debug_add): Try to replace global temps first. (dead_debug_insert_temp): Support global replacements. * dce.c (word_dce_process_block, dce_process_block): Add global_debug parameter. Pass it on. 
(fast_dce): Initialize, pass on and finalize global_debug. * df-problems.c (df_set_unused_notes_for_mw): Adjusted. (df_create_unused_notes, df_note_bb_compute): Likewise. (df_note_compute): Justify local-only dead debug analysis. for gcc/testsuite/ChangeLog PR debug/54551 * gcc.dg/guality/pr54551.c: New. --- gcc/Makefile.in|3 gcc/dce.c | 35 +++-- gcc/df-problems.c | 15 +- gcc/testsuite/gcc.dg/guality/pr54551.c | 28 gcc/valtrack.c | 220 +--- gcc/valtrack.h | 84 +++- 6 files changed, 340 insertions(+), 45 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/guality/pr54551.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 94ac3b5..77ba4df 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -888,7 +888,8 @@ CGRAPH_H = cgraph.h $(VEC_H) $(TREE_H) $(BASIC_BLOCK_H) $(FUNCTION_H) \ cif-code.def ipa-ref.h ipa-ref-inline.h $(LINKER_PLUGIN_API_H) DF_H = df.h $(BITMAP_H) $(REGSET_H) sbitmap.h $(BASIC_BLOCK_H) \ alloc-pool.h $(TIMEVAR_H) -VALTRACK_H = valtrack.h $(BITMAP_H) $(DF_H) $(RTL_H) $(BASIC_BLOCK_H) +VALTRACK_H = valtrack.h $(BITMAP_H) $(DF_H) $(RTL_H) $(BASIC_BLOCK_H) \ + $(HASH_TABLE_H) RESOURCE_H = resource.h hard-reg-set.h $(DF_H) DDG_H = ddg.h sbitmap.h $(DF_H) GCC_H = gcc.h version.h $(DIAGNOSTIC_CORE_H) diff --git a/gcc/dce.c b/gcc/dce.c index c951865..11f8edb 100644 --- a/gcc/dce.c +++ b/gcc/dce.c @@ -806,15 +806,17 @@ struct rtl_opt_pass pass_ud_rtl_dce = /* Process basic block BB. Return true if the live_in set has changed. REDO_OUT is true if the info at the bottom of the block needs to be recalculated before starting. AU is the proper set of - artificial uses. */ + artificial uses. Track global substitution of uses of dead pseudos + in debug insns using GLOBAL_DEBUG. 
*/ static bool -word_dce_process_block (basic_block bb, bool redo_out) +word_dce_process_block (basic_block bb, bool redo_out, + struct dead_debug_global *global_debug) { bitmap local_live = BITMAP_ALLOC (dce_tmp_bitmap_obstack); rtx insn; bool block_changed; - struct dead_debug debug; + struct dead_debug_local debug; if (redo_out) { @@ -836,7 +838,7 @@ word_dce_process_block (basic_block bb, bool redo_out) } bitmap_copy (local_live, DF_WORD_LR_OUT (bb)); - dead_debug_init (debug, NULL); + dead_debug_local_init (debug, NULL, global_debug); FOR_BB_INSNS_REVERSE (bb, insn) if (DEBUG_INSN_P (insn)) @@ -890,7 +892,7 @@ word_dce_process_block (basic_block bb, bool redo_out) if (block_changed) bitmap_copy (DF_WORD_LR_IN (bb), local_live); - dead_debug_finish (debug, NULL); + dead_debug_local_finish (debug, NULL); BITMAP_FREE (local_live); return block_changed; } @@ -899,16 +901,18 @@ word_dce_process_block (basic_block bb, bool redo_out) /* Process basic block BB. Return
Re: Convert more non-GTY htab_t to hash_table.
On 10/2/12, Richard Guenther rguent...@suse.de wrote: On Mon, 1 Oct 2012, Lawrence Crowl wrote: Change more non-GTY hash tables to use the new type-safe template hash table. Constify member function parameters that can be const. Correct a couple of expressions in formerly uninstantiated templates. The new code is 0.362% faster in bootstrap, with a 99.5% confidence of being faster. Tested on x86-64. Okay for trunk?

You are changing a hashtable used by fold checking, did you test with fold checking enabled?

I didn't know I had to do anything beyond the normal make check. What do I do?

+/* Data structures used to maintain mapping between basic blocks and
+   copies.  */
+static hash_table <bb_copy_hasher> bb_original;
+static hash_table <bb_copy_hasher> bb_copy;

note that because hash_table has a constructor we now get global CTORs for all statics :( (and mx-protected local inits ...)

The overhead for the global constructors isn't significant. Only the function-local statics have mx-protection, and that can be eliminated by making them global static.

Can you please try to remove the constructor from hash_table to avoid this overhead? (as a followup - that is, don't initialize htab)

The initialization avoids potential errors in calling dispose. I can do it, but I don't think the overhead (after moving the function-local statics to global) will matter, and so I prefer to keep the safety. So is the move of the statics sufficient or do you still want to remove constructors?

The cfg.c, dse.c and hash-table.h parts are ok for trunk, I'll leave the rest to respective maintainers of the pieces of the compiler.

Thanks,
Richard.

Index: gcc/java/ChangeLog

2012-10-01  Lawrence Crowl  cr...@google.com

	* Make-lang.in (JAVA_OBJS): Add dependence on hash-table.o.
	(JCFDUMP_OBJS): Add dependence on hash-table.o.
	(jcf-io.o): Add dependence on hash-table.h.
	* jcf-io.c (memoized_class_lookups): Change to use type-safe hash table.
Index: gcc/c/ChangeLog

2012-10-01  Lawrence Crowl  cr...@google.com

	* Make-lang.in (c-decl.o): Add dependence on hash-table.h.
	* c-decl.c (detect_field_duplicates_hash): Change to new type-safe
	hash table.

Index: gcc/objc/ChangeLog

2012-10-01  Lawrence Crowl  cr...@google.com

	* Make-lang.in (OBJC_OBJS): Add dependence on hash-table.o.
	(objc-act.o): Add dependence on hash-table.h.
	* objc-act.c (objc_detect_field_duplicates): Change to new type-safe
	hash table.

Index: gcc/ChangeLog

2012-10-01  Lawrence Crowl  cr...@google.com

	* Makefile.in (fold-const.o): Add dependence on hash-table.h.
	(dse.o): Likewise.
	(cfg.o): Likewise.
	* fold-const.c (fold_checksum_tree): Change to new type-safe hash table.
	* (print_fold_checksum): Likewise.
	* cfg.c (var bb_original): Likewise.
	* (var bb_copy): Likewise.
	* (var loop_copy): Likewise.
	* hash-table.h (template hash_table): Constify parameters for find...
	and remove_elt... member functions.
	(hash_table::empty) Correct size expression.
	(hash_table::clear_slot) Correct deleted entry assignment.
	* dse.c (var rtx_group_table): Change to new type-safe hash table.

Index: gcc/cp/ChangeLog

2012-10-01  Lawrence Crowl  cr...@google.com

	* Make-lang.in (class.o): Add dependence on hash-table.h.
	(tree.o): Likewise.
	(semantics.o): Likewise.
	* class.c (fixed_type_or_null): Change to new type-safe hash table.
	* tree.c (verify_stmt_tree): Likewise.
	(verify_stmt_tree_r): Likewise.
	* semantics.c (struct nrv_data): Likewise.
Index: gcc/java/Make-lang.in === --- gcc/java/Make-lang.in(revision 191941) +++ gcc/java/Make-lang.in(working copy) @@ -83,10 +83,10 @@ JAVA_OBJS = java/class.o java/decl.o jav java/zextract.o java/jcf-io.o java/win32-host.o java/jcf-parse.o java/mangle.o \ java/mangle_name.o java/builtins.o java/resource.o \ java/jcf-depend.o \ - java/jcf-path.o java/boehm.o java/java-gimplify.o + java/jcf-path.o java/boehm.o java/java-gimplify.o hash-table.o JCFDUMP_OBJS = java/jcf-dump.o java/jcf-io.o java/jcf-depend.o java/jcf-path.o \ -java/win32-host.o java/zextract.o ggc-none.o +java/win32-host.o java/zextract.o ggc-none.o hash-table.o JVGENMAIN_OBJS = java/jvgenmain.o java/mangle_name.o @@ -326,7 +326,7 @@ java/java-gimplify.o: java/java-gimplify # jcf-io.o needs $(ZLIBINC) added to cflags. CFLAGS-java/jcf-io.o += $(ZLIBINC) java/jcf-io.o: java/jcf-io.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ - $(JAVA_TREE_H) java/zipfile.h + $(JAVA_TREE_H) java/zipfile.h $(HASH_TABLE_H) # jcf-path.o needs a -D. CFLAGS-java/jcf-path.o += \ Index: gcc/java/jcf-io.c === --- gcc/java/jcf-io.c(revision 191941) +++
[wwwdocs] Buildstat update for 4.6
Latest results for 4.6.x

-tgc

Testresults for 4.6.3
  alphaev68-dec-osf5.1a

Testresults for 4.6.2
  alphaev68-dec-osf5.1a

Testresults for 4.6.1
  alphaev68-dec-osf5.1a (2)

Testresults for 4.6.0
  alphaev68-dec-osf5.1a

Index: buildstat.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/buildstat.html,v
retrieving revision 1.12
diff -u -r1.12 buildstat.html
--- buildstat.html 7 Jun 2012 19:48:58 -0000 1.12
+++ buildstat.html 2 Oct 2012 19:07:07 -0000
@@ -35,7 +35,12 @@
 <td>alphaev68-dec-osf5.1a</td>
 <td>&nbsp;</td>
 <td>Test results:
-<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00587.html">4.6.1</a>
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02465.html">4.6.3</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02464.html">4.6.2</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02472.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02463.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00587.html">4.6.1</a>,
+<a href="http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02462.html">4.6.0</a>
 </td>
 </tr>
[wwwdocs] Buildstat update for 4.7
Latest results for 4.7.x -tgc Testresults for 4.7.2 alphaev68-dec-osf5.1a (2) hppa2.0w-hp-hpux11.00 hppa2.0w-hp-hpux11.11 hppa64-hp-hpux11.11 i386-apple-darwin10.8.0 i686-pc-linux-gnu powerpc-apple-darwin8.11.0 x86_64-apple-darwin10.8.0 x86_64-apple-darwin12.2.0 Testresults for 4.7.1 alphaev68-dec-osf5.1a (2) Testresults for 4.7.0 alphaev68-dec-osf5.1a Index: buildstat.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/buildstat.html,v retrieving revision 1.6 diff -u -r1.6 buildstat.html --- buildstat.html 16 Jul 2012 00:06:41 - 1.6 +++ buildstat.html 2 Oct 2012 19:24:44 - @@ -39,9 +39,30 @@ /tr tr +tdalphaev68-dec-osf5.1a/td +tdnbsp;/td +tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02474.html;4.7.2/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02469.html;4.7.2/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02475.html;4.7.1/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02468.html;4.7.1/a, +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02466.html;4.7.0/a +/td +/tr + +tr +tdhppa2.0w-hp-hpux11.00/td +tdnbsp;/td +tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02949.html;4.7.2/a, +/td +/tr + +tr tdhppa2.0w-hp-hpux11.11/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02311.html;4.7.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg00080.html;4.7.0/a /td /tr @@ -50,6 +71,7 @@ tdhppa64-hp-hpux11.11/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02408.html;4.7.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg00408.html;4.7.0/a /td /tr @@ -58,6 +80,7 @@ tdi386-apple-darwin10.8.0/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02291.html;4.7.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02742.html;4.7.0/a /td /tr @@ -102,6 +125,7 @@ tdi686-pc-linux-gnu/td tdnbsp;/td tdTest results: +a 
href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg00199.html;4.7.1/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01316.html;4.7.1/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01315.html;4.7.1/a /td @@ -111,6 +135,7 @@ tdpowerpc-apple-darwin8.11.0/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02736.html;4.7.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-06/msg01566.html;4.7.1/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02890.html;4.7.0/a /td @@ -145,6 +170,7 @@ tdx86_64-apple-darwin10.8.0/td tdnbsp;/td tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02247.html;4.7.2/a, a href=http://gcc.gnu.org/ml/gcc-testresults/2012-03/msg02708.html;4.7.0/a /td /tr @@ -159,6 +185,14 @@ /tr tr +tdx86_64-apple-darwin12.2.0/td +tdnbsp;/td +tdTest results: +a href=http://gcc.gnu.org/ml/gcc-testresults/2012-09/msg02248.html;4.7.2/a +/td +/tr + +tr tdx86_64-unknown-linux-gnu/td tdnbsp;/td tdTest results:
Re: [PATCH v2, rtl-optimization]: Fix PR54457, [x32] Fail to combine 64bit index + constant
Andrew Pinski andrew.pin...@caviumnetworks.com writes: On Thu, Sep 27, 2012 at 11:13 AM, Uros Bizjak ubiz...@gmail.com wrote:

2012-09-27  Uros Bizjak  ubiz...@gmail.com

	PR rtl-optimization/54457
	* simplify-rtx.c (simplify_subreg): Simplify
	(subreg:M (op:N ((x:N) (y:N)), 0) to
	(op:M (subreg:M (x:N) 0) (subreg:M (y:N) 0)), where the outer
	subreg is effectively a truncation to the original mode M.

When I was doing something similar on our internal toolchain at Cavium, I found that doing this caused a regression on MIPS64 n32 in gcc.c-torture/execute/20040709-1.c, where:

(insn 15 14 16 2 (set (reg/v:DI 200 [ y ])
        (reg:DI 2 $2)) t.c:16 301 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 2 $2)
        (nil)))
(insn 16 15 17 2 (set (reg:DI 210)
        (zero_extract:DI (reg/v:DI 200 [ y ])
            (const_int 29 [0x1d])
            (const_int 0 [0]))) t.c:16 249 {extzvdi}
     (expr_list:REG_DEAD (reg/v:DI 200 [ y ])
        (nil)))
(insn 17 16 23 2 (set (reg:SI 211)
        (truncate:SI (reg:DI 210))) t.c:16 175 {truncdisi2}
     (expr_list:REG_DEAD (reg:DI 210)
        (nil)))

gets converted to:

(insn 23 17 26 2 (set (reg/i:SI 2 $2)
        (and:SI (reg:SI 2 $2 [+4 ])
            (const_int 536870911 [0x1fffffff]))) t.c:18 156 {*andsi3}
     (nil))

which is considered an ext instruction. The regression showed up with the Octeon simulator, which makes undefined arguments to 32-bit word operations come out as 0xDEADBEEF. I fixed it by changing it to produce TRUNCATE instead of the subreg. I did the simplification on ior/and rather than plus/minus/mult, so the issue only arises when expanding this to and/ior.

Hmm, hadn't thought of that. I think some of the existing subreg optimisations suffer from the same problem. I.e. we can't assume that subreg truncations of nested operands are OK just because the outer subreg is OK. I've got a patch I'm testing.

BTW, I haven't forgotten about your other ext patch. Was hoping to see whether we could finally take the opportunity to parameterise the ext* patterns by mode, but got distracted with other patches. Maybe I'll just have to admit I won't get time to try it for 4.8...

Richard
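The hazard described above can be modelled in portable C. The sketch assumes the usual MIPS64 convention that a 32-bit value must sit in a 64-bit register in sign-extended form before a 32-bit word operation may use it; the function names are invented for illustration, and the narrowing casts rely on the common two's-complement behaviour of compilers such as GCC.

```c
#include <stdint.h>

/* (truncate:SI x): keep the low 32 bits and sign-extend them again,
   which is what leaves the register in a valid 32-bit form.  */
uint64_t truncate_si(uint64_t reg)
{
  return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}

/* Nonzero when REG already holds a properly sign-extended 32-bit
   value, i.e. when a 32-bit word operation may legally consume it.  */
int is_canonical_si(uint64_t reg)
{
  return reg == truncate_si(reg);
}

/* The original sequence: a 64-bit extract of the low 29 bits followed
   by a truncate.  Well defined for every 64-bit input.  */
uint64_t ext29_then_truncate(uint64_t y)
{
  return truncate_si(y & 0x1fffffffu);
}
```

For y = 0x180000005 the original sequence yields 5, but y itself is not in canonical form, so applying a 32-bit operation directly to the register (as the rewritten insn 23 does) reads an input whose contents the architecture does not define.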
PATCH: PR target/54741: Check SSE and YMM state support for -march=native
Hi,

This patch checks SSE and YMM state support for -march=native. Tested on Linux/x86-64. OK to install?

Thanks.

H.J.
---
2012-10-02  H.J. Lu  hongjiu...@intel.com

	PR target/54741
	* config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP
	if SSE and YMM states aren't supported.

diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index bda4e02..4dffc51 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
+  unsigned int has_osxsave = 0;
 
   bool arch;
 
@@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   has_sse4_1 = ecx & bit_SSE4_1;
   has_sse4_2 = ecx & bit_SSE4_2;
   has_avx = ecx & bit_AVX;
+  has_osxsave = ecx & bit_OSXSAVE;
   has_cmpxchg16b = ecx & bit_CMPXCHG16B;
   has_movbe = ecx & bit_MOVBE;
   has_popcnt = ecx & bit_POPCNT;
@@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_adx = ebx & bit_ADX;
     }
 
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
+#define XCR_XFEATURE_ENABLED_MASK 0x0
+#define XSTATE_FP 0x1
+#define XSTATE_SSE 0x2
+#define XSTATE_YMM 0x4
+  if (has_osxsave)
+    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
+         : "=a" (eax), "=d" (edx)
+         : "c" (XCR_XFEATURE_ENABLED_MASK));
+
+  /* Check if SSE and YMM states are supported.  */
+  if ((eax & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM))
+    {
+      has_avx = 0;
+      has_avx2 = 0;
+      has_fma = 0;
+      has_fma4 = 0;
+      has_xop = 0;
+    }
+
   /* Check cpuid level of extended features.  */
   __cpuid (0x80000000, ext_level, ebx, ecx, edx);
 
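The decision the patch makes can be isolated from the inline assembly. The following is a sketch of the mask test only, under the assumption that the caller has already obtained the OSXSAVE bit from cpuid and the low word of XCR0 from xgetbv; os_supports_avx is a hypothetical name, and the XSTATE_* values are the ones defined in the patch.

```c
/* Bits of XCR0 (XCR_XFEATURE_ENABLED_MASK), named as in the patch.  */
#define XSTATE_FP  0x1
#define XSTATE_SSE 0x2
#define XSTATE_YMM 0x4

/* Hypothetical helper: nonzero when the OS has enabled saving of both
   SSE and YMM register state.  If OSXSAVE is clear, xgetbv cannot be
   executed at all, so the answer is treated as "no".  */
int os_supports_avx(int has_osxsave, unsigned int xcr0)
{
  const unsigned int need = XSTATE_SSE | XSTATE_YMM;

  if (!has_osxsave)
    return 0;
  return (xcr0 & need) == need;
}
```

A driver would clear has_avx, has_avx2, has_fma, has_fma4 and has_xop whenever this returns 0. Treating a missing OSXSAVE bit as "unsupported" is a choice made in this sketch, not something the posted hunk spells out.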
[MIPS] Adjust baddu patterns for recent simplify-rtx.c change
As promised, here's the patch to adjust the MIPS BADDU patterns for the new (subreg (plus)) simplification. Tested on mipsisa32-elf and mipsisa64-elf. Applied.

Richard

gcc/
	* config/mips/mips.md (*baddu_si_eb, *baddu_si_el): Merge into...
	(*baddu_si): ...this new pattern.

Index: gcc/config/mips/mips.md
===================================================================
--- gcc/config/mips/mips.md 2012-09-29 16:57:31.0 +0100
+++ gcc/config/mips/mips.md 2012-10-01 21:33:39.358480799 +0100
@@ -1293,23 +1293,12 @@ (define_insn_and_split "*addsi3_extended
 
 ;; Combiner patterns for unsigned byte-add.
 
-(define_insn "*baddu_si_eb"
+(define_insn "*baddu_si"
   [(set (match_operand:SI 0 "register_operand" "=d")
         (zero_extend:SI
-         (subreg:QI
-          (plus:SI (match_operand:SI 1 "register_operand" "d")
-                   (match_operand:SI 2 "register_operand" "d")) 3)))]
-  "ISA_HAS_BADDU && BYTES_BIG_ENDIAN"
-  "baddu\\t%0,%1,%2"
-  [(set_attr "alu_type" "add")])
-
-(define_insn "*baddu_si_el"
-  [(set (match_operand:SI 0 "register_operand" "=d")
-        (zero_extend:SI
-         (subreg:QI
-          (plus:SI (match_operand:SI 1 "register_operand" "d")
-                   (match_operand:SI 2 "register_operand" "d")) 0)))]
-  "ISA_HAS_BADDU && !BYTES_BIG_ENDIAN"
+         (plus:QI (match_operand:QI 1 "register_operand" "d")
+                  (match_operand:QI 2 "register_operand" "d"))))]
+  "ISA_HAS_BADDU"
   "baddu\\t%0,%1,%2"
   [(set_attr "alu_type" "add")])
 
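As a reading aid, the value computed by the merged pattern, (zero_extend:SI (plus:QI a b)), can be written in C as below. The function name baddu is only a label for this sketch; it models the arithmetic, not the instruction encoding.

```c
#include <stdint.h>

/* Add two registers, keep the low 8 bits of the sum, and zero-extend
   the result back to 32 bits.  */
uint32_t baddu(uint32_t rs, uint32_t rt)
{
  return (uint8_t)(rs + rt);
}
```

Because the sum is now expressed directly in QImode, no byte offset (3 for big-endian, 0 for little-endian) is needed to name the low byte, which is why a single pattern can replace the two endian-specific ones.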
PATCH: PR target/54785: Document -mprefer-avx128
Hi,

This patch documents -mprefer-avx128. OK for trunk and 4.7?

Thanks.

H.J.
---
2012-10-02  H.J. Lu  hongjiu...@intel.com

        PR target/54785
        * doc/invoke.texi: Document -mprefer-avx128.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7578dda..0e7e441 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -630,7 +630,7 @@ Objective-C and Objective-C++ Dialects}.
 -mincoming-stack-boundary=@var{num} @gol
 -mcld -mcx16 -msahf -mmovbe -mcrc32 @gol
 -mrecip -mrecip=@var{opt} @gol
--mvzeroupper @gol
+-mvzeroupper -mprefer-avx128 @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -mavx2 -maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfma @gol
 -msse4a -m3dnow -mpopcnt -mabm -mbmi -mtbm -mfma4 -mxop -mlzcnt @gol
@@ -13926,6 +13926,11 @@ before a transfer of control flow out of the function to minimize
 the AVX to SSE transition penalty as well as remove unnecessary @code{zeroupper}
 intrinsics.
 
+@item -mprefer-avx128
+@opindex mprefer-avx128
+This option instructs GCC to use 128-bit AVX instructions instead of
+256-bit AVX instructions in the auto-vectorizer.
+
 @item -mcx16
 @opindex mcx16
 This option enables GCC to generate @code{CMPXCHG16B} instructions.
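As an illustration of where the option matters, a loop like the one below is a typical candidate for the auto-vectorizer. This is a sketch only: the function name saxpy is hypothetical, and whether GCC vectorizes the loop at all depends on the optimization level and target flags. A command along the lines of "gcc -O3 -mavx -mprefer-avx128 -S saxpy.c" would ask for 128-bit AVX forms.

```c
#include <stddef.h>

/* Compute y[i] += a * x[i] for i in [0, n).  The restrict qualifiers
   promise that x and y do not overlap, which makes the loop easier
   for a vectorizer to handle.  */
void
saxpy (size_t n, float a, const float *restrict x, float *restrict y)
{
  for (size_t i = 0; i < n; i++)
    y[i] += a * x[i];
}
```

Comparing the assembly produced with and without -mprefer-avx128 is the simplest way to see its effect; the exact instructions chosen depend on the GCC version.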
[Committed] Fix truncate of a memory for vector mode
Hi,

When I implemented the simplification of a truncate of a memory, I did not think about the case where we would have a truncate of a vector mode. This fixes this case.

Committed as obvious after a bootstrap and test on x86_64-linux-gnu and also a build and test for arm-linux-gnueabi.

Thanks,
Andrew Pinski

2012-10-02  Andrew Pinski  apin...@cavium.com

        * simplify-rtx.c (simplify_unary_operation_1 case TRUNCATE): Don't
        optimize a truncate of a mem if it is a vector mode.

Index: simplify-rtx.c
===
--- simplify-rtx.c (revision 192004)
+++ simplify-rtx.c (working copy)
@@ -873,6 +873,7 @@ simplify_unary_operation_1 (enum rtx_cod
       /* A truncate of a memory is just loading the low part of the memory
          if we are not changing the meaning of the address. */
       if (GET_CODE (op) == MEM
+          && !VECTOR_MODE_P (mode)
           && !MEM_VOLATILE_P (op)
           && !mode_dependent_address_p (XEXP (op, 0), MEM_ADDR_SPACE (op)))
         return rtl_hooks.gen_lowpart_no_emit (mode, op);
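To see why the shortcut is only valid for scalars, it may help to compare an element-wise truncation with a plain load of the low part. The sketch below is an illustration in C, not GCC code: the function names are hypothetical, and using the first four bytes as the "low part" assumes a little-endian layout.

```c
#include <stdint.h>
#include <string.h>

/* Element-wise truncation of a 4 x 32-bit vector to 4 x 8-bit:
   each result lane keeps the low byte of the matching input lane.  */
static void
truncate_lanes (const uint32_t in[4], uint8_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = (uint8_t) in[i];
}

/* What "load the low part of the memory" would produce if it were
   applied to the vector: four contiguous bytes.  The first four bytes
   are used here, which matches the low part on a little-endian target
   (an assumption of this sketch).  */
static void
load_low_part (const uint32_t in[4], uint8_t out[4])
{
  memcpy (out, in, 4);
}
```

For a scalar the two operations agree, but for a vector the contiguous load returns bytes from a single lane instead of one byte per lane, so the simplification has to be skipped.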
[Patch, Fortran, committed] PR 54778: an ICE on invalid OO code
Hi all,

I have just committed as obvious a one-line patch to fix an ICE-on-invalid OOP problem:

http://gcc.gnu.org/viewcvs?view=revision&revision=192005

Cheers,
Janus