New Spanish PO file for 'gcc' (version 4.7.0)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Spanish team of translators. The file is available at: http://translationproject.org/latest/gcc/es.po (This file, 'gcc-4.7.0.es.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator. coordina...@translationproject.org
Re: libitm MinGW detect
On Fri, 2012-03-30 at 07:28 +0400, niXman wrote: Hello. Hello. What's the state of TM support on mingw, have you tested it?
Re: [PATCH][1/n] Cleanup internal interfaces, GCC modularization
On Thu, 29 Mar 2012, Jan Hubicka wrote: I am playing with doing some internal interface static analysis using the first patch below (and looking at LTO bootstrap results). An example, obvious patch resulting from that is the 2nd patch, resulting from the static analysis output /space/rguenther/src/svn/trunk/gcc/tree-ssa-pre.c:add_to_value can be made static /space/rguenther/src/svn/trunk/gcc/tree-ssa-pre.c:print_value_expressions can be made static The static analysis is very verbose (and does not consider ipa-refs yet). You also need to union results for building all frontends and all targets (well, in theory, or you can simply manually verify things which is a good idea anyway - even unused functions may be useful exported when they implement a generic data structure for example). Exercise for the reader: turn the analysis into a plugin. Richard. Index: gcc/lto/lto.c === --- gcc/lto/lto.c (revision 185918) +++ gcc/lto/lto.c (working copy) @@ -2721,6 +2721,65 @@ read_cgraph_and_symbols (unsigned nfiles lto_symtab_merge_cgraph_nodes (); ggc_collect (); + if (flag_wpa) +{ + struct cgraph_node *node; + FILE *f = fopen (concat (dump_base_name, ".callers", NULL), "w"); + for (node = cgraph_nodes; node; node = node->next) + { + tree caller_tu = NULL_TREE; + struct cgraph_edge *caller; + bool found = true; + + if (!TREE_PUBLIC (node->decl) + || !TREE_STATIC (node->decl) + || resolution_used_from_other_file_p (node->resolution)) + continue; + + if (!node->callers) + { + expanded_location loc = expand_location (DECL_SOURCE_LOCATION (node->decl)); + fprintf (f, "%s:%s no calls\n", + loc.file, IDENTIFIER_POINTER (DECL_NAME (node->decl))); + } With Mozilla folks I used the dumps from first WPA unreachable function removal pass with some degree of success. This gets a lot of non-trivial cases of dead code, but also there are a lot of funny false positives wrt comdats etc. 
+ for (caller = node->callers; caller; caller = caller->next_caller) + { + if (!caller_tu) + caller_tu = DECL_CONTEXT (caller->caller->decl); + else if (caller_tu + && DECL_CONTEXT (caller->caller->decl) != caller_tu) + found = false; + } Extending to IPA-REF should be straightforward. + if (found && caller_tu) + { + expanded_location loc1 = expand_location (DECL_SOURCE_LOCATION (node->decl)); + expanded_location loc2 = expand_location (DECL_SOURCE_LOCATION (node->callers->caller->decl)); + + if (DECL_CONTEXT (node->decl) == caller_tu) + fprintf (f, "%s:%s can be made static\n", +loc1.file, IDENTIFIER_POINTER (DECL_NAME (node->decl))); Indeed, this is also useful. Any plans to turn this into general -Wsomething, or will you also stay just with an internal hack like I did? :) ;) The result has way too many false positives (LTO bootstrap produces quite some dead functions due to early inlining). So yes, this will stay internal ;) Richard.
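The core test in the patch above (a public function with no callers, or with all callers in its own translation unit) can be illustrated outside GCC. The following is only a minimal sketch under stated assumptions: `call_edge`, `func_node`, `verdict` and `classify` are hypothetical, simplified stand-ins and are not GCC's cgraph API, and translation units are represented by plain integer ids instead of `DECL_CONTEXT` trees.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for GCC's call graph structures.  */
struct call_edge
{
  int caller_tu;                        /* id of the caller's translation unit */
  const struct call_edge *next_caller;  /* next incoming edge, or NULL */
};

struct func_node
{
  int tu;                               /* id of the defining translation unit */
  int is_public;                        /* nonzero if externally visible */
  const struct call_edge *callers;      /* list of incoming call edges */
};

enum verdict
{
  NOTHING_TO_REPORT,  /* already local, or callers span several units */
  NO_CALLS,           /* public but never called */
  CAN_BE_STATIC       /* every caller lives in the defining unit */
};

enum verdict
classify (const struct func_node *fn)
{
  const struct call_edge *e;
  int caller_tu;

  assert (fn != NULL);
  if (!fn->is_public)
    return NOTHING_TO_REPORT;
  if (fn->callers == NULL)
    return NO_CALLS;

  /* All callers must agree on a single translation unit.  */
  caller_tu = fn->callers->caller_tu;
  for (e = fn->callers->next_caller; e != NULL; e = e->next_caller)
    if (e->caller_tu != caller_tu)
      return NOTHING_TO_REPORT;

  return caller_tu == fn->tu ? CAN_BE_STATIC : NOTHING_TO_REPORT;
}
```

Like the patch, the sketch looks only at direct calls; a function whose address is taken would be misreported, which is one source of the false positives discussed in the thread.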
Re: [PATCH gcc-4.7/changes.html] Update for -ftrack-macro-expansion and -Wunused-local-typedefs
Dodji Seketeli do...@redhat.com wrote: Hello, I forgot to update changes.html when the -ftrack-macro-expansion and -Wunused-local-typedefs changes went in. Fixed thus. I hope it is not too late, now that 4.7 is out. OK for CVS head? * htdocs/gcc-4.7/changes.html: Update for -ftrack-macro-expansion and -Wunused-local-typedefs. I have just committed this to the CVS repository, after chatting with Paolo Carlini via email and getting an ACK from Richard Guenther on IRC. Thanks. -- Dodji
[PATCH][4/4] Cleanup internal interfaces
Last one (for now). Bootstrapped on x86_64-unknown-linux-gnu, applied. Richard. 2012-03-30 Richard Guenther rguent...@suse.de * tree-affine.h (print_aff): Remove. * tree-affine.c (print_aff): Make static. * tree-data-ref.h (access_matrix_get_index_for_parameter): Remove. (get_references_in_stmt): Likewise. (print_direction_vector): Likewise. (print_dir_vectors): Likewise. (print_dist_vectors): Likewise. (dump_subscript): Likewise. (dump_ddrs): Likewise. (dump_dist_dir_vectors): Likewise. (dump_data_references): Likewise. (dump_data_dependence_relation): Likewise. (dump_data_dependence_direction): Likewise. (dump_rdg_vertex): Likewise. (dump_rdg_component): Likewise. (debug_ddrs): Declare. (struct data_ref_loc_d): Move ... * tree-data-ref.c (struct data_ref_loc_d): ... here. (get_references_in_stmt): Make static. (dump_data_references): Likewise. (dump_subscript): Likewise. (print_direction_vector): Likewise. (print_dir_vectors): Likewise. (print_dist_vectors): Likewise. (dump_data_dependence_relation): Likewise. (dump_dist_dir_vectors): Likewise. (dump_ddrs): Likewise. (dump_rdg_vertex): Likewise. (dump_rdg_component): Likewise. (debug_ddrs): New function. (access_matrix_get_index_for_parameter): Remove. Index: gcc/tree-affine.h === --- gcc/tree-affine.h (revision 185957) +++ gcc/tree-affine.h (working copy) @@ -79,5 +79,4 @@ void free_affine_expand_cache (struct po bool aff_comb_cannot_overlap_p (aff_tree *, double_int, double_int); /* Debugging functions. */ -void print_aff (FILE *, aff_tree *); void debug_aff (aff_tree *); Index: gcc/tree-affine.c === --- gcc/tree-affine.c (revision 185957) +++ gcc/tree-affine.c (working copy) @@ -812,7 +812,7 @@ aff_combination_constant_multiple_p (aff /* Prints the affine VAL to the FILE. 
*/ -void +static void print_aff (FILE *file, aff_tree *val) { unsigned i; Index: gcc/tree-data-ref.h === --- gcc/tree-data-ref.h (revision 185957) +++ gcc/tree-data-ref.h (working copy) @@ -169,8 +169,6 @@ am_vector_index_for_loop (struct access_ gcc_unreachable(); } -int access_matrix_get_index_for_parameter (tree, struct access_matrix *); - struct data_reference { /* A pointer to the statement that contains this DR. */ @@ -371,22 +369,6 @@ DEF_VEC_ALLOC_P(ddr_p,heap); #define DDR_REVERSED_P(DDR) DDR-reversed_p - -/* Describes a location of a memory reference. */ - -typedef struct data_ref_loc_d -{ - /* Position of the memory reference. */ - tree *pos; - - /* True if the memory reference is read. */ - bool is_read; -} data_ref_loc; - -DEF_VEC_O (data_ref_loc); -DEF_VEC_ALLOC_O (data_ref_loc, heap); - -bool get_references_in_stmt (gimple, VEC (data_ref_loc, heap) **); bool dr_analyze_innermost (struct data_reference *, struct loop *); extern bool compute_data_dependences_for_loop (struct loop *, bool, VEC (loop_p, heap) **, @@ -395,23 +377,13 @@ extern bool compute_data_dependences_for extern bool compute_data_dependences_for_bb (basic_block, bool, VEC (data_reference_p, heap) **, VEC (ddr_p, heap) **); -extern void print_direction_vector (FILE *, lambda_vector, int); -extern void print_dir_vectors (FILE *, VEC (lambda_vector, heap) *, int); -extern void print_dist_vectors (FILE *, VEC (lambda_vector, heap) *, int); -extern void dump_subscript (FILE *, struct subscript *); -extern void dump_ddrs (FILE *, VEC (ddr_p, heap) *); -extern void dump_dist_dir_vectors (FILE *, VEC (ddr_p, heap) *); +extern void debug_ddrs (VEC (ddr_p, heap) *); extern void dump_data_reference (FILE *, struct data_reference *); extern void debug_data_reference (struct data_reference *); -extern void dump_data_references (FILE *, VEC (data_reference_p, heap) *); extern void debug_data_references (VEC (data_reference_p, heap) *); extern void debug_data_dependence_relation (struct 
data_dependence_relation *); -extern void dump_data_dependence_relation (FILE *, - struct data_dependence_relation *); extern void dump_data_dependence_relations (FILE *, VEC (ddr_p, heap) *); extern void debug_data_dependence_relations (VEC (ddr_p, heap) *); -extern void dump_data_dependence_direction (FILE *, - enum data_dependence_direction); extern void free_dependence_relation (struct data_dependence_relation *); extern void free_dependence_relations (VEC (ddr_p, heap) *); extern void free_data_ref
[Ada] Protect generation of Alfa sections in ALI files against empty node
In some cases, a node designating a compilation unit may be empty, which was not considered in the code generating Alfa sections in ALI files. Now corrected. Tested on x86_64-pc-linux-gnu, committed on trunk. 2012-03-30 Yannick Moy m...@adacore.com * lib-xref-alfa.adb (Add_Alfa_File): Take into account possible absence of compilation unit for unit in Sdep_Table. Index: lib-xref-alfa.adb === --- lib-xref-alfa.adb (revision 185995) +++ lib-xref-alfa.adb (working copy) @@ -226,9 +226,15 @@ From := Alfa_Scope_Table.Last + 1; - Traverse_Compilation_Unit (Cunit (U), Detect_And_Add_Alfa_Scope'Access, - Inside_Stubs => False); + -- Unit U might not have an associated compilation unit, as seen in code + -- filling Sdep_Table in Write_ALI. + if Present (Cunit (U)) then + Traverse_Compilation_Unit (Cunit (U), +Detect_And_Add_Alfa_Scope'Access, +Inside_Stubs => False); + end if; + -- Update scope numbers declare @@ -279,9 +285,11 @@ Get_Name_String (Reference_Name (S)); File_Name := new String'(Name_Buffer (1 .. Name_Len)); - -- For subunits, also retrieve the file name of the unit + -- For subunits, also retrieve the file name of the unit. Only do so if + -- unit U has an associated compilation unit. - if Present (Cunit (Unit (S))) + if Present (Cunit (U)) +and then Present (Cunit (Unit (S))) and then Nkind (Unit (Cunit (Unit (S)))) = N_Subunit then Get_Name_String (Reference_Name (Main_Source_File));
[Ada] Missing debug info for loop entity of an Ada 2012 array iterator loop
When the declaration of the loop entity of an Ada-2012-style array iterator is rewritten as a renaming of the indexed array, debug info was not being generated for the renaming, preventing display of the entity (gdb generates a "no definition in current context" message). The loop entity of such a renaming is now marked as needing debug info. Tested on x86_64-pc-linux-gnu, committed on trunk. 2012-03-30 Gary Dismukes dismu...@adacore.com * exp_ch5.adb (Expand_Iterator_Loop_Over_Array): For the case of a loop entity which is rewritten as a renaming of the indexed array, explicitly mark the entity as needing debug info so that Materialize_Entity will be set later by Debug_Renaming_Declaration when the renaming is expanded. Index: exp_ch5.adb === --- exp_ch5.adb (revision 185995) +++ exp_ch5.adb (working copy) @@ -3303,6 +3303,14 @@ New_Reference_To (Component_Type (Array_Typ), Loc), Name => Ind_Comp)); + -- Mark the loop variable as needing debug info, so that expansion + -- of the renaming will result in Materialize_Entity getting set via + -- Debug_Renaming_Declaration. (This setting is needed here because + -- the setting in Freeze_Entity comes after the expansion, which is + -- too late. ???) + + Set_Debug_Info_Needed (Id); + -- for Index in Array loop -- This case utilizes the already given iterator name
[PATCH] Fix PR52786
This fixes PR52786 which I did not see in my testing (huh). I suppose hppa*-*-* has unsigned HOST_WIDE_INT == unsigned int and we suppress the sign-compare warning for unsigned long >= (long) unsigned int. Committed as obvious. Richard. 2012-03-30 Richard Guenther rguent...@suse.de PR middle-end/52786 * double-int.c (rshift_double): Remove not needed cast. Index: gcc/double-int.c === --- gcc/double-int.c (revision 185994) +++ gcc/double-int.c (working copy) @@ -228,7 +228,7 @@ rshift_double (unsigned HOST_WIDE_INT l1 /* Zero / sign extend all bits that are beyond the precision. */ - if (count >= (HOST_WIDE_INT)prec) + if (count >= prec) { *hv = signmask; *lv = signmask;
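The sign-compare issue behind this fix can be shown with a small, standalone C sketch. It is illustrative only and does not reproduce the actual types on hppa; `shift_exceeds_precision`, `compare_with_signed_bound` and `run_examples` are hypothetical names, not functions from double-int.c.

```c
#include <assert.h>

/* Both operands are unsigned: prec is widened to unsigned long and its
   value is preserved, so no cast is needed.  */
int
shift_exceeds_precision (unsigned long count, unsigned int prec)
{
  return count >= prec;
}

/* Mixed signedness: a negative bound is converted to a huge unsigned value
   before the comparison.  The cast spells out what the usual arithmetic
   conversions would otherwise do implicitly.  */
int
compare_with_signed_bound (unsigned int count, int bound)
{
  return count >= (unsigned int) bound;
}

void
run_examples (void)
{
  assert (shift_exceeds_precision (64UL, 64U));
  assert (!compare_with_signed_bound (5U, -1));  /* -1 becomes UINT_MAX */
}
```

Casting an unsigned operand to a signed type, as the removed cast did, reintroduces the mixed comparison in the second function, which is the situation a sign-compare warning is meant to flag.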
[Ada] Incorrect finalization of build-in-place function result
This patch updates the mechanism which detects build-in-place function calls returning controlled results on the secondary stack. -- Source -- -- types.ads with Ada.Finalization; use Ada.Finalization; package Types is type Ctrl_Comp is new Limited_Controlled with null record; procedure Finalize (Obj : in out Ctrl_Comp); type Root is tagged limited null record; type Root_Ptr is access all Root'Class; function Create (Ctrl : Boolean) return Root'Class; type Empty_Child is new Root with null record; type Ctrl_Child is new Root with record Comp : Ctrl_Comp; end record; end Types; -- types.adb with Ada.Text_IO; use Ada.Text_IO; package body Types is function Create (Ctrl : Boolean) return Root'Class is begin if Ctrl then return Result : Ctrl_Child; else return Result : Empty_Child; end if; end Create; procedure Finalize (Obj : in out Ctrl_Comp) is begin Put_Line ( Finalize); end Finalize; end Types; -- main.adb with Ada.Text_IO; use Ada.Text_IO; with Types; use Types; procedure Main is pragma Suppress (Accessibility_Check); begin Put_Line (Empty child); declare Obj : Root_Ptr := new Root'Class'(Create (False)); begin Put_Line (Empty child allocated); end; Put_Line (Ctrl child); declare Obj : Root_Ptr := new Root'Class'(Create (True)); begin Put_Line (Ctrl child allocated); end; Put_Line (End); end Main; - -- Compilation and expected output -- - $ gnatmake -q -gnat05 main.adb $ ./main Empty child Empty child allocated Ctrl child Ctrl child allocated End Finalize Tested on x86_64-pc-linux-gnu, committed on trunk 2012-03-30 Hristian Kirtchev kirtc...@adacore.com * exp_ch7.adb (Process_Declarations): Replace the call to Is_Null_Access_BIP_Func_Call with Is_Secondary_Stack_BIP_Func_Call. Update the related comment. * exp_util.adb (Is_Null_Access_BIP_Func_Call): Removed. (Is_Secondary_Stack_BIP_Func_Call): New routine. (Requires_Cleanup_Actions): Replace the call to Is_Null_Access_BIP_Func_Call with Is_Secondary_Stack_BIP_Func_Call. Update the related comment. 
* exp_util.ads (Is_Null_Access_BIP_Func_Call): Removed. (Is_Secondary_Stack_BIP_Func_Call): New routine. Index: exp_ch7.adb === --- exp_ch7.adb (revision 185995) +++ exp_ch7.adb (working copy) @@ -1824,15 +1824,14 @@ --Obj : Access_Typ := Non_BIP_Function_Call'reference; --Obj : Access_Typ := - --BIP_Function_Call - -- (..., BIPaccess = null, ...)'reference; + --BIP_Function_Call (BIPalloc = 2, ...)'reference; elsif Is_Access_Type (Obj_Typ) and then Needs_Finalization (Available_View (Designated_Type (Obj_Typ))) and then Present (Expr) and then - (Is_Null_Access_BIP_Func_Call (Expr) + (Is_Secondary_Stack_BIP_Func_Call (Expr) or else (Is_Non_BIP_Func_Call (Expr) and then not Is_Related_To_Func_Return (Obj_Id))) Index: exp_util.adb === --- exp_util.adb(revision 185995) +++ exp_util.adb(working copy) @@ -4475,74 +4475,6 @@ and then Is_Library_Level_Entity (Typ); end Is_Library_Level_Tagged_Type; - -- - -- Is_Null_Access_BIP_Func_Call -- - -- - - function Is_Null_Access_BIP_Func_Call (Expr : Node_Id) return Boolean is - Call : Node_Id := Expr; - - begin - -- Build-in-place calls usually appear in 'reference format - - if Nkind (Call) = N_Reference then - Call := Prefix (Call); - end if; - - if Nkind_In (Call, N_Qualified_Expression, - N_Unchecked_Type_Conversion) - then - Call := Expression (Call); - end if; - - if Is_Build_In_Place_Function_Call (Call) then - declare -Access_Nam : Name_Id := No_Name; -Actual : Node_Id; -Param : Node_Id; -Formal : Node_Id; - - begin --- Examine all parameter associations of the function call - -Param := First (Parameter_Associations (Call)); -while Present (Param) loop - if Nkind (Param) = N_Parameter_Association - and then Nkind (Selector_Name (Param)) = N_Identifier - then - Formal := Selector_Name (Param); - Actual := Explicit_Actual_Parameter (Param); - -
[Ada] Towards support in Alfa cross-references for generics
This reformatting is meant to clarify the code generating Alfa cross-references so that it can be updated to take into account instantiations. Tested on x86_64-pc-linux-gnu, committed on trunk 2012-03-30 Yannick Moy m...@adacore.com * lib-xref-alfa.adb, lib-xref.adb: Code clean ups. Index: lib-xref-alfa.adb === --- lib-xref-alfa.adb (revision 185997) +++ lib-xref-alfa.adb (working copy) @@ -40,101 +40,17 @@ -- Table of Alfa_Entities, True for each entity kind used in Alfa Alfa_Entities : constant array (Entity_Kind) of Boolean := - (E_Void = False, - E_Variable = True, - E_Component = False, - E_Constant = True, - E_Discriminant = False, + (E_Constant = True, + E_Function = True, + E_In_Out_Parameter = True, + E_In_Parameter = True, + E_Loop_Parameter = True, + E_Operator = True, + E_Out_Parameter= True, + E_Procedure= True, + E_Variable = True, + others = False); - E_Loop_Parameter = True, - E_In_Parameter = True, - E_Out_Parameter = True, - E_In_Out_Parameter = True, - E_Generic_In_Out_Parameter = False, - - E_Generic_In_Parameter = False, - E_Named_Integer = False, - E_Named_Real = False, - E_Enumeration_Type = False, - E_Enumeration_Subtype= False, - - E_Signed_Integer_Type= False, - E_Signed_Integer_Subtype = False, - E_Modular_Integer_Type = False, - E_Modular_Integer_Subtype= False, - E_Ordinary_Fixed_Point_Type = False, - - E_Ordinary_Fixed_Point_Subtype = False, - E_Decimal_Fixed_Point_Type = False, - E_Decimal_Fixed_Point_Subtype= False, - E_Floating_Point_Type= False, - E_Floating_Point_Subtype = False, - - E_Access_Type= False, - E_Access_Subtype = False, - E_Access_Attribute_Type = False, - E_Allocator_Type = False, - E_General_Access_Type= False, - - E_Access_Subprogram_Type = False, - E_Access_Protected_Subprogram_Type = False, - E_Anonymous_Access_Subprogram_Type = False, - E_Anonymous_Access_Protected_Subprogram_Type = False, - E_Anonymous_Access_Type = False, - - E_Array_Type = False, - E_Array_Subtype = False, - E_String_Type= False, - 
E_String_Subtype = False, - E_String_Literal_Subtype = False, - - E_Class_Wide_Type= False, - E_Class_Wide_Subtype = False, - E_Record_Type= False, - E_Record_Subtype = False, - E_Record_Type_With_Private = False, - - E_Record_Subtype_With_Private= False, - E_Private_Type = False, - E_Private_Subtype= False, - E_Limited_Private_Type = False, - E_Limited_Private_Subtype= False, - - E_Incomplete_Type= False, - E_Incomplete_Subtype = False, - E_Task_Type = False, - E_Task_Subtype = False, - E_Protected_Type = False, - - E_Protected_Subtype = False, - E_Exception_Type = False, - E_Subprogram_Type= False, - E_Enumeration_Literal= False, - E_Function = True, - - E_Operator = True, - E_Procedure = True, - E_Entry = False, - E_Entry_Family = False, - E_Block = False, - - E_Entry_Index_Parameter
[PATCH, libgfortran]: Fix PR52758, out of bounds access
Hello! 2012-03-30 Uros Bizjak ubiz...@gmail.com PR libgfortran/52758 * intrinsics/chmod.c: Remove out-of-bounds initialization of rwxXstugo. Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN as obvious. Uros. Index: intrinsics/chmod.c === --- intrinsics/chmod.c (revision 185992) +++ intrinsics/chmod.c (working copy) @@ -141,7 +141,6 @@ chmod_func (char *name, char *mode, gfc_charlen_ty rwxXstugo[6] = false; rwxXstugo[7] = false; rwxXstugo[8] = false; - rwxXstugo[9] = false; part = 0; set_mode = -1; for (; i < mode_len; i++)
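One way to avoid this class of off-by-one error is to clear the flags in a loop bounded by the array size instead of one assignment per index. The sketch below assumes the array has nine elements, one per letter of its name (valid indices 0 to 8); `N_MODE_FLAGS` and `clear_mode_flags` are hypothetical names and are not taken from chmod.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: one flag per letter of "rwxXstugo", so nine slots (0..8).  */
#define N_MODE_FLAGS 9

void
clear_mode_flags (bool flags[N_MODE_FLAGS])
{
  size_t i;

  assert (flags != NULL);
  /* The bound comes from the array size, so index 9 is never touched.  */
  for (i = 0; i < N_MODE_FLAGS; i++)
    flags[i] = false;
}
```

A caller would declare `bool rwxXstugo[N_MODE_FLAGS];` and pass it to `clear_mode_flags`, so the declaration and the reset share a single size constant.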
Re: [PATCH][ARM] NEON DImode immediate constants
On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Andrew 2012-03-27 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.c (neon_valid_immediate): Allow const_int. (arm_print_operand): Add 'x' format. * config/arm/constraints.md (Dn): Allow const_int. * config/arm/neon.md (neon_movmode): Use VDX to allow DImode. Use 'x' format to print constants. * config/arm/predicates.md (imm_for_neon_mov_operand): Allow const_int. * config/arm/vfp.md (movdi_vfp): Disable for const_int when neon is enabled. (movdi_vfp_cortexa8): Likewise. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0bded8d..492ddde 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -8873,11 +8873,25 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, break; \ } - unsigned int i, elsize = 0, idx = 0, n_elts = CONST_VECTOR_NUNITS (op); - unsigned int innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); + unsigned int i, elsize = 0, idx = 0, n_elts; + unsigned int innersize; unsigned char bytes[16]; int immtype = -1, matches; unsigned int invmask = inverse ? 
0xff : 0; + bool vector = GET_CODE (op) == CONST_VECTOR; + + if (vector) +{ + n_elts = CONST_VECTOR_NUNITS (op); + innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); +} + else +{ + n_elts = 1; + if (mode == VOIDmode) + mode = DImode; + innersize = GET_MODE_SIZE (mode); +} /* Vectors of float constants. */ if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) @@ -8913,7 +8927,7 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, /* Splat vector constant out into a byte vector. */ for (i = 0; i n_elts; i++) { - rtx el = CONST_VECTOR_ELT (op, i); + rtx el = vector ? CONST_VECTOR_ELT (op, i) : op; unsigned HOST_WIDE_INT elpart; unsigned int part, parts; @@ -17230,6 +17244,19 @@ arm_print_operand (FILE *stream, rtx x, int code) } return; +/* An integer that we want to print in HEX. */ +case 'x': + switch (GET_CODE (x)) + { + case CONST_INT: + fprintf (stream, # HOST_WIDE_INT_PRINT_HEX, INTVAL (x)); + break; + + default: + output_operand_lossage (Unsupported operand for code '%c', code); + } + return; + case 'B': if (GET_CODE (x) == CONST_INT) { diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index 7d0269a..68979c1 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -255,9 +255,9 @@ (define_constraint Dn @internal - In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov - immediate instruction. - (and (match_code const_vector) + In ARM/Thumb-2 state a const_vector or const_int which can be loaded with a + Neon vmov immediate instruction. 
+ (and (match_code const_vector,const_int) (match_test TARGET_32BIT imm_for_neon_mov_operand (op, GET_MODE (op) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d7caa37..3c88568 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -152,9 +152,9 @@ (define_attr vqh_mnem vadd,vmin,vmax (const_string vadd)) (define_insn *neon_movmode - [(set (match_operand:VD 0 nonimmediate_operand + [(set (match_operand:VDX 0 nonimmediate_operand =w,Uv,w, w, ?r,?w,?r,?r, ?Us) - (match_operand:VD 1 general_operand + (match_operand:VDX 1 general_operand w,w, Dn,Uvi, w, r, r, Usi,r))] TARGET_NEON (register_operand (operands[0], MODEmode) @@ -173,7 +173,7 @@ if (width == 0) return vmov.f32\t%P0, %1 @ mode; else -sprintf (templ, vmov.i%d\t%%P0, %%1 @ mode, width); +sprintf (templ, vmov.i%d\t%%P0, %%x1 @ mode, width); return templ; } diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index b535335..8a8a1f1 100644 --- a/gcc/config/arm/predicates.md +++ b/gcc/config/arm/predicates.md @@ -630,7 +630,7 @@ }) (define_predicate imm_for_neon_mov_operand - (match_code const_vector) + (match_code const_vector,const_int) { return neon_immediate_valid_for_move (op, mode, NULL, NULL); }) diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md index 6530570..2061414 100644 --- a/gcc/config/arm/vfp.md +++ b/gcc/config/arm/vfp.md @@ -138,7 +138,9 @@
Re: [PATCH] Remove bogus assert from CCP's insert_clobbers_for_var
On IRC I've been told that is OK and that CCP cannot make such assumptions. Since it is only a missed-optimization if the call to the builtin is not found and processed (basically PR 51491 again but only in cases like these), I thought it best to just remove the assert by the following simple patch, bootstrapped and tested on x86_64-linux. Please do that on the 4.7 branch as well if the assertion is incorrect. -- Eric Botcazou
RE: [PATCH] SH2A: Don't push/pop registers for functions with resbank attribute
Hi, Looks that the patch ignores the case using movml. It could be something like the attached patch Sorry for ignoring the case using movml. Thanks for the patch which takes care of movml case. though I don't do any tests. The patch was tested with movml testcase and works as expected. Tested with sh2a-elf. No new regressions. Thanks Regards, Naveen
Re: [PATCH][1/n] Cleanup internal interfaces, GCC modularization
On Fri, 30 Mar 2012, Richard Guenther wrote: On Thu, 29 Mar 2012, Jan Hubicka wrote: With Mozilla folks I used the dumps from first WPA unreachable function removal pass with some degree of success. This gets a lot of non-trivial cases of dead code, but also there are a lot of funny false positives wrt comdats etc. + for (caller = node->callers; caller; caller = caller->next_caller) + { + if (!caller_tu) + caller_tu = DECL_CONTEXT (caller->caller->decl); + else if (caller_tu + && DECL_CONTEXT (caller->caller->decl) != caller_tu) + found = false; + } Extending to IPA-REF should be straightforward. + if (found && caller_tu) + { + expanded_location loc1 = expand_location (DECL_SOURCE_LOCATION (node->decl)); + expanded_location loc2 = expand_location (DECL_SOURCE_LOCATION (node->callers->caller->decl)); + + if (DECL_CONTEXT (node->decl) == caller_tu) + fprintf (f, "%s:%s can be made static\n", + loc1.file, IDENTIFIER_POINTER (DECL_NAME (node->decl))); Indeed, this is also useful. Any plans to turn this into general -Wsomething, or will you also stay just with an internal hack like I did? :) ;) The result has way too many false positives (LTO bootstrap produces quite some dead functions due to early inlining). So yes, this will stay internal ;) Btw, the following is the last incarnation of the patch - I've stopped here for now. If you LTO bootstrap with it you can find *.callers files in the build tree (remember to use -O0 for added precision). Richard. 
Index: gcc/lto/lto.c === --- gcc/lto/lto.c (revision 186007) +++ gcc/lto/lto.c (working copy) @@ -2721,6 +2721,70 @@ read_cgraph_and_symbols (unsigned nfiles lto_symtab_merge_cgraph_nodes (); ggc_collect (); + if (flag_wpa) +{ + struct cgraph_node *node; + FILE *f = fopen (concat (dump_base_name, ".callers", NULL), "w"); + for (node = cgraph_nodes; node; node = node->next) + { + tree caller_tu = NULL_TREE; + struct cgraph_edge *caller; + bool found = true; + + if (!TREE_PUBLIC (node->decl) + || !TREE_STATIC (node->decl) + || DECL_PRESERVE_P (node->decl) + || resolution_used_from_other_file_p (node->resolution)) + continue; + + /* For now, until we walk references. */ + if (node->address_taken) + continue; + + if (!node->callers) + { + expanded_location loc = expand_location (DECL_SOURCE_LOCATION (node->decl)); + fprintf (f, "%s:%s no calls\n", + loc.file, IDENTIFIER_POINTER (DECL_NAME (node->decl))); + } + for (caller = node->callers; caller; caller = caller->next_caller) + { + if (!caller_tu) + caller_tu = DECL_CONTEXT (caller->caller->decl); + else if (caller_tu + && DECL_CONTEXT (caller->caller->decl) != caller_tu) + found = false; + } + if (found && caller_tu) + { + expanded_location loc1 = expand_location (DECL_SOURCE_LOCATION (node->decl)); + expanded_location loc2 = expand_location (DECL_SOURCE_LOCATION (node->callers->caller->decl)); + + if (DECL_CONTEXT (node->decl) == caller_tu) + fprintf (f, "%s:%s can be made static\n", +loc1.file, IDENTIFIER_POINTER (DECL_NAME (node->decl))); + else + { + struct cgraph_edge *callee; + bool calls_nonpublic_static_fn = false; + /* Check if we can move node to the caller TU without +moving anything else. */ + for (callee = node->callees; callee; callee = callee->next_callee) + { + if (!TREE_PUBLIC (callee->callee->decl) + && TREE_STATIC (callee->callee->decl)) + calls_nonpublic_static_fn = true; + } + if (!calls_nonpublic_static_fn) + fprintf (f, "%s:%s called only from %s\n", +loc1.file, IDENTIFIER_POINTER (DECL_NAME (node->decl)), +loc2.file); + } + } + } + fclose (f); +} + if (flag_ltrans) for (node = cgraph_nodes; node; node = node->next) {
Re: [Patch V2] libgfortran: do not assume libm
On 30/03/2012 12:22, Tristan Gingold wrote: On Mar 27, 2012, at 10:38 AM, Janne Blomqvist wrote: On Tue, Mar 27, 2012 at 11:01, Tristan Gingold ging...@adacore.com wrote: Hi, this patch fixes this issue. Is it OK? Ok. Maybe we should include the AC_DEFINE action within GCC_CHECK_MATH_FUNC. Will try to do that. That looks like a cleaner solution, yes, and less chance for typos to sneak in. Hi, here is the 'cleaner solution': now GCC_CHECK_MATH_FUNC automatically defines the HAVE_xxx variable. The description is now: Define to 1 if you have the `xxx' function. As a consequence, libgfortran/config.h.in was regenerated (with differences like: -/* acos is available */ +/* Define to 1 if you have the `acos' function. */ #undef HAVE_ACOS ) Tested by rebuilding libgfortran for ia64-hp-openvms and visual inspection of differences. I have CC'd Paolo as he approved the first version of math.m4. OK for trunk? Yes. Paolo
Re: Support for Runtime CPU type detection via builtins (issue5754058)
Hi, On Thu, 29 Mar 2012, Sriraman Tallam wrote: +struct __processor_model +{ + /* Vendor. */ + unsigned int __cpu_is_amd : 1; + unsigned int __cpu_is_intel : 1; + /* CPU type. */ + unsigned int __cpu_is_intel_atom : 1; + unsigned int __cpu_is_intel_core2 : 1; + unsigned int __cpu_is_intel_corei7 : 1; + unsigned int __cpu_is_intel_corei7_nehalem : 1; + unsigned int __cpu_is_intel_corei7_westmere : 1; + unsigned int __cpu_is_intel_corei7_sandybridge : 1; + unsigned int __cpu_is_amdfam10h : 1; + unsigned int __cpu_is_amdfam10h_barcelona : 1; + unsigned int __cpu_is_amdfam10h_shanghai : 1; + unsigned int __cpu_is_amdfam10h_istanbul : 1; + unsigned int __cpu_is_amdfam15h_bdver1 : 1; + unsigned int __cpu_is_amdfam15h_bdver2 : 1; +} __cpu_model; It doesn't make sense for the model to be a bitfield; a processor will only ever have exactly one model. Just make it an enum or even just an int. Ciao, Michael.
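The enum-based layout suggested in the reply could look roughly like the sketch below. This is an illustration of the suggestion only, not the interface that was eventually adopted; every name in it (`processor_vendor`, `processor_model`, `processor_info`, `cpu_model_is`) is hypothetical, and it flattens the type and subtype levels of the quoted struct into a single model value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enumerations: a CPU has exactly one vendor and one model.  */
enum processor_vendor
{
  VENDOR_UNKNOWN,
  VENDOR_INTEL,
  VENDOR_AMD
};

enum processor_model
{
  MODEL_UNKNOWN,
  MODEL_INTEL_ATOM,
  MODEL_INTEL_CORE2,
  MODEL_INTEL_COREI7_NEHALEM,
  MODEL_INTEL_COREI7_WESTMERE,
  MODEL_INTEL_COREI7_SANDYBRIDGE,
  MODEL_AMDFAM10H_BARCELONA,
  MODEL_AMDFAM10H_SHANGHAI,
  MODEL_AMDFAM10H_ISTANBUL,
  MODEL_AMDFAM15H_BDVER1,
  MODEL_AMDFAM15H_BDVER2
};

struct processor_info
{
  enum processor_vendor vendor;
  enum processor_model model;
};

/* Returns nonzero if INFO describes the given model.  */
int
cpu_model_is (const struct processor_info *info, enum processor_model model)
{
  assert (info != NULL);
  return info->model == model;
}
```

With this layout the "exactly one model" invariant holds by construction, whereas a set of one-bit fields allows several model bits to be set at once.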
[ia64/vms]: Reimplement common_object
Hi, this attribute is used for some specialized Ada constructs available only on VMS. The current implementation is flawed when there are two (or more) variables with the same common_object. This patch uses the same mechanism as the one used by Alpha VMS. Manually tested on ia64-hp-openvms. Committed on trunk. Tristan. 2012-03-30 Tristan Gingold ging...@adacore.com * config/ia64/ia64.c (ia64_section_type_flags): Remove common_object attribute handling. (SECTION_VMS_OVERLAY): Remove (ia64_vms_common_object_attribute): Replace abort with an assert. Do not set DECL_SECTION_NAME. (ia64_vms_output_aligned_decl_common): Handle common_object attribute. (ia64_vms_elf_asm_named_section): Remove. * config/ia64/vms.h (TARGET_ASM_NAMED_SECTION): Remove. Index: gcc/config/ia64/ia64.c === --- gcc/config/ia64/ia64.c (revision 186009) +++ gcc/config/ia64/ia64.c (working copy) @@ -740,9 +740,6 @@ return NULL_TREE; } -/* The section must have global and overlaid attributes. */ -#define SECTION_VMS_OVERLAY SECTION_MACH_DEP - /* Part of the low level implementation of DEC Ada pragma Common_Object which enables the shared use of variables stored in overlaid linker areas corresponding to the use of Fortran COMMON. */ @@ -753,24 +750,18 @@ bool *no_add_attrs) { tree decl = *node; -tree id, val; -if (! 
DECL_P (decl)) - abort (); +tree id; + +gcc_assert (DECL_P (decl)); DECL_COMMON (decl) = 1; id = TREE_VALUE (args); -if (TREE_CODE (id) == IDENTIFIER_NODE) - val = build_string (IDENTIFIER_LENGTH (id), IDENTIFIER_POINTER (id)); -else if (TREE_CODE (id) == STRING_CST) - val = id; -else +if (TREE_CODE (id) != IDENTIFIER_NODE TREE_CODE (id) != STRING_CST) { - warning (OPT_Wattributes, -%qE attribute requires a string constant argument, name); + error (%qE attribute requires a string constant argument, name); *no_add_attrs = true; return NULL_TREE; } -DECL_SECTION_NAME (decl) = val; return NULL_TREE; } @@ -783,50 +774,31 @@ { tree attr = DECL_ATTRIBUTES (decl); - /* As common_object attribute set DECL_SECTION_NAME check it before - looking up the attribute. */ - if (DECL_SECTION_NAME (decl) attr) + if (attr) attr = lookup_attribute (common_object, attr); - else -attr = NULL_TREE; - - if (!attr) + if (attr) { - /* Code from elfos.h. */ - fprintf (file, %s, COMMON_ASM_OP); - assemble_name (file, name); - fprintf (file, ,HOST_WIDE_INT_PRINT_UNSIGNED,%u\n, - size, align / BITS_PER_UNIT); -} - else -{ - ASM_OUTPUT_ALIGN (file, floor_log2 (align / BITS_PER_UNIT)); - ASM_OUTPUT_LABEL (file, name); - ASM_OUTPUT_SKIP (file, size ? size : 1); -} -} + tree id = TREE_VALUE (TREE_VALUE (attr)); + const char *name; -/* Definition of TARGET_ASM_NAMED_SECTION for VMS. 
*/ + if (TREE_CODE (id) == IDENTIFIER_NODE) +name = IDENTIFIER_POINTER (id); + else if (TREE_CODE (id) == STRING_CST) +name = TREE_STRING_POINTER (id); + else +abort (); -void -ia64_vms_elf_asm_named_section (const char *name, unsigned int flags, - tree decl) -{ - if (!(flags SECTION_VMS_OVERLAY)) -{ - default_elf_asm_named_section (name, flags, decl); - return; + fprintf (file, \t.vms_common\t\%s\,, name); } - if (flags != (SECTION_VMS_OVERLAY | SECTION_WRITE)) -abort (); + else +fprintf (file, %s, COMMON_ASM_OP); - if (flags SECTION_DECLARED) -{ - fprintf (asm_out_file, \t.section\t%s\n, name); - return; -} + /* Code from elfos.h. */ + assemble_name (file, name); + fprintf (file, ,HOST_WIDE_INT_PRINT_UNSIGNED,%u, + size, align / BITS_PER_UNIT); - fprintf (asm_out_file, \t.section\t%s,\awgO\\n, name); + fputc ('\n', file); } static void @@ -10536,12 +10508,6 @@ || strncmp (name, .gnu.linkonce.sb., 17) == 0) flags = SECTION_SMALL; -#if TARGET_ABI_OPEN_VMS - if (decl DECL_ATTRIBUTES (decl) - lookup_attribute (common_object, DECL_ATTRIBUTES (decl))) -flags |= SECTION_VMS_OVERLAY; -#endif - flags |= default_section_type_flags (decl, name, reloc); return flags; } Index: gcc/config/ia64/vms.h === --- gcc/config/ia64/vms.h (revision 186009) +++ gcc/config/ia64/vms.h (working copy) @@ -121,9 +121,6 @@ #undef TARGET_VALID_POINTER_MODE #define TARGET_VALID_POINTER_MODE ia64_vms_valid_pointer_mode -#undef TARGET_ASM_NAMED_SECTION -#define TARGET_ASM_NAMED_SECTION ia64_vms_elf_asm_named_section - /* Define this macro if it is advisable to hold scalars in registers in a wider mode than
[PING] reload: Try alternative with swapped operands before going to the next
Hi, I've re-tested the patch from: http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01819.html on s390x and x86_64. Ok for mainline? Bye, -Andreas-
[PATCH] Fix PR52772
This fixes PR52772 - prev_bb does not have any relation to the new pre-landing-pad block (not sure what I was thinking here), so this moves the loop updating code to the place where we connect the new block into the CFG. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-03-30 Richard Guenther rguent...@suse.de PR middle-end/52780 * except.c (emit_to_new_bb_before): Move loop updating ... (dw2_build_landing_pads): ... here. Use a proper block for querying the loop father. * g++.dg/torture/pr52772.C: New testcase. Index: gcc/except.c === *** gcc/except.c(revision 186007) --- gcc/except.c(working copy) *** emit_to_new_bb_before (rtx seq, rtx insn *** 918,929 bb = create_basic_block (seq, last, prev_bb); update_bb_for_insn (bb); bb-flags |= BB_SUPERBLOCK; - if (current_loops) - { - add_bb_to_loop (bb, prev_bb-loop_father); - if (prev_bb-loop_father-header == prev_bb) - prev_bb-loop_father-header = bb; - } return bb; } --- 918,923 *** dw2_build_landing_pads (void) *** 995,1000 --- 989,1004 e = make_edge (bb, bb-next_bb, e_flags); e-count = bb-count; e-probability = REG_BR_PROB_BASE; + if (current_loops) + { + struct loop *loop = bb-next_bb-loop_father; + /* If we created a pre-header block, add the new block to the +outer loop, otherwise to the loop itself. 
*/ + if (bb-next_bb == loop-header) + add_bb_to_loop (bb, loop_outer (loop)); + else + add_bb_to_loop (bb, loop); + } } } Index: gcc/testsuite/g++.dg/torture/pr52772.C === *** gcc/testsuite/g++.dg/torture/pr52772.C (revision 0) --- gcc/testsuite/g++.dg/torture/pr52772.C (revision 0) *** *** 0 --- 1,85 + // { dg-do compile } + + typedef __SIZE_TYPE__ size_t; + + class c1; + + class c2 { + public: c2() { }; + void *operator new(size_t size, const c1 crc1); + }; + + class c3 { + public: c3() { _Obj = 0; } + ~c3() { if (_Obj) delete _Obj; } + void set(c2 *pObj); + protected: c2 *_Obj; + }; + + void c3::set(c2 *pObj) { _Obj = pObj; }; + + templateclass TYPE class tc1 : public c2 { + public: tc1(int n=0){}; + int get() const; + TYPE operator[] (int id); + TYPE * _data; + int _size; + }; + + templateclass TYPE TYPE tc1TYPE::operator[] (int id) { + return _data[id]; + } + + templateclass TYPE int tc1TYPE::get() const { + return _size; + } + + class c4 { + public: c4(); + }; + + class c5 : public c2 { + protected: c2 * _own; + public: c5(c2 *o) : _own(o) { } + c5(const c4 box); + int add(const c4 ext); + }; + + class c6 { + public: int get() const {}; + }; + + class c7 { + friend class c8; + int find(c6 * loop) const; + }; + + class c8 { + const c1 _rc1; + int tria(c7 * face, c5 * vtree0 = 0); + }; + + int c8::tria(c7 * face, c5 * vtree0) { + c6 *sLData[64]; + tc1c6* loops(64); + while (loops.get() 1) { + c6 *iloop = 0; + for (int j=1; jloops.get(); j++) { + if (loops[j]-get() 32) { + iloop = loops[j]; + } + } + face-find(iloop); + } + c4 box; + c3 ctree; + c5 *vtree = vtree0; + if (!vtree) { + vtree = new (_rc1) c5(box); + ctree.set(vtree); + for (int j=0; j1; j++) { + c4 sVBBox; + vtree-add(sVBBox); + } + } + }
[PATCH] Fix PR52754
This fixes another case of bogusly reconstructed array references. The symptom is that predictive commoning creates negative array indices (and in a wrong way, too). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-03-30 Richard Guenther rguent...@suse.de PR tree-optimization/52754 * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Only propagate arbitrary addresses into really plain dereferences. * gcc.target/i386/pr52754.c: New testcase. Index: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 185999) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -905,6 +905,7 @@ forward_propagate_addr_expr_1 (tree name that of the pointed-to type of the address we can put the dereferenced address on the LHS preserving the original alias-type. */ else if (gimple_assign_lhs (use_stmt) == lhs + integer_zerop (TREE_OPERAND (lhs, 1)) useless_type_conversion_p (TREE_TYPE (TREE_OPERAND (def_rhs, 0)), TREE_TYPE (gimple_assign_rhs1 (use_stmt @@ -917,9 +918,8 @@ forward_propagate_addr_expr_1 (tree name if (TREE_CODE (*def_rhs_basep) == MEM_REF) { new_base = TREE_OPERAND (*def_rhs_basep, 0); - new_offset - = int_const_binop (PLUS_EXPR, TREE_OPERAND (lhs, 1), - TREE_OPERAND (*def_rhs_basep, 1)); + new_offset = fold_convert (TREE_TYPE (TREE_OPERAND (lhs, 1)), +TREE_OPERAND (*def_rhs_basep, 1)); } else { @@ -989,6 +989,7 @@ forward_propagate_addr_expr_1 (tree name that of the pointed-to type of the address we can put the dereferenced address on the RHS preserving the original alias-type. 
*/ else if (gimple_assign_rhs1 (use_stmt) == rhs + integer_zerop (TREE_OPERAND (rhs, 1)) useless_type_conversion_p (TREE_TYPE (gimple_assign_lhs (use_stmt)), TREE_TYPE (TREE_OPERAND (def_rhs, 0 @@ -1001,9 +1002,8 @@ forward_propagate_addr_expr_1 (tree name if (TREE_CODE (*def_rhs_basep) == MEM_REF) { new_base = TREE_OPERAND (*def_rhs_basep, 0); - new_offset - = int_const_binop (PLUS_EXPR, TREE_OPERAND (rhs, 1), - TREE_OPERAND (*def_rhs_basep, 1)); + new_offset = fold_convert (TREE_TYPE (TREE_OPERAND (rhs, 1)), +TREE_OPERAND (*def_rhs_basep, 1)); } else { Index: gcc/testsuite/gcc.target/i386/pr52754.c === --- gcc/testsuite/gcc.target/i386/pr52754.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr52754.c (revision 0) @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options -O2 -fpredictive-commoning -msse2 -std=c99 } */ +/* { dg-require-effective-target sse2 } */ + +#include x86intrin.h + +#include isa-check.h +#include sse-os-support.h + +int main() +{ +const float mem[8] = { 0, 1, 2, 3, 4, 5, 6, 7 }; + +unsigned int indexes[8]; +for (unsigned int i = 0; i 8; ++i) indexes[i] = i; + +check_isa (); + +if (!sse_os_support ()) + exit (0); + +__m128 x = _mm_setr_ps(0, 1, 2, 3); +for (unsigned int i = 0; i + 4 6; ++i) { +const unsigned int *ii = indexes[i]; +const __m128 tmp = _mm_setr_ps(mem[ii[0]], mem[ii[1]], mem[ii[2]], mem[ii[3]]); +if (0xf != _mm_movemask_ps(_mm_cmpeq_ps(tmp, x))) { +__builtin_abort(); +} +x = _mm_add_ps(x, _mm_set1_ps(1)); +} + +return 0; +}
[VMS]: Build gfortran
Hi, this is the minimum required to build gfortran on VMS. As vms.h defines TARGET_EXTRA_INCLUDES, we need to add an empty replacement for gfortran not unlike darwin. Tested by building gfortran. Committed on trunk. Tristan. 2012-03-30 Tristan Gingold ging...@adacore.com * config/vms/vms-f.c: New file. * config/vms/t-vms (vms-f.o): New rule. * config.gcc (*-*-*vms*): Define fortran_target_objs. Index: gcc/config.gcc === --- gcc/config.gcc (revision 186009) +++ gcc/config.gcc (working copy) @@ -740,6 +740,7 @@ xm_file=vms/xm-vms.h c_target_objs=vms-c.o cxx_target_objs=vms-c.o + fortran_target_objs=vms-f.o use_gcc_stdint=provide tm_file=${tm_file} vms/vms-stdint.h if test x$gnu_ld != xyes; then Index: gcc/config/vms/t-vms === --- gcc/config/vms/t-vms(revision 186009) +++ gcc/config/vms/t-vms(working copy) @@ -34,3 +34,8 @@ $(TM_P_H) $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(PREPROCESSOR_DEFINES) $ -o $@ + +vms-f.o: $(srcdir)/config/vms/vms-f.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(PREPROCESSOR_DEFINES) $ -o $@ Index: gcc/config/vms/vms-f.c === --- gcc/config/vms/vms-f.c (revision 0) +++ gcc/config/vms/vms-f.c (revision 0) @@ -0,0 +1,31 @@ +/* VMS support needed only by Fortran frontends. + Copyright (C) 2012 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. 
*/ + +#include config.h +#include system.h +#include coretypes.h +#include tm.h + +void +vms_c_register_includes (const char *sysroot ATTRIBUTE_UNUSED, + const char *iprefix ATTRIBUTE_UNUSED, +int stdinc ATTRIBUTE_UNUSED) +{ + /* No-op for fortran. */ +}
[PATCH H8300] Added -mno-exr option in case of function with monitor attribute
Hi,

Please find attached a patch that avoids saving the EXR register in monitor functions. By default, the prologue of a monitor function pushes the EXR register onto the stack. This is not required for the H8S/224x and 21xx variants of the H8S controllers. The behavior can be controlled with the -mno-exr option.

The compiler was built for the C language only. No regressions were found with this patch.

Compiler behavior with the patch applied, for different command-line options:

* h8300-elf-gcc -mn -S test.c
  test.c:1:0: error: -mn is used without -mh or -ms or -msx
* h8300-elf-gcc -mh -mexr -S test.c
  test.c:1:0: error: -mexr is used without -ms
* h8300-elf-gcc -mh -mno-exr -S test.c
  test.c:1:0: warning: -mno-exr valid only with -ms or -msx - Option ignored! [-mno-exr]
* Generated assembly without option '-mno-exr':
  _testmonitor:
      stc exr,@-er7
      mov.l er0,@-er7
      stc ccr,r0l
* Generated assembly with option '-mno-exr':
  _testmonitor:
      mov.l er0,@-er7
      stc ccr,r0l

Please review the patch and let me know if any modifications are needed.

Regards,
Sandeep Kumar Singh,
KPIT Cummins InfoSystems Ltd.
Pune, India

ChangeLog.patch Description: ChangeLog.patch
patch-EXR.patch Description: patch-EXR.patch
Re: [Patch, i386] Limit unroll factor for certain loops on Corei7
Pulling this one back as I have a better solution, patch coming shortly. Thanks, Teresa On Fri, Mar 16, 2012 at 3:33 PM, Teresa Johnson tejohn...@google.com wrote: Ping - now that stage 1 is open, could someone review? Thanks, Teresa On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson tejohn...@google.com wrote: Latest patch which improves the efficiency as described below is included here. Boostrapped and checked again with x86_64-unknown-linux-gnu. Could someone review? Thanks, Teresa 2011-12-04 Teresa Johnson tejohn...@google.com * loop-unroll.c (decide_unroll_constant_iterations): Call loop unroll target hook. * config/i386/i386.c (ix86_loop_unroll_adjust): New function. (TARGET_LOOP_UNROLL_ADJUST): Define hook for x86. === --- loop-unroll.c (revision 181902) +++ loop-unroll.c (working copy) @@ -547,6 +547,9 @@ decide_unroll_constant_iterations (struc if (nunroll (unsigned) PARAM_VALUE (PARAM_MAX_UNROLL_TIMES)) nunroll = PARAM_VALUE (PARAM_MAX_UNROLL_TIMES); + if (targetm.loop_unroll_adjust) + nunroll = targetm.loop_unroll_adjust (nunroll, loop); + /* Skip big loops. */ if (nunroll = 1) { Index: config/i386/i386.c === --- config/i386/i386.c (revision 181902) +++ config/i386/i386.c (working copy) @@ -60,6 +60,7 @@ along with GCC; see the file COPYING3. #include fibheap.h #include opts.h #include diagnostic.h +#include cfgloop.h enum upper_128bits_state { @@ -38370,6 +38371,82 @@ ix86_autovectorize_vector_sizes (void) return (TARGET_AVX !TARGET_PREFER_AVX128) ? 32 | 16 : 0; } +/* If LOOP contains a possible LCP stalling instruction on corei7, + calculate new number of times to unroll instead of NUNROLL so that + the unrolled loop will still likely fit into the loop stream detector. 
*/ +static unsigned +ix86_loop_unroll_adjust (unsigned nunroll, struct loop *loop) +{ + basic_block *body, bb; + unsigned i; + rtx insn; + bool found = false; + unsigned newunroll; + + if (ix86_tune != PROCESSOR_COREI7_64 + ix86_tune != PROCESSOR_COREI7_32) + return nunroll; + + /* Look for instructions that store a constant into HImode (16-bit) + memory. These require a length-changing prefix and on corei7 are + prone to LCP stalls. These stalls can be avoided if the loop + is streamed from the loop stream detector. */ + body = get_loop_body (loop); + for (i = 0; i loop-num_nodes; i++) + { + bb = body[i]; + + FOR_BB_INSNS (bb, insn) + { + rtx set_expr, dest; + set_expr = single_set (insn); + if (!set_expr) + continue; + + dest = SET_DEST (set_expr); + + /* Don't reduce unroll factor in loops with floating point + computation, which tend to benefit more heavily from + larger unroll factors and are less likely to bottleneck + at the decoder. */ + if (FLOAT_MODE_P (GET_MODE (dest))) + { + free (body); + return nunroll; + } + + if (!found + GET_MODE (dest) == HImode + CONST_INT_P (SET_SRC (set_expr)) + MEM_P (dest)) + { + found = true; + /* Keep walking loop body to look for FP computations above. */ + } + } + } + free (body); + + if (!found) + return nunroll; + + if (dump_file) + { + fprintf (dump_file, + ;; Loop contains HImode store of const (possible LCP stalls),\n); + fprintf (dump_file, + reduce unroll factor to fit into Loop Stream Detector\n); + } + + /* On corei7 the loop stream detector can hold 28 uops, so + don't allow unrolling to exceed that many instructions. */ + newunroll = 28 / loop-av_ninsns; + if (newunroll nunroll) + return newunroll; + + return nunroll; +} + /* Initialize the GCC target structure. 
*/ #undef TARGET_RETURN_IN_MEMORY #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory @@ -38685,6 +38762,9 @@ ix86_autovectorize_vector_sizes (void) #define TARGET_INIT_LIBFUNCS darwin_rename_builtins #endif +#undef TARGET_LOOP_UNROLL_ADJUST +#define TARGET_LOOP_UNROLL_ADJUST ix86_loop_unroll_adjust + struct gcc_target targetm = TARGET_INITIALIZER; #include gt-i386.h On Fri, Dec 2, 2011 at 12:11 PM, Teresa Johnson tejohn...@google.com wrote: On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen a...@firstfloor.org wrote: Teresa Johnson tejohn...@google.com writes: Interesting optimization. I would be concerned a little bit about compile time, does it make a
[Patch, i386] Avoid LCP stalls (issue5975045)
This patch addresses instructions that incur expensive length-changing prefix (LCP) stalls on some x86-64 implementations, notably Core2 and Corei7. Specifically, a move of a 16-bit constant into memory requires a length-changing prefix and can incur significant penalties. The attached patch avoids this by forcing such instructions to be split into two: a move of the corresponding 32-bit constant into a register, and a move of the register's lower 16 bits into memory. Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk? Thanks, Teresa 2012-03-29 Teresa Johnson tejohn...@google.com * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_LCP_STALL. * config/i386/i386.md (movhi_internal): Split to movhi_internal and movhi_imm_internal. * config/i386/i386.c (initial_ix86_tune_features): Initialize X86_TUNE_LCP_STALL entry. Index: config/i386/i386.h === --- config/i386/i386.h (revision 185920) +++ config/i386/i386.h (working copy) @@ -262,6 +262,7 @@ enum ix86_tune_indices { X86_TUNE_MOVX, X86_TUNE_PARTIAL_REG_STALL, X86_TUNE_PARTIAL_FLAG_REG_STALL, + X86_TUNE_LCP_STALL, X86_TUNE_USE_HIMODE_FIOP, X86_TUNE_USE_SIMODE_FIOP, X86_TUNE_USE_MOV0, @@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL] #define TARGET_PARTIAL_FLAG_REG_STALL \ ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL] +#define TARGET_LCP_STALL \ + ix86_tune_features[X86_TUNE_LCP_STALL] #define TARGET_USE_HIMODE_FIOP ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP] #define TARGET_USE_SIMODE_FIOP ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP] #define TARGET_USE_MOV0ix86_tune_features[X86_TUNE_USE_MOV0] Index: config/i386/i386.md === --- config/i386/i386.md (revision 185920) +++ config/i386/i386.md (working copy) @@ -2262,9 +2262,19 @@ ] (const_string SI)))]) +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) +(match_operand:HI 1 immediate_operand n))] + 
!TARGET_LCP_STALL !(MEM_P (operands[0]) MEM_P (operands[1])) +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) { switch (get_attr_type (insn)) Index: config/i386/i386.c === --- config/i386/i386.c (revision 185920) +++ config/i386/i386.c (working copy) @@ -1964,6 +1964,11 @@ static unsigned int initial_ix86_tune_features[X86 /* X86_TUNE_PARTIAL_FLAG_REG_STALL */ m_CORE2I7 | m_GENERIC, + /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall + * on 16-bit immediate moves into memory on Core2 and Corei7, + * which may also affect AMD implementations. */ + m_CORE2I7 | m_GENERIC | m_AMD_MULTIPLE, + /* X86_TUNE_USE_HIMODE_FIOP */ m_386 | m_486 | m_K6_GEODE, -- This patch is available for review at http://codereview.appspot.com/5975045
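To make the target of this optimization concrete, here is a small C function of the kind that leads to a 16-bit immediate store. The function name and the constant are made up. The instruction sequences in the comment are illustrative AT&T-syntax sketches of the two forms described in the message, not verified compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: store a 16-bit constant to memory.  */
static void
store_const16_sketch (uint16_t *p)
{
  assert (p != 0);
  /* Direct form, which the patch tries to avoid on affected tunings:
         movw $1234, (%rdi)   ; operand-size prefix changes the immediate
                              ; length, so the decoder may LCP-stall
     Split form described in the message:
         movl $1234, %eax     ; 32-bit immediate, no length-changing prefix
         movw %ax, (%rdi)     ; 16-bit store from a register, no immediate  */
  *p = 1234;
}
```

The split form costs a scratch register, which is why the replies in this thread discuss only applying it when a free register is available.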
[Patch]: Support VMS in libstdc++ crossconfig.m4
Hi,

currently all VMS compilers are built on Unix. So, to build the libstdc++ library, this looks like the minimum required.

Tested by building libstdc++ for ia64-hp-openvms.

Ok for trunk?

Tristan.

libstdc++/
2012-03-30  Tristan Gingold  ging...@adacore.com

	* crossconfig.m4 (*-*-*vms*): Add.
	* configure: Regenerate.

diff --git a/libstdc++-v3/crossconfig.m4 b/libstdc++-v3/crossconfig.m4
index 3850879..e208fbf 100644
--- a/libstdc++-v3/crossconfig.m4
+++ b/libstdc++-v3/crossconfig.m4
@@ -241,6 +241,12 @@ case ${host} in
       AC_DEFINE(HAVE_ISNANL)
     fi
     ;;
+  *-*vms*)
+    # Check for available headers.
+    # Don't call GLIBCXX_CHECK_LINKER_FEATURES, VMS doesn't have a GNU ld
+    GLIBCXX_CHECK_MATH_SUPPORT
+    GLIBCXX_CHECK_STDLIB_SUPPORT
+    ;;
   *-vxworks)
     AC_DEFINE(HAVE_ACOSF)
     AC_DEFINE(HAVE_ASINF)
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
I should add that I have tested performance of this on Core2, Corei7 (Nehalem) and AMD Opteron-based systems. It appears to be performance-neutral on AMD (only minor perturbations, overall a wash). For the test case that provoked the optimization, there were nice improvements on Core2 and Corei7. Thanks, Teresa On Fri, Mar 30, 2012 at 7:18 AM, Teresa Johnson tejohn...@google.com wrote: This patch addresses instructions that incur expensive length-changing prefix (LCP) stalls on some x86-64 implementations, notably Core2 and Corei7. Specifically, a move of a 16-bit constant into memory requires a length-changing prefix and can incur significant penalties. The attached patch avoids this by forcing such instructions to be split into two: a move of the corresponding 32-bit constant into a register, and a move of the register's lower 16 bits into memory. Bootstrapped and tested on x86_64-unknown-linux-gnu. Is this ok for trunk? Thanks, Teresa 2012-03-29 Teresa Johnson tejohn...@google.com * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_LCP_STALL. * config/i386/i386.md (movhi_internal): Split to movhi_internal and movhi_imm_internal. * config/i386/i386.c (initial_ix86_tune_features): Initialize X86_TUNE_LCP_STALL entry. 
Index: config/i386/i386.h === --- config/i386/i386.h (revision 185920) +++ config/i386/i386.h (working copy) @@ -262,6 +262,7 @@ enum ix86_tune_indices { X86_TUNE_MOVX, X86_TUNE_PARTIAL_REG_STALL, X86_TUNE_PARTIAL_FLAG_REG_STALL, + X86_TUNE_LCP_STALL, X86_TUNE_USE_HIMODE_FIOP, X86_TUNE_USE_SIMODE_FIOP, X86_TUNE_USE_MOV0, @@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL] #define TARGET_PARTIAL_FLAG_REG_STALL \ ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL] +#define TARGET_LCP_STALL \ + ix86_tune_features[X86_TUNE_LCP_STALL] #define TARGET_USE_HIMODE_FIOP ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP] #define TARGET_USE_SIMODE_FIOP ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP] #define TARGET_USE_MOV0 ix86_tune_features[X86_TUNE_USE_MOV0] Index: config/i386/i386.md === --- config/i386/i386.md (revision 185920) +++ config/i386/i386.md (working copy) @@ -2262,9 +2262,19 @@ ] (const_string SI)))]) +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) + (match_operand:HI 1 immediate_operand n))] + !TARGET_LCP_STALL !(MEM_P (operands[0]) MEM_P (operands[1])) +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) { switch (get_attr_type (insn)) Index: config/i386/i386.c === --- config/i386/i386.c (revision 185920) +++ config/i386/i386.c (working copy) @@ -1964,6 +1964,11 @@ static unsigned int initial_ix86_tune_features[X86 /* X86_TUNE_PARTIAL_FLAG_REG_STALL */ m_CORE2I7 | m_GENERIC, + /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall + * on 16-bit immediate moves into memory on Core2 and Corei7, + * which may also affect AMD 
implementations. */ + m_CORE2I7 | m_GENERIC | m_AMD_MULTIPLE, + /* X86_TUNE_USE_HIMODE_FIOP */ m_386 | m_486 | m_K6_GEODE, -- This patch is available for review at http://codereview.appspot.com/5975045 -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[libiberty] Avoid compiler warnings in stack-limit.c
Hi,

there are some systems on which getrlimit/setrlimit are not available, and compiling stack-limit.c on these systems generates a warning. Cleaned up with this patch.

Tested while building gcc for ia64-hp-openvms on x86_64-darwin.

Ok for trunk?

Tristan.

libiberty/
2012-03-30  Tristan Gingold  ging...@adacore.com

	* stack-limit.c: Include ansidecl.h.
	(stack_limit_increase): Add ATTRIBUTE_UNUSED.

diff --git a/libiberty/stack-limit.c b/libiberty/stack-limit.c
index e64cac2..82c3d44 100644
--- a/libiberty/stack-limit.c
+++ b/libiberty/stack-limit.c
@@ -34,6 +34,7 @@ Attempt to increase stack size limit to @var{pref} bytes if possible.
 */
 
 #include "config.h"
+#include "ansidecl.h"
 
 #ifdef HAVE_STDINT_H
 #include <stdint.h>
@@ -43,7 +44,7 @@ Attempt to increase stack size limit to @var{pref} bytes if possible.
 #endif
 
 void
-stack_limit_increase (unsigned long pref)
+stack_limit_increase (unsigned long pref ATTRIBUTE_UNUSED)
 {
 #if defined(HAVE_SETRLIMIT) && defined(HAVE_GETRLIMIT) \
     && defined(RLIMIT_STACK) && defined(RLIM_INFINITY)
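The pattern behind this fix can be sketched in isolation. This is a minimal, self-contained illustration and not the libiberty source: `SKETCH_ATTRIBUTE_UNUSED` is a local stand-in for `ATTRIBUTE_UNUSED` from ansidecl.h, and `SKETCH_HAVE_LIMIT_API` is a hypothetical macro playing the role of the `HAVE_SETRLIMIT`/`HAVE_GETRLIMIT` checks. It is left undefined to model a system without those calls.

```c
#include <assert.h>

/* Local stand-in for ATTRIBUTE_UNUSED (an assumption, so that the sketch
   compiles without libiberty headers).  */
#if defined(__GNUC__)
# define SKETCH_ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#else
# define SKETCH_ATTRIBUTE_UNUSED
#endif

#ifdef SKETCH_HAVE_LIMIT_API
static unsigned long applied_limit_sketch;
#endif

/* Returns 1 if the limit was applied, 0 if the platform has no way to do
   so.  In the second case PREF is never read, which is what would trigger
   an unused-parameter warning without the attribute.  */
static int
limit_increase_sketch (unsigned long pref SKETCH_ATTRIBUTE_UNUSED)
{
#ifdef SKETCH_HAVE_LIMIT_API
  assert (pref > 0);
  applied_limit_sketch = pref;   /* the only use of PREF */
  return 1;
#else
  return 0;
#endif
}
```

The attribute only says the parameter may be unused, so the same declaration works unchanged on systems where the body does use it.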
Re: [C Patch]: pr52543
This patch takes a different approach to fixing PR52543 than does the patch in http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00641.html This patch transforms the lower-subreg pass(es) from unconditionally splitting wide moves, zero extensions, and shifts, so that it now takes into account the target-specific costs and only does the transformations if it is profitable.

As far as I understand the pass, it's not only about splitting these instructions but also about the additional benefits of the split, i.e. AND 0xfffe will only need one QI operation instead of one SI operation that costs 4 QI. In fact, the positive benefit of subreg lowering occurs with bit-wise operations like AND, IOR, EOR etc. One problem is that the pass is not sensitive to address spaces. For example, HI splits for the generic space are profitable; for non-generic spaces they are not. Thus, a patch should also address address-space sensitivity.

Unconditional splitting is a problem that occurs not only on the AVR but also on the ARM NEON and my private port. Furthermore, it is a problem that is likely to occur on most modern larger machines, since these machines are more likely to have fast instructions for moving things that are larger than word mode. At compiler initialization time, each mode that is larger than word mode is examined to determine whether the cost of moving a value of that mode is less than the cost of inserting the proper number of word-sized moves. If it is cheaper to split it up, a bit is set to allow moves of that mode to be lowered.

As written above, the mode is *not* enough. For MEM there is also an address space involved.

Johann
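The initialization-time cost check described in this exchange can be sketched as follows. This is a simplified model and not the lower-subreg code from the patch: the word size, the cost table and all names ending in `_sketch` are assumptions, and the cost values are invented.

```c
#include <assert.h>

#define WORD_BYTES_SKETCH 4        /* assumed word size */
#define MAX_MODE_BYTES_SKETCH 16   /* widest mode considered */

/* Hypothetical cost table: cost of moving N bytes with a single wide
   instruction (0 means no such instruction exists).  Values are made up.  */
static const int wide_move_cost_sketch[MAX_MODE_BYTES_SKETCH + 1] = {
  [4] = 1, [8] = 4, [16] = 1
};

/* One flag per mode size: nonzero if moves of that size should be split
   into word-sized moves.  */
static unsigned char split_move_sketch[MAX_MODE_BYTES_SKETCH + 1];

static void
init_lowering_sketch (void)
{
  int word_cost = wide_move_cost_sketch[WORD_BYTES_SKETCH];
  int bytes;

  assert (word_cost > 0);
  for (bytes = 2 * WORD_BYTES_SKETCH; bytes <= MAX_MODE_BYTES_SKETCH;
       bytes *= 2)
    {
      int n_words = bytes / WORD_BYTES_SKETCH;
      int wide = wide_move_cost_sketch[bytes];

      /* Split if there is no wide move, or if N word moves are cheaper.  */
      split_move_sketch[bytes]
        = (unsigned char) (wide == 0 || n_words * word_cost < wide);
    }
}
```

With these invented costs, 8-byte moves are split (two word moves cost 2, the wide move costs 4) and 16-byte moves are kept whole. The objection raised in the reply maps onto the table shape: it is indexed by mode size only, so a target whose costs differ per address space would need a further index.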
[Patch, i386] Avoid LCP stalls (issue5975045)
Minor update to patch to remove unnecessary check in new movhi_imm_internal define_insn. Retested successfully. Teresa 2012-03-29 Teresa Johnson tejohn...@google.com * config/i386/i386.h (ix86_tune_indices): Add X86_TUNE_LCP_STALL. * config/i386/i386.md (movhi_internal): Split to movhi_internal and movhi_imm_internal. * config/i386/i386.c (initial_ix86_tune_features): Initialize X86_TUNE_LCP_STALL entry. Index: config/i386/i386.h === --- config/i386/i386.h (revision 185920) +++ config/i386/i386.h (working copy) @@ -262,6 +262,7 @@ enum ix86_tune_indices { X86_TUNE_MOVX, X86_TUNE_PARTIAL_REG_STALL, X86_TUNE_PARTIAL_FLAG_REG_STALL, + X86_TUNE_LCP_STALL, X86_TUNE_USE_HIMODE_FIOP, X86_TUNE_USE_SIMODE_FIOP, X86_TUNE_USE_MOV0, @@ -340,6 +341,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL] #define TARGET_PARTIAL_FLAG_REG_STALL \ ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL] +#define TARGET_LCP_STALL \ + ix86_tune_features[X86_TUNE_LCP_STALL] #define TARGET_USE_HIMODE_FIOP ix86_tune_features[X86_TUNE_USE_HIMODE_FIOP] #define TARGET_USE_SIMODE_FIOP ix86_tune_features[X86_TUNE_USE_SIMODE_FIOP] #define TARGET_USE_MOV0ix86_tune_features[X86_TUNE_USE_MOV0] Index: config/i386/i386.md === --- config/i386/i386.md (revision 185920) +++ config/i386/i386.md (working copy) @@ -2262,9 +2262,19 @@ ] (const_string SI)))]) +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) +(match_operand:HI 1 immediate_operand n))] + !TARGET_LCP_STALL +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) { switch (get_attr_type (insn)) Index: config/i386/i386.c === --- config/i386/i386.c 
(revision 185920) +++ config/i386/i386.c (working copy) @@ -1964,6 +1964,11 @@ static unsigned int initial_ix86_tune_features[X86 /* X86_TUNE_PARTIAL_FLAG_REG_STALL */ m_CORE2I7 | m_GENERIC, + /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall + * on 16-bit immediate moves into memory on Core2 and Corei7, + * which may also affect AMD implementations. */ + m_CORE2I7 | m_GENERIC | m_AMD_MULTIPLE, + /* X86_TUNE_USE_HIMODE_FIOP */ m_386 | m_486 | m_K6_GEODE, -- This patch is available for review at http://codereview.appspot.com/5975045
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
Mike Stump mikest...@comcast.net writes: Here is the new patch. OK for trunk if there are no regressions on Linux/ia32 and Linux/x86-64? Too bad you didn't test 32-bit darwin, causes: http://gcc.gnu.org/PR52784 Could you please revert or fix, thanks. Same problem on Solaris 10 and 11/x86. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
On 03/30/2012 11:03 AM, Teresa Johnson wrote: +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) +(match_operand:HI 1 immediate_operand n))] + !TARGET_LCP_STALL +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) For reload to work correctly, all alternatives must remain part of the same pattern. This issue should be handled with the ISA and ENABLED attributes. r~
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 185920)
+++ config/i386/i386.md (working copy)
@@ -2262,9 +2262,19 @@
 ] (const_string "SI")))])
+(define_insn "*movhi_imm_internal"
+  [(set (match_operand:HI 0 "memory_operand" "=m")
+        (match_operand:HI 1 "immediate_operand" "n"))]
+  "!TARGET_LCP_STALL && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+{
+  return "mov{w}\t{%1, %0|%0, %1}";
+}
+  [(set (attr "type") (const_string "imov"))
+   (set (attr "mode") (const_string "HI"))])
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m")
-        (match_operand:HI 1 "general_operand" "r,rn,rm,rn"))]
+        (match_operand:HI 1 "general_operand" "r,rn,rm,r"))]

If you do this, you will prevent reload from considering the immediate for rematerialization when the register holding the constant is on the stack on !TARGET_LCP_STALL machines. The matching pattern for moves should really handle all available alternatives, so reload is happy.

You can duplicate the pattern, but I think this is much better done as a post-reload peephole2. I.e. ask for a scratch register and, if it is available, do the splitting. This way the optimization won't happen when there is no register available, and we will also rely on post-reload cleanups to unify moves of constants, but I think this should work well.

You also want to conditionalize the split on optimize_insn_for_speed_p ().

   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
 {
   switch (get_attr_type (insn))
Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 185920)
+++ config/i386/i386.c (working copy)
@@ -1964,6 +1964,11 @@ static unsigned int initial_ix86_tune_features[X86
   /* X86_TUNE_PARTIAL_FLAG_REG_STALL */
   m_CORE2I7 | m_GENERIC,
+  /* X86_TUNE_LCP_STALL: Avoid an expensive length-changing prefix stall
+   * on 16-bit immediate moves into memory on Core2 and Corei7,
+   * which may also affect AMD implementations.  */
+  m_CORE2I7 | m_GENERIC | m_AMD_MULTIPLE,

Is this supposed to help AMD? (At least the pre-Bulldozer designs should not care about length-changing prefixes that much, because they tag sizes in the cache.) If not, I would suggest enabling it only for cores and generic.

Honza
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
On 03/30/2012 11:11 AM, Richard Henderson wrote:
On 03/30/2012 11:03 AM, Teresa Johnson wrote:
+(define_insn "*movhi_imm_internal"
+  [(set (match_operand:HI 0 "memory_operand" "=m")
+        (match_operand:HI 1 "immediate_operand" "n"))]
+  "!TARGET_LCP_STALL"
+{
+  return "mov{w}\t{%1, %0|%0, %1}";
+}
+  [(set (attr "type") (const_string "imov"))
+   (set (attr "mode") (const_string "HI"))])
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m")
-        (match_operand:HI 1 "general_operand" "r,rn,rm,rn"))]
+        (match_operand:HI 1 "general_operand" "r,rn,rm,r"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"

For reload to work correctly, all alternatives must remain part of the same pattern. This issue should be handled with the ISA and ENABLED attributes.

I'll also ask whether this would be better handled with a peephole2. While movw $1234,(%eax) might be expensive, is it so expensive that we *must* force the use of a free register? Might it be better to split the insn in two only if a free register exists? That can easily be done with a peephole2 pattern...

r~
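For readers following along, the kind of source that leads to the store under discussion is tiny. The following is only a sketch: the function names are made up for illustration, and whether a given compiler actually emits a direct 16-bit immediate store for the first function, or the two-instruction form for the second, depends on the target, tuning and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: store a 16-bit constant straight to memory.
   On x86 this is the candidate for a "movw $1234, (mem)" style insn,
   i.e. an operand-size prefix combined with a 16-bit immediate.  */
void store_imm16(uint16_t *p)
{
    assert(p != 0);
    *p = 1234;
}

/* The same store written through a full-width temporary, mirroring
   the split form: load the constant into a register first, then
   store the low 16 bits of that register.  */
void store_via_reg(uint16_t *p)
{
    uint32_t tmp = 1234;   /* constant materialized at full width */
    assert(p != 0);
    *p = (uint16_t) tmp;   /* 16-bit store from the temporary */
}
```

Both functions have identical observable behaviour; the question in this thread is purely which machine instructions get chosen, and whether the second form should be used only when a scratch register happens to be free.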
Re: [C Patch]: pr52543
+ There are two useful preprocessor defines for use by maintainers: + + #define LOG_COSTS + + if you wish to see the actual cost estimates that are being used + for each mode wider than word mode and the cost estimates for zero + extension and the shifts. This can be useful when port maintainers + are tuning insn rtx costs. + + #define FORCE_LOWERING + + if you wish to test the pass with all the transformation forced on. + This can be useful for finding bugs in the transformations. Must admit I'm not keen on these kinds of macro, but it's Ian's call. Idea for the future (i.e. not this patch) is to have a dump file for target initialisation. Imagine my horror when i did all of this as you had privately suggested and discovered that there was no way to log what i was doing. This is good enough until someone wants to fix the general problem. +/* This pass can transform 4 different operations: move, ashift, + lshiftrt, and zero_extend. There is a boolean vector for move + splitting that is indexed by mode and is true for each mode that is + to have its copies split. The other three operations are only done + for one mode so they are only controlled by a single boolean .*/ As mentioned privately, whether this is profitable for shifts depends to some extent on the shift amount. GCC already supports targets where this transformation would be OK for some shift amounts but not others. So for shifts, I think this should be an array of HOST_BITS_PER_WIDE_INT booleans rather than just one. More comments below about how this filters through your other changes. I think that you actually are missing what i am doing with this. I look at 3 representative values that should discover any non uniformities. If any of them are profitable, i set this bit. Then at the point where i really have to pull the trigger on a real instance, i check the shift amount used at that spot to see if the individual shift is profitable. I did this for two reasons. 
One of them was that i was a little concerned that HOST_BITS_PER_WIDE_INT on the smallest host was not as big as the bitsize of word_mode on the largest target (it could be but this knowledge is above my pay grade). The other reason was that i did not see this as a common operation and checking it on demand seemed like the winner. I will do everything else you mention and resubmit after i fix Ramana's ICE.
Re: [C Patch]: pr52543
On 03/30/2012 10:39 AM, Georg-Johann Lay wrote:
This patch takes a different approach to fixing PR52543 than does the patch in http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00641.html

This patch transforms the lower-subreg pass(es) from unconditionally splitting wide moves, zero extensions, and shifts, so that it now takes into account the target specific costs and only does the transformations if it is profitable.

As far as I understand the pass, it's not only about splitting these instructions but also about the additional benefits of the split, i.e. AND 0xfffe will only need one QI operation instead of one SI operation that costs 4 QI. And in fact, the positive benefit of subreg-lowering occurs with bit-wise operations like AND, IOR, EOR etc.

And one problem is that the pass is not sensitive to address spaces. For example, HI splits for the generic space are profitable, for non-generic they are not. Thus, a patch should also address address-space sensitivity.

No, this pass only splits operations that are wider than word mode into word mode sized chunks. On a machine where word mode is SI, it will split DI shifts and zero extends and any moves wider than SI mode into a series of SI operations. It does nothing for things in QI or HI mode.

The pass was written before there were machines with fast vector move operations. It might be that address space considerations need to be taken into account. Someone who has a port with memory operations like this may want to consider making that enhancement. I think that once this patch is in place, that kind of change will be easier to incorporate.

Unconditional splitting is a problem that not only occurs on the AVR but is also a problem on the ARM NEON and my private port. Furthermore, it is a problem that is likely to occur on most modern larger machines since these machines are more likely to have fast instructions for moving things that are larger than word mode.

At compiler initialization time, each mode that is larger than word mode is examined to determine if the cost of moving a value of that mode is less expensive than inserting the proper number of word-sized moves. If it is cheaper to split it up, a bit is set to allow moves of that mode to be lowered.

As written above, the mode is *not* enough. For MEM there is also an address space involved.

Johann
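To make "splitting into word mode sized chunks" concrete, here is a rough C model of the three non-move operations, assuming a 32-bit word. It is not code from lower-subreg.c; the type and function names are invented for illustration, and only constant shift amounts of at least one word are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* A double-word value held as two 32-bit words (hypothetical model). */
typedef struct { uint32_t lo, hi; } word_pair;

/* Logical right shift of a double word by n, 32 <= n < 64:
   the low result word comes from the high input word, and the
   high result word is simply zero.  */
static word_pair lshiftrt_split(word_pair v, unsigned n)
{
    word_pair r;
    assert(n >= 32 && n < 64);
    r.lo = v.hi >> (n - 32);
    r.hi = 0;
    return r;
}

/* Left shift of a double word by n, 32 <= n < 64: the mirror image. */
static word_pair ashift_split(word_pair v, unsigned n)
{
    word_pair r;
    assert(n >= 32 && n < 64);
    r.lo = 0;
    r.hi = v.lo << (n - 32);
    return r;
}

/* Zero extension from one word to a double word. */
static word_pair zero_extend_split(uint32_t w)
{
    word_pair r = { w, 0 };
    return r;
}
```

Each result word is produced by at most one single-word operation, which is why the split can win on targets without a fast double-word shift, and why it can lose on targets that have one.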
Re: [C Patch]: pr52543
Kenneth Zadeck zad...@naturalbridge.com writes: + There are two useful preprocessor defines for use by maintainers: + + #define LOG_COSTS + + if you wish to see the actual cost estimates that are being used + for each mode wider than word mode and the cost estimates for zero + extension and the shifts. This can be useful when port maintainers + are tuning insn rtx costs. + + #define FORCE_LOWERING + + if you wish to test the pass with all the transformation forced on. + This can be useful for finding bugs in the transformations. Must admit I'm not keen on these kinds of macro, but it's Ian's call. Idea for the future (i.e. not this patch) is to have a dump file for target initialisation. Imagine my horror when i did all of this as you had privately suggested and discovered that there was no way to log what i was doing. This is good enough until someone wants to fix the general problem. +/* This pass can transform 4 different operations: move, ashift, + lshiftrt, and zero_extend. There is a boolean vector for move + splitting that is indexed by mode and is true for each mode that is + to have its copies split. The other three operations are only done + for one mode so they are only controlled by a single boolean .*/ As mentioned privately, whether this is profitable for shifts depends to some extent on the shift amount. GCC already supports targets where this transformation would be OK for some shift amounts but not others. So for shifts, I think this should be an array of HOST_BITS_PER_WIDE_INT booleans rather than just one. More comments below about how this filters through your other changes. I think that you actually are missing what i am doing with this. I look at 3 representative values that should discover any non uniformities. If any of them are profitable, i set this bit. Then at the point where i really have to pull the trigger on a real instance, i check the shift amount used at that spot to see if the individual shift is profitable. No, I got that. 
I just think it's an unnecessary complication.

I did this for two reasons. One of them was that i was a little concerned that HOST_BITS_PER_WIDE_INT on the smallest host was not as big as the bitsize of word_mode on the largest target (it could be but this knowledge is above my pay grade).

Ah, yes, sorry, I meant an array of BITS_PER_WORD booleans. I had HOST_WIDE_INT on the brain after Mike's patch.

The other reason was that i did not see this as a common operation and checking it on demand seemed like the winner.

But (at least after the other changes I mentioned) these overall booleans cut out only a very small portion of find_decomposable_shift_zext. I.e.:

  op = SET_SRC (set);
  if (GET_CODE (op) != ASHIFT
      && GET_CODE (op) != LSHIFTRT
      && GET_CODE (op) != ZERO_EXTEND)     <-- unified booleans checked here
    return 0;

  op_operand = XEXP (op, 0);
  if (!REG_P (SET_DEST (set)) || !REG_P (op_operand)
      || HARD_REGISTER_NUM_P (REGNO (SET_DEST (set)))
      || HARD_REGISTER_NUM_P (REGNO (op_operand))
      || !SCALAR_INT_MODE_P (GET_MODE (op)))
    return 0;

  if (GET_CODE (op) == ZERO_EXTEND)
    {
      if (GET_MODE (op_operand) != word_mode
          || GET_MODE_BITSIZE (GET_MODE (op)) != 2 * BITS_PER_WORD)
        return 0;
    }
  else /* left or right shift */
    {                                      <--- specific booleans checked here
      if (!CONST_INT_P (XEXP (op, 1))
          || INTVAL (XEXP (op, 1)) < BITS_PER_WORD
          || GET_MODE_BITSIZE (GET_MODE (op_operand)) != 2 * BITS_PER_WORD)
        return 0;
    }

It seems better (and simpler) not to prejudge which shift amounts are interesting and instead cache the "win or no win" flag for each value. As I say, this is all in the context of this pass not being interesting for modes where the split move is strictly more expensive than the unified move, regardless of shift and zext costs.

Richard
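The "one flag per shift amount" idea can be sketched as below. This is not the pass's code: the two cost functions are placeholders standing in for whatever the target's rtx costs would report, BITS_PER_WORD is fixed at 32 here purely for the example, and all other names are invented.

```c
#include <assert.h>
#include <stdbool.h>

#define BITS_PER_WORD 32   /* assumption for this sketch only */

/* Placeholder costs; a real port would derive these from rtx costs. */
static int unified_shift_cost(int amount) { (void) amount; return 4; }
static int split_shift_cost(int amount)
{
    /* Pretend that a shift by exactly one word is a cheap word move. */
    return amount == BITS_PER_WORD ? 2 : 5;
}

/* One cached "win or no win" flag per double-word shift amount in
   [BITS_PER_WORD, 2 * BITS_PER_WORD), indexed by amount - BITS_PER_WORD. */
static bool splitting_shift[BITS_PER_WORD];

/* Fill the table once, at initialization time. */
static void init_shift_table(void)
{
    for (int i = 0; i < BITS_PER_WORD; i++) {
        int amount = BITS_PER_WORD + i;
        splitting_shift[i] =
            split_shift_cost(amount) < unified_shift_cost(amount);
    }
}

/* Query used at each candidate insn: amounts outside the range the
   pass handles are never split.  */
static bool should_split_shift(int amount)
{
    if (amount < BITS_PER_WORD || amount >= 2 * BITS_PER_WORD)
        return false;
    return splitting_shift[amount - BITS_PER_WORD];
}
```

With a table like this there is no separate "any amount profitable" boolean to keep in sync; the lookup at the use site is the whole decision.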
Re: [IA-64] Work around bug in unwinder
On 03/21/2012 01:03 PM, Eric Botcazou wrote: 2012-03-21 Eric Botcazou ebotca...@adacore.com * config/ia64/unwind-ia64.c (uw_install_context): Manually save LC if it hasn't been previously saved. Looks ok, given the other ugliness in this macro. r~
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
On Fri, Mar 30, 2012 at 8:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
Mike Stump mikest...@comcast.net writes:
Here is the new patch. OK for trunk if there are no regressions on Linux/ia32 and Linux/x86-64?

Too bad you didn't test 32-bit darwin, causes: http://gcc.gnu.org/PR52784 Could you please revert or fix, thanks.

Same problem on Solaris 10 and 11/x86.

Rainer

When i[34567]86-*-* targets are configured with --enable-targets=all, TARGET_BI_ARCH is defined as 1, but TARGET_64BIT_DEFAULT isn't defined. It leads to

  if (!TARGET_64BIT)
    ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);

Since TARGET_64BIT is false by default, -m64 and -mx32 don't work correctly. This patch changes TARGET_BI_ARCH to 3 for i[34567]86-*-* targets configured with --enable-targets=all. Tested on Linux/ia32 with bootstrap and Linux/ia32 with --enable-targets=all --disable-bootstrap. Please try on other OSes.

Thanks.

--
H.J.
---
2012-03-30  H.J. Lu  hongjiu...@intel.com

	PR bootstrap/52784
	* config.gcc (tm_defines): Replace TARGET_BI_ARCH=1 with TARGET_BI_ARCH=3 for i[34567]86-*-* targets.
	* config/i386/i386.c (ix86_option_override_internal): Don't check OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32 if TARGET_BI_ARCH isn't defined or TARGET_BI_ARCH == 3.

diff --git a/gcc/config.gcc b/gcc/config.gcc index c30bb24..d1e0480 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1208,7 +1208,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i default_gnu_indirect_function=yes if test x$enable_targets = xall; then tm_file=${tm_file} i386/x86-64.h i386/gnu-user64.h i386/linux64.h - tm_defines=${tm_defines} TARGET_BI_ARCH=1 + tm_defines=${tm_defines} TARGET_BI_ARCH=3 tmake_file=${tmake_file} i386/t-linux64 x86_multilibs=${with_multilib_list} if test $x86_multilibs = default; then @@ -1338,7 +1338,7 @@ i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*) case ${target} in *-*-solaris2.1[0-9]*) tm_file=${tm_file} i386/x86-64.h i386/sol2-bi.h sol2-bi.h - tm_defines=${tm_defines} TARGET_BI_ARCH=1 + tm_defines=${tm_defines} TARGET_BI_ARCH=3 tmake_file=$tmake_file i386/t-sol2-64 need_64bit_isa=yes case X${with_cpu} in @@ -1406,7 +1406,7 @@ i[34567]86-*-mingw* | x86_64-*-mingw*) user_headers_inc_next_pre=${user_headers_inc_next_pre} stddef.h stdarg.h tm_file=${tm_file} i386/mingw-w64.h if test x$enable_targets = xall; then -tm_defines=${tm_defines} TARGET_BI_ARCH=1 +tm_defines=${tm_defines} TARGET_BI_ARCH=3 case X${with_cpu} in Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3) ;; diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 42746e4..62ed2c0 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -3117,11 +3117,14 @@ ix86_option_override_internal (bool main_args_p) SUBSUBTARGET_OVERRIDE_OPTIONS; #endif - /* Turn off both OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if - TARGET_64BIT is false. */ +#if defined TARGET_BI_ARCH TARGET_BI_ARCH != 3 + /* When TARGET_BI_ARCH isn't defined or TARGET_BI_ARCH == 3, + TARGET_64BIT is false by default and there is no need to check + OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32. 
Turn off both + OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if TARGET_64BIT is + false. */ if (!TARGET_64BIT) ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32); -#ifdef TARGET_BI_ARCH else { #if TARGET_BI_ARCH == 1
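As a side note on why the unconditional clearing breaks -m64 and -mx32: the statement in question is a plain mask-clear. A stripped-down model follows, with invented mask names and an invented helper standing in for the real option machinery.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real option masks. */
#define MASK_ABI_64   (1u << 0)
#define MASK_ABI_X32  (1u << 1)
#define MASK_OTHER    (1u << 2)

/* Model of the check: when the target is not considered 64-bit,
   both 64-bit ABI bits are dropped from the flag word.  */
static unsigned clear_64bit_abis(unsigned isa_flags, int target_64bit)
{
    assert(target_64bit == 0 || target_64bit == 1);
    if (!target_64bit)
        isa_flags &= ~(MASK_ABI_64 | MASK_ABI_X32);
    return isa_flags;
}
```

If the "is 64-bit" predicate is false by default at the point where this runs, an ABI bit that the user requested on the command line is discarded along with everything else in the mask, which matches the symptom described above.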
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
On Fri, Mar 30, 2012 at 8:19 AM, Richard Henderson r...@redhat.com wrote: On 03/30/2012 11:11 AM, Richard Henderson wrote: On 03/30/2012 11:03 AM, Teresa Johnson wrote: +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) + (match_operand:HI 1 immediate_operand n))] + !TARGET_LCP_STALL +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) For reload to work correctly, all alternatives must remain part of the same pattern. This issue should be handled with the ISA and ENABLED attributes. I'll also ask if this should better be handled with a peephole2. While movw $1234,(%eax) might be expensive, is it so expensive that we *must* force the use of a free register? Might it be better only to split the insn in two if and only if a free register exists? That can easily be done with a peephole2 pattern... Here is a very old LCP patch with peephole2. It may need some updates. -- H.J. 
--- gcc/config/i386/i386-tune.c.movw 2007-08-06 07:58:38.0 -0700 +++ gcc/config/i386/i386-tune.c 2007-08-06 07:58:38.0 -0700 @@ -117,6 +117,9 @@ x86_tune_options (void) abort (); } } + + if (x86_split_movw_length_string) +x86_split_movw_length = atoi (x86_split_movw_length_string); } #undef TARGET_SCHED_ISSUE_RATE @@ -137,3 +140,4 @@ const char *ix86_adjust_cost_string; int ia32_multipass_dfa_lookahead_value; const char *ia32_multipass_dfa_lookahead_string; +int x86_split_movw_length; --- gcc/config/i386/i386-tune.h.movw 2007-08-06 07:58:38.0 -0700 +++ gcc/config/i386/i386-tune.h 2007-08-06 07:58:38.0 -0700 @@ -4,6 +4,9 @@ -mno-default + -msplit-movw-length=NUMBER + NUMBER is the maximum 16bit immediate move instruction length + -missue-rate=NUMBER -madjust-cost=NUMBER @@ -72,6 +75,7 @@ extern void x86_tune_options (void); +extern int x86_split_movw_length; extern int ix86_issue_rate_value; extern const char *ix86_issue_rate_string; --- gcc/config/i386/i386-tune.opt.movw 2007-08-06 07:58:38.0 -0700 +++ gcc/config/i386/i386-tune.opt 2007-08-06 07:58:38.0 -0700 @@ -363,3 +363,6 @@ Target RejectNegative Joined Report Var( mno-default Target RejectNegative Report Var(x86_no_default_string) Undocumented +msplit-movw-length= +Target RejectNegative Joined Report Var(x86_split_movw_length_string) Undocumented + --- gcc/config/i386/i386.md.movw 2007-08-06 07:55:01.0 -0700 +++ gcc/config/i386/i386.md 2007-08-06 08:50:48.0 -0700 @@ -19655,14 +19655,18 @@ (set (match_dup 0) (match_dup 1))] ) +;; Also don't move a 16bit immediate directly to memory when target +;; has slow LCP instructions. (define_peephole2 [(match_scratch:HI 1 r) (set (match_operand:HI 0 memory_operand ) (const_int 0))] optimize_insn_for_speed_p () -! TARGET_USE_MOV0 -TARGET_SPLIT_LONG_MOVES -get_attr_length (insn) = ix86_cur_cost ()-large_insn +((x86_split_movw_length_string != NULL + get_attr_length (insn) = x86_split_movw_length) + || (! 
TARGET_USE_MOV0 + TARGET_SPLIT_LONG_MOVES + get_attr_length (insn) = ix86_cur_cost ()-large_insn)) peep2_regno_dead_p (0, FLAGS_REG) [(parallel [(set (match_dup 2) (const_int 0)) (clobber (reg:CC FLAGS_REG))]) @@ -19694,13 +19698,17 @@ (set (match_dup 0) (match_dup 2))] ) +;; Also don't move a 16bit immediate directly to memory when target +;; has slow LCP instructions. (define_peephole2 [(match_scratch:HI 2 r) (set (match_operand:HI 0 memory_operand ) (match_operand:HI 1 immediate_operand ))] optimize_insn_for_speed_p () -TARGET_SPLIT_LONG_MOVES -get_attr_length (insn) = ix86_cur_cost ()-large_insn +((x86_split_movw_length_string != NULL + get_attr_length (insn) = x86_split_movw_length) + || (TARGET_SPLIT_LONG_MOVES + get_attr_length (insn) = ix86_cur_cost ()-large_insn)) [(set (match_dup 2) (match_dup 1)) (set (match_dup 0) (match_dup 2))] )
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
Hi Richard, Jan and H.J., Thanks for all the quick responses and suggestions. I had tested my patch when tuning for an arch without the LCP stalls, but it didn't hit an issue in reload because it didn't require rematerialization. Thanks for pointing out this issue. Regarding the penalty, it can be =6 cycles for core2/corei7 so I thought it would be best to force the splitting even when that would force the use of a new register, but it is possible that the peephole2 approach will work just fine in the majority of the cases. Thanks for the peephole2 patch, H.J., I will test that solution out for the case I was trying to solve. Regarding the penalty on AMD, reading Agner's guide suggested that this could be a problem on Bulldozer, but only if there are 3 prefixes, and I'm not sure how often that will occur for this type of instruction in practice. I will look into removing AMD from the handled cases. Will respond later after trying the peephole2 approach. Teresa On Fri, Mar 30, 2012 at 9:23 AM, H.J. Lu hjl.to...@gmail.com wrote: On Fri, Mar 30, 2012 at 8:19 AM, Richard Henderson r...@redhat.com wrote: On 03/30/2012 11:11 AM, Richard Henderson wrote: On 03/30/2012 11:03 AM, Teresa Johnson wrote: +(define_insn *movhi_imm_internal + [(set (match_operand:HI 0 memory_operand =m) + (match_operand:HI 1 immediate_operand n))] + !TARGET_LCP_STALL +{ + return mov{w}\t{%1, %0|%0, %1}; +} + [(set (attr type) (const_string imov)) + (set (attr mode) (const_string HI))]) + (define_insn *movhi_internal [(set (match_operand:HI 0 nonimmediate_operand =r,r,r,m) - (match_operand:HI 1 general_operand r,rn,rm,rn))] + (match_operand:HI 1 general_operand r,rn,rm,r))] !(MEM_P (operands[0]) MEM_P (operands[1])) For reload to work correctly, all alternatives must remain part of the same pattern. This issue should be handled with the ISA and ENABLED attributes. I'll also ask if this should better be handled with a peephole2. 
While movw $1234,(%eax) might be expensive, is it so expensive that we *must* force the use of a free register? Might it be better only to split the insn in two if and only if a free register exists? That can easily be done with a peephole2 pattern... Here is a very old LCP patch with peephole2. It may need some updates. -- H.J. -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[rfc] Fix PR52770 (support throwing asms)
Hi,

So here's an extended variant of my hack that implements throwing asms. Like rth proposed I've added a new pseudo clobber, "throw":

int f (void)
{
  int x, y;
  x = 1;
  y = 2;
  try {
    __asm__ ("" : "=r"(x), "=r"(y) : : "throw");
  } catch (...) {
    return 2+x+y;
  }
  return x+y;
}

The patch handles multiple output arguments by doing the same we do for calls, i.e. introducing new temporaries. For gimple this clobber is retained and hence a throwing asm can be simply recognized by searching for it. For RTL I've added a new flag to mark the ASM_OPERANDS rtx. Without such marking we would have to treat every asm as potentially throwing in insn_could_throw_p, and rely on the REG_EH_REGION notes to mark them as non-throwing.

Instead of auditing all MEM_VOLATILE_P uses on asms to see if they require handling throwing I've settled on simply implying the volatile bit for throwing asms.

Not yet regstrapped, so no rfa, but does this seem sane?

Ciao,
Michael.

Index: tree-eh.c === --- tree-eh.c (revision 183716) +++ tree-eh.c (working copy) @@ -1990,6 +1990,44 @@ lower_eh_constructs_2 (struct leh_state } break; +case GIMPLE_ASM: + /* Similar to normal LHS handling above, replace outputs + with new temporaries. 
*/ + if (stmt_could_throw_p (stmt) + gimple_code (stmt) == GIMPLE_ASM) + { + unsigned noutputs; + unsigned i; + + noutputs = gimple_asm_noutputs (stmt); + for (i = 0; i noutputs; i++) + { + tree link, op; + link = gimple_asm_output_op (stmt, i); + op = TREE_VALUE (link); + if (!tree_could_throw_p (op) + is_gimple_reg_type (TREE_TYPE (op))) + { + tree tmp = create_tmp_var (TREE_TYPE (op), NULL); + gimple s = gimple_build_assign (op, tmp); + gimple_set_location (s, gimple_location (stmt)); + gimple_set_block (s, gimple_block (stmt)); + TREE_VALUE (link) = tmp; + if (TREE_CODE (TREE_TYPE (tmp)) == COMPLEX_TYPE + || TREE_CODE (TREE_TYPE (tmp)) == VECTOR_TYPE) + DECL_GIMPLE_REG_P (tmp) = 1; + gsi_insert_after (gsi, s, GSI_SAME_STMT); + } + } + } + /* Look for things that can throw exceptions, and record them. */ + if (state-cur_region stmt_could_throw_p (stmt)) + { + record_stmt_eh_region (state-cur_region, stmt); + note_eh_region_may_contain_throw (state-cur_region); + } + break; + case GIMPLE_COND: case GIMPLE_GOTO: case GIMPLE_RETURN: @@ -2639,6 +2677,8 @@ stmt_could_throw_p (gimple stmt) return stmt_could_throw_1_p (stmt); case GIMPLE_ASM: + if (gimple_asm_can_throw_p (stmt)) + return true; if (!cfun-can_throw_non_call_exceptions) return false; return gimple_asm_volatile_p (stmt); Index: varasm.c === --- varasm.c(revision 183716) +++ varasm.c(working copy) @@ -834,9 +834,10 @@ set_user_assembler_name (tree decl, cons /* Decode an `asm' spec for a declaration as a register name. Return the register number, or -1 if nothing specified, - or -2 if the ASMSPEC is not `cc' or `memory' and is not recognized, + or -2 if the ASMSPEC is not `cc', `memory' or `throw' and is not recognized, or -3 if ASMSPEC is `cc' and is not recognized, - or -4 if ASMSPEC is `memory' and is not recognized. + or -4 if ASMSPEC is `memory' and is not recognized, + or -5 if ASMSPEC is `throw' and is not recognized. Accept an exact spelling or a decimal number. Prefixes such as % are optional. 
*/ @@ -902,6 +903,9 @@ decode_reg_name_and_count (const char *a } #endif /* ADDITIONAL_REGISTER_NAMES */ + if (!strcmp (asmspec, throw)) + return -5; + if (!strcmp (asmspec, memory)) return -4; Index: rtl.h === --- rtl.h (revision 183716) +++ rtl.h (working copy) @@ -266,7 +266,8 @@ struct GTY((chain_next (RTX_NEXT (%h) 1 in a CALL_INSN if it is a sibling call. 1 in a SET that is for a return. In a CODE_LABEL, part of the two-bit alternate entry field. - 1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c. */ + 1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c. + 1 in an ASM_OPERANDS is ASM_OPERANDS_THROW_P. */ unsigned int jump : 1; /* In a CODE_LABEL, part of the two-bit alternate entry field. 1 in a MEM if it cannot trap. @@ -1317,6 +1318,8 @@ do { \ #define ASM_OPERANDS_LABEL_LENGTH(RTX) XCVECLEN (RTX, 5, ASM_OPERANDS) #define ASM_OPERANDS_LABEL(RTX, N) XCVECEXP (RTX, 5,
Re: [rfc] Fix PR52770 (support throwing asms)
On 03/30/2012 12:37 PM, Michael Matz wrote: Not yet regstrapped, so no rfa, but does this seem sane? It definitely seems plausible. I can't immediately think of anything you might have forgotten. r~
Re: [rfc] Fix PR52770 (support throwing asms)
On 03/30/2012 12:43 PM, Richard Henderson wrote: On 03/30/2012 12:37 PM, Michael Matz wrote: Not yet regstrapped, so no rfa, but does this seem sane? It definitely seems plausible. I can't immediately think of anything you might have forgotten. Stating the obvious, but you'd do well to add at least an x86 test case that does a call to a local function which does throw. r~
Re: [Patch]: ggc-page.c: use uintptr_t instead of size_t
On 03/20/2012 05:41 AM, Tristan Gingold wrote: 2012-03-20 Tristan Gingold ging...@adacore.com * ggc-page.c (PAGE_L1_SIZE, PAGE_L2_SIZE, LOOKUP_L1, LOOKUP_L2) (ggc_allocated_p, lookup_page_table_entry, set_page_table_entry) (alloc_page, init_ggc, clear_marks, struct ggc_pch_data) (ggc_pch_this_base): Use uintptr_t instead of size_t. Ok. r~
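For background on the type choice: in ISO C, size_t is only required to hold object sizes, while uintptr_t (where provided) is the integer type that can round-trip a pointer. A page-table style lookup that slices bits out of an address therefore wants the latter. The sketch below is illustrative only; the shift value and function names are made up and are not the ones used in ggc-page.c.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12   /* hypothetical 4 KiB pages */

/* Index of the page containing P, computed on the full pointer value. */
static uintptr_t page_index(const void *p)
{
    return (uintptr_t) p >> EXAMPLE_PAGE_SHIFT;
}

/* Byte offset of P within its page. */
static uintptr_t page_offset(const void *p)
{
    return (uintptr_t) p & (((uintptr_t) 1 << EXAMPLE_PAGE_SHIFT) - 1);
}
```

On hosts where size_t and pointers have the same width the two types behave identically here; the difference only shows up where size_t is narrower than a pointer, in which case a size_t cast would drop high address bits.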
Re: [PATCH] ARM: Use different linker path for hardfloat ABI
On 29/03/12 20:34, dann frazier wrote: This is an updated version of a patch Debian and Ubuntu are using to use an alternate linker path for hardfloat binaries. The difference with this one is that it covers the case where no float flag was passed in, defaulting to the softfloat path. 2012-03-29 dann frazier dann.fraz...@canonical.com * config/arm/linux-elf.h: Use alternate linker path for hardfloat ABI Index: gcc/config/arm/linux-elf.h === --- gcc/config/arm/linux-elf.h(revision 185708) +++ gcc/config/arm/linux-elf.h(working copy) @@ -59,14 +59,21 @@ #define LIBGCC_SPEC %{mfloat-abi=soft*:-lfloat} -lgcc -#define GLIBC_DYNAMIC_LINKER /lib/ld-linux.so.2 +#define LINUX_DYNAMIC_LINKER_SF /lib/ld-linux.so.3 +#define LINUX_DYNAMIC_LINKER_HF /lib/arm-linux-gnueabihf/ld-linux.so.3 #define LINUX_TARGET_LINK_SPEC %{h*} \ %{static:-Bstatic} \ %{shared:-shared} \ %{symbolic:-Bsymbolic} \ %{rdynamic:-export-dynamic} \ - -dynamic-linker GNU_USER_DYNAMIC_LINKER \ + %{msoft-float:-dynamic-linker LINUX_DYNAMIC_LINKER_SF } \ + %{mfloat-abi=soft*:-dynamic-linker LINUX_DYNAMIC_LINKER_SF } \ + %{mhard-float:-dynamic-linker LINUX_DYNAMIC_LINKER_HF } \ + %{mfloat-abi=hard:-dynamic-linker LINUX_DYNAMIC_LINKER_HF } \ + %{!mfloat-abi: \ + %{!msoft-float: \ + %{!mhard-float:-dynamic-linker LINUX_DYNAMIC_LINKER_SF }}} \ -X \ %{mbig-endian:-EB} %{mlittle-endian:-EL} \ SUBTARGET_EXTRA_LINK_SPEC Looks to me as though this will break the old Linux ABI. While we've marked that as deprecated, it hasn't been removed as yet. So I think this patch either needs to wait until that removal has taken place, or provide the relevant updates to maintain the old ABI support. R.
[committed] Fix used but not set warning in dwarf2out.c.
This showed up on i686-linux with BOOT_CFLAGS='-Os -g'. I didn't investigate the full call chain for why the variable wasn't set in the by-reference call to fortran_common. r~ * dwarf2out.c (gen_variable_die): Initialize off. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 828e996..ca88fc5 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -17765,7 +17765,7 @@ common_block_die_table_eq (const void *x, const void *y) static void gen_variable_die (tree decl, tree origin, dw_die_ref context_die) { - HOST_WIDE_INT off; + HOST_WIDE_INT off = 0; tree com_decl; tree decl_or_origin = decl ? decl : origin; tree ultimate_origin;
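The shape of the problem is the usual one with out-parameters: a callee that writes through the pointer on only some paths, and a caller that reads the variable afterwards. A reduced, purely illustrative version (none of these names come from dwarf2out.c) looks like this:

```c
#include <assert.h>

/* Hypothetical callee: sets *off only when it finds something,
   and reports success through the return value.  */
static int lookup_offset(int key, long *off)
{
    assert(off != 0);
    if (key > 0) {
        *off = key * 8L;
        return 1;
    }
    return 0;   /* *off left untouched on this path */
}

/* Caller in the style of the fix: give the variable a defined value
   up front so every path through the callee leaves it initialized.  */
static long offset_or_zero(int key)
{
    long off = 0;
    lookup_offset(key, &off);
    return off;
}
```

Whether the compiler can prove the variable is always set depends on how much of the call chain it sees after inlining, which is presumably why the warning appeared only with a particular set of flags.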
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
On Fri, Mar 30, 2012 at 09:18:13AM -0700, H.J. Lu wrote:
On Fri, Mar 30, 2012 at 8:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
Mike Stump mikest...@comcast.net writes:
Here is the new patch. OK for trunk if there are no regressions on Linux/ia32 and Linux/x86-64?

Too bad you didn't test 32-bit darwin, causes: http://gcc.gnu.org/PR52784 Could you please revert or fix, thanks.

Same problem on Solaris 10 and 11/x86.

Rainer

When i[34567]86-*-* targets are configured with --enable-targets=all, TARGET_BI_ARCH is defined as 1, but TARGET_64BIT_DEFAULT isn't defined. It leads to

  if (!TARGET_64BIT)
    ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);

Since TARGET_64BIT is false by default, -m64 and -mx32 don't work correctly. This patch changes TARGET_BI_ARCH to 3 for i[34567]86-*-* targets configured with --enable-targets=all. Tested on Linux/ia32 with bootstrap and Linux/ia32 with --enable-targets=all --disable-bootstrap. Please try on other OSes.

H.J.,

This patch solves the bootstrap of current gcc trunk on i386-apple-darwin10. Thanks.

Jack

Thanks.

--
H.J.
---
2012-03-30  H.J. Lu  hongjiu...@intel.com

	PR bootstrap/52784
	* config.gcc (tm_defines): Replace TARGET_BI_ARCH=1 with TARGET_BI_ARCH=3 for i[34567]86-*-* targets.
	* config/i386/i386.c (ix86_option_override_internal): Don't check OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32 if TARGET_BI_ARCH isn't defined or TARGET_BI_ARCH == 3.

diff --git a/gcc/config.gcc b/gcc/config.gcc index c30bb24..d1e0480 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -1208,7 +1208,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i default_gnu_indirect_function=yes if test x$enable_targets = xall; then tm_file=${tm_file} i386/x86-64.h i386/gnu-user64.h i386/linux64.h - tm_defines=${tm_defines} TARGET_BI_ARCH=1 + tm_defines=${tm_defines} TARGET_BI_ARCH=3 tmake_file=${tmake_file} i386/t-linux64 x86_multilibs=${with_multilib_list} if test $x86_multilibs = default; then @@ -1338,7 +1338,7 @@ i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*) case ${target} in *-*-solaris2.1[0-9]*) tm_file=${tm_file} i386/x86-64.h i386/sol2-bi.h sol2-bi.h - tm_defines=${tm_defines} TARGET_BI_ARCH=1 + tm_defines=${tm_defines} TARGET_BI_ARCH=3 tmake_file=$tmake_file i386/t-sol2-64 need_64bit_isa=yes case X${with_cpu} in @@ -1406,7 +1406,7 @@ i[34567]86-*-mingw* | x86_64-*-mingw*) user_headers_inc_next_pre=${user_headers_inc_next_pre} stddef.h stdarg.h tm_file=${tm_file} i386/mingw-w64.h if test x$enable_targets = xall; then - tm_defines=${tm_defines} TARGET_BI_ARCH=1 + tm_defines=${tm_defines} TARGET_BI_ARCH=3 case X${with_cpu} in Xgeneric|Xatom|Xcore2|Xcorei7|Xcorei7-avx|Xnocona|Xx86-64|Xbdver2|Xbdver1|Xbtver1|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx|Xathlon64-sse3|Xk8-sse3|Xopteron-sse3) ;; diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 42746e4..62ed2c0 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -3117,11 +3117,14 @@ ix86_option_override_internal (bool main_args_p) SUBSUBTARGET_OVERRIDE_OPTIONS; #endif - /* Turn off both OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if - TARGET_64BIT is false. */ +#if defined TARGET_BI_ARCH TARGET_BI_ARCH != 3 + /* When TARGET_BI_ARCH isn't defined or TARGET_BI_ARCH == 3, + TARGET_64BIT is false by default and there is no need to check + OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32. 
Turn off both + OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if TARGET_64BIT is + false. */ if (!TARGET_64BIT) ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32); -#ifdef TARGET_BI_ARCH else { #if TARGET_BI_ARCH == 1
Fix debug/52727 -- lost REG_ARGS_SIZE note
Losing REG_ARGS_SIZE notes leads to assertion failures in dwarf2cfi.c when the values don't match across CFG edges. I'd like to leave this patch on mainline for a week or so to see what falls out before copying it to the 4.7 branch. There is a new abort here, though I suppose that would probably only trigger when we would abort later in dwarf2cfi.c anyway... Tested on x86_64 (no-op, sanity), and i686 (-Os), and committed. r~ diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c index 3cffd66..6b6f74b 100644 --- a/gcc/combine-stack-adj.c +++ b/gcc/combine-stack-adj.c @@ -320,6 +320,107 @@ maybe_move_args_size_note (rtx last, rtx insn, bool after) add_reg_note (last, REG_ARGS_SIZE, XEXP (note, 0)); } +/* Return the next (or previous) active insn within BB. */ + +static rtx +prev_active_insn_bb (basic_block bb, rtx insn) +{ + for (insn = PREV_INSN (insn); + insn != PREV_INSN (BB_HEAD (bb)); + insn = PREV_INSN (insn)) +if (active_insn_p (insn)) + return insn; + return NULL_RTX; +} + +static rtx +next_active_insn_bb (basic_block bb, rtx insn) +{ + for (insn = NEXT_INSN (insn); + insn != NEXT_INSN (BB_END (bb)); + insn = NEXT_INSN (insn)) +if (active_insn_p (insn)) + return insn; + return NULL_RTX; +} + +/* If INSN has a REG_ARGS_SIZE note, if possible move it to PREV. Otherwise + search for a nearby candidate within BB where we can stick the note. */ + +static void +force_move_args_size_note (basic_block bb, rtx prev, rtx insn) +{ + rtx note, test, next_candidate, prev_candidate; + + /* If PREV exists, tail-call to the logic in the other function. */ + if (prev) +{ + maybe_move_args_size_note (prev, insn, false); + return; +} + + /* First, make sure there's anything that needs doing. */ + note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX); + if (note == NULL) +return; + + /* We need to find a spot between the previous and next exception points + where we can place the note and properly deallocate the arguments. 
*/ + next_candidate = prev_candidate = NULL; + + /* It is often the case that we have insns in the order: + call + add sp (previous deallocation) + sub sp (align for next arglist) + push arg + and the add/sub cancel. Therefore we begin by searching forward. */ + + test = insn; + while ((test = next_active_insn_bb (bb, test)) != NULL) +{ + /* Found an existing note: nothing to do. */ + if (find_reg_note (test, REG_ARGS_SIZE, NULL_RTX)) +return; + /* Found something that affects unwinding. Stop searching. */ + if (CALL_P (test) || !insn_nothrow_p (test)) + break; + if (next_candidate == NULL) + next_candidate = test; +} + + test = insn; + while ((test = prev_active_insn_bb (bb, test)) != NULL) +{ + rtx tnote; + /* Found a place that seems logical to adjust the stack. */ + tnote = find_reg_note (test, REG_ARGS_SIZE, NULL_RTX); + if (tnote) + { + XEXP (tnote, 0) = XEXP (note, 0); + return; + } + if (prev_candidate == NULL) + prev_candidate = test; + /* Found something that affects unwinding. Stop searching. */ + if (CALL_P (test) || !insn_nothrow_p (test)) + break; +} + + if (prev_candidate) +test = prev_candidate; + else if (next_candidate) +test = next_candidate; + else +{ + /* ??? We *must* have a place, lest we ICE on the lost adjustment. +Options are: dummy clobber insn, nop, or prevent the removal of +the sp += 0 insn. Defer that decision until we can prove this +can actually happen. */ + gcc_unreachable (); +} + add_reg_note (test, REG_ARGS_SIZE, XEXP (note, 0)); +} + /* Subroutine of combine_stack_adjustments, called for each basic block. 
*/ static void @@ -327,6 +428,7 @@ combine_stack_adjustments_for_block (basic_block bb) { HOST_WIDE_INT last_sp_adjust = 0; rtx last_sp_set = NULL_RTX; + rtx last2_sp_set = NULL_RTX; struct csa_reflist *reflist = NULL; rtx insn, next, set; struct record_stack_refs_data data; @@ -391,9 +493,8 @@ combine_stack_adjustments_for_block (basic_block bb) last_sp_adjust + this_adjust, this_adjust)) { - maybe_move_args_size_note (last_sp_set, insn, false); - /* It worked! */ + maybe_move_args_size_note (last_sp_set, insn, false); delete_insn (insn); last_sp_adjust += this_adjust; continue; @@ -409,9 +510,8 @@ combine_stack_adjustments_for_block (basic_block bb) last_sp_adjust + this_adjust, -last_sp_adjust)) { -
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
On Fri, Mar 30, 2012 at 11:05 AM, Jack Howarth howa...@bromo.med.uc.edu wrote:
> On Fri, Mar 30, 2012 at 09:18:13AM -0700, H.J. Lu wrote:
>> On Fri, Mar 30, 2012 at 8:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>>> Mike Stump mikest...@comcast.net writes:
>>>
>>>>> Here is the new patch. OK for trunk if there are no regressions
>>>>> on Linux/ia32 and Linux/x86-64?
>>>>
>>>> Too bad you didn't test 32-bit darwin, causes:
>>>>
>>>> http://gcc.gnu.org/PR52784
>>>>
>>>> Could you please revert or fix, thanks.
>>>
>>> Same problem on Solaris 10 and 11/x86.
>>>
>>> Rainer
>>
>> When i[34567]86-*-* targets are configured with --enable-targets=all,
>> TARGET_BI_ARCH is defined as 1, but TARGET_64BIT_DEFAULT isn't defined.
>> It leads to
>>
>>   if (!TARGET_64BIT)
>>     ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);
>>
>> Since TARGET_64BIT is false by default, -m64 and -mx32 don't work
>> correctly. This patch changes TARGET_BI_ARCH to 3 for i[34567]86-*-*
>> targets configured with --enable-targets=all. Tested on Linux/ia32
>> with bootstrap and Linux/ia32 with --enable-targets=all
>> --disable-bootstrap. Please try on other OSes.
>
> H.J.,
> This patch solves the bootstrap of current gcc trunk on
> i386-apple-darwin10. Thanks.
> Jack

Here is a smaller patch.

--
H.J.

2012-03-30  H.J. Lu  hongjiu...@intel.com

	PR bootstrap/52784
	* config/i386/i386.c (ix86_option_override_internal): Don't check
	OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32 if TARGET_BI_ARCH isn't
	defined or TARGET_64BIT_DEFAULT is 0.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 42746e4..3905287 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3117,11 +3117,13 @@ ix86_option_override_internal (bool main_args_p)
   SUBSUBTARGET_OVERRIDE_OPTIONS;
 #endif
 
-  /* Turn off both OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if
-     TARGET_64BIT is false.  */
+#if defined TARGET_BI_ARCH && TARGET_64BIT_DEFAULT
+  /* When TARGET_64BIT_DEFAULT isn't defined, TARGET_64BIT is false by
+     default and there is no need to check OPTION_MASK_ABI_64 nor
+     OPTION_MASK_ABI_X32.  Turn off both OPTION_MASK_ABI_64 and
+     OPTION_MASK_ABI_X32 if TARGET_64BIT is false.  */
   if (!TARGET_64BIT)
     ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);
-#ifdef TARGET_BI_ARCH
   else
     {
 #if TARGET_BI_ARCH == 1
Re: [C Patch]: pr52543
ramana i get the same failure on the trunk without my patch. kenny On 03/30/2012 07:36 AM, Ramana Radhakrishnan wrote: Hi I have tested this on an x86_64 with both the force lowering on and off and neither cause any regressions as well as extensive testing on my port. So, just out of curiosity, I decided to run this through a cross-build and noticed the following ICE with eglibc. I haven't had the time to debug this further but it does appear as though it could do with some more testing on some more ports and this probably needs some tuning as you say. $ /work/cross-build/fsf/arm-none-linux-gnueabi/tools-lowersubregchanges-patched/bin/arm-none-linux-gnueabi-gcc -c -O2 ./besttry.c -mfloat-abi=soft -march=armv5te ./besttry.c: In function ‘_IO_new_file_write’: ./besttry.c:36:1: internal compiler error: in get_loop_body, at cfgloop.c:831 $ cat besttry.c __extension__ typedef int __ssize_t; extern __thread int __libc_errno __attribute__ ((tls_model (initial-exec))); struct _IO_FILE { int _fileno; int _flags2; }; typedef struct _IO_FILE _IO_FILE; _IO_new_file_write (f, data, n) _IO_FILE *f; { __ssize_t to_do = n; while (to_do 0) { __ssize_t count = (__builtin_expect (f-_flags2 2, 0) ? 
({ unsigned int _sys_result = ({ register int _a1 asm (r0), _nr asm (r7); int _a3tmp = (int) ((to_do)); int _a2tmp = (int) ((data)); register int _a2 asm (a2) = _a2tmp; register int _a3 asm (a3) = _a3tmp; _nr = ((0 + 4)); asm volatile (swi0x0 @ syscall SYS_ify(write) : =r (_a1) : r (_nr) , r (_a1), r (_a2), r (_a3) : memory); _a1; }); if (__builtin_expect (((unsigned int) (_sys_result)= 0xf001u), 0)) { (__libc_errno = ((-(_sys_result; _sys_result = (unsigned int) -1; } (int) _sys_result; }) : __write (f-_fileno, data, to_do)); if (count 0) { break; } to_do -= count; } } Ramana On 29 March 2012 22:10, Kenneth Zadeckzad...@naturalbridge.com wrote: This patch takes a different approach to fixing PR52543 than does the patch in http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00641.html This patch transforms the lower-subreg pass(es) from unconditionally splitting wide moves, zero extensions, and shifts, so that it now takes into account the target specific costs and only does the transformations if it is profitable. Unconditional splitting is a problem that not only occurs on the AVR but is also a problem on the ARM NEON and my private port. Furthermore, it is a problem that is likely to occur on most modern larger machines since these machines are more likely to have fast instructions for moving things that are larger than word mode. At compiler initialization time, each mode that is larger that a word mode is examined to determine if the cost of moving a value of that mode is less expensive that inserting the proper number of word sided moves. If it is cheaper to split it up, a bit is set to allow moves of that mode to be lowered. A similar analysis is made for the zero extensions and shifts except that lower subreg had been (and is still limited to only breaking up these operations if the target size was twice the size of word mode.) Also, if the analysis determines that there are no profitable transformations, the pass exits quickly without doing any analysis. 
It is quite likely that most ports will have to be adjusted after this patch is accepted. For instance, the analysis discovers that there are no profitable transformations to be performed on the x86-64. Since this is not my platform, I have no idea if these are the correct settings. But the pass uses the standard insn_rtx_cost interface and it is the port maintainer's responsibility to not lie to the optimization passes, so this extra work in stage one should be acceptable. I do know from a private conversation with Richard Sandiford that mips patches are likely forthcoming.

There is preprocessor controlled code that prints out the cost analysis. Only a summary of this can go in the subregs dump file because the analysis is called from backend_init_target and so the dump file is not available. But it is very useful to define LOG_COSTS when adjusting your port.

There is also preprocessor code that forces all of the lowering operations to be marked as profitable. This is useful in debugging the new logic. Both of these preprocessor symbols are documented at the top of the pass.

I have tested this on an x86_64 with both the force lowering on and off and neither causes any regressions, as well as extensive testing on my port.

Ok to commit?

Kenny

2012-03-29 Kenneth Zadeck zad...@naturalbridge.com

	* toplev.c (backend_init_target): Call initializer for lower-subreg pass.
	* lower-subreg.c (move_modes_to_split, splitting_ashift, splitting_lshiftrt) splitting_zext, splitting_some_shifts, twice_word_mode, something_to_do, word_mode_move_cost,
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
On Fri, Mar 30, 2012 at 11:32:37AM -0700, H.J. Lu wrote:
> On Fri, Mar 30, 2012 at 11:05 AM, Jack Howarth howa...@bromo.med.uc.edu wrote:
>> On Fri, Mar 30, 2012 at 09:18:13AM -0700, H.J. Lu wrote:
>>> On Fri, Mar 30, 2012 at 8:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>>>> Mike Stump mikest...@comcast.net writes:
>>>>
>>>>>> Here is the new patch. OK for trunk if there are no regressions
>>>>>> on Linux/ia32 and Linux/x86-64?
>>>>>
>>>>> Too bad you didn't test 32-bit darwin, causes:
>>>>>
>>>>> http://gcc.gnu.org/PR52784
>>>>>
>>>>> Could you please revert or fix, thanks.
>>>>
>>>> Same problem on Solaris 10 and 11/x86.
>>>>
>>>> Rainer
>>>
>>> When i[34567]86-*-* targets are configured with --enable-targets=all,
>>> TARGET_BI_ARCH is defined as 1, but TARGET_64BIT_DEFAULT isn't defined.
>>> It leads to
>>>
>>>   if (!TARGET_64BIT)
>>>     ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);
>>>
>>> Since TARGET_64BIT is false by default, -m64 and -mx32 don't work
>>> correctly. This patch changes TARGET_BI_ARCH to 3 for i[34567]86-*-*
>>> targets configured with --enable-targets=all. Tested on Linux/ia32
>>> with bootstrap and Linux/ia32 with --enable-targets=all
>>> --disable-bootstrap. Please try on other OSes.
>>
>> H.J.,
>> This patch solves the bootstrap of current gcc trunk on
>> i386-apple-darwin10. Thanks.
>> Jack
>
> Here is a smaller patch.

H.J.,
The smaller patch also solves the bootstrap failure on i386-apple-darwin10.
Jack

> --
> H.J.
>
> 2012-03-30  H.J. Lu  hongjiu...@intel.com
>
> 	PR bootstrap/52784
> 	* config/i386/i386.c (ix86_option_override_internal): Don't check
> 	OPTION_MASK_ABI_64 nor OPTION_MASK_ABI_X32 if TARGET_BI_ARCH isn't
> 	defined or TARGET_64BIT_DEFAULT is 0.
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 42746e4..3905287 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3117,11 +3117,13 @@ ix86_option_override_internal (bool main_args_p)
>    SUBSUBTARGET_OVERRIDE_OPTIONS;
>  #endif
>  
> -  /* Turn off both OPTION_MASK_ABI_64 and OPTION_MASK_ABI_X32 if
> -     TARGET_64BIT is false.  */
> +#if defined TARGET_BI_ARCH && TARGET_64BIT_DEFAULT
> +  /* When TARGET_64BIT_DEFAULT isn't defined, TARGET_64BIT is false by
> +     default and there is no need to check OPTION_MASK_ABI_64 nor
> +     OPTION_MASK_ABI_X32.  Turn off both OPTION_MASK_ABI_64 and
> +     OPTION_MASK_ABI_X32 if TARGET_64BIT is false.  */
>    if (!TARGET_64BIT)
>      ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);
> -#ifdef TARGET_BI_ARCH
>    else
>      {
>  #if TARGET_BI_ARCH == 1
Re: PATCH: Add OPTION_MASK_ISA_X86_64 and support TARGET_BI_ARCH == 2
On Fri, Mar 30, 2012 at 1:23 PM, Jack Howarth howa...@bromo.med.uc.edu wrote:
> On Fri, Mar 30, 2012 at 11:32:37AM -0700, H.J. Lu wrote:
>> On Fri, Mar 30, 2012 at 11:05 AM, Jack Howarth howa...@bromo.med.uc.edu wrote:
>>> On Fri, Mar 30, 2012 at 09:18:13AM -0700, H.J. Lu wrote:
>>>> On Fri, Mar 30, 2012 at 8:11 AM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>>>>> Mike Stump mikest...@comcast.net writes:
>>>>>
>>>>>>> Here is the new patch. OK for trunk if there are no regressions
>>>>>>> on Linux/ia32 and Linux/x86-64?
>>>>>>
>>>>>> Too bad you didn't test 32-bit darwin, causes:
>>>>>>
>>>>>> http://gcc.gnu.org/PR52784
>>>>>>
>>>>>> Could you please revert or fix, thanks.
>>>>>
>>>>> Same problem on Solaris 10 and 11/x86.
>>>>>
>>>>> Rainer
>>>>
>>>> When i[34567]86-*-* targets are configured with --enable-targets=all,
>>>> TARGET_BI_ARCH is defined as 1, but TARGET_64BIT_DEFAULT isn't defined.
>>>> It leads to
>>>>
>>>>   if (!TARGET_64BIT)
>>>>     ix86_isa_flags &= ~(OPTION_MASK_ABI_64 | OPTION_MASK_ABI_X32);
>>>>
>>>> Since TARGET_64BIT is false by default, -m64 and -mx32 don't work
>>>> correctly. This patch changes TARGET_BI_ARCH to 3 for i[34567]86-*-*
>>>> targets configured with --enable-targets=all. Tested on Linux/ia32
>>>> with bootstrap and Linux/ia32 with --enable-targets=all
>>>> --disable-bootstrap. Please try on other OSes.
>>>
>>> H.J.,
>>> This patch solves the bootstrap of current gcc trunk on
>>> i386-apple-darwin10. Thanks.
>>> Jack
>>
>> Here is a smaller patch.
>
> H.J.,
> The smaller patch also solves the bootstrap failure on
> i386-apple-darwin10.
> Jack

Please ignore the smaller patch since preprocessor may handle TARGET_64BIT_DEFAULT properly.

--
H.J.
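
For readers following the thread, here is a minimal C sketch of the two mechanics the patches rely on: inside `#if`, an identifier that is not defined as a macro evaluates to 0, and `flags &= ~(A | B)` clears exactly the named bits. All `DEMO_*` and `demo_*` names and the mask values are hypothetical stand-ins, not GCC's definitions, and the sketch does not claim to reproduce the actual i386.c logic.

```c
#include <assert.h>

/* Hypothetical stand-ins; these are NOT the real GCC macros or values.  */
#define DEMO_BI_ARCH 1
/* DEMO_64BIT_DEFAULT is deliberately left undefined.  */

#define DEMO_MASK_ABI_64  (1u << 0)
#define DEMO_MASK_ABI_X32 (1u << 1)
#define DEMO_MASK_OTHER   (1u << 2)

/* In a #if expression, an identifier that is not a macro is replaced
   by 0, so this condition is false while DEMO_64BIT_DEFAULT stays
   undefined, even though DEMO_BI_ARCH is defined.  */
#if defined DEMO_BI_ARCH && DEMO_64BIT_DEFAULT
# define DEMO_GUARD_ENABLED 1
#else
# define DEMO_GUARD_ENABLED 0
#endif

/* Report whether the guarded block above was compiled in.  */
int
demo_guard_enabled (void)
{
  return DEMO_GUARD_ENABLED;
}

/* Clear both ABI bits and leave every other bit untouched.  */
unsigned
demo_clear_abi_bits (unsigned flags)
{
  return flags & ~(DEMO_MASK_ABI_64 | DEMO_MASK_ABI_X32);
}

/* Small self-check of the two behaviours.  */
void
demo_self_check (void)
{
  assert (demo_guard_enabled () == 0);
  assert (demo_clear_abi_bits (DEMO_MASK_ABI_64 | DEMO_MASK_OTHER)
	  == DEMO_MASK_OTHER);
}
```

Calling `demo_self_check ()` from a test driver exercises both points. Defining `DEMO_64BIT_DEFAULT` as 1 before the `#if` would flip the guard on.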
Re: [Patch, i386] Avoid LCP stalls (issue5975045)
> Hi Richard, Jan and H.J.,
>
> Thanks for all the quick responses and suggestions.
>
> I had tested my patch when tuning for an arch without the LCP stalls,
> but it didn't hit an issue in reload because it didn't require
> rematerialization. Thanks for pointing out this issue.
>
> Regarding the penalty, it can be >=6 cycles for core2/corei7 so I

6 cycles is indeed quite severe and may pay for an extra spill. I guess the easiest way is to benchmark the peephole variant and see what comes first. You may be able to see the differences better in 32-bit mode due to register pressure issues.

> thought it would be best to force the splitting even when that would
> force the use of a new register, but it is possible that the peephole2
> approach will work just fine in the majority of the cases. Thanks for
> the peephole2 patch, H.J., I will test that solution out for the case
> I was trying to solve.
>
> Regarding the penalty on AMD, reading Agner's guide suggested that
> this could be a problem on Bulldozer, but only if there are 3
> prefixes, and I'm not sure how often that will occur for this type of

I cannot think of a case where the MOV instruction in question would have 3 prefixes. It can have an operand-size override and a REX prefix, but REX usually does not count. You may try to benchmark Bulldozer, but I would be surprised if there were any benefits.

We need to run some benchmarks for the generic/generic32 models on an AMD machine anyway. I would guess that this transformation should be safe. The cost of an extra register move is not high compared to the 16-bit store overhead.

Harsha?

Honza
Re: [PATCH] Fix PRs 52080, 52097 and 48124, rewrite bitfield expansion, enable the C++ memory model wrt bitfields everywhere
>> May I apply the patch I posted? It bootstrapped/regtested fine on
>> x86-64/Linux.
>
> Yes.

Thanks. Unfortunately, while this was the last identified problem on x86, another issue is visible on x86-64 as a miscompilation of XML/Ada at -O0. Reduced testcase attached:

  gnat.dg/pack18.adb
  gnat.dg/pack18_pkg.ads

The executable segfaults because it attempts a read at 0x2000.

The scenario is as follows: Rec is a packed record so its fields are bit fields, N being at bit offset 129. The representative is at offset 0. get_bit_range is invoked on N with a bitpos of 1, because there is a variable offset and its DECL_FIELD_OFFSET is added to it instead of bitpos. Hence bitpos - bitoffset is (unsigned HOST_WIDE_INT) -128. This value enters the new code in store_bit_field unchanged and the division:

  offset = bitregion_start / BITS_PER_UNIT;

yields the problematic big number.

It would therefore appear that bitstart and bitend need to be signed offsets, at least until they are adjusted by store_bit_field.

--
Eric Botcazou

-- { dg-do run }

with Pack18_Pkg; use Pack18_Pkg;

procedure Pack18 is
   use Pack18_Pkg.Attributes_Tables;
   Table : Instance;
begin
   Init (Table);
   Set_Last (Table, 1);
   Table.Table (Last (Table)).N := 0;
end;

with GNAT.Dynamic_Tables;

package Pack18_Pkg is

   type String_Access is access String;

   type Rec is record
      S : String_Access;
      B : Boolean;
      N : Natural;
   end record;
   pragma Pack (Rec);

   package Attributes_Tables is new GNAT.Dynamic_Tables
     (Table_Component_Type => Rec,
      Table_Index_Type     => Natural,
      Table_Low_Bound      => 1,
      Table_Initial        => 200,
      Table_Increment      => 200);

end Pack18_Pkg;
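
The arithmetic behind the "problematic big number" can be checked with a small C sketch. It assumes a 64-bit HOST_WIDE_INT and 8 bits per unit; the `demo_*` names and `DEMO_BITS_PER_UNIT` are hypothetical and only mimic the computation described in the message, not the actual store_bit_field code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; stands in for BITS_PER_UNIT.  */
#define DEMO_BITS_PER_UNIT 8

/* Byte offset computed after the bit difference has been converted to
   an unsigned 64-bit type: a negative difference wraps around before
   the division, which produces a huge offset.  */
uint64_t
demo_offset_unsigned (int64_t bitpos, int64_t bitoffset)
{
  uint64_t bitregion_start = (uint64_t) (bitpos - bitoffset);
  return bitregion_start / DEMO_BITS_PER_UNIT;
}

/* The same computation kept in a signed type until after the
   division: the result is a small negative byte offset.  */
int64_t
demo_offset_signed (int64_t bitpos, int64_t bitoffset)
{
  return (bitpos - bitoffset) / DEMO_BITS_PER_UNIT;
}

/* Self-check using the numbers from the message: 1 - 129 = -128.  */
void
demo_self_check (void)
{
  assert (demo_offset_unsigned (1, 129) == UINT64_C (0x1FFFFFFFFFFFFFF0));
  assert (demo_offset_signed (1, 129) == -16);
}
```

With bitpos 1 and bitoffset 129 the unsigned variant gives 2^61 - 16 while the signed variant gives -16, which illustrates why keeping the offsets signed until they are adjusted avoids the bogus address.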
libgo patch committed: Set errno after Exitsyscall
In libgo, system calls that return errors convert from an errno value to an error interface, a step that requires memory allocation. Memory allocation should be done while the goroutine is running on a thread that the Go scheduler knows about, which is to say not between calls to Entersyscall and Exitsyscall. This patch to libgo moves the interface conversion after the call to Exitsyscall. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian Index: libgo/go/syscall/mksyscall.awk === --- libgo/go/syscall/mksyscall.awk (revision 186020) +++ libgo/go/syscall/mksyscall.awk (revision 186021) @@ -199,6 +199,7 @@ BEGIN { } printf(c_%s(%s)\n, cfnname, args) +seterr = 0 if (gofnresults != ) { fields = split(gofnresults, goresults, , *) if (fields 2) { @@ -218,13 +219,17 @@ BEGIN { gotype = goparam[2] if (goname == err) { + print \tvar errno Errno + print \tsetErrno := false if (cfnresult ~ /^\*/) { print \tif _r == nil { } else { print \tif _r 0 { } - print \t\terr = GetErrno() + print \t\terrno = GetErrno() + print \t\tsetErrno = true print \t} + seterr = 1 } else if (gotype == uintptr cfnresult ~ /^\*/) { printf(\t%s = (%s)(unsafe.Pointer(_r))\n, goname, gotype) } else { @@ -243,6 +248,12 @@ BEGIN { print \tExitsyscall() } +if (seterr) { + print \tif setErrno { + print \t\terr = errno + print \t} +} + if (gofnresults != ) { print \treturn }
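
The ordering this patch enforces can be sketched in C. This is only an illustration of the pattern (record the raw errno inside the syscall region, build the allocated error value after leaving it). `demo_enter_syscall`, `demo_exit_syscall` and `struct demo_error` are hypothetical placeholders, not libgo functions or types.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical marker for "between Entersyscall and Exitsyscall".  */
static int demo_in_syscall;

void demo_enter_syscall (void) { demo_in_syscall = 1; }
void demo_exit_syscall (void) { demo_in_syscall = 0; }

/* Hypothetical heap-allocated error value, standing in for the Go
   error interface that needs memory allocation.  */
struct demo_error
{
  int code;
};

/* Allocation is only allowed outside the syscall region.  */
struct demo_error *
demo_new_error (int code)
{
  assert (!demo_in_syscall);
  struct demo_error *e = malloc (sizeof *e);
  if (e != NULL)
    e->code = code;
  return e;
}

/* Close FD; return NULL on success or an allocated error on failure.
   Inside the region only plain integers are written; the conversion
   to an allocated error happens after demo_exit_syscall.  */
struct demo_error *
demo_close (int fd)
{
  int saved_errno = 0;
  int set_errno = 0;

  demo_enter_syscall ();
  if (close (fd) < 0)
    {
      saved_errno = errno;
      set_errno = 1;
    }
  demo_exit_syscall ();

  return set_errno ? demo_new_error (saved_errno) : NULL;
}
```

Calling `demo_close (-1)` on a POSIX system should yield an error whose `code` is `EBADF`; the caller is responsible for freeing it.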
libgo patch committed: Update to weekly.2012-03-13
I have committed a patch to update libgo to the weekly.2012-03-13 release. As usual this e-mail message only includes the changes to the files specific to gccgo. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian diff -r 2d2ecf6f57c5 libgo/MERGE --- a/libgo/MERGE Fri Mar 30 14:10:10 2012 -0700 +++ b/libgo/MERGE Fri Mar 30 14:11:58 2012 -0700 @@ -1,4 +1,4 @@ -f4470a54e6db +3cdba7b0650c The first line of this file holds the Mercurial revision number of the last merge done from the master library sources. diff -r 2d2ecf6f57c5 libgo/Makefile.am --- a/libgo/Makefile.am Fri Mar 30 14:10:10 2012 -0700 +++ b/libgo/Makefile.am Fri Mar 30 14:11:58 2012 -0700 @@ -813,6 +813,7 @@ go/net/rpc/server.go go_runtime_files = \ + go/runtime/compiler.go \ go/runtime/debug.go \ go/runtime/error.go \ go/runtime/extern.go \ @@ -843,6 +844,7 @@ go/strconv/decimal.go \ go/strconv/extfloat.go \ go/strconv/ftoa.go \ + go/strconv/isprint.go \ go/strconv/itoa.go \ go/strconv/quote.go @@ -1000,12 +1002,13 @@ go/crypto/tls/handshake_server.go \ go/crypto/tls/key_agreement.go \ go/crypto/tls/prf.go \ - go/crypto/tls/root_unix.go \ go/crypto/tls/tls.go go_crypto_x509_files = \ go/crypto/x509/cert_pool.go \ go/crypto/x509/pkcs1.go \ go/crypto/x509/pkcs8.go \ + go/crypto/x509/root.go \ + go/crypto/x509/root_unix.go \ go/crypto/x509/verify.go \ go/crypto/x509/x509.go @@ -1320,7 +1323,8 @@ go_path_filepath_files = \ go/path/filepath/match.go \ go/path/filepath/path.go \ - go/path/filepath/path_unix.go + go/path/filepath/path_unix.go \ + go/path/filepath/symlink.go go_regexp_syntax_files = \ go/regexp/syntax/compile.go \ diff -r 2d2ecf6f57c5 libgo/runtime/malloc.goc --- a/libgo/runtime/malloc.goc Fri Mar 30 14:10:10 2012 -0700 +++ b/libgo/runtime/malloc.goc Fri Mar 30 14:11:58 2012 -0700 @@ -390,6 +390,23 @@ { byte *p; + + if(n (uintptr)(h-arena_end - h-arena_used)) { + // We are in 32-bit mode, maybe we didn't use all possible address space yet. 
+ // Reserve some more space. + byte *new_end; + uintptr needed; + + needed = (uintptr)h-arena_used + n - (uintptr)h-arena_end; + // Round wanted arena size to a multiple of 256MB. + needed = (needed + (25620) - 1) ~((25620)-1); + new_end = h-arena_end + needed; + if(new_end = h-arena_start + MaxArena32) { + p = runtime_SysReserve(h-arena_end, new_end - h-arena_end); + if(p == h-arena_end) +h-arena_end = new_end; + } + } if(n = (uintptr)(h-arena_end - h-arena_used)) { // Keep taking from our reservation. p = h-arena_used; @@ -411,7 +428,8 @@ return nil; if(p h-arena_start || (uintptr)(p+n - h-arena_start) = MaxArena32) { - runtime_printf(runtime: memory allocated by OS not in usable range\n); + runtime_printf(runtime: memory allocated by OS (%p) not in usable range [%p,%p)\n, + p, h-arena_start, h-arena_start+MaxArena32); runtime_SysFree(p, n); return nil; } diff -r 2d2ecf6f57c5 libgo/runtime/proc.c --- a/libgo/runtime/proc.c Fri Mar 30 14:10:10 2012 -0700 +++ b/libgo/runtime/proc.c Fri Mar 30 14:11:58 2012 -0700 @@ -406,7 +406,9 @@ n = maxgomaxprocs; runtime_gomaxprocs = n; } - setmcpumax(runtime_gomaxprocs); + // wait for the main goroutine to start before taking + // GOMAXPROCS into account. + setmcpumax(1); runtime_singleproc = runtime_gomaxprocs == 1; canaddmcpu(); // mcpu++ to account for bootstrap m @@ -432,6 +434,8 @@ // by calling runtime.LockOSThread during initialization // to preserve the lock. runtime_LockOSThread(); + // From now on, newgoroutines may use non-main threads. 
+ setmcpumax(runtime_gomaxprocs); runtime_sched.init = true; scvg = __go_go(runtime_MHeap_Scavenger, nil); main_init(); diff -r 2d2ecf6f57c5 libgo/runtime/runtime.h --- a/libgo/runtime/runtime.h Fri Mar 30 14:10:10 2012 -0700 +++ b/libgo/runtime/runtime.h Fri Mar 30 14:11:58 2012 -0700 @@ -416,7 +416,6 @@ /* * runtime c-called (but written in Go) */ -void runtime_newError(String, Eface*); void runtime_printany(Eface) __asm__(libgo_runtime.runtime.Printany); void runtime_newTypeAssertionError(const String*, const String*, const String*, const String*, Eface*) @@ -429,7 +428,6 @@ */ void runtime_semacquire(uint32 volatile *); void runtime_semrelease(uint32 volatile *); -String runtime_signame(int32 sig); int32 runtime_gomaxprocsfunc(int32 n); void runtime_procyield(uint32); void runtime_osyield(void);
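
A note on the malloc.goc hunk above: the archived text shows the rounding step as `(25620)`, which looks like a damaged `256<<20`, consistent with the comment "Round wanted arena size to a multiple of 256MB". Under that assumption, the usual power-of-two round-up idiom can be sketched in C as follows. `DEMO_ARENA_CHUNK` and `demo_round_up_chunk` are hypothetical names, not libgo identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed chunk size: 256 MB, i.e. 256 << 20 bytes.  */
#define DEMO_ARENA_CHUNK ((uint64_t) 256 << 20)

/* Round NEEDED up to the next multiple of the chunk size.  This works
   because the chunk size is a power of two: adding (chunk - 1) and
   masking off the low bits never rounds down past NEEDED.  */
uint64_t
demo_round_up_chunk (uint64_t needed)
{
  return (needed + DEMO_ARENA_CHUNK - 1) & ~(DEMO_ARENA_CHUNK - 1);
}

/* Self-check on a few boundary values.  */
void
demo_self_check (void)
{
  assert (demo_round_up_chunk (0) == 0);
  assert (demo_round_up_chunk (1) == DEMO_ARENA_CHUNK);
  assert (demo_round_up_chunk (DEMO_ARENA_CHUNK) == DEMO_ARENA_CHUNK);
  assert (demo_round_up_chunk (DEMO_ARENA_CHUNK + 1) == 2 * DEMO_ARENA_CHUNK);
}
```

A request of a single byte therefore reserves one full 256 MB chunk, and a request that is already a multiple of the chunk size is left unchanged.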
Re: [PATCH] SH2A: Don't push/pop registers for functions with resbank attribute
Naveen H. S navee...@kpitcummins.com wrote:
> The patch was tested with movml testcase and works as expected.
> Tested with sh2a-elf. No new regressions.

Thanks for testing. I've committed it as revision 186024 on trunk.

Regards,
	kaz
Re: [PATCH] SH: Fix m2a-single-only compilation error
Naveen H. S navee...@kpitcummins.com wrote:
> Please find attached the patch crt1.patch which fixes compilation issue
> with sh2a-single-only target.
> Currently, compilation generates the following error:-
> merge of architecture 'sh3e' with architecture 'sh2a' produced unknown architecture
>
> The patch fixes the issue.
[snip]
> 	* libgcc/config/sh/crt1.S (VBR_SETUP): Don't define for SH2E targets

This does not look right. Please try the patch below instead.

Regards,
	kaz
--
	* config/sh/t-sh (MULTILIB_MATCHES): Match m2a-single-only to
	m2a-single instead of m2e.

--- ORIG/trunk/gcc/config/sh/t-sh 2011-11-03 09:27:45.0 +0900
+++ trunk/gcc/config/sh/t-sh 2012-03-30 15:18:19.0 +0900
@@ -1,5 +1,5 @@
 # Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
-# 2003, 2004, 2006, 2008, 2009, 2011 Free Software Foundation, Inc.
+# 2003, 2004, 2006, 2008, 2009, 2011, 2012 Free Software Foundation, Inc.
 #
 # This file is part of GCC.
 #
@@ -37,7 +37,7 @@ MULTILIB_MATCHES = $(shell \
   for abi in m1,m2,m3,m4-nofpu,m4-100-nofpu,m4-200-nofpu,m4-400,m4-500,m4-340,m4-300-nofpu,m4al,m4a-nofpu \
              m1,m2,m2a-nofpu \
              m2e,m3e,m4-single-only,m4-100-single-only,m4-200-single-only,m4-300-single-only,m4a-single-only \
-             m2e,m2a-single-only \
+             m2a-single,m2a-single-only \
              m4-single,m4-100-single,m4-200-single,m4-300-single,m4a-single \
              m4,m4-100,m4-200,m4-300,m4a \
              m5-32media,m5-compact,m5-32media \
libgo patch committed: Update to weekly.2012-03-22 release
I have committed a patch to update libgo to the weekly.2012-03-22 release. As usual, this e-mail only includes the changes to files specific to gccgo. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch.

Ian

diff -r 95129a2d27f4 libgo/MERGE --- a/libgo/MERGE Fri Mar 30 14:13:09 2012 -0700 +++ b/libgo/MERGE Fri Mar 30 14:50:01 2012 -0700 @@ -1,4 +1,4 @@ -3cdba7b0650c +bce220d03774 The first line of this file holds the Mercurial revision number of the last merge done from the master library sources. diff -r 95129a2d27f4 libgo/merge.sh --- a/libgo/merge.sh Fri Mar 30 14:13:09 2012 -0700 +++ b/libgo/merge.sh Fri Mar 30 14:50:01 2012 -0700 @@ -163,7 +163,7 @@ done done -runtime=chan.c cpuprof.c goc2c.c lock_futex.c lock_sema.c mcache.c mcentral.c mfinal.c mfixalloc.c mgc0.c mheap.c msize.c proc.c runtime.c runtime.h signal_unix.c malloc.h malloc.goc mprof.goc runtime1.goc sema.goc sigqueue.goc string.goc time.goc +runtime=chan.c cpuprof.c lock_futex.c lock_sema.c mcache.c mcentral.c mfinal.c mfixalloc.c mgc0.c mheap.c msize.c proc.c runtime.c runtime.h signal_unix.c malloc.h malloc.goc mprof.goc runtime1.goc sema.goc sigqueue.goc string.goc time.goc for f in $runtime; do merge_c $f $f done diff -r 95129a2d27f4 libgo/runtime/proc.c --- a/libgo/runtime/proc.c Fri Mar 30 14:13:09 2012 -0700 +++ b/libgo/runtime/proc.c Fri Mar 30 14:50:01 2012 -0700 @@ -966,6 +966,11 @@ } #endif + // Install signal handlers; after minit so that minit can + // prepare the thread to be able to handle the signals.
+ if(m == runtime_m0) + runtime_initsig(); + schedule(nil); return nil; } diff -r 95129a2d27f4 libgo/runtime/runtime.c --- a/libgo/runtime/runtime.c Fri Mar 30 14:13:09 2012 -0700 +++ b/libgo/runtime/runtime.c Fri Mar 30 14:50:01 2012 -0700 @@ -74,7 +74,7 @@ runtime_panicstring(const char *s) { Eface err; - + if(runtime_m()-gcing) { runtime_printf(panic: %s\n, s); runtime_throw(panic during gc); @@ -101,7 +101,7 @@ { String *s; int32 i; - + // for windows implementation see os package if(Windows) return; @@ -119,7 +119,7 @@ { String *s; int32 i, n; - + for(n=0; argv[argc+1+n] != 0; n++) ; @@ -195,8 +195,6 @@ runtime_check(void) { __go_register_gc_roots(runtime_roots); - - runtime_initsig (); } int64
Re: [google]Add support for sampled profile collection (issue4438083)
I want to revive this patch for mainline and have some questions on Honza's comments.

On Fri, Apr 29, 2011 at 1:48 PM, Jan Hubicka hubi...@ucw.cz wrote:
>> A known limitation is that value profiling is not yet sampled, but it
>> does not seem to cause problems.
>
> For the performance alone, we probably don't need to care that much,
> given the fact that we value profile only relatively expensive
> operations. But if we want to have the turn off/on feature, then I
> guess we need to guard everything. It is not much of a pain to add the
> code generating conditionals everywhere, after all.

If we sample value profiling instrumentation as well, does it make sense to use a single counter and rate for all instrumentations? If not, does the additional complexity (and flags) justify the benefit of uniformity?

>> +/* Insert STMT_IF around given sequence of consecutive statements in the
>> +   same basic block starting with STMT_START, ending with STMT_END.  */
>> +
>> +static void
>> +insert_if_then (gimple stmt_start, gimple stmt_end, gimple stmt_if)
>> +{
>> +  gimple_stmt_iterator gsi;
>> +  basic_block bb_original, bb_before_if, bb_after_if;
>> +  edge e_if_taken, e_then_join;
>> +
>> +  gsi = gsi_for_stmt (stmt_start);
>> +  gsi_insert_before (&gsi, stmt_if, GSI_SAME_STMT);
>> +  bb_original = gsi_bb (gsi);
>> +  e_if_taken = split_block (bb_original, stmt_if);
>> +  e_if_taken->flags &= ~EDGE_FALLTHRU;
>> +  e_if_taken->flags |= EDGE_TRUE_VALUE;
>> +  e_then_join = split_block (e_if_taken->dest, stmt_end);
>> +  bb_before_if = e_if_taken->src;
>> +  bb_after_if = e_then_join->dest;
>
> On mainline, when we do profile estimation before profiling
> instrumentation, now, you really want to update profile for
> performance here.

I am not sure I understand this.

>> +  make_edge (bb_before_if, bb_after_if, EDGE_FALSE_VALUE);
>> +}
>> +
>> +/* Transform:
>> +
>> +   ORIGINAL CODE
>> +
>> +   Into:
>> +
>> +   __gcov_sample_counter++;
>> +   if (__gcov_sample_counter >= __gcov_sampling_rate)
>> +     {
>> +       __gcov_sample_counter = 0;
>> +       ORIGINAL CODE
>> +     }
>
> Hmm, I think the probability that internal loop of program will
> interfere with sampling rate is relatively high, but I see it is a bit
> hard to do something about it. Can we think of some very basic
> randomization of sampling_rate?

The predominant use case we have for this technique for server workloads is as follows: Have a high value for __gcov_sampling_rate (that should probably be named __gcov_sampling_interval) during server start up so that it reduces the overhead as well as ensures that the startup period doesn't pollute the rest of the counters. After startup, change the sampling interval to a small value (in many cases, to 1) using the external interface in libgcov.c. In this use case, randomization doesn't make sense during startup (since we want to skip profile collection) as well as the steady phase (since the interval is 1 or a small number). If you want to have randomization support, do you have any suggestions for how to make it work with low sampling intervals?

Thanks,
Easwaran

> Honza
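
As a rough illustration of the transformation quoted above and of the startup use case, here is a minimal C sketch. It models one instrumented site with a shared counter and interval; every `demo_*` name is hypothetical, and the sketch stands in for the generated code rather than reproducing the libgcov interface.

```c
#include <assert.h>

/* Hypothetical stand-ins for __gcov_sample_counter and the sampling
   interval; not the real libgcov symbols.  */
static unsigned demo_sample_counter;
static unsigned demo_sampling_interval = 1;

/* Stands in for the profile counter updated by ORIGINAL CODE.  */
static unsigned long demo_profile_hits;

/* Change the interval at run time, e.g. large during startup and 1
   afterwards.  An interval of 0 is treated as 1.  */
void
demo_set_sampling_interval (unsigned interval)
{
  demo_sampling_interval = interval ? interval : 1;
  demo_sample_counter = 0;
}

/* One instrumented site after the transformation: the original
   counter update runs only on every INTERVAL-th execution.  */
void
demo_instrumented_site (void)
{
  demo_sample_counter++;
  if (demo_sample_counter >= demo_sampling_interval)
    {
      demo_sample_counter = 0;
      demo_profile_hits++;	/* ORIGINAL CODE */
    }
}

/* Execute the site CALLS times and return the total hits so far.  */
unsigned long
demo_run (unsigned calls)
{
  for (unsigned i = 0; i < calls; i++)
    demo_instrumented_site ();
  return demo_profile_hits;
}
```

With an interval of 4, ten executions record two hits; switching the interval to 1 afterwards records every execution, which matches the "skip during startup, sample everything in steady state" usage described in the message.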
Re: [C Patch]: pr52543
On 30 March 2012 20:29, Kenneth Zadeck zad...@naturalbridge.com wrote:

ramana

i get the same failure on the trunk without my patch.

In which case I apologise and will file a bug report separately. I should really have checked :( .

Ramana

kenny

On 03/30/2012 07:36 AM, Ramana Radhakrishnan wrote:

Hi

I have tested this on an x86_64 with both the force lowering on and off and neither cause any regressions as well as extensive testing on my port.

So, just out of curiosity, I decided to run this through a cross-build and noticed the following ICE with eglibc. I haven't had the time to debug this further, but it does appear as though it could do with some more testing on some more ports, and this probably needs some tuning as you say.

$ /work/cross-build/fsf/arm-none-linux-gnueabi/tools-lowersubregchanges-patched/bin/arm-none-linux-gnueabi-gcc -c -O2 ./besttry.c -mfloat-abi=soft -march=armv5te
./besttry.c: In function ‘_IO_new_file_write’:
./besttry.c:36:1: internal compiler error: in get_loop_body, at cfgloop.c:831

$ cat besttry.c
__extension__ typedef int __ssize_t;
extern __thread int __libc_errno __attribute__ ((tls_model ("initial-exec")));
struct _IO_FILE
{
  int _fileno;
  int _flags2;
};
typedef struct _IO_FILE _IO_FILE;
_IO_new_file_write (f, data, n)
     _IO_FILE *f;
{
  __ssize_t to_do = n;
  while (to_do > 0)
    {
      __ssize_t count = (__builtin_expect (f->_flags2 & 2, 0)
        ? ({ unsigned int _sys_result = ({ register int _a1 asm ("r0"), _nr asm ("r7"); int _a3tmp = (int) ((to_do)); int _a2tmp = (int) ((data)); register int _a2 asm ("a2") = _a2tmp; register int _a3 asm ("a3") = _a3tmp; _nr = ((0 + 4)); asm volatile ("swi 0x0 @ syscall SYS_ify(write)" : "=r" (_a1) : "r" (_nr) , "r" (_a1), "r" (_a2), "r" (_a3) : "memory"); _a1; }); if (__builtin_expect (((unsigned int) (_sys_result) >= 0xf001u), 0)) { (__libc_errno = ((-(_sys_result)))); _sys_result = (unsigned int) -1; } (int) _sys_result; })
        : __write (f->_fileno, data, to_do));
      if (count < 0)
        {
          break;
        }
      to_do -= count;
    }
}

Ramana

On 29 March 2012 22:10, Kenneth Zadeck zad...@naturalbridge.com wrote:

This patch takes a different approach to fixing PR52543 than does the patch in http://gcc.gnu.org/ml/gcc-patches/2012-03/msg00641.html

This patch transforms the lower-subreg pass(es) from unconditionally splitting wide moves, zero extensions, and shifts, so that it now takes into account the target specific costs and only does the transformations if it is profitable.

Unconditional splitting is a problem that not only occurs on the AVR but is also a problem on the ARM NEON and my private port. Furthermore, it is a problem that is likely to occur on most modern larger machines since these machines are more likely to have fast instructions for moving things that are larger than word mode.

At compiler initialization time, each mode that is larger than word mode is examined to determine if the cost of moving a value of that mode is less than that of inserting the proper number of word-sized moves. If it is cheaper to split it up, a bit is set to allow moves of that mode to be lowered. A similar analysis is made for the zero extensions and shifts, except that lower-subreg had been (and still is) limited to breaking up these operations only if the target size is twice the size of word mode. Also, if the analysis determines that there are no profitable transformations, the pass exits quickly without doing any analysis.
It is quite likely that most ports will have to be adjusted after this patch is accepted. For instance, the analysis discovers that there are no profitable transformations to be performed on the x86-64. Since this is not my platform, I have no idea if these are the correct settings. But the pass uses the standard insn_rtx_cost interface, and it is the port maintainer's responsibility to not lie to the optimization passes, so this extra work in stage one should be acceptable. I do know from a private conversation with Richard Sandiford that MIPS patches are likely forthcoming.

There is preprocessor-controlled code that prints out the cost analysis. Only a summary of this can go in the subregs dump file because the analysis is called from backend_init_target and so the dump file is not available. But it is very useful to define LOG_COSTS when adjusting your port.

There is also preprocessor code that forces all of the lowering operations to be marked as profitable. This is useful in debugging the new logic. Both of these preprocessor symbols are documented at the top of the pass.

I have tested this on an x86_64 with both the force lowering on and off and neither cause any regressions as well as extensive testing on my port.

Ok to commit?

Kenny

2012-03-29 Kenneth Zadeck zad...@naturalbridge.com
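The per-mode profitability check described in this message can be illustrated with a toy model. This is a sketch only, not code from lower-subreg.c: `MoveCost` is a hypothetical stand-in for whatever the target's insn_rtx_cost hook reports, sizes are in bytes, and treating a tie as "do not split" is an assumption made here for illustration.

```cpp
#include <functional>

// Hypothetical cost oracle: cost of one move of `bytes` bytes on the target.
using MoveCost = std::function<int(int bytes)>;

// Returns true if replacing one wide move by word-sized moves is strictly
// cheaper. Modes no wider than a word are never split.
bool profitable_to_split(int mode_bytes, int word_bytes,
                         const MoveCost& move_cost) {
    if (mode_bytes <= word_bytes)
        return false;
    // Number of word-sized moves needed to cover the wide mode.
    int pieces = (mode_bytes + word_bytes - 1) / word_bytes;
    return pieces * move_cost(word_bytes) < move_cost(mode_bytes);
}
```

A port with fast wide moves would report a wide-move cost no higher than a word move, so no mode is marked for lowering and the pass can exit early, as the message describes for x86-64.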
libgo patch committed: Update to weekly.2012-03-27 release
I have committed a patch to libgo to update to the weekly.2012-03-27 release. This brings libgo up to the Go 1 standard release. This patch is small enough to include in this e-mail message in its entirety. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian diff -r b8be815d3dea libgo/MERGE --- a/libgo/MERGE Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/MERGE Fri Mar 30 15:24:14 2012 -0700 @@ -1,4 +1,4 @@ -bce220d03774 +dc5e410f0b4c The first line of this file holds the Mercurial revision number of the last merge done from the master library sources. diff -r b8be815d3dea libgo/go/crypto/tls/handshake_server.go --- a/libgo/go/crypto/tls/handshake_server.go Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/go/crypto/tls/handshake_server.go Fri Mar 30 15:24:14 2012 -0700 @@ -60,21 +60,23 @@ for _, id := range clientHello.cipherSuites { for _, supported := range config.cipherSuites() { if id == supported { -suite = nil +var candidate *cipherSuite + for _, s := range cipherSuites { if s.id == id { - suite = s + candidate = s break } } -if suite == nil { +if candidate == nil { continue } // Don't select a ciphersuite which we can't // support for this client. 
-if suite.elliptic !ellipticOk { +if candidate.elliptic !ellipticOk { continue } +suite = candidate break FindCipherSuite } } diff -r b8be815d3dea libgo/go/crypto/tls/key_agreement.go --- a/libgo/go/crypto/tls/key_agreement.go Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/go/crypto/tls/key_agreement.go Fri Mar 30 15:24:14 2012 -0700 @@ -130,6 +130,10 @@ } } + if curveid == 0 { + return nil, errors.New(tls: no supported elliptic curves offered) + } + var x, y *big.Int var err error ka.privateKey, x, y, err = elliptic.GenerateKey(ka.curve, config.rand()) diff -r b8be815d3dea libgo/go/exp/gotype/gotype.go --- a/libgo/go/exp/gotype/gotype.go Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/go/exp/gotype/gotype.go Fri Mar 30 15:24:14 2012 -0700 @@ -171,7 +171,7 @@ func processPackage(fset *token.FileSet, files map[string]*ast.File) { // make a package (resolve all identifiers) - pkg, err := ast.NewPackage(fset, files, types.GcImporter, types.Universe) + pkg, err := ast.NewPackage(fset, files, types.GcImport, types.Universe) if err != nil { report(err) return diff -r b8be815d3dea libgo/go/exp/types/check_test.go --- a/libgo/go/exp/types/check_test.go Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/go/exp/types/check_test.go Fri Mar 30 15:24:14 2012 -0700 @@ -184,7 +184,7 @@ eliminate(t, errors, err) // verify errors returned after resolving identifiers - pkg, err := ast.NewPackage(fset, files, GcImporter, Universe) + pkg, err := ast.NewPackage(fset, files, GcImport, Universe) eliminate(t, errors, err) // verify errors returned by the typechecker diff -r b8be815d3dea libgo/go/exp/types/exportdata.go --- a/libgo/go/exp/types/exportdata.go Fri Mar 30 15:03:47 2012 -0700 +++ b/libgo/go/exp/types/exportdata.go Fri Mar 30 15:24:14 2012 -0700 @@ -2,7 +2,7 @@ // Use of this source code is governed by a BSD-style // license that can be found in the LICENSE file. -// This file implements ExportData. +// This file implements FindGcExportData. 
package types @@ -11,15 +11,14 @@ errors fmt io - os strconv strings ) -func readGopackHeader(buf *bufio.Reader) (name string, size int, err error) { +func readGopackHeader(r *bufio.Reader) (name string, size int, err error) { // See $GOROOT/include/ar.h. hdr := make([]byte, 16+12+6+6+8+10+2) - _, err = io.ReadFull(buf, hdr) + _, err = io.ReadFull(r, hdr) if err != nil { return } @@ -36,33 +35,14 @@ return } -type dataReader struct { - *bufio.Reader - io.Closer -} - -// ExportData returns a readCloser positioned at the beginning of the -// export data section of the given object/archive file, or an error. -// It is the caller's responsibility to close the readCloser. +// FindGcExportData positions the reader r at the beginning of the +// export data section of an underlying GC-created object/archive +// file by reading from it. The reader must be positioned at the +// start of the file before calling this function. // -func ExportData(filename string) (rc io.ReadCloser, err error) { - file, err := os.Open(filename) - if err != nil { - return - } - - defer func() { - if err != nil { - file.Close() - // Add file name to error. - err = fmt.Errorf(reading export data: %s: %v, filename, err) - } - }() - - buf := bufio.NewReader(file) - +func FindGcExportData(r *bufio.Reader) (err error) { // Read first line to make sure this is an object file. - line, err := buf.ReadSlice('\n') + line, err := r.ReadSlice('\n') if err != nil { return } @@ -74,7 +54,7 @@ // First entry should be __.SYMDEF. // Read and discard. - if name, size, err = readGopackHeader(buf); err != nil { + if name,
Re: Support for Runtime CPU type detection via builtins (issue5754058)
On Fri, Mar 30, 2012 at 5:47 AM, Michael Matz m...@suse.de wrote: Hi, On Thu, 29 Mar 2012, Sriraman Tallam wrote: +struct __processor_model +{ + /* Vendor. */ + unsigned int __cpu_is_amd : 1; + unsigned int __cpu_is_intel : 1; + /* CPU type. */ + unsigned int __cpu_is_intel_atom : 1; + unsigned int __cpu_is_intel_core2 : 1; + unsigned int __cpu_is_intel_corei7 : 1; + unsigned int __cpu_is_intel_corei7_nehalem : 1; + unsigned int __cpu_is_intel_corei7_westmere : 1; + unsigned int __cpu_is_intel_corei7_sandybridge : 1; + unsigned int __cpu_is_amdfam10h : 1; + unsigned int __cpu_is_amdfam10h_barcelona : 1; + unsigned int __cpu_is_amdfam10h_shanghai : 1; + unsigned int __cpu_is_amdfam10h_istanbul : 1; + unsigned int __cpu_is_amdfam15h_bdver1 : 1; + unsigned int __cpu_is_amdfam15h_bdver2 : 1; +} __cpu_model; It doesn't make sense for the model to be a bitfield, a processor will have only ever exactly one model. Just make it an enum or even just an int. Not entirely true, nehalem and corei7 can be both set. However, I modified this by dividing it into types and sub types and then did what you said. * config/i386/i386.c (build_processor_features_struct): New function. (build_processor_model_struct): New function. (make_var_decl): New function. (get_field_from_struct): New function. (fold_builtin_target): New function. (ix86_fold_builtin): New function. (ix86_expand_builtin): Expand new builtins by folding them. (make_cpu_type_builtin): New functions. (ix86_init_platform_type_builtins): Make the new builtins. (ix86_init_builtins): Make new builtins to detect CPU type. (TARGET_FOLD_BUILTIN): New macro. (IX86_BUILTIN_CPU_INIT): New enum value. (IX86_BUILTIN_CPU_IS): New enum value. (IX86_BUILTIN_CPU_SUPPORTS): New enum value. * config/i386/i386-builtin-types.def: New function type. * testsuite/gcc.target/builtin_target.c: New testcase. * libgcc/config/i386/i386-cpuinfo.c: New file. * libgcc/config/i386/t-cpuinfo: New file. * libgcc/config.host: Include t-cpuinfo. 
* libgcc/config/i386/libgcc-glibc.ver: Version symbols __cpu_model and __cpu_features. Patch available for review here: http://codereview.appspot.com/5754058 Thanks, -Sri. Ciao, Michael. Index: libgcc/config.host === --- libgcc/config.host (revision 185898) +++ libgcc/config.host (working copy) @@ -1130,7 +1130,7 @@ i[34567]86-*-linux* | x86_64-*-linux* | \ i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \ i[34567]86-*-knetbsd*-gnu | \ i[34567]86-*-gnu*) - tmake_file=${tmake_file} t-tls i386/t-linux + tmake_file=${tmake_file} t-tls i386/t-linux i386/t-cpuinfo if test $libgcc_cv_cfi = yes; then tmake_file=${tmake_file} t-stack i386/t-stack-i386 fi Index: libgcc/config/i386/t-cpuinfo === --- libgcc/config/i386/t-cpuinfo(revision 0) +++ libgcc/config/i386/t-cpuinfo(revision 0) @@ -0,0 +1 @@ +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c Index: libgcc/config/i386/i386-cpuinfo.c === --- libgcc/config/i386/i386-cpuinfo.c (revision 0) +++ libgcc/config/i386/i386-cpuinfo.c (revision 0) @@ -0,0 +1,298 @@ +/* Get CPU type and Features for x86 processors. + Copyright (C) 2012 Free Software Foundation, Inc. + Contributed by Sriraman Tallam (tmsri...@google.com) + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +http://www.gnu.org/licenses/. 
*/ + +#include cpuid.h +#include tsystem.h + +int __cpu_indicator_init (void) __attribute__ ((constructor (101))); + +enum vendor_signatures +{ + SIG_INTEL = 0x756e6547 /* Genu */, + SIG_AMD =0x68747541 /* Auth */ +}; + +/* ISA Features supported. */ + +struct __processor_features +{ + unsigned int __cpu_cmov : 1; + unsigned int __cpu_mmx : 1; + unsigned int __cpu_popcnt : 1; + unsigned int __cpu_sse : 1; + unsigned int __cpu_sse2 : 1; + unsigned int __cpu_sse3 : 1; + unsigned int __cpu_ssse3 : 1; + unsigned int __cpu_sse4_1 : 1; + unsigned int __cpu_sse4_2 : 1; +} __cpu_features; + +/* Processor Vendor and Models. */ + +enum processor_vendor +{ + VENDOR_INTEL = 1, + VENDOR_AMD, + VENDOR_MAX +}; +
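The `SIG_INTEL` and `SIG_AMD` constants in the patch are the first four bytes of the CPUID vendor string packed into a 32-bit register value, least significant byte first. The sketch below only decodes those constants; `reg_to_text`, `Vendor` and `classify_vendor` are hypothetical names for illustration, and the code does not execute CPUID or reproduce the patch's detection logic.

```cpp
#include <cstdint>
#include <string>

// Unpack a 32-bit register value into its four ASCII bytes,
// least significant byte first.
std::string reg_to_text(std::uint32_t reg) {
    std::string s;
    for (int i = 0; i < 4; ++i)
        s.push_back(static_cast<char>((reg >> (8 * i)) & 0xffu));
    return s;
}

enum class Vendor { Unknown, Intel, Amd };

// Classify a vendor from the first signature word, using the two
// constants that appear in the patch.
Vendor classify_vendor(std::uint32_t signature) {
    switch (signature) {
        case 0x756e6547u: return Vendor::Intel;  // "Genu"
        case 0x68747541u: return Vendor::Amd;    // "Auth"
        default:          return Vendor::Unknown;
    }
}
```

Decoding the constants this way shows why the patch comments read "Genu" and "Auth": they are the leading characters of "GenuineIntel" and "AuthenticAMD".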
Re: [SMS] Support new loop pattern
Roman, Andrey,

Sorry for the delayed response. It would indeed be good to have SMS apply to more loop patterns, still within the realm of *countable* loops. SMS was originally designed to handle doloops, with a specific pattern controlling the loop, easily identified and separable from the loop's body. The newly proposed change to support new loop patterns is pretty invasive and sizable, taking place entirely within modulo-sched.c. The main issue I've been considering is whether it would be possible instead to transform the new loop patterns we want SMS to handle into doloops (potentially introducing additional induction variables to feed other uses), and then feed the resulting loop into SMS as is. In other words, could you fold it into doloop.c? And if so, will doing so introduce significant overheads?

2012/3/29 Andrey Belevantsev a...@ispras.ru:

Hello,

I'd like to ping again those SMS patches once we're back to Stage 1. Ayal, maybe it would remove some burden for you if you'd review the general SMS functionality of those patches, and we'd ask RTL folks to look at the pieces related to RTL pattern matching and generation?

It definitely would ... especially in light of the above issue.

Thanks (for your patches, patience, pings..),
Ayal.

Yours,
Andrey

On 10.02.2012 16:15, Roman Zhuykov wrote:

Ping. Ayal, please review this patch and these three patches too:
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00505.html
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg00506.html
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01800.html

--
Roman Zhuykov zhr...@ispras.ru
Re: [Patch]: Support VMS in libstdc++ crossconfig.m4
Hi,

Hi, currently all VMS compilers are built on Unix. So, to build the libstdc++ library, this looks like the minimum required. Tested by building libstdc++ for ia64-hp-openvms. Ok for trunk?

I understand that without your patch a libstdc++ for ia64-hp-openvms cannot be built at all, and I don't see how your fix could negatively affect other targets, thus Ok for mainline.

Thanks,
Paolo.
Re: [C++ RFC / Patch] Implementing Deducing noexcept for destructors
Hi again, On 03/30/2012 12:26 AM, Paolo Carlini wrote: On 03/29/2012 09:27 PM, Jason Merrill wrote: On 03/29/2012 03:06 PM, Paolo Carlini wrote: The exception specification on old_decl doesn't matter; we can drop that test. I seem to remember something going wrong with templates otherwise, because implicitly_declare_fn has gcc_assert (!dependent_type_p (type)); We shouldn't be doing this for templates anyway, as in general we can't know what the implicitly declared function will look like. Oh my, as simple as the below appears to work! I simply added a !processing_template_decl check. Then I removed the deduce_noexcept_on_destructor calls in register_specialization and when I found a proper place in grokfndecl (must be before check_explicit_specialization) I noticed that apparently I can remove the other deduce_noexcept_on_destructor call which I had later on in grokfndecl. Thus the below passes the (updated) testsuite on x86_64-linux. Sorry for essentially self-replying, but today, while I was traveling, I reviewed in my mind your comments over the last days, and I think I had a buglet in the patch which I sent in the last message: it doesn't make sure, in grokfndecl, to *not* call deduce_noexcept_on_destructor on a destructor of a class still being defined. Thus I'm adding a !TYPE_BEING_DEFINED (DECL_CONTEXT (decl)) check and the complete patch (which I'm attaching below) still passes testing. I also double checked that, for a simple case like: struct A { ~A(); }; A::~A() { } we process the declaration from check_bases_and_members and then the definition from grokfndecl. Thanks, Paolo. Index: testsuite/g++.old-deja/g++.eh/cleanup1.C === --- testsuite/g++.old-deja/g++.eh/cleanup1.C(revision 185982) +++ testsuite/g++.old-deja/g++.eh/cleanup1.C(working copy) @@ -2,6 +2,12 @@ // Bug: obj gets destroyed twice because the fixups for the return are // inside its cleanup region. 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__ +#define NOEXCEPT_FALSE noexcept (false) +#else +#define NOEXCEPT_FALSE +#endif + extern C int printf (const char *, ...); int d; @@ -9,7 +15,7 @@ int d; struct myExc { }; struct myExcRaiser { - ~myExcRaiser() { throw myExc(); } + ~myExcRaiser() NOEXCEPT_FALSE { throw myExc(); } }; struct stackObj { Index: testsuite/g++.dg/tree-ssa/ehcleanup-1.C === --- testsuite/g++.dg/tree-ssa/ehcleanup-1.C (revision 185982) +++ testsuite/g++.dg/tree-ssa/ehcleanup-1.C (working copy) @@ -1,9 +1,16 @@ // { dg-options -O2 -fdump-tree-ehcleanup1-details } + +#ifdef __GXX_EXPERIMENTAL_CXX0X__ +#define NOEXCEPT_FALSE noexcept (false) +#else +#define NOEXCEPT_FALSE +#endif + extern void can_throw (); class a { public: - ~a () + ~a () NOEXCEPT_FALSE { if (0) can_throw (); Index: testsuite/g++.dg/cpp0x/noexcept17.C === --- testsuite/g++.dg/cpp0x/noexcept17.C (revision 0) +++ testsuite/g++.dg/cpp0x/noexcept17.C (revision 0) @@ -0,0 +1,54 @@ +// PR c++/50043 +// { dg-options -std=c++11 } + +struct True1 {}; +struct True2 { ~True2(); }; +struct True3 { ~True3(){ throw 0; } }; +struct False { ~False() noexcept(false); }; + +template typename Base +struct A : Base +{ +}; + +template typename Member +struct B +{ +Member mem; +}; + +template typename Base, typename Member +struct C : Base +{ +Member mem; +}; + +#define SA(X) static_assert(X, #X) + +SA( noexcept(True1())); +SA( noexcept(True2())); +SA( noexcept(True3())); +SA(!noexcept(False())); + +SA( noexcept(ATrue1())); +SA( noexcept(ATrue2())); +SA( noexcept(ATrue3())); +SA(!noexcept(AFalse())); + +SA( noexcept(BTrue1())); +SA( noexcept(BTrue2())); +SA( noexcept(BTrue3())); +SA(!noexcept(BFalse())); + +SA( noexcept(CTrue1, True2())); +SA( noexcept(CTrue1, True3())); +SA( noexcept(CTrue2, True3())); +SA( noexcept(CTrue2, True1())); +SA( noexcept(CTrue3, True1())); +SA( noexcept(CTrue3, True2())); +SA(!noexcept(CFalse, True1())); +SA(!noexcept(CFalse, True2())); +SA(!noexcept(CFalse, True3())); 
+SA(!noexcept(CTrue1, False())); +SA(!noexcept(CTrue2, False())); +SA(!noexcept(CTrue3, False())); Index: testsuite/g++.dg/cpp0x/noexcept01.C === --- testsuite/g++.dg/cpp0x/noexcept01.C (revision 185982) +++ testsuite/g++.dg/cpp0x/noexcept01.C (working copy) @@ -50,7 +50,7 @@ struct E ~E(); }; -SA (!noexcept (E())); +SA (noexcept (E())); struct F { @@ -74,7 +74,7 @@ void tf() } template void tfint,true(); -template void tfE, false(); +template void tfE, true(); // Make sure that noexcept uses the declared exception-specification, not // any knowledge we might have about whether or not the function really Index: testsuite/g++.dg/eh/init-temp1.C
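The behaviour the new tests check can be reproduced in a few lines of C++11. This is a reduced sketch with hypothetical type names (`Plain`, `Declared`, `Throwing`, `Holder`), not part of the patch: a destructor without an exception specification gets the one an implicit destructor would have, so it is `noexcept` unless a base or member destructor may throw.

```cpp
struct Plain {};                                   // implicit destructor
struct Declared { ~Declared(); };                  // declared, no specification
struct Throwing { ~Throwing() noexcept(false); };  // explicitly may throw

// A class whose implicit destructor depends on its member's destructor.
template <typename Member>
struct Holder { Member mem; };

// noexcept(T()) also covers destruction of the temporary.
template <typename T>
constexpr bool temp_is_noexcept() { return noexcept(T()); }

static_assert(temp_is_noexcept<Plain>(), "implicit destructor is noexcept");
static_assert(temp_is_noexcept<Declared>(), "deduced noexcept(true)");
static_assert(!temp_is_noexcept<Throwing>(), "explicit noexcept(false) kept");
static_assert(temp_is_noexcept<Holder<Declared> >(), "member dtor is noexcept");
static_assert(!temp_is_noexcept<Holder<Throwing> >(), "member dtor may throw");
```

The `noexcept (false)` annotations added to the older tests in the patch follow from this rule: a destructor that really throws must now say so, or the program terminates when the exception leaves it.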
[google] Work around PR52796 by replacing empty packs with explicit overloads. (issue5971053)
Work around http://gcc.gnu.org/PR52796 in gcc-4.6 by adding an overload of each function that passes a parameter pack directly as the only arguments of an object's constructor, which explicitly takes no arguments in place of the pack. Tested with check-c++ and by trying to provoke the bug in valgrind for each changed location. Some of the insert_aux locations appear inaccessible because of missing emplace() functions. I plan to only apply this to the google/gcc-4_6 branch, since gcc-4.7 already makes these cases work properly. 2012-03-30 Jeffrey Yasskin jyass...@google.com * libstdc++-v3/include/ext/pool_allocator.h: Add 1-argument construct() method. * libstdc++-v3/include/ext/bitmap_allocator.h: Likewise. * libstdc++-v3/include/ext/new_allocator.h: Likewise. * libstdc++-v3/include/ext/malloc_allocator.h: Likewise. * libstdc++-v3/include/ext/array_allocator.h: Likewise. * libstdc++-v3/include/ext/mt_allocator.h: Likewise. * libstdc++-v3/include/ext/extptr_allocator.h: Likewise. * libstdc++-v3/include/bits/stl_construct.h:Add 1-argument _Construct function. * libstdc++-v3/include/bits/stl_list.h: Add default _List_node constructor. * libstdc++-v3/include/bits/hashtable_policy.h: Add default _Hash_node constructor. * libstdc++-v3/include/bits/forward_list.h:Add default _Fwd_list_node constructor. * libstdc++-v3/include/bits/stl_tree.h:Add default _Rb_tree_node constructor. * libstdc++-v3/testsuite/23_containers/forward_list/requirements/dr438/assign_neg.cc: Update line numbers. * libstdc++-v3/testsuite/23_containers/forward_list/requirements/dr438/insert_neg.cc: Likewise. * libstdc++-v3/testsuite/23_containers/forward_list/requirements/dr438/constructor_1_neg.cc: Likewise. * libstdc++-v3/testsuite/23_containers/forward_list/requirements/dr438/constructor_2_neg.cc: Likewise. * libstdc++-v3/testsuite/23_containers/list/requirements/dr438/assign_neg.cc: Likewise. * libstdc++-v3/testsuite/23_containers/list/requirements/dr438/insert_neg.cc: Likewise. 
* libstdc++-v3/testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc: Likewise. * libstdc++-v3/testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc: Likewise. Index: libstdc++-v3/include/ext/pool_allocator.h === --- libstdc++-v3/include/ext/pool_allocator.h (revision 186024) +++ libstdc++-v3/include/ext/pool_allocator.h (working copy) @@ -165,6 +165,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { ::new((void *)__p) _Tp(__val); } #ifdef __GXX_EXPERIMENTAL_CXX0X__ + // Work around PR52796 by avoiding 0-length parameter packs + // passed to constructors. + void + construct(pointer __p) + { ::new((void *)__p) _Tp(); } + templatetypename... _Args void construct(pointer __p, _Args... __args) Index: libstdc++-v3/include/ext/bitmap_allocator.h === --- libstdc++-v3/include/ext/bitmap_allocator.h (revision 186024) +++ libstdc++-v3/include/ext/bitmap_allocator.h (working copy) @@ -1058,6 +1058,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { ::new((void *)__p) value_type(__data); } #ifdef __GXX_EXPERIMENTAL_CXX0X__ + // Work around PR52796 by avoiding 0-length parameter packs + // passed to constructors. + void + construct(pointer __p) + { ::new((void *)__p) _Tp(); } + templatetypename... _Args void construct(pointer __p, _Args... __args) @@ -1109,4 +1115,3 @@ _GLIBCXX_END_NAMESPACE_VERSION } // namespace __gnu_cxx #endif - Index: libstdc++-v3/include/ext/new_allocator.h === --- libstdc++-v3/include/ext/new_allocator.h(revision 186024) +++ libstdc++-v3/include/ext/new_allocator.h(working copy) @@ -115,6 +115,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { ::new((void *)__p) _Tp(__val); } #ifdef __GXX_EXPERIMENTAL_CXX0X__ + // Work around PR52796 by avoiding 0-length parameter packs + // passed to constructors. + void + construct(pointer __p) + { ::new((void *)__p) _Tp(); } + templatetypename... _Args void construct(pointer __p, _Args... 
__args) Index: libstdc++-v3/include/ext/malloc_allocator.h === --- libstdc++-v3/include/ext/malloc_allocator.h (revision 186024) +++ libstdc++-v3/include/ext/malloc_allocator.h (working copy) @@ -111,6 +111,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { ::new((void *)__p) value_type(__val); } #ifdef __GXX_EXPERIMENTAL_CXX0X__ + // Work around PR52796 by avoiding 0-length parameter packs + // passed to constructors. + void + construct(pointer __p) + { ::new((void *)__p)
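The shape of the workaround, an explicit zero-argument overload next to the variadic one, can be sketched with a toy allocator. `ToyAllocator` and its `empty_calls` counter are hypothetical and exist only to show which overload is chosen; this is not the libstdc++ code.

```cpp
#include <new>
#include <utility>

template <typename T>
struct ToyAllocator {
    int empty_calls = 0;  // illustration only: counts zero-argument calls

    // Explicit overload taking no constructor arguments, so the variadic
    // template below is never instantiated with an empty pack.
    void construct(T* p) {
        ++empty_calls;
        ::new (static_cast<void*>(p)) T();
    }

    // General case: forward any constructor arguments.
    template <typename... Args>
    void construct(T* p, Args&&... args) {
        ::new (static_cast<void*>(p)) T(std::forward<Args>(args)...);
    }
};
```

For a call such as `a.construct(p)` both overloads are viable, and overload resolution prefers the non-template function, so the empty-pack expansion that triggers the bug is avoided without changing callers.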
[v3] Fix libstdc++/52799
Hi,

tested x86_64-linux, committed mainline and 4_7-branch.

Thanks,
Paolo.

///

2012-03-30 Jeffrey Yasskin jyass...@gcc.gnu.org
Paolo Carlini paolo.carl...@oracle.com

PR libstdc++/52799
* include/bits/deque.tcc (emplace): Fix thinko, replace push_front -> emplace_front, and likewise for *_back.
* testsuite/23_containers/deque/modifiers/emplace/52799.cc: New.
* testsuite/23_containers/list/modifiers/emplace/52799.cc: Likewise.
* testsuite/23_containers/vector/modifiers/emplace/52799.cc: Likewise.

Index: include/bits/deque.tcc === --- include/bits/deque.tcc (revision 185982) +++ include/bits/deque.tcc (working copy) @@ -1,7 +1,7 @@ // Deque implementation (out of line) -*- C++ -*- // Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, -// 2009, 2010, 2011 +// 2009, 2010, 2011, 2012 // Free Software Foundation, Inc. // // This file is part of the GNU ISO C++ Library. This library is free @@ -175,12 +175,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER { if (__position._M_cur == this->_M_impl._M_start._M_cur) { - push_front(std::forward<_Args>(__args)...); + emplace_front(std::forward<_Args>(__args)...); return this->_M_impl._M_start; } else if (__position._M_cur == this->_M_impl._M_finish._M_cur) { - push_back(std::forward<_Args>(__args)...); + emplace_back(std::forward<_Args>(__args)...); iterator __tmp = this->_M_impl._M_finish; --__tmp; return __tmp;
Index: testsuite/23_containers/vector/modifiers/emplace/52799.cc === --- testsuite/23_containers/vector/modifiers/emplace/52799.cc (revision 0) +++ testsuite/23_containers/vector/modifiers/emplace/52799.cc (revision 0) @@ -0,0 +1,28 @@ +// { dg-options "-std=gnu++11" } +// { dg-do compile } + +// Copyright (C) 2012 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library.
This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include <vector> + +// libstdc++/52799 +int main() +{ + std::vector<int> v; + v.emplace(v.begin()); +} Index: testsuite/23_containers/deque/modifiers/emplace/52799.cc === --- testsuite/23_containers/deque/modifiers/emplace/52799.cc(revision 0) +++ testsuite/23_containers/deque/modifiers/emplace/52799.cc(revision 0) @@ -0,0 +1,28 @@ +// { dg-options "-std=gnu++11" } +// { dg-do compile } + +// Copyright (C) 2012 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/.
+ +#include <deque> + +// libstdc++/52799 +int main() +{ + std::deque<int> d; + d.emplace(d.begin()); +} Index: testsuite/23_containers/list/modifiers/emplace/52799.cc === --- testsuite/23_containers/list/modifiers/emplace/52799.cc (revision 0) +++ testsuite/23_containers/list/modifiers/emplace/52799.cc (revision 0) @@ -0,0 +1,28 @@ +// { dg-options "-std=gnu++11" } +// { dg-do compile } + +// Copyright (C) 2012 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of