Re: [PATCH 12/13] Adjust relevant test cases wrt -ftrack-macro-expansion=[0|2]
Dodji Seketeli do...@redhat.com writes: Gabriel Dos Reis g...@integrable-solutions.net writes: OK. Thank you. While bootstrapping the tree again, it appeared that an output regression of the objc test objc.dg/foreach-7.m flew below my radar. It's one of those typical cases where the first location pointed to by the diagnostics points into the definition of a macro, instead of pointing to its expansion point. I have just forced that unique test to run without -ftrack-macro-expansion. This looks fairly obvious to me, but I am CC-ing Mike Stump, just in case. I have updated this patch to include the adjustment for objc.dg/foreach-7.m. All the other parts remained unchanged. Tested on x86_64-unknown-linux-gnu against trunk, bootstrap still underway. FWIW, this passed bootstrap fine. -- Dodji
Re: [PATCH 12/13] Adjust relevant test cases wrt -ftrack-macro-expansion=[0|2]
It is still OK :-) On Mon, Apr 30, 2012 at 1:22 AM, Dodji Seketeli do...@redhat.com wrote: Dodji Seketeli do...@redhat.com writes: Gabriel Dos Reis g...@integrable-solutions.net writes: OK. Thank you. While bootstrapping the tree again, it appeared that an output regression of the objc test objc.dg/foreach-7.m flew below my radar. It's one of those typical cases where the first location pointed to by the diagnostics points into the definition of a macro, instead of pointing to its expansion point. I have just forced that unique test to run without -ftrack-macro-expansion. This looks fairly obvious to me, but I am CC-ing Mike Stump, just in case. I have updated this patch to include the adjustment for objc.dg/foreach-7.m. All the other parts remained unchanged. Tested on x86_64-unknown-linux-gnu against trunk, bootstrap still underway. FWIW, this passed bootstrap fine. -- Dodji
[Ada] Fix spurious aliasing warning for subtype of private type
Although the Ada type system provides strong enough guarantees to make a safe and effective usage of -fstrict-aliasing, there is an explicit unsafe construct (Unchecked_Conversion) for which these guarantees aren't valid any more. That's why GNAT also provides the No_Strict_Aliasing pragma (mapped to TYPE_REF_CAN_ALIAS_ALL in the GCC IL) and issues a warning when it runs into such a problematic Unchecked_Conversion. The attached testcase exhibits a false positive case for this warning, stemming from gigi trying to second-guess the front-end instead of computing the exact predicate, because it cannot compute it in all cases by the time it encounters the N_Validate_Unchecked_Conversion node. Fixed by waiting until the very end of the translation to issue the warning. Tested on i586-suse-linux, applied on the mainline. 2012-04-30 Eric Botcazou ebotca...@adacore.com * gcc-interface/gigi.h (mark_out_of_scope): Delete. (destroy_gnat_to_gnu): Declare. (destroy_dummy_type): Likewise. * gcc-interface/decl.c (mark_out_of_scope): Delete. * gcc-interface/utils.c (destroy_gnat_to_gnu): New function. (destroy_dummy_type): Likewise. * gcc-interface/trans.c (gnat_validate_uc_list): New variable. (gigi): Call validate_unchecked_conversion on gnat_validate_uc_list after the translation is completed. Call destroy_gnat_to_gnu and destroy_dummy_type at the end. (Subprogram_Body_to_gnu): Do not call mark_out_of_scope. (gnat_to_gnu) N_Block_Statement: Likewise. N_Validate_Unchecked_Conversion: Do not process the node, only push it onto gnat_validate_uc_list. (validate_unchecked_conversion): New function. 2012-04-30 Eric Botcazou ebotca...@adacore.com * gnat.dg/warn6.ad[sb]: New test. -- Eric Botcazou Index: gcc-interface/utils.c === --- gcc-interface/utils.c (revision 186945) +++ gcc-interface/utils.c (working copy) @@ -231,6 +231,15 @@ init_gnat_to_gnu (void) associate_gnat_to_gnu = ggc_alloc_cleared_vec_tree (max_gnat_nodes); } +/* Destroy the association of GNAT nodes to GCC trees. */ + +void +destroy_gnat_to_gnu (void) +{ + ggc_free (associate_gnat_to_gnu); + associate_gnat_to_gnu = NULL; +} + /* GNAT_ENTITY is a GNAT tree node for an entity. Associate GNU_DECL, a GCC tree node, with GNAT_ENTITY. If GNU_DECL is not a ..._DECL node, abort. If NO_CHECK is true, the latter check is suppressed. @@ -280,6 +289,15 @@ init_dummy_type (void) dummy_node_table = ggc_alloc_cleared_vec_tree (max_gnat_nodes); } +/* Destroy the association of GNAT nodes to GCC trees as dummies. */ + +void +destroy_dummy_type (void) +{ + ggc_free (dummy_node_table); + dummy_node_table = NULL; +} + /* Make a dummy type corresponding to GNAT_TYPE. */ tree Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 186945) +++ gcc-interface/decl.c (working copy) @@ -5838,44 +5838,6 @@ elaborate_entity (Entity_Id gnat_entity) } } -/* Mark GNAT_ENTITY as going out of scope at this point. Recursively mark - any entities on its entity chain similarly. */ - -void -mark_out_of_scope (Entity_Id gnat_entity) -{ - Entity_Id gnat_sub_entity; - unsigned int kind = Ekind (gnat_entity); - - /* If this has an entity list, process all in the list. */ - if (IN (kind, Class_Wide_Kind) || IN (kind, Concurrent_Kind) - || IN (kind, Private_Kind) - || kind == E_Block || kind == E_Entry || kind == E_Entry_Family - || kind == E_Function || kind == E_Generic_Function - || kind == E_Generic_Package || kind == E_Generic_Procedure - || kind == E_Loop || kind == E_Operator || kind == E_Package - || kind == E_Package_Body || kind == E_Procedure - || kind == E_Record_Type || kind == E_Record_Subtype - || kind == E_Subprogram_Body || kind == E_Subprogram_Type) -for (gnat_sub_entity = First_Entity (gnat_entity); - Present (gnat_sub_entity); - gnat_sub_entity = Next_Entity (gnat_sub_entity)) - if (Scope (gnat_sub_entity) == gnat_entity - gnat_sub_entity != gnat_entity) - mark_out_of_scope (gnat_sub_entity); - - /* Now clear this if it has been defined, but only do so if it isn't - a subprogram or parameter. We could refine this, but it isn't - worth it. If this is statically allocated, it is supposed to - hang around out of cope. */ - if (present_gnu_tree (gnat_entity) !Is_Statically_Allocated (gnat_entity) - kind != E_Procedure kind != E_Function !IN (kind, Formal_Kind)) -{ - save_gnu_tree (gnat_entity, NULL_TREE, true); - save_gnu_tree (gnat_entity, error_mark_node, true); -} -} - /* Relate the alias sets of GNU_NEW_TYPE and GNU_OLD_TYPE according to OP. If this is a multi-dimensional array type, do this recursively. Index: gcc-interface/gigi.h
Re: [PATCH] x86: emit tzcnt unconditionally
On Fri, Apr 27, 2012 at 3:30 PM, Paolo Bonzini bonz...@gnu.org wrote: tzcnt is encoded as rep;bsf and unlike lzcnt is a drop-in replacement if we don't care about the flags (it has the same semantics for non-zero values). Since bsf is usually slower, just emit tzcnt unconditionally. However, write it as rep;bsf unless -mbmi is in use, to cater for old assemblers. Please emit rep;bsf when optimize_insn_for_speed_p () is true. Bootstrapped on a non-BMI x86_64-linux host, regtest in progress. Ok for mainline? OK with the optimize_insn_for_speed_p conditional. Thanks, Uros.
[Ada] Fix fallout of bitfields ABI change on Windows
The MS bitfields ABI and its stringent rules totally break packing in Ada, so we cannot use the C layout in this case. The i386.c hunk fixes an oversight which would result in gazillions of warnings for QUAL_UNION_TYPE. Tested on i586-suse-linux and i686-pc-mingw32, applied on the mainline and the 4.7 branch (as obvious for the i386.c hunk). 2012-04-30 Eric Botcazou ebotca...@adacore.com * config/i386/i386.c (ix86_handle_struct_attribute): Use the proper predicate to discriminate types. 2012-04-30 Eric Botcazou ebotca...@adacore.com * gcc-interface/utils.c (finish_record_type): Force the traditional GCC layout for bitfields on the type if it is packed or has a representation clause and an alternate layout is available. -- Eric Botcazou Index: config/i386/i386.c === --- config/i386/i386.c (revision 186907) +++ config/i386/i386.c (working copy) @@ -32465,8 +32465,7 @@ ix86_handle_struct_attribute (tree *node else type = node; - if (!(type (TREE_CODE (*type) == RECORD_TYPE - || TREE_CODE (*type) == UNION_TYPE))) + if (!(type RECORD_OR_UNION_TYPE_P (*type))) { warning (OPT_Wattributes, %qE attribute ignored, name); Index: ada/gcc-interface/utils.c === --- ada/gcc-interface/utils.c (revision 186956) +++ ada/gcc-interface/utils.c (working copy) @@ -721,6 +721,19 @@ finish_record_type (tree record_type, tr case where there is a rep clause but all fields have errors and no longer have a position. */ TYPE_SIZE (record_type) = 0; + + /* Ensure we use the traditional GCC layout for bitfields when we need + to pack the record type or have a representation clause. The other + possible layout (Microsoft C compiler), if available, would prevent + efficient packing in almost all cases. */ +#ifdef TARGET_MS_BITFIELD_LAYOUT + if (TARGET_MS_BITFIELD_LAYOUT TYPE_PACKED (record_type)) + decl_attributes (record_type, + tree_cons (get_identifier (gcc_struct), +NULL_TREE, NULL_TREE), + ATTR_FLAG_TYPE_IN_PLACE); +#endif + layout_type (record_type); }
[Ada] Fix issues with rep clauses in -gnatct mode
In -gnatct mode, the compiler doesn't generate code and gigi is only invoked to lay out and back-annotate types. There were old issues when representation clauses are present: fields are shifted, sizes are wrong, component clauses are overridden. Tested on i586-suse-linux, applied on the mainline. 2012-04-30 Eric Botcazou ebotca...@adacore.com * gcc-interface/decl.c (gnat_to_gnu_entity): In type annotation mode, do not adjust the size of a tagged type if there is a representation clause on it. Otherwise, round the adjustment up to the alignment of the first field and use the appropriate helper routine. (maybe_pad_type): Do not warn in type annotation mode on a tagged type. (gnat_to_gnu_field): Do not error out under the same circumstances. (annotate_rep): In type annotation mode, do not adjust the offset of components of a tagged type with representation clause. Otherwise, round the adjustment up to the alignment of the first field. -- Eric Botcazou Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 186956) +++ gcc-interface/decl.c (working copy) @@ -5027,28 +5027,33 @@ gnat_to_gnu_entity (Entity_Id gnat_entit if (CONTAINS_PLACEHOLDER_P (gnu_size)) gnu_size = max_size (gnu_size, true); - if (type_annotate_only Is_Tagged_Type (gnat_entity)) + /* If we are just annotating types and the type is tagged, the tag + and the parent components are not generated by the front-end so + sizes must be adjusted if there is no representation clause. */ + if (type_annotate_only + Is_Tagged_Type (gnat_entity) + !VOID_TYPE_P (gnu_type) + (!TYPE_FIELDS (gnu_type) + || integer_zerop (bit_position (TYPE_FIELDS (gnu_type) { - /* In this mode, the tag and the parent components are not - generated by the front-end so the sizes must be adjusted. */ tree pointer_size = bitsize_int (POINTER_SIZE), offset; Uint uint_size; if (Is_Derived_Type (gnat_entity)) { - offset = UI_To_gnu (Esize (Etype (Base_Type (gnat_entity))), - bitsizetype); - Set_Alignment (gnat_entity, - Alignment (Etype (Base_Type (gnat_entity; + Entity_Id gnat_parent = Etype (Base_Type (gnat_entity)); + offset = UI_To_gnu (Esize (gnat_parent), bitsizetype); + Set_Alignment (gnat_entity, Alignment (gnat_parent)); } else offset = pointer_size; + if (TYPE_FIELDS (gnu_type)) + offset + = round_up (offset, DECL_ALIGN (TYPE_FIELDS (gnu_type))); + gnu_size = size_binop (PLUS_EXPR, gnu_size, offset); - gnu_size = size_binop (MULT_EXPR, pointer_size, - size_binop (CEIL_DIV_EXPR, - gnu_size, - pointer_size)); + gnu_size = round_up (gnu_size, POINTER_SIZE); uint_size = annotate_value (gnu_size); Set_Esize (gnat_entity, uint_size); Set_RM_Size (gnat_entity, uint_size); @@ -6619,7 +6624,9 @@ maybe_pad_type (tree type, tree size, un /* If the size was widened explicitly, maybe give a warning. Take the original size as the maximum size of the input if there was an unconstrained record involved and round it up to the specified alignment, - if one was specified. */ + if one was specified. But don't do it if we are just annotating types + and the type is tagged, since tagged types aren't fully laid out in this + mode. */ if (CONTAINS_PLACEHOLDER_P (orig_size)) orig_size = max_size (orig_size, true); @@ -6635,7 +6642,8 @@ maybe_pad_type (tree type, tree size, un TREE_CODE (orig_size) == INTEGER_CST (TREE_OVERFLOW (size) || TREE_OVERFLOW (orig_size) - || tree_int_cst_lt (size, orig_size + || tree_int_cst_lt (size, orig_size))) + !(type_annotate_only Is_Tagged_Type (Etype (gnat_entity { Node_Id gnat_error_node = Empty; @@ -6901,10 +6909,13 @@ gnat_to_gnu_field (Entity_Id gnat_field, } } - /* If this field needs strict alignment, ensure the record is - sufficiently aligned and that that position and size are - consistent with the alignment. */ - if (needs_strict_alignment) + /* If this field needs strict alignment, check that the record is + sufficiently aligned and that position and size are consistent + with the alignment. But don't do it if we are just annotating + types and the field's type is tagged, since tagged types aren't + fully laid out in this mode. */ + if (needs_strict_alignment + !(type_annotate_only Is_Tagged_Type (gnat_field_type))) { TYPE_ALIGN (gnu_record_type) = MAX (TYPE_ALIGN (gnu_record_type), TYPE_ALIGN (gnu_field_type)); @@ -7839,12 +7850,16 @@ annotate_rep (Entity_Id gnat_entity, tre { tree parent_offset; - if (type_annotate_only Is_Tagged_Type (gnat_entity)) + /* If we are just annotating types
Re: [ARM] Add atomic_loaddi pattern
On 27/04/12 22:30, Richard Henderson wrote: We can perform a single-copy atomic load with an ldrexd insn. If the load is all we care about, we need not pair this with a strexd. Ok? r~ d-arm-ldi * config/arm/arm.md (UNSPEC_LL): New. * config/arm/sync.md (atomic_loaddi, atomic_loaddi_1): New. diff --git a/gcc/config/arm/sync.md b/gcc/config/arm/sync.md index 03838f5..de2da3b 100644 --- a/gcc/config/arm/sync.md +++ b/gcc/config/arm/sync.md +operands[2] = gen_rtx_REG (SImode, REGNO (target) + 1); +return ldrexd%?\t%0, %2, %C1; + } Use ldrexd%?\t%0, %H0, %C1, then you don't need to construct operands[2] Otherwise, OK. R.
Re: [ARM] Add atomic_loaddi pattern
On 04/27/2012 10:30 PM, Richard Henderson wrote: We can perform a single-copy atomic load with an ldrexd insn. If the load is all we care about, we need not pair this with a strexd. Can we? It's good to know. I have had a long email exchange with engineers at ARM, and they would not say that this was safe. If they have changed their mind, I'd like to see chapter and verse. Andrew.
Re: Vector subscripts in C++
Ping? Also added Richard in Cc: as the author of the C front-end code I am sharing with the C++ front-end. Looks like I forgot to say it here (was in bugzilla), but the patch was tested in a c,c++ bootstrap + make -k check on linux x86_64. Reattaching the patch (now using the recommended diff options). On Tue, 17 Apr 2012, Marc Glisse wrote: Hello, this patch adds vector subscripting to C++ by reusing the C code. build_array_ref and cp_build_array_ref could probably share more, but I don't understand them enough to do it. (note that I can't commit, so if you like the patch...) gcc/cp/ChangeLog 2012-04-17 Marc Glisse marc.gli...@inria.fr PR c++/51033 * typeck.c (cp_build_array_ref): Handle VECTOR_TYPE. * decl2.c (grok_array_decl): Likewise. gcc/c-family/ChangeLog 2012-04-17 Marc Glisse marc.gli...@inria.fr PR c++/51033 * c-common.c (convert_vector_to_pointer_for_subscript): New function. * c-common.h (convert_vector_to_pointer_for_subscript): Declare it. gcc/ChangeLog 2012-04-17 Marc Glisse marc.gli...@inria.fr PR c++/51033 * c-typeck.c (build_array_ref): Call convert_vector_to_pointer_for_subscript. * doc/extend.texi (Vector Extensions): Subscripting not just for C. gcc/testsuite/ChangeLog 2012-04-17 Marc Glisse marc.gli...@inria.fr PR c++/51033 * gcc.dg/vector-1.c: Move to ... * c-c++-common/vector-1.c: ... here. * gcc.dg/vector-2.c: Move to ... * c-c++-common/vector-2.c: ... here. * gcc.dg/vector-3.c: Move to ... * c-c++-common/vector-3.c: ... here. Adapt to C++. * gcc.dg/vector-4.c: Move to ... * c-c++-common/vector-4.c: ... here. * gcc.dg/vector-init-1.c: Move to ... * c-c++-common/vector-init-1.c: ... here. * gcc.dg/vector-init-2.c: Move to ... * c-c++-common/vector-init-2.c: ... here. * gcc.dg/vector-subscript-1.c: Move to ... Adapt to C++. * c-c++-common/vector-subscript-1.c: ... here. * gcc.dg/vector-subscript-2.c: Move to ... * c-c++-common/vector-subscript-2.c: ... here. * gcc.dg/vector-subscript-3.c: Move to ... * c-c++-common/vector-subscript-3.c: ... here. -- Marc GlisseIndex: c-typeck.c === --- c-typeck.c (revision 186961) +++ c-typeck.c (working copy) @@ -2338,30 +2338,11 @@ build_array_ref (location_t loc, tree ar /* Apply default promotions *after* noticing character types. */ index = default_conversion (index); gcc_assert (TREE_CODE (TREE_TYPE (index)) == INTEGER_TYPE); - /* For vector[index], convert the vector to a - pointer of the underlying type. */ - if (TREE_CODE (TREE_TYPE (array)) == VECTOR_TYPE) -{ - tree type = TREE_TYPE (array); - tree type1; - - if (TREE_CODE (index) == INTEGER_CST) -if (!host_integerp (index, 1) -|| ((unsigned HOST_WIDE_INT) tree_low_cst (index, 1) - = TYPE_VECTOR_SUBPARTS (TREE_TYPE (array - warning_at (loc, OPT_Warray_bounds, index value is out of bound); - - c_common_mark_addressable_vec (array); - type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type)); - type = build_pointer_type (type); - type1 = build_pointer_type (TREE_TYPE (array)); - array = build1 (ADDR_EXPR, type1, array); - array = convert (type, array); -} + convert_vector_to_pointer_for_subscript (loc, array, index); if (TREE_CODE (TREE_TYPE (array)) == ARRAY_TYPE) { tree rval, type; Index: c-family/c-common.c === --- c-family/c-common.c (revision 186961) +++ c-family/c-common.c (working copy) @@ -10831,6 +10831,31 @@ build_userdef_literal (tree suffix_id, t USERDEF_LITERAL_VALUE (literal) = value; USERDEF_LITERAL_NUM_STRING (literal) = num_string; return literal; } +/* For vector[index], convert the vector to a + pointer of the underlying type. */ +void +convert_vector_to_pointer_for_subscript (location_t loc, tree* vecp, tree index) +{ + if (TREE_CODE (TREE_TYPE (*vecp)) == VECTOR_TYPE) +{ + tree type = TREE_TYPE (*vecp); + tree type1; + + if (TREE_CODE (index) == INTEGER_CST) +if (!host_integerp (index, 1) +|| ((unsigned HOST_WIDE_INT) tree_low_cst (index, 1) + = TYPE_VECTOR_SUBPARTS (type))) + warning_at (loc, OPT_Warray_bounds, index value is out of bound); + + c_common_mark_addressable_vec (*vecp); + type = build_qualified_type (TREE_TYPE (type), TYPE_QUALS (type)); + type = build_pointer_type (type); + type1 = build_pointer_type (TREE_TYPE (*vecp)); + *vecp = build1 (ADDR_EXPR, type1, *vecp); + *vecp = convert (type, *vecp); +} +} + #include gt-c-family-c-common.h Index: c-family/c-common.h
Re: [ARM] Add atomic_loaddi pattern
On 30/04/12 09:51, Andrew Haley wrote: On 04/27/2012 10:30 PM, Richard Henderson wrote: We can perform a single-copy atomic load with an ldrexd insn. If the load is all we care about, we need not pair this with a strexd. Can we? It's good to know. I have had a long email exchange with engineers at ARM, and they would not say that this was safe. If they have changed their mind, I'd like to see chapter and verse. Andrew. The ARM ARM lists a number of single-copy atomic operations. For v7, the list includes: - memory accesses caused by LDREXD and STREXD instructions to doubleword-aligned locations. Of course, there is a potential performance issue from heavy use of the global monitor to ensure coherency. R.
Re: [ARM] Add atomic_loaddi pattern
On 04/30/2012 11:50 AM, Richard Earnshaw wrote: On 30/04/12 09:51, Andrew Haley wrote: On 04/27/2012 10:30 PM, Richard Henderson wrote: We can perform a single-copy atomic load with an ldrexd insn. If the load is all we care about, we need not pair this with a strexd. Can we? It's good to know. I have had a long email exchange with engineers at ARM, and they would not say that this was safe. If they have changed their mind, I'd like to see chapter and verse. The ARM ARM lists a number of single-copy atomic operations. For v7, the list includes: - memory accesses caused by LDREXD and STREXD instructions to doubleword-aligned locations. Of course, there is a potential performance issue from heavy use of the global monitor to ensure coherency. Aha! That is excellent news. that none of us managed to discover. I shall immediately change the code we use in OpenJDK. Thanks, Andrew.
Re: Patches to enable -ftrack-macro-expansion by default
Dodji Seketeli do...@redhat.com writes: I am proposing a series of patches which is supposed to address the remaining issues (I am aware of) preventing us from enabling the -ftrack-macro-expansion by default. The idea is to address each issue I notice in the course of trying to bootstrap the compiler and running the tests with -ftrack-macro-expansion enabled. Beside the fixes, I ended up disabling the -ftrack-macro-expansion for many test cases (sometimes globally in the dg-*.exp files, or on a case by case basis), because that option changes the compiler output and so requires that I either adapt the test case or disable the option. For other tests, I chose to adapt the test case. I have finally applied this series of 14 patches to the mainline today. The SVN revisions are from r186965 to r186978. Thank you all for your time and patience while reviewing this patch-set. -- Dodji
[Ping] [PATCH H8300] Add function_vector attribute
Hi, This is a ping for the patch that add 'function_vector' attribute for H8300 targets: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg00250.html The patch has been updated to address Jeff's comments which can be referred at: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01884.html Please review the patch. Thanks Regards, Ajinkya Dhobale KPIT Cummins, Pune.
Re: [PATCH] Improve andq $0xffffffff, %reg handling (PR target/53110)
On Thu, Apr 26, 2012 at 5:34 PM, Uros Bizjak ubiz...@gmail.com wrote: On Thu, Apr 26, 2012 at 5:28 PM, Jakub Jelinek ja...@redhat.com wrote: We have a splitter for reg1 = reg2 0x, but only if regnums are different. But movl %edi, %edi is a cheaper variant of andq $0x, %rdi even with the same register and doesn't clobber flags, so this patch attempts to expand it as a zero extension early. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2012-04-25 Jakub Jelinek ja...@redhat.com PR target/53110 * config/i386/i386.md (andmode3): For andq $0x, reg instead expand it as zero extension. Perhaps we can go all the way and also convert ANDs with $0xff and $0x to relevant zero_extend patterns, like in the referred splitter. I wasn't sure about 0x, on which CPUs it would be a win and on which it would not. 0x is a win always. It is always a win, especially when loading from memory. In the latter case, no execution units are used, only load (memory read) unit. My recent changes to zero_extend expanders should handle this automatically, and will undo generation of zero_extend pattern. Please see zero_extendmodesi2_and expander, and how it handles TARGET_ZERO_EXTEND_WITH_AND targets. Attached patch implements this idea. In addition, it fixes the splitter to not change output mode of zero_extension from HImode and QImode from DImode to SImode. Although they generate the same instruction, I think we should better keep original mode here. 2012-04-30 Uros Bizjak ubiz...@gmail.com * config/i386/i386.md (andmode3): Expand masking operations with 0xff, 0x or 0x immediates to corresponding zero_extend RTX. (and splitter): Split to DImode zero_extend RTX for DImode operand[0]. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and committed to mainline SVN. Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 186954) +++ config/i386/i386.md (working copy) @@ -7695,14 +7695,45 @@ (match_operand:SWIM 2 general_szext_operand)))] { - if (MODEmode == DImode - GET_CODE (operands[2]) == CONST_INT - INTVAL (operands[2]) == (HOST_WIDE_INT) 0x - REG_P (operands[1])) -emit_insn (gen_zero_extendsidi2 (operands[0], -gen_lowpart (SImode, operands[1]))); + enum machine_mode mode = GET_MODE (operands[1]); + rtx (*insn) (rtx, rtx); + + if (CONST_INT_P (operands[2]) REG_P (operands[0])) +{ + HOST_WIDE_INT ival = INTVAL (operands[2]); + + if (ival == (HOST_WIDE_INT) 0x) + mode = SImode; + else if (ival == 0x) + mode = HImode; + else if (ival == 0xff) + mode = QImode; + } + + if (mode == GET_MODE (operands[1])) +{ + ix86_expand_binary_operator (AND, MODEmode, operands); + DONE; +} + + operands[1] = gen_lowpart (mode, operands[1]); + + if (GET_MODE (operands[0]) == DImode) +insn = (mode == SImode) + ? gen_zero_extendsidi2 + : (mode == HImode) + ? gen_zero_extendhidi2 + : gen_zero_extendqidi2; + else if (GET_MODE (operands[0]) == SImode) +insn = (mode == HImode) + ? gen_zero_extendhisi2 + : gen_zero_extendqisi2; + else if (GET_MODE (operands[0]) == HImode) +insn = gen_zero_extendqihi2; else -ix86_expand_binary_operator (AND, MODEmode, operands); +gcc_unreachable (); + + emit_insn (insn (operands[0], operands[1])); DONE; }) @@ -7839,32 +7870,38 @@ true_regnum (operands[0]) != true_regnum (operands[1]) [(const_int 0)] { + HOST_WIDE_INT ival = INTVAL (operands[2]); enum machine_mode mode; + rtx (*insn) (rtx, rtx); - if (INTVAL (operands[2]) == (HOST_WIDE_INT) 0x) + if (ival == (HOST_WIDE_INT) 0x) mode = SImode; - else if (INTVAL (operands[2]) == 0x) + else if (ival == 0x) mode = HImode; else { - gcc_assert (INTVAL (operands[2]) == 0xff); + gcc_assert (ival == 0xff); mode = QImode; } operands[1] = gen_lowpart (mode, operands[1]); - if (mode == SImode) -emit_insn (gen_zero_extendsidi2 (operands[0], operands[1])); + if (GET_MODE (operands[0]) == DImode) +insn = (mode == SImode) + ? gen_zero_extendsidi2 + : (mode == HImode) + ? gen_zero_extendhidi2 + : gen_zero_extendqidi2; else { - rtx (*insn) (rtx, rtx); - /* Zero extend to SImode to avoid partial register stalls. */ operands[0] = gen_lowpart (SImode, operands[0]); - insn = (mode == HImode) ? gen_zero_extendhisi2 : gen_zero_extendqisi2; - emit_insn (insn (operands[0], operands[1])); + insn = (mode == HImode) +? gen_zero_extendhisi2 +: gen_zero_extendqisi2; } + emit_insn
Re: [PATCH] Improve andq $0xffffffff, %reg handling (PR target/53110)
On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote: My recent changes to zero_extend expanders should handle this automatically, and will undo generation of zero_extend pattern. Please see zero_extendmodesi2_and expander, and how it handles TARGET_ZERO_EXTEND_WITH_AND targets. Attached patch implements this idea. In addition, it fixes the splitter to not change output mode of zero_extension from HImode and QImode from DImode to SImode. Although they generate the same instruction, I think we should better keep original mode here. Thanks. I was trying this morning slightly different patch for the same, but strangely it failed bootstrap, and didn't get around to analysing why a mem store had (zero_extend (subreg (reg))) on a RHS. + operands[1] = gen_lowpart (mode, operands[1]); + + if (GET_MODE (operands[0]) == DImode) +insn = (mode == SImode) +? gen_zero_extendsidi2 +: (mode == HImode) +? gen_zero_extendhidi2 +: gen_zero_extendqidi2; + else if (GET_MODE (operands[0]) == SImode) +insn = (mode == HImode) +? gen_zero_extendhisi2 +: gen_zero_extendqisi2; + else if (GET_MODE (operands[0]) == HImode) +insn = gen_zero_extendqihi2; else -ix86_expand_binary_operator (AND, MODEmode, operands); +gcc_unreachable (); + + emit_insn (insn (operands[0], operands[1])); IMHO you should use MODEmode instead of GET_MODE (operands[0]) in all of the above, then the compiler can actually optimize it at compile time. Jakub
Re: [PATCH] pr51020 Fix invalid options validation for ARM target
On 28/04/12 11:04, Alexey Kravets wrote: Hi guys, Please, take a look at this patch. It fixes the invalid star symbol processing in validate_switches function reported in GCC bugzilla: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51020 With this patch invalid options are no longer accepted by the compiler (see new testcase for more details). It showed no regressions on GCC-4.6.3 on ARM. ChangeLog: * gcc/gcc.c(validate_switches): Reset starred flag. * gcc/testsuite/gcc.target/arm/pr51020.c: New test. Why have you created an arm-specific test for something that, at least on initial examination, is a generic issue? R.
Re: [PATCH] Improve andq $0xffffffff, %reg handling (PR target/53110)
On Mon, Apr 30, 2012 at 3:10 PM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote: My recent changes to zero_extend expanders should handle this automatically, and will undo generation of zero_extend pattern. Please see zero_extendmodesi2_and expander, and how it handles TARGET_ZERO_EXTEND_WITH_AND targets. Attached patch implements this idea. In addition, it fixes the splitter to not change output mode of zero_extension from HImode and QImode from DImode to SImode. Although they generate the same instruction, I think we should better keep original mode here. Thanks. I was trying this morning slightly different patch for the same, but strangely it failed bootstrap, and didn't get around to analysing why a mem store had (zero_extend (subreg (reg))) on a RHS. Maybe it was due to slightly wrong splitter? I was changing both, the expander and the splitter in parallel, and didn't see any failure you mentioned... + operands[1] = gen_lowpart (mode, operands[1]); + + if (GET_MODE (operands[0]) == DImode) + insn = (mode == SImode) + ? gen_zero_extendsidi2 + : (mode == HImode) + ? gen_zero_extendhidi2 + : gen_zero_extendqidi2; + else if (GET_MODE (operands[0]) == SImode) + insn = (mode == HImode) + ? gen_zero_extendhisi2 + : gen_zero_extendqisi2; + else if (GET_MODE (operands[0]) == HImode) + insn = gen_zero_extendqihi2; else - ix86_expand_binary_operator (AND, MODEmode, operands); + gcc_unreachable (); + + emit_insn (insn (operands[0], operands[1])); IMHO you should use MODEmode instead of GET_MODE (operands[0]) in all of the above, then the compiler can actually optimize it at compile time. You are right, I will change this in an incremental patch. Thanks, Uros.
[PATCH] Don't ignore compute_all_dependences failures in phiopt (PR tree-optimization/53163)
Hi! compute_all_dependences in 4.7+ can fail, and cond_if_else_store_replacement isn't prepared to handle the chrec_dont_know DDR added there in case of failure (with NULL DDR_A/DDR_B). Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.7? BTW, tree-ssa-loop-prefetch.c seems to have the same problem, but no idea how that should be handled in there... 2012-04-30 Jakub Jelinek ja...@redhat.com PR tree-optimization/53163 * tree-ssa-phiopt.c (cond_if_else_store_replacement): Don't ignore return value from compute_all_dependences. * gcc.c-torture/compile/pr53163.c: New test. --- gcc/tree-ssa-phiopt.c.jj2012-04-30 08:06:18.0 +0200 +++ gcc/tree-ssa-phiopt.c 2012-04-30 08:59:16.726488101 +0200 @@ -1624,8 +1624,17 @@ cond_if_else_store_replacement (basic_bl /* Compute and check data dependencies in both basic blocks. */ then_ddrs = VEC_alloc (ddr_p, heap, 1); else_ddrs = VEC_alloc (ddr_p, heap, 1); - compute_all_dependences (then_datarefs, then_ddrs, NULL, false); - compute_all_dependences (else_datarefs, else_ddrs, NULL, false); + if (!compute_all_dependences (then_datarefs, then_ddrs, NULL, false) + || !compute_all_dependences (else_datarefs, else_ddrs, NULL, false)) +{ + free_dependence_relations (then_ddrs); + free_dependence_relations (else_ddrs); + free_data_refs (then_datarefs); + free_data_refs (else_datarefs); + VEC_free (gimple, heap, then_stores); + VEC_free (gimple, heap, else_stores); + return false; +} blocks[0] = then_bb; blocks[1] = else_bb; blocks[2] = join_bb; --- gcc/testsuite/gcc.c-torture/compile/pr53163.c.jj2012-04-30 09:21:40.950512562 +0200 +++ gcc/testsuite/gcc.c-torture/compile/pr53163.c 2012-04-30 09:21:20.0 +0200 @@ -0,0 +1,34 @@ +/* PR tree-optimization/53163 */ + +struct S { int s; } b, f; +int a, c; + +void +foo (void) +{ + int d, e; + for (d = 4; d 19; ++d) +for (e = 2; e = 0; e--) + { + a = 0; + a = 1; + } +} + +void +bar (void) +{ + int g, h, i; + for (i = 1; i = 0; i--) +{ + b = f; + for (g = 0; g = 1; g++) + { + if (c) + break; + for (h = 0; h = 1; h++) + foo (); + foo (); + } +} +} Jakub
[PATCH] Fix REE (PR rtl-optimization/53160)
Hi! As shown on the testcase below, if REE modifies some sign/zero extension insn, which is on the candidate vector, as a def_insn of some other extension, before combine_reaching_defs is called on that insn, we might ignore the changes done to that insn and just assume it is only an extension to the recorded mode. It might be an extension to a wider mode though. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.7? 2012-04-30 Jakub Jelinek ja...@redhat.com PR rtl-optimization/53160 * ree.c (combine_reaching_defs): Handle the case where cand-insn has been modified by ree pass already. * gcc.c-torture/execute/pr53160.c: New test. --- gcc/ree.c.jj2012-01-31 11:04:34.0 +0100 +++ gcc/ree.c 2012-04-30 11:24:02.682520157 +0200 @@ -667,6 +667,24 @@ combine_reaching_defs (ext_cand *cand, c if (!outcome) return false; + /* If cand-insn has been already modified, update cand-mode to a wider + mode if possible, or punt. */ + if (state-modified[INSN_UID (cand-insn)].kind != EXT_MODIFIED_NONE) +{ + enum machine_mode mode; + rtx set; + + if (state-modified[INSN_UID (cand-insn)].kind + != (cand-code == ZERO_EXTEND + ? EXT_MODIFIED_ZEXT : EXT_MODIFIED_SEXT) + || state-modified[INSN_UID (cand-insn)].mode != cand-mode + || (set = single_set (cand-insn)) == NULL_RTX) + return false; + mode = GET_MODE (SET_DEST (set)); + gcc_assert (GET_MODE_SIZE (mode) = GET_MODE_SIZE (cand-mode)); + cand-mode = mode; +} + merge_successful = true; /* Go through the defs vector and try to merge all the definitions --- gcc/testsuite/gcc.c-torture/execute/pr53160.c.jj2012-04-30 11:34:33.807798084 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr53160.c 2012-04-30 11:34:18.0 +0200 @@ -0,0 +1,35 @@ +/* PR rtl-optimization/53160 */ + +extern void abort (void); + +int a, c = 1, d, e, g; +volatile int b; +volatile char f; +long h; +short i; + +void +foo (void) +{ + for (e = 0; e; ++e) +; +} + +int +main () +{ + if (g) +(void) b; + foo (); + for (d = 0; d = 0; d--) +{ + short j = f; + int k = 0; + i = j ? j : j k; +} + h = c == 0 ? 0 : i; + a = h; + if (a != 0) +abort (); + return 0; +} Jakub
Re: [Patch, ARM][1/2] add ldm_stm_operation_p
On 24/04/12 17:26, Greta Yorsh wrote: Move the code of the special predicates load_multiple_operation and store_multiple_operation into a separate function. No change in functionality. gcc/ChangeLog 2012-04-24 Ian Bolton ian.bolton at arm.com Sameera Deshpande sameera.deshpande at arm.com Greta Yorsh greta.yorsh at arm.com * config/arm/arm-protos.h (ldm_stm_operation_p): New declaration. * config/arm/arm.c (ldm_stm_operation_p): New function. * config/arm/predicates.md (load_multiple_operation): Update predicate. (store_multiple_operation): Likewise. Thanks, I've committed this. R. 1-predicate.patch.txt diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 900d09a..7da0e90 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -62,6 +62,7 @@ extern bool arm_legitimize_reload_address (rtx *, enum machine_mode, int, int, extern rtx thumb_legitimize_reload_address (rtx *, enum machine_mode, int, int, int); extern int thumb1_legitimate_address_p (enum machine_mode, rtx, int); +extern bool ldm_stm_operation_p (rtx, bool); extern int arm_const_double_rtx (rtx); extern int neg_const_double_rtx_ok_for_fpa (rtx); extern int vfp3_const_double_rtx (rtx); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e5779ce..74f4abf 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -10138,6 +10138,150 @@ adjacent_mem_locations (rtx a, rtx b) return 0; } +/* Return true if OP is a valid load or store multiple operation. LOAD is true + for load operations, false for store operations. + The pattern we are trying to match for load is: + [(SET (R_d0) (MEM (PLUS (addr) (offset + (SET (R_d1) (MEM (PLUS (addr) (offset + reg_increment + : + : + (SET (R_dn) (MEM (PLUS (addr) (offset + n * reg_increment + ] + where + 1. If offset is 0, first insn should be (SET (R_d0) (MEM (src_addr))). + 2. REGNO (R_d0) REGNO (R_d1) ... REGNO (R_dn). + 3. If consecutive is TRUE, then for kth register being loaded, + REGNO (R_dk) = REGNO (R_d0) + k. + The pattern for store is similar. */ +bool +ldm_stm_operation_p (rtx op, bool load) +{ + HOST_WIDE_INT count = XVECLEN (op, 0); + rtx reg, mem, addr; + unsigned regno; + HOST_WIDE_INT i = 1, base = 0, offset = 0; + rtx elt; + bool addr_reg_in_reglist = false; + bool update = false; + int reg_increment; + int offset_adj; + + reg_increment = 4; + offset_adj = 0; + + if (count = 1 + || GET_CODE (XVECEXP (op, 0, offset_adj)) != SET + || (load !REG_P (SET_DEST (XVECEXP (op, 0, offset_adj) +return false; + + /* Check if this is a write-back. */ + elt = XVECEXP (op, 0, offset_adj); + if (GET_CODE (SET_SRC (elt)) == PLUS) +{ + i++; + base = 1; + update = true; + + /* The offset adjustment must be the number of registers being + popped times the size of a single register. */ + if (!REG_P (SET_DEST (elt)) + || !REG_P (XEXP (SET_SRC (elt), 0)) + || (REGNO (SET_DEST (elt)) != REGNO (XEXP (SET_SRC (elt), 0))) + || !CONST_INT_P (XEXP (SET_SRC (elt), 1)) + || INTVAL (XEXP (SET_SRC (elt), 1)) != + ((count - 1 - offset_adj) * reg_increment)) +return false; +} + + i = i + offset_adj; + base = base + offset_adj; + /* Perform a quick check so we don't blow up below. */ + if (count = i) +return false; + + elt = XVECEXP (op, 0, i - 1); + if (GET_CODE (elt) != SET) +return false; + + if (load) +{ + reg = SET_DEST (elt); + mem = SET_SRC (elt); +} + else +{ + reg = SET_SRC (elt); + mem = SET_DEST (elt); +} + + if (!REG_P (reg) || !MEM_P (mem)) +return false; + + regno = REGNO (reg); + addr = XEXP (mem, 0); + if (GET_CODE (addr) == PLUS) +{ + if (!CONST_INT_P (XEXP (addr, 1))) + return false; + + offset = INTVAL (XEXP (addr, 1)); + addr = XEXP (addr, 0); +} + + if (!REG_P (addr)) +return false; + + for (; i count; i++) +{ + elt = XVECEXP (op, 0, i); + if (GET_CODE (elt) != SET) +return false; + + if (load) +{ + reg = SET_DEST (elt); + mem = SET_SRC (elt); +} + else +{ + reg = SET_SRC (elt); + mem = SET_DEST (elt); +} + + if (!REG_P (reg) + || GET_MODE (reg) != SImode + || REGNO (reg) = regno + || !MEM_P (mem) + || GET_MODE (mem) != SImode + || ((GET_CODE (XEXP (mem, 0)) != PLUS +|| !rtx_equal_p (XEXP (XEXP (mem, 0), 0), addr) +|| !CONST_INT_P (XEXP (XEXP (mem, 0), 1)) +|| (INTVAL (XEXP (XEXP
Re: [Patch, ARM][2/2] generalize ldm_stm_operation_p
On 24/04/12 17:31, Greta Yorsh wrote: Generalize ldm_stm_operation_p with additional parameters that will be used by epilogue patterns: * machine mode to support both SImode and DFmode registers * flag to request consecutive registers in the register list * flag to indicate whether PC in the register list gcc/ChangeLog 2012-04-24 Ian Bolton ian.bolton at arm.com Sameera Deshpande sameera.deshpande at arm.com Greta Yorsh greta.yorsh at arm.com * config/arm/arm-protos.h (ldm_stm_operation_p): New parameters. * config/arm/arm.c (ldm_stm_operation_p): New parameters. * config/arm/predicates.md (load_multiple_operation): Add arguments. (store_multiple_operation): Likewise. Thanks, committed. R.
Re: [patch] More thorough checking in reg_fits_class_p
On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) -in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } + return true; +} + return false; } /* Split single instruction. Helper function for split_all_insns and
Re: [PATCH] Improve andq $0xffffffff, %reg handling (PR target/53110)
On Mon, Apr 30, 2012 at 3:34 PM, Uros Bizjak ubiz...@gmail.com wrote: On Mon, Apr 30, 2012 at 3:10 PM, Jakub Jelinek ja...@redhat.com wrote: On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote: My recent changes to zero_extend expanders should handle this automatically, and will undo generation of zero_extend pattern. Please see zero_extendmodesi2_and expander, and how it handles TARGET_ZERO_EXTEND_WITH_AND targets. Attached patch implements this idea. In addition, it fixes the splitter to not change output mode of zero_extension from HImode and QImode from DImode to SImode. Although they generate the same instruction, I think we should better keep original mode here. Thanks. I was trying this morning slightly different patch for the same, but strangely it failed bootstrap, and didn't get around to analysing why a mem store had (zero_extend (subreg (reg))) on a RHS. Maybe it was due to slightly wrong splitter? I was changing both, the expander and the splitter in parallel, and didn't see any failure you mentioned... In fact, the problem was already in your original patch. There was a check for REG_P (operands[1]), but it should check for REG_P (operands[0]). There is no zero_extended store, only load. Uros.
Re: [patch] More thorough checking in reg_fits_class_p
Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], -mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) +{ + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) +return false; +} This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard
Re: [PATCH][ARM] NEON DImode immediate constants
On 27/04/12 16:17, Richard Earnshaw wrote: This is OK. Thanks, now committed. It would be good to merge all the target32 movdi variants into one pattern and then use alternative enabling to deal with the different valid alternatives. Yes, I'll take a look. Andrew
Re: [patch] More thorough checking in reg_fits_class_p
On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard There's no guarantee that all registers in a set are contiguous; ARM for example doesn't make that guarantee, since SP is not a GP register, but R12 and R14 are. R.
Re: [PATCH][ARM] NEON DImode neg
On 16/04/12 13:41, Richard Earnshaw wrote: P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that the load #0 needs a constant pool, and loading constants this late has a bug at -O0.) neon-neg64.patch 2012-04-12 Andrew Stubbsa...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. OK Thanks, now that the other patch is committed, I've committed this one also. Andrew
Re: fix incorrect SRA transformation on non-integral VIEW_CONVERT argument
Hello Richard, Thanks for the constructive exchange :-) On Apr 26, 2012, at 10:48 , Richard Guenther wrote: In particular, I'm pretty sure that we can get component refs of integral modes that access a smaller range of bits than what the mode conveys. It is common with packing or rep clauses in Ada. Yeah, well - the tree verification code in tree-cfg.c is not enforcing any constraints here and the docs are not clear either. My view is that we don't want the size of the VIEW_CONVERT_EXPR differ from the size of its operand I agree the current situation is not ideal. What you suggest corresponds to what is currently documented (difference = undefined behavior), but I'm pretty sure that the Ada compiler relies on some cases to behave in some specific manner. For tagged types IIRC. It would certainly be nice to get rid of this discrepancy at some point. The issue I was trying to address was a bit different: SRA changing the VCE argument access size, which seems incorrect regardless of the source destination size consistency. I can see how they are connected though and will see if I can come up with a different approach. [...] Naively a VIEW_CONVERT_EXPR is *(T *)op? Then a VIEW_CONVERT_EXPR T, op would be the same as MEM_REF T, op, (T' *)0 with the advantage that for the MEM_REF you can specify the alias set that is supposed to be in effect. Similar it would be the same as BIT_FIELD_REF T, op, sizeof (T) * BITS_PER_UNIT, 0. Can you formally relate those three representations and tell me why VIEW_CONVERT_EXPR is useful (not only convenient because of less operands) to use on lvalues (thus memory, compared to registers or constants)? I have ideas on how they are supposed to relate (corresponding to your intuitive understanding, what is documented), but MEM_REF and BIT_FIELD_REF were introduced much more recently though, so I'm pretty sure that there are details that escape me.
Re: [patch] More thorough checking in reg_fits_class_p
On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard Hmm, looking at the comment that precedes the function makes me think the actual implementation should be: { int regno = REGNO (operand) + offset; ... return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (end_hard_regno (regno, mode)) in_hard_reg_set_p (, regno); } That is, the original regno isn't really interesting and what really counts is the range regno + offset ... regno + offset + NUM_HARD_REGS(mode) - 1 R.
Re: Symbol table 6/many: Symbol table hashes
On Mon, Apr 16, 2012 at 9:09 AM, Jan Hubicka hubi...@ucw.cz wrote: Hi, this patch moves cgraph/varpool hashes into symbol table hashes, so the symbol table is actually almost a symbol table ;) Work done. Bootstrapped/regtested x86_64-linux. Will commit it after bit of more testing - the assembler name handling is slipperly wrt aliases. Honza * cgraph.c (cgraph_hash, assembler_name_hash): Remove. (hash_node, eq_node): Remove. (cgraph_create_node): Do not handle hashtable. (cgraph_get_node): Remove. (cgraph_insert_node_to_hashtable): Remove. (hash_node_by_assembler_name): Remove. (eq_assembler_name): Remove. (cgraph_node_for_asm): Rewrite. (cgraph_find_replacement_node): Break out from ... (cgraph_remove_node): ... here; do not maintain hashtables. (change_decl_assembler_name): Remove. (cgraph_clone_node): Do not maintain hashtables. * cgraph.h (const_symtab_node): New typedef. (cgraph_insert_node_to_hashtable): Remove. (symtab_get_node, symtab_node_for_asm, symtab_insert_node_to_hashtable): Declare. (cgraph_find_replacement_node): Declare. (cgraph_get_node, varpool_get_node): Turn into inlines. (cgraph, varpool): Work sanely on NULL pointers. (FOR_EACH_SYMBOL): New walker. * ipa-inline-transform.c (save_inline_function_body): Use symtab_insert_node_to_hashtable. * symtab.c: Include ggc.h and diagnostics.h (symtab_hash, assembler_name_hash): New static vars; (hash_node, eq_node, hash_node_by_assembler_name, eq_assembler_name): New. (symtab_register_node): Update hashtables. (symtab_insert_node_to_hashtable): New. (symtab_unregister_node): Update hashtables. (symtab_get_node): New. (symtab_node_for_asm): New. (change_decl_assembler_name): New. * Makefile.in (symtab.o): Needs GTY. * varpool.c (varpool_hash): Remove. (hash_varpool_node, eq_varpool_node, varpool_get_node): Remove. (varpool_node): Rewrite using varpool_get_node. (varpool_remove_node): DO not maintain hashtables. (varpool_node_for_asm); Rewrite. This caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53161 -- H.J.
Re: [patch] More thorough checking in reg_fits_class_p
Richard Earnshaw rearn...@arm.com writes: On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { +if (!in_hard_reg_set_p (reg_class_contents[(int) cl], +mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard There's no guarantee that all registers in a set are contiguous; ARM for example doesn't make that guarantee, since SP is not a GP register, but R12 and R14 are. Sorry, I don't follow. My point was that in_hard_reg_set_p (C, M, R1) tests whether every register required to store a value of mode M starting at R1 fits in C. Which is what we want to know. Whether the intermediate registers (between regno and regno + offset) are even valid for MODE shouldn't matter. I don't think it makes conceptual sense to call: if (!in_hard_reg_set_p (reg_class_contents[(int) cl], mode, i)) for regno i regno + offset (or regno + offset i regno), because we're not trying to construct a value of mode MODE in that register. Richard
Re: [patch] More thorough checking in reg_fits_class_p
On 30/04/12 15:39, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) -in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard There's no guarantee that all registers in a set are contiguous; ARM for example doesn't make that guarantee, since SP is not a GP register, but R12 and R14 are. Sorry, I don't follow. My point was that in_hard_reg_set_p (C, M, R1) tests whether every register required to store a value of mode M starting at R1 fits in C. Which is what we want to know. Whether the intermediate registers (between regno and regno + offset) are even valid for MODE shouldn't matter. I don't think it makes conceptual sense to call: if (!in_hard_reg_set_p (reg_class_contents[(int) cl], mode, i)) for regno i regno + offset (or regno + offset i regno), because we're not trying to construct a value of mode MODE in that register. Richard You're right, of course. I'd missed that in my initial review; and hence my follow-up suggestion. It's not particularly interesting whether or not HARD_REGISTER_NUM_P (REGNO (operand)) is true, only whether REGNO(operand) + offset ... REGNO(operand) + offset + NUM_REGS(mode) -1 is. R.
Re: [PATCH, i386] V4DF __builtin_shuffle
Ping? http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01034.html Since then, I've run a c,c++ bootstrap and: make -k check RUNTESTFLAGS=--target_board=my-sde-sim where my-sde-sim is the dejagnu board posted by H.J. Lu to run tests inside Intel's simulator, no difference between before and after my patch. (If I understand correctly, the testsuite always compiles the AVX and AVX2 tests, and uses cpuid (which I expect the simulator must fake) to determine if it should run them, so I don't need to pass any extra flag in RUNTESTFLAGS. If I am wrong, please tell me.) Adding in Cc: the 2 people who kindly commented on the other shuffle patch (the one that isn't finished). On Tue, 17 Apr 2012, Marc Glisse wrote: Hello, this patch expands __builtin_shuffle for V4DF mode in at most 3 insn. It is simple and works really well, often generates only 2 insn. It is not very generic, because other modes don't have an instruction equivalent to vshufpd. For V8SF (and likely V4DI and V8SI with AVX2, but I still need to do that), my patch default case in PR 52607 seems more interesting. I tried calling this new function after expand_vec_perm_vperm2f128_vblend (instead of before as in the patch), but it generated more instructions for some permutations, and never less. That function is still useful for V8SF though. I bootstrapped gcc on a non-avx platform, compiled a program that tests all 4096 shuffles with -mavx/-mavx2, and ran the result using Intel's emulator (SDE). There are still a few V4DF permutations that don't generate an optimal sequence (3 insn instead of 2), but not that many I think. Of course, I am assuming a constant cost of 1 per insn, which is completely false, but seems like a sensible first approximation. (note that I can't commit) 2012-04-17 Marc Glisse marc.gli...@inria.fr PR target/502607 * config/i386/i386.c (ix86_expand_vec_perm_const): Move code to ... (canonicalize_perm): ... new function. (expand_vec_perm_2vperm2f128_vshuf): New function. (ix86_expand_vec_perm_const_1): Call it. -- Marc Glisse
Re: [C++ Patch] sizeof... and parentheses
Ping ? http://gcc.gnu.org/ml/gcc-patches/2012-04/txt00094.txt Adding Gabriel in Cc: in case this falls in the diagnostic category. On Fri, 20 Apr 2012, Marc Glisse wrote: Hello, here is a patch that requires parentheses around the argument of sizeof I am not sure what variadic76.C tests, so I put 2 lines, with parentheses in different places. bootstrapped and regression tested (make -k check, compared to an unpatched version). Note that I can't commit. gcc/cp/ChangeLog 2012-04-20 Marc Glisse marc.gli...@inria.fr PR c++/51314 * parser.c (cp_parser_sizeof_operand): Require parentheses for sizeof... gcc/testsuite/ChangeLog 2012-04-20 Marc Glisse marc.gli...@inria.fr PR c++/51314 * g++.dg/cpp0x/vt-51314.C: New test. * g++.dg/cpp0x/variadic76.C: Fix. -- Marc Glisse
Re: [patch] More thorough checking in reg_fits_class_p
Richard Earnshaw rearn...@arm.com writes: On 30/04/12 15:39, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw rearn...@arm.com writes: On 26/04/12 14:20, Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], -mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) +{ + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) +return false; +} This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard There's no guarantee that all registers in a set are contiguous; ARM for example doesn't make that guarantee, since SP is not a GP register, but R12 and R14 are. Sorry, I don't follow. My point was that in_hard_reg_set_p (C, M, R1) tests whether every register required to store a value of mode M starting at R1 fits in C. Which is what we want to know. Whether the intermediate registers (between regno and regno + offset) are even valid for MODE shouldn't matter. I don't think it makes conceptual sense to call: if (!in_hard_reg_set_p (reg_class_contents[(int) cl], mode, i)) for regno i regno + offset (or regno + offset i regno), because we're not trying to construct a value of mode MODE in that register. Richard You're right, of course. I'd missed that in my initial review; and hence my follow-up suggestion. It's not particularly interesting whether or not HARD_REGISTER_NUM_P (REGNO (operand)) is true, only whether REGNO(operand) + offset ... REGNO(operand) + offset + NUM_REGS(mode) -1 is. The problem is that the function currently accepts pseudo registers. We wouldn't want FIRST_PSEUDO_REGISTER to be accidentally associated with FIRST_PSUEDO_REGISTER-1, however unlikely that combination of arguments might be in practice. I think the HARD_REGISTER_NUM_P check is needed for both regno and regno + offset. I agree that we should protect against overrun in in_hard_reg_set, but I think that check logically belongs there. Most targets happen to be OK because FIRST_PSEUDO_REGISTER isn't divisible by 32, so the class HARD_REG_SETs always have a terminating zero bit. There are a couple of exceptions though, such as alpha and lm32. So one fix would be to require HARD_REG_SET to have an entry for FIRST_PSEUDO_REGISTER, which wouldn't affect most targets. Another would be to have an explicit range check in in_hard_reg_set_p and friends. Richard
Re: [PATCH v2] ARM: Use different linker path for hardfloat ABI
On 27/04/12 00:27, Michael Hope wrote: On 27 April 2012 08:20, Carlos O'Donell car...@systemhalted.org wrote: On Mon, Apr 23, 2012 at 5:36 PM, Michael Hope michael.h...@linaro.org wrote: 2012-04-24 Michael Hope michael.h...@linaro.org Richard Earnshaw rearn...@arm.com * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_SOFT_FLOAT): Define. (GLIBC_DYNAMIC_LINKER_HARD_FLOAT): Define. (GLIBC_DYNAMIC_LINKER_DEFAULT): Define. (GLIBC_DYNAMIC_LINKER): Redefine to use the hard float path. diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h index 80bd825..2ace6f0 100644 --- a/gcc/config/arm/linux-eabi.h +++ b/gcc/config/arm/linux-eabi.h @@ -62,7 +62,17 @@ /* Use ld-linux.so.3 so that it will be possible to run classic GNU/Linux binaries on an EABI system. */ #undef GLIBC_DYNAMIC_LINKER -#define GLIBC_DYNAMIC_LINKER /lib/ld-linux.so.3 +#define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT /lib/ld-linux.so.3 +#define GLIBC_DYNAMIC_LINKER_HARD_FLOAT /lib/ld-linux-armhf.so.3 +#if TARGET_DEFAULT_FLOAT_ABI == ARM_FLOAT_ABI_HARD +#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT +#else +#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_SOFT_FLOAT +#endif +#define GLIBC_DYNAMIC_LINKER \ + %{mfloat-abi=hard: GLIBC_DYNAMIC_LINKER_HARD_FLOAT } \ +%{mfloat-abi=soft*: GLIBC_DYNAMIC_LINKER_SOFT_FLOAT } \ +%{!mfloat-abi=*: GLIBC_DYNAMIC_LINKER_DEFAULT } /* At this point, bpabi.h will have clobbered LINK_SPEC. We want to use the GNU/Linux version, not the generic BPABI version. */ This patch is broken. Please fix this. You can't use a named enumeration in cpp equality. The type ARM_FLOAT_ABI_HARD is a named enumeration and evaluates to 0 as an unknown identifier. Therefore #if TARGET_DEFAULT_FLOAT_ABI == ARM_FLOAT_ABI_HARD evaluates to #if 0 == 0 and is always true. Watch out that #define ARM_FLOAT_ABI_HARD ARM_FLOAT_ABI_HARD for such enums is not conforming C99/C11. I suggest you define the types as macros and then set the named enum to those values, then use the macros in the header equality checks. e.g. #define VAL1 0 then enum FOO { RVAL1 = VAL1, ... } Look at arm.h for the enum definition. I've looked further into this and I think the original pre-#if version is correct. The float ABI comes from these places: * The -mfloat-abi= command line argument, else * The --with-float= configure time argument, else * TARGET_DEFAULT_FLOAT_ABI from linux-eabi.h In the first case the ABI is explicit. In the second OPTION_DEFAULT_SPECS turns the configure time argument into an explict -mfloat-abi=. The patch below covers all cases, keeps the logic in the spec file, and adds a comment linking the two #defines. Tested by building with no configure flags, --wtih-float=softfp, --with-float=hard, and then running with all combinations of {,-mfloat-abi=softfp,-mfloat-abi=hard} {,-mglibc,-muclibc,-mbionic}. OK? -- Michael 2012-04-27 Michael Hope michael.h...@linaro.org * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER): Pick the loader using a spec rule. Michael, can you try this patch please. It should make it possible to then create linux-eabihf.h containing just #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_HARD #undef GLIBC_DYNAMIC_LINKER_DEFAULT #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT Which is not quite as simple as leaving out the second re-define, but pretty close. R. --- linux-eabi.h(revision 186924) +++ linux-eabi.h(local) @@ -32,7 +32,8 @@ while (false) /* We default to a soft-float ABI so that binaries can run on all - target hardware. */ + target hardware. If you override this to use the hard-float ABI then + change the setting of GLIBC_DYNAMIC_LINKER_DEFAULT as well. */ #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_SOFT @@ -59,16 +60,19 @@ #undef SUBTARGET_EXTRA_LINK_SPEC #define SUBTARGET_EXTRA_LINK_SPEC -m TARGET_LINKER_EMULATION -/* Use ld-linux.so.3 so that it will be possible to run classic - GNU/Linux binaries on an EABI system. */ +/* GNU/Linux on ARM currently supports three dynamic linkers: + - ld-linux.so.2 - for the legacy ABI + - ld-linux.so.3 - for the EABI-derived soft-float ABI + - ld-linux-armhf.so.3 - for the EABI-derived hard-float ABI. + All the dynamic linkers live in /lib. + We default to soft-float, but this can be overridden by changing both + GLIBC_DYNAMIC_LINKER_DEFAULT and TARGET_DEFAULT_FLOAT_ABI. */ + #undef GLIBC_DYNAMIC_LINKER #define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT /lib/ld-linux.so.3 #define GLIBC_DYNAMIC_LINKER_HARD_FLOAT /lib/ld-linux-armhf.so.3 -#if TARGET_DEFAULT_FLOAT_ABI == ARM_FLOAT_ABI_HARD -#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT -#else #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_SOFT_FLOAT
Re: [patch] More thorough checking in reg_fits_class_p
Richard Earnshaw schrieb: On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw writes: Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) - in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard Hmm, looking at the comment that precedes the function makes me think the actual implementation should be: { int regno = REGNO (operand) + offset; ... return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (end_hard_regno (regno, mode)) in_hard_reg_set_p (, regno); } That is, the original regno isn't really interesting and what really counts is the range regno + offset ... regno + offset + NUM_HARD_REGS(mode) - 1 Shouldn't this be HARD_REGNO_NREGS? Johann
Re: [patch] More thorough checking in reg_fits_class_p
On 30/04/12 16:36, Georg-Johann Lay wrote: Richard Earnshaw schrieb: On 30/04/12 15:07, Richard Sandiford wrote: Richard Earnshaw writes: Jim MacArthur wrote: The current code in reg_fits_class_p appears to be incorrect; since offset may be negative, it's necessary to check both ends of the range otherwise an array overrun or underrun may occur when calling in_hard_reg_set_p. in_hard_reg_set_p should also be checked for each register in the range of regno .. regno+offset. A negative offset can occur on a big-endian target. For example, on AArch64, subreg_regno_offset (0, DImode, 0, TImode) returns -1. We discovered this problem while developing unrelated code for big-endian support in the AArch64 back end. I've tested this with an x86 bootstrap which shows no errors, and with our own AArch64 back end. Ok for trunk? gcc/Changelog entry: 2012-04-26 Jim MacArthurjim.macart...@arm.com * recog.c (reg_fits_class_p): Check every register between regno and regno+offset is in the hard register set. OK. R. reg-fits-class-9 diff --git a/gcc/recog.c b/gcc/recog.c index 8fb96a0..825bfb1 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -2759,14 +2759,28 @@ bool reg_fits_class_p (const_rtx operand, reg_class_t cl, int offset, enum machine_mode mode) { - int regno = REGNO (operand); + unsigned int regno = REGNO (operand); if (cl == NO_REGS) return false; - return (HARD_REGISTER_NUM_P (regno) -in_hard_reg_set_p (reg_class_contents[(int) cl], - mode, regno + offset)); + /* We should not assume all registers in the range regno to regno + offset are + valid just because each end of the range is. */ + if (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset)) +{ + unsigned int i; + + unsigned int start = MIN (regno, regno + offset); + unsigned int end = MAX (regno, regno + offset); + for (i = start; i = end; i++) + { + if (!in_hard_reg_set_p (reg_class_contents[(int) cl], + mode, i)) + return false; + } This doesn't look right to me. We should still only need to check in_hard_reg_set_p for one register number. I'd have expected something like: return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (regno + offset) in_hard_reg_set_p (reg_class_contents[(int) cl], mode, regno + offset)); instead. Richard Hmm, looking at the comment that precedes the function makes me think the actual implementation should be: { int regno = REGNO (operand) + offset; ... return (HARD_REGISTER_NUM_P (regno) HARD_REGISTER_NUM_P (end_hard_regno (regno, mode)) in_hard_reg_set_p (, regno); } That is, the original regno isn't really interesting and what really counts is the range regno + offset ... regno + offset + NUM_HARD_REGS(mode) - 1 Shouldn't this be HARD_REGNO_NREGS? Johann Look at the definition of end_hard_regno... R.
Re: [PATCH 12/13] Adjust relevant test cases wrt -ftrack-macro-expansion=[0|2]
On Apr 29, 2012, at 10:38 AM, Dodji Seketeli wrote: While bootstrapping the tree again, it appeared that an output regression of the objc test objc.dg/foreach-7.m flew below my radar. This looks fairly obvious to me, but I am CC-ing Mike Stump, just in case. That's fine.
[PATCH 1/2] Minor refactoring of tree-vect-patterns.c
Hello, in working on a fix for PR 52633, I noticed that tree-vect-patterns.c now contains a number of copies of rather similar code (of which my patch would have added another copy), so it seems to make sense to do a bit of refactoring first. This patch introduced a new helper function vect_same_loop_or_bb_p to decide whether a new statement is in the same loop or basic-block (depending on whether we're currently doing loop-based or basic-block-based vectorization) as an existing statement. The helper is then used everywhere where this test is currently open-coded. As a side-effect, the patch actually contains two bug fixes: - The condition in some places (!loop gimple_bb (def_stmt) != BB_VINFO_BB (bb_vinfo) gimple_code (def_stmt) != GIMPLE_PHI) doesn't seem to make sense. It allows PHI statements from random other basic blocks -- but those will have an uninitialized vinfo_for_stmt, so if that test ever hits, we're going to crash. The check was introduced in this patch: http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00191.html which likewise introduces a similar check in vect_get_and_check_slp_defs: (!loop gimple_bb (def_stmt) == BB_VINFO_BB (bb_vinfo) gimple_code (def_stmt) != GIMPLE_PHI)) This makes sense: basic-block vectorization operates only on statements in the current basic block, but *not* on the incoming PHIs. It seems that the condition simply was incorrectly negated when moving it over to the tree-vect-pattern.c uses. - In vect_recog_widen_shift_pattern, the test for same loop / basic-block is still missing. In my recent patch to PR 52870 I added the check to vect_recog_widen_mult_pattern ... it is still missing in ...shift_pattern. Tested on i386-linux-gnu and arm-linux-gnueabi with no regressions. OK for mainline? Bye, Ulrich ChangeLog: * tree-vect-patterns.c (vect_same_loop_or_bb_p): New function. (vect_handle_widen_op_by_const): Use it instead of open-coding test. (vect_recog_widen_mult_pattern): Likewise. (vect_operation_fits_smaller_type): Likewise. (vect_recog_over_widening_pattern): Likewise. (vect_recog_widen_shift_pattern): Add to vect_same_loop_or_bb_p test. Index: gcc-head/gcc/tree-vect-patterns.c === --- gcc-head.orig/gcc/tree-vect-patterns.c 2012-04-26 15:53:48.0 +0200 +++ gcc-head/gcc/tree-vect-patterns.c 2012-04-26 19:46:12.0 +0200 @@ -84,6 +84,41 @@ new_pattern_def_seq (stmt_vec_info stmt_ append_pattern_def_seq (stmt_info, stmt); } +/* Check whether STMT2 is in the same loop or basic block as STMT1. + Which of the two applies depends on whether we're currently doing + loop-based or basic-block-based vectorization, as determined by + the vinfo_for_stmt for STMT1 (which must be defined). + + If this returns true, vinfo_for_stmt for STMT2 is guaranteed + to be defined as well. */ + +static bool +vect_same_loop_or_bb_p (gimple stmt1, gimple stmt2) +{ + stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt1); + loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo); + bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo); + + if (!gimple_bb (stmt2)) +return false; + + if (loop_vinfo) +{ + struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); + if (!flow_bb_inside_loop_p (loop, gimple_bb (stmt2))) + return false; +} + else +{ + if (gimple_bb (stmt2) != BB_VINFO_BB (bb_vinfo) + || gimple_code (stmt2) == GIMPLE_PHI) + return false; +} + + gcc_assert (vinfo_for_stmt (stmt2)); + return true; +} + /* Check whether NAME, an ssa-name used in USE_STMT, is a result of a type promotion or demotion, such that: DEF_STMT: NAME = NOP (name0) @@ -400,16 +435,6 @@ vect_handle_widen_op_by_const (gimple st { tree new_type, new_oprnd, tmp; gimple new_stmt; - loop_vec_info loop_vinfo; - struct loop *loop = NULL; - bb_vec_info bb_vinfo; - stmt_vec_info stmt_vinfo; - - stmt_vinfo = vinfo_for_stmt (stmt); - loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo); - bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo); - if (loop_vinfo) -loop = LOOP_VINFO_LOOP (loop_vinfo); if (code != MULT_EXPR code != LSHIFT_EXPR) return false; @@ -425,12 +450,10 @@ vect_handle_widen_op_by_const (gimple st return true; } - if (TYPE_PRECISION (type) (TYPE_PRECISION (*half_type) * 4) - || !gimple_bb (def_stmt) - || (loop !flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))) - || (!loop gimple_bb (def_stmt) != BB_VINFO_BB (bb_vinfo) - gimple_code (def_stmt) != GIMPLE_PHI) - || !vinfo_for_stmt (def_stmt)) + if (TYPE_PRECISION (type) (TYPE_PRECISION (*half_type) * 4)) +return false; + + if (!vect_same_loop_or_bb_p (stmt, def_stmt)) return false; /* TYPE is 4 times bigger than HALF_TYPE, try widening operation for @@ -564,16 +587,6 @@
[PATCH 2/2] Minor refactoring of tree-vect-patterns.c
Hello, as a second step in refactoring this patch introduces a routine vect_find_single_use to determine whether a defining statement has one single use within the current vectorization domain. The helper is then called wherever that check is currently open-coded. There should be no change in behaviour. Tested on i386-linux-gnu and arm-linux-gnueabi with no regressions. OK for mainline? Bye, Ulrich ChangeLog: * tree-vect-patterns.c (vect_find_single_use): New function. (vect_recog_widen_mult_pattern): Use it instead of open-coding loop. (vect_recog_over_widening_pattern): Likewise. (vect_recog_widen_shift_pattern): Likewise. Index: gcc-head/gcc/tree-vect-patterns.c === --- gcc-head.orig/gcc/tree-vect-patterns.c 2012-04-26 19:46:12.0 +0200 +++ gcc-head/gcc/tree-vect-patterns.c 2012-04-26 19:46:53.0 +0200 @@ -119,6 +119,33 @@ vect_same_loop_or_bb_p (gimple stmt1, gi return true; } +/* If the LHS of DEF_STMT has a single use, and that statement is + in the same loop or basic block, return it. */ + +static gimple +vect_find_single_use (gimple def_stmt) +{ + tree lhs = gimple_assign_lhs (def_stmt); + imm_use_iterator imm_iter; + use_operand_p use_p; + gimple use_stmt = NULL; + + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) +{ + if (is_gimple_debug (USE_STMT (use_p))) + continue; + + if (use_stmt) + return NULL; + use_stmt = USE_STMT (use_p); + + if (!vect_same_loop_or_bb_p (def_stmt, use_stmt)) + return NULL; +} + + return use_stmt; +} + /* Check whether NAME, an ssa-name used in USE_STMT, is a result of a type promotion or demotion, such that: DEF_STMT: NAME = NOP (name0) @@ -636,31 +663,18 @@ vect_recog_widen_mult_pattern (VEC (gimp Use unsigned TYPE as the type for WIDEN_MULT_EXPR. */ if (TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type0)) { - tree lhs = gimple_assign_lhs (last_stmt), use_lhs; - imm_use_iterator imm_iter; - use_operand_p use_p; - int nuses = 0; - gimple use_stmt = NULL; + gimple use_stmt; + tree use_lhs; tree use_type; if (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (half_type1)) return NULL; - FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) -{ - if (is_gimple_debug (USE_STMT (use_p))) - continue; - use_stmt = USE_STMT (use_p); - nuses++; -} - - if (nuses != 1 || !is_gimple_assign (use_stmt) - || gimple_assign_rhs_code (use_stmt) != NOP_EXPR) + use_stmt = vect_find_single_use (last_stmt); + if (!use_stmt || !is_gimple_assign (use_stmt) + || gimple_assign_rhs_code (use_stmt) != NOP_EXPR) return NULL; - if (!vect_same_loop_or_bb_p (last_stmt, use_stmt)) - return NULL; - use_lhs = gimple_assign_lhs (use_stmt); use_type = TREE_TYPE (use_lhs); if (!INTEGRAL_TYPE_P (use_type) @@ -1165,10 +1179,7 @@ vect_recog_over_widening_pattern (VEC (g { gimple stmt = VEC_pop (gimple, *stmts); gimple pattern_stmt = NULL, new_def_stmt, prev_stmt = NULL, use_stmt = NULL; - tree op0, op1, vectype = NULL_TREE, lhs, use_lhs, use_type; - imm_use_iterator imm_iter; - use_operand_p use_p; - int nuses = 0; + tree op0, op1, vectype = NULL_TREE, use_lhs, use_type; tree var = NULL_TREE, new_type = NULL_TREE, tmp, new_oprnd; bool first; tree type = NULL; @@ -1192,18 +1203,8 @@ vect_recog_over_widening_pattern (VEC (g } /* STMT can be performed on a smaller type. Check its uses. */ - lhs = gimple_assign_lhs (stmt); - nuses = 0; - FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) -{ - if (is_gimple_debug (USE_STMT (use_p))) -continue; - use_stmt = USE_STMT (use_p); - nuses++; -} - - if (nuses != 1 || !is_gimple_assign (use_stmt) - || !vect_same_loop_or_bb_p (stmt, use_stmt)) + use_stmt = vect_find_single_use (stmt); + if (!use_stmt || !is_gimple_assign (use_stmt)) return NULL; /* Create pattern statement for STMT. */ @@ -1454,12 +1455,6 @@ vect_recog_widen_shift_pattern (VEC (gim Use unsigned TYPE as the type for WIDEN_LSHIFT_EXPR. */ if (TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type0)) { - tree lhs = gimple_assign_lhs (last_stmt), use_lhs; - imm_use_iterator imm_iter; - use_operand_p use_p; - int nuses = 0; - tree use_type; - if (over_widen) { /* In case of over-widening pattern, S4 should be ORIG_STMT itself. @@ -1472,21 +1467,14 @@ vect_recog_widen_shift_pattern (VEC (gim } else { - FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs) -{ - if (is_gimple_debug (USE_STMT (use_p))) - continue; - use_stmt = USE_STMT (use_p); -
Re: [libcpp] maybe canonicalize system paths in line-map
OK. Jason
minor PATCH to constify symbols in dwarf2out.c
Some other changes I've been working on needed this const-correctness adjustment. Tested x86_64-pc-linux-gnu, applying to trunk. commit e69aeb777fdbbcecbc603c9b91ef3dc17cf812a5 Author: Jason Merrill ja...@redhat.com Date: Thu Apr 19 10:02:22 2012 -0400 * dwarf2out.c (comdat_symbol_id): Add const. (union die_symbol_or_type_node): Add const to die_symbol. (output_die_symbol, output_die, output_comp_unit): Adjust. diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index 98b53f0..fd485fb 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -2470,7 +2470,7 @@ DEF_VEC_ALLOC_O(dw_attr_node,gc); typedef struct GTY((chain_circular (%h.die_sib))) die_struct { union die_symbol_or_type_node { - char * GTY ((tag (0))) die_symbol; + const char * GTY ((tag (0))) die_symbol; comdat_type_node_ref GTY ((tag (1))) die_type_node; } GTY ((desc (use_debug_types))) die_id; @@ -5825,7 +5825,7 @@ same_die_p_wrap (dw_die_ref die1, dw_die_ref die2) /* The prefix to attach to symbols on DIEs in the current comdat debug info section. */ -static char *comdat_symbol_id; +static const char *comdat_symbol_id; /* The index of the current symbol within the current comdat CU. */ static unsigned int comdat_symbol_number; @@ -7396,7 +7396,7 @@ output_abbrev_section (void) static inline void output_die_symbol (dw_die_ref die) { - char *sym = die-die_id.die_symbol; + const char *sym = die-die_id.die_symbol; if (sym == 0) return; @@ -7678,7 +7678,7 @@ output_die (dw_die_ref die) } else { - char *sym = AT_ref (a)-die_id.die_symbol; + const char *sym = AT_ref (a)-die_id.die_symbol; int size; gcc_assert (sym); @@ -7800,8 +7800,8 @@ output_compilation_unit_header (void) static void output_comp_unit (dw_die_ref die, int output_if_empty) { - const char *secname; - char *oldsym, *tmp; + const char *secname, *oldsym; + char *tmp; /* Unless we are outputting main CU, we may throw away empty ones. */ if (!output_if_empty die-die_child == NULL)
Re: [patch] backport powerpc64-freebsd support to 4.7 branch
On Sat, Apr 28, 2012 at 3:45 PM, Andreas Tobler andreast-l...@fgznet.ch wrote: Hello all, I did a backport of the powerpc64-freebsd support to the 4.7 branch, here the results: http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02768.html http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02767.html Is the attached/below ok to commit to 4.7 branch? TIA, Andreas libstdc++: 2012-04-28 Andreas Tobler andre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Tobler andre...@fgznet.ch * testsuite/23_containers/vector/bool/modifiers/insert/31370.cc: Skip this test on powerpc64-*-freebsd*. libgcc: 2012-04-28 Andreas Tobler andre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Tobler andre...@fgznet.ch * config.host: Add bits to support powerpc64-*-freebsd*. * config/rs6000/freebsd-unwind.h: New file. * config/rs6000/t-freebsd64: New file. gcc: 2012-04-28 Andreas Tobler andre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Tobler andre...@fgznet.ch * configure.ac (HAVE_LD_NO_DOT_SYMBOLS): Add powerpc64-*-freebsd*. Introduce emul_name to select the right linker emulation for powerpc64-*-freebsd*. * configure: Regenerate. * config.gcc: Add bits to support powerpc64-*-freebsd*. * config/rs6000/freebsd.h (POWERPC_FREEBSD): Define. * config/rs6000/freebsd64.h: New file. * config/rs6000/rs6000.c (rs6000_option_override_internal): Use POWERPC_FREEBSD. (rs6000_savres_strategy): Likewise. (rs6000_savres_routine_name): Likewise. (rs6000_elf_file_end): Likewise. * config/rs6000/t-freebsd64: New file. * config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Set the rs6000_current_abi for 64-bit FreeBSD to ABI_AIX. I am a little uncomfortable about adding a new feature to a release branch, but this is not a major configuration and there were previous delays, so this is okay. Thanks, David
Use sed -n … instead of sed s/…/p -e d in s-header-vars
Hello, sed s/…/p -e d as used in s-header-vars doesn't work on at least ia64-hpux, where s/.../p only prints out if -n was requested as well. The attached patch fixes this by using '-n' instead of a trailing '-e d' We have been using this on all our gcc-4.5 based configurations for a couple of years now. Bootstrapped and regtested with mainline on x86_64-linux-gnu. OK to commit ? Thanks in advance, With Kind Regards, Olivier 2012-04-30 Olivier Hainque hain...@adacore.com * Makefile.in (s-header-vars): Resort to -n instead of trailing -e d in sed invocation. sed-n.dif Description: video/dv
Re: [RFH / Patch] PR 51222
On 04/29/2012 11:42 AM, Paolo Carlini wrote: This might just be a matter of calling for_each_template_parm and returning 1 if we see any template parameter. Good. Today I quickly tried something along these lines (see 'p' attachment) and got some fails: Ah, well. I guess for_each_template_parm doesn't look at the types of declarations. Otherwise, I'm attaching something very close to the letter of the ABI, Your patch only looks at immediate subexpressions; I believe what the ABI means is any subexpression, which is why I think something involving walk_tree might be the way to go. Jason
Re: [RFH / Patch] PR 51222
Hi, Your patch only looks at immediate subexpressions; I believe what the ABI means is any subexpression, which is why I think something involving walk_tree might be the way to go. Ok, I'm going to open code something using walk_tree and quite similar to for_each_template_parm but checking also TREE_TYPE, let's if it's enough at least to avoid the regressions. Thanks, Paolo
libgo patch committed: Fix build on MIPS GNU/Linux
This patch to libgo fixes the build on MIPS GNU/Linux. I haven't been able to fully test gccgo, as the MIPS64 machine in the GCC compile farm is running a version of glibc that is too old--it does not support makecontext and friends. However, this patch at least gets past the immediate build problem reported in PR 52586. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian diff -r a5f055c95162 libgo/go/syscall/libcall_linux.go --- a/libgo/go/syscall/libcall_linux.go Fri Apr 27 21:56:09 2012 -0700 +++ b/libgo/go/syscall/libcall_linux.go Mon Apr 30 08:54:38 2012 -0700 @@ -203,7 +203,11 @@ p = (*byte)(unsafe.Pointer(_zero)) } Entersyscall() - r1, _, errno := Syscall(SYS_GETDENTS64, uintptr(fd), uintptr(unsafe.Pointer(p)), uintptr(len(buf))) + s := SYS_GETDENTS64 + if s == 0 { + s = SYS_GETDENTS + } + r1, _, errno := Syscall(uintptr(s), uintptr(fd), uintptr(unsafe.Pointer(p)), uintptr(len(buf))) n = int(r1) if n 0 { err = errno diff -r a5f055c95162 libgo/mksysinfo.sh --- a/libgo/mksysinfo.sh Fri Apr 27 21:56:09 2012 -0700 +++ b/libgo/mksysinfo.sh Mon Apr 30 08:54:38 2012 -0700 @@ -224,6 +224,14 @@ echo const $sup = _$sys ${OUT} done +# The GNU/Linux support wants to use SYS_GETDENTS64 if available. +if ! grep '^const SYS_GETDENTS ' ${OUT} /dev/null 21; then + echo const SYS_GETDENTS = 0 ${OUT} +fi +if ! grep '^const SYS_GETDENTS64 ' ${OUT} /dev/null 21; then + echo const SYS_GETDENTS64 = 0 ${OUT} +fi + # Stat constants. grep '^const _S_' gen-sysinfo.go | \ sed -e 's/^\(const \)_\(S_[^= ]*\)\(.*\)$/\1\2 = _\2/' ${OUT}
Re: [patch] backport powerpc64-freebsd support to 4.7 branch
On 30.04.12 18:54, David Edelsohn wrote: On Sat, Apr 28, 2012 at 3:45 PM, Andreas Toblerandreast-l...@fgznet.ch wrote: Hello all, I did a backport of the powerpc64-freebsd support to the 4.7 branch, here the results: http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02768.html http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02767.html Is the attached/below ok to commit to 4.7 branch? TIA, Andreas libstdc++: 2012-04-28 Andreas Toblerandre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Toblerandre...@fgznet.ch * testsuite/23_containers/vector/bool/modifiers/insert/31370.cc: Skip this test on powerpc64-*-freebsd*. libgcc: 2012-04-28 Andreas Toblerandre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Toblerandre...@fgznet.ch * config.host: Add bits to support powerpc64-*-freebsd*. * config/rs6000/freebsd-unwind.h: New file. * config/rs6000/t-freebsd64: New file. gcc: 2012-04-28 Andreas Toblerandre...@fgznet.ch Backport from mainline 2012-03-21 Andreas Toblerandre...@fgznet.ch * configure.ac (HAVE_LD_NO_DOT_SYMBOLS): Add powerpc64-*-freebsd*. Introduce emul_name to select the right linker emulation for powerpc64-*-freebsd*. * configure: Regenerate. * config.gcc: Add bits to support powerpc64-*-freebsd*. * config/rs6000/freebsd.h (POWERPC_FREEBSD): Define. * config/rs6000/freebsd64.h: New file. * config/rs6000/rs6000.c (rs6000_option_override_internal): Use POWERPC_FREEBSD. (rs6000_savres_strategy): Likewise. (rs6000_savres_routine_name): Likewise. (rs6000_elf_file_end): Likewise. * config/rs6000/t-freebsd64: New file. * config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Set the rs6000_current_abi for 64-bit FreeBSD to ABI_AIX. I am a little uncomfortable about adding a new feature to a release branch, but this is not a major configuration and there were previous delays, so this is okay. Thank you very much. I appreciate your ack! This gives us a stable compiler for powerpc64-freebsd. Trunk is also ok, but there we have the ups and downs of a development snapshot. (Currently the many failures which are also present on powerpc64-linux.) Andreas P.S, committed as: 186995/6/7
[PATCH, middle-end]: Fix PR53136, Use after free in cgraph dumps
Hello! There is a problem with multiple calls of cgraph_node_name in fprintf dumps. Please note that C++ uses caching in cxx_printable_name_internal (aka LANG_HOOKS_DECL_PRINTABLE_NAME), so when cxx_printable_name_internal is called multiple times from printf (i.e. fprintf %s/%i - %s/%i), it can happen that the first string gets evicted by the second call, before fprintf is fully evaluated. Attached patch audits all uses of cgraph_node_name, and in case of multiple calls in dump fprintf, wraps every call in xstrdup. This fixes valgrind report in the PR, as well as original dump failure on alpha [1]. I think that small memory leak is tolerable here (the changes are exclusively in the dump code), and follows the same approach as in java frontend. 2012-04-30 Uros Bizjak ubiz...@gmail.com PR middle-end/53136 * ipa-prop.c (ipa_print_node_jump_functions): Wrap multiple calls to cgraph_node_name in xstrdup. (ipa_make_edge_direct_to_target): Ditto. * cgraph.c (dump_cgraph_node): Ditto. * tree-sra.c (convert_callers_for_node): Ditto. * lto-symtab.c (lto_cgraph_replace_node): Ditto. * ipa-cp.c (perhaps_add_new_callers): Ditto. * cgraphunit.c (cgraph_redirect_edge_call_stmt_to_callee): Ditto. (cgraph_materialize_all_clones): Ditto. * ipa-inline.c (report_inline_failed_reason): Ditto. (want_early_inline_function_p): Ditto. (edge_badness): Ditto. (update_edge_key): Ditto. (flatten_function): Ditto. (ipa_inline): Ditto. (inlinw_always_inline_functions): Ditto. (early_inline_small_functions): Ditto. Patch was tested on x86_64-pc-linux-gnu {,-m32} and alphaev68-pc-linux-gnu. OK for mainline? [1] http://gcc.gnu.org/ml/gcc-testresults/2012-04/msg02722.html Uros. Index: tree-sra.c === --- tree-sra.c (revision 186992) +++ tree-sra.c (working copy) @@ -4612,8 +4612,8 @@ convert_callers_for_node (struct cgraph_node *node if (dump_file) fprintf (dump_file, Adjusting call (%i - %i) %s - %s\n, cs-caller-uid, cs-callee-uid, -cgraph_node_name (cs-caller), -cgraph_node_name (cs-callee)); +xstrdup (cgraph_node_name (cs-caller)), +xstrdup (cgraph_node_name (cs-callee))); ipa_modify_call_arguments (cs, cs-call_stmt, adjustments); Index: ipa-prop.c === --- ipa-prop.c (revision 186992) +++ ipa-prop.c (working copy) @@ -230,8 +230,8 @@ ipa_print_node_jump_functions (FILE *f, struct cgr continue; fprintf (f, callsite %s/%i - %s/%i : \n, - cgraph_node_name (node), node-uid, - cgraph_node_name (cs-callee), cs-callee-uid); + xstrdup (cgraph_node_name (node)), node-uid, + xstrdup (cgraph_node_name (cs-callee)), cs-callee-uid); ipa_print_node_jump_functions_for_edge (f, cs); } @@ -1780,8 +1780,8 @@ ipa_make_edge_direct_to_target (struct cgraph_edge fprintf (dump_file, ipa-prop: Discovered %s call to a known target (%s/%i - %s/%i), for stmt , ie-indirect_info-polymorphic ? a virtual : an indirect, - cgraph_node_name (ie-caller), ie-caller-uid, - cgraph_node_name (ie-callee), ie-callee-uid); + xstrdup (cgraph_node_name (ie-caller)), ie-caller-uid, + xstrdup (cgraph_node_name (ie-callee)), ie-callee-uid); if (ie-call_stmt) print_gimple_stmt (dump_file, ie-call_stmt, 2, TDF_SLIM); else Index: cgraph.c === --- cgraph.c(revision 186992) +++ cgraph.c(working copy) @@ -1521,9 +1521,9 @@ dump_cgraph_node (FILE *f, struct cgraph_node *nod if (node-global.inlined_to) fprintf (f, Function %s/%i is inline copy in %s/%i\n, -cgraph_node_name (node), +xstrdup (cgraph_node_name (node)), node-symbol.order, -cgraph_node_name (node-global.inlined_to), +xstrdup (cgraph_node_name (node-global.inlined_to)), node-global.inlined_to-symbol.order); if (node-clone_of) fprintf (f, Clone of %s/%i\n, Index: lto-symtab.c === --- lto-symtab.c(revision 186992) +++ lto-symtab.c(working copy) @@ -215,8 +215,8 @@ lto_cgraph_replace_node (struct cgraph_node *node, { fprintf (cgraph_dump_file, Replacing cgraph node %s/%i by %s/%i for symbol %s\n, - cgraph_node_name (node), node-uid, - cgraph_node_name (prevailing_node), + xstrdup (cgraph_node_name (node)), node-uid, + xstrdup (cgraph_node_name (prevailing_node)), prevailing_node-uid,
Re: [PATCH] Fix REE (PR rtl-optimization/53160)
On 04/30/2012 06:43 AM, Jakub Jelinek wrote: Hi! As shown on the testcase below, if REE modifies some sign/zero extension insn, which is on the candidate vector, as a def_insn of some other extension, before combine_reaching_defs is called on that insn, we might ignore the changes done to that insn and just assume it is only an extension to the recorded mode. It might be an extension to a wider mode though. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.7? 2012-04-30 Jakub Jelinekja...@redhat.com PR rtl-optimization/53160 * ree.c (combine_reaching_defs): Handle the case where cand-insn has been modified by ree pass already. * gcc.c-torture/execute/pr53160.c: New test. Ok. r~
Backported r185231 from trunk. (issue 6139063)
Reviewers: xur, davidxl, iant2, Message: I backported the following patch: 2012-03-12 Richard Guenther rguent...@suse.de * gthr.h (__GTHREAD_MUTEX_INIT_FUNCTION): Adjust specification. * gthr-posix.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. (__gthread_mutex_init_function): New function. * gthr-single.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. PR gcov/49484 * libgcov.c: Include gthr.h. (__gcov_flush_mx): New global variable. (init_mx, init_mx_once): New functions. (__gcov_flush): Protect self with a mutex. (__gcov_fork): Re-initialize mutex after forking. * unwind-dw2-fde.c: Change condition under which to use __GTHREAD_MUTEX_INIT_FUNCTION. Please review this at http://codereview.appspot.com/6139063/ Affected files: M. M gcc/ChangeLog.google-4_6 M gcc/gthr-posix.h M gcc/gthr-single.h M gcc/gthr.h M gcc/libgcov.c M gcc/unwind-dw2-fde.c M libgcc/ChangeLog Index: . === --- . (revision 186999) +++ . (working copy) Property changes on: . ___ Modified: svn:mergeinfo Merged /trunk:r185231 Index: libgcc/ChangeLog === --- libgcc/ChangeLog(revision 186999) +++ libgcc/ChangeLog(working copy) @@ -1,3 +1,19 @@ +2012-03-12 Richard Guenther rguent...@suse.de + + * gthr.h (__GTHREAD_MUTEX_INIT_FUNCTION): Adjust specification. + * gthr-posix.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. + (__gthread_mutex_init_function): New function. + * gthr-single.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. + + PR gcov/49484 + * libgcov.c: Include gthr.h. + (__gcov_flush_mx): New global variable. + (init_mx, init_mx_once): New functions. + (__gcov_flush): Protect self with a mutex. + (__gcov_fork): Re-initialize mutex after forking. + * unwind-dw2-fde.c: Change condition under which to use + __GTHREAD_MUTEX_INIT_FUNCTION. + 2012-03-01 Release Manager * GCC 4.6.3 released. Index: gcc/unwind-dw2-fde.c === --- gcc/unwind-dw2-fde.c(revision 186999) +++ gcc/unwind-dw2-fde.c(working copy) @@ -46,11 +46,10 @@ #ifdef __GTHREAD_MUTEX_INIT static __gthread_mutex_t object_mutex = __GTHREAD_MUTEX_INIT; +#define init_object_mutex_once() #else static __gthread_mutex_t object_mutex; -#endif -#ifdef __GTHREAD_MUTEX_INIT_FUNCTION static void init_object_mutex (void) { @@ -63,8 +62,6 @@ static __gthread_once_t once = __GTHREAD_ONCE_INIT; __gthread_once (once, init_object_mutex); } -#else -#define init_object_mutex_once() #endif /* Called from crtbegin.o to register the unwind info for an object. */ Index: gcc/ChangeLog.google-4_6 === --- gcc/ChangeLog.google-4_6(revision 186999) +++ gcc/ChangeLog.google-4_6(working copy) @@ -1,3 +1,22 @@ +2012-02-21 Ahmad Sharif asha...@google.com + + Backport from mainline r185231. + 2012-03-12 Richard Guenther rguent...@suse.de + + * gthr.h (__GTHREAD_MUTEX_INIT_FUNCTION): Adjust specification. + * gthr-posix.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. + (__gthread_mutex_init_function): New function. + * gthr-single.h (__GTHREAD_MUTEX_INIT_FUNCTION): Define. + + PR gcov/49484 + * libgcov.c: Include gthr.h. + (__gcov_flush_mx): New global variable. + (init_mx, init_mx_once): New functions. + (__gcov_flush): Protect self with a mutex. + (__gcov_fork): Re-initialize mutex after forking. + * unwind-dw2-fde.c: Change condition under which to use + __GTHREAD_MUTEX_INIT_FUNCTION. + 2012-04-25 Rong Xu x...@google.com * gcc/gcc.c (ripa_lto_spec): Support -S in streaming LIPO. Index: gcc/libgcov.c === --- gcc/libgcov.c (revision 186999) +++ gcc/libgcov.c (working copy) @@ -46,6 +46,7 @@ #include tsystem.h #include coretypes.h #include tm.h +#include gthr.h #endif /* __KERNEL__ */ #if 1 @@ -667,6 +668,26 @@ info-version = 0; } +#ifdef __GTHREAD_MUTEX_INIT +ATTRIBUTE_HIDDEN __gthread_mutex_t __gcov_flush_mx = __GTHREAD_MUTEX_INIT; +#define init_mx_once() +#else +__gthread_mutex_t __gcov_flush_mx ATTRIBUTE_HIDDEN; + +static void +init_mx (void) +{ + __GTHREAD_MUTEX_INIT_FUNCTION (__gcov_flush_mx); +} +static void +init_mx_once (void) +{ + static __gthread_once_t once = __GTHREAD_ONCE_INIT; + __gthread_once (once, init_mx); +} +#endif + + /* Called before fork or exec - write out profile information gathered so far and reset it to zero. This avoids duplication or loss of the profile information gathered so far. */ @@ -676,6
Re: ja...@redhat.com
This makes sense to me. Jason
Re: ja...@redhat.com
This makes sense to me. Thanks (and sorry for bogus subject). I bootstrapped/regtested the change and tested it indeed lets me build kernel and Mozilla with the sanity check that there are no function-variable aliases. Is the change OK? Honza Jason
Re: ja...@redhat.com
On 04/30/2012 04:38 PM, Jan Hubicka wrote: This makes sense to me. Thanks (and sorry for bogus subject). I bootstrapped/regtested the change and tested it indeed lets me build kernel and Mozilla with the sanity check that there are no function-variable aliases. Is the change OK? OK. Jason
[PATCH, i386]: Fix PR53141, gcc.target/i386/bmi2-mulx32-[12]a.c FAIL
Hello! On BMI2 targets, we prefer mulx, so switch places of constraints 0 and 1. 2012-04-30 Uros Bizjak ubiz...@gmail.com PR target/53141 * config/i386/i386.md (*umulmodedwi3_1): Switch places of constraints 0 and 1. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: config/i386/i386.md === --- config/i386/i386.md (revision 186999) +++ config/i386/i386.md (working copy) @@ -6814,29 +6814,29 @@ (set_attr mode SI)]) (define_insn *umulmodedwi3_1 - [(set (match_operand:DWI 0 register_operand =A,r) + [(set (match_operand:DWI 0 register_operand =r,A) (mult:DWI (zero_extend:DWI - (match_operand:DWIH 1 nonimmediate_operand %0,d)) + (match_operand:DWIH 1 nonimmediate_operand %d,0)) (zero_extend:DWI (match_operand:DWIH 2 nonimmediate_operand rm,rm (clobber (reg:CC FLAGS_REG))] !(MEM_P (operands[1]) MEM_P (operands[2])) @ - mul{imodesuffix}\t%2 - # - [(set_attr isa *,bmi2) - (set_attr type imul,imulx) - (set_attr length_immediate 0,*) + # + mul{imodesuffix}\t%2 + [(set_attr isa bmi2,*) + (set_attr type imulx,imul) + (set_attr length_immediate *,0) (set (attr athlon_decode) - (cond [(eq_attr alternative 0) + (cond [(eq_attr alternative 1) (if_then_else (eq_attr cpu athlon) (const_string vector) (const_string double))] (const_string *))) - (set_attr amdfam10_decode double,*) - (set_attr bdver1_decode direct,*) - (set_attr prefix orig,vex) + (set_attr amdfam10_decode *,double) + (set_attr bdver1_decode *,direct) + (set_attr prefix vex,orig) (set_attr mode MODE)]) ;; Convert mul to the mulx pattern to avoid flags dependency.
Re: [RFH / Patch] PR 51222
Hi again, On 04/30/2012 07:04 PM, Jason Merrill wrote: On 04/29/2012 11:42 AM, Paolo Carlini wrote: This might just be a matter of calling for_each_template_parm and returning 1 if we see any template parameter. Good. Today I quickly tried something along these lines (see 'p' attachment) and got some fails: Ah, well. I guess for_each_template_parm doesn't look at the types of declarations. Just wanted to add that I just confirmed that *at least* the following testcases, regressing with the elementary for_each_template_parm patch: FAIL: g++.dg/cpp0x/decltype21.C (test for excess errors) FAIL: g++.dg/cpp0x/sfinae17.C (test for excess errors) FAIL: g++.dg/cpp0x/sfinae21.C (test for excess errors) FAIL: g++.dg/cpp0x/sfinae28.C (test for excess errors) are characterized by expr with TREE_CODE (expr) == CONSTRUCTOR and TREE_TYPE (expr) == TEMPLATE_TYPE_PARM. Paolo.
Re: [PATCH v2] ARM: Use different linker path for hardfloat ABI
On 1 May 2012 03:24, Richard Earnshaw rearn...@arm.com wrote: On 27/04/12 00:27, Michael Hope wrote: On 27 April 2012 08:20, Carlos O'Donell car...@systemhalted.org wrote: On Mon, Apr 23, 2012 at 5:36 PM, Michael Hope michael.h...@linaro.org wrote: 2012-04-24 Michael Hope michael.h...@linaro.org Richard Earnshaw rearn...@arm.com * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER_SOFT_FLOAT): Define. (GLIBC_DYNAMIC_LINKER_HARD_FLOAT): Define. (GLIBC_DYNAMIC_LINKER_DEFAULT): Define. (GLIBC_DYNAMIC_LINKER): Redefine to use the hard float path. diff --git a/gcc/config/arm/linux-eabi.h b/gcc/config/arm/linux-eabi.h index 80bd825..2ace6f0 100644 --- a/gcc/config/arm/linux-eabi.h +++ b/gcc/config/arm/linux-eabi.h @@ -62,7 +62,17 @@ /* Use ld-linux.so.3 so that it will be possible to run classic GNU/Linux binaries on an EABI system. */ #undef GLIBC_DYNAMIC_LINKER -#define GLIBC_DYNAMIC_LINKER /lib/ld-linux.so.3 +#define GLIBC_DYNAMIC_LINKER_SOFT_FLOAT /lib/ld-linux.so.3 +#define GLIBC_DYNAMIC_LINKER_HARD_FLOAT /lib/ld-linux-armhf.so.3 +#if TARGET_DEFAULT_FLOAT_ABI == ARM_FLOAT_ABI_HARD +#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT +#else +#define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_SOFT_FLOAT +#endif +#define GLIBC_DYNAMIC_LINKER \ + %{mfloat-abi=hard: GLIBC_DYNAMIC_LINKER_HARD_FLOAT } \ + %{mfloat-abi=soft*: GLIBC_DYNAMIC_LINKER_SOFT_FLOAT } \ + %{!mfloat-abi=*: GLIBC_DYNAMIC_LINKER_DEFAULT } /* At this point, bpabi.h will have clobbered LINK_SPEC. We want to use the GNU/Linux version, not the generic BPABI version. */ This patch is broken. Please fix this. You can't use a named enumeration in cpp equality. The type ARM_FLOAT_ABI_HARD is a named enumeration and evaluates to 0 as an unknown identifier. Therefore #if TARGET_DEFAULT_FLOAT_ABI == ARM_FLOAT_ABI_HARD evaluates to #if 0 == 0 and is always true. Watch out that #define ARM_FLOAT_ABI_HARD ARM_FLOAT_ABI_HARD for such enums is not conforming C99/C11. I suggest you define the types as macros and then set the named enum to those values, then use the macros in the header equality checks. e.g. #define VAL1 0 then enum FOO { RVAL1 = VAL1, ... } Look at arm.h for the enum definition. I've looked further into this and I think the original pre-#if version is correct. The float ABI comes from these places: * The -mfloat-abi= command line argument, else * The --with-float= configure time argument, else * TARGET_DEFAULT_FLOAT_ABI from linux-eabi.h In the first case the ABI is explicit. In the second OPTION_DEFAULT_SPECS turns the configure time argument into an explict -mfloat-abi=. The patch below covers all cases, keeps the logic in the spec file, and adds a comment linking the two #defines. Tested by building with no configure flags, --wtih-float=softfp, --with-float=hard, and then running with all combinations of {,-mfloat-abi=softfp,-mfloat-abi=hard} {,-mglibc,-muclibc,-mbionic}. OK? -- Michael 2012-04-27 Michael Hope michael.h...@linaro.org * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER): Pick the loader using a spec rule. Michael, can you try this patch please. It should make it possible to then create linux-eabihf.h containing just #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_HARD #undef GLIBC_DYNAMIC_LINKER_DEFAULT #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT Which is not quite as simple as leaving out the second re-define, but pretty close. Hi Richard. Your patch tests just fine. I like it. You could change the spec rule to the newer if-elseif-else form but that's a nit. -- Michael
Re: [PATCH v2] ARM: Use different linker path for hardfloat ABI
On 04/30/2012 03:47 PM, Michael Hope wrote: 2012-04-27 Michael Hopemichael.h...@linaro.org * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER): Pick the loader using a spec rule. Michael, can you try this patch please. It should make it possible to then create linux-eabihf.h containing just #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_HARD #undef GLIBC_DYNAMIC_LINKER_DEFAULT #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT Which is not quite as simple as leaving out the second re-define, but pretty close. Hi Richard. Your patch tests just fine. I like it. You could change the spec rule to the newer if-elseif-else form but that's a nit. So who owns creating the appropriate glibc patch now that we've got a good gcc patch? jeff
Re: Fix find_moveable_pseudos, PR52997
On 04/28/2012 09:56 AM, Bernd Schmidt wrote: The problem is that resize_reg_info only resizes the data structure, but leaves it uninitialized. Argh. Something like this maybe (currently testing on i686-linux, ok if it passes?) That patch cleans up the SPEC build failures I was seeing and a pile of testsuite failures on PowerPC. -Pat
[patch] skip tpf configure tests
* crossconfig.m4: Since we know that all TPF builds are cross- builds and cannot run configuration-time link tests, do not allow it; just go with known supported linker options. * configure: Regenerate (called as GLIBCXX_CROSSCONFIG). Index: crossconfig.m4 === --- crossconfig.m4 (revision 187002) +++ crossconfig.m4 (working copy) @@ -219,14 +219,14 @@ case ${host} in AC_DEFINE(HAVE_ISNANF) AC_DEFINE(HAVE_MODFF) AC_DEFINE(HAVE_HYPOT) ;; *-tpf) SECTION_FLAGS='-ffunction-sections -fdata-sections' +SECTION_LDFLAGS='-Wl,--gc-sections $SECTION_LDFLAGS' AC_SUBST(SECTION_FLAGS) -GLIBCXX_CHECK_LINKER_FEATURES AC_DEFINE(HAVE_FINITE) AC_DEFINE(HAVE_FINITEF) AC_DEFINE(HAVE_FREXPF) AC_DEFINE(HAVE_HYPOTF) AC_DEFINE(HAVE_ISINF) AC_DEFINE(HAVE_ISINFF)
Re: [ping 6] [patch] attribute to reverse bitfield allocations (is anyone even reading these?)
Set up a cron job to ping once a day. :-) Did you ever dig up the Apple test cases from the APPLE LOCAL work I pointed you at earlier? They will be more comprehensive that any testing you've done, and, if you get them to all pass, the work should be closer to being complete. The feature needed a ton of testcases, a few didn't cut it. In going through the Apple test cases, I discovered one HUGE disadvantage to using __attribute__ to tag structures for bit reversal - it doesn't propogate. I.e.: typedef __attribute__((reverse)) struct { struct { int x:4; int y:4; } a; } b; The attribute seems to apply only to struct b, not to struct a. For PACKED, we handle the flag specially, propogating it at every step in the layout. The original Apple patch used a #pragma. Suggestions on how to make a structure attribute apply to the whole structure? Or should I *also* add a #pragma to specify the default bit ordering? This is what the Renesas ABIs want anyway, and what Apple did, but I was told to use an attribute and have a target pragma set the attribute, but I don't see how...
Re: [RFH / Patch] PR 51222
Hi again, On 04/29/2012 11:42 AM, Paolo Carlini wrote: This might just be a matter of calling for_each_template_parm and returning 1 if we see any template parameter. Good. Today I quickly tried something along these lines (see 'p' attachment) and got some fails: Ah, well. I guess for_each_template_parm doesn't look at the types of declarations. Just a few moments ago I noticed something interesting: if a NULL FN is passed to for_each_template_parm, it assumes a function which always returns 1, what we want when just searching for the first occurrence. Now, in practice, *no* front-code calls it like this! Thus, if we want, it's safe to tweak / extend for_each_template_parm_r for our purposes when !fn And indeed, the attached passes the testsuite with no regressions ;) I also want to remark that there is this kind of comment: case INDIRECT_REF: case COMPONENT_REF: /* If there's no type, then this thing must be some expression involving template parameters. */ if (!fn !TREE_TYPE (t)) return error_mark_node; + else if (!fn for_each_template_parm (TREE_TYPE (t), fn, + data, pfd-visited, + pfd-include_nondeduced_p)) +return error_mark_node; which, how can I say, appears to support the idea of tweaking / extending the code in the direction we like. Thus, my question would be: is something like the below in the right direction? The alternate possibility I can see, would be basically redoing a slightly slimmed version of for_each_template_parm specialized for our needs (a few less conditionals) Thanks! Paolo. Index: pt.c === --- pt.c(revision 187001) +++ pt.c(working copy) @@ -7861,6 +7861,10 @@ for_each_template_parm_r (tree *tp, int *walk_subt for_each_template_parm (DECL_CONTEXT (t), fn, data, pfd-visited, pfd-include_nondeduced_p)) return error_mark_node; + if (!fn TREE_TYPE (t) + for_each_template_parm (TREE_TYPE(t), fn, data, +pfd-visited, pfd-include_nondeduced_p)) + return error_mark_node; break; case BOUND_TEMPLATE_TEMPLATE_PARM: @@ -7905,6 +7909,11 @@ for_each_template_parm_r (tree *tp, int *walk_subt (TREE_TYPE (t)), fn, data, pfd-visited, pfd-include_nondeduced_p)) return error_mark_node; + else if (!fn TREE_TYPE (t) + for_each_template_parm (TREE_TYPE (t), fn, + data, pfd-visited, + pfd-include_nondeduced_p)) + return error_mark_node; break; case INDIRECT_REF: @@ -7913,6 +7922,10 @@ for_each_template_parm_r (tree *tp, int *walk_subt involving template parameters. */ if (!fn !TREE_TYPE (t)) return error_mark_node; + else if (!fn for_each_template_parm (TREE_TYPE (t), fn, + data, pfd-visited, + pfd-include_nondeduced_p)) + return error_mark_node; break; case MODOP_EXPR: @@ -19744,6 +19757,29 @@ type_dependent_expression_p (tree expression) return (dependent_type_p (TREE_TYPE (expression))); } +/* Returns TRUE if the EXPRESSION is instantiation-dependent, in the + sense defined by the ABI: + + An expression is instantiation-dependent if it is type-dependent + or value-dependent, or it has a subexpression that is type-dependent + or value-dependent. */ + +bool +instantiation_dependent_expression_p (tree expression) +{ + if (!processing_template_decl) +return false; + + if (expression == error_mark_node) +return false; + + if (!TREE_TYPE (expression)) +return true; + + return for_each_template_parm (expression, NULL, NULL, NULL, +/*include_nondeduced_p=*/true); +} + /* Like type_dependent_expression_p, but it also works while not processing a template definition, i.e. during substitution or mangling. */ Index: semantics.c === --- semantics.c (revision 187001) +++ semantics.c (working copy) @@ -5168,8 +5168,7 @@ finish_decltype_type (tree expr, bool id_expressio return error_mark_node; } - /* FIXME instantiation-dependent */ - if (type_dependent_expression_p (expr) + if (instantiation_dependent_expression_p (expr) /* In a template, a COMPONENT_REF has an IDENTIFIER_NODE for op1 even if it isn't dependent, so that we can check access control at instantiation time, so defer the decltype as well (PR 42277). */ Index: cp-tree.h === --- cp-tree.h (revision 187001) +++ cp-tree.h
Re: [ping 6] [patch] attribute to reverse bitfield allocations (is anyone even reading these?)
I've been reading your patches... On Apr 30, 2012, at 4:34 PM, DJ Delorie wrote: In going through the Apple test cases, I discovered one HUGE disadvantage to using __attribute__ to tag structures for bit reversal - it doesn't propogate. I.e.: This is why pragma exists. :-) Certainly, once you have the underlying support for an attribute, adding a pragma is easier.
Re: [PATCH] Fix predcom -fcompare-debug bug (PR debug/47684)
On Thu, Feb 10, 2011 at 1:44 PM, Jakub Jelinek ja...@redhat.com wrote: Hi! single_nonlooparound_use doesn't ignore debug uses, which results in -fcompare-debug failures. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-02-10 Jakub Jelinek ja...@redhat.com PR debug/47684 * tree-predcom.c (single_nonlooparound_use): Ignore debug uses. This may have caused: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53174 -- H.J.
Re: [RS6000] Fix PR53038, cfa_restore out of order
On Thu, Apr 26, 2012 at 1:00 AM, Alan Modra amo...@gmail.com wrote: PR target/53038 * config/rs6000/rs6000.c (load_lr_save, restore_saved_lr, load_cr_save, add_crlr_cfa_restore): New functions. (rs6000_restore_saved_cr): Rename to.. (restore_saved_cr): ..this. Add cfa_restore notes for cr. (rs6000_emit_epilogue): Use new functions. Adjust condition for emitting lr and cr cfa_restore. Emit cfa_restores for fp regs when using out-of-line restore only when shrink wrapping. Okay. Thanks, David
Re: [PATCH v2] ARM: Use different linker path for hardfloat ABI
On 1 May 2012 10:01, Jeff Law l...@redhat.com wrote: On 04/30/2012 03:47 PM, Michael Hope wrote: 2012-04-27 Michael Hopemichael.h...@linaro.org * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER): Pick the loader using a spec rule. Michael, can you try this patch please. It should make it possible to then create linux-eabihf.h containing just #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_HARD #undef GLIBC_DYNAMIC_LINKER_DEFAULT #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT Which is not quite as simple as leaving out the second re-define, but pretty close. Hi Richard. Your patch tests just fine. I like it. You could change the spec rule to the newer if-elseif-else form but that's a nit. So who owns creating the appropriate glibc patch now that we've got a good gcc patch? Carlos is working on the GLIBC patch: http://sourceware.org/ml/libc-ports/2012-04/msg00171.html -- Michael
[PATCH] Improve COND_EXPR expansion
Hi, This patch improves the expansion of COND_EXPR into RTL, directly using conditional moves. I had to fix a bug in the x86 backend where emit_conditional_move could cause a crash as we had a comparison mode of DImode which is not handled by the 32bit part. can_conditionally_move_p return true as we had an SImode for the other operands. Note other targets might need a similar fix as x86 had but I could not test those targets and this is really the first time where emit_conditional_move is being called with different modes for the comparison and the other operands mode and the comparison mode is not of the CC class. The main reasoning to do this conversion early rather than wait for ifconv as the resulting code is slightly better. Also the compiler is slightly faster. OK? Bootstrapped and tested on both mips64-linux-gnu (where it was originally written for) and x86_64-linux-gnu. Thanks, Andrew Pinski ChangeLog: * expr.c (convert_tree_comp_to_rtx): New function. (expand_expr_real_2): Try using conditional moves for COND_EXPRs if they exist. * config/i386/i386.c (ix86_expand_int_movcc): Disallow comparison modes of DImode for 32bits and TImode. Index: expr.c === --- expr.c (revision 186954) +++ expr.c (working copy) @@ -7344,6 +7344,64 @@ highest_pow2_factor_for_target (const_tr return MAX (factor, talign); } +/* Convert the tree comparision code TCODE to the rtl one where the + signedness is UNSIGNEDP. */ + +static enum rtx_code +convert_tree_comp_to_rtx (enum tree_code tcode, int unsignedp) +{ + enum rtx_code code; + switch (tcode) +{ +case EQ_EXPR: + code = EQ; + break; +case NE_EXPR: + code = NE; + break; +case LT_EXPR: + code = unsignedp ? LTU : LT; + break; +case LE_EXPR: + code = unsignedp ? LEU : LE; + break; +case GT_EXPR: + code = unsignedp ? GTU : GT; + break; +case GE_EXPR: + code = unsignedp ? GEU : GE; + break; +case UNORDERED_EXPR: + code = UNORDERED; + break; +case ORDERED_EXPR: + code = ORDERED; + break; +case UNLT_EXPR: + code = UNLT; + break; +case UNLE_EXPR: + code = UNLE; + break; +case UNGT_EXPR: + code = UNGT; + break; +case UNGE_EXPR: + code = UNGE; + break; +case UNEQ_EXPR: + code = UNEQ; + break; +case LTGT_EXPR: + code = LTGT; + break; + +default: + gcc_unreachable (); +} + return code; +} + /* Subroutine of expand_expr. Expand the two operands of a binary expression EXP0 and EXP1 placing the results in OP0 and OP1. The value may be stored in TARGET if TARGET is nonzero. The @@ -8851,8 +8909,7 @@ expand_expr_real_2 (sepops ops, rtx targ safe_from_p (original_target, treeop0, 1) GET_MODE (original_target) == mode #ifdef HAVE_conditional_move - (! can_conditionally_move_p (mode) - || REG_P (original_target)) + 0 #endif !MEM_P (original_target)) temp = original_target; @@ -8860,6 +8917,82 @@ expand_expr_real_2 (sepops ops, rtx targ temp = assign_temp (type, 0, 0, 1); do_pending_stack_adjust (); +#if HAVE_conditional_move + if (!can_conditionally_move_p (mode)) +mode = promote_mode (type, mode, unsignedp); + if (can_conditionally_move_p (mode)) + { + rtx insn; + rtx op00, op01; + enum rtx_code comparison_code; + enum machine_mode comparison_mode; + start_sequence (); + + expand_operands (treeop1, treeop2, + temp, op1, op2, EXPAND_NORMAL); + if (TREE_CODE (treeop0) == SSA_NAME) + { + gimple srcstmt; + srcstmt = get_gimple_for_ssa_name (treeop0); + if (srcstmt + TREE_CODE_CLASS (gimple_assign_rhs_code (srcstmt)) + == tcc_comparison) + { + tree type = TREE_TYPE (gimple_assign_rhs1 (srcstmt)); + op00 = expand_normal (gimple_assign_rhs1 (srcstmt)); + op01 = expand_normal (gimple_assign_rhs2 (srcstmt)); + comparison_code = convert_tree_comp_to_rtx (gimple_assign_rhs_code (srcstmt), TYPE_UNSIGNED (type)); + comparison_mode = TYPE_MODE (type); + unsignedp = TYPE_UNSIGNED (type); + } + else + goto non_comparison_cond_expr; + } + else if (TREE_CODE_CLASS (TREE_CODE (treeop0)) == tcc_comparison) + { + tree type = TREE_TYPE (TREE_OPERAND (treeop0, 0)); + op00 = expand_normal (TREE_OPERAND (treeop0, 0)); + op01 = expand_normal (TREE_OPERAND (treeop0, 1)); + comparison_code = convert_tree_comp_to_rtx (TREE_CODE (treeop0), TYPE_UNSIGNED (type)); + comparison_mode = TYPE_MODE
Re: [PATCH v2] ARM: Use different linker path for hardfloat ABI
On 04/30/2012 08:43 PM, Michael Hope wrote: On 1 May 2012 10:01, Jeff Lawl...@redhat.com wrote: On 04/30/2012 03:47 PM, Michael Hope wrote: 2012-04-27 Michael Hopemichael.h...@linaro.org * config/arm/linux-eabi.h (GLIBC_DYNAMIC_LINKER): Pick the loader using a spec rule. Michael, can you try this patch please. It should make it possible to then create linux-eabihf.h containing just #undef TARGET_DEFAULT_FLOAT_ABI #define TARGET_DEFAULT_FLOAT_ABI ARM_FLOAT_ABI_HARD #undef GLIBC_DYNAMIC_LINKER_DEFAULT #define GLIBC_DYNAMIC_LINKER_DEFAULT GLIBC_DYNAMIC_LINKER_HARD_FLOAT Which is not quite as simple as leaving out the second re-define, but pretty close. Hi Richard. Your patch tests just fine. I like it. You could change the spec rule to the newer if-elseif-else form but that's a nit. So who owns creating the appropriate glibc patch now that we've got a good gcc patch? Carlos is working on the GLIBC patch: http://sourceware.org/ml/libc-ports/2012-04/msg00171.html Ah, not on the main glibc list... That's why I missed it. Thanks. jeff
Re: rs6000 toc reference rtl again
This revision splits the medium/large code model toc reference after reload. I expected this to be more difficult, but it turned out surprisingly easy. Besides creating the rtl that way, and tweaking toc_relative_expr_p to match, it was just: Move the 'R' constraint handling out of various movsi/di instructions into a new instruction that is split after reload, and modify legitimize_reload_address to handle all the other splits inside MEMs. Bootstrapped and regression tested powerpc-linux. I also intend to spec test this change. Generated code has the expected improvements: libstdc++.so saw a small reduction in code size due to removing silly uses of call-saved regs to hold the high part of a toc reference. OK to apply? * config/rs6000/predicates.md (input_operand): Don't match constant pool addresses. Remove label_ref, high and plus from match_code list. Remove redundant CONSTANT_P test. (splat_input_operand): Similarly update match_code list. (small_toc_ref): New predicate. * config/rs6000/rs6000-protos.h (toc_relative_expr_p): Update prototype. * config/rs6000/rs6000.c (tocrel_base, tocrel_offset): Make const. (legitimate_constant_pool_address_p): Move TARGET_TOC test and register checks to.. (toc_relative_expr_p): ..here. Add strict param. Match new rtl generated by create_TOC_reference. (rs6000_legitimize_address): Update cerate_TOC_reference call. (rs6000_delegitimize_address): Handle new rtl for toc refs. (rs6000_cannot_force_const_mem, rs6000_find_base_term): Likewise. (use_toc_relative_ref): New function, split out from.. (rs6000_emit_move): ..here. Remove redundant tests. Update create_TOC_reference calls. (rs6000_legitimize_reload_address): Formatting. Handle splitting of medium/large model toc addresses. Use use_toc_relative_ref. (print_operand): Formatting, style. Adjust for toc changes. (print_operand_address): Likewise. (rs6000_output_addr_const_extra): Likewise. (create_TOC_reference): Put TOC_REGISTER in UNSPEC_TOCREL rather than a PLUS. Use this formulation for both high and low part of -mcmodel=medium/large toc reference too. Before reload, always use the small model formulation. * config/rs6000/rs6000.md (tls_gd, tls_gd_high): Similarly avoid a PLUS in high part of addresses here. (tls_ld, tls_ld_high, tls_got_dtprel, tls_got_dtprel_high): Likewise. (tls_got_tprel, tls_got_tprel_high, largetoc_high): Likewise. (largetoc_high, largetoc_low): Move earlier. Cope when no base reg available. (largetoc_high_plus): New insn. (movsi_internal1, movsi_internal1_single, movsf_softfloat, movdi_mfpgpr, movdi_internal64): Don't handle 'R' constraint here.. (tocref): ..instead do so here, new insn and split. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 187010) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -5280,15 +5280,33 @@ constant_pool_expr_p (rtx op) ASM_OUTPUT_SPECIAL_POOL_ENTRY_P (get_pool_constant (base), Pmode)); } -static rtx tocrel_base, tocrel_offset; +static const_rtx tocrel_base, tocrel_offset; bool -toc_relative_expr_p (rtx op) +toc_relative_expr_p (const_rtx op, bool strict) { - if (GET_CODE (op) != CONST) + if (!TARGET_TOC) return false; - split_const (op, tocrel_base, tocrel_offset); + if (TARGET_CMODEL != CMODEL_SMALL) +{ + /* Only match the low part. */ + if (GET_CODE (op) == LO_SUM + REG_P (XEXP (op, 0)) + INT_REG_OK_FOR_BASE_P (XEXP (op, 0), strict)) + op = XEXP (op, 1); + else if (strict) + return false; +} + + tocrel_base = op; + tocrel_offset = const0_rtx; + if (GET_CODE (op) == PLUS CONST_INT_P (XEXP (op, 1))) +{ + tocrel_base = XEXP (op, 0); + tocrel_offset = XEXP (op, 1); +} + return (GET_CODE (tocrel_base) == UNSPEC XINT (tocrel_base, 1) == UNSPEC_TOCREL); } @@ -5300,14 +5318,7 @@ bool legitimate_constant_pool_address_p (const_rtx x, enum machine_mode mode, bool strict) { - return (TARGET_TOC - (GET_CODE (x) == PLUS || GET_CODE (x) == LO_SUM) - GET_CODE (XEXP (x, 0)) == REG - (REGNO (XEXP (x, 0)) == TOC_REGISTER - || ((TARGET_MINIMAL_TOC - || TARGET_CMODEL != CMODEL_SMALL) - INT_REG_OK_FOR_BASE_P (XEXP (x, 0), strict))) - toc_relative_expr_p (XEXP (x, 1)) + return (toc_relative_expr_p (x, strict) (TARGET_CMODEL != CMODEL_MEDIUM || constant_pool_expr_p (XVECEXP (tocrel_base, 0, 0)) || mode == QImode @@ -5687,10 +5698,7 @@ rs6000_legitimize_address (rtx x, rtx oldx ATTRIBU GET_CODE
Re: [RFH / Patch] PR 51222
On 04/30/2012 07:37 PM, Paolo Carlini wrote: Thus, my question would be: is something like the below in the right direction? The alternate possibility I can see, would be basically redoing a slightly slimmed version of for_each_template_parm specialized for our needs (a few less conditionals) I think either approach would be fine; I lean toward the first, but changing the name and adding a flag for clarity. Changing the walking behavior based on fn being null is too subtle. Jason
[rx] add initial rtx_costs() function
Initial implementation of RTX_COSTS target function for rx-elf. Minor increase in coremark scores, and enables division by multiplication of reciprocals, tested on trunk and 4.7. Ok for trunk and/or 4.7 branch? * config/rx/rx.c (TARGET_RTX_COSTS): Define. (rx_rtx_costs): New. Index: gcc/config/rx/rx.c === --- gcc/config/rx/rx.c (revision 186534) +++ gcc/config/rx/rx.c (working copy) @@ -2738,12 +2738,65 @@ rx_address_cost (rtx addr, bool speed) /* Try to discourage REG + large OFF when optimizing for size. */ return COSTS_N_INSNS (2); return COSTS_N_INSNS (1); } +#undef TARGET_RTX_COSTS +#define TARGET_RTX_COSTS rx_rtx_costs + +static bool +rx_rtx_costs (rtx x, + int code, + int outer_code ATTRIBUTE_UNUSED, + int opno ATTRIBUTE_UNUSED, + int * total, + bool speed ATTRIBUTE_UNUSED) +{ + switch (code) +{ +case MULT: + if (GET_MODE (x) == DImode) + { + *total = COSTS_N_INSNS (2); + return true; + } + /* fall through */ +case PLUS: +case MINUS: +case AND: +case COMPARE: +case IOR: +case XOR: + if (GET_CODE (XEXP (x, 0)) == MEM + || GET_CODE (XEXP (x, 1)) == MEM) + *total = COSTS_N_INSNS (3); + else + *total = COSTS_N_INSNS (1); + return true; + +case DIV: + /* This is worst case. */ + *total = COSTS_N_INSNS (20); + return true; + +case UDIV: + /* This is worst case. */ + *total = COSTS_N_INSNS (18); + return true; + +case IF_THEN_ELSE: + *total = COSTS_N_INSNS (3); + return true; + +default: + break; +} + return false; +} + static bool rx_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to) { /* We can always eliminate to the frame pointer. We can eliminate to the stack pointer unless a frame pointer is needed. */