Re: Do not compute alias sets for types that don't need them
On Tue, 26 May 2015, Jan Hubicka wrote:

Hi,

On Fri, 22 May 2015, Jan Hubicka wrote:

Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 223508)
+++ tree-streamer-out.c (working copy)
@@ -346,6 +346,7 @@ pack_ts_type_common_value_fields (struct
      alias-set zero to this type.  */
   bp_pack_var_len_int (bp, (TYPE_ALIAS_SET (expr) == 0
                             || (!in_lto_p
+                                && type_with_alias_set_p (expr)
                                 && get_alias_set (expr) == 0)) ? 0 : -1);

I find such interfaces very ugly. IOW, when it's always (or often) necessary to call check_foo_p() before foo() can be called then the checking should be part of foo() (and it should then return a conservative value, i.e. alias set 0), and that requirement not be imposed on the callers of foo(). I.e. why can't whatever checks you do in type_with_alias_set_p be included in get_alias_set?

Because of sanity checking: I want to make alias sets of those types undefined rather than having random values. The point is that using the alias set in an alias oracle query is wrong.

You could have just returned 0 for the alias-set for !type_with_alias_set_p in get_alias_set. That avoids polluting the alias data structures and is neither random nor wrong.

Now I run into the case that we do produce MEM exprs for incomplete variants just to take their address, so I was thinking the other day about defining an invalid alias set -2, making get_alias_set return it and ICE later when a query is actually made? We do have wrong query problems at least in ipa-icf, so I think it is a worthwhile sanity check.

+ front-end routine) and use it.
+
+ We may be called to produce MEM RTX for variable of incomplete type.
+ This MEM RTX will only be used to produce address of a variable, so
+ we do not need to compute alias set.  */
+ if (!DECL_P (t) || type_with_alias_set_p (TYPE_MAIN_VARIANT (TREE_TYPE (t))))
+   attrs.alias = get_alias_set (t);

And if the checking needs to go down the main-variant chain then this should be done inside type_with_alias_set_p(), not in the caller, otherwise even the symmetry between arguments of type_with_alias_set_p(xy) and get_alias_set(xy) is destroyed (but see above for why I think type_with_alias_set_p shouldn't even exist).

Yep, good point - I will clean this up.

Honza

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
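The two interface designs argued over in this message can be sketched roughly as follows. This is a toy model only, assuming made-up names (toy_type, toy_get_alias_set, TOY_INVALID_SET and friends); none of it is GCC's real tree or alias-set API. Variant A folds the check into the getter and answers with the conservative alias set 0; variant B returns an invalid marker and traps only when a query is actually made.

```cpp
// Toy model only: these types and functions are hypothetical stand-ins,
// not GCC's real tree or alias-set API.
#include <cassert>

struct toy_type {
  bool complete;    // stands in for "type_with_alias_set_p would be true"
  int cached_set;   // -1 means "no alias set computed yet"
};

static int next_toy_set = 1;  // fresh alias-set numbers start at 1

// Plays the role of the separate predicate discussed in the thread.
static bool toy_type_with_alias_set_p(const toy_type &t) { return t.complete; }

// Variant A: the getter performs the check itself and answers with the
// conservative alias set 0, so callers never need to call the predicate.
static int toy_get_alias_set(toy_type &t) {
  if (!toy_type_with_alias_set_p(t))
    return 0;                       // conservative; nothing is cached
  if (t.cached_set == -1)
    t.cached_set = next_toy_set++;  // compute once, then reuse
  return t.cached_set;
}

// Variant B: the getter hands back an invalid marker and the error surfaces
// only if somebody actually queries the oracle with it.
static const int TOY_INVALID_SET = -2;

static int toy_get_alias_set_or_invalid(toy_type &t) {
  return toy_type_with_alias_set_p(t) ? toy_get_alias_set(t) : TOY_INVALID_SET;
}

static bool toy_alias_sets_conflict(int set1, int set2) {
  // The sanity check: an invalid set must never reach a real query.
  assert(set1 != TOY_INVALID_SET && set2 != TOY_INVALID_SET);
  return set1 == 0 || set2 == 0 || set1 == set2;
}
```

Variant A keeps callers simple at the cost of silently accepting queries that arguably should not happen; variant B keeps the sanity check but moves it to the point of use rather than imposing it on every caller.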
Re: [patch] Move generic tree functions from expr.h to tree.h
On Wed, May 27, 2015 at 12:00 PM, Eric Botcazou ebotca...@adacore.com wrote:

Hi,

a few functions manipulating generic trees from expr.c are useful for FEs too and some of them (array_ref_{low,up}_bound, get_inner_reference) are already declared in tree.h instead of expr.h. This patch moves 3 similar functions (array_ref_element_size, array_at_struct_end_p, component_ref_field_offset).

Tested on x86_64-suse-linux, OK for the mainline?

No. Prototypes of functions defined in A.c should be in A.h, not in some other header. We've been (slowly) moving to that. You should have moved them all to expr.h instead, or move the implementations to tree.c.

Richard.

2015-05-27 Eric Botcazou ebotca...@adacore.com

* expr.h (array_at_struct_end_p): Move to...
(array_ref_element_size): Likewise.
(component_ref_field_offset): Likewise.
* tree.h (array_ref_element_size): ...here.
(array_at_struct_end_p): Likewise.
(component_ref_field_offset): Likewise.
* expr.c (array_ref_up_bound): Move around.

--
Eric Botcazou
[PATCH] Fix last SLP analysis refactoring
This fixes the last SLP analysis refactoring to _really_ pass the SLP node to the analysis functions. It also moves the premature out in the loop analysis code (it fails to consider pattern stmts for one). Finally this properly implements the slp_perm check for strided loads in vectorizable_load (now that slp_node is passed down) and it also adds dumping of hybrid detected stmts. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2015-05-27 Richard Biener rguent...@suse.de * tree-vect-stmts.c (vectorizable_load): Initialize slp_perm earlier and remove ??? comment. (vect_analyze_stmt): If we are analyzing a pure SLP stmt and got called from loop analysis bail out. Always pass the SLP node to the vectorizable_* functions. * tree-vect-loop.c (vect_analyze_loop_operations): Remove the premature SLP check here. * tree-vect-slp.c (vect_detect_hybrid_slp_stmts): Dump hybrid detected SLP stmts. (vect_detect_hybrid_slp_1): Likewise. Index: gcc/tree-vect-stmts.c === *** gcc/tree-vect-stmts.c (revision 223737) --- gcc/tree-vect-stmts.c (working copy) *** vectorizable_load (gimple stmt, gimple_s *** 5940,5945 --- 5940,5948 return false; } + if (slp SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()) + slp_perm = true; + group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt)); if (!slp !PURE_SLP_STMT (stmt_info) *** vectorizable_load (gimple stmt, gimple_s *** 6004,6013 (slp || PURE_SLP_STMT (stmt_info))) (group_size nunits || nunits % group_size != 0 ! /* ??? During analysis phase we are not called with the !slp node/instance we are in so whether we'll end up !with a permutation we don't know. Still we don't !support load permutations. */ || slp_perm)) { dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, --- 6007,6013 (slp || PURE_SLP_STMT (stmt_info))) (group_size nunits || nunits % group_size != 0 ! /* We don't support load permutations. 
*/ || slp_perm)) { dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, *** vectorizable_load (gimple stmt, gimple_s *** 6402,6409 { grouped_load = false; vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); - if (SLP_TREE_LOAD_PERMUTATION (slp_node).exists ()) - slp_perm = true; group_gap = GROUP_GAP (vinfo_for_stmt (first_stmt)); } else --- 6402,6407 *** vect_analyze_stmt (gimple stmt, bool *ne *** 7371,7403 *need_to_vectorize = true; } !ok = true; !if (!bb_vinfo ! (STMT_VINFO_RELEVANT_P (stmt_info) !|| STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def)) ! ok = (vectorizable_simd_clone_call (stmt, NULL, NULL, NULL) ! || vectorizable_conversion (stmt, NULL, NULL, NULL) ! || vectorizable_shift (stmt, NULL, NULL, NULL) ! || vectorizable_operation (stmt, NULL, NULL, NULL) ! || vectorizable_assignment (stmt, NULL, NULL, NULL) ! || vectorizable_load (stmt, NULL, NULL, NULL, NULL) ! || vectorizable_call (stmt, NULL, NULL, NULL) ! || vectorizable_store (stmt, NULL, NULL, NULL) ! || vectorizable_reduction (stmt, NULL, NULL, NULL) ! || vectorizable_condition (stmt, NULL, NULL, NULL, 0, NULL)); ! else ! { ! if (bb_vinfo) ! ok = (vectorizable_simd_clone_call (stmt, NULL, NULL, node) ! || vectorizable_conversion (stmt, NULL, NULL, node) ! || vectorizable_shift (stmt, NULL, NULL, node) ! || vectorizable_operation (stmt, NULL, NULL, node) ! || vectorizable_assignment (stmt, NULL, NULL, node) ! || vectorizable_load (stmt, NULL, NULL, node, NULL) ! || vectorizable_call (stmt, NULL, NULL, node) ! || vectorizable_store (stmt, NULL, NULL, node) ! || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node)); ! } if (!ok) { --- 7369,7408 *need_to_vectorize = true; } ! if (PURE_SLP_STMT (stmt_info) !node) ! { ! dump_printf_loc (MSG_NOTE, vect_location, ! handled only by SLP analysis\n); ! return true; ! } ! ! ok = true; ! if (!bb_vinfo !(STMT_VINFO_RELEVANT_P (stmt_info) ! || STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def)) ! 
ok = (vectorizable_simd_clone_call (stmt, NULL, NULL, node) ! || vectorizable_conversion (stmt, NULL, NULL, node) ! ||
Re: [PATCH][ARM] Restrict MAX_CONDITIONAL_EXECUTE when -mrestrict-it is in place
Ping. Here is the rebased (and retested) patch after Christian's series.

Thanks,
Kyrill

On 18/05/15 11:26, Kyrill Tkachov wrote:

Hi all,

When using the short Thumb2 IT blocks we want to also restrict ifcvt so that it will not end up generating a number of back-to-back cond_execs that will later end up being back-to-back single-instruction IT blocks. Branching over them should be a better choice. This patch implements that by setting max_insns_skipped to 1 when arm_restrict_it.

With this patch, I've seen GCC replace a number of sequences in places like SPEC2006 from:

it      eq
moveq   r1, r5
it      ne
movne   r1, r10
it      eq
moveq   r8, r4

to a branch over them.

Bootstrapped and tested on arm. Ok for trunk?

Thanks,
Kyrill

2015-05-18 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (arm_option_params_internal): When optimising for speed set max_insns_skipped when arm_restrict_it.

2015-05-18 Kyrylo Tkachov kyrylo.tkac...@arm.com

* gcc.target/arm/short-it-ifcvt-1.c: New test.
* gcc.target/arm/short-it-ifcvt-2.c: Likewise.

commit 2e5bb6e122e96189af1774a4fa451ad7e9b44d3d
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date: Thu May 14 12:08:14 2015 +0100

    [ARM] Restrict MAX_CONDITIONAL_EXECUTE when -mrestrict-it is in place

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a4eeba3..638d659 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2788,7 +2788,10 @@ arm_option_params_internal (struct gcc_options *opts)
       max_insns_skipped = opts->x_arm_restrict_it ? 1 : 4;
     }
   else
-    max_insns_skipped = current_tune->max_insns_skipped;
+    /* When -mrestrict-it is in use tone down the if-conversion.  */
+    max_insns_skipped
+      = (TARGET_THUMB2_P (opts->x_target_flags) && opts->x_arm_restrict_it)
+        ? 1 : current_tune->max_insns_skipped;
 }
 
 /* Reset options between modes that the user has specified.  */
diff --git a/gcc/testsuite/gcc.target/arm/short-it-ifcvt-1.c b/gcc/testsuite/gcc.target/arm/short-it-ifcvt-1.c
new file mode 100644
index 000..f3d29b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/short-it-ifcvt-1.c
@@ -0,0 +1,23 @@
+/* Test that ifcvt is not being too aggressive when -mrestrict-it.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mrestrict-it" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+int
+f1(int x, int y, int z)
+{
+  if (x > 100)
+    {
+      x++;
+      z = -z;
+    }
+  else
+    {
+      x = -x;
+      y = -y;
+      z = 1;
+    }
+  return x + y + z;
+}
+
+/* { dg-final { scan-assembler "b(gt|le)" } } */
diff --git a/gcc/testsuite/gcc.target/arm/short-it-ifcvt-2.c b/gcc/testsuite/gcc.target/arm/short-it-ifcvt-2.c
new file mode 100644
index 000..9ac8153
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/short-it-ifcvt-2.c
@@ -0,0 +1,21 @@
+/* Test that ifcvt is not being too aggressive when -mrestrict-it.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mrestrict-it" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+int
+f1(int x, int y, int z)
+{
+  if (x > 100)
+    {
+      x++;
+      z = -z;
+    }
+  else
+    {
+      x = -x;
+      y = -y;
+    }
+  return x + y + z;
+}
+/* { dg-final { scan-assembler "b(gt|le)" } } */
Re: Add few cases to operand_equal_p
On Tue, 26 May 2015, Jan Hubicka wrote:

Will do if we agree on having this.

I know you would like ipa-icf to keep original bodies and use them for inlining, declaring alias sets to be function local. This is the wrong plan. Consider:

void t(int *ptr) { *ptr=1; }

int a(int *ptr1, int *ptr2) { int a = *ptr1; t(ptr2); return a+*ptr1; }

long b(long *ptr1, int *ptr2) { int a = *ptr1; t(ptr2); return a+*ptr1; }

here aliasing leads to the two functions being optimized differently:

a:
.LFB1:
        .cfi_startproc
        movl    4(%esp), %edx
        movl    8(%esp), %ecx
        movl    (%edx), %eax
        movl    $1, (%ecx)
        addl    (%edx), %eax
        ret
        .cfi_endproc
b:
.LFB2:
        .cfi_startproc
        movl    4(%esp), %eax
        movl    8(%esp), %edx
        movl    (%eax), %eax
        movl    $1, (%edx)
        addl    %eax, %eax
        ret
        .cfi_endproc

however with -fno-early-inlining the functions look identical (modulo alias sets) at ipa-icf time. If we merged a/b, we could get wrong code for a even though no inlining of a or b happens.

First of all the return types don't agree so the testcase is bogus.

With -m32 they are types_compatible_p because they are of same size.

So either we match the alias sets or we need to verify that the alias sets permit precisely the same set of optimizations with taking possible inlining into account.

Hmm, but then what makes ICF of a and b _with_ early inlining fail with -fno-tree-fre1? The casts from *ptr1 to int in the 'long' case.

Dereferencing *ptr1 that has different alias set in each function.

So I think I need to see a real testcase and then I'll show you even with no inlining after ICF you get wrong-code thus it is a bug in ICF ;)

I added the inline only to make it clear that the loads won't be optimized at early optimization time.

long a(int *ptr1, int *ptr2) { int a = *ptr1; *ptr2=1; return a+*ptr1; }

long b(long *ptr1, int *ptr2) { int a = *ptr1; *ptr2=1; return a+*ptr1; }

with -fno-tree-fre may be more real

a (int * ptr1, int * ptr2)
{
  int a;
  int D.1380;
  long int D.1379;
  int _4;
  long int _5;

  <bb 2>:
  a_2 = *ptr1_1(D);
  *ptr2_3(D) = 1;
  _4 = *ptr1_1(D);
  _5 = _4 + a_2;

<L0>:
  return _5;
}

;; Function b (b, funcdef_no=1, decl_uid=1375, cgraph_uid=1)

b (long int * ptr1, int * ptr2)
{
  int a;
  long int D.1383;
  long int D.1382;
  long int _4;
  long int _5;

  <bb 2>:
  a_2 = *ptr1_1(D);
  *ptr2_3(D) = 1;
  _4 = *ptr1_1(D);
  _5 = _4 + a_2;

<L0>:
  return _5;
}

Yes, so this shows using original bodies for inlining isn't the issue. The issue is that we can't really ignore TBAA (completely?) when merging function bodies, independent of any issues that pop up when inlining merged bodies.

We should have the above as a testcase in the testsuite (with both source orders of a and b to make sure ICF will eventually pick both).

Now the question is whether we can in some way still merge the above two functions and retain (some) TBAA, like by making sure to adjust all MEM_REFs to use union { type1; type2; } for the TBAA type... (eh). No longer globbing all pointer types will even make std::vector<ptr> no longer mergeable...

Richard.

I also do not believe that TBAA should be function local. I believe it is useful to propagate stuff interprocedurally, like ipa-prop could be able to propagate this:

long *ptr1;
int *ptr2;
t(int *ptr) { return *ptr; }
wrap(int *ptr) { *ptr1=1; }
call() { return wrap (*ptr2); }

and we could have an ipa-reference style pass that collects alias sets read/written by a function and uses it during local optimization to figure out if there is a true dependence between function call and memory store.

Sure, but after ICF there is no IPA propagation...

Doesn't matter if you propagate before or after ICF. If you do before, ICF would need to match/merge the alias set in optimization summary to be sure that the functions are same.

Honza

Richard.

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
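For readers who want to try the second testcase from this message, here is a minimal sketch of it as a compilable unit. The function bodies follow the thread; the names fn_a, fn_b and the check function are additions for illustration, and the driver only uses calls that are well defined (fn_b is never called with pointers to the same object, since that would violate the aliasing rules the thread is about).

```cpp
// The two functions from the thread, renamed; the driver is an added
// illustration and is not part of the original testcase.
#include <cassert>

long fn_a(int *ptr1, int *ptr2) {
  int first = *ptr1;
  *ptr2 = 1;               // may alias *ptr1: both accesses are int
  return first + *ptr1;    // so *ptr1 has to be loaded again
}

long fn_b(long *ptr1, int *ptr2) {
  int first = *ptr1;
  *ptr2 = 1;               // under TBAA an int store cannot modify a long
  return first + *ptr1;    // so the compiler may reuse the first load
}

void check_tbaa_example() {
  int x = 5;
  assert(fn_a(&x, &x) == 6);   // aliasing call is valid here: 5 + 1

  long v = 5;
  int w = 0;
  assert(fn_b(&v, &w) == 10);  // distinct objects: 5 + 5
  assert(w == 1);
}
```

The first assertion is the interesting one: if fn_a were replaced by a body optimized under fn_b's alias sets, the reload of *ptr1 could be dropped and the aliasing call would return 10 instead of 6.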
[patch] Move generic tree functions from expr.h to tree.h
Hi, a few functions manipulating generic trees from expr.c are useful for FEs too and some of them (array_ref_{low,up}_bound, get_inner_reference) are already declared in tree.h instead of expr.h. This patch moves 3 similar functions (array_ref_element_size, array_at_struct_end_p, component_ref_field_offset). Tested on x86_64-suse-linux, OK for the mainline? 2015-05-27 Eric Botcazou ebotca...@adacore.com * expr.h (array_at_struct_end_p): Move to... (array_ref_element_size): Likewise. (component_ref_field_offset): Likewise. * tree.h (array_ref_element_size): ...here. (array_at_struct_end_p): Likewise. (component_ref_field_offset): Likewise. * expr.c (array_ref_up_bound): Move around. -- Eric BotcazouIndex: expr.h === --- expr.h (revision 223736) +++ expr.h (working copy) @@ -281,19 +281,10 @@ rtx get_personality_function (tree); extern int can_move_by_pieces (unsigned HOST_WIDE_INT, unsigned int); extern unsigned HOST_WIDE_INT highest_pow2_factor (const_tree); -bool array_at_struct_end_p (tree); - -/* Return a tree of sizetype representing the size, in bytes, of the element - of EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ -extern tree array_ref_element_size (tree); extern bool categorize_ctor_elements (const_tree, HOST_WIDE_INT *, HOST_WIDE_INT *, bool *); -/* Return a tree representing the offset, in bytes, of the field referenced - by EXP. This does not include any offset in DECL_FIELD_BIT_OFFSET. */ -extern tree component_ref_field_offset (tree); - extern void expand_operands (tree, tree, rtx, rtx*, rtx*, enum expand_modifier); Index: expr.c === --- expr.c (revision 223736) +++ expr.c (working copy) @@ -7002,6 +7002,23 @@ array_ref_low_bound (tree exp) return build_int_cst (TREE_TYPE (TREE_OPERAND (exp, 1)), 0); } +/* Return a tree representing the upper bound of the array mentioned in + EXP, an ARRAY_REF or an ARRAY_RANGE_REF. 
*/ + +tree +array_ref_up_bound (tree exp) +{ + tree domain_type = TYPE_DOMAIN (TREE_TYPE (TREE_OPERAND (exp, 0))); + + /* If there is a domain type and it has an upper bound, use it, substituting + for a PLACEHOLDER_EXPR as needed. */ + if (domain_type TYPE_MAX_VALUE (domain_type)) +return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_MAX_VALUE (domain_type), exp); + + /* Otherwise fail. */ + return NULL_TREE; +} + /* Returns true if REF is an array reference to an array at the end of a structure. If this is the case, the array may be allocated larger than its upper bound implies. */ @@ -7039,23 +7056,6 @@ array_at_struct_end_p (tree ref) return true; } -/* Return a tree representing the upper bound of the array mentioned in - EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ - -tree -array_ref_up_bound (tree exp) -{ - tree domain_type = TYPE_DOMAIN (TREE_TYPE (TREE_OPERAND (exp, 0))); - - /* If there is a domain type and it has an upper bound, use it, substituting - for a PLACEHOLDER_EXPR as needed. */ - if (domain_type TYPE_MAX_VALUE (domain_type)) -return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_MAX_VALUE (domain_type), exp); - - /* Otherwise fail. */ - return NULL_TREE; -} - /* Return a tree representing the offset, in bytes, of the field referenced by EXP. This does not include any offset in DECL_FIELD_BIT_OFFSET. */ Index: tree.h === --- tree.h (revision 223736) +++ tree.h (working copy) @@ -5051,12 +5051,6 @@ tree_int_cst_compare (const_tree t1, con extern void set_decl_rtl (tree, rtx); extern bool complete_ctor_at_level_p (const_tree, HOST_WIDE_INT, const_tree); -/* Return a tree representing the upper bound of the array mentioned in - EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ -extern tree array_ref_up_bound (tree); - -extern tree build_personality_function (const char *); - /* Given an expression EXP that is a handled_component_p, look for the ultimate containing object, which is returned and specify the access position and size. 
*/ @@ -5064,10 +5058,28 @@ extern tree get_inner_reference (tree, H tree *, machine_mode *, int *, int *, bool); +/* Return a tree of sizetype representing the size, in bytes, of the element + of EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ +extern tree array_ref_element_size (tree); + +/* Return a tree representing the upper bound of the array mentioned in + EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ +extern tree array_ref_up_bound (tree); + /* Return a tree representing the lower bound of the array mentioned in EXP, an ARRAY_REF or an ARRAY_RANGE_REF. */ extern tree array_ref_low_bound (tree); +/* Returns true if REF is an array reference to an array at the end of + a structure. If this is the case, the array may be allocated larger + than its
Re: [PATCH][ARM] Add debug dumping of cost table fields
On Wed, May 27, 2015 at 4:39 PM, Andrew Pinski pins...@gmail.com wrote:

On Wed, May 27, 2015 at 4:38 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00054.html

This and the one in AARCH64 is too noisy. Can we have an option to turn this on and default to turning them off.

Agreed. Actually I once filed a PR about this enormous dump information in gimple dumps.

Thanks,
bin

Thanks,
Andrew

Thanks,
Kyrill

On 01/05/15 15:31, Kyrill Tkachov wrote:

Hi all,

This patch adds a macro to wrap cost field accesses into a helpful debug dump, saying which field is being accessed at what line and with what values. This helped me track down cases where the costs were doing the wrong thing by allowing me to see which path in arm_new_rtx_costs was taken. For example, the combine log might now contain:

Trying 2 -> 6:
Successfully matched this instruction:
(set (reg:SI 115 [ D.5348 ])
    (neg:SI (reg:SI 0 r0 [ a ])))
using extra_cost->alu.arith with cost 0 from line 10506

which can be useful in debugging the rtx costs.

Bootstrapped and tested on arm. Ok for trunk?

Thanks,
Kyrill

2015-05-01 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (DBG_COST): New macro.
(arm_new_rtx_costs): Use above.
Re: [PATCH][ARM] Add debug dumping of cost table fields
On 27/05/15 09:47, Bin.Cheng wrote:

On Wed, May 27, 2015 at 4:39 PM, Andrew Pinski pins...@gmail.com wrote:

On Wed, May 27, 2015 at 4:38 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00054.html

This and the one in AARCH64 is too noisy. Can we have an option to turn this on and default to turning them off.

Agreed. Actually I once filed a PR about this enormous dump information in gimple dumps.

Ok, I'll give it a shot and gate both this and the existing Hot/Cold stuff on an option.

Thanks for the feedback.
Kyrill

Thanks,
bin

Thanks,
Andrew

Thanks,
Kyrill

On 01/05/15 15:31, Kyrill Tkachov wrote:

Hi all,

This patch adds a macro to wrap cost field accesses into a helpful debug dump, saying which field is being accessed at what line and with what values. This helped me track down cases where the costs were doing the wrong thing by allowing me to see which path in arm_new_rtx_costs was taken. For example, the combine log might now contain:

Trying 2 -> 6:
Successfully matched this instruction:
(set (reg:SI 115 [ D.5348 ])
    (neg:SI (reg:SI 0 r0 [ a ])))
using extra_cost->alu.arith with cost 0 from line 10506

which can be useful in debugging the rtx costs.

Bootstrapped and tested on arm. Ok for trunk?

Thanks,
Kyrill

2015-05-01 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (DBG_COST): New macro.
(arm_new_rtx_costs): Use above.
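To make the idea under discussion concrete, here is a rough sketch of a cost-access macro whose dump is gated on a switch that defaults to off. The actual DBG_COST definition is not shown in this thread, so every name below (TOY_DBG_COST, toy_dump_costs, the toy cost structs) is hypothetical, and the message is written into a buffer rather than a dump file so the behaviour can be checked.

```cpp
// Hypothetical sketch: not the DBG_COST macro from the patch, and not
// GCC's real cost tables or option machinery.
#include <cassert>
#include <cstdio>
#include <cstring>

static bool toy_dump_costs = false;   // the requested switch, off by default
static char toy_last_dump[128] = "";  // stands in for the dump file

// Records which field was read, its value and the source line, but only
// when the switch is on. Always returns the value unchanged.
static int toy_dbg_cost(const char *field, int value, int line) {
  if (toy_dump_costs)
    std::snprintf(toy_last_dump, sizeof toy_last_dump,
                  "using %s with cost %d from line %d", field, value, line);
  return value;
}

// Wraps a field access so the expression text and line are captured.
#define TOY_DBG_COST(field) toy_dbg_cost(#field, (field), __LINE__)

struct toy_alu_costs { int arith; };
struct toy_cost_table { toy_alu_costs alu; };

void check_dbg_cost_example() {
  toy_cost_table t = {{3}};
  assert(TOY_DBG_COST(t.alu.arith) == 3);
  assert(toy_last_dump[0] == '\0');   // silent while the switch is off

  toy_dump_costs = true;
  assert(TOY_DBG_COST(t.alu.arith) == 3);
  assert(std::strstr(toy_last_dump, "t.alu.arith with cost 3") != nullptr);
  toy_dump_costs = false;
}
```

Because the wrapper returns the value it was given, gating the output changes only how noisy the dump is and never the cost computation itself.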
Re: [PATCH][ARM] Add debug dumping of cost table fields
On Wed, May 27, 2015 at 4:38 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00054.html

This and the one in AARCH64 is too noisy. Can we have an option to turn this on and default to turning them off.

Thanks,
Andrew

Thanks,
Kyrill

On 01/05/15 15:31, Kyrill Tkachov wrote:

Hi all,

This patch adds a macro to wrap cost field accesses into a helpful debug dump, saying which field is being accessed at what line and with what values. This helped me track down cases where the costs were doing the wrong thing by allowing me to see which path in arm_new_rtx_costs was taken. For example, the combine log might now contain:

Trying 2 -> 6:
Successfully matched this instruction:
(set (reg:SI 115 [ D.5348 ])
    (neg:SI (reg:SI 0 r0 [ a ])))
using extra_cost->alu.arith with cost 0 from line 10506

which can be useful in debugging the rtx costs.

Bootstrapped and tested on arm. Ok for trunk?

Thanks,
Kyrill

2015-05-01 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (DBG_COST): New macro.
(arm_new_rtx_costs): Use above.
Re: Do not compute alias sets for types that don't need them
On Wed, 27 May 2015, Jan Hubicka wrote:

I am not sure if TYPE_MAIN_VARIANT is really needed here. What I know is that complete types may have incomplete variants.

How can that be? TYPE_FIELDS is shared across variants and all variants should be laid out.

Because TYPE_FIELDS are not always shared across variants. For example gfc_nonrestricted_type builds variants of types that have their own TYPE_FIELDS lists whose types are variants of the original TYPE_FIELDS. The C++ FE used to do the same for member pointers, but I noticed that last stage1 with an early version of the type verifier and as far as I can remember Jason changed that.

The Fortran one needs to be fixed to use the new MEM_REF restrict support.

Richard.

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
Re: ping**3 [PATCH, ARM] Cortex-A9 MPCore volatile load workaround
Hi Sandra, Chung-Lin,

A couple of comments from me,

On 26/05/15 20:10, Sandra Loosemore wrote:

Chung-Lin posted this patch last year but it seems never to have been reviewed:

https://gcc.gnu.org/ml/gcc-patches/2014-06/msg00714.html

I've just re-applied and re-tested it and it still seems to be good. Can somebody please take a look at it?

-Sandra

+mfix-cortex-a9-volatile-hazards
+Target Report Var(fix_a9_volatile_hazards) Init(0)
+Avoid errata causing read-after-read hazards for concurrent volatile
+accesses on Cortex-A9 MPCore processors.

s/errata/erratum/

+;; Thumb-2 version allows conditional execution
+(define_insn "*memory_barrier_t2"
+  [(set (match_operand:BLK 0 "" "")
+        (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))]
+  "TARGET_HAVE_MEMORY_BARRIER && TARGET_THUMB2"
+  {
+    if (TARGET_HAVE_DMB)
+      {
+        /* Note we issue a system level barrier. We should consider issuing
+           a inner shareabilty zone barrier here instead, ie. "DMB ISH".  */
+        /* ??? Differentiate based on SEQ_CST vs less strict?  */
+        return "dmb%?\tsy";
+      }
+
+    if (TARGET_HAVE_DMB_MCR)
+      return "mcr%?\tp15, 0, r0, c7, c10, 5";
+
+    gcc_unreachable ();
+  }
+  [(set_attr "length" "4")
+   (set_attr "conds" "nocond")
+   (set_attr "predicable" "yes")])
+

This should also set the 'predicable_short_it' attribute to "no" since we don't want it to be predicated when compiling for ARMv8-A Thumb2. Consequently:

Index: testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c
===
--- testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c (revision 0)
+++ testsuite/gcc.target/arm/a9-volatile-ordering-erratum-2.c (revision 0)
@@ -0,0 +1,14 @@
+/* { dg-do compile { target arm_dmb } } */
+/* { dg-options "-O2 -mthumb -mfix-cortex-a9-volatile-hazards" } */

Please add a -mno-restrict-it to the options here so that when armv8-a is the default architecture we are still allowed to conditionalise dmb.

+static bool
+any_volatile_loads_p (const_rtx body)
+{
+  int i, j;
+  rtx lhs, rhs;
+  enum rtx_code code;
+  const char *fmt;
+
+  if (body == NULL_RTX)
+    return false;
+
+  code = GET_CODE (body);
+
+  if (code == SET)
+    {
+      lhs = SET_DEST (body);
+      rhs = SET_SRC (body);
+
+      if (!REG_P (lhs) && GET_CODE (lhs) != SUBREG)
+        return false;
+
+      if ((MEM_P (rhs) || GET_CODE (rhs) == SYMBOL_REF)
+          && MEM_VOLATILE_P (rhs))
+        return true;
+    }
+  else
+    {
+      fmt = GET_RTX_FORMAT (code);
+
+      for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
+        {
+          if (fmt[i] == 'e')
+            {
+              if (any_volatile_loads_p (XEXP (body, i)))
+                return true;
+            }
+          else if (fmt[i] == 'E')
+            for (j = 0; j < XVECLEN (body, i); j++)
+              if (any_volatile_loads_p (XVECEXP (body, i, j)))
+                return true;
+        }
+    }
+
+  return false;
+}

Would it be simpler to write this using the FOR_EACH_SUBRTX infrastructure? I think it would make this function much shorter.

@@ -17248,6 +17334,9 @@ arm_reorg (void)
   {
     rtx table;
 
+    if (fix_a9_volatile_hazards)
+      arm_cortex_a9_errata_reorg (insn);
+
     note_invalid_constants (insn, address, true);
     address += get_attr_length (insn);

Does the logic for adding the insn length to address need to be updated in any way since we're inserting a new instruction in the stream? The calculations here always confuse me...

Thanks,
Kyrill
Re: [PATCH][ARM] Add debug dumping of cost table fields
Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00054.html

Thanks,
Kyrill

On 01/05/15 15:31, Kyrill Tkachov wrote:

Hi all,

This patch adds a macro to wrap cost field accesses into a helpful debug dump, saying which field is being accessed at what line and with what values. This helped me track down cases where the costs were doing the wrong thing by allowing me to see which path in arm_new_rtx_costs was taken. For example, the combine log might now contain:

Trying 2 -> 6:
Successfully matched this instruction:
(set (reg:SI 115 [ D.5348 ])
    (neg:SI (reg:SI 0 r0 [ a ])))
using extra_cost->alu.arith with cost 0 from line 10506

which can be useful in debugging the rtx costs.

Bootstrapped and tested on arm. Ok for trunk?

Thanks,
Kyrill

2015-05-01 Kyrylo Tkachov kyrylo.tkac...@arm.com

* config/arm/arm.c (DBG_COST): New macro.
(arm_new_rtx_costs): Use above.
Re: [RFC] operand_equal_p with valueization
On Tue, 26 May 2015, Jan Hubicka wrote:

On Fri, 22 May 2015, Jan Hubicka wrote:

And no, I'm hesitant to change operand_equal_p too much. It's very much deep-rooted into GENERIC.

OK, as another option, I can bring the relevant logic from operand_equal_p to ipa-icf and separate it into the compare_operand class like I did. Use it in ipa-icf-gimple now and we can slowly turn other uses of operand_equal into compare_operand users in the middle end. I agree that operand_equal is a bit crazy code and it does not handle quite a few things we could do at gimple. I have nothing against going this direction. (after all I do not like touching fold-const much because it works on generic, gimple and FE non-generic and it is not well specified what it should do)

Yes, I've played with the idea of a GIMPLE specific operand_equal_p multiple times but then the changes required to operand_equal_p were small all the times. And having one piece of code that does sth is always good ...

We might turn operand_equal_p to a worker (template?) that

Hmm, OK that is precisely what I was shooting for by this patch. I went by wrapping it to a class with a valueize helper. It can be a template, too, just it seemed that having the single valueize function lets me do everything I need without actually needing to duplicate the code. I can get around templatizing it. Do you have some outline of what interface would seem more fit?

I was thinking about

template <bool with_valueize>
int
operand_equal_p_1 (const_tree arg0, const_tree arg1, unsigned int flags,
                   tree (*valueize)(tree))
{
#define VALUEIZE(op) (with_valueize && valueize) ? valueize (op) : op
...
}

and

extern template int operand_equal_p_1<false> (const_tree arg0, const_tree arg1,
                                              unsigned int flags, tree (*valueize)(tree));
extern template int operand_equal_p_1<true> (const_tree arg0, const_tree arg1,
                                             unsigned int flags, tree (*valueize)(tree));

int
operand_equal_p (const_tree arg0, const_tree arg1, unsigned int flags)
{
  return operand_equal_p_1<false> (arg0, arg1, flags, NULL);
}

we don't want to make 'valueize' a template parameter (that is, we don't want to put operand_equal_p_1 into fold-const.h). Same with an eventual 'gimple_p' template parameter (which eventually could simply be the same as the with_valueize one).

I'm playing with the idea to make match-and-simplify similar, providing explicit specializations for common valueize callbacks. As it always has a valueize callback I'd do it like

template <tree (*fixed_valueize)(tree)>
bool
gimple_simplify (code_helper *res_code, tree *res_ops, gimple_seq *seq,
                 tree (*valueize)(tree), code_helper code, tree type, tree op0)
{
#define do_valueize(op) \
  fixed_valueize != (void *)-1 \
  ? (fixed_valueize ? fixed_valueize (op) : op) \
  : (valueize ? valueize (op) : op)
...
}

Richard.

operand_equal_p and gimple_operand_equal_p can share (with an extra flag whether to turn on GIMPLE stuff and/or valueization). And then simply provide explicit instantiations for the original operand_equal_p and a new gimple_operand_equal_p.

Of course we'll only know if we like that when seeing a patch that does this ;0)

Richard.

--
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)
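The worker-template shape outlined in this message can be tried out on a toy type. The sketch below assumes a made-up node struct in place of tree and made-up function names (equal_p_1, equal_p, equal_p_valueized, to_canonical); it only shows the mechanics of a bool template parameter plus explicit instantiations, not how operand_equal_p itself would be restructured.

```cpp
// Toy illustration of the pattern: "node" and all function names are
// hypothetical stand-ins, not GCC code.
#include <cassert>

struct node { int id; };
typedef const node *(*valueize_fn)(const node *);

// Single worker; the bool parameter lets the compiler drop the valueize
// path entirely in the <false> instantiation.
template <bool with_valueize>
bool equal_p_1(const node *a, const node *b, valueize_fn valueize) {
  if (with_valueize && valueize) {
    a = valueize(a);
    b = valueize(b);
  }
  return a->id == b->id;
}

// Explicit instantiations, so the template body can stay in one source
// file instead of a header.
template bool equal_p_1<false>(const node *, const node *, valueize_fn);
template bool equal_p_1<true>(const node *, const node *, valueize_fn);

// The two public entry points share the worker.
bool equal_p(const node *a, const node *b) {
  return equal_p_1<false>(a, b, nullptr);
}

bool equal_p_valueized(const node *a, const node *b, valueize_fn valueize) {
  return equal_p_1<true>(a, b, valueize);
}

// Example callback: treats node 2 as a copy of canonical node 1.
static const node canonical = {1};
static const node *to_canonical(const node *n) {
  return n->id == 2 ? &canonical : n;
}

void check_valueize_example() {
  node x = {1};
  node y = {2};
  assert(!equal_p(&x, &y));
  assert(equal_p_valueized(&x, &y, to_canonical));
}
```

Keeping the callback a runtime argument, as suggested in the message, means only two instantiations are ever needed regardless of how many different valueize functions callers pass.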
Re: [patch] libstdc++/66017 Avoid bad casts and fix alignment of _Rb_tree_node<long long>::_M_storage
On 26/05/15 15:46 +0100, Jonathan Wakely wrote:

On 22/05/15 18:48 +0100, Jonathan Wakely wrote:

On 22/05/15 16:21 +0100, Jonathan Wakely wrote:

On 22/05/15 17:13 +0200, Jakub Jelinek wrote:

On Fri, May 22, 2015 at 03:59:47PM +0100, Jonathan Wakely wrote:

+ alignas(alignof(_Tp2)) unsigned char _M_storage[sizeof(_Tp)];

Is alignof(_Tp2) always the same as alignof(_Tp2::_M_t) on all targets (I mean, won't some target align the structure more than its only field)?

Hmm, maybe. I don't know.

Wouldn't it be safer to use alignof(_Tp2::_M_t) here?

Yes.

Though, apparently that is a GNU extension, so you'd need to use __alignof__ instead.

Yes, that's what I did in an earlier version of the patch, so I'll go back to that.

Just grepped around, and e.g. on powerpc64le-linux -std=c++11 -malign-power -O2

typedef double _Tp;
struct _Tp2 { _Tp _M_t; };
extern _Tp2 tp2e;
int a = alignof(_Tp2);
int b = __alignof__(_Tp2::_M_t);
int c = alignof(_Tp);
int d = __alignof__(tp2e._M_t);
int e = alignof(_Tp2::_M_t);

we have a = 8, b = 4, c = 8, d = 4, e = 4.

OK, thanks.

Note clang++ with -pedantic-errors errors out on alignof(_Tp2::_M_t) though. It allows __alignof__ though.

Revised patches attached, as two separate commits because the first should be backported but the second doesn't need to be. This includes the necessary changes for the Python printers.

The change to __aligned_buffer (which makes _Rb_tree_node<long long> consistent in c++98 and c++11 modes) also affects some other C++11-only types. Compiling the attached program with -std=gnu++11 -m32 before and after the patch produces these results:

Before:
future<long long> shared state: alignment: 8 size: 24
shared_ptr<long long> control block: alignment: 8 size: 24
forward_list<long long> node: alignment: 8 size: 16
unordered_set<long long> node: alignment: 8 size: 16

After:
future<long long> shared state: alignment: 4 size: 20
shared_ptr<long long> control block: alignment: 4 size: 20
forward_list<long long> node: alignment: 4 size: 12
unordered_set<long long> node: alignment: 4 size: 12

The fix for _Rb_tree_node<long long> is a bug fix and necessary for consistency with existing c++98 code, which is more important than consistency with existing c++11 code using 5.1 or earlier releases. But changing the other types as well would make 5.2 inconsistent with 5.1 for those types. We could just make that change and deal with it, or I could keep __aligned_buffer unchanged and add a new __aligned_buffer_mem for use in _Rb_tree_node, so we only change the one type that is currently inconsistent between c++98 and c++11 modes.

The attached patch makes that smaller change (the second patch in my last mail remains unchanged). It's a shame to waste some space in the other types using __aligned_buffer, and to have to maintain both __aligned_buffer and __aligned_buffer_mem, but I think this is safer.

Here's the version I've committed. It's the same as yesterday's version, but renames __aligned_buffer_mem to __aligned_membuf and adds some comments to ext/aligned_buffer.h explaining why there are two types.

Tested powerpc64le-linux, committed to trunk. I plan to commit patch1.txt to gcc-5-branch too.

commit 8dae241ed96d8ad400a4f8af7748a5bd0315c0e7
Author: Jonathan Wakely jwak...@redhat.com
Date: Thu May 21 14:41:16 2015 +0100

PR libstdc++/66017
* include/bits/stl_tree.h (_Rb_tree_node): Use __aligned_membuf.
(_Rb_tree_iterator, _Rb_tree_const_iterator): Support construction from _Base_ptr.
    (_Rb_tree_const_iterator::_M_const_cast): Remove static_cast.
    (_Rb_tree::begin, _Rb_tree::end): Remove static_cast.
    * include/ext/aligned_buffer.h (__aligned_membuf): New type using
    alignment of _Tp as a member subobject, not as a complete object.
    * python/libstdcxx/v6/printers.py (StdRbtreeIteratorPrinter): Lookup
    _Link_type manually as it might not be in the debug info.

diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index 5ca8e28..d39042f 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -146,7 +146,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       _M_valptr() const
       { return std::__addressof(_M_value_field); }
 #else
-      __gnu_cxx::__aligned_buffer<_Val> _M_storage;
+      __gnu_cxx::__aligned_membuf<_Val> _M_storage;
 
       _Val*
       _M_valptr()
@@ -188,7 +188,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       : _M_node() { }
 
       explicit
-      _Rb_tree_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT
+      _Rb_tree_iterator(_Base_ptr __x) _GLIBCXX_NOEXCEPT
       : _M_node(__x) { }
 
       reference
@@ -260,7 +260,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       : _M_node() { }
 
       explicit
-      _Rb_tree_const_iterator(_Link_type __x) _GLIBCXX_NOEXCEPT
+      _Rb_tree_const_iterator(_Base_ptr __x) _GLIBCXX_NOEXCEPT
       : _M_node(__x) { }
 
       _Rb_tree_const_iterator(const iterator& __it)
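As a rough illustration of the "alignment as a member subobject" idea in the ChangeLog above, here is a small sketch. `member_buffer` and `wrapper` are hypothetical names chosen for this example (this is not the libstdc++ header), and `__alignof__` is the GNU extension mentioned in the thread, so the sketch assumes GCC or Clang.

```cpp
#include <cstdint>
#include <new>

// Illustrative only: `member_buffer` is a made-up name, not libstdc++ API.
template <typename T>
struct member_buffer
{
  // Wrapper whose single member has the alignment T receives when it is
  // a member subobject, which on some targets is smaller than alignof (T).
  struct wrapper { T member; };

  // __alignof__ is the GNU spelling; as noted in the thread, applying
  // ISO alignof to a member is an extension that some compilers reject.
  alignas (__alignof__ (wrapper::member)) unsigned char storage[sizeof (T)];

  // Raw address of the storage, suitable for placement new.
  void *addr () noexcept { return static_cast<void *> (storage); }

  // Typed access; only meaningful after an object was constructed there.
  T *ptr () noexcept { return static_cast<T *> (addr ()); }
};
```

A caller would construct the value with placement new, for example `new (buf.addr ()) long long (42)`, and read it back through `buf.ptr ()`. Whether the buffer ends up less aligned than `alignof (T)` depends on the target ABI, so no particular numbers are claimed here.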
Re: Teach gimple_canonical_types_compatible_p about incomplete types
On Tue, 26 May 2015, Jan Hubicka wrote:

Now the change does not really translate to great increase of disambiguations for Firefox (it seems more in noise). The reason is the pointer_type globbing in alias.c.

Yeah, we only get the improvement because of some hack in the tree alias oracle which also uses the base object for TBAA.

Why is that a hack? Dereferencing a pointer makes it clear the type of memory location pointed to is known, we should use that info.

Yeah, we should fix that. And in fact, for cross-language LTO I don't see why

union { int a; char c; };

and

union { int a; short s; };

should not be compatible - they have a common member after all. So I'd like to glob all unions that have the same size (and as improvement

Well, none of the language standards I saw so far expect this to happen. Going to extremes, you can always put a variable-sized char array into a union and by transitivity glob everything with everything.

I'm speaking of cross-language LTO - that leaves the language standards territory and requires us to apply common sense.

over that, that have at least one compatible member). That also gets rid of the issue that we'd need to sort union members for the comparison to avoid quadraticness (as long as we don't check for that one compatible member).

Yeah, sorting is possible by using the hash values.

Oh, and are

union { int a; };

and

struct { int a; };

not compatible? They are layout-wise at least. Likewise the struct and union { int a; short s; } with the same argument as the two-union case.

Applying this rule you have union { char a[n]; } compatible with every union and thus also

union {int a;}
struct { int a;}
int a;

Which would disable TBAA completely.

See ;) At least we have the int a; vs. struct { int a; } issue with Fortran vs. C compatibility (there is even a PR about this).

We also do not compare alignments. This is probably not important)

Correct - alignment doesn't enter TBAA.

Yep, I think the alignment compare in the C standard basically is there to say that structures must have the same layout.

void f(double (* restrict a)[5]);
void f(double a[restrict][5]);
void f(double a[restrict 3][5]);
void f(double a[restrict static 3][5]);)

Not sure why you get into functions here at all ...

Basically it matters only if we want to disambiguate function pointers.

2 Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration. The enumerated type is incomplete until immediately after the } that terminates the list of enumerator declarations, and complete thereafter.

(we ignore this completely as far as I know, it is easy to fix though, all we need is to make ENUMERAL_TYPE pretend to be INTEGER_TYPE)

Yes, we don't make a distinction between ENUMERAL_TYPE and INTEGER_TYPE.

hstate.add_int (TREE_CODE (type));

in alias.c I mean. That makes them different. I think we want to produce simplified code that turns REFERENCE_TYPE to POINTER_TYPE and ENUMERAL_TYPE to INTEGER_TYPE. I will send a patch for that.

Thanks.

10 For two qualified types to be compatible, both shall have the identically qualified version of a compatible type; the order of type qualifiers within a list of specifiers or qualifiers does not affect the specified type.

Now I think in order to get C standard type compatibility to imply gimple_canonical_types_compatible we need to implement all the above globbing rules as part of canonical type computation, not only punt at pointers in alias.c

My reading is that for example

struct a {char *a;};

is compatible with

struct a {enum *a;};

defined in other compilation unit.

Yes, as said above the TREE_CODE trick in the pointer-type handling is wrong. We can as well just drop it ...

struct a {char a;};

is compatible with

struct a {enum a;};

I would say we just want to simplify the codes and peel for pointers: instead of a TREE_TYPE (t) compare, look for the actual pointed-to type (peeling out POINTER_TYPE/RECORD_TYPE/ARRAY_TYPEs)

8) I think to be correct by the C language standard we need to glob enum with char, though I do not quite see how a standard-conforming program should use it given that the standard does not say if it is char/unsigned char/signed char.

I think it depends on the actual enum, no? So for forward declarations like

enum Foo;
struct X { enum Foo *p; };

you face the same issue as with void *.

Well, handling all enums as integers should solve this.

But sizeof (enum Foo) depends on the enum, so no, it won't solve it. There are
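To make the "simplified code" suggestion from the thread concrete, here is a toy sketch of folding tree codes before they are hashed. The enum `type_code` and the function `canonical_code` are invented for this example and only mimic the names of GCC's tree codes; this is not the patch that was later posted, and as noted above it does not settle the sizeof (enum Foo) question for incomplete enums.

```cpp
// Hypothetical, simplified stand-ins for a few GCC tree codes.
enum class type_code
{
  integer_type,
  enumeral_type,
  pointer_type,
  reference_type,
  record_type
};

// Fold codes that should not be distinguished when hashing a type:
// enums are treated like integers and references like pointers.
// All other codes are returned unchanged.
inline type_code
canonical_code (type_code code)
{
  switch (code)
    {
    case type_code::enumeral_type:
      return type_code::integer_type;
    case type_code::reference_type:
      return type_code::pointer_type;
    default:
      return code;
    }
}
```

In this sketch a hash that mixes in `canonical_code (code)` instead of the raw code would give an enum and an integer of the same size and signedness the same starting hash, which is the effect the thread asks for.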
Re: [Patch, fortran, pr65548, addendum] [5/6 Regression] gfc_conv_procedure_call
Hi Thomas,

thanks for the review. Committed as r223738 with the changes (new testcase, double space in dg-do).

Regards,
Andre

On Wed, 27 May 2015 08:38:07 +0200 Thomas Koenig tkoe...@netcologne.de wrote:

Hi Andre,

Because this patch is obvious I plan to commit it tomorrow if no one objects?!

The patch itself is obviously OK.

About the test case: In general, it is better not to change existing test cases unless absolutely necessary (e.g. adjust an error message). This makes it easier to track regressions.

I would prefer if you made a new test case from your existing one, with the changes you did and a small explanation of what was tested in the comments. If you are worried about runtime for an additional test, you can use the

! { dg-do  run }

hack (notice the two spaces between the dg-do and the run) to have the test case execute only once.

OK with that change.

Regards
Thomas

--
Andre Vehreschild * Email: vehre ad gmx dot de
Re: PATCH to run autoconf tests with C++ compiler
This breaks all checks for supported compiler options:

configure:6382: checking whether gcc supports -Wnarrowing
configure:6399: gcc -c -Wnarrowing conftest.c >&5
cc1: error: unrecognized command line option "-Wnarrowing"
configure:6399: $? = 1
configure:6485: checking whether gcc supports -Wnarrowing
configure:6502: g++ -std=c++98 -c -g conftest.cpp >&5
configure:6502: $? = 0
configure:6511: result: yes

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: Do less generous pointer globbing in alias.c
On Wed, 27 May 2015, Jan Hubicka wrote:

Hi,
this patch makes it possible for the non-LTO alias oracle to TBAA disambiguate pointer types. It makes void * conflicting with all of them and does not put it to alias set 0. It also preserves the property that qualifiers of the pointed-to type should not matter to determine the alias set and that pointer to array is the same as pointer to array element. Finally it makes pointer void * to be equivalent to void ** (and more *) and to types with structural equality only.

void * should be equivalent to incomplete-type * as well.

I think those are all globbing rules we discussed for the non-LTO patch.

It does two things. First is a kind of canonicalization where for a given pointer it looks for the non-pointer pointed-to type and then rebuilds it without qualifiers. This is fast, because build_pointer_type will reuse existing types.

It makes void * conflict with everything by making its alias set a subset of the alias set of any other pointer. This means that writes to void * conflict with writes to any other pointer without really needing to glob all the pointers into one equivalence class.

I think you need to make each pointer alias-set a subset of the one of void * as well because both of the following are valid:

*(void *)p = ...
... = *(int *)p;

and

*(int *)p = ...
... = *(void *)p;

not sure if it's possible to create a testcase that fails if you do subsetting only one-way (because alias_sets_conflict queries both ways and I think alias_set_subset_of is not used very much, only by tree-ssa-alias.c:aliasing_component_refs_p which won't ever use it on two pointer alias sets). In theory true vs. anti-dependence should use alias_set_subset_of and trigger the above cases. But as those queries are done wrong a lot (in the past?) we use alias_sets_conflict there.

For efficiency you could use a new flag similar to has_zero_child in alias_set_entry_d ...

More comments inline below

This patch makes quite some difference on C++. For example on deal II the TBAA stats report 4344358 disambiguations and 7008576 queries, while with the patch we get 5368737 and 5687399 queries (I did not choose deal II for a reason, it is just a random C++ file)

The patch bootstraps and regtests ppc64le-linux with the following testsuite differences:

@@ -30,7 +30,9 @@
 FAIL: c-c++-common/asan/null-deref-1.c -O3 -g output pattern test, is ASAN:SIGSEGV
 FAIL: c-c++-common/asan/null-deref-1.c -Os output pattern test, is ASAN:SIGSEGV
 FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)
+XPASS: gcc.dg/alias-8.c (test for warnings, line 11)
 FAIL: gcc.dg/loop-8.c scan-rtl-dump-times loop2_invariant Decided 1
+FAIL: gcc.dg/pr62167.c scan-tree-dump-not pre Removing basic block
 FAIL: gcc.dg/sms-4.c scan-rtl-dump-times sms SMS succeeded 1
 XPASS: gcc.dg/guality/example.c -O0 execution test
 XPASS: gcc.dg/guality/example.c -O1 execution test
@@ -304,6 +306,9 @@
 FAIL: c-c++-common/asan/null-deref-1.c -O3 -g output pattern test, is ASAN:SIGSEGV
 FAIL: g++.dg/cpp1y/vla-initlist1.C -std=gnu++11 execution test
 FAIL: g++.dg/cpp1y/vla-initlist1.C -std=gnu++14 execution test
+FAIL: g++.dg/ipa/ipa-icf-4.C -std=gnu++11 scan-ipa-dump icf Equal symbols: [67]
+FAIL: g++.dg/ipa/ipa-icf-4.C -std=gnu++14 scan-ipa-dump icf Equal symbols: [67]
+FAIL: g++.dg/ipa/ipa-icf-4.C -std=gnu++98 scan-ipa-dump icf Equal symbols: [67]

ipa-icf-4 is about alias info now being more perceptive to block the merging. pr62167 seems just confused. The template checks that memory stores are not unified. It looks for the BB removal message, but with the patch we get:

<bb 2>:
node.next = 0B;
head.0_4 = head;
node.prev = head.0_4;
head.0_4->first = &node;
k.1_7 = k;
h_8 = &heads[k.1_7];
heads[2].first = 0B;
if (head.0_4 == h_8)
  goto <bb 3>;
else
  goto <bb 5>;

<bb 5>:
goto <bb 4>;

<bb 3>:
p_10 = MEM[(struct head *)&heads][k.1_7].first;

<bb 4>:
# p_1 = PHI <p_10(3), &node(5)>
_11 = p_1 != 0B;
_12 = (int) _11;
return _12;

before PR, the message is about the bb 5 sitting at critical edge removed. The TBAA incompatible load it looks for is optimized away by FRE:

head->first = &node;
struct node *n = head->first;
struct head *h = &heads[k];
heads[2].first = n->next;
if ((void*)n->prev == (void *)h)
  p = h->first;
else
  /* Dead tbaa-unsafe load from ((struct node *)&heads[2])->next. */
  p = n->prev->next;

here n is known to be head->first that is known to be &node. The testcase runtime checks that the result is OK and passes.

Bootstrapped/regtested ppc64le-linux.

    * alias.c (get_alias_set): Do not glob all pointer types into one;
    just produce equivalence classes based on canonical type of pointed
    type; make void * equivalent to void **.
    (record_component_aliases): Make void * to conflict with all other
    pointer types.

Index:
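The subset discussion above can be tried out on a toy model. The class `toy_alias_sets` below is an invented, much simplified picture of alias sets (integer ids, a direct-children map, no transitive closure), loosely inspired by what the thread says about alias_sets_conflict and alias_set_subset_of; it is an assumption-laden sketch, not the code in alias.c.

```cpp
#include <map>
#include <set>

// Toy model: alias sets are ints, set 0 conflicts with everything, and
// each set records which other sets were registered as its subsets.
class toy_alias_sets
{
public:
  // Record that SUBSET is a subset of SUPERSET.
  void record_subset (int superset, int subset)
  {
    m_children[superset].insert (subset);
  }

  // True if SET1 is SET2 itself or a directly recorded subset of SET2.
  bool subset_of (int set1, int set2) const
  {
    if (set1 == set2)
      return true;
    std::map<int, std::set<int> >::const_iterator it = m_children.find (set2);
    return it != m_children.end () && it->second.count (set1) != 0;
  }

  // Symmetric query: conflict if either set is 0 or one is a subset
  // of the other, whichever direction was recorded.
  bool conflict (int set1, int set2) const
  {
    if (set1 == 0 || set2 == 0)
      return true;
    return subset_of (set1, set2) || subset_of (set2, set1);
  }

private:
  std::map<int, std::set<int> > m_children;
};
```

With hypothetical ids void_ptr = 1, int_ptr = 2 and float_ptr = 3, recording void_ptr as a subset of the other two makes `conflict` true for void_ptr against either pointer and false for int_ptr against float_ptr, while `subset_of (int_ptr, void_ptr)` stays false until the reverse relation is recorded too. That is the one-way versus two-way difference raised in the reply: the symmetric query hides it, a directional query would not.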
Re: conditional lim
On Tue, May 26, 2015 at 3:10 PM, Evgeniya Maenkova evgeniya.maenk...@gmail.com wrote:

Hi, Richard

Thanks for starting the review.

Do you see any major issues with this patch (i.e. algorithms and ideas that should be completely replaced, effectively causing the re-write of most code)?

To decide if there are major issues in the patch, perhaps, you need additional clarifications from me? Could you point at the places where additional explanations could save you most effort?

Your answers to these questions look like the first priority. You wrote about several issues in the code which look easy (or almost easy ;) to fix (inline functions, unswitch-loops flag, comments, etc). But, I think you agree, let’s first decide about the major issues (I mean, whether we continue with this patch or start a new one, this will save a lot of time for both of us).

I didn't get an overall idea on how the patch works, that is, how it integrates with the existing algorithm. If you can elaborate on that a bit that would be helpful.

I think the code-generation part needs some work (whether by following my idea with re-using copy_bbs or whether by basically re-implementing it is up to debate). How does your code handle

for ()
  {
    if (cond1)
      {
        if (cond2)
          invariant;
        if (cond3)
          invariant;
      }
  }

? Optimally we'd have before the loop exactly the same if () structure (thus if (cond1) is shared).

Richard.

Thanks,
Evgeniya

On Tue, May 26, 2015 at 2:31 PM, Richard Biener richard.guent...@gmail.com wrote:
On Fri, May 8, 2015 at 11:07 PM, Evgeniya Maenkova evgeniya.maenk...@gmail.com wrote:

Hi,
Could you please review my patch for predicated lim?

Let me note some details about it:

1) Phi statements are still moved only if they have 1 or 2 arguments. However, phi statements could be moved under conditions (as it’s done for the other statements). Probably, phi statement motion with 3+ arguments could be implemented in the next patch after predicated lim.

2) The patch has limitations/features like the following (it was OK to me to implement it this way, maybe I’m not correct):

a)
Loop1
  {
    If (a)
      Loop2
        {
          Stmt - Invariant for Loop1
        }
  }
In this case Stmt will be moved only out of Loop2, because of if (a).

b) Or
Loop1
  {
    …
    If (cond1)
      If (cond2)
        If (cond3)
          Stmt;
  }
Stmt will be moved out only if cond1 is always executed in Loop1.

c) It took me a long time to write all of this code, so there might be other peculiarities which I forgot to mention. :) Let’s discuss these as you review my patch.

3) The patch consists of 9 files:

a) gcc/testsuite/gcc.dg/tree-ssa/loop-7.c, gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – changed tests:
- gcc/testsuite/gcc.dg/tree-ssa/loop-7.c changed as predicated lim moves 2 more statements out of the loop;
- gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – with conditional lim the recip optimization in this test doesn’t work (the corresponding value is below the threshold as I could see in the code for recip, 13). So to have recip working in this test I changed the test a little bit.

b) gcc/tree-ssa-loop-im.c – the patched lim per se

c) gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-13.c, gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-14.c, gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-15.c, gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-16.c, gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-17.c, gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c – the tests for conditional lim.

4) Patch testing:

a) make -k check (no difference in results for me between the clean build and the patched one),
- Revision: 222849,
- uname -a: Linux Istanbul 3.16.0-23-generic #31-Ubuntu SMP Tue Oct 21 18:00:35 UTC 2014 i686 i686 i686 GNU/Linux

b) Bootstrap. It goes well now, however to fix it I have made a temporary hack in the lim code. And with this fix the patch definitely shouldn’t be committed. I did so, as I would like to discuss this issue first.

The problem is: I got a stage2-stage3 comparison failure on a single file (tree-vect-data-refs.o). After some investigation I understood that tree-vect-data-refs.o differs when compiled with and without the '-g' option (yes, more exactly on stage 2 this is '-g -O2 -gtoggle', and for stage 3 this is '-g -O2'. But to simplify things I can reproduce this difference on the same build (even not bootstrapped), with option '-g' and without it).

Of course, the file compiled with the -g option will contain debug information and will differ from the corresponding file without debug information. I mean there is the difference reported
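To picture the nested-condition case raised in the reply, here is a hand-written before/after pair. It is a sketch of what "the same if () structure before the loop" could look like for one concrete loop, under the assumption that the loop is only entered when n > 0; the function names and the arithmetic are invented for the example and say nothing about how the patch itself generates code.

```cpp
// Original loop: x * y and x / y are loop invariant but only evaluated
// under cond1 && cond2 and cond1 && cond3 respectively.
int
sum_original (const int *a, int n, bool cond1, bool cond2, bool cond3,
              int x, int y)
{
  int sum = 0;
  for (int i = 0; i < n; ++i)
    if (cond1)
      {
        if (cond2)
          sum += a[i] + x * y;
        if (cond3)
          sum += a[i] - x / y;   /* May trap when y == 0.  */
      }
  return sum;
}

// Hand-hoisted variant: the invariants are computed once before the loop,
// guarded by the same condition structure with a shared cond1 test.  The
// n > 0 check stands in for the loop being entered at all, so the division
// is never executed on a path where the original would not execute it.
int
sum_hoisted (const int *a, int n, bool cond1, bool cond2, bool cond3,
             int x, int y)
{
  int prod = 0, quot = 0;
  if (n > 0 && cond1)
    {
      if (cond2)
        prod = x * y;
      if (cond3)
        quot = x / y;
    }

  int sum = 0;
  for (int i = 0; i < n; ++i)
    if (cond1)
      {
        if (cond2)
          sum += a[i] + prod;
        if (cond3)
          sum += a[i] - quot;
      }
  return sum;
}
```

The division is the reason the guards matter: hoisting it unconditionally would introduce a trap for y == 0 on paths where the original loop never divides.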
Re: [patch] Move generic tree functions from expr.h to tree.h
No. Prototypes of functions defined in A.c should be in A.h, not in some other header. We've been (slowly) moving to that. You should have moved them all to expr.h instead, or move the implementations to tree.c.

The former is simply not possible since expr.h is poisoned for FEs... I can move the implementations to tree.c but get_inner_reference is one of them.

--
Eric Botcazou
Re: PATCH to run autoconf tests with C++ compiler
On Wed, May 27, 2015 at 10:49 AM, Andreas Schwab sch...@suse.de wrote:

This breaks all checks for supported compiler options:

configure:6382: checking whether gcc supports -Wnarrowing
configure:6399: gcc -c -Wnarrowing conftest.c >&5
cc1: error: unrecognized command line option "-Wnarrowing"
configure:6399: $? = 1
configure:6485: checking whether gcc supports -Wnarrowing
configure:6502: g++ -std=c++98 -c -g conftest.cpp >&5
configure:6502: $? = 0
configure:6511: result: yes

And thus causes PR66304, bootstrap failure with host gcc 4.3 (at least).

Richard.

Andreas.

--
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
[PATCH 01/35] Introduce new type-based pool allocator.
Hello.

The following patch set attempts to replace the old-style pool allocator with a type-based one. Moreover, as we utilize classes and structs that are used just by a pool allocator, these types have overridden ctors and dtors. Thus, using the allocator is much easier and we shouldn't cast types back and forth. Another benefit can be achieved in future, as we will be able to call class constructors to correctly register a location where memory is allocated (-fgather-detailed-mem-stats).

The patch can bootstrap on x86_64-linux-gnu and ppc64-linux-gnu and survives regression tests on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2015-04-30 Martin Liska mli...@suse.cz

    * alloc-pool.c (struct alloc_pool_descriptor): Move definition
    to header file.
    * alloc-pool.h (pool_allocator::pool_allocator): New function.
    (pool_allocator::release): Likewise.
    (inline pool_allocator::release_if_empty): Likewise.
    (inline pool_allocator::~pool_allocator): Likewise.
    (pool_allocator::allocate): Likewise.
    (pool_allocator::remove): Likewise.
---
 gcc/alloc-pool.c |  33 +-
 gcc/alloc-pool.h | 350 +++
 2 files changed, 355 insertions(+), 28 deletions(-)

diff --git a/gcc/alloc-pool.c b/gcc/alloc-pool.c
index 81909d8..0bea7a6 100644
--- a/gcc/alloc-pool.c
+++ b/gcc/alloc-pool.c
@@ -25,6 +25,8 @@ along with GCC; see the file COPYING3. If not see
 #include "hash-table.h"
 #include "hash-map.h"
 
+ALLOC_POOL_ID_TYPE last_id;
+
 #define align_eight(x) (((x+7) >> 3) << 3)
 
 /* The internal allocation object. */
@@ -58,36 +60,10 @@ typedef struct allocation_object_def
 #define USER_PTR_FROM_ALLOCATION_OBJECT_PTR(X) \
   ((void *) (((allocation_object *) (X))->u.data))
 
-#ifdef ENABLE_CHECKING
-/* Last used ID. */
-static ALLOC_POOL_ID_TYPE last_id;
-#endif
-
-/* Store information about each particular alloc_pool. Note that this
-   will underestimate the amount the amount of storage used by a small amount:
-   1) The overhead in a pool is not accounted for.
-   2) The unallocated elements in a block are not accounted for. Note
-   that this can at worst case be one element smaller that the block
-   size for that pool. */
-struct alloc_pool_descriptor
-{
-  /* Number of pools allocated. */
-  unsigned long created;
-  /* Gross allocated storage. */
-  unsigned long allocated;
-  /* Amount of currently active storage. */
-  unsigned long current;
-  /* Peak amount of storage used. */
-  unsigned long peak;
-  /* Size of element in the pool. */
-  int elt_size;
-};
-
 /* Hashtable mapping alloc_pool names to descriptors. */
-static hash_map<const char *, alloc_pool_descriptor> *alloc_pool_hash;
+hash_map<const char *, alloc_pool_descriptor> *alloc_pool_hash;
 
-/* For given name, return descriptor, create new if needed. */
-static struct alloc_pool_descriptor *
+struct alloc_pool_descriptor *
 allocate_pool_descriptor (const char *name)
 {
   if (!alloc_pool_hash)
@@ -96,6 +72,7 @@ allocate_pool_descriptor (const char *name)
   return &alloc_pool_hash->get_or_insert (name);
 }
 
+
 /* Create a pool of things of size SIZE, with NUM in each block we
    allocate. */
 
diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h
index 0c30711..8fd664f 100644
--- a/gcc/alloc-pool.h
+++ b/gcc/alloc-pool.h
@@ -20,6 +20,8 @@ along with GCC; see the file COPYING3. If not see
 #ifndef ALLOC_POOL_H
 #define ALLOC_POOL_H
 
+#include "hash-map.h"
+
 typedef unsigned long ALLOC_POOL_ID_TYPE;
 
 typedef struct alloc_pool_list_def
@@ -63,4 +65,352 @@ extern void free_alloc_pool_if_empty (alloc_pool *);
 extern void *pool_alloc (alloc_pool) ATTRIBUTE_MALLOC;
 extern void pool_free (alloc_pool, void *);
 extern void dump_alloc_pool_statistics (void);
+
+typedef unsigned long ALLOC_POOL_ID_TYPE;
+
+/* Type based memory pool allocator. */
+template <typename T>
+class pool_allocator
+{
+public:
+  /* Default constructor for pool allocator called NAME. Each block
+     has NUM elements. The allocator support EXTRA_SIZE and can
+     potentially IGNORE_TYPE_SIZE. */
+  pool_allocator (const char *name, size_t num, size_t extra_size = 0,
+                  bool ignore_type_size = false);
+
+  /* Default destructor. */
+  ~pool_allocator ();
+
+  /* Release internal data structures. */
+  void release ();
+
+  /* Release internal data structures if the pool has not allocated
+     an object. */
+  void release_if_empty ();
+
+  /* Allocate a new object. */
+  T *allocate () ATTRIBUTE_MALLOC;
+
+  /* Release OBJECT that must come from the pool. */
+  void remove (T *object);
+
+private:
+  struct allocation_pool_list
+  {
+    allocation_pool_list *next;
+  };
+
+  template <typename U>
+  struct allocation_object
+  {
+#ifdef ENABLE_CHECKING
+    /* The ID of alloc pool which the object was allocated from. */
+    ALLOC_POOL_ID_TYPE id;
+#endif
+
+    union
+      {
+        /* The data of the object. */
+
[PATCH 04/35] Change use to type-based pool allocator in lra.c.
gcc/ChangeLog:

2015-04-30 Martin Liska mli...@suse.cz

    * lra.c (init_insn_regs): Use new type-based pool allocator.
    (new_insn_reg) Likewise.
    (free_insn_reg) Likewise.
    (free_insn_regs) Likewise.
    (finish_insn_regs) Likewise.
    (init_insn_recog_data) Likewise.
    (init_reg_info) Likewise.
    (finish_reg_info) Likewise.
    (lra_free_copies) Likewise.
    (lra_create_copy) Likewise.
    (invalidate_insn_data_regno_info) Likewise.
---
 gcc/lra-int.h | 31 +++
 gcc/lra.c     | 40 ++--
 2 files changed, 41 insertions(+), 30 deletions(-)

diff --git a/gcc/lra-int.h b/gcc/lra-int.h
index 4bdd2c6..ef137e0 100644
--- a/gcc/lra-int.h
+++ b/gcc/lra-int.h
@@ -84,6 +84,22 @@ struct lra_copy
   int regno1, regno2;
   /* Next copy with correspondingly REGNO1 and REGNO2. */
   lra_copy_t regno1_next, regno2_next;
+
+  /* Pool allocation new operator. */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation. */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((lra_copy *) ptr);
+  }
+
+  /* Memory allocation pool. */
+  static pool_allocator<lra_copy> pool;
+
 };
 
 /* Common info about a register (pseudo or hard register). */
@@ -191,6 +207,21 @@ struct lra_insn_reg
   int regno;
   /* Next reg info of the same insn. */
   struct lra_insn_reg *next;
+
+  /* Pool allocation new operator. */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation. */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((lra_insn_reg *) ptr);
+  }
+
+  /* Memory allocation pool. */
+  static pool_allocator<lra_insn_reg> pool;
 };
 
 /* Static part (common info for insns with the same ICODE) of LRA
diff --git a/gcc/lra.c b/gcc/lra.c
index 7440668..456f618 100644
--- a/gcc/lra.c
+++ b/gcc/lra.c
@@ -550,15 +550,7 @@ lra_update_dups (lra_insn_recog_data_t id, signed char *nops)
    insns. */
 
 /* Pools for insn reg info. */
-static alloc_pool insn_reg_pool;
-
-/* Initiate pool for insn reg info. */
-static void
-init_insn_regs (void)
-{
-  insn_reg_pool
-    = create_alloc_pool ("insn regs", sizeof (struct lra_insn_reg), 100);
-}
+pool_allocator<lra_insn_reg> lra_insn_reg::pool ("insn regs", 100);
 
 /* Create LRA insn related info about a reference to REGNO in INSN with
    TYPE (in/out/inout), biggest reference mode MODE, flag that it is
@@ -570,9 +562,7 @@ new_insn_reg (rtx_insn *insn, int regno, enum op_type type,
               machine_mode mode, bool subreg_p, bool early_clobber,
               struct lra_insn_reg *next)
 {
-  struct lra_insn_reg *ir;
-
-  ir = (struct lra_insn_reg *) pool_alloc (insn_reg_pool);
+  lra_insn_reg *ir = new lra_insn_reg ();
   ir->type = type;
   ir->biggest_mode = mode;
   if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (lra_reg_info[regno].biggest_mode)
@@ -585,13 +575,6 @@ new_insn_reg (rtx_insn *insn, int regno, enum op_type type,
   return ir;
 }
 
-/* Free insn reg info IR. */
-static void
-free_insn_reg (struct lra_insn_reg *ir)
-{
-  pool_free (insn_reg_pool, ir);
-}
-
 /* Free insn reg info list IR. */
 static void
 free_insn_regs (struct lra_insn_reg *ir)
@@ -601,7 +584,7 @@ free_insn_regs (struct lra_insn_reg *ir)
   for (; ir != NULL; ir = next_ir)
     {
       next_ir = ir->next;
-      free_insn_reg (ir);
+      delete ir;
     }
 }
 
@@ -609,7 +592,7 @@ free_insn_regs (struct lra_insn_reg *ir)
 static void
 finish_insn_regs (void)
 {
-  free_alloc_pool (insn_reg_pool);
+  lra_insn_reg::pool.release ();
 }
 
 
@@ -737,7 +720,6 @@ init_insn_recog_data (void)
 {
   lra_insn_recog_data_len = 0;
   lra_insn_recog_data = NULL;
-  init_insn_regs ();
 }
 
 /* Expand, if necessary, LRA data about insns. */
@@ -791,6 +773,8 @@ finish_insn_recog_data (void)
     if ((data = lra_insn_recog_data[i]) != NULL)
       free_insn_recog_data (data);
   finish_insn_regs ();
+  lra_copy::pool.release ();
+  lra_insn_reg::pool.release ();
   free (lra_insn_recog_data);
 }
 
@@ -1310,7 +1294,7 @@ get_new_reg_value (void)
 }
 
 /* Pools for copies. */
-static alloc_pool copy_pool;
+pool_allocator<lra_copy> lra_copy::pool ("lra copies", 100);
 
 /* Vec referring to pseudo copies. */
 static vec<lra_copy_t> copy_vec;
@@ -1350,8 +1334,6 @@ init_reg_info (void)
   lra_reg_info = XNEWVEC (struct lra_reg, reg_info_size);
   for (i = 0; i < reg_info_size; i++)
     initialize_lra_reg_info_element (i);
-  copy_pool
-    = create_alloc_pool ("lra copies", sizeof (struct lra_copy), 100);
   copy_vec.create (100);
 }
 
@@ -1366,8 +1348,6 @@ finish_reg_info (void)
     bitmap_clear (&lra_reg_info[i].insn_bitmap);
   free (lra_reg_info);
   reg_info_size = 0;
-  free_alloc_pool (copy_pool);
-  copy_vec.release ();
 }
 
 /* Expand common reg info if it is necessary. */
@@ -1394,7 +1374,7 @@ lra_free_copies (void)
     {
       cp = copy_vec.pop
[PATCH 02/35] Change use to type-based pool allocator in et-forest.c.
gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * et-forest.c (et_new_occ): Use new type-based pool allocator. (et_new_tree): Likewise. (et_free_tree): Likewise. (et_free_tree_force): Likewise. (et_free_pools): Likewise. (et_split): Likewise. --- gcc/dominance.c | 1 + gcc/et-forest.c | 48 +--- gcc/et-forest.h | 15 +++ 3 files changed, 45 insertions(+), 19 deletions(-) diff --git a/gcc/dominance.c b/gcc/dominance.c index 09c8c90..f3c99ba 100644 --- a/gcc/dominance.c +++ b/gcc/dominance.c @@ -51,6 +51,7 @@ #include cfganal.h #include basic-block.h #include diagnostic-core.h +#include alloc-pool.h #include et-forest.h #include timevar.h #include hash-map.h diff --git a/gcc/et-forest.c b/gcc/et-forest.c index da6b7d7..fd451b8 100644 --- a/gcc/et-forest.c +++ b/gcc/et-forest.c @@ -25,8 +25,8 @@ License along with libiberty; see the file COPYING3. If not see #include config.h #include system.h #include coretypes.h -#include et-forest.h #include alloc-pool.h +#include et-forest.h /* We do not enable this with ENABLE_CHECKING, since it is awfully slow. */ #undef DEBUG_ET @@ -59,10 +59,26 @@ struct et_occ on the path to the root. */ struct et_occ *min_occ; /* The occurrence in the subtree with the minimal depth. */ + + /* Pool allocation new operator. */ + inline void *operator new (size_t) + { +return pool.allocate (); + } + + /* Delete operator utilizing pool allocation. */ + inline void operator delete (void *ptr) + { +pool.remove((et_occ *) ptr); + } + + /* Memory allocation pool. */ + static pool_allocatoret_occ pool; + }; -static alloc_pool et_nodes; -static alloc_pool et_occurrences; +pool_allocatoret_node et_node::pool (et_nodes pool, 300); +pool_allocatoret_occ et_occ::pool (et_occ pool, 300); /* Changes depth of OCC to D. 
*/ @@ -449,11 +465,7 @@ et_splay (struct et_occ *occ) static struct et_occ * et_new_occ (struct et_node *node) { - struct et_occ *nw; - - if (!et_occurrences) -et_occurrences = create_alloc_pool (et_occ pool, sizeof (struct et_occ), 300); - nw = (struct et_occ *) pool_alloc (et_occurrences); + et_occ *nw = new et_occ; nw-of = node; nw-parent = NULL; @@ -474,9 +486,7 @@ et_new_tree (void *data) { struct et_node *nw; - if (!et_nodes) -et_nodes = create_alloc_pool (et_node pool, sizeof (struct et_node), 300); - nw = (struct et_node *) pool_alloc (et_nodes); + nw = new et_node; nw-data = data; nw-father = NULL; @@ -501,8 +511,8 @@ et_free_tree (struct et_node *t) if (t-father) et_split (t); - pool_free (et_occurrences, t-rightmost_occ); - pool_free (et_nodes, t); + delete t-rightmost_occ; + delete t; } /* Releases et tree T without maintaining other nodes. */ @@ -510,10 +520,10 @@ et_free_tree (struct et_node *t) void et_free_tree_force (struct et_node *t) { - pool_free (et_occurrences, t-rightmost_occ); + delete t-rightmost_occ; if (t-parent_occ) -pool_free (et_occurrences, t-parent_occ); - pool_free (et_nodes, t); +delete t-parent_occ; + delete t; } /* Release the alloc pools, if they are empty. */ @@ -521,8 +531,8 @@ et_free_tree_force (struct et_node *t) void et_free_pools (void) { - free_alloc_pool_if_empty (et_occurrences); - free_alloc_pool_if_empty (et_nodes); + et_occ::pool.release_if_empty (); + et_node::pool.release_if_empty (); } /* Sets father of et tree T to FATHER. */ @@ -614,7 +624,7 @@ et_split (struct et_node *t) rmost-depth = 0; rmost-min = 0; - pool_free (et_occurrences, p_occ); + delete p_occ; /* Update the tree. */ if (father-son == t) diff --git a/gcc/et-forest.h b/gcc/et-forest.h index b507c64..1b3a16c 100644 --- a/gcc/et-forest.h +++ b/gcc/et-forest.h @@ -66,6 +66,21 @@ struct et_node struct et_occ *rightmost_occ;/* The rightmost occurrence. */ struct et_occ *parent_occ; /* The occurrence of the parent node. 
*/ + + /* Pool allocation new operator. */ + inline void *operator new (size_t) + { +return pool.allocate (); + } + + /* Delete operator utilizing pool allocation. */ + inline void operator delete (void *ptr) + { +pool.remove((et_node *) ptr); + } + + /* Memory allocation pool. */ + static pool_allocatoret_node pool; }; struct et_node *et_new_tree (void *data); -- 2.1.4
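The pattern used throughout this patch series (a per-type static pool plus class-specific operator new and operator delete) can be sketched outside GCC roughly as follows. This is only an illustration under stated assumptions: simple_pool is a hypothetical stand-in written for this note, not GCC's pool_allocator from alloc-pool.h, and occ is a made-up node type. The sketch also assumes a non-zero block size and types without extended alignment.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for a type-based pool (NOT GCC's pool_allocator).
// It carves fixed-size slots out of larger blocks and recycles freed slots
// through an intrusive free list.
template <typename T>
class simple_pool
{
public:
  explicit simple_pool (std::size_t block_elts) : m_block_elts (block_elts) {}

  ~simple_pool ()
  {
    for (void *block : m_blocks)
      ::operator delete (block);
  }

  // Return storage for one T, growing the pool when the free list is empty.
  void *allocate ()
  {
    if (m_free == nullptr)
      grow ();
    slot *s = m_free;
    m_free = s->next;
    ++m_live;
    return s;
  }

  // Put the storage of an already destroyed T back on the free list.
  void remove (T *ptr)
  {
    slot *s = reinterpret_cast<slot *> (ptr);
    s->next = m_free;
    m_free = s;
    --m_live;
  }

  std::size_t live () const { return m_live; }

private:
  union slot
  {
    slot *next;
    alignas (T) unsigned char storage[sizeof (T)];
  };

  // Allocate one more block and thread all of its slots onto the free list.
  void grow ()
  {
    slot *block
      = static_cast<slot *> (::operator new (m_block_elts * sizeof (slot)));
    m_blocks.push_back (block);
    for (std::size_t i = 0; i < m_block_elts; ++i)
      {
        block[i].next = m_free;
        m_free = &block[i];
      }
  }

  std::size_t m_block_elts;
  slot *m_free = nullptr;
  std::size_t m_live = 0;
  std::vector<void *> m_blocks;
};

// Made-up node type: "new occ" and "delete p" now go through the pool.
struct occ
{
  int depth;
  occ *parent;

  void *operator new (std::size_t) { return pool.allocate (); }
  void operator delete (void *ptr) { pool.remove (static_cast<occ *> (ptr)); }

  static simple_pool<occ> pool;
};

// One pool per type, defined once in a single translation unit.
simple_pool<occ> occ::pool (300);
```

With this in place, callers write plain "new occ" and "delete p" instead of passing a pool handle around, which is the simplification the hunks above make in et_new_occ and et_free_tree.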
[PATCH] New SLP reduction testcase
To cover the case where we need two vectors.

Tested on x86_64-unknown-linux-gnu, applied.

Richard.

2015-05-27  Richard Biener  rguent...@suse.de

	* gcc.dg/vect/slp-reduc-7.c: New testcase.

Index: gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-reduc-7.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-7.c	(working copy)
@@ -0,0 +1,60 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 32
+
+unsigned int ub[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45,
+    0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
+unsigned int uc[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,
+    0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+
+/* Vectorization of reduction using loop-aware SLP (with two copies).  */
+
+__attribute__ ((noinline))
+int main1 (int n, int res0, int res1, int res2, int res3,
+	   int res4, int res5, int res6, int res7)
+{
+  int i;
+  unsigned int udiff0 = 5, udiff1 = 10, udiff2 = 20, udiff3 = 30;
+  unsigned int udiff4 = 45, udiff5 = 50, udiff6 = 60, udiff7 = 70;
+
+  for (i = 0; i < n; i++) {
+    udiff7 += (ub[8*i + 7] - uc[8*i + 7]);
+    udiff6 += (ub[8*i + 6] - uc[8*i + 6]);
+    udiff5 += (ub[8*i + 5] - uc[8*i + 5]);
+    udiff4 += (ub[8*i + 4] - uc[8*i + 4]);
+    udiff3 += (ub[8*i + 3] - uc[8*i + 3]);
+    udiff2 += (ub[8*i + 2] - uc[8*i + 2]);
+    udiff1 += (ub[8*i + 1] - uc[8*i + 1]);
+    udiff0 += (ub[8*i] - uc[8*i]);
+  }
+
+  /* Check results:  */
+  if (udiff0 != res0
+      || udiff1 != res1
+      || udiff2 != res2
+      || udiff3 != res3
+      || udiff4 != res4
+      || udiff5 != res5
+      || udiff6 != res6
+      || udiff7 != res7)
+    abort ();
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  main1 (N/8, 37, 50, 68, 86, 109, 122, 140, 158);
+  main1 (N/8 - 1, 21, 32, 48, 64, 85, 96, 112, 128);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_add } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail vect_no_int_add } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
Re: [patch 10/10] debug-early merge: compiler proper
On Fri, May 22, 2015 at 4:12 PM, Aldy Hernandez al...@redhat.com wrote:
> On 05/22/2015 07:23 AM, Richard Biener wrote:
>> On Wed, May 20, 2015 at 5:50 PM, Aldy Hernandez al...@redhat.com wrote:
>>> On 05/18/2015 06:56 AM, Richard Biener wrote:
>>>>> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
>>>>> index ad1bb23..2a9f417 100644
>>>>> --- a/gcc/tree-core.h
>>>>> +++ b/gcc/tree-core.h
>>>>> @@ -1334,6 +1334,9 @@ struct GTY(()) tree_block {
>>>>>    tree abstract_origin;
>>>>>    tree fragment_origin;
>>>>>    tree fragment_chain;
>>>>> +
>>>>> +  /* Pointer to the DWARF lexical block.  */
>>>>> +  struct die_struct *die;
>>>>>  };
>>>>>
>>>>>  struct GTY(()) tree_type_common {
>>>>
>>>> Ick - do we need this? dwarf2out.c has a hashtable to map blocks to DIEs (which you don't remove in turn).
>>>
>>> We need a way to reference the early created DIE from late debugging, and we can't use block_map because it gets clobbered across functions. It's currently being released in late debug (dwarf2out_function_decl), that's why you see it not set to NULL in dwarf2out_c_finalize.
>>>
>>> Also, it uses BLOCK_NUMBERs, which according to the documentation in tree.h, are not guaranteed to be unique across functions.
>>>
>>> As Honza mentioned, we're already using a DIE map in types through TYPE_SYMTAB_DIE. See lookup_type_die() in dwarf2out.c.
>>>
>>> Could we leave this as is?
>>
>> But why then not eliminate block_map in favor of using the new ->die member? Having both looks very odd to me.
>
> Oh, I would love to. I just didn't want to rip things apart elsewhere until I was sure you guys were on board with the approach.
>
>> Can you cook up a patch for trunk adding that field to tree_block and removing the block_map map in favor of something like what we do for lookup_type_die/equate_type_number_to_die and TYPE_SYMTAB_DIE?
>
> Absolutely! The attached patch removes block_map in favor of BLOCK_DIE. I did not add lookup_block_die/equate_block_number_to_die abstractions because I think BLOCK_DIE is pretty straightforward.
>
> The attached patch is against mainline. I also ported it to the branch for testing, and neither the branch nor mainline exhibit any regressions.
>
> Tested on x86-64 Linux with --enable-languages=all,go,ada.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Aldy
Re: [PATCH 09/35] Change use to type-based pool allocator in c-format.c.
On Wed, May 27, 2015 at 03:56:47PM +0200, mliska wrote:
> gcc/c-family/ChangeLog:
>
> 2015-04-30 Martin Liska mli...@suse.cz
>
> 	* c-format.c (check_format_arg):Use new type-based pool allocator.
> 	(check_format_info_main) Likewise.

Please watch your ChangeLog entries. Missing space after : in many cases, missing : after ) in many cases.

Also, please grep your patches for '^+[ ]*' (8 consecutive spaces); + lines in patches really should use tabs.

	Jakub
Re: [PATCH 4/13] arm musl support
Hi Szabolcs,

On 20/04/15 19:53, Szabolcs Nagy wrote:
> Set up dynamic linker name for arm.
>
> gcc/Changelog:
>
> 2015-04-16 Gregor Richards gregor.richa...@uwaterloo.ca
>
> 	* config/arm/linux-eabi.h (MUSL_DYNAMIC_LINKER): Define.

This is ok.
The #if TARGET_BIG_ENDIAN_DEFAULT logic looks ok to me.
I've committed this as r223749.

Thanks,
Kyrill
Re: [Patch]: libbacktrace - add support of PE/COFF
On 05/21/2015 06:41 AM, Tristan Gingold wrote:
> Hello,
>
> this patch adds basic support to libbacktrace for PE32 and PE32+ (Windows and Windows64 object formats).
> Support is ‘basic’ because neither DLL nor PIE (if that exists) are handled. Furthermore, there are no Windows versions of mmapio.c and mmap.c.
> Finally, I have disabled the support of data symbols for PE because I wasn’t able to pass ‘make check’ with that: symbol ‘_global’ is at the same address as a symbol defined by the linker and I haven’t found any way to discard the latter. As I think data symbol support isn’t a required feature, I have preferred to disable that feature on PE.
>
> The new file, pecoff.c, mostly follows the structure of elf.c.
>
> Tested on both windows and windows64.
> No regression on Gnu/Linux x86.
>
> Tristan.
>
> 2015-05-21 Tristan Gingold ging...@adacore.com
>
> 	* pecoff.c: New file.
> 	* Makefile.am (FORMAT_FILES): Add pecoff.c and dependencies.
> 	* Makefile.in: Regenerate.
> 	* filetype.awk: Detect pecoff.
> 	* configure.ac: Define BACKTRACE_SUPPORTS_DATA on elf platforms. Add pecoff.
> 	* btest.c (test5): Test enabled only if BACKTRACE_SUPPORTS_DATA is true.
> 	* backtrace-supported.h.in (BACKTRACE_SUPPORTS_DATA): Define.
> 	* configure: Regenerate.
> 	* pecoff.c: New file.

> +
> +/* Return true iff SYM is a defined symbol for a function. Data symbols
> +   are discarded because they aren't easily identified.  */
> +
> +static int
> +coff_is_symbol (const b_coff_internal_symbol *isym)
> +{
> +  return isym->type == 0x20 && isym->sec > 0;
> +}

You probably want const or enum so that you can have a symbolic name rather than 0x20 here. It also seems like the name ought to better indicate it's testing for function symbols.

It's a given that you know COFF specifics better than I ever did, so I'm comfortable assuming you got the COFF specifics right.

The overall structure of elf.c and coff.c is the same, with code templates that are very similar, except they work on different underlying types. Presumably there wasn't a good way to factor any of the generic looking bits out? And no, I'm not requesting you rewrite all this in BFD :-)

OK for the trunk. Any future issues with the coff bits I'll send your way.

jeff
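The review comment about 0x20 could be addressed along the following lines. This is a hedged sketch, not the code that was committed: coff_symbol_sketch is a hypothetical, reduced record (the patch's b_coff_internal_symbol has more fields), and both the enumerator name and the function name are invented for this note. The value 0x20 is the one the patch compares against; in the PE/COFF symbol table it is the Type value conventionally used to mark a function.

```cpp
#include <cstdint>

// Hypothetical reduced symbol record, used only for this illustration.
struct coff_symbol_sketch
{
  std::uint16_t type;  // COFF symbol Type field
  std::int16_t sec;    // section number; values <= 0 are not regular sections
};

// Symbolic name for the magic number (the name is an assumption).
enum { COFF_SYM_TYPE_FUNCTION = 0x20 };

// Return true iff ISYM is a defined symbol for a function.  The name now
// says that only function symbols are accepted.
static bool
coff_is_function_symbol (const coff_symbol_sketch *isym)
{
  return isym->type == COFF_SYM_TYPE_FUNCTION && isym->sec > 0;
}
```

Naming the constant keeps the check readable at the call site and gives a single place to adjust if the test for function symbols ever needs to become more precise.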
Re: [Patch, fortran] PR66079 - [6 Regression] memory leak with source allocation in internal subprogram
Hi Paul, hi Mikael,

about renaming the identifier emitted: I would like to keep it short. Remember, there is always a number attached to it, which makes it unique. Furthermore, alloc_source_tmp sounds unnecessarily long to me. It tastes like we do not trust the unique identifier mechanism established in gfortran. But that is just my personal taste.

about missing expr->rank == 0) in the extended patch: I just wanted to present an idea here. The patch was not meant to be committed yet. I think it furthermore is just half of the rent (like we say in Germany). I think we can do better when we also think about the preceding two if-blocks (the ones taking care of derived and class types). It should be possible to do something similar there. Furthermore, one could think about moving e3rhs for array-valued objects, too. But then we should not move to the last element, but instead to the first element. Nevertheless, in the array-valued case one might still end up having to deallocate the components or e3rhs when the object allocated is zero-sized. I wonder whether the bother really pays. What do you think about it?

Paul: I would recommend you commit with the symbol rename, but without the move optimization. We can do that later.

Mikael: I usually do favor else if, too. Because of the quick and dirty nature of the patch, I omitted to stick to the standard code convention.

Regards,
Andre

On Wed, 27 May 2015 09:59:20 +0200 Paul Richard Thomas paul.richard.tho...@gmail.com wrote:
> Dear Andre,
>
> I am perfectly happy with renaming the rename to source. I was attempting to distinguish atmp coming from trans-array.c from this temporary; just as an aid to any possible future debugging.
>
> The rework of the patch looks fine to me as well. Do you want to commit or should I do so?
>
> Cheers
>
> Paul
>
> On 25 May 2015 at 12:24, Andre Vehreschild ve...@gmx.de wrote:
>> Hi Paul,
>>
>> I am not quite happy with the naming of the temporary variable. When I initially set the prefix to atmp, this was because the variable would be an array most of the time and because the number appended to it should make it unique anyway. However, I would like to point out that disclosing an internal implementation detail of the compiler to a future developer looking at the pseudo-code dump will not help (I mean expr3, here). I would rather use source as the prefix now that I think of it with some distance to the original naming. What do you think?
>>
>> Now that the deallocate for source's components is in the patch, I understand why initially the source= preevaluation for derived types with allocatable components was disallowed. Thanks for clarifying that.
>>
>> I wonder though, if we can't do better...
>>
>> Please have a look at the attached patch. It not only renames the temporary variable from expr3 to source (couldn't help, but do it. Please don't be angry :-)), but also adds move semantics to source= expressions for the last object to allocate. I.e., when a scalar source= expression with allocatable components is detected, then its content is moved (memcpy'ed) to the last object to allocate instead of being assigned. All former objects to allocate are of course handled like before, i.e., components are allocated and the contents of the source= expression are copied using the assign. But when a move can be done, the alloc/dealloc of the components is skipped. With this I hope to save a lot of mallocs and frees, which are not that cheap. In the most common case, where only one object is allocated, there now is only one alloc for the components to get expr3 up and one for the object to allocate. We save the allocation of the allocatable components in the object to allocate and the free of the source= components. I hope I could make clear what I desire? If not, maybe a look into the patch might help. What do you think?
>>
>> The patch of course is only a quick implementation of the idea. Please comment, everyone!
>>
>> Regards,
>> Andre
>>
>> On Mon, 25 May 2015 09:30:34 +0200 Paul Richard Thomas paul.richard.tho...@gmail.com wrote:
>>> Dear All,
>>>
>>> Let's see if I can get it right this time :-)
>>>
>>> Note that I have changed the name of the temporary variable in trans_allocate from 'atmp' to 'expr3' so that it is not confused with array temporaries. I am not sure how much of the testcase is pertinent after the reform of the evaluation of expr3 performed by Andre. However, there were still memory leaks that are fixed by the attached patch.
>>>
>>> Bootstrapped and regtested on a current trunk - OK for trunk?
>>>
>>> Paul
>>>
>>> 2015-05-23 Paul Thomas pa...@gcc.gnu.org
>>>
>>> 	PR fortran/66079
>>> 	* trans-expr.c (gfc_conv_procedure_call): Allocatable scalar function results must be freed and nullified after use. Create a temporary to hold the result to prevent duplicate calls.
>>> 	* trans-stmt.c (gfc_trans_allocate): Rename temporary variable as 'expr3'. Deallocate
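The "move into the last object to allocate" idea discussed in this thread can be illustrated with an analogy in C++. This is not gfortran code and makes no claim about how trans-stmt.c implements it: payload stands in for a derived type with an allocatable component, and assign_from_source is a hypothetical helper invented for this note. All targets but the last receive a deep copy of the pre-evaluated source; the last one takes over the source's buffer, so one allocation and one free are saved.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for a derived type with one allocatable component.
struct payload
{
  std::vector<int> data;
};

// Give every target the value of SOURCE: deep-copy into all targets except
// the last, then move SOURCE's contents into the last one.
void
assign_from_source (payload source, const std::vector<payload *> &targets)
{
  for (std::size_t i = 0; i < targets.size (); ++i)
    {
      if (i + 1 < targets.size ())
        *targets[i] = source;              // allocates and copies
      else
        *targets[i] = std::move (source);  // takes over the existing buffer
    }
}
```

In the common case of a single target, the loop degenerates to one move and no copy, which corresponds to the saving described in the message of 25 May.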
Re: [patch] Move generic tree functions from expr.h to tree.h
On Wed, May 27, 2015 at 12:50 PM, Eric Botcazou ebotca...@adacore.com wrote:
>> No. Prototypes of functions defined in A.c should be in A.h, not in some other header. We've been (slowly) moving to that. You should have moved them all to expr.h instead, or move the implementations to tree.c.
>
> The former is simply not possible since expr.h is poisoned for FEs... I can move the implementations to tree.c but get_inner_reference is one of them.

You can leave get_inner_reference in its place then ... or move it. It's hardly only used by expansion now.

Richard.

> --
> Eric Botcazou
[Ada] Remove propagation of atomicity from object to type
This change removes an old trick which was propagating the Atomic (and now Volatile_Full_Access) setting from an object to a locally-defined type, in order to coax gigi into accepting more atomic objects. This trick is now obsolete since gigi should be able to rewrite the type of the objects to meet the atomicity requirements on its own. The change also rewrites Is_Atomic_VFA_Aggregate to check for the presence of the flag on the object as well, which was missing but largely mitigated by the aforementioned trick. No functional changes. Tested on x86_64-pc-linux-gnu, committed on trunk 2015-05-26 Eric Botcazou ebotca...@adacore.com * freeze.ads (Is_Atomic_VFA_Aggregate): Adjust profile. * freeze.adb (Is_Atomic_VFA_Aggregate): Change Entity parameter into Node parameter and remove Type parameter. Look at Is_Atomic_Or_VFA both on the type and on the object. (Freeze_Entity): Adjust call to Is_Atomic_VFA_Aggregate. * exp_aggr.adb (Expand_Record_Aggregate): Likewise. (Process_Atomic_Independent_Shared_Volatile): Remove code propagating Atomic or VFA from object to locally-defined type. Index: sem_prag.adb === --- sem_prag.adb(revision 223750) +++ sem_prag.adb(working copy) @@ -5875,7 +5875,6 @@ E: Entity_Id; E_Id : Node_Id; K: Node_Kind; - Utyp : Entity_Id; procedure Set_Atomic_VFA (E : Entity_Id); -- Set given type as Is_Atomic or Is_Volatile_Full_Access. Also, if @@ -6053,46 +6052,6 @@ then Set_Has_Delayed_Freeze (E); end if; - - -- An interesting improvement here. If an object of composite - -- type X is declared atomic, and the type X isn't, that's a - -- pity, since it may not have appropriate alignment etc. We - -- can rescue this in the special case where the object and - -- type are in the same unit by just setting the type as - -- atomic, so that the back end will process it as atomic. 
- - -- Note: we used to do this for elementary types as well, - -- but that turns out to be a bad idea and can have unwanted - -- effects, most notably if the type is elementary, the object - -- a simple component within a record, and both are in a spec: - -- every object of this type in the entire program will be - -- treated as atomic, thus incurring a potentially costly - -- synchronization operation for every access. - - -- For Volatile_Full_Access we can do this for elementary types - -- too, since there is no issue of atomic synchronization. - - -- Of course it would be best if the back end could just adjust - -- the alignment etc for the specific object, but that's not - -- something we are capable of doing at this point. - - Utyp := Underlying_Type (Etype (E)); - - if Present (Utyp) - and then (Is_Composite_Type (Utyp) -or else Prag_Id = Pragma_Volatile_Full_Access) - and then Sloc (E) No_Location - and then Sloc (Utyp) No_Location - and then - Get_Source_File_Index (Sloc (E)) = -Get_Source_File_Index (Sloc (Utyp)) - then - if Prag_Id = Pragma_Volatile_Full_Access then - Set_Is_Volatile_Full_Access (Utyp); - else - Set_Is_Atomic (Utyp); - end if; - end if; end if; -- Atomic/Shared/Volatile_Full_Access imply Independent Index: freeze.adb === --- freeze.adb (revision 223750) +++ freeze.adb (working copy) @@ -1459,17 +1459,15 @@ -- Is_Atomic_VFA_Aggregate -- - - function Is_Atomic_VFA_Aggregate - (E : Entity_Id; - Typ : Entity_Id) return Boolean - is - Loc : constant Source_Ptr := Sloc (E); + function Is_Atomic_VFA_Aggregate (N : Node_Id) return Boolean is + Loc : constant Source_Ptr := Sloc (N); New_N : Node_Id; Par : Node_Id; Temp : Entity_Id; + Typ : Entity_Id; begin - Par := Parent (E); + Par := Parent (N); -- Array may be qualified, so find outer context @@ -1477,24 +1475,45 @@ Par := Parent (Par); end if; - if Nkind_In (Par, N_Object_Declaration, N_Assignment_Statement) -and then Comes_From_Source (Par) - then - Temp := Make_Temporary (Loc, 'T', E); - 
New_N := - Make_Object_Declaration (Loc, -
[PATCH 03/35] Change use to type-based pool allocator in lra-lives.c.
gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * lra-lives.c (free_live_range): Use new type-based pool allocator. (free_live_range_list) Likewise. (create_live_range) Likewise. (copy_live_range) Likewise. (lra_merge_live_ranges) Likewise. (remove_some_program_points_and_update_live_ranges) Likewise. (lra_live_ranges_init) Likewise. (lra_live_ranges_finish) Likewise. --- gcc/lra-coalesce.c | 1 + gcc/lra-int.h | 15 +++ gcc/lra-lives.c| 27 +++ gcc/lra-spills.c | 1 + gcc/lra.c | 1 + 5 files changed, 25 insertions(+), 20 deletions(-) diff --git a/gcc/lra-coalesce.c b/gcc/lra-coalesce.c index 045691d..b385603 100644 --- a/gcc/lra-coalesce.c +++ b/gcc/lra-coalesce.c @@ -84,6 +84,7 @@ along with GCC; see the file COPYING3.If not see #include except.h #include timevar.h #include ira.h +#include alloc-pool.h #include lra-int.h #include df.h diff --git a/gcc/lra-int.h b/gcc/lra-int.h index 12923ee..4bdd2c6 100644 --- a/gcc/lra-int.h +++ b/gcc/lra-int.h @@ -54,6 +54,21 @@ struct lra_live_range lra_live_range_t next; /* Pointer to structures with the same start. */ lra_live_range_t start_next; + + /* Pool allocation new operator. */ + inline void *operator new (size_t) + { +return pool.allocate (); + } + + /* Delete operator utilizing pool allocation. */ + inline void operator delete (void *ptr) + { +pool.remove((lra_live_range *) ptr); + } + + /* Memory allocation pool. */ + static pool_allocatorlra_live_range pool; }; typedef struct lra_copy *lra_copy_t; diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c index 085411e..9b5f74e 100644 --- a/gcc/lra-lives.c +++ b/gcc/lra-lives.c @@ -121,14 +121,7 @@ static sparseset unused_set, dead_set; static bitmap_head temp_bitmap; /* Pool for pseudo live ranges. */ -static alloc_pool live_range_pool; - -/* Free live range LR. */ -static void -free_live_range (lra_live_range_t lr) -{ - pool_free (live_range_pool, lr); -} +pool_allocator lra_live_range lra_live_range::pool (live ranges, 100); /* Free live range list LR. 
*/ static void @@ -139,7 +132,7 @@ free_live_range_list (lra_live_range_t lr) while (lr != NULL) { next = lr-next; - free_live_range (lr); + delete lr; lr = next; } } @@ -148,9 +141,7 @@ free_live_range_list (lra_live_range_t lr) static lra_live_range_t create_live_range (int regno, int start, int finish, lra_live_range_t next) { - lra_live_range_t p; - - p = (lra_live_range_t) pool_alloc (live_range_pool); + lra_live_range_t p = new lra_live_range; p-regno = regno; p-start = start; p-finish = finish; @@ -162,9 +153,7 @@ create_live_range (int regno, int start, int finish, lra_live_range_t next) static lra_live_range_t copy_live_range (lra_live_range_t r) { - lra_live_range_t p; - - p = (lra_live_range_t) pool_alloc (live_range_pool); + lra_live_range_t p = new lra_live_range; *p = *r; return p; } @@ -209,7 +198,7 @@ lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2) r1-start = r2-start; lra_live_range_t temp = r2; r2 = r2-next; - pool_free (live_range_pool, temp); + delete temp; } else { @@ -1109,7 +1098,7 @@ remove_some_program_points_and_update_live_ranges (void) } prev_r-start = r-start; prev_r-next = next_r; - free_live_range (r); + delete r; } } } @@ -1380,8 +1369,6 @@ lra_clear_live_ranges (void) void lra_live_ranges_init (void) { - live_range_pool = create_alloc_pool (live ranges, - sizeof (struct lra_live_range), 100); bitmap_initialize (temp_bitmap, reg_obstack); initiate_live_solver (); } @@ -1392,5 +1379,5 @@ lra_live_ranges_finish (void) { finish_live_solver (); bitmap_clear (temp_bitmap); - free_alloc_pool (live_range_pool); + lra_live_range::pool.release (); } diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c index 19ece20..caece9a 100644 --- a/gcc/lra-spills.c +++ b/gcc/lra-spills.c @@ -98,6 +98,7 @@ along with GCC; see the file COPYING3.If not see #include except.h #include timevar.h #include target.h +#include alloc-pool.h #include lra-int.h #include ira.h #include df.h diff --git a/gcc/lra.c b/gcc/lra.c index 7c33636..7440668 100644 
--- a/gcc/lra.c +++ b/gcc/lra.c @@ -149,6 +149,7 @@ along with GCC; see the file COPYING3. If not see #include timevar.h #include target.h #include ira.h +#include alloc-pool.h #include lra-int.h #include df.h -- 2.1.4
Re: [C++/66270] another may_alias crash
On 05/26/15 15:00, Nathan Sidwell wrote:
> On 05/25/15 21:18, Jason Merrill wrote:
>> Hmm, are you seeing a case where TYPE_CANONICAL (to_type) has the may_alias attribute?
>
> Yes. This occurs when the newly created TRCAA pointer is to a self-canonical type. The
>   else if (TYPE_CANONICAL (to_type) != to_type)
> is false, so the newly created pointer is self-canonical too (and has TRCAA).
>
> If the canonical type should not have TRCAA we need to change the if condition to:
>   else if (TYPE_CANONICAL (to_type) != to_type || could_alias_all)
> where COULD_ALIAS_ALL is the incoming CAN_ALIAS_ALL value. Does that make sense?
>
> Making that change does stop the ICE I was seeing, but I've not done a full test yet.

Here's a patch implementing that change. When build_pointer_type_for_mode is passed true for CAN_ALIAS_ALL, we force creating a canonical type, continuing to pass false for that pointer's creation.

Booted and tested on x86-64-linux, ok?

nathan

2015-05-25 Nathan Sidwell nat...@acm.org

	PR c++/66270
	* tree.c (build_pointer_type_for_mode): Canonical type does not inherit can_alias_all.
	(build_reference_type_for_mode): Likewise.

	PR c++/66270
	* g++.dg/ext/alias-canon3.C: New.

Index: testsuite/g++.dg/ext/alias-canon3.C
===================================================================
--- testsuite/g++.dg/ext/alias-canon3.C	(revision 0)
+++ testsuite/g++.dg/ext/alias-canon3.C	(working copy)
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// PR c++/66270
+
+typedef float __m256 __attribute__ (( __vector_size__(32), __may_alias__ ));
+struct A {
+  __m256 ymm;
+  const float &f() const;
+};
+
+const float &A::f() const {
+  return ymm[1];
+}
Index: tree.c
===================================================================
--- tree.c	(revision 223636)
+++ tree.c	(working copy)
@@ -7719,6 +7719,7 @@ build_pointer_type_for_mode (tree to_typ
 				 bool can_alias_all)
 {
   tree t;
+  bool could_alias = can_alias_all;
 
   if (to_type == error_mark_node)
     return error_mark_node;
@@ -7756,7 +7757,7 @@ build_pointer_type_for_mode (tree to_typ
 
   if (TYPE_STRUCTURAL_EQUALITY_P (to_type))
     SET_TYPE_STRUCTURAL_EQUALITY (t);
-  else if (TYPE_CANONICAL (to_type) != to_type)
+  else if (TYPE_CANONICAL (to_type) != to_type || could_alias)
     TYPE_CANONICAL (t)
       = build_pointer_type_for_mode (TYPE_CANONICAL (to_type),
 				     mode, false);
@@ -7786,6 +7787,7 @@ build_reference_type_for_mode (tree to_t
 				   bool can_alias_all)
 {
   tree t;
+  bool could_alias = can_alias_all;
 
   if (to_type == error_mark_node)
     return error_mark_node;
@@ -7823,7 +7825,7 @@ build_reference_type_for_mode (tree to_t
 
   if (TYPE_STRUCTURAL_EQUALITY_P (to_type))
     SET_TYPE_STRUCTURAL_EQUALITY (t);
-  else if (TYPE_CANONICAL (to_type) != to_type)
+  else if (TYPE_CANONICAL (to_type) != to_type || could_alias)
     TYPE_CANONICAL (t)
       = build_reference_type_for_mode (TYPE_CANONICAL (to_type),
 				       mode, false);
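The rule the patch enforces (a pointer type built with can_alias_all set must not be its own canonical type; its canonical type is the same pointer built with the flag cleared) can be modelled with a toy type table. Everything below is hypothetical and written only for this note: toy_type and toy_type_table are not GCC data structures, and the cache merely stands in for the lookup GCC does before creating a new pointer type.

```cpp
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Toy model of a type node.
struct toy_type
{
  toy_type *canonical;  // points to itself for a self-canonical type
  toy_type *pointee;    // nullptr for a non-pointer type
  bool can_alias_all;
};

class toy_type_table
{
public:
  // Create a self-canonical non-pointer type.
  toy_type *make_base ()
  {
    toy_type *t = fresh ();
    t->canonical = t;
    return t;
  }

  // Build (or look up) a pointer to TO.  Mirrors the patched condition:
  // recurse with can_alias_all == false whenever TO is not self-canonical
  // or the requested pointer may alias everything.
  toy_type *build_pointer (toy_type *to, bool can_alias_all)
  {
    std::pair<toy_type *, bool> key (to, can_alias_all);
    auto it = m_pointers.find (key);
    if (it != m_pointers.end ())
      return it->second;

    toy_type *t = fresh ();
    t->pointee = to;
    t->can_alias_all = can_alias_all;
    m_pointers[key] = t;

    if (to->canonical != to || can_alias_all)
      t->canonical = build_pointer (to->canonical, false);
    else
      t->canonical = t;
    return t;
  }

private:
  toy_type *fresh ()
  {
    m_storage.push_back (std::unique_ptr<toy_type> (
      new toy_type { nullptr, nullptr, false }));
    return m_storage.back ().get ();
  }

  std::vector<std::unique_ptr<toy_type>> m_storage;
  std::map<std::pair<toy_type *, bool>, toy_type *> m_pointers;
};
```

In this model, building an alias-all pointer to a self-canonical type yields a node whose canonical type is the plain pointer, so canonical types never carry the flag.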
Re: GIMPLE syntax highlighting for vim
On 05/24/2015 01:48 PM, Mikhail Maltsev wrote: Hi all! The attached vim script can be used to highlight syntax in GIMPLE dumps making them somewhat easier to read. I would like to add this script to gcc/contrib directory. Is that OK? Sure, that's fine. jeff
Re: conditional lim
On Wed, May 27, 2015 at 2:11 PM, Richard Biener richard.guent...@gmail.com wrote:
> On Tue, May 26, 2015 at 3:10 PM, Evgeniya Maenkova evgeniya.maenk...@gmail.com wrote:
>> Hi, Richard
>>
>> Thanks for starting the review.
>>
>> Do you see any major issues with this patch (i.e. algorithms and ideas that should be completely replaced, effectively causing a rewrite of most of the code)?
>>
>> To decide whether there are major issues in the patch, perhaps you need additional clarifications from me? Could you point at the places where additional explanations could save you the most effort?
>>
>> Your answers to these questions look like the first priority.
>>
>> You wrote about several issues in the code which look easy (or almost easy ;) to fix (inline functions, unswitch-loops flag, comments, etc.). But, I think you agree, let’s first decide about the major issues (I mean, whether we continue with this patch or start a new one; this will save a lot of time for both of us).
>
> I didn't get an overall idea on how the patch works, that is, how it integrates with the existing algorithm. If you can elaborate on that a bit that would be helpful.

Hi,
Sure, I'll write you some notes in several days.

> I think the code-generation part needs some work (whether by following my idea with re-using copy_bbs or whether by basically re-implementing it is up to debate). How does your code handle
>
>   for ()
>   {
>     if (cond1)
>     {
>       if (cond2) invariant;
>       if (cond3) invariant;
>     }
>   }
>
> ? Optimally we'd have before the loop exactly the same if () structure (thus if (cond1) is shared).

If both invariants are going out of the same loop (I mean tgt_level), then the if structure will be the same.

  for1()
    for ()
    {
      if (cond1)
      {
        if (cond2) invariant1;
        if (cond3) invariant2;
      }
    }

will be transformed to

  for1()
    if (cond1)
    {
      if (cond2) invariant1;
      if (cond3) invariant2;
    }
    }
    for ()
    {
      if (cond1)
      {
        if (cond2);
        if (cond3);
      }
    }

(I don't clean up empty if's in the lim code.)

If these invariants are moved into different loops, then

  for1
    for2()
      for()
      {
        if (cond1)
        {
          if (cond2) invariant1;
          if (cond3) invariant2;
        }
      }

will be transformed to:

  for1
  {
    if (cond1)
      if (cond2) invariant1;
    for2()
    {
      if (cond1)
        if (cond3) invariant2;
      for()
      {
        if (cond1)
        {
          if (cond2);
          if (cond3);
        }
      }
    }
  }

Of course, there could be some bugs, but the idea was as mentioned above. This transformation looked logical to me.

What do you think?

Thanks,
Evgeniya

> Richard.
>
>> Thanks,
>>
>> Evgeniya
>>
>> On Tue, May 26, 2015 at 2:31 PM, Richard Biener richard.guent...@gmail.com wrote:
>>> On Fri, May 8, 2015 at 11:07 PM, Evgeniya Maenkova evgeniya.maenk...@gmail.com wrote:
>>>> Hi,
>>>>
>>>> Could you please review my patch for predicated lim?
>>>>
>>>> Let me note some details about it:
>>>>
>>>> 1) Phi statements are still moved only if they have 1 or 2 arguments. However, phi statements could be moved under conditions (as it’s done for the other statements). Probably, phi statement motion with 3 + arguments could be implemented in the next patch after predicated lim.
>>>>
>>>> 2) Patch has limitations/features like (it was ok to me to implement it such way, maybe I’m not correct. ):
>>>>
>>>> a) Loop1
>>>>    {
>>>>      If (a)
>>>>        Loop2
>>>>        {
>>>>          Stmt - Invariant for Loop1
>>>>        }
>>>>    }
>>>>
>>>> In this case Stmt will be moved only out of Loop2, because of if (a).
>>>>
>>>> b) Or
>>>>
>>>>    Loop1
>>>>    {
>>>>      …
>>>>      If (cond1)
>>>>        If (cond2)
>>>>          If (cond3)
>>>>            Stmt;
>>>>    }
>>>>
>>>> Stmt will be moved out only if cond1 is always executed in Loop1.
>>>>
>>>> c) It took me a long time to write all of this code, so there might be other peculiarities which I forgot to mention. :) Let’s discuss these as you review my patch.
>>>>
>>>> 3) Patch consists of 9 files:
>>>>
>>>> a) gcc/testsuite/gcc.dg/tree-ssa/loop-7.c, gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – changed tests:
>>>> - gcc/testsuite/gcc.dg/tree-ssa/loop-7.c changed as predicated lim moves 2 more statements out of the loop;
>>>> - gcc/testsuite/gcc.dg/tree-ssa/recip-3.c – with conditional lim the recip optimization in this test doesn’t work (the corresponding value is below the threshold, as I could see in the code for recip, 13). So to have recip working in this test I changed the test a little bit.
>>>>
>>>> b) gcc/tree-ssa-loop-im.c – the patched lim per se
>>>>
>>>> c)
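The core idea of predicated (conditional) invariant motion discussed in this thread can be shown on a tiny source-level example. This is only an illustration under stated assumptions: it is hand-written C++, not output of the patched tree-ssa-loop-im.c, and it assumes the guarding condition is itself loop invariant. The division is hoisted out of the loop but stays under the same guard, since executing it unconditionally would be wrong when b is zero.

```cpp
#include <vector>

// Original form: the invariant a / b is recomputed in every iteration in
// which the (loop-invariant) condition holds.
int
sum_original (const std::vector<int> &v, int a, int b)
{
  int sum = 0;
  for (int x : v)
    if (b != 0)
      sum += x + a / b;
  return sum;
}

// After predicated motion: the invariant is computed once before the loop,
// guarded by a copy of the condition.  The if () structure is duplicated,
// not removed, which matches the transformations sketched in the thread.
int
sum_hoisted (const std::vector<int> &v, int a, int b)
{
  int inv = 0;
  if (b != 0)
    inv = a / b;

  int sum = 0;
  for (int x : v)
    if (b != 0)
      sum += x + inv;
  return sum;
}
```

Both functions return the same value for every input, including b == 0, where the guard prevents the division from being executed at all.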
[PATCH 06/35] Change use to type-based pool allocator in ira-color.c.
gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ira-color.c (init_update_cost_records):Use new type-based pool allocator. (get_update_cost_record) Likewise. (free_update_cost_record_list) Likewise. (finish_update_cost_records) Likewise. (initiate_cost_update) Likewise. --- gcc/ira-color.c | 35 +++ 1 file changed, 15 insertions(+), 20 deletions(-) diff --git a/gcc/ira-color.c b/gcc/ira-color.c index b719e7a..4750714 100644 --- a/gcc/ira-color.c +++ b/gcc/ira-color.c @@ -123,21 +123,6 @@ struct update_cost_record int divisor; /* Next record for given allocno. */ struct update_cost_record *next; - - /* Pool allocation new operator. */ - inline void *operator new (size_t) - { -return pool.allocate (); - } - - /* Delete operator utilizing pool allocation. */ - inline void operator delete (void *ptr) - { -pool.remove((update_cost_record *) ptr); - } - - /* Memory allocation pool. */ - static pool_allocatorupdate_cost_record pool; }; /* To decrease footprint of ira_allocno structure we store all data @@ -1181,16 +1166,25 @@ setup_profitable_hard_regs (void) allocnos. */ /* Pool for update cost records. */ -pool_allocatorupdate_cost_record update_cost_record::pool - (update cost records, 100); +static alloc_pool update_cost_record_pool; + +/* Initiate update cost records. */ +static void +init_update_cost_records (void) +{ + update_cost_record_pool += create_alloc_pool (update cost records, +sizeof (struct update_cost_record), 100); +} /* Return new update cost record with given params. 
*/ static struct update_cost_record * get_update_cost_record (int hard_regno, int divisor, struct update_cost_record *next) { - update_cost_record *record = new update_cost_record; + struct update_cost_record *record; + record = (struct update_cost_record *) pool_alloc (update_cost_record_pool); record-hard_regno = hard_regno; record-divisor = divisor; record-next = next; @@ -1206,7 +1200,7 @@ free_update_cost_record_list (struct update_cost_record *list) while (list != NULL) { next = list-next; - delete list; + pool_free (update_cost_record_pool, list); list = next; } } @@ -1215,7 +1209,7 @@ free_update_cost_record_list (struct update_cost_record *list) static void finish_update_cost_records (void) { - update_cost_record::pool.release (); + free_alloc_pool (update_cost_record_pool); } /* Array whose element value is TRUE if the corresponding hard @@ -1270,6 +1264,7 @@ initiate_cost_update (void) = (struct update_cost_queue_elem *) ira_allocate (size); memset (update_cost_queue_elems, 0, size); update_cost_check = 0; + init_update_cost_records (); } /* Deallocate data used by function update_costs_from_copies. */ -- 2.1.4
[PATCH 12/35] Change use to type-based pool allocator in cselib.c.
gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * cselib.c (new_elt_list):Use new type-based pool allocator. (new_elt_loc_list) Likewise. (unchain_one_elt_list) Likewise. (unchain_one_elt_loc_list) Likewise. (unchain_one_value) Likewise. (new_cselib_val) Likewise. (cselib_init) Likewise. (cselib_finish) Likewise. --- gcc/alias.c | 1 + gcc/cfgcleanup.c | 1 + gcc/cprop.c | 1 + gcc/cselib.c | 63 gcc/cselib.h | 33 ++- gcc/gcse.c | 1 + gcc/postreload.c | 1 + gcc/print-rtl.c | 1 + gcc/sel-sched-dump.c | 1 + 9 files changed, 78 insertions(+), 25 deletions(-) diff --git a/gcc/alias.c b/gcc/alias.c index aa7dc21..bc8e2b4 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. If not see #include tm_p.h #include regs.h #include diagnostic-core.h +#include alloc-pool.h #include cselib.h #include hash-map.h #include langhooks.h diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index aff64ef..fc2ed31 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -50,6 +50,7 @@ along with GCC; see the file COPYING3. If not see #include flags.h #include recog.h #include diagnostic-core.h +#include alloc-pool.h #include cselib.h #include params.h #include tm_p.h diff --git a/gcc/cprop.c b/gcc/cprop.c index 57c44ef..41ca201 100644 --- a/gcc/cprop.c +++ b/gcc/cprop.c @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see #include expr.h #include except.h #include params.h +#include alloc-pool.h #include cselib.h #include intl.h #include obstack.h diff --git a/gcc/cselib.c b/gcc/cselib.c index 7a50f50..8de85bc 100644 --- a/gcc/cselib.c +++ b/gcc/cselib.c @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3. If not see #include ggc.h #include hash-table.h #include dumpfile.h +#include alloc-pool.h #include cselib.h #include predict.h #include basic-block.h @@ -56,9 +57,25 @@ along with GCC; see the file COPYING3. If not see #include bitmap.h /* A list of cselib_val structures. 
*/ -struct elt_list { -struct elt_list *next; -cselib_val *elt; +struct elt_list +{ + struct elt_list *next; + cselib_val *elt; + + /* Pool allocation new operator. */ + inline void *operator new (size_t) + { +return pool.allocate (); + } + + /* Delete operator utilizing pool allocation. */ + inline void operator delete (void *ptr) + { +pool.remove((elt_list *) ptr); + } + + /* Memory allocation pool. */ + static pool_allocatorelt_list pool; }; static bool cselib_record_memory; @@ -260,7 +277,13 @@ static unsigned int cfa_base_preserved_regno = INVALID_REGNUM; May or may not contain the useless values - the list is compacted each time memory is invalidated. */ static cselib_val *first_containing_mem = dummy_val; -static alloc_pool elt_loc_list_pool, elt_list_pool, cselib_val_pool, value_pool; + +pool_allocatorelt_list elt_list::pool (elt_list, 10); +pool_allocatorelt_loc_list elt_loc_list::pool (elt_loc_list, 10); +pool_allocatorcselib_val cselib_val::pool (cselib_val_list, 10); + +static pool_allocatorrtx_def value_pool (value, 100, RTX_CODE_SIZE (VALUE), + true); /* If nonnull, cselib will call this function before freeing useless VALUEs. A VALUE is deemed useless if its locs field is null. */ @@ -288,8 +311,7 @@ void (*cselib_record_sets_hook) (rtx_insn *insn, struct cselib_set *sets, static inline struct elt_list * new_elt_list (struct elt_list *next, cselib_val *elt) { - struct elt_list *el; - el = (struct elt_list *) pool_alloc (elt_list_pool); + elt_list *el = new elt_list (); el-next = next; el-elt = elt; return el; @@ -373,14 +395,14 @@ new_elt_loc_list (cselib_val *val, rtx loc) } /* Chain LOC back to VAL. 
*/ - el = (struct elt_loc_list *) pool_alloc (elt_loc_list_pool); + el = new elt_loc_list; el-loc = val-val_rtx; el-setting_insn = cselib_current_insn; el-next = NULL; CSELIB_VAL_PTR (loc)-locs = el; } - el = (struct elt_loc_list *) pool_alloc (elt_loc_list_pool); + el = new elt_loc_list; el-loc = loc; el-setting_insn = cselib_current_insn; el-next = next; @@ -420,7 +442,7 @@ unchain_one_elt_list (struct elt_list **pl) struct elt_list *l = *pl; *pl = l-next; - pool_free (elt_list_pool, l); + delete l; } /* Likewise for elt_loc_lists. */ @@ -431,7 +453,7 @@ unchain_one_elt_loc_list (struct elt_loc_list **pl) struct elt_loc_list *l = *pl; *pl = l-next; - pool_free (elt_loc_list_pool, l); + delete l; } /* Likewise for cselib_vals. This also frees the addr_list associated with @@ -443,7 +465,7 @@ unchain_one_value (cselib_val *v) while (v-addr_list) unchain_one_elt_list (v-addr_list); - pool_free (cselib_val_pool,
[PATCH 10/35] Change use to type-based pool allocator in cfg.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* cfg.c (initialize_original_copy_tables): Use new type-based pool allocator.
	(free_original_copy_tables) Likewise.
	(copy_original_table_clear) Likewise.
	(copy_original_table_set) Likewise.
---
 gcc/cfg.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/gcc/cfg.c b/gcc/cfg.c
index cdcc01c..ddfecdc 100644
--- a/gcc/cfg.c
+++ b/gcc/cfg.c
@@ -1066,18 +1066,16 @@ static hash_table<bb_copy_hasher> *bb_copy;
 /* And between loops and copies.  */
 static hash_table<bb_copy_hasher> *loop_copy;
-static alloc_pool original_copy_bb_pool;
-
+static pool_allocator<htab_bb_copy_original_entry> *original_copy_bb_pool;
 
 /* Initialize the data structures to maintain mapping between blocks
    and its copies.  */
 void
 initialize_original_copy_tables (void)
 {
-  gcc_assert (!original_copy_bb_pool);
-  original_copy_bb_pool
-    = create_alloc_pool ("original_copy",
-                         sizeof (struct htab_bb_copy_original_entry), 10);
+
+  original_copy_bb_pool = new pool_allocator<htab_bb_copy_original_entry>
+    ("original_copy", 10);
   bb_original = new hash_table<bb_copy_hasher> (10);
   bb_copy = new hash_table<bb_copy_hasher> (10);
   loop_copy = new hash_table<bb_copy_hasher> (10);
@@ -1095,7 +1093,7 @@ free_original_copy_tables (void)
   bb_copy = NULL;
   delete loop_copy;
   loop_copy = NULL;
-  free_alloc_pool (original_copy_bb_pool);
+  delete original_copy_bb_pool;
   original_copy_bb_pool = NULL;
 }
 
@@ -1117,7 +1115,7 @@ copy_original_table_clear (hash_table<bb_copy_hasher> *tab, unsigned obj)
   elt = *slot;
   tab->clear_slot (slot);
-  pool_free (original_copy_bb_pool, elt);
+  original_copy_bb_pool->remove (elt);
 }
 
 /* Sets the value associated with OBJ in table TAB to VAL.
@@ -1137,8 +1135,7 @@ copy_original_table_set (hash_table<bb_copy_hasher> *tab,
   slot = tab->find_slot (key, INSERT);
   if (!*slot)
     {
-      *slot = (struct htab_bb_copy_original_entry *)
-              pool_alloc (original_copy_bb_pool);
+      *slot = original_copy_bb_pool->allocate ();
       (*slot)->index1 = obj;
     }
   (*slot)->index2 = val;
-- 
2.1.4
Re: [patch 10/10] debug-early merge: compiler proper
On 05/20/2015 11:50 AM, Aldy Hernandez wrote: + determine anscestry later. */ ancestry +static bool early_dwarf_dumping; Sorry for the late bikeshedding, but dumping suddently strikes me as odd, since there is no output as with other dumping in the compiler. Can we change that to generation or building? + /* Reuse DIE even with a differing context. + +This happens when called through +dwarf2out_abstract_function for formal parameter +packs. */ + gcc_assert (parm_die-die_parent-die_tag + == DW_TAG_GNU_formal_parameter_pack); Does this mean we're generating a new DW_TAG_GNU_formal_parameter_pack in late debug even though we already generated one in early debug? If so, why? - /* It is possible to have both DECL_ABSTRACT_P and DECLARATION be true if we - started to generate the abstract instance of an inline, decided to output - its containing class, and proceeded to emit the declaration of the inline - from the member list for the class. If so, DECLARATION takes priority; - we'll get back to the abstract instance when done with the class. */ - - /* The class-scope declaration DIE must be the primary DIE. */ - if (origin declaration class_or_namespace_scope_p (context_die)) -{ - origin = NULL; - gcc_assert (!old_die); -} Can't this happen anymore? + if ((is_cu_die (old_die-die_parent) + /* FIXME: Jason doesn't like this condition, but it fixes + the inconsistency/ICE with the following Fortran test: + +module some_m +contains + logical function funky (FLAG) + funky = .true. + end function +end module + + Another alternative is !is_cu_die (context_die). + */ + || old_die-die_parent-die_tag == DW_TAG_module I like it now. :) You can leave the rest of the comment. + /* For non DECL_EXTERNALs, if range information is available, fill + the DIE with it. 
*/ else if (!DECL_EXTERNAL (decl)) { HOST_WIDE_INT cfa_fb_offset; + struct function *fun = DECL_STRUCT_FUNCTION (decl); - if (!old_die || !get_AT (old_die, DW_AT_inline)) - equate_decl_number_to_die (decl, subr_die); + /* If we have no fun-fde, we have no range information. +Skip over and fill in range information in the second +dwarf pass. */ + if (!fun-fde) + goto no_fde_continue; How about controlling this block with !early_dwarf so you don't need to deal with missing FDE? if (generic_decl_parm lang_hooks.function_parameter_pack_p (generic_decl_parm)) - gen_formal_parameter_pack_die (generic_decl_parm, - parm, subr_die, - parm); + { + if (early_dwarf_dumping) + gen_formal_parameter_pack_die (generic_decl_parm, + parm, subr_die, + parm); + else if (parm) + parm = DECL_CHAIN (parm); + } Let's try only setting generic_decl when early_dwarf. + /* Unless we have an existing non-declaration DIE, equate the new + DIE. */ + if (!old_die || is_declaration_die (old_die)) +equate_decl_number_to_die (decl, subr_die); ... + if (decl (DECL_ABSTRACT_P (decl) || declaration || old_die == NULL + /* If we make it to a specialization, we have already + handled the declaration by virtue of early dwarf. + If so, make a new assocation if available, so late + dwarf can find it. */ + || (specialization_p early_dwarf_dumping))) equate_decl_number_to_die (decl, var_die); Why are the conditions so different? Can we use the function condition for variables, too? + /* Do nothing. This must have been early dumped and it +won't even need location information since it's a +DW_AT_inline function. */ + for (dw_die_ref c = context_die; c; c = c-die_parent) + if (c-die_tag == DW_TAG_inlined_subroutine + || c-die_tag == DW_TAG_subprogram) + { + gcc_assert (get_AT (c, DW_AT_inline)); + break; + } Maybe wrap this in #ifdef ENABLE_CHECKING. + /* Do the new DIE dance. 
*/ + stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt); + BLOCK_DIE (stmt) = stmt_die; + } +} + else if (BLOCK_ABSTRACT_ORIGIN (stmt)) +{ + /* If this is an inlined instance, create a new lexical die for +anything below to attach DW_AT_abstract_origin to. */ + stmt_die = new_die (DW_TAG_lexical_block, context_die, stmt); +} + else +{ + if (!stmt_die) +
Re: [PATCH] Fix duplicated warning with __attribute__((format)) (PR c/64223)
On 05/26/2015 05:06 AM, Marek Polacek wrote: Ping. On Tue, May 19, 2015 at 04:07:53PM +0200, Marek Polacek wrote: This PR points out that we output same -Wformat warning twice when using __attribute__ ((format)). The problem was that attribute_value_equal (called when processing merge_attributes) got two lists: format printf, 1, 2 and __format__ __printf__, 1, 2, these should be equal. But since attribute_value_equal uses simple_cst_list_equal when it sees a TREE_LISTs, it doesn't consider __printf__ and printf as the same, so it said that the two lists aren't same. That means that the type then contains two same format attributes and we warn twice. Fixed by handling the format attribute specially. (The patch doesn't consider the printf and the gnu_printf archetypes as the same, so we still might get duplicate warnings when combining printf and gnu_printf.) Bootstrapped/regtested on x86_64-linux, ok for trunk? 2015-05-19 Marek Polacek pola...@redhat.com PR c/64223 * tree.c (attribute_value_equal): Handle attribute format. (cmp_attrib_identifiers): Factor out of lookup_ident_attribute. * gcc.dg/pr64223-1.c: New test. * gcc.dg/pr64223-2.c: New test. diff --git gcc/tree.c gcc/tree.c index 6297f04..a58ad7b 100644 --- gcc/tree.c +++ gcc/tree.c @@ -4871,9 +4871,53 @@ simple_cst_list_equal (const_tree l1, const_tree l2) return l1 == l2; } +/* Compare two identifier nodes representing attributes. Either one may + be in prefixed __ATTR__ form. Return true if they are the same, false + otherwise. */ I think wrapped may be better than prefixed above. But it's obviously a nit. Your call whether or not to change. 
+
+  if (attr2_len == attr1_len + 4)
+    {
+      const char *p = IDENTIFIER_POINTER (attr2);
+      const char *q = IDENTIFIER_POINTER (attr1);
+      if (p[0] == '_' && p[1] == '_'
+          && p[attr2_len - 2] == '_' && p[attr2_len - 1] == '_'
+          && strncmp (q, p + 2, attr1_len) == 0)
+        return true;;
+    }
+  else if (attr2_len + 4 == attr1_len)
+    {
+      const char *p = IDENTIFIER_POINTER (attr2);
+      const char *q = IDENTIFIER_POINTER (attr1);
+      if (q[0] == '_' && q[1] == '_'
+          && q[attr1_len - 2] == '_' && q[attr1_len - 1] == '_'
+          && strncmp (q + 2, p, attr2_len) == 0)
+        return true;
+    }

Consider canonicalizing and using std::swap so that the longer identifier is always in attr1 and the second hunk of code can just go away. Obviously it's not a huge deal and again, your call whether or not to pursue this very minor cleanup.

OK for the trunk as is, or with a patch which makes either or both of the trivial changes noted above.

Jeff
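For illustration, the std::swap canonicalization suggested above might look roughly like the following. This is only a sketch on plain C strings: the name attr_names_equal is made up here, and the real code works on identifier nodes via IDENTIFIER_POINTER and IDENTIFIER_LENGTH, so the equal-length branch in particular is an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <utility>

// Sketch only: compares two attribute names given as C strings, where either
// one may be in the wrapped __ATTR__ form.  attr_names_equal is a
// hypothetical name, not a function from tree.c.
bool
attr_names_equal (const char *attr1, const char *attr2)
{
  std::size_t len1 = std::strlen (attr1);
  std::size_t len2 = std::strlen (attr2);

  // Same length: both are plain or both are wrapped, so compare directly.
  if (len1 == len2)
    return std::strcmp (attr1, attr2) == 0;

  // Canonicalize so that attr1 is always the longer identifier.  After
  // this, only one "wrapped vs. plain" comparison is needed.
  if (len1 < len2)
    {
      std::swap (attr1, attr2);
      std::swap (len1, len2);
    }

  // The wrapped form is exactly four characters longer ("__" on each side).
  if (len1 != len2 + 4)
    return false;

  return attr1[0] == '_' && attr1[1] == '_'
         && attr1[len1 - 2] == '_' && attr1[len1 - 1] == '_'
         && std::strncmp (attr1 + 2, attr2, len2) == 0;
}
```

With this shape, attr_names_equal ("printf", "__printf__") and attr_names_equal ("__printf__", "printf") take the same path after the swap, which is the point of the cleanup.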
Re: [PATCH] LTO balanced map: add stats about insns and symbols.
On Tue, May 26, 2015 at 4:13 PM, Martin Liška mli...@suse.cz wrote:

Hello.

Following patch enhanced dump output for LTO balanced map.

Sample output:

Partition sizes:
partition 0 contains 2413 (13.33%) symbols and 56646 (3.62%) insns
partition 1 contains 2006 (11.08%) symbols and 55901 (3.57%) insns
partition 2 contains 1954 (10.79%) symbols and 61054 (3.90%) insns
partition 3 contains 1234 (6.82%) symbols and 61331 (3.92%) insns
partition 4 contains 2024 (11.18%) symbols and 60955 (3.89%) insns
partition 5 contains 2332 (12.88%) symbols and 61030 (3.90%) insns
partition 6 contains 2294 (12.67%) symbols and 60585 (3.87%) insns
partition 7 contains 1044 (5.77%) symbols and 56854 (3.63%) insns
partition 8 contains 1390 (7.68%) symbols and 60877 (3.89%) insns
partition 9 contains 1891 (10.44%) symbols and 56356 (3.60%) insns
partition 10 contains 1172 (6.47%) symbols and 56990 (3.64%) insns
partition 11 contains 2099 (11.59%) symbols and 57168 (3.65%) insns
partition 12 contains 2444 (13.50%) symbols and 60830 (3.88%) insns
partition 13 contains 1610 (8.89%) symbols and 51294 (3.28%) insns
partition 14 contains 1949 (10.76%) symbols and 61142 (3.90%) insns
partition 15 contains 2256 (12.46%) symbols and 60634 (3.87%) insns
partition 16 contains 2951 (16.30%) symbols and 61536 (3.93%) insns
partition 17 contains 1968 (10.87%) symbols and 62862 (4.01%) insns
partition 18 contains 2298 (12.69%) symbols and 62748 (4.01%) insns
partition 19 contains 1679 (9.27%) symbols and 61772 (3.94%) insns
partition 20 contains 2265 (12.51%) symbols and 61851 (3.95%) insns
partition 21 contains 2234 (12.34%) symbols and 62310 (3.98%) insns
partition 22 contains 2345 (12.95%) symbols and 62185 (3.97%) insns
partition 23 contains 1816 (10.03%) symbols and 60530 (3.87%) insns
partition 24 contains 2655 (14.66%) symbols and 63232 (4.04%) insns
partition 25 contains 1782 (9.84%) symbols and 45523 (2.91%) insns
partition 26 contains 2217 (12.25%) symbols and 67405 (4.30%) insns
partition 27 contains 2642 (14.59%) symbols and 66556 (4.25%) insns
partition 28 contains 2454 (13.55%) symbols and 66748 (4.26%) insns
partition 29 contains 2637 (14.57%) symbols and 66711 (4.26%) insns
partition 30 contains 2244 (12.39%) symbols and 51957 (3.32%) insns

Patch can bootstrap on x86_64-linux-gnu and can build Firefox and Inkscape with LTO enabled.

Ready for trunk?

Ok.

Richard.

Thanks,
Martin
Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall
Hi Jeff, On 12/05/15 23:04, Jeff Law wrote: On 05/11/2015 03:28 AM, Kyrill Tkachov wrote: The more I think about this, the more I think it's an ugly can of worms and maybe we should just disable sibcalls for partial arguments. I doubt it's a big performance issue in general. We already have quite a bit of code in calls.c to detect cases with partial argument overlap for the explicit purpose of allowing sibcalls when partial arguments occur in the general case. However, that code only detects when a partial argument overlaps with other arguments in a call. In this PR the partial argument overlaps with itself. It would be a shame to disable sibcalls for all partial arguments when there is already infrastructure in place to handle them. I didn't even realize we had support for partial arguments in sibcalls. Ah, Kazu added that in 2005, I totally missed it. I probably would have suggested failing the sibcall for those cases back then too... Is there any way to re-use that infrastructure to deal with the case at hand? In addition to the argument/stack direction stuff, I've been pondering the stack/frame/arg pointer issues. Your approach assumes that the incoming and outgoing areas are always referenced off the same base register. If they aren't, then the routine returns no overlap. But we'd need to consider the case where we have a reference to the arg or frame pointer which later gets rewritten into a stack pointer relative address. Is it too late at the point were you do the checks to reject the sibling call? If not, then maybe the overlap routine should return a tri-state. No overlap, overlap, don't know. The last would be used when the two addresses use a different register. Ok, here is my attempt at that. The overlap functions returns -2 when it cannot staticall compare the two pointers (i.e. when the base registers are different) and the caller then disables sibcalls. 
The code in calls.c that calls this code will undo any emitted instructions in the meantime if sibcall optimisation fails. This required me to change the type of emit_push_insn to bool and add an extra parameter, so this patch touches a bit more code than the original version. Bootstrapped on x86_64 and tested on arm. The testcase in this PR still performs a sibcall correctly on arm. What do you think of this? Thanks, Kyrill 2015-05-11 Kyrylo Tkachov kyrylo.tkac...@arm.com PR target/65358 * expr.c (memory_load_overlap): New function. (emit_push_insn): When pushing partial args to the stack would clobber the register part load the overlapping part into a pseudo and put it into the hard reg after pushing. Change return type to bool. Add bool argument. * expr.h (emit_push_insn): Change return type to bool. Add bool argument. * calls.c (expand_call): Cancel sibcall optimisation when encountering partial argument on targets with ARGS_GROW_DOWNWARD and !STACK_GROWS_DOWNWARD. (emit_library_call_value_1): Update callsite of emit_push_insn. (store_one_arg): Likewise. 2015-05-11 Honggyu Kim hong.gyu@lge.com PR target/65358 * gcc.dg/pr65358.c: New test. Jeff expr.patch commit 5b596f10846b6d3b143442a306801c8262d8b10a Author: Kyrylo Tkachovkyrylo.tkac...@arm.com Date: Wed Mar 18 13:42:37 2015 + [expr.c] PR 65358 Avoid clobbering partial argument during sibcall diff --git a/gcc/calls.c b/gcc/calls.c index caa7d60..81ef2c9 100644 --- a/gcc/calls.c +++ b/gcc/calls.c @@ -3225,6 +3225,13 @@ expand_call (tree exp, rtx target, int ignore) { rtx_insn *before_arg = get_last_insn (); + /* On targets with weird calling conventions (e.g. PA) it's +hard to ensure that all cases of argument overlap between +stack and registers work. Play it safe and bail out. */ +#if defined (ARGS_GROW_DOWNWARD) !defined (STACK_GROWS_DOWNWARD) + sibcall_failure = 1; + break; +#endif So we're trying to get away from this kind of conditional compilation. 
Instead we want to write

  if (ARGS_GROW_DOWNWARD && !STACK_GROWS_DOWNWARD)

ARGS_GROW_DOWNWARD is already a testable value. But STACK_GROWS_DOWNWARD is not. The way folks have been dealing with this is something like this after the #includes:

/* Redefine STACK_GROWS_DOWNWARD in terms of 0 or 1.  */
#ifdef STACK_GROWS_DOWNWARD
# undef STACK_GROWS_DOWNWARD
# define STACK_GROWS_DOWNWARD 1
#else
# define STACK_GROWS_DOWNWARD 0
#endif

With that in place you can change the test into the more desirable

  if (ARGS_GROW_DOWNWARD && !STACK_GROWS_DOWNWARD)

diff --git a/gcc/expr.c b/gcc/expr.c
index 25aa11f..712fa0b 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -4121,12 +4121,35 @@ emit_single_push_insn (machine_mode mode, rtx x, tree type)
 }
 #endif
 
+/* If reading SIZE bytes from X will end up reading from
+   Y return the number of bytes that overlap.  Return -1
+   if there is no overlap or -2 if we can't determing
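As a rough illustration of the tri-state result discussed in this thread (overlap size, no overlap, or "don't know" when the base registers differ), here is a small sketch. It is not the patch's code: the Address struct, the constant names and load_overlap are hypothetical, and addresses are modelled as a register number plus a constant offset rather than as RTL.

```cpp
#include <cassert>

// Hypothetical stand-in for an address: a base register number plus a
// constant byte offset.  In GCC these would be rtx expressions.
struct Address
{
  int base_reg;
  long offset;
};

// Sentinel results, following the -1 / -2 convention from the thread.
const long NO_OVERLAP = -1;
const long UNKNOWN_OVERLAP = -2;

// If reading SIZE bytes starting at X runs into Y, return the distance from
// Y to the end of the read, which is how many of the bytes read lie at or
// above Y.  Return NO_OVERLAP if the read ends at or before Y, or starts
// after Y.  Return UNKNOWN_OVERLAP if X and Y use different base registers
// and so cannot be compared statically.
long
load_overlap (Address x, Address y, long size)
{
  if (x.base_reg != y.base_reg)
    return UNKNOWN_OVERLAP;

  long val = x.offset + size - y.offset;
  return (val >= 1 && val <= size) ? val : NO_OVERLAP;
}
```

A caller in the spirit of the discussion would treat UNKNOWN_OVERLAP conservatively, for example by giving up on the sibcall, rather than assuming the accesses are disjoint.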
[PATCH 05/35] Change use to type-based pool allocator in ira-color.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* ira-color.c (init_update_cost_records): Use new type-based pool allocator.
	(get_update_cost_record) Likewise.
	(free_update_cost_record_list) Likewise.
	(finish_update_cost_records) Likewise.
	(initiate_cost_update) Likewise.
---
 gcc/ira-color.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 4750714..b719e7a 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -123,6 +123,21 @@ struct update_cost_record
   int divisor;
   /* Next record for given allocno.  */
   struct update_cost_record *next;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((update_cost_record *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<update_cost_record> pool;
 };
 
 /* To decrease footprint of ira_allocno structure we store all data
@@ -1166,25 +1181,16 @@ setup_profitable_hard_regs (void)
    allocnos.  */
 
 /* Pool for update cost records.  */
-static alloc_pool update_cost_record_pool;
-
-/* Initiate update cost records.  */
-static void
-init_update_cost_records (void)
-{
-  update_cost_record_pool
-    = create_alloc_pool ("update cost records",
-                         sizeof (struct update_cost_record), 100);
-}
+pool_allocator<update_cost_record> update_cost_record::pool
+  ("update cost records", 100);
 
 /* Return new update cost record with given params.  */
 static struct update_cost_record *
 get_update_cost_record (int hard_regno, int divisor,
                         struct update_cost_record *next)
 {
-  struct update_cost_record *record;
+  update_cost_record *record = new update_cost_record;
 
-  record = (struct update_cost_record *) pool_alloc (update_cost_record_pool);
   record->hard_regno = hard_regno;
   record->divisor = divisor;
   record->next = next;
@@ -1200,7 +1206,7 @@ free_update_cost_record_list (struct update_cost_record *list)
   while (list != NULL)
     {
       next = list->next;
-      pool_free (update_cost_record_pool, list);
+      delete list;
       list = next;
     }
 }
@@ -1209,7 +1215,7 @@ free_update_cost_record_list (struct update_cost_record *list)
 static void
 finish_update_cost_records (void)
 {
-  free_alloc_pool (update_cost_record_pool);
+  update_cost_record::pool.release ();
 }
 
 /* Array whose element value is TRUE if the corresponding hard
@@ -1264,7 +1270,6 @@ initiate_cost_update (void)
     = (struct update_cost_queue_elem *) ira_allocate (size);
   memset (update_cost_queue_elems, 0, size);
   update_cost_check = 0;
-  init_update_cost_records ();
 }
 
 /* Deallocate data used by function update_costs_from_copies.  */
-- 
2.1.4
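To make the pattern in this patch easier to follow outside of GCC, here is a self-contained sketch of a class that routes operator new and operator delete through a static, type-based pool. The simple_pool template below is a simplified, hypothetical stand-in and not GCC's pool_allocator; only the allocate/remove/release names and the (name, block size) constructor shape are taken from the diff, and live () is an extra added for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical, simplified stand-in for a type-based pool allocator.
template <typename T>
class simple_pool
{
public:
  simple_pool (const char *name, std::size_t block_elts)
    : m_name (name), m_block_elts (block_elts ? block_elts : 1),
      m_free (nullptr), m_live (0) {}

  ~simple_pool () { release (); }

  // Hand out one uninitialized slot, growing the pool if needed.
  void *allocate ()
  {
    if (!m_free)
      grow ();
    slot *s = m_free;
    m_free = s->next;
    ++m_live;
    return s;
  }

  // Put a slot back on the free list; the memory is kept for reuse.
  void remove (T *ptr)
  {
    slot *s = reinterpret_cast<slot *> (ptr);
    s->next = m_free;
    m_free = s;
    --m_live;
  }

  // Return all blocks to the system.
  void release ()
  {
    for (slot *block : m_blocks)
      ::operator delete (block);
    m_blocks.clear ();
    m_free = nullptr;
    m_live = 0;
  }

  std::size_t live () const { return m_live; }
  const char *name () const { return m_name; }

private:
  // A slot is either a free-list link or storage for one T.
  union slot
  {
    slot *next;
    alignas (T) unsigned char storage[sizeof (T)];
  };

  // Allocate one more block and thread its slots onto the free list.
  void grow ()
  {
    slot *block
      = static_cast<slot *> (::operator new (m_block_elts * sizeof (slot)));
    m_blocks.push_back (block);
    for (std::size_t i = 0; i < m_block_elts; ++i)
      {
        block[i].next = m_free;
        m_free = &block[i];
      }
  }

  const char *m_name;
  std::size_t m_block_elts;
  slot *m_free;
  std::size_t m_live;
  std::vector<slot *> m_blocks;
};

// Same shape as the struct in the patch: plain new/delete go to the pool.
struct update_cost_record
{
  int hard_regno;
  int divisor;
  update_cost_record *next;

  void *operator new (std::size_t) { return pool.allocate (); }
  void operator delete (void *ptr)
  {
    pool.remove (static_cast<update_cost_record *> (ptr));
  }

  static simple_pool<update_cost_record> pool;
};

simple_pool<update_cost_record> update_cost_record::pool
  ("update cost records", 100);
```

The practical effect, as in the diff, is that callers write `new update_cost_record` and `delete list` with no casts and no separate init function, and the only explicit pool call left is the release at teardown.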
[PATCH 09/35] Change use to type-based pool allocator in c-format.c.
gcc/c-family/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* c-format.c (check_format_arg): Use new type-based pool allocator.
	(check_format_info_main) Likewise.
---
 gcc/c-family/c-format.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 145bbfd..7b9bf38 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -1031,7 +1031,8 @@ static void check_format_arg (void *, tree, unsigned HOST_WIDE_INT);
 static void check_format_info_main (format_check_results *,
                                     function_format_info *,
                                     const char *, int, tree,
-                                    unsigned HOST_WIDE_INT, alloc_pool);
+                                    unsigned HOST_WIDE_INT,
+                                    pool_allocator<format_wanted_type> &);
 
 static void init_dollar_format_checking (int, tree);
 static int maybe_read_dollar_number (const char **, int,
@@ -1518,7 +1519,6 @@ check_format_arg (void *ctx, tree format_tree,
   const char *format_chars;
   tree array_size = 0;
   tree array_init;
-  alloc_pool fwt_pool;
 
   if (TREE_CODE (format_tree) == VAR_DECL)
     {
@@ -1694,11 +1694,9 @@ check_format_arg (void *ctx, tree format_tree,
      will decrement it if it finds there are extra arguments, but this way
      need not adjust it for every return.  */
   res->number_other++;
-  fwt_pool = create_alloc_pool ("format_wanted_type pool",
-                                sizeof (format_wanted_type), 10);
+  pool_allocator <format_wanted_type> fwt_pool ("format_wanted_type pool", 10);
   check_format_info_main (res, info, format_chars, format_length,
                           params, arg_num, fwt_pool);
-  free_alloc_pool (fwt_pool);
 }
 
 
@@ -1713,7 +1711,8 @@ static void
 check_format_info_main (format_check_results *res,
                         function_format_info *info, const char *format_chars,
                         int format_length, tree params,
-                        unsigned HOST_WIDE_INT arg_num, alloc_pool fwt_pool)
+                        unsigned HOST_WIDE_INT arg_num,
+                        pool_allocator<format_wanted_type> &fwt_pool)
 {
   const char *orig_format_chars = format_chars;
   tree first_fillin_param = params;
@@ -2424,8 +2423,7 @@ check_format_info_main (format_check_results *res,
           fci = fci->chain;
           if (fci)
             {
-              wanted_type_ptr = (format_wanted_type *)
-                pool_alloc (fwt_pool);
+              wanted_type_ptr = fwt_pool.allocate ();
               arg_num++;
               wanted_type = *fci->types[length_chars_val].type;
               wanted_type_name = fci->types[length_chars_val].name;
-- 
2.1.4
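The change in check_format_arg replaces a create/free pair with a pool object that lives on the stack and is handed to the callee by reference. A toy sketch of that lifetime pattern follows; scoped_pool, wanted_type, count_directives and check_format are all hypothetical names for illustration, the pool is backed by std::deque rather than GCC's allocator, and the "format parsing" simply counts '%' characters.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for format_wanted_type.
struct wanted_type
{
  int arg_num;
  wanted_type *next;
};

// Toy scope-bound pool: std::deque keeps element addresses stable when
// growing at the end, and everything is freed when the pool is destroyed.
template <typename T>
class scoped_pool
{
public:
  T *allocate ()
  {
    m_elts.emplace_back ();
    return &m_elts.back ();
  }

  std::size_t size () const { return m_elts.size (); }

private:
  std::deque<T> m_elts;
};

// The callee borrows the pool by reference and never frees anything.
// For simplicity this counts every '%' character as one directive.
int
count_directives (const char *format, scoped_pool<wanted_type> &pool)
{
  wanted_type *head = nullptr;
  int n = 0;
  for (const char *p = format; *p; ++p)
    if (*p == '%')
      {
        wanted_type *w = pool.allocate ();
        w->arg_num = ++n;
        w->next = head;
        head = w;
      }
  return n;
}

// The caller owns the pool for exactly the duration of the check.
int
check_format (const char *format)
{
  scoped_pool<wanted_type> pool;   // takes the place of the create call
  return count_directives (format, pool);
  // pool is destroyed on return, taking the place of the explicit free
}
```

Passing by reference matters here: passing such a pool by value would copy it, so allocations made by the callee would not belong to the caller's pool.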
[PATCH 19/35] Change use to type-based pool allocator in sel-sched-ir.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* sel-sched-ir.c (alloc_sched_pools): Use new type-based pool allocator.
	(free_sched_pools): Likewise.
	* sel-sched-ir.h (_list_alloc): Likewise.
	(_list_remove): Likewise.
---
 gcc/sel-sched-ir.c | 7 ++-
 gcc/sel-sched-ir.h | 6 +++---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 94f6c43..ffaba56 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -70,7 +70,7 @@ vec<sel_region_bb_info_def> sel_region_bb_info = vNULL;
 
 /* A pool for allocating all lists.  */
-alloc_pool sched_lists_pool;
+pool_allocator<_list_node> sched_lists_pool ("sel-sched-lists", 500);
 
 /* This contains information about successors for compute_av_set.  */
 struct succs_info current_succs;
@@ -5030,9 +5030,6 @@ alloc_sched_pools (void)
   succs_info_pool.size = succs_size;
   succs_info_pool.top = -1;
   succs_info_pool.max_top = -1;
-
-  sched_lists_pool = create_alloc_pool ("sel-sched-lists",
-                                        sizeof (struct _list_node), 500);
 }
 
 /* Free the pools.  */
@@ -5041,7 +5038,7 @@ free_sched_pools (void)
 {
   int i;
 
-  free_alloc_pool (sched_lists_pool);
+  sched_lists_pool.release ();
   gcc_assert (succs_info_pool.top == -1);
   for (i = 0; i <= succs_info_pool.max_top; i++)
     {
diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 91ce92f..3707a87 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -364,12 +364,12 @@ struct _list_node
 /* _list_t functions.
    All of _*list_* functions are used through accessor macros, thus
    we can't move them in sel-sched-ir.c.  */
-extern alloc_pool sched_lists_pool;
+extern pool_allocator<_list_node> sched_lists_pool;
 
 static inline _list_t
 _list_alloc (void)
 {
-  return (_list_t) pool_alloc (sched_lists_pool);
+  return sched_lists_pool.allocate ();
 }
 
 static inline void
@@ -395,7 +395,7 @@ _list_remove (_list_t *lp)
   _list_t n = *lp;
 
   *lp = _LIST_NEXT (n);
-  pool_free (sched_lists_pool, n);
+  sched_lists_pool.remove (n);
 }
 
 static inline void
-- 
2.1.4
[PATCH] Fix PR66272
The following fixes PR66272.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk so far.

Richard.

2015-05-27  Richard Biener  rguent...@suse.de

	PR tree-optimization/66272
	Revert parts of
	2014-08-15  Richard Biener  rguent...@suse.de

	PR tree-optimization/62031
	* tree-data-ref.c (dr_analyze_indices): Do not set
	DR_UNCONSTRAINED_BASE.
	(dr_may_alias_p): All indirect accesses have to go the
	formerly DR_UNCONSTRAINED_BASE path.
	* tree-data-ref.h (struct indices): Remove
	unconstrained_base member.
	(DR_UNCONSTRAINED_BASE): Remove.

	* gcc.dg/torture/pr66272.c: New testcase.

Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 223737)
+++ gcc/tree-data-ref.c (working copy)
@@ -1036,6 +1036,7 @@ dr_analyze_indices (struct data_referenc
                                  base, memoff);
           MR_DEPENDENCE_CLIQUE (ref) = MR_DEPENDENCE_CLIQUE (old);
           MR_DEPENDENCE_BASE (ref) = MR_DEPENDENCE_BASE (old);
+          DR_UNCONSTRAINED_BASE (dr) = true;
           access_fns.safe_push (access_fn);
         }
     }
@@ -1453,7 +1454,8 @@ dr_may_alias_p (const struct data_refere
      offset/overlap based analysis but have to rely on points-to
      information only.  */
   if (TREE_CODE (addr_a) == MEM_REF
-      && TREE_CODE (TREE_OPERAND (addr_a, 0)) == SSA_NAME)
+      && (DR_UNCONSTRAINED_BASE (a)
+          || TREE_CODE (TREE_OPERAND (addr_a, 0)) == SSA_NAME))
     {
       /* For true dependences we can apply TBAA.  */
       if (flag_strict_aliasing
@@ -1469,7 +1471,8 @@ dr_may_alias_p (const struct data_refere
                                        build_fold_addr_expr (addr_b));
     }
   else if (TREE_CODE (addr_b) == MEM_REF
-           && TREE_CODE (TREE_OPERAND (addr_b, 0)) == SSA_NAME)
+           && (DR_UNCONSTRAINED_BASE (b)
+               || TREE_CODE (TREE_OPERAND (addr_b, 0)) == SSA_NAME))
     {
       /* For true dependences we can apply TBAA.  */
       if (flag_strict_aliasing
Index: gcc/tree-data-ref.h
===
--- gcc/tree-data-ref.h (revision 223737)
+++ gcc/tree-data-ref.h (working copy)
@@ -81,6 +81,10 @@ struct indices
 
   /* A list of chrecs.  Access functions of the indices.  */
   vec<tree> access_fns;
+
+  /* Whether BASE_OBJECT is an access representing the whole object
+     or whether the access could not be constrained.  */
+  bool unconstrained_base;
 };
 
 struct dr_alias
@@ -129,6 +133,7 @@ struct data_reference
 #define DR_STMT(DR)                (DR)->stmt
 #define DR_REF(DR)                 (DR)->ref
 #define DR_BASE_OBJECT(DR)         (DR)->indices.base_object
+#define DR_UNCONSTRAINED_BASE(DR)  (DR)->indices.unconstrained_base
 #define DR_ACCESS_FNS(DR)          (DR)->indices.access_fns
 #define DR_ACCESS_FN(DR, I)        DR_ACCESS_FNS (DR)[I]
 #define DR_NUM_DIMENSIONS(DR)      DR_ACCESS_FNS (DR).length ()
Index: gcc/testsuite/gcc.dg/torture/pr66272.c
===
--- gcc/testsuite/gcc.dg/torture/pr66272.c (revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr66272.c (working copy)
@@ -0,0 +1,23 @@
+/* { dg-do run } */
+
+struct S
+{
+  int f0;
+  int f1;
+};
+
+int b;
+
+int main ()
+{
+  struct S a[2] = { 0 };
+  struct S d = { 0, 1 };
+  for (b = 0; b < 2; b++)
+    {
+      a[b] = d;
+      d = a[0];
+    }
+  if (d.f1 != 1)
+    __builtin_abort ();
+  return 0;
+}
Re: [PATCH] hppa-linux: add missing cpp specs
On 27 May 2015 13:05, John David Anglin wrote:
On 2015-05-27 11:59 AM, Mike Frysinger wrote:

Define CPP_SPEC for parisc linux targets so that -posix and -pthread work like on all other linux targets.

2015-05-27  Mike Frysinger  vap...@gentoo.org

	* config/pa/pa-linux.h (CPP_SPEC): Define.

Okay. I think this should be applied to all active branches. ChangeLog entry should mention _REENTRANT.

since i'm not looped into gcc development normally, which branches are those currently ? naively reading the gcc.gnu.org homepage makes me think none, since they're labeled regression fixes and afaict, none of these are regressions. they've been broken for as long as the ports have existed :/.
-mike
Re: [PATCH 05/35] Change use to type-based pool allocator in ira-color.c.
On 05/27/2015 07:56 AM, mliska wrote:

gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* ira-color.c (init_update_cost_records): Use new type-based pool allocator.
	(get_update_cost_record) Likewise.
	(free_update_cost_record_list) Likewise.
	(finish_update_cost_records) Likewise.
	(initiate_cost_update) Likewise.

OK.
jeff
Re: [PATCH 09/35] Change use to type-based pool allocator in c-format.c.
On 05/27/2015 07:56 AM, mliska wrote:

gcc/c-family/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* c-format.c (check_format_arg): Use new type-based pool allocator.
	(check_format_info_main) Likewise.

OK.
jeff
Re: [Ada] Remove propagation of atomicity from object to type
This trick is now obsolete since gigi should be able to rewrite the type of the objects to meet the atomicity requirements on its own. It just needs the following patchlet, applied on the mainline.

2015-05-27  Eric Botcazou  ebotca...@adacore.com

	* gcc-interface/decl.c (gnat_to_gnu_entity) <E_Variable>: Move down
	code applying atomic checks to the object.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c (revision 223766)
+++ gcc-interface/decl.c (working copy)
@@ -881,15 +881,6 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
             check_ok_for_atomic_type (gnu_inner, gnat_entity, true);
           }
 
-        /* Now check if the type of the object allows atomic access.  Note
-           that we must test the type, even if this object has size and
-           alignment to allow such access, because we will be going inside
-           the padded record to assign to the object.  We could fix this by
-           always copying via an intermediate value, but it's not clear it's
-           worth the effort.  */
-        if (Is_Atomic_Or_VFA (gnat_entity))
-          check_ok_for_atomic_type (gnu_type, gnat_entity, false);
-
         /* If this is an aliased object with an unconstrained nominal subtype,
            make a type that includes the template.  */
         if (Is_Constr_Subt_For_UN_Aliased (Etype (gnat_entity))
@@ -955,6 +946,10 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
                                       debug_info_p, gnat_entity);
           }
 
+        /* Now check if the type of the object allows atomic access.  */
+        if (Is_Atomic_Or_VFA (gnat_entity))
+          check_ok_for_atomic_type (gnu_type, gnat_entity, false);
+
         /* If this is a renaming, avoid as much as possible to create a new
            object.  However, in some cases, creating it is required because
            renaming can be applied to objects that are not names in Ada.
Re: [PATCH 16/35] Change use to type-based pool allocator in tree-sra.c.
On 05/27/2015 07:56 AM, mliska wrote:

gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* tree-sra.c (sra_initialize): Use new type-based pool allocator.
	(sra_deinitialize) Likewise.
	(create_access_1) Likewise.
	(build_accesses_from_assign) Likewise.
	(create_artificial_child_access) Likewise.

OK.
jeff
Re: [PATCH 17/35] Change use to type-based pool allocator in tree-ssa-math-opts.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-math-opts.c (occ_new): Use new type-based pool allocator. (free_bb): Likewise. (pass_cse_reciprocals::execute): Likewise. OK. jeff
Re: [PATCH 19/35] Change use to type-based pool allocator in sel-sched-ir.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * sel-sched-ir.c (alloc_sched_pools): Use new type-based pool allocator. (free_sched_pools): Likewise. * sel-sched-ir.h (_list_alloc): Likewise. (_list_remove): Likewise. OK jeff
Re: [patch] Move generic tree functions from expr.h to tree.h
You can leave get_inner_reference in its place then ... or move it. It's hardly only used by expansion now.

Yes, but it's a hot function and I'd rather preserve its SVN history so I'm leaving it in its original place.

Thanks for the review. Here's the patch I have installed after having tested it on x86_64-suse-linux.

2015-05-27  Eric Botcazou  ebotca...@adacore.com

        * expr.h (array_at_struct_end_p): Move to...
        (array_ref_element_size): Likewise.
        (component_ref_field_offset): Likewise.
        * tree.h (array_ref_element_size): ...here.
        (array_at_struct_end_p): Likewise.
        (component_ref_field_offset): Likewise.
        * expr.c (array_ref_element_size): Move to...
        (array_ref_low_bound): Likewise.
        (array_at_struct_end_p): Likewise.
        (array_ref_up_bound): Likewise.
        (component_ref_field_offset): Likewise.
        * tree.c (array_ref_element_size): ...here.
        (array_ref_low_bound): Likewise.
        (array_ref_up_bound): Likewise.
        (array_at_struct_end_p): Likewise.
        (component_ref_field_offset): Likewise.

-- 
Eric Botcazou

Index: expr.h
===
--- expr.h (revision 223736)
+++ expr.h (working copy)
@@ -281,19 +281,10 @@ rtx get_personality_function (tree);
 extern int can_move_by_pieces (unsigned HOST_WIDE_INT, unsigned int);
 
 extern unsigned HOST_WIDE_INT highest_pow2_factor (const_tree);
 
-bool array_at_struct_end_p (tree);
-
-/* Return a tree of sizetype representing the size, in bytes, of the element
-   of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
-extern tree array_ref_element_size (tree);
 extern bool categorize_ctor_elements (const_tree, HOST_WIDE_INT *,
                                       HOST_WIDE_INT *, bool *);
 
-/* Return a tree representing the offset, in bytes, of the field referenced
-   by EXP.  This does not include any offset in DECL_FIELD_BIT_OFFSET.  */
-extern tree component_ref_field_offset (tree);
-
 extern void expand_operands (tree, tree, rtx, rtx*, rtx*,
                              enum expand_modifier);
Index: expr.c
===
--- expr.c (revision 223736)
+++ expr.c (working copy)
@@ -6953,139 +6953,6 @@ get_inner_reference (tree exp, HOST_WIDE
   return exp;
 }
 
-/* Return a tree of sizetype representing the size, in bytes, of the element
-   of EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
-
-tree
-array_ref_element_size (tree exp)
-{
-  tree aligned_size = TREE_OPERAND (exp, 3);
-  tree elmt_type = TREE_TYPE (TREE_TYPE (TREE_OPERAND (exp, 0)));
-  location_t loc = EXPR_LOCATION (exp);
-
-  /* If a size was specified in the ARRAY_REF, it's the size measured
-     in alignment units of the element type.  So multiply by that value.  */
-  if (aligned_size)
-    {
-      /* ??? tree_ssa_useless_type_conversion will eliminate casts to
-         sizetype from another type of the same width and signedness.  */
-      if (TREE_TYPE (aligned_size) != sizetype)
-        aligned_size = fold_convert_loc (loc, sizetype, aligned_size);
-      return size_binop_loc (loc, MULT_EXPR, aligned_size,
-                             size_int (TYPE_ALIGN_UNIT (elmt_type)));
-    }
-
-  /* Otherwise, take the size from that of the element type.  Substitute
-     any PLACEHOLDER_EXPR that we have.  */
-  else
-    return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_SIZE_UNIT (elmt_type), exp);
-}
-
-/* Return a tree representing the lower bound of the array mentioned in
-   EXP, an ARRAY_REF or an ARRAY_RANGE_REF.  */
-
-tree
-array_ref_low_bound (tree exp)
-{
-  tree domain_type = TYPE_DOMAIN (TREE_TYPE (TREE_OPERAND (exp, 0)));
-
-  /* If a lower bound is specified in EXP, use it.  */
-  if (TREE_OPERAND (exp, 2))
-    return TREE_OPERAND (exp, 2);
-
-  /* Otherwise, if there is a domain type and it has a lower bound, use it,
-     substituting for a PLACEHOLDER_EXPR as needed.  */
-  if (domain_type && TYPE_MIN_VALUE (domain_type))
-    return SUBSTITUTE_PLACEHOLDER_IN_EXPR (TYPE_MIN_VALUE (domain_type), exp);
-
-  /* Otherwise, return a zero of the appropriate type.  */
-  return build_int_cst (TREE_TYPE (TREE_OPERAND (exp, 1)), 0);
-}
-
-/* Returns true if REF is an array reference to an array at the end of
-   a structure.  If this is the case, the array may be allocated larger
-   than its upper bound implies.  */
-
-bool
-array_at_struct_end_p (tree ref)
-{
-  if (TREE_CODE (ref) != ARRAY_REF
-      && TREE_CODE (ref) != ARRAY_RANGE_REF)
-    return false;
-
-  while (handled_component_p (ref))
-    {
-      /* If the reference chain contains a component reference to a
-         non-union type and there follows another field the reference
-         is not at the end of a structure.  */
-      if (TREE_CODE (ref) == COMPONENT_REF
-          && TREE_CODE (TREE_TYPE (TREE_OPERAND (ref, 0))) == RECORD_TYPE)
-        {
-          tree nextf = DECL_CHAIN (TREE_OPERAND (ref, 1));
-          while (nextf && TREE_CODE (nextf) != FIELD_DECL)
-            nextf = DECL_CHAIN (nextf);
-          if (nextf)
-            return false;
Re: [PATCH 13/35] Change use to type-based pool allocator in df-problems.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * df-problems.c (df_chain_create):Use new type-based pool allocator. (df_chain_unlink_1) Likewise. (df_chain_unlink) Likewise. (df_chain_remove_problem) Likewise. (df_chain_alloc) Likewise. (df_chain_free) Likewise. * df.h (struct dataflow) Likewise. OK. As Jakub noted, please double-check your ChangeLogs for proper formatting before committing. There's consistently nits to fix in them. Jeff
Re: [PATCH 18/35] Change use to type-based pool allocator in stmt.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * stmt.c (add_case_node): Use new type-based pool allocator. (expand_case): Likewise. (expand_sjlj_dispatch_table): Likewise. OK. jeff
Re: [PATCH 21/35] Change use to type-based pool allocator in regcprop.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * regcprop.c (free_debug_insn_changes): Use new type-based pool allocator. (replace_oldest_value_reg): Likewise. (pass_cprop_hardreg::execute): Likewise. OK. jeff
Re: [PATCH 20/35] Change use to type-based pool allocator in ira-build.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ira-build.c (initiate_cost_vectors): Use new type-based pool allocator. (ira_allocate_cost_vector): Likewise. (ira_free_cost_vector): Likewise. (finish_cost_vectors): Likewise. OK. jeff
Re: [PATCH 25/35] Change use to type-based pool allocator in tree-ssa-sccvn.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-sccvn.c (vn_reference_insert): Use new type-based pool allocator. (vn_reference_insert_pieces): Likewise. (vn_phi_insert): Likewise. (visit_reference_op_call): Likewise. (copy_phi): Likewise. (copy_reference): Likewise. (process_scc): Likewise. (allocate_vn_table): Likewise. (free_vn_table): Likewise. OK. jeff
Re: [PATCH 24/35] Change use to type-based pool allocator in tree-ssa-reassoc.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-reassoc.c (add_to_ops_vec): Use new type-based pool allocator. (add_repeat_to_ops_vec): Likewise. (get_ops): Likewise. (maybe_optimize_range_tests): Likewise. (init_reassoc): Likewise. (fini_reassoc): Likewise. OK. jeff
Re: [PATCH 22/35] Change use to type-based pool allocator in sched-deps.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * sched-deps.c (create_dep_node): Use new type-based pool allocator. (delete_dep_node): Likewise. (create_deps_list): Likewise. (free_deps_list): Likewise. (sched_deps_init): Likewise. (sched_deps_finish): Likewise. OK. First use of the release_if_empty API that I've seen in these patches. jeff
Re: [PATCH 26/35] Change use to type-based pool allocator in tree-ssa-strlen.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-strlen.c (new_strinfo): Use new type-based pool allocator. (free_strinfo): Likewise. (pass_strlen::execute): Likewise. OK. jeff
Re: [PATCH 28/35] Change use to type-based pool allocator in ipa-profile.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ipa-profile.c (account_time_size): Use new type-based pool allocator. (ipa_profile_generate_summary): Likewise. (ipa_profile_read_summary): Likewise. (ipa_profile): Likewise. OK. jeff
Re: [PATCH 27/35] Change use to type-based pool allocator in tree-ssa-structalias.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-structalias.c (new_var_info): Use new type-based pool allocator. (new_constraint): Likewise. (init_alias_vars): Likewise. (delete_points_to_sets): Likewise. --- OK. Jeff
Re: [PATCH 29/35] Change use to type-based pool allocator in ipa-prop.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ipa-prop.c (ipa_set_jf_constant): Use new type-based pool allocator. (ipa_edge_duplication_hook): Likewise. (ipa_free_all_structures_after_ipa_cp): Likewise. (ipa_free_all_structures_after_iinln): Likewise. OK. Jeff
Re: [PATCH 33/35] Change use to type-based pool allocator in ira-color.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ira-color.c (init_update_cost_records): Use new type-based pool allocator. (get_update_cost_record): Likewise. (free_update_cost_record_list): Likewise. (finish_update_cost_records): Likewise. (initiate_cost_update): Likewise. OK. Jeff
Re: [PATCH 23/35] Change use to type-based pool allocator in tree-ssa-pre.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * tree-ssa-pre.c (get_or_alloc_expr_for_name): Use new type-based pool allocator. (bitmap_set_new): Likewise. (get_or_alloc_expr_for_constant): Likewise. (get_or_alloc_expr_for): Likewise. (phi_translate_1): Likewise. (compute_avail): Likewise. (init_pre): Likewise. (fini_pre): Likewise. OK. Jeff
Re: [PATCH] hppa-linux: add missing cpp specs
On 2015-05-27 1:50 PM, Mike Frysinger wrote:

since i'm not looped into gcc development normally, which branches are those currently ? naively reading gcc.gnu.org homepage makes me think none since they're labeled regression fixes and afaict, none of these are regressions. they've been broken for as long as the ports have existed :/.

The branches are 4.8, 4.9, 5 and trunk as noted on http://gcc.gnu.org. For target fixes that don't affect primary or secondary targets, nobody cares about the regression criteria.

This is probably one of the causes of poor thread behavior of many applications running on parisc hardware. I want to see the patch in Debian and you probably want it for Gentoo.

Dave
-- 
John David Anglin dave.ang...@bell.net
Re: [PATCH 32/35] Change use to type-based pool allocator in ira-build.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * ira-build.c (finish_allocnos): Use new type-based pool allocator. (finish_prefs): Likewise. (finish_copies): Likewise. Is this a partial duplicate of patch #34? Something seems amiss here. jeff
Re: [PATCH 35/35] Remove old pool allocator.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * alloc-pool.c (create_alloc_pool): Remove. (empty_alloc_pool): Likewise. (free_alloc_pool): Likewise. (free_alloc_pool_if_empty): Likewise. (pool_alloc): Likewise. (pool_free): Likewise. * alloc-pool.h: Remove old declarations. So, the remaining patches to use the type based pool allocator are OK as long as they have the same overall structure as the patches that have already been OK. You've got something goofy in #32/#34, which I'll assume you'll sort out sensibly. OK. jeff
RE: Remove splay_tree from gimplify.c
Date: Wed, 27 May 2015 18:41:55 +0200
From: ja...@redhat.com
To: l...@redhat.com
CC: hiradi...@msn.com; gcc-patches@gcc.gnu.org
Subject: Re: Remove splay_tree from gimplify.c

On Wed, May 27, 2015 at 10:34:46AM -0600, Jeff Law wrote:

So the question here is whether or not the other uses are commonly looking up elements we've already searched for -- that's the whole purpose of a splay tree, to improve lookup performance for commonly hit items.

First of all, this is only used for OpenMP/OpenACC/Cilk+ constructs, so it really isn't that performance critical. The code probably dates back to Richard's and Diego's changes. And, I believe splay trees are the right thing to use here: while sometimes you look up different vars, looking up the same var many times in a row is very common.

Jakub

If that is the case then I guess we can abandon the patch. I do not have a way to measure the compile time for OpenMP. Thanks for the review.

-Aditya
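
To illustrate the point about repeated lookups, here is a minimal splay tree sketch in C. It is not the libiberty splay-tree implementation that gimplify.c uses; every toy_* name is invented for this example, and it only shows the general technique: splaying moves the looked-up key to the root, so an immediately repeated lookup of the same variable finds it at depth 0.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy node; the real splay_tree stores key/value pairs and callbacks.  */
struct toy_node { int key; struct toy_node *left, *right; };

static struct toy_node *rotate_right (struct toy_node *x)
{
  struct toy_node *y = x->left;
  x->left = y->right;
  y->right = x;
  return y;
}

static struct toy_node *rotate_left (struct toy_node *x)
{
  struct toy_node *y = x->right;
  x->right = y->left;
  y->left = x;
  return y;
}

/* Bring KEY, or the last node on its search path, to the root.  */
struct toy_node *toy_splay (struct toy_node *root, int key)
{
  if (root == NULL || root->key == key)
    return root;
  if (key < root->key)
    {
      if (root->left == NULL)
        return root;
      if (key < root->left->key)        /* zig-zig */
        {
          root->left->left = toy_splay (root->left->left, key);
          root = rotate_right (root);
        }
      else if (key > root->left->key)   /* zig-zag */
        {
          root->left->right = toy_splay (root->left->right, key);
          if (root->left->right != NULL)
            root->left = rotate_left (root->left);
        }
      return root->left == NULL ? root : rotate_right (root);
    }
  if (root->right == NULL)
    return root;
  if (key > root->right->key)           /* zag-zag */
    {
      root->right->right = toy_splay (root->right->right, key);
      root = rotate_left (root);
    }
  else if (key < root->right->key)      /* zag-zig */
    {
      root->right->left = toy_splay (root->right->left, key);
      if (root->right->left != NULL)
        root->right = rotate_right (root->right);
    }
  return root->right == NULL ? root : rotate_left (root);
}

/* Insert KEY if absent; the new (or found) node becomes the root.  */
struct toy_node *toy_insert (struct toy_node *root, int key)
{
  struct toy_node *n;
  root = toy_splay (root, key);
  if (root != NULL && root->key == key)
    return root;
  n = malloc (sizeof *n);
  if (n == NULL)
    abort ();
  n->key = key;
  n->left = n->right = NULL;
  if (root != NULL && key < root->key)
    {
      n->right = root;
      n->left = root->left;
      root->left = NULL;
    }
  else if (root != NULL)
    {
      n->left = root;
      n->right = root->right;
      root->right = NULL;
    }
  return n;
}

/* Number of edges from ROOT down to KEY, or -1 if KEY is absent.  */
int toy_depth (const struct toy_node *root, int key)
{
  int depth = 0;
  while (root != NULL)
    {
      if (key == root->key)
        return depth;
      root = key < root->key ? root->left : root->right;
      depth++;
    }
  return -1;
}
```

After inserting keys 1 through 5 in order, key 1 sits at the bottom of a left-leaning chain; one call to toy_splay (root, 1) brings it to the root, which is the access pattern the reply describes as common for OpenMP variable lookups.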
RE: [PATCH] Print Pass Names
Date: Wed, 27 May 2015 09:07:30 -0600
From: l...@redhat.com
To: hiradi...@msn.com; richard.guent...@gmail.com; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Print Pass Names

On 05/26/2015 08:32 AM, Aditya K wrote:

I don't have commit access. I would appreciate if someone does that for me.

Thanks,
-Aditya

Date: Fri, 22 May 2015 14:52:29 -0600
From: l...@redhat.com
To: hiradi...@msn.com; richard.guent...@gmail.com; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Print Pass Names

On 05/22/2015 02:38 PM, Aditya K wrote:

Subject: Re: [PATCH] Print Pass Names
From: richard.guent...@gmail.com
Date: Fri, 22 May 2015 21:32:24 +0200
To: hiradi...@msn.com; gcc-patches@gcc.gnu.org

On May 22, 2015 6:32:38 PM GMT+02:00, Aditya K hiradi...@msn.com wrote:

Currently, when we print the passes it does not print its name. This becomes confusing when we want to print all the passes at once (e.g., -fdump-tree-all-all=stderr pass.dump). This patch adds functionality to print the pass name. It passes bootstrap (with default configurations). Hope this is useful.

Can't you just use current_pass->name?

You are right. I have updated the patch.

Thanks
-Aditya

gcc/ChangeLog:

2015-05-22  Aditya Kumar  hiradi...@msn.com

        * statistics.c (statistics_fini_pass): Print pass name.

OK.
jeff

Installed on the trunk after a bootstrap and regression test run on x86-linux-gnu.

Thanks for merging the patches.
-Aditya

Thanks,
Jeff
Re: [PATCH 08/35] Change use to type-based pool allocator in asan.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * asan.c (asan_mem_ref_get_alloc_pool):Use new type-based pool allocator. (asan_mem_ref_new) Likewise. (free_mem_ref_resources) Likewise. Presumably the inconsequential whitespace changes are removing trailing whitespace or something similar. OK. jeff
Re: [PATCH 10/35] Change use to type-based pool allocator in cfg.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * cfg.c (initialize_original_copy_tables):Use new type-based pool allocator. (free_original_copy_tables) Likewise. (copy_original_table_clear) Likewise. (copy_original_table_set) Likewise. OK. jeff
Re: [PATCH 11/35] Change use to type-based pool allocator in sh.c.
On 05/27/2015 07:56 AM, mliska wrote: gcc/ChangeLog: 2015-04-30 Martin Liska mli...@suse.cz * config/sh/sh.c (add_constant):Use new type-based pool allocator. (sh_reorg) Likewise. OK. jeff
Re: [PATCH] microblaze-linux: add missing cpp specs
On 05/27/2015 10:03 AM, Andreas Schwab wrote:

Mike Frysinger vap...@gentoo.org writes:

diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index a7faa7d..655a70f 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -22,6 +22,9 @@
 #undef TARGET_SUPPORTS_PIC
 #define TARGET_SUPPORTS_PIC 1
 
+#undef CPP_SPEC
+#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"

Should this be defined by a shared header?

Seems that way to me as well.

jeff
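
A rough sketch of what the shared-header idea could look like, assuming a common header that each Linux port includes. The macro name EXAMPLE_LINUX_CPP_SPEC and the accessor example_cpp_spec are hypothetical and exist only for this illustration; no claim is made about which header or macro name GCC actually uses. In a spec string, %{S:X} substitutes X when the switch -S was given, so -posix adds -D_POSIX_SOURCE and -pthread adds -D_REENTRANT.

```c
/* Would live once in a header shared by the Linux ports.
   EXAMPLE_LINUX_CPP_SPEC is a made-up name for this sketch.  */
#define EXAMPLE_LINUX_CPP_SPEC \
  "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"

/* Each port's linux.h would then only refer to the shared definition
   instead of repeating the string.  */
#undef CPP_SPEC
#define CPP_SPEC EXAMPLE_LINUX_CPP_SPEC

/* Hypothetical accessor so the resulting spec string can be inspected.  */
const char *example_cpp_spec (void)
{
  return CPP_SPEC;
}
```

The benefit is the one implied by the question above: a port that forgets the definition can no longer silently diverge from the others, because the string exists in exactly one place.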
Re: [PATCH] Contribute FreeBSD unwind support (x86_64 and x86)
On 05/20/2015 01:49 PM, John Marino wrote: I have maintained unwind support for FreeBSD i386 and x86_64 in my gnat-aux repository for many years (I created it). I've always intended on contributing it back to GCC, but I never got around to proving it worked until now. Happens. I can't count how many things I've written but never finished for various reasons through the years. The version I've been using actually has two flavors: FreeBSD 8 and below and FreeBSD 9 and above. However, the last of the FreeBSD 8 releases reaches EOL at the end of June so the unwind support I've attached here drops the FreeBSD 8 variation for simplicity's sake. Seems reasonable. We're not nearly as aggressive at dropping dead code as we perhaps could/should be. Note that I provided a similar unwind support for DragonFly a few months ago. Please consider applying the attached patch to gcc trunk. (copy of patch found here: http://leaf.dragonflybsd.org/~marino/freebsd/freebsd-unwind-support.diff ) Thanks for pointing that out. It's a shame that the BSDs can't share this code, but such is life. Suggested text for libgcc/ChangeLog: 2015-05-XX John Marino gnu...@marino.st * config.host (i[34567]86-*-freebsd*, x86_64-*-freebsd*): Set md_unwind_header * config/i386/freebsd-unwind.h: New. Also please recall that my copyright assignment to FSF is in order! Thanks for pointing that out. Otherwise I'd probably have asked the redundant question :-) Installed on the trunk. Thanks, jeff
Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
On Wed, May 27, 2015 at 8:24 AM, Peter Bergner berg...@vnet.ibm.com wrote:

On Tue, 2015-05-26 at 16:40 -0500, Bill Schmidt wrote:

Ah, never mind. I guess I need to run automake first.

I ran the patch on powerpc64-linux (ie, Big Endian) both with and without --enable-default-pie. Both bootstraps completed with no errors and the build without --enable-default-pie regtested without any regressions. The --enable-default-pie regtesting shows massive failures that I have to look into. I haven't determined yet whether these are all -m32 FAILs or -m64 FAILs or both. I'll report back with more info after I dig into some of the failures.

Does --enable-default-pie work on powerpc64-linux? Do you get working PIE by default? Some GCC tests expect non-PIE. I fixed some of them:

commit 82923064d660e4183933b014ee3f645799a945b0
Author: hjl hjl@138bc75d-0d04-0410-961f-82ee72b054a4
Date: Thu Jan 15 16:33:37 2015 +

Ignore additional linker messages on Linux/x86 with PIE

g++.dg/other/anon5.C is expected to fail to link. On Linux/x86 with PIE and the new linker, there are additional messages from linker:

[hjl@gnu-tools-1 gcc]$ g++ -fPIE -pie /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/other/anon5.C
/tmp/ccwg53fj.o: In function `f()':
anon5.C:(.text+0x7): undefined reference to `(anonymous namespace)::c::t'
/usr/local/bin/ld: /tmp/ccwg53fj.o: relocation R_X86_64_PC32 against undefined symbol `_ZN12_GLOBAL__N_11c1tE' can not be used when making a shared object; recompile with -fPIC
/usr/local/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
[hjl@gnu-tools-1 gcc]$

This patch ignores additional linker messages on Linux/x86 with PIE.

* g++.dg/other/anon5.C: Ignore additional messages on Linux/x86 with PIE.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@219667 138bc75d-0d04-0410-961f-82ee72b054a4

-- 
H.J.
Re: [PATCH] Fix PR66142
Hi Richard,

On 26/05/15 14:54, Richard Biener wrote:

The following fixes the testcase in PR66142

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2015-05-26  Richard Biener  rguent...@suse.de

        PR tree-optimization/66142
        * tree-ssa-sccvn.c (vn_reference_lookup_3): Manually compare
        MEM_REFs for the same base address.

        * gcc.dg/tree-ssa/ssa-fre-44.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 223574)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -1894,7 +1894,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree
       size2 = lhs_ref.size;
       maxsize2 = lhs_ref.max_size;
       if (maxsize2 == -1
-          || (base != base2 && !operand_equal_p (base, base2, 0))
+          || (base != base2
+              && (TREE_CODE (base) != MEM_REF
+                  || TREE_CODE (base2) != MEM_REF
+                  || TREE_OPERAND (base, 0) != TREE_OPERAND (base2, 0)
+                  || !tree_int_cst_equal (TREE_OPERAND (base, 1),
+                                          TREE_OPERAND (base2, 1))))
           || offset2 > offset
           || offset2 + size2 < offset + maxsize)
         return (void *)-1;
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-44.c (working copy)
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-fre1" } */
+
+struct A { float x, y; };
+struct B { struct A u; };
+void bar (struct A *);
+
+float
+f1 (struct B *x, int y)
+{
+  struct A p;
+  p.x = 1.0f;
+  p.y = 2.0f;
+  struct A *q = &x[y].u;
+  *q = p;
+  float f = x[y].u.x + x[y].u.y;
+  bar (&p);
+  return f;
+}
+
+float
+f2 (struct B *x, int y)
+{
+  struct A p;
+  p.x = 1.0f;
+  p.y = 2.0f;
+  x[y].u = p;
+  float f = x[y].u.x + x[y].u.y;
+  bar (&p);
+  return f;
+}
+
+float
+f3 (struct B *x, int y)
+{
+  struct A p;
+  p.x = 1.0f;
+  p.y = 2.0f;
+  struct A *q = &x[y].u;
+  __builtin_memcpy (&q->x, &p.x, sizeof (float));
+  __builtin_memcpy (&q->y, &p.y, sizeof (float));
+  *q = p;
+  float f = x[y].u.x + x[y].u.y;
+  bar (&p);
+  return f;
+}
+
+float
+f4 (struct B *x, int y)
+{
+  struct A p;
+  p.x = 1.0f;
+  p.y = 2.0f;
+  __builtin_memcpy (&x[y].u.x, &p.x, sizeof (float));
+  __builtin_memcpy (&x[y].u.y, &p.y, sizeof (float));
+  float f = x[y].u.x + x[y].u.y;
+  bar (&p);
+  return f;
+}

I see this test failing on arm-none-eabi. In particular, the f4 dump is the only one that doesn't contain return 3.0. Instead it is:

f4 (struct B * x, int y)
{
  float f;
  struct A p;
  unsigned int y.3_5;
  unsigned int _6;
  struct B * _8;
  float * _9;
  float * _14;
  float _19;
  float _23;

  <bb 2>:
  p.x = 1.0e+0;
  p.y = 2.0e+0;
  y.3_5 = (unsigned int) y_4(D);
  _6 = y.3_5 * 8;
  _8 = x_7(D) + _6;
  _9 = &_8->u.x;
  __builtin_memcpy (_9, &p.x, 4);
  _14 = &_8->u.y;
  __builtin_memcpy (_14, &p.y, 4);
  _19 = _8->u.x;
  _23 = _8->u.y;
  f_24 = _19 + _23;
  bar (&p);
  p ={v} {CLOBBER};
  return f_24;
}

Thanks,
Kyrill

+
+/* { dg-final { scan-tree-dump-times "return 3.0" 4 "fre1" } } */
+/* { dg-final { cleanup-tree-dump "fre1" } } */
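
The comparison that the tree-ssa-sccvn.c hunk performs can be restated as a small, self-contained sketch. This is a toy model, not GCC code: struct toy_base and toy_same_base_p are invented names, with the pointer field standing in for operand 0 of a MEM_REF and the offset field for its constant operand 1. Two bases count as the same address if they are the identical node, or if both are MEM_REFs with the same pointer operand and equal constant offset.

```c
#include <stdbool.h>

/* Toy stand-in for a base expression; not the real tree node.  */
struct toy_base
{
  bool is_mem_ref;       /* models TREE_CODE (base) == MEM_REF */
  const void *pointer;   /* models TREE_OPERAND (base, 0) */
  long offset;           /* models the constant TREE_OPERAND (base, 1) */
};

/* True when BASE and BASE2 are known to denote the same base address,
   following the shape of the condition in the patch.  */
bool toy_same_base_p (const struct toy_base *base,
                      const struct toy_base *base2)
{
  if (base == base2)
    return true;
  return base->is_mem_ref
         && base2->is_mem_ref
         && base->pointer == base2->pointer
         && base->offset == base2->offset;
}
```

The lookup in the patch gives up (returns (void *)-1) exactly when this predicate would be false, so anything that is not provably the same base stays on the conservative path.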
Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
On Wed, 2015-05-27 at 08:36 -0700, H.J. Lu wrote:

On Wed, May 27, 2015 at 8:24 AM, Peter Bergner berg...@vnet.ibm.com wrote:

On Tue, 2015-05-26 at 16:40 -0500, Bill Schmidt wrote:

Ah, never mind. I guess I need to run automake first.

I ran the patch on powerpc64-linux (ie, Big Endian) both with and without --enable-default-pie. Both bootstraps completed with no errors and the build without --enable-default-pie regtested without any regressions. The --enable-default-pie regtesting shows massive failures that I have to look into. I haven't determined yet whether these are all -m32 FAILs or -m64 FAILs or both. I'll report back with more info after I dig into some of the failures.

Does --enable-default-pie work on powerpc64-linux? Do you get working PIE by default? Some GCC tests expect non-PIE. I fixed some of them:

I haven't looked into any of the failures yet. That said, powerpc64-linux is PIC by default, so I thought maybe PIE would just work. Maybe it does and it's just powerpc-linux tests that are failing (I run the testsuite with both -m32 and -m64). I won't know until I get some time to have a deeper look. That said, if there is something you know of that I should look for or at, I'd appreciate it.

Peter
Re: Do not compute alias sets for types that don't need them
On Tue, 26 May 2015, Jan Hubicka wrote:

Hi,

On Fri, 22 May 2015, Jan Hubicka wrote:

Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 223508)
+++ tree-streamer-out.c (working copy)
@@ -346,6 +346,7 @@ pack_ts_type_common_value_fields (struct
      alias-set zero to this type.  */
   bp_pack_var_len_int (bp, (TYPE_ALIAS_SET (expr) == 0
                             || (!in_lto_p
+                                && type_with_alias_set_p (expr)
                                 && get_alias_set (expr) == 0))
                            ? 0 : -1);

I find such interfaces very ugly. IOW, when it's always (or often) necessary to call check_foo_p() before foo() can be called then the checking should be part of foo() (and it should then return a conservative value, i.e. alias set 0), and that requirement not be imposed on the callers of foo(). I.e. why can't whatever checks you do in type_with_alias_set_p be included in get_alias_set?

Because of sanity checking: I want to make alias sets of those types undefined rather than having random values. The point is that using the alias set in alias oracle query is wrong.

You could have just returned 0 for the alias-set for !type_with_alias_set_p in get_alias_set. That avoids polluting the alias data structures and is neither random nor wrong.

Take the example of the bug in ipa-ICF. It is digging out completely random types from the IL and thinks it absolutely must compare alias sets of all of them (the bug obviously is that it really should compare only those that matter). It then throws a random incomplete type to get_alias_set and obtains 0, which will make it silently give up if the matching random type is complete. An ICE here is a friendly reminder to the author of the optimization pass that he is doing something fishy. It will also catch the cases where we throw a memory access of incomplete type into the function body by a frontend/middleend bug instead of just silently disabling optimization. I caught the Java interface glue issue with this. (still need to fix that)

Now pack_ts_type_common_value_fields and RTL generation are different from the usual use of the alias set oracle in the sense that they do compute unnecessary alias sets by design. They are not optimizations, they are IL stage transitions.

Honza
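
The two positions in this exchange can be put side by side in a small sketch. Everything here is a toy model with invented names (struct toy_type, toy_get_alias_set_conservative, toy_get_alias_set_checked); it is not the real get_alias_set, and abort () merely stands in for an internal compiler error.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a type node; not the real tree structure.  */
struct toy_type
{
  bool complete;    /* incomplete types have no meaningful alias set */
  int alias_set;    /* -1 until computed */
};

static int toy_next_alias_set = 1;

/* Models the predicate callers are asked to test first.  */
bool toy_type_with_alias_set_p (const struct toy_type *t)
{
  return t->complete;
}

static int toy_compute (struct toy_type *t)
{
  if (t->alias_set == -1)
    t->alias_set = toy_next_alias_set++;
  return t->alias_set;
}

/* Position 1: fold the check into the accessor and answer
   conservatively.  Alias set 0 conflicts with everything.  */
int toy_get_alias_set_conservative (struct toy_type *t)
{
  if (!toy_type_with_alias_set_p (t))
    return 0;
  return toy_compute (t);
}

/* Position 2: treat a query on such a type as a bug in the caller,
   so that a pass comparing unrelated types is noticed.  */
int toy_get_alias_set_checked (struct toy_type *t)
{
  if (!toy_type_with_alias_set_p (t))
    abort ();   /* stands in for an ICE */
  return toy_compute (t);
}
```

The conservative variant never fails but can hide a pass that asks about types it has no business comparing; the checked variant makes that mistake loud, at the cost of requiring callers at IL stage transitions to test the predicate first.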
Re: [Patch, fortran] PR66079 - [6 Regression] memory leak with source allocation in internal subprogram
On 27/05/2015 16:07, Andre Vehreschild wrote:

Hi Paul, hi Mikael,

about renaming the identifier emitted: I would like to keep it short. Remember, there is always a number attached to it, which makes it unique. Furthermore, alloc_source_tmp sounds unnecessarily long to me. It tastes like we do not trust the unique identifier mechanism established in gfortran. But that is just my personal taste.

Then let's go with source, which seems to get the majority of the votes. It remains an improvement over expr3 and atmp.

about missing expr->rank == 0) in the extended patch: I just wanted to present an idea here. The patch was not meant to be committed yet. I think it furthermore is just half of the rent (like we say in Germany). I think we can do better, when we also think about the preceding two if-blocks (the ones taking care about derived and class types). It should be possible to do something similar there. Furthermore, one could think about moving e3rhs for array valued objects, too. But then we should not move to the last element, but instead to the first element. Nevertheless in the array valued case one might end up still having to deallocate the components or e3rhs, when the object allocated is zero sized. I wonder whether the bother really pays. What do you think about it?

I don't want to review monster patches. ;-) More seriously, I think there are more important things than this, but the patch was there and seemed reasonable. One can add support for the other if-blocks. About the rest, I'm not sure I understand. Or rather, I'm sure I don't. Does it make a difference first or last element? What is so specific about array valued case? We can try to add support for this in more and more cases, but please let's not make the code impossible to understand.

Paul: I would recommend you commit with symbol rename, but without the move optimization. We can do that later.

Agreed.

Mikael
Re: [C++/66270] another may_alias crash
On 05/26/2015 03:00 PM, Nathan Sidwell wrote: Ok, so IIUC a canonical pointer to a may_alias type should have TRCAA but a canonical can_alias_all pointer to a non-may_alias type should not have TRCAA? (i.e. one where CAN_ALIAS_ALL was passed true). Or are you saying that no canonical pointers should have TRCAA? The former: A canonical pointer should have TRCAA if and only if the canonical referent is may_alias. Hmm, are you seeing a case where TYPE_CANONICAL (to_type) has the may_alias attribute? Yes. This occurs when the newly created TRCAA pointer is to a self-canonical type. Hmm, seems like that's another problem with your testcase: the canonical variant of __m256 should not have may_alias. But the canonical variant of a class or enum type could still have may_alias, so we still need to handle that case. The patch is OK. Jason
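
For readers less familiar with the attribute under discussion, here is a minimal sketch of what may_alias permits at the source level. It says nothing about how the front end sets TYPE_REF_CAN_ALIAS_ALL on canonical pointer types, which is the actual subject of the patch. It assumes GCC or a compiler that accepts the attribute, a 32-bit unsigned int and IEEE 754 single-precision float; toy_u32_alias and toy_float_bits are invented names.

```c
/* An integer type whose accesses are allowed to alias objects of any
   other type, in the same way accesses through char may.  */
typedef unsigned int __attribute__ ((__may_alias__)) toy_u32_alias;

/* Read the object representation of F through the may_alias type.
   Without the attribute this access would violate the type-based
   aliasing rules.  */
unsigned int toy_float_bits (float f)
{
  const toy_u32_alias *p = (const toy_u32_alias *) &f;
  return *p;
}
```

Under the stated assumptions toy_float_bits (1.0f) yields 0x3f800000; a pointer to such a type is the kind of pointer that needs the can-alias-all property discussed above.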
Re: Do less generous pointer globbing in alias.c
Hmm, what about

union t {int a; char b;};
int a;
union t *ptr=&a;
*ptr = ...

If we want to define this, aliasing_component_refs_p would IMO need to be symmetrized, too. I am happy leaving this undefined.

Globbing all pointers was soo simple... :)

Indeed, but too restrictive ;) The testcase above is not about globbing pointers, I do not think it is going to be handled in defined manner by mainline (or any release).

Note that we are in the middle-end here and have to find cross-language common grounds. People may experience regressions towards the previous globbing so I guess the question is which is the globbing we want to remove - that is, what makes the most difference in code-generation?

Yes, I expect to see some PRs with regressions towards the previous globbing. I think the globbing as proposed by my patch should be generous enough for common bugs in user code and it is quite easy to add new rules on demand. For high-level C++ code definitely the most important point is that you have many different class types and we care about differentiating these (struct *a wrt struct *b). We also want to make a difference between the vtbl pointer (that is, pointer to array of functions) and other stuff.

I think I will modify the patch the following way:

1) I will move the code adding subset to get_alias_set
2) I will add flag is_pointer to alias set datastructure
3) I will make alias_set_subset_of additionally consider every is_pointer set to be a subset of ptr_type_node's alias set.

This will fix the symmetry with void *a; variable and incompatible pointer write.

We need to do two things - arrange alias set to be subset of all pointers' alias sets and all their supersets, and force equivalence between pointer alias sets. While the first can also be done by means of a special flag contains_pointer, I think it is cleaner to keep the DAG represented explicitly. After all we do not have that many alias sets and the hash table lookups should be fast enough (we may special case lookup in hash of size 1)

Honza

Richard.

Honza
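
Step 3 of the plan can be sketched as follows. This is a toy model under stated assumptions, not the alias.c implementation: struct toy_alias_set and toy_alias_set_subset_of are invented names, the recorded-children lookup of the real function is left out, and void_ptr_set stands in for the alias set of ptr_type_node.

```c
#include <stdbool.h>

/* Toy alias set: just an id plus the proposed flag.  */
struct toy_alias_set
{
  int id;            /* 0 models the "aliases everything" set */
  bool is_pointer;   /* the flag proposed in step 2 */
};

/* True when SET1 is to be treated as a subset of SET2.  VOID_PTR_SET
   models the alias set of ptr_type_node.  A real implementation would
   also consult the children recorded for SET2; that part is omitted
   in this sketch.  */
bool toy_alias_set_subset_of (const struct toy_alias_set *set1,
                              const struct toy_alias_set *set2,
                              const struct toy_alias_set *void_ptr_set)
{
  if (set2->id == 0)
    return true;                 /* everything is a subset of set 0 */
  if (set1->id == set2->id)
    return true;
  if (set1->is_pointer && set2->id == void_ptr_set->id)
    return true;                 /* the additional rule from step 3 */
  return false;
}
```

With this rule a store through an int * pointer set is seen as a subset of the void * set, while two unrelated pointer sets, such as struct a * and struct b *, remain distinct, which is the differentiation the message asks for.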
Re: [PATCH/RFC] Make loop-header-copying more aggressive, rerun before tree-if-conversion
On 05/22/2015 09:42 AM, Alan Lawrence wrote:

This patch does so (and makes it slightly less conservative, to tackle the example above). I found I had to make this a separate pass, so that the phi nodes were cleaned up at the end of the pass before running tree_if_conversion. Also at this stage in the compiler (inside loop opts) it was not possible to run loop_optimizer_init+finalize, or other loop_optimizer data structures needed later would be deleted; hence, I have two nearly-but-not-quite-identical passes, the new ch_vect avoiding the init/finalize. I tried to tackle this with some C++ subclassing, which removes the duplication, but the result feels a little ugly; suggestions for any neater approach welcome.

What PHI node cleanup needs to be done? I don't doubt something's needed, but would like to understand the cleanup -- depending on what needs to be done, it may be the case that we can cleanup on-the-fly or it may point at a general issue we should be resolving prior to running tree_if_conversion.

This patch causes failure of the scan-tree-dump of dom2 in gcc.dg/ssa/pr21417.c. This looks for jump-threading to perform an optimization, but no longer finds the expected line in the log - as the loop-header-copying phase has already done an equivalent transformation *before* dom2. The final CFG is thus in the desired form, but I'm not sure how to determine this (scanning the CFG itself is very difficult, well beyond what we can do with regex, requiring looking at multiple lines and basic blocks). Can anyone advise? [The test issue can be worked around by preserving the old do_while_p logic for the first header-copying pass, and using the new logic only for the second, but this is more awkward inside the compiler, which feels wrong.]

Don't we have a flag to turn off loop header copying? If so, does adding that flag to the test fix it without resorting to something gross like preserving the old logic for the first pass and new logic for the second pass?

The refactoring to deal with being able to call into this without reinitializing the loop optimizer doesn't seem terrible to me. One could argue that the loop optimizer init bits could become a property and managed by the pass manager. I'm not sure that really simplifies anything though.

My biggest worry would be cases where data initialized by loop_optimizer_init gets invalidated by the header copying. Have you looked at all at that possibility? I don't have anything specific in mind to point you at -- just a general concern.

Besides the new vect-ifcvt-11.c, the testsuite actually has a couple of other examples where this patch enables (undesired!) vectorization. I've dealt with these, but for the record:

Presumably undesired is within the scope of the testsuite, not necessarily in terms of the code we generate for real user code :-)

Overall it doesn't look bad to me... Convince me it's safe WRT the loop_optimizer_init issue above and we'll have a clear path forward.

jeff
[PATCH] microblaze-linux: add missing cpp specs
Define CPP_SPEC for microblaze linux targets so that -posix and -pthread work like on all other linux targets.

2015-05-27  Mike Frysinger  vap...@gentoo.org

	* config/microblaze/linux.h (CPP_SPEC): Define.
---
 gcc/config/microblaze/linux.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
index a7faa7d..655a70f 100644
--- a/gcc/config/microblaze/linux.h
+++ b/gcc/config/microblaze/linux.h
@@ -22,6 +22,9 @@
 #undef TARGET_SUPPORTS_PIC
 #define TARGET_SUPPORTS_PIC 1
 
+#undef CPP_SPEC
+#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
+
 #undef TLS_NEEDS_GOT
 #define TLS_NEEDS_GOT 1
 
-- 
2.4.1
[PATCH] nios2-linux: add missing cpp specs
Define CPP_SPEC for nios2 linux targets so that -posix and -pthread work like on all other linux targets.

2015-05-27  Mike Frysinger  vap...@gentoo.org

	* config/nios2/linux.h (CPP_SPEC): Define.
---
 gcc/config/nios2/linux.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/nios2/linux.h b/gcc/config/nios2/linux.h
index 41cad94..f43f655 100644
--- a/gcc/config/nios2/linux.h
+++ b/gcc/config/nios2/linux.h
@@ -26,6 +26,9 @@
     } \
   while (0)
 
+#undef CPP_SPEC
+#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
+
 #define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-nios2.so.1"
 
 #undef LINK_SPEC
-- 
2.4.1
[PATCH] hppa-linux: add missing cpp specs
Define CPP_SPEC for parisc linux targets so that -posix and -pthread work like on all other linux targets.

2015-05-27  Mike Frysinger  vap...@gentoo.org

	* config/pa/pa-linux.h (CPP_SPEC): Define.
---
 gcc/config/pa/pa-linux.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/pa/pa-linux.h b/gcc/config/pa/pa-linux.h
index 396d321..f8da185 100644
--- a/gcc/config/pa/pa-linux.h
+++ b/gcc/config/pa/pa-linux.h
@@ -28,7 +28,7 @@ along with GCC; see the file COPYING3. If not see
   while (0)
 
 #undef CPP_SPEC
-#define CPP_SPEC "%{posix:-D_POSIX_SOURCE}"
+#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
 
 #undef ASM_SPEC
 #define ASM_SPEC \
-- 
2.4.1
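
For readers unfamiliar with spec strings: a `%{S:X}` entry substitutes X into the command line only when the switch -S was given to the driver. The effect of the CPP_SPEC above can be modelled roughly as follows. This is only an illustrative sketch; `expand_cpp_spec_sketch` is a made-up helper, not a function in the GCC driver, and it models just this one spec string rather than the general spec language.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the CPP_SPEC from the patches above:
//   "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
// Each %{S:X} contributes X only if switch S was passed to the driver.
std::vector<std::string>
expand_cpp_spec_sketch (const std::set<std::string> &switches)
{
  std::vector<std::string> cpp_args;
  if (switches.count ("posix"))
    cpp_args.push_back ("-D_POSIX_SOURCE");
  if (switches.count ("pthread"))
    cpp_args.push_back ("-D_REENTRANT");
  return cpp_args;
}
```

With the old pa-linux.h definition, the `pthread` branch was simply absent, so -pthread defined nothing for the preprocessor.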
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
On 05/21/2015 02:46 PM, Jiong Wang wrote:
> Thanks for these thoughts.
>
> I tried but still can't prove this transformation will not introduce extra pointer overflow even given it's reassociation with vfp, although my first impression is it will not introduce extra risk in real application.
>
> Have done a quick check on hppa's legitimize_address. I see for (plus sym_ref, const_int), if const_int is beyond +-4K, then that hook will force them into register, then (plus reg, reg) is always OK.

I'm virtually certain the PA's legitimize_address is not overflow safe. It was written long before we started worrying about overflows in address computations. It was mostly concerned with trying to generate good addressing modes without running afoul of the implicit space register selection issues.

A SYMBOL_REF is always a valid base register. However, as the comment in hppa_legitimize_address notes, we might be given a MEM for something like: x[n-10].

We don't want to rewrite that as (x-10) + n, even though doing so would be beneficial for LICM.

> So for target hooks, my understanding of your idea is something like:
>
> new hook targetm.pointer_arith_reassociate (), if return -1 then support full reassociation, 0 for limited, 1 for should not do any reassociation. the default version return -1 as most targets are OK to do reassociation given we can prove there is no introducing of overflow risk. While for target like HPPA, we should define this hook to return 0 for limited support.

Right. Rather than use magic constants, I'd suggest an enum for the tri-state: FULL_PTR_REASSOCIATION, PARTIAL_PTR_REASSOCIATION, NO_PTR_REASSOCIATION.

> Then, if targetm.pointer_arith_reassociate () return 1, we should further invoke the second hook targetm.limited_reassociate_p (rtx x), to check the reassociated rtx 'x' meets any restrictions, for example for HPPA, constants part shouldn't beyond +-4K.

Right.

Jeff
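
To make the two-hook idea concrete, here is a rough standalone sketch. Only the three enumerator names come from the message above; `target_desc_sketch`, `limited_reassociate_ok`, `may_reassociate` and the plain-integer offset (standing in for an rtx) are assumptions made for illustration, and the 4096 limit merely echoes the "+-4K" figure quoted for HPPA. Nothing here is an existing GCC interface.

```cpp
#include <cstdint>

// Tri-state suggested in the reply, replacing the -1/0/1 magic constants.
enum ptr_reassociation_kind
{
  FULL_PTR_REASSOCIATION,
  PARTIAL_PTR_REASSOCIATION,
  NO_PTR_REASSOCIATION
};

// Hypothetical stand-in for the target's answers to the two proposed hooks.
struct target_desc_sketch
{
  ptr_reassociation_kind kind;
  int64_t max_abs_offset;   // consulted only for PARTIAL_PTR_REASSOCIATION
};

// Stand-in for the proposed second hook: does the reassociated address,
// reduced here to its constant part, meet the target's restriction?
bool
limited_reassociate_ok (const target_desc_sketch &target, int64_t offset)
{
  return offset >= -target.max_abs_offset && offset <= target.max_abs_offset;
}

// Combined decision: the first hook picks the policy, the second hook is
// only queried when the policy is "partial".
bool
may_reassociate (const target_desc_sketch &target, int64_t offset)
{
  switch (target.kind)
    {
    case FULL_PTR_REASSOCIATION:
      return true;
    case PARTIAL_PTR_REASSOCIATION:
      return limited_reassociate_ok (target, offset);
    case NO_PTR_REASSOCIATION:
      return false;
    }
  return false;
}
```

Named enumerators also avoid the easy-to-confuse mapping of -1, 0 and 1 to the three policies.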
Re: PATCH to run autoconf tests with C++ compiler
On 05/27/2015 08:54 AM, Richard Biener wrote:
> On Wed, May 27, 2015 at 10:49 AM, Andreas Schwab sch...@suse.de wrote:
>> This breaks all checks for supported compiler options:
>>
>> configure:6382: checking whether gcc supports -Wnarrowing
>> configure:6399: gcc -c -Wnarrowing conftest.c >&5
>> cc1: error: unrecognized command line option -Wnarrowing
>> configure:6399: $? = 1
>> configure:6485: checking whether gcc supports -Wnarrowing
>> configure:6502: g++ -std=c++98 -c -g conftest.cpp >&5
>> configure:6502: $? = 0
>> configure:6511: result: yes
>
> And thus causes PR66304, bootstrap failure with host gcc 4.3 (at least).

Fixed thus:

commit 0af5fc110196c2e9421f65c48ac09391bce031e3
Author: Jason Merrill ja...@redhat.com
Date: Wed May 27 09:49:06 2015 -0400

	PR bootstrap/66304
config/
	* warnings.m4 (ACX_PROG_CXX_WARNING_OPTS)
	(ACX_PROG_CXX_WARNINGS_ARE_ERRORS)
	(ACX_PROG_CXX_WARNING_ALMOST_PEDANTIC): New.
	(ACX_PROG_CC_WARNING_OPTS, ACX_PROG_CC_WARNING_ALMOST_PEDANTIC)
	(ACX_PROG_CC_WARNINGS_ARE_ERRORS): Push into C language context.
gcc/
	* configure.ac: Use ACX_PROG_CXX_WARNING_OPTS,
	ACX_PROG_CXX_WARNING_ALMOST_PEDANTIC, and
	ACX_PROG_CXX_WARNINGS_ARE_ERRORS.
	* configure: Regenerate.

diff --git a/config/warnings.m4 b/config/warnings.m4
index b64b594..b5a149a 100644
--- a/config/warnings.m4
+++ b/config/warnings.m4
@@ -23,6 +23,7 @@
 # compiler accepts.
 AC_DEFUN([ACX_PROG_CC_WARNING_OPTS],
 [AC_REQUIRE([AC_PROG_CC])dnl
+AC_LANG_PUSH(C)
 m4_pushdef([acx_Var], [m4_default([$2], [WARN_CFLAGS])])dnl
 AC_SUBST(acx_Var)dnl
 m4_expand_once([acx_Var=
@@ -48,6 +49,7 @@ for real_option in $1; do
 done
 CFLAGS=$save_CFLAGS
 m4_popdef([acx_Var])dnl
+AC_LANG_POP(C)
 ])# ACX_PROG_CC_WARNING_OPTS
 
 # ACX_PROG_CC_WARNING_ALMOST_PEDANTIC(WARNINGS, [VARIABLE = WARN_PEDANTIC])
@@ -55,6 +57,7 @@ m4_popdef([acx_Var])dnl
 # and accepts all of those options simultaneously, otherwise to nothing.
 AC_DEFUN([ACX_PROG_CC_WARNING_ALMOST_PEDANTIC],
 [AC_REQUIRE([AC_PROG_CC])dnl
+AC_LANG_PUSH(C)
 m4_pushdef([acx_Var], [m4_default([$2], [WARN_PEDANTIC])])dnl
 AC_SUBST(acx_Var)dnl
 m4_expand_once([acx_Var=
@@ -77,6 +80,7 @@ AS_IF([test AS_VAR_GET(acx_Pedantic) = yes],
 AS_VAR_POPDEF([acx_Pedantic])dnl
 m4_popdef([acx_Woptions])dnl
 m4_popdef([acx_Var])dnl
+AC_LANG_POP(C)
 ])# ACX_PROG_CC_WARNING_ALMOST_PEDANTIC
 
 # ACX_PROG_CC_WARNINGS_ARE_ERRORS([x.y.z], [VARIABLE = WERROR])
@@ -88,6 +92,7 @@ m4_popdef([acx_Var])dnl
 # appeared on the configure command line.
 AC_DEFUN([ACX_PROG_CC_WARNINGS_ARE_ERRORS],
 [AC_REQUIRE([AC_PROG_CC])dnl
+AC_LANG_PUSH(C)
 m4_pushdef([acx_Var], [m4_default([$2], [WERROR])])dnl
 AC_SUBST(acx_Var)dnl
 m4_expand_once([acx_Var=
@@ -114,4 +119,109 @@ AS_IF([test $enable_werror_always = yes],
       [acx_Var=$acx_Var${acx_Var:+ }-Werror])
 AS_VAR_POPDEF([acx_GCCvers])])
 m4_popdef([acx_Var])dnl
+AC_LANG_POP(C)
 ])# ACX_PROG_CC_WARNINGS_ARE_ERRORS
+
+# ACX_PROG_CXX_WARNING_OPTS(WARNINGS, [VARIABLE = WARN_CFLAGS)
+# Sets @VARIABLE@ to the subset of the given options which the
+# compiler accepts.
+AC_DEFUN([ACX_PROG_CXX_WARNING_OPTS],
+[AC_REQUIRE([AC_PROG_CXX])dnl
+AC_LANG_PUSH(C++)
+m4_pushdef([acx_Var], [m4_default([$2], [WARN_CXXFLAGS])])dnl
+AC_SUBST(acx_Var)dnl
+m4_expand_once([acx_Var=
+],m4_quote(acx_Var=))dnl
+save_CXXFLAGS=$CXXFLAGS
+for real_option in $1; do
+  # Do the check with the no- prefix removed since gcc silently
+  # accepts any -Wno-* option on purpose
+  case $real_option in
+    -Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
+    *) option=$real_option ;;
+  esac
+  AS_VAR_PUSHDEF([acx_Woption], [acx_cv_prog_cc_warning_$option])
+  AC_CACHE_CHECK([whether $CXX supports $option], acx_Woption,
+    [CXXFLAGS=$option
+    AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],[])],
+      [AS_VAR_SET(acx_Woption, yes)],
+      [AS_VAR_SET(acx_Woption, no)])
+  ])
+  AS_IF([test AS_VAR_GET(acx_Woption) = yes],
+    [acx_Var=$acx_Var${acx_Var:+ }$real_option])
+  AS_VAR_POPDEF([acx_Woption])dnl
+done
+CXXFLAGS=$save_CXXFLAGS
+m4_popdef([acx_Var])dnl
+AC_LANG_POP(C++)
+])# ACX_PROG_CXX_WARNING_OPTS
+
+# ACX_PROG_CXX_WARNING_ALMOST_PEDANTIC(WARNINGS, [VARIABLE = WARN_PEDANTIC])
+# Append to VARIABLE -pedantic + the argument, if the compiler is G++
+# and accepts all of those options simultaneously, otherwise to nothing.
+AC_DEFUN([ACX_PROG_CXX_WARNING_ALMOST_PEDANTIC],
+[AC_REQUIRE([AC_PROG_CXX])dnl
+AC_LANG_PUSH(C++)
+m4_pushdef([acx_Var], [m4_default([$2], [WARN_PEDANTIC])])dnl
+AC_SUBST(acx_Var)dnl
+m4_expand_once([acx_Var=
+],m4_quote(acx_Var=))dnl
+# Do the check with the no- prefix removed from the warning options
+# since gcc silently accepts any -Wno-* option on purpose
+m4_pushdef([acx_Woptions], [m4_bpatsubst([$1], [-Wno-], [-W])])dnl
+AS_VAR_PUSHDEF([acx_Pedantic], [acx_cv_prog_cc_pedantic_]acx_Woptions)dnl
+AS_IF([test $GXX = yes],
+[AC_CACHE_CHECK([whether $CXX supports -pedantic ]acx_Woptions, acx_Pedantic,
+[save_CXXFLAGS=$CXXFLAGS
Re: [PATCH] microblaze-linux: add missing cpp specs
On 27 May 2015 18:03, Andreas Schwab wrote:
> Mike Frysinger vap...@gentoo.org writes:
>> diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
>> index a7faa7d..655a70f 100644
>> --- a/gcc/config/microblaze/linux.h
>> +++ b/gcc/config/microblaze/linux.h
>> @@ -22,6 +22,9 @@
>>  #undef TARGET_SUPPORTS_PIC
>>  #define TARGET_SUPPORTS_PIC 1
>>
>> +#undef CPP_SPEC
>> +#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"
>
> Should this be defined by a shared header?

i was going to poke that next, but i don't think fixing the few fringe arches should be predicated on cleaning up a mess that has been here for over a decade.
-mike
Re: [PATCH] Fixes combined gcc-binutils builds.
On 05/24/2015 01:56 PM, Michael Darling wrote:
> Combined builds have been broken for about 10 months, because some binutils configure.in files were renamed to configure.ac, but gcc's references to them were not updated.
>
> There is a corresponding patch submitted to binutils-gdb, which renames its few remaining configure.in files to configure.ac. Otherwise, fixing the gcc calls to binutils-gdb configure.* files would be more complicated.
>
> Also, some time ago, gcc renamed its configure.in files to configure.ac. Fixed a few remaining references to gcc configure.in files, such as in error messages and documentation.

Can you please send your patch as an attachment? Your mailer re-wrapped the long lines, making the patch impossible to apply and test.

Jeff
Re: [PATCH PR65447]Improve IV handling by grouping address type uses with same base and step
Hi Bin,

On 08/05/15 11:47, Bin Cheng wrote:
> Hi,
> GCC's IVO currently handles every IV use independently, which is not right by learning from cases reported in PR65447.
>
> The rationale is:
> 1) Lots of address type IVs refer to the same memory object, share similar base and have same step. We should handle these IVs as a group in order to maximize CSE opportunities, prefer reg+offset addressing mode.
> 2) GCC's IVO algorithm is expensive and only is run when candidate set is small enough. By grouping same family uses, we can decrease the number of both uses and candidates. Before this patch, number of candidates for PR65447 is too big to run expensive IVO algorithm, resulting in bad assembly code on targets like AArch64 and Mips.
> 3) Even for cases the assembly code isn't improved, we can still get compilation time benefit with this patch.
> 4) This is a prerequisite for enabling auto-increment support in IVO on AArch64.
>
> For now, this is only done to address type IVs, in the future I may extend it to general IVs too.
>
> For AArch64:
> Benchmarks 470.lbm/spec2k6 and 173.applu/spec2k are improved obviously by this patch. A couple of cases from spec2k/fp appear regressed. I looked into generated assembly code and can confirm the regression is false alarm except one case (189.lucas). For that case, I think it's another issue exposed by this patch (GCC failed to CSE candidate setup code, resulting in bloated loop header). Anyway, I also fine-tuned the patch to minimize the impact.
>
> For AArch32, this patch seems to be able to improve spec2kfp too, but I didn't look deep into it. I guess the reason is it can make life for auto-increment support in IVO better.
>
> One of defects of this patch is computation of max offset in compute_max_addr_offset is basically borrowed from get_address_cost. The comment says we should find a better way to compute all information. People also complained we need to refactor that part of code. I don't have good solution to that yet, though I did try best to keep compute_max_addr_offset simple.
>
> I believe this is a generally wanted change, bootstrap and test on x86_64 and AArch64, so is it ok?
>
> 2015-05-08  Bin Cheng  bin.ch...@arm.com
>
> 	PR tree-optimization/65447
> 	* tree-ssa-loop-ivopts.c (struct iv_use): New fields.
> 	(dump_use, dump_uses): Support to dump sub use.
> 	(record_use): New parameters to support sub use. Remove call to dump_use.
> 	(record_sub_use, record_group_use): New functions.
> 	(compute_max_addr_offset, split_all_small_groups): New functions.
> 	(group_address_uses, rewrite_use_address): New functions.
> 	(strip_offset): New declaration.
> 	(find_interesting_uses_address): Call record_group_use.
> 	(add_candidate): New assertion.
> 	(infinite_cost_p): Move definition forward.
> 	(add_costs): Check INFTY cost and return immediately.
> 	(get_computation_cost_at): Clear setup cost and dependent bitmap for sub uses.
> 	(determine_use_iv_cost_address): Compute cost for sub uses.
> 	(rewrite_use_address_1): Rename from old rewrite_use_address.
> 	(free_loop_data): Free sub uses.
> 	(tree_ssa_iv_optimize_loop): Call group_address_uses.
>
> gcc/testsuite/ChangeLog
> 2015-05-08  Bin Cheng  bin.ch...@arm.com
>
> 	PR tree-optimization/65447
> 	* gcc.dg/tree-ssa/pr65447.c: New test.

I see this test failing on arm-none-eabi with a compiler at r223737. My configure options are:

--enable-checking=yes --with-newlib --with-fpu=neon-fp-armv8 --with-arch=armv8-a --without-isl

Kyrill
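
As a rough picture of the grouping described in point 1) of the quoted rationale, the sketch below buckets address uses that share a base and a step and orders each bucket by constant offset, so one candidate can serve a whole bucket with reg+offset addressing. It is not the patch's code: `addr_use_sketch`, `group_key` and `group_address_uses_sketch` are invented names, and a string stands in for the base object where the real pass works on trees.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified view of an address-type IV use.
struct addr_use_sketch
{
  std::string base;   // stand-in for the (stripped) base object
  int64_t step;       // per-iteration increment
  int64_t offset;     // constant offset from the base
};

typedef std::pair<std::string, int64_t> group_key;

// Bucket uses by (base, step); within a bucket keep offsets sorted so the
// members can be rewritten relative to a single candidate.
std::map<group_key, std::vector<int64_t> >
group_address_uses_sketch (const std::vector<addr_use_sketch> &uses)
{
  std::map<group_key, std::vector<int64_t> > groups;
  for (const addr_use_sketch &use : uses)
    groups[group_key (use.base, use.step)].push_back (use.offset);
  for (auto &group : groups)
    std::sort (group.second.begin (), group.second.end ());
  return groups;
}
```

Fewer groups than uses is what shrinks the use and candidate sets mentioned in point 2); the real patch additionally bounds each group by the target's maximum address offset (compute_max_addr_offset), which this sketch leaves out.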
Re: [PATCH] microblaze-linux: add missing cpp specs
Mike Frysinger vap...@gentoo.org writes:

> diff --git a/gcc/config/microblaze/linux.h b/gcc/config/microblaze/linux.h
> index a7faa7d..655a70f 100644
> --- a/gcc/config/microblaze/linux.h
> +++ b/gcc/config/microblaze/linux.h
> @@ -22,6 +22,9 @@
>  #undef TARGET_SUPPORTS_PIC
>  #define TARGET_SUPPORTS_PIC 1
>
> +#undef CPP_SPEC
> +#define CPP_SPEC "%{posix:-D_POSIX_SOURCE} %{pthread:-D_REENTRANT}"

Should this be defined by a shared header?

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
And now for something completely different.
Re: Fwd: PING^3: [PATCH]: New configure options that make the compiler use -fPIE and -pie as default option
On Wed, May 27, 2015 at 9:05 AM, Peter Bergner berg...@vnet.ibm.com wrote:
> On Wed, 2015-05-27 at 08:36 -0700, H.J. Lu wrote:
>> On Wed, May 27, 2015 at 8:24 AM, Peter Bergner berg...@vnet.ibm.com wrote:
>>> On Tue, 2015-05-26 at 16:40 -0500, Bill Schmidt wrote:
>>>> Ah, never mind. I guess I need to run automake first.
>>>
>>> I ran the patch on powerpc64-linux (ie, Big Endian) both with and without --enable-default-pie. Both bootstraps completed with no errors and the without --enable-default-pie regtested without any regressions.
>>>
>>> The --enable-default-pie regtesting shows massive failures that I have to look into. I haven't determined yet whether these are all -m32 FAILs or -m64 FAILS or both. I'll report back with more info after I dig into some of the failures.
>>
>> Does --enable-default-pie work on powerpc64-linux? Do you get working PIE by default? Some GCC tests expect non-PIE. I fixed some of them:
>
> I haven't looked into any of the failures yet. That said, powerpc64-linux is PIC by default, so I thought maybe PIE

PIC != PIE. Is PIE the default for powerpc64-linux? Please show me

# readelf -h /bin/ls

on powerpc64-linux.

> would just work. Maybe it does and it's just powerpc-linux tests that are failing (I run the testsuite with both -m32 and -m64). I won't know until I get some time to have a deeper look. That said, if there is something you know of that I should look for or at, I'd appreciate it.

You should first verify if --enable-default-pie generates a GCC which can produce a simple hello program.

-- 
H.J.
[PATCH 17/35] Change use to type-based pool allocator in tree-ssa-math-opts.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* tree-ssa-math-opts.c (occ_new): Use new type-based pool allocator.
	(free_bb): Likewise.
	(pass_cse_reciprocals::execute): Likewise.
---
 gcc/tree-ssa-math-opts.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 98e2c49..0df755b 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -229,7 +229,7 @@ static struct
 static struct occurrence *occ_head;
 
 /* Allocation pool for getting instances of struct occurrence.  */
-static alloc_pool occ_pool;
+static pool_allocator<occurrence> *occ_pool;
 
@@ -240,7 +240,7 @@ occ_new (basic_block bb, struct occurrence *children)
 {
   struct occurrence *occ;
 
-  bb->aux = occ = (struct occurrence *) pool_alloc (occ_pool);
+  bb->aux = occ = occ_pool->allocate ();
   memset (occ, 0, sizeof (struct occurrence));
 
   occ->bb = bb;
@@ -468,7 +468,7 @@ free_bb (struct occurrence *occ)
   next = occ->next;
   child = occ->children;
   occ->bb->aux = NULL;
-  pool_free (occ_pool, occ);
+  occ_pool->remove (occ);
 
   /* Now ensure that we don't recurse unless it is necessary.  */
   if (!child)
@@ -572,9 +572,8 @@ pass_cse_reciprocals::execute (function *fun)
   basic_block bb;
   tree arg;
 
-  occ_pool = create_alloc_pool ("dominators for recip",
-				sizeof (struct occurrence),
-				n_basic_blocks_for_fn (fun) / 3 + 1);
+  occ_pool = new pool_allocator<occurrence>
+    ("dominators for recip", n_basic_blocks_for_fn (fun) / 3 + 1);
 
   memset (&reciprocal_stats, 0, sizeof (reciprocal_stats));
   calculate_dominance_info (CDI_DOMINATORS);
@@ -704,7 +703,7 @@ pass_cse_reciprocals::execute (function *fun)
 
   free_dominance_info (CDI_DOMINATORS);
   free_dominance_info (CDI_POST_DOMINATORS);
-  free_alloc_pool (occ_pool);
+  delete occ_pool;
   return 0;
 }
 
-- 
2.1.4
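
For readers who have not seen the allocator this series converts to, the toy below shows the general shape of a type-based pool: `allocate ()` hands back a correctly typed pointer without a cast, and `remove ()` returns the slot to a free list for reuse. This is a simplified assumption-laden sketch, not GCC's `pool_allocator`; `simple_pool`, `occurrence_like` and `block_count` are invented names, and the real class takes a block-size argument and allocates slots in batches.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Toy type-based pool (hypothetical; one heap block per slot for brevity).
template <typename T>
class simple_pool
{
public:
  explicit simple_pool (const char *name) : name_ (name), free_list_ (nullptr) {}
  simple_pool (const simple_pool &) = delete;
  simple_pool &operator= (const simple_pool &) = delete;

  ~simple_pool ()
  {
    for (void *block : blocks_)
      ::operator delete (block);
  }

  // Return raw, uninitialized storage for one T, reusing a freed slot if any.
  T *allocate ()
  {
    if (free_list_)
      {
	slot *s = free_list_;
	free_list_ = s->next;
	return reinterpret_cast<T *> (s);
      }
    void *raw = ::operator new (sizeof (slot));
    blocks_.push_back (raw);
    return static_cast<T *> (raw);
  }

  // Put the slot back on the free list; no destructor is run.
  void remove (T *object)
  {
    slot *s = reinterpret_cast<slot *> (object);
    s->next = free_list_;
    free_list_ = s;
  }

  std::size_t block_count () const { return blocks_.size (); }

private:
  // A slot is either a free-list link or storage for one T.
  union slot
  {
    slot *next;
    alignas (T) unsigned char storage[sizeof (T)];
  };

  const char *name_;
  slot *free_list_;
  std::vector<void *> blocks_;
};

// Hypothetical stand-in for struct occurrence.
struct occurrence_like
{
  int bb_index;
  occurrence_like *next;
};
```

Compared with the old `(struct occurrence *) pool_alloc (occ_pool)` call, the typed interface moves the cast and the element size into the template, which is the change the diff above makes at each call site.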
[PATCH 07/35] Change use to type-based pool allocator in var-tracking.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* var-tracking.c (variable_htab_free): Use new type-based pool allocator.
	(attrs_list_clear) Likewise.
	(attrs_list_insert) Likewise.
	(attrs_list_copy) Likewise.
	(shared_hash_unshare) Likewise.
	(shared_hash_destroy) Likewise.
	(unshare_variable) Likewise.
	(var_reg_delete_and_set) Likewise.
	(var_reg_delete) Likewise.
	(var_regno_delete) Likewise.
	(drop_overlapping_mem_locs) Likewise.
	(variable_union) Likewise.
	(insert_into_intersection) Likewise.
	(canonicalize_values_star) Likewise.
	(variable_merge_over_cur) Likewise.
	(dataflow_set_merge) Likewise.
	(remove_duplicate_values) Likewise.
	(variable_post_merge_new_vals) Likewise.
	(dataflow_set_preserve_mem_locs) Likewise.
	(dataflow_set_remove_mem_locs) Likewise.
	(variable_from_dropped) Likewise.
	(variable_was_changed) Likewise.
	(set_slot_part) Likewise.
	(clobber_slot_part) Likewise.
	(delete_slot_part) Likewise.
	(loc_exp_insert_dep) Likewise.
	(notify_dependents_of_changed_value) Likewise.
	(emit_notes_for_differences_1) Likewise.
	(vt_emit_notes) Likewise.
	(vt_initialize) Likewise.
	(vt_finalize) Likewise.
---
 gcc/var-tracking.c | 201
 1 file changed, 122 insertions(+), 79 deletions(-)

diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index 0db4358..f7afed1 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -282,6 +282,21 @@ typedef struct attrs_def
 
   /* Offset from start of DECL.  */
   HOST_WIDE_INT offset;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((attrs_def *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<attrs_def> pool;
 } *attrs;
 
 /* Structure for chaining the locations.  */
@@ -298,6 +313,21 @@ typedef struct location_chain_def
 
   /* Initialized? */
   enum var_init_status init;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((location_chain_def *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<location_chain_def> pool;
 } *location_chain;
 
 /* A vector of loc_exp_dep holds the active dependencies of a one-part
@@ -315,6 +345,21 @@ typedef struct loc_exp_dep_s
   /* A pointer to the pointer to this entry (head or prev's next) in
      the doubly-linked list.  */
   struct loc_exp_dep_s **pprev;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((loc_exp_dep_s *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<loc_exp_dep_s> pool;
 } loc_exp_dep;
 
 
@@ -554,6 +599,21 @@ typedef struct shared_hash_def
 
   /* Actual hash table.  */
   variable_table_type *htab;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((shared_hash_def *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<shared_hash_def> pool;
 } *shared_hash;
 
 /* Structure holding the IN or OUT set for a basic block.  */
@@ -598,22 +658,28 @@ typedef struct variable_tracking_info_def
 } *variable_tracking_info;
 
 /* Alloc pool for struct attrs_def.  */
-static alloc_pool attrs_pool;
+pool_allocator<attrs_def> attrs_def::pool ("attrs_def pool", 1024);
 
 /* Alloc pool for struct variable_def with MAX_VAR_PARTS entries.  */
-static alloc_pool var_pool;
+
+static pool_allocator<variable_def> var_pool
+  ("variable_def pool", 64,
+   (MAX_VAR_PARTS - 1) * sizeof (((variable)NULL)->var_part[0]));
 
 /* Alloc pool for struct variable_def with a single var_part entry.  */
-static alloc_pool valvar_pool;
+static pool_allocator<variable_def> valvar_pool
+  ("small variable_def pool", 256);
 
 /* Alloc pool for struct location_chain_def.  */
-static alloc_pool loc_chain_pool;
+pool_allocator<location_chain_def> location_chain_def::pool
+  ("location_chain_def pool", 1024);
 
 /* Alloc pool for struct shared_hash_def.  */
-static alloc_pool shared_hash_pool;
+pool_allocator<shared_hash_def> shared_hash_def::pool
+  ("shared_hash_def pool", 256);
 
 /* Alloc pool for struct loc_exp_dep_s for NOT_ONEPART variables.  */
-static alloc_pool loc_exp_dep_pool;
+pool_allocator<loc_exp_dep> loc_exp_dep::pool ("loc_exp_dep pool",
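
The recurring pattern in this patch, a struct that owns a static pool and routes `new`/`delete` through it, can be tried out in isolation with the sketch below. It is an illustration under assumptions rather than the GCC code: `tiny_pool` is a minimal stand-in for `pool_allocator`, `chain_node` is a made-up struct, and the operators are defined out of class here (the patch defines them inline) so that the pool type is only instantiated once the struct is complete.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Minimal stand-in for a type-based pool (hypothetical, not pool_allocator).
template <typename T>
class tiny_pool
{
public:
  tiny_pool () : free_list_ (nullptr) {}
  tiny_pool (const tiny_pool &) = delete;
  tiny_pool &operator= (const tiny_pool &) = delete;

  ~tiny_pool ()
  {
    for (void *block : blocks_)
      ::operator delete (block);
  }

  // Reuse a freed slot when one exists, otherwise get fresh storage.
  T *allocate ()
  {
    if (free_list_)
      {
	link *l = free_list_;
	free_list_ = l->next;
	return reinterpret_cast<T *> (l);
      }
    std::size_t size = sizeof (T) < sizeof (link) ? sizeof (link) : sizeof (T);
    void *raw = ::operator new (size);
    blocks_.push_back (raw);
    return static_cast<T *> (raw);
  }

  // Thread the freed slot onto the free list.
  void remove (T *object)
  {
    link *l = reinterpret_cast<link *> (object);
    l->next = free_list_;
    free_list_ = l;
  }

  std::size_t block_count () const { return blocks_.size (); }

private:
  struct link { link *next; };
  link *free_list_;
  std::vector<void *> blocks_;
};

// Hypothetical struct following the shape used in the patch.
struct chain_node
{
  int value;
  chain_node *next;

  /* Pool allocation new operator.  */
  static void *operator new (std::size_t);

  /* Delete operator utilizing pool allocation.  */
  static void operator delete (void *ptr);

  /* Memory allocation pool.  */
  static tiny_pool<chain_node> pool;
};

tiny_pool<chain_node> chain_node::pool;

void *
chain_node::operator new (std::size_t)
{
  return pool.allocate ();
}

void
chain_node::operator delete (void *ptr)
{
  pool.remove (static_cast<chain_node *> (ptr));
}
```

With this in place, ordinary `new chain_node` and `delete` expressions use the pool transparently, which is why the patch can drop the separate file-scope `alloc_pool` variables for these types.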
[PATCH 20/35] Change use to type-based pool allocator in ira-build.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* ira-build.c (initiate_cost_vectors): Use new type-based pool allocator.
	(ira_allocate_cost_vector): Likewise.
	(ira_free_cost_vector): Likewise.
	(finish_cost_vectors): Likewise.
---
 gcc/ira-build.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/gcc/ira-build.c b/gcc/ira-build.c
index 8b6b956..2de7d34 100644
--- a/gcc/ira-build.c
+++ b/gcc/ira-build.c
@@ -1633,7 +1633,7 @@ finish_copies (void)
 
 /* Pools for cost vectors.  It is defined only for allocno classes.  */
-static alloc_pool cost_vector_pool[N_REG_CLASSES];
+static pool_allocator<int> * cost_vector_pool[N_REG_CLASSES];
 
 /* The function initiates work with hard register cost vectors.  It
    creates allocation pool for each allocno class.  */
@@ -1646,10 +1646,9 @@ initiate_cost_vectors (void)
   for (i = 0; i < ira_allocno_classes_num; i++)
     {
       aclass = ira_allocno_classes[i];
-      cost_vector_pool[aclass]
-	= create_alloc_pool ("cost vectors",
-			     sizeof (int) * ira_class_hard_regs_num[aclass],
-			     100);
+      cost_vector_pool[aclass] = new pool_allocator<int>
+	("cost vectors", 100,
+	 sizeof (int) * (ira_class_hard_regs_num[aclass] - 1));
     }
 }
 
@@ -1657,7 +1656,7 @@ initiate_cost_vectors (void)
 int *
 ira_allocate_cost_vector (reg_class_t aclass)
 {
-  return (int *) pool_alloc (cost_vector_pool[(int) aclass]);
+  return cost_vector_pool[(int) aclass]->allocate ();
 }
 
 /* Free a cost vector VEC for ACLASS.  */
@@ -1665,7 +1664,7 @@ void
 ira_free_cost_vector (int *vec, reg_class_t aclass)
 {
   ira_assert (vec != NULL);
-  pool_free (cost_vector_pool[(int) aclass], vec);
+  cost_vector_pool[(int) aclass]->remove (vec);
 }
 
 /* Finish work with hard register cost vectors.  Release allocation
@@ -1679,7 +1678,7 @@ finish_cost_vectors (void)
   for (i = 0; i < ira_allocno_classes_num; i++)
     {
       aclass = ira_allocno_classes[i];
-      free_alloc_pool (cost_vector_pool[aclass]);
+      delete cost_vector_pool[aclass];
     }
 }
 
-- 
2.1.4
[PATCH 15/35] Change use to type-based pool allocator in dse.c.
gcc/ChangeLog:

2015-04-30  Martin Liska  mli...@suse.cz

	* dse.c (get_group_info): Use new type-based pool allocator.
	(dse_step0) Likewise.
	(free_store_info) Likewise.
	(delete_dead_store_insn) Likewise.
	(free_read_records) Likewise.
	(record_store) Likewise.
	(replace_read) Likewise.
	(check_mem_read_rtx) Likewise.
	(scan_insn) Likewise.
	(dse_step1) Likewise.
	(dse_step7) Likewise.
---
 gcc/dse.c | 201
 1 file changed, 129 insertions(+), 72 deletions(-)

diff --git a/gcc/dse.c b/gcc/dse.c
index b3b38d5..5ade9dd 100644
--- a/gcc/dse.c
+++ b/gcc/dse.c
@@ -249,7 +249,7 @@ static struct obstack dse_obstack;
 /* Scratch bitmap for cselib's cselib_expand_value_rtx.  */
 static bitmap scratch = NULL;
 
-struct insn_info;
+struct insn_info_type;
 
 /* This structure holds information about a candidate store.  */
 struct store_info
@@ -316,7 +316,7 @@ struct store_info
   /* Set if this store stores the same constant value as REDUNDANT_REASON
      insn stored.  These aren't eliminated early, because doing that
      might prevent the earlier larger store to be eliminated.  */
-  struct insn_info *redundant_reason;
+  struct insn_info_type *redundant_reason;
 };
 
 /* Return a bitmask with the first N low bits set.  */
@@ -329,12 +329,15 @@ lowpart_bitmask (int n)
 }
 
 typedef struct store_info *store_info_t;
-static alloc_pool cse_store_info_pool;
-static alloc_pool rtx_store_info_pool;
+static pool_allocator<store_info> cse_store_info_pool ("cse_store_info_pool",
+						       100);
+
+static pool_allocator<store_info> rtx_store_info_pool ("rtx_store_info_pool",
+						       100);
 
 /* This structure holds information about a load.  These are only
    built for rtx bases.  */
-struct read_info
+struct read_info_type
 {
   /* The id of the mem group of the base address.  */
   int group_id;
@@ -351,15 +354,30 @@ struct read_info
   rtx mem;
 
   /* The next read_info for this insn.  */
-  struct read_info *next;
+  struct read_info_type *next;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((read_info_type *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<read_info_type> pool;
 };
-typedef struct read_info *read_info_t;
-static alloc_pool read_info_pool;
+typedef struct read_info_type *read_info_t;
 
+pool_allocator<read_info_type> read_info_type::pool ("read_info_pool", 100);
 
 /* One of these records is created for each insn.  */
 
-struct insn_info
+struct insn_info_type
 {
   /* Set true if the insn contains a store but the insn itself cannot
      be deleted.  This is set if the insn is a parallel and there is
@@ -433,27 +451,41 @@ struct insn_info
   regset fixed_regs_live;
 
   /* The prev insn in the basic block.  */
-  struct insn_info * prev_insn;
+  struct insn_info_type * prev_insn;
 
   /* The linked list of insns that are in consideration for removal in
      the forwards pass through the basic block.  This pointer may be
      trash as it is not cleared when a wild read occurs.  The only
      time it is guaranteed to be correct is when the traversal starts
      at active_local_stores.  */
-  struct insn_info * next_local_store;
+  struct insn_info_type * next_local_store;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((insn_info_type *) ptr);
+  }
+
+  /* Memory allocation pool.  */
+  static pool_allocator<insn_info_type> pool;
 };
+typedef struct insn_info_type *insn_info_t;
 
-typedef struct insn_info *insn_info_t;
-static alloc_pool insn_info_pool;
+pool_allocator<insn_info_type> insn_info_type::pool ("insn_info_pool", 100);
 
 /* The linked list of stores that are under consideration in this
    basic block.  */
 static insn_info_t active_local_stores;
 static int active_local_stores_len;
 
-struct dse_bb_info
+struct dse_bb_info_type
 {
-
   /* Pointer to the insn info for the last insn in the block.  These
      are linked so this is how all of the insns are reached.  During
      scanning this is the current insn being scanned.  */
@@ -507,10 +539,25 @@ struct dse_bb_info
      to assure that shift and/or add sequences that are inserted do not
      accidentally clobber live hard regs.  */
   bitmap regs_live;
+
+  /* Pool allocation new operator.  */
+  inline void *operator new (size_t)
+  {
+    return pool.allocate ();
+  }
+
+  /* Delete operator utilizing pool allocation.  */
+  inline void operator delete (void *ptr)
+  {
+    pool.remove((dse_bb_info_type *) ptr);
+  }
+