Re: [PATCH GCC]Improve how we handle overflow for type conversion in scev/ivopts, part I
On Fri, May 22, 2015 at 7:45 PM, Richard Biener wrote: > On Wed, May 20, 2015 at 11:41 AM, Bin Cheng wrote: >> Hi, >> As we know, GCC is too conservative when checking overflow behavior in SCEV >> and loop related optimizers. Result is some variable can't be recognized as >> scalar evolution and thus optimizations are missed. To be specific, >> optimizers like ivopts and vectorizer are affected. >> This issue is more severe on 64 bit platforms, for example, PR62173 is >> failed on aarch64; scev-3.c and scev-4.c were marked as XFAIL on lp64 >> platforms. >> >> As the first part to improve overflow checking in GCC, this patch does below >> improvements: >> 1) Ideally, chrec_convert should be responsible to convert scev like >> "(type){base, step}" to scev like "{(type)base, (type)step} when the result >> scev doesn't overflow; chrec_convert_aggressive should do the conversion if >> the result scev could overflow/wrap. Unfortunately, current implementation >> may use chrec_convert_aggressive to return a scev that won't overflow. This >> is because of a) the static parameter "fold_conversions" for >> instantiate_scev_convert can only tracks whether chrec_convert_aggressive >> may be called, rather than if it does some overflow conversion or not; b) >> the implementation of instantiate_scev_convert sometimes shortcuts the call >> to chrec_convert and misses conversion opportunities. This patch improves >> this. >> 2) iv->no_overflow computed in simple_iv is too conservative. With 1) >> fixed, iv->no_overflow should reflects whether chrec_convert_aggressive does >> return an overflow scev. This patch improves this. >> 3) chrec_convert should be able to prove the resulting scev won't overflow >> with loop niter information. This patch doesn't finish this, but it >> factored a new interface out of scev_probably_wraps_p for future >> improvement. And that will be the part II patch. >> >> With the improvements in SCEV, this patch also improves optimizer(IVOPT) >> that uses scev information like below: >> For array reference in the form of arr[IV], GCC tries to derive new >> address iv {arr+iv.base, iv.step*elem_size} from IV. If IV overflow wrto a >> type that is narrower than address space, this derivation is not true >> because &arr[IV] isn't a scev. Root cause why scev-*.c are failed now is >> the overflow information of IV is too conservative. IVOPT has to be >> conservative to reject &arr[IV] as a scev. With more accurate overflow >> information, IVOPT can be improved too. So this patch fixes the mentioned >> long standing issues. >> >> Bootstrap and test on x86_64, x86 and aarch64. >> BTW, test gcc.target/i386/pr49781-1.c failed on x86_64, but I can confirmed >> it's not this patch's fault. >> >> So what's your opinion on this?. > > I maybe mixing things up but does > > +chrec_convert_aggressive (tree type, tree chrec, bool *fold_conversions) > { > ... > + if (evolution_function_is_affine_p (chrec)) > +{ > + tree base, step; > + struct loop *loop; > + > + loop = get_chrec_loop (chrec); > + base = CHREC_LEFT (chrec); > + step = CHREC_RIGHT (chrec); > + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) > + return build_polynomial_chrec (loop->num, base, step); > > ^^^ not forget to set *fold_conversions to true? Or we need to use > convert_affine_scev (..., false)? Nice catch. It's supposed to be called only if source scev has no overflow behavior introduced by previous call to chrec_convert_aggressive. In other words, it should be guarded by "!*fold_conversions" like below: + + if (!*fold_conversions && evolution_function_is_affine_p (chrec)) +{ + tree base, step; + struct loop *loop; + + loop = get_chrec_loop (chrec); + base = CHREC_LEFT (chrec); + step = CHREC_RIGHT (chrec); + if (convert_affine_scev (loop, type, &base, &step, NULL, true)) +return build_polynomial_chrec (loop->num, base, step); +} The scenario is rare that didn't exposed in either bootstrap or reg-test. Here is the updated patch without any other difference. Bootstrap and test on x86_64 & AArch64. Thanks, bin > > +} > > (bah, and the diff somehow messes up -p context :/ which is why I like > context diffs more) > > Other from the above the patch looks good to me. > > Thanks, > Richard. > >> Thanks, >> bin >> >> 2015-05-20 Bin Cheng >> >> PR tree-optimization/62173 >> * tree-ssa-loop-ivopts.c (struct iv): New field. Reorder fields. >> (alloc_iv, set_iv): New parameter. >> (determine_biv_step): Delete. >> (find_bivs): Inline original determine_biv_step. Pass new >> argument to set_iv. >> (idx_find_step): Use no_overflow information for conversion. >> * tree-scalar-evolution.c (analyze_scalar_evolution_in_loop): Let >> resolve_mixers handle folded_casts. >> (instantiate_scev_name): Change bool parameter to bool pointer. >>
Remove method_class_type
Hi, when I introduced method_class_type I was not aware that TYPE_METHOD_BASETYPE gives previsely the same information. Bootstrapped/regtested x86_64-linux, will commit it shortly. * ipa-utils.h (method_class_type): Remove. * cgraphunit.c (walk_polymorphic_call_targets): Use TYPE_METHOD_BASETYPE. * ipa-devirt.c (type_in_anonymous_namespace_p): Check that it is called on main variants only. (method_class_type): Remove. (update_type_inheritance_graph): Use TYPE_METHOD_BASETYPE. (build_type_inheritance_graph): Likewise. * ipa-icf.c (sem_function::equals_wpa): Likewise. * pa-polymorphic-call.c (decl_maybe_in_construction_p, check_stmt_for_type_change): Use TYPE_METHOD_BASETYPE. * ipa.c (walk_polymorphic_call_targets): Likewise. * tree.c (verify_type): Verify that TYPE_METHOD_BASETYPE is main variant. Index: gcc/cgraphunit.c === --- gcc/cgraphunit.c(revision 223628) +++ gcc/cgraphunit.c(working copy) @@ -866,9 +866,8 @@ (TREE_TYPE (targets[i]->decl)) == METHOD_TYPE && !type_in_anonymous_namespace_p - (method_class_type -(TREE_TYPE (targets[i]->decl - enqueue_node (targets[i]); + (TYPE_METHOD_BASETYPE (TREE_TYPE (targets[i]->decl + enqueue_node (targets[i]); } } Index: gcc/ipa-devirt.c === --- gcc/ipa-devirt.c(revision 223628) +++ gcc/ipa-devirt.c(working copy) @@ -2279,18 +2280,6 @@ } } -/* Given method type T, return type of class it belongs to. - Look up this pointer and get its type.*/ - -tree -method_class_type (const_tree t) -{ - tree first_parm_type = TREE_VALUE (TYPE_ARG_TYPES (t)); - gcc_assert (TREE_CODE (t) == METHOD_TYPE); - - return TREE_TYPE (first_parm_type); -} - /* Initialize IPA devirt and build inheritance tree graph. */ void @@ -2314,8 +2303,7 @@ if (is_a (n) && DECL_VIRTUAL_P (n->decl) && n->real_symbol_p ()) - get_odr_type (TYPE_MAIN_VARIANT (method_class_type (TREE_TYPE (n->decl))), - true); + get_odr_type (TYPE_METHOD_BASETYPE (TREE_TYPE (n->decl)), true); /* Look also for virtual tables of types that do not define any methods. @@ -3446,8 +3434,7 @@ if (DECL_VIRTUAL_P (n->decl) && !n->definition && n->real_symbol_p ()) - get_odr_type (method_class_type (TYPE_MAIN_VARIANT (TREE_TYPE (n->decl))), - true); + get_odr_type (TYPE_METHOD_BASETYPE (TREE_TYPE (n->decl)), true); timevar_pop (TV_IPA_INHERITANCE); } Index: gcc/ipa-icf.c === --- gcc/ipa-icf.c (revision 223628) +++ gcc/ipa-icf.c (working copy) @@ -663,8 +663,8 @@ if (TREE_CODE (TREE_TYPE (item->decl)) != METHOD_TYPE) return return_false_with_msg ("DECL_CXX_CONSTURCTOR type mismatch"); else if (!func_checker::compatible_polymorphic_types_p -(method_class_type (TREE_TYPE (decl)), - method_class_type (TREE_TYPE (item->decl)), false)) +(TYPE_METHOD_BASETYPE (TREE_TYPE (decl)), + TYPE_METHOD_BASETYPE (TREE_TYPE (item->decl)), false)) return return_false_with_msg ("ctor polymorphic type mismatch"); } @@ -753,8 +753,8 @@ if (TREE_CODE (TREE_TYPE (decl)) != TREE_CODE (TREE_TYPE (item->decl))) return return_false_with_msg ("METHOD_TYPE and FUNCTION_TYPE mismatch"); if (!func_checker::compatible_polymorphic_types_p - (method_class_type (TREE_TYPE (decl)), - method_class_type (TREE_TYPE (item->decl)), false)) + (TYPE_METHOD_BASETYPE (TREE_TYPE (decl)), + TYPE_METHOD_BASETYPE (TREE_TYPE (item->decl)), false)) return return_false_with_msg ("THIS pointer ODR type mismatch"); } @@ -2722,7 +2722,7 @@ { if (TREE_CODE (TREE_TYPE (m_items[i]->decl)) == METHOD_TYPE && contains_polymorphic_type_p - (method_class_type (TREE_TYPE (m_items[i]->decl))) + (TYPE_METHOD_BASETYPE (TREE_TYPE (m_items[i]->decl))) && (DECL_CXX_CONSTRUCTOR_P (m_items[i]->decl) || (static_cast (m_items[i])->param_used_p (0) && static_cast (m_items[i]) @@ -2729,7 +2729,7 @@ ->compare_polymorphic_p ( { tree class_type - = method_class_type (TREE_TYPE (m_items[i]->decl)); + = TYPE_METHOD_BASETYPE (TREE_TYPE (m_items[i]->decl)); inchash::hash hstate (m_items[i]->hash); if (TYPE_NAME (class_type) Index: gcc/ipa-polymorphic-call.c ===
Constify few functions in tree.c
Hi, this patch turns some paramters to const_tree. Bootstrapped/regtested ppc64-linux, comitted as obvious. Honza * tree.c (prototype_p, virtual_method_call_p, obj_type_ref_class, is_typedef_decl, typedef_variant_p): Constify. * tree.h (prototype_p, virtual_method_call_p, obj_type_ref_class, is_typedef_decl, typedef_variant_p): Constify. Index: tree.c === --- tree.c (revision 223625) +++ tree.c (working copy) @@ -11579,7 +11579,7 @@ /* Return true if TYPE has a prototype. */ bool -prototype_p (tree fntype) +prototype_p (const_tree fntype) { tree t; @@ -12005,7 +12005,7 @@ can't apply.) */ bool -virtual_method_call_p (tree target) +virtual_method_call_p (const_tree target) { if (TREE_CODE (target) != OBJ_TYPE_REF) return false; @@ -12026,7 +12026,7 @@ /* REF is OBJ_TYPE_REF, return the class the ref corresponds to. */ tree -obj_type_ref_class (tree ref) +obj_type_ref_class (const_tree ref) { gcc_checking_assert (TREE_CODE (ref) == OBJ_TYPE_REF); ref = TREE_TYPE (ref); @@ -12124,7 +12124,7 @@ /* Returns true if X is a typedef decl. */ bool -is_typedef_decl (tree x) +is_typedef_decl (const_tree x) { return (x && TREE_CODE (x) == TYPE_DECL && DECL_ORIGINAL_TYPE (x) != NULL_TREE); @@ -12133,7 +12133,7 @@ /* Returns true iff TYPE is a type variant created for a typedef. */ bool -typedef_variant_p (tree type) +typedef_variant_p (const_tree type) { return is_typedef_decl (TYPE_NAME (type)); } Index: tree.h === --- tree.h (revision 223625) +++ tree.h (working copy) @@ -4375,9 +4375,9 @@ extern tree create_artificial_label (location_t); extern const char *get_name (tree); extern bool stdarg_p (const_tree); -extern bool prototype_p (tree); -extern bool is_typedef_decl (tree x); -extern bool typedef_variant_p (tree); +extern bool prototype_p (const_tree); +extern bool is_typedef_decl (const_tree x); +extern bool typedef_variant_p (const_tree); extern bool auto_var_in_fn_p (const_tree, const_tree); extern tree build_low_bits_mask (tree, unsigned); extern bool tree_nop_conversion_p (const_tree, const_tree); @@ -4544,8 +4544,8 @@ extern location_t tree_nonartificial_location (tree); extern tree block_ultimate_origin (const_tree); extern tree get_binfo_at_offset (tree, HOST_WIDE_INT, tree); -extern bool virtual_method_call_p (tree); -extern tree obj_type_ref_class (tree ref); +extern bool virtual_method_call_p (const_tree); +extern tree obj_type_ref_class (const_tree ref); extern bool types_same_for_odr (const_tree type1, const_tree type2, bool strict=false); extern bool contains_bitfld_component_ref_p (const_tree);
[PATCH 6/7] add default for HAVE_store_multiple
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_store_multiple): New function. (HAVE_store_multiple): Add default value. * expr.c (move_block_from_reg): Adjust. --- gcc/ChangeLog | 6 ++ gcc/defaults.h | 10 ++ gcc/expr.c | 2 -- 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 5d609d4..99ee6dd 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_store_multiple): New function. + (HAVE_store_multiple): Add default value. + * expr.c (move_block_from_reg): Adjust. + +2015-05-23 Trevor Saunders + * defaults.h (gen_load_multiple): New function. (HAVE_load_multiple): Add default value. * expr.c (move_block_to_reg): Adjust. diff --git a/gcc/defaults.h b/gcc/defaults.h index ea5ff80..566841b 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1484,6 +1484,16 @@ gen_load_multiple (rtx, rtx, rtx) } #endif +#ifndef HAVE_store_multiple +#define HAVE_store_multiple 0 +static inline rtx +gen_store_multiple (rtx, rtx, rtx) +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/expr.c b/gcc/expr.c index c4b39f4..0dad737 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -1548,7 +1548,6 @@ move_block_from_reg (int regno, rtx x, int nregs) return; /* See if the machine can do this with a store multiple insn. */ -#ifdef HAVE_store_multiple if (HAVE_store_multiple) { rtx_insn *last = get_last_insn (); @@ -1562,7 +1561,6 @@ move_block_from_reg (int regno, rtx x, int nregs) else delete_insns_since (last); } -#endif for (i = 0; i < nregs; i++) { -- 2.4.0.78.g7c6ecbf
[PATCH 4/7] provide default for HAVE_mem_signal_fence
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_mem_signal_fence): New function. (HAVE_mem_signal_fence): Add default value. * optabs.c: Adjust. --- gcc/ChangeLog | 6 ++ gcc/defaults.h | 10 ++ gcc/optabs.c | 5 - 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index cd0358a..5e540b6 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_mem_signal_fence): New function. + (HAVE_mem_signal_fence): Add default value. + * optabs.c: Adjust. + +2015-05-23 Trevor Saunders + * defaults.h (gen_memory_barrier): New function. (HAVE_memory_barrier): Add default value. * optabs.c: Adjust. diff --git a/gcc/defaults.h b/gcc/defaults.h index a7455e5..50004d5 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1464,6 +1464,16 @@ gen_memory_barrier () } #endif +#ifndef HAVE_mem_signal_fence +#define HAVE_mem_signal_fence 0 +static inline rtx +gen_mem_signal_fence (rtx) +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/optabs.c b/gcc/optabs.c index d3c1d21..49e1c53 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -7608,11 +7608,6 @@ expand_mem_thread_fence (enum memmodel model) /* This routine will either emit the mem_signal_fence pattern or issue a sync_synchronize to generate a fence for memory model MEMMODEL. */ -#ifndef HAVE_mem_signal_fence -# define HAVE_mem_signal_fence 0 -# define gen_mem_signal_fence(x) (gcc_unreachable (), NULL_RTX) -#endif - void expand_mem_signal_fence (enum memmodel model) { -- 2.4.0.78.g7c6ecbf
[PATCH 5/7] add default for HAVE_load_multiple
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_load_multiple): New function. (HAVE_load_multiple): Add default value. * expr.c (move_block_to_reg): Adjust. --- gcc/ChangeLog | 6 ++ gcc/defaults.h | 10 ++ gcc/expr.c | 4 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 5e540b6..5d609d4 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_load_multiple): New function. + (HAVE_load_multiple): Add default value. + * expr.c (move_block_to_reg): Adjust. + +2015-05-23 Trevor Saunders + * defaults.h (gen_mem_signal_fence): New function. (HAVE_mem_signal_fence): Add default value. * optabs.c: Adjust. diff --git a/gcc/defaults.h b/gcc/defaults.h index 50004d5..ea5ff80 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1474,6 +1474,16 @@ gen_mem_signal_fence (rtx) } #endif +#ifndef HAVE_load_multiple +#define HAVE_load_multiple 0 +static inline rtx +gen_load_multiple (rtx, rtx, rtx) +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/expr.c b/gcc/expr.c index 3605e99..c4b39f4 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -1507,10 +1507,8 @@ void move_block_to_reg (int regno, rtx x, int nregs, machine_mode mode) { int i; -#ifdef HAVE_load_multiple rtx pat; rtx_insn *last; -#endif if (nregs == 0) return; @@ -1519,7 +1517,6 @@ move_block_to_reg (int regno, rtx x, int nregs, machine_mode mode) x = validize_mem (force_const_mem (mode, x)); /* See if the machine can do this with a load multiple insn. */ -#ifdef HAVE_load_multiple if (HAVE_load_multiple) { last = get_last_insn (); @@ -1533,7 +1530,6 @@ move_block_to_reg (int regno, rtx x, int nregs, machine_mode mode) else delete_insns_since (last); } -#endif for (i = 0; i < nregs; i++) emit_move_insn (gen_rtx_REG (word_mode, regno + i), -- 2.4.0.78.g7c6ecbf
[PATCH 7/7] add default for HAVE_tablejump
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_tablejump): New function. (HAVE_tablejump): Add default value. * expr.c: Adjust. * stmt.c: Likewise. --- gcc/ChangeLog | 7 +++ gcc/defaults.h | 10 ++ gcc/expr.c | 5 - gcc/stmt.c | 4 4 files changed, 17 insertions(+), 9 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 99ee6dd..864ce02 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,12 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_tablejump): New function. + (HAVE_tablejump): Add default value. + * expr.c: Adjust. + * stmt.c: Likewise. + +2015-05-23 Trevor Saunders + * defaults.h (gen_store_multiple): New function. (HAVE_store_multiple): Add default value. * expr.c (move_block_from_reg): Adjust. diff --git a/gcc/defaults.h b/gcc/defaults.h index 566841b..53d6682 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1494,6 +1494,16 @@ gen_store_multiple (rtx, rtx, rtx) } #endif +#ifndef HAVE_tablejump +#define HAVE_tablejump 0 +static inline rtx +gen_tablejump (rtx, rtx) +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/expr.c b/gcc/expr.c index 0dad737..dccaf8b 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -11226,11 +11226,6 @@ try_casesi (tree index_type, tree index_expr, tree minval, tree range, } /* Attempt to generate a tablejump instruction; same concept. */ -#ifndef HAVE_tablejump -#define HAVE_tablejump 0 -#define gen_tablejump(x, y) (0) -#endif - /* Subroutine of the next function. INDEX is the value being switched on, with the lowest value diff --git a/gcc/stmt.c b/gcc/stmt.c index 16a080a..303df72 100644 --- a/gcc/stmt.c +++ b/gcc/stmt.c @@ -796,10 +796,6 @@ dump_case_nodes (FILE *f, struct case_node *root, #define HAVE_casesi 0 #endif -#ifndef HAVE_tablejump -#define HAVE_tablejump 0 -#endif - /* Return the smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches. */ -- 2.4.0.78.g7c6ecbf
[PATCH 2/7] provide default for HAVE_mem_thread_fence
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_mem_thread_fence): New function. (HAVE_mem_thread_fence): Add default definition. * optabs.c: Adjust. --- gcc/ChangeLog | 6 ++ gcc/defaults.h | 10 ++ gcc/optabs.c | 4 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 360f013..2f40e8d 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_mem_thread_fence): New function. + (HAVE_mem_thread_fence): Add default definition. + * optabs.c: Adjust. + +2015-05-23 Trevor Saunders + * combine.c (find_split_point): Check the value of HAVE_lo_sum instead of if it is defined. (combine_simplify_rtx): Likewise. diff --git a/gcc/defaults.h b/gcc/defaults.h index e7bbcb8..72b290a 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1444,6 +1444,16 @@ gen_epilogue () } #endif +#ifndef HAVE_mem_thread_fence +#define HAVE_mem_thread_fence 0 +static inline rtx +gen_mem_thread_fence (rtx) +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/optabs.c b/gcc/optabs.c index 21150db..197e4ae 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -7589,10 +7589,6 @@ expand_asm_memory_barrier (void) /* This routine will either emit the mem_thread_fence pattern or issue a sync_synchronize to generate a fence for memory model MEMMODEL. */ -#ifndef HAVE_mem_thread_fence -# define HAVE_mem_thread_fence 0 -# define gen_mem_thread_fence(x) (gcc_unreachable (), NULL_RTX) -#endif #ifndef HAVE_memory_barrier # define HAVE_memory_barrier 0 # define gen_memory_barrier() (gcc_unreachable (), NULL_RTX) -- 2.4.0.78.g7c6ecbf
[PATCH 0/7] More ifdef reduction
From: Trevor Saunders Hi, yet more of the same. each individually bootstrapped + regtested on x86_64-linux-gnu, and made sure config-list.mk was fine at the end. I expect this stuff is all still preapproved so committing to trunk. Trev Trevor Saunders (7): always define HAVE_lo_sum provide default for HAVE_mem_thread_fence always define HAVE_memory_barrier provide default for HAVE_mem_signal_fence add default for HAVE_load_multiple add default for HAVE_store_multiple add default for HAVE_tablejump gcc/ChangeLog | 46 gcc/combine.c | 10 +++ gcc/config/darwin.c | 3 +-- gcc/defaults.h| 60 ++ gcc/expr.c| 11 gcc/genconfig.c | 2 ++ gcc/lra-constraints.c | 72 +-- gcc/optabs.c | 14 -- gcc/stmt.c| 4 --- 9 files changed, 148 insertions(+), 74 deletions(-) -- 2.4.0.78.g7c6ecbf
[PATCH 3/7] always define HAVE_memory_barrier
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * defaults.h (gen_memory_barrier): New function. (HAVE_memory_barrier): Add default value. * optabs.c: Adjust. --- gcc/ChangeLog | 6 ++ gcc/defaults.h | 10 ++ gcc/optabs.c | 5 - 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2f40e8d..cd0358a 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,5 +1,11 @@ 2015-05-23 Trevor Saunders + * defaults.h (gen_memory_barrier): New function. + (HAVE_memory_barrier): Add default value. + * optabs.c: Adjust. + +2015-05-23 Trevor Saunders + * defaults.h (gen_mem_thread_fence): New function. (HAVE_mem_thread_fence): Add default definition. * optabs.c: Adjust. diff --git a/gcc/defaults.h b/gcc/defaults.h index 72b290a..a7455e5 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -1454,6 +1454,16 @@ gen_mem_thread_fence (rtx) } #endif +#ifndef HAVE_memory_barrier +#define HAVE_memory_barrier 0 +static inline rtx +gen_memory_barrier () +{ + gcc_unreachable (); + return NULL; +} +#endif + #endif /* GCC_INSN_FLAGS_H */ #endif /* ! GCC_DEFAULTS_H */ diff --git a/gcc/optabs.c b/gcc/optabs.c index 197e4ae..d3c1d21 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -7589,11 +7589,6 @@ expand_asm_memory_barrier (void) /* This routine will either emit the mem_thread_fence pattern or issue a sync_synchronize to generate a fence for memory model MEMMODEL. */ -#ifndef HAVE_memory_barrier -# define HAVE_memory_barrier 0 -# define gen_memory_barrier() (gcc_unreachable (), NULL_RTX) -#endif - void expand_mem_thread_fence (enum memmodel model) { -- 2.4.0.78.g7c6ecbf
[PATCH 1/7] always define HAVE_lo_sum
From: Trevor Saunders gcc/ChangeLog: 2015-05-23 Trevor Saunders * combine.c (find_split_point): Check the value of HAVE_lo_sum instead of if it is defined. (combine_simplify_rtx): Likewise. * lra-constraints.c (process_address_1): Likewise. * config/darwin.c: Adjust. * genconfig.c (main): Always define HAVE_lo_sum. --- gcc/ChangeLog | 9 +++ gcc/combine.c | 10 +++ gcc/config/darwin.c | 3 +-- gcc/genconfig.c | 2 ++ gcc/lra-constraints.c | 72 +-- 5 files changed, 51 insertions(+), 45 deletions(-) diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 3ce1628..360f013 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,12 @@ +2015-05-23 Trevor Saunders + + * combine.c (find_split_point): Check the value of HAVE_lo_sum + instead of if it is defined. + (combine_simplify_rtx): Likewise. + * lra-constraints.c (process_address_1): Likewise. + * config/darwin.c: Adjust. + * genconfig.c (main): Always define HAVE_lo_sum. + 2015-05-23 Prathamesh Kulkarni * genmatch.c (parser::parse_operation): Reject expanding operator-list inside 'for'. diff --git a/gcc/combine.c b/gcc/combine.c index 0817af2..73d141e 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -4785,11 +4785,10 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src) return find_split_point (&SUBREG_REG (x), insn, false); case MEM: -#ifdef HAVE_lo_sum /* If we have (mem (const ..)) or (mem (symbol_ref ...)), split it using LO_SUM and HIGH. */ - if (GET_CODE (XEXP (x, 0)) == CONST - || GET_CODE (XEXP (x, 0)) == SYMBOL_REF) + if (HAVE_lo_sum && (GET_CODE (XEXP (x, 0)) == CONST + || GET_CODE (XEXP (x, 0)) == SYMBOL_REF)) { machine_mode address_mode = get_address_mode (x); @@ -4799,7 +4798,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src) XEXP (x, 0))); return &XEXP (XEXP (x, 0), 0); } -#endif /* If we have a PLUS whose second operand is a constant and the address is not valid, perhaps will can split it up using @@ -5857,16 +5855,14 @@ combine_simplify_rtx (rtx x, machine_mode op0_mode, int in_dest, SUBST (XEXP (x, 0), XEXP (XEXP (x, 0), 0)); break; -#ifdef HAVE_lo_sum case LO_SUM: /* Convert (lo_sum (high FOO) FOO) to FOO. This is necessary so we can add in an offset. find_split_point will split this address up again if it doesn't match. */ - if (GET_CODE (XEXP (x, 0)) == HIGH + if (HAVE_lo_sum && GET_CODE (XEXP (x, 0)) == HIGH && rtx_equal_p (XEXP (XEXP (x, 0), 0), XEXP (x, 1))) return XEXP (x, 1); break; -#endif case PLUS: /* (plus (xor (and (const_int pow2 - 1)) ) <-c>) diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c index 5ea7088..ea7eec1 100644 --- a/gcc/config/darwin.c +++ b/gcc/config/darwin.c @@ -149,8 +149,7 @@ int generating_for_darwin_version ; section * darwin_sections[NUM_DARWIN_SECTIONS]; /* While we transition to using in-tests instead of ifdef'd code. */ -#ifndef HAVE_lo_sum -#define HAVE_lo_sum 0 +#if !HAVE_lo_sum #define gen_macho_high(a,b) (a) #define gen_macho_low(a,b,c) (a) #endif diff --git a/gcc/genconfig.c b/gcc/genconfig.c index 7237dede..a0a834a 100644 --- a/gcc/genconfig.c +++ b/gcc/genconfig.c @@ -360,6 +360,8 @@ main (int argc, char **argv) if (have_lo_sum_flag) printf ("#define HAVE_lo_sum 1\n"); + else +printf ("#define HAVE_lo_sum 0\n"); if (have_rotate_flag) printf ("#define HAVE_rotate 1\n"); diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index c0f2995..a8d0820 100644 --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -2962,42 +2962,42 @@ process_address_1 (int nop, bool check_only_p, rtx addr = *ad.inner; new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "addr"); -#ifdef HAVE_lo_sum - { - rtx_insn *insn; - rtx_insn *last = get_last_insn (); - - /* addr => lo_sum (new_base, addr), case (2) above. */ - insn = emit_insn (gen_rtx_SET - (new_reg, - gen_rtx_HIGH (Pmode, copy_rtx (addr; - code = recog_memoized (insn); - if (code >= 0) - { - *ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr); - if (! valid_address_p (ad.mode, *ad.outer, ad.as)) - { - /* Try to put lo_sum into register. */ - insn = emit_insn (gen_rtx_SET - (new_reg, - gen_rtx_LO_SUM (Pmode, new_reg, addr))); - code = recog_memoized (insn); - if (code >= 0) - { -
[Patch, fortran] PR66257 [5/6 regression] elemental typebound calls rejected as actual argument
Hello, For PR 63727, a check was introduced, rejecting procedure pointer components used as actual arguments: foo(obj%proc_comp) but it had the side effect of also rejecting foo(obj%proc_comp(arg)) Fixed by the attached patch. Tested on x86_64-linux. OK for 6/5 ? Mikael 2015-05-23 Mikael Morin PR fortran/66257 * resolve.c (resolve_actual_arglist): Don't throw an error if the argument with procedure pointer component is not a variable. 2015-05-23 Mikael Morin PR fortran/66257 * typebound_call_27.f90: New file. diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index fbf260f..a46ab60 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -1981,7 +1981,8 @@ resolve_actual_arglist (gfc_actual_arglist *arg, procedure_type ptype, } comp = gfc_get_proc_ptr_comp(e); - if (comp && comp->attr.elemental) + if (e->expr_type == EXPR_VARIABLE + && comp && comp->attr.elemental) { gfc_error ("ELEMENTAL procedure pointer component %qs is not " "allowed as an actual argument at %L", comp->name, ! { dg-do compile } ! ! PR fortran/66257 ! Check that typebound function calls are accepted as actual argument. ! MODULE test_class IMPLICIT NONE PRIVATE PUBLIC:: test INTEGER, PARAMETER :: dp = SELECTED_REAL_KIND(15) TYPE test PRIVATE CONTAINS PRIVATE PROCEDURE, PUBLIC:: E PROCEDURE, PUBLIC:: Om END TYPE test CONTAINS ELEMENTAL FUNCTION E (self, a) IMPLICIT NONE CLASS(test), INTENT(IN):: self REAL(kind=dp), INTENT(IN):: a REAL(kind=dp):: E E = a END FUNCTION E ELEMENTAL FUNCTION Om (self, z) IMPLICIT NONE CLASS(test), INTENT(IN):: self REAL(kind=dp), INTENT(IN):: z REAL(kind=dp):: Om Om = self%E(self%E(z)) Om = log10(self%E(z)) END FUNCTION Om END MODULE test_class
Re: [match-and-simplify] reject expanding operator-list to implicit 'for'
On 20 May 2015 at 17:39, Prathamesh Kulkarni wrote: > On 20 May 2015 at 17:01, Richard Biener wrote: >> On Wed, 20 May 2015, Prathamesh Kulkarni wrote: >> >>> On 20 May 2015 at 16:17, Prathamesh Kulkarni >>> wrote: >>> > Hi, >>> > This patch rejects expanding operator-list to implicit 'for'. >>> On second thoughts, should we reject expansion of operator-list _only_ >>> if it's mixed with 'for' ? >> >> At least that, yes. >> >>> We could define multiple operator-lists in simplify to be the same as >>> enclosing the simplify in 'for' with number of iterators >>> equal to number of operator-lists. >>> So we could allow >>> (define_operator_list op1 ...) >>> (define_operator_list op2 ...) >>> >>> (simplify >>> (op1 (op2 ... ))) >>> >>> is equivalent to: >>> (for temp1 (op1) >>>temp2 (op2) >>> (simplify >>> (temp1 (temp2 ... >>> >>> I think we have patterns like these in match-builtin.pd in the >>> match-and-simplify branch >>> And reject mixing of 'for' and operator-lists. >>> Admittedly the implicit 'for' behavior is not obvious from the syntax -;( >> >> Hmm, indeed we have for example >> >> /* Optimize pow(1.0,y) = 1.0. */ >> (simplify >> (POW real_onep@0 @1) >> @0) >> >> and I remember wanting that implicit for to make those less ugly. >> >> So can you rework only rejecting it within for? > This patch rejects expanding operator-list inside 'for'. > OK for trunk after bootstrap+testing ? Bootstrap failed with -Werror=parantheses . Applied the attached patch (which just adds { } on if (p && p->is_oper_list)) Thanks, Prathamesh > > Thanks, > Prathamesh >> >> Thanks, >> Richard. >> >> >>> Thanks, >>> Prathamesh >>> > OK for trunk after bootstrap+testing ? >>> > >>> > Thanks, >>> > Prathamesh >>> >>> >> >> -- >> Richard Biener >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, >> Graham Norton, HRB 21284 (AG Nuernberg) Index: genmatch.c === --- genmatch.c (revision 223610) +++ genmatch.c (working copy) @@ -2913,7 +2913,12 @@ user_id *p = dyn_cast (op); if (p && p->is_oper_list) -record_operlist (id_tok->src_loc, p); +{ + if (active_fors.length() == 0) + record_operlist (id_tok->src_loc, p); + else + fatal_at (id_tok, "operator-list %s cannot be exapnded inside 'for'", id); +} return op; }
Re: [C++ PATCH] fix canonical type ICE
OK, thanks. Jason
Re: [Patch, fortran] PR66079 - [6 Regression] memory leak with source allocation in internal subprogram
Dear Andre, To answer your fist question - no, it doesn't. I was working with my laptop, which is over slow when it comes to updating. I should have realised that since you are working in this area that there might be a problem. I discovered it when I did an update on my workstation this afternoon. Of course the fixes are trivial. As to your second question about the chunk that avoids creating an expr3. Yes, I am aware of the consequences. I am perfectly happy to insert a TODO that saying that we should effect all the good things that happen when the standard assignment is used in another block of code. Alternatively, we could make a temporary variable to which the expr3 is assigned that is used latter in the assignment to the allocated variables. For the moment though, repeating evaluation of expressions seems like a small price to pay for avoiding memory leakage. Cheers Paul On 23 May 2015 at 19:52, Andre Vehreschild wrote: > Hi Paul, > > does this patch apply to current trunk cleanly? I get an issue with the last > hunk, because all of the prerequisites are gone since r223445. The string copy > is completely handled by the trans_assignment at the bottom of the if > (code->expr3) block. Therefore I doubt the patches last hunk is needed any > longer. > > Do you have an example why this hunk is needed? > > Index: gcc/fortran/trans-stmt.c > === > *** gcc/fortran/trans-stmt.c(revision 223233) > --- gcc/fortran/trans-stmt.c(working copy) > *** gfc_trans_allocate (gfc_code * code) > *** 5200,5206 > } > /* else expr3 = NULL_TREE set above. */ > } > ! else > { > /* In all other cases evaluate the expr3 and create a > temporary. */ > --- 5200,5207 > } > /* else expr3 = NULL_TREE set above. */ > } > ! else if (!(code->expr3->ts.type == BT_DERIVED > !&& code->expr3->ts.u.derived->attr.alloc_comp)) > { > /* In all other cases evaluate the expr3 and create a > temporary. */ > > When I get the code right, than all derived-typed source= expressions that > have > an allocatable component will not be prepared for copy to the allocated > object. > This also means, that functions returning an object of such a type are called > multiple times. Once for each object to allocate. Is this really desired? > > I am sorry, that I have to say that, but the check2305.diff file does not > bring > the testcase with it. > > Regards, > Andre > > > On Sat, 23 May 2015 14:48:53 +0200 > Paul Richard Thomas wrote: > >> Dear All, >> >> This patch started out fixing a single source of memory leak and then >> went on to fix various other issues that I found upon investigation. >> >> The fortran ChangeLog entry is sufficiently descripive that I do not >> think that there is a need to say more. >> >> Bootstrapped and regtested on x86_64/FC21 - OK for trunk? >> >> I am rather sure that some of the issues go further back than 6.0. I >> will investigate what should be fixed for 5.2. >> >> Cheers >> >> Paul >> >> 2015-05-23 Paul Thomas >> >> PR fortran/66079 >> * trans-expr.c (gfc_conv_procedure_call): Allocatable scalar >> function results must be freed and nullified after use. Create >> a temporary to hold the result to prevent duplicate calls. >> * trans-stmt.c (gfc_trans_allocate): Prevent memory leaks by >> not evaluating expr3 for scalar derived types with allocatable >> components. Fixed character length allocatable results and >> dummies need to be dereferenced. Also, if al_len is NULL use >> memsz for the string copy. >> >> 2015-05-23 Paul Thomas >> >> PR fortran/66079 >> * gfortran.dg/allocatable_scalar_13.f90: New test > > > -- > Andre Vehreschild * Email: vehre ad gmx dot de -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx
Re: [Patch, fortran] PR66079 - [6 Regression] memory leak with source allocation in internal subprogram
Hi Paul, does this patch apply to current trunk cleanly? I get an issue with the last hunk, because all of the prerequisites are gone since r223445. The string copy is completely handled by the trans_assignment at the bottom of the if (code->expr3) block. Therefore I doubt the patches last hunk is needed any longer. Do you have an example why this hunk is needed? Index: gcc/fortran/trans-stmt.c === *** gcc/fortran/trans-stmt.c(revision 223233) --- gcc/fortran/trans-stmt.c(working copy) *** gfc_trans_allocate (gfc_code * code) *** 5200,5206 } /* else expr3 = NULL_TREE set above. */ } ! else { /* In all other cases evaluate the expr3 and create a temporary. */ --- 5200,5207 } /* else expr3 = NULL_TREE set above. */ } ! else if (!(code->expr3->ts.type == BT_DERIVED !&& code->expr3->ts.u.derived->attr.alloc_comp)) { /* In all other cases evaluate the expr3 and create a temporary. */ When I get the code right, than all derived-typed source= expressions that have an allocatable component will not be prepared for copy to the allocated object. This also means, that functions returning an object of such a type are called multiple times. Once for each object to allocate. Is this really desired? I am sorry, that I have to say that, but the check2305.diff file does not bring the testcase with it. Regards, Andre On Sat, 23 May 2015 14:48:53 +0200 Paul Richard Thomas wrote: > Dear All, > > This patch started out fixing a single source of memory leak and then > went on to fix various other issues that I found upon investigation. > > The fortran ChangeLog entry is sufficiently descripive that I do not > think that there is a need to say more. > > Bootstrapped and regtested on x86_64/FC21 - OK for trunk? > > I am rather sure that some of the issues go further back than 6.0. I > will investigate what should be fixed for 5.2. > > Cheers > > Paul > > 2015-05-23 Paul Thomas > > PR fortran/66079 > * trans-expr.c (gfc_conv_procedure_call): Allocatable scalar > function results must be freed and nullified after use. Create > a temporary to hold the result to prevent duplicate calls. > * trans-stmt.c (gfc_trans_allocate): Prevent memory leaks by > not evaluating expr3 for scalar derived types with allocatable > components. Fixed character length allocatable results and > dummies need to be dereferenced. Also, if al_len is NULL use > memsz for the string copy. > > 2015-05-23 Paul Thomas > > PR fortran/66079 > * gfortran.dg/allocatable_scalar_13.f90: New test -- Andre Vehreschild * Email: vehre ad gmx dot de
Re: C/C++ PATCH to allow deprecating enum values (PR c/47043)
On 05/22/2015 06:19 PM, Mikhail Maltsev wrote: On 22.05.2015 12:10, Marek Polacek wrote: Thanks, applied. Here's the final version. By the way, we have a feature test macro, __cpp_attributes=200809 which can be used to determine, whether C++11 attribute syntax is supported by the compiler. I propose to add something similar for this extension (like __cpp_gnu_enum_attributes=... for C++ and __GCC_HAVE_ENUM_ATTRIBUTES for C). Thoughts? I think SG 10 is or did debate this a weekor two ago. I haven't heard. I had made some recommendations along the lines you describe. Another member had suggested just bumping the date on __cpp_attributes I don't know what/if they decided. Ed
Re: [PATCH] Don't combine param and return value copies
Hi Alan, On Sat, May 23, 2015 at 06:03:05PM +0930, Alan Modra wrote: > This stops combine messing with parameter and return value copies > from/to hard registers. Bootstrapped and regression tested > powerpc64le-linux, powerpc64-linux and x86_64-linux. In looking at a > number of different powerpc64le gcc/*.o files, I noticed a few code > generation improvements. There were cases where a register copy was > no longer needed, cmp used in place of mr., and rlwinm instead of > rldicl. x86_64 gcc/*.o showed no changes (apart from combine.o of > course), but should see a small improvement in compile time with this > change. Thanks. Did you see improvements around return as well, or mostly / only related to the function start? > The "clear next_use when !can_combine_p" change is to fix a non-bug. > Given > > 1) insn defining rn >... > 2) insn defining rn but !can_combine_p >... > 3) insn using rn > > then create_log_links might create a link from (3) to (1), I thought. > However, can_combine_p doesn't currently allow this to happen. > Obviously, any can_combine_p result depending on regno shouldn't give > a different result at (1) and (2), but there is also at test of > DF_REF_PRE_POST_MODIFY that can. The saving grace is that pre/post > modify insns also use the register, which means next_use[rn] will > point at (2), not (3), when (1) is processed. > > I came across this because at one stage I considered modifying > can_combine_p. Someone who does so in the future might trigger the > above problem, so I thought it worth posting the change. I agree it looks like a bug waiting to happen. Please post it as a separate patch though. > +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the first block BB > + that we don't want to combine with other instructions. */ > + > +static void > +twiddle_first_block (basic_block bb, basic_block to) I don't like this much. Messing with global state makes things harder to change later. If it is hard to come up with a good name for a factor, it usually means it is not such a good factor. You can do these inside can_combine_{def,use}_p as far as I can see? Probably need to give those an extra parameter then: for _def, a bool that says "don't allow moves from hard regs", set when the scan has encountered a NOTE_INSN_FUNCTION_BEG in this bb; and for _use, a regset of those hard regs we don't want to allow moves to (those seen in USEs later in that same block). Or do it in the main loop itself, _{def,use} are each called in one place only; whatever works best. What makes return values different from CALL args here, btw? Why do you not want to combine return values, but you leave alone call args? > +/* Fill in log links field for all insns that we wish to combine. */ Please don't change this comment; it was more correct before. > @@ -1103,7 +1192,7 @@ create_log_links (void) >we might wind up changing the semantics of the insn, >even if reload can make what appear to be valid >assignments later. */ > - if (regno < FIRST_PSEUDO_REGISTER > + if (HARD_REGISTER_NUM_P (regno) > && asm_noperands (PATTERN (use_insn)) >= 0) > continue; An independent change. Segher
[Patch, fortran] PR66079 - [6 Regression] memory leak with source allocation in internal subprogram
Dear All, This patch started out fixing a single source of memory leak and then went on to fix various other issues that I found upon investigation. The fortran ChangeLog entry is sufficiently descripive that I do not think that there is a need to say more. Bootstrapped and regtested on x86_64/FC21 - OK for trunk? I am rather sure that some of the issues go further back than 6.0. I will investigate what should be fixed for 5.2. Cheers Paul 2015-05-23 Paul Thomas PR fortran/66079 * trans-expr.c (gfc_conv_procedure_call): Allocatable scalar function results must be freed and nullified after use. Create a temporary to hold the result to prevent duplicate calls. * trans-stmt.c (gfc_trans_allocate): Prevent memory leaks by not evaluating expr3 for scalar derived types with allocatable components. Fixed character length allocatable results and dummies need to be dereferenced. Also, if al_len is NULL use memsz for the string copy. 2015-05-23 Paul Thomas PR fortran/66079 * gfortran.dg/allocatable_scalar_13.f90: New test Index: gcc/fortran/trans-expr.c === *** gcc/fortran/trans-expr.c(revision 223234) --- gcc/fortran/trans-expr.c(working copy) *** gfc_conv_procedure_call (gfc_se * se, gf *** 5877,5882 --- 5877,5896 fntype = TREE_TYPE (TREE_TYPE (se->expr)); se->expr = build_call_vec (TREE_TYPE (fntype), se->expr, arglist); + /* Allocatable scalar function results must be freed and nullified + after use. This necessitates the creation of a temporary to + hold the result to prevent duplicate calls. */ + if (!byref && sym->ts.type != BT_CHARACTER + && sym->attr.allocatable && !sym->attr.dimension) + { + tmp = gfc_create_var (TREE_TYPE (se->expr), NULL); + gfc_add_modify (&se->pre, tmp, se->expr); + se->expr = tmp; + tmp = gfc_call_free (tmp); + gfc_add_expr_to_block (&post, tmp); + gfc_add_modify (&post, se->expr, build_int_cst (TREE_TYPE (se->expr), 0)); + } + /* If we have a pointer function, but we don't want a pointer, e.g. something like x = f() Index: gcc/fortran/trans-stmt.c === *** gcc/fortran/trans-stmt.c(revision 223233) --- gcc/fortran/trans-stmt.c(working copy) *** gfc_trans_allocate (gfc_code * code) *** 5200,5206 } /* else expr3 = NULL_TREE set above. */ } ! else { /* In all other cases evaluate the expr3 and create a temporary. */ --- 5200,5207 } /* else expr3 = NULL_TREE set above. */ } ! else if (!(code->expr3->ts.type == BT_DERIVED !&& code->expr3->ts.u.derived->attr.alloc_comp)) { /* In all other cases evaluate the expr3 and create a temporary. */ *** gfc_trans_allocate (gfc_code * code) *** 5626,5631 --- 5627,5633 fold_convert (TREE_TYPE (al_len), integer_zero_node)); } + if (code->expr3 && !code->expr3->mold) { /* Initialization via SOURCE block *** gfc_trans_allocate (gfc_code * code) *** 5650,5655 --- 5652,5669 se.expr : build_fold_indirect_ref_loc (input_location, se.expr); + + /* Fixed length allocatable results and dummies need further +dereferencing. */ + if (!expr->ts.deferred + && TREE_CODE (se.expr) == PARM_DECL) + tmp = build_fold_indirect_ref_loc (input_location, tmp); + + /* Fixed length allocations need this because al_len is +never set. */ + if (al_len == NULL_TREE) + al_len = memsz; + gfc_trans_string_copy (&block, al_len, tmp, code->expr3->ts.kind, expr3_len, expr3,
Re: PR fortran/44054 Convert all gfc_error_1 calls to gfc_error
Le 23/05/2015 01:04, Manuel López-Ibáñez a écrit : > PING: https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01511.html > > This only needs approval from Fortran maintainers. > I approved: https://gcc.gnu.org/ml/fortran/2015-05/msg00135.html But it seems I replied to the fortran list only. Mikael
Re: [patch] libstdc++/64657 support iterators with overloaded comma operator
On 23/05/2015 13:19, François Dumont wrote: On 29/04/2015 17:22, Jonathan Wakely wrote: I think this covers all the places in the library where we do: ++i, ++j Tested powerpc64le-linux, committed to trunk. I just committed 2 missing places in debug mode. 2015-05-23 François Dumont fdum...@gcc.gnu.org> PR libstdc++/64657 * include/debug/functions.h (__check_sorted_aux): Cast expression to void. Tested linux x86_64 debug mode in C++11. François With the patch. Index: include/debug/functions.h === --- include/debug/functions.h (revision 223604) +++ include/debug/functions.h (working copy) @@ -336,7 +336,7 @@ return true; _ForwardIterator __next = __first; - for (++__next; __next != __last; __first = __next, ++__next) + for (++__next; __next != __last; __first = __next, (void)++__next) if (*__next < *__first) return false; @@ -362,7 +362,7 @@ return true; _ForwardIterator __next = __first; - for (++__next; __next != __last; __first = __next, ++__next) + for (++__next; __next != __last; __first = __next, (void)++__next) if (__pred(*__next, *__first)) return false;
Re: [patch] libstdc++/64657 support iterators with overloaded comma operator
On 29/04/2015 17:22, Jonathan Wakely wrote: I think this covers all the places in the library where we do: ++i, ++j Tested powerpc64le-linux, committed to trunk. I just committed 2 missing places in debug mode. 2015-05-23 François Dumont fdum...@gcc.gnu.org> PR libstdc++/64657 * include/debug/functions.h (__check_sorted_aux): Cast expression to void. Tested linux x86_64 debug mode in C++11. François
add names for constants in line-map.c
This patch adds descriptive names to various constants in line-map.c. There were some differences between the constants used a various places, but my understanding is that these differences are arbitrary and it is easier to understand the code if they are harmonized. Bootstrapped and regression tested on x86-linux-gnu. OK? libcpp/ChangeLog: 2015-05-23 Manuel López-Ibáñez * line-map.c (LINE_MAP_MAX_COLUMN_NUMBER LINE_MAP_MAX_LOCATION_WITH_COLS,LINE_MAP_MAX_SOURCE_LOCATION): New constants. (linemap_line_start): Use them. (linemap_position_for_column): Use them. Index: libcpp/line-map.c === --- libcpp/line-map.c (revision 223596) +++ libcpp/line-map.c (working copy) @@ -24,10 +24,22 @@ along with this program; see the file CO #include "line-map.h" #include "cpplib.h" #include "internal.h" #include "hashtab.h" +/* Do not track column numbers higher than this one. As a result, the + range of column_bits is [7, 18] (or 0 if column numbers are + disabled). */ +const unsigned int LINE_MAP_MAX_COLUMN_NUMBER = (1U << 17); + +/* Do not track column numbers if locations get higher than this. */ +const source_location LINE_MAP_MAX_LOCATION_WITH_COLS = 0x6000; + +/* Highest possible source location encoded within an ordinary or + macro map. */ +const source_location LINE_MAP_MAX_SOURCE_LOCATION = 0x7000; + static void trace_include (const struct line_maps *, const line_map_ordinary *); static const line_map_ordinary * linemap_ordinary_map_lookup (struct line_maps *, source_location); static const line_map_macro* linemap_macro_map_lookup (struct line_maps *, source_location); @@ -542,26 +554,27 @@ linemap_line_start (struct line_maps *se || (line_delta > 10 && line_delta * ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) > 1000) || (max_column_hint >= (1U << ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map))) || (max_column_hint <= 80 && ORDINARY_MAP_NUMBER_OF_COLUMN_BITS (map) >= 10) - || (highest > 0x6000 - && (set->max_column_hint || highest > 0x7000))) + || (highest > LINE_MAP_MAX_LOCATION_WITH_COLS + && (set->max_column_hint || highest >= LINE_MAP_MAX_SOURCE_LOCATION))) add_map = true; else max_column_hint = set->max_column_hint; if (add_map) { int column_bits; - if (max_column_hint > 10 || highest > 0x6000) + if (max_column_hint > LINE_MAP_MAX_COLUMN_NUMBER + || highest > LINE_MAP_MAX_LOCATION_WITH_COLS) { /* If the column number is ridiculous or we've allocated a huge number of source_locations, give up on column numbers. */ max_column_hint = 0; - if (highest > 0x7000) - return 0; column_bits = 0; + if (highest > LINE_MAP_MAX_SOURCE_LOCATION) + return 0; } else { column_bits = 7; while (max_column_hint >= (1U << column_bits)) @@ -613,11 +626,12 @@ linemap_position_for_column (struct line linemap_assert (!linemap_macro_expansion_map_p (LINEMAPS_LAST_ORDINARY_MAP (set))); if (to_column >= set->max_column_hint) { - if (r >= 0xC00 || to_column > 10) + if (r > LINE_MAP_MAX_LOCATION_WITH_COLS + || to_column > LINE_MAP_MAX_COLUMN_NUMBER) { /* Running low on source_locations - disable column numbers. */ return r; } else
Re: [PATCH] Don't combine param and return value copies
On Sat, May 23, 2015 at 1:33 AM, Alan Modra wrote: > This stops combine messing with parameter and return value copies > from/to hard registers. Bootstrapped and regression tested > powerpc64le-linux, powerpc64-linux and x86_64-linux. In looking at a > number of different powerpc64le gcc/*.o files, I noticed a few code > generation improvements. There were cases where a register copy was > no longer needed, cmp used in place of mr., and rlwinm instead of > rldicl. x86_64 gcc/*.o showed no changes (apart from combine.o of > course), but should see a small improvement in compile time with this > change. I noticed this a while back but I never got around to fixing this. This should also improve other RISC targets like AARCH64 and MIPS where I had saw issues like this too. Thanks, Andrew > > The "clear next_use when !can_combine_p" change is to fix a non-bug. > Given > > 1) insn defining rn >... > 2) insn defining rn but !can_combine_p >... > 3) insn using rn > > then create_log_links might create a link from (3) to (1), I thought. > However, can_combine_p doesn't currently allow this to happen. > Obviously, any can_combine_p result depending on regno shouldn't give > a different result at (1) and (2), but there is also at test of > DF_REF_PRE_POST_MODIFY that can. The saving grace is that pre/post > modify insns also use the register, which means next_use[rn] will > point at (2), not (3), when (1) is processed. > > I came across this because at one stage I considered modifying > can_combine_p. Someone who does so in the future might trigger the > above problem, so I thought it worth posting the change. > > OK for mainline? > > * combine.c (set_return_regs): New function. > (twiddle_first_block, twiddle_last_block): New functions. > (create_log_links): Exclude instructions copying parameter > values from hard regs to pseudos, and instructions copying > return value pseudos to hard regs. Clear next_use when > !can_combine_p. > > Index: gcc/combine.c > === > --- gcc/combine.c (revision 223463) > +++ gcc/combine.c (working copy) > @@ -1048,9 +1048,79 @@ can_combine_use_p (df_ref use) >return true; > } > > -/* Fill in log links field for all insns. */ > +/* Used to build set of return value regs. Add X to the set. */ > > static void > +set_return_regs (rtx x, void *arg) > +{ > + HARD_REG_SET *regs = (HARD_REG_SET *) arg; > + > + add_to_hard_reg_set (regs, GET_MODE (x), REGNO (x)); > +} > + > +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the first block BB > + that we don't want to combine with other instructions. */ > + > +static void > +twiddle_first_block (basic_block bb, basic_block to) > +{ > + rtx_insn *insn; > + > + FOR_BB_INSNS (bb, insn) > +{ > + if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG) > + break; > + if (!NONDEBUG_INSN_P (insn)) > + continue; > + > + /* reg,reg copies from parameter hard regs. */ > + rtx set = single_set (insn); > + if (set > + && REG_P (SET_DEST (set)) > + && REG_P (SET_SRC (set)) > + && HARD_REGISTER_P (SET_SRC (set))) > + set_block_for_insn (insn, to); > +} > +} > + > +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the last block BB > + that we don't want to combine with other instructions. */ > + > +static void > +twiddle_last_block (basic_block bb, basic_block to, HARD_REG_SET return_regs) > +{ > + rtx_insn *insn; > + > + FOR_BB_INSNS_REVERSE (bb, insn) > +{ > + if (CALL_P (insn)) > + break; > + if (!NONDEBUG_INSN_P (insn)) > + continue; > + > + rtx reg = NULL_RTX; > + /* use_return_regster added USEs. */ > + if (GET_CODE (PATTERN (insn)) == USE) > + reg = XEXP (PATTERN (insn), 0); > + else > + { > + /* reg,reg copies that set return value hard regs. */ > + rtx set = single_set (insn); > + if (set && REG_P (SET_SRC (set))) > + reg = SET_DEST (set); > + } > + if (reg > + && REG_P (reg) > + && HARD_REGISTER_P (reg) > + && overlaps_hard_reg_set_p (return_regs, > + GET_MODE (reg), REGNO (reg))) > + set_block_for_insn (insn, to); > +} > +} > + > +/* Fill in log links field for all insns that we wish to combine. */ > + > +static void > create_log_links (void) > { >basic_block bb; > @@ -1057,9 +1127,28 @@ create_log_links (void) >rtx_insn **next_use; >rtx_insn *insn; >df_ref def, use; > + HARD_REG_SET return_regs; > >next_use = XCNEWVEC (rtx_insn *, max_reg_num ()); > > + /* Don't combine instructions copying parameter values from hard > + regs to pseudos. Exclude such instructions from LOG_LINKS by > + temporarily zapping BLOCK_FOR_INSN. */ > + > + bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb; > + twiddle_first_
[PATCH] Don't combine param and return value copies
This stops combine messing with parameter and return value copies from/to hard registers. Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and x86_64-linux. In looking at a number of different powerpc64le gcc/*.o files, I noticed a few code generation improvements. There were cases where a register copy was no longer needed, cmp used in place of mr., and rlwinm instead of rldicl. x86_64 gcc/*.o showed no changes (apart from combine.o of course), but should see a small improvement in compile time with this change. The "clear next_use when !can_combine_p" change is to fix a non-bug. Given 1) insn defining rn ... 2) insn defining rn but !can_combine_p ... 3) insn using rn then create_log_links might create a link from (3) to (1), I thought. However, can_combine_p doesn't currently allow this to happen. Obviously, any can_combine_p result depending on regno shouldn't give a different result at (1) and (2), but there is also at test of DF_REF_PRE_POST_MODIFY that can. The saving grace is that pre/post modify insns also use the register, which means next_use[rn] will point at (2), not (3), when (1) is processed. I came across this because at one stage I considered modifying can_combine_p. Someone who does so in the future might trigger the above problem, so I thought it worth posting the change. OK for mainline? * combine.c (set_return_regs): New function. (twiddle_first_block, twiddle_last_block): New functions. (create_log_links): Exclude instructions copying parameter values from hard regs to pseudos, and instructions copying return value pseudos to hard regs. Clear next_use when !can_combine_p. Index: gcc/combine.c === --- gcc/combine.c (revision 223463) +++ gcc/combine.c (working copy) @@ -1048,9 +1048,79 @@ can_combine_use_p (df_ref use) return true; } -/* Fill in log links field for all insns. */ +/* Used to build set of return value regs. Add X to the set. */ static void +set_return_regs (rtx x, void *arg) +{ + HARD_REG_SET *regs = (HARD_REG_SET *) arg; + + add_to_hard_reg_set (regs, GET_MODE (x), REGNO (x)); +} + +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the first block BB + that we don't want to combine with other instructions. */ + +static void +twiddle_first_block (basic_block bb, basic_block to) +{ + rtx_insn *insn; + + FOR_BB_INSNS (bb, insn) +{ + if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG) + break; + if (!NONDEBUG_INSN_P (insn)) + continue; + + /* reg,reg copies from parameter hard regs. */ + rtx set = single_set (insn); + if (set + && REG_P (SET_DEST (set)) + && REG_P (SET_SRC (set)) + && HARD_REGISTER_P (SET_SRC (set))) + set_block_for_insn (insn, to); +} +} + +/* Twiddle BLOCK_FOR_INSN to TO for instructions in the last block BB + that we don't want to combine with other instructions. */ + +static void +twiddle_last_block (basic_block bb, basic_block to, HARD_REG_SET return_regs) +{ + rtx_insn *insn; + + FOR_BB_INSNS_REVERSE (bb, insn) +{ + if (CALL_P (insn)) + break; + if (!NONDEBUG_INSN_P (insn)) + continue; + + rtx reg = NULL_RTX; + /* use_return_regster added USEs. */ + if (GET_CODE (PATTERN (insn)) == USE) + reg = XEXP (PATTERN (insn), 0); + else + { + /* reg,reg copies that set return value hard regs. */ + rtx set = single_set (insn); + if (set && REG_P (SET_SRC (set))) + reg = SET_DEST (set); + } + if (reg + && REG_P (reg) + && HARD_REGISTER_P (reg) + && overlaps_hard_reg_set_p (return_regs, + GET_MODE (reg), REGNO (reg))) + set_block_for_insn (insn, to); +} +} + +/* Fill in log links field for all insns that we wish to combine. */ + +static void create_log_links (void) { basic_block bb; @@ -1057,9 +1127,28 @@ create_log_links (void) rtx_insn **next_use; rtx_insn *insn; df_ref def, use; + HARD_REG_SET return_regs; next_use = XCNEWVEC (rtx_insn *, max_reg_num ()); + /* Don't combine instructions copying parameter values from hard + regs to pseudos. Exclude such instructions from LOG_LINKS by + temporarily zapping BLOCK_FOR_INSN. */ + + bb = ENTRY_BLOCK_PTR_FOR_FN (cfun)->next_bb; + twiddle_first_block (bb, 0); + + /* Similarly, don't combine instructions copying return values + from pseudos to hard regs. */ + + CLEAR_HARD_REG_SET (return_regs); + diddle_return_value (set_return_regs, &return_regs); + if (!hard_reg_set_empty_p (return_regs)) +{ + bb = EXIT_BLOCK_PTR_FOR_FN (cfun)->prev_bb; + twiddle_last_block (bb, 0, return_regs); +} + /* Pass through each block from the end, recording the uses of each register and establishing log links when def is encou
Re: match.pd: (x | y) & ~x -> y & ~x
On May 22, 2015 10:54:02 PM GMT+02:00, Marc Glisse wrote: >On Mon, 18 May 2015, Richard Biener wrote: > >> On Fri, May 15, 2015 at 7:22 PM, Marc Glisse >wrote: >> >>> we already have the more complicated: x & ~(x & y) -> x & ~y (which >I am >>> reindenting by the way) and the simpler: (~x | y) & x -> x & y, so I >am >>> proposing this one for completeness. Regtested on >ppc64le-redhat-linux. >> >> Ok (doesn't seem to be in fold-const.c). >> >> Btw, there are quite some (simple) ones only in fold-const.c which >are >> eligible for moving to match.pd (thus remove them from fold-const.c >and >> implement in match.pd). Mostly canonicalization ones though. > >Haven't you already done a lot of those in the branch though? Yeah, maybe. I'd have to check. Reminds me to come back to all this. After sorting out all the vectorizer stuff I am working on at the moment. Richard.