Reduce -flto -fprofile-generate memory use
Hi,
while compiling Firefox I noticed that memory use with -fprofile-generate -flto goes up to 8GB. It turns out that this is caused by ipa-reference no longer being disabled, because in_lto_p became a flag that is set later (it is not clear to me why it needs to be this way). However, I do not see a reason not to disable ipa-reference for the non-LTO path, too.

Bootstrapped/regtested x86_linux, committed to mainline. OK for 4.9.1?

Honza

Index: ChangeLog
===================================================================
--- ChangeLog   (revision 209461)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2014-04-16  Jan Hubicka  hubi...@ucw.cz
 
+       * opts.c (common_handle_option): Disable -fipa-reference correctly
+       with -fuse-profile.
+
+2014-04-16  Jan Hubicka  hubi...@ucw.cz
+
        * ipa-devirt.c (odr_type_d): Add field all_derivations_known.
        (type_all_derivations_known_p): New predicate.
        (type_all_ctors_visible_p): New predicate.
Index: opts.c
===================================================================
--- opts.c      (revision 209461)
+++ opts.c      (working copy)
@@ -1732,7 +1732,7 @@ common_handle_option (struct gcc_options
       /* FIXME: Instrumentation we insert makes ipa-reference bitmaps
          quadratic.  Disable the pass until better memory representation
          is done.  */
-      if (!opts_set->x_flag_ipa_reference && opts->x_in_lto_p)
+      if (!opts_set->x_flag_ipa_reference)
         opts->x_flag_ipa_reference = false;
       break;
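The effect of dropping the in_lto_p test can be illustrated with a small stand-alone sketch. The types ToyOptions and ToyOptionsSet and the two handler functions below are hypothetical stand-ins invented for illustration; they are not GCC's gcc_options machinery, and only model the one decision the patch changes.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the option records; only the
// fields needed for this illustration are modelled.
struct ToyOptions {
  bool flag_ipa_reference = true;  // value the compiler would act on
  bool in_lto_p = false;           // assumed to be set only later
};

struct ToyOptionsSet {
  bool flag_ipa_reference = false; // true if the user gave the flag explicitly
};

// Old check: the pass is only disabled when in_lto_p is already set.
void profile_generate_old(ToyOptions &opts, const ToyOptionsSet &set) {
  if (!set.flag_ipa_reference && opts.in_lto_p)
    opts.flag_ipa_reference = false;
}

// New check: the pass is disabled unless the user asked for it explicitly.
void profile_generate_new(ToyOptions &opts, const ToyOptionsSet &set) {
  if (!set.flag_ipa_reference)
    opts.flag_ipa_reference = false;
}

// Small self-check showing the difference while in_lto_p is still unset.
void demo_profile_generate() {
  ToyOptions opts;
  ToyOptionsSet set;
  profile_generate_old(opts, set);
  assert(opts.flag_ipa_reference);   // old check leaves the pass enabled
  profile_generate_new(opts, set);
  assert(!opts.flag_ipa_reference);  // new check disables it
}
```

In this model an explicit user setting still wins in both variants, which matches the `!opts_set->x_flag_ipa_reference` guard kept by the patch.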
Re: Reduce -flto -fprofile-generate memory use
On Thu, 17 Apr 2014, Jan Hubicka wrote:

> Hi,
> while compiling Firefox I noticed that memory use with -fprofile-generate -flto goes up to 8GB. It turns out that this is caused by ipa-reference no longer being disabled, because in_lto_p became a flag that is set later (it is not clear to me why it needs to be this way). However, I do not see a reason not to disable ipa-reference for the non-LTO path, too.
>
> Bootstrapped/regtested x86_linux, committed to mainline. OK for 4.9.1?

Yes.

Thanks,
Richard.

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
Re: Fix lto/PR60854
On Thu, Apr 17, 2014 at 4:30 AM, Jan Hubicka hubi...@ucw.cz wrote:
> Hi,
> the testcase shows a problem where a cpp implicit alias is always inline and symtab_remove_unreachable_nodes removes the body of the aliased function before inlining happens.
>
> The real problem is that cgraph_state is set too early and not, as the comment says, after inlining, but for the release branch I think it is easier to solve the problem by simply making the alias target reachable by hand.
>
> Bootstrapped/regtested x86_64-linux, committed to trunk. Let me know when it is OK for the release branch.

It's ok for 4.9.1.

Richard.

> Honza
>
> Index: ChangeLog
> ===================================================================
> --- ChangeLog   (revision 209458)
> +++ ChangeLog   (working copy)
> @@ -1,3 +1,9 @@
> +2014-04-16  Jan Hubicka  hubi...@ucw.cz
> +
> +       PR ipa/60854
> +       * ipa.c (symtab_remove_unreachable_nodes): Mark targets of
> +       external aliases alive, too.
> +
>  2014-04-16  Andrew Pinski  apin...@cavium.com
>
>         * config/host-linux.c (TRY_EMPTY_VM_SPACE): Change aarch64 ilp32
> Index: testsuite/ChangeLog
> ===================================================================
> --- testsuite/ChangeLog (revision 209450)
> +++ testsuite/ChangeLog (working copy)
> @@ -1,3 +1,8 @@
> +2014-04-16  Jan Hubicka  hubi...@ucw.cz
> +
> +       PR ipa/60854
> +       * g++.dg/torture/pr60854.C: New testcase.
> +
>  2014-04-16  Catherine Moore  c...@codesourcery.com
>
>         * gcc.target/mips/umips-store16-2.c: New test.
> Index: ipa.c
> ===================================================================
> --- ipa.c       (revision 209450)
> +++ ipa.c       (working copy)
> @@ -415,7 +415,18 @@ symtab_remove_unreachable_nodes (bool be
>                           || !DECL_EXTERNAL (e->callee->decl)
>                           || e->callee->alias
>                           || before_inlining_p))
> -                   pointer_set_insert (reachable, e->callee);
> +                   {
> +                     /* Be sure that we will not optimize out alias target
> +                        body.  */
> +                     if (DECL_EXTERNAL (e->callee->decl)
> +                         && e->callee->alias
> +                         && before_inlining_p)
> +                       {
> +                         pointer_set_insert (reachable,
> +                                             cgraph_function_node (e->callee));
> +                       }
> +                     pointer_set_insert (reachable, e->callee);
> +                   }
>                   enqueue_node (e->callee, &first, reachable);
>                 }
> Index: testsuite/g++.dg/torture/pr60854.C
> ===================================================================
> --- testsuite/g++.dg/torture/pr60854.C  (revision 0)
> +++ testsuite/g++.dg/torture/pr60854.C  (revision 0)
> @@ -0,0 +1,13 @@
> +template <typename T>
> +class MyClass
> +{
> +public:
> +  __attribute__ ((__always_inline__)) inline MyClass () { ; }
> +};
> +
> +extern template class MyClass<double>;
> +
> +void Func()
> +{
> +  MyClass<double> x;
> +}
Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path
On Thu, Apr 17, 2014 at 7:30 AM, Jeff Law l...@redhat.com wrote:
> On 03/18/14 04:13, bin.cheng wrote:
>> Hi,
>> After the control flow graph change made by
>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
>> gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
>> targets including cortex-m3/cortex-m0.
>> The regression reveals a missed opportunity in jump threading, which
>> leaves a forwarder basic block that is not removed in cfgcleanup after
>> jump threading in VRP1. The root cause is stated in the corresponding PR:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to it for
>> the detailed report.
>>
>> This patch fixes the issue by adding the constant value instead of the
>> ssa_name as the new phi argument. Bootstrapped and tested on x86_64, also
>> tested on cortex-m3, and the regression is gone.
>> I think this should wait for stage1, but would like to hear some comments
>> now. So does it look reasonable?
>>
>> 2014-03-18  Bin Cheng  bin.ch...@arm.com
>>
>>         PR regression/60363
>>         * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
>>         (copy_phi_args): New parameters.  Call get_value_locus_in_path.
>>         (update_destination_phis): New parameter.
>>         (create_edge_and_update_destination_phis): Ditto.
>>         (ssa_fix_duplicate_block_edges): Pass new arguments.
>>         (thread_single_edge): Ditto.
>
> This is a good and interesting catch. DOM knows how to propagate these context sensitive equivalences which should expose the optimizable forwarder blocks.
>
> But I'm a big believer in catching as many CFG simplifications as early as we can as they tend to have nice cascading effects. So if we can pick it up by being smarter in how we duplicate arguments, then I'm all for it.
>
>> +  for (int j = idx - 1; j >= 0; j--)
>> +    {
>> +      edge e = (*path)[j]->e;
>> +      if (e->dest == def_bb)
>> +       {
>> +         arg = gimple_phi_arg_def (def_phi, e->dest_idx);
>> +         *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
>> +         return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
>
> Presumably any constant that can legitimately appear in a PHI node is good here. So for example ADDR_EXPR <something in static storage> ought to be handled as well.
>
> One could also argue that we should go ahead and do a context sensitive copy propagation here too if ARG turns out to be an SSA_NAME. You have to be a bit more careful with those and use may_propagate_copy_p and you'd probably want to test the loop depth of the SSA_NAMEs to ensure you're not doing a propagation that is going to muck up LICM. See loop_depth_of_name uses in tree-ssa-dom.c.
>
> Overall I think it's good. We just need to resolve whether or not we want to catch constant ADDR_EXPRs and/or do the context sensitive copy propagations.

Simply use is_gimple_min_invariant (arg) ? arg : def

> jeff
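The backward walk discussed above can be sketched outside of GCC as follows. All types here (ToyValue, ToyEdge, ToyPhi) and the function value_along_path are hypothetical simplifications made up for illustration; the is_invariant flag stands in for the is_gimple_min_invariant test suggested in the reply, and location tracking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model of the data involved; not GCC's tree/edge types.
struct ToyValue {
  bool is_invariant;  // stands in for is_gimple_min_invariant (arg)
  int id;             // opaque identifier of the value
};

struct ToyEdge {
  int dest_bb;           // index of the destination basic block
  std::size_t dest_idx;  // PHI argument slot used by this edge
};

struct ToyPhi {
  int bb;                      // block that holds the PHI
  std::vector<ToyValue> args;  // one argument per incoming edge
  ToyValue result;             // the SSA name defined by the PHI
};

// Walk the threading path backwards from position idx.  If an earlier edge
// enters the block defining the PHI, the argument on that edge is the value
// the PHI result has along this path; use it when it is invariant,
// otherwise fall back to the PHI result itself.
ToyValue value_along_path(const std::vector<ToyEdge> &path, int idx,
                          const ToyPhi &def_phi) {
  assert(idx >= 0 && static_cast<std::size_t>(idx) <= path.size());
  for (int j = idx - 1; j >= 0; --j) {
    const ToyEdge &e = path[static_cast<std::size_t>(j)];
    if (e.dest_bb == def_phi.bb) {
      const ToyValue &arg = def_phi.args[e.dest_idx];
      return arg.is_invariant ? arg : def_phi.result;
    }
  }
  return def_phi.result;
}
```

Whether non-constant arguments should also be propagated (the may_propagate_copy_p question raised above) is deliberately not modelled here.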
[PATCH] Fix PR60841
This fixes running into the exponential value-graph -> SLP tree expansion by artificially limiting the overall SLP tree size.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  rguent...@suse.de

        PR tree-optimization/60841
        * tree-vect-data-refs.c (vect_analyze_data_refs): Count stmts.
        * tree-vect-loop.c (vect_analyze_loop_2): Pass down number
        of stmts to SLP build.
        * tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise.
        (vect_analyze_slp): Likewise.
        (vect_analyze_slp_instance): Likewise.
        (vect_build_slp_tree): Limit overall SLP tree growth.
        * tree-vectorizer.h (vect_analyze_data_refs,
        vect_analyze_slp): Adjust prototypes.

        * gcc.dg/vect/pr60841.c: New testcase.

Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c   (revision 209423)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -3172,7 +3213,7 @@ vect_check_gather (gimple stmt, loop_vec
 bool
 vect_analyze_data_refs (loop_vec_info loop_vinfo,
                         bb_vec_info bb_vinfo,
-                        int *min_vf)
+                        int *min_vf, unsigned *n_stmts)
 {
   struct loop *loop = NULL;
   basic_block bb = NULL;
@@ -3207,6 +3248,9 @@ vect_analyze_data_refs (loop_vec_info lo
           for (gsi = gsi_start_bb (bbs[i]); !gsi_end_p (gsi); gsi_next (&gsi))
             {
               gimple stmt = gsi_stmt (gsi);
+              if (is_gimple_debug (stmt))
+                continue;
+              ++*n_stmts;
               if (!find_data_references_in_stmt (loop, stmt, &datarefs))
                 {
                   if (is_gimple_call (stmt) && loop->safelen)
@@ -3260,6 +3304,9 @@ vect_analyze_data_refs (loop_vec_info lo
       for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
         {
           gimple stmt = gsi_stmt (gsi);
+          if (is_gimple_debug (stmt))
+            continue;
+          ++*n_stmts;
           if (!find_data_references_in_stmt (NULL, stmt,
                                              &BB_VINFO_DATAREFS (bb_vinfo)))
             {
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c        (revision 209423)
+++ gcc/tree-vect-loop.c        (working copy)
@@ -1629,6 +1629,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
   int max_vf = MAX_VECTORIZATION_FACTOR;
   int min_vf = 2;
   unsigned int th;
+  unsigned int n_stmts = 0;

   /* Find all data references in the loop (which correspond to vdefs/vuses)
      and analyze their evolution in the loop.  Also adjust the minimal
@@ -1637,7 +1638,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
      FORNOW: Handle only simple, array references, which
      alignment can be forced, and aligned pointer-references.  */

-  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf);
+  ok = vect_analyze_data_refs (loop_vinfo, NULL, &min_vf, &n_stmts);
   if (!ok)
     {
       if (dump_enabled_p ())
@@ -1747,7 +1748,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
     }

   /* Check the SLP opportunities in the loop, analyze and build SLP trees.  */
-  ok = vect_analyze_slp (loop_vinfo, NULL);
+  ok = vect_analyze_slp (loop_vinfo, NULL, n_stmts);
   if (ok)
     {
       /* Decide which possible SLP instances to SLP.  */
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c (revision 209423)
+++ gcc/tree-vect-slp.c (working copy)
@@ -849,9 +849,10 @@ vect_build_slp_tree (loop_vec_info loop_
                      unsigned int *max_nunits,
                      vec<slp_tree> *loads,
                      unsigned int vectorization_factor,
-                     bool *matches, unsigned *npermutes)
+                     bool *matches, unsigned *npermutes, unsigned *tree_size,
+                     unsigned max_tree_size)
 {
-  unsigned nops, i, this_npermutes = 0;
+  unsigned nops, i, this_npermutes = 0, this_tree_size = 0;
   gimple stmt;

   if (!matches)
@@ -911,6 +912,12 @@ vect_build_slp_tree (loop_vec_info loop_
       if (oprnd_info->first_dt != vect_internal_def)
         continue;

+      if (++this_tree_size > max_tree_size)
+        {
+          vect_free_oprnd_info (oprnds_info);
+          return false;
+        }
+
       child = vect_create_new_slp_node (oprnd_info->def_stmts);
       if (!child)
         {
@@ -921,7 +928,8 @@ vect_build_slp_tree (loop_vec_info loop_
       bool *matches = XALLOCAVEC (bool, group_size);
       if (vect_build_slp_tree (loop_vinfo, bb_vinfo, child,
                                group_size, max_nunits, loads,
-                               vectorization_factor, matches, npermutes))
+                               vectorization_factor, matches,
+                               npermutes, &this_tree_size, max_tree_size))
         {
           oprnd_info->def_stmts = vNULL;
           SLP_TREE_CHILDREN
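The idea of capping an exponentially growing recursive build with a node budget can be shown with a toy example. The function build_toy_tree below is a hypothetical simplification: it uses one counter shared through a pointer, whereas the patch keeps a per-call this_tree_size, and it builds nothing but the count.

```cpp
#include <cassert>

// Toy recursive expansion: every node above depth 0 has two children, so an
// unbounded build of depth d visits 2^(d+1) - 1 nodes.  The shared counter
// *tree_size is bumped for each node, and the build is abandoned as soon as
// it exceeds max_tree_size.
bool build_toy_tree(unsigned depth, unsigned *tree_size,
                    unsigned max_tree_size) {
  assert(tree_size != nullptr);
  if (++*tree_size > max_tree_size)
    return false;  // budget exhausted: give up on the whole tree
  if (depth == 0)
    return true;   // leaf node, nothing more to expand
  for (int child = 0; child < 2; ++child)
    if (!build_toy_tree(depth - 1, tree_size, max_tree_size))
      return false;  // propagate the failure without creating more nodes
  return true;
}
```

With a budget proportional to the number of statements (as n_stmts is passed down in the patch), the work stays bounded even when the expansion itself would be exponential.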
[PATCH] Fix PR60849
This fixes PR60849 by properly rejecting non-boolean typed comparisons from valid_gimple_rhs_p so they go through the gimplification paths.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  rguent...@suse.de

        PR middle-end/60849
        * tree-ssa-propagate.c (valid_gimple_rhs_p): Only allow effective
        boolean results for comparisons.

        * g++.dg/opt/pr60849.C: New testcase.

Index: gcc/tree-ssa-propagate.c
===================================================================
--- gcc/tree-ssa-propagate.c    (revision 209423)
+++ gcc/tree-ssa-propagate.c    (working copy)
@@ -571,8 +571,14 @@ valid_gimple_rhs_p (tree expr)
       /* All constants are ok.  */
       break;

-    case tcc_binary:
     case tcc_comparison:
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
+          || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
+              && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+        return false;
+
+      /* Fallthru.  */
+    case tcc_binary:
       if (!is_gimple_val (TREE_OPERAND (expr, 0))
           || !is_gimple_val (TREE_OPERAND (expr, 1)))
         return false;
Index: gcc/testsuite/g++.dg/opt/pr60849.C
===================================================================
--- gcc/testsuite/g++.dg/opt/pr60849.C  (revision 0)
+++ gcc/testsuite/g++.dg/opt/pr60849.C  (working copy)
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O2" }
+
+int g;
+
+extern "C" int isnan ();
+
+void foo(float a) {
+  int (*xx)(...);
+  xx = isnan;
+  if (xx(a))
+    g++;
+}
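The type test added for comparisons can be restated as a small predicate. ToyType and the helper names are hypothetical; in particular the toy integral check only knows about boolean and integer kinds, which is narrower than what INTEGRAL_TYPE_P accepts.

```cpp
#include <cassert>

// Hypothetical toy description of a result type; not GCC's tree type.
struct ToyType {
  enum Kind { Boolean, Integer, Real } kind;
  unsigned precision;  // width in bits
};

// Simplified stand-in for INTEGRAL_TYPE_P.
bool toy_integral_type_p(const ToyType &t) {
  return t.kind == ToyType::Boolean || t.kind == ToyType::Integer;
}

// Mirrors the condition in the patch: a comparison result is acceptable
// only if its type is integral and is either a boolean type or has a
// precision of exactly one bit.
bool comparison_result_type_ok(const ToyType &t) {
  if (!toy_integral_type_p(t))
    return false;
  if (t.kind != ToyType::Boolean && t.precision != 1)
    return false;
  return true;
}

// Small self-check: a 32-bit int result (as in the isnan testcase) is
// rejected, while boolean and 1-bit integer results are accepted.
void demo_comparison_result_type() {
  assert(!comparison_result_type_ok(ToyType{ToyType::Integer, 32}));
  assert(comparison_result_type_ok(ToyType{ToyType::Boolean, 8}));
  assert(comparison_result_type_ok(ToyType{ToyType::Integer, 1}));
}
```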
[PATCH] Fix PR60836
This fixes PR60836 by forcing the initial PHI arguments to be gimple values and emitting any statements needed for that on the incoming (preheader) edge.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-04-16  Richard Biener  rguent...@suse.de

        PR tree-optimization/60836
        * tree-vect-loop.c (vect_create_epilog_for_reduction): Force
        initial PHI args to be gimple values.

        * g++.dg/vect/pr60836.cc: New testcase.

Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c        (revision 209423)
--- gcc/tree-vect-loop.c        (working copy)
*************** vect_create_epilog_for_reduction (vec<tr
*** 3951,3958 ****
    /* Set phi nodes arguments.  */
    FOR_EACH_VEC_ELT (reduction_phis, i, phi)
      {
!       tree vec_init_def = vec_initial_defs[i];
!       tree def = vect_defs[i];
        for (j = 0; j < ncopies; j++)
          {
            /* Set the loop-entry arg of the reduction-phi.  */
--- 3952,3963 ----
    /* Set phi nodes arguments.  */
    FOR_EACH_VEC_ELT (reduction_phis, i, phi)
      {
!       tree vec_init_def, def;
!       gimple_seq stmts;
!       vec_init_def = force_gimple_operand (vec_initial_defs[i], &stmts,
!                                            true, NULL_TREE);
!       gsi_insert_seq_on_edge_immediate (loop_preheader_edge (loop), stmts);
!       def = vect_defs[i];
        for (j = 0; j < ncopies; j++)
          {
            /* Set the loop-entry arg of the reduction-phi.  */
Index: gcc/testsuite/g++.dg/vect/pr60836.cc
===================================================================
*** gcc/testsuite/g++.dg/vect/pr60836.cc        (revision 0)
--- gcc/testsuite/g++.dg/vect/pr60836.cc        (working copy)
***************
*** 0 ****
--- 1,39 ----
+ // { dg-do compile }
+
+ int a, b;
+ typedef double (*NormFunc) (const int &);
+ int
+ max (int p1, int p2)
+ {
+   if (p1 < p2)
+     return p2;
+   return p1;
+ }
+
+ struct A
+ {
+   int operator () (int p1, int p2)
+   {
+     return max (p1, p2);
+   }
+ };
+ template < class, class > double
+ norm_ (const int &)
+ {
+   char c, d;
+   A e;
+   for (; a; a++)
+     {
+       b = e (b, d);
+       b = e (b, c);
+     }
+ }
+
+ void
+ norm ()
+ {
+   static NormFunc f = norm_ < int, A >;
+   f = 0;
+ }
+
+ // { dg-final { cleanup-tree-dump "vect" } }
[PATCH 2/6] merge register_dump_files_1 into register_dump_files
From: Trevor Saunders tsaund...@mozilla.com

Hi,

Simplification allowed by the previous patch.

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

        * pass_manager.h (pass_manager::register_dump_files_1): Remove
        declaration.
        * passes.c (pass_manager::register_dump_files_1): Merge into
        (pass_manager::register_dump_files): this, and remove its handling of
        properties since the pass always has the properties anyway.
        (pass_manager::pass_manager): Adjust.

diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index 8309567..9f4d67b 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,8 +91,7 @@ public:

 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  void register_dump_files_1 (opt_pass *pass);
-  void register_dump_files (opt_pass *pass, int properties);
+  void register_dump_files (opt_pass *pass);

 private:
   context *m_ctxt;
diff --git a/gcc/passes.c b/gcc/passes.c
index 3f9590a..7508771 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -706,11 +706,10 @@ pass_manager::register_one_dump_file (opt_pass *pass)
   free (CONST_CAST (char *, full_name));
 }

-/* Recursive worker function for register_dump_files.  */
+/* Register the dump files for the pass_manager starting at PASS. */

 void
-pass_manager::
-register_dump_files_1 (opt_pass *pass)
+pass_manager::register_dump_files (opt_pass *pass)
 {
   do
     {
@@ -718,25 +717,13 @@ register_dump_files_1 (opt_pass *pass)
         register_one_dump_file (pass);

       if (pass->sub)
-        register_dump_files_1 (pass->sub);
+        register_dump_files (pass->sub);

       pass = pass->next;
     }
   while (pass);
 }

-/* Register the dump files for the pass_manager starting at PASS.
-   PROPERTIES reflects the properties that are guaranteed to be available at
-   the beginning of the pipeline.  */
-
-void
-pass_manager::
-register_dump_files (opt_pass *pass,int properties)
-{
-  pass->properties_required |= properties;
-  register_dump_files_1 (pass);
-}
-
 struct pass_registry
 {
   const char* unique_name;
@@ -1536,19 +1523,11 @@ pass_manager::pass_manager (context *ctxt)
 #undef TERMINATE_PASS_LIST

   /* Register the passes with the tree dump code.  */
-  register_dump_files (all_lowering_passes, PROP_gimple_any);
-  register_dump_files (all_small_ipa_passes,
-                       PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-                       | PROP_cfg);
-  register_dump_files (all_regular_ipa_passes,
-                       PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-                       | PROP_cfg);
-  register_dump_files (all_late_ipa_passes,
-                       PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-                       | PROP_cfg);
-  register_dump_files (all_passes,
-                       PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh
-                       | PROP_cfg);
+  register_dump_files (all_lowering_passes);
+  register_dump_files (all_small_ipa_passes);
+  register_dump_files (all_regular_ipa_passes);
+  register_dump_files (all_late_ipa_passes);
+  register_dump_files (all_passes);
 }

 /* If we are in IPA mode (i.e., current_function_decl is NULL), call
-- 
1.9.2
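The merged function walks a pass list through its next links and recurses into sub lists. A stand-alone sketch of that traversal follows; ToyPass and the vector used to record names are hypothetical, and recording a name stands in for the call to register_one_dump_file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy pass node with the two links used by the traversal.
struct ToyPass {
  const char *name;
  ToyPass *sub;   // first pass of a nested pass list, if any
  ToyPass *next;  // next pass on the same level, if any
  explicit ToyPass(const char *n) : name(n), sub(nullptr), next(nullptr) {}
};

// Visit every pass reachable from PASS: siblings via next, children via sub.
// Passes whose name starts with '*' are skipped, as in the original check.
void register_dump_files(ToyPass *pass, std::vector<std::string> &registered) {
  assert(pass != nullptr);
  do {
    if (pass->name && pass->name[0] != '*')
      registered.push_back(pass->name);  // stand-in for the real registration
    if (pass->sub)
      register_dump_files(pass->sub, registered);
    pass = pass->next;
  } while (pass);
}
```

A pass is visited before its sub passes, and sub passes before the following sibling, so the recorded order matches a pre-order walk of the pass tree.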
[PATCH 1/6] remove properties stuff from register_dump_files_1
From: Trevor Saunders tsaund...@mozilla.com

Hi,

Just removing some dead code.

bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

        * pass_manager.h (pass_manager::register_dump_files_1): Adjust.
        * passes.c (pass_manager::register_dump_files_1): Remove dead code
        dealing with properties.
        (pass_manager::register_dump_files): Adjust.

diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h
index e1d8143..8309567 100644
--- a/gcc/pass_manager.h
+++ b/gcc/pass_manager.h
@@ -91,7 +91,7 @@ public:

 private:
   void set_pass_for_id (int id, opt_pass *pass);
-  int register_dump_files_1 (opt_pass *pass, int properties);
+  void register_dump_files_1 (opt_pass *pass);
   void register_dump_files (opt_pass *pass, int properties);

 private:
diff --git a/gcc/passes.c b/gcc/passes.c
index 60fb135..3f9590a 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass)

 /* Recursive worker function for register_dump_files.  */

-int
+void
 pass_manager::
-register_dump_files_1 (opt_pass *pass, int properties)
+register_dump_files_1 (opt_pass *pass)
 {
   do
     {
-      int new_properties = (properties | pass->properties_provided)
-                           & ~pass->properties_destroyed;
-
       if (pass->name && pass->name[0] != '*')
         register_one_dump_file (pass);

       if (pass->sub)
-        new_properties = register_dump_files_1 (pass->sub, new_properties);
-
-      /* If we have a gate, combine the properties that we could have with
-         and without the pass being examined.  */
-      if (pass->has_gate)
-        properties &= new_properties;
-      else
-        properties = new_properties;
+        register_dump_files_1 (pass->sub);

       pass = pass->next;
     }
   while (pass);
-
-  return properties;
 }

 /* Register the dump files for the pass_manager starting at PASS.
@@ -746,7 +734,7 @@ pass_manager::
 register_dump_files (opt_pass *pass,int properties)
 {
   pass->properties_required |= properties;
-  register_dump_files_1 (pass, properties);
+  register_dump_files_1 (pass);
 }

 struct pass_registry
-- 
1.9.2
[PATCH 4/6] enable -Woverloaded-virtual when available
From: Trevor Saunders tbsau...@mozilla.com

Hi,

It's a useful warning, and it helps catch bugs in the next two patches.

bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Trev

2014-03-19  Trevor Saunders  tsaund...@mozilla.com

        * configure.ac: Check for -Woverloaded-virtual and enable it if found.
        * configure: Regenerate.

diff --git a/gcc/configure b/gcc/configure
index 415377a..1a48ca3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -6427,6 +6427,50 @@ fi
   done
 CFLAGS="$save_CFLAGS"

+save_CFLAGS="$CFLAGS"
+for real_option in -Woverloaded-virtual; do
+  # Do the check with the no- prefix removed since gcc silently
+  # accepts any -Wno-* option on purpose
+  case $real_option in
+    -Wno-*) option=-W`expr x$real_option : 'x-Wno-\(.*\)'` ;;
+    *) option=$real_option ;;
+  esac
+  as_acx_Woption=`$as_echo "acx_cv_prog_cc_warning_$option" | $as_tr_sh`
+
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking whether $CC supports $option" >&5
+$as_echo_n "checking whether $CC supports $option... " >&6; }
+if { as_var=$as_acx_Woption; eval "test \"\${$as_var+set}\" = set"; }; then :
+  $as_echo_n "(cached) " >&6
+else
+  CFLAGS="$option"
+    cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  eval "$as_acx_Woption=yes"
+else
+  eval "$as_acx_Woption=no"
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+
+fi
+eval ac_res=\$$as_acx_Woption
+               { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
+$as_echo "$ac_res" >&6; }
+  if test `eval 'as_val=${'$as_acx_Woption'};$as_echo "$as_val"'` = yes; then :
+  strict_warn="$strict_warn${strict_warn:+ }$real_option"
+fi
+  done
+CFLAGS="$save_CFLAGS"
+
 c_strict_warn=
 save_CFLAGS="$CFLAGS"
 for real_option in -Wold-style-definition -Wc++-compat; do
@@ -17927,7 +17971,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 17930 "configure"
+#line 17974 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
@@ -18033,7 +18077,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18036 "configure"
+#line 18080 "configure"
 #include "confdefs.h"

 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 0336066..b2726e5 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -340,6 +340,8 @@ ACX_PROG_CC_WARNING_OPTS(
 ACX_PROG_CC_WARNING_OPTS(
        m4_quote(m4_do([-Wmissing-format-attribute])), [strict_warn])
 ACX_PROG_CC_WARNING_OPTS(
+       m4_quote(m4_do([-Woverloaded-virtual])), [strict_warn])
+ACX_PROG_CC_WARNING_OPTS(
        m4_quote(m4_do([-Wold-style-definition -Wc++-compat])),
        [c_strict_warn])
 ACX_PROG_CC_WARNING_ALMOST_PEDANTIC(
        m4_quote(m4_do([-Wno-long-long -Wno-variadic-macros ],
-- 
1.9.2
Re: [PATCH 2/6] merge register_dump_files_1 into register_dump_files
On Thu, Apr 17, 2014 at 10:37 AM, tsaund...@mozilla.com wrote:
> From: Trevor Saunders tsaund...@mozilla.com
>
> Hi,
>
> Simplification allowed by the previous patch.
>
> bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  tsaund...@mozilla.com
>
>         * pass_manager.h (pass_manager::register_dump_files_1): Remove
>         declaration.
>         * passes.c (pass_manager::register_dump_files_1): Merge into
>         (pass_manager::register_dump_files): this, and remove its handling of
>         properties since the pass always has the properties anyway.
>         (pass_manager::pass_manager): Adjust.
Re: [PATCH 1/6] remove properties stuff from register_dump_files_1
On Thu, Apr 17, 2014 at 10:37 AM, tsaund...@mozilla.com wrote:
> From: Trevor Saunders tsaund...@mozilla.com
>
> Hi,
>
> Just removing some dead code.
>
> bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  tsaund...@mozilla.com
>
>         * pass_manager.h (pass_manager::register_dump_files_1): Adjust.
>         * passes.c (pass_manager::register_dump_files_1): Remove dead code
>         dealing with properties.
>         (pass_manager::register_dump_files): Adjust.
Re: [PATCH 4/6] enable -Woverloaded-virtual when available
On Thu, Apr 17, 2014 at 10:37 AM, tsaund...@mozilla.com wrote:
> From: Trevor Saunders tbsau...@mozilla.com
>
> Hi,
>
> It's a useful warning, and it helps catch bugs in the next two patches.
>
> bootstrap + regtest passed on x86_64-unknown-linux-gnu, ok?

Ok.

Thanks,
Richard.

> Trev
>
> 2014-03-19  Trevor Saunders  tsaund...@mozilla.com
>
>         * configure.ac: Check for -Woverloaded-virtual and enable it if found.
>         * configure: Regenerate.
Re: [RFC] Add aarch64 support for ada
On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:

> On 04/16/2014 12:39 AM, Eric Botcazou wrote:
>>> The primary bit of rfc here is the hunk that applies to ada/types.h
>>> with respect to Fat_Pointer. Given that the Ada type, as defined in
>>> s-stratt.ads, does not include alignment, I can't imagine why the C
>>> type should have it.
>>
>> See gcc-interface/utils.c:finish_fat_pointer_type.
>
> Ah hah.
>
>   /* Make sure we can put it into a register.  */
>   if (STRICT_ALIGNMENT)
>     TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);
>
> AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

As the align attribute in types.h is for the host, couldn't a configure test solve this issue?

> If we were to make this alignment unconditional, would it be better to drop
> the code from here in finish_fat_pointer_type and instead record that in the
> Ada source, as we do with the C source? I presume
>
>   for Fat_Pointer'Alignment use System.Address'Size * 2;
>
> or some such incantation would do that...

One of the most common Fat_Pointer types is the one for strings, which isn't declared in any source and is very commonly used.

OTOH, I think this optimization mostly targets sparc.

Tristan.
Re: [PATCH v2] libstdc++: Add hexfloat/defaultfloat io manipulators.
On 17 April 2014 01:56, Luke Allardyce wrote:
>> Thanks, I was wrong about that.
>>
>> Then I think we should just bite the bullet and provide the new
>> behaviour. If we do have an abi_tag on those types in the next
>> release then we can preserve the old behaviour in the old ABI and use
>> the C++11 semantics for the abi_tagged type, which will be used for
>> both C++03 and C++11 code.
>>
>> I am not too concerned that people who use a meaningless modifier in
>> C++03 code get the C++11 behaviour. If they really want %g or %G then
>> they shouldn't use fixed|scientific.
>
> Does that mean abi_tag will be enabled with separate compiler flag /
> define rather than checking against the __cplusplus value?

I'm going to send a mail later on today, but the plan is that it's not going to depend on __cplusplus at all. That makes it possible to pass the abi_tagged types between C++03 and C++11 code.
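As background for the %g versus hexfloat point, C++11 maps fixed|scientific to the printf conversion %a. The sketch below uses plain C stdio rather than iostreams, and the helper names are hypothetical. It formats a double with %a and checks that the text reads back to the same value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Format VALUE in hexadecimal floating-point notation (the "%a"
   conversion).  Returns nonzero if the text fit into BUF.  */
static int
format_hexfloat (double value, char *buf, size_t size)
{
  int n = snprintf (buf, size, "%a", value);
  return n > 0 && (size_t) n < size;
}

/* Returns nonzero if VALUE survives a "%a" print followed by a
   strtod parse without any change.  */
static int
hexfloat_round_trips (double value)
{
  char buf[64];
  if (!format_hexfloat (value, buf, sizeof buf))
    return 0;
  return strtod (buf, NULL) == value;
}
```

The exact digits printed by %a can differ between C libraries, so the sketch only relies on the round trip, not on a particular spelling.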
Re: [PATCH] Enhancing the widen-mult pattern in vectorization.
On Sat, Dec 7, 2013 at 12:45 AM, Cong Hou co...@google.com wrote:
> After further reviewing this patch, I found I don't have to change the
> code in tree-vect-stmts.c to allow further type conversion after
> widen-mult operation. Instead, I detect the following pattern in
> vect_recog_widen_mult_pattern():
>
>   T1 a, b;
>   ai = (T2) a;
>   bi = (T2) b;
>   c = ai * bi;
>
> where T2 is more than double the size of T1 (e.g. T1 is char and T2 is int).
> In this case I just create a new type T3 whose size is double the
> size of T1, then get an intermediate result of type T3 from
> widen-mult. Then I add a new statement to STMT_VINFO_PATTERN_DEF_SEQ
> converting the result into type T2.
>
> This strategy makes the patch cleaner.
>
> Bootstrapped and tested on an x86-64 machine.

Ok for trunk (please re-bootstrap/test of course).

Thanks,
Richard.

> thanks,
> Cong
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index f298c0b..12990b2 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2013-12-02  Cong Hou  co...@google.com
> +
> +  * tree-vect-patterns.c (vect_recog_widen_mult_pattern): Enhance
> +  the widen-mult pattern by handling two operands with different
> +  sizes, and operands whose size is smaller than half of the result
> +  type.
> +
>  2013-11-22  Jakub Jelinek  ja...@redhat.com
>
>    PR sanitizer/59061
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 12d2c90..611ae1c 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-12-02  Cong Hou  co...@google.com
> +
> +  * gcc.dg/vect/vect-widen-mult-u8-s16-s32.c: New test.
> +  * gcc.dg/vect/vect-widen-mult-u8-u32.c: New test.
> +
>  2013-11-22  Jakub Jelinek  ja...@redhat.com
>
>    * c-c++-common/asan/no-redundant-instrumentation-7.c: Fix
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
> new file mode 100644
> index 0000000..9f9081b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-s16-s32.c
> @@ -0,0 +1,48 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include <stdarg.h>
> +#include "tree-vect.h"
> +
> +#define N 64
> +
> +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +int result[N];
> +
> +/* unsigned char * short -> int widening-mult.  */
> +__attribute__ ((noinline)) int
> +foo1(int len) {
> +  int i;
> +
> +  for (i=0; i<len; i++) {
> +    result[i] = X[i] * Y[i];
> +  }
> +}
> +
> +int main (void)
> +{
> +  int i;
> +
> +  check_vect ();
> +
> +  for (i=0; i<N; i++) {
> +    X[i] = i;
> +    Y[i] = 64-i;
> +    __asm__ volatile ("");
> +  }
> +
> +  foo1 (N);
> +
> +  for (i=0; i<N; i++) {
> +    if (result[i] != X[i] * Y[i])
> +      abort ();
> +  }
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
> +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
> new file mode 100644
> index 0000000..12c4692
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
> @@ -0,0 +1,48 @@
> +/* { dg-require-effective-target vect_int } */
> +
> +#include <stdarg.h>
> +#include "tree-vect.h"
> +
> +#define N 64
> +
> +unsigned char X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +unsigned char Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
> +unsigned int result[N];
> +
> +/* unsigned char-> unsigned int widening-mult.  */
> +__attribute__ ((noinline)) int
> +foo1(int len) {
> +  int i;
> +
> +  for (i=0; i<len; i++) {
> +    result[i] = X[i] * Y[i];
> +  }
> +}
> +
> +int main (void)
> +{
> +  int i;
> +
> +  check_vect ();
> +
> +  for (i=0; i<N; i++) {
> +    X[i] = i;
> +    Y[i] = 64-i;
> +    __asm__ volatile ("");
> +  }
> +
> +  foo1 (N);
> +
> +  for (i=0; i<N; i++) {
> +    if (result[i] != X[i] * Y[i])
> +      abort ();
> +  }
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_mult_qi_to_hi || vect_unpack } } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { target vect_widen_mult_qi_to_hi_pattern } } } */
> +/* { dg-final { scan-tree-dump-times "pattern recognized" 1 "vect" { target vect_widen_mult_qi_to_hi_pattern } } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> +
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index 7823cc3..f412e2d 100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -529,7 +529,8 @@ vect_handle_widen_op_by_const (gimple stmt, enum tree_code code,
>     Try to find the following pattern:
>
> -     type a_t, b_t;
> +     type1 a_t;
> +
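The T1/T2/T3 scheme described in this message can be sketched in scalar C as follows. This is only an illustration with hypothetical function names, taking T1 = 8-bit, T3 = 16-bit and T2 = 32-bit unsigned types; it is not the vectorizer code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Direct form, as the source loop writes it: both operands are
   converted straight to the result type T2 before multiplying.  */
static uint32_t
mul_direct (uint8_t a, uint8_t b)
{
  return (uint32_t) a * (uint32_t) b;
}

/* Two-step form: a widening multiply into the intermediate type T3
   (twice the size of T1), followed by a conversion to T2.  */
static uint32_t
mul_via_half (uint8_t a, uint8_t b)
{
  uint16_t t3 = (uint16_t) ((uint16_t) a * (uint16_t) b);
  return (uint32_t) t3;
}

/* Exhaustively compare both forms over all 8-bit operand pairs.
   Returns 1 if they always agree, 0 otherwise.  */
static int
widen_mult_forms_agree (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (mul_direct ((uint8_t) a, (uint8_t) b)
          != mul_via_half ((uint8_t) a, (uint8_t) b))
        return 0;
  return 1;
}
```

The two forms agree because the product of two 8-bit values always fits in 16 bits, so the intermediate T3 result loses nothing before it is widened to T2.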
Re: Remove obsolete Solaris 9 support
Uros Bizjak ubiz...@gmail.com writes:

> On Wed, Apr 16, 2014 at 1:16 PM, Rainer Orth r...@cebitec.uni-bielefeld.de wrote:
>> Now that 4.9 has branched, it's time to actually remove the obsolete
>> Solaris 9 configuration. Most of this is just legwork and falls under
>> my Solaris maintainership. A couple of questions, though:
>>
>> * Uros: I'm removing all sse_os_support() checks from the testsuite.
>>   Solaris 9 was the only consumer, so it seems best to do away with it.
>
> This is OK, but please leave sse-os-check.h (and corresponding
> sse_os_support calls) in the testsuite. Just remove the Solaris 9
> specific code from sse-os-check.h and always return 1, perhaps with
> the comment that all currently supported OSes support SSE
> instructions.

Done. I'll repost the final patch once another round of testing has completed.

  Rainer

--
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg along jump threading path
On Thu, Apr 17, 2014 at 1:30 PM, Jeff Law l...@redhat.com wrote:
> On 03/18/14 04:13, bin.cheng wrote:
>> Hi,
>> After control flow graph change made by
>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
>> gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on logical_op_short_circuit
>> targets including cortex-m3/cortex-m0.
>> The regression reveals a missed opportunity in jump threading, which leaves
>> a forwarder basic block that doesn't get removed in cfgcleanup after jump
>> threading in VRP1. Root cause is stated at the corresponding PR:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to it for
>> detailed report.
>>
>> This patch fixes the issue by adding constant value instead of ssa_name as
>> the new phi argument. Bootstrap and test on x86_64, also test on cortex-m3
>> and the regression is gone.
>> I think this should wait for stage1, but would like to hear some comments
>> now. So does it look reasonable?
>>
>> 2014-03-18  Bin Cheng  bin.ch...@arm.com
>>
>>   PR regression/60363
>>   * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
>>   (copy_phi_args): New parameters. Call get_value_locus_in_path.
>>   (update_destination_phis): New parameter.
>>   (create_edge_and_update_destination_phis): Ditto.
>>   (ssa_fix_duplicate_block_edges): Pass new arguments.
>>   (thread_single_edge): Ditto.
>
> This is a good and interesting catch. DOM knows how to propagate these
> context sensitive equivalences which should expose the optimizable forwarder
> blocks.

At the time I was looking into the problem, DOM couldn't understand the equivalence. Maybe it can be improved too.

> But I'm a big believer in catching as many CFG simplifications as early as
> we can as they tend to have nice cascading effects. So if we can pick it up
> by being smarter in how we duplicate arguments, then I'm all for it.
>
>> +  for (int j = idx - 1; j >= 0; j--)
>> +    {
>> +      edge e = (*path)[j]->e;
>> +      if (e->dest == def_bb)
>> +        {
>> +          arg = gimple_phi_arg_def (def_phi, e->dest_idx);
>> +          *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
>> +          return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
>
> Presumably any constant that can legitimately appear in a PHI node is good
> here. So for example ADDR_EXPR <something in static storage> ought to be
> handled as well.
>
> One could also argue that we should go ahead and do a context sensitive copy
> propagation here too if ARG turns out to be an SSA_NAME. You have to be a
> bit more careful with those and use may_propagate_copy_p and you'd probably
> want to test the loop depth of the SSA_NAMEs to ensure you're not doing a
> propagation that is going to muck up LICM. See loop_depth_of_name uses in
> tree-ssa-dom.c.
>
> Overall I think it's good. We just need to resolve whether or not we want
> to catch constant ADDR_EXPRs and/or do the context sensitive copy
> propagations.

Do you mean const/copy propagation in jump threading optimization, or just an independent opt somewhere else? It's naturally flow sensitive along jump threading path, which looks interesting to me.

Thanks,
bin

> jeff

--
Best Regards.
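The backward walk quoted in this message can be modelled in standalone C roughly as below. Everything here is a simplified, hypothetical model (plain structs instead of GCC's edge, PHI and tree types), intended only to show the control flow: walk the threading path backwards, find the edge that enters the block defining the PHI, and take that edge's PHI argument if it is a constant.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for GCC's edge and PHI argument.  */
struct model_edge {
  int dest_bb;   /* index of the block this edge enters */
  int dest_idx;  /* position of this edge among the block's incoming edges */
};

struct model_phi_arg {
  int is_constant;  /* nonzero if the argument is an integer constant */
  long value;       /* the constant, valid when is_constant is set */
};

/* Walk PATH backwards from position IDX looking for the edge that
   enters DEF_BB.  If the PHI argument on that edge is a constant,
   store it in *OUT and return 1.  Otherwise return 0, in which case a
   caller would keep using the original SSA name.  */
static int
backtrace_phi_constant (const struct model_edge *path, int idx, int def_bb,
                        const struct model_phi_arg *phi_args, long *out)
{
  for (int j = idx - 1; j >= 0; j--)
    if (path[j].dest_bb == def_bb)
      {
        const struct model_phi_arg *arg = &phi_args[path[j].dest_idx];
        if (!arg->is_constant)
          return 0;
        *out = arg->value;
        return 1;
      }
  return 0;
}

/* Small demonstration: a three-edge path whose first edge enters
   block 2 through incoming slot 1, where the PHI argument is 42.  */
static long
demo_backtrace (void)
{
  const struct model_edge path[] = { { 2, 1 }, { 5, 0 }, { 7, 0 } };
  const struct model_phi_arg phi_args[] = { { 0, 0 }, { 1, 42 } };
  long value = -1;
  int found = backtrace_phi_constant (path, 3, 2, phi_args, &value);
  assert (found);
  return value;
}
```

Handling other constants, such as the address of a static object, or SSA names under a copy-propagation check, would extend the `is_constant` test; the sketch does not attempt that.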
Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics
On Wed, Jul 03, 2013 at 08:14:25AM +0200, Uros Bizjak wrote:
> On Tue, Jul 2, 2013 at 10:32 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
>> Bootstrap passing.
>> Updated tests passing on BMI-featured HW.
>>
>> ChangeLog:
>> 2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
>>
>>   * config/i386/bmiintrin.h (_blsi_u32): New.
>>   (_blsi_u64): Ditto.
>>   (_blsr_u32): Ditto.
>>   (_blsr_u64): Ditto.
>>   (_blsmsk_u32): Ditto.
>>   (_blsmsk_u64): Ditto.
>>   (_tzcnt_u32): Ditto.
>>   (_tzcnt_u64): Ditto.
>>
>> testsuite/ChangeLog:
>> 2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
>>
>>   * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
>>   Fix scan patterns.
>>   * gcc.target/i386/bmi-2.c: Ditto.
>>
>> [1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01286.html
>
> This is OK for mainline.
>
> BTW: Do we want to backport this patch (and your previous) to 4.8 branch?

Kyrill, you've committed this only to the 4.8 branch and not to the trunk, which means we actually regress on this in 4.9 compared to 4.8.2.

As the patch has been approved, I went ahead and after testing it on x86_64 (-m32/-m64) committed it to the trunk and 4.9.

2014-04-17  Jakub Jelinek  ja...@redhat.com

  PR target/60847
  Forward port from 4.8 branch
  2013-07-19  Kirill Yukhin  kirill.yuk...@intel.com

  * config/i386/bmiintrin.h (_blsi_u32): New.
  (_blsi_u64): Ditto.
  (_blsr_u32): Ditto.
  (_blsr_u64): Ditto.
  (_blsmsk_u32): Ditto.
  (_blsmsk_u64): Ditto.
  (_tzcnt_u32): Ditto.
  (_tzcnt_u64): Ditto.

  * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
  Fix scan patterns.
  * gcc.target/i386/bmi-2.c: Ditto.

--- gcc/config/i386/bmiintrin.h (revision 201046)
+++ gcc/config/i386/bmiintrin.h (revision 201047)
@@ -40,7 +40,6 @@ __tzcnt_u16 (unsigned short __X)
   return __builtin_ctzs (__X);
 }

-
 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __andn_u32 (unsigned int __X, unsigned int __Y)
 {
@@ -66,17 +65,34 @@ __blsi_u32 (unsigned int __X)
 }

 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsi_u32 (unsigned int __X)
+{
+  return __blsi_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __blsmsk_u32 (unsigned int __X)
 {
   return __X ^ (__X - 1);
 }

 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsmsk_u32 (unsigned int __X)
+{
+  return __blsmsk_u32 (__X);
+}
+
+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __blsr_u32 (unsigned int __X)
 {
   return __X & (__X - 1);
 }

+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsr_u32 (unsigned int __X)
+{
+  return __blsr_u32 (__X);
+}

 extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __tzcnt_u32 (unsigned int __X)
@@ -84,6 +100,12 @@ __tzcnt_u32 (unsigned int __X)
   return __builtin_ctz (__X);
 }

+extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_tzcnt_u32 (unsigned int __X)
+{
+  return __builtin_ctz (__X);
+}
+
 #ifdef __x86_64__
 extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -111,22 +133,46 @@ __blsi_u64 (unsigned long long __X)
 }

 extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsi_u64 (unsigned long long __X)
+{
+  return __blsi_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __blsmsk_u64 (unsigned long long __X)
 {
   return __X ^ (__X - 1);
 }

 extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsmsk_u64 (unsigned long long __X)
+{
+  return __blsmsk_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __blsr_u64 (unsigned long long __X)
 {
   return __X & (__X - 1);
 }

 extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_blsr_u64 (unsigned long long __X)
+{
+  return __blsr_u64 (__X);
+}
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __tzcnt_u64 (unsigned long long __X)
 {
   return __builtin_ctzll (__X);
 }
+
+extern __inline unsigned long long __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+_tzcnt_u64 (unsigned long long __X)
+{
+  return __builtin_ctzll (__X);
+}
 #endif /* __x86_64__ */

--- gcc/testsuite/gcc.target/i386/bmi-1.c (revision 201046)
+++ gcc/testsuite/gcc.target/i386/bmi-1.c (revision 201047)
@@ -2,10 +2,10 @@
 /* { dg-options
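For readers unfamiliar with these operations, here is a portable C sketch of what the 32-bit intrinsics compute. The function names are hypothetical and deliberately avoid the reserved `_blsi_u32` style names, and nothing here needs the BMI header or -mbmi. The zero-input result of the trailing-zero count follows the TZCNT instruction (operand size); `__builtin_ctz`, used in the patch, leaves that case undefined.

```c
#include <assert.h>
#include <stdint.h>

/* Isolate the lowest set bit: x & -x.  */
static uint32_t
blsi32 (uint32_t x)
{
  return x & (0u - x);
}

/* Mask covering the lowest set bit and everything below it.  */
static uint32_t
blsmsk32 (uint32_t x)
{
  return x ^ (x - 1u);
}

/* Clear the lowest set bit.  */
static uint32_t
blsr32 (uint32_t x)
{
  return x & (x - 1u);
}

/* Count trailing zero bits; returns 32 for x == 0, matching the
   TZCNT instruction rather than __builtin_ctz.  */
static unsigned
tzcnt32 (uint32_t x)
{
  unsigned n = 0;
  if (x == 0)
    return 32;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```

For the input 0x28 (binary 101000) these give 0x08, 0x0f, 0x20 and 3 respectively.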
Re: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store
On Thu, Apr 17, 2014 at 7:19 AM, Thomas Preud'homme thomas.preudho...@arm.com wrote:
> From: Richard Biener [mailto:richard.guent...@gmail.com]
>>
>> With handling only the outermost handled-component and then only a
>> selected subset you'll catch many but not all cases. Why not simply
>> use get_inner_reference () here (plus stripping the constant offset
>> from an innermost MEM_REF) and get the best of both worlds (not
>> duplicate parts of its logic and handle more cases)? Eventually
>> using tree-affine.c and get_inner_reference_aff is even more appropriate
>> so you can compute the address differences without decomposing
>> them yourself.
>
> Why does the constant offset from an innermost MEM_REF need to be
> stripped? Shouldn't that be part of the offset in the symbolic number?

Yes, but get_inner_reference returns MEM[ptr, constant-offset] as base, thus it doesn't move the constant offset therein to bitpos and doesn't return MEM[ptr, 0]. You have to do that yourself.

(As you are really interested in the _address_ of the memory reference instead of the reference itself it would be appropriate to introduce a variant of get_inner_reference that returns 'ptr' in this case and x for x.field1 for example.)

>>> +  /* Compute address to load from and cast according to the size
>>> +     of the load.  */
>>> +  load_ptr_type = build_pointer_type (load_type);
>>> +  addr_expr = build1 (ADDR_EXPR, load_ptr_type, bswap_src);
>>> +  addr_tmp = make_temp_ssa_name (load_ptr_type, NULL, "load_src");
>>> +  addr_stmt = gimple_build_assign_with_ops
>>> +    (NOP_EXPR, addr_tmp, addr_expr, NULL);
>>> +  gsi_insert_before (gsi, addr_stmt, GSI_SAME_STMT);
>>> +
>>> +  /* Perform the load.  */
>>> +  load_offset_ptr = build_int_cst (load_ptr_type, 0);
>>> +  val_tmp = make_temp_ssa_name (load_type, NULL, "load_dst");
>>> +  val_expr = build2 (MEM_REF, load_type, addr_tmp, load_offset_ptr);
>>> +  load_stmt = gimple_build_assign_with_ops
>>> +    (MEM_REF, val_tmp, val_expr, NULL);
>>
>> this is unnecessarily complex and has TBAA issues. You don't need to
>> create a correct pointer type, so doing
>>
>>   addr_expr = fold_build_addr_expr (bswap_src);
>>
>> is enough. Now, to fix the TBAA issues you either need to remember
>> and combine the reference_alias_ptr_type of each original load and
>> use that for the load_offset_ptr value or decide that isn't worth it and
>> use alias-set zero (use ptr_type_node).
>
> Sorry this is only my second patch [1] to gcc so it's not all clear to me. The
> TBAA issue you mention comes from val_expr referring to a memory area that
> overlaps with the smaller memory area used in the bitwise OR operation, am I
> right? Now, I have no idea about how to do the combination of the values
> returned by reference_alias_ptr_type () for each individual small memory
> area. Can you advise me on this? And what is the effect of not doing it and
> using ptr_type_node for the alias-set?

You can combine two reference_alias_ptr_type()s with

  if (alias_ptr_types_compatible_p (type1, type2))
    return type1;
  else
    return ptr_type_node;

Using ptr_type_node for the alias-set will make it alias with all memory references (that is, type-based disambiguation will be disabled). That's required for example if you combine four loads with type 'short' using a single load with type 'long'.

> [1] First one was a fix on the existing implementation of the bswap pass.
>
>> Can you also expand the comment about "size vs. range"? Is it
>> that range can be bigger than size if you have (short)a[0] |
>> ((short)a[3] << 1) so far where size == 2 but range == 3? Thus
>> range can also be smaller than size for example for
>> (short)a[0] | ((short)a[0] << 1) where range would be 1 and
>> size == 2? I suppose adding two examples like this to the comment,
>> together with the expected value of 'n' would help here.
>
> You understood correctly. I will add the suggested example.
>
>> Otherwise the patch looks good. Now we're only missing the addition
>> of trying to match to a VEC_PERM_EXPR with a constant mask
>> using can_vec_perm_p ;)
>
> Is that the vector shuffle engine you were mentioning in PR54733? If I
> understand correctly it is a generalization of the check against CMPNOP and
> CMPXCHG in find_bswap in this new patchset. I will look if ARM could
> benefit from this and if yes I might take a look (yep, two conditions).

Yep. For example it might match on things like

  int foo (char *x)
  {
    return (((x[0] << 1 | x[0]) << 1) | x[1]) << 1) | x[0];
  }

Not sure if target support for shuffles on small vectors (or vector parts) is working well. Thus on v1si as in the example.

Richard.

> Thanks a lot for such quick and detailed comments after my ping.
>
> Best regards,
>
> Thomas
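As context for what an "endian independent load" looks like at the source level, here is a small C sketch of the byte-wise OR-and-shift idiom that such a pass aims to recognise and turn into a single load, plus a byte swap when the target's byte order differs. The function names and sample data are hypothetical and this is not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Sample buffer holding the value 0x12345678 in little-endian order.  */
static const unsigned char sample_bytes[4] = { 0x78, 0x56, 0x34, 0x12 };

/* Read a 32-bit little-endian value one byte at a time.  The result
   does not depend on the byte order of the host.  */
static uint32_t
load_le32 (const unsigned char *p)
{
  return (uint32_t) p[0]
         | ((uint32_t) p[1] << 8)
         | ((uint32_t) p[2] << 16)
         | ((uint32_t) p[3] << 24);
}

/* Read a 32-bit big-endian value one byte at a time.  */
static uint32_t
load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24)
         | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)
         | (uint32_t) p[3];
}

/* Portable 32-bit byte swap, relating the two loads above.  */
static uint32_t
bswap32_portable (uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

On a little-endian host `load_le32` is equivalent to one plain 32-bit load and `load_be32` to a load followed by a byte swap; on a big-endian host the roles are reversed.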
Re: [PATCH] dwarf2out: Use normal constant values in bound_info if possible.
On Tue, 2014-04-15 at 14:24 -0700, Cary Coutant wrote:
> > +  /* If HOST_WIDE_INT is big enough then represent the bound as
> > +     a constant value.  Note that we need to make sure the type
> > +     is signed or unsigned.  We cannot just add an unsigned
> > +     constant if the value itself is positive.  Some DWARF
> > +     consumers will lookup the bounds type and then sign extend
> > +     any unsigned values found for signed types.  This is only
> > +     for DW_AT_lower_bound, normally unsigned values
> > +     (DW_FORM_data[1248]) are assumed to not need
> > +     sign-extension.  */
>
> This comment confuses me.

Sorry, obviously not my intention. But I see what I was trying to say and how I said it didn't make things very clear. Apologies.

> By "we need to make sure the type is signed or unsigned" (what else can it
> be?), I think you mean "we need to choose a form based on whether the type
> is signed or unsigned".

Yes, right. I was confusing matters in my comment because I was thinking of non-constants (reference or exprlocs) that are handled elsewhere later on in the code.

> And by "This is only for DW_AT_lower_bound, ...", I think you mean "This is
> needed only for DW_AT_{lower,upper}_bound, since for most other attributes,
> consumers will treat DW_FORM_data[1248] as unsigned values, regardless of
> the underlying type."

Yes, right again.

> Otherwise, the patch looks OK to me.

Thanks, I pushed it with the comment changed to how you expressed things. It now reads:

  /* If HOST_WIDE_INT is big enough then represent the bound as
     a constant value.  We need to choose a form based on
     whether the type is signed or unsigned.  We cannot just
     call add_AT_unsigned if the value itself is positive
     (add_AT_unsigned might add the unsigned value encoded as
     DW_FORM_data[1248]).  Some DWARF consumers will lookup the
     bounds type and then sign extend any unsigned values found
     for signed types.  This is needed only for
     DW_AT_{lower,upper}_bound, since for most other attributes,
     consumers will treat DW_FORM_data[1248] as unsigned values,
     regardless of the underlying type.  */

Thanks,

Mark
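The ambiguity the comment describes can be shown with a one-byte constant. The sketch below is a hypothetical model of two DWARF consumers reading the same DW_FORM_data1 byte, one treating it as unsigned and one sign-extending it because the bound's type is signed. It is not code from any real debugger or from dwarf2out.c.

```c
#include <assert.h>
#include <stdint.h>

/* Consumer A: treats the stored byte as an unsigned quantity.  */
static int64_t
read_data1_unsigned (uint8_t raw)
{
  return (int64_t) raw;
}

/* Consumer B: looks up the type of the bound, sees that it is
   signed, and sign-extends the stored byte.  */
static int64_t
read_data1_sign_extended (uint8_t raw)
{
  return raw >= 0x80 ? (int64_t) raw - 256 : (int64_t) raw;
}

/* Returns nonzero if both consumers agree on what RAW means.  */
static int
consumers_agree (uint8_t raw)
{
  return read_data1_unsigned (raw) == read_data1_sign_extended (raw);
}
```

An upper bound of 200 stored in one unsigned byte reads back as 200 for the first consumer and as -56 for the second, which is why the producer has to pick the form from the signedness of the type rather than from the sign of the value.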
Re: [PATCH] Fix PR60849
On Thu, 17 Apr 2014, Richard Biener wrote:

> This fixes PR60849 by properly rejecting non-boolean typed
> comparisons from valid_gimple_rhs_p so they go through the
> gimplification paths.

Could you also accept vector comparisons please?

--
Marc Glisse
Re: [PATCH, i386, PR57623] Introduce synonyms for BMI intrinsics
Thanks! Sorry, missed that!

K

On Thu, Apr 17, 2014 at 2:13 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Jul 03, 2013 at 08:14:25AM +0200, Uros Bizjak wrote:
>> On Tue, Jul 2, 2013 at 10:32 AM, Kirill Yukhin kirill.yuk...@gmail.com wrote:
>>> Bootstrap passing.
>>> Updated tests passing on BMI-featured HW.
>>>
>>> ChangeLog:
>>> 2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
>>>
>>>   * config/i386/bmiintrin.h (_blsi_u32): New.
>>>   (_blsi_u64): Ditto.
>>>   (_blsr_u32): Ditto.
>>>   (_blsr_u64): Ditto.
>>>   (_blsmsk_u32): Ditto.
>>>   (_blsmsk_u64): Ditto.
>>>   (_tzcnt_u32): Ditto.
>>>   (_tzcnt_u64): Ditto.
>>>
>>> testsuite/ChangeLog:
>>> 2013-07-02  Kirill Yukhin  kirill.yuk...@intel.com
>>>
>>>   * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
>>>   Fix scan patterns.
>>>   * gcc.target/i386/bmi-2.c: Ditto.
>>>
>>> [1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg01286.html
>>
>> This is OK for mainline.
>>
>> BTW: Do we want to backport this patch (and your previous) to 4.8 branch?
>
> Kyrill, you've committed this only to the 4.8 branch and not to the trunk,
> which means we actually regress on this in 4.9 compared to 4.8.2.
>
> As the patch has been approved, I went ahead and after testing it on
> x86_64 (-m32/-m64) committed it to the trunk and 4.9.
>
> 2014-04-17  Jakub Jelinek  ja...@redhat.com
>
>   PR target/60847
>   Forward port from 4.8 branch
>   2013-07-19  Kirill Yukhin  kirill.yuk...@intel.com
>
>   * config/i386/bmiintrin.h (_blsi_u32): New.
>   (_blsi_u64): Ditto.
>   (_blsr_u32): Ditto.
>   (_blsr_u64): Ditto.
>   (_blsmsk_u32): Ditto.
>   (_blsmsk_u64): Ditto.
>   (_tzcnt_u32): Ditto.
>   (_tzcnt_u64): Ditto.
>
>   * gcc.target/i386/bmi-1.c: Extend with new instrinsics.
>   Fix scan patterns.
>   * gcc.target/i386/bmi-2.c: Ditto.
Re: [PATCH] Fix PR60849
On Thu, 17 Apr 2014, Marc Glisse wrote:

> On Thu, 17 Apr 2014, Richard Biener wrote:
>
> > This fixes PR60849 by properly rejecting non-boolean typed
> > comparisons from valid_gimple_rhs_p so they go through the
> > gimplification paths.
>
> Could you also accept vector comparisons please?

Sure. Testing in progress.

Richard.

2014-04-17  Richard Biener  rguent...@suse.de

  PR middle-end/60849
  * tree-ssa-propagate.c (valid_gimple_rhs_p): Allow vector
  comparison results and add clarifying comment.

Index: gcc/tree-ssa-propagate.c
===================================================================
--- gcc/tree-ssa-propagate.c (revision 209469)
+++ gcc/tree-ssa-propagate.c (working copy)
@@ -572,9 +572,13 @@ valid_gimple_rhs_p (tree expr)
       break;

     case tcc_comparison:
-      if (!INTEGRAL_TYPE_P (TREE_TYPE (expr))
-          || (TREE_CODE (TREE_TYPE (expr)) != BOOLEAN_TYPE
-              && TYPE_PRECISION (TREE_TYPE (expr)) != 1))
+      /* GENERIC allows comparisons with non-boolean types, reject
+         those for GIMPLE.  Let vector-typed comparisons pass - rules
+         for GENERIC and GIMPLE are the same here.  */
+      if (!(INTEGRAL_TYPE_P (TREE_TYPE (expr))
+            && (TREE_CODE (TREE_TYPE (expr)) == BOOLEAN_TYPE
+                || TYPE_PRECISION (TREE_TYPE (expr)) == 1))
+          && TREE_CODE (TREE_TYPE (expr)) != VECTOR_TYPE)
         return false;

       /* Fallthru.  */
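The revised condition is easier to check when pulled out into a tiny standalone predicate. The sketch below uses a hypothetical enum in place of GCC's tree types and only mirrors the boolean structure of the test in the patch: accept boolean types, 1-bit integral types and vector types, reject everything else.

```c
#include <assert.h>

/* Hypothetical stand-in for the kinds of result type a comparison
   expression may carry.  */
enum model_kind { KIND_BOOLEAN, KIND_INTEGER, KIND_VECTOR, KIND_REAL };

/* Models INTEGRAL_TYPE_P for the kinds above.  */
static int
model_integral_p (enum model_kind kind)
{
  return kind == KIND_BOOLEAN || kind == KIND_INTEGER;
}

/* Mirrors the patched check: returns 0 where the patch would
   "return false", and 1 where it falls through.  */
static int
comparison_type_ok (enum model_kind kind, unsigned precision)
{
  if (!(model_integral_p (kind)
        && (kind == KIND_BOOLEAN || precision == 1))
      && kind != KIND_VECTOR)
    return 0;
  return 1;
}
```

With this shape, a 32-bit integer result is still rejected, while a vector-typed result now passes regardless of the integral test.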
[PATCH] Try to coalesce for unary and binary ops
The patch below increases the number of coalesces we attempt to also cover unary and binary operations. This improves initial code generation for code like

int foo (int i, int j, int k, int l)
{
  int res = i;
  res += j;
  res += k;
  res += l;
  return res;
}

from

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
            (set (reg/v:SI 83 [ res ])
                (plus:SI (reg/v:SI 87 [ i ])
                    (reg/v:SI 88 [ j ])))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
            (set (reg/v:SI 84 [ res ])
                (plus:SI (reg/v:SI 83 [ res ])
                    (reg/v:SI 89 [ k ])))
            (clobber (reg:CC 17 flags))
        ]) t.c:5 -1
     (nil))
...

to

;; res_3 = i_1(D) + j_2(D);

(insn 9 8 0 (parallel [
            (set (reg/v:SI 83 [ res ])
                (plus:SI (reg/v:SI 85 [ i ])
                    (reg/v:SI 86 [ j ])))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (nil))

;; res_5 = res_3 + k_4(D);

(insn 10 9 0 (parallel [
            (set (reg/v:SI 83 [ res ])
                (plus:SI (reg/v:SI 83 [ res ])
                    (reg/v:SI 87 [ k ])))
            (clobber (reg:CC 17 flags))
        ]) t.c:5 -1
     (nil))

re-using the same pseudo for the LHS.

Expansion has special code to improve coalescing of op1 with target, thus this is what we try to match here.

Overall there are positive and negative size effects during a bootstrap on x86_64, but overall it seems to be a loss - stage3 cc1 text size is 18261647 bytes without the patch compared to 18265751 bytes with the patch.

Now the question is what does this tell us? Not re-using the same pseudo as op and target is always better?

Btw, I tried this to find a convincing metric for an intra-BB scheduling pass (during out-of-SSA) on GIMPLE (to be able to kill that odd scheduling code we now have in reassoc). And to have something that TER does not immediately undo we have to disable TER, which conveniently happens for coalesced SSA names. Thus - schedule for register pressure, and thus reduce SSA name lifetime - with the goal that out-of-SSA can do more coalescing. But it won't even try to coalesce anything else than PHI copies (not affected by scheduling) or plain SSA name copies (shouldn't happen anyway due to copy propagation).

So - any ideas? Or is the overall negative for cc1 just an artifact to ignore and we _should_ coalesce as much as possible (even if it doesn't avoid copies - thus the cost of 0 used in the patch)?

Otherwise the patch bootstraps and tests fine on x86_64-unknown-linux-gnu.

Thanks,
Richard.

2014-04-17  Richard Biener  rguent...@suse.de

  * tree-ssa-coalesce.c (create_outofssa_var_map): Try to
  coalesce SSA name uses with SSA name results in all unary
  and binary operations.

Index: gcc/tree-ssa-coalesce.c
===================================================================
*** gcc/tree-ssa-coalesce.c (revision 209469)
--- gcc/tree-ssa-coalesce.c (working copy)
*************** create_outofssa_var_map (coalesce_list_p
*** 991,1007 ****
              case GIMPLE_ASSIGN:
                {
                  tree lhs = gimple_assign_lhs (stmt);
                  tree rhs1 = gimple_assign_rhs1 (stmt);
!                 if (gimple_assign_ssa_name_copy_p (stmt)
                      && gimple_can_coalesce_p (lhs, rhs1))
                    {
                      v1 = SSA_NAME_VERSION (lhs);
                      v2 = SSA_NAME_VERSION (rhs1);
!                     cost = coalesce_cost_bb (bb);
!                     add_coalesce (cl, v1, v2, cost);
                      bitmap_set_bit (used_in_copy, v1);
                      bitmap_set_bit (used_in_copy, v2);
                    }
                }
                break;

--- 993,1031 ----
              case GIMPLE_ASSIGN:
                {
                  tree lhs = gimple_assign_lhs (stmt);
+                 if (TREE_CODE (lhs) != SSA_NAME)
+                   break;
+
+                 /* Expansion handles target == op1 properly and also
+                    target == op2 for commutative binary ops.  */
                  tree rhs1 = gimple_assign_rhs1 (stmt);
!                 enum tree_code code = gimple_assign_rhs_code (stmt);
!                 enum gimple_rhs_class klass = get_gimple_rhs_class (code);
!                 if (TREE_CODE (rhs1) == SSA_NAME
                      && gimple_can_coalesce_p (lhs, rhs1))
                    {
                      v1 = SSA_NAME_VERSION (lhs);
                      v2 = SSA_NAME_VERSION (rhs1);
!                     add_coalesce (cl, v1, v2,
!                                   klass == GIMPLE_SINGLE_RHS
!                                   ? coalesce_cost_bb (bb) : 0);
                      bitmap_set_bit (used_in_copy, v1);
                      bitmap_set_bit (used_in_copy, v2);
                    }
+                 if (klass == GIMPLE_BINARY_RHS
+                     && commutative_tree_code (code))
+                   {
+
Re: [PATCH] Try to coalesce for unary and binary ops
On Thu, 17 Apr 2014, Richard Biener wrote: The patch below increases the number of coalescs we attempt to also cover unary and binary operations. This improves initial code generation for code like int foo (int i, int j, int k, int l) { int res = i; res += j; res += k; res += l; return res; } from ;; res_3 = i_1(D) + j_2(D); (insn 9 8 0 (parallel [ (set (reg/v:SI 83 [ res ]) (plus:SI (reg/v:SI 87 [ i ]) (reg/v:SI 88 [ j ]))) (clobber (reg:CC 17 flags)) ]) t.c:4 -1 (nil)) ;; res_5 = res_3 + k_4(D); (insn 10 9 0 (parallel [ (set (reg/v:SI 84 [ res ]) (plus:SI (reg/v:SI 83 [ res ]) (reg/v:SI 89 [ k ]))) (clobber (reg:CC 17 flags)) ]) t.c:5 -1 (nil)) ... to ;; res_3 = i_1(D) + j_2(D); (insn 9 8 0 (parallel [ (set (reg/v:SI 83 [ res ]) (plus:SI (reg/v:SI 85 [ i ]) (reg/v:SI 86 [ j ]))) (clobber (reg:CC 17 flags)) ]) t.c:4 -1 (nil)) ;; res_5 = res_3 + k_4(D); (insn 10 9 0 (parallel [ (set (reg/v:SI 83 [ res ]) (plus:SI (reg/v:SI 83 [ res ]) (reg/v:SI 87 [ k ]))) (clobber (reg:CC 17 flags)) ]) t.c:5 -1 (nil)) re-using the same pseudo for the LHS. Expansion has special code to improve coalescing of op1 with target thus this is what we try to match here. Overall there are positive and negative size effects during a bootstrap on x86_64, but overall it seems to be a loss - stage3 cc1 text size is 18261647 bytes without the patch compared to 18265751 bytes with the patch. Now the question is what does this tell us? Not re-using the same pseudo as op and target is always better? Btw, I tried this to find a convincing metric for a intra-BB scheduling pass (during out-of-SSA) on GIMPLE (to be able to kill that odd scheduling code we now have in reassoc). And to have sth that TER not immediately un-does we have to disable TER which conveniently happens for coalesced SSA names. Thus - schedule for register pressure, and thus reduce SSA name lifetime - with the goal that out-of-SSA can do more coalescing. 
But it won't even try to coalesce anything else than PHI copies (not affected by scheduling) or plain SSA name copies (shouldn't happen anyway due to copy propagation). So - any ideas? Or is the overall negative for cc1 just an artifact to ignore and we _should_ coalesce as much as possible (even if it doesn't avoid copies - thus the cost of 0 used in the patch)?

One example where it delivers bad initial expansion on x86_64 is

int foo (int *p)
{
  int res = p[0];
  res += p[1];
  res += p[2];
  res += p[3];
  return res;
}

where i386.c:ix86_fixup_binary_operands tries to be clever and improve address combine, generating two instructions for (plus:SI (reg/v:SI 83 [ res ]) (mem:SI (...))) and thus triggering expand_binop_directly

  pat = maybe_gen_insn (icode, 3, ops);
  if (pat)
    {
      /* If PAT is composed of more than one insn, try to add an appropriate
         REG_EQUAL note to it.  If we can't because TEMP conflicts with an
         operand, call expand_binop again, this time without a target.  */
      if (INSN_P (pat) && NEXT_INSN (pat) != NULL_RTX
          && ! add_equal_note (pat, ops[0].value, optab_to_code (binoptab),
                               ops[1].value, ops[2].value))
        {
          delete_insns_since (last);
          return expand_binop (mode, binoptab, op0, op1, NULL_RTX,
                               unsignedp, methods);
        }

and thus we end up with

(insn 9 6 10 (set (reg:SI 91)
        (mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
                (const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 A32])) t.c:4 -1
     (nil))
(insn 10 9 11 (parallel [
            (set (reg:SI 90)
                (plus:SI (reg/v:SI 83 [ res ])
                    (reg:SI 91)))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 -1
     (expr_list:REG_EQUAL (plus:SI (reg/v:SI 83 [ res ])
            (mem:SI (plus:DI (reg/v/f:DI 88 [ p ])
                    (const_int 4 [0x4])) [0 MEM[(int *)p_2(D) + 4B]+0 S4 A32]))
        (nil)))
(insn 11 10 0 (set (reg/v:SI 83 [ res ])
        (reg:SI 90)) t.c:4 -1
     (nil))

Unpatched we avoid the last move (the tiny testcase of course ends up optimizing the same anyway). Not sure if that strong desire to add a REG_EQUAL note makes up for the losses.
At least it looks backwards to the code preceding it:

  /* If operation is commutative, try to make the first operand a register.
     Even better, try to make it the same as the target.
     Also try to make the last operand a constant.  */
  if (commutative_p
      && swap_commutative_operands_with_target (target, xop0, xop1))
    {
      swap = xop1;
      xop1 = xop0;
      xop0 = swap;
    }
RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
Hello, My apologies for inconvenience. Removed every appearance of -ftree-bitfield-merge from the patch and fixed an issue with unions. The rest of the patch is the same as before. Regards, Zoran Jovanovic -- Lowering is applied only for bit-fields copy sequences that are merged. Data structure representing bit-field copy sequences is renamed and reduced in size. Optimization turned on by default for -O2 and higher. Some comments fixed. Benchmarking performed on WebKit for Android. Code size reduction noticed on several files, best examples are: core/rendering/style/StyleMultiColData (632-520 bytes) core/platform/graphics/FontDescription (1715-1475 bytes) core/rendering/style/FillLayer (5069-4513 bytes) core/rendering/style/StyleRareInheritedData (5618-5346) core/css/CSSSelectorList(4047-3887) core/platform/animation/CSSAnimationData (3844-3440 bytes) core/css/resolver/FontBuilder (13818-13350 bytes) core/platform/graphics/Font (16447-15975 bytes) Example: One of the motivating examples for this work was copy constructor of the class which contains bit-fields. 
C++ code:

class A
{
public:
  A(const A &x);
  unsigned a : 1;
  unsigned b : 2;
  unsigned c : 4;
};

A::A(const A &x)
{
  a = x.a;
  b = x.b;
  c = x.c;
}

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:

  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:

  lw    $3,0($5)
  lbu   $2,0($4)
  andi  $3,$3,0x1
  andi  $2,$2,0xfe
  or    $2,$2,$3
  sb    $2,0($4)
  lw    $3,0($5)
  andi  $2,$2,0xf9
  andi  $3,$3,0x6
  or    $2,$2,$3
  sb    $2,0($4)
  lw    $3,0($5)
  andi  $2,$2,0x87
  andi  $3,$3,0x78
  or    $2,$2,$3
  j     $31
  sb    $2,0($4)

Optimized MIPS32r2 assembly code:

  lw    $3,0($5)
  lbu   $2,0($4)
  andi  $3,$3,0x7f
  andi  $2,$2,0x80
  or    $2,$3,$2
  j     $31
  sb    $2,0($4)

Algorithm works on basic block level and consists of following 3 major steps:
1. Go through basic block statements list. If there are statement pairs that implement copy of bit field content from one memory location to another record statements pointers and other necessary data in corresponding data structure.
2. Identify records that represent adjacent bit field accesses and mark them as merged.
3. Lower bit-field accesses by using new field size for those that can be merged.

New command line option -fmerge-bitfields is introduced.

Tested - passed gcc regression tests for MIPS32r2.

Changelog - gcc/ChangeLog: 2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com) * common.opt (fmerge-bitfields): New option. * doc/invoke.texi: Add reference to -fmerge-bitfields. * tree-sra.c (lower_bitfields): New function. Entry for (-fmerge-bitfields). (part_of_union_p): New function. (bf_access_candidate_p): New function. (lower_bitfield_read): New function. (lower_bitfield_write): New function. (bitfield_stmt_bfcopy_pair::hash): New function. (bitfield_stmt_bfcopy_pair::equal): New function.
(bitfield_stmt_bfcopy_pair::remove): New function. (create_and_insert_bfcopy): New function. (get_bit_offset): New function. (add_stmt_bfcopy_pair): New function. (cmp_bfcopies): New function. (get_merged_bit_field_size): New function. * dwarf2out.c (simple_type_size_in_bits): Move to tree.c. (field_byte_offset): Move declaration to tree.h and make it extern. * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test. * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test. * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c. * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h. * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c. (simple_type_size_in_bits): Move from dwarf2out.c. * tree.h (expressions_equal_p): Add declaration. (field_byte_offset): Add declaration. Patch - diff --git a/gcc/common.opt b/gcc/common.opt index da275e5..52c7f58 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2203,6 +2203,10 @@ ftree-sra Common Report Var(flag_tree_sra) Optimization Perform scalar replacement of aggregates +fmerge-bitfields +Common Report Var(flag_tree_bitfield_merge) Optimization +Merge loads and stores of consecutive bitfields + ftree-ter Common Report Var(flag_tree_ter) Optimization Replace temporary expressions in the SSA-normal pass diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index
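The effect of merging adjacent bit-field copies, as described in the -fmerge-bitfields message above, can be illustrated with a small C sketch. This is not code from the patch: the function names and the bit layout (a in bit 0, b in bits 1-2, c in bits 3-6, bit 7 unrelated) are assumptions chosen to match the masks in the MIPS listing (0x1, 0x6, 0x78, merged 0x7f). Explicit masks are used instead of real bit-fields because bit-field layout is implementation-defined.

```c
/* Hypothetical layout mirroring the masks in the MIPS listing:
   a = bit 0, b = bits 1-2, c = bits 3-6; bit 7 holds unrelated data. */
#define MASK_A   0x01u
#define MASK_B   0x06u
#define MASK_C   0x78u
#define MASK_ABC (MASK_A | MASK_B | MASK_C)   /* 0x7f */

/* Copy one field: clear it in dst, then OR in the matching bits of src. */
static unsigned char copy_field(unsigned char dst, unsigned char src,
                                unsigned mask)
{
    return (unsigned char)((dst & ~mask) | (src & mask));
}

/* Unoptimized shape: one read-modify-write per field. */
unsigned char copy_fields_separately(unsigned char dst, unsigned char src)
{
    dst = copy_field(dst, src, MASK_A);
    dst = copy_field(dst, src, MASK_B);
    dst = copy_field(dst, src, MASK_C);
    return dst;
}

/* Merged shape: the three adjacent fields are copied as one 7-bit field,
   keeping only bit 7 of the destination. */
unsigned char copy_fields_merged(unsigned char dst, unsigned char src)
{
    return (unsigned char)((dst & 0x80u) | (src & MASK_ABC));
}
```

Both functions return the same byte for every dst/src pair, which is why three load/mask/store sequences can collapse into one.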
[RFC][PING^2] Do not consider volatile asms as optimization barriers #1
-- From: Yury Gribov y.gri...@samsung.com Sent: Tuesday, March 25, 2014 11:57AM To: Jakub Jelinek ja...@redhat.com, Eric Botcazou ebotca...@adacore.com, gcc-patches@gcc.gnu.org, Hans-Peter Nilsson h...@bitrange.com, rdsandif...@googlemail.com Subject: Re: [RFC] Do not consider volatile asms as optimization barriers #1 On 03/25/2014 11:57 AM, Yury Gribov wrote: Jakub Jelinek wrote: Richard Sandiford wrote: OK, how about this? It looks like the builtins.c and stmt.c stuff wasn't merged until 4.9, and at this stage it seemed safer to just add the same use/clobber sequence to both places. Please wait a little bit, the patch has been committed to the trunk only very recently, we want to see if it has any fallout. It has been two weeks since Richard committed this to trunk. Perhaps it's ok to backport to 4.8 branch now? -Y Link to original email: http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01306.html
Re: Patch ping
On Wed, Apr 16, 2014 at 02:45:37PM -0400, DJ Delorie wrote: I'll approve both patches, if you agree to think about a way to solve this problem without module-specific configury changes for each such command line option. I understand the usefulness of having instrumentation, but the configure hack is a hack. Only the second patch I'd consider a hack, the first patch merely makes sure the POSTSTAGE1_LDFLAGS stuff actually isn't eaten by libtool. I'll think about other options for the second patch. Note that in a combined tree this isn't a problem, because we'd just instrument the linker at the same time. Only if you never use the plugin from the combined tree build with any other linker. Add -B ../ to some other linker and suddenly it will crash. Jakub
Re: [PATCH] Make SRA tolerate most throwing statements
On Wed, Apr 16, 2014 at 11:22:28AM +0200, Richard Biener wrote: On Tue, 15 Apr 2014, Martin Jambor wrote: Hi, back in January in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed out a testcase where the problem was SRA not scalarizing an aggregate because it was involved in a throwing statement. The reason is that SRA is likely to need to append new statements after each one where a replaced aggregate is present, but throwing statements must end their BBs. This patch comes up with a fix for most such situations by adding these new statements onto a single successor non-EH edge, if there is one and only one such edge. I have bootstrapped and tested a very similar version on x86_64-linux, bootstrap and testing of this exact one is currently underway. OK for trunk? Eric, if and once this gets in, can you please add the testcase from your original post to the suite? Thanks, Martin 2014-04-15 Martin Jambor mjam...@suse.cz * tree-sra.c (single_non_eh_succ): New function. (disqualify_ops_if_throwing_stmt): Renamed to disqualify_if_bad_bb_terminating_stmt. Allow throwing statements having one non-EH successor BB. (gsi_for_eh_followups): New function. (sra_modify_expr): If stmt ends bb, use single non-EH successor to generate loads into replacements. (sra_modify_assign): Likewise and and also use the simple path for such statements. (sra_modify_function_body): Iterate safely over BBs. ... @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr) return get_var_base_offset_size_access (base, offset, max_size); } +/* Split the single non-EH successor edge from BB (there must be exactly one) + and return a gimple iterator to the new block. 
*/ + +static gimple_stmt_iterator +gsi_for_eh_followups (basic_block bb) +{ + edge e = single_non_eh_succ (bb); + gcc_assert (e); + + basic_block new_bb = split_edge (e); + return gsi_start_bb (new_bb); +} + /* Replace the expression EXPR with a scalar replacement if there is one and generate other statements to do type conversion or subtree copying if necessary. GSI is used to place newly created statements, WRITE is true if @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write) type = TREE_TYPE (*expr); loc = gimple_location (gsi_stmt (*gsi)); + gimple_stmt_iterator alt_gsi = gsi_none (); + if (write stmt_ends_bb_p (gsi_stmt (*gsi))) +{ + alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi)); + gsi = alt_gsi; I think you should try to either use gsi_insert_on_edge_immediate (yeah, bad we can't build a gsi_for_edge_insert ()) or add a gsi_for_edge_insert () building on gimple_find_edge_insert_loc (note the before/after flag that returns - gsi_insert_* variants that take a flag specifying after/before would come handy here). You could also add a flag to gimple_find_edge_insert_loc whether it always should be possible to use gsi_insert_after and split the block in some more cases (or split it if both after and before inserts should be valid, but that would not split in the very rare case of an empty successor only). Basically usually you can avoid splitting the edge. The following patch adds gsi_start_edge for that purpose and uses it together with gsi_commit_edge_inserts from within SRA. I did not make it an inline static function in the header like the other gsi initializing functions because that would make gimple-iterator.h depend on tree-cfg.h and with our current flat includes that triggered changes of includes in half a gazillion unrelated c files (I have that patch too because I was apparently too lazy to think before the third coffee yesterday but I do not think it is worth it). 
Bootstrapped and tested on x86_64-linux, this time it also includes Eric's testcase. OK for trunk? Thanks, Martin 2014-04-16 Martin Jambor mjam...@suse.cz * gimple-iterator.c (gsi_start_edge): New function. * gimple-iterator.h (gsi_start_edge): Declare. * tree-sra.c (single_non_eh_succ): New function. (disqualify_ops_if_throwing_stmt): Renamed to disqualify_if_bad_bb_terminating_stmt. Allow throwing statements having one non-EH successor BB. (sra_modify_expr): If stmt ends bb, use single non-EH successor to generate loads into replacements. (sra_modify_assign): Likewise and and also use the simple path for such statements. (sra_modify_function_body): Commit statements on edges. testsuite/ * gnat.dg/opt34.adb: New. * gnat.dg/opt34_pkg.ads: Likewise. diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c index 1cfeb73..8a1ec53 100644 --- a/gcc/gimple-iterator.c +++ b/gcc/gimple-iterator.c @@ -689,6 +689,15 @@ gsi_insert_seq_on_edge (edge e, gimple_seq seq) gimple_seq_add_seq
Re: [PATCH] Make SRA tolerate most throwing statements
On Thu, Apr 17, 2014 at 2:21 PM, Martin Jambor mjam...@suse.cz wrote: On Wed, Apr 16, 2014 at 11:22:28AM +0200, Richard Biener wrote: On Tue, 15 Apr 2014, Martin Jambor wrote: Hi, back in January in http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00848.html Eric pointed out a testcase where the problem was SRA not scalarizing an aggregate because it was involved in a throwing statement. The reason is that SRA is likely to need to append new statements after each one where a replaced aggregate is present, but throwing statements must end their BBs. This patch comes up with a fix for most such situations by adding these new statements onto a single successor non-EH edge, if there is one and only one such edge. I have bootstrapped and tested a very similar version on x86_64-linux, bootstrap and testing of this exact one is currently underway. OK for trunk? Eric, if and once this gets in, can you please add the testcase from your original post to the suite? Thanks, Martin 2014-04-15 Martin Jambor mjam...@suse.cz * tree-sra.c (single_non_eh_succ): New function. (disqualify_ops_if_throwing_stmt): Renamed to disqualify_if_bad_bb_terminating_stmt. Allow throwing statements having one non-EH successor BB. (gsi_for_eh_followups): New function. (sra_modify_expr): If stmt ends bb, use single non-EH successor to generate loads into replacements. (sra_modify_assign): Likewise and and also use the simple path for such statements. (sra_modify_function_body): Iterate safely over BBs. ... @@ -2734,6 +2758,19 @@ get_access_for_expr (tree expr) return get_var_base_offset_size_access (base, offset, max_size); } +/* Split the single non-EH successor edge from BB (there must be exactly one) + and return a gimple iterator to the new block. 
*/ + +static gimple_stmt_iterator +gsi_for_eh_followups (basic_block bb) +{ + edge e = single_non_eh_succ (bb); + gcc_assert (e); + + basic_block new_bb = split_edge (e); + return gsi_start_bb (new_bb); +} + /* Replace the expression EXPR with a scalar replacement if there is one and generate other statements to do type conversion or subtree copying if necessary. GSI is used to place newly created statements, WRITE is true if @@ -2763,6 +2800,13 @@ sra_modify_expr (tree *expr, gimple_stmt_iterator *gsi, bool write) type = TREE_TYPE (*expr); loc = gimple_location (gsi_stmt (*gsi)); + gimple_stmt_iterator alt_gsi = gsi_none (); + if (write stmt_ends_bb_p (gsi_stmt (*gsi))) +{ + alt_gsi = gsi_for_eh_followups (gsi_bb (*gsi)); + gsi = alt_gsi; I think you should try to either use gsi_insert_on_edge_immediate (yeah, bad we can't build a gsi_for_edge_insert ()) or add a gsi_for_edge_insert () building on gimple_find_edge_insert_loc (note the before/after flag that returns - gsi_insert_* variants that take a flag specifying after/before would come handy here). You could also add a flag to gimple_find_edge_insert_loc whether it always should be possible to use gsi_insert_after and split the block in some more cases (or split it if both after and before inserts should be valid, but that would not split in the very rare case of an empty successor only). Basically usually you can avoid splitting the edge. The following patch adds gsi_start_edge for that purpose and uses it together with gsi_commit_edge_inserts from within SRA. I did not make it an inline static function in the header like the other gsi initializing functions because that would make gimple-iterator.h depend on tree-cfg.h and with our current flat includes that triggered changes of includes in half a gazillion unrelated c files (I have that patch too because I was apparently too lazy to think before the third coffee yesterday but I do not think it is worth it). 
Bootstrapped and tested on x86_64-linux, this time it also includes Eric's testcase. OK for trunk? Ok. Thanks, Richard. Thanks, Martin 2014-04-16 Martin Jambor mjam...@suse.cz * gimple-iterator.c (gsi_start_edge): New function. * gimple-iterator.h (gsi_start_edge): Declare. * tree-sra.c (single_non_eh_succ): New function. (disqualify_ops_if_throwing_stmt): Renamed to disqualify_if_bad_bb_terminating_stmt. Allow throwing statements having one non-EH successor BB. (sra_modify_expr): If stmt ends bb, use single non-EH successor to generate loads into replacements. (sra_modify_assign): Likewise and and also use the simple path for such statements. (sra_modify_function_body): Commit statements on edges. testsuite/ * gnat.dg/opt34.adb: New. * gnat.dg/opt34_pkg.ads: Likewise. diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c index 1cfeb73..8a1ec53 100644 --- a/gcc/gimple-iterator.c +++
Re: [PATCH 1/6] remove properties stuff from register_dump_files_1
On Thu, Apr 17, 2014 at 10:53:07AM +0200, Richard Biener wrote: On Thu, Apr 17, 2014 at 10:37 AM, tsaund...@mozilla.com wrote: From: Trevor Saunders tsaund...@mozilla.com Hi, just removing some dead code. bootstrapped + regtested against r209414 on x86_64-unknown-linux-gnu, ok? Ok. Thanks for the quick reviews! committed as r209477 - 209482 Trev Thanks, Richard. Trev 2014-03-19 Trevor Saunders tsaund...@mozilla.com * pass_manager.h (pass_manager::register_dump_files_1): Adjust. * passes.c (pass_manager::register_dump_files_1): Remove dead code dealing with properties. (pass_manager::register_dump_files): Adjust. diff --git a/gcc/pass_manager.h b/gcc/pass_manager.h index e1d8143..8309567 100644 --- a/gcc/pass_manager.h +++ b/gcc/pass_manager.h @@ -91,7 +91,7 @@ public: private: void set_pass_for_id (int id, opt_pass *pass); - int register_dump_files_1 (opt_pass *pass, int properties); + void register_dump_files_1 (opt_pass *pass); void register_dump_files (opt_pass *pass, int properties); private: diff --git a/gcc/passes.c b/gcc/passes.c index 60fb135..3f9590a 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -708,33 +708,21 @@ pass_manager::register_one_dump_file (opt_pass *pass) /* Recursive worker function for register_dump_files. */ -int +void pass_manager:: -register_dump_files_1 (opt_pass *pass, int properties) +register_dump_files_1 (opt_pass *pass) { do { - int new_properties = (properties | pass-properties_provided) - ~pass-properties_destroyed; - if (pass-name pass-name[0] != '*') register_one_dump_file (pass); if (pass-sub) -new_properties = register_dump_files_1 (pass-sub, new_properties); - - /* If we have a gate, combine the properties that we could have with - and without the pass being examined. */ - if (pass-has_gate) -properties = new_properties; - else -properties = new_properties; +register_dump_files_1 (pass-sub); pass = pass-next; } while (pass); - - return properties; } /* Register the dump files for the pass_manager starting at PASS. 
@@ -746,7 +734,7 @@ pass_manager:: register_dump_files (opt_pass *pass,int properties) { pass->properties_required |= properties; - register_dump_files_1 (pass, properties); + register_dump_files_1 (pass); } struct pass_registry -- 1.9.2
Changes for if-convert to recognize simple conditional reduction.
Hi All, We implemented an enhancement for the if-convert phase to recognize the simplest conditional reduction and to transform it into a vectorizable form, e.g. the statement if (A[i] != 0) num += 1; will be recognized. A new test-case is also provided. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? gcc/ChangeLog: 2014-04-17 Yuri Rumyantsev ysrum...@gmail.com * tree-if-conv.c (is_cond_scalar_reduction): New function. (convert_scalar_cond_reduction): Likewise. (predicate_scalar_phi): Add recognition and transformation of simple conditional reduction to be vectorizable. gcc/testsuite/ChangeLog: 2014-04-17 Yuri Rumyantsev ysrum...@gmail.com * gcc.dg/cond-reduc.c: New test. if-conv-cond-reduc.patch Description: Binary data
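To make the conditional reduction from the message above concrete, here is a minimal C sketch of the loop before and after if-conversion. The function names are made up for illustration, and the second form only shows the general idea (an unconditional add of a selected value); it is not necessarily the exact statement sequence the patch generates.

```c
#include <stddef.h>

/* Original form: the reduction variable is updated under a branch,
   which blocks straightforward vectorization of the loop body. */
int count_nonzero_branch(const int *a, size_t n)
{
    int num = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            num += 1;
    return num;
}

/* If-converted form: the addition always executes and the condition
   only selects the addend (1 or 0), so the body is straight-line code. */
int count_nonzero_select(const int *a, size_t n)
{
    int num = 0;
    for (size_t i = 0; i < n; i++)
        num += (a[i] != 0) ? 1 : 0;
    return num;
}
```

Both functions compute the same count; the second shape is the kind of loop body a vectorizer can treat as a plain sum reduction.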
Re: [PATCH] Redesign jump threading profile updates
On Wed, Apr 16, 2014 at 10:39 PM, Jeff Law l...@redhat.com wrote: On 03/26/14 17:44, Teresa Johnson wrote: Recently I discovered that the profile updates being performed by jump threading were incorrect in many cases, particularly in the case where the threading path contains a joiner. Some of the duplicated blocks/edges were not getting any counts, leading to incorrect function splitting and other downstream optimizations, and there were other insanities as well. After making a few attempts to fix the handling I ended up completely redesigning the profile update code, removing a few places throughout the code where it was attempting to do some updates. The profile updates in that code is a mess. It's never really been looked at in any systematic way, what's there is ad-hoc and usually in response to someone mentioning the profile data was incorrectly updated. As we rely more and more on that data the ad-hoc updating is going to cause us more and more pain. So any work in this space is going to be greatly appreciated. I'll have to look at this in some detail. But I wanted you to know I was aware of the work and it's in my queue. Great, thanks for the update! I realize that it is not a trivial change so it would take some time to get through. Hopefully it should address the ongoing profile fixup issues. Teresa Thanks! jeff -- Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
[PING] [PATCH] Fix for PR libstdc++/60758
Hi, This fixes an infinite backtrace in __cxa_end_cleanup(). Regtest finished with no regressions on arm-linux-gnueabi(sf). The patch was posted at: http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00496.html Thanks in advance. Best regards, Merzlyakov Alexey
[ PATCH] Extend mode-switching to support toggle (1/2)
Hello, Here is a new version of the patch. It hookizes the mode-setting and mode-toggling macros, and is split in 2 parts. Successfully bootstrapped/regtested on ix86 and SH4/SH4a. I was only able to do a limited build on Epiphany; if someone could give it a try there, that would be great. Comments? Suggestions? Many thanks, Christian 2014-04-02 Christian Bruel christian.br...@st.com * target.def (mode_switching): New hook vector. (mode_emit, mode_needed, mode_after, mode_entry): New hooks. (mode_exit, modepriority_to_mode): Likewise. * mode-switching.c (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Hookify. (MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise. (default_priority_to_mode): Define. * targhooks.h (default_priority_to_mode): Declare. * target.h: Include tm.h and hard-reg-set.h. * doc/tm.texi.in (EMIT_MODE_SET, MODE_NEEDED, MODE_AFTER, MODE_ENTRY) (MODE_EXIT, MODE_PRIORITY_TO_MODE): Delete and hookify. * doc/tm.texi: Regenerate. * config/sh/sh.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete (MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise. * config/sh/sh.c (emit_fpu_toggle): New function. (sh4_emit_mode_set, sh4_mode_needed): Hookify. (sh4_mode_after, sh4_mode_entry, sh4_mode_exit): Likewise. * config/i386/i386.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete (MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise. * config/i386/i386-protos.h (ix86_mode_needed, ix86_mode_after) (ix86_mode_entry, ix86_emit_mode_set): Remove external declaration. * config/i386/i386.c (ix86_mode_needed, ix86_mode_after, ix86_mode_exit, (ix86_mode_entry, ix86_mode_priority, ix86_emit_mode_set): Hookify. * config/epiphany/epiphany.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete (MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise. * config/sh/sh.h (MODE_NEEDED, MODE_AFTER, MODE_ENTRY): Delete (MODE_EXIT, MODE_PRIORITY_TO_MODE, EMIT_MODE_SET): Likewise. * config/sh/sh.c (sh4_emit_mode_set, sh4_mode_needed): Hookify. 
(sh4_mode_after, sh4_mode_entry, sh4_mode_exit): Likewise. * config/epiphany/epiphany-protos.h (epiphany_mode_needed) (emit_set_fp_mode, epiphany_mode_entry_exit, epiphany_mode_after) (epiphany_mode_priority_to_mode): Remove declaration. * config/epiphany/epiphany.c (emit_set_fp_mode): Hookify. (epiphany_mode_needed, epiphany_mode_priority_to_mode): Likewise. (epiphany_mode_entry, epiphany_mode_exit, epiphany_mode_after): Likewise. (epiphany_mode_priority_to_mode): Change priority type. Hookify. (epiphany_mode_needed, epiphany_mode_entry_exit): Hookify. (epiphany_mode_after, epiphany_mode_entry, emit_set_fp_mode): Hookify. --- gcc/config/epiphany/epiphany-protos.h (revision 209415) +++ gcc/config/epiphany/epiphany-protos.h (working copy) @@ -45,9 +45,7 @@ extern void emit_set_fp_mode (int entity, int mode extern void epiphany_insert_mode_switch_use (rtx insn, int, int); extern void epiphany_expand_set_fp_mode (rtx *operands); extern int epiphany_mode_needed (int entity, rtx insn); -extern int epiphany_mode_entry_exit (int entity, bool); extern int epiphany_mode_after (int entity, int last_mode, rtx insn); -extern int epiphany_mode_priority_to_mode (int entity, unsigned priority); extern bool epiphany_epilogue_uses (int regno); extern bool epiphany_optimize_mode_switching (int entity); extern bool epiphany_is_interrupt_p (tree); --- gcc/config/epiphany/epiphany.c (revision 209415) +++ gcc/config/epiphany/epiphany.c (working copy) @@ -152,6 +152,20 @@ static rtx frame_insn (rtx); /* We further restrict the minimum to be a multiple of eight. */ #define TARGET_MIN_ANCHOR_OFFSET (optimize_size ? 0 : -2040) +/* Mode switching hooks. 
*/ + +#define TARGET_MODE_EMIT emit_set_fp_mode + +#define TARGET_MODE_NEEDED epiphany_mode_needed + +#define TARGET_MODE_PRIORITY epiphany_mode_priority + +#define TARGET_MODE_ENTRY epiphany_mode_entry + +#define TARGET_MODE_EXIT epiphany_mode_exit + +#define TARGET_MODE_AFTER epiphany_mode_after + #include target-def.h #undef TARGET_ASM_ALIGNED_HI_OP @@ -2306,8 +2320,8 @@ epiphany_optimize_mode_switching (int entity) gcc_unreachable (); } -int -epiphany_mode_priority_to_mode (int entity, unsigned priority) +static int +epiphany_mode_priority (int entity, int priority) { if (entity == EPIPHANY_MSW_ENTITY_AND || entity == EPIPHANY_MSW_ENTITY_OR || entity== EPIPHANY_MSW_ENTITY_CONFIG) @@ -2415,7 +2429,7 @@ epiphany_mode_needed (int entity, rtx insn) } } -int +static int epiphany_mode_entry_exit (int entity, bool exit) { int normal_mode = epiphany_normal_fp_mode ; @@ -2502,6 +2516,18 @@ epiphany_mode_after (int entity, int last_mode, rt return last_mode; } +static int +epiphany_mode_entry (int entity) +{ + return epiphany_mode_entry_exit (entity, false); +} + +static int +epiphany_mode_exit (int entity) +{ + return epiphany_mode_entry_exit (entity, true); +} + void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED) { ---
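For readers unfamiliar with what "hookizing" a macro means in the message above, the following is a rough, self-contained C sketch of the pattern: callers go through a table of function pointers instead of a compile-time macro. Everything here (struct mode_switching_hooks, the demo_* functions, the returned values) is hypothetical and is not GCC's actual target hook machinery; it only mirrors how the patch wraps a combined entry/exit helper in two separate hook functions.

```c
/* Hypothetical hook table standing in for a target hook vector. */
struct mode_switching_hooks {
    int (*mode_entry)(int entity);  /* mode required on function entry */
    int (*mode_exit)(int entity);   /* mode required on function exit */
};

/* A combined helper, analogous in shape to epiphany_mode_entry_exit. */
static int demo_mode_entry_exit(int entity, int is_exit)
{
    /* Arbitrary demo values: exit modes are offset by 100. */
    return is_exit ? entity + 100 : entity;
}

/* Thin wrappers with the per-hook signatures. */
static int demo_mode_entry(int entity)
{
    return demo_mode_entry_exit(entity, 0);
}

static int demo_mode_exit(int entity)
{
    return demo_mode_entry_exit(entity, 1);
}

/* The "target" fills in the table once. */
static const struct mode_switching_hooks demo_hooks = {
    demo_mode_entry,
    demo_mode_exit
};

/* Generic code calls through the table rather than expanding a macro. */
int mode_at_entry(const struct mode_switching_hooks *h, int entity)
{
    return h->mode_entry(entity);
}

int mode_at_exit(const struct mode_switching_hooks *h, int entity)
{
    return h->mode_exit(entity);
}
```

The practical benefit of this shape is that the generic pass no longer needs target-specific macros at compile time; it only needs the table.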
Re: [PATCH] Try to coalesce for unary and binary ops
Hi, On Thu, 17 Apr 2014, Richard Biener wrote: The patch below increases the number of coalescs we attempt to also cover unary and binary operations. This is not usually a good idea if not mitigated by things like register pressure measurement and using target properties to determine if it's a two- or three-address instruction. It increases register pressure and naturally generates multiple-def pseudos which aren't liked by some of the RTL passes. It will lead to fewer pseudos, so there's a positive side. Now the question is what does this tell us? Not re-using the same pseudo as op and target is always better? No, it tells us that tree-ssa-coalesce is too early for such coalescing. The register allocator is the right spot (or instruction selection if we had that), and it's done there. And to have sth that TER not immediately un-does we have to disable TER which conveniently happens for coalesced SSA names. So, instead TER should be improved to not disturb the incoming instruction order (except where secondary effects of expanding larger trees can be had). Changing the coalescing set to disable some bad parts in a later pass doesn't sound very convincing :) Ciao, Michael.
[ PATCH] Extend mode-switching to support toggle (2/2)
and the toggle-support hookized many thanks, Christian 2014-04-02 Christian Bruel christian.br...@st.com * target.def (mode_switching): New hook vector. (toggle_init, toggle_destroy, toggle_set, toggle_test): New mode toggle hooks. * targhooks.h (default_toggle_test): Declare. * basic-block.h (pre_edge_lcm_avs): Declare. * lcm.c (pre_edge_lcm_avs): Renamed from pre_edge_lcm. Call clear_aux_for_edges. Fix comments. (pre_edge_lcm): New wrapper function to call pre_edge_lcm_avs. (pre_edge_rev_lcm): Idem. * mode-switching.c (init_modes_infos): New function. (free_modes_infos): Likewise. (add_mode_set): Likewise. (get_mode): Likewise. (commit_mode_sets): Likewise. (merge_modes): Likewise. (optimize_mode_switching): Support mode toggle. (default_priority_to_mode, default_toggle_test): Define. * doc/tm.texi.in (TARGET_MODE_TOGGLE_INIT, TARGET_MODE_TOGGLE_TEST) (TARGET_MODE_TOGGLE_DESTROY, TARGET_MODE_TOGGLE_SET): New target hooks. * doc/tm.texi: Regenerate. * config/sh/sh.c (sh4_toggle_init, sh4_toggle_destroy): Add hook and define. (sh4_toggle_set, sh4_toggle_test): Likewise. (mode_in_flip, mode_out_flip): Add bitmap to compute mode flipping. (TARGET_MODE_EMIT): New toggle parameter. * config/sh/sh.md (toggle_pr): Defined for TARGET_SH4_300 and TARGET_SH4A_FP. (in_delay_slot): fpscr_toggle don't go in delay slot. * config/i386/i386.c (ix86_emit_mode_set): Add bool unused parameter. * config/epiphany/epiphany.c (emit_set_fp_mode): Add bool unused parameter. 
--- gcc/basic-block.h 2014-01-07 10:30:59.0 +0100
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/basic-block.h 2014-04-15 16:17:53.0 +0200
@@ -711,6 +711,9 @@
 extern struct edge_list *pre_edge_lcm (int, sbitmap *, sbitmap *,
                                        sbitmap *, sbitmap *, sbitmap **,
                                        sbitmap **);
+extern struct edge_list *pre_edge_lcm_avs (int, sbitmap *, sbitmap *,
+                                           sbitmap *, sbitmap *, sbitmap *,
+                                           sbitmap *, sbitmap **, sbitmap **);
 extern struct edge_list *pre_edge_rev_lcm (int, sbitmap *, sbitmap *,
                                            sbitmap *, sbitmap *, sbitmap **,
--- gcc/config/epiphany/epiphany.c 2014-04-17 13:23:48.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany.c 2014-04-17 13:25:54.0 +0200
@@ -2529,7 +2529,8 @@
 }

 void
-emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
+emit_set_fp_mode (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+                  HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   rtx save_cc, cc_reg, mask, src, src2;
   enum attr_fp_mode fp_mode;
--- gcc/config/epiphany/epiphany-protos.h 2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/epiphany-protos.h 2014-04-17 11:22:02.0 +0200
@@ -40,7 +40,8 @@
 extern void epiphany_init_expanders (void);
 extern int hard_regno_mode_ok (int regno, enum machine_mode mode);
 #ifdef HARD_CONST
-extern void emit_set_fp_mode (int entity, int mode, HARD_REG_SET regs_live);
+extern void emit_set_fp_mode (int entity, int mode,
+                              bool toggle ATTRIBUTE_UNUSED, HARD_REG_SET regs_live);
 #endif
 extern void epiphany_insert_mode_switch_use (rtx insn, int, int);
 extern void epiphany_expand_set_fp_mode (rtx *operands);
--- gcc/config/epiphany/resolve-sw-modes.c 2014-04-17 11:10:36.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/epiphany/resolve-sw-modes.c 2014-04-17 11:21:07.0 +0200
@@ -147,7 +147,7 @@
             }
           start_sequence ();
           emit_set_fp_mode (EPIPHANY_MSW_ENTITY_ROUND_UNKNOWN,
-                            jilted_mode, NULL);
+                            jilted_mode, false, NULL);
           seq = get_insns ();
           end_sequence ();
           need_commit = true;
--- gcc/config/i386/i386.c 2014-04-17 13:02:49.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/i386/i386.c 2014-04-17 13:04:18.0 +0200
@@ -16409,7 +16409,8 @@
    are to be inserted. */

 static void
-ix86_emit_mode_set (int entity, int mode, HARD_REG_SET regs_live)
+ix86_emit_mode_set (int entity, int mode, bool toggle ATTRIBUTE_UNUSED,
+                    HARD_REG_SET regs_live)
 {
   switch (entity)
     {
--- gcc/config/sh/sh.c 2014-04-17 13:23:07.0 +0200
+++ /work1/bruel/superh_elf/gnu_trunk.devs/gcc/gcc/config/sh/sh.c 2014-04-17 13:25:27.0 +0200
@@ -202,7 +202,7 @@
 static int calc_live_regs (HARD_REG_SET *);
 static HOST_WIDE_INT rounded_frame_size (int);
 static bool sh_frame_pointer_required (void);
-static void sh4_emit_mode_set (int, int, HARD_REG_SET);
+static void sh4_emit_mode_set (int, int, bool, HARD_REG_SET);
 static int sh4_mode_needed (int, rtx);
 static int sh4_mode_after (int, int, rtx);
 static int sh4_mode_entry (int);
@@ -590,9 +590,21 @@
 #undef TARGET_MODE_EXIT
 #define TARGET_MODE_EXIT sh4_mode_exit

+#undef TARGET_MODE_TOGGLE_INIT
+#define TARGET_MODE_TOGGLE_INIT sh4_toggle_init
+
 #undef TARGET_MODE_PRIORITY
 #define TARGET_MODE_PRIORITY sh4_mode_priority

+#undef
[PATCH 0/3] libsanitizer libc conditionals
Respun.

The first two patches are for gcc, the last one is for upstream LLVM.

The gcc part was bootstrapped and regtested on x86_64-unknown-linux-gnu without regressions, and bootstrapped on x86_64-unknown-linux-uclibc to verify that the configury works as expected and that the library links without errors. These two patches are essentially backports of the LLVM bits in patch #3.

The LLVM part was compiled on x86_64 (X86_64 ?) against glibc, and I verified that the configury picks up the previously hard-coded values both with configure && make and with cmake && make.

LLVM'er, please install the LLVM bits.

Ok for trunk?

Bernhard Reutner-Fischer (3):
  libsanitizer: Fix !statfs64 builds
  libsanitizer: add conditionals for libc
  [LLVM] [sanitizer] add conditionals for libc

 libsanitizer/asan/Makefile.am                      |   6 +
 libsanitizer/asan/Makefile.in                      |  17 +-
 libsanitizer/config.h.in                           |  60 +
 libsanitizer/configure                             | 281 -
 libsanitizer/configure.ac                          |  38 +++
 libsanitizer/interception/interception_linux.cc    |   2 +
 libsanitizer/interception/interception_linux.h     |   8 +
 libsanitizer/lsan/Makefile.am                      |   6 +
 libsanitizer/lsan/Makefile.in                      |  11 +-
 libsanitizer/sanitizer_common/Makefile.am          |   5 +
 libsanitizer/sanitizer_common/Makefile.in          |  18 +-
 .../sanitizer_common_interceptors.inc              | 100 +++-
 .../sanitizer_platform_interceptors.h              |   4 +-
 .../sanitizer_platform_limits_linux.cc             |   2 +
 .../sanitizer_platform_limits_posix.cc             |  44 +++-
 .../sanitizer_platform_limits_posix.h              |  27 +-
 .../sanitizer_common/sanitizer_posix_libcdep.cc    |   7 +
 libsanitizer/tsan/Makefile.am                      |   6 +
 libsanitizer/tsan/Makefile.in                      |  11 +-
 19 files changed, 619 insertions(+), 34 deletions(-)

--
1.9.1
[PATCH 1/3] libsanitizer: Fix !statfs64 builds
libsanitizer/ChangeLog

2014-04-02  Bernhard Reutner-Fischer  al...@gcc.gnu.org

	* configure.ac: Check for sizeof(struct statfs64).
	* configure, config.h.in: Regenerate.
	* sanitizer_common/sanitizer_platform_interceptors.h
	(SANITIZER_INTERCEPT_STATFS64): Make conditional on
	SIZEOF_STRUCT_STATFS64 being not 0.
	* sanitizer_common/sanitizer_platform_limits_linux.cc (namespace
	__sanitizer): Make unsigned struct_statfs64_sz conditional on
	SANITIZER_INTERCEPT_STATFS64.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 libsanitizer/config.h.in                           |  9 +++
 libsanitizer/configure                             | 69 ++
 libsanitizer/configure.ac                          | 15 +
 .../sanitizer_platform_interceptors.h              |  4 +-
 .../sanitizer_platform_limits_linux.cc             |  2 +
 5 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/libsanitizer/config.h.in b/libsanitizer/config.h.in
index e4b2786..4bd6a7f 100644
--- a/libsanitizer/config.h.in
+++ b/libsanitizer/config.h.in
@@ -61,12 +61,18 @@
 /* Define to 1 if you have the sys/mman.h header file. */
 #undef HAVE_SYS_MMAN_H

+/* Define to 1 if you have the sys/statfs.h header file. */
+#undef HAVE_SYS_STATFS_H
+
 /* Define to 1 if you have the sys/stat.h header file. */
 #undef HAVE_SYS_STAT_H

 /* Define to 1 if you have the sys/types.h header file. */
 #undef HAVE_SYS_TYPES_H

+/* Define to 1 if you have the sys/vfs.h header file. */
+#undef HAVE_SYS_VFS_H
+
 /* Define to 1 if you have the unistd.h header file. */
 #undef HAVE_UNISTD_H

@@ -107,6 +113,9 @@
 /* The size of `short', as computed by sizeof. */
 #undef SIZEOF_SHORT

+/* The size of `struct statfs64', as computed by sizeof. */
+#undef SIZEOF_STRUCT_STATFS64
+
 /* The size of `void *', as computed by sizeof.
*/ #undef SIZEOF_VOID_P diff --git a/libsanitizer/configure b/libsanitizer/configure index 5e4840f..c636212 100755 --- a/libsanitizer/configure +++ b/libsanitizer/configure @@ -15463,6 +15463,75 @@ _ACEOF +for ac_header in sys/statfs.h +do : + ac_fn_c_check_header_mongrel $LINENO sys/statfs.h ac_cv_header_sys_statfs_h $ac_includes_default +if test x$ac_cv_header_sys_statfs_h = xyes; then : + cat confdefs.h _ACEOF +#define HAVE_SYS_STATFS_H 1 +_ACEOF + +fi + +done + +if test $ac_cv_header_sys_statfs_h = no; then + for ac_header in sys/vfs.h +do : + ac_fn_c_check_header_mongrel $LINENO sys/vfs.h ac_cv_header_sys_vfs_h $ac_includes_default +if test x$ac_cv_header_sys_vfs_h = xyes; then : + cat confdefs.h _ACEOF +#define HAVE_SYS_VFS_H 1 +_ACEOF + +fi + +done + +fi +# The cast to long int works around a bug in the HP C Compiler +# version HP92453-01 B.11.11.23709.GP, which incorrectly rejects +# declarations like `int a3[[(sizeof (unsigned char)) = 0]];'. +# This bug is HP SR number 8606223364. +{ $as_echo $as_me:${as_lineno-$LINENO}: checking size of struct statfs64 5 +$as_echo_n checking size of struct statfs64... 6; } +if test ${ac_cv_sizeof_struct_statfs64+set} = set; then : + $as_echo_n (cached) 6 +else + if ac_fn_c_compute_int $LINENO (long int) (sizeof (struct statfs64)) ac_cv_sizeof_struct_statfs64 +#ifdef HAVE_SYS_STATFS_H +# include sys/statfs.h +#endif +#ifdef HAVE_SYS_VFS_H +# include sys/vfs.h +#endif + +; then : + +else + if test $ac_cv_type_struct_statfs64 = yes; then + { { $as_echo $as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd': 5 +$as_echo $as_me: error: in \`$ac_pwd': 2;} +{ as_fn_set_status 77 +as_fn_error cannot compute sizeof (struct statfs64) +See \`config.log' for more details. 
$LINENO 5; }; } + else + ac_cv_sizeof_struct_statfs64=0 + fi +fi + +fi +{ $as_echo $as_me:${as_lineno-$LINENO}: result: $ac_cv_sizeof_struct_statfs64 5 +$as_echo $ac_cv_sizeof_struct_statfs64 6; } + + + +cat confdefs.h _ACEOF +#define SIZEOF_STRUCT_STATFS64 $ac_cv_sizeof_struct_statfs64 +_ACEOF + + + if test ${multilib} = yes; then multilib_arg=--enable-multilib else diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac index e672131..746c216 100644 --- a/libsanitizer/configure.ac +++ b/libsanitizer/configure.ac @@ -78,6 +78,21 @@ AC_SUBST(enable_static) AC_CHECK_SIZEOF([void *]) +dnl Careful, this breaks on glibc for e.g. dirent.d_ino being 64bit +dnl AC_SYS_LARGEFILE +AC_CHECK_HEADERS(sys/statfs.h) +if test $ac_cv_header_sys_statfs_h = no; then + AC_CHECK_HEADERS(sys/vfs.h) +fi +AC_CHECK_SIZEOF([struct statfs64],[],[ +#ifdef HAVE_SYS_STATFS_H +# include sys/statfs.h +#endif +#ifdef HAVE_SYS_VFS_H +# include sys/vfs.h +#endif +]) + if test ${multilib} = yes; then multilib_arg=--enable-multilib else diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h b/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h index f37d84b..b9ebd5c 100644 --- a/libsanitizer/sanitizer_common/sanitizer_platform_interceptors.h +++
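For readers skimming the diff, the effect of the new SIZEOF_STRUCT_STATFS64 check can be sketched in plain C. This is only an illustration, not code from the patch: INTERCEPT_STATFS64 and the two helper functions are hypothetical stand-ins, and the fallback definition of SIZEOF_STRUCT_STATFS64 exists only so that the sketch compiles without a generated config.h.

```c
#include <assert.h>

/* Normally provided by the generated config.h.  The fallback below is an
   assumption made for this sketch: 0 means "struct statfs64 was not found".  */
#ifndef SIZEOF_STRUCT_STATFS64
# define SIZEOF_STRUCT_STATFS64 0
#endif

/* Hypothetical stand-in for SANITIZER_INTERCEPT_STATFS64: the interceptor
   is only enabled when configure found a non-zero size.  */
#if SIZEOF_STRUCT_STATFS64 != 0
# define INTERCEPT_STATFS64 1
#else
# define INTERCEPT_STATFS64 0
#endif

/* The size variable only exists when the interceptor is enabled, which is
   the dependency the ChangeLog entry describes.  */
#if INTERCEPT_STATFS64
static const unsigned struct_statfs64_sz = SIZEOF_STRUCT_STATFS64;
#endif

/* Returns 1 if the statfs64 interceptor would be compiled in.  */
int
statfs64_interceptor_enabled (void)
{
  return INTERCEPT_STATFS64;
}

/* Returns the configured size, or 0 when struct statfs64 is unavailable.  */
unsigned
statfs64_size_or_zero (void)
{
#if INTERCEPT_STATFS64
  assert (struct_statfs64_sz > 0);
  return struct_statfs64_sz;
#else
  return 0;
#endif
}
```

On a libc without struct statfs64 (the uClibc case mentioned in the cover letter) both helpers report 0, so nothing that refers to the missing type is compiled.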
[PATCH 2/3] libsanitizer: add conditionals for libc
Conditionalize usage of dlvsym(), nanosleep(), usleep();
Conditionalize layout of struct sigaction and type of its member sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir, gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.

libsanitizer/ChangeLog

2014-04-16  Bernhard Reutner-Fischer  al...@gcc.gnu.org

	* configure.ac (AC_CHECK_HEADERS): Add time.h, wordexp.h, glob.h,
	netrom/netrom.h, sys/ustat.h.
	(AC_CHECK_MEMBERS): Check GNU extension glob_t members.
	(AC_CHECK_SIZEOF): Determine size of sigset_t.
	(HAVE_STRUCT_SIGACTION_SA_MASK_LAST,
	STRUCT_SIGACTION_SA_FLAGS_TYPE): New.
	(AC_CHECK_FUNCS): Add usleep, nanosleep, dlvsym.
	* configure, config.h.in: Regenerate.
	* asan/Makefile.am, lsan/Makefile.am, tsan/Makefile.am,
	sanitizer_common/Makefile.am (AM_CXXFLAGS): Include config.h, add
	include search directory.
	* asan/Makefile.in, lsan/Makefile.in, tsan/Makefile.in,
	sanitizer_common/Makefile.in: Regenerate.
	* interception/interception_linux.h,
	interception/interception_linux.cc,
	sanitizer_common/sanitizer_common_interceptors.inc,
	sanitizer_common/sanitizer_platform_limits_posix.cc,
	sanitizer_common/sanitizer_platform_limits_posix.h,
	sanitizer_common/sanitizer_posix_libcdep.cc: Use config.h's new
	defines.
Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 libsanitizer/asan/Makefile.am                      |   6 +
 libsanitizer/asan/Makefile.in                      |  17 +-
 libsanitizer/config.h.in                           |  51 +
 libsanitizer/configure                             | 212 -
 libsanitizer/configure.ac                          |  23 +++
 libsanitizer/interception/interception_linux.cc    |   2 +
 libsanitizer/interception/interception_linux.h     |   8 +
 libsanitizer/lsan/Makefile.am                      |   6 +
 libsanitizer/lsan/Makefile.in                      |  11 +-
 libsanitizer/sanitizer_common/Makefile.am          |   5 +
 libsanitizer/sanitizer_common/Makefile.in          |  18 +-
 .../sanitizer_common_interceptors.inc              | 100 +-
 .../sanitizer_platform_limits_posix.cc             |  44 -
 .../sanitizer_platform_limits_posix.h              |  27 ++-
 .../sanitizer_common/sanitizer_posix_libcdep.cc    |   7 +
 libsanitizer/tsan/Makefile.am                      |   6 +
 libsanitizer/tsan/Makefile.in                      |  11 +-
 17 files changed, 521 insertions(+), 33 deletions(-)

diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index 3f07a83..851774c 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -9,6 +9,12 @@ DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
 endif
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
+AM_CXXFLAGS += -include $(top_builddir)/config.h
+if LIBBACKTRACE_SUPPORTED
+# backtrace-rename.h is included from config.h, provide -I dir for it
+AM_CXXFLAGS += -I $(top_srcdir)
+endif
+
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config

 toolexeclib_LTLIBRARIES = libasan.la
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 273eb4b..a9b889d 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -37,8 +37,10 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 @USING_MAC_INTERPOSE_TRUE@am__append_1 = -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
-@USING_MAC_INTERPOSE_FALSE@am__append_2 = $(top_builddir)/interception/libinterception.la -@LIBBACKTRACE_SUPPORTED_TRUE@am__append_3 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la +# backtrace-rename.h is included from config.h, provide -I dir for it +@LIBBACKTRACE_SUPPORTED_TRUE@am__append_2 = -I $(top_srcdir) +@USING_MAC_INTERPOSE_FALSE@am__append_3 = $(top_builddir)/interception/libinterception.la +@LIBBACKTRACE_SUPPORTED_TRUE@am__append_4 = $(top_builddir)/libbacktrace/libsanitizer_libbacktrace.la subdir = asan DIST_COMMON = $(srcdir)/Makefile.in $(srcdir)/Makefile.am ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 @@ -86,8 +88,8 @@ LTLIBRARIES = $(toolexeclib_LTLIBRARIES) am__DEPENDENCIES_1 = libasan_la_DEPENDENCIES = \ $(top_builddir)/sanitizer_common/libsanitizer_common.la \ - $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_2) \ - $(am__append_3) $(am__DEPENDENCIES_1) + $(top_builddir)/lsan/libsanitizer_lsan.la $(am__append_3) \ +
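The "conditionalize usage of nanosleep(), usleep()" item in the commit message can be pictured with a small C sketch. It is an illustration under assumptions, not code from the patch: sketch_sleep_ms is a hypothetical helper, and HAVE_NANOSLEEP is defined by hand here as a stand-in for what the AC_CHECK_FUNCS result in config.h would provide on a typical POSIX host.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Stand-in for the generated config.h (an assumption of this sketch):
   pretend configure found nanosleep() on this host.  */
#ifndef HAVE_NANOSLEEP
# define HAVE_NANOSLEEP 1
#endif

/* Hypothetical helper: sleep for roughly MS milliseconds.
   Returns 0 on success, -1 if the call failed or if no sleep primitive
   was configured.  */
int
sketch_sleep_ms (unsigned ms)
{
#if HAVE_NANOSLEEP
  struct timespec ts;

  ts.tv_sec = ms / 1000;
  ts.tv_nsec = (long) (ms % 1000) * 1000000L;
  assert (ts.tv_nsec < 1000000000L);
  return nanosleep (&ts, NULL) == 0 ? 0 : -1;
#else
  /* No usable primitive was detected; report that instead of referring
     to a function the libc does not provide.  */
  (void) ms;
  return -1;
#endif
}
```

The point of the guard is that a libc built without nanosleep() still links: the call is simply never emitted.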
[PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
Conditionalize usage of dlvsym(), nanosleep(), usleep();
Conditionalize layout of struct sigaction and type of it's member sa_flags.
Conditionalize glob_t members gl_closedir, gl_readdir, gl_opendir, gl_flags, gl_lstat, gl_stat.
Check for availability of glob.h for use with above members.
Check for availability of netrom/netrom.h, sys/ustat.h (for obsolete ustat() function), utime.h (for obsolete utime() function), wordexp.h.
Determine size of sigset_t instead of hardcoding it.
Determine size of struct statfs64, if available.

Leave defaults to match what glibc expects but probe them for uClibc.

Signed-off-by: Bernhard Reutner-Fischer rep.dot@gmail.com
---
 CMakeLists.txt                                     |  58 +++
 cmake/Modules/CompilerRTUtils.cmake                |  15 ++
 cmake/Modules/FunctionExistsNotStub.cmake          |  56 +++
 lib/interception/interception_linux.cc             |   2 +
 lib/interception/interception_linux.h              |   9 ++
 .../sanitizer_common_interceptors.inc              | 101 +++-
 .../sanitizer_platform_limits_posix.cc             |  44 -
 .../sanitizer_platform_limits_posix.h              |  27 +++-
 lib/sanitizer_common/sanitizer_posix_libcdep.cc    |   9 ++
 make/platform/clang_linux.mk                       | 180 +
 make/platform/clang_linux_test_libc.c              |  68
 11 files changed, 561 insertions(+), 8 deletions(-)
 create mode 100644 cmake/Modules/FunctionExistsNotStub.cmake
 create mode 100644 make/platform/clang_linux_test_libc.c

diff --git a/CMakeLists.txt b/CMakeLists.txt
index e1a7a1f..af8073e 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -330,6 +330,64 @@ if(APPLE)
     -isysroot ${IOSSIM_SDK_DIR})
 endif()

+set(ct_c ${COMPILER_RT_SOURCE_DIR}/make/platform/clang_linux_test_libc.c)
+check_include_file(sys/ustat.h HAVE_SYS_USTAT_H)
+check_include_file(utime.h HAVE_UTIME_H)
+check_include_file(wordexp.h HAVE_WORDEXP_H)
+check_include_file(glob.h HAVE_GLOB_H)
+include(FunctionExistsNotStub)
+check_function_exists_not_stub(${ct_c} nanosleep HAVE_NANOSLEEP)
+check_function_exists_not_stub(${ct_c} usleep HAVE_USLEEP)
+include(CheckTypeSize)
+# check for sizeof sigset_t
+set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
+set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} signal.h)
+check_type_size(sigset_t SIZEOF_SIGSET_T BUILTIN_TYPES_ONLY)
+if(EXISTS HAVE_SIZEOF_SIGSET_T)
+  set(SIZEOF_SIGSET_T ${HAVE_SIZEOF_SIGSET_T})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
+# check for sizeof struct statfs64
+set(oCMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES})
+check_include_file(sys/statfs.h HAVE_SYS_STATFS_H)
+check_include_file(sys/vfs.h HAVE_SYS_VFS_H)
+if(HAVE_SYS_STATFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/statfs.h)
+endif()
+if(HAVE_SYS_VFS_H)
+  set(CMAKE_EXTRA_INCLUDE_FILES ${CMAKE_EXTRA_INCLUDE_FILES} sys/vfs.h)
+endif()
+# Have to pass _LARGEFILE64_SOURCE otherwise there is no struct statfs64.
+# We forcefully enable LFS to retain glibc legacy behaviour herein.
+set(oCMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS})
+set(CMAKE_REQUIRED_DEFINITIONS ${CMAKE_REQUIRED_DEFINITIONS} -D_LARGEFILE64_SOURCE)
+check_type_size(struct statfs64 SIZEOF_STRUCT_STATFS64)
+if(EXISTS HAVE_SIZEOF_STRUCT_STATFS64)
+  set(SIZEOF_STRUCT_STATFS64 ${HAVE_SIZEOF_STRUCT_STATFS64})
+else()
+  set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+endif()
+set(CMAKE_EXTRA_INCLUDE_FILES ${oCMAKE_EXTRA_INCLUDE_FILES})
+# do not set(CMAKE_REQUIRED_DEFINITIONS ${oCMAKE_REQUIRED_DEFINITIONS})
+# it back here either way.
+include(CheckStructHasMember)
+check_struct_has_member(glob_t gl_flags glob.h HAVE_GLOB_T_GL_FLAGS)
+check_struct_has_member(glob_t gl_closedir glob.h HAVE_GLOB_T_GL_CLOSEDIR)
+check_struct_has_member(glob_t gl_readdir glob.h HAVE_GLOB_T_GL_READDIR)
+check_struct_has_member(glob_t gl_opendir glob.h HAVE_GLOB_T_GL_OPENDIR)
+check_struct_has_member(glob_t gl_lstat glob.h HAVE_GLOB_T_GL_LSTAT)
+check_struct_has_member(glob_t gl_stat glob.h HAVE_GLOB_T_GL_STAT)
+
+# folks seem to have an aversion to configure_file? So be it..
+foreach(x HAVE_SYS_USTAT_H HAVE_UTIME_H HAVE_WORDEXP_H HAVE_GLOB_H
+HAVE_NANOSLEEP HAVE_USLEEP SIZEOF_SIGSET_T SIZEOF_STRUCT_STATFS64
+HAVE_GLOB_T_GL_FLAGS HAVE_GLOB_T_GL_CLOSEDIR
+HAVE_GLOB_T_GL_READDIR HAVE_GLOB_T_GL_OPENDIR
+HAVE_GLOB_T_GL_LSTAT HAVE_GLOB_T_GL_STAT)
+def_undef_string(${x} SANITIZER_COMMON_CFLAGS)
+endforeach()
+
+
 # Architectures supported by Sanitizer runtimes. Specific sanitizers may
 # support only subset of these (e.g. TSan works on x86_64 only).
 filter_available_targets(SANITIZER_COMMON_SUPPORTED_ARCH
diff --git a/cmake/Modules/CompilerRTUtils.cmake b/cmake/Modules/CompilerRTUtils.cmake
index e22e775..3a0beec 100644
--- a/cmake/Modules/CompilerRTUtils.cmake
+++ b/cmake/Modules/CompilerRTUtils.cmake
@@ -59,3 +59,18 @@ macro(append_no_rtti_flag list)
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
Hi,

If you are trying to modify the libsanitizer files, please read here:
https://code.google.com/p/address-sanitizer/wiki/HowToContribute

--kcc
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
On 17 April 2014 16:07, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:
> Hi,
>
> If you are trying to modify the libsanitizer files, please read here:
> https://code.google.com/p/address-sanitizer/wiki/HowToContribute

I read that, thanks.
Patch 3/3 is for the current compiler-rt git repo; please install it there. I do not have write access to the LLVM or compiler-rt trees.

TIA,
Re: [PATCH v7?] PR middle-end/60281
Hi Bernd,

My copyright assignment has been signed and the process is complete. I understand I need to answer two more questions before my patch can be committed:

[Did you copy any files or text written by someone else in these changes?]
No.

[Which files have you changed so far, and which new files have you written so far?]
gcc/asan.c
gcc/ChangeLog
gcc/cfgexpand.c

Please review my patch again; if there are no problems, please commit it for me.

--
Regards
lin zuojian
Re: [RFC] Add aarch64 support for ada
On 04/17/2014 02:00 AM, Tristan Gingold wrote:

On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:

On 04/16/2014 12:39 AM, Eric Botcazou wrote:

The primary bit of rfc here is the hunk that applies to ada/types.h with respect to Fat_Pointer. Given that the Ada type, as defined in s-stratt.ads, does not include alignment, I can't imagine why the C type should have it.

See gcc-interface/utils.c:finish_fat_pointer_type.

Ah hah.

  /* Make sure we can put it into a register. */
  if (STRICT_ALIGNMENT)
    TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);

AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

As the align attribute in types.h is for the host, couldn't a configure test solve this issue ?

I doubt it. I'm not sure what kind of configure test you could write that would determine the setting of STRICT_ALIGNMENT, since even non-strict-align targets prefer to align data for performance reasons.

Be careful that the test couldn't be an execution test, lest you break host != build.

One of the most common Fat_Pointer is for strings, which aren't declared in any source and is very commonly used.

OTOH, I think this optimization mostly targets sparc.

Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions.

Perhaps the better optimization (supposing it's really worth keeping) is to DECL_ALIGN the static strings, rather than align the type? Presumably Ada strings are as with C string literals -- symbols private to the compilation unit which are normally passed by value. Thus functions within the compilation unit would see the extra alignment of the data and be able to use ldd to load the pair. On the receiving end being able to use std would remain a matter of luck.

r~
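The alignment mismatch discussed in this message can be seen in a few lines of C. The following is a minimal sketch, assuming a GCC-compatible compiler (it relies on the GNU aligned attribute and __alignof__); the struct and function names are illustrative and are not the declarations from ada/types.h.

```c
#include <assert.h>
#include <stddef.h>

/* A two-pointer record with natural alignment, which is what a
   non-STRICT_ALIGNMENT target would give the Ada-side type.
   Field names are illustrative.  */
struct fat_plain
{
  void *array;
  void *bounds;
};

/* The same record forced to twice the pointer size, in the spirit of the
   aligned attribute on the C-side type that the thread is questioning.  */
struct fat_aligned
{
  void *array;
  void *bounds;
} __attribute__ ((aligned (2 * sizeof (void *))));

/* Alignment the compiler picks without the attribute.  */
size_t
plain_align (void)
{
  return __alignof__ (struct fat_plain);
}

/* Alignment with the attribute applied.  */
size_t
forced_align (void)
{
  return __alignof__ (struct fat_aligned);
}
```

On a typical 64-bit host plain_align() is the pointer alignment while forced_align() is twice the pointer size, so the two descriptions of the "same" record only agree when the compiler side also raises the alignment, as the STRICT_ALIGNMENT branch quoted above does.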
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:
> On 17 April 2014 16:07, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:
>> Hi,
>>
>> If you are trying to modify the libsanitizer files, please read here:
>> https://code.google.com/p/address-sanitizer/wiki/HowToContribute
>
> I read that, thanks.
> Patch 3/3 is for current compiler-rt git repo, please install it there,
> i do not have write access to the LLVM nor compiler-rt trees.

I can commit your patch to the llvm tree only after you follow the process described on that page.
Sorry, this is a hard rule.

--kcc
Re: fuse-caller-save - hook format
On 2014-04-16, 3:19 PM, Tom de Vries wrote:

Vladimir,

All patches for the fuse-caller-save optimization have been ok-ed. The only part not approved is the MIPS-specific part.

The objection of Richard S. is not so much to the patch itself, but more to the idea of the hook fn_other_hard_reg_usage.

For clarity, I'm restating the current hook definition here:
...
+@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs})
Add any hard registers to @var{regs} that are set or clobbered by a call to the function. This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function. This hook returns true if it managed to determine which registers need to be added. The default version of this hook returns false.
...

Richard prefers, rather than having a hook specifying which registers are implicitly clobbered, to add those clobbers to CALL_INSN_FUNCTION_USAGE.

I can see these possibilities (and perhaps there are more):

1. We go with Richard's proposal: we make each target responsible for adding these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i. targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to indicate whether a target has taken care of that, meaning it's safe to do the fuse-caller-save optimization.

2. A mixed solution: we make each target responsible for specifying which clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i. targetm.call_clobbered_regs, and add generic code to add those clobbers to CALL_INSN_FUNCTION_USAGE.

3. We stick with the current, approved hook format, and try to convince Richard to live with it.

Since you are a register allocator maintainer, familiar with the fuse-caller-save optimization, and have approved the original hook, I would like to ask you to make a decision on how to proceed from here.

I have no preferences and it is a matter of taste. Each solution has its own advantages and disadvantages. Putting this info into CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big drawback in RTL memory footprint (especially for some targets which have a lot of regs, like AM29k or IA64). On the other hand, an analogous approach is already used in the DF infrastructure (which it would be nice to fix, imho). Still, between GCC users and GCC developers, I'd prefer the solution that favours users (even if the effect on the amount of resources used by GCC is quite insignificant), as their number is a few orders of magnitude larger than that of the developers.

But I can live with any solution. So it is up to you. I am flexible.
PATCH: PR target/60863: Incorrect codegen in ix86_expand_clear for -Os
Hi,

I checked in this preapproved patch to generate xor reg, reg when optimizing for size.

H.J.

Index: ChangeLog
===
--- ChangeLog (revision 209487)
+++ ChangeLog (working copy)
@@ -1,3 +1,10 @@
+2014-04-17  H.J. Lu  hongjiu...@intel.com
+
+        PR target/60863
+        * config/i386/i386.c (ix86_expand_clear): Remove outdated
+        comment.  Check optimize_insn_for_size_p instead of
+        optimize_insn_for_speed_p.
+
 2014-04-17  Martin Jambor  mjam...@suse.cz
 
         * gimple-iterator.c (gsi_start_edge): New function.
Index: config/i386/i386.c
===
--- config/i386/i386.c (revision 209487)
+++ config/i386/i386.c (working copy)
@@ -16668,8 +16668,7 @@ ix86_expand_clear (rtx dest)
     dest = gen_rtx_REG (SImode, REGNO (dest));
   tmp = gen_rtx_SET (VOIDmode, dest, const0_rtx);
 
-  /* This predicate should match that for movsi_xor and movdi_xor_rex64.  */
-  if (!TARGET_USE_MOV0 || optimize_insn_for_speed_p ())
+  if (!TARGET_USE_MOV0 || optimize_insn_for_size_p ())
     {
       rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
       tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
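As a reading aid for the condition changed in the patch above, here is a minimal standalone sketch of the new predicate. It is not GCC code: the function name and its two boolean parameters are hypothetical stand-ins for TARGET_USE_MOV0 and optimize_insn_for_size_p ().

```c
#include <stdbool.h>

/* Hypothetical model of the patched condition in ix86_expand_clear.
   target_prefers_mov0 stands in for TARGET_USE_MOV0 and
   optimizing_for_size for optimize_insn_for_size_p (); neither
   parameter name exists in GCC.  */
bool
use_xor_to_clear (bool target_prefers_mov0, bool optimizing_for_size)
{
  /* xor reg, reg has the shorter encoding but clobbers the flags
     register, so it is picked unless the target prefers mov $0 and
     size is not the goal.  */
  return !target_prefers_mov0 || optimizing_for_size;
}
```

Under this reading, only the combination "target prefers mov $0" and "not optimizing for size" keeps the plain move; every other combination takes the xor path with the flags clobber.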
Re: fuse-caller-save - hook format
Vladimir Makarov vmaka...@redhat.com writes: On 2014-04-16, 3:19 PM, Tom de Vries wrote: Vladimir, All patches for the fuse-caller-save optimization have been ok-ed. The only part not approved is the MIPS-specific part. The objection of Richard S. is not so much the patch itself, but more the idea of the hook fn_other_hard_reg_usage. For clarity, I'm restating the current hook definition here: ... +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs}) Add any hard registers to @var{regs} that are set or clobbered by a call to the function. This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function. This hook returns true if it managed to determine which registers need to be added. The default version of this hook returns false. ... Richard prefers to, rather than having a hook specifying what registers are implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE. I can see these possibilities (and perhaps there are more): 1. We go with Richards proposal: we make each target responsible for adding these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i. targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to indicate whether a target has taken care of that, meaning it's safe to do the fuse-caller-save optimization. 2. A mixed solution: we make each target responsible for specifying which clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i. targetm.call_clobbered_regs, and add generic code to add those clobbers to CALL_INSN_FUNCTION_USAGE. 3. We stick with the current, approved hook format, and try to convince Richard to live with it. Since you are a register allocator maintainer, familiar with the fuse-caller-save optimization, and have approved the original hook, I would like to ask you to make a decision on how to proceed from here. 
I have no preferences and it is a matter of taste. Each solution has its own advantages and disadvantages. Putting this info into CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big drawback in RTL memory footprint (especially for some targets which have a lot of regs, like AM29k or IA64). On the other hand, an analogous approach is already used in the DF infrastructure (which it would be nice to fix, imho). Still, between GCC users and GCC developers, I'd prefer the solution that favours users (even if the effect on the amount of resources used by GCC is quite insignificant), as their number is a few orders of magnitude larger than that of the developers.

Hmm, but you're talking like there are going to be a lot of these registers. This isn't about which registers are call-clobbered or call-saved according to the ABI (that's already available in other places). All we want here is the set of registers that are clobbered _in the caller_ before reaching the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use lots of registers in the process of making the call. On all targets we should be talking about one or two registers here.

Thanks,
Richard
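To make the contract of the hook under discussion concrete, the following is a small self-contained model, written under stated assumptions rather than taken from GCC. The register set type, the register numbers 1 and 25, and all function names are hypothetical; only the "add registers, return true if known, default returns false" behaviour comes from the hook text quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a hard register set: one bit per register number.  */
typedef struct { uint64_t bits; } toy_reg_set;

void
toy_reg_set_add (toy_reg_set *set, unsigned regno)
{
  assert (regno < 64);
  set->bits |= (uint64_t) 1 << regno;
}

bool
toy_reg_set_contains (const toy_reg_set *set, unsigned regno)
{
  assert (regno < 64);
  return (set->bits >> regno) & 1;
}

/* Signature modelled on the quoted hook description.  */
typedef bool (*other_hard_reg_usage_hook) (toy_reg_set *regs);

/* Default behaviour: the target cannot tell, so nothing is added.  */
bool
default_other_hard_reg_usage (toy_reg_set *regs)
{
  (void) regs;
  return false;
}

/* Hypothetical target whose call sequence clobbers registers 1 and 25
   in the caller, outside the callee's own RTL.  */
bool
toy_target_other_hard_reg_usage (toy_reg_set *regs)
{
  toy_reg_set_add (regs, 1);
  toy_reg_set_add (regs, 25);
  return true;
}

/* Registers the caller has to treat as clobbered by one call.  If the
   hook cannot answer, fall back to the full ABI call-clobbered set.  */
toy_reg_set
regs_clobbered_by_call (other_hard_reg_usage_hook hook,
                        toy_reg_set callee_usage,
                        toy_reg_set abi_call_clobbered)
{
  toy_reg_set result = callee_usage;
  if (!hook (&result))
    return abi_call_clobbered;
  return result;
}
```

The point of the sketch is the size of the hook's contribution: as Richard notes, it is expected to add only one or two registers on top of what the callee itself is known to use.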
Re: [RFC] Add aarch64 support for ada
On 17 Apr 2014, at 16:50, Richard Henderson r...@redhat.com wrote:

On 04/17/2014 02:00 AM, Tristan Gingold wrote:

On 16 Apr 2014, at 17:36, Richard Henderson r...@redhat.com wrote:

On 04/16/2014 12:39 AM, Eric Botcazou wrote:

The primary bit of rfc here is the hunk that applies to ada/types.h with respect to Fat_Pointer. Given that the Ada type, as defined in s-stratt.ads, does not include alignment, I can't imagine why the C type should have it.

See gcc-interface/utils.c:finish_fat_pointer_type.

Ah hah.

  /* Make sure we can put it into a register.  */
  if (STRICT_ALIGNMENT)
    TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE);

AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch.

As the align attribute in types.h is for the host, couldn't a configure test solve this issue ?

I doubt it. I'm not sure what kind of configure test you could write that would determine the setting of STRICT_ALIGNMENT, since even non-strict-align targets prefer to align data for performance reasons. Be careful that the test couldn't be an execution test, lest you break host != build.

What about this compile-time check:

package Fatptralign is
   type String_Acc is access String;
   type Integer_acc is access Integer;
   pragma Compile_Time_Error
     (String_Acc'Alignment = 1 * Integer_Acc'Alignment,
      "Fat pointer are simply aligned");
   pragma Compile_Time_Error
     (String_Acc'Alignment = 2 * Integer_Acc'Alignment,
      "Fat pointer are doubly aligned");
end Fatptralign;

One of the most common Fat_Pointer is for strings, which aren't declared in any source and is very commonly used.

OTOH, I think this optimization mostly targets sparc.

Indeed, 32-bit sparc wants 64-bit alignment for its ldd/std instructions. Perhaps the better optimization (supposing it's really worth keeping)

That's a true question (worth keeping). I think this also affects powerpc (as an important target) Eric ?

is to DECL_ALIGN the static strings, rather than align the type?

[ Ada strings (and more generally Ada unconstrained array and Ada accesses to unconstrained arrays) are represented in GNAT by a fat pointer, ie a structure containing a pointer to the bounds and a pointer to the data. We are talking about the alignment of that structure. ]

Presumably Ada strings are as with C string literals -- symbols private to the compilation unit which are normally passed by value. Thus functions within the compilation unit would see the extra alignment of the data and be able to use ldd to load the pair. On the receiving end being able to use std would remain a matter of luck.

I think this will dismiss most of the gain. Fat pointers can be heavily used in some applications, and be present in structures. Gain with only private symbols might be tiny.

Tristan.
Re: [RFC] Add aarch64 support for ada
Ah hah. /* Make sure we can put it into a register. */ if (STRICT_ALIGNMENT) TYPE_ALIGN (record_type) = MIN (BIGGEST_ALIGNMENT, 2 * POINTER_SIZE); AArch64 is not a STRICT_ALIGNMENT target, thus the mismatch. I see. Initially this alignment promotion had been universal, but someone recently complained about holes in structures on x86-64 because of it so we restricted it to the platforms where it is really necessary for the goal stated in the comment; we left types.h untouched because the alignment could not possibly change the calling convention for non-strict-alignment targets... If we were to make this alignment unconditional, would it be better to drop the code from here in finish_fat_pointer_type and instead record that in the Ada source, as we do with the C source? We cannot really do that, the s-stratt.ads thing is a red herring, alignment of fat pointer types is entirely decided inside the compiler (layout.adb:3213 and gcc-interface/utils.c:finish_fat_pointer_type) I presume that the attached kludge is sufficient to make it work? * fe.h (Compiler_Abort): Replace Fat_Pointer by String. (Error_Msg_N): Likewise. (Error_Msg_NE): Likewise. (Get_External_Name_With_Suffix): Likewise. * types.h (Fat_Pointer): Delete. (String): New type. (DECLARE_STRING): New macro. * gcc-interface/decl.c (create_concat_name): Adjust. * gcc-interface/trans.c (post_error): Likewise. (post_error_ne): Likewise. * gcc-interface/misc.c (internal_error_function): Likewise. 
-- Eric BotcazouIndex: fe.h === --- fe.h (revision 209461) +++ fe.h (working copy) @@ -39,7 +39,7 @@ extern C { /* comperr: */ #define Compiler_Abort comperr__compiler_abort -extern int Compiler_Abort (Fat_Pointer, int, Fat_Pointer) ATTRIBUTE_NORETURN; +extern int Compiler_Abort (String, int, String) ATTRIBUTE_NORETURN; /* csets: */ @@ -90,8 +90,8 @@ extern Node_Id Get_Attribute_Definition_ #define Error_Msg_NE errout__error_msg_ne #define Set_Identifier_Casing errout__set_identifier_casing -extern void Error_Msg_N (Fat_Pointer, Node_Id); -extern void Error_Msg_NE (Fat_Pointer, Node_Id, Entity_Id); +extern void Error_Msg_N (String, Node_Id); +extern void Error_Msg_NE (String, Node_Id, Entity_Id); extern void Set_Identifier_Casing (Char *, const Char *); /* err_vars: */ @@ -151,7 +151,7 @@ extern void Setup_Asm_Outputs (Node_Id) extern void Get_Encoded_Name (Entity_Id); extern void Get_External_Name (Entity_Id, Boolean); -extern void Get_External_Name_With_Suffix (Entity_Id, Fat_Pointer); +extern void Get_External_Name_With_Suffix (Entity_Id, String); /* exp_util: */ Index: types.h === --- types.h (revision 209461) +++ types.h (working copy) @@ -76,11 +76,14 @@ typedef Char *Str; /* Pointer to string of Chars */ typedef Char *Str_Ptr; -/* Types for the fat pointer used for strings and the template it - points to. */ +/* Types for the fat pointer used for strings and the template it points to. + On most platforms the fat pointer is naturally aligned but, on the rest, + it is given twice the natural alignment. For maximum portability, we do + not overalign the type but only the objects. 
*/ typedef struct {int Low_Bound, High_Bound; } String_Template; -typedef struct {const char *Array; String_Template *Bounds; } - __attribute ((aligned (sizeof (char *) * 2))) Fat_Pointer; +typedef struct {const char *Array; String_Template *Bounds; } String; +#define DECLARE_STRING(s, a, t) \ + __attribute__ ((aligned (sizeof (char *) * 2))) String s = { a, t } /* Types for Node/Entity Kinds: */ Index: gcc-interface/decl.c === --- gcc-interface/decl.c (revision 209461) +++ gcc-interface/decl.c (working copy) @@ -8861,8 +8861,8 @@ create_concat_name (Entity_Id gnat_entit if (suffix) { String_Template temp = {1, (int) strlen (suffix)}; - Fat_Pointer fp = {suffix, temp}; - Get_External_Name_With_Suffix (gnat_entity, fp); + DECLARE_STRING (s, suffix, temp); + Get_External_Name_With_Suffix (gnat_entity, s); } else Get_External_Name (gnat_entity, 0); Index: gcc-interface/trans.c === --- gcc-interface/trans.c (revision 209461) +++ gcc-interface/trans.c (working copy) @@ -7833,7 +7833,6 @@ gnat_gimplify_stmt (tree *stmt_p) gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond, build_int_cst (integer_type_node, annot_expr_ivdep_kind)); - if (LOOP_STMT_NO_VECTOR (stmt)) gnu_cond = build2 (ANNOTATE_EXPR, TREE_TYPE (gnu_cond), gnu_cond, build_int_cst (integer_type_node, @@ -9357,16 +9356,14 @@ void post_error
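The difference between over-aligning the type and over-aligning individual objects, which is what the types.h hunk above switches between, can be sketched in a few lines. This is an illustration only, assuming a GCC-compatible compiler with C11 support; the two `holds_*` structs and `is_doubly_aligned` are hypothetical names that do not appear in the patch, and `Bounds` is initialised with the address of a template because the field is a pointer.

```c
#include <stdint.h>

typedef struct { int Low_Bound, High_Bound; } String_Template;

/* Old declaration: the type itself carries twice the pointer alignment,
   so every record embedding it inherits the padding.  */
typedef struct { const char *Array; String_Template *Bounds; }
  __attribute__ ((aligned (sizeof (char *) * 2))) Fat_Pointer;

/* New declaration from the patch: a naturally aligned type ...  */
typedef struct { const char *Array; String_Template *Bounds; } String;

/* ... with the extra alignment requested per object.  */
#define DECLARE_STRING(s, a, t) \
  __attribute__ ((aligned (sizeof (char *) * 2))) String s = { a, t }

/* Hypothetical records, only here to show the effect on layout.  */
struct holds_fat_pointer { char tag; Fat_Pointer p; };
struct holds_string { char tag; String p; };

/* Return nonzero if the object at P sits on a boundary of twice the
   pointer size.  */
int
is_doubly_aligned (const void *p)
{
  return (uintptr_t) p % (sizeof (char *) * 2) == 0;
}
```

With the over-aligned type, the record that embeds it grows by the extra padding (the "holes in structures" complaint mentioned above), whereas an object declared through DECLARE_STRING still gets the doubled alignment without changing the layout of anything that contains a String.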
Re: [C PATCH] Make attributes accept enum values (PR c/50459)
On Wed, Apr 16, 2014 at 01:40:22PM -0400, Jason Merrill wrote: On 04/15/2014 03:56 PM, Marek Polacek wrote: The testsuite doesn't hit this code with C++, but does hit this code with C. The thing is, if we have e.g. enum { A = 128 }; void *fn1 (void) __attribute__((assume_aligned (A))); then handle_assume_aligned_attribute walks the attribute arguments and gets the argument via TREE_VALUE. If this argument is an enum value, then for C the argument is identifier_node that contains const_decl, Ah. Then I think the C parser should be fixed to check attribute_takes_identifier_p and look up the argument if false. Ok, thanks, I didn't know about attribute_takes_identifier_p. Should be done in the following. Regtested/bootstrapped on x86_64-linux. Ok now? 2014-04-17 Marek Polacek pola...@redhat.com PR c/50459 c-family/ * c-common.c (handle_aligned_attribute): Don't call default_conversion on FUNCTION_DECLs. (handle_vector_size_attribute): Likewise. (handle_sentinel_attribute): Call default_conversion and allow even integral types as an argument. c/ * c-parser.c (c_parser_attributes): If the attribute doesn't take an identifier, call lookup_name for arguments. testsuite/ * c-c++-common/pr50459.c: New test. 
diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index c0e247b..1443914 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -7539,7 +7539,8 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args, if (args) { align_expr = TREE_VALUE (args); - if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE) + if (align_expr TREE_CODE (align_expr) != IDENTIFIER_NODE + TREE_CODE (align_expr) != FUNCTION_DECL) align_expr = default_conversion (align_expr); } else @@ -8533,7 +8534,8 @@ handle_vector_size_attribute (tree *node, tree name, tree args, *no_add_attrs = true; size = TREE_VALUE (args); - if (size TREE_CODE (size) != IDENTIFIER_NODE) + if (size TREE_CODE (size) != IDENTIFIER_NODE + TREE_CODE (size) != FUNCTION_DECL) size = default_conversion (size); if (!tree_fits_uhwi_p (size)) @@ -8944,8 +8946,12 @@ handle_sentinel_attribute (tree *node, tree name, tree args, if (args) { tree position = TREE_VALUE (args); + if (position TREE_CODE (position) != IDENTIFIER_NODE + TREE_CODE (position) != FUNCTION_DECL) + position = default_conversion (position); - if (TREE_CODE (position) != INTEGER_CST) + if (TREE_CODE (position) != INTEGER_CST + || !INTEGRAL_TYPE_P (TREE_TYPE (position))) { warning (OPT_Wattributes, requested position is not an integer constant); diff --git gcc/c/c-parser.c gcc/c/c-parser.c index 5653e49..f8fe424 100644 --- gcc/c/c-parser.c +++ gcc/c/c-parser.c @@ -3912,6 +3912,7 @@ c_parser_attributes (c_parser *parser) || c_parser_next_token_is (parser, CPP_KEYWORD)) { tree attr, attr_name, attr_args; + bool attr_takes_id_p; vectree, va_gc *expr_list; if (c_parser_next_token_is (parser, CPP_COMMA)) { @@ -3922,6 +3923,7 @@ c_parser_attributes (c_parser *parser) attr_name = c_parser_attribute_any_word (parser); if (attr_name == NULL) break; + attr_takes_id_p = attribute_takes_identifier_p (attr_name); if (is_cilkplus_vector_p (attr_name)) { c_token *v_token = c_parser_peek_token (parser); @@ -3950,6 +3952,15 @@ 
c_parser_attributes (c_parser *parser) == CPP_CLOSE_PAREN))) { tree arg1 = c_parser_peek_token (parser)-value; + if (!attr_takes_id_p) + { + /* This is for enum values, so that they can be used as +an attribute parameter; lookup_name will find their +CONST_DECLs. */ + tree ln = lookup_name (arg1); + if (ln) + arg1 = ln; + } c_parser_consume_token (parser); if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN)) attr_args = build_tree_list (NULL_TREE, arg1); diff --git gcc/testsuite/c-c++-common/pr50459.c gcc/testsuite/c-c++-common/pr50459.c index e69de29..f954b32 100644 --- gcc/testsuite/c-c++-common/pr50459.c +++ gcc/testsuite/c-c++-common/pr50459.c @@ -0,0 +1,14 @@ +/* PR c/50459 */ +/* { dg-do compile } */ +/* { dg-options -Wall -Wextra } */ + +enum { A = 128, B = 1 }; +void *fn1 (void) __attribute__((assume_aligned (A))); +void *fn2 (void) __attribute__((assume_aligned (A, 4))); +void fn3 (void) __attribute__((constructor (A))); +void fn4 (void) __attribute__((destructor (A))); +void *fn5 (int) __attribute__((alloc_size (B))); +void *fn6 (int) __attribute__((alloc_align (B))); +void fn7 (const char *,
Re: Avoid unnecesary GGC runs during LTO
+
+  /* At this stage we know that majority of GGC memory is reachable.
+     Growing the limits prevents unnecesary invocation of GGC.  */
+  ggc_grow ();
   ggc_collect ();

Isn't the collect here pointless? I see not in ENABLE_CHECKING, but shouldn't this be abstracted away, thus call ggc_collect from ggc_grow? Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc and simply drop the ggc_collect above ().

I am fine with both. I basically decided to keep the explicit ggc_collect() to make it clear (from the lto.c source code) that we are GGC safe at this point, and to have a way to double check that we do not produce too much garbage with checking disabled (so with -Q I will see how much is collected at that place).

We can embed it into ggc_grow and document that w/o checking it is equivalent to ggc_collect.

Anyway, this is sth for stage1 at this point.

OK,
Honza

Ping... the patch saves 33 GGC runs during the libxul.so link, that is not that bad ;)

Honza

Thanks,
Richard.

   /* Set the hooks so that all of the ipa passes can read in their data.  */
Index: ggc-none.c
===
--- ggc-none.c (revision 209170)
+++ ggc-none.c (working copy)
@@ -63,3 +63,8 @@ ggc_free (void *p)
 {
   free (p);
 }
+
+void
+ggc_grow (void)
+{
+}
Index: ggc-page.c
===
--- ggc-page.c (revision 209170)
+++ ggc-page.c (working copy)
@@ -2095,6 +2095,19 @@ ggc_collect (void)
     fprintf (G.debug_file, "END COLLECTING\n");
 }
 
+/* Assume that all GGC memory is reachable and grow the limits for next collection.  */
+
+void
+ggc_grow (void)
+{
+#ifndef ENABLE_CHECKING
+  G.allocated_last_gc = MAX (G.allocated_last_gc,
+                             G.allocated);
+#endif
+  if (!quiet_flag)
+    fprintf (stderr, " {GC start %luk} ", (unsigned long) G.allocated / 1024);
+}
+
 /* Print allocation statistics.  */
 #define SCALE(x) ((unsigned long) ((x) < 1024*10 \
                   ? (x) \

--
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
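Why raising allocated_last_gc avoids collections can be seen in a toy model of the bookkeeping. This is only a sketch under assumptions: the two field names follow the patch, but the struct, the percentage parameter, and the exact threshold formula are hypothetical simplifications and not the real ggc-page.c logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a collector's bookkeeping.  The first two field names
   follow the patch; everything else is hypothetical.  */
struct toy_gc
{
  size_t allocated;          /* bytes currently allocated */
  size_t allocated_last_gc;  /* bytes considered live after the last GC */
  unsigned min_expand_pct;   /* growth in percent tolerated before a GC */
};

/* A collection is considered worthwhile once the heap has grown past
   the previous live size plus the tolerated expansion.  */
bool
toy_gc_would_collect (const struct toy_gc *gc)
{
  size_t threshold = gc->allocated_last_gc
                     + gc->allocated_last_gc / 100 * gc->min_expand_pct;
  return gc->allocated >= threshold;
}

/* Counterpart of ggc_grow: declare everything currently allocated to be
   live, which moves the threshold up without scanning any memory.  */
void
toy_gc_grow (struct toy_gc *gc)
{
  if (gc->allocated > gc->allocated_last_gc)
    gc->allocated_last_gc = gc->allocated;
}
```

In this model a heap that has grown fivefold since the last collection would normally trigger a run; after the grow call the same heap sits below the new threshold, and a collection only becomes due again after further substantial allocation.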
RE: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)
Hello,

Unfortunately, the optimization is limited to bit-fields that have the same bit-field representative (DECL_BIT_FIELD_REPRESENTATIVE), and fields from different classes do have different representatives. In the given example the optimization would merge the accesses to the x and y bit-fields from the Base class, but not the access to z from the Der class.

Regards,
Zoran

From: Daniel Gutson [daniel.gut...@tallertechnologies.com]
Sent: Wednesday, April 16, 2014 4:16 PM
To: Zoran Jovanovic
Cc: Bernhard Reutner-Fischer; Richard Biener; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] Add a new option -fmerge-bitfields (patch / doc inside)

On Wed, Apr 16, 2014 at 8:38 AM, Zoran Jovanovic zoran.jovano...@imgtec.com wrote:

Hello,

This is a new patch version. Lowering is applied only to bit-field copy sequences that are merged. The data structure representing bit-field copy sequences is renamed and reduced in size. The optimization is turned on by default for -O2 and higher. Some comments are fixed.

Benchmarking was performed on WebKit for Android. Code size reduction was noticed on several files; the best examples are:

core/rendering/style/StyleMultiColData (632->520 bytes)
core/platform/graphics/FontDescription (1715->1475 bytes)
core/rendering/style/FillLayer (5069->4513 bytes)
core/rendering/style/StyleRareInheritedData (5618->5346)
core/css/CSSSelectorList (4047->3887)
core/platform/animation/CSSAnimationData (3844->3440 bytes)
core/css/resolver/FontBuilder (13818->13350 bytes)
core/platform/graphics/Font (16447->15975 bytes)

Example:

One of the motivating examples for this work was the copy constructor of a class which contains bit-fields.

C++ code:

class A
{
public:
  A(const A &x);
  unsigned a : 1;
  unsigned b : 2;
  unsigned c : 4;
};

A::A(const A &x)
{
  a = x.a;
  b = x.b;
  c = x.c;
}

Very interesting.

Does this work with inheritance too? E.g.

struct Base
{
  uint32_t x:1;
  uint32_t y:3;
  Base(const Base& other) { x = other.x; y = other.y; }
};

struct Der : Base
{
  Der() = default;
  Der(const Der& other) : Base(other) { z = other.z; }
  uint32_t z:9;
};

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:

  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;

Generated MIPS32r2 assembly code without optimization:

  lw    $3,0($5)
  lbu   $2,0($4)
  andi  $3,$3,0x1
  andi  $2,$2,0xfe
  or    $2,$2,$3
  sb    $2,0($4)
  lw    $3,0($5)
  andi  $2,$2,0xf9
  andi  $3,$3,0x6
  or    $2,$2,$3
  sb    $2,0($4)
  lw    $3,0($5)
  andi  $2,$2,0x87
  andi  $3,$3,0x78
  or    $2,$2,$3
  j     $31
  sb    $2,0($4)

Optimized MIPS32r2 assembly code:

  lw    $3,0($5)
  lbu   $2,0($4)
  andi  $3,$3,0x7f
  andi  $2,$2,0x80
  or    $2,$3,$2
  j     $31
  sb    $2,0($4)

The algorithm works on the basic block level and consists of the following 3 major steps:

1. Go through the basic block's statement list. If there are statement pairs that implement a copy of bit-field content from one memory location to another, record the statement pointers and other necessary data in the corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as merged.
3. Lower bit-field accesses by using the new field size for those that can be merged.

A new command line option -fmerge-bitfields is introduced.

Tested - passed gcc regression tests for MIPS32r2.

Changelog -

gcc/ChangeLog:
2014-04-16 Zoran Jovanovic (zoran.jovano...@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Add reference to -fmerge-bitfields.
  * tree-sra.c (lower_bitfields): New function.
  Entry for (-fmerge-bitfields).
  (bf_access_candidate_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_bfcopy_pair::hash): New function.
  (bitfield_stmt_bfcopy_pair::equal): New function.
  (bitfield_stmt_bfcopy_pair::remove): New function.
  (create_and_insert_bfcopy): New function.
  (get_bit_offset): New function.
  (add_stmt_bfcopy_pair): New function.
  (cmp_bfcopies): New function.
  (get_merged_bit_field_size): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Move declaration to tree.h and make it extern.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  *
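The effect of merging the three bit-field copies from the class A example can be reproduced by hand with plain masks. The sketch below is an illustration only, not the tree-sra.c implementation: the function names are made up, and it assumes the layout implied by the masks in the MIPS listing (a in bit 0, b in bits 1-2, c in bits 3-6 of one byte).

```c
#include <assert.h>
#include <stdint.h>

/* Field-by-field copy of three adjacent bit-fields (a:1, b:2, c:4)
   held in the low seven bits of a byte, using the same masks as the
   unoptimized assembly.  */
uint8_t
copy_fields_separately (uint8_t dst, uint8_t src)
{
  dst = (uint8_t) ((dst & 0xfe) | (src & 0x01));  /* a: bit 0 */
  dst = (uint8_t) ((dst & 0xf9) | (src & 0x06));  /* b: bits 1-2 */
  dst = (uint8_t) ((dst & 0x87) | (src & 0x78));  /* c: bits 3-6 */
  return dst;
}

/* Merged copy: treat the adjacent fields as one field of WIDTH bits
   starting at bit 0, and preserve the remaining bits of DST.  */
uint8_t
copy_fields_merged (uint8_t dst, uint8_t src, unsigned width)
{
  assert (width >= 1 && width <= 7);
  uint8_t mask = (uint8_t) ((1u << width) - 1u);
  return (uint8_t) ((dst & (uint8_t) ~mask) | (src & mask));
}
```

For width 7 the merged version uses the masks 0x7f and 0x80, matching the optimized GIMPLE and assembly, and it produces the same byte as the three separate read-modify-write steps for every input pair.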
[C++ Patch] PR 59120
Hi,

we can fix this crash during error recovery very easily, by grouping together the two conditions that lead to the skip and early return, in complete analogy with the single-declaration case (note that we explicitly commit to the tentative parse at the beginning of the function, thus we are good).

Tested x86_64-linux.

Thanks,
Paolo.

///

/cp
2014-04-17  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/59120
        * parser.c (cp_parser_alias_declaration): Check return value of
        cp_parser_require.

/testsuite
2014-04-17  Paolo Carlini  paolo.carl...@oracle.com

        PR c++/59120
        * g++.dg/cpp0x/alias-decl-42.C: New.

Index: cp/parser.c
===
--- cp/parser.c (revision 209472)
+++ cp/parser.c (working copy)
@@ -16142,20 +16142,13 @@ cp_parser_alias_declaration (cp_parser* parser)
   if (parser->num_template_parameter_lists)
     parser->type_definition_forbidden_message = saved_message;
 
-  if (type == error_mark_node)
+  if (type == error_mark_node
+      || !cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
     {
       cp_parser_skip_to_end_of_block_or_statement (parser);
       return error_mark_node;
     }
 
-  cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
-
-  if (cp_parser_error_occurred (parser))
-    {
-      cp_parser_skip_to_end_of_block_or_statement (parser);
-      return error_mark_node;
-    }
-
   /* A typedef-name can also be introduced by an alias-declaration. The
      identifier following the using keyword becomes a typedef-name. It has
      the same semantics as if it were introduced by the typedef
Index: testsuite/g++.dg/cpp0x/alias-decl-42.C
===
--- testsuite/g++.dg/cpp0x/alias-decl-42.C (revision 0)
+++ testsuite/g++.dg/cpp0x/alias-decl-42.C (working copy)
@@ -0,0 +1,4 @@
+// PR c++/59120
+// { dg-do compile { target c++11 } }
+
+template<typename T> using X = int T::T*; // { dg-error "expected" }
[PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c
Hi all,

While looking at the build logs I noticed a warning while building tree-ssa-loop-ivcanon.c about a potential use of an uninitialised variable. This patchlet fixes that warning by initialising it to 0.

Tested arm-none-eabi.

Ok for trunk?

2014-04-17  Kyrylo Tkachov  kyrylo.tkac...@arm.com

        * tree-ssa-loop-ivcanon.c (canonicalize_loop_induction_variables):
        Initialise n_unroll to 0.

diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index cdf1559..7a83b12 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -656,7 +656,7 @@ try_unroll_loop_completely (struct loop *loop,
                             HOST_WIDE_INT maxiter,
                             location_t locus)
 {
-  unsigned HOST_WIDE_INT n_unroll, ninsns, max_unroll, unr_insns;
+  unsigned HOST_WIDE_INT n_unroll = 0, ninsns, max_unroll, unr_insns;
   gimple cond;
   struct loop_size size;
   bool n_unroll_found = false;
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
On 17 April 2014 16:51:23 Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:

On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote:

On 17 April 2014 16:07, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote:

Hi,

If you are trying to modify the libsanitizer files, please read here:
https://code.google.com/p/address-sanitizer/wiki/HowToContribute

I read that, thanks. Patch 3/3 is for the current compiler-rt git repo, please install it there, I do not have write access to the LLVM or compiler-rt trees.

I can commit your patch to the llvm tree only after you follow the process described on that page. Sorry, this is a hard rule.

What part of the process do you think I did not follow? I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu, then provided the corresponding GCC parts, along with a backport of the new bits that I expect to be overwritten once you do a new merge, leaving just the GCC configury bits.

This is how I read the wiki page you cite. Please tell me what you expect me to do differently?

Thanks,

--kcc
Re: fuse-caller-save - hook format
On 2014-04-17, 11:29 AM, Richard Sandiford wrote: Vladimir Makarov vmaka...@redhat.com writes: On 2014-04-16, 3:19 PM, Tom de Vries wrote: Vladimir, All patches for the fuse-caller-save optimization have been ok-ed. The only part not approved is the MIPS-specific part. The objection of Richard S. is not so much the patch itself, but more the idea of the hook fn_other_hard_reg_usage. For clarity, I'm restating the current hook definition here: ... +@deftypefn {Target Hook} bool TARGET_FN_OTHER_HARD_REG_USAGE (struct hard_reg_set_container *@var{regs}) Add any hard registers to @var{regs} that are set or clobbered by a call to the function. This hook only needs to add registers that cannot be found by examination of the final RTL representation of a function. This hook returns true if it managed to determine which registers need to be added. The default version of this hook returns false. ... Richard prefers to, rather than having a hook specifying what registers are implicitly clobbered, adding those clobbers to CALL_INSN_FUNCTION_USAGE. I can see these possibilities (and perhaps there are more): 1. We go with Richards proposal: we make each target responsible for adding these clobbers in CALL_INSN_FUNCTION_USAGE, and use a hook called f.i. targetm.fuse_caller_save_p or targetm.implicit_call_clobbers_in_fusage_p, to indicate whether a target has taken care of that, meaning it's safe to do the fuse-caller-save optimization. 2. A mixed solution: we make each target responsible for specifying which clobbers need to be added in CALL_INSN_FUNCTION_USAGE, using a hook called f.i. targetm.call_clobbered_regs, and add generic code to add those clobbers to CALL_INSN_FUNCTION_USAGE. 3. We stick with the current, approved hook format, and try to convince Richard to live with it. 
Since you are a register allocator maintainer, familiar with the fuse-caller-save optimization, and have approved the original hook, I would like to ask you to make a decision on how to proceed from here.

I have no preferences and it is a matter of taste. Each solution has its own advantages and disadvantages. Putting this info into CALL_INSN_FUNCTION_USAGE helps GCC development a lot, but it has a big drawback in RTL memory footprint (especially for some targets which have a lot of regs, like AM29k or IA64). On the other hand, an analogous approach is already used in the DF infrastructure (which it would be nice to fix, imho). Still, between GCC users and GCC developers, I'd prefer the solution that favours users (even if the effect on the amount of resources used by GCC is quite insignificant), as their number is a few orders of magnitude larger than that of the developers.

Hmm, but you're talking like there are going to be a lot of these registers.

Yes, you are right. That is what I thought. I should have read Tom's email with more attention.

This isn't about which registers are call-clobbered or call-saved according to the ABI (that's already available in other places). All we want here is the set of registers that are clobbered _in the caller_ before reaching the callee or after the callee has returned.

So although IA-64 has lots of registers, the caller doesn't AFAIK use lots of registers in the process of making the call. On all targets we should be talking about one or two registers here.

I see. I guess your proposed solution is ok then.
[PATCH] Fix warning in libgfortran configure script
Hi all,

While configuring libgfortran I'm getting this message:

libgfortran/configure: line 25938: test: =: unary operator expected

The script doesn't fail and continues afterwards, but I don't think it's supposed to give that warning. This patch makes it go away and makes the test consistent with other similar uses (a few lines below, $ac_cv_lib_rt_clock_gettime is quoted when used in a test structure).

configure.ac is updated and configure is regenerated with autoconf 2.64.

Ok for trunk? Made sure libgfortran builds for arm-none-eabi.

libgfortran/
2014-04-17 Kyrylo Tkachov kyrylo.tkac...@arm.com

	* configure.ac: Quote usage of ac_cv_func_clock_gettime in if test.
	* configure: Regenerate.

diff --git a/libgfortran/configure b/libgfortran/configure
index 23f57c7..d3ced74 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -25935,7 +25935,7 @@ fi
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for clock_gettime in -lrt" >&5
 $as_echo_n "checking for clock_gettime in -lrt... " >&6; }
 if test "${ac_cv_lib_rt_clock_gettime+set}" = set; then :
diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index de2d65e..24dbf2b 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -510,7 +510,7 @@ AC_CHECK_LIB([m],[feenableexcept],[have_feenableexcept=yes AC_DEFINE([HAVE_FEENA
 # test is copied from libgomp, and modified to not link in -lrt as
 # libgfortran calls clock_gettime via a weak reference if it's found
 # in librt.
-if test $ac_cv_func_clock_gettime = no; then
+if test "$ac_cv_func_clock_gettime" = no; then
   AC_CHECK_LIB(rt, clock_gettime,
     [AC_DEFINE(HAVE_CLOCK_GETTIME_LIBRT, 1,
       [Define to 1 if you have the `clock_gettime' function in librt.])])
Re: [PATCH 3/3] [LLVM] [sanitizer] add conditionals for libc
On Thu, Apr 17, 2014 at 8:45 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: On 17 April 2014 16:51:23 Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: On Thu, Apr 17, 2014 at 6:27 PM, Bernhard Reutner-Fischer rep.dot@gmail.com wrote: On 17 April 2014 16:07, Konstantin Serebryany konstantin.s.serebry...@gmail.com wrote: Hi, If you are trying to modify the libsanitizer files, please read here: https://code.google.com/p/address-sanitizer/wiki/HowToContribute I read that, thanks. Patch 3/3 is for current compiler-rt git repo, please install it there, i do not have write access to the LLVM nor compiler-rt trees. I can commit your patch to llvm tree only after you follow the process described on that page. Sorry, this is a hard rule. What part of the process do you think I did not follow? I made a patch for compiler-rt, sent it to llvm-comm...@cs.uiuc.edu then provided the corresponding GCC parts, along a backport of the new bits that I expect to be overwritten once you do a new merge, leaving just the GCC configuy bits. This is how I read the wiki page you cite. Please tell me what you expect me to do differently? First, I did not notice that you've sent it to llvm-commits because it was also sent to the gcc list (unusual thing to happen) and got filtered into the gcc part of my mail. Sorry. But second, the patch is far from trivial and you should not expect us to commit it w/o a careful review, so here comes another part of the wiki: For non-trivial patches please use Phabricator -- this will help us reply faster. --kcc Thanks, --kcc Sent with AquaMail for Android http://www.aqua-mail.com
RE: [PATCH] Fix uninitialised variable warning in tree-ssa-loop-ivcanon.c
Hello! I am not against it. However, I think there is no danger: I see no potential use of an uninitialized variable, since the use of n_unroll is guarded by n_unroll_found. Best regards, Daniel Marjamäki
PATCH: PR target/60868: [4.9/4.10 Regression] ICE: in int_mode_for_mode, at stor-layout.c:400 with -minline-all-stringops -minline-stringops-dynamically -march=core2
Hi,

GET_MODE returns VOIDmode on CONST_INT. It happens with -O0. This patch uses counter_mode on count_exp to get the mode. Tested on Linux/x86-64 without regressions. OK for trunk and 4.9 branch?

Thanks.

H.J.
---
gcc/

2014-04-17 H.J. Lu hongjiu...@intel.com

	PR target/60868
	* config/i386/i386.c (ix86_expand_set_or_movmem): Call counter_mode
	on count_exp to get mode.

gcc/testsuite/

2014-04-17 H.J. Lu hongjiu...@intel.com

	PR target/60868
	* gcc.target/i386/pr60868.c: New testcase.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 536f50f..7a68623 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24392,7 +24392,8 @@ ix86_expand_set_or_movmem (rtx dst, rtx src, rtx count_exp, rtx val_exp,
       if (jump_around_label == NULL_RTX)
 	jump_around_label = gen_label_rtx ();
       emit_cmp_and_jump_insns (count_exp, GEN_INT (dynamic_check - 1),
-			       LEU, 0, GET_MODE (count_exp), 1, hot_label);
+			       LEU, 0, counter_mode (count_exp),
+			       1, hot_label);
       predict_jump (REG_BR_PROB_BASE * 90 / 100);
       if (issetmem)
 	set_storage_via_libcall (dst, count_exp, val_exp, false);
diff --git a/gcc/testsuite/gcc.target/i386/pr60868.c b/gcc/testsuite/gcc.target/i386/pr60868.c
new file mode 100644
index 000..c30bbfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60868.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -minline-all-stringops -minline-stringops-dynamically -march=core2" } */
+
+void bar (float *);
+
+void foo (void)
+{
+  float b[256] = {0};
+  bar(b);
+}
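The failure mode described above can be sketched with a toy model: a constant carries no mode of its own, so code that needs a real mode must fall back to something sensible. The types and helpers below (`toy_rtx`, `toy_get_mode`, `toy_counter_mode`) are hypothetical stand-ins, and the fallback rule is an assumption made for illustration; the real counter_mode in i386.c may choose differently.

```c
#include <assert.h>

/* Hypothetical, simplified machine modes.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

/* Hypothetical, simplified rtx: either a register with a mode or a
   constant integer, which has no mode.  */
struct toy_rtx
{
  int is_const_int;
  enum toy_mode mode;   /* meaningful only when !is_const_int */
  long long value;      /* meaningful only when is_const_int */
};

/* Models GET_MODE: constants report VOIDmode.  */
static enum toy_mode
toy_get_mode (const struct toy_rtx *x)
{
  return x->is_const_int ? TOY_VOIDmode : x->mode;
}

/* Models the role of counter_mode: never hand VOIDmode to the caller.
   Assumption: pick the narrowest toy mode that can hold the constant.  */
static enum toy_mode
toy_counter_mode (const struct toy_rtx *count_exp)
{
  enum toy_mode mode = toy_get_mode (count_exp);

  if (mode != TOY_VOIDmode)
    return mode;
  assert (count_exp->is_const_int);
  if (count_exp->value & ~0xffffffffLL)
    return TOY_DImode;
  return TOY_SImode;
}
```

With a constant count such as the 1024-byte array in the testcase, `toy_get_mode` yields `TOY_VOIDmode` while `toy_counter_mode` yields a usable mode, which is the difference the patch relies on.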
Re: Avoid unnecesary GGC runs during LTO
On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: + + /* At this stage we know that majority of GGC memory is reachable. + Growing the limits prevents unnecesary invocation of GGC. */ + ggc_grow (); ggc_collect (); Isn't the collect here pointless? I see not in ENABLE_CHECKING, but shouldn't this be abstracted away, thus call ggc_collect from ggc_grow? Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc and simply drop the ggc_collect above (). I am fine with both. I basically decided to keep the explicit ggc_collect() to make it clear (from lto.c source code) that we are GGC safe at this point and to have way to double check that we do not produce too much of garbage with checking disabled. (so with -Q I will see how much it is collected at that place). We can embed it into ggc_grow and document that w/o checking it is equivalent to ggc_cooect. Anyway, this is sth for stage1 at this point. OK, Honza Ping... the patches saves 33 GGC runs during libxul.so link, that is not that bad ;) What is the updated patch you propose? Richard Honza Thanks, Richard. /* Set the hooks so that all of the ipa passes can read in their data. */ Index: ggc-none.c === --- ggc-none.c (revision 209170) +++ ggc-none.c (working copy) @@ -63,3 +63,8 @@ ggc_free (void *p) { free (p); } + +void +ggc_grow (void) +{ +} Index: ggc-page.c === --- ggc-page.c (revision 209170) +++ ggc-page.c (working copy) @@ -2095,6 +2095,19 @@ ggc_collect (void) fprintf (G.debug_file, END COLLECTING\n); } +/* Assume that all GGC memory is reachable and grow the limits for next collection. */ + +void +ggc_grow (void) +{ +#ifndef ENABLE_CHECKING + G.allocated_last_gc = MAX (G.allocated_last_gc, + G.allocated); +#endif + if (!quiet_flag) +fprintf (stderr, {GC start %luk} , (unsigned long) G.allocated / 1024); +} + /* Print allocation statistics. */ #define SCALE(x) ((unsigned long) ((x) 1024*10 \ ? 
(x) \ -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
Re: Avoid unnecesary GGC runs during LTO
On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: + + /* At this stage we know that majority of GGC memory is reachable. + Growing the limits prevents unnecesary invocation of GGC. */ + ggc_grow (); ggc_collect (); Isn't the collect here pointless? I see not in ENABLE_CHECKING, but shouldn't this be abstracted away, thus call ggc_collect from ggc_grow? Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc and simply drop the ggc_collect above (). I am fine with both. I basically decided to keep the explicit ggc_collect() to make it clear (from lto.c source code) that we are GGC safe at this point and to have way to double check that we do not produce too much of garbage with checking disabled. (so with -Q I will see how much it is collected at that place). We can embed it into ggc_grow and document that w/o checking it is equivalent to ggc_cooect. Anyway, this is sth for stage1 at this point. OK, Honza Ping... the patches saves 33 GGC runs during libxul.so link, that is not that bad ;) What is the updated patch you propose? I was trying to explain, why I kept explicit ggc_collect just after ggc_grow: I want to make it clear that we are ggc safe at that point. I also want to see the ggc run happening w/o checking to have -Q report how much of garbage we see at this stage so I can keep eye on it. I can hide ENABLE_CHECKING ggc_collect call in ggc_grow and update documentation if your preffer. Honza Richard Honza Thanks, Richard. /* Set the hooks so that all of the ipa passes can read in their data. */ Index: ggc-none.c === --- ggc-none.c (revision 209170) +++ ggc-none.c (working copy) @@ -63,3 +63,8 @@ ggc_free (void *p) { free (p); } + +void +ggc_grow (void) +{ +} Index: ggc-page.c === --- ggc-page.c (revision 209170) +++ ggc-page.c (working copy) @@ -2095,6 +2095,19 @@ ggc_collect (void) fprintf (G.debug_file, END COLLECTING\n); } +/* Assume that all GGC memory is reachable and grow the limits for next collection. 
*/ + +void +ggc_grow (void) +{ +#ifndef ENABLE_CHECKING + G.allocated_last_gc = MAX (G.allocated_last_gc, + G.allocated); +#endif + if (!quiet_flag) +fprintf (stderr, {GC start %luk} , (unsigned long) G.allocated / 1024); +} + /* Print allocation statistics. */ #define SCALE(x) ((unsigned long) ((x) 1024*10 \ ? (x) \ -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
Re: [RFC] Add aarch64 support for ada
On 04/17/2014 08:35 AM, Tristan Gingold wrote:
What about this compile-time check:

package Fatptralign is
   type String_Acc is access String;
   type Integer_acc is access Integer;
   pragma Compile_Time_Error (String_Acc'Alignment = 1 * Integer_Acc'Alignment, "Fat pointer are simply aligned");
   pragma Compile_Time_Error (String_Acc'Alignment = 2 * Integer_Acc'Alignment, "Fat pointer are doubly aligned");
end Fatptralign;

Yes, that seems to work, even with a cross-compiler.

r~
Re: [PATCH] C++ thunk section names
Ping.

On Wed, Feb 5, 2014 at 4:31 PM, Sriraman Tallam tmsri...@google.com wrote:
Hi, I would like this patch reviewed and considered for commit when Stage 1 is active again.

Patch Description: A C++ thunk's section name is set to be the same as the section name of the original function for which the thunk was created, in order to place the two together. This is done in cp/method.c in function use_thunk. However, with function reordering turned on, the original function's section name can change to something like .text.hot.original or .text.unlikely.original in function default_function_section in varasm.c, based on the node count of that function. The thunk function's section name is not updated to match the original there, and it is not always correct to do so either, as the original function can be hotter than the thunk. I have created a patch to not name the thunk function's section the same as the original function's when function reordering is enabled.

Thanks
Sri
Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program
Hi,

I think for 4.10 we should revisit inliner behaviour to be more LTO and LTO+FDO ready. This is the first of the small patches I made to sanitize the behaviour of the current bounds.

The main problem LTO brings is that we get way too many inline candidates. In the per-file model only a small percentage of calls is inlinable, since most of them go to other units, so our current heuristics behave quite well, usually inlining all calls they consider beneficial. With LTO almost all calls are inlinable, and if we inline everything we consider profitable we get insane code size growth, so practically always we hit our 30% unit growth threshold. This is not always a good idea.

Reducing inline-insns-auto/inline-insns-single to keep the inliner from hitting the growth limit would cause a regression on benchmarks that need inlining of large functions.

LLVM seems to get around the problem by doing code-expanding inlining at compile time (in the equivalent of our early inliner). This makes functions big, so the LTO doesn't inline much, but it also misses useful cross-module inlines and replaces them by less useful inter-module ones.

Another approach would be to have inline-insns-crossmodule that is significantly smaller than inline-insns-auto. We already have a crossmodule hint that probably ought to be made smarter to not fire on COMDAT functions. I do not want to do it, since the numbers I collected in http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html suggest that inline-insns-auto is already quite a bad limit.

I would be happy to hear about alternative solutions to this. We may want to switch the whole-program inliner to a temperature-style bound, like open64 does.

Well, this patch actually goes in a bit different direction: making the unit growth threshold more sane. While looking into inliner behaviour on Firefox to write my blog entry, I noticed that with profile feedback only a very small portion of the program is trained (15%) and only around 7% of the code contains something that we consider hot. The inliner however still hits the inline-unit-growth limit with:

Unit growth for small function inlining: 7232256->9220597 (27%)
Inlined 183353 calls, eliminated 54652 function

We do not grow the code in the cold portions of the program, but because of the dead padding we grow everything we consider hot 4 times, instead of 1.3 times as we would usually do if it was unpadded.

This patch fixes the problem by considering only non-cold functions for frequency calculation. We now get:

Unit growth for small function inlining: 2083217->2537163 (21%)
Inlined 134611 calls, eliminated 53586 functions

So while the relative growth is still close to 30%, the absolute growth is only 22% of the previous one. We inline fewer calls, but in the dynamic stats there is only a very minor (sub 0.01%) difference.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

	* ipa-inline.c (inline_small_functions): Account only non-cold functions.
	* doc/invoke.texi (inline-unit-growth): Update documentation.

Index: ipa-inline.c
===
--- ipa-inline.c	(revision 209461)
+++ ipa-inline.c	(working copy)
@@ -1585,7 +1590,10 @@ inline_small_functions (void)
 	struct inline_summary *info = inline_summary (node);
 	struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node->aux;
 
-	if (!DECL_EXTERNAL (node->decl))
+	/* Do not account external functions, they will be optimized out
+	   if not inlined.  Also only count the non-cold portion of program.  */
+	if (!DECL_EXTERNAL (node->decl)
+	    && node->frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED)
 	  initial_size += info->size;
 	info->growth = estimate_growth (node);
 	if (dfs && dfs->next_cycle)
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 209461)
+++ doc/invoke.texi	(working copy)
@@ -9409,7 +9409,8 @@ before applying @option{--param inline-u
 @item inline-unit-growth
 Specifies maximal overall growth of the compilation unit caused by inlining.
 The default value is 30 which limits unit growth to 1.3 times the original
-size.
+size.  Cold functions (either marked cold via an attribibute or by profile
+feedback) are not accounted into the unit size.
 
 @item ipcp-unit-growth
 Specifies maximal overall growth of the compilation unit caused by
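The accounting change in this message can be illustrated with a small sketch: sum the sizes of only non-external, non-cold functions to get the initial unit size, then derive the growth budget from it. The `toy_*` names and the simple budget formula are assumptions for illustration; the real inliner's bookkeeping has more inputs than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node frequency classes.  */
enum toy_frequency { TOY_UNLIKELY_EXECUTED, TOY_NORMAL, TOY_HOT };

/* Hypothetical, simplified call graph node.  */
struct toy_node
{
  long size;
  int external;                  /* nonzero for external functions */
  enum toy_frequency frequency;
};

/* Initial unit size: skip external functions and cold functions.  */
static long
toy_initial_size (const struct toy_node *nodes, size_t n)
{
  long total = 0;

  for (size_t i = 0; i < n; i++)
    if (!nodes[i].external
        && nodes[i].frequency != TOY_UNLIKELY_EXECUTED)
      total += nodes[i].size;
  return total;
}

/* Size budget for a given unit growth percentage (30 means 1.3 times).  */
static long
toy_max_size (long initial_size, int unit_growth_percent)
{
  return initial_size + initial_size * unit_growth_percent / 100;
}

/* Relative growth in whole percent, truncated as in the dump output.  */
static int
toy_growth_percent (long before, long after)
{
  assert (before > 0);
  return (int) ((after - before) * 100 / before);
}
```

Plugging in the figures quoted above, `toy_growth_percent (7232256, 9220597)` gives 27 and `toy_growth_percent (2083217, 2537163)` gives 21, while the absolute growth drops from 1988341 to 453946 units, roughly 22% of the earlier figure.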
Re: Avoid unnecesary GGC runs during LTO
On April 17, 2014 7:18:05 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: On April 17, 2014 6:03:13 PM CEST, Jan Hubicka hubi...@ucw.cz wrote: + + /* At this stage we know that majority of GGC memory is reachable. + Growing the limits prevents unnecesary invocation of GGC. */ + ggc_grow (); ggc_collect (); Isn't the collect here pointless? I see not in ENABLE_CHECKING, but shouldn't this be abstracted away, thus call ggc_collect from ggc_grow? Or maybe rather even for ENABLE_CHECKING adjust G.allocated_last_gc and simply drop the ggc_collect above (). I am fine with both. I basically decided to keep the explicit ggc_collect() to make it clear (from lto.c source code) that we are GGC safe at this point and to have way to double check that we do not produce too much of garbage with checking disabled. (so with -Q I will see how much it is collected at that place). We can embed it into ggc_grow and document that w/o checking it is equivalent to ggc_cooect. Anyway, this is sth for stage1 at this point. OK, Honza Ping... the patches saves 33 GGC runs during libxul.so link, that is not that bad ;) What is the updated patch you propose? I was trying to explain, why I kept explicit ggc_collect just after ggc_grow: I want to make it clear that we are ggc safe at that point. I also want to see the ggc run happening w/o checking to have -Q report how much of garbage we see at this stage so I can keep eye on it. I can hide ENABLE_CHECKING ggc_collect call in ggc_grow and update documentation if your preffer. I'd prefer that. OK with that change. Thanks, Richard. Honza Richard Honza Thanks, Richard. /* Set the hooks so that all of the ipa passes can read in their data. 
*/ Index: ggc-none.c === --- ggc-none.c(revision 209170) +++ ggc-none.c(working copy) @@ -63,3 +63,8 @@ ggc_free (void *p) { free (p); } + +void +ggc_grow (void) +{ +} Index: ggc-page.c === --- ggc-page.c(revision 209170) +++ ggc-page.c(working copy) @@ -2095,6 +2095,19 @@ ggc_collect (void) fprintf (G.debug_file, END COLLECTING\n); } +/* Assume that all GGC memory is reachable and grow the limits for next collection. */ + +void +ggc_grow (void) +{ +#ifndef ENABLE_CHECKING + G.allocated_last_gc = MAX (G.allocated_last_gc, + G.allocated); +#endif + if (!quiet_flag) +fprintf (stderr, {GC start %luk} , (unsigned long) G.allocated / 1024); +} + /* Print allocation statistics. */ #define SCALE(x) ((unsigned long) ((x) 1024*10 \ ? (x) \ -- Richard Biener rguent...@suse.de SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
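As a rough illustration of why raising the recorded baseline avoids collector runs, here is a toy model of the threshold logic discussed in this thread. The `toy_gc` structure, its field names, and the fixed expansion percentage are assumptions for the sketch; the real ggc-page.c keeps more state and also honours a minimum heap size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified collector state.  */
struct toy_gc
{
  size_t allocated;           /* bytes currently allocated */
  size_t allocated_last_gc;   /* bytes live after the last collection */
  unsigned min_expand_percent;
  unsigned collections;       /* how many times we actually collected */
};

/* A collection only runs once the heap has grown by MIN_EXPAND_PERCENT
   over what was live after the previous collection.  */
static int
toy_gc_would_collect (const struct toy_gc *g)
{
  size_t min_expand = g->allocated_last_gc * g->min_expand_percent / 100;

  return g->allocated >= g->allocated_last_gc + min_expand;
}

/* Toy collection.  Assumption: everything turns out to be reachable,
   which is the situation the patch targets.  */
static void
toy_ggc_collect (struct toy_gc *g)
{
  if (!toy_gc_would_collect (g))
    return;
  g->collections++;
  g->allocated_last_gc = g->allocated;
}

/* Models ggc_grow: assume all memory is reachable and raise the baseline
   without walking the heap.  */
static void
toy_ggc_grow (struct toy_gc *g)
{
  if (g->allocated > g->allocated_last_gc)
    g->allocated_last_gc = g->allocated;
}
```

In this model, calling `toy_ggc_grow` before `toy_ggc_collect` turns the following collect into a no-op, which is the effect of the non-checking path in the patch; with checking enabled the real code keeps collecting.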
Re: [gomp4] Add tables generation
On 27 Mar 17:16, Jakub Jelinek wrote: On Thu, Mar 27, 2014 at 08:13:00PM +0400, Ilya Verbin wrote: On 27 Mar 15:02, Jakub Jelinek wrote: The tables need to be created before IPA, that way it really shouldn't matter in what order you emit them. E.g. the outlined target functions could be added to the table during ompexp pass which actually creates the outlined functions, the vars need to be added before target lto or host lto is streamed. For host tables it's ok, but when target compiler will create tables with functions? It reads bytecode from target_lto sections, so it never executes ompexp pass. Which is why the table created for host by the ompexp pass should be streamed into the target_lto sections (marked specially somehow, special attribute or whatever), and then corresponding target table created from that, rather then created from some possibly different ordering there. Jakub Hi Jakub, Could you please take a look at this patch? It fixes the ordering issue in the tables stated above, and passes all the tests that I have. But I'm not sure about its correctness from the architectural point of view. --- gcc/lto-cgraph.c | 93 ++ gcc/lto-section-in.c | 3 +- gcc/lto-streamer-out.c | 2 ++ gcc/lto-streamer.h | 3 ++ gcc/lto/lto.c | 2 ++ gcc/omp-low.c | 68 +++- 6 files changed, 115 insertions(+), 56 deletions(-) diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index 544f04b..3d6637e 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -82,6 +82,8 @@ enum LTO_symtab_tags LTO_symtab_last_tag }; +extern vectree, va_gc *offload_funcs, *offload_vars; + /* Create a new symtab encoder. if FOR_INPUT, the encoder allocate only datastructures needed to read the symtab. */ @@ -958,6 +960,51 @@ output_symtab (void) output_refs (encoder); } +void +output_offload_tables (void) +{ + /* Collect all omp-target global variables to offload_vars, if they have not + been gathered earlier by input_offload_tables. 
*/ + if (vec_safe_is_empty (offload_vars)) +{ + struct varpool_node *vnode; + FOR_EACH_DEFINED_VARIABLE (vnode) + { + if (!lookup_attribute (omp declare target, +DECL_ATTRIBUTES (vnode-decl)) + || TREE_CODE (vnode-decl) != VAR_DECL + || DECL_SIZE (vnode-decl) == 0) + continue; + vec_safe_push (offload_vars, vnode-decl); + } +} + + if (vec_safe_is_empty (offload_funcs) vec_safe_is_empty (offload_vars)) +return; + + struct lto_simple_output_block *ob += lto_create_simple_output_block (LTO_section_offload_table); + + for (unsigned i = 0; i vec_safe_length (offload_funcs); i++) +{ + streamer_write_enum (ob-main_stream, LTO_symtab_tags, + LTO_symtab_last_tag, LTO_symtab_unavail_node); + lto_output_fn_decl_index (ob-decl_state, ob-main_stream, + (*offload_funcs)[i]); +} + + for (unsigned i = 0; i vec_safe_length (offload_vars); i++) +{ + streamer_write_enum (ob-main_stream, LTO_symtab_tags, + LTO_symtab_last_tag, LTO_symtab_variable); + lto_output_var_decl_index (ob-decl_state, ob-main_stream, +(*offload_vars)[i]); +} + + streamer_write_uhwi_stream (ob-main_stream, 0); + lto_destroy_simple_output_block (ob); +} + /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS, STACK_SIZE, SELF_TIME and SELF_SIZE. 
This is called either to initialize NODE or to replace the values in it, for instance because the first @@ -1611,6 +1658,52 @@ input_symtab (void) } } +void +input_offload_tables (void) +{ + struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data (); + struct lto_file_decl_data *file_data; + unsigned int j = 0; + + while ((file_data = file_data_vec[j++])) +{ + const char *data; + size_t len; + struct lto_input_block *ib + = lto_create_simple_input_block (file_data, LTO_section_offload_table, +data, len); + if (!ib) + continue; + + enum LTO_symtab_tags tag + = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag); + while (tag) + { + if (tag == LTO_symtab_unavail_node) + { + int decl_index = streamer_read_uhwi (ib); + tree fn_decl + = lto_file_decl_data_get_fn_decl (file_data, decl_index); + vec_safe_push (offload_funcs, fn_decl); + } + else if (tag == LTO_symtab_variable) + { + int decl_index = streamer_read_uhwi (ib); + tree var_decl + = lto_file_decl_data_get_var_decl (file_data, decl_index); + vec_safe_push (offload_vars, var_decl); + } +
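The ordering argument in this thread (the target table should be built from what the host streamed, not rediscovered in some other order) can be sketched with a toy tagged stream. The tag values, the flat `unsigned` buffer, and the `toy_*` functions are hypothetical and only mimic the shape of the patch's output and input loops, which terminate the table with a zero tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags; zero terminates the table, as in the patch's
   "while (tag)" reader loop.  */
enum toy_tag { TOY_TAG_END = 0, TOY_TAG_FUNC = 1, TOY_TAG_VAR = 2 };

/* Write functions, then variables, each as a (tag, decl index) pair,
   followed by the terminator.  Returns the number of words written.
   OUT must have room for 2 * (NFUNCS + NVARS) + 1 words.  */
static size_t
toy_write_offload_table (unsigned *out,
                         const unsigned *funcs, size_t nfuncs,
                         const unsigned *vars, size_t nvars)
{
  size_t n = 0;

  for (size_t i = 0; i < nfuncs; i++)
    {
      out[n++] = TOY_TAG_FUNC;
      out[n++] = funcs[i];
    }
  for (size_t i = 0; i < nvars; i++)
    {
      out[n++] = TOY_TAG_VAR;
      out[n++] = vars[i];
    }
  out[n++] = TOY_TAG_END;
  return n;
}

/* Read the table back, appending entries in stream order so that the
   reader ends up with the same ordering the writer used.  */
static void
toy_read_offload_table (const unsigned *in,
                        unsigned *funcs, size_t *nfuncs,
                        unsigned *vars, size_t *nvars)
{
  size_t pos = 0;
  unsigned tag;

  *nfuncs = 0;
  *nvars = 0;
  while ((tag = in[pos++]) != TOY_TAG_END)
    {
      unsigned decl_index = in[pos++];

      if (tag == TOY_TAG_FUNC)
        funcs[(*nfuncs)++] = decl_index;
      else if (tag == TOY_TAG_VAR)
        vars[(*nvars)++] = decl_index;
    }
}
```

Because the reader appends in stream order, both sides see the entries in the same sequence, which is the property the host and target tables need.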
Re: Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program
On Thu, 2014-04-17 at 19:52 +0200, Jan Hubicka wrote: [...] Index: doc/invoke.texi === --- doc/invoke.texi (revision 209461) +++ doc/invoke.texi (working copy) @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u @item inline-unit-growth Specifies maximal overall growth of the compilation unit caused by inlining. The default value is 30 which limits unit growth to 1.3 times the original -size. +size. Cold functions (either marked cold via an attribibute or by profile

FWIW, there's a trivial typo here: "attribibute" should be "attribute".
Go patch commited: Mark various expressions as immutable
This patch from Chris Manghane marks various expression types as immutable: numerics, constants, type info, address of, type conversion when appropriate. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 194e0f47c9e5 go/expressions.cc --- a/go/expressions.cc Wed Apr 16 13:33:13 2014 -0700 +++ b/go/expressions.cc Thu Apr 17 11:57:28 2014 -0700 @@ -555,6 +555,10 @@ { return true; } bool + do_is_immutable() const + { return true; } + + bool do_numeric_constant_value(Numeric_constant* nc) const { nc-set_unsigned_long(NULL, 0); @@ -1422,6 +1426,10 @@ do_is_constant() const { return true; } + bool + do_is_immutable() const + { return true; } + Type* do_type(); @@ -1790,6 +1798,10 @@ { return true; } bool + do_is_immutable() const + { return true; } + + bool do_numeric_constant_value(Numeric_constant* nc) const; Type* @@ -2109,6 +2121,10 @@ { return true; } bool + do_is_immutable() const + { return true; } + + bool do_numeric_constant_value(Numeric_constant* nc) const { nc-set_float(this-type_, this-val_); @@ -2292,6 +2308,10 @@ { return true; } bool + do_is_immutable() const + { return true; } + + bool do_numeric_constant_value(Numeric_constant* nc) const { nc-set_complex(this-type_, this-real_, this-imag_); @@ -2506,6 +2526,10 @@ { return true; } bool + do_is_immutable() const + { return true; } + + bool do_numeric_constant_value(Numeric_constant* nc) const; bool @@ -2994,6 +3018,9 @@ do_is_constant() const; bool + do_is_immutable() const; + + bool do_numeric_constant_value(Numeric_constant*) const; bool @@ -3175,6 +3202,27 @@ return true; } +// Return whether a type conversion is immutable. 
+
+bool
+Type_conversion_expression::do_is_immutable() const
+{
+  Type* type = this->type_;
+  Type* expr_type = this->expr_->type();
+
+  if (type->interface_type() != NULL
+      || expr_type->interface_type() != NULL)
+    return false;
+
+  if (!this->expr_->is_immutable())
+    return false;
+
+  if (Type::are_identical(type, expr_type, false, NULL))
+    return true;
+
+  return type->is_basic_type() && expr_type->is_basic_type();
+}
+
 // Return the constant numeric value if there is one.
 
 bool
@@ -3599,7 +3647,8 @@
 
   bool
   do_is_immutable() const
-  { return this->expr_->is_immutable(); }
+  { return this->expr_->is_immutable()
+      || (this->op_ == OPERATOR_AND && this->expr_->is_variable()); }
 
   bool
   do_numeric_constant_value(Numeric_constant*) const;
@@ -14076,6 +14125,10 @@
   { }
 
  protected:
+  bool
+  do_is_immutable() const
+  { return true; }
+
   Type*
   do_type();
Re: Inliner heuristics TLC 1/n - let small function inlinng to ignore cold portion of program
This looks fine. LIPO has similar change too. Other directions worth looking into: 1) To model icache effect better, weighted callee size need to be used with profile. The weight for BB may look like: min(1, FREQ(BB)/FREQ(ENTRY)). 2) When function splitting is turned on, are any inline heuristic changes are needed? E.g. only consider the hot code part of node for unit growth computation? We are also looking into more aggressive approach to track per loop (inter-procedural) region growth limit, instead of using one single global limit. David On Thu, Apr 17, 2014 at 10:52 AM, Jan Hubicka hubi...@ucw.cz wrote: Hi, I think for 4.10 we should revisit inliner behaviour to be more LTO and LTO+FDO ready. This is first of small patches I made to sanitize behaviour of current bounds. The main problem LTO brings is that we get way too many inline candidates. In per-file model one gets only small percentage of calls inlinable, since most of them go to other units, so our current heuristics behave quite well, inlining usually all calls that it consider benefical. With LTO almost all calls are inlinable and if we inline everything we consider profitable we get insane code size growths, so practically always we hit our 30% unit growth threshold. This is not always a good idea. Reducing inline-insns-auto/inline-insns-single to avoid inliner hitting the growth limit would cause a regression on benchmarks that needs inlining of large functions. LLVM seems to get around the problem by doing code expanding inlining at compile time (in equivalent of our early inliner). This makes functions big, so the LTO doesn't inline much, but it also misses useful cross-module inlines and replace them by less usefull inter-module. Other approach would be to have inline-insns-crossmodule that is significantly smaller than inline-insns-auto. We already have crossmodule hint that probably ought to be made smarter to not fire on COMDAT functions. 
I do not want to do it, since the numbers I collected in http://hubicka.blogspot.ca/2014/04/devirtualization-in-c-part-5-feedback.html suggest that inline-insns-auto is already quite bad limit. I would be happy to hear about alternative solutions to this. We may want to switch whole program inliner into temperature style bound, like open64 does. Well, this patch actually goes bit different direction - making unit growth threashold more sane. While looking into inliner behaviour at Firefox to write my blog entry I noticed that with profile feedback only very small portion of the program is trained (15%) and only around 7% of code contains something that we consider hot. Inliner however still hits the inline-unit-growth limit with: Unit growth for small function inlining: 7232256-9220597 (27%) Inlined 183353 calls, eliminated 54652 function We do not grow the code in the cold portions of program, but because of the dead padding we grow everything we consider hot 4 times, instead of 1.3 times as we would usually do if it was unpadded. This patch fixes the problem by considering only non-cold functions for frequency calculation. We now get: Unit growth for small function inlining: 2083217-2537163 (21%) Inlined 134611 calls, eliminated 53586 functions So while the relative growth is still close to 30%, the absolute growth is only 22% of the previous one. We inline fewer calls but in the dynamic stats there is very minor (sub 0.01%) diference. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza * ipa-inline.c (inline_small_functions): Account only non-cold functions. * doc/invoke.texi (inline-unit-growth): Update documentation. 
Index: ipa-inline.c === --- ipa-inline.c(revision 209461) +++ ipa-inline.c(working copy) @@ -1585,7 +1590,10 @@ inline_small_functions (void) struct inline_summary *info = inline_summary (node); struct ipa_dfs_info *dfs = (struct ipa_dfs_info *) node-aux; - if (!DECL_EXTERNAL (node-decl)) + /* Do not account external functions, they will be optimized out + if not inlined. Also only count the non-cold portion of program. */ + if (!DECL_EXTERNAL (node-decl) +node-frequency != NODE_FREQUENCY_UNLIKELY_EXECUTED) initial_size += info-size; info-growth = estimate_growth (node); if (dfs dfs-next_cycle) Index: doc/invoke.texi === --- doc/invoke.texi (revision 209461) +++ doc/invoke.texi (working copy) @@ -9409,7 +9409,8 @@ before applying @option{--param inline-u @item inline-unit-growth Specifies maximal overall growth of the compilation unit caused by inlining. The default value is 30 which limits unit growth to 1.3 times the original -size. +size.
RE: [PATCH v7?] PR middle-end/60281
Hi Lin,

On Thu, 17 Apr 2014 22:29:14, Lin Zuojian wrote:
Hi Bernd, I have my copyright mark signed and the process has completed. Now I am going to answer two more questions before my patch can be committed, right?
[Did you copy any files or text written by someone else in these changes?] No.
[Which files have you changed so far, and which new files have you written so far?] gcc/asan.c gcc/ChangeLog gcc/cfgexpand.c
Okay, you may review my patch again; if there is no problem, please commit it for me.
-- Regards lin zuojian

I am not sure if your patch was already approved by a global GCC reviewer. That is however absolutely necessary before it can be committed. I think it would be best to re-submit the latest version of your patch now, and ask a global reviewer for approval. The message should be sent to gcc-patches@gcc.gnu.org and contain the following information in addition to the proposed patch itself and the change-log entry:
a) On which target(s) did you bootstrap your patch?
b) Did you run the testsuite?
c) When you compare the test results with and without the patch, were there any regressions?

Regards
Bernd.
Go patch committed: Only convert function type when necessary
This patch to the Go frontend fixes it to not convert the function type in a call when calling an interface method. The function type of an interface method is not correct, since it does not include the receiver, but the type of the method field is correct, and as such should not be converted. This is PR 60870. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Tested by Ulrich Weigand on PPC. Committed to mainline.

Ian

diff -r 43e2635914c2 go/expressions.cc
--- a/go/expressions.cc	Thu Apr 17 12:09:37 2014 -0700
+++ b/go/expressions.cc	Thu Apr 17 12:24:08 2014 -0700
@@ -9619,9 +9619,20 @@
       fn = Expression::make_compound(set_closure, fn, location);
     }
 
-  Btype* bft = fntype->get_backend_fntype(gogo);
   Bexpression* bfn = tree_to_expr(fn->get_tree(context));
-  bfn = gogo->backend()->convert_expression(bft, bfn, location);
+
+  // When not calling a named function directly, use a type conversion
+  // in case the type of the function is a recursive type which refers
+  // to itself.  We don't do this for an interface method because 1)
+  // an interface method never refers to itself, so we always have a
+  // function type here; 2) we pass an extra first argument to an
+  // interface method, so fntype is not correct.
+  if (func == NULL && !is_interface_method)
+    {
+      Btype* bft = fntype->get_backend_fntype(gogo);
+      bfn = gogo->backend()->convert_expression(bft, bfn, location);
+    }
+
   Bexpression* call = gogo->backend()->call_expression(bfn, fn_args, location);
   if (this->results_ != NULL)
Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.
On 15/04/14 22:58, DJ Delorie wrote:
> I typically leave the mode off when the operand accepts a CONST_INT as
> I've had problems with patterns matching CONST_INTs otherwise, as
> CONST_INT rtx's do not have a mode (or have VOIDmode).
>
> (yes, I know gcc is supposed to accommodate that, but like I said, I've
> had problems...)

Ok, that's fine. I was just trying to mop up one little bit of the sea of warnings.

It seems a little inconsistent, however, that *movqi_real and *xorqi3_real don't specify modes but *movhi_real and *andqi_real/*iorqi_real do (and they also accept CONST_INTs). Not that I'm advocating generating more warnings, but my inner OCD likes consistency :)

Richard.
Re: [patch] change specific int128 - generic intN
On Tue, 15 Apr 2014, DJ Delorie wrote:
> I wasn't sure what to do with that array, since it was static and
> couldn't have empty slots in them like the arrays in tree.h. Also, do
> we need to have *every* type in that list? What's the rule for whether
> a type gets installed there or not? The comment says "guaranteed to be
> in the runtime support" but does that mean for this particular build
> (wrt multilibs) as not all intN types are guaranteed (even the int128
> types were not guaranteed to be supported before my patch).
>
> In other parts of the patch, just taking out the special case for
> __int128 was sufficient to do the right thing for all __intN types.

You need someone who understands this better than me (ask Jason). To be able to throw/catch a type, you need some typeinfo symbols. The front-end generates that for classes when they are defined. For fundamental types, it assumes libsupc++ will provide it, and the function you are modifying is the one generating libsupc++ (I am surprised your patch didn't cause any failure on x86_64, at least in abi_check). We need to generate the typeinfo for __intN, either in libsupc++, or in each TU, and since both cases will require code, I assume libsupc++ is preferable.

> I can certainly put the intN types in there, but note that it would
> mean regenerating the fundamentals[] array at runtime to include those
> types which are supported at the time.

After the patch I linked, it should just mean calling the helper function on your new types, no need to touch the array.

> Do the entries in the array need to be in a particular order?

No, any random order would do.

-- 
Marc Glisse
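To make the typeinfo point concrete, here is a minimal sketch, assuming a GCC-compatible compiler and a target where `__int128` is available (detected through `__SIZEOF_INT128__`). The names `wide_int_t` and `catch_wide_int` are hypothetical and not part of any patch in this thread. Throwing and catching a value of a fundamental type makes the program refer to that type's typeinfo object, which for fundamental types is expected to come from the runtime library rather than from the translation unit itself.

```cpp
#include <cassert>

#ifdef __SIZEOF_INT128__
typedef __int128 wide_int_t;   // assumption: the target supports __int128
#else
typedef long long wide_int_t;  // fallback so the sketch still compiles
#endif

// Throw a value of a fundamental type and catch it by that type.
// Matching the handler needs the type's typeinfo object at run time.
bool catch_wide_int()
{
  try
    {
      throw static_cast<wide_int_t>(42);
    }
  catch (wide_int_t v)
    {
      return v == 42;
    }
  catch (...)
    {
      return false;
    }
}
```

If the typeinfo symbol for the thrown type were missing from the runtime, a program like this would be expected to fail at link time, which is why every supported __intN type would need an entry.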
Re: [RFC] Add aarch64 support for ada
On 04/17/2014 08:56 AM, Eric Botcazou wrote:
> I presume that the attached kludge is sufficient to make it work?
>
>         * fe.h (Compiler_Abort): Replace Fat_Pointer by String.
>         (Error_Msg_N): Likewise.
>         (Error_Msg_NE): Likewise.
>         (Get_External_Name_With_Suffix): Likewise.
>         * types.h (Fat_Pointer): Delete.
>         (String): New type.
>         (DECLARE_STRING): New macro.
>         * gcc-interface/decl.c (create_concat_name): Adjust.
>         * gcc-interface/trans.c (post_error): Likewise.
>         (post_error_ne): Likewise.
>         * gcc-interface/misc.c (internal_error_function): Likewise.

Yes, this bootstrapped.

r~
Go patch committed: Use backend interface for constant expressions
This patch from Chris Manghane changes the Go frontend to use the backend interface for global constants. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

2014-04-17  Chris Manghane  cm...@google.com

        * go-gcc.cc (Gcc_backend::named_constant_expression): New function.

Index: gcc/go/go-gcc.cc
===================================================================
--- gcc/go/go-gcc.cc (revision 209494)
+++ gcc/go/go-gcc.cc (revision 209495)
@@ -227,6 +227,10 @@ class Gcc_backend : public Backend
   indirect_expression(Bexpression* expr, bool known_valid, Location);
 
   Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+                            Bexpression* val, Location);
+
+  Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val);
 
   Bexpression*
@@ -962,6 +966,29 @@ Gcc_backend::indirect_expression(Bexpres
   return tree_to_expr(ret);
 }
 
+// Return an expression that declares a constant named NAME with the
+// constant value VAL in BTYPE.
+
+Bexpression*
+Gcc_backend::named_constant_expression(Btype* btype, const std::string& name,
+                                       Bexpression* val, Location location)
+{
+  tree type_tree = btype->get_tree();
+  tree const_val = val->get_tree();
+  if (type_tree == error_mark_node || const_val == error_mark_node)
+    return this->error_expression();
+
+  tree name_tree = get_identifier_from_string(name);
+  tree decl = build_decl(location.gcc_location(), CONST_DECL, name_tree,
+                         type_tree);
+  DECL_INITIAL(decl) = const_val;
+  TREE_CONSTANT(decl) = 1;
+  TREE_READONLY(decl) = 1;
+
+  go_preserve_from_gc(decl);
+  return this->make_expression(decl);
+}
+
 // Return a typed value as a constant integer.
 
 Bexpression*
Index: gcc/go/gofrontend/gogo-tree.cc
===================================================================
--- gcc/go/gofrontend/gogo-tree.cc (revision 209494)
+++ gcc/go/gofrontend/gogo-tree.cc (revision 209495)
@@ -1015,44 +1015,22 @@ Named_object::get_tree(Gogo* gogo, Named
     {
     case NAMED_OBJECT_CONST:
       {
-        Named_constant* named_constant = this->u_.const_value;
         Translate_context subcontext(gogo, function, NULL, NULL);
-        tree expr_tree = named_constant->expr()->get_tree(&subcontext);
-        if (expr_tree == error_mark_node)
-          decl = error_mark_node;
-        else
+        Type* type = this->u_.const_value->type();
+        Location loc = this->location();
+
+        Expression* const_ref = Expression::make_const_reference(this, loc);
+        Bexpression* const_decl =
+          tree_to_expr(const_ref->get_tree(&subcontext));
+        if (type != NULL && type->is_numeric_type())
           {
-            Type* type = named_constant->type();
-            if (type != NULL && !type->is_abstract())
-              {
-                if (type->is_error())
-                  expr_tree = error_mark_node;
-                else
-                  {
-                    Btype* btype = type->get_backend(gogo);
-                    expr_tree = fold_convert(type_to_tree(btype), expr_tree);
-                  }
-              }
-            if (expr_tree == error_mark_node)
-              decl = error_mark_node;
-            else if (INTEGRAL_TYPE_P(TREE_TYPE(expr_tree)))
-              {
-                tree name = get_identifier_from_string(this->get_id(gogo));
-                decl = build_decl(named_constant->location().gcc_location(),
-                                  CONST_DECL, name, TREE_TYPE(expr_tree));
-                DECL_INITIAL(decl) = expr_tree;
-                TREE_CONSTANT(decl) = 1;
-                TREE_READONLY(decl) = 1;
-              }
-            else
-              {
-                // A CONST_DECL is only for an enum constant, so we
-                // shouldn't use for non-integral types.  Instead we
-                // just return the constant itself, rather than a
-                // decl.
-                decl = expr_tree;
-              }
+            Btype* btype = type->get_backend(gogo);
+            std::string name = this->get_id(gogo);
+            const_decl =
+              gogo->backend()->named_constant_expression(btype, name,
+                                                         const_decl, loc);
           }
+        decl = expr_to_tree(const_decl);
       }
       break;
 
Index: gcc/go/gofrontend/backend.h
===================================================================
--- gcc/go/gofrontend/backend.h (revision 209494)
+++ gcc/go/gofrontend/backend.h (revision 209495)
@@ -257,6 +257,12 @@ class Backend
   virtual Bexpression*
   indirect_expression(Bexpression* expr, bool known_valid, Location) = 0;
 
+  // Return an expression that declares a constant named NAME with the
+  // constant value VAL in BTYPE.
+  virtual Bexpression*
+  named_constant_expression(Btype* btype, const std::string& name,
+                            Bexpression* val, Location) = 0;
+
   // Return an expression for the multi-precision integer VAL in BTYPE.
   virtual Bexpression*
   integer_constant_expression(Btype* btype, mpz_t val) = 0;
Index: gcc/go/gofrontend/expressions.cc
===================================================================
--- gcc/go/gofrontend/expressions.cc (revision 209494)
+++ gcc/go/gofrontend/expressions.cc (revision 209495)
@@ -2792,12 +2792,12 @@ Const_expression::do_get_tree(Translate_
   // If the type has been set for this expression, but the underlying
   // object is an abstract int
Re: Patch ping
On Wed, Apr 16, 2014 at 11:35 PM, Jeff Law l...@redhat.com wrote:
>> I'd like to ping 2 patches:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html
>> - Ensure GET_MODE_{SIZE,INNER,NUNITS} (const) is constant rather than
>>   memory load after optimization (I'd like to keep the current
>>   MODE_SIZE patch for the reasons mentioned there, but also add this
>>   patch)
>
> This is fine. Per the follow-up discussion, I think you can mark it as
> resolving 36109 as well.
>
>> http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00131.html
>> - PR target/59617 handle gather loads for AVX512 (at least non-masked
>>   ones, masked ones will need to wait for 5.0 and we need to find how
>>   to represent it in GIMPLE)
>
> I'll leave this to Uros :-)

IIRC, this patch was already committed to 4.9 some time ago.

Uros.
[PATCH], PR target/60876 -- fix build issue with powerpc
I committed the following patch as obvious to fix the PowerPC build issue that came up with changes to machmode.h. These changes allow the compiler to build and bootstrap. Submitted as subversion id 209498.

2014-04-17  Michael Meissner  meiss...@linux.vnet.ibm.com

        PR target/60876
        * config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Make sure
        GET_MODE_SIZE gets passed an enum machine_mode type and not integer.
        (rs6000_init_hard_regno_mode_ok): Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 209494)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -2329,6 +2329,8 @@ rs6000_setup_reg_addr_masks (void)
 
   for (m = 0; m < NUM_MACHINE_MODES; ++m)
     {
+      enum machine_mode m2 = (enum machine_mode)m;
+
       /* SDmode is special in that we want to access it only via REG+REG
          addressing on power7 and above, since we want to use the LFIWZX and
          STFIWZX instructions to load it.  */
@@ -2363,13 +2365,13 @@ rs6000_setup_reg_addr_masks (void)
 
           if (TARGET_UPDATE
               && (rc == RELOAD_REG_GPR || rc == RELOAD_REG_FPR)
-              && GET_MODE_SIZE (m) <= 8
-              && !VECTOR_MODE_P (m)
-              && !COMPLEX_MODE_P (m)
+              && GET_MODE_SIZE (m2) <= 8
+              && !VECTOR_MODE_P (m2)
+              && !COMPLEX_MODE_P (m2)
               && !indexed_only_p
-              && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m) == 8)
-              && !(m == DFmode && TARGET_UPPER_REGS_DF)
-              && !(m == SFmode && TARGET_UPPER_REGS_SF))
+              && !(TARGET_E500_DOUBLE && GET_MODE_SIZE (m2) == 8)
+              && !(m2 == DFmode && TARGET_UPPER_REGS_DF)
+              && !(m2 == SFmode && TARGET_UPPER_REGS_SF))
             {
               addr_mask |= RELOAD_REG_PRE_INCDEC;
 
@@ -2815,6 +2817,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
 
       for (m = 0; m < NUM_MACHINE_MODES; ++m)
         {
+          enum machine_mode m2 = (enum machine_mode)m;
           int reg_size2 = reg_size;
 
           /* TFmode/TDmode always takes 2 registers, even in VSX.  */
@@ -2823,7 +2826,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
             reg_size2 = UNITS_PER_FP_WORD;
 
           rs6000_class_max_nregs[m][c]
-            = (GET_MODE_SIZE (m) + reg_size2 - 1) / reg_size2;
+            = (GET_MODE_SIZE (m2) + reg_size2 - 1) / reg_size2;
         }
     }
 
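For readers wondering why the cast is needed at all, here is a minimal sketch, assuming the build failure comes from C++ having no implicit conversion from `int` to an enumeration type once the size lookup takes a real `enum machine_mode` parameter. The reduced enum, `mode_size` and `count_modes_up_to` are hypothetical stand-ins, not GCC's definitions.

```cpp
#include <cassert>

// Hypothetical, much-reduced stand-in for GCC's machine_mode enum.
enum machine_mode { QImode, HImode, SImode, DImode, NUM_MACHINE_MODES };

// Stand-in for GET_MODE_SIZE as a function taking the enum type, so an
// int argument is rejected by a C++ compiler.
inline int mode_size(machine_mode mode)
{
  static const int sizes[NUM_MACHINE_MODES] = { 1, 2, 4, 8 };
  return sizes[mode];
}

// Count the modes whose size is at most LIMIT bytes, iterating with an
// int counter the way the rs6000 loops do.
int count_modes_up_to(int limit)
{
  int count = 0;
  for (int m = 0; m < NUM_MACHINE_MODES; ++m)
    {
      // mode_size (m) would not compile here: C++ does not convert
      // int to enum implicitly, so cast once and reuse the result.
      machine_mode m2 = static_cast<machine_mode>(m);
      if (mode_size(m2) <= limit)
        ++count;
    }
  return count;
}
```

The array index in the patch keeps using the integer `m`, which is fine; only the calls that now expect the enum type need `m2`.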
Re: [RFC][PATCH] RL78 - clean-up of missing operand mode warnings.
> It seems a little inconsistent, however, that *movqi_real and
> *xorqi3_real don't specify modes but *movhi_real and
> *andqi_real/*iorqi_real do (and they also accept CONST_INTs). Not that
> I'm advocating generating more warnings, but my inner OCD likes
> consistency :)

Adding the mode might be the right way, but I've seen cases where it wasn't. My paranoia supersedes my OCD ;-)
libgo patch committed: Avoid unnecessary gccgo extension
This patch from Peter Collingbourne avoids an unnecessary gccgo extension in libgo. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 801009f33610 libgo/go/syscall/libcall_posix.go
--- a/libgo/go/syscall/libcall_posix.go Thu Apr 17 16:01:58 2014 -0700
+++ b/libgo/go/syscall/libcall_posix.go Thu Apr 17 16:03:58 2014 -0700
@@ -138,7 +138,7 @@
 //sys Select(nfd int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) (n int, err error)
 //select(nfd _C_int, r *FdSet, w *FdSet, e *FdSet, timeout *Timeval) _C_int
 
-const nfdbits = int(unsafe.Sizeof(fds_bits_type) * 8)
+const nfdbits = int(unsafe.Sizeof(fds_bits_type(0)) * 8)
 
 type FdSet struct {
         Bits [(FD_SETSIZE + nfdbits - 1) / nfdbits]fds_bits_type
libgo patch committed: Use delete rather than old map deletion syntax
This patch from Peter Collingbourne changes libgo to use the builtin delete function rather than the old map deletion syntax. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 1e27a38c43ea libgo/go/syscall/syscall_unix.go
--- a/libgo/go/syscall/syscall_unix.go Thu Apr 17 16:13:05 2014 -0700
+++ b/libgo/go/syscall/syscall_unix.go Thu Apr 17 16:17:50 2014 -0700
@@ -153,7 +153,7 @@
         if errno := m.munmap(uintptr(unsafe.Pointer(&b[0])), uintptr(len(b))); errno != nil {
                 return errno
         }
-        m.active[p] = nil, false
+        delete(m.active, p)
         return nil
 }
 
libgo patch committed: Avoid duplicate function declarations in syscall
This patch from Peter Collingbourne avoids duplicate function declarations in the generated libcalls.go file when building the syscall package. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 5009262c3e56 libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk Thu Apr 17 16:26:57 2014 -0700
+++ b/libgo/go/syscall/mksyscall.awk Thu Apr 17 16:30:29 2014 -0700
@@ -96,8 +96,11 @@
     cfnresult = line
 
     printf("// Automatically generated wrapper for %s/%s\n", gofnname, cfnname)
-    printf("//extern %s\n", cfnname)
-    printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+    if (!(cfnname in cfns)) {
+        cfns[cfnname] = 1
+        printf("//extern %s\n", cfnname)
+        printf("func c_%s(%s) %s\n", cfnname, cfnparams, cfnresult)
+    }
     printf("func %s(%s) %s%s%s%s{\n",
            gofnname, gofnparams, gofnresults == "" ? "" : "(", gofnresults,
            gofnresults == "" ? "" : ")", gofnresults == "" ? "" : " ")
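The awk change is a plain "emit once per key" guard keyed on the C function name. The same idea can be sketched in C++ as below, purely as an illustration: `extern_decls` is a hypothetical helper, and it returns only the `//extern` lines rather than the full generated wrapper text.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Return one "//extern" line per distinct C function name, in
// first-seen order.  Repeated names are skipped, mirroring the
// "cfnname in cfns" test in the awk script.
std::vector<std::string> extern_decls(const std::vector<std::string>& cfnnames)
{
  std::set<std::string> seen;
  std::vector<std::string> out;
  for (const std::string& name : cfnnames)
    {
      // insert().second is true only the first time NAME is inserted.
      if (seen.insert(name).second)
        out.push_back("//extern " + name);
    }
  return out;
}
```

Several Go wrappers may map onto the same C function, so the Go-side wrapper is still generated every time while the C declaration is printed only on the first occurrence.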
[PATCH, rs6000, 4.8, 4.9, trunk] Fix little endian behavior of vec_merge[hl] for V4SI/V4SF with VSX
Hi,

I missed a case in the vector API work for little endian. When VSX is enabled, the vec_mergeh and vec_mergel interfaces for 4x32 vectors are translated into xxmrghw and xxmrglw. The patterns for these were not adjusted for little endian. This patch fixes this and adds tests for V4SI and V4SF modes when VSX is available.

Bootstrapped and tested on 4.8, 4.9, and trunk for powerpc64le-unknown-linux-gnu with no regressions. Tests are still ongoing for powerpc64-unknown-linux-gnu. Provided those complete without regressions, is this fix ok for trunk, 4.9, and 4.8?

Thanks,
Bill

[gcc]

2014-04-17  Bill Schmidt  wschm...@linux.vnet.ibm.com

        * config/rs6000/vsx.md (vsx_xxmrghw_<mode>): Adjust for
        little-endian.
        (vsx_xxmrglw_<mode>): Likewise.

[gcc/testsuite]

2014-04-17  Bill Schmidt  wschm...@linux.vnet.ibm.com

        * gcc.dg/vmx/merge-vsx.c: Add V4SI and V4SF tests.
        * gcc.dg/vmx/merge-vsx-be-order.c: Likewise.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (revision 209513)
+++ gcc/config/rs6000/vsx.md (working copy)
@@ -1891,7 +1891,12 @@
           (parallel [(const_int 0) (const_int 4)
                      (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrghw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "xxmrghw %x0,%x1,%x2";
+  else
+    return "xxmrglw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "vsx_xxmrglw_<mode>"
@@ -1903,7 +1908,12 @@
           (parallel [(const_int 2) (const_int 6)
                      (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "xxmrglw %x0,%x1,%x2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "xxmrglw %x0,%x1,%x2";
+  else
+    return "xxmrghw %x0,%x2,%x1";
+}
   [(set_attr "type" "vecperm")])
 
 ;; Shift left double by word immediate
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c (revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c (working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
 
   /* Result vectors.  */
   vector long long vlh, vll;
   vector double vdh, vdl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector float vfh, vfl;
 
   /* Expected result vectors.  */
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
@@ -32,11 +41,23 @@ static void test()
   vector long long vlrl = {0,-2};
   vector double vdrh = {1.0,-1.0};
   vector double vdrl = {0.0,-2.0};
+  vector unsigned int vuirh = {6,2,7,3};
+  vector unsigned int vuirl = {4,0,5,1};
+  vector signed int vsirh = {2,-2,3,-1};
+  vector signed int vsirl = {0,-4,1,-3};
+  vector float vfrh = {2.0,-2.0,3.0,-1.0};
+  vector float vfrl = {0.0,-4.0,1.0,-3.0};
 #else
   vector long long vlrh = {-2,0};
   vector long long vlrl = {-1,1};
   vector double vdrh = {-2.0,0.0};
   vector double vdrl = {-1.0,1.0};
+  vector unsigned int vuirh = {0,4,1,5};
+  vector unsigned int vuirl = {2,6,3,7};
+  vector signed int vsirh = {-4,0,-3,1};
+  vector signed int vsirl = {-2,2,-1,3};
+  vector float vfrh = {-4.0,0.0,-3.0,1.0};
+  vector float vfrl = {-2.0,2.0,-1.0,3.0};
 #endif
 
   vlh = vec_mergeh (vla, vlb);
@@ -43,9 +64,21 @@ static void test()
   vll = vec_mergel (vla, vlb);
   vdh = vec_mergeh (vda, vdb);
   vdl = vec_mergel (vda, vdb);
+  vuih = vec_mergeh (vuia, vuib);
+  vuil = vec_mergel (vuia, vuib);
+  vsih = vec_mergeh (vsia, vsib);
+  vsil = vec_mergel (vsia, vsib);
+  vfh = vec_mergeh (vfa, vfb );
+  vfl = vec_mergel (vfa, vfb );
 
   check (vec_long_long_eq (vlh, vlrh), "vlh");
   check (vec_long_long_eq (vll, vlrl), "vll");
   check (vec_double_eq (vdh, vdrh), "vdh" );
   check (vec_double_eq (vdl, vdrl), "vdl" );
+  check (vec_all_eq (vuih, vuirh), "vuih");
+  check (vec_all_eq (vuil, vuirl), "vuil");
+  check (vec_all_eq (vsih, vsirh), "vsih");
+  check (vec_all_eq (vsil, vsirl), "vsil");
+  check (vec_all_eq (vfh, vfrh), "vfh");
+  check (vec_all_eq (vfl, vfrl), "vfl");
 }
Index: gcc/testsuite/gcc.dg/vmx/merge-vsx.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/merge-vsx.c (revision 209513)
+++ gcc/testsuite/gcc.dg/vmx/merge-vsx.c (working copy)
@@ -21,10 +21,19 @@ static void test()
   vector long long vlb = {0,1};
   vector double vda = {-2.0,-1.0};
   vector double vdb = {0.0,1.0};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa =
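As a rough scalar model of what the merge patterns select, here is a minimal sketch in plain C++ with no vector intrinsics. It follows the element indices in the two `parallel` lists of the vsx.md patterns (0,4,1,5 and 2,6,3,7 over the concatenation of the two inputs). The type `v4si` and the functions `merge_high` and `merge_low` are hypothetical names for this illustration.

```cpp
#include <array>
#include <cassert>

typedef std::array<int, 4> v4si;

// Elements 0,4,1,5 of the 8-element concatenation of A and B:
// interleave the first two elements of each input.
v4si merge_high(const v4si& a, const v4si& b)
{
  v4si r = {{ a[0], b[0], a[1], b[1] }};
  return r;
}

// Elements 2,6,3,7 of the concatenation: interleave the last two
// elements of each input.
v4si merge_low(const v4si& a, const v4si& b)
{
  v4si r = {{ a[2], b[2], a[3], b[3] }};
  return r;
}
```

With the inputs {0,1,2,3} and {4,5,6,7} this model gives {0,4,1,5} and {2,6,3,7}, the values in the `#else` branch of the expected-result block in merge-vsx-be-order.c.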
[PATCH v8] PR middle-end/60281
Hi,
Here is the patch after Jakub's review; Jakub also helped with the coding style.

--

        * asan.c (asan_emit_stack_protection): Force the base to align to
        appropriate bits if STRICT_ALIGNMENT. Set shadow_mem align to
        appropriate bits if STRICT_ALIGNMENT.
        * cfgexpand.c (expand_stack_vars): Set base_align appropriately
        when asan is on.
        (expand_used_vars): Leave a space in the stack frame for alignment
        if STRICT_ALIGNMENT.

---
 gcc/ChangeLog   |  9 +++++++++
 gcc/asan.c      | 15 +++++++++++++++
 gcc/cfgexpand.c | 18 ++++++++++++++++--
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index da35be8..30a2b33 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2014-04-18  Lin Zuojian  manjian2...@gmail.com
+        PR middle-end/60281
+        * asan.c (asan_emit_stack_protection): Force the base to align to
+        appropriate bits if STRICT_ALIGNMENT. Set shadow_mem align to
+        appropriate bits if STRICT_ALIGNMENT.
+        * cfgexpand.c (expand_stack_vars): Set base_align appropriately
+        when asan is on.
+        (expand_used_vars): Leave a space in the stack frame for alignment
+        if STRICT_ALIGNMENT.
 2014-04-17  Jakub Jelinek  ja...@redhat.com
 
         PR target/60847
diff --git a/gcc/asan.c b/gcc/asan.c
index 53992a8..28a476f 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1017,8 +1017,17 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
       base_align_bias = ((asan_frame_size + alignb - 1)
                          & ~(alignb - HOST_WIDE_INT_1)) - asan_frame_size;
     }
+  /* Align base if target is STRICT_ALIGNMENT.  */
+  if (STRICT_ALIGNMENT)
+    base = expand_binop (Pmode, and_optab, base,
+                         gen_int_mode (-((GET_MODE_ALIGNMENT (SImode)
+                                          << ASAN_SHADOW_SHIFT)
+                                         / BITS_PER_UNIT), Pmode), NULL_RTX,
+                         1, OPTAB_DIRECT);
+
   if (use_after_return_class == -1 && pbase)
     emit_move_insn (pbase, base);
+
   base = expand_binop (Pmode, add_optab, base,
                        gen_int_mode (base_offset - base_align_bias, Pmode),
                        NULL_RTX, 1, OPTAB_DIRECT);
@@ -1097,6 +1106,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
               && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
   shadow_mem = gen_rtx_MEM (SImode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+  if (STRICT_ALIGNMENT)
+    set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
   prev_offset = base_offset;
   for (l = length; l; l -= 2)
     {
@@ -1186,6 +1197,10 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
 
   shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
   set_mem_alias_set (shadow_mem, asan_shadow_set);
+
+  if (STRICT_ALIGNMENT)
+    set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
+
   prev_offset = base_offset;
   last_offset = base_offset;
   last_size = 0;
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index b7f6360..14511e1 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -1013,10 +1013,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
               if (data->asan_base == NULL)
                 data->asan_base = gen_reg_rtx (Pmode);
               base = data->asan_base;
+
+              if (!STRICT_ALIGNMENT)
+                base_align = crtl->max_used_stack_slot_alignment;
+              else
+                base_align = MAX (crtl->max_used_stack_slot_alignment,
+                                  GET_MODE_ALIGNMENT (SImode)
+                                  << ASAN_SHADOW_SHIFT);
             }
           else
-            offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
-          base_align = crtl->max_used_stack_slot_alignment;
+            {
+              offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
+              base_align = crtl->max_used_stack_slot_alignment;
+            }
         }
       else
         {
@@ -1845,6 +1854,11 @@ expand_used_vars (void)
             = alloc_stack_frame_space (redzonesz, ASAN_RED_ZONE_SIZE);
           data.asan_vec.safe_push (prev_offset);
           data.asan_vec.safe_push (offset);
+          /* Leave space for alignment if STRICT_ALIGNMENT.  */
+          if (STRICT_ALIGNMENT)
+            alloc_stack_frame_space ((GET_MODE_ALIGNMENT (SImode)
+                                      << ASAN_SHADOW_SHIFT)
+                                     / BITS_PER_UNIT, 1);
 
           var_end_seq
             = asan_emit_stack_protection (virtual_stack_vars_rtx,
-- 
1.8.3.2

--
Regards
lin zuojian
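The alignment arithmetic in the patch can be checked with a small standalone sketch. It assumes a 32-bit alignment for SImode, 8 bits per unit, a shadow shift of 3, and a shadow address computed as the application address shifted right by that amount (the constant shadow offset is left out and assumed to be word-aligned). All names below are hypothetical stand-ins for the GCC macros.

```cpp
#include <cassert>
#include <cstdint>

// Assumed stand-ins for GET_MODE_ALIGNMENT (SImode), BITS_PER_UNIT and
// ASAN_SHADOW_SHIFT.
const std::uintptr_t kSImodeAlignBits = 32;
const std::uintptr_t kBitsPerUnit = 8;
const unsigned kShadowShift = 3;

// Byte alignment the frame base needs so that its shadow address is
// aligned for a 4-byte store.
std::uintptr_t base_align_bytes()
{
  return (kSImodeAlignBits << kShadowShift) / kBitsPerUnit;
}

// Round BASE down with the same "base & -align" mask the patch emits.
std::uintptr_t align_base(std::uintptr_t base)
{
  return base & (~base_align_bytes() + 1);
}

// Shadow address without the constant shadow offset.
std::uintptr_t shadow_of(std::uintptr_t addr)
{
  return addr >> kShadowShift;
}
```

Under these assumptions the base is rounded down to a 32-byte boundary, so the corresponding shadow address is a multiple of 4 and SImode stores to the shadow memory are aligned on strict-alignment targets.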
Re: [PATCH v8] PR middle-end/60281
Hi Bernd,

> a) On which target(s) did you boot-strap your patch?
I only ran it on x86. I can't run it on ARM, because Android is neither a POSIX system nor a System V compatible system. And my code does not affect x86.

> b) Did you run the testsuite?
Yes, but again my code does not affect x86.

> c) When you compare the test results with and without the patch, were there any regressions?
The only difference is that the bug has gone. My app can run on my Android ARM system.

On Fri, Apr 18, 2014 at 12:21:50PM +0800, lin zuojian wrote:
> Hi,
> Here is the patch after the Jakub's review, and Jakub helps with the
> coding style.
>
> --
>
>         * asan.c (asan_emit_stack_protection): Force the base to align to
>         appropriate bits if STRICT_ALIGNMENT. Set shadow_mem align to
>         appropriate bits if STRICT_ALIGNMENT.
>         * cfgexpand.c (expand_stack_vars): Set base_align appropriately
>         when asan is on.
>         (expand_used_vars): Leave a space in the stack frame for alignment
>         if STRICT_ALIGNMENT.
>
> ---
>  gcc/ChangeLog   |  9 +++++++++
>  gcc/asan.c      | 15 +++++++++++++++
>  gcc/cfgexpand.c | 18 ++++++++++++++++--
>  3 files changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index da35be8..30a2b33 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2014-04-18  Lin Zuojian  manjian2...@gmail.com
> +        PR middle-end/60281
> +        * asan.c (asan_emit_stack_protection): Force the base to align to
> +        appropriate bits if STRICT_ALIGNMENT. Set shadow_mem align to
> +        appropriate bits if STRICT_ALIGNMENT.
> +        * cfgexpand.c (expand_stack_vars): Set base_align appropriately
> +        when asan is on.
> +        (expand_used_vars): Leave a space in the stack frame for alignment
> +        if STRICT_ALIGNMENT.
>  2014-04-17  Jakub Jelinek  ja...@redhat.com
>
>          PR target/60847
> diff --git a/gcc/asan.c b/gcc/asan.c
> index 53992a8..28a476f 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -1017,8 +1017,17 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
>        base_align_bias = ((asan_frame_size + alignb - 1)
>                           & ~(alignb - HOST_WIDE_INT_1)) - asan_frame_size;
>      }
> +  /* Align base if target is STRICT_ALIGNMENT.  */
> +  if (STRICT_ALIGNMENT)
> +    base = expand_binop (Pmode, and_optab, base,
> +                         gen_int_mode (-((GET_MODE_ALIGNMENT (SImode)
> +                                          << ASAN_SHADOW_SHIFT)
> +                                         / BITS_PER_UNIT), Pmode), NULL_RTX,
> +                         1, OPTAB_DIRECT);
> +
>    if (use_after_return_class == -1 && pbase)
>      emit_move_insn (pbase, base);
> +
>    base = expand_binop (Pmode, add_optab, base,
>                         gen_int_mode (base_offset - base_align_bias, Pmode),
>                         NULL_RTX, 1, OPTAB_DIRECT);
> @@ -1097,6 +1106,8 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
>                && (ASAN_RED_ZONE_SIZE >> ASAN_SHADOW_SHIFT) == 4);
>    shadow_mem = gen_rtx_MEM (SImode, shadow_base);
>    set_mem_alias_set (shadow_mem, asan_shadow_set);
> +  if (STRICT_ALIGNMENT)
> +    set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
>    prev_offset = base_offset;
>    for (l = length; l; l -= 2)
>      {
> @@ -1186,6 +1197,10 @@ asan_emit_stack_protection (rtx base, rtx pbase, unsigned int alignb,
>
>    shadow_mem = gen_rtx_MEM (BLKmode, shadow_base);
>    set_mem_alias_set (shadow_mem, asan_shadow_set);
> +
> +  if (STRICT_ALIGNMENT)
> +    set_mem_align (shadow_mem, (GET_MODE_ALIGNMENT (SImode)));
> +
>    prev_offset = base_offset;
>    last_offset = base_offset;
>    last_size = 0;
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index b7f6360..14511e1 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -1013,10 +1013,19 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data)
>                if (data->asan_base == NULL)
>                  data->asan_base = gen_reg_rtx (Pmode);
>                base = data->asan_base;
> +
> +              if (!STRICT_ALIGNMENT)
> +                base_align = crtl->max_used_stack_slot_alignment;
> +              else
> +                base_align = MAX (crtl->max_used_stack_slot_alignment,
> +                                  GET_MODE_ALIGNMENT (SImode)
> +                                  << ASAN_SHADOW_SHIFT);
>              }
>            else
> -            offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
> -          base_align = crtl->max_used_stack_slot_alignment;
> +            {
> +              offset = alloc_stack_frame_space (stack_vars[i].size, alignb);
> +              base_align = crtl->max_used_stack_slot_alignment;
> +            }
>          }
>        else
>          {
> @@ -1845,6 +1854,11 @@ expand_used_vars (void)
>              = alloc_stack_frame_space (redzonesz, ASAN_RED_ZONE_SIZE);
>            data.asan_vec.safe_push (prev_offset);
>            data.asan_vec.safe_push (offset);
> +          /* Leave space for alignment if STRICT_ALIGNMENT.  */
> +          if (STRICT_ALIGNMENT)
> +            alloc_stack_frame_space
[patch, testsuite] Fix fragile case nsdmi-union5
Resulting from discussion here: http://gcc.gnu.org/ml/gcc/2014-04/msg00125.html

ChangeLog:

        * g++.dg/cpp0x/nsdmi-union5.C: Change to runtime test.

Index: gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C
===================================================================
--- gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C (revision 209462)
+++ gcc/testsuite/g++.dg/cpp0x/nsdmi-union5.C (working copy)
@@ -1,6 +1,5 @@
 // PR c++/58701
-// { dg-require-effective-target c++11 }
-// { dg-final { scan-assembler "7" } }
+// { dg-do run { target c++11 } }
 
 static union
 {
@@ -9,3 +8,10 @@
     int i = 7;
   };
 };
+
+extern "C" void abort(void);
+int main()
+{
+  if (i != 7) abort();
+  return 0;
+}
[C PATCH] Warn if switch has boolean value (PR c/60439)
This patch implements a new warning that warns when the controlling expression of a switch has a boolean value. (Intentionally I don't warn if the controlling expression is an (un)signed:1 bit-field.) I guess the question is whether this should be enabled by default or deserves some new warning option. Since clang does the former, I did it too and currently this warning is enabled by default.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2014-04-17  Marek Polacek  pola...@redhat.com

        PR c/60439
c/
        * c-typeck.c (c_start_case): Warn if switch condition has boolean
        value.
testsuite/
        * gcc.dg/pr60439.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 65aad45..91b1109 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -9344,6 +9344,28 @@ c_start_case (location_t switch_loc,
       else
         {
           tree type = TYPE_MAIN_VARIANT (orig_type);
+          tree e = exp;
+          enum tree_code exp_code;
+
+          while (TREE_CODE (e) == COMPOUND_EXPR)
+            e = TREE_OPERAND (e, 1);
+          exp_code = TREE_CODE (e);
+
+          if (TREE_CODE (type) == BOOLEAN_TYPE
+              || exp_code == TRUTH_ANDIF_EXPR
+              || exp_code == TRUTH_AND_EXPR
+              || exp_code == TRUTH_ORIF_EXPR
+              || exp_code == TRUTH_OR_EXPR
+              || exp_code == TRUTH_XOR_EXPR
+              || exp_code == TRUTH_NOT_EXPR
+              || exp_code == EQ_EXPR
+              || exp_code == NE_EXPR
+              || exp_code == LE_EXPR
+              || exp_code == GE_EXPR
+              || exp_code == LT_EXPR
+              || exp_code == GT_EXPR)
+            warning_at (switch_cond_loc, 0,
+                        "switch condition has boolean value");
 
           if (!in_system_header_at (input_location)
               && (type == long_integer_type_node
diff --git gcc/testsuite/gcc.dg/pr60439.c gcc/testsuite/gcc.dg/pr60439.c
index e69de29..26e7c25 100644
--- gcc/testsuite/gcc.dg/pr60439.c
+++ gcc/testsuite/gcc.dg/pr60439.c
@@ -0,0 +1,112 @@
+/* PR c/60439 */
+/* { dg-do compile } */
+
+typedef _Bool bool;
+extern _Bool foo (void);
+
+void
+f1 (const _Bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f2 (int a, int b)
+{
+  switch (a && b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch ((bool) (a && b)) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch ((a && b) || a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f3 (int a)
+{
+  switch (!!a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (!a) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f4 (void)
+{
+  switch (foo ()) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f5 (int a)
+{
+  switch (a == 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a != 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a > 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a < 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a <= 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a >= 3) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (foo (), foo (), a >= 42) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (a == 3, a & 4, a ^ 5, a)
+    case 1:
+      break;
+}
+
+void
+f6 (bool b)
+{
+  switch (b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (!b) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+  switch (b++) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f7 (void)
+{
+  bool b;
+  switch (b = 1) /* { dg-warning "switch condition has boolean value" } */
+    case 1:
+      break;
+}
+
+void
+f8 (int i)
+{
+  switch (i)
+    case 0:
+      break;
+  switch ((unsigned int) i)
+    case 0:
+      break;
+  switch ((bool) i) /* { dg-warning "switch condition has boolean value" } */
+    case 0:
+      break;
+}

        Marek
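The shape of the check in c_start_case can be modelled outside the compiler. Below is a minimal sketch, assuming a toy expression node in place of GCC's `tree`: the `Code` enum is a much-reduced, hypothetical subset of the tree codes, and `Expr` and `has_boolean_value` are invented for this illustration. The separate test for a BOOLEAN_TYPE controlling expression is left out.

```cpp
#include <cassert>

// Hypothetical, much-reduced stand-in for GCC's tree codes.
enum Code { VAR, PLUS_EXPR, COMPOUND_EXPR, EQ_EXPR, LT_EXPR, TRUTH_NOT_EXPR };

struct Expr
{
  Code code;
  const Expr* last;  // for COMPOUND_EXPR: the right-hand operand; else unused
};

// Mirror the patch's logic: skip to the last operand of any comma
// expressions, then test whether that operand is a truth or comparison
// node.
bool has_boolean_value(const Expr* e)
{
  while (e->code == COMPOUND_EXPR)
    e = e->last;
  switch (e->code)
    {
    case EQ_EXPR:
    case LT_EXPR:
    case TRUTH_NOT_EXPR:
      return true;
    default:
      return false;
    }
}
```

This is why the test expects a warning for a comma expression whose last operand is a comparison, but none for `switch (a == 3, a & 4, a ^ 5, a)`, whose last operand is a plain variable.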
Re: [C PATCH] Warn if switch has boolean value (PR c/60439)
On Fri, 18 Apr 2014, Marek Polacek wrote:
> This patch implements a new warning that warns when controlling
> expression of a switch has boolean value. (Intentionally I don't
> warn if the controlling expression is (un)signed:1 bit-field.)
> I guess the question is if this should be enabled by default or
> deserves some new warning option. Since clang does the former,
> I did it too and currently this warning is enabled by default.

It can be enabled by -Wsome-name which is itself enabled by default but at least gives the possibility to use -Wno-some-name, -Werror=some-name, etc. No? I believe Manuel insists regularly that no new warning should use 0 (and old ones should progressively lose it).

-- 
Marc Glisse