On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote: > While I agree with the goal to reduce the number of static constructors > esp. the vNULL cases I disagree with. This is just introducing > undefined behavior (uninitialized object use), and in case we end up > making vNULL not all-zeros it's bound to break. So IMHO your patch > is very much premature optimization here.
No, there is no undefined behavior, and we rely on the zero initialization in > 100 cases, see grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*' or anywhere where we have vec part of a struct that we allocate cleared or clear after allocation. The thing is, vec is (intentionally) a POD, thus doesn't have a ctor, and the = vNULL "construction" is template <typename T, typename A, typename L> operator vec<T, A, L> () { return vec<T, A, L>(); } actually clearing of the vector. If we ever wanted to initialize vec to something not-all-zeros, we'd actually need to turn it into non-POD and add ctor. I'm attaching 3 further patches (not bootstrapped/regtested yet), which: 1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to be enough to get rid of the runtime initialization, both in the tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of _Z41__static_initialization_and_destruction_0ii ctor 2) another patch to remove the useless = vNULL initializers from file scope vecs, we don't usually = 0 static vars either (though, apparently we have hundreds of those) 3) a patch that attempts to clarify when = vNULL is needed and when it is not > Maybe we can change vNULL to sth that avoids the constructor, if only > if C++14 is available (thus in stage2+)? > > For cases like this I hope we could make GCC optimize away the static > construction, maybe you can reduce it to a simple testcase and file > a bugreport? It will require making static constructors "first class" > in the cgraph so we know at some point the function body initializing > a specific global and some later cgraph phase builds up the global > constructor calling all remaining relevant individual constructor > functions (which can then still be inlined into that fn). A short testcase is e.g. struct S { int a; }; void baz (S *); #if __cpp_constexpr >= 200704 constexpr #endif inline S foo () { return (S) { 2 }; } S s = foo (); void bar () { static S t = foo (); baz (&t); } Here for -std=c++98 (or even -std=c++1z without the constexpr keyword) at -O0 there is both _Z41__static_initialization_and_destruction_0ii and the __cxa_guard*+_ZGV stuff, at -O2 we manage to inline _Z41__static_initialization_and_destruction_0ii but still end up with _GLOBAL__sub_I_s: movl $2, s(%rip) ret registered in .ctors. And the guard stuff is left too. Optimizing that away would be nice, but non-fun, we'd have to recognize something like: static struct S t; unsigned char _1; int _2; <bb 2>: _1 = __atomic_load_1 (&_ZGVZ3barvE1t, 2); if (_1 == 0) goto <bb 4>; else goto <bb 3>; <bb 3>: goto <bb 6>; <bb 4>: _2 = __cxa_guard_acquire (&_ZGVZ3barvE1t); if (_2 != 0) goto <bb 5>; else goto <bb 3>; <bb 5>: MEM[(struct S *)&t] = 2; __cxa_guard_release (&_ZGVZ3barvE1t); <bb 6>: and turn those stores into the static var into a static initializer and throw away the _ZGV* var, plus if this is in an inline function (i.e. the static var and its _ZGV* var are comdat) it might be an ABI thing too (unfortunately the _ZGV* var and the actual var it is guarding aren't in the same comdat, so even a trick like keeping the runtime initialization in, but initialize the var to the expected values and preinitialize the _ZGV* guard into the "already initialized" state wouldn't work reliably, if the linker picks the _ZGV* var from one TU and the guarded var from another one, it could fail). So, I'm afraid if we want to do this optimization, we'd need to limit it to the cases where the _ZGV* var isn't visible outside of the TU. Ok for the 3 patches (or some subset of them) for trunk if they pass bootstrap/regtest? Jakub
2016-09-23 Jakub Jelinek <ja...@redhat.com> * vec.h (vnull::operator vec): Add constexpr keyword for C++11 and later. --- gcc/vec.h.jj 2016-04-26 08:08:18.000000000 +0200 +++ gcc/vec.h 2016-09-23 09:48:48.813586327 +0200 @@ -414,6 +414,9 @@ struct GTY((user)) vec struct vnull { template <typename T, typename A, typename L> +#if __cpp_constexpr >= 201103 + constexpr +#endif operator vec<T, A, L> () { return vec<T, A, L>(); } }; extern vnull vNULL;
2016-09-23 Jakub Jelinek <ja...@redhat.com> * sel-sched-ir.c (sel_global_bb_info, sel_region_bb_info, loop_nests, s_i_d, last_added_blocks): Remove unnecessary = vNULL initialization of file scope vec. * passes.c (pass_tab, enabled_pass_uid_range_tab, disabled_pass_uid_range_tab): Likewise. * haifa-sched.c (sched_luids, h_i_d): Likewise. * tree-chkp-opt.c (check_infos): Likewise. * sel-sched.c (vec_av_set, vec_temp_moveop_nops): Likewise. c/ * c-parser.c (incomplete_record_decls): Remove unnecessary = vNULL initialization of file scope vec. cp/ * constexpr.c (call_stack): Remove unnecessary = vNULL initialization of file scope vec. --- gcc/sel-sched-ir.c.jj 2016-03-15 17:10:19.000000000 +0100 +++ gcc/sel-sched-ir.c 2016-09-23 09:56:54.145357346 +0200 @@ -45,12 +45,10 @@ along with GCC; see the file COPYING3. #include "sel-sched-dump.h" /* A vector holding bb info for whole scheduling pass. */ -vec<sel_global_bb_info_def> - sel_global_bb_info = vNULL; +vec<sel_global_bb_info_def> sel_global_bb_info; /* A vector holding bb info. */ -vec<sel_region_bb_info_def> - sel_region_bb_info = vNULL; +vec<sel_region_bb_info_def> sel_region_bb_info; /* A pool for allocating all lists. */ object_allocator<_list_node> sched_lists_pool ("sel-sched-lists"); @@ -66,7 +64,7 @@ struct loop *current_loop_nest; /* LOOP_NESTS is a vector containing the corresponding loop nest for each region. */ -static vec<loop_p> loop_nests = vNULL; +static vec<loop_p> loop_nests; /* Saves blocks already in loop regions, indexed by bb->index. */ static sbitmap bbs_in_loop_rgns = NULL; @@ -4163,7 +4161,7 @@ finish_region_bb_info (void) /* Data for each insn in current region. */ -vec<sel_insn_data_def> s_i_d = vNULL; +vec<sel_insn_data_def> s_i_d; /* Extend data structures for insns from current region. */ static void @@ -4499,8 +4497,7 @@ get_av_level (insn_t insn) /* The basic block that already has been processed by the sched_data_update (), but hasn't been in sel_add_bb () yet. */ -static vec<basic_block> - last_added_blocks = vNULL; +static vec<basic_block> last_added_blocks; /* A pool for allocating successor infos. */ static struct --- gcc/passes.c.jj 2016-09-14 16:00:50.000000000 +0200 +++ gcc/passes.c 2016-09-23 09:55:41.265281188 +0200 @@ -862,7 +862,7 @@ pass_manager::register_pass_name (opt_pa /* Map from pass id to canonicalized pass name. */ typedef const char *char_ptr; -static vec<char_ptr> pass_tab = vNULL; +static vec<char_ptr> pass_tab; /* Callback function for traversing NAME_TO_PASS_MAP. */ @@ -982,10 +982,8 @@ struct uid_range typedef struct uid_range *uid_range_p; -static vec<uid_range_p> - enabled_pass_uid_range_tab = vNULL; -static vec<uid_range_p> - disabled_pass_uid_range_tab = vNULL; +static vec<uid_range_p> enabled_pass_uid_range_tab; +static vec<uid_range_p> disabled_pass_uid_range_tab; /* Parse option string for -fdisable- and -fenable- --- gcc/haifa-sched.c.jj 2016-08-29 12:17:29.000000000 +0200 +++ gcc/haifa-sched.c 2016-09-23 09:55:15.440612566 +0200 @@ -401,13 +401,13 @@ const struct common_sched_info_def haifa }; /* Mapping from instruction UID to its Logical UID. */ -vec<int> sched_luids = vNULL; +vec<int> sched_luids; /* Next LUID to assign to an instruction. */ int sched_max_luid = 1; /* Haifa Instruction Data. */ -vec<haifa_insn_data_def> h_i_d = vNULL; +vec<haifa_insn_data_def> h_i_d; void (* sched_init_only_bb) (basic_block, basic_block); --- gcc/tree-chkp-opt.c.jj 2016-08-17 10:22:14.000000000 +0200 +++ gcc/tree-chkp-opt.c 2016-09-23 09:57:09.715159981 +0200 @@ -84,7 +84,7 @@ static void chkp_collect_value (tree ssa #define chkp_checku_fndecl \ (targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDCU)) -static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL; +static vec<struct bb_checks, va_heap, vl_ptr> check_infos; /* Comparator for pol_item structures I1 and I2 to be used to find items with equal var. Also used for polynomial --- gcc/sel-sched.c.jj 2016-08-06 12:11:51.000000000 +0200 +++ gcc/sel-sched.c 2016-09-23 09:56:20.679781563 +0200 @@ -494,7 +494,7 @@ static int max_ws; static int num_insns_scheduled; /* A vector of expressions is used to be able to sort them. */ -static vec<expr_t> vec_av_set = vNULL; +static vec<expr_t> vec_av_set; /* A vector of vinsns is used to hold temporary lists of vinsns. */ typedef vec<vinsn_t> vinsn_vec_t; @@ -512,7 +512,7 @@ static vinsn_vec_t vec_target_unavailabl /* Vector to store temporary nops inserted in move_op to prevent removal of empty bbs. */ -static vec<insn_t> vec_temp_moveop_nops = vNULL; +static vec<insn_t> vec_temp_moveop_nops; /* These bitmaps record original instructions scheduled on the current iteration and bookkeeping copies created by them. */ --- gcc/c/c-parser.c.jj 2016-09-14 23:48:59.000000000 +0200 +++ gcc/c/c-parser.c 2016-09-23 09:57:28.249925030 +0200 @@ -67,7 +67,7 @@ along with GCC; see the file COPYING3. In c_parser_translation_unit(), we iterate over incomplete_record_decls and report error if any of the decls are still incomplete. */ -vec<tree> incomplete_record_decls = vNULL; +vec<tree> incomplete_record_decls; void set_c_expr_source_range (c_expr *expr, --- gcc/cp/constexpr.c.jj 2016-09-20 17:17:51.000000000 +0200 +++ gcc/cp/constexpr.c 2016-09-23 09:57:46.055699321 +0200 @@ -1253,7 +1253,7 @@ cxx_bind_parameters_in_call (const const These do not need to be marked for PCH or GC. */ /* FIXME remember and print actual constant arguments. */ -static vec<tree> call_stack = vNULL; +static vec<tree> call_stack; static int call_stack_tick; static int last_cx_error_tick;
2016-09-23 Jakub Jelinek <ja...@redhat.com> * vec.h (vNULL): Extend comment to say = vNULL initialization isn't needed for static vars. --- gcc/vec.h.jj 2016-09-23 09:53:13.000000000 +0200 +++ gcc/vec.h 2016-09-23 10:11:06.811548785 +0200 @@ -410,7 +410,9 @@ struct GTY((user)) vec /* Type to provide NULL values for vec<T, A, L>. This is used to provide nil initializers for vec instances. Since vec must be a POD, we cannot have proper ctor/dtor for it. To initialize - a vec instance, you can assign it the value vNULL. */ + a vec instance, you can assign it the value vNULL. This isn't + needed for file-scope and function-local static vectors, which + are zero-initialized by default. */ struct vnull { template <typename T, typename A, typename L>