On Fri, Sep 23, 2016 at 08:55:02AM +0200, Richard Biener wrote:
> While I agree with the goal to reduce the number of static constructors
> esp. the vNULL cases I disagree with.  This is just introducing
> undefined behavior (uninitialized object use), and in case we end up
> making vNULL not all-zeros it's bound to break.  So IMHO your patch
> is very much premature optimization here.

No, there is no undefined behavior, and we rely on the zero initialization
in > 100 cases, see
grep '\(^\|static \)vec \?<.*;' *.c *.cc */*.c ada/*/*.c | grep -v '\*'
or anywhere where we have vec part of a struct that we allocate cleared or
clear after allocation.
The thing is, vec is (intentionally) a POD, thus doesn't have a ctor,
and the = vNULL "construction" is
  template <typename T, typename A, typename L>
  operator vec<T, A, L> () { return vec<T, A, L>(); }
actually clearing of the vector.  If we ever wanted to initialize vec to
something not-all-zeros, we'd actually need to turn it into non-POD and add
ctor.

I'm attaching 3 further patches (not bootstrapped/regtested yet), which:
1) adds constexpr keyword for C++11 + to the vNULL operator, this seems to
be enough to get rid of the runtime initialization, both in the
tree-ssa-sccvn.o case with _ZGV var and e.g. in passes.c get rid of
_Z41__static_initialization_and_destruction_0ii ctor
2) another patch to remove the useless = vNULL initializers from file scope
vecs, we don't usually = 0 static vars either (though, apparently we have
hundreds of those)
3) a patch that attempts to clarify when = vNULL is needed and when it is
not

> Maybe we can change vNULL to sth that avoids the constructor, if only
> if C++14 is available (thus in stage2+)?
> 
> For cases like this I hope we could make GCC optimize away the static
> construction, maybe you can reduce it to a simple testcase and file
> a bugreport?  It will require making static constructors "first class"
> in the cgraph so we know at some point the function body initializing
> a specific global and some later cgraph phase builds up the global
> constructor calling all remaining relevant individual constructor
> functions (which can then still be inlined into that fn).

A short testcase is e.g.
struct S { int a; };
void baz (S *);
#if __cpp_constexpr >= 200704
constexpr
#endif
inline S foo () { return (S) { 2 }; }

S s = foo ();

void
bar ()
{
  static S t = foo ();
  baz (&t);
}
Here for -std=c++98 (or even -std=c++1z without the constexpr keyword)
at -O0 there is both _Z41__static_initialization_and_destruction_0ii and
the __cxa_guard*+_ZGV stuff, at -O2 we manage to inline
_Z41__static_initialization_and_destruction_0ii but still end up with
_GLOBAL__sub_I_s:
        movl    $2, s(%rip)
        ret
registered in .ctors.  And the guard stuff is left too.
Optimizing that away would be nice, but non-fun, we'd have to recognize
something like:

  static struct S t;
  unsigned char _1;
  int _2;

  <bb 2>:
  _1 = __atomic_load_1 (&_ZGVZ3barvE1t, 2);
  if (_1 == 0)
    goto <bb 4>;
  else
    goto <bb 3>;

  <bb 3>:
  goto <bb 6>;

  <bb 4>:
  _2 = __cxa_guard_acquire (&_ZGVZ3barvE1t);
  if (_2 != 0)
    goto <bb 5>;
  else
    goto <bb 3>;

  <bb 5>:
  MEM[(struct S *)&t] = 2;
  __cxa_guard_release (&_ZGVZ3barvE1t);

  <bb 6>:

and turn those stores into the static var into a static initializer
and throw away the _ZGV* var, plus if this is in an inline function
(i.e. the static var and its _ZGV* var are comdat) it might be an
ABI thing too (unfortunately the _ZGV* var and the actual var it is guarding
aren't in the same comdat, so even a trick like keeping the runtime
initialization in, but initialize the var to the expected values and
preinitialize the _ZGV* guard into the "already initialized" state wouldn't
work reliably, if the linker picks the _ZGV* var from one TU and the guarded
var from another one, it could fail).  So, I'm afraid if we want to do this
optimization, we'd need to limit it to the cases where the _ZGV* var isn't
visible outside of the TU.

Ok for the 3 patches (or some subset of them) for trunk if they pass
bootstrap/regtest?

        Jakub
2016-09-23  Jakub Jelinek  <ja...@redhat.com>

        * vec.h (vnull::operator vec): Add constexpr keyword for
        C++11 and later.

--- gcc/vec.h.jj        2016-04-26 08:08:18.000000000 +0200
+++ gcc/vec.h   2016-09-23 09:48:48.813586327 +0200
@@ -414,6 +414,9 @@ struct GTY((user)) vec
 struct vnull
 {
   template <typename T, typename A, typename L>
+#if __cpp_constexpr >= 201103
+  constexpr
+#endif
   operator vec<T, A, L> () { return vec<T, A, L>(); }
 };
 extern vnull vNULL;
2016-09-23  Jakub Jelinek  <ja...@redhat.com>

        * sel-sched-ir.c (sel_global_bb_info, sel_region_bb_info,
        loop_nests, s_i_d, last_added_blocks): Remove unnecessary
        = vNULL initialization of file scope vec.
        * passes.c (pass_tab, enabled_pass_uid_range_tab,
        disabled_pass_uid_range_tab): Likewise.
        * haifa-sched.c (sched_luids, h_i_d): Likewise.
        * tree-chkp-opt.c (check_infos): Likewise.
        * sel-sched.c (vec_av_set, vec_temp_moveop_nops): Likewise.
c/
        * c-parser.c (incomplete_record_decls): Remove unnecessary
        = vNULL initialization of file scope vec.
cp/
        * constexpr.c (call_stack): Remove unnecessary
        = vNULL initialization of file scope vec.

--- gcc/sel-sched-ir.c.jj       2016-03-15 17:10:19.000000000 +0100
+++ gcc/sel-sched-ir.c  2016-09-23 09:56:54.145357346 +0200
@@ -45,12 +45,10 @@ along with GCC; see the file COPYING3.
 #include "sel-sched-dump.h"
 
 /* A vector holding bb info for whole scheduling pass.  */
-vec<sel_global_bb_info_def>
-    sel_global_bb_info = vNULL;
+vec<sel_global_bb_info_def> sel_global_bb_info;
 
 /* A vector holding bb info.  */
-vec<sel_region_bb_info_def>
-    sel_region_bb_info = vNULL;
+vec<sel_region_bb_info_def> sel_region_bb_info;
 
 /* A pool for allocating all lists.  */
 object_allocator<_list_node> sched_lists_pool ("sel-sched-lists");
@@ -66,7 +64,7 @@ struct loop *current_loop_nest;
 
 /* LOOP_NESTS is a vector containing the corresponding loop nest for
    each region.  */
-static vec<loop_p> loop_nests = vNULL;
+static vec<loop_p> loop_nests;
 
 /* Saves blocks already in loop regions, indexed by bb->index.  */
 static sbitmap bbs_in_loop_rgns = NULL;
@@ -4163,7 +4161,7 @@ finish_region_bb_info (void)
 
 
 /* Data for each insn in current region.  */
-vec<sel_insn_data_def> s_i_d = vNULL;
+vec<sel_insn_data_def> s_i_d;
 
 /* Extend data structures for insns from current region.  */
 static void
@@ -4499,8 +4497,7 @@ get_av_level (insn_t insn)
 
 /* The basic block that already has been processed by the sched_data_update (),
    but hasn't been in sel_add_bb () yet.  */
-static vec<basic_block>
-    last_added_blocks = vNULL;
+static vec<basic_block> last_added_blocks;
 
 /* A pool for allocating successor infos.  */
 static struct
--- gcc/passes.c.jj     2016-09-14 16:00:50.000000000 +0200
+++ gcc/passes.c        2016-09-23 09:55:41.265281188 +0200
@@ -862,7 +862,7 @@ pass_manager::register_pass_name (opt_pa
 /* Map from pass id to canonicalized pass name.  */
 
 typedef const char *char_ptr;
-static vec<char_ptr> pass_tab = vNULL;
+static vec<char_ptr> pass_tab;
 
 /* Callback function for traversing NAME_TO_PASS_MAP.  */
 
@@ -982,10 +982,8 @@ struct uid_range
 typedef struct uid_range *uid_range_p;
 
 
-static vec<uid_range_p>
-      enabled_pass_uid_range_tab = vNULL;
-static vec<uid_range_p>
-      disabled_pass_uid_range_tab = vNULL;
+static vec<uid_range_p> enabled_pass_uid_range_tab;
+static vec<uid_range_p> disabled_pass_uid_range_tab;
 
 
 /* Parse option string for -fdisable- and -fenable-
--- gcc/haifa-sched.c.jj        2016-08-29 12:17:29.000000000 +0200
+++ gcc/haifa-sched.c   2016-09-23 09:55:15.440612566 +0200
@@ -401,13 +401,13 @@ const struct common_sched_info_def haifa
   };
 
 /* Mapping from instruction UID to its Logical UID.  */
-vec<int> sched_luids = vNULL;
+vec<int> sched_luids;
 
 /* Next LUID to assign to an instruction.  */
 int sched_max_luid = 1;
 
 /* Haifa Instruction Data.  */
-vec<haifa_insn_data_def> h_i_d = vNULL;
+vec<haifa_insn_data_def> h_i_d;
 
 void (* sched_init_only_bb) (basic_block, basic_block);
 
--- gcc/tree-chkp-opt.c.jj      2016-08-17 10:22:14.000000000 +0200
+++ gcc/tree-chkp-opt.c 2016-09-23 09:57:09.715159981 +0200
@@ -84,7 +84,7 @@ static void chkp_collect_value (tree ssa
 #define chkp_checku_fndecl \
   (targetm.builtin_chkp_function (BUILT_IN_CHKP_BNDCU))
 
-static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL;
+static vec<struct bb_checks, va_heap, vl_ptr> check_infos;
 
 /* Comparator for pol_item structures I1 and I2 to be used
    to find items with equal var.  Also used for polynomial
--- gcc/sel-sched.c.jj  2016-08-06 12:11:51.000000000 +0200
+++ gcc/sel-sched.c     2016-09-23 09:56:20.679781563 +0200
@@ -494,7 +494,7 @@ static int max_ws;
 static int num_insns_scheduled;
 
 /* A vector of expressions is used to be able to sort them.  */
-static vec<expr_t> vec_av_set = vNULL;
+static vec<expr_t> vec_av_set;
 
 /* A vector of vinsns is used to hold temporary lists of vinsns.  */
 typedef vec<vinsn_t> vinsn_vec_t;
@@ -512,7 +512,7 @@ static vinsn_vec_t vec_target_unavailabl
 
 /* Vector to store temporary nops inserted in move_op to prevent removal
    of empty bbs.  */
-static vec<insn_t> vec_temp_moveop_nops = vNULL;
+static vec<insn_t> vec_temp_moveop_nops;
 
 /* These bitmaps record original instructions scheduled on the current
    iteration and bookkeeping copies created by them.  */
--- gcc/c/c-parser.c.jj 2016-09-14 23:48:59.000000000 +0200
+++ gcc/c/c-parser.c    2016-09-23 09:57:28.249925030 +0200
@@ -67,7 +67,7 @@ along with GCC; see the file COPYING3.
    In c_parser_translation_unit(), we iterate over incomplete_record_decls
    and report error if any of the decls are still incomplete.  */ 
 
-vec<tree> incomplete_record_decls = vNULL;
+vec<tree> incomplete_record_decls;
 
 void
 set_c_expr_source_range (c_expr *expr,
--- gcc/cp/constexpr.c.jj       2016-09-20 17:17:51.000000000 +0200
+++ gcc/cp/constexpr.c  2016-09-23 09:57:46.055699321 +0200
@@ -1253,7 +1253,7 @@ cxx_bind_parameters_in_call (const const
    These do not need to be marked for PCH or GC.  */
 
 /* FIXME remember and print actual constant arguments.  */
-static vec<tree> call_stack = vNULL;
+static vec<tree> call_stack;
 static int call_stack_tick;
 static int last_cx_error_tick;
 
2016-09-23  Jakub Jelinek  <ja...@redhat.com>

        * vec.h (vNULL): Extend comment to say = vNULL initialization
        isn't needed for static vars.

--- gcc/vec.h.jj        2016-09-23 09:53:13.000000000 +0200
+++ gcc/vec.h   2016-09-23 10:11:06.811548785 +0200
@@ -410,7 +410,9 @@ struct GTY((user)) vec
 /* Type to provide NULL values for vec<T, A, L>.  This is used to
    provide nil initializers for vec instances.  Since vec must be
    a POD, we cannot have proper ctor/dtor for it.  To initialize
-   a vec instance, you can assign it the value vNULL.  */
+   a vec instance, you can assign it the value vNULL.  This isn't
+   needed for file-scope and function-local static vectors, which
+   are zero-initialized by default.  */
 struct vnull
 {
   template <typename T, typename A, typename L>

Reply via email to