On Thu, 13 Nov 2014, Jan Hubicka wrote: > Hi, > this patch adds infrastructure for proper streaming and merging of > TREE_TARGET_OPTION. The catch is that TREE_TARGET_OPTION is autogenerated > structure. For x86_64 it looks as follows: > > /* Structure to save/restore selected target specific options. */ > struct GTY(()) cl_target_option > { > HOST_WIDE_INT x_ix86_isa_flags_explicit; > HOST_WIDE_INT x_ix86_target_flags_explicit; > const char *x_ix86_arch_string; > const char *x_ix86_recip_name; > const char *x_ix86_tune_ctrl_string; > const char *x_ix86_tune_memcpy_strategy; > const char *x_ix86_tune_memset_strategy; > const char *x_ix86_tune_string; > HOST_WIDE_INT x_ix86_isa_flags; > enum fpmath_unit x_ix86_fpmath; > enum asm_dialect x_ix86_asm_dialect; > ... > enum stack_protector_guard x_ix86_stack_protector_guard; > enum stringop_alg x_ix86_stringop_alg; > enum tls_dialect x_ix86_tls_dialect; > int x_ix86_branch_cost; > int x_ix86_dump_tunes; > .... > int x_recip_mask; > int x_target_flags; > unsigned char arch; > unsigned char arch_specified; > unsigned char branch_cost; > unsigned char schedule; > unsigned char tune; > unsigned char tune_defaulted; > }; > > The problem is that the strings prevents us from streaming the structure as a > binary > blob like we do for optimization nodes (and we should not do that to make LTO > streaming > stable across attribute tweaks in minor releases). > > This patch extends the AWK generators to produce four functions: > > /* Compare two target options */ > bool > cl_target_option_eq (struct cl_target_option const *ptr1, > struct cl_target_option const *ptr2) > { > if (ptr1->x_ix86_arch_string != ptr2->x_ix86_arch_string > && (!ptr1->x_ix86_arch_string || !ptr2->x_ix86_arch_string > || strcmp (ptr1->x_ix86_arch_string, ptr2->x_ix86_arch_string))) > return false; > .... > if (ptr1->x_ix86_isa_flags != ptr2->x_ix86_isa_flags) > return false; > if (ptr1->x_ix86_fpmath != ptr2->x_ix86_fpmath) > return false; > return true; > } > > /* Hash target options */ > hashval_t > cl_target_option_hash (struct cl_target_option const *ptr) > { > inchash::hash hstate; > if (ptr->x_ix86_arch_string) > hstate.add (ptr->x_ix86_arch_string, strlen (ptr->x_ix86_arch_string)); > else > hstate.add_int (0); > ... > hstate.add_wide_int (ptr->x_ix86_fpmath); > return hstate.end (); > } > > /* Stream out target options */ > void > cl_target_option_stream_out (struct output_block *ob, struct cl_target_option > *ptr) > { > streamer_write_string (ob, ob->main_stream, ptr->x_ix86_arch_string, true); > ... > streamer_write_uhwi (ob, ptr->x_ix86_isa_flags); > streamer_write_uhwi (ob, ptr->x_ix86_fpmath); > } > > /* Stream in target options */ > void > cl_target_option_stream_in (struct lto_input_block *ib, struct data_in > *data_in, struct cl_target_option *ptr) > { > ptr->x_ix86_arch_string = strdup (streamer_read_string (data_in, ib)); > ... > ptr->x_ix86_fpmath = (enum fpmath_unit ) streamer_read_uhwi (ib); > } > > These allows us to properly stream, compare and hash the values. > > There is some implementation lazyness - for example I stream all scalars as > unsigned HOST_WIDE_INT that seems to be the largest type used for variables > across the targets. > > If other types are used, the type matching (I stole from other part of the awk > scripts) will need to be extended; for now it knows that const char * is a > string and needs special care but nothing else. > > I have tested this with firefox and additional change to tree.c to always > initialize TARGET_OPTION_NODE for defined functions (so target units are > peorperly presered across units). This seems to work as intended - i.e. the > functions gets link-time optimized according to -march settings passed at > compile time. > > I noticed that memory use is now 3.8GB instead of previous 2.8GB - this may be > tree merging problem caused by mismatching target settings but most probably > it > is an unrelated stuff. I will look into it tomorrow. I also plan to test this > on PPC (that currently fails to bootstrap). > > Incrementally I would like to handle optimizations nodes same way and change > streaming format to be a set of assignments field_name=value. > This way we could behave sanely when some field is introduced or removed by > either defaulting it or ignoring it.
I'm not sure that idea is good ;) Adding a version / checksum to detect a mismatch early would be nice though. Maybe just compute a hash for the fields in the AWK script? > Bootstrapped/regtested x86_64-linux, seems sane? Sounds sane - I'd like to have a AWK capable person have a look for non-portable stuff (did you check AIX? ;)) Some more comments below > * optc-save-gen.awk: Output cl_target_option_eq, > cl_target_option_hash, cl_target_option_stream_out, > cl_target_option_stream_in functions. > * opth-gen.awk: Output prototypes for > cl_target_option_eq and cl_target_option_hash. > * lto-streamer.h (cl_target_option_stream_out, > cl_target_option_stream_in): Declare. > * tree.c (cl_option_hash_hash): Use cl_target_option_hash. > (cl_option_hash_eq): Use cl_target_option_eq. > * tree-streamer-in.c (unpack_value_fields): Stream in > TREE_TARGET_OPTION. > * lto-streamer-out.c (DFS::DFS_write_tree_body): Follow > DECL_FUNCTION_SPECIFIC_TARGET. > (hash_tree): Hash TREE_TARGET_OPTION; visit > DECL_FUNCTION_SPECIFIC_TARGET. > * tree-streamer-out.c (streamer_pack_tree_bitfields): Skip > TS_TARGET_OPTION. > (streamer_write_tree_body): Output TS_TARGET_OPTION. > > * lto.c (compare_tree_sccs_1): Compare cl_target_option_eq. > > Index: optc-save-gen.awk > =================================================================== > --- optc-save-gen.awk (revision 217448) > +++ optc-save-gen.awk (working copy) > @@ -39,6 +39,16 @@ print "#include " quote "intl.h" quote > print "" > print "#include " quote "flags.h" quote > print "#include " quote "target.h" quote > +print "#include " quote "inchash.h" quote > +print "#include " quote "tree.h" quote > +print "#include " quote "tree-ssa-alias.h" quote > +print "#include " quote "is-a.h" quote > +print "#include " quote "predict.h" quote > +print "#include " quote "function.h" quote > +print "#include " quote "basic-block.h" quote > +print "#include " quote "gimple-expr.h" quote > +print "#include " quote "gimple.h" quote > +print "#include " quote "data-streamer.h" quote > print "" > > if (n_extra_c_includes > 0) { > @@ -417,4 +427,120 @@ print " targetm.target_option.print ( > > print "}"; > > +print ""; > +print "/* Compare two target options */"; > +print "bool"; > +print "cl_target_option_eq (struct cl_target_option const *ptr1,"; > +print " struct cl_target_option const *ptr2)"; > +print "{"; > +n_target_val = 0; > +n_target_str = 0; > + > +for (i = 0; i < n_target_save; i++) { > + var = target_save_decl[i]; > + sub (" *=.*", "", var); > + name = var; > + type = var; > + sub("^.*[ *]", "", name) > + sub(" *" name "$", "", type) > + if (target_save_decl[i] ~ "^const char \\*+[_" alnum "]+$") > + var_target_str[n_target_str++] = name; > + else { > + var_target_val_type[n_target_val] = type; > + var_target_val[n_target_val++] = name; > + } > +} > +if (have_save) { > + for (i = 0; i < n_opts; i++) { > + if (flag_set_p("Save", flags[i])) { > + name = var_name(flags[i]) > + if(name == "") > + name = "target_flags"; > + > + if(name in var_list_seen) > + continue; > + > + var_list_seen[name]++; > + otype = var_type_struct(flags[i]) > + if (otype ~ "^const char \\**$") > + var_target_str[n_target_str++] = "x_" name; > + else { > + var_target_val_type[n_target_val] = otype; > + var_target_val[n_target_val++] = "x_" name; > + } > + } > + } > +} else { > + var_target_val_type[n_target_val] = "int"; > + var_target_val[n_target_val++] = "x_target_flags"; > +} > + > +for (i = 0; i < n_target_str; i++) { > + name = var_target_str[i] > + print " if (ptr1->" name" != ptr2->" name; > + print " && (!ptr1->" name" || !ptr2->" name > + print " || strcmp (ptr1->" name", ptr2->" name ")))"; > + print " return false;"; > +} > +for (i = 0; i < n_target_val; i++) { > + name = var_target_val[i] > + print " if (ptr1->" name" != ptr2->" name ")"; > + print " return false;"; > +} > + > +print " return true;"; > + > +print "}"; > + > +print ""; > +print "/* Hash target options */"; > +print "hashval_t"; > +print "cl_target_option_hash (struct cl_target_option const *ptr)"; > +print "{"; > +print " inchash::hash hstate;"; > +for (i = 0; i < n_target_str; i++) { > + name = var_target_str[i] > + print " if (ptr->" name")"; > + print " hstate.add (ptr->" name", strlen (ptr->" name"));"; > + print " else"; > + print " hstate.add_int (0);"; > +} > +for (i = 0; i < n_target_val; i++) { > + name = var_target_val[i] > + print " hstate.add_wide_int (ptr->" name");"; > +} > +print " return hstate.end ();"; > +print "}"; > + > +print ""; > +print "/* Stream out target options */"; > +print "void"; > +print "cl_target_option_stream_out (struct output_block *ob, struct > cl_target_option *ptr)"; > +print "{"; > +for (i = 0; i < n_target_str; i++) { > + name = var_target_str[i] > + print " streamer_write_string (ob, ob->main_stream, ptr->" name", > true);"; > +} > +for (i = 0; i < n_target_val; i++) { > + name = var_target_val[i] > + print " streamer_write_uhwi (ob, ptr->" name");"; > +} > +print "}"; > + > +print ""; > +print "/* Stream in target options */"; > +print "void"; > +print "cl_target_option_stream_in (struct lto_input_block *ib, struct > data_in *data_in, struct cl_target_option *ptr)"; > +print "{"; > +for (i = 0; i < n_target_str; i++) { > + name = var_target_str[i] > + print " ptr->" name" = strdup (streamer_read_string (data_in, ib));"; > +} > +for (i = 0; i < n_target_val; i++) { > + name = var_target_val[i] > + print " ptr->" name" = (" var_target_val_type[i] ") streamer_read_uhwi > (ib);"; > +} > + > +print "}"; > + > } > Index: tree-streamer-in.c > =================================================================== > --- tree-streamer-in.c (revision 217448) > +++ tree-streamer-in.c (working copy) > @@ -506,9 +506,6 @@ unpack_value_fields (struct data_in *dat > if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL)) > unpack_ts_translation_unit_decl_value_fields (data_in, bp, expr); > > - if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > - gcc_unreachable (); > - > if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION)) > unpack_ts_optimization (bp, expr); > > @@ -1081,6 +1078,9 @@ streamer_read_tree_body (struct lto_inpu > if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR)) > lto_input_ts_constructor_tree_pointers (ib, data_in, expr); > > + if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > + cl_target_option_stream_in (ib, data_in, TREE_TARGET_OPTION (expr)); > + > if (code == OMP_CLAUSE) > lto_input_ts_omp_clause_tree_pointers (ib, data_in, expr); Any reason you don't use bitpacks and stream it from [un]pack_value_fields? We can happily pack strings there as well. The 'body' streaming is for tree pointers only. > } > Index: lto-streamer.h > =================================================================== > --- lto-streamer.h (revision 217448) > +++ lto-streamer.h (working copy) > @@ -832,6 +832,14 @@ bool reachable_from_this_partition_p (st > lto_symtab_encoder_t); > lto_symtab_encoder_t compute_ltrans_boundary (lto_symtab_encoder_t encoder); > > +/* In options-save.c. */ > +void cl_target_option_stream_out (struct output_block *, > + struct cl_target_option *); > + > +void cl_target_option_stream_in (struct lto_input_block *, > + struct data_in *, > + struct cl_target_option *); > + > > /* In lto-symtab.c. */ > extern void lto_symtab_merge_decls (void); > Index: tree.c > =================================================================== > --- tree.c (revision 217448) > +++ tree.c (working copy) > @@ -11484,10 +11495,7 @@ cl_option_hash_hash (const void *x) > } > > else if (TREE_CODE (t) == TARGET_OPTION_NODE) > - { > - p = (const char *)TREE_TARGET_OPTION (t); > - len = sizeof (struct cl_target_option); > - } > + return cl_target_option_hash (TREE_TARGET_OPTION (t)); > > else > gcc_unreachable (); > @@ -11526,9 +11534,8 @@ cl_option_hash_eq (const void *x, const > > else if (TREE_CODE (xt) == TARGET_OPTION_NODE) > { > - xp = (const char *)TREE_TARGET_OPTION (xt); > - yp = (const char *)TREE_TARGET_OPTION (yt); > - len = sizeof (struct cl_target_option); > + return cl_target_option_eq (TREE_TARGET_OPTION (xt), > + TREE_TARGET_OPTION (yt)); > } > > else > Index: opth-gen.awk > =================================================================== > --- opth-gen.awk (revision 217448) > +++ opth-gen.awk (working copy) > @@ -293,6 +293,12 @@ print ""; > print "/* Print target option variables from a structure. */"; > print "extern void cl_target_option_print (FILE *, int, struct > cl_target_option *);"; > print ""; > +print "/* Compare two target option variables from a structure. */"; > +print "extern bool cl_target_option_eq (const struct cl_target_option *, > const struct cl_target_option *);"; > +print ""; > +print "/* Hash option variables from a structure. */"; > +print "extern hashval_t cl_target_option_hash (const struct cl_target_option > *);"; > +print ""; > print "/* Anything that includes tm.h, does not necessarily need this. */" > print "#if !defined(GCC_TM_H)" > print "#include \"input.h\" /* for location_t */" > Index: lto-streamer-out.c > =================================================================== > --- lto-streamer-out.c (revision 217448) > +++ lto-streamer-out.c (working copy) > @@ -594,7 +594,7 @@ DFS::DFS_write_tree_body (struct output_ > { > DFS_follow_tree_edge (DECL_VINDEX (expr)); > DFS_follow_tree_edge (DECL_FUNCTION_PERSONALITY (expr)); > - /* Do not DECL_FUNCTION_SPECIFIC_TARGET. They will be regenerated. */ > + DFS_follow_tree_edge (DECL_FUNCTION_SPECIFIC_TARGET (expr)); > DFS_follow_tree_edge (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (expr)); > } > > @@ -945,7 +945,7 @@ hash_tree (struct streamer_tree_cache_d > strlen (TRANSLATION_UNIT_LANGUAGE (t))); > > if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > - gcc_unreachable (); > + hstate.add_wide_int (cl_target_option_hash (TREE_TARGET_OPTION (t))); > > if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION)) > hstate.add (t, sizeof (struct cl_optimization)); > @@ -1028,7 +1028,7 @@ hash_tree (struct streamer_tree_cache_d > { > visit (DECL_VINDEX (t)); > visit (DECL_FUNCTION_PERSONALITY (t)); > - /* Do not follow DECL_FUNCTION_SPECIFIC_TARGET. */ > + visit (DECL_FUNCTION_SPECIFIC_TARGET (t)); > visit (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (t)); > } > > Index: lto/lto.c > =================================================================== > --- lto/lto.c (revision 217448) > +++ lto/lto.c (working copy) > @@ -1377,7 +1377,8 @@ compare_tree_sccs_1 (tree t1, tree t2, t > return false; > > if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > - gcc_unreachable (); > + if (cl_target_option_eq (TREE_TARGET_OPTION (t1), TREE_TARGET_OPTION > (t2))) > + return false; I suppose that's the reason for the memory use - should be !cl_target_option_eq I think ;) > > if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION)) > if (memcmp (TREE_OPTIMIZATION (t1), TREE_OPTIMIZATION (t2), > Index: tree-streamer-out.c > =================================================================== > --- tree-streamer-out.c (revision 217448) > +++ tree-streamer-out.c (working copy) > @@ -472,9 +472,6 @@ streamer_pack_tree_bitfields (struct out > if (CODE_CONTAINS_STRUCT (code, TS_TRANSLATION_UNIT_DECL)) > pack_ts_translation_unit_decl_value_fields (ob, bp, expr); > > - if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > - gcc_unreachable (); > - > if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION)) > pack_ts_optimization (bp, expr); > > @@ -963,6 +960,9 @@ streamer_write_tree_body (struct output_ > if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR)) > write_ts_constructor_tree_pointers (ob, expr, ref_p); > > + if (CODE_CONTAINS_STRUCT (code, TS_TARGET_OPTION)) > + cl_target_option_stream_out (ob, TREE_TARGET_OPTION (expr)); > + > if (code == OMP_CLAUSE) > write_ts_omp_clause_tree_pointers (ob, expr, ref_p); > } Otherwise looks good. Thanks, Richard.