Re: Unbreak lto plugin
> On Sat, Aug 11, 2012 at 5:51 PM, Jan Hubicka wrote: > > Hi, > > my prevoius LTO cleanup run into problem with plugin enabled builds; > > lto-plugin is stupid enought to confused symtab_nodes section with symtab > > because it ignores the suffixes. > > Err, I suppose we should fix the plugin as well. It likely does that because > it thinks its suffixes for the partial-link stuff? Yes, I will test patch matching symtab. instead of symtab string. That should be enough. Honza > > Richard. > > > Fixed thus. Comitted as obvoius. > > Honza > > > > Index: ChangeLog > > === > > --- ChangeLog (revision 190316) > > +++ ChangeLog (working copy) > > @@ -1,3 +1,8 @@ > > +2012-08-11 Jan Hubicka > > + > > + * lto-section-in.c (lto_section_name): Do not use "symtab" as part > > of > > + symtab_node sectoin name; it confuses plugin. > > + > > 2012-08-11 Uros Bizjak > > > > * config/alpha/alpha.c (alpha_stdarg_optimize_hook): Shift DECL_UID > > Index: lto-section-in.c > > === > > --- lto-section-in.c(revision 190312) > > +++ lto-section-in.c(working copy) > > @@ -55,7 +55,7 @@ const char *lto_section_name[LTO_N_SECTI > >"jmpfuncs", > >"pureconst", > >"reference", > > - "symtab_nodes", > > + "symbol_nodes", > >"opts", > >"cgraphopt", > >"inline"
Re: [PATCH] PR 53528 c++/ C++11 Generalized Attribute support
On 08/10/2012 04:04 PM, Dodji Seketeli wrote: In cp_parser_decl_specifier_seq, I first tried to apply the c++11 attribute to the (already constructed) type it follows, like what you suggest. But then I am getting the warning: warning: ignoring attributes applied to 'A' after definition issued by build_type_attribute_qual_variant when called by decl_attribute. Basically, this is because we are not allowed to build a distinct copy of a class type. Right. Then I figured maybe I could: - do what you suggest for non-tagged types. - for tagged types apply the attribute to the decl. The aligned attribute in particular can be applied to the decl; that's a practical workaround way to comply with the requirement to apply the attribute to the type only for that declaration. Or we could just require people to put the attribute in the right place (or one of the right places) if they want it to apply to the decl. That is, either at the beginning of the declaration statement or after the declarator-id. Again, I don't think we want to extend the flexible binding of GNU attributes to C++11 attributes. +typedef union { int i; } U [[gnu::transparent_union]]; For the same reason, this should also be rejected; the testcase should put the attribute before the opening brace. We accept this with GNU-style attributes for backward compatibility, but there's no reason to propagate that lossage into the new syntax. Syntactically, why can't we say that the attribute applies to the typedef? My understanding is that this is syntactically allowed by the noptr-declarator production. Yes, syntactically it's fine. But semantically it doesn't make sense, because the attribute only applies to types, not declarations. Even if we looked through typedefs, it would be applying an attribute to a class after its definition was complete. If you want the attribute to be rejected for this particular kind of cases, how would you like to do it? It should be rejected because it's applying a type attribute to a declaration. For unused though, I am not sure how to do that in an appropriate manner. An idea? What does it mean to say that int is unused? It seems meaningless to me. Jason
[PATCH][Cilkplus] Fixes builtin Array functions
Hello Everyone, This patch is for the Cilk Plus branch affecting mainly the C++ compiler. This patch fixes a crash when builtin array notation functions are used as part of a modify expression. Thanks, Balaji V. Iyer. Index: gcc/cp/cp-array-notation.c === --- gcc/cp/cp-array-notation.c (revision 190319) +++ gcc/cp/cp-array-notation.c (working copy) @@ -50,15 +50,13 @@ static tree fix_conditional_array_notations_1 (tree stmt); tree fix_unary_array_notation_exprs (tree stmt); static bool is_builtin_array_notation_fn (tree, an_reduce_type *); -static tree build_x_reduce_expr (tree, enum tree_code, tree, tsubst_flags_t, -an_reduce_type); bool contains_array_notation_expr (tree); extern bool is_sec_implicit_index_fn (tree); extern int extract_sec_implicit_index_arg (tree fn); void extract_array_notation_exprs (tree node, bool ignore_builtin_fn, tree **array_list, int *list_size); - static bool has_call_expr_with_array_notation (tree expr); +static tree fix_builtin_array_notation_fn (tree an_builtin_fn, tree *new_var); int array_notation_label_no; @@ -378,27 +376,64 @@ tree ii_tree = NULL_TREE, comp_stmt; tree *rhs_list = NULL, *lhs_list = NULL; int rhs_list_size = 0, lhs_list_size = 0; - an_reduce_type type = REDUCE_UNKNOWN; + tree new_modify_expr, new_var, builtin_loop; + bool found_builtin_fn = false; char label_name[50]; + + + /* In thie first part, we try to break up the builtin functions for array + notations. */ + find_rank (rhs, false, &rhs_rank); + extract_array_notation_exprs (rhs, false, &rhs_list, &rhs_list_size); + loop = push_stmt_list (); + + for (ii = 0; ii < rhs_list_size; ii++) +{ + if (TREE_CODE (rhs_list[ii]) == CALL_EXPR) + { + builtin_loop = fix_builtin_array_notation_fn (rhs_list[ii], + &new_var); + if (builtin_loop) + { + add_stmt (builtin_loop); + found_builtin_fn = true; + replace_array_notations (&rhs, false, &rhs_list[ii], &new_var, + 1); + } + } +} + lhs_rank = 0; + rhs_rank = 0; find_rank (lhs, true, &lhs_rank); find_rank (rhs, true, &rhs_rank); - - /* If both are scalar, then no reason to do any of the components inside this - function... a simple build_x_modify_expr would do. */ + /* If both are scalar, then the only reason why we will get this far is if + there is some array notations inside it and was using a builtin array + notation functions. If so, we have already broken those guys up and now + a simple build_x_modify_expr would do. */ if (lhs_rank == 0 && rhs_rank == 0) -return NULL_TREE; - - if (TREE_CODE (rhs) == CALL_EXPR) { - if (is_builtin_array_notation_fn (CALL_EXPR_FN (rhs), &type)) + if (found_builtin_fn) { - loop = build_x_reduce_expr (lhs, modifycode, rhs, complain, type); + new_modify_expr = build_x_modify_expr (UNKNOWN_LOCATION, lhs, +modifycode, rhs, complain); + add_stmt (new_modify_expr); + pop_stmt_list (loop); return loop; } + else + { + /* Generally, we should not get here at all. */ + pop_stmt_list (loop); + return NULL_TREE; + } } - + + /* We need this when we have a scatter issue. */ + extract_array_notation_exprs (lhs, true, &lhs_list, &lhs_list_size); + rhs_list_size = 0; + rhs_list = NULL; extract_array_notation_exprs (rhs, true, &rhs_list, &rhs_list_size); if (lhs_rank == 0 && rhs_rank != 0) @@ -417,9 +452,6 @@ return error_mark_node; } - /* We need this when we have a scatter issue. */ - extract_array_notation_exprs (lhs, true, &lhs_list, &lhs_list_size); - rhs_vector = (bool **) xmalloc (sizeof (bool *) * rhs_list_size); for (ii = 0; ii < rhs_list_size; ii++) rhs_vector[ii] = (bool *) xmalloc (sizeof (bool) * rhs_rank); @@ -627,9 +659,7 @@ else rhs_vector[ii][0] = false; } - - loop = push_stmt_list (); - + for (ii = 0; ii < lhs_rank; ii++) { if (lhs_vector[0][ii]) @@ -1358,7 +1388,6 @@ case VECTOR_CST: case COMPLEX_CST: return t; - case MODIFY_EXPR: if (contains_array_notation_expr (t)) t = build_x_array_notation_expr @@ -1801,8 +1830,7 @@ replace_array_notations (&func_parm, true, array_list, array_operand, list_size); - if (!TREE_TYPE (func_parm)) - + if (!TREE_TYPE (func_parm)) TREE_TYPE (func_parm) = ARRAY_NOTATION_TYPE (array_list[0]); for (ii = 0; ii < rank; ii++) @@ -1832,19 +1860,15 @@ } } - *new_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, NULL_TREE, new_var_type); - DEC
PR 54193: raw gimple dump of vec_perm_expr
Hello, I'll have to retest this patch tomorrow (although I don't expect the modified code is ever called), for some reason the testsuite took twice as long as usual to run and showed some weird stuff tonight. There doesn't seem to be any test calling -fdump-tree-*-raw, so I didn't add any. I wondered about spelling out 3 calls to dump_gimple_fmt to avoid the trailing NULLs, but who cares... 2012-08-12 Marc Glisse PR middle-end/54193 * gimple-pretty-print.c (dump_ternary_rhs): Handle 4 arguments. -- Marc GlisseIndex: gcc/gimple-pretty-print.c === --- gcc/gimple-pretty-print.c (revision 190318) +++ gcc/gimple-pretty-print.c (working copy) @@ -470,31 +470,39 @@ dump_ternary_rhs (pretty_printer *buffer /* Dump the gimple assignment GS. BUFFER, SPC and FLAGS are as in dump_gimple_stmt. */ static void dump_gimple_assign (pretty_printer *buffer, gimple gs, int spc, int flags) { if (flags & TDF_RAW) { - tree last; - if (gimple_num_ops (gs) == 2) -last = NULL_TREE; - else if (gimple_num_ops (gs) == 3) -last = gimple_assign_rhs2 (gs); - else -gcc_unreachable (); + tree arg1 = NULL; + tree arg2 = NULL; + tree arg3 = NULL; + switch (gimple_num_ops (gs)) + { + case 4: + arg3 = gimple_assign_rhs3 (gs); + case 3: + arg2 = gimple_assign_rhs2 (gs); + case 2: + arg1 = gimple_assign_rhs1 (gs); + break; + default: + gcc_unreachable (); + } - dump_gimple_fmt (buffer, spc, flags, "%G <%s, %T, %T, %T>", gs, + dump_gimple_fmt (buffer, spc, flags, "%G <%s, %T, %T, %T, %T>", gs, tree_code_name[gimple_assign_rhs_code (gs)], - gimple_assign_lhs (gs), gimple_assign_rhs1 (gs), last); + gimple_assign_lhs (gs), arg1, arg2, arg3); } else { if (!(flags & TDF_RHS_ONLY)) { dump_generic_node (buffer, gimple_assign_lhs (gs), spc, flags, false); pp_space (buffer); pp_character (buffer, '='); if (gimple_assign_nontemporal_move_p (gs))
Re: complex.h
Marc Glisse ha scritto: >To be honest, I only checked the patch on linux/glibc, so there is a >real >risk on other platforms (which I don't have access to). I also did a >quick >sanity check on freebsd (not a true test). Ok, conditioning the small change on glibc would not be a big deal, in case. Let's ask Rainer to double check Solaris, anyway. Paolo
[PATCH] Allow dg-skip-if to use compiler flags specified through set_board_info cflags
This patch allows cflags set in board config files using "set_board_info cflags" to be used in the selectors of dg-skip-if and other dejagnu commands that use the check-flags proc. The code merely adds cflags to compiler_flags in the check-flags proc, exactly the same way as multilib_flags is added. Regards Senthil * lib/target-supports-dg.exp (check-flags): Add cflags from board config to compiler_flags diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp index 2f6c4c2..bdf7476 100644 --- a/gcc/testsuite/lib/target-supports-dg.exp +++ b/gcc/testsuite/lib/target-supports-dg.exp @@ -304,6 +304,9 @@ proc check-flags { args } { # If running a subset of the test suite, $TEST_ALWAYS_FLAGS may not exist. catch {append compiler_flags " $TEST_ALWAYS_FLAGS "} set dest [target_info name] +if [board_info $dest exists cflags] { +append compiler_flags "[board_info $dest cflags] " +} if [board_info $dest exists multilib_flags] { append compiler_flags "[board_info $dest multilib_flags] " }
Re: Unbreak lto plugin
On Sat, Aug 11, 2012 at 5:51 PM, Jan Hubicka wrote: > Hi, > my prevoius LTO cleanup run into problem with plugin enabled builds; > lto-plugin is stupid enought to confused symtab_nodes section with symtab > because it ignores the suffixes. Err, I suppose we should fix the plugin as well. It likely does that because it thinks its suffixes for the partial-link stuff? Richard. > Fixed thus. Comitted as obvoius. > Honza > > Index: ChangeLog > === > --- ChangeLog (revision 190316) > +++ ChangeLog (working copy) > @@ -1,3 +1,8 @@ > +2012-08-11 Jan Hubicka > + > + * lto-section-in.c (lto_section_name): Do not use "symtab" as part of > + symtab_node sectoin name; it confuses plugin. > + > 2012-08-11 Uros Bizjak > > * config/alpha/alpha.c (alpha_stdarg_optimize_hook): Shift DECL_UID > Index: lto-section-in.c > === > --- lto-section-in.c(revision 190312) > +++ lto-section-in.c(working copy) > @@ -55,7 +55,7 @@ const char *lto_section_name[LTO_N_SECTI >"jmpfuncs", >"pureconst", >"reference", > - "symtab_nodes", > + "symbol_nodes", >"opts", >"cgraphopt", >"inline"
Re: complex.h
On Sat, 11 Aug 2012, Paolo Carlini wrote: On 08/10/2012 09:17 PM, Marc Glisse wrote: Ping http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01440.html I would say, let's give Gaby and Benjamin another couple of days, then Marc is welcome to commit the patch. And, after all the help he gave us to sort out the Sparc issues vs c++11 we can only trust him about the header intricacies, eh! Thanks! To be honest, I only checked the patch on linux/glibc, so there is a real risk on other platforms (which I don't have access to). I also did a quick sanity check on freebsd (not a true test). -- Marc Glisse
Re: Reload/i386 patch for secondary memory vs subregs
On 08/11/2012 04:20 PM, John David Anglin wrote: >> * reload1.c (replaced_subreg): New static function. >> (gen_reload): Use it when deciding whether to use secondary >> memory. > > This causes the following on hppa*-*-* (32-bit): > > ../../gcc/gcc/reload1.c: In function 'rtx_def* gen_reload(rtx, rtx, int, > reload_ > type)': > ../../gcc/gcc/reload1.c:8494:12: error: unused variable 'tem1' > [-Werror=unused-v > ariable] >rtx tem, tem1, tem2; > ^ > ../../gcc/gcc/reload1.c:8494:18: error: unused variable 'tem2' > [-Werror=unused-variable] > rtx tem, tem1, tem2; > ^ > ../../gcc/gcc/reload1.c: At global scope: > ../../gcc/gcc/reload1.c:8477:1: error: > 'rtx_def* replaced_subreg(rtx)' defined but not used [-Werror=unused-function] >replaced_subreg (rtx x) > ^ > cc1plus: all warnings being treated as > errors > make[3]: *** [reload1.o] Error 1 Fixed. Bernd Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 190317) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2012-08-11 Bernd Schmidt + + * reload1.c (replaced_subreg, gen_reload): Add + SECONDARY_MEMORY_NEEDED ifdefs as necessary. + 2012-08-11 Jan Hubicka * lto-section-in.c (lto_section_name): Do not use "symtab" as part of Index: gcc/reload1.c === --- gcc/reload1.c (revision 190252) +++ gcc/reload1.c (working copy) @@ -8469,6 +8469,7 @@ emit_insn_if_valid_for_reload (rtx insn) return NULL; } +#ifdef SECONDARY_MEMORY_NEEDED /* If X is not a subreg, return it unmodified. If it is a subreg, look up whether we made a replacement for the SUBREG_REG. Return either the replacement or the SUBREG_REG. */ @@ -8480,6 +8481,7 @@ replaced_subreg (rtx x) return find_replacement (&SUBREG_REG (x)); return x; } +#endif /* Emit code to perform a reload from IN (which may be a reload register) to OUT (which may also be a reload register). IN or OUT is from operand @@ -8491,7 +8493,10 @@ static rtx gen_reload (rtx out, rtx in, int opnum, enum reload_type type) { rtx last = get_last_insn (); - rtx tem, tem1, tem2; + rtx tem; +#ifdef SECONDARY_MEMORY_NEEDED + rtx tem1, tem2; +#endif /* If IN is a paradoxical SUBREG, remove it and try to put the opposite SUBREG on OUT. Likewise for a paradoxical SUBREG on OUT. */
Re: complex.h
On 08/10/2012 09:17 PM, Marc Glisse wrote: Ping http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01440.html I would say, let's give Gaby and Benjamin another couple of days, then Marc is welcome to commit the patch. And, after all the help he gave us to sort out the Sparc issues vs c++11 we can only trust him about the header intricacies, eh! Thanks! Paolo.
Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis
> This is the patch I have committed after resolving a conflict (and > another round of bootstrapping and testing). The ChangeLog is still > the same. > > The previous version of the patch (but with tree code already > converted to an enum), the aggregate jump functions together with > enlarged inlining predicates increased Mozilla Firefox LTO WPA memory > consumption by 1.2% (6775287 kb -> 6855035 kb as measured by > maxmem2.sh). On Monday I will make another measurement with jump > functions completely switched off. Thanks! I did bit of static analysis on Mozilla, we increase number of predicated instructions by about 30% that is quite good. We seem to be weaker on actually using the predicates however, about 1% of inline decisions lead to using those predicates. I guess there is room for improvements in jump functions construction/propagation, right? :) Looking at the dumps we get some quit interesting matches like memset call when the destination argument is known to be zero already etc. Honza
Re: Merge varpool and cgraph encoders into symtab encoders
> > This caused: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54229 It was the other patch, but I just fixed it. See my other mail. Honza > > -- > H.J.
Unbreak lto plugin
Hi, my prevoius LTO cleanup run into problem with plugin enabled builds; lto-plugin is stupid enought to confused symtab_nodes section with symtab because it ignores the suffixes. Fixed thus. Comitted as obvoius. Honza Index: ChangeLog === --- ChangeLog (revision 190316) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2012-08-11 Jan Hubicka + + * lto-section-in.c (lto_section_name): Do not use "symtab" as part of + symtab_node sectoin name; it confuses plugin. + 2012-08-11 Uros Bizjak * config/alpha/alpha.c (alpha_stdarg_optimize_hook): Shift DECL_UID Index: lto-section-in.c === --- lto-section-in.c(revision 190312) +++ lto-section-in.c(working copy) @@ -55,7 +55,7 @@ const char *lto_section_name[LTO_N_SECTI "jmpfuncs", "pureconst", "reference", - "symtab_nodes", + "symbol_nodes", "opts", "cgraphopt", "inline"
[PATCH, alpha]: Variable arguments handling broken by r190229
Hello! We have to apply the same DECL_UID change as in tree-stdarg.c to alpha_stdarg_optimize_hook. 2012-08-11 Uros Bizjak * config/alpha/alpha.c (alpha_stdarg_optimize_hook): Shift DECL_UID in the va_list_vars bitmap by num_ssa_names. Tested on alphaev68-pc-linux-gnu, committed to mainline SVN as obvious. Uros. Index: config/alpha/alpha.c === --- config/alpha/alpha.c(revision 190311) +++ config/alpha/alpha.c(working copy) @@ -5942,7 +5942,7 @@ base = get_base_address (base); if (TREE_CODE (base) != VAR_DECL - || !bitmap_bit_p (si->va_list_vars, DECL_UID (base))) + || !bitmap_bit_p (si->va_list_vars, DECL_UID (base) + num_ssa_names)) return false; offset = gimple_op (stmt, 1 + offset_arg); Index: ChangeLog === --- ChangeLog (revision 190315) +++ ChangeLog (working copy) @@ -1,5 +1,10 @@ 2012-08-11 Uros Bizjak + * config/alpha/alpha.c (alpha_stdarg_optimize_hook): Shift DECL_UID + in the va_list_vars bitmap by num_ssa_names. + +2012-08-11 Uros Bizjak + * config/i386/sse.md (xop integer multiply/add insns): Use register_operand for operand 3 predicate. (xop_phaddbq): Fix vec_select selectors.
Re: Reload/i386 patch for secondary memory vs subregs
> * reload1.c (replaced_subreg): New static function. > (gen_reload): Use it when deciding whether to use secondary > memory. This causes the following on hppa*-*-* (32-bit): ../../gcc/gcc/reload1.c: In function 'rtx_def* gen_reload(rtx, rtx, int, reload_ type)': ../../gcc/gcc/reload1.c:8494:12: error: unused variable 'tem1' [-Werror=unused-v ariable] rtx tem, tem1, tem2; ^ ../../gcc/gcc/reload1.c:8494:18: error: unused variable 'tem2' [-Werror=unused-variable] rtx tem, tem1, tem2; ^ ../../gcc/gcc/reload1.c: At global scope: ../../gcc/gcc/reload1.c:8477:1: error: 'rtx_def* replaced_subreg(rtx)' defined but not used [-Werror=unused-function] replaced_subreg (rtx x) ^ cc1plus: all warnings being treated as errors make[3]: *** [reload1.o] Error 1 Dave -- J. David Anglin dave.ang...@nrc-cnrc.gc.ca National Research Council of Canada (613) 990-0752 (FAX: 952-6602)
Re: Merge varpool and cgraph encoders into symtab encoders
On Fri, Aug 10, 2012 at 8:56 AM, Jan Hubicka wrote: > Hi, > this patch does the boring job of merging varpool and cgraph encoders into > single symtab encoder. This has some snowballing effect. Because the IDs of > cgraphs and varpools now use single space, the cgraph and varpool sections > needs to be merged. Also reference lists output code is simplified. The > uses of encoders needs to be updated to expect all kinds of symtab nodes in > the encoder lists. > > I broke out all subsequent changes from this renaming happy patch to make them > more readable. > > Bootstrapped/regtested x86_64-linux and tested with Mozilla build. I will > commit it after bit of further testing. The mozilla binarry differs (due to > different UIDs) but has same size. > > Honza > > * cgraph.h (vector types for symtab_node): Add. > * ipa-reference.c (ipa_reference_write_optimization_summary): Update > for new symtab encoder. > (ipa_reference_read_optimization_summary): Likewise. > * lto-cgraph.c (output_varpool): Remove. > (input_cgraph_opt_summary): Take symtab nodes vector as argument. > (LTO_cgraph_tags): Rename to ... > (LTO_symtab_tags): ... this one; add LTO_symtab_variable. > (lto_cgraph_encoder_new): Rename to ... > (lto_symtab_encoder_new): ... this on. > (lto_cgraph_encoder_encode): Rename to ... > (lto_symtab_encoder_encode): ... this one. > (lto_cgraph_encoder_delete): Rename to ... > (lto_symtab_encoder_delete): ... this one. > (lto_cgraph_encoder_deref): Rename to ... > (lto_symtab_encoder_deref): ... this one. > (lto_cgraph_encoder_encode_body_p): Rename to ... > (lto_symtab_encoder_encode_body_p): ... this one. > (lto_varpool_encoder_new, lto_varpool_encoder_delete, > lto_varpool_encoder_encode, lto_varpool_encoder_lookup, > lto_varpool_encoder_deref): Remove. > (lto_varpool_encoder_encode_initializer_p): Rename to ... > (lto_symtab_encoder_encode_initializer_p): ... this one. > (lto_set_varpool_encoder_encode_initializer): Rename to ... > (lto_set_symtab_encoder_encode_initializer): ... this one. > (lto_output_edge): Update. > (lto_output_node): Update. > (lto_output_varpool_node): Update; stream out LTO_symtab_variable tag. > (lto_output_ref): Drop varpool_encoder; update. > (add_node_to): Update. > (add_references): Update. > (output_outgoing_cgraph_edges): Update. > (output_refs): Update. > (compute_ltrans_boundary): Update. > (output_cgraph): Update; output varpools too. > (input_overwrite_node): Update. > (output_varpool): Remove. > (input_node): Update. > (input_ref): Update. > (input_edge): Update. > (input_cgraph_1): Update; input varpool too; unify fixup code. > (input_varpool_1): Remove. > (input_refs): Update. > (input_cgraph): Update. > (output_node_opt_summary): Update. > (input_cgraph_opt_section): Update. > (input_cgraph_opt_summary): Update. > * ipa-pure-const.c (pure_const_write_summary): Update. > (pure_const_read_summary): Update. > * lto-streamer-out.c (lto_write_tree): Update. > (lto_output): Likewise. > (produce_symtab): Update. > (produce_asm_for_decls): Update. > * ipa-inline-analysis.c (inline_read_section): Update. > (inline_write_summary): Update. > * ipa-prop.c (ipa_write_node_info): Update. > (ipa_prop_read_section): Update. > * lto-streamer.h (lto_cgraph_encoder_d): Rename to ... > (lto_symtab_encoder_d): ... this one; add initializer. > (lto_cgraph_encoder_t): Rename to ... > (lto_symtab_encoder_t): ... this one. > (lto_cgraph_encoder_size): Rename to ... > (lto_symtab_encoder_size): ... this one. > (lto_varpool_encoder_d): ... remove. > (lto_varpool_encoder_t): Remove. > (lto_out_decl_state): Remove cgraph_node_encoder, varpool_node_encoder > add symtab_node_encoder. > (lto_file_decl_data): Likewise. > (lto_cgraph_encoder_deref, lto_cgraph_encoder_lookup, > lto_cgraph_encoder_new, lto_cgraph_encoder_encode, > lto_cgraph_encoder_delete, > lto_cgraph_encoder_encode_body_p, lto_varpool_encoder_encode_body_p, > lto_varpool_encoder_deref, lto_varpool_encoder_lookup, > lto_varpool_encoder_new, > lto_varpool_encoder_encode, lto_varpool_encoder_delete, > lto_varpool_encoder_encode_initializer_p): Remove. > (lto_symtab_encoder_deref, lto_symtab_encoder_lookup, > lto_symtab_encoder_t, lto_symtab_encoder_encode, > lto_symtab_encoder_delete, > lto_symtab_encoder_encode_body_p, > lto_symtab_encoder_encode_initializer_p): > New. This caused: http://gcc.gnu.org/bugzilla/show
Re: Value type of map need not be default copyable
On Sat, 11 Aug 2012, François Dumont wrote: Your remark on using std::move rather than std::forward Marc made sens but didn't work. I don't understand why but the new test is showing that std::forward works. If anyone can explain why std::move doesn't work I am interested. What testcase failed? I just tried the 2.cc file you added with the patch, and replacing forward(__k) with move(__k) compiled fine. -- Marc Glisse
Re: Value type of map need not be default copyable
Here is an other attempt. I took the time to refactor the hashtable implementation. I prefer to rename _M_insert_node into _M_insert_unique_node and use it also into _M_emplace implementation. I introduce _M_insert_multi_node that is used in _M_insert and _M_emplace when keys are not unique. Your remark on using std::move rather than std::forward Marc made sens but didn't work. I don't understand why but the new test is showing that std::forward works. If anyone can explain why std::move doesn't work I am interested. For your question regarding how to include headers I just follow current method. Normally it is done so to make headers more reusable but in this case I agree that hashtable_policy.h can't be included without before. Should I put include into hashtable_policy.h ? Adding a declaration of std::tuple in hashtable_policy.h could make this header less dependent on , should I do so ? 2012-08-09 François Dumont Ollie Wild * include/bits/hashtable.h (_Hashtable<>_M_insert_multi_node(hash_code, node_type*)): New. (_Hashtable<>_M_insert(_Args&&, false_type)): Use latter. (_Hashtable<>::_M_emplace(false_type, _Args&&...)): Likewise. (_Hashtable<>::_M_insert_bucket): Replace by ... (_Hashtable<>::_M_insert_unique_node(size_type, hash_code, node_type*)): ... this, new. (_Hashtable<>::_M_insert(_Args&&, true_type)): Use latter. (_Hashtable<>::_M_emplace(true_type, _Args&&...)): Likewise. * include/bits/hashtable_policy.h (_Map_base<>::operator[]): Use latter, emplace the value_type rather than insert. * include/std/unordered_map: Include tuple. * include/std/unordered_set: Likewise. * testsuite/util/testsuite_counter_type.h: New. * testsuite/23_containers/unordered_map/operators/2.cc: New. Tested under linux x86_64, normal and debug mode. Ok for trunk ? François On 08/10/2012 01:26 AM, Paolo Carlini wrote: On 08/09/2012 11:22 PM, Marc Glisse wrote: I don't know if std:: is needed, but it looks strange to have it only on some functions: std::forward_as_tuple(forward(__k)), Looking at this line again, you seem to be using std::forward on something that is not a deduced parameter type. I guess it is equivalent to std::move in this case, it just confuses me a bit. Wanted to point out that yesterday. Please double check std::move. I realize now that nobody is interested in std::cref, good ;) Thanks! Paolo. Index: include/std/unordered_map === --- include/std/unordered_map (revision 190209) +++ include/std/unordered_map (working copy) @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include // equal_to, _Identity, _Select1st Index: include/std/unordered_set === --- include/std/unordered_set (revision 190209) +++ include/std/unordered_set (working copy) @@ -38,6 +38,7 @@ #include #include #include +#include #include #include #include // equal_to, _Identity, _Select1st Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 190209) +++ include/bits/hashtable.h (working copy) @@ -584,10 +584,17 @@ __node_base* _M_get_previous_node(size_type __bkt, __node_base* __n); - template - iterator - _M_insert_bucket(_Arg&&, size_type, __hash_code); + // Insert node with hash code __code, in bucket bkt if no rehash (assumes + // no element with its key already present). Take ownership of the node, + // deallocate it on exception. + iterator + _M_insert_unique_node(size_type __bkt, __hash_code __code, + __node_type* __n); + // Insert node with hash code __code. Take ownership of the node, + // deallocate it on exception. + iterator + _M_insert_multi_node(__hash_code __code, __node_type* __n); template std::pair @@ -1214,42 +1221,29 @@ { // First build the node to get access to the hash code __node_type* __node = _M_allocate_node(std::forward<_Args>(__args)...); + const key_type& __k = this->_M_extract()(__node->_M_v); + __hash_code __code; __try { - const key_type& __k = this->_M_extract()(__node->_M_v); - __hash_code __code = this->_M_hash_code(__k); - size_type __bkt = _M_bucket_index(__k, __code); - - if (__node_type* __p = _M_find_node(__bkt, __k, __code)) - { - // There is already an equivalent node, no insertion - _M_deallocate_node(__node); - return std::make_pair(iterator(__p), false); - } - - // We are going to insert this node - this->_M_store_code(__node, __code); - const __rehash_state& __saved_state - = _M_rehash_policy._M_state(); - std::pair __do_rehash - = _M_rehash_policy._M_need_rehash(_M_bucket_count, - _M_element_count, 1); - - if (__do_rehash.first) - { -
Re: combine permutations in gimple
On Fri, 10 Aug 2012, Marc Glisse wrote: this patch detects permutations of permutations and merges them. It also canonicalizes permutations a bit more. There are several issues with this patch: 1) I am not sure we always want to combine permutations. Indeed, someone (user? vectorizer?) may have written 2 permutations to help the backend generate optimal code, whereas it will do a bad job on the complicated combined permutation. At tree level, I am not sure what can be done except possibly restrict the optimization to the case where the combined permutation is the identity. At the RTL level, we could compare what insns are generated, but that's going to be painful (doing anything with permutations at the RTL level is painful). 2) I don't understand how the cleanup works in tree-ssa-forwprop.c. I copied some fold/update/remove lines from other simplifications, but I don't know if they are right. 3) In a first version of the patch, where I had SSA_NAME_DEF_STMT instead of get_prop_source_stmt, I got an ICE in one of the torture vectorization testcases. It happened in expand_expr_real_2, because it asserts that expand_vec_perm will never fail. However, expand_vec_perm does return NULL_RTX sometimes. Here it was in V16QImode, with the permutation {0,0,2,2,4,...,14,14}. maybe_expand_insn can't handle it directly (that's ok I guess), but then expand_vec_perm sees VOIDmode and exits instead of trying other ways. I don't know if there is a latent bug or if (more likely) my patch may produce trees with wrong modes. 4) Is this the right place? This isn't the transformation I am most interested in, but it is a first step to see if the direction is right. Hello, there was yet another issue with the version I posted: the testcase was trivial so I didn't notice that it didn't perform the optimization at all... Here is a new version. It seems a bit strange to me that there are plenty of functions that check for single-use variables, but there isn't any that checks for variables used in a single statement (but possibly twice). So I may be doing something strange, but it seems to be the natural test here. Happily passed bootstrap+regtest. 2012-08-11 Marc Glisse gcc/ * fold-const.c (fold_ternary_loc): Detect identity permutations. Canonicalize permutations more. * tree-ssa-forwprop.c (is_only_used_in): New function. (simplify_permutation): Likewise. (ssa_forward_propagate_and_combine): Call it. gcc/testsuite/ * gcc.dg/tree-ssa/forwprop-19.c: New testcase. -- Marc GlisseIndex: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 190311) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -2569,20 +2569,97 @@ combine_conversions (gimple_stmt_iterato gimple_assign_set_rhs_code (stmt, CONVERT_EXPR); update_stmt (stmt); return remove_prop_source_from_use (op0) ? 2 : 1; } } } return 0; } +/* Return true if VAR has no nondebug use but in stmt. */ +static bool +is_only_used_in (const_tree var, const_gimple stmt) +{ + const ssa_use_operand_t *const ptr0 = &(SSA_NAME_IMM_USE_NODE (var)); + const ssa_use_operand_t *ptr = ptr0->next; + + for (; ptr != ptr0; ptr = ptr->next) +{ + const_gimple use_stmt = USE_STMT (ptr); + if (is_gimple_debug (use_stmt)) + continue; + if (use_stmt != stmt) + return false; +} + return true; +} + +/* Combine two shuffles in a row. Returns 1 if there were any changes + made, 2 if cfg-cleanup needs to run. Else it returns 0. */ + +static int +simplify_permutation (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + gimple def_stmt; + tree op0, op1, op2, op3, mask; + enum tree_code code = gimple_assign_rhs_code (stmt); + enum tree_code code2; + location_t loc = gimple_location (stmt); + + gcc_checking_assert (code == VEC_PERM_EXPR); + + op0 = gimple_assign_rhs1 (stmt); + op1 = gimple_assign_rhs2 (stmt); + op2 = gimple_assign_rhs3 (stmt); + + if (TREE_CODE (op0) != SSA_NAME) +return 0; + + if (TREE_CODE (op2) != VECTOR_CST) +return 0; + + if (op0 != op1) +return 0; + + def_stmt = SSA_NAME_DEF_STMT (op0); + if (!def_stmt || !is_gimple_assign (def_stmt) + || !can_propagate_from (def_stmt) + || !is_only_used_in (op0, stmt)) +return 0; + + /* Check that it is only used here. We cannot use has_single_use + since the expression is using it twice itself... */ + + code2 = gimple_assign_rhs_code (def_stmt); + + /* Two consecutive shuffles. */ + if (code2 == VEC_PERM_EXPR) +{ + op3 = gimple_assign_rhs3 (def_stmt); + if (TREE_CODE (op3) != VECTOR_CST) + return 0; + mask = fold_build3_loc (loc, VEC_PERM_EXPR, TREE_TYPE (op3), + op3, op3, op2); + gcc_assert (TREE_CODE (mask) == VECTOR_CST); + gimple_assign_s
[PATCH, i386]: Fix a couple of XOP issues
Hello! There is no point to use nonimmediate_operand predicate, if the insn accepts only register operands. Also, the patch fixes wrong selectors for horizontal add insn. 2012-08-11 Uros Bizjak * config/i386/sse.md (xop integer multiply/add insns): Use register_operand for operand 3 predicate. (xop_phaddbq): Fix vec_select selectors. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline. Wrong selectors fix will be backported to 4.7. Uros. Index: sse.md === --- sse.md (revision 190311) +++ sse.md (working copy) @@ -9547,9 +9547,6 @@ (define_code_attr madcs [(plus "madcs") (ss_plus "madcss")]) ;; XOP parallel integer multiply/add instructions. -;; Note the XOP multiply/add instructions -;; a[i] = b[i] * c[i] + d[i]; -;; do not allow the value being added to be a memory operation. (define_insn "xop_p" [(set (match_operand:VI24_128 0 "register_operand" "=x") @@ -9557,7 +9554,7 @@ (mult:VI24_128 (match_operand:VI24_128 1 "nonimmediate_operand" "%x") (match_operand:VI24_128 2 "nonimmediate_operand" "xm")) -(match_operand:VI24_128 3 "nonimmediate_operand" "x")))] +(match_operand:VI24_128 3 "register_operand" "x")))] "TARGET_XOP" "vp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") @@ -9575,7 +9572,7 @@ (vec_select:V2SI (match_operand:V4SI 2 "nonimmediate_operand" "xm") (parallel [(const_int 0) (const_int 2)] -(match_operand:V2DI 3 "nonimmediate_operand" "x")))] +(match_operand:V2DI 3 "register_operand" "x")))] "TARGET_XOP" "vpdql\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") @@ -9593,7 +9590,7 @@ (vec_select:V2SI (match_operand:V4SI 2 "nonimmediate_operand" "xm") (parallel [(const_int 1) (const_int 3)] -(match_operand:V2DI 3 "nonimmediate_operand" "x")))] +(match_operand:V2DI 3 "register_operand" "x")))] "TARGET_XOP" "vpdqh\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") @@ -9614,7 +9611,7 @@ (match_operand:V8HI 2 "nonimmediate_operand" "xm") (parallel [(const_int 1) (const_int 3) (const_int 5) (const_int 7)] -(match_operand:V4SI 3 "nonimmediate_operand" "x")))] +(match_operand:V4SI 3 "register_operand" "x")))] "TARGET_XOP" "vpwd\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") @@ -9646,7 +9643,7 @@ (match_dup 2) (parallel [(const_int 1) (const_int 3) (const_int 5) (const_int 7)]) -(match_operand:V4SI 3 "nonimmediate_operand" "x")))] +(match_operand:V4SI 3 "register_operand" "x")))] "TARGET_XOP" "vpwd\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") @@ -9722,39 +9719,39 @@ (any_extend:V2DI (vec_select:V2QI (match_operand:V16QI 1 "nonimmediate_operand" "xm") -(parallel [(const_int 0) (const_int 4)]))) +(parallel [(const_int 0) (const_int 8)]))) (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 1) (const_int 5)] +(parallel [(const_int 1) (const_int 9)] (plus:V2DI (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 2) (const_int 6)]))) +(parallel [(const_int 2) (const_int 10)]))) (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 3) (const_int 7)]) +(parallel [(const_int 3) (const_int 11)]) (plus:V2DI (plus:V2DI (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 8) (const_int 12)]))) +(parallel [(const_int 4) (const_int 12)]))) (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 9) (const_int 13)] +(parallel [(const_int 5) (const_int 13)] (plus:V2DI (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 10) (const_int 14)]))) +(parallel [(const_int 6) (const_int 14)]))) (any_extend:V2DI (vec_select:V2QI (match_dup 1) -(parallel [(const_int 11) (const_int 15)])))] +(parallel [(const_int 7) (const_int 15)])))] "TARGET_XOP" "vphaddbq\t{%1, %0|%0, %1}" [(set_attr "type" "sseiadd1")])
Re: [PATCH 2/3] Incorporate aggregate jump functions into inlining analysis
Hi, On Fri, Aug 10, 2012 at 04:39:44PM +0200, Martin Jambor wrote: > On Fri, Aug 10, 2012 at 05:12:31AM +0200, Jan Hubicka wrote: > > > Hi, > > > > > ... > > > > > > > 2012-07-31 Martin Jambor > > > > > > PR fortran/48636 > > > * ipa-inline.h (condition): New fields offset, agg_contents and by_ref. > > > * ipa-inline-analysis.c (agg_position_info): New type. > > > (add_condition): New parameter aggpos, also store agg_contents, by_ref > > > and offset. > > > (dump_condition): Also dump aggregate conditions. > > > (evaluate_conditions_for_known_args): Also handle aggregate > > > conditions. New parameter known_aggs. > > > (evaluate_properties_for_edge): Gather known aggregate contents. > > > (inline_node_duplication_hook): Pass NULL known_aggs to > > > evaluate_conditions_for_known_args. > > > (unmodified_parm): Split into unmodified_parm and unmodified_parm_1. > > > (unmodified_parm_or_parm_agg_item): New function. > > > (set_cond_stmt_execution_predicate): Handle values passed in > > > aggregates. > > > (set_switch_stmt_execution_predicate): Likewise. > > > (will_be_nonconstant_predicate): Likewise. > > > (estimate_edge_devirt_benefit): Pass new parameter known_aggs to > > > ipa_get_indirect_edge_target. > > > (estimate_calls_size_and_time): New parameter known_aggs, pass it > > > recrsively to itself and to estimate_edge_devirt_benefit. > > > (estimate_node_size_and_time): New vector known_aggs, pass it o > > > functions which need it. > > > (remap_predicate): New parameter offset_map, use it to remap aggregate > > > conditions. > > > (remap_edge_summaries): New parameter offset_map, pass it recursively > > > to itself and to remap_predicate. > > > (inline_merge_summary): Also create and populate vector offset_map. > > > (do_estimate_edge_time): New vector of known aggregate contents, > > > passed to functions which need it. > > > (inline_read_section): Stream new fields of condition. > > > (inline_write_summary): Likewise. > > > * ipa-cp.c (ipa_get_indirect_edge_target): Also examine the aggregate > > > contents. Let all local callers pass NULL for known_aggs. > > > > > > * testsuite/gfortran.dg/pr48636.f90: New test. > > > > OK with the following changes. > > > > I plan to push out my inline hints code, so it would be nice if you > > commited soon > > so I cn resolve conflicts on my side. > > > Index: src/gcc/ipa-inline.h > > > === > > > *** src.orig/gcc/ipa-inline.h > > > --- src/gcc/ipa-inline.h > > > *** along with GCC; see the file COPYING3. > > > *** 28,36 > > > --- 28,45 > > > > > > typedef struct GTY(()) condition > > > { > > > + /* If agg_contents is set, this is the offset from which the used > > > data was > > > +loaded. */ > > > + HOST_WIDE_INT offset; > > > tree val; > > > int operand_num; > > > enum tree_code code; > > > + /* Set if the used data were loaded from an aggregate parameter or > > > from > > > +data received by reference. */ > > > + unsigned agg_contents : 1; > > > + /* If agg_contents is set, this differentiates between loads from > > > data > > > +passed by reference and by value. */ > > > + unsigned by_ref : 1; > > > > > Do you have any data on memory usage? I was originally concerned > > about memory use of the whole predicate thingy on WPA level. > > Eventually we could add simple inheritance on conditions and sort > > them into mutiple vectors if needed. But I assume it is OK or we > > will work out on Mozilla builds soonish. > > > > One obvious thing is to patch CODE and the bitfields so we fit in 3 > > 64bit words. > > OK, I made it an enum, I will look at memory consumption in a while. This is the patch I have committed after resolving a conflict (and another round of bootstrapping and testing). The ChangeLog is still the same. The previous version of the patch (but with tree code already converted to an enum), the aggregate jump functions together with enlarged inlining predicates increased Mozilla Firefox LTO WPA memory consumption by 1.2% (6775287 kb -> 6855035 kb as measured by maxmem2.sh). On Monday I will make another measurement with jump functions completely switched off. Thanks, Martin Index: src/gcc/ipa-inline.h === *** src.orig/gcc/ipa-inline.h --- src/gcc/ipa-inline.h *** along with GCC; see the file COPYING3. *** 28,36 typedef struct GTY(()) condition { tree val; int operand_num; ! enum tree_code code; } condition; DEF_VEC_O (condition); --- 28,45 typedef struct GTY(()) condition { + /* If agg_contents is set, this is the offset from which the used data was +loaded. */ + HOST_WIDE_INT offset; tree val; int operand_num; ! ENUM_BITFIELD(tr