Re: move increase_alignment from simple to regular ipa pass
ping * 3 https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html Thanks, Prathamesh On 5 July 2016 at 10:53, Prathamesh Kulkarniwrote: > ping * 2 ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html > > Thanks, > Prathamesh > > On 28 June 2016 at 14:49, Prathamesh Kulkarni > wrote: >> ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html >> >> Thanks, >> Prathamesh >> >> On 23 June 2016 at 22:51, Prathamesh Kulkarni >> wrote: >>> On 17 June 2016 at 19:52, Prathamesh Kulkarni >>> wrote: On 14 June 2016 at 18:31, Prathamesh Kulkarni wrote: > On 13 June 2016 at 16:13, Jan Hubicka wrote: >>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h >>> index ecafe63..41ac408 100644 >>> --- a/gcc/cgraph.h >>> +++ b/gcc/cgraph.h >>> @@ -1874,6 +1874,9 @@ public: >>> if we did not do any inter-procedural code movement. */ >>>unsigned used_by_single_function : 1; >>> >>> + /* Set if -fsection-anchors is set. */ >>> + unsigned section_anchor : 1; >>> + >>> private: >>>/* Assemble thunks and aliases associated to varpool node. */ >>>void assemble_aliases (void); >>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c >>> index 4bfcad7..e75d5c0 100644 >>> --- a/gcc/cgraphunit.c >>> +++ b/gcc/cgraphunit.c >>> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) >>> it is available to notice_global_symbol. */ >>>node->definition = true; >>>notice_global_symbol (decl); >>> + >>> + node->section_anchor = flag_section_anchors; >>> + >>>if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >>>/* Traditionally we do not eliminate static variables when not >>>optimizing and when not doing toplevel reoder. */ >>> diff --git a/gcc/common.opt b/gcc/common.opt >>> index f0d7196..e497795 100644 >>> --- a/gcc/common.opt >>> +++ b/gcc/common.opt >>> @@ -1590,6 +1590,10 @@ fira-algorithm= >>> Common Joined RejectNegative Enum(ira_algorithm) >>> Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization >>> -fira-algorithm=[CB|priority] Set the used IRA algorithm. >>> >>> +fipa-increase_alignment >>> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization >>> +Option to gate increase_alignment ipa pass. >>> + >>> Enum >>> Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA >>> algorithm %qs) >>> >>> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >>> Init(1) Optimization >>> Enable the dependent count heuristic in the scheduler. >>> >>> fsection-anchors >>> -Common Report Var(flag_section_anchors) Optimization >>> +Common Report Var(flag_section_anchors) >>> Access data in the same section from shared anchor points. >>> >>> fsee >>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >>> index a0db3a4..1482566 100644 >>> --- a/gcc/config/aarch64/aarch64.c >>> +++ b/gcc/config/aarch64/aarch64.c >>> @@ -8252,6 +8252,8 @@ aarch64_override_options (void) >>> >>>aarch64_register_fma_steering (); >>> >>> + /* Enable increase_alignment pass. */ >>> + flag_ipa_increase_alignment = 1; >> >> I would rather enable it always on targets that do support anchors. > AFAIK aarch64 supports section anchors. >>> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c >>> index ce9e146..7f09f3a 100644 >>> --- a/gcc/lto/lto-symtab.c >>> +++ b/gcc/lto/lto-symtab.c >>> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, >>> symtab_node *entry) >>> The type compatibility checks or the completing of types has >>> properly >>> dealt with most issues. */ >>> >>> + /* ??? is this assert necessary ? */ >>> + varpool_node *v_prevailing = dyn_cast (prevailing); >>> + varpool_node *v_entry = dyn_cast (entry); >>> + gcc_assert (v_prevailing && v_entry); >>> + /* section_anchor of prevailing_decl wins. */ >>> + v_entry->section_anchor = v_prevailing->section_anchor; >>> + >> Other flags are merged in lto_varpool_replace_node so please move this >> there. > Ah indeed, thanks for the pointers. > I wonder though if we need to set > prevailing_node->section_anchor = vnode->section_anchor ? > IIUC, the function merges flags from vnode into prevailing_node > and removes vnode. However we want prevailing_node->section_anchor > to always take precedence. >>> +/* Return true if alignment should be increased for this vnode. >>> + This is done if every function that references/referring to vnode >>> + has
Re: move increase_alignment from simple to regular ipa pass
ping * 2 ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html Thanks, Prathamesh On 28 June 2016 at 14:49, Prathamesh Kulkarniwrote: > ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html > > Thanks, > Prathamesh > > On 23 June 2016 at 22:51, Prathamesh Kulkarni > wrote: >> On 17 June 2016 at 19:52, Prathamesh Kulkarni >> wrote: >>> On 14 June 2016 at 18:31, Prathamesh Kulkarni >>> wrote: On 13 June 2016 at 16:13, Jan Hubicka wrote: >> diff --git a/gcc/cgraph.h b/gcc/cgraph.h >> index ecafe63..41ac408 100644 >> --- a/gcc/cgraph.h >> +++ b/gcc/cgraph.h >> @@ -1874,6 +1874,9 @@ public: >> if we did not do any inter-procedural code movement. */ >>unsigned used_by_single_function : 1; >> >> + /* Set if -fsection-anchors is set. */ >> + unsigned section_anchor : 1; >> + >> private: >>/* Assemble thunks and aliases associated to varpool node. */ >>void assemble_aliases (void); >> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c >> index 4bfcad7..e75d5c0 100644 >> --- a/gcc/cgraphunit.c >> +++ b/gcc/cgraphunit.c >> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) >> it is available to notice_global_symbol. */ >>node->definition = true; >>notice_global_symbol (decl); >> + >> + node->section_anchor = flag_section_anchors; >> + >>if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >>/* Traditionally we do not eliminate static variables when not >>optimizing and when not doing toplevel reoder. */ >> diff --git a/gcc/common.opt b/gcc/common.opt >> index f0d7196..e497795 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -1590,6 +1590,10 @@ fira-algorithm= >> Common Joined RejectNegative Enum(ira_algorithm) >> Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization >> -fira-algorithm=[CB|priority] Set the used IRA algorithm. >> >> +fipa-increase_alignment >> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization >> +Option to gate increase_alignment ipa pass. >> + >> Enum >> Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA >> algorithm %qs) >> >> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >> Init(1) Optimization >> Enable the dependent count heuristic in the scheduler. >> >> fsection-anchors >> -Common Report Var(flag_section_anchors) Optimization >> +Common Report Var(flag_section_anchors) >> Access data in the same section from shared anchor points. >> >> fsee >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index a0db3a4..1482566 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -8252,6 +8252,8 @@ aarch64_override_options (void) >> >>aarch64_register_fma_steering (); >> >> + /* Enable increase_alignment pass. */ >> + flag_ipa_increase_alignment = 1; > > I would rather enable it always on targets that do support anchors. AFAIK aarch64 supports section anchors. >> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c >> index ce9e146..7f09f3a 100644 >> --- a/gcc/lto/lto-symtab.c >> +++ b/gcc/lto/lto-symtab.c >> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, >> symtab_node *entry) >> The type compatibility checks or the completing of types has >> properly >> dealt with most issues. */ >> >> + /* ??? is this assert necessary ? */ >> + varpool_node *v_prevailing = dyn_cast (prevailing); >> + varpool_node *v_entry = dyn_cast (entry); >> + gcc_assert (v_prevailing && v_entry); >> + /* section_anchor of prevailing_decl wins. */ >> + v_entry->section_anchor = v_prevailing->section_anchor; >> + > Other flags are merged in lto_varpool_replace_node so please move this > there. Ah indeed, thanks for the pointers. I wonder though if we need to set prevailing_node->section_anchor = vnode->section_anchor ? IIUC, the function merges flags from vnode into prevailing_node and removes vnode. However we want prevailing_node->section_anchor to always take precedence. >> +/* Return true if alignment should be increased for this vnode. >> + This is done if every function that references/referring to vnode >> + has flag_tree_loop_vectorize set. */ >> + >> +static bool >> +increase_alignment_p (varpool_node *vnode) >> +{ >> + ipa_ref *ref; >> + >> + for (int i = 0; vnode->iterate_reference (i, ref); i++) >> +if (cgraph_node *cnode = dyn_cast (ref->referred))
Re: move increase_alignment from simple to regular ipa pass
ping https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01703.html Thanks, Prathamesh On 23 June 2016 at 22:51, Prathamesh Kulkarniwrote: > On 17 June 2016 at 19:52, Prathamesh Kulkarni > wrote: >> On 14 June 2016 at 18:31, Prathamesh Kulkarni >> wrote: >>> On 13 June 2016 at 16:13, Jan Hubicka wrote: > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index ecafe63..41ac408 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -1874,6 +1874,9 @@ public: > if we did not do any inter-procedural code movement. */ >unsigned used_by_single_function : 1; > > + /* Set if -fsection-anchors is set. */ > + unsigned section_anchor : 1; > + > private: >/* Assemble thunks and aliases associated to varpool node. */ >void assemble_aliases (void); > diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c > index 4bfcad7..e75d5c0 100644 > --- a/gcc/cgraphunit.c > +++ b/gcc/cgraphunit.c > @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) > it is available to notice_global_symbol. */ >node->definition = true; >notice_global_symbol (decl); > + > + node->section_anchor = flag_section_anchors; > + >if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >/* Traditionally we do not eliminate static variables when not >optimizing and when not doing toplevel reoder. */ > diff --git a/gcc/common.opt b/gcc/common.opt > index f0d7196..e497795 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -1590,6 +1590,10 @@ fira-algorithm= > Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) > Init(IRA_ALGORITHM_CB) Optimization > -fira-algorithm=[CB|priority] Set the used IRA algorithm. > > +fipa-increase_alignment > +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization > +Option to gate increase_alignment ipa pass. > + > Enum > Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA > algorithm %qs) > > @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) > Init(1) Optimization > Enable the dependent count heuristic in the scheduler. > > fsection-anchors > -Common Report Var(flag_section_anchors) Optimization > +Common Report Var(flag_section_anchors) > Access data in the same section from shared anchor points. > > fsee > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index a0db3a4..1482566 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -8252,6 +8252,8 @@ aarch64_override_options (void) > >aarch64_register_fma_steering (); > > + /* Enable increase_alignment pass. */ > + flag_ipa_increase_alignment = 1; I would rather enable it always on targets that do support anchors. >>> AFAIK aarch64 supports section anchors. > diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c > index ce9e146..7f09f3a 100644 > --- a/gcc/lto/lto-symtab.c > +++ b/gcc/lto/lto-symtab.c > @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, > symtab_node *entry) > The type compatibility checks or the completing of types has > properly > dealt with most issues. */ > > + /* ??? is this assert necessary ? */ > + varpool_node *v_prevailing = dyn_cast (prevailing); > + varpool_node *v_entry = dyn_cast (entry); > + gcc_assert (v_prevailing && v_entry); > + /* section_anchor of prevailing_decl wins. */ > + v_entry->section_anchor = v_prevailing->section_anchor; > + Other flags are merged in lto_varpool_replace_node so please move this there. >>> Ah indeed, thanks for the pointers. >>> I wonder though if we need to set >>> prevailing_node->section_anchor = vnode->section_anchor ? >>> IIUC, the function merges flags from vnode into prevailing_node >>> and removes vnode. However we want prevailing_node->section_anchor >>> to always take precedence. > +/* Return true if alignment should be increased for this vnode. > + This is done if every function that references/referring to vnode > + has flag_tree_loop_vectorize set. */ > + > +static bool > +increase_alignment_p (varpool_node *vnode) > +{ > + ipa_ref *ref; > + > + for (int i = 0; vnode->iterate_reference (i, ref); i++) > +if (cgraph_node *cnode = dyn_cast (ref->referred)) > + { > + struct cl_optimization *opts = opts_for_fn (cnode->decl); > + if (!opts->x_flag_tree_loop_vectorize) > + return false; > + } If you take address of function that has vectorizer enabled probably doesn't imply need to increase
Re: move increase_alignment from simple to regular ipa pass
On 17 June 2016 at 19:52, Prathamesh Kulkarniwrote: > On 14 June 2016 at 18:31, Prathamesh Kulkarni > wrote: >> On 13 June 2016 at 16:13, Jan Hubicka wrote: diff --git a/gcc/cgraph.h b/gcc/cgraph.h index ecafe63..41ac408 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -1874,6 +1874,9 @@ public: if we did not do any inter-procedural code movement. */ unsigned used_by_single_function : 1; + /* Set if -fsection-anchors is set. */ + unsigned section_anchor : 1; + private: /* Assemble thunks and aliases associated to varpool node. */ void assemble_aliases (void); diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 4bfcad7..e75d5c0 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) it is available to notice_global_symbol. */ node->definition = true; notice_global_symbol (decl); + + node->section_anchor = flag_section_anchors; + if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) /* Traditionally we do not eliminate static variables when not optimizing and when not doing toplevel reoder. */ diff --git a/gcc/common.opt b/gcc/common.opt index f0d7196..e497795 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1590,6 +1590,10 @@ fira-algorithm= Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization -fira-algorithm=[CB|priority] Set the used IRA algorithm. +fipa-increase_alignment +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization +Option to gate increase_alignment ipa pass. + Enum Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs) @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization Enable the dependent count heuristic in the scheduler. fsection-anchors -Common Report Var(flag_section_anchors) Optimization +Common Report Var(flag_section_anchors) Access data in the same section from shared anchor points. fsee diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index a0db3a4..1482566 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -8252,6 +8252,8 @@ aarch64_override_options (void) aarch64_register_fma_steering (); + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; >>> >>> I would rather enable it always on targets that do support anchors. >> AFAIK aarch64 supports section anchors. diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c index ce9e146..7f09f3a 100644 --- a/gcc/lto/lto-symtab.c +++ b/gcc/lto/lto-symtab.c @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node *entry) The type compatibility checks or the completing of types has properly dealt with most issues. */ + /* ??? is this assert necessary ? */ + varpool_node *v_prevailing = dyn_cast (prevailing); + varpool_node *v_entry = dyn_cast (entry); + gcc_assert (v_prevailing && v_entry); + /* section_anchor of prevailing_decl wins. */ + v_entry->section_anchor = v_prevailing->section_anchor; + >>> Other flags are merged in lto_varpool_replace_node so please move this >>> there. >> Ah indeed, thanks for the pointers. >> I wonder though if we need to set >> prevailing_node->section_anchor = vnode->section_anchor ? >> IIUC, the function merges flags from vnode into prevailing_node >> and removes vnode. However we want prevailing_node->section_anchor >> to always take precedence. +/* Return true if alignment should be increased for this vnode. + This is done if every function that references/referring to vnode + has flag_tree_loop_vectorize set. */ + +static bool +increase_alignment_p (varpool_node *vnode) +{ + ipa_ref *ref; + + for (int i = 0; vnode->iterate_reference (i, ref); i++) +if (cgraph_node *cnode = dyn_cast (ref->referred)) + { + struct cl_optimization *opts = opts_for_fn (cnode->decl); + if (!opts->x_flag_tree_loop_vectorize) + return false; + } >>> >>> If you take address of function that has vectorizer enabled probably doesn't >>> imply need to increase alignment of that var. So please drop the loop. >>> >>> You only want function that read/writes or takes address of the symbol. But >>> onthe other hand, you need to walk all aliases of the symbol by >>> call_for_symbol_and_aliases + + for (int i = 0; vnode->iterate_referring (i, ref); i++)
Re: move increase_alignment from simple to regular ipa pass
On 14 June 2016 at 18:31, Prathamesh Kulkarniwrote: > On 13 June 2016 at 16:13, Jan Hubicka wrote: >>> diff --git a/gcc/cgraph.h b/gcc/cgraph.h >>> index ecafe63..41ac408 100644 >>> --- a/gcc/cgraph.h >>> +++ b/gcc/cgraph.h >>> @@ -1874,6 +1874,9 @@ public: >>> if we did not do any inter-procedural code movement. */ >>>unsigned used_by_single_function : 1; >>> >>> + /* Set if -fsection-anchors is set. */ >>> + unsigned section_anchor : 1; >>> + >>> private: >>>/* Assemble thunks and aliases associated to varpool node. */ >>>void assemble_aliases (void); >>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c >>> index 4bfcad7..e75d5c0 100644 >>> --- a/gcc/cgraphunit.c >>> +++ b/gcc/cgraphunit.c >>> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) >>> it is available to notice_global_symbol. */ >>>node->definition = true; >>>notice_global_symbol (decl); >>> + >>> + node->section_anchor = flag_section_anchors; >>> + >>>if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >>>/* Traditionally we do not eliminate static variables when not >>>optimizing and when not doing toplevel reoder. */ >>> diff --git a/gcc/common.opt b/gcc/common.opt >>> index f0d7196..e497795 100644 >>> --- a/gcc/common.opt >>> +++ b/gcc/common.opt >>> @@ -1590,6 +1590,10 @@ fira-algorithm= >>> Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) >>> Init(IRA_ALGORITHM_CB) Optimization >>> -fira-algorithm=[CB|priority] Set the used IRA algorithm. >>> >>> +fipa-increase_alignment >>> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization >>> +Option to gate increase_alignment ipa pass. >>> + >>> Enum >>> Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA >>> algorithm %qs) >>> >>> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >>> Init(1) Optimization >>> Enable the dependent count heuristic in the scheduler. >>> >>> fsection-anchors >>> -Common Report Var(flag_section_anchors) Optimization >>> +Common Report Var(flag_section_anchors) >>> Access data in the same section from shared anchor points. >>> >>> fsee >>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >>> index a0db3a4..1482566 100644 >>> --- a/gcc/config/aarch64/aarch64.c >>> +++ b/gcc/config/aarch64/aarch64.c >>> @@ -8252,6 +8252,8 @@ aarch64_override_options (void) >>> >>>aarch64_register_fma_steering (); >>> >>> + /* Enable increase_alignment pass. */ >>> + flag_ipa_increase_alignment = 1; >> >> I would rather enable it always on targets that do support anchors. > AFAIK aarch64 supports section anchors. >>> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c >>> index ce9e146..7f09f3a 100644 >>> --- a/gcc/lto/lto-symtab.c >>> +++ b/gcc/lto/lto-symtab.c >>> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node >>> *entry) >>> The type compatibility checks or the completing of types has properly >>> dealt with most issues. */ >>> >>> + /* ??? is this assert necessary ? */ >>> + varpool_node *v_prevailing = dyn_cast (prevailing); >>> + varpool_node *v_entry = dyn_cast (entry); >>> + gcc_assert (v_prevailing && v_entry); >>> + /* section_anchor of prevailing_decl wins. */ >>> + v_entry->section_anchor = v_prevailing->section_anchor; >>> + >> Other flags are merged in lto_varpool_replace_node so please move this there. > Ah indeed, thanks for the pointers. > I wonder though if we need to set > prevailing_node->section_anchor = vnode->section_anchor ? > IIUC, the function merges flags from vnode into prevailing_node > and removes vnode. However we want prevailing_node->section_anchor > to always take precedence. >>> +/* Return true if alignment should be increased for this vnode. >>> + This is done if every function that references/referring to vnode >>> + has flag_tree_loop_vectorize set. */ >>> + >>> +static bool >>> +increase_alignment_p (varpool_node *vnode) >>> +{ >>> + ipa_ref *ref; >>> + >>> + for (int i = 0; vnode->iterate_reference (i, ref); i++) >>> +if (cgraph_node *cnode = dyn_cast (ref->referred)) >>> + { >>> + struct cl_optimization *opts = opts_for_fn (cnode->decl); >>> + if (!opts->x_flag_tree_loop_vectorize) >>> + return false; >>> + } >> >> If you take address of function that has vectorizer enabled probably doesn't >> imply need to increase alignment of that var. So please drop the loop. >> >> You only want function that read/writes or takes address of the symbol. But >> onthe other hand, you need to walk all aliases of the symbol by >> call_for_symbol_and_aliases >>> + >>> + for (int i = 0; vnode->iterate_referring (i, ref); i++) >>> +if (cgraph_node *cnode = dyn_cast (ref->referring)) >>> + { >>> + struct cl_optimization *opts = opts_for_fn (cnode->decl); >>> + if (!opts->x_flag_tree_loop_vectorize) >>> + return
Re: move increase_alignment from simple to regular ipa pass
On 13 June 2016 at 16:13, Jan Hubickawrote: >> diff --git a/gcc/cgraph.h b/gcc/cgraph.h >> index ecafe63..41ac408 100644 >> --- a/gcc/cgraph.h >> +++ b/gcc/cgraph.h >> @@ -1874,6 +1874,9 @@ public: >> if we did not do any inter-procedural code movement. */ >>unsigned used_by_single_function : 1; >> >> + /* Set if -fsection-anchors is set. */ >> + unsigned section_anchor : 1; >> + >> private: >>/* Assemble thunks and aliases associated to varpool node. */ >>void assemble_aliases (void); >> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c >> index 4bfcad7..e75d5c0 100644 >> --- a/gcc/cgraphunit.c >> +++ b/gcc/cgraphunit.c >> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) >> it is available to notice_global_symbol. */ >>node->definition = true; >>notice_global_symbol (decl); >> + >> + node->section_anchor = flag_section_anchors; >> + >>if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >>/* Traditionally we do not eliminate static variables when not >>optimizing and when not doing toplevel reoder. */ >> diff --git a/gcc/common.opt b/gcc/common.opt >> index f0d7196..e497795 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -1590,6 +1590,10 @@ fira-algorithm= >> Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) >> Init(IRA_ALGORITHM_CB) Optimization >> -fira-algorithm=[CB|priority] Set the used IRA algorithm. >> >> +fipa-increase_alignment >> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization >> +Option to gate increase_alignment ipa pass. >> + >> Enum >> Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA >> algorithm %qs) >> >> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >> Init(1) Optimization >> Enable the dependent count heuristic in the scheduler. >> >> fsection-anchors >> -Common Report Var(flag_section_anchors) Optimization >> +Common Report Var(flag_section_anchors) >> Access data in the same section from shared anchor points. >> >> fsee >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index a0db3a4..1482566 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -8252,6 +8252,8 @@ aarch64_override_options (void) >> >>aarch64_register_fma_steering (); >> >> + /* Enable increase_alignment pass. */ >> + flag_ipa_increase_alignment = 1; > > I would rather enable it always on targets that do support anchors. AFAIK aarch64 supports section anchors. >> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c >> index ce9e146..7f09f3a 100644 >> --- a/gcc/lto/lto-symtab.c >> +++ b/gcc/lto/lto-symtab.c >> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node >> *entry) >> The type compatibility checks or the completing of types has properly >> dealt with most issues. */ >> >> + /* ??? is this assert necessary ? */ >> + varpool_node *v_prevailing = dyn_cast (prevailing); >> + varpool_node *v_entry = dyn_cast (entry); >> + gcc_assert (v_prevailing && v_entry); >> + /* section_anchor of prevailing_decl wins. */ >> + v_entry->section_anchor = v_prevailing->section_anchor; >> + > Other flags are merged in lto_varpool_replace_node so please move this there. Ah indeed, thanks for the pointers. I wonder though if we need to set prevailing_node->section_anchor = vnode->section_anchor ? IIUC, the function merges flags from vnode into prevailing_node and removes vnode. However we want prevailing_node->section_anchor to always take precedence. >> +/* Return true if alignment should be increased for this vnode. >> + This is done if every function that references/referring to vnode >> + has flag_tree_loop_vectorize set. */ >> + >> +static bool >> +increase_alignment_p (varpool_node *vnode) >> +{ >> + ipa_ref *ref; >> + >> + for (int i = 0; vnode->iterate_reference (i, ref); i++) >> +if (cgraph_node *cnode = dyn_cast (ref->referred)) >> + { >> + struct cl_optimization *opts = opts_for_fn (cnode->decl); >> + if (!opts->x_flag_tree_loop_vectorize) >> + return false; >> + } > > If you take address of function that has vectorizer enabled probably doesn't > imply need to increase alignment of that var. So please drop the loop. > > You only want function that read/writes or takes address of the symbol. But > onthe other hand, you need to walk all aliases of the symbol by > call_for_symbol_and_aliases >> + >> + for (int i = 0; vnode->iterate_referring (i, ref); i++) >> +if (cgraph_node *cnode = dyn_cast (ref->referring)) >> + { >> + struct cl_optimization *opts = opts_for_fn (cnode->decl); >> + if (!opts->x_flag_tree_loop_vectorize) >> + return false; >> + } >> + >> + return true; >> +} >> + >> /* Entry point to increase_alignment pass. */ >> static unsigned int >> increase_alignment (void) >> @@ -914,9 +942,12 @@ increase_alignment (void) >>
Re: move increase_alignment from simple to regular ipa pass
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index ecafe63..41ac408 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -1874,6 +1874,9 @@ public: > if we did not do any inter-procedural code movement. */ >unsigned used_by_single_function : 1; > > + /* Set if -fsection-anchors is set. */ > + unsigned section_anchor : 1; > + > private: >/* Assemble thunks and aliases associated to varpool node. */ >void assemble_aliases (void); > diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c > index 4bfcad7..e75d5c0 100644 > --- a/gcc/cgraphunit.c > +++ b/gcc/cgraphunit.c > @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl) > it is available to notice_global_symbol. */ >node->definition = true; >notice_global_symbol (decl); > + > + node->section_anchor = flag_section_anchors; > + >if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl) >/* Traditionally we do not eliminate static variables when not >optimizing and when not doing toplevel reoder. */ > diff --git a/gcc/common.opt b/gcc/common.opt > index f0d7196..e497795 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -1590,6 +1590,10 @@ fira-algorithm= > Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) > Init(IRA_ALGORITHM_CB) Optimization > -fira-algorithm=[CB|priority] Set the used IRA algorithm. > > +fipa-increase_alignment > +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization > +Option to gate increase_alignment ipa pass. > + > Enum > Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA > algorithm %qs) > > @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) > Init(1) Optimization > Enable the dependent count heuristic in the scheduler. > > fsection-anchors > -Common Report Var(flag_section_anchors) Optimization > +Common Report Var(flag_section_anchors) > Access data in the same section from shared anchor points. > > fsee > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index a0db3a4..1482566 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -8252,6 +8252,8 @@ aarch64_override_options (void) > >aarch64_register_fma_steering (); > > + /* Enable increase_alignment pass. */ > + flag_ipa_increase_alignment = 1; I would rather enable it always on targets that do support anchors. > diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c > index ce9e146..7f09f3a 100644 > --- a/gcc/lto/lto-symtab.c > +++ b/gcc/lto/lto-symtab.c > @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node > *entry) > The type compatibility checks or the completing of types has properly > dealt with most issues. */ > > + /* ??? is this assert necessary ? */ > + varpool_node *v_prevailing = dyn_cast (prevailing); > + varpool_node *v_entry = dyn_cast (entry); > + gcc_assert (v_prevailing && v_entry); > + /* section_anchor of prevailing_decl wins. */ > + v_entry->section_anchor = v_prevailing->section_anchor; > + Other flags are merged in lto_varpool_replace_node so please move this there. > +/* Return true if alignment should be increased for this vnode. > + This is done if every function that references/referring to vnode > + has flag_tree_loop_vectorize set. */ > + > +static bool > +increase_alignment_p (varpool_node *vnode) > +{ > + ipa_ref *ref; > + > + for (int i = 0; vnode->iterate_reference (i, ref); i++) > +if (cgraph_node *cnode = dyn_cast (ref->referred)) > + { > + struct cl_optimization *opts = opts_for_fn (cnode->decl); > + if (!opts->x_flag_tree_loop_vectorize) > + return false; > + } If you take address of function that has vectorizer enabled probably doesn't imply need to increase alignment of that var. So please drop the loop. You only want function that read/writes or takes address of the symbol. But onthe other hand, you need to walk all aliases of the symbol by call_for_symbol_and_aliases > + > + for (int i = 0; vnode->iterate_referring (i, ref); i++) > +if (cgraph_node *cnode = dyn_cast (ref->referring)) > + { > + struct cl_optimization *opts = opts_for_fn (cnode->decl); > + if (!opts->x_flag_tree_loop_vectorize) > + return false; > + } > + > + return true; > +} > + > /* Entry point to increase_alignment pass. */ > static unsigned int > increase_alignment (void) > @@ -914,9 +942,12 @@ increase_alignment (void) >tree decl = vnode->decl; >unsigned int alignment; > > - if ((decl_in_symtab_p (decl) > - && !symtab_node::get (decl)->can_increase_alignment_p ()) > - || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)) > + if (!vnode->section_anchor > + || (decl_in_symtab_p (decl) > + && !symtab_node::get (decl)->can_increase_alignment_p ()) > + || DECL_USER_ALIGN (decl) > + || DECL_ARTIFICIAL (decl) > + || !increase_alignment_p (vnode))
Re: move increase_alignment from simple to regular ipa pass
On 10 June 2016 at 16:47, Richard Bienerwrote: > On Fri, 10 Jun 2016, Prathamesh Kulkarni wrote: > >> On 10 June 2016 at 01:53, Jan Hubicka wrote: >> >> On 8 June 2016 at 20:38, Jan Hubicka wrote: >> >> >> I think it would be nice to work towards transitioning >> >> >> flag_section_anchors to a flag on varpool nodes, thereby removing >> >> >> the Optimization flag from common.opt:fsection-anchors >> >> >> >> >> >> That would simplify the walk over varpool candidates. >> >> > >> >> > Makes sense to me, too. There are more candidates for sutff that should >> >> > be >> >> > variable specific in common.opt (such as variable alignment, >> >> > -fdata-sctions, >> >> > -fmerge-constants) and targets. We may try to do it in an easy to >> >> > extend way >> >> > so incrementally we can get rid of those global flags, too. >> >> In this version I removed Optimization from fsection-anchors entry in >> >> common.opt, >> >> and gated the increase_alignment pass on flag_section_anchors != 0. >> >> Cross tested on arm*-*-*, aarch64*-*-*. >> >> Does it look OK ? >> > >> > If you go this way you will need to do something sane for LTO. Here one >> > can compile >> > some object files with -fsection-anchors and other without and link with >> > random setting >> > (because in traditional compilation linktime flags does not matter). >> > >> > For global flags we have magic in merge_and_complain that determines flags >> > to pass >> > to the LTO compiler. >> > It is not very robust though. >> >> > >> >> > One thing that needs to be done for LTO is sane merging, I guess in >> >> > this case >> >> > it is clear that the variable should be anchored when its previaling >> >> > definition >> >> > is. >> >> Um could we determine during WPA if symbol is a section anchor for >> >> merging ? >> >> Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at >> >> tree level. >> >> Do we have DECL_RTL info available during WPA ? >> > >> > We don't have anchros computed, but we can decide whether we want to >> > potentially >> > anchor the variable if we can. >> > >> > I would say all you need is to have section_anchor flag in varpool node >> > itself >> > which controls RTL production. At varpool_finalize_decl you will set it >> > according to flag_varpool and stream it to LTO objects. At WPA when doing >> > linking, the section_anchor flag of the previaling decl wins.. >> Thanks for the suggestions. >> IIUC, we want to add new section_anchor flag to varpool_node class >> and set it in varpool_node::finalize_decl and stream it to LTO byte-code, >> and then during WPA set section_anchor_flag during symbol merging if it is >> set >> for prevailing decl. > > Yes. > >> In the increase_alignment_pass if a vnode has section_anchor flag set, >> we will walk all functions that reference it to check if they have >> -ftree-loop-vectorize set. >> Is that correct ? > > Yes. > >> Could you please elaborate a bit more on "at varpool_finalize_decl you will >> set section_anchor flag according to flag_varpool" ? >> flag_varpool doesn't appear to be defined. > > flag_section_anchors. Hi, I have done the changes in this version In varpool_node::finalize_decl, I just set vnode->section_anchor = flag_section_anchors. Should that be sufficient ? I tried with a couple of test-cases, once with prevailing->section_anchors == 1 and once with entry->section_anchors == 1 and it appears prevailing->section_anchor always took precedence. So I wonder if the change to lto_symtab_merge () in the patch is necessary ? Re-introduced flag_ipa_increase_alignment to gate the pass on, so it runs only for targets supporting section anchors. Cross tested on aarch64*-*-*, arm*-*-*. Thanks, Prathamesh > > Richard. > >> Thanks, >> Prathamesh >> > >> > Honza >> >> >> >> Thanks, >> >> Prathamesh >> >> > >> >> > Honza >> >> >> >> >> >> Richard. >> >> >> >> >> >> > Thanks, >> >> >> > Prathamesh >> >> >> > > >> >> >> > > Honza >> >> >> > >> >> >> >> > >> Richard. >> >> >> > >> >> >> > >> >> >> >> >> >> -- >> >> >> Richard Biener >> >> >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, >> >> >> HRB 21284 (AG Nuernberg) >> > >> >> diff --git a/gcc/common.opt b/gcc/common.opt >> >> index f0d7196..f93f26c 100644 >> >> --- a/gcc/common.opt >> >> +++ b/gcc/common.opt >> >> @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >> >> Init(1) Optimization >> >> Enable the dependent count heuristic in the scheduler. >> >> >> >> fsection-anchors >> >> -Common Report Var(flag_section_anchors) Optimization >> >> +Common Report Var(flag_section_anchors) >> >> Access data in the same section from shared anchor points. >> >> >> >> fsee >> >> diff --git a/gcc/passes.def b/gcc/passes.def >> >> index 3647e90..3a8063c 100644 >> >> --- a/gcc/passes.def >> >> +++ b/gcc/passes.def >> >> @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see
Re: move increase_alignment from simple to regular ipa pass
On Fri, 10 Jun 2016, Prathamesh Kulkarni wrote: > On 10 June 2016 at 01:53, Jan Hubickawrote: > >> On 8 June 2016 at 20:38, Jan Hubicka wrote: > >> >> I think it would be nice to work towards transitioning > >> >> flag_section_anchors to a flag on varpool nodes, thereby removing > >> >> the Optimization flag from common.opt:fsection-anchors > >> >> > >> >> That would simplify the walk over varpool candidates. > >> > > >> > Makes sense to me, too. There are more candidates for sutff that should > >> > be > >> > variable specific in common.opt (such as variable alignment, > >> > -fdata-sctions, > >> > -fmerge-constants) and targets. We may try to do it in an easy to > >> > extend way > >> > so incrementally we can get rid of those global flags, too. > >> In this version I removed Optimization from fsection-anchors entry in > >> common.opt, > >> and gated the increase_alignment pass on flag_section_anchors != 0. > >> Cross tested on arm*-*-*, aarch64*-*-*. > >> Does it look OK ? > > > > If you go this way you will need to do something sane for LTO. Here one > > can compile > > some object files with -fsection-anchors and other without and link with > > random setting > > (because in traditional compilation linktime flags does not matter). > > > > For global flags we have magic in merge_and_complain that determines flags > > to pass > > to the LTO compiler. > > It is not very robust though. > >> > > >> > One thing that needs to be done for LTO is sane merging, I guess in this > >> > case > >> > it is clear that the variable should be anchored when its previaling > >> > definition > >> > is. > >> Um could we determine during WPA if symbol is a section anchor for merging > >> ? > >> Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at > >> tree level. > >> Do we have DECL_RTL info available during WPA ? > > > > We don't have anchros computed, but we can decide whether we want to > > potentially > > anchor the variable if we can. > > > > I would say all you need is to have section_anchor flag in varpool node > > itself > > which controls RTL production. At varpool_finalize_decl you will set it > > according to flag_varpool and stream it to LTO objects. At WPA when doing > > linking, the section_anchor flag of the previaling decl wins.. > Thanks for the suggestions. > IIUC, we want to add new section_anchor flag to varpool_node class > and set it in varpool_node::finalize_decl and stream it to LTO byte-code, > and then during WPA set section_anchor_flag during symbol merging if it is set > for prevailing decl. Yes. > In the increase_alignment_pass if a vnode has section_anchor flag set, > we will walk all functions that reference it to check if they have > -ftree-loop-vectorize set. > Is that correct ? Yes. > Could you please elaborate a bit more on "at varpool_finalize_decl you will > set section_anchor flag according to flag_varpool" ? > flag_varpool doesn't appear to be defined. flag_section_anchors. Richard. > Thanks, > Prathamesh > > > > Honza > >> > >> Thanks, > >> Prathamesh > >> > > >> > Honza > >> >> > >> >> Richard. > >> >> > >> >> > Thanks, > >> >> > Prathamesh > >> >> > > > >> >> > > Honza > >> >> > >> > >> >> > >> Richard. > >> >> > > >> >> > > >> >> > >> >> -- > >> >> Richard Biener > >> >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, > >> >> HRB 21284 (AG Nuernberg) > > > >> diff --git a/gcc/common.opt b/gcc/common.opt > >> index f0d7196..f93f26c 100644 > >> --- a/gcc/common.opt > >> +++ b/gcc/common.opt > >> @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) > >> Init(1) Optimization > >> Enable the dependent count heuristic in the scheduler. > >> > >> fsection-anchors > >> -Common Report Var(flag_section_anchors) Optimization > >> +Common Report Var(flag_section_anchors) > >> Access data in the same section from shared anchor points. > >> > >> fsee > >> diff --git a/gcc/passes.def b/gcc/passes.def > >> index 3647e90..3a8063c 100644 > >> --- a/gcc/passes.def > >> +++ b/gcc/passes.def > >> @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see > >>PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) > >>NEXT_PASS (pass_feedback_split_functions); > >>POP_INSERT_PASSES () > >> - NEXT_PASS (pass_ipa_increase_alignment); > >>NEXT_PASS (pass_ipa_tm); > >>NEXT_PASS (pass_ipa_lower_emutls); > >>TERMINATE_PASS_LIST (all_small_ipa_passes) > >> > >>INSERT_PASSES_AFTER (all_regular_ipa_passes) > >> + NEXT_PASS (pass_ipa_increase_alignment); > >>NEXT_PASS (pass_ipa_whole_program_visibility); > >>NEXT_PASS (pass_ipa_profile); > >>NEXT_PASS (pass_ipa_icf); > >> diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c > >> b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c > >> new file mode 100644 > >> index 000..74eaed8 > >> --- /dev/null > >> +++
Re: move increase_alignment from simple to regular ipa pass
On 10 June 2016 at 01:53, Jan Hubickawrote: >> On 8 June 2016 at 20:38, Jan Hubicka wrote: >> >> I think it would be nice to work towards transitioning >> >> flag_section_anchors to a flag on varpool nodes, thereby removing >> >> the Optimization flag from common.opt:fsection-anchors >> >> >> >> That would simplify the walk over varpool candidates. >> > >> > Makes sense to me, too. There are more candidates for sutff that should be >> > variable specific in common.opt (such as variable alignment, >> > -fdata-sctions, >> > -fmerge-constants) and targets. We may try to do it in an easy to extend >> > way >> > so incrementally we can get rid of those global flags, too. >> In this version I removed Optimization from fsection-anchors entry in >> common.opt, >> and gated the increase_alignment pass on flag_section_anchors != 0. >> Cross tested on arm*-*-*, aarch64*-*-*. >> Does it look OK ? > > If you go this way you will need to do something sane for LTO. Here one can > compile > some object files with -fsection-anchors and other without and link with > random setting > (because in traditional compilation linktime flags does not matter). > > For global flags we have magic in merge_and_complain that determines flags to > pass > to the LTO compiler. > It is not very robust though. >> > >> > One thing that needs to be done for LTO is sane merging, I guess in this >> > case >> > it is clear that the variable should be anchored when its previaling >> > definition >> > is. >> Um could we determine during WPA if symbol is a section anchor for merging ? >> Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at >> tree level. >> Do we have DECL_RTL info available during WPA ? > > We don't have anchros computed, but we can decide whether we want to > potentially > anchor the variable if we can. > > I would say all you need is to have section_anchor flag in varpool node itself > which controls RTL production. At varpool_finalize_decl you will set it > according to flag_varpool and stream it to LTO objects. At WPA when doing > linking, the section_anchor flag of the previaling decl wins.. Thanks for the suggestions. IIUC, we want to add new section_anchor flag to varpool_node class and set it in varpool_node::finalize_decl and stream it to LTO byte-code, and then during WPA set section_anchor_flag during symbol merging if it is set for prevailing decl. In the increase_alignment_pass if a vnode has section_anchor flag set, we will walk all functions that reference it to check if they have -ftree-loop-vectorize set. Is that correct ? Could you please elaborate a bit more on "at varpool_finalize_decl you will set section_anchor flag according to flag_varpool" ? flag_varpool doesn't appear to be defined. Thanks, Prathamesh > > Honza >> >> Thanks, >> Prathamesh >> > >> > Honza >> >> >> >> Richard. >> >> >> >> > Thanks, >> >> > Prathamesh >> >> > > >> >> > > Honza >> >> > >> >> >> > >> Richard. >> >> > >> >> > >> >> >> >> -- >> >> Richard Biener >> >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, >> >> HRB 21284 (AG Nuernberg) > >> diff --git a/gcc/common.opt b/gcc/common.opt >> index f0d7196..f93f26c 100644 >> --- a/gcc/common.opt >> +++ b/gcc/common.opt >> @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) >> Init(1) Optimization >> Enable the dependent count heuristic in the scheduler. >> >> fsection-anchors >> -Common Report Var(flag_section_anchors) Optimization >> +Common Report Var(flag_section_anchors) >> Access data in the same section from shared anchor points. >> >> fsee >> diff --git a/gcc/passes.def b/gcc/passes.def >> index 3647e90..3a8063c 100644 >> --- a/gcc/passes.def >> +++ b/gcc/passes.def >> @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see >>PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) >>NEXT_PASS (pass_feedback_split_functions); >>POP_INSERT_PASSES () >> - NEXT_PASS (pass_ipa_increase_alignment); >>NEXT_PASS (pass_ipa_tm); >>NEXT_PASS (pass_ipa_lower_emutls); >>TERMINATE_PASS_LIST (all_small_ipa_passes) >> >>INSERT_PASSES_AFTER (all_regular_ipa_passes) >> + NEXT_PASS (pass_ipa_increase_alignment); >>NEXT_PASS (pass_ipa_whole_program_visibility); >>NEXT_PASS (pass_ipa_profile); >>NEXT_PASS (pass_ipa_icf); >> diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c >> b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c >> new file mode 100644 >> index 000..74eaed8 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c >> @@ -0,0 +1,25 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target section_anchors } */ >> +/* { dg-require-effective-target vect_int } */ >> + >> +#define N 32 >> + >> +/* Clone of section-anchors-vect-70.c with foo() having >> -fno-tree-loop-vectorize. */ >> + >> +static struct A { >> + int p1, p2; >> +
Re: move increase_alignment from simple to regular ipa pass
On 8 June 2016 at 20:38, Jan Hubickawrote: >> I think it would be nice to work towards transitioning >> flag_section_anchors to a flag on varpool nodes, thereby removing >> the Optimization flag from common.opt:fsection-anchors >> >> That would simplify the walk over varpool candidates. > > Makes sense to me, too. There are more candidates for sutff that should be > variable specific in common.opt (such as variable alignment, -fdata-sctions, > -fmerge-constants) and targets. We may try to do it in an easy to extend way > so incrementally we can get rid of those global flags, too. In this version I removed Optimization from fsection-anchors entry in common.opt, and gated the increase_alignment pass on flag_section_anchors != 0. Cross tested on arm*-*-*, aarch64*-*-*. Does it look OK ? > > One thing that needs to be done for LTO is sane merging, I guess in this case > it is clear that the variable should be anchored when its previaling > definition > is. Um could we determine during WPA if symbol is a section anchor for merging ? Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at tree level. Do we have DECL_RTL info available during WPA ? Thanks, Prathamesh > > Honza >> >> Richard. >> >> > Thanks, >> > Prathamesh >> > > >> > > Honza >> > >> >> > >> Richard. >> > >> > >> >> -- >> Richard Biener >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB >> 21284 (AG Nuernberg) diff --git a/gcc/common.opt b/gcc/common.opt index f0d7196..f93f26c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization Enable the dependent count heuristic in the scheduler. fsection-anchors -Common Report Var(flag_section_anchors) Optimization +Common Report Var(flag_section_anchors) Access data in the same section from shared anchor points. fsee diff --git a/gcc/passes.def b/gcc/passes.def index 3647e90..3a8063c 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) NEXT_PASS (pass_feedback_split_functions); POP_INSERT_PASSES () - NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_tm); NEXT_PASS (pass_ipa_lower_emutls); TERMINATE_PASS_LIST (all_small_ipa_passes) INSERT_PASSES_AFTER (all_regular_ipa_passes) + NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_whole_program_visibility); NEXT_PASS (pass_ipa_profile); NEXT_PASS (pass_ipa_icf); diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c new file mode 100644 index 000..74eaed8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target section_anchors } */ +/* { dg-require-effective-target vect_int } */ + +#define N 32 + +/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize. */ + +static struct A { + int p1, p2; + int e[N]; +} a, b, c; + +__attribute__((optimize("-fno-tree-loop-vectorize"))) +int foo(void) +{ + for (int i = 0; i < N; i++) +a.e[i] = b.e[i] + c.e[i]; + + return a.e[0]; +} + +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */ +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */ +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */ diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 36299a6..d36aa1d 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -483,7 +483,7 @@ extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context *ctxt); -extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context +extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt); extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt); diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c index 2669813..d34e560 100644 --- a/gcc/tree-vectorizer.c +++ b/gcc/tree-vectorizer.c @@ -899,6 +899,34 @@ get_vec_alignment_for_type (tree type) return (alignment > TYPE_ALIGN (type)) ? alignment : 0; } +/* Return true if alignment should be increased for this vnode. + This is done if every function that references/referring to vnode + has flag_tree_loop_vectorize and flag_section_anchors set. */ + +static bool +increase_alignment_p (varpool_node *vnode) +{ + ipa_ref *ref; +
Re: move increase_alignment from simple to regular ipa pass
> On 8 June 2016 at 20:38, Jan Hubickawrote: > >> I think it would be nice to work towards transitioning > >> flag_section_anchors to a flag on varpool nodes, thereby removing > >> the Optimization flag from common.opt:fsection-anchors > >> > >> That would simplify the walk over varpool candidates. > > > > Makes sense to me, too. There are more candidates for sutff that should be > > variable specific in common.opt (such as variable alignment, -fdata-sctions, > > -fmerge-constants) and targets. We may try to do it in an easy to extend > > way > > so incrementally we can get rid of those global flags, too. > In this version I removed Optimization from fsection-anchors entry in > common.opt, > and gated the increase_alignment pass on flag_section_anchors != 0. > Cross tested on arm*-*-*, aarch64*-*-*. > Does it look OK ? If you go this way you will need to do something sane for LTO. Here one can compile some object files with -fsection-anchors and other without and link with random setting (because in traditional compilation linktime flags does not matter). For global flags we have magic in merge_and_complain that determines flags to pass to the LTO compiler. It is not very robust though. > > > > One thing that needs to be done for LTO is sane merging, I guess in this > > case > > it is clear that the variable should be anchored when its previaling > > definition > > is. > Um could we determine during WPA if symbol is a section anchor for merging ? > Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at > tree level. > Do we have DECL_RTL info available during WPA ? We don't have anchros computed, but we can decide whether we want to potentially anchor the variable if we can. I would say all you need is to have section_anchor flag in varpool node itself which controls RTL production. At varpool_finalize_decl you will set it according to flag_varpool and stream it to LTO objects. At WPA when doing linking, the section_anchor flag of the previaling decl wins.. Honza > > Thanks, > Prathamesh > > > > Honza > >> > >> Richard. > >> > >> > Thanks, > >> > Prathamesh > >> > > > >> > > Honza > >> > >> > >> > >> Richard. > >> > > >> > > >> > >> -- > >> Richard Biener > >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > >> 21284 (AG Nuernberg) > diff --git a/gcc/common.opt b/gcc/common.opt > index f0d7196..f93f26c 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) > Init(1) Optimization > Enable the dependent count heuristic in the scheduler. > > fsection-anchors > -Common Report Var(flag_section_anchors) Optimization > +Common Report Var(flag_section_anchors) > Access data in the same section from shared anchor points. > > fsee > diff --git a/gcc/passes.def b/gcc/passes.def > index 3647e90..3a8063c 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see >PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) >NEXT_PASS (pass_feedback_split_functions); >POP_INSERT_PASSES () > - NEXT_PASS (pass_ipa_increase_alignment); >NEXT_PASS (pass_ipa_tm); >NEXT_PASS (pass_ipa_lower_emutls); >TERMINATE_PASS_LIST (all_small_ipa_passes) > >INSERT_PASSES_AFTER (all_regular_ipa_passes) > + NEXT_PASS (pass_ipa_increase_alignment); >NEXT_PASS (pass_ipa_whole_program_visibility); >NEXT_PASS (pass_ipa_profile); >NEXT_PASS (pass_ipa_icf); > diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c > b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c > new file mode 100644 > index 000..74eaed8 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c > @@ -0,0 +1,25 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target section_anchors } */ > +/* { dg-require-effective-target vect_int } */ > + > +#define N 32 > + > +/* Clone of section-anchors-vect-70.c with foo() having > -fno-tree-loop-vectorize. */ > + > +static struct A { > + int p1, p2; > + int e[N]; > +} a, b, c; > + > +__attribute__((optimize("-fno-tree-loop-vectorize"))) > +int foo(void) > +{ > + for (int i = 0; i < N; i++) > +a.e[i] = b.e[i] + c.e[i]; > + > + return a.e[0]; > +} > + > +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 > "increase_alignment" { target aarch64*-*-* } } } */ > +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 > "increase_alignment" { target powerpc64*-*-* } } } */ > +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 > "increase_alignment" { target arm*-*-* } } } */ > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index 36299a6..d36aa1d 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -483,7 +483,7 @@ extern simple_ipa_opt_pass > *make_pass_local_optimization_passes (gcc::context *c > > extern
Re: move increase_alignment from simple to regular ipa pass
> I think it would be nice to work towards transitioning > flag_section_anchors to a flag on varpool nodes, thereby removing > the Optimization flag from common.opt:fsection-anchors > > That would simplify the walk over varpool candidates. Makes sense to me, too. There are more candidates for sutff that should be variable specific in common.opt (such as variable alignment, -fdata-sctions, -fmerge-constants) and targets. We may try to do it in an easy to extend way so incrementally we can get rid of those global flags, too. One thing that needs to be done for LTO is sane merging, I guess in this case it is clear that the variable should be anchored when its previaling definition is. Honza > > Richard. > > > Thanks, > > Prathamesh > > > > > > Honza > > >> > > >> Richard. > > > > > > -- > Richard Biener> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg)
Re: move increase_alignment from simple to regular ipa pass
On Tue, 7 Jun 2016, Prathamesh Kulkarni wrote: > On 3 June 2016 at 13:35, Jan Hubickawrote: > >> > fsection-anchors > >> > Common Report Var(flag_section_anchors) > >> > Access data in the same section from shared anchor points. > >> > >> Funny. I see the following on trunk: > >> > >> fsection-anchors > >> Common Report Var(flag_section_anchors) Optimization > >> Access data in the same section from shared anchor points. > > > > Aha, my local change from last year still inmy tree. Sorry. > > Yep, having it as Optimization makes sense, but we need to be sure it works > > as intended. > >> > >> > flag_section_anchors is not declared as Optimiation, so it can't be > >> > function > >> > specific right now. It probably should because it is an optimization. > >> > This > >> > makes me wonder what happens when one function have anchors enabled and > >> > other > >> > doesn't? Probably anchroing or not anchoring the var will then depend > >> > on what > >> > function comes first in the compilation order and then we will need to > >> > make > >> > backend grok the case where static var is anchored but > >> > flag_section_anchors is > >> > off. > >> > >> This is because we represent the anchor with DECL_RTL, right? Maybe > >> DECL_RTL of globals needs to be re-computed for each function... > > > > I would rather anchor variable if it is used by at least one function that > > is compiled > > with anchors. Accessing anchors is IMO no slower than accessing symbols. > > But I am not > > that familiar witht his code... > >> > >> > I dunno what is the desired behaviour for LTOint together different code > >> > models. > >> > >> Good question. There's always the choice to remove 'Optimization' and > >> enforce same setting for all TUs we LTO in lto-wrapper. > > > > Yep. Not sure what is better - I did not really think of targets that use > > both > > models. > Um I am not really sure what to do next to convert increase_alignment > to regular pass, I would be grateful > for suggestions. I think it would be nice to work towards transitioning flag_section_anchors to a flag on varpool nodes, thereby removing the Optimization flag from common.opt:fsection-anchors That would simplify the walk over varpool candidates. Richard. > Thanks, > Prathamesh > > > > Honza > >> > >> Richard. > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: move increase_alignment from simple to regular ipa pass
> > fsection-anchors > > > > Common Report Var(flag_section_anchors) > > > > Access data in the same section from shared anchor points. > > > > Funny. I see the following on trunk: > > fsection-anchors > Common Report Var(flag_section_anchors) Optimization > Access data in the same section from shared anchor points. Aha, my local change from last year still inmy tree. Sorry. Yep, having it as Optimization makes sense, but we need to be sure it works as intended. > > > flag_section_anchors is not declared as Optimiation, so it can't be function > > specific right now. It probably should because it is an optimization. This > > makes me wonder what happens when one function have anchors enabled and > > other > > doesn't? Probably anchroing or not anchoring the var will then depend on > > what > > function comes first in the compilation order and then we will need to make > > backend grok the case where static var is anchored but flag_section_anchors > > is > > off. > > This is because we represent the anchor with DECL_RTL, right? Maybe > DECL_RTL of globals needs to be re-computed for each function... I would rather anchor variable if it is used by at least one function that is compiled with anchors. Accessing anchors is IMO no slower than accessing symbols. But I am not that familiar witht his code... > > > I dunno what is the desired behaviour for LTOint together different code > > models. > > Good question. There's always the choice to remove 'Optimization' and > enforce same setting for all TUs we LTO in lto-wrapper. Yep. Not sure what is better - I did not really think of targets that use both models. Honza > > Richard.
Re: move increase_alignment from simple to regular ipa pass
On Thu, 2 Jun 2016, Jan Hubicka wrote: > > On Thu, 2 Jun 2016, David Edelsohn wrote: > > > > > > Richard Biener wrote: > > > > > > >> "This would mean the pass should get its own non-Optimization flag > > > >> initialized by targets where section anchors are usually used" > > > >> IIUC should we add a new option -fno-increase_alignment and gate the > > > >> pass on it ? Um sorry I didn't understand why targets > > > >> with section anchors (arm, aarch64, ppc) should initialize this option > > > >> ? > > > > > > > > Currently the pass is only run for targets with section anchors (and > > > > there > > > > by default if they are enabled by default). So it makes sense to > > > > run on those by default and the pass is not necessary on targets w/o > > > > section anchors as the vectorizer can easily adjust alignment itself on > > > > those. > > > > > > PPC does not always enable section anchors -- it depends on the code > > > model. Shouldn't this be tied to use of section anchors? > > > > It effectively is with the patch by walking all functions to see if they > > have section anchors enabled. That is unnecessary work for targets that > > fsection-anchors > > Common Report Var(flag_section_anchors) > > Access data in the same section from shared anchor points. > Funny. I see the following on trunk: fsection-anchors Common Report Var(flag_section_anchors) Optimization Access data in the same section from shared anchor points. > flag_section_anchors is not declared as Optimiation, so it can't be function > specific right now. It probably should because it is an optimization. This > makes me wonder what happens when one function have anchors enabled and other > doesn't? Probably anchroing or not anchoring the var will then depend on what > function comes first in the compilation order and then we will need to make > backend grok the case where static var is anchored but flag_section_anchors is > off. This is because we represent the anchor with DECL_RTL, right? Maybe DECL_RTL of globals needs to be re-computed for each function... > I dunno what is the desired behaviour for LTOint together different code > models. Good question. There's always the choice to remove 'Optimization' and enforce same setting for all TUs we LTO in lto-wrapper. Richard.
Re: move increase_alignment from simple to regular ipa pass
> On Thu, 2 Jun 2016, David Edelsohn wrote: > > > > Richard Biener wrote: > > > > >> "This would mean the pass should get its own non-Optimization flag > > >> initialized by targets where section anchors are usually used" > > >> IIUC should we add a new option -fno-increase_alignment and gate the > > >> pass on it ? Um sorry I didn't understand why targets > > >> with section anchors (arm, aarch64, ppc) should initialize this option ? > > > > > > Currently the pass is only run for targets with section anchors (and there > > > by default if they are enabled by default). So it makes sense to > > > run on those by default and the pass is not necessary on targets w/o > > > section anchors as the vectorizer can easily adjust alignment itself on > > > those. > > > > PPC does not always enable section anchors -- it depends on the code > > model. Shouldn't this be tied to use of section anchors? > > It effectively is with the patch by walking all functions to see if they > have section anchors enabled. That is unnecessary work for targets that fsection-anchors Common Report Var(flag_section_anchors) Access data in the same section from shared anchor points. flag_section_anchors is not declared as Optimiation, so it can't be function specific right now. It probably should because it is an optimization. This makes me wonder what happens when one function have anchors enabled and other doesn't? Probably anchroing or not anchoring the var will then depend on what function comes first in the compilation order and then we will need to make backend grok the case where static var is anchored but flag_section_anchors is off. I dunno what is the desired behaviour for LTOint together different code models. Honza > do not support section anchors at all, of course. > > Richard.
Re: move increase_alignment from simple to regular ipa pass
On Thu, 2 Jun 2016, David Edelsohn wrote: > > Richard Biener wrote: > > >> "This would mean the pass should get its own non-Optimization flag > >> initialized by targets where section anchors are usually used" > >> IIUC should we add a new option -fno-increase_alignment and gate the > >> pass on it ? Um sorry I didn't understand why targets > >> with section anchors (arm, aarch64, ppc) should initialize this option ? > > > > Currently the pass is only run for targets with section anchors (and there > > by default if they are enabled by default). So it makes sense to > > run on those by default and the pass is not necessary on targets w/o > > section anchors as the vectorizer can easily adjust alignment itself on > > those. > > PPC does not always enable section anchors -- it depends on the code > model. Shouldn't this be tied to use of section anchors? It effectively is with the patch by walking all functions to see if they have section anchors enabled. That is unnecessary work for targets that do not support section anchors at all, of course. Richard.
Re: move increase_alignment from simple to regular ipa pass
> Richard Biener wrote: >> "This would mean the pass should get its own non-Optimization flag >> initialized by targets where section anchors are usually used" >> IIUC should we add a new option -fno-increase_alignment and gate the >> pass on it ? Um sorry I didn't understand why targets >> with section anchors (arm, aarch64, ppc) should initialize this option ? > > Currently the pass is only run for targets with section anchors (and there > by default if they are enabled by default). So it makes sense to > run on those by default and the pass is not necessary on targets w/o > section anchors as the vectorizer can easily adjust alignment itself on > those. PPC does not always enable section anchors -- it depends on the code model. Shouldn't this be tied to use of section anchors? Thanks, David
Re: move increase_alignment from simple to regular ipa pass
On 2 June 2016 at 14:44, Prathamesh Kulkarniwrote: > On 2 June 2016 at 13:23, Richard Biener wrote: >> On Thu, 2 Jun 2016, Prathamesh Kulkarni wrote: >> >>> On 1 June 2016 at 18:37, Richard Biener wrote: >>> > On Wed, 1 Jun 2016, Prathamesh Kulkarni wrote: >>> > >>> >> Hi Richard, >>> >> This patch tries to move increase_alignment pass from small to regular >>> >> ipa pass. >>> >> Does the patch look correct ? >>> >> Since we are only increasing alignment of varpool nodes, I am not sure >>> >> if any ipa >>> >> read/write hooks were necessary and passed NULL for them. >>> >> Cross-tested on arm*-*-*, aarch64*-*-*, >>> >> Bootstrap+test on aarch64-linux-gnu in progress. >>> > >>> > I think the patch looks sensible apart from the fact that both >>> > flag_section_anchors and flag_tree_vectorize can have different >>> > states for each function. This would mean the pass should get >>> > its own non-Optimization flag initialized by targets where >>> > section anchors are usually used and it means you'd want to >>> > walk IPA refs to see whether variables are used in a function >>> > with both section anchors and vectorization enabled. >>> Hi, >>> I have attached patch (stage-1 built) that walks ipa-refs to determine >>> if function has flag_tree_loop_vectorize and flag_section_anchors set. >>> Does it look OK ? >>> >>> "This would mean the pass should get its own non-Optimization flag >>> initialized by targets where section anchors are usually used" >>> IIUC should we add a new option -fno-increase_alignment and gate the >>> pass on it ? Um sorry I didn't understand why targets >>> with section anchors (arm, aarch64, ppc) should initialize this option ? >> >> Currently the pass is only run for targets with section anchors (and there >> by default if they are enabled by default). So it makes sense to >> run on those by default and the pass is not necessary on targets w/o >> section anchors as the vectorizer can easily adjust alignment itself on >> those. > Ah indeed, thanks for explanation. Does the attached patch look OK > (stage-1 built) ? > I added new option -fipa-increase_alignment which is disabled by default > and only enabled on arm, aarch64 and ppc. Minor mistake in previous patch (used flag_tree_vectorize in one place instead of flag_tree_loop_vectorize). Corrected in this version. Thanks, Prathamesh > > Thanks, > Prathamesh >> >> Richard. >> >>> Thanks, >>> Prathamesh >>> > >>> > Honza may have further comments. >>> > >>> > Thanks, >>> > Richard. >>> > >>> >> Thanks, >>> >> Prathamesh >>> >> >>> > >>> > -- >>> > Richard Biener >>> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, >>> > HRB 21284 (AG Nuernberg) >>> >> >> -- >> Richard Biener >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB >> 21284 (AG Nuernberg) diff --git a/gcc/common.opt b/gcc/common.opt index fccd4b5..c964cf9 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1586,6 +1586,10 @@ fira-algorithm= Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization -fira-algorithm=[CB|priority] Set the used IRA algorithm. +fipa-increase_alignment +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization +Option to gate increase_alignment ipa pass. + Enum Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 51d2d50..f1c383f 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -8251,6 +8251,8 @@ aarch64_override_options (void) aarch64_register_fma_steering (); + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } /* Implement targetm.override_options_after_change. */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4e453fd..5aca5d0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -3454,6 +3454,9 @@ arm_option_override (void) /* Init initial mode for testing. */ thumb_flipper = TARGET_THUMB; + + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } static void diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index c6b2b6a..a046158 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -5011,6 +5011,9 @@ rs6000_option_override (void) = { pass_analyze_swaps, "cse1", 1, PASS_POS_INSERT_BEFORE }; register_pass (_swaps_info); + + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } diff --git a/gcc/passes.def b/gcc/passes.def index 993ed28..a841183 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) NEXT_PASS (pass_feedback_split_functions); POP_INSERT_PASSES ()
Re: move increase_alignment from simple to regular ipa pass
On 2 June 2016 at 13:23, Richard Bienerwrote: > On Thu, 2 Jun 2016, Prathamesh Kulkarni wrote: > >> On 1 June 2016 at 18:37, Richard Biener wrote: >> > On Wed, 1 Jun 2016, Prathamesh Kulkarni wrote: >> > >> >> Hi Richard, >> >> This patch tries to move increase_alignment pass from small to regular >> >> ipa pass. >> >> Does the patch look correct ? >> >> Since we are only increasing alignment of varpool nodes, I am not sure >> >> if any ipa >> >> read/write hooks were necessary and passed NULL for them. >> >> Cross-tested on arm*-*-*, aarch64*-*-*, >> >> Bootstrap+test on aarch64-linux-gnu in progress. >> > >> > I think the patch looks sensible apart from the fact that both >> > flag_section_anchors and flag_tree_vectorize can have different >> > states for each function. This would mean the pass should get >> > its own non-Optimization flag initialized by targets where >> > section anchors are usually used and it means you'd want to >> > walk IPA refs to see whether variables are used in a function >> > with both section anchors and vectorization enabled. >> Hi, >> I have attached patch (stage-1 built) that walks ipa-refs to determine >> if function has flag_tree_loop_vectorize and flag_section_anchors set. >> Does it look OK ? >> >> "This would mean the pass should get its own non-Optimization flag >> initialized by targets where section anchors are usually used" >> IIUC should we add a new option -fno-increase_alignment and gate the >> pass on it ? Um sorry I didn't understand why targets >> with section anchors (arm, aarch64, ppc) should initialize this option ? > > Currently the pass is only run for targets with section anchors (and there > by default if they are enabled by default). So it makes sense to > run on those by default and the pass is not necessary on targets w/o > section anchors as the vectorizer can easily adjust alignment itself on > those. Ah indeed, thanks for explanation. Does the attached patch look OK (stage-1 built) ? I added new option -fipa-increase_alignment which is disabled by default and only enabled on arm, aarch64 and ppc. Thanks, Prathamesh > > Richard. > >> Thanks, >> Prathamesh >> > >> > Honza may have further comments. >> > >> > Thanks, >> > Richard. >> > >> >> Thanks, >> >> Prathamesh >> >> >> > >> > -- >> > Richard Biener >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB >> > 21284 (AG Nuernberg) >> > > -- > Richard Biener > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg) diff --git a/gcc/common.opt b/gcc/common.opt index fccd4b5..c964cf9 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1586,6 +1586,10 @@ fira-algorithm= Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization -fira-algorithm=[CB|priority] Set the used IRA algorithm. +fipa-increase_alignment +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization +Option to gate increase_alignment ipa pass. + Enum Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 51d2d50..f1c383f 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -8251,6 +8251,8 @@ aarch64_override_options (void) aarch64_register_fma_steering (); + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } /* Implement targetm.override_options_after_change. */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4e453fd..5aca5d0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -3454,6 +3454,9 @@ arm_option_override (void) /* Init initial mode for testing. */ thumb_flipper = TARGET_THUMB; + + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } static void diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index c6b2b6a..a046158 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -5011,6 +5011,9 @@ rs6000_option_override (void) = { pass_analyze_swaps, "cse1", 1, PASS_POS_INSERT_BEFORE }; register_pass (_swaps_info); + + /* Enable increase_alignment pass. */ + flag_ipa_increase_alignment = 1; } diff --git a/gcc/passes.def b/gcc/passes.def index 993ed28..a841183 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) NEXT_PASS (pass_feedback_split_functions); POP_INSERT_PASSES () - NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_tm); NEXT_PASS (pass_ipa_lower_emutls); TERMINATE_PASS_LIST (all_small_ipa_passes) INSERT_PASSES_AFTER (all_regular_ipa_passes) + NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_whole_program_visibility); NEXT_PASS
Re: move increase_alignment from simple to regular ipa pass
On Thu, 2 Jun 2016, Prathamesh Kulkarni wrote: > On 1 June 2016 at 18:37, Richard Bienerwrote: > > On Wed, 1 Jun 2016, Prathamesh Kulkarni wrote: > > > >> Hi Richard, > >> This patch tries to move increase_alignment pass from small to regular ipa > >> pass. > >> Does the patch look correct ? > >> Since we are only increasing alignment of varpool nodes, I am not sure > >> if any ipa > >> read/write hooks were necessary and passed NULL for them. > >> Cross-tested on arm*-*-*, aarch64*-*-*, > >> Bootstrap+test on aarch64-linux-gnu in progress. > > > > I think the patch looks sensible apart from the fact that both > > flag_section_anchors and flag_tree_vectorize can have different > > states for each function. This would mean the pass should get > > its own non-Optimization flag initialized by targets where > > section anchors are usually used and it means you'd want to > > walk IPA refs to see whether variables are used in a function > > with both section anchors and vectorization enabled. > Hi, > I have attached patch (stage-1 built) that walks ipa-refs to determine > if function has flag_tree_loop_vectorize and flag_section_anchors set. > Does it look OK ? > > "This would mean the pass should get its own non-Optimization flag > initialized by targets where section anchors are usually used" > IIUC should we add a new option -fno-increase_alignment and gate the > pass on it ? Um sorry I didn't understand why targets > with section anchors (arm, aarch64, ppc) should initialize this option ? Currently the pass is only run for targets with section anchors (and there by default if they are enabled by default). So it makes sense to run on those by default and the pass is not necessary on targets w/o section anchors as the vectorizer can easily adjust alignment itself on those. Richard. > Thanks, > Prathamesh > > > > Honza may have further comments. > > > > Thanks, > > Richard. > > > >> Thanks, > >> Prathamesh > >> > > > > -- > > Richard Biener > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > > 21284 (AG Nuernberg) > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: move increase_alignment from simple to regular ipa pass
On 1 June 2016 at 18:37, Richard Bienerwrote: > On Wed, 1 Jun 2016, Prathamesh Kulkarni wrote: > >> Hi Richard, >> This patch tries to move increase_alignment pass from small to regular ipa >> pass. >> Does the patch look correct ? >> Since we are only increasing alignment of varpool nodes, I am not sure >> if any ipa >> read/write hooks were necessary and passed NULL for them. >> Cross-tested on arm*-*-*, aarch64*-*-*, >> Bootstrap+test on aarch64-linux-gnu in progress. > > I think the patch looks sensible apart from the fact that both > flag_section_anchors and flag_tree_vectorize can have different > states for each function. This would mean the pass should get > its own non-Optimization flag initialized by targets where > section anchors are usually used and it means you'd want to > walk IPA refs to see whether variables are used in a function > with both section anchors and vectorization enabled. Hi, I have attached patch (stage-1 built) that walks ipa-refs to determine if function has flag_tree_loop_vectorize and flag_section_anchors set. Does it look OK ? "This would mean the pass should get its own non-Optimization flag initialized by targets where section anchors are usually used" IIUC should we add a new option -fno-increase_alignment and gate the pass on it ? Um sorry I didn't understand why targets with section anchors (arm, aarch64, ppc) should initialize this option ? Thanks, Prathamesh > > Honza may have further comments. > > Thanks, > Richard. > >> Thanks, >> Prathamesh >> > > -- > Richard Biener > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > 21284 (AG Nuernberg) diff --git a/gcc/passes.def b/gcc/passes.def index 993ed28..a841183 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3. If not see PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile) NEXT_PASS (pass_feedback_split_functions); POP_INSERT_PASSES () - NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_tm); NEXT_PASS (pass_ipa_lower_emutls); TERMINATE_PASS_LIST (all_small_ipa_passes) INSERT_PASSES_AFTER (all_regular_ipa_passes) + NEXT_PASS (pass_ipa_increase_alignment); NEXT_PASS (pass_ipa_whole_program_visibility); NEXT_PASS (pass_ipa_profile); NEXT_PASS (pass_ipa_icf); diff --git a/gcc/testsuite/gcc.dg/vect/section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-73.c new file mode 100644 index 000..74eaed8 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/section-anchors-vect-73.c @@ -0,0 +1,25 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target section_anchors } */ +/* { dg-require-effective-target vect_int } */ + +#define N 32 + +/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize. */ + +static struct A { + int p1, p2; + int e[N]; +} a, b, c; + +__attribute__((optimize("-fno-tree-loop-vectorize"))) +int foo(void) +{ + for (int i = 0; i < N; i++) +a.e[i] = b.e[i] + c.e[i]; + + return a.e[0]; +} + +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */ +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */ +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */ diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 66e103a..2d2e8fc 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -482,7 +482,7 @@ extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context *ctxt); -extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context +extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt); extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt); diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c index 2669813..12ef019 100644 --- a/gcc/tree-vectorizer.c +++ b/gcc/tree-vectorizer.c @@ -899,6 +899,34 @@ get_vec_alignment_for_type (tree type) return (alignment > TYPE_ALIGN (type)) ? alignment : 0; } +/* Return true if alignment should be increased for this vnode. + This is done if every function that references/referring to vnode + has flag_tree_loop_vectorize and flag_section_anchors set. */ + +static bool +increase_alignment_p (varpool_node *vnode) +{ + ipa_ref *ref; + + for (int i = 0; vnode->iterate_reference (i, ref); i++) +if (cgraph_node *cnode = dyn_cast (ref->referred)) + { + struct cl_optimization *opts = opts_for_fn (cnode->decl); + if (! (opts->x_flag_tree_vectorize && opts->x_flag_section_anchors))
Re: move increase_alignment from simple to regular ipa pass
On Wed, 1 Jun 2016, Prathamesh Kulkarni wrote: > Hi Richard, > This patch tries to move increase_alignment pass from small to regular ipa > pass. > Does the patch look correct ? > Since we are only increasing alignment of varpool nodes, I am not sure > if any ipa > read/write hooks were necessary and passed NULL for them. > Cross-tested on arm*-*-*, aarch64*-*-*, > Bootstrap+test on aarch64-linux-gnu in progress. I think the patch looks sensible apart from the fact that both flag_section_anchors and flag_tree_vectorize can have different states for each function. This would mean the pass should get its own non-Optimization flag initialized by targets where section anchors are usually used and it means you'd want to walk IPA refs to see whether variables are used in a function with both section anchors and vectorization enabled. Honza may have further comments. Thanks, Richard. > Thanks, > Prathamesh > -- Richard BienerSUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)