Re: [PATCH] Allow fpmath differences in inlining when callee doesn't use FP math
On Thu, Aug 24, 2017 at 1:34 PM, Richard Bienerwrote: > On Thu, 24 Aug 2017, Uros Bizjak wrote: > >> On Thu, Aug 24, 2017 at 1:10 PM, Richard Biener wrote: >> > >> > This adjusts the x86 backend to allow -mfpmath differences when >> > deciding whether to allow inlining. -mfpmath doesn't really >> > matter for functions not containing FP operations. >> > >> > It appears that the can_inline_p target hook is called from the >> > C++ FE for multi-versioning, thus the ! ipa_fn_summaries check. >> > >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? >> > >> > Incidentically this fixes the bootstrap issue I have when enabling >> > free-lang-data without LTO. >> > >> > It doesn't fully solve the fpmath difference issue we have with >> > intrinsics but is an optimization. >> > >> > Thanks, >> > Richard. >> > >> > 2017-08-24 Richard Biener >> > >> > * config/i386/i386.c: Include symbol-summary.h, ipa-prop.h >> > and ipa-fnsummary.h. >> > (ix86_can_inline_p): When ix86_fpmath flags do not match >> > check whether the callee uses FP math at all. >> >> LGTM for branches, but for trunk, I'd still like to remove fpmath > > Hmm, it should be indeed safe but fp_expression is only available > on the GCC 7 branch. Might be the only change that doesn't have > the chance to reject code that we previously accepted... > >> processing from outside ix86_option_override_internal. As mentioned in >> the comment in i_o_o_i, blindly switching fpmath for TARGET_SSE can >> violate ABI for some environments. > > Yeah. The following has also passed bootstrap & regtest on > x86_64-unknown-linux-gnu. > > Ok for trunk? Yes, x86 part is OK. Thanks, Uros. > Thanks, > Richard. > > 2017-08-23 Richard Biener > > PR target/81921 > * targhooks.c (default_target_can_inline_p): Properly > use target_option_default_node when no DECL_FUNCTION_SPECIFIC_TARGET > is present and always compare. > * config/i386/i386.c (ix86_valid_target_attribute_tree): Do not > imply -mfpmath=sse from TARGET_SSE_P. > (ix86_can_inline_p): Properly use target_option_default_node when > no DECL_FUNCTION_SPECIFIC_TARGET is present and always compare. > > * gcc.target/i386/pr81921.c: New testcase. > > Index: gcc/config/i386/i386.c > === > --- gcc/config/i386/i386.c (revision 251270) > +++ gcc/config/i386/i386.c (working copy) > @@ -7398,16 +7398,6 @@ ix86_valid_target_attribute_tree (tree a >/* If fpmath= is not set, and we now have sse2 on 32-bit, use it. */ >if (enum_opts_set.x_ix86_fpmath) > opts_set->x_ix86_fpmath = (enum fpmath_unit) 1; > - else if (!TARGET_64BIT_P (opts->x_ix86_isa_flags) > - && TARGET_SSE_P (opts->x_ix86_isa_flags)) > - { > - if (TARGET_80387_P (opts->x_target_flags)) > - opts->x_ix86_fpmath = (enum fpmath_unit) (FPMATH_SSE > - | FPMATH_387); > - else > - opts->x_ix86_fpmath = (enum fpmath_unit) FPMATH_SSE; > - opts_set->x_ix86_fpmath = (enum fpmath_unit) 1; > - } > >/* Do any overrides, such as arch=xxx, or tune=xxx support. */ >bool r = ix86_option_override_internal (false, opts, opts_set); > @@ -7502,53 +7492,47 @@ ix86_valid_target_attribute_p (tree fnde > static bool > ix86_can_inline_p (tree caller, tree callee) > { > - bool ret = false; >tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); >tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); > - > - /* If callee has no option attributes, then it is ok to inline. */ >if (!callee_tree) > -ret = true; > +callee_tree = target_option_default_node; > + if (!caller_tree) > +caller_tree = target_option_default_node; > + if (callee_tree == caller_tree) > +return true; > + > + struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); > + struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); > + bool ret = false; > > - /* If caller has no option attributes, but callee does then it is not ok to > - inline. */ > - else if (!caller_tree) > + /* Callee's isa options should be a subset of the caller's, i.e. a SSE4 > + function can inline a SSE2 function but a SSE2 function can't inline > + a SSE4 function. */ > + if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags) > + != callee_opts->x_ix86_isa_flags) > + || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2) > + != callee_opts->x_ix86_isa_flags2)) > ret = false; > > - else > -{ > - struct cl_target_option *caller_opts = TREE_TARGET_OPTION > (caller_tree); > - struct cl_target_option *callee_opts = TREE_TARGET_OPTION > (callee_tree); > +
Re: [PATCH] Allow fpmath differences in inlining when callee doesn't use FP math
On Thu, 24 Aug 2017, Uros Bizjak wrote: > On Thu, Aug 24, 2017 at 1:10 PM, Richard Bienerwrote: > > > > This adjusts the x86 backend to allow -mfpmath differences when > > deciding whether to allow inlining. -mfpmath doesn't really > > matter for functions not containing FP operations. > > > > It appears that the can_inline_p target hook is called from the > > C++ FE for multi-versioning, thus the ! ipa_fn_summaries check. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? > > > > Incidentically this fixes the bootstrap issue I have when enabling > > free-lang-data without LTO. > > > > It doesn't fully solve the fpmath difference issue we have with > > intrinsics but is an optimization. > > > > Thanks, > > Richard. > > > > 2017-08-24 Richard Biener > > > > * config/i386/i386.c: Include symbol-summary.h, ipa-prop.h > > and ipa-fnsummary.h. > > (ix86_can_inline_p): When ix86_fpmath flags do not match > > check whether the callee uses FP math at all. > > LGTM for branches, but for trunk, I'd still like to remove fpmath Hmm, it should be indeed safe but fp_expression is only available on the GCC 7 branch. Might be the only change that doesn't have the chance to reject code that we previously accepted... > processing from outside ix86_option_override_internal. As mentioned in > the comment in i_o_o_i, blindly switching fpmath for TARGET_SSE can > violate ABI for some environments. Yeah. The following has also passed bootstrap & regtest on x86_64-unknown-linux-gnu. Ok for trunk? Thanks, Richard. 2017-08-23 Richard Biener PR target/81921 * targhooks.c (default_target_can_inline_p): Properly use target_option_default_node when no DECL_FUNCTION_SPECIFIC_TARGET is present and always compare. * config/i386/i386.c (ix86_valid_target_attribute_tree): Do not imply -mfpmath=sse from TARGET_SSE_P. (ix86_can_inline_p): Properly use target_option_default_node when no DECL_FUNCTION_SPECIFIC_TARGET is present and always compare. * gcc.target/i386/pr81921.c: New testcase. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 251270) +++ gcc/config/i386/i386.c (working copy) @@ -7398,16 +7398,6 @@ ix86_valid_target_attribute_tree (tree a /* If fpmath= is not set, and we now have sse2 on 32-bit, use it. */ if (enum_opts_set.x_ix86_fpmath) opts_set->x_ix86_fpmath = (enum fpmath_unit) 1; - else if (!TARGET_64BIT_P (opts->x_ix86_isa_flags) - && TARGET_SSE_P (opts->x_ix86_isa_flags)) - { - if (TARGET_80387_P (opts->x_target_flags)) - opts->x_ix86_fpmath = (enum fpmath_unit) (FPMATH_SSE - | FPMATH_387); - else - opts->x_ix86_fpmath = (enum fpmath_unit) FPMATH_SSE; - opts_set->x_ix86_fpmath = (enum fpmath_unit) 1; - } /* Do any overrides, such as arch=xxx, or tune=xxx support. */ bool r = ix86_option_override_internal (false, opts, opts_set); @@ -7502,53 +7492,47 @@ ix86_valid_target_attribute_p (tree fnde static bool ix86_can_inline_p (tree caller, tree callee) { - bool ret = false; tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); - - /* If callee has no option attributes, then it is ok to inline. */ if (!callee_tree) -ret = true; +callee_tree = target_option_default_node; + if (!caller_tree) +caller_tree = target_option_default_node; + if (callee_tree == caller_tree) +return true; + + struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); + struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); + bool ret = false; - /* If caller has no option attributes, but callee does then it is not ok to - inline. */ - else if (!caller_tree) + /* Callee's isa options should be a subset of the caller's, i.e. a SSE4 + function can inline a SSE2 function but a SSE2 function can't inline + a SSE4 function. */ + if (((caller_opts->x_ix86_isa_flags & callee_opts->x_ix86_isa_flags) + != callee_opts->x_ix86_isa_flags) + || ((caller_opts->x_ix86_isa_flags2 & callee_opts->x_ix86_isa_flags2) + != callee_opts->x_ix86_isa_flags2)) ret = false; - else -{ - struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree); - struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); + /* See if we have the same non-isa options. */ + else if (caller_opts->x_target_flags != callee_opts->x_target_flags) +ret = false; - /* Callee's isa options should be a subset of the caller's, i.e. a SSE4 -function can inline a SSE2 function but a SSE2 function can't inline -
Re: [PATCH] Allow fpmath differences in inlining when callee doesn't use FP math
On Thu, Aug 24, 2017 at 1:10 PM, Richard Bienerwrote: > > This adjusts the x86 backend to allow -mfpmath differences when > deciding whether to allow inlining. -mfpmath doesn't really > matter for functions not containing FP operations. > > It appears that the can_inline_p target hook is called from the > C++ FE for multi-versioning, thus the ! ipa_fn_summaries check. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? > > Incidentically this fixes the bootstrap issue I have when enabling > free-lang-data without LTO. > > It doesn't fully solve the fpmath difference issue we have with > intrinsics but is an optimization. > > Thanks, > Richard. > > 2017-08-24 Richard Biener > > * config/i386/i386.c: Include symbol-summary.h, ipa-prop.h > and ipa-fnsummary.h. > (ix86_can_inline_p): When ix86_fpmath flags do not match > check whether the callee uses FP math at all. LGTM for branches, but for trunk, I'd still like to remove fpmath processing from outside ix86_option_override_internal. As mentioned in the comment in i_o_o_i, blindly switching fpmath for TARGET_SSE can violate ABI for some environments. Uros. > Index: gcc/config/i386/i386.c > === > --- gcc/config/i386/i386.c (revision 251307) > +++ gcc/config/i386/i386.c (working copy) > @@ -85,6 +85,9 @@ along with GCC; see the file COPYING3. > #include "print-rtl.h" > #include "intl.h" > #include "ifcvt.h" > +#include "symbol-summary.h" > +#include "ipa-prop.h" > +#include "ipa-fnsummary.h" > > /* This file should be included last. */ > #include "target-def.h" > @@ -7544,7 +7547,14 @@ ix86_can_inline_p (tree caller, tree cal >else if (caller_opts->tune != callee_opts->tune) > ret = false; > > - else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath) > + else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath > + /* If the calle doesn't use FP expressions differences in > + ix86_fpmath can be ignored. We are called from FEs > + for multi-versioning call optimization, so beware of > + ipa_fn_summaries not available. */ > + && (! ipa_fn_summaries > + || ipa_fn_summaries->get > + (cgraph_node::get (callee))->fp_expressions)) > ret = false; > >else if (caller_opts->branch_cost != callee_opts->branch_cost)
[PATCH] Allow fpmath differences in inlining when callee doesn't use FP math
This adjusts the x86 backend to allow -mfpmath differences when deciding whether to allow inlining. -mfpmath doesn't really matter for functions not containing FP operations. It appears that the can_inline_p target hook is called from the C++ FE for multi-versioning, thus the ! ipa_fn_summaries check. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? Incidentically this fixes the bootstrap issue I have when enabling free-lang-data without LTO. It doesn't fully solve the fpmath difference issue we have with intrinsics but is an optimization. Thanks, Richard. 2017-08-24 Richard Biener* config/i386/i386.c: Include symbol-summary.h, ipa-prop.h and ipa-fnsummary.h. (ix86_can_inline_p): When ix86_fpmath flags do not match check whether the callee uses FP math at all. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 251307) +++ gcc/config/i386/i386.c (working copy) @@ -85,6 +85,9 @@ along with GCC; see the file COPYING3. #include "print-rtl.h" #include "intl.h" #include "ifcvt.h" +#include "symbol-summary.h" +#include "ipa-prop.h" +#include "ipa-fnsummary.h" /* This file should be included last. */ #include "target-def.h" @@ -7544,7 +7547,14 @@ ix86_can_inline_p (tree caller, tree cal else if (caller_opts->tune != callee_opts->tune) ret = false; - else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath) + else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath + /* If the calle doesn't use FP expressions differences in + ix86_fpmath can be ignored. We are called from FEs + for multi-versioning call optimization, so beware of + ipa_fn_summaries not available. */ + && (! ipa_fn_summaries + || ipa_fn_summaries->get + (cgraph_node::get (callee))->fp_expressions)) ret = false; else if (caller_opts->branch_cost != callee_opts->branch_cost)