Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 2:52 PM Richard Biener wrote: > > On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote: > > > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches > > wrote: > > > > The question is if the pragma GCC target right now behaves incrementally > > > > or not, whether > > > > #pragma GCC target("avx2") > > > > adds -mavx2 to options if it was missing before and nothing otherwise, > > > > or if > > > > it switches other options off. If it is incremental, we could e.g. try > > > > to > > > > use the second least significant bit of global_options_set.x_* to mean > > > > this option has been set explicitly by some surrounding #pragma GCC > > > > target. > > > > The normal tests - global_options_set.x_flag_whatever could still work > > > > fine because they wouldn't care if the option was explicit from anywhere > > > > (command line or GCC target or target attribute) and just & 2 would mean > > > > it was explicit from pragma GCC target; though there is the case of > > > > bitfields... And then the inlining decision could check the & 2 flags to > > > > see what is required and what is just from command line. > > > > Or we can have some other pragma GCC that would be like target but would > > > > have flags that are explicit (and could e.g. be more restricted, to ISA > > > > options only, and let those use in addition to #pragma GCC target. > > > > > > I'm still curious as to what you think will break if always-inline does > > > what > > > it is documented to do. > > > > We will silently accept calling intrinsics that must be used only in certain > > ISA contexts, which will lead to people writing non-portable code. > > > > So -O2 -mno-avx > > #include > > > > void > > foo (__m256 *x) > > { > > x[0] = _mm256_sub_ps (x[1], x[2]); > > } > > etc. will now be accepted when it shouldn't be. > > clang rejects it like gcc with: > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > feature 'avx', but would be inlined into function 'foo' that is compiled > > without support for 'avx' > > x[0] = _mm256_sub_ps (x[1], x[2]); > > ^ > > > > Note, if I do: > > #include > > > > __attribute__((target ("no-sse3"))) void > > foo (__m256 *x) > > { > > x[0] = _mm256_sub_ps (x[1], x[2]); > > } > > and compile > > clang -S -O2 -mavx2 1.c > > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > > feature 'avx', but would be inlined into function 'foo' that is compiled > > without support for 'avx' > > x[0] = _mm256_sub_ps (x[1], x[2]); > > ^ > > then from the error message it seems that unlike GCC, clang remembers > > the exact target features that are needed for the intrinsics and checks just > > those. > > Though, looking at the preprocessed source, seems it uses > > static __inline __m256 __attribute__((__always_inline__, __nodebug__, > > __target__("avx"), __min_vector_width__(256))) > > _mm256_sub_ps(__m256 __a, __m256 __b) > > { > > return (__m256)((__v8sf)__a-(__v8sf)__b); > > } > > and not target pragmas. > > > > Anyway, if we tweak our intrinsic headers so that > > -#ifndef __AVX__ > > #pragma GCC push_options > > #pragma GCC target("avx") > > -#define __DISABLE_AVX__ > > -#endif /* __AVX__ */ > > > > ... > > -#ifdef __DISABLE_AVX__ > > -#undef __DISABLE_AVX__ > > #pragma GCC pop_options > > -#endif /* __DISABLE_AVX__ */ > > and do the opts_set->x_* & 2 stuff on explicit options coming out of > > target/optimize pragmas and attributes, perhaps we don't even need > > to introduce a new attribute and can handle everything magically: Oh, and any such changes will likely interact with Martins ideas to rework how optimize and target attributes work (aka adding ontop of the commandline options). That is, attribute target will then not be enough to remember the exact set of needed ISA features (as opposed to what likely clang implements?) > > 1) if it is gnu_inline extern inline, allow indirect calls, otherwise > > disallow them for always_inline functions > > There are a lot of intrinsics using extern inline __gnu_inline though... > > > 2) for the isa flags and option mismatches, only disallow opts_set->x_* & 2 > > stuff > > This will keep both intrinsics and glibc fortify macros working fine > > in all the needed use cases. > > Yes, see my example in the other mail. > > I think before we add any new attributes we should sort out the > current mess, eventually adding some testcases for desired > diagnostic. > > Richard. > > > Jakub > >
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 2:22 PM Jakub Jelinek wrote: > > On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches > wrote: > > > The question is if the pragma GCC target right now behaves incrementally > > > or not, whether > > > #pragma GCC target("avx2") > > > adds -mavx2 to options if it was missing before and nothing otherwise, or > > > if > > > it switches other options off. If it is incremental, we could e.g. try to > > > use the second least significant bit of global_options_set.x_* to mean > > > this option has been set explicitly by some surrounding #pragma GCC > > > target. > > > The normal tests - global_options_set.x_flag_whatever could still work > > > fine because they wouldn't care if the option was explicit from anywhere > > > (command line or GCC target or target attribute) and just & 2 would mean > > > it was explicit from pragma GCC target; though there is the case of > > > bitfields... And then the inlining decision could check the & 2 flags to > > > see what is required and what is just from command line. > > > Or we can have some other pragma GCC that would be like target but would > > > have flags that are explicit (and could e.g. be more restricted, to ISA > > > options only, and let those use in addition to #pragma GCC target. > > > > I'm still curious as to what you think will break if always-inline does what > > it is documented to do. > > We will silently accept calling intrinsics that must be used only in certain > ISA contexts, which will lead to people writing non-portable code. > > So -O2 -mno-avx > #include > > void > foo (__m256 *x) > { > x[0] = _mm256_sub_ps (x[1], x[2]); > } > etc. will now be accepted when it shouldn't be. > clang rejects it like gcc with: > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > feature 'avx', but would be inlined into function 'foo' that is compiled > without support for 'avx' > x[0] = _mm256_sub_ps (x[1], x[2]); > ^ > > Note, if I do: > #include > > __attribute__((target ("no-sse3"))) void > foo (__m256 *x) > { > x[0] = _mm256_sub_ps (x[1], x[2]); > } > and compile > clang -S -O2 -mavx2 1.c > 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target > feature 'avx', but would be inlined into function 'foo' that is compiled > without support for 'avx' > x[0] = _mm256_sub_ps (x[1], x[2]); > ^ > then from the error message it seems that unlike GCC, clang remembers > the exact target features that are needed for the intrinsics and checks just > those. > Though, looking at the preprocessed source, seems it uses > static __inline __m256 __attribute__((__always_inline__, __nodebug__, > __target__("avx"), __min_vector_width__(256))) > _mm256_sub_ps(__m256 __a, __m256 __b) > { > return (__m256)((__v8sf)__a-(__v8sf)__b); > } > and not target pragmas. > > Anyway, if we tweak our intrinsic headers so that > -#ifndef __AVX__ > #pragma GCC push_options > #pragma GCC target("avx") > -#define __DISABLE_AVX__ > -#endif /* __AVX__ */ > > ... > -#ifdef __DISABLE_AVX__ > -#undef __DISABLE_AVX__ > #pragma GCC pop_options > -#endif /* __DISABLE_AVX__ */ > and do the opts_set->x_* & 2 stuff on explicit options coming out of > target/optimize pragmas and attributes, perhaps we don't even need > to introduce a new attribute and can handle everything magically: > > 1) if it is gnu_inline extern inline, allow indirect calls, otherwise > disallow them for always_inline functions There are a lot of intrinsics using extern inline __gnu_inline though... > 2) for the isa flags and option mismatches, only disallow opts_set->x_* & 2 > stuff > This will keep both intrinsics and glibc fortify macros working fine > in all the needed use cases. Yes, see my example in the other mail. I think before we add any new attributes we should sort out the current mess, eventually adding some testcases for desired diagnostic. Richard. > Jakub >
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 01:23:20PM +0200, Richard Biener via Gcc-patches wrote: > > The question is if the pragma GCC target right now behaves incrementally > > or not, whether > > #pragma GCC target("avx2") > > adds -mavx2 to options if it was missing before and nothing otherwise, or if > > it switches other options off. If it is incremental, we could e.g. try to > > use the second least significant bit of global_options_set.x_* to mean > > this option has been set explicitly by some surrounding #pragma GCC target. > > The normal tests - global_options_set.x_flag_whatever could still work > > fine because they wouldn't care if the option was explicit from anywhere > > (command line or GCC target or target attribute) and just & 2 would mean > > it was explicit from pragma GCC target; though there is the case of > > bitfields... And then the inlining decision could check the & 2 flags to > > see what is required and what is just from command line. > > Or we can have some other pragma GCC that would be like target but would > > have flags that are explicit (and could e.g. be more restricted, to ISA > > options only, and let those use in addition to #pragma GCC target. > > I'm still curious as to what you think will break if always-inline does what > it is documented to do. We will silently accept calling intrinsics that must be used only in certain ISA contexts, which will lead to people writing non-portable code. So -O2 -mno-avx #include void foo (__m256 *x) { x[0] = _mm256_sub_ps (x[1], x[2]); } etc. will now be accepted when it shouldn't be. clang rejects it like gcc with: 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 'avx', but would be inlined into function 'foo' that is compiled without support for 'avx' x[0] = _mm256_sub_ps (x[1], x[2]); ^ Note, if I do: #include __attribute__((target ("no-sse3"))) void foo (__m256 *x) { x[0] = _mm256_sub_ps (x[1], x[2]); } and compile clang -S -O2 -mavx2 1.c 1.c:6:10: error: always_inline function '_mm256_sub_ps' requires target feature 'avx', but would be inlined into function 'foo' that is compiled without support for 'avx' x[0] = _mm256_sub_ps (x[1], x[2]); ^ then from the error message it seems that unlike GCC, clang remembers the exact target features that are needed for the intrinsics and checks just those. Though, looking at the preprocessed source, seems it uses static __inline __m256 __attribute__((__always_inline__, __nodebug__, __target__("avx"), __min_vector_width__(256))) _mm256_sub_ps(__m256 __a, __m256 __b) { return (__m256)((__v8sf)__a-(__v8sf)__b); } and not target pragmas. Anyway, if we tweak our intrinsic headers so that -#ifndef __AVX__ #pragma GCC push_options #pragma GCC target("avx") -#define __DISABLE_AVX__ -#endif /* __AVX__ */ ... -#ifdef __DISABLE_AVX__ -#undef __DISABLE_AVX__ #pragma GCC pop_options -#endif /* __DISABLE_AVX__ */ and do the opts_set->x_* & 2 stuff on explicit options coming out of target/optimize pragmas and attributes, perhaps we don't even need to introduce a new attribute and can handle everything magically: 1) if it is gnu_inline extern inline, allow indirect calls, otherwise disallow them for always_inline functions 2) for the isa flags and option mismatches, only disallow opts_set->x_* & 2 stuff This will keep both intrinsics and glibc fortify macros working fine in all the needed use cases. Jakub
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 1:58 PM H.J. Lu wrote: > > On Thu, Apr 22, 2021 at 4:23 AM Richard Biener > wrote: > > > > On Thu, Apr 22, 2021 at 12:30 PM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote: > > > > How about this? > > > > > > > > @item general_regs_only > > > > @cindex @code{general_regs_only} function attribute, x86 > > > > The @code{general_regs_only} function attribute informs the compiler > > > > that the function uses only general purpose registers. When the > > > > compiler inlines a function with the @code{always_inline} attribute, > > > > target-specific compilation options may lead to inline failures. > > > > The @code{general_regs_only} attribute, if applicable, can be used > > > > together with the @code{always_inline} attribute to reduce inlining > > > > failure. > > > > > > I don't really like this attribute. > > > It is very specific to what you want to solve and doesn't address the > > > general problem, that always_inline means different things in different > > > code, and that is a problem for many targets, not just one. > > > > > > As has been written, in some cases it means inline always, error > > > whenever it is called indirectly which can't be optimized into a direct > > > call that can be inlined and error whenever the inlining fails for other > > > reasons. > > > > > > Another case, e.g. in the glibc fortify wrappers, is inline always > > > when the call is direct or an indirect call can be optimized into a direct > > > call and error when the inlining fails, but support indirect calls without > > > errors. > > > > > > Another case, which is most of the x86/aarch64/arm etc. intrinsics, is > > > inline always unless there is a target mismatch (roughly what is > > > actually implemented). > > > > > > Because from the always_inline attribute it is impossible to determine > > > which > > > one of those it is (for the indirect calls the rule could be > > > gnu_inline extern inline means indirect calls are ok, anything else means > > > indirect calls are bad), we need new syntax to distinguish those cases. > > > > > > general_regs_only attribute doesn't seem to be it, e.g. for the glibc > > > fortify wrappers cases I don't see why we should forbid using floating > > > point > > > in such inlines. > > > > > > So IMHO we need some new attribute for one of those, or optional parameter > > > to always_inline. > > > > > > For the intrinsic case, ideal would be if we could record which ISA flags > > > (or more generally which options) are required and which are not. Either > > > have some syntax where those would be explicitly specified in attribute > > > (but > > > frankly that would be a maintainance nightmare), or derive those from > > > surrounding pragmas. Right now we have those wrapped in > > > #ifndef __AVX2__ > > > #pragma GCC push_options > > > #pragma GCC target("avx2") > > > #define __DISABLE_AVX2__ > > > #endif /* __AVX2__ */ > > > > > > ... > > > > > > #ifdef __DISABLE_AVX2__ > > > #undef __DISABLE_AVX2__ > > > #pragma GCC pop_options > > > #endif /* __DISABLE_AVX2__ */ > > > > > > The question is if the pragma GCC target right now behaves incrementally > > > or not, whether > > > #pragma GCC target("avx2") > > > adds -mavx2 to options if it was missing before and nothing otherwise, or > > > if > > > it switches other options off. If it is incremental, we could e.g. try to > > > use the second least significant bit of global_options_set.x_* to mean > > > this option has been set explicitly by some surrounding #pragma GCC > > > target. > > > The normal tests - global_options_set.x_flag_whatever could still work > > > fine because they wouldn't care if the option was explicit from anywhere > > > (command line or GCC target or target attribute) and just & 2 would mean > > > it was explicit from pragma GCC target; though there is the case of > > > bitfields... And then the inlining decision could check the & 2 flags to > > > see what is required and what is just from command line. > > > Or we can have some other pragma GCC that would be like target but would > > > have flags that are explicit (and could e.g. be more restricted, to ISA > > > options only, and let those use in addition to #pragma GCC target. > > > > I'm still curious as to what you think will break if always-inline does what > > it is documented to do. > > No wrong code. But the compiler will generate a different error message > at the later stage if the ISA for the intrinsic isn't enabled. But all issues at hand we're trying to fix are rejections of _valid_ code. Improving diagnostics for invalid code should be secondary to properly accepting valid code. There's nothing perfect here - for diagnostics we might want to consider an __attribute__((requires_isa(a,b,c))) which says that calling a function with such attribute requires the ISAs in the list to be active. Target code can then iterate over calls _before_ inlining
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 4:23 AM Richard Biener wrote: > > On Thu, Apr 22, 2021 at 12:30 PM Jakub Jelinek via Gcc-patches > wrote: > > > > On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote: > > > How about this? > > > > > > @item general_regs_only > > > @cindex @code{general_regs_only} function attribute, x86 > > > The @code{general_regs_only} function attribute informs the compiler > > > that the function uses only general purpose registers. When the > > > compiler inlines a function with the @code{always_inline} attribute, > > > target-specific compilation options may lead to inline failures. > > > The @code{general_regs_only} attribute, if applicable, can be used > > > together with the @code{always_inline} attribute to reduce inlining > > > failure. > > > > I don't really like this attribute. > > It is very specific to what you want to solve and doesn't address the > > general problem, that always_inline means different things in different > > code, and that is a problem for many targets, not just one. > > > > As has been written, in some cases it means inline always, error > > whenever it is called indirectly which can't be optimized into a direct > > call that can be inlined and error whenever the inlining fails for other > > reasons. > > > > Another case, e.g. in the glibc fortify wrappers, is inline always > > when the call is direct or an indirect call can be optimized into a direct > > call and error when the inlining fails, but support indirect calls without > > errors. > > > > Another case, which is most of the x86/aarch64/arm etc. intrinsics, is > > inline always unless there is a target mismatch (roughly what is > > actually implemented). > > > > Because from the always_inline attribute it is impossible to determine which > > one of those it is (for the indirect calls the rule could be > > gnu_inline extern inline means indirect calls are ok, anything else means > > indirect calls are bad), we need new syntax to distinguish those cases. > > > > general_regs_only attribute doesn't seem to be it, e.g. for the glibc > > fortify wrappers cases I don't see why we should forbid using floating point > > in such inlines. > > > > So IMHO we need some new attribute for one of those, or optional parameter > > to always_inline. > > > > For the intrinsic case, ideal would be if we could record which ISA flags > > (or more generally which options) are required and which are not. Either > > have some syntax where those would be explicitly specified in attribute (but > > frankly that would be a maintainance nightmare), or derive those from > > surrounding pragmas. Right now we have those wrapped in > > #ifndef __AVX2__ > > #pragma GCC push_options > > #pragma GCC target("avx2") > > #define __DISABLE_AVX2__ > > #endif /* __AVX2__ */ > > > > ... > > > > #ifdef __DISABLE_AVX2__ > > #undef __DISABLE_AVX2__ > > #pragma GCC pop_options > > #endif /* __DISABLE_AVX2__ */ > > > > The question is if the pragma GCC target right now behaves incrementally > > or not, whether > > #pragma GCC target("avx2") > > adds -mavx2 to options if it was missing before and nothing otherwise, or if > > it switches other options off. If it is incremental, we could e.g. try to > > use the second least significant bit of global_options_set.x_* to mean > > this option has been set explicitly by some surrounding #pragma GCC target. > > The normal tests - global_options_set.x_flag_whatever could still work > > fine because they wouldn't care if the option was explicit from anywhere > > (command line or GCC target or target attribute) and just & 2 would mean > > it was explicit from pragma GCC target; though there is the case of > > bitfields... And then the inlining decision could check the & 2 flags to > > see what is required and what is just from command line. > > Or we can have some other pragma GCC that would be like target but would > > have flags that are explicit (and could e.g. be more restricted, to ISA > > options only, and let those use in addition to #pragma GCC target. > > I'm still curious as to what you think will break if always-inline does what > it is documented to do. No wrong code. But the compiler will generate a different error message at the later stage if the ISA for the intrinsic isn't enabled. -- H.J.
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 12:30 PM Jakub Jelinek via Gcc-patches wrote: > > On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote: > > How about this? > > > > @item general_regs_only > > @cindex @code{general_regs_only} function attribute, x86 > > The @code{general_regs_only} function attribute informs the compiler > > that the function uses only general purpose registers. When the > > compiler inlines a function with the @code{always_inline} attribute, > > target-specific compilation options may lead to inline failures. > > The @code{general_regs_only} attribute, if applicable, can be used > > together with the @code{always_inline} attribute to reduce inlining > > failure. > > I don't really like this attribute. > It is very specific to what you want to solve and doesn't address the > general problem, that always_inline means different things in different > code, and that is a problem for many targets, not just one. > > As has been written, in some cases it means inline always, error > whenever it is called indirectly which can't be optimized into a direct > call that can be inlined and error whenever the inlining fails for other > reasons. > > Another case, e.g. in the glibc fortify wrappers, is inline always > when the call is direct or an indirect call can be optimized into a direct > call and error when the inlining fails, but support indirect calls without > errors. > > Another case, which is most of the x86/aarch64/arm etc. intrinsics, is > inline always unless there is a target mismatch (roughly what is > actually implemented). > > Because from the always_inline attribute it is impossible to determine which > one of those it is (for the indirect calls the rule could be > gnu_inline extern inline means indirect calls are ok, anything else means > indirect calls are bad), we need new syntax to distinguish those cases. > > general_regs_only attribute doesn't seem to be it, e.g. for the glibc > fortify wrappers cases I don't see why we should forbid using floating point > in such inlines. > > So IMHO we need some new attribute for one of those, or optional parameter > to always_inline. > > For the intrinsic case, ideal would be if we could record which ISA flags > (or more generally which options) are required and which are not. Either > have some syntax where those would be explicitly specified in attribute (but > frankly that would be a maintainance nightmare), or derive those from > surrounding pragmas. Right now we have those wrapped in > #ifndef __AVX2__ > #pragma GCC push_options > #pragma GCC target("avx2") > #define __DISABLE_AVX2__ > #endif /* __AVX2__ */ > > ... > > #ifdef __DISABLE_AVX2__ > #undef __DISABLE_AVX2__ > #pragma GCC pop_options > #endif /* __DISABLE_AVX2__ */ > > The question is if the pragma GCC target right now behaves incrementally > or not, whether > #pragma GCC target("avx2") > adds -mavx2 to options if it was missing before and nothing otherwise, or if > it switches other options off. If it is incremental, we could e.g. try to > use the second least significant bit of global_options_set.x_* to mean > this option has been set explicitly by some surrounding #pragma GCC target. > The normal tests - global_options_set.x_flag_whatever could still work > fine because they wouldn't care if the option was explicit from anywhere > (command line or GCC target or target attribute) and just & 2 would mean > it was explicit from pragma GCC target; though there is the case of > bitfields... And then the inlining decision could check the & 2 flags to > see what is required and what is just from command line. > Or we can have some other pragma GCC that would be like target but would > have flags that are explicit (and could e.g. be more restricted, to ISA > options only, and let those use in addition to #pragma GCC target. I'm still curious as to what you think will break if always-inline does what it is documented to do. Richard. > Jakub >
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote: > How about this? > > @item general_regs_only > @cindex @code{general_regs_only} function attribute, x86 > The @code{general_regs_only} function attribute informs the compiler > that the function uses only general purpose registers. When the > compiler inlines a function with the @code{always_inline} attribute, > target-specific compilation options may lead to inline failures. > The @code{general_regs_only} attribute, if applicable, can be used > together with the @code{always_inline} attribute to reduce inlining > failure. I don't really like this attribute. It is very specific to what you want to solve and doesn't address the general problem, that always_inline means different things in different code, and that is a problem for many targets, not just one. As has been written, in some cases it means inline always, error whenever it is called indirectly which can't be optimized into a direct call that can be inlined and error whenever the inlining fails for other reasons. Another case, e.g. in the glibc fortify wrappers, is inline always when the call is direct or an indirect call can be optimized into a direct call and error when the inlining fails, but support indirect calls without errors. Another case, which is most of the x86/aarch64/arm etc. intrinsics, is inline always unless there is a target mismatch (roughly what is actually implemented). Because from the always_inline attribute it is impossible to determine which one of those it is (for the indirect calls the rule could be gnu_inline extern inline means indirect calls are ok, anything else means indirect calls are bad), we need new syntax to distinguish those cases. general_regs_only attribute doesn't seem to be it, e.g. for the glibc fortify wrappers cases I don't see why we should forbid using floating point in such inlines. So IMHO we need some new attribute for one of those, or optional parameter to always_inline. For the intrinsic case, ideal would be if we could record which ISA flags (or more generally which options) are required and which are not. Either have some syntax where those would be explicitly specified in attribute (but frankly that would be a maintainance nightmare), or derive those from surrounding pragmas. Right now we have those wrapped in #ifndef __AVX2__ #pragma GCC push_options #pragma GCC target("avx2") #define __DISABLE_AVX2__ #endif /* __AVX2__ */ ... #ifdef __DISABLE_AVX2__ #undef __DISABLE_AVX2__ #pragma GCC pop_options #endif /* __DISABLE_AVX2__ */ The question is if the pragma GCC target right now behaves incrementally or not, whether #pragma GCC target("avx2") adds -mavx2 to options if it was missing before and nothing otherwise, or if it switches other options off. If it is incremental, we could e.g. try to use the second least significant bit of global_options_set.x_* to mean this option has been set explicitly by some surrounding #pragma GCC target. The normal tests - global_options_set.x_flag_whatever could still work fine because they wouldn't care if the option was explicit from anywhere (command line or GCC target or target attribute) and just & 2 would mean it was explicit from pragma GCC target; though there is the case of bitfields... And then the inlining decision could check the & 2 flags to see what is required and what is just from command line. Or we can have some other pragma GCC that would be like target but would have flags that are explicit (and could e.g. be more restricted, to ISA options only, and let those use in addition to #pragma GCC target. Jakub
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 22, 2021 at 3:01 AM H.J. Lu wrote: > > On Wed, Apr 21, 2021 at 4:24 PM Martin Sebor wrote: > > > > On 4/21/21 2:58 PM, H.J. Lu wrote: > > > On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor wrote: > > >> > > >> On 4/14/21 4:39 PM, H.J. Lu wrote: > > >>> commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 > > >>> Author: H.J. Lu > > >>> Date: Fri Aug 21 09:42:49 2020 -0700 > > >>> > > >>> x86: Add target("general-regs-only") function attribute > > >>> > > >>> is incomplete since it is impossible to call integer intrinsics from > > >>> a function with general-regs-only target attribute. > > >>> > > >>> 1. Add general_regs_only function attribute to inform the compiler that > > >>> functions use only general purpose registers. When making inlining > > >>> decisions on such functions, non-GPR compiler options are excluded. > > >>> 2. Add general_regs_only attribute to x86 intrinsics which use only > > >>> general purpose registers. > > >>> > > >> ... > > >>> --- a/gcc/doc/extend.texi > > >>> +++ b/gcc/doc/extend.texi > > >>> @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} > > >>> attribute sets the name > > >>>of the section to record function entry instrumentation calls in when > > >>>enabled with @option{-pg -mrecord-mcount} > > >>> > > >>> +@item general_regs_only > > >>> +@cindex @code{general_regs_only} function attribute, x86 > > >>> +The @code{general_regs_only} attribute on functions is used to > > >>> +inform the compiler that functions use only general purpose registers. > > >> > > >> I'll just reiterate basically the same comment as before: it's not > > >> clear from the very brief description above what the requirements > > >> are for using the attribute. I'm guessing it can be applied to > > >> any function (inline or otherwise) but only has any effect when > > >> the function is actually inlined and otherwise doesn't constrain > > >> what the function can do. (Whatever the constraints are, I think > > >> the manual should spell them out, and likewise for its effects.) > > > > > > That is correct. > > > > > >> Similarly it's not clear what should be expected when the function > > >> does use some other register. Ideally, I think GCC would check and > > >> issue a nice error message whether or not the function is inlined > > >> or called. I suspect that might only be possible for inline > > >> functions that are actually called and for which the back end must > > >> emit code. > > > > > > This is what GCC does today: > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744 > > > > Yes, that's the rather obscure error I think I commented on before > > and suggested should be improved. Based on r99744-3.c I don't think > > this has changed in the improved patch. > > My goal is to fix the inline failures, not to improve the compiler error > message. > > > > > > >> Other than that, I'd suggest to improve the phrasing a bit: > > >> > > >> The @code{general_regs_only} function attribute indicates that > > >> the function uses only general purpose registers... [text > > >> explaining constraints and errors follows]. > > >> > > >> Martin > > > > > > How about this > > > > > > @item general_regs_only > > > @cindex @code{general_regs_only} function attribute, x86 > > > The @code{general_regs_only} attribute on functions is used to inform > > > the compiler that functions use only general purpose registers. It > > > can be used together with the @code{always_inline} attribute to avoid > > > inlining failure when there is a mismatch in compiler vector options. > > > > Without an article the part "that functions use only general purpose > > registers" is unclear and/or grammatically incorrect. What functions? > > If the function the attribute is applied to, it needs an article, e.g., > > "the function" or "a function", and singular. (Otherwise it could be > > read as talking about the functions called from the one with > > the attribute, or some other functions altogether). > > > > I tried to correct that above but, if you prefer, the following would > > be closer to your phrasing but more correct/accurate: > > > >The @code{general_regs_only} function attribute informs > >the compiler that the function uses only general purpose > >registers. > > > > I don't understand what the second sentence is trying to say, and > > without a better error message for the problem in r99744, I suspect > > few users will either. I am suggesting to explain in the text you > > are adding, under what conditions inlining might fail without > > the attribute, and what effect the attribute has on the function > > that prevents the inlining failure. > > How about this? > > @item general_regs_only > @cindex @code{general_regs_only} function attribute, x86 > The @code{general_regs_only} function attribute informs the compiler > that the function uses only general purpose registers. When the > compiler inlines a function with the @code{always_inline} attrib
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Wed, Apr 21, 2021 at 4:24 PM Martin Sebor wrote: > > On 4/21/21 2:58 PM, H.J. Lu wrote: > > On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor wrote: > >> > >> On 4/14/21 4:39 PM, H.J. Lu wrote: > >>> commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 > >>> Author: H.J. Lu > >>> Date: Fri Aug 21 09:42:49 2020 -0700 > >>> > >>> x86: Add target("general-regs-only") function attribute > >>> > >>> is incomplete since it is impossible to call integer intrinsics from > >>> a function with general-regs-only target attribute. > >>> > >>> 1. Add general_regs_only function attribute to inform the compiler that > >>> functions use only general purpose registers. When making inlining > >>> decisions on such functions, non-GPR compiler options are excluded. > >>> 2. Add general_regs_only attribute to x86 intrinsics which use only > >>> general purpose registers. > >>> > >> ... > >>> --- a/gcc/doc/extend.texi > >>> +++ b/gcc/doc/extend.texi > >>> @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} > >>> attribute sets the name > >>>of the section to record function entry instrumentation calls in when > >>>enabled with @option{-pg -mrecord-mcount} > >>> > >>> +@item general_regs_only > >>> +@cindex @code{general_regs_only} function attribute, x86 > >>> +The @code{general_regs_only} attribute on functions is used to > >>> +inform the compiler that functions use only general purpose registers. > >> > >> I'll just reiterate basically the same comment as before: it's not > >> clear from the very brief description above what the requirements > >> are for using the attribute. I'm guessing it can be applied to > >> any function (inline or otherwise) but only has any effect when > >> the function is actually inlined and otherwise doesn't constrain > >> what the function can do. (Whatever the constraints are, I think > >> the manual should spell them out, and likewise for its effects.) > > > > That is correct. > > > >> Similarly it's not clear what should be expected when the function > >> does use some other register. Ideally, I think GCC would check and > >> issue a nice error message whether or not the function is inlined > >> or called. I suspect that might only be possible for inline > >> functions that are actually called and for which the back end must > >> emit code. > > > > This is what GCC does today: > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744 > > Yes, that's the rather obscure error I think I commented on before > and suggested should be improved. Based on r99744-3.c I don't think > this has changed in the improved patch. My goal is to fix the inline failures, not to improve the compiler error message. > > > >> Other than that, I'd suggest to improve the phrasing a bit: > >> > >> The @code{general_regs_only} function attribute indicates that > >> the function uses only general purpose registers... [text > >> explaining constraints and errors follows]. > >> > >> Martin > > > > How about this > > > > @item general_regs_only > > @cindex @code{general_regs_only} function attribute, x86 > > The @code{general_regs_only} attribute on functions is used to inform > > the compiler that functions use only general purpose registers. It > > can be used together with the @code{always_inline} attribute to avoid > > inlining failure when there is a mismatch in compiler vector options. > > Without an article the part "that functions use only general purpose > registers" is unclear and/or grammatically incorrect. What functions? > If the function the attribute is applied to, it needs an article, e.g., > "the function" or "a function", and singular. (Otherwise it could be > read as talking about the functions called from the one with > the attribute, or some other functions altogether). > > I tried to correct that above but, if you prefer, the following would > be closer to your phrasing but more correct/accurate: > >The @code{general_regs_only} function attribute informs >the compiler that the function uses only general purpose >registers. > > I don't understand what the second sentence is trying to say, and > without a better error message for the problem in r99744, I suspect > few users will either. I am suggesting to explain in the text you > are adding, under what conditions inlining might fail without > the attribute, and what effect the attribute has on the function > that prevents the inlining failure. How about this? @item general_regs_only @cindex @code{general_regs_only} function attribute, x86 The @code{general_regs_only} function attribute informs the compiler that the function uses only general purpose registers. When the compiler inlines a function with the @code{always_inline} attribute, target-specific compilation options may lead to inline failures. The @code{general_regs_only} attribute, if applicable, can be used together with the @code{always_inline} attribute to reduce inlining failure. > (If we can't explain what the effect is then
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On 4/21/21 2:58 PM, H.J. Lu wrote: On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor wrote: On 4/14/21 4:39 PM, H.J. Lu wrote: commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 Author: H.J. Lu Date: Fri Aug 21 09:42:49 2020 -0700 x86: Add target("general-regs-only") function attribute is incomplete since it is impossible to call integer intrinsics from a function with general-regs-only target attribute. 1. Add general_regs_only function attribute to inform the compiler that functions use only general purpose registers. When making inlining decisions on such functions, non-GPR compiler options are excluded. 2. Add general_regs_only attribute to x86 intrinsics which use only general purpose registers. ... --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute sets the name of the section to record function entry instrumentation calls in when enabled with @option{-pg -mrecord-mcount} +@item general_regs_only +@cindex @code{general_regs_only} function attribute, x86 +The @code{general_regs_only} attribute on functions is used to +inform the compiler that functions use only general purpose registers. I'll just reiterate basically the same comment as before: it's not clear from the very brief description above what the requirements are for using the attribute. I'm guessing it can be applied to any function (inline or otherwise) but only has any effect when the function is actually inlined and otherwise doesn't constrain what the function can do. (Whatever the constraints are, I think the manual should spell them out, and likewise for its effects.) That is correct. Similarly it's not clear what should be expected when the function does use some other register. Ideally, I think GCC would check and issue a nice error message whether or not the function is inlined or called. I suspect that might only be possible for inline functions that are actually called and for which the back end must emit code. This is what GCC does today: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744 Yes, that's the rather obscure error I think I commented on before and suggested should be improved. Based on r99744-3.c I don't think this has changed in the improved patch. Other than that, I'd suggest to improve the phrasing a bit: The @code{general_regs_only} function attribute indicates that the function uses only general purpose registers... [text explaining constraints and errors follows]. Martin How about this @item general_regs_only @cindex @code{general_regs_only} function attribute, x86 The @code{general_regs_only} attribute on functions is used to inform the compiler that functions use only general purpose registers. It can be used together with the @code{always_inline} attribute to avoid inlining failure when there is a mismatch in compiler vector options. Without an article the part "that functions use only general purpose registers" is unclear and/or grammatically incorrect. What functions? If the function the attribute is applied to, it needs an article, e.g., "the function" or "a function", and singular. (Otherwise it could be read as talking about the functions called from the one with the attribute, or some other functions altogether). I tried to correct that above but, if you prefer, the following would be closer to your phrasing but more correct/accurate: The @code{general_regs_only} function attribute informs the compiler that the function uses only general purpose registers. I don't understand what the second sentence is trying to say, and without a better error message for the problem in r99744, I suspect few users will either. I am suggesting to explain in the text you are adding, under what conditions inlining might fail without the attribute, and what effect the attribute has on the function that prevents the inlining failure. (If we can't explain what the effect is then I wonder why the attribute is being added at all instead of teaching GCC to always behave as if the attribute were there when its absence would otherwise lead to an error.) Martin
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Wed, Apr 21, 2021 at 10:09 AM Martin Sebor wrote: > > On 4/14/21 4:39 PM, H.J. Lu wrote: > > commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 > > Author: H.J. Lu > > Date: Fri Aug 21 09:42:49 2020 -0700 > > > > x86: Add target("general-regs-only") function attribute > > > > is incomplete since it is impossible to call integer intrinsics from > > a function with general-regs-only target attribute. > > > > 1. Add general_regs_only function attribute to inform the compiler that > > functions use only general purpose registers. When making inlining > > decisions on such functions, non-GPR compiler options are excluded. > > 2. Add general_regs_only attribute to x86 intrinsics which use only > > general purpose registers. > > > ... > > --- a/gcc/doc/extend.texi > > +++ b/gcc/doc/extend.texi > > @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute > > sets the name > > of the section to record function entry instrumentation calls in when > > enabled with @option{-pg -mrecord-mcount} > > > > +@item general_regs_only > > +@cindex @code{general_regs_only} function attribute, x86 > > +The @code{general_regs_only} attribute on functions is used to > > +inform the compiler that functions use only general purpose registers. > > I'll just reiterate basically the same comment as before: it's not > clear from the very brief description above what the requirements > are for using the attribute. I'm guessing it can be applied to > any function (inline or otherwise) but only has any effect when > the function is actually inlined and otherwise doesn't constrain > what the function can do. (Whatever the constraints are, I think > the manual should spell them out, and likewise for its effects.) That is correct. > Similarly it's not clear what should be expected when the function > does use some other register. Ideally, I think GCC would check and > issue a nice error message whether or not the function is inlined > or called. I suspect that might only be possible for inline > functions that are actually called and for which the back end must > emit code. This is what GCC does today: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744 > Other than that, I'd suggest to improve the phrasing a bit: > >The @code{general_regs_only} function attribute indicates that >the function uses only general purpose registers... [text >explaining constraints and errors follows]. > > Martin How about this @item general_regs_only @cindex @code{general_regs_only} function attribute, x86 The @code{general_regs_only} attribute on functions is used to inform the compiler that functions use only general purpose registers. It can be used together with the @code{always_inline} attribute to avoid inlining failure when there is a mismatch in compiler vector options. Thanks. -- H.J.
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On 4/14/21 4:39 PM, H.J. Lu wrote: commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 Author: H.J. Lu Date: Fri Aug 21 09:42:49 2020 -0700 x86: Add target("general-regs-only") function attribute is incomplete since it is impossible to call integer intrinsics from a function with general-regs-only target attribute. 1. Add general_regs_only function attribute to inform the compiler that functions use only general purpose registers. When making inlining decisions on such functions, non-GPR compiler options are excluded. 2. Add general_regs_only attribute to x86 intrinsics which use only general purpose registers. ... --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -7066,6 +7066,11 @@ On x86 targets, the @code{fentry_section} attribute sets the name of the section to record function entry instrumentation calls in when enabled with @option{-pg -mrecord-mcount} +@item general_regs_only +@cindex @code{general_regs_only} function attribute, x86 +The @code{general_regs_only} attribute on functions is used to +inform the compiler that functions use only general purpose registers. I'll just reiterate basically the same comment as before: it's not clear from the very brief description above what the requirements are for using the attribute. I'm guessing it can be applied to any function (inline or otherwise) but only has any effect when the function is actually inlined and otherwise doesn't constrain what the function can do. (Whatever the constraints are, I think the manual should spell them out, and likewise for its effects.) Similarly it's not clear what should be expected when the function does use some other register. Ideally, I think GCC would check and issue a nice error message whether or not the function is inlined or called. I suspect that might only be possible for inline functions that are actually called and for which the back end must emit code. Other than that, I'd suggest to improve the phrasing a bit: The @code{general_regs_only} function attribute indicates that the function uses only general purpose registers... [text explaining constraints and errors follows]. Martin
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On 4/21/21 1:30 AM, Uros Bizjak wrote: On Thu, Apr 15, 2021 at 12:39 AM H.J. Lu wrote: commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 Author: H.J. Lu Date: Fri Aug 21 09:42:49 2020 -0700 x86: Add target("general-regs-only") function attribute is incomplete since it is impossible to call integer intrinsics from a function with general-regs-only target attribute. 1. Add general_regs_only function attribute to inform the compiler that functions use only general purpose registers. When making inlining decisions on such functions, non-GPR compiler options are excluded. 2. Add general_regs_only attribute to x86 intrinsics which use only general purpose registers. I'd like to ask Richard and Jakub if they agree with the approach. On a related note, can we declare default attributes like clang does, e.g.: /* Define the default attributes for the functions. */ #define __DEFAULT_FN_ATTRS __attribute__((__gnu_inline__, __always_inline__, __artificial__)) #define __DEFAULT_FN_ATTRS_GRO __attribute__((__gnu_inline__, __always_inline__, __general_regs_only, __artificial__)) and use these defines throughout header files? FWIW, the sequence of attributes contributes measurably to the amount of time it takes to parse all the declarations. Since they're the same for most of the thousands of functions declared in these headers it might be worth considering either adding a new attribute that "expands" to all of these, or using some other mechanism to speed things up. (This came up recenetly in pr100099.) Martin Uros. gcc/ PR target/99744 * config/i386/i386-options.c (ix86_attribute_table): Add general_regs_only. * config/i386/i386.c (ix86_can_inline_p): Exclude non-integer target options if callee has general_regs_only attribute. * config/i386/adxintrin.h: Add general_regs_only attribute to intrinsics which use only general purpose registers. * config/i386/bmiintrin.h: Likewise. * config/i386/bmi2intrin.h: Likewise. * config/i386/cetintrin.h: Likewise. * config/i386/cldemoteintrin.h: Likewise. * config/i386/clflushoptintrin.h: Likewise. * config/i386/clwbintrin.h: Likewise. * config/i386/clzerointrin.h: Likewise. * config/i386/enqcmdintrin.h: Likewise. * config/i386/fxsrintrin.h: Likewise. * config/i386/hresetintrin.h: Likewise. * config/i386/ia32intrin.h: Likewise. * config/i386/lwpintrin.h: Likewise. * config/i386/lzcntintrin.h: Likewise. * config/i386/movdirintrin.h: Likewise. * config/i386/mwaitxintrin.h: Likewise. * config/i386/pconfigintrin.h: Likewise. * config/i386/pkuintrin.h: Likewise. * config/i386/popcntintrin.h: Likewise. * config/i386/rdseedintrin.h: Likewise. * config/i386/rtmintrin.h: Likewise. * config/i386/serializeintrin.h: Likewise. * config/i386/sgxintrin.h: Likewise. * config/i386/tbmintrin.h: Likewise. * config/i386/tsxldtrkintrin.h: Likewise. * config/i386/uintrintrin.h: Likewise. * config/i386/waitpkgintrin.h: Likewise. * config/i386/wbnoinvdintrin.h: Likewise. * config/i386/x86gprintrin.h: Likewise. * config/i386/xsavecintrin.h: Likewise. * config/i386/xsaveintrin.h: Likewise. * config/i386/xsaveoptintrin.h: Likewise. * config/i386/xsavesintrin.h: Likewise. * config/i386/xtestintrin.h: Likewise. * doc/extend.texi: Document general_regs_only function attribute. gcc/testsuite/ PR target/99744 * gcc.target/i386/pr99744-3.c: New test. * gcc.target/i386/pr99744-4.c: Likewise. --- gcc/config/i386/adxintrin.h | 18 +- gcc/config/i386/bmi2intrin.h | 24 +- gcc/config/i386/bmiintrin.h | 92 -- gcc/config/i386/cetintrin.h | 33 +- gcc/config/i386/cldemoteintrin.h | 3 +- gcc/config/i386/clflushoptintrin.h| 3 +- gcc/config/i386/clwbintrin.h | 3 +- gcc/config/i386/clzerointrin.h| 4 +- gcc/config/i386/enqcmdintrin.h| 6 +- gcc/config/i386/fxsrintrin.h | 12 +- gcc/config/i386/hresetintrin.h| 3 +- gcc/config/i386/i386-options.c| 2 + gcc/config/i386/i386.c| 29 +- gcc/config/i386/ia32intrin.h | 82 +++-- gcc/config/i386/lwpintrin.h | 24 +- gcc/config/i386/lzcntintrin.h | 20 +- gcc/config/i386/movdirintrin.h| 9 +- gcc/config/i386/mwaitxintrin.h| 8 +- gcc/config/i386/pconfigintrin.h | 3 +- gcc/config/i386/pkuintrin.h | 6 +- gcc/config/i386/popcntintrin.h| 8 +- gcc/config/i386/rdseedintrin.h| 9 +- gcc/config/i386/rtmintrin.h | 9 +
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Wed, Apr 21, 2021 at 12:30 AM Uros Bizjak wrote: > > On Thu, Apr 15, 2021 at 12:39 AM H.J. Lu wrote: > > > > commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 > > Author: H.J. Lu > > Date: Fri Aug 21 09:42:49 2020 -0700 > > > > x86: Add target("general-regs-only") function attribute > > > > is incomplete since it is impossible to call integer intrinsics from > > a function with general-regs-only target attribute. > > > > 1. Add general_regs_only function attribute to inform the compiler that > > functions use only general purpose registers. When making inlining > > decisions on such functions, non-GPR compiler options are excluded. > > 2. Add general_regs_only attribute to x86 intrinsics which use only > > general purpose registers. > > I'd like to ask Richard and Jakub if they agree with the approach. Here is the v5 patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568407.html Richard, Jakub, do they look good to you? > On a related note, can we declare default attributes like clang does, e.g.: > > /* Define the default attributes for the functions. */ > #define __DEFAULT_FN_ATTRS __attribute__((__gnu_inline__, > __always_inline__, __artificial__)) This can be defined in x86intrin.h. > #define __DEFAULT_FN_ATTRS_GRO __attribute__((__gnu_inline__, > __always_inline__, __general_regs_only, __artificial__)) This can be defined in x86gprintrin.h. > and use these defines throughout header files? > > Uros. > Thanks. -- H.J.
Re: [PATCH v4 2/2] x86: Add general_regs_only function attribute
On Thu, Apr 15, 2021 at 12:39 AM H.J. Lu wrote: > > commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 > Author: H.J. Lu > Date: Fri Aug 21 09:42:49 2020 -0700 > > x86: Add target("general-regs-only") function attribute > > is incomplete since it is impossible to call integer intrinsics from > a function with general-regs-only target attribute. > > 1. Add general_regs_only function attribute to inform the compiler that > functions use only general purpose registers. When making inlining > decisions on such functions, non-GPR compiler options are excluded. > 2. Add general_regs_only attribute to x86 intrinsics which use only > general purpose registers. I'd like to ask Richard and Jakub if they agree with the approach. On a related note, can we declare default attributes like clang does, e.g.: /* Define the default attributes for the functions. */ #define __DEFAULT_FN_ATTRS __attribute__((__gnu_inline__, __always_inline__, __artificial__)) #define __DEFAULT_FN_ATTRS_GRO __attribute__((__gnu_inline__, __always_inline__, __general_regs_only, __artificial__)) and use these defines throughout header files? Uros. > > gcc/ > > PR target/99744 > * config/i386/i386-options.c (ix86_attribute_table): Add > general_regs_only. > * config/i386/i386.c (ix86_can_inline_p): Exclude non-integer > target options if callee has general_regs_only attribute. > * config/i386/adxintrin.h: Add general_regs_only attribute to > intrinsics which use only general purpose registers. > * config/i386/bmiintrin.h: Likewise. > * config/i386/bmi2intrin.h: Likewise. > * config/i386/cetintrin.h: Likewise. > * config/i386/cldemoteintrin.h: Likewise. > * config/i386/clflushoptintrin.h: Likewise. > * config/i386/clwbintrin.h: Likewise. > * config/i386/clzerointrin.h: Likewise. > * config/i386/enqcmdintrin.h: Likewise. > * config/i386/fxsrintrin.h: Likewise. > * config/i386/hresetintrin.h: Likewise. > * config/i386/ia32intrin.h: Likewise. > * config/i386/lwpintrin.h: Likewise. > * config/i386/lzcntintrin.h: Likewise. > * config/i386/movdirintrin.h: Likewise. > * config/i386/mwaitxintrin.h: Likewise. > * config/i386/pconfigintrin.h: Likewise. > * config/i386/pkuintrin.h: Likewise. > * config/i386/popcntintrin.h: Likewise. > * config/i386/rdseedintrin.h: Likewise. > * config/i386/rtmintrin.h: Likewise. > * config/i386/serializeintrin.h: Likewise. > * config/i386/sgxintrin.h: Likewise. > * config/i386/tbmintrin.h: Likewise. > * config/i386/tsxldtrkintrin.h: Likewise. > * config/i386/uintrintrin.h: Likewise. > * config/i386/waitpkgintrin.h: Likewise. > * config/i386/wbnoinvdintrin.h: Likewise. > * config/i386/x86gprintrin.h: Likewise. > * config/i386/xsavecintrin.h: Likewise. > * config/i386/xsaveintrin.h: Likewise. > * config/i386/xsaveoptintrin.h: Likewise. > * config/i386/xsavesintrin.h: Likewise. > * config/i386/xtestintrin.h: Likewise. > * doc/extend.texi: Document general_regs_only function attribute. > > gcc/testsuite/ > > PR target/99744 > * gcc.target/i386/pr99744-3.c: New test. > * gcc.target/i386/pr99744-4.c: Likewise. > --- > gcc/config/i386/adxintrin.h | 18 +- > gcc/config/i386/bmi2intrin.h | 24 +- > gcc/config/i386/bmiintrin.h | 92 -- > gcc/config/i386/cetintrin.h | 33 +- > gcc/config/i386/cldemoteintrin.h | 3 +- > gcc/config/i386/clflushoptintrin.h| 3 +- > gcc/config/i386/clwbintrin.h | 3 +- > gcc/config/i386/clzerointrin.h| 4 +- > gcc/config/i386/enqcmdintrin.h| 6 +- > gcc/config/i386/fxsrintrin.h | 12 +- > gcc/config/i386/hresetintrin.h| 3 +- > gcc/config/i386/i386-options.c| 2 + > gcc/config/i386/i386.c| 29 +- > gcc/config/i386/ia32intrin.h | 82 +++-- > gcc/config/i386/lwpintrin.h | 24 +- > gcc/config/i386/lzcntintrin.h | 20 +- > gcc/config/i386/movdirintrin.h| 9 +- > gcc/config/i386/mwaitxintrin.h| 8 +- > gcc/config/i386/pconfigintrin.h | 3 +- > gcc/config/i386/pkuintrin.h | 6 +- > gcc/config/i386/popcntintrin.h| 8 +- > gcc/config/i386/rdseedintrin.h| 9 +- > gcc/config/i386/rtmintrin.h | 9 +- > gcc/config/i386/serializeintrin.h | 8 +- > gcc/config/i386/sgxintrin.h | 9 +- > gcc/config/i386/tbmintrin.h | 80 +++-- > gcc/config/i386/tsxldtrkintrin.h | 6 +- > gcc/config/i386/uintrintrin.h | 12 +- > gcc/config/i386/waitpkgintrin.h | 9 +- > gcc/config
[PATCH v4 2/2] x86: Add general_regs_only function attribute
commit 87c753ac241f25d222d46ba1ac66ceba89d6a200 Author: H.J. Lu Date: Fri Aug 21 09:42:49 2020 -0700 x86: Add target("general-regs-only") function attribute is incomplete since it is impossible to call integer intrinsics from a function with general-regs-only target attribute. 1. Add general_regs_only function attribute to inform the compiler that functions use only general purpose registers. When making inlining decisions on such functions, non-GPR compiler options are excluded. 2. Add general_regs_only attribute to x86 intrinsics which use only general purpose registers. gcc/ PR target/99744 * config/i386/i386-options.c (ix86_attribute_table): Add general_regs_only. * config/i386/i386.c (ix86_can_inline_p): Exclude non-integer target options if callee has general_regs_only attribute. * config/i386/adxintrin.h: Add general_regs_only attribute to intrinsics which use only general purpose registers. * config/i386/bmiintrin.h: Likewise. * config/i386/bmi2intrin.h: Likewise. * config/i386/cetintrin.h: Likewise. * config/i386/cldemoteintrin.h: Likewise. * config/i386/clflushoptintrin.h: Likewise. * config/i386/clwbintrin.h: Likewise. * config/i386/clzerointrin.h: Likewise. * config/i386/enqcmdintrin.h: Likewise. * config/i386/fxsrintrin.h: Likewise. * config/i386/hresetintrin.h: Likewise. * config/i386/ia32intrin.h: Likewise. * config/i386/lwpintrin.h: Likewise. * config/i386/lzcntintrin.h: Likewise. * config/i386/movdirintrin.h: Likewise. * config/i386/mwaitxintrin.h: Likewise. * config/i386/pconfigintrin.h: Likewise. * config/i386/pkuintrin.h: Likewise. * config/i386/popcntintrin.h: Likewise. * config/i386/rdseedintrin.h: Likewise. * config/i386/rtmintrin.h: Likewise. * config/i386/serializeintrin.h: Likewise. * config/i386/sgxintrin.h: Likewise. * config/i386/tbmintrin.h: Likewise. * config/i386/tsxldtrkintrin.h: Likewise. * config/i386/uintrintrin.h: Likewise. * config/i386/waitpkgintrin.h: Likewise. * config/i386/wbnoinvdintrin.h: Likewise. * config/i386/x86gprintrin.h: Likewise. * config/i386/xsavecintrin.h: Likewise. * config/i386/xsaveintrin.h: Likewise. * config/i386/xsaveoptintrin.h: Likewise. * config/i386/xsavesintrin.h: Likewise. * config/i386/xtestintrin.h: Likewise. * doc/extend.texi: Document general_regs_only function attribute. gcc/testsuite/ PR target/99744 * gcc.target/i386/pr99744-3.c: New test. * gcc.target/i386/pr99744-4.c: Likewise. --- gcc/config/i386/adxintrin.h | 18 +- gcc/config/i386/bmi2intrin.h | 24 +- gcc/config/i386/bmiintrin.h | 92 -- gcc/config/i386/cetintrin.h | 33 +- gcc/config/i386/cldemoteintrin.h | 3 +- gcc/config/i386/clflushoptintrin.h| 3 +- gcc/config/i386/clwbintrin.h | 3 +- gcc/config/i386/clzerointrin.h| 4 +- gcc/config/i386/enqcmdintrin.h| 6 +- gcc/config/i386/fxsrintrin.h | 12 +- gcc/config/i386/hresetintrin.h| 3 +- gcc/config/i386/i386-options.c| 2 + gcc/config/i386/i386.c| 29 +- gcc/config/i386/ia32intrin.h | 82 +++-- gcc/config/i386/lwpintrin.h | 24 +- gcc/config/i386/lzcntintrin.h | 20 +- gcc/config/i386/movdirintrin.h| 9 +- gcc/config/i386/mwaitxintrin.h| 8 +- gcc/config/i386/pconfigintrin.h | 3 +- gcc/config/i386/pkuintrin.h | 6 +- gcc/config/i386/popcntintrin.h| 8 +- gcc/config/i386/rdseedintrin.h| 9 +- gcc/config/i386/rtmintrin.h | 9 +- gcc/config/i386/serializeintrin.h | 8 +- gcc/config/i386/sgxintrin.h | 9 +- gcc/config/i386/tbmintrin.h | 80 +++-- gcc/config/i386/tsxldtrkintrin.h | 6 +- gcc/config/i386/uintrintrin.h | 12 +- gcc/config/i386/waitpkgintrin.h | 9 +- gcc/config/i386/wbnoinvdintrin.h | 3 +- gcc/config/i386/x86gprintrin.h| 45 ++- gcc/config/i386/xsavecintrin.h| 6 +- gcc/config/i386/xsaveintrin.h | 18 +- gcc/config/i386/xsaveoptintrin.h | 6 +- gcc/config/i386/xsavesintrin.h| 12 +- gcc/config/i386/xtestintrin.h | 3 +- gcc/doc/extend.texi | 5 + gcc/testsuite/gcc.target/i386/pr99744-3.c | 13 + gcc/testsuite/gcc.target/i386/pr99744-4.c | 352 ++ 39 files changed, 818 insertions(+), 179 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.