On Thu, Apr 22, 2021 at 1:58 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Thu, Apr 22, 2021 at 4:23 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Thu, Apr 22, 2021 at 12:30 PM Jakub Jelinek via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > On Wed, Apr 21, 2021 at 06:01:07PM -0700, H.J. Lu via Gcc-patches wrote:
> > > > How about this?
> > > >
> > > > @item general_regs_only
> > > > @cindex @code{general_regs_only} function attribute, x86
> > > > The @code{general_regs_only} function attribute informs the compiler
> > > > that the function uses only general purpose registers.  When the
> > > > compiler inlines a function with the @code{always_inline} attribute,
> > > > target-specific compilation options may lead to inline failures.
> > > > The @code{general_regs_only} attribute, if applicable, can be used
> > > > together with the @code{always_inline} attribute to reduce inlining
> > > > failure.
> > >
> > > I don't really like this attribute.
> > > It is very specific to what you want to solve and doesn't address the
> > > general problem, that always_inline means different things in different
> > > code, and that is a problem for many targets, not just one.
> > >
> > > As has been written, in some cases it means inline always, error
> > > whenever it is called indirectly which can't be optimized into a direct
> > > call that can be inlined and error whenever the inlining fails for other
> > > reasons.
> > >
> > > Another case, e.g. in the glibc fortify wrappers, is inline always
> > > when the call is direct or an indirect call can be optimized into a direct
> > > call and error when the inlining fails, but support indirect calls without
> > > errors.
> > >
> > > Another case, which is most of the x86/aarch64/arm etc. intrinsics, is
> > > inline always unless there is a target mismatch (roughly what is
> > > actually implemented).
> > >
> > > Because from the always_inline attribute it is impossible to determine 
> > > which
> > > one of those it is (for the indirect calls the rule could be
> > > gnu_inline extern inline means indirect calls are ok, anything else means
> > > indirect calls are bad), we need new syntax to distinguish those cases.
> > >
> > > general_regs_only attribute doesn't seem to be it, e.g. for the glibc
> > > fortify wrappers cases I don't see why we should forbid using floating 
> > > point
> > > in such inlines.
> > >
> > > So IMHO we need some new attribute for one of those, or optional parameter
> > > to always_inline.
> > >
> > > For the intrinsic case, ideal would be if we could record which ISA flags
> > > (or more generally which options) are required and which are not.  Either
> > > have some syntax where those would be explicitly specified in attribute 
> > > (but
> > > frankly that would be a maintainance nightmare), or derive those from
> > > surrounding pragmas.  Right now we have those wrapped in
> > > #ifndef __AVX2__
> > > #pragma GCC push_options
> > > #pragma GCC target("avx2")
> > > #define __DISABLE_AVX2__
> > > #endif /* __AVX2__ */
> > >
> > > ...
> > >
> > > #ifdef __DISABLE_AVX2__
> > > #undef __DISABLE_AVX2__
> > > #pragma GCC pop_options
> > > #endif /* __DISABLE_AVX2__ */
> > >
> > > The question is if the pragma GCC target right now behaves incrementally
> > > or not, whether
> > > #pragma GCC target("avx2")
> > > adds -mavx2 to options if it was missing before and nothing otherwise, or 
> > > if
> > > it switches other options off.  If it is incremental, we could e.g. try to
> > > use the second least significant bit of global_options_set.x_* to mean
> > > this option has been set explicitly by some surrounding #pragma GCC 
> > > target.
> > > The normal tests - global_options_set.x_flag_whatever could still work
> > > fine because they wouldn't care if the option was explicit from anywhere
> > > (command line or GCC target or target attribute) and just & 2 would mean
> > > it was explicit from pragma GCC target; though there is the case of
> > > bitfields... And then the inlining decision could check the & 2 flags to
> > > see what is required and what is just from command line.
> > > Or we can have some other pragma GCC that would be like target but would
> > > have flags that are explicit (and could e.g. be more restricted, to ISA
> > > options only, and let those use in addition to #pragma GCC target.
> >
> > I'm still curious as to what you think will break if always-inline does what
> > it is documented to do.
>
> No wrong code.  But the compiler will generate a different error message
> at the later stage if the ISA for the intrinsic isn't enabled.

But all issues at hand we're trying to fix are rejections of _valid_ code.

Improving diagnostics for invalid code should be secondary to
properly accepting valid code.

There's nothing perfect here - for diagnostics we might want to consider
an __attribute__((requires_isa(a,b,c))) which says that calling a function
with such attribute requires the ISAs in the list to be active.  Target
code can then iterate over calls _before_ inlining to emit diagnostics.

Note that the target flags in effect on the callee are not the same as
what would be documented as required - if the CU is compiled
with -mavx2 then even SSE intrinsics will have AVX2 "enabled"
I think:

#include <xmmintrin.h>

__m128 __attribute__((target("no-avx,sse2"))) foo(__m128 A, __m128 B)
{
  return _mm_sub_ss (A, B);
}

> gcc-10 -S t.c -mavx2
In file included from t.c:1:
t.c: In function 'foo':
/usr/lib64/gcc/x86_64-suse-linux/10/include/xmmintrin.h:134:1: error:
inlining failed in call to 'always_inline' '_mm_sub_ss': target
specific option mismatch
  134 | _mm_sub_ss (__m128 __A, __m128 __B)
      | ^~~~~~~~~~
t.c:5:10: note: called from here
    5 |   return _mm_sub_ss (A, B);
      |          ^~~~~~~~~~~~~~~~~

the above is obviously bogus (_mm_sub_ss inherited AVX2).

Richard.

> --
> H.J.

Reply via email to