> -----Original Message-----
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, July 7, 2020 10:08 PM
> To: xiezhiheng <xiezhih...@huawei.com>
> Cc: Richard Biener <richard.guent...@gmail.com>; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> emitted at -O3
> 
> xiezhiheng <xiezhih...@huawei.com> writes:
> >> -----Original Message-----
> >> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> >> Sent: Monday, July 6, 2020 5:31 PM
> >> To: xiezhiheng <xiezhih...@huawei.com>
> >> Cc: Richard Biener <richard.guent...@gmail.com>;
> gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH PR94442] [AArch64] Redundant ldp/stp instructions
> >> emitted at -O3
> >>
> >> No, this is unfortunately a known bug.  See:
> >>
> >>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95964
> >>
> >> (Although the PR is recent, it's been a known bug for longer.)
> >>
> >> As you say, the difficulty is that the correct attributes depend on what
> >> the built-in function does.  Most integer arithmetic is “const”, but
> things
> >> get more complicated for floating-point arithmetic.
> >>
> >> The SVE intrinsics use a three stage process:
> >>
> >> - each function is classified into one of several groups
> >> - each group has a set of flags that describe what functions in the
> >>   group can do
> >> - these flags get converted into attributes based on the current
> >>   command-line options
> >>
> >> I guess we should have something similar for the arm_neon.h built-ins.
> >>
> >> If you're willing to help fix this, that'd be great.  I think a first
> >> step would be to agree a design.
> >>
> >> Thanks,
> >> Richard
> >
> > I'd like to have a try.
> 
> Great!
> 
> > I have checked the steps in SVE intrinsics.
> > It defines a base class "function_base" and derives different classes
> > to describe several intrinsics for each.  And each class may
> > have its own unique flags described in virtual function "call_properties".
> > The specific attributes will be converted from these flags in
> > "get_attributes" later.
> >
> > I find that there are more than 100 classes in total and if I only
> > need to classify them into different groups by attributes, maybe
> > we does not need so many classes?
> 
> Yeah, I agree.
> 
> Long term, there might be value in defining arm_neon.h in a similar
> way to arm_sve.h: i.e. have arm_neon.h defer most of the work to
> a special compiler pragma.  But that's going to be a lot of work.
> 
> I think it's possible to make incremental improvements to the current
> arm_neon.h implementation without that work being thrown away if we
> ever
> did switch to a pragma in future.  And the incremental approach seems
> more practical.
> 
> > The difficult thing I think is how to classify neon intrinsics into
> > different groups.  I'm going to follow up the way in SVE intrinsics
> > first now.
> 
> For now I'd suggest just giving a name to each combination of flags
> that the intrinsics need, rather than splitting instructions in a
> more fine-grained way.  (It's not at all obvious from the final state
> of the SVE code, but even there, the idea was to have as few groups as
> possible.  I.e. the groups were supposedly only split where necessary.
> As you say, there still ended up being a lot of groups in the end…)
> 
> It'd be easier to review if the work was split up into smaller steps.
> E.g. maybe one way would be this, with each number being a single
> patch:
> 
> (1) (a) Add a flags field to the built-in function definitions
>         that for now is always zero.
>     (b) Pick a name N to describe the most conservative set of flags.
>     (c) Make every built-in function definition use N.
> 

I have finished the first part.

(a) I add a new parameter called FLAG to every built-in function macro.

(b) I define some flags in aarch64-builtins.c
FLAG_NONE for no needed flags
FLAG_READ_FPCR for functions will read FPCR register
FLAG_RAISE_FP_EXCEPTIONS for functions will raise fp exceptions
FLAG_READ_MEMORY for functions will read global memory
FLAG_PREFETCH_MEMORY for functions will prefetch data to memory
FLAG_WRITE_MEMORY for functions will write global memory

FLAG_FP is used for floating-point arithmetic
FLAG_ALL is all flags above

(c) I add a field in struct aarch64_simd_builtin_datum to record flags
for each built-in function.  But the default flags I set for built-in functions
are FLAG_ALL because by default the built-in functions might do anything.

And bootstrap and regression are tested ok on aarch64 Linux platform.

Any suggestions?

Thanks,
Xie Zhiheng

> (2) (a) Pick one type of function that cannot yet be described properly.
>     (b) Pick a name N for that type of function.
>     (c) Add whichever new flags are needed.
>     (d) Add the appropriate attributes when the flags are set,
>         possibly based on command-line options.
>     (e) Make (exactly) one built-in function definition use N.
> 
> (3) (a) Pick some functions that all need the same attributes and
>         that can already be described properly
>     (b) Update all of their built-in function definitions accordingly,
>         as a single change.
> 
> So after (1), filling out the table is an iterative process of (2) and
> (3), in any order that's convenient (although it might help to order the
> (2) patches so that each one adds as few flags as possible).  Each patch
> would then be fairly small and self-contained.
> 
> That's just a suggestion though.  Please let me know if you have
> any other suggestions.
> 
> I guess there are two obvious ways of adding the flags field:
> 
> - add a new parameter to every built-in function macro, e.g.
>   BUILTIN_VSDQ_I and VAR1.
> 
> - wrap the definitions in a new macro, e.g.
>   MY_NEW_GROUP (BUILTIN_VSDQ_I (BINOP, sqshl, 0))
> 
> I don't really have a preference, and I guess all other things being
> equal, the first one wins by being more obvious than the second.
> Just thought I'd mention the second way in case anyone preferred it.
> 
> Thanks,
> Richard

Attachment: pr94442-v1.patch
Description: pr94442-v1.patch

Reply via email to