> On 6 May 2025, at 16:01, Alfie Richards <alfie.richa...@arm.com> wrote:
> 
> Hello,
> 
> I like this idea. I have a couple thoughts to add.
> 
> On 05/05/2025 09:46, Yangyu Chen wrote:
>>> On 5 May 2025, at 16:34, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
>>> 
>>>> On 4 May 2025, at 19:19, Yangyu Chen <c...@cyyself.name> wrote:
>>>> 
>>>> Hi everyone,
>>>> 
>>>> This patch series introduces support for the target_clones profile
>>>> option in GCC. This option enables users to specify target_clones
>>>> attributes in a separate file, allowing GCC to generate multiple
>>>> versions of the function with different ISA extensions based on the
>>>> specified profile. This is achieved using the -ftarget-profile
>>>> option.
>>> 
>>> Interesting idea, but the terminology is confusing as is.
>>> In GCC “profile” usually refers to an execution profile gathered through 
>>> PGO instrumentation or perf.
>>> Whereas here I think you use “profile” to mean a “RISC-V profile” which is 
>>> something like a set of target architecture extensions?
>>> Thanks,
>>> Kyrill
>>> 
>> Sorry for the unclear information. The target profile here refers
>> to the target_clones attribute for each function. You can find an
>> example in the second patch.
> 
> I was also confused by the naming initially. Maybe something like
> "-ffunction-clone-table" instead?

Indeed. I will change that in the next revision.

> 
>> For instance, we want a function foo to generate default and RISC-V
>> vector targets, while a function bar should generate default and
>> zba,zbb targets. The corresponding source code could be as follows:
>> ```
>> __attribute__((target_clones("default","arch=+v")))
>> void foo();
>> __attribute__((target_clones("default","arch=+zba,+zbb")))
>> void bar();
>> ```
>> But if we have a target profile, we can describe it as follows in
>> a separate file:
>> ```
>> foo:default#arch=+v
>> bar:default#arch=+zba,+zbb
>> ```
> 
> As every function needs the default version it might be nice to make that 
> implicit. We can then avoid representing and subsequently diagnosing invalid 
> files. This could then be:
> 
> ```
> foo:arch=+v
> bar:arch=+zba,+zbb#+v
> ```

Great idea! This will make the table simpler.
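
To make the simplified format concrete, here is a minimal sketch of how
one table line could be expanded back into target_clones arguments with
"default" added implicitly. It assumes the "name:version#version" syntax
discussed above; CloneEntry, parse_clone_line and to_attribute_args are
hypothetical names for illustration and are not taken from the patch.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// One parsed table entry (hypothetical type, not part of GCC).
struct CloneEntry {
  std::string asm_name;               // text before the first ':'
  std::vector<std::string> versions;  // always begins with "default"
};

// Parse "name:ver1#ver2...". Returns nullopt for a malformed line
// (no ':', empty name, or an empty version between '#' separators).
std::optional<CloneEntry> parse_clone_line(const std::string &line) {
  const std::size_t colon = line.find(':');
  if (colon == std::string::npos || colon == 0 || colon + 1 == line.size())
    return std::nullopt;

  CloneEntry entry;
  entry.asm_name = line.substr(0, colon);
  entry.versions.push_back("default");  // implicit, never written in the file

  std::size_t start = colon + 1;
  while (start <= line.size()) {
    std::size_t end = line.find('#', start);
    if (end == std::string::npos)
      end = line.size();
    const std::string version = line.substr(start, end - start);
    if (version.empty())
      return std::nullopt;
    if (version != "default")  // tolerate an explicit "default" without duplicating it
      entry.versions.push_back(version);
    start = end + 1;
  }
  return entry;
}

// Render the versions as they would appear inside target_clones(...).
std::string to_attribute_args(const CloneEntry &entry) {
  std::string out;
  for (std::size_t i = 0; i < entry.versions.size(); ++i) {
    if (i != 0)
      out += ",";
    out += '"' + entry.versions[i] + '"';
  }
  return out;
}
```

With this sketch, the line "foo:arch=+v" yields the same argument list
as the explicit attribute in the source example for foo.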

> 
> Additionally, I think ideally the file can express functions disambiguated by 
> file, signature, and namespace.
> I imagine we could use syntax similar to what gdb supports?
> 
> For example:
> 
> ```
> foo              |arch=+v
> bar(int, char)   |arch=+zba,+zbb
> file.C:baz(char) |arch=+zba,+zbb#arch=+v
> namespace::qux   |arch=+v
> ```

Also a great idea. However, I think it is not easy to implement in
GCC right now. I would welcome further feedback on whether GCC already
has a simple API for this, or on whether the community could implement
it later.

The motivation behind this idea is that I'm researching how to
auto-generate target_clones attributes for developers. Accepting only
the ASM name is enough to implement this.
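
As a rough illustration of why the ASM name alone can be sufficient: a
generating tool can key each line on the assembler name, which for a
C++ function such as void bar(int, char) is the mangled name _Z3baric
under the Itanium C++ ABI, and the consumer only needs an exact string
match. The sketch below assumes the simplified "name:versions" syntax
from this thread; find_versions is a hypothetical helper, not code from
the patch.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Return the version list for asm_name, or nullopt if the table has no
// line whose name part matches exactly. Lines without ':' are skipped.
std::optional<std::string> find_versions(const std::string &table,
                                         const std::string &asm_name) {
  std::istringstream in(table);
  std::string line;
  while (std::getline(in, line)) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos)
      continue;
    // Compare the whole name part, so "foo" does not match "foobar".
    if (line.compare(0, colon, asm_name) == 0)
      return line.substr(colon + 1);
  }
  return std::nullopt;
}
```

Because overloads and namespaces are already encoded in the mangled
name, no signature or scope parsing is needed on the compiler side in
this scheme.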

Thanks,
Yangyu Chen

> 
> Thanks,
> Alfie Richards
> 
>> Thanks,
>> Yangyu Chen
>>>> 
>>>> The primary objective of this patch series is to provide a
>>>> user-friendly way to specify target_clones attributes without
>>>> modifying the source code. This approach enhances the source code's
>>>> cleanliness, facilitates easier maintenance, and ensures portability
>>>> across different architectures and compiler versions.
>>>> 
>>>> The example usage of the target_clones profile option is detailed in
>>>> the commit message of the second patch.
>>>> 
>>>> I understand that this patch lacks comprehensive documentation and
>>>> test cases, as I am still in the process of writing them.
>>>> However, I would appreciate feedback on the implementation before
>>>> adding them. If the implementation is deemed acceptable, I
>>>> will proceed with writing the documentation and test cases.
>>>> 
>>>> Yangyu Chen (2):
>>>> Fortran: Do not make_decl_rtl in trans_function_start
>>>> Add target_clones profile option support
>>>> 
>>>> gcc/common.opt            |  7 +++++++
>>>> gcc/fortran/trans-decl.cc |  3 ---
>>>> gcc/multiple_target.cc    | 24 +++++++++++++++++++++++-
>>>> gcc/opts.cc               | 23 +++++++++++++++++++++++
>>>> 4 files changed, 53 insertions(+), 4 deletions(-)
>>>> 
>>>> -- 
>>>> 2.49.0
