> On 6 May 2025, at 16:01, Alfie Richards <alfie.richa...@arm.com> wrote:
>
> Hello,
>
> I like this idea. I have a couple of thoughts to add.
>
> On 05/05/2025 09:46, Yangyu Chen wrote:
>>> On 5 May 2025, at 16:34, Kyrylo Tkachov <ktkac...@nvidia.com> wrote:
>>>
>>>> On 4 May 2025, at 19:19, Yangyu Chen <c...@cyyself.name> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> This patch series introduces support for the target_clones profile
>>>> option in GCC. This option enables users to specify target_clones
>>>> attributes in a separate file, allowing GCC to generate multiple
>>>> versions of the function with different ISA extensions based on the
>>>> specified profile. This is achieved using the -ftarget-profile
>>>> option.
>>>
>>> Interesting idea, but the terminology is confusing as is.
>>> In GCC “profile” usually refers to an execution profile gathered through
>>> PGO instrumentation or perf.
>>> Whereas here I think you use “profile” to mean a “RISC-V profile” which is
>>> something like a set of target architecture extensions?
>>> Thanks,
>>> Kyrill
>>>
>> Sorry for the unclear information. The target profile here refers
>> to the target_clones attribute for each function. You can find an
>> example in the second patch.
>
> I was also confused by the naming initially. Maybe something like
> "-ffunction-clone-table" instead?
Indeed. I will change that in the next revision.
>
>> For instance, we want a function foo to generate default and RISC-V
>> vector targets, while a function bar should generate default and
>> zba,zbb targets. The corresponding source code could be as follows:
>> ```
>> __attribute__((target_clones("default","arch=+v")))
>> void foo();
>> __attribute__((target_clones("default","arch=+zba,+zbb")))
>> void bar();
>> ```
>> But if we have a target profile, we can describe it as follows in
>> a separate file:
>> ```
>> foo:default#arch=+v
>> bar:default#arch=+zba,+zbb
>> ```
>
> As every function needs the default version, it might be nice to make that
> implicit. We can then avoid representing, and subsequently diagnosing, invalid
> files. This could then be:
>
> ```
> foo:arch=+v
> bar:arch=+zba,+zbb#+v
> ```
Great idea! This will make the table simpler.
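A rough sketch of how the implicit default could be handled when reading
the table. This is not code from the patch: CloneEntry and
parse_clone_line are names made up for illustration, and the layout
(name, ':', then '#'-separated versions) is assumed from the examples above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of the patch: one parsed table line.
struct CloneEntry {
  std::string asm_name;
  std::vector<std::string> versions;
};

// Parse "name:version#version..." into OUT.
// "default" is always inserted first, so the file never has to spell it out.
// Returns false for lines without a ':' separator or with an empty name.
bool parse_clone_line(const std::string &line, CloneEntry &out) {
  std::string::size_type colon = line.find(':');
  if (colon == std::string::npos || colon == 0)
    return false;

  out.asm_name = line.substr(0, colon);
  out.versions = {"default"};

  std::string::size_type pos = colon + 1;
  while (pos <= line.size()) {
    std::string::size_type hash = line.find('#', pos);
    if (hash == std::string::npos)
      hash = line.size();
    std::string version = line.substr(pos, hash - pos);
    // Skip empty fields and an explicit "default" to avoid duplicates.
    if (!version.empty() && version != "default")
      out.versions.push_back(version);
    pos = hash + 1;
  }
  return true;
}
```

With this, an old-style line that still spells out "default" and a new-style
line without it would parse to the same list of versions.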
>
> Additionally, I think ideally the file can express functions disambiguated by
> file, signature, and namespace.
> I imagine we could use a syntax similar to the one gdb supports?
>
> For example:
>
> ```
> foo |arch=+v
> bar(int, char) |arch=+zba,+zbb
> file.C:baz(char) |arch=+zba,+zbb#arch=+v
> namespace::qux |arch=+v
> ```
Also a great idea. However, I don't think it is easy to implement in
GCC right now. I would welcome further feedback on whether GCC already
has a simple API for this, or whether the community would be willing to
implement it.
Part of the motivation behind this idea is that I'm researching how to
auto-generate target_clones attributes for developers. Accepting only
the ASM name is enough to implement that.
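To illustrate, the generator side of such a tool only needs to print one
line per assembler name. A minimal sketch, assuming the '#'-separated
layout from the examples in this thread; format_clone_line is a
hypothetical name and is not part of the patch:

```cpp
#include <string>
#include <vector>

// Hypothetical generator helper: build one table line from an assembler
// name and the non-default versions chosen for it.
std::string format_clone_line(const std::string &asm_name,
                              const std::vector<std::string> &versions) {
  std::string line = asm_name + ":";
  for (std::vector<std::string>::size_type i = 0; i < versions.size(); ++i) {
    if (i != 0)
      line += '#';  // '#' separates versions; ',' is used inside a version.
    line += versions[i];
  }
  return line;
}
```

For a C++ function such as bar(int, char), the assembler name on targets
using the Itanium C++ ABI is the mangled name _Z3baric, so the tool would
emit a line starting with "_Z3baric:" without having to understand
signatures or namespaces.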
Thanks,
Yangyu Chen
>
> Thanks,
> Alfie Richards
>
>> Thanks,
>> Yangyu Chen
>>>>
>>>> The primary objective of this patch series is to provide a
>>>> user-friendly way to specify target_clones attributes without
>>>> modifying the source code. This approach enhances the source code's
>>>> cleanliness, facilitates easier maintenance, and ensures portability
>>>> across different architectures and compiler versions.
>>>>
>>>> The example usage of the target_clones profile option is detailed in
>>>> the commit message of the second patch.
>>>>
>>>> I understand that this patch lacks comprehensive documentation and
>>>> test cases, as I am still in the process of writing them.
>>>> However, I would appreciate feedback on the implementation before
>>>> adding them. If the implementation is deemed acceptable, I
>>>> will proceed with writing the documentation and test cases.
>>>>
>>>> Yangyu Chen (2):
>>>>   Fortran: Do not make_decl_rtl in trans_function_start
>>>>   Add target_clones profile option support
>>>>
>>>>  gcc/common.opt            |  7 +++++++
>>>>  gcc/fortran/trans-decl.cc |  3 ---
>>>>  gcc/multiple_target.cc    | 24 +++++++++++++++++++++++-
>>>>  gcc/opts.cc               | 23 +++++++++++++++++++++++
>>>>  4 files changed, 53 insertions(+), 4 deletions(-)
>>>>
>>>> --
>>>> 2.49.0