> -----Original Message-----
> From: [email protected] <[email protected]>
> Sent: 25 August 2025 12:00
> To: [email protected]
> Cc: [email protected]; [email protected]; [email protected]
> Subject: [v3 PATCH 0/6] aarch64: Support for user-defined aarch64 tuning
>
> From: Soumya AR <[email protected]>
>
> Hi,
>
> This is a follow-up to suggestions given on v2 of the aarch64
> user-defined tunings.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689893.html
> Attaching an updated patch series here.
>
> Regarding some questions on this thread:
>
> I've made the changes suggested by Andrew and Richard. Thanks a lot for
> taking
> a look!
>
> > Maybe you could do:
> > ```
> > template<typename T>
> > using serialize_func_type = std::unique_ptr<json::object>
> > (*serialize_func) (const typename std::remove_pointer<T>::type &);
> >
> > serialize_object_helper (
> > const T &member, serialize_func_type<T> serialize_func)
> > ```
> > Since std::remove_pointer<T>::type will T if T is not a pointer.
>
> Right, this is much better. I've updated both serialize_object_helper and
> parse_object_helper to do this.
>
> > > + function_map = {}
> > > + for path_key, (path, obj_schema) in all_objects_with_paths.items():
> > > + filtered_path = [p for p in path if p != "tune_params"]
> > > + if filtered_path:
> > > + full_name = "_".join(filtered_path)
> > > + function_map[path_key] = f"{operation}_{full_name}"
> > > + else:
> > > + function_map[path_key] = f"{operation}_{path_key}"
> >
> > Are the 6 lines above equivalent to:
> >
> > if len(path) > 1:
> > full_name = "_".join(path[1:])
> > function_map[path_key] = f"{operation}_{full_name}"
> > else:
> > function_map[path_key] = f"{operation}_{path_key}"
> >
> > ? If so, that might be easier to follow, since it emphasises that
> > we're structurally ignoring path[0] for all cases. It looked at first
> > like the filtering might remove something other than the first element.
>
> I actually had this check to avoid the case for when we encounter the
> "tune_params" key. On a second look, I realized I was taking care of that case
> earlier anyway, so the check isn't needed at all. Now it's just:
>
> for path_key, (path, obj_schema) in all_objects_with_paths.items():
> if path:
> full_name = "_".join(path)
> function_map[path_key] = f"{operation}_{full_name}"
> else:
> function_map[path_key] = f"{operation}_{path_key}"
>
>
> > How is memory management handled (both here and for pointers to
> > structures)? Do we just allocate and never free? If so, that's ok for
> > command-line options, but I wonder whether we'll ewventually be using
> > this on a per-function basis. (Hopefully not though...)
>
> About this, I addressed it earlier in a separate reply:
> https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689902.html
>
> Pasting the relevant information from that thread.
>
> Possible solutions:
> 1. Simply don't bother freeing up the memory and let the memory leak be :)
> 2. Use the GCC garbage collector
> 3. Remove the pointers and have all the nested structures be embedded since
> I'm not too sure why this is inconsistent in tune_params.
> 4. (The current approach) Have each temporary structure be a static local
> variable. This works, but operates under the assumption that nested
> structures
> in tune_params will always have unique types, since the variable declaration
> is
> a template that is resolved by the type of the nested structures. There is an
> extra check added to ensure we don't overwrite accidentally, in the case we
> ever end up having structure members of the same type. This is still a bit
> ugly
> though, so I would appreciate if there are other suggestions on how to handle
> this.
>
> > I agree with Andrew that it would be nice in principle to be able
> > to autogenerate this. I'm personally happy without that while it
> > remains this small, but I think it would at least be worth sharing
> > this between the parsing and serialising code, rather than having
> > separate lists for each.
>
> Auto-generating this would not be a problem, but my main agenda with the
> script
> was to have a setup that was only dependent on the JSON schema and not
> have any
> actual references to the tuning parameters. Essentially, make the script
> as agnostic of the tuning parameters as possible, so anyone modifying the
> tuning
> paramters does not need to touch the script (and only modify the schema).
>
> Extending that philosophy to the enums, I now have the following setup:
>
> I've centralized the tuning enum definitions into a new aarch64-tuning-
> enums.def
> file. The enums (autoprefetch_model and ldp_stp_policy) are now defined
> once
> using macros and consumed by both the core definitions (aarch64-opts.h,
> aarch64-protos.h) and the generated parser/printer code. This ensures that if
> someone wants to add a new enum value, they only need to make
> modifications in
> the .def file. The codegen from the script refers to the same enums.
>
> If this is excessive, I can just have the script directly generate the enum to
> string maps. Let me know what you think.
>
> > Could you explain the reason for Negative(fdump-tuning-model=) and
> ToLower?
>
> ToLower is incorrect here, I've removed it. We should accept any filename for
> the user provided files. Thanks for pointing this out.
>
> As I understand it, Negative() will prune all earlier instances of the option,
> yes? I suppose in any case, even if I don't specify Negative(), then the
> second
> fdump-tuning-model= would override the first one? At that point, it seems
> better
> to not expose the earlier option at all..?
>
> Similar logic for Negative(muser-provided-CPU=), where we would only like to
> parse one file.
>
> > The option should be documented in invoke.texi, or marked Undocumented
> if
> > we really don't want to put it in the manual.
>
> Not too sure about this one. I've marked it Undocumented for now but I can
> see
> intances of options that are marked Undocumented but still specified in
> invoke.texi. Should we document its usage anyway in invoke.texi?
> Also, should we have something similar to -moverride, where we specify that
> this
> feature is only for power users?
Hi Soumya,
I'm still catching up on the series. So I'll come back to the questions above
once I finish going through all the patches.
But Yeah I think we want something like this to indicate it's for users whom
really know what they're doing.
Now to that end as I'm going through the series I had a question.
What's the interaction model here between this option and push/pop
pragmas or attributes that set a tuning/mcpu.
At the moment the flag is handled in override_options which means
It will override all cost models.
e.g.
__attribute__ ((target("cpu=neoverse-n1")))
void foo ()
{}
Would be overridden. Is this the intention, or did we only want to
override the default/global cost model with this option?
Thanks,
Tamar
>
> CCing Richard's personal email here (in case you would still like to give
> feedback on this). Thank you for all your help with this feature, as well as
> all
> of my other work! :)
>
> Best,
> Soumya
>
> Soumya AR (6):
> aarch64 + arm: Remove const keyword from tune_params members and
> nested members
> aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
> json: Add get_map() method to JSON object class
> aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
> aarch64: Regression tests for parsing of user-provided AArch64 CPU
> tuning parameters
> aarch64: Script to auto generate JSON tuning routines
>
> gcc/config.gcc | 2 +-
> .../aarch64-generate-json-tuning-routines.py | 374 +++++++++++++
> gcc/config/aarch64/aarch64-json-schema.h | 261 +++++++++
> .../aarch64-json-tunings-parser-generated.inc | 355 ++++++++++++
> .../aarch64/aarch64-json-tunings-parser.cc | 527 ++++++++++++++++++
> .../aarch64/aarch64-json-tunings-parser.h | 29 +
> ...aarch64-json-tunings-printer-generated.inc | 439 +++++++++++++++
> .../aarch64/aarch64-json-tunings-printer.cc | 137 +++++
> .../aarch64/aarch64-json-tunings-printer.h | 28 +
> gcc/config/aarch64/aarch64-opts.h | 6 +-
> gcc/config/aarch64/aarch64-protos.h | 169 +++---
> gcc/config/aarch64/aarch64-tuning-enums.def | 37 ++
> gcc/config/aarch64/aarch64.cc | 20 +
> gcc/config/aarch64/aarch64.opt | 8 +
> gcc/config/aarch64/t-aarch64 | 21 +
> gcc/config/arm/aarch-common-protos.h | 128 ++---
> gcc/json.h | 26 +-
> gcc/selftest-run-tests.cc | 1 +
> gcc/selftest.h | 1 +
> .../aarch64-json-tunings.exp | 35 ++
> .../aarch64/aarch64-json-tunings/boolean-1.c | 6 +
> .../aarch64-json-tunings/boolean-1.json | 9 +
> .../aarch64/aarch64-json-tunings/boolean-2.c | 7 +
> .../aarch64-json-tunings/boolean-2.json | 9 +
> .../aarch64-json-tunings/empty-brackets.c | 6 +
> .../aarch64-json-tunings/empty-brackets.json | 1 +
> .../aarch64/aarch64-json-tunings/empty.c | 6 +
> .../aarch64/aarch64-json-tunings/empty.json | 0
> .../aarch64/aarch64-json-tunings/enum-1.c | 8 +
> .../aarch64/aarch64-json-tunings/enum-1.json | 7 +
> .../aarch64/aarch64-json-tunings/enum-2.c | 7 +
> .../aarch64/aarch64-json-tunings/enum-2.json | 7 +
> .../aarch64/aarch64-json-tunings/integer-1.c | 7 +
> .../aarch64-json-tunings/integer-1.json | 6 +
> .../aarch64/aarch64-json-tunings/integer-2.c | 7 +
> .../aarch64-json-tunings/integer-2.json | 6 +
> .../aarch64/aarch64-json-tunings/integer-3.c | 7 +
> .../aarch64-json-tunings/integer-3.json | 5 +
> .../aarch64/aarch64-json-tunings/integer-4.c | 6 +
> .../aarch64-json-tunings/integer-4.json | 5 +
> .../aarch64/aarch64-json-tunings/string-1.c | 8 +
> .../aarch64-json-tunings/string-1.json | 7 +
> .../aarch64/aarch64-json-tunings/string-2.c | 7 +
> .../aarch64-json-tunings/string-2.json | 5 +
> .../aarch64/aarch64-json-tunings/test-all.c | 58 ++
> .../aarch64-json-tunings/test-all.json | 39 ++
> .../aarch64-json-tunings/unidentified-key.c | 6 +
> .../unidentified-key.json | 5 +
> 48 files changed, 2705 insertions(+), 156 deletions(-)
> create mode 100644 gcc/config/aarch64/aarch64-generate-json-tuning-
> routines.py
> create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser-
> generated.inc
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer-
> generated.inc
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
> create mode 100644 gcc/config/aarch64/aarch64-tuning-enums.def
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/aarch64-json-tunings.exp
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-1.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/boolean-2.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty-brackets.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty-brackets.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/empty.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-1.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/enum-2.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-1.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-2.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-3.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-3.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-4.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/integer-4.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-1.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-2.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/string-2.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/test-all.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/test-all.json
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/unidentified-key.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-
> tunings/unidentified-key.json
>
> --
> 2.44.0