Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Thursday, January 16, 2025 7:11 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions >> for unknown non-homogenous systems [PR113257] >> >> Richard Sandiford writes: >> > Tamar Christina writes: >> >> Ok for master? and how do you feel about a backport for the two patches to >> help >> >> distros? >> > >> > Backporting to GCC 14 & GCC 13 sounds good. Not so sure about GCC 12, >> > since I think we should be extra cautious with the "most stable" branch, >> > but let's see what others think. >> > >> > OK for trunk, and for GCC 14 & 13 after a grace period, with one >> > trivial nit below: >> >> Sorry, was concentrating too much on the -mcpu vs. -march preemption >> thing and forgot to think about other aspects of the patch. The routine >> is used for all three of -march=native, -mcpu=native, and -mtune=native, >> so I think we want something like the following on top of your patch >> (untested so far). >> > > Cool, how's this one? > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? and for backport to GCC 13 and 14? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR target/113257 > * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New. > (host_detect_local_cpu): Use it. > > gcc/testsuite/ChangeLog: > > PR target/113257 > * gcc.target/aarch64/cpunative/info_34: New test. > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > * gcc.target/aarch64/cpunative/info_35: New test. > * gcc.target/aarch64/cpunative/native_cpu_35.c: New test. > > Co-authored-by: Richard Sandiford > > -- inline copy of patch -- > > diff --git a/gcc/config/aarch64/driver-aarch64.cc > b/gcc/config/aarch64/driver-aarch64.cc > index > 45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d > 100644 > --- a/gcc/config/aarch64/driver-aarch64.cc > +++ b/gcc/config/aarch64/driver-aarch64.cc > @@ -60,6 +60,7 @@ struct aarch64_core_data > #define ALL_VARIANTS ((unsigned)-1) > /* Default architecture to use if -mcpu=native did not detect a known CPU. > */ > #define DEFAULT_ARCH "8A" > +#define DEFAULT_CPU "generic-armv8-a" > > #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, > PART, VARIANT) \ >{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT }, > @@ -106,6 +107,21 @@ get_arch_from_id (const char* id) >return NULL; > } > > +/* Return an aarch64_core_data for the cpu described > + by ID, or NULL if ID describes something we don't know about. */ > + > +static const aarch64_core_data * > +get_cpu_from_id (const char* name) > +{ > + for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) > +{ > + if (strcmp (name, aarch64_cpu_data[i].name) == 0) > + return &aarch64_cpu_data[i]; > +} Redundant braces, the convention says: for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) if (strcmp (name, aarch64_cpu_data[i].name) == 0) return &aarch64_cpu_data[i]; OK with that change, thanks, and sorry for back-tracking on the original ack. Richard > + > + return NULL; > +} > + > /* Check wether the CORE array is the same as the big.LITTLE BL_CORE. > For an example CORE={0xd08, 0xd03} and > BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true. */ > @@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv) > || variants[0] == aarch64_cpu_data[i].variant)) > break; > > - if (aarch64_cpu_data[i].name == NULL) > + if (arch) > { > - auto arch_info = get_arch_from_id (DEFAULT_ARCH); > - > - gcc_assert (arch_info); > - > - res = concat ("-march=", arch_info->name, NULL); > - default_flags = arch_info->flags; > - } > - else if (arch) > - { > - const char *arch_id = aarch64_cpu_data[i].arch; > + const char *arch_id = (aarch64_cpu_data[i].name > + ? aarch64_cpu_data[i].arch > + : DEFAULT_ARCH); > auto arch_info = get_arch_from_id (arch_id); > > /* We got some arch indentifier that's not in aarch64-arches.def? */ > @@ -424,12 +433,15 @@ host_detect_l
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 16, 2025 7:11 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > Richard Sandiford writes: > > Tamar Christina writes: > >> Ok for master? and how do you feel about a backport for the two patches to > help > >> distros? > > > > Backporting to GCC 14 & GCC 13 sounds good. Not so sure about GCC 12, > > since I think we should be extra cautious with the "most stable" branch, > > but let's see what others think. > > > > OK for trunk, and for GCC 14 & 13 after a grace period, with one > > trivial nit below: > > Sorry, was concentrating too much on the -mcpu vs. -march preemption > thing and forgot to think about other aspects of the patch. The routine > is used for all three of -march=native, -mcpu=native, and -mtune=native, > so I think we want something like the following on top of your patch > (untested so far). > Cool, how's this one? Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? and for backport to GCC 13 and 14? Thanks, Tamar gcc/ChangeLog: PR target/113257 * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New. (host_detect_local_cpu): Use it. gcc/testsuite/ChangeLog: PR target/113257 * gcc.target/aarch64/cpunative/info_34: New test. * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. * gcc.target/aarch64/cpunative/info_35: New test. * gcc.target/aarch64/cpunative/native_cpu_35.c: New test. Co-authored-by: Richard Sandiford -- inline copy of patch -- diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc index 45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d 100644 --- a/gcc/config/aarch64/driver-aarch64.cc +++ b/gcc/config/aarch64/driver-aarch64.cc @@ -60,6 +60,7 @@ struct aarch64_core_data #define ALL_VARIANTS ((unsigned)-1) /* Default architecture to use if -mcpu=native did not detect a known CPU. */ #define DEFAULT_ARCH "8A" +#define DEFAULT_CPU "generic-armv8-a" #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART, VARIANT) \ { CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT }, @@ -106,6 +107,21 @@ get_arch_from_id (const char* id) return NULL; } +/* Return an aarch64_core_data for the cpu described + by ID, or NULL if ID describes something we don't know about. */ + +static const aarch64_core_data * +get_cpu_from_id (const char* name) +{ + for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) +{ + if (strcmp (name, aarch64_cpu_data[i].name) == 0) + return &aarch64_cpu_data[i]; +} + + return NULL; +} + /* Check wether the CORE array is the same as the big.LITTLE BL_CORE. For an example CORE={0xd08, 0xd03} and BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true. */ @@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv) || variants[0] == aarch64_cpu_data[i].variant)) break; - if (aarch64_cpu_data[i].name == NULL) + if (arch) { - auto arch_info = get_arch_from_id (DEFAULT_ARCH); - - gcc_assert (arch_info); - - res = concat ("-march=", arch_info->name, NULL); - default_flags = arch_info->flags; - } - else if (arch) - { - const char *arch_id = aarch64_cpu_data[i].arch; + const char *arch_id = (aarch64_cpu_data[i].name +? aarch64_cpu_data[i].arch +: DEFAULT_ARCH); auto arch_info = get_arch_from_id (arch_id); /* We got some arch indentifier that's not in aarch64-arches.def? */ @@ -424,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv) res = concat ("-march=", arch_info->name, NULL); default_flags = arch_info->flags; } - else + else if (cpu || aarch64_cpu_data[i].name) { - default_flags = aarch64_cpu_data[i].flags; + auto cpu_info = (aarch64_cpu_data[i].name + ? &aarch64_cpu_data[i] + : get_cpu_from_id (DEFAULT_CPU)); + default_flags = cpu_info->flags; res = concat ("-m", cpu ? "cpu" : "tune", "=", - aarch64_cpu_data[i].name, + cpu_info->name, NULL); } } @@ -449,6 +461,20 @@ host_detect_local_cpu (int argc,
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Richard Sandiford writes: > Tamar Christina writes: >> Ok for master? and how do you feel about a backport for the two patches to >> help >> distros? > > Backporting to GCC 14 & GCC 13 sounds good. Not so sure about GCC 12, > since I think we should be extra cautious with the "most stable" branch, > but let's see what others think. > > OK for trunk, and for GCC 14 & 13 after a grace period, with one > trivial nit below: Sorry, was concentrating too much on the -mcpu vs. -march preemption thing and forgot to think about other aspects of the patch. The routine is used for all three of -march=native, -mcpu=native, and -mtune=native, so I think we want something like the following on top of your patch (untested so far). Richard diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc index 731b5e48687..26ba2cd6f88 100644 --- a/gcc/config/aarch64/driver-aarch64.cc +++ b/gcc/config/aarch64/driver-aarch64.cc @@ -419,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv) || variants[0] == aarch64_cpu_data[i].variant)) break; - if (aarch64_cpu_data[i].name == NULL) + if (arch) { - auto cpu_info = get_cpu_from_id (DEFAULT_CPU); - - gcc_assert (cpu_info); - - res = concat ("-mcpu=", cpu_info->name, NULL); - default_flags = cpu_info->flags; - } - else if (arch) - { - const char *arch_id = aarch64_cpu_data[i].arch; + const char *arch_id = (aarch64_cpu_data[i].name +? aarch64_cpu_data[i].arch +: DEFAULT_ARCH); auto arch_info = get_arch_from_id (arch_id); /* We got some arch indentifier that's not in aarch64-arches.def? */ @@ -440,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv) res = concat ("-march=", arch_info->name, NULL); default_flags = arch_info->flags; } - else + else if (cpu || aarch64_cpu_data[i].name) { - default_flags = aarch64_cpu_data[i].flags; + auto cpu_info = (aarch64_cpu_data[i].name + ? &aarch64_cpu_data[i] + : get_cpu_from_id (DEFAULT_CPU)); + default_flags = cpu_info->flags; res = concat ("-m", cpu ? "cpu" : "tune", "=", - aarch64_cpu_data[i].name, + cpu_info->name, NULL); } } @@ -469,7 +465,7 @@ host_detect_local_cpu (int argc, const char **argv) /* On big.LITTLE if we find any unknown CPUs we can still pick arch features as the cores should have the same features. So just pick the feature flags from any of the cpus. */ - if (aarch64_cpu_data[i].name == NULL) + if (cpu && aarch64_cpu_data[i].name == NULL) { auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
On Wed, 2025-01-15 at 18:54 +, Tamar Christina wrote: > > Re-reading again I realize I misread cache size from your question > with cache line size. > > Cache size can be whatever yes. Cache line size must match. > > But that doesn't change the fact that this patch is correct. Yes it's correct, having -march=native is at least better than not having it. I'm just providing some info so if your big.LITTLE user complains "hey when I bootstrap GCC with BOOT_CFLAGS=-O3 -march=native I get a comparison error" you can just point they to PR111768. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Re-reading again I realize I misread cache size from your question with cache line size. Cache size can be whatever yes. Cache line size must match. But that doesn't change the fact that this patch is correct. Thanks, Tamar From: Tamar Christina Sent: Wednesday, January 15, 2025 2:45 PM To: Xi Ruoyao ; gcc-patches@gcc.gnu.org Cc: nd ; Richard Earnshaw ; ktkac...@gcc.gnu.org ; Richard Sandiford Subject: RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257] > -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:40 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote: > > > > > > > -Original Message- > > > From: Xi Ruoyao > > > Sent: Wednesday, January 15, 2025 1:29 PM > > > To: Tamar Christina ; gcc-patches@gcc.gnu.org > > > Cc: nd ; Richard Earnshaw ; > > > ktkac...@gcc.gnu.org; Richard Sandiford > > > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture > extensions > > > for unknown non-homogenous systems [PR113257] > > > > > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote: > > > > However the same thing works for big.LITTLE as in such system the cores > must > > > > have the same extensions otherwise it doesn't fundamentally work. > > > > > > > > i.e. task migration from one core to the other wouldn't work. > > > > > > See https://gcc.gnu.org/PR111768 for a potential issue. > > > > > > > There's no issue there. > > As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7 > > > > It is not practical for you to have two CPUs in the same system with > > different > capabilities. > > Your kernel will not work, or you have to disable task migration. > > > > The same extends to cache sizes and other properties of the CPU. > > On Intel CPUs it's common that the L1 cache size is different on P and E > cores. The generated code still works but it may be suboptimal if it's > compiled on an E core but the executed on a P core or vice versa. > > Is it different on AArch64? I'd be really surprised if big and LITTLE > must have the same L1 cache size... Yes, Because if the sizes mismatch you break data cache operations. One example is DC ZVA [1] Where the operation clears whole cachelines at a time. This value is cached when doing a long clear operation, such as large memset of 0. Technically speaking the size of VA doesn't have to match the cache size, but practically it always does since otherwise the operation is a lot less efficient. This is an assumption also made in e.g. glibc's memset. If you migrate tasks between the reading of the value and the end of the set you may end up overwriting or not clearing enough. In addition, the Linux kernel will report the smallest value over all the cores so that software gets a coherent value: [2], because again, software can't work effectively when the sizes don't match. And lastly the big.LITTLE programming model explicitly states that the cores must be coherent [3]. This patch does not change the cache sizes anyway as those are determined by the generic cost model, which are a conservative safe minimum that work on all Arm cores. But coherency property is also what dictates that the patch is correct, because as a requirement of big.LITTLE you must be able to seamlessly transfer a program from one core to another. This cannot happen if they vary in capabilities. Regards, Tamar [1] https://developer.arm.com/documentation/102670/0301/AArch64-registers/AArch64-register-descriptions/AArch64-System-instruction-register-description/DC-ZVA--Data-Cache-Zero-by-VA [2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1227904.html [3] https://developer.arm.com/documentation/dai0424/a/Power-down-and-power-up-considerations/High-level-considerations/Coherency-in-a-big-LITTLE-system > > > As an aside, today the big.LITTLE selection already enforces a uniform > > feature set. > > -- > Xi Ruoyao > School of Aerospace Science and Technology, Xidian University
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, January 15, 2025 3:23 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > Tamar Christina writes: > > Ok for master? and how do you feel about a backport for the two patches to > > help > > distros? > > Backporting to GCC 14 & GCC 13 sounds good. Not so sure about GCC 12, > since I think we should be extra cautious with the "most stable" branch, > but let's see what others think. > > OK for trunk, and for GCC 14 & 13 after a grace period, with one > trivial nit below: Thanks, I should have indicated that I'm not pushing for GCC 12. So I'm happy with just GCC 13 and 14. Cheers, Tamar > > > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > PR target/113257 > > * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): > New. > > (host_detect_local_cpu): Use it. > > > > gcc/testsuite/ChangeLog: > > > > PR target/113257 > > * gcc.target/aarch64/cpunative/info_34: New test. > > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > > * gcc.target/aarch64/cpunative/info_35: New test. > > * gcc.target/aarch64/cpunative/native_cpu_35.c: New test. > > > > -- inline patch -- > > > > diff --git a/gcc/config/aarch64/driver-aarch64.cc > > b/gcc/config/aarch64/driver- > aarch64.cc > > index > 45fce67a646351b848b7cd7d0fd35d343731c0d1..731b5e486879d3670ad0feb > 02906c6d6bdbb1f01 100644 > > --- a/gcc/config/aarch64/driver-aarch64.cc > > +++ b/gcc/config/aarch64/driver-aarch64.cc > > @@ -60,6 +60,7 @@ struct aarch64_core_data > > #define ALL_VARIANTS ((unsigned)-1) > > /* Default architecture to use if -mcpu=native did not detect a known CPU. > > */ > > #define DEFAULT_ARCH "8A" > > +#define DEFAULT_CPU "generic-armv8-a" > > > > #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, > COSTS, IMP, PART, VARIANT) \ > >{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT > }, > > @@ -106,6 +107,21 @@ get_arch_from_id (const char* id) > >return NULL; > > } > > > > +/* Return an aarch64_core_data for the cpu described > > + by ID, or NULL if ID describes something we don't know about. */ > > + > > +static const aarch64_core_data * > > +get_cpu_from_id (const char* name) > > +{ > > + for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) > > +{ > > + if (strcmp (name, aarch64_cpu_data[i].name) == 0) > > + return &aarch64_cpu_data[i]; > > +} > > Redundant braces, should be: > > for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) > if (strcmp (name, aarch64_cpu_data[i].name) == 0) > return &aarch64_cpu_data[i]; > > Thanks, > Richard > > > + > > + return NULL; > > +} > > + > > /* Check wether the CORE array is the same as the big.LITTLE BL_CORE. > > For an example CORE={0xd08, 0xd03} and > > BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true. */ > > @@ -405,12 +421,12 @@ host_detect_local_cpu (int argc, const char **argv) > > > >if (aarch64_cpu_data[i].name == NULL) > > { > > - auto arch_info = get_arch_from_id (DEFAULT_ARCH); > > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU); > > > > - gcc_assert (arch_info); > > + gcc_assert (cpu_info); > > > > - res = concat ("-march=", arch_info->name, NULL); > > - default_flags = arch_info->flags; > > + res = concat ("-mcpu=", cpu_info->name, NULL); > > + default_flags = cpu_info->flags; > > } > >else if (arch) > > { > > @@ -449,6 +465,20 @@ host_detect_local_cpu (int argc, const char **argv) > > break; > > } > > } > > + > > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch > > +features as the cores should have the same features. So just pick > > +the feature flags from any of the cpus. */ > > + if (aarch64_cpu_data[i].name == NULL) > > + { > > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU); > > + > > + gcc_assert (cpu_info); > > + > > + res = concat ("-mcpu=", cpu_info->name, NULL); > >
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Tamar Christina writes: > Ok for master? and how do you feel about a backport for the two patches to > help > distros? Backporting to GCC 14 & GCC 13 sounds good. Not so sure about GCC 12, since I think we should be extra cautious with the "most stable" branch, but let's see what others think. OK for trunk, and for GCC 14 & 13 after a grace period, with one trivial nit below: > > Thanks, > Tamar > > gcc/ChangeLog: > > PR target/113257 > * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New. > (host_detect_local_cpu): Use it. > > gcc/testsuite/ChangeLog: > > PR target/113257 > * gcc.target/aarch64/cpunative/info_34: New test. > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > * gcc.target/aarch64/cpunative/info_35: New test. > * gcc.target/aarch64/cpunative/native_cpu_35.c: New test. > > -- inline patch -- > > diff --git a/gcc/config/aarch64/driver-aarch64.cc > b/gcc/config/aarch64/driver-aarch64.cc > index > 45fce67a646351b848b7cd7d0fd35d343731c0d1..731b5e486879d3670ad0feb02906c6d6bdbb1f01 > 100644 > --- a/gcc/config/aarch64/driver-aarch64.cc > +++ b/gcc/config/aarch64/driver-aarch64.cc > @@ -60,6 +60,7 @@ struct aarch64_core_data > #define ALL_VARIANTS ((unsigned)-1) > /* Default architecture to use if -mcpu=native did not detect a known CPU. > */ > #define DEFAULT_ARCH "8A" > +#define DEFAULT_CPU "generic-armv8-a" > > #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, > PART, VARIANT) \ >{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT }, > @@ -106,6 +107,21 @@ get_arch_from_id (const char* id) >return NULL; > } > > +/* Return an aarch64_core_data for the cpu described > + by ID, or NULL if ID describes something we don't know about. */ > + > +static const aarch64_core_data * > +get_cpu_from_id (const char* name) > +{ > + for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) > +{ > + if (strcmp (name, aarch64_cpu_data[i].name) == 0) > + return &aarch64_cpu_data[i]; > +} Redundant braces, should be: for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++) if (strcmp (name, aarch64_cpu_data[i].name) == 0) return &aarch64_cpu_data[i]; Thanks, Richard > + > + return NULL; > +} > + > /* Check wether the CORE array is the same as the big.LITTLE BL_CORE. > For an example CORE={0xd08, 0xd03} and > BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true. */ > @@ -405,12 +421,12 @@ host_detect_local_cpu (int argc, const char **argv) > >if (aarch64_cpu_data[i].name == NULL) > { > - auto arch_info = get_arch_from_id (DEFAULT_ARCH); > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU); > > - gcc_assert (arch_info); > + gcc_assert (cpu_info); > > - res = concat ("-march=", arch_info->name, NULL); > - default_flags = arch_info->flags; > + res = concat ("-mcpu=", cpu_info->name, NULL); > + default_flags = cpu_info->flags; > } >else if (arch) > { > @@ -449,6 +465,20 @@ host_detect_local_cpu (int argc, const char **argv) > break; > } > } > + > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch > + features as the cores should have the same features. So just pick > + the feature flags from any of the cpus. */ > + if (aarch64_cpu_data[i].name == NULL) > + { > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU); > + > + gcc_assert (cpu_info); > + > + res = concat ("-mcpu=", cpu_info->name, NULL); > + default_flags = cpu_info->flags; > + } > + >if (!res) > goto not_found; > } > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 > b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 > new file mode 100644 > index > ..61cb254785a4b9ec19ebe388402223c9a82af7ed > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 > @@ -0,0 +1,18 @@ > +processor: 0 > +BogoMIPS : 100.00 > +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 > fphp asimdhp fcma > +CPU implementer : 0xfe > +CPU architecture: 8 > +CPU variant : 0x0 > +CPU part : 0xd08 > +CPU revision : 2 > + > +processor: 0 > +BogoMIPS : 100.00 > +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 > fphp asimdhp fcma > +CPU implementer : 0xfe > +CPU architecture: 8 > +CPU variant : 0x0 > +CPU part : 0xd09 > +CPU revision : 2 > + > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_35 > b/gcc/testsuite/gcc.target/aarch64/cpunative/info_35 > new file mode 100644 > index > ..61cb254785a4b9ec19ebe388402223c9a82af7ed > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_35 > @@ -0,0 +1,18 @@ > +processor: 0 > +BogoMIPS : 100.00 > +Features : fp a
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:40 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote: > > > > > > > -Original Message- > > > From: Xi Ruoyao > > > Sent: Wednesday, January 15, 2025 1:29 PM > > > To: Tamar Christina ; gcc-patches@gcc.gnu.org > > > Cc: nd ; Richard Earnshaw ; > > > ktkac...@gcc.gnu.org; Richard Sandiford > > > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture > extensions > > > for unknown non-homogenous systems [PR113257] > > > > > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote: > > > > However the same thing works for big.LITTLE as in such system the cores > must > > > > have the same extensions otherwise it doesn't fundamentally work. > > > > > > > > i.e. task migration from one core to the other wouldn't work. > > > > > > See https://gcc.gnu.org/PR111768 for a potential issue. > > > > > > > There's no issue there. > > As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7 > > > > It is not practical for you to have two CPUs in the same system with > > different > capabilities. > > Your kernel will not work, or you have to disable task migration. > > > > The same extends to cache sizes and other properties of the CPU. > > On Intel CPUs it's common that the L1 cache size is different on P and E > cores. The generated code still works but it may be suboptimal if it's > compiled on an E core but the executed on a P core or vice versa. > > Is it different on AArch64? I'd be really surprised if big and LITTLE > must have the same L1 cache size... Yes, Because if the sizes mismatch you break data cache operations. One example is DC ZVA [1] Where the operation clears whole cachelines at a time. This value is cached when doing a long clear operation, such as large memset of 0. Technically speaking the size of VA doesn't have to match the cache size, but practically it always does since otherwise the operation is a lot less efficient. This is an assumption also made in e.g. glibc's memset. If you migrate tasks between the reading of the value and the end of the set you may end up overwriting or not clearing enough. In addition, the Linux kernel will report the smallest value over all the cores so that software gets a coherent value: [2], because again, software can't work effectively when the sizes don't match. And lastly the big.LITTLE programming model explicitly states that the cores must be coherent [3]. This patch does not change the cache sizes anyway as those are determined by the generic cost model, which are a conservative safe minimum that work on all Arm cores. But coherency property is also what dictates that the patch is correct, because as a requirement of big.LITTLE you must be able to seamlessly transfer a program from one core to another. This cannot happen if they vary in capabilities. Regards, Tamar [1] https://developer.arm.com/documentation/102670/0301/AArch64-registers/AArch64-register-descriptions/AArch64-System-instruction-register-description/DC-ZVA--Data-Cache-Zero-by-VA [2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1227904.html [3] https://developer.arm.com/documentation/dai0424/a/Power-down-and-power-up-considerations/High-level-considerations/Coherency-in-a-big-LITTLE-system > > > As an aside, today the big.LITTLE selection already enforces a uniform > > feature set. > > -- > Xi Ruoyao > School of Aerospace Science and Technology, Xidian University
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote: > > > > -Original Message- > > From: Xi Ruoyao > > Sent: Wednesday, January 15, 2025 1:29 PM > > To: Tamar Christina ; gcc-patches@gcc.gnu.org > > Cc: nd ; Richard Earnshaw ; > > ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture > > extensions > > for unknown non-homogenous systems [PR113257] > > > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote: > > > However the same thing works for big.LITTLE as in such system the cores > > > must > > > have the same extensions otherwise it doesn't fundamentally work. > > > > > > i.e. task migration from one core to the other wouldn't work. > > > > See https://gcc.gnu.org/PR111768 for a potential issue. > > > > There's no issue there. > As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7 > > It is not practical for you to have two CPUs in the same system with > different capabilities. > Your kernel will not work, or you have to disable task migration. > > The same extends to cache sizes and other properties of the CPU. On Intel CPUs it's common that the L1 cache size is different on P and E cores. The generated code still works but it may be suboptimal if it's compiled on an E core but the executed on a P core or vice versa. Is it different on AArch64? I'd be really surprised if big and LITTLE must have the same L1 cache size... > As an aside, today the big.LITTLE selection already enforces a uniform > feature set. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:29 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote: > > However the same thing works for big.LITTLE as in such system the cores must > > have the same extensions otherwise it doesn't fundamentally work. > > > > i.e. task migration from one core to the other wouldn't work. > > See https://gcc.gnu.org/PR111768 for a potential issue. > There's no issue there. As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7 It is not practical for you to have two CPUs in the same system with different capabilities. Your kernel will not work, or you have to disable task migration. The same extends to cache sizes and other properties of the CPU. As an aside, today the big.LITTLE selection already enforces a uniform feature set. Thanks, Tamar > -- > Xi Ruoyao > School of Aerospace Science and Technology, Xidian University
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote: > However the same thing works for big.LITTLE as in such system the cores must > have the same extensions otherwise it doesn't fundamentally work. > > i.e. task migration from one core to the other wouldn't work. See https://gcc.gnu.org/PR111768 for a potential issue. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 8:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > Richard Sandiford writes: > > Tamar Christina writes: > >>> -Original Message- > >>> From: Richard Sandiford > >>> Sent: Monday, January 13, 2025 6:35 PM > >>> To: Tamar Christina > >>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > >>> ; ktkac...@gcc.gnu.org > >>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture > extensions > >>> for unknown non-homogenous systems [PR113257] > >>> > >>> Tamar Christina writes: > >>> > Hi All, > >>> > > >>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability > for > >>> > -mcpu=native on unknown CPUs to still enable architecture extensions. > >>> > > >>> > This has worked great but was only added for homogenous systems. > >>> > > >>> > However the same thing works for big.LITTLE as in such system the cores > must > >>> > have the same extensions otherwise it doesn't fundamentally work. > >>> > > >>> > i.e. task migration from one core to the other wouldn't work. > >>> > > >>> > This extends the same handling to non-homogenous systems. > >>> > > >>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > >>> > > >>> > Ok for master? > >>> > > >>> > Thanks, > >>> > Tamar > >>> > > >>> > gcc/ChangeLog: > >>> > > >>> > PR target/113257 > >>> > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): > >>> > > >>> > gcc/testsuite/ChangeLog: > >>> > > >>> > PR target/113257 > >>> > * gcc.target/aarch64/cpunative/info_34: New test. > >>> > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > >>> > > >>> > --- > >>> > > >>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc > b/gcc/config/aarch64/driver- > >>> aarch64.cc > >>> > index > >>> > 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0 > >>> 3b15c09731e4f56e 100644 > >>> > --- a/gcc/config/aarch64/driver-aarch64.cc > >>> > +++ b/gcc/config/aarch64/driver-aarch64.cc > >>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) > >>> > break; > >>> > } > >>> > } > >>> > + > >>> > + /* On big.LITTLE if we find any unknown CPUs we can still pick > >>> > arch > >>> > +features as the cores should have the same features. So just > >>> > pick > >>> > +the feature flags from any of the cpus. */ > >>> > + if (aarch64_cpu_data[i].name == NULL) > >>> > + { > >>> > + auto arch_info = get_arch_from_id (DEFAULT_ARCH); > >>> > + > >>> > + gcc_assert (arch_info); > >>> > + > >>> > + res = concat ("-march=", arch_info->name, NULL); > >>> > + default_flags = arch_info->flags; > >>> > + } > >>> > + > >>> > >>> Currently, if gcc recognises the host cpu, and if one-thing is more > >>> restrictive than that cpu, gcc will warn on: > >>> > >>> gcc -march=one-thing -mcpu=native > >>> > >>> and choose one-thing. It looks like one consequence of this patch > >>> is that, for unrecognised big.LITTLE, the command line would get > >>> converted to: > >>> > >>> gcc -march=one-thing -march=above-replacement > >>> > >>> and so -mcpu=native would silently "win" over one-thing. Is that right? > >>> > >>> Perhaps we should adjust: > >>> > >>>" %{mcpu=native:% >>> > >>> to pass something like "cpu/arch" rather than "cpu" when -mar
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Richard Sandiford writes: > Tamar Christina writes: >>> -Original Message- >>> From: Richard Sandiford >>> Sent: Monday, January 13, 2025 6:35 PM >>> To: Tamar Christina >>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >>> ; ktkac...@gcc.gnu.org >>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture >>> extensions >>> for unknown non-homogenous systems [PR113257] >>> >>> Tamar Christina writes: >>> > Hi All, >>> > >>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for >>> > -mcpu=native on unknown CPUs to still enable architecture extensions. >>> > >>> > This has worked great but was only added for homogenous systems. >>> > >>> > However the same thing works for big.LITTLE as in such system the cores >>> > must >>> > have the same extensions otherwise it doesn't fundamentally work. >>> > >>> > i.e. task migration from one core to the other wouldn't work. >>> > >>> > This extends the same handling to non-homogenous systems. >>> > >>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >>> > >>> > Ok for master? >>> > >>> > Thanks, >>> > Tamar >>> > >>> > gcc/ChangeLog: >>> > >>> > PR target/113257 >>> > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): >>> > >>> > gcc/testsuite/ChangeLog: >>> > >>> > PR target/113257 >>> > * gcc.target/aarch64/cpunative/info_34: New test. >>> > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. >>> > >>> > --- >>> > >>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc >>> > b/gcc/config/aarch64/driver- >>> aarch64.cc >>> > index >>> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0 >>> 3b15c09731e4f56e 100644 >>> > --- a/gcc/config/aarch64/driver-aarch64.cc >>> > +++ b/gcc/config/aarch64/driver-aarch64.cc >>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) >>> > break; >>> > } >>> > } >>> > + >>> > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch >>> > + features as the cores should have the same features. So just pick >>> > + the feature flags from any of the cpus. */ >>> > + if (aarch64_cpu_data[i].name == NULL) >>> > + { >>> > + auto arch_info = get_arch_from_id (DEFAULT_ARCH); >>> > + >>> > + gcc_assert (arch_info); >>> > + >>> > + res = concat ("-march=", arch_info->name, NULL); >>> > + default_flags = arch_info->flags; >>> > + } >>> > + >>> >>> Currently, if gcc recognises the host cpu, and if one-thing is more >>> restrictive than that cpu, gcc will warn on: >>> >>> gcc -march=one-thing -mcpu=native >>> >>> and choose one-thing. It looks like one consequence of this patch >>> is that, for unrecognised big.LITTLE, the command line would get >>> converted to: >>> >>> gcc -march=one-thing -march=above-replacement >>> >>> and so -mcpu=native would silently "win" over one-thing. Is that right? >>> >>> Perhaps we should adjust: >>> >>>" %{mcpu=native:%>> >>> to pass something like "cpu/arch" rather than "cpu" when -march >>> is not specified, so that the routine knows that it has the choice >>> of using either -mcpu or -march. We wouldn't get the warning, but we >>> would get predictable preemption of -march over -mcpu. >>> >>> Admittedly, it looks like we already have this problem with: >>> >>> if (aarch64_cpu_data[i].name == NULL) >>> { >>> auto arch_info = get_arch_from_id (DEFAULT_ARCH); >>> >>> gcc_assert (arch_info); >>> >>> res = concat ("-march=", arch_info->name, NULL); >>> default_flags = arch_info->flags; >>> } >>> >>> so I guess this is pre-existing. >> >> Yes, it looks like this was a deliberate choice that mcpu=native >> overrides any march, this is the case for all pre
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, January 13, 2025 6:35 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions >> for unknown non-homogenous systems [PR113257] >> >> Tamar Christina writes: >> > Hi All, >> > >> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for >> > -mcpu=native on unknown CPUs to still enable architecture extensions. >> > >> > This has worked great but was only added for homogenous systems. >> > >> > However the same thing works for big.LITTLE as in such system the cores >> > must >> > have the same extensions otherwise it doesn't fundamentally work. >> > >> > i.e. task migration from one core to the other wouldn't work. >> > >> > This extends the same handling to non-homogenous systems. >> > >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >> > >> > Ok for master? >> > >> > Thanks, >> > Tamar >> > >> > gcc/ChangeLog: >> > >> >PR target/113257 >> >* config/aarch64/driver-aarch64.cc (host_detect_local_cpu): >> > >> > gcc/testsuite/ChangeLog: >> > >> >PR target/113257 >> >* gcc.target/aarch64/cpunative/info_34: New test. >> >* gcc.target/aarch64/cpunative/native_cpu_34.c: New test. >> > >> > --- >> > >> > diff --git a/gcc/config/aarch64/driver-aarch64.cc >> > b/gcc/config/aarch64/driver- >> aarch64.cc >> > index >> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0 >> 3b15c09731e4f56e 100644 >> > --- a/gcc/config/aarch64/driver-aarch64.cc >> > +++ b/gcc/config/aarch64/driver-aarch64.cc >> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) >> > break; >> >} >> >} >> > + >> > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch >> > + features as the cores should have the same features. So just pick >> > + the feature flags from any of the cpus. */ >> > + if (aarch64_cpu_data[i].name == NULL) >> > + { >> > +auto arch_info = get_arch_from_id (DEFAULT_ARCH); >> > + >> > +gcc_assert (arch_info); >> > + >> > +res = concat ("-march=", arch_info->name, NULL); >> > +default_flags = arch_info->flags; >> > + } >> > + >> >> Currently, if gcc recognises the host cpu, and if one-thing is more >> restrictive than that cpu, gcc will warn on: >> >> gcc -march=one-thing -mcpu=native >> >> and choose one-thing. It looks like one consequence of this patch >> is that, for unrecognised big.LITTLE, the command line would get >> converted to: >> >> gcc -march=one-thing -march=above-replacement >> >> and so -mcpu=native would silently "win" over one-thing. Is that right? >> >> Perhaps we should adjust: >> >>" %{mcpu=native:%> >> to pass something like "cpu/arch" rather than "cpu" when -march >> is not specified, so that the routine knows that it has the choice >> of using either -mcpu or -march. We wouldn't get the warning, but we >> would get predictable preemption of -march over -mcpu. >> >> Admittedly, it looks like we already have this problem with: >> >> if (aarch64_cpu_data[i].name == NULL) >> { >>auto arch_info = get_arch_from_id (DEFAULT_ARCH); >> >>gcc_assert (arch_info); >> >>res = concat ("-march=", arch_info->name, NULL); >>default_flags = arch_info->flags; >> } >> >> so I guess this is pre-existing. > > Yes, it looks like this was a deliberate choice that mcpu=native > overrides any march, this is the case for all previous GCCs as well. > > i.e. GCC 12-15 (ones I had at hand) convert > > -march=armv8.8-a+sve -mcpu=native on an Neoverse-V1 system to > > '-march=armv8.8-a+sve' '-c' '-mlittle-endian' '-mabi=lp64' > '-mcpu=neoverse-v1+sm4+crc+aes+sha3+nossbs' > > In both the calls to CC1 and to AS > > "-march=armv8.8-a+sve" > "-march=armv8.4-a+rng+sm4+crc+aes+sha3+i8mm+bf16
RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 6:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions > for unknown non-homogenous systems [PR113257] > > Tamar Christina writes: > > Hi All, > > > > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for > > -mcpu=native on unknown CPUs to still enable architecture extensions. > > > > This has worked great but was only added for homogenous systems. > > > > However the same thing works for big.LITTLE as in such system the cores must > > have the same extensions otherwise it doesn't fundamentally work. > > > > i.e. task migration from one core to the other wouldn't work. > > > > This extends the same handling to non-homogenous systems. > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > > > Ok for master? > > > > Thanks, > > Tamar > > > > gcc/ChangeLog: > > > > PR target/113257 > > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): > > > > gcc/testsuite/ChangeLog: > > > > PR target/113257 > > * gcc.target/aarch64/cpunative/info_34: New test. > > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > > > > --- > > > > diff --git a/gcc/config/aarch64/driver-aarch64.cc > > b/gcc/config/aarch64/driver- > aarch64.cc > > index > 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0 > 3b15c09731e4f56e 100644 > > --- a/gcc/config/aarch64/driver-aarch64.cc > > +++ b/gcc/config/aarch64/driver-aarch64.cc > > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) > > break; > > } > > } > > + > > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch > > +features as the cores should have the same features. So just pick > > +the feature flags from any of the cpus. */ > > + if (aarch64_cpu_data[i].name == NULL) > > + { > > + auto arch_info = get_arch_from_id (DEFAULT_ARCH); > > + > > + gcc_assert (arch_info); > > + > > + res = concat ("-march=", arch_info->name, NULL); > > + default_flags = arch_info->flags; > > + } > > + > > Currently, if gcc recognises the host cpu, and if one-thing is more > restrictive than that cpu, gcc will warn on: > > gcc -march=one-thing -mcpu=native > > and choose one-thing. It looks like one consequence of this patch > is that, for unrecognised big.LITTLE, the command line would get > converted to: > > gcc -march=one-thing -march=above-replacement > > and so -mcpu=native would silently "win" over one-thing. Is that right? > > Perhaps we should adjust: > >" %{mcpu=native:% > to pass something like "cpu/arch" rather than "cpu" when -march > is not specified, so that the routine knows that it has the choice > of using either -mcpu or -march. We wouldn't get the warning, but we > would get predictable preemption of -march over -mcpu. > > Admittedly, it looks like we already have this problem with: > > if (aarch64_cpu_data[i].name == NULL) > { > auto arch_info = get_arch_from_id (DEFAULT_ARCH); > > gcc_assert (arch_info); > > res = concat ("-march=", arch_info->name, NULL); > default_flags = arch_info->flags; > } > > so I guess this is pre-existing. Yes, it looks like this was a deliberate choice that mcpu=native overrides any march, this is the case for all previous GCCs as well. i.e. GCC 12-15 (ones I had at hand) convert -march=armv8.8-a+sve -mcpu=native on an Neoverse-V1 system to '-march=armv8.8-a+sve' '-c' '-mlittle-endian' '-mabi=lp64' '-mcpu=neoverse-v1+sm4+crc+aes+sha3+nossbs' In both the calls to CC1 and to AS "-march=armv8.8-a+sve" "-march=armv8.4-a+rng+sm4+crc+aes+sha3+i8mm+bf16+sve+profile" Your request would change this long standing behavior but I don't know If that's correct or not. I'll admit it's inconsistent with the warning given by CC1 for the flags. The warnings are coming from CC1, and the question is if it's up to GCC to enforce the CC1 constraint onto binutils or not. So up to you, but I'm a bit hesitant to change this at this point without being able to tell what the impact is on downstream tools. Thanks, Tamar > > T
Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Tamar Christina writes: > Hi All, > > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for > -mcpu=native on unknown CPUs to still enable architecture extensions. > > This has worked great but was only added for homogenous systems. > > However the same thing works for big.LITTLE as in such system the cores must > have the same extensions otherwise it doesn't fundamentally work. > > i.e. task migration from one core to the other wouldn't work. > > This extends the same handling to non-homogenous systems. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR target/113257 > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): > > gcc/testsuite/ChangeLog: > > PR target/113257 > * gcc.target/aarch64/cpunative/info_34: New test. > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. > > --- > > diff --git a/gcc/config/aarch64/driver-aarch64.cc > b/gcc/config/aarch64/driver-aarch64.cc > index > 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e > 100644 > --- a/gcc/config/aarch64/driver-aarch64.cc > +++ b/gcc/config/aarch64/driver-aarch64.cc > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) > break; > } > } > + > + /* On big.LITTLE if we find any unknown CPUs we can still pick arch > + features as the cores should have the same features. So just pick > + the feature flags from any of the cpus. */ > + if (aarch64_cpu_data[i].name == NULL) > + { > + auto arch_info = get_arch_from_id (DEFAULT_ARCH); > + > + gcc_assert (arch_info); > + > + res = concat ("-march=", arch_info->name, NULL); > + default_flags = arch_info->flags; > + } > + Currently, if gcc recognises the host cpu, and if one-thing is more restrictive than that cpu, gcc will warn on: gcc -march=one-thing -mcpu=native and choose one-thing. It looks like one consequence of this patch is that, for unrecognised big.LITTLE, the command line would get converted to: gcc -march=one-thing -march=above-replacement and so -mcpu=native would silently "win" over one-thing. Is that right? Perhaps we should adjust: " %{mcpu=native:% +/* { dg-additional-options "-mcpu=native" } */ > + > +int main() > +{ > + return 0; > +} > + > +/* { dg-final { scan-assembler {\.arch > armv8-a\+dotprod\+crc\+crypto\+sve2\n} } } */ > + > +/* Test a normal looking procinfo. */
[PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]
Hi All, in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for -mcpu=native on unknown CPUs to still enable architecture extensions. This has worked great but was only added for homogenous systems. However the same thing works for big.LITTLE as in such system the cores must have the same extensions otherwise it doesn't fundamentally work. i.e. task migration from one core to the other wouldn't work. This extends the same handling to non-homogenous systems. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR target/113257 * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): gcc/testsuite/ChangeLog: PR target/113257 * gcc.target/aarch64/cpunative/info_34: New test. * gcc.target/aarch64/cpunative/native_cpu_34.c: New test. --- diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc index 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e 100644 --- a/gcc/config/aarch64/driver-aarch64.cc +++ b/gcc/config/aarch64/driver-aarch64.cc @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) break; } } + + /* On big.LITTLE if we find any unknown CPUs we can still pick arch +features as the cores should have the same features. So just pick +the feature flags from any of the cpus. */ + if (aarch64_cpu_data[i].name == NULL) + { + auto arch_info = get_arch_from_id (DEFAULT_ARCH); + + gcc_assert (arch_info); + + res = concat ("-march=", arch_info->name, NULL); + default_flags = arch_info->flags; + } + if (!res) goto not_found; } diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 new file mode 100644 index ..61cb254785a4b9ec19ebe388402223c9a82af7ed --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 @@ -0,0 +1,18 @@ +processor : 0 +BogoMIPS : 100.00 +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma +CPU implementer: 0xfe +CPU architecture: 8 +CPU variant: 0x0 +CPU part : 0xd08 +CPU revision : 2 + +processor : 0 +BogoMIPS : 100.00 +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma +CPU implementer: 0xfe +CPU architecture: 8 +CPU variant: 0x0 +CPU part : 0xd09 +CPU revision : 2 + diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c new file mode 100644 index ..168140002a0f0205c0f552de0cce9b2d356e09e2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c @@ -0,0 +1,12 @@ +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */ +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_34" } */ +/* { dg-additional-options "-mcpu=native" } */ + +int main() +{ + return 0; +} + +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2\n} } } */ + +/* Test a normal looking procinfo. */ -- diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc index 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e 100644 --- a/gcc/config/aarch64/driver-aarch64.cc +++ b/gcc/config/aarch64/driver-aarch64.cc @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv) break; } } + + /* On big.LITTLE if we find any unknown CPUs we can still pick arch + features as the cores should have the same features. So just pick + the feature flags from any of the cpus. */ + if (aarch64_cpu_data[i].name == NULL) + { + auto arch_info = get_arch_from_id (DEFAULT_ARCH); + + gcc_assert (arch_info); + + res = concat ("-march=", arch_info->name, NULL); + default_flags = arch_info->flags; + } + if (!res) goto not_found; } diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 new file mode 100644 index ..61cb254785a4b9ec19ebe388402223c9a82af7ed --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 @@ -0,0 +1,18 @@ +processor : 0 +BogoMIPS : 100.00 +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma +CPU implementer : 0xfe +CPU architecture: 8 +CPU variant : 0x0 +CPU part : 0xd08 +CPU revision : 2 + +processor : 0 +BogoMIPS : 100.00 +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma +CPU implementer : 0xfe +CPU architecture: 8 +CPU variant : 0x0 +CPU part : 0xd09 +CPU revision : 2 + diff --git a/gcc/testsuite/gcc.target/aarc