Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Richard Sandiford
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Thursday, January 16, 2025 7:11 AM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; ktkac...@gcc.gnu.org
>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
>> for unknown non-homogenous systems [PR113257]
>> 
>> Richard Sandiford  writes:
>> > Tamar Christina  writes:
>> >> Ok for master? and how do you feel about a backport for the two patches to
>> help
>> >> distros?
>> >
>> > Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
>> > since I think we should be extra cautious with the "most stable" branch,
>> > but let's see what others think.
>> >
>> > OK for trunk, and for GCC 14 & 13 after a grace period, with one
>> > trivial nit below:
>> 
>> Sorry, was concentrating too much on the -mcpu vs. -march preemption
>> thing and forgot to think about other aspects of the patch.  The routine
>> is used for all three of -march=native, -mcpu=native, and -mtune=native,
>> so I think we want something like the following on top of your patch
>> (untested so far).
>> 
>
> Cool, how's this one?
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master? and for backport to GCC 13 and 14?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/113257
>   * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
>   (host_detect_local_cpu): Use it.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113257
>   * gcc.target/aarch64/cpunative/info_34: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>   * gcc.target/aarch64/cpunative/info_35: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.
>
> Co-authored-by: Richard Sandiford 
>
> -- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> b/gcc/config/aarch64/driver-aarch64.cc
> index 
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d
>  100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -60,6 +60,7 @@ struct aarch64_core_data
>  #define ALL_VARIANTS ((unsigned)-1)
>  /* Default architecture to use if -mcpu=native did not detect a known CPU.  
> */
>  #define DEFAULT_ARCH "8A"
> +#define DEFAULT_CPU "generic-armv8-a"
>  
>  #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, 
> PART, VARIANT) \
>{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT },
> @@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
>return NULL;
>  }
>  
> +/* Return an aarch64_core_data for the cpu described
> +   by ID, or NULL if ID describes something we don't know about.  */
> +
> +static const aarch64_core_data *
> +get_cpu_from_id (const char* name)
> +{
> +  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
> +{
> +  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
> + return &aarch64_cpu_data[i];
> +}

Redundant braces, the convention says:

  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
if (strcmp (name, aarch64_cpu_data[i].name) == 0)
  return &aarch64_cpu_data[i];

OK with that change, thanks, and sorry for back-tracking on the
original ack.

Richard

> +
> +  return NULL;
> +}
> +
>  /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
> For an example CORE={0xd08, 0xd03} and
> BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
> @@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv)
>  || variants[0] == aarch64_cpu_data[i].variant))
> break;
>  
> -  if (aarch64_cpu_data[i].name == NULL)
> +  if (arch)
>   {
> -   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> -
> -   gcc_assert (arch_info);
> -
> -   res = concat ("-march=", arch_info->name, NULL);
> -   default_flags = arch_info->flags;
> - }
> -  else if (arch)
> - {
> -   const char *arch_id = aarch64_cpu_data[i].arch;
> +   const char *arch_id = (aarch64_cpu_data[i].name
> +  ? aarch64_cpu_data[i].arch
> +  : DEFAULT_ARCH);
> auto arch_info = get_arch_from_id (arch_id);
>  
> /* We got some arch indentifier that's not in aarch64-arches.def?  */
> @@ -424,12 +433,15 @@ host_detect_l

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Thursday, January 16, 2025 7:11 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> Richard Sandiford  writes:
> > Tamar Christina  writes:
> >> Ok for master? and how do you feel about a backport for the two patches to
> help
> >> distros?
> >
> > Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
> > since I think we should be extra cautious with the "most stable" branch,
> > but let's see what others think.
> >
> > OK for trunk, and for GCC 14 & 13 after a grace period, with one
> > trivial nit below:
> 
> Sorry, was concentrating too much on the -mcpu vs. -march preemption
> thing and forgot to think about other aspects of the patch.  The routine
> is used for all three of -march=native, -mcpu=native, and -mtune=native,
> so I think we want something like the following on top of your patch
> (untested so far).
> 

Cool, how's this one?

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master? and for backport to GCC 13 and 14?

Thanks,
Tamar

gcc/ChangeLog:

PR target/113257
* config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
(host_detect_local_cpu): Use it.

gcc/testsuite/ChangeLog:

PR target/113257
* gcc.target/aarch64/cpunative/info_34: New test.
* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
* gcc.target/aarch64/cpunative/info_35: New test.
* gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

Co-authored-by: Richard Sandiford 

-- inline copy of patch --

diff --git a/gcc/config/aarch64/driver-aarch64.cc 
b/gcc/config/aarch64/driver-aarch64.cc
index 
45fce67a646351b848b7cd7d0fd35d343731c0d1..26ba2cd6f8883300951268aab7d0a22ec2588a0d
 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -60,6 +60,7 @@ struct aarch64_core_data
 #define ALL_VARIANTS ((unsigned)-1)
 /* Default architecture to use if -mcpu=native did not detect a known CPU.  */
 #define DEFAULT_ARCH "8A"
+#define DEFAULT_CPU "generic-armv8-a"
 
 #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, 
PART, VARIANT) \
   { CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT },
@@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
   return NULL;
 }
 
+/* Return an aarch64_core_data for the cpu described
+   by ID, or NULL if ID describes something we don't know about.  */
+
+static const aarch64_core_data *
+get_cpu_from_id (const char* name)
+{
+  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
+{
+  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
+   return &aarch64_cpu_data[i];
+}
+
+  return NULL;
+}
+
 /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
For an example CORE={0xd08, 0xd03} and
BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
@@ -403,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv)
 || variants[0] == aarch64_cpu_data[i].variant))
  break;
 
-  if (aarch64_cpu_data[i].name == NULL)
+  if (arch)
{
- auto arch_info = get_arch_from_id (DEFAULT_ARCH);
-
- gcc_assert (arch_info);
-
- res = concat ("-march=", arch_info->name, NULL);
- default_flags = arch_info->flags;
-   }
-  else if (arch)
-   {
- const char *arch_id = aarch64_cpu_data[i].arch;
+ const char *arch_id = (aarch64_cpu_data[i].name
+? aarch64_cpu_data[i].arch
+: DEFAULT_ARCH);
  auto arch_info = get_arch_from_id (arch_id);
 
  /* We got some arch indentifier that's not in aarch64-arches.def?  */
@@ -424,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv)
  res = concat ("-march=", arch_info->name, NULL);
  default_flags = arch_info->flags;
}
-  else
+  else if (cpu || aarch64_cpu_data[i].name)
{
- default_flags = aarch64_cpu_data[i].flags;
+ auto cpu_info = (aarch64_cpu_data[i].name
+  ? &aarch64_cpu_data[i]
+  : get_cpu_from_id (DEFAULT_CPU));
+ default_flags = cpu_info->flags;
  res = concat ("-m",
cpu ? "cpu" : "tune", "=",
-   aarch64_cpu_data[i].name,
+   cpu_info->name,
NULL);
}
 }
@@ -449,6 +461,20 @@ host_detect_local_cpu (int argc, 

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Richard Sandiford
Richard Sandiford  writes:
> Tamar Christina  writes:
>> Ok for master? and how do you feel about a backport for the two patches to 
>> help
>> distros?
>
> Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
> since I think we should be extra cautious with the "most stable" branch,
> but let's see what others think.
>
> OK for trunk, and for GCC 14 & 13 after a grace period, with one
> trivial nit below:

Sorry, was concentrating too much on the -mcpu vs. -march preemption
thing and forgot to think about other aspects of the patch.  The routine
is used for all three of -march=native, -mcpu=native, and -mtune=native,
so I think we want something like the following on top of your patch
(untested so far).

Richard


diff --git a/gcc/config/aarch64/driver-aarch64.cc 
b/gcc/config/aarch64/driver-aarch64.cc
index 731b5e48687..26ba2cd6f88 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -419,18 +419,11 @@ host_detect_local_cpu (int argc, const char **argv)
 || variants[0] == aarch64_cpu_data[i].variant))
  break;
 
-  if (aarch64_cpu_data[i].name == NULL)
+  if (arch)
{
- auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
-
- gcc_assert (cpu_info);
-
- res = concat ("-mcpu=", cpu_info->name, NULL);
- default_flags = cpu_info->flags;
-   }
-  else if (arch)
-   {
- const char *arch_id = aarch64_cpu_data[i].arch;
+ const char *arch_id = (aarch64_cpu_data[i].name
+? aarch64_cpu_data[i].arch
+: DEFAULT_ARCH);
  auto arch_info = get_arch_from_id (arch_id);
 
  /* We got some arch indentifier that's not in aarch64-arches.def?  */
@@ -440,12 +433,15 @@ host_detect_local_cpu (int argc, const char **argv)
  res = concat ("-march=", arch_info->name, NULL);
  default_flags = arch_info->flags;
}
-  else
+  else if (cpu || aarch64_cpu_data[i].name)
{
- default_flags = aarch64_cpu_data[i].flags;
+ auto cpu_info = (aarch64_cpu_data[i].name
+  ? &aarch64_cpu_data[i]
+  : get_cpu_from_id (DEFAULT_CPU));
+ default_flags = cpu_info->flags;
  res = concat ("-m",
cpu ? "cpu" : "tune", "=",
-   aarch64_cpu_data[i].name,
+   cpu_info->name,
NULL);
}
 }
@@ -469,7 +465,7 @@ host_detect_local_cpu (int argc, const char **argv)
   /* On big.LITTLE if we find any unknown CPUs we can still pick arch
 features as the cores should have the same features.  So just pick
 the feature flags from any of the cpus.  */
-  if (aarch64_cpu_data[i].name == NULL)
+  if (cpu && aarch64_cpu_data[i].name == NULL)
{
  auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
 


Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Xi Ruoyao
On Wed, 2025-01-15 at 18:54 +, Tamar Christina wrote:
> 
> Re-reading again I realize I misread cache size from your question
> with cache line size. 
> 
> Cache size can be whatever yes. Cache line size must match. 
> 
> But that doesn't change the fact that this patch is correct. 

Yes it's correct, having -march=native is at least better than not
having it.  I'm just providing some info so if your big.LITTLE user
complains "hey when I bootstrap GCC with BOOT_CFLAGS=-O3 -march=native I
get a comparison error" you can just point they to PR111768.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
Re-reading again I realize I misread cache size from your question with cache 
line size.

Cache size can be whatever yes. Cache line size must match.

But that doesn't change the fact that this patch is correct.

Thanks,
Tamar


From: Tamar Christina
Sent: Wednesday, January 15, 2025 2:45 PM
To: Xi Ruoyao ; gcc-patches@gcc.gnu.org 

Cc: nd ; Richard Earnshaw ; 
ktkac...@gcc.gnu.org ; Richard Sandiford 

Subject: RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions 
for unknown non-homogenous systems [PR113257]

> -Original Message-
> From: Xi Ruoyao 
> Sent: Wednesday, January 15, 2025 1:40 PM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> ktkac...@gcc.gnu.org; Richard Sandiford 
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
>
> On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote:
> >
> >
> > > -Original Message-
> > > From: Xi Ruoyao 
> > > Sent: Wednesday, January 15, 2025 1:29 PM
> > > To: Tamar Christina ; gcc-patches@gcc.gnu.org
> > > Cc: nd ; Richard Earnshaw ;
> > > ktkac...@gcc.gnu.org; Richard Sandiford 
> > > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture
> extensions
> > > for unknown non-homogenous systems [PR113257]
> > >
> > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote:
> > > > However the same thing works for big.LITTLE as in such system the cores
> must
> > > > have the same extensions otherwise it doesn't fundamentally work.
> > > >
> > > > i.e. task migration from one core to the other wouldn't work.
> > >
> > > See https://gcc.gnu.org/PR111768 for a potential issue.
> > >
> >
> > There's no issue there.
> > As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7
> >
> > It is not practical for you to have two CPUs in the same system with 
> > different
> capabilities.
> > Your kernel will not work, or you have to disable task migration.
> >
> > The same extends to cache sizes and other properties of the CPU.
>
> On Intel CPUs it's common that the L1 cache size is different on P and E
> cores.  The generated code still works but it may be suboptimal if it's
> compiled on an E core but the executed on a P core or vice versa.
>
> Is it different on AArch64?  I'd be really surprised if big and LITTLE
> must have the same L1 cache size...

Yes, Because if the sizes mismatch you break data cache operations.
One example is DC ZVA [1]

Where the operation clears whole cachelines at a time. This value is cached
when doing a long clear operation, such as large memset of 0.  Technically 
speaking
the size of VA doesn't have to match the cache size, but practically it always 
does
since otherwise the operation is a lot less efficient.  This is an assumption 
also made
in e.g. glibc's memset.

If you migrate tasks between the reading of the value and the end of the set 
you may
end up overwriting or not clearing enough.

In addition, the Linux kernel will report the smallest value over all the cores 
so that
software gets a coherent value: [2], because again, software can't work 
effectively
when the sizes don't match.

And lastly the big.LITTLE programming model explicitly states that the cores 
must be
coherent [3].  This patch does not change the cache sizes anyway as those are 
determined
by the generic cost model, which are a conservative safe minimum that work on 
all Arm cores.

But coherency property is also what dictates that the patch is correct, because 
as a requirement
of big.LITTLE you must be able to seamlessly transfer a program from one core 
to another.

This cannot happen if they vary in capabilities.

Regards,
Tamar

[1] 
https://developer.arm.com/documentation/102670/0301/AArch64-registers/AArch64-register-descriptions/AArch64-System-instruction-register-description/DC-ZVA--Data-Cache-Zero-by-VA
[2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1227904.html
[3] 
https://developer.arm.com/documentation/dai0424/a/Power-down-and-power-up-considerations/High-level-considerations/Coherency-in-a-big-LITTLE-system

>
> > As an aside, today the big.LITTLE selection already enforces a uniform 
> > feature set.
>
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Wednesday, January 15, 2025 3:23 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> Tamar Christina  writes:
> > Ok for master? and how do you feel about a backport for the two patches to 
> > help
> > distros?
> 
> Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
> since I think we should be extra cautious with the "most stable" branch,
> but let's see what others think.
> 
> OK for trunk, and for GCC 14 & 13 after a grace period, with one
> trivial nit below:

Thanks, I should have indicated that I'm not pushing for GCC 12.

So I'm happy with just GCC 13 and 14.

Cheers,
Tamar
> 
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR target/113257
> > * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU):
> New.
> > (host_detect_local_cpu): Use it.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/113257
> > * gcc.target/aarch64/cpunative/info_34: New test.
> > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
> > * gcc.target/aarch64/cpunative/info_35: New test.
> > * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.
> >
> > -- inline patch --
> >
> > diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> > b/gcc/config/aarch64/driver-
> aarch64.cc
> > index
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..731b5e486879d3670ad0feb
> 02906c6d6bdbb1f01 100644
> > --- a/gcc/config/aarch64/driver-aarch64.cc
> > +++ b/gcc/config/aarch64/driver-aarch64.cc
> > @@ -60,6 +60,7 @@ struct aarch64_core_data
> >  #define ALL_VARIANTS ((unsigned)-1)
> >  /* Default architecture to use if -mcpu=native did not detect a known CPU. 
> >  */
> >  #define DEFAULT_ARCH "8A"
> > +#define DEFAULT_CPU "generic-armv8-a"
> >
> >  #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS,
> COSTS, IMP, PART, VARIANT) \
> >{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT
> },
> > @@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
> >return NULL;
> >  }
> >
> > +/* Return an aarch64_core_data for the cpu described
> > +   by ID, or NULL if ID describes something we don't know about.  */
> > +
> > +static const aarch64_core_data *
> > +get_cpu_from_id (const char* name)
> > +{
> > +  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
> > +{
> > +  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
> > +   return &aarch64_cpu_data[i];
> > +}
> 
> Redundant braces, should be:
> 
>   for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
> if (strcmp (name, aarch64_cpu_data[i].name) == 0)
>   return &aarch64_cpu_data[i];
> 
> Thanks,
> Richard
> 
> > +
> > +  return NULL;
> > +}
> > +
> >  /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
> > For an example CORE={0xd08, 0xd03} and
> > BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
> > @@ -405,12 +421,12 @@ host_detect_local_cpu (int argc, const char **argv)
> >
> >if (aarch64_cpu_data[i].name == NULL)
> > {
> > - auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
> >
> > - gcc_assert (arch_info);
> > + gcc_assert (cpu_info);
> >
> > - res = concat ("-march=", arch_info->name, NULL);
> > - default_flags = arch_info->flags;
> > + res = concat ("-mcpu=", cpu_info->name, NULL);
> > + default_flags = cpu_info->flags;
> > }
> >else if (arch)
> > {
> > @@ -449,6 +465,20 @@ host_detect_local_cpu (int argc, const char **argv)
> >   break;
> > }
> > }
> > +
> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
> > +features as the cores should have the same features.  So just pick
> > +the feature flags from any of the cpus.  */
> > +  if (aarch64_cpu_data[i].name == NULL)
> > +   {
> > + auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
> > +
> > + gcc_assert (cpu_info);
> > +
> > + res = concat ("-mcpu=", cpu_info->name, NULL);
> >

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Richard Sandiford
Tamar Christina  writes:
> Ok for master? and how do you feel about a backport for the two patches to 
> help
> distros?

Backporting to GCC 14 & GCC 13 sounds good.  Not so sure about GCC 12,
since I think we should be extra cautious with the "most stable" branch,
but let's see what others think.

OK for trunk, and for GCC 14 & 13 after a grace period, with one
trivial nit below:

>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/113257
>   * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
>   (host_detect_local_cpu): Use it.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113257
>   * gcc.target/aarch64/cpunative/info_34: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>   * gcc.target/aarch64/cpunative/info_35: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.
>
> -- inline patch --
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> b/gcc/config/aarch64/driver-aarch64.cc
> index 
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..731b5e486879d3670ad0feb02906c6d6bdbb1f01
>  100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -60,6 +60,7 @@ struct aarch64_core_data
>  #define ALL_VARIANTS ((unsigned)-1)
>  /* Default architecture to use if -mcpu=native did not detect a known CPU.  
> */
>  #define DEFAULT_ARCH "8A"
> +#define DEFAULT_CPU "generic-armv8-a"
>  
>  #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, 
> PART, VARIANT) \
>{ CORE_NAME, #ARCH, IMP, PART, VARIANT, feature_deps::cpu_##CORE_IDENT },
> @@ -106,6 +107,21 @@ get_arch_from_id (const char* id)
>return NULL;
>  }
>  
> +/* Return an aarch64_core_data for the cpu described
> +   by ID, or NULL if ID describes something we don't know about.  */
> +
> +static const aarch64_core_data *
> +get_cpu_from_id (const char* name)
> +{
> +  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
> +{
> +  if (strcmp (name, aarch64_cpu_data[i].name) == 0)
> + return &aarch64_cpu_data[i];
> +}

Redundant braces, should be:

  for (unsigned i = 0; aarch64_cpu_data[i].name != NULL; i++)
if (strcmp (name, aarch64_cpu_data[i].name) == 0)
  return &aarch64_cpu_data[i];

Thanks,
Richard

> +
> +  return NULL;
> +}
> +
>  /* Check wether the CORE array is the same as the big.LITTLE BL_CORE.
> For an example CORE={0xd08, 0xd03} and
> BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true.  */
> @@ -405,12 +421,12 @@ host_detect_local_cpu (int argc, const char **argv)
>  
>if (aarch64_cpu_data[i].name == NULL)
>   {
> -   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> +   auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
>  
> -   gcc_assert (arch_info);
> +   gcc_assert (cpu_info);
>  
> -   res = concat ("-march=", arch_info->name, NULL);
> -   default_flags = arch_info->flags;
> +   res = concat ("-mcpu=", cpu_info->name, NULL);
> +   default_flags = cpu_info->flags;
>   }
>else if (arch)
>   {
> @@ -449,6 +465,20 @@ host_detect_local_cpu (int argc, const char **argv)
> break;
>   }
>   }
> +
> +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
> +  features as the cores should have the same features.  So just pick
> +  the feature flags from any of the cpus.  */
> +  if (aarch64_cpu_data[i].name == NULL)
> + {
> +   auto cpu_info = get_cpu_from_id (DEFAULT_CPU);
> +
> +   gcc_assert (cpu_info);
> +
> +   res = concat ("-mcpu=", cpu_info->name, NULL);
> +   default_flags = cpu_info->flags;
> + }
> +
>if (!res)
>   goto not_found;
>  }
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 
> b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
> new file mode 100644
> index 
> ..61cb254785a4b9ec19ebe388402223c9a82af7ed
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
> @@ -0,0 +1,18 @@
> +processor: 0
> +BogoMIPS : 100.00
> +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
> fphp asimdhp fcma
> +CPU implementer  : 0xfe
> +CPU architecture: 8
> +CPU variant  : 0x0
> +CPU part : 0xd08
> +CPU revision : 2
> +
> +processor: 0
> +BogoMIPS : 100.00
> +Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
> fphp asimdhp fcma
> +CPU implementer  : 0xfe
> +CPU architecture: 8
> +CPU variant  : 0x0
> +CPU part : 0xd09
> +CPU revision : 2
> +
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_35 
> b/gcc/testsuite/gcc.target/aarch64/cpunative/info_35
> new file mode 100644
> index 
> ..61cb254785a4b9ec19ebe388402223c9a82af7ed
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_35
> @@ -0,0 +1,18 @@
> +processor: 0
> +BogoMIPS : 100.00
> +Features : fp a

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message-
> From: Xi Ruoyao 
> Sent: Wednesday, January 15, 2025 1:40 PM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> ktkac...@gcc.gnu.org; Richard Sandiford 
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote:
> >
> >
> > > -Original Message-
> > > From: Xi Ruoyao 
> > > Sent: Wednesday, January 15, 2025 1:29 PM
> > > To: Tamar Christina ; gcc-patches@gcc.gnu.org
> > > Cc: nd ; Richard Earnshaw ;
> > > ktkac...@gcc.gnu.org; Richard Sandiford 
> > > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture
> extensions
> > > for unknown non-homogenous systems [PR113257]
> > >
> > > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote:
> > > > However the same thing works for big.LITTLE as in such system the cores
> must
> > > > have the same extensions otherwise it doesn't fundamentally work.
> > > >
> > > > i.e. task migration from one core to the other wouldn't work.
> > >
> > > See https://gcc.gnu.org/PR111768 for a potential issue.
> > >
> >
> > There's no issue there.
> > As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7
> >
> > It is not practical for you to have two CPUs in the same system with 
> > different
> capabilities.
> > Your kernel will not work, or you have to disable task migration.
> >
> > The same extends to cache sizes and other properties of the CPU.
> 
> On Intel CPUs it's common that the L1 cache size is different on P and E
> cores.  The generated code still works but it may be suboptimal if it's
> compiled on an E core but the executed on a P core or vice versa.
> 
> Is it different on AArch64?  I'd be really surprised if big and LITTLE
> must have the same L1 cache size...

Yes, Because if the sizes mismatch you break data cache operations.
One example is DC ZVA [1]

Where the operation clears whole cachelines at a time. This value is cached
when doing a long clear operation, such as large memset of 0.  Technically 
speaking
the size of VA doesn't have to match the cache size, but practically it always 
does
since otherwise the operation is a lot less efficient.  This is an assumption 
also made
in e.g. glibc's memset.

If you migrate tasks between the reading of the value and the end of the set 
you may
end up overwriting or not clearing enough.

In addition, the Linux kernel will report the smallest value over all the cores 
so that
software gets a coherent value: [2], because again, software can't work 
effectively
when the sizes don't match.

And lastly the big.LITTLE programming model explicitly states that the cores 
must be
coherent [3].  This patch does not change the cache sizes anyway as those are 
determined
by the generic cost model, which are a conservative safe minimum that work on 
all Arm cores.

But coherency property is also what dictates that the patch is correct, because 
as a requirement
of big.LITTLE you must be able to seamlessly transfer a program from one core 
to another.

This cannot happen if they vary in capabilities.

Regards,
Tamar

[1] 
https://developer.arm.com/documentation/102670/0301/AArch64-registers/AArch64-register-descriptions/AArch64-System-instruction-register-description/DC-ZVA--Data-Cache-Zero-by-VA
[2] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1227904.html
[3] 
https://developer.arm.com/documentation/dai0424/a/Power-down-and-power-up-considerations/High-level-considerations/Coherency-in-a-big-LITTLE-system

> 
> > As an aside, today the big.LITTLE selection already enforces a uniform 
> > feature set.
> 
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Xi Ruoyao
On Wed, 2025-01-15 at 13:34 +, Tamar Christina wrote:
> 
> 
> > -Original Message-
> > From: Xi Ruoyao 
> > Sent: Wednesday, January 15, 2025 1:29 PM
> > To: Tamar Christina ; gcc-patches@gcc.gnu.org
> > Cc: nd ; Richard Earnshaw ;
> > ktkac...@gcc.gnu.org; Richard Sandiford 
> > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture 
> > extensions
> > for unknown non-homogenous systems [PR113257]
> > 
> > On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote:
> > > However the same thing works for big.LITTLE as in such system the cores 
> > > must
> > > have the same extensions otherwise it doesn't fundamentally work.
> > > 
> > > i.e. task migration from one core to the other wouldn't work.
> > 
> > See https://gcc.gnu.org/PR111768 for a potential issue.
> > 
> 
> There's no issue there.
> As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7
> 
> It is not practical for you to have two CPUs in the same system with 
> different capabilities.
> Your kernel will not work, or you have to disable task migration.
> 
> The same extends to cache sizes and other properties of the CPU.

On Intel CPUs it's common that the L1 cache size is different on P and E
cores.  The generated code still works but it may be suboptimal if it's
compiled on an E core but the executed on a P core or vice versa.

Is it different on AArch64?  I'd be really surprised if big and LITTLE
must have the same L1 cache size...

> As an aside, today the big.LITTLE selection already enforces a uniform 
> feature set.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina


> -Original Message-
> From: Xi Ruoyao 
> Sent: Wednesday, January 15, 2025 1:29 PM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> ktkac...@gcc.gnu.org; Richard Sandiford 
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote:
> > However the same thing works for big.LITTLE as in such system the cores must
> > have the same extensions otherwise it doesn't fundamentally work.
> >
> > i.e. task migration from one core to the other wouldn't work.
> 
> See https://gcc.gnu.org/PR111768 for a potential issue.
> 

There's no issue there.
As Alexander says in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768#c7

It is not practical for you to have two CPUs in the same system with different 
capabilities.
Your kernel will not work, or you have to disable task migration.

The same extends to cache sizes and other properties of the CPU.

As an aside, today the big.LITTLE selection already enforces a uniform feature 
set.

Thanks,
Tamar
> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Xi Ruoyao
On Sat, 2025-01-11 at 15:18 +, Tamar Christina wrote:
> However the same thing works for big.LITTLE as in such system the cores must
> have the same extensions otherwise it doesn't fundamentally work.
> 
> i.e. task migration from one core to the other wouldn't work.

See https://gcc.gnu.org/PR111768 for a potential issue.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina

> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, January 13, 2025 8:55 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> Richard Sandiford  writes:
> > Tamar Christina  writes:
> >>> -Original Message-
> >>> From: Richard Sandiford 
> >>> Sent: Monday, January 13, 2025 6:35 PM
> >>> To: Tamar Christina 
> >>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> >>> ; ktkac...@gcc.gnu.org
> >>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture
> extensions
> >>> for unknown non-homogenous systems [PR113257]
> >>>
> >>> Tamar Christina  writes:
> >>> > Hi All,
> >>> >
> >>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability
> for
> >>> > -mcpu=native on unknown CPUs to still enable architecture extensions.
> >>> >
> >>> > This has worked great but was only added for homogenous systems.
> >>> >
> >>> > However the same thing works for big.LITTLE as in such system the cores
> must
> >>> > have the same extensions otherwise it doesn't fundamentally work.
> >>> >
> >>> > i.e. task migration from one core to the other wouldn't work.
> >>> >
> >>> > This extends the same handling to non-homogenous systems.
> >>> >
> >>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >>> >
> >>> > Ok for master?
> >>> >
> >>> > Thanks,
> >>> > Tamar
> >>> >
> >>> > gcc/ChangeLog:
> >>> >
> >>> > PR target/113257
> >>> > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
> >>> >
> >>> > gcc/testsuite/ChangeLog:
> >>> >
> >>> > PR target/113257
> >>> > * gcc.target/aarch64/cpunative/info_34: New test.
> >>> > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
> >>> >
> >>> > ---
> >>> >
> >>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc
> b/gcc/config/aarch64/driver-
> >>> aarch64.cc
> >>> > index
> >>>
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0
> >>> 3b15c09731e4f56e 100644
> >>> > --- a/gcc/config/aarch64/driver-aarch64.cc
> >>> > +++ b/gcc/config/aarch64/driver-aarch64.cc
> >>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
> >>> >   break;
> >>> > }
> >>> > }
> >>> > +
> >>> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick 
> >>> > arch
> >>> > +features as the cores should have the same features.  So just 
> >>> > pick
> >>> > +the feature flags from any of the cpus.  */
> >>> > +  if (aarch64_cpu_data[i].name == NULL)
> >>> > +   {
> >>> > + auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> >>> > +
> >>> > + gcc_assert (arch_info);
> >>> > +
> >>> > + res = concat ("-march=", arch_info->name, NULL);
> >>> > + default_flags = arch_info->flags;
> >>> > +   }
> >>> > +
> >>>
> >>> Currently, if gcc recognises the host cpu, and if one-thing is more
> >>> restrictive than that cpu, gcc will warn on:
> >>>
> >>>   gcc -march=one-thing -mcpu=native
> >>>
> >>> and choose one-thing.  It looks like one consequence of this patch
> >>> is that, for unrecognised big.LITTLE, the command line would get
> >>> converted to:
> >>>
> >>>   gcc -march=one-thing -march=above-replacement
> >>>
> >>> and so -mcpu=native would silently "win" over one-thing.  Is that right?
> >>>
> >>> Perhaps we should adjust:
> >>>
> >>>" %{mcpu=native:% >>>
> >>> to pass something like "cpu/arch" rather than "cpu" when -mar

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Richard Sandiford
Richard Sandiford  writes:
> Tamar Christina  writes:
>>> -Original Message-
>>> From: Richard Sandiford 
>>> Sent: Monday, January 13, 2025 6:35 PM
>>> To: Tamar Christina 
>>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>>> ; ktkac...@gcc.gnu.org
>>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture 
>>> extensions
>>> for unknown non-homogenous systems [PR113257]
>>> 
>>> Tamar Christina  writes:
>>> > Hi All,
>>> >
>>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
>>> > -mcpu=native on unknown CPUs to still enable architecture extensions.
>>> >
>>> > This has worked great but was only added for homogenous systems.
>>> >
>>> > However the same thing works for big.LITTLE as in such system the cores 
>>> > must
>>> > have the same extensions otherwise it doesn't fundamentally work.
>>> >
>>> > i.e. task migration from one core to the other wouldn't work.
>>> >
>>> > This extends the same handling to non-homogenous systems.
>>> >
>>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>>> >
>>> > Ok for master?
>>> >
>>> > Thanks,
>>> > Tamar
>>> >
>>> > gcc/ChangeLog:
>>> >
>>> >   PR target/113257
>>> >   * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
>>> >
>>> > gcc/testsuite/ChangeLog:
>>> >
>>> >   PR target/113257
>>> >   * gcc.target/aarch64/cpunative/info_34: New test.
>>> >   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>>> >
>>> > ---
>>> >
>>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc 
>>> > b/gcc/config/aarch64/driver-
>>> aarch64.cc
>>> > index
>>> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0
>>> 3b15c09731e4f56e 100644
>>> > --- a/gcc/config/aarch64/driver-aarch64.cc
>>> > +++ b/gcc/config/aarch64/driver-aarch64.cc
>>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
>>> > break;
>>> >   }
>>> >   }
>>> > +
>>> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
>>> > +  features as the cores should have the same features.  So just pick
>>> > +  the feature flags from any of the cpus.  */
>>> > +  if (aarch64_cpu_data[i].name == NULL)
>>> > + {
>>> > +   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>>> > +
>>> > +   gcc_assert (arch_info);
>>> > +
>>> > +   res = concat ("-march=", arch_info->name, NULL);
>>> > +   default_flags = arch_info->flags;
>>> > + }
>>> > +
>>> 
>>> Currently, if gcc recognises the host cpu, and if one-thing is more
>>> restrictive than that cpu, gcc will warn on:
>>> 
>>>   gcc -march=one-thing -mcpu=native
>>> 
>>> and choose one-thing.  It looks like one consequence of this patch
>>> is that, for unrecognised big.LITTLE, the command line would get
>>> converted to:
>>> 
>>>   gcc -march=one-thing -march=above-replacement
>>> 
>>> and so -mcpu=native would silently "win" over one-thing.  Is that right?
>>> 
>>> Perhaps we should adjust:
>>> 
>>>" %{mcpu=native:%>> 
>>> to pass something like "cpu/arch" rather than "cpu" when -march
>>> is not specified, so that the routine knows that it has the choice
>>> of using either -mcpu or -march.  We wouldn't get the warning, but we
>>> would get predictable preemption of -march over -mcpu.
>>> 
>>> Admittedly, it looks like we already have this problem with:
>>> 
>>>   if (aarch64_cpu_data[i].name == NULL)
>>> {
>>>   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>>> 
>>>   gcc_assert (arch_info);
>>> 
>>>   res = concat ("-march=", arch_info->name, NULL);
>>>   default_flags = arch_info->flags;
>>> }
>>> 
>>> so I guess this is pre-existing.
>>
>> Yes, it looks like this was a deliberate choice that mcpu=native
>> overrides any march, this is the case for all pre

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Richard Sandiford
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, January 13, 2025 6:35 PM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; ktkac...@gcc.gnu.org
>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
>> for unknown non-homogenous systems [PR113257]
>> 
>> Tamar Christina  writes:
>> > Hi All,
>> >
>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
>> > -mcpu=native on unknown CPUs to still enable architecture extensions.
>> >
>> > This has worked great but was only added for homogenous systems.
>> >
>> > However the same thing works for big.LITTLE as in such system the cores 
>> > must
>> > have the same extensions otherwise it doesn't fundamentally work.
>> >
>> > i.e. task migration from one core to the other wouldn't work.
>> >
>> > This extends the same handling to non-homogenous systems.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >
>> > Ok for master?
>> >
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> >PR target/113257
>> >* config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >PR target/113257
>> >* gcc.target/aarch64/cpunative/info_34: New test.
>> >* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>> >
>> > ---
>> >
>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc 
>> > b/gcc/config/aarch64/driver-
>> aarch64.cc
>> > index
>> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0
>> 3b15c09731e4f56e 100644
>> > --- a/gcc/config/aarch64/driver-aarch64.cc
>> > +++ b/gcc/config/aarch64/driver-aarch64.cc
>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
>> >  break;
>> >}
>> >}
>> > +
>> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
>> > +   features as the cores should have the same features.  So just pick
>> > +   the feature flags from any of the cpus.  */
>> > +  if (aarch64_cpu_data[i].name == NULL)
>> > +  {
>> > +auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>> > +
>> > +gcc_assert (arch_info);
>> > +
>> > +res = concat ("-march=", arch_info->name, NULL);
>> > +default_flags = arch_info->flags;
>> > +  }
>> > +
>> 
>> Currently, if gcc recognises the host cpu, and if one-thing is more
>> restrictive than that cpu, gcc will warn on:
>> 
>>   gcc -march=one-thing -mcpu=native
>> 
>> and choose one-thing.  It looks like one consequence of this patch
>> is that, for unrecognised big.LITTLE, the command line would get
>> converted to:
>> 
>>   gcc -march=one-thing -march=above-replacement
>> 
>> and so -mcpu=native would silently "win" over one-thing.  Is that right?
>> 
>> Perhaps we should adjust:
>> 
>>" %{mcpu=native:%> 
>> to pass something like "cpu/arch" rather than "cpu" when -march
>> is not specified, so that the routine knows that it has the choice
>> of using either -mcpu or -march.  We wouldn't get the warning, but we
>> would get predictable preemption of -march over -mcpu.
>> 
>> Admittedly, it looks like we already have this problem with:
>> 
>>   if (aarch64_cpu_data[i].name == NULL)
>>  {
>>auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>> 
>>gcc_assert (arch_info);
>> 
>>res = concat ("-march=", arch_info->name, NULL);
>>default_flags = arch_info->flags;
>>  }
>> 
>> so I guess this is pre-existing.
>
> Yes, it looks like this was a deliberate choice that mcpu=native
> overrides any march, this is the case for all previous GCCs as well.
>
> i.e. GCC 12-15 (ones I had at hand) convert
>
> -march=armv8.8-a+sve -mcpu=native on an Neoverse-V1 system to
>
> '-march=armv8.8-a+sve'  '-c' '-mlittle-endian' '-mabi=lp64' 
> '-mcpu=neoverse-v1+sm4+crc+aes+sha3+nossbs'
>
> In both the calls to CC1 and to AS
>
> "-march=armv8.8-a+sve" 
> "-march=armv8.4-a+rng+sm4+crc+aes+sha3+i8mm+bf16

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Tamar Christina
> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, January 13, 2025 6:35 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions
> for unknown non-homogenous systems [PR113257]
> 
> Tamar Christina  writes:
> > Hi All,
> >
> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
> > -mcpu=native on unknown CPUs to still enable architecture extensions.
> >
> > This has worked great but was only added for homogenous systems.
> >
> > However the same thing works for big.LITTLE as in such system the cores must
> > have the same extensions otherwise it doesn't fundamentally work.
> >
> > i.e. task migration from one core to the other wouldn't work.
> >
> > This extends the same handling to non-homogenous systems.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR target/113257
> > * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/113257
> > * gcc.target/aarch64/cpunative/info_34: New test.
> > * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
> >
> > ---
> >
> > diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> > b/gcc/config/aarch64/driver-
> aarch64.cc
> > index
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0
> 3b15c09731e4f56e 100644
> > --- a/gcc/config/aarch64/driver-aarch64.cc
> > +++ b/gcc/config/aarch64/driver-aarch64.cc
> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
> >   break;
> > }
> > }
> > +
> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
> > +features as the cores should have the same features.  So just pick
> > +the feature flags from any of the cpus.  */
> > +  if (aarch64_cpu_data[i].name == NULL)
> > +   {
> > + auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> > +
> > + gcc_assert (arch_info);
> > +
> > + res = concat ("-march=", arch_info->name, NULL);
> > + default_flags = arch_info->flags;
> > +   }
> > +
> 
> Currently, if gcc recognises the host cpu, and if one-thing is more
> restrictive than that cpu, gcc will warn on:
> 
>   gcc -march=one-thing -mcpu=native
> 
> and choose one-thing.  It looks like one consequence of this patch
> is that, for unrecognised big.LITTLE, the command line would get
> converted to:
> 
>   gcc -march=one-thing -march=above-replacement
> 
> and so -mcpu=native would silently "win" over one-thing.  Is that right?
> 
> Perhaps we should adjust:
> 
>" %{mcpu=native:% 
> to pass something like "cpu/arch" rather than "cpu" when -march
> is not specified, so that the routine knows that it has the choice
> of using either -mcpu or -march.  We wouldn't get the warning, but we
> would get predictable preemption of -march over -mcpu.
> 
> Admittedly, it looks like we already have this problem with:
> 
>   if (aarch64_cpu_data[i].name == NULL)
>   {
> auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> 
> gcc_assert (arch_info);
> 
> res = concat ("-march=", arch_info->name, NULL);
> default_flags = arch_info->flags;
>   }
> 
> so I guess this is pre-existing.

Yes, it looks like this was a deliberate choice that mcpu=native
overrides any march, this is the case for all previous GCCs as well.

i.e. GCC 12-15 (ones I had at hand) convert

-march=armv8.8-a+sve -mcpu=native on an Neoverse-V1 system to

'-march=armv8.8-a+sve'  '-c' '-mlittle-endian' '-mabi=lp64' 
'-mcpu=neoverse-v1+sm4+crc+aes+sha3+nossbs'

In both the calls to CC1 and to AS

"-march=armv8.8-a+sve" 
"-march=armv8.4-a+rng+sm4+crc+aes+sha3+i8mm+bf16+sve+profile"

Your request would change this long standing behavior but I don't know If 
that's correct or not.

I'll admit it's inconsistent with the warning given by CC1 for the flags.

The warnings are coming from CC1, and the question is if it's up to GCC to 
enforce the CC1 constraint onto binutils or not.

So up to you, but I'm a bit hesitant to change this at this point without being 
able to tell what the impact is on downstream tools.

Thanks,
Tamar

> 
> T

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Richard Sandiford
Tamar Christina  writes:
> Hi All,
>
> in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
> -mcpu=native on unknown CPUs to still enable architecture extensions.
>
> This has worked great but was only added for homogenous systems.
>
> However the same thing works for big.LITTLE as in such system the cores must
> have the same extensions otherwise it doesn't fundamentally work.
>
> i.e. task migration from one core to the other wouldn't work.
>
> This extends the same handling to non-homogenous systems.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/113257
>   * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113257
>   * gcc.target/aarch64/cpunative/info_34: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>
> ---
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc 
> b/gcc/config/aarch64/driver-aarch64.cc
> index 
> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e
>  100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
> break;
>   }
>   }
> +
> +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
> +  features as the cores should have the same features.  So just pick
> +  the feature flags from any of the cpus.  */
> +  if (aarch64_cpu_data[i].name == NULL)
> + {
> +   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> +
> +   gcc_assert (arch_info);
> +
> +   res = concat ("-march=", arch_info->name, NULL);
> +   default_flags = arch_info->flags;
> + }
> +

Currently, if gcc recognises the host cpu, and if one-thing is more
restrictive than that cpu, gcc will warn on:

  gcc -march=one-thing -mcpu=native

and choose one-thing.  It looks like one consequence of this patch
is that, for unrecognised big.LITTLE, the command line would get
converted to:

  gcc -march=one-thing -march=above-replacement

and so -mcpu=native would silently "win" over one-thing.  Is that right?

Perhaps we should adjust:

   " %{mcpu=native:% +/* { dg-additional-options "-mcpu=native" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\.arch 
> armv8-a\+dotprod\+crc\+crypto\+sve2\n} } } */
> +
> +/* Test a normal looking procinfo.  */


[PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-11 Thread Tamar Christina
Hi All,

in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
-mcpu=native on unknown CPUs to still enable architecture extensions.

This has worked great but was only added for homogenous systems.

However the same thing works for big.LITTLE as in such system the cores must
have the same extensions otherwise it doesn't fundamentally work.

i.e. task migration from one core to the other wouldn't work.

This extends the same handling to non-homogenous systems.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/113257
* config/aarch64/driver-aarch64.cc (host_detect_local_cpu):

gcc/testsuite/ChangeLog:

PR target/113257
* gcc.target/aarch64/cpunative/info_34: New test.
* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.

---
diff --git a/gcc/config/aarch64/driver-aarch64.cc 
b/gcc/config/aarch64/driver-aarch64.cc
index 
45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e
 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
  break;
}
}
+
+  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
+features as the cores should have the same features.  So just pick
+the feature flags from any of the cpus.  */
+  if (aarch64_cpu_data[i].name == NULL)
+   {
+ auto arch_info = get_arch_from_id (DEFAULT_ARCH);
+
+ gcc_assert (arch_info);
+
+ res = concat ("-march=", arch_info->name, NULL);
+ default_flags = arch_info->flags;
+   }
+
   if (!res)
goto not_found;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 
b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
new file mode 100644
index 
..61cb254785a4b9ec19ebe388402223c9a82af7ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
@@ -0,0 +1,18 @@
+processor  : 0
+BogoMIPS   : 100.00
+Features   : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
fphp asimdhp fcma
+CPU implementer: 0xfe
+CPU architecture: 8
+CPU variant: 0x0
+CPU part   : 0xd08
+CPU revision   : 2
+
+processor  : 0
+BogoMIPS   : 100.00
+Features   : fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 
fphp asimdhp fcma
+CPU implementer: 0xfe
+CPU architecture: 8
+CPU variant: 0x0
+CPU part   : 0xd09
+CPU revision   : 2
+
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c 
b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c
new file mode 100644
index 
..168140002a0f0205c0f552de0cce9b2d356e09e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_34.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
+/* { dg-set-compiler-env-var GCC_CPUINFO 
"$srcdir/gcc.target/aarch64/cpunative/info_34" } */
+/* { dg-additional-options "-mcpu=native" } */
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2\n} 
} } */
+
+/* Test a normal looking procinfo.  */




-- 
diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc
index 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e 100644
--- a/gcc/config/aarch64/driver-aarch64.cc
+++ b/gcc/config/aarch64/driver-aarch64.cc
@@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
 	  break;
 	}
 	}
+
+  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
+	 features as the cores should have the same features.  So just pick
+	 the feature flags from any of the cpus.  */
+  if (aarch64_cpu_data[i].name == NULL)
+	{
+	  auto arch_info = get_arch_from_id (DEFAULT_ARCH);
+
+	  gcc_assert (arch_info);
+
+	  res = concat ("-march=", arch_info->name, NULL);
+	  default_flags = arch_info->flags;
+	}
+
   if (!res)
 	goto not_found;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_34 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
new file mode 100644
index ..61cb254785a4b9ec19ebe388402223c9a82af7ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_34
@@ -0,0 +1,18 @@
+processor	: 0
+BogoMIPS	: 100.00
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
+CPU implementer	: 0xfe
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd08
+CPU revision	: 2
+
+processor	: 0
+BogoMIPS	: 100.00
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
+CPU implementer	: 0xfe
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd09
+CPU revision	: 2
+
diff --git a/gcc/testsuite/gcc.target/aarc