[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #17 from Sam James ---
Thank you!
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Tamar Christina changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED
--- Comment #16 from Tamar Christina ---
Fixed
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #15 from GCC Commits ---
The master branch has been updated by Iain D Sandoe:
https://gcc.gnu.org/g:3a77a567b1028a28ecbb2f2eadc351d8bd004352
commit r15-9261-g3a77a567b1028a28ecbb2f2eadc351d8bd004352
Author: Iain Sandoe
Date: Sun Oct 15 10:19:22 2023 +0100

    aarch64, Darwin: Initial implementation of Apple cores [PR113257].

    After discussion with the open source support team at Apple, we have
    established that the cores conform to the 8.5 and 8.6 requirements.
    One of the mandatory features (FEAT_SPECRES) is not exposed (or
    available) in user-space code but is supported for privileged code.

    The values for chip IDs and the LITTLE.big variants have been taken
    from lists in the XNU and LLVM sources.

            PR target/113257

    gcc/ChangeLog:

            * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Apple-a12,
            Apple-M1, Apple-M2, Apple-M3 with expanded names to allow for the
            LITTLE.big versions.
            * config/aarch64/aarch64-tune.md: Regenerate.
            * doc/invoke.texi: Add apple-m1,2 and 3 cores to the ones listed
            for arch and tune selections.

    Signed-off-by: Iain Sandoe
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #14 from Tamar Christina ---
Should be fixed now on trunk, GCC 14 and GCC 13. Leaving it open for Iain's
patch introducing the cores in aarch64-cores.def, which would give us the
right architecture too.
However, this should unblock the Gentoo ticket.
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #13 from GCC Commits ---
The releases/gcc-13 branch has been updated by Tamar Christina:
https://gcc.gnu.org/g:eb45b829bb3fb658aa34a340264dee9755d34e69
commit r13-9351-geb45b829bb3fb658aa34a340264dee9755d34e69
Author: Tamar Christina
Date: Thu Jan 16 19:25:26 2025 +

    AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

    in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
    -mcpu=native on unknown CPUs to still enable architecture extensions.

    This has worked great but was only added for homogenous systems.

    However the same thing works for big.LITTLE as in such system the cores
    must have the same extensions otherwise it doesn't fundamentally work.

    i.e. task migration from one core to the other wouldn't work.

    This extends the same handling to non-homogenous systems.

    gcc/ChangeLog:

            PR target/113257
            * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU):
            New.
            (host_detect_local_cpu): Use it.

    gcc/testsuite/ChangeLog:

            PR target/113257
            * gcc.target/aarch64/cpunative/info_34: New test.
            * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
            * gcc.target/aarch64/cpunative/info_35: New test.
            * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

    Co-authored-by: Richard Sandiford

    (cherry picked from commit 1ff85affe46623fe1a970de95887df22f4da9d16)
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #12 from GCC Commits ---
The releases/gcc-14 branch has been updated by Tamar Christina:
https://gcc.gnu.org/g:7c6fde4bac6c20e0b04c3feb820abe5ce0e48d9b
commit r14-11254-g7c6fde4bac6c20e0b04c3feb820abe5ce0e48d9b
Author: Tamar Christina
Date: Thu Jan 16 19:25:26 2025 +

    AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

    in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
    -mcpu=native on unknown CPUs to still enable architecture extensions.

    This has worked great but was only added for homogenous systems.

    However the same thing works for big.LITTLE as in such system the cores
    must have the same extensions otherwise it doesn't fundamentally work.

    i.e. task migration from one core to the other wouldn't work.

    This extends the same handling to non-homogenous systems.

    gcc/ChangeLog:

            PR target/113257
            * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU):
            New.
            (host_detect_local_cpu): Use it.

    gcc/testsuite/ChangeLog:

            PR target/113257
            * gcc.target/aarch64/cpunative/info_34: New test.
            * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
            * gcc.target/aarch64/cpunative/info_35: New test.
            * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

    Co-authored-by: Richard Sandiford

    (cherry picked from commit 1ff85affe46623fe1a970de95887df22f4da9d16)
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Tamar Christina changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|14.0                        |13.0
--- Comment #11 from Tamar Christina ---
Fixed in master, will backport in a few days
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #10 from GCC Commits ---
The master branch has been updated by Tamar Christina:
https://gcc.gnu.org/g:1ff85affe46623fe1a970de95887df22f4da9d16
commit r15-6970-g1ff85affe46623fe1a970de95887df22f4da9d16
Author: Tamar Christina
Date: Thu Jan 16 19:25:26 2025 +

    AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

    in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
    -mcpu=native on unknown CPUs to still enable architecture extensions.

    This has worked great but was only added for homogenous systems.

    However the same thing works for big.LITTLE as in such system the cores
    must have the same extensions otherwise it doesn't fundamentally work.

    i.e. task migration from one core to the other wouldn't work.

    This extends the same handling to non-homogenous systems.

    gcc/ChangeLog:

            PR target/113257
            * config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU):
            New.
            (host_detect_local_cpu): Use it.

    gcc/testsuite/ChangeLog:

            PR target/113257
            * gcc.target/aarch64/cpunative/info_34: New test.
            * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
            * gcc.target/aarch64/cpunative/info_35: New test.
            * gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

    Co-authored-by: Richard Sandiford
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Iain Sandoe changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iains at gcc dot gnu.org
--- Comment #9 from Iain Sandoe ---
I suppose we can deduce that 0x38 is the efficiency core and 0x39 is the
performance one (since the counts stack up).
We can tell M1 apart from M2/M3 on the basis of the feature list, but I'm not
sure we can discriminate M2 and M3 from the feature lists alone (and M4 is
currently TBD).
The piece of information that would be most helpful is the "CPU family" that
the Darwin kernel provides, but I have no idea how one might get that during
the Linux bootstrap.
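The "counts stack up" reasoning in comment #9 can be reproduced by tallying how many logical CPUs report each part ID. The following is only an illustrative sketch, not code from GCC: it assumes the text follows the arm64 Linux /proc/cpuinfo layout with one "CPU part : 0x..." line per logical CPU, and the names `part_count`, `tally_cpu_parts` and `MAX_PARTS` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PARTS 8

/* One distinct "CPU part" value and the number of logical CPUs reporting it. */
struct part_count {
    unsigned long part;
    int count;
};

/* Tally distinct "CPU part" values in cpuinfo-style text.
   Returns the number of distinct parts stored in out (at most MAX_PARTS). */
static int tally_cpu_parts(const char *text, struct part_count out[MAX_PARTS])
{
    assert(text != NULL && out != NULL);
    int n = 0;
    for (const char *line = text; line != NULL && *line != '\0'; ) {
        const char *eol = strchr(line, '\n');
        if (strncmp(line, "CPU part", 8) == 0) {
            const char *colon = strchr(line, ':');
            /* Only accept a colon that belongs to this line. */
            if (colon != NULL && (eol == NULL || colon < eol)) {
                /* Base 16 accepts the leading blanks and the 0x prefix. */
                unsigned long part = strtoul(colon + 1, NULL, 16);
                int i = 0;
                while (i < n && out[i].part != part)
                    i++;
                if (i < n) {
                    out[i].count++;
                } else if (n < MAX_PARTS) {
                    out[n].part = part;
                    out[n].count = 1;
                    n++;
                }
            }
        }
        line = (eol != NULL) ? eol + 1 : NULL;
    }
    return n;
}
```

Feeding it the contents of /proc/cpuinfo from the affected machine should yield two entries (0x38 and 0x39); comparing their counts against the known number of efficiency and performance cores is the deduction described above.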
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #8 from Tamar Christina ---
> ./install/bin/gcc -xc -S -o - - -march=native < /dev/null
.arch armv8-a+flagm2+lse+dotprod+rdma+crc+aes+sha3+fp16fml+jscvt+fcma+rcpc2+frintts+i8mm+bf16+sb+ssbs+pauth
patch works, will regtest and submit
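When comparing `.arch` strings such as the one in comment #8 with the earlier ones in this thread, note that extension names overlap as prefixes (for example `flagm` vs `flagm2`, `rcpc` vs `rcpc2`), so a plain substring search gives false positives. A small sketch of a whole-token check follows; `arch_has_extension` is a hypothetical helper name, not something GCC provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the '+'-separated extension list in arch contains ext as a
   whole token.  The text before the first '+' is the base architecture and
   is never matched. */
static bool arch_has_extension(const char *arch, const char *ext)
{
    assert(arch != NULL && ext != NULL);
    size_t ext_len = strlen(ext);
    const char *p = strchr(arch, '+');
    while (p != NULL) {
        const char *tok = p + 1;
        const char *next = strchr(tok, '+');
        size_t tok_len = (next != NULL) ? (size_t)(next - tok) : strlen(tok);
        if (tok_len == ext_len && strncmp(tok, ext, ext_len) == 0)
            return true;
        p = next;
    }
    return false;
}
```

Applied to the string from comment #8, this reports `flagm2`, `rcpc2`, `jscvt`, `fcma` and `frintts` as present, while `flagm` and `rcpc` as separate tokens are not.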
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
Tamar Christina changed:
           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot gnu.org
   Last reconfirmed|                              |2025-01-10
     Ever confirmed|0                             |1
--- Comment #7 from Tamar Christina ---
Confirmed.
The reason -mcpu=native doesn't work is that this is a big.LITTLE system, and
on big.LITTLE with two unknown CPUs we don't enable the fallback.
The reason -mcpu=native -march=native works is that the native detection fills
the info structs and we enter
  /* Simple case, one core type or just looking for the arch. */
  if (n_cores == 1 || arch)
which makes it behave like a homogeneous system.
On big.LITTLE the big and little cores should have the same features, so we
can move the check outside that if.
Mine.
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #6 from Sam James ---
Created attachment 57546
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57546&action=edit
gcc 14 test results
$ gcc-13 --version
gcc-13 (Gentoo 13.2.1_p20240210 p13) 13.2.1 20240210
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -dumpbase null
$ gcc-13 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -dumpbase null
$ gcc-13 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/13/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -march=armv8-a+crc+lse+rcpc+rdma+dotprod+aes+sha3+fp16fml+sb+ssbs+i8mm+bf16+flagm+pauth -dumpbase null
$ gcc-14 --version
gcc-14 (Gentoo 14.0.1_pre20240211-r1 p22) 14.0.1 20240211 (experimental)
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -dumpbase null
$ gcc-14 -v -E -x c /dev/null -o /dev/null -mcpu=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -dumpbase null
$ gcc-14 -v -E -x c /dev/null -o /dev/null -march=native -mcpu=native 2>&1 | grep /cc1
/usr/libexec/gcc/aarch64-unknown-linux-gnu/14/cc1 -E -quiet -v /dev/null -o /dev/null -mlittle-endian -mabi=lp64 -march=armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth -dumpbase null
Still hosed :(
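Besides grepping the cc1 command line, another way to see whether a native option took effect is to check the ACLE feature-test macros from inside a translation unit built with that option. The sketch below is only an illustration: the table and the names `feature` and `feature_enabled` are hypothetical, and only four of the extensions listed above are covered (`+crc`, `+lse`, `+dotprod`, `+i8mm`).

```c
#include <assert.h>
#include <string.h>

/* Map ACLE feature-test macros to 0/1 so they can be stored in a table. */
#ifdef __ARM_FEATURE_CRC32
#define HAVE_CRC 1
#else
#define HAVE_CRC 0
#endif

#ifdef __ARM_FEATURE_ATOMICS
#define HAVE_LSE 1
#else
#define HAVE_LSE 0
#endif

#ifdef __ARM_FEATURE_DOTPROD
#define HAVE_DOTPROD 1
#else
#define HAVE_DOTPROD 0
#endif

#ifdef __ARM_FEATURE_MATMUL_INT8
#define HAVE_I8MM 1
#else
#define HAVE_I8MM 0
#endif

/* Extension name as it appears in -march strings, plus its compile-time state. */
struct feature {
    const char *name;
    int enabled;
};

static const struct feature features[] = {
    { "crc", HAVE_CRC },
    { "lse", HAVE_LSE },
    { "dotprod", HAVE_DOTPROD },
    { "i8mm", HAVE_I8MM },
};

/* Return 1 if the extension was enabled when this file was compiled,
   0 if it was not, and -1 if the name is not in the table. */
static int feature_enabled(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof features / sizeof features[0]; i++)
        if (strcmp(features[i].name, name) == 0)
            return features[i].enabled;
    return -1;
}
```

On the affected setups one would expect `feature_enabled("dotprod")` to return 0 when the file is built with only -march=native or only -mcpu=native, and 1 when both are given, mirroring the cc1 lines above.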
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #5 from Tamar Christina ---
(In reply to Sam James from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> I'm missing why the combination then works though?
So we've made several changes here over time. -mcpu=native does attempt to
find the core ID to know what it is, but since GCC-12 we also enable
extensions that don't have a requirement on an architecture.
So for instance on an SVE-capable core we would enable it, but we of course
can't enable a tuning model.
But between GCC 13 and GCC 14 many things have changed. It looks like GCC-14
works as expected:
> ./install/bin/gcc -xc -S -o - - -march=native < /dev/null
.arch armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth
.file ""
.text
.ident "GCC: (GNU) 14.0.1 20240129 (experimental)"
.section .note.GNU-stack,"",@progbits
> ./install/bin/gcc -xc -S -o - - -mcpu=native < /dev/null
.arch armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth
.file ""
.text
.ident "GCC: (GNU) 14.0.1 20240129 (experimental)"
.section .note.GNU-stack,"",@progbits
> ./install/bin/gcc -xc -S -o - - -mcpu=native -march=native < /dev/null
.arch armv8-a+flagm+dotprod+rdma+lse+crc+aes+sha3+fp16fml+rcpc+i8mm+bf16+sb+ssbs+pauth
.file ""
.text
.ident "GCC: (GNU) 14.0.1 20240129 (experimental)"
.section .note.GNU-stack,"",@progbits
So the first question is: can you confirm that GCC-14 works for you too?
In the meantime I'll try GCC-13.
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #4 from Richard Earnshaw ---
I'm not sure. My understanding was that -march=native started by looking up
the CPU ID first and then using the internal mapping of that CPU to the
architecture (which can't work if we don't recognize the CPU), but perhaps we
try a bit harder when both are specified.
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #3 from Sam James ---
(In reply to Richard Earnshaw from comment #2)
I'm missing why the combination then works though?
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #2 from Richard Earnshaw ---
For -mcpu=native, the manual says:
Additionally on native AArch64 GNU/Linux systems the value
@samp{native} tunes performance to the host system. This option has no effect
if the compiler is unable to recognize the processor of the host system.
With similar wording for -march=native.
Since nobody has contributed patches to recognize the Apple Silicon cores, I
suspect that is the source of the problem.
[Bug target/113257] -march=native or -mcpu=native are ineffective, but -march=native -mcpu=native works on arm64 M2 Ultra
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #1 from Sam James ---
FWIW, I did try a patch per
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110901#c4 as a quick hack in case
it's the same issue, but it didn't help:
```
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1421,7 +1421,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
CONFIG_TUNE_SPEC
#define MCPU_TO_MARCH_SPEC \
- " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
+ " %{!march=*:%{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}}"
extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
#define MCPU_TO_MARCH_SPEC_FUNCTIONS \
```
