Hi Kyrill, I checked this patch on AMD seattle board and it bootstrapped cleanly with BOOT_CFLAGS=-O2 -mcpu=native and CFLAGS_FOR_TARGET=-O2 -mcpu=native.
With -mcpu=cortex-57, I get .cpu cortex-a57+fp+simd+crc options passed: test.c -mcpu=cortex-a57 -mlittle-endian -mabi=lp64 -auxbase-strip With –mcpu=native, I get .cpu cortex-a57+fp+simd+crypto+crc options passed: test.c -mlittle-endian -mabi=lp64 -mcpu=cortex-a57+fp+simd+crypto+crc -auxbase-strip It matches the features. evtstrm is not relevant to compiler I believe. Another thing is detecting cache sizes automatically. But I think as of now we have not yet included cache parameters in aarch64 tune tables. Regards, Venkat. -----Original Message----- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Kyrill Tkachov Sent: Wednesday, April 22, 2015 8:38 PM To: GCC Patches Cc: Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Evandro Menezes; Andrew Pinski Subject: Re: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo On 22/04/15 12:46, Kyrill Tkachov wrote: > [Sorry for resending twice. My mail client glitched] > > On 20/04/15 16:47, Kyrill Tkachov wrote: >> Hi all, >> >> This is an attempt to add native CPU detection to AArch64 GNU/Linux targets. >> Similar to other ports we use SPEC rewriting to rewrite >> -m{cpu,tune,arch}=native options into the appropriate >> CPU/architecture and the architecture extension options when appropriate >> (i.e. +crypto/+crc etc). >> >> For CPU/architecture detection it gets a bit involved, especially >> when running on a big.LITTLE system. My proposed approach is to look >> at /proc/cpuinfo/ and search for the implementer id and part number >> fields that uniquely identify each core (appropriate identifying >> information is added to aarch64-cores.def). If we find two types of >> core we have a big.LITTLE system, so search through the core >> definitions extracted from aarch64-cores.def to find if we support such a >> combination (currently only cortex-a57.cortex-a53 and cortex-a72.cortex-a53) >> and make sure that the implementer id field matches up. >> >> I tested this on a 4xCortex-A53 + 2xCortex-A57 big.LITTLE Ubuntu GNU/Linux >> system. >> There are two formats for /proc/cpuinfo/ that I'm aware of. The first (old) >> one has the format: >> -------------------------------------- >> processor : 0 >> processor : 1 >> processor : 2 >> processor : 3 >> processor : 4 >> processor : 5 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: AArch64 >> CPU variant : 0x0 >> CPU part : 0xd03 >> -------------------------------------- >> >> In this format it lists the 6 cores but the CPU part it reports is >> only the one for the core from which /proc/cpuinfo was read from (!), in >> this case one of the Cortex-A53 cores. >> This means we detect a different CPU depending on which core GCC was >> invoked on. Not ideal really, but there's no more information that we can >> extract. >> Given the /proc/cpuinfo above, this patch will rewrite -mcpu=native >> into -mcpu=cortex-a53+fp+simd+crypto+crc >> >> The newer /proc/cpuinfo format proposed at >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commi >> t/?id=44b82b7700d05a52cd983799d3ecde1a976b3bed >> looks like this: >> >> -------------------------------------------------------------- >> processor : 0 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd03 >> CPU revision : 0 >> >> processor : 1 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd03 >> CPU revision : 0 >> >> processor : 2 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd03 >> CPU revision : 0 >> >> processor : 3 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd03 >> CPU revision : 0 >> >> processor : 4 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd07 >> CPU revision : 0 >> >> processor : 5 >> Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 >> CPU implementer : 0x41 >> CPU architecture: 8 >> CPU variant : 0x0 >> CPU part : 0xd07 >> CPU revision : 0 >> -------------------------------------------------------------- >> >> The Features field is used to detect the architectural features that >> we map to GCC option extensions i.e. +fp,+crypto,+simd,+crc etc. >> >> Similarly, -march=native would be rewritten into >> -march=armv8-a+fp+simd+crypto+crc while -mtune=native into >> -march=cortex-a57.cortex-a53 (the arch extension options are not valid for >> -mtune). >> >> If it detects more than one implementer ID or the implementer IDs not >> matching up somewhere or some other weirdness /proc/cpuinfo or fails >> to recognise the CPU it will bail out and ignore the option entirely >> (similarly to other ports). >> >> The patch works fine with both /proc/cpuinfo formats although, as >> mentioned above, it will not be able to detect the big.LITTLE combination >> from the first format. >> >> I've filled in the implementer ID and part numbers for the >> Cortex-A57, Cortex-A53, Cortex-A72, X-Gene 1 cores, but I don't have >> that info for thunderx or exynosm1. Could someone from Cavium and Samsung >> help me out here? At present this patch has some false dummy values that I'd >> like to fill out before committing this. > Thanks Andrew and Evandro for the info. > I've added the numbers to the patch, so it should work on those systems. > I'm attaching the final patch here for review. And resending here with a minor whitespace change in aarch64-cores.def to make thunderx line up with the other entries. Thanks Evandro for pointing it out! Kyrill > > Thanks, > Kyrill > > > 2014-04-22 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > * config.host (case ${host}): Add aarch64*-*-linux case. > * config/aarch64/aarch64-cores.def: Add IMPLEMENTER_ID and PART_NUMBER > fields to all the cores. > * config/aarch64/aarch64-elf.h (DRIVER_SELF_SPECS): Add > MCPU_MTUNE_NATIVE_SPECS. > * config/aarch64/aarch64-option-extensions.def: Add FEAT_STRING field > to all > extensions. > * config/aarch64/aarch64-opts.h: Adjust definition of AARCH64_CORE. > * config/aarch64/aarch64.c: Adjust definition of AARCH64_CORE. > Adjust definition of AARCH64_OPT_EXTENSION. > * config/aarch64/aarch64.h: Adjust definition of AARCH64_CORE. > (MCPU_MTUNE_NATIVE_SPECS): Define. > * config/aarch64/driver-aarch64.c: New file. > * config/aarch64/x-arch64: New file. > * doc/invoke.texi (AArch64 Options): Document native value for -mcpu, > -mtune and -march. > >> I've bootstrapped this on the system mentioned above with -mcpu=native in >> the BOOT_CFLAGS and regtested as well. >> For the bootstrap I've used the 2nd /proc/cpuinfo format. >> >> I've also tested it on AArch64 hardware from ARM Ltd. and the ecosystem. >> >> If using the first format the bootstrap fails the comparison because, >> depending on the OS scheduling, some files are compiled with >> Cortex-A57 tuning and some with Cortex-A53 tuning and this is practically >> non-deterministic across stage2 and stage3! >> >> What do people think of this approach? >> >> 2014-04-20 Kyrylo Tkachov <kyrylo.tkac...@arm.com> >> >> * config.host (case ${host}): Add aarch64*-*-linux case. >> * config/aarch64/aarch64-cores.def: Add IMPLEMENTER_ID and PART_NUMBER >> fields to all the cores. >> * config/aarch64/aarch64-elf.h (DRIVER_SELF_SPECS): Add >> MCPU_MTUNE_NATIVE_SPECS. >> * config/aarch64/aarch64-option-extensions.def: Add FEAT_STRING field >> to all >> extensions. >> * config/aarch64/aarch64-opts.h: Adjust definition of AARCH64_CORE. >> * config/aarch64/aarch64.c: Adjust definition of AARCH64_CORE. >> Adjust definition of AARCH64_OPT_EXTENSION. >> * config/aarch64/aarch64.h: Adjust definition of AARCH64_CORE. >> (MCPU_MTUNE_NATIVE_SPECS): Define. >> * config/aarch64/driver-aarch64.c: New file. >> * config/aarch64/x-arch64: New file. >> * doc/invoke.texi (AArch64 Options): Document native value for -mcpu, >> -mtune and -march. > > aarch64-native.patch > > > commit bfdf31b9d71620afac43b15ebf31502022a9bc63 > Author: Kyrylo Tkachov<kyrylo.tkac...@arm.com> > Date: Fri Apr 10 16:39:27 2015 +0100 > > [AArch64] Implement -m{tune,cpu,arch}=native on AArch64 > GNU/Linux > > diff --git a/gcc/config.host b/gcc/config.host index b0f5940..a8896d1 > 100644 > --- a/gcc/config.host > +++ b/gcc/config.host > @@ -99,6 +99,14 @@ case ${host} in > esac > > case ${host} in > + aarch64*-*-linux*) > + case ${target} in > + aarch64*-*-*) > + host_extra_gcc_objs="driver-aarch64.o" > + host_xmake_file="${host_xmake_file} aarch64/x-aarch64" > + ;; > + esac > + ;; > arm*-*-freebsd* | arm*-*-linux*) > case ${target} in > arm*-*-*) > diff --git a/gcc/config/aarch64/aarch64-cores.def > b/gcc/config/aarch64/aarch64-cores.def > index e46d91b..7e6cb73 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -21,7 +21,7 @@ > > Before using #include to read this file, define a macro: > > - AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, FLAGS, > COSTS) > + AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHEDULER_IDENT, ARCH, > + FLAGS, COSTS, IMP, PART) > > The CORE_NAME is the name of the core, represented as a string constant. > The CORE_IDENT is the name of the core, represented as an identifier. > @@ -30,18 +30,23 @@ > ARCH is the architecture revision implemented by the chip. > FLAGS are the bitwise-or of the traits that apply to that core. > This need not include flags implied by the architecture. > - COSTS is the name of the rtx_costs routine to use. */ > + COSTS is the name of the rtx_costs routine to use. > + IMP is the implementer ID of the CPU vendor. On a GNU/Linux system it can > + be found in /proc/cpuinfo. > + PART is the part number of the CPU. On a GNU/Linux system it can be found > + in /proc/cpuinfo. For big.LITTLE systems this should have the form at of > + "<big core part number>.<LITTLE core part number>". */ > > /* V8 Architecture Processors. */ > > -AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, > AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53) > -AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC, cortexa57) -AARCH64_CORE("cortex-a72", cortexa72, cortexa57, > 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) > -AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57) > -AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx) > -AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARCH8, > xgene1) > +AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8, > +AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03") > +AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC, cortexa57, "0x41", "0xd07") AARCH64_CORE("cortex-a72", > cortexa72, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, > "0x41", "0xd08") > +AARCH64_CORE("exynos-m1", exynosm1, cortexa57, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC | AARCH64_FL_CRYPTO, cortexa57, "0x53", "0x001") > +AARCH64_CORE("thunderx", thunderx, thunderx, 8, AARCH64_FL_FOR_ARCH8 | > AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1") > +AARCH64_CORE("xgene1", xgene1, xgene1, 8, AARCH64_FL_FOR_ARCH8, > xgene1, "0x50", "0x000") > > /* V8 big.LITTLE implementations. */ > > -AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, > 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) > -AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, > 8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57) > +AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, > +8, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", > +"0xd07.0xd03") AARCH64_CORE("cortex-a72.cortex-a53", > +cortexa72cortexa53, cortexa53, 8, AARCH64_FL_FOR_ARCH8 | > +AARCH64_FL_CRC, cortexa57, "0x41", "0xd08.0xd03") > diff --git a/gcc/config/aarch64/aarch64-elf.h > b/gcc/config/aarch64/aarch64-elf.h > index a5ec8cb..1ce6343 100644 > --- a/gcc/config/aarch64/aarch64-elf.h > +++ b/gcc/config/aarch64/aarch64-elf.h > @@ -132,7 +132,8 @@ > #undef DRIVER_SELF_SPECS > #define DRIVER_SELF_SPECS \ > " %{!mbig-endian:%{!mlittle-endian:" ENDIAN_SPEC "}}" \ > - " %{!mabi=*:" ABI_SPEC "}" > + " %{!mabi=*:" ABI_SPEC "}" \ > + MCPU_MTUNE_NATIVE_SPECS > > #ifdef HAVE_AS_MABI_OPTION > #define ASM_MABI_SPEC "%{mabi=*:-mabi=%*}" > diff --git a/gcc/config/aarch64/aarch64-option-extensions.def > b/gcc/config/aarch64/aarch64-option-extensions.def > index 6ec3ed6..f296296 100644 > --- a/gcc/config/aarch64/aarch64-option-extensions.def > +++ b/gcc/config/aarch64/aarch64-option-extensions.def > @@ -21,18 +21,25 @@ > > Before using #include to read this file, define a macro: > > - AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF) > + AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, > + FEATURE_STRING) > > EXT_NAME is the name of the extension, represented as a string constant. > FLAGS_ON are the bitwise-or of the features that the extension adds. > - FLAGS_OFF are the bitwise-or of the features that the extension removes. > */ > + FLAGS_OFF are the bitwise-or of the features that the extension removes. > + FEAT_STRING is a string containing the entries in the 'Features' field of > + /proc/cpuinfo on a GNU/Linux system that correspond to this architecture > + extension being available. Sometimes multiple entries are needed to > enable > + the extension (for example, the 'crypto' extension depends on four > + entries: aes, pmull, sha1, sha2 being present). In that case this field > + should contain a whitespace-separated list of the strings in 'Features' > + that are required. Their order is not important. */ > > /* V8 Architecture Extensions. > This list currently contains example extensions for CPUs that implement > AArch64, and therefore serves as a template for adding more CPUs in the > future. */ > > -AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, AARCH64_FL_FPSIMD | > AARCH64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, AARCH64_FL_SIMD | > AARCH64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | > AARCH64_FL_FPSIMD, AARCH64_FL_CRYPTO) > -AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, AARCH64_FL_CRC) > +AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, > AARCH64_FL_FPSIMD | AARCH64_FL_CRYPTO, "fp") > +AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, > AARCH64_FL_SIMD | AARCH64_FL_CRYPTO, "asimd") > +AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | > AARCH64_FL_FPSIMD, AARCH64_FL_CRYPTO, "aes pmull sha1 > sha2") > +AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, > AARCH64_FL_CRC, "crc32") > diff --git a/gcc/config/aarch64/aarch64-opts.h > b/gcc/config/aarch64/aarch64-opts.h > index f88ae5b..ea64cf4 100644 > --- a/gcc/config/aarch64/aarch64-opts.h > +++ b/gcc/config/aarch64/aarch64-opts.h > @@ -25,7 +25,7 @@ > /* The various cores that implement AArch64. */ > enum aarch64_processor > { > -#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) > \ > +#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, > +IMP, PART) \ > INTERNAL_IDENT, > #include "aarch64-cores.def" > #undef AARCH64_CORE > diff --git a/gcc/config/aarch64/aarch64.c > b/gcc/config/aarch64/aarch64.c index 954e110..ea6020f 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -441,7 +441,7 @@ struct processor > /* Processor cores implementing AArch64. */ > static const struct processor all_cores[] = > { > -#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS) \ > +#define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, > +PART) \ > {NAME, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings}, > #include "aarch64-cores.def" > #undef AARCH64_CORE > @@ -478,7 +478,7 @@ struct aarch64_option_extension > /* ISA extensions in AArch64. */ > static const struct aarch64_option_extension all_extensions[] = > { > -#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF) \ > +#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF, > +FEATURE_STRING) \ > {NAME, FLAGS_ON, FLAGS_OFF}, > #include "aarch64-option-extensions.def" > #undef AARCH64_OPT_EXTENSION > diff --git a/gcc/config/aarch64/aarch64.h > b/gcc/config/aarch64/aarch64.h index bf59e40..1f7187b 100644 > --- a/gcc/config/aarch64/aarch64.h > +++ b/gcc/config/aarch64/aarch64.h > @@ -506,7 +506,7 @@ enum reg_class > > enum target_cpus > { > -#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS) > \ > +#define AARCH64_CORE(NAME, INTERNAL_IDENT, SCHED, ARCH, FLAGS, COSTS, > +IMP, PART) \ > TARGET_CPU_##INTERNAL_IDENT, > #include "aarch64-cores.def" > #undef AARCH64_CORE > @@ -929,11 +929,24 @@ extern const char *aarch64_rewrite_mcpu (int argc, > const char **argv); > #define BIG_LITTLE_CPU_SPEC_FUNCTIONS \ > { "rewrite_mcpu", aarch64_rewrite_mcpu }, > > +#if defined(__aarch64__) > +extern const char *host_detect_local_cpu (int argc, const char **argv); > +# define EXTRA_SPEC_FUNCTIONS \ > + { "local_cpu_detect", host_detect_local_cpu }, \ > + BIG_LITTLE_CPU_SPEC_FUNCTIONS > + > +# define MCPU_MTUNE_NATIVE_SPECS \ > + " %{march=native:%<march=native %:local_cpu_detect(arch)}" \ > + " %{mcpu=native:%<mcpu=native %:local_cpu_detect(cpu)}" \ > + " %{mtune=native:%<mtune=native %:local_cpu_detect(tune)}" > +#else > +# define MCPU_MTUNE_NATIVE_SPECS "" > +# define EXTRA_SPEC_FUNCTIONS BIG_LITTLE_CPU_SPEC_FUNCTIONS #endif > + > #define ASM_CPU_SPEC \ > BIG_LITTLE_SPEC > > -#define EXTRA_SPEC_FUNCTIONS BIG_LITTLE_CPU_SPEC_FUNCTIONS > - > #define EXTRA_SPECS \ > { "asm_cpu_spec", ASM_CPU_SPEC } > > diff --git a/gcc/config/aarch64/driver-aarch64.c > b/gcc/config/aarch64/driver-aarch64.c > new file mode 100644 > index 0000000..da10a4c > --- /dev/null > +++ b/gcc/config/aarch64/driver-aarch64.c > @@ -0,0 +1,307 @@ > +/* Native CPU detection for aarch64. > + Copyright (C) 2014 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify > + it under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + GNU General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > + > +struct arch_extension > +{ > + const char *ext; > + const char *feat_string; > +}; > + > +#define AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF, > +FEATURE_STRING) \ > + { EXT_NAME, FEATURE_STRING }, > +static struct arch_extension ext_to_feat_string[] = { #include > +"aarch64-option-extensions.def" > +}; > +#undef AARCH64_OPT_EXTENSION > + > + > +struct aarch64_core_data > +{ > + const char* name; > + const char* arch; > + const char* implementer_id; > + const char* part_no; > +}; > + > +#define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, > +COSTS, IMP, PART) \ > + { CORE_NAME, #ARCH, IMP, PART }, > + > +static struct aarch64_core_data cpu_data [] = { #include > +"aarch64-cores.def" > + { NULL, NULL, NULL, NULL } > +}; > + > +#undef AARCH64_CORE > + > +struct aarch64_arch > +{ > + const char* id; > + const char* name; > +}; > + > +#define AARCH64_ARCH(NAME, CORE, ARCH, FLAGS) \ > + { #ARCH, NAME }, > + > +static struct aarch64_arch aarch64_arches [] = { #include > +"aarch64-arches.def" > + {NULL, NULL} > +}; > + > +#undef AARCH64_ARCH > + > +/* Return the full architecture name string corresponding to the > + identifier ID. */ > + > +static const char* > +get_arch_name_from_id (const char* id) { > + unsigned int i = 0; > + > + for (i = 0; aarch64_arches[i].id != NULL; i++) > + { > + if (strcmp (id, aarch64_arches[i].id) == 0) > + return aarch64_arches[i].name; > + } > + > + return NULL; > +} > + > + > +/* Check wether the string CORE contains the same CPU part numbers > + as BL_STRING. For example CORE="{0xd03, 0xd07}" and > BL_STRING="0xd07.0xd03" > + should return true. */ > + > +static bool > +valid_bL_string_p (const char** core, const char* bL_string) { > + return strstr (bL_string, core[0]) != NULL > + && strstr (bL_string, core[1]) != NULL; } > + > +/* Return true iff ARR contains STR in one of its two elements. */ > + > +static bool > +contains_string_p (const char** arr, const char* str) { > + bool res = false; > + > + if (arr[0] != NULL) > + { > + res = strstr (arr[0], str) != NULL; > + if (res) > + return res; > + > + if (arr[1] != NULL) > + return strstr (arr[1], str) != NULL; > + } > + > + return false; > +} > + > +/* This will be called by the spec parser in gcc.c when it sees > + a %:local_cpu_detect(args) construct. Currently it will be called > + with either "arch", "cpu" or "tune" as argument depending on if > + -march=native, -mcpu=native or -mtune=native is to be substituted. > + > + It returns a string containing new command line parameters to be > + put at the place of the above two options, depending on what CPU > + this is executed. E.g. "-march=armv8-a" on a Cortex-A57 for > + -march=native. If the routine can't detect a known processor, > + the -march or -mtune option is discarded. > + > + For -mtune and -mcpu arguments it attempts to detect the CPU or > + a big.LITTLE system. > + ARGC and ARGV are set depending on the actual arguments given > + in the spec. */ > + > +const char * > +host_detect_local_cpu (int argc, const char **argv) { > + const char *arch_id = NULL; > + const char *res = NULL; > + static const int num_exts = ARRAY_SIZE (ext_to_feat_string); > + char buf[128]; > + FILE *f = NULL; > + bool arch = false; > + bool tune = false; > + bool cpu = false; > + unsigned int i = 0; > + unsigned int core_idx = 0; > + const char* imps[2] = { NULL, NULL }; > + const char* cores[2] = { NULL, NULL }; > + unsigned int n_cores = 0; > + unsigned int n_imps = 0; > + bool processed_exts = false; > + const char *ext_string = ""; > + > + gcc_assert (argc); > + > + if (!argv[0]) > + goto not_found; > + > + /* Are we processing -march, mtune or mcpu? */ arch = strcmp > + (argv[0], "arch") == 0; if (!arch) > + tune = strcmp (argv[0], "tune") == 0; > + > + if (!arch && !tune) > + cpu = strcmp (argv[0], "cpu") == 0; > + > + if (!arch && !tune && !cpu) > + goto not_found; > + > + f = fopen ("/proc/cpuinfo", "r"); > + > + if (f == NULL) > + goto not_found; > + > + /* Look through /proc/cpuinfo to determine the implementer > + and then the part number that identifies a particular core. */ > + while (fgets (buf, sizeof (buf), f) != NULL) > + { > + if (strstr (buf, "implementer") != NULL) > + { > + for (i = 0; cpu_data[i].name != NULL; i++) > + if (strstr (buf, cpu_data[i].implementer_id) != NULL > + && !contains_string_p (imps, cpu_data[i].implementer_id)) > + { > + if (n_imps == 2) > + goto not_found; > + > + imps[n_imps++] = cpu_data[i].implementer_id; > + > + break; > + } > + continue; > + } > + > + if (strstr (buf, "part") != NULL) > + { > + for (i = 0; cpu_data[i].name != NULL; i++) > + if (strstr (buf, cpu_data[i].part_no) != NULL > + && !contains_string_p (cores, cpu_data[i].part_no)) > + { > + if (n_cores == 2) > + goto not_found; > + > + cores[n_cores++] = cpu_data[i].part_no; > + core_idx = i; > + arch_id = cpu_data[i].arch; > + break; > + } > + continue; > + } > + if (!tune && !processed_exts && strstr (buf, "Features") != NULL) > + { > + for (i = 0; i < num_exts; i++) > + { > + bool enabled = true; > + char *p = NULL; > + char *feat_string = concat > + (ext_to_feat_string[i].feat_string, NULL); > + > + p = strtok (feat_string, " "); > + > + while (p != NULL) > + { > + if (strstr (buf, p) == NULL) > + { > + enabled = false; > + break; > + } > + p = strtok (NULL, " "); > + } > + ext_string = concat (ext_string, "+", enabled ? "" : "no", > + ext_to_feat_string[i].ext, NULL); > + } > + processed_exts = true; > + } > + } > + > + fclose (f); > + f = NULL; > + > + /* Weird cpuinfo format that we don't know how to handle. */ if > + (n_cores == 0 || n_cores > 2 || n_imps != 1) > + goto not_found; > + > + if (arch && !arch_id) > + goto not_found; > + > + if (arch) > + { > + const char* arch_name = get_arch_name_from_id (arch_id); > + > + /* We got some arch indentifier that's not in aarch64-arches.def? */ > + if (!arch_name) > + goto not_found; > + > + res = concat ("-march=", arch_name, NULL); > + } > + /* We have big.LITTLE. */ > + else if (n_cores == 2) > + { > + for (i = 0; cpu_data[i].name != NULL; i++) > + { > + if (strchr (cpu_data[i].part_no, '.') != NULL > + && strncmp (cpu_data[i].implementer_id, imps[0], strlen > (imps[0]) - 1) == 0 > + && valid_bL_string_p (cores, cpu_data[i].part_no)) > + { > + res = concat ("-m", cpu ? "cpu" : "tune", "=", > cpu_data[i].name, NULL); > + break; > + } > + } > + if (!res) > + goto not_found; > + } > + /* The simple, non-big.LITTLE case. */ else > + { > + if (strncmp (cpu_data[core_idx].implementer_id, imps[0], > + strlen (imps[0]) - 1) != 0) > + goto not_found; > + > + res = concat ("-m", cpu ? "cpu" : "tune", "=", > + cpu_data[core_idx].name, NULL); > + } > + > + if (tune) > + return res; > + > + res = concat (res, ext_string, NULL); > + > + return res; > + > +not_found: > + { > + /* If detection fails we ignore the option. > + Clean up and return empty string. */ > + > + if (f) > + fclose (f); > + > + return ""; > + } > +} > + > diff --git a/gcc/config/aarch64/x-aarch64 > b/gcc/config/aarch64/x-aarch64 new file mode 100644 index > 0000000..8c09e04 > --- /dev/null > +++ b/gcc/config/aarch64/x-aarch64 > @@ -0,0 +1,3 @@ > +driver-aarch64.o: $(srcdir)/config/aarch64/driver-aarch64.c \ > + $(CONFIG_H) $(SYSTEM_H) > + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) > +$(INCLUDES) $< > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index > e2918cb..5787524 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -12318,8 +12318,12 @@ This involves inserting a NOP instruction between > memory instructions and > Specify the name of the target architecture, optionally suffixed by one or > more feature modifiers. This option has the form > @option{-march=@var{arch}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, > where the -only permissible value for @var{arch} is @samp{armv8-a}. > The permissible -values for @var{feature} are documented in the sub-section > below. > +only permissible value for @var{arch} is @samp{armv8-a}. > +The permissible values for @var{feature} are documented in the > +sub-section below. Additionally on native AArch64 GNU/Linux systems > +the value @samp{native} is available. This option causes the > +compiler to pick the architecture of the host system. If the > +compiler is unable to recognize the architecture of the host system this > option has no effect. > > Where conflicting feature modifiers are specified, the right-most feature > is > used. > @@ -12343,6 +12347,13 @@ Additionally, this option can specify that GCC > should tune the performance > of the code for a big.LITTLE system. Permissible values for this > option are: @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53}. > > +Additionally on native AArch64 GNU/Linux systems the value > +@samp{native} is available. > +This option causes the compiler to pick the architecture of and tune > +the performance of the code for the processor of the host system. > +If the compiler is unable to recognize the processor of the host > +system this option has no effect. > + > Where none of @option{-mtune=}, @option{-mcpu=} or @option{-march=} > are specified, the code is tuned to perform well across a range > of target processors. > @@ -12355,7 +12366,11 @@ Specify the name of the target processor, optionally > suffixed by one or more > feature modifiers. This option has the form > @option{-mcpu=@var{cpu}@r{@{}+@r{[}no@r{]}@var{feature}@r{@}*}}, where the > permissible values for @var{cpu} are the same as those available > for -@option{-mtune}. > +@option{-mtune}. Additionally on native AArch64 GNU/Linux systems > +the value @samp{native} is available. > +This option causes the compiler to tune the performance of the code > +for the processor of the host system. If the compiler is unable to > +recognize the processor of the host system this option has no effect. > > The permissible values for @var{feature} are documented in the sub-section > below. >