On Fri, Oct 21, 2016 at 9:28 AM, James Greenhalgh <james.greenha...@arm.com> wrote: > On Fri, Oct 21, 2016 at 04:57:22PM +0100, Richard Earnshaw (lists) wrote: >> On 21/10/16 14:59, James Greenhalgh wrote: >> > On Sat, Oct 15, 2016 at 07:38:40PM -0700, Andrew Pinski wrote: >> >> On Wed, Nov 25, 2015 at 11:59 AM, Andrew Pinski <pins...@gmail.com> wrote: >> >> Here is finally an updated (fixed) patch (I did not implement the two >> >> implementer big.LITTLE support yet, that will be for a different patch >> >> since I also fixed the part no not being unique as a separate patch. >> >> Once I get a new enough kernel, I will also look into doing the >> >> /sys/cpu/* style detection first. >> >> >> >> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions >> >> (and tested hacking the location of the read in file to see if it >> >> works with big.LITTLE and other formats of /proc/cpuinfo). >> > >> > I'm OK with this in principle, but it needs some polish for pedantic >> > style comments... >> > >> >> * config/aarch64/aarch64-cores.def: Rewrite so IMP and PART are >> >> integer constants. >> >> * config/aarch64/driver-aarch64.c (struct aarch64_core_data): Change >> >> implementer_id to unsigned char. >> >> Change part_no to unsigned int. >> >> (AARCH64_BIG_LITTLE): New define. >> >> (INVALID_IMP): New define. >> >> (INVALID_CORE): New define. >> >> (cpu_data): Change the last element's implementer_id and part_no to >> >> integers. >> >> (valid_bL_string_p): Rewrite to .. >> >> (valid_bL_core_p): this for integers instead of strings. >> >> (parse_field): New function. >> >> (contains_string_p): Rewrite to ... >> >> (contains_core_p): this for integers and only for the part_no. >> >> (host_detect_local_cpu): Rewrite handling of implementation and part >> >> num to be integers; >> >> simplifying the code. >> > >> >> Index: config/aarch64/aarch64-cores.def >> >> =================================================================== >> >> --- config/aarch64/aarch64-cores.def (revision 241200) >> >> +++ config/aarch64/aarch64-cores.def (working copy) >> >> @@ -32,43 +32,46 @@ >> >> FLAGS are the bitwise-or of the traits that apply to that core. >> >> This need not include flags implied by the architecture. >> >> COSTS is the name of the rtx_costs routine to use. >> >> - IMP is the implementer ID of the CPU vendor. On a GNU/Linux system >> >> it can >> >> - be found in /proc/cpuinfo. >> >> + IMP is the implementer ID of the CPU vendor. On a GNU/Linux system it >> >> + can be found in /proc/cpuinfo. A partial list of implementer IDs is >> >> + given in the ARM Architecture Reference Manual ARMv8, for >> >> - in /proc/cpuinfo. For big.LITTLE systems this should have the form >> >> at of >> >> - "<big core part number>.<LITTLE core part number>". */ >> >> + in /proc/cpuinfo. For big.LITTLE systems this should use the macro >> >> AARCH64_BIG_LITTLE >> >> + where the big part number comes as the first arugment to the macro >> >> and little is the >> >> + second. */ >> > >> > Needs rewrapped for 80 char width. >> > >> >> I don't think it's a good idea to line wrap the def files, some of them >> are processed with AWK during configure and having a complete entry per >> line avoids potential matching problems. > > Agreed (and essential) for the entries themselves. This is just a comment > that hangs over the end and should be fixed.
Yes I agree too. I think with the entries over 80 char width made me miss the comment being over 80. Anyways I fixed it. > > While I'm here... > >> >> + where the big part number comes as the first arugment to the macro >> >> and little is the > > s/arugment/argument. Fixed. I also changed the comment for parse_field to include why it would return -1 and that it parses hex rather than any base (I noticed this due to request to fix the style of the function itself :)). Attached is what I am applying after another bootstrap/test on aarch64-linux-gnu. Thanks, Andrew * config/aarch64/aarch64-cores.def: Rewrite so IMP and PART are integer constants. * config/aarch64/driver-aarch64.c (struct aarch64_core_data): Change implementer_id to unsigned char. Change part_no to unsigned int. (AARCH64_BIG_LITTLE): New define. (INVALID_IMP): New define. (INVALID_CORE): New define. (cpu_data): Change the last element's implementer_id and part_no to integers. (valid_bL_string_p): Rewrite to .. (valid_bL_core_p): this for integers instead of strings. (parse_field): New function. (contains_string_p): Rewrite to ... (contains_core_p): this for integers and only for the part_no. (host_detect_local_cpu): Rewrite handling of implementation and part num to be integers; simplifying the code. > > Cheers, > James >
Index: config/aarch64/aarch64-cores.def =================================================================== --- config/aarch64/aarch64-cores.def (revision 241432) +++ config/aarch64/aarch64-cores.def (working copy) @@ -32,43 +32,46 @@ FLAGS are the bitwise-or of the traits that apply to that core. This need not include flags implied by the architecture. COSTS is the name of the rtx_costs routine to use. - IMP is the implementer ID of the CPU vendor. On a GNU/Linux system it can - be found in /proc/cpuinfo. - PART is the part number of the CPU. On a GNU/Linux system it can be found - in /proc/cpuinfo. For big.LITTLE systems this should have the form at of - "<big core part number>.<LITTLE core part number>". */ + IMP is the implementer ID of the CPU vendor. On a GNU/Linux system it + can be found in /proc/cpuinfo. A partial list of implementer IDs is + given in the ARM Architecture Reference Manual ARMv8, for + ARMv8-A architecture profile. + PART is the part number of the CPU. On a GNU/Linux system it can be + found in /proc/cpuinfo. For big.LITTLE systems this should use the + macro AARCH64_BIG_LITTLE where the big part number comes as the first + argument to the macro and little is the second. */ /* V8 Architecture Processors. */ /* ARM ('A') cores. */ -AARCH64_CORE("cortex-a35", cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa35, "0x41", "0xd04") -AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, "0x41", "0xd03") -AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07") -AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08") -AARCH64_CORE("cortex-a73", cortexa73, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09") +AARCH64_CORE("cortex-a35", cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa35, 0x41, 0xd04) +AARCH64_CORE("cortex-a53", cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53, 0x41, 0xd03) +AARCH64_CORE("cortex-a57", cortexa57, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, 0xd07) +AARCH64_CORE("cortex-a72", cortexa72, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, 0x41, 0xd08) +AARCH64_CORE("cortex-a73", cortexa73, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, 0xd09) /* Samsung ('S') cores. */ -AARCH64_CORE("exynos-m1", exynosm1, exynosm1, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1, "0x53", "0x001") +AARCH64_CORE("exynos-m1", exynosm1, exynosm1, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1, 0x53, 0x001) /* Qualcomm ('Q') cores. */ -AARCH64_CORE("qdf24xx", qdf24xx, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx, "0x51", "0x800") +AARCH64_CORE("qdf24xx", qdf24xx, cortexa57, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx, 0x51, 0x800) /* Cavium ('C') cores. */ -AARCH64_CORE("thunderx", thunderx, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, "0x43", "0x0a1") +AARCH64_CORE("thunderx", thunderx, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, 0x0a1) /* APM ('P') cores. */ -AARCH64_CORE("xgene1", xgene1, xgene1, 8A, AARCH64_FL_FOR_ARCH8, xgene1, "0x50", "0x000") +AARCH64_CORE("xgene1", xgene1, xgene1, 8A, AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000) /* V8.1 Architecture Processors. */ /* Broadcom ('B') cores. */ -AARCH64_CORE("vulcan", vulcan, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, vulcan, "0x42", "0x516") +AARCH64_CORE("vulcan", vulcan, cortexa57, 8_1A, AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, vulcan, 0x42, 0x516) /* V8 big.LITTLE implementations. */ -AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, "0x41", "0xd07.0xd03") -AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, "0x41", "0xd08.0xd03") -AARCH64_CORE("cortex-a73.cortex-a35", cortexa73cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09.0xd04") -AARCH64_CORE("cortex-a73.cortex-a53", cortexa73cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, "0x41", "0xd09.0xd03") +AARCH64_CORE("cortex-a57.cortex-a53", cortexa57cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57, 0x41, AARCH64_BIG_LITTLE (0xd07, 0xd03)) +AARCH64_CORE("cortex-a72.cortex-a53", cortexa72cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa72, 0x41, AARCH64_BIG_LITTLE (0xd08, 0xd03)) +AARCH64_CORE("cortex-a73.cortex-a35", cortexa73cortexa35, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd04)) +AARCH64_CORE("cortex-a73.cortex-a53", cortexa73cortexa53, cortexa53, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd09, 0xd03)) #undef AARCH64_CORE Index: config/aarch64/driver-aarch64.c =================================================================== --- config/aarch64/driver-aarch64.c (revision 241432) +++ config/aarch64/driver-aarch64.c (working copy) @@ -46,18 +46,23 @@ struct aarch64_core_data { const char* name; const char* arch; - const char* implementer_id; - const char* part_no; + unsigned char implementer_id; /* Exactly 8 bits */ + unsigned int part_no; /* 12 bits + 12 bits */ const unsigned long flags; }; +#define AARCH64_BIG_LITTLE(BIG, LITTLE) \ + (((BIG)&0xFFFu) << 12 | ((LITTLE) & 0xFFFu)) +#define INVALID_IMP ((unsigned char) -1) +#define INVALID_CORE ((unsigned)-1) + #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \ { CORE_NAME, #ARCH, IMP, PART, FLAGS }, static struct aarch64_core_data aarch64_cpu_data[] = { #include "aarch64-cores.def" - { NULL, NULL, NULL, NULL, 0 } + { NULL, NULL, INVALID_IMP, INVALID_CORE, 0 } }; @@ -95,32 +100,42 @@ get_arch_from_id (const char* id) return NULL; } -/* Check wether the string CORE contains the same CPU part numbers - as BL_STRING. For example CORE="{0xd03, 0xd07}" and BL_STRING="0xd07.0xd03" - should return true. */ +/* Check wether the CORE array is the same as the big.LITTLE BL_CORE. + For an example CORE={0xd08, 0xd03} and + BL_CORE=AARCH64_BIG_LITTLE (0xd08, 0xd03) will return true. */ static bool -valid_bL_string_p (const char** core, const char* bL_string) +valid_bL_core_p (unsigned int *core, unsigned int bL_core) +{ + return AARCH64_BIG_LITTLE (core[0], core[1]) == bL_core + || AARCH64_BIG_LITTLE (core[1], core[0]) == bL_core; +} + +/* Returns the hex integer that is after ':' for the FIELD. + Returns -1 is returned if there was problem parsing the integer. */ +static unsigned +parse_field (const char *field) { - return strstr (bL_string, core[0]) != NULL - && strstr (bL_string, core[1]) != NULL; + const char *rest = strchr (field, ':'); + char *after; + unsigned fint = strtol (rest + 1, &after, 16); + if (after == rest + 1) + return -1; + return fint; } -/* Return true iff ARR contains STR in one of its two elements. */ +/* Return true iff ARR contains CORE, in either of the two elements. */ static bool -contains_string_p (const char** arr, const char* str) +contains_core_p (unsigned *arr, unsigned core) { - bool res = false; - - if (arr[0] != NULL) + if (arr[0] != INVALID_CORE) { - res = strstr (arr[0], str) != NULL; - if (res) - return res; + if (arr[0] == core) + return true; - if (arr[1] != NULL) - return strstr (arr[1], str) != NULL; + if (arr[1] != INVALID_CORE) + return arr[1] == core; } return false; @@ -155,10 +170,9 @@ host_detect_local_cpu (int argc, const c bool cpu = false; unsigned int i = 0; unsigned int core_idx = 0; - const char* imps[2] = { NULL, NULL }; - const char* cores[2] = { NULL, NULL }; + unsigned char imp = INVALID_IMP; + unsigned int cores[2] = { INVALID_CORE, INVALID_CORE }; unsigned int n_cores = 0; - unsigned int n_imps = 0; bool processed_exts = false; const char *ext_string = ""; unsigned long extension_flags = 0; @@ -191,31 +205,28 @@ host_detect_local_cpu (int argc, const c { if (strstr (buf, "implementer") != NULL) { - for (i = 0; aarch64_cpu_data[i].name != NULL; i++) - if (strstr (buf, aarch64_cpu_data[i].implementer_id) != NULL - && !contains_string_p (imps, - aarch64_cpu_data[i].implementer_id)) - { - if (n_imps == 2) - goto not_found; - - imps[n_imps++] = aarch64_cpu_data[i].implementer_id; - - break; - } - continue; + unsigned cimp = parse_field (buf); + if (cimp == INVALID_IMP) + goto not_found; + + if (imp == INVALID_IMP) + imp = cimp; + /* FIXME: BIG.little implementers are always equal. */ + else if (imp != cimp) + goto not_found; } if (strstr (buf, "part") != NULL) { + unsigned ccore = parse_field (buf); for (i = 0; aarch64_cpu_data[i].name != NULL; i++) - if (strstr (buf, aarch64_cpu_data[i].part_no) != NULL - && !contains_string_p (cores, aarch64_cpu_data[i].part_no)) + if (ccore == aarch64_cpu_data[i].part_no + && !contains_core_p (cores, ccore)) { if (n_cores == 2) goto not_found; - cores[n_cores++] = aarch64_cpu_data[i].part_no; + cores[n_cores++] = ccore; core_idx = i; arch_id = aarch64_cpu_data[i].arch; break; @@ -262,7 +273,7 @@ host_detect_local_cpu (int argc, const c f = NULL; /* Weird cpuinfo format that we don't know how to handle. */ - if (n_cores == 0 || n_cores > 2 || n_imps != 1) + if (n_cores == 0 || n_cores > 2 || imp == INVALID_IMP) goto not_found; if (arch && !arch_id) @@ -284,11 +295,8 @@ host_detect_local_cpu (int argc, const c { for (i = 0; aarch64_cpu_data[i].name != NULL; i++) { - if (strchr (aarch64_cpu_data[i].part_no, '.') != NULL - && strncmp (aarch64_cpu_data[i].implementer_id, - imps[0], - strlen (imps[0]) - 1) == 0 - && valid_bL_string_p (cores, aarch64_cpu_data[i].part_no)) + if (aarch64_cpu_data[i].implementer_id == imp + && valid_bL_core_p (cores, aarch64_cpu_data[i].part_no)) { res = concat ("-m", cpu ? "cpu" : "tune", "=", @@ -304,8 +312,7 @@ host_detect_local_cpu (int argc, const c /* The simple, non-big.LITTLE case. */ else { - if (strncmp (aarch64_cpu_data[core_idx].implementer_id, imps[0], - strlen (imps[0]) - 1) != 0) + if (aarch64_cpu_data[core_idx].implementer_id != imp) goto not_found; res = concat ("-m", cpu ? "cpu" : "tune", "=",