[Bug target/111768] X86: -march=native does not support alder lake big.little cache infor correctly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #13 from David C. Manuelda ---
I'd suggest picking a common value for now, to prevent the compilation failure (in the stage comparison) until a proper fix or workaround is chosen.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-10-12

--- Comment #12 from Richard Biener ---
Looking at the 'hybrid' flag in cpuid sounds like the most reasonable thing to do, possibly simply skipping auto-detection for the problematical parts (L1 and L2 cache sizes) as Alex suggests.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #11 from Alexander Monakov ---
(In reply to Hongtao.liu from comment #10)
> > indeed (but I believe it did happen with Alder Lake already, by accident,
> > with AVX512 on P-cores but not on E-cores).
>
> AVX512 is physically fused off for Alderlake P-core, P-core and E-core share
> the same ISA level(AVX2).

I think Arsen means initial Alder Lake batches, where AVX-512 wasn't yet fused off (but BIOS support was unofficial/experimental anyway).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #10 from Hongtao.liu ---
> indeed (but I believe it did happen with Alder Lake already, by accident,
> with AVX512 on P-cores but not on E-cores).

AVX512 is physically fused off on Alder Lake P-cores; P-cores and E-cores share the same ISA level (AVX2).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #9 from Alexander Monakov ---
(In reply to Arsen Arsenović from comment #8)
> indeed (but I believe it did happen with Alder Lake already, by accident,
> with AVX512 on P-cores but not on E-cores).

AFAIK on those Alder Lake CPUs you could only get AVX-512 by disabling E-cores in the BIOS, so you couldn't boot in a configuration where both the E-cores and AVX-512 on the P-cores are available.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #8 from Arsen Arsenović ---
(In reply to Alexander Monakov from comment #7)
> I'm afraid hybrid CPUs with varying ISA feature sets are not practical for
> the current ecosystem: you wouldn't be able to reschedule from a higher- to
> lower-capable core. Not to mention scenarios like Mesa on-disk llvmpipe
> shader cache.

indeed (but I believe it did happen with Alder Lake already, by accident, with AVX512 on P-cores but not on E-cores).

> "Always" probing all cores is a not a good idea (the compiler would have to
> manually reschedule itself to all cores, of which there could be hundreds).
> Plus, portable API for such probing across available cores does not exist
> afaik.

I'd consider this close enough to 'not possible' ;P my thinking was does cpuid provide a way to query cross-CPU (or CPU 'group' I suppose). if not, we're definitely better off just using a common, smaller cache size for intel hybrid CPUs (at least for now)

> I think releasing an x86 hybrid CPU with varying capabilities across cores
> would require substantial preparatory work in the kernel and likely in the
> userland as well, so probably best to leave it until the time comes and
> specifics of what can differ are known.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #7 from Alexander Monakov ---
I'm afraid hybrid CPUs with varying ISA feature sets are not practical for the current ecosystem: you wouldn't be able to reschedule from a higher- to lower-capable core. Not to mention scenarios like Mesa on-disk llvmpipe shader cache.

"Always" probing all cores is not a good idea (the compiler would have to manually reschedule itself to all cores, of which there could be hundreds). Plus, a portable API for such probing across available cores does not exist afaik.

I think releasing an x86 hybrid CPU with varying capabilities across cores would require substantial preparatory work in the kernel and likely in the userland as well, so probably best to leave it until the time comes and specifics of what can differ are known.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #6 from Arsen Arsenović ---
this poses another problem too, though: should big and little cores ever differ in ISA support levels, building on big cores (which seems like a reasonable thing to do) with -march=native could lead to generating code incompatible with little cores.

perhaps it'd be reasonable to always probe all cores (is that possible?) and pick a common subset?
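
As a Linux-only illustration of the "probe all cores and pick a common subset" idea: the kernel lists a `flags` line for every logical CPU in /proc/cpuinfo, so the per-CPU feature sets can be intersected without rescheduling onto each core. This is only a sketch; the helper name `common_flags` is made up here, nothing like it exists in GCC, and it does nothing for systems without /proc/cpuinfo.

```python
def common_flags(cpuinfo_text):
    """Intersect the 'flags' lines of every processor entry in /proc/cpuinfo text.

    Hypothetical helper for illustration; not part of GCC or any library.
    """
    common = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags = set(value.split())
            # First CPU seeds the set; every later CPU can only shrink it.
            common = flags if common is None else common & flags
    return common or set()
```

On a real machine the input would come from `open("/proc/cpuinfo").read()`; a feature reported by only some of the cores drops out of the result.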
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #5 from Alexander Monakov ---
I think it's similar to attempting -march=native under distcc, which is already warned about on the Gentoo wiki: https://wiki.gentoo.org/wiki/Distcc

The difference here is that Intel has so far decided to keep the ISA feature set the same between 'performance' and 'power-efficient' cores, so the differences for -march=native detection are minimal. Intel also added a cpuid bit for hybrid CPUs, so in principle native arch detection could inspect that bit and then override l1-cache-size to 32 KiB (having the exact size in the param is not important, specifying a lower value is ok), or just drop it and let cc1 fall back to the default value (64) from params.opt.

Short term, I would advise users to add --param=l1-cache-size=32 after -march=native in CFLAGS.
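
To make the "inspect the hybrid bit, then override l1-cache-size" suggestion concrete, here is a rough sketch of the decision logic only. It assumes the caller already has the EDX value returned by CPUID leaf 7, subleaf 0, where Intel documents bit 15 as the hybrid flag. The function names and the 32 KiB cap are illustrative assumptions; this is not the code in GCC's driver.

```python
# CPUID.(EAX=07H, ECX=0):EDX bit 15 is the "Hybrid" flag in Intel's SDM.
HYBRID_BIT = 1 << 15


def is_hybrid(leaf7_edx):
    """True if the EDX value from CPUID leaf 7, subleaf 0 has the hybrid bit set."""
    return bool(leaf7_edx & HYBRID_BIT)


def native_l1_param(leaf7_edx, detected_l1_kib, hybrid_cap_kib=32):
    """Return the l1-cache-size option a native-detection step could emit.

    Hypothetical helper: on a hybrid part the detected value depends on
    which core ran the detection, so it is capped at a fixed lower value.
    """
    if is_hybrid(leaf7_edx):
        l1 = min(detected_l1_kib, hybrid_cap_kib)
    else:
        l1 = detected_l1_kib
    return f"--param=l1-cache-size={l1}"
```

With the cap in place the emitted option is the same whether detection ran on a core reporting 48 KiB or one reporting 32 KiB, which is what the stage comparison needs.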
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #4 from Hongtao.liu ---
I checked Alder Lake's L1 cache size and it is indeed 48, while the L1 cache size in alderlake_cost is set to 32. But then again, we have a lot of different platforms that share the same cost table and may have different L1 cache sizes; from a micro-architecture tuning point of view, it doesn't make a difference. A separate cost table where only the L1 cache size differs is quite unnecessary (the size itself is just a parameter for software prefetching, it doesn't have to be the real hardware cache size).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #3 from Richard Biener ---
I'd say "don't do this" (bootstrap with -march=native). Alternatively use a taskset to confine to either big or little cores.
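
For readers who want the taskset approach spelled out, a small Linux-only sketch follows. It pins the current process to a chosen set of CPUs and then replaces it with the build command, so every compiler invocation sees the same core type. Which CPU numbers belong to the big or little cores is machine-specific and has to be looked up first (for example with lscpu); both helper names are invented for this example.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux cpulist string such as '0-7,16' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def run_pinned(cpus, argv):
    """Restrict this process to `cpus`, then replace it with the command `argv`.

    The affinity mask is inherited by child processes, so a whole build
    started this way stays on the selected cores. This call does not return.
    """
    os.sched_setaffinity(0, cpus)  # Linux-only
    os.execvp(argv[0], argv)
```

Assuming CPUs 0-15 are the big cores on a given machine, `run_pinned(parse_cpu_list("0-15"), ["make", "bootstrap"])` would behave roughly like `taskset -c 0-15 make bootstrap`.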
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111768

--- Comment #2 from Andrew Pinski ---
I think on those SoCs we should ignore the cache info or set it to some common value between the two.
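
One way to arrive at such a common value on Linux is to read every core's L1 data cache size from sysfs and take the smallest. The sketch below assumes the usual /sys/devices/system/cpu/cpuN/cache/indexM layout with `level`, `type` and `size` files; the helper names and the fallback of 32 are illustrative choices, not anything GCC does.

```python
from pathlib import Path


def parse_size_kib(text):
    """Convert a sysfs cache size string such as '48K' or '2M' to KiB."""
    text = text.strip()
    if text.endswith("K"):
        return int(text[:-1])
    if text.endswith("M"):
        return int(text[:-1]) * 1024
    raise ValueError(f"unexpected cache size format: {text!r}")


def l1d_sizes_kib(root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its L1 data cache size in KiB."""
    sizes = {}
    for idx in Path(root).glob("cpu[0-9]*/cache/index*"):
        try:
            level = (idx / "level").read_text().strip()
            kind = (idx / "type").read_text().strip()
            if level == "1" and kind == "Data":
                sizes[idx.parent.parent.name] = parse_size_kib(
                    (idx / "size").read_text()
                )
        except (OSError, ValueError):
            continue  # offline CPU or unreadable entry: skip it
    return sizes


def common_l1_kib(sizes, default=32):
    """Smallest L1d size across all CPUs, or `default` if none was found."""
    return min(sizes.values(), default=default)
```

On a hybrid part where some cores report 48K and others 32K, `common_l1_kib(l1d_sizes_kib())` yields 32 regardless of which core the process happens to run on.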