Bug#800574: Final analysis for Broadwell
On Sun, 18 Oct 2015, Aurelien Jarno wrote:
> > Broadwell-H with a very recent microcode update (rev 0x12, from
> > 2015-06-04) was confirmed to have broken TSX-NI (RTM) and to _leave it
> > enabled_ in CPUID, causing glibc with lock elision enabled to SIGSEGV.
> > An even more recent Broadwell-H microcode update, rev 0x13 from
> > 2015-08-03, is confirmed to (finally) disable the HLE and RTM CPUID
> > bits. This should make blacklisting signature 0x40671 uncontroversial.

FWIW, in the last few days it became clear that so far, the mobile
Broadwell-H disables Intel TSX, but no instances of the desktop
Broadwell-H with RTM disabled have been found yet, not even with the
latest microcode. And they all use the same microcode.

It has also become clear a few days ago that it is very likely that the
BIOS can disable Intel TSX-NI (RTM) and HLE, and it doesn't need very
recent microcode to do that either. If this is true, it should be
something like MSR 0x13c (bit 1 of that MSR disables AES-NI when set,
and bit 0 locks that MSR against writing when set). Maybe the Intel
TSX-NI (HLE and RTM) disable switches are even on this very same MSR...

I've also since become aware of Debian bug #750792, and it describes the
same SIGSEGV observed by Broadwell and Skylake Arch Linux users on
lock-elision-enabled glibc.

From that bug report, it is clear that the SIGSEGVs in
__lll_unlock_elision can easily happen due to software bugs, so they
need not be linked to any Intel TSX processor errata. And this kind of
defect is quite common, apparently.

However, since Intel's current public specification update states (as
errata) that Intel TSX-NI is not supposed to be usable in the Broadwell
and Broadwell-H cores, that it should not even be reported in CPUID by
these processors (but it is :p), and that this is not supposed to be
fixable or worked around, I still think we need to blacklist it.

I will keep tracking this issue, and report back any relevant
information that becomes available.

It would be _really_ nice if the Intel team that works with Canonical
were to shed some light on this, though.

> Thanks for the patch, I have committed it to the jessie and the 2.21
> branches.

Thank you.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
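For readers following along: the HLE and RTM flags discussed in this message are reported in EBX of CPUID leaf 7, subleaf 0 (HLE is bit 4, RTM is bit 11). The following is only a minimal sketch of decoding them from an EBX value that has already been captured; the macro and function names are made up for illustration and are not the ones glibc uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits for Intel TSX. */
#define TSX_EBX_HLE (UINT32_C(1) << 4)   /* Hardware Lock Elision */
#define TSX_EBX_RTM (UINT32_C(1) << 11)  /* Restricted Transactional Memory */

/* Illustrative helpers (hypothetical names): test a captured EBX value. */
static bool ebx_has_hle(uint32_t ebx)
{
    return (ebx & TSX_EBX_HLE) != 0;
}

static bool ebx_has_rtm(uint32_t ebx)
{
    return (ebx & TSX_EBX_RTM) != 0;
}
```

On a real system the EBX value could be obtained, for instance, with the `__cpuid_count` macro from GCC's `<cpuid.h>`. Note that a set bit only says what the processor advertises, which is exactly the problem described above: on the affected Broadwell parts the bit can be set while the feature is broken.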
Bug#800574: Final analysis for Broadwell
On 2015-10-08 22:20, Henrique de Moraes Holschuh wrote:
> tag 800574 + patch
> thanks
>
> Attached updated version of amd64/local-blacklist-on-TSX-Haswell.diff.
> I believe it should be renamed to
> "amd64/local-blacklist-for-Intel-TSX.diff" as it is not just about Intel
> Haswell anymore.
>
> The updated patch has been package-compile-tested on glibc 2.19-22.
>
> This new version of the blacklist patch had the patch header text and
> blacklist code comments updated. It doesn't change anything for
> Haswell. It adds to the blacklist the current Broadwell CPU models and
> steppings.
>
> Broadwell-H with a very recent microcode update (rev 0x12, from
> 2015-06-04) was confirmed to have broken TSX-NI (RTM) and to _leave it
> enabled_ in CPUID, causing glibc with lock elision enabled to SIGSEGV.
> An even more recent Broadwell-H microcode update, rev 0x13 from
> 2015-08-03, is confirmed to (finally) disable the HLE and RTM CPUID
> bits. This should make blacklisting signature 0x40671 uncontroversial.
>
> Refer to https://bugzilla.kernel.org/show_bug.cgi?id=103351 for details.
>
> This version of the blacklist patch leaves upcoming Broadwell-E
> unblacklisted. It also leaves Skylake unblacklisted, as I have not been
> able to confirm whether the newest Skylake-S microcode updates have
> working Intel TSX-NI, or have it disabled.
>
> I propose that the updated blacklist patch be added to glibc in
> unstable, and after it spends a few weeks in testing, that it should
> also be added to stable through a stable update.

Thanks for the patch, I have committed it to the jessie and the 2.21
branches.

--
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net
Bug#800574: Final analysis for Broadwell
On 2015-10-07 07:32, Henrique de Moraes Holschuh wrote:
> Meanwhile, a suggestion by Samuel Thibault to try to use hwcap did
> provide for a possible long-term plan to fine-tune the lock-elision
> blacklist (and anything else of that sort).
>
> We would have to (finally) extend x86-64 hwcap to cpuid(1) fully, and
> also at least cpuid(7), which is anything but trivial and a lot of work.
> This is _not_ worth the trouble if it is done just for lock elision
> blacklisting purposes.
>
> However, it would be useful for link-time optimization in libraries
> (e.g. avx2 flavours of something that really benefits from it, etc), so
> it is likely worth pursuing... but only if we get buy-in from upstream.

Why do you believe that hwcap is better for handling that than the
current STT_GNU_IFUNC mechanism? For me hwcap is clearly superseded by
STT_GNU_IFUNC:

1) With the hwcap mechanism the libraries need to be recompiled multiple
   times, increasing build time, but also the disk space used on the
   user's computer.
2) It makes the upgrades more complex (see the nohwcap part of the libc
   preinst/postinst).
3) We need to ensure the ABI is the same for all versions of the same
   library (this is not the case for upstream glibc between i586 and
   i686).
4) The fact that the CPU supports a feature doesn't mean it supports it
   with good performance. For instance Intel Silvermont supports SSE4.2,
   but SSE2/SSSE3 based versions of string functions are much faster
   there.
5) Finally it means that we need to provide a version of the libc for
   all combinations. Think on i386, we would need to provide:
   - libc6
   - libc6-i686
   - libc6-i686-tsx
   - libc6-xen
   - libc6-xen-tsx

--
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net
Bug#800574: Final analysis for Broadwell
On Sun, 18 Oct 2015, Aurelien Jarno wrote:
> On 2015-10-07 07:32, Henrique de Moraes Holschuh wrote:
> > Meanwhile, a suggestion by Samuel Thibault to try to use hwcap did
> > provide for a possible long-term plan to fine-tune the lock-elision
> > blacklist (and anything else of that sort).
> >
> > We would have to (finally) extend x86-64 hwcap to cpuid(1) fully, and
> > also at least cpuid(7), which is anything but trivial and a lot of work.
> > This is _not_ worth the trouble if it is done just for lock elision
> > blacklisting purposes.
> >
> > However, it would be useful for link-time optimization in libraries
> > (e.g. avx2 flavours of something that really benefits from it, etc), so
> > it is likely worth pursuing... but only if we get buy-in from upstream.
>
> Why do you believe that hwcap is better for handling that than the
> current STT_GNU_IFUNC mechanism?

I was not aware of STT_GNU_IFUNC. I will look into it.

> 5) Finally it means that we need to provide a version of the libc for
> all combinations. Think on i386, we would need to provide:
> - libc6
> - libc6-i686
> - libc6-i686-tsx
> - libc6-xen
> - libc6-xen-tsx

No. We need nothing of the sort for Intel TSX. TSX-NI is something
already detected at runtime by glibc using the cpuid instruction, there
is no need to use the dynamic loader's hwcap object selection for this.

What I proposed was to extend the kernel-supplied hwcap area for x86-64
(and x32, I suppose) to export the full flags information returned by
CPUID.EAX=1, and also by CPUID.EAX=7 to all processes... and use _that_
instead of a direct call to the cpuid instruction to detect Intel TSX
(and anything else based on cpuid(1) and cpuid(7) in glibc).

We could do it for 32-bit too, I suppose. But if hardware-assisted lock
elision is important enough to justify that kind of work for i686, that
just means we should deploy x32 instead, as far as I'm concerned.

Then, change glibc to use this extended hwcap information to detect such
runtime-selected features instead of calling the cpuid instruction
directly on the processor. On an older kernel without the extended hwcap
fields, either call cpuid directly, or disable them.

However, for stuff like AVX512, you might want to have the *entire*
library compiled with a much more advanced instruction set (based on the
fact that AVX512 being available also implies that very fast SSE4.2 is
available, for example). It would be possible to use the dynamic
linker's hwcap support to do that, if one wanted to.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
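The detection order proposed in this message (prefer the kernel-supplied hwcap data, fall back to cpuid on older kernels, otherwise disable the feature) could be sketched roughly as follows. Everything here is hypothetical: no such extended hwcap fields exist in the kernel ABI being discussed, and the struct, field, and function names are invented purely to illustrate the decision logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* RTM flag in CPUID.(EAX=7,ECX=0):EBX. */
#define SKETCH_CPUID7_EBX_RTM (UINT32_C(1) << 11)

/* Hypothetical snapshot of what a process could learn at startup. */
struct feature_sources {
    bool     have_ext_hwcap;    /* kernel exported the (proposed) extended words */
    uint32_t hwcap_cpuid7_ebx;  /* kernel-filtered copy of CPUID(7).EBX */
    bool     may_call_cpuid;    /* policy on old kernels: trust raw cpuid or not */
    uint32_t raw_cpuid7_ebx;    /* what the cpuid instruction itself reports */
};

/* Decide whether RTM-based lock elision may be used. */
static bool rtm_usable(const struct feature_sources *s)
{
    if (s->have_ext_hwcap)       /* new kernel: its (possibly blacklisted) view wins */
        return (s->hwcap_cpuid7_ebx & SKETCH_CPUID7_EBX_RTM) != 0;
    if (s->may_call_cpuid)       /* old kernel: fall back to asking the CPU */
        return (s->raw_cpuid7_ebx & SKETCH_CPUID7_EBX_RTM) != 0;
    return false;                /* old kernel, no fallback: feature disabled */
}
```

The point of the ordering is that a kernel that has cleared the RTM bit in its exported copy overrides a processor that still advertises it, which is what would make the mechanism usable for blacklisting.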
Bug#800574: Final analysis for Broadwell
tag 800574 + patch
thanks

Attached updated version of amd64/local-blacklist-on-TSX-Haswell.diff.
I believe it should be renamed to
"amd64/local-blacklist-for-Intel-TSX.diff" as it is not just about Intel
Haswell anymore.

The updated patch has been package-compile-tested on glibc 2.19-22.

This new version of the blacklist patch had the patch header text and
blacklist code comments updated. It doesn't change anything for
Haswell. It adds to the blacklist the current Broadwell CPU models and
steppings.

Broadwell-H with a very recent microcode update (rev 0x12, from
2015-06-04) was confirmed to have broken TSX-NI (RTM) and to _leave it
enabled_ in CPUID, causing glibc with lock elision enabled to SIGSEGV.
An even more recent Broadwell-H microcode update, rev 0x13 from
2015-08-03, is confirmed to (finally) disable the HLE and RTM CPUID
bits. This should make blacklisting signature 0x40671 uncontroversial.

Refer to https://bugzilla.kernel.org/show_bug.cgi?id=103351 for details.

This version of the blacklist patch leaves upcoming Broadwell-E
unblacklisted. It also leaves Skylake unblacklisted, as I have not been
able to confirm whether the newest Skylake-S microcode updates have
working Intel TSX-NI, or have it disabled.

I propose that the updated blacklist patch be added to glibc in
unstable, and after it spends a few weeks in testing, that it should
also be added to stable through a stable update.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh

Intel TSX is broken on Haswell based processors (erratum HSD136/HSW136)
and a microcode update is available to simply disable the corresponding
instructions.

A live microcode update will disable the TSX instructions causing
already started binaries to segfault. This patch simply disables Intel
TSX (HLE and RTM) on processors which might receive a microcode update,
so that it doesn't happen. We might expect newer steppings to fix the
issue (e.g. as Haswell-EX did).

Intel TSX-NI is also broken on Broadwell systems, and documented as
being unavailable in their specification updates errata list. However,
some end-user systems were shipped with old microcode that left Intel
TSX-NI still enabled in CPUID on these processors. We must not allow
RTM to be used by glibc on these systems, due to runtime system
misbehavior and live-update of microcode hazards.

Author: Henrique de Moraes Holschuh

Index: glibc-2.19/sysdeps/x86_64/multiarch/init-arch.c
===
--- glibc-2.19.orig/sysdeps/x86_64/multiarch/init-arch.c 2014-02-07 07:04:38.0 -0200
+++ glibc-2.19/sysdeps/x86_64/multiarch/init-arch.c 2015-10-07 09:07:59.272156212 -0300
@@ -26,7 +26,7 @@
 static void
-get_common_indeces (unsigned int *family, unsigned int *model)
+get_common_indeces (unsigned int *family, unsigned int *model, unsigned int *stepping)
 {
   __cpuid (1, __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax,
            __cpu_features.cpuid[COMMON_CPUID_INDEX_1].ebx,
@@ -36,6 +36,7 @@
   unsigned int eax = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax;
   *family = (eax >> 8) & 0x0f;
   *model = (eax >> 4) & 0x0f;
+  *stepping = eax & 0x0f;
 }
@@ -47,6 +48,7 @@
   unsigned int edx;
   unsigned int family = 0;
   unsigned int model = 0;
+  unsigned int stepping = 0;
   enum cpu_features_kind kind;
   __cpuid (0, __cpu_features.max_cpuid, ebx, ecx, edx);
@@ -56,7 +58,7 @@
     {
       kind = arch_kind_intel;
-      get_common_indeces (&family, &model);
+      get_common_indeces (&family, &model, &stepping);
       unsigned int eax = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].eax;
       unsigned int extended_family = (eax >> 20) & 0xff;
@@ -131,7 +133,7 @@
     {
       kind = arch_kind_amd;
-      get_common_indeces (&family, &model);
+      get_common_indeces (&family, &model, &stepping);
       ecx = __cpu_features.cpuid[COMMON_CPUID_INDEX_1].ecx;
@@ -176,6 +178,24 @@
         }
     }
+  /* Disable Intel TSX (HLE and RTM) due to erratum HSD136/HSW136
+     on all Haswell processors, except Haswell-EX/Xeon E7-v3 (306F4),
+     to work around outdated microcode that doesn't disable the
+     broken feature by default.
+
+     Disable TSX on Broadwell, due to errata BDM53/BDW51/BDD51/
+     BDE42. The errata documentation states that RTM is unusable,
+     and that it should not be advertised by CPUID at all on any
+     such processors. Unfortunately, it _is_ advertised in some
+     (older) microcode versions. Exceptions: Broadwell-E (406Fx),
+     likely already fixed at launch */
+  if (kind == arch_kind_intel && family == 6 &&
+      ((model == 63 && stepping <= 2) || (model == 60 && stepping <= 3) ||
+       (model == 69 && stepping <= 1) || (model == 70 && stepping <= 1) ||
+       (model == 61 && stepping <= 4) || (model == 71 && stepping <= 1) ||
+       (model ==
Bug#800574: Final analysis for Broadwell
Well, I've finally finished analysing things for Broadwell.

The amd64 (x86-64) glibc lock elision code keys on the RTM CPUID bit
because it is actually using RTM (and not HLE) to implement lock
elision. I failed to keep this in mind when worrying about permanently
blacklisting Broadwell processors from glibc lock elision in unstable.

Errata BDM53/BDW51/BDD51/BDE42 "Intel TSX Instructions not available"
_should_ mean that trying to use any of the TSX-NI opcodes (i.e. RTM) in
Broadwell would always result in an illegal opcode exception (SIGILL).
The specification updates also explicitly say in the descriptions of
these errata that the RTM bit in CPUID is not set.

Were those errata present on every microcode revision and core version
of those processors, it would make them "safe" as far as our (patched)
glibc lock elision is concerned.

We are not that lucky. Searching the network for cpuinfo reports
resulted in a /proc/cpuinfo dump of signature 0x306d4, microcode rev
0x11 (Core i5-5300U), and rev 0x18 (Core M-5Y71), where both RTM and HLE
are reported as enabled.

Another /proc/cpuinfo dump of signature 0x306d4, with microcode rev 0x18
(Core i5-5300U and also Core M-5Y10c) and rev 0x1f (Core i5-5287U),
shows both HLE and RTM already disabled.

The fact that revision 0x18 gave a different CPUID response on the Core
i5-5300U and Core M-5Y10c than on the Core M-5Y71 was a surprise.
Perhaps it also has a dependency on the firmware doing a (hopefully
boot-locked) wrmsr to disable TSX.

Anyway, regardless of the reason, one cannot count on the RTM and HLE
bits being disabled in CPUID(7) on every Broadwell processor and
microcode revision out there.

OTOH, it does mean we can, and should, blacklist signature 0x306d4 (and
earlier) permanently, because RTM is extremely unlikely to be
fixed/fixable on these processors. Either it is disabled as it should be
per the errata documentation, or enabled and very dangerous (resulting
in either SIGILL or Haswell-style risk of unpredictable system
behavior).

Since signature 0x40671 also has the same "TSX unavailable" type of
errata (BDD51, BDE42), I guess we can assume the same applies to
Broadwell-H and Broadwell-DE, and blacklist lock elision there
permanently as well.

I am still collecting data for Skylake-S, but it boils down to whether
up-to-date Skylake-S microcode (revisions 0x34 and higher) fixes or
disables TSX. We know that the microcode update does stop glibc lock
elision from crashing with SIGSEGV, though.

Meanwhile, a suggestion by Samuel Thibault to try to use hwcap did
provide for a possible long-term plan to fine-tune the lock-elision
blacklist (and anything else of that sort).

We would have to (finally) extend x86-64 hwcap to cpuid(1) fully, and
also at least cpuid(7), which is anything but trivial and a lot of work.
This is _not_ worth the trouble if it is done just for lock elision
blacklisting purposes.

However, it would be useful for link-time optimization in libraries
(e.g. avx2 flavours of something that really benefits from it, etc), so
it is likely worth pursuing... but only if we get buy-in from upstream.

Once it is there for far better purposes than blacklisting, there is no
reason not to do the trivial work to have the kernel blacklist whatever
capabilities should be avoided, and switch glibc to use the hwcap
extension instead of doing cpuid directly wherever available, thus
making it usable _also_ for blacklisting things.

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique de Moraes Holschuh
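As a worked example of how the signatures quoted in this message map onto the family/model/stepping numbers used by the blacklist, here is a minimal sketch of decoding the EAX value returned by CPUID leaf 1. It follows the usual Intel convention of folding the extended model into the model for family 6 and 15; the struct and function names are illustrative only and do not come from glibc.

```c
#include <stdint.h>

/* Decoded CPUID(1).EAX signature (illustrative type, not a glibc one). */
struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

static struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    unsigned base_family = (eax >> 8) & 0x0f;   /* bits 11:8  */
    unsigned base_model  = (eax >> 4) & 0x0f;   /* bits 7:4   */
    unsigned ext_model   = (eax >> 16) & 0x0f;  /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xff;  /* bits 27:20 */

    s.stepping = eax & 0x0f;                    /* bits 3:0   */
    s.family = base_family;
    s.model = base_model;
    if (base_family == 0x0f)                    /* extended family only for family 15 */
        s.family += ext_family;
    if (base_family == 0x06 || base_family == 0x0f)
        s.model += ext_model << 4;              /* extended model is the high nibble */
    return s;
}
```

With this, 0x306d4 decodes to family 6, model 61, stepping 4, and 0x40671 to family 6, model 71, stepping 1, which are the Broadwell entries in the blacklist condition. Haswell-EX (0x306f4) decodes to model 63, stepping 4, so it falls outside the `model == 63 && stepping <= 2` test, as intended.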