Hi Peter,

On 10/01/2019 20:12, Peter Zijlstra wrote:
> On Thu, Jan 10, 2019 at 06:25:57PM +0000, James Morse wrote:
>> On arm64 if all the RAS and pseudo-NMI patches land, our worst-case
>> interleaving jumps to at least 7. The culprit is APEI using spinlocks
>> to protect fixmap slots.
>>
>> I have an RFC to bump the number of node bits from 2 to 3, but as this
>> is APEI four times, it may be preferable to make it use something other
>> than spinlocks.
>>
>> The worst-case order is below. Each one masks those before it:
>> 1. process context
>> 2. soft-irq
>> 3. hard-irq
>> 4. pseudo-nmi [0]
>>    - using the irqchip priorities to configure some IRQs as NMI.
>> 5. SError [1]
>>    - a bit like an asynchronous MCE. ACPI allows this to convey CPER
>>      records, requiring an APEI call.
>> 6&7. SDEI [2]
>>    - a firmware triggered software interrupt, only there are two of
>>      them, either of which could convey CPER records.
>> 8. Synchronous external abort
>>    - again, similar to MCE. There are systems using this with APEI.
>
> The thing is, everything non-maskable (NMI like) really should not be
> using spinlocks at all.
>
> I otherwise have no clue about wth APEI is, but it sounds like horrible
> crap ;-)

I think you've called it that before! It's that GHES thing in
drivers/acpi/apei.

What is the alternative? bit_spin_lock()?
These things can happen independently on multiple CPUs. On arm64 these
NMI-like things don't affect all CPUs like they seem to on x86.

Thanks,

James
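
For reference, the node-bits arithmetic behind the quoted RFC can be
sketched in plain C. This is only an illustration under the assumption
that each nested context needs its own per-CPU node index; the helper
name bits_for_contexts() is hypothetical and is not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper, not a kernel API: the smallest number of index
 * bits that can give each of 'contexts' nested contexts its own slot.
 */
static unsigned int bits_for_contexts(unsigned int contexts)
{
	unsigned int n, bits = 0;

	assert(contexts >= 1);

	/* Count the bits needed to represent the highest index. */
	for (n = contexts - 1; n; n >>= 1)
		bits++;

	return bits;
}
```

With this sketch, bits_for_contexts(4) gives 2 and bits_for_contexts(8)
gives 3, which matches going from 2 to 3 node bits once the nesting
depth exceeds four.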