Hi Tyler

Thanks for taking a look!


On 20/02/18 21:18, Tyler Baicar wrote:
> On 2/15/2018 1:56 PM, James Morse wrote:
>> Arm64 has multiple NMI-like notifications, but GHES only has one
>> in_nmi() path. The interactions between these multiple NMI-like
>> notifications is, unclear.
>>
>> Split this single path up by moving the fixmap idx and lock into
>> the struct ghes. Each notification's init function can consider
>> which other notifications it masks and can share a fixmap_idx with.
>> This lets us merge the two ghes_ioremap_pfn_* flavours.
>>
>> Two lock pointers are provided, but only one will be used by
>> ghes_copy_tofrom_phys(), depending on in_nmi(). This means any
>> notification that might arrive as an NMI must always be wrapped in
>> nmi_enter()/nmi_exit().
>>
>> The double-underscore version of fix_to_virt() is used because
>> the index to be mapped can't be tested against the end of the
>> enum at compile time.

>> @@ -303,13 +278,11 @@ static void ghes_copy_tofrom_phys(void *buffer, u64
>> paddr, u32 len,
>>         while (len > 0) {
>>           offset = paddr - (paddr & PAGE_MASK);
>> -        if (in_nmi) {
>> -            raw_spin_lock(&ghes_ioremap_lock_nmi);
>> -            vaddr = ghes_ioremap_pfn_nmi(paddr >> PAGE_SHIFT);
>> -        } else {
>> -            spin_lock_irqsave(&ghes_ioremap_lock_irq, flags);
>> -            vaddr = ghes_ioremap_pfn_irq(paddr >> PAGE_SHIFT);
>> -        }
>> +        if (in_nmi)
>> +            raw_spin_lock(ghes->nmi_fixmap_lock);
>> +        else
>> +            spin_lock_irqsave(ghes->fixmap_lock, flags);

> This locking is resulting in a NULL pointer dereference for me during boot 
> time.
> I removed the ghes_proc() call
> from ghes_probe() and then when triggering errors and going through 
> ghes_proc()
> the NULL pointer dereference
> no longer happens. That makes me think that this is dependent on something 
> that
> is not setup before
> ghes_probe() is happening. Any ideas?

Gah, One of the things I've tried to enforce is that notifications that happen
in_nmi() always happen in_nmi(): but that isn't the case for this first
ghes_proc() call, which always happens in process context.

Why didn't this happen to me? I'm assuming your GHES has work to do prior to
probing, but I waited for it to finish booting before generating test events.

The smallest fix is to have an irq/nmi fixmap and lock. This probe time call
would always use the irq fixmap and lock, and can be safely interrupted by an
NMI. Its only the NMI notifications interacting that we need to worry about as
they can't be masked.


Thanks!

James




_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Reply via email to