Re: [PATCH v6 18/24] mm: Try spin lock in speculative path

2018-01-16 Thread Laurent Dufour
On 12/01/2018 19:18, Matthew Wilcox wrote:
> On Fri, Jan 12, 2018 at 06:26:02PM +0100, Laurent Dufour wrote:
>> There is a deadlock when a CPU is doing a speculative page fault and
>> another one is calling do_munmap().
>>
>> The deadlock occurs because the speculative path tries to spin-lock the
>> pte while interrupts are disabled. The other CPU, in the unmap path,
>> has locked the pte and is waiting for all CPUs to invalidate the TLB.
>> As the CPU doing the speculative fault has interrupts disabled, it
>> can't invalidate the TLB and so can't get the lock.
>>
>> Since we are in a speculative path, we can race with other mm actions.
>> So assume that the lock may not be acquired and fail the speculative
>> page fault.
> 
> It seems like you introduced this bug in the previous patch, and now
> you're fixing it in this patch?  Why not merge the two?

You're right, this is a fix to the previous patch. Initially my idea was
to keep Peter's original patch as is, but it sounds like that is not a
good idea.
I'll merge it into the previous one.

Thanks,
Laurent.



Re: [PATCH v6 18/24] mm: Try spin lock in speculative path

2018-01-12 Thread Matthew Wilcox
On Fri, Jan 12, 2018 at 06:26:02PM +0100, Laurent Dufour wrote:
> There is a deadlock when a CPU is doing a speculative page fault and
> another one is calling do_munmap().
> 
> The deadlock occurs because the speculative path tries to spin-lock the
> pte while interrupts are disabled. The other CPU, in the unmap path,
> has locked the pte and is waiting for all CPUs to invalidate the TLB.
> As the CPU doing the speculative fault has interrupts disabled, it
> can't invalidate the TLB and so can't get the lock.
> 
> Since we are in a speculative path, we can race with other mm actions.
> So assume that the lock may not be acquired and fail the speculative
> page fault.

It seems like you introduced this bug in the previous patch, and now
you're fixing it in this patch?  Why not merge the two?



[PATCH v6 18/24] mm: Try spin lock in speculative path

2018-01-12 Thread Laurent Dufour
There is a deadlock when a CPU is doing a speculative page fault and
another one is calling do_munmap().

The deadlock occurs because the speculative path tries to spin-lock the
pte while interrupts are disabled. The other CPU, in the unmap path,
has locked the pte and is waiting for all CPUs to invalidate the TLB.
As the CPU doing the speculative fault has interrupts disabled, it
can't invalidate the TLB and so can't get the lock.

Since we are in a speculative path, we can race with other mm actions.
So assume that the lock may not be acquired and fail the speculative
page fault.

Here are the stacks captured during the deadlock:

CPU 0
native_flush_tlb_others+0x7c/0x260
flush_tlb_mm_range+0x6a/0x220
tlb_flush_mmu_tlbonly+0x63/0xc0
unmap_page_range+0x897/0x9d0
? unmap_single_vma+0x7d/0xe0
? release_pages+0x2b3/0x360
unmap_single_vma+0x7d/0xe0
unmap_vmas+0x51/0xa0
unmap_region+0xbd/0x130
do_munmap+0x279/0x460
SyS_munmap+0x53/0x70

CPU 1
do_raw_spin_lock+0x14e/0x160
_raw_spin_lock+0x5d/0x80
? pte_map_lock+0x169/0x1b0
pte_map_lock+0x169/0x1b0
handle_pte_fault+0xbf2/0xd80
? trace_hardirqs_on+0xd/0x10
handle_speculative_fault+0x272/0x280
handle_speculative_fault+0x5/0x280
__do_page_fault+0x187/0x580
trace_do_page_fault+0x52/0x260
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30

Signed-off-by: Laurent Dufour 
---
 mm/memory.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 96720cc7ca74..83640079d407 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2472,7 +2472,8 @@ static bool pte_spinlock(struct vm_fault *vmf)
 		goto out;
 
 	vmf->ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
-	spin_lock(vmf->ptl);
+	if (unlikely(!spin_trylock(vmf->ptl)))
+		goto out;
 
 	if (vma_has_changed(vmf)) {
 		spin_unlock(vmf->ptl);
@@ -2526,8 +2527,20 @@ static bool pte_map_lock(struct vm_fault *vmf)
 	if (!pmd_same(pmdval, vmf->orig_pmd))
 		goto out;
 
-	pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
-				  vmf->address, &ptl);
+	/*
+	 * Same as pte_offset_map_lock() except that we call
+	 * spin_trylock() in place of spin_lock() to avoid racing with
+	 * the unmap path, which may hold the lock and wait for this CPU
+	 * to invalidate the TLB while this CPU has irqs disabled.
+	 * Since we are in a speculative path, accept that it could fail.
+	 */
+	ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
+	pte = pte_offset_map(vmf->pmd, vmf->address);
+	if (unlikely(!spin_trylock(ptl))) {
+		pte_unmap(pte);
+		goto out;
+	}
+
 	if (vma_has_changed(vmf)) {
 		pte_unmap_unlock(pte, ptl);
 		goto out;
-- 
2.7.4