Re: [PATCH 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-03 Thread Catalin Marinas
On Tue, Apr 02, 2024 at 03:51:37PM +0800, Kefeng Wang wrote:
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
> 
> Signed-off-by: Kefeng Wang 
> ---
>  arch/arm64/mm/fault.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9bb9f395351a..405f9aa831bd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>
>  	if (!(vma->vm_flags & vm_flags)) {
>  		vma_end_read(vma);
> -		goto lock_mmap;
> +		fault = VM_FAULT_BADACCESS;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		goto done;
>  	}
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))

I think this makes sense. A concurrent modification of vma->vm_flags
(e.g. mprotect()) would do a vma_start_write(), so no need to recheck
again with the mmap lock held.

Reviewed-by: Catalin Marinas 
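
For readers who want to exercise the path this patch shortens, here is a
minimal userspace sketch (not part of the patch or the thread). It writes
to a read-only anonymous mapping, which should take the bad-access branch
discussed above and deliver SIGSEGV with si_code SEGV_ACCERR. The helper
name probe_prot_fault() is made up for illustration, and the sketch only
triggers the fault; it does not measure latency the way lmbench's lat_sig
does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MAP_ANONYMOUS on glibc */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t last_si_code;

/* Record why the fault happened, then jump back past the faulting store. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	last_si_code = info->si_code;
	siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: write to a PROT_READ mapping and return the
 * si_code of the resulting SIGSEGV. Returns -1 on setup failure and
 * 0 if no fault was observed.
 */
int probe_prot_fault(void)
{
	struct sigaction sa, old;
	long page = sysconf(_SC_PAGESIZE);
	void *map;
	volatile char *p;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (page <= 0 || sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	map = mmap(NULL, (size_t)page, PROT_READ,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		sigaction(SIGSEGV, &old, NULL);
		return -1;
	}
	p = map;

	last_si_code = 0;
	if (sigsetjmp(fault_env, 1) == 0)
		p[0] = 1;	/* the VMA lacks VM_WRITE, so this store faults */

	munmap(map, (size_t)page);
	sigaction(SIGSEGV, &old, NULL);
	return last_si_code;
}
```

Calling probe_prot_fault() in a loop and timing it would give a rough,
lat_sig-like number, but treat any such figure as indicative only.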


Re: [PATCH 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-03 Thread Kefeng Wang
On 2024/4/3 13:30, Suren Baghdasaryan wrote:
> On Tue, Apr 2, 2024 at 10:19 PM Suren Baghdasaryan wrote:
> >
> > On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang wrote:
> > >
> > > The vm_flags of vma already checked under per-VMA lock, if it is a
> > > bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> > > no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> > > time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
> >
> > The change makes sense to me. Per-VMA lock is enough to keep
> > vma->vm_flags stable, so no need to retry with mmap_lock.
> >
> > >
> > > Signed-off-by: Kefeng Wang
> >
> > Reviewed-by: Suren Baghdasaryan
> >
> > > ---
> > >  arch/arm64/mm/fault.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > index 9bb9f395351a..405f9aa831bd 100644
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > >
> > >  	if (!(vma->vm_flags & vm_flags)) {
> > >  		vma_end_read(vma);
> > > -		goto lock_mmap;
> > > +		fault = VM_FAULT_BADACCESS;
> > > +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> >
> > nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
> > unrelated to the main change. Either splitting into a separate patch
> > or mentioning this additional fixup in the changelog would be helpful.
>
> The above nit applies to all the patches after this one, so I won't
> comment on each one separately. If you decide to split or adjust the
> changelog please do that for each patch.

I will update the changelog for each patch, thanks for your review and
suggestion.

> > > +		goto done;
> > >  	}
> > >  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
> > >  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > > --
> > > 2.27.0


Re: [PATCH 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-02 Thread Suren Baghdasaryan
On Tue, Apr 2, 2024 at 10:19 PM Suren Baghdasaryan wrote:
>
> On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang wrote:
> >
> > The vm_flags of vma already checked under per-VMA lock, if it is a
> > bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> > no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> > time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
>
> The change makes sense to me. Per-VMA lock is enough to keep
> vma->vm_flags stable, so no need to retry with mmap_lock.
>
> >
> > Signed-off-by: Kefeng Wang 
>
> Reviewed-by: Suren Baghdasaryan 
>
> > ---
> >  arch/arm64/mm/fault.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 9bb9f395351a..405f9aa831bd 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> >
> >  	if (!(vma->vm_flags & vm_flags)) {
> >  		vma_end_read(vma);
> > -		goto lock_mmap;
> > +		fault = VM_FAULT_BADACCESS;
> > +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>
> nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
> unrelated to the main change. Either splitting into a separate patch
> or mentioning this additional fixup in the changelog would be helpful.

The above nit applies to all the patches after this one, so I won't
comment on each one separately. If you decide to split or adjust the
changelog please do that for each patch.

>
> > +		goto done;
> >  	}
> >  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
> >  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > --
> > 2.27.0
> >


Re: [PATCH 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

2024-04-02 Thread Suren Baghdasaryan
On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang wrote:
>
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.

The change makes sense to me. Per-VMA lock is enough to keep
vma->vm_flags stable, so no need to retry with mmap_lock.

>
> Signed-off-by: Kefeng Wang 

Reviewed-by: Suren Baghdasaryan 

> ---
>  arch/arm64/mm/fault.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9bb9f395351a..405f9aa831bd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>
>  	if (!(vma->vm_flags & vm_flags)) {
>  		vma_end_read(vma);
> -		goto lock_mmap;
> +		fault = VM_FAULT_BADACCESS;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);

nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
unrelated to the main change. Either splitting into a separate patch
or mentioning this additional fixup in the changelog would be helpful.

> +		goto done;
>  	}
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> --
> 2.27.0
>
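
On the accounting nit above: as far as I know, the per-VMA lock events
are exported as counters in /proc/vmstat only on kernels built with
CONFIG_PER_VMA_LOCK_STATS, under names such as vma_lock_success. Both
the config dependency and the counter name are assumptions about the
reader's kernel, and the helper names below are made up for
illustration. A minimal sketch for sampling such a counter from
userspace could look like this:

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" pairs from an open stream in /proc/vmstat format.
 * Returns 0 and stores the value in *out when name is found, -1 otherwise.
 */
int vmstat_lookup(FILE *f, const char *name, unsigned long long *out)
{
	char key[128];
	unsigned long long val;

	while (fscanf(f, "%127s %llu", key, &val) == 2) {
		if (strcmp(key, name) == 0) {
			*out = val;
			return 0;
		}
	}
	return -1;
}

/* Convenience wrapper that opens /proc/vmstat itself. */
int vmstat_read(const char *name, unsigned long long *out)
{
	FILE *f = fopen("/proc/vmstat", "r");
	int ret;

	if (!f)
		return -1;
	ret = vmstat_lookup(f, name, out);
	fclose(f);
	return ret;
}
```

Sampling vmstat_read("vma_lock_success", &v) before and after a
fault-heavy workload and comparing the difference is one way to check
whether the bad-access path is being counted. A return value of -1
most likely means the counter is not exposed on that kernel.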