On Thu, 15 Apr 2021 15:08:09 +0100,
Keqian Zhu <zhukeqi...@huawei.com> wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 22:03, Keqian Zhu wrote:
> > The MMIO region of a device may be huge (GB level), so try to use
> > block mapping in stage2 to speed up both map and unmap.
> > 
> > Compared to normal memory mapping, we should consider two more
> > points when trying block mapping for an MMIO region:
> > 
> > 1. For normal memory mapping, the PA (host physical address) and
> > HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use
> > the HVA to request a hugepage, so we don't need to consider PA
> > alignment when verifying block mapping. But for device memory
> > mapping, the PA and HVA may have different alignment.
> > 
> > 2. For normal memory mapping, we are sure the hugepage size properly
> > fits into the vma, so we don't check whether the mapping size exceeds
> > the boundary of the vma. But for device memory mapping, we should pay
> > attention to this.
> > 
> > This adds get_vma_page_shift() to get the page shift for both normal
> > memory and device MMIO regions, and checks these two points when
> > selecting the block mapping size for an MMIO region.
> > 
> > Signed-off-by: Keqian Zhu <zhukeqi...@huawei.com>
> > ---
> >  arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 51 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c59af5ca01b0..5a1cc7751e6d 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >     return PAGE_SIZE;
> >  }
> >  
> > +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> > +{
> > +   unsigned long pa;
> > +
> > +   if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> > +           return huge_page_shift(hstate_vma(vma));
> > +
> > +   if (!(vma->vm_flags & VM_PFNMAP))
> > +           return PAGE_SHIFT;
> > +
> > +   VM_BUG_ON(is_vm_hugetlb_page(vma));
> > +
> > +   pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> > +
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > +   if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> > +       ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> > +       ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> > +           return PUD_SHIFT;
> > +#endif
> > +
> > +   if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> > +       ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> > +       ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> > +           return PMD_SHIFT;
> > +
> > +   return PAGE_SHIFT;
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                       struct kvm_memory_slot *memslot, unsigned long hva,
> >                       unsigned long fault_status)
> > @@ -769,7 +798,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >             return -EFAULT;
> >     }
> >  
> > -   /* Let's check if we will get back a huge page backed by hugetlbfs */
> > +   /*
> > +    * Let's check if we will get back a huge page backed by hugetlbfs, or
> > +    * get block mapping for device MMIO region.
> > +    */
> >     mmap_read_lock(current->mm);
> >     vma = find_vma_intersection(current->mm, hva, hva + 1);
> >     if (unlikely(!vma)) {
> > @@ -778,15 +810,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >             return -EFAULT;
> >     }
> >  
> > -   if (is_vm_hugetlb_page(vma))
> > -           vma_shift = huge_page_shift(hstate_vma(vma));
> > -   else
> > -           vma_shift = PAGE_SHIFT;
> > -
> > -   if (logging_active ||
> > -       (vma->vm_flags & VM_PFNMAP)) {
> > +   /*
> > +    * logging_active is guaranteed to never be true for VM_PFNMAP
> > +    * memslots.
> > +    */
> > +   if (logging_active) {
> >             force_pte = true;
> >             vma_shift = PAGE_SHIFT;
> > +   } else {
> > +           vma_shift = get_vma_page_shift(vma, hva);
> >     }
> I used an if/else structure in v4; please check that. Thanks very much!

That's fine. However, it is getting a bit late for 5.13, and we don't
have much time left to let it simmer in -next. I'll probably wait until
after the merge window to pick it up.

Thanks,

        M.

-- 
Without deviation from the norm, progress is not possible.
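
For readers following the quoted patch, here is a minimal user-space
sketch of the VM_PFNMAP part of get_vma_page_shift(). It is not the
kernel code: struct sketch_vma, the SKETCH_* constants and the helper
names are hypothetical stand-ins, and the sizes assume 4 KiB pages with
2 MiB PMD and 1 GiB PUD blocks. It only mirrors the three conditions
the patch tests for each block size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 2 MiB PMD blocks, 1 GiB PUD blocks. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PUD_SHIFT  30

/* Hypothetical stand-in for the vm_area_struct fields the check reads. */
struct sketch_vma {
	uint64_t vm_start;	/* first HVA covered by the mapping */
	uint64_t vm_end;	/* one past the last HVA */
	uint64_t vm_pgoff;	/* for a PFNMAP vma: PFN backing vm_start */
};

static uint64_t align_down(uint64_t x, uint64_t size)
{
	return x & ~(size - 1);
}

static uint64_t align_up(uint64_t x, uint64_t size)
{
	return (x + size - 1) & ~(size - 1);
}

/* The same three conditions the patch applies per block size. */
static int block_fits(const struct sketch_vma *vma, uint64_t hva,
		      uint64_t pa, int shift)
{
	uint64_t size = 1ULL << shift;

	return (hva & (size - 1)) == (pa & (size - 1)) &&	/* same offset */
	       align_down(hva, size) >= vma->vm_start &&	/* start bound */
	       align_up(hva, size) <= vma->vm_end;		/* end bound */
}

/* Try the largest block first, then fall back to smaller sizes. */
static int sketch_mmio_page_shift(const struct sketch_vma *vma, uint64_t hva)
{
	uint64_t pa;

	assert(hva >= vma->vm_start && hva < vma->vm_end);

	/* Same PA derivation as the patch: base PFN plus offset in the vma. */
	pa = (vma->vm_pgoff << SKETCH_PAGE_SHIFT) + (hva - vma->vm_start);

	if (block_fits(vma, hva, pa, SKETCH_PUD_SHIFT))
		return SKETCH_PUD_SHIFT;
	if (block_fits(vma, hva, pa, SKETCH_PMD_SHIFT))
		return SKETCH_PMD_SHIFT;
	return SKETCH_PAGE_SHIFT;
}
```

Under these assumptions, a 4 MiB region whose HVA and PA share the same
offset within 2 MiB gets a PMD-sized shift, while shifting the PA by a
single page, or shrinking the region below 2 MiB, falls back to
SKETCH_PAGE_SHIFT.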
