On Mon, Sep 02, 2013 at 06:05:10PM +0800, Xiao Guangrong wrote:
> On 09/02/2013 05:49 PM, Gleb Natapov wrote:
> > On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
> >> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
> >>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
> >>>> Page tables in a read-only memory slot will currently cause a triple
> >>>> fault because the page walker uses gfn_to_hva and it fails on such a 
> >>>> slot.
> >>>>
> >>>> OVMF uses such a page table; however, real hardware seems to be fine with
> >>>> that as long as the accessed/dirty bits are set.  Save whether the slot
> >>>> is readonly, and later check it when updating the accessed and dirty 
> >>>> bits.
> >>>>
> >>> The fix looks OK to me, but some comments below.
> >>>
> >>>> Cc: [email protected]
> >>>> Cc: [email protected]
> >>>> Cc: Xiao Guangrong <[email protected]>
> >>>> Signed-off-by: Paolo Bonzini <[email protected]>
> >>>> ---
> >>>>  CCing to stable@ since the regression was introduced with
> >>>>  support for readonly memory slots.
> >>>>
> >>>>  arch/x86/kvm/paging_tmpl.h |  7 ++++++-
> >>>>  include/linux/kvm_host.h   |  1 +
> >>>>  virt/kvm/kvm_main.c        | 14 +++++++++-----
> >>>>  3 files changed, 16 insertions(+), 6 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> >>>> index 0433301..dadc5c0 100644
> >>>> --- a/arch/x86/kvm/paging_tmpl.h
> >>>> +++ b/arch/x86/kvm/paging_tmpl.h
> >>>> @@ -99,6 +99,7 @@ struct guest_walker {
> >>>>          pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
> >>>>          gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
> >>>>          pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
> >>>> +        bool pte_writable[PT_MAX_FULL_LEVELS];
> >>>>          unsigned pt_access;
> >>>>          unsigned pte_access;
> >>>>          gfn_t gfn;
> >>>> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
> >>>>                  if (pte == orig_pte)
> >>>>                          continue;
> >>>>  
> >>>> +                if (unlikely(!walker->pte_writable[level - 1]))
> >>>> +                        return -EACCES;
> >>>> +
> >>>>                  ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
> >>>>                  if (ret)
> >>>>                          return ret;
> >>>> @@ -309,7 +313,8 @@ retry_walk:
> >>>>                          goto error;
> >>>>                  real_gfn = gpa_to_gfn(real_gfn);
> >>>>  
> >>>> -                host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
> >>>> +                host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
> >>>> +                                            &walker->pte_writable[walker->level - 1]);
> >>> The use of gfn_to_hva_read is misleading. The code can still write into
> >>> gfn. Let's rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
> >>> to gfn_to_hva_write().
> >>
> >> Yes, I agree.
> >>
> >>>
> >>> This makes me wonder whether there are other places where gfn_to_hva()
> >>> was used, but gfn_to_hva_prot() should have been:
> >>>  - kvm_host_page_size() looks incorrect. We never use huge pages to map
> >>>    read-only memory slots currently.
> >>
> >> It only checks whether the gfn has been mapped. I think we can use
> >> gfn_to_hva_read() instead; the real permission will be checked when we
> >> translate the gfn to pfn.
> >>
> > Yes, all the cases I listed should be changed to use a function that looks
> > at both regular and RO slots.
> > 
> >>>  - kvm_handle_bad_page() also looks incorrect and may cause an incorrect
> >>>    address to be reported to userspace.
> >>
> >> I have no idea on this point. kvm_handle_bad_page() is called when it
> >> failed to translate the target gfn to pfn, so the emulator can detect
> >> the error on the target gfn properly, no? Or did I misunderstand you?
> >>
> > I am talking about the following code:
> > 
> >         if (pfn == KVM_PFN_ERR_HWPOISON) {
> >                 kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
> >                 return 0;
> >         }
> > 
> > pfn will be KVM_PFN_ERR_HWPOISON if gfn is backed by faulty memory. We need
> > to report the linear address of the faulty memory to userspace here,
> > but if gfn is in a RO slot gfn_to_hva() will not return the correct address
> > here.
> 
> Got it, thanks for your explanation.
> 
> BTW, if you and Paolo are busy with other things, I am happy to fix these
> issues. :)
I am busy with reviews mostly :). If you are not too busy with lockless
write protection then fine with me. Let's wait for Paolo's input on the
proposed API though.
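
To make the proposed split easier to discuss, here is a rough userspace
model of it. This is only a sketch, not kernel code: struct toy_slot,
TOY_BAD_HVA and the toy_* helpers are made-up stand-ins, and the real
functions would take a struct kvm * rather than a slot array. The point
is just that the _prot() variant looks at both regular and RO slots and
reports writability, while the _write() variant refuses RO slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_BAD_HVA    UINT64_MAX  /* hypothetical "no valid hva" marker */

/* Toy stand-in for a memory slot; NOT the kernel's struct kvm_memory_slot. */
struct toy_slot {
	uint64_t base_gfn;        /* first guest frame number in the slot */
	uint64_t npages;          /* slot length in pages */
	uint64_t userspace_addr;  /* host virtual address of the first page */
	bool readonly;            /* slot was registered read-only */
};

/* Linear search for the slot containing gfn; NULL if none does. */
static const struct toy_slot *toy_find_slot(const struct toy_slot *slots,
					    size_t nslots, uint64_t gfn)
{
	for (size_t i = 0; i < nslots; i++)
		if (gfn >= slots[i].base_gfn &&
		    gfn - slots[i].base_gfn < slots[i].npages)
			return &slots[i];
	return NULL;
}

/*
 * Model of the proposed gfn_to_hva_prot(): succeeds for regular and
 * read-only slots alike and tells the caller whether writes are allowed.
 */
static uint64_t toy_gfn_to_hva_prot(const struct toy_slot *slots,
				    size_t nslots, uint64_t gfn,
				    bool *writable)
{
	const struct toy_slot *slot = toy_find_slot(slots, nslots, gfn);

	if (!slot)
		return TOY_BAD_HVA;
	if (writable)
		*writable = !slot->readonly;
	return slot->userspace_addr +
	       ((gfn - slot->base_gfn) << TOY_PAGE_SHIFT);
}

/*
 * Model of the proposed gfn_to_hva_write(): same lookup, but a read-only
 * slot is treated as an error because the caller intends to write.
 */
static uint64_t toy_gfn_to_hva_write(const struct toy_slot *slots,
				     size_t nslots, uint64_t gfn)
{
	bool writable = false;
	uint64_t hva = toy_gfn_to_hva_prot(slots, nslots, gfn, &writable);

	if (hva == TOY_BAD_HVA || !writable)
		return TOY_BAD_HVA;
	return hva;
}
```

With this model the hwpoison case above is easy to see: a caller that
only wants the address of a gfn in a RO slot gets an error value from
the _write() variant, so it would have to use the _prot() variant.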

--
                        Gleb.