Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-10-14 Thread Paolo Bonzini
Il 01/09/2013 11:17, Gleb Natapov ha scritto:
> This makes me think are there other places where gfn_to_hva() was
> used, but gfn_to_hva_prot() should have been?
>  - kvm_host_page_size() looks incorrect. We never use huge page to map
>read only memory slots currently.
>  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
>address to be reported to userspace.
>  - kvm_setup_async_pf() also incorrect. Makes all page fault on read
>only slot to be sync.
>  - kvm_vm_fault() one looks OK since function assumes write only slots,
>but it is obsolete and should be deleted anyway.
> 
> Others in generic and x86 code looks OK, somebody need to check ppc and
> arm code.

PPC is a mess, for s390 I couldn't find a spec.  ARM needs to be fixed.
 Patches coming...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-10-14 Thread Paolo Bonzini
Il 01/09/2013 11:17, Gleb Natapov ha scritto:
 This makes me think are there other places where gfn_to_hva() was
 used, but gfn_to_hva_prot() should have been?
  - kvm_host_page_size() looks incorrect. We never use huge page to map
read only memory slots currently.
  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
address to be reported to userspace.
  - kvm_setup_async_pf() also incorrect. Makes all page fault on read
only slot to be sync.
  - kvm_vm_fault() one looks OK since function assumes write only slots,
but it is obsolete and should be deleted anyway.
 
 Others in generic and x86 code looks OK, somebody need to check ppc and
 arm code.

PPC is a mess, for s390 I couldn't find a spec.  ARM needs to be fixed.
 Patches coming...

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:58:41PM +0200, Paolo Bonzini wrote:
> Il 02/09/2013 12:11, Gleb Natapov ha scritto:
> >> > 
> >> > Got it, thanks for your explanation.
> >> > 
> >> > BTW, if you and Paolo are busy on other things, i am happy to fix these 
> >> > issues. :)
> > I am busy with reviews mostly :). If you are not to busy with lockless
> > write protection then fine with me. Lest wait for Paolo's input on
> > proposed API though.
> 
> No problem, I'm happy to do this kind of exercise.  I like the API you
> proposed.  I can do this change on top of the API change.  However,
> since this bug is "in the wild" it may be applicable to stable@ as well.
>  If you agree, I will also post v2 of this patch for stable kernels as
> soon as we agree on the desired faulting behavior.
> 
I am OK with having minimal patch for backporting to stable and doing
API changes + fixes for other call sites on top.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:56:07PM +0200, Paolo Bonzini wrote:
> Il 02/09/2013 12:07, Gleb Natapov ha scritto:
> > On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
> >> On 09/02/2013 05:25 PM, Gleb Natapov wrote:
> >>> On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
>  On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> > Page tables in a read-only memory slot will currently cause a triple
> > fault because the page walker uses gfn_to_hva and it fails on such a 
> > slot.
> >
> > OVMF uses such a page table; however, real hardware seems to be fine 
> > with
> > that as long as the accessed/dirty bits are set.  Save whether the slot
> > is readonly, and later check it when updating the accessed and dirty 
> > bits.
> 
>  Paolo, do you know why OVMF is using readonly memory like this?
> 
> >>> Just a guess, but perhaps they want to move to paging mode as early as
> >>> possible even before memory controller is fully initialized.
> >>>
>  AFAIK, The fault trigged by this kind of access can hardly be fixed by
>  userspace since the fault is trigged by pagetable walking not by the 
>  current
>  instruction. Do you have any idea to let uerspace emulate it properly?
> >>> Not sure what userspace you mean here, but there shouldn't be a fault in 
> >>> the
> >>
> >> I just wonder how to fix this kind of fault. The current patch returns 
> >> -EACCES
> >> but that will crash the guest. I think we'd better let userspace to fix 
> >> this
> >> error (let userspace set the D/A bit.)
> >>
> > Ugh, this is not good. Missed that. Don't know what real HW will do
> > here, but the easy thing for us to do would be to just return success.
> 
> Real hardware would just do a memory write.  What happens depends on
> what is on the bus, i.e. on what the ROM is used for.
> 
> QEMU uses read-only slots for two things: actual read-only memory where
> writes go to the bitbucket, and "ROMD" memory where writes are treated
> as MMIO.
> 
> So, in the first case we would ignore the write.  In the second we would
> do an MMIO exit to userspace.  But ignoring the write isn't always
> correct, and doing an MMIO exit is complicated, so I would just kill the
> guest.
> 
> EPT will probably write to the read-only slots without thinking much
> about it.
> 
> My patch injects a page fault, which is very likely to escalate to a
> triple fault.  This is probably never what you want---on the other hand,
> I wasn't sure what level of accuracy we want in this case, given that
> EPT does it wrong too.
> 
Just ignoring write into RO slot is OK. We do not handle page tables in
MMIO memory, so handling page tables in ROMD memory shouldn't be a
priority either. 

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:00:08PM +0200, Paolo Bonzini wrote:
> Il 02/09/2013 11:25, Gleb Natapov ha scritto:
> > On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
> >> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> >>> Page tables in a read-only memory slot will currently cause a triple
> >>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> >>>
> >>> OVMF uses such a page table; however, real hardware seems to be fine with
> >>> that as long as the accessed/dirty bits are set.  Save whether the slot
> >>> is readonly, and later check it when updating the accessed and dirty bits.
> >>
> >> Paolo, do you know why OVMF is using readonly memory like this?
> >>
> > Just a guess, but perhaps they want to move to paging mode as early as
> > possible even before memory controller is fully initialized.
> 
> More precisely they want to move to 64-bit mode as early as possible,
> and that requires paging.
> 
> >> AFAIK, The fault trigged by this kind of access can hardly be fixed by
> >> userspace since the fault is trigged by pagetable walking not by the 
> >> current
> >> instruction. Do you have any idea to let uerspace emulate it properly?
> >
> > Not sure what userspace you mean here, but there shouldn't be a fault in the
> > first place if ROM page tables have access/dirty bit set and they do.
> 
> Yep.  Actually they do not set dirty on the PD/PDP/PML4, only on the
> bottom level (which is also fine).
> 
They do it for a good reason, dirty bit exists only on the bottom level :)

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 11:25, Gleb Natapov ha scritto:
> On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
>> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
>>> Page tables in a read-only memory slot will currently cause a triple
>>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>>>
>>> OVMF uses such a page table; however, real hardware seems to be fine with
>>> that as long as the accessed/dirty bits are set.  Save whether the slot
>>> is readonly, and later check it when updating the accessed and dirty bits.
>>
>> Paolo, do you know why OVMF is using readonly memory like this?
>>
> Just a guess, but perhaps they want to move to paging mode as early as
> possible even before memory controller is fully initialized.

More precisely they want to move to 64-bit mode as early as possible,
and that requires paging.

>> AFAIK, The fault trigged by this kind of access can hardly be fixed by
>> userspace since the fault is trigged by pagetable walking not by the current
>> instruction. Do you have any idea to let uerspace emulate it properly?
>
> Not sure what userspace you mean here, but there shouldn't be a fault in the
> first place if ROM page tables have access/dirty bit set and they do.

Yep.  Actually they do not set dirty on the PD/PDP/PML4, only on the
bottom level (which is also fine).

Paolo

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 12:11, Gleb Natapov ha scritto:
>> > 
>> > Got it, thanks for your explanation.
>> > 
>> > BTW, if you and Paolo are busy on other things, i am happy to fix these 
>> > issues. :)
> I am busy with reviews mostly :). If you are not to busy with lockless
> write protection then fine with me. Lest wait for Paolo's input on
> proposed API though.

No problem, I'm happy to do this kind of exercise.  I like the API you
proposed.  I can do this change on top of the API change.  However,
since this bug is "in the wild" it may be applicable to stable@ as well.
 If you agree, I will also post v2 of this patch for stable kernels as
soon as we agree on the desired faulting behavior.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 12:07, Gleb Natapov ha scritto:
> On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
>> On 09/02/2013 05:25 PM, Gleb Natapov wrote:
>>> On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
 On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> Page tables in a read-only memory slot will currently cause a triple
> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>
> OVMF uses such a page table; however, real hardware seems to be fine with
> that as long as the accessed/dirty bits are set.  Save whether the slot
> is readonly, and later check it when updating the accessed and dirty bits.

 Paolo, do you know why OVMF is using readonly memory like this?

>>> Just a guess, but perhaps they want to move to paging mode as early as
>>> possible even before memory controller is fully initialized.
>>>
 AFAIK, The fault trigged by this kind of access can hardly be fixed by
 userspace since the fault is trigged by pagetable walking not by the 
 current
 instruction. Do you have any idea to let uerspace emulate it properly?
>>> Not sure what userspace you mean here, but there shouldn't be a fault in the
>>
>> I just wonder how to fix this kind of fault. The current patch returns 
>> -EACCES
>> but that will crash the guest. I think we'd better let userspace to fix this
>> error (let userspace set the D/A bit.)
>>
> Ugh, this is not good. Missed that. Don't know what real HW will do
> here, but the easy thing for us to do would be to just return success.

Real hardware would just do a memory write.  What happens depends on
what is on the bus, i.e. on what the ROM is used for.

QEMU uses read-only slots for two things: actual read-only memory where
writes go to the bitbucket, and "ROMD" memory where writes are treated
as MMIO.

So, in the first case we would ignore the write.  In the second we would
do an MMIO exit to userspace.  But ignoring the write isn't always
correct, and doing an MMIO exit is complicated, so I would just kill the
guest.

EPT will probably write to the read-only slots without thinking much
about it.

My patch injects a page fault, which is very likely to escalate to a
triple fault.  This is probably never what you want---on the other hand,
I wasn't sure what level of accuracy we want in this case, given that
EPT does it wrong too.

Paolo

>>> first place if ROM page tables have access/dirty bit set and they do.
>>
>> Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
>> function
>> emulates the access on the first place). Need directly return a MMIO-exit to
>> userpsace when met this fault? What happen if this fault on pagetable-walking
>> is trigged in x86_emulate_instruction().?
> I think we should not return MMIO-exit to userspace. Either ignore write 
> attempt
> or kill a guest.
> 
> --
>   Gleb.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:05:10PM +0800, Xiao Guangrong wrote:
> On 09/02/2013 05:49 PM, Gleb Natapov wrote:
> > On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
> >> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
> >>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
>  Page tables in a read-only memory slot will currently cause a triple
>  fault because the page walker uses gfn_to_hva and it fails on such a 
>  slot.
> 
>  OVMF uses such a page table; however, real hardware seems to be fine with
>  that as long as the accessed/dirty bits are set.  Save whether the slot
>  is readonly, and later check it when updating the accessed and dirty 
>  bits.
> 
> >>> The fix looks OK to me, but some comment below.
> >>>
>  Cc: sta...@vger.kernel.org
>  Cc: g...@redhat.com
>  Cc: Xiao Guangrong 
>  Signed-off-by: Paolo Bonzini 
>  ---
>   CCing to stable@ since the regression was introduced with
>   support for readonly memory slots.
> 
>   arch/x86/kvm/paging_tmpl.h |  7 ++-
>   include/linux/kvm_host.h   |  1 +
>   virt/kvm/kvm_main.c| 14 +-
>   3 files changed, 16 insertions(+), 6 deletions(-)
> 
>  diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>  index 0433301..dadc5c0 100644
>  --- a/arch/x86/kvm/paging_tmpl.h
>  +++ b/arch/x86/kvm/paging_tmpl.h
>  @@ -99,6 +99,7 @@ struct guest_walker {
>   pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
>   gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
>   pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
>  +bool pte_writable[PT_MAX_FULL_LEVELS];
>   unsigned pt_access;
>   unsigned pte_access;
>   gfn_t gfn;
>  @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
>  kvm_vcpu *vcpu,
>   if (pte == orig_pte)
>   continue;
>   
>  +if (unlikely(!walker->pte_writable[level - 1]))
>  +return -EACCES;
>  +
>   ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
>  orig_pte, pte);
>   if (ret)
>   return ret;
>  @@ -309,7 +313,8 @@ retry_walk:
>   goto error;
>   real_gfn = gpa_to_gfn(real_gfn);
>   
>  -host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
>  +host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
>  +
>  >pte_writable[walker->level - 1]);
> >>> The use of gfn_to_hva_read is misleading. The code can still write into
> >>> gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
> >>> to gfn_to_hva_write().
> >>
> >> Yes. I agreed.
> >>
> >>>
> >>> This makes me think are there other places where gfn_to_hva() was
> >>> used, but gfn_to_hva_prot() should have been?
> >>>  - kvm_host_page_size() looks incorrect. We never use huge page to map
> >>>read only memory slots currently.
> >>
> >> It only checks whether gfn have been mapped, I think we can use
> >> gfn_to_hva_read() instead, the real permission will be checked when we 
> >> translate
> >> the gfn to pfn.
> >>
> > Yes, all the cases I listed should be changed to use function that looks
> > at both regular and RO slots.
> > 
> >>>  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
> >>>address to be reported to userspace.
> >>
> >> I have no idea on this point. kvm_handle_bad_page() is called when it 
> >> failed to
> >> translate the target gfn to pfn, then the emulator can detect the error on 
> >> target gfn
> >> properly. no? Or i misunderstood your meaning?
> >>
> > I am talking about the following code:
> > 
> > if (pfn == KVM_PFN_ERR_HWPOISON) {
> > kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), 
> > current);
> > return 0;
> > }
> > 
> > pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
> > to report the liner address of the faulty memory to a userspace here,
> > but if gfn is in a RO slot gfn_to_hva() will not return correct address
> > here.
> 
> Got it, thanks for your explanation.
> 
> BTW, if you and Paolo are busy on other things, i am happy to fix these 
> issues. :)
I am busy with reviews mostly :). If you are not to busy with lockless
write protection then fine with me. Lest wait for Paolo's input on
proposed API though.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
> On 09/02/2013 05:25 PM, Gleb Natapov wrote:
> > On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
> >> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> >>> Page tables in a read-only memory slot will currently cause a triple
> >>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> >>>
> >>> OVMF uses such a page table; however, real hardware seems to be fine with
> >>> that as long as the accessed/dirty bits are set.  Save whether the slot
> >>> is readonly, and later check it when updating the accessed and dirty bits.
> >>
> >> Paolo, do you know why OVMF is using readonly memory like this?
> >>
> > Just a guess, but perhaps they want to move to paging mode as early as
> > possible even before memory controller is fully initialized.
> > 
> >> AFAIK, The fault trigged by this kind of access can hardly be fixed by
> >> userspace since the fault is trigged by pagetable walking not by the 
> >> current
> >> instruction. Do you have any idea to let uerspace emulate it properly?
> > Not sure what userspace you mean here, but there shouldn't be a fault in the
> 
> I just wonder how to fix this kind of fault. The current patch returns -EACCES
> but that will crash the guest. I think we'd better let userspace to fix this
> error (let userspace set the D/A bit.)
> 
Ugh, this is not good. Missed that. Don't know what real HW will do
here, but the easy thing for us to do would be to just return success.

> > first place if ROM page tables have access/dirty bit set and they do.
> 
> Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
> function
> emulates the access on the first place). Need directly return a MMIO-exit to
> userpsace when met this fault? What happen if this fault on pagetable-walking
> is trigged in x86_emulate_instruction().?
I think we should not return MMIO-exit to userspace. Either ignore write attempt
or kill a guest.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:49 PM, Gleb Natapov wrote:
> On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
>> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
>>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

>>> The fix looks OK to me, but some comment below.
>>>
 Cc: sta...@vger.kernel.org
 Cc: g...@redhat.com
 Cc: Xiao Guangrong 
 Signed-off-by: Paolo Bonzini 
 ---
CCing to stable@ since the regression was introduced with
support for readonly memory slots.

  arch/x86/kvm/paging_tmpl.h |  7 ++-
  include/linux/kvm_host.h   |  1 +
  virt/kvm/kvm_main.c| 14 +-
  3 files changed, 16 insertions(+), 6 deletions(-)

 diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
 index 0433301..dadc5c0 100644
 --- a/arch/x86/kvm/paging_tmpl.h
 +++ b/arch/x86/kvm/paging_tmpl.h
 @@ -99,6 +99,7 @@ struct guest_walker {
pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
 +  bool pte_writable[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
gfn_t gfn;
 @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
 kvm_vcpu *vcpu,
if (pte == orig_pte)
continue;
  
 +  if (unlikely(!walker->pte_writable[level - 1]))
 +  return -EACCES;
 +
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
 orig_pte, pte);
if (ret)
return ret;
 @@ -309,7 +313,8 @@ retry_walk:
goto error;
real_gfn = gpa_to_gfn(real_gfn);
  
 -  host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
 +  host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
 +  >pte_writable[walker->level 
 - 1]);
>>> The use of gfn_to_hva_read is misleading. The code can still write into
>>> gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
>>> to gfn_to_hva_write().
>>
>> Yes. I agreed.
>>
>>>
>>> This makes me think are there other places where gfn_to_hva() was
>>> used, but gfn_to_hva_prot() should have been?
>>>  - kvm_host_page_size() looks incorrect. We never use huge page to map
>>>read only memory slots currently.
>>
>> It only checks whether gfn have been mapped, I think we can use
>> gfn_to_hva_read() instead, the real permission will be checked when we 
>> translate
>> the gfn to pfn.
>>
> Yes, all the cases I listed should be changed to use function that looks
> at both regular and RO slots.
> 
>>>  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
>>>address to be reported to userspace.
>>
>> I have no idea on this point. kvm_handle_bad_page() is called when it failed 
>> to
>> translate the target gfn to pfn, then the emulator can detect the error on 
>> target gfn
>> properly. no? Or i misunderstood your meaning?
>>
> I am talking about the following code:
> 
> if (pfn == KVM_PFN_ERR_HWPOISON) {
> kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
> return 0;
> }
> 
> pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
> to report the liner address of the faulty memory to a userspace here,
> but if gfn is in a RO slot gfn_to_hva() will not return correct address
> here.

Got it, thanks for your explanation.

BTW, if you and Paolo are busy on other things, i am happy to fix these issues. 
:)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:25 PM, Gleb Natapov wrote:
> On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
>> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
>>> Page tables in a read-only memory slot will currently cause a triple
>>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>>>
>>> OVMF uses such a page table; however, real hardware seems to be fine with
>>> that as long as the accessed/dirty bits are set.  Save whether the slot
>>> is readonly, and later check it when updating the accessed and dirty bits.
>>
>> Paolo, do you know why OVMF is using readonly memory like this?
>>
> Just a guess, but perhaps they want to move to paging mode as early as
> possible even before memory controller is fully initialized.
> 
>> AFAIK, The fault trigged by this kind of access can hardly be fixed by
>> userspace since the fault is trigged by pagetable walking not by the current
>> instruction. Do you have any idea to let uerspace emulate it properly?
> Not sure what userspace you mean here, but there shouldn't be a fault in the

I just wonder how to fix this kind of fault. The current patch returns -EACCES
but that will crash the guest. I think we'd better let userspace to fix this
error (let userspace set the D/A bit.)

> first place if ROM page tables have access/dirty bit set and they do.

Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
function
emulates the access on the first place). Need directly return a MMIO-exit to
userpsace when met this fault? What happen if this fault on pagetable-walking
is trigged in x86_emulate_instruction().?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
> On 09/01/2013 05:17 PM, Gleb Natapov wrote:
> > On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
> >> Page tables in a read-only memory slot will currently cause a triple
> >> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> >>
> >> OVMF uses such a page table; however, real hardware seems to be fine with
> >> that as long as the accessed/dirty bits are set.  Save whether the slot
> >> is readonly, and later check it when updating the accessed and dirty bits.
> >>
> > The fix looks OK to me, but some comment below.
> > 
> >> Cc: sta...@vger.kernel.org
> >> Cc: g...@redhat.com
> >> Cc: Xiao Guangrong 
> >> Signed-off-by: Paolo Bonzini 
> >> ---
> >>CCing to stable@ since the regression was introduced with
> >>support for readonly memory slots.
> >>
> >>  arch/x86/kvm/paging_tmpl.h |  7 ++-
> >>  include/linux/kvm_host.h   |  1 +
> >>  virt/kvm/kvm_main.c| 14 +-
> >>  3 files changed, 16 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> >> index 0433301..dadc5c0 100644
> >> --- a/arch/x86/kvm/paging_tmpl.h
> >> +++ b/arch/x86/kvm/paging_tmpl.h
> >> @@ -99,6 +99,7 @@ struct guest_walker {
> >>pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
> >>gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
> >>pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
> >> +  bool pte_writable[PT_MAX_FULL_LEVELS];
> >>unsigned pt_access;
> >>unsigned pte_access;
> >>gfn_t gfn;
> >> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
> >> kvm_vcpu *vcpu,
> >>if (pte == orig_pte)
> >>continue;
> >>  
> >> +  if (unlikely(!walker->pte_writable[level - 1]))
> >> +  return -EACCES;
> >> +
> >>ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
> >> orig_pte, pte);
> >>if (ret)
> >>return ret;
> >> @@ -309,7 +313,8 @@ retry_walk:
> >>goto error;
> >>real_gfn = gpa_to_gfn(real_gfn);
> >>  
> >> -  host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
> >> +  host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
> >> +  >pte_writable[walker->level 
> >> - 1]);
> > The use of gfn_to_hva_read is misleading. The code can still write into
> > gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
> > to gfn_to_hva_write().
> 
> Yes. I agreed.
> 
> > 
> > This makes me think are there other places where gfn_to_hva() was
> > used, but gfn_to_hva_prot() should have been?
> >  - kvm_host_page_size() looks incorrect. We never use huge page to map
> >read only memory slots currently.
> 
> It only checks whether gfn have been mapped, I think we can use
> gfn_to_hva_read() instead, the real permission will be checked when we 
> translate
> the gfn to pfn.
> 
Yes, all the cases I listed should be changed to use function that looks
at both regular and RO slots.

> >  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
> >address to be reported to userspace.
> 
> I have no idea on this point. kvm_handle_bad_page() is called when it failed 
> to
> translate the target gfn to pfn, then the emulator can detect the error on 
> target gfn
> properly. no? Or i misunderstood your meaning?
> 
I am talking about the following code:

if (pfn == KVM_PFN_ERR_HWPOISON) {
kvm_send_hwpoison_signal(gfn_to_hva(vcpu->kvm, gfn), current);
return 0;
}

pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
to report the liner address of the faulty memory to a userspace here,
but if gfn is in a RO slot gfn_to_hva() will not return correct address
here.

> >  - kvm_setup_async_pf() also incorrect. Makes all page fault on read
> >only slot to be sync.
> >  - kvm_vm_fault() one looks OK since function assumes write only slots,
> >but it is obsolete and should be deleted anyway.
> 

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/01/2013 05:17 PM, Gleb Natapov wrote:
> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
>> Page tables in a read-only memory slot will currently cause a triple
>> fault because the page walker uses gfn_to_hva and it fails on such a slot.
>>
>> OVMF uses such a page table; however, real hardware seems to be fine with
>> that as long as the accessed/dirty bits are set.  Save whether the slot
>> is readonly, and later check it when updating the accessed and dirty bits.
>>
> The fix looks OK to me, but some comment below.
> 
>> Cc: sta...@vger.kernel.org
>> Cc: g...@redhat.com
>> Cc: Xiao Guangrong 
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  CCing to stable@ since the regression was introduced with
>>  support for readonly memory slots.
>>
>>  arch/x86/kvm/paging_tmpl.h |  7 ++-
>>  include/linux/kvm_host.h   |  1 +
>>  virt/kvm/kvm_main.c| 14 +-
>>  3 files changed, 16 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>> index 0433301..dadc5c0 100644
>> --- a/arch/x86/kvm/paging_tmpl.h
>> +++ b/arch/x86/kvm/paging_tmpl.h
>> @@ -99,6 +99,7 @@ struct guest_walker {
>>  pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
>>  gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
>>  pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
>> +bool pte_writable[PT_MAX_FULL_LEVELS];
>>  unsigned pt_access;
>>  unsigned pte_access;
>>  gfn_t gfn;
>> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
>> kvm_vcpu *vcpu,
>>  if (pte == orig_pte)
>>  continue;
>>  
>> +if (unlikely(!walker->pte_writable[level - 1]))
>> +return -EACCES;
>> +
>>  ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
>> orig_pte, pte);
>>  if (ret)
>>  return ret;
>> @@ -309,7 +313,8 @@ retry_walk:
>>  goto error;
>>  real_gfn = gpa_to_gfn(real_gfn);
>>  
>> -host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
>> +host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
>> +>pte_writable[walker->level 
>> - 1]);
> The use of gfn_to_hva_read is misleading. The code can still write into
> gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
> to gfn_to_hva_write().

Yes. I agreed.

> 
> This makes me think are there other places where gfn_to_hva() was
> used, but gfn_to_hva_prot() should have been?
>  - kvm_host_page_size() looks incorrect. We never use huge page to map
>read only memory slots currently.

It only checks whether gfn have been mapped, I think we can use
gfn_to_hva_read() instead, the real permission will be checked when we translate
the gfn to pfn.

>  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
>address to be reported to userspace.

I have no idea on this point. kvm_handle_bad_page() is called when it failed to
translate the target gfn to pfn, then the emulator can detect the error on 
target gfn
properly. no? Or i misunderstood your meaning?

>  - kvm_setup_async_pf() also incorrect. Makes all page fault on read
>only slot to be sync.
>  - kvm_vm_fault() one looks OK since function assumes write only slots,
>but it is obsolete and should be deleted anyway.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
> On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> > Page tables in a read-only memory slot will currently cause a triple
> > fault because the page walker uses gfn_to_hva and it fails on such a slot.
> > 
> > OVMF uses such a page table; however, real hardware seems to be fine with
> > that as long as the accessed/dirty bits are set.  Save whether the slot
> > is readonly, and later check it when updating the accessed and dirty bits.
> 
> Paolo, do you know why OVMF is using readonly memory like this?
> 
Just a guess, but perhaps they want to move to paging mode as early as
possible even before memory controller is fully initialized.

> AFAIK, The fault trigged by this kind of access can hardly be fixed by
> userspace since the fault is trigged by pagetable walking not by the current
> instruction. Do you have any idea to let uerspace emulate it properly?
Not sure what userspace you mean here, but there shouldn't be a fault in the
first place if ROM page tables have access/dirty bit set and they do.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
> Page tables in a read-only memory slot will currently cause a triple
> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> 
> OVMF uses such a page table; however, real hardware seems to be fine with
> that as long as the accessed/dirty bits are set.  Save whether the slot
> is readonly, and later check it when updating the accessed and dirty bits.

Paolo, do you know why OVMF is using readonly memory like this?

AFAIK, The fault trigged by this kind of access can hardly be fixed by
userspace since the fault is trigged by pagetable walking not by the current
instruction. Do you have any idea to let uerspace emulate it properly?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.
 
 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

Paolo, do you know why OVMF is using readonly memory like this?

AFAIK, The fault trigged by this kind of access can hardly be fixed by
userspace since the fault is trigged by pagetable walking not by the current
instruction. Do you have any idea to let uerspace emulate it properly?

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
 On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a slot.
  
  OVMF uses such a page table; however, real hardware seems to be fine with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty bits.
 
 Paolo, do you know why OVMF is using readonly memory like this?
 
Just a guess, but perhaps they want to move to paging mode as early as
possible even before memory controller is fully initialized.

 AFAIK, The fault trigged by this kind of access can hardly be fixed by
 userspace since the fault is trigged by pagetable walking not by the current
 instruction. Do you have any idea to let uerspace emulate it properly?
Not sure what userspace you mean here, but there shouldn't be a fault in the
first place if ROM page tables have access/dirty bit set and they do.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/01/2013 05:17 PM, Gleb Natapov wrote:
 On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

 The fix looks OK to me, but some comment below.
 
 Cc: sta...@vger.kernel.org
 Cc: g...@redhat.com
 Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
  CCing to stable@ since the regression was introduced with
  support for readonly memory slots.

  arch/x86/kvm/paging_tmpl.h |  7 ++-
  include/linux/kvm_host.h   |  1 +
  virt/kvm/kvm_main.c| 14 +-
  3 files changed, 16 insertions(+), 6 deletions(-)

 diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
 index 0433301..dadc5c0 100644
 --- a/arch/x86/kvm/paging_tmpl.h
 +++ b/arch/x86/kvm/paging_tmpl.h
 @@ -99,6 +99,7 @@ struct guest_walker {
  pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
  gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
  pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
 +bool pte_writable[PT_MAX_FULL_LEVELS];
  unsigned pt_access;
  unsigned pte_access;
  gfn_t gfn;
 @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
 kvm_vcpu *vcpu,
  if (pte == orig_pte)
  continue;
  
 +if (unlikely(!walker-pte_writable[level - 1]))
 +return -EACCES;
 +
  ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
 orig_pte, pte);
  if (ret)
  return ret;
 @@ -309,7 +313,8 @@ retry_walk:
  goto error;
  real_gfn = gpa_to_gfn(real_gfn);
  
 -host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
 +host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
 +walker-pte_writable[walker-level 
 - 1]);
 The use of gfn_to_hva_read is misleading. The code can still write into
 gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
 to gfn_to_hva_write().

Yes. I agreed.

 
 This makes me think are there other places where gfn_to_hva() was
 used, but gfn_to_hva_prot() should have been?
  - kvm_host_page_size() looks incorrect. We never use huge page to map
read only memory slots currently.

It only checks whether gfn have been mapped, I think we can use
gfn_to_hva_read() instead, the real permission will be checked when we translate
the gfn to pfn.

  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
address to be reported to userspace.

I have no idea on this point. kvm_handle_bad_page() is called when it failed to
translate the target gfn to pfn, then the emulator can detect the error on 
target gfn
properly. no? Or i misunderstood your meaning?

  - kvm_setup_async_pf() also incorrect. Makes all page fault on read
only slot to be sync.
  - kvm_vm_fault() one looks OK since function assumes write only slots,
but it is obsolete and should be deleted anyway.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
 On 09/01/2013 05:17 PM, Gleb Natapov wrote:
  On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a slot.
 
  OVMF uses such a page table; however, real hardware seems to be fine with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty bits.
 
  The fix looks OK to me, but some comment below.
  
  Cc: sta...@vger.kernel.org
  Cc: g...@redhat.com
  Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
  Signed-off-by: Paolo Bonzini pbonz...@redhat.com
  ---
 CCing to stable@ since the regression was introduced with
 support for readonly memory slots.
 
   arch/x86/kvm/paging_tmpl.h |  7 ++-
   include/linux/kvm_host.h   |  1 +
   virt/kvm/kvm_main.c| 14 +-
   3 files changed, 16 insertions(+), 6 deletions(-)
 
  diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
  index 0433301..dadc5c0 100644
  --- a/arch/x86/kvm/paging_tmpl.h
  +++ b/arch/x86/kvm/paging_tmpl.h
  @@ -99,6 +99,7 @@ struct guest_walker {
 pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
 gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
 pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
  +  bool pte_writable[PT_MAX_FULL_LEVELS];
 unsigned pt_access;
 unsigned pte_access;
 gfn_t gfn;
  @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
  kvm_vcpu *vcpu,
 if (pte == orig_pte)
 continue;
   
  +  if (unlikely(!walker-pte_writable[level - 1]))
  +  return -EACCES;
  +
 ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
  orig_pte, pte);
 if (ret)
 return ret;
  @@ -309,7 +313,8 @@ retry_walk:
 goto error;
 real_gfn = gpa_to_gfn(real_gfn);
   
  -  host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
  +  host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
  +  walker-pte_writable[walker-level 
  - 1]);
  The use of gfn_to_hva_read is misleading. The code can still write into
  gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
  to gfn_to_hva_write().
 
 Yes. I agreed.
 
  
  This makes me think are there other places where gfn_to_hva() was
  used, but gfn_to_hva_prot() should have been?
   - kvm_host_page_size() looks incorrect. We never use huge page to map
 read only memory slots currently.
 
 It only checks whether gfn have been mapped, I think we can use
 gfn_to_hva_read() instead, the real permission will be checked when we 
 translate
 the gfn to pfn.
 
Yes, all the cases I listed should be changed to use function that looks
at both regular and RO slots.

   - kvm_handle_bad_page() also looks incorrect and may cause incorrect
 address to be reported to userspace.
 
 I have no idea on this point. kvm_handle_bad_page() is called when it failed 
 to
 translate the target gfn to pfn, then the emulator can detect the error on 
 target gfn
 properly. no? Or i misunderstood your meaning?
 
I am talking about the following code:

if (pfn == KVM_PFN_ERR_HWPOISON) {
kvm_send_hwpoison_signal(gfn_to_hva(vcpu-kvm, gfn), current);
return 0;
}

pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
to report the liner address of the faulty memory to a userspace here,
but if gfn is in a RO slot gfn_to_hva() will not return correct address
here.

   - kvm_setup_async_pf() also incorrect. Makes all page fault on read
 only slot to be sync.
   - kvm_vm_fault() one looks OK since function assumes write only slots,
 but it is obsolete and should be deleted anyway.
 

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:25 PM, Gleb Natapov wrote:
 On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
 On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

 Paolo, do you know why OVMF is using readonly memory like this?

 Just a guess, but perhaps they want to move to paging mode as early as
 possible even before memory controller is fully initialized.
 
 AFAIK, The fault trigged by this kind of access can hardly be fixed by
 userspace since the fault is trigged by pagetable walking not by the current
 instruction. Do you have any idea to let uerspace emulate it properly?
 Not sure what userspace you mean here, but there shouldn't be a fault in the

I just wonder how to fix this kind of fault. The current patch returns -EACCES
but that will crash the guest. I think we'd better let userspace to fix this
error (let userspace set the D/A bit.)

 first place if ROM page tables have access/dirty bit set and they do.

Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
function
emulates the access on the first place). Need directly return a MMIO-exit to
userpsace when met this fault? What happen if this fault on pagetable-walking
is trigged in x86_emulate_instruction().?

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:49 PM, Gleb Natapov wrote:
 On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
 On 09/01/2013 05:17 PM, Gleb Natapov wrote:
 On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

 The fix looks OK to me, but some comment below.

 Cc: sta...@vger.kernel.org
 Cc: g...@redhat.com
 Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
CCing to stable@ since the regression was introduced with
support for readonly memory slots.

  arch/x86/kvm/paging_tmpl.h |  7 ++-
  include/linux/kvm_host.h   |  1 +
  virt/kvm/kvm_main.c| 14 +-
  3 files changed, 16 insertions(+), 6 deletions(-)

 diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
 index 0433301..dadc5c0 100644
 --- a/arch/x86/kvm/paging_tmpl.h
 +++ b/arch/x86/kvm/paging_tmpl.h
 @@ -99,6 +99,7 @@ struct guest_walker {
pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
 +  bool pte_writable[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
gfn_t gfn;
 @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
 kvm_vcpu *vcpu,
if (pte == orig_pte)
continue;
  
 +  if (unlikely(!walker-pte_writable[level - 1]))
 +  return -EACCES;
 +
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
 orig_pte, pte);
if (ret)
return ret;
 @@ -309,7 +313,8 @@ retry_walk:
goto error;
real_gfn = gpa_to_gfn(real_gfn);
  
 -  host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
 +  host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
 +  walker-pte_writable[walker-level 
 - 1]);
 The use of gfn_to_hva_read is misleading. The code can still write into
 gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
 to gfn_to_hva_write().

 Yes. I agreed.


 This makes me think are there other places where gfn_to_hva() was
 used, but gfn_to_hva_prot() should have been?
  - kvm_host_page_size() looks incorrect. We never use huge page to map
read only memory slots currently.

 It only checks whether gfn have been mapped, I think we can use
 gfn_to_hva_read() instead, the real permission will be checked when we 
 translate
 the gfn to pfn.

 Yes, all the cases I listed should be changed to use function that looks
 at both regular and RO slots.
 
  - kvm_handle_bad_page() also looks incorrect and may cause incorrect
address to be reported to userspace.

 I have no idea on this point. kvm_handle_bad_page() is called when it failed 
 to
 translate the target gfn to pfn, then the emulator can detect the error on 
 target gfn
 properly. no? Or i misunderstood your meaning?

 I am talking about the following code:
 
 if (pfn == KVM_PFN_ERR_HWPOISON) {
 kvm_send_hwpoison_signal(gfn_to_hva(vcpu-kvm, gfn), current);
 return 0;
 }
 
 pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
 to report the liner address of the faulty memory to a userspace here,
 but if gfn is in a RO slot gfn_to_hva() will not return correct address
 here.

Got it, thanks for your explanation.

BTW, if you and Paolo are busy on other things, i am happy to fix these issues. 
:)

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
 On 09/02/2013 05:25 PM, Gleb Natapov wrote:
  On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
  On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a slot.
 
  OVMF uses such a page table; however, real hardware seems to be fine with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty bits.
 
  Paolo, do you know why OVMF is using readonly memory like this?
 
  Just a guess, but perhaps they want to move to paging mode as early as
  possible even before memory controller is fully initialized.
  
  AFAIK, The fault trigged by this kind of access can hardly be fixed by
  userspace since the fault is trigged by pagetable walking not by the 
  current
  instruction. Do you have any idea to let uerspace emulate it properly?
  Not sure what userspace you mean here, but there shouldn't be a fault in the
 
 I just wonder how to fix this kind of fault. The current patch returns -EACCES
 but that will crash the guest. I think we'd better let userspace to fix this
 error (let userspace set the D/A bit.)
 
Ugh, this is not good. Missed that. Don't know what real HW will do
here, but the easy thing for us to do would be to just return success.

  first place if ROM page tables have access/dirty bit set and they do.
 
 Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
 function
 emulates the access on the first place). Need directly return a MMIO-exit to
 userpsace when met this fault? What happen if this fault on pagetable-walking
 is trigged in x86_emulate_instruction().?
I think we should not return MMIO-exit to userspace. Either ignore write attempt
or kill a guest.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:05:10PM +0800, Xiao Guangrong wrote:
 On 09/02/2013 05:49 PM, Gleb Natapov wrote:
  On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote:
  On 09/01/2013 05:17 PM, Gleb Natapov wrote:
  On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a 
  slot.
 
  OVMF uses such a page table; however, real hardware seems to be fine with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty 
  bits.
 
  The fix looks OK to me, but some comment below.
 
  Cc: sta...@vger.kernel.org
  Cc: g...@redhat.com
  Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
  Signed-off-by: Paolo Bonzini pbonz...@redhat.com
  ---
   CCing to stable@ since the regression was introduced with
   support for readonly memory slots.
 
   arch/x86/kvm/paging_tmpl.h |  7 ++-
   include/linux/kvm_host.h   |  1 +
   virt/kvm/kvm_main.c| 14 +-
   3 files changed, 16 insertions(+), 6 deletions(-)
 
  diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
  index 0433301..dadc5c0 100644
  --- a/arch/x86/kvm/paging_tmpl.h
  +++ b/arch/x86/kvm/paging_tmpl.h
  @@ -99,6 +99,7 @@ struct guest_walker {
   pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
   gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
   pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
  +bool pte_writable[PT_MAX_FULL_LEVELS];
   unsigned pt_access;
   unsigned pte_access;
   gfn_t gfn;
  @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
  kvm_vcpu *vcpu,
   if (pte == orig_pte)
   continue;
   
  +if (unlikely(!walker-pte_writable[level - 1]))
  +return -EACCES;
  +
   ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
  orig_pte, pte);
   if (ret)
   return ret;
  @@ -309,7 +313,8 @@ retry_walk:
   goto error;
   real_gfn = gpa_to_gfn(real_gfn);
   
  -host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
  +host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
  +
  walker-pte_writable[walker-level - 1]);
  The use of gfn_to_hva_read is misleading. The code can still write into
  gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
  to gfn_to_hva_write().
 
  Yes. I agreed.
 
 
  This makes me think are there other places where gfn_to_hva() was
  used, but gfn_to_hva_prot() should have been?
   - kvm_host_page_size() looks incorrect. We never use huge page to map
 read only memory slots currently.
 
  It only checks whether gfn have been mapped, I think we can use
  gfn_to_hva_read() instead, the real permission will be checked when we 
  translate
  the gfn to pfn.
 
  Yes, all the cases I listed should be changed to use function that looks
  at both regular and RO slots.
  
   - kvm_handle_bad_page() also looks incorrect and may cause incorrect
 address to be reported to userspace.
 
  I have no idea on this point. kvm_handle_bad_page() is called when it 
  failed to
  translate the target gfn to pfn, then the emulator can detect the error on 
  target gfn
  properly. no? Or i misunderstood your meaning?
 
  I am talking about the following code:
  
  if (pfn == KVM_PFN_ERR_HWPOISON) {
  kvm_send_hwpoison_signal(gfn_to_hva(vcpu-kvm, gfn), 
  current);
  return 0;
  }
  
  pfn will be KVM_PFN_ERR_HWPOISON gfn is backed by faulty memory, we need
  to report the liner address of the faulty memory to a userspace here,
  but if gfn is in a RO slot gfn_to_hva() will not return correct address
  here.
 
 Got it, thanks for your explanation.
 
 BTW, if you and Paolo are busy on other things, i am happy to fix these 
 issues. :)
I am busy with reviews mostly :). If you are not to busy with lockless
write protection then fine with me. Lest wait for Paolo's input on
proposed API though.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 12:07, Gleb Natapov ha scritto:
 On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
 On 09/02/2013 05:25 PM, Gleb Natapov wrote:
 On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
 On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

 Paolo, do you know why OVMF is using readonly memory like this?

 Just a guess, but perhaps they want to move to paging mode as early as
 possible even before memory controller is fully initialized.

 AFAIK, The fault trigged by this kind of access can hardly be fixed by
 userspace since the fault is trigged by pagetable walking not by the 
 current
 instruction. Do you have any idea to let uerspace emulate it properly?
 Not sure what userspace you mean here, but there shouldn't be a fault in the

 I just wonder how to fix this kind of fault. The current patch returns 
 -EACCES
 but that will crash the guest. I think we'd better let userspace to fix this
 error (let userspace set the D/A bit.)

 Ugh, this is not good. Missed that. Don't know what real HW will do
 here, but the easy thing for us to do would be to just return success.

Real hardware would just do a memory write.  What happens depends on
what is on the bus, i.e. on what the ROM is used for.

QEMU uses read-only slots for two things: actual read-only memory where
writes go to the bitbucket, and ROMD memory where writes are treated
as MMIO.

So, in the first case we would ignore the write.  In the second we would
do an MMIO exit to userspace.  But ignoring the write isn't always
correct, and doing an MMIO exit is complicated, so I would just kill the
guest.

EPT will probably write to the read-only slots without thinking much
about it.

My patch injects a page fault, which is very likely to escalate to a
triple fault.  This is probably never what you want---on the other hand,
I wasn't sure what level of accuracy we want in this case, given that
EPT does it wrong too.

Paolo

 first place if ROM page tables have access/dirty bit set and they do.

 Yes, so we can not call x86_emulate_instruction() to fix this fault (that 
 function
 emulates the access on the first place). Need directly return a MMIO-exit to
 userpsace when met this fault? What happen if this fault on pagetable-walking
 is trigged in x86_emulate_instruction().?
 I think we should not return MMIO-exit to userspace. Either ignore write 
 attempt
 or kill a guest.
 
 --
   Gleb.
 

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 12:11, Gleb Natapov ha scritto:
  
  Got it, thanks for your explanation.
  
  BTW, if you and Paolo are busy on other things, i am happy to fix these 
  issues. :)
 I am busy with reviews mostly :). If you are not to busy with lockless
 write protection then fine with me. Lest wait for Paolo's input on
 proposed API though.

No problem, I'm happy to do this kind of exercise.  I like the API you
proposed.  I can do this change on top of the API change.  However,
since this bug is in the wild it may be applicable to stable@ as well.
 If you agree, I will also post v2 of this patch for stable kernels as
soon as we agree on the desired faulting behavior.

Paolo
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Paolo Bonzini
Il 02/09/2013 11:25, Gleb Natapov ha scritto:
 On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
 On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.

 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.

 Paolo, do you know why OVMF is using readonly memory like this?

 Just a guess, but perhaps they want to move to paging mode as early as
 possible even before memory controller is fully initialized.

More precisely they want to move to 64-bit mode as early as possible,
and that requires paging.

 AFAIK, The fault trigged by this kind of access can hardly be fixed by
 userspace since the fault is trigged by pagetable walking not by the current
 instruction. Do you have any idea to let uerspace emulate it properly?

 Not sure what userspace you mean here, but there shouldn't be a fault in the
 first place if ROM page tables have access/dirty bit set and they do.

Yep.  Actually they do not set dirty on the PD/PDP/PML4, only on the
bottom level (which is also fine).

Paolo

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 06:00:08PM +0200, Paolo Bonzini wrote:
 Il 02/09/2013 11:25, Gleb Natapov ha scritto:
  On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
  On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a slot.
 
  OVMF uses such a page table; however, real hardware seems to be fine with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty bits.
 
  Paolo, do you know why OVMF is using readonly memory like this?
 
  Just a guess, but perhaps they want to move to paging mode as early as
  possible even before memory controller is fully initialized.
 
 More precisely they want to move to 64-bit mode as early as possible,
 and that requires paging.
 
  AFAIK, The fault trigged by this kind of access can hardly be fixed by
  userspace since the fault is trigged by pagetable walking not by the 
  current
  instruction. Do you have any idea to let uerspace emulate it properly?
 
  Not sure what userspace you mean here, but there shouldn't be a fault in the
  first place if ROM page tables have access/dirty bit set and they do.
 
 Yep.  Actually they do not set dirty on the PD/PDP/PML4, only on the
 bottom level (which is also fine).
 
They do it for a good reason, dirty bit exists only on the bottom level :)

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:56:07PM +0200, Paolo Bonzini wrote:
 Il 02/09/2013 12:07, Gleb Natapov ha scritto:
  On Mon, Sep 02, 2013 at 06:00:39PM +0800, Xiao Guangrong wrote:
  On 09/02/2013 05:25 PM, Gleb Natapov wrote:
  On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote:
  On 08/30/2013 08:41 PM, Paolo Bonzini wrote:
  Page tables in a read-only memory slot will currently cause a triple
  fault because the page walker uses gfn_to_hva and it fails on such a 
  slot.
 
  OVMF uses such a page table; however, real hardware seems to be fine 
  with
  that as long as the accessed/dirty bits are set.  Save whether the slot
  is readonly, and later check it when updating the accessed and dirty 
  bits.
 
  Paolo, do you know why OVMF is using readonly memory like this?
 
  Just a guess, but perhaps they want to move to paging mode as early as
  possible even before memory controller is fully initialized.
 
  AFAIK, The fault trigged by this kind of access can hardly be fixed by
  userspace since the fault is trigged by pagetable walking not by the 
  current
  instruction. Do you have any idea to let uerspace emulate it properly?
  Not sure what userspace you mean here, but there shouldn't be a fault in 
  the
 
  I just wonder how to fix this kind of fault. The current patch returns 
  -EACCES
  but that will crash the guest. I think we'd better let userspace to fix 
  this
  error (let userspace set the D/A bit.)
 
  Ugh, this is not good. Missed that. Don't know what real HW will do
  here, but the easy thing for us to do would be to just return success.
 
 Real hardware would just do a memory write.  What happens depends on
 what is on the bus, i.e. on what the ROM is used for.
 
 QEMU uses read-only slots for two things: actual read-only memory where
 writes go to the bitbucket, and ROMD memory where writes are treated
 as MMIO.
 
 So, in the first case we would ignore the write.  In the second we would
 do an MMIO exit to userspace.  But ignoring the write isn't always
 correct, and doing an MMIO exit is complicated, so I would just kill the
 guest.
 
 EPT will probably write to the read-only slots without thinking much
 about it.
 
 My patch injects a page fault, which is very likely to escalate to a
 triple fault.  This is probably never what you want---on the other hand,
 I wasn't sure what level of accuracy we want in this case, given that
 EPT does it wrong too.
 
Just ignoring write into RO slot is OK. We do not handle page tables in
MMIO memory, so handling page tables in ROMD memory shouldn't be a
priority either. 

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Gleb Natapov
On Mon, Sep 02, 2013 at 05:58:41PM +0200, Paolo Bonzini wrote:
 Il 02/09/2013 12:11, Gleb Natapov ha scritto:
   
   Got it, thanks for your explanation.
   
   BTW, if you and Paolo are busy on other things, i am happy to fix these 
   issues. :)
  I am busy with reviews mostly :). If you are not to busy with lockless
  write protection then fine with me. Lest wait for Paolo's input on
  proposed API though.
 
 No problem, I'm happy to do this kind of exercise.  I like the API you
 proposed.  I can do this change on top of the API change.  However,
 since this bug is in the wild it may be applicable to stable@ as well.
  If you agree, I will also post v2 of this patch for stable kernels as
 soon as we agree on the desired faulting behavior.
 
I am OK with having minimal patch for backporting to stable and doing
API changes + fixes for other call sites on top.

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-01 Thread Gleb Natapov
On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
> Page tables in a read-only memory slot will currently cause a triple
> fault because the page walker uses gfn_to_hva and it fails on such a slot.
> 
> OVMF uses such a page table; however, real hardware seems to be fine with
> that as long as the accessed/dirty bits are set.  Save whether the slot
> is readonly, and later check it when updating the accessed and dirty bits.
> 
The fix looks OK to me, but some comment below.

> Cc: sta...@vger.kernel.org
> Cc: g...@redhat.com
> Cc: Xiao Guangrong 
> Signed-off-by: Paolo Bonzini 
> ---
>   CCing to stable@ since the regression was introduced with
>   support for readonly memory slots.
> 
>  arch/x86/kvm/paging_tmpl.h |  7 ++-
>  include/linux/kvm_host.h   |  1 +
>  virt/kvm/kvm_main.c| 14 +-
>  3 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 0433301..dadc5c0 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -99,6 +99,7 @@ struct guest_walker {
>   pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
>   gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
>   pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
> + bool pte_writable[PT_MAX_FULL_LEVELS];
>   unsigned pt_access;
>   unsigned pte_access;
>   gfn_t gfn;
> @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
> kvm_vcpu *vcpu,
>   if (pte == orig_pte)
>   continue;
>  
> + if (unlikely(!walker->pte_writable[level - 1]))
> + return -EACCES;
> +
>   ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
> orig_pte, pte);
>   if (ret)
>   return ret;
> @@ -309,7 +313,8 @@ retry_walk:
>   goto error;
>   real_gfn = gpa_to_gfn(real_gfn);
>  
> - host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
> + host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
> + >pte_writable[walker->level 
> - 1]);
The use of gfn_to_hva_read is misleading. The code can still write into
gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
to gfn_to_hva_write().

This makes me think are there other places where gfn_to_hva() was
used, but gfn_to_hva_prot() should have been?
 - kvm_host_page_size() looks incorrect. We never use huge page to map
   read only memory slots currently.
 - kvm_handle_bad_page() also looks incorrect and may cause incorrect
   address to be reported to userspace.
 - kvm_setup_async_pf() also incorrect. Makes all page fault on read
   only slot to be sync.
 - kvm_vm_fault() one looks OK since function assumes write only slots,
   but it is obsolete and should be deleted anyway.

Others in generic and x86 code looks OK, somebody need to check ppc and
arm code.


>   if (unlikely(kvm_is_error_hva(host_addr)))
>   goto error;
>  
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index ca645a0..22f9cdf 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -533,6 +533,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, 
> struct page **pages,
>  
>  struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
>  unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
> +unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable);
>  unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
>  void kvm_release_page_clean(struct page *page);
>  void kvm_release_page_dirty(struct page *page);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f7e4334..418d037 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1078,11 +1078,15 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
>  EXPORT_SYMBOL_GPL(gfn_to_hva);
>  
>  /*
> - * The hva returned by this function is only allowed to be read.
> - * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
> + * If writable is set to false, the hva returned by this function is only
> + * allowed to be read.
>   */
> -static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
> +unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable)
>  {
> + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
> + if (writable)
> + *writable = !memslot_is_readonly(slot);
> +
>   return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
>  }
>  
> @@ -1450,7 +1454,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, 
> void *data, int offset,
>   int r;
>   unsigned long addr;
>  
> - addr = gfn_to_hva_read(kvm, gfn);
> + addr = gfn_to_hva_read(kvm, gfn, NULL);
>   if (kvm_is_error_hva(addr))
>   return -EFAULT;
>   r = kvm_read_hva(data, (void __user *)addr + offset, len);
> @@ -1488,7 +1492,7 @@ int 

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-01 Thread Gleb Natapov
On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote:
 Page tables in a read-only memory slot will currently cause a triple
 fault because the page walker uses gfn_to_hva and it fails on such a slot.
 
 OVMF uses such a page table; however, real hardware seems to be fine with
 that as long as the accessed/dirty bits are set.  Save whether the slot
 is readonly, and later check it when updating the accessed and dirty bits.
 
The fix looks OK to me, but some comment below.

 Cc: sta...@vger.kernel.org
 Cc: g...@redhat.com
 Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
 Signed-off-by: Paolo Bonzini pbonz...@redhat.com
 ---
   CCing to stable@ since the regression was introduced with
   support for readonly memory slots.
 
  arch/x86/kvm/paging_tmpl.h |  7 ++-
  include/linux/kvm_host.h   |  1 +
  virt/kvm/kvm_main.c| 14 +-
  3 files changed, 16 insertions(+), 6 deletions(-)
 
 diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
 index 0433301..dadc5c0 100644
 --- a/arch/x86/kvm/paging_tmpl.h
 +++ b/arch/x86/kvm/paging_tmpl.h
 @@ -99,6 +99,7 @@ struct guest_walker {
   pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
   gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
   pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
 + bool pte_writable[PT_MAX_FULL_LEVELS];
   unsigned pt_access;
   unsigned pte_access;
   gfn_t gfn;
 @@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
 kvm_vcpu *vcpu,
   if (pte == orig_pte)
   continue;
  
 + if (unlikely(!walker-pte_writable[level - 1]))
 + return -EACCES;
 +
   ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
 orig_pte, pte);
   if (ret)
   return ret;
 @@ -309,7 +313,8 @@ retry_walk:
   goto error;
   real_gfn = gpa_to_gfn(real_gfn);
  
 - host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
 + host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
 + walker-pte_writable[walker-level 
 - 1]);
The use of gfn_to_hva_read is misleading. The code can still write into
gfn. Lets rename gfn_to_hva_read to gfn_to_hva_prot() and gfn_to_hva()
to gfn_to_hva_write().

This makes me think are there other places where gfn_to_hva() was
used, but gfn_to_hva_prot() should have been?
 - kvm_host_page_size() looks incorrect. We never use huge page to map
   read only memory slots currently.
 - kvm_handle_bad_page() also looks incorrect and may cause incorrect
   address to be reported to userspace.
 - kvm_setup_async_pf() also incorrect. Makes all page fault on read
   only slot to be sync.
 - kvm_vm_fault() one looks OK since function assumes write only slots,
   but it is obsolete and should be deleted anyway.

Others in generic and x86 code looks OK, somebody need to check ppc and
arm code.


   if (unlikely(kvm_is_error_hva(host_addr)))
   goto error;
  
 diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
 index ca645a0..22f9cdf 100644
 --- a/include/linux/kvm_host.h
 +++ b/include/linux/kvm_host.h
 @@ -533,6 +533,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, 
 struct page **pages,
  
  struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
  unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
 +unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable);
  unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
  void kvm_release_page_clean(struct page *page);
  void kvm_release_page_dirty(struct page *page);
 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
 index f7e4334..418d037 100644
 --- a/virt/kvm/kvm_main.c
 +++ b/virt/kvm/kvm_main.c
 @@ -1078,11 +1078,15 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
  EXPORT_SYMBOL_GPL(gfn_to_hva);
  
  /*
 - * The hva returned by this function is only allowed to be read.
 - * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
 + * If writable is set to false, the hva returned by this function is only
 + * allowed to be read.
   */
 -static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
 +unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable)
  {
 + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
 + if (writable)
 + *writable = !memslot_is_readonly(slot);
 +
   return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
  }
  
 @@ -1450,7 +1454,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, 
 void *data, int offset,
   int r;
   unsigned long addr;
  
 - addr = gfn_to_hva_read(kvm, gfn);
 + addr = gfn_to_hva_read(kvm, gfn, NULL);
   if (kvm_is_error_hva(addr))
   return -EFAULT;
   r = kvm_read_hva(data, (void __user *)addr + offset, len);
 @@ -1488,7 +1492,7 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, 

[PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-08-30 Thread Paolo Bonzini
Page tables in a read-only memory slot will currently cause a triple
fault because the page walker uses gfn_to_hva and it fails on such a slot.

OVMF uses such a page table; however, real hardware seems to be fine with
that as long as the accessed/dirty bits are set.  Save whether the slot
is readonly, and later check it when updating the accessed and dirty bits.

Cc: sta...@vger.kernel.org
Cc: g...@redhat.com
Cc: Xiao Guangrong 
Signed-off-by: Paolo Bonzini 
---
CCing to stable@ since the regression was introduced with
support for readonly memory slots.

 arch/x86/kvm/paging_tmpl.h |  7 ++-
 include/linux/kvm_host.h   |  1 +
 virt/kvm/kvm_main.c| 14 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 0433301..dadc5c0 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -99,6 +99,7 @@ struct guest_walker {
pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
+   bool pte_writable[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
gfn_t gfn;
@@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
kvm_vcpu *vcpu,
if (pte == orig_pte)
continue;
 
+   if (unlikely(!walker->pte_writable[level - 1]))
+   return -EACCES;
+
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
orig_pte, pte);
if (ret)
return ret;
@@ -309,7 +313,8 @@ retry_walk:
goto error;
real_gfn = gpa_to_gfn(real_gfn);
 
-   host_addr = gfn_to_hva(vcpu->kvm, real_gfn);
+   host_addr = gfn_to_hva_read(vcpu->kvm, real_gfn,
+   >pte_writable[walker->level 
- 1]);
if (unlikely(kvm_is_error_hva(host_addr)))
goto error;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ca645a0..22f9cdf 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -533,6 +533,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, 
struct page **pages,
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
+unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable);
 unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f7e4334..418d037 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1078,11 +1078,15 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 EXPORT_SYMBOL_GPL(gfn_to_hva);
 
 /*
- * The hva returned by this function is only allowed to be read.
- * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
+ * If writable is set to false, the hva returned by this function is only
+ * allowed to be read.
  */
-static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
+unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable)
 {
+   struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
+   if (writable)
+   *writable = !memslot_is_readonly(slot);
+
return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
 }
 
@@ -1450,7 +1454,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void 
*data, int offset,
int r;
unsigned long addr;
 
-   addr = gfn_to_hva_read(kvm, gfn);
+   addr = gfn_to_hva_read(kvm, gfn, NULL);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = kvm_read_hva(data, (void __user *)addr + offset, len);
@@ -1488,7 +1492,7 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, 
void *data,
gfn_t gfn = gpa >> PAGE_SHIFT;
int offset = offset_in_page(gpa);
 
-   addr = gfn_to_hva_read(kvm, gfn);
+   addr = gfn_to_hva_read(kvm, gfn, NULL);
if (kvm_is_error_hva(addr))
return -EFAULT;
pagefault_disable();
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-08-30 Thread Paolo Bonzini
Page tables in a read-only memory slot will currently cause a triple
fault because the page walker uses gfn_to_hva and it fails on such a slot.

OVMF uses such a page table; however, real hardware seems to be fine with
that as long as the accessed/dirty bits are set.  Save whether the slot
is readonly, and later check it when updating the accessed and dirty bits.

Cc: sta...@vger.kernel.org
Cc: g...@redhat.com
Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
CCing to stable@ since the regression was introduced with
support for readonly memory slots.

 arch/x86/kvm/paging_tmpl.h |  7 ++-
 include/linux/kvm_host.h   |  1 +
 virt/kvm/kvm_main.c| 14 +-
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 0433301..dadc5c0 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -99,6 +99,7 @@ struct guest_walker {
pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
+   bool pte_writable[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
gfn_t gfn;
@@ -235,6 +236,9 @@ static int FNAME(update_accessed_dirty_bits)(struct 
kvm_vcpu *vcpu,
if (pte == orig_pte)
continue;
 
+   if (unlikely(!walker-pte_writable[level - 1]))
+   return -EACCES;
+
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, 
orig_pte, pte);
if (ret)
return ret;
@@ -309,7 +313,8 @@ retry_walk:
goto error;
real_gfn = gpa_to_gfn(real_gfn);
 
-   host_addr = gfn_to_hva(vcpu-kvm, real_gfn);
+   host_addr = gfn_to_hva_read(vcpu-kvm, real_gfn,
+   walker-pte_writable[walker-level 
- 1]);
if (unlikely(kvm_is_error_hva(host_addr)))
goto error;
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ca645a0..22f9cdf 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -533,6 +533,7 @@ int gfn_to_page_many_atomic(struct kvm *kvm, gfn_t gfn, 
struct page **pages,
 
 struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn);
 unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn);
+unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable);
 unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f7e4334..418d037 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1078,11 +1078,15 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 EXPORT_SYMBOL_GPL(gfn_to_hva);
 
 /*
- * The hva returned by this function is only allowed to be read.
- * It should pair with kvm_read_hva() or kvm_read_hva_atomic().
+ * If writable is set to false, the hva returned by this function is only
+ * allowed to be read.
  */
-static unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn)
+unsigned long gfn_to_hva_read(struct kvm *kvm, gfn_t gfn, bool *writable)
 {
+   struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
+   if (writable)
+   *writable = !memslot_is_readonly(slot);
+
return __gfn_to_hva_many(gfn_to_memslot(kvm, gfn), gfn, NULL, false);
 }
 
@@ -1450,7 +1454,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void 
*data, int offset,
int r;
unsigned long addr;
 
-   addr = gfn_to_hva_read(kvm, gfn);
+   addr = gfn_to_hva_read(kvm, gfn, NULL);
if (kvm_is_error_hva(addr))
return -EFAULT;
r = kvm_read_hva(data, (void __user *)addr + offset, len);
@@ -1488,7 +1492,7 @@ int kvm_read_guest_atomic(struct kvm *kvm, gpa_t gpa, 
void *data,
gfn_t gfn = gpa  PAGE_SHIFT;
int offset = offset_in_page(gpa);
 
-   addr = gfn_to_hva_read(kvm, gfn);
+   addr = gfn_to_hva_read(kvm, gfn, NULL);
if (kvm_is_error_hva(addr))
return -EFAULT;
pagefault_disable();
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/