[PATCH v2 11/15] KVM: MMU: reintroduce kvm_mmu_isolate_page()

2013-09-05 Thread Xiao Guangrong
It was removed by commit 834be0d83. Now we will need it to do lockless shadow page walking protected by rcu, so reintroduce it Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/mmu.c b

[PATCH v2 13/15] KVM: MMU: locklessly write-protect the page

2013-09-05 Thread Xiao Guangrong
-off-by: Xiao Guangrong --- arch/x86/include/asm/kvm_host.h | 4 arch/x86/kvm/mmu.c | 53 + arch/x86/kvm/mmu.h | 6 + arch/x86/kvm/x86.c | 11 - 4 files changed, 49 insertions(+), 25 deletions(-) diff

[PATCH v2 02/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-09-05 Thread Xiao Guangrong
ble and avoids potential issue in the further development Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 7714fd8..869f1db 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/

[PATCH v2 15/15] KVM: MMU: use rcu functions to access the pointer

2013-09-05 Thread Xiao Guangrong
Use rcu_assign_pointer() to update all the pointers in desc and use rcu_dereference() to locklessly read the pointers Signed-off-by: Xiao Guangrong --- arch/x86/kvm/mmu.c | 46 -- 1 file changed, 28 insertions(+), 18 deletions(-) diff --git a/arch/x86

[PATCH v2 00/15] KVM: MMU: locklessly wirte-protect

2013-09-05 Thread Xiao Guangrong
te based on the dirty bitmap, we should ensure the writable spte can be found in rmap before the dirty bitmap is visible. Otherwise, we cleared the dirty bitmap and failed to write-protect the page. Performance result The performance result and the benchmark can be found at:

[PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-09-05 Thread Xiao Guangrong
nce and the scalability Signed-off-by: Xiao Guangrong --- arch/x86/include/asm/kvm_host.h | 6 +- arch/x86/kvm/mmu.c | 32 arch/x86/kvm/mmu.h | 22 ++ 3 files changed, 59 insertions(+), 1 deletion(-) diff --

[PATCH v2 02/15] KVM: MMU: properly check last spte in fast_page_fault()

2013-09-05 Thread Xiao Guangrong
and avoids potential issue in the further development Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 7714fd8..869f1db 100644 --- a/arch/x86

[PATCH v2 15/15] KVM: MMU: use rcu functions to access the pointer

2013-09-05 Thread Xiao Guangrong
Use rcu_assign_pointer() to update all the pointers in desc and use rcu_dereference() to locklessly read the pointers Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 46 -- 1 file changed, 28 insertions(+), 18

[PATCH v2 00/15] KVM: MMU: locklessly wirte-protect

2013-09-05 Thread Xiao Guangrong
is visible. Otherwise, we cleared the dirty bitmap and failed to write-protect the page. Performance result The performance result and the benchmark can be found at: http://permalink.gmane.org/gmane.linux.kernel/1534876 Xiao Guangrong (15): KVM: MMU: fix the count of spte

[PATCH v2 12/15] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-09-05 Thread Xiao Guangrong
and the scalability Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/include/asm/kvm_host.h | 6 +- arch/x86/kvm/mmu.c | 32 arch/x86/kvm/mmu.h | 22 ++ 3 files changed, 59 insertions

[PATCH v2 13/15] KVM: MMU: locklessly write-protect the page

2013-09-05 Thread Xiao Guangrong
-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/include/asm/kvm_host.h | 4 arch/x86/kvm/mmu.c | 53 + arch/x86/kvm/mmu.h | 6 + arch/x86/kvm/x86.c | 11 - 4 files changed, 49

[PATCH v2 09/15] KVM: MMU: introduce pte-list lockless walker

2013-09-05 Thread Xiao Guangrong
, but the issue will be triggered if we expand the size of desc in future development Thanks to SLAB_DESTROY_BY_RCU, the desc can be quickly reused Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 57

[PATCH v2 04/15] KVM: MMU: flush tlb if the spte can be locklessly modified

2013-09-05 Thread Xiao Guangrong
, see spte.w = 0, then without flush tlb unlock mmu-lock !!! At this point, the shadow page can still be writable due to the corrupt tlb entry Flush all TLB Signed-off-by: Xiao Guangrong

[PATCH v2 10/15] KVM: MMU: initialize the pointers in pte_list_desc properly

2013-09-05 Thread Xiao Guangrong
kmem_cache_zalloc Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 3e1432f..fe80019 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86

[PATCH v2 11/15] KVM: MMU: reintroduce kvm_mmu_isolate_page()

2013-09-05 Thread Xiao Guangrong
It was removed by commit 834be0d83. Now we will need it to do lockless shadow page walking protected by rcu, so reintroduce it Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions

[PATCH v2 14/15] KVM: MMU: clean up spte_write_protect

2013-09-05 Thread Xiao Guangrong
Now, the only user of spte_write_protect is rmap_write_protect which always calls spte_write_protect with pt_protect = true, so drop it and the unused parameter @kvm Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 19 --- 1 file changed, 8

[PATCH v2 01/15] KVM: MMU: fix the count of spte number

2013-09-05 Thread Xiao Guangrong
If the desc is the last one and it is full, its sptes are not counted Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 6e2d2c8..7714fd8 100644 --- a/arch/x86/kvm

[PATCH v2 03/15] KVM: MMU: lazily drop large spte

2013-09-05 Thread Xiao Guangrong
large spte to writable but only dirty the first page into the dirty-bitmap, which means the other pages are missed. Fix it by allowing only the normal sptes (on the PT_PAGE_TABLE_LEVEL level) to be fast fixed Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 36

[PATCH v2 06/15] KVM: MMU: update spte and add it into rmap before dirty log

2013-09-05 Thread Xiao Guangrong
Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 84 ++ arch/x86/kvm/x86.c | 10 +++ 2 files changed, 76 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index

[PATCH v2 05/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes

2013-09-05 Thread Xiao Guangrong
, that means it does not depend on PT_WRITABLE_MASK anymore Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 18 ++ arch/x86/kvm/x86.c | 9 +++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm

[PATCH v2 07/15] KVM: MMU: redesign the algorithm of pte_list

2013-09-05 Thread Xiao Guangrong
-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 180 - 1 file changed, 123 insertions(+), 57 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 8ea54d9..08fb4e2 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch

[PATCH v2 08/15] KVM: MMU: introduce nulls desc

2013-09-05 Thread Xiao Guangrong
can not see the same nulls used on different rmaps Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 35 +-- 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 08fb4e2

Re: [PATCH v2] KVM: mmu: allow page tables to be in read-only slots

2013-09-05 Thread Xiao Guangrong
. Note that this scenario is not supported by NPT at all, as explained by comments in the code. Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:49 PM, Gleb Natapov wrote: > On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote: >> On 09/01/2013 05:17 PM, Gleb Natapov wrote: >>> On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote: >>>> Page tables in a read-only memory slot

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:25 PM, Gleb Natapov wrote: > On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote: >> On 08/30/2013 08:41 PM, Paolo Bonzini wrote: >>> Page tables in a read-only memory slot will currently cause a triple >>> fault because the page walker u

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
below. > >> Cc: sta...@vger.kernel.org >> Cc: g...@redhat.com >> Cc: Xiao Guangrong >> Signed-off-by: Paolo Bonzini >> --- >> CCing to stable@ since the regression was introduced with >> support for readonly memory slots. >> >> arch/x86

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 08:41 PM, Paolo Bonzini wrote: > Page tables in a read-only memory slot will currently cause a triple > fault because the page walker uses gfn_to_hva and it fails on such a slot. > > OVMF uses such a page table; however, real hardware seems to be fine with > that as long as the

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 07:44 PM, Gleb Natapov wrote: > On Thu, Aug 29, 2013 at 08:02:30PM +0800, Xiao Guangrong wrote: >> On 08/29/2013 07:33 PM, Xiao Guangrong wrote: >>> On 08/29/2013 05:31 PM, Gleb Natapov wrote: >>>> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Gua

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 07:38 PM, Gleb Natapov wrote: > On Thu, Aug 29, 2013 at 07:26:40PM +0800, Xiao Guangrong wrote: >> On 08/29/2013 05:51 PM, Gleb Natapov wrote: >>> On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote: >>>>> As Doc

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 07:38 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 07:26:40PM +0800, Xiao Guangrong wrote: On 08/29/2013 05:51 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote: As Documentation/RCU/whatisRCU.txt says: As with rcu_assign_pointer

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 07:44 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 08:02:30PM +0800, Xiao Guangrong wrote: On 08/29/2013 07:33 PM, Xiao Guangrong wrote: On 08/29/2013 05:31 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: After more thinking, I still

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 08/30/2013 08:41 PM, Paolo Bonzini wrote: Page tables in a read-only memory slot will currently cause a triple fault because the page walker uses gfn_to_hva and it fails on such a slot. OVMF uses such a page table; however, real hardware seems to be fine with that as long as the

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
hardware seems to be fine with that as long as the accessed/dirty bits are set. Save whether the slot is readonly, and later check it when updating the accessed and dirty bits. The fix looks OK to me, but some comment below. Cc: sta...@vger.kernel.org Cc: g...@redhat.com Cc: Xiao

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:25 PM, Gleb Natapov wrote: On Mon, Sep 02, 2013 at 05:20:15PM +0800, Xiao Guangrong wrote: On 08/30/2013 08:41 PM, Paolo Bonzini wrote: Page tables in a read-only memory slot will currently cause a triple fault because the page walker uses gfn_to_hva and it fails

Re: [PATCH] KVM: mmu: allow page tables to be in read-only slots

2013-09-02 Thread Xiao Guangrong
On 09/02/2013 05:49 PM, Gleb Natapov wrote: On Mon, Sep 02, 2013 at 05:42:25PM +0800, Xiao Guangrong wrote: On 09/01/2013 05:17 PM, Gleb Natapov wrote: On Fri, Aug 30, 2013 at 02:41:37PM +0200, Paolo Bonzini wrote: Page tables in a read-only memory slot will currently cause a triple fault

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 07:33 PM, Xiao Guangrong wrote: > On 08/29/2013 05:31 PM, Gleb Natapov wrote: >> On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: >>> After more thinking, I still think rcu_assign_pointer() is unneeded when a >>> entry >>> is r

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:31 PM, Gleb Natapov wrote: > On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: >> After more thinking, I still think rcu_assign_pointer() is unneeded when a >> entry >> is removed. The remove-API does not care the order bet

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:51 PM, Gleb Natapov wrote: > On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote: >>> As Documentation/RCU/whatisRCU.txt says: >>> >>> As with rcu_assign_pointer(), an important function of >>> rcu_dere

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:08 PM, Gleb Natapov wrote: > On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: >>>>> BTW I do not see >>>>> rcu_assign_pointer()/rcu_dereference() in your patches which hints on >>>> >>>> IIUC, We can not

Re: [PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:10 PM, Gleb Natapov wrote: > On Tue, Jul 30, 2013 at 09:02:08PM +0800, Xiao Guangrong wrote: >> It is easy if the handler is in the vcpu context, in that case we can use >> walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() that >> disa

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/28/2013 09:36 PM, Gleb Natapov wrote: > On Wed, Aug 28, 2013 at 08:15:36PM +0800, Xiao Guangrong wrote: >> On 08/28/2013 06:49 PM, Gleb Natapov wrote: >>> On Wed, Aug 28, 2013 at 06:13:43PM +0800, Xiao Guangrong wrote: >>>> On 08/28/2013 05:46 PM, Gleb Natap

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/28/2013 09:36 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 08:15:36PM +0800, Xiao Guangrong wrote: On 08/28/2013 06:49 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 06:13:43PM +0800, Xiao Guangrong wrote: On 08/28/2013 05:46 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 05:33

Re: [PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:10 PM, Gleb Natapov wrote: On Tue, Jul 30, 2013 at 09:02:08PM +0800, Xiao Guangrong wrote: It is easy if the handler is in the vcpu context, in that case we can use walk_shadow_page_lockless_begin() and walk_shadow_page_lockless_end() that disable interrupt to stop shadow page

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:08 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: BTW I do not see rcu_assign_pointer()/rcu_dereference() in your patches which hints on IIUC, We can not directly use rcu_assign_pointer(), that is something like: p = v to assign

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:51 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 05:31:42PM +0800, Xiao Guangrong wrote: As Documentation/RCU/whatisRCU.txt says: As with rcu_assign_pointer(), an important function of rcu_dereference() is to document which pointers are protected

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 05:31 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: After more thinking, I still think rcu_assign_pointer() is unneeded when a entry is removed. The remove-API does not care the order between unlink the entry and the changes to its

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-29 Thread Xiao Guangrong
On 08/29/2013 07:33 PM, Xiao Guangrong wrote: On 08/29/2013 05:31 PM, Gleb Natapov wrote: On Thu, Aug 29, 2013 at 02:50:51PM +0800, Xiao Guangrong wrote: After more thinking, I still think rcu_assign_pointer() is unneeded when a entry is removed. The remove-API does not care the order

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 06:49 PM, Gleb Natapov wrote: > On Wed, Aug 28, 2013 at 06:13:43PM +0800, Xiao Guangrong wrote: >> On 08/28/2013 05:46 PM, Gleb Natapov wrote: >>> On Wed, Aug 28, 2013 at 05:33:49PM +0800, Xiao Guangrong wrote: >>>>> Or what

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 05:46 PM, Gleb Natapov wrote: > On Wed, Aug 28, 2013 at 05:33:49PM +0800, Xiao Guangrong wrote: >>> Or what if desc is moved to another rmap, but then it >>> is moved back to initial rmap (but another place in the desc list) so >>> the check he

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 05:20 PM, Gleb Natapov wrote: > On Tue, Jul 30, 2013 at 09:02:07PM +0800, Xiao Guangrong wrote: >> The basic idea is from nulls list which uses a nulls to indicate >> whether the desc is moved to different pte-list >> >> Thanks to SLAB_DESTROY_BY_RCU, the

Re: [PATCH 07/12] KVM: MMU: redesign the algorithm of pte_list

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:58 PM, Gleb Natapov wrote: > On Wed, Aug 28, 2013 at 04:37:32PM +0800, Xiao Guangrong wrote: >> On 08/28/2013 04:12 PM, Gleb Natapov wrote: >> >>>> + >>>> + rmap_printk("pte_list_add: %p %llx many->many\n", spte, *spte); >

Re: [PATCH 08/12] KVM: MMU: introduce nulls desc

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:40 PM, Gleb Natapov wrote: >> static unsigned long *__gfn_to_rmap(gfn_t gfn, int level, >> @@ -1200,7 +1221,7 @@ static u64 *rmap_get_first(unsigned long rmap, struct >> rmap_iterator *iter) >> */ >> static u64 *rmap_get_next(struct rmap_iterator *iter) >> { >> -if

Re: [PATCH 07/12] KVM: MMU: redesign the algorithm of pte_list

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:12 PM, Gleb Natapov wrote: >> + >> +rmap_printk("pte_list_add: %p %llx many->many\n", spte, *spte); >> +desc = (struct pte_list_desc *)(*pte_list & ~1ul); >> + >> +/* No empty position in the desc. */ >> +if (desc->sptes[PTE_LIST_EXT - 1]) { >> +struct

Re: [PATCH 06/12] KVM: MMU: flush tlb if the spte can be locklessly modified

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 03:23 PM, Gleb Natapov wrote: > On Tue, Jul 30, 2013 at 09:02:04PM +0800, Xiao Guangrong wrote: >> Relax the tlb flush condition since we will write-protect the spte out of mmu >> lock. Note lockless write-protection only marks the writable spte to readonly >

Re: [PATCH 06/12] KVM: MMU: flush tlb if the spte can be locklessly modified

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 03:23 PM, Gleb Natapov wrote: On Tue, Jul 30, 2013 at 09:02:04PM +0800, Xiao Guangrong wrote: Relax the tlb flush condition since we will write-protect the spte out of mmu lock. Note lockless write-protection only marks the writable spte to readonly and the spte can be writable

Re: [PATCH 07/12] KVM: MMU: redesign the algorithm of pte_list

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:12 PM, Gleb Natapov wrote: + +rmap_printk("pte_list_add: %p %llx many->many\n", spte, *spte); +desc = (struct pte_list_desc *)(*pte_list & ~1ul); + +/* No empty position in the desc. */ +if (desc->sptes[PTE_LIST_EXT - 1]) { +struct pte_list_desc

Re: [PATCH 08/12] KVM: MMU: introduce nulls desc

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:40 PM, Gleb Natapov wrote: static unsigned long *__gfn_to_rmap(gfn_t gfn, int level, @@ -1200,7 +1221,7 @@ static u64 *rmap_get_first(unsigned long rmap, struct rmap_iterator *iter) */ static u64 *rmap_get_next(struct rmap_iterator *iter) { -if (iter->desc) { +

Re: [PATCH 07/12] KVM: MMU: redesign the algorithm of pte_list

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 04:58 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 04:37:32PM +0800, Xiao Guangrong wrote: On 08/28/2013 04:12 PM, Gleb Natapov wrote: + + rmap_printk("pte_list_add: %p %llx many->many\n", spte, *spte); + desc = (struct pte_list_desc *)(*pte_list & ~1ul); + + /* No empty

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 05:20 PM, Gleb Natapov wrote: On Tue, Jul 30, 2013 at 09:02:07PM +0800, Xiao Guangrong wrote: The basic idea is from nulls list which uses a nulls to indicate whether the desc is moved to different pte-list Thanks to SLAB_DESTROY_BY_RCU, the desc can be quickly reused Signed

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 05:46 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 05:33:49PM +0800, Xiao Guangrong wrote: Or what if desc is moved to another rmap, but then it is moved back to initial rmap (but another place in the desc list) so the check here will not catch that we need to restart walking

Re: [PATCH 09/12] KVM: MMU: introduce pte-list lockless walker

2013-08-28 Thread Xiao Guangrong
On 08/28/2013 06:49 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 06:13:43PM +0800, Xiao Guangrong wrote: On 08/28/2013 05:46 PM, Gleb Natapov wrote: On Wed, Aug 28, 2013 at 05:33:49PM +0800, Xiao Guangrong wrote: Or what if desc is moved to another rmap, but then it is moved back

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-08 Thread Xiao Guangrong
On 08/09/2013 01:38 AM, Paolo Bonzini wrote: > Il 06/08/2013 15:16, Xiao Guangrong ha scritto: >> Hi Gleb, Paolo, Marcelo, Takuya, >> >> Any comments or further comments? :) > > It's not the easiest patch to review. I've looked at it (beyond the > small

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-08-08 Thread Xiao Guangrong
[ Post again after adjusting the format since the mail list rejected to deliver my previous one. ] On Aug 8, 2013, at 11:06 PM, Marcelo Tosatti wrote: > On Wed, Aug 07, 2013 at 12:06:49PM +0800, Xiao Guangrong wrote: >> On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: >>> On

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-08-08 Thread Xiao Guangrong
[ Post again after adjusting the format since the mail list rejected to deliver my previous one. ] On Aug 8, 2013, at 11:06 PM, Marcelo Tosatti mtosa...@redhat.com wrote: On Wed, Aug 07, 2013 at 12:06:49PM +0800, Xiao Guangrong wrote: On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: On Tue

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-08 Thread Xiao Guangrong
On 08/09/2013 01:38 AM, Paolo Bonzini wrote: Il 06/08/2013 15:16, Xiao Guangrong ha scritto: Hi Gleb, Paolo, Marcelo, Takuya, Any comments or further comments? :) It's not the easiest patch to review. I've looked at it (beyond the small comments I have already posted), but it will take

Re: [PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-08-07 Thread Xiao Guangrong
On 08/07/2013 09:09 PM, Takuya Yoshikawa wrote: > On Tue, 30 Jul 2013 21:02:08 +0800 > Xiao Guangrong wrote: > >> @@ -2342,6 +2358,13 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, >> */ >> kvm_flush_remote_tlbs(kvm); >> >>

Re: [PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-08-07 Thread Xiao Guangrong
On 08/07/2013 09:09 PM, Takuya Yoshikawa wrote: On Tue, 30 Jul 2013 21:02:08 +0800 Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote: @@ -2342,6 +2358,13 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, */ kvm_flush_remote_tlbs(kvm); +if (kvm

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-08-06 Thread Xiao Guangrong
On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: > On Tue, Jul 30, 2013 at 09:02:02PM +0800, Xiao Guangrong wrote: >> Make sure we can see the writable spte before the dirty bitmap is visible >> >> We do this because kvm_vm_ioctl_get_dirty_log() write-protects the spte based >

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-06 Thread Xiao Guangrong
Hi Gleb, Paolo, Marcelo, Takuya, Any comments or further comments? :) On 07/30/2013 09:01 PM, Xiao Guangrong wrote: > Background > == > Currently, when mark memslot dirty logged or get dirty page, we need to > write-protect large guest memory, it is the heavy work, especia

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-06 Thread Xiao Guangrong
Hi Gleb, Paolo, Marcelo, Takuya, Any comments or further comments? :) On 07/30/2013 09:01 PM, Xiao Guangrong wrote: Background == Currently, when mark memslot dirty logged or get dirty page, we need to write-protect large guest memory, it is the heavy work, especially, we need

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-08-06 Thread Xiao Guangrong
On 08/07/2013 09:48 AM, Marcelo Tosatti wrote: On Tue, Jul 30, 2013 at 09:02:02PM +0800, Xiao Guangrong wrote: Make sure we can see the writable spte before the dirty bitmap is visible We do this because kvm_vm_ioctl_get_dirty_log() write-protects the spte based on the dirty bitmap, so we should

Re: [PATCH 0/5] perf kvm live - latest round take 4

2013-08-05 Thread Xiao Guangrong
On 08/06/2013 09:41 AM, David Ahern wrote: > Hi Arnaldo: > > This round addresses all of Xiao's comments. It also includes a small > change in the live mode introduction to improve ordered samples > processing. For that a change in perf-session functions is needed. Reviewed-by:

Re: [PATCH 9/9] perf kvm stat report: Add option to analyze specific VM

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: > Add an option to analyze a specific VM within a data file. This > allows the collection of kvm events for all VMs and then analyze > data for each VM (or set of VMs) individually. Interesting. But how can we know which pid is the guest's pid after

Re: [PATCH 8/9] perf kvm: debug for missing vmexit/vmentry event

2013-08-05 Thread Xiao Guangrong
> Cc: Ingo Molnar > Cc: Frederic Weisbecker > Cc: Peter Zijlstra > Cc: Jiri Olsa > Cc: Namhyung Kim > Cc: Xiao Guangrong > Cc: Runzhen Wang > --- > tools/perf/builtin-kvm.c | 15 +-- > 1 file changed, 13 insertions(+), 2 deletions(-) > >

Re: [PATCH 7/9] perf kvm: option to print events that exceed a threshold

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: > This is useful to spot high latency blips. Yes, it is a good idea. > > Signed-off-by: David Ahern > Cc: Arnaldo Carvalho de Melo > Cc: Ingo Molnar > Cc: Frederic Weisbecker > Cc: Peter Zijlstra > Cc: Jiri Olsa > C

Re: [PATCH 6/9] perf kvm: add min and max stats to display

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: > Signed-off-by: David Ahern > Cc: Arnaldo Carvalho de Melo > Cc: Ingo Molnar > Cc: Frederic Weisbecker > Cc: Peter Zijlstra > Cc: Jiri Olsa > Cc: Namhyung Kim > Cc: Xiao Guangrong > Cc: Runzhen Wang > --- >

Re: [PATCH 2/9] perf stats: add max and min stats

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: > Need an initialization function to set min to -1 to > differentiate from an actual min of 0. Reviewed-by: Xiao Guangrong

Re: [PATCH 2/9] perf stats: add max and min stats

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: Need an initialization function to set min to -1 to differentiate from an actual min of 0. Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com

Re: [PATCH 6/9] perf kvm: add min and max stats to display

2013-08-05 Thread Xiao Guangrong
: Namhyung Kim namhy...@kernel.org Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com Cc: Runzhen Wang runz...@linux.vnet.ibm.com --- tools/perf/builtin-kvm.c | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/tools/perf/builtin-kvm.c b/tools/perf

Re: [PATCH 7/9] perf kvm: option to print events that exceed a threshold

2013-08-05 Thread Xiao Guangrong
: Peter Zijlstra pet...@infradead.org Cc: Jiri Olsa jo...@redhat.com Cc: Namhyung Kim namhy...@kernel.org Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com Cc: Runzhen Wang runz...@linux.vnet.ibm.com --- tools/perf/builtin-kvm.c | 25 + tools/perf/perf.h

Re: [PATCH 8/9] perf kvm: debug for missing vmexit/vmentry event

2013-08-05 Thread Xiao Guangrong
a...@ghostprotocols.net Cc: Ingo Molnar mi...@kernel.org Cc: Frederic Weisbecker fweis...@gmail.com Cc: Peter Zijlstra pet...@infradead.org Cc: Jiri Olsa jo...@redhat.com Cc: Namhyung Kim namhy...@kernel.org Cc: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com Cc: Runzhen Wang runz

Re: [PATCH 9/9] perf kvm stat report: Add option to analyze specific VM

2013-08-05 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: Add an option to analyze a specific VM within a data file. This allows the collection of kvm events for all VMs and then analyze data for each VM (or set of VMs) individually. Interesting. But how can we know which pid is the guest's pid after

Re: [PATCH 0/5] perf kvm live - latest round take 4

2013-08-05 Thread Xiao Guangrong
On 08/06/2013 09:41 AM, David Ahern wrote: Hi Arnaldo: This round addresses all of Xiao's comments. It also includes a small change in the live mode introduction to improve ordered samples processing. For that a change in perf-session functions is needed. Reviewed-by: Xiao Guangrong

Re: [PATCH 5/9] perf kvm: add live mode - v3

2013-08-04 Thread Xiao Guangrong
Hi David, Thanks for your nice job! I got some questions. On 08/03/2013 04:05 AM, David Ahern wrote: > static int kvm_events_hash_fn(u64 key) > { > return key & (EVENTS_CACHE_SIZE - 1); > @@ -472,7 +501,11 @@ static bool handle_end_event(struct perf_kvm_stat *kvm, >

Re: [PATCH 4/9] perf kvm: split out tracepoints from record args

2013-08-04 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: > Needed by kvm live command. Make record_args a local while we are > messing with the args. Reviewed-by: Xiao Guangrong

[PATCH] KVM: MMU: fix check the reserved bits on the gpte of L2

2013-08-04 Thread Xiao Guangrong
be triggered when nested npt is used and L1 guest and L2 guest use different mmu mode Reported-by: Jan Kiszka Signed-off-by: Xiao Guangrong --- arch/x86/kvm/paging_tmpl.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-04 Thread Xiao Guangrong
On Aug 3, 2013, at 1:09 PM, Takuya Yoshikawa wrote: > On Tue, 30 Jul 2013 21:01:58 +0800 > Xiao Guangrong wrote: > >> Background >> == >> Currently, when mark memslot dirty logged or get dirty page, we need to >> write-protect large guest memory, it

Re: [RFC PATCH 00/12] KVM: MMU: locklessly wirte-protect

2013-08-04 Thread Xiao Guangrong
On Aug 3, 2013, at 1:09 PM, Takuya Yoshikawa takuya.yoshik...@gmail.com wrote: On Tue, 30 Jul 2013 21:01:58 +0800 Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com wrote: Background == Currently, when mark memslot dirty logged or get dirty page, we need to write-protect large guest

[PATCH] KVM: MMU: fix check the reserved bits on the gpte of L2

2013-08-04 Thread Xiao Guangrong
be triggered when nested npt is used and L1 guest and L2 guest use different mmu mode Reported-by: Jan Kiszka jan.kis...@siemens.com Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/paging_tmpl.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch

Re: [PATCH 4/9] perf kvm: split out tracepoints from record args

2013-08-04 Thread Xiao Guangrong
On 08/03/2013 04:05 AM, David Ahern wrote: Needed by kvm live command. Make record_args a local while we are messing with the args. Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message

Re: [PATCH 5/9] perf kvm: add live mode - v3

2013-08-04 Thread Xiao Guangrong
Hi David, Thanks for your nice job! I got some questions. On 08/03/2013 04:05 AM, David Ahern wrote: static int kvm_events_hash_fn(u64 key) { return key & (EVENTS_CACHE_SIZE - 1); @@ -472,7 +501,11 @@ static bool handle_end_event(struct perf_kvm_stat *kvm,

Re: [PATCH 03/12] KVM: MMU: lazily drop large spte

2013-08-02 Thread Xiao Guangrong
On Aug 3, 2013, at 4:27 AM, Marcelo Tosatti wrote: > On Fri, Aug 02, 2013 at 11:42:19PM +0800, Xiao Guangrong wrote: >> >> On Aug 2, 2013, at 10:55 PM, Marcelo Tosatti wrote: >> >>> On Tue, Jul 30, 2013 at 09:02:01PM +0800, Xiao Guangrong wrote: >>

Re: [PATCH 03/12] KVM: MMU: lazily drop large spte

2013-08-02 Thread Xiao Guangrong
On Aug 2, 2013, at 10:55 PM, Marcelo Tosatti wrote: > On Tue, Jul 30, 2013 at 09:02:01PM +0800, Xiao Guangrong wrote: >> Currently, kvm zaps the large spte if write-protected is needed, the later >> read can fault on that spte. Actually, we can make the large spte readonly >

Re: [PATCH 03/12] KVM: MMU: lazily drop large spte

2013-08-02 Thread Xiao Guangrong
On Aug 2, 2013, at 10:55 PM, Marcelo Tosatti mtosa...@redhat.com wrote: On Tue, Jul 30, 2013 at 09:02:01PM +0800, Xiao Guangrong wrote: Currently, kvm zaps the large spte if write-protected is needed, the later read can fault on that spte. Actually, we can make the large spte readonly

Re: [PATCH 03/12] KVM: MMU: lazily drop large spte

2013-08-02 Thread Xiao Guangrong
On Aug 3, 2013, at 4:27 AM, Marcelo Tosatti mtosa...@redhat.com wrote: On Fri, Aug 02, 2013 at 11:42:19PM +0800, Xiao Guangrong wrote: On Aug 2, 2013, at 10:55 PM, Marcelo Tosatti mtosa...@redhat.com wrote: On Tue, Jul 30, 2013 at 09:02:01PM +0800, Xiao Guangrong wrote: Currently, kvm

Re: [PATCH 05/12] KVM: MMU: add spte into rmap before logging dirty page

2013-07-31 Thread Xiao Guangrong
On 07/30/2013 09:27 PM, Paolo Bonzini wrote: > Il 30/07/2013 15:02, Xiao Guangrong ha scritto: >> kvm_vm_ioctl_get_dirty_log() write-protects the spte based on the dirty >> bitmap, we should ensure the writable spte can be found in rmap before the >> dirty bitmap is visible.

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-07-31 Thread Xiao Guangrong
On 07/30/2013 09:26 PM, Paolo Bonzini wrote: > Il 30/07/2013 15:02, Xiao Guangrong ha scritto: >> Make sure we can see the writable spte before the dirty bitmap is visible >> >> We do this is for kvm_vm_ioctl_get_dirty_log() write-protects the spte based >> on the dir

Re: [PATCH 04/12] KVM: MMU: log dirty page after marking spte writable

2013-07-31 Thread Xiao Guangrong
On 07/30/2013 09:26 PM, Paolo Bonzini wrote: Il 30/07/2013 15:02, Xiao Guangrong ha scritto: Make sure we can see the writable spte before the dirty bitmap is visible We do this is for kvm_vm_ioctl_get_dirty_log() write-protects the spte based on the dirty bitmap, we should ensure the writable

Re: [PATCH 05/12] KVM: MMU: add spte into rmap before logging dirty page

2013-07-31 Thread Xiao Guangrong
On 07/30/2013 09:27 PM, Paolo Bonzini wrote: Il 30/07/2013 15:02, Xiao Guangrong ha scritto: kvm_vm_ioctl_get_dirty_log() write-protects the spte based on the dirty bitmap, we should ensure the writable spte can be found in rmap before the dirty bitmap is visible. Otherwise, we cleared

[PATCH 10/12] KVM: MMU: allow locklessly access shadow page table out of vcpu thread

2013-07-30 Thread Xiao Guangrong
nce and the scalability Signed-off-by: Xiao Guangrong --- arch/x86/include/asm/kvm_host.h | 6 +- arch/x86/kvm/mmu.c | 23 +++ arch/x86/kvm/mmu.h | 22 ++ 3 files changed, 50 insertions(+), 1 deletion(-) diff --git a/arch/
