Re: [kvm-devel] Starting a VM reboots my machine

2007-11-04 Thread Andrea Arcangeli
: mkdir -p $(DESTDIR)/$(INSTALLDIR) Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] index 9584d0f..95a3489 100644 --- a/drivers/kvm/svm.c +++ b/drivers/kvm/svm.c @@ -1459,11 +1459,6 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) local_irq_enable

Re: [kvm-devel] Starting a VM reboots my machine

2007-11-05 Thread Andrea Arcangeli
On Mon, Nov 05, 2007 at 04:25:00PM +0200, Avi Kivity wrote: This one's obviously correct, will apply... thanks! Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] index 9584d0f..95a3489 100644 --- a/drivers/kvm/svm.c +++ b/drivers/kvm/svm.c @@ -1459,11 +1459,6 @@ static void svm_vcpu_run

Re: [kvm-devel] Starting a VM reboots my machine

2007-11-05 Thread Andrea Arcangeli
the entire kvm.git and see if what I get is a different tree and if my previous kvm.git bitrotten. It does. Ok. We're definitely looking at different trees. here my current tip. commit dfe665260be338e3a7fe59172220ccadd2d1b7e7 Merge: 8c37564... c388ba8... Author: Andrea Arcangeli [EMAIL PROTECTED

Re: [kvm-devel] Starting a VM reboots my machine

2007-11-05 Thread Andrea Arcangeli
On Mon, Nov 05, 2007 at 05:28:36PM +0100, Andrea Arcangeli wrote: Now I'm re-downloading the entire kvm.git and see if what I get is a different tree and if my previous kvm.git bitrotten. git bitrotten. What concerns me is that git pull + git reset --hard can't bring my old kvm.git tree in sync

Re: [kvm-devel] Starting a VM reboots my machine

2007-11-06 Thread Andrea Arcangeli
On Tue, Nov 06, 2007 at 11:16:16AM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: On Mon, Nov 05, 2007 at 05:25:17PM +0200, Avi Kivity wrote: Well, I can't find anything like that in my tree. Maybe something's stale? Could be, this is why I don't like git that much, with hg

Re: [kvm-devel] [PATCH] Use cmpxchg for pte updates on walk_addr()

2007-12-07 Thread Andrea Arcangeli
On Fri, Dec 07, 2007 at 07:56:58AM -0500, Marcelo Tosatti wrote: I see that possibility, but why on earth would the guest be continuously updating a pagetable entry ? If I understood correctly this would be just to be more robust against malicious guests that could try to create unkillable

[kvm-devel] external module sched_in event

2007-12-20 Thread Andrea Arcangeli
crash hard in such a condition, svm is much simpler and it somewhat survives the lack of sched_in and only crashes the guest due to not monotone tsc): Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ac876ec..26372fa 100644 --- a/arch/x86/kvm

[kvm-devel] external module sched_in event

2007-12-21 Thread Andrea Arcangeli
tsc): Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ac876ec..26372fa 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -742,6 +742,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) void kvm_arch_vcpu_put(struct

Re: [kvm-devel] external module sched_in event

2007-12-21 Thread Andrea Arcangeli
On Fri, Dec 21, 2007 at 07:52:52PM +0200, Izik Eidus wrote: oh, it was sent to the list, don't trust (in case you did) the source forge site for the mails But this time I received it in my kvm-devel folder... previously I didn't, so it had to be blocked by some spamfilter in the other account

Re: [kvm-devel] external module sched_in event

2007-12-23 Thread Andrea Arcangeli
% chance of crashing in the one-more-dereference in a more meaningful way. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/kernel/hack-module.awk b/kernel/hack-module.awk index 7993aa2..5187c96 100644 --- a/kernel/hack-module.awk +++ b/kernel/hack-module.awk @@ -24,32 +24,6

Re: [kvm-devel] external module sched_in event

2007-12-24 Thread Andrea Arcangeli
On Sun, Dec 23, 2007 at 07:37:40PM +0200, Avi Kivity wrote: The sched_in notifier needs to enable interrupts (but it must disable preemption to avoid recursion). Ok this update fixes the smp_call_function deadlock. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/kernel/hack

Re: [kvm-devel] [PATCH] kvm guest balloon driver

2008-01-09 Thread Andrea Arcangeli
On Tue, Jan 08, 2008 at 09:42:13AM -0600, Anthony Liguori wrote: Instead of allocating a node for each page, you could use page->private page->lru is probably better for this so splice still works etc... (the struct page isn't visible to the guest VM so it's free to use)

[kvm-devel] mmu notifiers

2008-01-09 Thread Andrea Arcangeli
. Comments welcome... especially from Quadrics. Patch is mostly untested, tomorrow I'll try to plug KVM on top of the below and see if it survives swap. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h --- a/include/asm-generic

[kvm-devel] KVM swapping with mmu notifiers

2008-01-13 Thread Andrea Arcangeli
is heavily I/O bound anyway so a few more IPIs on an SMP host shouldn't be very measurable (on a UP host it makes no difference to flush multiple times in practice). Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 4086080..c527d7d 100644

[kvm-devel] [PATCH] mmu notifiers #v2

2008-01-13 Thread Andrea Arcangeli
kept passing down the 'mm' like below just in case others don't have 'mm' saved in the container for other reasons like we have in struct kvm, and they prefer to get it through registers/stack. In practice it won't make any measurable difference, it's mostly a style issue. Signed-off-by: Andrea

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-14 Thread Andrea Arcangeli
On Mon, Jan 14, 2008 at 11:45:39AM -0200, Marcelo Tosatti wrote: The alias and memslot maps are protected only by mmap_sem, so you yes, they are already protected and furthermore in write mode. should make kvm_set_memory_region/set_memory_alias grab the mmu spinlock in addition to mmap_sem in

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-14 Thread Andrea Arcangeli
On Mon, Jan 14, 2008 at 04:09:03PM +0200, Avi Kivity wrote: Marcelo Tosatti wrote: +static void unmap_spte(struct kvm *kvm, u64 *spte) +{ + struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT); + get_page(page); + rmap_remove(kvm, spte); +
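The mask-and-shift in the quoted unmap_spte() can be tried in isolation. The following is only a sketch under stated assumptions: 4 KiB pages and a 52-bit physical address field are assumed values, and spte_to_pfn() is a hypothetical helper name, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, physical address in bits 12..51. */
#define PAGE_SHIFT 12
#define PAGE_SIZE (1ULL << PAGE_SHIFT)
#define PT64_BASE_ADDR_MASK (((1ULL << 52) - 1) & ~(PAGE_SIZE - 1))

/*
 * Hypothetical helper: return the page frame number stored in a 64-bit
 * shadow pte. The mask drops the low flag bits and the high bits (such
 * as NX), the shift turns the physical address into a frame number.
 */
static uint64_t spte_to_pfn(uint64_t spte)
{
    return (spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
}
```

With these assumed constants, an spte whose address field is 0x12345000 yields pfn 0x12345 regardless of which flag bits are set.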

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-14 Thread Andrea Arcangeli
see more problems, thanks a lot for the review! Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 4086080..c527d7d 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -18,6 +18,7 @@ config KVM tristate Kernel-based

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-15 Thread Andrea Arcangeli
On Tue, Jan 15, 2008 at 05:57:03PM +0200, Avi Kivity wrote: It's the same hva for two different gpas. Same functionality as the alias, but with less data structures. Ok but if this is already supposed to work, then I need to at least change kvm_hva_to_rmapp not to stop when it finds the first

Re: [kvm-devel] [PATCH] mmu notifiers #v2

2008-01-17 Thread Andrea Arcangeli
On Wed, Jan 16, 2008 at 07:48:06PM +0200, Izik Eidus wrote: Rik van Riel wrote: On Sun, 13 Jan 2008 17:24:18 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote: In my basic initial patch I only track the tlb flushes which should be the minimum required to have a nice linux-VM controlled

Re: [kvm-devel] [PATCH] mmu notifiers #v2

2008-01-17 Thread Andrea Arcangeli
On Thu, Jan 17, 2008 at 08:21:16PM +0200, Izik Eidus wrote: ohh i like it, this is a clever solution, and i guess the cost of the vmexits won't be too high if it is not too aggressive Yes, and especially during swapping, the system isn't usually CPU bound. The idea is to pay with

[kvm-devel] [PATCH] kvm memslot read-locking with mmu_lock

2008-01-21 Thread Andrea Arcangeli
This adds locking to the memslots so they can be looked up with only the mmu_lock. Entries with memslot->userspace_addr have to be ignored because they're not fully inserted yet. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8a90403

Re: [kvm-devel] [PATCH] kvm-userland mm count skew

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 03:41:02PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: I still can't see how it could possibly make a difference for the mm_count if the kvm module is compiled inside the kernel or as an external module, the reference counting there hasn't changed since ages

Re: [kvm-devel] [PATCH] kvm swapping with mmu notifiers + age_page

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 04:08:16PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: This is the same as before but it uses the age_page callback to prevent the guest OS working set from being swapped out. It works well here so far. This depends on the memslot locking with mmu lock patch

Re: [kvm-devel] [PATCH] mmu notifiers #v3

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 04:12:34PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -44,8 +44,10

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 03:37:59PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: On Sun, Jan 20, 2008 at 05:16:03PM +0200, Avi Kivity wrote: Yes, it's supposed to work (we can't prevent userspace from doing it). Hmm, I think we already prevent it, so I don't think I need

Re: [kvm-devel] [PATCH] kvm memslot read-locking with mmu_lock

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 04:38:49PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: This is arch independent code, I'm surprised mmu_lock is visible here? The mmu_lock is arch independent as far as I can tell. Pretty much like the mm->page_table_lock is also independent. All archs

Re: [kvm-devel] [PATCH] kvm memslot read-locking with mmu_lock

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 03:47:28PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: This adds locking to the memslots so they can be looked up with only the mmu_lock. Entries with memslot->userspace_addr have to be ignored because they're not fully inserted yet. What is the motivation

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 06:17:38PM +0200, Avi Kivity wrote: There can be more than one rmapp per hva. Real world example: memslot 1: gfn range 0xe00 - 0xe080 @ hva 0x1000 (8MB framebuffer) memslot 2: gfn range 0xa - 0xa8000 @ hva 0x1000 (32KB VGA window) If the guest
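The point that one hva can be reachable through several memslots means a lookup has to visit every slot instead of returning at the first hit. A minimal sketch of that idea follows; struct memslot here is a simplified, hypothetical layout, hva_to_gfns() is an invented name, and treating a zero userspace_addr as "not fully inserted yet" is an assumption, none of it taken from the KVM tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumed: 4 KiB pages */

/* Hypothetical, simplified memslot: a gfn range backed by host memory. */
struct memslot {
    uint64_t base_gfn;       /* first guest frame number of the slot */
    uint64_t npages;         /* slot length in pages */
    uint64_t userspace_addr; /* hva of the first page, 0 if not set up */
};

/*
 * Collect every gfn that maps the page containing hva. Writes at most
 * max entries to out and returns how many were written. The loop keeps
 * going after a match because two slots may share the same hva range.
 */
static size_t hva_to_gfns(const struct memslot *slots, size_t nslots,
                          uint64_t hva, uint64_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < nslots && found < max; i++) {
        const struct memslot *s = &slots[i];
        uint64_t end = s->userspace_addr + (s->npages << PAGE_SHIFT);

        if (!s->userspace_addr)
            continue; /* assumed meaning: slot not fully inserted */
        if (hva >= s->userspace_addr && hva < end)
            out[found++] = s->base_gfn +
                           ((hva - s->userspace_addr) >> PAGE_SHIFT);
    }
    return found;
}
```

With an 8MB slot and a 32KB slot both starting at the same hva, an address in the first 32KB resolves to two gfns, one per slot, and an address past the small window resolves to one.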

Re: [kvm-devel] [PATCH] kvm swapping with mmu notifiers + age_page

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 04:53:37PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: On Tue, Jan 22, 2008 at 04:08:16PM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: This is the same as before but it uses the age_page callback to prevent the guest OS working set from being swapped out

Re: [kvm-devel] KVM swapping with mmu notifiers

2008-01-22 Thread Andrea Arcangeli
This last update will work against mmu-notifiers #v4, this will make the accessed bitflag in the spte visible to the linux VM so it will provide an accurate working set detection w/o requiring vmexits. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/Kconfig b/arch/x86

Re: [kvm-devel] [PATCH] mmu notifiers #v4

2008-01-22 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -46,6 +46,7 @@ __young = ptep_test_and_clear_young(__vma, __address, __ptep

Re: [kvm-devel] [PATCH] mmu notifiers #v3

2008-01-22 Thread Andrea Arcangeli
On Tue, Jan 22, 2008 at 08:28:47PM +0100, Peter Zijlstra wrote: I think we can get rid of this rwlock as I think this will seriously hurt larger machines. Yep, I initially considered it, nevertheless given you solved part of the complication I can add it now ;). The only technical reason for

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-22 Thread Andrea Arcangeli
Hi Christoph, Just a few early comments. First it makes me optimistic this can be merged sooner than later to see a second brand new implementation of this ;). On Tue, Jan 22, 2008 at 12:34:46PM -0800, Christoph Lameter wrote: On Tue, 22 Jan 2008, Andrea Arcangeli wrote: This last update

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-23 Thread Andrea Arcangeli
Hi Christoph, On Tue, Jan 22, 2008 at 02:53:12PM -0800, Christoph Lameter wrote: On Tue, 22 Jan 2008, Andrea Arcangeli wrote: First it makes me optimistic this can be merged sooner than later to see a second brand new implementation of this ;). Brand new? Well this is borrowing as much

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-23 Thread Andrea Arcangeli
On Wed, Jan 23, 2008 at 04:52:47AM -0600, Robin Holt wrote: But 100 callouts holding spinlocks will not work for our implementation and even if the callouts are made with spinlocks released, we would very strongly prefer a single callout which messages the range to the other side. But you

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-23 Thread Andrea Arcangeli
Hi Kraxel, On Wed, Jan 23, 2008 at 01:51:23PM +0100, Gerd Hoffmann wrote: That would render the notifies useless for Xen too. Xen needs to intercept the actual pte clear and instead of just zapping it use the hypercall to do the unmap and release the grant. I think it has yet to be

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-23 Thread Andrea Arcangeli
On Wed, Jan 23, 2008 at 06:32:30AM -0600, Robin Holt wrote: Christoph, Maybe you can clear one thing up. Was this proposal an addition to or replacement of Andrea's? I assumed an addition. I am going to try to restrict my responses to ones appropriate for that assumption. It wasn't

Re: [kvm-devel] [RFC][PATCH 0/5] Memory merging driver for Linux

2008-01-23 Thread Andrea Arcangeli
On Wed, Jan 23, 2008 at 12:05:10PM -0500, Rik van Riel wrote: On Mon, 21 Jan 2008 18:05:53 +0200 Izik Eidus [EMAIL PROTECTED] wrote: i added 2 new functions to the kernel one: page_wrprotect() make the page as read only by setting the ptes point to it as read only. second:

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-24 Thread Andrea Arcangeli
On Wed, Jan 23, 2008 at 12:18:45PM -0800, Christoph Lameter wrote: On Wed, 23 Jan 2008, Andrea Arcangeli wrote: [..] The linux instance with the secondary mmu must call back to the exporting machine in order to reinstantiate the page. PageExported is cleared in invalidate_page() so

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-24 Thread Andrea Arcangeli
On Thu, Jan 24, 2008 at 03:34:54PM +0100, Andrea Arcangeli wrote: set_page_dirty can be called inside ->invalidate_page if needed. But I'm not against artificially setting the dirty bit of the pteval returned by set_page_dirty, perhaps that's more efficient

Re: [kvm-devel] [PATCH] export notifier #1

2008-01-24 Thread Andrea Arcangeli
On Wed, Jan 23, 2008 at 12:27:47PM -0800, Christoph Lameter wrote: There are still dirty bit issues. Yes, but no big issues given ->invalidate_page is fully capable of running set_page_dirty if needed. The window that you must close with that bitflag is the request coming from the remote

Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1

2008-01-25 Thread Andrea Arcangeli
On Thu, Jan 24, 2008 at 09:56:06PM -0800, Christoph Lameter wrote: Andrea's mmu_notifier #4 - RFC V1 - Merge subsystem rmap based with Linux rmap based approach - Move Linux rmap based notifiers out of macro - Try to account for what locks are held while the notifiers are called. -

Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1

2008-01-28 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 06:10:39PM +0200, Izik Eidus wrote: i dont understand how is that better than notification on tlb flush? I certainly agree. The quoted call wasn't actually the one that could be moved in a single place in the .h file though. But the 4/4 patch could be reduced to a few

Re: [kvm-devel] [patch 4/6] MMU notifier: invalidate_page callbacks using Linux rmaps

2008-01-29 Thread Andrea Arcangeli
this will not be cleaned up eventually, the same way the tlb flushes have been cleaned up already. Nevertheless I back your implementation and I'm not even trying at changing it with the risk to slowdown merging. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/mm/rmap.c b/mm/rmap.c --- a/mm

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:41PM -0800, Christoph Lameter wrote: +struct mmu_notifier_head { + struct hlist_head head; +}; + struct mm_struct { struct vm_area_struct * mmap; /* list of VMAs */ struct rb_root mm_rb; @@ -219,6 +223,8 @@ struct mm_struct {

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:24:12PM +0200, Avi Kivity wrote: Carsten Otte wrote: #include <linux/kvm.h> @@ -118,6 +119,7 @@ struct kvm { struct kvm_io_bus pio_bus; struct kvm_vm_stat stat; struct kvm_arch arch; + struct mmu_notifier mmu_notifier; }; /* The guest did

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
Didn't realize s390 doesn't need those at all. Do you think mmu_notifier.h should also go in asm/mmu_notifier? We can always move them there later after merging with some compat code if needed. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm

[kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
new test-setup (previously I was developing and testing on my workstation which was by far not ideal). Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 4086080..c527d7d 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig

Re: [kvm-devel] [patch 4/6] MMU notifier: invalidate_page callbacks using Linux rmaps

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:44PM -0800, Christoph Lameter wrote: if (!migration && ((vma->vm_flags & VM_LOCKED) || - (ptep_clear_flush_young(vma, address, pte)))) { + (ptep_clear_flush_young(vma, address, pte) || +

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:17:51PM +0100, Carsten Otte wrote: Andrea Arcangeli wrote: Well I already moved that bit to x86, at least that had a good reason for being moved there, it's really invisible code to s390. The memslot are all but invisible to s390 instead, and so the locking rules

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 05:35:34PM +0100, Carsten Otte wrote: Avi Kivity wrote: Every arch except s390 needs it. An ugly #ifndef CONFIG_KVM_HARDWARE_TLB_SYNC is preferred to duplicating the code. BTW, from reading AMDs spec I don't expect NPT to need this vehicle By your conclusion I

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote: Index: linux-2.6/mm/fremap.c === --- linux-2.6.orig/mm/fremap.c 2008-01-25 19:31:05.0 -0800 +++ linux-2.6/mm/fremap.c 2008-01-25

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
. If the pte is unmapped and the page is mapped back in with a minor fault that's ok, as long as the physical page remains the same for that mm+address, until all sptes are gone. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote: I am not sure. AFAICT you wrote that code. Actually I didn't need to change a single line in do_wp_page because ptep_clear_flush was already doing everything transparently for me. This was the memory.c part of my last patch I

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns spin_unlock(mapping

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 07:19:18PM +0100, Joerg Roedel wrote: Since NPT uses the host page table format it is in theory possible to add the pagetable to the Linux MM rmap. In this case it would not be necessary to use MMU notifiers. But I think this would complicate the NPT support code

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 08:05:20PM +0200, Avi Kivity wrote: If a hypervisor mandates (host virtual) == (guest physical), it would work. x86 still misses the dual-tagged tlb, so mmu notifiers are needed regardless. With s390, they have an additional offset parameter, so (host Yep. NPT is

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: We invalidate the range *after* populating it? Isnt it okay to establish references while populate_range() runs? It's not ok because that function can very well overwrite

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:55:56PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: But now I think there may be an issue with a third thread that may show unsafe the removal of invalidate_page from ptep_clear_flush. A third thread writing to a page through

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:39:00PM -0800, Christoph Lameter wrote: If it does not run in write mode then concurrent faults are permissible while we remap pages. Weird. Maybe we better handle this like individual page operations? Put the invalidate_page back into zap_pte. But then there would

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote: get_user_pages, regular linux writes don't fault unless it's explicitly writeprotect, which is mandatory in a few archs, x86 not). actually get_user_pages doesn't fault either but it calls into set_page_dirty, however

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 12:26:39PM +0100, Carsten Otte wrote: Andrea Arcangeli wrote: By your conclusion I suppose you thought NPT maps guest physical to host virtual. If that were the case the cpu would have to walk three layers of pagetables (each layer is an arrow): guest virtual -> guest physical

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:29:10PM -0800, Christoph Lameter wrote: +void mmu_notifier_release(struct mm_struct *mm) +{ + struct mmu_notifier *mn; + struct hlist_node *n, *t; + + if (unlikely(!hlist_empty(mm->mmu_notifier.head))) { + rcu_read_lock(); +

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 09:53:06AM -0600, Jack Steiner wrote: That will also resolve the problem we discussed yesterday. I want to unregister my mmu_notifier when a GRU segment is unmapped. This would not necessarily be at task termination. My proof that there is something wrong in the smp

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Andrea Arcangeli
workstation but only on the test system. This seem not enough to get V2 stable yet (but I think it was enough to get my old codebase stable on the test system). I'll now try to rollback kvm source to my last stable status and to apply this fix and run it on V2/V3. Signed-off-by: Andrea Arcangeli

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: Then we have invalidate_range_start(mm) and invalidate_range_finish(mm, start, end) in addition to the invalidate rmap_notifier? --- include/linux/mmu_notifier.h |7 +-- 1 file changed, 5 insertions(+),

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, mm->mmu_notifier.head, hlist) { hlist_del_rcu(mn->hlist);

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 08:50:01AM +0200, Avi Kivity wrote: This is surprising. pagefault_disable() is really a preempt_disable(), and kvm_read_guest_atomic() should only be called from atomic contexts (with preemption already disabled), no? _spin_lock calls preempt_disable() and that's the

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. GRU is an external TLB, we have to allocate RAM instead but we do it through the regular userland paging mechanism.

Re: [kvm-devel] [patch 2/3] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 08:57:52PM -0800, Christoph Lameter wrote: @@ -211,7 +212,9 @@ asmlinkage long sys_remap_file_pages(uns spin_unlock(mapping->i_mmap_lock); } + mmu_notifier(invalidate_range_begin, mm, start, start + size, 0); err = populate_range(mm,

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 12:34:37PM +0200, Avi Kivity wrote: I see. Will merge that patch, thanks. Thanks. BTW, with this fix I finally got KVM swapping 100% stable on my test system. However I had to rollback everything: I'm using my last mmu notifier patch (not Christoph's ones), my mmu

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 01:58:42PM +0100, Andrea Arcangeli wrote: It might also be something stale in the buildsystem (perhaps a distcc or ccache glitch?), I also cleared 1G of ccache just to be sure in My build problem might have been related to the fact the kvm-userland/kernel/include

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 12:18:54PM -0800, Christoph Lameter wrote: pt lock cannot serialize with invalidate_range since it is split. A range requires locking for a series of ptes not only individual ones. The lock I take already protects up to 512 ptes yes. I call invalidate_pages only across

Re: [kvm-devel] KVM swapping with mmu notifiers #v5

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 12:21:34PM -0800, Christoph Lameter wrote: On Thu, 31 Jan 2008, Andrea Arcangeli wrote: I doubt Christoph's V4 was close to final yet, GRU wasn't covered at all yet, not even mremap was covered at all (nor XPMEM nor GRU) in V4. The GRU not covered? Why would you

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 03:09:55PM -0800, Christoph Lameter wrote: On Thu, 31 Jan 2008, Christoph Lameter wrote: pagefault against the main linux page fault, given we already have all needed serialization out of the PT lock. XPMEM is forced to do that pt lock cannot serialize with

Re: [kvm-devel] mmu_notifier: close hole in fork

2008-01-31 Thread Andrea Arcangeli
, if yes, then I hope my _dual_ approach is by far the best for at least GRU (and KVM of course for the very same reason), and of course it'll fit XPMEM too the moment you add invalidate_range_start/end too. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/mm/memory.c b/mm/memory.c

Re: [kvm-devel] mmu_notifier: Move mmu_notifier_release up to get rid of the invalidat_all() callback

2008-01-31 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 02:21:58PM -0800, Christoph Lameter wrote: Is this okay for KVM too? ->release isn't implemented at all in KVM, only the list_del generates complications. I think current code could be already safe through the mm_count pin, because KVM relies on the fact anybody pinning

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-01 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 05:37:21PM -0800, Christoph Lameter wrote: On Fri, 1 Feb 2008, Andrea Arcangeli wrote: I appreciate the review! I hope my entirely bug free and straightforward #v5 will strongly increase the probability of getting this in sooner than later. If something else

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-01 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 05:44:24PM -0800, Christoph Lameter wrote: The trouble is that the invalidates are much more expensive if you have to send these to remote partitions (XPmem). And it's really great if you can simply tear down everything. Certainly this is a significant improvement

Re: [kvm-devel] [patch 0/4] [RFC] EMMU Notifiers V5

2008-02-02 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 09:04:39PM -0800, Christoph Lameter wrote: - Has page tables to track pages whose refcount was elevated(?) but no reverse maps. Just a correction, rmaps exists or swap couldn't be sane, it's just that it's not built on the page_t because the guest memory is really

Re: [kvm-devel] [patch 2/4] mmu_notifier: Callbacks to invalidate address ranges

2008-02-02 Thread Andrea Arcangeli
On Fri, Feb 01, 2008 at 05:35:28PM -0600, Robin Holt wrote: No, we need a callout when we are becoming more restrictive, but not when becoming more permissive. I would have to guess that is the case for any of these callouts. It is for both GRU and XPMEM. I would expect the same is true for

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-02 Thread Andrea Arcangeli
On Sat, Feb 02, 2008 at 09:14:57PM -0600, Jack Steiner wrote: Also, most (but not all) applications that use the GRU do not usually do anything that requires frequent flushing (fortunately). The GRU is intended for HPC-like applications. These don't usually do frequent map/unmap operations or

[kvm-devel] preempt notifier emulation host crash fix

2008-02-03 Thread Andrea Arcangeli
it was KVM setting it manually with 0x701 (kvm really only needs 0x301 to get exact exception, dunno what 0x400 means, it's defined reserved, but it doesn't matter what it means as long as ptrace can't set it ;). So this fixes the host crash for me: Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-05 Thread Andrea Arcangeli
On Mon, Feb 04, 2008 at 10:11:24PM -0800, Christoph Lameter wrote: Zero problems only if you find having a single callout for every page acceptable. So the invalidate_range in your patch is only working invalidate_pages is only a further optimization that was straightforward in some places

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-05 Thread Andrea Arcangeli
On Tue, Feb 05, 2008 at 10:17:41AM -0800, Christoph Lameter wrote: The other approach will not have any remote ptes at that point. Why would there be a coherency issue? It never happens that two threads write to two different physical pages by working on the same process virtual address. This

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-05 Thread Andrea Arcangeli
On Tue, Feb 05, 2008 at 02:06:23PM -0800, Christoph Lameter wrote: On Tue, 5 Feb 2008, Andrea Arcangeli wrote: On Tue, Feb 05, 2008 at 10:17:41AM -0800, Christoph Lameter wrote: The other approach will not have any remote ptes at that point. Why would there be a coherency issue

Re: [kvm-devel] Broken external module build on 2.6.23

2008-02-05 Thread Andrea Arcangeli
? This will solve it: Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1267,7 +1279,11 @@ static int kvm_resume(struct sys_device *dev) } static struct sysdev_class kvm_sysdev_class = { +#ifdef set_kset_name + set_kset_name(kvm), +#else

Re: [kvm-devel] [PATCH] mmu notifiers #v5

2008-02-05 Thread Andrea Arcangeli
On Tue, Feb 05, 2008 at 03:10:52PM -0800, Christoph Lameter wrote: On Tue, 5 Feb 2008, Andrea Arcangeli wrote: You can avoid the page-pin and the pt lock completely by zapping the mappings at _start and then holding off new references until _end. holding off new references until

Re: [kvm-devel] [ofa-general] Re: [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrea Arcangeli
On Fri, Feb 08, 2008 at 04:36:16PM -0800, Christoph Lameter wrote: On Fri, 8 Feb 2008, Roland Dreier wrote: That would of course work -- dumb adapters would just always fail, which might be inefficient. H.. that means we need something that actually pins pages for good so that the

Re: [kvm-devel] [ofa-general] Re: [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrea Arcangeli
On Fri, Feb 08, 2008 at 05:27:03PM -0800, Christoph Lameter wrote: Pages will still be on the LRU and cycle through rmap again and again. If page migration is used on those pages then the code may make repeated attempts to migrate the page thinking that the page count must at some point

Re: [kvm-devel] Broken external module build on 2.6.23

2008-02-11 Thread Andrea Arcangeli
/include headers with #error x to be sure... along with running gcc -E and checking the cpp work). Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff -urN 1/kernel/include/asm-x86/cmpxchg.h 2/kernel/include/asm-x86/cmpxchg.h --- 1/kernel/include/asm-x86/cmpxchg.h 2008-02-11 12:00:19.0

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-02-11 Thread Andrea Arcangeli
On Mon, Feb 11, 2008 at 10:20:37AM +0200, Avi Kivity wrote: Andrea Arcangeli wrote: On Thu, Jan 31, 2008 at 01:58:42PM +0100, Andrea Arcangeli wrote: It might also be something stale in the buildsystem (perhaps a distcc or ccache glitch?), I also cleared 1G of ccache just to be sure

Re: [kvm-devel] Broken external module build on 2.6.23

2008-02-11 Thread Andrea Arcangeli
On Mon, Feb 11, 2008 at 12:19:44PM +0100, Andrea Arcangeli wrote: + LINUXINCLUDE=-I`pwd`/include -Iinclude -Iinclude-compat \ woops, here the last version: Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff -urN 1/kernel/include/asm-x86/cmpxchg.h 2/kernel/include/asm-x86

[kvm-devel] SIMPLE_ATTRIBUTE_GETTER compile fix

2008-02-12 Thread Andrea Arcangeli
Hello, this GETTER stuff looks a bit tricky... (we've just to cope with it). Anyway this allows building again and perhaps there's something good in the patch (not sure about the = 2.6.25 side that I didn't touch) Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED] diff --git a/kernel/external

[kvm-devel] preempt emulation miscompile fix

2008-02-12 Thread Andrea Arcangeli
breakpoint too). This will fix it by importing the task struct before we define CONFIG_PREEMPT_NOTIFIERS. preempt.h is most certainly unnecessary but it's a good rule of thumb to include anything that uses CONFIG_PREEMPT_NOTIFIERS before defining it. Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED

Re: [kvm-devel] SIMPLE_ATTRIBUTE_GETTER compile fix

2008-02-12 Thread Andrea Arcangeli
On Tue, Feb 12, 2008 at 02:26:12PM +0200, Avi Kivity wrote: Perhaps you have an old kvm.git? ouch good point! Ignore it then, I've been deferring any change to kvm.git updates until I tracked down my kvm-userland troubles, I'll try again after a kvm.git pull and I'll let you know if there's any

Re: [kvm-devel] [ofa-general] Re: Demand paging for memory regions

2008-02-13 Thread Andrea Arcangeli
Hi Kanoj, On Wed, Feb 13, 2008 at 03:43:17PM -0800, Kanoj Sarcar wrote: Oh ok, yes, I did see the discussion on this; sorry I missed it. I do see what notifiers bring to the table now (without endorsing it :-)). I'm not sure livelocks are really the big issue here. I'm running N 1G VM on a
