Subject: [patch] KVM: simplify mmu_alloc_roots()
From: Ingo Molnar <[EMAIL PROTECTED]>
small optimization/cleanup:
page == page_header(page->page_hpa)
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
drivers/kvm/mmu.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
Index
Ingo Molnar wrote:
> * Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>
>> ok. How about the patch below then? This only addresses the OOM
>> scenario, not the !memslot case.
>>
>
> the !memslot case is covered by the patch below. Injecting a #GPF is the
> easiest one to do here, although we co
Ingo Molnar wrote:
> another small detail is that currently KVM_SET_MEMORY_REGION appears to
> be an add-only interface - it is not possible to 'unregister' RAM from a
> VM.
>
Well, the _interface_ supports removing, the implementation does not :)
Everything was written in mind to allow memo
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> The guest needs to cooperate, but it can do so using the native memory
> hotplug mechanisms (whatever they are). [...]
as far as a Linux guest goes, there's no such thing at the moment, at least
in the mainline kernel. Most of the difficulties with RAM-un
Ingo Molnar wrote:
> Subject: [patch] KVM: simplify mmu_alloc_roots()
> From: Ingo Molnar <[EMAIL PROTECTED]>
>
> small optimization/cleanup:
>
> page == page_header(page->page_hpa)
>
>
Applied, thanks.
--
error compiling committee.c: too many arguments to function
-
Parag Warudkar wrote:
> Avi Kivity <[EMAIL PROTECTED]> writes:
>
>
>
>> 32-bit kvm userspace can run a 64-bit guest, if you're using a 64-bit os
>> kernel, hence the 64-bit registers. Just ignore the 64-bit parts.
>>
>>
>
> Didn't understand. Allow me to clarify a bit -
>
> I am running a
When compiling KVM I get the following error:-
In file included from /home/peter/applications-home/kvm-9/qemu/usb-linux.c:29:
/usr/include/linux/usbdevice_fs.h:49: error: variable or field `__user'
declared void
/usr/include/linux/usbdevice_fs.h:49: error: syntax error before '*' token
My enviro
Hardware virtualization implementations allow the guests to freely change
some of the bits in cr0 and cr4, but trap when changing the other bits. This
is useful to avoid excessive exits due to changing, for example, the ts flag.
It also means the kvm's copy of cr0 and cr4 may be stale with respec
The current kvm shadow page table implementation does not cache shadow
page tables (except for global translations, used for kernel addresses)
across context switches. This means that after a context switch, every
memory access will trap into the host. After a while, the shadow page
tables wi
Keep in each host page frame's page->private a pointer to the shadow pte which
maps it. If there are multiple shadow ptes mapping the page, set bit 0 of
page->private, and use the rest as a pointer to a linked list of all such
mappings.
Reverse mappings are needed because when we cache shadow
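The tagged-pointer encoding described in this message can be sketched as follows. This is a minimal userspace model, not the kvm code: rmap_word stands in for page->private, and struct rmap_node and every helper name are hypothetical, introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the page->private word of a host page frame. */
typedef uintptr_t rmap_word;

/* Hypothetical list node: one entry per shadow pte mapping the page. */
struct rmap_node {
    uint64_t *spte;
    struct rmap_node *next;
};

/* Bit 0 set means the word points to a list of mappings. */
static int rmap_is_list(rmap_word w)
{
    return (w & 1) != 0;
}

/* A single mapping is stored as the bare shadow pte pointer (bit 0 clear,
 * which relies on the pointer being at least 2-byte aligned). */
static rmap_word rmap_encode_single(uint64_t *spte)
{
    return (uintptr_t)spte;
}

/* Multiple mappings: store the list head with bit 0 set as a tag. */
static rmap_word rmap_encode_list(struct rmap_node *head)
{
    return (uintptr_t)head | 1;
}

/* Strip the tag bit to recover the list head. */
static struct rmap_node *rmap_list_head(rmap_word w)
{
    return (struct rmap_node *)(w & ~(uintptr_t)1);
}

/* Count how many shadow ptes currently map the page. */
static size_t rmap_count(rmap_word w)
{
    size_t n = 0;

    if (w == 0)
        return 0;
    if (!rmap_is_list(w))
        return 1;
    for (struct rmap_node *p = rmap_list_head(w); p != NULL; p = p->next)
        n++;
    return n;
}
```

The common case of one mapping per page costs no extra allocation; the list is only needed once a second mapping appears.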
Saving the table gfns removes the need to walk the guest and host page tables
in lockstep.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/paging_tmpl.h
===
--- linux-2.6.orig/drivers/kvm/paging_tmpl.h
+++
In pae mode, a load of cr3 loads the four third-level page table entries
in addition to cr3 itself.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/kvm_main.c
===
--- linux-2.6.orig/drivers/kvm/kvm_main.c
+
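As a rough illustration of the pae cr3 load described above: in pae mode, bits 31:5 of cr3 give the 32-byte-aligned base of a table holding four 8-byte entries. The sketch below models guest memory as a flat byte array; load_pdptrs and its parameters are hypothetical names, not the kvm function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAE_PDPT_ENTRIES 4

/*
 * Copy the four entries referenced by a pae-mode cr3 out of a flat model of
 * guest physical memory. Returns 0 on success, -1 if the table would lie
 * outside guest memory. Assumes a little-endian host, so the raw bytes can
 * be copied straight into uint64_t values.
 */
static int load_pdptrs(const uint8_t *guest_mem, size_t mem_size,
                       uint32_t cr3, uint64_t pdptrs[PAE_PDPT_ENTRIES])
{
    /* The low five bits of cr3 are flags or reserved, not address bits. */
    size_t base = cr3 & ~(uint32_t)0x1f;
    size_t len = PAE_PDPT_ENTRIES * sizeof(uint64_t);

    if (base > mem_size || mem_size - base < len)
        return -1;
    memcpy(pdptrs, guest_mem + base, len);
    return 0;
}
```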
It is never necessary to fetch a guest entry from an intermediate page table
level (except for large pages), so avoid some confusion by always descending
into the lowest possible level.
Rename init_walker() to walk_addr() as it is no longer restricted to
initialization.
Signed-off-by: Avi Kivity
Since we're not going to cache the pae-mode shadow root pages, allocate
a single pae shadow that will hold the four lower-level pages, which
will act as roots.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
This lets us not write protect a partial page, and is anyway what a
real processor does.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/paging_tmpl.h
===
--- linux-2.6.orig/drivers/kvm/paging_tmpl.h
+++ li
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/paging_tmpl.h
===
--- linux-2.6.orig/drivers/kvm/paging_tmpl.h
+++ linux-2.6/drivers/kvm/paging_tmpl.h
@@ -170,6 +170,11 @@ static u64 *FNAME(fetch)(struct kvm
This allows further manipulation on the shadow page table.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -292,12 +292,13 @
Define a hashtable for caching shadow page tables. Look up the cache on
context switch (cr3 change) or during page faults.
The key to the cache is a combination of
- the guest page table frame number
- the number of paging levels in the guest
* we can cache real mode, 32-bit mode, pae, and long
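A lookup keyed this way might look like the sketch below. It uses only the two key components named in the excerpt (the guest frame number and the number of paging levels); the excerpt is cut off, so any further components are not modelled. All type and function names, and the bucket count, are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_HASH_BUCKETS 64 /* hypothetical table size */

/* Key components named in the message; any others are left out. */
struct shadow_key {
    uint64_t gfn;    /* guest page table frame number */
    unsigned levels; /* number of paging levels in the guest */
};

/* Hypothetical cached shadow page, chained within a hash bucket. */
struct shadow_page {
    struct shadow_key key;
    struct shadow_page *hash_next;
};

struct shadow_cache {
    struct shadow_page *buckets[SHADOW_HASH_BUCKETS];
};

static unsigned shadow_hash(uint64_t gfn)
{
    return (unsigned)(gfn % SHADOW_HASH_BUCKETS);
}

/* Find a cached shadow page whose whole key matches, or NULL. */
static struct shadow_page *shadow_lookup(struct shadow_cache *c,
                                         struct shadow_key k)
{
    struct shadow_page *p;

    for (p = c->buckets[shadow_hash(k.gfn)]; p != NULL; p = p->hash_next)
        if (p->key.gfn == k.gfn && p->key.levels == k.levels)
            return p;
    return NULL;
}

/* Add a shadow page at the head of its bucket. */
static void shadow_insert(struct shadow_cache *c, struct shadow_page *p)
{
    unsigned b = shadow_hash(p->key.gfn);

    p->hash_next = c->buckets[b];
    c->buckets[b] = p;
}
```

Because the paging level count is part of the key, the same guest frame shadowed under two different paging modes yields two distinct cache entries.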
When we cache a guest page table into a shadow page table, we need to prevent
further access to that page by the guest, as that would render the cache
incoherent.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
=
This fixes a problem where set_pte_common() looked for shadowed pages based
on the page directory gfn (a huge page) instead of the actual gfn being
mapped.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
As the mmu write protects guest page tables, we emulate those writes. Since
they are not mmio, there is no need to go to userspace to perform them.
So, perform the writes in the kernel if possible, and notify the mmu about
them so it can take the appropriate action.
Signed-off-by: Avi Kivity <[EMA
Iterate over all shadow pages which correspond to the given guest page table
and remove the mappings.
A subsequent page fault will reestablish the new mapping.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
=
A page table may have been recycled into a regular page, and so any
instruction can be executed on it. Unprotect the page and let the cpu
do its thing.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
When removing a page table, we must maintain the parent_pte field of all child
shadow page tables.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/d
... and so must not free it unconditionally.
Move the freeing to kvm_mmu_zap_page().
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm
When beginning to process a page fault, make sure we have enough shadow pages
available to service the fault. If not, free some pages.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.
Since we write protect shadowed guest page tables, there is no need to
trap page invalidations (the guest will always change the mapping before
issuing the invlpg instruction).
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
Unused.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -609,35 +609,6 @@ hpa_t gva_to_hpa(struct kvm_vcpu *vcpu,
r
A misaligned access affects two shadow ptes instead of just one.
Since a misaligned access is unlikely to occur on a real page table, just
zap the page out of existence, avoiding further trouble.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -303,16 +303,6 @@ static void rmap_write_protect(struct kv
}
}
-st
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -318,6 +318,7 @@ static void kvm_mmu_free_page(struct kvm
{
struct k
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -305,12 +305,16 @@ static void rmap_write_protect(struct kv
static int is_
In fork() (or when we protect a page that is no longer a page table), we can
experience floods of writes to a page, which have to be emulated. This is
expensive.
So, if we detect such a flood, zap the page so subsequent writes can proceed
natively.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
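One way such a flood could be detected is sketched below: count consecutive emulated writes to the same guest frame and report a flood once a threshold is reached. The threshold, the state layout, and the function name are all assumptions; the message does not say how the patch itself decides.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOOD_THRESHOLD 3 /* hypothetical; the patch's value is not shown */

/* Hypothetical per-vcpu bookkeeping of the last emulated page table write. */
struct flood_state {
    uint64_t last_gfn;
    unsigned count;
};

/*
 * Record one emulated write to guest frame gfn. Returns true once the same
 * frame has been written FLOOD_THRESHOLD times in a row, at which point the
 * caller would zap the shadow page so later writes run natively.
 */
static bool note_emulated_write(struct flood_state *s, uint64_t gfn)
{
    if (s->count > 0 && s->last_gfn == gfn) {
        s->count++;
    } else {
        s->last_gfn = gfn;
        s->count = 1;
    }
    return s->count >= FLOOD_THRESHOLD;
}
```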
We always need cr3 to point to something valid, so if we detect that we're
freeing a root page, simply push it back to the top of the active list.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
-
cmpxchg8b uses edx:eax as the compare operand, not edi:eax.
cmpxchg8b is used by 32-bit pae guests to set page table entries atomically,
and this is emulated when it touches shadowed guest page tables.
Also, implement it for 32-bit hosts.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/
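For reference, the architectural behaviour of cmpxchg8b that the fix above relies on can be modelled in plain C. The compare operand is edx:eax and the replacement value is ecx:ebx. This is a non-atomic model of the semantics only; struct regs and emulate_cmpxchg8b are hypothetical names, not the emulator's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register file holding only what cmpxchg8b uses. */
struct regs {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Model of cmpxchg8b on a 64-bit memory operand. Returns the resulting
 * zero flag. Not atomic: it illustrates the operand roles only.
 */
static bool emulate_cmpxchg8b(struct regs *r, uint64_t *mem)
{
    uint64_t expected = ((uint64_t)r->edx << 32) | r->eax; /* edx:eax */
    uint64_t desired = ((uint64_t)r->ecx << 32) | r->ebx;  /* ecx:ebx */

    if (*mem == expected) {
        *mem = desired; /* match: store ecx:ebx, zf = 1 */
        return true;
    }
    /* Mismatch: load the current value into edx:eax, zf = 0. */
    r->eax = (uint32_t)*mem;
    r->edx = (uint32_t)(*mem >> 32);
    return false;
}
```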
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/paging_tmpl.h
===
--- linux-2.6.orig/drivers/kvm/paging_tmpl.h
+++ linux-2.6/drivers/kvm/paging_tmpl.h
@@ -271,6 +271,7 @@ static int FNAME(fix_write_pf)(struc
Because mmu pages have attached rmap and parent pte chain structures, we need
to zap them before freeing so the attached structures are freed.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- l
The mmu sometimes needs memory for reverse mapping and parent pte chains.
However, we can't allocate from within the mmu because of the atomic context.
So, move the allocations to a central place that can be executed before
the main mmu machinery, where we can bail out on failure before any damage
is done.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -166,19 +166,20 @@ static int is_rmap_pte(u64 pte)
== (PT_WRI
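The preallocation idea in this message can be sketched as a small object cache that is topped up where allocation failure is still harmless, then drained without allocating. This is a userspace model using calloc; the structure, its capacity, and the function names are hypothetical and do not reproduce the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_CAPACITY 4 /* hypothetical upper bound on cached objects */

/* Hypothetical cache of preallocated objects. */
struct prealloc_cache {
    size_t nobjs;
    void *objects[CACHE_CAPACITY];
};

/*
 * Fill the cache to at least min objects. This may allocate, so it belongs
 * before the atomic section. Returns 0 on success, -1 on allocation failure,
 * where the caller can still bail out cleanly.
 */
static int cache_topup(struct prealloc_cache *c, size_t min, size_t objsize)
{
    assert(min <= CACHE_CAPACITY);
    while (c->nobjs < min) {
        void *obj = calloc(1, objsize);

        if (obj == NULL)
            return -1;
        c->objects[c->nobjs++] = obj;
    }
    return 0;
}

/* Take a preallocated object. Never allocates, so it is safe where
 * allocation is not allowed; the caller must have topped up first. */
static void *cache_take(struct prealloc_cache *c)
{
    assert(c->nobjs > 0);
    return c->objects[--c->nobjs];
}

/* Free whatever is still cached. */
static void cache_release_all(struct prealloc_cache *c)
{
    while (c->nobjs > 0)
        free(c->objects[--c->nobjs]);
}
```

The caller tops up once at the entry point, checks the return value, and only then enters the code that consumes objects.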
If we reduce permissions on a pte, we must flush the cached copy of the pte
from the guest's tlb.
This is implemented at the moment by flushing the entire guest tlb, and can
be improved by flushing just the relevant virtual address, if it is known.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
I
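A check for "permissions were reduced" might look like the sketch below. It uses the standard x86 pte bits (present, writable, user) and ignores other reasons a flush can be needed, such as a changed frame address. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_USER     (1ULL << 2)

/*
 * Return true if replacing old_pte with new_pte takes away a permission
 * that a cached translation could still grant, in which case the guest tlb
 * must be flushed. Only the three permission bits above are considered.
 */
static bool pte_reduces_permissions(uint64_t old_pte, uint64_t new_pte)
{
    const uint64_t perm = PTE_PRESENT | PTE_WRITABLE | PTE_USER;

    if (!(old_pte & PTE_PRESENT))
        return false; /* nothing could have been cached */
    if (!(new_pte & PTE_PRESENT))
        return true;  /* mapping removed entirely */
    /* Any permission bit set before but clear now is a reduction. */
    return ((old_pte & perm) & ~(new_pte & perm)) != 0;
}
```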
mmu_destroy flushes the guest tlb (indirectly), which needs a valid vcpu.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/kvm_main.c
===
--- linux-2.6.orig/drivers/kvm/kvm_main.c
+++ linux-2.6/drivers/kvm/k
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/mmu.c
@@ -26,8 +26,31 @@
#include "vmx.h"
#include "kvm.h"
-#define pgprintk(x...)
> Message: 9
> Date: Thu, 4 Jan 2007 17:00:32 +0200
> From: Peter Smith <[EMAIL PROTECTED]>
> Subject: [kvm-devel] Compile error with openSuse 10.2
> To: kvm-devel@lists.sourceforge.net
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset="us-ascii"
>
> When compiling KVM I get th
On Thu, 04 Jan 2007 17:48:45 +0200
Avi Kivity <[EMAIL PROTECTED]> wrote:
> The current kvm shadow page table implementation does not cache shadow
> page tables (except for global translations, used for kernel addresses)
> across context switches. This means that after a context switch, every
>
Andrew Morton wrote:
> Is this intended for 2.6.20, or would you prefer that we release what we
> have now and hold this off for 2.6.21?
>
Even though these patches are potentially destabilizing, I'd like them
(and a few other patches) to go into 2.6.20:
- kvm did not exist in 2.6.19, hence w
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> Andrew Morton wrote:
> >Is this intended for 2.6.20, or would you prefer that we release what we
> >have now and hold this off for 2.6.21?
> >
>
> Even though these patches are potentially destabilizing, I'd like them
> (and a few other patches) to go
This patchset is mostly fallout from the mmu stuff that I've neglected
to integrate with the main patchset sent yesterday. It includes a
fashionable missing dirty bit fix, and other fixes and cleanups.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.
---
This will allow us to see the root cause when a vmwrite error happens.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/vmx.c
===
--- linux-2.6.orig/drivers/kvm/vmx.c
+++ linux-2.6/drivers/kvm/vmx.c
@@ -152,
Fixes oops on early close of /dev/kvm.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/kvm_main.c
===
--- linux-2.6.orig/drivers/kvm/kvm_main.c
+++ linux-2.6/drivers/kvm/kvm_main.c
@@ -230,6 +230,7 @@ stati
From: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
===
--- linux-2.6.orig/drivers/kvm/mmu.c
+++ linux-2.6/drivers/kvm/
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/kvm_main.c
===
--- linux-2.6.orig/drivers/kvm/kvm_main.c
+++ linux-2.6/drivers/kvm/kvm_main.c
@@ -1922,6 +1922,7 @@ static long kvm_dev_ioctl(struct file *f
It overwrites the right cr3 set from mmu setup. Happens only with the test
harness.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/vmx.c
===
--- linux-2.6.orig/drivers/kvm/vmx.c
+++ linux-2.6/drivers/kvm
If we emulate a write, we fail to set the dirty bit on the guest pte, leading
the guest to believe the page is clean, and thus lose data. Bad.
Fix by setting the guest pte dirty bit under such conditions.
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/paging_tmpl.h
=
From: Ingo Molnar <[EMAIL PROTECTED]>
Prevent the guest's loading of a corrupt cr3 (pointing at no guest physical
page) from crashing the host.
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/kvm_main.c
===
From: Ingo Molnar <[EMAIL PROTECTED]>
Small optimization/cleanup:
page == page_header(page->page_hpa)
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/mmu.c
=
No need to test for rflags.if, as both the VT and SVM specs assure us that on
an exit caused by the interrupt window opening, 'if' is set.
Signed-off-by: Dor Laor <[EMAIL PROTECTED]>
Signed-off-by: Avi Kivity <[EMAIL PROTECTED]>
Index: linux-2.6/drivers/kvm/svm.c
==