On Wed, 2008-10-01 at 14:34 -0700, Anthony Liguori wrote:
Jeremy Fitzhardinge wrote:
Alok Kataria wrote:
I guess, but the bulk of the uses of this stuff are going to be
hypervisor-specific. You're hard-pressed to come up with any other
generic uses beyond tsc.
And arguably, storing
On Wed, 2008-10-01 at 17:39 -0700, H. Peter Anvin wrote:
third, which is subject to spread-spectrum modulation due to RFI
concerns. Therefore, relying on the *nominal* frequency of this clock
I'm not suggesting using the nominal value. I'm suggesting the
measurement be done in the one and
, APICWrite);
+ apic_ops = vmi_basic_apic_ops;
Yinghai, looking more closely at this, my understanding is that this might be
wrong for VMI. The correct patch should be as follows. Any comments?
So you mean the ICR-related members will still use the default native ones?
YH
Nacked-by: Zachary Amsden
On Tue, 2008-07-15 at 11:52 -0700, Yinghai Lu wrote:
Nacked-by: Zachary Amsden [EMAIL PROTECTED]
Because I did not cc you?
Because it's wrong.
What are you doing here and why aren't you cc-ing the maintainers?
Did you check the tip tree for the x86 changes?
No, this was brought to my
On Tue, 2008-07-15 at 11:51 -0700, Suresh Siddha wrote:
On Tue, Jul 15, 2008 at 11:38:50AM -0700, Zachary Amsden wrote:
Nacked-by: Zachary Amsden [EMAIL PROTECTED]
What are you doing here and why aren't you cc-ing the maintainers?
Sorry. I was about to bring you into the loop
On Tue, 2008-07-15 at 12:10 -0700, Yinghai Lu wrote:
On Tue, Jul 15, 2008 at 12:04 PM, Zachary Amsden [EMAIL PROTECTED] wrote:
On Tue, 2008-07-15 at 11:51 -0700, Suresh Siddha wrote:
On Tue, Jul 15, 2008 at 11:38:50AM -0700, Zachary Amsden wrote:
Nacked-by: Zachary Amsden [EMAIL
On Tue, 2008-07-15 at 12:22 -0700, Yinghai Lu wrote:
On Tue, Jul 15, 2008 at 11:51 AM, Suresh Siddha
[EMAIL PROTECTED] wrote:
On Tue, Jul 15, 2008 at 11:38:50AM -0700, Zachary Amsden wrote:
Nacked-by: Zachary Amsden [EMAIL PROTECTED]
What are you doing here and why aren't you cc-ing
On Sat, 2008-05-31 at 01:13 +0100, Jeremy Fitzhardinge wrote:
Zachary Amsden wrote:
We don't fault. We write directly to the primary page tables, and clear
the pte just like native. We just issue all mprotect updates in the
queue, and flush the queue when leaving lazy mmu mode. You can't
On Fri, 2008-05-23 at 21:32 +0100, Jeremy Fitzhardinge wrote:
Zachary Amsden wrote:
I'm a bit skeptical you can get such a semantic to work without a very
heavyweight method in the hypervisor. How do you guarantee no other CPU
is fizzling the A/D bits in the page table (it can be done
On Tue, 2008-05-06 at 15:46 -0400, Rik van Riel wrote:
On Tue, 06 May 2008 17:33:02 +0200
Martin Schwidefsky [EMAIL PROTECTED] wrote:
On Thu, 2008-03-13 at 16:57 +0000, Hugh Dickins wrote:
It's very encouraging to see Jeremy and Rusty weighing in. I hope
Zach will too, and I've added
On Fri, 2008-03-14 at 11:30 -0700, Jeremy Fitzhardinge wrote:
Zachary Amsden wrote:
How about a fake hypervisor, which is really just a random page evictor,
following the rules of CMM?
Probably simpler to just have variants of the page_set_* functions which
simulate the worst
On Thu, 2008-03-13 at 16:57 +0000, Hugh Dickins wrote:
Oh, that would be such a shame. Your guest page hinting patches remind
me of that childhood thrill, when once a year the circus comes to town ;)
I like the circus too.
But seriously, I'm ashamed to see my name in the Cc list: it would
On Thu, 2008-03-13 at 20:45 +0100, Andrea Arcangeli wrote:
On Thu, Mar 13, 2008 at 10:45:07AM -0700, Zachary Amsden wrote:
What doesn't appear to be useful however, is support for this under
VMware. It can be done, even without the writable pte support (yes,
really). But due to us
On Fri, 2008-01-18 at 22:37 +0100, Ingo Molnar wrote:
* Ingo Molnar [EMAIL PROTECTED] wrote:
The first fix is not even specific for PARAVIRT, and it's actually
preventing the whole tree from booting.
on CONFIG_EFI, indeed :)
but in exchange you broke all of 32-bit with
On Fri, 2007-11-16 at 11:24 -0800, Jeremy Fitzhardinge wrote:
Do you use paravirt_alloc_pd_clone()? I seem to remember you
mentioning that it doesn't help you very much. I'm in the process of
unifying pgalloc*, and it seems to me that it would be a bit cleaner
without needing to worry
On Tue, 2007-11-13 at 08:18 -0500, Gregory Haskins wrote:
Since PCI was designed as a hardware solution it has all kinds of stuff
specifically geared towards hardware constraints. Those constraints
are different in a virtualized platform, so some things do not translate
well to an optimal
On Thu, 2007-11-01 at 10:41 -0700, Jeremy Fitzhardinge wrote:
Keir Fraser wrote:
volatile prevents the asm from being 'moved significantly', according to the
gcc manual. I take that to mean that reordering is not allowed.
I understood it as reordering was permitted, but no re-ordering
On Fri, 2007-09-28 at 11:10 -0700, Jeremy Fitzhardinge wrote:
This patch refactors the paravirt_ops structure into groups of
functionally related ops:
pv_info - random info, rather than function entrypoints
pv_init_ops - functions used at boot time (some for module_init too)
pv_misc_ops -
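A rough way to picture the grouping described above is one small structure of function pointers per group, so that a backend overrides only the groups it needs. The sketch below is a user-space illustration, not the kernel's definition: the group names pv_info, pv_time_ops and pv_cpu_ops come from the patch description, while every member, every helper function and the *_sketch type names are hypothetical placeholders.

```c
#include <assert.h>

/* Only the group names come from the patch description; all members and
 * helper functions below are hypothetical placeholders. */
struct pv_info_sketch {
    const char *name;                 /* descriptive data, not an entrypoint */
};

struct pv_time_ops_sketch {
    unsigned long (*get_khz)(void);   /* hypothetical time hook */
};

struct pv_cpu_ops_sketch {
    void (*halt)(void);               /* hypothetical CPU hook */
};

static int halted;                    /* records that the halt hook ran */

static unsigned long native_khz(void) { return 1000000UL; }
static void native_halt(void) { halted = 1; }
static unsigned long guest_khz(void) { return 2000000UL; }

/* Defaults model the native case. */
static struct pv_info_sketch pv_info = { "bare hardware" };
static struct pv_time_ops_sketch pv_time_ops = { native_khz };
static struct pv_cpu_ops_sketch pv_cpu_ops = { native_halt };

/* A hypothetical backend that only cares about time replaces that one
 * group (plus the info string) and leaves the CPU group untouched. */
static void guest_init(void)
{
    pv_info.name = "example guest";
    pv_time_ops.get_khz = guest_khz;
}
```

Calling guest_init() and then pv_time_ops.get_khz() would reach guest_khz(), while pv_cpu_ops.halt() would still reach the native default.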
On Fri, 2007-09-28 at 11:49 -0700, Jeremy Fitzhardinge wrote:
We shouldn't need to export pv_init_ops.
No. The only ones I export are:
EXPORT_SYMBOL_GPL(pv_time_ops);
EXPORT_SYMBOL_GPL(pv_cpu_ops);
EXPORT_SYMBOL_GPL(pv_mmu_ops);
EXPORT_SYMBOL_GPL(pv_apic_ops);
EXPORT_SYMBOL
On Sun, 2007-09-16 at 07:56 -0700, Jeremy Fitzhardinge wrote:
Rusty Russell wrote:
Well, containerization deserves its own menu, but the question is: does
it belong under this menu?
No, I don't think so. While there are some broad similarities in
effect, the technology is completely
On Thu, 2007-09-06 at 02:33 +1000, Rusty Russell wrote:
On Tue, 2007-09-04 at 14:42 +0100, Jeremy Fitzhardinge wrote:
Rusty Russell wrote:
static inline void arch_flush_lazy_mmu_mode(void)
{
- PVOP_VCALL1(set_lazy_mode, PARAVIRT_LAZY_FLUSH);
+ if
On Thu, 2007-09-06 at 06:37 +1000, Rusty Russell wrote:
On Thu, 2007-08-23 at 22:46 -0700, Zachary Amsden wrote:
I recently sent off a fix for lazy vmalloc faults which can happen under
paravirt when lazy mode is enabled. Unfortunately, I jumped the gun a
bit on fixing this. I neglected
Benjamin Herrenschmidt wrote:
On Wed, 2007-08-22 at 16:25 +1000, Rusty Russell wrote:
On Wed, 2007-08-22 at 08:34 +0300, Avi Kivity wrote:
Zachary Amsden wrote:
This patch provides hypercalls for the i386 port I/O instructions,
which vastly helps guests which use native-style
Jeremy Fitzhardinge wrote:
Hm. Doing any kind of lazy-state operation with preemption enabled is
fundamentally meaningless. How does it get into a preemptable state
Agree 100%. It is the lazy mode flush that might happen when preempt is
enabled, but lazy mode is disabled. In that case,
-preemptible.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED](none)
---
 arch/i386/kernel/vmi.c    | 14 ++
 arch/i386/xen/enlighten.c |  4 +++-
 2 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/arch/i386/kernel/vmi.c b/arch/i386/kernel/vmi.c
index 18673e0..9e669cb 100644
H. Peter Anvin wrote:
Zachary Amsden wrote:
In general, I/O in a virtual guest is subject to performance problems.
The I/O can not be completed physically, but must be virtualized. This
means trapping and decoding port I/O instructions from the guest OS.
Not only is the trap for a #GP
Jeremy Fitzhardinge wrote:
No, under Xen the kernel/hypervisor PMD is not shared between processes,
so this is still used when PAE is enabled.
Ahh, yes. So this was a lucky catch for us. Non-PAE kernels seem to be
increasing in value at antique sales.
Zach
Avi Kivity wrote:
Since this is only for newer kernels, won't updating the driver to use
a hypercall be more efficient? Or is this for existing out-of-tree
drivers?
Actually, it is for in-tree drivers that we emulate but don't want to
pollute, and one out of tree driver (that will
Andi Kleen wrote:
On Tue, Aug 21, 2007 at 10:23:14PM -0700, Zachary Amsden wrote:
In general, I/O in a virtual guest is subject to performance problems.
The I/O can not be completed physically, but must be virtualized. This
means trapping and decoding port I/O instructions from the guest
Andi Kleen wrote:
How is that measured? In a loop? In the same pipeline state?
It seems a little dubious to me.
I did the experiments in a controlled environment, with interrupts
disabled and care to get the pipeline in the same state. It was a
perfectly repeatable experiment. I don't
Alan Cox wrote:
I still think it's preferable to change some drivers than everybody.
AFAIK BusLogic as real hardware is pretty much dead anyways,
so you're probably the only primary user of it anyways.
Go wild on it!
I don't believe anyone is materially maintaining the buslogic driver
Andi Kleen wrote:
We might benefit from it, but would the
BusLogic driver? It sets a nasty precedent for maintenance as different
hypervisors and emulators hack up different drivers for their own
performance.
I still think it's preferable to change some drivers than everybody.
AFAIK
to -stable as well.
Zach
Touching vmalloc memory in the middle of a lazy mode update can generate
a kernel PDE update, which must be flushed immediately. The fix is to
leave lazy mode when doing a vmalloc sync.
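The ordering problem described above can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: struct lazy_mmu, lazy_enter(), lazy_leave(), mmu_update() and vmalloc_sync() are hypothetical names for this illustration, not the kernel's functions, and the model does not re-enter lazy mode afterwards because the text does not say whether the real fix does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names are taken from the kernel. */
#define QUEUE_MAX 16

struct lazy_mmu {
    int lazy;          /* nonzero while in lazy mode */
    size_t queued;     /* updates waiting in the queue */
    size_t applied;    /* updates already pushed out */
};

/* Push every queued update out in one batch. */
static void lazy_flush(struct lazy_mmu *m)
{
    m->applied += m->queued;
    m->queued = 0;
}

static void lazy_enter(struct lazy_mmu *m)
{
    m->lazy = 1;
}

/* Leaving lazy mode always flushes the queue first. */
static void lazy_leave(struct lazy_mmu *m)
{
    lazy_flush(m);
    m->lazy = 0;
}

/* One page table update: queued in lazy mode, applied at once otherwise. */
static void mmu_update(struct lazy_mmu *m)
{
    if (m->lazy) {
        assert(m->queued < QUEUE_MAX);
        m->queued++;
    } else {
        m->applied++;
    }
}

/* Model of the fix: leave lazy mode before the sync, so the kernel PDE
 * update is applied immediately instead of sitting in the queue. */
static void vmalloc_sync(struct lazy_mmu *m)
{
    if (m->lazy)
        lazy_leave(m);
    mmu_update(m);
}
```

In this model, two queued updates followed by vmalloc_sync() end with an empty queue and three applied updates, which is the property the commit message asks for.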
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff --git
to make use of this feature.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff --git a/arch/i386/kernel/paravirt.c b/arch/i386/kernel/paravirt.c
index ea962c0..4d0d150 100644
--- a/arch/i386/kernel/paravirt.c
+++ b/arch/i386/kernel/paravirt.c
@@ -329,6 +329,18 @@ struct paravirt_ops paravirt_ops
Avi Kivity wrote:
Zachary Amsden wrote:
In general, I/O in a virtual guest is subject to performance
problems. The I/O can not be completed physically, but must be
virtualized. This means trapping and decoding port I/O instructions
from the guest OS. Not only is the trap for a #GP
Andi Kleen wrote:
In the boot decompressor for the kernel in the image Iouri provided, I
32bit or 64bit image?
As you can plainly see, the call to memcpy (which is redefined in
boot/compressed/misc.c) is made using stack calling convention.
Unfortunately, the compiler generated the
to a minute or more to decompress a 1.3MB kernel
on a very fast box.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
===
--- a/arch/x86_64/boot/compressed/head.S
+++ a/arch/x86_64/boot/compressed/head.S
@@ -195,6 +195,11 @@
movl %eax
H. Peter Anvin wrote:
I just got the following message on the syslinux mailing list:
2. On some platforms (vmware for example :), READING from the video memory
in the 32bit mode is impossible (causes an exception). Taking into account
that the scroll function in
Rusty Russell wrote:
Otherwise we end up with $NARCH copies of that Kconfig, each slightly
different. The top-level entry can be made to depend on the archs that
actually have some virt capability, so as not to show an empty menu.
I dislike the duplication, too, but
1) it's a CPU
Stefan Richter wrote:
Robert P. J. Day wrote:
Signed-off-by: Robert P. J. Day [EMAIL PROTECTED]
---
diff --git a/arch/i386/kernel/vmi.c b/arch/i386/kernel/vmi.c
Maintainers are apparently those under PARAVIRT_OPS INTERFACE.
CCs added.
index c12720d..e3ce5c8 100644
---
H. Peter Anvin wrote:
Jeremy Fitzhardinge wrote:
H. Peter Anvin wrote:
Jeremy Fitzhardinge wrote:
You mean it is a real failure? Or is it triggered by particular
instructions? It seems profoundly bogus (as in, surely DOS or
something reads the framebuffer).
A real
H. Peter Anvin wrote:
Zachary Amsden wrote:
I'm failing to understand exactly what the failure here is. Can you
provide sample code that generates the problem? Surely, it should be
possible to read the framebuffer.
The supposed test case was leaving the cursor at the bottommost
Anthony Liguori wrote:
Zachary Amsden wrote:
Jeremy Fitzhardinge wrote:
Anthony Liguori wrote:
I don't agree that having paravirt_ops within a normal module is all
that useful. By the time modules can be loaded, the kernel has
completely booted. There should only be a handful
Jeremy Fitzhardinge wrote:
Zachary Amsden wrote:
Unless you also migrate the hypercall page itself and impose
migration restrictions on compatible hypercall pages.
Seems unreasonable, especially if you support migration between VT and
SVM machines. The whole point of a hypercall page
Anthony Liguori wrote:
I don't see a compelling reason to paravirtualize earlier although I
also don't see a compelling reason not to. I noticed that VMI hooks
setup.c. It wasn't immediately obvious why it was hooking there but
perhaps it is worthwhile to have a common hook? I suspect VMI
Jeremy Fitzhardinge wrote:
Well, I was suggesting we could print the banner later rather than
forcing an earlier init.
The important part is that you set your pv_ops before patching occurs,
since that will bake the function calls into the rest of the kernel, and
it will ignore any further
Nakajima, Jun wrote:
And actually you don't need the write to CR3 to flush TLB because the
one to CR4 does it. Or does kvm_flush_tlb_kernel assume that CR3 is
updated at the same time?
Jun
It should not be necessary, but I believe this was added as a workaround
to a PII erratum. I can't
Jeremy Fitzhardinge wrote:
I'm implementing a more efficient version of the Xen iret paravirt_op,
so that it can use the real iret instruction where possible. I really
need to get access to per-cpu variables, so I can set the event mask
state in the vcpu_info structure, but unfortunately at the
Jeremy Fitzhardinge wrote:
I think the more things we can devolve out of paravirt_ops the better,
especially if they make well-defined self-contained interfaces of their
own. I would be open, for example, to moving all the pagetable and
privileged instruction operations out into their own _ops
default to on when CONFIG_PARAVIRT is enabled.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 20882b709da4 arch/i386/Kconfig
--- a/arch/i386/Kconfig Thu Apr 26 19:58:12 2007 -0700
+++ b/arch/i386/Kconfig Mon Apr 30 15:32:34 2007 -0700
@@ -227,6 +227,7 @@ config VMI
config VMI
bool VMI
Jeremy Fitzhardinge wrote:
Well, the BUG is if the patch-size is smaller than sizeof(indirect
call). Or more generally, if the patch site contains bogus crud. But I
don't think checking for that in the patcher makes a great deal of sense.
True, it may not make a lot of sense to check
traditional functionality which was inlined into modules before
is still available to modules with paravirt_ops, but there is no danger
of exporting it to rogue or non-GPL modules.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 80ddc95c2ab2 arch/i386/kernel/paravirt.c
--- a/arch/i386/kernel
Jeremy Fitzhardinge wrote:
Zachary Amsden wrote:
In shadow mode hypervisors, ptep_get_and_clear achieves the desired
purpose of keeping the shadows in sync by issuing a native_get_and_clear,
followed by a call to pte_update, which indicates the PTE has been
modified.
Direct mode
Dave Jones wrote:
On Wed, Apr 11, 2007 at 10:30:58PM -0700, Zachary Amsden wrote:
In situations where page table updates need only be made locally, and there
is no cross-processor A/D bit races involved, we need not use the
heavyweight
xchg instruction to atomically fetch and clear
H. Peter Anvin wrote:
Zachary Amsden wrote:
Some PTE optimizations for native and paravirt-ops kernels; this
provides a huge win for shadow mode hypervisors and gets rid of
some unnecessary atomic instructions in native kernels, saving
even more on UP by getting rid of implicit LOCK on xchg
this!
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 99800d11a3ec arch/i386/kernel/vmi.c
--- a/arch/i386/kernel/vmi.c Thu Apr 12 16:37:29 2007 -0700
+++ b/arch/i386/kernel/vmi.c Thu Apr 12 19:00:46 2007 -0700
@@ -685,11 +685,14 @@ do
Chris Wright wrote:
* Zachary Amsden ([EMAIL PROTECTED]) wrote:
+void __init vmi_time_init(void)
+{
+ /* Disable PIT: BIOSes start PIT CH0 with 18.2 Hz periodic. */
+ outb_p(0x3a, PIT_MODE); /* binary, mode 5, LSB/MSB, ch 0 */
That shouldn't be necessary using clockevents
optimization for non-SMP kernels; drop the atomic
xchg operations from page table updates.
Thanks to Michel Lespinasse for noting this potential optimization.
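The difference between the two variants can be sketched in portable C11. This is an illustration under assumptions, not the kernel code: pte_val, get_and_clear_atomic() and get_and_clear_local() are hypothetical names, and atomic_exchange() stands in for the xchg instruction mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t pte_val;   /* hypothetical stand-in for a 2-level PTE */

/* Atomic variant: a single exchange, so an update made concurrently by
 * another processor cannot be lost between the read and the clear. */
static pte_val get_and_clear_atomic(_Atomic pte_val *ptep)
{
    return atomic_exchange(ptep, 0);
}

/* Local variant: plain load followed by plain store. This is only safe
 * when nothing else can modify the entry between the two steps. */
static pte_val get_and_clear_local(pte_val *ptep)
{
    pte_val old = *ptep;
    *ptep = 0;
    return old;
}
```

Both functions return the old entry and leave zero behind; the local one simply skips the cost of the atomic operation when no other processor can race on the entry.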
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 47495b2532b3 include/asm-i386/pgtable-2level.h
--- a/include/asm-i386/pgtable
Add comment and condense code to make use of native_local_ptep_get_and_clear
function. Also, it turns out the 2-level and 3-level paging definitions
were identical, so move the common definition into pgtable.h
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r b3bbc1b5e085 include/asm-i386
Chris Wright wrote:
* Zachary Amsden ([EMAIL PROTECTED]) wrote:
Yes, but unfortunately that is a nop:
/*
* Avoid unnecessary state transitions, as it confuses
* Geode / Cyrix based boxen.
*/
case CLOCK_EVT_MODE_SHUTDOWN
Jeremy Fitzhardinge wrote:
Seems to work OK for native and Xen. I had to play a bit with the
paravirt-sched-clock patch to deal with the VMI changes. Zach, can you
check that it still works?
I'm on it.
Zach
Copying of the pgd range must happen under the pgd_lock. This got broken by
the paravirt changes in the -mm tree. Badness can result if you copy the pgd
before being added to the list when splitting or rejoining large pages.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 2247ff2c3fdb
Now that the VDSO can be relocated, we can support it in VMI configurations.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 158d9ffb46fe arch/i386/Kconfig
--- a/arch/i386/Kconfig Thu Mar 29 04:17:05 2007 -0700
+++ b/arch/i386/Kconfig Thu Mar 29 04:18:05 2007 -0700
@@ -220,7 +220,7
Latest cleanups and junk from Zach's tree. All for the -mm tree.
Based off Jeremy's latest known applied patches. If the
paravirt or VMI patches reject, let me know; we are cleaning up
the tree and will redo.
Otherwise, I have 4 fixes for i386; a warning fix in sysenter
which is quite serious; some less
Don't implement native_kmap_atomic_pte for !HIGHPTE case; it is never needed,
never called, and leaving it in is just plain confusing. Making it isolated
to the config where it is used may help find bugs.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r 5c03805411a6 arch/i386/kernel
-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r ecb571084874 arch/i386/kernel/vmi.c
--- a/arch/i386/kernel/vmi.c Fri Apr 06 12:31:06 2007 -0700
+++ b/arch/i386/kernel/vmi.c Fri Apr 06 14:25:03 2007 -0700
@@ -69,6 +69,7 @@ static struct {
void (*flush_tlb)(int);
void
IRQ. It actually
gets delivered by the APIC hardware, but we don't want to use the same local
APIC clocksource processing, so we create our own handler here.
Signed-off-by: Zachary Amsden [EMAIL PROTECTED]
diff -r c02ab981c99c arch/i386/kernel/Makefile
--- a/arch/i386/kernel/Makefile Mon Apr 09
H. Peter Anvin wrote:
This code is almost entirely identical to the setgpr_wrapper in the
patch (except for the fact that setgpr_wrapper sets and captures *ALL*
the GPRs), and it seems rather pointless to use another wrapper. It
takes a pointer to an entrypoint (default to cpuid; ret in
H. Peter Anvin wrote:
I guess what I was trying to say was that we'd use setgpr_wrapper in
the case where you have an entrypoint with native (non-C) semantics;
in the other case we'd use an alternative to setgpr_wrapper. Either
way, it sounds like we're talking about implementing
Linus Torvalds wrote:
On Tue, 20 Mar 2007, Zachary Amsden wrote:
void local_irq_restore(int enabled)
{
pda.intr_mask = enabled;
/*
* note there is a window here where softirqs are not processed by
* the interrupt handler, but that is not a problem, since it will
* get