Gregory Haskins wrote:
> My current thoughts are that we at least move the IOAPIC into the kernel as
> well. That will give sufficient control to generate ISA bus interrupts for
> guests that understand APICs. If we want to be able to generate ISA
> interrupts for legacy guests which talk to t
Does KVM allow something like "memory hotplug" for its guests?
For example, let's say you are running several guests, and would like to
start yet another one for a while - but have no free memory left.
Obviously, your guests are so important that you don't want to stop them
- so you simply "h
>Does KVM allow something like "memory hotplug" for its guests?
It does not support it.
>
>
>For example, let's say you are running several guests, and would like to
>start yet another one for a while - but have no free memory left.
>
We have another solution for it that will soon be pushed into th
Avi Kivity wrote:
> Gregory Haskins wrote:
>> Hi Dor,
>> Please find a patch attached for your review which adds support for
>> dynamic substitution of the PIC/APIC code to QEMU. This will allow us
>> to selectively choose the KVM in-kernel apic emulation vs the QEMU
>> user-space apic emulatio
Anthony Liguori wrote:
>
> Then again, are we really positive that we have to move the APIC into
> the kernel? A lot of things will get much more complicated.
The following arguments are in favor:
- allow in-kernel paravirt drivers to interrupt the guest without going
through qemu (which involv
Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> Then again, are we really positive that we have to move the APIC into
>> the kernel? A lot of things will get much more complicated.
>
> The following arguments are in favor:
> - allow in-kernel paravirt drivers to interrupt the guest without
> go
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>>
>>> Then again, are we really positive that we have to move the APIC
>>> into the kernel? A lot of things will get much more complicated.
>>
>> The following arguments are in favor:
>> - allow in-kernel paravirt drivers to i
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> Anthony Liguori wrote:
>>>> Then again, are we really positive that we have to move the APIC
>>>> into the kernel? A lot of things will get much more complicated.
>>>
>>> The following arguments are in favor:
>>> - allow in
>>> This is for the TPR right? VT has special logic to handle TPR
>>> virtualization doesn't it? I thought SVM did too...
>>>
>>
>> Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
>> also supports CR8 in 32-bit mode through a new instruction encoding, but
>> nobody uses that
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> Anthony Liguori wrote:
>>>> Then again, are we really positive that we have to move the APIC
>>>> into the kernel? A lot of things will get much more complicated.
>>>
>>> The following arguments are in favor:
>>> - allow in-
Dor Laor wrote:
>>>> This is for the TPR right? VT has special logic to handle TPR
>>>> virtualization doesn't it? I thought SVM did too...
>>> Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
>>> also supports CR8 in 32-bit mode through a new instruction enco
Casey,
On Tue, Apr 03, 2007 at 10:46:38PM -0400, Casey Jeffery wrote:
> Stephane,
>
> I'm glad you found this; I thought I was going to have to repost while
> actually remembering to change the subject line.
>
Someone else pointed me to your message. The title was indeed misleading.
> >On Wed,
Anthony Liguori wrote:
>> Maybe some brave soul can hack kvm to patch the new instruction in
>> place of the mmio instruction Windows uses to bang on the tpr.
>
> It seems like that shouldn't be too hard assuming that the MMIO
> instructions are <= the new CR8 instruction. It would require knowi
Anthony Liguori wrote:
>
>>
>> This pushes towards in kernel apic too. Can't see how we avoid it.
>>
>
> Does it really? IIUC, we would avoid TPR traps entirely and would
> just need to synchronize the TPR whenever we go down to userspace.
>
It's a bit more complex than that, as userspace wou
Avi Kivity wrote:
> Anthony Liguori wrote:
>>> Maybe some brave soul can hack kvm to patch the new instruction in
>>> place of the mmio instruction Windows uses to bang on the tpr.
>>
>> It seems like that shouldn't be too hard assuming that the MMIO
>> instructions are <= the new CR8 instruction
Anthony Liguori wrote:
>>>
>>> If we do this, then we can probably just handle the TPR as a special
>>> case anyway and not bother returning to userspace when the TPR is
>>> updated through MMIO. That saves the round trip without adding
>>> emulation complexity.
>>
>> That means the emulation i
Stephane
> > >There may be some propagation delay yet you, supposedly, do not suffer
> > >from masked
> > >interrupt windows. Also something to watch out for is that when you restore
> > >you must make sure that msrs upper bits are set to 1. Otherwise you may
> > >trigger involuntary interrup
Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>>>
>>> This pushes towards in kernel apic too. Can't see how we avoid it.
>>>
>>
>> Does it really? IIUC, we would avoid TPR traps entirely and would
>> just need to synchronize the TPR whenever we go down to userspace.
>>
>
> It's a bit more co
>>> On Wed, Apr 4, 2007 at 3:40 AM, in message <[EMAIL PROTECTED]>,
Avi Kivity <[EMAIL PROTECTED]> wrote:
>
> I would avoid moving down anything that's not strictly necessary.
Agreed.
>
> I still don't have an opinion as to whether it is necessary; I'll need
> to study the details. Xen pus
>>>
>>> If we do this, then we can probably just handle the TPR as a special
>>> case anyway and not bother returning to userspace when the TPR is
>>> updated through MMIO. That saves the round trip without adding
>>> emulation complexity.
>>
>> That means the emulation is split among user space a
>>>> This pushes towards in kernel apic too. Can't see how we avoid it.
>>>
>>> Does it really? IIUC, we would avoid TPR traps entirely and would
>>> just need to synchronize the TPR whenever we go down to userspace.
>>>
>>
>> It's a bit more complex than that, as userspace would need to
Leslie Mann wrote:
I'll prepare the first patch. Can you ensure that your upgraded setup
still works with kvm-17.
It does, as I use it daily in order to run a Win app that I need.
Please test the attached patch, against kvm-17. This is subversion
revision 4546 and git commit c01571ed5
Nakajima, Jun wrote:
> Most H/W-virtualization-capable processors out there don't support
> that feature today. I think the decision (kvm or qemu) should be done
> based on performance data. I'm not worried about maintenance issues; the
> APIC code is not expected to change frequently. I'm a bit
I swear this has been brought up before in this forum, but I can't
find it. I'm curious what the virtualization gurus in this forum think
of the possibilities for recursive virtualization. I know vbox claims
to support it, but I haven't come across many details on how they do
it and I don't think t
Dor Laor wrote:
> This pushes towards in kernel apic too. Can't see how we avoid it.
>
>
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
>>> It's a bit more comp
Dor Laor wrote:
>>>> If we do this, then we can probably just handle the TPR as a special
>>>> case anyway and not bother returning to userspace when the TPR is
>>>> updated through MMIO. That saves the round trip without adding
>>>> emulation complexity.
>>> That means the emulation
>I swear this has been brought up before in this forum, but I can't
>find it. I'm curious what the virtualization gurus in this forum think
>of the possibilities for recursive virtualization. I know vbox claims
>to support it, but I haven't come across many details on how they do
>it and I don't th
Avi Kivity wrote:
> Nakajima, Jun wrote:
>> Most H/W-virtualization-capable processors out there don't support
>> that feature today. I think the decision (kvm or qemu) should be done
>> based on performance data. I'm not worried about maintenance issues;
>> the APIC code is not expected to chan
Gregory Haskins wrote:
> What I was planning on doing was using that QEMU patch I provided to
> intercept all pic_send_irq() calls and forward them directly to the kernel
> via a new ioctl(). This ioctl would be directed at the VM fd, not the VCPU,
> since it's a pure ISA global pin reference an
It seems from cursory inspection that this is possible in theory, even on HVM
hardware. My thoughts are as follows (Intel oriented, which I know better):
*) The hypervisor is set to trap on VMX-type operations (VMXON/OFF/START/RESUME,
etc.) and provides emulation of them as follows:
*) When a VMXO
Anthony Liguori wrote:
>>
>>> BTW, I see CPU utilization of qemu is almost always 99% in the top
>>> command when I run kernel build in an x86-64 Linux guest.
>>>
>>
>
> qemu would be 99% even if all the time is being spent in the guest
> context.
>
> If the user time is high, an oprofile r
Dor,
Thanks, I realize there will certainly be a lot of work in
virtualizing them. Maybe Intel can help out with VVT-x to give a
root-root mode. ;)
Any idea at a high level how vbox does it? I will post in their forum,
but I assume somebody here has a good idea.
Thanks.
On 4/4/07, Dor Laor <[EM
Nakajima, Jun wrote:
> I compared the performance on Xen and KVM for kernel build using the
> same guest image. Looks like KVM (kvm-17) was three times slower as far
> as we tested, and that high load of qemu was one of the symptoms. We are
> looking at the shadow code, but the load of qemu looks v
>>> On Wed, Apr 4, 2007 at 12:49 PM, in message <[EMAIL PROTECTED]>,
Avi Kivity <[EMAIL PROTECTED]> wrote:
> Gregory Haskins wrote:
>
>
> Hmm. If the ioapic is in the kernel, then it's a platform-wide resource
> and you would need a vm ioctl. If ioapic emulation is in userspace,
> then the i
Avi Kivity wrote:
> Nakajima, Jun wrote:
>> I compared the performance on Xen and KVM for kernel build using the
>> same guest image. Looks like KVM (kvm-17) was three times slower as
>> far as we tested, and that high load of qemu was one of the
>> symptoms. We are looking at the shadow code, but
Gregory Haskins wrote:
> Agreed. I was thinking that the interface for the "IOAPIC in kernel" model
> would look something like the way the pic_send_irq() function looks, except
> it would also convey BUS/IOAPIC id.
>
> so: kvm_inject_interrupt(int bus, int pin, int value);
>
> and the "kvmpic"
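A rough illustration of the call shape proposed in the message above, not code from the thread's patch: a minimal C sketch in which a (bus, pin, value) triple updates a per-bus pin bitmap and reports rising edges. Everything named `sketch_*` or `SKETCH_*`, the explicit state pointer, the bus and pin counts, and the return convention are assumptions made for the example; the thread only gives the signature `kvm_inject_interrupt(int bus, int pin, int value)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_BUS  2    /* assumed: number of buses/IOAPICs modeled */
#define SKETCH_MAX_PINS 24   /* assumed: input pins per bus */

/* Hypothetical per-VM state: one bit per input pin, 1 = asserted. */
struct sketch_irq_state {
    uint32_t level[SKETCH_MAX_BUS];
};

/*
 * Same (bus, pin, value) arguments as the proposed interface, plus an
 * explicit state pointer (an assumption of this sketch).
 * Returns 1 on a 0->1 transition of the pin, 0 when there is no rising
 * edge, and -1 when bus or pin is out of range.
 */
static int sketch_inject_interrupt(struct sketch_irq_state *s,
                                   int bus, int pin, int value)
{
    assert(s != NULL);
    if (bus < 0 || bus >= SKETCH_MAX_BUS || pin < 0 || pin >= SKETCH_MAX_PINS)
        return -1;

    uint32_t mask = UINT32_C(1) << pin;
    int was_high = (s->level[bus] & mask) != 0;

    /* Record the new pin level. */
    if (value)
        s->level[bus] |= mask;
    else
        s->level[bus] &= ~mask;

    /* Only a low-to-high change counts as a new edge. */
    return value && !was_high;
}
```

Keeping the pin level separate from edge detection is one way to let the same entry point serve both edge- and level-triggered sources; whether the real interface does this is not stated in the thread.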
>>> On Wed, Apr 4, 2007 at 10:20 AM, in message <[EMAIL PROTECTED]>,
Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
> The devices are already written to take a set_irq function. Instead of
> hijacking the emulated PIC device, I think it would be better if in
> pc.c, we just conditionally created
>>> On Wed, Apr 4, 2007 at 1:43 PM, in message <[EMAIL PROTECTED]>,
Avi Kivity <[EMAIL PROTECTED]> wrote:
> Gregory Haskins wrote:
>> Agreed. I was thinking that the interface for the "IOAPIC in kernel" model
>> would look something like the way the pic_send_irq() function looks, except
>> it wo
Nakajima, Jun wrote:
> Avi Kivity wrote:
>
>> Nakajima, Jun wrote:
>>
>>> Most H/W-virtualization-capable processors out there don't support
>>> that feature today. I think the decision (kvm or qemu) should be done
>>> based on performance data. I'm not worried about maintenance issues;
Gregory Haskins wrote:
On Wed, Apr 4, 2007 at 10:20 AM, in message <[EMAIL PROTECTED]>,
> Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
>> The devices are already written to take a set_irq function. Instead of
>> hijacking the emulated PIC device, I think it would be better
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> > It still exists in userspace. Having the code duplication
> > (especially when it's not the same code base) is unfortunate.
>
> This remains true.
but it's the wrong argument. Of course there's duplicate functionality,
and that's _good_ because it
* Anthony Liguori <[EMAIL PROTECTED]> wrote:
> > Keeping the apic in the kernel simplifies this with the cost of
> > maintaining an apic/pic implementation.
>
> Hrm, this is definitely starting to sound like a PITA to deal with.
> Maybe in-kernel platform devices are unavoidable :-/
yes, ver
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> > My current thoughts are that we at least move the IOAPIC into the
> > kernel as well. That will give sufficient control to generate ISA
> > bus interrupts for guests that understand APICs. If we want to be
> > able to generate ISA interrupts for le
* Gregory Haskins <[EMAIL PROTECTED]> wrote:
> > pci is level triggered, so maybe the guests just handle the
> > inaccuracy.
> >
>
> Good point. I'm not sure how this works today. Perhaps we just get
> lucky that nothing checks the IRR in the IOAPIC coupled with a bug in
> the IOAPIC model
Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>
>
>>> It still exists in userspace. Having the code duplication
>>> (especially when it's not the same code base) is unfortunate.
>>>
>> This remains true.
>>
>
> but it's the wrong argument. Of course there's duplicate
* Gregory Haskins <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> Attached is a snapshot of my current efforts on the kernel side for
> the in-kernel APIC work. Feedback welcome.
good work and nice patch! :)
> My current thoughts are that we at least move the IOAPIC into the
> kernel as well. [...
The MMIO registration code has been broken out as a new patch from the
in-kernel APIC work with the following changes per Avi's request:
1) Supports dynamic registration
2) Uses gpa_t addresses
3) Explicit per-cpu mappings
In addition, I have added the concept of distinct VCPU and VM level
regi
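The patch itself is not shown in this listing, so the following is only a minimal C sketch of the ideas named above: ranges keyed by `gpa_t`, registered at run time, with a per-VCPU table consulted before a VM-wide one. All `sketch_*` names, the fixed table size, and the lookup order are assumptions made for illustration, not the patch's actual structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gpa_t;   /* guest-physical address */

/* Hypothetical record for one registered MMIO range. */
struct sketch_io_device {
    gpa_t base;
    gpa_t len;
    const char *name;
};

#define SKETCH_MAX_DEVS 8   /* assumed table size */

/* Hypothetical bus: a small table that devices are added to at run time. */
struct sketch_io_bus {
    struct sketch_io_device devs[SKETCH_MAX_DEVS];
    size_t count;
};

/* Dynamic registration: returns 0 on success, -1 if the table is full. */
static int sketch_bus_register(struct sketch_io_bus *bus,
                               gpa_t base, gpa_t len, const char *name)
{
    assert(len > 0);
    if (bus->count == SKETCH_MAX_DEVS)
        return -1;
    bus->devs[bus->count++] = (struct sketch_io_device){ base, len, name };
    return 0;
}

/* Linear search for the range containing addr; NULL if none matches. */
static const struct sketch_io_device *
sketch_bus_find(const struct sketch_io_bus *bus, gpa_t addr)
{
    for (size_t i = 0; i < bus->count; i++) {
        const struct sketch_io_device *d = &bus->devs[i];
        if (addr >= d->base && addr - d->base < d->len)
            return d;
    }
    return NULL;
}

/* Assumed lookup order: the faulting VCPU's own ranges first, then VM-wide. */
static const struct sketch_io_device *
sketch_mmio_lookup(const struct sketch_io_bus *vcpu_bus,
                   const struct sketch_io_bus *vm_bus, gpa_t addr)
{
    const struct sketch_io_device *d = sketch_bus_find(vcpu_bus, addr);
    return d ? d : sketch_bus_find(vm_bus, addr);
}
```

For instance, a local APIC page could be registered on a VCPU's table at the conventional default base 0xfee00000 and an IOAPIC on the VM table at 0xfec00000; these addresses are only example values here.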
* Anthony Liguori <[EMAIL PROTECTED]> wrote:
> But why is it a good thing to do PV drivers in the kernel? You lose
> flexibility and functionality to gain performance. [...]
in Linux a kernel-space network driver can still be tunneled over
user-space code, and hence you can add arbitrary add-
>>> On Wed, Apr 4, 2007 at 4:32 PM, in message <[EMAIL PROTECTED]>,
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>> My current thoughts are that we at least move the IOAPIC into the
>> kernel as well. [...]
>
> yes. And then do the final 10% move of handling the i8259A in KVM too.
Hi Ingo,
We a
>
>Dor,
>
>Thanks, I realize there will certainly be a lot of work in
>virtualizing them. Maybe Intel can help out with VVT-x to give a
>root-root mode. ;)
>
>Any idea at a high level how vbox does it? I will post in their forum,
>but I assume somebody here has a good idea.
Vbox branched out from
>Avi Kivity wrote:
>> Nakajima, Jun wrote:
>>> I compared the performance on Xen and KVM for kernel build using the
>>> same guest image. Looks like KVM (kvm-17) was three times slower as
>>> far as we tested, and that high load of qemu was one of the
>>> symptoms. We are looking at the shadow code
Gregory Haskins wrote:
On Wed, Apr 4, 2007 at 4:32 PM, in message <[EMAIL PROTECTED]>,
> Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>>> My current thoughts are that we at least move the IOAPIC into the
>>> kernel as well. [...]
>>>
>> yes. And then do the final 10% mov
>we should move all the PICs into KVM proper - and that includes the
>i8259A PIC too. Qemu-space drivers are then wired to pins on these PICs,
>but nothing in Qemu does vector generation or vector prioritization -
>that task is purely up to KVM. There are mixed i8259A+lapic models
>possible too and
>> Gregory Haskins wrote:
>>
>>
>> Hmm. If the ioapic is in the kernel, then it's a platform-wide resource
>> and you would need a vm ioctl. If ioapic emulation is in userspace,
>> then the ioapic logic will have decided which cpu is targeted and you
>> would issue a vcpu ioctl.
>>
>
>Thats exac
>But why is it a good thing to do PV drivers in the kernel? You lose
>flexibility and functionality to gain performance. Really, it's more
>about there not being good enough userspace interfaces to do network IO.
>
>> The lapic/PIC code
>> should also be available in Qemu for OSs that don't have
* Gregory Haskins ([EMAIL PROTECTED]) wrote:
> The MMIO registration code has been broken out as a new patch from the
> in-kernel APIC work with the following changes per Avi's request:
>
> 1) Supports dynamic registration
> 2) Uses gpa_t addresses
> 3) Explicit per-cpu mappings
>
> In addition,
Hi Chris,
Thanks for the feedback. I've answered inline below.
>>> On Wed, Apr 4, 2007 at 6:48 PM, in message
<[EMAIL PROTECTED]>, Chris Wright
<[EMAIL PROTECTED]> wrote:
> * Gregory Haskins ([EMAIL PROTECTED]) wrote:
>> The MMIO registration code has been broken out as a new patch from the
>
On Wed, 2007-04-04 at 23:21 +0200, Ingo Molnar wrote:
> * Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
> > But why is it a good thing to do PV drivers in the kernel? You lose
> > flexibility and functionality to gain performance. [...]
>
> in Linux a kernel-space network driver can still be tun
The attachment contains fixes based on the feedback from Chris.
Thanks Chris!
Regards,
-Greg
diff --git a/drivers/kvm/kvm.h b/drivers/kvm/kvm.h
index fceeb84..0e6eb04 100644
--- a/drivers/kvm/kvm.h
+++ b/drivers/kvm/kvm.h
@@ -236,6 +236,54 @@ struct kvm_pio_request {
int rep;
};
+str
* Gregory Haskins ([EMAIL PROTECTED]) wrote:
> LAPICs can be remapped on a per-cpu basis via an MSR, whereas something
> like an IOAPIC is a system-wide resource.
Yes, I see now, no vcpu in kvm_io_device callbacks' context (admittedly,
I'm used to the Xen implementation ;-)
> >> +struct kvm_io_de
Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>
>
>>> It still exists in userspace. Having the code duplication
>>> (especially when it's not the same code base) is unfortunate.
>>>
>> This remains true.
>>
>
> but it's the wrong argument. Of course there's duplicate
Ingo Molnar wrote:
> we should move all the PICs into KVM proper - and that includes the
> i8259A PIC too. Qemu-space drivers are then wired to pins on these PICs,
> but nothing in Qemu does vector generation or vector prioritization -
> that task is purely up to KVM. There are mixed i8259A+lapi
Ingo Molnar wrote:
> there is a remote possibility that some OSs depend on certain devices
> being level-triggered: for example if you get an IRQ from a
> level-triggered device and _don't_ deassert that signal from the IRQ
> handler (intentionally so), then the semantics of current hardware will
Anthony Liguori wrote:
>
> Yeah, I think this is a good point. If we're going to push the APIC
> into the kernel, we might as well put the PIT there too. The timing
> stuff is an absolute mess in QEMU since it wants to get a fast high
> res clock but isn't aware of things like CPU migration.