Rusty Russell wrote:
> On Thu, 2007-04-12 at 06:32 +0300, Avi Kivity wrote:
>
>> I hadn't considered an always-blocking (or unbuffered) networking API.
>> It's very counter to current APIs, but does make sense with things like
>> syslets. Without syslets, I don't think it's very useful as you
On Thu, 2007-04-12 at 06:32 +0300, Avi Kivity wrote:
> I hadn't considered an always-blocking (or unbuffered) networking API.
> It's very counter to current APIs, but does make sense with things like
> syslets. Without syslets, I don't think it's very useful as you need
> some artificial threads
Rusty Russell wrote:
> On Wed, 2007-04-11 at 17:28 +0300, Avi Kivity wrote:
>
>> Rusty Russell wrote:
>>
>>> On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote:
>>>
>>>
>>>> Nope. Being async is critical for copyless networking:
>> With async operations, the s
On Wed, 2007-04-11 at 17:28 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> > On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote:
> >
> >> Nope. Being async is critical for copyless networking:
> >>
> With async operations, the saga continues like this: the host-side
> driver allocates an s
Rusty Russell wrote:
> On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote:
>
>> Nope. Being async is critical for copyless networking:
>>
>> - in the transmit path, we need to stop the sender (guest) from touching
>> the memory until it's on the wire. This means 100% of packets sent will
>> b
On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote:
> Nope. Being async is critical for copyless networking:
>
> - in the transmit path, we need to stop the sender (guest) from touching
> the memory until it's on the wire. This means 100% of packets sent will
> be blocked.
Hi Avi,
You
Rusty Russell wrote:
> On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote:
>
>> Moreover, some things just don't lend themselves to a userspace
>> abstraction. If we want to expose tso (tcp segmentation offload), we
>> can easily do so with a kernel driver since the kernel interfaces are
>>
On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote:
> Moreover, some things just don't lend themselves to a userspace
> abstraction. If we want to expose tso (tcp segmentation offload), we
> can easily do so with a kernel driver since the kernel interfaces are
> all tso aware. Tacking on tso
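For reference, TSO is the kind of per-interface offload that can be inspected and toggled from the host with ethtool; this only illustrates the feature being discussed (the interface name eth0 is an assumption):

```shell
# Show current offload settings for the interface
ethtool -k eth0 | grep -i segmentation

# Enable TCP segmentation offload, letting the stack hand the NIC
# TCP buffers larger than the MTU and have the hardware segment them
ethtool -K eth0 tso on
```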
Evgeniy Polyakov wrote:
> On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED])
> wrote:
>
>>> But I don't get this "we can enhance the kernel but not userspace" vibe
>>> 8(
>>>
>>>
>> I've been waiting for network aio since ~2003. If it arrives in the
>> next few
Evgeniy Polyakov wrote:
> On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED])
> wrote:
>
>>> Check a link please in case we are talking about different ideas:
>>> http://marc.info/?l=linux-netdev&m=112262743505711&w=2
>>>
>>>
>>>
>> I don't really understand what y
On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >Check a link please in case we are talking about different ideas:
> >http://marc.info/?l=linux-netdev&m=112262743505711&w=2
> >
> >
>
> I don't really understand what you're testing there. In particular, how
> c
Evgeniy Polyakov wrote:
>> This is what Xen does. It is actually less performant than copying, IIRC.
>>
>> The problem with flipping pages around is that physical addresses are
>> cached both in the kvm mmu and in the on-chip tlbs, necessitating
>> expensive page table walks and tlb invalidation
On Tue, Apr 10, 2007 at 02:21:24PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >You want to implement zero-copy network device between host and guest, if
> >I understood this thread correctly?
> >So, for sending part, device allocates pages from receiver's memory (or
> >from shared memory), rece
Evgeniy Polyakov wrote:
>>> But it looks from this discussion, that it will not prevent from
>>> changing in-kernel driver - place a hook into skb allocation path and
>>> allocate data from opposing memory - get pages from another side and put
>>> them into fragments, then copy headers into skb->da
On Tue, Apr 10, 2007 at 11:19:52AM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> I meant, network aio in the mainline kernel. I am aware of the various
> out-of-tree implementations.
If potential users do not pay attention to the initial implementation, it is
quite hard for them to get into. But actua
On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote:
> >But I don't get this "we can enhance the kernel but not userspace" vibe
> >8(
> >
>
> I've been waiting for network aio since ~2003. If it arrives in the
> next few days, I'm all for it; much more than kvm can u
Rusty Russell wrote:
> On Mon, 2007-04-09 at 10:10 +0300, Avi Kivity wrote:
>
>> Rusty Russell wrote:
>>
>>> I'm a little puzzled by your response. Hmm...
>>>
>>> lguest's userspace network frontend does exactly as many copies as
>>> Ingo's in-host-kernel code. One from the Guest,
On Mon, 2007-04-09 at 10:10 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> > I'm a little puzzled by your response. Hmm...
> >
> > lguest's userspace network frontend does exactly as many copies as
> > Ingo's in-host-kernel code. One from the Guest, one to the Guest.
>
> kvm pvnet is
Rusty Russell wrote:
> On Sun, 2007-04-08 at 08:36 +0300, Avi Kivity wrote:
>
>> Rusty Russell wrote:
>>
>>> Hi Avi,
>>>
>>> I don't think you've thought about this very hard. The receive copy is
>>> completely independent of whether the packet is going to the guest via
>>> a kernel
On Sun, 2007-04-08 at 08:36 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> > Hi Avi,
> >
> > I don't think you've thought about this very hard. The receive copy is
> > completely independent of whether the packet is going to the guest via
> > a kernel driver or via userspace, so not relev
On Sun, Apr 08, 2007 at 08:36:14AM +0300, Avi Kivity wrote:
> That is not the common case. Nor is it true when there is a
> mismatch between the card's capabilities and guest expectations and
> constraints. For example, guest memory is not physically contiguous
> so a NIC that won't do scatter/ga
Rusty Russell wrote:
> On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote:
>
>> Rusty Russell wrote:
>>
>>> You didn't quote Anthony's point about "it's more about there not being
>>> good enough userspace interfaces to do network IO."
>>>
>>> It's easier to write a kernel-space network dr
* Rusty Russell <[EMAIL PROTECTED]> wrote:
> > prototyping new kernel APIs to implement user-space network drivers,
> > on a crufty codebase is not something that should be done lightly.
>
> I think you overestimate my radicalism. I was considering readv() and
> writev() on the tap device.
o
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
> > [...] Did Linux have extremely high quality code in 1994?
>
> yes! It was crucial to strive for extremely high quality code all the
> time. That was the only way to grow Linux's codebase, which was
>
* Anthony Liguori <[EMAIL PROTECTED]> wrote:
> [...] Did Linux have extremely high quality code in 1994?
yes! It was crucial to strive for extremely high quality code all the
time. That was the only way to grow Linux's codebase, which was ~300,000
lines of code in 1994, to the current 7.2+ mil
On Thu, 2007-04-05 at 13:36 +0200, Ingo Molnar wrote:
> prototyping new kernel APIs to implement user-space network drivers, on
> a crufty codebase is not something that should be done lightly.
I think you overestimate my radicalism. I was considering readv() and
writev() on the tap device.
Qem
On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote:
> Rusty Russell wrote:
> > You didn't quote Anthony's point about "it's more about there not being
> > good enough userspace interfaces to do network IO."
> >
> > It's easier to write a kernel-space network driver, but it's not
> > obviously the
Dong, Eddie wrote:
> Avi Kivity wrote:
>
>> Dong, Eddie wrote:
>>
>>> Avi Kivity wrote:
>>>
>>>
> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ
> linux kernel, KB gets 14% gain. We also did a shared PIC model
> which shares PIC state among Qemu & VMM wit
Avi Kivity wrote:
> Dong, Eddie wrote:
>> Avi Kivity wrote:
>>
With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
kernel, KB gets 14% gain. We also did a shared PIC model which shares
PIC state among Qemu & VMM with less LOC in VMM, it can get
similar perfor
Dong, Eddie wrote:
> Avi Kivity wrote:
>
>>> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
>>> kernel, KB gets 14% gain. We also did a shared PIC model which shares
>>> PIC state among Qemu & VMM with less LOC in VMM, it can get
>>> similar performance gain (5.8% in my te
Dor Laor wrote:
> So for the sake of the next arguments, what was the reasoning behind the
> decision of moving all the X-PI[c|t] things into Xen?
The PIT has been in Xen from the very beginning, when HVM support was
designed, for performance reasons. Moving the PIC into the VMM was also
done for performance reasons at that time
Avi Kivity wrote:
>> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
>> kernel, KB gets 14% gain. We also did a shared PIC model which shares
>> PIC state among Qemu & VMM with less LOC in VMM, it can get
>> similar performance gain (5.8% in my test).
>> BTW, at that time, PIT
Dor Laor wrote:
>> Dor Laor wrote:
>>>
>>> Do you know why the Xen guys chose to implement it in Xen? Why
>>> didn't they rip the Qemu implementation?
>>>
>> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
>> kernel, KB gets 14% gain. We also did a shared PIC model which shares
>
>Dor Laor wrote:
>>
>> Do you know why the Xen guys chose to implement it in Xen? Why didn't
>> they rip the Qemu implementation?
>>
>With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
>kernel, KB gets 14% gain. We also did a shared PIC model which shares
> PIC state among Qemu & VMM
Ingo Molnar wrote:
> * Rusty Russell <[EMAIL PROTECTED]> wrote:
>
>
>> It's easier to write a kernel-space network driver, but it's not
>> obviously the right thing to do until we can show that an efficient
>> packet-level userspace interface isn't possible. I don't think that's
>> been done
Dong, Eddie wrote:
> Dor Laor wrote:
>
>> Do you know why the Xen guys chose to implement it in Xen? Why didn't
>> they rip the Qemu implementation?
>>
>>
> With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
> kernel, KB gets 14% gain. We also did a shared PIC model which sha
Dor Laor wrote:
>
> Do you know why the Xen guys chose to implement it in Xen? Why didn't
> they rip the Qemu implementation?
>
With PIC in Xen, CPU2K gets 6.5% performance gain in old 1000HZ linux
kernel, KB gets 14% gain. We also did a shared PIC model which shares
PIC state among Qemu & VMM with l
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Rusty Russell <[EMAIL PROTECTED]> wrote:
>
> > It's easier to write a kernel-space network driver, but it's not
> > obviously the right thing to do until we can show that an efficient
> > packet-level userspace interface isn't possible. I don't thi
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> [...] But the difference in cruftiness between kvm and qemu code
> should not enter into the discussion of where to do things.
i agree that it doesn't enter the discussion for the *PIC question, but
it very much enters the discussion for the question th
Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>
>
>>> so right now the only option for a clean codebase is the KVM
>>> in-kernel code.
>>>
>> I strongly disagree with this.
>>
>
> are you disagreeing with my statement that the KVM kernel-side code is
> the only clean
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> > so right now the only option for a clean codebase is the KVM
> > in-kernel code.
>
> I strongly disagree with this.
are you disagreeing with my statement that the KVM kernel-side code is
the only clean codebase here? To me this is a clear fact :)
I
Ingo Molnar wrote:
> so right now the only option for a clean codebase is the KVM in-kernel
> code.
I strongly disagree with this. Bad code in userspace is not an excuse
for shoving stuff into the kernel, where maintaining it is much more
expensive, and the cause of a mistake can be system cra
* Rusty Russell <[EMAIL PROTECTED]> wrote:
> It's easier to write a kernel-space network driver, but it's not
> obviously the right thing to do until we can show that an efficient
> packet-level userspace interface isn't possible. I don't think that's
> been done, and it would be interesting
Rusty Russell wrote:
> You didn't quote Anthony's point about "it's more about there not being
> good enough userspace interfaces to do network IO."
>
> It's easier to write a kernel-space network driver, but it's not
> obviously the right thing to do until we can show that an efficient
> packet-le
Anthony Liguori wrote:
>
> Yeah, I think this is a good point. If we're going to push the APIC
> into the kernel, we might as well put the PIT there too. The timing
> stuff is an absolute mess in QEMU since it wants to get a fast high
> res clock but isn't aware of things like CPU migration.
Ingo Molnar wrote:
> * Avi Kivity <[EMAIL PROTECTED]> wrote:
>
>
>>> It still exists in userspace. Having the code duplication
>>> (especially when it's not the same code base) is unfortunate.
>>>
>> This remains true.
>>
>
> but it's the wrong argument. Of course there's duplicate
On Wed, 2007-04-04 at 23:21 +0200, Ingo Molnar wrote:
> * Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
> > But why is it a good thing to do PV drivers in the kernel? You lose
> > flexibility and functionality to gain performance. [...]
>
> in Linux a kernel-space network driver can still be tun
>But why is it a good thing to do PV drivers in the kernel? You lose
>flexibility and functionality to gain performance. Really, it's more
>about there not being good enough userspace interfaces to do network
>IO.
>
>> The lapic/PIC code
>> should also be available in Qemu for OSs that dont have
>Avi Kivity wrote:
>> Nakajima, Jun wrote:
>>> I compared the performance on Xen and KVM for kernel build using the
>>> same guest image. Looks like KVM was (kvm-17) three times slower as
>>> far as we tested, and that high load of qemu was one of the
>>> symptoms. We are looking at the shadow code
* Anthony Liguori <[EMAIL PROTECTED]> wrote:
> But why is it a good thing to do PV drivers in the kernel? You lose
> flexibility and functionality to gain performance. [...]
in Linux a kernel-space network driver can still be tunneled over
user-space code, and hence you can add arbitrary add-
* Anthony Liguori <[EMAIL PROTECTED]> wrote:
> > Keeping the apic in the kernel simplifies this with the cost of
> > maintaining an apic/pic implementation.
>
> Hrm, this is definitely starting to sound like a PITA to deal with.
> Maybe in-kernel platform devices are unavoidable :-/
yes, ver
* Avi Kivity <[EMAIL PROTECTED]> wrote:
> > It still exists in userspace. Having the code duplication
> > (especially when it's not the same code base) is unfortunate.
>
> This remains true.
but it's the wrong argument. Of course there's duplicate functionality,
and that's _good_ because it
Gregory Haskins wrote:
On Wed, Apr 4, 2007 at 10:20 AM, in message <[EMAIL PROTECTED]>,
> Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
>> The devices are already written to take a set_irq function. Instead of
>> hijacking the emulated PIC device, I think it would be better
Nakajima, Jun wrote:
> Avi Kivity wrote:
>
>> Nakajima, Jun wrote:
>>
>>> Most of H/W-virtualization capable processors out there don't support
>>> that feature today. I think the decision (kvm or qemu) should be done
>>> based on performance data. I'm not worried about maintenance issues;
>>> On Wed, Apr 4, 2007 at 10:20 AM, in message <[EMAIL PROTECTED]>,
Anthony Liguori <[EMAIL PROTECTED]> wrote:
>
> The devices are already written to take a set_irq function. Instead of
> hijacking the emulated PIC device, I think it would be better if in
> pc.c, we just conditionally created
Avi Kivity wrote:
> Nakajima, Jun wrote:
>> I compared the performance on Xen and KVM for kernel build using the
>> same guest image. Looks like KVM was (kvm-17) three times slower as
>> far as we tested, and that high load of qemu was one of the
>> symptoms. We are looking at the shadow code, but
Nakajima, Jun wrote:
> I compared the performance on Xen and KVM for kernel build using the
> same guest image. Looks like KVM was (kvm-17) three times slower as far
> as we tested, and that high load of qemu was one of the symptoms. We are
> looking at the shadow code, but the load of qemu looks v
Anthony Liguori wrote:
>>
>>> BTW, I see CPU utilization of qemu is almost always 99% in the top
>>> command when I run kernel build in an x86-64 Linux guest.
>>>
>>
>
> qemu would be 99% even if all the time is being spent in the guest
> context.
>
> If the user time is high, an oprofile r
Avi Kivity wrote:
> Nakajima, Jun wrote:
>> Most of H/W-virtualization capable processors out there don't support
>> that feature today. I think the decision (kvm or qemu) should be done
>> based on performance data. I'm not worried about maintenance issues;
>> the APIC code is not expected to chan
Dor Laor wrote:
If we do this, then we can probably just handle the TPR as a special
case anyway and not bother returning to userspace when the TPR is
updated through MMIO. That saves the round trip without adding
emulation complexity.
>>> That means the emulation
Dor Laor wrote:
> This pushes towards in kernel apic too. Can't see how we avoid it.
>
>
Does it really? IIUC, we would avoid TPR traps entirely and would
just need to synchronize the TPR whenever we go down to userspace.
>>> It's a bit more comp
Nakajima, Jun wrote:
> Most of H/W-virtualization capable processors out there don't support
> that feature today. I think the decision (kvm or qemu) should be done
> based on performance data. I'm not worried about maintenance issues; the
> APIC code is not expected to change frequently. I'm a bit
This pushes towards in kernel apic too. Can't see how we avoid it.
>>>
>>> Does it really? IIUC, we would avoid TPR traps entirely and would
>>> just need to synchronize the TPR whenever we go down to userspace.
>>>
>>
>> It's a bit more complex than that, as userspace would need to
>>>
>>> If we do this, then we can probably just handle the TPR as a special
>>> case anyway and not bother returning to userspace when the TPR is
>>> updated through MMIO. That saves the round trip without adding
>>> emulation complexity.
>>
>> That means the emulation is split among user space a
Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>>>
>>> This pushes towards in kernel apic too. Can't see how we avoid it.
>>>
>>
>> Does it really? IIUC, we would avoid TPR traps entirely and would
>> just need to synchronize the TPR whenever we go down to userspace.
>>
>
> It's a bit more co
Anthony Liguori wrote:
>>>
>>> If we do this, then we can probably just handle the TPR as a special
>>> case anyway and not bother returning to userspace when the TPR is
>>> updated through MMIO. That saves the round trip without adding
>>> emulation complexity.
>>
>> That means the emulation i
Avi Kivity wrote:
> Anthony Liguori wrote:
>>> Maybe some brave soul can hack kvm to patch the new instruction in
>>> place of the mmio instruction Windows uses to bang on the tpr.
>>
>> It seems like that shouldn't be too hard assuming that the MMIO
>> instructions are <= the new CR8 instruction
Anthony Liguori wrote:
>
>>
>> This pushes towards in kernel apic too. Can't see how we avoid it.
>>
>
> Does it really? IIUC, we would avoid TPR traps entirely and would
> just need to synchronize the TPR whenever we go down to userspace.
>
It's a bit more complex than that, as userspace wou
Anthony Liguori wrote:
>> Maybe some brave soul can hack kvm to patch the new instruction in
>> place of the mmio instruction Windows uses to bang on the tpr.
>
> It seems like that shouldn't be too hard assuming that the MMIO
> instructions are <= the new CR8 instruction. It would require knowi
Dor Laor wrote:
This is for the TPR right? VT has special logic to handle TPR
virtualization doesn't it? I thought SVM did too...
>>> Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
>>> also supports CR8 in 32-bit mode through a new instruction enco
Avi Kivity wrote:
> Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> Anthony Liguori wrote:
>>>> Then again, are we really positive that we have to move the APIC
>>>> into the kernel? A lot of things will get much more complicated.
>>>
>>> The following arguments are in favor:
>>> - allow in-
>>> This is for the TPR right? VT has special logic to handle TPR
>>> virtualization doesn't it? I thought SVM did too...
>>>
>>
>> Yes, the TPR. Both VT and SVM virtualize CR8 in 64-bit mode. SVM
>> also supports CR8 in 32-bit mode through a new instruction encoding,
>> but
>> nobody uses that
Anthony Liguori wrote:
> Avi Kivity wrote:
>> Anthony Liguori wrote:
>>>
>>> Then again, are we really positive that we have to move the APIC
>>> into the kernel? A lot of things will get much more complicated.
>>
>> The following arguments are in favor:
>> - allow in-kernel paravirt drivers to i
Avi Kivity wrote:
> Anthony Liguori wrote:
>>
>> Then again, are we really positive that we have to move the APIC into
>> the kernel? A lot of things will get much more complicated.
>
> The following arguments are in favor:
> - allow in-kernel paravirt drivers to interrupt the guest without
> go
Anthony Liguori wrote:
>
> Then again, are we really positive that we have to move the APIC into
> the kernel? A lot of things will get much more complicated.
The following arguments are in favor:
- allow in-kernel paravirt drivers to interrupt the guest without going
through qemu (which involv
Avi Kivity wrote:
> Gregory Haskins wrote:
>> Hi Dor,
>> Please find a patch attached for your review which adds support for
>> dynamic substitution of the PIC/APIC code to QEMU. This will allow us
>> to selectively chose the KVM in-kernel apic emulation vs the QEMU
>> user-space apic emulatio
Gregory Haskins wrote:
> Hi Dor,
> Please find a patch attached for your review which adds support for dynamic
> substitution of the PIC/APIC code to QEMU. This will allow us to selectively
> chose the KVM in-kernel apic emulation vs the QEMU user-space apic emulation.
> Support for both is k
Hi Dor,
Please find a patch attached for your review which adds support for dynamic
substitution of the PIC/APIC code to QEMU. This will allow us to selectively
chose the KVM in-kernel apic emulation vs the QEMU user-space apic emulation.
Support for both is key to allow "--no-kvm" type oper
81 matches