Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-15 Thread Rusty Russell
On Thu, 2007-04-12 at 06:32 +0300, Avi Kivity wrote: I hadn't considered an always-blocking (or unbuffered) networking API. It's very counter to current APIs, but does make sense with things like syslets. Without syslets, I don't think it's very useful as you need some artificial threads to

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-15 Thread Avi Kivity
Rusty Russell wrote: On Thu, 2007-04-12 at 06:32 +0300, Avi Kivity wrote: I hadn't considered an always-blocking (or unbuffered) networking API. It's very counter to current APIs, but does make sense with things like syslets. Without syslets, I don't think it's very useful as you need

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-11 Thread Avi Kivity
Rusty Russell wrote: On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote: Nope. Being async is critical for copyless networking: - in the transmit path, so need to stop the sender (guest) from touching the memory until it's on the wire. This means 100% of packets sent will be blocked.

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-11 Thread Rusty Russell
On Wed, 2007-04-11 at 17:28 +0300, Avi Kivity wrote: Rusty Russell wrote: On Wed, 2007-04-11 at 07:26 +0300, Avi Kivity wrote: Nope. Being async is critical for copyless networking: With async operations, the saga continues like this: the host-side driver allocates an skb,

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: But I don't get this "we can enhance the kernel but not userspace" vibe 8( I've been waiting for network aio since ~2003. If it arrives in the next few days, I'm all for it; much more than kvm can use it

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Evgeniy Polyakov wrote: On Mon, Apr 09, 2007 at 04:38:18PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: But I don't get this "we can enhance the kernel but not userspace" vibe 8( I've been waiting for network aio since ~2003. If it arrives in the next few days, I'm all for

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 11:19:52AM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: I meant, network aio in the mainline kernel. I am aware of the various out-of-tree implementations. If potential users do not pay attention to initial implementation, it is quite hard for them to get into. But

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Evgeniy Polyakov wrote: But it looks from this discussion, that it will not prevent from changing in-kernel driver - place a hook into skb allocation path and allocate data from opposing memory - get pages from another side and put them into fragments, then copy headers into skb-data.

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 02:21:24PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: You want to implement zero-copy network device between host and guest, if I understood this thread correctly? So, for sending part, device allocates pages from receiver's memory (or from shared memory), receiver

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Evgeniy Polyakov wrote: This is what Xen does. It is actually less performant than copying, IIRC. The problem with flipping pages around is that physical addresses are cached both in the kvm mmu and in the on-chip tlbs, necessitating expensive page table walks and tlb invalidation IPIs.

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Evgeniy Polyakov
On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: Check a link please in case we are talking about different ideas: http://marc.info/?l=linux-netdev&m=112262743505711&w=2 I don't really understand what you're testing there. in particular, how can the

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Evgeniy Polyakov wrote: On Tue, Apr 10, 2007 at 03:17:45PM +0300, Avi Kivity ([EMAIL PROTECTED]) wrote: Check a link please in case we are talking about different ideas: http://marc.info/?l=linux-netdev&m=112262743505711&w=2 I don't really understand what you're testing there. in

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Rusty Russell
On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote: Moreover, some things just don't lend themselves to a userspace abstraction. If we want to expose tso (tcp segmentation offload), we can easily do so with a kernel driver since the kernel interfaces are all tso aware. Tacking on tso

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-10 Thread Avi Kivity
Rusty Russell wrote: On Mon, 2007-04-09 at 16:38 +0300, Avi Kivity wrote: Moreover, some things just don't lend themselves to a userspace abstraction. If we want to expose tso (tcp segmentation offload), we can easily do so with a kernel driver since the kernel interfaces are all tso

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-09 Thread Avi Kivity
Rusty Russell wrote: On Sun, 2007-04-08 at 08:36 +0300, Avi Kivity wrote: Rusty Russell wrote: Hi Avi, I don't think you've thought about this very hard. The receive copy is completely independent of whether the packet is going to the guest via a kernel driver or via

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-09 Thread Rusty Russell
On Mon, 2007-04-09 at 10:10 +0300, Avi Kivity wrote: Rusty Russell wrote: I'm a little puzzled by your response. Hmm... lguest's userspace network frontend does exactly as many copies as Ingo's in-host-kernel code. One from the Guest, one to the Guest. kvm pvnet is suboptimal

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-09 Thread Avi Kivity
Rusty Russell wrote: On Mon, 2007-04-09 at 10:10 +0300, Avi Kivity wrote: Rusty Russell wrote: I'm a little puzzled by your response. Hmm... lguest's userspace network frontend does exactly as many copies as Ingo's in-host-kernel code. One from the Guest, one to the

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-08 Thread Muli Ben-Yehuda
On Sun, Apr 08, 2007 at 08:36:14AM +0300, Avi Kivity wrote: That is not the common case. Nor is it true when there is a mismatch between the card's capabilities and guest expectations and constraints. For example, guest memory is not physically contiguous so a NIC that won't do

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-08 Thread Rusty Russell
On Sun, 2007-04-08 at 08:36 +0300, Avi Kivity wrote: Rusty Russell wrote: Hi Avi, I don't think you've thought about this very hard. The receive copy is completely independent of whether the packet is going to the guest via a kernel driver or via userspace, so not relevant.

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-07 Thread Avi Kivity
Rusty Russell wrote: On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote: Rusty Russell wrote: You didn't quote Anthony's point about it's more about there not being good enough userspace interfaces to do network IO. It's easier to write a kernel-space network driver, but it's not

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-06 Thread Ingo Molnar
* Ingo Molnar [EMAIL PROTECTED] wrote: * Anthony Liguori [EMAIL PROTECTED] wrote: [...] Did Linux have extremely high quality code in 1994? yes! It was crucial to strive for extremely high quality code all the time. That was the only way to grow Linux's codebase, which was ~300,000

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-06 Thread Ingo Molnar
* Rusty Russell [EMAIL PROTECTED] wrote: prototyping new kernel APIs to implement user-space network drivers, on a crufty codebase is not something that should be done lightly. I think you overestimate my radicalism. I was considering readv() and writev() on the tap device. ok :-)

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Ingo Molnar
* Avi Kivity [EMAIL PROTECTED] wrote: [...] But the difference in cruftiness between kvm and qemu code should not enter into the discussion of where to do things. i agree that it doesn't enter the discussion for the *PIC question, but it very much enters the discussion for the question that

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Avi Kivity
Ingo Molnar wrote: so right now the only option for a clean codebase is the KVM in-kernel code. I strongly disagree with this. Bad code in userspace is not an excuse for shoving stuff into the kernel, where maintaining it is much more expensive, and the cause of a mistake can be system

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Ingo Molnar
* Ingo Molnar [EMAIL PROTECTED] wrote: * Rusty Russell [EMAIL PROTECTED] wrote: It's easier to write a kernel-space network driver, but it's not obviously the right thing to do until we can show that an efficient packet-level userspace interface isn't possible. I don't think

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Avi Kivity
Ingo Molnar wrote: * Avi Kivity [EMAIL PROTECTED] wrote: so right now the only option for a clean codebase is the KVM in-kernel code. I strongly disagree with this. are you disagreeing with my statement that the KVM kernel-side code is the only clean codebase here? To me

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Ingo Molnar
* Avi Kivity [EMAIL PROTECTED] wrote: so right now the only option for a clean codebase is the KVM in-kernel code. I strongly disagree with this. are you disagreeing with my statement that the KVM kernel-side code is the only clean codebase here? To me this is a clear fact :) I only

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Ingo Molnar
* Rusty Russell [EMAIL PROTECTED] wrote: It's easier to write a kernel-space network driver, but it's not obviously the right thing to do until we can show that an efficient packet-level userspace interface isn't possible. I don't think that's been done, and it would be interesting to

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Avi Kivity
Rusty Russell wrote: You didn't quote Anthony's point about it's more about there not being good enough userspace interfaces to do network IO. It's easier to write a kernel-space network driver, but it's not obviously the right thing to do until we can show that an efficient packet-level

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Anthony Liguori
Ingo Molnar wrote: * Rusty Russell [EMAIL PROTECTED] wrote: It's easier to write a kernel-space network driver, but it's not obviously the right thing to do until we can show that an efficient packet-level userspace interface isn't possible. I don't think that's been done, and it would

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-05 Thread Rusty Russell
On Thu, 2007-04-05 at 13:36 +0200, Ingo Molnar wrote: prototyping new kernel APIs to implement user-space network drivers, on a crufty codebase is not something that should be done lightly. I think you overestimate my radicalism. I was considering readv() and writev() on the tap device.

Re: [kvm-devel] QEMU PIC indirection patch for in-kernel APIC work

2007-04-04 Thread Rusty Russell
On Wed, 2007-04-04 at 23:21 +0200, Ingo Molnar wrote: * Anthony Liguori [EMAIL PROTECTED] wrote: But why is it a good thing to do PV drivers in the kernel? You lose flexibility and functionality to gain performance. [...] in Linux a kernel-space network driver can still be tunneled