Rusty Russell wrote:
On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote:
Rusty Russell wrote:
You didn't quote Anthony's point about "it's more about there not being
good enough userspace interfaces to do network IO."

It's easier to write a kernel-space network driver, but it's not
obviously the right thing to do until we can show that an efficient
packet-level userspace interface isn't possible.  I don't think that's
been done, and it would be interesting to try.
In the case of networking, the copyful interfaces on receive are driven by the hardware not knowing how to split the header from the data. On transmit I agree, it could be made copyless from userspace (something like sendfilev, only not file oriented).

Hi Avi,

        I don't think you've thought about this very hard.  The receive copy is
completely independent of whether the packet is going to the guest via
a kernel driver or via userspace, so not relevant.

A packet received in the kernel cannot be made available to userspace in a safe manner without a copy, as it will not be aligned with page boundaries, so userspace cannot examine the packet until after one copy has occurred. After userspace has determined what to do with the packet, another copy must take place to get it there.
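
To make the page-alignment point concrete, here is a rough sketch in Python. It is purely illustrative: it assumes a 4096-byte page and a hypothetical receive buffer in which packets are packed back to back, and `pages_touched` and `shared_pages` are names made up for this example, not kernel interfaces. It lists the pages that hold bytes from more than one packet, which are the pages that could not be mapped into a single process without exposing some other packet's data.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_touched(offset, length, page_size=PAGE_SIZE):
    """Return the page indices covered by the bytes [offset, offset + length)."""
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


def shared_pages(packet_lengths, page_size=PAGE_SIZE):
    """Map each page that holds more than one packet to the packets on it.

    Packets are assumed to be packed back to back, starting at offset 0.
    """
    owners = {}
    offset = 0
    for idx, length in enumerate(packet_lengths):
        for page in pages_touched(offset, length, page_size):
            owners.setdefault(page, set()).add(idx)
        offset += length
    return {page: sorted(pkts) for page, pkts in owners.items() if len(pkts) > 1}


# Three 1500-byte frames: frames 0, 1 and 2 all have bytes in page 0.
print(shared_pages([1500, 1500, 1500]))
```

Only when every packet starts on a page boundary and fills whole pages does the result come back empty, which is not what a NIC delivers for ordinary frames.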

There's a counterexample, mmapped sockets, but that works only when all packets arriving on a card are exposed to the same process. This is useful for tcpdump or for what you outline below but is hardly generic.

        And if all packets from the card are going to the guest, you can
deliver directly.  Userspace or kernel, no difference.

That is not the common case. Nor is it true when there is a mismatch between the card's capabilities and guest expectations and constraints. For example, guest memory is not physically contiguous, so a NIC that won't do scatter/gather will require bouncing (or an IOMMU, but that's not here yet).
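
The scatter/gather constraint can be sketched as a small decision function. This is a simplified model, not driver code: `is_contiguous` and `plan_dma` are hypothetical names, a buffer is described as a list of (physical address, length) segments, and the IOMMU case is left out.

```python
def is_contiguous(segments):
    """True if each (phys_addr, length) segment starts where the previous one ends."""
    return all(a + n == b for (a, n), (b, _) in zip(segments, segments[1:]))


def plan_dma(segments, nic_has_sg):
    """Choose how to hand a buffer to the NIC: 'direct', 'scatter-gather' or 'bounce'."""
    if is_contiguous(segments):
        return "direct"
    # Without scatter/gather the data must first be copied into one
    # contiguous bounce buffer, which costs an extra copy.
    return "scatter-gather" if nic_has_sg else "bounce"


# Two guest pages that are not adjacent in physical memory.
guest_buffer = [(0x1000, 4096), (0x5000, 4096)]
print(plan_dma(guest_buffer, nic_has_sg=False))
```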

        And we have a "sendfilev not file oriented": it's called "writev" 8)

writev() cannot be made copyless for networking: it returns synchronously, and the caller may reuse its buffers as soon as it does. One needs an async interface, so the kernel can complete the write after the NIC acks the DMA transfer, or a kernel driver.
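
A small userspace experiment shows the semantics in question. It is only a sketch under stated assumptions: a local `socket.socketpair()` stands in for a real network path, and `gather_write_then_overwrite` is a name invented for this example. The payload buffer is overwritten right after `os.writev()` returns, yet the receiver still sees the original bytes, so the kernel must have taken its own copy before returning.

```python
import os
import socket


def gather_write_then_overwrite():
    """Gather-write two buffers, overwrite one, and return what the peer received."""
    tx, rx = socket.socketpair()
    try:
        header = bytearray(b"HDR:")
        payload = bytearray(b"guest packet data")

        # Gather write: both buffers go out in a single system call.
        sent = os.writev(tx.fileno(), [header, payload])

        # writev() has returned, so the caller may reuse its buffer at once.
        payload[:] = b"X" * len(payload)

        received = rx.recv(64)
        return sent, received, bytes(payload)
    finally:
        tx.close()
        rx.close()


sent, received, payload_after = gather_write_then_overwrite()
print(sent, received, payload_after)
```

For the kernel to skip that copy, the caller would have to leave the buffer alone until told that the transfer has finished, which is the asynchronous completion mentioned above.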

        An in-kernel driver can avoid system call overhead and page references.
But a better tap device helps more than just KVM.

I'll believe it when I see it.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.
