Rusty Russell wrote:
> On Thu, 2007-04-05 at 10:17 +0300, Avi Kivity wrote:
>   
>> Rusty Russell wrote:
>>     
>>> You didn't quote Anthony's point about "it's more about there not being
>>> good enough userspace interfaces to do network IO."
>>>
>>> It's easier to write a kernel-space network driver, but it's not
>>> obviously the right thing to do until we can show that an efficient
>>> packet-level userspace interface isn't possible.  I don't think that's
>>> been done, and it would be interesting to try.
>>>   
>>>       
>> In the case of networking, the copyful interfaces on receive are driven 
>> by the hardware not knowing how to split the header from the data.  On 
>> transmit I agree, it could be made copyless from userspace (something 
>> like sendfilev, only not file oriented).
>>     
>
> Hi Avi,
>
>       I don't think you've thought about this very hard.  The receive copy is
> completely independent of whether the packet is going to the guest via
> a kernel driver or via userspace, so not relevant.
>   

A packet received in the kernel cannot be made available to userspace in
a safe manner without a copy: it will not be aligned to page boundaries,
so userspace cannot examine the packet until after one copy has
occurred.  After userspace has determined what to do with the packet,
another copy must take place to get it there.
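
To make the alignment point concrete, here is a minimal sketch.  It
assumes 4 KiB pages, and the helper name is made up for illustration.
Since memory can only be mapped in whole pages, mapping a packet that
sits in the middle of a page also exposes whatever else shares that
page:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: number of bytes OUTSIDE [addr, addr + len) that
 * would become visible if every page touched by that range were mapped
 * into another address space.
 */
size_t bytes_exposed_by_mapping(uintptr_t addr, size_t len)
{
    uintptr_t mask = ~(uintptr_t)(PAGE_SIZE - 1);
    uintptr_t first, last;

    if (len == 0)
        return 0;

    first = addr & mask;                        /* round start down */
    last = (addr + len + PAGE_SIZE - 1) & mask; /* round end up */

    return (size_t)(last - first) - len;
}
```

A 1500-byte frame at offset 66 within a page would expose 2596
unrelated bytes; only a buffer that starts on a page boundary and
covers whole pages exposes nothing.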

There's a counterexample, mmapped sockets, but that works only when all 
packets arriving on a card are exposed to the same process.  This is 
useful for tcpdump or for what you outline below but is hardly generic.

>       And if all packets from the card are going to the guest, you can
> deliver directly.  Userspace or kernel, no difference.
>   

That is not the common case.  Nor is it true when there is a mismatch
between the card's capabilities and the guest's expectations and
constraints.  For example, guest memory is not physically contiguous, so
a NIC that won't do scatter/gather will require bouncing (or an IOMMU,
but that's not here yet).
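
A rough sketch of the scatter/gather issue, under stated assumptions:
4 KiB pages, a flat host_pfn[] table mapping guest page numbers to host
page frames, and the struct and function names below are all invented
for illustration.  A buffer that is contiguous in guest-physical space
breaks up into several host segments:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical scatter/gather element: one host-contiguous segment. */
struct sg_entry {
    uint64_t host_addr;
    uint32_t len;
};

/*
 * Split the guest-physical range [gpa, gpa + len) into host-contiguous
 * segments.  host_pfn[g] is the host page frame backing guest page g
 * (a stand-in for whatever mapping the hypervisor really keeps).
 * Returns the number of segments written, or 0 if max_sg is too small
 * (or len is 0).
 */
size_t build_sg_list(const uint64_t *host_pfn, uint64_t gpa, uint32_t len,
                     struct sg_entry *sg, size_t max_sg)
{
    size_t n = 0;

    while (len > 0) {
        uint64_t gfn = gpa / PAGE_SIZE;
        uint32_t off = (uint32_t)(gpa % PAGE_SIZE);
        uint32_t chunk = PAGE_SIZE - off;   /* bytes left in this page */
        uint64_t haddr;

        if (chunk > len)
            chunk = len;
        haddr = host_pfn[gfn] * PAGE_SIZE + off;

        if (n > 0 && sg[n - 1].host_addr + sg[n - 1].len == haddr) {
            /* Host pages happen to be adjacent: extend the segment. */
            sg[n - 1].len += chunk;
        } else {
            if (n == max_sg)
                return 0;
            sg[n].host_addr = haddr;
            sg[n].len = chunk;
            n++;
        }
        gpa += chunk;
        len -= chunk;
    }
    return n;
}
```

A NIC without scatter/gather can only take the buffer directly when
this yields a single segment; anything more has to be copied into one
contiguous bounce buffer first.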

>       And we have a "sendfilev not file oriented": it's called "writev" 8)
>   

writev() cannot be made copyless for networking.  One needs either an
async interface, so the kernel can complete the write after the NIC acks
the DMA transfer, or a kernel driver.
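
The reason is in writev()'s contract: once it returns, the caller may
reuse the buffers, so the kernel must already hold its own copy.  A
small sketch of that contract follows.  It uses an AF_UNIX socketpair
as a stand-in for a real network path (an assumption made so it runs
anywhere), and the function name is made up:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send a header and a payload with one writev(), overwrite both buffers
 * as soon as the call returns, then read the bytes back from the peer.
 * Returns the number of bytes read into out, or -1 on error.
 */
ssize_t writev_then_scribble(char *out, size_t outlen)
{
    int sv[2];
    char hdr[] = "HDR:";
    char payload[] = "payload";
    struct iovec iov[2] = {
        { .iov_base = hdr,     .iov_len = sizeof(hdr) - 1 },
        { .iov_base = payload, .iov_len = sizeof(payload) - 1 },
    };
    ssize_t sent, got;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    sent = writev(sv[0], iov, 2);

    /* writev() has returned, so the buffers are ours to reuse. */
    memset(hdr, 'X', sizeof(hdr) - 1);
    memset(payload, 'X', sizeof(payload) - 1);

    got = (sent < 0) ? -1 : read(sv[1], out, outlen);

    close(sv[0]);
    close(sv[1]);
    return got;
}
```

The peer still reads the original bytes, not the X's, which is only
possible because the data was copied before writev() returned.  A
copyless transmit needs some way to tell the caller later that the
buffers are free again.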

>       An in-kernel driver can avoid system call overhead and page references.
> But a better tap device helps more than just KVM.
>   

I'll believe it when I see it.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.


_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel