Julian Stecklina <jstec...@os.inf.tu-dresden.de> writes:

> On 05/28/2013 12:10 PM, Luke Gorrie wrote:
>> On 27 May 2013 11:34, Stefan Hajnoczi <stefa...@redhat.com> wrote:
>> 
>>     vhost_net is about connecting a virtio-net speaking process to a
>>     tun-like device.  The problem you are trying to solve is connecting a
>>     virtio-net speaking process to Snabb Switch.
>> 
>> 
>> Yep!
>
> Since I am on a similar path as Luke, let me share another idea.
>
> What about extending qemu to allow PCI device models to be
> implemented in another process?

We aren't going to support any interface that enables out of tree
devices.  This is just plugins in a different form with even more
downsides.  You cannot easily keep track of dirty info, and the
guest-physical to host address translation is difficult to keep in sync
(imagine the complexity of memory hotplug).

Basically, it's easy to hack up but extremely hard to do something that
works correctly overall.

There isn't a compelling reason to implement something like this other
than avoiding getting code into QEMU.  Best to just submit your device
to QEMU for inclusion.

If you want to avoid copying in a vswitch, better to use something like
vmsplice as I outlined in another thread.

> This is not as hard as it may sound.
> qemu would open a domain socket to this process and map VM memory over
> to the other side. This can be accomplished by having file descriptors
> in qemu to VM memory (reusing -mem-path code) and passing those over the
> domain socket. The other side can then just mmap them. The socket would
> also be used for configuration and I/O by the guest on the PCI
> I/O/memory regions. You could also use this to do IRQs or use eventfds,
> whatever works better.
>
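
[Illustration of the mechanism described above: a minimal sketch, not
QEMU code.  It assumes Python 3.9+ on a Unix host.  A temporary file
stands in for the file-backed VM memory that -mem-path would provide,
and a socketpair stands in for the domain socket between qemu and the
external device process.  The names share_guest_memory and
GUEST_RAM_SIZE are hypothetical.]

```python
import mmap
import os
import socket
import tempfile

GUEST_RAM_SIZE = 1 << 20  # hypothetical 1 MiB of "guest RAM"


def share_guest_memory():
    # "qemu" side: file-backed memory, mapped shared (the Unix default
    # for mmap.mmap), standing in for -mem-path backed guest RAM.
    backing = tempfile.TemporaryFile()
    backing.truncate(GUEST_RAM_SIZE)
    qemu_view = mmap.mmap(backing.fileno(), GUEST_RAM_SIZE)

    # Stand-in for the domain socket between qemu and the device process.
    qemu_sock, switch_sock = socket.socketpair(socket.AF_UNIX,
                                               socket.SOCK_STREAM)

    # Pass the memory file descriptor over the socket (SCM_RIGHTS).
    socket.send_fds(qemu_sock, [b"ram"], [backing.fileno()])

    # "switch" side: receive the descriptor and map the same pages.
    msg, fds, _flags, _addr = socket.recv_fds(switch_sock, 16, 1)
    switch_view = mmap.mmap(fds[0], GUEST_RAM_SIZE)

    # A write through one mapping is visible through the other, so the
    # device process can read and write packet data in place.
    qemu_view[0:5] = b"hello"
    packet = bytes(switch_view[0:5])
    switch_view[4096:4100] = b"pong"
    reply = bytes(qemu_view[4096:4100])

    # Release mappings, descriptors, and sockets.
    switch_view.close()
    qemu_view.close()
    os.close(fds[0])
    qemu_sock.close()
    switch_sock.close()
    backing.close()
    return msg, packet, reply
```

[The sketch covers only the memory-sharing step.  Configuration, PCI
region accesses, interrupts, dirty tracking, and memory hotplug, which
are the concerns raised in the reply, are not modelled.]
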
> To have a zero copy userspace switch, the switch would offer virtio-net
> devices to any qemu that wants to connect to it and implement the
> complete device logic itself. Since it has access to all guest memory,
> it can just do memcpy for packet data. Of course, this only works for
> 64-bit systems, because you need vast amounts of virtual address space.
> In my experience, doing this in userspace is _way less painful_.
>
> If you can get away with polling in the switch, the overhead of doing all
> this in userspace is zero. And as long as you can rate-limit explicit
> notifications over the socket even that overhead should be okay.
>
> Opinions?

I don't see any compelling reason to do something like this.  It's
jumping through a tremendous number of hoops to avoid putting code that
belongs in QEMU in tree.

Regards,

Anthony Liguori

>
> Julian
