Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
Anthony Liguori wrote:

Considering VEPA enabled hardware doesn't exist today and the standards aren't even finished being defined, I don't think it's a really strong use case ;-)

Anthony, VEPA-enabled NIC hardware is alive and kicking, maybe even in your onboard 1 Gb/s NIC: the Intel 82576 (Linux igb network driver) supports SR-IOV VEPA:

1. A register exists which dictates whether the NIC switches between the different VFs or just sends every packet transmitted from a VF to the uplink PF.
2. Logic exists which makes sure a downstream packet (incoming from the network) is never sent to a VF that has the source MAC of this packet, which accounts for multicast support.

To learn more about that, see the Intel 82576 SR-IOV Driver Companion Guide, available on the web.

Or.

-- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
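The second point above (a downstream frame is never reflected to the VF that sent it) can be sketched in a few lines. This is only an illustration of the rule as described in the message, not the 82576's actual logic; the function names and the VF table are hypothetical.

```python
from typing import Dict, List


def is_group_address(mac: str) -> bool:
    """True for multicast and broadcast MACs (group bit set in the first octet)."""
    return int(mac.split(":")[0], 16) & 1 == 1


def downstream_targets(src_mac: str, dst_mac: str, vf_macs: Dict[str, str]) -> List[str]:
    """Pick the VFs that should receive a frame arriving from the uplink.

    vf_macs maps a hypothetical VF name to its MAC address. A VF whose MAC
    equals the frame's source MAC is skipped, so a frame hairpinned by the
    external switch is never delivered back to its own sender.
    """
    src = src_mac.lower()
    dst = dst_mac.lower()
    targets = []
    for vf, mac in vf_macs.items():
        mac = mac.lower()
        if mac == src:
            continue  # the sender must not see its own multicast/broadcast
        if is_group_address(dst) or mac == dst:
            targets.append(vf)
    return targets
```

With two VFs, a broadcast sent by the first one and reflected by the switch would be delivered only to the second.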
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wednesday 27 January 2010, Anthony Liguori wrote:

Introducing something that is known to be problematic from a security perspective without any clear idea of what the use-case for it is is a bad idea IMHO.

vepa on existing kernels is one use-case.

Considering VEPA enabled hardware doesn't exist today and the standards aren't even finished being defined, I don't think it's a really strong use case ;-)

The hairpin turn (the part that is required on the bridge) was implemented in the Linux bridge in 2.6.32, so that is one existing implementation you can use as a peer. The VEPA mode in macvlan only made it into 2.6.33, so using the raw socket on older kernels does not give you actual VEPA semantics.

The part of the standard that is still under discussion is the management side, which is almost entirely unrelated to this question though.

With Linux-2.6.33 on both sides using raw/macvlan and bridge respectively, you can have a working VEPA setup. The only thing missing is that the hypervisor will not be able to tell the bridge to automatically enable hairpin mode (you need to do that on the bridge on a per-port basis).

Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss.

Arnd
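For the per-port hairpin step on the bridge side, a minimal sketch follows. It assumes the bridge port exposes a brport/hairpin_mode attribute under sysfs; that path, the port name and the helper names are assumptions for illustration, and writing to the real /sys needs root.

```python
from pathlib import Path


def hairpin_path(port: str, sysfs_root: str = "/sys") -> Path:
    """Location of the per-port hairpin switch (assumed sysfs layout)."""
    return Path(sysfs_root) / "class" / "net" / port / "brport" / "hairpin_mode"


def set_hairpin(port: str, enabled: bool = True, sysfs_root: str = "/sys") -> None:
    """Enable or disable hairpin mode on one bridge port.

    The port must already be enslaved to a bridge, otherwise the brport
    directory does not exist and the write fails.
    """
    hairpin_path(port, sysfs_root).write_text("1" if enabled else "0")
```

Something like set_hairpin("eth1") would have to be run once for every bridge port that faces a VEPA host, since nothing in the hypervisor does it automatically.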
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
Sridhar Samudrala wrote:

On Wed, 2010-01-27 at 22:39 +0100, Arnd Bergmann wrote:

we already have -net socket,fd and any user that passes an fd into that already knows what he wants to do with it. Making it work with raw sockets is just a natural extension to this

Didn't realize that -net socket is already there and supports TCP and UDP sockets. I will look into extending -net socket to support AF_PACKET SOCK_RAW type sockets

The original thought was that the -raw option would be integrated in a pass-through manner, that is, bypassing the qemu vlan (internal bridge). This allows qemu to use the MAC address of the SR-IOV NIC (e.g. HW VF, software macvlan) as the MAC delivered to the VM; in that sense it is pretty different from the -net socket option.

Or.
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/28/2010 07:56 AM, Michael S. Tsirkin wrote:

Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss.

Arnd

Makes sense. A simple reason you can't do vhost-net would be that you are using tcg.

Some good arguments have been raised in this thread. I really don't like making our security depend on something external to qemu that is not widely used or understood. That said, I'm not seeing a lot of great alternatives.

I definitely like -net socket better than -net raw. In the absence of an extraordinarily clever solution, I think we may be stuck with doing this.

Regards, Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/28/2010 08:13 AM, Anthony Liguori wrote:

On 01/28/2010 07:56 AM, Michael S. Tsirkin wrote:

Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss.

Arnd

Makes sense. A simple reason you can't do vhost-net would be that you are using tcg.

Some good arguments have been raised in this thread. I really don't like making our security depend on something external to qemu that is not widely used or understood.

Thinking about it, I don't think network namespaces actually provide us the security that we need. It's quite easy to break out of one if it is not being used in the context of a full container.

But this discussion belongs in netdev, I'll raise the issue there.

Regards, Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Thu, Jan 28, 2010 at 08:13:53AM -0600, Anthony Liguori wrote: On 01/28/2010 07:56 AM, Michael S. Tsirkin wrote: Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss. Arnd Makes sense. A simple reason you can't do vhost-net would be that you are using tcg. Some good arguments have been raised in this thread. I really don't like making our security depend on something external to qemu that is not widely used or understood. That said, I'm not seeing a lot of great alternatives. I definitely like -net socket better than -net raw. In the absence of an extraordinarily clever solution, I think we may be stuck with doing this. Agreed on all points. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/28/2010 08:52 AM, Michael S. Tsirkin wrote:

On Thu, Jan 28, 2010 at 08:13:53AM -0600, Anthony Liguori wrote:

On 01/28/2010 07:56 AM, Michael S. Tsirkin wrote:

Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss.

Arnd

Makes sense. A simple reason you can't do vhost-net would be that you are using tcg.

Some good arguments have been raised in this thread. I really don't like making our security depend on something external to qemu that is not widely used or understood. That said, I'm not seeing a lot of great alternatives. I definitely like -net socket better than -net raw. In the absence of an extraordinarily clever solution, I think we may be stuck with doing this.

Agreed on all points.

The scenario I'm concerned about is:

1. normal user uses libvirt to launch custom qemu instance.
2. libvirt passes an fd of a raw socket to qemu and puts the qemu process in a restricted network namespace.
3. user has another program running listening on a unix domain socket and does something to the qemu process that causes it to open the domain socket and send the fd it received from libvirt via SCM_RIGHTS.
4. user now has a raw socket that is not confined to a network namespace.

I'm trying to digest the disablenetwork thread right now. Basically though, what would be ideal is a /dev/net/ethN that we could open, and use read/write to send packets to and use ioctls to issue commands to do things like enable/disable offloads.

I understand that raw socket is the interface we have today but I think we aren't going to be able to get around the need for a restricted file descriptor vs. using process restrictions to achieve isolation.

Regards, Anthony Liguori
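The fd hand-off in that scenario is ordinary SCM_RIGHTS passing over a unix domain socket. The sketch below shows the mechanism only, using a pipe as a stand-in for the raw socket so it runs without privileges; the helper names are made up for illustration, and nothing here involves qemu, libvirt or network namespaces.

```python
import array
import os
import socket


def send_fd(sock: socket.socket, fd: int) -> None:
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))]
    sock.sendmsg([b"F"], ancillary)  # at least one byte of real data is required


def recv_fd(sock: socket.socket) -> int:
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo() -> bytes:
    """Pass the write end of a pipe across a socketpair and use the copy."""
    r, w = os.pipe()
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        send_fd(a, w)
        w2 = recv_fd(b)  # a new descriptor referring to the same open file
    os.write(w2, b"frame")
    data = os.read(r, 5)
    for fd in (r, w, w2):
        os.close(fd)
    return data
```

The receiver ends up with a descriptor for the same open file, which is why restrictions attached to the sending process do not follow the descriptor.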
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Thu, Jan 28, 2010 at 09:05:45AM -0600, Anthony Liguori wrote: On 01/28/2010 08:52 AM, Michael S. Tsirkin wrote: On Thu, Jan 28, 2010 at 08:13:53AM -0600, Anthony Liguori wrote: On 01/28/2010 07:56 AM, Michael S. Tsirkin wrote: Now, the most important use case I see for the raw socket interface in qemu is to get vhost-net and the qemu user implementation to support the same feature set. If you ask for a network setup involving a raw socket and vhost-net and the kernel can support raw sockets but for some reason fails to set up vhost-net, you should have a fallback that has the exact same semantics at a possibly significant performance loss. Arnd Makes sense. A simple reason you can't do vhost-net would be that you are using tcg. Some good arguments have been raised in this thread. I really don't like making our security depend on something external to qemu that is not widely used or understood. That said, I'm not seeing a lot of great alternatives. I definitely like -net socket better than -net raw. In the absence of an extraordinarily clever solution, I think we may be stuck with doing this. Agreed on all points. The scenario I'm concerned about is: normal user uses libvirt to launch custom qemu instance. libvirt passes an fd of a raw socket to qemu and puts the qemu process in a restricted network namespace. user has another program running listening on a unix domain socket and does something to the qemu process that causes it to open the domain socket and send the fd it received from libvirt via SCM_RIGHTS. user now has a raw socket that is not confined to a network namespace. I'm trying to digest the disablenetwork thread right now. Basically though, what would be ideal is a /dev/net/ethN that we could open, and use read/write to send packets to and use ioctls to issue commands to do things like enable/disable offloads. 
I understand that raw socket is the interface we have today but I think we aren't going to be able to get around the need for a restricted file descriptor vs. using process restrictions to achieve isolation.

So actually, this is an interesting argument in favor of turning disablenetwork from per-process, as it is now, to per-file.
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Thursday 28 January 2010, Arnd Bergmann wrote:

On Wednesday 27 January 2010, Sridhar Samudrala wrote:

On Wed, 2010-01-27 at 22:39 +0100, Arnd Bergmann wrote:

On Wednesday 27 January 2010, Anthony Liguori wrote:

I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down.

That'd work. Anthony?

The fundamental problem that I have with all of this is that we should not be introducing new network backends that are based around something only a developer is going to understand. If I'm a user and I want to use an external switch in VEPA mode, how in the world am I going to know that I'm supposed to use the -net raw backend or the -net socket backend? It might as well be the -net butterflies backend as far as a user is concerned.

My point is that we already have -net socket,fd and any user that passes an fd into that already knows what he wants to do with it. Making it work with raw sockets is just a natural extension to this, which works on all kernels and (with separate namespaces) is reasonably secure.

Didn't realize that -net socket is already there and supports TCP and UDP sockets. I will look into extending -net socket to support AF_PACKET SOCK_RAW type sockets.

Actually, Jens had a patch doing this in early 2009 already but we decided to not send that one out at the time after Or had sent his version of the raw socket interface, which was a superset. Maybe Jens can post his patch again if that still applies?

It's been a while since I last looked at it. I think it will need a bit of massaging before it will apply again.

Jens
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/28/2010 10:37 AM, Michael S. Tsirkin wrote:

So actually, this is an interesting argument in favor of turning disablenetwork from per-process as it is now to per-file.

Yup. I think we really need a file-based restriction mechanism and so far, neither disablenetwork nor network namespaces seem to do that.

I think you might be able to mitigate this with SELinux since I'm fairly certain it can prevent SCM_RIGHTS, but SELinux is not something that can be enforced within a set of applications, so we'd be relying on SELinux being enabled (honestly, unlikely) and the policy being correctly configured (unlikely in the general case at least).

Regards, Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Thu, Jan 28, 2010 at 11:58:48AM -0600, Anthony Liguori wrote: On 01/28/2010 10:37 AM, Michael S. Tsirkin wrote: So actually, this is an interesting argument in favor of turning disablenetwork from per-process as it is now to per-file. Yup. I think we really need a file-based restriction mechanism and so far, neither disablenetwork or network namespace seems to do that. I think you might be able to mitigate this with SELinux since I'm fairly certain it can prevent SCM_RIGHTS but SELinux is not something that can be enforced within a set of applications so we'd be relying on SELinux being enabled (honestly, unlikely) and the policy being correctly configured (unlikely in the general case at least). Regards, Anthony Liguori I am not convinced SELinux being disabled is a problem we necessarily need to deal with, and qemu does not verify e.g. that it is not run as root either. A more serious problem IMO is that SCM_RIGHTS might be needed for some other functionality. Regards, Anthony Liguori Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/28/2010 12:04 PM, Michael S. Tsirkin wrote: On Thu, Jan 28, 2010 at 11:58:48AM -0600, Anthony Liguori wrote: On 01/28/2010 10:37 AM, Michael S. Tsirkin wrote: So actually, this is an interesting argument in favor of turning disablenetwork from per-process as it is now to per-file. Yup. I think we really need a file-based restriction mechanism and so far, neither disablenetwork or network namespace seems to do that. I think you might be able to mitigate this with SELinux since I'm fairly certain it can prevent SCM_RIGHTS but SELinux is not something that can be enforced within a set of applications so we'd be relying on SELinux being enabled (honestly, unlikely) and the policy being correctly configured (unlikely in the general case at least). Regards, Anthony Liguori I am not convinced SELinux being disabled is a problem we necessarily need to deal with, and qemu does not verify e.g. that it is not run as root either. A more serious problem IMO is that SCM_RIGHTS might be needed for some other functionality. It would mean that libvirt is insecure unless SELinux is enabled. That's a pretty fundamental flaw IMHO. At any rate, I think we both agree that we need to figure out a solution, so that's good :-) Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Thursday 28 January 2010, Anthony Liguori wrote:

normal user uses libvirt to launch custom qemu instance. libvirt passes an fd of a raw socket to qemu and puts the qemu process in a restricted network namespace. user has another program running listening on a unix domain socket and does something to the qemu process that causes it to open the domain socket and send the fd it received from libvirt via SCM_RIGHTS.

I looked at the af_unix code and it seems to suggest that this is not possible, because you cannot bind to a socket that belongs to a different network namespace. I haven't tried it though, so I may have missed something.

Arnd
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Tue, Jan 26, 2010 at 02:50:28PM -0600, Anthony Liguori wrote: On 01/26/2010 02:47 PM, Anthony Liguori wrote: On 01/26/2010 02:40 PM, Sridhar Samudrala wrote: This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here. http://thread.gmane.org/gmane.linux.network/150308 Signed-off-by: Sridhar Samudralas...@us.ibm.com See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. I thought this was cleared already: vepa support is the requirement here. Existing tap solution requires management of host linux networking which some users would rather avoid. The only use-case I know of is macvlan but macvtap addresses this functionality while not introduce the rather nasty security problems associated with a raw backend. I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). macvlan also requires hardware support, packet socket can work with any network card in promisc mode. Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user. Regards, Anthony Liguori I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? 
-- MST
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wednesday 27 January 2010, Michael S. Tsirkin wrote:

I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). macvlan also requires hardware support, packet socket can work with any network card in promisc mode.

To be clear, macvlan does not require hardware support; it will happily put cards into promiscuous mode if they don't support multiple MAC addresses.

I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge?

I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down.

If you really want to let qemu open the socket itself, -net socket,raw=eth0 is probably closer to what you want than a new -net xxx option.

Arnd
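The wrapper idea could look roughly like the sketch below: a privileged launcher opens an AF_PACKET socket bound to one interface and passes the descriptor number on the command line. The helper names are hypothetical, opening the socket needs CAP_NET_RAW, the private-namespace step is left out, and whether -net socket,fd= accepts such a socket is exactly what is being proposed in this thread, so treat the argument list as an assumption.

```python
import socket
from typing import List

ETH_P_ALL = 0x0003  # receive every protocol, as in <linux/if_ether.h>


def open_raw_socket(ifname: str) -> socket.socket:
    """Open a packet socket bound to one interface (Linux only, needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
    sock.bind((ifname, 0))  # protocol 0 keeps the one given at socket creation
    sock.set_inheritable(True)  # let the fd survive exec of the child process
    return sock


def qemu_net_args(fd: int) -> List[str]:
    """Command-line fragment handing an already-open socket to qemu.

    Assumes the proposed behaviour where -net socket,fd=N works with a
    raw socket; stock qemu at the time only expected TCP/UDP sockets here.
    """
    return ["-net", "nic", "-net", "socket,fd=%d" % fd]
```

A launcher would call open_raw_socket("eth0"), drop privileges, and exec qemu with qemu_net_args(sock.fileno()) appended, so qemu itself never needs the right to open raw sockets.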
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, Jan 27, 2010 at 10:34:35AM +0100, Arnd Bergmann wrote: On Wednesday 27 January 2010, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). macvlan also requires hardware support, packet socket can work with any network card in promisc mode. To be clear, macvlan does not require hardware support, it will happily put cards into promiscous mode if they don't support multiple mac addresses. I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down. That'd work. Anthony? If you really want to let qemu open the socket itself, -net socket,raw=eth0 is probably closer to what you want than a new -net xxx option. Arnd So again if implemented this probably should be -net socket,raw,loopback=eth0 or -net socket,raw,extbridge=eth0 or some such, just to make it abundantly clear that you must not bind it to a regular ethernet device. -- MST -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 03:44 AM, Michael S. Tsirkin wrote: On Wed, Jan 27, 2010 at 10:34:35AM +0100, Arnd Bergmann wrote: On Wednesday 27 January 2010, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). macvlan also requires hardware support, packet socket can work with any network card in promisc mode. To be clear, macvlan does not require hardware support, it will happily put cards into promiscous mode if they don't support multiple mac addresses. I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down. That'd work. Anthony? What functionality are we trying to achieve? Let's be very specific about use-cases here. If it's VEPA, like you mentioned earlier, why isn't macvtap a better solution from a security perspective? The fundamental problem that I have with all of this is that we should not be introducing new network backends that are based around something only a developer is going to understand. If I'm a user and I want to use an external switch in VEPA mode, how in the world am I going to know that I'm supposed to use the -net raw backend or the -net socket backend? It might as well be the -net butterflies backend as far as a user is concerned. Networking in QEMU is already hard enough for users, we shouldn't make it worse than it already is. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 03:24 AM, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). Neither does vhost ;-) If it were just that as the difference, I'd be inclined to agree, but macvtap is much better from a security PoV. Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user. Regards, Anthony Liguori I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? Specifically for VEPA, something like: -net extbridge,if=eth0 or even -net vepa,if=eth0 Would be fantastic. I think the best way to achieve this is to introduce a small helper that gets called and can create a macvtap device and hand the file descriptor back to qemu :-) A builtin backend would also be fine since we don't have the helper infrastructure. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, Jan 27, 2010 at 08:07:11AM -0600, Anthony Liguori wrote: On 01/27/2010 03:24 AM, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). Neither does vhost ;-) If it were just that as the difference, I'd be inclined to agree, but macvtap is much better from a security PoV. Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user. Regards, Anthony Liguori I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? Specifically for VEPA, something like: -net extbridge,if=eth0 or even -net vepa,if=eth0 Would be fantastic. extbridge is IMO better. I think the best way to achieve this is to introduce a small helper that gets called and can create a macvtap device and hand the file descriptor back to qemu :-) A builtin backend would also be fine since we don't have the helper infrastructure. Excellent. Sridhar, this is actually not a lot of work on top of what you already posted. Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 10:59 AM, Michael S. Tsirkin wrote: On Wed, Jan 27, 2010 at 08:07:11AM -0600, Anthony Liguori wrote: On 01/27/2010 03:24 AM, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). Neither does vhost ;-) If it were just that as the difference, I'd be inclined to agree, but macvtap is much better from a security PoV. Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user. Regards, Anthony Liguori I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? Specifically for VEPA, something like: -net extbridge,if=eth0 or even -net vepa,if=eth0 Would be fantastic. extbridge is IMO better. I think the best way to achieve this is to introduce a small helper that gets called and can create a macvtap device and hand the file descriptor back to qemu :-) A builtin backend would also be fine since we don't have the helper infrastructure. Excellent. Sridhar, this is actually not a lot of work on top of what you already posted. N.B. I had suggested using macvtap, not raw. In this case, the full syntax would be: -net vepa,if=eth0 or -net vepa,fd=N where N is a macvtap fd. For raw, I think there's a real problem wrt security. I think it's important that we support running qemu as a non-privileged user. In fact, this seems to be the mode libvirt is now preferring to operate in. I think we need to re-evaluate the use of any raw socket by qemu as it's very dangerous from a security perspective (assuming we cannot introduced a locked raw socket mode). 
Regards, Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, Jan 27, 2010 at 11:07:45AM -0600, Anthony Liguori wrote: On 01/27/2010 10:59 AM, Michael S. Tsirkin wrote: On Wed, Jan 27, 2010 at 08:07:11AM -0600, Anthony Liguori wrote: On 01/27/2010 03:24 AM, Michael S. Tsirkin wrote: I am not sure I agree with this sentiment. The main issue being that macvtap doesn't exist on all kernels :). Neither does vhost ;-) If it were just that as the difference, I'd be inclined to agree, but macvtap is much better from a security PoV. Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user. Regards, Anthony Liguori I agree to that. People don't even seem to agree whether it's a raw socket or a packet socket :) We need a better name for this option: what it really does is rely on an external device to loopback a packet to us, so how about -net loopback or -net extbridge? Specifically for VEPA, something like: -net extbridge,if=eth0 or even -net vepa,if=eth0 Would be fantastic. extbridge is IMO better. I think the best way to achieve this is to introduce a small helper that gets called and can create a macvtap device and hand the file descriptor back to qemu :-) A builtin backend would also be fine since we don't have the helper infrastructure. Excellent. Sridhar, this is actually not a lot of work on top of what you already posted. N.B. I had suggested using macvtap, not raw. Well, this is an implementation detail :) In fact, I don't have any objections to using macvtap. As I tried to hint, macvtap doesn't seem to exist in any Linux yet, packet sockets have been supported since ages. So we might want to support packet sockets at least optionally as a backend for extbridge. 
In this case, the full syntax would be: -net vepa,if=eth0 or -net vepa,fd=N I still hope it's extbridge, vepa is an acronym that will likely not be known for 99% of users. where N is a macvtap fd. For raw, I think there's a real problem wrt security. I think it's important that we support running qemu as a non-privileged user. In fact, this seems to be the mode libvirt is now preferring to operate in. I think we need to re-evaluate the use of any raw socket by qemu as it's very dangerous from a security perspective (assuming we cannot introduced a locked raw socket mode). As was pointed out on netdev and elsewhere this seems to be what namespaces/selinux are there for. Can qemu be run within a namespace and if yes would that address your concerns? Security is probably a wrong reason to use character devices: they are much more likely to have security problems than standard interfaces. Ease of setup/compatibility with tap would be a better reason. Regards, Anthony Liguori Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 11:25 AM, Michael S. Tsirkin wrote: In this case, the full syntax would be: -net vepa,if=eth0 or -net vepa,fd=N I still hope it's extbridge, vepa is an acronym that will likely not be known for 99% of users. Oh sorry, I don't care about the name at all. If you prefer extbridge, I'm all for it :-) where N is a macvtap fd. For raw, I think there's a real problem wrt security. I think it's important that we support running qemu as a non-privileged user. In fact, this seems to be the mode libvirt is now preferring to operate in. I think we need to re-evaluate the use of any raw socket by qemu as it's very dangerous from a security perspective (assuming we cannot introduced a locked raw socket mode). As was pointed out on netdev and elsewhere this seems to be what namespaces/selinux are there for. Can qemu be run within a namespace and if yes would that address your concerns? It's unclear to me what this would even involve. But really, we just want an interface to inject packets directly into a physical device. raw sockets give us that but it also gives us way more. Using network namespaces to restrict this is a bit convoluted. It seems to me that providing an interface that never gives us way more to start with is better overall from a security perspective. Regards, Anthony Liguori Security is probably a wrong reason to use character devices: they are much more likely to have security problems than standard interfaces. Ease of setup/compatibility with tap would be a better reason. Regards, Anthony Liguori Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 11:54 AM, Sridhar Samudrala wrote:
> I too think that we should not block raw backend in qemu just because of security reasons. It should be perfectly fine to use raw backend in scenarios where qemu can be run as a privileged process. libvirt need not support raw backend until we figure out a secure way to start qemu when passing raw fd. using network namespaces seems like a good option.

Introducing something that is known to be problematic from a security perspective without any clear idea of what the use-case for it is is a bad idea IMHO.

Regards,
Anthony Liguori

> Thanks
> Sridhar
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, Jan 27, 2010 at 12:02:34PM -0600, Anthony Liguori wrote:
> On 01/27/2010 11:54 AM, Sridhar Samudrala wrote:
> > I too think that we should not block raw backend in qemu just because of security reasons. It should be perfectly fine to use raw backend in scenarios where qemu can be run as a privileged process. libvirt need not support raw backend until we figure out a secure way to start qemu when passing raw fd. using network namespaces seems like a good option.
>
> Introducing something that is known to be problematic from a security perspective without any clear idea of what the use-case for it is is a bad idea IMHO.

vepa on existing kernels is one use-case.

> Regards,
> Anthony Liguori
>
> > Thanks
> > Sridhar
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, Jan 27, 2010 at 11:36:31AM -0600, Anthony Liguori wrote: On 01/27/2010 11:25 AM, Michael S. Tsirkin wrote: In this case, the full syntax would be: -net vepa,if=eth0 or -net vepa,fd=N I still hope it's extbridge, vepa is an acronym that will likely not be known for 99% of users. Oh sorry, I don't care about the name at all. If you prefer extbridge, I'm all for it :-) where N is a macvtap fd. For raw, I think there's a real problem wrt security. I think it's important that we support running qemu as a non-privileged user. In fact, this seems to be the mode libvirt is now preferring to operate in. I think we need to re-evaluate the use of any raw socket by qemu as it's very dangerous from a security perspective (assuming we cannot introduced a locked raw socket mode). As was pointed out on netdev and elsewhere this seems to be what namespaces/selinux are there for. Can qemu be run within a namespace and if yes would that address your concerns? It's unclear to me what this would even involve. But really, we just want an interface to inject packets directly into a physical device. Not only. We also want to program filters by mac/vlan, enable/disable promisc mode, set mac, maybe more, all this in response to guest activity so it's not as trivial as doing it in a helper script. The patches supplied do not do this and do filtering in userspace but I trust this is short-term. raw sockets give us that but it also gives us way more. Using network namespaces to restrict this is a bit convoluted. It seems to me that providing an interface that never gives us way more to start with is better overall from a security perspective. You are thinking about qemu security so custom groups and permissions on character devices and/or suid scripts with custom configuration files look nice to you. But think in terms of an overall system security. If you write custom kernel interfaces you end up with an unmanageable security policy. 
And system administrator not being in control of security policy is very bad for security. All the above is basically repeating what others said on netdev. If you care, pls argue on disablenetwork thread. Regards, Anthony Liguori Security is probably a wrong reason to use character devices: they are much more likely to have security problems than standard interfaces. Ease of setup/compatibility with tap would be a better reason. Regards, Anthony Liguori Regards, Anthony Liguori -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
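For readers unfamiliar with the userspace filtering mentioned above, a minimal sketch in C of what a per-NIC receive filter could look like. This is not the code from the posted patches: the `rx_filter` structure, its field names and `rx_filter_accept()` are hypothetical, and VLAN filtering is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical per-NIC filter state; names are illustrative only. */
struct rx_filter {
    uint8_t mac[MAC_LEN];   /* unicast MAC programmed by the guest */
    bool promisc;           /* guest asked for promiscuous mode */
    bool allmulti;          /* guest asked for all multicast traffic */
};

/* Return true if a received Ethernet frame should be handed to the guest. */
static bool rx_filter_accept(const struct rx_filter *f,
                             const uint8_t *frame, size_t len)
{
    static const uint8_t bcast[MAC_LEN] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    assert(f != NULL && frame != NULL);
    if (len < MAC_LEN) {
        return false;               /* too short to hold a destination MAC */
    }
    if (f->promisc) {
        return true;
    }
    if (memcmp(frame, bcast, MAC_LEN) == 0) {
        return true;                /* broadcast is always delivered */
    }
    if (frame[0] & 0x01) {
        return f->allmulti;         /* group bit set: multicast */
    }
    return memcmp(frame, f->mac, MAC_LEN) == 0;
}
```

The point made in the message is that this state changes in response to guest activity, so in the long run it would have to be pushed down to the device rather than evaluated per packet in qemu.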
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/27/2010 12:03 PM, Michael S. Tsirkin wrote:
> On Wed, Jan 27, 2010 at 12:02:34PM -0600, Anthony Liguori wrote:
> > On 01/27/2010 11:54 AM, Sridhar Samudrala wrote:
> > > I too think that we should not block raw backend in qemu just because of security reasons. It should be perfectly fine to use raw backend in scenarios where qemu can be run as a privileged process. libvirt need not support raw backend until we figure out a secure way to start qemu when passing raw fd. using network namespaces seems like a good option.
> >
> > Introducing something that is known to be problematic from a security perspective without any clear idea of what the use-case for it is is a bad idea IMHO.
>
> vepa on existing kernels is one use-case.

Considering VEPA enabled hardware doesn't exist today and the standards aren't even finished being defined, I don't think it's a really strong use case ;-)

Regards,
Anthony Liguori

> > Regards,
> > Anthony Liguori
> >
> > > Thanks
> > > Sridhar
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wednesday 27 January 2010, Anthony Liguori wrote:
> > > I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down.
> >
> > That'd work. Anthony?
>
> The fundamental problem that I have with all of this is that we should not be introducing new network backends that are based around something only a developer is going to understand.
>
> If I'm a user and I want to use an external switch in VEPA mode, how in the world am I going to know that I'm supposed to use the -net raw backend or the -net socket backend? It might as well be the -net butterflies backend as far as a user is concerned.

My point is that we already have -net socket,fd and any user that passes an fd into that already knows what he wants to do with it. Making it work with raw sockets is just a natural extension to this, which works on all kernels and (with separate namespaces) is reasonably secure.

I fully agree that we should not introduce further network backends that would confuse users, but making the existing backends more flexible is something entirely different.

Arnd
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wed, 2010-01-27 at 22:39 +0100, Arnd Bergmann wrote:
> On Wednesday 27 January 2010, Anthony Liguori wrote:
> > > > I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down.
> > >
> > > That'd work. Anthony?
> >
> > The fundamental problem that I have with all of this is that we should not be introducing new network backends that are based around something only a developer is going to understand.
> >
> > If I'm a user and I want to use an external switch in VEPA mode, how in the world am I going to know that I'm supposed to use the -net raw backend or the -net socket backend? It might as well be the -net butterflies backend as far as a user is concerned.
>
> My point is that we already have -net socket,fd and any user that passes an fd into that already knows what he wants to do with it. Making it work with raw sockets is just a natural extension to this, which works on all kernels and (with separate namespaces) is reasonably secure.

Didn't realize that -net socket is already there and supports TCP and UDP sockets. I will look into extending -net socket to support AF_PACKET SOCK_RAW type sockets.

Thanks
Sridhar
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wednesday 27 January 2010, Sridhar Samudrala wrote:
> On Wed, 2010-01-27 at 22:39 +0100, Arnd Bergmann wrote:
> > On Wednesday 27 January 2010, Anthony Liguori wrote:
> > > > > I think -net socket,fd should just be (trivially) extended to work with raw sockets out of the box, with no support for opening it. Then you can have libvirt or some wrapper open a raw socket and a private namespace and just pass it down.
> > > >
> > > > That'd work. Anthony?
> > >
> > > The fundamental problem that I have with all of this is that we should not be introducing new network backends that are based around something only a developer is going to understand.
> > >
> > > If I'm a user and I want to use an external switch in VEPA mode, how in the world am I going to know that I'm supposed to use the -net raw backend or the -net socket backend? It might as well be the -net butterflies backend as far as a user is concerned.
> >
> > My point is that we already have -net socket,fd and any user that passes an fd into that already knows what he wants to do with it. Making it work with raw sockets is just a natural extension to this, which works on all kernels and (with separate namespaces) is reasonably secure.
>
> Didn't realize that -net socket is already there and supports TCP and UDP sockets. I will look into extending -net socket to support AF_PACKET SOCK_RAW type sockets.

Actually, Jens had a patch doing this in early 2009 already but we decided to not send that one out at the time after Or had sent his version of the raw socket interface, which was a superset. Maybe Jens can post his patch again if that still applies?

Arnd
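To make the fd-passing idea in this sub-thread concrete, here is a rough sketch in C of what a privileged wrapper might do before starting qemu: open an AF_PACKET/SOCK_RAW socket and bind it to one interface. The helper names `fill_bind_addr()` and `open_raw_socket()` are hypothetical and are not taken from Jens's or Or's patches; only the socket calls themselves are standard Linux API.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <errno.h>
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/if.h>           /* if_nametoindex */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill the link-layer address that binds a packet socket to one interface. */
static void fill_bind_addr(struct sockaddr_ll *sll, unsigned int ifindex)
{
    assert(sll != NULL);
    memset(sll, 0, sizeof(*sll));
    sll->sll_family = AF_PACKET;
    sll->sll_protocol = htons(ETH_P_ALL);   /* receive every ethertype */
    sll->sll_ifindex = (int)ifindex;
}

/*
 * Hypothetical helper: open a raw packet socket bound to ifname.
 * Returns the fd, or -1 on failure (typically EPERM when the caller
 * lacks CAP_NET_RAW).
 */
static int open_raw_socket(const char *ifname)
{
    struct sockaddr_ll sll;
    unsigned int ifindex = if_nametoindex(ifname);
    int fd;

    if (ifindex == 0) {
        return -1;                          /* no such interface */
    }
    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) {
        return -1;
    }
    fill_bind_addr(&sll, ifindex);
    if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The wrapper would then drop privileges and exec qemu with the descriptor left open, passing its number as `-net socket,fd=N`. Whether `-net socket` accepts a packet socket there is exactly the extension being proposed in this thread, so treat that last step as an assumption.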
[PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
http://thread.gmane.org/gmane.linux.network/150308

Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>

diff --git a/Makefile.objs b/Makefile.objs
index 357d305..4468124 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -34,6 +34,8 @@ net-nested-$(CONFIG_SOLARIS) += tap-solaris.o
 net-nested-$(CONFIG_AIX) += tap-aix.o
 net-nested-$(CONFIG_SLIRP) += slirp.o
 net-nested-$(CONFIG_VDE) += vde.o
+net-nested-$(CONFIG_POSIX) += raw.o
+net-nested-$(CONFIG_LINUX) += raw-linux.o
 net-obj-y += $(addprefix net/, $(net-nested-y))
 ##
diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index eba578a..4aa40f2 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -15,6 +15,7 @@
 #include "net.h"
 #include "net/checksum.h"
 #include "net/tap.h"
+#include "net/raw.h"
 #include "qemu-timer.h"
 #include "virtio-net.h"
@@ -133,6 +134,9 @@ static int peer_has_vnet_hdr(VirtIONet *n)
     case NET_CLIENT_TYPE_TAP:
         n->has_vnet_hdr = tap_has_vnet_hdr(n->nic->nc.peer);
         break;
+    case NET_CLIENT_TYPE_RAW:
+        n->has_vnet_hdr = raw_has_vnet_hdr(n->nic->nc.peer);
+        break;
     default:
         return 0;
     }
@@ -149,6 +153,9 @@ static int peer_has_ufo(VirtIONet *n)
     case NET_CLIENT_TYPE_TAP:
         n->has_ufo = tap_has_ufo(n->nic->nc.peer);
         break;
+    case NET_CLIENT_TYPE_RAW:
+        n->has_ufo = raw_has_ufo(n->nic->nc.peer);
+        break;
     default:
         return 0;
     }
@@ -165,6 +172,9 @@ static void peer_using_vnet_hdr(VirtIONet *n, int using_vnet_hdr)
     case NET_CLIENT_TYPE_TAP:
         tap_using_vnet_hdr(n->nic->nc.peer, using_vnet_hdr);
         break;
+    case NET_CLIENT_TYPE_RAW:
+        raw_using_vnet_hdr(n->nic->nc.peer, using_vnet_hdr);
+        break;
     default:
         break;
     }
@@ -180,6 +190,9 @@ static void peer_set_offload(VirtIONet *n, int csum, int tso4, int tso6,
     case NET_CLIENT_TYPE_TAP:
         tap_set_offload(n->nic->nc.peer, csum, tso4, tso6, ecn, ufo);
         break;
+    case NET_CLIENT_TYPE_RAW:
+        raw_set_offload(n->nic->nc.peer, csum, tso4, tso6, ecn, ufo);
+        break;
     default:
         break;
     }
diff --git a/net.c b/net.c
index 6ef93e6..1ca2415 100644
--- a/net.c
+++ b/net.c
@@ -26,6 +26,7 @@
 #include "config-host.h"
 #include "net/tap.h"
+#include "net/raw.h"
 #include "net/socket.h"
 #include "net/dump.h"
 #include "net/slirp.h"
@@ -1004,6 +1005,27 @@ static struct {
             },
             { /* end of list */ }
         },
+    }, {
+        .type = "raw",
+        .init = net_init_raw,
+        .desc = {
+            NET_COMMON_PARAMS_DESC,
+            {
+                .name = "fd",
+                .type = QEMU_OPT_STRING,
+                .help = "file descriptor of an already opened raw socket",
+            }, {
+                .name = "ifname",
+                .type = QEMU_OPT_STRING,
+                .help = "interface name",
+            }, {
+                .name = "vnet_hdr",
+                .type = QEMU_OPT_BOOL,
+                .help = "enable PACKET_VNET_HDR option on the raw interface"
+            },
+            { /* end of list */ }
+        },
+
 #ifdef CONFIG_VDE
     }, {
         .type = "vde",
@@ -1076,6 +1098,7 @@ int net_client_init(Monitor *mon, QemuOpts *opts, int is_netdev)
 #ifdef CONFIG_VDE
             strcmp(type, "vde") != 0 &&
 #endif
+            strcmp(type, "raw") != 0 &&
             strcmp(type, "socket") != 0) {
             qemu_error("The '%s' network backend type is not valid with -netdev\n",
                        type);
diff --git a/net.h b/net.h
index 116bb80..4722185 100644
--- a/net.h
+++ b/net.h
@@ -34,7 +34,8 @@ typedef enum {
     NET_CLIENT_TYPE_TAP,
     NET_CLIENT_TYPE_SOCKET,
     NET_CLIENT_TYPE_VDE,
-    NET_CLIENT_TYPE_DUMP
+    NET_CLIENT_TYPE_DUMP,
+    NET_CLIENT_TYPE_RAW,
 } net_client_type;
 typedef void (NetPoll)(VLANClientState *, bool enable);
diff --git a/net/raw-linux.c b/net/raw-linux.c
new file mode 100644
index 000..9ed2e6a
--- /dev/null
+++ b/net/raw-linux.c
@@ -0,0 +1,97 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies
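The body of net/raw-linux.c is cut off in this archive, so the following is only a guess at the kind of call the vnet_hdr option maps to, not the patch's actual code. A minimal sketch in C, assuming a kernel and headers that define PACKET_VNET_HDR; the function name `raw_enable_vnet_hdr()` is hypothetical.

```c
#include <linux/if_packet.h>  /* PACKET_VNET_HDR */
#include <sys/socket.h>       /* setsockopt, SOL_PACKET on glibc */

#ifndef SOL_PACKET
#define SOL_PACKET 263        /* Linux value, for libcs that do not export it */
#endif

/*
 * Hypothetical helper: ask the kernel to prepend a virtio_net_hdr to every
 * frame read from or written to this packet socket, so GSO and checksum
 * metadata can pass between guest and host unchanged.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int raw_enable_vnet_hdr(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_VNET_HDR, &on, sizeof(on));
}
```

A caller would probably want to fall back to running without the header when this fails, since older kernels reject the option.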
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
> http://thread.gmane.org/gmane.linux.network/150308
>
> Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>

See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. The only use-case I know of is macvlan, but macvtap addresses this functionality without introducing the rather nasty security problems associated with a raw backend.

Regards,
Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/26/2010 02:47 PM, Anthony Liguori wrote:
> On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> > This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
> > http://thread.gmane.org/gmane.linux.network/150308
> >
> > Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>
>
> See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. The only use-case I know of is macvlan, but macvtap addresses this functionality without introducing the rather nasty security problems associated with a raw backend.

Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family. A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user.

Regards,
Anthony Liguori
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Tue, 2010-01-26 at 14:47 -0600, Anthony Liguori wrote:
> On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> > This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
> > http://thread.gmane.org/gmane.linux.network/150308
> >
> > Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>
>
> See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. The only use-case I know of is macvlan, but macvtap addresses this functionality without introducing the rather nasty security problems associated with a raw backend.

The raw backend can be attached to a physical device, macvlan or SR-IOV VF. I don't think the AF_PACKET socket itself introduces any security problems. The raw socket can be created only by a user with the CAP_NET_RAW capability. The only issue is if we need to assume that qemu itself is an untrusted process and a raw fd cannot be passed to it.

But I think it is a useful backend to support in qemu, one that provides guest to remote host connectivity without the need for a bridge/tap.

macvtap could be an alternative if it supports binding to SR-IOV VFs too.

Thanks
Sridhar
Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Tue, 2010-01-26 at 14:50 -0600, Anthony Liguori wrote:
> On 01/26/2010 02:47 PM, Anthony Liguori wrote:
> > On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> > > This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
> > > http://thread.gmane.org/gmane.linux.network/150308
> > >
> > > Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>
> >
> > See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. The only use-case I know of is macvlan, but macvtap addresses this functionality without introducing the rather nasty security problems associated with a raw backend.
>
> Not to mention that from a user perspective, raw makes almost no sense as it's an obscure socket protocol family.

Not clear what you mean here. An AF_PACKET socket is just a transport mechanism for the host kernel to put the packets from the guest directly onto an attached interface and vice versa.

> A user wants to do useful things like bridged networking or direct VF assignment. We should have -net backends that reflect things that make sense to a user.

Binding to an SR-IOV VF is one of the use cases supported by the raw backend.

Thanks
Sridhar
Re: [Qemu-devel] Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On 01/26/2010 05:15 PM, Sridhar Samudrala wrote:
> On Tue, 2010-01-26 at 14:47 -0600, Anthony Liguori wrote:
> > On 01/26/2010 02:40 PM, Sridhar Samudrala wrote:
> > > This patch adds raw socket backend to qemu and is based on Or Gerlitz's patch re-factored and ported to the latest qemu-kvm git tree. It also includes support for vnet_hdr option that enables gso/checksum offload with raw backend. You can find the linux kernel patch to support this feature here.
> > > http://thread.gmane.org/gmane.linux.network/150308
> > >
> > > Signed-off-by: Sridhar Samudrala <s...@us.ibm.com>
> >
> > See the previous discussion about the raw backend from Or's original patch. There's no obvious reason why we should have this in addition to a tun/tap backend. The only use-case I know of is macvlan, but macvtap addresses this functionality without introducing the rather nasty security problems associated with a raw backend.
>
> The raw backend can be attached to a physical device

This is equivalent to bridging with tun/tap except that it has the unexpected behaviour of unreliable host/guest networking (which is not universally consistent across platforms either). This is not a mode we want to encourage users to use.

> , macvlan

macvtap is a superior way to achieve this use case because a macvtap fd can safely be given to a lesser privilege process without allowing escalation of privileges.

> or SR-IOV VF.

This depends on vhost-net. In general, what I would like to see for this is something more user friendly that dealt specifically with this use-case. Although honestly, given the recent security concerns around raw sockets, I'm very concerned about supporting raw sockets in qemu at all.

Essentially, you get worse security doing vhost-net + raw + VF than with PCI passthrough + VF because at least in the latter case you can run qemu without privileges. CAP_NET_RAW is a very big privilege.

Regards,
Anthony Liguori
Re: [Qemu-devel] Re: [PATCH qemu-kvm] Add raw(af_packet) network backend to qemu
On Wednesday 27 January 2010, Anthony Liguori wrote:
> > The raw backend can be attached to a physical device
>
> This is equivalent to bridging with tun/tap except that it has the unexpected behaviour of unreliable host/guest networking (which is not universally consistent across platforms either). This is not a mode we want to encourage users to use.

It's not the most common scenario, but I've seen systems (I remember one on s/390 with z/VM) where you really want to isolate the guest network as much as possible from the host network. Besides PCI passthrough, giving the host device to a guest using a raw socket is the next best approximation of that.

Then again, macvtap will do that too, if the device driver supports multiple unicast MAC addresses without forcing promiscuous mode.

> > , macvlan
>
> macvtap is a superior way to achieve this use case because a macvtap fd can safely be given to a lesser privilege process without allowing escalation of privileges.

Yes.

> > or SR-IOV VF.
>
> This depends on vhost-net.

Why? I don't see anything in this scenario that is vhost-net specific. I also plan to cover this aspect in macvtap in the future, but the current code does not do it yet. It also requires device driver changes.

> In general, what I would like to see for this is something more user friendly that dealt specifically with this use-case. Although honestly, given the recent security concerns around raw sockets, I'm very concerned about supporting raw sockets in qemu at all.
>
> Essentially, you get worse security doing vhost-net + raw + VF than with PCI passthrough + VF because at least in the latter case you can run qemu without privileges. CAP_NET_RAW is a very big privilege.

It can be contained to a large degree with network namespaces. When you run qemu in its own namespace and add the VF to that, CAP_NET_RAW should ideally have no effect on other parts of the system (except bugs in the namespace implementation).

Arnd