Re: [RFC] [PATCH v2 3/3] Let host NIC driver to DMA to guest user space.
On Tue, 6 Apr 2010 14:26:29 +0800 Xin, Xiaohui <xiaohui@intel.com> wrote:

> > How do you deal with the DoS problem of a hostile user space app posting a huge number of receives and never getting anything?
>
> That's a problem we are trying to deal with. It's critical for the long term. Currently, we try to limit the number of pages it can pin, but we are not sure how much is reasonable. For now, the buffers submitted come from the guest virtio-net driver, so it's safe to some extent, for now.

It is critical even now. Once you get past toy benchmarks you will see things like Java processes with 1000 threads all reading at once.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [RFC] [PATCH v2 3/3] Let host NIC driver to DMA to guest user space.
> > From: Xin Xiaohui <xiaohui@intel.com>
> >
> > The patch lets the host NIC driver receive user space skbs, so that the driver has a chance to DMA directly into guest user space buffers through a single ethX interface. We want it to be more generic, as a zero-copy framework.
> >
> > Signed-off-by: Xin Xiaohui <xiaohui@intel.com>
> > Signed-off-by: Zhao Yu <yzha...@gmail.com>
> > Signed-off-by: Jeff Dike <jd...@c2.user-mode-linux.org>
> > ---
> >
> > We considered two ways to utilize the user buffers, but are not sure which one is better. Please give any comments.
> >
> > One: Modify __alloc_skb() a bit so that it allocates only the sk_buff structure, with the data pointer pointing to a user buffer that comes from a page constructor API. The shinfo of the skb then also comes from the guest. When a packet is received from hardware, skb->data is filled directly by h/w. This is the way we have implemented.
> >
> > Pros: We can avoid any copy here.
> > Cons: The guest virtio-net driver needs to allocate skbs in almost the same way as the host NIC driver, e.g. the size used by netdev_alloc_skb() and the same reserved space at the head of the skb. Many NIC drivers match the guest here and are fine with this, but some of the latest NIC drivers reserve special room in the skb head. To deal with that, we suggest providing a method in the guest virtio-net driver to ask the NIC driver for the parameters we are interested in, once we know which device we have bound to for zero-copy. Then we ask the guest to allocate accordingly. Is that reasonable?
> >
> > Two: Modify the driver to get user buffers allocated from a page constructor API (as a substitute for alloc_page()). The user buffers are used as payload buffers and are filled by h/w directly when a packet is received. The driver should associate the pages with the skb (skb_shinfo(skb)->frags). For the head buffer, let the host allocate the skb and let h/w fill it. After that, the data filled into the host skb header is copied into the guest header buffer, which is submitted together with the payload buffer.
> >
> > Pros: We need to care less about how the guest or the host allocates its buffers.
> > Cons: We still need a small copy here for the skb header.
> >
> > We are not sure which way is better. This is the first thing we want comments on from the community. We hope the modification to the network part will be generic, so that it is not used by the vhost-net backend only; a user application may use it as well once the zero-copy device provides async read/write operations later.
> >
> > Thanks
> > Xiaohui
>
> How do you deal with the DoS problem of a hostile user space app posting a huge number of receives and never getting anything?

That's a problem we are trying to deal with. It's critical for the long term. Currently, we try to limit the number of pages it can pin, but we are not sure how much is reasonable. For now, the buffers submitted come from the guest virtio-net driver, so it's safe to some extent, for now.

Thanks
Xiaohui
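The pinning limit mentioned in this reply could be sketched roughly as a per-owner page budget. This is only an illustration in plain user-space C: the names pin_budget, pin_charge and pin_uncharge are hypothetical and do not appear in the posted patch, and the thread itself leaves open what a reasonable limit would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-owner budget; not part of the posted patch. */
struct pin_budget {
	size_t limit;   /* maximum pages this owner may keep pinned */
	size_t pinned;  /* pages currently pinned for posted receives */
};

/* Charge npages against the budget; refuse instead of overcommitting. */
static bool pin_charge(struct pin_budget *b, size_t npages)
{
	assert(b->pinned <= b->limit);
	if (npages > b->limit - b->pinned)
		return false;
	b->pinned += npages;
	return true;
}

/* Give pages back once a receive completes or is cancelled. */
static void pin_uncharge(struct pin_budget *b, size_t npages)
{
	assert(npages <= b->pinned);
	b->pinned -= npages;
}
```

With a budget like this, an application that posts receives and never consumes them would be refused new postings once its limit is reached, instead of pinning more memory. It does not by itself answer how the limit should be chosen or shared between many threads.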
[RFC] [PATCH v2 3/3] Let host NIC driver to DMA to guest user space.
From: Xin Xiaohui <xiaohui@intel.com>

The patch lets the host NIC driver receive user space skbs, so that the driver has a chance to DMA directly into guest user space buffers through a single ethX interface. We want it to be more generic, as a zero-copy framework.

Signed-off-by: Xin Xiaohui <xiaohui@intel.com>
Signed-off-by: Zhao Yu <yzha...@gmail.com>
Signed-off-by: Jeff Dike <jd...@c2.user-mode-linux.org>
---

We considered two ways to utilize the user buffers, but are not sure which one is better. Please give any comments.

One: Modify __alloc_skb() a bit so that it allocates only the sk_buff structure, with the data pointer pointing to a user buffer that comes from a page constructor API. The shinfo of the skb then also comes from the guest. When a packet is received from hardware, skb->data is filled directly by h/w. This is the way we have implemented.

Pros: We can avoid any copy here.
Cons: The guest virtio-net driver needs to allocate skbs in almost the same way as the host NIC driver, e.g. the size used by netdev_alloc_skb() and the same reserved space at the head of the skb. Many NIC drivers match the guest here and are fine with this, but some of the latest NIC drivers reserve special room in the skb head. To deal with that, we suggest providing a method in the guest virtio-net driver to ask the NIC driver for the parameters we are interested in, once we know which device we have bound to for zero-copy. Then we ask the guest to allocate accordingly. Is that reasonable?

Two: Modify the driver to get user buffers allocated from a page constructor API (as a substitute for alloc_page()). The user buffers are used as payload buffers and are filled by h/w directly when a packet is received. The driver should associate the pages with the skb (skb_shinfo(skb)->frags). For the head buffer, let the host allocate the skb and let h/w fill it. After that, the data filled into the host skb header is copied into the guest header buffer, which is submitted together with the payload buffer.

Pros: We need to care less about how the guest or the host allocates its buffers.
Cons: We still need a small copy here for the skb header.

We are not sure which way is better. This is the first thing we want comments on from the community. We hope the modification to the network part will be generic, so that it is not used by the vhost-net backend only; a user application may use it as well once the zero-copy device provides async read/write operations later.

Thanks
Xiaohui

 include/linux/netdevice.h |   69
 include/linux/skbuff.h    |   30
 net/core/dev.c            |   63
 net/core/skbuff.c         |   74
 4 files changed, 224 insertions(+), 12 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 94958c1..ba48eb0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -485,6 +485,17 @@ struct netdev_queue {
 	unsigned long	tx_dropped;
 } ____cacheline_aligned_in_smp;
 
+#if defined(CONFIG_VHOST_PASSTHRU) || defined(CONFIG_VHOST_PASSTHRU_MODULE)
+struct mpassthru_port {
+	int		hdr_len;
+	int		data_len;
+	int		npages;
+	unsigned	flags;
+	struct socket	*sock;
+	struct skb_user_page	*(*ctor)(struct mpassthru_port *,
+				struct sk_buff *, int);
+};
+#endif
 
 /*
  * This structure defines the management hooks for network devices.
@@ -636,6 +647,10 @@ struct net_device_ops {
 	int	(*ndo_fcoe_ddp_done)(struct net_device *dev, u16 xid);
 #endif
+#if defined(CONFIG_VHOST_PASSTHRU) || defined(CONFIG_VHOST_PASSTHRU_MODULE)
+	int	(*ndo_mp_port_prep)(struct net_device *dev,
+				struct mpassthru_port *port);
+#endif
 };
 
 /*
@@ -891,7 +906,8 @@ struct net_device
 	struct macvlan_port	*macvlan_port;
 	/* GARP */
 	struct garp_port	*garp_port;
-
+	/* mpassthru */
+	struct mpassthru_port	*mp_port;
 	/* class/net/name entry */
 	struct device		dev;
 	/* space for optional statistics and wireless sysfs groups */
@@ -2013,6 +2029,55 @@ static inline u32 dev_ethtool_get_flags(struct net_device *dev)
 		return 0;
 	return dev->ethtool_ops->get_flags(dev);
 }
-#endif /* __KERNEL__ */
+/* To support zero-copy between user space application and NIC driver,
+ * we'd better ask NIC driver for the capability it can provide, especially
+ * for packet split mode, now we only ask for the header size, and the
+ *
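Approach "Two" from the patch description (payload pages filled in place by h/w, header copied from a host buffer) can be modelled in user space as below. This is a rough sketch, not kernel code: guest_rx_buf, host_rx_pkt and rx_complete are hypothetical names invented for illustration, and plain pointers stand in for pinned guest pages and skb frags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAGS 4

/* Hypothetical stand-in for a receive buffer posted by the guest. */
struct guest_rx_buf {
	unsigned char *hdr;        /* guest header buffer */
	size_t hdr_cap;            /* capacity of hdr in bytes */
	size_t hdr_len;            /* header bytes copied in so far */
	void *pages[MAX_FRAGS];    /* guest payload pages handed to the NIC */
	size_t npages;
};

/* Hypothetical stand-in for the host-side view of a received packet. */
struct host_rx_pkt {
	const unsigned char *hdr;  /* header the hardware wrote to host memory */
	size_t hdr_len;
	void *frags[MAX_FRAGS];    /* payload pages already filled by DMA */
	size_t nfrags;
};

/*
 * Complete a receive: the payload must already sit in the guest's own
 * pages (no copy), so only the header is copied. Returns 0 on success,
 * -1 if the packet does not fit or a frag is not a guest page.
 */
static int rx_complete(struct guest_rx_buf *g, const struct host_rx_pkt *p)
{
	if (p->hdr_len > g->hdr_cap || p->nfrags > g->npages)
		return -1;
	for (size_t i = 0; i < p->nfrags; i++)
		if (p->frags[i] != g->pages[i])
			return -1;
	memcpy(g->hdr, p->hdr, p->hdr_len);
	g->hdr_len = p->hdr_len;
	return 0;
}
```

The only data movement in this model is the memcpy() of the header, which corresponds to the "small copy" listed as the drawback of approach Two; approach One would avoid even that by pointing skb->data at the guest buffer.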
Re: [RFC] [PATCH v2 3/3] Let host NIC driver to DMA to guest user space.
On Fri, 2 Apr 2010 15:30:10 +0800 xiaohui@intel.com wrote:

> From: Xin Xiaohui <xiaohui@intel.com>
>
> The patch lets the host NIC driver receive user space skbs, so that the driver has a chance to DMA directly into guest user space buffers through a single ethX interface. We want it to be more generic, as a zero-copy framework.
>
> Signed-off-by: Xin Xiaohui <xiaohui@intel.com>
> Signed-off-by: Zhao Yu <yzha...@gmail.com>
> Signed-off-by: Jeff Dike <jd...@c2.user-mode-linux.org>
> ---
>
> We considered two ways to utilize the user buffers, but are not sure which one is better. Please give any comments.
>
> One: Modify __alloc_skb() a bit so that it allocates only the sk_buff structure, with the data pointer pointing to a user buffer that comes from a page constructor API. The shinfo of the skb then also comes from the guest. When a packet is received from hardware, skb->data is filled directly by h/w. This is the way we have implemented.
>
> Pros: We can avoid any copy here.
> Cons: The guest virtio-net driver needs to allocate skbs in almost the same way as the host NIC driver, e.g. the size used by netdev_alloc_skb() and the same reserved space at the head of the skb. Many NIC drivers match the guest here and are fine with this, but some of the latest NIC drivers reserve special room in the skb head. To deal with that, we suggest providing a method in the guest virtio-net driver to ask the NIC driver for the parameters we are interested in, once we know which device we have bound to for zero-copy. Then we ask the guest to allocate accordingly. Is that reasonable?
>
> Two: Modify the driver to get user buffers allocated from a page constructor API (as a substitute for alloc_page()). The user buffers are used as payload buffers and are filled by h/w directly when a packet is received. The driver should associate the pages with the skb (skb_shinfo(skb)->frags). For the head buffer, let the host allocate the skb and let h/w fill it. After that, the data filled into the host skb header is copied into the guest header buffer, which is submitted together with the payload buffer.
>
> Pros: We need to care less about how the guest or the host allocates its buffers.
> Cons: We still need a small copy here for the skb header.
>
> We are not sure which way is better. This is the first thing we want comments on from the community. We hope the modification to the network part will be generic, so that it is not used by the vhost-net backend only; a user application may use it as well once the zero-copy device provides async read/write operations later.
>
> Thanks
> Xiaohui

How do you deal with the DoS problem of a hostile user space app posting a huge number of receives and never getting anything?