Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Michael S. Tsirkin
On Tue, Sep 28, 2010 at 08:24:29PM -0700, Shirley Ma wrote: Hello Michael, On Wed, 2010-09-15 at 07:52 -0700, Shirley Ma wrote: Don't you think once I address vhost_add_used_and_signal update issue, it is a simple and complete patch for macvtap TX zero copy? Thanks

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Michael S. Tsirkin
On Wed, Sep 29, 2010 at 10:16:45AM +0200, Michael S. Tsirkin wrote: On Tue, Sep 28, 2010 at 08:24:29PM -0700, Shirley Ma wrote: Hello Michael, On Wed, 2010-09-15 at 07:52 -0700, Shirley Ma wrote: Don't you think once I address vhost_add_used_and_signal update issue, it is a

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Shirley Ma
On Wed, 2010-09-29 at 10:16 +0200, Michael S. Tsirkin wrote: If you look at dev_hard_start_xmit you will see a call to skb_orphan_try which often calls the skb destructor. So I suspect this is almost equivalent to your original patch, and has the same correctness issue. I forgot to mention,

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Shirley Ma
On Wed, 2010-09-29 at 10:28 +0200, Michael S. Tsirkin wrote: I think you should try testing with guest to external communication, this will uncover some of these correctness issues for you. I think netperf also has some flag to check data, might be a good idea to use it for testing. I always

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Shirley Ma
On Wed, 2010-09-29 at 10:16 +0200, Michael S. Tsirkin wrote: I compared several approaches for addressing the issue being raised here on how/when to update vhost_add_used_and_signal. The simple approach I have found is: 1. Adding completion field in struct virtqueue; 2. when it is a

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Shirley Ma
On Wed, 2010-09-29 at 10:28 +0200, Michael S. Tsirkin wrote: 1. Adding completion field in struct virtqueue; 2. when it is a zero copy packet, put vhost thread wait for completion to update vhost_add_used_and_signal; 3. passing vq from vhost to macvtap as skb destruct_arg; 4. when

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Michael S. Tsirkin
On Tue, Sep 28, 2010 at 08:24:29PM -0700, Shirley Ma wrote: Hello Michael, On Wed, 2010-09-15 at 07:52 -0700, Shirley Ma wrote: Don't you think once I address vhost_add_used_and_signal update issue, it is a simple and complete patch for macvtap TX zero copy? Thanks

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-29 Thread Shirley Ma
On Wed, 2010-09-29 at 17:14 +0200, Michael S. Tsirkin wrote: I guess I just don't understand what your patch does. If you send it, I can take a look. Ok, I will collect more performance data and send the patch. Thanks Shirley

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-28 Thread Shirley Ma
Hello Michael, On Wed, 2010-09-15 at 07:52 -0700, Shirley Ma wrote: Don't you think once I address vhost_add_used_and_signal update issue, it is a simple and complete patch for macvtap TX zero copy? Thanks Shirley I like the fact that the patch is simple. Unfortunately I

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-16 Thread Xin, Xiaohui
On Tue, Sep 14, 2010 at 09:00:25AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote: I would expect this to hurt performance

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-16 Thread Michael S. Tsirkin
On Wed, Sep 15, 2010 at 10:46:02AM +0800, Xin, Xiaohui wrote: From: Michael S. Tsirkin [mailto:m...@redhat.com] Sent: Wednesday, September 15, 2010 12:30 AM To: Shirley Ma Cc: Arnd Bergmann

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Shirley Ma
On Wed, 2010-09-15 at 07:27 +0200, Michael S. Tsirkin wrote: For some of the issues, try following the discussion around net: af_packet: don't call tpacket_destruct_skb() until the skb is sent out. Summary: it's difficult to do correctly generally. Limiting ourselves to transmit on

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Shirley Ma
On Wed, 2010-09-15 at 07:12 +0200, Michael S. Tsirkin wrote: Yes, I agree this patch is useful for demo purposes: simple, and shows what kind of performance gains we can expect for TX. Any other issue you can see in this patch beside vhost descriptors update? Don't you think once I address

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 09:00:25AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote: I would expect this to hurt performance significantly. We could do

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 11:21:15PM -0700, Shirley Ma wrote: On Wed, 2010-09-15 at 07:12 +0200, Michael S. Tsirkin wrote: Yes, I agree this patch is useful for demo purposes: simple, and shows what kind of performance gains we can expect for TX. Any other issue you can see in this patch

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Shirley Ma
On Wed, 2010-09-15 at 12:10 +0200, Michael S. Tsirkin wrote: Another issue is that macvtap can be bound to almost anything, including e.g. a tap device or a bridge, which might hang on to skb fragments for unlimited time. Zero copy TX won't easily work there. I can imagine either somehow

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Shirley Ma
On Wed, 2010-09-15 at 17:39 +0200, Michael S. Tsirkin wrote: In fact, I rechecked: both bridge and loopback have NETIF_F_HIGHDMA set. So maybe we should check NETIF_F_NETNS_LOCAL ... macvtap in bridged mode is interesting as well. I found that too, just wondered which flag to use is

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Michael S. Tsirkin
On Wed, Sep 15, 2010 at 10:00:04AM -0700, Shirley Ma wrote: On Wed, 2010-09-15 at 17:39 +0200, Michael S. Tsirkin wrote: In fact, I rechecked: both bridge and loopback have NETIF_F_HIGHDMA set. So maybe we should check NETIF_F_NETNS_LOCAL ... macvtap in bridged mode is interesting as

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-15 Thread Shirley Ma
On Wed, 2010-09-15 at 19:30 +0200, Michael S. Tsirkin wrote: At some level NETIF_F_NETNS_LOCAL makes sense: local packets can get anywhere. OTOH one wonders whether there might be other issues, e.g. in theory devices could hang on to frag pages just by doing get_page. There might be other
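
The feature-flag check discussed in the messages above could be sketched roughly as follows. This is a userspace illustration only, not code from the patch: the `DEMO_F_*` bit values and the helper `demo_zerocopy_candidate()` are hypothetical stand-ins for NETIF_F_HIGHDMA and NETIF_F_NETNS_LOCAL, and the rule it encodes (require high-memory DMA, reject devices flagged as namespace-local) is one reading of the suggestion in the thread. As noted in the thread, a flag test alone may not be enough, since a device could still hold on to fragment pages.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel feature bits discussed in the
 * thread; the numeric values here are illustrative only. */
#define DEMO_F_HIGHDMA      (1u << 0)
#define DEMO_F_NETNS_LOCAL  (1u << 1)

/* Return true if the lower device looks like a candidate for zero-copy TX:
 * it must support high-memory DMA and must not be a namespace-local
 * device, which the thread suggests as a way to exclude software devices
 * that also advertise high-memory DMA. */
static bool demo_zerocopy_candidate(unsigned int lower_features)
{
    if (!(lower_features & DEMO_F_HIGHDMA))
        return false;
    if (lower_features & DEMO_F_NETNS_LOCAL)
        return false;
    return true;
}
```

With these assumed bits, a device advertising only DEMO_F_HIGHDMA passes, while one that also sets DEMO_F_NETNS_LOCAL is rejected.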

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 11:12 +0200, Avi Kivity wrote:
+ base = (unsigned long)from->iov_base + offset1;
+ size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
+ num_pages = get_user_pages_fast(base, size, 0, &page[i]);
+ if ((num_pages !=

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Arnd Bergmann
On Tuesday 14 September 2010, Shirley Ma wrote: On Tue, 2010-09-14 at 11:12 +0200, Avi Kivity wrote: That's what io_submit() is for. Then io_getevents() tells you what a while actually was. This macvtap zero copy uses iov buffers from vhost ring, which is allocated from guest kernel.

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote: I would expect this to hurt performance significantly. We could do this for asynchronous requests only to avoid the slowdown. Is kiocb in sendmsg helpful here? It is not used now. Shirley

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 09:00:25AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 17:22 +0200, Michael S. Tsirkin wrote: I would expect this to hurt performance significantly. We could do this for asynchronous requests only to avoid the slowdown. Is kiocb in sendmsg helpful here? It is

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 18:29 +0200, Michael S. Tsirkin wrote: Precisely. This is what the patch from Xin Xiaohui does. That code already seems to do most of what you are trying to do, right? I thought host pins guest kernel buffer pages was good enough for TX though I didn't look up xiaohui's

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 10:02:25AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 18:29 +0200, Michael S. Tsirkin wrote: Precisely. This is what the patch from Xin Xiaohui does. That code already seems to do most of what you are trying to do, right? I thought host pins guest kernel buffer

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 20:27 +0200, Michael S. Tsirkin wrote: As others said, the harder issues for TX are in determining that it's safe to unpin the memory, and how much memory is it safe to pin to begin with. For RX we have some more complexity. I think unpin the memory is in kfree_skb()

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 11:49:03AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 20:27 +0200, Michael S. Tsirkin wrote: As others said, the harder issues for TX are in determining that it's safe to unpin the memory, and how much memory is it safe to pin to begin with. For RX we have

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 21:01 +0200, Michael S. Tsirkin wrote: On Tue, Sep 14, 2010 at 11:49:03AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 20:27 +0200, Michael S. Tsirkin wrote: As others said, the harder issues for TX are in determining that it's safe to unpin the memory, and how

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Tue, 2010-09-14 at 21:01 +0200, Michael S. Tsirkin wrote: I think that you should be able to simply combine the two drivers together, add an ioctl to enable/disable zero copy mode of operation. That could work. But what's the purpose to have two drivers if one driver can handle

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Xin, Xiaohui
From: Shirley Ma [mailto:mashi...@us.ibm.com] Sent: Tuesday, September 14, 2010 11:05 PM To: Avi Kivity Cc: David Miller; a...@arndb.de; m...@redhat.com; Xin, Xiaohui; net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Xin, Xiaohui
From: Arnd Bergmann [mailto:a...@arndb.de] Sent: Tuesday, September 14, 2010 11:21 PM To: Shirley Ma Cc: Avi Kivity; David Miller; m...@redhat.com; Xin, Xiaohui; net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy between

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Shirley Ma
On Wed, 2010-09-15 at 09:50 +0800, Xin, Xiaohui wrote: I think what David said is what we have thought before in mp device. Since we are not sure the exact time the tx buffer was written through DMA operation. But the deadline is when the tx buffer was freed. So we only notify the vhost stuff

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Xin, Xiaohui
From: Michael S. Tsirkin [mailto:m...@redhat.com] Sent: Wednesday, September 15, 2010 12:30 AM To: Shirley Ma Cc: Arnd Bergmann; Avi Kivity; Xin, Xiaohui; David Miller; net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org Subject: Re: [RFC PATCH 2/2] macvtap: TX zero copy

RE: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Xin, Xiaohui
From: Shirley Ma [mailto:mashi...@us.ibm.com] Sent: Wednesday, September 15, 2010 10:41 AM To: Xin, Xiaohui Cc: Avi Kivity; David Miller; a...@arndb.de; m...@redhat.com; net...@vger.kernel.org; kvm@vger.kernel.org; linux-ker...@vger.kernel.org Subject: RE: [RFC PATCH 2/2] macvtap: TX zero copy

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 12:36:23PM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 21:01 +0200, Michael S. Tsirkin wrote: I think that you should be able to simply combine the two drivers together, add an ioctl to enable/disable zero copy mode of operation. That could work. But

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 07:40:52PM -0700, Shirley Ma wrote: On Wed, 2010-09-15 at 09:50 +0800, Xin, Xiaohui wrote: I think what David said is what we have thought before in mp device. Since we are not sure the exact time the tx buffer was written through DMA operation. But the deadline is

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-14 Thread Michael S. Tsirkin
On Tue, Sep 14, 2010 at 12:20:29PM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 21:01 +0200, Michael S. Tsirkin wrote: On Tue, Sep 14, 2010 at 11:49:03AM -0700, Shirley Ma wrote: On Tue, 2010-09-14 at 20:27 +0200, Michael S. Tsirkin wrote: As others said, the harder issues for TX are

[RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-13 Thread Shirley Ma
Add zero copy feature between userspace and kernel in macvtap when lower device supports high memory DMA.
Signed-off-by: Shirley Ma x...@us.ibm.com
---
drivers/net/macvtap.c | 136 +
1 files changed, 126 insertions(+), 10 deletions(-)
diff

Re: [RFC PATCH 2/2] macvtap: TX zero copy between guest and host kernel

2010-09-13 Thread David Miller
From: Shirley Ma mashi...@us.ibm.com Date: Mon, 13 Sep 2010 13:48:03 -0700
+ base = (unsigned long)from->iov_base + offset1;
+ size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
+ num_pages = get_user_pages_fast(base, size, 0, &page[i]);
+
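
The `size` expression quoted from the patch computes how many pages a buffer touches before those pages are pinned. A minimal userspace sketch of the same arithmetic is below; it is not kernel code. The `DEMO_PAGE_*` macros assume 4 KiB pages and mirror the kernel convention that PAGE_MASK clears the in-page offset bits, and `pages_spanned()` is a hypothetical helper name introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel macros; 4 KiB pages are an assumption. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Number of pages touched by the byte range [base, base + len).
 * (base & ~MASK) is the offset into the first page; adding len and
 * PAGE_SIZE - 1 (which equals ~MASK) rounds up to whole pages. */
static unsigned long pages_spanned(unsigned long base, size_t len)
{
    assert(len > 0);
    unsigned long offset = base & ~DEMO_PAGE_MASK;
    return (offset + len + ~DEMO_PAGE_MASK) >> DEMO_PAGE_SHIFT;
}
```

Under these assumptions, a page-aligned 4096-byte buffer spans one page, while the same length starting one byte past a page boundary spans two, which is why the count depends on the starting offset and not only on `len`.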