[PATCH net-next] ixgbevf: Enable GRO by default

2010-05-13 Thread Shirley Ma
Enable GRO by default for performance. Signed-off-by: Shirley Ma x...@us.ibm.com --- drivers/net/ixgbevf/ixgbevf_main.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/net/ixgbevf/ixgbevf_main.c b/drivers/net/ixgbevf/ixgbevf_main.c index 40f47b8..1bbb05e
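The change being described is a one-line feature flag; a minimal sketch follows, assuming the hook point is the adapter setup path in ixgbevf_main.c (the wrapper function name below is hypothetical, not the patch's actual code):

	/* Hedged sketch: advertise GRO in netdev->features so it is enabled by
	 * default for the VF netdev; the helper function is hypothetical. */
	static void ixgbevf_enable_default_features(struct net_device *netdev)
	{
		netdev->features |= NETIF_F_GRO;	/* GRO on by default */
	}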

[PATCH net-next] ixgbe: return error in set_rar when index out of range

2010-05-18 Thread Shirley Ma
Return -1 when set_rar index is out of range Signed-off-by: Shirley Ma x...@us.ibm.com --- drivers/net/ixgbe/ixgbe_common.c |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/drivers/net/ixgbe/ixgbe_common.c b/drivers/net/ixgbe/ixgbe_common.c index 1159d91..77b3cf4 100644
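A minimal sketch of the kind of bounds check this patch adds, assuming it lands in the set_rar path of ixgbe_common.c; the surrounding code, field names, and debug message are assumptions, not the patch text:

	/* Hedged sketch: fail set_rar instead of silently ignoring an index
	 * beyond the number of receive address registers. */
	static s32 ixgbe_set_rar_sketch(struct ixgbe_hw *hw, u32 index)
	{
		u32 rar_entries = hw->mac.num_rar_entries;	/* assumed location of the limit */

		if (index >= rar_entries) {
			hw_dbg(hw, "RAR index %d is out of range.\n", index);
			return -1;				/* previously fell through */
		}
		/* ... program the RAR low/high registers for this index ... */
		return 0;
	}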

ixgbe: macvlan on PF/VF when SRIOV is enabled

2010-05-21 Thread Shirley Ma
Hello Jeff, macvlan doesn't work on the PF when SRIOV is enabled. Creating the macvlan succeeds, but ping (ICMP request) goes to the VF interface, not the PF/macvlan, even though the arp entry is correct. I patched the ixgbe driver, and macvlan on the PF works with the patch. But I am not sure whether it is right since

RE: ixgbe: macvlan on PF/VF when SRIOV is enabled

2010-05-24 Thread Shirley Ma
Hello Greg, Thanks for your prompt response. On Sat, 2010-05-22 at 10:53 -0700, Rose, Gregory V wrote: As of 2.6.34 the ixgbe driver does not support multiple queues for macvlan. Support for multiple queues for macvlan will come in a subsequent release. When might that happen? I will double

RE: ixgbe: macvlan on PF/VF when SRIOV is enabled

2010-05-25 Thread Shirley Ma
On Mon, 2010-05-24 at 10:54 -0700, Rose, Gregory V wrote: We look forward to it and will be happy to provide feedback. I have submitted the patch to make macvlan on the PF work when SRIOV is enabled. One thing you can do is allocate VFs and then load the VF driver in your host domain and then

Re: [PATCH net-next] ixgbe: make macvlan on PF working when SRIOV is enabled

2010-05-25 Thread Shirley Ma
To reproduce this problem: 1. modprobe ixgbe max_vfs=2 (eth4 is the PF, eth5 is a VF) 2. ip link set eth4 up 3. ip link add link eth4 address 54:52:00:35:e3:20 macvlan2 type macvlan 4. ip addr add 192.168.7.74/24 dev macvlan2 5. ping macvlan2 from a remote host, works 6. ip link set eth5 up 7. ping

Re: [RFC] defer skb allocation in virtio_net -- mergable buff part

2009-09-18 Thread Shirley Ma
Hello Michael, I am working on the patch to address the question you raised below. I am adding one more function -- destroy_buf in virtqueue_ops -- so we don't need to maintain a list of pending buffers in the upper layer (like virtio_net); when the device is shut down or removed, this buffer free

INFO: task journal:337 blocked for more than 120 seconds

2009-09-30 Thread Shirley Ma
Hello all, Has anybody found this problem before? I kept hitting this issue with a 2.6.31 guest kernel even with a simple network test. INFO: task kjournal:337 blocked for more than 120 seconds. echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message. kjournald D 0041 0

Re: INFO: task journal:337 blocked for more than 120 seconds

2009-10-01 Thread Shirley Ma
On Thu, 2009-10-01 at 10:20 -0300, Marcelo Tosatti wrote: I've hit this in the past with ext3, mounting with data=writeback made it disappear. Thanks. I will give it a try. Someone should fix this. Shirley

Re: INFO: task kjournal:337 blocked for more than 120 seconds

2009-10-01 Thread Shirley Ma
I talked to Mingming, and she suggested using a different I/O scheduler. The default scheduler is cfq; after I switched to noop, the problem was gone. So there seems to be a bug in the cfq scheduler. It's easily reproduced when running the guest kernel; so far I haven't hit this problem on the host side. If I

Re: INFO: task kjournal:337 blocked for more than 120 seconds

2009-10-01 Thread Shirley Ma
On Thu, 2009-10-01 at 16:03 -0500, Javier Guerra wrote: deadline is the most recommended one for virtualization hosts. some distros set it as default if you select Xen or KVM at installation time. (and noop for the guests) I spoke too early; after a while the noop scheduler hit the same issue.

Re: INFO: task kjournal:337 blocked for more than 120 seconds

2009-10-01 Thread Shirley Ma
Switching to a different scheduler doesn't make the problem go away. Shirley

Re: vhost-net patches

2009-10-22 Thread Shirley Ma
On Thu, 2009-10-22 at 15:13 +0200, Michael S. Tsirkin wrote: OK, I sent a patch that should fix the errors for you. Could you please confirm, preferably on-list, whether the patch makes the errors go away for you with userspace virtio? Confirmed, your patch has fixed irq handler mismatch

Re: vhost-net patches

2009-10-22 Thread Shirley Ma
On Thu, 2009-10-22 at 10:23 -0700, Shirley Ma wrote: Yes, agreed. One observation: when I enable PCI MSI in the guest kernel, I found that even without vhost support in the host kernel the network doesn't work either. So I think this is not related to vhost. I need to find out why PCI MSI doesn't

Re: vhost-net patches

2009-10-22 Thread Shirley Ma
On Thu, 2009-10-22 at 19:36 +0200, Michael S. Tsirkin wrote: Upstream is Avi's qemu-kvm.git? So, for a moment taking vhost out of the equation, it seems that MSI was broken in Avi's tree again, after I forked my tree? The upstream qemu git tree never worked for me with MSI; the boot hung

Re: vhost-net patches

2009-10-22 Thread Shirley Ma
On Thu, 2009-10-22 at 19:47 +0200, Michael S. Tsirkin wrote: What happens if you reset my tree to commit 47e465f031fc43c53ea8f08fa55cc3482c6435c8? I am going to clean up my upstream git tree and retest first. Then I will try backing up to this commit. Looks like there are 2 issues: - upstream

Re: vhost-net patches

2009-10-22 Thread Shirley Ma
On Thu, 2009-10-22 at 10:56 -0700, Shirley Ma wrote: On Thu, 2009-10-22 at 19:47 +0200, Michael S. Tsirkin wrote: What happens if you reset my tree to commit 47e465f031fc43c53ea8f08fa55cc3482c6435c8? I am going to clean up my upperstream git tree and retest first. Then I will try back up

Re: vhost-net patches

2009-10-23 Thread Shirley Ma
On Fri, 2009-10-23 at 13:04 +0200, Michael S. Tsirkin wrote: Sridhar, Shirley, Could you please test the following patch? It should fix a bug on 32 bit hosts - is this what you have? Yes, it's a 32-bit host. I checked out your recent git tree. Looks like the patch is already there, but vhost

Re: vhost-net patches

2009-10-23 Thread Shirley Ma
Hello Michael, Tested raw packet; it didn't work. Switching to a tap device, it works. The qemu command is: x86_64-softmmu/qemu-system-x86_64 -s /home/xma/images/fedora10-2-vm -m 512 -drive file=/home/xma/images/fedora10-2-vm,if=virtio,index=0,boot=on -net tap,ifname=vnet0,script=no,downscript=no

Re: vhost-net patches

2009-10-23 Thread Shirley Ma
Hello Michael, Some initial vhost netperf test results on my T61 laptop from the working tap device are here: latency has significantly decreased, but throughput from guest to host has a huge regression. I also hit a guest skb_xmit panic. netperf TCP_STREAM, default setup, 60 secs run guest-host

Re: vhost-net patches

2009-10-23 Thread Shirley Ma
Hello Michael, Some update. On Fri, 2009-10-23 at 08:12 -0700, Shirley Ma wrote: Tested raw packet, it didn't work; Tested option -net raw,ifname=eth0, attached to a real device; raw works to a remote node. I was expecting raw to work to the local host. Does this option -net raw,ifname=vnet0

Re: vhost-net patches

2009-10-26 Thread Shirley Ma
Hello Michael, On Mon, 2009-10-26 at 22:05 +0200, Michael S. Tsirkin wrote: Shirley, could you please test the following patch? With this patch, performance has gone up from 1xxx to 2xxx Mb/s, but there is still a performance gap compared to without vhost. It was 3xxx Mb/s before from guest to host

Re: vhost-net patches

2009-10-26 Thread Shirley Ma
Pulled your git tree, didn't see the panic. Thanks Shirley

Re: vhost-net patches

2009-10-26 Thread Shirley Ma
On Sun, 2009-10-25 at 11:11 +0200, Michael S. Tsirkin wrote: What is vnet0? That's a tap interface. I am binding a raw socket to a tap interface and it doesn't work. Is that supported? Thanks Shirley

Re: vhost-net patches

2009-10-27 Thread Shirley Ma
Hello Michael, On Tue, 2009-10-27 at 08:43 +0200, Michael S. Tsirkin wrote: At some point my guest had a runaway nash-hotplug process consuming 100% CPU. Could you please verify this does not happen to you? What I have found is that start_xmit stopped and restarted too often. There is no

Re: vhost-net patches

2009-10-27 Thread Shirley Ma
On Tue, 2009-10-27 at 08:38 +0200, Michael S. Tsirkin wrote: Yes but you need to make host send packets out to tap as well, somehow. One way to do this is to assign IP address in a separate subnet to tap in host and to eth device in guest. Thanks for the hint, I will give it a try. Shirley

Re: vhost-net patches

2009-10-28 Thread Shirley Ma
On Tue, 2009-10-27 at 22:58 +0200, Michael S. Tsirkin wrote: How large is large here? I usually allocate 1G. I used to have 512; for this run I allocated 1G. I do see performance improve to 3xxx Mb/s, and occasionally reach 40xx Mb/s. This is the same as userspace, isn't it? A little bit

Re: vhost-net patches

2009-10-28 Thread Shirley Ma
Hello Michael, On Wed, 2009-10-28 at 17:39 +0200, Michael S. Tsirkin wrote: Here's another hack to try. It will break raw sockets, but just as a test: This patch looks better than the previous one for guest-to-host TCP_STREAM performance. The transmit-queue-full condition still exists, but TCP_STREAM

Re: vhost-net patches

2009-10-28 Thread Shirley Ma
Hello Michael, On Wed, 2009-10-28 at 18:53 +0200, Michael S. Tsirkin wrote: what exactly do you mean by transmission queue size? tx_queue_len? I think what should help with transmission queue full is actually the sndbuf parameter for tap in qemu. I didn't see my email go out, so I am resending the response

Re: vhost-net patches

2009-10-28 Thread Shirley Ma
Hello Arnd, On Wed, 2009-10-28 at 18:46 +0100, Arnd Bergmann wrote: You can probably connect it like this: qemu - vhost_net - vnet0 == /dev/tun - qemu To connect two guests. I've also used a bidirectional pipe before, to connect two tap interfaces to each other. However, if you want to

Re: vhost-net patches

2009-10-28 Thread Shirley Ma
Hello Michael, While testing the deferred skb allocation patch, I found this problem. Simply removing and reloading the guest virtio_net module causes the guest to exit with errors. It is easy to reproduce: [r...@localhost ~]# rmmod virtio_net [r...@localhost ~]# modprobe virtio_net

Re: vhost-net patches

2009-10-29 Thread Shirley Ma
Hello Michael, I am able to get 63xx Mb/s throughput with 10% less cpu utilization when I apply the deferred skb allocation patch on top of your most recent vhost patch. The userspace TCP_STREAM BW used to be 3xxx Mb/s from the upstream git tree. After applying your recent vhost patch, it goes up to 53xx Mb/s.

RE: vhost-net patches

2009-11-03 Thread Shirley Ma
Hello Xiaohui, On Tue, 2009-11-03 at 09:06 +0800, Xin, Xiaohui wrote: Hi, Michael, What's your deferring skb allocation patch mentioned here, may you elaborate it a little more detailed? That's my patch. It was submitted a few months ago. Here is the link to this RFC patch:

[PATCH 0/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-19 Thread Shirley Ma
Guest virtio_net receives packets from its pre-allocated vring buffers, then delivers these packets to upper-layer protocols as skbs. So it's not necessary to pre-allocate an skb for each mergeable buffer and then free it when it goes unused. This patch defers skb allocation when
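A minimal sketch of the receive-side idea being described: the ring is filled with bare pages, and the skb is allocated only once a packet has actually arrived, copying the first bytes and attaching the rest of the page as a fragment. The helper name, the copy length, and the omission of virtio_net header handling are assumptions, not the patch's actual code:

	/* Hedged sketch of deferred skb allocation on the receive path. */
	static struct sk_buff *receive_page_sketch(struct net_device *dev,
						   struct page *page, unsigned int len)
	{
		struct sk_buff *skb;
		unsigned int copy = min_t(unsigned int, len, 128);

		skb = netdev_alloc_skb(dev, copy + NET_IP_ALIGN);
		if (!skb) {
			__free_pages(page, 0);		/* no skb: drop this buffer */
			return NULL;
		}
		skb_reserve(skb, NET_IP_ALIGN);
		memcpy(skb_put(skb, copy), page_address(page), copy);
		len -= copy;

		if (len) {				/* hang the remainder off as a frag */
			skb_frag_t *f = &skb_shinfo(skb)->frags[0];
			f->page = page;
			f->page_offset = copy;
			f->size = len;
			skb_shinfo(skb)->nr_frags = 1;
			skb->data_len += len;
			skb->len += len;
		} else {
			__free_pages(page, 0);		/* fully copied, page not needed */
		}
		return skb;
	}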

[PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-19 Thread Shirley Ma
This patch is generated against the 2.6 git tree. I didn't break up this patch since it implements a single piece of functionality. Please review it. Thanks Shirley Signed-off-by: Shirley Ma x...@us.ibm.com -- diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index b9e002f..6fb788b 100644

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-20 Thread Shirley Ma
On Fri, 2009-11-20 at 07:19 +0100, Eric Dumazet wrote: +void virtio_free_pages(void *buf) +{ + struct page *page = (struct page *)buf; + + while (page) { + __free_pages(page, 0); + page = (struct page *)page->private; + } +} + Interesting

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-20 Thread Shirley Ma
On Fri, 2009-11-20 at 07:19 +0100, Eric Dumazet wrote: Interesting use after free :) Thanks for catching the stupid mistake. This is the updated patch for review. Signed-off-by: Shirley Ma (x...@us.ibm.com) -- diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index b9e002f
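The use-after-free Eric points at is that page->private is read after __free_pages() has already released the page; a corrected sketch saves the link first (illustration only, not the updated patch itself):

	/* Hedged sketch of the fixed loop: grab the chain pointer before the
	 * page is freed, then move on to the next page. */
	static void virtio_free_pages_fixed(void *buf)
	{
		struct page *page = buf;
		struct page *next;

		while (page) {
			next = (struct page *)page->private;	/* read link first */
			__free_pages(page, 0);
			page = next;
		}
	}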

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-23 Thread Shirley Ma
On Mon, 2009-11-23 at 11:38 +1030, Rusty Russell wrote: How about: struct page *end; /* Find end of list, sew whole thing into vi->pages. */ for (end = page; end->private; end = (struct page *)end->private); end->private = (unsigned long)vi->pages;
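Rusty's suggestion written out as a small helper, a sketch of the give-pages-back idea; the member names follow the thread and may differ from the merged driver:

	/* Hedged sketch: splice a whole chain of pages, linked through
	 * page->private, onto the driver's vi->pages free pool. */
	static void give_pages_sketch(struct virtnet_info *vi, struct page *page)
	{
		struct page *end;

		/* Find end of list, sew whole thing into vi->pages. */
		for (end = page; end->private; end = (struct page *)end->private)
			;
		end->private = (unsigned long)vi->pages;
		vi->pages = page;
	}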

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-23 Thread Shirley Ma
On Mon, 2009-11-23 at 11:43 +0200, Michael S. Tsirkin wrote: should be !npage->private and nesting is too deep here: this is cleaner in a give_a_page subroutine as it was. This will be addressed with Rusty's comment. + /* use private to chain big packets */ packets? or pages?

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-23 Thread Shirley Ma
Hello Rusty, On Mon, 2009-11-23 at 11:38 +1030, Rusty Russell wrote: Overall, the patch looks good. But it would have been nicer if it were split into several parts: cleanups, new infrastructure, then the actual allocation change. I have split the patch into a set: cleanups, new

Re: [PATCH 1/1] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-11-23 Thread Shirley Ma
On Tue, 2009-11-24 at 08:54 +1030, Rusty Russell wrote: #define BIG_PACKET_PAD (NET_SKB_PAD - sizeof(struct virtio_net_hdr) + NET_IP_ALIGN) struct big_packet_page { struct virtio_net_hdr hdr; char pad[BIG_PACKET_PAD]; /* Actual packet data starts here */

[PATCH v2 0/4] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-12-11 Thread Shirley Ma
This is a patch-set for deferring skb allocation based on Rusty and Michael's inputs. Guest virtio_net receives packets from its pre-allocated vring buffers, then delivers these packets to upper-layer protocols as skbs. So it's not necessary to pre-allocate an skb for each mergeable buffer,

[PATCH v2 1/4] Defer skb allocation -- add destroy buffers function for virtio

2009-12-11 Thread Shirley Ma
Signed-off-by: x...@us.ibm.com - diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index c708ecc..bb5eb7b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -107,6 +107,16 @@ static struct page *get_a_page(struct virtnet_info *vi, gfp_t gfp_mask)

[PATCH v2 2/4] Defer skb allocation -- new skb_set calls chain pages in virtio_net

2009-12-11 Thread Shirley Ma
Signed-off-by: Shirley Ma x...@us.ibm.com -- diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index bb5eb7b..100b4b9 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -80,29 +80,25 @@ static inline struct skb_vnet_hdr *skb_vnet_hdr(struct sk_buff

[PATCH v2 3/4] Defer skb allocation -- new recvbuf alloc receive calls

2009-12-11 Thread Shirley Ma
Signed-off-by: Shirley Ma x...@us.ibm.com - diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 100b4b9..dde8060 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -203,6 +203,73 @@ static struct sk_buff *skb_goodcopy(struct virtnet_info *vi

[PATCH v2 4/4] Defer skb allocation -- change allocation receiving in recv path

2009-12-11 Thread Shirley Ma
Signed-off-by: Shirley Ma x...@us.ibm.com - diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index dde8060..b919169 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -270,99 +270,44 @@ static struct sk_buff *receive_mergeable(struct virtnet_info

Re: [PATCHv9 3/3] vhost_net: a kernel-level virtio server

2009-12-11 Thread Shirley Ma
On Sun, 2009-11-22 at 12:35 +0200, Michael S. Tsirkin wrote: These results were sent by Shirley Ma (Cc'd). I think they were with tap, host-to-guest/guest-to-host Yes, you are right. Shirley

Re: [PATCH v2 0/4] Defer skb allocation for both mergeable buffers and big packets in virtio_net

2009-12-14 Thread Shirley Ma
On Sun, 2009-12-13 at 12:19 +0200, Michael S. Tsirkin wrote: Shirley, some advice on packaging patches that I hope will be helpful: You did try to split up the patch logically, and it's better than a single huge patch, but it can be better. For example, you add static functions in one

Re: [PATCH v2 1/4] Defer skb allocation -- add destroy buffers function for virtio

2009-12-14 Thread Shirley Ma
Hello Michael, I agree with the comments (I will have two patches instead of 4 based on Rusty's comments) except the one below. On Sun, 2009-12-13 at 12:26 +0200, Michael S. Tsirkin wrote: That said - do we have to use a callback? I think destroy_buf which returns data pointer, and which we call

Re: [PATCH v2 2/4] Defer skb allocation -- new skb_set calls chain pages in virtio_net

2009-12-14 Thread Shirley Ma
On Sun, 2009-12-13 at 13:24 +0200, Michael S. Tsirkin wrote: Hmm, this scans the whole list each time. OTOH, the caller probably can easily get list tail as well as head. If we ask caller to give us list tail, and chain them at head, then 1. we won't have to scan the list each time 2. we get

Re: [PATCH v2 3/4] Defer skb allocation -- new recvbuf alloc receive calls

2009-12-14 Thread Shirley Ma
On Sun, 2009-12-13 at 13:43 +0200, Michael S. Tsirkin wrote: Interesting. I think skb_goodcopy will sometimes set *page to NULL. Will the above crash then? Nope, when *page is NULL, *len is 0. don't put empty line here. if below is part of same logical block as skb_goodcopy. Ok. Local

Re: [PATCH v2 1/4] Defer skb allocation -- add destroy buffers function for virtio

2009-12-14 Thread Shirley Ma
Thanks Rusty. I agree with all these comments, will work on them. Shirley

Re: [PATCH v2 2/4] Defer skb allocation -- new skb_set calls chain pages in virtio_net

2009-12-14 Thread Shirley Ma
Thanks Rusty, agree with you, working on them. Shirley

Re: [PATCH v2 1/4] Defer skb allocation -- add destroy buffers function for virtio

2009-12-14 Thread Shirley Ma
Hello Michael, On Mon, 2009-12-14 at 22:22 +0200, Michael S. Tsirkin wrote: I don't insist, but my idea was for (;;) { b = vq->destroy(vq); if (!b) break; --vi->num; put_page(b); } so we do not have to lose track of the counter. That's

Re: [PATCH v2 3/4] Defer skb allocation -- new recvbuf alloc receive calls

2009-12-14 Thread Shirley Ma
Hello Michael, On Mon, 2009-12-14 at 14:08 -0800, Shirley Ma wrote: + + err = vi->rvq->vq_ops->add_buf(vi->rvq, sg, 0, 2, skb); + if (err < 0) + kfree_skb(skb); + else + skb_queue_head(vi->recv, skb); So why are we queueing this still

Re: [PATCH v2 4/4] Defer skb allocation -- change allocation receiving in recv path

2009-12-15 Thread Shirley Ma
On Sun, 2009-12-13 at 13:08 +0200, Michael S. Tsirkin wrote: Do not cast away void*. This initialization above looks very strange: in fact only one of skb, page makes sense. So I think you should either get rid of both page and skb variables (routines such as give_pages get page * so they

[Fwd: Re: [PATCH v2 1/4] Defer skb allocation -- add destroy buffers function for virtio]

2009-12-15 Thread Shirley Ma
Sorry, forgot to CC all. Thanks Shirley ---BeginMessage--- On Tue, Dec 15, 2009 at 07:59:42AM -0800, Shirley Ma wrote: Hello Michael, On Tue, 2009-12-15 at 12:57 +0200, Michael S. Tsirkin wrote: No, this code would be in virtio net. destroy would simply be the virtqueue API that returns

Re: [PATCH v2 3/4] Defer skb allocation -- new recvbuf alloc receive calls

2009-12-15 Thread Shirley Ma
On Tue, 2009-12-15 at 13:33 +0200, Michael S. Tsirkin wrote: So what I would suggest is, have function that just copies part of skb, and have caller open-code allocating the skb and free up pages as necessary. Yes, the updated patch has changed the function. What I am asking is why do we add

[RFC PATCH] Subject: virtio: Add unused buffers detach from vring

2009-12-15 Thread Shirley Ma
There's currently no way for a virtio driver to ask for unused buffers, so it has to keep a list itself to reclaim them at shutdown. This is redundant, since virtio_ring stores that information. So add a new hook to do this: virtio_net will be the first user. Signed-off-by: Shirley Ma x...@us.ibm.com --- drivers/virtio/virtio_ring.c | 24

Re: [RFC PATCH] Subject: virtio: Add unused buffers detach from vring

2009-12-15 Thread Shirley Ma
Thanks Michael, will fix them all. Shirley

Re: [RFC PATCH] Subject: virtio: Add unused buffers detach from vring

2009-12-15 Thread Shirley Ma
Hello Michael, On Tue, 2009-12-15 at 20:47 +0200, Michael S. Tsirkin wrote: + detach_buf(vq, i); + END_USE(vq); + return vq->data[i]; In fact, this will return NULL always, won't it? Nope, I changed the destroy to detach and

[PATCH net-next 0/2] Defer skb allocation in virtio_net recv

2009-12-17 Thread Shirley Ma
This patch-set defers virtio_net skb allocation in the receive path for both big packets and mergeable buffers. It reduces skb pre-allocations and skb frees. This patch-set also adds a new API, detach_unused_bufs, in virtio. The recv skb queue has been removed in virtio_net. It is based on previous

[PATCH 1/2] virtio: Add detach unused buffer from vring

2009-12-17 Thread Shirley Ma
There's currently no way for a virtio driver to ask for unused buffers, so it has to keep a list itself to reclaim them at shutdown. This is redundant, since virtio_ring stores that information. So add a new hook to do this: virtio_net will be the first user. Signed-off-by: Shirley Ma x
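A sketch of how virtio_net would use such a hook at remove time, draining whatever the receive ring still holds instead of keeping its own list; the op and field names follow the thread (and the give_pages sketch earlier) and are assumptions about the final API:

	/* Hedged sketch: drain unused receive buffers via the new hook. */
	static void free_unused_bufs_sketch(struct virtnet_info *vi)
	{
		void *buf;

		while ((buf = vi->rvq->vq_ops->detach_unused_buf(vi->rvq)) != NULL) {
			if (vi->mergeable_rx_bufs || vi->big_packets)
				give_pages_sketch(vi, buf);	/* buf is a page (chain) */
			else
				dev_kfree_skb(buf);		/* buf is a pre-allocated skb */
			--vi->num;				/* one fewer outstanding buffer */
		}
	}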

[PATCH 2/2] virtio_net: Defer skb allocation in receive path

2009-12-17 Thread Shirley Ma
This patch defers skb allocation in receiving packets for both big packets and mergeable buffers to reduce skb pre-allocations and skb frees. It frees unused buffers by calling detach_unused_buf in vring, so the recv skb queue is not needed. Signed-off-by: Shirley Ma x...@us.ibm.com --- drivers/net/virtio_net.c | 426

Re: [PATCH net-next 0/2] Defer skb allocation in virtio_net recv

2009-12-17 Thread Shirley Ma
The send skb queue can also be reduced with the detach_unused_buf API, but that is not part of this patch. Shirley

Re: Network performance with small packets

2011-01-27 Thread Shirley Ma
On Wed, 2011-01-26 at 17:17 +0200, Michael S. Tsirkin wrote: I am seeing a similar problem, and am trying to fix that. My current theory is that this is a variant of a receive livelock: if the application isn't fast enough to process incoming data, the guest net stack switches from prequeue

Re: Network performance with small packets

2011-01-27 Thread Shirley Ma
On Thu, 2011-01-27 at 21:00 +0200, Michael S. Tsirkin wrote: Interesting. In particular running vhost and the transmitting guest on the same host would have the effect of slowing down TX. Does it double the BW for you too? Running vhost and the TX guest on the same host does not seem good enough to

Re: Network performance with small packets

2011-01-27 Thread Shirley Ma
On Thu, 2011-01-27 at 21:31 +0200, Michael S. Tsirkin wrote: Well slowing down the guest does not sound hard - for example we can request guest notifications, or send extra interrupts :) A slightly more sophisticated thing to try is to poll the vq a bit more aggressively. For example if we

Re: Network performance with small packets

2011-01-27 Thread Shirley Ma
On Thu, 2011-01-27 at 22:05 +0200, Michael S. Tsirkin wrote: Interesting. Could this be a variant of the now famous bufferbloat then? I guess we could drop some packets if we see we are not keeping up. For example if we see that the ring is X% full, we could quickly complete Y%

Re: Network performance with small packets

2011-01-27 Thread Shirley Ma
On Thu, 2011-01-27 at 13:02 -0800, David Miller wrote: Interesting. Could this be a variant of the now famous bufferbloat then? Sigh, bufferbloat is the new global warming... :-/ Yep, some places become colder, other places become warmer; same as the BW results, sometimes faster,

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 22:17 +0200, Michael S. Tsirkin wrote: On Tue, Feb 01, 2011 at 12:09:03PM -0800, Shirley Ma wrote: On Tue, 2011-02-01 at 19:23 +0200, Michael S. Tsirkin wrote: On Thu, Jan 27, 2011 at 01:30:38PM -0800, Shirley Ma wrote: On Thu, 2011-01-27 at 13:02 -0800, David

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Mon, 2011-01-31 at 17:30 -0800, Sridhar Samudrala wrote: Yes. It definitely should be 'out'. 'in' should be 0 in the tx path. I tried a simpler version of this patch without any tunables by delaying the signaling until we come out of the for loop. It definitely reduced the number of

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 23:21 +0200, Michael S. Tsirkin wrote: Confused. We compare capacity to skb frags, no? That's sg I think ... The current guest kernel uses indirect buffers; num_free returns how many descriptors are available, not skb frags. So it's wrong here. Shirley

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 23:24 +0200, Michael S. Tsirkin wrote: My theory is that the issue is not signalling. Rather, our queue fills up, then host handles one packet and sends an interrupt, and we immediately wake the queue. So the vq once it gets full, stays full. From the printk debugging

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 23:42 +0200, Michael S. Tsirkin wrote: On Tue, Feb 01, 2011 at 01:32:35PM -0800, Shirley Ma wrote: On Tue, 2011-02-01 at 23:24 +0200, Michael S. Tsirkin wrote: My theory is that the issue is not signalling. Rather, our queue fills up, then host handles one packet

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 23:56 +0200, Michael S. Tsirkin wrote: There are flags for bytes, buffers and packets. Try playing with any one of them :) Just be sure to use v2. I would like to change it to half of the ring size instead for signaling. Is that OK? Shirley Sure that

Re: [PATCHv2 dontapply] vhost-net tx tuning

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 15:07 -0800, Sridhar Samudrala wrote: I think the counters that exceed the limits need to be reset to 0 here. Otherwise we keep signaling for every buffer once we hit this condition. I will modify the patch and rerun the test to see the difference. Shirley

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Wed, 2011-02-02 at 06:40 +0200, Michael S. Tsirkin wrote: Just tweak the parameters with sysfs, you do not have to edit the code: echo 64 > /sys/module/vhost_net/parameters/tx_bufs_coalesce Or in a similar way for tx_packets_coalesce (since we use indirect, packets will typically use 1
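Knobs like these show up under /sys/module/vhost_net/parameters/ simply because they are writable module parameters; a sketch of how they would be declared (default values are assumptions about Michael's experimental patch):

	/* Hedged sketch: tunables for coalescing TX completion signals. */
	static int tx_bufs_coalesce = 64;
	module_param(tx_bufs_coalesce, int, 0644);
	MODULE_PARM_DESC(tx_bufs_coalesce,
			 "Used TX buffers to accumulate before signalling the guest");

	static int tx_packets_coalesce = 1;
	module_param(tx_packets_coalesce, int, 0644);
	MODULE_PARM_DESC(tx_packets_coalesce,
			 "TX packets to accumulate before signalling the guest");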

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 22:05 -0800, Shirley Ma wrote: The way I am changing it: only when the netif queue has stopped do we start to count num_free descriptors to decide when to signal and wake the netif queue. I forgot to mention, the code change I am making is in the guest kernel, in the xmit callback only

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Wed, 2011-02-02 at 12:04 +0530, Krishna Kumar2 wrote: On Tue, 2011-02-01 at 22:05 -0800, Shirley Ma wrote: The way I am changing is only when netif queue has stopped, then we start to count num_free descriptors to send the signal to wake netif queue. I forgot to mention

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Wed, 2011-02-02 at 08:29 +0200, Michael S. Tsirkin wrote: On Tue, Feb 01, 2011 at 10:19:09PM -0800, Shirley Ma wrote: On Tue, 2011-02-01 at 22:05 -0800, Shirley Ma wrote: The way I am changing is only when netif queue has stopped, then we start to count num_free descriptors

Re: Network performance with small packets

2011-02-01 Thread Shirley Ma
On Tue, 2011-02-01 at 23:14 -0800, Shirley Ma wrote: w/i guest change, I played around with the parameters, for example: I could get 3.7Gb/s with 42% CPU (BW increasing from 2.5Gb/s) for 1K message size; w/i dropping packets, I was able to get up to 6.2Gb/s with similar CPU usage. I meant w/o guest

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 12:48 +0200, Michael S. Tsirkin wrote: Yes, I think doing this in the host is much simpler: just send an interrupt after there's a decent amount of space in the queue. Having said that, the simple heuristic that I coded might be a bit too simple. From the debugging out
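A sketch of the host-side heuristic under discussion: defer the guest interrupt until a decent share of the ring has been reclaimed, rather than signalling per used buffer. The pending_used counter and the half-ring threshold are illustrative assumptions, not Michael's actual patch:

	/* Hedged sketch of coalesced TX completion signalling in vhost. */
	static void vhost_tx_signal_coalesced(struct vhost_dev *dev,
					      struct vhost_virtqueue *vq,
					      unsigned int freed)
	{
		vq->pending_used += freed;		/* hypothetical per-vq counter */
		if (vq->pending_used >= vq->num / 2) {	/* "decent amount of space" */
			vhost_signal(dev, vq);		/* raise the guest interrupt */
			vq->pending_used = 0;
		}
	}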

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 12:49 +0200, Michael S. Tsirkin wrote: On Tue, Feb 01, 2011 at 11:33:49PM -0800, Shirley Ma wrote: On Tue, 2011-02-01 at 23:14 -0800, Shirley Ma wrote: w/i guest change, I played around the parameters,for example: I could get 3.7Gb/s with 42% CPU BW increasing from

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 17:47 +0200, Michael S. Tsirkin wrote: On Wed, Feb 02, 2011 at 07:39:45AM -0800, Shirley Ma wrote: On Wed, 2011-02-02 at 12:48 +0200, Michael S. Tsirkin wrote: Yes, I think doing this in the host is much simpler, just send an interrupt after there's a decent amount

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 17:48 +0200, Michael S. Tsirkin wrote: And this is with sndbuf=0 in host, yes? And do you see a lot of tx interrupts? How many packets per interrupt? Nope, sndbuf doesn't matter since I never hit the sock wmem limit condition in vhost. I am still playing around, let me know

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 19:32 +0200, Michael S. Tsirkin wrote: OK, but this should have no effect with a vhost patch which should ensure that we don't get an interrupt until the queue is at least half empty. Right? There should be some coordination between guest and vhost. We shouldn't count

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 20:20 +0200, Michael S. Tsirkin wrote: How many packets and bytes per interrupt are sent? Also, what about other values for the counters and other counters? What does your patch do? Just drop packets instead of stopping the interface? To have an understanding when

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 20:27 +0200, Michael S. Tsirkin wrote: On Wed, Feb 02, 2011 at 10:11:51AM -0800, Shirley Ma wrote: On Wed, 2011-02-02 at 19:32 +0200, Michael S. Tsirkin wrote: OK, but this should have no effect with a vhost patch which should ensure that we don't get an interrupt

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 22:17 +0200, Michael S. Tsirkin wrote: Well, this is also the only case where the queue is stopped, no? Yes. I got some debugging data; I saw that sometimes there were so many packets waiting to be freed in the guest between vhost_signal and the guest xmit callback. Looks like the

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 23:20 +0200, Michael S. Tsirkin wrote: On Wed, 2011-02-02 at 22:17 +0200, Michael S. Tsirkin wrote: Well, this is also the only case where the queue is stopped, no? Yes. I got some debugging data; I saw that sometimes there were so many packets waiting to be freed

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Wed, 2011-02-02 at 23:20 +0200, Michael S. Tsirkin wrote: I think I need to define the test matrix to collect data for TX xmit from guest to host here for different tests. Data to be collected: - 1. kvm_stat for VM, I/O exits 2. cpu utilization for both guest

Re: Network performance with small packets

2011-02-02 Thread Shirley Ma
On Thu, 2011-02-03 at 07:59 +0200, Michael S. Tsirkin wrote: Let's look at the sequence here: guest start_xmit() xmit_skb() if ring is full, enable_cb() guest skb_xmit_done() disable_cb, printk free_old_xmit_skbs -- it was between more
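The sequence being walked through, sketched as simplified guest-side code: reclaim completed buffers, queue the new skb, and stop the netif queue when capacity runs out, re-enabling the callback so skb_xmit_done() can wake it. Helper names marked _sketch are hypothetical and the flow is simplified; treat it as an illustration of the race being discussed, not the driver's literal code:

	/* Hedged sketch of the guest TX path and its stop/wake behaviour. */
	static int start_xmit_sketch(struct sk_buff *skb, struct net_device *dev)
	{
		struct virtnet_info *vi = netdev_priv(dev);
		int capacity;

		free_old_xmit_skbs_sketch(vi);		/* reclaim completed TX buffers */
		capacity = xmit_skb_sketch(vi, skb);	/* add_buf; returns free slots */
		virtqueue_kick(vi->svq);

		if (capacity < 2 + MAX_SKB_FRAGS) {
			netif_stop_queue(dev);		/* ring is (nearly) full */
			if (unlikely(!virtqueue_enable_cb(vi->svq))) {
				/* buffers were already freed in the meantime:
				 * disable the callback again and keep going */
				virtqueue_disable_cb(vi->svq);
				netif_start_queue(dev);
			}
		}
		return NETDEV_TX_OK;
	}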

Re: Network performance with small packets

2011-02-03 Thread Shirley Ma
On Thu, 2011-02-03 at 08:13 +0200, Michael S. Tsirkin wrote: Initial TCP_STREAM performance results I got for guest to local host: 4.2Gb/s for 1K message size (vs. 2.5Gb/s), 6.2Gb/s for 2K message size (vs. 3.8Gb/s), and 9.8Gb/s for 4K message size (vs. 5.xGb/s). What is the average

Re: Network performance with small packets

2011-02-03 Thread Shirley Ma
On Thu, 2011-02-03 at 18:20 +0200, Michael S. Tsirkin wrote: Just a thought: does it help to make the tx queue len of the virtio device smaller? Yes, that's what I did before; reducing txqueuelen causes qdisc to drop packets early. But it's hard to tune performance by using tx queuelen

Re: [RFC PATCH V2 0/5] macvtap TX zero copy between guest and host kernel

2011-02-14 Thread Shirley Ma
On Mon, 2011-02-14 at 15:09 +0200, Michael S. Tsirkin wrote: What's the status here? Since there are core net changes, we'll need to see the final version soon if it's to appear in 2.6.39. I am updating the patch and retesting it for the new kernel. I am trying to understand why zero copy

Re: Network performance with small packets

2011-03-08 Thread Shirley Ma
On Wed, 2011-02-09 at 11:07 +1030, Rusty Russell wrote: I've finally read this thread... I think we need to get more serious with our stats gathering to diagnose these kinds of performance issues. This is a start; it should tell us what is actually happening to the virtio ring(s) without

Re: Network performance with small packets

2011-03-09 Thread Shirley Ma
On Tue, 2011-03-08 at 20:21 -0600, Andrew Theurer wrote: Tom L has started using Rusty's patches and found some interesting results, sent yesterday: http://marc.info/?l=kvm&m=129953710930124&w=2 Thanks. A very good experiment. I have been struggling with guest/vhost optimization work for a

Re: Network performance with small packets - continued

2011-03-09 Thread Shirley Ma
On Wed, 2011-03-09 at 09:15 +0200, Michael S. Tsirkin wrote: diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 82dba5a..ebe3337 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -514,11 +514,11 @@ static unsigned int free_old_xmit_skbs(struct

Re: Network performance with small packets - continued

2011-03-09 Thread Shirley Ma
On Wed, 2011-03-09 at 10:09 -0600, Tom Lendacky wrote: This spread out the kick_notify but still resulted in a lot of them. I decided to build on the delayed Tx buffer freeing and code up an ethtool-like coalescing patch in order to delay the kick_notify until there were at least
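A sketch of the coalescing idea Tom describes, seen from the guest side: batch TX submissions and only kick the host once a threshold of queued buffers has accumulated. The counter field and the threshold value are illustrative assumptions, not Tom's patch:

	/* Hedged sketch of delaying kick_notify until a batch has been queued. */
	static void xmit_kick_coalesced_sketch(struct virtnet_info *vi)
	{
		if (++vi->pending_kicks >= 16) {	/* hypothetical coalescing knob */
			virtqueue_kick(vi->svq);	/* one notify for the whole batch */
			vi->pending_kicks = 0;
		}
	}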
