Re: virtio_net: BQL?
On Mon, May 24, 2021 at 10:53:08AM +0800, Jason Wang wrote:
> On 2021/5/18 5:48 AM, Dave Taht wrote:
> > On Mon, May 17, 2021 at 1:23 PM Willem de Bruijn wrote:
> > > On Mon, May 17, 2021 at 2:44 PM Dave Taht wrote:
> > > > Not really related to this patch, but is there some reason why virtio
> > > > has no support for BQL?
> > >
> > > There have been a few attempts to add it over the years.
> > >
> > > Most recently,
> > > https://lore.kernel.org/lkml/20181205225323.12555-2-...@redhat.com/
> > >
> > > That thread has a long discussion. I think the key open issue remains
> > >
> > > "The tricky part is the mode switching between napi and no napi."
> >
> > Oy, vey.
> >
> > I didn't pay any attention to that discussion, sadly enough.
> >
> > It's been about that long (2018) since I paid any attention to
> > bufferbloat in the cloud, and my cloud provider (Linode) switched to
> > using virtio when I wasn't looking. For over a year now, I'd been
> > getting reports that Comcast's PIE rollout wasn't working as well as
> > expected, that evenroute's implementation of sch_cake and SQM on
> > inbound wasn't working right, nor pfSense's, plus numerous other
> > issues at Internet scale.
> >
> > Last week I ran a string of benchmarks against Starlink's new service
> > and was really aghast at what I found there, too, but the problem
> > seemed deeper than just the dishy...
> >
> > Without BQL, there's no backpressure for fq_codel to do its thing.
> > None. My measurement servers aren't fq_codel-ing no matter how much
> > load I put on them. Since that qdisc is now the default in most Linux
> > distributions, I imagine that the bulk of the cloud is now behaving as
> > erratically as Linux was in 2011, with enormous swings in throughput
> > and latency from GSO/TSO hitting overlarge rx/tx rings [1], breaking
> > various rate estimators in codel, pie, and the TCP stack itself.
> >
> > See:
> > http://fremont.starlink.taht.net/~d/virtio_nobql/rrul_-_evenroute_v3_server_fq_codel.png
> >
> > See the swings in latency there? That's symptomatic of tx/rx rings
> > filling and emptying.
> >
> > It wasn't until I switched my measurement server temporarily over to
> > sch_fq that I got an rrul result close to the results we used to get
> > from the virtualized e1000e drivers we were using in 2014.
> >
> > http://fremont.starlink.taht.net/~d/virtio_nobql/rrul_-_evenroute_v3_server_fq.png
> >
> > While I have long supported the use of sch_fq for TCP-heavy workloads,
> > it still behaves better with BQL in place, and fq_codel is better for
> > generic workloads... but needs BQL-based backpressure to kick in.
> >
> > [1] I really hope I'm overreacting but, um, er, could someone spin up
> > a new patch that does BQL in some way even half right for this driver
> > and help test it? I haven't built a kernel in a while.
>
> I think it's time to obsolete skb_orphan() for virtio-net to get rid of
> a bunch of tricky code in the current virtio-net driver.
>
> Then we can do BQL on top.
>
> I will prepare some patches to do this (probably with Michael's BQL
> patch).
>
> Thanks

First step would be to fix up and test the BQL part. IIRC it didn't seem
to help performance in our benchmarking, and Eric seems to say that's
expected ...

> > > > On Mon, May 17, 2021 at 11:41 AM Xianting Tian wrote:
> > > > > BUG_ON() uses unlikely in if(), which can be optimized at
> > > > > compile time.
> > > > >
> > > > > Signed-off-by: Xianting Tian
> > > > > ---
> > > > >  drivers/net/virtio_net.c | 5 ++---
> > > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index c921ebf3ae82..212d52204884 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -1646,10 +1646,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> > > > >  	else
> > > > >  		hdr = skb_vnet_hdr(skb);
> > > > >
> > > > > -	if (virtio_net_hdr_from_skb(skb, &hdr->hdr,
> > > > > +	BUG_ON(virtio_net_hdr_from_skb(skb, &hdr->hdr,
> > > > >  				    virtio_is_little_endian(vi->vdev), false,
> > > > > -				    0))
> > > > > -		BUG();
> > > > > +				    0));
> > > > >
> > > > >  	if (vi->mergeable_rx_bufs)
> > > > >  		hdr->num_buffers = 0;
> > > > > --
> > > > > 2.17.1
> > > >
> > > > --
> > > > Latest Podcast:
> > > > https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
> > > >
> > > > Dave Täht CTO, TekLibre, LLC
> >
> > --
> > Latest Podcast:
> > https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
> >
> > Dave Täht CTO, TekLibre, LLC

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: virtio_net: BQL?
On 2021/5/18 5:48 AM, Dave Taht wrote:
> On Mon, May 17, 2021 at 1:23 PM Willem de Bruijn wrote:
> > On Mon, May 17, 2021 at 2:44 PM Dave Taht wrote:
> > > Not really related to this patch, but is there some reason why virtio
> > > has no support for BQL?
> >
> > There have been a few attempts to add it over the years.
> >
> > Most recently,
> > https://lore.kernel.org/lkml/20181205225323.12555-2-...@redhat.com/
> >
> > That thread has a long discussion. I think the key open issue remains
> >
> > "The tricky part is the mode switching between napi and no napi."
>
> Oy, vey.
>
> I didn't pay any attention to that discussion, sadly enough.
>
> It's been about that long (2018) since I paid any attention to
> bufferbloat in the cloud, and my cloud provider (Linode) switched to
> using virtio when I wasn't looking. For over a year now, I'd been
> getting reports that Comcast's PIE rollout wasn't working as well as
> expected, that evenroute's implementation of sch_cake and SQM on
> inbound wasn't working right, nor pfSense's, plus numerous other
> issues at Internet scale.
>
> Last week I ran a string of benchmarks against Starlink's new service
> and was really aghast at what I found there, too, but the problem
> seemed deeper than just the dishy...
>
> Without BQL, there's no backpressure for fq_codel to do its thing.
> None. My measurement servers aren't fq_codel-ing no matter how much
> load I put on them. Since that qdisc is now the default in most Linux
> distributions, I imagine that the bulk of the cloud is now behaving as
> erratically as Linux was in 2011, with enormous swings in throughput
> and latency from GSO/TSO hitting overlarge rx/tx rings [1], breaking
> various rate estimators in codel, pie, and the TCP stack itself.
>
> See:
> http://fremont.starlink.taht.net/~d/virtio_nobql/rrul_-_evenroute_v3_server_fq_codel.png
>
> See the swings in latency there? That's symptomatic of tx/rx rings
> filling and emptying.
>
> It wasn't until I switched my measurement server temporarily over to
> sch_fq that I got an rrul result close to the results we used to get
> from the virtualized e1000e drivers we were using in 2014.
>
> http://fremont.starlink.taht.net/~d/virtio_nobql/rrul_-_evenroute_v3_server_fq.png
>
> While I have long supported the use of sch_fq for TCP-heavy workloads,
> it still behaves better with BQL in place, and fq_codel is better for
> generic workloads... but needs BQL-based backpressure to kick in.
>
> [1] I really hope I'm overreacting but, um, er, could someone spin up
> a new patch that does BQL in some way even half right for this driver
> and help test it? I haven't built a kernel in a while.

I think it's time to obsolete skb_orphan() for virtio-net to get rid of
a bunch of tricky code in the current virtio-net driver.

Then we can do BQL on top.

I will prepare some patches to do this (probably with Michael's BQL
patch).

Thanks

> > > On Mon, May 17, 2021 at 11:41 AM Xianting Tian wrote:
> > > > BUG_ON() uses unlikely in if(), which can be optimized at compile
> > > > time.
> > > >
> > > > Signed-off-by: Xianting Tian
> > > > ---
> > > >  drivers/net/virtio_net.c | 5 ++---
> > > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index c921ebf3ae82..212d52204884 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -1646,10 +1646,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> > > >  	else
> > > >  		hdr = skb_vnet_hdr(skb);
> > > >
> > > > -	if (virtio_net_hdr_from_skb(skb, &hdr->hdr,
> > > > +	BUG_ON(virtio_net_hdr_from_skb(skb, &hdr->hdr,
> > > >  				    virtio_is_little_endian(vi->vdev), false,
> > > > -				    0))
> > > > -		BUG();
> > > > +				    0));
> > > >
> > > >  	if (vi->mergeable_rx_bufs)
> > > >  		hdr->num_buffers = 0;
> > > > --
> > > > 2.17.1
> > >
> > > --
> > > Latest Podcast:
> > > https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
> > >
> > > Dave Täht CTO, TekLibre, LLC
>
> --
> Latest Podcast:
> https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
>
> Dave Täht CTO, TekLibre, LLC
Re: [Bloat] virtio_net: BQL?
On 5/18/21 1:00 AM, Stephen Hemminger wrote:
>
> The Azure network driver (netvsc) also does not have BQL. Several years
> ago I tried adding it, but it benchmarked worse, and there is the added
> complexity of handling the accelerated networking VF path.

Note that NICs with many TX queues make BQL almost useless, only adding
extra overhead.

We should probably make BQL something that can be manually turned
on/off.
Re: virtio_net: BQL?
On Mon, May 17, 2021 at 11:43:43AM -0700, Dave Taht wrote:
> Not really related to this patch, but is there some reason why virtio
> has no support for BQL?

So just so you can try it out, I rebased my old patch. XDP is handled
incorrectly by it so we shouldn't apply it as is, but it should be good
enough for you to see whether it helps. Completely untested!

Signed-off-by: Michael S. Tsirkin

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7be93ca01650..4bfb682a20b2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -556,6 +556,7 @@ static int virtnet_xdp_xmit(struct net_device *dev,
 		kicks = 1;
 	}
 out:
+	/* TODO: netdev_tx_completed_queue? */
 	u64_stats_update_begin(&sq->stats.syncp);
 	sq->stats.bytes += bytes;
 	sq->stats.packets += packets;
@@ -1376,7 +1377,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
 	return stats.packets;
 }
 
-static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
+static void free_old_xmit_skbs(struct netdev_queue *txq, struct send_queue *sq, bool in_napi)
 {
 	unsigned int len;
 	unsigned int packets = 0;
@@ -1406,6 +1407,8 @@ static void free_old_xmit_skbs(struct send_queue *sq, bool in_napi)
 	if (!packets)
 		return;
 
+	netdev_tx_completed_queue(txq, packets, bytes);
+
 	u64_stats_update_begin(&sq->stats.syncp);
 	sq->stats.bytes += bytes;
 	sq->stats.packets += packets;
@@ -1434,7 +1437,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
 
 	if (__netif_tx_trylock(txq)) {
 		virtqueue_disable_cb(sq->vq);
-		free_old_xmit_skbs(sq, true);
+		free_old_xmit_skbs(txq, sq, true);
 
 		if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
 			netif_tx_wake_queue(txq);
@@ -1522,7 +1525,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 	txq = netdev_get_tx_queue(vi->dev, index);
 	__netif_tx_lock(txq, raw_smp_processor_id());
 	virtqueue_disable_cb(sq->vq);
-	free_old_xmit_skbs(sq, true);
+	free_old_xmit_skbs(txq, sq, true);
 
 	if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
 		netif_tx_wake_queue(txq);
@@ -1606,10 +1609,11 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !netdev_xmit_more();
 	bool use_napi = sq->napi.weight;
+	unsigned int bytes = skb->len;
 
 	/* Free up any pending old buffers before queueing new ones. */
 	virtqueue_disable_cb(sq->vq);
-	free_old_xmit_skbs(sq, false);
+	free_old_xmit_skbs(txq, sq, false);
 
 	if (use_napi && kick)
 		virtqueue_enable_cb_delayed(sq->vq);
@@ -1638,6 +1642,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		nf_reset_ct(skb);
 	}
 
+	netdev_tx_sent_queue(txq, bytes);
+
 	/* If running out of space, stop queue to avoid getting packets that we
 	 * are then unable to transmit.
 	 * An alternative would be to force queuing layer to requeue the skb by
@@ -1653,7 +1659,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	if (!use_napi &&
 	    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
 		/* More just got used, free them then recheck. */
-		free_old_xmit_skbs(sq, false);
+		free_old_xmit_skbs(txq, sq, false);
 		if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
 			netif_start_subqueue(dev, qnum);
 			virtqueue_disable_cb(sq->vq);
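For readers who want to see what the patch above is actually wiring up: BQL's contract with a driver is just paired byte accounting, `netdev_tx_sent_queue()` in the xmit path and `netdev_tx_completed_queue()` in the completion path, with the stack stopping and waking the queue around a byte limit. The following is a minimal userspace sketch of that accounting, not kernel code; the fixed `limit` stands in for the dynamically adapted one in the kernel's dql machinery, and all names (`bql_model`, `model_tx_sent`, `model_tx_completed`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of BQL's per-queue byte accounting. The real kernel
 * adapts 'limit' at runtime (lib/dynamic_queue_limits.c); here it is a
 * fixed number purely for illustration. */
struct bql_model {
	unsigned int limit;    /* max in-flight bytes before the queue is stopped */
	unsigned int inflight; /* bytes handed to hardware, not yet completed */
	bool stopped;          /* mirrors netif_tx_stop_queue()/wake_queue() state */
};

/* Analog of netdev_tx_sent_queue(): called from the driver's start_xmit. */
void model_tx_sent(struct bql_model *q, unsigned int bytes)
{
	q->inflight += bytes;
	if (q->inflight >= q->limit)
		q->stopped = true; /* backpressure: excess packets stay in the qdisc */
}

/* Analog of netdev_tx_completed_queue(): called from TX completion. */
void model_tx_completed(struct bql_model *q, unsigned int bytes)
{
	q->inflight -= bytes;
	if (q->stopped && q->inflight < q->limit)
		q->stopped = false; /* wake the queue; the stack may send again */
}
```

The point Dave is making upthread follows directly from this model: without the `sent`/`completed` pair, `inflight` is unbounded by anything except the ring size, so fq_codel behind the driver never sees the queue stop and never gets a chance to manage the backlog.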
Re: virtio_net: BQL?
On Mon, May 17, 2021 at 2:44 PM Dave Taht wrote:
>
> Not really related to this patch, but is there some reason why virtio
> has no support for BQL?

There have been a few attempts to add it over the years.

Most recently,
https://lore.kernel.org/lkml/20181205225323.12555-2-...@redhat.com/

That thread has a long discussion. I think the key open issue remains

"The tricky part is the mode switching between napi and no napi."

> On Mon, May 17, 2021 at 11:41 AM Xianting Tian wrote:
> >
> > BUG_ON() uses unlikely in if(), which can be optimized at compile time.
> >
> > Signed-off-by: Xianting Tian
> > ---
> >  drivers/net/virtio_net.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index c921ebf3ae82..212d52204884 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1646,10 +1646,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
> >  	else
> >  		hdr = skb_vnet_hdr(skb);
> >
> > -	if (virtio_net_hdr_from_skb(skb, &hdr->hdr,
> > +	BUG_ON(virtio_net_hdr_from_skb(skb, &hdr->hdr,
> >  				    virtio_is_little_endian(vi->vdev), false,
> > -				    0))
> > -		BUG();
> > +				    0));
> >
> >  	if (vi->mergeable_rx_bufs)
> >  		hdr->num_buffers = 0;
> > --
> > 2.17.1
>
> --
> Latest Podcast:
> https://www.linkedin.com/feed/update/urn:li:activity:6791014284936785920/
>
> Dave Täht CTO, TekLibre, LLC
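As an aside on the patch itself: the commit message's claim rests on the fact that the kernel's generic `BUG_ON(cond)` expands, roughly, to `if (unlikely(cond)) BUG();`, so replacing the open-coded `if (...) BUG();` with `BUG_ON(...)` is a pure cleanup with the same branch-prediction hint. A small userspace sketch of that expansion, with stand-in macros (the real definitions live in include/asm-generic/bug.h, and `fill_hdr` is a hypothetical stand-in for `virtio_net_hdr_from_skb`, which returns 0 on success):

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel macros under discussion.
 * unlikely() is a hint to the compiler that the condition is
 * almost always false, steering codegen toward the fast path. */
#define unlikely(x)  __builtin_expect(!!(x), 0)
#define BUG()        do { fprintf(stderr, "BUG!\n"); abort(); } while (0)
#define BUG_ON(cond) do { if (unlikely(cond)) BUG(); } while (0)

/* Hypothetical stand-in for virtio_net_hdr_from_skb(): 0 on success. */
int fill_hdr(int ok)
{
	return ok ? 0 : -1;
}
```

With these definitions, `BUG_ON(fill_hdr(1))` is byte-for-byte the old `if (fill_hdr(1)) BUG();` with the `unlikely()` hint already applied, which is all the patch is pointing out.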
[PATCH RFC v4 net-next 2/5] virtio_net: bql
From: "Michael S. Tsirkin"

Improve tx batching using byte queue limits.
Should be especially effective for MQ.

Cc: Rusty Russell
Cc: Michael S. Tsirkin
Signed-off-by: Michael S. Tsirkin
---
 drivers/net/virtio_net.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f68114e..0ed24ff 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -224,6 +224,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
 	unsigned int packets = 0;
+	unsigned int bytes = 0;
 
 	while (packets < budget &&
 	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
@@ -231,6 +232,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 
 		u64_stats_update_begin(&stats->tx_syncp);
 		stats->tx_bytes += skb->len;
+		bytes += skb->len;
 		stats->tx_packets++;
 		u64_stats_update_end(&stats->tx_syncp);
 
@@ -238,6 +240,8 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 		packets++;
 	}
 
+	netdev_tx_completed_queue(txq, packets, bytes);
+
 	if (sq->vq->num_free >= 2+MAX_SKB_FRAGS)
 		netif_tx_start_queue(txq);
@@ -964,6 +968,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int err;
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
+	unsigned int bytes = skb->len;
 
 	virtqueue_disable_cb(sq->vq);
@@ -981,6 +986,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_OK;
 	}
 
+	netdev_tx_sent_queue(txq, bytes);
+
 	/* Apparently nice girls don't return TX_BUSY; stop the queue
 	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
-- 
1.8.3.1
[PATCH RFC v3 2/3] virtio_net: bql
Improve tx batching using byte queue limits.
Should be especially effective for MQ.

Signed-off-by: Michael S. Tsirkin
---
 drivers/net/virtio_net.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 14f4cda..b83d39d 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -227,6 +227,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
 	unsigned int packets = 0;
+	unsigned int bytes = 0;
 
 	while (packets < budget &&
 	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
@@ -234,6 +235,7 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 
 		u64_stats_update_begin(&stats->tx_syncp);
 		stats->tx_bytes += skb->len;
+		bytes += skb->len;
 		stats->tx_packets++;
 		u64_stats_update_end(&stats->tx_syncp);
 
@@ -241,6 +243,8 @@ static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
 		packets++;
 	}
 
+	netdev_tx_completed_queue(txq, packets, bytes);
+
 	if (sq->vq->num_free >= 2+MAX_SKB_FRAGS)
 		netif_tx_start_queue(txq);
@@ -959,6 +963,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int err;
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
+	unsigned int bytes = skb->len;
 
 	virtqueue_disable_cb(sq->vq);
@@ -976,6 +981,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_OK;
 	}
 
+	netdev_tx_sent_queue(txq, bytes);
+
 	/* Apparently nice girls don't return TX_BUSY; stop the queue
 	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
-- 
MST
Re: [PATCH RFC v2 2/3] virtio_net: bql
On 10/15/2014 10:32 PM, Michael S. Tsirkin wrote:
> Improve tx batching using byte queue limits.
> Should be especially effective for MQ.
>
> Signed-off-by: Michael S. Tsirkin
> ---
>  drivers/net/virtio_net.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index a9bf178..8dea411 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -219,13 +219,15 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
>  	return p;
>  }
>
> -static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
> +static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
> +				       struct send_queue *sq, int budget)
>  {
>  	struct sk_buff *skb;
>  	unsigned int len;
>  	struct virtnet_info *vi = sq->vq->vdev->priv;
>  	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
>  	unsigned int packets = 0;
> +	unsigned int bytes = 0;
>
>  	while (packets < budget &&
>  	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
> @@ -233,6 +235,7 @@ static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
>
>  		u64_stats_update_begin(&stats->tx_syncp);
>  		stats->tx_bytes += skb->len;
> +		bytes += skb->len;
>  		stats->tx_packets++;
>  		u64_stats_update_end(&stats->tx_syncp);
>
> @@ -240,6 +243,8 @@ static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
>  		packets++;
>  	}
>
> +	netdev_tx_completed_queue(txq, packets, bytes);
> +
>  	return packets;
>  }
>
> @@ -810,7 +815,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>  again:
>  	__netif_tx_lock(txq, smp_processor_id());
>  	virtqueue_disable_cb(sq->vq);
> -	sent += free_old_xmit_skbs(sq, budget - sent);
> +	sent += free_old_xmit_skbs(txq, sq, budget - sent);
>
>  	if (sent < budget) {
>  		enable_done = virtqueue_enable_cb_delayed(sq->vq);
> @@ -962,12 +967,13 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>  	bool kick = !skb->xmit_more;
>  	bool stopped;
> +	unsigned int bytes = skb->len;
>
>  	virtqueue_disable_cb(sq->vq);
>
>  	/* We are going to push one skb.
>  	 * Try to pop one off to free space for it. */
> -	free_old_xmit_skbs(sq, 1);
> +	free_old_xmit_skbs(txq, sq, 1);
>
>  	/* Try to transmit */
>  	err = xmit_skb(sq, skb);
> @@ -983,6 +989,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  		return NETDEV_TX_OK;
>  	}
>
> +	netdev_tx_sent_queue(txq, bytes);
> +
> +	/* Kick early so device can process descriptors in parallel with us. */
> +	if (kick)
> +		virtqueue_kick(sq->vq);

I haven't figured out how this helps BQL, considering only a
netif_stop_subqueue() may be called between the two possible kicks. And
since we don't add any buffer between the two kicks, the second kick is
almost useless.

> +
>  	/* Apparently nice girls don't return TX_BUSY; stop the queue
>  	 * before it gets out of hand.  Naturally, this wastes entries. */
>  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> @@ -997,7 +1009,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>
>  	if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>  		/* More just got used, free them then recheck. */
> -		free_old_xmit_skbs(sq, qsize);
> +		free_old_xmit_skbs(txq, sq, qsize);
>  		if (stopped && sq->vq->num_free >= 2+MAX_SKB_FRAGS)
>  			netif_start_subqueue(dev, qnum);
>  	}
[PATCH RFC v2 2/3] virtio_net: bql
Improve tx batching using byte queue limits.
Should be especially effective for MQ.

Signed-off-by: Michael S. Tsirkin
---
 drivers/net/virtio_net.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a9bf178..8dea411 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -219,13 +219,15 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
 	return p;
 }
 
-static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
+static unsigned int free_old_xmit_skbs(struct netdev_queue *txq,
+				       struct send_queue *sq, int budget)
 {
 	struct sk_buff *skb;
 	unsigned int len;
 	struct virtnet_info *vi = sq->vq->vdev->priv;
 	struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
 	unsigned int packets = 0;
+	unsigned int bytes = 0;
 
 	while (packets < budget &&
 	       (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
@@ -233,6 +235,7 @@ static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
 
 		u64_stats_update_begin(&stats->tx_syncp);
 		stats->tx_bytes += skb->len;
+		bytes += skb->len;
 		stats->tx_packets++;
 		u64_stats_update_end(&stats->tx_syncp);
 
@@ -240,6 +243,8 @@ static unsigned int free_old_xmit_skbs(struct send_queue *sq, int budget)
 		packets++;
 	}
 
+	netdev_tx_completed_queue(txq, packets, bytes);
+
 	return packets;
 }
 
@@ -810,7 +815,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
 again:
 	__netif_tx_lock(txq, smp_processor_id());
 	virtqueue_disable_cb(sq->vq);
-	sent += free_old_xmit_skbs(sq, budget - sent);
+	sent += free_old_xmit_skbs(txq, sq, budget - sent);
 
 	if (sent < budget) {
 		enable_done = virtqueue_enable_cb_delayed(sq->vq);
@@ -962,12 +967,13 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
 	bool kick = !skb->xmit_more;
 	bool stopped;
+	unsigned int bytes = skb->len;
 
 	virtqueue_disable_cb(sq->vq);
 
 	/* We are going to push one skb.
 	 * Try to pop one off to free space for it. */
-	free_old_xmit_skbs(sq, 1);
+	free_old_xmit_skbs(txq, sq, 1);
 
 	/* Try to transmit */
 	err = xmit_skb(sq, skb);
@@ -983,6 +989,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 		return NETDEV_TX_OK;
 	}
 
+	netdev_tx_sent_queue(txq, bytes);
+
+	/* Kick early so device can process descriptors in parallel with us. */
+	if (kick)
+		virtqueue_kick(sq->vq);
+
 	/* Apparently nice girls don't return TX_BUSY; stop the queue
 	 * before it gets out of hand.  Naturally, this wastes entries. */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
@@ -997,7 +1009,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
 		/* More just got used, free them then recheck. */
-		free_old_xmit_skbs(sq, qsize);
+		free_old_xmit_skbs(txq, sq, qsize);
 		if (stopped && sq->vq->num_free >= 2+MAX_SKB_FRAGS)
 			netif_start_subqueue(dev, qnum);
 	}
-- 
MST