On 10/15/2014 05:51 AM, Michael S. Tsirkin wrote:
> On Sat, Oct 11, 2014 at 03:16:46PM +0800, Jason Wang wrote:
>> > We used to free transmitted packets in ndo_start_xmit() to get better
>> > performance. One side effect is that skb_orphan() needs to be called in
>> > ndo_start_xmit(), which makes sk_wmem_alloc inaccurate. For TCP, this
>> > means several optimizations such as TCP small queues and auto corking
>> > cannot work well, which can lead to low throughput for streams of small
>> > packets.
>> > 
>> > Thanks to the urgent descriptor support, this patch tries to solve this
>> > issue by enabling the tx interrupt selectively for stream packets. This
>> > means we don't need to orphan TCP stream packets in ndo_start_xmit();
>> > instead we enable the tx interrupt for those packets. When the tx
>> > interrupt fires, a tx napi is scheduled to free those packets.
>> > 
>> > With this method, sk_wmem_alloc of a TCP socket is more accurate than
>> > before, which lets TCP batch more through TSQ and auto corking.
>> > 
>> > Signed-off-by: Jason Wang <jasow...@redhat.com>
>> > ---
>> >  drivers/net/virtio_net.c | 164 ++++++++++++++++++++++++++++++++++++-----------
>> >  1 file changed, 128 insertions(+), 36 deletions(-)
>> > 
>> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> > index 5810841..b450fc4 100644
>> > --- a/drivers/net/virtio_net.c
>> > +++ b/drivers/net/virtio_net.c
>> > @@ -72,6 +72,8 @@ struct send_queue {
>> >  
>> >    /* Name of the send queue: output.$index */
>> >    char name[40];
>> > +
>> > +  struct napi_struct napi;
>> >  };
>> >  
>> >  /* Internal representation of a receive virtqueue */
>> > @@ -217,15 +219,40 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
>> >    return p;
>> >  }
>> >  
>> > +static int free_old_xmit_skbs(struct send_queue *sq, int budget)
>> > +{
>> > +  struct sk_buff *skb;
>> > +  unsigned int len;
>> > +  struct virtnet_info *vi = sq->vq->vdev->priv;
>> > +  struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
>> > +  int sent = 0;
>> > +
>> > +  while (sent < budget &&
>> > +         (skb = virtqueue_get_buf(sq->vq, &len)) != NULL) {
>> > +          pr_debug("Sent skb %p\n", skb);
>> > +
>> > +          u64_stats_update_begin(&stats->tx_syncp);
>> > +          stats->tx_bytes += skb->len;
>> > +          stats->tx_packets++;
>> > +          u64_stats_update_end(&stats->tx_syncp);
>> > +
>> > +          dev_kfree_skb_any(skb);
>> > +          sent++;
>> > +  }
>> > +
>> > +  return sent;
>> > +}
>> > +
>> >  static void skb_xmit_done(struct virtqueue *vq)
>> >  {
>> >    struct virtnet_info *vi = vq->vdev->priv;
>> > +  struct send_queue *sq = &vi->sq[vq2txq(vq)];
>> >  
>> > -  /* Suppress further interrupts. */
>> > -  virtqueue_disable_cb(vq);
>> > -
>> > -  /* We were probably waiting for more output buffers. */
>> > -  netif_wake_subqueue(vi->dev, vq2txq(vq));
>> > +  if (napi_schedule_prep(&sq->napi)) {
>> > +          virtqueue_disable_cb(vq);
>> > +          virtqueue_disable_cb_urgent(vq);
> This disable_cb is no longer safe in the xmit_done callback,
> since the queue can be running at the same time.
>
> You must do it under the tx lock. And yes, this likely will not work
> well without event_idx. We'll probably need extra synchronization for
> such old hosts.
>

Yes, and the virtqueue_enable_cb_prepare() in virtnet_poll_tx() needs to
be synced with virtqueue_enable_cb_delayed(). Otherwise an old idx will
be published and we may still see a tx interrupt storm.
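Something like the following for virtnet_poll_tx() is what I have in mind.
This is only a rough sketch to show the callback toggling under the tx lock
and the delayed re-enable; it leaves out the urgent descriptor callbacks from
this series, and the !event_idx case still needs the extra synchronization
mentioned above:

static int virtnet_poll_tx(struct napi_struct *napi, int budget)
{
	struct send_queue *sq = container_of(napi, struct send_queue, napi);
	struct virtnet_info *vi = sq->vq->vdev->priv;
	struct netdev_queue *txq = netdev_get_tx_queue(vi->dev,
						       vq2txq(sq->vq));
	int sent;

	/* Serialize with the xmit path: callbacks are only disabled or
	 * re-enabled while the tx lock is held.
	 */
	__netif_tx_lock(txq, smp_processor_id());

	sent = free_old_xmit_skbs(sq, budget);

	if (sent < budget) {
		napi_complete(napi);
		/* The delayed variant publishes an event index further
		 * ahead instead of the old idx.  If it reports pending
		 * buffers, disable callbacks again and re-poll instead of
		 * waiting for another interrupt.
		 */
		if (!virtqueue_enable_cb_delayed(sq->vq) &&
		    napi_schedule_prep(napi)) {
			virtqueue_disable_cb(sq->vq);
			__napi_schedule(napi);
		}
	}

	__netif_tx_unlock(txq);

	return sent;
}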