Re: [PATCH net-next 3/3] vhost_net: basic polling support
On 12/02/2015 08:36 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 02, 2015 at 01:04:03PM +0800, Jason Wang wrote:
>> On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
>>> On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
>>>> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>>>>> This patch tries to poll for newly added tx buffers or the socket
>>>>>> receive queue for a while at the end of tx/rx processing. The
>>>>>> maximum time spent on polling is specified through a new kind of
>>>>>> vring ioctl.
>>>>>>
>>>>>> Signed-off-by: Jason Wang
>>>>> One further enhancement would be to actually poll the underlying
>>>>> device. This should be reasonably straight-forward with macvtap
>>>>> (especially in the passthrough mode).
>>>>
>>>> Yes, it is. I have some patches to do this by replacing
>>>> skb_queue_empty() with sk_busy_loop(), but for tap.
>>> We probably don't want to do this unconditionally, though.
>>>
>>>> Tests did not show any improvement, but did show some regression.
>>> Did you add code to call sk_mark_napi_id on tap then?
>>> sk_busy_loop won't do anything useful without it.
>> Yes I did. Probably something wrong elsewhere.
> Is this for guest-to-guest?

Nope. Like you said below, since it requires NAPI, it was external host
to guest.

> The patch to do napi for tap is still not upstream due to a minor
> performance regression. Want me to repost it?

Sure, I've played with this a little bit in the past too.

>>>> Maybe it's better to test macvtap.
>>> Same thing ...
Re: [PATCH net-next 3/3] vhost_net: basic polling support
On Wed, Dec 02, 2015 at 01:04:03PM +0800, Jason Wang wrote:
> On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
>> On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
>>> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>>>> This patch tries to poll for newly added tx buffers or the socket
>>>>> receive queue for a while at the end of tx/rx processing. The
>>>>> maximum time spent on polling is specified through a new kind of
>>>>> vring ioctl.
>>>>>
>>>>> Signed-off-by: Jason Wang
>>>> One further enhancement would be to actually poll the underlying
>>>> device. This should be reasonably straight-forward with macvtap
>>>> (especially in the passthrough mode).
>>>
>>> Yes, it is. I have some patches to do this by replacing
>>> skb_queue_empty() with sk_busy_loop(), but for tap.
>> We probably don't want to do this unconditionally, though.
>>
>>> Tests did not show any improvement, but did show some regression.
>> Did you add code to call sk_mark_napi_id on tap then?
>> sk_busy_loop won't do anything useful without it.
>
> Yes I did. Probably something wrong elsewhere.

Is this for guest-to-guest? The patch to do napi for tap is still not
upstream due to a minor performance regression. Want me to repost it?

>>> Maybe it's better to test macvtap.
>> Same thing ...
Re: [PATCH net-next 3/3] vhost_net: basic polling support
On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
> On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
>> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>>> This patch tries to poll for newly added tx buffers or the socket
>>>> receive queue for a while at the end of tx/rx processing. The
>>>> maximum time spent on polling is specified through a new kind of
>>>> vring ioctl.
>>>>
>>>> Signed-off-by: Jason Wang
>>> One further enhancement would be to actually poll the underlying
>>> device. This should be reasonably straight-forward with macvtap
>>> (especially in the passthrough mode).
>>
>> Yes, it is. I have some patches to do this by replacing
>> skb_queue_empty() with sk_busy_loop(), but for tap.
> We probably don't want to do this unconditionally, though.
>
>> Tests did not show any improvement, but did show some regression.
> Did you add code to call sk_mark_napi_id on tap then?
> sk_busy_loop won't do anything useful without it.

Yes I did. Probably something wrong elsewhere.

>> Maybe it's better to test macvtap.
> Same thing ...
Re: [PATCH net-next 3/3] vhost_net: basic polling support
On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>> This patch tries to poll for newly added tx buffers or the socket
>>> receive queue for a while at the end of tx/rx processing. The
>>> maximum time spent on polling is specified through a new kind of
>>> vring ioctl.
>>>
>>> Signed-off-by: Jason Wang
>> One further enhancement would be to actually poll the underlying
>> device. This should be reasonably straight-forward with macvtap
>> (especially in the passthrough mode).
>
> Yes, it is. I have some patches to do this by replacing
> skb_queue_empty() with sk_busy_loop(), but for tap.

We probably don't want to do this unconditionally, though.

> Tests did not show any improvement, but did show some regression.

Did you add code to call sk_mark_napi_id on tap then?
sk_busy_loop won't do anything useful without it.

> Maybe it's better to test macvtap.

Same thing ...

-- MST
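To make the sk_mark_napi_id point concrete: sk_busy_loop() keys off
sk->sk_napi_id, and nothing on a stock tap socket ever sets it. Below is
a minimal sketch of the missing tap-side piece, assuming a queueing
helper inside drivers/net/tun.c; the function name and placement are
illustrative only, not taken from the unposted napi-for-tap patch:

    #include <net/busy_poll.h>

    /* Hypothetical tap-side hook: stamp the socket with the NAPI id
     * carried by the skb before queueing it for the reader. */
    static void tun_queue_to_reader(struct tun_file *tfile,
                                    struct sk_buff *skb)
    {
            /* skb->napi_id was set by the NAPI driver the packet
             * arrived on; copying it into sk->sk_napi_id tells
             * sk_busy_loop() which device's NAPI context to spin in.
             * Without this the id stays 0 and sk_busy_loop() returns
             * immediately -- the "won't do anything useful" failure
             * mode described above. */
            sk_mark_napi_id(&tfile->sk, skb);
            skb_queue_tail(&tfile->sk.sk_receive_queue, skb);
    }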
Re: [PATCH net-next 3/3] vhost_net: basic polling support
On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>> This patch tries to poll for newly added tx buffers or the socket
>> receive queue for a while at the end of tx/rx processing. The
>> maximum time spent on polling is specified through a new kind of
>> vring ioctl.
>>
>> Signed-off-by: Jason Wang
> One further enhancement would be to actually poll the underlying
> device. This should be reasonably straight-forward with macvtap
> (especially in the passthrough mode).

Yes, it is. I have some patches to do this by replacing
skb_queue_empty() with sk_busy_loop(), but for tap. Tests did not show
any improvement, but did show some regression. Maybe it's better to
test macvtap.
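The experiment described here -- busy-polling the underlying device
rather than only spinning on the socket queue -- could look roughly
like the following variant of the series' vhost_net_peek_head_len().
This is a sketch of the idea, assuming sk->sk_napi_id has been set on
the tap socket (see the follow-ups above); it is not the actual
unposted patch:

    #include <net/busy_poll.h>

    static int vhost_net_peek_head_len_busy(struct vhost_net *net,
                                            struct sock *sk)
    {
            /* sk_busy_loop() runs the NAPI poll routine of the device
             * recorded in sk->sk_napi_id until data shows up or the
             * busy-read budget expires, instead of merely spinning on
             * skb_queue_empty(). On a stock tap socket sk_napi_id is
             * never set, so this degenerates to a no-op -- one
             * plausible reason the tests saw no improvement. */
            if (skb_queue_empty(&sk->sk_receive_queue) &&
                sk_can_busy_loop(sk))
                    sk_busy_loop(sk, 1 /* nonblock */);

            return peek_head_len(sk);
    }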
Re: [PATCH net-next 3/3] vhost_net: basic polling support
On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
> This patch tries to poll for newly added tx buffers or the socket
> receive queue for a while at the end of tx/rx processing. The
> maximum time spent on polling is specified through a new kind of
> vring ioctl.
>
> Signed-off-by: Jason Wang

One further enhancement would be to actually poll the underlying
device. This should be reasonably straight-forward with macvtap
(especially in the passthrough mode).

> ---
>  drivers/vhost/net.c        | 72 ++++++++++++++++++++++++++++++++++++++++----
>  drivers/vhost/vhost.c      | 15 ++++++++++
>  drivers/vhost/vhost.h      |  1 +
>  include/uapi/linux/vhost.h | 11 +++++++
>  4 files changed, 94 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9eda69e..ce6da77 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -287,6 +287,41 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>  	rcu_read_unlock_bh();
>  }
>
> +static inline unsigned long busy_clock(void)
> +{
> +	return local_clock() >> 10;
> +}
> +
> +static bool vhost_can_busy_poll(struct vhost_dev *dev,
> +				unsigned long endtime)
> +{
> +	return likely(!need_resched()) &&
> +	       likely(!time_after(busy_clock(), endtime)) &&
> +	       likely(!signal_pending(current)) &&
> +	       !vhost_has_work(dev) &&
> +	       single_task_running();
> +}
> +
> +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
> +				    struct vhost_virtqueue *vq,
> +				    struct iovec iov[], unsigned int iov_size,
> +				    unsigned int *out_num, unsigned int *in_num)
> +{
> +	unsigned long uninitialized_var(endtime);
> +
> +	if (vq->busyloop_timeout) {
> +		preempt_disable();
> +		endtime = busy_clock() + vq->busyloop_timeout;
> +		while (vhost_can_busy_poll(vq->dev, endtime) &&
> +		       !vhost_vq_more_avail(vq->dev, vq))
> +			cpu_relax();
> +		preempt_enable();
> +	}
> +
> +	return vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +				 out_num, in_num, NULL, NULL);
> +}
> +
>  /* Expects to be always run from workqueue - which acts as
>   * read-size critical section for our kind of RCU. */
>  static void handle_tx(struct vhost_net *net)
> @@ -331,10 +366,9 @@ static void handle_tx(struct vhost_net *net)
>  		    % UIO_MAXIOV == nvq->done_idx))
>  			break;
>
> -		head = vhost_get_vq_desc(vq, vq->iov,
> -					 ARRAY_SIZE(vq->iov),
> -					 &out, &in,
> -					 NULL, NULL);
> +		head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
> +						ARRAY_SIZE(vq->iov),
> +						&out, &in);
>  		/* On error, stop handling until the next kick. */
>  		if (unlikely(head < 0))
>  			break;
> @@ -435,6 +469,34 @@ static int peek_head_len(struct sock *sk)
>  	return len;
>  }
>
> +static int vhost_net_peek_head_len(struct vhost_net *net, struct sock *sk)
> +{
> +	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
> +	struct vhost_virtqueue *vq = &nvq->vq;
> +	unsigned long uninitialized_var(endtime);
> +
> +	if (vq->busyloop_timeout) {
> +		mutex_lock(&vq->mutex);
> +		vhost_disable_notify(&net->dev, vq);
> +
> +		preempt_disable();
> +		endtime = busy_clock() + vq->busyloop_timeout;
> +
> +		while (vhost_can_busy_poll(&net->dev, endtime) &&
> +		       skb_queue_empty(&sk->sk_receive_queue) &&
> +		       !vhost_vq_more_avail(&net->dev, vq))
> +			cpu_relax();
> +
> +		preempt_enable();
> +
> +		if (vhost_enable_notify(&net->dev, vq))
> +			vhost_poll_queue(&vq->poll);
> +		mutex_unlock(&vq->mutex);
> +	}
> +
> +	return peek_head_len(sk);
> +}
> +
>  /* This is a multi-buffer version of vhost_get_desc, that works if
>   * vq has read descriptors only.
>   * @vq - the relevant virtqueue
> @@ -553,7 +615,7 @@ static void handle_rx(struct vhost_net *net)
>  		vq->log : NULL;
>  	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
>
> -	while ((sock_len = peek_head_len(sock->sk))) {
> +	while ((sock_len = vhost_net_peek_head_len(net, sock->sk))) {
>  		sock_len += sock_hlen;
>  		vhost_len = sock_len + vhost_hlen;
>  		headcount = get_rx_bufs(vq, vq->heads, vhost_len,
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index b86c5aa..857af6c 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -285,6 +285,7 @@
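The include/uapi/linux/vhost.h hunk is cut off above. For a sense of
how the new knob is driven from userspace: in the version of this
series that was eventually merged, the ioctl is named
VHOST_SET_VRING_BUSYLOOP_TIMEOUT and takes a struct vhost_vring_state
whose num field is the polling budget in microseconds (busy_clock()
above is local_clock() >> 10, i.e. roughly microseconds). A minimal
usage sketch under those assumptions:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    int main(void)
    {
            /* Set a 50us busy-poll budget on vring 0. A real user
             * (e.g. QEMU) would do VHOST_SET_OWNER and full vring
             * setup first; this only demonstrates the new ioctl. */
            struct vhost_vring_state state = {
                    .index = 0,  /* vring index */
                    .num   = 50, /* timeout in microseconds */
            };
            int fd = open("/dev/vhost-net", O_RDWR);

            if (fd < 0) {
                    perror("open /dev/vhost-net");
                    return 1;
            }
            if (ioctl(fd, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, &state) < 0)
                    perror("VHOST_SET_VRING_BUSYLOOP_TIMEOUT");
            return 0;
    }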