Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-12-03 Thread Jason Wang


On 12/02/2015 08:36 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 02, 2015 at 01:04:03PM +0800, Jason Wang wrote:
>>
>> On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
>>> On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
>>>> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>>>>> This patch tries to poll for newly added tx buffers or the socket
>>>>>> receive queue for a while at the end of tx/rx processing. The
>>>>>> maximum time spent on polling is specified through a new kind of
>>>>>> vring ioctl.
>>>>>>
>>>>>> Signed-off-by: Jason Wang 
>>>>> One further enhancement would be to actually poll
>>>>> the underlying device. This should be reasonably
>>>>> straight-forward with macvtap (especially in the
>>>>> passthrough mode).
>>>>>
>>>>>
>>>> Yes, it is. I have some patches to do this by replacing
>>>> skb_queue_empty() with sk_busy_loop(), but for tap.
>>> We probably don't want to do this unconditionally, though.
>>>
>>>> Tests did not show any improvement, but did show some regression.
>>> Did you add code to call sk_mark_napi_id on tap then?
>>> sk_busy_loop won't do anything useful without.
>> Yes I did. Probably something wrong elsewhere.
> Is this for guest-to-guest?

Nope. Like you said below, since it requires NAPI, it was external
host to guest.

> The patch to do napi
> for tap is still not upstream due to a minor performance
> regression.  Want me to repost it?

Sure, I've played with this a little bit in the past too.

>
>>>>  Maybe it's better to test macvtap.
>>> Same thing ...
>>>


Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-12-02 Thread Michael S. Tsirkin
On Wed, Dec 02, 2015 at 01:04:03PM +0800, Jason Wang wrote:
> 
> 
> On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
> > On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
> >>
> >> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
> >>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
> >>>> This patch tries to poll for newly added tx buffers or the socket
> >>>> receive queue for a while at the end of tx/rx processing. The
> >>>> maximum time spent on polling is specified through a new kind of
> >>>> vring ioctl.
> >>>>
> >>>> Signed-off-by: Jason Wang 
> >>> One further enhancement would be to actually poll
> >>> the underlying device. This should be reasonably
> >>> straight-forward with macvtap (especially in the
> >>> passthrough mode).
> >>>
> >>>
> >> Yes, it is. I have some patches to do this by replacing
> >> skb_queue_empty() with sk_busy_loop(), but for tap.
> > We probably don't want to do this unconditionally, though.
> >
> >> Tests did not show any improvement, but did show some regression.
> > Did you add code to call sk_mark_napi_id on tap then?
> > sk_busy_loop won't do anything useful without.
> 
> Yes I did. Probably something wrong elsewhere.

Is this for guest-to-guest? The patch to do napi
for tap is still not upstream due to a minor performance
regression.  Want me to repost it?

> >
> >>  Maybe it's better to test macvtap.
> > Same thing ...
> >


Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-12-01 Thread Jason Wang


On 12/01/2015 10:43 PM, Michael S. Tsirkin wrote:
> On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
>>
>> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>>>> This patch tries to poll for newly added tx buffers or the socket
>>>> receive queue for a while at the end of tx/rx processing. The
>>>> maximum time spent on polling is specified through a new kind of
>>>> vring ioctl.
>>>>
>>>> Signed-off-by: Jason Wang 
>>> One further enhancement would be to actually poll
>>> the underlying device. This should be reasonably
>>> straight-forward with macvtap (especially in the
>>> passthrough mode).
>>>
>>>
>> Yes, it is. I have some patches to do this by replacing
>> skb_queue_empty() with sk_busy_loop(), but for tap.
> We probably don't want to do this unconditionally, though.
>
>> Tests did not show any improvement, but did show some regression.
> Did you add code to call sk_mark_napi_id on tap then?
> sk_busy_loop won't do anything useful without.

Yes I did. Probably something wrong elsewhere.

>
>>  Maybe it's better to test macvtap.
> Same thing ...
>



Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-12-01 Thread Michael S. Tsirkin
On Tue, Dec 01, 2015 at 01:17:49PM +0800, Jason Wang wrote:
> 
> 
> On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
> > > This patch tries to poll for newly added tx buffers or the socket
> > > receive queue for a while at the end of tx/rx processing. The
> > > maximum time spent on polling is specified through a new kind of
> > > vring ioctl.
> > >
> > > Signed-off-by: Jason Wang 
> > One further enhancement would be to actually poll
> > the underlying device. This should be reasonably
> > straight-forward with macvtap (especially in the
> > passthrough mode).
> >
> >
> 
> Yes, it is. I have some patches to do this by replacing
> skb_queue_empty() with sk_busy_loop(), but for tap.

We probably don't want to do this unconditionally, though.

> Tests did not show any improvement, but did show some regression.

Did you add code to call sk_mark_napi_id on tap then?
sk_busy_loop won't do anything useful without.
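
A minimal sketch of the kind of marking this implies (hypothetical; the
helper below stands in for wherever tap queues an skb onto the socket it
exposes to vhost, and assumes the lower device already tagged the skb
with a NAPI ID):

    #include <linux/skbuff.h>
    #include <net/busy_poll.h>

    static void tap_queue_skb(struct sock *sk, struct sk_buff *skb)
    {
            /* Record the skb's NAPI ID on the tap socket; without this,
             * sk_busy_loop() has no device queue to poll and silently
             * does nothing.
             */
            sk_mark_napi_id(sk, skb);
            skb_queue_tail(&sk->sk_receive_queue, skb);
    }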

>  Maybe it's better to test macvtap.

Same thing ...

-- 
MST


Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-11-30 Thread Jason Wang


On 11/30/2015 06:44 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
>> This patch tries to poll for newly added tx buffers or the socket
>> receive queue for a while at the end of tx/rx processing. The
>> maximum time spent on polling is specified through a new kind of
>> vring ioctl.
>>
>> Signed-off-by: Jason Wang 
> One further enhancement would be to actually poll
> the underlying device. This should be reasonably
> straight-forward with macvtap (especially in the
> passthrough mode).
>
>

Yes, it is. I have some patches to do this by replacing
skb_queue_empty() with sk_busy_loop(), but for tap. Tests did not show
any improvement, but did show some regression.  Maybe it's better to
test macvtap.
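
Concretely, the idea was to let the core busy-poll loop drive the
backing device instead of spinning on the queue-empty check. A rough
sketch under that assumption (the helper name is made up, and
sk_busy_loop() only does real work once the socket carries a valid NAPI
ID, per the sk_mark_napi_id() discussion above):

    #include <net/busy_poll.h>

    /* Hypothetical variant of the RX peek: if the receive queue is
     * empty, busy-poll the underlying device's NAPI context instead of
     * just reporting emptiness. peek_head_len() is the existing
     * vhost_net helper visible in the patch below.
     */
    static int vhost_net_busy_peek(struct sock *sk)
    {
            if (skb_queue_empty(&sk->sk_receive_queue))
                    sk_busy_loop(sk, 1);    /* nonblock: one poll pass */
            return peek_head_len(sk);
    }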


Re: [PATCH net-next 3/3] vhost_net: basic polling support

2015-11-30 Thread Michael S. Tsirkin
On Wed, Nov 25, 2015 at 03:11:29PM +0800, Jason Wang wrote:
> This patch tries to poll for newly added tx buffers or the socket
> receive queue for a while at the end of tx/rx processing. The
> maximum time spent on polling is specified through a new kind of
> vring ioctl.
> 
> Signed-off-by: Jason Wang 

One further enhancement would be to actually poll
the underlying device. This should be reasonably
straight-forward with macvtap (especially in the
passthrough mode).
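
For illustration, how userspace might arm the knob this patch adds (a
hedged sketch: the ioctl name is the VHOST_SET_VRING_BUSYLOOP_TIMEOUT
that eventually landed upstream and may differ in this posting; per
busy_clock() in the patch, the timeout is in local_clock() >> 10 units,
i.e. roughly microseconds):

    #include <sys/ioctl.h>
    #include <linux/vhost.h>

    /* Give vring 'idx' of a vhost-net device a busy-poll budget of
     * 'timeout' (~microsecond) units; 0 disables busy polling again.
     */
    static int set_busyloop_timeout(int vhost_fd, unsigned int idx,
                                    unsigned int timeout)
    {
            struct vhost_vring_state state = {
                    .index = idx,
                    .num   = timeout,
            };

            return ioctl(vhost_fd, VHOST_SET_VRING_BUSYLOOP_TIMEOUT, &state);
    }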


> ---
>  drivers/vhost/net.c        | 72 +++++++++++++++++++++++++++++++++++++---
>  drivers/vhost/vhost.c      | 15 ++
>  drivers/vhost/vhost.h      |  1 +
>  include/uapi/linux/vhost.h | 11 +++
>  4 files changed, 94 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 9eda69e..ce6da77 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -287,6 +287,41 @@ static void vhost_zerocopy_callback(struct ubuf_info *ubuf, bool success)
>   rcu_read_unlock_bh();
>  }
>  
> +static inline unsigned long busy_clock(void)
> +{
> +	return local_clock() >> 10;
> +}
> +
> +static bool vhost_can_busy_poll(struct vhost_dev *dev,
> +				unsigned long endtime)
> +{
> +	return likely(!need_resched()) &&
> +	       likely(!time_after(busy_clock(), endtime)) &&
> +	       likely(!signal_pending(current)) &&
> +	       !vhost_has_work(dev) &&
> +	       single_task_running();
> +}
> +
> +static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
> +				    struct vhost_virtqueue *vq,
> +				    struct iovec iov[], unsigned int iov_size,
> +				    unsigned int *out_num, unsigned int *in_num)
> +{
> +	unsigned long uninitialized_var(endtime);
> +
> +	if (vq->busyloop_timeout) {
> +		preempt_disable();
> +		endtime = busy_clock() + vq->busyloop_timeout;
> +		while (vhost_can_busy_poll(vq->dev, endtime) &&
> +		       !vhost_vq_more_avail(vq->dev, vq))
> +			cpu_relax();
> +		preempt_enable();
> +	}
> +
> +	return vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> +				 out_num, in_num, NULL, NULL);
> +}
> +
>  /* Expects to be always run from workqueue - which acts as
>   * read-size critical section for our kind of RCU. */
>  static void handle_tx(struct vhost_net *net)
> @@ -331,10 +366,9 @@ static void handle_tx(struct vhost_net *net)
> % UIO_MAXIOV == nvq->done_idx))
>   break;
>  
> -	head = vhost_get_vq_desc(vq, vq->iov,
> -				 ARRAY_SIZE(vq->iov),
> -				 &out, &in,
> -				 NULL, NULL);
> +	head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
> +					ARRAY_SIZE(vq->iov),
> +					&out, &in);
>   /* On error, stop handling until the next kick. */
>   if (unlikely(head < 0))
>   break;
> @@ -435,6 +469,34 @@ static int peek_head_len(struct sock *sk)
>   return len;
>  }
>  
> +static int vhost_net_peek_head_len(struct vhost_net *net, struct sock *sk)
> +{
> +	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
> +	struct vhost_virtqueue *vq = &nvq->vq;
> +	unsigned long uninitialized_var(endtime);
> +
> +	if (vq->busyloop_timeout) {
> +		mutex_lock(&vq->mutex);
> +		vhost_disable_notify(&net->dev, vq);
> +
> +		preempt_disable();
> +		endtime = busy_clock() + vq->busyloop_timeout;
> +
> +		while (vhost_can_busy_poll(&net->dev, endtime) &&
> +		       skb_queue_empty(&sk->sk_receive_queue) &&
> +		       !vhost_vq_more_avail(&net->dev, vq))
> +			cpu_relax();
> +
> +		preempt_enable();
> +
> +		if (vhost_enable_notify(&net->dev, vq))
> +			vhost_poll_queue(&vq->poll);
> +		mutex_unlock(&vq->mutex);
> +	}
> +
> +	return peek_head_len(sk);
> +}
> +
>  /* This is a multi-buffer version of vhost_get_desc, that works if
>   *   vq has read descriptors only.
>   * @vq   - the relevant virtqueue
> @@ -553,7 +615,7 @@ static void handle_rx(struct vhost_net *net)
>   vq->log : NULL;
>   mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
>  
> - while ((sock_len = peek_head_len(sock->sk))) {
> + while ((sock_len = vhost_net_peek_head_len(net, sock->sk))) {
>   sock_len += sock_hlen;
>   vhost_len = sock_len + vhost_hlen;
>   headcount = get_rx_bufs(vq, vq->heads, vhost_len,
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index b86c5aa..857af6c 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -285,6 +285,7 @@