Re: [RFC PATCH V3 1/5] vhost: generalize adding used elem

2019-01-07 Thread Michael S. Tsirkin
On Mon, Jan 07, 2019 at 03:00:17PM +0800, Jason Wang wrote:
> 
> On 2019/1/5 上午8:33, Sean Christopherson wrote:
> > On Fri, Jan 04, 2019 at 04:29:34PM -0500, Michael S. Tsirkin wrote:
> > > On Sat, Dec 29, 2018 at 08:46:52PM +0800, Jason Wang wrote:
> > > > Use one generic vhost_copy_to_user() instead of two dedicated
> > > > accessor. This will simplify the conversion to fine grain
> > > > accessors. About 2% improvement of PPS were seen during vitio-user
> > > > txonly test.
> > > > 
> > > > Signed-off-by: Jason Wang 
> > > I don't hve a problem with this patch but do you have
> > > any idea how come removing what's supposed to be
> > > an optimization speeds things up?
> > With SMAP, the 2x vhost_put_user() will also mean an extra STAC/CLAC pair,
> > which is probably slower than the overhead of CALL+RET to whatever flavor
> > of copy_user_generic() gets used.  CALL+RET is really the only overhead
> > since all variants of copy_user_generic() unroll accesses smaller than
> > 64 bytes, e.g. on a 64-bit system, __copy_to_user() will write all 8
> > bytes in a single MOV.
> > 
> > Removing the special casing also eliminates a few hundred bytes of code
> > as well as the need for hardware to predict count==1 vs. count>1.
> > 
> 
> Yes, I don't measure, but STAC/CALC is pretty expensive when we are do very
> small copies based on the result of nosmap PPS.
> 
> Thanks

Yes all this really looks like a poster child for uaccess_begin/end
plus unsafe accesses. And if these APIs don't do the job for us
then maybe better ones are needed ...


-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH V3 1/5] vhost: generalize adding used elem

2019-01-06 Thread Jason Wang


On 2019/1/5 上午8:33, Sean Christopherson wrote:

On Fri, Jan 04, 2019 at 04:29:34PM -0500, Michael S. Tsirkin wrote:

On Sat, Dec 29, 2018 at 08:46:52PM +0800, Jason Wang wrote:

Use one generic vhost_copy_to_user() instead of two dedicated
accessor. This will simplify the conversion to fine grain
accessors. About 2% improvement of PPS were seen during vitio-user
txonly test.

Signed-off-by: Jason Wang 

I don't hve a problem with this patch but do you have
any idea how come removing what's supposed to be
an optimization speeds things up?

With SMAP, the 2x vhost_put_user() will also mean an extra STAC/CLAC pair,
which is probably slower than the overhead of CALL+RET to whatever flavor
of copy_user_generic() gets used.  CALL+RET is really the only overhead
since all variants of copy_user_generic() unroll accesses smaller than
64 bytes, e.g. on a 64-bit system, __copy_to_user() will write all 8
bytes in a single MOV.

Removing the special casing also eliminates a few hundred bytes of code
as well as the need for hardware to predict count==1 vs. count>1.



Yes, I don't measure, but STAC/CALC is pretty expensive when we are do 
very small copies based on the result of nosmap PPS.


Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH V3 1/5] vhost: generalize adding used elem

2019-01-04 Thread Michael S. Tsirkin
On Sat, Dec 29, 2018 at 08:46:52PM +0800, Jason Wang wrote:
> Use one generic vhost_copy_to_user() instead of two dedicated
> accessor. This will simplify the conversion to fine grain
> accessors. About 2% improvement of PPS were seen during vitio-user
> txonly test.
> 
> Signed-off-by: Jason Wang 

I don't hve a problem with this patch but do you have
any idea how come removing what's supposed to be
an optimization speeds things up?

> ---
>  drivers/vhost/vhost.c | 11 +--
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 55e5aa662ad5..f179b5ee14c4 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2174,16 +2174,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue 
> *vq,
>  
>   start = vq->last_used_idx & (vq->num - 1);
>   used = vq->used->ring + start;
> - if (count == 1) {
> - if (vhost_put_user(vq, heads[0].id, >id)) {
> - vq_err(vq, "Failed to write used id");
> - return -EFAULT;
> - }
> - if (vhost_put_user(vq, heads[0].len, >len)) {
> - vq_err(vq, "Failed to write used len");
> - return -EFAULT;
> - }
> - } else if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
> + if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
>   vq_err(vq, "Failed to write used");
>   return -EFAULT;
>   }
> -- 
> 2.17.1
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC PATCH V3 1/5] vhost: generalize adding used elem

2018-12-29 Thread Jason Wang
Use one generic vhost_copy_to_user() instead of two dedicated
accessor. This will simplify the conversion to fine grain
accessors. About 2% improvement of PPS were seen during vitio-user
txonly test.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 55e5aa662ad5..f179b5ee14c4 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2174,16 +2174,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 
start = vq->last_used_idx & (vq->num - 1);
used = vq->used->ring + start;
-   if (count == 1) {
-   if (vhost_put_user(vq, heads[0].id, >id)) {
-   vq_err(vq, "Failed to write used id");
-   return -EFAULT;
-   }
-   if (vhost_put_user(vq, heads[0].len, >len)) {
-   vq_err(vq, "Failed to write used len");
-   return -EFAULT;
-   }
-   } else if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
+   if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
vq_err(vq, "Failed to write used");
return -EFAULT;
}
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization