Re: [PATCH] virtio_console: remove vq buf while unplugging port

2019-05-24 Thread Greg KH
On Sun, Apr 28, 2019 at 09:50:04AM +0800, zhenwei pi wrote:
> A bug can be easily reproduced:
> Host# cat guest-agent.xml
> <channel type='unix'>
>   <source mode='bind'/>
>   <target type='virtio' name='org.qemu.guest_agent.0'/>
> </channel>
> Host# virsh attach-device instance guest-agent.xml
> Host# virsh detach-device instance guest-agent.xml
> Host# virsh attach-device instance guest-agent.xml
> 
> and the guest reports: virtio-ports vport0p1: Error allocating inbufs
> 
> The reason is that the port is unplugged but its vq buffers still remain.
> So, fix two cases in this patch:
> 1, fix the memory leak with attach-device/detach-device.
> 2, fix the logic bug with attach-device/detach-device/attach-device.
> 
> Signed-off-by: zhenwei pi 
> ---
>  drivers/char/virtio_console.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index fbeb719..f6e37f4 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -251,6 +251,7 @@ struct port {
>  
>  /* This is the very early arch-specified put chars function. */
>  static int (*early_put_chars)(u32, const char *, int);
> +static void remove_vq(struct virtqueue *vq);
>  
>  static struct port *find_port_by_vtermno(u32 vtermno)
>  {
> @@ -1550,6 +1551,9 @@ static void unplug_port(struct port *port)
>   }
>  
>   remove_port_data(port);
> + spin_lock_irq(&port->inbuf_lock);
> + remove_vq(port->in_vq);
> + spin_unlock_irq(&port->inbuf_lock);
>  
>   /*
>* We should just assume the device itself has gone off --
> @@ -1945,17 +1949,22 @@ static const struct file_operations portdev_fops = {
>   .owner = THIS_MODULE,
>  };
>  
> +static void remove_vq(struct virtqueue *vq)
> +{
> + struct port_buffer *buf;
> +
> + flush_bufs(vq, true);
> + while ((buf = virtqueue_detach_unused_buf(vq)))
> + free_buf(buf, true);
> +}
> +
>  static void remove_vqs(struct ports_device *portdev)
>  {
>   struct virtqueue *vq;
>  
> - virtio_device_for_each_vq(portdev->vdev, vq) {
> - struct port_buffer *buf;
> + virtio_device_for_each_vq(portdev->vdev, vq)
> + remove_vq(vq);
>  
> - flush_bufs(vq, true);
> - while ((buf = virtqueue_detach_unused_buf(vq)))
> - free_buf(buf, true);
> - }
>   portdev->vdev->config->del_vqs(portdev->vdev);
>   kfree(portdev->in_vqs);
>   kfree(portdev->out_vqs);
> -- 
> 2.7.4


Amit, any ideas if this is valid or not and if this should be applied?

thanks,

greg k-h
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/qxl: drop WARN_ONCE()

2019-05-24 Thread Daniel Vetter
On Fri, May 24, 2019 at 12:42:50PM +0200, Gerd Hoffmann wrote:
> There is no good reason to flood the kernel log with a WARN
> stacktrace just because someone tried to mmap a prime buffer.

Yeah, no userspace-triggerable dmesg noise above debug level.
> 
> Signed-off-by: Gerd Hoffmann 

Reviewed-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/qxl/qxl_prime.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
> index 114653b471c6..7d3816fca5a8 100644
> --- a/drivers/gpu/drm/qxl/qxl_prime.c
> +++ b/drivers/gpu/drm/qxl/qxl_prime.c
> @@ -77,6 +77,5 @@ void qxl_gem_prime_vunmap(struct drm_gem_object *obj, void 
> *vaddr)
>  int qxl_gem_prime_mmap(struct drm_gem_object *obj,
>  struct vm_area_struct *area)
>  {
> - WARN_ONCE(1, "not implemented");
>   return -ENOSYS;
>  }
> -- 
> 2.18.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH] VMCI: Fix integer overflow in VMCI handle arrays

2019-05-24 Thread Vishnu DASA via Virtualization
The VMCI handle array has an integer overflow in
vmci_handle_arr_append_entry when it tries to expand the array. This can be
triggered from a guest, since the doorbell link hypercall doesn't impose a
limit on the number of doorbell handles that a VM can create in the
hypervisor, and these handles are stored in a handle array.

In this change, we introduce a mandatory max capacity for handle
arrays/lists to avoid excessive memory usage.

Signed-off-by: Vishnu Dasa 
Reviewed-by: Adit Ranadive 
Reviewed-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_context.c  | 80 +--
 drivers/misc/vmw_vmci/vmci_handle_array.c | 38 +++
 drivers/misc/vmw_vmci/vmci_handle_array.h | 29 +---
 include/linux/vmw_vmci_defs.h | 11 +++-
 4 files changed, 99 insertions(+), 59 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_context.c 
b/drivers/misc/vmw_vmci/vmci_context.c
index 21d0fa592145..bc089e634a75 100644
--- a/drivers/misc/vmw_vmci/vmci_context.c
+++ b/drivers/misc/vmw_vmci/vmci_context.c
@@ -29,6 +29,9 @@
 #include "vmci_driver.h"
 #include "vmci_event.h"
 
+/* Use a wide upper bound for the maximum contexts. */
+#define VMCI_MAX_CONTEXTS 2000
+
 /*
  * List of current VMCI contexts.  Contexts can be added by
  * vmci_ctx_create() and removed via vmci_ctx_destroy().
@@ -125,19 +128,22 @@ struct vmci_ctx *vmci_ctx_create(u32 cid, u32 priv_flags,
/* Initialize host-specific VMCI context. */
	init_waitqueue_head(&context->host_context.wait_queue);
 
-   context->queue_pair_array = vmci_handle_arr_create(0);
+   context->queue_pair_array =
+   vmci_handle_arr_create(0, VMCI_MAX_GUEST_QP_COUNT);
if (!context->queue_pair_array) {
error = -ENOMEM;
goto err_free_ctx;
}
 
-   context->doorbell_array = vmci_handle_arr_create(0);
+   context->doorbell_array =
+   vmci_handle_arr_create(0, VMCI_MAX_GUEST_DOORBELL_COUNT);
if (!context->doorbell_array) {
error = -ENOMEM;
goto err_free_qp_array;
}
 
-   context->pending_doorbell_array = vmci_handle_arr_create(0);
+   context->pending_doorbell_array =
+   vmci_handle_arr_create(0, VMCI_MAX_GUEST_DOORBELL_COUNT);
if (!context->pending_doorbell_array) {
error = -ENOMEM;
goto err_free_db_array;
@@ -212,7 +218,7 @@ static int ctx_fire_notification(u32 context_id, u32 
priv_flags)
 * We create an array to hold the subscribers we find when
 * scanning through all contexts.
 */
-   subscriber_array = vmci_handle_arr_create(0);
+   subscriber_array = vmci_handle_arr_create(0, VMCI_MAX_CONTEXTS);
if (subscriber_array == NULL)
return VMCI_ERROR_NO_MEM;
 
@@ -631,20 +637,26 @@ int vmci_ctx_add_notification(u32 context_id, u32 
remote_cid)
 
	spin_lock(&context->lock);
 
-   list_for_each_entry(n, &context->notifier_list, node) {
-   if (vmci_handle_is_equal(n->handle, notifier->handle)) {
-   exists = true;
-   break;
+   if (context->n_notifiers < VMCI_MAX_CONTEXTS) {
+   list_for_each_entry(n, &context->notifier_list, node) {
+   if (vmci_handle_is_equal(n->handle, notifier->handle)) {
+   exists = true;
+   break;
+   }
}
-   }
 
-   if (exists) {
-   kfree(notifier);
-   result = VMCI_ERROR_ALREADY_EXISTS;
+   if (exists) {
+   kfree(notifier);
+   result = VMCI_ERROR_ALREADY_EXISTS;
+   } else {
+   list_add_tail_rcu(&notifier->node,
+ &context->notifier_list);
+   context->n_notifiers++;
+   result = VMCI_SUCCESS;
+   }
} else {
-   list_add_tail_rcu(&notifier->node, &context->notifier_list);
-   context->n_notifiers++;
-   result = VMCI_SUCCESS;
+   kfree(notifier);
+   result = VMCI_ERROR_NO_MEM;
}
 
	spin_unlock(&context->lock);
@@ -729,8 +741,7 @@ static int vmci_ctx_get_chkpt_doorbells(struct vmci_ctx 
*context,
u32 *buf_size, void **pbuf)
 {
struct dbell_cpt_state *dbells;
-   size_t n_doorbells;
-   int i;
+   u32 i, n_doorbells;
 
n_doorbells = vmci_handle_arr_get_size(context->doorbell_array);
if (n_doorbells > 0) {
@@ -868,7 +879,8 @@ int vmci_ctx_rcv_notifications_get(u32 context_id,
spin_lock(>lock);
 
*db_handle_array = context->pending_doorbell_array;
-   context->pending_doorbell_array = vmci_handle_arr_create(0);
+   context->pending_doorbell_array =
+   vmci_handle_arr_create(0, VMCI_MAX_GUEST_DOORBELL_COUNT);
if (!context->pending_doorbell_array) {
   

Call for Papers - ICOTTS'2019, Buenos Aires, Argentina

2019-05-24 Thread Maria
ICOTTS'19 - The 2019 International Conference on Tourism, Technology & Systems

5 - 7 December 2019, Buenos Aires, Argentina

Proceedings by Springer. Indexed by Scopus, ISI, etc.

https://www.icotts.org/ 

ICOTTS'19 - The 2019 International Conference on Tourism, Technology & Systems, 
to be held at the Universidad Abierta Interamericana in Buenos Aires 
, Argentina, between the 
5th and the 7th of December 2019. ICOTTS is a Multidisciplinary conference with 
a special focus in new technologies and systems in the tourism sector.

We are pleased to invite you to submit your papers to ICOTTS'19. They can be 
written in English, Spanish or Portuguese. All submissions will be reviewed on 
the basis of relevance, originality, importance and clarity.

Scope & Topics

Multidisciplinary conference, transversal to all the activity sectors that 
involve Information Technologies and systems in the Tourism area, namely: 
Competitiveness of destinations based on digital technology, Hospitality, 
Destinations Management, Business & Finance, Public Administration; Economics; 
Management Science; Education; Health & Rehabilitation; Agriculture & Food 
Technology.

Topics of interest include but are not limited to:

· Technology in Tourism and Tourist experience

· Generations and Technology in Tourism

· Digital Marketing applied to Tourism and Travel

· Mobile Technologies applied to sustainable Tourism

· Tourism research in providing innovative solutions to social problems

· Tourism, Wellness and Hospitality

· Information Technologies in Tourism

· Digital transformation of Tourism Business

· Traveling for health/medical and wellness

· Information Technologies in Ecotourism and Agritourism

· Information Technologies in Food Tourism

· Information Technologies in Education and Educational Tourism

· eTourism and Tourism 2.0

· Big data and Management for Travel and Tourism

· Geo-tagging and Tourist mobility

· Health Tourism

· Information Systems in Tourism and Hospitality

· Smart Destinations

· Resilience and Tourism

· Dark Tourism

· Military Tourism

· Robotics in Tourism

· Destination Marketing Systems

· Computer Reservations Systems

· Global Distribution Systems

· Electronic Information Distribution in Tourism and Hospitality

· Organizational Models and Information Systems

· Information Systems and Technologies​

Submission and Decision

Submitted papers must comply with the format of Smart Innovation, Systems and 
Technologies (see Instructions for Authors at Springer Website 
),
 be written in English (until 10-page limit), must not have been published 
before, not be under review for any other conference or publication and not 
include any information leading to the authors’ identification. Therefore, the 
authors’ names, affiliations and bibliographic references should not be 
included in the version for evaluation by the Scientific Committee. This 
information should only be included in the camera-ready version, saved in Word 
or Latex format and also in PDF format. These files must be accompanied by the 
Consent to Publish 

 form filled out, in a ZIP file, and uploaded at the conference management 
system.

​

Submitted papers written in Spanish or Portuguese (until 15-page limit) must 
comply with the format of RISTI 

 - Revista Ibérica de Sistemas e Tecnologias de Informação must not have been 
published before, not be under review for any other conference or publication 
and not include any information leading to the authors’ identification. 
Therefore, the authors’ names, affiliations and e-mails should not be included 
in the version for evaluation by the Scientific Committee. This information 
should only be included in the camera-ready version, saved in Word. These files 
must be uploaded at the conference management system in a ZIP file.

​

All papers will be subjected to a “double-blind review” by at least two members 
of the Scientific Committee. Based on Scientific Committee evaluation, a paper 
can be rejected or accepted by the Conference Chairs. In the latter case, it can 
be accepted as the type originally submitted or as another type.

​

The authors of accepted poster papers must also build and print a poster to be 
exhibited during the Conference. This poster must follow an A1 or A2 vertical 
format. The Conference can include Work Sessions where these posters are 
presented and orally discussed, with a 7-minute limit.

[PATCH] drm/qxl: drop WARN_ONCE()

2019-05-24 Thread Gerd Hoffmann
There is no good reason to flood the kernel log with a WARN
stacktrace just because someone tried to mmap a prime buffer.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/qxl/qxl_prime.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/qxl/qxl_prime.c b/drivers/gpu/drm/qxl/qxl_prime.c
index 114653b471c6..7d3816fca5a8 100644
--- a/drivers/gpu/drm/qxl/qxl_prime.c
+++ b/drivers/gpu/drm/qxl/qxl_prime.c
@@ -77,6 +77,5 @@ void qxl_gem_prime_vunmap(struct drm_gem_object *obj, void 
*vaddr)
 int qxl_gem_prime_mmap(struct drm_gem_object *obj,
   struct vm_area_struct *area)
 {
-   WARN_ONCE(1, "not implemented");
return -ENOSYS;
 }
-- 
2.18.1



Re: [Qemu-devel] custom virt-io support (in user-mode-linux)

2019-05-24 Thread Johannes Berg
On Thu, 2019-05-23 at 15:41 +0100, Stefan Hajnoczi wrote:

> > Also, not sure I understand how the client is started?
> 
> The vhost-user device backend can be launched before QEMU.  QEMU is
> started with the UNIX domain socket path so it can connect.

Hmm. I guess I'm confusing the terminology then - I thought qemu was the
server and the backend was the client that connects to it. If it's the
other way around, yeah, that makes things easier and certainly makes
sense (you could have a daemon that implements something).

> QEMU itself doesn't fork+exec the vhost-user device backend.  It's
> expected that the user or the management stack has already launched
> the vhost-user device backend.

Right.

> > Do you know if there's a sample client/server somewhere?
> 
> See contrib/libvhost-user in the QEMU source tree as well as the
> vhost-user-blk and vhost-user-scsi examples in the contrib/ directory.

Awesome, thanks!

johannes



[PATCH net-next 5/6] vhost: factor out setting vring addr and num

2019-05-24 Thread Jason Wang
Factor out vring address and num setting, which need special care for
accelerating vq metadata access.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 177 --
 1 file changed, 103 insertions(+), 74 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 8605e44a7001..8bbda1777c61 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1468,6 +1468,104 @@ static long vhost_set_memory(struct vhost_dev *d, 
struct vhost_memory __user *m)
return -EFAULT;
 }
 
+static long vhost_vring_set_num(struct vhost_dev *d,
+   struct vhost_virtqueue *vq,
+   void __user *argp)
+{
+   struct vhost_vring_state s;
+
+   /* Resizing ring with an active backend?
+* You don't want to do that. */
+   if (vq->private_data)
+   return -EBUSY;
+
+   if (copy_from_user(&s, argp, sizeof s))
+   return -EFAULT;
+
+   if (!s.num || s.num > 0x || (s.num & (s.num - 1)))
+   return -EINVAL;
+   vq->num = s.num;
+
+   return 0;
+}
+
+static long vhost_vring_set_addr(struct vhost_dev *d,
+struct vhost_virtqueue *vq,
+void __user *argp)
+{
+   struct vhost_vring_addr a;
+
+   if (copy_from_user(&a, argp, sizeof a))
+   return -EFAULT;
+   if (a.flags & ~(0x1 << VHOST_VRING_F_LOG))
+   return -EOPNOTSUPP;
+
+   /* For 32bit, verify that the top 32bits of the user
+  data are set to zero. */
+   if ((u64)(unsigned long)a.desc_user_addr != a.desc_user_addr ||
+   (u64)(unsigned long)a.used_user_addr != a.used_user_addr ||
+   (u64)(unsigned long)a.avail_user_addr != a.avail_user_addr)
+   return -EFAULT;
+
+   /* Make sure it's safe to cast pointers to vring types. */
+   BUILD_BUG_ON(__alignof__ *vq->avail > VRING_AVAIL_ALIGN_SIZE);
+   BUILD_BUG_ON(__alignof__ *vq->used > VRING_USED_ALIGN_SIZE);
+   if ((a.avail_user_addr & (VRING_AVAIL_ALIGN_SIZE - 1)) ||
+   (a.used_user_addr & (VRING_USED_ALIGN_SIZE - 1)) ||
+   (a.log_guest_addr & (VRING_USED_ALIGN_SIZE - 1)))
+   return -EINVAL;
+
+   /* We only verify access here if backend is configured.
+* If it is not, we don't as size might not have been setup.
+* We will verify when backend is configured. */
+   if (vq->private_data) {
+   if (!vq_access_ok(vq, vq->num,
+   (void __user *)(unsigned long)a.desc_user_addr,
+   (void __user *)(unsigned long)a.avail_user_addr,
+   (void __user *)(unsigned long)a.used_user_addr))
+   return -EINVAL;
+
+   /* Also validate log access for used ring if enabled. */
+   if ((a.flags & (0x1 << VHOST_VRING_F_LOG)) &&
+   !log_access_ok(vq->log_base, a.log_guest_addr,
+   sizeof *vq->used +
+   vq->num * sizeof *vq->used->ring))
+   return -EINVAL;
+   }
+
+   vq->log_used = !!(a.flags & (0x1 << VHOST_VRING_F_LOG));
+   vq->desc = (void __user *)(unsigned long)a.desc_user_addr;
+   vq->avail = (void __user *)(unsigned long)a.avail_user_addr;
+   vq->log_addr = a.log_guest_addr;
+   vq->used = (void __user *)(unsigned long)a.used_user_addr;
+
+   return 0;
+}
+
+static long vhost_vring_set_num_addr(struct vhost_dev *d,
+struct vhost_virtqueue *vq,
+unsigned int ioctl,
+void __user *argp)
+{
+   long r;
+
+   mutex_lock(&vq->mutex);
+
+   switch (ioctl) {
+   case VHOST_SET_VRING_NUM:
+   r = vhost_vring_set_num(d, vq, argp);
+   break;
+   case VHOST_SET_VRING_ADDR:
+   r = vhost_vring_set_addr(d, vq, argp);
+   break;
+   default:
+   BUG();
+   }
+
+   mutex_unlock(&vq->mutex);
+
+   return r;
+}
 long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user 
*argp)
 {
struct file *eventfp, *filep = NULL;
@@ -1477,7 +1575,6 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *arg
struct vhost_virtqueue *vq;
struct vhost_vring_state s;
struct vhost_vring_file f;
-   struct vhost_vring_addr a;
u32 idx;
long r;
 
@@ -1490,26 +1587,14 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned 
int ioctl, void __user *arg
idx = array_index_nospec(idx, d->nvqs);
vq = d->vqs[idx];
 
+   if (ioctl == VHOST_SET_VRING_NUM ||
+   ioctl == VHOST_SET_VRING_ADDR) {
+   return vhost_vring_set_num_addr(d, vq, ioctl, argp);
+   }
+
	mutex_lock(&vq->mutex);
 
switch (ioctl) {
- 

[PATCH net-next 6/6] vhost: access vq metadata through kernel virtual address

2019-05-24 Thread Jason Wang
It was noticed that the copy_to/from_user() friends used to access
virtqueue metadata tend to be very expensive for dataplane
implementations like vhost, since they involve lots of software checks,
speculation barriers and hardware feature toggling (e.g. SMAP). The
extra cost is more obvious when transferring small packets, since the
time spent on metadata accesses becomes more significant.

This patch tries to eliminate those overheads by accessing the metadata
through a direct mapping of those pages. Invalidation callbacks are
implemented for cooperation with general VM management (swap, KSM, THP
or NUMA balancing). We try to get the direct mapping of vq metadata
before each round of packet processing if it doesn't exist. If we fail,
we simply fall back to the copy_to/from_user() friends.

The invalidation and the direct-mapping accesses are synchronized
through a spinlock and RCU: all metadata accesses through the direct
map are protected by RCU, and the setup or invalidation is done under
the spinlock.

This method does not work for highmem pages, which require a temporary
mapping, so for those we just fall back to the normal
copy_to/from_user(). Nor does it work for arches with virtually tagged
caches, since the extra cache flushing needed to eliminate aliases
would result in complex logic and bad performance; for those arches,
this patch simply goes with the copy_to/from_user() friends. This is
done by ruling out the kernel-mapping code through
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.

Note that this is only done when the device IOTLB is not enabled. We
could use a similar method to optimize the IOTLB in the future.

Tests show at most about a 23% improvement in TX PPS when using
virtio-user + vhost_net + xdp1 + TAP on a 2.6GHz Broadwell:

        SMAP on | SMAP off
Before: 5.2Mpps | 7.1Mpps
After:  6.4Mpps | 8.2Mpps

Cc: Andrea Arcangeli 
Cc: James Bottomley 
Cc: Christoph Hellwig 
Cc: David Miller 
Cc: Jerome Glisse 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-par...@vger.kernel.org
Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 515 +-
 drivers/vhost/vhost.h |  36 +++
 2 files changed, 548 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 8bbda1777c61..fcc2ffd3e12a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -299,6 +299,160 @@ static void vhost_vq_meta_reset(struct vhost_dev *d)
__vhost_vq_meta_reset(d->vqs[i]);
 }
 
+#if VHOST_ARCH_CAN_ACCEL_UACCESS
+static void vhost_map_unprefetch(struct vhost_map *map)
+{
+   kfree(map->pages);
+   map->pages = NULL;
+   map->npages = 0;
+   map->addr = NULL;
+}
+
+static void vhost_uninit_vq_maps(struct vhost_virtqueue *vq)
+{
+   struct vhost_map *map[VHOST_NUM_ADDRS];
+   int i;
+
+   spin_lock(&vq->mmu_lock);
+   for (i = 0; i < VHOST_NUM_ADDRS; i++) {
+   map[i] = rcu_dereference_protected(vq->maps[i],
+ lockdep_is_held(&vq->mmu_lock));
+   if (map[i])
+   rcu_assign_pointer(vq->maps[i], NULL);
+   }
+   spin_unlock(&vq->mmu_lock);
+
+   synchronize_rcu();
+
+   for (i = 0; i < VHOST_NUM_ADDRS; i++)
+   if (map[i])
+   vhost_map_unprefetch(map[i]);
+
+}
+
+static void vhost_reset_vq_maps(struct vhost_virtqueue *vq)
+{
+   int i;
+
+   vhost_uninit_vq_maps(vq);
+   for (i = 0; i < VHOST_NUM_ADDRS; i++)
+   vq->uaddrs[i].size = 0;
+}
+
+static bool vhost_map_range_overlap(struct vhost_uaddr *uaddr,
+unsigned long start,
+unsigned long end)
+{
+   if (unlikely(!uaddr->size))
+   return false;
+
+   return !(end < uaddr->uaddr || start > uaddr->uaddr - 1 + uaddr->size);
+}
+
+static void vhost_invalidate_vq_start(struct vhost_virtqueue *vq,
+ int index,
+ unsigned long start,
+ unsigned long end)
+{
+   struct vhost_uaddr *uaddr = &vq->uaddrs[index];
+   struct vhost_map *map;
+   int i;
+
+   if (!vhost_map_range_overlap(uaddr, start, end))
+   return;
+
+   spin_lock(&vq->mmu_lock);
+   ++vq->invalidate_count;
+
+   map = rcu_dereference_protected(vq->maps[index],
+   lockdep_is_held(&vq->mmu_lock));
+   if (map) {
+   if (uaddr->write) {
+   for (i = 0; i < map->npages; i++)
+   set_page_dirty(map->pages[i]);
+   }
+   rcu_assign_pointer(vq->maps[index], NULL);
+   }
+   spin_unlock(&vq->mmu_lock);
+
+   if (map) {
+   synchronize_rcu();
+   vhost_map_unprefetch(map);
+   }
+}
+
+static void vhost_invalidate_vq_end(struct vhost_virtqueue *vq,
+   int index,
+   

[PATCH net-next 2/6] vhost: fine grain userspace memory accessors

2019-05-24 Thread Jason Wang
This is used to hide the metadata addresses from the virtqueue helpers.
This will allow implementing vmap-based fast access to the metadata.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 94 +++
 1 file changed, 77 insertions(+), 17 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 47fb3a297c29..e78c195448f0 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -869,6 +869,34 @@ static inline void __user *__vhost_get_user(struct 
vhost_virtqueue *vq,
ret; \
 })
 
+static inline int vhost_put_avail_event(struct vhost_virtqueue *vq)
+{
+   return vhost_put_user(vq, cpu_to_vhost16(vq, vq->avail_idx),
+ vhost_avail_event(vq));
+}
+
+static inline int vhost_put_used(struct vhost_virtqueue *vq,
+struct vring_used_elem *head, int idx,
+int count)
+{
+   return vhost_copy_to_user(vq, vq->used->ring + idx, head,
+ count * sizeof(*head));
+}
+
+static inline int vhost_put_used_flags(struct vhost_virtqueue *vq)
+
+{
+   return vhost_put_user(vq, cpu_to_vhost16(vq, vq->used_flags),
+ &vq->used->flags);
+}
+
+static inline int vhost_put_used_idx(struct vhost_virtqueue *vq)
+
+{
+   return vhost_put_user(vq, cpu_to_vhost16(vq, vq->last_used_idx),
+ &vq->used->idx);
+}
+
 #define vhost_get_user(vq, x, ptr, type)   \
 ({ \
int ret; \
@@ -907,6 +935,43 @@ static void vhost_dev_unlock_vqs(struct vhost_dev *d)
	mutex_unlock(&d->vqs[i]->mutex);
 }
 
+static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq,
+ __virtio16 *idx)
+{
+   return vhost_get_avail(vq, *idx, &vq->avail->idx);
+}
+
+static inline int vhost_get_avail_head(struct vhost_virtqueue *vq,
+  __virtio16 *head, int idx)
+{
+   return vhost_get_avail(vq, *head,
+  &vq->avail->ring[idx & (vq->num - 1)]);
+}
+
+static inline int vhost_get_avail_flags(struct vhost_virtqueue *vq,
+   __virtio16 *flags)
+{
+   return vhost_get_avail(vq, *flags, &vq->avail->flags);
+}
+
+static inline int vhost_get_used_event(struct vhost_virtqueue *vq,
+  __virtio16 *event)
+{
+   return vhost_get_avail(vq, *event, vhost_used_event(vq));
+}
+
+static inline int vhost_get_used_idx(struct vhost_virtqueue *vq,
+__virtio16 *idx)
+{
+   return vhost_get_used(vq, *idx, &vq->used->idx);
+}
+
+static inline int vhost_get_desc(struct vhost_virtqueue *vq,
+struct vring_desc *desc, int idx)
+{
+   return vhost_copy_from_user(vq, desc, vq->desc + idx, sizeof(*desc));
+}
+
 static int vhost_new_umem_range(struct vhost_umem *umem,
u64 start, u64 size, u64 end,
u64 userspace_addr, int perm)
@@ -1844,8 +1909,7 @@ EXPORT_SYMBOL_GPL(vhost_log_write);
 static int vhost_update_used_flags(struct vhost_virtqueue *vq)
 {
void __user *used;
-   if (vhost_put_user(vq, cpu_to_vhost16(vq, vq->used_flags),
-  &vq->used->flags) < 0)
+   if (vhost_put_used_flags(vq))
return -EFAULT;
if (unlikely(vq->log_used)) {
/* Make sure the flag is seen before log. */
@@ -1862,8 +1926,7 @@ static int vhost_update_used_flags(struct vhost_virtqueue 
*vq)
 
 static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 
avail_event)
 {
-   if (vhost_put_user(vq, cpu_to_vhost16(vq, vq->avail_idx),
-  vhost_avail_event(vq)))
+   if (vhost_put_avail_event(vq))
return -EFAULT;
if (unlikely(vq->log_used)) {
void __user *used;
@@ -1899,7 +1962,7 @@ int vhost_vq_init_access(struct vhost_virtqueue *vq)
r = -EFAULT;
goto err;
}
-   r = vhost_get_used(vq, last_used_idx, &vq->used->idx);
+   r = vhost_get_used_idx(vq, &last_used_idx);
if (r) {
vq_err(vq, "Can't access used idx at %p\n",
   &vq->used->idx);
@@ -2098,7 +2161,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
last_avail_idx = vq->last_avail_idx;
 
if (vq->avail_idx == vq->last_avail_idx) {
-   if (unlikely(vhost_get_avail(vq, avail_idx, &vq->avail->idx))) {
+   if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
vq_err(vq, "Failed to access avail idx at %p\n",
&vq->avail->idx);
return -EFAULT;
@@ -2125,8 +2188,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 
/* Grab the next descriptor number they're advertising, and increment
 * the index we've seen. */
-   if 

[PATCH net-next 1/6] vhost: generalize adding used elem

2019-05-24 Thread Jason Wang
Use one generic vhost_copy_to_user() instead of two dedicated
accessors. This will simplify the conversion to fine-grained
accessors. About a 2% improvement in PPS was seen during a virtio-user
txonly test.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 1e3ed41ae1f3..47fb3a297c29 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -2255,16 +2255,7 @@ static int __vhost_add_used_n(struct vhost_virtqueue *vq,
 
start = vq->last_used_idx & (vq->num - 1);
used = vq->used->ring + start;
-   if (count == 1) {
-   if (vhost_put_user(vq, heads[0].id, &used->id)) {
-   vq_err(vq, "Failed to write used id");
-   return -EFAULT;
-   }
-   if (vhost_put_user(vq, heads[0].len, &used->len)) {
-   vq_err(vq, "Failed to write used len");
-   return -EFAULT;
-   }
-   } else if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
+   if (vhost_copy_to_user(vq, used, heads, count * sizeof *used)) {
vq_err(vq, "Failed to write used");
return -EFAULT;
}
-- 
2.18.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next 3/6] vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch()

2019-05-24 Thread Jason Wang
Rename the function to be more accurate, since it actually tries to
prefetch vq metadata addresses in the IOTLB. It will also be used by a
following patch to prefetch metadata virtual addresses.

Signed-off-by: Jason Wang 
---
 drivers/vhost/net.c   | 4 ++--
 drivers/vhost/vhost.c | 4 ++--
 drivers/vhost/vhost.h | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index df51a35cf537..bf55f995ebae 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -971,7 +971,7 @@ static void handle_tx(struct vhost_net *net)
if (!sock)
goto out;
 
-   if (!vq_iotlb_prefetch(vq))
+   if (!vq_meta_prefetch(vq))
goto out;
 
	vhost_disable_notify(&net->dev, vq);
@@ -1140,7 +1140,7 @@ static void handle_rx(struct vhost_net *net)
if (!sock)
goto out;
 
-   if (!vq_iotlb_prefetch(vq))
+   if (!vq_meta_prefetch(vq))
goto out;
 
	vhost_disable_notify(&net->dev, vq);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index e78c195448f0..b353a00094aa 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1313,7 +1313,7 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
return true;
 }
 
-int vq_iotlb_prefetch(struct vhost_virtqueue *vq)
+int vq_meta_prefetch(struct vhost_virtqueue *vq)
 {
size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
unsigned int num = vq->num;
@@ -1332,7 +1332,7 @@ int vq_iotlb_prefetch(struct vhost_virtqueue *vq)
   num * sizeof(*vq->used->ring) + s,
   VHOST_ADDR_USED);
 }
-EXPORT_SYMBOL_GPL(vq_iotlb_prefetch);
+EXPORT_SYMBOL_GPL(vq_meta_prefetch);
 
 /* Can we log writes? */
 /* Caller should have device mutex but not vq mutex */
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 9490e7ddb340..7a7fc001265f 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -209,7 +209,7 @@ bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
 int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
unsigned int log_num, u64 len,
struct iovec *iov, int count);
-int vq_iotlb_prefetch(struct vhost_virtqueue *vq);
+int vq_meta_prefetch(struct vhost_virtqueue *vq);
 
 struct vhost_msg_node *vhost_new_msg(struct vhost_virtqueue *vq, int type);
 void vhost_enqueue_msg(struct vhost_dev *dev,
-- 
2.18.1



[PATCH net-next 4/6] vhost: introduce helpers to get the size of metadata area

2019-05-24 Thread Jason Wang
Factor out the metadata area size calculation to avoid code duplication,
since it will also be used by kernel VA prefetching.

Signed-off-by: Jason Wang 
---
 drivers/vhost/vhost.c | 51 ---
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index b353a00094aa..8605e44a7001 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -413,6 +413,32 @@ static void vhost_dev_free_iovecs(struct vhost_dev *dev)
vhost_vq_free_iovecs(dev->vqs[i]);
 }
 
+static size_t vhost_get_avail_size(struct vhost_virtqueue *vq,
+  unsigned int num)
+{
+   size_t event __maybe_unused =
+  vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
+
+   return sizeof(*vq->avail) +
+  sizeof(*vq->avail->ring) * num + event;
+}
+
+static size_t vhost_get_used_size(struct vhost_virtqueue *vq,
+ unsigned int num)
+{
+   size_t event __maybe_unused =
+  vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
+
+   return sizeof(*vq->used) +
+  sizeof(*vq->used->ring) * num + event;
+}
+
+static size_t vhost_get_desc_size(struct vhost_virtqueue *vq,
+ unsigned int num)
+{
+   return sizeof(*vq->desc) * num;
+}
+
 void vhost_dev_init(struct vhost_dev *dev,
struct vhost_virtqueue **vqs, int nvqs, int iov_limit)
 {
@@ -1257,13 +1283,9 @@ static bool vq_access_ok(struct vhost_virtqueue *vq, unsigned int num,
 struct vring_used __user *used)
 
 {
-   size_t s __maybe_unused = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
-
-   return access_ok(desc, num * sizeof *desc) &&
-  access_ok(avail,
-sizeof *avail + num * sizeof *avail->ring + s) &&
-  access_ok(used,
-   sizeof *used + num * sizeof *used->ring + s);
+   return access_ok(desc, vhost_get_desc_size(vq, num)) &&
+  access_ok(avail, vhost_get_avail_size(vq, num)) &&
+  access_ok(used, vhost_get_used_size(vq, num));
 }
 
 static void vhost_vq_meta_update(struct vhost_virtqueue *vq,
@@ -1315,22 +1337,18 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq,
 
 int vq_meta_prefetch(struct vhost_virtqueue *vq)
 {
-   size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
unsigned int num = vq->num;
 
if (!vq->iotlb)
return 1;
 
return iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->desc,
-  num * sizeof(*vq->desc), VHOST_ADDR_DESC) &&
+  vhost_get_desc_size(vq, num), VHOST_ADDR_DESC) &&
   iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->avail,
-  sizeof *vq->avail +
-  num * sizeof(*vq->avail->ring) + s,
+  vhost_get_avail_size(vq, num),
   VHOST_ADDR_AVAIL) &&
   iotlb_access_ok(vq, VHOST_ACCESS_WO, (u64)(uintptr_t)vq->used,
-  sizeof *vq->used +
-  num * sizeof(*vq->used->ring) + s,
-  VHOST_ADDR_USED);
+  vhost_get_used_size(vq, num), VHOST_ADDR_USED);
 }
 EXPORT_SYMBOL_GPL(vq_meta_prefetch);
 
@@ -1347,13 +1365,10 @@ EXPORT_SYMBOL_GPL(vhost_log_access_ok);
 static bool vq_log_access_ok(struct vhost_virtqueue *vq,
 void __user *log_base)
 {
-   size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
-
return vq_memory_access_ok(log_base, vq->umem,
   vhost_has_feature(vq, VHOST_F_LOG_ALL)) &&
(!vq->log_used || log_access_ok(log_base, vq->log_addr,
-   sizeof *vq->used +
-   vq->num * sizeof *vq->used->ring + s));
+ vhost_get_used_size(vq, vq->num)));
 }
 
 /* Can we start vq? */
-- 
2.18.1



[PATCH net-next 0/6] vhost: accelerate metadata access

2019-05-24 Thread Jason Wang
Hi:

This series tries to access virtqueue metadata through kernel virtual
address instead of copy_user() friends since they had too much
overheads like checks, spec barriers or even hardware feature
toggling like SMAP. This is done by setting up kernel addresses through
direct mapping and coordinating VM management with MMU notifiers.

Test shows about 23% improvement on TX PPS. TCP_STREAM doesn't see
obvious improvement.

Thanks

Changes from RFC V3:
- rebase to net-next
- Tweak on the comments
Changes from RFC V2:
- switch to use direct mapping instead of vmap()
- switch to use spinlock + RCU to synchronize MMU notifier and vhost
  data/control path
- set dirty pages in the invalidation callbacks
- always use copy_to/from_user() friends for the archs that may need
  flush_dcache_page()
- various minor fixes
Changes from V4:
- use invalidate_range() instead of invalidate_range_start()
- track dirty pages
Changes from V3:
- don't try to use vmap for file backed pages
- rebase to master
Changes from V2:
- fix buggy range overlapping check
- tear down MMU notifier during vhost ioctl to make sure
  invalidation request can read metadata userspace address and vq size
  without holding vq mutex.
Changes from V1:
- instead of pinning pages, use MMU notifier to invalidate vmaps
  and remap during metadata prefetch
- fix build warning on MIPS

Jason Wang (6):
  vhost: generalize adding used elem
  vhost: fine grain userspace memory accessors
  vhost: rename vq_iotlb_prefetch() to vq_meta_prefetch()
  vhost: introduce helpers to get the size of metadata area
  vhost: factor out setting vring addr and num
  vhost: access vq metadata through kernel virtual address

 drivers/vhost/net.c   |   4 +-
 drivers/vhost/vhost.c | 850 --
 drivers/vhost/vhost.h |  38 +-
 3 files changed, 766 insertions(+), 126 deletions(-)

-- 
2.18.1
