Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

2020-02-06 Thread Michael S. Tsirkin
On Fri, Feb 07, 2020 at 08:47:14AM +0100, Christian Borntraeger wrote:
> Also adding Cornelia.
> 
> 
> On 06.02.20 23:17, Michael S. Tsirkin wrote:
> > On Thu, Feb 06, 2020 at 04:12:21PM +0100, Christian Borntraeger wrote:
> >>
> >>
> >> On 06.02.20 15:22, epere...@redhat.com wrote:
> >>> Hi Christian.
> >>>
> >>> Could you try this patch on top of ("38ced0208491 vhost: use batched 
> >>> version by default")?
> >>>
> >>> It will not solve your first random crash but it should help with the
> >>> loss of network connectivity.
> >>>
> >>> Please let me know how it goes.
> >>
> >>
> >> 38ced0208491 + this seem to be ok.
> >>
> >> Not sure if you can make out anything of this (and the previous git bisect 
> >> log)
> > 
> > Yes it does - it tells me that this is just a bad split-up of patches, and there's
> > still a real bug that caused worse crashes :)
> > 
> > So I just pushed batch-v4.
> > I expect that will fail, and bisect to give us
> > vhost: batching fetches
> > Can you try that please?
> > 
> 
> yes.
> 
> eccb852f1fe6bede630e2e4f1a121a81e34354ab is the first bad commit
> commit eccb852f1fe6bede630e2e4f1a121a81e34354ab
> Author: Michael S. Tsirkin 
> Date:   Mon Oct 7 06:11:18 2019 -0400
> 
> vhost: batching fetches
> 
> With this patch applied, new and old code perform identically.
> 
> Lots of extra optimizations are now possible, e.g.
> we can fetch multiple heads with copy_from/to_user now.
> We can get rid of maintaining the log array.  Etc etc.
> 
> Signed-off-by: Michael S. Tsirkin 
> 
>  drivers/vhost/test.c  |  2 +-
>  drivers/vhost/vhost.c | 39 ++-
>  drivers/vhost/vhost.h |  4 +++-
>  3 files changed, 38 insertions(+), 7 deletions(-)
> 


And the symptom is still the same - random crashes
after a bit of traffic, right?

> > 
>   



Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

2020-02-06 Thread Christian Borntraeger
Also adding Cornelia.


On 06.02.20 23:17, Michael S. Tsirkin wrote:
> On Thu, Feb 06, 2020 at 04:12:21PM +0100, Christian Borntraeger wrote:
>>
>>
>> On 06.02.20 15:22, epere...@redhat.com wrote:
>>> Hi Christian.
>>>
>>> Could you try this patch on top of ("38ced0208491 vhost: use batched 
>>> version by default")?
>>>
> >>> It will not solve your first random crash but it should help with the
> >>> loss of network connectivity.
> >>>
> >>> Please let me know how it goes.
>>
>>
>> 38ced0208491 + this seem to be ok.
>>
>> Not sure if you can make out anything of this (and the previous git bisect 
>> log)
> 
> Yes it does - it tells me that this is just a bad split-up of patches, and there's
> still a real bug that caused worse crashes :)
> 
> So I just pushed batch-v4.
> I expect that will fail, and bisect to give us
> vhost: batching fetches
> Can you try that please?
> 

yes.

eccb852f1fe6bede630e2e4f1a121a81e34354ab is the first bad commit
commit eccb852f1fe6bede630e2e4f1a121a81e34354ab
Author: Michael S. Tsirkin 
Date:   Mon Oct 7 06:11:18 2019 -0400

vhost: batching fetches

With this patch applied, new and old code perform identically.

Lots of extra optimizations are now possible, e.g.
we can fetch multiple heads with copy_from/to_user now.
We can get rid of maintaining the log array.  Etc etc.

Signed-off-by: Michael S. Tsirkin 

 drivers/vhost/test.c  |  2 +-
 drivers/vhost/vhost.c | 39 ++-
 drivers/vhost/vhost.h |  4 +++-
 3 files changed, 38 insertions(+), 7 deletions(-)


> 




[PATCH v2 2/4] drm/virtio: resource teardown tweaks

2020-02-06 Thread Gerd Hoffmann
Add new virtio_gpu_cleanup_object() helper function for object cleanup.
Wire up callback function for resource unref, do cleanup from callback
when we know the host stopped using the resource.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h|  4 +++-
 drivers/gpu/drm/virtio/virtgpu_object.c | 19 +++-
 drivers/gpu/drm/virtio/virtgpu_vq.c | 29 ++---
 3 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 7e69c06e168e..1bc13f6b161b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -114,6 +114,7 @@ struct virtio_gpu_vbuffer {
char *resp_buf;
int resp_size;
virtio_gpu_resp_cb resp_cb;
+   void *resp_cb_data;
 
struct virtio_gpu_object_array *objs;
struct list_head list;
@@ -262,7 +263,7 @@ void virtio_gpu_cmd_create_resource(struct 
virtio_gpu_device *vgdev,
struct virtio_gpu_object_array *objs,
struct virtio_gpu_fence *fence);
 void virtio_gpu_cmd_unref_resource(struct virtio_gpu_device *vgdev,
-  uint32_t resource_id);
+  struct virtio_gpu_object *bo);
 void virtio_gpu_cmd_transfer_to_host_2d(struct virtio_gpu_device *vgdev,
uint64_t offset,
uint32_t width, uint32_t height,
@@ -355,6 +356,7 @@ void virtio_gpu_fence_event_process(struct 
virtio_gpu_device *vdev,
u64 last_seq);
 
 /* virtio_gpu_object */
+void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo);
 struct drm_gem_object *virtio_gpu_create_object(struct drm_device *dev,
size_t size);
 int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 017a9e0fc3bb..28a161af7503 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -61,6 +61,14 @@ static void virtio_gpu_resource_id_put(struct 
virtio_gpu_device *vgdev, uint32_t
}
 }
 
+void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
+{
+   struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
+
+   virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
+   drm_gem_shmem_free_object(&bo->base.base);
+}
+
 static void virtio_gpu_free_object(struct drm_gem_object *obj)
 {
struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
@@ -68,11 +76,12 @@ static void virtio_gpu_free_object(struct drm_gem_object 
*obj)
 
if (bo->pages)
virtio_gpu_object_detach(vgdev, bo);
-   if (bo->created)
-   virtio_gpu_cmd_unref_resource(vgdev, bo->hw_res_handle);
-   virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
-
-   drm_gem_shmem_free_object(obj);
+   if (bo->created) {
+   virtio_gpu_cmd_unref_resource(vgdev, bo);
+   /* completion handler calls virtio_gpu_cleanup_object() */
+   return;
+   }
+   virtio_gpu_cleanup_object(bo);
 }
 
 static const struct drm_gem_object_funcs virtio_gpu_gem_funcs = {
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index df499fb64ac7..4e22c3914f94 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -164,6 +164,16 @@ static void *virtio_gpu_alloc_cmd(struct virtio_gpu_device 
*vgdev,
 NULL);
 }
 
+static void *virtio_gpu_alloc_cmd_cb(struct virtio_gpu_device *vgdev,
+struct virtio_gpu_vbuffer **vbuffer_p,
+int size,
+virtio_gpu_resp_cb cb)
+{
+   return virtio_gpu_alloc_cmd_resp(vgdev, cb, vbuffer_p, size,
+sizeof(struct virtio_gpu_ctrl_hdr),
+NULL);
+}
+
 static void free_vbuf(struct virtio_gpu_device *vgdev,
  struct virtio_gpu_vbuffer *vbuf)
 {
@@ -507,18 +517,31 @@ void virtio_gpu_cmd_create_resource(struct 
virtio_gpu_device *vgdev,
bo->created = true;
 }
 
+static void virtio_gpu_cmd_unref_cb(struct virtio_gpu_device *vgdev,
+   struct virtio_gpu_vbuffer *vbuf)
+{
+   struct virtio_gpu_object *bo;
+
+   bo = vbuf->resp_cb_data;
+   vbuf->resp_cb_data = NULL;
+
+   virtio_gpu_cleanup_object(bo);
+}
+
 void virtio_gpu_cmd_unref_resource(struct virtio_gpu_device *vgdev,
-  uint32_t resource_id)
+  struct virtio_gpu_object *bo)
 {
struct virtio_gpu_resource_unref *cmd_p;
struct 

[PATCH v2 1/4] drm/virtio: simplify virtio_gpu_alloc_cmd

2020-02-06 Thread Gerd Hoffmann
Just call virtio_gpu_alloc_cmd_resp with some fixed args
instead of duplicating most of the function body.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_vq.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 41e475fbd67b..df499fb64ac7 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -120,23 +120,6 @@ virtio_gpu_vbuf_ctrl_hdr(struct virtio_gpu_vbuffer *vbuf)
return (struct virtio_gpu_ctrl_hdr *)vbuf->buf;
 }
 
-static void *virtio_gpu_alloc_cmd(struct virtio_gpu_device *vgdev,
- struct virtio_gpu_vbuffer **vbuffer_p,
- int size)
-{
-   struct virtio_gpu_vbuffer *vbuf;
-
-   vbuf = virtio_gpu_get_vbuf(vgdev, size,
-  sizeof(struct virtio_gpu_ctrl_hdr),
-  NULL, NULL);
-   if (IS_ERR(vbuf)) {
-   *vbuffer_p = NULL;
-   return ERR_CAST(vbuf);
-   }
-   *vbuffer_p = vbuf;
-   return vbuf->buf;
-}
-
 static struct virtio_gpu_update_cursor*
 virtio_gpu_alloc_cursor(struct virtio_gpu_device *vgdev,
struct virtio_gpu_vbuffer **vbuffer_p)
@@ -172,6 +155,15 @@ static void *virtio_gpu_alloc_cmd_resp(struct 
virtio_gpu_device *vgdev,
return (struct virtio_gpu_command *)vbuf->buf;
 }
 
+static void *virtio_gpu_alloc_cmd(struct virtio_gpu_device *vgdev,
+ struct virtio_gpu_vbuffer **vbuffer_p,
+ int size)
+{
+   return virtio_gpu_alloc_cmd_resp(vgdev, NULL, vbuffer_p, size,
+sizeof(struct virtio_gpu_ctrl_hdr),
+NULL);
+}
+
 static void free_vbuf(struct virtio_gpu_device *vgdev,
  struct virtio_gpu_vbuffer *vbuf)
 {
-- 
2.18.1



[PATCH v2 3/4] drm/virtio: move mapping teardown to virtio_gpu_cleanup_object()

2020-02-06 Thread Gerd Hoffmann
Stop sending DETACH_BACKING commands; that will happen anyway when
releasing resources via UNREF.  Handle guest-side cleanup in
virtio_gpu_cleanup_object(), called when the host finished processing
the UNREF command.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h|  2 --
 drivers/gpu/drm/virtio/virtgpu_object.c | 14 ++--
 drivers/gpu/drm/virtio/virtgpu_vq.c | 46 -
 3 files changed, 12 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 1bc13f6b161b..d37ddd7644f6 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -281,8 +281,6 @@ void virtio_gpu_cmd_set_scanout(struct virtio_gpu_device 
*vgdev,
 int virtio_gpu_object_attach(struct virtio_gpu_device *vgdev,
 struct virtio_gpu_object *obj,
 struct virtio_gpu_fence *fence);
-void virtio_gpu_object_detach(struct virtio_gpu_device *vgdev,
- struct virtio_gpu_object *obj);
 int virtio_gpu_attach_status_page(struct virtio_gpu_device *vgdev);
 int virtio_gpu_detach_status_page(struct virtio_gpu_device *vgdev);
 void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index 28a161af7503..bce2b3d843fe 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -23,6 +23,7 @@
  * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
  */
 
+#include 
 #include 
 
 #include "virtgpu_drv.h"
@@ -65,6 +66,17 @@ void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
 {
struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
 
+   if (bo->pages) {
+   if (bo->mapped) {
+   dma_unmap_sg(vgdev->vdev->dev.parent,
+bo->pages->sgl, bo->mapped,
+DMA_TO_DEVICE);
+   bo->mapped = 0;
+   }
+   sg_free_table(bo->pages);
+   bo->pages = NULL;
+   drm_gem_shmem_unpin(&bo->base.base);
+   }
virtio_gpu_resource_id_put(vgdev, bo->hw_res_handle);
drm_gem_shmem_free_object(&bo->base.base);
 }
@@ -74,8 +86,6 @@ static void virtio_gpu_free_object(struct drm_gem_object *obj)
struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
struct virtio_gpu_device *vgdev = bo->base.base.dev->dev_private;
 
-   if (bo->pages)
-   virtio_gpu_object_detach(vgdev, bo);
if (bo->created) {
virtio_gpu_cmd_unref_resource(vgdev, bo);
/* completion handler calls virtio_gpu_cleanup_object() */
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 4e22c3914f94..87c439156151 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -545,22 +545,6 @@ void virtio_gpu_cmd_unref_resource(struct 
virtio_gpu_device *vgdev,
virtio_gpu_queue_ctrl_buffer(vgdev, vbuf);
 }
 
-static void virtio_gpu_cmd_resource_inval_backing(struct virtio_gpu_device 
*vgdev,
- uint32_t resource_id,
- struct virtio_gpu_fence 
*fence)
-{
-   struct virtio_gpu_resource_detach_backing *cmd_p;
-   struct virtio_gpu_vbuffer *vbuf;
-
-   cmd_p = virtio_gpu_alloc_cmd(vgdev, &vbuf, sizeof(*cmd_p));
-   memset(cmd_p, 0, sizeof(*cmd_p));
-
-   cmd_p->hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING);
-   cmd_p->resource_id = cpu_to_le32(resource_id);
-
-   virtio_gpu_queue_fenced_ctrl_buffer(vgdev, vbuf, fence);
-}
-
 void virtio_gpu_cmd_set_scanout(struct virtio_gpu_device *vgdev,
uint32_t scanout_id, uint32_t resource_id,
uint32_t width, uint32_t height,
@@ -1155,36 +1139,6 @@ int virtio_gpu_object_attach(struct virtio_gpu_device 
*vgdev,
return 0;
 }
 
-void virtio_gpu_object_detach(struct virtio_gpu_device *vgdev,
- struct virtio_gpu_object *obj)
-{
-   bool use_dma_api = !virtio_has_iommu_quirk(vgdev->vdev);
-
-   if (WARN_ON_ONCE(!obj->pages))
-   return;
-
-   if (use_dma_api && obj->mapped) {
-   struct virtio_gpu_fence *fence = virtio_gpu_fence_alloc(vgdev);
-   /* detach backing and wait for the host process it ... */
-   virtio_gpu_cmd_resource_inval_backing(vgdev, obj->hw_res_handle, fence);
-   dma_fence_wait(&fence->f, true);
-   dma_fence_put(&fence->f);
-
-   /* ... then tear down iommu mappings */
-   dma_unmap_sg(vgdev->vdev->dev.parent,
-obj->pages->sgl, obj->mapped,
-

[PATCH v2 4/4] drm/virtio: move virtio_gpu_mem_entry initialization to new function

2020-02-06 Thread Gerd Hoffmann
Introduce new virtio_gpu_object_shmem_init() helper function which will
create the virtio_gpu_mem_entry array, containing the backing storage
information for the host.  For the most part this just moves code from
virtio_gpu_object_attach().

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h|  4 +-
 drivers/gpu/drm/virtio/virtgpu_object.c | 55 -
 drivers/gpu/drm/virtio/virtgpu_vq.c | 51 ++-
 3 files changed, 60 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index d37ddd7644f6..6c78c77a2afc 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -71,6 +71,7 @@ struct virtio_gpu_object {
 
struct sg_table *pages;
uint32_t mapped;
+
bool dumb;
bool created;
 };
@@ -280,7 +281,8 @@ void virtio_gpu_cmd_set_scanout(struct virtio_gpu_device 
*vgdev,
uint32_t x, uint32_t y);
 int virtio_gpu_object_attach(struct virtio_gpu_device *vgdev,
 struct virtio_gpu_object *obj,
-struct virtio_gpu_fence *fence);
+struct virtio_gpu_mem_entry *ents,
+unsigned int nents);
 int virtio_gpu_attach_status_page(struct virtio_gpu_device *vgdev);
 int virtio_gpu_detach_status_page(struct virtio_gpu_device *vgdev);
 void virtio_gpu_cursor_ping(struct virtio_gpu_device *vgdev,
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index bce2b3d843fe..8870ee23ff2b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -121,6 +121,51 @@ struct drm_gem_object *virtio_gpu_create_object(struct 
drm_device *dev,
return &bo->base.base;
 }
 
+static int virtio_gpu_object_shmem_init(struct virtio_gpu_device *vgdev,
+   struct virtio_gpu_object *bo,
+   struct virtio_gpu_mem_entry **ents,
+   unsigned int *nents)
+{
+   bool use_dma_api = !virtio_has_iommu_quirk(vgdev->vdev);
+   struct scatterlist *sg;
+   int si, ret;
+
+   ret = drm_gem_shmem_pin(&bo->base.base);
+   if (ret < 0)
+   return -EINVAL;
+
+   bo->pages = drm_gem_shmem_get_sg_table(&bo->base.base);
+   if (!bo->pages) {
+   drm_gem_shmem_unpin(&bo->base.base);
+   return -EINVAL;
+   }
+
+   if (use_dma_api) {
+   bo->mapped = dma_map_sg(vgdev->vdev->dev.parent,
+   bo->pages->sgl, bo->pages->nents,
+   DMA_TO_DEVICE);
+   *nents = bo->mapped;
+   } else {
+   *nents = bo->pages->nents;
+   }
+
+   *ents = kmalloc_array(*nents, sizeof(struct virtio_gpu_mem_entry),
+ GFP_KERNEL);
+   if (!(*ents)) {
+   DRM_ERROR("failed to allocate ent list\n");
+   return -ENOMEM;
+   }
+
+   for_each_sg(bo->pages->sgl, sg, *nents, si) {
+   (*ents)[si].addr = cpu_to_le64(use_dma_api
+  ? sg_dma_address(sg)
+  : sg_phys(sg));
+   (*ents)[si].length = cpu_to_le32(sg->length);
+   (*ents)[si].padding = 0;
+   }
+   return 0;
+}
+
 int virtio_gpu_object_create(struct virtio_gpu_device *vgdev,
 struct virtio_gpu_object_params *params,
 struct virtio_gpu_object **bo_ptr,
@@ -129,6 +174,8 @@ int virtio_gpu_object_create(struct virtio_gpu_device 
*vgdev,
struct virtio_gpu_object_array *objs = NULL;
struct drm_gem_shmem_object *shmem_obj;
struct virtio_gpu_object *bo;
+   struct virtio_gpu_mem_entry *ents;
+   unsigned int nents;
int ret;
 
*bo_ptr = NULL;
@@ -165,7 +212,13 @@ int virtio_gpu_object_create(struct virtio_gpu_device 
*vgdev,
   objs, fence);
}
 
-   ret = virtio_gpu_object_attach(vgdev, bo, NULL);
+   ret = virtio_gpu_object_shmem_init(vgdev, bo, &ents, &nents);
+   if (ret != 0) {
+   virtio_gpu_free_object(&shmem_obj->base);
+   return ret;
+   }
+
+   ret = virtio_gpu_object_attach(vgdev, bo, ents, nents);
if (ret != 0) {
virtio_gpu_free_object(&shmem_obj->base);
return ret;
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 87c439156151..8360f7338209 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -1086,56 +1086,11 @@ void virtio_gpu_cmd_submit(struct virtio_gpu_device 
*vgdev,
 
 int virtio_gpu_object_attach(struct virtio_gpu_device *vgdev,

[PULL] vhost: cleanups and fixes

2020-02-06 Thread Michael S. Tsirkin
The following changes since commit d5226fa6dbae0569ee43ecfc08bdcd6770fc4755:

  Linux 5.5 (2020-01-26 16:23:03 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 1ad6f58ea9364b0a5d8ae06249653ac9304a8578:

  virtio_balloon: Fix memory leaks on errors in virtballoon_probe() (2020-02-06 
03:40:27 -0500)


virtio: fixes, cleanups

Some bug fixes/cleanups.
Deprecated scsi passthrough for blk removed.

Signed-off-by: Michael S. Tsirkin 


Christoph Hellwig (1):
  virtio-blk: remove VIRTIO_BLK_F_SCSI support

Daniel Verkamp (2):
  virtio-balloon: initialize all vq callbacks
  virtio-pci: check name when counting MSI-X vectors

David Hildenbrand (2):
  virtio-balloon: Fix memory leak when unloading while hinting is in 
progress
  virtio_balloon: Fix memory leaks on errors in virtballoon_probe()

Michael S. Tsirkin (1):
  virtio_balloon: prevent pfn array overflow

Yangtao Li (1):
  virtio-mmio: convert to devm_platform_ioremap_resource

 arch/powerpc/configs/guest.config  |   1 -
 drivers/block/Kconfig  |  10 
 drivers/block/virtio_blk.c | 115 +
 drivers/virtio/virtio_balloon.c|  21 +--
 drivers/virtio/virtio_mmio.c   |  15 +
 drivers/virtio/virtio_pci_common.c |   2 +-
 6 files changed, 22 insertions(+), 142 deletions(-)



[PATCH v2] tools/virtio: option to build an out of tree module

2020-02-06 Thread Michael S. Tsirkin
Handy for testing with distro kernels.
Warn that the resulting module is completely unsupported,
and isn't intended for production use.

Usage:
make oot # builds vhost_test.ko, vhost.ko
make oot-clean # cleans out files created

Signed-off-by: Michael S. Tsirkin 
---

changes from v1:
lots of refactoring
disable all modules except vhost by default (more of a chance
 it'll build)
oot-clean target

 tools/virtio/Makefile | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 8e2a908115c2..f33f32f1d208 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -8,7 +8,32 @@ CFLAGS += -g -O2 -Werror -Wall -I. -I../include/ -I 
../../usr/include/ -Wno-poin
 vpath %.c ../../drivers/virtio ../../drivers/vhost
 mod:
${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V}
-.PHONY: all test mod clean
+
+#oot: build vhost as an out of tree module for a distro kernel
+#no effort is taken to make it actually build or work, but tends to mostly work
+#if the distro kernel is very close to upstream
+#unsupported! this is a development tool only, don't use the
+#resulting modules in production!
+OOT_KSRC=/lib/modules/$$(uname -r)/build
+OOT_VHOST=`pwd`/../../drivers/vhost
+#Everyone depends on vhost
+#Tweak the below to enable more modules
+OOT_CONFIGS=\
+   CONFIG_VHOST=m \
+   CONFIG_VHOST_NET=n \
+   CONFIG_VHOST_SCSI=n \
+   CONFIG_VHOST_VSOCK=n
+OOT_BUILD=KCFLAGS="-I "${OOT_VHOST} ${MAKE} -C ${OOT_KSRC} V=${V}
+oot-build:
+   echo "UNSUPPORTED! Don't use the resulting modules in production!"
+   ${OOT_BUILD} M=`pwd`/vhost_test
+   ${OOT_BUILD} M=${OOT_VHOST} ${OOT_CONFIGS}
+
+oot-clean: oot-build
+oot: oot-build
+oot-clean: OOT_BUILD+=clean
+
+.PHONY: all test mod clean vhost oot oot-clean oot-build
 clean:
${RM} *.o vringh_test virtio_test vhost_test/*.o vhost_test/.*.cmd \
   vhost_test/Module.symvers vhost_test/modules.order *.d
-- 
MST



Re: [PATCH] tools/virtio: option to build an out of tree module

2020-02-06 Thread Michael S. Tsirkin
On Fri, Feb 07, 2020 at 11:38:20AM +0800, Jason Wang wrote:
> 
> On 2020/2/6 下午4:01, Michael S. Tsirkin wrote:
> > Handy for testing with distro kernels.
> > Warn that the resulting module is completely unsupported,
> > and isn't intended for production use.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >   tools/virtio/Makefile | 13 -
> >   1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
> > index 8e2a908115c2..94106cde49e3 100644
> > --- a/tools/virtio/Makefile
> > +++ b/tools/virtio/Makefile
> > @@ -8,7 +8,18 @@ CFLAGS += -g -O2 -Werror -Wall -I. -I../include/ -I 
> > ../../usr/include/ -Wno-poin
> >   vpath %.c ../../drivers/virtio ../../drivers/vhost
> >   mod:
> > ${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V}
> > -.PHONY: all test mod clean
> > +
> > +#oot: build vhost as an out of tree module for a distro kernel
> > +#no effort is taken to make it actually build or work, but tends to mostly 
> > work
> > +#if the distro kernel is very close to upstream
> > +#unsupported! this is a development tool only, don't use the
> > +#resulting modules in production!
> > +oot:
> > +   echo "UNSUPPORTED! Don't use the resulting modules in production!"
> > +   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C 
> > /usr/src/kernels/$$(uname -r) M=`pwd`/vhost_test V=${V}
> > +   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C 
> > /usr/src/kernels/$$(uname -r) M=`pwd`/../../drivers/vhost V=${V} 
> > CONFIG_VHOST_VSOCK=n
> 
> 
> Any reason for disabling vsock here?
> 
> Thanks

It's just moving too fast with its internal API changes for a simplistic
oot build like this to work. But I guess I need to make it a bit
more generic. I'll try.

> 
> > +
> > +.PHONY: all test mod clean vhost oot
> >   clean:
> > ${RM} *.o vringh_test virtio_test vhost_test/*.o vhost_test/.*.cmd \
> > vhost_test/Module.symvers vhost_test/modules.order *.d


Re: [PATCH v2] drm/virtio: fix ring free check

2020-02-06 Thread Chia-I Wu
On Thu, Feb 6, 2020 at 10:47 PM Gerd Hoffmann  wrote:
>
> If the virtio device supports indirect ring descriptors we need only one
> ring entry for the whole command.  Take that into account when checking
> whether the virtqueue has enough free entries for our command.
>
> Signed-off-by: Gerd Hoffmann 
> ---
>  drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
>  drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 +
>  drivers/gpu/drm/virtio/virtgpu_kms.c | 3 +++
>  drivers/gpu/drm/virtio/virtgpu_vq.c  | 3 +++
>  4 files changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
> b/drivers/gpu/drm/virtio/virtgpu_drv.h
> index 7e69c06e168e..d278c8c50f39 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_drv.h
> +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
> @@ -193,6 +193,7 @@ struct virtio_gpu_device {
>
> bool has_virgl_3d;
> bool has_edid;
> +   bool has_indirect;
has_indirect_desc?  Either way,

Reviewed-by: Chia-I Wu 
>
> struct work_struct config_changed_work;
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c 
> b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
> index 5156e6b279db..e27120d512b0 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
> @@ -47,6 +47,7 @@ static int virtio_gpu_features(struct seq_file *m, void 
> *data)
>
> virtio_add_bool(m, "virgl", vgdev->has_virgl_3d);
> virtio_add_bool(m, "edid", vgdev->has_edid);
> +   virtio_add_bool(m, "indirect", vgdev->has_indirect);
> virtio_add_int(m, "cap sets", vgdev->num_capsets);
> virtio_add_int(m, "scanouts", vgdev->num_scanouts);
> return 0;
> diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
> b/drivers/gpu/drm/virtio/virtgpu_kms.c
> index 2f5773e43557..c1086df49816 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_kms.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
> @@ -159,6 +159,9 @@ int virtio_gpu_init(struct drm_device *dev)
> if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_EDID)) {
> vgdev->has_edid = true;
> }
> +   if (virtio_has_feature(vgdev->vdev, VIRTIO_RING_F_INDIRECT_DESC)) {
> +   vgdev->has_indirect = true;
> +   }
>
> DRM_INFO("features: %cvirgl %cedid\n",
>  vgdev->has_virgl_3d ? '+' : '-',
> diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
> b/drivers/gpu/drm/virtio/virtgpu_vq.c
> index 41e475fbd67b..cc02fc4bab2a 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_vq.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
> @@ -330,6 +330,9 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
> virtio_gpu_device *vgdev,
> bool notify = false;
> int ret;
>
> +   if (vgdev->has_indirect)
> +   elemcnt = 1;
> +
>  again:
> spin_lock(&vgdev->ctrlq.qlock);
>
> --
> 2.18.1
>


[PATCH v2] drm/virtio: fix ring free check

2020-02-06 Thread Gerd Hoffmann
If the virtio device supports indirect ring descriptors we need only one
ring entry for the whole command.  Take that into account when checking
whether the virtqueue has enough free entries for our command.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
 drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 +
 drivers/gpu/drm/virtio/virtgpu_kms.c | 3 +++
 drivers/gpu/drm/virtio/virtgpu_vq.c  | 3 +++
 4 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 7e69c06e168e..d278c8c50f39 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -193,6 +193,7 @@ struct virtio_gpu_device {
 
bool has_virgl_3d;
bool has_edid;
+   bool has_indirect;
 
struct work_struct config_changed_work;
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c 
b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
index 5156e6b279db..e27120d512b0 100644
--- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c
+++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
@@ -47,6 +47,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data)
 
virtio_add_bool(m, "virgl", vgdev->has_virgl_3d);
virtio_add_bool(m, "edid", vgdev->has_edid);
+   virtio_add_bool(m, "indirect", vgdev->has_indirect);
virtio_add_int(m, "cap sets", vgdev->num_capsets);
virtio_add_int(m, "scanouts", vgdev->num_scanouts);
return 0;
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 2f5773e43557..c1086df49816 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -159,6 +159,9 @@ int virtio_gpu_init(struct drm_device *dev)
if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_EDID)) {
vgdev->has_edid = true;
}
+   if (virtio_has_feature(vgdev->vdev, VIRTIO_RING_F_INDIRECT_DESC)) {
+   vgdev->has_indirect = true;
+   }
 
DRM_INFO("features: %cvirgl %cedid\n",
 vgdev->has_virgl_3d ? '+' : '-',
diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 41e475fbd67b..cc02fc4bab2a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -330,6 +330,9 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
virtio_gpu_device *vgdev,
bool notify = false;
int ret;
 
+   if (vgdev->has_indirect)
+   elemcnt = 1;
+
 again:
spin_lock(&vgdev->ctrlq.qlock);
 
-- 
2.18.1



Re: [PATCH] drm/virtio: fix ring free check

2020-02-06 Thread Gerd Hoffmann
  Hi,

> > +   indirect = virtio_has_feature(vgdev->vdev, 
> > VIRTIO_RING_F_INDIRECT_DESC);
> > +   vqcnt = indirect ? 1 : elemcnt;
> Is the feature dynamic and require the lock held?  If not, the result
> can be cached and the fixup can happen before grabbing the lock

Not dynamic, so yes, caching makes sense.
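
For reference, that is what the v2 posting elsewhere in this thread ends up
doing - cache the feature bit once at init time and apply the fixup before
taking the lock, roughly:

	/* virtio_gpu_init(): cached once at probe time */
	if (virtio_has_feature(vgdev->vdev, VIRTIO_RING_F_INDIRECT_DESC))
		vgdev->has_indirect = true;

	/* virtio_gpu_queue_ctrl_sgs(): fixup before grabbing ctrlq.qlock */
	if (vgdev->has_indirect)
		elemcnt = 1;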

cheers,
  Gerd



Re: [PATCH] tools/virtio: option to build an out of tree module

2020-02-06 Thread Jason Wang


On 2020/2/6 下午4:01, Michael S. Tsirkin wrote:

Handy for testing with distro kernels.
Warn that the resulting module is completely unsupported,
and isn't intended for production use.

Signed-off-by: Michael S. Tsirkin 
---
  tools/virtio/Makefile | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 8e2a908115c2..94106cde49e3 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -8,7 +8,18 @@ CFLAGS += -g -O2 -Werror -Wall -I. -I../include/ -I 
../../usr/include/ -Wno-poin
  vpath %.c ../../drivers/virtio ../../drivers/vhost
  mod:
${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V}
-.PHONY: all test mod clean
+
+#oot: build vhost as an out of tree module for a distro kernel
+#no effort is taken to make it actually build or work, but tends to mostly work
+#if the distro kernel is very close to upstream
+#unsupported! this is a development tool only, don't use the
+#resulting modules in production!
+oot:
+   echo "UNSUPPORTED! Don't use the resulting modules in production!"
+   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C 
/usr/src/kernels/$$(uname -r) M=`pwd`/vhost_test V=${V}
+   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C 
/usr/src/kernels/$$(uname -r) M=`pwd`/../../drivers/vhost V=${V} CONFIG_VHOST_VSOCK=n



Any reason for disabling vsock here?

Thanks



+
+.PHONY: all test mod clean vhost oot
  clean:
${RM} *.o vringh_test virtio_test vhost_test/*.o vhost_test/.*.cmd \
vhost_test/Module.symvers vhost_test/modules.order *.d



Re: [PATCH] virtio_balloon: prevent pfn array overflow

2020-02-06 Thread Jason Wang


On 2020/2/6 下午3:47, Michael S. Tsirkin wrote:

Make sure, at build time, that pfn array is big enough to hold a single
page.  It happens to be true since the PAGE_SHIFT value at the moment is
20, which is 1M - exactly 256 4K balloon pages.
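
(For reference - assuming the driver's usual constants, VIRTIO_BALLOON_PFN_SHIFT == 12
and VIRTIO_BALLOON_ARRAY_PFNS_MAX == 256 - VIRTIO_BALLOON_PAGES_PER_PAGE is
PAGE_SIZE >> 12, so a 1M page needs 2^20 >> 12 = 256 pfn entries, exactly the
array size the BUILD_BUG_ON below checks against.)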

Signed-off-by: Michael S. Tsirkin 
---
  drivers/virtio/virtio_balloon.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8e400ece9273..2457c54b6185 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -158,6 +158,8 @@ static void set_page_pfns(struct virtio_balloon *vb,
  {
unsigned int i;
  
+	BUILD_BUG_ON(VIRTIO_BALLOON_PAGES_PER_PAGE > VIRTIO_BALLOON_ARRAY_PFNS_MAX);

+
/*
 * Set balloon pfns pointing at this page.
 * Note that the first pfn points at start of the page.



Acked-by: Jason Wang 



Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 04:12:21PM +0100, Christian Borntraeger wrote:
> 
> 
> On 06.02.20 15:22, epere...@redhat.com wrote:
> > Hi Christian.
> > 
> > Could you try this patch on top of ("38ced0208491 vhost: use batched 
> > version by default")?
> > 
> > It will not solve your first random crash but it should help with the
> > loss of network connectivity.
> > 
> > Please let me know how it goes.
> 
> 
> 38ced0208491 + this seem to be ok.
> 
> Not sure if you can make out anything of this (and the previous git bisect 
> log)

Yes it does - it tells me that this is just a bad split-up of patches, and there's
still a real bug that caused worse crashes :)

So I just pushed batch-v4.
I expect that will fail, and bisect to give us
vhost: batching fetches
Can you try that please?




Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 03:22:39PM +0100, epere...@redhat.com wrote:
> Hi Christian.
> 
> Could you try this patch on top of ("38ced0208491 vhost: use batched version 
> by default")?
> 
> It will not solve your first random crash but it should help with the loss
> of network connectivity.
> 
> Please let me know how it goes.
> 
> Thanks!
> 
> >From 99f0f543f3939dbe803988c9153a95616acd Mon Sep 17 00:00:00 2001
> From: =?UTF-8?q?Eugenio=20P=C3=A9rez?= 
> Date: Thu, 6 Feb 2020 15:13:42 +0100
> Subject: [PATCH] vhost: filter valid vhost descriptors flags
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> The previous commit copies the _NEXT flag, and the validation code complains
> if a copied descriptor contains it.
> 
> Signed-off-by: Eugenio Pérez 
> ---
>  drivers/vhost/vhost.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 27ae5b4872a0..56c5253056ee 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -2125,6 +2125,8 @@ static void pop_split_desc(struct vhost_virtqueue *vq)
>   --vq->ndescs;
>  }
>  
> +#define VHOST_DESC_FLAGS (VRING_DESC_F_INDIRECT | VRING_DESC_F_WRITE | \
> +   VRING_DESC_F_NEXT)
>  static int push_split_desc(struct vhost_virtqueue *vq, struct vring_desc 
> *desc, u16 id)
>  {
>   struct vhost_desc *h;
> @@ -2134,7 +2136,7 @@ static int push_split_desc(struct vhost_virtqueue *vq, 
> struct vring_desc *desc,
>   h = &vq->descs[vq->ndescs++];
>   h->addr = vhost64_to_cpu(vq, desc->addr);
>   h->len = vhost32_to_cpu(vq, desc->len);
> - h->flags = vhost16_to_cpu(vq, desc->flags);
> + h->flags = vhost16_to_cpu(vq, desc->flags) & VHOST_DESC_FLAGS;
>   h->id = id;
>  
>   return 0;



> @@ -2343,7 +2345,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>   struct vhost_desc *desc = &vq->descs[i];
>   int access;
>  
> - if (desc->flags & ~(VRING_DESC_F_INDIRECT | 
> VRING_DESC_F_WRITE)) {
> + if (desc->flags & ~VHOST_DESC_FLAGS) {
>   vq_err(vq, "Unexpected flags: 0x%x at descriptor id 
> 0x%x\n",
>  desc->flags, desc->id);
>   ret = -EINVAL;
> -- 
> 2.18.1

Thanks for catching this!

Do we need the 1st chunk though?

It seems preferable to just muck with flags in 1 place, when we
validate them ...

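
For illustration, the alternative would be to drop that first chunk (leave
push_split_desc() storing the flags untouched) and only mask/validate in
vhost_get_vq_desc(); a rough sketch, not a posted patch:

	/* push_split_desc(): keep copying the flags as-is */
	h->flags = vhost16_to_cpu(vq, desc->flags);

	/* vhost_get_vq_desc(): only reject bits outside the known set */
	if (desc->flags & ~VHOST_DESC_FLAGS) {
		vq_err(vq, "Unexpected flags: 0x%x at descriptor id 0x%x\n",
		       desc->flags, desc->id);
		ret = -EINVAL;
	}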
> 
> On Wed, 2020-01-22 at 20:32 +0100, Christian Borntraeger wrote:
> > 
> > On 20.01.20 07:27, Michael S. Tsirkin wrote:
> > > On Tue, Jan 07, 2020 at 01:16:50PM +0100, Christian Borntraeger wrote:
> > > > On 07.01.20 12:55, Michael S. Tsirkin wrote:
> > > > 
> > > > > I pushed batched-v3 - same head but bisect should work now.
> > > > > 
> > > > 
> > > > With 
> > > > commit 38ced0208491103b50f1056f0d1c8f28e2e13d08 (HEAD)
> > > > Author: Michael S. Tsirkin 
> > > > AuthorDate: Wed Dec 11 12:19:26 2019 -0500
> > > > Commit: Michael S. Tsirkin 
> > > > CommitDate: Tue Jan 7 06:52:42 2020 -0500
> > > > 
> > > > vhost: use batched version by default
> > > > 
> > > > 
> > > > I have exactly one successful ping and then the network inside the 
> > > > guest is broken (no packet
> > > > anymore).
> > > 
> > > Does anything appear in host's dmesg when this happens?
> > 
> > I think there was nothing, but I am not sure. I would need to redo the test 
> > if this is important to know.
> > 
> > > 
> > > > So you could consider this commit broken (but in a different way and 
> > > > also without any
> > > > guest reboot necessary).
> > > > 
> > > > 
> > > > bisect log:
> > > > git bisect start
> > > > # bad: [d2f6175f52062ee51ee69754a6925608213475d2] vhost: use vhost_desc 
> > > > instead of vhost_log
> > > > git bisect bad d2f6175f52062ee51ee69754a6925608213475d2
> > > > # good: [d1281e3a562ec6a08f944a876481dd043ba739b9] virtio-blk: remove 
> > > > VIRTIO_BLK_F_SCSI support
> > > > git bisect good d1281e3a562ec6a08f944a876481dd043ba739b9
> > > > # good: [fac7c0f46996e32d996f5c46121df24a6b95ec3b] vhost: option to 
> > > > fetch descriptors through an independent
> > > > struct
> > > > git bisect good fac7c0f46996e32d996f5c46121df24a6b95ec3b
> > > > # bad: [539eb9d738f048cd7be61f404e8f9c7d9d2ff3cc] vhost: batching 
> > > > fetches
> > > > git bisect bad 539eb9d738f048cd7be61f404e8f9c7d9d2ff3cc



Re: [PATCH 4/4] drm/virtio: move virtio_gpu_mem_entry initialization to new function

2020-02-06 Thread Chia-I Wu
On Thu, Feb 6, 2020 at 12:55 AM Gerd Hoffmann  wrote:
>
>   Hi,
>
> > > virtio_gpu_cmd_resource_attach_backing(vgdev, obj->hw_res_handle,
> > > -  ents, nents,
> > > +  obj->ents, obj->nents,
> > >fence);
> > > +   obj->ents = NULL;
> > > +   obj->nents = 0;
> > Hm, if the entries are temporary, can we allocate and initialize them
> > in this function?
>
> Well, the plan for CREATE_RESOURCE_BLOB is to use obj->ents too ...
Is obj->ents needed after CREATE_RESOURCE_BLOB?  If not, having yet
another helper

  ents = virtio_gpu_object_alloc_mem_entries(..., &nents);

seems cleaner.  We would also be able to get rid of virtio_gpu_object_attach.
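
Roughly, the proposed helper could look like the sketch below (illustrative
only - the name and placement follow the suggestion above, and the body simply
mirrors virtio_gpu_object_shmem_init() from patch 4/4):

static struct virtio_gpu_mem_entry *
virtio_gpu_object_alloc_mem_entries(struct virtio_gpu_device *vgdev,
				    struct virtio_gpu_object *bo,
				    unsigned int *nents)
{
	bool use_dma_api = !virtio_has_iommu_quirk(vgdev->vdev);
	struct virtio_gpu_mem_entry *ents;
	struct scatterlist *sg;
	int si;

	if (use_dma_api) {
		/* map for DMA and report the mapped entry count */
		bo->mapped = dma_map_sg(vgdev->vdev->dev.parent,
					bo->pages->sgl, bo->pages->nents,
					DMA_TO_DEVICE);
		*nents = bo->mapped;
	} else {
		*nents = bo->pages->nents;
	}

	ents = kmalloc_array(*nents, sizeof(*ents), GFP_KERNEL);
	if (!ents)
		return NULL;

	/* fill the array the host will walk */
	for_each_sg(bo->pages->sgl, sg, *nents, si) {
		ents[si].addr = cpu_to_le64(use_dma_api
					    ? sg_dma_address(sg)
					    : sg_phys(sg));
		ents[si].length = cpu_to_le32(sg->length);
		ents[si].padding = 0;
	}
	return ents;
}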

>
> cheers,
>   Gerd
>


Re: [PATCH 2/4] drm/virtio: resource teardown tweaks

2020-02-06 Thread Chia-I Wu
On Wed, Feb 5, 2020 at 10:43 PM Gerd Hoffmann  wrote:
>
> > > -
> > > -   drm_gem_shmem_free_object(obj);
> > > +   if (bo->created) {
> > > +   virtio_gpu_cmd_unref_resource(vgdev, bo);
> > > +   /* completion handler calls virtio_gpu_cleanup_object() */
> > nitpick: we don't need this comment when virtio_gpu_cmd_unref_cb is
> > defined by this file and passed to virtio_gpu_cmd_unref_resource.
>
> I want virtio_gpu_cmd_unref_cb + virtio_gpu_cmd_unref_resource being
> placed next to each other so it is easier to see how they work hand in
> hand.
>
> > I happen to be looking at our error handling paths.  I think we want
> > virtio_gpu_queue_fenced_ctrl_buffer to call vbuf->resp_cb on errors.
>
> /me was thinking about that too.  Yes, we will need either that,
> or a separate vbuf->error_cb callback.  That'll be another patch
> though.
Or the new virtio_gpu_queue_ctrl_sgs can return errors rather than
eating errors.

Yeah, that should be another patch.
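
For illustration, a minimal sketch of the first option (hypothetical, not
something posted here): on the "cannot queue" path, invoke the response
callback so the resource still gets cleaned up, then free the vbuffer:

	/* somewhere on the error path of virtio_gpu_queue_ctrl_sgs() */
	if (vbuf->resp_cb)
		vbuf->resp_cb(vgdev, vbuf);
	free_vbuf(vgdev, vbuf);
	return;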
>
> > > +   /*
> > > +* We are in the release callback and do NOT want refcount
> > > +* bo, so do NOT use virtio_gpu_array_add_obj().
> > > +*/
> > > +   vbuf->objs = virtio_gpu_array_alloc(1);
> > > +   vbuf->objs->objs[0] = &bo->base.base;
> > This is an abuse of obj array.  Add "void *private_data;" to
> > virtio_gpu_vbuffer and use that maybe?
>
> I'd name that *cb_data, but yes, that makes sense.
Sounds great.
>
> cheers,
>   Gerd
>


Re: [PATCH] drm/virtio: fix ring free check

2020-02-06 Thread Chia-I Wu
On Thu, Feb 6, 2020 at 3:14 AM Gerd Hoffmann  wrote:
>
> If the virtio device supports indirect ring descriptors we need only one
> ring entry for the whole command.  Take that into account when checking
> whether the virtqueue has enough free entries for our command.
>
> Signed-off-by: Gerd Hoffmann 
> ---
>  drivers/gpu/drm/virtio/virtgpu_vq.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
> b/drivers/gpu/drm/virtio/virtgpu_vq.c
> index 41e475fbd67b..a2ec09dba530 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_vq.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
> @@ -328,7 +328,8 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
> virtio_gpu_device *vgdev,
>  {
> struct virtqueue *vq = vgdev->ctrlq.vq;
> bool notify = false;
> -   int ret;
> +   bool indirect;
> +   int vqcnt, ret;
>
>  again:
> spin_lock(&vgdev->ctrlq.qlock);
> @@ -341,9 +342,11 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
> virtio_gpu_device *vgdev,
> return;
> }
>
> -   if (vq->num_free < elemcnt) {
> +   indirect = virtio_has_feature(vgdev->vdev, 
> VIRTIO_RING_F_INDIRECT_DESC);
> +   vqcnt = indirect ? 1 : elemcnt;
Is the feature dynamic and require the lock held?  If not, the result
can be cached and the fixup can happen before grabbing the lock

  if (vgdev->has_indirect_desc)
elemcnt = 1;

Either way, patch is

  Reviewed-by: Chia-I Wu 


> +   if (vq->num_free < vqcnt) {
> spin_unlock(&vgdev->ctrlq.qlock);
> -   wait_event(vgdev->ctrlq.ack_queue, vq->num_free >= elemcnt);
> +   wait_event(vgdev->ctrlq.ack_queue, vq->num_free >= vqcnt);
> goto again;
> }
>
> --
> 2.18.1
>


Re: vhost changes (batched) in linux-next after 12/13 trigger random crashes in KVM guests after reboot

2020-02-06 Thread Christian Borntraeger



On 06.02.20 15:22, epere...@redhat.com wrote:
> Hi Christian.
> 
> Could you try this patch on top of ("38ced0208491 vhost: use batched version 
> by default")?
> 
> It will not solve your first random crash but it should help with the loss
> of network connectivity.
> 
> Please let me know how it goes.


38ced0208491 + this seem to be ok.

Not sure if you can make out anything of this (and the previous git bisect log)



Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 09:43:10AM +, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
> > 
> > How about just making this a last resort thing to be compatible with 
> > existing
> > hypervisors? if someone wants to change behaviour that really should use a
> > feature bit ...
> 
> Yeah, sounds good to me to control via feature bits.
> 
> Best,
> Wei

To clarify, shrinker use could be a feature bit. OOM behaviour was
there for years and has been used to dynamically size guests.

-- 
MST



[PATCH] drm/virtio: fix ring free check

2020-02-06 Thread Gerd Hoffmann
If the virtio device supports indirect ring descriptors we need only one
ring entry for the whole command.  Take that into account when checking
whether the virtqueue has enough free entries for our command.

Signed-off-by: Gerd Hoffmann 
---
 drivers/gpu/drm/virtio/virtgpu_vq.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c 
b/drivers/gpu/drm/virtio/virtgpu_vq.c
index 41e475fbd67b..a2ec09dba530 100644
--- a/drivers/gpu/drm/virtio/virtgpu_vq.c
+++ b/drivers/gpu/drm/virtio/virtgpu_vq.c
@@ -328,7 +328,8 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
virtio_gpu_device *vgdev,
 {
struct virtqueue *vq = vgdev->ctrlq.vq;
bool notify = false;
-   int ret;
+   bool indirect;
+   int vqcnt, ret;
 
 again:
spin_lock(&vgdev->ctrlq.qlock);
@@ -341,9 +342,11 @@ static void virtio_gpu_queue_ctrl_sgs(struct 
virtio_gpu_device *vgdev,
return;
}
 
-   if (vq->num_free < elemcnt) {
+   indirect = virtio_has_feature(vgdev->vdev, VIRTIO_RING_F_INDIRECT_DESC);
+   vqcnt = indirect ? 1 : elemcnt;
+   if (vq->num_free < vqcnt) {
spin_unlock(&vgdev->ctrlq.qlock);
-   wait_event(vgdev->ctrlq.ack_queue, vq->num_free >= elemcnt);
+   wait_event(vgdev->ctrlq.ack_queue, vq->num_free >= vqcnt);
goto again;
}
 
-- 
2.18.1



Re: Balloon pressuring page cache

2020-02-06 Thread David Hildenbrand
>>> It's not about dangers as such. It's just that when linux hits OOM
>>> all kind of error paths are being hit, latent bugs start triggering,
>>> latency goes up drastically.
>> Doesn't this suggest that the shrinker is preferable to the OOM
>> notifier in the case that we're actually OOMing (with DEFLATE_ON_OOM)?
> 
> I think it all depends on the use case. For the use case you
> describe going to the shrinker might be preferable as you are
> wanting to exert just a light bit of pressure to start some page
> cache reclaim. However if you are wanting to make the deflation a
> last resort sort of thing then I would think the OOM would make more
> sense.

Long story short: What you actually want is free page reporting combined
with

a) avoiding the guest page cache (emulated nvdimms, virtio-pmem). Not
always possible and has some issues in non-trusted environments (IOW,
cloud).

b) a way to tame the page cache (e.g., drop it completely similar to
drop_caches, or a way to drop a specific fraction, things not touched for
the last $SECONDS) etc.

There are some nice discussions in response to Alexander's v16.1 posting.

-- 
Thanks,

David / dhildenb



Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread David Hildenbrand
On 06.02.20 10:44, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
>>
>> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
>> am not wrong.
>>
>> Especially, a echo 3 > /proc/sys/vm/drop_caches
>>
>> will first drop the page cache and then drop_slab()
> 
> Then that's the problem of "echo 3 > /proc/sys/vm/drop_caches" itself. It
> invokes other shrinkers as well; if that is considered an issue, it needs to
> be tweaked in the mm.

In short, I don't like this approach as long as a drop_slab() can
deflate the whole balloon and don't think this is the right approach then.


-- 
Thanks,

David / dhildenb



RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Wang, Wei W
On Thursday, February 6, 2020 5:32 PM, David Hildenbrand wrote:
> 
> If the page cache is empty, a drop_slab() will deflate the whole balloon if I
> am not wrong.
> 
> Especially, a echo 3 > /proc/sys/vm/drop_caches
> 
> will first drop the page cache and then drop_slab()

Then that's the problem of "echo 3 > /proc/sys/vm/drop_caches" itself. It
invokes other shrinkers as well; if that is considered an issue, it needs to
be tweaked in the mm.

Best,
Wei


RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Wang, Wei W
On Thursday, February 6, 2020 5:31 PM, Michael S. Tsirkin wrote:
> 
> How about just making this a last resort thing to be compatible with existing
> hypervisors? if someone wants to change behaviour that really should use a
> feature bit ...

Yeah, sounds good to me to control via feature bits.

Best,
Wei


Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread David Hildenbrand
On 06.02.20 10:28, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
>> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
>> deflate the balloon when conservative_shrinker=true?
>>
> 
> Should be. Need Tyler's help to test it.
> 

If the page cache is empty, a drop_slab() will deflate the whole balloon
if I am not wrong.

Especially, a echo 3 > /proc/sys/vm/drop_caches

will first drop the page cache and then drop_slab()

While I like the general idea, it looks more like a hack to me, to try
to teach the shrinker something it was not built for/does not support yet.

-- 
Thanks,

David / dhildenb



Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 09:27:04AM +, Wang, Wei W wrote:
> On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> > virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > >   struct virtio_balloon, shrinker);
> > >   unsigned long count;
> > >
> > > - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > > + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> > 
> > I'd rather have an API for that in mm/. In particular, do we want other
> > shrinkers to run, not just pagecache? To pick an example I'm familiar
> > with, kvm mmu cache for nested virt?
> 
> We could make it extendable:
> 
> #define BALLOON_SHRINKER_AFTER_PAGE_CACHE (1 << 0)
> #define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE  (1 << 1)
> ...
> 
> uint64_t conservative_shrinker;
> if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && 
> global_node_page_state(NR_FILE_PAGES))
>   return 0;
> 
> For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
> 
> Best,
> Wei

How about just making this a last resort thing to be compatible with
existing hypervisors? if someone wants to change behaviour
that really should use a feature bit ...

-- 
MST



RE: Balloon pressuring page cache

2020-02-06 Thread Wang, Wei W
Isn't the hint only useful during the first round?
After the first round if a page becomes free then we need to update the copy at 
the migration destination, so freeing a page that previously had contents 
should mark it dirty.


Nope. I think as long as it is a free page (no matter 1st or 2nd round), we 
don’t need to send it.

Best,
Wei

RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Wang, Wei W
On Thursday, February 6, 2020 5:10 PM, David Hildenbrand wrote:
> so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
> deflate the balloon when conservative_shrinker=true?
> 

Should be. Need Tyler's help to test it.

Best,
Wei



RE: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Wang, Wei W
On Thursday, February 6, 2020 5:04 PM, Michael S. Tsirkin wrote:
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
> > struct virtio_balloon, shrinker);
> > unsigned long count;
> >
> > -   count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> > +   if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> 
> I'd rather have an API for that in mm/. In particular, do we want other
> shrinkers to run, not just pagecache? To pick an example I'm familiar
> with, kvm mmu cache for nested virt?

We could make it extendable:

#define BALLOON_SHRINKER_AFTER_PAGE_CACHE   (1 << 0)
#define BALLOON_SHRINKER_AFTER_KVM_MMU_CACHE(1 << 1)
...

uint64_t conservative_shrinker;
if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) && 
global_node_page_state(NR_FILE_PAGES))
return 0;

For now, we probably only need BALLOON_SHRINKER_AFTER_PAGE_CACHE.
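
A fuller sketch of how the count callback could look with such a flag
(illustrative only; conservative_shrinker taken as a module parameter as in
the RFC, tested bitwise against the flag):

static unsigned long
virtio_balloon_shrinker_count(struct shrinker *shrinker,
			      struct shrink_control *sc)
{
	struct virtio_balloon *vb = container_of(shrinker,
					struct virtio_balloon, shrinker);

	/* defer to page cache reclaim first, if so configured */
	if ((conservative_shrinker & BALLOON_SHRINKER_AFTER_PAGE_CACHE) &&
	    global_node_page_state(NR_FILE_PAGES))
		return 0;

	return vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
}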

Best,
Wei


Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread David Hildenbrand
On 06.02.20 10:12, Michael S. Tsirkin wrote:
> On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
>> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
>> changed the behavior when deflation happens automatically. Instead of
>> deflating when called by the OOM handler, the shrinker is used.
>>
>> However, the balloon is not simply some slab cache that should be
>> shrunk when under memory pressure. The shrinker does not have a concept of
>> priorities, so this behavior cannot be configured.
>>
>> There was a report that this results in undesired side effects when
>> inflating the balloon to shrink the page cache. [1]
>>  "When inflating the balloon against page cache (i.e. no free memory
>>   remains) vmscan.c will both shrink page cache, but also invoke the
>>   shrinkers -- including the balloon's shrinker. So the balloon
>>   driver allocates memory which requires reclaim, vmscan gets this
>>   memory by shrinking the balloon, and then the driver adds the
>>   memory back to the balloon. Basically a busy no-op."
>>
>> The name "deflate on OOM" makes it pretty clear when deflation should
>> happen - after other approaches to reclaim memory failed, not while
>> reclaiming. This allows to minimize the footprint of a guest - memory
>> will only be taken out of the balloon when really needed.
>>
>> Especially, a drop_slab() will result in the whole balloon getting
>> deflated - undesired. While handling it via the OOM handler might not be
>> perfect, it keeps existing behavior. If we want a different behavior, then
>> we need a new feature bit and document it properly (although, there should
>> be a clear use case and the intended effects should be well described).
>>
>> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
>> this has no such side effects. Always register the shrinker with
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
>> pages that are still to be processed by the guest. The hypervisor takes
>> care of identifying and resolving possible races between processing a
>> hinting request and the guest reusing a page.
>>
>> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
>> notifier with shrinker"), don't add a module parameter to configure the
>> number of pages to deflate on OOM. Can be re-added if really needed.
>> Also, pay attention that leak_balloon() returns the number of 4k pages -
>> convert it properly in virtio_balloon_oom_notify().
>>
>> Note1: using the OOM handler is frowned upon, but it really is what we
>>need for this feature.
>>
>> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>>could actually skip sending deflation requests to our hypervisor,
>>making the OOM path *very* simple. Basically freeing pages and
>>updating the balloon. If the communication with the host ever
>>becomes a problem on this call path.
>>
>> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
>>
>> Reported-by: Tyler Sanderson 
>> Cc: Michael S. Tsirkin 
>> Cc: Wei Wang 
>> Cc: Alexander Duyck 
>> Cc: David Rientjes 
>> Cc: Nadav Amit 
>> Cc: Michal Hocko 
>> Signed-off-by: David Hildenbrand 
> 
> 
> I guess we should add a Fixes tag to the patch it's reverting,
> this way it's backported and hypervisors will be able to rely on OOM
> behaviour.

Makes sense,

Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")

-- 
Thanks,

David / dhildenb



Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread Michael S. Tsirkin
On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
> 
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
> 
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>   "When inflating the balloon against page cache (i.e. no free memory
>remains) vmscan.c will both shrink page cache, but also invoke the
>shrinkers -- including the balloon's shrinker. So the balloon
>driver allocates memory which requires reclaim, vmscan gets this
>memory by shrinking the balloon, and then the driver adds the
>memory back to the balloon. Basically a busy no-op."
> 
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
> 
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
> 
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
> 
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
> 
> Note1: using the OOM handler is frowned upon, but it really is what we
>need for this feature.
> 
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>could actually skip sending deflation requests to our hypervisor,
>making the OOM path *very* simple. Basically freeing pages and
>updating the balloon. If the communication with the host ever
>becomes a problem on this call path.
> 
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
> 
> Reported-by: Tyler Sanderson 
> Cc: Michael S. Tsirkin 
> Cc: Wei Wang 
> Cc: Alexander Duyck 
> Cc: David Rientjes 
> Cc: Nadav Amit 
> Cc: Michal Hocko 
> Signed-off-by: David Hildenbrand 


I guess we should add a Fixes tag to the patch it's reverting,
this way it's backported and hypervisors will be able to rely on OOM
behaviour.

> ---
>  drivers/virtio/virtio_balloon.c | 107 +---
>  1 file changed, 44 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> 
> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>  
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>__GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>   /* Memory statistics */
>   struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>  
> - /* To register a shrinker to shrink memory upon memory pressure */
> + /* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>   struct shrinker shrinker;
> +
> + /* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> + struct notifier_block oom_nb;
>  };
>  
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct 
> virtio_balloon *vb,
>   return blocks_freed * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  }
>  
> -static unsigned long 

Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread Michael S. Tsirkin
On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
> 
> However, the balloon is not simply some slab cache that should be
> shrunk when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
> 
> There was a report that this results in undesired side effects when
> inflating the balloon to shrink the page cache. [1]
>   "When inflating the balloon against page cache (i.e. no free memory
>remains) vmscan.c will both shrink page cache, but also invoke the
>shrinkers -- including the balloon's shrinker. So the balloon
>driver allocates memory which requires reclaim, vmscan gets this
>memory by shrinking the balloon, and then the driver adds the
>memory back to the balloon. Basically a busy no-op."
> 
> The name "deflate on OOM" makes it pretty clear when deflation should
> happen - after other approaches to reclaim memory failed, not while
> reclaiming. This allows to minimize the footprint of a guest - memory
> will only be taken out of the balloon when really needed.
> 
> Especially, a drop_slab() will result in the whole balloon getting
> deflated - undesired. While handling it via the OOM handler might not be
> perfect, it keeps existing behavior. If we want a different behavior, then
> we need a new feature bit and document it properly (although, there should
> be a clear use case and the intended effects should be well described).
> 
> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> this has no such side effects. Always register the shrinker with
> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> pages that are still to be processed by the guest. The hypervisor takes
> care of identifying and resolving possible races between processing a
> hinting request and the guest reusing a page.
> 
> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> notifier with shrinker"), don't add a module parameter to configure the
> number of pages to deflate on OOM. Can be re-added if really needed.
> Also, pay attention that leak_balloon() returns the number of 4k pages -
> convert it properly in virtio_balloon_oom_notify().
> 
> Note1: using the OOM handler is frowned upon, but it really is what we
>need for this feature.
> 
> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>could actually skip sending deflation requests to our hypervisor,
>making the OOM path *very* simple. Basically freeing pages and
>updating the balloon. If the communication with the host ever
>becomes a problem on this call path.
> 
> [1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
> 
> Reported-by: Tyler Sanderson 
> Cc: Michael S. Tsirkin 
> Cc: Wei Wang 
> Cc: Alexander Duyck 
> Cc: David Rientjes 
> Cc: Nadav Amit 
> Cc: Michal Hocko 
> Signed-off-by: David Hildenbrand 

So the revert looks ok, from that POV and with commit log changes

Acked-by: Michael S. Tsirkin 

however, let's see what others think, and whether Wei can come
up with a fixup for the shrinker.


> ---
>  drivers/virtio/virtio_balloon.c | 107 +---
>  1 file changed, 44 insertions(+), 63 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 7e5d84caeb94..e7b18f556c5e 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -27,7 +28,9 @@
>   */
>  #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> 
> VIRTIO_BALLOON_PFN_SHIFT)
>  #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
> -#define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> +/* Maximum number of (4k) pages to deflate on OOM notifications. */
> +#define VIRTIO_BALLOON_OOM_NR_PAGES 256
> +#define VIRTIO_BALLOON_OOM_NOTIFY_PRIORITY 80
>  
>  #define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
>__GFP_NOMEMALLOC)
> @@ -112,8 +115,11 @@ struct virtio_balloon {
>   /* Memory statistics */
>   struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR];
>  
> - /* To register a shrinker to shrink memory upon memory pressure */
> + /* Shrinker to return free pages - VIRTIO_BALLOON_F_FREE_PAGE_HINT */
>   struct shrinker shrinker;
> +
> + /* OOM notifier to deflate on OOM - VIRTIO_BALLOON_F_DEFLATE_ON_OOM */
> + struct notifier_block oom_nb;
>  };
>  
>  static struct virtio_device_id id_table[] = {
> @@ -786,50 +792,13 @@ static unsigned long shrink_free_pages(struct 
> virtio_balloon *vb,
>   return blocks_freed * 

Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread David Hildenbrand
On 06.02.20 09:01, Wei Wang wrote:
> There are cases where users want to shrink balloon pages after the
> pagecache is depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
> 
> Signed-off-by: Wei Wang 
> ---
>  drivers/virtio/virtio_balloon.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  static struct vfsmount *balloon_mnt;
>  #endif
>  
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon 
> pages");
> +
>  enum virtio_balloon_vq {
>   VIRTIO_BALLOON_VQ_INFLATE,
>   VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct 
> virtio_balloon *vb,
>  {
>   unsigned long pages_freed = 0;
>  
> + /* Balloon pages only get shrunk when the pagecache is depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
>   /*
>* One invocation of leak_balloon can deflate at most
>* VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long 
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
>   struct virtio_balloon, shrinker);
>   unsigned long count;
>  
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
>   count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  
>   return count;
> 

so dropping caches (echo 3 > /proc/sys/vm/drop_caches) will no longer
deflate the balloon when conservative_shrinker=true?
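
(For reference, that scenario can be checked by hand: drop the caches in the
guest and watch the reported balloon size from the host, e.g. via the QEMU HMP
monitor. The "(qemu)" prompt below assumes such a monitor is attached; exact
syntax depends on the setup.)

# guest: drop pagecache and slab, which invokes drop_slab() -> shrinkers
sync
echo 3 > /proc/sys/vm/drop_caches

# host (QEMU HMP monitor): compare the balloon size before and after
(qemu) info balloon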

-- 
Thanks,

David / dhildenb



Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 10:05:43AM +0100, David Hildenbrand wrote:
> >> commit bf50e69f63d21091e525185c3ae761412be0ba72
> >> Author: Dave Hansen 
> >> Date:   Thu Apr 7 10:43:25 2011 -0700
> >>
> >> virtio balloon: kill tell-host-first logic
> >>
> >> The virtio balloon driver has a VIRTIO_BALLOON_F_MUST_TELL_HOST
> >> feature bit.  Whenever the bit is set, the guest kernel must
> >> always tell the host before we free pages back to the allocator.
> >> Without this feature, we might free a page (and have another
> >> user touch it) while the hypervisor is unprepared for it.
> >>
> >> But, if the bit is _not_ set, we are under no obligation to
> >> reverse the order; we're under no obligation to do _anything_.
> >> As of now, qemu-kvm defines the bit, but doesn't set it.
> > 
> > Well this is not what the spec says in the end.
> 
> I didn't check the spec, maybe I should do that :)
> 
> > To continue that commit message:
> > 
> > This patch makes the "tell host first" logic the only case.  This
> > should make everybody happy, and reduce the amount of untested or
> > untestable code in the kernel.
> 
> Yeah, but this comment explains that the current deflate is only in
> place, because it makes the code simpler (to support both cases). Of
> course, doing the deflate might result in performance improvements.
> (e.g., MADV_WILLNEED)
> 
> > 
> > you can try proposing the change to the virtio TC, see what others
> > think.
> 
> We can just drop the comment from this patch for now. The tell_host should
> not be an issue AFAICS.

I guess it's a good idea.


> -- 
> Thanks,
> 
> David / dhildenb



Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread David Hildenbrand
>> commit bf50e69f63d21091e525185c3ae761412be0ba72
>> Author: Dave Hansen 
>> Date:   Thu Apr 7 10:43:25 2011 -0700
>>
>> virtio balloon: kill tell-host-first logic
>>
>> The virtio balloon driver has a VIRTIO_BALLOON_F_MUST_TELL_HOST
>> feature bit.  Whenever the bit is set, the guest kernel must
>> always tell the host before we free pages back to the allocator.
>> Without this feature, we might free a page (and have another
>> user touch it) while the hypervisor is unprepared for it.
>>
>> But, if the bit is _not_ set, we are under no obligation to
>> reverse the order; we're under no obligation to do _anything_.
>> As of now, qemu-kvm defines the bit, but doesn't set it.
> 
> Well this is not what the spec says in the end.

I didn't check the spec, maybe I should do that :)

> To continue that commit message:
> 
> This patch makes the "tell host first" logic the only case.  This
> should make everybody happy, and reduce the amount of untested or
> untestable code in the kernel.

Yeah, but this comment explains that the current deflate is only in
place, because it makes the code simpler (to support both cases). Of
course, doing the deflate might result in performance improvements.
(e.g., MADV_WILLNEED)
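
For illustration, the host-side idea boils down to the usual madvise(2) pair;
a self-contained user-space sketch of just the syscalls involved (this is not
QEMU code, and not a claim about what QEMU currently does on deflate):

/* "Inflate" drops the backing of a page (MADV_DONTNEED); a deflate
 * notification would let the host prefault it again (MADV_WILLNEED)
 * before the guest touches it. Illustration only. */
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 4096;
	char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return 1;

	memset(page, 0xaa, len);		/* guest is using the page */
	madvise(page, len, MADV_DONTNEED);	/* inflate: host may drop it */
	madvise(page, len, MADV_WILLNEED);	/* deflate: bring it back early */
	memset(page, 0xbb, len);		/* guest reuses the page */

	munmap(page, len);
	return 0;
}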

> 
> you can try proposing the change to the virtio TC, see what others
> think.

We can just drop the comment from this patch for now. The tell_host should
not be an issue AFAICS.

-- 
Thanks,

David / dhildenb



Re: [PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 04:01:47PM +0800, Wei Wang wrote:
> There are cases where users want to shrink balloon pages after the
> pagecache is depleted. The conservative_shrinker lets the shrinker
> shrink balloon pages when all the pagecache has been reclaimed.
> 
> Signed-off-by: Wei Wang 

I'd rather avoid module parameters, but otherwise looks 
like a reasonable idea.
Tyler, what do you think?


> ---
>  drivers/virtio/virtio_balloon.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 93f995f6cf36..b4c5bb13a867 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  static struct vfsmount *balloon_mnt;
>  #endif
>  
> +static bool conservative_shrinker = true;
> +module_param(conservative_shrinker, bool, 0644);
> +MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon 
> pages");
> +
>  enum virtio_balloon_vq {
>   VIRTIO_BALLOON_VQ_INFLATE,
>   VIRTIO_BALLOON_VQ_DEFLATE,
> @@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct 
> virtio_balloon *vb,
>  {
>   unsigned long pages_freed = 0;
>  
> + /* Balloon pages only get shrunk when the pagecache is depleted */
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
> + return 0;
> +
>   /*
>* One invocation of leak_balloon can deflate at most
>* VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
> @@ -837,7 +845,11 @@ static unsigned long 
> virtio_balloon_shrinker_count(struct shrinker *shrinker,
>   struct virtio_balloon, shrinker);
>   unsigned long count;
>  
> - count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> + if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))

I'd rather have an API for that in mm/. In particular, do we want other
shrinkers to run, not just pagecache? To pick an example I'm familiar
with, kvm mmu cache for nested virt?

> + count = 0;
> + else
> + count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
> +
>   count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
>  
>   return count;
> -- 
> 2.17.1



Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread Michael S. Tsirkin
On Thu, Feb 06, 2020 at 09:42:34AM +0100, David Hildenbrand wrote:
> On 06.02.20 08:40, Michael S. Tsirkin wrote:
> > On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
> >> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> >> changed the behavior when deflation happens automatically. Instead of
> >> deflating when called by the OOM handler, the shrinker is used.
> >>
> >> However, the balloon is not simply some slab cache that should be
> >> shrunk when under memory pressure. The shrinker does not have a concept of
> >> priorities, so this behavior cannot be configured.
> >>
> >> There was a report that this results in undesired side effects when
> >> inflating the balloon to shrink the page cache. [1]
> >>"When inflating the balloon against page cache (i.e. no free memory
> >> remains) vmscan.c will both shrink page cache, but also invoke the
> >> shrinkers -- including the balloon's shrinker. So the balloon
> >> driver allocates memory which requires reclaim, vmscan gets this
> >> memory by shrinking the balloon, and then the driver adds the
> >> memory back to the balloon. Basically a busy no-op."
> >>
> >> The name "deflate on OOM" makes it pretty clear when deflation should
> >> happen - after other approaches to reclaim memory failed, not while
> >> reclaiming. This allows to minimize the footprint of a guest - memory
> >> will only be taken out of the balloon when really needed.
> >>
> >> Especially, a drop_slab() will result in the whole balloon getting
> >> deflated - undesired. While handling it via the OOM handler might not be
> >> perfect, it keeps existing behavior. If we want a different behavior, then
> >> we need a new feature bit and document it properly (although, there should
> >> be a clear use case and the intended effects should be well described).
> >>
> >> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> >> this has no such side effects. Always register the shrinker with
> >> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> >> pages that are still to be processed by the guest. The hypervisor takes
> >> care of identifying and resolving possible races between processing a
> >> hinting request and the guest reusing a page.
> >>
> >> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> >> notifier with shrinker"), don't add a moodule parameter to configure the
> >> number of pages to deflate on OOM. Can be re-added if really needed.
> > 
> > I agree. And to make this case even stronger:
> > 
> > The oom_pages module parameter was known to be broken: whatever its
> > value, we return at most VIRTIO_BALLOON_ARRAY_PFNS_MAX.  So module
> > parameter values > 256 never worked, and it seems highly unlikely that
> > freeing 1Mbyte on OOM is too aggressive.
> > There was a patch
> >  virtio-balloon: deflate up to oom_pages on OOM
> > by Wei Wang to try to fix it:
> > https://lore.kernel.org/r/1508500466-21165-3-git-send-email-wei.w.w...@intel.com
> > but this was dropped.
> 
> Makes sense. 1MB is usually good enough.
> 
> > 
> >> Also, pay attention that leak_balloon() returns the number of 4k pages -
> >> convert it properly in virtio_balloon_oom_notify().
> > 
> > Oh. So it was returning a wrong value originally (before 71994620bb25).
> > However what really matters for notifiers is whether the value is 0 -
> > whether we made progress. So it's cosmetic.
> 
> Yes, that's also my understanding.
> 
> > 
> >> Note1: using the OOM handler is frowned upon, but it really is what we
> >>need for this feature.
> > 
> > Quite. However, I went back researching why we dropped the OOM notifier,
> > and found this:
> > 
> > https://lore.kernel.org/r/1508500466-21165-2-git-send-email-wei.w.w...@intel.com
> > 
> > To quote from there:
> > 
> > The balloon_lock was used to synchronize the access demand to elements
> > of struct virtio_balloon and its queue operations (please see commit
> > e22504296d). This prevents the concurrent run of the leak_balloon and
> > fill_balloon functions, thereby resulting in a deadlock issue on OOM:
> > 
> > fill_balloon: take balloon_lock and wait for OOM to get some memory;
> > oom_notify: release some inflated memory via leak_balloon();
> > leak_balloon: wait for balloon_lock to be released by fill_balloon.
> 
> fill_balloon does the allocation *before* taking the lock. tell_host()
> should not allocate memory AFAIR. So how could this ever happen?
> 
> Anyhow, we could simply work around this by doing a trylock in
> fill_balloon() and retrying in the caller. That should be easy. But I
> want to understand first, how something like that would even be possible.

Hmm it looks like you are right.  Sorry!


> >> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
> >>could actually skip sending deflation requests to our hypervisor,
> >>making the OOM path *very* simple. Basically freeing pages and
> 

RE: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread Wang, Wei W
On Thursday, February 6, 2020 12:34 AM, David Hildenbrand wrote:
> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> changed the behavior when deflation happens automatically. Instead of
> deflating when called by the OOM handler, the shrinker is used.
> 
> However, the balloon is not simply some slab cache that should be shrunk
> when under memory pressure. The shrinker does not have a concept of
> priorities, so this behavior cannot be configured.
> 
> There was a report that this results in undesired side effects when inflating
> the balloon to shrink the page cache. [1]
>   "When inflating the balloon against page cache (i.e. no free memory
>remains) vmscan.c will both shrink page cache, but also invoke the
>shrinkers -- including the balloon's shrinker. So the balloon
>driver allocates memory which requires reclaim, vmscan gets this
>memory by shrinking the balloon, and then the driver adds the
>memory back to the balloon. Basically a busy no-op."

Not sure if we need to go back to OOM, which has many drawbacks as we discussed.
Just posted another approach, which is simple.

Best,
Wei


Re: [PATCH 4/4] drm/virtio: move virtio_gpu_mem_entry initialization to new function

2020-02-06 Thread Gerd Hoffmann
  Hi,

> > virtio_gpu_cmd_resource_attach_backing(vgdev, obj->hw_res_handle,
> > -  ents, nents,
> > +  obj->ents, obj->nents,
> >fence);
> > +   obj->ents = NULL;
> > +   obj->nents = 0;
> Hm, if the entries are temporary, can we allocate and initialize them
> in this function?

Well, the plan for CREATE_RESOURCE_BLOB is to use obj->ents too ...

cheers,
  Gerd



[PATCH RFC] virtio_balloon: conservative balloon page shrinking

2020-02-06 Thread Wei Wang
There are cases where users want to shrink balloon pages after the
pagecache is depleted. The conservative_shrinker lets the shrinker
shrink balloon pages when all the pagecache has been reclaimed.

Signed-off-by: Wei Wang 
---
 drivers/virtio/virtio_balloon.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 93f995f6cf36..b4c5bb13a867 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 static struct vfsmount *balloon_mnt;
 #endif
 
+static bool conservative_shrinker = true;
+module_param(conservative_shrinker, bool, 0644);
+MODULE_PARM_DESC(conservative_shrinker, "conservatively shrink balloon pages");
+
 enum virtio_balloon_vq {
VIRTIO_BALLOON_VQ_INFLATE,
VIRTIO_BALLOON_VQ_DEFLATE,
@@ -796,6 +800,10 @@ static unsigned long shrink_balloon_pages(struct virtio_balloon *vb,
 {
unsigned long pages_freed = 0;
 
+   /* Balloon pages only get shrunk when the pagecache is depleted */
+   if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+   return 0;
+
/*
 * One invocation of leak_balloon can deflate at most
 * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it
@@ -837,7 +845,11 @@ static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker,
struct virtio_balloon, shrinker);
unsigned long count;
 
-   count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+   if (conservative_shrinker && global_node_page_state(NR_FILE_PAGES))
+   count = 0;
+   else
+   count = vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE;
+
count += vb->num_free_page_blocks * VIRTIO_BALLOON_HINT_BLOCK_PAGES;
 
return count;
-- 
2.17.1
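
For completeness: since the knob above is declared with module_param(..., 0644),
it should be flippable at runtime through sysfs. The path below is assumed from
the standard module_param layout (it is not spelled out in the patch itself):

# disable the conservative behaviour on a running guest
echo 0 > /sys/module/virtio_balloon/parameters/conservative_shrinker
cat /sys/module/virtio_balloon/parameters/conservative_shrinker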



Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

2020-02-06 Thread David Hildenbrand
On 06.02.20 08:40, Michael S. Tsirkin wrote:
> On Wed, Feb 05, 2020 at 05:34:02PM +0100, David Hildenbrand wrote:
>> Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
>> changed the behavior when deflation happens automatically. Instead of
>> deflating when called by the OOM handler, the shrinker is used.
>>
>> However, the balloon is not simply some slab cache that should be
>> shrunk when under memory pressure. The shrinker does not have a concept of
>> priorities, so this behavior cannot be configured.
>>
>> There was a report that this results in undesired side effects when
>> inflating the balloon to shrink the page cache. [1]
>>  "When inflating the balloon against page cache (i.e. no free memory
>>   remains) vmscan.c will both shrink page cache, but also invoke the
>>   shrinkers -- including the balloon's shrinker. So the balloon
>>   driver allocates memory which requires reclaim, vmscan gets this
>>   memory by shrinking the balloon, and then the driver adds the
>>   memory back to the balloon. Basically a busy no-op."
>>
>> The name "deflate on OOM" makes it pretty clear when deflation should
>> happen - after other approaches to reclaim memory failed, not while
>> reclaiming. This allows to minimize the footprint of a guest - memory
>> will only be taken out of the balloon when really needed.
>>
>> Especially, a drop_slab() will result in the whole balloon getting
>> deflated - undesired. While handling it via the OOM handler might not be
>> perfect, it keeps existing behavior. If we want a different behavior, then
>> we need a new feature bit and document it properly (although, there should
>> be a clear use case and the intended effects should be well described).
>>
>> Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
>> this has no such side effects. Always register the shrinker with
>> VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
>> pages that are still to be processed by the guest. The hypervisor takes
>> care of identifying and resolving possible races between processing a
>> hinting request and the guest reusing a page.
>>
>> In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
>> notifier with shrinker"), don't add a moodule parameter to configure the
>> number of pages to deflate on OOM. Can be re-added if really needed.
> 
> I agree. And to make this case even stronger:
> 
> The oom_pages module parameter was known to be broken: whatever its
> value, we return at most VIRTIO_BALLOON_ARRAY_PFNS_MAX.  So module
> parameter values > 256 never worked, and it seems highly unlikely that
> freeing 1Mbyte on OOM is too aggressive.
> There was a patch
>  virtio-balloon: deflate up to oom_pages on OOM
> by Wei Wang to try to fix it:
> https://lore.kernel.org/r/1508500466-21165-3-git-send-email-wei.w.w...@intel.com
> but this was dropped.

Makes sense. 1MB is usually good enough.

> 
>> Also, pay attention that leak_balloon() returns the number of 4k pages -
>> convert it properly in virtio_balloon_oom_notify().
> 
> Oh. So it was returning a wrong value originally (before 71994620bb25).
> However what really matters for notifiers is whether the value is 0 -
> whether we made progress. So it's cosmetic.

Yes, that's also my understanding.

> 
>> Note1: using the OOM handler is frowned upon, but it really is what we
>>need for this feature.
> 
> Quite. However, I went back researching why we dropped the OOM notifier,
> and found this:
> 
> https://lore.kernel.org/r/1508500466-21165-2-git-send-email-wei.w.w...@intel.com
> 
> To quote from there:
> 
> The balloon_lock was used to synchronize the access demand to elements
> of struct virtio_balloon and its queue operations (please see commit
> e22504296d). This prevents the concurrent run of the leak_balloon and
> fill_balloon functions, thereby resulting in a deadlock issue on OOM:
> 
> fill_balloon: take balloon_lock and wait for OOM to get some memory;
> oom_notify: release some inflated memory via leak_balloon();
> leak_balloon: wait for balloon_lock to be released by fill_balloon.

fill_balloon does the allocation *before* taking the lock. tell_host()
should not allocate memory AFAIR. So how could this ever happen?

Anyhow, we could simply work around this by doing a trylock in
fill_balloon() and retrying in the caller. That should be easy. But I
want to understand first, how something like that would even be possible.
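
To sketch the shape of that workaround in a self-contained way, here is a toy
user-space model of the trylock-and-retry pattern. The function names only
mirror the driver; this is not the driver code, and whether it actually
resolves the reported deadlock is exactly the open question above:

/* fill gives up instead of sleeping on the lock, and the caller retries
 * later; the deflate path takes the lock normally. Toy model only. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t balloon_lock = PTHREAD_MUTEX_INITIALIZER;

static int fill_balloon_try(void)
{
	if (pthread_mutex_trylock(&balloon_lock) != 0)
		return 0;			/* lock busy: back off, caller retries */
	/* ... tell_host(inflate), bookkeeping ... */
	pthread_mutex_unlock(&balloon_lock);
	return 1;
}

static void oom_deflate(void)
{
	pthread_mutex_lock(&balloon_lock);
	/* ... leak_balloon(), tell_host(deflate) ... */
	pthread_mutex_unlock(&balloon_lock);
}

int main(void)
{
	if (!fill_balloon_try())
		printf("balloon_lock busy, retrying later\n");
	oom_deflate();
	return 0;
}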

>> Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
>>could actually skip sending deflation requests to our hypervisor,
>>making the OOM path *very* simple. Basically freeing pages and
>>updating the balloon.
> 
> Well not exactly. !VIRTIO_BALLOON_F_MUST_TELL_HOST does not actually
> mean "never tell host". It means "host will not discard pages in the
> balloon, you can defer host notification until after use".
> 
> This was the original implementation:
> 
> +  

Re: [PATCH v1 1/3] virtio-balloon: Fix memory leak when unloading while hinting is in progress

2020-02-06 Thread Michael S. Tsirkin
On Wed, Feb 05, 2020 at 05:34:00PM +0100, David Hildenbrand wrote:
> When unloading the driver while hinting is in progress, we will not
> release the free page blocks back to MM, resulting in a memory leak.
> 
> Fixes: 86a559787e6f ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT")
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Wei Wang 
> Cc: Liang Li 
> Signed-off-by: David Hildenbrand 

Applied, thanks!

> ---
>  drivers/virtio/virtio_balloon.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8e400ece9273..abef2306c899 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -968,6 +968,10 @@ static void remove_common(struct virtio_balloon *vb)
>   leak_balloon(vb, vb->num_pages);
>   update_balloon_size(vb);
>  
> + /* There might be free pages that are being reported: release them. */
> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
> + return_free_pages_to_mm(vb, ULONG_MAX);
> +
>   /* Now we reset the device so we can clean up the queues. */
>   vb->vdev->config->reset(vb->vdev);
>  
> -- 
> 2.24.1



Re: [PATCH v1 2/3] virtio_balloon: Fix memory leaks on errors in virtballoon_probe()

2020-02-06 Thread Michael S. Tsirkin
On Wed, Feb 05, 2020 at 05:34:01PM +0100, David Hildenbrand wrote:
> We forget to put the inode and unmount the kernfs used for compaction.
> 
> Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> Cc: "Michael S. Tsirkin" 
> Cc: Jason Wang 
> Cc: Wei Wang 
> Cc: Liang Li 
> Signed-off-by: David Hildenbrand 

Applied, thanks!

> ---
>  drivers/virtio/virtio_balloon.c | 13 +
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index abef2306c899..7e5d84caeb94 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -901,8 +901,7 @@ static int virtballoon_probe(struct virtio_device *vdev)
>   vb->vb_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
>   if (IS_ERR(vb->vb_dev_info.inode)) {
>   err = PTR_ERR(vb->vb_dev_info.inode);
> - kern_unmount(balloon_mnt);
> - goto out_del_vqs;
> + goto out_kern_unmount;
>   }
>   vb->vb_dev_info.inode->i_mapping->a_ops = _aops;
>  #endif
> @@ -913,13 +912,13 @@ static int virtballoon_probe(struct virtio_device *vdev)
>*/
>   if (virtqueue_get_vring_size(vb->free_page_vq) < 2) {
>   err = -ENOSPC;
> - goto out_del_vqs;
> + goto out_iput;
>   }
>   vb->balloon_wq = alloc_workqueue("balloon-wq",
>   WQ_FREEZABLE | WQ_CPU_INTENSIVE, 0);
>   if (!vb->balloon_wq) {
>   err = -ENOMEM;
> - goto out_del_vqs;
> + goto out_iput;
>   }
>   INIT_WORK(>report_free_page_work, report_free_page_func);
>   vb->cmd_id_received_cache = VIRTIO_BALLOON_CMD_ID_STOP;
> @@ -953,6 +952,12 @@ static int virtballoon_probe(struct virtio_device *vdev)
>  out_del_balloon_wq:
>   if (virtio_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT))
>   destroy_workqueue(vb->balloon_wq);
> +out_iput:
> +#ifdef CONFIG_BALLOON_COMPACTION
> + iput(vb->vb_dev_info.inode);
> +out_kern_unmount:
> + kern_unmount(balloon_mnt);
> +#endif
>  out_del_vqs:
>   vdev->config->del_vqs(vdev);
>  out_free_vb:
> -- 
> 2.24.1
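
As an aside, the error-path structure being restored here is the usual
goto-unwind idiom: each failure jumps to the label that releases exactly what
was already acquired, in reverse order. A self-contained illustration with
stand-in resources (not driver code; the path is hypothetical):

#include <stdio.h>
#include <stdlib.h>

static struct { char *a; char *b; FILE *f; } ctx;

static int setup(void)
{
	int err;

	ctx.a = malloc(16);
	if (!ctx.a) { err = -1; goto out; }

	ctx.f = fopen("/tmp/example", "w");	/* hypothetical path */
	if (!ctx.f) { err = -2; goto out_free_a; }

	ctx.b = malloc(16);
	if (!ctx.b) { err = -3; goto out_fclose; }

	return 0;	/* success: everything stays acquired */

out_fclose:
	fclose(ctx.f);
out_free_a:
	free(ctx.a);
out:
	return err;
}

int main(void)
{
	printf("setup: %d\n", setup());
	return 0;
}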



Re: [PATCH] virtio_balloon: prevent pfn array overflow

2020-02-06 Thread David Hildenbrand
On 06.02.20 08:47, Michael S. Tsirkin wrote:
> Make sure, at build time, that pfn array is big enough to hold a single
> page.  It happens to be true since the PAGE_SHIFT value at the moment is
> 20, which is 1M - exactly 256 4K balloon pages.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/virtio/virtio_balloon.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 8e400ece9273..2457c54b6185 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -158,6 +158,8 @@ static void set_page_pfns(struct virtio_balloon *vb,
>  {
>   unsigned int i;
>  
> + BUILD_BUG_ON(VIRTIO_BALLOON_PAGES_PER_PAGE > VIRTIO_BALLOON_ARRAY_PFNS_MAX);
> +
>   /*
>* Set balloon pfns pointing at this page.
>* Note that the first pfn points at start of the page.
> 

Reviewed-by: David Hildenbrand 

-- 
Thanks,

David / dhildenb



[PATCH] tools/virtio: option to build an out of tree module

2020-02-06 Thread Michael S. Tsirkin
Handy for testing with distro kernels.
Warn that the resulting module is completely unsupported,
and isn't intended for production use.

Signed-off-by: Michael S. Tsirkin 
---
 tools/virtio/Makefile | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/virtio/Makefile b/tools/virtio/Makefile
index 8e2a908115c2..94106cde49e3 100644
--- a/tools/virtio/Makefile
+++ b/tools/virtio/Makefile
@@ -8,7 +8,18 @@ CFLAGS += -g -O2 -Werror -Wall -I. -I../include/ -I 
../../usr/include/ -Wno-poin
 vpath %.c ../../drivers/virtio ../../drivers/vhost
 mod:
${MAKE} -C `pwd`/../.. M=`pwd`/vhost_test V=${V}
-.PHONY: all test mod clean
+
+#oot: build vhost as an out of tree module for a distro kernel
+#no effort is taken to make it actually build or work, but tends to mostly work
+#if the distro kernel is very close to upstream
+#unsupported! this is a development tool only, don't use the
+#resulting modules in production!
+oot:
+   echo "UNSUPPORTED! Don't use the resulting modules in production!"
+   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C /usr/src/kernels/$$(uname -r) M=`pwd`/vhost_test V=${V}
+   KCFLAGS="-I "`pwd`/../../drivers/vhost ${MAKE} -C /usr/src/kernels/$$(uname -r) M=`pwd`/../../drivers/vhost V=${V} CONFIG_VHOST_VSOCK=n
+
+.PHONY: all test mod clean vhost oot
 clean:
${RM} *.o vringh_test virtio_test vhost_test/*.o vhost_test/.*.cmd \
   vhost_test/Module.symvers vhost_test/modules.order *.d
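
Presumed usage, going by the Makefile above (not spelled out in the patch; it
assumes the distro kernel headers live under /usr/src/kernels/$(uname -r) as
in the recipe):

$ cd tools/virtio
$ make oot
# builds vhost_test.ko and the vhost modules against the installed
# distro headers -- development and testing only, never production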
-- 
MST
