Re: [PATCH v37 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-12-28 Thread Christian Borntraeger


On 28.12.2018 04:12, Wei Wang wrote:
> On 12/27/2018 08:03 PM, Christian Borntraeger wrote:
>> On 27.08.2018 03:32, Wei Wang wrote:
>>>   static int init_vqs(struct virtio_balloon *vb)
>>>   {
>>> -    struct virtqueue *vqs[3];
>>> -    vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
>>> };
>>> -    static const char * const names[] = { "inflate", "deflate", "stats" };
>>> -    int err, nvqs;
>>> +    struct virtqueue *vqs[VIRTIO_BALLOON_VQ_MAX];
>>> +    vq_callback_t *callbacks[VIRTIO_BALLOON_VQ_MAX];
>>> +    const char *names[VIRTIO_BALLOON_VQ_MAX];
>>> +    int err;
>>>
>>>   /*
>>> - * We expect two virtqueues: inflate and deflate, and
>>> - * optionally stat.
>>> + * Inflateq and deflateq are used unconditionally. The names[]
>>> + * will be NULL if the related feature is not enabled, which will
>>> + * cause no allocation for the corresponding virtqueue in find_vqs.
>>>    */
>> This might be true for virtio-pci, but it is not for virtio-ccw.
> 
> Hi Christian,
> 
> 
> Please try the fix patches: https://lkml.org/lkml/2018/12/27/336

See answer to that thread. It fixes the random boot crashes.
There is still the regression that ballooning does no longer work on
s390 (see the call trace).

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v37 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-12-27 Thread Wei Wang

On 12/27/2018 08:03 PM, Christian Borntraeger wrote:

On 27.08.2018 03:32, Wei Wang wrote:

  static int init_vqs(struct virtio_balloon *vb)
  {
-   struct virtqueue *vqs[3];
-   vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
};
-   static const char * const names[] = { "inflate", "deflate", "stats" };
-   int err, nvqs;
+   struct virtqueue *vqs[VIRTIO_BALLOON_VQ_MAX];
+   vq_callback_t *callbacks[VIRTIO_BALLOON_VQ_MAX];
+   const char *names[VIRTIO_BALLOON_VQ_MAX];
+   int err;

/*
-* We expect two virtqueues: inflate and deflate, and
-* optionally stat.
+* Inflateq and deflateq are used unconditionally. The names[]
+* will be NULL if the related feature is not enabled, which will
+* cause no allocation for the corresponding virtqueue in find_vqs.
 */

This might be true for virtio-pci, but it is not for virtio-ccw.


Hi Christian,


Please try the fix patches: https://lkml.org/lkml/2018/12/27/336

Best,
Wei
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v37 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-12-27 Thread Christian Borntraeger
On 27.08.2018 03:32, Wei Wang wrote:
>  static int init_vqs(struct virtio_balloon *vb)
>  {
> - struct virtqueue *vqs[3];
> - vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
> };
> - static const char * const names[] = { "inflate", "deflate", "stats" };
> - int err, nvqs;
> + struct virtqueue *vqs[VIRTIO_BALLOON_VQ_MAX];
> + vq_callback_t *callbacks[VIRTIO_BALLOON_VQ_MAX];
> + const char *names[VIRTIO_BALLOON_VQ_MAX];
> + int err;
> 
>   /*
> -  * We expect two virtqueues: inflate and deflate, and
> -  * optionally stat.
> +  * Inflateq and deflateq are used unconditionally. The names[]
> +  * will be NULL if the related feature is not enabled, which will
> +  * cause no allocation for the corresponding virtqueue in find_vqs.
>*/

This might be true for virtio-pci, but it is not for virtio-ccw.

> - nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
> - err = virtio_find_vqs(vb->vdev, nvqs, vqs, callbacks, names, NULL);
> + callbacks[VIRTIO_BALLOON_VQ_INFLATE] = balloon_ack;
> + names[VIRTIO_BALLOON_VQ_INFLATE] = "inflate";
> + callbacks[VIRTIO_BALLOON_VQ_DEFLATE] = balloon_ack;
> + names[VIRTIO_BALLOON_VQ_DEFLATE] = "deflate";
> + names[VIRTIO_BALLOON_VQ_STATS] = NULL;
> + names[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL;
> +
> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
> + names[VIRTIO_BALLOON_VQ_STATS] = "stats";
> + callbacks[VIRTIO_BALLOON_VQ_STATS] = stats_request;
> + }
> +
> + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
> + names[VIRTIO_BALLOON_VQ_FREE_PAGE] = "free_page_vq";
> + callbacks[VIRTIO_BALLOON_VQ_FREE_PAGE] = NULL;
> + }
> +
> + err = vb->vdev->config->find_vqs(vb->vdev, VIRTIO_BALLOON_VQ_MAX,
[...]

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v37 1/3] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-08-26 Thread Wei Wang
Negotiation of the VIRTIO_BALLOON_F_FREE_PAGE_HINT feature indicates the
support of reporting hints of guest free pages to host via virtio-balloon.
Currenlty, only free page blocks of MAX_ORDER - 1 are reported. They are
obtained one by one from the mm free list via the regular allocation
function.

Host requests the guest to report free page hints by sending a new cmd id
to the guest via the free_page_report_cmd_id configuration register. When
the guest starts to report, it first sends a start cmd to host via the
free page vq, which acks to host the cmd id received. When the guest
finishes reporting free pages, a stop cmd is sent to host via the vq.
Host may also send a stop cmd id to the guest to stop the reporting.

VIRTIO_BALLOON_CMD_ID_STOP: Host sends this cmd to stop the guest
reporting.
VIRTIO_BALLOON_CMD_ID_DONE: Host sends this cmd to tell the guest that
the reported pages are ready to be freed.

Why does the guest free the reported pages when host tells it is ready to
free?
This is because freeing pages appears to be expensive for live migration.
free_pages() dirties memory very quickly and makes the live migraion not
converge in some cases. So it is good to delay the free_page operation
when the migration is done, and host sends a command to guest about that.

Why do we need the new VIRTIO_BALLOON_CMD_ID_DONE, instead of reusing
VIRTIO_BALLOON_CMD_ID_STOP?
This is because live migration is usually done in several rounds. At the
end of each round, host needs to send a VIRTIO_BALLOON_CMD_ID_STOP cmd to
the guest to stop (or say pause) the reporting. The guest resumes the
reporting when it receives a new command id at the beginning of the next
round. So we need a new cmd id to distinguish between "stop reporting" and
"ready to free the reported pages".

TODO:
- Add a batch page allocation API to amortize the allocation overhead.

Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
Cc: Michael S. Tsirkin 
Cc: Michal Hocko 
Cc: Andrew Morton 
Cc: Linus Torvalds 
---
 drivers/virtio/virtio_balloon.c | 364 
 include/uapi/linux/virtio_balloon.h |   5 +
 2 files changed, 336 insertions(+), 33 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index d1c1f62..a185678 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -41,13 +41,34 @@
 #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+#define VIRTIO_BALLOON_FREE_PAGE_ALLOC_FLAG (__GFP_NORETRY | __GFP_NOWARN | \
+__GFP_NOMEMALLOC)
+/* The order of free page blocks to report to host */
+#define VIRTIO_BALLOON_FREE_PAGE_ORDER (MAX_ORDER - 1)
+/* The size of a free page block in bytes */
+#define VIRTIO_BALLOON_FREE_PAGE_SIZE \
+   (1 << (VIRTIO_BALLOON_FREE_PAGE_ORDER + PAGE_SHIFT))
+
 #ifdef CONFIG_BALLOON_COMPACTION
 static struct vfsmount *balloon_mnt;
 #endif
 
+enum virtio_balloon_vq {
+   VIRTIO_BALLOON_VQ_INFLATE,
+   VIRTIO_BALLOON_VQ_DEFLATE,
+   VIRTIO_BALLOON_VQ_STATS,
+   VIRTIO_BALLOON_VQ_FREE_PAGE,
+   VIRTIO_BALLOON_VQ_MAX
+};
+
 struct virtio_balloon {
struct virtio_device *vdev;
-   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq;
+   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq, *free_page_vq;
+
+   /* Balloon's own wq for cpu-intensive work items */
+   struct workqueue_struct *balloon_wq;
+   /* The free page reporting work item submitted to the balloon wq */
+   struct work_struct report_free_page_work;
 
/* The balloon servicing is delegated to a freezable workqueue. */
struct work_struct update_balloon_stats_work;
@@ -57,6 +78,18 @@ struct virtio_balloon {
spinlock_t stop_update_lock;
bool stop_update;
 
+   /* The list of allocated free pages, waiting to be given back to mm */
+   struct list_head free_page_list;
+   spinlock_t free_page_list_lock;
+   /* The number of free page blocks on the above list */
+   unsigned long num_free_page_blocks;
+   /* The cmd id received from host */
+   u32 cmd_id_received;
+   /* The cmd id that is actively in use */
+   __virtio32 cmd_id_active;
+   /* Buffer to store the stop sign */
+   __virtio32 cmd_id_stop;
+
/* Waiting for host to ack the pages we released. */
wait_queue_head_t acked;
 
@@ -320,17 +353,6 @@ static void stats_handle_request(struct virtio_balloon *vb)
virtqueue_kick(vq);
 }
 
-static void virtballoon_changed(struct virtio_device *vdev)
-{
-   struct virtio_balloon *vb = vdev->priv;
-   unsigned long flags;
-
-   spin_lock_irqsave(>stop_update_lock, flags);
-   if (!vb->stop_update)
-   queue_work(system_freezable_wq, >update_balloon_size_work);
-   spin_unlock_irqrestore(>stop_update_lock, flags);
-}
-
 static inline s64 towards_target(struct virtio_balloon *vb)
 {
s64 target;