Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-11 Thread Rusty Russell
Rafael Aquini aqu...@redhat.com writes:

 On Thu, Nov 08, 2012 at 09:32:18AM +1030, Rusty Russell wrote:
 The first one can be delayed, the second one can be delayed if the host
 didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).
 
 We could implement a proper request queue for these, and return -EAGAIN
 if the queue fills.  Though in practice, it's not important (it might
 help performance).

 I liked the idea. Give me the directions to accomplish it and I'll give it a 
 try
 for sure.

OK, let's get this applied first, but here are some pointers:

Here's the current callback function when the host has processed the
buffers we put in the queue:

 static void balloon_ack(struct virtqueue *vq)
 {
struct virtio_balloon *vb = vq-vdev-priv;

wake_up(vb-acked);
 }

It's almost a noop: here's how we use it to make our queues synchronous:

 static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
 {
struct scatterlist sg;
unsigned int len;

sg_init_one(sg, vb-pfns, sizeof(vb-pfns[0]) * vb-num_pfns);

/* We should always be able to add one buffer to an empty queue. */
if (virtqueue_add_buf(vq, sg, 1, 0, vb, GFP_KERNEL)  0)
BUG();
virtqueue_kick(vq);

/* When host has read buffer, this completes via balloon_ack */
wait_event(vb-acked, virtqueue_get_buf(vq, len));
 }

And we set up the callback when we create the virtqueue:

vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request 
};
...
err = vb-vdev-config-find_vqs(vb-vdev, nvqs, vqs, callbacks, names);

So off the top of my head it should be as simple as changing tell_host()
to only wait if the virtqueue_add_buf() fails (ie. queue is full).

Hmm, though you will want to synchronize the inflate and deflate queues:
if we tell the host we're giving a page up we want it to have seen that
before we tell it we're using it again...

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-10 Thread Michael S. Tsirkin
On Wed, Nov 07, 2012 at 04:11:46PM -0800, Andrew Morton wrote:
 On Thu, 08 Nov 2012 09:32:18 +1030
 Rusty Russell ru...@rustcorp.com.au wrote:
 
  Rafael Aquini aqu...@redhat.com writes:
   + * virtballoon_migratepage - perform the balloon page migration on 
   behalf of
   + *a compation thread. (called under page 
   lock)
  
   + if (!mutex_trylock(vb-balloon_lock))
   + return -EAGAIN;
  
  Erk, OK...
 
 Not really.  As is almost always the case with a trylock, it needs a
 comment explaining why we couldn't use the far superior mutex_lock(). 
 Data: this reader doesn't know!
 
   + /* balloon's page migration 1st step  -- inflate newpage */
   + spin_lock_irqsave(vb_dev_info-pages_lock, flags);
   + balloon_page_insert(newpage, mapping, vb_dev_info-pages);
   + vb_dev_info-isolated_pages--;
   + spin_unlock_irqrestore(vb_dev_info-pages_lock, flags);
   + vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
   + set_page_pfns(vb-pfns, newpage);
   + tell_host(vb, vb-inflate_vq);
  
  tell_host does wait_event(), so you can't call it under the page_lock.
  Right?
 
 Sleeping inside lock_page() is OK.  More problematic is that GFP_KERNEL
 allocation.

Do you mean this one:
if (virtqueue_add_buf(vq, sg, 1, 0, vb, GFP_KERNEL)  0)
 ?

In practice it never triggers an allocation, we can pass in
GFP_ATOMIC just as well.

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-10 Thread Michael S. Tsirkin
On Wed, Nov 07, 2012 at 01:05:52AM -0200, Rafael Aquini wrote:
 Memory fragmentation introduced by ballooning might reduce significantly
 the number of 2MB contiguous memory blocks that can be used within a guest,
 thus imposing performance penalties associated with the reduced number of
 transparent huge pages that could be used by the guest workload.
 
 Besides making balloon pages movable at allocation time and introducing
 the necessary primitives to perform balloon page migration/compaction,
 this patch also introduces the following locking scheme, in order to
 enhance the syncronization methods for accessing elements of struct
 virtio_balloon, thus providing protection against concurrent access
 introduced by parallel memory migration threads.
 
  - balloon_lock (mutex) : synchronizes the access demand to elements of
   struct virtio_balloon and its queue operations;
 
 Signed-off-by: Rafael Aquini aqu...@redhat.com


Acked-by: Michael S. Tsirkin m...@redhat.com


 ---
  drivers/virtio/virtio_balloon.c | 135 
 
  1 file changed, 123 insertions(+), 12 deletions(-)
 
 diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
 index 0908e60..69eede7 100644
 --- a/drivers/virtio/virtio_balloon.c
 +++ b/drivers/virtio/virtio_balloon.c
 @@ -27,6 +27,7 @@
  #include linux/delay.h
  #include linux/slab.h
  #include linux/module.h
 +#include linux/balloon_compaction.h
  
  /*
   * Balloon device works in 4K page units.  So each page is pointed to by
 @@ -34,6 +35,7 @@
   * page units.
   */
  #define VIRTIO_BALLOON_PAGES_PER_PAGE (PAGE_SIZE  VIRTIO_BALLOON_PFN_SHIFT)
 +#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
  
  struct virtio_balloon
  {
 @@ -52,15 +54,19 @@ struct virtio_balloon
   /* Number of balloon pages we've told the Host we're not using. */
   unsigned int num_pages;
   /*
 -  * The pages we've told the Host we're not using.
 +  * The pages we've told the Host we're not using are enqueued
 +  * at vb_dev_info-pages list.
* Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
* to num_pages above.
*/
 - struct list_head pages;
 + struct balloon_dev_info *vb_dev_info;
 +
 + /* Synchronize access/update to this struct virtio_balloon elements */
 + struct mutex balloon_lock;
  
   /* The array of pfns we tell the Host about. */
   unsigned int num_pfns;
 - u32 pfns[256];
 + u32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
  
   /* Memory statistics */
   int need_stats_update;
 @@ -122,18 +128,25 @@ static void set_page_pfns(u32 pfns[], struct page *page)
  
  static void fill_balloon(struct virtio_balloon *vb, size_t num)
  {
 + struct balloon_dev_info *vb_dev_info = vb-vb_dev_info;
 +
 + static DEFINE_RATELIMIT_STATE(fill_balloon_rs,
 +   DEFAULT_RATELIMIT_INTERVAL,
 +   DEFAULT_RATELIMIT_BURST);
 +
   /* We can only do one array worth at a time. */
   num = min(num, ARRAY_SIZE(vb-pfns));
  
 + mutex_lock(vb-balloon_lock);
   for (vb-num_pfns = 0; vb-num_pfns  num;
vb-num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 - struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
 - __GFP_NOMEMALLOC | __GFP_NOWARN);
 + struct page *page = balloon_page_enqueue(vb_dev_info);
 +
   if (!page) {
 - if (printk_ratelimit())
 + if (__ratelimit(fill_balloon_rs))
   dev_printk(KERN_INFO, vb-vdev-dev,
  Out of puff! Can't get %zu pages\n,
 -num);
 +VIRTIO_BALLOON_PAGES_PER_PAGE);
   /* Sleep for at least 1/5 of a second before retry. */
   msleep(200);
   break;
 @@ -141,7 +154,6 @@ static void fill_balloon(struct virtio_balloon *vb, 
 size_t num)
   set_page_pfns(vb-pfns + vb-num_pfns, page);
   vb-num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
   totalram_pages--;
 - list_add(page-lru, vb-pages);
   }
  
   /* Didn't get any?  Oh well. */
 @@ -149,6 +161,7 @@ static void fill_balloon(struct virtio_balloon *vb, 
 size_t num)
   return;
  
   tell_host(vb, vb-inflate_vq);
 + mutex_unlock(vb-balloon_lock);
  }
  
  static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
 @@ -165,14 +178,17 @@ static void release_pages_by_pfn(const u32 pfns[], 
 unsigned int num)
  static void leak_balloon(struct virtio_balloon *vb, size_t num)
  {
   struct page *page;
 + struct balloon_dev_info *vb_dev_info = vb-vb_dev_info;
  
   /* We can only do one array worth at a time. */
   num = min(num, ARRAY_SIZE(vb-pfns));
  
 + 

Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Andrew Morton
On Wed,  7 Nov 2012 01:05:52 -0200
Rafael Aquini aqu...@redhat.com wrote:

 Memory fragmentation introduced by ballooning might reduce significantly
 the number of 2MB contiguous memory blocks that can be used within a guest,
 thus imposing performance penalties associated with the reduced number of
 transparent huge pages that could be used by the guest workload.
 
 Besides making balloon pages movable at allocation time and introducing
 the necessary primitives to perform balloon page migration/compaction,
 this patch also introduces the following locking scheme, in order to
 enhance the syncronization methods for accessing elements of struct
 virtio_balloon, thus providing protection against concurrent access
 introduced by parallel memory migration threads.
 
 ...

 @@ -122,18 +128,25 @@ static void set_page_pfns(u32 pfns[], struct page *page)
  
  static void fill_balloon(struct virtio_balloon *vb, size_t num)
  {
 + struct balloon_dev_info *vb_dev_info = vb-vb_dev_info;
 +
 + static DEFINE_RATELIMIT_STATE(fill_balloon_rs,
 +   DEFAULT_RATELIMIT_INTERVAL,
 +   DEFAULT_RATELIMIT_BURST);
 +
   /* We can only do one array worth at a time. */
   num = min(num, ARRAY_SIZE(vb-pfns));
  
 + mutex_lock(vb-balloon_lock);
   for (vb-num_pfns = 0; vb-num_pfns  num;
vb-num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
 - struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
 - __GFP_NOMEMALLOC | __GFP_NOWARN);
 + struct page *page = balloon_page_enqueue(vb_dev_info);
 +
   if (!page) {
 - if (printk_ratelimit())
 + if (__ratelimit(fill_balloon_rs))
   dev_printk(KERN_INFO, vb-vdev-dev,
  Out of puff! Can't get %zu pages\n,
 -num);
 +VIRTIO_BALLOON_PAGES_PER_PAGE);
   /* Sleep for at least 1/5 of a second before retry. */
   msleep(200);
   break;

linux-next's fill_balloon() has already been converted to
dev_info_ratelimited().  I fixed everything up.  Please check the result.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Rafael Aquini
On Wed, Nov 07, 2012 at 11:58:10AM -0800, Andrew Morton wrote:
 On Wed,  7 Nov 2012 01:05:52 -0200
 Rafael Aquini aqu...@redhat.com wrote:
 
  Memory fragmentation introduced by ballooning might reduce significantly
  the number of 2MB contiguous memory blocks that can be used within a guest,
  thus imposing performance penalties associated with the reduced number of
  transparent huge pages that could be used by the guest workload.
  
  Besides making balloon pages movable at allocation time and introducing
  the necessary primitives to perform balloon page migration/compaction,
  this patch also introduces the following locking scheme, in order to
  enhance the syncronization methods for accessing elements of struct
  virtio_balloon, thus providing protection against concurrent access
  introduced by parallel memory migration threads.
  
  ...
 
  @@ -122,18 +128,25 @@ static void set_page_pfns(u32 pfns[], struct page 
  *page)
   
   static void fill_balloon(struct virtio_balloon *vb, size_t num)
   {
  +   struct balloon_dev_info *vb_dev_info = vb-vb_dev_info;
  +
  +   static DEFINE_RATELIMIT_STATE(fill_balloon_rs,
  + DEFAULT_RATELIMIT_INTERVAL,
  + DEFAULT_RATELIMIT_BURST);
  +
  /* We can only do one array worth at a time. */
  num = min(num, ARRAY_SIZE(vb-pfns));
   
  +   mutex_lock(vb-balloon_lock);
  for (vb-num_pfns = 0; vb-num_pfns  num;
   vb-num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
  -   struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
  -   __GFP_NOMEMALLOC | __GFP_NOWARN);
  +   struct page *page = balloon_page_enqueue(vb_dev_info);
  +
  if (!page) {
  -   if (printk_ratelimit())
  +   if (__ratelimit(fill_balloon_rs))
  dev_printk(KERN_INFO, vb-vdev-dev,
 Out of puff! Can't get %zu pages\n,
  -  num);
  +  VIRTIO_BALLOON_PAGES_PER_PAGE);
  /* Sleep for at least 1/5 of a second before retry. */
  msleep(200);
  break;
 
 linux-next's fill_balloon() has already been converted to
 dev_info_ratelimited().  I fixed everything up.  Please check the result.

Looks great, thanks for doing it

 
 --
 To unsubscribe, send a message with 'unsubscribe linux-mm' in
 the body to majord...@kvack.org.  For more info on Linux MM,
 see: http://www.linux-mm.org/ .
 Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Rusty Russell
Rafael Aquini aqu...@redhat.com writes:
 + * virtballoon_migratepage - perform the balloon page migration on behalf of
 + *a compation thread. (called under page lock)

 + if (!mutex_trylock(vb-balloon_lock))
 + return -EAGAIN;

Erk, OK...

 + /* balloon's page migration 1st step  -- inflate newpage */
 + spin_lock_irqsave(vb_dev_info-pages_lock, flags);
 + balloon_page_insert(newpage, mapping, vb_dev_info-pages);
 + vb_dev_info-isolated_pages--;
 + spin_unlock_irqrestore(vb_dev_info-pages_lock, flags);
 + vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
 + set_page_pfns(vb-pfns, newpage);
 + tell_host(vb, vb-inflate_vq);

tell_host does wait_event(), so you can't call it under the page_lock.
Right?

You probably get away with it because current qemu will service you
immediately.  You could spin here in this case for the moment.

There's a second call to tell_host():

 + /*
 +  * balloon's page migration 2nd step -- deflate page
 +  *
 +  * It's safe to delete page-lru here because this page is at
 +  * an isolated migration list, and this step is expected to happen here
 +  */
 + balloon_page_delete(page);
 + vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
 + set_page_pfns(vb-pfns, page);
 + tell_host(vb, vb-deflate_vq);

The first one can be delayed, the second one can be delayed if the host
didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).

We could implement a proper request queue for these, and return -EAGAIN
if the queue fills.  Though in practice, it's not important (it might
help performance).

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Andrew Morton
On Thu, 08 Nov 2012 09:32:18 +1030
Rusty Russell ru...@rustcorp.com.au wrote:

 Rafael Aquini aqu...@redhat.com writes:
  + * virtballoon_migratepage - perform the balloon page migration on behalf 
  of
  + *  a compation thread. (called under page lock)
 
  +   if (!mutex_trylock(vb-balloon_lock))
  +   return -EAGAIN;
 
 Erk, OK...

Not really.  As is almost always the case with a trylock, it needs a
comment explaining why we couldn't use the far superior mutex_lock(). 
Data: this reader doesn't know!

  +   /* balloon's page migration 1st step  -- inflate newpage */
  +   spin_lock_irqsave(vb_dev_info-pages_lock, flags);
  +   balloon_page_insert(newpage, mapping, vb_dev_info-pages);
  +   vb_dev_info-isolated_pages--;
  +   spin_unlock_irqrestore(vb_dev_info-pages_lock, flags);
  +   vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
  +   set_page_pfns(vb-pfns, newpage);
  +   tell_host(vb, vb-inflate_vq);
 
 tell_host does wait_event(), so you can't call it under the page_lock.
 Right?

Sleeping inside lock_page() is OK.  More problematic is that GFP_KERNEL
allocation.  iirc it _should_ be OK.  Core VM uses trylock_page() and
the filesystems shouldn't be doing a synchronous lock_page() in the
pageout path.  But I suspect it isn't a well-tested area.

 You probably get away with it because current qemu will service you
 immediately.  You could spin here in this case for the moment.
 
 There's a second call to tell_host():
 
  +   /*
  +* balloon's page migration 2nd step -- deflate page
  +*
  +* It's safe to delete page-lru here because this page is at
  +* an isolated migration list, and this step is expected to happen here
  +*/
  +   balloon_page_delete(page);
  +   vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
  +   set_page_pfns(vb-pfns, page);
  +   tell_host(vb, vb-deflate_vq);
 
 The first one can be delayed, the second one can be delayed if the host
 didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).
 
 We could implement a proper request queue for these, and return -EAGAIN
 if the queue fills.  Though in practice, it's not important (it might
 help performance).

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Rafael Aquini
On Wed, Nov 07, 2012 at 04:11:46PM -0800, Andrew Morton wrote:
 On Thu, 08 Nov 2012 09:32:18 +1030
 Rusty Russell ru...@rustcorp.com.au wrote:
 
  Rafael Aquini aqu...@redhat.com writes:
   + * virtballoon_migratepage - perform the balloon page migration on 
   behalf of
   + *a compation thread. (called under page 
   lock)
  
   + if (!mutex_trylock(vb-balloon_lock))
   + return -EAGAIN;
  
  Erk, OK...
 
 Not really.  As is almost always the case with a trylock, it needs a
 comment explaining why we couldn't use the far superior mutex_lock(). 
 Data: this reader doesn't know!



That was just to alleviate balloon_lock contention if we're migrating pages
concurrently with balloon_fill() or balloon_leak(), as it's easier to retry
the page migration later (in the contended case).

 
   + /* balloon's page migration 1st step  -- inflate newpage */
   + spin_lock_irqsave(vb_dev_info-pages_lock, flags);
   + balloon_page_insert(newpage, mapping, vb_dev_info-pages);
   + vb_dev_info-isolated_pages--;
   + spin_unlock_irqrestore(vb_dev_info-pages_lock, flags);
   + vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
   + set_page_pfns(vb-pfns, newpage);
   + tell_host(vb, vb-inflate_vq);
  
  tell_host does wait_event(), so you can't call it under the page_lock.
  Right?
 
 Sleeping inside lock_page() is OK.  More problematic is that GFP_KERNEL
 allocation.  iirc it _should_ be OK.  Core VM uses trylock_page() and
 the filesystems shouldn't be doing a synchronous lock_page() in the
 pageout path.  But I suspect it isn't a well-tested area.


The locked page under migration is not contended by any other FS / core VM path,
as it is already isolated by compaction to a private migration page list.
OTOH, there's a chance of another parallel compaction thread hitting this page
while scanning page blocks for isolation, but that path can be considered safe
as it uses trylock_page()



 
  You probably get away with it because current qemu will service you
  immediately.  You could spin here in this case for the moment.
  
  There's a second call to tell_host():
  
   + /*
   +  * balloon's page migration 2nd step -- deflate page
   +  *
   +  * It's safe to delete page-lru here because this page is at
   +  * an isolated migration list, and this step is expected to happen here
   +  */
   + balloon_page_delete(page);
   + vb-num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
   + set_page_pfns(vb-pfns, page);
   + tell_host(vb, vb-deflate_vq);
  
  The first one can be delayed, the second one can be delayed if the host
  didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).
  
  We could implement a proper request queue for these, and return -EAGAIN
  if the queue fills.  Though in practice, it's not important (it might
  help performance).
 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages

2012-11-07 Thread Rafael Aquini
On Thu, Nov 08, 2012 at 09:32:18AM +1030, Rusty Russell wrote:
 The first one can be delayed, the second one can be delayed if the host
 didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).
 
 We could implement a proper request queue for these, and return -EAGAIN
 if the queue fills.  Though in practice, it's not important (it might
 help performance).

I liked the idea. Give me the directions to accomplish it and I'll give it a try
for sure.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization