Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-02-03 Thread Li, Liang Z
> 
> 
> > +static void free_extended_page_bitmap(struct virtio_balloon *vb) {
> > +   int i, bmap_count = vb->nr_page_bmap;
> > +
> > +   for (i = 1; i < bmap_count; i++) {
> > +   kfree(vb->page_bitmap[i]);
> > +   vb->page_bitmap[i] = NULL;
> > +   vb->nr_page_bmap--;
> > +   }
> > +}
> > +
> > +static void kfree_page_bitmap(struct virtio_balloon *vb) {
> > +   int i;
> > +
> > +   for (i = 0; i < vb->nr_page_bmap; i++)
> > +   kfree(vb->page_bitmap[i]);
> > +}
> 
> It might be worth commenting that pair of functions to make it clear why
> they are so different; I guess the kfree_page_bitmap is used just before you
> free the structure above it so you don't need to keep the count/pointers
> updated?
> 

Yes. I will add some comments for that. Thanks!

Liang
 
> Dave
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-20 Thread Michael S. Tsirkin
On Thu, Jan 19, 2017 at 01:44:36AM +, Li, Liang Z wrote:
> > > > > + *range = cpu_to_le64((base_pfn <<
> > > > > + VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > > + *(range + 1) = cpu_to_le64(pages);
> > > > > + vb->resp_pos += 2;
> > > >
> > > > Pls use structs for this kind of stuff.
> > >
> > > I am not sure if you mean to use
> > >
> > > struct  range {
> > >   __le64 pfn: 52;
> > >   __le64 nr_page: 12
> > > }
> > > Instead of the shift operation?
> > 
> > Not just that. You want to add a pages field as well.
> > 
> 
> pages field? Could you give more hints?

Well look how you are formatting it manually above.
There is clearly a structure with two 64 bit fields.
First one includes pfn and 0 (no idea why does | 0 make
sense but that's a separate issue).
Second one includes the pages value.


> > Generally describe the format in the header in some way so host and guest
> > can easily stay in sync.
> 
> 'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
> related function in page_alloc.c as a parameter.
> 
> Thanks!
> Liang
> > All the pointer math and void * means we get zero type safety and I'm not
> > happy about it.
> > 
> > It's not good that virtio format seeps out to page_alloc anyway.
> > If unavoidable it is not a good idea to try to hide this fact, people will 
> > assume
> > they can change the format at will.
> > 
> > --
> > MST



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-20 Thread Dr. David Alan Gilbert
* Liang Li (liang.z...@intel.com) wrote:



> +static void free_extended_page_bitmap(struct virtio_balloon *vb)
> +{
> + int i, bmap_count = vb->nr_page_bmap;
> +
> + for (i = 1; i < bmap_count; i++) {
> + kfree(vb->page_bitmap[i]);
> + vb->page_bitmap[i] = NULL;
> + vb->nr_page_bmap--;
> + }
> +}
> +
> +static void kfree_page_bitmap(struct virtio_balloon *vb)
> +{
> + int i;
> +
> + for (i = 0; i < vb->nr_page_bmap; i++)
> + kfree(vb->page_bitmap[i]);
> +}

It might be worth commenting that pair of functions to make it clear
why they are so different; I guess the kfree_page_bitmap
is used just before you free the structure above it so you
don't need to keep the count/pointers updated?

Dave
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-18 Thread Li, Liang Z
> On Wed, Jan 18, 2017 at 04:56:58AM +, Li, Liang Z wrote:
> > > > -   virtqueue_add_outbuf(vq, , 1, vb, GFP_KERNEL);
> > > > -   virtqueue_kick(vq);
> > > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > > +   unsigned long base_pfn, int pages)
> > > >
> > > > -   /* When host has read buffer, this completes via balloon_ack */
> > > > -   wait_event(vb->acked, virtqueue_get_buf(vq, ));
> > > > +{
> > > > +   __le64 *range = vb->resp_data + vb->resp_pos;
> > > >
> > > > +   if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > > +   /* when the length field can't contain pages, set it to 
> > > > 0 to
> > >
> > > /*
> > >  * Multi-line
> > >  * comments
> > >  * should look like this.
> > >  */
> > >
> > > Also, pls start sentences with an upper-case letter.
> > >
> >
> > Sorry for that.
> >
> > > > +* indicate the actual length is in the next __le64;
> > > > +*/
> > >
> > > This is part of the interface so should be documented as such.
> > >
> > > > +   *range = cpu_to_le64((base_pfn <<
> > > > +   VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > +   *(range + 1) = cpu_to_le64(pages);
> > > > +   vb->resp_pos += 2;
> > >
> > > Pls use structs for this kind of stuff.
> >
> > I am not sure if you mean to use
> >
> > struct  range {
> > __le64 pfn: 52;
> > __le64 nr_page: 12
> > }
> > Instead of the shift operation?
> 
> Not just that. You want to add a pages field as well.
> 

pages field? Could you give more hints?

> Generally describe the format in the header in some way so host and guest
> can easily stay in sync.

'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
related function in page_alloc.c as a parameter.

Thanks!
Liang
> All the pointer math and void * means we get zero type safety and I'm not
> happy about it.
> 
> It's not good that virtio format seeps out to page_alloc anyway.
> If unavoidable it is not a good idea to try to hide this fact, people will 
> assume
> they can change the format at will.
> 
> --
> MST



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-18 Thread Michael S. Tsirkin
On Wed, Jan 18, 2017 at 04:56:58AM +, Li, Liang Z wrote:
> > > - virtqueue_add_outbuf(vq, , 1, vb, GFP_KERNEL);
> > > - virtqueue_kick(vq);
> > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > + unsigned long base_pfn, int pages)
> > >
> > > - /* When host has read buffer, this completes via balloon_ack */
> > > - wait_event(vb->acked, virtqueue_get_buf(vq, ));
> > > +{
> > > + __le64 *range = vb->resp_data + vb->resp_pos;
> > >
> > > + if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > + /* when the length field can't contain pages, set it to 0 to
> > 
> > /*
> >  * Multi-line
> >  * comments
> >  * should look like this.
> >  */
> > 
> > Also, pls start sentences with an upper-case letter.
> > 
> 
> Sorry for that.
> 
> > > +  * indicate the actual length is in the next __le64;
> > > +  */
> > 
> > This is part of the interface so should be documented as such.
> > 
> > > + *range = cpu_to_le64((base_pfn <<
> > > + VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > + *(range + 1) = cpu_to_le64(pages);
> > > + vb->resp_pos += 2;
> > 
> > Pls use structs for this kind of stuff.
> 
> I am not sure if you mean to use 
> 
> struct  range {
>   __le64 pfn: 52;
>   __le64 nr_page: 12
> }
> Instead of the shift operation?

Not just that. You want to add a pages field as well.

Generally describe the format in the header in some way
so host and guest can easily stay in sync.

All the pointer math and void * means we get zero type
safety and I'm not happy about it.


> I didn't use this way because I don't want to include 'virtio-balloon.h' in 
> page_alloc.c,
> or copy the define of this struct in page_alloc.c
> 
> Thanks!
> Liang


It's not good that virtio format seeps out to page_alloc anyway.
If unavoidable it is not a good idea to try to hide this fact,
people will assume they can change the format at will.

-- 
MST



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-17 Thread Li, Liang Z
> > -   virtqueue_add_outbuf(vq, , 1, vb, GFP_KERNEL);
> > -   virtqueue_kick(vq);
> > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > +   unsigned long base_pfn, int pages)
> >
> > -   /* When host has read buffer, this completes via balloon_ack */
> > -   wait_event(vb->acked, virtqueue_get_buf(vq, ));
> > +{
> > +   __le64 *range = vb->resp_data + vb->resp_pos;
> >
> > +   if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > +   /* when the length field can't contain pages, set it to 0 to
> 
> /*
>  * Multi-line
>  * comments
>  * should look like this.
>  */
> 
> Also, pls start sentences with an upper-case letter.
> 

Sorry for that.

> > +* indicate the actual length is in the next __le64;
> > +*/
> 
> This is part of the interface so should be documented as such.
> 
> > +   *range = cpu_to_le64((base_pfn <<
> > +   VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > +   *(range + 1) = cpu_to_le64(pages);
> > +   vb->resp_pos += 2;
> 
> Pls use structs for this kind of stuff.

I am not sure if you mean to use 

struct  range {
__le64 pfn: 52;
__le64 nr_page: 12
}
Instead of the shift operation?

I didn't use this way because I don't want to include 'virtio-balloon.h' in 
page_alloc.c,
or copy the define of this struct in page_alloc.c

Thanks!
Liang



Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2017-01-17 Thread Michael S. Tsirkin
On Wed, Dec 21, 2016 at 02:52:26PM +0800, Liang Li wrote:
>  
> - /* We should always be able to add one buffer to an empty queue. */
> - virtqueue_add_outbuf(vq, , 1, vb, GFP_KERNEL);
> - virtqueue_kick(vq);
> +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> + unsigned long base_pfn, int pages)
>  
> - /* When host has read buffer, this completes via balloon_ack */
> - wait_event(vb->acked, virtqueue_get_buf(vq, ));
> +{
> + __le64 *range = vb->resp_data + vb->resp_pos;
>  
> + if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> + /* when the length field can't contain pages, set it to 0 to

/*
 * Multi-line
 * comments
 * should look like this.
 */

Also, pls start sentences with an upper-case letter.

> +  * indicate the actual length is in the next __le64;
> +  */

This is part of the interface so should be documented as such.

> + *range = cpu_to_le64((base_pfn <<
> + VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> + *(range + 1) = cpu_to_le64(pages);
> + vb->resp_pos += 2;

Pls use structs for this kind of stuff.

> + } else {
> + *range = (base_pfn << VIRTIO_BALLOON_NR_PFN_BITS) | pages;
> + vb->resp_pos++;
> + }
> +}



[Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process

2016-12-20 Thread Liang Li
The implementation of the current virtio-balloon is not very
efficient, the time spends on different stages of inflating
the balloon to 7GB of a 8GB idle guest:

a. allocating pages (6.5%)
b. sending PFNs to host (68.3%)
c. address translation (6.1%)
d. madvise (19%)

It takes about 4126ms for the inflating process to complete.
Debugging shows that the bottle neck are the stage b and stage d.

If using {pfn|length} array to send the page info instead of the
PFNs, we can reduce the overhead in stage b quite a lot.
Furthermore, we can do the address translation and call madvise()
with a range of memory, instead of the current page per page way,
the overhead of stage c and stage d can also be reduced a lot.

This patch is the kernel side implementation which is intended to
speed up the inflating & deflating process by adding a new feature
to the virtio-balloon device. With this new feature, inflating the
balloon to 7GB of a 8GB idle guest only takes 590ms, the
performance improvement is about 85%.

TODO: optimize stage a by allocating/freeing a chunk of pages
instead of a single page at a time.

Signed-off-by: Liang Li 
Suggested-by: Michael S. Tsirkin 
Cc: Michael S. Tsirkin 
Cc: Paolo Bonzini 
Cc: Cornelia Huck 
Cc: Amit Shah 
Cc: Dave Hansen 
Cc: Andrea Arcangeli 
Cc: David Hildenbrand 
---
 drivers/virtio/virtio_balloon.c | 348 
 1 file changed, 320 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f59cb4f..03383b3 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+#define BALLOON_BMAP_SIZE  (8 * PAGE_SIZE)
+#define PFNS_PER_BMAP  (BALLOON_BMAP_SIZE * BITS_PER_BYTE)
+#define BALLOON_BMAP_COUNT 32
+
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
@@ -67,6 +71,20 @@ struct virtio_balloon {
 
/* Number of balloon pages we've told the Host we're not using. */
unsigned int num_pages;
+   /* Pointer to the response header. */
+   void *resp_hdr;
+   /* Pointer to the start address of response data. */
+   __le64 *resp_data;
+   /* Size of response data buffer. */
+   unsigned int resp_buf_size;
+   /* Pointer offset of the response data. */
+   unsigned int resp_pos;
+   /* Bitmap used to save the pfns info */
+   unsigned long *page_bitmap[BALLOON_BMAP_COUNT];
+   /* Number of split page bitmaps */
+   unsigned int nr_page_bmap;
+   /* Used to record the processed pfn range */
+   unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
/*
 * The pages we've told the Host we're not using are enqueued
 * at vb_dev_info->pages list.
@@ -110,20 +128,180 @@ static void balloon_ack(struct virtqueue *vq)
wake_up(>acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+static inline void init_bmap_pfn_range(struct virtio_balloon *vb)
 {
-   struct scatterlist sg;
+   vb->min_pfn = ULONG_MAX;
+   vb->max_pfn = 0;
+}
+
+static inline void update_bmap_pfn_range(struct virtio_balloon *vb,
+struct page *page)
+{
+   unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+   vb->min_pfn = min(balloon_pfn, vb->min_pfn);
+   vb->max_pfn = max(balloon_pfn, vb->max_pfn);
+}
+
+static void extend_page_bitmap(struct virtio_balloon *vb,
+   unsigned long nr_pfn)
+{
+   int i, bmap_count;
+   unsigned long bmap_len;
+
+   bmap_len = ALIGN(nr_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
+   bmap_len = ALIGN(bmap_len, BALLOON_BMAP_SIZE);
+   bmap_count = min((int)(bmap_len / BALLOON_BMAP_SIZE),
+BALLOON_BMAP_COUNT);
+
+   for (i = 1; i < bmap_count; i++) {
+   vb->page_bitmap[i] = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
+   if (vb->page_bitmap[i])
+   vb->nr_page_bmap++;
+   else
+   break;
+   }
+}
+
+static void free_extended_page_bitmap(struct virtio_balloon *vb)
+{
+   int i, bmap_count = vb->nr_page_bmap;
+
+   for (i = 1; i < bmap_count; i++) {
+   kfree(vb->page_bitmap[i]);
+   vb->page_bitmap[i] = NULL;
+   vb->nr_page_bmap--;
+   }
+}
+
+static void kfree_page_bitmap(struct virtio_balloon *vb)
+{
+   int i;
+
+   for (i = 0; i < vb->nr_page_bmap; i++)
+   kfree(vb->page_bitmap[i]);
+}
+
+static void clear_page_bitmap(struct virtio_balloon *vb)
+{
+   int i;
+
+   for