Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
> > +static void free_extended_page_bitmap(struct virtio_balloon *vb) {
> > +	int i, bmap_count = vb->nr_page_bmap;
> > +
> > +	for (i = 1; i < bmap_count; i++) {
> > +		kfree(vb->page_bitmap[i]);
> > +		vb->page_bitmap[i] = NULL;
> > +		vb->nr_page_bmap--;
> > +	}
> > +}
> > +
> > +static void kfree_page_bitmap(struct virtio_balloon *vb) {
> > +	int i;
> > +
> > +	for (i = 0; i < vb->nr_page_bmap; i++)
> > +		kfree(vb->page_bitmap[i]);
> > +}
>
> It might be worth commenting that pair of functions to make it clear why
> they are so different; I guess the kfree_page_bitmap is used just before you
> free the structure above it so you don't need to keep the count/pointers
> updated?
>

Yes. I will add some comments for that. Thanks!

Liang

> Dave
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
On Thu, Jan 19, 2017 at 01:44:36AM +, Li, Liang Z wrote:
> > > > > +		*range = cpu_to_le64((base_pfn <<
> > > > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > > +		*(range + 1) = cpu_to_le64(pages);
> > > > > +		vb->resp_pos += 2;
> > > >
> > > > Pls use structs for this kind of stuff.
> > >
> > > I am not sure if you mean to use
> > >
> > > struct range {
> > > 	__le64 pfn: 52;
> > > 	__le64 nr_page: 12
> > > }
> > > Instead of the shift operation?
> >
> > Not just that. You want to add a pages field as well.
> >
>
> pages field? Could you give more hints?

Well look how you are formatting it manually above.
There is clearly a structure with two 64 bit fields.

First one includes pfn and 0 (no idea why does | 0 make sense but
that's a separate issue).
Second one includes the pages value.

> > Generally describe the format in the header in some way so host and guest
> > can easily stay in sync.
>
> 'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
> related function in page_alloc.c as a parameter.
>
> Thanks!
> Liang
> > All the pointer math and void * means we get zero type safety and I'm not
> > happy about it.
> >
> > It's not good that virtio format seeps out to page_alloc anyway.
> > If unavoidable it is not a good idea to try to hide this fact, people will assume
> > they can change the format at will.
> >
> > --
> > MST
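One way to read the suggestion above is a record with two named 64-bit fields in place of `*range` and `*(range + 1)`. The following is a minimal userspace sketch, not code from the patch: `struct balloon_range`, `range_encode()`, `range_base_pfn()`, `range_pages()`, `to_le64()` and `from_le64()` are hypothetical names, `le64_t` stands in for `__le64`, and the 12-bit length width is an assumption taken from the `pfn: 52` / `nr_page: 12` split quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the low 12 bits of the first field carry the short length. */
#define RANGE_LEN_BITS 12u
#define RANGE_LEN_MASK ((UINT64_C(1) << RANGE_LEN_BITS) - 1)

typedef uint64_t le64_t;	/* value held in little-endian byte order */

/* Hypothetical record describing one range of pages. */
struct balloon_range {
	le64_t pfn_len;		/* (base_pfn << RANGE_LEN_BITS) | short length */
	le64_t ext_pages;	/* full page count, valid only if short length is 0 */
};

/* Userspace stand-in for cpu_to_le64(); works on any host byte order. */
static le64_t to_le64(uint64_t v)
{
	uint8_t b[8];
	le64_t out;

	for (int i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Userspace stand-in for le64_to_cpu(). */
static uint64_t from_le64(le64_t v)
{
	uint8_t b[8];
	uint64_t out = 0;

	memcpy(b, &v, sizeof(b));
	for (int i = 0; i < 8; i++)
		out |= (uint64_t)b[i] << (8 * i);
	return out;
}

/* Fills *r and returns how many 64-bit fields are meaningful (1 or 2). */
size_t range_encode(struct balloon_range *r, uint64_t base_pfn, uint64_t pages)
{
	/* The PFN must fit in the bits left over after the length field. */
	assert(base_pfn <= (UINT64_MAX >> RANGE_LEN_BITS));

	if (pages != 0 && pages <= RANGE_LEN_MASK) {
		r->pfn_len = to_le64((base_pfn << RANGE_LEN_BITS) | pages);
		r->ext_pages = to_le64(0);
		return 1;
	}
	/* A short length of 0 means the count is in the second field. */
	r->pfn_len = to_le64(base_pfn << RANGE_LEN_BITS);
	r->ext_pages = to_le64(pages);
	return 2;
}

uint64_t range_base_pfn(const struct balloon_range *r)
{
	return from_le64(r->pfn_len) >> RANGE_LEN_BITS;
}

uint64_t range_pages(const struct balloon_range *r)
{
	uint64_t len = from_le64(r->pfn_len) & RANGE_LEN_MASK;

	return len ? len : from_le64(r->ext_pages);
}
```

`range_encode()` returns 1 or 2 so that a caller could advance its position the way the patch advances `vb->resp_pos`. The sketch treats every count that does not fit in the 12-bit field, including exactly `1 << 12`, as the extended form.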
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
* Liang Li (liang.z...@intel.com) wrote:

> +static void free_extended_page_bitmap(struct virtio_balloon *vb)
> +{
> +	int i, bmap_count = vb->nr_page_bmap;
> +
> +	for (i = 1; i < bmap_count; i++) {
> +		kfree(vb->page_bitmap[i]);
> +		vb->page_bitmap[i] = NULL;
> +		vb->nr_page_bmap--;
> +	}
> +}
> +
> +static void kfree_page_bitmap(struct virtio_balloon *vb)
> +{
> +	int i;
> +
> +	for (i = 0; i < vb->nr_page_bmap; i++)
> +		kfree(vb->page_bitmap[i]);
> +}

It might be worth commenting that pair of functions to make it clear why
they are so different; I guess the kfree_page_bitmap is used just before you
free the structure above it so you don't need to keep the count/pointers
updated?

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
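The difference between the two functions can be made concrete with a small userspace model. This is a sketch under stated assumptions, not the patch itself: `struct balloon_model`, `model_init()` and `BMAP_BYTES` are hypothetical, `free()` stands in for `kfree()`, and the comments reflect the reading given in this message, which the author confirmed in his reply.

```c
#include <assert.h>
#include <stdlib.h>

#define BMAP_COUNT 32	/* same value as BALLOON_BMAP_COUNT in the patch */
#define BMAP_BYTES 64	/* arbitrary small size, for this model only */

/* Hypothetical stand-in for the bitmap fields of struct virtio_balloon. */
struct balloon_model {
	unsigned long *page_bitmap[BMAP_COUNT];
	unsigned int nr_page_bmap;
};

/* Hypothetical helper: allocate the first n bitmaps; returns 0 on success. */
int model_init(struct balloon_model *vb, unsigned int n)
{
	assert(n >= 1 && n <= BMAP_COUNT);

	vb->nr_page_bmap = 0;
	for (unsigned int i = 0; i < BMAP_COUNT; i++)
		vb->page_bitmap[i] = NULL;

	for (unsigned int i = 0; i < n; i++) {
		vb->page_bitmap[i] = calloc(1, BMAP_BYTES);
		if (!vb->page_bitmap[i])
			return -1;
		vb->nr_page_bmap++;
	}
	return 0;
}

/*
 * Drop only the extra bitmaps (index 1 and up). The structure stays in
 * use afterwards, so each pointer is cleared and the count is kept
 * accurate. Bitmap 0 survives.
 */
void free_extended_page_bitmap(struct balloon_model *vb)
{
	unsigned int i, bmap_count = vb->nr_page_bmap;

	for (i = 1; i < bmap_count; i++) {
		free(vb->page_bitmap[i]);
		vb->page_bitmap[i] = NULL;
		vb->nr_page_bmap--;
	}
}

/*
 * Teardown path: free every bitmap, including index 0. The caller is
 * expected to free the structure right after, so the pointers and the
 * count are deliberately left stale.
 */
void kfree_page_bitmap(struct balloon_model *vb)
{
	unsigned int i;

	for (i = 0; i < vb->nr_page_bmap; i++)
		free(vb->page_bitmap[i]);
}
```

After `free_extended_page_bitmap()` the model is still usable with one bitmap; after `kfree_page_bitmap()` it must not be touched again.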
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
> On Wed, Jan 18, 2017 at 04:56:58AM +, Li, Liang Z wrote:
> > > > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > > > -	virtqueue_kick(vq);
> > > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > > +		unsigned long base_pfn, int pages)
> > > >
> > > > -	/* When host has read buffer, this completes via balloon_ack */
> > > > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > > > +{
> > > > +	__le64 *range = vb->resp_data + vb->resp_pos;
> > > >
> > > > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > > +		/* when the length field can't contain pages, set it to 0 to
> > >
> > > /*
> > >  * Multi-line
> > >  * comments
> > >  * should look like this.
> > >  */
> > >
> > > Also, pls start sentences with an upper-case letter.
> > >
> >
> > Sorry for that.
> >
> > > > +		 * indicate the actual length is in the next __le64;
> > > > +		 */
> > >
> > > This is part of the interface so should be documented as such.
> > >
> > > > +		*range = cpu_to_le64((base_pfn <<
> > > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > > +		*(range + 1) = cpu_to_le64(pages);
> > > > +		vb->resp_pos += 2;
> > >
> > > Pls use structs for this kind of stuff.
> >
> > I am not sure if you mean to use
> >
> > struct range {
> > 	__le64 pfn: 52;
> > 	__le64 nr_page: 12
> > }
> > Instead of the shift operation?
>
> Not just that. You want to add a pages field as well.
>

pages field? Could you give more hints?

> Generally describe the format in the header in some way so host and guest
> can easily stay in sync.

'VIRTIO_BALLOON_NR_PFN_BITS' is for this purpose and it will be passed to the
related function in page_alloc.c as a parameter.

Thanks!
Liang

> All the pointer math and void * means we get zero type safety and I'm not
> happy about it.
>
> It's not good that virtio format seeps out to page_alloc anyway.
> If unavoidable it is not a good idea to try to hide this fact, people will assume
> they can change the format at will.
>
> --
> MST
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
On Wed, Jan 18, 2017 at 04:56:58AM +, Li, Liang Z wrote:
> > > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > > -	virtqueue_kick(vq);
> > > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > > +		unsigned long base_pfn, int pages)
> > >
> > > -	/* When host has read buffer, this completes via balloon_ack */
> > > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > > +{
> > > +	__le64 *range = vb->resp_data + vb->resp_pos;
> > >
> > > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > > +		/* when the length field can't contain pages, set it to 0 to
> >
> > /*
> >  * Multi-line
> >  * comments
> >  * should look like this.
> >  */
> >
> > Also, pls start sentences with an upper-case letter.
> >
>
> Sorry for that.
>
> > > +		 * indicate the actual length is in the next __le64;
> > > +		 */
> >
> > This is part of the interface so should be documented as such.
> >
> > > +		*range = cpu_to_le64((base_pfn <<
> > > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > > +		*(range + 1) = cpu_to_le64(pages);
> > > +		vb->resp_pos += 2;
> >
> > Pls use structs for this kind of stuff.
>
> I am not sure if you mean to use
>
> struct range {
> 	__le64 pfn: 52;
> 	__le64 nr_page: 12
> }
> Instead of the shift operation?

Not just that. You want to add a pages field as well.

Generally describe the format in the header in some way so host and guest
can easily stay in sync. All the pointer math and void * means we get zero type
safety and I'm not happy about it.

> I didn't use this way because I don't want to include 'virtio-balloon.h' in page_alloc.c,
> or copy the define of this struct in page_alloc.c
>
> Thanks!
> Liang

It's not good that virtio format seeps out to page_alloc anyway.
If unavoidable it is not a good idea to try to hide this fact, people will assume
they can change the format at will.

--
MST
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
> > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > -	virtqueue_kick(vq);
> > +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> > +		unsigned long base_pfn, int pages)
> >
> > -	/* When host has read buffer, this completes via balloon_ack */
> > -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > +{
> > +	__le64 *range = vb->resp_data + vb->resp_pos;
> >
> > +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> > +		/* when the length field can't contain pages, set it to 0 to
>
> /*
>  * Multi-line
>  * comments
>  * should look like this.
>  */
>
> Also, pls start sentences with an upper-case letter.
>

Sorry for that.

> > +		 * indicate the actual length is in the next __le64;
> > +		 */
>
> This is part of the interface so should be documented as such.
>
> > +		*range = cpu_to_le64((base_pfn <<
> > +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> > +		*(range + 1) = cpu_to_le64(pages);
> > +		vb->resp_pos += 2;
>
> Pls use structs for this kind of stuff.

I am not sure if you mean to use

struct range {
	__le64 pfn: 52;
	__le64 nr_page: 12
}
Instead of the shift operation?

I didn't use this way because I don't want to include 'virtio-balloon.h' in page_alloc.c,
or copy the define of this struct in page_alloc.c

Thanks!
Liang
Re: [Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
On Wed, Dec 21, 2016 at 02:52:26PM +0800, Liang Li wrote:
>
> -	/* We should always be able to add one buffer to an empty queue. */
> -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> -	virtqueue_kick(vq);
> +static void do_set_resp_bitmap(struct virtio_balloon *vb,
> +		unsigned long base_pfn, int pages)
>
> -	/* When host has read buffer, this completes via balloon_ack */
> -	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> +{
> +	__le64 *range = vb->resp_data + vb->resp_pos;
>
> +	if (pages > (1 << VIRTIO_BALLOON_NR_PFN_BITS)) {
> +		/* when the length field can't contain pages, set it to 0 to

/*
 * Multi-line
 * comments
 * should look like this.
 */

Also, pls start sentences with an upper-case letter.

> +		 * indicate the actual length is in the next __le64;
> +		 */

This is part of the interface so should be documented as such.

> +		*range = cpu_to_le64((base_pfn <<
> +				VIRTIO_BALLOON_NR_PFN_BITS) | 0);
> +		*(range + 1) = cpu_to_le64(pages);
> +		vb->resp_pos += 2;

Pls use structs for this kind of stuff.

> +	} else {
> +		*range = (base_pfn << VIRTIO_BALLOON_NR_PFN_BITS) | pages;
> +		vb->resp_pos++;
> +	}
> +}
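The zero-length convention reviewed above matters to whoever reads the buffer, which is why it counts as interface. A minimal sketch of a hypothetical reader follows; it is not taken from the patch or from QEMU. `decode_ranges()` and `struct pfn_range` are invented names, the words are assumed to be already converted to CPU byte order, and the 12-bit length width is an assumption based on the `pfn: 52` / `nr_page: 12` split mentioned elsewhere in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the low 12 bits of a word carry the short length. */
#define LEN_BITS 12u
#define LEN_MASK ((UINT64_C(1) << LEN_BITS) - 1)

/* Hypothetical decoded form of one entry. */
struct pfn_range {
	uint64_t base_pfn;
	uint64_t pages;
};

/*
 * Walks words[0..nwords) and fills out[] with up to max_out ranges.
 * An entry takes one word, or two when its short length is 0.
 * Returns the number of ranges decoded, or -1 if the stream ends in the
 * middle of a two-word entry or out[] is too small.
 */
long decode_ranges(const uint64_t *words, size_t nwords,
		   struct pfn_range *out, size_t max_out)
{
	size_t pos = 0, n = 0;

	assert(words != NULL && out != NULL);

	while (pos < nwords) {
		uint64_t w = words[pos++];
		uint64_t len = w & LEN_MASK;

		if (len == 0) {
			/* Extended form: the count is in the next word. */
			if (pos == nwords)
				return -1;
			len = words[pos++];
		}
		if (n == max_out)
			return -1;

		out[n].base_pfn = w >> LEN_BITS;
		out[n].pages = len;
		n++;
	}
	return (long)n;
}
```

A reader that did not know about the extended form would treat the second word of a two-word entry as a new entry, so writer and reader have to agree on the rule.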
[Qemu-devel] [PATCH v6 kernel 3/5] virtio-balloon: speed up inflate/deflate process
The implementation of the current virtio-balloon is not very
efficient, the time spends on different stages of inflating
the balloon to 7GB of a 8GB idle guest:

a. allocating pages (6.5%)
b. sending PFNs to host (68.3%)
c. address translation (6.1%)
d. madvise (19%)

It takes about 4126ms for the inflating process to complete.
Debugging shows that the bottle neck are the stage b and stage d.

If using {pfn|length} array to send the page info instead of the
PFNs, we can reduce the overhead in stage b quite a lot.
Furthermore, we can do the address translation and call madvise()
with a range of memory, instead of the current page per page way,
the overhead of stage c and stage d can also be reduced a lot.

This patch is the kernel side implementation which is intended to
speed up the inflating & deflating process by adding a new feature
to the virtio-balloon device. With this new feature, inflating the
balloon to 7GB of a 8GB idle guest only takes 590ms, the
performance improvement is about 85%.

TODO: optimize stage a by allocating/freeing a chunk of pages
instead of a single page at a time.

Signed-off-by: Liang Li
Suggested-by: Michael S. Tsirkin
Cc: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: Cornelia Huck
Cc: Amit Shah
Cc: Dave Hansen
Cc: Andrea Arcangeli
Cc: David Hildenbrand
---
 drivers/virtio/virtio_balloon.c | 348
 1 file changed, 320 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f59cb4f..03383b3 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -42,6 +42,10 @@
 #define OOM_VBALLOON_DEFAULT_PAGES 256
 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
 
+#define BALLOON_BMAP_SIZE	(8 * PAGE_SIZE)
+#define PFNS_PER_BMAP	(BALLOON_BMAP_SIZE * BITS_PER_BYTE)
+#define BALLOON_BMAP_COUNT	32
+
 static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
 module_param(oom_pages, int, S_IRUSR | S_IWUSR);
 MODULE_PARM_DESC(oom_pages, "pages to free on OOM");
@@ -67,6 +71,20 @@ struct virtio_balloon {
 
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
+	/* Pointer to the response header. */
+	void *resp_hdr;
+	/* Pointer to the start address of response data. */
+	__le64 *resp_data;
+	/* Size of response data buffer. */
+	unsigned int resp_buf_size;
+	/* Pointer offset of the response data. */
+	unsigned int resp_pos;
+	/* Bitmap used to save the pfns info */
+	unsigned long *page_bitmap[BALLOON_BMAP_COUNT];
+	/* Number of split page bitmaps */
+	unsigned int nr_page_bmap;
+	/* Used to record the processed pfn range */
+	unsigned long min_pfn, max_pfn, start_pfn, end_pfn;
 	/*
 	 * The pages we've told the Host we're not using are enqueued
 	 * at vb_dev_info->pages list.
@@ -110,20 +128,180 @@ static void balloon_ack(struct virtqueue *vq)
 	wake_up(&vb->acked);
 }
 
-static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
+static inline void init_bmap_pfn_range(struct virtio_balloon *vb)
 {
-	struct scatterlist sg;
+	vb->min_pfn = ULONG_MAX;
+	vb->max_pfn = 0;
+}
+
+static inline void update_bmap_pfn_range(struct virtio_balloon *vb,
+				 struct page *page)
+{
+	unsigned long balloon_pfn = page_to_balloon_pfn(page);
+
+	vb->min_pfn = min(balloon_pfn, vb->min_pfn);
+	vb->max_pfn = max(balloon_pfn, vb->max_pfn);
+}
+
+static void extend_page_bitmap(struct virtio_balloon *vb,
+				unsigned long nr_pfn)
+{
+	int i, bmap_count;
+	unsigned long bmap_len;
+
+	bmap_len = ALIGN(nr_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
+	bmap_len = ALIGN(bmap_len, BALLOON_BMAP_SIZE);
+	bmap_count = min((int)(bmap_len / BALLOON_BMAP_SIZE),
+				 BALLOON_BMAP_COUNT);
+
+	for (i = 1; i < bmap_count; i++) {
+		vb->page_bitmap[i] = kmalloc(BALLOON_BMAP_SIZE, GFP_KERNEL);
+		if (vb->page_bitmap[i])
+			vb->nr_page_bmap++;
+		else
+			break;
+	}
+}
+
+static void free_extended_page_bitmap(struct virtio_balloon *vb)
+{
+	int i, bmap_count = vb->nr_page_bmap;
+
+	for (i = 1; i < bmap_count; i++) {
+		kfree(vb->page_bitmap[i]);
+		vb->page_bitmap[i] = NULL;
+		vb->nr_page_bmap--;
+	}
+}
+
+static void kfree_page_bitmap(struct virtio_balloon *vb)
+{
+	int i;
+
+	for (i = 0; i < vb->nr_page_bmap; i++)
+		kfree(vb->page_bitmap[i]);
+}
+
+static void clear_page_bitmap(struct virtio_balloon *vb)
+{
+	int i;
+
+	for