[PATCH v12 0/7] make balloon pages movable by compaction
Memory fragmentation introduced by ballooning can significantly reduce the number of 2MB contiguous memory blocks available within a guest, imposing performance penalties associated with the reduced number of transparent huge pages the guest workload could otherwise use. This patch set follows the main idea discussed at the 2012 LSF/MM session "Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/ -- to introduce the required changes to the virtio_balloon driver, as well as to the core compaction/migration bits, in order to make those subsystems aware of ballooned pages and allow memory balloon pages to become movable within a guest, thus avoiding the aforementioned fragmentation issue.

The following numbers demonstrate the benefit of this series in allowing compaction to be more effective in memory-ballooned guests. Results for the STRESS-HIGHALLOC benchmark, from Mel Gorman's mmtests suite, running on a 4GB RAM KVM guest which was ballooning 512MB RAM in 64MB chunks, at every minute (inflating/deflating), while the test was running:

===BEGIN stress-highalloc
STRESS-HIGHALLOC
                 highalloc-3.7   highalloc-3.7
                     rc4-clean       rc4-patch
Pass 1          55.00 ( 0.00%)  62.00 ( 7.00%)
Pass 2          54.00 ( 0.00%)  62.00 ( 8.00%)
while Rested    75.00 ( 0.00%)  80.00 ( 5.00%)

MMTests Statistics: duration
                   3.7         3.7
             rc4-clean   rc4-patch
User           1207.59     1207.46
System         1300.55     1299.61
Elapsed        2273.72     2157.06

MMTests Statistics: vmstat
                                 3.7        3.7
                           rc4-clean  rc4-patch
Page Ins                     3581516    2374368
Page Outs                   11148692   10410332
Swap Ins                          80         47
Swap Outs                       3641        476
Direct pages scanned           37978      33826
Kswapd pages scanned         1828245    1342869
Kswapd pages reclaimed       1710236    1304099
Direct pages reclaimed         32207      31005
Kswapd efficiency                93%        97%
Kswapd velocity              804.077    622.546
Direct efficiency                84%        91%
Direct velocity               16.703     15.682
Percentage direct scans           2%         2%
Page writes by reclaim         79252       9704
Page writes file               75611       9228
Page writes anon                3641        476
Page reclaim immediate         16764      11014
Page rescued immediate             0          0
Slabs scanned                2171904    2152448
Direct inode steals          3852261
Kswapd inode steals           659137     609670
Kswapd skipped wait                1         69
THP fault alloc                  546        631
THP collapse alloc               361        339
THP splits                       259        263
THP fault fallback                98         50
THP collapse fail                 20         17
Compaction stalls                747        499
Compaction success               244        145
Compaction failures              503        354
Compaction pages moved        370888     474837
Compaction move failure        77378      65259
===END stress-highalloc

Rafael Aquini (7):
  mm: adjust address_space_operations.migratepage() return code
  mm: redefine address_space.assoc_mapping
  mm: introduce a common interface for balloon pages mobility
  mm: introduce compaction and migration for ballooned pages
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce putback_movable_pages()
  mm: add vm event counters for balloon pages compaction

 drivers/virtio/virtio_balloon.c    | 139 +++--
 fs/buffer.c                        |  12 +-
 fs/gfs2/glock.c                    |   2 +-
 fs/hugetlbfs/inode.c               |   4 +-
 fs/inode.c                         |   2 +-
 fs/nilfs2/page.c                   |   2 +-
 include/linux/balloon_compaction.h | 263
 include/linux/fs.h                 |   2 +-
 include/linux/migrate.h            |  19 +++
 include/linux/pagemap.h            |  16 ++
 include/linux/vm_event_item.h      |   7 +-
 mm/Kconfig                         |  15 ++
 mm/Makefile                        |   3 +-
 mm/balloon_compaction.c            | 304 +
 mm/compaction.c                    |  27 +++-
 mm/migrate.c                       |  86 ---
 mm/page_alloc.c                    |   2 +-
 mm/vmstat.c                        |   9 +-
 18 files changed, 862 insertions(+), 52 deletions(-)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

Change log:
v12:
 * Address last suggestions on sorting the barriers usage out (Mel Gorman);
 * Fix reported build breakages for CONFIG_BALLOON_COMPACTION=n (Andrew Morton);
 * Enhance commentary on the locking scheme used for balloon page compaction;
 * Move all the 'balloon
[PATCH v12 1/7] mm: adjust address_space_operations.migratepage() return code
This patch introduces MIGRATEPAGE_SUCCESS as the default return code for address_space_operations.migratepage() method and documents the expected return code for the same method in failure cases. Signed-off-by: Rafael Aquini aqu...@redhat.com --- fs/hugetlbfs/inode.c| 4 ++-- include/linux/migrate.h | 7 +++ mm/migrate.c| 33 +++-- 3 files changed, 24 insertions(+), 20 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 14bc0c1..fed1cd5 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -582,11 +582,11 @@ static int hugetlbfs_migrate_page(struct address_space *mapping, int rc; rc = migrate_huge_page_move_mapping(mapping, newpage, page); - if (rc) + if (rc != MIGRATEPAGE_SUCCESS) return rc; migrate_page_copy(newpage, page); - return 0; + return MIGRATEPAGE_SUCCESS; } static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 9a5afea..fab15ae 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -7,6 +7,13 @@ typedef struct page *new_page_t(struct page *, unsigned long private, int **); +/* + * Return values from addresss_space_operations.migratepage(): + * - negative errno on page migration failure; + * - zero on page migration success; + */ +#define MIGRATEPAGE_SUCCESS0 + #ifdef CONFIG_MIGRATION extern void putback_lru_pages(struct list_head *l); diff --git a/mm/migrate.c b/mm/migrate.c index 0c5ec37..6f408c7 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -286,7 +286,7 @@ static int migrate_page_move_mapping(struct address_space *mapping, expected_count += 1; if (page_count(page) != expected_count) return -EAGAIN; - return 0; + return MIGRATEPAGE_SUCCESS; } spin_lock_irq(mapping-tree_lock); @@ -356,7 +356,7 @@ static int migrate_page_move_mapping(struct address_space *mapping, } spin_unlock_irq(mapping-tree_lock); - return 0; + return MIGRATEPAGE_SUCCESS; } /* @@ -372,7 +372,7 @@ int migrate_huge_page_move_mapping(struct address_space 
*mapping, if (!mapping) { if (page_count(page) != 1) return -EAGAIN; - return 0; + return MIGRATEPAGE_SUCCESS; } spin_lock_irq(mapping-tree_lock); @@ -399,7 +399,7 @@ int migrate_huge_page_move_mapping(struct address_space *mapping, page_unfreeze_refs(page, expected_count - 1); spin_unlock_irq(mapping-tree_lock); - return 0; + return MIGRATEPAGE_SUCCESS; } /* @@ -486,11 +486,11 @@ int migrate_page(struct address_space *mapping, rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode); - if (rc) + if (rc != MIGRATEPAGE_SUCCESS) return rc; migrate_page_copy(newpage, page); - return 0; + return MIGRATEPAGE_SUCCESS; } EXPORT_SYMBOL(migrate_page); @@ -513,7 +513,7 @@ int buffer_migrate_page(struct address_space *mapping, rc = migrate_page_move_mapping(mapping, newpage, page, head, mode); - if (rc) + if (rc != MIGRATEPAGE_SUCCESS) return rc; /* @@ -549,7 +549,7 @@ int buffer_migrate_page(struct address_space *mapping, } while (bh != head); - return 0; + return MIGRATEPAGE_SUCCESS; } EXPORT_SYMBOL(buffer_migrate_page); #endif @@ -628,7 +628,7 @@ static int fallback_migrate_page(struct address_space *mapping, * * Return value: *0 - error code - * == 0 - success + * MIGRATEPAGE_SUCCESS - success */ static int move_to_new_page(struct page *newpage, struct page *page, int remap_swapcache, enum migrate_mode mode) @@ -665,7 +665,7 @@ static int move_to_new_page(struct page *newpage, struct page *page, else rc = fallback_migrate_page(mapping, newpage, page, mode); - if (rc) { + if (rc != MIGRATEPAGE_SUCCESS) { newpage-mapping = NULL; } else { if (remap_swapcache) @@ -814,7 +814,7 @@ skip_unmap: put_anon_vma(anon_vma); uncharge: - mem_cgroup_end_migration(mem, page, newpage, rc == 0); + mem_cgroup_end_migration(mem, page, newpage, rc == MIGRATEPAGE_SUCCESS); unlock: unlock_page(page); out: @@ -987,7 +987,7 @@ int migrate_pages(struct list_head *from, case -EAGAIN: retry++; break; - case 0: + case MIGRATEPAGE_SUCCESS: break; default: /* Permanent failure */ @@ -996,15 
+996,12 @@ int migrate_pages(struct list_head *from,
[PATCH v12 2/7] mm: redefine address_space.assoc_mapping
This patch overhauls struct address_space.assoc_mapping renaming it to address_space.private_data and its type is redefined to void*. By this approach we consistently name the .private_* elements from struct address_space as well as allow extended usage for address_space association with other data structures through -private_data. Also, all users of old -assoc_mapping element are converted to reflect its new name and type change (-private_data). Signed-off-by: Rafael Aquini aqu...@redhat.com --- fs/buffer.c| 12 ++-- fs/gfs2/glock.c| 2 +- fs/inode.c | 2 +- fs/nilfs2/page.c | 2 +- include/linux/fs.h | 2 +- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index b5f0442..e0bad95 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -555,7 +555,7 @@ void emergency_thaw_all(void) */ int sync_mapping_buffers(struct address_space *mapping) { - struct address_space *buffer_mapping = mapping-assoc_mapping; + struct address_space *buffer_mapping = mapping-private_data; if (buffer_mapping == NULL || list_empty(mapping-private_list)) return 0; @@ -588,10 +588,10 @@ void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode) struct address_space *buffer_mapping = bh-b_page-mapping; mark_buffer_dirty(bh); - if (!mapping-assoc_mapping) { - mapping-assoc_mapping = buffer_mapping; + if (!mapping-private_data) { + mapping-private_data = buffer_mapping; } else { - BUG_ON(mapping-assoc_mapping != buffer_mapping); + BUG_ON(mapping-private_data != buffer_mapping); } if (!bh-b_assoc_map) { spin_lock(buffer_mapping-private_lock); @@ -788,7 +788,7 @@ void invalidate_inode_buffers(struct inode *inode) if (inode_has_buffers(inode)) { struct address_space *mapping = inode-i_data; struct list_head *list = mapping-private_list; - struct address_space *buffer_mapping = mapping-assoc_mapping; + struct address_space *buffer_mapping = mapping-private_data; spin_lock(buffer_mapping-private_lock); while (!list_empty(list)) @@ -811,7 +811,7 @@ int 
remove_inode_buffers(struct inode *inode) if (inode_has_buffers(inode)) { struct address_space *mapping = inode-i_data; struct list_head *list = mapping-private_list; - struct address_space *buffer_mapping = mapping-assoc_mapping; + struct address_space *buffer_mapping = mapping-private_data; spin_lock(buffer_mapping-private_lock); while (!list_empty(list)) { diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 6114571..904a808 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -766,7 +766,7 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number, mapping-host = s-s_bdev-bd_inode; mapping-flags = 0; mapping_set_gfp_mask(mapping, GFP_NOFS); - mapping-assoc_mapping = NULL; + mapping-private_data = NULL; mapping-backing_dev_info = s-s_bdi; mapping-writeback_index = 0; } diff --git a/fs/inode.c b/fs/inode.c index b03c719..4cac8e1 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -165,7 +165,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode) mapping-host = inode; mapping-flags = 0; mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE); - mapping-assoc_mapping = NULL; + mapping-private_data = NULL; mapping-backing_dev_info = default_backing_dev_info; mapping-writeback_index = 0; diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c index 3e7b2a0..07f76db 100644 --- a/fs/nilfs2/page.c +++ b/fs/nilfs2/page.c @@ -431,7 +431,7 @@ void nilfs_mapping_init(struct address_space *mapping, struct inode *inode, mapping-host = inode; mapping-flags = 0; mapping_set_gfp_mask(mapping, GFP_NOFS); - mapping-assoc_mapping = NULL; + mapping-private_data = NULL; mapping-backing_dev_info = bdi; mapping-a_ops = empty_aops; } diff --git a/include/linux/fs.h b/include/linux/fs.h index b33cfc9..0982565 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -418,7 +418,7 @@ struct address_space { struct backing_dev_info *backing_dev_info; /* device readahead, etc */ spinlock_t private_lock; /* for use by the address_space */ struct list_headprivate_list; /* ditto */ - struct 
address_space*assoc_mapping; /* ditto */ + void*private_data; /* ditto */ } __attribute__((aligned(sizeof(long; /* * On most architectures that alignment is already the case; but -- 1.7.11.7 ___ Virtualization mailing
[PATCH v12 3/7] mm: introduce a common interface for balloon pages mobility
Memory fragmentation introduced by ballooning might significantly reduce the number of 2MB contiguous memory blocks that can be used within a guest, thus imposing performance penalties associated with the reduced number of transparent huge pages that could be used by the guest workload.

This patch introduces a common interface to help a balloon driver make its page set movable by compaction, thus allowing the system to better leverage the compaction efforts on memory defragmentation.

Signed-off-by: Rafael Aquini aqu...@redhat.com
Acked-by: Mel Gorman m...@csn.ul.ie
---
 include/linux/balloon_compaction.h | 256 +++
 include/linux/migrate.h            |  10 ++
 include/linux/pagemap.h            |  16 ++
 mm/Kconfig                         |  15 ++
 mm/Makefile                        |   3 +-
 mm/balloon_compaction.c            | 302 +
 6 files changed, 601 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
new file mode 100644
index 000..2e63d94
--- /dev/null
+++ b/include/linux/balloon_compaction.h
@@ -0,0 +1,256 @@
+/*
+ * include/linux/balloon_compaction.h
+ *
+ * Common interface definitions for making balloon pages movable by compaction.
+ *
+ * Despite being perfectly possible to perform ballooned pages migration, they
+ * make a special corner case to compaction scans because balloon pages are not
+ * enlisted at any LRU list like the other pages we do compact / migrate.
+ *
+ * As the page isolation scanning step a compaction thread does is a lockless
+ * procedure (from a page standpoint), it might bring some racy situations while
+ * performing balloon page compaction. In order to sort out these racy scenarios
+ * and safely perform balloon's page compaction and migration we must, always,
+ * ensure following these three simple rules:
+ *
+ *   i. when updating a balloon's page ->mapping element, strictly do it under
+ *      the following lock order, independently of the far superior
+ *      locking scheme (lru_lock, balloon_lock):
+ *          +-page_lock(page);
+ *            +--spin_lock_irq(&b_dev_info->pages_lock);
+ *                ... page->mapping updates here ...
+ *
+ *  ii. before isolating or dequeueing a balloon page from the balloon device
+ *      pages list, the page reference counter must be raised by one and the
+ *      extra refcount must be dropped when the page is enqueued back into
+ *      the balloon device page list, thus a balloon page keeps its reference
+ *      counter raised only while it is under our special handling;
+ *
+ * iii. after the lockless scan step has selected a potential balloon page for
+ *      isolation, re-test the page->mapping flags and the page ref counter
+ *      under the proper page lock, to ensure isolating a valid balloon page
+ *      (not yet isolated, nor under release procedure)
+ *
+ * The functions provided by this interface are placed to help on coping with
+ * the aforementioned balloon page corner case, as well as to ensure the simple
+ * set of exposed rules are satisfied while we are dealing with balloon pages
+ * compaction / migration.
+ *
+ * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini aqu...@redhat.com
+ */
+#ifndef _LINUX_BALLOON_COMPACTION_H
+#define _LINUX_BALLOON_COMPACTION_H
+#include <linux/pagemap.h>
+#include <linux/migrate.h>
+#include <linux/gfp.h>
+#include <linux/err.h>
+
+/*
+ * Balloon device information descriptor.
+ * This struct is used to allow the common balloon compaction interface
+ * procedures to find the proper balloon device holding memory pages they'll
+ * have to cope for page compaction / migration, as well as it serves the
+ * balloon driver as a page book-keeper for its registered balloon devices.
+ */
+struct balloon_dev_info {
+	void *balloon_device;		/* balloon device descriptor */
+	struct address_space *mapping;	/* balloon special page->mapping */
+	unsigned long isolated_pages;	/* # of isolated pages for migration */
+	spinlock_t pages_lock;		/* Protection to pages list */
+	struct list_head pages;		/* Pages enqueued & handed to Host */
+};
+
+extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
+extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info);
+extern struct balloon_dev_info *balloon_devinfo_alloc(
+			void *balloon_dev_descriptor);
+
+static inline void balloon_devinfo_free(struct balloon_dev_info *b_dev_info)
+{
+	kfree(b_dev_info);
+}
+
+/*
+ * balloon_page_free - release a balloon page back to the page free lists
+ * @page: ballooned page to be set free
+ *
+ * This function must be used to properly set free an isolated/dequeued balloon
+ * page at the end of a successful page migration, or
[PATCH v12 4/7] mm: introduce compaction and migration for ballooned pages
Memory fragmentation introduced by ballooning might significantly reduce the number of 2MB contiguous memory blocks that can be used within a guest, thus imposing performance penalties associated with the reduced number of transparent huge pages that could be used by the guest workload.

This patch introduces the helper functions as well as the necessary changes to teach compaction and migration bits how to cope with pages which are part of a guest memory balloon, in order to make them movable by memory compaction procedures.

Signed-off-by: Rafael Aquini aqu...@redhat.com
Acked-by: Mel Gorman m...@csn.ul.ie
---
 mm/compaction.c | 21 +++--
 mm/migrate.c    | 34 --
 2 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9eef558..76abd84 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/balloon_compaction.h>
 #include "internal.h"

 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -565,9 +566,24 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			goto next_pageblock;
 		}

-		/* Check may be lockless but that's ok as we recheck later */
-		if (!PageLRU(page))
+		/*
+		 * Check may be lockless but that's ok as we recheck later.
+		 * It's possible to migrate LRU pages and balloon pages
+		 * Skip any other type of page
+		 */
+		if (!PageLRU(page)) {
+			if (unlikely(balloon_page_movable(page))) {
+				if (locked && balloon_page_isolate(page)) {
+					/* Successfully isolated */
+					cc->finished_update_migrate = true;
+					list_add(&page->lru, migratelist);
+					cc->nr_migratepages++;
+					nr_isolated++;
+					goto check_compact_cluster;
+				}
+			}
 			continue;
+		}

 		/*
 		 * PageLRU is set. lru_lock normally excludes isolation
@@ -621,6 +637,7 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 		cc->nr_migratepages++;
 		nr_isolated++;

+check_compact_cluster:
 		/* Avoid isolating too much */
 		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX) {
 			++low_pfn;
diff --git a/mm/migrate.c b/mm/migrate.c
index 6f408c7..a771751 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -35,6 +35,7 @@
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
+#include <linux/balloon_compaction.h>

 #include <asm/tlbflush.h>

@@ -79,7 +80,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(balloon_page_movable(page)))
+			balloon_page_putback(page);
+		else
+			putback_lru_page(page);
 	}
 }

@@ -778,6 +782,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	}

+	if (unlikely(balloon_page_movable(page))) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level, and perform
+		 * the page migration right away (protected by page lock).
+		 */
+		rc = balloon_page_migrate(newpage, page, mode);
+		goto uncharge;
+	}
+
 	/*
 	 * Corner case handling:
 	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -814,7 +830,9 @@ skip_unmap:
 		put_anon_vma(anon_vma);

 uncharge:
-	mem_cgroup_end_migration(mem, page, newpage, rc == MIGRATEPAGE_SUCCESS);
+	mem_cgroup_end_migration(mem, page, newpage,
+				 (rc == MIGRATEPAGE_SUCCESS ||
+				  rc == MIGRATEPAGE_BALLOON_SUCCESS));
 unlock:
 	unlock_page(page);
 out:
@@ -846,6 +864,18 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 		goto out;

 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (unlikely(rc == MIGRATEPAGE_BALLOON_SUCCESS)) {
+		/*
+		 * A ballooned page has been
[PATCH v12 5/7] virtio_balloon: introduce migration primitives to balloon pages
Memory fragmentation introduced by ballooning might reduce significantly the number of 2MB contiguous memory blocks that can be used within a guest, thus imposing performance penalties associated with the reduced number of transparent huge pages that could be used by the guest workload. Besides making balloon pages movable at allocation time and introducing the necessary primitives to perform balloon page migration/compaction, this patch also introduces the following locking scheme, in order to enhance the syncronization methods for accessing elements of struct virtio_balloon, thus providing protection against concurrent access introduced by parallel memory migration threads. - balloon_lock (mutex) : synchronizes the access demand to elements of struct virtio_balloon and its queue operations; Signed-off-by: Rafael Aquini aqu...@redhat.com Acked-by: Michael S. Tsirkin m...@redhat.com --- drivers/virtio/virtio_balloon.c | 139 1 file changed, 127 insertions(+), 12 deletions(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index 586395c..8f92ab7 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -27,6 +27,7 @@ #include linux/delay.h #include linux/slab.h #include linux/module.h +#include linux/balloon_compaction.h /* * Balloon device works in 4K page units. So each page is pointed to by @@ -34,6 +35,7 @@ * page units. */ #define VIRTIO_BALLOON_PAGES_PER_PAGE (PAGE_SIZE VIRTIO_BALLOON_PFN_SHIFT) +#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256 struct virtio_balloon { @@ -52,15 +54,19 @@ struct virtio_balloon /* Number of balloon pages we've told the Host we're not using. */ unsigned int num_pages; /* -* The pages we've told the Host we're not using. +* The pages we've told the Host we're not using are enqueued +* at vb_dev_info-pages list. * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE * to num_pages above. 
*/ - struct list_head pages; + struct balloon_dev_info *vb_dev_info; + + /* Synchronize access/update to this struct virtio_balloon elements */ + struct mutex balloon_lock; /* The array of pfns we tell the Host about. */ unsigned int num_pfns; - u32 pfns[256]; + u32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX]; /* Memory statistics */ int need_stats_update; @@ -122,17 +128,20 @@ static void set_page_pfns(u32 pfns[], struct page *page) static void fill_balloon(struct virtio_balloon *vb, size_t num) { + struct balloon_dev_info *vb_dev_info = vb-vb_dev_info; + /* We can only do one array worth at a time. */ num = min(num, ARRAY_SIZE(vb-pfns)); + mutex_lock(vb-balloon_lock); for (vb-num_pfns = 0; vb-num_pfns num; vb-num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) { - struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY | - __GFP_NOMEMALLOC | __GFP_NOWARN); + struct page *page = balloon_page_enqueue(vb_dev_info); + if (!page) { dev_info_ratelimited(vb-vdev-dev, Out of puff! Can't get %zu pages\n, -num); +VIRTIO_BALLOON_PAGES_PER_PAGE); /* Sleep for at least 1/5 of a second before retry. */ msleep(200); break; @@ -140,7 +149,6 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num) set_page_pfns(vb-pfns + vb-num_pfns, page); vb-num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE; totalram_pages--; - list_add(page-lru, vb-pages); } /* Didn't get any? Oh well. */ @@ -148,6 +156,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num) return; tell_host(vb, vb-inflate_vq); + mutex_unlock(vb-balloon_lock); } static void release_pages_by_pfn(const u32 pfns[], unsigned int num) @@ -156,7 +165,7 @@ static void release_pages_by_pfn(const u32 pfns[], unsigned int num) /* Find pfns pointing at start of each page, get pages and free them. 
*/ for (i = 0; i num; i += VIRTIO_BALLOON_PAGES_PER_PAGE) { - __free_page(balloon_pfn_to_page(pfns[i])); + balloon_page_free(balloon_pfn_to_page(pfns[i])); totalram_pages++; } } @@ -164,14 +173,17 @@ static void release_pages_by_pfn(const u32 pfns[], unsigned int num) static void leak_balloon(struct virtio_balloon *vb, size_t num) { struct page *page; + struct balloon_dev_info *vb_dev_info = vb-vb_dev_info; /* We can only do one array worth at a time. */ num = min(num, ARRAY_SIZE(vb-pfns));
[PATCH v12 7/7] mm: add vm event counters for balloon pages compaction
This patch introduces a new set of vm event counters to keep track of ballooned pages compaction activity. Signed-off-by: Rafael Aquini aqu...@redhat.com --- include/linux/balloon_compaction.h | 7 +++ include/linux/vm_event_item.h | 7 ++- mm/balloon_compaction.c| 2 ++ mm/migrate.c | 1 + mm/vmstat.c| 9 - 5 files changed, 24 insertions(+), 2 deletions(-) diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h index 2e63d94..68893bc 100644 --- a/include/linux/balloon_compaction.h +++ b/include/linux/balloon_compaction.h @@ -197,8 +197,15 @@ static inline bool balloon_compaction_check(void) return true; } +static inline void balloon_event_count(enum vm_event_item item) +{ + count_vm_event(item); +} #else /* !CONFIG_BALLOON_COMPACTION */ +/* A macro, to avoid generating references to the undefined COMPACTBALLOON* */ +#define balloon_event_count(item) do { } while (0) + static inline void *balloon_mapping_alloc(void *balloon_device, const struct address_space_operations *a_ops) { diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 3d31145..bd67c3f 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -41,7 +41,12 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, #ifdef CONFIG_COMPACTION COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED, COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS, -#endif +#ifdef CONFIG_BALLOON_COMPACTION + COMPACTBALLOONISOLATED, /* isolated from balloon pagelist */ + COMPACTBALLOONMIGRATED, /* balloon page sucessfully migrated */ + COMPACTBALLOONRETURNED, /* putback to pagelist, not-migrated */ +#endif /* CONFIG_BALLOON_COMPACTION */ +#endif /* CONFIG_COMPACTION */ #ifdef CONFIG_HUGETLB_PAGE HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL, #endif diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c index 07dbc8e..2c8ce49 100644 --- a/mm/balloon_compaction.c +++ b/mm/balloon_compaction.c @@ -242,6 +242,7 @@ bool balloon_page_isolate(struct page *page) if 
(__is_movable_balloon_page(page) page_count(page) == 2) { __isolate_balloon_page(page); + balloon_event_count(COMPACTBALLOONISOLATED); unlock_page(page); return true; } @@ -265,6 +266,7 @@ void balloon_page_putback(struct page *page) __putback_balloon_page(page); /* drop the extra ref count taken for page isolation */ put_page(page); + balloon_event_count(COMPACTBALLOONRETURNED); } else { WARN_ON(1); dump_page(page); diff --git a/mm/migrate.c b/mm/migrate.c index 107a281..ecae213 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -894,6 +894,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private, dec_zone_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page)); balloon_page_free(page); + balloon_event_count(COMPACTBALLOONMIGRATED); return MIGRATEPAGE_SUCCESS; } out: diff --git a/mm/vmstat.c b/mm/vmstat.c index c737057..18a76ea 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -781,7 +781,14 @@ const char * const vmstat_text[] = { compact_stall, compact_fail, compact_success, -#endif + +#ifdef CONFIG_BALLOON_COMPACTION + compact_balloon_isolated, + compact_balloon_migrated, + compact_balloon_returned, +#endif /* CONFIG_BALLOON_COMPACTION */ + +#endif /* CONFIG_COMPACTION */ #ifdef CONFIG_HUGETLB_PAGE htlb_buddy_alloc_success, -- 1.7.11.7 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v11 7/7] mm: add vm event counters for balloon pages compaction
On Sat, Nov 10, 2012 at 05:55:38PM +0200, Michael S. Tsirkin wrote:
> > 	mutex_unlock(&vb->balloon_lock);
> > +	balloon_event_count(COMPACTBALLOONMIGRATED);
> > 	return MIGRATEPAGE_BALLOON_SUCCESS;
> >  }
>
> Looks like any balloon would need to do this. Can this chunk go into
> the caller instead?

Good catch. It's done already (v12 just hit the wild). Thanks!
-- Rafael
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages
Rafael Aquini aqu...@redhat.com writes:
> On Thu, Nov 08, 2012 at 09:32:18AM +1030, Rusty Russell wrote:
> > The first one can be delayed, the second one can be delayed if the host
> > didn't ask for VIRTIO_BALLOON_F_MUST_TELL_HOST (qemu doesn't).
> > We could implement a proper request queue for these, and return -EAGAIN
> > if the queue fills. Though in practice, it's not important (it might
> > help performance).
>
> I liked the idea. Give me the directions to accomplish it and I'll give
> it a try for sure.

OK, let's get this applied first, but here are some pointers:

Here's the current callback function when the host has processed the
buffers we put in the queue:

static void balloon_ack(struct virtqueue *vq)
{
	struct virtio_balloon *vb = vq->vdev->priv;

	wake_up(&vb->acked);
}

It's almost a noop: here's how we use it to make our queues synchronous:

static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
{
	struct scatterlist sg;
	unsigned int len;

	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);

	/* We should always be able to add one buffer to an empty queue. */
	if (virtqueue_add_buf(vq, &sg, 1, 0, vb, GFP_KERNEL) < 0)
		BUG();
	virtqueue_kick(vq);

	/* When host has read buffer, this completes via balloon_ack */
	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
}

And we set up the callback when we create the virtqueue:

	vq_callback_t *callbacks[] = { balloon_ack, balloon_ack, stats_request };
	...
	err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);

So off the top of my head it should be as simple as changing tell_host()
to only wait if the virtqueue_add_buf() fails (ie. queue is full).

Hmm, though you will want to synchronize the inflate and deflate queues:
if we tell the host we're giving a page up we want it to have seen that
before we tell it we're using it again...

Cheers,
Rusty.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization