[PATCH] zsmalloc: fix fatal corruption due to wrong size class selection
There is no point in overriding the size class below. It causes fatal
corruption of the next chunk in the 3264-byte size class, which is the
last size class that is not huge. For example, if the requested size is
exactly 3264 bytes, current zsmalloc allocates and returns a chunk from
the 3264-byte size class, not the 4096-byte one. A user access to this
chunk may then overwrite the head of the next adjacent chunk.

Here is the panic log captured when the freelist was corrupted due to
this:

Kernel BUG at ffc00030659c [verbose debug info unavailable]
Internal error: Oops - BUG: 9606 [#1] PREEMPT SMP
Modules linked in:
exynos-snapshot: core register saved(CPU:5)
CPUMERRSR: , L2MERRSR:
exynos-snapshot: context saved(CPU:5)
exynos-snapshot: item - log_kevents is disabled
CPU: 5 PID: 898 Comm: kswapd0 Not tainted 3.10.61-4497415-eng #1
task: ffc0b8783d80 ti: ffc0b71e8000 task.ti: ffc0b71e8000
PC is at obj_idx_to_offset+0x0/0x1c
LR is at obj_malloc+0x44/0xe8
pc : [<ffc00030659c>] lr : [<ffc000306604>] pstate: a045
sp : ffc0b71eb790
x29: ffc0b71eb790 x28: ffc00204c000
x27: 0001d96f x26:
x25: ffc098cc3500 x24: ffc0a13f2810
x23: ffc098cc3501 x22: ffc0a13f2800
x21: 11e1a02006e3 x20: ffc0a13f2800
x19: ffbc02a7e000 x18:
x17: x16: 0feb
x15: x14: a01003e3
x13: 0020 x12: fff0
x11: ffc08b264000 x10: e3a01004
x9 : ffc08b263fea x8 : ffc0b1e611c0
x7 : ffc000307d24 x6 :
x5 : 0038 x4 : 011e
x3 : ffbc3e90 x2 : 0cc0
x1 : d0100371 x0 : ffbc3e90

Reported-by: Sooyong Suk
Signed-off-by: Heesub Shin
Tested-by: Sooyong Suk
Cc: Minchan Kim
---
 mm/zsmalloc.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 6c39ae9..a2da64b 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1398,11 +1398,6 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)

 	/* extra space in chunk to keep the handle */
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];
-	/* In huge class size, we store the handle into first_page->private */
-	if (class->huge) {
-		size -= ZS_HANDLE_SIZE;
-		class = pool->size_class[get_size_class_index(size)];
-	}

 	spin_lock(&class->lock);
 	first_page = find_get_zspage(class);
--
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] zsmalloc: do not remap dst page while prepare next src page
Hello,

On 03/25/2015 12:24 AM, Sergey Senozhatsky wrote:
> object may belong to different pages. zs_object_copy() handles
> this case and maps a new source page (get_next_page() and
> kmap_atomic()) when object crosses boundaries of the current
> source page. But it also performs unnecessary kunmap/kmap_atomic
> of the destination page (it remains unchanged), which can be
> avoided.

No, it's not unnecessary. We should do kunmap_atomic() in the reverse
order of kmap_atomic(), so unfortunately it is inevitable to
kunmap_atomic() both d_addr and s_addr.

> Signed-off-by: Sergey Senozhatsky
> ---
>  mm/zsmalloc.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index d920e8b..7af4456 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1536,12 +1536,10 @@ static void zs_object_copy(unsigned long src, unsigned long dst,
>  			break;
>
>  		if (s_off + size >= PAGE_SIZE) {
> -			kunmap_atomic(d_addr);
>  			kunmap_atomic(s_addr);

Removing kunmap_atomic(d_addr) here may cause BUG_ON() at
__kunmap_atomic(). I tried your patch and it really happens:

> kernel BUG at arch/arm/mm/highmem.c:113!
> Internal error: Oops - BUG: 0 [#1] SMP ARM
> Modules linked in:
> CPU: 2 PID: 1774 Comm: bash Not tainted 4.0.0-rc2-mm1+ #105
> Hardware name: ARM-Versatile Express
> task: ee971300 ti: e8a26000 task.ti: e8a26000
> PC is at __kunmap_atomic+0x144/0x14c
> LR is at zs_object_copy+0x19c/0x2dc

regards
heesub
Re: [PATCH v2 3/7] zsmalloc: support compaction
Hello Minchan,

Nice work!

On 03/04/2015 02:01 PM, Minchan Kim wrote:
> +static void putback_zspage(struct zs_pool *pool, struct size_class *class,
> +				struct page *first_page)
> +{
> +	int class_idx;
> +	enum fullness_group fullness;
> +
> +	BUG_ON(!is_first_page(first_page));
> +
> +	get_zspage_mapping(first_page, &class_idx, &fullness);
> +	insert_zspage(first_page, class, fullness);
> +	fullness = fix_fullness_group(class, first_page);

Removal and re-insertion of the zspage above can be eliminated, like
this:

	fullness = get_fullness_group(first_page);
	insert_zspage(first_page, class, fullness);
	set_zspage_mapping(first_page, class->index, fullness);

regards,
heesub

> 	if (fullness == ZS_EMPTY) {
> +		zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> +			class->size, class->pages_per_zspage));
> 		atomic_long_sub(class->pages_per_zspage,
> 				&pool->pages_allocated);
> +
> 		free_zspage(first_page);
> 	}
> }
> -EXPORT_SYMBOL_GPL(zs_free);
> +
Re: [RFC PATCH 0/9] mm/zbud: support highmem pages
On 01/28/2015 05:24 AM, Seth Jennings wrote:
> On Tue, Nov 04, 2014 at 10:33:43AM -0600, Seth Jennings wrote:
>> On Tue, Oct 14, 2014 at 08:59:19PM +0900, Heesub Shin wrote:
>>> zbud is a memory allocator for storing compressed data pages. It
>>> keeps two data objects of arbitrary size on a single page. This
>>> simple design provides very deterministic behavior on reclamation,
>>> which is one of the reasons why zswap selected zbud as its default
>>> allocator over zsmalloc.
>>>
>>> Unlike zsmalloc, however, zbud does not support highmem. This is
>>> problematic especially on 32-bit machines having relatively small
>>> lowmem. Compressing anonymous pages from highmem and storing them
>>> into lowmem could eat up lowmem space.
>>>
>>> This limitation is due to the fact that zbud manages its internal
>>> data structures in zbud_header, which is kept at the head of a
>>> zbud_page. For example, zbud_pages are tracked by several lists and
>>> carry some status information, which may be referenced at any time
>>> by the kernel. Thus, zbud_pages must be allocated in a directly
>>> mapped memory region, lowmem.
>>>
>>> After some digging, I found that the internal data structures of
>>> zbud can be kept in the struct page, the same way zsmalloc does.
>>> So, this series moves all fields in zbud_header out to struct page.
>>> Though it alters quite a lot, it does not add any functional
>>> differences except highmem support. I am not sure whether this kind
>>> of modification, abusing several fields in struct page, would be
>>> ok.
>>
>> Hi Heesub,
>>
>> Sorry for the very late reply. The end of October was very busy for
>> me.
>>
>> A little history on zbud. I didn't put the metadata in the struct
>> page, even though I knew that was an option since we had done it
>> with zsmalloc. At the time, Andrew Morton had concerns about memmap
>> walkers getting messed up with unexpected values in the struct page
>> fields. In order to smooth zbud's acceptance, I decided to store the
>> metadata inline in the page itself. Later, zsmalloc eventually got
>> accepted, which basically gave the impression that putting the
>> metadata in the struct page was acceptable.
>>
>> I have recently been looking at implementing compaction for
>> zsmalloc, but having the metadata in the struct page and having the
>> handle directly encode the PFN and offset of the data block prevents
>> transparent relocation of the data. zbud has a similar issue as it
>> currently encodes the page address in the handle returned to the
>> user (also the limitation that is preventing use of highmem pages).
>> I would like to implement compaction for zbud too, and moving the
>> metadata into the struct page is going to work against that.
>>
>> In fact, I'm looking at the option of converting the current
>> zbud_header into a per-allocation metadata structure, which would
>> provide a layer of indirection between zbud and the user, allowing
>> for transparent relocation and compaction.
>
> I had some downtime and started thinking about this again today
> (after 3 months). Upon further reflection, I really like this and
> don't think that it inhibits introducing compaction later.
>
> There are just a few places that look messy or problematic to me:
>
> 1. the use of page->private and masking the number of chunks for
>    both buddies into it (see suggestion for overlay struct below)
> 2. the use of the second double word, page->index, to store a
>    list_head
>
> #2 might be problematic because, IIRC, memmap walkers will check
> _count (or _mapcount). I think we ran into this in zsmalloc.
> Initially, when working on zsmalloc, I just created a structure that
> overlaid the struct page in the memmap, reserving the flags and
> _count areas, so that I wouldn't have to be bound by the field
> names/boundaries in the struct page. IIRC, Andrew was initially
> against that, but he was also against the whole idea of using the
> struct page fields for random stuff... but that ended up being
> accepted.
>
> This code looks really good! I think with a little cleanup and
> finding a way to steer clear of using the _count part of the
> structure, this will be great.

Thanks for your comments! I will try to address the problems you
pointed out and post a new patchset hopefully soon.

regards,
heesub

> Sorry for dismissing it earlier. Didn't give it enough credit.
>
> Thanks,
> Seth
>
>> However, I do like the part about letting zbud use highmem pages. I
>> have something in mind that would allow highmem pages _and_ move
>> toward something that would support compaction. I'll see if I can
>> put it into code today.
>>
>> Thanks,
>> Seth
>>
>>> Heesub Shin (9):
>>>   mm/zbud: tidy up a bit
>>>   mm/zbud: remove buddied list from zbud_pool
>>>   mm/zbud: remove lru from zbud_header
>>>   mm/zbud: remove first|last_chunks from zbud_header
>>>   mm/zbud: encode zbud handle using struct page
>>>   mm/zbud: remove list_head for buddied list from zbud_header
>>>   mm/zbud: drop zbud_header
>>>   mm/zbud: allow clients to use highmem pages
>>>   mm/zswap: use highmem pages for compressed pool
>>>
>>>  mm/zbud.c  | 244 ++---
>>>  mm/zswap.c |   4 +-
>>>  2 files changed, 121 insertions(+), 127 deletions(-)
>>>
>>> --
>>> 1.9.1
Re: [PATCH v5 2/4] mm/page_alloc: add freepage on isolate pageblock to correct buddy list
Hello,

On 10/31/2014 04:25 PM, Joonsoo Kim wrote:
> In free_pcppages_bulk(), we use the cached migratetype of a freepage
> to determine the type of the buddy list where the freepage will be
> added. This information is stored when the freepage is added to the
> pcp list, so if isolation of this freepage's pageblock begins after
> storing, this cached information could be stale. In other words, it
> has the original migratetype rather than MIGRATE_ISOLATE.
>
> There are two problems caused by this stale information. One is that
> we can't keep these freepages from being allocated. Although this
> pageblock is isolated, the freepage will be added to a normal buddy
> list so that it could be allocated without any restriction. The
> other problem is incorrect freepage accounting. Freepages on an
> isolate pageblock should not be counted in the number of freepages.
>
> Following is the code snippet in free_pcppages_bulk().
>
> 	/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
> 	__free_one_page(page, page_to_pfn(page), zone, 0, mt);
> 	trace_mm_page_pcpu_drain(page, 0, mt);
> 	if (likely(!is_migrate_isolate_page(page))) {
> 		__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> 		if (is_migrate_cma(mt))
> 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> 	}
>
> As you can see in the above snippet, current code already handles
> the second problem, incorrect freepage accounting, by re-fetching
> the pageblock migratetype through is_migrate_isolate_page(page).
> But, because this re-fetched information isn't used for
> __free_one_page(), the first problem is not solved. This patch tries
> to solve this situation by re-fetching the pageblock migratetype
> before __free_one_page() and using it for __free_one_page().
>
> In addition to moving this re-fetch up, this patch uses an
> optimization: re-fetching the migratetype only if there is an
> isolate pageblock. Pageblock isolation is a rare event, so we can
> avoid re-fetching in the common case with this optimization. This
> patch also corrects the migratetype in the tracepoint output.
>
> Cc: sta...@vger.kernel.org
> Acked-by: Minchan Kim
> Acked-by: Michal Nazarewicz
> Acked-by: Vlastimil Babka
> Signed-off-by: Joonsoo Kim
> ---
>  mm/page_alloc.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f7a867e..6df23fe 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -725,14 +725,17 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  			/* must delete as __free_one_page list manipulates */
>  			list_del(&page->lru);
>  			mt = get_freepage_migratetype(page);
> +			if (unlikely(has_isolate_pageblock(zone))) {

How about adding an additional check for 'mt == MIGRATE_MOVABLE' here?
Then, most of the get_pageblock_migratetype() calls could be avoided
while the isolation is in progress. I am not sure this is the case on
memory offlining. What do you think?

> +				mt = get_pageblock_migratetype(page);
> +				if (is_migrate_isolate(mt))
> +					goto skip_counting;
> +			}
> +			__mod_zone_freepage_state(zone, 1, mt);
> +
> +skip_counting:
>  			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
>  			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
>  			trace_mm_page_pcpu_drain(page, 0, mt);
> -			if (likely(!is_migrate_isolate_page(page))) {
> -				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> -				if (is_migrate_cma(mt))
> -					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> -			}
>  		} while (--to_free && --batch_free && !list_empty(list));
>  	}
>  	spin_unlock(&zone->lock);
Re: [PATCH] mm/zbud: init user ops only when it is needed
Hello,

On 10/16/2014 05:17 AM, Andrew Morton wrote:
> On Wed, 15 Oct 2014 19:00:43 +0900 Heesub Shin wrote:
>
>> When zbud is initialized through the zpool wrapper, pool->ops,
>> which points to user-defined operations, is always set regardless
>> of whether it is specified from the upper layer. This causes
>> zbud_reclaim_page() to iterate its loop for evicting pool pages
>> without any gain.
>>
>> This patch sets the user-defined ops only when it is needed, so
>> that zbud_reclaim_page() can bail out of the reclamation loop
>> earlier if there are no user-defined operations specified.
>
> Which callsite is calling zbud_zpool_create(..., NULL)?

Currently nowhere. zswap is the only user of zbud and always passes a
pointer to user-defined operations on pool creation. In addition,
there is little chance that pool shrinking is requested by users who
did not provide the user-defined ops. So, we may not need to worry
much about what I wrote in the changelog.

However, it is definitely weird to pass an argument, zpool_ops, which
will not even be referenced by zbud_zpool_create(). Above all, it
would be more useful to avoid the possibility in the future rather
than just ignoring it.

regards,
heesub

> ...
>
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = {
>>
>>  static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
>>  {
>> -	return zbud_create_pool(gfp, &zbud_zpool_ops);
>> +	return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
>>  }
>>
>>  static void zbud_zpool_destroy(void *pool)
[PATCH] mm/zbud: init user ops only when it is needed
When zbud is initialized through the zpool wrapper, pool->ops, which
points to user-defined operations, is always set regardless of whether
it is specified from the upper layer. This causes zbud_reclaim_page()
to iterate its loop for evicting pool pages without any gain.

This patch sets the user-defined ops only when it is needed, so that
zbud_reclaim_page() can bail out of the reclamation loop earlier if
there are no user-defined operations specified.

Signed-off-by: Heesub Shin
---
 mm/zbud.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index ecf1dbe..db8de74 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = {

 static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
 {
-	return zbud_create_pool(gfp, &zbud_zpool_ops);
+	return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
 }

 static void zbud_zpool_destroy(void *pool)
--
1.9.1
Re: [PATCH] mm/zbud: init user ops only when it is needed
Hello, On 10/16/2014 05:17 AM, Andrew Morton wrote: On Wed, 15 Oct 2014 19:00:43 +0900 Heesub Shin heesub.s...@samsung.com wrote: When zbud is initialized through the zpool wrapper, pool->ops, which points to user-defined operations, is always set regardless of whether it is specified by the upper layer. This causes zbud_reclaim_page() to iterate its eviction loop without any gain. This patch sets the user-defined ops only when needed, so that zbud_reclaim_page() can bail out of the reclamation loop early when no user-defined operations are specified. Which callsite is calling zbud_zpool_create(..., NULL)? Currently nowhere. zswap is the only user of zbud and always passes a pointer to user-defined operations on pool creation. In addition, there is little chance that pool shrinking would be requested by users who did not provide the user-defined ops. So, we may not need to worry much about what I wrote in the changelog. However, it is definitely weird to pass an argument, zpool_ops, which will not even be referenced by zbud_zpool_create(). Above all, it would be more useful to avoid the possibility in the future rather than just ignoring it. regards, heesub ... --- a/mm/zbud.c +++ b/mm/zbud.c @@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = { static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops) { - return zbud_create_pool(gfp, &zbud_zpool_ops); + return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL); } static void zbud_zpool_destroy(void *pool)
[RFC PATCH 1/9] mm/zbud: tidy up a bit
For aesthetics, add a blank line between functions, remove useless initialization statements, and simplify the code a bit. No functional differences are introduced. Signed-off-by: Heesub Shin --- mm/zbud.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index ecf1dbe..6f36394 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -145,6 +145,7 @@ static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp, { return zbud_alloc(pool, size, gfp, handle); } + static void zbud_zpool_free(void *pool, unsigned long handle) { zbud_free(pool, handle); @@ -174,6 +175,7 @@ static void *zbud_zpool_map(void *pool, unsigned long handle, { return zbud_map(pool, handle); } + static void zbud_zpool_unmap(void *pool, unsigned long handle) { zbud_unmap(pool, handle); @@ -350,16 +352,11 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, spin_lock(&pool->lock); /* First, try to find an unbuddied zbud page. */ - zhdr = NULL; for_each_unbuddied_list(i, chunks) { if (!list_empty(&pool->unbuddied[i])) { zhdr = list_first_entry(&pool->unbuddied[i], struct zbud_header, buddy); list_del(&zhdr->buddy); - if (zhdr->first_chunks == 0) - bud = FIRST; - else - bud = LAST; goto found; } } @@ -372,13 +369,15 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, spin_lock(&pool->lock); pool->pages_nr++; zhdr = init_zbud_page(page); - bud = FIRST; found: - if (bud == FIRST) + if (zhdr->first_chunks == 0) { zhdr->first_chunks = chunks; - else + bud = FIRST; + } else { zhdr->last_chunks = chunks; + bud = LAST; + } if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) { /* Add to unbuddied list */ @@ -433,7 +432,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* Remove from existing buddy list */ list_del(&zhdr->buddy); - if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) { + if (num_free_chunks(zhdr) == NCHUNKS) { /* zbud page is empty, free */ list_del(&zhdr->lru); free_zbud_page(zhdr); @@ -489,7 +488,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) { int i, ret, freechunks; struct zbud_header *zhdr; - unsigned long first_handle = 0, last_handle = 0; + unsigned long first_handle, last_handle; spin_lock(&pool->lock); if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) || @@ -529,7 +528,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) next: spin_lock(&pool->lock); zhdr->under_reclaim = false; - if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) { + if (num_free_chunks(zhdr) == NCHUNKS) { /* * Both buddies are now free, free the zbud page and * return success. -- 1.9.1
[RFC PATCH 7/9] mm/zbud: drop zbud_header
Now that the only field left in zbud_header is .under_reclaim, get it out of the struct and let the PG_reclaim bit in page->flags take over. As a result of this change, we can finally eliminate struct zbud_header, and hence all the internal data structures of zbud live in struct page. Signed-off-by: Heesub Shin --- mm/zbud.c | 66 +-- 1 file changed, 18 insertions(+), 48 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 8a6dd6b..5a392f3 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -60,17 +60,15 @@ * NCHUNKS_ORDER determines the internal allocation granularity, effectively * adjusting internal fragmentation. It also determines the number of * freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the - * allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk - * in allocated page is occupied by zbud header, NCHUNKS will be calculated to - * 63 which shows the max number of free chunks in zbud page, also there will be - * 63 freelists per pool. + * allocation granularity will be in chunks of size PAGE_SIZE/64. + * NCHUNKS will be calculated to 64 which shows the max number of free + * chunks in zbud page, also there will be 64 freelists per pool. */ #define NCHUNKS_ORDER 6 #define CHUNK_SHIFT (PAGE_SHIFT - NCHUNKS_ORDER) #define CHUNK_SIZE (1 << CHUNK_SHIFT) -#define ZHDR_SIZE_ALIGNED CHUNK_SIZE -#define NCHUNKS ((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT) +#define NCHUNKS (PAGE_SIZE >> CHUNK_SHIFT) /** * struct zbud_pool - stores metadata for each zbud pool @@ -96,14 +94,6 @@ struct zbud_pool { struct zbud_ops *ops; }; -/* - * struct zbud_header - zbud page metadata occupying the first chunk of each - * zbud page. - */ -struct zbud_header { - bool under_reclaim; -}; - /* * zpool */ @@ -220,22 +210,19 @@ static size_t get_num_chunks(struct page *page, enum buddy bud) #define for_each_unbuddied_list(_iter, _begin) \ for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) -/* Initializes the zbud header of a newly allocated zbud page */ +/* Initializes a newly allocated zbud page */ static void init_zbud_page(struct page *page) { - struct zbud_header *zhdr = page_address(page); set_num_chunks(page, FIRST, 0); set_num_chunks(page, LAST, 0); INIT_LIST_HEAD((struct list_head *) &page->index); INIT_LIST_HEAD(&page->lru); - zhdr->under_reclaim = 0; + ClearPageReclaim(page); } /* Resets the struct page fields and frees the page */ -static void free_zbud_page(struct zbud_header *zhdr) +static void free_zbud_page(struct page *page) { - struct page *page = virt_to_page(zhdr); - init_page_count(page); page_mapcount_reset(page); __free_page(page); @@ -261,14 +248,6 @@ static struct page *handle_to_zbud_page(unsigned long handle) return (struct page *) (handle & ~LAST); } -/* Returns the zbud page where a given handle is stored */ -static struct zbud_header *handle_to_zbud_header(unsigned long handle) -{ - struct page *page = handle_to_zbud_page(handle); - - return page_address(page); -} - /* Returns the number of free chunks in a zbud page */ static int num_free_chunks(struct page *page) { @@ -347,7 +326,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, if (!size || (gfp & __GFP_HIGHMEM)) return -EINVAL; - if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE) + if (size > PAGE_SIZE - CHUNK_SIZE) return -ENOSPC; chunks = size_to_chunks(size); spin_lock(&pool->lock); @@ -410,21 +389,18 @@ found: */ void zbud_free(struct zbud_pool *pool, unsigned long handle) { - struct zbud_header *zhdr; struct page *page; int freechunks; spin_lock(&pool->lock); - zhdr = handle_to_zbud_header(handle); - page = virt_to_page(zhdr); + page = handle_to_zbud_page(handle); - /* If first buddy, handle will be page aligned */ - if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) - set_num_chunks(page, LAST, 0); - else + if (!is_last_chunk(handle)) set_num_chunks(page, FIRST, 0); + else + set_num_chunks(page, LAST, 0); - if (zhdr->under_reclaim) { + if (PageReclaim(page)) { /* zbud page is under reclaim, reclaim will free */ spin_unlock(&pool->lock); return; @@ -436,7 +412,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) list_del((struct list_head *) &page->index); /* zbud page is empty, free */ list_del(&page->lru); - free_zbud_page(zhdr); + free_zbud_page(page); pool->pages_nr--; } el
[RFC PATCH 4/9] mm/zbud: remove first|last_chunks from zbud_header
The size information of the first and last buddies is stored in first|last_chunks in struct zbud_header, respectively. Put them into page->private instead of zbud_header. Signed-off-by: Heesub Shin --- mm/zbud.c | 62 -- 1 file changed, 36 insertions(+), 26 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index a2390f6..193ea4f 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -100,13 +100,9 @@ struct zbud_pool { * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. * @buddy: links the zbud page into the unbuddied lists in the pool - * @first_chunks: the size of the first buddy in chunks, 0 if free - * @last_chunks: the size of the last buddy in chunks, 0 if free */ struct zbud_header { struct list_head buddy; - unsigned int first_chunks; - unsigned int last_chunks; bool under_reclaim; }; @@ -212,6 +208,17 @@ static int size_to_chunks(size_t size) return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT; } +static void set_num_chunks(struct page *page, enum buddy bud, size_t chunks) +{ + page->private = (page->private & (0xffff << (16 * !bud))) | + ((chunks & 0xffff) << (16 * bud)); +} + +static size_t get_num_chunks(struct page *page, enum buddy bud) +{ + return (page->private >> (16 * bud)) & 0xffff; +} + #define for_each_unbuddied_list(_iter, _begin) \ for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) @@ -219,8 +226,8 @@ static int size_to_chunks(size_t size) /* Initializes the zbud header of a newly allocated zbud page */ static struct zbud_header *init_zbud_page(struct page *page) { struct zbud_header *zhdr = page_address(page); - zhdr->first_chunks = 0; - zhdr->last_chunks = 0; + set_num_chunks(page, FIRST, 0); + set_num_chunks(page, LAST, 0); INIT_LIST_HEAD(&zhdr->buddy); INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; @@ -240,6 +247,7 @@ static void free_zbud_page(struct zbud_header *zhdr) static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) { unsigned long handle; + struct page *page = virt_to_page(zhdr); /* * For now, the encoded handle is actually just the pointer to the data @@ -252,7 +260,8 @@ static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) /* skip over zbud header */ handle += ZHDR_SIZE_ALIGNED; else /* bud == LAST */ - handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT); + handle += PAGE_SIZE - + (get_num_chunks(page, LAST) << CHUNK_SHIFT); return handle; } @@ -263,13 +272,14 @@ static struct zbud_header *handle_to_zbud_header(unsigned long handle) } /* Returns the number of free chunks in a zbud page */ -static int num_free_chunks(struct zbud_header *zhdr) +static int num_free_chunks(struct page *page) { /* * Rather than branch for different situations, just use the fact that * free buddies have a length of zero to simplify everything. */ - return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks; + return NCHUNKS - get_num_chunks(page, FIRST) + - get_num_chunks(page, LAST); } /* @@ -366,17 +376,17 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, zhdr = init_zbud_page(page); found: - if (zhdr->first_chunks == 0) { - zhdr->first_chunks = chunks; + if (get_num_chunks(page, FIRST) == 0) bud = FIRST; - } else { - zhdr->last_chunks = chunks; + else bud = LAST; - } - if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) { + set_num_chunks(page, bud, chunks); + + if (get_num_chunks(page, FIRST) == 0 || + get_num_chunks(page, LAST) == 0) { /* Add to unbuddied list */ - freechunks = num_free_chunks(zhdr); + freechunks = num_free_chunks(page); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); } @@ -413,9 +423,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* If first buddy, handle will be page aligned */ if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) - zhdr->last_chunks = 0; + set_num_chunks(page, LAST, 0); else - zhdr->first_chunks = 0; + set_num_chunks(page, FIRST, 0); if (zhdr->under_reclaim) { /* zbud page is under reclaim, reclaim will free */ @@ -423,7 +433,8 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) return; } - if (num_free_chunks(zhdr) == NCHUNKS) { + freech
[RFC PATCH 3/9] mm/zbud: remove lru from zbud_header
zbud_pool has an lru list for tracking zbud pages, and they are strung together via zhdr->lru. If we reuse page->lru for linking zbud pages instead, the lru field in zbud_header can be dropped. Signed-off-by: Heesub Shin --- mm/zbud.c | 23 +-- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 0f5add0..a2390f6 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -100,13 +100,11 @@ struct zbud_pool { * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. * @buddy: links the zbud page into the unbuddied lists in the pool - * @lru: links the zbud page into the lru list in the pool * @first_chunks: the size of the first buddy in chunks, 0 if free * @last_chunks: the size of the last buddy in chunks, 0 if free */ struct zbud_header { struct list_head buddy; - struct list_head lru; unsigned int first_chunks; unsigned int last_chunks; bool under_reclaim; @@ -224,7 +222,7 @@ static struct zbud_header *init_zbud_page(struct page *page) zhdr->first_chunks = 0; zhdr->last_chunks = 0; INIT_LIST_HEAD(&zhdr->buddy); - INIT_LIST_HEAD(&zhdr->lru); + INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; return zhdr; } @@ -352,6 +350,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, if (!list_empty(&pool->unbuddied[i])) { zhdr = list_first_entry(&pool->unbuddied[i], struct zbud_header, buddy); + page = virt_to_page(zhdr); list_del(&zhdr->buddy); goto found; } @@ -382,9 +381,9 @@ found: } /* Add/move zbud page to beginning of LRU */ - if (!list_empty(&zhdr->lru)) - list_del(&zhdr->lru); - list_add(&zhdr->lru, &pool->lru); + if (!list_empty(&page->lru)) + list_del(&page->lru); + list_add(&page->lru, &pool->lru); *handle = encode_handle(zhdr, bud); spin_unlock(&pool->lock); @@ -405,10 +404,12 @@ found: void zbud_free(struct zbud_pool *pool, unsigned long handle) { struct zbud_header *zhdr; + struct page *page; int freechunks; spin_lock(&pool->lock); zhdr = handle_to_zbud_header(handle); + page = virt_to_page(zhdr); /* If first buddy, handle will be page aligned */ if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) @@ -426,7 +427,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* Remove from existing unbuddied list */ list_del(&zhdr->buddy); /* zbud page is empty, free */ - list_del(&zhdr->lru); + list_del(&page->lru); free_zbud_page(zhdr); pool->pages_nr--; } else { @@ -479,6 +480,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) { int i, ret, freechunks; + struct page *page; struct zbud_header *zhdr; unsigned long first_handle, last_handle; @@ -489,8 +491,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) return -EINVAL; } for (i = 0; i < retries; i++) { - zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru); - list_del(&zhdr->lru); + page = list_tail_entry(&pool->lru, struct page, lru); + zhdr = page_address(page); + list_del(&page->lru); list_del(&zhdr->buddy); /* Protect zbud page against free */ zhdr->under_reclaim = true; @@ -537,7 +540,7 @@ next: } /* add to beginning of LRU */ - list_add(&zhdr->lru, &pool->lru); + list_add(&page->lru, &pool->lru); } spin_unlock(&pool->lock); return -EAGAIN; -- 1.9.1
[RFC PATCH 9/9] mm/zswap: use highmem pages for compressed pool
Now that zbud supports highmem, storing compressed anonymous pages in highmem looks more reasonable. So, pass the __GFP_HIGHMEM flag to zpool when zswap allocates memory from it. Signed-off-by: Heesub Shin --- mm/zswap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/zswap.c b/mm/zswap.c index ea064c1..eaabe95 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -684,8 +684,8 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset, /* store */ len = dlen + sizeof(struct zswap_header); - ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN, &handle); + ret = zpool_malloc(zswap_pool, len, + __GFP_NORETRY | __GFP_NOWARN | __GFP_HIGHMEM, &handle); if (ret == -ENOSPC) { zswap_reject_compress_poor++; goto freepage; -- 1.9.1
[RFC PATCH 8/9] mm/zbud: allow clients to use highmem pages
Now that all fields of zbud's internal data structure have moved to struct page, there is no reason to restrict zbud pages to lowmem. This patch allows highmem pages to be used for zbud pages. Pages from highmem are mapped using kmap_atomic() before being accessed. Signed-off-by: Heesub Shin --- mm/zbud.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 5a392f3..677fdc1 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -52,6 +52,7 @@ #include #include #include +#include <linux/highmem.h> /* * Structures */ @@ -94,6 +95,9 @@ struct zbud_pool { struct zbud_ops *ops; }; +/* per-cpu mapping addresses of kmap_atomic()'ed zbud pages */ +static DEFINE_PER_CPU(void *, zbud_mapping); + /* * zpool */ @@ -310,9 +314,6 @@ void zbud_destroy_pool(struct zbud_pool *pool) * performed first. If no suitable free region is found, then a new page is * allocated and added to the pool to satisfy the request. * - * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used - * as zbud pool pages. - * * Return: 0 if success and handle is set, otherwise -EINVAL if the size or * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate * a new page. @@ -324,7 +325,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, enum buddy bud; struct page *page; - if (!size || (gfp & __GFP_HIGHMEM)) + if (!size) return -EINVAL; if (size > PAGE_SIZE - CHUNK_SIZE) return -ENOSPC; chunks = size_to_chunks(size); spin_lock(&pool->lock); @@ -543,14 +544,24 @@ next: */ void *zbud_map(struct zbud_pool *pool, unsigned long handle) { + void **mapping; size_t offset = 0; struct page *page = handle_to_zbud_page(handle); + /* + * Because we use per-cpu mapping shared among the pools/users, + * we can't allow mapping in interrupt context because it can + * corrupt another user's mappings. + */ + BUG_ON(in_interrupt()); + if (is_last_chunk(handle)) offset = PAGE_SIZE - (get_num_chunks(page, LAST) << CHUNK_SHIFT); - return (unsigned char *) page_address(page) + offset; + mapping = &get_cpu_var(zbud_mapping); + *mapping = kmap_atomic(page); + return (char *) *mapping + offset; } /** @@ -560,6 +571,10 @@ void *zbud_map(struct zbud_pool *pool, unsigned long handle) */ void zbud_unmap(struct zbud_pool *pool, unsigned long handle) { + void **mapping = this_cpu_ptr(&zbud_mapping); + + kunmap_atomic(*mapping); + put_cpu_var(zbud_mapping); } /** -- 1.9.1
[RFC PATCH 6/9] mm/zbud: remove list_head for buddied list from zbud_header
The zbud allocator links the _unbuddied_ zbud pages into a list in the pool. When it tries to allocate some space, the list is first searched for the best possible fit. Thus, the current implementation has a list_head in the zbud_header structure to construct the list. This patch simulates a list using the second double word of struct page instead of zbud_header. Then, we can eliminate the list_head in zbud_header. Using the _index and _mapcount fields (also including _count on 64-bit machines) in struct page for list management looks a bit odd, but there is no better idea for now, considering that page->lru is already in use. Signed-off-by: Heesub Shin --- mm/zbud.c | 36 +++- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 383bab0..8a6dd6b 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -99,10 +99,8 @@ struct zbud_pool { /* * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. - * @buddy: links the zbud page into the unbuddied lists in the pool */ struct zbud_header { - struct list_head buddy; bool under_reclaim; }; @@ -223,21 +221,24 @@ static size_t get_num_chunks(struct page *page, enum buddy bud) for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) /* Initializes the zbud header of a newly allocated zbud page */ -static struct zbud_header *init_zbud_page(struct page *page) +static void init_zbud_page(struct page *page) { struct zbud_header *zhdr = page_address(page); set_num_chunks(page, FIRST, 0); set_num_chunks(page, LAST, 0); - INIT_LIST_HEAD(&zhdr->buddy); + INIT_LIST_HEAD((struct list_head *) &page->index); INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; - return zhdr; } /* Resets the struct page fields and frees the page */ static void free_zbud_page(struct zbud_header *zhdr) { - __free_page(virt_to_page(zhdr)); + struct page *page = virt_to_page(zhdr); + + init_page_count(page); + page_mapcount_reset(page); + __free_page(page); } static int is_last_chunk(unsigned long handle) @@ -341,7 +342,6 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, unsigned long *handle) { int chunks, i, freechunks; - struct zbud_header *zhdr = NULL; enum buddy bud; struct page *page; @@ -355,10 +355,9 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, /* First, try to find an unbuddied zbud page. */ for_each_unbuddied_list(i, chunks) { if (!list_empty(&pool->unbuddied[i])) { - zhdr = list_first_entry(&pool->unbuddied[i], - struct zbud_header, buddy); - page = virt_to_page(zhdr); - list_del(&zhdr->buddy); + page = list_entry((unsigned long *) + pool->unbuddied[i].next, struct page, index); + list_del((struct list_head *) &page->index); goto found; } } @@ -370,7 +369,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, return -ENOMEM; spin_lock(&pool->lock); pool->pages_nr++; - zhdr = init_zbud_page(page); + init_zbud_page(page); found: if (get_num_chunks(page, FIRST) == 0) @@ -384,7 +383,8 @@ found: get_num_chunks(page, LAST) == 0) { /* Add to unbuddied list */ freechunks = num_free_chunks(page); - list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); + list_add((struct list_head *) &page->index, + &pool->unbuddied[freechunks]); } /* Add/move zbud page to beginning of LRU */ @@ -433,14 +433,15 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) freechunks = num_free_chunks(page); if (freechunks == NCHUNKS) { /* Remove from existing unbuddied list */ - list_del(&zhdr->buddy); + list_del((struct list_head *) &page->index); /* zbud page is empty, free */ list_del(&page->lru); free_zbud_page(zhdr); pool->pages_nr--; } else { /* Add to unbuddied list */ - list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); + list_add((struct list_head *) &page->index, + &pool->unbuddied[freechunks]); } spin_unlock(&pool->lock); @@ -501,7 +502,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) page = list_tail_entry(&pool->lru, struct page, lru); zhdr = page_address(page); list_del(&page->lru); - list_del(&zhdr->buddy); + list_del((struct list_head *) &page->index);
[RFC PATCH 5/9] mm/zbud: encode zbud handle using struct page
As a preparation for further patches, this patch changes the way a zbud handle is encoded. Currently, the zbud handle is actually just a virtual address cast to unsigned long before being returned. Exporting the address to clients would be inappropriate if we use highmem pages for zbud pages, which will be implemented by the following patches. Change the zbud handle to a struct page pointer, with the least significant bit indicating whether it refers to the first or last buddy. All other information is hidden in the struct page. Signed-off-by: Heesub Shin --- mm/zbud.c | 50 -- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 193ea4f..383bab0 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -240,35 +240,32 @@ static void free_zbud_page(struct zbud_header *zhdr) __free_page(virt_to_page(zhdr)); } +static int is_last_chunk(unsigned long handle) +{ + return (handle & LAST) == LAST; +} + /* * Encodes the handle of a particular buddy within a zbud page * Pool lock should be held as this function accesses first|last_chunks */ -static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) +static unsigned long encode_handle(struct page *page, enum buddy bud) { - unsigned long handle; - struct page *page = virt_to_page(zhdr); + return (unsigned long) page | bud; +} - /* - * For now, the encoded handle is actually just the pointer to the data - * but this might not always be the case. A little information hiding. - * Add CHUNK_SIZE to the handle if it is the first allocation to jump - * over the zbud header in the first chunk. - */ - handle = (unsigned long) zhdr; - if (bud == FIRST) - /* skip over zbud header */ - handle += ZHDR_SIZE_ALIGNED; - else /* bud == LAST */ - handle += PAGE_SIZE - - (get_num_chunks(page, LAST) << CHUNK_SHIFT); - return handle; +/* Returns struct page of the zbud page where a given handle is stored */ +static struct page *handle_to_zbud_page(unsigned long handle) +{ + return (struct page *) (handle & ~LAST); } /* Returns the zbud page where a given handle is stored */ static struct zbud_header *handle_to_zbud_header(unsigned long handle) { - return (struct zbud_header *)(handle & PAGE_MASK); + struct page *page = handle_to_zbud_page(handle); + + return page_address(page); } /* Returns the number of free chunks in a zbud page */ @@ -395,7 +392,7 @@ found: list_del(&page->lru); list_add(&page->lru, &pool->lru); - *handle = encode_handle(zhdr, bud); + *handle = encode_handle(page, bud); spin_unlock(&pool->lock); return 0; @@ -514,9 +511,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) first_handle = 0; last_handle = 0; if (get_num_chunks(page, FIRST)) - first_handle = encode_handle(zhdr, FIRST); + first_handle = encode_handle(page, FIRST); if (get_num_chunks(page, LAST)) - last_handle = encode_handle(zhdr, LAST); + last_handle = encode_handle(page, LAST); spin_unlock(&pool->lock); /* Issue the eviction callback(s) */ @@ -570,7 +567,16 @@ next: */ void *zbud_map(struct zbud_pool *pool, unsigned long handle) { - return (void *)(handle); + size_t offset; + struct page *page = handle_to_zbud_page(handle); + + if (is_last_chunk(handle)) + offset = PAGE_SIZE - + (get_num_chunks(page, LAST) << CHUNK_SHIFT); + else + offset = ZHDR_SIZE_ALIGNED; + + return (unsigned char *) page_address(page) + offset; } /** -- 1.9.1
[RFC PATCH 2/9] mm/zbud: remove buddied list from zbud_pool
There's no point in having the _buddied_ list of zbud pages, as nobody refers to it. Tracking it adds runtime overhead only, so let's remove it. Signed-off-by: Heesub Shin --- mm/zbud.c | 17 +++-- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 6f36394..0f5add0 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -79,8 +79,6 @@ * @unbuddied: array of lists tracking zbud pages that only contain one buddy; * the lists each zbud page is added to depends on the size of * its free region. - * @buddied: list tracking the zbud pages that contain two buddies; - * these zbud pages are full * @lru: list tracking the zbud pages in LRU order by most recently * added buddy. * @pages_nr: number of zbud pages in the pool. @@ -93,7 +91,6 @@ struct zbud_pool { spinlock_t lock; struct list_head unbuddied[NCHUNKS]; - struct list_head buddied; struct list_head lru; u64 pages_nr; struct zbud_ops *ops; @@ -102,7 +99,7 @@ struct zbud_pool { /* * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. - * @buddy: links the zbud page into the unbuddied/buddied lists in the pool + * @buddy: links the zbud page into the unbuddied lists in the pool * @lru: links the zbud page into the lru list in the pool * @first_chunks: the size of the first buddy in chunks, 0 if free * @last_chunks: the size of the last buddy in chunks, 0 if free @@ -299,7 +296,6 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops) spin_lock_init(&pool->lock); for_each_unbuddied_list(i, 0) INIT_LIST_HEAD(&pool->unbuddied[i]); - INIT_LIST_HEAD(&pool->buddied); INIT_LIST_HEAD(&pool->lru); pool->pages_nr = 0; pool->ops = ops; @@ -383,9 +379,6 @@ found: /* Add to unbuddied list */ freechunks = num_free_chunks(zhdr); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); - } else { - /* Add to buddied list */ - list_add(&zhdr->buddy, &pool->buddied); } /* Add/move zbud page to beginning of LRU */ @@ -429,10 +422,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) return; } - /* Remove from existing buddy list */ - list_del(&zhdr->buddy); - if (num_free_chunks(zhdr) == NCHUNKS) { + /* Remove from existing unbuddied list */ + list_del(&zhdr->buddy); /* zbud page is empty, free */ list_del(&zhdr->lru); free_zbud_page(zhdr); @@ -542,9 +534,6 @@ next: /* add to unbuddied list */ freechunks = num_free_chunks(zhdr); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); - } else { - /* add to buddied list */ - list_add(&zhdr->buddy, &pool->buddied); } /* add to beginning of LRU */ -- 1.9.1
[RFC PATCH 0/9] mm/zbud: support highmem pages
zbud is a memory allocator for storing compressed data pages. It keeps two data objects of arbitrary size on a single page. This simple design provides very deterministic behavior on reclamation, which is one of the reasons why zswap selected zbud as its default allocator over zsmalloc. Unlike zsmalloc, however, zbud does not support highmem. This is problematic especially on 32-bit machines having relatively small lowmem. Compressing anonymous pages from highmem and storing them in lowmem could eat up lowmem space. This limitation is due to the fact that zbud manages its internal data structures in zbud_header, which is kept at the head of a zbud page. For example, zbud pages are tracked by several lists and carry some status information, which may be referenced at any time by the kernel. Thus, zbud pages must be allocated in a directly mapped memory region, lowmem. After some digging, I found that the internal data structures of zbud can be kept in struct page, the same way zsmalloc does. So, this series moves all fields in zbud_header out to struct page. Though it alters quite a lot, it introduces no functional differences except highmem support. I am not sure whether this kind of modification, abusing several fields in struct page, is acceptable.
Heesub Shin (9): mm/zbud: tidy up a bit mm/zbud: remove buddied list from zbud_pool mm/zbud: remove lru from zbud_header mm/zbud: remove first|last_chunks from zbud_header mm/zbud: encode zbud handle using struct page mm/zbud: remove list_head for buddied list from zbud_header mm/zbud: drop zbud_header mm/zbud: allow clients to use highmem pages mm/zswap: use highmem pages for compressed pool mm/zbud.c | 244 ++--- mm/zswap.c | 4 +- 2 files changed, 121 insertions(+), 127 deletions(-) -- 1.9.1
[RFC PATCH 0/9] mm/zbud: support highmem pages
zbud is a memory allocator for storing compressed data pages. It keeps two data objects of arbitrary size on a single page. This simple design provides very deterministic behavior on reclamation, which is one of the reasons why zswap selected zbud as its default allocator over zsmalloc. Unlike zsmalloc, however, zbud does not support highmem. This is problematic especially on 32-bit machines with relatively small lowmem: compressing anonymous pages from highmem and storing them in lowmem can eat up lowmem space. This limitation comes from the fact that zbud keeps its internal data structures in a zbud_header placed at the head of each zbud page. For example, zbud pages are tracked by several lists and carry some status information that may be referenced by the kernel at any time, so zbud pages have to be allocated from the directly mapped region, lowmem. After some digging, I found that the internal data structures of zbud can be kept in struct page, the same way zsmalloc does it. So, this series moves all fields of zbud_header out into struct page. Though it touches quite a lot of code, it introduces no functional differences except highmem support. I am not entirely sure that this kind of modification, which reuses several fields of struct page, is acceptable.
Heesub Shin (9):
  mm/zbud: tidy up a bit
  mm/zbud: remove buddied list from zbud_pool
  mm/zbud: remove lru from zbud_header
  mm/zbud: remove first|last_chunks from zbud_header
  mm/zbud: encode zbud handle using struct page
  mm/zbud: remove list_head for buddied list from zbud_header
  mm/zbud: drop zbud_header
  mm/zbud: allow clients to use highmem pages
  mm/zswap: use highmem pages for compressed pool

 mm/zbud.c  | 244 ++---
 mm/zswap.c |   4 +-
 2 files changed, 121 insertions(+), 127 deletions(-)

--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
[RFC PATCH 2/9] mm/zbud: remove buddied list from zbud_pool
There's no point in having the _buddied_ list of zbud pages, as nobody refers to it. Tracking it only adds runtime overhead, so let's remove it.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 6f36394..0f5add0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -79,8 +79,6 @@
  * @unbuddied:	array of lists tracking zbud pages that only contain one buddy;
  *		the lists each zbud page is added to depends on the size of
  *		its free region.
- * @buddied:	list tracking the zbud pages that contain two buddies;
- *		these zbud pages are full
  * @lru:	list tracking the zbud pages in LRU order by most recently
  *		added buddy.
  * @pages_nr:	number of zbud pages in the pool.
@@ -93,7 +91,6 @@
 struct zbud_pool {
 	spinlock_t lock;
 	struct list_head unbuddied[NCHUNKS];
-	struct list_head buddied;
 	struct list_head lru;
 	u64 pages_nr;
 	struct zbud_ops *ops;
@@ -102,7 +99,7 @@ struct zbud_pool {
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
- * @buddy:	links the zbud page into the unbuddied/buddied lists in the pool
+ * @buddy:	links the zbud page into the unbuddied lists in the pool
  * @lru:	links the zbud page into the lru list in the pool
  * @first_chunks:	the size of the first buddy in chunks, 0 if free
  * @last_chunks:	the size of the last buddy in chunks, 0 if free
@@ -299,7 +296,6 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops)
 	spin_lock_init(&pool->lock);
 	for_each_unbuddied_list(i, 0)
 		INIT_LIST_HEAD(&pool->unbuddied[i]);
-	INIT_LIST_HEAD(&pool->buddied);
 	INIT_LIST_HEAD(&pool->lru);
 	pool->pages_nr = 0;
 	pool->ops = ops;
@@ -383,9 +379,6 @@ found:
 		/* Add to unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
-	} else {
-		/* Add to buddied list */
-		list_add(&zhdr->buddy, &pool->buddied);
 	}

 	/* Add/move zbud page to beginning of LRU */
@@ -429,10 +422,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		return;
 	}

-	/* Remove from existing buddy list */
-	list_del(&zhdr->buddy);
-
 	if (num_free_chunks(zhdr) == NCHUNKS) {
+		/* Remove from existing unbuddied list */
+		list_del(&zhdr->buddy);
 		/* zbud page is empty, free */
 		list_del(&zhdr->lru);
 		free_zbud_page(zhdr);
@@ -542,9 +534,6 @@ next:
 		/* add to unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
-	} else {
-		/* add to buddied list */
-		list_add(&zhdr->buddy, &pool->buddied);
 	}

 	/* add to beginning of LRU */
--
1.9.1
[RFC PATCH 5/9] mm/zbud: encode zbud handle using struct page
As a preparation for further patches, this patch changes the way a zbud handle is encoded. Currently, a zbud handle is actually just a virtual address that is cast to unsigned long before being returned. Exporting that address to clients would be inappropriate once we use highmem pages for zbud pages, which will be implemented by the following patches. Change the zbud handle to a struct page pointer with the least significant bit indicating the first or last buddy. All other information is hidden in the struct page.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 50 --
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 193ea4f..383bab0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -240,35 +240,32 @@ static void free_zbud_page(struct zbud_header *zhdr)
 	__free_page(virt_to_page(zhdr));
 }

+static int is_last_chunk(unsigned long handle)
+{
+	return (handle & LAST) == LAST;
+}
+
 /*
  * Encodes the handle of a particular buddy within a zbud page
  * Pool lock should be held as this function accesses first|last_chunks
  */
-static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
+static unsigned long encode_handle(struct page *page, enum buddy bud)
 {
-	unsigned long handle;
-	struct page *page = virt_to_page(zhdr);
+	return (unsigned long) page | bud;
+}

-	/*
-	 * For now, the encoded handle is actually just the pointer to the data
-	 * but this might not always be the case.  A little information hiding.
-	 * Add CHUNK_SIZE to the handle if it is the first allocation to jump
-	 * over the zbud header in the first chunk.
-	 */
-	handle = (unsigned long)zhdr;
-	if (bud == FIRST)
-		/* skip over zbud header */
-		handle += ZHDR_SIZE_ALIGNED;
-	else /* bud == LAST */
-		handle += PAGE_SIZE -
-			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
-	return handle;
+/* Returns struct page of the zbud page where a given handle is stored */
+static struct page *handle_to_zbud_page(unsigned long handle)
+{
+	return (struct page *) (handle & ~LAST);
 }

 /* Returns the zbud page where a given handle is stored */
 static struct zbud_header *handle_to_zbud_header(unsigned long handle)
 {
-	return (struct zbud_header *)(handle & PAGE_MASK);
+	struct page *page = handle_to_zbud_page(handle);
+
+	return page_address(page);
 }

 /* Returns the number of free chunks in a zbud page */
@@ -395,7 +392,7 @@ found:
 	list_del(&page->lru);
 	list_add(&page->lru, &pool->lru);

-	*handle = encode_handle(zhdr, bud);
+	*handle = encode_handle(page, bud);
 	spin_unlock(&pool->lock);

 	return 0;
@@ -514,9 +511,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		first_handle = 0;
 		last_handle = 0;
 		if (get_num_chunks(page, FIRST))
-			first_handle = encode_handle(zhdr, FIRST);
+			first_handle = encode_handle(page, FIRST);
 		if (get_num_chunks(page, LAST))
-			last_handle = encode_handle(zhdr, LAST);
+			last_handle = encode_handle(page, LAST);
 		spin_unlock(&pool->lock);

 		/* Issue the eviction callback(s) */
@@ -570,7 +567,16 @@ next:
  */
 void *zbud_map(struct zbud_pool *pool, unsigned long handle)
 {
-	return (void *)(handle);
+	size_t offset;
+	struct page *page = handle_to_zbud_page(handle);
+
+	if (is_last_chunk(handle))
+		offset = PAGE_SIZE -
+			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
+	else
+		offset = ZHDR_SIZE_ALIGNED;
+
+	return (unsigned char *) page_address(page) + offset;
 }

 /**
--
1.9.1
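Outside the kernel, the encoding above is plain low-bit pointer tagging: a struct page pointer is at least word-aligned, so bit 0 is free to carry the FIRST/LAST flag. Here is a minimal userspace sketch of the same idea; the names mirror the patch, but `struct fake_page` is a stand-in, not the kernel's struct page:

```c
#include <stdint.h>

enum buddy { FIRST = 0, LAST = 1 };

struct fake_page { int dummy; };	/* stand-in for struct page */

/* Pack the page pointer and the buddy flag into one word, as the patch does. */
static unsigned long encode_handle(struct fake_page *page, enum buddy bud)
{
	return (unsigned long) page | bud;
}

/* The flag lives in bit 0 of the handle. */
static int is_last_chunk(unsigned long handle)
{
	return (handle & LAST) == LAST;
}

/* Mask off the tag bit to recover the page pointer. */
static struct fake_page *handle_to_page(unsigned long handle)
{
	return (struct fake_page *) (handle & ~(unsigned long) LAST);
}
```

The scheme only works because the tagged pointer is always at least 2-byte aligned; the same trick underlies the kernel's use of low bits in pointers elsewhere.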
[RFC PATCH 6/9] mm/zbud: remove list_head for buddied list from zbud_header
The zbud allocator links the _unbuddied_ zbud pages into lists in the pool. When it tries to allocate some space, the lists are first searched for the best possible fit. Thus, the current implementation has a list_head in the zbud_header structure to construct the lists. This patch simulates a list using the second double word of struct page instead, so the list_head in zbud_header can be eliminated. Using the _index and _mapcount fields (and _count as well on 64-bit machines) of struct page for list management looks a bit odd, but I have no better idea for now, considering that page->lru is already in use.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 383bab0..8a6dd6b 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -99,10 +99,8 @@ struct zbud_pool {
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
- * @buddy:	links the zbud page into the unbuddied lists in the pool
  */
 struct zbud_header {
-	struct list_head buddy;
 	bool under_reclaim;
 };

@@ -223,21 +221,24 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

 /* Initializes the zbud header of a newly allocated zbud page */
-static struct zbud_header *init_zbud_page(struct page *page)
+static void init_zbud_page(struct page *page)
 {
 	struct zbud_header *zhdr = page_address(page);
 	set_num_chunks(page, FIRST, 0);
 	set_num_chunks(page, LAST, 0);
-	INIT_LIST_HEAD(&zhdr->buddy);
+	INIT_LIST_HEAD((struct list_head *) &page->index);
 	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
-	return zhdr;
 }

 /* Resets the struct page fields and frees the page */
 static void free_zbud_page(struct zbud_header *zhdr)
 {
-	__free_page(virt_to_page(zhdr));
+	struct page *page = virt_to_page(zhdr);
+
+	init_page_count(page);
+	page_mapcount_reset(page);
+	__free_page(page);
 }

 static int is_last_chunk(unsigned long handle)
@@ -341,7 +342,6 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 			unsigned long *handle)
 {
 	int chunks, i, freechunks;
-	struct zbud_header *zhdr = NULL;
 	enum buddy bud;
 	struct page *page;

@@ -355,10 +355,9 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	/* First, try to find an unbuddied zbud page. */
 	for_each_unbuddied_list(i, chunks) {
 		if (!list_empty(&pool->unbuddied[i])) {
-			zhdr = list_first_entry(&pool->unbuddied[i],
-					struct zbud_header, buddy);
-			page = virt_to_page(zhdr);
-			list_del(&zhdr->buddy);
+			page = list_entry((unsigned long *)
+					pool->unbuddied[i].next,
+					struct page, index);
+			list_del((struct list_head *) &page->index);
 			goto found;
 		}
 	}
@@ -370,7 +369,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 		return -ENOMEM;
 	spin_lock(&pool->lock);
 	pool->pages_nr++;
-	zhdr = init_zbud_page(page);
+	init_zbud_page(page);

 found:
 	if (get_num_chunks(page, FIRST) == 0)
@@ -384,7 +383,8 @@ found:
 	    get_num_chunks(page, LAST) == 0) {
 		/* Add to unbuddied list */
 		freechunks = num_free_chunks(page);
-		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		list_add((struct list_head *) &page->index,
+				&pool->unbuddied[freechunks]);
 	}

 	/* Add/move zbud page to beginning of LRU */
@@ -433,14 +433,15 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	freechunks = num_free_chunks(page);
 	if (freechunks == NCHUNKS) {
 		/* Remove from existing unbuddied list */
-		list_del(&zhdr->buddy);
+		list_del((struct list_head *) &page->index);
 		/* zbud page is empty, free */
 		list_del(&page->lru);
 		free_zbud_page(zhdr);
 		pool->pages_nr--;
 	} else {
 		/* Add to unbuddied list */
-		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		list_add((struct list_head *) &page->index,
+				&pool->unbuddied[freechunks]);
 	}

 	spin_unlock(&pool->lock);
@@ -501,7 +502,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		page = list_tail_entry(&pool->lru, struct page, lru);
 		zhdr = page_address(page);
 		list_del(&page->lru);
-		list_del(&zhdr->buddy);
+		list_del((struct list_head *) &page->index);
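The trick in this patch is overlaying a list_head on unused struct page fields and walking back from a list node to its containing page with a container_of-style offset computation. A userspace sketch of the same mechanism, with a hand-rolled circular list standing in for the kernel's `<linux/list.h>` and a fake two-word struct standing in for the second double word of struct page:

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Stand-in for struct page: 'index' plus the following word give enough
 * room to overlay a struct list_head, as the patch assumes for the second
 * double word of struct page. */
struct fake_page {
	unsigned long index;
	unsigned long mapcount;
};

/* The patch's idiom: treat &page->index as a list_head. */
static struct list_head *page_list(struct fake_page *page)
{
	return (struct list_head *) &page->index;
}

/* Recover the page from a list node, mirroring list_entry(..., index). */
static struct fake_page *list_to_page(struct list_head *node)
{
	return container_of((unsigned long *) node, struct fake_page, index);
}
```

This only works because nothing else uses those two words while the page is in the list; the kernel code has to reset `_count` and `_mapcount` before freeing the page, which is exactly what the new free_zbud_page() does.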
[RFC PATCH 8/9] mm/zbud: allow clients to use highmem pages
Now that all fields of the internal data structures of zbud have been moved into struct page, there is no reason to restrict zbud pages to lowmem. This patch allows highmem pages to be used for zbud pages. Pages from highmem are mapped with kmap_atomic() before access.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 5a392f3..677fdc1 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -52,6 +52,7 @@
 #include <linux/spinlock.h>
 #include <linux/zbud.h>
 #include <linux/zpool.h>
+#include <linux/highmem.h>

 /*
  * Structures
@@ -94,6 +95,9 @@ struct zbud_pool {
 	struct zbud_ops *ops;
 };

+/* per-cpu mapping addresses of kmap_atomic()'ed zbud pages */
+static DEFINE_PER_CPU(void *, zbud_mapping);
+
 /*****************
  * zpool
 ****************/
@@ -310,9 +314,6 @@ void zbud_destroy_pool(struct zbud_pool *pool)
  * performed first. If no suitable free region is found, then a new page is
  * allocated and added to the pool to satisfy the request.
  *
- * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used
- * as zbud pool pages.
- *
 * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
 * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
 * a new page.
@@ -324,7 +325,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	enum buddy bud;
 	struct page *page;

-	if (!size || (gfp & __GFP_HIGHMEM))
+	if (!size)
 		return -EINVAL;
 	if (size > PAGE_SIZE - CHUNK_SIZE)
 		return -ENOSPC;
@@ -543,14 +544,24 @@ next:
  */
 void *zbud_map(struct zbud_pool *pool, unsigned long handle)
 {
+	void **mapping;
 	size_t offset = 0;
 	struct page *page = handle_to_zbud_page(handle);

+	/*
+	 * Because we use per-cpu mapping shared among the pools/users,
+	 * we can't allow mapping in interrupt context because it can
+	 * corrupt another user's mappings.
+	 */
+	BUG_ON(in_interrupt());
+
 	if (is_last_chunk(handle))
 		offset = PAGE_SIZE -
 			(get_num_chunks(page, LAST) << CHUNK_SHIFT);

-	return (unsigned char *) page_address(page) + offset;
+	mapping = &get_cpu_var(zbud_mapping);
+	*mapping = kmap_atomic(page);
+	return (char *) *mapping + offset;
 }

 /**
@@ -560,6 +571,10 @@ void *zbud_map(struct zbud_pool *pool, unsigned long handle)
  */
 void zbud_unmap(struct zbud_pool *pool, unsigned long handle)
 {
+	void **mapping = this_cpu_ptr(&zbud_mapping);
+
+	kunmap_atomic(*mapping);
+	put_cpu_var(zbud_mapping);
 }

 /**
--
1.9.1
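With the header gone, the map path locates a buddy purely from the handle's tag bit and the stored chunk count: a FIRST buddy starts at offset 0, a LAST buddy is packed against the end of the page. A standalone sketch of just that offset arithmetic, assuming a 4 KiB page and the series' NCHUNKS_ORDER of 6 (the constants here are local stand-ins for the kernel macros):

```c
#include <stddef.h>

#define PAGE_SIZE_	4096UL		/* assumes PAGE_SHIFT == 12 */
#define NCHUNKS_ORDER	6
#define CHUNK_SHIFT	(12 - NCHUNKS_ORDER)

/* Offset of a buddy inside its zbud page: FIRST buddies start at offset 0,
 * LAST buddies of 'last_chunks' chunks end exactly at the page boundary. */
static size_t buddy_offset(int is_last, size_t last_chunks)
{
	if (is_last)
		return PAGE_SIZE_ - (last_chunks << CHUNK_SHIFT);
	return 0;
}
```

With these assumptions a chunk is 64 bytes, so a LAST buddy of 2 chunks starts at byte 3968 of the page.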
[RFC PATCH 3/9] mm/zbud: remove lru from zbud_header
zbud_pool has an lru list for tracking zbud pages, which are strung together via zhdr->lru. If we reuse page->lru for linking zbud pages instead, the lru field in zbud_header can be dropped.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 0f5add0..a2390f6 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,11 @@ struct zbud_pool {
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
  * @buddy:	links the zbud page into the unbuddied lists in the pool
- * @lru:	links the zbud page into the lru list in the pool
  * @first_chunks:	the size of the first buddy in chunks, 0 if free
  * @last_chunks:	the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
 	struct list_head buddy;
-	struct list_head lru;
 	unsigned int first_chunks;
 	unsigned int last_chunks;
 	bool under_reclaim;
@@ -224,7 +222,7 @@ static struct zbud_header *init_zbud_page(struct page *page)
 	zhdr->first_chunks = 0;
 	zhdr->last_chunks = 0;
 	INIT_LIST_HEAD(&zhdr->buddy);
-	INIT_LIST_HEAD(&zhdr->lru);
+	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
 	return zhdr;
 }
@@ -352,6 +350,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 		if (!list_empty(&pool->unbuddied[i])) {
 			zhdr = list_first_entry(&pool->unbuddied[i],
 					struct zbud_header, buddy);
+			page = virt_to_page(zhdr);
 			list_del(&zhdr->buddy);
 			goto found;
 		}
@@ -382,9 +381,9 @@ found:
 	}

 	/* Add/move zbud page to beginning of LRU */
-	if (!list_empty(&zhdr->lru))
-		list_del(&zhdr->lru);
-	list_add(&zhdr->lru, &pool->lru);
+	if (!list_empty(&page->lru))
+		list_del(&page->lru);
+	list_add(&page->lru, &pool->lru);

 	*handle = encode_handle(zhdr, bud);
 	spin_unlock(&pool->lock);
@@ -405,10 +404,12 @@ found:
 void zbud_free(struct zbud_pool *pool, unsigned long handle)
 {
 	struct zbud_header *zhdr;
+	struct page *page;
 	int freechunks;

 	spin_lock(&pool->lock);
 	zhdr = handle_to_zbud_header(handle);
+	page = virt_to_page(zhdr);

 	/* If first buddy, handle will be page aligned */
 	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
@@ -426,7 +427,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		/* Remove from existing unbuddied list */
 		list_del(&zhdr->buddy);
 		/* zbud page is empty, free */
-		list_del(&zhdr->lru);
+		list_del(&page->lru);
 		free_zbud_page(zhdr);
 		pool->pages_nr--;
 	} else {
@@ -479,6 +480,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 {
 	int i, ret, freechunks;
+	struct page *page;
 	struct zbud_header *zhdr;
 	unsigned long first_handle, last_handle;

@@ -489,8 +491,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		return -EINVAL;
 	}
 	for (i = 0; i < retries; i++) {
-		zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru);
-		list_del(&zhdr->lru);
+		page = list_tail_entry(&pool->lru, struct page, lru);
+		zhdr = page_address(page);
+		list_del(&page->lru);
 		list_del(&zhdr->buddy);
 		/* Protect zbud page against free */
 		zhdr->under_reclaim = true;
@@ -537,7 +540,7 @@ next:
 	}

 	/* add to beginning of LRU */
-	list_add(&zhdr->lru, &pool->lru);
+	list_add(&page->lru, &pool->lru);
 	}
 	spin_unlock(&pool->lock);
 	return -EAGAIN;
--
1.9.1
[RFC PATCH 9/9] mm/zswap: use highmem pages for compressed pool
Now that zbud supports highmem, storing compressed anonymous pages in highmem looks more reasonable. So, pass the __GFP_HIGHMEM flag to zpool when zswap allocates memory from it.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zswap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index ea064c1..eaabe95 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -684,8 +684,8 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

 	/* store */
 	len = dlen + sizeof(struct zswap_header);
-	ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
-			&handle);
+	ret = zpool_malloc(zswap_pool, len,
+			__GFP_NORETRY | __GFP_NOWARN | __GFP_HIGHMEM, &handle);
 	if (ret == -ENOSPC) {
 		zswap_reject_compress_poor++;
 		goto freepage;
--
1.9.1
[RFC PATCH 1/9] mm/zbud: tidy up a bit
For aesthetics, add blank lines between functions, remove useless initialization statements, and simplify the code a bit. No functional differences are introduced.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index ecf1dbe..6f36394 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -145,6 +145,7 @@ static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp,
 {
 	return zbud_alloc(pool, size, gfp, handle);
 }
+
 static void zbud_zpool_free(void *pool, unsigned long handle)
 {
 	zbud_free(pool, handle);
@@ -174,6 +175,7 @@ static void *zbud_zpool_map(void *pool, unsigned long handle,
 {
 	return zbud_map(pool, handle);
 }
+
 static void zbud_zpool_unmap(void *pool, unsigned long handle)
 {
 	zbud_unmap(pool, handle);
@@ -350,16 +352,11 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	spin_lock(&pool->lock);

 	/* First, try to find an unbuddied zbud page. */
-	zhdr = NULL;
 	for_each_unbuddied_list(i, chunks) {
 		if (!list_empty(&pool->unbuddied[i])) {
 			zhdr = list_first_entry(&pool->unbuddied[i],
 					struct zbud_header, buddy);
 			list_del(&zhdr->buddy);
-			if (zhdr->first_chunks == 0)
-				bud = FIRST;
-			else
-				bud = LAST;
 			goto found;
 		}
 	}
@@ -372,13 +369,15 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	spin_lock(&pool->lock);
 	pool->pages_nr++;
 	zhdr = init_zbud_page(page);
-	bud = FIRST;

 found:
-	if (bud == FIRST)
+	if (zhdr->first_chunks == 0) {
 		zhdr->first_chunks = chunks;
-	else
+		bud = FIRST;
+	} else {
 		zhdr->last_chunks = chunks;
+		bud = LAST;
+	}

 	if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
 		/* Add to unbuddied list */
@@ -433,7 +432,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	/* Remove from existing buddy list */
 	list_del(&zhdr->buddy);

-	if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+	if (num_free_chunks(zhdr) == NCHUNKS) {
 		/* zbud page is empty, free */
 		list_del(&zhdr->lru);
 		free_zbud_page(zhdr);
@@ -489,7 +488,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 {
 	int i, ret, freechunks;
 	struct zbud_header *zhdr;
-	unsigned long first_handle = 0, last_handle = 0;
+	unsigned long first_handle, last_handle;

 	spin_lock(&pool->lock);
 	if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) ||
@@ -529,7 +528,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 next:
 	spin_lock(&pool->lock);
 	zhdr->under_reclaim = false;
-	if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+	if (num_free_chunks(zhdr) == NCHUNKS) {
 		/*
 		 * Both buddies are now free, free the zbud page and
 		 * return success.
--
1.9.1
[RFC PATCH 7/9] mm/zbud: drop zbud_header
Now that the only field left in zbud_header is under_reclaim, drop the struct and let the PG_reclaim bit in page->flags take over. As a result of this change, struct zbud_header can finally be eliminated, and hence all the internal data structures of zbud live in struct page.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 66 +--
 1 file changed, 18 insertions(+), 48 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 8a6dd6b..5a392f3 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -60,17 +60,15 @@
  * NCHUNKS_ORDER determines the internal allocation granularity, effectively
  * adjusting internal fragmentation.  It also determines the number of
  * freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the
- * allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk
- * in allocated page is occupied by zbud header, NCHUNKS will be calculated to
- * 63 which shows the max number of free chunks in zbud page, also there will be
- * 63 freelists per pool.
+ * allocation granularity will be in chunks of size PAGE_SIZE/64.
+ * NCHUNKS will be calculated to 64 which shows the max number of free
+ * chunks in zbud page, also there will be 64 freelists per pool.
  */
 #define NCHUNKS_ORDER	6

 #define CHUNK_SHIFT	(PAGE_SHIFT - NCHUNKS_ORDER)
 #define CHUNK_SIZE	(1 << CHUNK_SHIFT)
-#define ZHDR_SIZE_ALIGNED CHUNK_SIZE
-#define NCHUNKS		((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
+#define NCHUNKS		(PAGE_SIZE >> CHUNK_SHIFT)

 /**
  * struct zbud_pool - stores metadata for each zbud pool
@@ -96,14 +94,6 @@ struct zbud_pool {
 	struct zbud_ops *ops;
 };

-/*
- * struct zbud_header - zbud page metadata occupying the first chunk of each
- *			zbud page.
- */
-struct zbud_header {
-	bool under_reclaim;
-};
-
 /*****************
  * zpool
 ****************/
@@ -220,22 +210,19 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
 #define for_each_unbuddied_list(_iter, _begin) \
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

-/* Initializes the zbud header of a newly allocated zbud page */
+/* Initializes a newly allocated zbud page */
 static void init_zbud_page(struct page *page)
 {
-	struct zbud_header *zhdr = page_address(page);
 	set_num_chunks(page, FIRST, 0);
 	set_num_chunks(page, LAST, 0);
 	INIT_LIST_HEAD((struct list_head *) &page->index);
 	INIT_LIST_HEAD(&page->lru);
-	zhdr->under_reclaim = 0;
+	ClearPageReclaim(page);
 }

 /* Resets the struct page fields and frees the page */
-static void free_zbud_page(struct zbud_header *zhdr)
+static void free_zbud_page(struct page *page)
 {
-	struct page *page = virt_to_page(zhdr);
-
 	init_page_count(page);
 	page_mapcount_reset(page);
 	__free_page(page);
@@ -261,14 +248,6 @@ static struct page *handle_to_zbud_page(unsigned long handle)
 	return (struct page *) (handle & ~LAST);
 }

-/* Returns the zbud page where a given handle is stored */
-static struct zbud_header *handle_to_zbud_header(unsigned long handle)
-{
-	struct page *page = handle_to_zbud_page(handle);
-
-	return page_address(page);
-}
-
 /* Returns the number of free chunks in a zbud page */
 static int num_free_chunks(struct page *page)
 {
@@ -347,7 +326,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	if (!size || (gfp & __GFP_HIGHMEM))
 		return -EINVAL;
-	if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+	if (size > PAGE_SIZE - CHUNK_SIZE)
 		return -ENOSPC;
 	chunks = size_to_chunks(size);
 	spin_lock(&pool->lock);
@@ -410,21 +389,18 @@ found:
 void zbud_free(struct zbud_pool *pool, unsigned long handle)
 {
-	struct zbud_header *zhdr;
 	struct page *page;
 	int freechunks;

 	spin_lock(&pool->lock);
-	zhdr = handle_to_zbud_header(handle);
-	page = virt_to_page(zhdr);
+	page = handle_to_zbud_page(handle);

-	/* If first buddy, handle will be page aligned */
-	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
-		set_num_chunks(page, LAST, 0);
-	else
+	if (!is_last_chunk(handle))
 		set_num_chunks(page, FIRST, 0);
+	else
+		set_num_chunks(page, LAST, 0);

-	if (zhdr->under_reclaim) {
+	if (PageReclaim(page)) {
 		/* zbud page is under reclaim, reclaim will free */
 		spin_unlock(&pool->lock);
 		return;
@@ -436,7 +412,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		list_del((struct list_head *) &page->index);
 		/* zbud page is empty, free */
 		list_del(&page->lru);
-		free_zbud_page(zhdr);
+		free_zbud_page(page);
 		pool->pages_nr--;
 	} else {
 		/* Add to unbuddied list */
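Replacing the under_reclaim bool with PG_reclaim is just keeping single-bit state in an existing flags word. A userspace sketch of the Set/Test/Clear idiom (in the kernel these accessors are generated by macros in page-flags.h; the bit number and struct here are made up for illustration):

```c
struct fake_page { unsigned long flags; };	/* stand-in for struct page */

#define PG_reclaim 4	/* arbitrary bit number for this sketch */

/* Set, clear, and test one flag bit, mirroring SetPageReclaim() et al. */
static void SetPageReclaim(struct fake_page *p)
{
	p->flags |= (1UL << PG_reclaim);
}

static void ClearPageReclaim(struct fake_page *p)
{
	p->flags &= ~(1UL << PG_reclaim);
}

static int PageReclaim(struct fake_page *p)
{
	return !!(p->flags & (1UL << PG_reclaim));
}
```

The point of the patch is that this bit already exists in page->flags for every page, so the per-page bool (and with it the whole header chunk) costs nothing to replace.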
[RFC PATCH 4/9] mm/zbud: remove first|last_chunks from zbud_header
The size information of the first and last buddy is stored in first_chunks|last_chunks of struct zbud_header, respectively. Put them into page->private instead of zbud_header.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 62 --
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index a2390f6..193ea4f 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,9 @@ struct zbud_pool {
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
  * @buddy:	links the zbud page into the unbuddied lists in the pool
- * @first_chunks:	the size of the first buddy in chunks, 0 if free
- * @last_chunks:	the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
 	struct list_head buddy;
-	unsigned int first_chunks;
-	unsigned int last_chunks;
 	bool under_reclaim;
 };

@@ -212,6 +208,17 @@ static int size_to_chunks(size_t size)
 	return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;
 }

+static void set_num_chunks(struct page *page, enum buddy bud, size_t chunks)
+{
+	page->private = (page->private & (0xffff << (16 * !bud))) |
+			((chunks & 0xffff) << (16 * bud));
+}
+
+static size_t get_num_chunks(struct page *page, enum buddy bud)
+{
+	return (page->private >> (16 * bud)) & 0xffff;
+}
+
 #define for_each_unbuddied_list(_iter, _begin) \
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

@@ -219,8 +226,8 @@ static int size_to_chunks(size_t size)
 static struct zbud_header *init_zbud_page(struct page *page)
 {
 	struct zbud_header *zhdr = page_address(page);
-	zhdr->first_chunks = 0;
-	zhdr->last_chunks = 0;
+	set_num_chunks(page, FIRST, 0);
+	set_num_chunks(page, LAST, 0);
 	INIT_LIST_HEAD(&zhdr->buddy);
 	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
@@ -240,6 +247,7 @@ static void free_zbud_page(struct zbud_header *zhdr)
 static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
 {
 	unsigned long handle;
+	struct page *page = virt_to_page(zhdr);

 	/*
 	 * For now, the encoded handle is actually just the pointer to the data
@@ -252,7 +260,8 @@ static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
 		/* skip over zbud header */
 		handle += ZHDR_SIZE_ALIGNED;
 	else /* bud == LAST */
-		handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+		handle += PAGE_SIZE -
+			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
 	return handle;
 }

@@ -263,13 +272,14 @@ static struct zbud_header *handle_to_zbud_header(unsigned long handle)
 }

 /* Returns the number of free chunks in a zbud page */
-static int num_free_chunks(struct zbud_header *zhdr)
+static int num_free_chunks(struct page *page)
 {
 	/*
 	 * Rather than branch for different situations, just use the fact that
 	 * free buddies have a length of zero to simplify everything.
 	 */
-	return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
+	return NCHUNKS - get_num_chunks(page, FIRST)
+			- get_num_chunks(page, LAST);
 }

@@ -366,17 +376,17 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	zhdr = init_zbud_page(page);

 found:
-	if (zhdr->first_chunks == 0) {
-		zhdr->first_chunks = chunks;
+	if (get_num_chunks(page, FIRST) == 0)
 		bud = FIRST;
-	} else {
-		zhdr->last_chunks = chunks;
+	else
 		bud = LAST;
-	}

-	if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
+	set_num_chunks(page, bud, chunks);
+
+	if (get_num_chunks(page, FIRST) == 0 ||
+	    get_num_chunks(page, LAST) == 0) {
 		/* Add to unbuddied list */
-		freechunks = num_free_chunks(zhdr);
+		freechunks = num_free_chunks(page);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
 	}

@@ -413,9 +423,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	/* If first buddy, handle will be page aligned */
 	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
-		zhdr->last_chunks = 0;
+		set_num_chunks(page, LAST, 0);
 	else
-		zhdr->first_chunks = 0;
+		set_num_chunks(page, FIRST, 0);

 	if (zhdr->under_reclaim) {
 		/* zbud page is under reclaim, reclaim will free */
@@ -423,7 +433,8 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		return;
 	}

-	if (num_free_chunks(zhdr) == NCHUNKS) {
+	freechunks = num_free_chunks(page);
+	if (freechunks == NCHUNKS) {
 		/* Remove from existing unbuddied list */
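The set/get helpers introduced above pack the two chunk counts into the low and high 16 bits of a single word, indexed by the buddy enum (FIRST == 0, LAST == 1). A standalone sketch of the same bit manipulation, assuming the mask that was garbled in the hunk above was 0xffff and operating on a plain unsigned long instead of page->private:

```c
#include <stddef.h>

enum buddy { FIRST = 0, LAST = 1 };

/* One word holds both counts: bits 0-15 for FIRST, bits 16-31 for LAST.
 * The (16 * !bud) mask keeps the *other* buddy's half untouched. */
static void set_num_chunks(unsigned long *priv, enum buddy bud, size_t chunks)
{
	*priv = (*priv & (0xffffUL << (16 * !bud))) |
		((chunks & 0xffffUL) << (16 * bud));
}

static size_t get_num_chunks(unsigned long *priv, enum buddy bud)
{
	return (*priv >> (16 * bud)) & 0xffff;
}
```

With NCHUNKS at most 64, 16 bits per count is far more than enough, so both counts comfortably share one unsigned long.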
Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order
Hello Kumar,

On 10/06/2014 05:31 PM, Pintu Kumar wrote:
> The Android ion_system_heap uses allocation fallback mechanism based on
> 8,4,0 order pages available in the system. It changes gfp flags based on
> higher order allocation request. This higher order value is hard-coded as
> 4, instead of using the system defined higher order value. Thus replacing
> this hard-coded value with PAGE_ALLOC_COSTLY_ORDER which is defined as 3.
> This will help mapping the higher order request in system heap with the
> actual allocation request.

Quite reasonable.

Reviewed-by: Heesub Shin

BTW, does anyone know how the allocation orders (8, 4 and 0) were decided? I think only the Google guys might know the answer.

regards,
heesub

> Signed-off-by: Pintu Kumar
> ---
>  drivers/staging/android/ion/ion_system_heap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
> index da2a63c..e6c393f 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -65,7 +65,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
>  	} else {
>  		gfp_t gfp_flags = low_order_gfp_flags;
>
> -		if (order > 4)
> +		if (order > PAGE_ALLOC_COSTLY_ORDER)
>  			gfp_flags = high_order_gfp_flags;
>  		page = alloc_pages(gfp_flags | __GFP_COMP, order);
>  		if (!page)
> @@ -276,7 +276,7 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
>  		struct ion_page_pool *pool;
>  		gfp_t gfp_flags = low_order_gfp_flags;
>
> -		if (orders[i] > 4)
> +		if (orders[i] > PAGE_ALLOC_COSTLY_ORDER)
>  			gfp_flags = high_order_gfp_flags;
>  		pool = ion_page_pool_create(gfp_flags, orders[i]);
>  		if (!pool)
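The patch under review only changes the threshold at which the heap switches gfp flag sets for a given allocation order. A sketch of that selection logic with stand-in flag values (the real low_order_gfp_flags/high_order_gfp_flags are kernel gfp_t masks; the constants here are made up for illustration):

```c
/* Stand-in flag words; the real masks live in ion_system_heap.c. */
#define LOW_ORDER_GFP_FLAGS	0x1u
#define HIGH_ORDER_GFP_FLAGS	0x2u
#define PAGE_ALLOC_COSTLY_ORDER	3

/* Orders above PAGE_ALLOC_COSTLY_ORDER are likely to fail, so they get
 * the more permissive-to-failure flag set instead of retrying hard. */
static unsigned int gfp_for_order(unsigned int order)
{
	return order > PAGE_ALLOC_COSTLY_ORDER ? HIGH_ORDER_GFP_FLAGS
					       : LOW_ORDER_GFP_FLAGS;
}
```

With the heap's orders of 8, 4 and 0, the change means order-4 allocations now also take the high-order path, since 4 > 3 while 4 > 4 was false.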
Re: [RFC 3/3] zram: add swap_get_free hint
Hello Minchan, First of all, I agree with the overall purpose of your patch set. On 09/04/2014 10:39 AM, Minchan Kim wrote: This patch implements a SWAP_GET_FREE handler in zram so that the VM can know how much freeable space zram has. The VM can use it to stop anonymous reclaim once zram is full. Signed-off-by: Minchan Kim --- drivers/block/zram/zram_drv.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 88661d62e46a..8e22b20aa2db 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -951,6 +951,22 @@ static int zram_slot_free_notify(struct block_device *bdev, return 0; } +static int zram_get_free_pages(struct block_device *bdev, long *free) +{ + struct zram *zram; + struct zram_meta *meta; + + zram = bdev->bd_disk->private_data; + meta = zram->meta; + + if (!zram->limit_pages) + return 1; + + *free = zram->limit_pages - zs_get_total_pages(meta->mem_pool); Even if 'free' is zero here, there may still be free space available to store more compressed pages into the zs_pool. I mean, the calculation above is not quite accurate and wastes memory, but I have no better idea for now. heesub + + return 0; +} + static int zram_swap_hint(struct block_device *bdev, unsigned int hint, void *arg) { @@ -958,6 +974,8 @@ static int zram_swap_hint(struct block_device *bdev, if (hint == SWAP_SLOT_FREE) ret = zram_slot_free_notify(bdev, (unsigned long)arg); + else if (hint == SWAP_GET_FREE) + ret = zram_get_free_pages(bdev, arg);  return ret; }
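The headroom calculation being reviewed is simple enough to sketch outside the kernel. The function below is a hypothetical userspace analogue of zram_get_free_pages(), not its real signature:

```c
/* Returns 0 and stores the remaining page budget in *free, or 1 when no
 * limit is configured (mirroring the patch's early return). Note the
 * caveat raised in the review: a result of 0 does not mean the zsmalloc
 * pool is literally full, because partially filled zspages can still
 * accept more compressed objects. */
static int free_pages_hint(long limit_pages, long used_pages, long *free)
{
    if (!limit_pages)
        return 1;
    *free = limit_pages - used_pages;
    return 0;
}
```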
[PATCH] staging: ion: fixup invalid kfree() calls on heap destroy
I've noticed that the last commit to ion_system_heap.c ('staging: ion: optimize struct ion_system_heap') has an omission, so an invalid kfree() gets called on ion_system_heap_destroy(). As the ION system heap is never destroyed until system shutdown, it may not cause any harm, but it should be fixed. I should have caught this before the merge, my bad. Signed-off-by: Heesub Shin --- drivers/staging/android/ion/ion_system_heap.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index c826b4c..6b77c51 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -304,7 +304,6 @@ void ion_system_heap_destroy(struct ion_heap *heap) for (i = 0; i < num_orders; i++) ion_page_pool_destroy(sys_heap->pools[i]); - kfree(sys_heap->pools); kfree(sys_heap); } -- 1.9.1
[PATCH v2 RESEND 3/4] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page(). Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index f3b9008..e0a7e54 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -78,9 +78,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, } static void free_buffer_page(struct ion_system_heap *heap, -struct ion_buffer *buffer, struct page *page, -unsigned int order) +struct ion_buffer *buffer, struct page *page) { + unsigned int order = compound_order(page); bool cached = ion_buffer_cached(buffer); if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) { @@ -171,7 +171,7 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(page, tmp_page, &pages, lru) - free_buffer_page(sys_heap, buffer, page, compound_order(page)); + free_buffer_page(sys_heap, buffer, page); return -ENOMEM; } @@ -191,8 +191,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer) ion_heap_buffer_zero(buffer); for_each_sg(table->sgl, sg, table->nents, i) - free_buffer_page(sys_heap, buffer, sg_page(sg), - get_order(sg->length)); + free_buffer_page(sys_heap, buffer, sg_page(sg)); sg_free_table(table); kfree(table); } -- 1.9.1
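The idea behind the patch above — a compound page already carries its own order, so callers need not thread it through — can be sketched in userspace. `fake_page` and the helper names are illustrative stand-ins, not the kernel API:

```c
#include <stddef.h>

/* Illustrative stand-in for struct page: once the allocation order
 * travels with the page itself, a function like free_buffer_page()
 * no longer needs an explicit order parameter. */
struct fake_page {
    unsigned int order;      /* what compound_order(page) would return */
};

static unsigned int fake_compound_order(const struct fake_page *p)
{
    return p->order;
}

/* Bytes covered by the page, i.e. PAGE_SIZE << compound_order(page),
 * assuming a 4 KiB base page. */
static size_t fake_page_span(const struct fake_page *p)
{
    return (size_t)4096 << fake_compound_order(p);
}
```

This also removes a class of mismatch bugs: the freed span is always derived from the page itself rather than from a separately tracked length, as `get_order(sg->length)` was.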
[PATCH v2 RESEND 4/4] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies code lines and reduces overhead to the slab. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index e0a7e54..c826b4c 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -49,7 +49,7 @@ static inline unsigned int order_to_size(int order) struct ion_system_heap { struct ion_heap heap; - struct ion_page_pool **pools; + struct ion_page_pool *pools[0]; }; static struct page *alloc_buffer_page(struct ion_system_heap *heap, @@ -264,16 +264,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused) struct ion_system_heap *heap; int i; - heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL); + heap = kzalloc(sizeof(struct ion_system_heap) + + sizeof(struct ion_page_pool *) * num_orders, + GFP_KERNEL); if (!heap) return ERR_PTR(-ENOMEM); heap->heap.ops = &system_heap_ops; heap->heap.type = ION_HEAP_TYPE_SYSTEM; heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE; - heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders, - GFP_KERNEL); - if (!heap->pools) - goto free_heap; + for (i = 0; i < num_orders; i++) { struct ion_page_pool *pool; gfp_t gfp_flags = low_order_gfp_flags; @@ -292,8 +291,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused) destroy_pools: while (i--) ion_page_pool_destroy(heap->pools[i]); - kfree(heap->pools); -free_heap: kfree(heap); return ERR_PTR(-ENOMEM); } -- 1.9.1
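The single-allocation layout from the patch above can be demonstrated in userspace C. `heap_create` and `struct pool` are illustrative names, and the flexible array member is written in C99 `[]` style rather than the GNU `[0]` form used in the diff:

```c
#include <stdlib.h>

struct pool;                     /* opaque stand-in for ion_page_pool */

struct heap {
    int num_pools;
    struct pool *pools[];        /* trailing flexible array member */
};

/* One allocation covers both the header and the pointer array, so
 * teardown is a single free() and there is no separate allocation
 * failure path to unwind. */
static struct heap *heap_create(int num_pools)
{
    struct heap *h = calloc(1, sizeof(*h) +
                               sizeof(struct pool *) * num_pools);
    if (h)
        h->num_pools = num_pools;
    return h;
}
```

This is exactly why the patch can delete the `free_heap:` label: with the array folded into the structure, the second kzalloc() and its error path disappear.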
[PATCH v2 RESEND 2/4] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such data type anymore. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 47 +-- 1 file changed, 15 insertions(+), 32 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index 621b857..f3b9008 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -52,11 +52,6 @@ struct ion_system_heap { struct ion_page_pool **pools; }; -struct page_info { - struct page *page; - struct list_head list; -}; - static struct page *alloc_buffer_page(struct ion_system_heap *heap, struct ion_buffer *buffer, unsigned long order) @@ -98,19 +93,14 @@ static void free_buffer_page(struct ion_system_heap *heap, } -static struct page_info *alloc_largest_available(struct ion_system_heap *heap, -struct ion_buffer *buffer, -unsigned long size, -unsigned int max_order) +static struct page *alloc_largest_available(struct ion_system_heap *heap, + struct ion_buffer *buffer, + unsigned long size, + unsigned int max_order) { struct page *page; - struct page_info *info; int i; - info = kmalloc(sizeof(struct page_info), GFP_KERNEL); - if (!info) - return NULL; - for (i = 0; i < num_orders; i++) { if (size < order_to_size(orders[i])) continue; @@ -121,10 +111,8 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, if (!page) continue; - info->page = page; - return info; + return page; } - kfree(info); return NULL; } @@ -140,7 +128,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap, struct sg_table *table; struct scatterlist *sg; struct list_head pages; - struct page_info *info, *tmp_info; + struct page *page, *tmp_page; int i = 0; unsigned long size_remaining = PAGE_ALIGN(size); unsigned int max_order = orders[0]; @@ -153,13 +141,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap, INIT_LIST_HEAD(&pages); while (size_remaining > 0) { - info = alloc_largest_available(sys_heap, buffer, size_remaining, + page = alloc_largest_available(sys_heap, buffer, size_remaining, max_order); - if (!info) + if (!page) goto free_pages; - list_add_tail(&info->list, &pages); - size_remaining -= PAGE_SIZE << compound_order(info->page); - max_order = compound_order(info->page); + list_add_tail(&page->lru, &pages); + size_remaining -= PAGE_SIZE << compound_order(page); + max_order = compound_order(page); i++; } table = kmalloc(sizeof(struct sg_table), GFP_KERNEL); @@ -170,12 +158,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap, goto free_table; sg = table->sgl; - list_for_each_entry_safe(info, tmp_info, &pages, list) { - struct page *page = info->page; + list_for_each_entry_safe(page, tmp_page, &pages, lru) { sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0); sg = sg_next(sg); - list_del(&info->list); - kfree(info); + list_del(&page->lru); } buffer->priv_virt = table; @@ -184,11 +170,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap, free_table: kfree(table); free_pages: - list_for_each_entry_safe(info, tmp_info, &pages, list) { - free_buffer_page(sys_heap, buffer, info->page, - compound_order(info->page)); - kfree(info); - } + list_for_each_entry_safe(page, tmp_page, &pages, lru) + free_buffer_page(sys_heap, buffer, page, compound_order(page)); return -ENOMEM; } -- 1.9.1
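The pattern the patch above applies — linking objects through a node embedded in the object itself, instead of kmalloc'ing a wrapper node per element — can be sketched with a minimal doubly linked list. `fake_page` and the helpers below are illustrative, loosely modelled on the kernel's `list_head` and `container_of`:

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* The tracked object embeds its own link (as struct page embeds lru),
 * so no per-element wrapper like page_info is ever allocated and the
 * allocation-failure path for it disappears. */
struct fake_page {
    int id;
    struct list_head lru;
};

/* Recover the containing object from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

This is the same reasoning the commit message gives: once `page_info` carries nothing but a page pointer and a list node, the `page->lru` field already provides both.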
[PATCH v2 RESEND 1/4] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store page order in struct page_info. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index cb7ae08..621b857 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -54,7 +54,6 @@ struct ion_system_heap { struct page_info { struct page *page; - unsigned int order; struct list_head list; }; @@ -123,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, continue; info->page = page; - info->order = orders[i]; return info; } kfree(info); @@ -160,8 +158,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap, if (!info) goto free_pages; list_add_tail(>list, ); - size_remaining -= PAGE_SIZE << info->order; - max_order = info->order; + size_remaining -= PAGE_SIZE << compound_order(info->page); + max_order = compound_order(info->page); i++; } table = kmalloc(sizeof(struct sg_table), GFP_KERNEL); @@ -174,7 +172,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap, sg = table->sgl; list_for_each_entry_safe(info, tmp_info, , list) { struct page *page = info->page; - sg_set_page(sg, page, PAGE_SIZE << info->order, 0); + sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0); sg = sg_next(sg); list_del(>list); kfree(info); @@ -187,7 +185,8 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(info, tmp_info, , list) { - free_buffer_page(sys_heap, buffer, info->page, info->order); + free_buffer_page(sys_heap, buffer, info->page, + compound_order(info->page)); kfree(info); } return -ENOMEM; -- 1.9.1 -- To 
unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/9] staging: ion: tidy up a bit
On 05/30/2014 05:42 AM, Greg Kroah-Hartman wrote: On Wed, May 28, 2014 at 03:52:52PM +0900, Heesub Shin wrote: For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz Not all of these would apply to my tree. So I've applied the ones that did. Please resend the remaining ones, after refreshing them against my staging-next branch of the staging.git tree. Hi KH, I will rebase them on staging-next and resend soon. Thanks a lot! regards, heesub
[PATCH v2 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page(). Signed-off-by: Heesub Shin --- drivers/staging/android/ion/ion_system_heap.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index f0ae210..d78d589e 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, } static void free_buffer_page(struct ion_system_heap *heap, -struct ion_buffer *buffer, struct page *page, -unsigned int order) +struct ion_buffer *buffer, struct page *page) { + unsigned int order = compound_order(page); bool cached = ion_buffer_cached(buffer); if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) { @@ -169,7 +169,7 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(page, tmp_page, &pages, lru) - free_buffer_page(sys_heap, buffer, page, compound_order(page)); + free_buffer_page(sys_heap, buffer, page); return -ENOMEM; } @@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer) ion_heap_buffer_zero(buffer); for_each_sg(table->sgl, sg, table->nents, i) - free_buffer_page(sys_heap, buffer, sg_page(sg), - get_order(sg->length)); + free_buffer_page(sys_heap, buffer, sg_page(sg)); sg_free_table(table); kfree(table); } -- 1.9.1
[PATCH v2 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies the code and reduces slab overhead.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)
 
 struct ion_system_heap {
 	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };
 
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 	struct ion_system_heap *heap;
 	int i;
 
-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
 	if (!heap)
 		return ERR_PTR(-ENOMEM);
 	heap->heap.ops = &system_heap_ops;
 	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
 	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
 	for (i = 0; i < num_orders; i++) {
 		struct ion_page_pool *pool;
 		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
 	while (i--)
 		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
 	kfree(heap);
 	return ERR_PTR(-ENOMEM);
 }
-- 
1.9.1
[PATCH v2 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether highmem pages or only lowmem pages should be freed. Usually, the slab shrinker is invoked in kswapd context, where gfp_mask is always GFP_KERNEL, so only lowmem pages are released in kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds an excessive amount of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both lowmem and highmem in kswapd context.

Reported-by: Wonseo Choi
Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include <linux/swap.h>
 #include "ion_priv.h"
 
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
 	int freed;
 	bool high;
 
-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);
 
 	if (nr_to_scan == 0)
 		return ion_page_pool_total(pool, high);
-- 
1.9.1
[PATCH v2 6/9] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build the scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such a data type anymore.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +++++++++++++++--------------------------------
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
 	struct ion_page_pool **pools;
 };
 
-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 				      struct ion_buffer *buffer,
 				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }
 
-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
 	struct page *page;
-	struct page_info *info;
 	int i;
 
-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
 	for (i = 0; i < num_orders; i++) {
 		if (size < order_to_size(orders[i]))
 			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
 		if (!page)
 			continue;
 
-		info->page = page;
-		return info;
+		return page;
 	}
-	kfree(info);
 
 	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	struct sg_table *table;
 	struct scatterlist *sg;
 	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
 	int i = 0;
 	unsigned long size_remaining = PAGE_ALIGN(size);
 	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	INIT_LIST_HEAD(&pages);
 	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
 			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		goto free_table;
 
 	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
 		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
 	}
 
 	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
 	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
 	return -ENOMEM;
 }
-- 
1.9.1
[PATCH v2 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 
 struct page_info {
 	struct page *page;
-	unsigned int order;
 	struct list_head list;
 };
 
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 			continue;
 
 		info->page = page;
-		info->order = orders[i];
 		return info;
 	}
 	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		if (!info)
 			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,7 +183,8 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
 		kfree(info);
 	}
 	return -ENOMEM;
-- 
1.9.1
[PATCH v2 4/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 111777c..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -95,6 +95,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
 	int ret;
 
+	BUG_ON(pool->order != compound_order(page));
+
 	ret = ion_page_pool_add(pool, page);
 	if (ret)
 		ion_page_pool_free_pages(pool, page);
@@ -150,7 +152,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
 	pool->low_count = 0;
 	INIT_LIST_HEAD(&pool->low_items);
 	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
 	pool->order = order;
 	mutex_init(&pool->mutex);
 	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		if (order > 4)
 			gfp_flags = high_order_gfp_flags;
-		page = alloc_pages(gfp_flags, order);
+		page = alloc_pages(gfp_flags | __GFP_COMP, order);
 		if (!page)
 			return NULL;
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
-- 
1.9.1
[PATCH v2 3/9] staging: ion: remove struct ion_page_pool_item
The page pool uses an internal data structure, ion_page_pool_item, for wrapping pooled pages and constructing a list. As the struct page already provides ways for doing exactly the same thing, we do not need to reinvent the wheel. This commit removes the data structure and the slab allocations for it.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..111777c 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"
 
-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
 	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
 	mutex_lock(&pool->mutex);
-	item->page = page;
 	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
 		pool->high_count++;
 	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
 		pool->low_count++;
 	}
 	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
 	struct page *page;
 
 	if (high) {
 		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
 		pool->high_count--;
 	} else {
 		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
 		pool->low_count--;
 	}
 
-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
 	return page;
 }
-- 
1.9.1
[PATCH v2 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return types.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   |  2 +-
 drivers/staging/android/ion/ion_priv.h        |  2 +-
 drivers/staging/android/ion/ion_system_heap.c | 57 ---
 3 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ecb5fc3..ead4054 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 	return page;
 }
 
-void *ion_page_pool_alloc(struct ion_page_pool *pool)
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
 {
 	struct page *page = NULL;
 
diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h
index 1eba3f2..365c947 100644
--- a/drivers/staging/android/ion/ion_priv.h
+++ b/drivers/staging/android/ion/ion_priv.h
@@ -377,7 +377,7 @@ struct ion_page_pool {
 struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order);
 void ion_page_pool_destroy(struct ion_page_pool *);
-void *ion_page_pool_alloc(struct ion_page_pool *);
+struct page *ion_page_pool_alloc(struct ion_page_pool *);
 void ion_page_pool_free(struct ion_page_pool *, struct page *);
 
 /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index c923633..b554751 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -41,7 +41,7 @@ static int order_to_index(unsigned int order)
 	return -1;
 }
 
-static unsigned int order_to_size(int order)
+static inline unsigned int order_to_size(int order)
 {
 	return PAGE_SIZE << order;
 }
@@ -78,8 +78,6 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
					  DMA_BIDIRECTIONAL);
 	}
-	if (!page)
-		return NULL;
 
 	return page;
 }
@@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 
 		info->page = page;
 		info->order = orders[i];
-		INIT_LIST_HEAD(&info->list);
 		return info;
 	}
 	kfree(info);
@@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
							heap);
 	struct sg_table *table;
 	struct scatterlist *sg;
-	int ret;
 	struct list_head pages;
 	struct page_info *info, *tmp_info;
 	int i = 0;
@@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		info = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
 		if (!info)
-			goto err;
+			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= (1 << info->order) * PAGE_SIZE;
+		size_remaining -= PAGE_SIZE << info->order;
 		max_order = info->order;
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
 	if (!table)
-		goto err;
+		goto free_pages;
 
-	ret = sg_alloc_table(table, i, GFP_KERNEL);
-	if (ret)
-		goto err1;
+	if (sg_alloc_table(table, i, GFP_KERNEL))
+		goto free_table;
 
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, (1 << info->order) * PAGE_SIZE, 0);
+		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 
 	buffer->priv_virt = table;
 	return 0;
-err1:
+
+free_table:
 	kfree(table);
-err:
+free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		free_buffer_page(sys_heap, buffer, info->page, info->order);
 		kfree(info);
 	}
@@ -197,14 +193,12 @@ err:
 
 static void ion_system_heap_free(struct ion_buffer *buffer)
 {
-	struct ion_heap *heap = buffer->heap;
-	struct ion_system_heap *sys_heap = container_of(heap,
+	struct ion_system_heap *sys_heap = container_of(buffer->heap,
							struct ion_system_heap,
							heap);
[PATCH v2 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts highmem pages in or not. This commit simplifies the code for better readability.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 
 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;
 
-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-			pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }
 
 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
-- 
1.9.1
[PATCH v2 0/9] staging: ion: system heap and page pool fixes
Hi,

Here is my patchset with some modifications, hoping for reviews or comments from you guys.

v2:
 o No changes in the code, just reworded changelog
 o Reorder patch

Heesub Shin (9):
  staging: ion: tidy up a bit
  staging: ion: simplify ion_page_pool_total()
  staging: ion: remove struct ion_page_pool_item
  staging: ion: use compound pages on high order pages for system heap
  staging: ion: remove order from struct page_info
  staging: ion: remove struct page_info
  staging: ion: remove order argument from free_buffer_page()
  staging: ion: shrink highmem pages on kswapd
  staging: ion: optimize struct ion_system_heap

 drivers/staging/android/ion/ion_page_pool.c   |  49 ---
 drivers/staging/android/ion/ion_priv.h        |   2 +-
 drivers/staging/android/ion/ion_system_heap.c | 121 ++
 3 files changed, 67 insertions(+), 105 deletions(-)

-- 
1.9.1
[PATCH v2 0/9] staging: ion: system heap and page pool fixes
Hi, Here is my patchset with some modification, hoping reviews or comments from you guys. v2: o No changes in the code, just reworded changelog o Reorder patch Heesub Shin (9): staging: ion: tidy up a bit staging: ion: simplify ion_page_pool_total() staging: ion: remove struct ion_page_pool_item staging: ion: use compound pages on high order pages for system heap staging: ion: remove order from struct page_info staging: ion: remove struct page_info staging: ion: remove order argument from free_buffer_page() staging: ion: shrink highmem pages on kswapd staging: ion: optimize struct ion_system_heap drivers/staging/android/ion/ion_page_pool.c | 49 --- drivers/staging/android/ion/ion_priv.h| 2 +- drivers/staging/android/ion/ion_system_heap.c | 121 ++ 3 files changed, 67 insertions(+), 105 deletions(-) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts highmem pages in or not. This commit simplifies the code lines for better readability. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index ead4054..1684780 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page) static int ion_page_pool_total(struct ion_page_pool *pool, bool high) { - int total = 0; + int count = pool-low_count; - total += high ? (pool-high_count + pool-low_count) * - (1 pool-order) : - pool-low_count * (1 pool-order); - return total; + if (high) + count += pool-high_count; + + return count pool-order; } int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask, -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 2 +- drivers/staging/android/ion/ion_priv.h| 2 +- drivers/staging/android/ion/ion_system_heap.c | 57 --- 3 files changed, 28 insertions(+), 33 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index ecb5fc3..ead4054 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high) return page; } -void *ion_page_pool_alloc(struct ion_page_pool *pool) +struct page *ion_page_pool_alloc(struct ion_page_pool *pool) { struct page *page = NULL; diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h index 1eba3f2..365c947 100644 --- a/drivers/staging/android/ion/ion_priv.h +++ b/drivers/staging/android/ion/ion_priv.h @@ -377,7 +377,7 @@ struct ion_page_pool { struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order); void ion_page_pool_destroy(struct ion_page_pool *); -void *ion_page_pool_alloc(struct ion_page_pool *); +struct page *ion_page_pool_alloc(struct ion_page_pool *); void ion_page_pool_free(struct ion_page_pool *, struct page *); /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index c923633..b554751 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -41,7 +41,7 @@ static int order_to_index(unsigned int order) return -1; } -static unsigned int order_to_size(int order) +static inline unsigned int order_to_size(int order) { return PAGE_SIZE order; } @@ -78,8 +78,6 @@ static struct page 
*alloc_buffer_page(struct ion_system_heap *heap, ion_pages_sync_for_device(NULL, page, PAGE_SIZE order, DMA_BIDIRECTIONAL); } - if (!page) - return NULL; return page; } @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, info-page = page; info-order = orders[i]; - INIT_LIST_HEAD(info-list); return info; } kfree(info); @@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap, heap); struct sg_table *table; struct scatterlist *sg; - int ret; struct list_head pages; struct page_info *info, *tmp_info; int i = 0; @@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap, info = alloc_largest_available(sys_heap, buffer, size_remaining, max_order); if (!info) - goto err; + goto free_pages; list_add_tail(info-list, pages); - size_remaining -= (1 info-order) * PAGE_SIZE; + size_remaining -= PAGE_SIZE info-order; max_order = info-order; i++; } table = kzalloc(sizeof(struct sg_table), GFP_KERNEL); if (!table) - goto err; + goto free_pages; - ret = sg_alloc_table(table, i, GFP_KERNEL); - if (ret) - goto err1; + if (sg_alloc_table(table, i, GFP_KERNEL)) + goto free_table; sg = table-sgl; list_for_each_entry_safe(info, tmp_info, pages, list) { struct page *page = info-page; - sg_set_page(sg, page, (1 info-order) * PAGE_SIZE, 0); + sg_set_page(sg, page, PAGE_SIZE info-order, 0); sg = sg_next(sg); list_del(info-list); kfree(info); @@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap, buffer-priv_virt = table; return 0; -err1: + +free_table: kfree(table); -err: +free_pages: list_for_each_entry_safe(info, tmp_info, pages, list) { free_buffer_page(sys_heap, buffer, info-page, info-order); kfree(info); @@ -197,14 +193,12 @@ err: static void ion_system_heap_free(struct ion_buffer *buffer) { - struct ion_heap *heap = buffer-heap; - struct ion_system_heap *sys_heap = container_of(heap, + struct ion_system_heap *sys_heap = container_of(buffer-heap, struct ion_system_heap, 
heap
[PATCH v2 3/9] staging: ion: remove struct ion_page_pool_item
The page pool uses an internal data structure, ion_page_pool_item, for wrapping pooled pages and constructing a list. As the struct page already provides ways for doing exactly the same thing, we do not need to reinvent the wheel. This commit removes the data structure and slab allocations for it. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index 1684780..111777c 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -23,11 +23,6 @@ #include linux/slab.h #include ion_priv.h -struct ion_page_pool_item { - struct page *page; - struct list_head list; -}; - static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool) { struct page *page = alloc_pages(pool-gfp_mask, pool-order); @@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool, static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page) { - struct ion_page_pool_item *item; - - item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL); - if (!item) - return -ENOMEM; - mutex_lock(pool-mutex); - item-page = page; if (PageHighMem(page)) { - list_add_tail(item-list, pool-high_items); + list_add_tail(page-lru, pool-high_items); pool-high_count++; } else { - list_add_tail(item-list, pool-low_items); + list_add_tail(page-lru, pool-low_items); pool-low_count++; } mutex_unlock(pool-mutex); @@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page) static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high) { - struct ion_page_pool_item *item; struct page *page; if (high) { BUG_ON(!pool-high_count); - item = list_first_entry(pool-high_items, - struct ion_page_pool_item, list); + page = list_first_entry(pool-high_items, struct page, lru); 
pool-high_count--; } else { BUG_ON(!pool-low_count); - item = list_first_entry(pool-low_items, - struct ion_page_pool_item, list); + page = list_first_entry(pool-low_items, struct page, lru); pool-low_count--; } - list_del(item-list); - page = item-page; - kfree(item); + list_del(page-lru); return page; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves burden on tracking the meta information which are currently stored in page_info. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 4 +++- drivers/staging/android/ion/ion_system_heap.c | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index 111777c..c1cea42b 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -95,6 +95,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page) { int ret; + BUG_ON(pool-order != compound_order(page)); + ret = ion_page_pool_add(pool, page); if (ret) ion_page_pool_free_pages(pool, page); @@ -150,7 +152,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order) pool-low_count = 0; INIT_LIST_HEAD(pool-low_items); INIT_LIST_HEAD(pool-high_items); - pool-gfp_mask = gfp_mask; + pool-gfp_mask = gfp_mask | __GFP_COMP; pool-order = order; mutex_init(pool-mutex); plist_node_init(pool-list, order); diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index b554751..48e6a58 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, if (order 4) gfp_flags = high_order_gfp_flags; - page = alloc_pages(gfp_flags, order); + page = alloc_pages(gfp_flags | __GFP_COMP, order); if (!page) return NULL; ion_pages_sync_for_device(NULL, page, PAGE_SIZE order, -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 struct page_info {
	struct page *page;
-	unsigned int order;
	struct list_head list;
 };
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
			continue;

		info->page = page;
-		info->order = orders[i];
		return info;
	}
	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		if (!info)
			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,7 +183,8 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
		kfree(info);
	}
	return -ENOMEM;
--
1.9.1
[PATCH v2 6/9] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build the scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such a data type anymore.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +--
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
	struct ion_page_pool **pools;
 };

-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
				      struct ion_buffer *buffer,
				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }

-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
	struct page *page;
-	struct page_info *info;
	int i;

-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
	for (i = 0; i < num_orders; i++) {
		if (size < order_to_size(orders[i]))
			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
		if (!page)
			continue;

-		info->page = page;
-		return info;
+		return page;
	}
-	kfree(info);

	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	struct sg_table *table;
	struct scatterlist *sg;
	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
	int i = 0;
	unsigned long size_remaining = PAGE_ALIGN(size);
	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	INIT_LIST_HEAD(&pages);
	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		goto free_table;

	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
	}

	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
	return -ENOMEM;
 }
--
1.9.1
[PATCH v2 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem need to be freed. Usually, the slab shrinker is invoked on the kswapd context, whose gfp_mask is always GFP_KERNEL, so only lowmem pages are released on the kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both the lowmem and highmem zones on the kswapd context.

Reported-by: Wonseo Choi wonseo.c...@samsung.com
Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"

 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
	int freed;
	bool high;

-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);

	if (nr_to_scan == 0)
		return ion_page_pool_total(pool, high);
--
1.9.1
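The shrink decision this patch changes can be sketched in plain C. This is a userspace analogue under stated assumptions: `FAKE_GFP_HIGHMEM` is an invented stand-in for the kernel's __GFP_HIGHMEM bit (the real value varies), and the `is_kswapd` flag replaces current_is_kswapd().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant standing in for the kernel's __GFP_HIGHMEM bit. */
#define FAKE_GFP_HIGHMEM 0x02u

/* Sketch of the decision after the patch: kswapd may reclaim highmem
 * pool pages even though its gfp_mask lacks the highmem bit. */
static bool shrink_highmem(unsigned int gfp_mask, bool is_kswapd)
{
    if (is_kswapd)
        return true;                /* kswapd: reclaim both zones */
    return (gfp_mask & FAKE_GFP_HIGHMEM) != 0;  /* direct reclaim: as before */
}
```

Before the patch, only the second branch existed, so a GFP_KERNEL mask (no highmem bit) on kswapd never released pooled highmem pages.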
[PATCH v2 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable length array simplifies the code and reduces overhead on the slab.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)

 struct ion_system_heap {
	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };

 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
	struct ion_system_heap *heap;
	int i;

-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
	if (!heap)
		return ERR_PTR(-ENOMEM);
	heap->heap.ops = &system_heap_ops;
	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
	for (i = 0; i < num_orders; i++) {
		struct ion_page_pool *pool;
		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
	while (i--)
		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
	kfree(heap);
	return ERR_PTR(-ENOMEM);
 }
--
1.9.1
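The single-allocation trick used above is plain C and can be shown standalone. Note the patch uses the older GNU `[0]` idiom; standard C99 spells the flexible array member `[]`, as in this sketch. All names here are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pool { int order; };         /* stand-in for struct ion_page_pool */

struct sys_heap {
    int flags;
    struct pool *pools[];           /* C99 flexible array member */
};

/* One calloc covers the struct and the trailing pointer array, replacing
 * the two separate allocations the original code performed. */
static struct sys_heap *heap_create(size_t num_orders)
{
    return calloc(1, sizeof(struct sys_heap) +
                     sizeof(struct pool *) * num_orders);
}
```

The error path also shrinks: a single free releases both the structure and the array, so the separate `free_heap` label in the original code becomes unnecessary.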
[PATCH v2 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }

 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
	bool cached = ion_buffer_cached(buffer);

	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
	return -ENOMEM;
 }

@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
	ion_heap_buffer_zero(buffer);

	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
	sg_free_table(table);
	kfree(table);
 }
--
1.9.1
Re: [PATCH 1/9] staging: ion: tidy up a bit
Hello Carpenter,

On 05/26/2014 07:36 PM, Dan Carpenter wrote:

  On Mon, May 26, 2014 at 07:04:53PM +0900, Heesub Shin wrote:

    @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
		info->page = page;
		info->order = orders[i];
    -		INIT_LIST_HEAD(&info->list);
		return info;
	}
	kfree(info);

  Wait. How does this code work without that INIT_LIST_HEAD()? What am I missing here...

No problem. As the object info is just a node, not a head, it is completely useless to initialize it as a list head.

regards,
Heesub

  regards,
  dan carpenter
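Heesub's answer can be demonstrated with a minimal intrusive list mimicking the kernel's list_head: list_add_tail() unconditionally overwrites both pointers of the node being inserted, so an uninitialized node is fine. This is a simplified sketch, not the kernel's actual implementation.

```c
#include <assert.h>

/* Minimal intrusive list in the style of the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Insert node before head, i.e. at the tail of the circular list. Both of
 * node's pointers are written before anything reads them, which is why a
 * node (unlike a head) needs no INIT_LIST_HEAD. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}
```

A head, by contrast, must be initialized (to point at itself) because list_add_tail() reads `head->prev` before writing anything.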
Re: [PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Hello,

On 05/26/2014 07:04 PM, Heesub Shin wrote:

  Now that the order information is held on struct page itself, we do not need to use extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Oops. I need to amend the changelog above and resend this patchset.

regards,
Heesub
[PATCH 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store page order in struct page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 struct page_info {
	struct page *page;
-	unsigned int order;
	struct list_head list;
 };
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
			continue;

		info->page = page;
-		info->order = orders[i];
		return info;
	}
	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		if (!info)
			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,7 +183,8 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
		kfree(info);
	}
	return -ENOMEM;
--
1.9.1
[PATCH 6/9] staging: ion: remove struct page_info
ION system heap uses a temporary list holding meta data on the pages allocated to build the scatter/gather table. Now that the pages are compound pages, we do not need to introduce a new data type redundantly.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +--
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
	struct ion_page_pool **pools;
 };

-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
				      struct ion_buffer *buffer,
				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }

-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
	struct page *page;
-	struct page_info *info;
	int i;

-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
	for (i = 0; i < num_orders; i++) {
		if (size < order_to_size(orders[i]))
			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
		if (!page)
			continue;

-		info->page = page;
-		return info;
+		return page;
	}
-	kfree(info);

	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	struct sg_table *table;
	struct scatterlist *sg;
	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
	int i = 0;
	unsigned long size_remaining = PAGE_ALIGN(size);
	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	INIT_LIST_HEAD(&pages);
	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		goto free_table;

	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
	}

	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
	return -ENOMEM;
 }
--
1.9.1
[PATCH 3/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..0141b3b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -112,6 +112,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
	int ret;

+	BUG_ON(pool->order != compound_order(page));
+
	ret = ion_page_pool_add(pool, page);
	if (ret)
		ion_page_pool_free_pages(pool, page);
@@ -167,7 +169,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
	pool->low_count = 0;
	INIT_LIST_HEAD(&pool->low_items);
	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
	pool->order = order;
	mutex_init(&pool->mutex);
	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
	if (order > 4)
		gfp_flags = high_order_gfp_flags;

-	page = alloc_pages(gfp_flags, order);
+	page = alloc_pages(gfp_flags | __GFP_COMP, order);
	if (!page)
		return NULL;
	ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
--
1.9.1
[PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Now that the order information is held on struct page itself, we do not need to use extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +--
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 0141b3b..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"

-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
	mutex_lock(&pool->mutex);
-	item->page = page;
	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
		pool->high_count++;
	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
		pool->low_count++;
	}
	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
	struct page *page;

	if (high) {
		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
		pool->high_count--;
	} else {
		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
		pool->low_count--;
	}

-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
	return page;
 }
--
1.9.1
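The pattern this patch applies, embedding the list link in the object itself (as struct page embeds `lru`) instead of kmalloc'ing a wrapper per pooled item, can be shown in a standalone sketch. The helpers below are simplified analogues of the kernel's list macros, and `fake_page` is an invented stand-in for struct page.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct fake_page {
    int id;
    struct list_head lru;   /* plays the role of page->lru */
};

/* Simplified analogues of the kernel helpers the patch relies on. */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* list_first_entry analogue: recover the containing object from its
 * embedded link by subtracting the member offset. */
#define first_entry(head, type, member) \
    ((type *)((char *)(head)->next - offsetof(type, member)))
```

Because the link lives inside the object, pooling an item is just pointer surgery: no kmalloc on add, no kfree on remove, which is exactly the slab saving the changelog claims.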
[PATCH 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable length array simplifies the code and reduces overhead on the slab.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)

 struct ion_system_heap {
	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };

 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
	struct ion_system_heap *heap;
	int i;

-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
	if (!heap)
		return ERR_PTR(-ENOMEM);
	heap->heap.ops = &system_heap_ops;
	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
	for (i = 0; i < num_orders; i++) {
		struct ion_page_pool *pool;
		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
	while (i--)
		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
	kfree(heap);
	return ERR_PTR(-ENOMEM);
 }
--
1.9.1
[PATCH 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem need to be freed. Usually, the slab shrinker is invoked on the kswapd context, whose gfp_mask is always GFP_KERNEL, so only lowmem pages are released on the kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both the lowmem and highmem zones on the kswapd context.

Reported-by: Wonseo Choi
Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"

 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
	int freed;
	bool high;

-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);

	if (nr_to_scan == 0)
		return ion_page_pool_total(pool, high);
--
1.9.1
[PATCH 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }

 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
	bool cached = ion_buffer_cached(buffer);

	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
	return -ENOMEM;
 }

@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
	ion_heap_buffer_zero(buffer);

	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
	sg_free_table(table);
	kfree(table);
 }
--
1.9.1
[PATCH 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it either counts pages from the highmem zone in or not. This commit simplifies the code for better readability.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)

 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;

-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-		pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }

 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
--
1.9.1
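The rewrite above is purely mechanical: each pooled block holds 1 << order pages, and the ternary expression and the shifted sum compute the same page count. A small standalone check of the equivalence, with the pool fields passed as plain arguments:

```c
#include <assert.h>

/* The original ion_page_pool_total() expression, verbatim in shape. */
static int total_old(int low_count, int high_count, int order, int high)
{
    return high ? (high_count + low_count) * (1 << order)
                : low_count * (1 << order);
}

/* The simplified version from the patch: build the block count first,
 * then convert blocks to pages with a single shift. */
static int total_new(int low_count, int high_count, int order, int high)
{
    int count = low_count;

    if (high)
        count += high_count;

    return count << order;
}
```

Both return pages, not blocks, which is what the shrinker core expects from the count callback.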
[PATCH 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   |  2 +-
 drivers/staging/android/ion/ion_priv.h        |  2 +-
 drivers/staging/android/ion/ion_system_heap.c | 57 ---
 3 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ecb5fc3..ead4054 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
	return page;
 }

-void *ion_page_pool_alloc(struct ion_page_pool *pool)
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
 {
	struct page *page = NULL;

diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h
index 1eba3f2..365c947 100644
--- a/drivers/staging/android/ion/ion_priv.h
+++ b/drivers/staging/android/ion/ion_priv.h
@@ -377,7 +377,7 @@ struct ion_page_pool {

 struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order);
 void ion_page_pool_destroy(struct ion_page_pool *);
-void *ion_page_pool_alloc(struct ion_page_pool *);
+struct page *ion_page_pool_alloc(struct ion_page_pool *);
 void ion_page_pool_free(struct ion_page_pool *, struct page *);

 /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index c923633..b554751 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -41,7 +41,7 @@ static int order_to_index(unsigned int order)
	return -1;
 }

-static unsigned int order_to_size(int order)
+static inline unsigned int order_to_size(int order)
 {
	return PAGE_SIZE << order;
 }
@@ -78,8 +78,6 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
					  DMA_BIDIRECTIONAL);
	}
-	if (!page)
-		return NULL;

	return page;
 }
@@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,

		info->page = page;
		info->order = orders[i];
-		INIT_LIST_HEAD(&info->list);
		return info;
	}
	kfree(info);
@@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
							heap);
	struct sg_table *table;
	struct scatterlist *sg;
-	int ret;
	struct list_head pages;
	struct page_info *info, *tmp_info;
	int i = 0;
@@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		info = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
		if (!info)
-			goto err;
+			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= (1 << info->order) * PAGE_SIZE;
+		size_remaining -= PAGE_SIZE << info->order;
		max_order = info->order;
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
	if (!table)
-		goto err;
+		goto free_pages;

-	ret = sg_alloc_table(table, i, GFP_KERNEL);
-	if (ret)
-		goto err1;
+	if (sg_alloc_table(table, i, GFP_KERNEL))
+		goto free_table;

	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, (1 << info->order) * PAGE_SIZE, 0);
+		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,

	buffer->priv_virt = table;
	return 0;
-err1:
+
+free_table:
	kfree(table);
-err:
+free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		free_buffer_page(sys_heap, buffer, info->page, info->order);
		kfree(info);
	}
@@ -197,14 +193,12 @@ err:

 static void ion_system_heap_free(struct ion_buffer *buffer)
 {
-	struct ion_heap *heap = buffer->heap;
-	struct ion_system_heap *sys_heap = container_of(heap,
+	struct ion_system_heap *sys_heap = container_of(buffer->heap,
							struct ion_system_heap,
[PATCH 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }
 
 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
 	bool cached = ion_buffer_cached(buffer);
 
 	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
 
 	return -ENOMEM;
 }
@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
 		ion_heap_buffer_zero(buffer);
 
 	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
 	sg_free_table(table);
 	kfree(table);
 }
-- 
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts pages from the highmem zone in or not. This commit simplifies the code lines for better readability.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 
 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;
 
-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-			pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }
 
 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
-- 
1.9.1
[PATCH 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem should be freed. Usually, the slab shrinker is invoked in kswapd context, where gfp_mask is always GFP_KERNEL, so only lowmem pages are released in kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both lowmem and highmem zones in kswapd context.

Reported-by: Wonseo Choi wonseo.c...@samsung.com
Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"
 
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
 	int freed;
 	bool high;
 
-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);
 
 	if (nr_to_scan == 0)
 		return ion_page_pool_total(pool, high);
-- 
1.9.1
[PATCH 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies the code and reduces slab overhead.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)
 
 struct ion_system_heap {
 	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };
 
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 	struct ion_system_heap *heap;
 	int i;
 
-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+			sizeof(struct ion_page_pool *) * num_orders,
+			GFP_KERNEL);
 	if (!heap)
 		return ERR_PTR(-ENOMEM);
 	heap->heap.ops = &system_heap_ops;
 	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
 	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
 	for (i = 0; i < num_orders; i++) {
 		struct ion_page_pool *pool;
 		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
 	while (i--)
 		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
 	kfree(heap);
 	return ERR_PTR(-ENOMEM);
 }
-- 
1.9.1
[PATCH 3/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..0141b3b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -112,6 +112,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
 	int ret;
 
+	BUG_ON(pool->order != compound_order(page));
+
 	ret = ion_page_pool_add(pool, page);
 	if (ret)
 		ion_page_pool_free_pages(pool, page);
@@ -167,7 +169,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
 	pool->low_count = 0;
 	INIT_LIST_HEAD(&pool->low_items);
 	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
 	pool->order = order;
 	mutex_init(&pool->mutex);
 	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		if (order > 4)
 			gfp_flags = high_order_gfp_flags;
 
-		page = alloc_pages(gfp_flags, order);
+		page = alloc_pages(gfp_flags | __GFP_COMP, order);
 		if (!page)
 			return NULL;
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
-- 
1.9.1
[PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Now that the order information is held on struct page itself, we do not need to use an extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 0141b3b..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"
 
-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
 	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
 	mutex_lock(&pool->mutex);
-	item->page = page;
 	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
 		pool->high_count++;
 	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
 		pool->low_count++;
 	}
 	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
 	struct page *page;
 
 	if (high) {
 		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
 		pool->high_count--;
 	} else {
 		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
 		pool->low_count--;
 	}
 
-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
 	return page;
 }
-- 
1.9.1
[PATCH 6/9] staging: ion: remove struct page_info
ION system heap uses a temporary list holding metadata of the allocated pages to build the scatter/gather table. Now that the pages are compound pages, we do not need to introduce a new data type redundantly.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +++++++++----------------
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
 	struct ion_page_pool **pools;
 };
 
-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 				      struct ion_buffer *buffer,
 				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }
 
-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
 	struct page *page;
-	struct page_info *info;
 	int i;
 
-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
 	for (i = 0; i < num_orders; i++) {
 		if (size < order_to_size(orders[i]))
 			continue;
@@ -119,10 +109,8 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 		if (!page)
 			continue;
 
-		info->page = page;
-		return info;
+		return page;
 	}
-	kfree(info);
 
 	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	struct sg_table *table;
 	struct scatterlist *sg;
 	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
 	int i = 0;
 	unsigned long size_remaining = PAGE_ALIGN(size);
 	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	INIT_LIST_HEAD(&pages);
 	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
 						max_order);
-		if (!info)
+		if (!page)
 			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		goto free_table;
 
 	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
 		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
 	}
 
 	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
 	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
 
 	return -ENOMEM;
 }
-- 
1.9.1
[PATCH 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 
 struct page_info {
 	struct page *page;
-	unsigned int order;
 	struct list_head list;
 };
 
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 			continue;
 
 		info->page = page;
-		info->order = orders[i];
 		return info;
 	}
 	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		if (!info)
 			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,7 +183,8 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
 		kfree(info);
 	}
 	return -ENOMEM;
-- 
1.9.1
Re: [PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Hello,

On 05/26/2014 07:04 PM, Heesub Shin wrote:
> Now that the order information is held on struct page itself, we do not
> need to use extra data structure. This commit reduces unnecessary slab
> usage for allocating small objects.

Oops. I need to amend the changelog above and resend this patchset.

regards,
Heesub
Re: [PATCH 1/9] staging: ion: tidy up a bit
Hello Carpenter,

On 05/26/2014 07:36 PM, Dan Carpenter wrote:
> On Mon, May 26, 2014 at 07:04:53PM +0900, Heesub Shin wrote:
>> @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
>>
>>  		info->page = page;
>>  		info->order = orders[i];
>> -		INIT_LIST_HEAD(&info->list);
>>  		return info;
>>  	}
>>  	kfree(info);
>
> Wait. How does this code work without that INIT_LIST_HEAD()? What am I
> missing here...

No problem. As the object info is just a node, not a head, it is completely useless to initialize it as a list head.

regards,
Heesub

> regards,
> dan carpenter
Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
Hello,

On 05/15/2014 10:53 AM, Joonsoo Kim wrote:
> On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
>> Hey Joonsoo,
>>
>> On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
>>> CMA is introduced to provide physically contiguous pages at runtime.
>>> For this purpose, it reserves memory at boot time. Although it
>>> reserves memory, this reserved memory can be used for movable memory
>>> allocation requests. This usecase is beneficial to systems that need
>>> this CMA reserved memory infrequently, and it is one of the main
>>> purposes of introducing CMA.
>>>
>>> But, there is a problem in the current implementation. The problem is
>>> that it works like a just-reserved-memory approach. The pages on cma
>>> reserved memory are hardly used for movable memory allocation. This
>>> is caused by a combination of allocation and reclaim policy.
>>>
>>> The pages on cma reserved memory are allocated if there is no movable
>>> memory, that is, as fallback allocation. So the time this fallback
>>> allocation is started is under heavy memory pressure. Although it is
>>> under memory pressure, movable allocation easily succeeds, since
>>> there would be many pages on cma reserved memory. But this is not the
>>> case for unmovable and reclaimable allocation, because they can't use
>>> the pages on cma reserved memory. These allocations regard the
>>> system's free memory as (free pages - free cma pages) on watermark
>>> checking, that is, free unmovable pages + free reclaimable pages +
>>> free movable pages. Because we already exhausted movable pages, the
>>> only free pages we have are unmovable and reclaimable types, and this
>>> would be a really small amount. So watermark checking would fail. It
>>> will wake up kswapd to make enough free memory for unmovable and
>>> reclaimable allocation, and kswapd will do so. So before we fully
>>> utilize pages on cma reserved memory, kswapd starts to reclaim memory
>>> and tries to make free memory over the high watermark. This watermark
>>> checking by kswapd doesn't take care of free cma pages, so many
>>> movable pages would be reclaimed. After that, we have a lot of
>>> movable pages again, so fallback allocation doesn't happen again. To
>>> conclude, the amount of free memory in meminfo, which includes free
>>> CMA pages, is moving around 512 MB if I reserve 512 MB memory for
>>> CMA.
>>>
>>> I found this problem in the following experiment.
>>>
>>> 4 CPUs, 1024 MB, VIRTUAL MACHINE
>>> make -j24
>>>
>>> CMA reserve:        0 MB       512 MB
>>> Elapsed-time:       234.8      361.8
>>> Average-MemFree:    283880 KB  530851 KB
>>>
>>> To solve this problem, I can think of the following 2 possible
>>> solutions.
>>> 1. allocate the pages on cma reserved memory first, and if they are
>>> exhausted, allocate movable pages.
>>> 2. interleaved allocation: try to allocate specific amounts of memory
>>> from cma reserved memory and then allocate from free movable memory.
>>
>> I love this idea but when I see the code, I don't like that. In the
>> allocation path, just try to allocate pages by round-robin so it's the
>> role of the allocator. If one of the migratetypes is full, just pass
>> the mission to the reclaimer with a hint (ie, "Hey reclaimer, it's a
>> non-movable allocation fail so it is pointless if you reclaim
>> MIGRATE_CMA pages") so that the reclaimer can filter it out during
>> page scanning. We already have a tool to achieve it (ie,
>> isolate_mode_t).
>
> Hello,
>
> I agree with leaving the fast allocation path as simple as possible. I
> will remove the runtime computation for determining the ratio in
> __rmqueue_cma() and, instead, will use a pre-computed value calculated
> on the other path.
>
> I am not sure whether your second suggestion ("Hey reclaimer" part) is
> good or not. In my quick thought, that could be helpful in the
> situation that many free cma pages remain. But, it would not be
> helpful when there are neither free movable nor cma pages. In general,
> most workloads mainly use movable pages for page cache or anonymous
> mapping. Although reclaim is triggered by non-movable allocation
> failure, reclaimed pages are used mostly by movable allocation. We can
> handle these allocation requests even if we reclaim the pages just in
> lru order. If we rotate the lru list for finding movable pages, it
> could cause more useful pages to be evicted.
>
> This is just my quick thought, so please let me correct if I am wrong.

We have an out-of-tree implementation that is completely the same with the approach Minchan said and it works, but it has definitely some side-effects as you pointed, distorting the LRU and evicting hot pages. I do not attach code fragments in this thread for some reasons, but it must be easy for yourself. I am wondering if it could help also in your case.

Thanks,
Heesub

>> And we couldn't do it in zone_watermark_ok with set/reset ALLOC_CMA?
>> If possible, it would be better because it's a generic function to
>> check free pages and trigger the reclaim/compaction logic.
>
> I guess, your *it* means ratio computation. Right? I don't like
> putting it on zone_watermark_ok(). Although it needs to refer to the
> free cma pages value which is also referred to in zone_watermark_ok(),
> this computation is for determining ratio,
Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
Hello, On 05/15/2014 10:53 AM, Joonsoo Kim wrote: On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote: Hey Joonsoo, On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote: CMA is introduced to provide physically contiguous pages at runtime. For this purpose, it reserves memory at boot time. Although it reserve memory, this reserved memory can be used for movable memory allocation request. This usecase is beneficial to the system that needs this CMA reserved memory infrequently and it is one of main purpose of introducing CMA. But, there is a problem in current implementation. The problem is that it works like as just reserved memory approach. The pages on cma reserved memory are hardly used for movable memory allocation. This is caused by combination of allocation and reclaim policy. The pages on cma reserved memory are allocated if there is no movable memory, that is, as fallback allocation. So the time this fallback allocation is started is under heavy memory pressure. Although it is under memory pressure, movable allocation easily succeed, since there would be many pages on cma reserved memory. But this is not the case for unmovable and reclaimable allocation, because they can't use the pages on cma reserved memory. These allocations regard system's free memory as (free pages - free cma pages) on watermark checking, that is, free unmovable pages + free reclaimable pages + free movable pages. Because we already exhausted movable pages, only free pages we have are unmovable and reclaimable types and this would be really small amount. So watermark checking would be failed. It will wake up kswapd to make enough free memory for unmovable and reclaimable allocation and kswapd will do. So before we fully utilize pages on cma reserved memory, kswapd start to reclaim memory and try to make free memory over the high watermark. This watermark checking by kswapd doesn't take care free cma pages so many movable pages would be reclaimed. 
After that, we have a lot of movable pages again, so the fallback allocation does not happen again. In conclusion, the amount of free memory reported in meminfo, which includes free CMA pages, hovers around 512 MB if I reserve 512 MB for CMA.

I found this problem with the following experiment.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24

CMA reserve:         0 MB        512 MB
Elapsed-time:        234.8       361.8
Average-MemFree:     283880 KB   530851 KB

To solve this problem, I can think of the following two possible solutions:
1. allocate pages from the CMA reserved memory first, and when they are exhausted, allocate movable pages.
2. interleaved allocation: try to allocate specific amounts of memory from the CMA reserved memory, then allocate from free movable memory.

I love this idea, but when I see the code, I don't like that. In the allocation path, just try to allocate pages round-robin; that is the allocator's role. If one migratetype is full, just pass the mission to the reclaimer with a hint (i.e., "Hey reclaimer, it's a non-movable allocation failure, so it is pointless to reclaim MIGRATE_CMA pages") so that the reclaimer can filter them out during page scanning. We already have a tool to achieve that (i.e., isolate_mode_t).

Hello,

I agree with leaving the fast allocation path as simple as possible. I will remove the runtime computation for determining the ratio in __rmqueue_cma() and, instead, will use a pre-computed value calculated on another path.

I am not sure whether your second suggestion (the "Hey reclaimer" part) is good or not. On quick thought, it could help in the situation where many free CMA pages remain, but it would not help when there are neither free movable nor free CMA pages. In general, most workloads mainly use movable pages, for the page cache or anonymous mappings. Although reclaim is triggered by a non-movable allocation failure, the reclaimed pages are mostly used by movable allocations afterwards. We can handle those allocation requests even if we reclaim pages simply in LRU order.
If we rotate the LRU list looking for movable pages, it could cause more useful pages to be evicted. This is just my quick thought, so please correct me if I am wrong.

We have an out-of-tree implementation that takes exactly the approach Minchan described, and it works, but it definitely has some side effects as you pointed out: distorting the LRU and evicting hot pages. I am not attaching code fragments in this thread for various reasons, but it must be easy to implement yourself. I am wondering if it could help in your case as well.

Thanks,
Heesub

And couldn't we do it in zone_watermark_ok() by setting/resetting ALLOC_CMA? If possible, that would be better, because it is the generic function that checks free pages and triggers the reclaim/compaction logic.

I guess your *it* means the ratio computation, right? I don't like putting it in zone_watermark_ok(). Although it needs to refer to the free CMA page count, which zone_watermark_ok() also refers to, this computation is for determining the ratio,
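The watermark accounting being debated above can be sketched with a minimal userspace model. This is not the real zone_watermark_ok() signature; the flag name and numbers are illustrative. It only shows the mechanism described in the thread: allocations that cannot use MIGRATE_CMA pageblocks subtract the free CMA pages before comparing against the watermark.

```c
#include <stdbool.h>

#define ALLOC_CMA 0x1 /* illustrative flag; the kernel's alloc_flags bit differs */

/* Minimal model of the check described above: callers that cannot use
 * MIGRATE_CMA pageblocks (unmovable/reclaimable allocations) subtract the
 * free CMA pages before comparing against the watermark. With a large CMA
 * reserve, this makes the check fail early and wakes kswapd even though
 * plenty of (CMA) memory is still free. */
static bool watermark_ok(long free_pages, long free_cma_pages,
                         long watermark, int alloc_flags)
{
    if (!(alloc_flags & ALLOC_CMA))
        free_pages -= free_cma_pages;
    return free_pages > watermark;
}
```

With 1000 free pages, of which 600 are CMA, and a watermark of 500, a movable (ALLOC_CMA) request passes while an unmovable one fails; that imbalance is exactly what wakes kswapd in the scenario above.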
[PATCH 1/2] mm/compaction: clean up unused code lines
This commit removes lines of code that are unused or never called.

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
Cc: Sunghwan Yun
---
 mm/compaction.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9635083..1ef9144 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -208,12 +208,6 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
 	return true;
 }

-static inline bool compact_trylock_irqsave(spinlock_t *lock,
-			unsigned long *flags, struct compact_control *cc)
-{
-	return compact_checklock_irqsave(lock, flags, false, cc);
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
@@ -728,7 +722,6 @@ static void isolate_freepages(struct zone *zone,
 			continue;

 		/* Found a block suitable for isolating free pages from */
-		isolated = 0;

 		/*
 		 * As pfn may not start aligned, pfn+pageblock_nr_page
@@ -1160,9 +1153,6 @@ static void __compact_pgdat(pg_data_t *pgdat, struct compact_control *cc)
 		if (zone_watermark_ok(zone, cc->order,
 				low_wmark_pages(zone), 0, 0))
 			compaction_defer_reset(zone, cc->order, false);
-
-		/* Currently async compaction is never deferred. */
-		else if (cc->sync)
-			defer_compaction(zone, cc->order);
 	}

 	VM_BUG_ON(!list_empty(&cc->freepages));
--
1.8.3.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] mm/compaction: fix to initialize free scanner properly
The free scanner does not work well on systems with zones that do not end at a pageblock-aligned boundary. zone->compact_cached_free_pfn is reset when the migration and free scanners cross or when compaction restarts. After the reset, if end_pfn of the zone is not aligned to pageblock_nr_pages, the free scanner tries to isolate free pages from the middle of a pageblock to its end, which can be a very small range.

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
Cc: Sunghwan Yun
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 1ef9144..fefe1da 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -983,7 +983,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 	 */
 	cc->migrate_pfn = zone->compact_cached_migrate_pfn;
 	cc->free_pfn = zone->compact_cached_free_pfn;
-	if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
+	if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
 		cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
 		zone->compact_cached_free_pfn = cc->free_pfn;
 	}
--
1.8.3.2
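The off-by-one fixed here can be seen with a small userspace model of the reset logic (the helper name is hypothetical and pageblock_nr_pages is fixed at an illustrative value): with the old `>` comparison, a cached free_pfn equal to end_pfn survived validation, so on a zone whose end is not pageblock-aligned the scanner started inside the last, partial pageblock.

```c
#define PAGEBLOCK_NR_PAGES 512UL /* illustrative; the real value is config-dependent */

/* Model of the fixed check in compact_zone(): a cached free_pfn equal to
 * end_pfn is treated as stale and reset to the pageblock-aligned start of
 * the zone's last pageblock. Under the old '>' comparison it was accepted,
 * leaving only the tiny range [free_pfn, end_pfn) to scan. */
static unsigned long init_free_pfn(unsigned long cached,
                                   unsigned long start_pfn,
                                   unsigned long end_pfn)
{
    if (cached < start_pfn || cached >= end_pfn)
        return end_pfn & ~(PAGEBLOCK_NR_PAGES - 1);
    return cached;
}
```

For a zone ending at pfn 4000 (not a multiple of 512), a cached value of 4000 is now reset to 3584, the aligned start of the last pageblock, instead of being used as-is.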
[PATCH] fs/buffer.c: use lowmem_page_address instead of page_address
Here, the page has already been identified as lowmem, so calling lowmem_page_address() directly is a little cheaper than page_address().

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 695eb14..ccc2c7b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1464,7 +1464,7 @@ void set_bh_page(struct buffer_head *bh,
 		 */
 		bh->b_data = (char *)(0 + offset);
 	else
-		bh->b_data = page_address(page) + offset;
+		bh->b_data = lowmem_page_address(page) + offset;
 }
 EXPORT_SYMBOL(set_bh_page);
--
1.8.3.2
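The saving can be illustrated with a toy userspace model of the two helpers. This is not the real include/linux/mm.h code; PAGE_OFFSET, the struct layout, and the highmem fallback are stand-ins. The point is that lowmem_page_address() is pure arithmetic on the pfn, while page_address() pays for an extra highmem test (and, on real highmem configurations, a hashed lookup of kmapped pages).

```c
#define PAGE_SHIFT  12
#define PAGE_OFFSET 0xC0000000UL /* illustrative 32-bit direct-map base */

struct page {
    unsigned long pfn;
    int           is_highmem;
};

/* Lowmem pages live in the direct mapping: the address is arithmetic. */
static void *lowmem_page_address(const struct page *pg)
{
    return (void *)(PAGE_OFFSET + (pg->pfn << PAGE_SHIFT));
}

/* page_address() must branch on highmem first; the highmem path (a hashed
 * table walk in the real kernel) is modelled here as returning NULL. */
static void *page_address(const struct page *pg)
{
    if (!pg->is_highmem)
        return lowmem_page_address(pg);
    return (void *)0;
}

/* Small wrapper so the lowmem mapping is easy to check for a given pfn. */
static unsigned long lowmem_addr_of_pfn(unsigned long pfn)
{
    struct page pg = { pfn, 0 };
    return (unsigned long)lowmem_page_address(&pg);
}
```

When the caller already knows the page is lowmem, as set_bh_page() does in its else-branch, the branch in page_address() is pure overhead, which is the whole rationale of the patch.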