[PATCH] zsmalloc: fix fatal corruption due to wrong size class selection
There is no point in overriding the size class below. It causes fatal
corruption of the next chunk in the 3264-byte size class, which is the
last size class that is not huge. For example, if the requested size is
exactly 3264 bytes, current zsmalloc allocates and returns a chunk from
the 3264-byte size class, not the 4096-byte one. A user access to this
chunk may then overwrite the head of the next adjacent chunk.

Here is the panic log captured when the freelist was corrupted due to
this:

Kernel BUG at ffc00030659c [verbose debug info unavailable]
Internal error: Oops - BUG: 9606 [#1] PREEMPT SMP
Modules linked in:
exynos-snapshot: core register saved(CPU:5)
CPUMERRSR: , L2MERRSR:
exynos-snapshot: context saved(CPU:5)
exynos-snapshot: item - log_kevents is disabled
CPU: 5 PID: 898 Comm: kswapd0 Not tainted 3.10.61-4497415-eng #1
task: ffc0b8783d80 ti: ffc0b71e8000 task.ti: ffc0b71e8000
PC is at obj_idx_to_offset+0x0/0x1c
LR is at obj_malloc+0x44/0xe8
pc : [<ffc00030659c>] lr : [<ffc000306604>] pstate: a045
sp : ffc0b71eb790
x29: ffc0b71eb790 x28: ffc00204c000
x27: 0001d96f x26:
x25: ffc098cc3500 x24: ffc0a13f2810
x23: ffc098cc3501 x22: ffc0a13f2800
x21: 11e1a02006e3 x20: ffc0a13f2800
x19: ffbc02a7e000 x18:
x17: x16: 0feb
x15: x14: a01003e3
x13: 0020 x12: fff0
x11: ffc08b264000 x10: e3a01004
x9 : ffc08b263fea x8 : ffc0b1e611c0
x7 : ffc000307d24 x6 :
x5 : 0038 x4 : 011e
x3 : ffbc3e90 x2 : 0cc0
x1 : d0100371 x0 : ffbc3e90

Reported-by: Sooyong Suk
Signed-off-by: Heesub Shin
Tested-by: Sooyong Suk
Cc: Minchan Kim
---
 mm/zsmalloc.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 6c39ae9..a2da64b 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -1398,11 +1398,6 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size)

 	/* extra space in chunk to keep the handle */
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];
-	/* In huge class size, we store the handle into first_page->private */
-	if (class->huge) {
-		size -= ZS_HANDLE_SIZE;
-		class = pool->size_class[get_size_class_index(size)];
-	}

 	spin_lock(&class->lock);
 	first_page = find_get_zspage(class);
--
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] zsmalloc: do not remap dst page while prepare next src page
Hello,

On 03/25/2015 12:24 AM, Sergey Senozhatsky wrote:
> object may belong to different pages. zs_object_copy() handles
> this case and maps a new source page (get_next_page() and
> kmap_atomic()) when object crosses boundaries of the current
> source page. But it also performs unnecessary kunmap/kmap_atomic
> of the destination page (it remains unchanged), which can be
> avoided.

No, it's not unnecessary. We should do kunmap_atomic() in the reverse
order of kmap_atomic(), so unfortunately it is inevitable to
kunmap_atomic() both d_addr and s_addr.

> Signed-off-by: Sergey Senozhatsky
> ---
>  mm/zsmalloc.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index d920e8b..7af4456 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1536,12 +1536,10 @@ static void zs_object_copy(unsigned long src, unsigned long dst,
>  			break;
>
>  		if (s_off + size >= PAGE_SIZE) {
> -			kunmap_atomic(d_addr);
>  			kunmap_atomic(s_addr);

Removing kunmap_atomic(d_addr) here may cause BUG_ON() at
__kunmap_atomic(). I tried your patch and it really happens:

> kernel BUG at arch/arm/mm/highmem.c:113!
> Internal error: Oops - BUG: 0 [#1] SMP ARM
> Modules linked in:
> CPU: 2 PID: 1774 Comm: bash Not tainted 4.0.0-rc2-mm1+ #105
> Hardware name: ARM-Versatile Express
> task: ee971300 ti: e8a26000 task.ti: e8a26000
> PC is at __kunmap_atomic+0x144/0x14c
> LR is at zs_object_copy+0x19c/0x2dc

regards
heesub
Re: [PATCH v2 3/7] zsmalloc: support compaction
Hello Minchan,

Nice work!

On 03/04/2015 02:01 PM, Minchan Kim wrote:
> +static void putback_zspage(struct zs_pool *pool, struct size_class *class,
> +				struct page *first_page)
> +{
> +	int class_idx;
> +	enum fullness_group fullness;
> +
> +	BUG_ON(!is_first_page(first_page));
> +
> +	get_zspage_mapping(first_page, &class_idx, &fullness);
> +	insert_zspage(first_page, class, fullness);
> +	fullness = fix_fullness_group(class, first_page);

Removal and re-insertion of the zspage above can be eliminated, like
this:

	fullness = get_fullness_group(first_page);
	insert_zspage(first_page, class, fullness);
	set_zspage_mapping(first_page, class->index, fullness);

regards,
heesub

> 	if (fullness == ZS_EMPTY) {
> +		zs_stat_dec(class, OBJ_ALLOCATED, get_maxobj_per_zspage(
> +			class->size, class->pages_per_zspage));
> 		atomic_long_sub(class->pages_per_zspage,
> 				&pool->pages_allocated);
> +
> 		free_zspage(first_page);
> 	}
> }
> -EXPORT_SYMBOL_GPL(zs_free);
> +
Re: [RFC PATCH 0/9] mm/zbud: support highmem pages
On 01/28/2015 05:24 AM, Seth Jennings wrote:
> On Tue, Nov 04, 2014 at 10:33:43AM -0600, Seth Jennings wrote:
>> On Tue, Oct 14, 2014 at 08:59:19PM +0900, Heesub Shin wrote:
>>> zbud is a memory allocator for storing compressed data pages. It
>>> keeps two data objects of arbitrary size on a single page. This
>>> simple design provides very deterministic behavior on reclamation,
>>> which is one of the reasons why zswap selected zbud as its default
>>> allocator over zsmalloc.
>>>
>>> Unlike zsmalloc, however, zbud does not support highmem. This is
>>> problematic especially on 32-bit machines having relatively small
>>> lowmem. Compressing anonymous pages from highmem and storing them
>>> into lowmem could eat up lowmem space.
>>>
>>> This limitation is due to the fact that zbud manages its internal
>>> data structures in zbud_header, which is kept at the head of a
>>> zbud_page. For example, zbud_pages are tracked by several lists and
>>> carry some status information, which may be referenced at any time
>>> by the kernel. Thus, zbud_pages must be allocated in a directly
>>> mapped memory region, lowmem.
>>>
>>> After some digging, I found that the internal data structures of
>>> zbud can be kept in the struct page, the same way zsmalloc does.
>>> So, this series moves all fields in zbud_header out to struct page.
>>> Though it alters quite a lot, it does not add any functional
>>> differences except highmem support. I am not sure whether this kind
>>> of modification, abusing several fields in struct page, would be
>>> ok.
>>
>> Hi Heesub,
>>
>> Sorry for the very late reply. The end of October was very busy for
>> me.
>>
>> A little history on zbud. I didn't put the metadata in the struct
>> page, even though I knew that was an option since we had done it
>> with zsmalloc. At the time, Andrew Morton had concerns about memmap
>> walkers getting messed up with unexpected values in the struct page
>> fields. In order to smooth zbud's acceptance, I decided to store the
>> metadata inline in the page itself. Later, zsmalloc eventually got
>> accepted, which basically gave the impression that putting the
>> metadata in the struct page was acceptable.
>>
>> I have recently been looking at implementing compaction for
>> zsmalloc, but having the metadata in the struct page and having the
>> handle directly encode the PFN and offset of the data block prevents
>> transparent relocation of the data. zbud has a similar issue as it
>> currently encodes the page address in the handle returned to the
>> user (also the limitation that is preventing use of highmem pages).
>> I would like to implement compaction for zbud too, and moving the
>> metadata into the struct page is going to work against that.
>>
>> In fact, I'm looking at the option of converting the current
>> zbud_header into a per-allocation metadata structure, which would
>> provide a layer of indirection between zbud and the user, allowing
>> for transparent relocation and compaction.
>
> I had some downtime and started thinking about this again today
> (after 3 months). Upon further reflection, I really like this and
> don't think that it inhibits introducing compaction later.
>
> There are just a few places that look messy or problematic to me:
>
> 1. the use of page->private and masking the number of chunks for
>    both buddies into it (see suggestion for overlay struct below)
> 2. the use of the second double word, page->index, to store a
>    list_head
>
> #2 might be problematic because, IIRC, memmap walkers will check
> _count (or _mapcount). I think we ran into this in zsmalloc.
> Initially, when working on zsmalloc, I just created a structure that
> overlaid the struct page in the memmap, reserving the flags and
> _count areas, so that I wouldn't have to be bound by the field
> names/boundaries in the struct page. IIRC, Andrew was initially
> against that, but he was also against the whole idea of using the
> struct page fields for random stuff... but that ended up being
> accepted.
>
> This code looks really good! I think with a little cleanup and
> finding a way to steer clear of using the _count part of the
> structure, this will be great.

Thanks for your comments! I will try to address the problems you
pointed out and post a new patchset hopefully soon.

regards,
heesub

> Sorry for dismissing it earlier. Didn't give it enough credit.
>
> Thanks,
> Seth
>
>> However, I do like the part about letting zbud use highmem pages. I
>> have something in mind that would allow highmem pages _and_ move
>> toward something that would support compaction. I'll see if I can
>> put it into code today.
>>
>> Thanks,
>> Seth
>>
>>> Heesub Shin (9):
>>>   mm/zbud: tidy up a bit
>>>   mm/zbud: remove buddied list from zbud_pool
>>>   mm/zbud: remove lru from zbud_header
>>>   mm/zbud: remove first|last_chunks from zbud_header
>>>   mm/zbud: encode zbud handle using struct page
>>>   mm/zbud: remove list_head for buddied list from zbud_header
>>>   mm/zbud: drop zbud_header
>>>   mm/zbud: allow clients to use highmem pages
>>>   mm/zswap: use highmem pages for compressed pool
>>>
>>>  mm/zbud.c  | 244 ++---
>>>  mm/zswap.c |   4 +-
>>>  2 files changed, 121 insertions(+), 127 deletions(-)
>>>
>>> --
>>> 1.9.1
Re: [PATCH v5 2/4] mm/page_alloc: add freepage on isolate pageblock to correct buddy list
Hello,

On 10/31/2014 04:25 PM, Joonsoo Kim wrote:
> In free_pcppages_bulk(), we use the cached migratetype of a freepage
> to determine the type of the buddy list where the freepage will be
> added. This information is stored when the freepage is added to the
> pcp list, so if isolation of this freepage's pageblock begins after
> storing, this cached information could be stale. In other words, it
> has the original migratetype rather than MIGRATE_ISOLATE.
>
> There are two problems caused by this stale information. One is that
> we can't keep these freepages from being allocated. Although this
> pageblock is isolated, the freepage will be added to a normal buddy
> list so that it could be allocated without any restriction. The
> other problem is incorrect freepage accounting. Freepages on an
> isolate pageblock should not be counted in the number of freepages.
>
> Following is the code snippet in free_pcppages_bulk().
>
> 	/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
> 	__free_one_page(page, page_to_pfn(page), zone, 0, mt);
> 	trace_mm_page_pcpu_drain(page, 0, mt);
> 	if (likely(!is_migrate_isolate_page(page))) {
> 		__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> 		if (is_migrate_cma(mt))
> 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> 	}
>
> As you can see in the above snippet, current code already handles
> the second problem, incorrect freepage accounting, by re-fetching
> the pageblock migratetype through is_migrate_isolate_page(page).
> But, because this re-fetched information isn't used for
> __free_one_page(), the first problem is not solved. This patch tries
> to solve this situation by re-fetching the pageblock migratetype
> before __free_one_page() and using it for __free_one_page().
>
> In addition to moving this re-fetch up, this patch uses an
> optimization: re-fetching the migratetype only if there is an
> isolate pageblock. Pageblock isolation is a rare event, so we can
> avoid re-fetching in the common case with this optimization. This
> patch also corrects the migratetype in the tracepoint output.
>
> Cc: sta...@vger.kernel.org
> Acked-by: Minchan Kim
> Acked-by: Michal Nazarewicz
> Acked-by: Vlastimil Babka
> Signed-off-by: Joonsoo Kim
> ---
>  mm/page_alloc.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f7a867e..6df23fe 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -725,14 +725,17 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  			/* must delete as __free_one_page list manipulates */
>  			list_del(&page->lru);
>  			mt = get_freepage_migratetype(page);
> +			if (unlikely(has_isolate_pageblock(zone))) {

How about adding an additional check for 'mt == MIGRATE_MOVABLE' here?
Then, most of the get_pageblock_migratetype() calls could be avoided
while the isolation is in progress. I am not sure this is the case on
memory offlining. What do you think?

> +				mt = get_pageblock_migratetype(page);
> +				if (is_migrate_isolate(mt))
> +					goto skip_counting;
> +			}
> +			__mod_zone_freepage_state(zone, 1, mt);
> +
> +skip_counting:
>  			/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
>  			__free_one_page(page, page_to_pfn(page), zone, 0, mt);
>  			trace_mm_page_pcpu_drain(page, 0, mt);
> -			if (likely(!is_migrate_isolate_page(page))) {
> -				__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
> -				if (is_migrate_cma(mt))
> -					__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
> -			}
>  		} while (--to_free && --batch_free && !list_empty(list));
>  	}
>  	spin_unlock(&zone->lock);
Re: [PATCH] mm/zbud: init user ops only when it is needed
Hello,

On 10/16/2014 05:17 AM, Andrew Morton wrote:
> On Wed, 15 Oct 2014 19:00:43 +0900 Heesub Shin wrote:
>
>> When zbud is initialized through the zpool wrapper, pool->ops,
>> which points to user-defined operations, is always set regardless
>> of whether it is specified from the upper layer. This causes
>> zbud_reclaim_page() to iterate its loop for evicting pool pages
>> without any gain.
>>
>> This patch sets the user-defined ops only when it is needed, so
>> that zbud_reclaim_page() can bail out of the reclamation loop
>> earlier if there are no user-defined operations specified.
>
> Which callsite is calling zbud_zpool_create(..., NULL)?

Currently nowhere. zswap is the only user of zbud and always passes a
pointer to user-defined operations on pool creation. In addition,
there is little chance that pool shrinking is requested by users who
did not provide the user-defined ops. So, we may not need to worry
much about what I wrote in the changelog.

However, it is definitely weird to pass an argument, zpool_ops, which
will not even be referenced by zbud_zpool_create(). Above all, it
would be more useful to avoid the possibility in the future rather
than just ignoring it.

regards,
heesub

> ...
>
>> --- a/mm/zbud.c
>> +++ b/mm/zbud.c
>> @@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = {
>>
>>  static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
>>  {
>> -	return zbud_create_pool(gfp, &zbud_zpool_ops);
>> +	return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
>>  }
>>
>>  static void zbud_zpool_destroy(void *pool)
[PATCH] mm/zbud: init user ops only when it is needed
When zbud is initialized through the zpool wrapper, pool->ops, which
points to user-defined operations, is always set regardless of whether
it is specified from the upper layer. This causes zbud_reclaim_page()
to iterate its loop for evicting pool pages without any gain.

This patch sets the user-defined ops only when it is needed, so that
zbud_reclaim_page() can bail out of the reclamation loop earlier if
there are no user-defined operations specified.

Signed-off-by: Heesub Shin
---
 mm/zbud.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index ecf1dbe..db8de74 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = {

 static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops)
 {
-	return zbud_create_pool(gfp, &zbud_zpool_ops);
+	return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL);
 }

 static void zbud_zpool_destroy(void *pool)
--
1.9.1
Re: [PATCH] mm/zbud: init user ops only when it is needed
Hello, On 10/16/2014 05:17 AM, Andrew Morton wrote: On Wed, 15 Oct 2014 19:00:43 +0900 Heesub Shin heesub.s...@samsung.com wrote: When zbud is initialized through the zpool wrapper, pool->ops, which points to user-defined operations, is always set regardless of whether it is specified by the upper layer. This causes zbud_reclaim_page() to iterate its eviction loop without any gain. This patch sets the user-defined ops only when needed, so that zbud_reclaim_page() can bail out of the reclamation loop early when no user-defined operations are specified. Which callsite is calling zbud_zpool_create(..., NULL)? Currently nowhere. zswap is the only user of zbud and always passes a pointer to user-defined operations on pool creation. In addition, there is little chance that pool shrinking would be requested by users who did not provide the user-defined ops. So, we may not need to worry much about what I wrote in the changelog. However, it is definitely weird to pass an argument, zpool_ops, which will not even be referenced by zbud_zpool_create(). Above all, it would be more useful to avoid the possibility in the future rather than just ignoring it. regards, heesub ... --- a/mm/zbud.c +++ b/mm/zbud.c @@ -132,7 +132,7 @@ static struct zbud_ops zbud_zpool_ops = { static void *zbud_zpool_create(gfp_t gfp, struct zpool_ops *zpool_ops) { - return zbud_create_pool(gfp, &zbud_zpool_ops); + return zbud_create_pool(gfp, zpool_ops ? &zbud_zpool_ops : NULL); } static void zbud_zpool_destroy(void *pool)
[RFC PATCH 1/9] mm/zbud: tidy up a bit
For aesthetics, add a blank line between functions, remove useless initialization statements, and simplify the code a bit. No functional differences are introduced. Signed-off-by: Heesub Shin --- mm/zbud.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index ecf1dbe..6f36394 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -145,6 +145,7 @@ static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp, { return zbud_alloc(pool, size, gfp, handle); } + static void zbud_zpool_free(void *pool, unsigned long handle) { zbud_free(pool, handle); @@ -174,6 +175,7 @@ static void *zbud_zpool_map(void *pool, unsigned long handle, { return zbud_map(pool, handle); } + static void zbud_zpool_unmap(void *pool, unsigned long handle) { zbud_unmap(pool, handle); @@ -350,16 +352,11 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, spin_lock(&pool->lock); /* First, try to find an unbuddied zbud page. */ - zhdr = NULL; for_each_unbuddied_list(i, chunks) { if (!list_empty(&pool->unbuddied[i])) { zhdr = list_first_entry(&pool->unbuddied[i], struct zbud_header, buddy); list_del(&zhdr->buddy); - if (zhdr->first_chunks == 0) - bud = FIRST; - else - bud = LAST; goto found; } } @@ -372,13 +369,15 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, spin_lock(&pool->lock); pool->pages_nr++; zhdr = init_zbud_page(page); - bud = FIRST; found: - if (bud == FIRST) + if (zhdr->first_chunks == 0) { zhdr->first_chunks = chunks; - else + bud = FIRST; + } else { zhdr->last_chunks = chunks; + bud = LAST; + } if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) { /* Add to unbuddied list */ @@ -433,7 +432,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* Remove from existing buddy list */ list_del(&zhdr->buddy); - if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) { + if (num_free_chunks(zhdr) == NCHUNKS) { /* zbud page is empty, free */ list_del(&zhdr->lru); free_zbud_page(zhdr); @@ -489,7 +488,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) { int i, ret, freechunks; struct zbud_header *zhdr; - unsigned long first_handle = 0, last_handle = 0; + unsigned long first_handle, last_handle; spin_lock(&pool->lock); if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) || @@ -529,7 +528,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) next: spin_lock(&pool->lock); zhdr->under_reclaim = false; - if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) { + if (num_free_chunks(zhdr) == NCHUNKS) { /* * Both buddies are now free, free the zbud page and * return success. -- 1.9.1
[RFC PATCH 7/9] mm/zbud: drop zbud_header
Now that the only field left in zbud_header is .under_reclaim, get it out of the struct and let the PG_reclaim bit in page->flags take over. As a result of this change, we can finally eliminate struct zbud_header, and hence all the internal data structures of zbud live in struct page. Signed-off-by: Heesub Shin --- mm/zbud.c | 66 +-- 1 file changed, 18 insertions(+), 48 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 8a6dd6b..5a392f3 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -60,17 +60,15 @@ * NCHUNKS_ORDER determines the internal allocation granularity, effectively * adjusting internal fragmentation. It also determines the number of * freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the - * allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk - * in allocated page is occupied by zbud header, NCHUNKS will be calculated to - * 63 which shows the max number of free chunks in zbud page, also there will be - * 63 freelists per pool. + * allocation granularity will be in chunks of size PAGE_SIZE/64. + * NCHUNKS will be calculated to 64 which shows the max number of free + * chunks in zbud page, also there will be 64 freelists per pool. */ #define NCHUNKS_ORDER 6 #define CHUNK_SHIFT (PAGE_SHIFT - NCHUNKS_ORDER) #define CHUNK_SIZE (1 << CHUNK_SHIFT) -#define ZHDR_SIZE_ALIGNED CHUNK_SIZE -#define NCHUNKS ((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT) +#define NCHUNKS (PAGE_SIZE >> CHUNK_SHIFT) /** * struct zbud_pool - stores metadata for each zbud pool @@ -96,14 +94,6 @@ struct zbud_pool { struct zbud_ops *ops; }; -/* - * struct zbud_header - zbud page metadata occupying the first chunk of each - * zbud page. - */ -struct zbud_header { - bool under_reclaim; -}; - /* * zpool */ @@ -220,22 +210,19 @@ static size_t get_num_chunks(struct page *page, enum buddy bud) #define for_each_unbuddied_list(_iter, _begin) \ for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) -/* Initializes the zbud header of a newly allocated zbud page */ +/* Initializes a newly allocated zbud page */ static void init_zbud_page(struct page *page) { - struct zbud_header *zhdr = page_address(page); set_num_chunks(page, FIRST, 0); set_num_chunks(page, LAST, 0); INIT_LIST_HEAD((struct list_head *) &page->index); INIT_LIST_HEAD(&page->lru); - zhdr->under_reclaim = 0; + ClearPageReclaim(page); } /* Resets the struct page fields and frees the page */ -static void free_zbud_page(struct zbud_header *zhdr) +static void free_zbud_page(struct page *page) { - struct page *page = virt_to_page(zhdr); - init_page_count(page); page_mapcount_reset(page); __free_page(page); @@ -261,14 +248,6 @@ static struct page *handle_to_zbud_page(unsigned long handle) return (struct page *) (handle & ~LAST); } -/* Returns the zbud page where a given handle is stored */ -static struct zbud_header *handle_to_zbud_header(unsigned long handle) -{ - struct page *page = handle_to_zbud_page(handle); - - return page_address(page); -} - /* Returns the number of free chunks in a zbud page */ static int num_free_chunks(struct page *page) { @@ -347,7 +326,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, if (!size || (gfp & __GFP_HIGHMEM)) return -EINVAL; - if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE) + if (size > PAGE_SIZE - CHUNK_SIZE) return -ENOSPC; chunks = size_to_chunks(size); spin_lock(&pool->lock); @@ -410,21 +389,18 @@ found: */ void zbud_free(struct zbud_pool *pool, unsigned long handle) { - struct zbud_header *zhdr; struct page *page; int freechunks; spin_lock(&pool->lock); - zhdr = handle_to_zbud_header(handle); - page = virt_to_page(zhdr); + page = handle_to_zbud_page(handle); - /* If first buddy, handle will be page aligned */ - if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) - set_num_chunks(page, LAST, 0); - else + if (!is_last_chunk(handle)) set_num_chunks(page, FIRST, 0); + else + set_num_chunks(page, LAST, 0); - if (zhdr->under_reclaim) { + if (PageReclaim(page)) { /* zbud page is under reclaim, reclaim will free */ spin_unlock(&pool->lock); return; @@ -436,7 +412,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) list_del((struct list_head *) &page->index); /* zbud page is empty, free */ list_del(&page->lru); - free_zbud_page(zhdr); + free_zbud_page(page); pool->pages_nr--; } el
[RFC PATCH 4/9] mm/zbud: remove first|last_chunks from zbud_header
The size information of the first and last buddies is stored in first|last_chunks in struct zbud_header, respectively. Put them into page->private instead of zbud_header. Signed-off-by: Heesub Shin --- mm/zbud.c | 62 -- 1 file changed, 36 insertions(+), 26 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index a2390f6..193ea4f 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -100,13 +100,9 @@ struct zbud_pool { * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. * @buddy: links the zbud page into the unbuddied lists in the pool - * @first_chunks: the size of the first buddy in chunks, 0 if free - * @last_chunks: the size of the last buddy in chunks, 0 if free */ struct zbud_header { struct list_head buddy; - unsigned int first_chunks; - unsigned int last_chunks; bool under_reclaim; }; @@ -212,6 +208,17 @@ static int size_to_chunks(size_t size) return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT; } +static void set_num_chunks(struct page *page, enum buddy bud, size_t chunks) +{ + page->private = (page->private & (0xffff << (16 * !bud))) | + ((chunks & 0xffff) << (16 * bud)); +} + +static size_t get_num_chunks(struct page *page, enum buddy bud) +{ + return (page->private >> (16 * bud)) & 0xffff; +} + #define for_each_unbuddied_list(_iter, _begin) \ for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) @@ -219,8 +226,8 @@ static int size_to_chunks(size_t size) /* Initializes the zbud header of a newly allocated zbud page */ static struct zbud_header *init_zbud_page(struct page *page) { struct zbud_header *zhdr = page_address(page); - zhdr->first_chunks = 0; - zhdr->last_chunks = 0; + set_num_chunks(page, FIRST, 0); + set_num_chunks(page, LAST, 0); INIT_LIST_HEAD(&zhdr->buddy); INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; @@ -240,6 +247,7 @@ static void free_zbud_page(struct zbud_header *zhdr) static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) { unsigned long handle; + struct page *page = virt_to_page(zhdr); /* * For now, the encoded handle is actually just the pointer to the data @@ -252,7 +260,8 @@ static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) /* skip over zbud header */ handle += ZHDR_SIZE_ALIGNED; else /* bud == LAST */ - handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT); + handle += PAGE_SIZE - + (get_num_chunks(page, LAST) << CHUNK_SHIFT); return handle; } @@ -263,13 +272,14 @@ static struct zbud_header *handle_to_zbud_header(unsigned long handle) } /* Returns the number of free chunks in a zbud page */ -static int num_free_chunks(struct zbud_header *zhdr) +static int num_free_chunks(struct page *page) { /* * Rather than branch for different situations, just use the fact that * free buddies have a length of zero to simplify everything. */ - return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks; + return NCHUNKS - get_num_chunks(page, FIRST) + - get_num_chunks(page, LAST); } /* @@ -366,17 +376,17 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, zhdr = init_zbud_page(page); found: - if (zhdr->first_chunks == 0) { - zhdr->first_chunks = chunks; + if (get_num_chunks(page, FIRST) == 0) bud = FIRST; - } else { - zhdr->last_chunks = chunks; + else bud = LAST; - } - if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) { + set_num_chunks(page, bud, chunks); + + if (get_num_chunks(page, FIRST) == 0 || + get_num_chunks(page, LAST) == 0) { /* Add to unbuddied list */ - freechunks = num_free_chunks(zhdr); + freechunks = num_free_chunks(page); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); } @@ -413,9 +423,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* If first buddy, handle will be page aligned */ if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) - zhdr->last_chunks = 0; + set_num_chunks(page, LAST, 0); else - zhdr->first_chunks = 0; + set_num_chunks(page, FIRST, 0); if (zhdr->under_reclaim) { /* zbud page is under reclaim, reclaim will free */ @@ -423,7 +433,8 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) return; } - if (num_free_chunks(zhdr) == NCHUNKS) { + freech
[RFC PATCH 3/9] mm/zbud: remove lru from zbud_header
zbud_pool has an lru list for tracking zbud pages, and they are strung together via zhdr->lru. If we reuse page->lru for linking zbud pages instead, the lru field in zbud_header can be dropped. Signed-off-by: Heesub Shin --- mm/zbud.c | 23 +-- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 0f5add0..a2390f6 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -100,13 +100,11 @@ struct zbud_pool { * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. * @buddy: links the zbud page into the unbuddied lists in the pool - * @lru: links the zbud page into the lru list in the pool * @first_chunks: the size of the first buddy in chunks, 0 if free * @last_chunks: the size of the last buddy in chunks, 0 if free */ struct zbud_header { struct list_head buddy; - struct list_head lru; unsigned int first_chunks; unsigned int last_chunks; bool under_reclaim; @@ -224,7 +222,7 @@ static struct zbud_header *init_zbud_page(struct page *page) zhdr->first_chunks = 0; zhdr->last_chunks = 0; INIT_LIST_HEAD(&zhdr->buddy); - INIT_LIST_HEAD(&zhdr->lru); + INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; return zhdr; } @@ -352,6 +350,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, if (!list_empty(&pool->unbuddied[i])) { zhdr = list_first_entry(&pool->unbuddied[i], struct zbud_header, buddy); + page = virt_to_page(zhdr); list_del(&zhdr->buddy); goto found; } @@ -382,9 +381,9 @@ found: } /* Add/move zbud page to beginning of LRU */ - if (!list_empty(&zhdr->lru)) - list_del(&zhdr->lru); - list_add(&zhdr->lru, &pool->lru); + if (!list_empty(&page->lru)) + list_del(&page->lru); + list_add(&page->lru, &pool->lru); *handle = encode_handle(zhdr, bud); spin_unlock(&pool->lock); @@ -405,10 +404,12 @@ found: void zbud_free(struct zbud_pool *pool, unsigned long handle) { struct zbud_header *zhdr; + struct page *page; int freechunks; spin_lock(&pool->lock); zhdr = handle_to_zbud_header(handle); + page = virt_to_page(zhdr); /* If first buddy, handle will be page aligned */ if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK) @@ -426,7 +427,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) /* Remove from existing unbuddied list */ list_del(&zhdr->buddy); /* zbud page is empty, free */ - list_del(&zhdr->lru); + list_del(&page->lru); free_zbud_page(zhdr); pool->pages_nr--; } else { @@ -479,6 +480,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) { int i, ret, freechunks; + struct page *page; struct zbud_header *zhdr; unsigned long first_handle, last_handle; @@ -489,8 +491,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) return -EINVAL; } for (i = 0; i < retries; i++) { - zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru); - list_del(&zhdr->lru); + page = list_tail_entry(&pool->lru, struct page, lru); + zhdr = page_address(page); + list_del(&page->lru); list_del(&zhdr->buddy); /* Protect zbud page against free */ zhdr->under_reclaim = true; @@ -537,7 +540,7 @@ next: } /* add to beginning of LRU */ - list_add(&zhdr->lru, &pool->lru); + list_add(&page->lru, &pool->lru); } spin_unlock(&pool->lock); return -EAGAIN; -- 1.9.1
[RFC PATCH 9/9] mm/zswap: use highmem pages for compressed pool
Now that zbud supports highmem, storing compressed anonymous pages in highmem looks more reasonable. So, pass the __GFP_HIGHMEM flag to zpool when zswap allocates memory from it. Signed-off-by: Heesub Shin --- mm/zswap.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/zswap.c b/mm/zswap.c index ea064c1..eaabe95 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -684,8 +684,8 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset, /* store */ len = dlen + sizeof(struct zswap_header); - ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN, &handle); + ret = zpool_malloc(zswap_pool, len, + __GFP_NORETRY | __GFP_NOWARN | __GFP_HIGHMEM, &handle); if (ret == -ENOSPC) { zswap_reject_compress_poor++; goto freepage; -- 1.9.1
[RFC PATCH 8/9] mm/zbud: allow clients to use highmem pages
Now that all fields of zbud's internal data structure have moved to struct page, there is no reason to restrict zbud pages to lowmem. This patch allows highmem pages to be used for zbud pages. Pages from highmem are mapped using kmap_atomic() before being accessed. Signed-off-by: Heesub Shin --- mm/zbud.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 5a392f3..677fdc1 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -52,6 +52,7 @@ #include #include #include +#include <linux/highmem.h> /* * Structures */ @@ -94,6 +95,9 @@ struct zbud_pool { struct zbud_ops *ops; }; +/* per-cpu mapping addresses of kmap_atomic()'ed zbud pages */ +static DEFINE_PER_CPU(void *, zbud_mapping); + /* * zpool */ @@ -310,9 +314,6 @@ void zbud_destroy_pool(struct zbud_pool *pool) * performed first. If no suitable free region is found, then a new page is * allocated and added to the pool to satisfy the request. * - * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used - * as zbud pool pages. - * * Return: 0 if success and handle is set, otherwise -EINVAL if the size or * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate * a new page. @@ -324,7 +325,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, enum buddy bud; struct page *page; - if (!size || (gfp & __GFP_HIGHMEM)) + if (!size) return -EINVAL; if (size > PAGE_SIZE - CHUNK_SIZE) return -ENOSPC; chunks = size_to_chunks(size); spin_lock(&pool->lock); @@ -543,14 +544,24 @@ next: */ void *zbud_map(struct zbud_pool *pool, unsigned long handle) { + void **mapping; size_t offset = 0; struct page *page = handle_to_zbud_page(handle); + /* + * Because we use per-cpu mapping shared among the pools/users, + * we can't allow mapping in interrupt context because it can + * corrupt another user's mappings. + */ + BUG_ON(in_interrupt()); + if (is_last_chunk(handle)) offset = PAGE_SIZE - (get_num_chunks(page, LAST) << CHUNK_SHIFT); - return (unsigned char *) page_address(page) + offset; + mapping = &get_cpu_var(zbud_mapping); + *mapping = kmap_atomic(page); + return (char *) *mapping + offset; } /** @@ -560,6 +571,10 @@ void *zbud_map(struct zbud_pool *pool, unsigned long handle) */ void zbud_unmap(struct zbud_pool *pool, unsigned long handle) { + void **mapping = this_cpu_ptr(&zbud_mapping); + + kunmap_atomic(*mapping); + put_cpu_var(zbud_mapping); } /** -- 1.9.1
[RFC PATCH 6/9] mm/zbud: remove list_head for buddied list from zbud_header
The zbud allocator links the _unbuddied_ zbud pages into a list in the pool. When it tries to allocate some space, the list is first searched for the best possible fit. Thus, the current implementation has a list_head in the zbud_header structure to construct the list. This patch simulates a list using the second double word of struct page instead of zbud_header. Then, we can eliminate the list_head in zbud_header. Using the _index and _mapcount fields (also including _count on 64-bit machines) in struct page for list management looks a bit odd, but there is no better idea for now, considering that page->lru is already in use. Signed-off-by: Heesub Shin --- mm/zbud.c | 36 +++- 1 file changed, 19 insertions(+), 17 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 383bab0..8a6dd6b 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -99,10 +99,8 @@ struct zbud_pool { /* * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. - * @buddy: links the zbud page into the unbuddied lists in the pool */ struct zbud_header { - struct list_head buddy; bool under_reclaim; }; @@ -223,21 +221,24 @@ static size_t get_num_chunks(struct page *page, enum buddy bud) for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++) /* Initializes the zbud header of a newly allocated zbud page */ -static struct zbud_header *init_zbud_page(struct page *page) +static void init_zbud_page(struct page *page) { struct zbud_header *zhdr = page_address(page); set_num_chunks(page, FIRST, 0); set_num_chunks(page, LAST, 0); - INIT_LIST_HEAD(&zhdr->buddy); + INIT_LIST_HEAD((struct list_head *) &page->index); INIT_LIST_HEAD(&page->lru); zhdr->under_reclaim = 0; - return zhdr; } /* Resets the struct page fields and frees the page */ static void free_zbud_page(struct zbud_header *zhdr) { - __free_page(virt_to_page(zhdr)); + struct page *page = virt_to_page(zhdr); + + init_page_count(page); + page_mapcount_reset(page); + __free_page(page); } static int is_last_chunk(unsigned long handle) @@ -341,7 +342,6 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, unsigned long *handle) { int chunks, i, freechunks; - struct zbud_header *zhdr = NULL; enum buddy bud; struct page *page; @@ -355,10 +355,9 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, /* First, try to find an unbuddied zbud page. */ for_each_unbuddied_list(i, chunks) { if (!list_empty(&pool->unbuddied[i])) { - zhdr = list_first_entry(&pool->unbuddied[i], - struct zbud_header, buddy); - page = virt_to_page(zhdr); - list_del(&zhdr->buddy); + page = list_entry((unsigned long *) + pool->unbuddied[i].next, struct page, index); + list_del((struct list_head *) &page->index); goto found; } } @@ -370,7 +369,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp, return -ENOMEM; spin_lock(&pool->lock); pool->pages_nr++; - zhdr = init_zbud_page(page); + init_zbud_page(page); found: if (get_num_chunks(page, FIRST) == 0) @@ -384,7 +383,8 @@ found: get_num_chunks(page, LAST) == 0) { /* Add to unbuddied list */ freechunks = num_free_chunks(page); - list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); + list_add((struct list_head *) &page->index, + &pool->unbuddied[freechunks]); } /* Add/move zbud page to beginning of LRU */ @@ -433,14 +433,15 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) freechunks = num_free_chunks(page); if (freechunks == NCHUNKS) { /* Remove from existing unbuddied list */ - list_del(&zhdr->buddy); + list_del((struct list_head *) &page->index); /* zbud page is empty, free */ list_del(&page->lru); free_zbud_page(zhdr); pool->pages_nr--; } else { /* Add to unbuddied list */ - list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); + list_add((struct list_head *) &page->index, + &pool->unbuddied[freechunks]); } spin_unlock(&pool->lock); @@ -501,7 +502,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) page = list_tail_entry(&pool->lru, struct page, lru); zhdr = page_address(page); list_del(&page->lru); - list_del(&zhdr->buddy); + list_del((struct list_head *) &page->index);
[RFC PATCH 5/9] mm/zbud: encode zbud handle using struct page
As a preparation for further patches, this patch changes the way a zbud handle is encoded. Currently, the zbud handle is actually just a virtual address cast to unsigned long before being returned. Exporting the address to clients would be inappropriate if we use highmem pages for zbud pages, which will be implemented by the following patches. Change the zbud handle to a struct page pointer, with the least significant bit indicating whether it refers to the first or last buddy. All other information is hidden in the struct page. Signed-off-by: Heesub Shin --- mm/zbud.c | 50 -- 1 file changed, 28 insertions(+), 22 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 193ea4f..383bab0 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -240,35 +240,32 @@ static void free_zbud_page(struct zbud_header *zhdr) __free_page(virt_to_page(zhdr)); } +static int is_last_chunk(unsigned long handle) +{ + return (handle & LAST) == LAST; +} + /* * Encodes the handle of a particular buddy within a zbud page * Pool lock should be held as this function accesses first|last_chunks */ -static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud) +static unsigned long encode_handle(struct page *page, enum buddy bud) { - unsigned long handle; - struct page *page = virt_to_page(zhdr); + return (unsigned long) page | bud; +} - /* - * For now, the encoded handle is actually just the pointer to the data - * but this might not always be the case. A little information hiding. - * Add CHUNK_SIZE to the handle if it is the first allocation to jump - * over the zbud header in the first chunk. - */ - handle = (unsigned long) zhdr; - if (bud == FIRST) - /* skip over zbud header */ - handle += ZHDR_SIZE_ALIGNED; - else /* bud == LAST */ - handle += PAGE_SIZE - - (get_num_chunks(page, LAST) << CHUNK_SHIFT); - return handle; +/* Returns struct page of the zbud page where a given handle is stored */ +static struct page *handle_to_zbud_page(unsigned long handle) +{ + return (struct page *) (handle & ~LAST); } /* Returns the zbud page where a given handle is stored */ static struct zbud_header *handle_to_zbud_header(unsigned long handle) { - return (struct zbud_header *)(handle & PAGE_MASK); + struct page *page = handle_to_zbud_page(handle); + + return page_address(page); } /* Returns the number of free chunks in a zbud page */ @@ -395,7 +392,7 @@ found: list_del(&page->lru); list_add(&page->lru, &pool->lru); - *handle = encode_handle(zhdr, bud); + *handle = encode_handle(page, bud); spin_unlock(&pool->lock); return 0; @@ -514,9 +511,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries) first_handle = 0; last_handle = 0; if (get_num_chunks(page, FIRST)) - first_handle = encode_handle(zhdr, FIRST); + first_handle = encode_handle(page, FIRST); if (get_num_chunks(page, LAST)) - last_handle = encode_handle(zhdr, LAST); + last_handle = encode_handle(page, LAST); spin_unlock(&pool->lock); /* Issue the eviction callback(s) */ @@ -570,7 +567,16 @@ next: */ void *zbud_map(struct zbud_pool *pool, unsigned long handle) { - return (void *)(handle); + size_t offset; + struct page *page = handle_to_zbud_page(handle); + + if (is_last_chunk(handle)) + offset = PAGE_SIZE - + (get_num_chunks(page, LAST) << CHUNK_SHIFT); + else + offset = ZHDR_SIZE_ALIGNED; + + return (unsigned char *) page_address(page) + offset; } /** -- 1.9.1
[RFC PATCH 2/9] mm/zbud: remove buddied list from zbud_pool
There's no point in having the _buddied_ list of zbud pages, as nobody refers to it. Tracking it adds runtime overhead only, so let's remove it. Signed-off-by: Heesub Shin --- mm/zbud.c | 17 +++-- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/mm/zbud.c b/mm/zbud.c index 6f36394..0f5add0 100644 --- a/mm/zbud.c +++ b/mm/zbud.c @@ -79,8 +79,6 @@ * @unbuddied: array of lists tracking zbud pages that only contain one buddy; * the lists each zbud page is added to depends on the size of * its free region. - * @buddied: list tracking the zbud pages that contain two buddies; - * these zbud pages are full * @lru: list tracking the zbud pages in LRU order by most recently * added buddy. * @pages_nr: number of zbud pages in the pool. @@ -93,7 +91,6 @@ struct zbud_pool { spinlock_t lock; struct list_head unbuddied[NCHUNKS]; - struct list_head buddied; struct list_head lru; u64 pages_nr; struct zbud_ops *ops; @@ -102,7 +99,7 @@ struct zbud_pool { /* * struct zbud_header - zbud page metadata occupying the first chunk of each * zbud page. - * @buddy: links the zbud page into the unbuddied/buddied lists in the pool + * @buddy: links the zbud page into the unbuddied lists in the pool * @lru: links the zbud page into the lru list in the pool * @first_chunks: the size of the first buddy in chunks, 0 if free * @last_chunks: the size of the last buddy in chunks, 0 if free @@ -299,7 +296,6 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops) spin_lock_init(&pool->lock); for_each_unbuddied_list(i, 0) INIT_LIST_HEAD(&pool->unbuddied[i]); - INIT_LIST_HEAD(&pool->buddied); INIT_LIST_HEAD(&pool->lru); pool->pages_nr = 0; pool->ops = ops; @@ -383,9 +379,6 @@ found: /* Add to unbuddied list */ freechunks = num_free_chunks(zhdr); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); - } else { - /* Add to buddied list */ - list_add(&zhdr->buddy, &pool->buddied); } /* Add/move zbud page to beginning of LRU */ @@ -429,10 +422,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle) return; } - /* Remove from existing buddy list */ - list_del(&zhdr->buddy); - if (num_free_chunks(zhdr) == NCHUNKS) { + /* Remove from existing unbuddied list */ + list_del(&zhdr->buddy); /* zbud page is empty, free */ list_del(&zhdr->lru); free_zbud_page(zhdr); @@ -542,9 +534,6 @@ next: /* add to unbuddied list */ freechunks = num_free_chunks(zhdr); list_add(&zhdr->buddy, &pool->unbuddied[freechunks]); - } else { - /* add to buddied list */ - list_add(&zhdr->buddy, &pool->buddied); } /* add to beginning of LRU */ -- 1.9.1
[RFC PATCH 0/9] mm/zbud: support highmem pages
zbud is a memory allocator for storing compressed data pages. It keeps two data objects of arbitrary size on a single page. This simple design provides very deterministic behavior on reclamation, which is one of the reasons why zswap selected zbud as its default allocator over zsmalloc. Unlike zsmalloc, however, zbud does not support highmem. This is problematic especially on 32-bit machines having relatively small lowmem. Compressing anonymous pages from highmem and storing them in lowmem could eat up lowmem space. This limitation is due to the fact that zbud manages its internal data structures in zbud_header, which is kept at the head of a zbud page. For example, zbud pages are tracked by several lists and carry some status information, which may be referenced at any time by the kernel. Thus, zbud pages must be allocated in a directly mapped memory region, lowmem. After some digging, I found that the internal data structures of zbud can be kept in struct page, the same way zsmalloc does. So, this series moves all fields in zbud_header out to struct page. Though it alters quite a lot, it introduces no functional differences except highmem support. I am not sure whether this kind of modification, abusing several fields in struct page, is acceptable.
Heesub Shin (9): mm/zbud: tidy up a bit mm/zbud: remove buddied list from zbud_pool mm/zbud: remove lru from zbud_header mm/zbud: remove first|last_chunks from zbud_header mm/zbud: encode zbud handle using struct page mm/zbud: remove list_head for buddied list from zbud_header mm/zbud: drop zbud_header mm/zbud: allow clients to use highmem pages mm/zswap: use highmem pages for compressed pool mm/zbud.c | 244 ++--- mm/zswap.c | 4 +- 2 files changed, 121 insertions(+), 127 deletions(-) -- 1.9.1
[RFC PATCH 0/9] mm/zbud: support highmem pages
zbud is a memory allocator for storing compressed data pages. It keeps two data objects of arbitrary size on a single page. This simple design provides very deterministic behavior on reclamation, which is one of the reasons why zswap selected zbud as its default allocator over zsmalloc. Unlike zsmalloc, however, zbud does not support highmem. This is problematic especially on 32-bit machines with relatively small lowmem: compressing anonymous pages from highmem and storing them in lowmem can eat up lowmem space. This limitation comes from the fact that zbud keeps its internal data structures in a zbud_header placed at the head of each zbud page. For example, zbud pages are tracked by several lists and carry some status information that may be referenced by the kernel at any time, so zbud pages have to be allocated from the directly mapped region, lowmem. After some digging, I found that the internal data structures of zbud can be kept in struct page, the same way zsmalloc does it. So, this series moves all fields of zbud_header out into struct page. Though it touches quite a lot of code, it introduces no functional differences except highmem support. I am not entirely sure that this kind of modification, which reuses several fields of struct page, is acceptable.
Heesub Shin (9):
  mm/zbud: tidy up a bit
  mm/zbud: remove buddied list from zbud_pool
  mm/zbud: remove lru from zbud_header
  mm/zbud: remove first|last_chunks from zbud_header
  mm/zbud: encode zbud handle using struct page
  mm/zbud: remove list_head for buddied list from zbud_header
  mm/zbud: drop zbud_header
  mm/zbud: allow clients to use highmem pages
  mm/zswap: use highmem pages for compressed pool

 mm/zbud.c  | 244 ++---
 mm/zswap.c |   4 +-
 2 files changed, 121 insertions(+), 127 deletions(-)

--
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
[RFC PATCH 2/9] mm/zbud: remove buddied list from zbud_pool
There's no point in having the _buddied_ list of zbud pages, as nobody refers to it. Tracking it only adds runtime overhead, so let's remove it.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 17 +++--
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 6f36394..0f5add0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -79,8 +79,6 @@
  * @unbuddied:	array of lists tracking zbud pages that only contain one buddy;
  *		the lists each zbud page is added to depends on the size of
  *		its free region.
- * @buddied:	list tracking the zbud pages that contain two buddies;
- *		these zbud pages are full
  * @lru:	list tracking the zbud pages in LRU order by most recently
  *		added buddy.
  * @pages_nr:	number of zbud pages in the pool.
@@ -93,7 +91,6 @@
 struct zbud_pool {
 	spinlock_t lock;
 	struct list_head unbuddied[NCHUNKS];
-	struct list_head buddied;
 	struct list_head lru;
 	u64 pages_nr;
 	struct zbud_ops *ops;
@@ -102,7 +99,7 @@ struct zbud_pool {
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
- * @buddy:	links the zbud page into the unbuddied/buddied lists in the pool
+ * @buddy:	links the zbud page into the unbuddied lists in the pool
  * @lru:	links the zbud page into the lru list in the pool
  * @first_chunks:	the size of the first buddy in chunks, 0 if free
  * @last_chunks:	the size of the last buddy in chunks, 0 if free
@@ -299,7 +296,6 @@ struct zbud_pool *zbud_create_pool(gfp_t gfp, struct zbud_ops *ops)
 	spin_lock_init(&pool->lock);
 	for_each_unbuddied_list(i, 0)
 		INIT_LIST_HEAD(&pool->unbuddied[i]);
-	INIT_LIST_HEAD(&pool->buddied);
 	INIT_LIST_HEAD(&pool->lru);
 	pool->pages_nr = 0;
 	pool->ops = ops;
@@ -383,9 +379,6 @@ found:
 		/* Add to unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
-	} else {
-		/* Add to buddied list */
-		list_add(&zhdr->buddy, &pool->buddied);
 	}

 	/* Add/move zbud page to beginning of LRU */
@@ -429,10 +422,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		return;
 	}

-	/* Remove from existing buddy list */
-	list_del(&zhdr->buddy);
-
 	if (num_free_chunks(zhdr) == NCHUNKS) {
+		/* Remove from existing unbuddied list */
+		list_del(&zhdr->buddy);
 		/* zbud page is empty, free */
 		list_del(&zhdr->lru);
 		free_zbud_page(zhdr);
@@ -542,9 +534,6 @@ next:
 		/* add to unbuddied list */
 		freechunks = num_free_chunks(zhdr);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
-	} else {
-		/* add to buddied list */
-		list_add(&zhdr->buddy, &pool->buddied);
 	}

 	/* add to beginning of LRU */
--
1.9.1
[RFC PATCH 5/9] mm/zbud: encode zbud handle using struct page
As a preparation for further patches, this patch changes the way a zbud handle is encoded. Currently, a zbud handle is actually just a virtual address that is cast to unsigned long before being returned. Exporting that address to clients would be inappropriate once we use highmem pages for zbud pages, which will be implemented by the following patches. Change the zbud handle to a struct page pointer with the least significant bit indicating the first or last buddy. All other information is hidden in the struct page.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 50 --
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 193ea4f..383bab0 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -240,35 +240,32 @@ static void free_zbud_page(struct zbud_header *zhdr)
 	__free_page(virt_to_page(zhdr));
 }

+static int is_last_chunk(unsigned long handle)
+{
+	return (handle & LAST) == LAST;
+}
+
 /*
  * Encodes the handle of a particular buddy within a zbud page
  * Pool lock should be held as this function accesses first|last_chunks
  */
-static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
+static unsigned long encode_handle(struct page *page, enum buddy bud)
 {
-	unsigned long handle;
-	struct page *page = virt_to_page(zhdr);
+	return (unsigned long) page | bud;
+}

-	/*
-	 * For now, the encoded handle is actually just the pointer to the data
-	 * but this might not always be the case.  A little information hiding.
-	 * Add CHUNK_SIZE to the handle if it is the first allocation to jump
-	 * over the zbud header in the first chunk.
-	 */
-	handle = (unsigned long)zhdr;
-	if (bud == FIRST)
-		/* skip over zbud header */
-		handle += ZHDR_SIZE_ALIGNED;
-	else /* bud == LAST */
-		handle += PAGE_SIZE -
-			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
-	return handle;
+/* Returns struct page of the zbud page where a given handle is stored */
+static struct page *handle_to_zbud_page(unsigned long handle)
+{
+	return (struct page *) (handle & ~LAST);
 }

 /* Returns the zbud page where a given handle is stored */
 static struct zbud_header *handle_to_zbud_header(unsigned long handle)
 {
-	return (struct zbud_header *)(handle & PAGE_MASK);
+	struct page *page = handle_to_zbud_page(handle);
+
+	return page_address(page);
 }

 /* Returns the number of free chunks in a zbud page */
@@ -395,7 +392,7 @@ found:
 	list_del(&page->lru);
 	list_add(&page->lru, &pool->lru);

-	*handle = encode_handle(zhdr, bud);
+	*handle = encode_handle(page, bud);
 	spin_unlock(&pool->lock);

 	return 0;
@@ -514,9 +511,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		first_handle = 0;
 		last_handle = 0;
 		if (get_num_chunks(page, FIRST))
-			first_handle = encode_handle(zhdr, FIRST);
+			first_handle = encode_handle(page, FIRST);
 		if (get_num_chunks(page, LAST))
-			last_handle = encode_handle(zhdr, LAST);
+			last_handle = encode_handle(page, LAST);
 		spin_unlock(&pool->lock);

 		/* Issue the eviction callback(s) */
@@ -570,7 +567,16 @@ next:
  */
 void *zbud_map(struct zbud_pool *pool, unsigned long handle)
 {
-	return (void *)(handle);
+	size_t offset;
+	struct page *page = handle_to_zbud_page(handle);
+
+	if (is_last_chunk(handle))
+		offset = PAGE_SIZE -
+			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
+	else
+		offset = ZHDR_SIZE_ALIGNED;
+
+	return (unsigned char *) page_address(page) + offset;
 }

 /**
--
1.9.1
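Outside the kernel, the encoding above is plain low-bit pointer tagging: a struct page pointer is at least word-aligned, so bit 0 is free to carry the FIRST/LAST flag. Here is a minimal userspace sketch of the same idea; the names mirror the patch, but `struct fake_page` is a stand-in, not the kernel's struct page:

```c
#include <stdint.h>

enum buddy { FIRST = 0, LAST = 1 };

struct fake_page { int dummy; };	/* stand-in for struct page */

/* Pack the page pointer and the buddy flag into one word, as the patch does. */
static unsigned long encode_handle(struct fake_page *page, enum buddy bud)
{
	return (unsigned long) page | bud;
}

/* The flag lives in bit 0 of the handle. */
static int is_last_chunk(unsigned long handle)
{
	return (handle & LAST) == LAST;
}

/* Mask off the tag bit to recover the page pointer. */
static struct fake_page *handle_to_page(unsigned long handle)
{
	return (struct fake_page *) (handle & ~(unsigned long) LAST);
}
```

The scheme only works because the tagged pointer is always at least 2-byte aligned; the same trick underlies the kernel's use of low bits in pointers elsewhere.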
[RFC PATCH 6/9] mm/zbud: remove list_head for buddied list from zbud_header
The zbud allocator links the _unbuddied_ zbud pages into lists in the pool. When it tries to allocate some space, the lists are first searched for the best possible fit. Thus, the current implementation has a list_head in the zbud_header structure to construct the lists. This patch simulates a list using the second double word of struct page instead, so the list_head in zbud_header can be eliminated. Using the _index and _mapcount fields (and _count as well on 64-bit machines) of struct page for list management looks a bit odd, but I have no better idea for now, considering that page->lru is already in use.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 36 +++-
 1 file changed, 19 insertions(+), 17 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 383bab0..8a6dd6b 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -99,10 +99,8 @@ struct zbud_pool {
 /*
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
- * @buddy:	links the zbud page into the unbuddied lists in the pool
  */
 struct zbud_header {
-	struct list_head buddy;
 	bool under_reclaim;
 };

@@ -223,21 +221,24 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

 /* Initializes the zbud header of a newly allocated zbud page */
-static struct zbud_header *init_zbud_page(struct page *page)
+static void init_zbud_page(struct page *page)
 {
 	struct zbud_header *zhdr = page_address(page);
 	set_num_chunks(page, FIRST, 0);
 	set_num_chunks(page, LAST, 0);
-	INIT_LIST_HEAD(&zhdr->buddy);
+	INIT_LIST_HEAD((struct list_head *) &page->index);
 	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
-	return zhdr;
 }

 /* Resets the struct page fields and frees the page */
 static void free_zbud_page(struct zbud_header *zhdr)
 {
-	__free_page(virt_to_page(zhdr));
+	struct page *page = virt_to_page(zhdr);
+
+	init_page_count(page);
+	page_mapcount_reset(page);
+	__free_page(page);
 }

 static int is_last_chunk(unsigned long handle)
@@ -341,7 +342,6 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 			unsigned long *handle)
 {
 	int chunks, i, freechunks;
-	struct zbud_header *zhdr = NULL;
 	enum buddy bud;
 	struct page *page;

@@ -355,10 +355,9 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	/* First, try to find an unbuddied zbud page. */
 	for_each_unbuddied_list(i, chunks) {
 		if (!list_empty(&pool->unbuddied[i])) {
-			zhdr = list_first_entry(&pool->unbuddied[i],
-					struct zbud_header, buddy);
-			page = virt_to_page(zhdr);
-			list_del(&zhdr->buddy);
+			page = list_entry((unsigned long *)
+					pool->unbuddied[i].next,
+					struct page, index);
+			list_del((struct list_head *) &page->index);
 			goto found;
 		}
 	}
@@ -370,7 +369,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 		return -ENOMEM;
 	spin_lock(&pool->lock);
 	pool->pages_nr++;
-	zhdr = init_zbud_page(page);
+	init_zbud_page(page);

 found:
 	if (get_num_chunks(page, FIRST) == 0)
@@ -384,7 +383,8 @@ found:
 	    get_num_chunks(page, LAST) == 0) {
 		/* Add to unbuddied list */
 		freechunks = num_free_chunks(page);
-		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		list_add((struct list_head *) &page->index,
+				&pool->unbuddied[freechunks]);
 	}

 	/* Add/move zbud page to beginning of LRU */
@@ -433,14 +433,15 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	freechunks = num_free_chunks(page);
 	if (freechunks == NCHUNKS) {
 		/* Remove from existing unbuddied list */
-		list_del(&zhdr->buddy);
+		list_del((struct list_head *) &page->index);
 		/* zbud page is empty, free */
 		list_del(&page->lru);
 		free_zbud_page(zhdr);
 		pool->pages_nr--;
 	} else {
 		/* Add to unbuddied list */
-		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
+		list_add((struct list_head *) &page->index,
+				&pool->unbuddied[freechunks]);
 	}

 	spin_unlock(&pool->lock);
@@ -501,7 +502,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		page = list_tail_entry(&pool->lru, struct page, lru);
 		zhdr = page_address(page);
 		list_del(&page->lru);
-		list_del(&zhdr->buddy);
+		list_del((struct list_head *) &page->index);
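The trick in this patch is overlaying a list_head on unused struct page fields and walking back from a list node to its containing page with a container_of-style offset computation. A userspace sketch of the same mechanism, with a hand-rolled circular list standing in for the kernel's `<linux/list.h>` and a fake two-word struct standing in for the second double word of struct page:

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Stand-in for struct page: 'index' plus the following word give enough
 * room to overlay a struct list_head, as the patch assumes for the second
 * double word of struct page. */
struct fake_page {
	unsigned long index;
	unsigned long mapcount;
};

/* The patch's idiom: treat &page->index as a list_head. */
static struct list_head *page_list(struct fake_page *page)
{
	return (struct list_head *) &page->index;
}

/* Recover the page from a list node, mirroring list_entry(..., index). */
static struct fake_page *list_to_page(struct list_head *node)
{
	return container_of((unsigned long *) node, struct fake_page, index);
}
```

This only works because nothing else uses those two words while the page is in the list; the kernel code has to reset `_count` and `_mapcount` before freeing the page, which is exactly what the new free_zbud_page() does.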
[RFC PATCH 8/9] mm/zbud: allow clients to use highmem pages
Now that all fields of the internal data structures of zbud have been moved into struct page, there is no reason to restrict zbud pages to lowmem. This patch allows highmem pages to be used for zbud pages. Pages from highmem are mapped with kmap_atomic() before access.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 5a392f3..677fdc1 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -52,6 +52,7 @@
 #include <linux/spinlock.h>
 #include <linux/zbud.h>
 #include <linux/zpool.h>
+#include <linux/highmem.h>

 /*
  * Structures
@@ -94,6 +95,9 @@ struct zbud_pool {
 	struct zbud_ops *ops;
 };

+/* per-cpu mapping addresses of kmap_atomic()'ed zbud pages */
+static DEFINE_PER_CPU(void *, zbud_mapping);
+
 /*****************
  * zpool
 ****************/
@@ -310,9 +314,6 @@ void zbud_destroy_pool(struct zbud_pool *pool)
  * performed first. If no suitable free region is found, then a new page is
  * allocated and added to the pool to satisfy the request.
  *
- * gfp should not set __GFP_HIGHMEM as highmem pages cannot be used
- * as zbud pool pages.
- *
 * Return: 0 if success and handle is set, otherwise -EINVAL if the size or
 * gfp arguments are invalid or -ENOMEM if the pool was unable to allocate
 * a new page.
@@ -324,7 +325,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	enum buddy bud;
 	struct page *page;

-	if (!size || (gfp & __GFP_HIGHMEM))
+	if (!size)
 		return -EINVAL;
 	if (size > PAGE_SIZE - CHUNK_SIZE)
 		return -ENOSPC;
@@ -543,14 +544,24 @@ next:
  */
 void *zbud_map(struct zbud_pool *pool, unsigned long handle)
 {
+	void **mapping;
 	size_t offset = 0;
 	struct page *page = handle_to_zbud_page(handle);

+	/*
+	 * Because we use per-cpu mapping shared among the pools/users,
+	 * we can't allow mapping in interrupt context because it can
+	 * corrupt another user's mappings.
+	 */
+	BUG_ON(in_interrupt());
+
 	if (is_last_chunk(handle))
 		offset = PAGE_SIZE -
 			(get_num_chunks(page, LAST) << CHUNK_SHIFT);

-	return (unsigned char *) page_address(page) + offset;
+	mapping = &get_cpu_var(zbud_mapping);
+	*mapping = kmap_atomic(page);
+	return (char *) *mapping + offset;
 }

 /**
@@ -560,6 +571,10 @@ void *zbud_map(struct zbud_pool *pool, unsigned long handle)
  */
 void zbud_unmap(struct zbud_pool *pool, unsigned long handle)
 {
+	void **mapping = this_cpu_ptr(&zbud_mapping);
+
+	kunmap_atomic(*mapping);
+	put_cpu_var(zbud_mapping);
 }

 /**
--
1.9.1
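With the header gone, the map path locates a buddy purely from the handle's tag bit and the stored chunk count: a FIRST buddy starts at offset 0, a LAST buddy is packed against the end of the page. A standalone sketch of just that offset arithmetic, assuming a 4 KiB page and the series' NCHUNKS_ORDER of 6 (the constants here are local stand-ins for the kernel macros):

```c
#include <stddef.h>

#define PAGE_SIZE_	4096UL		/* assumes PAGE_SHIFT == 12 */
#define NCHUNKS_ORDER	6
#define CHUNK_SHIFT	(12 - NCHUNKS_ORDER)

/* Offset of a buddy inside its zbud page: FIRST buddies start at offset 0,
 * LAST buddies of 'last_chunks' chunks end exactly at the page boundary. */
static size_t buddy_offset(int is_last, size_t last_chunks)
{
	if (is_last)
		return PAGE_SIZE_ - (last_chunks << CHUNK_SHIFT);
	return 0;
}
```

With these assumptions a chunk is 64 bytes, so a LAST buddy of 2 chunks starts at byte 3968 of the page.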
[RFC PATCH 3/9] mm/zbud: remove lru from zbud_header
zbud_pool has an lru list for tracking zbud pages, which are strung together via zhdr->lru. If we reuse page->lru for linking zbud pages instead, the lru field in zbud_header can be dropped.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 0f5add0..a2390f6 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,11 @@ struct zbud_pool {
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
  * @buddy:	links the zbud page into the unbuddied lists in the pool
- * @lru:	links the zbud page into the lru list in the pool
  * @first_chunks:	the size of the first buddy in chunks, 0 if free
  * @last_chunks:	the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
 	struct list_head buddy;
-	struct list_head lru;
 	unsigned int first_chunks;
 	unsigned int last_chunks;
 	bool under_reclaim;
@@ -224,7 +222,7 @@ static struct zbud_header *init_zbud_page(struct page *page)
 	zhdr->first_chunks = 0;
 	zhdr->last_chunks = 0;
 	INIT_LIST_HEAD(&zhdr->buddy);
-	INIT_LIST_HEAD(&zhdr->lru);
+	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
 	return zhdr;
 }
@@ -352,6 +350,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 		if (!list_empty(&pool->unbuddied[i])) {
 			zhdr = list_first_entry(&pool->unbuddied[i],
 					struct zbud_header, buddy);
+			page = virt_to_page(zhdr);
 			list_del(&zhdr->buddy);
 			goto found;
 		}
@@ -382,9 +381,9 @@ found:
 	}

 	/* Add/move zbud page to beginning of LRU */
-	if (!list_empty(&zhdr->lru))
-		list_del(&zhdr->lru);
-	list_add(&zhdr->lru, &pool->lru);
+	if (!list_empty(&page->lru))
+		list_del(&page->lru);
+	list_add(&page->lru, &pool->lru);

 	*handle = encode_handle(zhdr, bud);
 	spin_unlock(&pool->lock);
@@ -405,10 +404,12 @@ found:
 void zbud_free(struct zbud_pool *pool, unsigned long handle)
 {
 	struct zbud_header *zhdr;
+	struct page *page;
 	int freechunks;

 	spin_lock(&pool->lock);
 	zhdr = handle_to_zbud_header(handle);
+	page = virt_to_page(zhdr);

 	/* If first buddy, handle will be page aligned */
 	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
@@ -426,7 +427,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		/* Remove from existing unbuddied list */
 		list_del(&zhdr->buddy);
 		/* zbud page is empty, free */
-		list_del(&zhdr->lru);
+		list_del(&page->lru);
 		free_zbud_page(zhdr);
 		pool->pages_nr--;
 	} else {
@@ -479,6 +480,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 {
 	int i, ret, freechunks;
+	struct page *page;
 	struct zbud_header *zhdr;
 	unsigned long first_handle, last_handle;

@@ -489,8 +491,9 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 		return -EINVAL;
 	}
 	for (i = 0; i < retries; i++) {
-		zhdr = list_tail_entry(&pool->lru, struct zbud_header, lru);
-		list_del(&zhdr->lru);
+		page = list_tail_entry(&pool->lru, struct page, lru);
+		zhdr = page_address(page);
+		list_del(&page->lru);
 		list_del(&zhdr->buddy);
 		/* Protect zbud page against free */
 		zhdr->under_reclaim = true;
@@ -537,7 +540,7 @@ next:
 	}

 	/* add to beginning of LRU */
-	list_add(&zhdr->lru, &pool->lru);
+	list_add(&page->lru, &pool->lru);
 	}
 	spin_unlock(&pool->lock);
 	return -EAGAIN;
--
1.9.1
[RFC PATCH 9/9] mm/zswap: use highmem pages for compressed pool
Now that zbud supports highmem, storing compressed anonymous pages in highmem looks more reasonable. So, pass the __GFP_HIGHMEM flag to zpool when zswap allocates memory from it.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zswap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index ea064c1..eaabe95 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -684,8 +684,8 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,

 	/* store */
 	len = dlen + sizeof(struct zswap_header);
-	ret = zpool_malloc(zswap_pool, len, __GFP_NORETRY | __GFP_NOWARN,
-			&handle);
+	ret = zpool_malloc(zswap_pool, len,
+			__GFP_NORETRY | __GFP_NOWARN | __GFP_HIGHMEM, &handle);
 	if (ret == -ENOSPC) {
 		zswap_reject_compress_poor++;
 		goto freepage;
--
1.9.1
[RFC PATCH 1/9] mm/zbud: tidy up a bit
For aesthetics, add blank lines between functions, remove useless initialization statements, and simplify the code a bit. No functional differences are introduced.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index ecf1dbe..6f36394 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -145,6 +145,7 @@ static int zbud_zpool_malloc(void *pool, size_t size, gfp_t gfp,
 {
 	return zbud_alloc(pool, size, gfp, handle);
 }
+
 static void zbud_zpool_free(void *pool, unsigned long handle)
 {
 	zbud_free(pool, handle);
@@ -174,6 +175,7 @@ static void *zbud_zpool_map(void *pool, unsigned long handle,
 {
 	return zbud_map(pool, handle);
 }
+
 static void zbud_zpool_unmap(void *pool, unsigned long handle)
 {
 	zbud_unmap(pool, handle);
@@ -350,16 +352,11 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	spin_lock(&pool->lock);

 	/* First, try to find an unbuddied zbud page. */
-	zhdr = NULL;
 	for_each_unbuddied_list(i, chunks) {
 		if (!list_empty(&pool->unbuddied[i])) {
 			zhdr = list_first_entry(&pool->unbuddied[i],
 					struct zbud_header, buddy);
 			list_del(&zhdr->buddy);
-			if (zhdr->first_chunks == 0)
-				bud = FIRST;
-			else
-				bud = LAST;
 			goto found;
 		}
 	}
@@ -372,13 +369,15 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	spin_lock(&pool->lock);
 	pool->pages_nr++;
 	zhdr = init_zbud_page(page);
-	bud = FIRST;

 found:
-	if (bud == FIRST)
+	if (zhdr->first_chunks == 0) {
 		zhdr->first_chunks = chunks;
-	else
+		bud = FIRST;
+	} else {
 		zhdr->last_chunks = chunks;
+		bud = LAST;
+	}

 	if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
 		/* Add to unbuddied list */
@@ -433,7 +432,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	/* Remove from existing buddy list */
 	list_del(&zhdr->buddy);

-	if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+	if (num_free_chunks(zhdr) == NCHUNKS) {
 		/* zbud page is empty, free */
 		list_del(&zhdr->lru);
 		free_zbud_page(zhdr);
@@ -489,7 +488,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 {
 	int i, ret, freechunks;
 	struct zbud_header *zhdr;
-	unsigned long first_handle = 0, last_handle = 0;
+	unsigned long first_handle, last_handle;

 	spin_lock(&pool->lock);
 	if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) ||
@@ -529,7 +528,7 @@ int zbud_reclaim_page(struct zbud_pool *pool, unsigned int retries)
 next:
 	spin_lock(&pool->lock);
 	zhdr->under_reclaim = false;
-	if (zhdr->first_chunks == 0 && zhdr->last_chunks == 0) {
+	if (num_free_chunks(zhdr) == NCHUNKS) {
 		/*
 		 * Both buddies are now free, free the zbud page and
 		 * return success.
--
1.9.1
[RFC PATCH 7/9] mm/zbud: drop zbud_header
Now that the only field left in zbud_header is under_reclaim, drop the struct and let the PG_reclaim bit in page->flags take over. As a result of this change, struct zbud_header can finally be eliminated, and hence all the internal data structures of zbud live in struct page.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 66 +--
 1 file changed, 18 insertions(+), 48 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index 8a6dd6b..5a392f3 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -60,17 +60,15 @@
  * NCHUNKS_ORDER determines the internal allocation granularity, effectively
  * adjusting internal fragmentation.  It also determines the number of
  * freelists maintained in each pool. NCHUNKS_ORDER of 6 means that the
- * allocation granularity will be in chunks of size PAGE_SIZE/64. As one chunk
- * in allocated page is occupied by zbud header, NCHUNKS will be calculated to
- * 63 which shows the max number of free chunks in zbud page, also there will be
- * 63 freelists per pool.
+ * allocation granularity will be in chunks of size PAGE_SIZE/64.
+ * NCHUNKS will be calculated to 64 which shows the max number of free
+ * chunks in zbud page, also there will be 64 freelists per pool.
  */
 #define NCHUNKS_ORDER	6

 #define CHUNK_SHIFT	(PAGE_SHIFT - NCHUNKS_ORDER)
 #define CHUNK_SIZE	(1 << CHUNK_SHIFT)
-#define ZHDR_SIZE_ALIGNED CHUNK_SIZE
-#define NCHUNKS		((PAGE_SIZE - ZHDR_SIZE_ALIGNED) >> CHUNK_SHIFT)
+#define NCHUNKS		(PAGE_SIZE >> CHUNK_SHIFT)

 /**
  * struct zbud_pool - stores metadata for each zbud pool
@@ -96,14 +94,6 @@ struct zbud_pool {
 	struct zbud_ops *ops;
 };

-/*
- * struct zbud_header - zbud page metadata occupying the first chunk of each
- *			zbud page.
- */
-struct zbud_header {
-	bool under_reclaim;
-};
-
 /*****************
  * zpool
 ****************/
@@ -220,22 +210,19 @@ static size_t get_num_chunks(struct page *page, enum buddy bud)
 #define for_each_unbuddied_list(_iter, _begin) \
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

-/* Initializes the zbud header of a newly allocated zbud page */
+/* Initializes a newly allocated zbud page */
 static void init_zbud_page(struct page *page)
 {
-	struct zbud_header *zhdr = page_address(page);
 	set_num_chunks(page, FIRST, 0);
 	set_num_chunks(page, LAST, 0);
 	INIT_LIST_HEAD((struct list_head *) &page->index);
 	INIT_LIST_HEAD(&page->lru);
-	zhdr->under_reclaim = 0;
+	ClearPageReclaim(page);
 }

 /* Resets the struct page fields and frees the page */
-static void free_zbud_page(struct zbud_header *zhdr)
+static void free_zbud_page(struct page *page)
 {
-	struct page *page = virt_to_page(zhdr);
-
 	init_page_count(page);
 	page_mapcount_reset(page);
 	__free_page(page);
@@ -261,14 +248,6 @@ static struct page *handle_to_zbud_page(unsigned long handle)
 	return (struct page *) (handle & ~LAST);
 }

-/* Returns the zbud page where a given handle is stored */
-static struct zbud_header *handle_to_zbud_header(unsigned long handle)
-{
-	struct page *page = handle_to_zbud_page(handle);
-
-	return page_address(page);
-}
-
 /* Returns the number of free chunks in a zbud page */
 static int num_free_chunks(struct page *page)
 {
@@ -347,7 +326,7 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	if (!size || (gfp & __GFP_HIGHMEM))
 		return -EINVAL;
-	if (size > PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE)
+	if (size > PAGE_SIZE - CHUNK_SIZE)
 		return -ENOSPC;
 	chunks = size_to_chunks(size);
 	spin_lock(&pool->lock);
@@ -410,21 +389,18 @@ found:
 void zbud_free(struct zbud_pool *pool, unsigned long handle)
 {
-	struct zbud_header *zhdr;
 	struct page *page;
 	int freechunks;

 	spin_lock(&pool->lock);
-	zhdr = handle_to_zbud_header(handle);
-	page = virt_to_page(zhdr);
+	page = handle_to_zbud_page(handle);

-	/* If first buddy, handle will be page aligned */
-	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
-		set_num_chunks(page, LAST, 0);
-	else
+	if (!is_last_chunk(handle))
 		set_num_chunks(page, FIRST, 0);
+	else
+		set_num_chunks(page, LAST, 0);

-	if (zhdr->under_reclaim) {
+	if (PageReclaim(page)) {
 		/* zbud page is under reclaim, reclaim will free */
 		spin_unlock(&pool->lock);
 		return;
@@ -436,7 +412,7 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		list_del((struct list_head *) &page->index);
 		/* zbud page is empty, free */
 		list_del(&page->lru);
-		free_zbud_page(zhdr);
+		free_zbud_page(page);
 		pool->pages_nr--;
 	} else {
 		/* Add to unbuddied list */
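Replacing the under_reclaim bool with PG_reclaim is just keeping single-bit state in an existing flags word. A userspace sketch of the Set/Test/Clear idiom (in the kernel these accessors are generated by macros in page-flags.h; the bit number and struct here are made up for illustration):

```c
struct fake_page { unsigned long flags; };	/* stand-in for struct page */

#define PG_reclaim 4	/* arbitrary bit number for this sketch */

/* Set, clear, and test one flag bit, mirroring SetPageReclaim() et al. */
static void SetPageReclaim(struct fake_page *p)
{
	p->flags |= (1UL << PG_reclaim);
}

static void ClearPageReclaim(struct fake_page *p)
{
	p->flags &= ~(1UL << PG_reclaim);
}

static int PageReclaim(struct fake_page *p)
{
	return !!(p->flags & (1UL << PG_reclaim));
}
```

The point of the patch is that this bit already exists in page->flags for every page, so the per-page bool (and with it the whole header chunk) costs nothing to replace.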
[RFC PATCH 4/9] mm/zbud: remove first|last_chunks from zbud_header
The size information of the first and last buddy is stored in first_chunks|last_chunks of struct zbud_header, respectively. Put them into page->private instead of zbud_header.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 mm/zbud.c | 62 --
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/mm/zbud.c b/mm/zbud.c
index a2390f6..193ea4f 100644
--- a/mm/zbud.c
+++ b/mm/zbud.c
@@ -100,13 +100,9 @@ struct zbud_pool {
  * struct zbud_header - zbud page metadata occupying the first chunk of each
  *			zbud page.
  * @buddy:	links the zbud page into the unbuddied lists in the pool
- * @first_chunks:	the size of the first buddy in chunks, 0 if free
- * @last_chunks:	the size of the last buddy in chunks, 0 if free
  */
 struct zbud_header {
 	struct list_head buddy;
-	unsigned int first_chunks;
-	unsigned int last_chunks;
 	bool under_reclaim;
 };

@@ -212,6 +208,17 @@ static int size_to_chunks(size_t size)
 	return (size + CHUNK_SIZE - 1) >> CHUNK_SHIFT;
 }

+static void set_num_chunks(struct page *page, enum buddy bud, size_t chunks)
+{
+	page->private = (page->private & (0xffff << (16 * !bud))) |
+			((chunks & 0xffff) << (16 * bud));
+}
+
+static size_t get_num_chunks(struct page *page, enum buddy bud)
+{
+	return (page->private >> (16 * bud)) & 0xffff;
+}
+
 #define for_each_unbuddied_list(_iter, _begin) \
 	for ((_iter) = (_begin); (_iter) < NCHUNKS; (_iter)++)

@@ -219,8 +226,8 @@ static int size_to_chunks(size_t size)
 static struct zbud_header *init_zbud_page(struct page *page)
 {
 	struct zbud_header *zhdr = page_address(page);
-	zhdr->first_chunks = 0;
-	zhdr->last_chunks = 0;
+	set_num_chunks(page, FIRST, 0);
+	set_num_chunks(page, LAST, 0);
 	INIT_LIST_HEAD(&zhdr->buddy);
 	INIT_LIST_HEAD(&page->lru);
 	zhdr->under_reclaim = 0;
@@ -240,6 +247,7 @@ static void free_zbud_page(struct zbud_header *zhdr)
 static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
 {
 	unsigned long handle;
+	struct page *page = virt_to_page(zhdr);

 	/*
 	 * For now, the encoded handle is actually just the pointer to the data
@@ -252,7 +260,8 @@ static unsigned long encode_handle(struct zbud_header *zhdr, enum buddy bud)
 		/* skip over zbud header */
 		handle += ZHDR_SIZE_ALIGNED;
 	else /* bud == LAST */
-		handle += PAGE_SIZE - (zhdr->last_chunks << CHUNK_SHIFT);
+		handle += PAGE_SIZE -
+			(get_num_chunks(page, LAST) << CHUNK_SHIFT);
 	return handle;
 }

@@ -263,13 +272,14 @@ static struct zbud_header *handle_to_zbud_header(unsigned long handle)
 }

 /* Returns the number of free chunks in a zbud page */
-static int num_free_chunks(struct zbud_header *zhdr)
+static int num_free_chunks(struct page *page)
 {
 	/*
 	 * Rather than branch for different situations, just use the fact that
 	 * free buddies have a length of zero to simplify everything.
 	 */
-	return NCHUNKS - zhdr->first_chunks - zhdr->last_chunks;
+	return NCHUNKS - get_num_chunks(page, FIRST)
+			- get_num_chunks(page, LAST);
 }

@@ -366,17 +376,17 @@ int zbud_alloc(struct zbud_pool *pool, size_t size, gfp_t gfp,
 	zhdr = init_zbud_page(page);

 found:
-	if (zhdr->first_chunks == 0) {
-		zhdr->first_chunks = chunks;
+	if (get_num_chunks(page, FIRST) == 0)
 		bud = FIRST;
-	} else {
-		zhdr->last_chunks = chunks;
+	else
 		bud = LAST;
-	}

-	if (zhdr->first_chunks == 0 || zhdr->last_chunks == 0) {
+	set_num_chunks(page, bud, chunks);
+
+	if (get_num_chunks(page, FIRST) == 0 ||
+	    get_num_chunks(page, LAST) == 0) {
 		/* Add to unbuddied list */
-		freechunks = num_free_chunks(zhdr);
+		freechunks = num_free_chunks(page);
 		list_add(&zhdr->buddy, &pool->unbuddied[freechunks]);
 	}

@@ -413,9 +423,9 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 	/* If first buddy, handle will be page aligned */
 	if ((handle - ZHDR_SIZE_ALIGNED) & ~PAGE_MASK)
-		zhdr->last_chunks = 0;
+		set_num_chunks(page, LAST, 0);
 	else
-		zhdr->first_chunks = 0;
+		set_num_chunks(page, FIRST, 0);

 	if (zhdr->under_reclaim) {
 		/* zbud page is under reclaim, reclaim will free */
@@ -423,7 +433,8 @@ void zbud_free(struct zbud_pool *pool, unsigned long handle)
 		return;
 	}

-	if (num_free_chunks(zhdr) == NCHUNKS) {
+	freechunks = num_free_chunks(page);
+	if (freechunks == NCHUNKS) {
 		/* Remove from existing unbuddied list */
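The set/get helpers introduced above pack the two chunk counts into the low and high 16 bits of a single word, indexed by the buddy enum (FIRST == 0, LAST == 1). A standalone sketch of the same bit manipulation, assuming the mask that was garbled in the hunk above was 0xffff and operating on a plain unsigned long instead of page->private:

```c
#include <stddef.h>

enum buddy { FIRST = 0, LAST = 1 };

/* One word holds both counts: bits 0-15 for FIRST, bits 16-31 for LAST.
 * The (16 * !bud) mask keeps the *other* buddy's half untouched. */
static void set_num_chunks(unsigned long *priv, enum buddy bud, size_t chunks)
{
	*priv = (*priv & (0xffffUL << (16 * !bud))) |
		((chunks & 0xffffUL) << (16 * bud));
}

static size_t get_num_chunks(unsigned long *priv, enum buddy bud)
{
	return (*priv >> (16 * bud)) & 0xffff;
}
```

With NCHUNKS at most 64, 16 bits per count is far more than enough, so both counts comfortably share one unsigned long.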
Re: [PATCH 1/1] [ion]: system-heap use PAGE_ALLOC_COSTLY_ORDER for high order
Hello Kumar,

On 10/06/2014 05:31 PM, Pintu Kumar wrote:
> The Android ion_system_heap uses allocation fallback mechanism based on
> 8,4,0 order pages available in the system. It changes gfp flags based on
> higher order allocation request. This higher order value is hard-coded as
> 4, instead of using the system defined higher order value. Thus replacing
> this hard-coded value with PAGE_ALLOC_COSTLY_ORDER which is defined as 3.
> This will help mapping the higher order request in system heap with the
> actual allocation request.

Quite reasonable.

Reviewed-by: Heesub Shin

BTW, does anyone know how the allocation orders (8, 4 and 0) were decided? I think only the Google guys might know the answer.

regards,
heesub

> Signed-off-by: Pintu Kumar
> ---
>  drivers/staging/android/ion/ion_system_heap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
> index da2a63c..e6c393f 100644
> --- a/drivers/staging/android/ion/ion_system_heap.c
> +++ b/drivers/staging/android/ion/ion_system_heap.c
> @@ -65,7 +65,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
>  	} else {
>  		gfp_t gfp_flags = low_order_gfp_flags;
>
> -		if (order > 4)
> +		if (order > PAGE_ALLOC_COSTLY_ORDER)
>  			gfp_flags = high_order_gfp_flags;
>  		page = alloc_pages(gfp_flags | __GFP_COMP, order);
>  		if (!page)
> @@ -276,7 +276,7 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
>  		struct ion_page_pool *pool;
>  		gfp_t gfp_flags = low_order_gfp_flags;
>
> -		if (orders[i] > 4)
> +		if (orders[i] > PAGE_ALLOC_COSTLY_ORDER)
>  			gfp_flags = high_order_gfp_flags;
>  		pool = ion_page_pool_create(gfp_flags, orders[i]);
>  		if (!pool)
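The patch under review only changes the threshold at which the heap switches gfp flag sets for a given allocation order. A sketch of that selection logic with stand-in flag values (the real low_order_gfp_flags/high_order_gfp_flags are kernel gfp_t masks; the constants here are made up for illustration):

```c
/* Stand-in flag words; the real masks live in ion_system_heap.c. */
#define LOW_ORDER_GFP_FLAGS	0x1u
#define HIGH_ORDER_GFP_FLAGS	0x2u
#define PAGE_ALLOC_COSTLY_ORDER	3

/* Orders above PAGE_ALLOC_COSTLY_ORDER are likely to fail, so they get
 * the more permissive-to-failure flag set instead of retrying hard. */
static unsigned int gfp_for_order(unsigned int order)
{
	return order > PAGE_ALLOC_COSTLY_ORDER ? HIGH_ORDER_GFP_FLAGS
					       : LOW_ORDER_GFP_FLAGS;
}
```

With the heap's orders of 8, 4 and 0, the change means order-4 allocations now also take the high-order path, since 4 > 3 while 4 > 4 was false.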
Re: [RFC 3/3] zram: add swap_get_free hint
Hello Minchan, First of all, I agree with the overall purpose of your patch set. On 09/04/2014 10:39 AM, Minchan Kim wrote: This patch implements a SWAP_GET_FREE handler in zram so that the VM can know how much freeable space zram has. The VM can use it to stop anonymous reclaim once zram is full. Signed-off-by: Minchan Kim --- drivers/block/zram/zram_drv.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c index 88661d62e46a..8e22b20aa2db 100644 --- a/drivers/block/zram/zram_drv.c +++ b/drivers/block/zram/zram_drv.c @@ -951,6 +951,22 @@ static int zram_slot_free_notify(struct block_device *bdev, return 0; } +static int zram_get_free_pages(struct block_device *bdev, long *free) +{ + struct zram *zram; + struct zram_meta *meta; + + zram = bdev->bd_disk->private_data; + meta = zram->meta; + + if (!zram->limit_pages) + return 1; + + *free = zram->limit_pages - zs_get_total_pages(meta->mem_pool); Even if 'free' is zero here, there may still be free space available to store more compressed pages into the zs_pool. I mean, the calculation above is not quite accurate and wastes memory, but I have no better idea for now. heesub + + return 0; +} + static int zram_swap_hint(struct block_device *bdev, unsigned int hint, void *arg) { @@ -958,6 +974,8 @@ static int zram_swap_hint(struct block_device *bdev, if (hint == SWAP_SLOT_FREE) ret = zram_slot_free_notify(bdev, (unsigned long)arg); + else if (hint == SWAP_GET_FREE) + ret = zram_get_free_pages(bdev, arg);  return ret; }
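The headroom calculation being reviewed is simple enough to sketch outside the kernel. The function below is a hypothetical userspace analogue of zram_get_free_pages(), not its real signature:

```c
/* Returns 0 and stores the remaining page budget in *free, or 1 when no
 * limit is configured (mirroring the patch's early return). Note the
 * caveat raised in the review: a result of 0 does not mean the zsmalloc
 * pool is literally full, because partially filled zspages can still
 * accept more compressed objects. */
static int free_pages_hint(long limit_pages, long used_pages, long *free)
{
    if (!limit_pages)
        return 1;
    *free = limit_pages - used_pages;
    return 0;
}
```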
[PATCH] staging: ion: fixup invalid kfree() calls on heap destroy
I've noticed that the last commit to ion_system_heap.c ('staging: ion: optimize struct ion_system_heap') has an omission, so an invalid kfree() gets called on ion_system_heap_destroy(). As the ION system heap is never destroyed until system shutdown, it may not cause any harm, but it should be fixed. I should have caught this before the merge, my bad. Signed-off-by: Heesub Shin --- drivers/staging/android/ion/ion_system_heap.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index c826b4c..6b77c51 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -304,7 +304,6 @@ void ion_system_heap_destroy(struct ion_heap *heap) for (i = 0; i < num_orders; i++) ion_page_pool_destroy(sys_heap->pools[i]); - kfree(sys_heap->pools); kfree(sys_heap); } -- 1.9.1
[PATCH v2 RESEND 3/4] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page(). Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index f3b9008..e0a7e54 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -78,9 +78,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, } static void free_buffer_page(struct ion_system_heap *heap, -struct ion_buffer *buffer, struct page *page, -unsigned int order) +struct ion_buffer *buffer, struct page *page) { + unsigned int order = compound_order(page); bool cached = ion_buffer_cached(buffer); if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) { @@ -171,7 +171,7 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(page, tmp_page, &pages, lru) - free_buffer_page(sys_heap, buffer, page, compound_order(page)); + free_buffer_page(sys_heap, buffer, page); return -ENOMEM; } @@ -191,8 +191,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer) ion_heap_buffer_zero(buffer); for_each_sg(table->sgl, sg, table->nents, i) - free_buffer_page(sys_heap, buffer, sg_page(sg), - get_order(sg->length)); + free_buffer_page(sys_heap, buffer, sg_page(sg)); sg_free_table(table); kfree(table); } -- 1.9.1
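The idea behind the patch above — a compound page already carries its own order, so callers need not thread it through — can be sketched in userspace. `fake_page` and the helper names are illustrative stand-ins, not the kernel API:

```c
#include <stddef.h>

/* Illustrative stand-in for struct page: once the allocation order
 * travels with the page itself, a function like free_buffer_page()
 * no longer needs an explicit order parameter. */
struct fake_page {
    unsigned int order;      /* what compound_order(page) would return */
};

static unsigned int fake_compound_order(const struct fake_page *p)
{
    return p->order;
}

/* Bytes covered by the page, i.e. PAGE_SIZE << compound_order(page),
 * assuming a 4 KiB base page. */
static size_t fake_page_span(const struct fake_page *p)
{
    return (size_t)4096 << fake_compound_order(p);
}
```

This also removes a class of mismatch bugs: the freed span is always derived from the page itself rather than from a separately tracked length, as `get_order(sg->length)` was.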
[PATCH v2 RESEND 4/4] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies code lines and reduces overhead to the slab. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index e0a7e54..c826b4c 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -49,7 +49,7 @@ static inline unsigned int order_to_size(int order) struct ion_system_heap { struct ion_heap heap; - struct ion_page_pool **pools; + struct ion_page_pool *pools[0]; }; static struct page *alloc_buffer_page(struct ion_system_heap *heap, @@ -264,16 +264,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused) struct ion_system_heap *heap; int i; - heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL); + heap = kzalloc(sizeof(struct ion_system_heap) + + sizeof(struct ion_page_pool *) * num_orders, + GFP_KERNEL); if (!heap) return ERR_PTR(-ENOMEM); heap->heap.ops = &system_heap_ops; heap->heap.type = ION_HEAP_TYPE_SYSTEM; heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE; - heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders, - GFP_KERNEL); - if (!heap->pools) - goto free_heap; + for (i = 0; i < num_orders; i++) { struct ion_page_pool *pool; gfp_t gfp_flags = low_order_gfp_flags; @@ -292,8 +291,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused) destroy_pools: while (i--) ion_page_pool_destroy(heap->pools[i]); - kfree(heap->pools); -free_heap: kfree(heap); return ERR_PTR(-ENOMEM); } -- 1.9.1
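The single-allocation layout from the patch above can be demonstrated in userspace C. `heap_create` and `struct pool` are illustrative names, and the flexible array member is written in C99 `[]` style rather than the GNU `[0]` form used in the diff:

```c
#include <stdlib.h>

struct pool;                     /* opaque stand-in for ion_page_pool */

struct heap {
    int num_pools;
    struct pool *pools[];        /* trailing flexible array member */
};

/* One allocation covers both the header and the pointer array, so
 * teardown is a single free() and there is no separate allocation
 * failure path to unwind. */
static struct heap *heap_create(int num_pools)
{
    struct heap *h = calloc(1, sizeof(*h) +
                               sizeof(struct pool *) * num_pools);
    if (h)
        h->num_pools = num_pools;
    return h;
}
```

This is exactly why the patch can delete the `free_heap:` label: with the array folded into the structure, the second kzalloc() and its error path disappear.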
[PATCH v2 RESEND 2/4] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such data type anymore. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 47 +-- 1 file changed, 15 insertions(+), 32 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index 621b857..f3b9008 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -52,11 +52,6 @@ struct ion_system_heap { struct ion_page_pool **pools; }; -struct page_info { - struct page *page; - struct list_head list; -}; - static struct page *alloc_buffer_page(struct ion_system_heap *heap, struct ion_buffer *buffer, unsigned long order) @@ -98,19 +93,14 @@ static void free_buffer_page(struct ion_system_heap *heap, } -static struct page_info *alloc_largest_available(struct ion_system_heap *heap, -struct ion_buffer *buffer, -unsigned long size, -unsigned int max_order) +static struct page *alloc_largest_available(struct ion_system_heap *heap, + struct ion_buffer *buffer, + unsigned long size, + unsigned int max_order) { struct page *page; - struct page_info *info; int i; - info = kmalloc(sizeof(struct page_info), GFP_KERNEL); - if (!info) - return NULL; - for (i = 0; i < num_orders; i++) { if (size < order_to_size(orders[i])) continue; @@ -121,10 +111,8 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, if (!page) continue; - info->page = page; - return info; + return page; } - kfree(info); return NULL; } @@ -140,7 +128,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap, struct sg_table *table; struct scatterlist *sg; struct list_head pages; - struct page_info *info, *tmp_info; + struct page *page, *tmp_page; int i = 0; unsigned long size_remaining = PAGE_ALIGN(size); unsigned int max_order = orders[0]; @@ -153,13 +141,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap, INIT_LIST_HEAD(&pages); while (size_remaining > 0) { - info = alloc_largest_available(sys_heap, buffer, size_remaining, + page = alloc_largest_available(sys_heap, buffer, size_remaining, max_order); - if (!info) + if (!page) goto free_pages; - list_add_tail(&info->list, &pages); - size_remaining -= PAGE_SIZE << compound_order(info->page); - max_order = compound_order(info->page); + list_add_tail(&page->lru, &pages); + size_remaining -= PAGE_SIZE << compound_order(page); + max_order = compound_order(page); i++; } table = kmalloc(sizeof(struct sg_table), GFP_KERNEL); @@ -170,12 +158,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap, goto free_table; sg = table->sgl; - list_for_each_entry_safe(info, tmp_info, &pages, list) { - struct page *page = info->page; + list_for_each_entry_safe(page, tmp_page, &pages, lru) { sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0); sg = sg_next(sg); - list_del(&info->list); - kfree(info); + list_del(&page->lru); } buffer->priv_virt = table; @@ -184,11 +170,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap, free_table: kfree(table); free_pages: - list_for_each_entry_safe(info, tmp_info, &pages, list) { - free_buffer_page(sys_heap, buffer, info->page, - compound_order(info->page)); - kfree(info); - } + list_for_each_entry_safe(page, tmp_page, &pages, lru) + free_buffer_page(sys_heap, buffer, page, compound_order(page)); return -ENOMEM; } -- 1.9.1
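The pattern the patch above applies — linking objects through a node embedded in the object itself, instead of kmalloc'ing a wrapper node per element — can be sketched with a minimal doubly linked list. `fake_page` and the helpers below are illustrative, loosely modelled on the kernel's `list_head` and `container_of`:

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* The tracked object embeds its own link (as struct page embeds lru),
 * so no per-element wrapper like page_info is ever allocated and the
 * allocation-failure path for it disappears. */
struct fake_page {
    int id;
    struct list_head lru;
};

/* Recover the containing object from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

This is the same reasoning the commit message gives: once `page_info` carries nothing but a page pointer and a list node, the `page->lru` field already provides both.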
[PATCH v2 RESEND 1/4] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store page order in struct page_info. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz --- drivers/staging/android/ion/ion_system_heap.c | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index cb7ae08..621b857 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -54,7 +54,6 @@ struct ion_system_heap { struct page_info { struct page *page; - unsigned int order; struct list_head list; }; @@ -123,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, continue; info->page = page; - info->order = orders[i]; return info; } kfree(info); @@ -160,8 +158,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap, if (!info) goto free_pages; list_add_tail(>list, ); - size_remaining -= PAGE_SIZE << info->order; - max_order = info->order; + size_remaining -= PAGE_SIZE << compound_order(info->page); + max_order = compound_order(info->page); i++; } table = kmalloc(sizeof(struct sg_table), GFP_KERNEL); @@ -174,7 +172,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap, sg = table->sgl; list_for_each_entry_safe(info, tmp_info, , list) { struct page *page = info->page; - sg_set_page(sg, page, PAGE_SIZE << info->order, 0); + sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0); sg = sg_next(sg); list_del(>list); kfree(info); @@ -187,7 +185,8 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(info, tmp_info, , list) { - free_buffer_page(sys_heap, buffer, info->page, info->order); + free_buffer_page(sys_heap, buffer, info->page, + compound_order(info->page)); kfree(info); } return -ENOMEM; -- 1.9.1 -- To 
unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/9] staging: ion: tidy up a bit
On 05/30/2014 05:42 AM, Greg Kroah-Hartman wrote: On Wed, May 28, 2014 at 03:52:52PM +0900, Heesub Shin wrote: For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type. Signed-off-by: Heesub Shin Reviewed-by: Mitchel Humpherys Tested-by: John Stultz Not all of these would apply to my tree. So I've applied the ones that did. Please resend the remaining ones, after refreshing them against my staging-next branch of the staging.git tree. Hi KH, I will rebase them on staging-next and resend soon. Thanks a lot! regards, heesub
[PATCH v2 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page(). Signed-off-by: Heesub Shin --- drivers/staging/android/ion/ion_system_heap.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index f0ae210..d78d589e 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, } static void free_buffer_page(struct ion_system_heap *heap, -struct ion_buffer *buffer, struct page *page, -unsigned int order) +struct ion_buffer *buffer, struct page *page) { + unsigned int order = compound_order(page); bool cached = ion_buffer_cached(buffer); if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) { @@ -169,7 +169,7 @@ free_table: kfree(table); free_pages: list_for_each_entry_safe(page, tmp_page, &pages, lru) - free_buffer_page(sys_heap, buffer, page, compound_order(page)); + free_buffer_page(sys_heap, buffer, page); return -ENOMEM; } @@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer) ion_heap_buffer_zero(buffer); for_each_sg(table->sgl, sg, table->nents, i) - free_buffer_page(sys_heap, buffer, sg_page(sg), - get_order(sg->length)); + free_buffer_page(sys_heap, buffer, sg_page(sg)); sg_free_table(table); kfree(table); } -- 1.9.1
[PATCH v2 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies the code and reduces slab overhead.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)
 
 struct ion_system_heap {
 	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };
 
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 	struct ion_system_heap *heap;
 	int i;
 
-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
 	if (!heap)
 		return ERR_PTR(-ENOMEM);
 	heap->heap.ops = &system_heap_ops;
 	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
 	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
 	for (i = 0; i < num_orders; i++) {
 		struct ion_page_pool *pool;
 		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
 	while (i--)
 		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
 	kfree(heap);
 	return ERR_PTR(-ENOMEM);
 }
-- 
1.9.1
[PATCH v2 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether highmem pages or only lowmem pages should be freed. Usually, the slab shrinker is invoked in kswapd context, where gfp_mask is always GFP_KERNEL, so only lowmem pages are released in kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds an excessive amount of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both lowmem and highmem in kswapd context.

Reported-by: Wonseo Choi
Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include <linux/swap.h>
 #include "ion_priv.h"
 
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
 	int freed;
 	bool high;
 
-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);
 
 	if (nr_to_scan == 0)
 		return ion_page_pool_total(pool, high);
-- 
1.9.1
[PATCH v2 6/9] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build the scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such a data type anymore.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +++++++++++++++--------------------------------
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
 	struct ion_page_pool **pools;
 };
 
-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 				      struct ion_buffer *buffer,
 				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }
 
-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
 	struct page *page;
-	struct page_info *info;
 	int i;
 
-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
 	for (i = 0; i < num_orders; i++) {
 		if (size < order_to_size(orders[i]))
 			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
 		if (!page)
 			continue;
 
-		info->page = page;
-		return info;
+		return page;
 	}
-	kfree(info);
 
 	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	struct sg_table *table;
 	struct scatterlist *sg;
 	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
 	int i = 0;
 	unsigned long size_remaining = PAGE_ALIGN(size);
 	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	INIT_LIST_HEAD(&pages);
 	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
 			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		goto free_table;
 
 	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
 		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
 	}
 
 	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
 	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
 	return -ENOMEM;
 }
-- 
1.9.1
[PATCH v2 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 
 struct page_info {
 	struct page *page;
-	unsigned int order;
 	struct list_head list;
 };
 
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 			continue;
 
 		info->page = page;
-		info->order = orders[i];
 		return info;
 	}
 	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		if (!info)
 			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,7 +183,8 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
 		kfree(info);
 	}
 	return -ENOMEM;
-- 
1.9.1
[PATCH v2 4/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 111777c..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -95,6 +95,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
 	int ret;
 
+	BUG_ON(pool->order != compound_order(page));
+
 	ret = ion_page_pool_add(pool, page);
 	if (ret)
 		ion_page_pool_free_pages(pool, page);
@@ -150,7 +152,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
 	pool->low_count = 0;
 	INIT_LIST_HEAD(&pool->low_items);
 	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
 	pool->order = order;
 	mutex_init(&pool->mutex);
 	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		if (order > 4)
 			gfp_flags = high_order_gfp_flags;
-		page = alloc_pages(gfp_flags, order);
+		page = alloc_pages(gfp_flags | __GFP_COMP, order);
 		if (!page)
 			return NULL;
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
-- 
1.9.1
[PATCH v2 3/9] staging: ion: remove struct ion_page_pool_item
The page pool uses an internal data structure, ion_page_pool_item, for wrapping pooled pages and constructing a list. As the struct page already provides ways for doing exactly the same thing, we do not need to reinvent the wheel. This commit removes the data structure and the slab allocations for it.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..111777c 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"
 
-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
 	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
 	mutex_lock(&pool->mutex);
-	item->page = page;
 	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
 		pool->high_count++;
 	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
 		pool->low_count++;
 	}
 	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
 	struct page *page;
 
 	if (high) {
 		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
 		pool->high_count--;
 	} else {
 		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
 		pool->low_count--;
 	}
 
-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
 	return page;
 }
-- 
1.9.1
[PATCH v2 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return types.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   |  2 +-
 drivers/staging/android/ion/ion_priv.h        |  2 +-
 drivers/staging/android/ion/ion_system_heap.c | 57 ---
 3 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ecb5fc3..ead4054 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 	return page;
 }
 
-void *ion_page_pool_alloc(struct ion_page_pool *pool)
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
 {
 	struct page *page = NULL;
 
diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h
index 1eba3f2..365c947 100644
--- a/drivers/staging/android/ion/ion_priv.h
+++ b/drivers/staging/android/ion/ion_priv.h
@@ -377,7 +377,7 @@ struct ion_page_pool {
 struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order);
 void ion_page_pool_destroy(struct ion_page_pool *);
-void *ion_page_pool_alloc(struct ion_page_pool *);
+struct page *ion_page_pool_alloc(struct ion_page_pool *);
 void ion_page_pool_free(struct ion_page_pool *, struct page *);
 
 /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index c923633..b554751 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -41,7 +41,7 @@ static int order_to_index(unsigned int order)
 	return -1;
 }
 
-static unsigned int order_to_size(int order)
+static inline unsigned int order_to_size(int order)
 {
 	return PAGE_SIZE << order;
 }
@@ -78,8 +78,6 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
					  DMA_BIDIRECTIONAL);
 	}
-	if (!page)
-		return NULL;
 
 	return page;
 }
@@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 
 		info->page = page;
 		info->order = orders[i];
-		INIT_LIST_HEAD(&info->list);
 		return info;
 	}
 	kfree(info);
@@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
							heap);
 	struct sg_table *table;
 	struct scatterlist *sg;
-	int ret;
 	struct list_head pages;
 	struct page_info *info, *tmp_info;
 	int i = 0;
@@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		info = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
 		if (!info)
-			goto err;
+			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= (1 << info->order) * PAGE_SIZE;
+		size_remaining -= PAGE_SIZE << info->order;
 		max_order = info->order;
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
 	if (!table)
-		goto err;
+		goto free_pages;
 
-	ret = sg_alloc_table(table, i, GFP_KERNEL);
-	if (ret)
-		goto err1;
+	if (sg_alloc_table(table, i, GFP_KERNEL))
+		goto free_table;
 
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, (1 << info->order) * PAGE_SIZE, 0);
+		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 
 	buffer->priv_virt = table;
 	return 0;
-err1:
+
+free_table:
 	kfree(table);
-err:
+free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		free_buffer_page(sys_heap, buffer, info->page, info->order);
 		kfree(info);
 	}
@@ -197,14 +193,12 @@ err:
 
 static void ion_system_heap_free(struct ion_buffer *buffer)
 {
-	struct ion_heap *heap = buffer->heap;
-	struct ion_system_heap *sys_heap = container_of(heap,
+	struct ion_system_heap *sys_heap = container_of(buffer->heap,
							struct ion_system_heap,
							heap);
[PATCH v2 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts highmem pages in or not. This commit simplifies the code for better readability.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 
 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;
 
-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-			pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }
 
 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
-- 
1.9.1
[PATCH v2 0/9] staging: ion: system heap and page pool fixes
Hi,

Here is my patchset with some modifications, hoping for reviews or comments from you guys.

v2:
 o No changes in the code, just reworded changelog
 o Reorder patch

Heesub Shin (9):
  staging: ion: tidy up a bit
  staging: ion: simplify ion_page_pool_total()
  staging: ion: remove struct ion_page_pool_item
  staging: ion: use compound pages on high order pages for system heap
  staging: ion: remove order from struct page_info
  staging: ion: remove struct page_info
  staging: ion: remove order argument from free_buffer_page()
  staging: ion: shrink highmem pages on kswapd
  staging: ion: optimize struct ion_system_heap

 drivers/staging/android/ion/ion_page_pool.c   |  49 ---
 drivers/staging/android/ion/ion_priv.h        |   2 +-
 drivers/staging/android/ion/ion_system_heap.c | 121 ++
 3 files changed, 67 insertions(+), 105 deletions(-)

-- 
1.9.1
[PATCH v2 0/9] staging: ion: system heap and page pool fixes
Hi, Here is my patchset with some modification, hoping reviews or comments from you guys. v2: o No changes in the code, just reworded changelog o Reorder patch Heesub Shin (9): staging: ion: tidy up a bit staging: ion: simplify ion_page_pool_total() staging: ion: remove struct ion_page_pool_item staging: ion: use compound pages on high order pages for system heap staging: ion: remove order from struct page_info staging: ion: remove struct page_info staging: ion: remove order argument from free_buffer_page() staging: ion: shrink highmem pages on kswapd staging: ion: optimize struct ion_system_heap drivers/staging/android/ion/ion_page_pool.c | 49 --- drivers/staging/android/ion/ion_priv.h| 2 +- drivers/staging/android/ion/ion_system_heap.c | 121 ++ 3 files changed, 67 insertions(+), 105 deletions(-) -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts highmem pages in or not. This commit simplifies the code lines for better readability. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index ead4054..1684780 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page) static int ion_page_pool_total(struct ion_page_pool *pool, bool high) { - int total = 0; + int count = pool-low_count; - total += high ? (pool-high_count + pool-low_count) * - (1 pool-order) : - pool-low_count * (1 pool-order); - return total; + if (high) + count += pool-high_count; + + return count pool-order; } int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask, -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 2 +- drivers/staging/android/ion/ion_priv.h| 2 +- drivers/staging/android/ion/ion_system_heap.c | 57 --- 3 files changed, 28 insertions(+), 33 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index ecb5fc3..ead4054 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high) return page; } -void *ion_page_pool_alloc(struct ion_page_pool *pool) +struct page *ion_page_pool_alloc(struct ion_page_pool *pool) { struct page *page = NULL; diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h index 1eba3f2..365c947 100644 --- a/drivers/staging/android/ion/ion_priv.h +++ b/drivers/staging/android/ion/ion_priv.h @@ -377,7 +377,7 @@ struct ion_page_pool { struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order); void ion_page_pool_destroy(struct ion_page_pool *); -void *ion_page_pool_alloc(struct ion_page_pool *); +struct page *ion_page_pool_alloc(struct ion_page_pool *); void ion_page_pool_free(struct ion_page_pool *, struct page *); /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index c923633..b554751 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -41,7 +41,7 @@ static int order_to_index(unsigned int order) return -1; } -static unsigned int order_to_size(int order) +static inline unsigned int order_to_size(int order) { return PAGE_SIZE order; } @@ -78,8 +78,6 @@ static struct page 
*alloc_buffer_page(struct ion_system_heap *heap, ion_pages_sync_for_device(NULL, page, PAGE_SIZE order, DMA_BIDIRECTIONAL); } - if (!page) - return NULL; return page; } @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap, info-page = page; info-order = orders[i]; - INIT_LIST_HEAD(info-list); return info; } kfree(info); @@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap, heap); struct sg_table *table; struct scatterlist *sg; - int ret; struct list_head pages; struct page_info *info, *tmp_info; int i = 0; @@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap, info = alloc_largest_available(sys_heap, buffer, size_remaining, max_order); if (!info) - goto err; + goto free_pages; list_add_tail(info-list, pages); - size_remaining -= (1 info-order) * PAGE_SIZE; + size_remaining -= PAGE_SIZE info-order; max_order = info-order; i++; } table = kzalloc(sizeof(struct sg_table), GFP_KERNEL); if (!table) - goto err; + goto free_pages; - ret = sg_alloc_table(table, i, GFP_KERNEL); - if (ret) - goto err1; + if (sg_alloc_table(table, i, GFP_KERNEL)) + goto free_table; sg = table-sgl; list_for_each_entry_safe(info, tmp_info, pages, list) { struct page *page = info-page; - sg_set_page(sg, page, (1 info-order) * PAGE_SIZE, 0); + sg_set_page(sg, page, PAGE_SIZE info-order, 0); sg = sg_next(sg); list_del(info-list); kfree(info); @@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap, buffer-priv_virt = table; return 0; -err1: + +free_table: kfree(table); -err: +free_pages: list_for_each_entry_safe(info, tmp_info, pages, list) { free_buffer_page(sys_heap, buffer, info-page, info-order); kfree(info); @@ -197,14 +193,12 @@ err: static void ion_system_heap_free(struct ion_buffer *buffer) { - struct ion_heap *heap = buffer-heap; - struct ion_system_heap *sys_heap = container_of(heap, + struct ion_system_heap *sys_heap = container_of(buffer-heap, struct ion_system_heap, 
heap
[PATCH v2 3/9] staging: ion: remove struct ion_page_pool_item
The page pool uses an internal data structure, ion_page_pool_item, for wrapping pooled pages and constructing a list. As the struct page already provides ways for doing exactly the same thing, we do not need to reinvent the wheel. This commit removes the data structure and slab allocations for it. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 27 +-- 1 file changed, 5 insertions(+), 22 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index 1684780..111777c 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -23,11 +23,6 @@ #include linux/slab.h #include ion_priv.h -struct ion_page_pool_item { - struct page *page; - struct list_head list; -}; - static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool) { struct page *page = alloc_pages(pool-gfp_mask, pool-order); @@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool, static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page) { - struct ion_page_pool_item *item; - - item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL); - if (!item) - return -ENOMEM; - mutex_lock(pool-mutex); - item-page = page; if (PageHighMem(page)) { - list_add_tail(item-list, pool-high_items); + list_add_tail(page-lru, pool-high_items); pool-high_count++; } else { - list_add_tail(item-list, pool-low_items); + list_add_tail(page-lru, pool-low_items); pool-low_count++; } mutex_unlock(pool-mutex); @@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page) static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high) { - struct ion_page_pool_item *item; struct page *page; if (high) { BUG_ON(!pool-high_count); - item = list_first_entry(pool-high_items, - struct ion_page_pool_item, list); + page = list_first_entry(pool-high_items, struct page, lru); 
pool-high_count--; } else { BUG_ON(!pool-low_count); - item = list_first_entry(pool-low_items, - struct ion_page_pool_item, list); + page = list_first_entry(pool-low_items, struct page, lru); pool-low_count--; } - list_del(item-list); - page = item-page; - kfree(item); + list_del(page-lru); return page; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 4/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves burden on tracking the meta information which are currently stored in page_info. Signed-off-by: Heesub Shin heesub.s...@samsung.com --- drivers/staging/android/ion/ion_page_pool.c | 4 +++- drivers/staging/android/ion/ion_system_heap.c | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c index 111777c..c1cea42b 100644 --- a/drivers/staging/android/ion/ion_page_pool.c +++ b/drivers/staging/android/ion/ion_page_pool.c @@ -95,6 +95,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page) { int ret; + BUG_ON(pool-order != compound_order(page)); + ret = ion_page_pool_add(pool, page); if (ret) ion_page_pool_free_pages(pool, page); @@ -150,7 +152,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order) pool-low_count = 0; INIT_LIST_HEAD(pool-low_items); INIT_LIST_HEAD(pool-high_items); - pool-gfp_mask = gfp_mask; + pool-gfp_mask = gfp_mask | __GFP_COMP; pool-order = order; mutex_init(pool-mutex); plist_node_init(pool-list, order); diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c index b554751..48e6a58 100644 --- a/drivers/staging/android/ion/ion_system_heap.c +++ b/drivers/staging/android/ion/ion_system_heap.c @@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap, if (order 4) gfp_flags = high_order_gfp_flags; - page = alloc_pages(gfp_flags, order); + page = alloc_pages(gfp_flags | __GFP_COMP, order); if (!page) return NULL; ion_pages_sync_for_device(NULL, page, PAGE_SIZE order, -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 struct page_info {
	struct page *page;
-	unsigned int order;
	struct list_head list;
 };
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
			continue;

		info->page = page;
-		info->order = orders[i];
		return info;
	}
	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		if (!info)
			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,7 +183,8 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
		kfree(info);
	}
	return -ENOMEM;
--
1.9.1
[PATCH v2 6/9] staging: ion: remove struct page_info
ION system heap creates a temporary list of pages to build the scatter/gather table, introducing an internal data type, page_info. Now that the order field has been removed from it, we do not need to depend on such a data type anymore.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +--
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
	struct ion_page_pool **pools;
 };

-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
				      struct ion_buffer *buffer,
				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }

-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
	struct page *page;
-	struct page_info *info;
	int i;

-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
	for (i = 0; i < num_orders; i++) {
		if (size < order_to_size(orders[i]))
			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
		if (!page)
			continue;

-		info->page = page;
-		return info;
+		return page;
	}
-	kfree(info);

	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	struct sg_table *table;
	struct scatterlist *sg;
	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
	int i = 0;
	unsigned long size_remaining = PAGE_ALIGN(size);
	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	INIT_LIST_HEAD(&pages);
	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		goto free_table;

	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
	}

	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
	return -ENOMEM;
 }
--
1.9.1
[PATCH v2 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem need to be freed. Usually, the slab shrinker is invoked on the kswapd context, whose gfp_mask is always GFP_KERNEL, so only lowmem pages are released on the kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both the lowmem and highmem zones on the kswapd context.

Reported-by: Wonseo Choi wonseo.c...@samsung.com
Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"

 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
	int freed;
	bool high;

-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);

	if (nr_to_scan == 0)
		return ion_page_pool_total(pool, high);
--
1.9.1
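The shrink decision this patch changes can be sketched in plain C. This is a userspace analogue under stated assumptions: `FAKE_GFP_HIGHMEM` is an invented stand-in for the kernel's __GFP_HIGHMEM bit (the real value varies), and the `is_kswapd` flag replaces current_is_kswapd().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant standing in for the kernel's __GFP_HIGHMEM bit. */
#define FAKE_GFP_HIGHMEM 0x02u

/* Sketch of the decision after the patch: kswapd may reclaim highmem
 * pool pages even though its gfp_mask lacks the highmem bit. */
static bool shrink_highmem(unsigned int gfp_mask, bool is_kswapd)
{
    if (is_kswapd)
        return true;                /* kswapd: reclaim both zones */
    return (gfp_mask & FAKE_GFP_HIGHMEM) != 0;  /* direct reclaim: as before */
}
```

Before the patch, only the second branch existed, so a GFP_KERNEL mask (no highmem bit) on kswapd never released pooled highmem pages.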
[PATCH v2 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable length array simplifies the code and reduces overhead on the slab.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)

 struct ion_system_heap {
	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };

 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
	struct ion_system_heap *heap;
	int i;

-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
	if (!heap)
		return ERR_PTR(-ENOMEM);
	heap->heap.ops = &system_heap_ops;
	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
	for (i = 0; i < num_orders; i++) {
		struct ion_page_pool *pool;
		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
	while (i--)
		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
	kfree(heap);
	return ERR_PTR(-ENOMEM);
 }
--
1.9.1
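The single-allocation trick used above is plain C and can be shown standalone. Note the patch uses the older GNU `[0]` idiom; standard C99 spells the flexible array member `[]`, as in this sketch. All names here are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pool { int order; };         /* stand-in for struct ion_page_pool */

struct sys_heap {
    int flags;
    struct pool *pools[];           /* C99 flexible array member */
};

/* One calloc covers the struct and the trailing pointer array, replacing
 * the two separate allocations the original code performed. */
static struct sys_heap *heap_create(size_t num_orders)
{
    return calloc(1, sizeof(struct sys_heap) +
                     sizeof(struct pool *) * num_orders);
}
```

The error path also shrinks: a single free releases both the structure and the array, so the separate `free_heap` label in the original code becomes unnecessary.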
[PATCH v2 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }

 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
	bool cached = ion_buffer_cached(buffer);

	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
	return -ENOMEM;
 }

@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
	ion_heap_buffer_zero(buffer);

	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
	sg_free_table(table);
	kfree(table);
 }
--
1.9.1
Re: [PATCH 1/9] staging: ion: tidy up a bit
Hello Carpenter,

On 05/26/2014 07:36 PM, Dan Carpenter wrote:

  On Mon, May 26, 2014 at 07:04:53PM +0900, Heesub Shin wrote:

    @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
		info->page = page;
		info->order = orders[i];
    -		INIT_LIST_HEAD(&info->list);
		return info;
	}
	kfree(info);

  Wait. How does this code work without that INIT_LIST_HEAD()? What am I missing here...

No problem. As the object info is just a node, not a head, it is completely useless to initialize it as a list head.

regards,
Heesub

  regards,
  dan carpenter
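Heesub's answer can be demonstrated with a minimal intrusive list mimicking the kernel's list_head: list_add_tail() unconditionally overwrites both pointers of the node being inserted, so an uninitialized node is fine. This is a simplified sketch, not the kernel's actual implementation.

```c
#include <assert.h>

/* Minimal intrusive list in the style of the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

/* Insert node before head, i.e. at the tail of the circular list. Both of
 * node's pointers are written before anything reads them, which is why a
 * node (unlike a head) needs no INIT_LIST_HEAD. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}
```

A head, by contrast, must be initialized (to point at itself) because list_add_tail() reads `head->prev` before writing anything.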
Re: [PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Hello,

On 05/26/2014 07:04 PM, Heesub Shin wrote:

  Now that the order information is held on struct page itself, we do not need to use extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Oops. I need to amend the changelog above and resend this patchset.

regards,
Heesub
[PATCH 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking down the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store page order in struct page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 struct page_info {
	struct page *page;
-	unsigned int order;
	struct list_head list;
 };
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
			continue;

		info->page = page;
-		info->order = orders[i];
		return info;
	}
	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		if (!info)
			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,7 +183,8 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
		kfree(info);
	}
	return -ENOMEM;
--
1.9.1
[PATCH 6/9] staging: ion: remove struct page_info
ION system heap uses a temporary list holding meta data on the pages allocated to build the scatter/gather table. Now that the pages are compound pages, we do not need to introduce a new data type redundantly.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +--
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
	struct ion_page_pool **pools;
 };

-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
				      struct ion_buffer *buffer,
				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }

-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
	struct page *page;
-	struct page_info *info;
	int i;

-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
	for (i = 0; i < num_orders; i++) {
		if (size < order_to_size(orders[i]))
			continue;
@@ -119,10 +109,8 @@ static struct page *alloc_largest_available(struct ion_system_heap *heap,
		if (!page)
			continue;

-		info->page = page;
-		return info;
+		return page;
	}
-	kfree(info);

	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	struct sg_table *table;
	struct scatterlist *sg;
	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
	int i = 0;
	unsigned long size_remaining = PAGE_ALIGN(size);
	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
	INIT_LIST_HEAD(&pages);
	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
-		if (!info)
+		if (!page)
			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		goto free_table;

	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
	}

	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
	return -ENOMEM;
 }
--
1.9.1
[PATCH 3/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..0141b3b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -112,6 +112,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
	int ret;

+	BUG_ON(pool->order != compound_order(page));
+
	ret = ion_page_pool_add(pool, page);
	if (ret)
		ion_page_pool_free_pages(pool, page);
@@ -167,7 +169,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
	pool->low_count = 0;
	INIT_LIST_HEAD(&pool->low_items);
	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
	pool->order = order;
	mutex_init(&pool->mutex);
	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
	if (order > 4)
		gfp_flags = high_order_gfp_flags;

-	page = alloc_pages(gfp_flags, order);
+	page = alloc_pages(gfp_flags | __GFP_COMP, order);
	if (!page)
		return NULL;
	ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
--
1.9.1
[PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Now that the order information is held on struct page itself, we do not need to use extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +--
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 0141b3b..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"

-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
	mutex_lock(&pool->mutex);
-	item->page = page;
	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
		pool->high_count++;
	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
		pool->low_count++;
	}
	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
	struct page *page;

	if (high) {
		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
		pool->high_count--;
	} else {
		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
		pool->low_count--;
	}

-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
	return page;
 }
--
1.9.1
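The pattern this patch applies, embedding the list link in the object itself (as struct page embeds `lru`) instead of kmalloc'ing a wrapper per pooled item, can be shown in a standalone sketch. The helpers below are simplified analogues of the kernel's list macros, and `fake_page` is an invented stand-in for struct page.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

struct fake_page {
    int id;
    struct list_head lru;   /* plays the role of page->lru */
};

/* Simplified analogues of the kernel helpers the patch relies on. */
static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* list_first_entry analogue: recover the containing object from its
 * embedded link by subtracting the member offset. */
#define first_entry(head, type, member) \
    ((type *)((char *)(head)->next - offsetof(type, member)))
```

Because the link lives inside the object, pooling an item is just pointer surgery: no kmalloc on add, no kfree on remove, which is exactly the slab saving the changelog claims.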
[PATCH 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable length array simplifies the code and reduces overhead on the slab.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)

 struct ion_system_heap {
	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };

 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
	struct ion_system_heap *heap;
	int i;

-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+		       sizeof(struct ion_page_pool *) * num_orders,
+		       GFP_KERNEL);
	if (!heap)
		return ERR_PTR(-ENOMEM);
	heap->heap.ops = &system_heap_ops;
	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
	for (i = 0; i < num_orders; i++) {
		struct ion_page_pool *pool;
		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
	while (i--)
		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
	kfree(heap);
	return ERR_PTR(-ENOMEM);
 }
--
1.9.1
[PATCH 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem need to be freed. Usually, the slab shrinker is invoked on the kswapd context, whose gfp_mask is always GFP_KERNEL, so only lowmem pages are released on the kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both the lowmem and highmem zones on the kswapd context.

Reported-by: Wonseo Choi
Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"

 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
	int freed;
	bool high;

-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);

	if (nr_to_scan == 0)
		return ion_page_pool_total(pool, high);
--
1.9.1
[PATCH 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_system_heap.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }

 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
	bool cached = ion_buffer_cached(buffer);

	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
	kfree(table);
 free_pages:
	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
	return -ENOMEM;
 }

@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
	ion_heap_buffer_zero(buffer);

	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
	sg_free_table(table);
	kfree(table);
 }
--
1.9.1
[PATCH 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it either counts pages from the highmem zone in or not. This commit simplifies the code for better readability.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)

 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;

-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-		pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }

 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
--
1.9.1
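The rewrite above is purely mechanical: each pooled block holds 1 << order pages, and the ternary expression and the shifted sum compute the same page count. A small standalone check of the equivalence, with the pool fields passed as plain arguments:

```c
#include <assert.h>

/* The original ion_page_pool_total() expression, verbatim in shape. */
static int total_old(int low_count, int high_count, int order, int high)
{
    return high ? (high_count + low_count) * (1 << order)
                : low_count * (1 << order);
}

/* The simplified version from the patch: build the block count first,
 * then convert blocks to pages with a single shift. */
static int total_new(int low_count, int high_count, int order, int high)
{
    int count = low_count;

    if (high)
        count += high_count;

    return count << order;
}
```

Both return pages, not blocks, which is what the shrinker core expects from the count callback.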
[PATCH 1/9] staging: ion: tidy up a bit
For aesthetics and readability, rename goto labels, remove useless code lines, and clarify function return type.

Signed-off-by: Heesub Shin
---
 drivers/staging/android/ion/ion_page_pool.c   |  2 +-
 drivers/staging/android/ion/ion_priv.h        |  2 +-
 drivers/staging/android/ion/ion_system_heap.c | 57 ---
 3 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ecb5fc3..ead4054 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -89,7 +89,7 @@ static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
	return page;
 }

-void *ion_page_pool_alloc(struct ion_page_pool *pool)
+struct page *ion_page_pool_alloc(struct ion_page_pool *pool)
 {
	struct page *page = NULL;

diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h
index 1eba3f2..365c947 100644
--- a/drivers/staging/android/ion/ion_priv.h
+++ b/drivers/staging/android/ion/ion_priv.h
@@ -377,7 +377,7 @@ struct ion_page_pool {

 struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order);
 void ion_page_pool_destroy(struct ion_page_pool *);
-void *ion_page_pool_alloc(struct ion_page_pool *);
+struct page *ion_page_pool_alloc(struct ion_page_pool *);
 void ion_page_pool_free(struct ion_page_pool *, struct page *);

 /** ion_page_pool_shrink - shrinks the size of the memory cached in the pool
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index c923633..b554751 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -41,7 +41,7 @@ static int order_to_index(unsigned int order)
	return -1;
 }

-static unsigned int order_to_size(int order)
+static inline unsigned int order_to_size(int order)
 {
	return PAGE_SIZE << order;
 }
@@ -78,8 +78,6 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
					  DMA_BIDIRECTIONAL);
	}
-	if (!page)
-		return NULL;

	return page;
 }
@@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,

		info->page = page;
		info->order = orders[i];
-		INIT_LIST_HEAD(&info->list);
		return info;
	}
	kfree(info);
@@ -142,7 +139,6 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
							heap);
	struct sg_table *table;
	struct scatterlist *sg;
-	int ret;
	struct list_head pages;
	struct page_info *info, *tmp_info;
	int i = 0;
@@ -160,24 +156,23 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
		info = alloc_largest_available(sys_heap, buffer, size_remaining,
					       max_order);
		if (!info)
-			goto err;
+			goto free_pages;
		list_add_tail(&info->list, &pages);
-		size_remaining -= (1 << info->order) * PAGE_SIZE;
+		size_remaining -= PAGE_SIZE << info->order;
		max_order = info->order;
		i++;
	}
	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
	if (!table)
-		goto err;
+		goto free_pages;

-	ret = sg_alloc_table(table, i, GFP_KERNEL);
-	if (ret)
-		goto err1;
+	if (sg_alloc_table(table, i, GFP_KERNEL))
+		goto free_table;

	sg = table->sgl;
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		struct page *page = info->page;
-		sg_set_page(sg, page, (1 << info->order) * PAGE_SIZE, 0);
+		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
		sg = sg_next(sg);
		list_del(&info->list);
		kfree(info);
@@ -185,9 +180,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,

	buffer->priv_virt = table;
	return 0;
-err1:
+
+free_table:
	kfree(table);
-err:
+free_pages:
	list_for_each_entry_safe(info, tmp_info, &pages, list) {
		free_buffer_page(sys_heap, buffer, info->page, info->order);
		kfree(info);
	}
@@ -197,14 +193,12 @@ err:

 static void ion_system_heap_free(struct ion_buffer *buffer)
 {
-	struct ion_heap *heap = buffer->heap;
-	struct ion_system_heap *sys_heap = container_of(heap,
+	struct ion_system_heap *sys_heap = container_of(buffer->heap,
							struct ion_system_heap,
[PATCH 7/9] staging: ion: remove order argument from free_buffer_page()
Now that the pages returned from the pool are compound pages, we do not need to pass the order information to free_buffer_page().

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index f0ae210..d78d589e 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -77,9 +77,9 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 }
 
 static void free_buffer_page(struct ion_system_heap *heap,
-			     struct ion_buffer *buffer, struct page *page,
-			     unsigned int order)
+			     struct ion_buffer *buffer, struct page *page)
 {
+	unsigned int order = compound_order(page);
 	bool cached = ion_buffer_cached(buffer);
 
 	if (!cached && !(buffer->private_flags & ION_PRIV_FLAG_SHRINKER_FREE)) {
@@ -169,7 +169,7 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(page, tmp_page, &pages, lru)
-		free_buffer_page(sys_heap, buffer, page, compound_order(page));
+		free_buffer_page(sys_heap, buffer, page);
 
 	return -ENOMEM;
 }
@@ -189,8 +189,7 @@ static void ion_system_heap_free(struct ion_buffer *buffer)
 		ion_heap_buffer_zero(buffer);
 
 	for_each_sg(table->sgl, sg, table->nents, i)
-		free_buffer_page(sys_heap, buffer, sg_page(sg),
-				 get_order(sg->length));
+		free_buffer_page(sys_heap, buffer, sg_page(sg));
 	sg_free_table(table);
 	kfree(table);
 }
-- 
1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/9] staging: ion: simplify ion_page_pool_total()
ion_page_pool_total() returns the total number of pages in the pool. Depending on the argument passed, it counts pages from the highmem zone in or not. This commit simplifies the code lines for better readability.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index ead4054..1684780 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -119,12 +119,12 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 
 static int ion_page_pool_total(struct ion_page_pool *pool, bool high)
 {
-	int total = 0;
+	int count = pool->low_count;
 
-	total += high ? (pool->high_count + pool->low_count) *
-		(1 << pool->order) :
-			pool->low_count * (1 << pool->order);
-	return total;
+	if (high)
+		count += pool->high_count;
+
+	return count << pool->order;
 }
 
 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
-- 
1.9.1
[PATCH 8/9] staging: ion: shrink highmem pages on kswapd
ION system heap keeps pages in its pool for better performance. When the system is under memory pressure, the slab shrinker calls the registered callback and the pooled pages get freed. When the shrinker is called, it checks gfp_mask and determines whether pages from highmem or from lowmem should be freed. Usually, the slab shrinker is invoked in kswapd context, where gfp_mask is always GFP_KERNEL, so only lowmem pages are released in kswapd context. This means that highmem pages in the pool are never reclaimed until direct reclaim occurs. This can be problematic when the page pool holds excessive amounts of highmem. For now, the shrinker callback cannot know exactly which zone should be targeted for reclamation, as enough information is not passed to it. Thus, it makes sense to shrink both lowmem and highmem zones in kswapd context.

Reported-by: Wonseo Choi wonseo.c...@samsung.com
Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index c1cea42b..5864f3d 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/module.h>
 #include <linux/slab.h>
+#include <linux/swap.h>
 #include "ion_priv.h"
 
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -118,7 +119,10 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
 	int freed;
 	bool high;
 
-	high = !!(gfp_mask & __GFP_HIGHMEM);
+	if (current_is_kswapd())
+		high = 1;
+	else
+		high = !!(gfp_mask & __GFP_HIGHMEM);
 
 	if (nr_to_scan == 0)
 		return ion_page_pool_total(pool, high);
-- 
1.9.1
[PATCH 9/9] staging: ion: optimize struct ion_system_heap
struct ion_system_heap has an array for storing pointers to page pools, and it is allocated separately from the containing structure. There is no point in allocating those two small objects individually, bothering the slab allocator. Using a variable-length array simplifies the code and reduces slab overhead.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index d78d589e..690d866 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -48,7 +48,7 @@ static inline unsigned int order_to_size(int order)
 
 struct ion_system_heap {
 	struct ion_heap heap;
-	struct ion_page_pool **pools;
+	struct ion_page_pool *pools[0];
 };
 
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
@@ -259,16 +259,15 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 	struct ion_system_heap *heap;
 	int i;
 
-	heap = kzalloc(sizeof(struct ion_system_heap), GFP_KERNEL);
+	heap = kzalloc(sizeof(struct ion_system_heap) +
+			sizeof(struct ion_page_pool *) * num_orders,
+			GFP_KERNEL);
 	if (!heap)
 		return ERR_PTR(-ENOMEM);
 	heap->heap.ops = &system_heap_ops;
 	heap->heap.type = ION_HEAP_TYPE_SYSTEM;
 	heap->heap.flags = ION_HEAP_FLAG_DEFER_FREE;
-	heap->pools = kzalloc(sizeof(struct ion_page_pool *) * num_orders,
-			      GFP_KERNEL);
-	if (!heap->pools)
-		goto free_heap;
+
 	for (i = 0; i < num_orders; i++) {
 		struct ion_page_pool *pool;
 		gfp_t gfp_flags = low_order_gfp_flags;
@@ -287,8 +286,6 @@ struct ion_heap *ion_system_heap_create(struct ion_platform_heap *unused)
 destroy_pools:
 	while (i--)
 		ion_page_pool_destroy(heap->pools[i]);
-	kfree(heap->pools);
-free_heap:
 	kfree(heap);
 	return ERR_PTR(-ENOMEM);
 }
-- 
1.9.1
[PATCH 3/9] staging: ion: use compound pages on high order pages for system heap
Using compound pages relieves the burden of tracking the meta information which is currently stored in page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c   | 4 +++-
 drivers/staging/android/ion/ion_system_heap.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1684780..0141b3b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -112,6 +112,8 @@ void ion_page_pool_free(struct ion_page_pool *pool, struct page *page)
 {
 	int ret;
 
+	BUG_ON(pool->order != compound_order(page));
+
 	ret = ion_page_pool_add(pool, page);
 	if (ret)
 		ion_page_pool_free_pages(pool, page);
@@ -167,7 +169,7 @@ struct ion_page_pool *ion_page_pool_create(gfp_t gfp_mask, unsigned int order)
 	pool->low_count = 0;
 	INIT_LIST_HEAD(&pool->low_items);
 	INIT_LIST_HEAD(&pool->high_items);
-	pool->gfp_mask = gfp_mask;
+	pool->gfp_mask = gfp_mask | __GFP_COMP;
 	pool->order = order;
 	mutex_init(&pool->mutex);
 	plist_node_init(&pool->list, order);
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b554751..48e6a58 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -72,7 +72,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 		if (order > 4)
 			gfp_flags = high_order_gfp_flags;
 
-		page = alloc_pages(gfp_flags, order);
+		page = alloc_pages(gfp_flags | __GFP_COMP, order);
 		if (!page)
 			return NULL;
 		ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
-- 
1.9.1
[PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Now that the order information is held on struct page itself, we do not need to use an extra data structure. This commit reduces unnecessary slab usage for allocating small objects.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_page_pool.c | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 0141b3b..c1cea42b 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -23,11 +23,6 @@
 #include <linux/slab.h>
 #include "ion_priv.h"
 
-struct ion_page_pool_item {
-	struct page *page;
-	struct list_head list;
-};
-
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 {
 	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
@@ -47,19 +42,12 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
 
 static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 {
-	struct ion_page_pool_item *item;
-
-	item = kmalloc(sizeof(struct ion_page_pool_item), GFP_KERNEL);
-	if (!item)
-		return -ENOMEM;
-
 	mutex_lock(&pool->mutex);
-	item->page = page;
 	if (PageHighMem(page)) {
-		list_add_tail(&item->list, &pool->high_items);
+		list_add_tail(&page->lru, &pool->high_items);
 		pool->high_count++;
 	} else {
-		list_add_tail(&item->list, &pool->low_items);
+		list_add_tail(&page->lru, &pool->low_items);
 		pool->low_count++;
 	}
 	mutex_unlock(&pool->mutex);
@@ -68,24 +56,19 @@ static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
 
 static struct page *ion_page_pool_remove(struct ion_page_pool *pool, bool high)
 {
-	struct ion_page_pool_item *item;
 	struct page *page;
 
 	if (high) {
 		BUG_ON(!pool->high_count);
-		item = list_first_entry(&pool->high_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->high_items, struct page, lru);
 		pool->high_count--;
 	} else {
 		BUG_ON(!pool->low_count);
-		item = list_first_entry(&pool->low_items,
-					struct ion_page_pool_item, list);
+		page = list_first_entry(&pool->low_items, struct page, lru);
 		pool->low_count--;
 	}
 
-	list_del(&item->list);
-	page = item->page;
-	kfree(item);
+	list_del(&page->lru);
 	return page;
 }
-- 
1.9.1
[PATCH 6/9] staging: ion: remove struct page_info
ION system heap uses a temporary list holding metadata of the allocated pages to build the scatter/gather table. Now that the pages are compound pages, we do not need to introduce a new data type redundantly.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 47 +++++++++----------------
 1 file changed, 15 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 73a2e67..f0ae210 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -51,11 +51,6 @@ struct ion_system_heap {
 	struct ion_page_pool **pools;
 };
 
-struct page_info {
-	struct page *page;
-	struct list_head list;
-};
-
 static struct page *alloc_buffer_page(struct ion_system_heap *heap,
 				      struct ion_buffer *buffer,
 				      unsigned long order)
@@ -96,19 +91,14 @@ static void free_buffer_page(struct ion_system_heap *heap,
 }
 
-static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
-						 struct ion_buffer *buffer,
-						 unsigned long size,
-						 unsigned int max_order)
+static struct page *alloc_largest_available(struct ion_system_heap *heap,
+					    struct ion_buffer *buffer,
+					    unsigned long size,
+					    unsigned int max_order)
 {
 	struct page *page;
-	struct page_info *info;
 	int i;
 
-	info = kmalloc(sizeof(struct page_info), GFP_KERNEL);
-	if (!info)
-		return NULL;
-
 	for (i = 0; i < num_orders; i++) {
 		if (size < order_to_size(orders[i]))
 			continue;
@@ -119,10 +109,8 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 		if (!page)
 			continue;
 
-		info->page = page;
-		return info;
+		return page;
 	}
-	kfree(info);
 
 	return NULL;
 }
@@ -138,7 +126,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	struct sg_table *table;
 	struct scatterlist *sg;
 	struct list_head pages;
-	struct page_info *info, *tmp_info;
+	struct page *page, *tmp_page;
 	int i = 0;
 	unsigned long size_remaining = PAGE_ALIGN(size);
 	unsigned int max_order = orders[0];
@@ -151,13 +139,13 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	INIT_LIST_HEAD(&pages);
 	while (size_remaining > 0) {
-		info = alloc_largest_available(sys_heap, buffer, size_remaining,
+		page = alloc_largest_available(sys_heap, buffer, size_remaining,
 						max_order);
-		if (!info)
+		if (!page)
 			goto free_pages;
-		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << compound_order(info->page);
-		max_order = compound_order(info->page);
+		list_add_tail(&page->lru, &pages);
+		size_remaining -= PAGE_SIZE << compound_order(page);
+		max_order = compound_order(page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -168,12 +156,10 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		goto free_table;
 
 	sg = table->sgl;
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		struct page *page = info->page;
+	list_for_each_entry_safe(page, tmp_page, &pages, lru) {
 		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
-		list_del(&info->list);
-		kfree(info);
+		list_del(&page->lru);
 	}
 
 	buffer->priv_virt = table;
@@ -182,11 +168,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 free_table:
 	kfree(table);
 free_pages:
-	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page,
-				 compound_order(info->page));
-		kfree(info);
-	}
+	list_for_each_entry_safe(page, tmp_page, &pages, lru)
+		free_buffer_page(sys_heap, buffer, page, compound_order(page));
 
 	return -ENOMEM;
 }
-- 
1.9.1
[PATCH 5/9] staging: ion: remove order from struct page_info
ION system heap uses an internal data structure, struct page_info, for tracking the meta information of the pages allocated from the pool. Now that the pool returns compound pages, we don't need to store the page order in struct page_info.

Signed-off-by: Heesub Shin heesub.s...@samsung.com
---
 drivers/staging/android/ion/ion_system_heap.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index 48e6a58..73a2e67 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -53,7 +53,6 @@ struct ion_system_heap {
 
 struct page_info {
 	struct page *page;
-	unsigned int order;
 	struct list_head list;
 };
 
@@ -121,7 +120,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
 			continue;
 
 		info->page = page;
-		info->order = orders[i];
 		return info;
 	}
 	kfree(info);
@@ -158,8 +156,8 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 		if (!info)
 			goto free_pages;
 		list_add_tail(&info->list, &pages);
-		size_remaining -= PAGE_SIZE << info->order;
-		max_order = info->order;
+		size_remaining -= PAGE_SIZE << compound_order(info->page);
+		max_order = compound_order(info->page);
 		i++;
 	}
 	table = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -172,7 +170,7 @@ static int ion_system_heap_allocate(struct ion_heap *heap,
 	sg = table->sgl;
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
 		struct page *page = info->page;
-		sg_set_page(sg, page, PAGE_SIZE << info->order, 0);
+		sg_set_page(sg, page, PAGE_SIZE << compound_order(page), 0);
 		sg = sg_next(sg);
 		list_del(&info->list);
 		kfree(info);
@@ -185,7 +183,8 @@ free_table:
 	kfree(table);
 free_pages:
 	list_for_each_entry_safe(info, tmp_info, &pages, list) {
-		free_buffer_page(sys_heap, buffer, info->page, info->order);
+		free_buffer_page(sys_heap, buffer, info->page,
+				 compound_order(info->page));
 		kfree(info);
 	}
 	return -ENOMEM;
-- 
1.9.1
Re: [PATCH 4/9] staging: ion: remove struct ion_page_pool_item
Hello,

On 05/26/2014 07:04 PM, Heesub Shin wrote:
> Now that the order information is held on struct page itself, we do not
> need to use extra data structure. This commit reduces unnecessary slab
> usage for allocating small objects.

Oops. I need to amend the changelog above and resend this patchset.

regards,
Heesub
Re: [PATCH 1/9] staging: ion: tidy up a bit
Hello Carpenter,

On 05/26/2014 07:36 PM, Dan Carpenter wrote:
> On Mon, May 26, 2014 at 07:04:53PM +0900, Heesub Shin wrote:
>> @@ -124,7 +122,6 @@ static struct page_info *alloc_largest_available(struct ion_system_heap *heap,
>>
>>  		info->page = page;
>>  		info->order = orders[i];
>> -		INIT_LIST_HEAD(&info->list);
>>  		return info;
>>  	}
>>  	kfree(info);
>
> Wait. How does this code work without that INIT_LIST_HEAD()? What am I
> missing here...

No problem. As the object info is just a node, not a head, it is completely useless to initialize it as a list head.

regards,
Heesub

> regards,
> dan carpenter
Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
Hello,

On 05/15/2014 10:53 AM, Joonsoo Kim wrote:
> On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote:
>> Hey Joonsoo,
>>
>> On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote:
>>> CMA is introduced to provide physically contiguous pages at runtime.
>>> For this purpose, it reserves memory at boot time. Although it
>>> reserves memory, this reserved memory can be used for movable memory
>>> allocation requests. This usecase is beneficial to systems that need
>>> this CMA reserved memory infrequently, and it is one of the main
>>> purposes of introducing CMA.
>>>
>>> But, there is a problem in the current implementation. The problem is
>>> that it works like a just-reserved-memory approach. The pages on cma
>>> reserved memory are hardly used for movable memory allocation. This
>>> is caused by a combination of allocation and reclaim policy.
>>>
>>> The pages on cma reserved memory are allocated if there is no movable
>>> memory, that is, as fallback allocation. So the time this fallback
>>> allocation is started is under heavy memory pressure. Although it is
>>> under memory pressure, movable allocation easily succeeds, since
>>> there would be many pages on cma reserved memory. But this is not the
>>> case for unmovable and reclaimable allocation, because they can't use
>>> the pages on cma reserved memory. These allocations regard the
>>> system's free memory as (free pages - free cma pages) on watermark
>>> checking, that is, free unmovable pages + free reclaimable pages +
>>> free movable pages. Because we already exhausted movable pages, the
>>> only free pages we have are unmovable and reclaimable types, and this
>>> would be a really small amount. So watermark checking would fail. It
>>> will wake up kswapd to make enough free memory for unmovable and
>>> reclaimable allocation, and kswapd will do so. So before we fully
>>> utilize pages on cma reserved memory, kswapd starts to reclaim memory
>>> and tries to make free memory over the high watermark. This watermark
>>> checking by kswapd doesn't take care of free cma pages, so many
>>> movable pages would be reclaimed. After that, we have a lot of
>>> movable pages again, so fallback allocation doesn't happen again. To
>>> conclude, the amount of free memory in meminfo, which includes free
>>> CMA pages, is moving around 512 MB if I reserve 512 MB memory for
>>> CMA.
>>>
>>> I found this problem in the following experiment.
>>>
>>> 4 CPUs, 1024 MB, VIRTUAL MACHINE
>>> make -j24
>>>
>>> CMA reserve:        0 MB       512 MB
>>> Elapsed-time:       234.8      361.8
>>> Average-MemFree:    283880 KB  530851 KB
>>>
>>> To solve this problem, I can think of the following 2 possible
>>> solutions.
>>> 1. allocate the pages on cma reserved memory first, and if they are
>>> exhausted, allocate movable pages.
>>> 2. interleaved allocation: try to allocate specific amounts of memory
>>> from cma reserved memory and then allocate from free movable memory.
>>
>> I love this idea but when I see the code, I don't like that. In the
>> allocation path, just try to allocate pages by round-robin so it's the
>> role of the allocator. If one of the migratetypes is full, just pass
>> the mission to the reclaimer with a hint (ie, "Hey reclaimer, it's a
>> non-movable allocation fail so it is pointless if you reclaim
>> MIGRATE_CMA pages") so that the reclaimer can filter it out during
>> page scanning. We already have a tool to achieve it (ie,
>> isolate_mode_t).
>
> Hello,
>
> I agree with leaving the fast allocation path as simple as possible. I
> will remove the runtime computation for determining the ratio in
> __rmqueue_cma() and, instead, will use a pre-computed value calculated
> on the other path.
>
> I am not sure whether your second suggestion ("Hey reclaimer" part) is
> good or not. In my quick thought, that could be helpful in the
> situation that many free cma pages remain. But, it would not be
> helpful when there are neither free movable nor cma pages. In general,
> most workloads mainly use movable pages for page cache or anonymous
> mapping. Although reclaim is triggered by non-movable allocation
> failure, reclaimed pages are used mostly by movable allocation. We can
> handle these allocation requests even if we reclaim the pages just in
> lru order. If we rotate the lru list for finding movable pages, it
> could cause more useful pages to be evicted.
>
> This is just my quick thought, so please let me correct if I am wrong.

We have an out-of-tree implementation that is completely the same with the approach Minchan said and it works, but it has definitely some side-effects as you pointed, distorting the LRU and evicting hot pages. I do not attach code fragments in this thread for some reasons, but it must be easy for yourself. I am wondering if it could help also in your case.

Thanks,
Heesub

>> And we couldn't do it in zone_watermark_ok with set/reset ALLOC_CMA?
>> If possible, it would be better because it's a generic function to
>> check free pages and trigger the reclaim/compaction logic.
>
> I guess, your *it* means ratio computation. Right? I don't like
> putting it on zone_watermark_ok(). Although it needs to refer to the
> free cma pages value which is also referred to in zone_watermark_ok(),
> this computation is for determining ratio,
Re: [RFC PATCH 2/3] CMA: aggressively allocate the pages on cma reserved memory when not used
Hello, On 05/15/2014 10:53 AM, Joonsoo Kim wrote: On Tue, May 13, 2014 at 12:00:57PM +0900, Minchan Kim wrote: Hey Joonsoo, On Thu, May 08, 2014 at 09:32:23AM +0900, Joonsoo Kim wrote: CMA is introduced to provide physically contiguous pages at runtime. For this purpose, it reserves memory at boot time. Although it reserve memory, this reserved memory can be used for movable memory allocation request. This usecase is beneficial to the system that needs this CMA reserved memory infrequently and it is one of main purpose of introducing CMA. But, there is a problem in current implementation. The problem is that it works like as just reserved memory approach. The pages on cma reserved memory are hardly used for movable memory allocation. This is caused by combination of allocation and reclaim policy. The pages on cma reserved memory are allocated if there is no movable memory, that is, as fallback allocation. So the time this fallback allocation is started is under heavy memory pressure. Although it is under memory pressure, movable allocation easily succeed, since there would be many pages on cma reserved memory. But this is not the case for unmovable and reclaimable allocation, because they can't use the pages on cma reserved memory. These allocations regard system's free memory as (free pages - free cma pages) on watermark checking, that is, free unmovable pages + free reclaimable pages + free movable pages. Because we already exhausted movable pages, only free pages we have are unmovable and reclaimable types and this would be really small amount. So watermark checking would be failed. It will wake up kswapd to make enough free memory for unmovable and reclaimable allocation and kswapd will do. So before we fully utilize pages on cma reserved memory, kswapd start to reclaim memory and try to make free memory over the high watermark. This watermark checking by kswapd doesn't take care free cma pages so many movable pages would be reclaimed. 
After that, we have a lot of movable pages again, so the fallback allocation does not happen again. In conclusion, the amount of free memory reported in meminfo, which includes free CMA pages, hovers around 512 MB if I reserve 512 MB for CMA.

I found this problem with the following experiment.

4 CPUs, 1024 MB, VIRTUAL MACHINE
make -j24

CMA reserve:         0 MB        512 MB
Elapsed-time:        234.8       361.8
Average-MemFree:     283880 KB   530851 KB

To solve this problem, I can think of the following two possible solutions:
1. allocate pages from the CMA reserved memory first, and when they are exhausted, allocate movable pages.
2. interleaved allocation: try to allocate specific amounts of memory from the CMA reserved memory, then allocate from free movable memory.

I love this idea, but when I see the code, I don't like that. In the allocation path, just try to allocate pages round-robin; that is the allocator's role. If one migratetype is full, just pass the mission to the reclaimer with a hint (i.e., "Hey reclaimer, it's a non-movable allocation failure, so it is pointless to reclaim MIGRATE_CMA pages") so that the reclaimer can filter them out during page scanning. We already have a tool to achieve that (i.e., isolate_mode_t).

Hello,

I agree with leaving the fast allocation path as simple as possible. I will remove the runtime computation for determining the ratio in __rmqueue_cma() and, instead, will use a pre-computed value calculated on another path.

I am not sure whether your second suggestion (the "Hey reclaimer" part) is good or not. On quick thought, it could help in the situation where many free CMA pages remain, but it would not help when there are neither free movable nor free CMA pages. In general, most workloads mainly use movable pages, for the page cache or anonymous mappings. Although reclaim is triggered by a non-movable allocation failure, the reclaimed pages are mostly used by movable allocations afterwards. We can handle those allocation requests even if we reclaim pages simply in LRU order.
If we rotate the LRU list looking for movable pages, it could cause more useful pages to be evicted. This is just my quick thought, so please correct me if I am wrong.

We have an out-of-tree implementation that takes exactly the approach Minchan described, and it works, but it definitely has some side effects as you pointed out: distorting the LRU and evicting hot pages. I am not attaching code fragments in this thread for various reasons, but it must be easy to implement yourself. I am wondering if it could help in your case as well.

Thanks,
Heesub

And couldn't we do it in zone_watermark_ok() by setting/resetting ALLOC_CMA? If possible, that would be better, because it is the generic function that checks free pages and triggers the reclaim/compaction logic.

I guess your *it* means the ratio computation, right? I don't like putting it in zone_watermark_ok(). Although it needs to refer to the free CMA page count, which zone_watermark_ok() also refers to, this computation is for determining the ratio,
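The watermark accounting being debated above can be sketched with a minimal userspace model. This is not the real zone_watermark_ok() signature; the flag name and numbers are illustrative. It only shows the mechanism described in the thread: allocations that cannot use MIGRATE_CMA pageblocks subtract the free CMA pages before comparing against the watermark.

```c
#include <stdbool.h>

#define ALLOC_CMA 0x1 /* illustrative flag; the kernel's alloc_flags bit differs */

/* Minimal model of the check described above: callers that cannot use
 * MIGRATE_CMA pageblocks (unmovable/reclaimable allocations) subtract the
 * free CMA pages before comparing against the watermark. With a large CMA
 * reserve, this makes the check fail early and wakes kswapd even though
 * plenty of (CMA) memory is still free. */
static bool watermark_ok(long free_pages, long free_cma_pages,
                         long watermark, int alloc_flags)
{
    if (!(alloc_flags & ALLOC_CMA))
        free_pages -= free_cma_pages;
    return free_pages > watermark;
}
```

With 1000 free pages, of which 600 are CMA, and a watermark of 500, a movable (ALLOC_CMA) request passes while an unmovable one fails; that imbalance is exactly what wakes kswapd in the scenario above.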
[PATCH 1/2] mm/compaction: clean up unused code lines
This commit removes lines of code that are unused or never called.

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
Cc: Sunghwan Yun
---
 mm/compaction.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9635083..1ef9144 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -208,12 +208,6 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
 	return true;
 }

-static inline bool compact_trylock_irqsave(spinlock_t *lock,
-			unsigned long *flags, struct compact_control *cc)
-{
-	return compact_checklock_irqsave(lock, flags, false, cc);
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
@@ -728,7 +722,6 @@ static void isolate_freepages(struct zone *zone,
 			continue;

 		/* Found a block suitable for isolating free pages from */
-		isolated = 0;

 		/*
 		 * As pfn may not start aligned, pfn+pageblock_nr_page
@@ -1160,9 +1153,6 @@ static void __compact_pgdat(pg_data_t *pgdat, struct compact_control *cc)
 		if (zone_watermark_ok(zone, cc->order,
 				low_wmark_pages(zone), 0, 0))
 			compaction_defer_reset(zone, cc->order, false);
-
-		/* Currently async compaction is never deferred. */
-		else if (cc->sync)
-			defer_compaction(zone, cc->order);
 	}

 	VM_BUG_ON(!list_empty(&cc->freepages));
--
1.8.3.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2/2] mm/compaction: fix to initialize free scanner properly
The free scanner does not work well on systems with zones that do not end at a pageblock-aligned boundary. zone->compact_cached_free_pfn is reset when the migration and free scanners cross or when compaction restarts. After the reset, if end_pfn of the zone is not aligned to pageblock_nr_pages, the free scanner tries to isolate free pages from the middle of a pageblock to its end, which can be a very small range.

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
Cc: Sunghwan Yun
---
 mm/compaction.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 1ef9144..fefe1da 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -983,7 +983,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc)
 	 */
 	cc->migrate_pfn = zone->compact_cached_migrate_pfn;
 	cc->free_pfn = zone->compact_cached_free_pfn;
-	if (cc->free_pfn < start_pfn || cc->free_pfn > end_pfn) {
+	if (cc->free_pfn < start_pfn || cc->free_pfn >= end_pfn) {
 		cc->free_pfn = end_pfn & ~(pageblock_nr_pages-1);
 		zone->compact_cached_free_pfn = cc->free_pfn;
 	}
--
1.8.3.2
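The off-by-one fixed here can be seen with a small userspace model of the reset logic (the helper name is hypothetical and pageblock_nr_pages is fixed at an illustrative value): with the old `>` comparison, a cached free_pfn equal to end_pfn survived validation, so on a zone whose end is not pageblock-aligned the scanner started inside the last, partial pageblock.

```c
#define PAGEBLOCK_NR_PAGES 512UL /* illustrative; the real value is config-dependent */

/* Model of the fixed check in compact_zone(): a cached free_pfn equal to
 * end_pfn is treated as stale and reset to the pageblock-aligned start of
 * the zone's last pageblock. Under the old '>' comparison it was accepted,
 * leaving only the tiny range [free_pfn, end_pfn) to scan. */
static unsigned long init_free_pfn(unsigned long cached,
                                   unsigned long start_pfn,
                                   unsigned long end_pfn)
{
    if (cached < start_pfn || cached >= end_pfn)
        return end_pfn & ~(PAGEBLOCK_NR_PAGES - 1);
    return cached;
}
```

For a zone ending at pfn 4000 (not a multiple of 512), a cached value of 4000 is now reset to 3584, the aligned start of the last pageblock, instead of being used as-is.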
[PATCH] fs/buffer.c: use lowmem_page_address instead of page_address
Here, the page has already been identified as lowmem, so calling lowmem_page_address() directly is a little cheaper than page_address().

Signed-off-by: Heesub Shin
Cc: Dongjun Shin
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 695eb14..ccc2c7b 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1464,7 +1464,7 @@ void set_bh_page(struct buffer_head *bh,
 		 */
 		bh->b_data = (char *)(0 + offset);
 	else
-		bh->b_data = page_address(page) + offset;
+		bh->b_data = lowmem_page_address(page) + offset;
 }
 EXPORT_SYMBOL(set_bh_page);
--
1.8.3.2
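The saving can be illustrated with a toy userspace model of the two helpers. This is not the real include/linux/mm.h code; PAGE_OFFSET, the struct layout, and the highmem fallback are stand-ins. The point is that lowmem_page_address() is pure arithmetic on the pfn, while page_address() pays for an extra highmem test (and, on real highmem configurations, a hashed lookup of kmapped pages).

```c
#define PAGE_SHIFT  12
#define PAGE_OFFSET 0xC0000000UL /* illustrative 32-bit direct-map base */

struct page {
    unsigned long pfn;
    int           is_highmem;
};

/* Lowmem pages live in the direct mapping: the address is arithmetic. */
static void *lowmem_page_address(const struct page *pg)
{
    return (void *)(PAGE_OFFSET + (pg->pfn << PAGE_SHIFT));
}

/* page_address() must branch on highmem first; the highmem path (a hashed
 * table walk in the real kernel) is modelled here as returning NULL. */
static void *page_address(const struct page *pg)
{
    if (!pg->is_highmem)
        return lowmem_page_address(pg);
    return (void *)0;
}

/* Small wrapper so the lowmem mapping is easy to check for a given pfn. */
static unsigned long lowmem_addr_of_pfn(unsigned long pfn)
{
    struct page pg = { pfn, 0 };
    return (unsigned long)lowmem_page_address(&pg);
}
```

When the caller already knows the page is lowmem, as set_bh_page() does in its else-branch, the branch in page_address() is pure overhead, which is the whole rationale of the patch.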