[PATCH v2] zswap: Update with same-value filled page feature

2017-12-06 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 6 Dec 2017 16:29:50 +0530
Subject: [PATCH v2] zswap: Update with same-value filled page feature

Changes since v1:
Updated to clarify the zswap.same_filled_pages_enabled parameter.

Updated the zswap document with details on the same-value filled
pages identification feature.
The usage of the zswap.same_filled_pages_enabled module parameter
is explained.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 Documentation/vm/zswap.txt | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
index 89fff7d..0b3a114 100644
--- a/Documentation/vm/zswap.txt
+++ b/Documentation/vm/zswap.txt
@@ -98,5 +98,25 @@ request is made for a page in an old zpool, it is uncompressed using its
 original compressor.  Once all pages are removed from an old zpool, the zpool
 and its compressor are freed.
 
+Some of the pages in zswap are same-value filled pages (i.e. the contents
+of the page are a single repeated value or pattern). These pages, which
+include zero-filled pages, are handled differently. During a store
+operation, a page is checked for being same-value filled before it is
+compressed. If it is, the compressed length of the page is set to zero
+and the pattern or same-filled value is stored instead.
+
+The same-value filled pages identification feature is enabled by default and
+can be disabled at boot time by setting the "same_filled_pages_enabled"
+attribute to 0, e.g. zswap.same_filled_pages_enabled=0. It can also be enabled
+and disabled at runtime using the sysfs "same_filled_pages_enabled" attribute,
+e.g.
+
+echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled
+
+When same-filled page identification is disabled at runtime, zswap stops
+checking for same-value filled pages during store operations. However,
+existing pages that were marked as same-value filled remain stored
+unchanged in zswap until they are either loaded or invalidated.
+
 A debugfs interface is provided for various statistic about pool size, number
-of pages stored, and various counters for the reasons pages are rejected.
+of pages stored, same-value filled pages, and various counters for the reasons
+pages are rejected.
-- 
2.7.4



[PATCH] zswap: Update with same-value filled page feature

2017-11-29 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 29 Nov 2017 20:23:15 +0530
Subject: [PATCH] zswap: Update with same-value filled page feature

Updated the zswap document with details on the same-value filled
pages identification feature.
The usage of the zswap.same_filled_pages_enabled module parameter
is explained.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 Documentation/vm/zswap.txt | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
index 89fff7d..cc015b5 100644
--- a/Documentation/vm/zswap.txt
+++ b/Documentation/vm/zswap.txt
@@ -98,5 +98,25 @@ request is made for a page in an old zpool, it is uncompressed using its
 original compressor.  Once all pages are removed from an old zpool, the zpool
 and its compressor are freed.
 
+Some of the pages in zswap are same-value filled pages (i.e. the contents
+of the page are a single repeated value or pattern). These pages, which
+include zero-filled pages, are handled differently. During a store
+operation, a page is checked for being same-value filled before it is
+compressed. If it is, the compressed length of the page is set to zero
+and the pattern or same-filled value is stored instead.
+
+The same-value filled pages identification feature is enabled by default and
+can be disabled at boot time by setting the "same_filled_pages_enabled"
+attribute to 0, e.g. zswap.same_filled_pages_enabled=0. It can also be enabled
+and disabled at runtime using the sysfs "same_filled_pages_enabled" attribute,
+e.g.
+
+echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled
+
+When zswap same-filled page identification is disabled at runtime, it will stop
+checking for the same-value filled pages during store operation. However, the
+existing pages which are marked as same-value filled pages will be loaded or
+invalidated.
+
 A debugfs interface is provided for various statistic about pool size, number
-of pages stored, and various counters for the reasons pages are rejected.
+of pages stored, same-value filled pages, and various counters for the reasons
+pages are rejected.
-- 
2.7.4



[PATCH v2] zswap: Same-filled pages handling

2017-11-21 Thread Srividya Desireddy

From: Srividya Desireddy <srividya...@samsung.com>
Date: Sat, 18 Nov 2017 18:29:16 +0530
Subject: [PATCH v2] zswap: Same-filled pages handling

Changes since v1:

Use memset_l() instead of a for loop.

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are same-filled pages (i.e. the contents of the page are all the same), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify a
same-filled page before compressing it. If the page is same-filled, it
sets zswap_entry.length to zero, saves the same-filled value, and skips
both compressing the page and allocating memory in the zpool.
In zswap_frontswap_load(), it checks whether zswap_entry.length is zero
for the page to be loaded. If so, it fills the page with the saved
value, which saves the decompression time during load.

On an ARM Quad Core 32-bit device with 1.5GB RAM, launching and
relaunching different applications, out of ~64000 pages stored in
zswap, ~11000 pages were same-value filled pages (including zero-filled
pages) and ~9000 pages were zero-filled pages.

An average of 17% of pages (including zero-filled pages) in zswap are
same-value filled pages and 14% of pages are zero-filled pages.
An average of 3% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                          Baseline    With patch    % Improvement
  ---------------------------------------------------------------
  *Zswap Store Time        26.5ms        18ms            32%
   (of same value pages)
  *Zswap Load Time         25.5ms        13ms            49%
   (of same value pages)
  ---------------------------------------------------------------

On an Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts and running multimedia applications, out of ~360000 pages stored
in zswap, 78000 (~22%) of pages were found to be same-value filled pages
(including zero-filled pages) and 64000 (~17%) were zero-filled pages.
So an average of 5% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                          Baseline    With patch    % Improvement
  ---------------------------------------------------------------
  *Zswap Store Time         91ms         74ms            19%
   (of same value pages)
  *Zswap Load Time          50ms         7.5ms           85%
   (of same value pages)
  ---------------------------------------------------------------

*The execution times may vary with test device used.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c | 71 +-
 1 file changed, 66 insertions(+), 5 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index d39581a..1133b4ce 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of same-value filled pages currently stored in zswap */
+static atomic_t zswap_same_filled_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -116,6 +118,11 @@ module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_type, 0644);
 static unsigned int zswap_max_pool_percent = 20;
 module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
 
+/* Enable/disable handling same-value filled pages (enabled by default) */
+static bool zswap_same_filled_pages_enabled = true;
+module_param_named(same_filled_pages_enabled, zswap_same_filled_pages_enabled,
+  bool, 0644);
+
 /*
 * data structures
 **/
@@ -145,9 +152,10 @@ struct zswap_pool {
  *be held while changing the refcount.  Since the lock must
  *be held, there is no reason to also make refcount atomic.
  * length - the length in bytes of the compressed page data.  Needed during
- *  decompression
+ *  decompression. For a same value filled page length is 0.
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
+ * value - value of the same-value filled pages which have same content
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -155,7 +163,10 @@ struct zswap_entry {
int refcount;
unsigned int length;
struct zswap_pool *pool;
-   unsigned long handle;
+   union {
+   unsigned long handle;
+   unsigned long value;
+   };
 };
 
 struct zswap_header {
@@ -320,8 +331,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)

Re: [PATCH] zswap: Same-filled pages handling

2017-11-02 Thread Srividya Desireddy
 
On Wed, Oct 19, 2017 at 6:38 AM, Matthew Wilcox wrote: 
> On Thu, Oct 19, 2017 at 12:31:18AM +0300, Timofey Titovets wrote:
>> > +static void zswap_fill_page(void *ptr, unsigned long value)
>> > +{
>> > +   unsigned int pos;
>> > +   unsigned long *page;
>> > +
>> > +   page = (unsigned long *)ptr;
>> > +   if (value == 0)
>> > +   memset(page, 0, PAGE_SIZE);
>> > +   else {
>> > +   for (pos = 0; pos < PAGE_SIZE / sizeof(*page); pos++)
>> > +   page[pos] = value;
>> > +   }
>> > +}
>> 
>> Same here, but with memcpy().
>
>No.  Use memset_l which is optimised for this specific job.

I have tested this patch using the memset_l() function in zswap_fill_page()
on an x86 64-bit system with 2GB RAM. The performance remains the same.
But memset_l() might be optimised in future.
@Seth Jennings/Dan Streetman: Should I use the memset_l() function in this patch?


Re: [PATCH] zswap: Same-filled pages handling

2017-10-18 Thread Srividya Desireddy
On Wed, Oct 18, 2017 at 7:41 PM, Matthew Wilcox wrote: 
> On Wed, Oct 18, 2017 at 04:33:43PM +0300, Timofey Titovets wrote:
>> 2017-10-18 15:34 GMT+03:00 Matthew Wilcox <wi...@infradead.org>:
>> > On Wed, Oct 18, 2017 at 10:48:32AM +, Srividya Desireddy wrote:
>> >> +static void zswap_fill_page(void *ptr, unsigned long value)
>> >> +{
>> >> + unsigned int pos;
>> >> + unsigned long *page;
>> >> +
>> >> + page = (unsigned long *)ptr;
>> >> + if (value == 0)
>> >> + memset(page, 0, PAGE_SIZE);
>> >> + else {
>> >> + for (pos = 0; pos < PAGE_SIZE / sizeof(*page); pos++)
>> >> + page[pos] = value;
>> >> + }
>> >> +}
>> >
>> > I think you meant:
>> >
>> > static void zswap_fill_page(void *ptr, unsigned long value)
>> > {
>> > memset_l(ptr, value, PAGE_SIZE / sizeof(unsigned long));
>> > }
>> 
>> IIRC kernel have special zero page, and if i understand correctly.
>> You can map all zero pages to that zero page and not touch zswap completely.
>> (Your situation look like some KSM case (i.e. KSM can handle pages
>> with same content), but i'm not sure if that applicable there)
> 
>You're confused by the word "same".  What Srividya meant was that the
>page is filled with a pattern, eg 0xfffefffefffefffe..., not that it is
>the same as any other page.

In the kernel there is a special zero page, empty_zero_page, which is
generally allocated in the paging_init() function to map all zero pages.
But same-value-filled pages, including zero pages, exist in memory
because applications may initialize their allocated pages with a value
and never use them, or because the content actually written to the pages
during execution is itself same-valued, as with multimedia data for example.

I had earlier posted a patch with a similar implementation of the KSM
concept for Zswap:
https://lkml.org/lkml/2016/8/17/171
https://lkml.org/lkml/2017/2/17/612

- Srividya


[PATCH] zswap: Same-filled pages handling

2017-10-18 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 18 Oct 2017 15:39:02 +0530
Subject: [PATCH] zswap: Same-filled pages handling

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are same-filled pages (i.e. the contents of the page are all the same), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify a
same-filled page before compressing it. If the page is same-filled, it
sets zswap_entry.length to zero, saves the same-filled value, and skips
both compressing the page and allocating memory in the zpool.
In zswap_frontswap_load(), it checks whether zswap_entry.length is zero
for the page to be loaded. If so, it fills the page with the saved
value, which saves the decompression time during load.

On an ARM Quad Core 32-bit device with 1.5GB RAM, launching and
relaunching different applications, out of ~64000 pages stored in
zswap, ~11000 pages were same-value filled pages (including zero-filled
pages) and ~9000 pages were zero-filled pages.

An average of 17% of pages (including zero-filled pages) in zswap are
same-value filled pages and 14% of pages are zero-filled pages.
An average of 3% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                          Baseline    With patch    % Improvement
  ---------------------------------------------------------------
  *Zswap Store Time        26.5ms        18ms            32%
   (of same value pages)
  *Zswap Load Time         25.5ms        13ms            49%
   (of same value pages)
  ---------------------------------------------------------------

On an Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts and running multimedia applications, out of ~360000 pages stored
in zswap, 78000 (~22%) of pages were found to be same-value filled pages
(including zero-filled pages) and 64000 (~17%) were zero-filled pages.
So an average of 5% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                          Baseline    With patch    % Improvement
  ---------------------------------------------------------------
  *Zswap Store Time         91ms         74ms            19%
   (of same value pages)
  *Zswap Load Time          50ms         7.5ms           85%
   (of same value pages)
  ---------------------------------------------------------------

*The execution times may vary with test device used.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c | 77 ++
 1 file changed, 72 insertions(+), 5 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index d39581a..4dd8b89 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of same-value filled pages currently stored in zswap */
+static atomic_t zswap_same_filled_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -116,6 +118,11 @@ static int zswap_compressor_param_set(const char *, const struct kernel_param *);
 static unsigned int zswap_max_pool_percent = 20;
 module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
 
+/* Enable/disable handling same-value filled pages (enabled by default) */
+static bool zswap_same_filled_pages_enabled = true;
+module_param_named(same_filled_pages_enabled, zswap_same_filled_pages_enabled,
+  bool, 0644);
+
 /*
 * data structures
 **/
@@ -145,9 +152,10 @@ struct zswap_pool {
  *be held while changing the refcount.  Since the lock must
  *be held, there is no reason to also make refcount atomic.
  * length - the length in bytes of the compressed page data.  Needed during
- *  decompression
+ *  decompression. For a same value filled page length is 0.
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
+ * value - value of the same-value filled pages which have same content
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -155,7 +163,10 @@ struct zswap_entry {
int refcount;
unsigned int length;
struct zswap_pool *pool;
-   unsigned long handle;
+   union {
+   unsigned long handle;
+   unsigned long value;
+   };
 };
 
 struct zswap_header {
@@ -320,8 +331,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
- 

[PATCH] zswap: Same-filled pages handling

2017-10-18 Thread Srividya Desireddy
From: Srividya Desireddy 
Date: Wed, 18 Oct 2017 15:39:02 +0530
Subject: [PATCH] zswap: Same-filled pages handling

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are same-filled pages (i.e. contents of the page are all same), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify same-filled
page before compression of the page. If the page is a same-filled page, set
zswap_entry.length to zero, save the same-filled value and skip the
compression of the page and alloction of memory in zpool.
In zswap_frontswap_load(), check if value of zswap_entry.length is zero
corresponding to the page to be loaded. If zswap_entry.length is zero,
fill the page with same-filled value. This saves the decompression time
during load.

On a ARM Quad Core 32-bit device with 1.5GB RAM by launching and
relaunching different applications, out of ~64000 pages stored in
zswap, ~11000 pages were same-value filled pages (including zero-filled
pages) and ~9000 pages were zero-filled pages.

An average of 17% of pages(including zero-filled pages) in zswap are
same-value filled pages and 14% pages are zero-filled pages.
An average of 3% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

  BaselineWith patch  % Improvement
-
*Zswap Store Time   26.5ms   18ms  32%
 (of same value pages)
*Zswap Load Time
 (of same value pages)  25.5ms   13ms  49%
-

On Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts and running multimedia applications, out of 36 pages
stored in zswap 78000(~22%) of pages were found to be same-value filled
pages (including zero-filled pages) and 64000(~17%) are zero-filled
pages. So an average of %5 of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

  BaselineWith patch  % Improvement
-
*Zswap Store Time   91ms74ms   19%
 (of same value pages)
*Zswap Load Time50ms7.5ms  85%
 (of same value pages)
-

*The execution times may vary with test device used.

Signed-off-by: Srividya Desireddy 
---
 mm/zswap.c | 77 ++
 1 file changed, 72 insertions(+), 5 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index d39581a..4dd8b89 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of same-value filled pages currently stored in zswap */
+static atomic_t zswap_same_filled_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -116,6 +118,11 @@ static int zswap_compressor_param_set(const char *,
 static unsigned int zswap_max_pool_percent = 20;
 module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
 
+/* Enable/disable handling same-value filled pages (enabled by default) */
+static bool zswap_same_filled_pages_enabled = true;
+module_param_named(same_filled_pages_enabled, zswap_same_filled_pages_enabled,
+  bool, 0644);
+
 /*
 * data structures
 **/
@@ -145,9 +152,10 @@ struct zswap_pool {
  *be held while changing the refcount.  Since the lock must
  *be held, there is no reason to also make refcount atomic.
  * length - the length in bytes of the compressed page data.  Needed during
- *  decompression
+ *  decompression. For a same value filled page length is 0.
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
+ * value - the fill value of a same-value filled page (when length is 0)
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -155,7 +163,10 @@ struct zswap_entry {
int refcount;
unsigned int length;
struct zswap_pool *pool;
-   unsigned long handle;
+   union {
+   unsigned long handle;
+   unsigned long value;
+   };
 };
 
 struct zswap_header {
@@ -320,8 +331,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
-   zpool_free(entry->pool->zpool, entry->handle);
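The message is truncated here by the list archive. The overall fast path the patch describes — record the fill word in the entry's `value` field with `length == 0` at store time, and rebuild the page on load without decompressing — can be sketched in user-space C roughly as follows (`demo_entry`, `demo_store` and `demo_load` are illustrative names, not the kernel API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096  /* illustrative; the kernel value is arch-defined */

/* Simplified stand-in for struct zswap_entry: length == 0 means
 * "same-value filled", in which case 'value' holds the fill word
 * instead of a zpool allocation handle. */
struct demo_entry {
	size_t length;
	union {
		uintptr_t handle;    /* compressed-data handle (unused here) */
		unsigned long value; /* fill word for same-value pages */
	};
};

/* Return 1 and set *value if every word in the page equals the first. */
static int page_same_filled(const void *ptr, unsigned long *value)
{
	const unsigned long *page = ptr;
	size_t i, words = PAGE_SZ / sizeof(*page);

	for (i = 1; i < words; i++)
		if (page[i] != page[0])
			return 0;
	*value = page[0];
	return 1;
}

/* Store path: skip "compression" entirely for same-value pages. */
static void demo_store(struct demo_entry *e, const void *src)
{
	unsigned long v;

	if (page_same_filled(src, &v)) {
		e->length = 0;
		e->value = v;
		return;
	}
	e->length = PAGE_SZ; /* real code would compress into a zpool here */
}

/* Load path: reconstruct a same-value page without decompression. */
static void demo_load(const struct demo_entry *e, void *dst)
{
	if (e->length == 0) {
		unsigned long *p = dst;
		size_t i;

		for (i = 0; i < PAGE_SZ / sizeof(*p); i++)
			p[i] = e->value;
	}
	/* else: decompress via e->handle (omitted) */
}
```

This mirrors the `union { handle; value; }` in the hunk above: since a same-value page never allocates from the zpool, the two fields can safely share storage.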

[PATCH v2] zswap: Zero-filled pages handling

2017-08-16 Thread Srividya Desireddy

On Thu, Jul 6, 2017 at 3:32 PM, Dan Streetman wrote:
> On Thu, Jul 6, 2017 at 5:29 AM, Srividya Desireddy
> wrote:
>> On Wed, Jul 6, 2017 at 10:49 AM, Sergey Senozhatsky wrote:
>>> On (07/02/17 20:28), Seth Jennings wrote:
>>>> On Sun, Jul 2, 2017 at 9:19 AM, Srividya Desireddy
>>>> > Zswap is a cache which compresses the pages that are being swapped out
>>>> > and stores them into a dynamically allocated RAM-based memory pool.
>>>> > Experiments have shown that around 10-20% of pages stored in zswap
>>>> > are zero-filled pages (i.e. contents of the page are all zeros), but
>>>> > these pages are handled as normal pages by compressing and allocating
>>>> > memory in the pool.
>>>>
>>>> I am somewhat surprised that this many anon pages are zero filled.
>>>>
>>>> If this is true, then maybe we should consider solving this at the
>>>> swap level in general, as we can de-dup zero pages in all swap
>>>> devices, not just zswap.
>>>>
>>>> That being said, this is a fair small change and I don't see anything
>>>> objectionable.  However, I do think the better solution would be to do
>>> this at a higher level.
>>>
>>
>> Thank you for your suggestion. It would be a better solution to handle
>> zero-filled pages before swapping out to zswap. Since zram already
>> handles zero pages internally, I considered handling them within zswap.
>> In the long run, we can work on handling zero-filled anon pages in a
>> common place.
>>
>>> zero-filled pages are just 1 case. in general, it's better
>>> to handle pages that are memset-ed with the same value (e.g.
>>> memset(page, 0x01, page_size)). which includes, but not
>>> limited to, 0x00. zram does it.
>>>
>>> -ss
>>
>> It is a good suggestion to extend zero-filled page handling to same-value
>> pages. I will work on identifying the percentage of same-value pages
>> excluding zero-filled pages in zswap and will get back.
>
> Yes, this sounds like a good modification to the patch.  Also, unless
> anyone else disagrees, it may be good to control this with a module
> param - in case anyone has a use case that they know won't be helped
> by this, and the extra overhead of checking each page is wasteful.
> Probably should default to enabled.
>
>>
>> - Srividya

I have modified the patch to handle same-value filled pages.

I tested on an ARM Quad Core 32-bit device with 1.5GB RAM by launching
and relaunching different applications. After the test, out of ~64000
pages stored in zswap, ~11000 pages were same-value filled pages
(including zero-filled pages) and ~9000 pages were zero-filled pages.

An average of 17% of pages (including zero-filled pages) in zswap were
same-value filled pages and 14% were zero-filled pages, so an average
of 3% of pages were same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                         Baseline    With patch    % Improvement
-----------------------------------------------------------------
*Zswap Store Time          26.5ms          18ms          32%
 (of same value pages)
*Zswap Load Time           25.5ms          13ms          49%
 (of same value pages)
-----------------------------------------------------------------

On Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts and running multimedia applications, out of 36 pages
stored in zswap, 78000 (~22%) were found to be same-value filled
pages (including zero-filled pages) and 64000 (~17%) were zero-filled
pages. So on average 5% of pages are same-filled non-zero pages.

The below table shows the execution time profiling with the patch.

                         Baseline    With patch    % Improvement
-----------------------------------------------------------------
*Zswap Store Time            91ms          74ms          19%
 (of same value pages)
*Zswap Load Time             50ms         7.5ms          85%
 (of same value pages)
-----------------------------------------------------------------

*The execution times may vary with test device used.

I will send the patch handling same-value filled pages along with a
module param to control it (default: enabled).

 - Srividya
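For reference, the knob discussed above became a standard zswap module parameter; assuming the naming used in the v2 patch, it would be exercised like the other zswap parameters. This is an illustrative configuration fragment (requires root and a zswap-enabled kernel), not output from a real session:

```shell
# Disable same-value filled page handling at boot (kernel command line):
#     zswap.same_filled_pages_enabled=0

# Toggle and inspect it at runtime through sysfs:
echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled
cat /sys/module/zswap/parameters/same_filled_pages_enabled
```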



Re: [PATCH v2] zswap: Zero-filled pages handling

2017-07-06 Thread Srividya Desireddy
On Wed, Jul 6, 2017 at 10:49 AM, Sergey Senozhatsky wrote:
> On (07/02/17 20:28), Seth Jennings wrote:
>> On Sun, Jul 2, 2017 at 9:19 AM, Srividya Desireddy
>> > Zswap is a cache which compresses the pages that are being swapped out
>> > and stores them into a dynamically allocated RAM-based memory pool.
>> > Experiments have shown that around 10-20% of pages stored in zswap
>> > are zero-filled pages (i.e. contents of the page are all zeros), but
>> > these pages are handled as normal pages by compressing and allocating
>> > memory in the pool.
>> 
>> I am somewhat surprised that this many anon pages are zero filled.
>> 
>> If this is true, then maybe we should consider solving this at the
>> swap level in general, as we can de-dup zero pages in all swap
>> devices, not just zswap.
>> 
>> That being said, this is a fair small change and I don't see anything
>> objectionable.  However, I do think the better solution would be to do
> this at a higher level.
> 

Thank you for your suggestion. It would be a better solution to handle
zero-filled pages before swapping out to zswap. Since zram already
handles zero pages internally, I considered handling them within zswap.
In the long run, we can work on handling zero-filled anon pages in a
common place.

> zero-filled pages are just 1 case. in general, it's better
> to handle pages that are memset-ed with the same value (e.g.
> memset(page, 0x01, page_size)). which includes, but not
> limited to, 0x00. zram does it.
> 
> -ss

It is a good suggestion to extend zero-filled page handling to same-value
pages. I will work on identifying the percentage of same-value pages
excluding zero-filled pages in zswap and will get back.

- Srividya
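As Sergey notes, a page `memset()` with any single byte value has all of its machine words equal, so a word-granularity scan — the approach zram takes — catches every memset pattern, not just zeros. A rough user-space sketch of such a check (`page_same_filled` here is an illustrative stand-in, not the kernel function):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096  /* illustrative; the kernel value is arch-defined */

/* Word-granularity same-value check, modeled loosely on zram's
 * page_same_filled(): compare every word of the page against the
 * first word and report the fill word on success. This detects all
 * memset patterns (0x00, 0x01, ...) and any word-repeating pattern. */
static int page_same_filled(const void *ptr, unsigned long *fill)
{
	const unsigned long *w = ptr;
	size_t i, n = PAGE_SZ / sizeof(*w);

	for (i = 1; i < n; i++)
		if (w[i] != w[0])
			return 0;
	*fill = w[0];
	return 1;
}
```

Because the check bails out on the first mismatching word, the cost for ordinary (non-same-filled) pages is usually just a few comparisons.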


[PATCH v2] zswap: Zero-filled pages handling

2017-07-02 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Sun, 2 Jul 2017 19:15:37 +0530
Subject: [PATCH v2] zswap: Zero-filled pages handling

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are zero-filled pages (i.e. contents of the page are all zeros), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify a zero-filled
page before compressing it. If the page is a zero-filled page, set
zswap_entry.zeroflag and skip the compression of the page and allocation
of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
set for the page in zswap_entry. If the flag is set, memset the page with
zero. This saves the decompression time during load.

On Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts ~15% of pages in zswap were zero pages. With multimedia workload
more than 20% of zswap pages were found to be zero pages.

On an ARM Quad Core 32-bit device with 1.5GB RAM an average 10% of zero
pages were found in zswap (an average of 5000 zero pages found out of
~5 pages stored in zswap) on launching and relaunching 15 applications.
The launch time of the applications improved by ~3%.

Test Parameters         Baseline    With patch    Improvement
--------------------------------------------------------------
Total RAM                 1343MB        1343MB
Available RAM              451MB         445MB        -6MB
Avg. Memfree                69MB          70MB         1MB
Avg. Swap Used             226MB         215MB       -11MB
Avg. App entry time      644msec       623msec          3%

With the patch, every page swapped to zswap is checked to see whether it
is a zero page, and for all zero pages the compression and memory
allocation operations are skipped. Overall there is an improvement of
30% in zswap store time.

In case of non-zero pages there is no overhead during zswap page load. For
zero pages there is an improvement of more than 60% in the zswap load time
as the zero page decompression is avoided.
The below table shows the execution time profiling of the patch.

Zswap Store Operation     Baseline    With patch    % Improvement
------------------------------------------------------------------
* Zero page check               --        22.5ms
  (for non-zero pages)
* Zero page check               --          24ms
  (for zero pages)
* Compression time            55ms            --
  (of zero pages)
* Allocation time             14ms            --
  (to store compressed
   zero pages)
------------------------------------------------------------------
Total                         69ms        46.5ms         32%

Zswap Load Operation      Baseline    With patch    % Improvement
------------------------------------------------------------------
* Decompression time        30.4ms            --
  (of zero pages)
* Zero page check +             --       10.04ms
  memset operation
  (of zero pages)
------------------------------------------------------------------
Total                       30.4ms       10.04ms         66%

*The execution times may vary with test device used.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c |   46 ++
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index eedc278..edc584b 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of zero filled pages swapped out to zswap */
+static atomic_t zswap_zero_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -145,7 +147,7 @@ struct zswap_pool {
  *be held while changing the refcount.  Since the lock must
  *be held, there is no reason to also make refcount atomic.
  * length - the length in bytes of the compressed page data.  Needed during
- *  decompression
+ *  decompression. For a zero page, length is 0.
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
  */
@@ -320,8 +322,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
-   zpool_free(entry->pool->zpool, entry->handle);
-   zswap_pool_put(entry->pool);
+   if (!entry->length)
+   atomic_dec(&zswap_zero_pages);
+   else {
+   zpool_free(entry->pool->zpool, entry->handle);
+   zswap_pool_put(entry->pool);
+   }
	zswap_entry_cache_free(entry);
	atomic_dec(&zswap_stored_pages);
zswap_update_total_size();
@@ -956,6 +962,19 @@ static int zswap_shrink(void)
	return ret;
 }
 
+static int zswap_is

[PATCH] zswap: Zero-filled pages handling

2017-03-08 Thread Srividya Desireddy

On Sat, Mar 4, 2017 at 02:55 AM, Dan Streetman <ddstr...@ieee.org> wrote:
> On Sat, Feb 25, 2017 at 12:18 PM, Sarbojit Ganguly
> <unixman.linux...@gmail.com> wrote:
>> On 25 February 2017 at 20:12, Srividya Desireddy
>> <srividya...@samsung.com> wrote:
>>> From: Srividya Desireddy <srividya...@samsung.com>
>>> Date: Thu, 23 Feb 2017 15:04:06 +0530
>>> Subject: [PATCH] zswap: Zero-filled pages handling
>
> your email is base64-encoded; please send plain text emails.
>
>>>
>>> Zswap is a cache which compresses the pages that are being swapped out
>>> and stores them into a dynamically allocated RAM-based memory pool.
>>> Experiments have shown that around 10-20% of pages stored in zswap
>>> are zero-filled pages (i.e. contents of the page are all zeros), but
>
> 20%?  that's a LOT of zero pages...which seems like applications are
> wasting a lot of memory.  what kind of workload are you testing with?
>

I have tested this patch with different workloads on different devices.
On Ubuntu PC with 2GB RAM, while executing kernel build and other test
scripts ~15% of pages in zswap were zero pages. With multimedia workload
more than 20% of zswap pages were found to be zero pages.
On an ARM Quad Core 32-bit device with 1.5GB RAM an average 10% of zero
pages were found on launching and relaunching 15 applications.

>>> these pages are handled as normal pages by compressing and allocating
>>> memory in the pool.
>>>
>>> This patch adds a check in zswap_frontswap_store() to identify zero-filled
>>> page before compression of the page. If the page is a zero-filled page, set
>>> zswap_entry.zeroflag and skip the compression of the page and alloction
>>> of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
>>> set for the page in zswap_entry. If the flag is set, memset the page with
>>> zero. This saves the decompression time during load.
>>>
>>> The overall overhead caused to check for a zero-filled page is very minimal
>>> when compared to the time saved by avoiding compression and allocation in
>>> case of zero-filled pages. Although, compressed size of a zero-filled page
>>> is very less, with this patch load time of a zero-filled page is reduced by
>>> 80% when compared to baseline.
>>
>> Is it possible to share the benchmark details?
>
> Was there an answer to this?
>

This patch was tested on an ARM Quad Core 32-bit device with 1.5GB RAM by
launching and relaunching different applications. With the patch, an
average of 5000 zero pages were found in zswap out of the ~5 pages
stored in zswap, and application launch time improved by ~3%.

Test Parameters         Baseline    With patch    Improvement
--------------------------------------------------------------
Total RAM                 1343MB        1343MB
Available RAM              451MB         445MB        -6MB
Avg. Memfree                69MB          70MB         1MB
Avg. Swap Used             226MB         215MB       -11MB
Avg. App entry time      644msec       623msec          3%

With the patch, every page swapped to zswap is checked to see whether it
is a zero page, and for all zero pages the compression and memory
allocation operations are skipped. Overall there is an improvement of
30% in zswap store time.
In case of non-zero pages there is no overhead during zswap page load. For
zero pages there is an improvement of more than 60% in the zswap load time
as the zero page decompression is avoided.

The below table shows the execution time profiling of the patch.

Zswap Store Operation     Baseline    With patch    % Improvement
------------------------------------------------------------------
* Zero page check               --        22.5ms
  (for non-zero pages)
* Zero page check               --          24ms
  (for zero pages)
* Compression time            55ms            --
  (of zero pages)
* Allocation time             14ms            --
  (to store compressed
   zero pages)
------------------------------------------------------------------
Total                         69ms        46.5ms         32%

Zswap Load Operation      Baseline    With patch    % Improvement
------------------------------------------------------------------
* Decompression time        30.4ms            --
  (of zero pages)
* Zero page check +             --       10.04ms
  memset operation
  (of zero pages)
------------------------------------------------------------------
Total                       30.4ms       10.04ms         66%

*The execution times may vary with test device used.

>>
>>
>>>
>>> Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
>>> ---
>>>  mm/zswap.c |   48 +---
>>>  1 file changed, 45 insertions(+), 3 deletions(-)


[PATCH] zswap: Zero-filled pages handling

2017-02-25 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Thu, 23 Feb 2017 15:04:06 +0530
Subject: [PATCH] zswap: Zero-filled pages handling

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are zero-filled pages (i.e. contents of the page are all zeros), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify a zero-filled
page before compressing it. If the page is a zero-filled page, set
zswap_entry.zeroflag and skip the compression of the page and allocation
of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
set for the page in zswap_entry. If the flag is set, memset the page with
zero. This saves the decompression time during load.

The overhead of checking for a zero-filled page is minimal compared to
the time saved by avoiding compression and allocation for zero-filled
pages. Although the compressed size of a zero-filled page is very small,
with this patch the load time of a zero-filled page is reduced by 80%
compared to baseline.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c |   48 +---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 067a0d6..a574008 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of zero filled pages swapped out to zswap */
+static atomic_t zswap_zero_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -140,6 +142,8 @@ struct zswap_pool {
  *  decompression
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
+ * zeroflag - the flag is set if the content of the page is filled with
+ *zeros
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -148,6 +152,7 @@ struct zswap_entry {
unsigned int length;
struct zswap_pool *pool;
unsigned long handle;
+   unsigned char zeroflag;
 };
 
 struct zswap_header {
@@ -236,6 +241,7 @@ static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
if (!entry)
return NULL;
entry->refcount = 1;
+   entry->zeroflag = 0;
	RB_CLEAR_NODE(&entry->rbnode);
return entry;
 }
@@ -306,8 +312,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
-   zpool_free(entry->pool->zpool, entry->handle);
-   zswap_pool_put(entry->pool);
+   if (entry->zeroflag)
+   atomic_dec(&zswap_zero_pages);
+   else {
+   zpool_free(entry->pool->zpool, entry->handle);
+   zswap_pool_put(entry->pool);
+   }
zswap_entry_cache_free(entry);
	atomic_dec(&zswap_stored_pages);
zswap_update_total_size();
@@ -877,6 +887,19 @@ static int zswap_shrink(void)
return ret;
 }
 
+static int zswap_is_page_zero_filled(void *ptr)
+{
+   unsigned int pos;
+   unsigned long *page;
+
+   page = (unsigned long *)ptr;
+   for (pos = 0; pos != PAGE_SIZE / sizeof(*page); pos++) {
+   if (page[pos])
+   return 0;
+   }
+   return 1;
+}
+
 /*
 * frontswap hooks
 **/
@@ -917,6 +940,15 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
goto reject;
}
 
+   src = kmap_atomic(page);
+   if (zswap_is_page_zero_filled(src)) {
+   kunmap_atomic(src);
+   entry->offset = offset;
+   entry->zeroflag = 1;
+   atomic_inc(_zero_pages);
+   goto insert_entry;
+   }
+
/* if entry is successfully added, it keeps the reference */
entry->pool = zswap_pool_current_get();
if (!entry->pool) {
@@ -927,7 +959,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t 
offset,
/* compress */
dst = get_cpu_var(zswap_dstmem);
tfm = *get_cpu_ptr(entry->pool->tfm);
-   src = kmap_atomic(page);
ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, );
kunmap_atomic(src);
put_cpu_ptr(entry->pool->tfm);
@@ -961,6 +992,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t 
offset,
entry->handle = handle;
entry->length = dlen;
 
+insert_entry:
/* map */
spin_lock(>lock);
do {
@@ -1013,6 +1045,13 @@ static int zswap_f

[PATCH] zswap: Zero-filled pages handling

2017-02-25 Thread Srividya Desireddy
From: Srividya Desireddy 
Date: Thu, 23 Feb 2017 15:04:06 +0530
Subject: [PATCH] zswap: Zero-filled pages handling

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-20% of pages stored in zswap
are zero-filled pages (i.e. contents of the page are all zeros), but
these pages are handled as normal pages by compressing and allocating
memory in the pool.

This patch adds a check in zswap_frontswap_store() to identify a zero-filled
page before compressing it. If the page is zero-filled, set
zswap_entry.zeroflag and skip compression of the page and allocation of
memory in the zpool. In zswap_frontswap_load(), check whether the zeroflag
is set for the page's zswap_entry; if it is, simply memset the page with
zeros. This saves the decompression time during load.

The overhead of checking for a zero-filled page is minimal compared to the
time saved by avoiding compression and allocation for such pages. Although
the compressed size of a zero-filled page is already small, this patch
reduces the load time of a zero-filled page by 80% compared to the
baseline.

Signed-off-by: Srividya Desireddy 
---
 mm/zswap.c |   48 +---
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 067a0d6..a574008 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -49,6 +49,8 @@
 static u64 zswap_pool_total_size;
 /* The number of compressed pages currently stored in zswap */
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
+/* The number of zero filled pages swapped out to zswap */
+static atomic_t zswap_zero_pages = ATOMIC_INIT(0);
 
 /*
  * The statistics below are not protected from concurrent access for
@@ -140,6 +142,8 @@ struct zswap_pool {
  *  decompression
  * pool - the zswap_pool the entry's data is in
  * handle - zpool allocation handle that stores the compressed page data
+ * zeroflag - the flag is set if the content of the page is filled with
+ *zeros
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -148,6 +152,7 @@ struct zswap_entry {
unsigned int length;
struct zswap_pool *pool;
unsigned long handle;
+   unsigned char zeroflag;
 };
 
 struct zswap_header {
@@ -236,6 +241,7 @@ static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
if (!entry)
return NULL;
entry->refcount = 1;
+   entry->zeroflag = 0;
	RB_CLEAR_NODE(&entry->rbnode);
return entry;
 }
@@ -306,8 +312,12 @@ static void zswap_rb_erase(struct rb_root *root, struct zswap_entry *entry)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
-   zpool_free(entry->pool->zpool, entry->handle);
-   zswap_pool_put(entry->pool);
+   if (entry->zeroflag)
+   atomic_dec(&zswap_zero_pages);
+   else {
+   zpool_free(entry->pool->zpool, entry->handle);
+   zswap_pool_put(entry->pool);
+   }
zswap_entry_cache_free(entry);
	atomic_dec(&zswap_stored_pages);
zswap_update_total_size();
@@ -877,6 +887,19 @@ static int zswap_shrink(void)
return ret;
 }
 
+static int zswap_is_page_zero_filled(void *ptr)
+{
+   unsigned int pos;
+   unsigned long *page;
+
+   page = (unsigned long *)ptr;
+   for (pos = 0; pos != PAGE_SIZE / sizeof(*page); pos++) {
+   if (page[pos])
+   return 0;
+   }
+   return 1;
+}
+
 /*
 * frontswap hooks
 **/
@@ -917,6 +940,15 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
goto reject;
}
 
+   src = kmap_atomic(page);
+   if (zswap_is_page_zero_filled(src)) {
+   kunmap_atomic(src);
+   entry->offset = offset;
+   entry->zeroflag = 1;
+   atomic_inc(&zswap_zero_pages);
+   goto insert_entry;
+   }
+
/* if entry is successfully added, it keeps the reference */
entry->pool = zswap_pool_current_get();
if (!entry->pool) {
@@ -927,7 +959,6 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
/* compress */
dst = get_cpu_var(zswap_dstmem);
tfm = *get_cpu_ptr(entry->pool->tfm);
-   src = kmap_atomic(page);
	ret = crypto_comp_compress(tfm, src, PAGE_SIZE, dst, &dlen);
kunmap_atomic(src);
put_cpu_ptr(entry->pool->tfm);
@@ -961,6 +992,7 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
entry->handle = handle;
entry->length = dlen;
 
+insert_entry:
/* map */
	spin_lock(&tree->lock);
do {
@@ -1013,6 +1045,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
}
spi

[PATCH 0/4] zswap: Optimize compressed pool memory utilization

2017-02-16 Thread Srividya Desireddy

Could you please review this patch series and let me know if any corrections
are needed in the patch-set?

-Srividya

On Fri, Aug 19, 2016 at 11:04 AM, Srividya Desireddy wrote:
> On 17 August 2016 at 18:08, Pekka Enberg  wrote:
>> On Wed, Aug 17, 2016 at 1:03 PM, Srividya Desireddy
>> wrote:
>>> This series of patches optimize the memory utilized by zswap for storing
>>> the swapped out pages.
>>>
>>> Zswap is a cache which compresses the pages that are being swapped out
>>> and stores them into a dynamically allocated RAM-based memory pool.
>>> Experiments have shown that around 10-15% of pages stored in zswap are
>>> duplicates which results in 10-12% more RAM required to store these
>>> duplicate compressed pages. Around 10-20% of pages stored in zswap
>>> are zero-filled pages, but these pages are handled as normal pages by
>>> compressing and allocating memory in the pool.
>>>
>>> The following patch-set optimizes memory utilized by zswap by avoiding the
>>> storage of duplicate pages and zero-filled pages in zswap compressed memory
>>> pool.
>>>
>>> Patch 1/4: zswap: Share zpool memory of duplicate pages
>>> This patch shares compressed pool memory of the duplicate pages. When a new
>>> page is requested for swap-out to zswap; search for an identical page in
>>> the pages already stored in zswap. If an identical page is found then share
>>> the compressed page data of the identical page with the new page. This
>>> avoids allocation of memory in the compressed pool for a duplicate page.
>>> This feature is tested on devices with 1GB, 2GB and 3GB RAM by executing
>>> performance test at low memory conditions. Around 15-20% of the pages
>>> swapped are duplicate of the pages existing in zswap, resulting in 15%
>>> saving of zswap memory pool when compared to the baseline version.
>>>
>>> Test Parameters        Baseline    With patch    Improvement
>>> Total RAM              955MB       955MB
>>> Available RAM          254MB       269MB         15MB
>>> Avg. App entry time    2.469sec    2.207sec      7%
>>> Avg. App close time    1.151sec    1.085sec      6%
>>> Apps launched in 1sec  5           12            7
>>>
>>> There is little overhead in zswap store function due to the search
>>> operation for finding duplicate pages. However, if duplicate page is
>>> found it saves the compression and allocation time of the page. The average
>>> overhead per zswap_frontswap_store() function call in the experimental
>>> device is 9us. There is no overhead in case of zswap_frontswap_load()
>>> operation.
>>>
>>> Patch 2/4: zswap: Enable/disable sharing of duplicate pages at runtime
>>> This patch adds a module parameter to enable or disable the sharing of
>>> duplicate zswap pages at runtime.
>>>
>>> Patch 3/4: zswap: Zero-filled pages handling
>>> This patch checks if a page to be stored in zswap is a zero-filled page
>>> (i.e. contents of the page are all zeros). If such page is found,
>>> compression and allocation of memory for the compressed page is avoided
>>> and instead the page is just marked as zero-filled page.
>>> Although, compressed size of a zero-filled page using LZO compressor is
>>> very less (52 bytes including zswap_header), this patch saves compression
>>> and allocation time during store operation and decompression time during
>>> zswap load operation for zero-filled pages. Experiments have shown that
>>> around 10-20% of pages stored in zswap are zero-filled.
>>
>> Aren't zero-filled pages already handled by patch 1/4 as their
>> contents match? So the overall memory saving is 52 bytes?
>>
>> - Pekka
>
> Thanks for the quick reply.
>
> Zero-filled pages can also be handled by patch 1/4, which searches for a
> duplicate page among the pages already stored in zswap. However, it has
> been observed that the average search time to identify a duplicate
> zero-filled page (using patch 1/4) is almost three times the cost of
> simply checking all pages for zero fill.
>
> Also, in case of patch 1/4, the zswap_frontswap_load() operation must
> decompress the stored zero-filled page. With patch 3/4,
> zswap_frontswap_load() just fills the page with zeros when loading a
> zero-filled page, which is faster than decompression.
>
> - Srividya


[PATCH 3/4] zswap: Zero-filled pages handling

2016-08-19 Thread Srividya Desireddy

On 17 August 2016 at 18:02, Pekka Enberg <penb...@kernel.org> wrote:
> On Wed, Aug 17, 2016 at 1:18 PM, Srividya Desireddy
> <srividya...@samsung.com> wrote:
>>> This patch adds a check in zswap_frontswap_store() to identify zero-filled
>>> page before compression of the page. If the page is a zero-filled page, set
>>> zswap_entry.zeroflag and skip the compression of the page and allocation
>>> of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
>>> set for the page in zswap_entry. If the flag is set, memset the page with
>>> zero. This saves the decompression time during load.
>>>
>>> The overall overhead caused due to zero-filled page check is very minimal
>>> when compared to the time saved by avoiding compression and allocation in
>>> case of zero-filled pages. The load time of a zero-filled page is reduced
>>> by 80% when compared to baseline.
>
> On Wed, Aug 17, 2016 at 3:25 PM, Pekka Enberg <penb...@kernel.org> wrote:
>> AFAICT, that's an overall improvement only if there are a lot of
>> zero-filled pages because it's just overhead for pages that we *need*
>> to compress, no? So I suppose the question is, are there a lot of
>> zero-filled pages that we need to swap and why is that the case?
>
> I suppose reading your cover letter would have been helpful before
> sending out my email:
>
> "Experiments have shown that around 10-15% of pages stored in zswap are
> duplicates which results in 10-12% more RAM required to store these
> duplicate compressed pages."
>
> But I still don't understand why we have zero-filled pages that we are
> swapping out.
>
> - Pekka

Zero-filled pages exist in memory because applications may initialize
allocated pages with zeros and never use them, or because the content
actually written to the pages during execution is itself zeros.
The existing page reclamation path in the kernel does not check for
zero-filled pages in the anonymous LRU lists before swapping them out.

- Srividya


[PATCH 3/4] zswap: Zero-filled pages handling

2016-08-19 Thread Srividya Desireddy
On 17 August 2016 at 17:55, Pekka Enberg <penb...@kernel.org> wrote:
> On Wed, Aug 17, 2016 at 1:18 PM, Srividya Desireddy
> <srividya...@samsung.com> wrote:
>> @@ -1314,6 +1347,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
>> }
>> spin_unlock(&tree->lock);
>>
>> +   if (entry->zeroflag) {
>> +   dst = kmap_atomic(page);
>> +   memset(dst, 0, PAGE_SIZE);
>> +   kunmap_atomic(dst);
>> +   goto freeentry;
>> +   }
>
> Don't we need the same thing in zswap_writeback_entry() for the
> ZSWAP_SWAPCACHE_NEW case?

Zero-filled pages are not compressed and stored in zpool memory. No zpool
handle is created for a zero-filled page, so such pages cannot be picked
for eviction/writeback to the swap device.

- Srividya
>
>> +
>> /* decompress */
>> dlen = PAGE_SIZE;
>> src = (u8 *)zpool_map_handle(entry->pool->zpool, entry->zhandle->handle,
>> @@ -1327,6 +1367,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
>> zpool_unmap_handle(entry->pool->zpool, entry->zhandle->handle);
>> BUG_ON(ret);
>
> - Pekka



[PATCH 0/4] zswap: Optimize compressed pool memory utilization

2016-08-18 Thread Srividya Desireddy
On 17 August 2016 at 18:08, Pekka Enberg <penb...@kernel.org> wrote:
> On Wed, Aug 17, 2016 at 1:03 PM, Srividya Desireddy
> <srividya...@samsung.com> wrote:
>> This series of patches optimize the memory utilized by zswap for storing
>> the swapped out pages.
>>
>> Zswap is a cache which compresses the pages that are being swapped out
>> and stores them into a dynamically allocated RAM-based memory pool.
>> Experiments have shown that around 10-15% of pages stored in zswap are
>> duplicates which results in 10-12% more RAM required to store these
>> duplicate compressed pages. Around 10-20% of pages stored in zswap
>> are zero-filled pages, but these pages are handled as normal pages by
>> compressing and allocating memory in the pool.
>>
>> The following patch-set optimizes memory utilized by zswap by avoiding the
>> storage of duplicate pages and zero-filled pages in zswap compressed memory
>> pool.
>>
>> Patch 1/4: zswap: Share zpool memory of duplicate pages
>> This patch shares compressed pool memory of the duplicate pages. When a new
>> page is requested for swap-out to zswap; search for an identical page in
>> the pages already stored in zswap. If an identical page is found then share
>> the compressed page data of the identical page with the new page. This
>> avoids allocation of memory in the compressed pool for a duplicate page.
>> This feature is tested on devices with 1GB, 2GB and 3GB RAM by executing
>> performance test at low memory conditions. Around 15-20% of the pages
>> swapped are duplicate of the pages existing in zswap, resulting in 15%
>> saving of zswap memory pool when compared to the baseline version.
>>
>> Test Parameters        Baseline    With patch    Improvement
>> Total RAM              955MB       955MB
>> Available RAM          254MB       269MB         15MB
>> Avg. App entry time    2.469sec    2.207sec      7%
>> Avg. App close time    1.151sec    1.085sec      6%
>> Apps launched in 1sec  5           12            7
>>
>> There is little overhead in zswap store function due to the search
>> operation for finding duplicate pages. However, if duplicate page is
>> found it saves the compression and allocation time of the page. The average
>> overhead per zswap_frontswap_store() function call in the experimental
>> device is 9us. There is no overhead in case of zswap_frontswap_load()
>> operation.
>>
>> Patch 2/4: zswap: Enable/disable sharing of duplicate pages at runtime
>> This patch adds a module parameter to enable or disable the sharing of
>> duplicate zswap pages at runtime.
>>
>> Patch 3/4: zswap: Zero-filled pages handling
>> This patch checks if a page to be stored in zswap is a zero-filled page
>> (i.e. contents of the page are all zeros). If such page is found,
>> compression and allocation of memory for the compressed page is avoided
>> and instead the page is just marked as zero-filled page.
>> Although, compressed size of a zero-filled page using LZO compressor is
>> very less (52 bytes including zswap_header), this patch saves compression
>> and allocation time during store operation and decompression time during
>> zswap load operation for zero-filled pages. Experiments have shown that
>> around 10-20% of pages stored in zswap are zero-filled.
>
> Aren't zero-filled pages already handled by patch 1/4 as their
> contents match? So the overall memory saving is 52 bytes?
>
> - Pekka

Thanks for the quick reply.

Zero-filled pages can also be handled by patch 1/4, which searches for a
duplicate page among the pages already stored in zswap. However, it has
been observed that the average search time to identify a duplicate
zero-filled page (using patch 1/4) is almost three times the cost of
simply checking all pages for zero fill.

Also, in case of patch 1/4, the zswap_frontswap_load() operation must
decompress the stored zero-filled page. With patch 3/4,
zswap_frontswap_load() just fills the page with zeros when loading a
zero-filled page, which is faster than decompression.

- Srividya


[PATCH 4/4] zswap: Update document with sharing of duplicate pages feature

2016-08-17 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 17 Aug 2016 14:34:41 +0530
Subject: [PATCH 4/4] zswap: Update document with sharing of duplicate pages
 feature

Update the zswap document with details of the duplicate swap-page sharing
feature. The usage of the zswap.same_page_sharing module parameter is
explained.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 Documentation/vm/zswap.txt |   18 ++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/vm/zswap.txt b/Documentation/vm/zswap.txt
index 89fff7d..cf11807 100644
--- a/Documentation/vm/zswap.txt
+++ b/Documentation/vm/zswap.txt
@@ -98,5 +98,23 @@ request is made for a page in an old zpool, it is uncompressed using its
 original compressor.  Once all pages are removed from an old zpool, the zpool
 and its compressor are freed.
 
+Some of the pages swapped to zswap have the same content as pages already
+stored in zswap. By default these duplicate pages are compressed and stored
+in zpool memory like any other page. The same page sharing feature instead
+lets duplicate pages share the same compressed zpool memory, reducing the
+zpool memory that zswap allocates to store compressed pages.
+
+The same page sharing feature is disabled by default and can be enabled at
+boot time by setting the "same_page_sharing" attribute to 1, e.g.
+zswap.same_page_sharing=1. It can also be enabled and disabled at runtime
+using the sysfs "same_page_sharing" attribute, e.g.
+
+echo 1 > /sys/module/zswap/parameters/same_page_sharing
+
+When zswap same page sharing is disabled at runtime, zswap stops sharing
+newly swapped-out duplicate pages. However, the existing duplicate pages
+keep sharing the compressed memory pool until they are swapped in or
+invalidated.
+
 A debugfs interface is provided for various statistic about pool size, number
 of pages stored, and various counters for the reasons pages are rejected.
-- 
1.7.9.5




[PATCH 2/4] zswap: Enable or disable sharing of duplicate pages at runtime

2016-08-17 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 17 Aug 2016 14:32:24 +0530
Subject: [PATCH 2/4] zswap: Enable or disable sharing of duplicate pages at
 runtime

Enable or disable the sharing of duplicate zswap pages at runtime. To
enable sharing of duplicate zswap pages, set the 'same_page_sharing' sysfs
attribute. It is disabled by default.

In zswap_frontswap_store(), duplicate pages are searched in zswap only
when same_page_sharing is set. When zswap same page sharing is disabled at
runtime, it stops sharing new duplicate pages. However, the existing
duplicate pages keep sharing the compressed memory pool until they are
faulted back in or invalidated.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c |   42 +-
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f7efede..ae39c77 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -116,6 +116,10 @@ module_param_cb(zpool, &zswap_zpool_param_ops, &zswap_zpool_type, 0644);
 static unsigned int zswap_max_pool_percent = 20;
 module_param_named(max_pool_percent, zswap_max_pool_percent, uint, 0644);
 
+/* Enable/disable zswap same page sharing feature (disabled by default) */
+static bool zswap_same_page_sharing;
+module_param_named(same_page_sharing, zswap_same_page_sharing, bool, 0644);
+
 /*
 * data structures
 **/
@@ -1180,20 +1184,22 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
 
src = kmap_atomic(page);
 
-   checksum = jhash2((const u32 *)src, PAGE_SIZE / 4, 17);
-   spin_lock(&tree->lock);
-   zhandle = zswap_same_page_search(tree, src, checksum);
-   if (zhandle) {
-   entry->offset = offset;
-   entry->zhandle = zhandle;
-   entry->pool = zhandle->pool;
-   entry->zhandle->ref_count++;
+   if (zswap_same_page_sharing) {
+   checksum = jhash2((const u32 *)src, PAGE_SIZE / 4, 17);
+   spin_lock(&tree->lock);
+   zhandle = zswap_same_page_search(tree, src, checksum);
+   if (zhandle) {
+   entry->offset = offset;
+   entry->zhandle = zhandle;
+   entry->pool = zhandle->pool;
+   entry->zhandle->ref_count++;
+   spin_unlock(&tree->lock);
+   kunmap_atomic(src);
+   atomic_inc(&zswap_duplicate_pages);
+   goto insert_entry;
+   }
 spin_unlock(&tree->lock);
-   kunmap_atomic(src);
-   atomic_inc(&zswap_duplicate_pages);
-   goto insert_entry;
 }
-   spin_unlock(&tree->lock);
 
/* if entry is successfully added, it keeps the reference */
entry->pool = zswap_pool_current_get();
@@ -1245,12 +1251,14 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
entry->zhandle = zhandle;
entry->zhandle->handle = handle;
entry->zhandle->length = dlen;
-   entry->zhandle->checksum = checksum;
-   entry->zhandle->pool = entry->pool;
-   spin_lock(&tree->lock);
-   ret = zswap_handle_rb_insert(&tree->zhandleroot, entry->zhandle,
+   if (zswap_same_page_sharing) {
+   entry->zhandle->checksum = checksum;
+   entry->zhandle->pool = entry->pool;
+   spin_lock(&tree->lock);
+   ret = zswap_handle_rb_insert(&tree->zhandleroot, entry->zhandle,
);
-   spin_unlock(&tree->lock);
+   spin_unlock(&tree->lock);
+   }
 
 insert_entry:
/* map */
-- 
1.7.9.5






[PATCH 3/4] zswap: Zero-filled pages handling

2016-08-17 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 17 Aug 2016 14:34:14 +0530
Subject: [PATCH 3/4] zswap: Zero-filled pages handling

This patch adds a check in zswap_frontswap_store() to identify a
zero-filled page before compressing it. If the page is zero-filled, set
zswap_entry.zeroflag and skip compressing the page and allocating memory
for it in the zpool. In zswap_frontswap_load(), check whether the zeroflag
is set in the page's zswap_entry; if so, memset the page with zero. This
saves the decompression time during load.

The overhead of the zero-filled page check is minimal compared to the
time saved by avoiding compression and allocation for zero-filled pages.
The load time of a zero-filled page is reduced by 80% compared to
baseline.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c |   58 ++
 1 file changed, 50 insertions(+), 8 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index ae39c77..d0c3f96 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -58,6 +58,9 @@ static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
  */
 static atomic_t zswap_duplicate_pages = ATOMIC_INIT(0);
 
+/* The number of zero filled pages swapped out to zswap */
+static atomic_t zswap_zero_pages = ATOMIC_INIT(0);
+
 /*
  * The statistics below are not protected from concurrent access for
  * performance reasons so they may not be a 100% accurate.  However,
@@ -172,6 +175,8 @@ struct zswap_handle {
  *be held, there is no reason to also make refcount atomic.
  * pool - the zswap_pool the entry's data is in
  * zhandle - pointer to struct zswap_handle
+ * zeroflag - the flag is set if the content of the page is filled with
+ *zeros
  */
 struct zswap_entry {
struct rb_node rbnode;
@@ -179,6 +184,7 @@ struct zswap_entry {
int refcount;
struct zswap_pool *pool;
struct zswap_handle *zhandle;
+   unsigned char zeroflag;
 };
 
 struct zswap_header {
@@ -269,6 +275,7 @@ static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
if (!entry)
return NULL;
entry->refcount = 1;
+   entry->zeroflag = 0;
entry->zhandle = NULL;
RB_CLEAR_NODE(&entry->rbnode);
return entry;
@@ -477,13 +484,17 @@ static bool zswap_handle_is_unique(struct zswap_handle *zhandle)
  */
 static void zswap_free_entry(struct zswap_entry *entry)
 {
-   if (zswap_handle_is_unique(entry->zhandle)) {
-   zpool_free(entry->pool->zpool, entry->zhandle->handle);
-   zswap_handle_cache_free(entry->zhandle);
-   zswap_pool_put(entry->pool);
-   } else {
-   entry->zhandle->ref_count--;
-   atomic_dec(&zswap_duplicate_pages);
+   if (entry->zeroflag)
+   atomic_dec(&zswap_zero_pages);
+   else {
+   if (zswap_handle_is_unique(entry->zhandle)) {
+   zpool_free(entry->pool->zpool, entry->zhandle->handle);
+   zswap_handle_cache_free(entry->zhandle);
+   zswap_pool_put(entry->pool);
+   } else {
+   entry->zhandle->ref_count--;
+   atomic_dec(&zswap_duplicate_pages);
+   }
 }
 zswap_entry_cache_free(entry);
 atomic_dec(&zswap_stored_pages);
@@ -1140,6 +1151,21 @@ static int zswap_shrink(void)
return ret;
 }
 
+static int zswap_is_page_zero_filled(void *ptr)
+{
+   unsigned int pos;
+   unsigned long *page;
+
+   page = (unsigned long *)ptr;
+
+   for (pos = 0; pos != PAGE_SIZE / sizeof(*page); pos++) {
+   if (page[pos])
+   return 0;
+   }
+
+   return 1;
+}
+
 /*
 * frontswap hooks
 **/
@@ -1183,6 +1209,13 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
}
 
src = kmap_atomic(page);
+   if (zswap_is_page_zero_filled(src)) {
+   kunmap_atomic(src);
+   entry->offset = offset;
+   entry->zeroflag = 1;
+   atomic_inc(&zswap_zero_pages);
+   goto insert_entry;
+   }
 
if (zswap_same_page_sharing) {
checksum = jhash2((const u32 *)src, PAGE_SIZE / 4, 17);
@@ -1314,6 +1347,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
}
spin_unlock(&tree->lock);
 
+   if (entry->zeroflag) {
+   dst = kmap_atomic(page);
+   memset(dst, 0, PAGE_SIZE);
+   kunmap_atomic(dst);
+   goto freeentry;
+   }
+
/* decompress */
dlen = PAGE_SIZE;
src = (u8 *)zpool_map_handle(entry->pool->zpool, entry->zhandle->handle,
@@ -1327,6 +1367,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,


[PATCH 1/4] zswap: Share zpool memory of duplicate pages

2016-08-17 Thread Srividya Desireddy
From: Srividya Desireddy <srividya...@samsung.com>
Date: Wed, 17 Aug 2016 14:31:01 +0530
Subject: [PATCH 1/4] zswap: Share zpool memory of duplicate pages

This patch shares the compressed pool memory of duplicate pages and reduces
compressed pool memory utilized by zswap.

For each page requested for swap-out to zswap, calculate 32-bit checksum of
the page. Search for duplicate pages by comparing the checksum of the new
page with existing pages. Compare the contents of the pages if checksum
matches. If the contents also match, then share the compressed data of the
existing page with the new page. Increment the reference count to check
the number of pages sharing the compressed page in zpool.

If a duplicate page is not found, treat the new page as a 'unique' page
in zswap. Compress the new page and store the compressed data in the zpool.
Insert the unique page into the red-black tree, which is ordered by the
32-bit checksum value of the page.

Signed-off-by: Srividya Desireddy <srividya...@samsung.com>
---
 mm/zswap.c |  265 
 1 file changed, 248 insertions(+), 17 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 275b22c..f7efede 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include <linux/jhash.h>
 
 /*
 * statistics
@@ -51,6 +52,13 @@ static u64 zswap_pool_total_size;
 static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
 
 /*
+ * The number of swapped out pages which are identified as duplicate
+ * to the existing zswap pages. Compression and storing of these pages
+ * is avoided.
+ */
+static atomic_t zswap_duplicate_pages = ATOMIC_INIT(0);
+
+/*
  * The statistics below are not protected from concurrent access for
  * performance reasons so they may not be a 100% accurate.  However,
  * they do provide useful information on roughly how many times a
@@ -123,6 +131,28 @@ struct zswap_pool {
 };
 
 /*
+ * struct zswap_handle
+ * This structure contains the metadata for tracking single zpool handle
+ * allocation.
+ *
+ * rbnode - links the zswap_handle into red-black tree
+ * checksum - 32-bit checksum value of the page swapped to zswap
+ * ref_count - number of pages sharing this handle
+ * length - the length in bytes of the compressed page data.
+ *  Needed during decompression.
+ * handle - zpool allocation handle that stores the compressed page data.
+ * pool - the zswap_pool the entry's data is in.
+ */
+struct zswap_handle {
+   struct rb_node rbnode;
+   u32 checksum;
+   u16 ref_count;
+   unsigned int length;
+   unsigned long handle;
+   struct zswap_pool *pool;
+};
+
+/*
  * struct zswap_entry
  *
  * This structure contains the metadata for tracking a single compressed
@@ -136,18 +166,15 @@ struct zswap_pool {
  *for the zswap_tree structure that contains the entry must
  *be held while changing the refcount.  Since the lock must
  *be held, there is no reason to also make refcount atomic.
- * length - the length in bytes of the compressed page data.  Needed during
- *  decompression
  * pool - the zswap_pool the entry's data is in
- * handle - zpool allocation handle that stores the compressed page data
+ * zhandle - pointer to struct zswap_handle
  */
 struct zswap_entry {
struct rb_node rbnode;
pgoff_t offset;
int refcount;
-   unsigned int length;
struct zswap_pool *pool;
-   unsigned long handle;
+   struct zswap_handle *zhandle;
 };
 
 struct zswap_header {
@@ -161,6 +188,8 @@ struct zswap_header {
  */
 struct zswap_tree {
struct rb_root rbroot;
+   struct rb_root zhandleroot;
+   void  *buffer;
spinlock_t lock;
 };
 
@@ -236,6 +265,7 @@ static struct zswap_entry *zswap_entry_cache_alloc(gfp_t gfp)
if (!entry)
return NULL;
entry->refcount = 1;
+   entry->zhandle = NULL;
RB_CLEAR_NODE(&entry->rbnode);
return entry;
 }
@@ -246,6 +276,39 @@ static void zswap_entry_cache_free(struct zswap_entry *entry)
 }
 
 /*
+* zswap handle functions
+**/
+static struct kmem_cache *zswap_handle_cache;
+
+static int __init zswap_handle_cache_create(void)
+{
+   zswap_handle_cache = KMEM_CACHE(zswap_handle, 0);
+   return zswap_handle_cache == NULL;
+}
+
+static void __init zswap_handle_cache_destroy(void)
+{
+   kmem_cache_destroy(zswap_handle_cache);
+}
+
+static struct zswap_handle *zswap_handle_cache_alloc(gfp_t gfp)
+{
+   struct zswap_handle *zhandle;
+
+   zhandle = kmem_cache_alloc(zswap_handle_cache, gfp);
+   if (!zhandle)
+   return NULL;
+   zhandle->ref_count = 1;
+   RB_CLEAR_NODE(&zhandle->rbnode);
+   return zhandle;
+}
+
+static void zswap_handle_cache_free(struct zswap_handle *zhandle)
+{
+   kmem_cache_free(zswap_handle_cache, zhandle);
+}


[PATCH 0/4] zswap: Optimize compressed pool memory utilization

2016-08-17 Thread Srividya Desireddy
This series of patches optimizes the memory utilized by zswap for storing
the swapped out pages.

Zswap is a cache which compresses the pages that are being swapped out
and stores them into a dynamically allocated RAM-based memory pool.
Experiments have shown that around 10-15% of pages stored in zswap are
duplicates which results in 10-12% more RAM required to store these
duplicate compressed pages. Around 10-20% of pages stored in zswap
are zero-filled pages, but these pages are handled as normal pages by
compressing and allocating memory in the pool.

The following patch-set optimizes memory utilized by zswap by avoiding the
storage of duplicate pages and zero-filled pages in zswap compressed memory
pool.

Patch 1/4: zswap: Share zpool memory of duplicate pages
This patch shares the compressed pool memory of duplicate pages. When a new
page is requested for swap-out to zswap, search for an identical page among
the pages already stored in zswap. If an identical page is found, share the
compressed page data of the identical page with the new page. This avoids
allocation of memory in the compressed pool for a duplicate page.
This feature was tested on devices with 1GB, 2GB and 3GB RAM by executing a
performance test under low memory conditions. Around 15-20% of the pages
swapped out are duplicates of pages existing in zswap, resulting in a 15%
saving of zswap memory pool compared to the baseline version.

Test Parameters          Baseline    With patch    Improvement
Total RAM                955MB       955MB
Available RAM            254MB       269MB         15MB
Avg. App entry time      2.469sec    2.207sec      7%
Avg. App close time      1.151sec    1.085sec      6%
Apps launched in 1sec    5           12            7

There is a small overhead in the zswap store function due to the search
operation for finding duplicate pages. However, if a duplicate page is
found, the compression and allocation time for that page is saved. The
average overhead per zswap_frontswap_store() call on the experimental
device is 9us. There is no overhead in the zswap_frontswap_load()
operation.

Patch 2/4: zswap: Enable/disable sharing of duplicate pages at runtime
This patch adds a module parameter to enable or disable the sharing of
duplicate zswap pages at runtime.

Patch 3/4: zswap: Zero-filled pages handling
This patch checks if a page to be stored in zswap is a zero-filled page
(i.e. the contents of the page are all zeros). If such a page is found,
compression and memory allocation for the compressed page are avoided and
the page is simply marked as zero-filled.
Although the compressed size of a zero-filled page with the LZO compressor
is very small (52 bytes including zswap_header), this patch saves the
compression and allocation time during the store operation and the
decompression time during the zswap load operation for zero-filled pages.
Experiments have shown that around 10-20% of pages stored in zswap are
zero-filled.

Patch 4/4: Update document with sharing of duplicate pages feature
In this patch, the zswap document is updated with information on the
sharing of duplicate swap pages feature.

Documentation/vm/zswap.txt |   18 +++
mm/zswap.c |  315 +---
2 files changed, 316 insertions(+), 17 deletions(-)
