Re: [PATCH] hugetlbfs: add O_TMPFILE support

2019-10-23 Thread Piotr Sarna

On 10/23/19 4:55 AM, Mike Kravetz wrote:

On 10/22/19 12:09 AM, Piotr Sarna wrote:

On 10/21/19 7:17 PM, Mike Kravetz wrote:

On 10/15/19 4:37 PM, Mike Kravetz wrote:

On 10/15/19 3:50 AM, Michal Hocko wrote:

On Tue 15-10-19 11:01:12, Piotr Sarna wrote:

With hugetlbfs, a common pattern for mapping anonymous huge pages
is to create a temporary file first.


Really? I though that this is normally done by shmget(SHM_HUGETLB) or
mmap(MAP_HUGETLB). Or maybe I misunderstood your definition on anonymous
huge pages.


Currently libraries like
libhugetlbfs and seastar create these with a standard mkstemp+unlink
trick,


I would guess that much of libhugetlbfs was writen before MAP_HUGETLB
was implemented.  So, that is why it does not make (more) use of that
option.

The implementation looks to be straight forward.  However, I really do
not want to add more functionality to hugetlbfs unless there is specific
use case that needs it.


It was not my intention to shut down discussion on this patch.  I was just
asking if there was a (new) use case for such a change.  I am checking with
our DB team as I seem to remember them using the create/unlink approach for
hugetlbfs in one of their upcoming models.

Is there a new use case you were thinking about?



Oh, I indeed thought it was a shutdown. The use case I was thinking about was 
in Seastar, where the create+unlink trick is used for creating temporary files 
(in a generic way, not only for hugetlbfs). I simply intended to migrate it to 
a newer approach - O_TMPFILE. However,
for the specific case of hugetlbfs it indeed makes more sense to skip it and 
use mmap's MAP_HUGETLB, so perhaps it's not worth it to patch a perfectly good 
and stable file system just to provide a semi-useful flag support. My 
implementation of tmpfile for hugetlbfs is straightforward indeed, but the 
MAP_HUGETLB argument made me realize that it may not be worth the trouble - 
especially that MAP_HUGETLB is here since 2.6 and O_TMPFILE was introduced 
around v3.11, so the mmap way looks more portable.

tldr: I'd be very happy to get my patch accepted, but the use case I had in 
mind can be easily solved with MAP_HUGETLB, so I don't insist.


If you really are after something like 'anonymous memory' for Seastar,
then MAP_HUGETLB would be the better approach.


Just to clarify - my original goal was to migrate Seastar's temporary 
file implementation (which is fs-agnostic, based on descriptors) from 
the current create+unlink to O_TMPFILE, for robustness. One of the 
internal usages of this generic mechanism was to create a tmpfile on 
hugetlbfs and that's why I sent this patch. However, this particular 
internal usage can be easily switched to more portable MAP_HUGETLB, 
which will also mean that the generic tmpfile implementation will not be 
used internally for hugetlbfs anymore.


There *may* still be value in being able to support hugetlbfs once 
Seastar's tmpfile implementation migrates to O_TMPFILE, since the 
library offers creating temporary files in its public API, but there's 
no immediate use case I can apply it to.




I'm still checking with Oracle DB team as they may have a use for O_TMPFILE
in an upcoming release.  In their use case, they want an open fd to work with.
If it looks like they will proceed in this direction, we can work to get
your patch moved forward.

Thanks,


Great, if it turns out that my patch helps anyone with their O_TMPFILE 
usage, I'd be very glad to see it merged.




Re: [PATCH] hugetlbfs: add O_TMPFILE support

2019-10-22 Thread Piotr Sarna

On 10/21/19 7:17 PM, Mike Kravetz wrote:

On 10/15/19 4:37 PM, Mike Kravetz wrote:

On 10/15/19 3:50 AM, Michal Hocko wrote:

On Tue 15-10-19 11:01:12, Piotr Sarna wrote:

With hugetlbfs, a common pattern for mapping anonymous huge pages
is to create a temporary file first.


Really? I though that this is normally done by shmget(SHM_HUGETLB) or
mmap(MAP_HUGETLB). Or maybe I misunderstood your definition on anonymous
huge pages.


Currently libraries like
libhugetlbfs and seastar create these with a standard mkstemp+unlink
trick,


I would guess that much of libhugetlbfs was writen before MAP_HUGETLB
was implemented.  So, that is why it does not make (more) use of that
option.

The implementation looks to be straight forward.  However, I really do
not want to add more functionality to hugetlbfs unless there is specific
use case that needs it.


It was not my intention to shut down discussion on this patch.  I was just
asking if there was a (new) use case for such a change.  I am checking with
our DB team as I seem to remember them using the create/unlink approach for
hugetlbfs in one of their upcoming models.

Is there a new use case you were thinking about?



Oh, I indeed thought it was a shutdown. The use case I was thinking 
about was in Seastar, where the create+unlink trick is used for creating 
temporary files (in a generic way, not only for hugetlbfs). I simply 
intended to migrate it to a newer approach - O_TMPFILE. However,
for the specific case of hugetlbfs it indeed makes more sense to skip it 
and use mmap's MAP_HUGETLB, so perhaps it's not worth it to patch a 
perfectly good and stable file system just to provide a semi-useful flag 
support. My implementation of tmpfile for hugetlbfs is straightforward 
indeed, but the MAP_HUGETLB argument made me realize that it may not be 
worth the trouble - especially that MAP_HUGETLB is here since 2.6 and 
O_TMPFILE was introduced around v3.11, so the mmap way looks more portable.


tldr: I'd be very happy to get my patch accepted, but the use case I had 
in mind can be easily solved with MAP_HUGETLB, so I don't insist.


[PATCH] hugetlbfs: add O_TMPFILE support

2019-10-15 Thread Piotr Sarna
With hugetlbfs, a common pattern for mapping anonymous huge pages
is to create a temporary file first. Currently libraries like
libhugetlbfs and seastar create these with a standard mkstemp+unlink
trick, but it would be more robust to be able to simply pass
the O_TMPFILE flag to open(). O_TMPFILE is already supported by several
file systems like ext4 and xfs. The implementation simply uses the existing
d_tmpfile utility function to instantiate the dcache entry for the file.

Tested manually by successfully creating a temporary file by opening
it with (O_TMPFILE|O_RDWR) on mounted hugetlbfs and successfully
mapping 2M huge pages with it. Without the patch, trying to open
a file with O_TMPFILE results in -ENOSUP.

Signed-off-by: Piotr Sarna 
---
 fs/hugetlbfs/inode.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 1dcc57189382..277b7d231db8 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -815,8 +815,11 @@ static struct inode *hugetlbfs_get_inode(struct 
super_block *sb,
 /*
  * File creation. Allocate an inode, and we're done..
  */
-static int hugetlbfs_mknod(struct inode *dir,
-   struct dentry *dentry, umode_t mode, dev_t dev)
+static int do_hugetlbfs_mknod(struct inode *dir,
+   struct dentry *dentry,
+   umode_t mode,
+   dev_t dev,
+   bool tmpfile)
 {
struct inode *inode;
int error = -ENOSPC;
@@ -824,13 +827,22 @@ static int hugetlbfs_mknod(struct inode *dir,
inode = hugetlbfs_get_inode(dir->i_sb, dir, mode, dev);
if (inode) {
dir->i_ctime = dir->i_mtime = current_time(dir);
-   d_instantiate(dentry, inode);
+   if (tmpfile)
+   d_tmpfile(dentry, inode);
+   else
+   d_instantiate(dentry, inode);
dget(dentry);   /* Extra count - pin the dentry in core */
error = 0;
}
return error;
 }
 
+static int hugetlbfs_mknod(struct inode *dir,
+   struct dentry *dentry, umode_t mode, dev_t dev)
+{
+   return do_hugetlbfs_mknod(dir, dentry, mode, dev, false);
+}
+
 static int hugetlbfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t 
mode)
 {
int retval = hugetlbfs_mknod(dir, dentry, mode | S_IFDIR, 0);
@@ -844,6 +856,12 @@ static int hugetlbfs_create(struct inode *dir, struct 
dentry *dentry, umode_t mo
return hugetlbfs_mknod(dir, dentry, mode | S_IFREG, 0);
 }
 
+static int hugetlbfs_tmpfile(struct inode *dir,
+   struct dentry *dentry, umode_t mode)
+{
+   return do_hugetlbfs_mknod(dir, dentry, mode | S_IFREG, 0, true);
+}
+
 static int hugetlbfs_symlink(struct inode *dir,
struct dentry *dentry, const char *symname)
 {
@@ -1102,6 +1120,7 @@ static const struct inode_operations 
hugetlbfs_dir_inode_operations = {
.mknod  = hugetlbfs_mknod,
.rename = simple_rename,
.setattr= hugetlbfs_setattr,
+   .tmpfile= hugetlbfs_tmpfile,
 };
 
 static const struct inode_operations hugetlbfs_inode_operations = {
-- 
2.21.0



[PATCH] mm: zcache: zcache_cleancache_flush_fs fix

2013-08-06 Thread Piotr Sarna
This patch fixes "mm: zcache: core functions added" patch,
available at https://lkml.org/lkml/2013/7/20/90.
It regards incorrect implementation of zcache_cleancache_flush_fs().

Function above should be effective only if cleancache pool referred
by pool_id is valid. This issue is checked by testing whether zpool
points to NULL.

Unfortunately, if filesystem mount fails, such pool is never created
and fs/super.c calls cleancache_invalidate_fs() function with pool_id
parameter set to -1. This results in assigning zpool with pools[-1],
which causes zpool to be not NULL and thus whole function hangs on
uninitialized read-write lock.

To prevent that behaviour, pool_id should be checked for being positive
before assigning zpool variable with pools[pool_id].

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 mm/zcache.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/zcache.c b/mm/zcache.c
index a2408e8..7e6d2e7 100644
--- a/mm/zcache.c
+++ b/mm/zcache.c
@@ -600,8 +600,12 @@ static void zcache_cleancache_flush_fs(int pool_id)
struct zcache_rb_entry *entry = NULL;
struct rb_node *node;
unsigned long flags1, flags2;
-   struct zcache_pool *zpool = zcache.pools[pool_id];
+   struct zcache_pool *zpool;
+
+   if (pool_id < 0)
+   return;
 
+   zpool = zcache.pools[pool_id];
if (!zpool)
return;
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] mm: zcache: zcache_cleancache_flush_fs fix

2013-08-06 Thread Piotr Sarna
This patch fixes mm: zcache: core functions added patch,
available at https://lkml.org/lkml/2013/7/20/90.
It regards incorrect implementation of zcache_cleancache_flush_fs().

Function above should be effective only if cleancache pool referred
by pool_id is valid. This issue is checked by testing whether zpool
points to NULL.

Unfortunately, if filesystem mount fails, such pool is never created
and fs/super.c calls cleancache_invalidate_fs() function with pool_id
parameter set to -1. This results in assigning zpool with pools[-1],
which causes zpool to be not NULL and thus whole function hangs on
uninitialized read-write lock.

To prevent that behaviour, pool_id should be checked for being positive
before assigning zpool variable with pools[pool_id].

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 mm/zcache.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/zcache.c b/mm/zcache.c
index a2408e8..7e6d2e7 100644
--- a/mm/zcache.c
+++ b/mm/zcache.c
@@ -600,8 +600,12 @@ static void zcache_cleancache_flush_fs(int pool_id)
struct zcache_rb_entry *entry = NULL;
struct rb_node *node;
unsigned long flags1, flags2;
-   struct zcache_pool *zpool = zcache.pools[pool_id];
+   struct zcache_pool *zpool;
+
+   if (pool_id  0)
+   return;
 
+   zpool = zcache.pools[pool_id];
if (!zpool)
return;
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] ext4: fix handling of nodelalloc parameter

2013-08-02 Thread Piotr Sarna
Commit 26092bf ("ext4: use a table-driven handler for mount options")
introduced buggy handling of nodelalloc parameter in mount command.

After explicitly using delalloc or nodelalloc parameter in mount command,
MOPT_EXPLICIT flag is set. After that, a test ensures that "data=journal"
and "delalloc" parameters are not simultaneously activated.
Unluckily, the mentioned test reports a bug in both situations:
- "data=journal,delalloc"
- "data=journal,nodelalloc"
whereas the second one is perfectly legal and acceptable.

A simple solution to this problem is in setting EXPLICIT_DELALLOC flag
properly. This patch ensures that EXPLICIT_DELALLOC flag is set only
if "delalloc" parameter was used, and not set in case of "nodelalloc".

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 fs/ext4/super.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 94cc84d..10f9bb0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1467,7 +1467,8 @@ static int handle_mount_opt(struct super_block *sb, char 
*opt, int token,
if (args->from && (m->flags & MOPT_GTE0) && (arg < 0))
return -1;
if (m->flags & MOPT_EXPLICIT)
-   set_opt2(sb, EXPLICIT_DELALLOC);
+   if (m->flags & MOPT_SET)
+   set_opt2(sb, EXPLICIT_DELALLOC);
if (m->flags & MOPT_CLEAR_ERR)
clear_opt(sb, ERRORS_MASK);
if (token == Opt_noquota && sb_any_quota_loaded(sb)) {
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] ext4: improve mount/remount error handling

2013-08-02 Thread Piotr Sarna
Commit 5688978 ("ext4: improve handling of conflicting mount options")
introduced incorrect messages shown while choosing wrong mount options.

Firstly, both cases of incorrect mount options, "data=journal,delalloc"
and "data=journal,dioread_nolock" result in the same error message.

Secondly, the problem above isn't solved for remount option: the mismatched
parameter is simply ignored. Moreover, ext4_msg states that remount
with options "data=journal,delalloc" succeeded, which is not true.

To fix it up, I added a simple check after parse_options() call to ensure
that data=journal and delalloc/dioread_nolock parameters are not present
at the same time.

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 fs/ext4/super.c |   17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 10f9bb0..ab0b23b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3452,7 +3452,7 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)
}
if (test_opt(sb, DIOREAD_NOLOCK)) {
ext4_msg(sb, KERN_ERR, "can't mount with "
-"both data=journal and delalloc");
+"both data=journal and dioread_nolock");
goto failed_mount;
}
if (test_opt(sb, DELALLOC))
@@ -4653,6 +4653,21 @@ static int ext4_remount(struct super_block *sb, int 
*flags, char *data)
goto restore_opts;
}
 
+   if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) {
+   if (test_opt2(sb, EXPLICIT_DELALLOC)) {
+   ext4_msg(sb, KERN_ERR, "can't mount with "
+"both data=journal and delalloc");
+   err = -EINVAL;
+   goto restore_opts;
+   }
+   if (test_opt(sb, DIOREAD_NOLOCK)) {
+   ext4_msg(sb, KERN_ERR, "can't mount with "
+"both data=journal and dioread_nolock");
+   err = -EINVAL;
+   goto restore_opts;
+   }
+   }
+
if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
ext4_abort(sb, "Abort forced by user");
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] ext4: fix handling of nodelalloc parameter

2013-08-02 Thread Piotr Sarna
Commit 26092bf (ext4: use a table-driven handler for mount options)
introduced buggy handling of nodelalloc parameter in mount command.

After explicitly using delalloc or nodelalloc parameter in mount command,
MOPT_EXPLICIT flag is set. After that, a test ensures that data=journal
and delalloc parameters are not simultaneously activated.
Unluckily, the mentioned test reports a bug in both situations:
- data=journal,delalloc
- data=journal,nodelalloc
whereas the second one is perfectly legal and acceptable.

A simple solution to this problem is in setting EXPLICIT_DELALLOC flag
properly. This patch ensures that EXPLICIT_DELALLOC flag is set only
if delalloc parameter was used, and not set in case of nodelalloc.

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 fs/ext4/super.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 94cc84d..10f9bb0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1467,7 +1467,8 @@ static int handle_mount_opt(struct super_block *sb, char 
*opt, int token,
if (args-from  (m-flags  MOPT_GTE0)  (arg  0))
return -1;
if (m-flags  MOPT_EXPLICIT)
-   set_opt2(sb, EXPLICIT_DELALLOC);
+   if (m-flags  MOPT_SET)
+   set_opt2(sb, EXPLICIT_DELALLOC);
if (m-flags  MOPT_CLEAR_ERR)
clear_opt(sb, ERRORS_MASK);
if (token == Opt_noquota  sb_any_quota_loaded(sb)) {
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] ext4: improve mount/remount error handling

2013-08-02 Thread Piotr Sarna
Commit 5688978 (ext4: improve handling of conflicting mount options)
introduced incorrect messages shown while choosing wrong mount options.

Firstly, both cases of incorrect mount options, data=journal,delalloc
and data=journal,dioread_nolock result in the same error message.

Secondly, the problem above isn't solved for remount option: the mismatched
parameter is simply ignored. Moreover, ext4_msg states that remount
with options data=journal,delalloc succeeded, which is not true.

To fix it up, I added a simple check after parse_options() call to ensure
that data=journal and delalloc/dioread_nolock parameters are not present
at the same time.

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 fs/ext4/super.c |   17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 10f9bb0..ab0b23b 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3452,7 +3452,7 @@ static int ext4_fill_super(struct super_block *sb, void 
*data, int silent)
}
if (test_opt(sb, DIOREAD_NOLOCK)) {
ext4_msg(sb, KERN_ERR, can't mount with 
-both data=journal and delalloc);
+both data=journal and dioread_nolock);
goto failed_mount;
}
if (test_opt(sb, DELALLOC))
@@ -4653,6 +4653,21 @@ static int ext4_remount(struct super_block *sb, int 
*flags, char *data)
goto restore_opts;
}
 
+   if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) {
+   if (test_opt2(sb, EXPLICIT_DELALLOC)) {
+   ext4_msg(sb, KERN_ERR, can't mount with 
+both data=journal and delalloc);
+   err = -EINVAL;
+   goto restore_opts;
+   }
+   if (test_opt(sb, DIOREAD_NOLOCK)) {
+   ext4_msg(sb, KERN_ERR, can't mount with 
+both data=journal and dioread_nolock);
+   err = -EINVAL;
+   goto restore_opts;
+   }
+   }
+
if (sbi-s_mount_flags  EXT4_MF_FS_ABORTED)
ext4_abort(sb, Abort forced by user);
 
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] staging: zram: add Crypto API support

2013-07-31 Thread Piotr Sarna
On 07/31/2013 02:49 PM, Greg KH wrote:
> On Wed, Jul 31, 2013 at 12:16:36PM +0200, Piotr Sarna wrote:
>> Hi,
>>
>> On 07/30/2013 03:53 PM, Greg KH wrote:
>>> On Tue, Jul 30, 2013 at 02:30:48PM +0200, Piotr Sarna wrote:
>>>> Current version of zram does not allow any substitution of a default
>>>> compression algorithm. Therefore, I decided to change the existing
>>>> implementation of page compression by adding Crypto API compability.
>>>
>>> As I have stated before, I want this code to get merged out of the
>>> drivers/staging/ area _before_ any new features get added to it.  People
>>> are taking too much time adding new stuff, like this, and no one seems
>>> to be working to get it merged to the proper place anymore.
>>>
>>
>> OK, but we actually need those features in order to test zram
>> against other, competitive solutions - and then decide whether
>> and how to merge it out of /drivers/staging.
> 
> Where is another "competitive solution" in the kernel?
> 
> And you are implying that as-is, zram isn't acceptable, right?  If so, I
> should just delete it now as I was originally told otherwise, that's why
> I merged it :(
> 
>>> So again, I'm going to have to say no to a new feature here, sorry.
>>> zcache, zram, and zsmalloc need to get cleaned up and merged out of
>>> staging before anything new can be added to them.
>>>
>>> Or, if no one is working on that, I guess I can just delete them,
>>> right?...
>>>
>>
>> Is there any official TODO list of cleaning up and merging out zram?
> 
> As it came from the "zswap" code, there doesn't seem to be one :(
> 
> The code is over 2 years old now, with no percieved movement out of
> staging.  If it doesn't get fixed up in the next kernel version or so, I
> will have to remove it entirely.
> 
> sorry,
> 
> greg k-h
> 

"Another competitive solutions" I thought of were zswap and zcache.

On one hand, during my tests (of compressed swap feature), zram
turned out to be no faster than zswap/zcache (but no slower either).

On the other, zram happens to have some more, perhaps reasonable
use cases (described in zram.txt), e.g. mounting it as /tmp.

So far, I haven't tested those other features of zram,
so I can't possibly say whether it should be considered for a merge-out
or kept in staging for now.

Regards,
Piotr Sarna
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] staging: zram: add Crypto API support

2013-07-31 Thread Piotr Sarna
Hi,

On 07/30/2013 03:53 PM, Greg KH wrote:
> On Tue, Jul 30, 2013 at 02:30:48PM +0200, Piotr Sarna wrote:
>> Current version of zram does not allow any substitution of a default
>> compression algorithm. Therefore, I decided to change the existing
>> implementation of page compression by adding Crypto API compability.
> 
> As I have stated before, I want this code to get merged out of the
> drivers/staging/ area _before_ any new features get added to it.  People
> are taking too much time adding new stuff, like this, and no one seems
> to be working to get it merged to the proper place anymore.
> 

OK, but we actually need those features in order to test zram
against other, competitive solutions - and then decide whether
and how to merge it out of /drivers/staging.

> So again, I'm going to have to say no to a new feature here, sorry.
> zcache, zram, and zsmalloc need to get cleaned up and merged out of
> staging before anything new can be added to them.
> 
> Or, if no one is working on that, I guess I can just delete them,
> right?...
> 

Is there any official TODO list of cleaning up and merging out zram?

Regards,
Piotr Sarna
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] staging: zram: add Crypto API support

2013-07-31 Thread Piotr Sarna
Hi,

On 07/30/2013 03:53 PM, Greg KH wrote:
 On Tue, Jul 30, 2013 at 02:30:48PM +0200, Piotr Sarna wrote:
 Current version of zram does not allow any substitution of a default
 compression algorithm. Therefore, I decided to change the existing
 implementation of page compression by adding Crypto API compability.
 
 As I have stated before, I want this code to get merged out of the
 drivers/staging/ area _before_ any new features get added to it.  People
 are taking too much time adding new stuff, like this, and no one seems
 to be working to get it merged to the proper place anymore.
 

OK, but we actually need those features in order to test zram
against other, competitive solutions - and then decide whether
and how to merge it out of /drivers/staging.

 So again, I'm going to have to say no to a new feature here, sorry.
 zcache, zram, and zsmalloc need to get cleaned up and merged out of
 staging before anything new can be added to them.
 
 Or, if no one is working on that, I guess I can just delete them,
 right?...
 

Is there any official TODO list of cleaning up and merging out zram?

Regards,
Piotr Sarna
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] staging: zram: add Crypto API support

2013-07-31 Thread Piotr Sarna
On 07/31/2013 02:49 PM, Greg KH wrote:
 On Wed, Jul 31, 2013 at 12:16:36PM +0200, Piotr Sarna wrote:
 Hi,

 On 07/30/2013 03:53 PM, Greg KH wrote:
 On Tue, Jul 30, 2013 at 02:30:48PM +0200, Piotr Sarna wrote:
 Current version of zram does not allow any substitution of a default
 compression algorithm. Therefore, I decided to change the existing
 implementation of page compression by adding Crypto API compability.

 As I have stated before, I want this code to get merged out of the
 drivers/staging/ area _before_ any new features get added to it.  People
 are taking too much time adding new stuff, like this, and no one seems
 to be working to get it merged to the proper place anymore.


 OK, but we actually need those features in order to test zram
 against other, competitive solutions - and then decide whether
 and how to merge it out of /drivers/staging.
 
 Where is another competitive solution in the kernel?
 
 And you are implying that as-is, zram isn't acceptable, right?  If so, I
 should just delete it now as I was originally told otherwise, that's why
 I merged it :(
 
 So again, I'm going to have to say no to a new feature here, sorry.
 zcache, zram, and zsmalloc need to get cleaned up and merged out of
 staging before anything new can be added to them.

 Or, if no one is working on that, I guess I can just delete them,
 right?...


 Is there any official TODO list of cleaning up and merging out zram?
 
 As it came from the zswap code, there doesn't seem to be one :(
 
 The code is over 2 years old now, with no percieved movement out of
 staging.  If it doesn't get fixed up in the next kernel version or so, I
 will have to remove it entirely.
 
 sorry,
 
 greg k-h
 

Another competitive solutions I thought of were zswap and zcache.

On one hand, during my tests (of compressed swap feature), zram
turned out to be no faster than zswap/zcache (but no slower either).

On the other, zram happens to have some more, perhaps reasonable
use cases (described in zram.txt), e.g. mounting it as /tmp.

So far, I haven't tested those other features of zram,
so I can't possibly say whether it should be considered for a merge-out
or kept in staging for now.

Regards,
Piotr Sarna
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 2/2] staging: zram: add per-cpu support to Crypto

2013-07-30 Thread Piotr Sarna
Since original zram code did not implement any per-cpu operations,
my previous patch (staging: zram: add Crypto API support) did not
include them either.

This patch complements the first one with per-cpu support for Crypto,
allocating tfms buffer separately for each online processor.
Changes are based on zswap and zcache per-cpu code.

Basic tests (concurrent writing several 10-40MB chunks to zram) performed
on an ARM-based EXYNOS4412 Quad-Core showed that per-cpu code provides
noticeable time saving, ranging between 30-40% for LZO and LZ4 compressors.
Sample data (LZO): writing 160MB, 40MB per thread took 0.60s with per-cpu
code included and approximately 0.80s without per-cpu support.

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 drivers/staging/zram/zram_drv.c |  146 +--
 drivers/staging/zram/zram_drv.h |1 -
 2 files changed, 125 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index d6f1f67..3dd5085 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -46,7 +47,7 @@ static unsigned int num_devices = 1;
 
 /* Cryptographic API features */
 static char *zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
-static struct crypto_comp *zram_comp_tfm;
+static struct crypto_comp * __percpu *zram_comp_pcpu_tfms;
 
 enum comp_op {
ZRAM_COMPOP_COMPRESS,
@@ -59,7 +60,7 @@ static int zram_comp_op(enum comp_op op, const u8 *src, 
unsigned int slen,
struct crypto_comp *tfm;
int ret;
 
-   tfm = zram_comp_tfm;
+   tfm = *per_cpu_ptr(zram_comp_pcpu_tfms, get_cpu());
switch (op) {
case ZRAM_COMPOP_COMPRESS:
ret = crypto_comp_compress(tfm, src, slen, dst, dlen);
@@ -70,6 +71,7 @@ static int zram_comp_op(enum comp_op op, const u8 *src, 
unsigned int slen,
default:
ret = -EINVAL;
}
+   put_cpu();
 
return ret;
 }
@@ -87,9 +89,9 @@ static int __init zram_comp_init(void)
}
pr_info("using %s compressor\n", zram_compressor);
 
-   /* alloc transform */
-   zram_comp_tfm = crypto_alloc_comp(zram_compressor, 0, 0);
-   if (!zram_comp_tfm)
+   /* alloc percpu transforms */
+   zram_comp_pcpu_tfms = alloc_percpu(struct crypto_comp *);
+   if (!zram_comp_pcpu_tfms)
return -ENOMEM;
 
return 0;
@@ -97,8 +99,110 @@ static int __init zram_comp_init(void)
 
 static inline void zram_comp_exit(void)
 {
-   if (zram_comp_tfm)
-   crypto_free_comp(zram_comp_tfm);
+   /* free percpu transforms */
+   if (zram_comp_pcpu_tfms)
+   free_percpu(zram_comp_pcpu_tfms);
+}
+
+
+/* Crypto API features: percpu code */
+#define ZRAM_DSTMEM_ORDER 1
+static DEFINE_PER_CPU(u8 *, zram_dstmem);
+
+static int zram_comp_cpu_up(int cpu)
+{
+   struct crypto_comp *tfm;
+
+   tfm = crypto_alloc_comp(zram_compressor, 0, 0);
+   if (IS_ERR(tfm))
+   return NOTIFY_BAD;
+   *per_cpu_ptr(zram_comp_pcpu_tfms, cpu) = tfm;
+   return NOTIFY_OK;
+}
+
+static void zram_comp_cpu_down(int cpu)
+{
+   struct crypto_comp *tfm;
+
+   tfm = *per_cpu_ptr(zram_comp_pcpu_tfms, cpu);
+   crypto_free_comp(tfm);
+   *per_cpu_ptr(zram_comp_pcpu_tfms, cpu) = NULL;
+}
+
+static int zram_cpu_notifier(struct notifier_block *nb,
+   unsigned long action, void *pcpu)
+{
+   int ret;
+   int cpu = (long) pcpu;
+
+   switch (action) {
+   case CPU_UP_PREPARE:
+   ret = zram_comp_cpu_up(cpu);
+   if (ret != NOTIFY_OK) {
+   pr_err("zram: can't allocate compressor xform\n");
+   return ret;
+   }
+   per_cpu(zram_dstmem, cpu) = (void *)__get_free_pages(
+   GFP_KERNEL | __GFP_REPEAT, ZRAM_DSTMEM_ORDER);
+   break;
+   case CPU_DEAD:
+   case CPU_UP_CANCELED:
+   zram_comp_cpu_down(cpu);
+   free_pages((unsigned long) per_cpu(zram_dstmem, cpu),
+   ZRAM_DSTMEM_ORDER);
+   per_cpu(zram_dstmem, cpu) = NULL;
+   break;
+   default:
+   break;
+   }
+   return NOTIFY_OK;
+}
+
+static struct notifier_block zram_cpu_notifier_block = {
+   .notifier_call = zram_cpu_notifier
+};
+
+/* Helper function releasing tfms from online cpus */
+static inline void zram_comp_cpus_down(void)
+{
+   int cpu;
+
+   get_online_cpus();
+   for_each_online_cpu(cpu) {
+   void *pcpu = (void *)(long)cpu;
+   zram_cpu_notifier(_cpu_notifier_block,
+ CPU_UP_CANCELED, pcpu);
+   }
+   put_online_cpus();
+}
+
+static int zram_cpu_init(void

[PATCH 1/2] staging: zram: add Crypto API support

2013-07-30 Thread Piotr Sarna
Current version of zram does not allow any substitution of a default
compression algorithm. Therefore, I decided to change the existing
implementation of page compression by adding Crypto API compability.

All direct calls to lzo1x compression/decompression methods are now
replaced by calls consistent with Crypto. Also, I removed "workmem"
field from struct zram_meta, as it was there for lzo1x purposes only
and is no longer needed. Finally, I added a set of functions required
by Crypto API to work properly.

In order to substitute the default algorithm (lzo), change the value
of zram.compressor module parameter to a proper name (e.g. lz4).

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 drivers/staging/zram/Kconfig|5 +-
 drivers/staging/zram/zram_drv.c |  106 ---
 drivers/staging/zram/zram_drv.h |1 -
 3 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/zram/Kconfig b/drivers/staging/zram/Kconfig
index 983314c..b51cac5 100644
--- a/drivers/staging/zram/Kconfig
+++ b/drivers/staging/zram/Kconfig
@@ -1,8 +1,7 @@
 config ZRAM
tristate "Compressed RAM block device support"
-   depends on BLOCK && SYSFS && ZSMALLOC
-   select LZO_COMPRESS
-   select LZO_DECOMPRESS
+   depends on BLOCK && SYSFS && ZSMALLOC && CRYPTO=y
+   select CRYPTO_LZO
default n
help
  Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 7ebf91d..d6f1f67 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -29,12 +29,14 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
 #include "zram_drv.h"
 
+#define ZRAM_COMPRESSOR_DEFAULT "lzo"
+
 /* Globals */
 static int zram_major;
 static struct zram *zram_devices;
@@ -42,6 +44,64 @@ static struct zram *zram_devices;
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
 
+/* Cryptographic API features */
+static char *zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
+static struct crypto_comp *zram_comp_tfm;
+
+enum comp_op {
+   ZRAM_COMPOP_COMPRESS,
+   ZRAM_COMPOP_DECOMPRESS
+};
+
+static int zram_comp_op(enum comp_op op, const u8 *src, unsigned int slen,
+   u8 *dst, unsigned int *dlen)
+{
+   struct crypto_comp *tfm;
+   int ret;
+
+   tfm = zram_comp_tfm;
+   switch (op) {
+   case ZRAM_COMPOP_COMPRESS:
+   ret = crypto_comp_compress(tfm, src, slen, dst, dlen);
+   break;
+   case ZRAM_COMPOP_DECOMPRESS:
+   ret = crypto_comp_decompress(tfm, src, slen, dst, dlen);
+   break;
+   default:
+   ret = -EINVAL;
+   }
+
+   return ret;
+}
+
+static int __init zram_comp_init(void)
+{
+   int ret;
+   ret = crypto_has_comp(zram_compressor, 0, 0);
+   if (!ret) {
+   pr_info("%s is not available\n", zram_compressor);
+   zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
+   ret = crypto_has_comp(zram_compressor, 0, 0);
+   if (!ret)
+   return -ENODEV;
+   }
+   pr_info("using %s compressor\n", zram_compressor);
+
+   /* alloc transform */
+   zram_comp_tfm = crypto_alloc_comp(zram_compressor, 0, 0);
+   if (!zram_comp_tfm)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static inline void zram_comp_exit(void)
+{
+   if (zram_comp_tfm)
+   crypto_free_comp(zram_comp_tfm);
+}
+/* end of Cryptographic API features */
+
 static inline struct zram *dev_to_zram(struct device *dev)
 {
return (struct zram *)dev_to_disk(dev)->private_data;
@@ -190,7 +250,6 @@ static inline int valid_io_request(struct zram *zram, 
struct bio *bio)
 static void zram_meta_free(struct zram_meta *meta)
 {
zs_destroy_pool(meta->mem_pool);
-   kfree(meta->compress_workmem);
free_pages((unsigned long)meta->compress_buffer, 1);
vfree(meta->table);
kfree(meta);
@@ -203,15 +262,11 @@ static struct zram_meta *zram_meta_alloc(u64 disksize)
if (!meta)
goto out;
 
-   meta->compress_workmem = kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
-   if (!meta->compress_workmem)
-   goto free_meta;
-
meta->compress_buffer =
(void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
if (!meta->compress_buffer) {
pr_err("Error allocating compressor buffer space\n");
-   goto free_workmem;
+   goto free_meta;
}
 
num_pages = disksize >> PAGE_SHIFT;
@@ -233,8 +288,6 @@ free_table:
vfree(meta->table);
 free_buffer:
free_pages((u

[PATCH 1/2] staging: zram: add Crypto API support

2013-07-30 Thread Piotr Sarna
Current version of zram does not allow any substitution of a default
compression algorithm. Therefore, I decided to change the existing
implementation of page compression by adding Crypto API compability.

All direct calls to lzo1x compression/decompression methods are now
replaced by calls consistent with Crypto. Also, I removed workmem
field from struct zram_meta, as it was there for lzo1x purposes only
and is no longer needed. Finally, I added a set of functions required
by Crypto API to work properly.

In order to substitute the default algorithm (lzo), change the value
of zram.compressor module parameter to a proper name (e.g. lz4).

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 drivers/staging/zram/Kconfig|5 +-
 drivers/staging/zram/zram_drv.c |  106 ---
 drivers/staging/zram/zram_drv.h |1 -
 3 files changed, 89 insertions(+), 23 deletions(-)

diff --git a/drivers/staging/zram/Kconfig b/drivers/staging/zram/Kconfig
index 983314c..b51cac5 100644
--- a/drivers/staging/zram/Kconfig
+++ b/drivers/staging/zram/Kconfig
@@ -1,8 +1,7 @@
 config ZRAM
tristate Compressed RAM block device support
-   depends on BLOCK  SYSFS  ZSMALLOC
-   select LZO_COMPRESS
-   select LZO_DECOMPRESS
+   depends on BLOCK  SYSFS  ZSMALLOC  CRYPTO=y
+   select CRYPTO_LZO
default n
help
  Creates virtual block devices called /dev/zramX (X = 0, 1, ...).
diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 7ebf91d..d6f1f67 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -29,12 +29,14 @@
 #include linux/genhd.h
 #include linux/highmem.h
 #include linux/slab.h
-#include linux/lzo.h
+#include linux/crypto.h
 #include linux/string.h
 #include linux/vmalloc.h
 
 #include zram_drv.h
 
+#define ZRAM_COMPRESSOR_DEFAULT lzo
+
 /* Globals */
 static int zram_major;
 static struct zram *zram_devices;
@@ -42,6 +44,64 @@ static struct zram *zram_devices;
 /* Module params (documentation at end) */
 static unsigned int num_devices = 1;
 
+/* Cryptographic API features */
+static char *zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
+static struct crypto_comp *zram_comp_tfm;
+
+enum comp_op {
+   ZRAM_COMPOP_COMPRESS,
+   ZRAM_COMPOP_DECOMPRESS
+};
+
+static int zram_comp_op(enum comp_op op, const u8 *src, unsigned int slen,
+   u8 *dst, unsigned int *dlen)
+{
+   struct crypto_comp *tfm;
+   int ret;
+
+   tfm = zram_comp_tfm;
+   switch (op) {
+   case ZRAM_COMPOP_COMPRESS:
+   ret = crypto_comp_compress(tfm, src, slen, dst, dlen);
+   break;
+   case ZRAM_COMPOP_DECOMPRESS:
+   ret = crypto_comp_decompress(tfm, src, slen, dst, dlen);
+   break;
+   default:
+   ret = -EINVAL;
+   }
+
+   return ret;
+}
+
+static int __init zram_comp_init(void)
+{
+   int ret;
+   ret = crypto_has_comp(zram_compressor, 0, 0);
+   if (!ret) {
+   pr_info(%s is not available\n, zram_compressor);
+   zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
+   ret = crypto_has_comp(zram_compressor, 0, 0);
+   if (!ret)
+   return -ENODEV;
+   }
+   pr_info(using %s compressor\n, zram_compressor);
+
+   /* alloc transform */
+   zram_comp_tfm = crypto_alloc_comp(zram_compressor, 0, 0);
+   if (!zram_comp_tfm)
+   return -ENOMEM;
+
+   return 0;
+}
+
+static inline void zram_comp_exit(void)
+{
+   if (zram_comp_tfm)
+   crypto_free_comp(zram_comp_tfm);
+}
+/* end of Cryptographic API features */
+
 static inline struct zram *dev_to_zram(struct device *dev)
 {
return (struct zram *)dev_to_disk(dev)-private_data;
@@ -190,7 +250,6 @@ static inline int valid_io_request(struct zram *zram, 
struct bio *bio)
 static void zram_meta_free(struct zram_meta *meta)
 {
zs_destroy_pool(meta-mem_pool);
-   kfree(meta-compress_workmem);
free_pages((unsigned long)meta-compress_buffer, 1);
vfree(meta-table);
kfree(meta);
@@ -203,15 +262,11 @@ static struct zram_meta *zram_meta_alloc(u64 disksize)
if (!meta)
goto out;
 
-   meta-compress_workmem = kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
-   if (!meta-compress_workmem)
-   goto free_meta;
-
meta-compress_buffer =
(void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
if (!meta-compress_buffer) {
pr_err(Error allocating compressor buffer space\n);
-   goto free_workmem;
+   goto free_meta;
}
 
num_pages = disksize  PAGE_SHIFT;
@@ -233,8 +288,6 @@ free_table:
vfree(meta-table);
 free_buffer:
free_pages((unsigned long)meta

[PATCH 2/2] staging: zram: add per-cpu support to Crypto

2013-07-30 Thread Piotr Sarna
Since original zram code did not implement any per-cpu operations,
my previous patch (staging: zram: add Crypto API support) did not
include them either.

This patch complements the first one with per-cpu support for Crypto,
allocating tfms buffer separately for each online processor.
Changes are based on zswap and zcache per-cpu code.

Basic tests (concurrent writing several 10-40MB chunks to zram) performed
on an ARM-based EXYNOS4412 Quad-Core showed that per-cpu code provides
noticeable time saving, ranging between 30-40% for LZO and LZ4 compressors.
Sample data (LZO): writing 160MB, 40MB per thread took 0.60s with per-cpu
code included and approximately 0.80s without per-cpu support.

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 drivers/staging/zram/zram_drv.c |  146 +--
 drivers/staging/zram/zram_drv.h |1 -
 2 files changed, 125 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index d6f1f67..3dd5085 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -30,6 +30,7 @@
 #include linux/highmem.h
 #include linux/slab.h
 #include linux/crypto.h
+#include linux/cpu.h
 #include linux/string.h
 #include linux/vmalloc.h
 
@@ -46,7 +47,7 @@ static unsigned int num_devices = 1;
 
 /* Cryptographic API features */
 static char *zram_compressor = ZRAM_COMPRESSOR_DEFAULT;
-static struct crypto_comp *zram_comp_tfm;
+static struct crypto_comp * __percpu *zram_comp_pcpu_tfms;
 
 enum comp_op {
ZRAM_COMPOP_COMPRESS,
@@ -59,7 +60,7 @@ static int zram_comp_op(enum comp_op op, const u8 *src, 
unsigned int slen,
struct crypto_comp *tfm;
int ret;
 
-   tfm = zram_comp_tfm;
+   tfm = *per_cpu_ptr(zram_comp_pcpu_tfms, get_cpu());
switch (op) {
case ZRAM_COMPOP_COMPRESS:
ret = crypto_comp_compress(tfm, src, slen, dst, dlen);
@@ -70,6 +71,7 @@ static int zram_comp_op(enum comp_op op, const u8 *src, 
unsigned int slen,
default:
ret = -EINVAL;
}
+   put_cpu();
 
return ret;
 }
@@ -87,9 +89,9 @@ static int __init zram_comp_init(void)
}
pr_info(using %s compressor\n, zram_compressor);
 
-   /* alloc transform */
-   zram_comp_tfm = crypto_alloc_comp(zram_compressor, 0, 0);
-   if (!zram_comp_tfm)
+   /* alloc percpu transforms */
+   zram_comp_pcpu_tfms = alloc_percpu(struct crypto_comp *);
+   if (!zram_comp_pcpu_tfms)
return -ENOMEM;
 
return 0;
@@ -97,8 +99,110 @@ static int __init zram_comp_init(void)
 
 static inline void zram_comp_exit(void)
 {
-   if (zram_comp_tfm)
-   crypto_free_comp(zram_comp_tfm);
+   /* free percpu transforms */
+   if (zram_comp_pcpu_tfms)
+   free_percpu(zram_comp_pcpu_tfms);
+}
+
+
+/* Crypto API features: percpu code */
+#define ZRAM_DSTMEM_ORDER 1
+static DEFINE_PER_CPU(u8 *, zram_dstmem);
+
+static int zram_comp_cpu_up(int cpu)
+{
+   struct crypto_comp *tfm;
+
+   tfm = crypto_alloc_comp(zram_compressor, 0, 0);
+   if (IS_ERR(tfm))
+   return NOTIFY_BAD;
+   *per_cpu_ptr(zram_comp_pcpu_tfms, cpu) = tfm;
+   return NOTIFY_OK;
+}
+
+static void zram_comp_cpu_down(int cpu)
+{
+   struct crypto_comp *tfm;
+
+   tfm = *per_cpu_ptr(zram_comp_pcpu_tfms, cpu);
+   crypto_free_comp(tfm);
+   *per_cpu_ptr(zram_comp_pcpu_tfms, cpu) = NULL;
+}
+
+static int zram_cpu_notifier(struct notifier_block *nb,
+   unsigned long action, void *pcpu)
+{
+   int ret;
+   int cpu = (long) pcpu;
+
+   switch (action) {
+   case CPU_UP_PREPARE:
+   ret = zram_comp_cpu_up(cpu);
+   if (ret != NOTIFY_OK) {
+   pr_err(zram: can't allocate compressor xform\n);
+   return ret;
+   }
+   per_cpu(zram_dstmem, cpu) = (void *)__get_free_pages(
+   GFP_KERNEL | __GFP_REPEAT, ZRAM_DSTMEM_ORDER);
+   break;
+   case CPU_DEAD:
+   case CPU_UP_CANCELED:
+   zram_comp_cpu_down(cpu);
+   free_pages((unsigned long) per_cpu(zram_dstmem, cpu),
+   ZRAM_DSTMEM_ORDER);
+   per_cpu(zram_dstmem, cpu) = NULL;
+   break;
+   default:
+   break;
+   }
+   return NOTIFY_OK;
+}
+
+static struct notifier_block zram_cpu_notifier_block = {
+   .notifier_call = zram_cpu_notifier
+};
+
+/* Helper function releasing tfms from online cpus */
+static inline void zram_comp_cpus_down(void)
+{
+   int cpu;
+
+   get_online_cpus();
+   for_each_online_cpu(cpu) {
+   void *pcpu = (void *)(long)cpu;
+   zram_cpu_notifier(zram_cpu_notifier_block

[PATCH] zcache: fix "zcache=" kernel parameter

2013-07-19 Thread Piotr Sarna
Commit 835f2f5 ("staging: zcache: enable zcache to be built/loaded as
a module") introduced an incorrect handling of "zcache=" parameter.

Inside zcache_comp_init() function, zcache_comp_name variable is
checked for being empty. If not empty, the above variable is tested
for being compatible with Crypto API. Unfortunately, after that
function ends unconditionally (by the "goto out" directive) and returns:
- non-zero value if verification succeeded, wrongly indicating an error
- zero value if verification failed, falsely informing that function
  zcache_comp_init() ended properly.

A solution to this problem is as following:
1. Move the "goto out" directive inside the "if (!ret)" statement
2. In case that crypto_has_comp() returned 0, change the value of ret
   to non-zero before "goto out" to indicate an error.

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 drivers/staging/zcache/zcache-main.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c 
b/drivers/staging/zcache/zcache-main.c
index dcceed2..81972fa 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,10 +1811,12 @@ static int zcache_comp_init(void)
 #else
if (*zcache_comp_name != '\0') {
ret = crypto_has_comp(zcache_comp_name, 0, 0);
-   if (!ret)
+   if (!ret) {
pr_info("zcache: %s not supported\n",
zcache_comp_name);
-   goto out;
+   ret = 1;
+   goto out;
+   }
}
if (!ret)
strcpy(zcache_comp_name, "lzo");
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] zcache: fix zcache= kernel parameter

2013-07-19 Thread Piotr Sarna
Commit 835f2f5 (staging: zcache: enable zcache to be built/loaded as
a module) introduced an incorrect handling of zcache= parameter.

Inside zcache_comp_init() function, zcache_comp_name variable is
checked for being empty. If not empty, the above variable is tested
for being compatible with Crypto API. Unfortunately, after that
function ends unconditionally (by the goto out directive) and returns:
- non-zero value if verification succeeded, wrongly indicating an error
- zero value if verification failed, falsely informing that function
  zcache_comp_init() ended properly.

A solution to this problem is as following:
1. Move the goto out directive inside the if (!ret) statement
2. In case that crypto_has_comp() returned 0, change the value of ret
   to non-zero before goto out to indicate an error.

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 drivers/staging/zcache/zcache-main.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c 
b/drivers/staging/zcache/zcache-main.c
index dcceed2..81972fa 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,10 +1811,12 @@ static int zcache_comp_init(void)
 #else
if (*zcache_comp_name != '\0') {
ret = crypto_has_comp(zcache_comp_name, 0, 0);
-   if (!ret)
+   if (!ret) {
pr_info(zcache: %s not supported\n,
zcache_comp_name);
-   goto out;
+   ret = 1;
+   goto out;
+   }
}
if (!ret)
strcpy(zcache_comp_name, lzo);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] zcache: fix "zcache=" kernel parameter

2013-07-18 Thread Piotr Sarna
Commit 835f2f5 ("staging: zcache: enable zcache to be built/loaded as
a module") introduced an incorrect handling of "zcache=" parameter.

Inside zcache_comp_init() function, zcache_comp_name variable is
checked for being empty. If not empty, the above variable is tested
for being compatible with Crypto API. Unfortunately, after that
function ends unconditionally (by the "goto out" directive) and returns:
- non-zero value if verification succeeded, wrongly indicating an error
- zero value if verification failed, falsely informing that function
  zcache_comp_init() ended properly.

A solution to this problem is as following:
1. Move the "goto out" directive inside the "if (!ret)" statement
2. In case that crypto_has_comp() returned 0, change the value of ret
   to non-zero before "goto out" to indicate an error.

Signed-off-by: Piotr Sarna 
Acked-by: Bartlomiej Zolnierkiewicz 
Signed-off-by: Kyungmin Park 
---
 drivers/staging/zcache/zcache-main.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c 
b/drivers/staging/zcache/zcache-main.c
index dcceed2..81972fa 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,10 +1811,12 @@ static int zcache_comp_init(void)
 #else
if (*zcache_comp_name != '\0') {
ret = crypto_has_comp(zcache_comp_name, 0, 0);
-   if (!ret)
+   if (!ret) {
pr_info("zcache: %s not supported\n",
zcache_comp_name);
-   goto out;
+   ret = 1;
+   goto out;
+   }
}
if (!ret)
strcpy(zcache_comp_name, "lzo");
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] zcache: fix zcache= kernel parameter

2013-07-18 Thread Piotr Sarna
Commit 835f2f5 (staging: zcache: enable zcache to be built/loaded as
a module) introduced an incorrect handling of zcache= parameter.

Inside zcache_comp_init() function, zcache_comp_name variable is
checked for being empty. If not empty, the above variable is tested
for being compatible with Crypto API. Unfortunately, after that
function ends unconditionally (by the goto out directive) and returns:
- non-zero value if verification succeeded, wrongly indicating an error
- zero value if verification failed, falsely informing that function
  zcache_comp_init() ended properly.

A solution to this problem is as following:
1. Move the goto out directive inside the if (!ret) statement
2. In case that crypto_has_comp() returned 0, change the value of ret
   to non-zero before goto out to indicate an error.

Signed-off-by: Piotr Sarna p.sa...@partner.samsung.com
Acked-by: Bartlomiej Zolnierkiewicz b.zolnier...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
---
 drivers/staging/zcache/zcache-main.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c 
b/drivers/staging/zcache/zcache-main.c
index dcceed2..81972fa 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,10 +1811,12 @@ static int zcache_comp_init(void)
 #else
if (*zcache_comp_name != '\0') {
ret = crypto_has_comp(zcache_comp_name, 0, 0);
-   if (!ret)
+   if (!ret) {
pr_info(zcache: %s not supported\n,
zcache_comp_name);
-   goto out;
+   ret = 1;
+   goto out;
+   }
}
if (!ret)
strcpy(zcache_comp_name, lzo);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/