Re: [PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-11 Thread Minchan Kim
Hello Greg,

On Fri, Aug 09, 2013 at 04:39:08PM -0700, Greg Kroah-Hartman wrote:
> On Tue, Aug 06, 2013 at 01:26:34AM +0900, Minchan Kim wrote:
> > On Mon, Aug 05, 2013 at 04:18:34PM +0900, Minchan Kim wrote:
> > > I was preparing to promote zram and it was almost done.
> > > Before sending the patch, I ran a final test and my eyebrows went up.
> > > 
> > > [1] introduced down_write in zram_slot_free_notify to prevent a race
> > > between zram_slot_free_notify and zram_bvec_[read|write]. The race
> > > could happen if somebody with the right permission to open the swap
> > > device reads it while it is used by swap in parallel.
> > > 
> > > However, zram_slot_free_notify is called while holding a spinlock of
> > > the swap layer, so we must avoid holding a mutex there. Otherwise,
> > > lockdep warns about it.
> > > 
> > > I guess the best solution is to redesign the zram locking scheme
> > > entirely, but we are on the verge of promotion, so it's not desirable
> > > to change a lot of critical code, and such a big change isn't in good
> > > shape for backporting to stable trees. So I think this simple patch is
> > > best at the moment.
> > > 
> > > [1] [57ab0485, zram: use zram->lock to protect zram_free_page()
> > > in swap free notify path]
> > > 
> > > Cc: Jiang Liu 
> > > Cc: Nitin Gupta 
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Minchan Kim 
> > > ---
> > >  drivers/staging/zram/zram_drv.c |   15 ++-
> > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/staging/zram/zram_drv.c 
> > > b/drivers/staging/zram/zram_drv.c
> > > index 7ebf91d..7b574c4 100644
> > > --- a/drivers/staging/zram/zram_drv.c
> > > +++ b/drivers/staging/zram/zram_drv.c
> > > @@ -440,6 +440,13 @@ static int zram_bvec_write(struct zram *zram, struct 
> > > bio_vec *bvec, u32 index,
> > >   goto out;
> > >   }
> > >  
> > > + /*
> > > +  * zram_slot_free_notify could miss free so that let's
> > > +  * double check.
> > > +  */
> > > + if (unlikely(meta->table[index].handle))
> > > + zram_free_page(zram, index);
> > > +
> > >   ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, ,
> > >  meta->compress_workmem);
> > >  
> > > @@ -727,7 +734,13 @@ static void zram_slot_free_notify(struct 
> > > block_device *bdev,
> > >   struct zram *zram;
> > >  
> > >   zram = bdev->bd_disk->private_data;
> > > - down_write(>lock);
> > > + /*
> > > +  * The function is called in atomic context so down_write should
> > > +  * be prohibited. If we couldn't hold a mutex, the free could be
> > > +  * handled by zram_bvec_write later when same index is overwritten.
> > > +  */
> > > + if (!down_write_trylock(>lock))
> > > + return;
> > >   zram_free_page(zram, index);
> > >   up_write(>lock);
> > >   atomic64_inc(>stats.notify_free);
> > > -- 
> > > 1.7.9.5
> > > 
> > 
> > How about this version?
> 
> I'm guessing you tested it out?  If so, please resend in a format that I
> can apply it in.

Sure, I will post soon.
Thanks!

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-09 Thread Greg Kroah-Hartman
On Tue, Aug 06, 2013 at 01:26:34AM +0900, Minchan Kim wrote:
> On Mon, Aug 05, 2013 at 04:18:34PM +0900, Minchan Kim wrote:
> > I was preparing to promote zram and it was almost done.
> > Before sending the patch, I ran a final test and my eyebrows went up.
> > 
> > [1] introduced down_write in zram_slot_free_notify to prevent a race
> > between zram_slot_free_notify and zram_bvec_[read|write]. The race
> > could happen if somebody with the right permission to open the swap
> > device reads it while it is used by swap in parallel.
> > 
> > However, zram_slot_free_notify is called while holding a spinlock of
> > the swap layer, so we must avoid holding a mutex there. Otherwise,
> > lockdep warns about it.
> > 
> > I guess the best solution is to redesign the zram locking scheme
> > entirely, but we are on the verge of promotion, so it's not desirable
> > to change a lot of critical code, and such a big change isn't in good
> > shape for backporting to stable trees. So I think this simple patch is
> > best at the moment.
> > 
> > [1] [57ab0485, zram: use zram->lock to protect zram_free_page()
> > in swap free notify path]
> > 
> > Cc: Jiang Liu 
> > Cc: Nitin Gupta 
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Minchan Kim 
> > ---
> >  drivers/staging/zram/zram_drv.c |   15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/staging/zram/zram_drv.c 
> > b/drivers/staging/zram/zram_drv.c
> > index 7ebf91d..7b574c4 100644
> > --- a/drivers/staging/zram/zram_drv.c
> > +++ b/drivers/staging/zram/zram_drv.c
> > @@ -440,6 +440,13 @@ static int zram_bvec_write(struct zram *zram, struct 
> > bio_vec *bvec, u32 index,
> > goto out;
> > }
> >  
> > +   /*
> > +* zram_slot_free_notify could miss free so that let's
> > +* double check.
> > +*/
> > +   if (unlikely(meta->table[index].handle))
> > +   zram_free_page(zram, index);
> > +
> > ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, ,
> >meta->compress_workmem);
> >  
> > @@ -727,7 +734,13 @@ static void zram_slot_free_notify(struct block_device 
> > *bdev,
> > struct zram *zram;
> >  
> > zram = bdev->bd_disk->private_data;
> > -   down_write(>lock);
> > +   /*
> > +* The function is called in atomic context so down_write should
> > +* be prohibited. If we couldn't hold a mutex, the free could be
> > +* handled by zram_bvec_write later when same index is overwritten.
> > +*/
> > +   if (!down_write_trylock(>lock))
> > +   return;
> > zram_free_page(zram, index);
> > up_write(>lock);
> > atomic64_inc(>stats.notify_free);
> > -- 
> > 1.7.9.5
> > 
> 
> How about this version?

I'm guessing you tested it out?  If so, please resend in a format that I
can apply it in.

thanks,

greg k-h


Re: [PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-05 Thread Minchan Kim
On Mon, Aug 05, 2013 at 04:18:34PM +0900, Minchan Kim wrote:
> I was preparing to promote zram and it was almost done.
> Before sending the patch, I ran a final test and my eyebrows went up.
> 
> [1] introduced down_write in zram_slot_free_notify to prevent a race
> between zram_slot_free_notify and zram_bvec_[read|write]. The race
> could happen if somebody with the right permission to open the swap
> device reads it while it is used by swap in parallel.
> 
> However, zram_slot_free_notify is called while holding a spinlock of
> the swap layer, so we must avoid holding a mutex there. Otherwise,
> lockdep warns about it.
> 
> I guess the best solution is to redesign the zram locking scheme
> entirely, but we are on the verge of promotion, so it's not desirable
> to change a lot of critical code, and such a big change isn't in good
> shape for backporting to stable trees. So I think this simple patch is
> best at the moment.
> 
> [1] [57ab0485, zram: use zram->lock to protect zram_free_page()
> in swap free notify path]
> 
> Cc: Jiang Liu 
> Cc: Nitin Gupta 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Minchan Kim 
> ---
>  drivers/staging/zram/zram_drv.c |   15 ++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
> index 7ebf91d..7b574c4 100644
> --- a/drivers/staging/zram/zram_drv.c
> +++ b/drivers/staging/zram/zram_drv.c
> @@ -440,6 +440,13 @@ static int zram_bvec_write(struct zram *zram, struct 
> bio_vec *bvec, u32 index,
>   goto out;
>   }
>  
> + /*
> +  * zram_slot_free_notify could miss free so that let's
> +  * double check.
> +  */
> + if (unlikely(meta->table[index].handle))
> + zram_free_page(zram, index);
> +
>   ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, ,
>  meta->compress_workmem);
>  
> @@ -727,7 +734,13 @@ static void zram_slot_free_notify(struct block_device 
> *bdev,
>   struct zram *zram;
>  
>   zram = bdev->bd_disk->private_data;
> - down_write(>lock);
> + /*
> +  * The function is called in atomic context so down_write should
> +  * be prohibited. If we couldn't hold a mutex, the free could be
> +  * handled by zram_bvec_write later when same index is overwritten.
> +  */
> + if (!down_write_trylock(>lock))
> + return;
>   zram_free_page(zram, index);
>   up_write(>lock);
>   atomic64_inc(>stats.notify_free);
> -- 
> 1.7.9.5
> 

How about this version?

From a447aac3cd451058baf42c9d6dca3197893f4d65 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Mon, 5 Aug 2013 23:53:05 +0900
Subject: [PATCH v2] zram: bug fix: don't grab mutex in zram_slot_free_noity

[1] introduced down_write in zram_slot_free_notify to prevent a race
between zram_slot_free_notify and zram_bvec_[read|write]. The race
could happen if somebody with the right permission to open the swap
device reads it while it is used by swap in parallel.

However, zram_slot_free_notify is called while holding a spinlock of
the swap layer, so we must avoid holding a mutex there. Otherwise,
lockdep warns about it.

This patch adds a new list to track freed objects, plus a workqueue,
so zram_slot_free_notify just registers the index to be freed and
queues the work. When the work item runs, it can take the mutex and
the spinlock, so there is no race between them.

If any I/O is issued, zram handles the pending free requests caused
by zram_slot_free_notify right before handling the issued request,
because the workqueue might not have processed the pending requests
yet.

Lastly, when zram is reset, flush_work handles all pending free
requests, so we shouldn't have a memory leak.

NOTE: If zram_slot_free_notify's kmalloc with GFP_ATOMIC fails, the
slot will be freed when the next write I/O overwrites it.

[1] [57ab0485, zram: use zram->lock to protect zram_free_page()
in swap free notify path]
 
* from v1
  * totally redesign

Cc: Jiang Liu 
Cc: Nitin Gupta 
Signed-off-by: Minchan Kim 
---
 drivers/staging/zram/zram_drv.c | 60 ++---
 drivers/staging/zram/zram_drv.h |  8 ++
 2 files changed, 65 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 7ebf91d..ec881e0 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -440,6 +440,14 @@ static int zram_bvec_write(struct zram *zram, struct 
bio_vec *bvec, u32 index,
goto out;
}
 
+   /*
+* zram_slot_free_notify could miss free so that let's
+* double check.
+*/
+   if (unlikely(meta->table[index].handle ||
+   zram_test_flag(meta, index, ZRAM_ZERO)))
+   zram_free_page(zram, index);
+
ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, ,
   meta->compress_workmem);
 
@@ -505,6 +513,20 @@ out:
return ret;
 }
 
+static void free_pending_rq(struct zram *zram)
+{
+   
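The v2 patch itself is truncated in this archive, but the design its commit message describes can be sketched as a userspace analogy. This is not the actual kernel code: pthread mutexes stand in for the kernel spinlock and for zram->lock, and all names (`pending_free`, `slot_free_notify`, `drain_pending`) are illustrative.

```c
/*
 * Userspace sketch of the v2 design described in the commit message
 * above (the real kernel patch is truncated in this archive).
 * list_lock stands in for a kernel spinlock, heavy_lock for the
 * sleepable zram->lock; all names are illustrative.
 */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct pending_free {
	unsigned long index;
	struct pending_free *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;  /* kernel: spinlock */
static pthread_mutex_t heavy_lock = PTHREAD_MUTEX_INITIALIZER; /* kernel: zram->lock */
static struct pending_free *pending_head;

/* Atomic-context side: only queue the index; never take heavy_lock. */
static void slot_free_notify(unsigned long index)
{
	struct pending_free *pf = malloc(sizeof(*pf)); /* kernel: kmalloc(GFP_ATOMIC) */

	if (!pf)
		return; /* the slot is freed later, when it is overwritten */
	pf->index = index;
	pthread_mutex_lock(&list_lock);
	pf->next = pending_head;
	pending_head = pf;
	pthread_mutex_unlock(&list_lock);
}

/*
 * Worker side (kernel: workqueue callback): may sleep, so it can take
 * heavy_lock and then free every queued slot. Returns how many it freed.
 */
static int drain_pending(void)
{
	int freed = 0;

	pthread_mutex_lock(&heavy_lock);
	pthread_mutex_lock(&list_lock);
	while (pending_head) {
		struct pending_free *pf = pending_head;

		pending_head = pf->next;
		/* kernel: zram_free_page(zram, pf->index) */
		free(pf);
		freed++;
	}
	pthread_mutex_unlock(&list_lock);
	pthread_mutex_unlock(&heavy_lock);
	return freed;
}
```

Per the commit message, the I/O path and reset path would call the drain step first, so pending frees are applied before a request touches the same slots.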

Re: [PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-05 Thread Minchan Kim
Hello Greg,

On Mon, Aug 05, 2013 at 04:04:22PM +0800, Greg Kroah-Hartman wrote:
> On Mon, Aug 05, 2013 at 04:18:34PM +0900, Minchan Kim wrote:
> > I was preparing to promote zram and it was almost done.
> > Before sending the patch, I ran a final test and my eyebrows went up.
> > 
> > [1] introduced down_write in zram_slot_free_notify to prevent a race
> > between zram_slot_free_notify and zram_bvec_[read|write]. The race
> > could happen if somebody with the right permission to open the swap
> > device reads it while it is used by swap in parallel.
> > 
> > However, zram_slot_free_notify is called while holding a spinlock of
> > the swap layer, so we must avoid holding a mutex there. Otherwise,
> > lockdep warns about it.
> 
> As it should.

It's okay to call down_write_trylock instead of down_write under a
spinlock. Is there any problem? Maybe I need to rewrite the
description?

> 
> > I guess the best solution is to redesign the zram locking scheme
> > entirely, but we are on the verge of promotion, so it's not desirable
> > to change a lot of critical code, and such a big change isn't in good
> > shape for backporting to stable trees. So I think this simple patch is
> > best at the moment.
> 
> What do you mean by "verge of promoting"?  If it's wrong, it needs to be
> fixed properly, don't paper over something.

It seems you consider the patch a band-aid due to my rather misleading
description. I didn't mean that. I guess the ideal solution would be to
change the locking scheme entirely to enhance concurrency, but others
might consider that overkill because we haven't seen any reports of
parallel workloads where the coarse-grained lock causes trouble. So the
simple patch below looks reasonable to me. Let's wait for other zram
developers' opinions.

> 
> Please fix this correctly, I really don't care about staging drivers in
> stable kernels as lots of distros refuse to enable them (and rightly
> so.)

It might be a huge change, so deciding this early seems hasty.
Let's wait for others' opinions.
Nitin, could you post yours?

> 
> thanks,
> 
> greg k-h

-- 
Kind regards,
Minchan Kim


Re: [PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-05 Thread Greg Kroah-Hartman
On Mon, Aug 05, 2013 at 04:18:34PM +0900, Minchan Kim wrote:
> I was preparing to promote zram and it was almost done.
> Before sending the patch, I ran a final test and my eyebrows went up.
> 
> [1] introduced down_write in zram_slot_free_notify to prevent a race
> between zram_slot_free_notify and zram_bvec_[read|write]. The race
> could happen if somebody with the right permission to open the swap
> device reads it while it is used by swap in parallel.
> 
> However, zram_slot_free_notify is called while holding a spinlock of
> the swap layer, so we must avoid holding a mutex there. Otherwise,
> lockdep warns about it.

As it should.

> I guess the best solution is to redesign the zram locking scheme
> entirely, but we are on the verge of promotion, so it's not desirable
> to change a lot of critical code, and such a big change isn't in good
> shape for backporting to stable trees. So I think this simple patch is
> best at the moment.

What do you mean by "verge of promoting"?  If it's wrong, it needs to be
fixed properly, don't paper over something.

Please fix this correctly, I really don't care about staging drivers in
stable kernels as lots of distros refuse to enable them (and rightly
so.)

thanks,

greg k-h


[PATCH] zram: bug fix: delay lock holding in zram_slot_free_noity

2013-08-05 Thread Minchan Kim
I was preparing to promote zram and it was almost done.
Before sending the patch, I ran a final test and my eyebrows went up.

[1] introduced down_write in zram_slot_free_notify to prevent a race
between zram_slot_free_notify and zram_bvec_[read|write]. The race
could happen if somebody with the right permission to open the swap
device reads it while it is used by swap in parallel.

However, zram_slot_free_notify is called while holding a spinlock of
the swap layer, so we must avoid holding a mutex there. Otherwise,
lockdep warns about it.

I guess the best solution is to redesign the zram locking scheme
entirely, but we are on the verge of promotion, so it's not desirable
to change a lot of critical code, and such a big change isn't in good
shape for backporting to stable trees. So I think this simple patch is
best at the moment.

[1] [57ab0485, zram: use zram->lock to protect zram_free_page()
in swap free notify path]

Cc: Jiang Liu 
Cc: Nitin Gupta 
Cc: sta...@vger.kernel.org
Signed-off-by: Minchan Kim 
---
 drivers/staging/zram/zram_drv.c |   15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index 7ebf91d..7b574c4 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -440,6 +440,13 @@ static int zram_bvec_write(struct zram *zram, struct 
bio_vec *bvec, u32 index,
goto out;
}
 
+   /*
+* zram_slot_free_notify could miss free so that let's
+* double check.
+*/
+   if (unlikely(meta->table[index].handle))
+   zram_free_page(zram, index);
+
ret = lzo1x_1_compress(uncmem, PAGE_SIZE, src, ,
   meta->compress_workmem);
 
@@ -727,7 +734,13 @@ static void zram_slot_free_notify(struct block_device 
*bdev,
struct zram *zram;
 
zram = bdev->bd_disk->private_data;
-   down_write(>lock);
+   /*
+* The function is called in atomic context so down_write should
+* be prohibited. If we couldn't hold a mutex, the free could be
+* handled by zram_bvec_write later when same index is overwritten.
+*/
+   if (!down_write_trylock(>lock))
+   return;
zram_free_page(zram, index);
up_write(>lock);
atomic64_inc(>stats.notify_free);
-- 
1.7.9.5


