Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Gu Zheng
Hi Chao,
On 09/12/2013 10:40 AM, 俞超 wrote:

> Hi Gu
> 
>> -Original Message-
>> From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
>> Sent: Wednesday, September 11, 2013 1:38 PM
>> To: jaegeuk@samsung.com
>> Cc: chao2...@samsung.com; shu@samsung.com;
>> linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
>> linux-f2fs-de...@lists.sourceforge.net
>> Subject: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
>>
>> Hi Jaegeuk, Chao,
>>
>> On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:
>>
>>> Hi,
>>>
>>> At first, thank you for the report and please follow the email writing
>>> rules. :)
>>>
>>> Anyway, I agree to the below issue.
>>> One thing that I can think of is that we don't need to use the
>>> spin_lock, since we don't care about the exact lock number, but just
>>> need to get any not-collided number.
>>
>> IMHO, just moving sbi->next_lock_num++ before
>> mutex_lock(>fs_lock[next_lock])
>> can avoid unbalance issue mostly.
>> IMO, the case two or more threads increase sbi->next_lock_num in the same
>> time is really very very little. If you think it is not rigorous, change
>> next_lock_num to atomic one can fix it.
>> What's your opinion?
>>
>> Regards,
>> Gu
> 
> I did the test sbi->next_lock_num++ compare with the atomic one,
> And I found performance of them is almost the same under a small number 
> thread racing.
> So as your and Kim's opinion, it's enough to use "sbi->next_lock_num++" to 
> fix this issue.

Good, but it seems that your replay patch is out of format, and it's hard for 
Jaegeuk to merge.
I'll format it, see the following thread.

Thanks,
Gu

> 
> Thanks for the advice.
>>
>>>
>>> So, how about removing the spin_lock?
>>> And how about using a random number?
>>
>>> Thanks,
>>>
>>> 2013-09-06 (금), 09:48 +, Chao Yu:
>>>> Hi Kim:
>>>>
>>>>  I think there is a performance problem: when all sbi->fs_lock is
>>>> holded,
>>>>
>>>> then all other threads may get the same next_lock value from
>>>> sbi->next_lock_num in function mutex_lock_op,
>>>>
>>>> and wait to get the same lock at position fs_lock[next_lock], it
>>>> unbalance the fs_lock usage.
>>>>
>>>> It may lost performance when we do the multithread test.
>>>>
>>>>
>>>>
>>>> Here is the patch to fix this problem:
>>>>
>>>>
>>>>
>>>> Signed-off-by: Yu Chao 
>>>>
>>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>>
>>>> old mode 100644
>>>>
>>>> new mode 100755
>>>>
>>>> index 467d42d..983bb45
>>>>
>>>> --- a/fs/f2fs/f2fs.h
>>>>
>>>> +++ b/fs/f2fs/f2fs.h
>>>>
>>>> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>>>>
>>>> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
>>>> operations */
>>>>
>>>> struct mutex node_write;/* locking node
>> writes
>>>> */
>>>>
>>>> struct mutex writepages;/* mutex for
>>>> writepages() */
>>>>
>>>> +   spinlock_t spin_lock;   /* lock for
>>>> next_lock_num */
>>>>
>>>> unsigned char next_lock_num;/* round-robin
>> global
>>>> locks */
>>>>
>>>> int por_doing;  /* recovery is doing
>>>> or not */
>>>>
>>>> int on_build_free_nids; /* build_free_nids is
>>>> doing */
>>>>
>>>> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
>>>> f2fs_sb_info *sbi)
>>>>
>>>>
>>>>
>>>>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
>>>>
>>>>  {
>>>>
>>>> -   unsigned char next_lock = sbi->next_lock_num %
>>>> NR_GLOBAL_LOCKS;
>>>>
>>>> +   unsigned char next_lock;
>>>>
>>>> int i = 0;
>>>>
>>>>
>>>>
>>>> for (; i < NR_GLOBAL_LOCKS; i++)
>>>>
>>>> if (mutex_trylock(>fs_lock[i]))
>

RE: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread 俞超
Hi Gu

> -Original Message-
> From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
> Sent: Wednesday, September 11, 2013 1:38 PM
> To: jaegeuk@samsung.com
> Cc: chao2...@samsung.com; shu@samsung.com;
> linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
> 
> Hi Jaegeuk, Chao,
> 
> On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:
> 
> > Hi,
> >
> > At first, thank you for the report and please follow the email writing
> > rules. :)
> >
> > Anyway, I agree to the below issue.
> > One thing that I can think of is that we don't need to use the
> > spin_lock, since we don't care about the exact lock number, but just
> > need to get any not-collided number.
> 
> IMHO, just moving sbi->next_lock_num++ before
> mutex_lock(>fs_lock[next_lock])
> can avoid unbalance issue mostly.
> IMO, the case two or more threads increase sbi->next_lock_num in the same
> time is really very very little. If you think it is not rigorous, change
> next_lock_num to atomic one can fix it.
> What's your opinion?
> 
> Regards,
> Gu

I did the test sbi->next_lock_num++ compare with the atomic one,
And I found performance of them is almost the same under a small number thread 
racing.
So as your and Kim's opinion, it's enough to use "sbi->next_lock_num++" to fix 
this issue.

Thanks for the advice.
> 
> >
> > So, how about removing the spin_lock?
> > And how about using a random number?
> 
> > Thanks,
> >
> > 2013-09-06 (금), 09:48 +, Chao Yu:
> >> Hi Kim:
> >>
> >>  I think there is a performance problem: when all sbi->fs_lock is
> >> holded,
> >>
> >> then all other threads may get the same next_lock value from
> >> sbi->next_lock_num in function mutex_lock_op,
> >>
> >> and wait to get the same lock at position fs_lock[next_lock], it
> >> unbalance the fs_lock usage.
> >>
> >> It may lost performance when we do the multithread test.
> >>
> >>
> >>
> >> Here is the patch to fix this problem:
> >>
> >>
> >>
> >> Signed-off-by: Yu Chao 
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >>
> >> old mode 100644
> >>
> >> new mode 100755
> >>
> >> index 467d42d..983bb45
> >>
> >> --- a/fs/f2fs/f2fs.h
> >>
> >> +++ b/fs/f2fs/f2fs.h
> >>
> >> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> >>
> >> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> >> operations */
> >>
> >> struct mutex node_write;/* locking node
> writes
> >> */
> >>
> >> struct mutex writepages;/* mutex for
> >> writepages() */
> >>
> >> +   spinlock_t spin_lock;   /* lock for
> >> next_lock_num */
> >>
> >> unsigned char next_lock_num;/* round-robin
> global
> >> locks */
> >>
> >> int por_doing;  /* recovery is doing
> >> or not */
> >>
> >> int on_build_free_nids; /* build_free_nids is
> >> doing */
> >>
> >> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
> >> f2fs_sb_info *sbi)
> >>
> >>
> >>
> >>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> >>
> >>  {
> >>
> >> -   unsigned char next_lock = sbi->next_lock_num %
> >> NR_GLOBAL_LOCKS;
> >>
> >> +   unsigned char next_lock;
> >>
> >> int i = 0;
> >>
> >>
> >>
> >> for (; i < NR_GLOBAL_LOCKS; i++)
> >>
> >> if (mutex_trylock(>fs_lock[i]))
> >>
> >> return i;
> >>
> >>
> >>
> >> -   mutex_lock(>fs_lock[next_lock]);
> >>
> >> +   spin_lock(>spin_lock);
> >>
> >> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> >>
> >> sbi->next_lock_num++;
> >>
> >> +   spin_unlock(>spin_lock);
> >>
> >> +
> >>
> >> +   mutex_lock(>fs_lock[next_lock]);
> >>
> >> return next_lock;
> >>
> >>  }
> >>
> >>
> >>
> >> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >>
> >> old mode 100644
> >>
> >> new mode 100755
> >>
> >> index 75c7dc3..4f27596
> >>
> >> --- a/fs/f2fs/super.c
> >>
> >> +++ b/fs/f2fs/super.c
> >>
> >> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block
> >> *sb, void *data, int silent)
> >>
> >> mutex_init(>cp_mutex);
> >>
> >> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> >>
> >> mutex_init(>fs_lock[i]);
> >>
> >> +   spin_lock_init(>spin_lock);
> >>
> >> mutex_init(>node_write);
> >>
> >> sbi->por_doing = 0;
> >>
> >> spin_lock_init(>stat_lock);
> >>
> >> (END)
> >>
> >>
> >>
> >>
> >>
> >>
> >
> 
> 
> =

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread 俞超
Hi Kim

> -Original Message-
> From: Kim Jaegeuk [mailto:jaegeuk@gmail.com]
> Sent: Wednesday, September 11, 2013 9:15 PM
> To: chao2...@samsung.com
> Cc: ???; 谭姝; linux-fsde...@vger.kernel.org;
linux-kernel@vger.kernel.org;
> linux-f2fs-de...@lists.sourceforge.net
> Subject: Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better
performance
> 
> Hi,
> 
> 2013/9/11 Chao Yu 
> >
> > Hi Kim,
> >
> > I did some tests as you mention of using random instead of spin_lock.
> > The test model is as following:
> > eight threads race to grab one of eight locks for one thousand times,
> > and I used four methods to generate lock num:
> >
> > 1.atomic_add_return(1, >next_lock_num) % NR_GLOBAL_LOCKS;
> > 2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
> > 3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
> > 4.get_random_bytes(_lock, sizeof(unsigned int));
> >
> > the result indicate that:
> > max count of collide continuously: 4 > 3 > 2 = 1 max-min count of lock
> > is grabbed: 4 > 3 > 2 = 1 elapsed time of generating: 3 > 2 > 4 > 1
> >
> > So I think it's better to use atomic_add_return in round-robin method
> > to cost less time and reduce collide.
> > What's your opinion?
> 
> Could you test with sbi->next_lock_num++ only instead of using
> atomic_add_return?
> IMO, this is just an integer value and still I don't think this value
should be
> covered by any kind of locks.
> Thanks,

Thanks for the advice, I have tested sbi->next_lock_num++.
The time cost of it is a little bit lower than the atomic one's.
for running 8 thread for 100 times test.
the performance of it's balance and collide play quit well than I expected.

Can we modify this patch as following?

root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff --stat
 fs/f2fs/f2fs.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 608f0df..7fd99d8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -544,15 +544,15 @@ static inline void mutex_unlock_all(struct
f2fs_sb_info *sbi)
 
 static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 {
-   unsigned char next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
+   unsigned char next_lock;
int i = 0;
 
for (; i < NR_GLOBAL_LOCKS; i++)
if (mutex_trylock(>fs_lock[i]))
return i;
 
+   next_lock = sbi->next_lock_num++ % NR_GLOBAL_LOCKS;
mutex_lock(>fs_lock[next_lock]);
-   sbi->next_lock_num++;
    return next_lock;
 }

> 
> >
> > thanks
> >
> > --- Original Message ---
> > Sender : ??? S5(??)/??/?(???)/
> > Date : 九月 10, 2013 09:52 (GMT+09:00)
> > Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better
> > performance
> >
> > Hi,
> >
> > At first, thank you for the report and please follow the email writing
> > rules. :)
> >
> > Anyway, I agree to the below issue.
> > One thing that I can think of is that we don't need to use the
> > spin_lock, since we don't care about the exact lock number, but just
> > need to get any not-collided number.
> >
> > So, how about removing the spin_lock?
> > And how about using a random number?
> > Thanks,
> >
> > 2013-09-06 (?), 09:48 +, Chao Yu:
> > > Hi Kim:
> > >
> > >  I think there is a performance problem: when all sbi->fs_lock
> > > is holded,
> > >
> > > then all other threads may get the same next_lock value from
> > > sbi->next_lock_num in function mutex_lock_op,
> > >
> > > and wait to get the same lock at position fs_lock[next_lock], it
> > > unbalance the fs_lock usage.
> > >
> > > It may lost performance when we do the multithread test.
> > >
> > >
> > >
> > > Here is the patch to fix this problem:
> > >
> > >
> > >
> > > Signed-off-by: Yu Chao
> > >
> > > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > >
> > > old mode 100644
> > >
> > > new mode 100755
> > >
> > > index 467d42d..983bb45
> > >
> > > --- a/fs/f2fs/f2fs.h
> > >
> > > +++ b/fs/f2fs/f2fs.h
> > >
> > > @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> > >
> > > struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> > > operations */
> > >
> > > struct mutex node_w

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Jin Xu

Hi,

On 11/09/2013 21:19, Kim Jaegeuk wrote:

Hi Russ,

The usage of fs_locks is for the recovery, so it doesn't matter
with stress-testing.
Actually what I've concerned is that we should not grab two or
more fs_locks in the same call path.
Thanks,



I am wondering why we don't use other kind of methods like r/w semaphore
instead of the fs_locks for access control purpose. Is it due to the
little lower performance of r/w semaphore or other reasons?

I think the r/w semaphore can bring more clearer access control over
the current fs_locks, and will not suffer the problems reported here.
It can be used in a way that i/o tasks just acquire read sem while the
checkpoint task takes the write sem.

Or do we have other better method?

Regards,
Jin


2013/9/11 Russ Knize :

Hi Jaegeuk/Gu,

I've removed the lock and have been stress-testing with SELinux and some
additional xattr torture for 24+ hours.  I have not encountered any issues
yet.

My previous suggestion about moving the lock is probably not a good idea
without some significant code rework (thanks to the f2fs_balance_fs call in
f2fs_setxattr).

Russ

On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng  wrote:


Hi Jaegeuk,
On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:


Hi,

2013-09-07 (토), 08:00 +, Chao Yu:

Hi Knize,

 Thanks for your reply, I think it's actually meaningless that it's
being named after "spin_lock",
it's better to rename this spinlock to "round_robin_lock".

 This patch can only resolve the issue of unbalanced fs_lock usage,
it can not fix the deadlock issue.
can we fix deadlock issue through this method:

- vfs_create()
  - f2fs_create() - takes an fs_lock and save current thread info into
thread_info[NR_GLOBAL_LOCKS]
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - get fs_lock only if there is no current
thread info in thread_info

So it keeps one thread can only hold one fs_lock to avoid deadlock.
Can we use this solution?


It could be.
But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()


Agree. This fs_lock here is used to protect the xattr from parallel
modification,
but here is in the initxattrs routine, parallel modification can not
happen.
And in the normal setxattr routine the inode->i_mutex (vfs layer) is used
to
avoid parallel modification. So I think this fs_lock is needless.
Am I missing something?

Regards,
Gu


level, since this case only happens when f2fs_initxattrs() is called.
Let's think about ut in more detail.
Thanks,





thanks again!



--- Original Message ---

Sender : Russ Knize

Date : 九月 07, 2013 04:25 (GMT+09:00)

Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
performance



I encountered this same issue recently and solved it in much the same
way.  Can we rename "spin_lock" to something more meaningful?


This race actually exposed a potential deadlock between f2fs_create()
and f2fs_initxattrs():


- vfs_create()
  - f2fs_create() - takes an fs_lock
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - also takes an fs_lock


If another CPU happens to have the same lock that f2fs_setxattr() was
trying to take because of the race around next_lock_num, we can get
into a deadlock situation if the two threads are also contending over
another resource (like bdi).


Another scenario is if the above happens while another thread is in
the middle of grabbing all of the locks via mutex_lock_all().
  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
for.


Russ


On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu  wrote:
 Hi Kim:

  I think there is a performance problem: when all
 sbi->fs_lock is holded,

 then all other threads may get the same next_lock value from
 sbi->next_lock_num in function mutex_lock_op,

 and wait to get the same lock at position fs_lock[next_lock],
 it unbalance the fs_lock usage.

 It may lost performance when we do the multithread test.



 Here is the patch to fix this problem:



 Signed-off-by: Yu Chao 

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking
 node writes */

 struct mutex writepages;/* mutex for
 writepages() */

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Russ Knize
Jaegeuk,

My tests include forced kernel panics while fsstress is running, which
generates a lot of recovery activity.  Sorry I wasn't more clear.

I understand your concern, which is why I first tried to keep the
fs_lock in the xattr_handler->set() path from VFS while removing it
from the call path I shared earlier.  Doing this requires reworking
the xattr code a bit.  Based on what Gu said about the inode mutex, it
seems like we can just remove it.

Russ

On Wed, Sep 11, 2013 at 8:19 AM, Kim Jaegeuk  wrote:
> Hi Russ,
>
> The usage of fs_locks is for the recovery, so it doesn't matter
> with stress-testing.
> Actually what I've concerned is that we should not grab two or
> more fs_locks in the same call path.
> Thanks,
>
> 2013/9/11 Russ Knize :
>> Hi Jaegeuk/Gu,
>>
>> I've removed the lock and have been stress-testing with SELinux and some
>> additional xattr torture for 24+ hours.  I have not encountered any issues
>> yet.
>>
>> My previous suggestion about moving the lock is probably not a good idea
>> without some significant code rework (thanks to the f2fs_balance_fs call in
>> f2fs_setxattr).
>>
>> Russ
>>
>> On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng  wrote:
>>>
>>> Hi Jaegeuk,
>>> On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:
>>>
>>> > Hi,
>>> >
>>> > 2013-09-07 (토), 08:00 +, Chao Yu:
>>> >> Hi Knize,
>>> >>
>>> >> Thanks for your reply, I think it's actually meaningless that it's
>>> >> being named after "spin_lock",
>>> >> it's better to rename this spinlock to "round_robin_lock".
>>> >>
>>> >> This patch can only resolve the issue of unbalanced fs_lock usage,
>>> >> it can not fix the deadlock issue.
>>> >> can we fix deadlock issue through this method:
>>> >>
>>> >> - vfs_create()
>>> >>  - f2fs_create() - takes an fs_lock and save current thread info into
>>> >> thread_info[NR_GLOBAL_LOCKS]
>>> >>   - f2fs_add_link()
>>> >>- __f2fs_add_link()
>>> >> - init_inode_metadata()
>>> >>  - f2fs_init_security()
>>> >>   - security_inode_init_security()
>>> >>- f2fs_initxattrs()
>>> >> - f2fs_setxattr() - get fs_lock only if there is no current
>>> >> thread info in thread_info
>>> >>
>>> >> So it keeps one thread can only hold one fs_lock to avoid deadlock.
>>> >> Can we use this solution?
>>> >
>>> > It could be.
>>> > But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()
>>>
>>> Agree. This fs_lock here is used to protect the xattr from parallel
>>> modification,
>>> but here is in the initxattrs routine, parallel modification can not
>>> happen.
>>> And in the normal setxattr routine the inode->i_mutex (vfs layer) is used
>>> to
>>> avoid parallel modification. So I think this fs_lock is needless.
>>> Am I missing something?
>>>
>>> Regards,
>>> Gu
>>>
>>> > level, since this case only happens when f2fs_initxattrs() is called.
>>> > Let's think about ut in more detail.
>>> > Thanks,
>>> >
>>> >>
>>> >>
>>> >>
>>> >> thanks again!
>>> >>
>>> >>
>>> >>
>>> >> --- Original Message ---
>>> >>
>>> >> Sender : Russ Knize
>>> >>
>>> >> Date : 九月 07, 2013 04:25 (GMT+09:00)
>>> >>
>>> >> Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
>>> >> performance
>>> >>
>>> >>
>>> >>
>>> >> I encountered this same issue recently and solved it in much the same
>>> >> way.  Can we rename "spin_lock" to something more meaningful?
>>> >>
>>> >>
>>> >> This race actually exposed a potential deadlock between f2fs_create()
>>> >> and f2fs_initxattrs():
>>> >>
>>> >>
>>> >> - vfs_create()
>>> >>  - f2fs_create() - takes an fs_lock
>>> >>   - f2fs_add_link()
>>> >>- __f2fs_add_link()
>>> >> - init_inode_metadata()
>>> >>  - f2fs_init_security()
>>> >>   - security_inode_init_s

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi Gu,

2013/9/11 Gu Zheng :
> Hi Jaegeuk, Chao,
>
> On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:
>
>> Hi,
>>
>> At first, thank you for the report and please follow the email writing
>> rules. :)
>>
>> Anyway, I agree to the below issue.
>> One thing that I can think of is that we don't need to use the
>> spin_lock, since we don't care about the exact lock number, but just
>> need to get any not-collided number.
>
> IMHO, just moving sbi->next_lock_num++ before 
> mutex_lock(>fs_lock[next_lock])
> can avoid unbalance issue mostly.
> IMO, the case two or more threads increase sbi->next_lock_num in the same 
> time is
> really very very little. If you think it is not rigorous, change 
> next_lock_num to
> atomic one can fix it.
> What's your opinion?

As your opinion, I think it is enough to replace it with simple
sbi->next_lock_num++.
Thanks,

>
> Regards,
> Gu
>
>>
>> So, how about removing the spin_lock?
>> And how about using a random number?
>
>> Thanks,
>>
>> 2013-09-06 (금), 09:48 +, Chao Yu:
>>> Hi Kim:
>>>
>>>  I think there is a performance problem: when all sbi->fs_lock is
>>> holded,
>>>
>>> then all other threads may get the same next_lock value from
>>> sbi->next_lock_num in function mutex_lock_op,
>>>
>>> and wait to get the same lock at position fs_lock[next_lock], it
>>> unbalance the fs_lock usage.
>>>
>>> It may lost performance when we do the multithread test.
>>>
>>>
>>>
>>> Here is the patch to fix this problem:
>>>
>>>
>>>
>>> Signed-off-by: Yu Chao 
>>>
>>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>>
>>> old mode 100644
>>>
>>> new mode 100755
>>>
>>> index 467d42d..983bb45
>>>
>>> --- a/fs/f2fs/f2fs.h
>>>
>>> +++ b/fs/f2fs/f2fs.h
>>>
>>> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>>>
>>> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
>>> operations */
>>>
>>> struct mutex node_write;/* locking node writes
>>> */
>>>
>>> struct mutex writepages;/* mutex for
>>> writepages() */
>>>
>>> +   spinlock_t spin_lock;   /* lock for
>>> next_lock_num */
>>>
>>> unsigned char next_lock_num;/* round-robin global
>>> locks */
>>>
>>> int por_doing;  /* recovery is doing
>>> or not */
>>>
>>> int on_build_free_nids; /* build_free_nids is
>>> doing */
>>>
>>> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
>>> f2fs_sb_info *sbi)
>>>
>>>
>>>
>>>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
>>>
>>>  {
>>>
>>> -   unsigned char next_lock = sbi->next_lock_num %
>>> NR_GLOBAL_LOCKS;
>>>
>>> +   unsigned char next_lock;
>>>
>>> int i = 0;
>>>
>>>
>>>
>>> for (; i < NR_GLOBAL_LOCKS; i++)
>>>
>>> if (mutex_trylock(>fs_lock[i]))
>>>
>>> return i;
>>>
>>>
>>>
>>> -   mutex_lock(>fs_lock[next_lock]);
>>>
>>> +   spin_lock(>spin_lock);
>>>
>>> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
>>>
>>> sbi->next_lock_num++;
>>>
>>> +   spin_unlock(>spin_lock);
>>>
>>> +
>>>
>>> +   mutex_lock(>fs_lock[next_lock]);
>>>
>>> return next_lock;
>>>
>>>  }
>>>
>>>
>>>
>>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>>
>>> old mode 100644
>>>
>>> new mode 100755
>>>
>>> index 75c7dc3..4f27596
>>>
>>> --- a/fs/f2fs/super.c
>>>
>>> +++ b/fs/f2fs/super.c
>>>
>>> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
>>> void *data, int silent)
>>>
>>> mutex_init(>cp_mutex);
>>>
>>> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>>
>>> mutex_init(>fs_lock[i]);
>>>
>>> +   spin_lock_init(>spin_lock);
>>>
>>> mutex_init(>node_write);
>>>
>>> sbi->por_doing = 0;
>>>
>>> spin_lock_init(>stat_lock);
>>>
>>> (END)
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
> --
> How ServiceNow helps IT people transform IT departments:
> 1. Consolidate legacy IT systems to a single system of record for IT
> 2. Standardize and globalize service processes across IT
> 3. Implement zero-touch automation to replace manual, redundant tasks
> http://pubads.g.doubleclick.net/gampad/clk?id=5127=/4140/ostg.clktrk
> ___
> Linux-f2fs-devel mailing list
> linux-f2fs-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi Russ,

The usage of fs_locks is for the recovery, so it doesn't matter
with stress-testing.
Actually what I've concerned is that we should not grab two or
more fs_locks in the same call path.
Thanks,

2013/9/11 Russ Knize :
> Hi Jaegeuk/Gu,
>
> I've removed the lock and have been stress-testing with SELinux and some
> additional xattr torture for 24+ hours.  I have not encountered any issues
> yet.
>
> My previous suggestion about moving the lock is probably not a good idea
> without some significant code rework (thanks to the f2fs_balance_fs call in
> f2fs_setxattr).
>
> Russ
>
> On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng  wrote:
>>
>> Hi Jaegeuk,
>> On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:
>>
>> > Hi,
>> >
>> > 2013-09-07 (토), 08:00 +, Chao Yu:
>> >> Hi Knize,
>> >>
>> >> Thanks for your reply, I think it's actually meaningless that it's
>> >> being named after "spin_lock",
>> >> it's better to rename this spinlock to "round_robin_lock".
>> >>
>> >> This patch can only resolve the issue of unbalanced fs_lock usage,
>> >> it can not fix the deadlock issue.
>> >> can we fix deadlock issue through this method:
>> >>
>> >> - vfs_create()
>> >>  - f2fs_create() - takes an fs_lock and save current thread info into
>> >> thread_info[NR_GLOBAL_LOCKS]
>> >>   - f2fs_add_link()
>> >>- __f2fs_add_link()
>> >> - init_inode_metadata()
>> >>  - f2fs_init_security()
>> >>   - security_inode_init_security()
>> >>- f2fs_initxattrs()
>> >> - f2fs_setxattr() - get fs_lock only if there is no current
>> >> thread info in thread_info
>> >>
>> >> So it keeps one thread can only hold one fs_lock to avoid deadlock.
>> >> Can we use this solution?
>> >
>> > It could be.
>> > But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()
>>
>> Agree. This fs_lock here is used to protect the xattr from parallel
>> modification,
>> but here is in the initxattrs routine, parallel modification can not
>> happen.
>> And in the normal setxattr routine the inode->i_mutex (vfs layer) is used
>> to
>> avoid parallel modification. So I think this fs_lock is needless.
>> Am I missing something?
>>
>> Regards,
>> Gu
>>
>> > level, since this case only happens when f2fs_initxattrs() is called.
>> > Let's think about ut in more detail.
>> > Thanks,
>> >
>> >>
>> >>
>> >>
>> >> thanks again!
>> >>
>> >>
>> >>
>> >> --- Original Message ---
>> >>
>> >> Sender : Russ Knize
>> >>
>> >> Date : 九月 07, 2013 04:25 (GMT+09:00)
>> >>
>> >> Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
>> >> performance
>> >>
>> >>
>> >>
>> >> I encountered this same issue recently and solved it in much the same
>> >> way.  Can we rename "spin_lock" to something more meaningful?
>> >>
>> >>
>> >> This race actually exposed a potential deadlock between f2fs_create()
>> >> and f2fs_initxattrs():
>> >>
>> >>
>> >> - vfs_create()
>> >>  - f2fs_create() - takes an fs_lock
>> >>   - f2fs_add_link()
>> >>- __f2fs_add_link()
>> >> - init_inode_metadata()
>> >>  - f2fs_init_security()
>> >>   - security_inode_init_security()
>> >>- f2fs_initxattrs()
>> >> - f2fs_setxattr() - also takes an fs_lock
>> >>
>> >>
>> >> If another CPU happens to have the same lock that f2fs_setxattr() was
>> >> trying to take because of the race around next_lock_num, we can get
>> >> into a deadlock situation if the two threads are also contending over
>> >> another resource (like bdi).
>> >>
>> >>
>> >> Another scenario is if the above happens while another thread is in
>> >> the middle of grabbing all of the locks via mutex_lock_all().
>> >>  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
>> >> and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
>> >> for.
>> >>
>> >>
>> >> Russ
>> >>
>

Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi,

2013/9/11 Chao Yu 
>
> Hi Kim,
>
> I did some tests as you mention of using random instead of spin_lock.
> The test model is as following:
> eight threads race to grab one of eight locks for one thousand times,
> and I used four methods to generate lock num:
>
> 1.atomic_add_return(1, >next_lock_num) % NR_GLOBAL_LOCKS;
> 2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
> 3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
> 4.get_random_bytes(_lock, sizeof(unsigned int));
>
> the result indicate that:
> max count of collide continuously: 4 > 3 > 2 = 1
> max-min count of lock is grabbed: 4 > 3 > 2 = 1
> elapsed time of generating: 3 > 2 > 4 > 1
>
> So I think it's better to use atomic_add_return in round-robin method to
> cost less time and reduce collide.
> What's your opinion?

Could you test with sbi->next_lock_num++ only instead of using
atomic_add_return?
IMO, this is just an integer value and still I don't think this value should
be covered by any kind of locks.
Thanks,

>
> thanks
>
> --- Original Message ---
> Sender : ??? S5(??)/??/?????(???)/
> Date : 九月 10, 2013 09:52 (GMT+09:00)
> Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
>
> Hi,
>
> At first, thank you for the report and please follow the email writing
> rules. :)
>
> Anyway, I agree to the below issue.
> One thing that I can think of is that we don't need to use the
> spin_lock, since we don't care about the exact lock number, but just
> need to get any not-collided number.
>
> So, how about removing the spin_lock?
> And how about using a random number?
> Thanks,
>
> 2013-09-06 (?), 09:48 +, Chao Yu:
> > Hi Kim:
> >
> >  I think there is a performance problem: when all sbi->fs_lock is
> > holded,
> >
> > then all other threads may get the same next_lock value from
> > sbi->next_lock_num in function mutex_lock_op,
> >
> > and wait to get the same lock at position fs_lock[next_lock], it
> > unbalance the fs_lock usage.
> >
> > It may lost performance when we do the multithread test.
> >
> >
> >
> > Here is the patch to fix this problem:
> >
> >
> >
> > Signed-off-by: Yu Chao
> >
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >
> > old mode 100644
> >
> > new mode 100755
> >
> > index 467d42d..983bb45
> >
> > --- a/fs/f2fs/f2fs.h
> >
> > +++ b/fs/f2fs/f2fs.h
> >
> > @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> >
> > struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> > operations */
> >
> > struct mutex node_write;/* locking node writes
> > */
> >
> > struct mutex writepages;/* mutex for
> > writepages() */
> >
> > +   spinlock_t spin_lock;   /* lock for
> > next_lock_num */
> >
> > unsigned char next_lock_num;/* round-robin global
> > locks */
> >
> > int por_doing;  /* recovery is doing
> > or not */
> >
> > int on_build_free_nids; /* build_free_nids is
> > doing */
> >
> > @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
> > f2fs_sb_info *sbi)
> >
> >
> >
> >  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> >
> >  {
> >
> > -   unsigned char next_lock = sbi->next_lock_num %
> > NR_GLOBAL_LOCKS;
> >
> > +   unsigned char next_lock;
> >
> > int i = 0;
> >
> >
> >
> > for (; i < NR_GLOBAL_LOCKS; i++)
> >
> > if (mutex_trylock(>fs_lock[i]))
> >
> > return i;
> >
> >
> >
> > -   mutex_lock(>fs_lock[next_lock]);
> >
> > +   spin_lock(>spin_lock);
> >
> > +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> >
> > sbi->next_lock_num++;
> >
> > +   spin_unlock(>spin_lock);
> >
> > +
> >
> > +   mutex_lock(>fs_lock[next_lock]);
> >
> > return next_lock;
> >
> >  }
> >
> >
> >
> > diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> >
> > old mode 100644
> >
> > new mode 100755
> >
> > index 75c7dc3..4f27596
> >
> > --- a/fs/f2fs/super.c
> >
> > +++ b/fs/f2fs/super.c
> >
> > @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
> > void *data, int silent)
> >
> > mutex_init(>cp_mutex);
> >
> > for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> >
> > mutex_init(>fs_lock[i]);
> >
> > +   spin_lock_init(>spin_lock);
> >
> > mutex_init(>node_write);
> >
> > sbi->por_doing = 0;
> >
> > spin_lock_init(>stat_lock);
> >
> > (END)
> >
> >
> >
> >
> >
> >
>
> --
> Jaegeuk Kim
> Samsung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi,

2013/9/11 Chao Yu chao2...@samsung.com

 Hi Kim,

 I did some tests as you mention of using random instead of spin_lock.
 The test model is as following:
 eight threads race to grab one of eight locks for one thousand times,
 and I used four methods to generate lock num:

 1.atomic_add_return(1, sbi-next_lock_num) % NR_GLOBAL_LOCKS;
 2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
 3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
 4.get_random_bytes(next_lock, sizeof(unsigned int));

 the result indicate that:
 max count of collide continuously: 4  3  2 = 1
 max-min count of lock is grabbed: 4  3  2 = 1
 elapsed time of generating: 3  2  4  1

 So I think it's better to use atomic_add_return in round-robin method to
 cost less time and reduce collide.
 What's your opinion?

Could you test with sbi-next_lock_num++ only instead of using
atomic_add_return?
IMO, this is just an integer value and still I don't think this value should
be covered by any kind of locks.
Thanks,


 thanks

 --- Original Message ---
 Sender : ???jaegeuk@samsung.com S5(??)/??/?(???)/
 Date : 九月 10, 2013 09:52 (GMT+09:00)
 Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

 Hi,

 At first, thank you for the report and please follow the email writing
 rules. :)

 Anyway, I agree to the below issue.
 One thing that I can think of is that we don't need to use the
 spin_lock, since we don't care about the exact lock number, but just
 need to get any not-collided number.

 So, how about removing the spin_lock?
 And how about using a random number?
 Thanks,

 2013-09-06 (?), 09:48 +, Chao Yu:
  Hi Kim:
 
   I think there is a performance problem: when all sbi-fs_lock is
  holded,
 
  then all other threads may get the same next_lock value from
  sbi-next_lock_num in function mutex_lock_op,
 
  and wait to get the same lock at position fs_lock[next_lock], it
  unbalance the fs_lock usage.
 
  It may lost performance when we do the multithread test.
 
 
 
  Here is the patch to fix this problem:
 
 
 
  Signed-off-by: Yu Chao
 
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
  old mode 100644
 
  new mode 100755
 
  index 467d42d..983bb45
 
  --- a/fs/f2fs/f2fs.h
 
  +++ b/fs/f2fs/f2fs.h
 
  @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
  struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
  operations */
 
  struct mutex node_write;/* locking node writes
  */
 
  struct mutex writepages;/* mutex for
  writepages() */
 
  +   spinlock_t spin_lock;   /* lock for
  next_lock_num */
 
  unsigned char next_lock_num;/* round-robin global
  locks */
 
  int por_doing;  /* recovery is doing
  or not */
 
  int on_build_free_nids; /* build_free_nids is
  doing */
 
  @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
  f2fs_sb_info *sbi)
 
 
 
   static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
   {
 
  -   unsigned char next_lock = sbi-next_lock_num %
  NR_GLOBAL_LOCKS;
 
  +   unsigned char next_lock;
 
  int i = 0;
 
 
 
  for (; i  NR_GLOBAL_LOCKS; i++)
 
  if (mutex_trylock(sbi-fs_lock[i]))
 
  return i;
 
 
 
  -   mutex_lock(sbi-fs_lock[next_lock]);
 
  +   spin_lock(sbi-spin_lock);
 
  +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
  sbi-next_lock_num++;
 
  +   spin_unlock(sbi-spin_lock);
 
  +
 
  +   mutex_lock(sbi-fs_lock[next_lock]);
 
  return next_lock;
 
   }
 
 
 
  diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
 
  old mode 100644
 
  new mode 100755
 
  index 75c7dc3..4f27596
 
  --- a/fs/f2fs/super.c
 
  +++ b/fs/f2fs/super.c
 
  @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
  void *data, int silent)
 
  mutex_init(sbi-cp_mutex);
 
  for (i = 0; i  NR_GLOBAL_LOCKS; i++)
 
  mutex_init(sbi-fs_lock[i]);
 
  +   spin_lock_init(sbi-spin_lock);
 
  mutex_init(sbi-node_write);
 
  sbi-por_doing = 0;
 
  spin_lock_init(sbi-stat_lock);
 
  (END)
 
 
 
 
 
 

 --
 Jaegeuk Kim
 Samsung
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi Russ,

The usage of fs_locks is for the recovery, so it doesn't matter
with stress-testing.
Actually what I've concerned is that we should not grab two or
more fs_locks in the same call path.
Thanks,

2013/9/11 Russ Knize russ.kn...@motorola.com:
 Hi Jaegeuk/Gu,

 I've removed the lock and have been stress-testing with SELinux and some
 additional xattr torture for 24+ hours.  I have not encountered any issues
 yet.

 My previous suggestion about moving the lock is probably not a good idea
 without some significant code rework (thanks to the f2fs_balance_fs call in
 f2fs_setxattr).

 Russ

 On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng guz.f...@cn.fujitsu.com wrote:

 Hi Jaegeuk,
 On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:

  Hi,
 
  2013-09-07 (토), 08:00 +, Chao Yu:
  Hi Knize,
 
  Thanks for your reply, I think it's actually meaningless that it's
  being named after spin_lock,
  it's better to rename this spinlock to round_robin_lock.
 
  This patch can only resolve the issue of unbalanced fs_lock usage,
  it can not fix the deadlock issue.
  can we fix deadlock issue through this method:
 
  - vfs_create()
   - f2fs_create() - takes an fs_lock and save current thread info into
  thread_info[NR_GLOBAL_LOCKS]
- f2fs_add_link()
 - __f2fs_add_link()
  - init_inode_metadata()
   - f2fs_init_security()
- security_inode_init_security()
 - f2fs_initxattrs()
  - f2fs_setxattr() - get fs_lock only if there is no current
  thread info in thread_info
 
  So it keeps one thread can only hold one fs_lock to avoid deadlock.
  Can we use this solution?
 
  It could be.
  But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()

 Agree. This fs_lock here is used to protect the xattr from parallel
 modification,
 but here is in the initxattrs routine, parallel modification can not
 happen.
 And in the normal setxattr routine the inode-i_mutex (vfs layer) is used
 to
 avoid parallel modification. So I think this fs_lock is needless.
 Am I missing something?

 Regards,
 Gu

  level, since this case only happens when f2fs_initxattrs() is called.
  Let's think about ut in more detail.
  Thanks,
 
 
 
 
  thanks again!
 
 
 
  --- Original Message ---
 
  Sender : Russ Knizeruss.kn...@motorola.com
 
  Date : 九月 07, 2013 04:25 (GMT+09:00)
 
  Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
  performance
 
 
 
  I encountered this same issue recently and solved it in much the same
  way.  Can we rename spin_lock to something more meaningful?
 
 
  This race actually exposed a potential deadlock between f2fs_create()
  and f2fs_initxattrs():
 
 
  - vfs_create()
   - f2fs_create() - takes an fs_lock
- f2fs_add_link()
 - __f2fs_add_link()
  - init_inode_metadata()
   - f2fs_init_security()
- security_inode_init_security()
 - f2fs_initxattrs()
  - f2fs_setxattr() - also takes an fs_lock
 
 
  If another CPU happens to have the same lock that f2fs_setxattr() was
  trying to take because of the race around next_lock_num, we can get
  into a deadlock situation if the two threads are also contending over
  another resource (like bdi).
 
 
  Another scenario is if the above happens while another thread is in
  the middle of grabbing all of the locks via mutex_lock_all().
   f2fs_create() is holding a lock that mutex_lock_all() is waiting for
  and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
  for.
 
 
  Russ
 
 
  On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
  Hi Kim:
 
   I think there is a performance problem: when all
  sbi-fs_lock is holded,
 
  then all other threads may get the same next_lock value from
  sbi-next_lock_num in function mutex_lock_op,
 
  and wait to get the same lock at position fs_lock[next_lock],
  it unbalance the fs_lock usage.
 
  It may lost performance when we do the multithread test.
 
 
 
  Here is the patch to fix this problem:
 
 
 
  Signed-off-by: Yu Chao chao2...@samsung.com
 
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
  old mode 100644
 
  new mode 100755
 
  index 467d42d..983bb45
 
  --- a/fs/f2fs/f2fs.h
 
  +++ b/fs/f2fs/f2fs.h
 
  @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
  struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
  operations */
 
  struct mutex node_write;/* locking
  node writes */
 
  struct mutex writepages;/* mutex for
  writepages() */
 
  +   spinlock_t spin_lock;   /* lock for
  next_lock_num */
 
  unsigned char next_lock_num;/* round-robin
  global locks */
 
  int por_doing;  /* recovery is
  doing

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Kim Jaegeuk
Hi Gu,

2013/9/11 Gu Zheng guz.f...@cn.fujitsu.com:
 Hi Jaegeuk, Chao,

 On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

 Hi,

 At first, thank you for the report and please follow the email writing
 rules. :)

 Anyway, I agree to the below issue.
 One thing that I can think of is that we don't need to use the
 spin_lock, since we don't care about the exact lock number, but just
 need to get any not-collided number.

 IMHO, just moving sbi-next_lock_num++ before 
 mutex_lock(sbi-fs_lock[next_lock])
 can avoid unbalance issue mostly.
 IMO, the case two or more threads increase sbi-next_lock_num in the same 
 time is
 really very very little. If you think it is not rigorous, change 
 next_lock_num to
 atomic one can fix it.
 What's your opinion?

As your opinion, I think it is enough to replace it with simple
sbi-next_lock_num++.
Thanks,


 Regards,
 Gu


 So, how about removing the spin_lock?
 And how about using a random number?

 Thanks,

 2013-09-06 (금), 09:48 +, Chao Yu:
 Hi Kim:

  I think there is a performance problem: when all sbi-fs_lock is
 holded,

 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op,

 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage.

 It may lost performance when we do the multithread test.



 Here is the patch to fix this problem:



 Signed-off-by: Yu Chao chao2...@samsung.com

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking node writes
 */

 struct mutex writepages;/* mutex for
 writepages() */

 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */

 unsigned char next_lock_num;/* round-robin global
 locks */

 int por_doing;  /* recovery is doing
 or not */

 int on_build_free_nids; /* build_free_nids is
 doing */

 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)



  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)

  {

 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;

 +   unsigned char next_lock;

 int i = 0;



 for (; i  NR_GLOBAL_LOCKS; i++)

 if (mutex_trylock(sbi-fs_lock[i]))

 return i;



 -   mutex_lock(sbi-fs_lock[next_lock]);

 +   spin_lock(sbi-spin_lock);

 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;

 sbi-next_lock_num++;

 +   spin_unlock(sbi-spin_lock);

 +

 +   mutex_lock(sbi-fs_lock[next_lock]);

 return next_lock;

  }



 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

 old mode 100644

 new mode 100755

 index 75c7dc3..4f27596

 --- a/fs/f2fs/super.c

 +++ b/fs/f2fs/super.c

 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
 void *data, int silent)

 mutex_init(sbi-cp_mutex);

 for (i = 0; i  NR_GLOBAL_LOCKS; i++)

 mutex_init(sbi-fs_lock[i]);

 +   spin_lock_init(sbi-spin_lock);

 mutex_init(sbi-node_write);

 sbi-por_doing = 0;

 spin_lock_init(sbi-stat_lock);

 (END)










 --
 How ServiceNow helps IT people transform IT departments:
 1. Consolidate legacy IT systems to a single system of record for IT
 2. Standardize and globalize service processes across IT
 3. Implement zero-touch automation to replace manual, redundant tasks
 http://pubads.g.doubleclick.net/gampad/clk?id=5127iu=/4140/ostg.clktrk
 ___
 Linux-f2fs-devel mailing list
 linux-f2fs-de...@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Russ Knize
Jaegeuk,

My tests include forced kernel panics while fsstress is running, which
generates a lot of recovery activity.  Sorry I wasn't more clear.

I understand your concern, which is why I first tried to keep the
fs_lock in the xattr_handler-set() path from VFS while removing it
from the call path I shared earlier.  Doing this requires reworking
the xattr code a bit.  Based on what Gu said about the inode mutex, it
seems like we can just remove it.

Russ

On Wed, Sep 11, 2013 at 8:19 AM, Kim Jaegeuk jaegeuk@gmail.com wrote:
 Hi Russ,

 The usage of fs_locks is for the recovery, so it doesn't matter
 with stress-testing.
 Actually what I've concerned is that we should not grab two or
 more fs_locks in the same call path.
 Thanks,

 2013/9/11 Russ Knize russ.kn...@motorola.com:
 Hi Jaegeuk/Gu,

 I've removed the lock and have been stress-testing with SELinux and some
 additional xattr torture for 24+ hours.  I have not encountered any issues
 yet.

 My previous suggestion about moving the lock is probably not a good idea
 without some significant code rework (thanks to the f2fs_balance_fs call in
 f2fs_setxattr).

 Russ

 On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng guz.f...@cn.fujitsu.com wrote:

 Hi Jaegeuk,
 On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:

  Hi,
 
  2013-09-07 (토), 08:00 +, Chao Yu:
  Hi Knize,
 
  Thanks for your reply, I think it's actually meaningless that it's
  being named after spin_lock,
  it's better to rename this spinlock to round_robin_lock.
 
  This patch can only resolve the issue of unbalanced fs_lock usage,
  it can not fix the deadlock issue.
  can we fix deadlock issue through this method:
 
  - vfs_create()
   - f2fs_create() - takes an fs_lock and save current thread info into
  thread_info[NR_GLOBAL_LOCKS]
- f2fs_add_link()
 - __f2fs_add_link()
  - init_inode_metadata()
   - f2fs_init_security()
- security_inode_init_security()
 - f2fs_initxattrs()
  - f2fs_setxattr() - get fs_lock only if there is no current
  thread info in thread_info
 
  So it keeps one thread can only hold one fs_lock to avoid deadlock.
  Can we use this solution?
 
  It could be.
  But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()

 Agree. This fs_lock here is used to protect the xattr from parallel
 modification,
 but here is in the initxattrs routine, parallel modification can not
 happen.
 And in the normal setxattr routine the inode-i_mutex (vfs layer) is used
 to
 avoid parallel modification. So I think this fs_lock is needless.
 Am I missing something?

 Regards,
 Gu

  level, since this case only happens when f2fs_initxattrs() is called.
  Let's think about ut in more detail.
  Thanks,
 
 
 
 
  thanks again!
 
 
 
  --- Original Message ---
 
  Sender : Russ Knizeruss.kn...@motorola.com
 
  Date : 九月 07, 2013 04:25 (GMT+09:00)
 
  Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
  performance
 
 
 
  I encountered this same issue recently and solved it in much the same
  way.  Can we rename spin_lock to something more meaningful?
 
 
  This race actually exposed a potential deadlock between f2fs_create()
  and f2fs_initxattrs():
 
 
  - vfs_create()
   - f2fs_create() - takes an fs_lock
- f2fs_add_link()
 - __f2fs_add_link()
  - init_inode_metadata()
   - f2fs_init_security()
- security_inode_init_security()
 - f2fs_initxattrs()
  - f2fs_setxattr() - also takes an fs_lock
 
 
  If another CPU happens to have the same lock that f2fs_setxattr() was
  trying to take because of the race around next_lock_num, we can get
  into a deadlock situation if the two threads are also contending over
  another resource (like bdi).
 
 
  Another scenario is if the above happens while another thread is in
  the middle of grabbing all of the locks via mutex_lock_all().
   f2fs_create() is holding a lock that mutex_lock_all() is waiting for
  and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
  for.
 
 
  Russ
 
 
  On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
  Hi Kim:
 
   I think there is a performance problem: when all
  sbi-fs_lock is holded,
 
  then all other threads may get the same next_lock value from
  sbi-next_lock_num in function mutex_lock_op,
 
  and wait to get the same lock at position fs_lock[next_lock],
  it unbalance the fs_lock usage.
 
  It may lost performance when we do the multithread test.
 
 
 
  Here is the patch to fix this problem:
 
 
 
  Signed-off-by: Yu Chao chao2...@samsung.com
 
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
  old mode 100644
 
  new mode 100755
 
  index 467d42d..983bb45
 
  --- a/fs/f2fs/f2fs.h
 
  +++ b/fs/f2fs/f2fs.h
 
  @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
  struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Jin Xu

Hi,

On 11/09/2013 21:19, Kim Jaegeuk wrote:

Hi Russ,

The usage of fs_locks is for the recovery, so it doesn't matter
with stress-testing.
Actually what I've concerned is that we should not grab two or
more fs_locks in the same call path.
Thanks,



I am wondering why we don't use other kind of methods like r/w semaphore
instead of the fs_locks for access control purpose. Is it due to the
little lower performance of r/w semaphore or other reasons?

I think the r/w semaphore can bring more clearer access control over
the current fs_locks, and will not suffer the problems reported here.
It can be used in a way that i/o tasks just acquire read sem while the
checkpoint task takes the write sem.

Or do we have other better method?

Regards,
Jin


2013/9/11 Russ Knize russ.kn...@motorola.com:

Hi Jaegeuk/Gu,

I've removed the lock and have been stress-testing with SELinux and some
additional xattr torture for 24+ hours.  I have not encountered any issues
yet.

My previous suggestion about moving the lock is probably not a good idea
without some significant code rework (thanks to the f2fs_balance_fs call in
f2fs_setxattr).

Russ

On Tue, Sep 10, 2013 at 10:22 PM, Gu Zheng guz.f...@cn.fujitsu.com wrote:


Hi Jaegeuk,
On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:


Hi,

2013-09-07 (토), 08:00 +, Chao Yu:

Hi Knize,

 Thanks for your reply, I think it's actually meaningless that it's
being named after spin_lock,
it's better to rename this spinlock to round_robin_lock.

 This patch can only resolve the issue of unbalanced fs_lock usage,
it can not fix the deadlock issue.
can we fix deadlock issue through this method:

- vfs_create()
  - f2fs_create() - takes an fs_lock and save current thread info into
thread_info[NR_GLOBAL_LOCKS]
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - get fs_lock only if there is no current
thread info in thread_info

So it keeps one thread can only hold one fs_lock to avoid deadlock.
Can we use this solution?


It could be.
But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()


Agree. This fs_lock here is used to protect the xattr from parallel
modification,
but here is in the initxattrs routine, parallel modification can not
happen.
And in the normal setxattr routine the inode-i_mutex (vfs layer) is used
to
avoid parallel modification. So I think this fs_lock is needless.
Am I missing something?

Regards,
Gu


level, since this case only happens when f2fs_initxattrs() is called.
Let's think about ut in more detail.
Thanks,





thanks again!



--- Original Message ---

Sender : Russ Knizeruss.kn...@motorola.com

Date : 九月 07, 2013 04:25 (GMT+09:00)

Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
performance



I encountered this same issue recently and solved it in much the same
way.  Can we rename spin_lock to something more meaningful?


This race actually exposed a potential deadlock between f2fs_create()
and f2fs_initxattrs():


- vfs_create()
  - f2fs_create() - takes an fs_lock
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - also takes an fs_lock


If another CPU happens to have the same lock that f2fs_setxattr() was
trying to take because of the race around next_lock_num, we can get
into a deadlock situation if the two threads are also contending over
another resource (like bdi).


Another scenario is if the above happens while another thread is in
the middle of grabbing all of the locks via mutex_lock_all().
  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
for.


Russ


On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
 Hi Kim:

  I think there is a performance problem: when all
 sbi-fs_lock is holded,

 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op,

 and wait to get the same lock at position fs_lock[next_lock],
 it unbalance the fs_lock usage.

 It may lost performance when we do the multithread test.



 Here is the patch to fix this problem:



 Signed-off-by: Yu Chao chao2...@samsung.com

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking
 node writes */

 struct mutex writepages

Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread 俞超
Hi Kim

 -Original Message-
 From: Kim Jaegeuk [mailto:jaegeuk@gmail.com]
 Sent: Wednesday, September 11, 2013 9:15 PM
 To: chao2...@samsung.com
 Cc: ???; 谭姝; linux-fsde...@vger.kernel.org;
linux-kernel@vger.kernel.org;
 linux-f2fs-de...@lists.sourceforge.net
 Subject: Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better
performance
 
 Hi,
 
 2013/9/11 Chao Yu chao2...@samsung.com
 
  Hi Kim,
 
  I did some tests as you mention of using random instead of spin_lock.
  The test model is as following:
  eight threads race to grab one of eight locks for one thousand times,
  and I used four methods to generate lock num:
 
  1.atomic_add_return(1, sbi-next_lock_num) % NR_GLOBAL_LOCKS;
  2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
  3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
  4.get_random_bytes(next_lock, sizeof(unsigned int));
 
  the result indicate that:
  max count of collide continuously: 4  3  2 = 1 max-min count of lock
  is grabbed: 4  3  2 = 1 elapsed time of generating: 3  2  4  1
 
  So I think it's better to use atomic_add_return in round-robin method
  to cost less time and reduce collide.
  What's your opinion?
 
 Could you test with sbi-next_lock_num++ only instead of using
 atomic_add_return?
 IMO, this is just an integer value and still I don't think this value
should be
 covered by any kind of locks.
 Thanks,

Thanks for the advice, I have tested sbi-next_lock_num++.
The time cost of it is a little bit lower than the atomic one's.
for running 8 thread for 100 times test.
the performance of it's balance and collide play quit well than I expected.

Can we modify this patch as following?

root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff --stat
 fs/f2fs/f2fs.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
root@virtaulmachine:/home/yuchao/git/linux-next/fs/f2fs# git diff
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 608f0df..7fd99d8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -544,15 +544,15 @@ static inline void mutex_unlock_all(struct
f2fs_sb_info *sbi)
 
 static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 {
-   unsigned char next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
+   unsigned char next_lock;
int i = 0;
 
for (; i  NR_GLOBAL_LOCKS; i++)
if (mutex_trylock(sbi-fs_lock[i]))
return i;
 
+   next_lock = sbi-next_lock_num++ % NR_GLOBAL_LOCKS;
mutex_lock(sbi-fs_lock[next_lock]);
-   sbi-next_lock_num++;
return next_lock;
 }

 
 
  thanks
 
  --- Original Message ---
  Sender : ???jaegeuk@samsung.com S5(??)/??/?(???)/
  Date : 九月 10, 2013 09:52 (GMT+09:00)
  Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better
  performance
 
  Hi,
 
  At first, thank you for the report and please follow the email writing
  rules. :)
 
  Anyway, I agree to the below issue.
  One thing that I can think of is that we don't need to use the
  spin_lock, since we don't care about the exact lock number, but just
  need to get any not-collided number.
 
  So, how about removing the spin_lock?
  And how about using a random number?
  Thanks,
 
  2013-09-06 (?), 09:48 +, Chao Yu:
   Hi Kim:
  
I think there is a performance problem: when all sbi-fs_lock
   is holded,
  
   then all other threads may get the same next_lock value from
   sbi-next_lock_num in function mutex_lock_op,
  
   and wait to get the same lock at position fs_lock[next_lock], it
   unbalance the fs_lock usage.
  
   It may lost performance when we do the multithread test.
  
  
  
   Here is the patch to fix this problem:
  
  
  
   Signed-off-by: Yu Chao
  
   diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
  
   old mode 100644
  
   new mode 100755
  
   index 467d42d..983bb45
  
   --- a/fs/f2fs/f2fs.h
  
   +++ b/fs/f2fs/f2fs.h
  
   @@ -371,6 +371,7 @@ struct f2fs_sb_info {
  
   struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
   operations */
  
   struct mutex node_write;/* locking node
 writes
   */
  
   struct mutex writepages;/* mutex for
   writepages() */
  
   +   spinlock_t spin_lock;   /* lock for
   next_lock_num */
  
   unsigned char next_lock_num;/* round-robin
 global
   locks */
  
   int por_doing;  /* recovery is
 doing
   or not */
  
   int on_build_free_nids; /* build_free_nids is
   doing */
  
   @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
   f2fs_sb_info *sbi)
  
  
  
static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
  
{
  
   -   unsigned char next_lock = sbi-next_lock_num %
   NR_GLOBAL_LOCKS;
  
   +   unsigned char next_lock;
  
   int i = 0;
  
  
  
   for (; i  NR_GLOBAL_LOCKS; i++)
  
   if (mutex_trylock(sbi-fs_lock[i

RE: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread 俞超
Hi Gu

 -Original Message-
 From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
 Sent: Wednesday, September 11, 2013 1:38 PM
 To: jaegeuk@samsung.com
 Cc: chao2...@samsung.com; shu@samsung.com;
 linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
 linux-f2fs-de...@lists.sourceforge.net
 Subject: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance
 
 Hi Jaegeuk, Chao,
 
 On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:
 
  Hi,
 
  At first, thank you for the report and please follow the email writing
  rules. :)
 
  Anyway, I agree to the below issue.
  One thing that I can think of is that we don't need to use the
  spin_lock, since we don't care about the exact lock number, but just
  need to get any not-collided number.
 
 IMHO, just moving sbi-next_lock_num++ before
 mutex_lock(sbi-fs_lock[next_lock])
 can avoid unbalance issue mostly.
 IMO, the case two or more threads increase sbi-next_lock_num in the same
 time is really very very little. If you think it is not rigorous, change
 next_lock_num to atomic one can fix it.
 What's your opinion?
 
 Regards,
 Gu

I did the test sbi-next_lock_num++ compare with the atomic one,
And I found performance of them is almost the same under a small number thread 
racing.
So as your and Kim's opinion, it's enough to use sbi-next_lock_num++ to fix 
this issue.

Thanks for the advice.
 
 
  So, how about removing the spin_lock?
  And how about using a random number?
 
  Thanks,
 
  2013-09-06 (금), 09:48 +, Chao Yu:
  Hi Kim:
 
   I think there is a performance problem: when all sbi-fs_lock is
  holded,
 
  then all other threads may get the same next_lock value from
  sbi-next_lock_num in function mutex_lock_op,
 
  and wait to get the same lock at position fs_lock[next_lock], it
  unbalance the fs_lock usage.
 
  It may lost performance when we do the multithread test.
 
 
 
  Here is the patch to fix this problem:
 
 
 
  Signed-off-by: Yu Chao chao2...@samsung.com
 
  diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
  old mode 100644
 
  new mode 100755
 
  index 467d42d..983bb45
 
  --- a/fs/f2fs/f2fs.h
 
  +++ b/fs/f2fs/f2fs.h
 
  @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
  struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
  operations */
 
  struct mutex node_write;/* locking node
 writes
  */
 
  struct mutex writepages;/* mutex for
  writepages() */
 
  +   spinlock_t spin_lock;   /* lock for
  next_lock_num */
 
  unsigned char next_lock_num;/* round-robin
 global
  locks */
 
  int por_doing;  /* recovery is doing
  or not */
 
  int on_build_free_nids; /* build_free_nids is
  doing */
 
  @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
  f2fs_sb_info *sbi)
 
 
 
   static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
   {
 
  -   unsigned char next_lock = sbi-next_lock_num %
  NR_GLOBAL_LOCKS;
 
  +   unsigned char next_lock;
 
  int i = 0;
 
 
 
  for (; i  NR_GLOBAL_LOCKS; i++)
 
  if (mutex_trylock(sbi-fs_lock[i]))
 
  return i;
 
 
 
  -   mutex_lock(sbi-fs_lock[next_lock]);
 
  +   spin_lock(sbi-spin_lock);
 
  +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
  sbi-next_lock_num++;
 
  +   spin_unlock(sbi-spin_lock);
 
  +
 
  +   mutex_lock(sbi-fs_lock[next_lock]);
 
  return next_lock;
 
   }
 
 
 
  diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
 
  old mode 100644
 
  new mode 100755
 
  index 75c7dc3..4f27596
 
  --- a/fs/f2fs/super.c
 
  +++ b/fs/f2fs/super.c
 
  @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block
  *sb, void *data, int silent)
 
  mutex_init(sbi-cp_mutex);
 
  for (i = 0; i  NR_GLOBAL_LOCKS; i++)
 
  mutex_init(sbi-fs_lock[i]);
 
  +   spin_lock_init(sbi-spin_lock);
 
  mutex_init(sbi-node_write);
 
  sbi-por_doing = 0;
 
  spin_lock_init(sbi-stat_lock);
 
  (END)
 
 
 
 
 
 
 
 
 
 =

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-11 Thread Gu Zheng
Hi Chao,
On 09/12/2013 10:40 AM, 俞超 wrote:

 Hi Gu
 
 -Original Message-
 From: Gu Zheng [mailto:guz.f...@cn.fujitsu.com]
 Sent: Wednesday, September 11, 2013 1:38 PM
 To: jaegeuk@samsung.com
 Cc: chao2...@samsung.com; shu@samsung.com;
 linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org;
 linux-f2fs-de...@lists.sourceforge.net
 Subject: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

 Hi Jaegeuk, Chao,

 On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

 Hi,

 At first, thank you for the report and please follow the email writing
 rules. :)

 Anyway, I agree to the below issue.
 One thing that I can think of is that we don't need to use the
 spin_lock, since we don't care about the exact lock number, but just
 need to get any not-collided number.

 IMHO, just moving sbi-next_lock_num++ before
 mutex_lock(sbi-fs_lock[next_lock])
 can avoid unbalance issue mostly.
 IMO, the case two or more threads increase sbi-next_lock_num in the same
 time is really very very little. If you think it is not rigorous, change
 next_lock_num to atomic one can fix it.
 What's your opinion?

 Regards,
 Gu
 
 I did the test sbi-next_lock_num++ compare with the atomic one,
 And I found performance of them is almost the same under a small number 
 thread racing.
 So as your and Kim's opinion, it's enough to use sbi-next_lock_num++ to 
 fix this issue.

Good, but it seems that your replay patch is out of format, and it's hard for 
Jaegeuk to merge.
I'll format it, see the following thread.

Thanks,
Gu

 
 Thanks for the advice.


 So, how about removing the spin_lock?
 And how about using a random number?

 Thanks,

 2013-09-06 (금), 09:48 +, Chao Yu:
 Hi Kim:

  I think there is a performance problem: when all sbi-fs_lock is
 holded,

 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op,

 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage.

 It may lost performance when we do the multithread test.



 Here is the patch to fix this problem:



 Signed-off-by: Yu Chao chao2...@samsung.com

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking node
 writes
 */

 struct mutex writepages;/* mutex for
 writepages() */

 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */

 unsigned char next_lock_num;/* round-robin
 global
 locks */

 int por_doing;  /* recovery is doing
 or not */

 int on_build_free_nids; /* build_free_nids is
 doing */

 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)



  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)

  {

 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;

 +   unsigned char next_lock;

 int i = 0;



 for (; i  NR_GLOBAL_LOCKS; i++)

 if (mutex_trylock(sbi-fs_lock[i]))

 return i;



 -   mutex_lock(sbi-fs_lock[next_lock]);

 +   spin_lock(sbi-spin_lock);

 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;

 sbi-next_lock_num++;

 +   spin_unlock(sbi-spin_lock);

 +

 +   mutex_lock(sbi-fs_lock[next_lock]);

 return next_lock;

  }



 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

 old mode 100644

 new mode 100755

 index 75c7dc3..4f27596

 --- a/fs/f2fs/super.c

 +++ b/fs/f2fs/super.c

 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block
 *sb, void *data, int silent)

 mutex_init(sbi-cp_mutex);

 for (i = 0; i  NR_GLOBAL_LOCKS; i++)

 mutex_init(sbi-fs_lock[i]);

 +   spin_lock_init(sbi-spin_lock);

 mutex_init(sbi-node_write);

 sbi-por_doing = 0;

 spin_lock_init(sbi-stat_lock);

 (END)









 =
 
 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk, Chao,

On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

> Hi,
> 
> At first, thank you for the report and please follow the email writing
> rules. :)
> 
> Anyway, I agree to the below issue.
> One thing that I can think of is that we don't need to use the
> spin_lock, since we don't care about the exact lock number, but just
> need to get any not-collided number.

IMHO, just moving sbi->next_lock_num++ before 
mutex_lock(>fs_lock[next_lock])
can avoid unbalance issue mostly.
IMO, the case two or more threads increase sbi->next_lock_num in the same time 
is
really very very little. If you think it is not rigorous, change next_lock_num 
to
atomic one can fix it.
What's your opinion?

Regards,
Gu

> 
> So, how about removing the spin_lock?
> And how about using a random number?

> Thanks,
> 
> 2013-09-06 (금), 09:48 +, Chao Yu:
>> Hi Kim:
>>
>>  I think there is a performance problem: when all sbi->fs_lock is
>> holded, 
>>
>> then all other threads may get the same next_lock value from
>> sbi->next_lock_num in function mutex_lock_op, 
>>
>> and wait to get the same lock at position fs_lock[next_lock], it
>> unbalance the fs_lock usage. 
>>
>> It may lost performance when we do the multithread test.
>>
>>  
>>
>> Here is the patch to fix this problem:
>>
>>  
>>
>> Signed-off-by: Yu Chao 
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>
>> old mode 100644
>>
>> new mode 100755
>>
>> index 467d42d..983bb45
>>
>> --- a/fs/f2fs/f2fs.h
>>
>> +++ b/fs/f2fs/f2fs.h
>>
>> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>>
>> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
>> operations */
>>
>> struct mutex node_write;/* locking node writes
>> */
>>
>> struct mutex writepages;/* mutex for
>> writepages() */
>>
>> +   spinlock_t spin_lock;   /* lock for
>> next_lock_num */
>>
>> unsigned char next_lock_num;/* round-robin global
>> locks */
>>
>> int por_doing;  /* recovery is doing
>> or not */
>>
>> int on_build_free_nids; /* build_free_nids is
>> doing */
>>
>> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
>> f2fs_sb_info *sbi)
>>
>>  
>>
>>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
>>
>>  {
>>
>> -   unsigned char next_lock = sbi->next_lock_num %
>> NR_GLOBAL_LOCKS;
>>
>> +   unsigned char next_lock;
>>
>> int i = 0;
>>
>>  
>>
>> for (; i < NR_GLOBAL_LOCKS; i++)
>>
>> if (mutex_trylock(>fs_lock[i]))
>>
>> return i;
>>
>>  
>>
>> -   mutex_lock(>fs_lock[next_lock]);
>>
>> +   spin_lock(>spin_lock);
>>
>> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
>>
>> sbi->next_lock_num++;
>>
>> +   spin_unlock(>spin_lock);
>>
>> +
>>
>> +   mutex_lock(>fs_lock[next_lock]);
>>
>> return next_lock;
>>
>>  }
>>
>>  
>>
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>
>> old mode 100644
>>
>> new mode 100755
>>
>> index 75c7dc3..4f27596
>>
>> --- a/fs/f2fs/super.c
>>
>> +++ b/fs/f2fs/super.c
>>
>> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
>> void *data, int silent)
>>
>> mutex_init(>cp_mutex);
>>
>> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>
>> mutex_init(>fs_lock[i]);
>>
>> +   spin_lock_init(>spin_lock);
>>
>> mutex_init(>node_write);
>>
>> sbi->por_doing = 0;
>>
>> spin_lock_init(>stat_lock);
>>
>> (END)
>>
>>  
>>
>>
>>
>>
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk,

On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

> Hi,
> 
> At first, thank you for the report and please follow the email writing
> rules. :)
> 
> Anyway, I agree to the below issue.
> One thing that I can think of is that we don't need to use the
> spin_lock, since we don't care about the exact lock number, but just
> need to get any not-collided number.

Agree, but if all the locks are held, IMO, we need to balance the following
threads to wait for each not-collided number lock, though complete balance is 
unreachable.

> 
> So, how about removing the spin_lock?

Yeah, in this case, spin_lock is a bit heavy cost. 

> And how about using a random number?

Now NR_GLOBAL_LOCKS is 8, it seems that random can not offer an balance number 
as we expected.

Regards,
Gu 

> Thanks,
> 
> 2013-09-06 (금), 09:48 +, Chao Yu:
>> Hi Kim:
>>
>>  I think there is a performance problem: when all sbi->fs_lock is
>> holded, 
>>
>> then all other threads may get the same next_lock value from
>> sbi->next_lock_num in function mutex_lock_op, 
>>
>> and wait to get the same lock at position fs_lock[next_lock], it
>> unbalance the fs_lock usage. 
>>
>> It may lost performance when we do the multithread test.
>>
>>  
>>
>> Here is the patch to fix this problem:
>>
>>  
>>
>> Signed-off-by: Yu Chao 
>>
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>>
>> old mode 100644
>>
>> new mode 100755
>>
>> index 467d42d..983bb45
>>
>> --- a/fs/f2fs/f2fs.h
>>
>> +++ b/fs/f2fs/f2fs.h
>>
>> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>>
>> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
>> operations */
>>
>> struct mutex node_write;/* locking node writes
>> */
>>
>> struct mutex writepages;/* mutex for
>> writepages() */
>>
>> +   spinlock_t spin_lock;   /* lock for
>> next_lock_num */
>>
>> unsigned char next_lock_num;/* round-robin global
>> locks */
>>
>> int por_doing;  /* recovery is doing
>> or not */
>>
>> int on_build_free_nids; /* build_free_nids is
>> doing */
>>
>> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
>> f2fs_sb_info *sbi)
>>
>>  
>>
>>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
>>
>>  {
>>
>> -   unsigned char next_lock = sbi->next_lock_num %
>> NR_GLOBAL_LOCKS;
>>
>> +   unsigned char next_lock;
>>
>> int i = 0;
>>
>>  
>>
>> for (; i < NR_GLOBAL_LOCKS; i++)
>>
>> if (mutex_trylock(>fs_lock[i]))
>>
>> return i;
>>
>>  
>>
>> -   mutex_lock(>fs_lock[next_lock]);
>>
>> +   spin_lock(>spin_lock);
>>
>> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
>>
>> sbi->next_lock_num++;
>>
>> +   spin_unlock(>spin_lock);
>>
>> +
>>
>> +   mutex_lock(>fs_lock[next_lock]);
>>
>> return next_lock;
>>
>>  }
>>
>>  
>>
>> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
>>
>> old mode 100644
>>
>> new mode 100755
>>
>> index 75c7dc3..4f27596
>>
>> --- a/fs/f2fs/super.c
>>
>> +++ b/fs/f2fs/super.c
>>
>> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
>> void *data, int silent)
>>
>> mutex_init(>cp_mutex);
>>
>> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
>>
>> mutex_init(>fs_lock[i]);
>>
>> +   spin_lock_init(>spin_lock);
>>
>> mutex_init(>node_write);
>>
>> sbi->por_doing = 0;
>>
>> spin_lock_init(>stat_lock);
>>
>> (END)
>>
>>  
>>
>>
>>
>>
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk,
On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:

> Hi,
> 
> 2013-09-07 (토), 08:00 +, Chao Yu:
>> Hi Knize,
>>
>> Thanks for your reply, I think it's actually meaningless that it's
>> being named after "spin_lock",
>> it's better to rename this spinlock to "round_robin_lock".
>>
>> This patch can only resolve the issue of unbalanced fs_lock usage,
>> it can not fix the deadlock issue.
>> can we fix deadlock issue through this method:
>>
>> - vfs_create()
>>  - f2fs_create() - takes an fs_lock and save current thread info into
>> thread_info[NR_GLOBAL_LOCKS]
>>   - f2fs_add_link()
>>- __f2fs_add_link()
>> - init_inode_metadata()
>>  - f2fs_init_security()
>>   - security_inode_init_security()
>>- f2fs_initxattrs()
>> - f2fs_setxattr() - get fs_lock only if there is no current
>> thread info in thread_info
>> 
>> So it keeps one thread can only hold one fs_lock to avoid deadlock.
>> Can we use this solution?
> 
> It could be.
> But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()

Agree. This fs_lock here is used to protect the xattr from parallel 
modification,
but here is in the initxattrs routine, parallel modification can not happen.
And in the normal setxattr routine the inode->i_mutex (vfs layer) is used to
avoid parallel modification. So I think this fs_lock is needless.
Am I missing something?

Regards,
Gu

> level, since this case only happens when f2fs_initxattrs() is called.
> Let's think about ut in more detail.
> Thanks,
> 
>>
>>  
>>
>> thanks again!
>>
>>  
>>
>> --- Original Message ---
>>
>> Sender : Russ Knize
>>
>> Date : 九月 07, 2013 04:25 (GMT+09:00)
>>
>> Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
>> performance
>>
>>  
>>
>> I encountered this same issue recently and solved it in much the same
>> way.  Can we rename "spin_lock" to something more meaningful? 
>>
>>
>> This race actually exposed a potential deadlock between f2fs_create()
>> and f2fs_initxattrs(): 
>>
>>
>> - vfs_create()
>>  - f2fs_create() - takes an fs_lock
>>   - f2fs_add_link()
>>- __f2fs_add_link()
>> - init_inode_metadata()
>>  - f2fs_init_security()
>>   - security_inode_init_security()
>>- f2fs_initxattrs()
>> - f2fs_setxattr() - also takes an fs_lock
>>
>>
>> If another CPU happens to have the same lock that f2fs_setxattr() was
>> trying to take because of the race around next_lock_num, we can get
>> into a deadlock situation if the two threads are also contending over
>> another resource (like bdi).
>>
>>
>> Another scenario is if the above happens while another thread is in
>> the middle of grabbing all of the locks via mutex_lock_all().
>>  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
>> and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
>> for.
>>
>>
>> Russ
>>
>>
>> On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu  wrote:
>> Hi Kim:
>> 
>>  I think there is a performance problem: when all
>> sbi->fs_lock is holded, 
>> 
>> then all other threads may get the same next_lock value from
>> sbi->next_lock_num in function mutex_lock_op, 
>> 
>> and wait to get the same lock at position fs_lock[next_lock],
>> it unbalance the fs_lock usage. 
>> 
>> It may lost performance when we do the multithread test.
>> 
>>  
>> 
>> Here is the patch to fix this problem:
>> 
>>  
>> 
>> Signed-off-by: Yu Chao 
>> 
>> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
>> 
>> old mode 100644
>> 
>> new mode 100755
>> 
>> index 467d42d..983bb45
>> 
>> --- a/fs/f2fs/f2fs.h
>> 
>> +++ b/fs/f2fs/f2fs.h
>> 
>> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
>> 
>> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
>> operations */
>> 
>> struct mutex node_write;/* locking
>> node writes */
>>

Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Chao Yu
Hi Kim,

I did some tests as you mention of using random instead of spin_lock.
The test model is as following:
eight threads race to grab one of eight locks for one thousand times,
and I used four methods to generate lock num: 

1.atomic_add_return(1, >next_lock_num) % NR_GLOBAL_LOCKS;
2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
4.get_random_bytes(_lock, sizeof(unsigned int));

the result indicate that:
max count of collide continuously: 4 > 3 > 2 = 1
max-min count of lock is grabbed: 4 > 3 > 2 = 1
elapsed time of generating: 3 > 2 > 4 > 1

So I think it's better to use atomic_add_return in round-robin method to
cost less time and reduce collide.
What's your opinion?

thanks

--- Original Message ---
Sender : ??? S5(??)/??/?(???)/
Date : 九月 10, 2013 09:52 (GMT+09:00)
Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

Hi,

At first, thank you for the report and please follow the email writing
rules. :)

Anyway, I agree to the below issue.
One thing that I can think of is that we don't need to use the
spin_lock, since we don't care about the exact lock number, but just
need to get any not-collided number.

So, how about removing the spin_lock?
And how about using a random number?
Thanks,

2013-09-06 (?), 09:48 +, Chao Yu:
> Hi Kim:
> 
>  I think there is a performance problem: when all sbi->fs_lock is
> holded, 
> 
> then all other threads may get the same next_lock value from
> sbi->next_lock_num in function mutex_lock_op, 
> 
> and wait to get the same lock at position fs_lock[next_lock], it
> unbalance the fs_lock usage. 
> 
> It may lost performance when we do the multithread test.
> 
>  
> 
> Here is the patch to fix this problem:
> 
>  
> 
> Signed-off-by: Yu Chao 
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> 
> old mode 100644
> 
> new mode 100755
> 
> index 467d42d..983bb45
> 
> --- a/fs/f2fs/f2fs.h
> 
> +++ b/fs/f2fs/f2fs.h
> 
> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> 
> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> operations */
> 
> struct mutex node_write;/* locking node writes
> */
> 
> struct mutex writepages;/* mutex for
> writepages() */
> 
> +   spinlock_t spin_lock;   /* lock for
> next_lock_num */
> 
> unsigned char next_lock_num;/* round-robin global
> locks */
> 
> int por_doing;  /* recovery is doing
> or not */
> 
> int on_build_free_nids; /* build_free_nids is
> doing */
> 
> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
> f2fs_sb_info *sbi)
> 
>  
> 
>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> 
>  {
> 
> -   unsigned char next_lock = sbi->next_lock_num %
> NR_GLOBAL_LOCKS;
> 
> +   unsigned char next_lock;
> 
> int i = 0;
> 
>  
> 
> for (; i < NR_GLOBAL_LOCKS; i++)
> 
> if (mutex_trylock(>fs_lock[i]))
> 
> return i;
> 
>  
> 
> -   mutex_lock(>fs_lock[next_lock]);
> 
> +   spin_lock(>spin_lock);
> 
> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> 
> sbi->next_lock_num++;
> 
> +   spin_unlock(>spin_lock);
> 
> +
> 
> +   mutex_lock(>fs_lock[next_lock]);
> 
> return next_lock;
> 
>  }
> 
>  
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> 
> old mode 100644
> 
> new mode 100755
> 
> index 75c7dc3..4f27596
> 
> --- a/fs/f2fs/super.c
> 
> +++ b/fs/f2fs/super.c
> 
> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
> void *data, int silent)
> 
> mutex_init(>cp_mutex);
> 
> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> 
> mutex_init(>fs_lock[i]);
> 
> +   spin_lock_init(>spin_lock);
> 
> mutex_init(>node_write);
> 
> sbi->por_doing = 0;
> 
> spin_lock_init(>stat_lock);
> 
> (END)
> 
>  
> 
> 
> 
> 

-- 
Jaegeuk Kim
SamsungN�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ��&�)撷f��^j谦y�m��@A�a囤�
0鹅h���i

Re: Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Chao Yu
Hi Kim,

I did some tests as you mention of using random instead of spin_lock.
The test model is as following:
eight threads race to grab one of eight locks for one thousand times,
and I used four methods to generate lock num: 

1.atomic_add_return(1, sbi-next_lock_num) % NR_GLOBAL_LOCKS;
2.spin_lock(); next_lock_num++ % NR_GLOBAL_LOCKS; spin_unlock();
3.ktime_get().tv64 % NR_GLOBAL_LOCKS;
4.get_random_bytes(next_lock, sizeof(unsigned int));

the result indicate that:
max count of collide continuously: 4  3  2 = 1
max-min count of lock is grabbed: 4  3  2 = 1
elapsed time of generating: 3  2  4  1

So I think it's better to use atomic_add_return in round-robin method to
cost less time and reduce collide.
What's your opinion?

thanks

--- Original Message ---
Sender : ???jaegeuk@samsung.com S5(??)/??/?(???)/
Date : 九月 10, 2013 09:52 (GMT+09:00)
Title : Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

Hi,

At first, thank you for the report and please follow the email writing
rules. :)

Anyway, I agree to the below issue.
One thing that I can think of is that we don't need to use the
spin_lock, since we don't care about the exact lock number, but just
need to get any not-collided number.

So, how about removing the spin_lock?
And how about using a random number?
Thanks,

2013-09-06 (?), 09:48 +, Chao Yu:
 Hi Kim:
 
  I think there is a performance problem: when all sbi-fs_lock is
 holded, 
 
 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 
 
 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage. 
 
 It may lost performance when we do the multithread test.
 
  
 
 Here is the patch to fix this problem:
 
  
 
 Signed-off-by: Yu Chao 
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
 old mode 100644
 
 new mode 100755
 
 index 467d42d..983bb45
 
 --- a/fs/f2fs/f2fs.h
 
 +++ b/fs/f2fs/f2fs.h
 
 @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */
 
 struct mutex node_write;/* locking node writes
 */
 
 struct mutex writepages;/* mutex for
 writepages() */
 
 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */
 
 unsigned char next_lock_num;/* round-robin global
 locks */
 
 int por_doing;  /* recovery is doing
 or not */
 
 int on_build_free_nids; /* build_free_nids is
 doing */
 
 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)
 
  
 
  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
  {
 
 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;
 
 +   unsigned char next_lock;
 
 int i = 0;
 
  
 
 for (; i  NR_GLOBAL_LOCKS; i++)
 
 if (mutex_trylock(sbi-fs_lock[i]))
 
 return i;
 
  
 
 -   mutex_lock(sbi-fs_lock[next_lock]);
 
 +   spin_lock(sbi-spin_lock);
 
 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
 sbi-next_lock_num++;
 
 +   spin_unlock(sbi-spin_lock);
 
 +
 
 +   mutex_lock(sbi-fs_lock[next_lock]);
 
 return next_lock;
 
  }
 
  
 
 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
 
 old mode 100644
 
 new mode 100755
 
 index 75c7dc3..4f27596
 
 --- a/fs/f2fs/super.c
 
 +++ b/fs/f2fs/super.c
 
 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
 void *data, int silent)
 
 mutex_init(sbi-cp_mutex);
 
 for (i = 0; i  NR_GLOBAL_LOCKS; i++)
 
 mutex_init(sbi-fs_lock[i]);
 
 +   spin_lock_init(sbi-spin_lock);
 
 mutex_init(sbi-node_write);
 
 sbi-por_doing = 0;
 
 spin_lock_init(sbi-stat_lock);
 
 (END)
 
  
 
 
 
 

-- 
Jaegeuk Kim
SamsungN�Р骒r��yb�X�肚�v�^�)藓{.n�+�伐�{��赙zXФ�≤�}��财�z�j:+v�����赙zZ+��+zf"�h���~i���z��wア�?�ㄨ���)撷f��^j谦y�m��@A�a囤�
0鹅h���i

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk,
On 09/10/2013 08:59 AM, Jaegeuk Kim wrote:

 Hi,
 
 2013-09-07 (토), 08:00 +, Chao Yu:
 Hi Knize,

 Thanks for your reply, I think it's actually meaningless that it's
 being named after spin_lock,
 it's better to rename this spinlock to round_robin_lock.

 This patch can only resolve the issue of unbalanced fs_lock usage,
 it can not fix the deadlock issue.
 can we fix deadlock issue through this method:

 - vfs_create()
  - f2fs_create() - takes an fs_lock and save current thread info into
 thread_info[NR_GLOBAL_LOCKS]
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - get fs_lock only if there is no current
 thread info in thread_info
 
 So it keeps one thread can only hold one fs_lock to avoid deadlock.
 Can we use this solution?
 
 It could be.
 But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()

Agree. This fs_lock here is used to protect the xattr from parallel 
modification,
but here is in the initxattrs routine, parallel modification can not happen.
And in the normal setxattr routine the inode-i_mutex (vfs layer) is used to
avoid parallel modification. So I think this fs_lock is needless.
Am I missing something?

Regards,
Gu

 level, since this case only happens when f2fs_initxattrs() is called.
 Let's think about ut in more detail.
 Thanks,
 

  

 thanks again!

  

 --- Original Message ---

 Sender : Russ Knizeruss.kn...@motorola.com

 Date : 九月 07, 2013 04:25 (GMT+09:00)

 Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
 performance

  

 I encountered this same issue recently and solved it in much the same
 way.  Can we rename spin_lock to something more meaningful? 


 This race actually exposed a potential deadlock between f2fs_create()
 and f2fs_initxattrs(): 


 - vfs_create()
  - f2fs_create() - takes an fs_lock
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - also takes an fs_lock


 If another CPU happens to have the same lock that f2fs_setxattr() was
 trying to take because of the race around next_lock_num, we can get
 into a deadlock situation if the two threads are also contending over
 another resource (like bdi).


 Another scenario is if the above happens while another thread is in
 the middle of grabbing all of the locks via mutex_lock_all().
  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
 and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
 for.


 Russ


 On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
 Hi Kim:
 
  I think there is a performance problem: when all
 sbi-fs_lock is holded, 
 
 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 
 
 and wait to get the same lock at position fs_lock[next_lock],
 it unbalance the fs_lock usage. 
 
 It may lost performance when we do the multithread test.
 
  
 
 Here is the patch to fix this problem:
 
  
 
 Signed-off-by: Yu Chao chao2...@samsung.com
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
 old mode 100644
 
 new mode 100755
 
 index 467d42d..983bb45
 
 --- a/fs/f2fs/f2fs.h
 
 +++ b/fs/f2fs/f2fs.h
 
 @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */
 
 struct mutex node_write;/* locking
 node writes */
 
 struct mutex writepages;/* mutex for
 writepages() */
 
 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */
 
 unsigned char next_lock_num;/* round-robin
 global locks */
 
 int por_doing;  /* recovery is
 doing or not */
 
 int on_build_free_nids; /*
 build_free_nids is doing */
 
 @@ -533,15 +534,19 @@ static inline void
 mutex_unlock_all(struct f2fs_sb_info *sbi)
 
  
 
  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
  {
 
 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;
 
 +   unsigned char next_lock;
 
 int i = 0;
 
  
 
 for (; i  NR_GLOBAL_LOCKS; i

Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk,

On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

 Hi,
 
 At first, thank you for the report and please follow the email writing
 rules. :)
 
 Anyway, I agree to the below issue.
 One thing that I can think of is that we don't need to use the
 spin_lock, since we don't care about the exact lock number, but just
 need to get any not-collided number.

Agree, but if all the locks are held, IMO, we need to balance the following
threads to wait for each not-collided number lock, though complete balance is 
unreachable.

 
 So, how about removing the spin_lock?

Yeah, in this case, spin_lock is a bit heavy cost. 

 And how about using a random number?

Now NR_GLOBAL_LOCKS is 8, it seems that random can not offer an balance number 
as we expected.

Regards,
Gu 

 Thanks,
 
 2013-09-06 (금), 09:48 +, Chao Yu:
 Hi Kim:

  I think there is a performance problem: when all sbi-fs_lock is
 holded, 

 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 

 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage. 

 It may lost performance when we do the multithread test.

  

 Here is the patch to fix this problem:

  

 Signed-off-by: Yu Chao chao2...@samsung.com

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking node writes
 */

 struct mutex writepages;/* mutex for
 writepages() */

 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */

 unsigned char next_lock_num;/* round-robin global
 locks */

 int por_doing;  /* recovery is doing
 or not */

 int on_build_free_nids; /* build_free_nids is
 doing */

 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)

  

  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)

  {

 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;

 +   unsigned char next_lock;

 int i = 0;

  

 for (; i  NR_GLOBAL_LOCKS; i++)

 if (mutex_trylock(sbi-fs_lock[i]))

 return i;

  

 -   mutex_lock(sbi-fs_lock[next_lock]);

 +   spin_lock(sbi-spin_lock);

 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;

 sbi-next_lock_num++;

 +   spin_unlock(sbi-spin_lock);

 +

 +   mutex_lock(sbi-fs_lock[next_lock]);

 return next_lock;

  }

  

 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

 old mode 100644

 new mode 100755

 index 75c7dc3..4f27596

 --- a/fs/f2fs/super.c

 +++ b/fs/f2fs/super.c

 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
 void *data, int silent)

 mutex_init(sbi-cp_mutex);

 for (i = 0; i  NR_GLOBAL_LOCKS; i++)

 mutex_init(sbi-fs_lock[i]);

 +   spin_lock_init(sbi-spin_lock);

 mutex_init(sbi-node_write);

 sbi-por_doing = 0;

 spin_lock_init(sbi-stat_lock);

 (END)

  




 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-10 Thread Gu Zheng
Hi Jaegeuk, Chao,

On 09/10/2013 08:52 AM, Jaegeuk Kim wrote:

 Hi,
 
 At first, thank you for the report and please follow the email writing
 rules. :)
 
 Anyway, I agree to the below issue.
 One thing that I can think of is that we don't need to use the
 spin_lock, since we don't care about the exact lock number, but just
 need to get any not-collided number.

IMHO, just moving sbi-next_lock_num++ before 
mutex_lock(sbi-fs_lock[next_lock])
can avoid unbalance issue mostly.
IMO, the case two or more threads increase sbi-next_lock_num in the same time 
is
really very very little. If you think it is not rigorous, change next_lock_num 
to
atomic one can fix it.
What's your opinion?

Regards,
Gu

 
 So, how about removing the spin_lock?
 And how about using a random number?

 Thanks,
 
 2013-09-06 (금), 09:48 +, Chao Yu:
 Hi Kim:

  I think there is a performance problem: when all sbi-fs_lock is
 holded, 

 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 

 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage. 

 It may lost performance when we do the multithread test.

  

 Here is the patch to fix this problem:

  

 Signed-off-by: Yu Chao chao2...@samsung.com

 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h

 old mode 100644

 new mode 100755

 index 467d42d..983bb45

 --- a/fs/f2fs/f2fs.h

 +++ b/fs/f2fs/f2fs.h

 @@ -371,6 +371,7 @@ struct f2fs_sb_info {

 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */

 struct mutex node_write;/* locking node writes
 */

 struct mutex writepages;/* mutex for
 writepages() */

 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */

 unsigned char next_lock_num;/* round-robin global
 locks */

 int por_doing;  /* recovery is doing
 or not */

 int on_build_free_nids; /* build_free_nids is
 doing */

 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)

  

  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)

  {

 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;

 +   unsigned char next_lock;

 int i = 0;

  

 for (; i  NR_GLOBAL_LOCKS; i++)

 if (mutex_trylock(sbi-fs_lock[i]))

 return i;

  

 -   mutex_lock(sbi-fs_lock[next_lock]);

 +   spin_lock(sbi-spin_lock);

 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;

 sbi-next_lock_num++;

 +   spin_unlock(sbi-spin_lock);

 +

 +   mutex_lock(sbi-fs_lock[next_lock]);

 return next_lock;

  }

  

 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c

 old mode 100644

 new mode 100755

 index 75c7dc3..4f27596

 --- a/fs/f2fs/super.c

 +++ b/fs/f2fs/super.c

 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
 void *data, int silent)

 mutex_init(sbi-cp_mutex);

 for (i = 0; i  NR_GLOBAL_LOCKS; i++)

 mutex_init(sbi-fs_lock[i]);

 +   spin_lock_init(sbi-spin_lock);

 mutex_init(sbi-node_write);

 sbi-por_doing = 0;

 spin_lock_init(sbi-stat_lock);

 (END)

  




 


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

2013-09-07 (토), 08:00 +, Chao Yu:
> Hi Knize,
> 
> Thanks for your reply, I think it's actually meaningless that it's
> being named after "spin_lock",
> it's better to rename this spinlock to "round_robin_lock".
> 
> This patch can only resolve the issue of unbalanced fs_lock usage,
> it can not fix the deadlock issue.
> can we fix deadlock issue through this method:
> 
> - vfs_create()
>  - f2fs_create() - takes an fs_lock and save current thread info into
> thread_info[NR_GLOBAL_LOCKS]
>   - f2fs_add_link()
>- __f2fs_add_link()
> - init_inode_metadata()
>  - f2fs_init_security()
>   - security_inode_init_security()
>- f2fs_initxattrs()
> - f2fs_setxattr() - get fs_lock only if there is no current
> thread info in thread_info
> 
> So it keeps one thread can only hold one fs_lock to avoid deadlock.
> Can we use this solution?

It could be.
But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()
level, since this case only happens when f2fs_initxattrs() is called.
Let's think about ut in more detail.
Thanks,

> 
>  
> 
> thanks again!
> 
>  
> 
> --- Original Message ---
> 
> Sender : Russ Knize
> 
> Date : 九月 07, 2013 04:25 (GMT+09:00)
> 
> Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
> performance
> 
>  
> 
> I encountered this same issue recently and solved it in much the same
> way.  Can we rename "spin_lock" to something more meaningful? 
> 
> 
> This race actually exposed a potential deadlock between f2fs_create()
> and f2fs_initxattrs(): 
> 
> 
> - vfs_create()
>  - f2fs_create() - takes an fs_lock
>   - f2fs_add_link()
>- __f2fs_add_link()
> - init_inode_metadata()
>  - f2fs_init_security()
>   - security_inode_init_security()
>- f2fs_initxattrs()
> - f2fs_setxattr() - also takes an fs_lock
> 
> 
> If another CPU happens to have the same lock that f2fs_setxattr() was
> trying to take because of the race around next_lock_num, we can get
> into a deadlock situation if the two threads are also contending over
> another resource (like bdi).
> 
> 
> Another scenario is if the above happens while another thread is in
> the middle of grabbing all of the locks via mutex_lock_all().
>  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
> and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
> for.
> 
> 
> Russ
> 
> 
> On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu  wrote:
> Hi Kim:
> 
>  I think there is a performance problem: when all
> sbi->fs_lock is holded, 
> 
> then all other threads may get the same next_lock value from
> sbi->next_lock_num in function mutex_lock_op, 
> 
> and wait to get the same lock at position fs_lock[next_lock],
> it unbalance the fs_lock usage. 
> 
> It may lost performance when we do the multithread test.
> 
>  
> 
> Here is the patch to fix this problem:
> 
>  
> 
> Signed-off-by: Yu Chao 
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> 
> old mode 100644
> 
> new mode 100755
> 
> index 467d42d..983bb45
> 
> --- a/fs/f2fs/f2fs.h
> 
> +++ b/fs/f2fs/f2fs.h
> 
> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> 
> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> operations */
> 
> struct mutex node_write;/* locking
> node writes */
> 
> struct mutex writepages;/* mutex for
> writepages() */
> 
> +   spinlock_t spin_lock;   /* lock for
> next_lock_num */
> 
> unsigned char next_lock_num;/* round-robin
> global locks */
> 
> int por_doing;  /* recovery is
> doing or not */
> 
> int on_build_free_nids; /*
> build_free_nids is doing */
> 
> @@ -533,15 +534,19 @@ static inline void
> mutex_unlock_all(struct f2fs_sb_info *sbi)
> 
>  
> 
>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> 
>  {
> 
> -   unsigned char next_lock = sbi->next_lock

Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

Nice catch.
This is definitely a bug where one thread grabbed two fs_locks across
the same flow.
Any idea?
Thanks,

2013-09-06 (금), 14:25 -0500, Russ Knize:
> I encountered this same issue recently and solved it in much the same
> way.  Can we rename "spin_lock" to something more meaningful?
> 
> 
> This race actually exposed a potential deadlock between f2fs_create()
> and f2fs_initxattrs():
> 
> 
> - vfs_create()
>  - f2fs_create() - takes an fs_lock
>   - f2fs_add_link()
>- __f2fs_add_link()
> - init_inode_metadata()
>  - f2fs_init_security()
>   - security_inode_init_security()
>- f2fs_initxattrs()
> - f2fs_setxattr() - also takes an fs_lock
> 
> 
> If another CPU happens to have the same lock that f2fs_setxattr() was
> trying to take because of the race around next_lock_num, we can get
> into a deadlock situation if the two threads are also contending over
> another resource (like bdi).
> 
> 
> Another scenario is if the above happens while another thread is in
> the middle of grabbing all of the locks via mutex_lock_all().
>  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
> and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
> for.
> 
> 
> Russ
> 
> 
> On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu  wrote:
> Hi Kim:
> 
>  I think there is a performance problem: when all
> sbi->fs_lock is holded, 
> 
> then all other threads may get the same next_lock value from
> sbi->next_lock_num in function mutex_lock_op, 
> 
> and wait to get the same lock at position fs_lock[next_lock],
> it unbalance the fs_lock usage. 
> 
> It may lost performance when we do the multithread test.
> 
>  
> 
> Here is the patch to fix this problem:
> 
>  
> 
> Signed-off-by: Yu Chao 
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> 
> old mode 100644
> 
> new mode 100755
> 
> index 467d42d..983bb45
> 
> --- a/fs/f2fs/f2fs.h
> 
> +++ b/fs/f2fs/f2fs.h
> 
> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> 
> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> operations */
> 
> struct mutex node_write;/* locking
> node writes */
> 
> struct mutex writepages;/* mutex for
> writepages() */
> 
> +   spinlock_t spin_lock;   /* lock for
> next_lock_num */
> 
> unsigned char next_lock_num;/* round-robin
> global locks */
> 
> int por_doing;  /* recovery is
> doing or not */
> 
> int on_build_free_nids; /*
> build_free_nids is doing */
> 
> @@ -533,15 +534,19 @@ static inline void
> mutex_unlock_all(struct f2fs_sb_info *sbi)
> 
>  
> 
>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> 
>  {
> 
> -   unsigned char next_lock = sbi->next_lock_num %
> NR_GLOBAL_LOCKS;
> 
> +   unsigned char next_lock;
> 
> int i = 0;
> 
>  
> 
> for (; i < NR_GLOBAL_LOCKS; i++)
> 
> if (mutex_trylock(>fs_lock[i]))
> 
> return i;
> 
>  
> 
> -   mutex_lock(>fs_lock[next_lock]);
> 
> +   spin_lock(>spin_lock);
> 
> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> 
> sbi->next_lock_num++;
> 
> +   spin_unlock(>spin_lock);
> 
> +
> 
> +   mutex_lock(>fs_lock[next_lock]);
> 
> return next_lock;
> 
>  }
> 
>  
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> 
> old mode 100644
> 
> new mode 100755
> 
> index 75c7dc3..4f27596
> 
> --- a/fs/f2fs/super.c
> 
> +++ b/fs/f2fs/super.c
> 
> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct
> super_block *sb, void *data, int silent)
> 
> mutex_init(>cp_mutex);
> 
> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> 
> mutex_init(>fs_lock[i]);
> 
> +   spin_lock_init(>spin_lock);
> 
> mutex_init(>node_write);
> 
> sbi->por_doing = 0;
> 
> spin_lock_init(>stat_lock);
> 
> (END)
> 
> 

Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

At first, thank you for the report and please follow the email writing
rules. :)

Anyway, I agree to the below issue.
One thing that I can think of is that we don't need to use the
spin_lock, since we don't care about the exact lock number, but just
need to get any not-collided number.

So, how about removing the spin_lock?
And how about using a random number?
Thanks,

2013-09-06 (금), 09:48 +, Chao Yu:
> Hi Kim:
> 
>  I think there is a performance problem: when all sbi->fs_lock is
> holded, 
> 
> then all other threads may get the same next_lock value from
> sbi->next_lock_num in function mutex_lock_op, 
> 
> and wait to get the same lock at position fs_lock[next_lock], it
> unbalance the fs_lock usage. 
> 
> It may lost performance when we do the multithread test.
> 
>  
> 
> Here is the patch to fix this problem:
> 
>  
> 
> Signed-off-by: Yu Chao 
> 
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> 
> old mode 100644
> 
> new mode 100755
> 
> index 467d42d..983bb45
> 
> --- a/fs/f2fs/f2fs.h
> 
> +++ b/fs/f2fs/f2fs.h
> 
> @@ -371,6 +371,7 @@ struct f2fs_sb_info {
> 
> struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
> operations */
> 
> struct mutex node_write;/* locking node writes
> */
> 
> struct mutex writepages;/* mutex for
> writepages() */
> 
> +   spinlock_t spin_lock;   /* lock for
> next_lock_num */
> 
> unsigned char next_lock_num;/* round-robin global
> locks */
> 
> int por_doing;  /* recovery is doing
> or not */
> 
> int on_build_free_nids; /* build_free_nids is
> doing */
> 
> @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
> f2fs_sb_info *sbi)
> 
>  
> 
>  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
> 
>  {
> 
> -   unsigned char next_lock = sbi->next_lock_num %
> NR_GLOBAL_LOCKS;
> 
> +   unsigned char next_lock;
> 
> int i = 0;
> 
>  
> 
> for (; i < NR_GLOBAL_LOCKS; i++)
> 
> if (mutex_trylock(>fs_lock[i]))
> 
> return i;
> 
>  
> 
> -   mutex_lock(>fs_lock[next_lock]);
> 
> +   spin_lock(>spin_lock);
> 
> +   next_lock = sbi->next_lock_num % NR_GLOBAL_LOCKS;
> 
> sbi->next_lock_num++;
> 
> +   spin_unlock(>spin_lock);
> 
> +
> 
> +   mutex_lock(>fs_lock[next_lock]);
> 
> return next_lock;
> 
>  }
> 
>  
> 
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> 
> old mode 100644
> 
> new mode 100755
> 
> index 75c7dc3..4f27596
> 
> --- a/fs/f2fs/super.c
> 
> +++ b/fs/f2fs/super.c
> 
> @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
> void *data, int silent)
> 
> mutex_init(>cp_mutex);
> 
> for (i = 0; i < NR_GLOBAL_LOCKS; i++)
> 
> mutex_init(>fs_lock[i]);
> 
> +   spin_lock_init(>spin_lock);
> 
> mutex_init(>node_write);
> 
> sbi->por_doing = 0;
> 
> spin_lock_init(>stat_lock);
> 
> (END)
> 
>  
> 
> 
> 
> 

-- 
Jaegeuk Kim
Samsung

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev][PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

At first, thank you for the report and please follow the email writing
rules. :)

Anyway, I agree to the below issue.
One thing that I can think of is that we don't need to use the
spin_lock, since we don't care about the exact lock number, but just
need to get any not-collided number.

So, how about removing the spin_lock?
And how about using a random number?
Thanks,

2013-09-06 (금), 09:48 +, Chao Yu:
 Hi Kim:
 
  I think there is a performance problem: when all sbi-fs_lock is
 holded, 
 
 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 
 
 and wait to get the same lock at position fs_lock[next_lock], it
 unbalance the fs_lock usage. 
 
 It may lost performance when we do the multithread test.
 
  
 
 Here is the patch to fix this problem:
 
  
 
 Signed-off-by: Yu Chao chao2...@samsung.com
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
 old mode 100644
 
 new mode 100755
 
 index 467d42d..983bb45
 
 --- a/fs/f2fs/f2fs.h
 
 +++ b/fs/f2fs/f2fs.h
 
 @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */
 
 struct mutex node_write;/* locking node writes
 */
 
 struct mutex writepages;/* mutex for
 writepages() */
 
 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */
 
 unsigned char next_lock_num;/* round-robin global
 locks */
 
 int por_doing;  /* recovery is doing
 or not */
 
 int on_build_free_nids; /* build_free_nids is
 doing */
 
 @@ -533,15 +534,19 @@ static inline void mutex_unlock_all(struct
 f2fs_sb_info *sbi)
 
  
 
  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
  {
 
 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;
 
 +   unsigned char next_lock;
 
 int i = 0;
 
  
 
 for (; i  NR_GLOBAL_LOCKS; i++)
 
 if (mutex_trylock(sbi-fs_lock[i]))
 
 return i;
 
  
 
 -   mutex_lock(sbi-fs_lock[next_lock]);
 
 +   spin_lock(sbi-spin_lock);
 
 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
 sbi-next_lock_num++;
 
 +   spin_unlock(sbi-spin_lock);
 
 +
 
 +   mutex_lock(sbi-fs_lock[next_lock]);
 
 return next_lock;
 
  }
 
  
 
 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
 
 old mode 100644
 
 new mode 100755
 
 index 75c7dc3..4f27596
 
 --- a/fs/f2fs/super.c
 
 +++ b/fs/f2fs/super.c
 
 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct super_block *sb,
 void *data, int silent)
 
 mutex_init(sbi-cp_mutex);
 
 for (i = 0; i  NR_GLOBAL_LOCKS; i++)
 
 mutex_init(sbi-fs_lock[i]);
 
 +   spin_lock_init(sbi-spin_lock);
 
 mutex_init(sbi-node_write);
 
 sbi-por_doing = 0;
 
 spin_lock_init(sbi-stat_lock);
 
 (END)
 
  
 
 
 
 

-- 
Jaegeuk Kim
Samsung

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

Nice catch.
This is definitely a bug where one thread grabbed two fs_locks across
the same flow.
Any idea?
Thanks,

2013-09-06 (금), 14:25 -0500, Russ Knize:
 I encountered this same issue recently and solved it in much the same
 way.  Can we rename spin_lock to something more meaningful?
 
 
 This race actually exposed a potential deadlock between f2fs_create()
 and f2fs_initxattrs():
 
 
 - vfs_create()
  - f2fs_create() - takes an fs_lock
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - also takes an fs_lock
 
 
 If another CPU happens to have the same lock that f2fs_setxattr() was
 trying to take because of the race around next_lock_num, we can get
 into a deadlock situation if the two threads are also contending over
 another resource (like bdi).
 
 
 Another scenario is if the above happens while another thread is in
 the middle of grabbing all of the locks via mutex_lock_all().
  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
 and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
 for.
 
 
 Russ
 
 
 On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
 Hi Kim:
 
  I think there is a performance problem: when all
 sbi-fs_lock is holded, 
 
 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 
 
 and wait to get the same lock at position fs_lock[next_lock],
 it unbalance the fs_lock usage. 
 
 It may lost performance when we do the multithread test.
 
  
 
 Here is the patch to fix this problem:
 
  
 
 Signed-off-by: Yu Chao chao2...@samsung.com
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
 old mode 100644
 
 new mode 100755
 
 index 467d42d..983bb45
 
 --- a/fs/f2fs/f2fs.h
 
 +++ b/fs/f2fs/f2fs.h
 
 @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */
 
 struct mutex node_write;/* locking
 node writes */
 
 struct mutex writepages;/* mutex for
 writepages() */
 
 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */
 
 unsigned char next_lock_num;/* round-robin
 global locks */
 
 int por_doing;  /* recovery is
 doing or not */
 
 int on_build_free_nids; /*
 build_free_nids is doing */
 
 @@ -533,15 +534,19 @@ static inline void
 mutex_unlock_all(struct f2fs_sb_info *sbi)
 
  
 
  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
  {
 
 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;
 
 +   unsigned char next_lock;
 
 int i = 0;
 
  
 
 for (; i  NR_GLOBAL_LOCKS; i++)
 
 if (mutex_trylock(sbi-fs_lock[i]))
 
 return i;
 
  
 
 -   mutex_lock(sbi-fs_lock[next_lock]);
 
 +   spin_lock(sbi-spin_lock);
 
 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
 sbi-next_lock_num++;
 
 +   spin_unlock(sbi-spin_lock);
 
 +
 
 +   mutex_lock(sbi-fs_lock[next_lock]);
 
 return next_lock;
 
  }
 
  
 
 diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
 
 old mode 100644
 
 new mode 100755
 
 index 75c7dc3..4f27596
 
 --- a/fs/f2fs/super.c
 
 +++ b/fs/f2fs/super.c
 
 @@ -657,6 +657,7 @@ static int f2fs_fill_super(struct
 super_block *sb, void *data, int silent)
 
 mutex_init(sbi-cp_mutex);
 
 for (i = 0; i  NR_GLOBAL_LOCKS; i++)
 
 mutex_init(sbi-fs_lock[i]);
 
 +   spin_lock_init(sbi-spin_lock);
 
 mutex_init(sbi-node_write);
 
 sbi-por_doing = 0;
 
 spin_lock_init(sbi-stat_lock);
 
 (END)
 
  
 
 
 
 
 
 
 

Re: Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better performance

2013-09-09 Thread Jaegeuk Kim
Hi,

2013-09-07 (토), 08:00 +, Chao Yu:
 Hi Knize,
 
 Thanks for your reply, I think it's actually meaningless that it's
 being named after spin_lock,
 it's better to rename this spinlock to round_robin_lock.
 
 This patch can only resolve the issue of unbalanced fs_lock usage,
 it can not fix the deadlock issue.
 can we fix deadlock issue through this method:
 
 - vfs_create()
  - f2fs_create() - takes an fs_lock and save current thread info into
 thread_info[NR_GLOBAL_LOCKS]
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - get fs_lock only if there is no current
 thread info in thread_info
 
 So it keeps one thread can only hold one fs_lock to avoid deadlock.
 Can we use this solution?

It could be.
But, I think we can avoid to grab the fs_lock at the f2fs_initxattrs()
level, since this case only happens when f2fs_initxattrs() is called.
Let's think about ut in more detail.
Thanks,

 
  
 
 thanks again!
 
  
 
 --- Original Message ---
 
 Sender : Russ Knizeruss.kn...@motorola.com
 
 Date : 九月 07, 2013 04:25 (GMT+09:00)
 
 Title : Re: [f2fs-dev] [PATCH] f2fs: optimize fs_lock for better
 performance
 
  
 
 I encountered this same issue recently and solved it in much the same
 way.  Can we rename spin_lock to something more meaningful? 
 
 
 This race actually exposed a potential deadlock between f2fs_create()
 and f2fs_initxattrs(): 
 
 
 - vfs_create()
  - f2fs_create() - takes an fs_lock
   - f2fs_add_link()
- __f2fs_add_link()
 - init_inode_metadata()
  - f2fs_init_security()
   - security_inode_init_security()
- f2fs_initxattrs()
 - f2fs_setxattr() - also takes an fs_lock
 
 
 If another CPU happens to have the same lock that f2fs_setxattr() was
 trying to take because of the race around next_lock_num, we can get
 into a deadlock situation if the two threads are also contending over
 another resource (like bdi).
 
 
 Another scenario is if the above happens while another thread is in
 the middle of grabbing all of the locks via mutex_lock_all().
  f2fs_create() is holding a lock that mutex_lock_all() is waiting for
 and mutex_lock_all() is holding a lock that f2fs_setxattr() is waiting
 for.
 
 
 Russ
 
 
 On Fri, Sep 6, 2013 at 4:48 AM, Chao Yu chao2...@samsung.com wrote:
 Hi Kim:
 
  I think there is a performance problem: when all
 sbi-fs_lock is holded, 
 
 then all other threads may get the same next_lock value from
 sbi-next_lock_num in function mutex_lock_op, 
 
 and wait to get the same lock at position fs_lock[next_lock],
 it unbalance the fs_lock usage. 
 
 It may lost performance when we do the multithread test.
 
  
 
 Here is the patch to fix this problem:
 
  
 
 Signed-off-by: Yu Chao chao2...@samsung.com
 
 diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
 
 old mode 100644
 
 new mode 100755
 
 index 467d42d..983bb45
 
 --- a/fs/f2fs/f2fs.h
 
 +++ b/fs/f2fs/f2fs.h
 
 @@ -371,6 +371,7 @@ struct f2fs_sb_info {
 
 struct mutex fs_lock[NR_GLOBAL_LOCKS];  /* blocking FS
 operations */
 
 struct mutex node_write;/* locking
 node writes */
 
 struct mutex writepages;/* mutex for
 writepages() */
 
 +   spinlock_t spin_lock;   /* lock for
 next_lock_num */
 
 unsigned char next_lock_num;/* round-robin
 global locks */
 
 int por_doing;  /* recovery is
 doing or not */
 
 int on_build_free_nids; /*
 build_free_nids is doing */
 
 @@ -533,15 +534,19 @@ static inline void
 mutex_unlock_all(struct f2fs_sb_info *sbi)
 
  
 
  static inline int mutex_lock_op(struct f2fs_sb_info *sbi)
 
  {
 
 -   unsigned char next_lock = sbi-next_lock_num %
 NR_GLOBAL_LOCKS;
 
 +   unsigned char next_lock;
 
 int i = 0;
 
  
 
 for (; i  NR_GLOBAL_LOCKS; i++)
 
 if (mutex_trylock(sbi-fs_lock[i]))
 
 return i;
 
  
 
 -   mutex_lock(sbi-fs_lock[next_lock]);
 
 +   spin_lock(sbi-spin_lock);
 
 +   next_lock = sbi-next_lock_num % NR_GLOBAL_LOCKS;
 
 sbi-next_lock_num