Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-05-14 Thread Waiman Long
On 04/09/2018 10:22 AM, Oleg Nesterov wrote:
> On 04/09, Waiman Long wrote:
>> On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
>>> Hmm. Can you look at lockdep_sb_freeze_release() and 
>>> lockdep_sb_freeze_acquire()?
>> These 2 functions are there to deal with the lockdep code.
> Plus they clearly document why sem->owner check is not right when it comes
> to super_block->s_writers[]. Not only freeze and thaw can be called by
> different processes, we need to return to user-space with rwsem held for
> writing.

Sorry for the late reply as I was busy on other work.

I have just sent out a v2 patch to hopefully address your concern.
Please let me know your thought on that.

Thanks,
Longman


Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-05-14 Thread Waiman Long
On 04/09/2018 10:22 AM, Oleg Nesterov wrote:
> On 04/09, Waiman Long wrote:
>> On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
>>> Hmm. Can you look at lockdep_sb_freeze_release() and 
>>> lockdep_sb_freeze_acquire()?
>> These 2 functions are there to deal with the lockdep code.
> Plus they clearly document why sem->owner check is not right when it comes
> to super_block->s_writers[]. Not only freeze and thaw can be called by
> different processes, we need to return to user-space with rwsem held for
> writing.

Sorry for the late reply as I was busy on other work.

I have just sent out a v2 patch to hopefully address your concern.
Please let me know your thought on that.

Thanks,
Longman


Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Oleg Nesterov
On 04/09, Waiman Long wrote:
>
> On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
> > Hmm. Can you look at lockdep_sb_freeze_release() and 
> > lockdep_sb_freeze_acquire()?
>
> These 2 functions are there to deal with the lockdep code.

Plus they clearly document why sem->owner check is not right when it comes
to super_block->s_writers[]. Not only freeze and thaw can be called by
different processes, we need to return to user-space with rwsem held for
writing.

> > At first glance, it would be much better to set sem->owner = current in
> > percpu_rwsem_acquire(), no?
>
> The primary purpose of the owner field is to enable optimistic spinning
> to improve locking performance. So it needs to be set during an
> up_write() call.

Unless, again, the "owner" has to do lockdep_sb_freeze_release() for any
reason.

And please note that percpu_rwsem_release() already clears rw_sem.owner.
It checks CONFIG_RWSEM_SPIN_ON_OWNER, but this is simply because
rw_semaphore->owner doesn't exist otherwise.

> My rwsem debug patch does use it also to check for consistency in the
> use of lock/unlock call. Anyway, I don't think it is right to set it
> again in percpu_rwsem_acquire() if there is no guarantee that the task
> that call percpu_rwsem_acquire will be the one that will do the unlock.

Hmm. Perhaps I missed something, but I think this should be true.

Of course, you need to check "if (!read)", but again, this is what
percpu_rwsem_release() already does.

> I am wondering if it makes sense to do optimistic spinning in the case
> of percpu_rwsem where the unlocker may be a different task.

Again, perhaps I missed something, but see above. percpu_rwsem does not
really differ from the regular rwsem, however its usage in sb->s_writers[]
differs.

Oleg.



Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Oleg Nesterov
On 04/09, Waiman Long wrote:
>
> On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
> > Hmm. Can you look at lockdep_sb_freeze_release() and 
> > lockdep_sb_freeze_acquire()?
>
> These 2 functions are there to deal with the lockdep code.

Plus they clearly document why sem->owner check is not right when it comes
to super_block->s_writers[]. Not only freeze and thaw can be called by
different processes, we need to return to user-space with rwsem held for
writing.

> > At first glance, it would be much better to set sem->owner = current in
> > percpu_rwsem_acquire(), no?
>
> The primary purpose of the owner field is to enable optimistic spinning
> to improve locking performance. So it needs to be set during an
> up_write() call.

Unless, again, the "owner" has to do lockdep_sb_freeze_release() for any
reason.

And please note that percpu_rwsem_release() already clears rw_sem.owner.
It checks CONFIG_RWSEM_SPIN_ON_OWNER, but this is simply because
rw_semaphore->owner doesn't exist otherwise.

> My rwsem debug patch does use it also to check for consistency in the
> use of lock/unlock call. Anyway, I don't think it is right to set it
> again in percpu_rwsem_acquire() if there is no guarantee that the task
> that call percpu_rwsem_acquire will be the one that will do the unlock.

Hmm. Perhaps I missed something, but I think this should be true.

Of course, you need to check "if (!read)", but again, this is what
percpu_rwsem_release() already does.

> I am wondering if it makes sense to do optimistic spinning in the case
> of percpu_rwsem where the unlocker may be a different task.

Again, perhaps I missed something, but see above. percpu_rwsem does not
really differ from the regular rwsem, however its usage in sb->s_writers[]
differs.

Oleg.



Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Waiman Long
On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
> On 04/04, Waiman Long wrote:
>> --- a/kernel/locking/percpu-rwsem.c
>> +++ b/kernel/locking/percpu-rwsem.c
>> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>>  
>>  /*
>>   * Release the write lock, this will allow readers back in the game.
>> + * percpu_up_write() may be called from a task different from the one
>> + * taking the lock.
>>   */
>> -up_write(>rw_sem);
>> +up_write_non_owner(>rw_sem);
>>  
>>  /*
>>   * Once this completes (at least one RCU-sched grace period hence) the
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 30465a2..140d5ef 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>>  
>>  #endif
>>  
>> +#ifdef CONFIG_DEBUG_RWSEMS
>> +/*
>> + * release a write lock from a different task
>> + */
>> +void up_write_non_owner(struct rw_semaphore *sem)
>> +{
>> +rwsem_release(>dep_map, 1, _RET_IP_);
>> +DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>>  
>> +rwsem_clear_owner(sem);
>> +__up_write(sem);
>> +}
> Hmm. Can you look at lockdep_sb_freeze_release() and 
> lockdep_sb_freeze_acquire()?

These 2 functions are there to deal with the lockdep code.

> At first glance, it would be much better to set sem->owner = current in
> percpu_rwsem_acquire(), no?

The primary purpose of the owner field is to enable optimistic spinning
to improve locking performance. So it needs to be set during an
up_write() call.

My rwsem debug patch does use it also to check for consistency in the
use of lock/unlock call. Anyway, I don't think it is right to set it
again in percpu_rwsem_acquire() if there is no guarantee that the task
that call percpu_rwsem_acquire will be the one that will do the unlock.

I am wondering if it makes sense to do optimistic spinning in the case
of percpu_rwsem where the unlocker may be a different task. We could set
a special code for writer owned lock, but don't do optimistic spinning
in this case.

Cheers,
Longman




Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Waiman Long
On 04/09/2018 07:20 AM, Oleg Nesterov wrote:
> On 04/04, Waiman Long wrote:
>> --- a/kernel/locking/percpu-rwsem.c
>> +++ b/kernel/locking/percpu-rwsem.c
>> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>>  
>>  /*
>>   * Release the write lock, this will allow readers back in the game.
>> + * percpu_up_write() may be called from a task different from the one
>> + * taking the lock.
>>   */
>> -up_write(>rw_sem);
>> +up_write_non_owner(>rw_sem);
>>  
>>  /*
>>   * Once this completes (at least one RCU-sched grace period hence) the
>> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
>> index 30465a2..140d5ef 100644
>> --- a/kernel/locking/rwsem.c
>> +++ b/kernel/locking/rwsem.c
>> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>>  
>>  #endif
>>  
>> +#ifdef CONFIG_DEBUG_RWSEMS
>> +/*
>> + * release a write lock from a different task
>> + */
>> +void up_write_non_owner(struct rw_semaphore *sem)
>> +{
>> +rwsem_release(>dep_map, 1, _RET_IP_);
>> +DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>>  
>> +rwsem_clear_owner(sem);
>> +__up_write(sem);
>> +}
> Hmm. Can you look at lockdep_sb_freeze_release() and 
> lockdep_sb_freeze_acquire()?

These 2 functions are there to deal with the lockdep code.

> At first glance, it would be much better to set sem->owner = current in
> percpu_rwsem_acquire(), no?

The primary purpose of the owner field is to enable optimistic spinning
to improve locking performance. So it needs to be set during an
up_write() call.

My rwsem debug patch does use it also to check for consistency in the
use of lock/unlock call. Anyway, I don't think it is right to set it
again in percpu_rwsem_acquire() if there is no guarantee that the task
that call percpu_rwsem_acquire will be the one that will do the unlock.

I am wondering if it makes sense to do optimistic spinning in the case
of percpu_rwsem where the unlocker may be a different task. We could set
a special code for writer owned lock, but don't do optimistic spinning
in this case.

Cheers,
Longman




Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Oleg Nesterov
On 04/04, Waiman Long wrote:
>
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>  
>   /*
>* Release the write lock, this will allow readers back in the game.
> +  * percpu_up_write() may be called from a task different from the one
> +  * taking the lock.
>*/
> - up_write(>rw_sem);
> + up_write_non_owner(>rw_sem);
>  
>   /*
>* Once this completes (at least one RCU-sched grace period hence) the
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 30465a2..140d5ef 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>  
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +/*
> + * release a write lock from a different task
> + */
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> + rwsem_release(>dep_map, 1, _RET_IP_);
> + DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>  
> + rwsem_clear_owner(sem);
> + __up_write(sem);
> +}

Hmm. Can you look at lockdep_sb_freeze_release() and 
lockdep_sb_freeze_acquire()?

At first glance, it would be much better to set sem->owner = current in
percpu_rwsem_acquire(), no?

Oleg.



Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-09 Thread Oleg Nesterov
On 04/04, Waiman Long wrote:
>
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>  
>   /*
>* Release the write lock, this will allow readers back in the game.
> +  * percpu_up_write() may be called from a task different from the one
> +  * taking the lock.
>*/
> - up_write(>rw_sem);
> + up_write_non_owner(>rw_sem);
>  
>   /*
>* Once this completes (at least one RCU-sched grace period hence) the
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 30465a2..140d5ef 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>  
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +/*
> + * release a write lock from a different task
> + */
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> + rwsem_release(>dep_map, 1, _RET_IP_);
> + DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>  
> + rwsem_clear_owner(sem);
> + __up_write(sem);
> +}

Hmm. Can you look at lockdep_sb_freeze_release() and 
lockdep_sb_freeze_acquire()?

At first glance, it would be much better to set sem->owner = current in
percpu_rwsem_acquire(), no?

Oleg.



Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-04 Thread Theodore Y. Ts'o
On Wed, Apr 04, 2018 at 10:37:26AM -0400, Waiman Long wrote:
> The commit 8c5db92a705d ("locking/rwsem: Add DEBUG_RWSEMS to look for
> lock/unlock mismatches") causes a warning in ext4 fstests due to the
> fact that the freeze and thaw ioctls, by design, are run in different
> processes.

While true, it's not just ext4 fstests --- it's any file system which
supports freeze/thaw, since it's happening in the vfs layer.  I have
just as easily replicated the problem using "kvm-xfstests -c xfs
generic/068".  Or "kvm-xfstests -c btrfs generic/068".

Cheers,

- Ted


Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-04 Thread Theodore Y. Ts'o
On Wed, Apr 04, 2018 at 10:37:26AM -0400, Waiman Long wrote:
> The commit 8c5db92a705d ("locking/rwsem: Add DEBUG_RWSEMS to look for
> lock/unlock mismatches") causes a warning in ext4 fstests due to the
> fact that the freeze and thaw ioctls, by design, are run in different
> processes.

While true, it's not just ext4 fstests --- it's any file system which
supports freeze/thaw, since it's happening in the vfs layer.  I have
just as easily replicated the problem using "kvm-xfstests -c xfs
generic/068".  Or "kvm-xfstests -c btrfs generic/068".

Cheers,

- Ted


Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-04 Thread Waiman Long
On 04/04/2018 10:37 AM, Waiman Long wrote:
> The commit 8c5db92a705d ("locking/rwsem: Add DEBUG_RWSEMS to look for
> lock/unlock mismatches") causes a warning in ext4 fstests due to the
> fact that the freeze and thaw ioctls, by design, are run in different
> processes. This is not a bug in the commit itself. Rather, we need a
> up_write() variant that annotates the fact that it can be called from
> a task different from the one that took the lock.
>
> The new API is up_write_non_owner() and the percpu_up_write() function
> is modified to use it instead of the plain up_write().
>
> Signed-off-by: Waiman Long 
> ---
>  include/linux/rwsem.h |  6 ++
>  kernel/locking/percpu-rwsem.c |  4 +++-
>  kernel/locking/rwsem.c| 13 +
>  3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index 56707d5..0e2a425 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -187,4 +187,10 @@ static inline int rwsem_is_contended(struct rw_semaphore 
> *sem)
>  # define up_read_non_owner(sem)  up_read(sem)
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +extern void up_write_non_owner(struct rw_semaphore *sem);
> +#else
> +# define up_write_non_owner(sem) up_write(sem)
> +#endif
> +
>  #endif /* _LINUX_RWSEM_H */
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index 883cf1b..13decf2 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>  
>   /*
>* Release the write lock, this will allow readers back in the game.
> +  * percpu_up_write() may be called from a task different from the one
> +  * taking the lock.
>*/
> - up_write(>rw_sem);
> + up_write_non_owner(>rw_sem);
>  
>   /*
>* Once this completes (at least one RCU-sched grace period hence) the
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 30465a2..140d5ef 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>  
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +/*
> + * release a write lock from a different task
> + */
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> + rwsem_release(>dep_map, 1, _RET_IP_);
> + DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>  
> + rwsem_clear_owner(sem);
> + __up_write(sem);
> +}
> +EXPORT_SYMBOL(up_write_non_owner);
> +#endif

Theodore,

Could you try this patch to see if it can fix your fstests warning?

Thanks,
Longman




Re: [PATCH] locking/rwsem: Add up_write_non_owner() for percpu_up_write()

2018-04-04 Thread Waiman Long
On 04/04/2018 10:37 AM, Waiman Long wrote:
> The commit 8c5db92a705d ("locking/rwsem: Add DEBUG_RWSEMS to look for
> lock/unlock mismatches") causes a warning in ext4 fstests due to the
> fact that the freeze and thaw ioctls, by design, are run in different
> processes. This is not a bug in the commit itself. Rather, we need a
> up_write() variant that annotates the fact that it can be called from
> a task different from the one that took the lock.
>
> The new API is up_write_non_owner() and the percpu_up_write() function
> is modified to use it instead of the plain up_write().
>
> Signed-off-by: Waiman Long 
> ---
>  include/linux/rwsem.h |  6 ++
>  kernel/locking/percpu-rwsem.c |  4 +++-
>  kernel/locking/rwsem.c| 13 +
>  3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/rwsem.h b/include/linux/rwsem.h
> index 56707d5..0e2a425 100644
> --- a/include/linux/rwsem.h
> +++ b/include/linux/rwsem.h
> @@ -187,4 +187,10 @@ static inline int rwsem_is_contended(struct rw_semaphore 
> *sem)
>  # define up_read_non_owner(sem)  up_read(sem)
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +extern void up_write_non_owner(struct rw_semaphore *sem);
> +#else
> +# define up_write_non_owner(sem) up_write(sem)
> +#endif
> +
>  #endif /* _LINUX_RWSEM_H */
> diff --git a/kernel/locking/percpu-rwsem.c b/kernel/locking/percpu-rwsem.c
> index 883cf1b..13decf2 100644
> --- a/kernel/locking/percpu-rwsem.c
> +++ b/kernel/locking/percpu-rwsem.c
> @@ -179,8 +179,10 @@ void percpu_up_write(struct percpu_rw_semaphore *sem)
>  
>   /*
>* Release the write lock, this will allow readers back in the game.
> +  * percpu_up_write() may be called from a task different from the one
> +  * taking the lock.
>*/
> - up_write(>rw_sem);
> + up_write_non_owner(>rw_sem);
>  
>   /*
>* Once this completes (at least one RCU-sched grace period hence) the
> diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
> index 30465a2..140d5ef 100644
> --- a/kernel/locking/rwsem.c
> +++ b/kernel/locking/rwsem.c
> @@ -222,4 +222,17 @@ void up_read_non_owner(struct rw_semaphore *sem)
>  
>  #endif
>  
> +#ifdef CONFIG_DEBUG_RWSEMS
> +/*
> + * release a write lock from a different task
> + */
> +void up_write_non_owner(struct rw_semaphore *sem)
> +{
> + rwsem_release(>dep_map, 1, _RET_IP_);
> + DEBUG_RWSEMS_WARN_ON(!sem->owner || (sem->owner == RWSEM_READER_OWNED));
>  
> + rwsem_clear_owner(sem);
> + __up_write(sem);
> +}
> +EXPORT_SYMBOL(up_write_non_owner);
> +#endif

Theodore,

Could you try this patch to see if it can fix your fstests warning?

Thanks,
Longman