Re: Question on mutex code

2015-03-15 Thread Matthias Bonne

On 03/16/15 00:10, Rabin Vincent wrote:

On Sun, Mar 15, 2015 at 11:49:07PM +0200, Matthias Bonne wrote:

So the counter is set to 1 before taking the spinlock, which I think
might cause the race. Did I miss something?


Yes, you miss the fact that __mutex_slowpath_needs_to_unlock() is 0 for
the CONFIG_DEBUG_MUTEXES case:

  #ifdef CONFIG_DEBUG_MUTEXES
  # include "mutex-debug.h"
  # include <asm-generic/mutex-null.h>
  /*
   * Must be 0 for the debug case so we do not do the unlock outside of the
   * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
   * case.
   */
  # undef __mutex_slowpath_needs_to_unlock
  # define  __mutex_slowpath_needs_to_unlock()  0



Right, I overlooked this part. Thanks to both of you for the
clarifications.


Re: Question on mutex code

2015-03-15 Thread Davidlohr Bueso
On Sun, 2015-03-15 at 15:18 -0700, Davidlohr Bueso wrote:
> Correct, in debug this is most likely true, yet safe because everything
^^^ false, again the same reasoning.


Re: Question on mutex code

2015-03-15 Thread Davidlohr Bueso
On Sun, 2015-03-15 at 23:49 +0200, Matthias Bonne wrote:
> On 03/15/15 03:09, Davidlohr Bueso wrote:
> > On Sat, 2015-03-14 at 18:03 -0700, Davidlohr Bueso wrote:
> >> Good analysis, but not quite accurate for one simple fact: mutex
> >> trylocks _only_ use fastpaths (obviously just depend on the counter
> >> cmpxchg to 0), so you never fallback to the slowpath you are mentioning,
> >> thus the race is non existent. Please see the arch code.
> >
> > For debug we use the trylock slowpath, but so does everything else, so
> > again you cannot hit this scenario.
> >
> >
> 
> You are correct of course - this is why I said that
> CONFIG_DEBUG_MUTEXES must be enabled for this to happen.

Right, so I just skimmed through the email ;)

>  Can you
> explain why this scenario is still not possible in the debug case?
> 
> The debug case uses mutex-null.h, which contains these macros:
> 
> #define __mutex_fastpath_lock(count, fail_fn)           fail_fn(count)
> #define __mutex_fastpath_lock_retval(count)             (-1)
> #define __mutex_fastpath_unlock(count, fail_fn)         fail_fn(count)
> #define __mutex_fastpath_trylock(count, fail_fn)        fail_fn(count)
> #define __mutex_slowpath_needs_to_unlock()              1
> 
> So both mutex_trylock() and mutex_unlock() always use the slow paths.

Right.

> The slowpath for mutex_unlock() is __mutex_unlock_slowpath(), which
> simply calls __mutex_unlock_common_slowpath(), and the latter starts
> like this:
> 
>         /*
>          * As a performance measurement, release the lock before doing other
>          * wakeup related duties to follow. This allows other tasks to acquire
>          * the lock sooner, while still handling cleanups in past unlock calls.
>          * This can be done as we do not enforce strict equivalence between the
>          * mutex counter and wait_list.
>          *
>          * Some architectures leave the lock unlocked in the fastpath failure
>          * case, others need to leave it locked. In the later case we have to
>          * unlock it here - as the lock counter is currently 0 or negative.
>          */
>         if (__mutex_slowpath_needs_to_unlock())
>                 atomic_set(&lock->count, 1);

Correct, in debug this is most likely true, yet safe because everything
is serialized through the mutex wait_lock. 

> 
>         spin_lock_mutex(&lock->wait_lock, flags);
>         [...]
> 
> So the counter is set to 1 before taking the spinlock, which I think
> might cause the race. Did I miss something?

So in debug we play no counter/wait_list games when trying to grab the
lock, i.e. no lock stealing or optimistic spinning. Furthermore, it is the
unlocker thread's duty to wake up the next task in the list, so nothing can
jump in and steal the lock. Additionally, ordering also relies on the wait
queue's ticket spinlock (wait_lock).
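
Schematically, in the debug case both sides only touch the counter with
wait_lock held, so the xchg/reset in the trylock slowpath and the counter
reset in the unlock slowpath cannot interleave. A rough sketch of the
control flow (not the exact source):

        /* __mutex_trylock_slowpath(), debug config */
        spin_lock_mutex(&lock->wait_lock, flags);
        prev = atomic_xchg(&lock->count, -1);
        if (likely(prev == 1))
                mutex_set_owner(lock);
        if (likely(list_empty(&lock->wait_list)))
                atomic_set(&lock->count, 0);
        spin_unlock_mutex(&lock->wait_lock, flags);

        /*
         * __mutex_unlock_common_slowpath(), debug config:
         * __mutex_slowpath_needs_to_unlock() is 0, so the counter is left
         * untouched until wait_lock is held.
         */
        spin_lock_mutex(&lock->wait_lock, flags);
        debug_mutex_unlock(lock);       /* resets lock->count to 1 here */
        /* wake up the first waiter, if any */
        spin_unlock_mutex(&lock->wait_lock, flags);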


Re: Question on mutex code

2015-03-15 Thread Rabin Vincent
On Sun, Mar 15, 2015 at 11:49:07PM +0200, Matthias Bonne wrote:
> So both mutex_trylock() and mutex_unlock() always use the slow paths.
> The slowpath for mutex_unlock() is __mutex_unlock_slowpath(), which
> simply calls __mutex_unlock_common_slowpath(), and the latter starts
> like this:
> 
> /*
>  * As a performance measurement, release the lock before doing other
>  * wakeup related duties to follow. This allows other tasks to acquire
>  * the lock sooner, while still handling cleanups in past unlock calls.
>  * This can be done as we do not enforce strict equivalence between the
>  * mutex counter and wait_list.
>  *
>  * Some architectures leave the lock unlocked in the fastpath failure
>  * case, others need to leave it locked. In the later case we have to
>  * unlock it here - as the lock counter is currently 0 or negative.
>  */
> if (__mutex_slowpath_needs_to_unlock())
>         atomic_set(&lock->count, 1);
> 
> spin_lock_mutex(&lock->wait_lock, flags);
> [...]
> 
> So the counter is set to 1 before taking the spinlock, which I think
> might cause the race. Did I miss something?

Yes, you miss the fact that __mutex_slowpath_needs_to_unlock() is 0 for
the CONFIG_DEBUG_MUTEXES case:

 #ifdef CONFIG_DEBUG_MUTEXES
 # include "mutex-debug.h"
 # include <asm-generic/mutex-null.h>
 /*
  * Must be 0 for the debug case so we do not do the unlock outside of the
  * wait_lock region. debug_mutex_unlock() will do the actual unlock in this
  * case.
  */
 # undef __mutex_slowpath_needs_to_unlock
 # define  __mutex_slowpath_needs_to_unlock()   0
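
In other words, in the debug case the counter only goes back to 1 from
debug_mutex_unlock(), which runs with wait_lock already held. Roughly (a
sketch, not the exact mutex-debug.c source):

 void debug_mutex_unlock(struct mutex *lock)
 {
         /* debug sanity checks on lock->magic, lock->owner, wait_list */

         /*
          * __mutex_slowpath_needs_to_unlock() is 0 for debug mutexes,
          * so the actual unlock happens here, inside the wait_lock region.
          */
         atomic_set(&lock->count, 1);
 }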


Re: Question on mutex code

2015-03-15 Thread Matthias Bonne

On 03/15/15 03:09, Davidlohr Bueso wrote:

On Sat, 2015-03-14 at 18:03 -0700, Davidlohr Bueso wrote:

Good analysis, but not quite accurate for one simple fact: mutex
trylocks _only_ use fastpaths (obviously just depend on the counter
cmpxchg to 0), so you never fallback to the slowpath you are mentioning,
thus the race is non existent. Please see the arch code.


For debug we use the trylock slowpath, but so does everything else, so
again you cannot hit this scenario.




You are correct of course - this is why I said that
CONFIG_DEBUG_MUTEXES must be enabled for this to happen. Can you
explain why this scenario is still not possible in the debug case?

The debug case uses mutex-null.h, which contains these macros:

#define __mutex_fastpath_lock(count, fail_fn)           fail_fn(count)
#define __mutex_fastpath_lock_retval(count)             (-1)
#define __mutex_fastpath_unlock(count, fail_fn)         fail_fn(count)
#define __mutex_fastpath_trylock(count, fail_fn)        fail_fn(count)
#define __mutex_slowpath_needs_to_unlock()              1

So both mutex_trylock() and mutex_unlock() always use the slow paths.
The slowpath for mutex_unlock() is __mutex_unlock_slowpath(), which
simply calls __mutex_unlock_common_slowpath(), and the latter starts
like this:

        /*
         * As a performance measurement, release the lock before doing other
         * wakeup related duties to follow. This allows other tasks to acquire
         * the lock sooner, while still handling cleanups in past unlock calls.
         * This can be done as we do not enforce strict equivalence between the
         * mutex counter and wait_list.
         *
         * Some architectures leave the lock unlocked in the fastpath failure
         * case, others need to leave it locked. In the later case we have to
         * unlock it here - as the lock counter is currently 0 or negative.
         */
        if (__mutex_slowpath_needs_to_unlock())
                atomic_set(&lock->count, 1);

        spin_lock_mutex(&lock->wait_lock, flags);
        [...]

So the counter is set to 1 before taking the spinlock, which I think
might cause the race. Did I miss something?


Re: Question on mutex code

2015-03-14 Thread Davidlohr Bueso
On Sat, 2015-03-14 at 18:03 -0700, Davidlohr Bueso wrote:
> Good analysis, but not quite accurate for one simple fact: mutex
> trylocks _only_ use fastpaths (obviously just depend on the counter
> cmpxchg to 0), so you never fallback to the slowpath you are mentioning,
> thus the race is non existent. Please see the arch code.

For debug we use the trylock slowpath, but so does everything else, so
again you cannot hit this scenario.


Re: Question on mutex code

2015-03-14 Thread Davidlohr Bueso
On Sun, 2015-03-15 at 01:05 +0200, Matthias Bonne wrote:
> On 03/10/15 15:03, Yann Droneaud wrote:
> > Hi,
> >
> > On Wednesday, 4 March 2015 at 02:13 +0200, Matthias Bonne wrote:
> >
> >> I am trying to understand how mutexes work in the kernel, and I think
> >> there might be a race between mutex_trylock() and mutex_unlock(). More
> >> specifically, the race is between the functions
> >> __mutex_trylock_slowpath and __mutex_unlock_common_slowpath (both
> >> defined in kernel/locking/mutex.c).
> >>
> >> Consider the following sequence of events:
> >>
> [...]
> >>
> >> The end result is that the mutex count is 0 (locked), although the
> >> owner has just released it, and nobody else is holding the mutex. So it
> >> can no longer be acquired by anyone.
> >>
> >> Am I missing something that prevents the above scenario from happening?
> >> If not, should I post a patch that fixes it to LKML? Or is it
> >> considered too "theoretical" and cannot happen in practice?
> >>
> >
> > I haven't looked at your explanations, you should have come with a
> > reproductible test case to demonstrate the issue (involving slowing
> > down one CPU ?).
> >
> > Anyway, such deep knowledge on the mutex implementation has to be found
> > on lkml.
> >
> > Regards.
> >
> 
> Thank you for your suggestions, and sorry for the long delay.
> 
> I see now that my explanation was unnecessarily complex. The problem is
> this code from __mutex_trylock_slowpath():
> 
>          spin_lock_mutex(&lock->wait_lock, flags);
> 
>          prev = atomic_xchg(&lock->count, -1);
>          if (likely(prev == 1)) {
>                  mutex_set_owner(lock);
>                  mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
>          }
> 
>          /* Set it back to 0 if there are no waiters: */
>          if (likely(list_empty(&lock->wait_list)))
>                  atomic_set(&lock->count, 0);
> 
>          spin_unlock_mutex(&lock->wait_lock, flags);
> 
>          return prev == 1;
> 
> The above code assumes that the mutex cannot be unlocked while the
> spinlock is held. However, mutex_unlock() sets the mutex count to 1
> before taking the spinlock (even in the slowpath). If this happens
> between the atomic_xchg() and the atomic_set() above, and the mutex has
> no waiters, then the atomic_set() will set the mutex count back to 0
> after it has been unlocked by mutex_unlock(), but mutex_trylock() will
> still return failure. So the mutex will remain locked forever.

Good analysis, but not quite accurate for one simple fact: mutex
trylocks _only_ use fastpaths (obviously just depend on the counter
cmpxchg to 0), so you never fallback to the slowpath you are mentioning,
thus the race is non existent. Please see the arch code.


Re: Question on mutex code

2015-03-14 Thread Matthias Bonne

On 03/10/15 16:59, valdis.kletni...@vt.edu wrote:

On Tue, 10 Mar 2015 14:03:59 +0100, Yann Droneaud said:


Consider the following sequence of events:

0. Suppose a mutex is locked by task A and has no waiters.

1. Task B calls mutex_trylock().

2. mutex_trylock() calls the architecture-specific
 __mutex_fastpath_trylock(), with __mutex_trylock_slowpath() as
 fail_fn.

3. According to the description of __mutex_fastpath_trylock() (for
 example in include/asm-generic/mutex-dec.h), "if the architecture
 has no effective trylock variant, it should call the fail_fn
 spinlock-based trylock variant unconditionally". So
 __mutex_fastpath_trylock() may now call __mutex_trylock_slowpath().

4. Task A releases the mutex.

5. Task B, in __mutex_trylock_slowpath, executes:

         /* No need to trylock if the mutex is locked. */
         if (mutex_is_locked(lock))
                 return 0;

 Since the mutex is no longer locked, the function continues.

6. Task C, which runs on a different cpu than task B, locks the mutex
 again.

7. Task B, in __mutex_trylock_slowpath(), continues:

         spin_lock_mutex(&lock->wait_lock, flags);


B will spin here until C releases the lock.

When that spin exits, C no longer holds the lock.  Re-do the analysis
from this point.



Thank you for the review.

I don't think B waits for C here - C holds the mutex (lock), not the 
internal spinlock (lock->wait_lock). I might be wrong though.
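
For reference, the mutex counter and the internal spinlock are separate
fields of struct mutex (roughly; a sketch, not the exact definition):

        struct mutex {
                atomic_t                count;     /* 1: unlocked, 0: locked, <0: waiters */
                spinlock_t              wait_lock; /* protects wait_list and the slowpaths */
                struct list_head        wait_list;
                /* plus owner/debug/lockdep fields depending on config */
        };

so spin_lock_mutex(&lock->wait_lock, flags) only serializes against other
slowpath sections, not against the current mutex owner.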


Re: Question on mutex code

2015-03-14 Thread Matthias Bonne

On 03/10/15 15:03, Yann Droneaud wrote:

Hi,

On Wednesday, 4 March 2015 at 02:13 +0200, Matthias Bonne wrote:


I am trying to understand how mutexes work in the kernel, and I think
there might be a race between mutex_trylock() and mutex_unlock(). More
specifically, the race is between the functions
__mutex_trylock_slowpath and __mutex_unlock_common_slowpath (both
defined in kernel/locking/mutex.c).

Consider the following sequence of events:


[...]


The end result is that the mutex count is 0 (locked), although the
owner has just released it, and nobody else is holding the mutex. So it
can no longer be acquired by anyone.

Am I missing something that prevents the above scenario from happening?
If not, should I post a patch that fixes it to LKML? Or is it
considered too "theoretical" and cannot happen in practice?



I haven't looked at your explanations, you should have come with a
reproductible test case to demonstrate the issue (involving slowing
down one CPU ?).

Anyway, such deep knowledge on the mutex implementation has to be found
on lkml.

Regards.



Thank you for your suggestions, and sorry for the long delay.

I see now that my explanation was unnecessarily complex. The problem is
this code from __mutex_trylock_slowpath():

        spin_lock_mutex(&lock->wait_lock, flags);

        prev = atomic_xchg(&lock->count, -1);
        if (likely(prev == 1)) {
                mutex_set_owner(lock);
                mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
        }

        /* Set it back to 0 if there are no waiters: */
        if (likely(list_empty(&lock->wait_list)))
                atomic_set(&lock->count, 0);

        spin_unlock_mutex(&lock->wait_lock, flags);

        return prev == 1;

The above code assumes that the mutex cannot be unlocked while the
spinlock is held. However, mutex_unlock() sets the mutex count to 1
before taking the spinlock (even in the slowpath). If this happens
between the atomic_xchg() and the atomic_set() above, and the mutex has
no waiters, then the atomic_set() will set the mutex count back to 0
after it has been unlocked by mutex_unlock(), but mutex_trylock() will
still return failure. So the mutex will remain locked forever.
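
Spelled out, the interleaving I have in mind is (this is just a restatement
of the above, assuming a configuration where
__mutex_slowpath_needs_to_unlock() is 1):

   Task B: __mutex_trylock_slowpath()          Owner: __mutex_unlock_common_slowpath()
   -----------------------------------         ---------------------------------------
   spin_lock_mutex(&lock->wait_lock, ...);
   prev = atomic_xchg(&lock->count, -1);
                                               atomic_set(&lock->count, 1);   /* unlocks */
                                               spin_lock_mutex(...);          /* spins   */
   if (list_empty(&lock->wait_list))
           atomic_set(&lock->count, 0);        /* re-locks the now-free mutex */
   spin_unlock_mutex(&lock->wait_lock, ...);
   return prev == 1;                           /* prev == 0, so trylock fails */

leaving the counter at 0 (locked) with no owner and no waiters.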

I don't know how to write a test case to demonstrate the issue, because
this race is very hard to trigger in practice: the mutex needs to be
locked immediately before the spinlock is acquired, and unlocked in the
very short interval between atomic_xchg() and atomic_set(). It also
requires that CONFIG_DEBUG_MUTEXES be set, since AFAICT the mutex
debugging code is currently the only user of __mutex_trylock_slowpath.
This is why I asked if it is acceptable to submit a patch for such
hard-to-trigger problems.

I think I will just send a fix. Any further suggestions or guidance
would be appreciated.


Re: Question on mutex code

2015-03-10 Thread Valdis . Kletnieks
On Tue, 10 Mar 2015 14:03:59 +0100, Yann Droneaud said:

> > Consider the following sequence of events:
> > 
> > 0. Suppose a mutex is locked by task A and has no waiters.
> > 
> > 1. Task B calls mutex_trylock().
> > 
> > 2. mutex_trylock() calls the architecture-specific
> > __mutex_fastpath_trylock(), with __mutex_trylock_slowpath() as
> > fail_fn.
> > 
> > 3. According to the description of __mutex_fastpath_trylock() (for
> > example in include/asm-generic/mutex-dec.h), "if the architecture
> > has no effective trylock variant, it should call the fail_fn
> > spinlock-based trylock variant unconditionally". So
> > __mutex_fastpath_trylock() may now call __mutex_trylock_slowpath().
> > 
> > 4. Task A releases the mutex.
> > 
> > 5. Task B, in __mutex_trylock_slowpath, executes:
> > 
> >          /* No need to trylock if the mutex is locked. */
> >          if (mutex_is_locked(lock))
> >                  return 0;
> > 
> > Since the mutex is no longer locked, the function continues.
> > 
> > 6. Task C, which runs on a different cpu than task B, locks the mutex
> > again.
> > 
> > 7. Task B, in __mutex_trylock_slowpath(), continues:
> > 
> >          spin_lock_mutex(&lock->wait_lock, flags);

B will spin here until C releases the lock.

When that spin exits, C no longer holds the lock.  Re-do the analysis
from this point.


Re: Question on mutex code

2015-03-10 Thread Yann Droneaud
Hi,

On Wednesday, 4 March 2015 at 02:13 +0200, Matthias Bonne wrote:

> I am trying to understand how mutexes work in the kernel, and I think
> there might be a race between mutex_trylock() and mutex_unlock(). More
> specifically, the race is between the functions
> __mutex_trylock_slowpath and __mutex_unlock_common_slowpath (both
> defined in kernel/locking/mutex.c).
> 
> Consider the following sequence of events:
> 
> 0. Suppose a mutex is locked by task A and has no waiters.
> 
> 1. Task B calls mutex_trylock().
> 
> 2. mutex_trylock() calls the architecture-specific
> __mutex_fastpath_trylock(), with __mutex_trylock_slowpath() as
> fail_fn.
> 
> 3. According to the description of __mutex_fastpath_trylock() (for
> example in include/asm-generic/mutex-dec.h), "if the architecture
> has no effective trylock variant, it should call the fail_fn
> spinlock-based trylock variant unconditionally". So
> __mutex_fastpath_trylock() may now call __mutex_trylock_slowpath().
> 
> 4. Task A releases the mutex.
> 
> 5. Task B, in __mutex_trylock_slowpath, executes:
> 
>          /* No need to trylock if the mutex is locked. */
>          if (mutex_is_locked(lock))
>                  return 0;
> 
> Since the mutex is no longer locked, the function continues.
> 
> 6. Task C, which runs on a different cpu than task B, locks the mutex
> again.
> 
> 7. Task B, in __mutex_trylock_slowpath(), continues:
> 
>          spin_lock_mutex(&lock->wait_lock, flags);
> 
>          prev = atomic_xchg(&lock->count, -1);
>          if (likely(prev == 1)) {
>                  mutex_set_owner(lock);
>                  mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
>          }
> 
> At this point task B holds mutex->wait_lock, prev is 0 (because there
> are no waiters other than task B, so the count was 0) and the mutex
> count is set to -1.
> 
> 5. Task C calls mutex_unlock() to unlock the mutex.
> 
> 6. mutex_unlock() calls the architecture-specific function
> __mutex_fastpath_unlock(), which fails (because the mutex count is
> -1), so it now calls __mutex_unlock_slowpath(), which calls
> __mutex_unlock_common_slowpath().
> 
> 7. __mutex_unlock_common_slowpath() sets the mutex count to 1
> unconditionally, before spinning on mutex->wait_lock.
> 
> 8. Task B, in __mutex_trylock_slowpath, continues:
> 
>          /* Set it back to 0 if there are no waiters: */
>          if (likely(list_empty(&lock->wait_list)))
>                  atomic_set(&lock->count, 0);
> 
>          spin_unlock_mutex(&lock->wait_lock, flags);
> 
>          return prev == 1;
> 
> mutex->wait_list is still empty, so the code sets the mutex count to
> zero (which means the mutex is locked), releases mutex->wait_lock,
> and returns 0 (which means that the mutex is locked by someone else,
> and cannot be acquired).
> 
> 9. Task C, in __mutex_unlock_common_slowpath, acquires
> mutex->wait_lock, unlocks it immediately (because there are no
> waiters to wake up) and returns.
> 
> The end result is that the mutex count is 0 (locked), although the
> owner has just released it, and nobody else is holding the mutex. So it
> can no longer be acquired by anyone.
> 
> Am I missing something that prevents the above scenario from happening?
> If not, should I post a patch that fixes it to LKML? Or is it
> considered too "theoretical" and cannot happen in practice?
> 

I haven't looked at your explanations, you should have come with a 
reproductible test case to demonstrate the issue (involving slowing 
down one CPU ?).

Anyway, such deep knowledge on the mutex implementation has to be found
on lkml.

Regards.

-- 
Yann Droneaud
OPTEYA


