New issue 3072: Race in PyThread_release_lock - can lead to memory corruption 
and deadlock
https://bitbucket.org/pypy/pypy/issues/3072/race-in-pythread_release_lock-can-lead-to

Kirill Smelkov:

Hello up there. I recently found concurency bug that applies to CPython2.7, 
PyPy2 and PyPy3. Basically PyThread\_release\_lock, on systems where it is 
implemented via mutex \+ condition variable, has a race condition since 
pthread\_cond\_signal is called outside of mutex-protected region. This race 
condition can lead to memory corruption and deadlock. I’ve hit this bug for 
real while testing Pygolang on macOS. Original bug report is here: 
[https://bugs.python.org/issue38106](https://bugs.python.org/issue38106).

Copy of original bug report follows:

----8<----

### Bug history

\( Please see attached pylock\_bug.pyx for the program that triggers the bug 
for real. \)

On Darwin, even though this is considered as POSIX, Python uses  
mutex\+condition variable to implement its lock, and, as of 20190828, Py2.7  
implementation, even though similar issue was fixed for Py3 in 2012, contains  
synchronization bug: the condition is signalled after mutex unlock while the  
correct protocol is to signal condition from under mutex:

[https://github.com/python/cpython/blob/v2.7.16-127-g0229b56d8c0/Python/thread\_pthread.h#L486-L506](https://github.com/python/cpython/blob/v2.7.16-127-g0229b56d8c0/Python/thread_pthread.h#L486-L506)
  
[https://github.com/python/cpython/commit/187aa545165d](https://github.com/python/cpython/commit/187aa545165d)
 \(py3 fix\)

PyPy has the same bug for both pypy2 and pypy3:

[https://bitbucket.org/pypy/pypy/src/578667b3fef9/rpython/translator/c/src/thread\_pthread.c#lines-443:465](https://bitbucket.org/pypy/pypy/src/578667b3fef9/rpython/translator/c/src/thread_pthread.c#lines-443:465)
  
[https://bitbucket.org/pypy/pypy/src/5b42890d48c3/rpython/translator/c/src/thread\_pthread.c#lines-443:465](https://bitbucket.org/pypy/pypy/src/5b42890d48c3/rpython/translator/c/src/thread_pthread.c#lines-443:465)

Signalling condition outside of corresponding mutex is considered OK by  
POSIX, but in Python context it can lead to at least memory corruption if we  
consider the whole lifetime of python level lock. For example the following  
logical scenario:

```
      T1                                          T2

  sema = Lock()
  sema.acquire()

                                              sema.release()

  sema.acquire()
  free(sema)

  ...
```

can translate to the next C-level calls:

```
      T1                                          T2

  # sema = Lock()
  sema = malloc(...)
  sema.locked = 0
  pthread_mutex_init(&sema.mut)
  pthread_cond_init (&sema.lock_released)

  # sema.acquire()
  pthread_mutex_lock(&sema.mut)
  # sees sema.locked == 0
  sema.locked = 1
  pthread_mutex_unlock(&sema.mut)


                                              # sema.release()
                                              pthread_mutex_lock(&sema.mut)
                                              sema.locked = 0
                                              pthread_mutex_unlock(&sema.mut)

                      # OS scheduler gets in and relinquishes control from T2
                      # to another process
                                              ...

  # second sema.acquire()
  pthread_mutex_lock(&sema.mut)
  # sees sema.locked == 0
  sema.locked = 1
  pthread_mutex_unlock(&sema.mut)

  # free(sema)
  pthread_mutex_destroy(&sema.mut)
  pthread_cond_destroy (&sema.lock_released)
  free(sema)


  # ...
  e.g. malloc() which returns memory where sema was

                                              ...
                      # OS scheduler returns control to T2
                      # sema.release() continues
                      #
                      # BUT sema was already freed and writing to anywhere
                      # inside sema block CORRUPTS MEMORY. In particular if
                      # _another_ python-level lock was allocated where sema
                      # block was, writing into the memory can have effect on
                      # further synchronization correctness and in particular
                      # lead to deadlock on lock that was next allocated.
                                              
pthread_cond_signal(&sema.lock_released)
```

Note that T2.pthread\_cond\_signal\(&sema.lock\_released\) CORRUPTS MEMORY as 
it  
is called when sema memory was already freed and is potentially  
reallocated for another object.

The fix is to move pthread\_cond\_signal to be done under corresponding mutex:

```
  # sema.release()
  pthread_mutex_lock(&sema.mut)
  sema.locked = 0
  pthread_cond_signal(&sema.lock_released)
  pthread_mutex_unlock(&sema.mut)
```

by cherry-picking commit 187aa545165d \("Signal condition variables with the  
mutex held. Destroy condition variables before their mutexes"\).

### Bug history

The bug was there since 1994 - since at least \[1\]. It was discussed in 2001  
with original code author\[2\], but the code was still considered to be  
race-free. In 2010 the place where pthread\_cond\_signal should be - before or  
after pthread\_mutex\_unlock - was discussed with the rationale to avoid  
threads bouncing\[3,4,5\], and in 2012 pthread\_cond\_signal was moved to be  
called from under mutex, but only for CPython3\[6,7\].

In 2019 the bug was \(re-\)discovered while testing Pygolang\[8\] on macOS with 
 
CPython2 and PyPy2 and PyPy3.

\[1\] 
[https://github.com/python/cpython/commit/2c8cb9f3d240](https://github.com/python/cpython/commit/2c8cb9f3d240)
  
\[2\] 
[https://bugs.python.org/issue433625](https://bugs.python.org/issue433625)  
\[3\] 
[https://bugs.python.org/issue8299#msg103224](https://bugs.python.org/issue8299#msg103224)
  
\[4\] 
[https://bugs.python.org/issue8410#msg103313](https://bugs.python.org/issue8410#msg103313)
  
\[5\] 
[https://bugs.python.org/issue8411#msg113301](https://bugs.python.org/issue8411#msg113301)
  
\[6\] 
[https://bugs.python.org/issue15038#msg163187](https://bugs.python.org/issue15038#msg163187)
  
\[7\] 
[https://github.com/python/cpython/commit/187aa545165d](https://github.com/python/cpython/commit/187aa545165d)
  
\[8\] [https://pypi.org/project/pygolang](https://pypi.org/project/pygolang)


_______________________________________________
pypy-issue mailing list
pypy-issue@python.org
https://mail.python.org/mailman/listinfo/pypy-issue

Reply via email to