On 20/08/20 12:06, Christian Schoenebeck wrote: > Hmmm, as Geoffrey already added a lock today, I noticed that QEMU's main IO > thread mutex is not initialized as 'recursive' lock type. Does that make > sense? I.e. shouldn't there be a > > qemu_rec_mutex_init(&qemu_global_mutex); > > in softmmu/cpus.c for safety reasons to prevent nested locks from same thread > causing misbehaviour? > > CCing Paolo to clarify.
atexit functions (in this case audio_cleanup->free_audio_state->qjack_fini_out) might be called both with or without the BQL taken, so v7 of this series is likely wrong as you point out. However, recursive locks are pure evil. :) More seriously: programming with concurrency is first and foremost about understanding invariants[1]. Locks are relatively simple to reason about because they enforce invariants at the points where they are acquired or released. If you have something like static void do_it_locked(struct foo *foo) { /* ... */ } static void do_it(struct foo *foo) { mutex_lock(&foo->lock); do_it_locked(foo); mutex_unlock(&foo->lock); } then you can immediately know that foo_locked() calls requires more care, because the invariants around "foo" have to be provided by the caller of foo_locked(). Instead, on a call to foo() the invariants are guaranteed just because the caller must not have locked foo(). Instead, recursive locks allow you to slip into a different mindset where locks protect the _code_ instead of the _data_ (and the invariants of that data). Things look simpler because you can just have static void do_it(struct foo *foo) { rec_mutex_lock(&foo->lock); /* ... */ rec_mutex_unlock(&foo->lock); } but callers have no clue about what invariants are provided around calls to do_it(). So, understanding complex code that uses recursive mutexes is effectively impossible. More on the practical side, recursive mutex are an easy way to get a deadlock. It's a common idiom to do /* Need to take foo->lock outside bar->lock. */ mutex_unlock(&bar->lock); mutex_lock(&foo->lock); mutex_lock(&bar->lock); With a recursive mutex, there's no guarantee that foo->lock is *really* taken outside bar->lock, because the first unlock could have done nothing except decreasing the lock-nesting count. This is also why QEMU uses differently-named functions to lock/unlock recursive mutexes, instead of having a flag like pthread mutexes do; it makes code like this stand out as wrong: /* Meant to take foo->lock outside bar->lock, but really doesn't */ rec_mutex_unlock(&bar->lock); mutex_lock(&foo->lock); rec_mutex_lock(&bar->lock); A variant of this applies to callbacks: the golden rule is that callbacks (e.g. from a function that iterates over a data structure) should be called without any lock taken in order to avoid deadlocks. However, this rule is most often ignored in code that uses a recursive mutex, because that code is written around the (incorrect) paradigm that mutual exclusion makes code safe. Which technically is true in this case, as deadlocks are not a safety problem, but that's not a great consolation. My suggestion is to work towards protecting the audio code with its own mutex(es) and ignore the existence of the BQL for subsystems that can do so (audio is a prime candidate). Also please add comments to audio_int.h about which functions are called from other threads than the QEMU main thread. Thanks, Paolo [1] https://lamport.azurewebsites.net/pubs/teaching-concurrency.pdf