jack: fix use after free segfault

Paolo Bonzini Thu, 20 Aug 2020 03:56:19 -0700

On 20/08/20 12:06, Christian Schoenebeck wrote:
> Hmmm, as Geoffrey already added a lock today, I noticed that QEMU's main IO 
> thread mutex is not initialized as 'recursive' lock type. Does that make 
> sense? I.e. shouldn't there be a
> 
>       qemu_rec_mutex_init(&qemu_global_mutex);
> 
> in softmmu/cpus.c for safety reasons to prevent nested locks from same thread 
> causing misbehaviour?
> 
> CCing Paolo to clarify.


atexit functions (in this case
audio_cleanup->free_audio_state->qjack_fini_out) might be called both
with or without the BQL taken, so v7 of this series is likely wrong as
you point out.

However, recursive locks are pure evil. :)

More seriously: programming with concurrency is first and foremost about
understanding invariants[1].  Locks are relatively simple to reason
about because they enforce invariants at the points where they are
acquired or released.

If you have something like

static void do_it_locked(struct foo *foo)
{
    /* ... */
}

static void do_it(struct foo *foo)
{
    mutex_lock(&foo->lock);
    do_it_locked(foo);
    mutex_unlock(&foo->lock);
}

then you can immediately know that foo_locked() calls requires more
care, because the invariants around "foo" have to be provided by the
caller of foo_locked().  Instead, on a call to foo() the invariants are
guaranteed just because the caller must not have locked foo().

Instead, recursive locks allow you to slip into a different mindset
where locks protect the _code_ instead of the _data_ (and the invariants
of that data).  Things look simpler because you can just have

static void do_it(struct foo *foo)
{
    rec_mutex_lock(&foo->lock);
    /* ... */
    rec_mutex_unlock(&foo->lock);
}

but callers have no clue about what invariants are provided around calls
to do_it().  So, understanding complex code that uses recursive mutexes
is effectively impossible.

More on the practical side, recursive mutex are an easy way to get a
deadlock.  It's a common idiom to do

    /* Need to take foo->lock outside bar->lock.  */
    mutex_unlock(&bar->lock);
    mutex_lock(&foo->lock);
    mutex_lock(&bar->lock);

With a recursive mutex, there's no guarantee that foo->lock is *really*
taken outside bar->lock, because the first unlock could have done
nothing except decreasing the lock-nesting count.  This is also why QEMU
uses differently-named functions to lock/unlock recursive mutexes,
instead of having a flag like pthread mutexes do; it makes code like
this stand out as wrong:

    /* Meant to take foo->lock outside bar->lock, but really doesn't  */
    rec_mutex_unlock(&bar->lock);
    mutex_lock(&foo->lock);
    rec_mutex_lock(&bar->lock);

A variant of this applies to callbacks: the golden rule is that
callbacks (e.g. from a function that iterates over a data structure)
should be called without any lock taken in order to avoid deadlocks.
However, this rule is most often ignored in code that uses a recursive
mutex, because that code is written around the (incorrect) paradigm that
mutual exclusion makes code safe.  Which technically is true in this
case, as deadlocks are not a safety problem, but that's not a great
consolation.

My suggestion is to work towards protecting the audio code with its own
mutex(es) and ignore the existence of the BQL for subsystems that can do
so (audio is a prime candidate).  Also please add comments to
audio_int.h about which functions are called from other threads than the
QEMU main thread.

Thanks,

Paolo

[1] https://lamport.azurewebsites.net/pubs/teaching-concurrency.pdf

Re: [PATCH v5 1/1] audio/jack: fix use after free segfault

Reply via email to