> Date: Sun, 02 Feb 2020 14:25:12 +0900
> From: Tetsuya Isaki <is...@pastel-flower.jp>
> 
> According to audio.c comment, vdevgone() calls close().  But it is
> not called actually.

vdevgone calls VOP_REVOKE on the vnode, which _used_ to call
audio_close, but it does essentially nothing for cloning file
descriptors as we have used since sys/dev/audio.c r1.302.

Instead of vdevgone,

- audiodetach must go through sc->sc_files and render them invalid,
  and

- audioread, audiowrite, &c. -- all the audio_fileops -- must be able
  to handle an invalid file gracefully by returning error.

But it's tricky because audioread and audiowrite need to get a handle
on the softc, without holding any locks, while audiodetach may be
racing to destroy the softc.

Here's one way you could do it, using pserialize(9) in
audioread/audiowrite to reliably tell whether the file is still valid,
and psref(9) to get a handle on the softc:

1. Add struct pserialize * and struct psref_target to struct
   audio_softc so we can passive references to it:

        struct audio_softc {
                ...
                struct pserialize       *sc_psz;
                struct psref_target     sc_psref;
                ...
        };

   (Create/destroy an audio psref class in MODULE_CMD_INIT/FINI, and
   then do pserialize_create/destroy and psref_target_init/destroy in
   audioattach/detach.)

2. In audioread, audiowrite, &c., get the sc and acquire a psref to it
   in a pserialize read section:

        static int
        audioread(struct file *fp, ...)
        {
                struct audio_file *file = fp->f_audioctx;
                struct audio_softc *sc;
                struct psref sc_ref;
                int error;
                int s;

                /* Block audiodetach while we acquire a reference.  */
                s = pserialize_read_enter();

                /* If audiodetach already ran, tough -- no more audio.  */
                if ((sc = atomic_load_relaxed(&file->sc)) == NULL) {
                        pserialize_read_exit(s);
                        return ENXIO;
                }

                /* Acquire a reference.  */
                psref_acquire(&sc_ref, &sc->sc_psref, audio_psref_class);

                /* Now sc won't go away until we drop the reference count.  */
                pserialize_read_exit(s);

                ... use sc->sc_lock, do the read, &c. ...
                ... make sure sc_dying prevents long waits ...

                /* Release the reference.  */
                psref_release(&sc_ref, &sc->sc_psref, audio_psref_class);

                return error;
        }

3. In audio_close, remove it from the list _only_ if it's still valid.

        audio_close(sc, file)
        {
                ...
                mutex_enter(sc->sc_intr_lock);
                /* Detach races to remove us from the list too.  */
                if (file->sc) {
                        KASSERT(sc == file->sc);
                        SLIST_REMOVE(&sc->sc_files, file, audio_file, entry);
                        file->sc = NULL; /* paranoia */
                }
                mutex_exit(sc->sc_intr_lock);
                ...
        }

3. In audiodetach, you need to (a) prevent new users, (b) wait for
   existing users to drain, (c) free it.  So, after setting
   sc->sc_dying, and in the place of the logic that currently does
   vdevgone:

        audiodetach()
        {
                struct audio_file *file;
                ...
                /* (a) Prevent new users.  */
                mutex_enter(sc->sc_intr_lock);
                while ((file = SLIST_FIRST(&sc->sc_files)) != NULL) {
                        SLIST_REMOVE_HEAD(&sc->sc_files, entry);
                        atomic_store_relaxed(&file->sc, NULL);
                }
                mutex_exit(sc->sc_intr_lock);

                /*
                 * (b) Wait for existing users to drain:
                 *     - pserialize_perform waits for all
                 *       pserialize_read sections on all CPUs; after
                 *       this, no more new psref_acquire can happen.
                 *     - psref_target_destroy waits for all extant
                 *       acquired psrefs to be psref_released.
                 */
                mutex_enter(sc->sc_lock);
                pserialize_perform(sc->sc_psz);
                mutex_exit(sc->sc_lock);
                psref_target_destroy(&sc->sc_psref, audio_psref_class);

                /*
                 * We are now guaranteed that there are no calls to
                 * audio fileops that hold sc, and any new calls with
                 * files that were for sc will fail.  Thus, we now
                 * have exclusive access to the softc.
                 */
                callout_destroy(...);
                kmem_free(...);
        }


Side notes after reviewing audio.c:

- You will need to find a way to ensure that audio_close can free all
  of its resources unconditionally: although it can technically return
  an error code, it can't fail to free resources; close is final and
  can never be retried.  The file descriptor will be freed by closef
  even if .fo_close returns an error code.

  If audio_close really really needs to hold the long-term exclusive
  lock (sc_exlock), then you need to provide a way to interrupt anyone
  holding the exclusive lock so that audio_close can proceed without
  blocking indefinitely.  But if there's any way to avoid taking the
  long-term exclusive lock at all, that would be better.

- kmem_alloc/free (or any other sleeping allocation -- for example,
  softint_establish/disestablish) is not allowed while holding a lock.
  It may work accidentally sometimes, but it's wrong for audio to do
  -- it can lead to deadlocks, uninterruptible blocking waits, &c.

- KASSERT(!mutex_owned(lock)) is not generally allowed.  If you're
  about to do mutex_enter(lock) anyway, just skip the kassert --
  mutex_enter will detect a recursive lock and panic itself.

Reply via email to