Date: Mon, 19 Sep 2016 01:36:14 +0000
From: David Holland <dholland-sourcechan...@netbsd.org>
On Sun, Sep 18, 2016 at 09:52:37PM +0000, Jaromir Dolecek wrote:
 > Modified Files:
 > 	src/sys/dev/ic: ld_nvme.c
 >
 > Log Message:
 > must use PR_NOWAIT also during ldattach()/dkwedge discover, our
 > i/o is there called with a spin lock held, which triggers
 > LOCKDEBUG panic

That sounds like it should be corrected upstream of nvme...?

Here's the relevant call tree:

   ld_diskstart
      acquires sc->sc_mutex (an IPL_VM spin lock)
      calls sc->sc_start = ld_nvme_start
         calls ld_nvme_dobio
            calls nvme_ns_get_ctx with PR_WAITOK or PR_NOWAIT
      releases sc->sc_mutex

In this call tree, it is *never* correct to use PR_WAITOK, because the
spin lock is unconditionally held. And the only other caller of
ld_nvme_dobio is ld_nvme_dump, which is probably not allowed to block
either for other reasons -- which presumably led to the abuse of
cpu_(soft)intr_p here.

Instead, you should prime nvme_ns_ctx_cache up front, e.g. by calling
pool_cache_setlowat in ld_nvme_attach (and maintaining a count of the
nvme instances), and always use PR_NOWAIT.

Also, you need to handle failure in nvme_ns_get_ctx. From a cursory
examination of dk_start, it looks like returning EAGAIN will DTRT --
dk_start will retry the block later.

(Incidentally, it looks like nvme is statically wired to use a maximum
queue depth of 128. In that case, you could just use a fixed array of
128 entries -- that's just a few pages of RAM. But maybe you want to
be able to change the queue depth later.)