> On 13. Aug 2018, at 19:25, Emmanuel Dreyfus <m...@netbsd.org> wrote:
>
> On Mon, Aug 13, 2018 at 11:56:45AM +0000, Taylor R Campbell wrote:
>> Unless I misunderstand fss(4), this is an abuse of mutex(9): nothing
>> should sleep while holding the lock, so that nothing trying to acquire
>> the lock will wait for a long time.
>
> Well, the cause is not yet completely clear to me, but the user
> experience is terrible. The first time I used it, I thought
> the system crashed, because fssconfig -l was just hung for
> hours.
>
> And it is very easy to achieve a situation where most processes
> are in tstile awaiting a vnode lock for a name lookup.

I see two problems here.

1) File system internal snapshots take a long time to create or
   destroy on large file systems. I have no solution to this
   problem. Using file system external snapshots for dumps should
   work fine.

2) Fss devices block in ioctl while a snapshot gets created or
   destroyed. A possible fix is to replace the current
   active/non-active state with idle/creating/active/destroying
   and change FSSIOCSET to

	mutex_enter(&sc->sc_lock);
	if (sc->sc_state != FSS_IDLE) {
		mutex_exit(&sc->sc_lock);
		return EBUSY;
	}
	sc->sc_state = FSS_CREATING;
	mutex_exit(&sc->sc_lock);

	error = fss_create_snapshot();

	mutex_enter(&sc->sc_lock);
	if (error)
		sc->sc_state = FSS_IDLE;
	else
		sc->sc_state = FSS_ACTIVE;
	mutex_exit(&sc->sc_lock);

	return error;

The problem here is backwards compatibility. I have no idea what to
return for FSSIOCGET when the state is creating or destroying.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
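
For illustration only, the state handling proposed above can be modelled in user space. This is a rough sketch, not the fss(4) driver code: it assumes a pthread mutex in place of mutex(9), a cut-down struct fss_softc with just the two fields used here, and hypothetical callbacks (create_ok, create_fail, create_probe) standing in for the slow snapshot creation. The name fss_ioc_set is also made up for this sketch.

```c
#include <errno.h>
#include <pthread.h>

/* Proposed four-state replacement for the active/non-active flag. */
enum fss_state { FSS_IDLE, FSS_CREATING, FSS_ACTIVE, FSS_DESTROYING };

/* Cut-down stand-in for the driver softc (assumption, not the real one). */
struct fss_softc {
	pthread_mutex_t sc_lock;	/* stands in for the kernel mutex */
	enum fss_state sc_state;
};

/* Hypothetical callback type standing in for the slow snapshot creation. */
typedef int (*fss_create_fn)(struct fss_softc *);

int probe_result = -1;		/* result of the nested call in create_probe */

/* Model of the proposed FSSIOCSET path: the lock is never held while
 * the snapshot is being created, only around the state changes. */
int
fss_ioc_set(struct fss_softc *sc, fss_create_fn create)
{
	int error;

	pthread_mutex_lock(&sc->sc_lock);
	if (sc->sc_state != FSS_IDLE) {
		pthread_mutex_unlock(&sc->sc_lock);
		return EBUSY;
	}
	sc->sc_state = FSS_CREATING;
	pthread_mutex_unlock(&sc->sc_lock);

	error = create(sc);	/* may take very long; lock is not held */

	pthread_mutex_lock(&sc->sc_lock);
	sc->sc_state = error ? FSS_IDLE : FSS_ACTIVE;
	pthread_mutex_unlock(&sc->sc_lock);

	return error;
}

/* Hypothetical creation callbacks used to exercise the model. */
int create_ok(struct fss_softc *sc) { (void)sc; return 0; }
int create_fail(struct fss_softc *sc) { (void)sc; return EIO; }

/* Issues a second "ioctl" while creation is in progress and records
 * what that caller would see. */
int
create_probe(struct fss_softc *sc)
{
	probe_result = fss_ioc_set(sc, create_ok);
	return 0;
}
```

In this model a second caller arriving while the state is FSS_CREATING gets EBUSY immediately instead of sleeping on the lock. The FSSIOCGET question is left open here as well.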