Date: Tue, 3 May 2016 10:02:44 +0000 From: [email protected] I fear that WAPBL should be locking wl_mtx before running wapbl_transaction_len (in wapbl_flush)
I suspect so because it performs a similar lock before looking at wl_bufcount in sys/kern/vfs_wapbl.c:1455-1460. If true, how about this diff? Certainly there's something fishy here! Unfortunately, just adding that mutex_enter/exit there won't work, because at least one caller, wapbl_begin, calls wapbl_transaction_len with wl->wl_mtx already held: http://nxr.netbsd.org/xref/src/sys/kern/vfs_wapbl.c?r=1.64#924 It looks like wl->wl_mtx should be held inside wapbl_transaction_len, based on the struct wapbl fields it uses and the locking legend at the top, <http://nxr.netbsd.org/xref/src/sys/kern/vfs_wapbl.c?r=1.64#102>. So I'm inclined to suggest adding KASSERT(mutex_owned(&wl->wl_mtx)) to it -- but currently there are some callers that don't hold the mutex: - http://nxr.netbsd.org/xref/src/sys/kern/vfs_wapbl.c?r=1.64#352 (I think this is safe because done only at initialization, when no other threads can have access to it. But it is also harmless to acquire the lock here.) - http://nxr.netbsd.org/xref/src/sys/kern/vfs_wapbl.c?r=1.64#979 (This should be a KASSERTMSG, not an #ifdef-DIAGNOSTIC-panic, and could easily be moved into the mutex_enter/exit just below.) - http://nxr.netbsd.org/xref/src/sys/kern/vfs_wapbl.c?r=1.64#1489 (This one is a little hairier: might need to push the call into wapbl_truncate, or require the caller of wapbl_truncate to hold the lock on entry and to accept that it will be released/reacquired.)
