On Tue, Dec 03, 2024 at 12:00:42AM +0300, Vitaliy Makkoveev wrote: > > On 2 Dec 2024, at 20:47, Alexander Bluhm <bl...@openbsd.org> wrote: > > > > On Mon, Dec 02, 2024 at 06:30:41PM +0100, Alexander Bluhm wrote: > >> On Mon, Dec 02, 2024 at 05:22:23PM +0100, Alexander Bluhm wrote: > >>> panic: Data modified on freelist: word 11 of object 0xd7632800 size 0x270 > >>> previous type mount (0xdead410f != 0xdead4110) > >> > >> It looks like the struct mount mnt_lock rwl_readers was derecemented > >> after free. rwl_readers = 3735896335 = 0xdead410f > >> > >> This happens in vfs_busy() with RW_READ. New rw_do_enter_read() > >> function added call to rw_dec(&rwl->rwl_readers) there. > >> > >> I have the impression that the reimplementation of rw-lock makes > >> the problem visible. Accessing struct mount after sleep is > >> questionalble. Does anything prevent a file system unount while > >> sleeping in vfs_busy()? > > > > New implementation of rw-locks is wrong. > > > > vfs_lookup() checks the existance of dp->v_mountedhere > > each time before vfs_busy() is called. > > > > while (dp->v_type == VDIR && (mp = dp->v_mountedhere) && > > (cnp->cn_flags & NOCROSSMOUNT) == 0) { > > if (vfs_busy(mp, VB_READ|VB_WAIT)) > > continue; > > > > vfs_busy() calls rw_enter() with RW_READ and RW_SLEEPFAIL. > > > > Old rw_enter() did > > > > error = sleep_finish(0, do_sleep); > > ... > > if (flags & RW_SLEEPFAIL) > > return (EAGAIN); > > > > New rw_do_enter_read() does > > > > error = sleep_finish(RW_SLEEP_TMO, > > atomic_load_int(&rwl->rwl_waiters) > 0 || > > ISSET(atomic_load_long(&rwl->rwl_owner), RWLOCK_WRLOCK)); > > ... > > if (ISSET(flags, RW_SLEEPFAIL)) { > > error = EAGAIN; > > goto fail; > > } > > ... > > fail: > > rw_dec(&rwl->rwl_readers); > > return (error); > > > > After sleep_finish() and RW_SLEEPFAIL there is no guarantee that > > rwl is still valid memory. This is a use-after-free in rw_do_enter_read(). > > > > bluhm > > Everything fine with rw_do_enter_read(). It seems the problem lays in the > vfs_busy(). How can it prevent mp being killed during context switch?
Semantics of rw_enter() changed. Before this commit, memory of rw-lock had to be valid until sleep if RW_SLEEPFAIL was given. Now it has to be valid after sleep. That does not work with sleeping and kernel lock. The whole idea of RW_SLEEPFAIL is that anything can happen during sleep. The loop around vfs_busy() in vfs_lookup() seems to be built for the old semantics. I notice the problem with vfs_busy(), but any other lock with RW_SLEEPFAIL could be wrong now. bluhm