On Tue, Dec 03, 2024 at 12:00:42AM +0300, Vitaliy Makkoveev wrote:
> > On 2 Dec 2024, at 20:47, Alexander Bluhm <bl...@openbsd.org> wrote:
> > 
> > On Mon, Dec 02, 2024 at 06:30:41PM +0100, Alexander Bluhm wrote:
> >> On Mon, Dec 02, 2024 at 05:22:23PM +0100, Alexander Bluhm wrote:
> >>> panic: Data modified on freelist: word 11 of object 0xd7632800 size 0x270 
> >>> previous type mount (0xdead410f != 0xdead4110)
> >> 
> >> It looks like the struct mount mnt_lock rwl_readers was derecemented
> >> after free.  rwl_readers = 3735896335 = 0xdead410f
> >> 
> >> This happens in vfs_busy() with RW_READ.  New rw_do_enter_read()
> >> function added call to rw_dec(&rwl->rwl_readers) there.
> >> 
> >> I have the impression that the reimplementation of rw-lock makes
> >> the problem visible.  Accessing struct mount after sleep is
> >> questionalble.  Does anything prevent a file system unount while
> >> sleeping in vfs_busy()?
> > 
> > New implementation of rw-locks is wrong.
> > 
> > vfs_lookup() checks the existance of dp->v_mountedhere
> > each time before vfs_busy() is called.
> > 
> >     while (dp->v_type == VDIR && (mp = dp->v_mountedhere) &&
> >         (cnp->cn_flags & NOCROSSMOUNT) == 0) {
> >             if (vfs_busy(mp, VB_READ|VB_WAIT))
> >                     continue;
> > 
> > vfs_busy() calls rw_enter() with RW_READ and RW_SLEEPFAIL.
> > 
> > Old rw_enter() did 
> > 
> >             error = sleep_finish(0, do_sleep);
> >             ...
> >             if (flags & RW_SLEEPFAIL)
> >                     return (EAGAIN);
> > 
> > New rw_do_enter_read() does
> > 
> >             error = sleep_finish(RW_SLEEP_TMO,
> >                 atomic_load_int(&rwl->rwl_waiters) > 0 ||
> >                 ISSET(atomic_load_long(&rwl->rwl_owner), RWLOCK_WRLOCK));
> >             ...
> >             if (ISSET(flags, RW_SLEEPFAIL)) {
> >                     error = EAGAIN;
> >                     goto fail;
> >             }
> >     ...
> > fail:
> >        rw_dec(&rwl->rwl_readers);
> >        return (error);
> > 
> > After sleep_finish() and RW_SLEEPFAIL there is no guarantee that
> > rwl is still valid memory.  This is a use-after-free in rw_do_enter_read().
> > 
> > bluhm
> 
> Everything fine with rw_do_enter_read(). It seems the problem lays in the
> vfs_busy(). How can it prevent mp being killed during context switch?

Semantics of rw_enter() changed.  Before this commit, memory of
rw-lock had to be valid until sleep if RW_SLEEPFAIL was given.  Now
it has to be valid after sleep.  That does not work with sleeping
and kernel lock.  The whole idea of RW_SLEEPFAIL is that anything
can happen during sleep.  The loop around vfs_busy() in vfs_lookup()
seems to be built for the old semantics.

I notice the problem with vfs_busy(), but any other lock with
RW_SLEEPFAIL could be wrong now.

bluhm

Reply via email to