On 03/22, Linus Torvalds wrote:
> On Tue, Mar 22, 2022 at 5:34 PM Tim Murray <[email protected]> wrote:
> >
> > AFAICT, what's happening is that rwsem_down_read_slowpath
> > modifies sem->count to indicate that there's a pending reader while
> > f2fs_ckpt holds the write lock, and when f2fs_ckpt releases the write
> > lock, it wakes pending readers and hands the lock over to readers.
> > This means that any subsequent attempt to grab the write lock from
> > f2fs_ckpt will stall until the newly-awakened reader releases the read
> > lock, which depends on the readers' arbitrarily long scheduling
> > delays.
> 
> Ugh.
> 
> So I'm looking at some of this, and you have things like this:
> 
>         f2fs_down_read(&F2FS_I(inode)->i_sem);
>         cp_reason = need_do_checkpoint(inode);
>         f2fs_up_read(&F2FS_I(inode)->i_sem);
> 
> which really doesn't seem to want a sleeping lock at all.
> 
> In fact, it's not clear that it has any business serializing with IO
> at all. It seems to just check very basic inode state. Very strange.
> It's the kind of thing that the VFS layer tends to use the i_lock
> *spinlock* for.

Um.. let me check this i_sem, introduced by
d928bfbfe77a ("f2fs: introduce fi->i_sem to protect fi's info").

OTOH, I suspected the major contention would be in
        f2fs_lock_op -> f2fs_down_read(&sbi->cp_rwsem);
which is used for most filesystem operations.

And when we need to do a checkpoint, we'd like to block internal operations with
        f2fs_lock_all -> f2fs_down_write(&sbi->cp_rwsem);

So what I expected was that the checkpoint thread would get the highest
priority by taking down_write, which blocks all the other readers.

> 
> And perhaps equally oddly, then when you do f2fs_issue_checkpoint(),
> _that_ code uses fancy lockless lists.
> 
> I'm probably mis-reading it.
> 
>              Linus

