On Thu 22-04-21 14:51:42, Andreas Gruenbacher wrote:
> Hi Jan,
> 
> On Thu, Apr 22, 2021 at 1:26 PM Jan Kara <j...@suse.cz> wrote:
> > I am looking into how GFS2 protects against races between hole punching and
> > things like page fault or readahead and AFAICT it seems it does not. In
> > particular is there anything that protects against a race like:
> >
> > CPU1                                    CPU2
> > gfs2_fallocate()
> >   __gfs2_punch_hole()
> >     truncate_pagecache_range()
> >                                         gfs2_fault()
> >                                           - faults in old data into page
> >                                             cache
> >     punch_hole()
> >
> > And now we have stale data in the page cache (data corruption). If
> > gfs2_page_mkwrite() sneaked in that window as well, we might be even racing
> > with writeback and are possibly corrupting the filesystem on disk. Is there
> > anything I'm missing?
> 
> This is controlled by the inode glock (ip->i_gl). Readers are holding
> a shared lock while writers are holding an exclusive lock, similar to
> an rwlock. Those locks take effect cluster wide (via DLM locks) and
> down to individual tasks locally. The only exception are resource
> group glocks, which use the LM_FLAG_NODE_SCOPE flag to indicate that
> exclusive locks should be shared locally. In that case, rgd->rd_mutex
> provides the exclusion among local tasks.

OK, thanks for explanation! I missed that GFS2 glocks are task local. But
then I have another question. We have the following:

gfs2_file_read_iter()
  grabs inode glock in shared mode
  generic_file_read_iter()
    filemap_read()
      copy_page_to_iter()
        <page fault>
        grabs mmap_sem
          gfs2_fault()
            - tries to grab inode glock again -> possible recursive locking

And even if the locking isn't recursive, you have glock->mmap_sem and
mmap_sem->glock lock orderings so ABBA deadlocks are possible AFAICT.

And there's a similar problem with the write path as well, just the lock is
grabbed exclusively.

                                                                Honza
-- 
Jan Kara <j...@suse.com>
SUSE Labs, CR

Reply via email to