Re: [Cluster-devel] [GIT PULL] gfs2 fix

Andreas Gruenbacher Thu, 28 Apr 2022 10:39:23 -0700

On Thu, Apr 28, 2022 at 7:09 PM Linus Torvalds
<torva...@linux-foundation.org> wrote:
> On Thu, Apr 28, 2022 at 6:27 AM Andreas Gruenbacher <agrue...@redhat.com> wrote:
> >
> > The data corruption we've been getting unfortunately didn't have to do
> > with lock contention (we already knew that); it still occurs. I'm
> > running out of ideas on what to try there.
>
> Hmm.
>
> I don't see the bug, but I do have a suggestion on something to try.
>
> In particular, you said the problem started with commit 00bfe02f4796
> ("gfs2: Fix mmap + page fault deadlocks for buffered I/O").


Yes, but note that it's gfs2_file_buffered_write() that fails. When
the pagefault_disable/enable() around iomap_file_buffered_write() is
removed, the corruption goes away.
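
For context on the pattern involved: with page faults disabled, a copy
to or from user memory stops short at a page that is not resident
instead of faulting it in, and the caller is expected to fault the page
in and retry. Below is a minimal userspace sketch of such a retry loop
in C. It is not gfs2 or iomap code; every name in it (sim_user_buf,
sim_copy_nofault, sim_fault_in, sim_copy_with_retry) is hypothetical,
and the 4-byte "page" size is chosen only to keep the example easy to
trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4		/* tiny "pages" so the example is easy to trace */
#define SIM_NPAGES    4

/* Hypothetical stand-in for a user buffer whose pages may not be resident. */
struct sim_user_buf {
	char data[SIM_PAGE_SIZE * SIM_NPAGES];
	bool resident[SIM_NPAGES];
};

/*
 * Copy up to len bytes from src into dst at offset off, stopping at the
 * first non-resident page. Returns the number of bytes actually copied,
 * which may be short. This imitates a copy done with page faults disabled.
 */
static size_t sim_copy_nofault(struct sim_user_buf *dst, size_t off,
			       const char *src, size_t len)
{
	size_t copied = 0;

	while (copied < len && off + copied < sizeof(dst->data)) {
		size_t pos = off + copied;
		size_t chunk = SIM_PAGE_SIZE - pos % SIM_PAGE_SIZE;

		if (!dst->resident[pos / SIM_PAGE_SIZE])
			break;			/* would fault: give up early */
		if (chunk > len - copied)
			chunk = len - copied;
		memcpy(dst->data + pos, src + copied, chunk);
		copied += chunk;
	}
	return copied;
}

/* Hypothetical "fault in" step: make the page containing off resident. */
static void sim_fault_in(struct sim_user_buf *dst, size_t off)
{
	dst->resident[off / SIM_PAGE_SIZE] = true;
}

/*
 * Caller-side loop: advance by whatever was copied, fault in the page
 * that stopped the copy, and retry until done or the buffer ends.
 */
static size_t sim_copy_with_retry(struct sim_user_buf *dst, size_t off,
				  const char *src, size_t len)
{
	size_t done = 0;

	while (done < len && off + done < sizeof(dst->data)) {
		size_t n = sim_copy_nofault(dst, off + done, src + done,
					    len - done);

		assert(n <= len - done);
		done += n;
		if (done < len && off + done < sizeof(dst->data))
			sim_fault_in(dst, off + done);
	}
	return done;
}
```

With only the first page resident, sim_copy_nofault() on a 10-byte
write returns a short count, while sim_copy_with_retry() completes the
write by faulting in one page at a time. A caller that mishandles the
short count, for example by not advancing src by the bytes already
copied, is one way such a loop could corrupt data; whether that is what
happens in this bug is not established in this thread.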

> And to me, I see two main things that are going on
>
>  (a) the obvious "calling generic IO functions with pagefault disabled" thing
>
>  (b) the "allow demotion" thing
>
> And I wonder if you could at least pinpoint which of the cases it is
> that triggers it.
>
> So I'd love to see you try three things:
>
>  (1) just remove the "allow demotion" cases.
>
>      This will re-introduce the deadlock the commit is trying to fix,
> but that's such a special case that I assume you can run your
> test-suite that shows the problem even without that fix in place?
>
>      This would just pinpoint whether it's due to some odd locking issue or not.
>
> Honestly, from how you describe the symptoms, I don't think (1) is the
> cause, but I think making sure is good.
>
> It sounds much more likely that it's one of those generic vfs
> functions that screws up when a page fault happens and it gets a
> partial result instead of handling the fault.

The test should run just fine without allowing demotion. I'll try (1),
but I don't expect the outcome to change.

> Which gets us to
>
>  (2) remove the pagefault_disable/enable() around just the
> generic_file_read_iter() case in gfs2_file_read_iter().
>
> and
>
>  (3) finally, remove the pagefault_disable/enable() around the
> iomap_file_buffered_write() case in gfs2_file_buffered_write()
>
> Yeah, yeah, you say it's just the read that fails, but humor me on
> (3), just in case it's an earlier write in your test-suite and the
> read just then uncovered it.
>
> But I put it as (3) so that you'd do the obvious (2) case first, and
> narrow it down (ie if (1) still shows the bug, then do (2), and if
> that fixes the bug it will be fairly well pinpointed to
> generic_file_read_iter()).

As mentioned above, we already did (3) and it did help. I'll do (1)
now, and then (2).

> Looking around, gfs2 is the only thing that obviously calls
> generic_file_read_iter() with pagefaults disabled, so it does smell
> like filemap_read() might have some issue, but the only thing that
> does is basically that
>
>                 copied = copy_folio_to_iter(folio, offset, bytes, iter);
>
> which should just become copy_page_to_iter_iovec(), which you'd hope
> would get things right.
>
> But it would be good to just narrow things down a bit.
>
> I'll look at that copy_page_to_iter_iovec() some more regardless, but
> doing that "let's double-check it's not something else" would be good.

We've actually been running most of our experiments on a 5.14-based
kernel with a plethora of backports, so pre-folio. Sorry I forgot to
mention that. I'll reproduce with mainline as well.

Thanks,
Andreas
