On Thu, Apr 28, 2022 at 7:09 PM Linus Torvalds <torva...@linux-foundation.org> wrote:
> On Thu, Apr 28, 2022 at 6:27 AM Andreas Gruenbacher <agrue...@redhat.com> wrote:
> >
> > The data corruption we've been getting unfortunately didn't have to do
> > with lock contention (we already knew that); it still occurs. I'm
> > running out of ideas on what to try there.
>
> Hmm.
>
> I don't see the bug, but I do have a suggestion on something to try.
>
> In particular, you said the problem started with commit 00bfe02f4796
> ("gfs2: Fix mmap + page fault deadlocks for buffered I/O").

Yes, but note that it's gfs2_file_buffered_write() that fails. When the
pagefault_disable/enable() around iomap_file_buffered_write() is
removed, the corruption goes away.

> And to me, I see two main things that are going on
>
>  (a) the obvious "calling generic IO functions with pagefault disabled" thing
>
>  (b) the "allow demotion" thing
>
> And I wonder if you could at least pinpoint which of the cases it is
> that triggers it.
>
> So I'd love to see you try three things:
>
>  (1) just remove the "allow demotion" cases.
>
>      This will re-introduce the deadlock the commit is trying to fix,
>      but that's such a special case that I assume you can run your
>      test-suite that shows the problem even without that fix in place?
>
>      This would just pinpoint whether it's due to some odd locking issue or
>      not.
>
> Honestly, from how you describe the symptoms, I don't think (1) is the
> cause, but I think making sure is good.
>
> It sounds much more likely that it's one of those generic vfs
> functions that screws up when a page fault happens and it gets a
> partial result instead of handling the fault.

The test should run just fine without allowing demotion. I'll try (1),
but I don't expect the outcome to change.

> Which gets us to
>
>  (2) remove the pagefault_disable/enable() around just the
>      generic_file_read_iter() case in gfs2_file_read_iter().
>
> and
>
>  (3) finally, remove the pagefault_disable/enable() around the
>      iomap_file_buffered_write() case in gfs2_file_buffered_write()
>
> Yeah, yeah, you say it's just the read that fails, but humor me on
> (3), just in case it's an earlier write in your test-suite and the
> read just then uncovered it.
>
> But I put it as (3) so that you'd do the obvious (2) case first, and
> narrow it down (ie if (1) still shows the bug, then do (2), and if
> that fixes the bug it will be fairly well pinpointed to
> generic_file_read_iter().

As mentioned above, we already did (3) and it did help. I'll do (1) now,
and then (2).
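
For illustration only, here is a userspace sketch of the failure mode
suspected above: a copy done with page faults disabled can stop short,
and the caller has to notice the partial result, fault the page in, and
retry. All sim_* names, the tiny page size, and the "resident" flags are
hypothetical stand-ins; none of this is gfs2, iomap, or VFS code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE_SIZE 4   /* tiny "pages" keep the example readable */
#define SIM_NPAGES    4

/* Hypothetical stand-in for a user buffer whose pages may not be resident. */
struct sim_user_buf {
	char data[SIM_PAGE_SIZE * SIM_NPAGES];
	bool resident[SIM_NPAGES];
};

/*
 * Copy up to len bytes to offset off, but stop at the first non-resident
 * page instead of faulting it in. Returns the number of bytes copied,
 * which may be less than len (the "partial result").
 */
static size_t sim_copy_nofault(struct sim_user_buf *dst, size_t off,
			       const char *src, size_t len)
{
	size_t copied = 0;

	while (copied < len) {
		size_t pos = off + copied;
		size_t page = pos / SIM_PAGE_SIZE;
		size_t chunk;

		if (page >= SIM_NPAGES || !dst->resident[page])
			break;	/* would have faulted: give up early */

		chunk = SIM_PAGE_SIZE - pos % SIM_PAGE_SIZE;
		if (chunk > len - copied)
			chunk = len - copied;
		memcpy(dst->data + pos, src + copied, chunk);
		copied += chunk;
	}
	return copied;
}

/* Simulated fault-in: make the page containing off resident. */
static void sim_fault_in(struct sim_user_buf *dst, size_t off)
{
	size_t page = off / SIM_PAGE_SIZE;

	if (page < SIM_NPAGES)
		dst->resident[page] = true;
}

/*
 * Caller-side loop: on a short copy, fault in the page at the point
 * where the copy stopped and resume from there. A caller that treated
 * the short count as complete would silently lose the remaining data.
 */
static size_t sim_copy_retry(struct sim_user_buf *dst, size_t off,
			     const char *src, size_t len)
{
	size_t done = 0;

	while (done < len) {
		done += sim_copy_nofault(dst, off + done, src + done,
					 len - done);
		if (done == len)
			break;
		if (off + done >= sizeof(dst->data))
			break;	/* past the end of the buffer: nothing to fault in */
		sim_fault_in(dst, off + done);
	}
	return done;
}
```

With only the first page resident, sim_copy_nofault() stops at the page
boundary, while sim_copy_retry() completes the copy by resuming from the
returned count.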

> Looking around, gfs2 is the only thing that obviously calls
> generic_file_read_iter() with pagefaults disabled, so it does smell
> like filemap_read() might have some issue, but the only thing that
> does is basically that
>
>     copied = copy_folio_to_iter(folio, offset, bytes, iter);
>
> which should just become copy_page_to_iter_iovec(), which you'd hope
> would get things right.
>
> But it would be good to just narrow things down a bit.
>
> I'll look at that copy_page_to_iter_iovec() some more regardless, but
> doing that "let's double-check it's not something else" would be good.

We've actually been running most of our experiments on a 5.14-based
kernel with a plethora of backports, so pre-folio. Sorry I forgot to
mention that. I'll reproduce with mainline as well.

Thanks,
Andreas