On Wed, May 20, 2026 at 3:50 PM Lorenzo Stoakes <[email protected]> wrote:
>
> On Wed, May 20, 2026 at 05:18:52AM +0800, Barry Song wrote:
> > On Tue, May 19, 2026 at 8:53 PM Lorenzo Stoakes <[email protected]> wrote:
> > >
> > > On Mon, May 18, 2026 at 12:56:59PM -0700, Suren Baghdasaryan wrote:
> > > >
> > > > > I think we either need to fix `fork()`, or keep the current
> > > > > behavior of dropping the VMA lock before performing I/O.
> > > >
> > > > I see. So, this problem arises from the fact that we are changing the
> > > > pagefaults requiring I/O operation to hold VMA lock...
> > > > And you want to lock VMA on fork only if vma_is_anonymous(vma) ||
> > > > is_cow_mapping(vma->vm_flags). So, we will be blocking page faults for
> > > > anonymous and COW VMAs only while holding mmap_write_lock, preventing
> > > > any VMA modification. On the surface, that looks ok to me but I might
> > > > be missing some corner cases. If nobody sees any obvious issues, I
> > > > think it's worth a try.
> > >
> > > Not sure if you noticed but I did raise concerns ;)
> > >
> > > I wonder if you've confused the fault path and fork here, as I think
> > > Barry has been a little unclear on that.
> >
> > I think I’ve been absolutely clear :-)
>
> On this point sure, I would argue less so around the fork stuff but I
> responded on that specifically elsewhere so let's keep things moving :>)
>
> > We should either stick to the current behavior - drop
> > the VMA lock before doing I/O, or change fork() so that it
> > does not wait on vma_start_write().
>
> Again, as I said elsewhere, I think there might be a 3rd way possibly. It's a
> big mistake to assume that there are only specific solutions to problems in
> the kernel then to present a false dichotomy.

I recall that when we discussed this part in my slides:

‘For simplicity, rather than using a whitelist mechanism for per-VMA
retry, we could use a blacklist instead: default to always retry via
the VMA lock, and only allow mmap_lock-based page-fault retry for
specific cases such as __vmf_anon_prepare().’

Suren mentioned introducing a FALLBACK flag. With the FALLBACK flag,
we would retry via mmap_lock; with the RETRY flag, we would retry via
the VMA lock.

Not sure whether this could really be called a ‘third way’. It seems
more like a shift from a whitelist model to a blacklist model: the
fundamental design stays the same, but it does change where we would
need to touch the source code.

>
> We absolutely hear you on this being a problem and it WILL be addressed one
> way or another.

Thanks. This is a bit of light in what has felt like a fairly dark
situation. I really appreciate your thoughtful and responsible approach.

>
> Of the two approaches, as I said elsewhere, I prefer what you've done in this
> series to anything touching fork.
>
> But give me time to look through the series please (I'd also suggest RFC'ing
> when it's something kinda fundamental that might generate conversation, makes
> life a bit easier on the review side :)

Thanks! Sure, I’m happy to wait and there’s no urgency.

Last year you made quite a significant contribution to the work when I
tried to remove mmap_lock in madvise. I really appreciated it. Now we’re
back to the same lock again, just in different places.

Best Regards
Barry
