On Thu, May 21, 2026 at 5:05 AM Matthew Wilcox wrote:
>
> On Wed, May 20, 2026 at 06:01:56AM +0800, Barry Song wrote:
> > > implied is that the per-vma locking may stall mmap_lock writes for
> > > longer than if the mmap_lock was taken in read mode? Barry, is that
> > > correct?
> >
> > Not the case — the actual situation is (if we modify the
> > current kernel to perform I/O without releasing VMA read locks):
> >
> > thread 1 PF: lock vma1 read ---- IO ----- ;
> > thread 2 PF: lock vma2 read ----- IO ----- ;
> > thread 3 PF: lock vma3 read ---- IO ----- ;
> > thread 4 fork: mmap_lock_write ---- lock vma1, vma2, vma3 write ;
> > thread 5 : take mmap_lock for any read/write reason
> >
> > Now you can see that thread 4 has to wait for the I/O of
> > VMA1, VMA2, and VMA3 to complete, and thread 5 then has to
> > wait for thread 4 to release mmap_lock. Both thread 4 and
> > thread 5 can become extremely slow, because I/O may be stuck
> > anywhere in the bio/request queue or filesystem GC.
> >
> > So now we have two choices:
> >
> > 1. Change fork() to avoid taking the vma write lock for vma1/2/3 where
> >    possible;
> > 2. Keep the current kernel behavior and drop the VMA lock before I/O:
>
> Option 3: Say that this is a very silly thing to optimise for. I have a
> hard time believing that any application will care about the latency of
> fork(), or the latency of page faults while it's in the middle of fork().
> Multithreaded applications just don't fork that often!

My understanding is that we should not blame applications here. This is 2026: there are basically only two kinds of applications — single-threaded and multi-threaded — and single-threaded applications are nearly extinct.
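
To put rough numbers on the stall described in the quoted diagram, here is a toy timing model. It is only a sketch, not kernel code: the function names (fork_stall, waiter_latency) and all durations are made up for illustration, and it ignores everything except "fork cannot write-lock a VMA while a fault still holds its read lock" and "thread 5 queues behind fork on mmap_lock".

```python
# Toy timing model of the quoted scenario. NOT kernel code.
# All names and millisecond values here are hypothetical.

def fork_stall(io_done_ms, fork_start_ms, fork_work_ms, drop_vma_lock_before_io):
    """Return (time fork holds all VMA write locks, time it releases mmap_lock).

    io_done_ms: completion time of the I/O issued by each faulting thread.
    drop_vma_lock_before_io: True models the current behaviour (the fault
    path drops the VMA read lock before waiting on I/O); False models the
    modified kernel that keeps the read lock across the I/O.
    """
    if drop_vma_lock_before_io:
        # No fault holds a VMA read lock while sleeping on I/O,
        # so fork is not delayed by the I/O at all in this model.
        write_locked_at = fork_start_ms
    else:
        # A VMA write lock needs every reader of that VMA to be gone,
        # so fork waits for the slowest outstanding I/O.
        write_locked_at = max([fork_start_ms, *io_done_ms])
    mmap_lock_released_at = write_locked_at + fork_work_ms
    return write_locked_at, mmap_lock_released_at


def waiter_latency(arrival_ms, mmap_lock_released_at):
    """How long a later mmap_lock taker (thread 5) waits behind fork."""
    return max(0, mmap_lock_released_at - arrival_ms)


if __name__ == "__main__":
    io_done = [40, 250, 90]  # threads 1-3; one I/O is stuck for 250 ms
    for drop in (False, True):
        locked, released = fork_stall(io_done, 10, 5, drop)
        print(drop, locked, released, waiter_latency(12, released))
```

With these made-up numbers, holding the VMA read locks across I/O makes fork wait until 250 ms and thread 5 wait 243 ms, while dropping the lock before I/O leaves thread 5 waiting only 3 ms. The point is just that both waits are bounded by the slowest I/O in the first case, not by anything fork itself does.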
