On Wed, May 20, 2026 at 12:33:56PM +0200, David Hildenbrand (Arm) wrote:
> On 5/19/26 14:53, Lorenzo Stoakes wrote:
> > On Mon, May 18, 2026 at 12:56:59PM -0700, Suren Baghdasaryan wrote:
> >
> >>>
> >>> I think we either need to fix `fork()`, or keep the current
> >>> behavior of dropping the VMA lock before performing I/O.
> >>
> >> I see. So, this problem arises from the fact that we are changing the
> >> pagefaults requiring I/O operation to hold VMA lock...
> >> And you want to lock VMA on fork only if vma_is_anonymous(vma) ||
> >> is_cow_mapping(vma->vm_flags). So, we will be blocking page faults for
> >> anonymous and COW VMAs only while holding mmap_write_lock, preventing
> >> any VMA modification. On the surface, that looks ok to me but I might
> >> be missing some corner cases. If nobody sees any obvious issues, I
> >> think it's worth a try.
> >
> > Not sure if you noticed but I did raise concerns ;)
> >
> > I wonder if you've confused the fault path and fork here, as I think
> > Barry has been a little unclear on that.
> >
> > What's being suggested in this thread is to fundamentally change fork
> > behaviour so it's different from the entire history of the kernel (or -
> > presumably - at least recent history :)
>
> I don't want fork() to become different in that regard.
>
> There is already a slight difference with vs. without per-VMA locks, because
> there is a window in-between us taking the write mmap_lock and all the per-VMA
> locks. I raised that previously [1] and assumed that it is probably fine.
>
> I also raised in the past why I think we must not allow concurrent page
> faults, at least as soon as anonymous memory is involved [2].
>
> ... and I raised that this is pretty much slower by design right now: "Well,
> the design decision that CONFIG_PER_VMA_LOCK made for now to make page faults
> fast and to make blocking any page faults from happening to be slower ..." [3]

Thanks for the background, will read through! :)

But yeah, I think the transition from !vma->anon_vma -> vma->anon_vma being a
bit slow is kinda ok, as most page faults will of course have anon_vma
populated.

Be interesting with CoW context, because we won't need to mmap read lock there
at all :)

>
> [1]
> https://lore.kernel.org/all/[email protected]/
> [2]
> https://lore.kernel.org/all/[email protected]/
> [3]
> https://lore.kernel.org/all/[email protected]/
>
> --
> Cheers,
>
> David

Cheers, Lorenzo
