On Thu, Nov 06, 2025 at 12:31:29PM +0100, Vlastimil Babka wrote: > On 11/6/25 11:46, Lorenzo Stoakes wrote: > > This patch adds the ability to atomically set VMA flags with only the mmap > > read/VMA read lock held. > > > > As this could be hugely problematic for VMA flags in general given that all > > other accesses are non-atomic and serialised by the mmap/VMA locks, we > > implement this with a strict allow-list - that is, only designated flags > > are allowed to do this. > > > > We make VM_MAYBE_GUARD one of these flags, and then set it under the mmap > > read flag upon guard region installation. > > > > The places where this flag is used currently and matter are: > > > > * VMA merge - performed under mmap/VMA write lock, therefore excluding > > racing writes. > > > > * /proc/$pid/smaps - can race the write, however this isn't meaningful as > > the flag write is performed at the point of the guard region being > > established, and thus an smaps reader can't reasonably expect to avoid > > races. Due to atomicity, a reader will observe either the flag being set > > or not. Therefore consistency will be maintained. > > > > In all other cases the flag being set is irrelevant and atomicity > > guarantees other flags will be read correctly. > > Could we maybe also spell out that we rely on the read mmap/VMA lock to > exclude with writers that have write lock and then use non-atomic updates to > update completely different flags than VM_MAYBE_GUARD? Those non-atomic > updates could cause RMW races when only our side uses an atomic update, but > the trick is that the read lock excludes with the write lock.
I thought this was implicit, I guess I can spell that out. > > > We additionally update madvise_guard_install() to ensure that > > anon_vma_prepare() is set for anonymous VMAs to maintain consistency with > > the assumption that any anonymous VMA with page tables will have an > > anon_vma set, and any with an anon_vma unset will not have page tables > > established. > > Could we more obviously say that we did anon_vma_prepare() unconditionally > before this patch to trigger the page table copying in fork, but it's not > needed anymore because fork now checks also VM_MAYBE_GUARD that we're > setting here. Maybe it would be even more obvious to move that > vma_needs_copy() hunk from previous patch to this one, but doesn't matter > that much. I thought that was covered between the comment, the previous patch and this but I can spell it out also. > > Also we could mention that this patch alone will prevent merging of VMAs in > some situations, but that's addressed next. I don't think it's such a bisect > hazard to need reordering or combining changes, just mention perhaps. A little pedantic but sure :) > > > Signed-off-by: Lorenzo Stoakes <[email protected]> > > Otherwise LGTM. > > Reviewed-by: Vlastimil Babka <[email protected]> > Thanks!
