On Thu, Nov 06, 2025 at 12:31:29PM +0100, Vlastimil Babka wrote:
> On 11/6/25 11:46, Lorenzo Stoakes wrote:
> > This patch adds the ability to atomically set VMA flags with only the mmap
> > read/VMA read lock held.
> >
> > As this could be hugely problematic for VMA flags in general given that all
> > other accesses are non-atomic and serialised by the mmap/VMA locks, we
> > implement this with a strict allow-list - that is, only designated flags
> > are allowed to do this.
> >
> > We make VM_MAYBE_GUARD one of these flags, and then set it under the mmap
> > read flag upon guard region installation.
> >
> > The places where this flag is used currently and matter are:
> >
> > * VMA merge - performed under mmap/VMA write lock, therefore excluding
> >   racing writes.
> >
> > * /proc/$pid/smaps - can race the write, however this isn't meaningful as
> >   the flag write is performed at the point of the guard region being
> >   established, and thus an smaps reader can't reasonably expect to avoid
> >   races. Due to atomicity, a reader will observe either the flag being set
> >   or not. Therefore consistency will be maintained.
> >
> > In all other cases the flag being set is irrelevant and atomicity
> > guarantees other flags will be read correctly.
>
> Could we maybe also spell out that we rely on the read mmap/VMA lock to
> exclude with writers that have write lock and then use non-atomic updates to
> update completely different flags than VM_MAYBE_GUARD? Those non-atomic
> updates could cause RMW races when only our side uses an atomic update, but
> the trick is that the read lock excludes with the write lock.

I thought this was implicit, I guess I can spell that out.

>
> > We additionally update madvise_guard_install() to ensure that
> > anon_vma_prepare() is set for anonymous VMAs to maintain consistency with
> > the assumption that any anonymous VMA with page tables will have an
> > anon_vma set, and any with an anon_vma unset will not have page tables
> > established.
>
> Could we more obviously say that we did anon_vma_prepare() unconditionally
> before this patch to trigger the page table copying in fork, but it's not
> needed anymore because fork now checks also VM_MAYBE_GUARD that we're
> setting here. Maybe it would be even more obvious to move that
> vma_needs_copy() hunk from previous patch to this one, but doesn't matter
> that much.

I thought that was covered between the comment, the previous patch and this but
I can spell it out also.

>
> Also we could mention that this patch alone will prevent merging of VMAs in
> some situations, but that's addressed next. I don't think it's such a bisect
> hazard to need reordering or combining changes, just mention perhaps.

A little pedantic but sure :)

>
> > Signed-off-by: Lorenzo Stoakes <[email protected]>
>
> Otherwise LGTM.
>
> Reviewed-by: Vlastimil Babka <[email protected]>
>

Thanks!

Reply via email to