On Tue, Jan 20, 2026 at 09:36:19AM -0400, Jason Gunthorpe wrote:
> On Tue, Jan 20, 2026 at 09:46:05AM +0000, Lorenzo Stoakes wrote:
> > On Mon, Jan 19, 2026 at 07:14:03PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Jan 19, 2026 at 09:19:11PM +0000, Lorenzo Stoakes wrote:
> > > > +static inline bool is_shared_maywrite(vma_flags_t flags)
> > > > +{
> > >
> > > I'm not sure it is ideal to pass this array by value? Seems like it
> > > might invite some negative optimizations since now the compiler has to
> > > optimze away a copy too.
> >
> > I really don't think so? This is inlined and thus collapses to a totally
> > standard vma_flags_test_all() which passes by value anyway.
>
> > Do you have specific examples or evidence the compiler will optimise poorly 
> > here
> > on that basis as compared to pass by reference? And pass by reference would
> > necessitate:
>
> I've recently seen enough cases of older compilers and other arches
> making weird choices to be a little concerened. In the above case
> there is no reason not to use a const pointer (and indeed that would
> be the expected idomatic kernel style), so why take chances is my
> thinking.

With respect Jason, you're going to have to do better than that.

The entire implementation is dependent on passing-by-value.

Right now we can do:

        vma_flags_test(&flags, VMA_READ_BIT, VMA_WRITE_BIT, ...);

Which uses mk_vma_flags() in a macro to generalise to:

        vma_flags_test(&flags, <vma_flags_t value>);

The natural implication of what you're saying is that we can no longer use this
from _anywhere_ because - hey - passing this by value is bad so now _everything_
has to be re-written as:

        vma_flags_t flags_to_set = mk_vma_flags(<flags>);

        if (vma_flags_test(&flags, &flags_to_set)) { ... }

Right?

But is even that ok? Because presumably these compilers can inline, so that is
basically equivalent to what the macro's doing so does that rule out the VMA
bitmap flags concept altogether...

For hand-waved 'old compilers' (ok, people who use old compilers should not
expect optimal code) or 'other arches' (unspecified)?

If it was just changing this one function I'd still object as it makes it differ
from _every other test predicate_ using vma_flags_t but maybe to humour you I'd
change it, but surely by this argument you're essentially objecting to the whole
series?

I find it really strange you're going down this road as it was you who suggested
this approach in the first place and had to convince me the compiler would
manage it!...

Maybe I'm missing something here...

I am not sure about this 'idiomatic kernel style' thing either, it feels rather
conjured. Yes you wouldn't ordinarily pass something larger than a register size
by-value, but here the intent is for it to be inlined anyway right?

It strikes me that the key optimisation here is the inlining, now if the issue
is that ye olde compiler might choose not to inline very small functions (seems
unlikely) we could always throw in an __always_inline?

But it seems rather silly for a one-liner?

If the concern is deeper (not optimising the bitmap operations) then aren't you
saying no to the whole concept of the series?

Out of interest I godbolted a bunch of architectures:

x86-64
riscv
mips
s390x
sparc
arm7 32-bit
loongarch
m68k
xtensa

And found the manual method vs. the pass-by-value macro method were equivalent
in each case as far as I could tell.

In the worst case if we hit a weirdo case we can always substitute something
manual I have all the vma_flags_*word*() stuff available (which I recall you
objecting to...!)

I may have completely the wrong end of the stick here?...

>
> Jason

Thanks, Lorenzo

Reply via email to