On Wed, 17 Jun 2026 14:56:09 +0200
Johannes Berg <[email protected]> wrote:

> On Wed, 2026-06-17 at 13:12 +0200, Andy Shevchenko wrote:
> > Convert size_add() to take variadic argument, so we can simplify users
> > with using a macro only once.  
> 
> > +#define __size_add3(addend1, addend2, addend3, addend4, ...) \
> > +   __size_add(__size_add2(addend1,  addend2, addend3), addend4)
> > +#define __size_add4(addend1, addend2, addend3, addend4, addend5, ...) \
> > +   __size_add(__size_add3(addend1,  addend2, addend3, addend4), addend5)
> 
> I guess it's not going to really matter, but it would generate fewer
> calls to have something more like
> 
> #define __size_add3(a1, a2, a3, a4) \
>       size_add(size_add(a1, a2), size_add(a3, a4))
> #define __size_add4(a1, a2, a3, a4, a5) \
>       size_add(size_add(a1, a2), size_add(a3, a4, a5))
> 
> as a binary tree, rather than only cutting one off every time. Not sure
> that results in hugely different code though - maybe fewer overflow
> checks?

The binary tree stands a chance of executing faster because the leaf
adds can be executed in parallel.
Leaving aside the saturation checks (why is it called size_add() rather
than saturating_add()?), (a + b) + (c + d) will usually execute faster
than ((a + b) + c) + d because (a + b) and (c + d) do not depend on each
other and can execute at the same time; unfortunately gcc will always
generate the latter.
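
To make the two shapes concrete, here is a minimal sketch. It assumes a
stand-in sat_add() built on the GCC/Clang __builtin_add_overflow(); the
sat_add*() names are made up for this example and are not the macros
from the patch. The chain has a dependency depth of three adds, the tree
only two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for size_add(): saturates at SIZE_MAX on overflow. */
static inline size_t sat_add(size_t a, size_t b)
{
	size_t sum;

	if (__builtin_add_overflow(a, b, &sum))
		return SIZE_MAX;
	return sum;
}

/* Chain: every add waits for the previous one (dependency depth 3). */
static inline size_t sat_add4_chain(size_t a, size_t b, size_t c, size_t d)
{
	return sat_add(sat_add(sat_add(a, b), c), d);
}

/* Tree: the two leaf adds are independent (dependency depth 2). */
static inline size_t sat_add4_tree(size_t a, size_t b, size_t c, size_t d)
{
	return sat_add(sat_add(a, b), sat_add(c, d));
}

/* Both shapes agree for in-range sums and for saturating ones. */
static void sat_add4_selfcheck(void)
{
	assert(sat_add4_chain(1, 2, 3, 4) == 10);
	assert(sat_add4_tree(1, 2, 3, 4) == 10);
	assert(sat_add4_chain(SIZE_MAX - 1, 1, 1, 0) == SIZE_MAX);
	assert(sat_add4_tree(SIZE_MAX - 1, 1, 1, 0) == SIZE_MAX);
}
```

Since a saturated intermediate stays at SIZE_MAX whatever is added to it
afterwards, both groupings return min(true sum, SIZE_MAX), so the choice
of shape should only affect the generated code, not the result.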

        David

> 
> Although your version makes it completely equivalent to the
> nl80211.c code, clearly it doesn't matter if all the values are "good",
> and I believe the overflow behaviour means it doesn't matter for the
> overflow case either?
> 
> johannes
> 

