On Thu, Jan 29, 2026 at 11:37:33AM +0200, Andy Shevchenko wrote:
> On Thu, Jan 29, 2026 at 02:58:52AM +0200, Cristian Ciocaltea wrote:

...

> > +#define __DRM_ARGB64_PREP_BPC(c, shift, bpc)({                             \
> > +   __u16 mask = __GENMASK((bpc) - 1, 0);                           \
> > +   __u16 conv = __KERNEL_DIV_ROUND_CLOSEST((mask & (c)) *          \
> > +                                           __GENMASK(15, 0), mask);\
> 
> The whole point of the first patch is to use it in the divisions by 2^n - 1.
> Can we transform this to make it "divisions" by power-of-two?
> 
>      ...: def dbm2(c, bpc):
>      ...:     m = (1 << bpc) - 1
>      ...:     c1 = m & c
>      ...:     r = c1 << (16 - bpc)
>      ...:     for i in range(1, 16 // bpc):
>      ...:         r = r + (c1 << (16 - (i + 1) * bpc))
>      ...:     return r

I noticed that for some inputs the result is off by a small amount,
but you get the idea.
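
To put a number on that error, here is a minimal sketch that compares the
PoC with the rounded division from the patch. The helper names
div_round_closest(), exact() and max_error() are made up for illustration;
div_round_closest() only models __KERNEL_DIV_ROUND_CLOSEST() for
non-negative integers.

```python
def div_round_closest(x, d):
    # Model of round-to-nearest integer division for non-negative inputs.
    return (x + d // 2) // d

def exact(c, bpc):
    # What the macro in the patch computes: scale a bpc-bit value to 16 bits.
    m = (1 << bpc) - 1
    return div_round_closest((c & m) * 0xFFFF, m)

def dbm2(c, bpc):
    # The shift-and-add PoC quoted above, unchanged.
    m = (1 << bpc) - 1
    c1 = m & c
    r = c1 << (16 - bpc)
    for i in range(1, 16 // bpc):
        r = r + (c1 << (16 - (i + 1) * bpc))
    return r

def max_error(bpc):
    # Largest absolute difference over every possible bpc-bit input.
    return max(abs(exact(c, bpc) - dbm2(c, bpc)) for c in range(1 << bpc))
```

Calling max_error(bpc) for bpc in 1..16 shows where the two disagree. When
bpc divides 16 the loop is plain bit replication and the difference should
be zero; for other widths the truncated series undershoots.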

Taking this into account, perhaps we can share __KERNEL_DIV_ROUND_CLOSEST()
anyway, leave it where it is, and improve the situation later on. That is up
to the DRM maintainers.

> The above is a Python PoC of this approximation. Basically we transform
> the fraction X / (2^n - 1) into the chained sum
> X / 2^n + X / 2^(2n) + ... + X / 2^(kn), which follows from applying the
> identity X / (2^n - 1) = X / 2^n + X / (2^n * (2^n - 1)) recursively.
> 
> So, maybe that one should be used instead? (It may be thought through
> on how to collapse the for-loop to maybe some bitops, but even with
> a for-loop it might be faster than real division.)
> 
> Note, we have some examples (more than one for sure; I remember the same
> question came up for me a few years ago) which may avoid division
> altogether. I would like this macro to be kernel-wide (and UAPI seems okay,
> too).

> > +   __DRM_ARGB64_PREP(conv, shift);                                 \
> > +})

-- 
With Best Regards,
Andy Shevchenko

