Re: [PATCH 06/14] bitsperlong.h: introduce SMALL_CONST() macro

2021-03-12 Thread Yury Norov
On Fri, Mar 12, 2021 at 10:12:22AM +0100, Rasmus Villemoes wrote:
> On 12/03/2021 06.28, Yury Norov wrote:
> > On Fri, Feb 19, 2021 at 12:07:27AM +0100, Rasmus Villemoes wrote:
> >> On 18/02/2021 05.05, Yury Norov wrote:
> >>> Many algorithms become simpler if they are passed with relatively small
> >>> input values. One example is bitmap operations when the whole bitmap fits
> >>> into one word. To implement such simplifications, linux/bitmap.h declares
> >>> small_const_nbits() macro.
> >>>
> >>> Other subsystems may also benefit from optimizations of this sort, like
> >>> find_bit API in the following patches. So it looks helpful to generalize
> >>> the macro and extend it's visibility.
> >>
> >> Perhaps, but SMALL_CONST is too generic a name, it needs to keep "bits"
> >> somewhere in there. So why not just keep it at small_const_nbits?
> >>
> >>> Signed-off-by: Yury Norov 
> >>> ---
> >>>  include/asm-generic/bitsperlong.h |  2 ++
> >>>  include/linux/bitmap.h| 33 ++-
> >>>  2 files changed, 17 insertions(+), 18 deletions(-)
> >>>
> >>> diff --git a/include/asm-generic/bitsperlong.h 
> >>> b/include/asm-generic/bitsperlong.h
> >>> index 3905c1c93dc2..0eeb77544f1d 100644
> >>> --- a/include/asm-generic/bitsperlong.h
> >>> +++ b/include/asm-generic/bitsperlong.h
> >>> @@ -23,4 +23,6 @@
> >>>  #define BITS_PER_LONG_LONG 64
> >>>  #endif
> >>>  
> >>> +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < 
> >>> BITS_PER_LONG)
> >>> +
> >>>  #endif /* __ASM_GENERIC_BITS_PER_LONG */
> >>> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> >>> index adf7bd9f0467..e89f1dace846 100644
> >>> --- a/include/linux/bitmap.h
> >>> +++ b/include/linux/bitmap.h
> >>> @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char 
> >>> *buf,
> >>>   * so make such users (should any ever turn up) call the out-of-line
> >>>   * versions.
> >>>   */
> >>> -#define small_const_nbits(nbits) \
> >>> - (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
> >>> -
> >>>  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> >>>  {
> >>>   unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> >>> @@ -278,7 +275,7 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned 
> >>> long *bitmap,
> >>>  static inline int bitmap_and(unsigned long *dst, const unsigned long 
> >>> *src1,
> >>>   const unsigned long *src2, unsigned int nbits)
> >>>  {
> >>> - if (small_const_nbits(nbits))
> >>> + if (SMALL_CONST(nbits - 1))
> >>
> >> Please don't force most users to be changed to something less readable.
> >> What's wrong with just keeping small_const_nbits() the way it is,
> >> avoiding all this churn and keeping the readability?
> > 
> > The wrong thing is that it's defined in include/linux/bitmap.h, and I
> > cannot use it in include/asm-generic/bitops/find.h, so I have to either
> > move it to a separate header, or generalize and share with find.h and
> > other users this way. I prefer the latter option, thougt it's more
> > verbose.
> 
> The logical place would be the same place the BITS_PER_LONG macro is
> defined, no?

Yes. This where I placed SMALL_CONST() in current version.

> No need to introduce a new header for that, and all current
> users of small_const_nbits() must already (very possibly indirectly)
> include asm-generic/bitsperlong.h.
> 
> I do prefer to keep both the name small_const_nbits() and its current
> semantics, which, although not currently spelled out that way anywhere,
> is "is BITMAP_SIZE(nbits) known at compile time and equal to 1", which
> is precisely what allows the static inlines to unconditionally
> dereference the pointer (that's the "exclude the 0 case") and just deal
> with that one word.
> 
> I don't like either SMALL_CONST or small_const_size, because nothing in
> there says it has anything to do with bit ops. As I said, if you have
> some special place that for some reason cannot handle
> nbits==BITS_PER_LONG, then just add that as an additional constraint
> with a comment why.

OK, I'll move small_const_nbits() to the bitsperlong header and
resubmit shortly. My concern still is that nbits is too specific 
for bitsperlong.h, but if you're good with it, I'm OK as well.


Re: [PATCH 06/14] bitsperlong.h: introduce SMALL_CONST() macro

2021-03-12 Thread Rasmus Villemoes
On 12/03/2021 06.28, Yury Norov wrote:
> On Fri, Feb 19, 2021 at 12:07:27AM +0100, Rasmus Villemoes wrote:
>> On 18/02/2021 05.05, Yury Norov wrote:
>>> Many algorithms become simpler if they are passed with relatively small
>>> input values. One example is bitmap operations when the whole bitmap fits
>>> into one word. To implement such simplifications, linux/bitmap.h declares
>>> small_const_nbits() macro.
>>>
>>> Other subsystems may also benefit from optimizations of this sort, like
>>> find_bit API in the following patches. So it looks helpful to generalize
>>> the macro and extend it's visibility.
>>
>> Perhaps, but SMALL_CONST is too generic a name, it needs to keep "bits"
>> somewhere in there. So why not just keep it at small_const_nbits?
>>
>>> Signed-off-by: Yury Norov 
>>> ---
>>>  include/asm-generic/bitsperlong.h |  2 ++
>>>  include/linux/bitmap.h| 33 ++-
>>>  2 files changed, 17 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/include/asm-generic/bitsperlong.h 
>>> b/include/asm-generic/bitsperlong.h
>>> index 3905c1c93dc2..0eeb77544f1d 100644
>>> --- a/include/asm-generic/bitsperlong.h
>>> +++ b/include/asm-generic/bitsperlong.h
>>> @@ -23,4 +23,6 @@
>>>  #define BITS_PER_LONG_LONG 64
>>>  #endif
>>>  
>>> +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < 
>>> BITS_PER_LONG)
>>> +
>>>  #endif /* __ASM_GENERIC_BITS_PER_LONG */
>>> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
>>> index adf7bd9f0467..e89f1dace846 100644
>>> --- a/include/linux/bitmap.h
>>> +++ b/include/linux/bitmap.h
>>> @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
>>>   * so make such users (should any ever turn up) call the out-of-line
>>>   * versions.
>>>   */
>>> -#define small_const_nbits(nbits) \
>>> -   (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
>>> -
>>>  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
>>>  {
>>> unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
>>> @@ -278,7 +275,7 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned 
>>> long *bitmap,
>>>  static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
>>> const unsigned long *src2, unsigned int nbits)
>>>  {
>>> -   if (small_const_nbits(nbits))
>>> +   if (SMALL_CONST(nbits - 1))
>>
>> Please don't force most users to be changed to something less readable.
>> What's wrong with just keeping small_const_nbits() the way it is,
>> avoiding all this churn and keeping the readability?
> 
> The wrong thing is that it's defined in include/linux/bitmap.h, and I
> cannot use it in include/asm-generic/bitops/find.h, so I have to either
> move it to a separate header, or generalize and share with find.h and
> other users this way. I prefer the latter option, thougt it's more
> verbose.

The logical place would be the same place the BITS_PER_LONG macro is
defined, no? No need to introduce a new header for that, and all current
users of small_const_nbits() must already (very possibly indirectly)
include asm-generic/bitsperlong.h.

I do prefer to keep both the name small_const_nbits() and its current
semantics, which, although not currently spelled out that way anywhere,
is "is BITMAP_SIZE(nbits) known at compile time and equal to 1", which
is precisely what allows the static inlines to unconditionally
dereference the pointer (that's the "exclude the 0 case") and just deal
with that one word.

I don't like either SMALL_CONST or small_const_size, because nothing in
there says it has anything to do with bit ops. As I said, if you have
some special place that for some reason cannot handle
nbits==BITS_PER_LONG, then just add that as an additional constraint
with a comment why.

Rasmus


Re: [PATCH 06/14] bitsperlong.h: introduce SMALL_CONST() macro

2021-03-11 Thread Yury Norov
On Fri, Feb 19, 2021 at 12:07:27AM +0100, Rasmus Villemoes wrote:
> On 18/02/2021 05.05, Yury Norov wrote:
> > Many algorithms become simpler if they are passed with relatively small
> > input values. One example is bitmap operations when the whole bitmap fits
> > into one word. To implement such simplifications, linux/bitmap.h declares
> > small_const_nbits() macro.
> > 
> > Other subsystems may also benefit from optimizations of this sort, like
> > find_bit API in the following patches. So it looks helpful to generalize
> > the macro and extend it's visibility.
> 
> Perhaps, but SMALL_CONST is too generic a name, it needs to keep "bits"
> somewhere in there. So why not just keep it at small_const_nbits?
> 
> > Signed-off-by: Yury Norov 
> > ---
> >  include/asm-generic/bitsperlong.h |  2 ++
> >  include/linux/bitmap.h| 33 ++-
> >  2 files changed, 17 insertions(+), 18 deletions(-)
> > 
> > diff --git a/include/asm-generic/bitsperlong.h 
> > b/include/asm-generic/bitsperlong.h
> > index 3905c1c93dc2..0eeb77544f1d 100644
> > --- a/include/asm-generic/bitsperlong.h
> > +++ b/include/asm-generic/bitsperlong.h
> > @@ -23,4 +23,6 @@
> >  #define BITS_PER_LONG_LONG 64
> >  #endif
> >  
> > +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < 
> > BITS_PER_LONG)
> > +
> >  #endif /* __ASM_GENERIC_BITS_PER_LONG */
> > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> > index adf7bd9f0467..e89f1dace846 100644
> > --- a/include/linux/bitmap.h
> > +++ b/include/linux/bitmap.h
> > @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
> >   * so make such users (should any ever turn up) call the out-of-line
> >   * versions.
> >   */
> > -#define small_const_nbits(nbits) \
> > -   (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
> > -
> >  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> >  {
> > unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> > @@ -278,7 +275,7 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned 
> > long *bitmap,
> >  static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
> > const unsigned long *src2, unsigned int nbits)
> >  {
> > -   if (small_const_nbits(nbits))
> > +   if (SMALL_CONST(nbits - 1))
> 
> Please don't force most users to be changed to something less readable.
> What's wrong with just keeping small_const_nbits() the way it is,
> avoiding all this churn and keeping the readability?

The wrong thing is that it's defined in include/linux/bitmap.h, and I
cannot use it in include/asm-generic/bitops/find.h, so I have to either
move it to a separate header, or generalize and share with find.h and
other users this way. I prefer the latter option, thougt it's more
verbose.
 
> At a quick reading, one of the very few places where you end up not
> passing nbits-1 but just nbits is this
> 
>  unsigned long find_next_zero_bit_le(const void *addr, unsigned
>   long size, unsigned long offset)
>  {
> + if (SMALL_CONST(size)) {
> + unsigned long val = *(const unsigned long *)addr;
> +
> + if (unlikely(offset >= size))
> + return size;
> 
> which is a regression, for much the same reason the nbits==0 case was
> excluded from small_const_nbits in the first place. If size is 0, we
> used to just return 0 early in _find_next_bit. But you've introduced a
> dereference of addr before that check is now done, which is
> theoretically an oops.
> 
> If find_next_zero_bit_le cannot handle nbits==BITS_PER_LONG efficiently
> but requires one off-limits bit position, fine, so be it, add an extra
> "small_const_nbits() && nbits < BITS_PER_LONG" (and a comment).

Sure, it's my bad. I need to fix it.

What would you prefer, moving the small_const_nbits() to a separate
header, or introduce something like
#define small_const_size(n) SMALL_CONST((n) - 1)

to avoid mistakes like this? I think small_const_size() is better
because it might be potentially useful for someone else, but I can
go either way.


Re: [PATCH 06/14] bitsperlong.h: introduce SMALL_CONST() macro

2021-02-18 Thread Rasmus Villemoes
On 18/02/2021 05.05, Yury Norov wrote:
> Many algorithms become simpler if they are passed with relatively small
> input values. One example is bitmap operations when the whole bitmap fits
> into one word. To implement such simplifications, linux/bitmap.h declares
> small_const_nbits() macro.
> 
> Other subsystems may also benefit from optimizations of this sort, like
> find_bit API in the following patches. So it looks helpful to generalize
> the macro and extend it's visibility.

Perhaps, but SMALL_CONST is too generic a name, it needs to keep "bits"
somewhere in there. So why not just keep it at small_const_nbits?

> Signed-off-by: Yury Norov 
> ---
>  include/asm-generic/bitsperlong.h |  2 ++
>  include/linux/bitmap.h| 33 ++-
>  2 files changed, 17 insertions(+), 18 deletions(-)
> 
> diff --git a/include/asm-generic/bitsperlong.h 
> b/include/asm-generic/bitsperlong.h
> index 3905c1c93dc2..0eeb77544f1d 100644
> --- a/include/asm-generic/bitsperlong.h
> +++ b/include/asm-generic/bitsperlong.h
> @@ -23,4 +23,6 @@
>  #define BITS_PER_LONG_LONG 64
>  #endif
>  
> +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < 
> BITS_PER_LONG)
> +
>  #endif /* __ASM_GENERIC_BITS_PER_LONG */
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index adf7bd9f0467..e89f1dace846 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
>   * so make such users (should any ever turn up) call the out-of-line
>   * versions.
>   */
> -#define small_const_nbits(nbits) \
> - (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
> -
>  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
>  {
>   unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> @@ -278,7 +275,7 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned long 
> *bitmap,
>  static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
>   const unsigned long *src2, unsigned int nbits)
>  {
> - if (small_const_nbits(nbits))
> + if (SMALL_CONST(nbits - 1))

Please don't force most users to be changed to something less readable.
What's wrong with just keeping small_const_nbits() the way it is,
avoiding all this churn and keeping the readability?

At a quick reading, one of the very few places where you end up not
passing nbits-1 but just nbits is this

 unsigned long find_next_zero_bit_le(const void *addr, unsigned
long size, unsigned long offset)
 {
+   if (SMALL_CONST(size)) {
+   unsigned long val = *(const unsigned long *)addr;
+
+   if (unlikely(offset >= size))
+   return size;

which is a regression, for much the same reason the nbits==0 case was
excluded from small_const_nbits in the first place. If size is 0, we
used to just return 0 early in _find_next_bit. But you've introduced a
dereference of addr before that check is now done, which is
theoretically an oops.

If find_next_zero_bit_le cannot handle nbits==BITS_PER_LONG efficiently
but requires one off-limits bit position, fine, so be it, add an extra
"small_const_nbits() && nbits < BITS_PER_LONG" (and a comment).

Rasmus


[PATCH 06/14] bitsperlong.h: introduce SMALL_CONST() macro

2021-02-17 Thread Yury Norov
Many algorithms become simpler if they are passed with relatively small
input values. One example is bitmap operations when the whole bitmap fits
into one word. To implement such simplifications, linux/bitmap.h declares
small_const_nbits() macro.

Other subsystems may also benefit from optimizations of this sort, like
find_bit API in the following patches. So it looks helpful to generalize
the macro and extend it's visibility.

Signed-off-by: Yury Norov 
---
 include/asm-generic/bitsperlong.h |  2 ++
 include/linux/bitmap.h| 33 ++-
 2 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/include/asm-generic/bitsperlong.h 
b/include/asm-generic/bitsperlong.h
index 3905c1c93dc2..0eeb77544f1d 100644
--- a/include/asm-generic/bitsperlong.h
+++ b/include/asm-generic/bitsperlong.h
@@ -23,4 +23,6 @@
 #define BITS_PER_LONG_LONG 64
 #endif
 
+#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < 
BITS_PER_LONG)
+
 #endif /* __ASM_GENERIC_BITS_PER_LONG */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index adf7bd9f0467..e89f1dace846 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
  * so make such users (should any ever turn up) call the out-of-line
  * versions.
  */
-#define small_const_nbits(nbits) \
-   (__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
-
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
@@ -278,7 +275,7 @@ extern void bitmap_to_arr32(u32 *buf, const unsigned long 
*bitmap,
 static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
return (*dst = *src1 & *src2 & BITS_FIRST(nbits - 1)) != 0;
return __bitmap_and(dst, src1, src2, nbits);
 }
@@ -286,7 +283,7 @@ static inline int bitmap_and(unsigned long *dst, const 
unsigned long *src1,
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
*dst = *src1 | *src2;
else
__bitmap_or(dst, src1, src2, nbits);
@@ -295,7 +292,7 @@ static inline void bitmap_or(unsigned long *dst, const 
unsigned long *src1,
 static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
*dst = *src1 ^ *src2;
else
__bitmap_xor(dst, src1, src2, nbits);
@@ -304,7 +301,7 @@ static inline void bitmap_xor(unsigned long *dst, const 
unsigned long *src1,
 static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
return (*dst = *src1 & ~(*src2) & BITS_FIRST(nbits - 1)) != 0;
return __bitmap_andnot(dst, src1, src2, nbits);
 }
@@ -312,7 +309,7 @@ static inline int bitmap_andnot(unsigned long *dst, const 
unsigned long *src1,
 static inline void bitmap_complement(unsigned long *dst, const unsigned long 
*src,
unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
*dst = ~(*src);
else
__bitmap_complement(dst, src, nbits);
@@ -328,7 +325,7 @@ static inline void bitmap_complement(unsigned long *dst, 
const unsigned long *sr
 static inline int bitmap_equal(const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))
return !((*src1 ^ *src2) & BITS_FIRST(nbits - 1));
if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
@@ -350,7 +347,7 @@ static inline bool bitmap_or_equal(const unsigned long 
*src1,
   const unsigned long *src3,
   unsigned int nbits)
 {
-   if (!small_const_nbits(nbits))
+   if (!SMALL_CONST(nbits - 1))
return __bitmap_or_equal(src1, src2, src3, nbits);
 
return !(((*src1 | *src2) ^ *src3) & BITS_FIRST(nbits - 1));
@@ -359,7 +356,7 @@ static inline bool bitmap_or_equal(const unsigned long 
*src1,
 static inline int bitmap_intersects(const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
 {
-   if (small_const_nbits(nbits))
+   if (SMALL_CONST(nbits - 1))