Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-21 Thread Vlastimil Babka
On 11/13/18 7:22 PM, Vlastimil Babka wrote:
> On 11/12/18 10:55 AM, David Laight wrote:
>> From: Vlastimil Babka [mailto:vba...@suse.cz]
>>> Sent: 09 November 2018 19:16
>> ...
>>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>> Maybe just have two copies of the function body?
>>
>>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> {
>> #ifndef CONFIG_ZONE_DMA
>>  return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
>> #else
>>  if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>>  return KMALLOC_NORMAL;
>>  return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
>> #endif
>> }
> 
> OK that's probably the most straightforward to follow, thanks.
> Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
> all other attempts. flags & __GFP_DMA is converted to 1/0 index without 
> branches
> or cmovs or whatnot.

Ping? Seems like people will report duplicates until the sparse warning
is gone in mainline...

Also CC linux-mm which was somehow lost.


8<
>From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Reported-by: Masahiro Yamada 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
 }
 
 /*
-- 
2.19.1





Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-21 Thread Vlastimil Babka
On 11/13/18 7:22 PM, Vlastimil Babka wrote:
> On 11/12/18 10:55 AM, David Laight wrote:
>> From: Vlastimil Babka [mailto:vba...@suse.cz]
>>> Sent: 09 November 2018 19:16
>> ...
>>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
>> Maybe just have two copies of the function body?
>>
>>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>> {
>> #ifndef CONFIG_ZONE_DMA
>>  return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
>> #else
>>  if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>>  return KMALLOC_NORMAL;
>>  return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
>> #endif
>> }
> 
> OK that's probably the most straightforward to follow, thanks.
> Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
> all other attempts. flags & __GFP_DMA is converted to 1/0 index without 
> branches
> or cmovs or whatnot.

Ping? Seems like people will report duplicates until the sparse warning
is gone in mainline...

Also CC linux-mm which was somehow lost.


8<
>From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Reported-by: Masahiro Yamada 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
 }
 
 /*
-- 
2.19.1





Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-19 Thread Vlastimil Babka
On 11/19/18 12:04 PM, Pavel Machek wrote:
> On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka 
>> Cc: Mel Gorman 
>> Cc: Christoph Lameter 
>> Cc: Roman Gushchin 
>> Signed-off-by: Bart Van Assche 
>> ---
>>  include/linux/slab.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>> kmalloc_type(gfp_t flags)
>>   * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>   * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>   */
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>  }
>>  
> 
> What is wrong with && ?

Nothing, it would work and generate the same assembly as '&'. But Andrew
noted that this code is probably too clever for its own good, and he has
a point. The single predictable branch is also likely faster than the
chain of arithmetic calculations anyway. Nobody has actually measured
it, so I'd go with the easier-to-read variant.

> If logical and is better done by multiply,
> that's compiler job, and compiler should be fixed to do it...

Multiply was just another way (equivalent to '&&' semantically) to shut
up sparse warning. But gcc actually emits IMUL in that case, which is
wasteful, so yeah there's a bug report now:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87954

> 
>   Pavel
> 



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-19 Thread Vlastimil Babka
On 11/19/18 12:04 PM, Pavel Machek wrote:
> On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka 
>> Cc: Mel Gorman 
>> Cc: Christoph Lameter 
>> Cc: Roman Gushchin 
>> Signed-off-by: Bart Van Assche 
>> ---
>>  include/linux/slab.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>> kmalloc_type(gfp_t flags)
>>   * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>   * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>   */
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>  }
>>  
> 
> What is wrong with && ?

Nothing, it would work and generate the same assembly as '&'. But Andrew
noted that this code is probably too clever for its own good, and he has
a point. The single predictable branch is also likely faster than the
chain of arithmetic calculations anyway. Nobody has actually measured
it, so I'd go with the easier-to-read variant.

> If logical and is better done by multiply,
> that's compiler job, and compiler should be fixed to do it...

Multiply was just another way (equivalent to '&&' semantically) to shut
up sparse warning. But gcc actually emits IMUL in that case, which is
wasteful, so yeah there's a bug report now:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87954

> 
>   Pavel
> 



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-19 Thread Pavel Machek
On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 
> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  

What is wrong with && ? If logical and is better done by multiply,
that's compiler job, and compiler should be fixed to do it...


Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-19 Thread Pavel Machek
On Mon 2018-11-05 12:40:00, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 
> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  

What is wrong with && ? If logical and is better done by multiply,
that's compiler job, and compiler should be fixed to do it...


Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-13 Thread Vlastimil Babka
On 11/12/18 10:55 AM, David Laight wrote:
> From: Vlastimil Babka [mailto:vba...@suse.cz]
>> Sent: 09 November 2018 19:16
> ...
>> This? Not terribly elegant, but I don't see a nicer way right now...
> 
> Maybe just have two copies of the function body?
> 
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> #ifndef CONFIG_ZONE_DMA
>   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
> #else
>   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>   return KMALLOC_NORMAL;
>   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> #endif
> }

OK that's probably the most straightforward to follow, thanks.
Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
or cmovs or whatnot.

8<
>From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
 }
 
 /*
-- 
2.19.1





Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-13 Thread Vlastimil Babka
On 11/12/18 10:55 AM, David Laight wrote:
> From: Vlastimil Babka [mailto:vba...@suse.cz]
>> Sent: 09 November 2018 19:16
> ...
>> This? Not terribly elegant, but I don't see a nicer way right now...
> 
> Maybe just have two copies of the function body?
> 
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> {
> #ifndef CONFIG_ZONE_DMA
>   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
> #else
>   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
>   return KMALLOC_NORMAL;
>   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> #endif
> }

OK that's probably the most straightforward to follow, thanks.
Note that for CONFIG_ZONE_DMA=n the result is identical to original code and
all other attempts. flags & __GFP_DMA is converted to 1/0 index without branches
or cmovs or whatnot.

8<
>From 40735b637b28c3e5798bc7e90f72f349050c2045 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..6d5009f29ce5 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,22 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
+#endif
 }
 
 /*
-- 
2.19.1





RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-12 Thread David Laight
From: Vlastimil Babka [mailto:vba...@suse.cz]
> Sent: 09 November 2018 19:16
...
> This? Not terribly elegant, but I don't see a nicer way right now...

Maybe just have two copies of the function body?

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifndef CONFIG_ZONE_DMA
return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#else
if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
return KMALLOC_NORMAL;
return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#endif
}

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-12 Thread David Laight
From: Vlastimil Babka [mailto:vba...@suse.cz]
> Sent: 09 November 2018 19:16
...
> This? Not terribly elegant, but I don't see a nicer way right now...

Maybe just have two copies of the function body?

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
{
#ifndef CONFIG_ZONE_DMA
return flags & __GFP_RECLAIMABLE ? KMALLOC_RECLAIM : KMALLOC_NORMAL;
#else
if (likely((flags & (__GFP_DMA | __GFP_RECLAIMABLE)) == 0))
return KMALLOC_NORMAL;
return flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
#endif
}

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/9/18 8:47 PM, Darryl T. Agostinelli wrote:
> On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
>> On 11/9/18 8:00 PM, Andrew Morton wrote:
>>> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
>>>
 Multiple people have reported the following sparse warning:

 ./include/linux/slab.h:332:43: warning: dubious: x & !y

 The minimal fix would be to change the logical & to boolean &&, which 
 emits the
 same code, but Andrew has suggested that the branch-avoiding tricks are 
 maybe
 not worthwile. David Laight provided a nice comparison of disassembly of
 multiple variants, which shows that the current version produces a 4 deep
 dependency chain, and fixing the sparse warning by changing logical and to
 multiplication emits an IMUL, making it even more expensive.

 The code as rewritten by this patch yielded the best disassembly, with a 
 single
 predictable branch for the most common case, and a ternary operator for the
 rest, which gcc seems to compile without a branch or cmov by itself.

 The result should be more readable, without a sparse warning and probably 
 also
 faster for the common case.

 Reported-by: Bart Van Assche 
 Reported-by: Darryl T. Agostinelli 
 Suggested-by: Andrew Morton 
 Suggested-by: David Laight 
 Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
 Signed-off-by: Vlastimil Babka 
 ---
  include/linux/slab.h | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)

 diff --git a/include/linux/slab.h b/include/linux/slab.h
 index 918f374e7156..18c6920c2803 100644
 --- a/include/linux/slab.h
 +++ b/include/linux/slab.h
 @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
  #ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
 +#else
 +  KMALLOC_DMA = KMALLOC_NORMAL,
  #endif
NR_KMALLOC_TYPES
  };
>>>
>>> I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
>>> cause NR_KMALLOC_TYPES to have value 1.
>>
>> Doh, right! Thanks for catching this.
>>
>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
> 
> How about the solution I proposed yesterday? 
> 
> https://lkml.org/lkml/2018/11/9/750
> 
> It doesn't involve any tricks. 

It doesn't remove the "trick" that calculates return value as a sum of
booleans multiplying constants. The patch converts one part of the
expression of those booleans to a ternary operator. I think the result
is even harder to follow and meanwhile Andrew's suggestion was to remove
all the tricks.

> As it is, this sparse warning is begging for a trick. Let's not 
> oblidge it to much.

The sparse warning could be silenced just by changing '&' to '&&' which
would emit the same code. But we decided to untrick the code.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/9/18 8:47 PM, Darryl T. Agostinelli wrote:
> On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
>> On 11/9/18 8:00 PM, Andrew Morton wrote:
>>> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
>>>
 Multiple people have reported the following sparse warning:

 ./include/linux/slab.h:332:43: warning: dubious: x & !y

 The minimal fix would be to change the logical & to boolean &&, which 
 emits the
 same code, but Andrew has suggested that the branch-avoiding tricks are 
 maybe
 not worthwile. David Laight provided a nice comparison of disassembly of
 multiple variants, which shows that the current version produces a 4 deep
 dependency chain, and fixing the sparse warning by changing logical and to
 multiplication emits an IMUL, making it even more expensive.

 The code as rewritten by this patch yielded the best disassembly, with a 
 single
 predictable branch for the most common case, and a ternary operator for the
 rest, which gcc seems to compile without a branch or cmov by itself.

 The result should be more readable, without a sparse warning and probably 
 also
 faster for the common case.

 Reported-by: Bart Van Assche 
 Reported-by: Darryl T. Agostinelli 
 Suggested-by: Andrew Morton 
 Suggested-by: David Laight 
 Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
 Signed-off-by: Vlastimil Babka 
 ---
  include/linux/slab.h | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)

 diff --git a/include/linux/slab.h b/include/linux/slab.h
 index 918f374e7156..18c6920c2803 100644
 --- a/include/linux/slab.h
 +++ b/include/linux/slab.h
 @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
  #ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
 +#else
 +  KMALLOC_DMA = KMALLOC_NORMAL,
  #endif
NR_KMALLOC_TYPES
  };
>>>
>>> I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
>>> cause NR_KMALLOC_TYPES to have value 1.
>>
>> Doh, right! Thanks for catching this.
>>
>> This? Not terribly elegant, but I don't see a nicer way right now...
>>
> 
> How about the solution I proposed yesterday? 
> 
> https://lkml.org/lkml/2018/11/9/750
> 
> It doesn't involve any tricks. 

It doesn't remove the "trick" that calculates return value as a sum of
booleans multiplying constants. The patch converts one part of the
expression of those booleans to a ternary operator. I think the result
is even harder to follow and meanwhile Andrew's suggestion was to remove
all the tricks.

> As it is, this sparse warning is begging for a trick. Let's not 
> oblidge it to much.

The sparse warning could be silenced just by changing '&' to '&&' which
would emit the same code. But we decided to untrick the code.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Darryl T. Agostinelli
On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
> On 11/9/18 8:00 PM, Andrew Morton wrote:
> > On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
> > 
> >> Multiple people have reported the following sparse warning:
> >>
> >> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> >>
> >> The minimal fix would be to change the logical & to boolean &&, which 
> >> emits the
> >> same code, but Andrew has suggested that the branch-avoiding tricks are 
> >> maybe
> >> not worthwile. David Laight provided a nice comparison of disassembly of
> >> multiple variants, which shows that the current version produces a 4 deep
> >> dependency chain, and fixing the sparse warning by changing logical and to
> >> multiplication emits an IMUL, making it even more expensive.
> >>
> >> The code as rewritten by this patch yielded the best disassembly, with a 
> >> single
> >> predictable branch for the most common case, and a ternary operator for the
> >> rest, which gcc seems to compile without a branch or cmov by itself.
> >>
> >> The result should be more readable, without a sparse warning and probably 
> >> also
> >> faster for the common case.
> >>
> >> Reported-by: Bart Van Assche 
> >> Reported-by: Darryl T. Agostinelli 
> >> Suggested-by: Andrew Morton 
> >> Suggested-by: David Laight 
> >> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> >> Signed-off-by: Vlastimil Babka 
> >> ---
> >>  include/linux/slab.h | 24 
> >>  1 file changed, 12 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> index 918f374e7156..18c6920c2803 100644
> >> --- a/include/linux/slab.h
> >> +++ b/include/linux/slab.h
> >> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> >>KMALLOC_RECLAIM,
> >>  #ifdef CONFIG_ZONE_DMA
> >>KMALLOC_DMA,
> >> +#else
> >> +  KMALLOC_DMA = KMALLOC_NORMAL,
> >>  #endif
> >>NR_KMALLOC_TYPES
> >>  };
> > 
> > I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
> > cause NR_KMALLOC_TYPES to have value 1.
> 
> Doh, right! Thanks for catching this.
> 
> This? Not terribly elegant, but I don't see a nicer way right now...
> 

How about the solution I proposed yesterday? 

https://lkml.org/lkml/2018/11/9/750

It doesn't involve any tricks. 

As it is, this sparse warning is begging for a trick. Let's not 
oblidge it to much.



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Darryl T. Agostinelli
On Fri, Nov 09, 2018 at 08:16:07PM +0100, Vlastimil Babka wrote:
> On 11/9/18 8:00 PM, Andrew Morton wrote:
> > On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
> > 
> >> Multiple people have reported the following sparse warning:
> >>
> >> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> >>
> >> The minimal fix would be to change the logical & to boolean &&, which 
> >> emits the
> >> same code, but Andrew has suggested that the branch-avoiding tricks are 
> >> maybe
> >> not worthwile. David Laight provided a nice comparison of disassembly of
> >> multiple variants, which shows that the current version produces a 4 deep
> >> dependency chain, and fixing the sparse warning by changing logical and to
> >> multiplication emits an IMUL, making it even more expensive.
> >>
> >> The code as rewritten by this patch yielded the best disassembly, with a 
> >> single
> >> predictable branch for the most common case, and a ternary operator for the
> >> rest, which gcc seems to compile without a branch or cmov by itself.
> >>
> >> The result should be more readable, without a sparse warning and probably 
> >> also
> >> faster for the common case.
> >>
> >> Reported-by: Bart Van Assche 
> >> Reported-by: Darryl T. Agostinelli 
> >> Suggested-by: Andrew Morton 
> >> Suggested-by: David Laight 
> >> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> >> Signed-off-by: Vlastimil Babka 
> >> ---
> >>  include/linux/slab.h | 24 
> >>  1 file changed, 12 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> index 918f374e7156..18c6920c2803 100644
> >> --- a/include/linux/slab.h
> >> +++ b/include/linux/slab.h
> >> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
> >>KMALLOC_RECLAIM,
> >>  #ifdef CONFIG_ZONE_DMA
> >>KMALLOC_DMA,
> >> +#else
> >> +  KMALLOC_DMA = KMALLOC_NORMAL,
> >>  #endif
> >>NR_KMALLOC_TYPES
> >>  };
> > 
> > I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
> > cause NR_KMALLOC_TYPES to have value 1.
> 
> Doh, right! Thanks for catching this.
> 
> This? Not terribly elegant, but I don't see a nicer way right now...
> 

How about the solution I proposed yesterday? 

https://lkml.org/lkml/2018/11/9/750

It doesn't involve any tricks. 

As it is, this sparse warning is begging for a trick. Let's not 
oblidge it to much.



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/9/18 8:00 PM, Andrew Morton wrote:
> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
> 
>> Multiple people have reported the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> The minimal fix would be to change the logical & to boolean &&, which emits 
>> the
>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>> not worthwile. David Laight provided a nice comparison of disassembly of
>> multiple variants, which shows that the current version produces a 4 deep
>> dependency chain, and fixing the sparse warning by changing logical and to
>> multiplication emits an IMUL, making it even more expensive.
>>
>> The code as rewritten by this patch yielded the best disassembly, with a 
>> single
>> predictable branch for the most common case, and a ternary operator for the
>> rest, which gcc seems to compile without a branch or cmov by itself.
>>
>> The result should be more readable, without a sparse warning and probably 
>> also
>> faster for the common case.
>>
>> Reported-by: Bart Van Assche 
>> Reported-by: Darryl T. Agostinelli 
>> Suggested-by: Andrew Morton 
>> Suggested-by: David Laight 
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Signed-off-by: Vlastimil Babka 
>> ---
>>  include/linux/slab.h | 24 
>>  1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..18c6920c2803 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>>  KMALLOC_RECLAIM,
>>  #ifdef CONFIG_ZONE_DMA
>>  KMALLOC_DMA,
>> +#else
>> +KMALLOC_DMA = KMALLOC_NORMAL,
>>  #endif
>>  NR_KMALLOC_TYPES
>>  };
> 
> I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
> cause NR_KMALLOC_TYPES to have value 1.

Doh, right! Thanks for catching this.

This? Not terribly elegant, but I don't see a nicer way right now...

8<
>From 40b84707e1b5aeccff11bd5f0563bb938e2c22d6 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..ca113d4ecc6f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,24 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
+   int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
 
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+#ifdef CONFIG_ZONE_DMA
+   return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return KMALLOC_RECLAIM;
+#endif
 }
 
 /*
-- 
2.19.1



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/9/18 8:00 PM, Andrew Morton wrote:
> On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:
> 
>> Multiple people have reported the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>
>> The minimal fix would be to change the logical & to boolean &&, which emits 
>> the
>> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
>> not worthwile. David Laight provided a nice comparison of disassembly of
>> multiple variants, which shows that the current version produces a 4 deep
>> dependency chain, and fixing the sparse warning by changing logical and to
>> multiplication emits an IMUL, making it even more expensive.
>>
>> The code as rewritten by this patch yielded the best disassembly, with a 
>> single
>> predictable branch for the most common case, and a ternary operator for the
>> rest, which gcc seems to compile without a branch or cmov by itself.
>>
>> The result should be more readable, without a sparse warning and probably 
>> also
>> faster for the common case.
>>
>> Reported-by: Bart Van Assche 
>> Reported-by: Darryl T. Agostinelli 
>> Suggested-by: Andrew Morton 
>> Suggested-by: David Laight 
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Signed-off-by: Vlastimil Babka 
>> ---
>>  include/linux/slab.h | 24 
>>  1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..18c6920c2803 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>>  KMALLOC_RECLAIM,
>>  #ifdef CONFIG_ZONE_DMA
>>  KMALLOC_DMA,
>> +#else
>> +KMALLOC_DMA = KMALLOC_NORMAL,
>>  #endif
>>  NR_KMALLOC_TYPES
>>  };
> 
> I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
> cause NR_KMALLOC_TYPES to have value 1.

Doh, right! Thanks for catching this.

This? Not terribly elegant, but I don't see a nicer way right now...

8<
>From 40b84707e1b5aeccff11bd5f0563bb938e2c22d6 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..ca113d4ecc6f 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -314,22 +314,24 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
+   int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
 
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+#ifdef CONFIG_ZONE_DMA
+   return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
+#else
+   return KMALLOC_RECLAIM;
+#endif
 }
 
 /*
-- 
2.19.1



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Andrew Morton
On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:

> Multiple people have reported the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> The minimal fix would be to change the logical & to boolean &&, which emits 
> the
> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> not worthwile. David Laight provided a nice comparison of disassembly of
> multiple variants, which shows that the current version produces a 4 deep
> dependency chain, and fixing the sparse warning by changing logical and to
> multiplication emits an IMUL, making it even more expensive.
> 
> The code as rewritten by this patch yielded the best disassembly, with a 
> single
> predictable branch for the most common case, and a ternary operator for the
> rest, which gcc seems to compile without a branch or cmov by itself.
> 
> The result should be more readable, without a sparse warning and probably also
> faster for the common case.
> 
> Reported-by: Bart Van Assche 
> Reported-by: Darryl T. Agostinelli 
> Suggested-by: Andrew Morton 
> Suggested-by: David Laight 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Signed-off-by: Vlastimil Babka 
> ---
>  include/linux/slab.h | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..18c6920c2803 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>   KMALLOC_RECLAIM,
>  #ifdef CONFIG_ZONE_DMA
>   KMALLOC_DMA,
> +#else
> + KMALLOC_DMA = KMALLOC_NORMAL,
>  #endif
>   NR_KMALLOC_TYPES
>  };

I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
cause NR_KMALLOC_TYPES to have value 1.


enum foo {
a = 0,
b,
c = 0,
d
}

main()
{
printf("%d %d %d %d\n", a, b, c, d);
}

akpm3> ./a.out
0 1 0 1



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Andrew Morton
On Fri, 9 Nov 2018 09:12:09 +0100 Vlastimil Babka  wrote:

> Multiple people have reported the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> The minimal fix would be to change the logical & to boolean &&, which emits 
> the
> same code, but Andrew has suggested that the branch-avoiding tricks are maybe
> not worthwile. David Laight provided a nice comparison of disassembly of
> multiple variants, which shows that the current version produces a 4 deep
> dependency chain, and fixing the sparse warning by changing logical and to
> multiplication emits an IMUL, making it even more expensive.
> 
> The code as rewritten by this patch yielded the best disassembly, with a 
> single
> predictable branch for the most common case, and a ternary operator for the
> rest, which gcc seems to compile without a branch or cmov by itself.
> 
> The result should be more readable, without a sparse warning and probably also
> faster for the common case.
> 
> Reported-by: Bart Van Assche 
> Reported-by: Darryl T. Agostinelli 
> Suggested-by: Andrew Morton 
> Suggested-by: David Laight 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Signed-off-by: Vlastimil Babka 
> ---
>  include/linux/slab.h | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..18c6920c2803 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -304,6 +304,8 @@ enum kmalloc_cache_type {
>   KMALLOC_RECLAIM,
>  #ifdef CONFIG_ZONE_DMA
>   KMALLOC_DMA,
> +#else
> + KMALLOC_DMA = KMALLOC_NORMAL,
>  #endif
>   NR_KMALLOC_TYPES
>  };

I don't think this works correctly.  Resetting KMALLOC_DMA to 0 will
cause NR_KMALLOC_TYPES to have value 1.


enum foo {
a = 0,
b,
c = 0,
d
}

main()
{
printf("%d %d %d %d\n", a, b, c, d);
}

akpm3> ./a.out
0 1 0 1



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/7/18 11:41 AM, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 06 November 2018 12:51
>>
>> On 11/6/18 12:07 PM, David Laight wrote:
>>> From: Vlastimil Babka [mailto:vba...@suse.cz]
>>> 0020 :
>>>   20:   40 f6 c7 11 test   $0x11,%dil
>>>   24:   75 03   jne29 
>>>   26:   31 c0   xor%eax,%eax
>>>   28:   c3  retq
>>>   29:   83 e7 01and$0x1,%edi
>>>   2c:   83 ff 01cmp$0x1,%edi
>>>   2f:   19 c0   sbb%eax,%eax
>>>   31:   83 c0 02add$0x2,%eax
>>>   34:   c3  retq
>>>
>>> The jne will be predicted not taken and the retq predicted.
>>> So this might only be 1 clock in the normal case.
>>
>> I think this is the winner. It's also a single branch and not two,
>> because the compiler could figure out some of the "clever arithmetics"
>> itself. Care to send a full patch?
> 
> I've not got a suitable source tree lurking.
> So someone else would need to do it.
> I'll waive any copyright that could plausibly be assigned to the above!

There we go. This is to replace the current fix by Bart (sorry) which seems
to add an extra IMUL. Apparently current mainline is spamming anyone running
sparse with lots of warning, so it should be merged soon.

8<
>From ddd2fc6fcba425733f8320413a1451410687c9c3 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..18c6920c2803 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
 #ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
+#else
+   KMALLOC_DMA = KMALLOC_NORMAL,
 #endif
NR_KMALLOC_TYPES
 };
@@ -314,22 +316,20 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
+   int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
 
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
 }
 
 /*
-- 
2.19.1



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-09 Thread Vlastimil Babka
On 11/7/18 11:41 AM, David Laight wrote:
> From: Vlastimil Babka
>> Sent: 06 November 2018 12:51
>>
>> On 11/6/18 12:07 PM, David Laight wrote:
>>> From: Vlastimil Babka [mailto:vba...@suse.cz]
>>> 0020 :
>>>   20:   40 f6 c7 11 test   $0x11,%dil
>>>   24:   75 03   jne29 
>>>   26:   31 c0   xor%eax,%eax
>>>   28:   c3  retq
>>>   29:   83 e7 01and$0x1,%edi
>>>   2c:   83 ff 01cmp$0x1,%edi
>>>   2f:   19 c0   sbb%eax,%eax
>>>   31:   83 c0 02add$0x2,%eax
>>>   34:   c3  retq
>>>
>>> The jne will be predicted not taken and the retq predicted.
>>> So this might only be 1 clock in the normal case.
>>
>> I think this is the winner. It's also a single branch and not two,
>> because the compiler could figure out some of the "clever arithmetics"
>> itself. Care to send a full patch?
> 
> I've not got a suitable source tree lurking.
> So someone else would need to do it.
> I'll waive any copyright that could plausibly be assigned to the above!

There we go. This is to replace the current fix by Bart (sorry) which seems
to add an extra IMUL. Apparently current mainline is spamming anyone running
sparse with lots of warning, so it should be merged soon.

8<
>From ddd2fc6fcba425733f8320413a1451410687c9c3 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Fri, 9 Nov 2018 08:47:12 +0100
Subject: [PATCH] mm, slab: fix sparse warning in kmalloc_type()

Multiple people have reported the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

The minimal fix would be to change the logical & to boolean &&, which emits the
same code, but Andrew has suggested that the branch-avoiding tricks are maybe
not worthwile. David Laight provided a nice comparison of disassembly of
multiple variants, which shows that the current version produces a 4 deep
dependency chain, and fixing the sparse warning by changing logical and to
multiplication emits an IMUL, making it even more expensive.

The code as rewritten by this patch yielded the best disassembly, with a single
predictable branch for the most common case, and a ternary operator for the
rest, which gcc seems to compile without a branch or cmov by itself.

The result should be more readable, without a sparse warning and probably also
faster for the common case.

Reported-by: Bart Van Assche 
Reported-by: Darryl T. Agostinelli 
Suggested-by: Andrew Morton 
Suggested-by: David Laight 
Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Signed-off-by: Vlastimil Babka 
---
 include/linux/slab.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..18c6920c2803 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -304,6 +304,8 @@ enum kmalloc_cache_type {
KMALLOC_RECLAIM,
 #ifdef CONFIG_ZONE_DMA
KMALLOC_DMA,
+#else
+   KMALLOC_DMA = KMALLOC_NORMAL,
 #endif
NR_KMALLOC_TYPES
 };
@@ -314,22 +316,20 @@ kmalloc_caches[NR_KMALLOC_TYPES][KMALLOC_SHIFT_HIGH + 1];
 
 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
+   int gfp_dma = IS_ENABLED(CONFIG_ZONE_DMA) ? __GFP_DMA : 0;
 
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+   /*
+* The most common case is KMALLOC_NORMAL, so test for it
+* with a single branch for both flags.
+*/
+   if (likely((flags & (gfp_dma | __GFP_RECLAIMABLE)) == 0))
+   return KMALLOC_NORMAL;
 
/*
-* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
-* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
+* At least one of the flags has to be set. If both are, __GFP_DMA
+* is more important.
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return flags & gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;
 }
 
 /*
-- 
2.19.1



RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-07 Thread David Laight
From: Vlastimil Babka
> Sent: 06 November 2018 12:51
> 
> On 11/6/18 12:07 PM, David Laight wrote:
> > From: Vlastimil Babka [mailto:vba...@suse.cz]
> >> Sent: 06 November 2018 10:22
> > ...
>  -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>  +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
...
> > I've done some experiments, compiled with gcc 4.7.3 and -O2
> > The constants match those from the kernel headers.
> >
> > It is noticable that there isn't a cmov in sight.
> 
> There is with newer gcc: https://godbolt.org/z/qFdByQ
> 
> But even that didn't remove the imul in f3() so that's indeed a bust.
> 
> > The code would also be better if the KMALLOC constants matched the GFP ones.
> 
> That would be hard, as __GFP flags have also other constraints
> (especially __GFP_DMA relative to other zone restricting __GFP flags)
> and KMALLOC_* are used as array index.

Hmmm...
With only 2 or three values conditionals are probably better than
table lookups - especially if they are function pointers.

> 
> > unsigned int f1(unsigned int flags)
> > {
> > return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 
> > 0 : flags & __GFP_DMA ?
> KMALLOC_DMA : KMALLOC_RECLAIM;
> > }
> >
> 
> ...
> 
> > 0020 :
> >   20:   40 f6 c7 11 test   $0x11,%dil
> >   24:   75 03   jne29 
> >   26:   31 c0   xor%eax,%eax
> >   28:   c3  retq
> >   29:   83 e7 01and$0x1,%edi
> >   2c:   83 ff 01cmp$0x1,%edi
> >   2f:   19 c0   sbb%eax,%eax
> >   31:   83 c0 02add$0x2,%eax
> >   34:   c3  retq
> >
> > The jne will be predicted not taken and the retq predicted.
> > So this might only be 1 clock in the normal case.
> 
> I think this is the winner. It's also a single branch and not two,
> because the compiler could figure out some of the "clever arithmetics"
> itself. Care to send a full patch?

I've not got a suitable source tree lurking.
So someone else would need to do it.
I'll waive any copyright that could plausibly be assigned to the above!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-07 Thread David Laight
From: Vlastimil Babka
> Sent: 06 November 2018 12:51
> 
> On 11/6/18 12:07 PM, David Laight wrote:
> > From: Vlastimil Babka [mailto:vba...@suse.cz]
> >> Sent: 06 November 2018 10:22
> > ...
>  -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>  +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
...
> > I've done some experiments, compiled with gcc 4.7.3 and -O2
> > The constants match those from the kernel headers.
> >
> > It is noticable that there isn't a cmov in sight.
> 
> There is with newer gcc: https://godbolt.org/z/qFdByQ
> 
> But even that didn't remove the imul in f3() so that's indeed a bust.
> 
> > The code would also be better if the KMALLOC constants matched the GFP ones.
> 
> That would be hard, as __GFP flags have also other constraints
> (especially __GFP_DMA relative to other zone restricting __GFP flags)
> and KMALLOC_* are used as array index.

Hmmm...
With only 2 or three values conditionals are probably better than
table lookups - especially if they are function pointers.

> 
> > unsigned int f1(unsigned int flags)
> > {
> > return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 
> > 0 : flags & __GFP_DMA ?
> KMALLOC_DMA : KMALLOC_RECLAIM;
> > }
> >
> 
> ...
> 
> > 0020 :
> >   20:   40 f6 c7 11 test   $0x11,%dil
> >   24:   75 03   jne29 
> >   26:   31 c0   xor%eax,%eax
> >   28:   c3  retq
> >   29:   83 e7 01and$0x1,%edi
> >   2c:   83 ff 01cmp$0x1,%edi
> >   2f:   19 c0   sbb%eax,%eax
> >   31:   83 c0 02add$0x2,%eax
> >   34:   c3  retq
> >
> > The jne will be predicted not taken and the retq predicted.
> > So this might only be 1 clock in the normal case.
> 
> I think this is the winner. It's also a single branch and not two,
> because the compiler could figure out some of the "clever arithmetics"
> itself. Care to send a full patch?

I've not got a suitable source tree lurking.
So someone else would need to do it.
I'll waive any copyright that could plausibly be assigned to the above!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Alexander Duyck
On Tue, Nov 6, 2018 at 9:48 AM Bart Van Assche  wrote:
>
> On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
> > >
> > > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > > If we really don't care then why even bother with the switch statement
> > > > anyway? It seems like you could just do one ternary operator and be
> > > > done with it. Basically all you need is:
> > > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > >
> > > > Why bother with all the extra complexity of the switch statement?
> > >
> > > I don't think that defined() can be used in a C expression. Hence the
> > > IS_ENABLED() macro. If you fix that, leave out four superfluous 
> > > parentheses,
> > > test your patch, post that patch and cc me then I will add my Reviewed-by.
> >
> > Actually the defined macro is used multiple spots in if statements
> > throughout the kernel.
>
> The only 'if (defined(' matches I found in the kernel tree that are not
> preprocessor statements occur in Perl code. Maybe I overlooked something?

You may be right. I think I was thinking of "__is_defined", not "defined".

> > The reason for IS_ENABLED is to address the fact that we can be
> > dealing with macros that indicate if they are built in or a module
> > since those end up being two different defines depending on if you
> > select 'y' or 'm'.
>
> From Documentation/process/coding-style.rst:
>
> Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
> symbol into a C boolean expression, and use it in a normal C conditional:
>
> .. code-block:: c
>
> if (IS_ENABLED(CONFIG_SOMETHING)) {
> ...
> }
>
> Bart.

Right. Part of the reason for suggesting that is that depending on how
you define "CONFIG_SOMETHING" it can actually be defined as
"CONFIG_SOMETHING" or "CONFIG_SOMETHING_MODULE".  I was operating
under the assumption that CONFIG_ZONE_DMA wasn't ever going to be
built as a module.

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Alexander Duyck
On Tue, Nov 6, 2018 at 9:48 AM Bart Van Assche  wrote:
>
> On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
> > >
> > > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > > If we really don't care then why even bother with the switch statement
> > > > anyway? It seems like you could just do one ternary operator and be
> > > > done with it. Basically all you need is:
> > > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > >
> > > > Why bother with all the extra complexity of the switch statement?
> > >
> > > I don't think that defined() can be used in a C expression. Hence the
> > > IS_ENABLED() macro. If you fix that, leave out four superfluous 
> > > parentheses,
> > > test your patch, post that patch and cc me then I will add my Reviewed-by.
> >
> > Actually the defined macro is used multiple spots in if statements
> > throughout the kernel.
>
> The only 'if (defined(' matches I found in the kernel tree that are not
> preprocessor statements occur in Perl code. Maybe I overlooked something?

You may be right. I think I was thinking of "__is_defined", not "defined".

> > The reason for IS_ENABLED is to address the fact that we can be
> > dealing with macros that indicate if they are built in or a module
> > since those end up being two different defines depending on if you
> > select 'y' or 'm'.
>
> From Documentation/process/coding-style.rst:
>
> Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
> symbol into a C boolean expression, and use it in a normal C conditional:
>
> .. code-block:: c
>
> if (IS_ENABLED(CONFIG_SOMETHING)) {
> ...
> }
>
> Bart.

Right. Part of the reason for suggesting that is that depending on how
you define "CONFIG_SOMETHING" it can actually be defined as
"CONFIG_SOMETHING" or "CONFIG_SOMETHING_MODULE".  I was operating
under the assumption that CONFIG_ZONE_DMA wasn't ever going to be
built as a module.

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Bart Van Assche
On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
> > 
> > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > If we really don't care then why even bother with the switch statement
> > > anyway? It seems like you could just do one ternary operator and be
> > > done with it. Basically all you need is:
> > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > 
> > > Why bother with all the extra complexity of the switch statement?
> > 
> > I don't think that defined() can be used in a C expression. Hence the
> > IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> > test your patch, post that patch and cc me then I will add my Reviewed-by.
> 
> Actually the defined macro is used multiple spots in if statements
> throughout the kernel.

The only 'if (defined(' matches I found in the kernel tree that are not
preprocessor statements occur in Perl code. Maybe I overlooked something?

> The reason for IS_ENABLED is to address the fact that we can be
> dealing with macros that indicate if they are built in or a module
> since those end up being two different defines depending on if you
> select 'y' or 'm'.

>From Documentation/process/coding-style.rst:

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

if (IS_ENABLED(CONFIG_SOMETHING)) {
...
}

Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Bart Van Assche
On Tue, 2018-11-06 at 09:20 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
> > 
> > On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > > If we really don't care then why even bother with the switch statement
> > > anyway? It seems like you could just do one ternary operator and be
> > > done with it. Basically all you need is:
> > > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> > > 
> > > Why bother with all the extra complexity of the switch statement?
> > 
> > I don't think that defined() can be used in a C expression. Hence the
> > IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> > test your patch, post that patch and cc me then I will add my Reviewed-by.
> 
> Actually the defined macro is used multiple spots in if statements
> throughout the kernel.

The only 'if (defined(' matches I found in the kernel tree that are not
preprocessor statements occur in Perl code. Maybe I overlooked something?

> The reason for IS_ENABLED is to address the fact that we can be
> dealing with macros that indicate if they are built in or a module
> since those end up being two different defines depending on if you
> select 'y' or 'm'.

>From Documentation/process/coding-style.rst:

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

if (IS_ENABLED(CONFIG_SOMETHING)) {
...
}

Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > If we really don't care then why even bother with the switch statement
> > anyway? It seems like you could just do one ternary operator and be
> > done with it. Basically all you need is:
> > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> >
> > Why bother with all the extra complexity of the switch statement?
>
> I don't think that defined() can be used in a C expression. Hence the
> IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> test your patch, post that patch and cc me then I will add my Reviewed-by.
>
> Bart.

Actually the defined macro is used multiple spots in if statements
throughout the kernel.

The reason for IS_ENABLED is to address the fact that we can be
dealing with macros that indicate if they are built in or a module
since those end up being two different defines depending on if you
select 'y' or 'm'.

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 4:32 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> > If we really don't care then why even bother with the switch statement
> > anyway? It seems like you could just do one ternary operator and be
> > done with it. Basically all you need is:
> > return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> > (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> >
> > Why bother with all the extra complexity of the switch statement?
>
> I don't think that defined() can be used in a C expression. Hence the
> IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
> test your patch, post that patch and cc me then I will add my Reviewed-by.
>
> Bart.

Actually the defined macro is used multiple spots in if statements
throughout the kernel.

The reason for IS_ENABLED is to address the fact that we can be
dealing with macros that indicate if they are built in or a module
since those end up being two different defines depending on if you
select 'y' or 'm'.

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/6/18 12:07 PM, David Laight wrote:
> From: Vlastimil Babka [mailto:vba...@suse.cz]
>> Sent: 06 November 2018 10:22
> ...
 -  return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
 +  return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>
>>> ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
>>> bleating.
>>>
>>> It is also strange that this code is trying so hard here to avoid 
>>> conditional instructions
> 
> I've done some experiments, compiled with gcc 4.7.3 and -O2
> The constants match those from the kernel headers.
> 
> It is noticable that there isn't a cmov in sight.

There is with newer gcc: https://godbolt.org/z/qFdByQ

But even that didn't remove the imul in f3() so that's indeed a bust.

> The code would also be better if the KMALLOC constants matched the GFP ones.

That would be hard, as __GFP flags have also other constraints
(especially __GFP_DMA relative to other zone restricting __GFP flags)
and KMALLOC_* are used as array index.

> unsigned int f1(unsigned int flags)
> {
> return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 
> : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> }
> 

...

> 0020 :
>   20:   40 f6 c7 11 test   $0x11,%dil
>   24:   75 03   jne29 
>   26:   31 c0   xor%eax,%eax
>   28:   c3  retq
>   29:   83 e7 01and$0x1,%edi
>   2c:   83 ff 01cmp$0x1,%edi
>   2f:   19 c0   sbb%eax,%eax
>   31:   83 c0 02add$0x2,%eax
>   34:   c3  retq
> 
> The jne will be predicted not taken and the retq predicted.
> So this might only be 1 clock in the normal case.

I think this is the winner. It's also a single branch and not two,
because the compiler could figure out some of the "clever arithmetics"
itself. Care to send a full patch?



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/6/18 12:07 PM, David Laight wrote:
> From: Vlastimil Babka [mailto:vba...@suse.cz]
>> Sent: 06 November 2018 10:22
> ...
 -  return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
 +  return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>
>>> ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
>>> bleating.
>>>
>>> It is also strange that this code is trying so hard here to avoid 
>>> conditional instructions
> 
> I've done some experiments, compiled with gcc 4.7.3 and -O2
> The constants match those from the kernel headers.
> 
> It is noticable that there isn't a cmov in sight.

There is with newer gcc: https://godbolt.org/z/qFdByQ

But even that didn't remove the imul in f3() so that's indeed a bust.

> The code would also be better if the KMALLOC constants matched the GFP ones.

That would be hard, as __GFP flags have also other constraints
(especially __GFP_DMA relative to other zone restricting __GFP flags)
and KMALLOC_* are used as array index.

> unsigned int f1(unsigned int flags)
> {
> return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 
> : flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
> }
> 

...

> 0020 :
>   20:   40 f6 c7 11 test   $0x11,%dil
>   24:   75 03   jne29 
>   26:   31 c0   xor%eax,%eax
>   28:   c3  retq
>   29:   83 e7 01and$0x1,%edi
>   2c:   83 ff 01cmp$0x1,%edi
>   2f:   19 c0   sbb%eax,%eax
>   31:   83 c0 02add$0x2,%eax
>   34:   c3  retq
> 
> The jne will be predicted not taken and the retq predicted.
> So this might only be 1 clock in the normal case.

I think this is the winner. It's also a single branch and not two,
because the compiler could figure out some of the "clever arithmetics"
itself. Care to send a full patch?



RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread David Laight
From: Vlastimil Babka [mailto:vba...@suse.cz]
> Sent: 06 November 2018 10:22
...
> >> -  return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >> +  return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >
> > ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
> > bleating.
> >
> > It is also strange that this code is trying so hard here to avoid 
> > conditional instructions

I've done some experiments, compiled with gcc 4.7.3 and -O2
The constants match those from the kernel headers.

It is noticable that there isn't a cmov in sight.
The code would also be better if the KMALLOC constants matched the GFP ones.


#define __GFP_DMA 1u
#define __GFP_RECLAIM 0x10u

#define KMALLOC_DMA 2
#define KMALLOC_RECLAIM 1

unsigned int f(unsigned int flags)
{
return flags & __GFP_DMA ? KMALLOC_DMA : flags & __GFP_RECLAIM ? 
KMALLOC_RECLAIM : 0;
}

unsigned int f1(unsigned int flags)
{
return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : 
flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
}

unsigned int f2(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec & !is_dma) * KMALLOC_RECLAIM;
}

unsigned int f3(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
}

Disassembly of section .text:

 :
   0:   40 f6 c7 01 test   $0x1,%dil
   4:   b8 02 00 00 00  mov$0x2,%eax
   9:   74 05   je 10 
   b:   f3 c3   repz retq
   d:   0f 1f 00nopl   (%rax)
  10:   89 f8   mov%edi,%eax
  12:   c1 e8 04shr$0x4,%eax
  15:   83 e0 01and$0x1,%eax
  18:   c3  retq

This one has a misprediced branch in the common path.

0020 :
  20:   40 f6 c7 11 test   $0x11,%dil
  24:   75 03   jne29 
  26:   31 c0   xor%eax,%eax
  28:   c3  retq
  29:   83 e7 01and$0x1,%edi
  2c:   83 ff 01cmp$0x1,%edi
  2f:   19 c0   sbb%eax,%eax
  31:   83 c0 02add$0x2,%eax
  34:   c3  retq

The jne will be predicted not taken and the retq predicted.
So this might only be 1 clock in the normal case.

0040 :
  40:   89 f8   mov%edi,%eax
  42:   c1 ef 04shr$0x4,%edi
  45:   83 e0 01and$0x1,%eax
  48:   89 c2   mov%eax,%edx
  4a:   83 f2 01xor$0x1,%edx
  4d:   21 d7   and%edx,%edi
  4f:   8d 04 47lea(%rdi,%rax,2),%eax
  52:   c3  retq

No conditionals, but a 4 deep dependency chain.

0060 :
  60:   89 fa   mov%edi,%edx
  62:   c1 ef 04shr$0x4,%edi
  65:   83 e2 01and$0x1,%edx
  68:   83 e7 01and$0x1,%edi
  6b:   89 d0   mov%edx,%eax
  6d:   83 f0 01xor$0x1,%eax
  70:   0f af c7imul   %edi,%eax
  73:   8d 04 50lea(%rax,%rdx,2),%eax
  76:   c3  retq

You really don't want the multiply.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread David Laight
From: Vlastimil Babka [mailto:vba...@suse.cz]
> Sent: 06 November 2018 10:22
...
> >> -  return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> >> +  return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >
> > ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
> > bleating.
> >
> > It is also strange that this code is trying so hard here to avoid 
> > conditional instructions

I've done some experiments, compiled with gcc 4.7.3 and -O2
The constants match those from the kernel headers.

It is noticable that there isn't a cmov in sight.
The code would also be better if the KMALLOC constants matched the GFP ones.


#define __GFP_DMA 1u
#define __GFP_RECLAIM 0x10u

#define KMALLOC_DMA 2
#define KMALLOC_RECLAIM 1

unsigned int f(unsigned int flags)
{
return flags & __GFP_DMA ? KMALLOC_DMA : flags & __GFP_RECLAIM ? 
KMALLOC_RECLAIM : 0;
}

unsigned int f1(unsigned int flags)
{
return !__builtin_expect(flags & (__GFP_DMA | __GFP_RECLAIM), 0) ? 0 : 
flags & __GFP_DMA ? KMALLOC_DMA : KMALLOC_RECLAIM;
}

unsigned int f2(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec & !is_dma) * KMALLOC_RECLAIM;
}

unsigned int f3(unsigned int flags)
{
int is_dma, type_dma, is_rec;

is_dma = !!(flags & __GFP_DMA);
type_dma = is_dma * KMALLOC_DMA;
is_rec = !!(flags & __GFP_RECLAIM);

return type_dma + (is_rec * !is_dma) * KMALLOC_RECLAIM;
}

Disassembly of section .text:

 :
   0:   40 f6 c7 01 test   $0x1,%dil
   4:   b8 02 00 00 00  mov$0x2,%eax
   9:   74 05   je 10 
   b:   f3 c3   repz retq
   d:   0f 1f 00nopl   (%rax)
  10:   89 f8   mov%edi,%eax
  12:   c1 e8 04shr$0x4,%eax
  15:   83 e0 01and$0x1,%eax
  18:   c3  retq

This one has a misprediced branch in the common path.

0020 :
  20:   40 f6 c7 11 test   $0x11,%dil
  24:   75 03   jne29 
  26:   31 c0   xor%eax,%eax
  28:   c3  retq
  29:   83 e7 01and$0x1,%edi
  2c:   83 ff 01cmp$0x1,%edi
  2f:   19 c0   sbb%eax,%eax
  31:   83 c0 02add$0x2,%eax
  34:   c3  retq

The jne will be predicted not taken and the retq predicted.
So this might only be 1 clock in the normal case.

0040 :
  40:   89 f8   mov%edi,%eax
  42:   c1 ef 04shr$0x4,%edi
  45:   83 e0 01and$0x1,%eax
  48:   89 c2   mov%eax,%edx
  4a:   83 f2 01xor$0x1,%edx
  4d:   21 d7   and%edx,%edi
  4f:   8d 04 47lea(%rdi,%rax,2),%eax
  52:   c3  retq

No conditionals, but a 4 deep dependency chain.

0060 :
  60:   89 fa   mov%edi,%edx
  62:   c1 ef 04shr$0x4,%edi
  65:   83 e2 01and$0x1,%edx
  68:   83 e7 01and$0x1,%edi
  6b:   89 d0   mov%edx,%eax
  6d:   83 f0 01xor$0x1,%eax
  70:   0f af c7imul   %edi,%eax
  73:   8d 04 50lea(%rax,%rdx,2),%eax
  76:   c3  retq

You really don't want the multiply.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/6/18 11:08 AM, David Laight wrote:
> From: Bart Van Assche
>> Sent: 05 November 2018 20:40
>>
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y

BTW, I wonder why the warnings appeared only now, after maybe months in
linux-next. Don't the various automated testing bots run sparse also on
linux-next?

>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka 
>> Cc: Mel Gorman 
>> Cc: Christoph Lameter 
>> Cc: Roman Gushchin 
>> Signed-off-by: Bart Van Assche 
>> ---
>>  include/linux/slab.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>> kmalloc_type(gfp_t flags)
>>   * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>   * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>   */
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> 
> ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
> bleating.
> 
> It is also strange that this code is trying so hard here to avoid conditional 
> instructions

I primarily wanted to avoid branches in a hot path, not cmovs. Note
those are also not "free" (latency-wise) if the result of cmov is
immediately used for further computation.

> and then uses several to generate the boolean values in the first place.

I'm not sure where exactly?

> OTOH I'd probably write:
>   int gfp_dma = 0;
> 
> #ifdef CONFIG_ZONE_DMA
>   gfp_dma = __GFP_DMA;
> #endif
> 
>   return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? 
> KMALLOC_RECLAIM : 0;

I'm not opposed to this. Christoph might :)

> 
> That might generate cmovs, but is may be better to put unlikely() around both
> conditional expressions. Or redo as:
> 
>   return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & 
> gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

I guess it should be structured so that the fast path is for gfp without
both __GFP_DMA and __GFP_RECLAIMABLE, with a single test+branch. IIRC
that's what Christoph originally requested.

>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
> 



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/6/18 11:08 AM, David Laight wrote:
> From: Bart Van Assche
>> Sent: 05 November 2018 20:40
>>
>> This patch suppresses the following sparse warning:
>>
>> ./include/linux/slab.h:332:43: warning: dubious: x & !y

BTW, I wonder why the warnings appeared only now, after maybe months in
linux-next. Don't the various automated testing bots run sparse also on
linux-next?

>>
>> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
>> Cc: Vlastimil Babka 
>> Cc: Mel Gorman 
>> Cc: Christoph Lameter 
>> Cc: Roman Gushchin 
>> Signed-off-by: Bart Van Assche 
>> ---
>>  include/linux/slab.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/slab.h b/include/linux/slab.h
>> index 918f374e7156..97d0599ddb7b 100644
>> --- a/include/linux/slab.h
>> +++ b/include/linux/slab.h
>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>> kmalloc_type(gfp_t flags)
>>   * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>   * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>   */
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> 
> ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
> bleating.
> 
> It is also strange that this code is trying so hard here to avoid conditional 
> instructions

I primarily wanted to avoid branches in a hot path, not cmovs. Note
those are also not "free" (latency-wise) if the result of cmov is
immediately used for further computation.

> and then uses several to generate the boolean values in the first place.

I'm not sure where exactly?

> OTOH I'd probably write:
>   int gfp_dma = 0;
> 
> #ifdef CONFIG_ZONE_DMA
>   gfp_dma = __GFP_DMA;
> #endif
> 
>   return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? 
> KMALLOC_RECLAIM : 0;

I'm not opposed to this. Christoph might :)

> 
> That might generate cmovs, but is may be better to put unlikely() around both
> conditional expressions. Or redo as:
> 
>   return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & 
> gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

I guess it should be structured so that the fast path is for gfp without
both __GFP_DMA and __GFP_RECLAIMABLE, with a single test+branch. IIRC
that's what Christoph originally requested.

>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)
> 



RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread David Laight
From: Bart Van Assche
> Sent: 05 November 2018 20:40
> 
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 
> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;

ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
bleating.

It is also strange that this code is trying so hard here to avoid conditional 
instructions
and then uses several to generate the boolean values in the first place.

OTOH I'd probably write:
int gfp_dma = 0;

#ifdef CONFIG_ZONE_DMA
gfp_dma = __GFP_DMA;
#endif

return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? 
KMALLOC_RECLAIM : 0;

That might generate cmovs, but is may be better to put unlikely() around both
conditional expressions. Or redo as:

return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & 
gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread David Laight
From: Bart Van Assche
> Sent: 05 November 2018 20:40
> 
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 
> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;

ISTM that changing is_dma and is_reclaimable from int to bool will stop the 
bleating.

It is also strange that this code is trying so hard here to avoid conditional 
instructions
and then uses several to generate the boolean values in the first place.

OTOH I'd probably write:
int gfp_dma = 0;

#ifdef CONFIG_ZONE_DMA
gfp_dma = __GFP_DMA;
#endif

return flags & gfp_dma ? KMALLOC_DMA : flags & __GFP_RECLAIMABLE ? 
KMALLOC_RECLAIM : 0;

That might generate cmovs, but is may be better to put unlikely() around both
conditional expressions. Or redo as:

return !unlikely(flags & (dfp_dma | __GFP_RECLAIMABLE)) ? 0 : flags & 
gfp_dma ? KMALLOC_DMA : KMALLOC_RECLAIM;

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread William Kucharski



> On Nov 5, 2018, at 14:13, Andrew Morton  wrote:
> 
>> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  
>> wrote:
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>> 
>> /*
> 
> I suppose so.
> 
> That function seems too clever for its own good :(.  I wonder if these
> branch-avoiding tricks are really worthwhile.

At the very least I'd like to see some comments added as to why that approach 
was taken for the sake of future maintainers.

William Kucharski




Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread William Kucharski



> On Nov 5, 2018, at 14:13, Andrew Morton  wrote:
> 
>> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  
>> wrote:
>> -return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>> +return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>> }
>> 
>> /*
> 
> I suppose so.
> 
> That function seems too clever for its own good :(.  I wonder if these
> branch-avoiding tricks are really worthwhile.

At the very least I'd like to see some comments added as to why that approach 
was taken for the sake of future maintainers.

William Kucharski




Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/5/18 9:40 PM, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 

Thanks.

Acked-by: Vlastimil Babka 

> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  
>  /*
> 



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-06 Thread Vlastimil Babka
On 11/5/18 9:40 PM, Bart Van Assche wrote:
> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
> Cc: Vlastimil Babka 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Roman Gushchin 
> Signed-off-by: Bart Van Assche 

Thanks.

Acked-by: Vlastimil Babka 

> ---
>  include/linux/slab.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 918f374e7156..97d0599ddb7b 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  
>  /*
> 



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> If we really don't care then why even bother with the switch statement
> anyway? It seems like you could just do one ternary operator and be
> done with it. Basically all you need is:
> return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> 
> Why bother with all the extra complexity of the switch statement?

I don't think that defined() can be used in a C expression. Hence the
IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
test your patch, post that patch and cc me then I will add my Reviewed-by.

Bart.




Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 16:11 -0800, Alexander Duyck wrote:
> If we really don't care then why even bother with the switch statement
> anyway? It seems like you could just do one ternary operator and be
> done with it. Basically all you need is:
> return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
> (flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;
> 
> Why bother with all the extra complexity of the switch statement?

I don't think that defined() can be used in a C expression. Hence the
IS_ENABLED() macro. If you fix that, leave out four superfluous parentheses,
test your patch, post that patch and cc me then I will add my Reviewed-by.

Bart.




Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 4:01 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
> > > How about this version, still untested? My compiler is able to evaluate
> > > the switch expression if the argument is constant.
> > >
> > >  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > >  {
> > > -   int is_dma = 0;
> > > -   int type_dma = 0;
> > > -   int is_reclaimable;
> > > +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > >
> > >  #ifdef CONFIG_ZONE_DMA
> > > -   is_dma = !!(flags & __GFP_DMA);
> > > -   type_dma = is_dma * KMALLOC_DMA;
> > > +   dr |= !!(flags & __GFP_DMA) << 1;
> > >  #endif
> > >
> > > -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > > -
> > > /*
> > >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, 
> > > return
> > >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > >  */
> > > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > > +   switch (dr) {
> > > +   default:
> > > +   case 0:
> > > +   return 0;
> > > +   case 1:
> > > +   return KMALLOC_RECLAIM;
> > > +   case 2:
> > > +   case 3:
> > > +   return KMALLOC_DMA;
> > > +   }
> > >  }
> >
> > Doesn't this defeat the whole point of the code which I thought was to
> > avoid conditional jumps and branches? Also why would you bother with
> > the "dr" value when you could just mask the flags value and switch on
> > that directly?
>
> Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
> bit selection expressions have to be repeated and allows to use a switch
> statement instead of multiple if / else statements.

Really they shouldn't have to be repeated. You essentially have just 3
cases. 0, __GFP_RECLAIMABLE, and the default case.

> Most kmalloc() calls pass a constant to the gfp argument. That allows the
> compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
> and branches only appear when the gfp argument is not a constant. What makes
> you think it is important to optimize for that case?
>
> Bart.

I didn't really think it was all that important to optimize, but I
thought that was what you were trying to maintain with the earlier
patch since it was converting things to a table lookup.

If we really don't care then why even bother with the switch statement
anyway? It seems like you could just do one ternary operator and be
done with it. Basically all you need is:
return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
(flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;

Why bother with all the extra complexity of the switch statement?

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 4:01 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> > On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
> > > How about this version, still untested? My compiler is able to evaluate
> > > the switch expression if the argument is constant.
> > >
> > >  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> > >  {
> > > -   int is_dma = 0;
> > > -   int type_dma = 0;
> > > -   int is_reclaimable;
> > > +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > >
> > >  #ifdef CONFIG_ZONE_DMA
> > > -   is_dma = !!(flags & __GFP_DMA);
> > > -   type_dma = is_dma * KMALLOC_DMA;
> > > +   dr |= !!(flags & __GFP_DMA) << 1;
> > >  #endif
> > >
> > > -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > > -
> > > /*
> > >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, 
> > > return
> > >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> > >  */
> > > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > > +   switch (dr) {
> > > +   default:
> > > +   case 0:
> > > +   return 0;
> > > +   case 1:
> > > +   return KMALLOC_RECLAIM;
> > > +   case 2:
> > > +   case 3:
> > > +   return KMALLOC_DMA;
> > > +   }
> > >  }
> >
> > Doesn't this defeat the whole point of the code which I thought was to
> > avoid conditional jumps and branches? Also why would you bother with
> > the "dr" value when you could just mask the flags value and switch on
> > that directly?
>
> Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
> bit selection expressions have to be repeated and allows to use a switch
> statement instead of multiple if / else statements.

Really they shouldn't have to be repeated. You essentially have just 3
cases. 0, __GFP_RECLAIMABLE, and the default case.

> Most kmalloc() calls pass a constant to the gfp argument. That allows the
> compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
> and branches only appear when the gfp argument is not a constant. What makes
> you think it is important to optimize for that case?
>
> Bart.

I didn't really think it was all that important to optimize, but I
thought that was what you were trying to maintain with the earlier
patch since it was converting things to a table lookup.

If we really don't care then why even bother with the switch statement
anyway? It seems like you could just do one ternary operator and be
done with it. Basically all you need is:
return (defined(CONFIG_ZONE_DMA) && (flags & __GFP_DMA)) ? KMALLOC_DMA :
(flags & __GFP_RECLAIMABLE) ? KMALLOC_RECLAIM : 0;

Why bother with all the extra complexity of the switch statement?

Thanks.

- Alex


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
> > How about this version, still untested? My compiler is able to evaluate
> > the switch expression if the argument is constant.
> > 
> >  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> >  {
> > -   int is_dma = 0;
> > -   int type_dma = 0;
> > -   int is_reclaimable;
> > +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > 
> >  #ifdef CONFIG_ZONE_DMA
> > -   is_dma = !!(flags & __GFP_DMA);
> > -   type_dma = is_dma * KMALLOC_DMA;
> > +   dr |= !!(flags & __GFP_DMA) << 1;
> >  #endif
> > 
> > -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > -
> > /*
> >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> >  */
> > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > +   switch (dr) {
> > +   default:
> > +   case 0:
> > +   return 0;
> > +   case 1:
> > +   return KMALLOC_RECLAIM;
> > +   case 2:
> > +   case 3:
> > +   return KMALLOC_DMA;
> > +   }
> >  }
>
> Doesn't this defeat the whole point of the code which I thought was to
> avoid conditional jumps and branches? Also why would you bother with
> the "dr" value when you could just mask the flags value and switch on
> that directly?

Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
bit selection expressions have to be repeated and allows to use a switch
statement instead of multiple if / else statements.

Most kmalloc() calls pass a constant to the gfp argument. That allows the
compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
and branches only appear when the gfp argument is not a constant. What makes
you think it is important to optimize for that case?

Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 14:48 -0800, Alexander Duyck wrote:
> On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
> > How about this version, still untested? My compiler is able to evaluate
> > the switch expression if the argument is constant.
> > 
> >  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
> >  {
> > -   int is_dma = 0;
> > -   int type_dma = 0;
> > -   int is_reclaimable;
> > +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
> > 
> >  #ifdef CONFIG_ZONE_DMA
> > -   is_dma = !!(flags & __GFP_DMA);
> > -   type_dma = is_dma * KMALLOC_DMA;
> > +   dr |= !!(flags & __GFP_DMA) << 1;
> >  #endif
> > 
> > -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> > -
> > /*
> >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> >  */
> > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > +   switch (dr) {
> > +   default:
> > +   case 0:
> > +   return 0;
> > +   case 1:
> > +   return KMALLOC_RECLAIM;
> > +   case 2:
> > +   case 3:
> > +   return KMALLOC_DMA;
> > +   }
> >  }
>
> Doesn't this defeat the whole point of the code which I thought was to
> avoid conditional jumps and branches? Also why would you bother with
> the "dr" value when you could just mask the flags value and switch on
> that directly?

Storing the relevant bits of 'flags' in the 'dr' variable avoids that the
bit selection expressions have to be repeated and allows to use a switch
statement instead of multiple if / else statements.

Most kmalloc() calls pass a constant to the gfp argument. That allows the
compiler to evaluate kmalloc_type() at compile time. So the conditional jumps
and branches only appear when the gfp argument is not a constant. What makes
you think it is important to optimize for that case?

Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> > Won't that pessimize the cases where gfp is a constant to actually do
> > the table lookup, and add 16 bytes to every translation unit?
> >
> > Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> > kmalloc_caches[] array has size 4, then assign the same dma
> > kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> > dozen pointers in .data), and then just compute kmalloc_type() as
> >
> > ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> > someothershift).
> >
> > Perhaps one could even shuffle the GFP flags so the two shifts are the same.
>
> How about this version, still untested? My compiler is able to evaluate
> the switch expression if the argument is constant.
>
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>  {
> -   int is_dma = 0;
> -   int type_dma = 0;
> -   int is_reclaimable;
> +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
>
>  #ifdef CONFIG_ZONE_DMA
> -   is_dma = !!(flags & __GFP_DMA);
> -   type_dma = is_dma * KMALLOC_DMA;
> +   dr |= !!(flags & __GFP_DMA) << 1;
>  #endif
>
> -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
>  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>  */
> -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> +   switch (dr) {
> +   default:
> +   case 0:
> +   return 0;
> +   case 1:
> +   return KMALLOC_RECLAIM;
> +   case 2:
> +   case 3:
> +   return KMALLOC_DMA;
> +   }
>  }
>
> Bart.

Doesn't this defeat the whole point of the code which I thought was to
avoid conditional jumps and branches? Also why would you bother with
the "dr" value when you could just mask the flags value and switch on
that directly?


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Alexander Duyck
On Mon, Nov 5, 2018 at 2:41 PM Bart Van Assche  wrote:
>
> On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> > Won't that pessimize the cases where gfp is a constant to actually do
> > the table lookup, and add 16 bytes to every translation unit?
> >
> > Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> > kmalloc_caches[] array has size 4, then assign the same dma
> > kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> > dozen pointers in .data), and then just compute kmalloc_type() as
> >
> > ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> > someothershift).
> >
> > Perhaps one could even shuffle the GFP flags so the two shifts are the same.
>
> How about this version, still untested? My compiler is able to evaluate
> the switch expression if the argument is constant.
>
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>  {
> -   int is_dma = 0;
> -   int type_dma = 0;
> -   int is_reclaimable;
> +   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
>
>  #ifdef CONFIG_ZONE_DMA
> -   is_dma = !!(flags & __GFP_DMA);
> -   type_dma = is_dma * KMALLOC_DMA;
> +   dr |= !!(flags & __GFP_DMA) << 1;
>  #endif
>
> -   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
> /*
>  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>  */
> -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> +   switch (dr) {
> +   default:
> +   case 0:
> +   return 0;
> +   case 1:
> +   return KMALLOC_RECLAIM;
> +   case 2:
> +   case 3:
> +   return KMALLOC_DMA;
> +   }
>  }
>
> Bart.

Doesn't this defeat the whole point of the code which I thought was to
avoid conditional jumps and branches? Also why would you bother with
the "dr" value when you could just mask the flags value and switch on
that directly?


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> Won't that pessimize the cases where gfp is a constant to actually do
> the table lookup, and add 16 bytes to every translation unit?
> 
> Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> kmalloc_caches[] array has size 4, then assign the same dma
> kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> dozen pointers in .data), and then just compute kmalloc_type() as
> 
> ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> someothershift).
> 
> Perhaps one could even shuffle the GFP flags so the two shifts are the same.

How about this version, still untested? My compiler is able to evaluate
the switch expression if the argument is constant.

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
+   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
 
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
+   dr |= !!(flags & __GFP_DMA) << 1;
 #endif
 
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   switch (dr) {
+   default:
+   case 0:
+   return 0;
+   case 1:
+   return KMALLOC_RECLAIM;
+   case 2:
+   case 3:
+   return KMALLOC_DMA;
+   }
 }
 
Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 23:14 +0100, Rasmus Villemoes wrote:
> Won't that pessimize the cases where gfp is a constant to actually do
> the table lookup, and add 16 bytes to every translation unit?
> 
> Another option is to add a fake KMALLOC_DMA_RECLAIM so the
> kmalloc_caches[] array has size 4, then assign the same dma
> kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
> dozen pointers in .data), and then just compute kmalloc_type() as
> 
> ((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
> someothershift).
> 
> Perhaps one could even shuffle the GFP flags so the two shifts are the same.

How about this version, still untested? My compiler is able to evaluate
the switch expression if the argument is constant.

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
+   unsigned int dr = !!(flags & __GFP_RECLAIMABLE);
 
 #ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
+   dr |= !!(flags & __GFP_DMA) << 1;
 #endif
 
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   switch (dr) {
+   default:
+   case 0:
+   return 0;
+   case 1:
+   return KMALLOC_RECLAIM;
+   case 2:
+   case 3:
+   return KMALLOC_DMA;
+   }
 }
 
Bart.


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Rasmus Villemoes
On 2018-11-05 22:48, Bart Van Assche wrote:
> On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
>> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  
>> wrote:
>>
>>> This patch suppresses the following sparse warning:
>>>
>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>
>>> ...
>>>
>>> --- a/include/linux/slab.h
>>> +++ b/include/linux/slab.h
>>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>>> kmalloc_type(gfp_t flags)
>>>  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>>  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>>  */
>>> -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>> +   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>  }
>>>  
>>>  /*
>>
>> I suppose so.
>>
>> That function seems too clever for its own good :(.  I wonder if these
>> branch-avoiding tricks are really worthwhile.
> 
> From what I have seen in gcc disassembly it seems to me like gcc uses the
> cmov instruction to implement e.g. the ternary operator (?:). So I think none
> of the cleverness in kmalloc_type() is really necessary to avoid conditional
> branches. I think this function would become much more readable when using a
> switch statement or when rewriting it as follows (untested):
> 
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>  {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> -
> -#ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> -#endif
> -
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
>   /*
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + static const enum kmalloc_cache_type flags_to_type[2][2] = {
> + { 0,KMALLOC_RECLAIM },
> + { KMALLOC_DMA,  KMALLOC_DMA },
> + };
> +#ifdef CONFIG_ZONE_DMA
> + bool is_dma = !!(flags & __GFP_DMA);
> +#endif
> + bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> +
> + return flags_to_type[is_dma][is_reclaimable];
>  }
> 

Won't that pessimize the cases where gfp is a constant to actually do
the table lookup, and add 16 bytes to every translation unit?

Another option is to add a fake KMALLOC_DMA_RECLAIM so the
kmalloc_caches[] array has size 4, then assign the same dma
kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
dozen pointers in .data), and then just compute kmalloc_type() as

((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
someothershift).

Perhaps one could even shuffle the GFP flags so the two shifts are the same.

Rasmus


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Rasmus Villemoes
On 2018-11-05 22:48, Bart Van Assche wrote:
> On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
>> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  
>> wrote:
>>
>>> This patch suppresses the following sparse warning:
>>>
>>> ./include/linux/slab.h:332:43: warning: dubious: x & !y
>>>
>>> ...
>>>
>>> --- a/include/linux/slab.h
>>> +++ b/include/linux/slab.h
>>> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
>>> kmalloc_type(gfp_t flags)
>>>  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>>>  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>>>  */
>>> -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
>>> +   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>>>  }
>>>  
>>>  /*
>>
>> I suppose so.
>>
>> That function seems too clever for its own good :(.  I wonder if these
>> branch-avoiding tricks are really worthwhile.
> 
> From what I have seen in gcc disassembly it seems to me like gcc uses the
> cmov instruction to implement e.g. the ternary operator (?:). So I think none
> of the cleverness in kmalloc_type() is really necessary to avoid conditional
> branches. I think this function would become much more readable when using a
> switch statement or when rewriting it as follows (untested):
> 
>  static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
>  {
> - int is_dma = 0;
> - int type_dma = 0;
> - int is_reclaimable;
> -
> -#ifdef CONFIG_ZONE_DMA
> - is_dma = !!(flags & __GFP_DMA);
> - type_dma = is_dma * KMALLOC_DMA;
> -#endif
> -
> - is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> -
>   /*
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + static const enum kmalloc_cache_type flags_to_type[2][2] = {
> + { 0,KMALLOC_RECLAIM },
> + { KMALLOC_DMA,  KMALLOC_DMA },
> + };
> +#ifdef CONFIG_ZONE_DMA
> + bool is_dma = !!(flags & __GFP_DMA);
> +#endif
> + bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
> +
> + return flags_to_type[is_dma][is_reclaimable];
>  }
> 

Won't that pessimize the cases where gfp is a constant to actually do
the table lookup, and add 16 bytes to every translation unit?

Another option is to add a fake KMALLOC_DMA_RECLAIM so the
kmalloc_caches[] array has size 4, then assign the same dma
kmalloc_cache pointer to [2][i] and [3][i] (so that costs perhaps a
dozen pointers in .data), and then just compute kmalloc_type() as

((flags & __GFP_RECLAIMABLE) >> someshift) | ((flags & __GFP_DMA) >>
someothershift).

Perhaps one could even shuffle the GFP flags so the two shifts are the same.

Rasmus


Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  wrote:
> 
> > This patch suppresses the following sparse warning:
> > 
> > ./include/linux/slab.h:332:43: warning: dubious: x & !y
> > 
> > ...
> > 
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> > kmalloc_type(gfp_t flags)
> >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> >  */
> > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > +   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >  }
> >  
> >  /*
> 
> I suppose so.
> 
> That function seems too clever for its own good :(.  I wonder if these
> branch-avoiding tricks are really worthwhile.

>From what I have seen in gcc disassembly it seems to me like gcc uses the
cmov instruction to implement e.g. the ternary operator (?:). So I think none
of the cleverness in kmalloc_type() is really necessary to avoid conditional
branches. I think this function would become much more readable when using a
switch statement or when rewriting it as follows (untested):

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   static const enum kmalloc_cache_type flags_to_type[2][2] = {
+   { 0,KMALLOC_RECLAIM },
+   { KMALLOC_DMA,  KMALLOC_DMA },
+   };
+#ifdef CONFIG_ZONE_DMA
+   bool is_dma = !!(flags & __GFP_DMA);
+#endif
+   bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+
+   return flags_to_type[is_dma][is_reclaimable];
 }



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
On Mon, 2018-11-05 at 13:13 -0800, Andrew Morton wrote:
> On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  wrote:
> 
> > This patch suppresses the following sparse warning:
> > 
> > ./include/linux/slab.h:332:43: warning: dubious: x & !y
> > 
> > ...
> > 
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> > kmalloc_type(gfp_t flags)
> >  * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
> >  * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
> >  */
> > -   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> > +   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
> >  }
> >  
> >  /*
> 
> I suppose so.
> 
> That function seems too clever for its own good :(.  I wonder if these
> branch-avoiding tricks are really worthwhile.

>From what I have seen in gcc disassembly it seems to me like gcc uses the
cmov instruction to implement e.g. the ternary operator (?:). So I think none
of the cleverness in kmalloc_type() is really necessary to avoid conditional
branches. I think this function would become much more readable when using a
switch statement or when rewriting it as follows (untested):

 static __always_inline enum kmalloc_cache_type kmalloc_type(gfp_t flags)
 {
-   int is_dma = 0;
-   int type_dma = 0;
-   int is_reclaimable;
-
-#ifdef CONFIG_ZONE_DMA
-   is_dma = !!(flags & __GFP_DMA);
-   type_dma = is_dma * KMALLOC_DMA;
-#endif
-
-   is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
-
/*
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   static const enum kmalloc_cache_type flags_to_type[2][2] = {
+   { 0,KMALLOC_RECLAIM },
+   { KMALLOC_DMA,  KMALLOC_DMA },
+   };
+#ifdef CONFIG_ZONE_DMA
+   bool is_dma = !!(flags & __GFP_DMA);
+#endif
+   bool is_reclaimable = !!(flags & __GFP_RECLAIMABLE);
+
+   return flags_to_type[is_dma][is_reclaimable];
 }



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Andrew Morton
On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  wrote:

> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> ...
>
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  
>  /*

I suppose so.

That function seems too clever for its own good :(.  I wonder if these
branch-avoiding tricks are really worthwhile.



Re: [PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Andrew Morton
On Mon,  5 Nov 2018 12:40:00 -0800 Bart Van Assche  wrote:

> This patch suppresses the following sparse warning:
> 
> ./include/linux/slab.h:332:43: warning: dubious: x & !y
> 
> ...
>
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
> kmalloc_type(gfp_t flags)
>* If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
>* KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
>*/
> - return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
> + return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
>  }
>  
>  /*

I suppose so.

That function seems too clever for its own good :(.  I wonder if these
branch-avoiding tricks are really worthwhile.



[PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
This patch suppresses the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Cc: Vlastimil Babka 
Cc: Mel Gorman 
Cc: Christoph Lameter 
Cc: Roman Gushchin 
Signed-off-by: Bart Van Assche 
---
 include/linux/slab.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..97d0599ddb7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
 }
 
 /*
-- 
2.19.1.930.g4563a0d9d0-goog



[PATCH] slab.h: Avoid using & for logical and of booleans

2018-11-05 Thread Bart Van Assche
This patch suppresses the following sparse warning:

./include/linux/slab.h:332:43: warning: dubious: x & !y

Fixes: 1291523f2c1d ("mm, slab/slub: introduce kmalloc-reclaimable caches")
Cc: Vlastimil Babka 
Cc: Mel Gorman 
Cc: Christoph Lameter 
Cc: Roman Gushchin 
Signed-off-by: Bart Van Assche 
---
 include/linux/slab.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 918f374e7156..97d0599ddb7b 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -329,7 +329,7 @@ static __always_inline enum kmalloc_cache_type 
kmalloc_type(gfp_t flags)
 * If an allocation is both __GFP_DMA and __GFP_RECLAIMABLE, return
 * KMALLOC_DMA and effectively ignore __GFP_RECLAIMABLE
 */
-   return type_dma + (is_reclaimable & !is_dma) * KMALLOC_RECLAIM;
+   return type_dma + is_reclaimable * !is_dma * KMALLOC_RECLAIM;
 }
 
 /*
-- 
2.19.1.930.g4563a0d9d0-goog