https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114512

            Bug ID: 114512
           Summary: Optimization on "test if bit N is set" pattern ((C >>
                    x) & 1)
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

It should be a common code pattern for "checking if bit N of a value is set"
(See these web pages:
https://stackoverflow.com/questions/523724
https://www.geeksforgeeks.org/check-whether-k-th-bit-set-not/ )

And I have at least 3 ways to do the thing:

```c
/* 1. */ ((value >> count) & 1)

/* 2. */ !!(value & (1 << count))

/* 3. */
/* 'bit_reversed_value' is a constant */
(SignedType)(bit_reversed_value << count) < 0
```

For a compiler, it's better to canonicalize the patterns so that they all
generate the same, optimized code for the target architectures.

## Sample code

The example is modified from a real case:
https://github.com/htop-dev/htop/blob/484f029d8b0e0dbb5df37e3aff0dbf059fd857a1/linux/LinuxProcessTable.c#L96

```c
#include <stdbool.h>
#include <stdint.h>

bool my_isxdigit_1(unsigned char ch) {
  uint32_t mask1 = 0x03FF007E;
  if (!((mask1 >> (ch & 0x1F)) & 1))
    return false;

  uint32_t mask2 = 0x58;
  if (!((mask2 >> (ch >> 4)) & 1))
    return false;

  return true;
}

bool my_isxdigit_2(unsigned char ch) {
  uint32_t mask1 = 0x03FF007E;
  if (!(mask1 & (1L << (ch & 0x1F))))
    return false;

  uint32_t mask2 = 0x58;
  if (!(mask2 & (1L << (ch >> 4))))
    return false;

  return true;
}

bool my_isxdigit_3(unsigned char ch) {
  uint32_t mask1 = 0x7E00FFC0;
  if (!((mask1 << (ch & 0x1F)) >> 31))
    return false;

  uint32_t mask2 = 0x1A << 24;
  if (!((mask2 << (ch >> 4)) >> 31))
    return false;

  return true;
}
```

All three functions do the same thing. Sometimes `my_isxdigit_1` and
`my_isxdigit_2` generate the same code, and for some target architectures they
do not. (`my_isxdigit_2` is typically larger according to my testing.)

(In x86 Clang and for `my_isxdigit_1` and `my_isxdigit_2` functions, it might
utilize the "bt" instructions available in 80386 or newer processors.)

The `my_isxdigit_3` case may require transformation of the constants, but for
some targets (ARM and RISC-V) and when the function is inlined in a conditional
like "`if (my_isxdigit_3(ch)) { putchar(ch); }`", it might generate smaller
code than `my_isxdigit_1` or `my_isxdigit_2`.

(`my_isxdigit_3` utilizes the sign bit after shift to save a bitwise AND on
some architectures. I found it might work for ARM, but didn't work well for x86
as I initially expected.)

The compiler should be free to choose which form of the code to emit if the
three patterns are canonicalized.

Reply via email to