https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108370

            Bug ID: 108370
           Summary: gcc doesn't merge bitwise-AND if an explicit
                    comparison against 0 is given
           Product: gcc
           Version: 12.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dhowells at redhat dot com
  Target Milestone: ---

Created attachment 54245
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54245&action=edit
Demo code

If gcc sees two calls to an inline function that does a bitwise-AND and returns
whether the result was zero or non-zero (e.g. a flag check helper), gcc fails to
merge the two tests into one if the result of the AND is explicitly compared
against 0, even though the function's return type is bool (and the conversion
to bool performs that comparison implicitly anyway).  For example:

   static inline bool bio_flagged(struct bio *bio, unsigned int bit)
   {
        return (bio->bi_flags & (1U << bit)) != 0;
   }

   void bio_release_pages(struct bio *bio, bool mark_dirty)
   {
        if (bio_flagged(bio, BIO_PAGE_REFFED) ||
            bio_flagged(bio, BIO_PAGE_PINNED))
                __bio_release_pages(bio, mark_dirty);
   }

compiles bio_release_pages() to:

   0:   66 8b 07                mov    (%rdi),%ax
   3:   a8 01                   test   $0x1,%al
   5:   75 04                   jne    b <bio_release_pages+0xb>
   7:   a8 02                   test   $0x2,%al
   9:   74 09                   je     14 <bio_release_pages+0x14>
   b:   40 0f b6 f6             movzbl %sil,%esi
   f:   e9 00 00 00 00          jmp    14 <bio_release_pages+0x14>
  14:   c3                      ret    

but omitting the explicit comparison:

   static inline bool bio_flagged(struct bio *bio, unsigned int bit)
   {
        return bio->bi_flags & (1U << bit);
   }

gives:

   0:   f6 07 03                testb  $0x3,(%rdi)
   3:   74 09                   je     e <bio_release_pages+0xe>
   5:   40 0f b6 f6             movzbl %sil,%esi
   9:   e9 00 00 00 00          jmp    e <bio_release_pages+0xe>
   e:   c3                      ret    

Since the conversion to bool already tests for non-zero, the explicit comparison
against 0 is redundant and could possibly be optimised away.
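
To illustrate why the merge should be safe in both cases, here is a minimal
sketch that checks the two forms of the helper against a hand-merged single
AND for every 16-bit flag value.  The struct layout, the bit numbers and the
names flagged_explicit(), flagged_implicit(), either_flag_merged(),
count_mismatches() and check_forms_agree() are assumptions made for
illustration (inferred from the disassembly above), not taken from the
attachment or the kernel headers:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel definitions: bits 0 and 1 are
 * inferred from the test $0x1 / test $0x2 instructions, and the 16-bit
 * field from the mov (%rdi),%ax load. */
enum { BIO_PAGE_REFFED = 0, BIO_PAGE_PINNED = 1 };

struct bio {
    unsigned short bi_flags;
};

/* Explicit comparison against 0 (the form that is not merged). */
static inline bool flagged_explicit(const struct bio *bio, unsigned int bit)
{
    return (bio->bi_flags & (1U << bit)) != 0;
}

/* Implicit conversion to bool (the form that is merged). */
static inline bool flagged_implicit(const struct bio *bio, unsigned int bit)
{
    return bio->bi_flags & (1U << bit);
}

/* Hand-merged equivalent of testb $0x3: one AND against both bits. */
static inline bool either_flag_merged(const struct bio *bio)
{
    return bio->bi_flags &
           ((1U << BIO_PAGE_REFFED) | (1U << BIO_PAGE_PINNED));
}

/* Count the 16-bit flag values for which the three forms disagree. */
unsigned int count_mismatches(void)
{
    unsigned int mismatches = 0;

    for (unsigned int v = 0; v <= 0xffff; v++) {
        struct bio b = { .bi_flags = (unsigned short)v };
        bool e = flagged_explicit(&b, BIO_PAGE_REFFED) ||
                 flagged_explicit(&b, BIO_PAGE_PINNED);
        bool i = flagged_implicit(&b, BIO_PAGE_REFFED) ||
                 flagged_implicit(&b, BIO_PAGE_PINNED);
        bool m = either_flag_merged(&b);

        if (e != i || e != m)
            mismatches++;
    }
    return mismatches;
}

/* Aborts if any flag value distinguishes the three forms. */
void check_forms_agree(void)
{
    assert(count_mismatches() == 0);
}
```

This only shows that the three forms compute the same result; it says nothing
about where in gcc the merge is missed.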

I've attached some demo code that can be compiled with one of:

gcc -Os -c gcc-bool-demo.c
gcc -Os -c gcc-bool-demo.c -Dfix
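
For readers without access to the attachment, a file switched by -Dfix could
look roughly like the sketch below.  This is a hypothetical reconstruction,
not the attached gcc-bool-demo.c: the struct layout and bit numbers are
inferred from the disassembly, and __bio_release_pages() is replaced by a
noinline stub with a call counter (release_calls, an invented name) so that
the file is self-contained.  Whether it reproduces the exact listings above
has not been verified:

```c
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel definitions: bits 0 and 1 are
 * inferred from test $0x1 / test $0x2 in the disassembly. */
enum { BIO_PAGE_REFFED = 0, BIO_PAGE_PINNED = 1 };

struct bio {
    unsigned short bi_flags;    /* 16-bit field at offset 0 (assumed) */
};

/* Call counter for the stub below; an illustration aid only. */
unsigned int release_calls;

/* Stub standing in for the real function; noinline (a GCC attribute)
 * keeps the call visible in the generated code. */
__attribute__((noinline))
void __bio_release_pages(struct bio *bio, bool mark_dirty)
{
    (void)bio;
    (void)mark_dirty;
    release_calls++;
}

static inline bool bio_flagged(struct bio *bio, unsigned int bit)
{
#ifdef fix
    /* Implicit conversion to bool. */
    return bio->bi_flags & (1U << bit);
#else
    /* Explicit comparison against 0. */
    return (bio->bi_flags & (1U << bit)) != 0;
#endif
}

void bio_release_pages(struct bio *bio, bool mark_dirty)
{
    if (bio_flagged(bio, BIO_PAGE_REFFED) ||
        bio_flagged(bio, BIO_PAGE_PINNED))
        __bio_release_pages(bio, mark_dirty);
}
```

The resulting object file can be inspected with objdump -d to compare the two
variants.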

The gcc I used above is the Fedora 37 system compiler:

gcc-12.2.1-4.fc37.x86_64
binutils-2.38-25.fc37.x86_64

but similar results can be seen with the Fedora ARM cross-compiler, first with
the explicit comparison and then without it:

   0:   e1d030b0        ldrh    r3, [r0]
   4:   e3130001        tst     r3, #1
   8:   1a000001        bne     14 <bio_release_pages+0x14>
   c:   e3130002        tst     r3, #2
  10:   012fff1e        bxeq    lr
  14:   eafffffe        b       0 <__bio_release_pages>

vs

   0:   e1d030b0        ldrh    r3, [r0]
   4:   e3130003        tst     r3, #3
   8:   012fff1e        bxeq    lr
   c:   eafffffe        b       0 <__bio_release_pages>
