https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121662

            Bug ID: 121662
           Summary: Unnecessary data dependent branches with avx512 masks
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---
            Target: x86_64

/* { dg-do compile } */
/* { dg-options " -mavx512vl -mavx512bw -mavx512f -O3" } */

void
ubyteshiftl_mask (unsigned char *a, int len)
{
  int i;
  for (i = 0; i < len; i++)
    if (a[i] & 1)
      a[i] <<= 5;
}

generates

        kortestq        %k1, %k1
        je      .L4
        vpsllw  $5, %zmm0, %zmm0
        vpandq  %zmm0, %zmm3, %zmm0
        vmovdqu8        %zmm0, (%rdx){%k1}
.L4:

The branch is unnecessary because the mask already does all the work: when
%k1 is all zeroes the masked vmovdqu8 store writes nothing (although the mask
should perhaps be applied to the computation too).

The problem is that if a[i] & 1 is unpredictable, the branch will slow down the
loop through branch mispredictions. When masks are available we should avoid
data-dependent branches, because the masked form needs no prediction.
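
For illustration only, here is a scalar sketch of the same idea in portable C:
the condition is turned into a per-element mask and used to select the result,
so no branch depends on the data. The name ubyteshiftl_select is hypothetical
and not part of the testcase. Unlike the masked vmovdqu8 store, this variant
writes every element back (unchanged when bit 0 is clear), so it is only an
analogue of the masked store, not a description of what GCC should emit.

```c
/* Hypothetical branch-free variant of ubyteshiftl_mask (illustration only).  */
void
ubyteshiftl_select (unsigned char *a, int len)
{
  int i;
  for (i = 0; i < len; i++)
    {
      /* m is 0xff when bit 0 of a[i] is set, 0x00 otherwise.  */
      unsigned char m = (unsigned char) -(a[i] & 1);
      /* Shift and truncate back to 8 bits, as a[i] <<= 5 would.  */
      unsigned char shifted = (unsigned char) (a[i] << 5);
      /* Pick the shifted or the original value without a branch.  */
      a[i] = (unsigned char) ((shifted & m) | (a[i] & ~m));
    }
}
```

For every input this leaves the array with the same contents as the original
loop; the difference is only in which elements are stored to.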
