https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121662
Bug ID: 121662
Summary: Unnecessary data dependant branches with avx512 masks
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: andi-gcc at firstfloor dot org
Target Milestone: ---
Target: x86_64

/* { dg-do compile } */
/* { dg-options " -mavx512vl -mavx512bw -mavx512f -O3" } */

void
ubyteshiftl_mask (unsigned char *a, int len)
{
  int i;
  for (i = 0; i < len; i++)
    if (a[i] & 1)
      a[i] <<= 5;
}

generates

        kortestq %k1, %k1
        je .L4
        vpsllw $5, %zmm0, %zmm0
        vpandq %zmm0, %zmm3, %zmm0
        vmovdqu8 %zmm0, (%rdx){%k1}
.L4:

The branch is unnecessary because the mask already does all the work
(although it should perhaps be applied to the computation too).

The problem is that if a[i] & 1 is unpredictable, this will slow down the
loop due to branch mispredictions. If we have masks, we should avoid
data-dependent branches, because a masked operation needs no prediction.
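
To illustrate what "the mask does all the work" means, here is a minimal scalar sketch in C that expresses the same per-element select without a data-dependent branch. The function names ubyteshiftl_branchy and ubyteshiftl_branchless are hypothetical and not part of the report. Unlike the masked store in the generated code, this sketch writes every element back unconditionally, so it only approximates the masked behavior and is not a transformation the compiler could apply blindly.

```c
/* Reference: the loop from the report, renamed for comparison. */
void ubyteshiftl_branchy(unsigned char *a, int len)
{
    for (int i = 0; i < len; i++)
        if (a[i] & 1)
            a[i] <<= 5;
}

/* Hypothetical branch-free variant: build a per-element byte mask
   (0x00 or 0xFF) from the low bit and use it to select between the
   shifted and the original value.  Assumption: writing back unchanged
   elements is acceptable, which a real masked store would avoid. */
void ubyteshiftl_branchless(unsigned char *a, int len)
{
    for (int i = 0; i < len; i++) {
        unsigned char v = a[i];
        unsigned char m = (unsigned char)-(v & 1);        /* 0x00 or 0xFF */
        unsigned char shifted = (unsigned char)(v << 5);  /* truncated to 8 bits */
        a[i] = (unsigned char)((shifted & m) | (v & (unsigned char)~m));
    }
}
```

Both functions should produce identical array contents for any input. The second has no control flow that depends on the data, which is the property the report asks for in the vectorized loop.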