            Bug ID: 34620
           Summary: redundant pand after vector shift of a byte vector
           Product: libraries
           Version: 4.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86

I have a vector of bytes and want to increment the vector elements according to
certain bits of another vector like so:

define <16 x i8> @add_bitset_to_vector(<16 x i8>, <16 x i8>) {
  %v0 = lshr <16 x i8> %0, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1,
                            i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
  %v1 = and <16 x i8> %v0, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1,
                            i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
  %v2 = add <16 x i8> %v1, %1
  ret <16 x i8> %v2
}

On X86-64 llc emits:

        .section        .rodata.cst16,"aM",@progbits,16
        .p2align        4
        .zero   16,127
        .zero   16,1
        .globl  add_bitset_to_vector
        .p2align        4, 0x90
        .type   add_bitset_to_vector,@function
add_bitset_to_vector:                   # @add_bitset_to_vector
# BB#0:                                 # %_L1
        psrlw   $1, %xmm0
        pand    .LCPI2_0(%rip), %xmm0
        pand    .LCPI2_1(%rip), %xmm0
        paddb   %xmm1, %xmm0
        .size   add_bitset_to_vector, .Lfunc_end2-add_bitset_to_vector

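That is, the X86 backend lowers the byte vector shift to a word vector shift
(psrlw) plus a pand with 127, which clears the bit shifted into each byte from
its neighbour. However, my own and with 1 keeps only the least significant bit
of each byte and so already clears that bit, which makes the pand generated
for the lshr redundant.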
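
To double-check the claim, here is a minimal scalar sketch in C that models a
single 16-bit lane of the sequence above. It is not LLVM code: the function
names are made up for illustration, and the masks 0x7F7F and 0x0101 are the two
constant-pool values (127 and 1 per byte) viewed as one word, assuming the
constants are used in the order they are listed. It compares the two-pand
sequence and a one-pand sequence against the per-byte semantics of the IR for
every possible lane value.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit lane of psrlw $1: a logical right shift of the whole word,
   so bit 0 of the high byte ends up in bit 7 of the low byte. */
uint16_t psrlw1(uint16_t lane) { return (uint16_t)(lane >> 1); }

/* Current output: shift, clear the leaked bit (127 per byte),
   then apply the user's mask (1 per byte). */
uint16_t lane_two_pands(uint16_t lane) {
    return (uint16_t)((psrlw1(lane) & 0x7F7F) & 0x0101);
}

/* Suggested output: shift, then only the user's mask. */
uint16_t lane_one_pand(uint16_t lane) {
    return (uint16_t)(psrlw1(lane) & 0x0101);
}

/* Reference: per-byte semantics of the IR (lshr by 1, then and with 1). */
uint16_t lane_reference(uint16_t lane) {
    uint8_t lo = (uint8_t)(lane & 0xFF);
    uint8_t hi = (uint8_t)(lane >> 8);
    lo = (uint8_t)((lo >> 1) & 1);
    hi = (uint8_t)((hi >> 1) & 1);
    return (uint16_t)((hi << 8) | lo);
}

/* Exhaustively check all 65536 lane values. */
void check_all_lanes(void) {
    for (uint32_t w = 0; w <= 0xFFFF; ++w) {
        uint16_t lane = (uint16_t)w;
        assert(lane_two_pands(lane) == lane_reference(lane));
        assert(lane_one_pand(lane) == lane_reference(lane));
    }
}
```

Calling check_all_lanes() from a main function should pass without tripping an
assertion, since 127 & 1 == 1 in every byte, so the two masks fold into the
second one.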
