Issue 178403
Summary [SLP] Handle zext(i1 -> i32) and select(i1, i32 1, i32 0) equivalency
Labels llvm:SLPVectorizer, missed-optimization
Assignees
Reporter RKSimon
    Pulled out of #96395
```rust
pub unsafe fn ascii_prefix(input: &[u8]) -> usize {
    let mut mask = 0_u16;
    for i in 0..16 {
        mask |= ((*input.get_unchecked(i) < 128) as u16) << i;
    }
    mask.trailing_ones() as usize
}
```
By the time it gets to SLP we see this pattern:
```ll
  %0 = load i8, ptr %input, align 1
  %cmp3 = icmp sgt i8 %0, -1
  %1 = zext i1 %cmp3 to i16

  %arrayidx.1 = getelementptr inbounds nuw i8, ptr %input, i64 1
  %2 = load i8, ptr %arrayidx.1, align 1
  %cmp3.1 = icmp sgt i8 %2, -1
  %3 = select i1 %cmp3.1, i16 2, i16 0
  %conv9.1 = or disjoint i16 %3, %1

  %arrayidx.2 = getelementptr inbounds nuw i8, ptr %input, i64 2
  %4 = load i8, ptr %arrayidx.2, align 1
  %cmp3.2 = icmp sgt i8 %4, -1
  %5 = select i1 %cmp3.2, i16 4, i16 0
  %conv9.2 = or disjoint i16 %conv9.1, %5
  .....
```
Loop unrolling has constant-folded the shifts entirely and replaced them with icmp+select pairs, except for i == 0, which has become an icmp+zext. That one mismatched lane prevents SLP's copyable / same-opcode handling from cleaning everything up.
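The equivalence named in the summary can be checked exhaustively at the scalar level. The sketch below is illustrative only and is not SLPVectorizer code; `lane_select`, `lane_zext`, and `forms_agree` are hypothetical names. It models lane `i` either as `select i1 %c, i16 (1 << i), i16 0` or as a `zext i1 %c to i16` shifted left by `i`. For i == 0 the second form is the plain zext seen in the IR above.

```rust
/// Lane value in the `select` form: `select i1 %c, i16 (1 << i), i16 0`.
pub fn lane_select(c: bool, i: u32) -> u16 {
    if c { 1u16 << i } else { 0 }
}

/// Lane value in the `zext` form: `zext i1 %c to i16`, shifted left by `i`.
/// With i == 0 the shift is a no-op, leaving only the zext.
pub fn lane_zext(c: bool, i: u32) -> u16 {
    (c as u16) << i
}

/// Checks both forms against each other for every condition value
/// and every bit position used by the 16-lane example.
pub fn forms_agree() -> bool {
    [false, true]
        .iter()
        .all(|&c| (0..16).all(|i| lane_select(c, i) == lane_zext(c, i)))
}
```

Because the two forms agree for every lane, treating lane 0's zext as `select i1 %cmp3, i16 1, i16 0` would, in principle, give all 16 lanes the same opcode.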