Issue 165647
Summary TypeBasedAliasAnalysis induces incorrect optimization in SLP-VECTORIZER
Labels new issue
Assignees
Reporter ParkHanbum
    
C Code sample
https://godbolt.org/z/1avvGh7sx

```
void src_assign_to_ptr_trunc(unsigned short *s, unsigned int* t)
{
    s[0] = t[0];
    s[1] = t[1];
    s[2] = t[2];
    s[3] = t[3];
}

void src_assign_to_ptr_ext(unsigned int *s, unsigned short* t)
{
    s[0] = t[0];
    s[1] = t[1];
    s[2] = t[2];
    s[3] = t[3];
}
```

Clang trunk currently optimizes the example code above into vector operations:


```
define dso_local void @tgt_assign_to_ptr_trunc(unsigned short*, unsigned int*)(ptr noundef writeonly captures(none) initializes((0, 8)) %0, ptr noundef readonly captures(none) %1) local_unnamed_addr #1 {
  %3 = load <4 x i32>, ptr %1, align 4
  %4 = trunc <4 x i32> %3 to <4 x i16>
  store <4 x i16> %4, ptr %0, align 2
  ret void
}

define dso_local void @tgt_assign_to_ptr_ext(unsigned int*, unsigned short*)(ptr noundef writeonly captures(none) initializes((0, 16)) %0, ptr noundef readonly captures(none) %1) local_unnamed_addr #1 {
  %3 = load <4 x i16>, ptr %1, align 2
  %4 = zext <4 x i16> %3 to <4 x i32>
  store <4 x i32> %4, ptr %0, align 4
  ret void
}
```

However, Alive2 reports that this optimization is not correct.
Alive2 link: https://alive2.llvm.org/ce/z/SYn2VG

On the other hand, when the source and destination pointers have the same type rather than different types, the code is not vectorized.

```
void src_assign_to_ptr_same(unsigned int *s, unsigned int* t)
{
    s[0] = t[0];
    s[1] = t[1];
    s[2] = t[2];
    s[3] = t[3];
}
```
```
define dso_local void @src_assign_to_ptr_same(unsigned int*, unsigned int*)(ptr noundef writeonly captures(none) initializes((0, 16)) %0, ptr noundef readonly captures(none) %1) local_unnamed_addr #1 {
  %3 = load i32, ptr %1, align 4
  store i32 %3, ptr %0, align 4
  %4 = getelementptr inbounds nuw i8, ptr %1, i64 4
  %5 = load i32, ptr %4, align 4
  %6 = getelementptr inbounds nuw i8, ptr %0, i64 4
  store i32 %5, ptr %6, align 4
  %7 = getelementptr inbounds nuw i8, ptr %1, i64 8
  %8 = load i32, ptr %7, align 4
  %9 = getelementptr inbounds nuw i8, ptr %0, i64 8
  store i32 %8, ptr %9, align 4
  %10 = getelementptr inbounds nuw i8, ptr %1, i64 12
  %11 = load i32, ptr %10, align 4
  %12 = getelementptr inbounds nuw i8, ptr %0, i64 12
  store i32 %11, ptr %12, align 4
  ret void
}
```

Here is my analysis of why this happens.

`SLP-VECTORIZER` checks dependencies between memory locations.
During this process, the internally called `AAResults::alias` function determines the alias relationship
between two given memory locations by querying the registered `AliasAnalysis` instances.
`TypeBasedAliasAnalysis` is one of the analyses queried to decide whether the memory locations alias.

However, since `TypeBasedAliasAnalysis` only checks the types of the two memory locations,
it sees that the load and store types differ (truncation/extension occurs) and therefore deems this a `NoAlias` case.
Consequently, the SLP-VECTORIZER concludes that the two memory locations do not alias
and proceeds with the optimization.
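
The following is a minimal sketch, not taken from the original report, of why the alias answer matters here. It models the `src_assign_to_ptr_ext` case on a plain byte buffer using `memcpy` (so the sketch itself does no type punning) and compares the scalar order (load, store, load, store, ...) with the vectorized order (four loads, then four stores) when the source region overlaps the destination. The helper names `copy_ext_scalar`, `copy_ext_vector`, and `orders_differ_when_overlapping`, the 4-byte offset, and the input values are all hypothetical choices made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Scalar order: every 16-bit load happens after the previous 32-bit store. */
static void copy_ext_scalar(unsigned char *s, const unsigned char *t)
{
    for (int i = 0; i < 4; i++) {
        uint16_t v;
        uint32_t w;
        memcpy(&v, t + 2 * i, sizeof v);  /* load t[i] */
        w = v;                            /* zero-extend to 32 bits */
        memcpy(s + 4 * i, &w, sizeof w);  /* store s[i] */
    }
}

/* Vectorized order: all four loads first, then all four stores. */
static void copy_ext_vector(unsigned char *s, const unsigned char *t)
{
    uint16_t v[4];
    uint32_t w[4];
    memcpy(v, t, sizeof v);               /* load <4 x i16> */
    for (int i = 0; i < 4; i++)
        w[i] = v[i];                      /* zext to <4 x i32> */
    memcpy(s, w, sizeof w);               /* store <4 x i32> */
}

/* Runs both orders with the source placed 4 bytes into the destination
 * (an assumed overlap). Returns 1 if the resulting buffers differ. */
static int orders_differ_when_overlapping(uint32_t scalar_out[4],
                                          uint32_t vector_out[4])
{
    const uint16_t init[4] = {1, 2, 3, 4};
    unsigned char a[16] = {0};
    unsigned char b[16] = {0};

    memcpy(a + 4, init, sizeof init);
    memcpy(b + 4, init, sizeof init);

    copy_ext_scalar(a, a + 4);
    copy_ext_vector(b, b + 4);

    memcpy(scalar_out, a, sizeof a);
    memcpy(vector_out, b, sizeof b);
    return memcmp(a, b, sizeof a) != 0;
}
```

With this overlap, the store to `s[2]` overwrites the bytes of `t[3]` before the scalar order reads them, so the scalar result's last element is no longer 4, while the vectorized order still produces 1, 2, 3, 4. If the two regions really cannot overlap, both orders give the same result, which is why the `NoAlias` answer decides whether the transformation is safe.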

