Issue 165484
Summary Missed stack move optimization when either source or destination has its address captured
Labels missed-optimization
Assignees
Reporter tmiasko
Currently, MemCpyOpt performs the stack move optimization when neither `src` nor `dst` is captured, i.e., it unifies `src` with `dst` and removes the `memcpy`:

```llvm
%struct.Foo = type { i32, i32, i32 }
define void @test() {
  %src = alloca %struct.Foo, align 4
  %dst = alloca %struct.Foo, align 4
  store %struct.Foo { i32 10, i32 20, i32 30 }, ptr %src
  call void @f(ptr %src)
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src, i64 12, i1 false)
  call void @g(ptr %dst)
  ret void
}
declare void @f(ptr captures(none))
declare void @g(ptr captures(none))
```

After the optimization:

```llvm
define void @test() {
  %src = alloca %struct.Foo, align 4
  store %struct.Foo { i32 10, i32 20, i32 30 }, ptr %src, align 4
  call void @f(ptr %src)
  call void @g(ptr %src)
  ret void
}
```

It should also be possible to perform this optimization when exactly one of `src` and `dst` has its address captured. For example:

```llvm
%struct.Foo = type { i32, i32, i32 }
define void @test() {
  %src = alloca %struct.Foo, align 4
  %dst = alloca %struct.Foo, align 4
  store %struct.Foo { i32 10, i32 20, i32 30 }, ptr %src
  call void @f(ptr %src)
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src, i64 12, i1 false)
  call void @g(ptr %dst)
  ret void
}
declare void @f(ptr captures(address))
declare void @g(ptr captures(none))
```