[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-19 Thread Martin Storsjö via cfe-commits

mstorsjo wrote:

With https://martin.st/temp/abs-preproc.cpp as input, I run `clang -target 
i686-w64-mingw32 abs-preproc.cpp -fsanitize=signed-integer-overflow -w -O3 -S 
-emit-llvm -o abs-out.ll`, and I get the following diff of the output IR after 
this change:
```diff
--- abs-good.ll 2025-04-20 00:10:20.155904358 +0300
+++ abs-bad.ll  2025-04-20 00:10:30.432882049 +0300
@@ -14,7 +14,7 @@
 @7 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 
} { ptr @.src, i32 481, i32 19 }, ptr @6 }
 @8 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 
} { ptr @.src, i32 482, i32 9 }, ptr @6 }
 
-; Function Attrs: mustprogress norecurse nounwind
+; Function Attrs: mustprogress norecurse nounwind memory(read, argmem: none, 
inaccessiblemem: readwrite)
 define dso_local noundef i32 @main() local_unnamed_addr #0 {
 entry:
   %tmp = alloca i64, align 8
@@ -23,20 +23,18 @@
   tail call void @__ubsan_handle_negate_overflow(ptr nonnull @2, i32 
-2147483648) #2, !nosanitize !6
   tail call void @__ubsan_handle_negate_overflow(ptr nonnull @4, i32 
-2147483648) #2, !nosanitize !6
   tail call void @__ubsan_handle_negate_overflow(ptr nonnull @5, i32 
-2147483648) #2, !nosanitize !6
-  store i64 -9223372036854775808, ptr %tmp, align 8, !nosanitize !6
   %0 = ptrtoint ptr %tmp to i32, !nosanitize !6
   call void @__ubsan_handle_negate_overflow(ptr nonnull @7, i32 %0) #2, 
!nosanitize !6
-  store i64 -9223372036854775808, ptr %tmp14, align 8, !nosanitize !6
   %1 = ptrtoint ptr %tmp14 to i32, !nosanitize !6
   call void @__ubsan_handle_negate_overflow(ptr nonnull @8, i32 %1) #2, 
!nosanitize !6
   ret i32 0
 }
 
-; Function Attrs: uwtable
+; Function Attrs: memory(argmem: read, inaccessiblemem: readwrite) uwtable
 declare dso_local void @__ubsan_handle_negate_overflow(ptr, i32) 
local_unnamed_addr #1
 
-attributes #0 = { mustprogress norecurse nounwind "min-legal-vector-width"="0" 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-cpu"="pentium4" 
"target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
-attributes #1 = { uwtable }
+attributes #0 = { mustprogress norecurse nounwind memory(read, argmem: none, 
inaccessiblemem: readwrite) "min-legal-vector-width"="0" 
"no-trapping-math"="true" "stack-protector-buffer-size"="8" 
"target-cpu"="pentium4" 
"target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { memory(argmem: read, inaccessiblemem: readwrite) uwtable }
 attributes #2 = { nounwind }
 
 !llvm.module.flags = !{!0, !1, !2}
```

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-19 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

@vitalybuka @mstorsjo Can you please provide the command to reproduce the wrong 
IR/codegen?


https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-18 Thread Vitaly Buka via cfe-commits

vitalybuka wrote:

https://lab.llvm.org/buildbot/#/builders/66/builds/12718
https://lab.llvm.org/buildbot/#/builders/186/builds/8291

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-18 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> This change broke a couple of ubsan tests on i686 on Windows (i.e. a 32 bit 
> environment): 
> https://github.com/mstorsjo/llvm-mingw/actions/runs/14527591706/job/40770945907
> 
> The failing tests are 
> https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/Misc/abs.cpp
>  and 
> https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/ImplicitConversion/bitfield-conversion.c.
> 
> The diff in ubsan output from `abs.cpp` looks like this:
> 
> ```diff
> @@ -6,7 +6,7 @@
>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
> compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:19:18 
>  compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8: runtime error: negation 
> of -2147483648 cannot be represented in type 'long'; cast to an unsigned type 
> to negate this value to itself
>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
> compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8 
> -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation 
> of -9223372036854775808 cannot be represented in type 'long long'; cast to an 
> unsigned type to negate this value to itself
> +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation 
> of 14963666063463584 cannot be represented in type 'long long'; cast to an 
> unsigned type to negate this value to itself
>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
> compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19 
> -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation 
> of -9223372036854775808 cannot be represented in type 'long long'; cast to an 
> unsigned type to negate this value to itself
> +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation 
> of 8880086229349853276 cannot be represented in type 'long long'; cast to an 
> unsigned type to negate this value to itself
>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
> compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9 
> ```

IR output is correct.
```
bin/clang --target=i686-w64-windows-gnu -emit-llvm -S 
-fsanitize=signed-integer-overflow -O3 
-fsanitize-recover=signed-integer-overflow -w -o - 
/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c
; ModuleID = 
'/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c'
source_filename = 
"/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c"
target datalayout = 
"e-m:x-p:32:32-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-w64-windows-gnu"

@.src = private unnamed_addr constant [111 x i8] 
c"/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c\00",
 align 1
@0 = private unnamed_addr constant { i16, i16, [12 x i8] } { i16 0, i16 13, [12 
x i8] c"'long long'\00" }
@1 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 } 
{ ptr @.src, i32 17, i32 19 }, ptr @0 }

; Function Attrs: nounwind memory(read, inaccessiblemem: readwrite)
define dso_local noundef i32 @main() local_unnamed_addr #0 {
entry:
  %tmp = alloca i64, align 8
  store i64 -9223372036854775808, ptr %tmp, align 8, !nosanitize !5
  %0 = ptrtoint ptr %tmp to i32, !nosanitize !5
  call void @__ubsan_handle_negate_overflow(ptr nonnull @1, i32 %0) #2, 
!nosanitize !5
  ret i32 0
}

; Function Attrs: memory(read, inaccessiblemem: readwrite) uwtable
declare dso_local void @__ubsan_handle_negate_overflow(ptr, i32) 
local_unnamed_addr #1

attributes #0 = { nounwind memory(read, inaccessiblemem: readwrite) 
"min-legal-vector-width"="0" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-cpu"="pentium4" 
"target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #1 = { memory(read, inaccessiblemem: readwrite) uwtable }
attributes #2 = { nounwind }

!llvm.module.flags = !{!0, !1, !2}
!llvm.linker.options = !{!3}
!llvm.ident = !{!4}

!0 = !{i32 1, !"NumRegisterParameters", i32 0}
!1 = !{i32 1, !"wchar_size", i32 2}
!2 = !{i32 1, !"MaxTLSAlign", i32 65536}
!3 = !{!"/DEFAULTLIB:libclang_rt.ubsan_standalone.a"}
!4 = !{!"clang version 21.0.0git"}
!5 = !{}
```
And the codegen looks good.
```
.def@feat.00;
.scl3;
.type   0;
.endef
.globl  @feat.00
.set @feat.00, 1
.file   "ubsan-handler-pass-by-ref.c"
.def_main;
.scl2;
.type   32;
.endef
.text
.globl  _main   # -- Begin function main
.p2align4
_main:  # @main
# %bb.0:# %entry
pushl   %ebp
movl%esp, %ebp
andl$-8, %esp
subl$8, %esp
calll   ___main
movl$-2147483648,

[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-18 Thread Martin Storsjö via cfe-commits

mstorsjo wrote:

This change broke a couple of ubsan tests on i686 on Windows (i.e. a 32 bit 
environment): 
https://github.com/mstorsjo/llvm-mingw/actions/runs/14527591706/job/40770945907

The failing tests are 
https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/Misc/abs.cpp
 and 
https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/ImplicitConversion/bitfield-conversion.c.

The diff in ubsan output from `abs.cpp` looks like this:
```diff
@@ -6,7 +6,7 @@
 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:19:18 
 compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8: runtime error: negation of 
-2147483648 cannot be represented in type 'long'; cast to an unsigned type to 
negate this value to itself
 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8 
-compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation 
of -9223372036854775808 cannot be represented in type 'long long'; cast to an 
unsigned type to negate this value to itself
+compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation 
of 14963666063463584 cannot be represented in type 'long long'; cast to an 
unsigned type to negate this value to itself
 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19 
-compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation of 
-9223372036854775808 cannot be represented in type 'long long'; cast to an 
unsigned type to negate this value to itself
+compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation of 
8880086229349853276 cannot be represented in type 'long long'; cast to an 
unsigned type to negate this value to itself
 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9 
```


https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-18 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw edited 
https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-17 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw closed 
https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-17 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/135135

>From de6d895ebb26225db909f8506f77162f38a3f65f Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 10 Apr 2025 11:09:45 +0800
Subject: [PATCH 1/4] [Clang][CodeGen][UBSan] Add more precise attributes to
 recoverable ubsan handlers (#130990)

This patch adds `memory(argmem: read, inaccessiblemem: readwrite)
mustprogress` to **recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).

Closes https://github.com/llvm/llvm-project/issues/130093.
---
 clang/lib/CodeGen/CGExpr.cpp   |  12 ++
 clang/test/CodeGen/allow-ubsan-check.c |  28 +--
 clang/test/CodeGen/attr-counted-by.c   | 232 -
 clang/test/CodeGen/ubsan-attr.cpp  | 100 +++
 4 files changed, 242 insertions(+), 130 deletions(-)
 create mode 100644 clang/test/CodeGen/ubsan-attr.cpp

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index abb88477062fc..16cbe4687f671 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -3645,6 +3645,18 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF,
 .addAttribute(llvm::Attribute::NoUnwind);
   }
   B.addUWTableAttr(llvm::UWTableKind::Default);
+  // Add more precise attributes to recoverable ubsan handlers for better
+  // optimizations.
+  if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) {
+// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also
+// accessible by the current module.
+if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss)
+  B.addMemoryAttr(llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) |
+  llvm::MemoryEffects::inaccessibleMemOnly());
+// If the handler does not return, it must interact with the environment in
+// an observable way.
+B.addAttribute(llvm::Attribute::MustProgress);
+  }
 
   llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction(
   FnType, FnName,
diff --git a/clang/test/CodeGen/allow-ubsan-check.c 
b/clang/test/CodeGen/allow-ubsan-check.c
index e225fb63f08eb..6358425aa40be 100644
--- a/clang/test/CodeGen/allow-ubsan-check.c
+++ b/clang/test/CodeGen/allow-ubsan-check.c
@@ -29,7 +29,7 @@
 // CHECK:   [[HANDLER_DIVREM_OVERFLOW]]:
 // CHECK-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]]
 // CHECK-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]]
-// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr 
nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], 
!nosanitize [[META2]]
+// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr 
nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], 
!nosanitize [[META2]]
 // CHECK-NEXT:unreachable, !nosanitize [[META2]]
 // CHECK:   [[CONT]]:
 // CHECK-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]]
@@ -75,7 +75,7 @@
 // REC:   [[HANDLER_DIVREM_OVERFLOW]]:
 // REC-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]]
 // REC-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]]
-// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull 
@[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize 
[[META2]]
+// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull 
@[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize 
[[META2]]
 // REC-NEXT:br label %[[CONT]], !nosanitize [[META2]]
 // REC:   [[CONT]]:
 // REC-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]]
@@ -86,7 +86,7 @@ int div(int x, int y) {
 }
 
 // CHECK-LABEL: define dso_local i32 @null(
-// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) 
local_unnamed_addr #[[ATTR0]] {
+// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) 
local_unnamed_addr #[[ATTR3:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = icmp eq ptr [[X]], null, !nosanitize [[META2]]
 //
@@ -95,7 +95,7 @@ int div(int x, int y) {
 // CHECK-NEXT:[[DOTNOT1:%.*]] = and i1 [[TMP0]], [[TMP1]]
 // CHECK-NEXT:br i1 [[DOTNOT1]], label %[[HANDLER_TYPE_MISMATCH:.*]], 
label %[[CONT:.*]], !prof [[PROF4:![0-9]+]], !nosanitize [[META2]]
 // CHECK:   [[HANDLER_TYPE_MISMATCH]]:
-// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr 
nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR6]], !nosanitize [[META2]]
+// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr 
nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR8]], !nosanitize [[META2]]
 // CHECK-NEXT:unreachable, !nosanitize [[META2]]
 // CHECK:   [[CONT]]:
 // CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[X]], align 4, !tbaa 
[[TBAA5:![0-9]+]]
@@ -116,14 +116,14 @@ int div(int x, int

[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-15 Thread Eli Friedman via cfe-commits

https://github.com/efriedma-quic commented:

LGTM with one minor comment.

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-14 Thread Eli Friedman via cfe-commits

https://github.com/efriedma-quic edited 
https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-14 Thread Eli Friedman via cfe-commits


@@ -3615,6 +3618,23 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF,
 .addAttribute(llvm::Attribute::NoUnwind);
   }
   B.addUWTableAttr(llvm::UWTableKind::Default);
+  // Add more precise attributes to recoverable ubsan handlers for better
+  // optimizations.
+  if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) {
+// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also
+// accessible by the current module.
+if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) {
+  llvm::MemoryEffects ME =
+  llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) |
+  llvm::MemoryEffects::inaccessibleMemOnly();
+  if (MayReadFromPtrToInt)
+ME |= llvm::MemoryEffects::readOnly();
+  B.addMemoryAttr(ME);
+}
+// If the handler does not return, it must interact with the environment in
+// an observable way.
+B.addAttribute(llvm::Attribute::MustProgress);

efriedma-quic wrote:

I know this isn't new... but I don't think this "mustprogress" marking actually 
does anything useful.  "mustprogress" is more a characteristic of function 
definitions, not calls.

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-12 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Yingwei Zheng (dtcxzyw)


Changes

This patch relands https://github.com/llvm/llvm-project/pull/130990.
If the check value is passed by reference, add `memory(read)`.

Original PR description:

This patch adds `memory(argmem: read, inaccessiblemem: readwrite)
mustprogress` to **recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).


---

Patch is 94.02 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/135135.diff


6 Files Affected:

- (modified) clang/lib/CodeGen/CGExpr.cpp (+28-6) 
- (modified) clang/lib/CodeGen/CodeGenFunction.h (+3-1) 
- (added) clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c (+22) 
- (modified) clang/test/CodeGen/allow-ubsan-check.c (+14-14) 
- (modified) clang/test/CodeGen/attr-counted-by.c (+116-116) 
- (added) clang/test/CodeGen/ubsan-attr.cpp (+100) 


``diff
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 5f028f6d8c6ac..fe8ccb713230e 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -3456,7 +3456,8 @@ llvm::Constant 
*CodeGenFunction::EmitCheckTypeDescriptor(QualType T) {
   return GV;
 }
 
-llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value *V) {
+llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value *V,
+ bool &MayReadFromPtrToInt) {
   llvm::Type *TargetTy = IntPtrTy;
 
   if (V->getType() == TargetTy)
@@ -3482,6 +3483,7 @@ llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value 
*V) {
 Builder.CreateStore(V, Ptr);
 V = Ptr.getPointer();
   }
+  MayReadFromPtrToInt = true;
   return Builder.CreatePtrToInt(V, TargetTy);
 }
 
@@ -3587,7 +3589,8 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF,
  ArrayRef FnArgs,
  SanitizerHandler CheckHandler,
  CheckRecoverableKind RecoverKind, bool 
IsFatal,
- llvm::BasicBlock *ContBB, bool NoMerge) {
+ llvm::BasicBlock *ContBB, bool NoMerge,
+ bool MayReadFromPtrToInt) {
   assert(IsFatal || RecoverKind != CheckRecoverableKind::Unrecoverable);
   std::optional DL;
   if (!CGF.Builder.getCurrentDebugLocation()) {
@@ -3615,6 +3618,23 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF,
 .addAttribute(llvm::Attribute::NoUnwind);
   }
   B.addUWTableAttr(llvm::UWTableKind::Default);
+  // Add more precise attributes to recoverable ubsan handlers for better
+  // optimizations.
+  if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) {
+// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also
+// accessible by the current module.
+if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) {
+  llvm::MemoryEffects ME =
+  llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) |
+  llvm::MemoryEffects::inaccessibleMemOnly();
+  if (MayReadFromPtrToInt)
+ME |= llvm::MemoryEffects::readOnly();
+  B.addMemoryAttr(ME);
+}
+// If the handler does not return, it must interact with the environment in
+// an observable way.
+B.addAttribute(llvm::Attribute::MustProgress);
+  }
 
   llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction(
   FnType, FnName,
@@ -3711,6 +3731,7 @@ void CodeGenFunction::EmitCheck(
   // representing operand values.
   SmallVector Args;
   SmallVector ArgTypes;
+  bool MayReadFromPtrToInt = false;
   if (!CGM.getCodeGenOpts().SanitizeMinimalRuntime) {
 Args.reserve(DynamicArgs.size() + 1);
 ArgTypes.reserve(DynamicArgs.size() + 1);
@@ -3730,7 +3751,7 @@ void CodeGenFunction::EmitCheck(
 }
 
 for (size_t i = 0, n = DynamicArgs.size(); i != n; ++i) {
-  Args.push_back(EmitCheckValue(DynamicArgs[i]));
+  Args.push_back(EmitCheckValue(DynamicArgs[i], MayReadFromPtrToInt));
   ArgTypes.push_back(IntPtrTy);
 }
   }
@@ -3742,7 +3763,8 @@ void CodeGenFunction::EmitCheck(
 // Simple case: we need to generate a single handler call, either
 // fatal, or non-fatal.
 emitCheckHandlerCall(*this, FnType, Args, CheckHandler, RecoverKind,
- (FatalCond != nullptr), Cont, NoMerge);
+ (FatalCond != nullptr), Cont, NoMerge,
+ MayReadFromPtrToInt);
   } else {
 // Emit two handler calls: first one for set of unrecoverable checks,
 // another one for recoverable.
@@ -3752,10 +3774,10 @@ void CodeGenFunction::EmitCheck(
 Builder.CreateCondBr(FatalCond, NonFatalHandlerBB, FatalHandlerBB);
 EmitBlock(FatalHandlerBB);
 emitCheckHandlerCall(*this, FnType, Args, CheckHandler, RecoverKind, true,
- NonFatalHandlerB

[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-12 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw ready_for_review 
https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-12 Thread Yingwei Zheng via cfe-commits

https://github.com/dtcxzyw updated 
https://github.com/llvm/llvm-project/pull/135135

>From c0d8765b48fac979ab46eba4fd2cde85595c31e0 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 10 Apr 2025 11:09:45 +0800
Subject: [PATCH 1/3] [Clang][CodeGen][UBSan] Add more precise attributes to
 recoverable ubsan handlers (#130990)

This patch adds `memory(argmem: read, inaccessiblemem: readwrite)
mustprogress` to **recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).

Closes https://github.com/llvm/llvm-project/issues/130093.
---
 clang/lib/CodeGen/CGExpr.cpp   |  12 ++
 clang/test/CodeGen/allow-ubsan-check.c |  28 +--
 clang/test/CodeGen/attr-counted-by.c   | 232 -
 clang/test/CodeGen/ubsan-attr.cpp  | 100 +++
 4 files changed, 242 insertions(+), 130 deletions(-)
 create mode 100644 clang/test/CodeGen/ubsan-attr.cpp

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 5f028f6d8c6ac..b721eeea6e906 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -3615,6 +3615,18 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF,
 .addAttribute(llvm::Attribute::NoUnwind);
   }
   B.addUWTableAttr(llvm::UWTableKind::Default);
+  // Add more precise attributes to recoverable ubsan handlers for better
+  // optimizations.
+  if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) {
+// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also
+// accessible by the current module.
+if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss)
+  B.addMemoryAttr(llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) |
+  llvm::MemoryEffects::inaccessibleMemOnly());
+// If the handler does not return, it must interact with the environment in
+// an observable way.
+B.addAttribute(llvm::Attribute::MustProgress);
+  }
 
   llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction(
   FnType, FnName,
diff --git a/clang/test/CodeGen/allow-ubsan-check.c 
b/clang/test/CodeGen/allow-ubsan-check.c
index e225fb63f08eb..6358425aa40be 100644
--- a/clang/test/CodeGen/allow-ubsan-check.c
+++ b/clang/test/CodeGen/allow-ubsan-check.c
@@ -29,7 +29,7 @@
 // CHECK:   [[HANDLER_DIVREM_OVERFLOW]]:
 // CHECK-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]]
 // CHECK-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]]
-// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr 
nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], 
!nosanitize [[META2]]
+// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr 
nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], 
!nosanitize [[META2]]
 // CHECK-NEXT:unreachable, !nosanitize [[META2]]
 // CHECK:   [[CONT]]:
 // CHECK-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]]
@@ -75,7 +75,7 @@
 // REC:   [[HANDLER_DIVREM_OVERFLOW]]:
 // REC-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]]
 // REC-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]]
-// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull 
@[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize 
[[META2]]
+// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull 
@[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize 
[[META2]]
 // REC-NEXT:br label %[[CONT]], !nosanitize [[META2]]
 // REC:   [[CONT]]:
 // REC-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]]
@@ -86,7 +86,7 @@ int div(int x, int y) {
 }
 
 // CHECK-LABEL: define dso_local i32 @null(
-// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) 
local_unnamed_addr #[[ATTR0]] {
+// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) 
local_unnamed_addr #[[ATTR3:[0-9]+]] {
 // CHECK-NEXT:  [[ENTRY:.*:]]
 // CHECK-NEXT:[[TMP0:%.*]] = icmp eq ptr [[X]], null, !nosanitize [[META2]]
 //
@@ -95,7 +95,7 @@ int div(int x, int y) {
 // CHECK-NEXT:[[DOTNOT1:%.*]] = and i1 [[TMP0]], [[TMP1]]
 // CHECK-NEXT:br i1 [[DOTNOT1]], label %[[HANDLER_TYPE_MISMATCH:.*]], 
label %[[CONT:.*]], !prof [[PROF4:![0-9]+]], !nosanitize [[META2]]
 // CHECK:   [[HANDLER_TYPE_MISMATCH]]:
-// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr 
nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR6]], !nosanitize [[META2]]
+// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr 
nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR8]], !nosanitize [[META2]]
 // CHECK-NEXT:unreachable, !nosanitize [[META2]]
 // CHECK:   [[CONT]]:
 // CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[X]], align 4, !tbaa 
[[TBAA5:![0-9]+]]
@@ -116,14 +116,14 @@ int div(int x, int

[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-10 Thread Yingwei Zheng via cfe-commits

dtcxzyw wrote:

> Still not sure how https://github.com/llvm/llvm-project/pull/135141 is 
> related.

We add a new parameter `bool &MayReadFromPtrToInt` to 
`CodeGenFunction::EmitCheckValue`. If it is called outside 
`CodeGenFunction::EmitCheck`, we have to pass the result into `EmitCheck` as 
well :(


https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-10 Thread Eli Friedman via cfe-commits

efriedma-quic wrote:

> Looking at the issue description from 
> https://github.com/llvm/llvm-project/pull/130990, the problem is that the 
> code is passing argmem: read, but then passes a "pointer" as an i32. Which is 
> against the rules: argmem is specifically pointer arguments. So either needs 
> to just be marked "read", or you need to fix the argument passing to pass the 
> pointer as an actual pointer.

Oh, sorry, somehow didn't read the current version of the patch; I see you 
fixed this.

Still not sure how #135141 is related.

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)

2025-04-10 Thread Eli Friedman via cfe-commits

efriedma-quic wrote:

Looking at the issue description from #130990, the problem is that the code is 
passing `argmem: read`, but then passes a "pointer" as an i32.  Which is 
against the rules: argmem is specifically pointer arguments.   So either needs 
to just be marked "read", or you need to fix the argument passing to pass the 
pointer as an actual pointer.

I'm not sure how #135141 is connected to this.

https://github.com/llvm/llvm-project/pull/135135
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits