[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
mstorsjo wrote: With https://martin.st/temp/abs-preproc.cpp as input, I run `clang -target i686-w64-mingw32 abs-preproc.cpp -fsanitize=signed-integer-overflow -w -O3 -S -emit-llvm -o abs-out.ll`, and I get the following diff of the output IR after this change: ```diff --- abs-good.ll 2025-04-20 00:10:20.155904358 +0300 +++ abs-bad.ll 2025-04-20 00:10:30.432882049 +0300 @@ -14,7 +14,7 @@ @7 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 } { ptr @.src, i32 481, i32 19 }, ptr @6 } @8 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 } { ptr @.src, i32 482, i32 9 }, ptr @6 } -; Function Attrs: mustprogress norecurse nounwind +; Function Attrs: mustprogress norecurse nounwind memory(read, argmem: none, inaccessiblemem: readwrite) define dso_local noundef i32 @main() local_unnamed_addr #0 { entry: %tmp = alloca i64, align 8 @@ -23,20 +23,18 @@ tail call void @__ubsan_handle_negate_overflow(ptr nonnull @2, i32 -2147483648) #2, !nosanitize !6 tail call void @__ubsan_handle_negate_overflow(ptr nonnull @4, i32 -2147483648) #2, !nosanitize !6 tail call void @__ubsan_handle_negate_overflow(ptr nonnull @5, i32 -2147483648) #2, !nosanitize !6 - store i64 -9223372036854775808, ptr %tmp, align 8, !nosanitize !6 %0 = ptrtoint ptr %tmp to i32, !nosanitize !6 call void @__ubsan_handle_negate_overflow(ptr nonnull @7, i32 %0) #2, !nosanitize !6 - store i64 -9223372036854775808, ptr %tmp14, align 8, !nosanitize !6 %1 = ptrtoint ptr %tmp14 to i32, !nosanitize !6 call void @__ubsan_handle_negate_overflow(ptr nonnull @8, i32 %1) #2, !nosanitize !6 ret i32 0 } -; Function Attrs: uwtable +; Function Attrs: memory(argmem: read, inaccessiblemem: readwrite) uwtable declare dso_local void @__ubsan_handle_negate_overflow(ptr, i32) local_unnamed_addr #1 -attributes #0 = { mustprogress norecurse nounwind "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" } -attributes #1 = { uwtable } +attributes #0 = { mustprogress norecurse nounwind memory(read, argmem: none, inaccessiblemem: readwrite) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" } +attributes #1 = { memory(argmem: read, inaccessiblemem: readwrite) uwtable } attributes #2 = { nounwind } !llvm.module.flags = !{!0, !1, !2} ``` https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
dtcxzyw wrote: @vitalybuka @mstorsjo Can you please provide the command to reproduce the wrong IR/codegen? https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
vitalybuka wrote: https://lab.llvm.org/buildbot/#/builders/66/builds/12718 https://lab.llvm.org/buildbot/#/builders/186/builds/8291 https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
dtcxzyw wrote: > This change broke a couple of ubsan tests on i686 on Windows (i.e. a 32 bit > environment): > https://github.com/mstorsjo/llvm-mingw/actions/runs/14527591706/job/40770945907 > > The failing tests are > https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/Misc/abs.cpp > and > https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/ImplicitConversion/bitfield-conversion.c. > > The diff in ubsan output from `abs.cpp` looks like this: > > ```diff > @@ -6,7 +6,7 @@ > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior > compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:19:18 > compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8: runtime error: negation > of -2147483648 cannot be represented in type 'long'; cast to an unsigned type > to negate this value to itself > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior > compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8 > -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation > of -9223372036854775808 cannot be represented in type 'long long'; cast to an > unsigned type to negate this value to itself > +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation > of 14963666063463584 cannot be represented in type 'long long'; cast to an > unsigned type to negate this value to itself > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior > compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19 > -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation > of -9223372036854775808 cannot be represented in type 'long long'; cast to an > unsigned type to negate this value to itself > +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation > of 8880086229349853276 cannot be represented in type 'long long'; cast to an > unsigned type to negate this value to itself > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior > compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9 > ``` IR output is correct. ``` bin/clang --target=i686-w64-windows-gnu -emit-llvm -S -fsanitize=signed-integer-overflow -O3 -fsanitize-recover=signed-integer-overflow -w -o - /home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c ; ModuleID = '/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c' source_filename = "/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c" target datalayout = "e-m:x-p:32:32-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:32-n8:16:32-a:0:32-S32" target triple = "i686-w64-windows-gnu" @.src = private unnamed_addr constant [111 x i8] c"/home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c\00", align 1 @0 = private unnamed_addr constant { i16, i16, [12 x i8] } { i16 0, i16 13, [12 x i8] c"'long long'\00" } @1 = private unnamed_addr global { { ptr, i32, i32 }, ptr } { { ptr, i32, i32 } { ptr @.src, i32 17, i32 19 }, ptr @0 } ; Function Attrs: nounwind memory(read, inaccessiblemem: readwrite) define dso_local noundef i32 @main() local_unnamed_addr #0 { entry: %tmp = alloca i64, align 8 store i64 -9223372036854775808, ptr %tmp, align 8, !nosanitize !5 %0 = ptrtoint ptr %tmp to i32, !nosanitize !5 call void @__ubsan_handle_negate_overflow(ptr nonnull @1, i32 %0) #2, !nosanitize !5 ret i32 0 } ; Function Attrs: memory(read, inaccessiblemem: readwrite) uwtable declare dso_local void @__ubsan_handle_negate_overflow(ptr, i32) local_unnamed_addr #1 attributes #0 = { nounwind memory(read, inaccessiblemem: readwrite) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" } attributes #1 = { memory(read, inaccessiblemem: readwrite) uwtable } attributes #2 = { nounwind } !llvm.module.flags = !{!0, !1, !2} !llvm.linker.options = !{!3} !llvm.ident = !{!4} !0 = !{i32 1, !"NumRegisterParameters", i32 0} !1 = !{i32 1, !"wchar_size", i32 2} !2 = !{i32 1, !"MaxTLSAlign", i32 65536} !3 = !{!"/DEFAULTLIB:libclang_rt.ubsan_standalone.a"} !4 = !{!"clang version 21.0.0git"} !5 = !{} ``` And the codegen looks good. ``` .def@feat.00; .scl3; .type 0; .endef .globl @feat.00 .set @feat.00, 1 .file "ubsan-handler-pass-by-ref.c" .def_main; .scl2; .type 32; .endef .text .globl _main # -- Begin function main .p2align4 _main: # @main # %bb.0:# %entry pushl %ebp movl%esp, %ebp andl$-8, %esp subl$8, %esp calll ___main movl$-2147483648,
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
mstorsjo wrote: This change broke a couple of ubsan tests on i686 on Windows (i.e. a 32 bit environment): https://github.com/mstorsjo/llvm-mingw/actions/runs/14527591706/job/40770945907 The failing tests are https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/Misc/abs.cpp and https://github.com/llvm/llvm-project/blob/llvmorg-20.1.3/compiler-rt/test/ubsan/TestCases/ImplicitConversion/bitfield-conversion.c. The diff in ubsan output from `abs.cpp` looks like this: ```diff @@ -6,7 +6,7 @@ SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:19:18 compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8: runtime error: negation of -2147483648 cannot be represented in type 'long'; cast to an unsigned type to negate this value to itself SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:20:8 -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation of -9223372036854775808 cannot be represented in type 'long long'; cast to an unsigned type to negate this value to itself +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19: runtime error: negation of 14963666063463584 cannot be represented in type 'long long'; cast to an unsigned type to negate this value to itself SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:24:19 -compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation of -9223372036854775808 cannot be represented in type 'long long'; cast to an unsigned type to negate this value to itself +compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9: runtime error: negation of 8880086229349853276 cannot be represented in type 'long long'; cast to an unsigned type to negate this value to itself SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior compiler-rt/test/ubsan/TestCases/Misc/abs.cpp:25:9 ``` https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/dtcxzyw edited https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/dtcxzyw closed https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/dtcxzyw updated https://github.com/llvm/llvm-project/pull/135135 >From de6d895ebb26225db909f8506f77162f38a3f65f Mon Sep 17 00:00:00 2001 From: Yingwei Zheng Date: Thu, 10 Apr 2025 11:09:45 +0800 Subject: [PATCH 1/4] [Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (#130990) This patch adds `memory(argmem: read, inaccessiblemem: readwrite) mustprogress` to **recoverable** ubsan handlers in order to unblock some memory/loop optimizations. It provides an average of 3% performance improvement on llvm-test-suite (except for 49 test failures due to ubsan diagnostics). Closes https://github.com/llvm/llvm-project/issues/130093. --- clang/lib/CodeGen/CGExpr.cpp | 12 ++ clang/test/CodeGen/allow-ubsan-check.c | 28 +-- clang/test/CodeGen/attr-counted-by.c | 232 - clang/test/CodeGen/ubsan-attr.cpp | 100 +++ 4 files changed, 242 insertions(+), 130 deletions(-) create mode 100644 clang/test/CodeGen/ubsan-attr.cpp diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index abb88477062fc..16cbe4687f671 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3645,6 +3645,18 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF, .addAttribute(llvm::Attribute::NoUnwind); } B.addUWTableAttr(llvm::UWTableKind::Default); + // Add more precise attributes to recoverable ubsan handlers for better + // optimizations. + if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) { +// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also +// accessible by the current module. +if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) + B.addMemoryAttr(llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) | + llvm::MemoryEffects::inaccessibleMemOnly()); +// If the handler does not return, it must interact with the environment in +// an observable way. +B.addAttribute(llvm::Attribute::MustProgress); + } llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction( FnType, FnName, diff --git a/clang/test/CodeGen/allow-ubsan-check.c b/clang/test/CodeGen/allow-ubsan-check.c index e225fb63f08eb..6358425aa40be 100644 --- a/clang/test/CodeGen/allow-ubsan-check.c +++ b/clang/test/CodeGen/allow-ubsan-check.c @@ -29,7 +29,7 @@ // CHECK: [[HANDLER_DIVREM_OVERFLOW]]: // CHECK-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]] // CHECK-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]] -// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize [[META2]] +// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize [[META2]] // CHECK-NEXT:unreachable, !nosanitize [[META2]] // CHECK: [[CONT]]: // CHECK-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]] @@ -75,7 +75,7 @@ // REC: [[HANDLER_DIVREM_OVERFLOW]]: // REC-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]] // REC-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]] -// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize [[META2]] +// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize [[META2]] // REC-NEXT:br label %[[CONT]], !nosanitize [[META2]] // REC: [[CONT]]: // REC-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]] @@ -86,7 +86,7 @@ int div(int x, int y) { } // CHECK-LABEL: define dso_local i32 @null( -// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] { +// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) local_unnamed_addr #[[ATTR3:[0-9]+]] { // CHECK-NEXT: [[ENTRY:.*:]] // CHECK-NEXT:[[TMP0:%.*]] = icmp eq ptr [[X]], null, !nosanitize [[META2]] // @@ -95,7 +95,7 @@ int div(int x, int y) { // CHECK-NEXT:[[DOTNOT1:%.*]] = and i1 [[TMP0]], [[TMP1]] // CHECK-NEXT:br i1 [[DOTNOT1]], label %[[HANDLER_TYPE_MISMATCH:.*]], label %[[CONT:.*]], !prof [[PROF4:![0-9]+]], !nosanitize [[META2]] // CHECK: [[HANDLER_TYPE_MISMATCH]]: -// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR6]], !nosanitize [[META2]] +// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR8]], !nosanitize [[META2]] // CHECK-NEXT:unreachable, !nosanitize [[META2]] // CHECK: [[CONT]]: // CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[X]], align 4, !tbaa [[TBAA5:![0-9]+]] @@ -116,14 +116,14 @@ int div(int x, int
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/efriedma-quic commented: LGTM with one minor comment. https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/efriedma-quic edited https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
@@ -3615,6 +3618,23 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF, .addAttribute(llvm::Attribute::NoUnwind); } B.addUWTableAttr(llvm::UWTableKind::Default); + // Add more precise attributes to recoverable ubsan handlers for better + // optimizations. + if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) { +// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also +// accessible by the current module. +if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) { + llvm::MemoryEffects ME = + llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) | + llvm::MemoryEffects::inaccessibleMemOnly(); + if (MayReadFromPtrToInt) +ME |= llvm::MemoryEffects::readOnly(); + B.addMemoryAttr(ME); +} +// If the handler does not return, it must interact with the environment in +// an observable way. +B.addAttribute(llvm::Attribute::MustProgress); efriedma-quic wrote: I know this isn't new... but I don't think this "mustprogress" marking actually does anything useful. "mustprogress" is more a characteristic of function definitions, not calls. https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Yingwei Zheng (dtcxzyw) Changes This patch relands https://github.com/llvm/llvm-project/pull/130990. If the check value is passed by reference, add `memory(read)`. Original PR description: This patch adds `memory(argmem: read, inaccessiblemem: readwrite) mustprogress` to **recoverable** ubsan handlers in order to unblock some memory/loop optimizations. It provides an average of 3% performance improvement on llvm-test-suite (except for 49 test failures due to ubsan diagnostics). --- Patch is 94.02 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135135.diff 6 Files Affected: - (modified) clang/lib/CodeGen/CGExpr.cpp (+28-6) - (modified) clang/lib/CodeGen/CodeGenFunction.h (+3-1) - (added) clang/test/CodeGen/AArch64/ubsan-handler-pass-by-ref.c (+22) - (modified) clang/test/CodeGen/allow-ubsan-check.c (+14-14) - (modified) clang/test/CodeGen/attr-counted-by.c (+116-116) - (added) clang/test/CodeGen/ubsan-attr.cpp (+100) ``diff diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 5f028f6d8c6ac..fe8ccb713230e 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3456,7 +3456,8 @@ llvm::Constant *CodeGenFunction::EmitCheckTypeDescriptor(QualType T) { return GV; } -llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value *V) { +llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value *V, + bool &MayReadFromPtrToInt) { llvm::Type *TargetTy = IntPtrTy; if (V->getType() == TargetTy) @@ -3482,6 +3483,7 @@ llvm::Value *CodeGenFunction::EmitCheckValue(llvm::Value *V) { Builder.CreateStore(V, Ptr); V = Ptr.getPointer(); } + MayReadFromPtrToInt = true; return Builder.CreatePtrToInt(V, TargetTy); } @@ -3587,7 +3589,8 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF, ArrayRef FnArgs, SanitizerHandler CheckHandler, CheckRecoverableKind RecoverKind, bool IsFatal, - llvm::BasicBlock *ContBB, bool NoMerge) { + llvm::BasicBlock *ContBB, bool NoMerge, + bool MayReadFromPtrToInt) { assert(IsFatal || RecoverKind != CheckRecoverableKind::Unrecoverable); std::optional DL; if (!CGF.Builder.getCurrentDebugLocation()) { @@ -3615,6 +3618,23 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF, .addAttribute(llvm::Attribute::NoUnwind); } B.addUWTableAttr(llvm::UWTableKind::Default); + // Add more precise attributes to recoverable ubsan handlers for better + // optimizations. + if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) { +// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also +// accessible by the current module. +if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) { + llvm::MemoryEffects ME = + llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) | + llvm::MemoryEffects::inaccessibleMemOnly(); + if (MayReadFromPtrToInt) +ME |= llvm::MemoryEffects::readOnly(); + B.addMemoryAttr(ME); +} +// If the handler does not return, it must interact with the environment in +// an observable way. +B.addAttribute(llvm::Attribute::MustProgress); + } llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction( FnType, FnName, @@ -3711,6 +3731,7 @@ void CodeGenFunction::EmitCheck( // representing operand values. SmallVector Args; SmallVector ArgTypes; + bool MayReadFromPtrToInt = false; if (!CGM.getCodeGenOpts().SanitizeMinimalRuntime) { Args.reserve(DynamicArgs.size() + 1); ArgTypes.reserve(DynamicArgs.size() + 1); @@ -3730,7 +3751,7 @@ void CodeGenFunction::EmitCheck( } for (size_t i = 0, n = DynamicArgs.size(); i != n; ++i) { - Args.push_back(EmitCheckValue(DynamicArgs[i])); + Args.push_back(EmitCheckValue(DynamicArgs[i], MayReadFromPtrToInt)); ArgTypes.push_back(IntPtrTy); } } @@ -3742,7 +3763,8 @@ void CodeGenFunction::EmitCheck( // Simple case: we need to generate a single handler call, either // fatal, or non-fatal. emitCheckHandlerCall(*this, FnType, Args, CheckHandler, RecoverKind, - (FatalCond != nullptr), Cont, NoMerge); + (FatalCond != nullptr), Cont, NoMerge, + MayReadFromPtrToInt); } else { // Emit two handler calls: first one for set of unrecoverable checks, // another one for recoverable. @@ -3752,10 +3774,10 @@ void CodeGenFunction::EmitCheck( Builder.CreateCondBr(FatalCond, NonFatalHandlerBB, FatalHandlerBB); EmitBlock(FatalHandlerBB); emitCheckHandlerCall(*this, FnType, Args, CheckHandler, RecoverKind, true, - NonFatalHandlerB
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/dtcxzyw ready_for_review https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
https://github.com/dtcxzyw updated https://github.com/llvm/llvm-project/pull/135135 >From c0d8765b48fac979ab46eba4fd2cde85595c31e0 Mon Sep 17 00:00:00 2001 From: Yingwei Zheng Date: Thu, 10 Apr 2025 11:09:45 +0800 Subject: [PATCH 1/3] [Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (#130990) This patch adds `memory(argmem: read, inaccessiblemem: readwrite) mustprogress` to **recoverable** ubsan handlers in order to unblock some memory/loop optimizations. It provides an average of 3% performance improvement on llvm-test-suite (except for 49 test failures due to ubsan diagnostics). Closes https://github.com/llvm/llvm-project/issues/130093. --- clang/lib/CodeGen/CGExpr.cpp | 12 ++ clang/test/CodeGen/allow-ubsan-check.c | 28 +-- clang/test/CodeGen/attr-counted-by.c | 232 - clang/test/CodeGen/ubsan-attr.cpp | 100 +++ 4 files changed, 242 insertions(+), 130 deletions(-) create mode 100644 clang/test/CodeGen/ubsan-attr.cpp diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index 5f028f6d8c6ac..b721eeea6e906 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -3615,6 +3615,18 @@ static void emitCheckHandlerCall(CodeGenFunction &CGF, .addAttribute(llvm::Attribute::NoUnwind); } B.addUWTableAttr(llvm::UWTableKind::Default); + // Add more precise attributes to recoverable ubsan handlers for better + // optimizations. + if (CGF.CGM.getCodeGenOpts().OptimizationLevel > 0 && MayReturn) { +// __ubsan_handle_dynamic_type_cache_miss reads the vtable, which is also +// accessible by the current module. +if (CheckHandler != SanitizerHandler::DynamicTypeCacheMiss) + B.addMemoryAttr(llvm::MemoryEffects::argMemOnly(llvm::ModRefInfo::Ref) | + llvm::MemoryEffects::inaccessibleMemOnly()); +// If the handler does not return, it must interact with the environment in +// an observable way. +B.addAttribute(llvm::Attribute::MustProgress); + } llvm::FunctionCallee Fn = CGF.CGM.CreateRuntimeFunction( FnType, FnName, diff --git a/clang/test/CodeGen/allow-ubsan-check.c b/clang/test/CodeGen/allow-ubsan-check.c index e225fb63f08eb..6358425aa40be 100644 --- a/clang/test/CodeGen/allow-ubsan-check.c +++ b/clang/test/CodeGen/allow-ubsan-check.c @@ -29,7 +29,7 @@ // CHECK: [[HANDLER_DIVREM_OVERFLOW]]: // CHECK-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]] // CHECK-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]] -// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize [[META2]] +// CHECK-NEXT:tail call void @__ubsan_handle_divrem_overflow_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize [[META2]] // CHECK-NEXT:unreachable, !nosanitize [[META2]] // CHECK: [[CONT]]: // CHECK-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]] @@ -75,7 +75,7 @@ // REC: [[HANDLER_DIVREM_OVERFLOW]]: // REC-NEXT:[[TMP10:%.*]] = zext i32 [[X]] to i64, !nosanitize [[META2]] // REC-NEXT:[[TMP11:%.*]] = zext i32 [[Y]] to i64, !nosanitize [[META2]] -// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR6:[0-9]+]], !nosanitize [[META2]] +// REC-NEXT:tail call void @__ubsan_handle_divrem_overflow(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[TMP10]], i64 [[TMP11]]) #[[ATTR8:[0-9]+]], !nosanitize [[META2]] // REC-NEXT:br label %[[CONT]], !nosanitize [[META2]] // REC: [[CONT]]: // REC-NEXT:[[DIV:%.*]] = sdiv i32 [[X]], [[Y]] @@ -86,7 +86,7 @@ int div(int x, int y) { } // CHECK-LABEL: define dso_local i32 @null( -// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) local_unnamed_addr #[[ATTR0]] { +// CHECK-SAME: ptr noundef readonly captures(address_is_null) [[X:%.*]]) local_unnamed_addr #[[ATTR3:[0-9]+]] { // CHECK-NEXT: [[ENTRY:.*:]] // CHECK-NEXT:[[TMP0:%.*]] = icmp eq ptr [[X]], null, !nosanitize [[META2]] // @@ -95,7 +95,7 @@ int div(int x, int y) { // CHECK-NEXT:[[DOTNOT1:%.*]] = and i1 [[TMP0]], [[TMP1]] // CHECK-NEXT:br i1 [[DOTNOT1]], label %[[HANDLER_TYPE_MISMATCH:.*]], label %[[CONT:.*]], !prof [[PROF4:![0-9]+]], !nosanitize [[META2]] // CHECK: [[HANDLER_TYPE_MISMATCH]]: -// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR6]], !nosanitize [[META2]] +// CHECK-NEXT:tail call void @__ubsan_handle_type_mismatch_v1_abort(ptr nonnull @[[GLOB2:[0-9]+]], i64 0) #[[ATTR8]], !nosanitize [[META2]] // CHECK-NEXT:unreachable, !nosanitize [[META2]] // CHECK: [[CONT]]: // CHECK-NEXT:[[TMP2:%.*]] = load i32, ptr [[X]], align 4, !tbaa [[TBAA5:![0-9]+]] @@ -116,14 +116,14 @@ int div(int x, int
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
dtcxzyw wrote: > Still not sure how https://github.com/llvm/llvm-project/pull/135141 is > related. We add a new parameter `bool &MayReadFromPtrToInt` to `CodeGenFunction::EmitCheckValue`. If it is called outside `CodeGenFunction::EmitCheck`, we have to pass the result into `EmitCheck` as well :( https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
efriedma-quic wrote: > Looking at the issue description from > https://github.com/llvm/llvm-project/pull/130990, the problem is that the > code is passing argmem: read, but then passes a "pointer" as an i32. Which is > against the rules: argmem is specifically pointer arguments. So either needs > to just be marked "read", or you need to fix the argument passing to pass the > pointer as an actual pointer. Oh, sorry, somehow didn't read the current version of the patch; I see you fixed this. Still not sure how #135141 is related. https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Reland][Clang][CodeGen][UBSan] Add more precise attributes to recoverable ubsan handlers (PR #135135)
efriedma-quic wrote: Looking at the issue description from #130990, the problem is that the code is passing `argmem: read`, but then passes a "pointer" as an i32. Which is against the rules: argmem is specifically pointer arguments. So either needs to just be marked "read", or you need to fix the argument passing to pass the pointer as an actual pointer. I'm not sure how #135141 is connected to this. https://github.com/llvm/llvm-project/pull/135135 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits