[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
b-sumner wrote: Hi @hctim, I'm really not seeing how this patch could possibly prevent ASAN or other sanitizers from arbitrarily changing their implementations or placing arbitrary data in redzones. This patch is merely introducing additional symbols that overlap with the uninstrumented parts of instrumented objects. We would need to do this again if we implemented MSAN, so would be fine with not restricting it to ASAN if that is the concern. Of course our runtime could avoid checking the shadow when copying, but that would be just as crippling as disabling checking by memcpy() and memmove(). It's not an option. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
hctim wrote: Messing around with global variables (changing their size, padding, alignment) is a common theme amongst sanitizers. We'd therefore want any strategy applied to ASan to be generic and apply across other sanitizers. The patch might not cause issues right now with ASan - but I think it's best to reserve the right to change the way that ASan's global variable instrumentation works. The padding bytes are currently unused. But, I've heard at least one idea come across my desk to change that and to use the "free" metadata space. That would be incompatible with your patch. In contrary - I still don't know why the small section of AMDGPU's symbol-copying code can't be just built without `__attribute__((no_sanitize("address")))` https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
skc7 wrote: Hi @hctim, In this patch, we are identifying globals that are instrumented using an new IR attribute and for them extra symbol with padded size is emitted. This is done only in ASAN pass. Not sure how would that affect HWAsan and MTE-globals. Could you please give an example for this ? At AMD, we have complex HPC/AI applications and libraries and this patch seems to not create any issue as such. Could you please give a concrete example test case where this patch would create an issue? https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
hctim wrote: Hi, My apologies for the delay. I started to hack together the same idea using `SanitizerMetadata`, which is definitely the preferred way of adding sanitizer globalvariable instrumentation (given that everything else already lives there). You can see the WIP here: https://pastebin.com/vXzHmnMg Part way through patching up all the tests though, I've come to the conclusion that this really isn't a great idea. For just ASan, sure we can just emit a secondary symbol and everything is hunky dory. But I think that we shouldn't do something like this for just ASan, it needs to also work for HWASan and MTE-globals as well. And it really doesn't compose well with the MTE-globals model. If you were to emit the unpadded symbol as the dominant name (from the c/cpp definition), then the linker gets super confused if you're linking multiple DSOs, as it needs to ensure that *all* of the DSOs use the tagged definition (or they all get demoted to being untagged). So, you'd need the padded one to be the dominant name, which I think is not what you want. This feels like a behaviour that we don't want the additional complexity for AMDGPU's drivers, because I highly suspect you can just deal with the sanitizers in the AMDGPU driver itself. How does your driver work? If it's basically doing a `dlsym` -> `memcpy` (and the memcpy interceptor in ASan is firing because you're going OOB on the unpadded symbol), can't you just have an `internal_memcpy` that's built with `__attribute__((no_sanitize("address")))`? We'd also potentially like to put some data in the padding. This has been discussed between some colleagues and I. If you end up not copying the padding as well, you might drop some metadata and things could explode in future. HTH, Mitch. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
skc7 wrote: @hctim @MaskRay @vitalybuka #70166 and #68865 have been pending for review and approvals for few months now. All the feedback has been useful in improving the patch. Made the changes to patches as per feedback. Please review. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
skc7 wrote: #68865 and #70166 have been rebased and updated as per review comments. Please review. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
https://github.com/skc7 updated https://github.com/llvm/llvm-project/pull/70166 >From 45d7b14e494ce771460ba263c5573f0fb4715246 Mon Sep 17 00:00:00 2001 From: skc7 Date: Wed, 25 Oct 2023 10:46:10 +0530 Subject: [PATCH] [ASAN] For Asan instrumented globals, emit two symbols, with actual size and instrumented size. --- clang/test/CodeGen/asan_globals_symbols.cpp | 15 ++ llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp| 28 ++- .../Instrumentation/AddressSanitizer.cpp | 3 ++ .../AddressSanitizer/debug-info-global-var.ll | 2 +- 4 files changed, 46 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/asan_globals_symbols.cpp diff --git a/clang/test/CodeGen/asan_globals_symbols.cpp b/clang/test/CodeGen/asan_globals_symbols.cpp new file mode 100644 index 00..d53afb2433b171 --- /dev/null +++ b/clang/test/CodeGen/asan_globals_symbols.cpp @@ -0,0 +1,15 @@ +// RUN: %clang_cc1 -S -x c++ -std=c++11 -triple x86_64-linux \ +// RUN: -fsanitize=address -o %t.out %s +// RUN: FileCheck %s --input-file=%t.out --check-prefix=CHECK-A + +// CHECK-A: myGlobal: +// CHECK-A: .size myGlobal, 4 +// CHECK-A: myGlobal__sanitized_padded_global: +// CHECK-A .size myGlobal__sanitized_padded_global, 32 + +int myGlobal; + +int main() { +myGlobal = 0; +return 0; +} diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp index 7df1c82bf357f6..cf1bdde6fc3425 100644 --- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp +++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp @@ -801,6 +801,19 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) { // sections and expected to be contiguous (e.g. ObjC metadata). const Align Alignment = getGVAlignment(GV, DL); + // Identify globals with "SanitizedPaddedGlobal" attribute and extract + // the actual global variable size. + uint64_t ActualSize = 0; + if (GV->hasAttribute(Attribute::SanitizedPaddedGlobal)) { +StructType *ST = dyn_cast(GV->getValueType()); +if (ST && ST->getNumElements() == 2) { + auto *ET0 = ST->getElementType(0); + if (ET0 && isa(ST->getElementType(1))) { +ActualSize = DL.getTypeAllocSize(ET0); + } +} + } + for (const HandlerInfo : Handlers) { NamedRegionTimer T(HI.TimerName, HI.TimerDescription, HI.TimerGroupName, HI.TimerGroupDescription, @@ -911,6 +924,18 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) { MCSymbol *EmittedInitSym = GVSym; + if (GV->hasAttribute(Attribute::SanitizedPaddedGlobal)) { +OutStreamer->switchSection(TheSection); +emitLinkage(GV, EmittedInitSym); +OutStreamer->emitLabel(EmittedInitSym); +if (MAI->hasDotTypeDotSizeDirective()) + OutStreamer->emitELFSize(EmittedInitSym, + MCConstantExpr::create(ActualSize, OutContext)); +EmittedInitSym = OutContext.getOrCreateSymbol( +GVSym->getName() + Twine("__sanitized_padded_global")); +emitVisibility(EmittedInitSym, GV->getVisibility(), !GV->isDeclaration()); + } + OutStreamer->switchSection(TheSection); emitLinkage(GV, EmittedInitSym); @@ -918,7 +943,8 @@ void AsmPrinter::emitGlobalVariable(const GlobalVariable *GV) { OutStreamer->emitLabel(EmittedInitSym); MCSymbol *LocalAlias = getSymbolPreferLocal(*GV); - if (LocalAlias != EmittedInitSym) + if ((LocalAlias != EmittedInitSym) && + !GV->hasAttribute(Attribute::SanitizedPaddedGlobal)) OutStreamer->emitLabel(LocalAlias); emitGlobalConstant(GV->getParent()->getDataLayout(), GV->getInitializer()); diff --git a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp index caab98c732eeeb..a0dbc5224b3ebc 100644 --- a/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp +++ b/llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp @@ -2496,6 +2496,9 @@ void ModuleAddressSanitizer::instrumentGlobals(IRBuilder<> , Module , // zero so we can copy the metadata over as is. NewGlobal->copyMetadata(G, 0); +// Attach "SanitizedPaddedGlobal" attribute to the new global. +NewGlobal->addAttribute(Attribute::SanitizedPaddedGlobal); + Value *Indices2[2]; Indices2[0] = IRB.getInt32(0); Indices2[1] = IRB.getInt32(0); diff --git a/llvm/test/Instrumentation/AddressSanitizer/debug-info-global-var.ll b/llvm/test/Instrumentation/AddressSanitizer/debug-info-global-var.ll index 0b516e0174d6d1..2815c1f04bff15 100644 --- a/llvm/test/Instrumentation/AddressSanitizer/debug-info-global-var.ll +++ b/llvm/test/Instrumentation/AddressSanitizer/debug-info-global-var.ll @@ -2,7 +2,7 @@ source_filename = "version.c" target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.12.0" -; CHECK: @version = constant { [5 x i8], [27 x i8] } {{.*}}, !dbg ![[GV:.*]] +; CHECK: @version = constant { [5 x i8], [27 x i8] } {{.*}}, !dbg ![[GV:.*]]
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
hctim wrote: Will wait for a rebase on some changes requested in #68865, but it'd also be really important to write tests for a couple more scenarios: 1. Building with `-fsanitize=address -S -emit-llvm` results in GVs with `sanitized_padded_global` (and results in GVs without `sanitized_padded_global` if they have `__attribute__((no_sanitize("address")))` for example). See `clang/test/CodeGen/memtag-globals.cpp`. 2. Running #1 through bitcode back/forth to make sure you still have the same attributes sets (to ensure you got the parser/writer logic correct). See `llvm/test/Bitcode/compatibility.ll` and `llvm/test/Assembler/globalvariable-attributes.ll`. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [ASAN] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. (PR #70166)
hctim wrote: > > It's my understanding your problem is that you are asan-trapping on the > > redzones when you copy data to/from the device. Is it possible instead to > > just make those copy-from and copy-to functions in the runtime > > `__attribute__((no_sanitize("address")))` and copy the padding as well? > > Yes it is possible, but it would result in bugs being hidden...the very bugs > that ASAN is meant to catch. This is not an acceptable option for us. Only bugs in the `copy_from_` and `copy_to_` functions though (which should be pretty clear)? -- Alright, let's give this a crack. I'll leave some comments here and on #68865, but let's not worry about potential unknown problems and we can see what shakes out once we actually submit the patches. https://github.com/llvm/llvm-project/pull/70166 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits