[PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
rjmccall added a comment. In https://reviews.llvm.org/D25448#571941, @vsk wrote: > Thanks for your feedback so far, and sorry for the delayed response. > > In https://reviews.llvm.org/D25448#570014, @rjmccall wrote: > > > Wait, can you talk me through the bug here? > > > Derived inherits from Base1 and Base2. We upcast an instance of Derived to > Base2, then call a method (f1). Clang figures out that the method has to be > Derived::f1. Next, clang passes in a pointer to the vtable pointer for > **Base1**, along with the typeinfo for **Base2**, into the sanitizer runtime. > This confuses the runtime, which reports that the dynamic type of the object > is "Base1", and that this does not match the expected type ("Base2"). A pointer to the vtable pointer for Base1 is a pointer to a Derived. You have a multiple inheritance bug, or really a general inheritance bug. It's being covered up in the case of single, non-virtual inheritance because that's the case in which a pointer to a base-class object is the same as a pointer to the derived class object. When the devirtualizer sees what's formally a call to a method on Base2, and it decides that it can instead treat it as a call to a method on Derived, it has to adjust the 'this' pointer to point to a Derived. (I don't know off-hand whether it does this by retroactively adjusting the pointer or whether it just avoids adjusting it in the first place.) When it does this, it should also be changing its notion of what class the pointer points to. > With the 'final' keyword: > > %6 = ptrtoint <2 x i32 (...)**>* %1 to i64 > call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), > i64 %6, ...) > > > Without the 'final' keyword: > > %6 = bitcast <2 x i32 (...)**>* %1 to %class.Derived* > %7 = getelementptr inbounds %class.Derived, %class.Derived* %6, i64 0, i32 1 > %8 = ptrtoint %class.Base2* %7 to i64, !nosanitize !5 > call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), > i64 %8, ...) > > > > >> Why is final-based devirtualization here different from, say, >> user-directed devirtualization via a qualified method name? > > I'm not sure. I tested this by removing the 'final' specifier from 'Derived' > and calling: > > obj.Derived::f1() > > > In this case, clang passes the correct typeinfo (for Derived) in to the > runtime. > >> It sounds to me from your description that you're not sure why this is >> happening. If this indeed only triggers in the presence of multiple >> inheritance, it might just be the case that you're doing your object-extents >> analysis starting from the wrong offset. > > It looks like I just haven't done a good job of explaining the issue. The bug > really does seem to be that clang isn't passing the right information to the > ubsan runtime. However, I'm not sure what the right fix is. Do we disable > sanitization in cases where we expect FP's, do we try harder to pass in the > right vptr (in this case, the vptr for Base2) into the runtime, or do we try > harder to pass in the right typeinfo (in this case, the typeinfo for Derived)? https://reviews.llvm.org/D25448 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
On 17 Oct 2016 12:06 pm, "Vedant Kumar via cfe-commits" < cfe-commits@lists.llvm.org> wrote: vsk added a comment. Thanks for your feedback so far, and sorry for the delayed response. In https://reviews.llvm.org/D25448#570014, @rjmccall wrote: > Wait, can you talk me through the bug here? Derived inherits from Base1 and Base2. We upcast an instance of Derived to Base2, then call a method (f1). Clang figures out that the method has to be Derived::f1. Next, clang passes in a pointer to the vtable pointer for **Base1**, along with the typeinfo for **Base2**, into the sanitizer runtime. This confuses the runtime, which reports that the dynamic type of the object is "Base1", and that this does not match the expected type ("Base2"). With the 'final' keyword: %6 = ptrtoint <2 x i32 (...)**>* %1 to i64 call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), i64 %6, ...) Without the 'final' keyword: %6 = bitcast <2 x i32 (...)**>* %1 to %class.Derived* %7 = getelementptr inbounds %class.Derived, %class.Derived* %6, i64 0, i32 1 %8 = ptrtoint %class.Base2* %7 to i64, !nosanitize !5 call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), i64 %8, ...) > Why is final-based devirtualization here different from, say, user-directed devirtualization via a qualified method name? I'm not sure. I tested this by removing the 'final' specifier from 'Derived' and calling: obj.Derived::f1() In this case, clang passes the correct typeinfo (for Derived) in to the runtime. > It sounds to me from your description that you're not sure why this is happening. If this indeed only triggers in the presence of multiple inheritance, it might just be the case that you're doing your object-extents analysis starting from the wrong offset. It looks like I just haven't done a good job of explaining the issue. The bug really does seem to be that clang isn't passing the right information to the ubsan runtime. However, I'm not sure what the right fix is. Do we disable sanitization in cases where we expect FP's, do we try harder to pass in the right vptr (in this case, the vptr for Base2) into the runtime, or do we try harder to pass in the right typeinfo (in this case, the typeinfo for Derived)? The baseline correct behaviour is that we should be passing a pointer to the Base2 subobject, since we're calling a function declared in Base2. However, since we know we can devirtualize, we can do better by passing a pointer to the Derived object and the Derived typeinfo. But we should not be working around our broken instrumentation by just turning it off. https://reviews.llvm.org/D25448 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
vsk added a comment. Thanks for your feedback so far, and sorry for the delayed response. In https://reviews.llvm.org/D25448#570014, @rjmccall wrote: > Wait, can you talk me through the bug here? Derived inherits from Base1 and Base2. We upcast an instance of Derived to Base2, then call a method (f1). Clang figures out that the method has to be Derived::f1. Next, clang passes in a pointer to the vtable pointer for **Base1**, along with the typeinfo for **Base2**, into the sanitizer runtime. This confuses the runtime, which reports that the dynamic type of the object is "Base1", and that this does not match the expected type ("Base2"). With the 'final' keyword: %6 = ptrtoint <2 x i32 (...)**>* %1 to i64 call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), i64 %6, ...) Without the 'final' keyword: %6 = bitcast <2 x i32 (...)**>* %1 to %class.Derived* %7 = getelementptr inbounds %class.Derived, %class.Derived* %6, i64 0, i32 1 %8 = ptrtoint %class.Base2* %7 to i64, !nosanitize !5 call void @__ubsan_handle_dynamic_type_cache_miss(@1 (TypeInfo for Base2), i64 %8, ...) > Why is final-based devirtualization here different from, say, user-directed > devirtualization via a qualified method name? I'm not sure. I tested this by removing the 'final' specifier from 'Derived' and calling: obj.Derived::f1() In this case, clang passes the correct typeinfo (for Derived) in to the runtime. > It sounds to me from your description that you're not sure why this is > happening. If this indeed only triggers in the presence of multiple > inheritance, it might just be the case that you're doing your object-extents > analysis starting from the wrong offset. It looks like I just haven't done a good job of explaining the issue. The bug really does seem to be that clang isn't passing the right information to the ubsan runtime. However, I'm not sure what the right fix is. Do we disable sanitization in cases where we expect FP's, do we try harder to pass in the right vptr (in this case, the vptr for Base2) into the runtime, or do we try harder to pass in the right typeinfo (in this case, the typeinfo for Derived)? https://reviews.llvm.org/D25448 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
vsk updated this revision to Diff 74606. vsk added a comment. - Remove the set of blacklisted object pointers; pass down a flag which indicates whether or not the member call check should be skipped. https://reviews.llvm.org/D25448 Files: lib/CodeGen/CGExpr.cpp lib/CodeGen/CGExprCXX.cpp lib/CodeGen/CodeGenFunction.h lib/CodeGen/ItaniumCXXABI.cpp test/CodeGenCXX/ubsan-devirtualized-calls.cpp Index: test/CodeGenCXX/ubsan-devirtualized-calls.cpp === --- /dev/null +++ test/CodeGenCXX/ubsan-devirtualized-calls.cpp @@ -0,0 +1,38 @@ +// RUN: %clang_cc1 -std=c++11 -triple %itanium_abi_triple -emit-llvm -fsanitize=vptr %s -o - | FileCheck %s + +struct Base1 { + virtual int f1() { return 1; } +}; + +struct Base2 { + virtual int f1() { return 2; } +}; + +struct Derived1 final : Base1 { + int f1() override { return 3; } +}; + +struct Derived2 final : Base1, Base2 { + int f1() override { return 3; } +}; + +// CHECK-LABEL: define i32 @_Z2t1v +int t1() { + Derived1 d1; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} + +// CHECK-LABEL: define i32 @_Z2t2v +int t2() { + Derived2 d2; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} + +// CHECK-LABEL: define i32 @_Z2t3v +int t3() { + Derived2 d2; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} Index: lib/CodeGen/ItaniumCXXABI.cpp === --- lib/CodeGen/ItaniumCXXABI.cpp +++ lib/CodeGen/ItaniumCXXABI.cpp @@ -1456,8 +1456,8 @@ Callee = CGM.getAddrOfCXXStructor(DD, getFromDtorType(Type)); CGF.EmitCXXMemberOrOperatorCall(DD, Callee, ReturnValueSlot(), - This.getPointer(), VTT, VTTTy, - nullptr, nullptr); + This.getPointer(), VTT, VTTTy, nullptr, + nullptr, /*SkipMemberCallCheck=*/false); } void ItaniumCXXABI::emitVTableDefinitions(CodeGenVTables , @@ -1637,7 +1637,8 @@ CGF.EmitCXXMemberOrOperatorCall(Dtor, Callee, ReturnValueSlot(), This.getPointer(), /*ImplicitParam=*/nullptr, - QualType(), CE, nullptr); + QualType(), CE, nullptr, + /*SkipMemberCallCheck=*/false); return nullptr; } Index: lib/CodeGen/CodeGenFunction.h === --- lib/CodeGen/CodeGenFunction.h +++ lib/CodeGen/CodeGenFunction.h @@ -2084,7 +2084,8 @@ /// appropriate size and alignment for an object of type \p Type. void EmitTypeCheck(TypeCheckKind TCK, SourceLocation Loc, llvm::Value *V, QualType Type, CharUnits Alignment = CharUnits::Zero(), - bool SkipNullCheck = false); + bool SkipNullCheck = false, + bool SkipMemberCallCheck = false); /// \brief Emit a check that \p Base points into an array object, which /// we can access at index \p Index. \p Accessed should be \c false if we @@ -2882,7 +2883,7 @@ ReturnValueSlot ReturnValue, llvm::Value *This, llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *E, - CallArgList *RtlArgs); + CallArgList *RtlArgs, bool SkipMemberCallCheck); RValue EmitCXXDestructorCall(const CXXDestructorDecl *DD, llvm::Value *Callee, llvm::Value *This, llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *E, Index: lib/CodeGen/CGExprCXX.cpp === --- lib/CodeGen/CGExprCXX.cpp +++ lib/CodeGen/CGExprCXX.cpp @@ -28,7 +28,8 @@ commonEmitCXXMemberOrOperatorCall(CodeGenFunction , const CXXMethodDecl *MD, llvm::Value *This, llvm::Value *ImplicitParam, QualType ImplicitParamTy, const CallExpr *CE, - CallArgList , CallArgList *RtlArgs) { + CallArgList , CallArgList *RtlArgs, + bool SkipMemberCallCheck) { assert(CE == nullptr || isa(CE) || isa(CE)); assert(MD->isInstance() && @@ -44,7 +45,9 @@ CGF.EmitTypeCheck(isa(MD) ? CodeGenFunction::TCK_ConstructorCall : CodeGenFunction::TCK_MemberCall, -CallLoc, This, C.getRecordType(MD->getParent())); +CallLoc, This, C.getRecordType(MD->getParent()), +/*Alignment=*/CharUnits::Zero(), /*SkipNullCheck=*/false, +SkipMemberCallCheck); // Push the this ptr. const CXXRecordDecl *RD = @@ -82,11 +85,12 @@ RValue
[PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
rjmccall requested changes to this revision. rjmccall added inline comments. This revision now requires changes to proceed. Comment at: lib/CodeGen/CodeGenFunction.h:379 + /// Set of object pointers which are blacklisted from the UB sanitizer. + llvm::SmallPtrSet SanitizerBasePointerBlacklist; + Please find some way to do this that doesn't require adding new tracking state like this. It should be quite easy to pass down that a call was devirtualized. https://reviews.llvm.org/D25448 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D25448: [ubsan] Disable -fsanitize=vptr checks for devirtualized calls
vsk created this revision. vsk added reviewers: pcc, rjmccall. vsk added subscribers: krasin, cfe-commits. ubsan reports a false positive 'invalid member call' diagnostic on the following example (PR30478): struct Base1 { virtual int f1() { return 1; } }; struct Base2 { virtual int f1() { return 2; } }; struct Derived2 final : Base1, Base2 { int f1() override { return 3; } }; int t1() { Derived2 d; return static_cast()->f1(); } Adding the "final" attribute to a most-derived class allows clang to devirtualize member calls into an instance of that object. In this case, it means that the wrong vptr (or, alternatively, the wrong type info) is passed to ubsan's checkDynamicType routine. By the time clang emits the instrumentation to check for UB member calls, there doesn't seem to to be enough information to properly diagnose the case where the dynamic type of the cast operand isn't "Derived". In light of that, the simplest way to fix this FP should be to disable -fsanitize=vptr instrumentation for devirtualized member calls. https://reviews.llvm.org/D25448 Files: lib/CodeGen/CGExpr.cpp lib/CodeGen/CGExprCXX.cpp lib/CodeGen/CodeGenFunction.h test/CodeGenCXX/ubsan-devirtualized-calls.cpp Index: test/CodeGenCXX/ubsan-devirtualized-calls.cpp === --- /dev/null +++ test/CodeGenCXX/ubsan-devirtualized-calls.cpp @@ -0,0 +1,38 @@ +// RUN: %clang_cc1 -std=c++11 -triple %itanium_abi_triple -emit-llvm -fsanitize=vptr %s -o - | FileCheck %s + +struct Base1 { + virtual int f1() { return 1; } +}; + +struct Base2 { + virtual int f1() { return 2; } +}; + +struct Derived1 final : Base1 { + int f1() override { return 3; } +}; + +struct Derived2 final : Base1, Base2 { + int f1() override { return 3; } +}; + +// CHECK-LABEL: define i32 @_Z2t1v +int t1() { + Derived1 d1; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} + +// CHECK-LABEL: define i32 @_Z2t2v +int t2() { + Derived2 d2; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} + +// CHECK-LABEL: define i32 @_Z2t3v +int t3() { + Derived2 d2; + return static_cast()->f1(); +// CHECK-NOT: !nosanitize +} Index: lib/CodeGen/CodeGenFunction.h === --- lib/CodeGen/CodeGenFunction.h +++ lib/CodeGen/CodeGenFunction.h @@ -33,6 +33,7 @@ #include "clang/Frontend/CodeGenOptions.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" +#include "llvm/ADT/SmallPtrSet.h" #include "llvm/ADT/SmallVector.h" #include "llvm/IR/ValueHandle.h" #include "llvm/Support/Debug.h" @@ -374,6 +375,9 @@ return DominatingValue::save(*this, value); } + /// Set of object pointers which are blacklisted from the UB sanitizer. + llvm::SmallPtrSet SanitizerBasePointerBlacklist; + public: /// ObjCEHValueStack - Stack of Objective-C exception values, used for /// rethrows. Index: lib/CodeGen/CGExprCXX.cpp === --- lib/CodeGen/CGExprCXX.cpp +++ lib/CodeGen/CGExprCXX.cpp @@ -194,6 +194,8 @@ else This = EmitLValue(Base).getAddress(); + if (SanOpts.has(SanitizerKind::Vptr) && DevirtualizedMethod) +SanitizerBasePointerBlacklist.insert(This.getPointer()); if (MD->isTrivial() || (MD->isDefaulted() && MD->getParent()->isUnion())) { if (isa(MD)) return RValue::get(nullptr); Index: lib/CodeGen/CGExpr.cpp === --- lib/CodeGen/CGExpr.cpp +++ lib/CodeGen/CGExpr.cpp @@ -619,8 +619,11 @@ //-- the [pointer or glvalue] is used to access a non-static data member // or call a non-static member function CXXRecordDecl *RD = Ty->getAsCXXRecordDecl(); + bool SupportedMemberCall = (TCK == TCK_MemberCall) + ? !SanitizerBasePointerBlacklist.count(Ptr) + : false; if (SanOpts.has(SanitizerKind::Vptr) && - (TCK == TCK_MemberAccess || TCK == TCK_MemberCall || + (TCK == TCK_MemberAccess || SupportedMemberCall || TCK == TCK_DowncastPointer || TCK == TCK_DowncastReference || TCK == TCK_UpcastToVirtualBase) && RD && RD->hasDefinition() && RD->isDynamicClass()) { Index: test/CodeGenCXX/ubsan-devirtualized-calls.cpp === --- /dev/null +++ test/CodeGenCXX/ubsan-devirtualized-calls.cpp @@ -0,0 +1,38 @@ +// RUN: %clang_cc1 -std=c++11 -triple %itanium_abi_triple -emit-llvm -fsanitize=vptr %s -o - | FileCheck %s + +struct Base1 { + virtual int f1() { return 1; } +}; + +struct Base2 { + virtual int f1() { return 2; } +}; + +struct Derived1 final : Base1 { + int f1() override { return 3; } +}; + +struct Derived2 final : Base1, Base2 { + int f1() override { return 3; } +}; + +// CHECK-LABEL: define i32 @_Z2t1v +int t1() { + Derived1 d1; + return static_cast()->f1(); +//