[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
sylvestre.ledru added a comment. For the record, Firefox was using this trick. This patch is breaking a ci build (clang trunk + warning as errors) More information here: https://bugzilla.mozilla.org/show_bug.cgi?id=1402362 Repository: rL LLVM https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added a comment. I agree, it doesn't add much value. Either way, though, please make sure you address the buildbot failures quickly. Repository: rL LLVM https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. This is breaking buildbots for 32-bit targets because the offset in 'nullptr + int8_t_N' is being implicitly cast to an int. That makes the sizeof(offset) == sizeof(ptr) check turn out differently than my tests were assuming. I can get the buildbots green quickly by taking out parts of the tests, but I just talked to Erich Keane about this and we think the right way to fix this long term is to stop checking for sizeof(offset) == sizeof(ptr) in the code that identifies the idiom, since that check is of dubious value and would be difficult to always get to behave the way I intended. Repository: rL LLVM https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
This revision was automatically updated to reflect the committed changes. Closed by commit rL313666: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc (authored by akaylor). Changed prior to commit: https://reviews.llvm.org/D37042?vs=115262=115889#toc Repository: rL LLVM https://reviews.llvm.org/D37042 Files: cfe/trunk/include/clang/AST/Expr.h cfe/trunk/include/clang/Basic/DiagnosticGroups.td cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td cfe/trunk/lib/AST/Expr.cpp cfe/trunk/lib/CodeGen/CGExprScalar.cpp cfe/trunk/lib/Sema/SemaExpr.cpp cfe/trunk/test/CodeGen/nullptr-arithmetic.c cfe/trunk/test/Sema/pointer-addition.c cfe/trunk/test/SemaCXX/nullptr-arithmetic.cpp Index: cfe/trunk/lib/CodeGen/CGExprScalar.cpp === --- cfe/trunk/lib/CodeGen/CGExprScalar.cpp +++ cfe/trunk/lib/CodeGen/CGExprScalar.cpp @@ -2678,6 +2678,30 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // + if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(), + op.Opcode, + expr->getLHS(), + expr->getRHS())) +return CGF.Builder.CreateIntToPtr(index, pointer->getType()); + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: cfe/trunk/lib/AST/Expr.cpp === --- cfe/trunk/lib/AST/Expr.cpp +++ cfe/trunk/lib/AST/Expr.cpp @@ -1829,6 +1829,45 @@ return OverOps[Opc]; } +bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext , + Opcode Opc, + Expr *LHS, Expr *RHS) { + if (Opc != BO_Add) +return false; + + // Check that we have one pointer and one integer operand. + Expr *PExp; + Expr *IExp; + if (LHS->getType()->isPointerType()) { +if (!RHS->getType()->isIntegerType()) + return false; +PExp = LHS; +IExp = RHS; + } else if (RHS->getType()->isPointerType()) { +if (!LHS->getType()->isIntegerType()) + return false; +PExp = RHS; +IExp = LHS; + } else { +return false; + } + + // Check that the pointer is a nullptr. + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNotNull)) +return false; + + // Check that the pointee type is char-sized. + const PointerType *PTy = PExp->getType()->getAs(); + if (!PTy || !PTy->getPointeeType()->isCharType()) +return false; + + // Check that the integer type is pointer-sized. + if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType())) +return false; + + return true; +} InitListExpr::InitListExpr(const ASTContext , SourceLocation lbraceloc, ArrayRefinitExprs, SourceLocation rbraceloc) : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false, Index: cfe/trunk/lib/Sema/SemaExpr.cpp === --- cfe/trunk/lib/Sema/SemaExpr.cpp +++ cfe/trunk/lib/Sema/SemaExpr.cpp @@ -8572,6 +8572,21 @@ << 0 /* one pointer */ << Pointer->getSourceRange(); } +/// \brief Diagnose invalid arithmetic on a null pointer. +/// +/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n' +/// idiom, which we recognize as a GNU extension. +/// +static void diagnoseArithmeticOnNullPointer(Sema , SourceLocation Loc, +Expr *Pointer, bool IsGNUIdiom) { + if (IsGNUIdiom) +S.Diag(Loc, diag::warn_gnu_null_ptr_arith) + << Pointer->getSourceRange(); + else +S.Diag(Loc, diag::warn_pointer_arith_null_ptr) + << S.getLangOpts().CPlusPlus << Pointer->getSourceRange(); +} + /// \brief Diagnose
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma accepted this revision. efriedma added a comment. This revision is now accepted and ready to land. LGTM https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor updated this revision to Diff 115262. andrew.w.kaylor added a comment. -Changed GNU idiom from extension to warning. -Updated to ToT. https://reviews.llvm.org/D37042 Files: include/clang/AST/Expr.h include/clang/Basic/DiagnosticGroups.td include/clang/Basic/DiagnosticSemaKinds.td lib/AST/Expr.cpp lib/CodeGen/CGExprScalar.cpp lib/Sema/SemaExpr.cpp test/CodeGen/nullptr-arithmetic.c test/Sema/pointer-addition.c test/SemaCXX/nullptr-arithmetic.cpp Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2678,6 +2678,30 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // + if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(), + op.Opcode, + expr->getLHS(), + expr->getRHS())) +return CGF.Builder.CreateIntToPtr(index, pointer->getType()); + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: lib/AST/Expr.cpp === --- lib/AST/Expr.cpp +++ lib/AST/Expr.cpp @@ -1829,6 +1829,45 @@ return OverOps[Opc]; } +bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext , + Opcode Opc, + Expr *LHS, Expr *RHS) { + if (Opc != BO_Add) +return false; + + // Check that we have one pointer and one integer operand. + Expr *PExp; + Expr *IExp; + if (LHS->getType()->isPointerType()) { +if (!RHS->getType()->isIntegerType()) + return false; +PExp = LHS; +IExp = RHS; + } else if (RHS->getType()->isPointerType()) { +if (!LHS->getType()->isIntegerType()) + return false; +PExp = RHS; +IExp = LHS; + } else { +return false; + } + + // Check that the pointer is a nullptr. + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNotNull)) +return false; + + // Check that the pointee type is char-sized. + const PointerType *PTy = PExp->getType()->getAs(); + if (!PTy || !PTy->getPointeeType()->isCharType()) +return false; + + // Check that the integer type is pointer-sized. + if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType())) +return false; + + return true; +} InitListExpr::InitListExpr(const ASTContext , SourceLocation lbraceloc, ArrayRefinitExprs, SourceLocation rbraceloc) : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false, Index: lib/Sema/SemaExpr.cpp === --- lib/Sema/SemaExpr.cpp +++ lib/Sema/SemaExpr.cpp @@ -8486,6 +8486,21 @@ << 0 /* one pointer */ << Pointer->getSourceRange(); } +/// \brief Diagnose invalid arithmetic on a null pointer. +/// +/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n' +/// idiom, which we recognize as a GNU extension. +/// +static void diagnoseArithmeticOnNullPointer(Sema , SourceLocation Loc, +Expr *Pointer, bool IsGNUIdiom) { + if (IsGNUIdiom) +S.Diag(Loc, diag::warn_gnu_null_ptr_arith) + << Pointer->getSourceRange(); + else +S.Diag(Loc, diag::warn_pointer_arith_null_ptr) + << S.getLangOpts().CPlusPlus << Pointer->getSourceRange(); +} + /// \brief Diagnose invalid arithmetic on two function pointers. static void diagnoseArithmeticOnTwoFunctionPointers(Sema , SourceLocation Loc, Expr *LHS, Expr *RHS) { @@ -8780,6 +8795,21 @@ if (!IExp->getType()->isIntegerType()) return InvalidOperands(Loc, LHS, RHS); + //
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031 InGroup; +def ext_gnu_null_ptr_arith : Extension< + "arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension">, andrew.w.kaylor wrote: > efriedma wrote: > > andrew.w.kaylor wrote: > > > efriedma wrote: > > > > "extension" isn't really right here; this shouldn't be an error in > > > > -pedantic-errors mode. Probably best to just stick this into the > > > > NullPointerArithmetic, like the other new warning. > > > So how should a word the warning? Just this: > > > > > > "arithmetic on a null pointer treated as a cast from integer to pointer"? > > That wasn't what I meant; the current wording is fine. I meant this should > > be something like `def warn_gnu_null_ptr_arith : Warning<`. > OK. I think I understand the behavior you wanted. I just thought maybe the > current wording might be technically incorrect. I wasn't sure how precisely > defined we consider "extension" to be in this context. The part that makes this a little weird is that unlike most extensions, this code is already well-formed. It's an extension because we're guaranteeing runtime behavior for a construct which has undefined behavior at runtime according to the standard. (This is in contrast to "implementation-defined" behaviors, which are the gaps in the standard we're allowed to fill in as we see fit.) Given that, I think calling it a "GNU extension" in the text is fine. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031 InGroup; +def ext_gnu_null_ptr_arith : Extension< + "arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension">, efriedma wrote: > andrew.w.kaylor wrote: > > efriedma wrote: > > > "extension" isn't really right here; this shouldn't be an error in > > > -pedantic-errors mode. Probably best to just stick this into the > > > NullPointerArithmetic, like the other new warning. > > So how should a word the warning? Just this: > > > > "arithmetic on a null pointer treated as a cast from integer to pointer"? > That wasn't what I meant; the current wording is fine. I meant this should be > something like `def warn_gnu_null_ptr_arith : Warning<`. OK. I think I understand the behavior you wanted. I just thought maybe the current wording might be technically incorrect. I wasn't sure how precisely defined we consider "extension" to be in this context. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031 InGroup; +def ext_gnu_null_ptr_arith : Extension< + "arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension">, andrew.w.kaylor wrote: > efriedma wrote: > > "extension" isn't really right here; this shouldn't be an error in > > -pedantic-errors mode. Probably best to just stick this into the > > NullPointerArithmetic, like the other new warning. > So how should a word the warning? Just this: > > "arithmetic on a null pointer treated as a cast from integer to pointer"? That wasn't what I meant; the current wording is fine. I meant this should be something like `def warn_gnu_null_ptr_arith : Warning<`. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031 InGroup; +def ext_gnu_null_ptr_arith : Extension< + "arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension">, efriedma wrote: > "extension" isn't really right here; this shouldn't be an error in > -pedantic-errors mode. Probably best to just stick this into the > NullPointerArithmetic, like the other new warning. So how should a word the warning? Just this: "arithmetic on a null pointer treated as a cast from integer to pointer"? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6031 InGroup; +def ext_gnu_null_ptr_arith : Extension< + "arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension">, "extension" isn't really right here; this shouldn't be an error in -pedantic-errors mode. Probably best to just stick this into the NullPointerArithmetic, like the other new warning. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. Does anything else need to be done for this to be ready to land? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. Ping. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor updated this revision to Diff 113343. andrew.w.kaylor added a comment. Fixed value-dependent argument in isNullPointerConstant checks. Added check for C++ zero offset in subtraction. Added value-dependent test cases. https://reviews.llvm.org/D37042 Files: include/clang/AST/Expr.h include/clang/Basic/DiagnosticGroups.td include/clang/Basic/DiagnosticSemaKinds.td lib/AST/Expr.cpp lib/CodeGen/CGExprScalar.cpp lib/Sema/SemaExpr.cpp test/CodeGen/nullptr-arithmetic.c test/Sema/pointer-addition.c test/SemaCXX/nullptr-arithmetic.cpp Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2671,6 +2671,30 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // + if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(), + op.Opcode, + expr->getLHS(), + expr->getRHS())) +return CGF.Builder.CreateIntToPtr(index, pointer->getType()); + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: lib/AST/Expr.cpp === --- lib/AST/Expr.cpp +++ lib/AST/Expr.cpp @@ -1829,6 +1829,45 @@ return OverOps[Opc]; } +bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext , + Opcode Opc, + Expr *LHS, Expr *RHS) { + if (Opc != BO_Add) +return false; + + // Check that we have one pointer and one integer operand. + Expr *PExp; + Expr *IExp; + if (LHS->getType()->isPointerType()) { +if (!RHS->getType()->isIntegerType()) + return false; +PExp = LHS; +IExp = RHS; + } else if (RHS->getType()->isPointerType()) { +if (!LHS->getType()->isIntegerType()) + return false; +PExp = RHS; +IExp = LHS; + } else { +return false; + } + + // Check that the pointer is a nullptr. + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNotNull)) +return false; + + // Check that the pointee type is char-sized. + const PointerType *PTy = PExp->getType()->getAs(); + if (!PTy || !PTy->getPointeeType()->isCharType()) +return false; + + // Check that the integer type is pointer-sized. + if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType())) +return false; + + return true; +} InitListExpr::InitListExpr(const ASTContext , SourceLocation lbraceloc, ArrayRefinitExprs, SourceLocation rbraceloc) : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false, Index: lib/Sema/SemaExpr.cpp === --- lib/Sema/SemaExpr.cpp +++ lib/Sema/SemaExpr.cpp @@ -8490,6 +8490,21 @@ << 0 /* one pointer */ << Pointer->getSourceRange(); } +/// \brief Diagnose invalid arithmetic on a null pointer. +/// +/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n' +/// idiom, which we recognize as a GNU extension. +/// +static void diagnoseArithmeticOnNullPointer(Sema , SourceLocation Loc, +Expr *Pointer, bool IsGNUIdiom) { + if (IsGNUIdiom) +S.Diag(Loc, diag::ext_gnu_null_ptr_arith) + << Pointer->getSourceRange(); + else +S.Diag(Loc, diag::warn_pointer_arith_null_ptr) + << S.getLangOpts().CPlusPlus << Pointer->getSourceRange(); +} + /// \brief Diagnose invalid arithmetic on two function pointers. static void diagnoseArithmeticOnTwoFunctionPointers(Sema , SourceLocation Loc, Expr *LHS, Expr *RHS) { @@ -8784,6 +8799,21 @@ if
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
rsmith added inline comments. Comment at: lib/AST/Expr.cpp:1857 + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull)) +return false; andrew.w.kaylor wrote: > rsmith wrote: > > If we get here with a value-dependent expression, we should treat it as > > non-null (do not warn on `(char*)PtrTemplateParameter + N`). > OK. I wasn't sure about that. > > So how do I test that? Is this right? > ``` > template > T* f(intptr_t offset) { > return P + offset; > } > > char *test(intptr_t offset) { > return f(offset); > } > ``` You can simplify that slightly by using `template`, but yes. Comment at: lib/Sema/SemaExpr.cpp:8807 +llvm::APSInt KnownVal; +if (!getLangOpts().CPlusPlus || !IExp->EvaluateAsInt(KnownVal, Context) || +KnownVal != 0) { This talk about value-dependence reminds me: it is an error to call `EvaluateAsInt` on a value-dependent expression (the expression evaluator will probably assert). If `IExp->isValueDependent()`, you should skip the diagnostic, since it might instantiate to zero. Comment at: lib/Sema/SemaExpr.cpp:8877 if (RHS.get()->getType()->isIntegerType()) { + // Subtracting from a null pointer should produce a warning. + // The last argument to the diagnose call says this doesn't match the andrew.w.kaylor wrote: > rsmith wrote: > > Subtracting zero from a null pointer should not warn in C++. > > > > (Conversely, subtracting a non-null pointer from a pointer should warn in > > C++, and subtracting any pointer from a null pointer should warn in C.) > Is it OK if I just add a FIXME in the section below that handles > pointer-pointer? Yes, that's fine. (But the `nullptr - 0` case should be handled in this patch.) https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: lib/AST/Expr.cpp:1857 + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull)) +return false; rsmith wrote: > If we get here with a value-dependent expression, we should treat it as > non-null (do not warn on `(char*)PtrTemplateParameter + N`). OK. I wasn't sure about that. So how do I test that? Is this right? ``` template T* f(intptr_t offset) { return P + offset; } char *test(intptr_t offset) { return f(offset); } ``` Comment at: lib/Sema/SemaExpr.cpp:8877 if (RHS.get()->getType()->isIntegerType()) { + // Subtracting from a null pointer should produce a warning. + // The last argument to the diagnose call says this doesn't match the rsmith wrote: > Subtracting zero from a null pointer should not warn in C++. > > (Conversely, subtracting a non-null pointer from a pointer should warn in > C++, and subtracting any pointer from a null pointer should warn in C.) Is it OK if I just add a FIXME in the section below that handles pointer-pointer? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
rjmccall added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; efriedma wrote: > andrew.w.kaylor wrote: > > efriedma wrote: > > > rsmith wrote: > > > > andrew.w.kaylor wrote: > > > > > efriedma wrote: > > > > > > Please make sure you use exactly the same check in Sema and CodeGen > > > > > > (probably easiest to stick a helper into lib/AST/). > > > > > That's a good idea, but I'm not really familiar enough with the > > > > > structure of clang to be sure I'm not doing it in a ham-fisted way. > > > > > Can you clarify? Are you suggesting adding something like > > > > > ASTContext::isPointeeTypeCharSize() and > > > > > ASTContext::isIntegerTypePointerSize()? Or maybe a single > > > > > specialized function somewhere that does both checks like > > > > > ASTContext::areOperandNullPointerArithmeticCompatible()? > > > > I would suggest something even more specific, such as > > > > `isNullPointerPlusIntegerExtension` > > > I'm most concerned about the difference between > > > "isa(pointer)" and > > > "PExp->IgnoreParenCasts()->isNullPointerConstant()"... they're different > > > in important ways. > > > > > > I was thinking something like > > > "BinaryOperator::isNullPointerArithmeticExtension()" > > At this point in Sema the BinaryOperator for the addition hasn't been > > created yet, right? So a static function that takes the opcode and > > operands? > I wasn't really thinking about that... but yes, something like that. I've long thought that it would make sense to store a semantic operation kind in BinaryOperator and UnaryOperator instead of treating the operator alone as a sufficient discriminator. Of course the operator is *usually* sufficient, but there are cases where that's not true — for example, we could use that operation kind to distinguish integer and pointer arithmetic, and maybe even to identify which side of a + is the pointer. If we had that, we could very easily flag a null+int operation as a completely different semantic operation, which it basically is. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
rsmith added inline comments. Comment at: lib/AST/Expr.cpp:1857 + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull)) +return false; If we get here with a value-dependent expression, we should treat it as non-null (do not warn on `(char*)PtrTemplateParameter + N`). Comment at: lib/Sema/SemaExpr.cpp:8804 + if (PExp->IgnoreParenCasts()->isNullPointerConstant( + Context, Expr::NPC_ValueDependentIsNull)) { +// In C++ adding zero to a null pointer is defined. Likewise here, we should treat a value-dependent expression as non-null. Comment at: lib/Sema/SemaExpr.cpp:8877 if (RHS.get()->getType()->isIntegerType()) { + // Subtracting from a null pointer should produce a warning. + // The last argument to the diagnose call says this doesn't match the Subtracting zero from a null pointer should not warn in C++. (Conversely, subtracting a non-null pointer from a pointer should warn in C++, and subtracting any pointer from a null pointer should warn in C.) Comment at: lib/Sema/SemaExpr.cpp:8881 + if (LHS.get()->IgnoreParenCasts()->isNullPointerConstant(Context, + Expr::NPC_ValueDependentIsNull)) +diagnoseArithmeticOnNullPointer(*this, Loc, LHS.get(), false); Likewise. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor updated this revision to Diff 113311. andrew.w.kaylor added a comment. Refactored the GNU idiom check to be shared by Sema and CodeGen. Refined the checks so that nullptr+0 doesn't warn in C++. Added the zero offset qualification in the warning produced with C++. https://reviews.llvm.org/D37042 Files: include/clang/AST/Expr.h include/clang/Basic/DiagnosticGroups.td include/clang/Basic/DiagnosticSemaKinds.td lib/AST/Expr.cpp lib/CodeGen/CGExprScalar.cpp lib/Sema/SemaExpr.cpp test/CodeGen/nullptr-arithmetic.c test/Sema/pointer-addition.c test/SemaCXX/nullptr-arithmetic.cpp Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2671,6 +2671,30 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // + if (BinaryOperator::isNullPointerArithmeticExtension(CGF.getContext(), + op.Opcode, + expr->getLHS(), + expr->getRHS())) +return CGF.Builder.CreateIntToPtr(index, pointer->getType()); + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: lib/AST/Expr.cpp === --- lib/AST/Expr.cpp +++ lib/AST/Expr.cpp @@ -1829,6 +1829,45 @@ return OverOps[Opc]; } +bool BinaryOperator::isNullPointerArithmeticExtension(ASTContext , + Opcode Opc, + Expr *LHS, Expr *RHS) { + if (Opc != BO_Add) +return false; + + // Check that we have one pointer and one integer operand. + Expr *PExp; + Expr *IExp; + if (LHS->getType()->isPointerType()) { +if (!RHS->getType()->isIntegerType()) + return false; +PExp = LHS; +IExp = RHS; + } else if (RHS->getType()->isPointerType()) { +if (!LHS->getType()->isIntegerType()) + return false; +PExp = RHS; +IExp = LHS; + } else { +return false; + } + + // Check that the pointer is a nullptr. + if (!PExp->IgnoreParenCasts() + ->isNullPointerConstant(Ctx, Expr::NPC_ValueDependentIsNull)) +return false; + + // Check that the pointee type is char-sized. + const PointerType *PTy = PExp->getType()->getAs(); + if (!PTy || !PTy->getPointeeType()->isCharType()) +return false; + + // Check that the integer type is pointer-sized. + if (Ctx.getTypeSize(IExp->getType()) != Ctx.getTypeSize(PExp->getType())) +return false; + + return true; +} InitListExpr::InitListExpr(const ASTContext , SourceLocation lbraceloc, ArrayRefinitExprs, SourceLocation rbraceloc) : Expr(InitListExprClass, QualType(), VK_RValue, OK_Ordinary, false, false, Index: lib/Sema/SemaExpr.cpp === --- lib/Sema/SemaExpr.cpp +++ lib/Sema/SemaExpr.cpp @@ -8490,6 +8490,21 @@ << 0 /* one pointer */ << Pointer->getSourceRange(); } +/// \brief Diagnose invalid arithmetic on a null pointer. +/// +/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n' +/// idiom, which we recognize as a GNU extension. +/// +static void diagnoseArithmeticOnNullPointer(Sema , SourceLocation Loc, +Expr *Pointer, bool IsGNUIdiom) { + if (IsGNUIdiom) +S.Diag(Loc, diag::ext_gnu_null_ptr_arith) + << Pointer->getSourceRange(); + else +S.Diag(Loc, diag::warn_pointer_arith_null_ptr) + << S.getLangOpts().CPlusPlus << Pointer->getSourceRange(); +} + /// \brief Diagnose invalid arithmetic on two function pointers. static void diagnoseArithmeticOnTwoFunctionPointers(Sema , SourceLocation Loc, Expr *LHS, Expr
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; andrew.w.kaylor wrote: > efriedma wrote: > > rsmith wrote: > > > andrew.w.kaylor wrote: > > > > efriedma wrote: > > > > > Please make sure you use exactly the same check in Sema and CodeGen > > > > > (probably easiest to stick a helper into lib/AST/). > > > > That's a good idea, but I'm not really familiar enough with the > > > > structure of clang to be sure I'm not doing it in a ham-fisted way. > > > > Can you clarify? Are you suggesting adding something like > > > > ASTContext::isPointeeTypeCharSize() and > > > > ASTContext::isIntegerTypePointerSize()? Or maybe a single specialized > > > > function somewhere that does both checks like > > > > ASTContext::areOperandNullPointerArithmeticCompatible()? > > > I would suggest something even more specific, such as > > > `isNullPointerPlusIntegerExtension` > > I'm most concerned about the difference between > > "isa(pointer)" and > > "PExp->IgnoreParenCasts()->isNullPointerConstant()"... they're different in > > important ways. > > > > I was thinking something like > > "BinaryOperator::isNullPointerArithmeticExtension()" > At this point in Sema the BinaryOperator for the addition hasn't been created > yet, right? So a static function that takes the opcode and operands? I wasn't really thinking about that... but yes, something like that. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; efriedma wrote: > rsmith wrote: > > andrew.w.kaylor wrote: > > > efriedma wrote: > > > > Please make sure you use exactly the same check in Sema and CodeGen > > > > (probably easiest to stick a helper into lib/AST/). > > > That's a good idea, but I'm not really familiar enough with the structure > > > of clang to be sure I'm not doing it in a ham-fisted way. Can you > > > clarify? Are you suggesting adding something like > > > ASTContext::isPointeeTypeCharSize() and > > > ASTContext::isIntegerTypePointerSize()? Or maybe a single specialized > > > function somewhere that does both checks like > > > ASTContext::areOperandNullPointerArithmeticCompatible()? > > I would suggest something even more specific, such as > > `isNullPointerPlusIntegerExtension` > I'm most concerned about the difference between > "isa(pointer)" and > "PExp->IgnoreParenCasts()->isNullPointerConstant()"... they're different in > important ways. > > I was thinking something like > "BinaryOperator::isNullPointerArithmeticExtension()" At this point in Sema the BinaryOperator for the addition hasn't been created yet, right? So a static function that takes the opcode and operands? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; rsmith wrote: > andrew.w.kaylor wrote: > > efriedma wrote: > > > Please make sure you use exactly the same check in Sema and CodeGen > > > (probably easiest to stick a helper into lib/AST/). > > That's a good idea, but I'm not really familiar enough with the structure > > of clang to be sure I'm not doing it in a ham-fisted way. Can you clarify? > > Are you suggesting adding something like > > ASTContext::isPointeeTypeCharSize() and > > ASTContext::isIntegerTypePointerSize()? Or maybe a single specialized > > function somewhere that does both checks like > > ASTContext::areOperandNullPointerArithmeticCompatible()? > I would suggest something even more specific, such as > `isNullPointerPlusIntegerExtension` I'm most concerned about the difference between "isa(pointer)" and "PExp->IgnoreParenCasts()->isNullPointerConstant()"... they're different in important ways. I was thinking something like "BinaryOperator::isNullPointerArithmeticExtension()" https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
rsmith added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; andrew.w.kaylor wrote: > efriedma wrote: > > Please make sure you use exactly the same check in Sema and CodeGen > > (probably easiest to stick a helper into lib/AST/). > That's a good idea, but I'm not really familiar enough with the structure of > clang to be sure I'm not doing it in a ham-fisted way. Can you clarify? Are > you suggesting adding something like ASTContext::isPointeeTypeCharSize() and > ASTContext::isIntegerTypePointerSize()? Or maybe a single specialized > function somewhere that does both checks like > ASTContext::areOperandNullPointerArithmeticCompatible()? I would suggest something even more specific, such as `isNullPointerPlusIntegerExtension` https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; efriedma wrote: > Please make sure you use exactly the same check in Sema and CodeGen (probably > easiest to stick a helper into lib/AST/). That's a good idea, but I'm not really familiar enough with the structure of clang to be sure I'm not doing it in a ham-fisted way. Can you clarify? Are you suggesting adding something like ASTContext::isPointeeTypeCharSize() and ASTContext::isIntegerTypePointerSize()? Or maybe a single specialized function somewhere that does both checks like ASTContext::areOperandNullPointerArithmeticCompatible()? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6032 +def ext_gnu_null_ptr_arith : Extension< + "inttoptr casting using arithmetic on a null pointer is a GNU extension">, + InGroup; rsmith wrote: > efriedma wrote: > > The keyword "inttoptr" is part of LLVM, not something we expect users to > > understand. > How about something like "arithmetic on null pointer treated as cast from > integer to pointer as a GNU extension"? I like that. Thanks for the suggestion. Comment at: lib/Sema/SemaExpr.cpp:8805 +const PointerType *pointerType = PExp->getType()->getAs(); +if (!IExp->isIntegerConstantExpr(Context) && +pointerType->getPointeeType()->isCharType() && rsmith wrote: > It seems strange to me to disable this when the RHS is an ICE. If we're going > to support this as an extension, we should make the rules for it as simple as > we reasonably can; this ICE check seems like an unnecessary complication. You're probably right. My thinking was that I wanted to keep the extension idiom as narrow as was reasonable, so these were cases that I felt like I could rule out, but the argument for simplicity is compelling since the alternative isn't doing anything particularly desirable in most cases. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
rsmith added a comment. In https://reviews.llvm.org/D37042#855713, @efriedma wrote: > I didn't think of this earlier, but strictly speaking, I think > "(char*)nullptr+0" isn't undefined in C++? Yes, that's correct. (C++'s model is basically equivalent to having an object of size zero at the null address, so things like `(char*)nullptr - (char*)nullptr` and `(char*)nullptr + 0` are valid.) > But probably worth emitting the warning anyway; anyone writing out arithmetic > on null is probably doing something suspicious, even if it isn't technically > undefined. I agree. We should probably suppress the warning if the non-pointer operand is known to be zero in C++, and weaken the wording slightly "[..] has undefined behavior if offset is nonzero" otherwise, though. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6032 +def ext_gnu_null_ptr_arith : Extension< + "inttoptr casting using arithmetic on a null pointer is a GNU extension">, + InGroup; efriedma wrote: > The keyword "inttoptr" is part of LLVM, not something we expect users to > understand. How about something like "arithmetic on null pointer treated as cast from integer to pointer as a GNU extension"? Comment at: lib/Sema/SemaExpr.cpp:8799 + // Adding to a null pointer is never an error, but should warn. + if (PExp->IgnoreParenCasts()->isNullPointerConstant(Context, Maybe `// Adding to a null pointer results in undefined behavior.` to explain why we warn (a reader of the code can already see that we do warn in this case). Comment at: lib/Sema/SemaExpr.cpp:8805 +const PointerType *pointerType = PExp->getType()->getAs(); +if (!IExp->isIntegerConstantExpr(Context) && +pointerType->getPointeeType()->isCharType() && It seems strange to me to disable this when the RHS is an ICE. If we're going to support this as an extension, we should make the rules for it as simple as we reasonably can; this ICE check seems like an unnecessary complication. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added a reviewer: rsmith. efriedma added a comment. I didn't think of this earlier, but strictly speaking, I think "(char*)nullptr+0" isn't undefined in C++? But probably worth emitting the warning anyway; anyone writing out arithmetic on null is probably doing something suspicious, even if it isn't technically undefined. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6032 +def ext_gnu_null_ptr_arith : Extension< + "inttoptr casting using arithmetic on a null pointer is a GNU extension">, + InGroup; The keyword "inttoptr" is part of LLVM, not something we expect users to understand. Comment at: lib/Sema/SemaExpr.cpp:8808 +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; Please make sure you use exactly the same check in Sema and CodeGen (probably easiest to stick a helper into lib/AST/). https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor updated this revision to Diff 113134. andrew.w.kaylor added a comment. Added warnings for null pointer arithmetic. https://reviews.llvm.org/D37042 Files: include/clang/Basic/DiagnosticGroups.td include/clang/Basic/DiagnosticSemaKinds.td lib/CodeGen/CGExprScalar.cpp lib/Sema/SemaExpr.cpp test/CodeGen/nullptr-arithmetic.c test/Sema/pointer-addition.c Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2671,6 +2671,37 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // The index operand is a constant. + // + if (isa(pointer) && !isSubtraction && + (width == DL.getTypeSizeInBits(PtrTy)) && + !isa(index)) { +// The pointer type might come back as null, so it's deferred until here. +const PointerType *pointerType + = pointerOperand->getType()->getAs(); +if (pointerType && pointerType->getPointeeType()->isCharType()) { + // (nullptr + N) -> inttoptr N to + return CGF.Builder.CreateIntToPtr(index, pointer->getType()); +} + } + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: lib/Sema/SemaExpr.cpp === --- lib/Sema/SemaExpr.cpp +++ lib/Sema/SemaExpr.cpp @@ -8490,6 +8490,18 @@ << 0 /* one pointer */ << Pointer->getSourceRange(); } +/// \brief Diagnose invalid arithmetic on a null pointer. +/// +/// If \p IsGNUIdiom is true, the operation is using the 'p = (i8*)nullptr + n' +/// idiom, which we recognize as a GNU extension. +/// +static void diagnoseArithmeticOnNullPointer(Sema , SourceLocation Loc, +Expr *Pointer, bool IsGNUIdiom) { + S.Diag(Loc, IsGNUIdiom ? diag::ext_gnu_null_ptr_arith + : diag::warn_pointer_arith_null_ptr) +<< Pointer->getSourceRange(); +} + /// \brief Diagnose invalid arithmetic on two function pointers. static void diagnoseArithmeticOnTwoFunctionPointers(Sema , SourceLocation Loc, Expr *LHS, Expr *RHS) { @@ -8784,6 +8796,21 @@ if (!IExp->getType()->isIntegerType()) return InvalidOperands(Loc, LHS, RHS); + // Adding to a null pointer is never an error, but should warn. + if (PExp->IgnoreParenCasts()->isNullPointerConstant(Context, + Expr::NPC_ValueDependentIsNull)) { +// Check the conditions to see if this is the 'p = (i8*)nullptr + n' idiom. +bool IsGNUIdiom = false; +const PointerType *pointerType = PExp->getType()->getAs(); +if (!IExp->isIntegerConstantExpr(Context) && +pointerType->getPointeeType()->isCharType() && +Context.getTypeSize(pointerType) == +Context.getTypeSize(IExp->getType())) + IsGNUIdiom = true; + +diagnoseArithmeticOnNullPointer(*this, Loc, PExp, IsGNUIdiom); + } + if (!checkArithmeticOpPointerOperand(*this, Loc, PExp)) return QualType(); @@ -8845,6 +8872,13 @@ // The result type of a pointer-int computation is the pointer type. if (RHS.get()->getType()->isIntegerType()) { + // Subtracting from a null pointer should produce a warning. + // The last argument to the diagnose call says this doesn't match the + // GNU int-to-pointer idiom. + if (LHS.get()->IgnoreParenCasts()->isNullPointerConstant(Context, + Expr::NPC_ValueDependentIsNull)) +diagnoseArithmeticOnNullPointer(*this, Loc, LHS.get(), false); + if (!checkArithmeticOpPointerOperand(*this, Loc, LHS.get())) return QualType(); Index: include/clang/Basic/DiagnosticSemaKinds.td === --- include/clang/Basic/DiagnosticSemaKinds.td +++ include/clang/Basic/DiagnosticSemaKinds.td
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
hfinkel added a comment. In https://reviews.llvm.org/D37042#852733, @efriedma wrote: > It would be nice to warn on any nullptr arithmetic, yes. (We probably want > the wording of the warning to be a bit different if we're activating this > workaround.) +1 (I'll likely want the ability to turn off the warning for one without turning off the warning for the other) https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
efriedma added a comment. It would be nice to warn on any nullptr arithmetic, yes. (We probably want the wording of the warning to be a bit different if we're activating this workaround.) https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. In https://reviews.llvm.org/D37042#850676, @hfinkel wrote: > I'd really like to see this done in some way such that we can issue a warning > along with generating well-defined code. The warning can specifically state > that the code is using an extension, etc. Should it warn on any nullptr arithmetic, or just the cases that are being handled by this change? https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
hfinkel added a comment. I'd really like to see this done in some way such that we can issue a warning along with generating well-defined code. The warning can specifically state that the code is using an extension, etc. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. My intention here was to make this as strict/limited as possible while still handling the cases of interest. There are two different implementations I want to handle. You can see the first variation in the __BPTR_ALIGN definition being added here: https://github.com/lattera/glibc/commit/2fd4de4b15a66f821057af90714145d2c034a609#diff-cd0c2b2693d632ad8843ae58a6ea7ffaR125 You can see the other in the __INT_TO_PTR definition that was being deleted in the same change set here: https://github.com/lattera/glibc/commit/2fd4de4b15a66f821057af90714145d2c034a609#diff-cd0c2b2693d632ad8843ae58a6ea7ffaL122 This second case shows up in some benchmark code, so it's important even though glibc phased it out. I'm really not sure what this looks like in the front end, but it seems like it could be relatively open-ended as to where the value came from. Basically, my intention is to eliminate anything I know isn't what I'm looking for and then handle whatever remains being added to a null pointer. Given that adding to a null pointer is undefined behavior, whatever we do should be acceptable in all cases. What I found in practice was that even with the existing handling of this it took an awful lot of optimizations before the null-based GEP and the load associated with it got close enough together for us to recognize that we could optimize it away, so I don't think we'd be losing much by tolerating additional cases in the way this patch proposes. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
majnemer added a comment. In https://reviews.llvm.org/D37042#849726, @majnemer wrote: > I'd expect this to be more limited. > > Something like, "if we have a BinaryOperator of + between a CStyleCastExpr of > NullToPointer and 'X', emit inttoptr(X)" Although maybe that is too strict... I guess stuff gets hairy quickly if you want to be able to also catch (char*)NULL + x https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
majnemer added a comment. I'd expect this to be more limited. Something like, "if we have a BinaryOperator of + between a CStyleCastExpr of NullToPointer and 'X', emit inttoptr(X)" https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor added a comment. I'm not sure why the svn attributes got attached to the file I added. I'll remove them before committing. https://reviews.llvm.org/D37042 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37042: Teach clang to tolerate the 'p = nullptr + n' idiom used by glibc
andrew.w.kaylor created this revision. This patch adds a hack to clang's emitPointerArithmetic() function to get it to emit an inttoptr instruction rather than a GEP with a null base pointer when the 'p = nullptr + n' idiom is used to convert a pointer-sized integer to a pointer. This idiom is used by some versions of glibc and gcc. This was previously discussed here: http://lists.llvm.org/pipermail/llvm-dev/2017-July/115053.html https://reviews.llvm.org/D37042 Files: lib/CodeGen/CGExprScalar.cpp test/CodeGen/nullptr-arithmetic.c Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2671,6 +2671,37 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the integer value to a pointer. + // + // The idiom (p = nullptr + N) is not met if any of the following are true: + // + // The operation is subtraction. + // The index is not pointer-sized. + // The pointer type is not byte-sized. + // The index operand is a constant. + // + if (isa(pointer) && !isSubtraction && + (width == DL.getTypeSizeInBits(PtrTy)) && + !isa(index)) { +// The pointer type might come back as null, so it's deferred until here. +const PointerType *pointerType + = pointerOperand->getType()->getAs(); +if (pointerType && pointerType->getPointeeType()->isCharType()) { + // (nullptr + N) -> inttoptr N to + return CGF.Builder.CreateIntToPtr(index, pointer->getType()); +} + } + if (width != DL.getTypeSizeInBits(PtrTy)) { // Zero-extend or sign-extend the pointer value according to // whether the index is signed or not. Index: test/CodeGen/nullptr-arithmetic.c === --- test/CodeGen/nullptr-arithmetic.c +++ test/CodeGen/nullptr-arithmetic.c @@ -0,0 +1,51 @@ +// RUN: %clang_cc1 -S %s -emit-llvm -o - | FileCheck %s + +#include + +// This test is meant to verify code that handles the 'p = nullptr + n' idiom +// used by some versions of glibc and gcc. This is undefined behavior but +// it is intended there to act like a conversion from a pointer-sized integer +// to a pointer, and we would like to tolerate that. + +#define NULLPTRI8 ((int8_t*)0) + +// This should get the inttoptr instruction. +int8_t *test1(intptr_t n) { + return NULLPTRI8 + n; +} +// CHECK-LABEL: test1 +// CHECK: inttoptr +// CHECK-NOT: getelementptr + +// This doesn't meet the idiom because the offset type isn't pointer-sized. +int8_t *test2(int16_t n) { + return NULLPTRI8 + n; +} +// CHECK-LABEL: test2 +// CHECK: getelementptr +// CHECK-NOT: inttoptr + +// This doesn't meet the idiom because the offset is constant. +int8_t *test3() { + return NULLPTRI8 + 16; +} +// CHECK-LABEL: test3 +// CHECK: getelementptr +// CHECK-NOT: inttoptr + +// This doesn't meet the idiom because the element type is larger than a byte. +int16_t *test4(intptr_t n) { + return (int16_t*)0 + n; +} +// CHECK-LABEL: test4 +// CHECK: getelementptr +// CHECK-NOT: inttoptr + +// This doesn't meet the idiom because the offset is subtracted. +int8_t* test5(intptr_t n) { + return NULLPTRI8 - n; +} +// CHECK-LABEL: test5 +// CHECK: getelementptr +// CHECK-NOT: inttoptr + Index: lib/CodeGen/CGExprScalar.cpp === --- lib/CodeGen/CGExprScalar.cpp +++ lib/CodeGen/CGExprScalar.cpp @@ -2671,6 +2671,37 @@ unsigned width = cast(index->getType())->getBitWidth(); auto = CGF.CGM.getDataLayout(); auto PtrTy = cast(pointer->getType()); + + // Some versions of glibc and gcc use idioms (particularly in their malloc + // routines) that add a pointer-sized integer (known to be a pointer value) + // to a null pointer in order to cast the value back to an integer or as + // part of a pointer alignment algorithm. This is undefined behavior, but + // we'd like to be able to compile programs that use it. + // + // Normally, we'd generate a GEP with a null-pointer base here in response + // to that code, but it's also UB to dereference a pointer created that + // way. Instead (as an acknowledged hack to tolerate the idiom) we will + // generate a direct cast of the