from:"Artem Dergachev via cfe\-commits"

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();
+while (auto *Expr = dyn_cast(Callee))
+  Callee = Expr->getSubExpr();
+if (auto *DeclRef = dyn_cast(Callee)) {
+  auto *Decl = DeclRef->getDecl();
+  if (auto *VD = dyn_cast(Decl)) {
+if (auto *Init = VD->getInit()) {

haoNoQ wrote:

In other words, it's a type-system problem. You're not supposed to be emulating 
data flow analysis in order to solve it.

If it was a raw function pointer then yeah, you'd have to look at the 
initializer.

But if it's a lambda then all you need should be in the type of the variable. 
You don't need to jump from statement to statement (from lambda invocation 
expression to lambda variable initializer expression) in order to figure it out.

It's fairly unobvious because it may look like the type of the variable is 
`auto` which doesn't really say much. But in the AST all the `auto`s are fully 
resolved, and in this case this `auto` is resolved to the lambda's underlying 
anonymous class type, from which the `operator()()` can be trivially looked up.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor

haoNoQ wrote:

Sounds great to me!

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();
+while (auto *Expr = dyn_cast(Callee))
+  Callee = Expr->getSubExpr();
+if (auto *DeclRef = dyn_cast(Callee)) {
+  auto *Decl = DeclRef->getDecl();
+  if (auto *VD = dyn_cast(Decl)) {
+if (auto *Init = VD->getInit()) {

haoNoQ wrote:

I mean, the only thing you can call on a lambda-type variable, is that lambda's 
`operator()()`. And it can't be any other lambda's `operator()()` because you 
can't assign lambdas to each other even when they have the same signature. So I 
suspect that if you simply ask the AST for `CE->getCalleeDecl()` or 
`CE->getDirectCallee()` or something of that nature, it'll give you the 
declaration of `operator()()` without any extra steps, and it'll fit in 
naturally with the rest of the code so the `getInit()` branch can be completely 
deleted.

I might be wrong, needs testing.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor

haoNoQ wrote:

I'd even go with like `UnsafeDeleteVisitor` or something. Because fundamentally 
you're looking for `delete`, and `deref` is just one of the forms it takes, or 
something  

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Allow recursive functions to be trivial. (PR #91876)

2024-05-23 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ commented:

> If that is an issue, one way to do this "properly" would be to create a graph

Note that we do have a 
[`CallGraph`](https://clang.llvm.org/doxygen/classclang_1_1CallGraph.html) 
class. The static analyzer uses it to identify top-level entry points. It 
doesn't look like it has a readily available solution for identifying recursive 
functions. But it may be a good idea to simply use the class and implement it 
there.

https://github.com/llvm/llvm-project/pull/91876
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Allow recursive functions to be trivial. (PR #91876)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -231,6 +231,15 @@ class RefCounted {
   void method();
   void someFunction();
   int otherFunction();
+  unsigned recursiveTrivialFunction(int n) { return !n ? 1 : 
recursiveTrivialFunction(n - 1);  }
+  unsigned recursiveComplexFunction(int n) { return !n ? otherFunction() : 
recursiveComplexFunction(n - 1);  }
+  unsigned mutuallyRecursiveFunction1(int n) { return n < 0 ? 1 : (n % 2 ? 
mutuallyRecursiveFunction2(n - 2) : mutuallyRecursiveFunction1(n - 1)); }
+  unsigned mutuallyRecursiveFunction2(int n) { return n < 0 ? 1 : (n % 3 ? 
mutuallyRecursiveFunction2(n - 3) : mutuallyRecursiveFunction1(n - 2)); }
+  unsigned mutuallyRecursiveFunction3(int n) { return n < 0 ? 1 : (n % 5 ? 
mutuallyRecursiveFunction3(n - 5) : mutuallyRecursiveFunction4(n - 3)); }
+  unsigned mutuallyRecursiveFunction4(int n) { return n < 0 ? 1 : (n % 7 ? 
otherFunction() : mutuallyRecursiveFunction3(n - 3)); }
+  unsigned mutuallyRecursiveFunction5(unsigned n) { return n > 100 ? 2 : (n % 
2 ? mutuallyRecursiveFunction5(n + 1) : mutuallyRecursiveFunction6(n + 2)); }

haoNoQ wrote:

I think it may be a good idea to add a direct test for @MitalAshok's example.
```c++
void foo() { bar(); something_non_trivial(); }
void bar() { foo(); }

void test() {
  foo();
  bar();
}
```
where none of the functions call themselves directly. I'm still not quite sure 
whether we're correctly confirming that `bar()` is non-trivial here, when we 
start the analysis from `foo()`.

https://github.com/llvm/llvm-project/pull/91876
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Allow recursive functions to be trivial. (PR #91876)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -231,6 +231,15 @@ class RefCounted {
   void method();
   void someFunction();
   int otherFunction();
+  unsigned recursiveTrivialFunction(int n) { return !n ? 1 : 
recursiveTrivialFunction(n - 1);  }
+  unsigned recursiveComplexFunction(int n) { return !n ? otherFunction() : 
recursiveComplexFunction(n - 1);  }
+  unsigned mutuallyRecursiveFunction1(int n) { return n < 0 ? 1 : (n % 2 ? 
mutuallyRecursiveFunction2(n - 2) : mutuallyRecursiveFunction1(n - 1)); }
+  unsigned mutuallyRecursiveFunction2(int n) { return n < 0 ? 1 : (n % 3 ? 
mutuallyRecursiveFunction2(n - 3) : mutuallyRecursiveFunction1(n - 2)); }
+  unsigned mutuallyRecursiveFunction3(int n) { return n < 0 ? 1 : (n % 5 ? 
mutuallyRecursiveFunction3(n - 5) : mutuallyRecursiveFunction4(n - 3)); }
+  unsigned mutuallyRecursiveFunction4(int n) { return n < 0 ? 1 : (n % 7 ? 
otherFunction() : mutuallyRecursiveFunction3(n - 3)); }
+  unsigned mutuallyRecursiveFunction5(unsigned n) { return n > 100 ? 2 : (n % 
2 ? mutuallyRecursiveFunction5(n + 1) : mutuallyRecursiveFunction6(n + 2)); }

haoNoQ wrote:

These probably shouldn't be called "mutually" recursive because you never 
really go back from 6 to 5 or from 7 to 6. Though, I don't have a better name 
for such contraption.

https://github.com/llvm/llvm-project/pull/91876
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Allow recursive functions to be trivial. (PR #91876)

2024-05-23 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91876
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedLocalVarsChecker] Detect assignments to uncounted local variable and parameters. (PR #92639)

2024-05-23 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

(It might be a good idea to add comments to those parts of the code to make 
sure the reader knows that it was intentional.)

https://github.com/llvm/llvm-project/pull/92639
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedLocalVarsChecker] Detect assignments to uncounted local variable and parameters. (PR #92639)

2024-05-23 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Aha in this case LGTM!

https://github.com/llvm/llvm-project/pull/92639
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();
+while (auto *Expr = dyn_cast(Callee))
+  Callee = Expr->getSubExpr();
+if (auto *DeclRef = dyn_cast(Callee)) {
+  auto *Decl = DeclRef->getDecl();
+  if (auto *VD = dyn_cast(Decl)) {
+if (auto *Init = VD->getInit()) {

haoNoQ wrote:

Never mind, you cannot assign lambdas this way.

This means that your `getInit()` branch is probably unnecessary; the compiler 
knows at compile time which `operator()()` is called(?)

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -51,92 +169,137 @@ class RefCntblBaseVirtualDtorChecker
   bool shouldVisitImplicitCode() const { return false; }
 
   bool VisitCXXRecordDecl(const CXXRecordDecl *RD) {
-Checker->visitCXXRecordDecl(RD);
+if (!RD->hasDefinition())
+  return true;
+
+Decls.insert(RD);
+
+for (auto  : RD->bases()) {
+  const auto AccSpec = Base.getAccessSpecifier();
+  if (AccSpec == AS_protected || AccSpec == AS_private ||
+  (AccSpec == AS_none && RD->isClass()))
+continue;
+
+  QualType T = Base.getType();
+  if (T.isNull())
+continue;
+
+  const CXXRecordDecl *C = T->getAsCXXRecordDecl();
+  if (!C)
+continue;
+
+  if (auto *CTSD = dyn_cast(C)) {
+auto  = CTSD->getTemplateArgs();
+for (unsigned i = 0; i < Args.size(); ++i) {

haoNoQ wrote:

If you like you can do a `CTSD->getTemplateArgs().asArray()` and do a 
range-based loop. It's going to be an `llvm::ArrayRef` so you can store it by 
value too.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();
+while (auto *Expr = dyn_cast(Callee))
+  Callee = Expr->getSubExpr();
+if (auto *DeclRef = dyn_cast(Callee)) {
+  auto *Decl = DeclRef->getDecl();
+  if (auto *VD = dyn_cast(Decl)) {
+if (auto *Init = VD->getInit()) {
+  if (auto *Lambda = dyn_cast(Init))
+return VisitBody(Lambda->getBody());
+}
+  } else if (auto *FD = dyn_cast(Decl))
+return VisitBody(FD->getBody());
+}
+return false;
+  }
+
+  bool VisitCXXMemberCallExpr(const CXXMemberCallExpr *MCE) {

haoNoQ wrote:

I'm pretty sure `VisitCallExpr` will do fine here, there's no need to make this 
a special case. This could be a bit faster because it doesn't need to consider 
the lambda `operator()()` case, but it could be a premature optimization if you 
start expanding the `VisitCallExpr` case with more smart behavior, so I think 
it's better to reuse code. I also don't think it's actually faster; the visitor 
fall-through cost will probably make it even worse.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();
+while (auto *Expr = dyn_cast(Callee))
+  Callee = Expr->getSubExpr();
+if (auto *DeclRef = dyn_cast(Callee)) {
+  auto *Decl = DeclRef->getDecl();
+  if (auto *VD = dyn_cast(Decl)) {
+if (auto *Init = VD->getInit()) {

haoNoQ wrote:

Same with variables, if you're considering initializations, then you might also 
want to consider assignments. Though, that's much more annoying with all the 
special ASTs that come with lambda copy/move-constructors.

You could also, like, force the lambda variable to be `const` or something, if 
it truly isn't intended to be reassigned. But I don't think a lot of people 
would be happy writing code this way, because that's like the default 
assumption.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive
+  return false;
+
+return Visit(Body);
+  }
+
+public:
+  DerefAnalysisVisitor(const TemplateArgumentList ,
+   const CXXRecordDecl *ClassDecl)
+  : ArgList(), ClassDecl(ClassDecl) {}
+
+  DerefAnalysisVisitor(const CXXRecordDecl *ClassDecl) : ClassDecl(ClassDecl) 
{}
+
+  std::optional HasSpecializedDelete(CXXMethodDecl *Decl) {
+if (auto *Body = Decl->getBody())
+  return VisitBody(Body);
+if (auto *Tmpl = Decl->getTemplateInstantiationPattern())
+  return std::nullopt; // Indeterminate. There was no concrete instance.
+return false;
+  }
+
+  bool VisitCallExpr(const CallExpr *CE) {
+auto *Callee = CE->getCallee();

haoNoQ wrote:

I usually start with like a
```c++
const Decl *D = CE->getCalleeDecl()
if (D && D->hasBody())
VisitBody(D->getBody());
```
by default, it's a bit more straightforward.

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor

haoNoQ wrote:

: by "deref" you mean "decrement reference count" right? Because when I see 
it in the compiler I mostly think of operations `*` and `->` and `[]` and such. 
This is readable when it's next to a similarly named thing named "ref" but in 
this case I was confused for a few minutes, like what does this have to do with 
templates?

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Allow CRTP classes without a virtual destructor. (PR #92837)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -11,16 +11,134 @@
 #include "PtrTypesSemantics.h"
 #include "clang/AST/CXXInheritance.h"
 #include "clang/AST/RecursiveASTVisitor.h"
+#include "clang/AST/StmtVisitor.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/ADT/SetVector.h"
 #include 
 
 using namespace clang;
 using namespace ento;
 
 namespace {
+
+class DerefAnalysisVisitor
+: public ConstStmtVisitor {
+  // Returns true if any of child statements return true.
+  bool VisitChildren(const Stmt *S) {
+for (const Stmt *Child : S->children()) {
+  if (Child && Visit(Child))
+return true;
+}
+return false;
+  }
+
+  bool VisitBody(const Stmt *Body) {
+if (!Body)
+  return false;
+
+auto [It, IsNew] = VisitedBody.insert(Body);
+if (!IsNew) // This body is recursive

haoNoQ wrote:

Shouldn't it return whatever it returned previously for that body?

https://github.com/llvm/llvm-project/pull/92837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedLocalVarsChecker] Detect assignments to uncounted local variable and parameters. (PR #92639)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -135,7 +135,19 @@ class UncountedLocalVarsChecker
   bool shouldVisitImplicitCode() const { return false; }
 
   bool VisitVarDecl(VarDecl *V) {
-Checker->visitVarDecl(V);
+auto *Init = V->getInit();
+if (Init && V->isLocalVarDecl())

haoNoQ wrote:

This method says "yes" to static locals. Are we ok with that?

(These methods are super confusing so I'd rather double-check.)

https://github.com/llvm/llvm-project/pull/92639
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedLocalVarsChecker] Detect assignments to uncounted local variable and parameters. (PR #92639)

2024-05-23 Thread Artem Dergachev via cfe-commits



@@ -135,7 +135,19 @@ class UncountedLocalVarsChecker
   bool shouldVisitImplicitCode() const { return false; }
 
   bool VisitVarDecl(VarDecl *V) {
-Checker->visitVarDecl(V);
+auto *Init = V->getInit();
+if (Init && V->isLocalVarDecl())
+  Checker->visitVarDecl(V, Init);
+return true;
+  }
+
+  bool VisitBinaryOperator(const BinaryOperator *BO) {
+if (BO->isAssignmentOp()) {
+  if (auto *VarRef = dyn_cast(BO->getLHS())) {
+if (auto *V = dyn_cast(VarRef->getDecl()))

haoNoQ wrote:

Would you also like to skip globals here? (But you probably don't want to skip 
parameters here – only in the initializer case.)

https://github.com/llvm/llvm-project/pull/92639
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits



@@ -420,25 +420,63 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) {
   //already duplicated
   //  - call both from Sema and from here
 
-  const auto *BaseDRE =
-  dyn_cast(Node.getBase()->IgnoreParenImpCasts());
-  if (!BaseDRE)
+  if (const auto *BaseDRE =
+  dyn_cast(Node.getBase()->IgnoreParenImpCasts())) {
+if (!BaseDRE->getDecl())
+  return false;
+if (const auto *CATy = Finder->getASTContext().getAsConstantArrayType(
+BaseDRE->getDecl()->getType())) {
+  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {
+const APInt ArrIdx = IdxLit->getValue();
+// FIXME: ArrIdx.isNegative() we could immediately emit an error as
+// that's a bug
+if (ArrIdx.isNonNegative() &&
+ArrIdx.getLimitedValue() < CATy->getLimitedSize())
+  return true;
+  }
+}
+  }
+
+  if (const auto *BaseSL =
+  dyn_cast(Node.getBase()->IgnoreParenImpCasts())) {
+if (const auto *CATy =
+Finder->getASTContext().getAsConstantArrayType(BaseSL->getType())) 
{
+  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {
+const APInt ArrIdx = IdxLit->getValue();
+// FIXME: ArrIdx.isNegative() we could immediately emit an error as
+// that's a bug
+if (ArrIdx.isNonNegative() &&
+ArrIdx.getLimitedValue() < CATy->getLimitedSize())
+  return true;
+  }
+}
+  }
+
+  return false;
+}
+
+AST_MATCHER(BinaryOperator, isSafePtrArithmetic) {
+  if (Node.getOpcode() != BinaryOperatorKind::BO_Add)
 return false;
-  if (!BaseDRE->getDecl())
+
+  const auto *LHSDRE = dyn_cast(Node.getLHS()->IgnoreImpCasts());

haoNoQ wrote:

I guess we might as well support the `"abc" + 3` case here. With the ultimate 
goal of ultimately supporting
```c++
static const char *const abc = "abc";

abc[3];
abc + 3;
```

https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits



@@ -0,0 +1,17 @@
+// RUN: %clang_cc1 -std=c++20 -Wno-everything -Wunsafe-buffer-usage \
+// RUN:-fsafe-buffer-usage-suggestions \
+// RUN:-verify %s
+
+void char_literal() {
+  if ("abc"[2] == 'c')
+return;
+  if ("def"[3] == '0')
+return;
+}
+
+void const_size_buffer_arithmetic() {
+  char kBuf[64] = {};
+  const char* p = kBuf + 1;
+}
+
+// expected-no-diagnostics

haoNoQ wrote:

Extreme nitpicking: I usually put those on top of the file because otherwise 
I'd be super confused where all those expected warnings are at. We also really 
don't want it to end up in the middle of the file if folks add more code at the 
bottom without reading the comment.

https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits



@@ -420,25 +420,63 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) {
   //already duplicated
   //  - call both from Sema and from here
 
-  const auto *BaseDRE =
-  dyn_cast(Node.getBase()->IgnoreParenImpCasts());
-  if (!BaseDRE)
+  if (const auto *BaseDRE =
+  dyn_cast(Node.getBase()->IgnoreParenImpCasts())) {
+if (!BaseDRE->getDecl())
+  return false;
+if (const auto *CATy = Finder->getASTContext().getAsConstantArrayType(
+BaseDRE->getDecl()->getType())) {
+  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {

haoNoQ wrote:

I suspect that this could have been a:
```c++
const Expr *BaseE = Node.getBase()->IgnoreParenImpCasts();
if (isa(BaseE)) {
  if (const auto *CATy = 
Finder->getASTContext().getAsConstantArrayType(BaseE->getType())) {
...
  }
}
```
Which would also eliminate duplication. (Unless 
`DeclRefExpr->getDecl()->getType()` is somehow significantly different from 
`DeclRefExpr->getType()`.)

Another thing we can try is to simply eliminate the `isa` entirely. Like, we 
know that it's "an" expression, and its type is a constant-size array type. Do 
we really need more? Why not simply trust the type system? (It's not like the C 
type system has ever lied to us right? )

https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ commented:

Overall looks great!

I think I see a couple easy improvements, this isn't blocking but let's take a 
moment to consider them 

https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [-Wunsafe-buffer-usage] Fix false positives for constant cases (PR #92432)

2024-05-20 Thread Artem Dergachev via cfe-commits



@@ -420,25 +420,63 @@ AST_MATCHER(ArraySubscriptExpr, isSafeArraySubscript) {
   //already duplicated
   //  - call both from Sema and from here
 
-  const auto *BaseDRE =
-  dyn_cast(Node.getBase()->IgnoreParenImpCasts());
-  if (!BaseDRE)
+  if (const auto *BaseDRE =
+  dyn_cast(Node.getBase()->IgnoreParenImpCasts())) {
+if (!BaseDRE->getDecl())
+  return false;
+if (const auto *CATy = Finder->getASTContext().getAsConstantArrayType(
+BaseDRE->getDecl()->getType())) {
+  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {
+const APInt ArrIdx = IdxLit->getValue();
+// FIXME: ArrIdx.isNegative() we could immediately emit an error as
+// that's a bug
+if (ArrIdx.isNonNegative() &&
+ArrIdx.getLimitedValue() < CATy->getLimitedSize())
+  return true;
+  }
+}
+  }
+
+  if (const auto *BaseSL =
+  dyn_cast(Node.getBase()->IgnoreParenImpCasts())) {
+if (const auto *CATy =
+Finder->getASTContext().getAsConstantArrayType(BaseSL->getType())) 
{
+  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {
+const APInt ArrIdx = IdxLit->getValue();
+// FIXME: ArrIdx.isNegative() we could immediately emit an error as
+// that's a bug
+if (ArrIdx.isNonNegative() &&
+ArrIdx.getLimitedValue() < CATy->getLimitedSize())
+  return true;
+  }
+}
+  }
+
+  return false;
+}
+
+AST_MATCHER(BinaryOperator, isSafePtrArithmetic) {
+  if (Node.getOpcode() != BinaryOperatorKind::BO_Add)
 return false;
-  if (!BaseDRE->getDecl())
+
+  const auto *LHSDRE = dyn_cast(Node.getLHS()->IgnoreImpCasts());
+  if (!LHSDRE)
 return false;
+
   const auto *CATy = Finder->getASTContext().getAsConstantArrayType(
-  BaseDRE->getDecl()->getType());
+  LHSDRE->getDecl()->getType());
   if (!CATy)
 return false;
 
-  if (const auto *IdxLit = dyn_cast(Node.getIdx())) {
-const APInt ArrIdx = IdxLit->getValue();
-// FIXME: ArrIdx.isNegative() we could immediately emit an error as that's 
a
-// bug
-if (ArrIdx.isNonNegative() &&
-ArrIdx.getLimitedValue() < CATy->getLimitedSize())
-  return true;
-  }
+  const auto *RHSIntLit = dyn_cast(Node.getRHS());
+  if (!RHSIntLit)
+return false;
+
+  const APInt BufferOffset = RHSIntLit->getValue();
+
+  if (BufferOffset.isNonNegative() &&

haoNoQ wrote:

There's this whole `Expr->EvaluateAsInt()` thing which is arguably even simpler 
to use than your existing code, and it constant-folds pretty well. We should 
probably just use that, instead of matching integer literals.

But it still won't handle the case where there's nested layers of arithmetic 
over the base pointer, like `(ptr + 1) - 1`. It'll only handle like `ptr + (1 - 
1)`. We'll have to support that separately. I suspect that it's still 
relatively easy to do with plain old recursion but this probably doesn't need 
to block this patch.

https://github.com/llvm/llvm-project/pull/92432
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Adding taint analysis capability to unix.Malloc checker (PR #92420)

2024-05-16 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

I think there should be a way to enable/disable this check separately because 
memory exhaustion / denial of service isn't necessarily something you care 
about when you enable taint analysis.

It's essential for web servers when the attacker is interested in interrupting 
their operation but not necessarily for personal devices where the attacker is 
interested only in gaining control of the device. For these applications it's 
more important to catch cases where malloc size and index used for access are 
coming from "different sources", eg. one is tainted and another isn't, doesn't 
matter which one.

For the same reason, I think the error node needs to be non-fatal. If you make 
it fatal, you lose the opportunity to catch this other type of bugs on the same 
path, which are 50% likely to be found on the same path, and are arguably much 
more severe.

Of course when your new check is disabled there won't be a sink so we'll still 
catch the other bug. But we don't actually want it to work this way; we'd 
rather have the user control the warnings they want to see with simple on-off 
flags without weird interactions between those flags.

So sink generation should be mostly a thing that "modeling" checkers do, when 
they're confident that we simply cannot continue analysis further. In this case 
I don't really see any problems with continuing the analysis so it should 
probably be a non-fatal error.

https://github.com/llvm/llvm-project/pull/92420
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-15 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

> serializing the pragmas themselves as AST nodes

These pragmas don't really translate very well into the AST. Similarly to 
`#pragma clang diagnostic`, they can be placed weirdly "across" other AST 
nodes, like:
```
#pragma clang unsafe_buffer_usage begin
void foo() {
#pragma clang unsafe_buffer_usage end
}
```
We could support it in the same way that clang supports Duff's Device but I 
don't think this is really a good idea. We ultimately want them to be textual, 
a preprocessor-level construct, because that's what security audits are looking 
for: an extremely clear sign that this is where unchecked pointer operations 
live. So that they could tell it without looking at anything else in the code.

https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-15 Thread Artem Dergachev via cfe-commits



@@ -1551,6 +1567,58 @@ bool 
Preprocessor::isPPInSafeBufferOptOutRegion(SourceLocation ) {
   return InSafeBufferOptOutRegion;
 }
 
+SmallVector
+Preprocessor::serializeSafeBufferOptOutMap() const {
+  assert(!InSafeBufferOptOutRegion &&

haoNoQ wrote:

Yeah in this case we need to support the situation when the warning was ignored.

https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Check C++ base or member initializer in WebKit checkers. (PR #92220)

2024-05-15 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/92220
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Check C++ base or member initializer in WebKit checkers. (PR #92220)

2024-05-15 Thread Artem Dergachev via cfe-commits



@@ -525,11 +525,19 @@ bool TrivialFunctionAnalysis::isTrivialImpl(
   if (!IsNew)
 return It->second;
 
+  TrivialFunctionAnalysisVisitor V(Cache);
+
+  if (auto *CtorDecl = dyn_cast(D)) {
+for (auto *CtorInit : CtorDecl->inits()) {
+  if (!V.Visit(CtorInit->getInit()))

haoNoQ wrote:

I don't remember if either of these needs a null check. IIRC it's probably ok 
but if you run into crashes you know where to look ;)

https://github.com/llvm/llvm-project/pull/92220
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Check C++ base or member initializer in WebKit checkers. (PR #92220)

2024-05-15 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/92220
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-15 Thread Artem Dergachev via cfe-commits



@@ -1315,9 +1374,9 @@ class DerefSimplePtrArithFixableGadget : public 
FixableGadget {
 
   virtual std::optional
   getFixits(const FixitStrategy ) const final;
-
-  // TODO remove this method from FixableGadget interface

haoNoQ wrote:

Right!

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat break, continue, goto, and label statements as trivial in WebKit checkers. (PR #91873)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -445,6 +456,10 @@ class TrivialFunctionAnalysisVisitor
 return Visit(VMT->getSubExpr());
   }
 
+  bool VisitCXXBindTemporaryExpr(const CXXBindTemporaryExpr* BTE) {
+return Visit(BTE->getSubExpr());

haoNoQ wrote:

Which reminds me, I think we're also forgetting to check `CXXCtorInitializer`s. 
Because they aren't included in `CXXConstructorDecl->getBody()`. They also may 
be non-trivial.

Also need to double-check default function argument expressions. We probably 
don't need to handle them when we're jumping into a function from a call site. 
But if we're ever asking the analysis whether a function is safe outside of any 
caller context, we should probably traverse them manually. But I don't think 
we're actually asking that.

https://github.com/llvm/llvm-project/pull/91873
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat break, continue, goto, and label statements as trivial in WebKit checkers. (PR #91873)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91873
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat break, continue, goto, and label statements as trivial in WebKit checkers. (PR #91873)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Aha great LGTM!

https://github.com/llvm/llvm-project/pull/91873
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers (PR #91991)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -3328,3 +3300,63 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
 }
   }
 }
+
+void clang::checkUnsafeBufferUsage(const Decl *D,
+   UnsafeBufferUsageHandler ,
+   bool EmitSuggestions) {
+#ifndef NDEBUG
+  Handler.clearDebugNotes();
+#endif
+
+  assert(D);
+
+  SmallVector Stmts;
+
+  // We do not want to visit a Lambda expression defined inside a method
+  // independently. Instead, it should be visited along with the outer method.
+  // FIXME: do we want to do the same thing for `BlockDecl`s?
+  if (const auto *fd = dyn_cast(D)) {
+if (fd->getParent()->isLambda() && fd->getParent()->isLocalClass())
+  return;
+  }
+
+  // Do not emit fixit suggestions for functions declared in an
+  // extern "C" block.
+  if (const auto *FD = dyn_cast(D)) {
+for (FunctionDecl *FReDecl : FD->redecls()) {
+  if (FReDecl->isExternC()) {
+EmitSuggestions = false;
+break;
+  }
+}
+
+Stmts.push_back(FD->getBody());
+
+if (const auto *ID = dyn_cast(D)) {
+  for (const CXXCtorInitializer *CI : ID->inits()) {
+Stmts.push_back(CI->getInit());
+  }
+}
+  }
+
+  if (const auto *FD = dyn_cast(D)) {

haoNoQ wrote:

Actually, do we even need to handle default field initializers explicitly? I 
think the AST copy-pastes them into `CXXCtorInitializer`s in every constructor 
with the help of `CXXDefaultInitExpr`. So as long as you handle 
`CXXCtorInitializer` you get coverage of default initializers too. Unless there 
are zero constructors that respect that initializer. But then it's just dead 
code so we might as well leave it be.

https://github.com/llvm/llvm-project/pull/91991
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -1551,6 +1567,58 @@ bool 
Preprocessor::isPPInSafeBufferOptOutRegion(SourceLocation ) {
   return InSafeBufferOptOutRegion;
 }
 
+SmallVector
+Preprocessor::serializeSafeBufferOptOutMap() const {
+  assert(!InSafeBufferOptOutRegion &&
+ "Attempt to serialize safe buffer opt-out regions before file being "
+ "completely preprocessed");
+
+  SmallVector SrcSeq;
+
+  for (const auto &[begin, end] : SafeBufferOptOutMap) {
+SrcSeq.push_back(begin);
+SrcSeq.push_back(end);
+  }
+  // Only `SafeBufferOptOutMap` gets serialized. No need to serialize
+  // `LoadedSafeBufferOptOutMap` because if this TU loads a pch/module, every
+  // pch/module in the pch-chain/module-DAG will be loaded one by one in order.
+  // It means that for each loading pch/module m, it just needs to load m's own
+  // `SafeBufferOptOutMap`.
+  return SrcSeq;
+}
+
+void Preprocessor::setDeserializedSafeBufferOptOutMap(
+const SmallVectorImpl ) {
+  auto It = SourceLocations.begin();
+
+  assert(SourceLocations.size() % 2 == 0 &&
+ "ill-formed SourceLocation sequence");
+  while (It != SourceLocations.end()) {
+SourceLocation begin = *It++;
+SourceLocation end = *It++;
+SourceLocation FileLoc = SourceMgr.getFileLoc(begin);
+FileID FID = SourceMgr.getDecomposedLoc(FileLoc).first;
+
+if (FID.isInvalid()) {
+  // I suppose this should not happen:
+  assert(false && "Attempted to read a safe buffer opt-out region whose "
+  "begin location is associated to an invalid File ID.");
+  break;
+}
+assert(!SourceMgr.isLocalFileID(FID) && "Expected a pch/module file");
+// Here we assume that
+// `SourceMgr.getFileLoc(begin) == SourceMgr.getFileLoc(end)`.
+// Though it may not hold in very rare and strange cases, i.e., a pair of

haoNoQ wrote:

If it may happen it shouldn't be an assert. If we really don't support this 
scenario it should probably be a compilation error, even if it only says "we 
don't support this yet sorry". (Can we emit compilation errors from here?)

https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -1551,6 +1567,58 @@ bool 
Preprocessor::isPPInSafeBufferOptOutRegion(SourceLocation ) {
   return InSafeBufferOptOutRegion;
 }
 
+SmallVector
+Preprocessor::serializeSafeBufferOptOutMap() const {
+  assert(!InSafeBufferOptOutRegion &&

haoNoQ wrote:

We require the pragma to be closed before the end of TU right? And that's a 
hard error, not a warning?

https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ commented:

I'm very happy that this is going somewhere! Everything makes sense to me but I 
also don't know a lot about this stuff.

> During serialization, it only serializes regions of the current translation 
> unit. Regions from loaded files are not serialized.

Hmm how does this work? When a precompiled header includes another precompiled 
header, do they both get loaded independently and then you get to combine the 
maps?

https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [WIP][Safe Buffers] Serialize unsafe_buffer_usage pragmas (PR #92031)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/92031
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers (PR #91991)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -3328,3 +3300,63 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
 }
   }
 }
+
+void clang::checkUnsafeBufferUsage(const Decl *D,
+   UnsafeBufferUsageHandler ,
+   bool EmitSuggestions) {
+#ifndef NDEBUG
+  Handler.clearDebugNotes();
+#endif
+
+  assert(D);
+
+  SmallVector Stmts;
+
+  // We do not want to visit a Lambda expression defined inside a method
+  // independently. Instead, it should be visited along with the outer method.
+  // FIXME: do we want to do the same thing for `BlockDecl`s?
+  if (const auto *fd = dyn_cast(D)) {
+if (fd->getParent()->isLambda() && fd->getParent()->isLocalClass())
+  return;
+  }
+
+  // Do not emit fixit suggestions for functions declared in an
+  // extern "C" block.
+  if (const auto *FD = dyn_cast(D)) {
+for (FunctionDecl *FReDecl : FD->redecls()) {
+  if (FReDecl->isExternC()) {
+EmitSuggestions = false;
+break;
+  }
+}
+
+Stmts.push_back(FD->getBody());
+
+if (const auto *ID = dyn_cast(D)) {
+  for (const CXXCtorInitializer *CI : ID->inits()) {
+Stmts.push_back(CI->getInit());
+  }
+}
+  }
+
+  if (const auto *FD = dyn_cast(D)) {

haoNoQ wrote:

I think the rest of the machine may be super unprepared for the scenario when 
there's no complete "function" under analysis. We obviously want to find unsafe 
operations in such code but I think it's a good idea to temporarily suppress 
suggestion/fixit generation in this scenario (like hard-assign 
`EmitSuggestions` to false here), until we carefully confirm that the 
suggestion/fixit machine actually works. It's unlikely to produce useful 
suggestions or fixits anyway, unless the initializer somehow contains an entire 
compound statement with local variables or nested functions or classes inside. 
But I think it's very likely to step on some assertions and crash because we 
haven't considered this scenario at all until recently.

Global variable initializers are probably in the same situation.

https://github.com/llvm/llvm-project/pull/91991
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers (PR #91991)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -3328,3 +3300,63 @@ void clang::checkUnsafeBufferUsage(const Decl *D,
 }
   }
 }
+
+void clang::checkUnsafeBufferUsage(const Decl *D,
+   UnsafeBufferUsageHandler ,
+   bool EmitSuggestions) {
+#ifndef NDEBUG
+  Handler.clearDebugNotes();
+#endif
+
+  assert(D);
+
+  SmallVector Stmts;
+
+  // We do not want to visit a Lambda expression defined inside a method
+  // independently. Instead, it should be visited along with the outer method.
+  // FIXME: do we want to do the same thing for `BlockDecl`s?
+  if (const auto *fd = dyn_cast(D)) {
+if (fd->getParent()->isLambda() && fd->getParent()->isLocalClass())
+  return;
+  }
+
+  // Do not emit fixit suggestions for functions declared in an
+  // extern "C" block.
+  if (const auto *FD = dyn_cast(D)) {
+for (FunctionDecl *FReDecl : FD->redecls()) {
+  if (FReDecl->isExternC()) {
+EmitSuggestions = false;
+break;
+  }
+}
+
+Stmts.push_back(FD->getBody());
+
+if (const auto *ID = dyn_cast(D)) {
+  for (const CXXCtorInitializer *CI : ID->inits()) {
+Stmts.push_back(CI->getInit());
+  }
+}
+  }
+
+  if (const auto *FD = dyn_cast(D)) {
+// Visit in-class initializers for fields.
+if (!FD->hasInClassInitializer())
+  return;
+
+Stmts.push_back(FD->getInClassInitializer());
+  }
+
+  if (isa(D) || isa(D)) {
+Stmts.push_back(D->getBody());
+  }
+
+  assert(!Stmts.empty());
+
+  for (Stmt *S : Stmts) {
+auto [FixableGadgets, WarningGadgets, Tracker] =
+findGadgets(S, D->getASTContext(), Handler, EmitSuggestions);
+applyGadgets(D, std::move(FixableGadgets), std::move(WarningGadgets),

haoNoQ wrote:

I think `applyGadgets()` should be invoked only once per function. For 
instance, when we're emitting a fixit for a function parameter, we emit a 
single fixit that covers every use of that parameter. We can't do that 
separately for the use of that parameter in a `CXXCtorInitializer` and then 
independently do the same for the rest of the function body. That'd result in 
conflicting fixes.

The tracker should also be re-used for the entire function. We need to know if 
we can fix the use inside the `CXXCtorInitializer` at all before emitting a 
fixit for all other uses.

(Deep inside their hearts these data structures dream of becoming global for 
the entire translation unit.)

So probably `[FixableGadgets, WarningGadgets, Tracker]` should be passed by 
reference, to `push_back()` gadgets directly into them.

https://github.com/llvm/llvm-project/pull/91991
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers (PR #91991)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ commented:

Aha makes sense! Thanks for catching these false negatives!!

https://github.com/llvm/llvm-project/pull/91991
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for field and constructor initializers (PR #91991)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91991
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -2856,7 +2916,7 @@ getFixIts(FixableGadgetSets , const 
FixitStrategy ,
   }
 #ifndef NDEBUG
   Handler.addDebugNoteForVar(
-  VD, F->getBaseStmt()->getBeginLoc(),
+  VD, F->getSourceLoc(),

haoNoQ wrote:

Hmm right, this is literally the only case where we use it *at all*. Maybe the 
right thing to do is to make that source location (or whatever we need) a 
member variable in the `Gadget` class, and pre-populate it in the 
`findGadgets()` method with the root statement. We don't really care what 
statement it is anyway do we. We just need something to work with. We can even 
do this under `#ifndef NDEBUG` because that's the only way we ever use it. And 
we don't have to define it in every subclass anymore. But if a subclass doesn't 
like the default value they can still reassign it in constructor.

I.e.
```diff
 class FixableGadget {
+#ifndef NDEBUG
+SourceLocation DebugLoc;
+#endif

 public:
-  FixableGadget(Kind K) : Gadget(K) {}
+  FixableGadget(Kind K, const MatchFinder::MatchResult ) : 
+  Gadget(K),
+  DebugLoc(Result.getNodeAs(getDebugName() + 
"Gadget")->getBeginLoc())
+  {}

+#ifndef NDEBUG
+  SourceLocation getDebugLoc() const { return DebugLoc; }
+#endif
};

 class PointerInitGadget : FixableGadget {
 PointerInitGadget(const MatchFinder::MatchResult )
-: FixableGadget(Kind::PointerInit),
+: FixableGadget(Kind::PointerInit, Result),
   PtrInitLHS(Result.Nodes.getNodeAs(PointerInitLHSTag)),
   PtrInitRHS(Result.Nodes.getNodeAs(PointerInitRHSTag)) {}

- // And then you no longer need getSourceLoc() in every class.
 };
```

(You don't *have* to do this in order to land the patch. I think it looks great 
already. I'm just thinking out loud.)

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -1315,9 +1374,9 @@ class DerefSimplePtrArithFixableGadget : public 
FixableGadget {
 
   virtual std::optional
   getFixits(const FixitStrategy ) const final;
-
-  // TODO remove this method from FixableGadget interface

haoNoQ wrote:

Still relevant tbh.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Aha ok everything makes sense now! I think this is good to go so LGTM! Let me 
see the other PR too.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -2295,6 +2292,23 @@ class UnsafeBufferUsageReporter : public 
UnsafeBufferUsageHandler {
 }
   }
 
+  void handleUnsafeOperationInContainer(const Stmt *Operation,
+bool IsRelatedToDecl,
+ASTContext ) override {
+SourceLocation Loc;
+SourceRange Range;
+unsigned MsgParam = 0;
+if (const auto *CtorExpr = dyn_cast(Operation)) {

haoNoQ wrote:

This should be a hard `cast<>()` because you don't really handle the 
else-branch so it results in an invalid source location. We can make it dynamic 
again later if we want.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -921,10 +937,55 @@ class UnsafeBufferUsageAttrGadget : public WarningGadget {
   }
 
   static Matcher matcher() {
-return 
stmt(callExpr(callee(functionDecl(hasAttr(attr::UnsafeBufferUsage
-.bind(OpTag));
+auto HasUnsafeFnDecl =
+callee(functionDecl(hasAttr(attr::UnsafeBufferUsage)));
+return stmt(callExpr(HasUnsafeFnDecl).bind(OpTag));
+  }
+
+  void handleUnsafeOperation(UnsafeBufferUsageHandler ,
+ bool IsRelatedToDecl,
+ ASTContext ) const override {
+Handler.handleUnsafeOperation(Op, IsRelatedToDecl, Ctx);
+  }
+  SourceLocation getSourceLoc() const override { return Op->getBeginLoc(); }
+
+  DeclUseList getClaimedVarUseSites() const override { return {}; }
+};
+
+/// A call of a constructor that performs unchecked buffer operations
+/// over one of its pointer parameters, or constructs a class object that will
+/// perform buffer operations that depend on the correctness of the parameters.
+class UnsafeBufferUsageCtorAttrGadget : public WarningGadget {
+  constexpr static const char *const OpTag = "cxx_construct_expr";
+  const CXXConstructExpr *Op;
+
+public:
+  UnsafeBufferUsageCtorAttrGadget(const MatchFinder::MatchResult )
+  : WarningGadget(Kind::UnsafeBufferUsageCtorAttr),
+Op(Result.Nodes.getNodeAs(OpTag)) {}
+
+  static bool classof(const Gadget *G) {
+return G->getKind() == Kind::UnsafeBufferUsageCtorAttr;
+  }
+
+  static Matcher matcher() {
+auto HasUnsafeCtorDecl =
+hasDeclaration(cxxConstructorDecl(hasAttr(attr::UnsafeBufferUsage)));
+// std::span(ptr, size) ctor is handled by SpanTwoParamConstructorGadget.
+auto HasTwoParamSpanCtorDecl = hasDeclaration(

haoNoQ wrote:

I wouldn't mind something like
```
auto HasTwoParamSpanCtorDecl = SpanTwoParamConstructorGadget::matcher();
```
to prevent it from diverging when the other gadget changes. (We could build a 
fancier conflict resolution system but I don't think it's necessary yet.)

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits



@@ -2856,7 +2916,7 @@ getFixIts(FixableGadgetSets , const 
FixitStrategy ,
   }
 #ifndef NDEBUG
   Handler.addDebugNoteForVar(
-  VD, F->getBaseStmt()->getBeginLoc(),
+  VD, F->getSourceLoc(),

haoNoQ wrote:

This is the only place where we do any actual virtual dispatch over 
`getSourceLoc()` right? Otherwise we could merge it into 
`handleUnsafeOperation()` too. Maybe with a slightly different design we could 
eliminate this too. Dunno, I don't have anything specific in mind.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-14 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ commented:

> https://github.com/llvm/llvm-project/blob/2ff43ce87e66d9324370e35ea6743ef57400c76e/clang/lib/Analysis/UnsafeBufferUsage.cpp#L1373-L1374
> 
> These assert that exactly one gadget matched. I think it's kinda worthwhile 
> keeping the warnings independent too, so I don't see this as a bad thing. If 
> we were to mark the span ctor as `[[clang::unsafe_buffer_usage]]` for 
> instance, it the span-specific warning gives a better output still, and 
> generating two warnings for the same piece of code is noisy.

Hmm this isn't really what these assertions are about. They don't protect 
against duplicate gadgets, they protect against duplicate `.bind()` tags across 
gadgets. Because gadgets are connected by the `anyOf()` matcher (as opposed to 
`eachOf()`), a single match result will always have exactly one gadget, the 
first one in the list, even if they match the same statement.

So I think you're right that there's a problem: if you have two gadgets match 
the same statement, only one will be reported. So your solution is probably 
necessary. We should also maybe do something about that, so that warning 
gadgets weren't conflicting this way. But I don't think *this* assertion guards 
against that.

If this still doesn't sound right to you, can you include a test where the 
assertion fails for you? No matter how ridiculous the test is. The machine is 
complex enough to appreciate ridiculous tests. I think we want to get to the 
bottom of this.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow explicit instantiation of Ref/RefPtr on call arguments. (PR #91875)

2024-05-13 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/91875
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat break, continue, goto, and label statements as trivial in WebKit checkers. (PR #91873)

2024-05-13 Thread Artem Dergachev via cfe-commits



@@ -445,6 +456,10 @@ class TrivialFunctionAnalysisVisitor
 return Visit(VMT->getSubExpr());
   }
 
+  bool VisitCXXBindTemporaryExpr(const CXXBindTemporaryExpr* BTE) {
+return Visit(BTE->getSubExpr());

haoNoQ wrote:

At this point you probably want to double-check that the destructor itself is 
trivial (in the sense of your analysis, not in the general C++ sense). 
Destructor calls are generally not represented in the AST at all (unless 
they're explicit), CodeGen just figures them out from the rest of the syntax.

`CXXBindTemporaryExpr` is probably the right place to look for destructors, at 
least according to me after reading a lot of ASTs. But at the same time nobody 
really understands what `CXXBindTemporaryExpr` stands for anymore and a few 
famous people have argued for removing it.

Note that `CXXBindTemporaryExpr` doesn't guarantee that there will be a 
destructor call at the end of the full-expression (lifetime extension is a 
thing - need to consult `MaterializeTemporaryExpr` to see where this is going), 
or even at the end of function (RVO is a thing), even when the constructor is 
targeting a local variable (NRVO is a thing). In case of RVO/NRVO the caller 
doesn't necessarily call the destructor either; it could also be RVOing/NRVOing 
it further up the call stack up to arbitrarily large depth.

In pre-C++17 AST, and even in C++17 and later if NRVO is involved, the AST 
would look as if a copy/move is being made even if it's elided in practice.

So when checking the destructor, you need to think how far do you want to go 
when it comes to determining if the destructor actually happens by the time the 
function returns. Because in order to implement this accurately you'll need to 
re-implement a big chunk of CodeGen.

You can also try to check the destructor inside `VisitCXXConstructExpr()`, as 
if assuming that every time an object is constructed, it'd be deleted 
"eventually". Except in this case you'll miss the part where you're receiving 
an object by value from a `CallExpr` (and such); then you'll also need to check 
the destructor for the received object even though you don't see the 
constructor. This isn't a problem when you're relying on `CXXBindTemporaryExpr` 
which will be present around the `CallExpr` normally when the object needs a 
destructor.

https://github.com/llvm/llvm-project/pull/91873
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Respect the [[clang::unsafe_buffer_usage]] attribute for constructors (PR #91777)

2024-05-13 Thread Artem Dergachev via cfe-commits

haoNoQ wrote:

Hi! Thank you for digging into this! Sorry for the delay.

> The new UnsafeBufferUsageCtorAttrGadget gadget explicitly avoids matching 
> against the std::span(ptr, size) constructor because that is handled by 
> SpanTwoParamConstructorGadget and we never want two gadgets to match the same 
> thing (and this is guarded by asserts).

Hmm at a glance I'm not sure this should really be illegal. It's a big problem 
when fixable gadgets overlap, because this means that they'll try to fix the 
same code in two incompatible ways. I don't immediately see why this would be a 
problem for warning gadgets that don't try to fix anything. This just means 
that the code is non-compliant for two different reasons, which is 
theoretically fine. I also tried to reproduce your assert and I didn't hit it; 
which one are you running into?

> To handle this we allow the gadget to control if the warning is general (it 
> calls handleUnsafeBufferUsage()) or is a std-container-specific warning (it 
> calls handleUnsafeOperationInContainer()).

Ooo I love this.

My initial thinking was, just make the base Gadget class "public" (move it into 
`UnsafeBufferUsage.h` so that the handler could see it) and make the handler 
switch over the gadget's `getKind()` to produce specialized messages for each 
gadget. This would help us unscrew the rest of the `Reporter` code so that it 
also didn't have to guess by statement kind.

Your design can grow into that too - make a separate handler method for each 
gadget kind (or group of kinds), and it preserves even more encapsulation. 
Additionally it throws away `getBaseStmt()` in favor of a more "integrated" 
solution. I'm a big fan, let's keep this.

https://github.com/llvm/llvm-project/pull/91777
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat bitwise_cast, std::addressof, and new as trivial in WebKit checkers. (PR #91830)

2024-05-10 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.


https://github.com/llvm/llvm-project/pull/91830
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Treat bitwise_cast, std::addressof, and new as trivial in WebKit checkers. (PR #91830)

2024-05-10 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM thanks!

https://github.com/llvm/llvm-project/pull/91830
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ closed https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-08 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

@sharkautarch Yeah I think the situation where we observe the allocation site 
during analysis may be significantly different in presence of smart pointers. 
It may be a good idea to include such check in the heuristic, to reclaim some 
of the false negatives currently silenced by it. If we've seen the initial 
reference count and we've modeled it perfectly so far, these false positives 
can't happen. But this implies that we model our atomics perfectly. And given 
that we don't even know what a "reference count" is (maybe it's just some 
unrelated integer?), it's quite hard to get that right.

Also we don't really do "whole program analysis" and we probably can't do that 
without major changes to our technique; it simply doesn't scale to that scale. 
Our existing cross-TU mode is more about _occasionally_ breaking translation 
unit boundaries when we really want something specific from another TU (like 
the effects of a function call we just encountered), but we still focus on 
small portions of the program at a time. We never really derive whole-program 
conclusions from these small independent invocations of our analysis. That'd 
require very different technology. It's not necessarily difficult per se – we 
could develop such technology and occasionally make our main analysis 
interoperate with it. But we haven't done any of that yet and our cross-TU mode 
isn't that kind of cross-TU mode.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Support determining origins in a conditional operator in WebKit checkers. (PR #91143)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Aha ok this aligns with my understanding. LGTM!

https://github.com/llvm/llvm-project/pull/91143
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Support determining origins in a conditional operator in WebKit checkers. (PR #91143)

2024-05-08 Thread Artem Dergachev via cfe-commits



@@ -27,12 +28,18 @@ tryToFindPtrOrigin(const Expr *E, bool 
StopAtFirstRefCountedObj) {
   E = tempExpr->getSubExpr();
   continue;
 }
+if (auto *Expr = dyn_cast(E)) {
+  return tryToFindPtrOrigin(Expr->getTrueExpr(), StopAtFirstRefCountedObj,
+callback) &&
+ tryToFindPtrOrigin(Expr->getFalseExpr(), StopAtFirstRefCountedObj,

haoNoQ wrote:

Aha makes sense then!

https://github.com/llvm/llvm-project/pull/91143
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Support determining origins in a conditional operator in WebKit checkers. (PR #91143)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91143
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow trivial operator++ (PR #91102)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91102
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow trivial operator++ (PR #91102)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM now thanks!

https://github.com/llvm/llvm-project/pull/91102
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow trivial operator++ (PR #91102)

2024-05-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91102
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow trivial operator++ (PR #91102)

2024-05-08 Thread Artem Dergachev via cfe-commits



@@ -309,21 +309,8 @@ class TrivialFunctionAnalysisVisitor
   bool VisitDefaultStmt(const DefaultStmt *DS) { return VisitChildren(DS); }
 
   bool VisitUnaryOperator(const UnaryOperator *UO) {
-// Operator '*' and '!' are allowed as long as the operand is trivial.
-auto op = UO->getOpcode();
-if (op == UO_Deref || op == UO_AddrOf || op == UO_LNot || op == UO_Not)
-  return Visit(UO->getSubExpr());
-
-if (UO->isIncrementOp() || UO->isDecrementOp()) {
-  // Allow increment or decrement of a POD type.
-  if (auto *RefExpr = dyn_cast(UO->getSubExpr())) {
-if (auto *Decl = dyn_cast(RefExpr->getDecl()))
-  return Decl->isLocalVarDeclOrParm() &&
- Decl->getType().isPODType(Decl->getASTContext());
-  }
-}
-// Other operators are non-trivial.
-return false;
+// Unary operators are trivial if its operand is trivial.
+return Visit(UO->getSubExpr());

haoNoQ wrote:

When blanket-approving all operators, it's probably a good idea to go through 
the current list to see if any exceptions need  to be made. Eg. for unary 
operators it's 
https://github.com/llvm/llvm-project/blob/llvmorg-18-init/clang/include/clang/AST/OperationKinds.def#L417
 and binary operators are just above that. But I don't immediately see anything 
problematic in that list. (Is `co_await` blessed?)

https://github.com/llvm/llvm-project/pull/91102
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Ignore system headers in WebKit checkers. (PR #91103)

2024-05-07 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Perfect thanks!!

https://github.com/llvm/llvm-project/pull/91103
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Fix a typo in webkit.NoUncountedMemberChecker. (PR #91402)

2024-05-07 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Nice LGTM!

You could test this patch by pasting the warning into one of the 
`expected-warning{{}}` clauses in the tests but I honestly don't think it's all 
that important for a patch like this.

https://github.com/llvm/llvm-project/pull/91402
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Use explicit call description mode in MIGChecker (PR #91331)

2024-05-07 Thread Artem Dergachev via cfe-commits



@@ -87,7 +90,7 @@ class MIGChecker : public Checker,
 #undef CALL
   };
 
-  CallDescription OsRefRetain{{"os_ref_retain"}, 1};
+  CallDescription OsRefRetain{CDM::SimpleFunc, {"os_ref_retain"}, 1};

haoNoQ wrote:

Yes, this is a plain C function. Definitely not builtin. May or may not come 
from system headers depending on what we're compiling.

https://github.com/llvm/llvm-project/pull/91331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Use explicit call description mode in MIGChecker (PR #91331)

2024-05-07 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Looks awesome thank you!

https://github.com/llvm/llvm-project/pull/91331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Use explicit call description mode in MIGChecker (PR #91331)

2024-05-07 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91331
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits

haoNoQ wrote:

> I've just left here what my thought process was when I dig into a similar 
> case. It might be useful, who knows.

Medium-to-long-term I think an attribute-based approach might make sense there:
- Either annotate reference-counting pointers as "hey I'm a smart pointer (I'm 
following a different safety model in which symbolic execution based analysis 
would do more harm than good) so please ignore memory managed by me";
- Or annotate reference-counted objects (i.e. objects that contain an intrusive 
reference count and are always managed by intrusive reference-counted pointers) 
with an attribute "hey I'm always well-managed, please ignore my allocations 
entirely (for the same reason)".

Or both.

I've definitely seen a few codebases, that are security-critical to a large-ish 
chunk of humanity, that have like 20 competing ad-hoc reference-counting 
schemes. Mostly in plain C where MallocChecker would otherwise be very useful, 
if it wasn't ruined by false positives of a similar nature from all these 
reference counting schemes. These schemes probably don't make sense to hardcode 
in the compiler because there's no inheritance so you'd need to hardcode every 
struct that follows one of those schemes, not just every scheme in and of 
itself.

> I considered modeling the atomic member functions, until the first place they 
> escape, after which we can no longer ever reason about them. This made me to 
> look into suppressions.

Yeah I think it's not sufficient to model the atomics. It's a good idea to 
model them anyway, but even when atomics aren't used (eg. the custom smart 
pointer never needed to be thread-safe), the fundamental problem is that we 
still don't know the *initial* value of the reference count. In particular we 
can't refute the possibility that the original value was like `-1` or something.

Modeling atomics would make it better when the smart pointer was first created 
*during* analysis, so we actually know that the initial value is 0. Then by 
carefully tracking it we might arrive to the correct conclusion. But when the 
initial value is symbolic we're fundamentally powerless without the 
domain-specific knowledge that the value could not have been `-1`.

It's possible that this domain-specific knowledge could be transferred to us 
with the help of well-placed assertions across the smart pointer class. But an 
attribute could achieve the same in a more direct manner.

> I believe, that such a heuristic with dominators could work for the rest of 
> the checkers - where the bug report is attached to some given statement - and 
> not delayed until some other future statement like in the leak checker.

Yes, I think there has to be something good in this area, even though I haven't 
got any good specific solutions in my head. @Szelethus did a lot of initial 
experimentation in this area, which resulted in improved condition tracking, 
but we haven't used it for false positive suppression yet. Once we research 
this deeper and make it more principled, maybe we should really start doing 
that.

It may also be a good idea to not look at the bug report in isolation, but 
consider the entire space of execution paths on which it was found. If the 
space isn't "vast" enough to convince us that we aren't stepping onto an 
#61669, suppress the warning.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits

@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

haoNoQ wrote:

> Do we have any safeguard to only match names within the `std` namespace? 
> Could you add a test case demonstrating that a user-defined type wouldn't be 
> mistaken for `atomic` here?

There aren't any safeguards, but I'm not sure we want them. This is already a 
crude heuristic that goes at like 45 degrees against the desired direction. I 
think I'd rather have it catch more false positives by respecting user-defined 
types that are probably atomics, than eliminate a few false negatives when our 
tool is applied to... Chemistry software probably? A few video games come to 
mind? Which are both amazing and I'd love to catch a few bugs in them. But I'm 
generally more worried about the entire projects that can't use our tool at all 
because they use custom atomic classes, dealing with problems similar to the 
original bug report.

Because this heuristic applies only to method calls inside destructors (which 
doesn't include other destructor calls), the exact situation where this causes 
problems is _"somebody explicitly calls a method on a class named 
'...atomic...' which isn't an actual atomic integer, in a destructor which 
isn't a destructor of a smart pointer, and we're tracking a MallocChecker 
use-after-free report where memory was released inside that destructor, and 
that report is actually desired by the user"_. Which is definitely not 
impossible, but even in projects where this could happen, it would not happen 
every time; I hope that only one or two reports are affected in practice. 
MallocChecker isn't even that good in C++ code in which the programmers know 
what a destructor is, so I think even the "report is actually desired by the 
user" part would be fairly hard to satisfy.

So I'm future-proofing this a bit, acknowledging that if we went with strict 
`std::atomic` requirement, we'd be likely to relax it in the future when more 
bug reports come in.

Dunno, am I being overly pessimistic? Again, I'm very open to changing my mind.

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-06 Thread Artem Dergachev via cfe-commits



@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {

haoNoQ wrote:

> Do we need to do the heavy weight `getNameAsString` here? Could we get the 
> identifier instead?

Dunno do we still care about this? This isn't a hot path, and it's so annoying 
and error-prone to check the crash pre-condition first. If it's fast enough for 
ASTMatchers it's probably fast enough for us.

I feel like, since nobody provided a safer method, and since the documentation 
appears to be very stale (`getNameAsString` is described as deprecated but it's 
already clear that it's not going anywhere), this doesn't appear to be a real 
problem to anybody.

Though, I'm very open to changing my mind about this :)

https://github.com/llvm/llvm-project/pull/90918
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Use cmake to find perl executable (PR #91275)

2024-05-06 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.


https://github.com/llvm/llvm-project/pull/91275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Use cmake to find perl executable (PR #91275)

2024-05-06 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ edited https://github.com/llvm/llvm-project/pull/91275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Use cmake to find perl executable (PR #91275)

2024-05-06 Thread Artem Dergachev via cfe-commits



@@ -1,4 +1,4 @@
-REQUIRES: shell
+REQUIRES: perl

haoNoQ wrote:

A few of those require `rm` and `ls` but these don't count as shell right?

https://github.com/llvm/llvm-project/pull/91275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Use cmake to find perl executable (PR #91275)

2024-05-06 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

Looks great thanks a lot!!

I can confirm that it works for me on a mac too.

@jroelofs's comment is on point.

https://github.com/llvm/llvm-project/pull/91275
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Support determining origins in a conditional operator in WebKit checkers. (PR #91143)

2024-05-06 Thread Artem Dergachev via cfe-commits



@@ -27,12 +28,18 @@ tryToFindPtrOrigin(const Expr *E, bool 
StopAtFirstRefCountedObj) {
   E = tempExpr->getSubExpr();
   continue;
 }
+if (auto *Expr = dyn_cast(E)) {
+  return tryToFindPtrOrigin(Expr->getTrueExpr(), StopAtFirstRefCountedObj,
+callback) &&
+ tryToFindPtrOrigin(Expr->getFalseExpr(), StopAtFirstRefCountedObj,

haoNoQ wrote:

Hmm does this code play well with `StopAtFirstRefCountedObj`? Am I reading this 
right that both branches need to originate from a smart pointer, so we should 
never really stop in the middle between the branches?

https://github.com/llvm/llvm-project/pull/91143
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Ignore system headers in WebKit checkers. (PR #91103)

2024-05-06 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

Aha, this looks like a more traditional solution! Looks great. You can make 
tests for this by feeding the compiler fake system headers with the help of
```
#pragma clang system_header
```
!

https://github.com/llvm/llvm-project/pull/91103
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Allow trivial operator++ (PR #91102)

2024-05-06 Thread Artem Dergachev via cfe-commits



@@ -316,10 +316,15 @@ class TrivialFunctionAnalysisVisitor
 
 if (UO->isIncrementOp() || UO->isDecrementOp()) {
   // Allow increment or decrement of a POD type.
-  if (auto *RefExpr = dyn_cast(UO->getSubExpr())) {
+  auto *SubExpr = UO->getSubExpr();
+  if (auto *RefExpr = dyn_cast(SubExpr)) {

haoNoQ wrote:

Hmm. You're jumping two steps at a time here: exploring `UnaryOperator` and the 
inner `DeclRefExrpr`/`MemberExpr` together simultaneously, as opposed to fully 
relying on the `Visit()` method which already has slightly different behavior.

For example, it looks like an access to a global variable is considered trivial 
_unless it's an increment_. In particular, `global += 1` is still considered 
trivial but `++global` isn't. (This has actually already been the case before 
your patch, you just added more similar behavior.)

Is this intended? Maybe it's better to check for these extra conditions by 
default?

https://github.com/llvm/llvm-project/pull/91102
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Treat (foo())->bar() like foo()->bar(). (PR #91052)

2024-05-06 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM! Yeah I really don't know why `ParenExpr` needed to be a thing in the AST.

https://github.com/llvm/llvm-project/pull/91052
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [webkit.RefCntblBaseVirtualDtor] Ignore WTF::RefCounted and its variants missing virtual destructor (PR #91009)

2024-05-06 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/91009
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] MallocChecker: Recognize std::atomics in smart pointer suppression. (PR #90918)

2024-05-02 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ created 
https://github.com/llvm/llvm-project/pull/90918

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this time. 
This is useful because even though the actual builtin atomic is still there, it 
may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based 
heuristics, which isn't necessary to fix the bug but arguably a good idea 
regardless.

>From 6f7c0e33d10d6b3dec2a96a8cdc132eff71463fc Mon Sep 17 00:00:00 2001
From: Artem Dergachev 
Date: Thu, 2 May 2024 15:15:27 -0700
Subject: [PATCH] [analyzer] MallocChecker: Recognize std::atomics in smart
 pointer suppression.

Fixes #90498.

Same as 5337efc69cdd5 for atomic builtins, but for `std::atomic` this time.
This is useful because even though the actual builtin atomic is still there,
it may be buried beyond the inlining depth limit.

Also add one popular custom smart pointer class name to the name-based
heuristics, which isn't necessary to fix the bug but arguably a good idea
regardless.
---
 .../StaticAnalyzer/Checkers/MallocChecker.cpp |  19 ++-
 .../Inputs/system-header-simulator-cxx.h  |   7 ++
 clang/test/Analysis/NewDelete-atomics.cpp | 116 --
 3 files changed, 130 insertions(+), 12 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp 
b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 11651fd491f743..3d89d1e6343490 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -3447,7 +3447,7 @@ static bool isReferenceCountingPointerDestructor(const 
CXXDestructorDecl *DD) {
 if (N.contains_insensitive("ptr") || N.contains_insensitive("pointer")) {
   if (N.contains_insensitive("ref") || N.contains_insensitive("cnt") ||
   N.contains_insensitive("intrusive") ||
-  N.contains_insensitive("shared")) {
+  N.contains_insensitive("shared") || N.ends_with_insensitive("rc")) {
 return true;
   }
 }
@@ -3479,13 +3479,24 @@ PathDiagnosticPieceRef 
MallocBugVisitor::VisitNode(const ExplodedNode *N,
   // original reference count is positive, we should not report use-after-frees
   // on objects deleted in such destructors. This can probably be improved
   // through better shared pointer modeling.
-  if (ReleaseDestructorLC) {
+  if (ReleaseDestructorLC && (ReleaseDestructorLC == CurrentLC ||
+  ReleaseDestructorLC->isParentOf(CurrentLC))) {
 if (const auto *AE = dyn_cast(S)) {
+  // Check for manual use of atomic builtins.
   AtomicExpr::AtomicOp Op = AE->getOp();
   if (Op == AtomicExpr::AO__c11_atomic_fetch_add ||
   Op == AtomicExpr::AO__c11_atomic_fetch_sub) {
-if (ReleaseDestructorLC == CurrentLC ||
-ReleaseDestructorLC->isParentOf(CurrentLC)) {
+BR.markInvalid(getTag(), S);
+  }
+} else if (const auto *CE = dyn_cast(S)) {
+  // Check for `std::atomic` and such. This covers both regular method 
calls
+  // and operator calls.
+  if (const auto *MD =
+  dyn_cast_or_null(CE->getDirectCallee())) {
+const CXXRecordDecl *RD = MD->getParent();
+// A bit wobbly with ".contains()" because it may be like
+// "__atomic_base" or something.
+if (StringRef(RD->getNameAsString()).contains("atomic")) {
   BR.markInvalid(getTag(), S);
 }
   }
diff --git a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h 
b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
index 1c2be322f83c20..29326ec1f9280d 100644
--- a/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
+++ b/clang/test/Analysis/Inputs/system-header-simulator-cxx.h
@@ -1260,6 +1260,13 @@ template<
 iterator end() const { return iterator(val + 1); }
 };
 
+template 
+class atomic {
+public:
+  T operator++();
+  T operator--();
+};
+
 namespace execution {
 class sequenced_policy {};
 }
diff --git a/clang/test/Analysis/NewDelete-atomics.cpp 
b/clang/test/Analysis/NewDelete-atomics.cpp
index 54fce17ea7bd22..1425acab7489ba 100644
--- a/clang/test/Analysis/NewDelete-atomics.cpp
+++ b/clang/test/Analysis/NewDelete-atomics.cpp
@@ -20,7 +20,7 @@ typedef enum memory_order {
   memory_order_seq_cst = __ATOMIC_SEQ_CST
 } memory_order;
 
-class Obj {
+class RawObj {
   int RefCnt;
 
 public:
@@ -37,11 +37,27 @@ class Obj {
   void foo();
 };
 
+class StdAtomicObj {
+  std::atomic RefCnt;
+
+public:
+  int incRef() {
+return ++RefCnt;
+  }
+
+  int decRef() {
+return --RefCnt;
+  }
+
+  void foo();
+};
+
+template 
 class IntrusivePtr {
-  Obj *Ptr;
+  T *Ptr;
 
 public:
-  IntrusivePtr(Obj *Ptr) : Ptr(Ptr) {
+  IntrusivePtr(T *Ptr) : Ptr(Ptr) {
 Ptr->incRef();
   }
 
@@ -55,22 +71,106 @@ class IntrusivePtr {
   delete Ptr;
   }
 
-  Obj *getPtr() const { return Ptr; } // no-warning
+  T *getPtr() const { return Ptr; } // no-warning
+};
+
+//

[clang] [alpha.webkit.UncountedLocalVarsChecker] Don't warning on inlined functions (PR #90733)

2024-05-01 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

@mikhailramalho long time no see!

I think @rniwa is right on this one. "Inline" is completely orthogonal to 
"safe". Fundamentally it only matters what the function _actually does_; it 
doesn't matter how it's defined or where or how the code is generated for it.

These checkers don't follow our usual paradigm of "false positives are bad". 
They're coding convention checkers that force WebKit to use smart pointers 
consistently, even if the compiler hasn't identified any specific memory leaks 
or use-after-free vulnerabilities to worry about. While it's still valuable to 
avoid emitting warnings when the code is "obviously safe", they probably 
wouldn't go for a very approximate reasoning such as "all inline functions are 
safe" . We do this sort of stuff quite often in the rest of the static analyzer 
where false positives are much more terrifying than false negatives. But here, 
WebKit people such as @rniwa have consensually accepted a potentially high 
"false positive" rate as long as this means getting close to _some real safety_.

So this whole "trivial function analysis" thing aggressively treats unknown 
functions as unsafe by default, and when it actually gets to see the body 
(which correlates quite a bit with being inline) it carefully inspects it and 
confirms that it's actually "trivial" enough to not screw memory management. If 
you turn it off for inline functions, that's pretty close to turning it off 
entirely. That's probably not what they're looking for. There has to be a more 
careful solution to the false positves you're seeing.

https://github.com/llvm/llvm-project/pull/90733
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Ignore methods of WTF String classes. (PR #90704)

2024-05-01 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM! The revert was caused entirely by something in the mock headers in tests 
right?

https://github.com/llvm/llvm-project/pull/90704
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Skip over std namespace in WebKit checkers. (PR #90552)

2024-05-01 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

Hmm it doesn't look like we actually have that other suppression. It looks like 
the checker simply never warned in the standard library because there aren't 
WebKit classes in the standard library. But it looks like you've found a few 
cases where it actually happens? How does this happen? Is this because 
templates get instantiated with WTF classes as type arguments? Maybe a 
something completely different solution should be considered?

If we think it's actually a good idea to suppress warnings about standard 
functions, typically these suppressions are implemented with 
`getSourceManager().isInSystemHeader(SLoc)` checks, such as [the one we 
actually 
have](https://github.com/llvm/llvm-project/blob/llvmorg-19-init/clang/lib/StaticAnalyzer/Checkers/WebKit/NoUncountedMembersChecker.cpp#L110).
 This covers not only the standard library, but also other code outside of your 
project (basically everything that goes with `-isystem` as opposed to `-I`). 
Which may or may not be what we want.

I think it's likely that your solution is optimal but I still want to see some 
examples to understand what exactly we're dealing with.

https://github.com/llvm/llvm-project/pull/90552
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Skip over std namespace in WebKit checkers. (PR #90552)

2024-04-30 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

(If yes, LGTM!)

https://github.com/llvm/llvm-project/pull/90552
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Skip over std namespace in WebKit checkers. (PR #90552)

2024-04-30 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

Ok am I reading this right: this is about testing code that lives directly in 
namespace `std::` but somehow still lives in your project right? Such as your 
custom modifications to standard classes, various overloads and specializations 
of standard things.

This patch does not affect the behavior of checkers when you're just passing 
raw pointers from user code to standard functions such as `std::memcpy()`. You 
still want to have a warning about those.

And generally you already don't have any warnings about things that's going on 
in system headers themselves. There's a separate suppression about that.

https://github.com/llvm/llvm-project/pull/90552
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Support more trivial expressions. (PR #90414)

2024-04-30 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM other than the other tiny comment.

https://github.com/llvm/llvm-project/pull/90414
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Support more trivial expressions. (PR #90414)

2024-04-30 Thread Artem Dergachev via cfe-commits



@@ -360,6 +370,16 @@ class TrivialFunctionAnalysisVisitor
 return TrivialFunctionAnalysis::isTrivialImpl(Callee, Cache);
   }
 
+  bool
+  VisitSubstNonTypeTemplateParmExpr(const SubstNonTypeTemplateParmExpr *E) {
+// Non-type template paramter is trivial if the replacement is trivial.
+return Visit(E->getReplacement());

haoNoQ wrote:

These might be always trivial? They're compile-time constants after all.

https://github.com/llvm/llvm-project/pull/90414
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Ignore methods of WTF String classes. (PR #90180)

2024-04-30 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/90180
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [alpha.webkit.UncountedCallArgsChecker] Avoid emitting warnings for Ref, RefPtr, and their variants. (PR #90153)

2024-04-25 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/90153
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Add handling of the `-nostdlibinc` option to ccc-analyzer (PR #88017)

2024-04-08 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

I never realized that scan-build makes a "short list" of flags to pass through 
and throws away the rest. This is terrifying. It only works because most 
compiler flags are recognized because they're prefixed with `-f`, `-W`, etc.; 
but it's fascinating that it even works.

This may be testable; we've recently started adding tests for scan-build. But 
you'll have to see how `-nostdlibinc` is tested in general, and then see if you 
can combine the two tricks.

https://github.com/llvm/llvm-project/pull/88017
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Add handling of the `-nostdlibinc` option to ccc-analyzer (PR #88017)

2024-04-08 Thread Artem Dergachev via cfe-commits


https://github.com/haoNoQ approved this pull request.


https://github.com/llvm/llvm-project/pull/88017
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] `canReasonAbout` does not support `nonloc::LazyCompoundVal` (PR #87521)

2024-04-04 Thread Artem Dergachev via cfe-commits


haoNoQ wrote:

> Here's the traceback for the second set:

The constraint manager doesn't even exist yet at the moment of time represented 
by this backtrace. There's something else going on. I suspect that you're 
loading checkers as clang plugins and one of them is causing it.

https://github.com/llvm/llvm-project/pull/87521
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] `canReasonAbout` does not support `nonloc::LazyCompoundVal` (PR #87521)

2024-04-04 Thread Artem Dergachev via cfe-commits



@@ -0,0 +1,17 @@
+// REQUIRES: crash-recovery, asserts

haoNoQ wrote:

You probably don't need this. There's no need to disable a perfectly normal 
test just because it didn't crash before the patch in that configuration. It's 
still nice to know that it doesn't crash after the patch _regardless of_ 
configuration.

https://github.com/llvm/llvm-project/pull/87521
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [analyzer] Set and display CSA analysis entry points as notes on debugging (PR #84823)

2024-04-04 Thread Artem Dergachev via cfe-commits



@@ -788,7 +791,7 @@ class PathDiagnostic : public llvm::FoldingSetNode {
   PathDiagnostic(StringRef CheckerName, const Decl *DeclWithIssue,
  StringRef bugtype, StringRef verboseDesc, StringRef shortDesc,
  StringRef category, PathDiagnosticLocation LocationToUnique,
- const Decl *DeclToUnique,
+ const Decl *DeclToUnique, const Decl *AnalysisEntryPoint,

haoNoQ wrote:

I mean, this is a debug feature. Debug features are great but it's probably ok 
to have them slightly incorrect or incomplete if it means that they're no 
longer associated with increased complexity, with paying for something we don't 
use.

Out of those AST checkers, how many are emitting diagnostics outside of the 
entry function / decl-with-issue function? If you append
```
[debug] This bubble's decl name is 'foo()'
```
to _every_ message bubble, would that entirely cover all your needs?

I can easily see how there could be a checker that always warns outside of the 
entry function. If that's the case then it's probably incredibly useful for 
debugging to quickly know which entry function inspired the warning. But at the 
same time I'm not sure I can think of a good example when the same information 
wouldn't be useful for the *user* as well; maybe the users could benefit from a 
user-facing note too? (Somewhat similarly to how Clang explains template 
instantiation stack when diagnosing problems in individual template 
instantiations. (Something we cannot mimic in static analysis tools because the 
instantiation stack is already lost.))

https://github.com/llvm/llvm-project/pull/84823
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1066 matches

Mail list logo