[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-12-08 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:17
+
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+

Ok so this is what caused the revert 
(https://lab.llvm.org/buildbot#builders/121/builds/25877):
```lines=10
FAILED: lib/libclangSema.so.16git 
: && /usr/lib64/ccache/c++ -fPIC -fPIC -fno-semantic-interposition 
-fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra 
-Wno-unused-parameter -Wwrite-strings -Wcast-qual 
-Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough 
-Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type 
-Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation 
-fdiagnostics-color -ffunction-sections -fdata-sections -fno-common 
-Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG  -Wl,-z,defs 
-Wl,-z,nodelete   
-Wl,-rpath-link,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/./lib
  -Wl,--gc-sections -shared -Wl,-soname,libclangSema.so.16git -o 
lib/libclangSema.so.16git 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/AnalysisBasedWarnings.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/CodeCompleteConsumer.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/DeclSpec.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/DelayedDiagnostic.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/HLSLExternalSemaSource.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/IdentifierResolver.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/JumpDiagnostics.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/MultiplexExternalSemaSource.cpp.o
 tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/ParsedAttr.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/Scope.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/ScopeInfo.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/Sema.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaAccess.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaAttr.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaAvailability.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCXXScopeSpec.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCast.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaChecking.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCodeComplete.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaConcept.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaConsumer.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCoroutine.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaCUDA.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaDecl.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaDeclAttr.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaDeclCXX.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaDeclObjC.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExceptionSpec.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExprCXX.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExprMember.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExprObjC.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaFixItUtils.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaHLSL.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaInit.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaLambda.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaLookup.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaModule.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaObjCProperty.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaOpenMP.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaOverload.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaPseudoObject.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaRISCVVectorLookup.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaStmt.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaStmtAsm.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaStmtAttr.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaSYCL.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaTemplate.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaTemplateDeduction.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaTemplateInstantiate.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaTemplateInstantiateDecl.cpp.o
 tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaTemplateVariadic.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaType.cpp.o 
tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/TypeLocBuilder.cpp.o  
-Wl,-rpath,"\$ORIGIN/../lib"  lib/libclangAnalysis.so.16git  

[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-12-05 Thread Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG27ec85f8: [-Wunsafe-buffer-usage] Initial commit - 
Transition away from raw buffers. (authored by Artem Dergachev 
adergac...@apple.com).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

Files:
  clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Analysis/CMakeLists.txt
  clang/lib/Analysis/UnsafeBufferUsage.cpp
  clang/lib/Sema/AnalysisBasedWarnings.cpp
  clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp

Index: clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp
===
--- /dev/null
+++ clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -Wunsafe-buffer-usage -verify %s
+
+void testIncrement(char *p) {
+  ++p; // expected-warning{{unchecked operation on raw buffer in expression}}
+  p++; // expected-warning{{unchecked operation on raw buffer in expression}}
+  --p; // expected-warning{{unchecked operation on raw buffer in expression}}
+  p--; // expected-warning{{unchecked operation on raw buffer in expression}}
+}
Index: clang/lib/Sema/AnalysisBasedWarnings.cpp
===
--- clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -29,6 +29,7 @@
 #include "clang/Analysis/Analyses/ReachableCode.h"
 #include "clang/Analysis/Analyses/ThreadSafety.h"
 #include "clang/Analysis/Analyses/UninitializedValues.h"
+#include "clang/Analysis/Analyses/UnsafeBufferUsage.h"
 #include "clang/Analysis/AnalysisDeclContext.h"
 #include "clang/Analysis/CFG.h"
 #include "clang/Analysis/CFGStmtMap.h"
@@ -2138,6 +2139,23 @@
 } // namespace consumed
 } // namespace clang
 
+//===--===//
+// Unsafe buffer usage analysis.
+//===--===//
+
+class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
+  Sema 
+
+public:
+  UnsafeBufferUsageReporter(Sema ) : S(S) {}
+
+  void handleUnsafeOperation(const Stmt *Operation) override {
+S.Diag(Operation->getBeginLoc(), diag::warn_unsafe_buffer_usage)
+<< Operation->getSourceRange();
+  }
+};
+
+
 //===--===//
 // AnalysisBasedWarnings - Worker object used by Sema to execute analysis-based
 //  warnings on a function, method, or block.
@@ -2430,6 +2448,12 @@
   if (S.getLangOpts().CPlusPlus && isNoexcept(FD))
 checkThrowInNonThrowingFunc(S, FD, AC);
 
+  // Emit unsafe buffer usage warnings and fixits.
+  if (!Diags.isIgnored(diag::warn_unsafe_buffer_usage, D->getBeginLoc())) {
+UnsafeBufferUsageReporter R(S);
+checkUnsafeBufferUsage(D, R);
+  }
+
   // If none of the previous checks caused a CFG build, trigger one here
   // for the logical error handler.
   if (LogicalErrorHandler::hasActiveDiagnostics(Diags, D->getBeginLoc())) {
Index: clang/lib/Analysis/UnsafeBufferUsage.cpp
===
--- /dev/null
+++ clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -0,0 +1,79 @@
+//===- UnsafeBufferUsage.cpp - Replace pointers with modern C++ ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Analysis/Analyses/UnsafeBufferUsage.h"
+#include "llvm/ADT/SmallVector.h"
+
+using namespace llvm;
+using namespace clang;
+using namespace ast_matchers;
+
+namespace {
+// TODO: Better abstractions over gadgets.
+using GadgetList = std::vector;
+}
+
+// Scan the function and return a list of gadgets found with provided kits.
+static GadgetList findGadgets(const Decl *D) {
+
+  class GadgetFinderCallback : public MatchFinder::MatchCallback {
+GadgetList 
+
+  public:
+GadgetFinderCallback(GadgetList ) : Output(Output) {}
+
+void run(const MatchFinder::MatchResult ) override {
+  Output.push_back(Result.Nodes.getNodeAs("root_node"));
+}
+  };
+
+  GadgetList G;
+  MatchFinder M;
+
+  auto IncrementMatcher = unaryOperator(
+hasOperatorName("++"),
+hasUnaryOperand(hasType(pointerType()))
+  );
+  auto DecrementMatcher = unaryOperator(
+hasOperatorName("--"),
+hasUnaryOperand(hasType(pointerType()))
+  );
+
+  GadgetFinderCallback CB(G);
+
+  M.addMatcher(
+  stmt(forEachDescendant(
+stmt(
+  anyOf(
+IncrementMatcher,
+DecrementMatcher
+/* Fill 

[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-12-01 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Ok I added D136811  as a parent revision but 
I don't think it makes sense to land D136811  
before this patch, given that D136811 's 
documentation does not *yet* reflect the actual state of things. I think we 
should start landing patches, and land the documentation later when we think it 
actually makes sense to inform people about the warning and encourage enabling 
it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-12-01 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman accepted this revision.
aaron.ballman added a comment.

LGTM




Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:34
+// through the handler class.
+void checkUnsafeBufferUsage(const Decl *D, UnsafeBufferUsageHandler );
+

NoQ wrote:
> aaron.ballman wrote:
> > Do we need the interface to accept a non-const reference?
> It's same as asking whether the `handle...` methods should be const. Roughly 
> similar to whether `MatchFinder::MatchCallback::run()` should be const. I 
> don't see a reason why not, but also "Why say lot word when few word do 
> trick", could have been a lambda.
> 
> As an opposite example, static analyzer's `Checker::check...` callbacks 
> really needed to be const, because carrying mutable state in the checker is 
> often *tempting* but always *terrible*, in a way that's not immediately 
> obvious to beginners.
Hmmm, I may regret this later (I often do when suggesting laxity with `const`), 
but I suppose it's reasonable to leave the interface mutable. I usually prefer 
things to be `const` up front and then relax the restriction later once we have 
a need, but this is a case where relaxing that later would be about as viral as 
adding `const` later.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:11731-11732
+// Unsafe buffer usage diagnostics.
+def warn_unsafe_buffer_usage : Warning<"unchecked operation on raw buffer in 
expression">,
+  InGroup>, DefaultIgnore;
 } // end of sema component.

NoQ wrote:
> aaron.ballman wrote:
> > The diagnostic wording isn't wrong, but I am a bit worried about complex 
> > expressions. Consider something like `void func(a, b, c + d, e++, f());` 
> > -- if you got this warning on that line of code, how likely would you be to 
> > spot what caused the diagnostic? I think we need to be sure that issuing 
> > this warning *always* passes in an extra `SourceRange` for the part of the 
> > expression that's caused the issue so users will at least get some 
> > underlined squiggles to help them out.
> Yeah, I agree. Later we'll additionally specialize this warning to be more 
> specific than "expression" (eg. "pointer arithmetic", "array access", etc.).
> 
> Hmm, is there a way to write the .td file entry so that the source range 
> becomes mandatory?
> Hmm, is there a way to write the .td file entry so that the source range 
> becomes mandatory?

Not to my knowledge; we could maybe thread through enough information to assert 
if the diagnostic is called without a source range, but I'm not certain we 
could reasonably get a compile-time error if the range isn't provided.



Comment at: clang/lib/Sema/AnalysisBasedWarnings.cpp:2152-2154
+  void handleUnsafeOperation(const Stmt *Operation) override {
+S.Diag(Operation->getBeginLoc(), diag::warn_unsafe_buffer_usage);
+  }

NoQ wrote:
> aaron.ballman wrote:
> > I think this interface needs an additional `SourceRange` parameter that can 
> > be passed in as a streaming argument, along these lines. This way you'll 
> > get squiggles for the problematic part of the expression.
> `Operation->getSourceRange()` does the trick right? Like this:
> ```
> warning: unchecked operation on raw buffer in expression
> 
>   a[i];
>   
> ```
> I suspect that the Operation is forever going to be the exact thing we want 
> to highlight, there's virtually no reason for it to be anything else.
> 
> Or do you think it'd be better to do it like this?
> ```
> warning: unchecked operation on raw buffer in expression
> 
>   a[i];
>   ~
> ```
> Operation->getSourceRange() does the trick right?

Yup!

> Or do you think it'd be better to do it like this?

I prefer the first form where the whole operation is highlighted instead of 
just an operand of the operation. I think that'll make more sense for 
situations like:
```
a + i;
```
to highlight the whole expression instead of just `a`. It also helpfully means 
I don't have to ask for a test like: `i[a]` and see if we highlight the correct 
"raw buffer" operand. :-D


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-11-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:50
+  M.addMatcher(
+  stmt(forEachDescendant(
+stmt(

aaron.ballman wrote:
> Errr, this looks expensive to me, but maybe I'm forgetting (I can't recall if 
> it's ancestor or descendant that's more expensive, I think it's ancestor 
> though). @njames93 @LegalizeAdulthood @klimek @sammccall -- do you have 
> concerns here or know of a better approach?
I added some speculation to the other comment 
(https://reviews.llvm.org/D137346?id=472987#inline-1342220) - I think we're 
going to be fast and safe as long as this other thing gets implemented.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-11-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:29
+  /// Invoked when an unsafe operation over raw pointers is found.
+  virtual void handleUnsafeOperation(const Stmt *Operation) = 0;
+};

NoQ wrote:
> xazax.hun wrote:
> >  What is the purpose of the handler, would this add the fixit hints? In 
> > that case, is this the right abstraction level? E.g., do we want to group 
> > these by the declarations so the handler can rewrite the declaration and 
> > its uses at once? 
> You're absolutely right, we want to group fixits by declarations when fixits 
> are actually available.
> 
> But we still need to cover the case when fixits *aren't* available, and in 
> this case it doesn't make sense to group anything, so this interface isn't 
> going away.
> 
> And on top of that, there's layers to grouping. For instance, in this code
> ```
> void foo(int *p, size_t size) {
>   int *q = p;
>   for (size_t i = 0; i < size; ++i)
> q[i] = 0;
> }
> ```
> we'll need to attach the fix to the declaration `p` rather than `q`, as we 
> aim to provide a single fixit for these two variables, because otherwise we 
> end up with a low-quality fix that suppresses the warning but doesn't carry 
> the span all the way from the caller to the use site.
> 
> Then if we have two parameters that we want to fix simultaneously this way, 
> the fix will have to be inevitably grouped by *function* declaration.
I'll add comments in D138253 to clarify the separation of concerns more 
clearly, once the methods [between which concerns are separated] actually get 
defined.



Comment at: clang/lib/Sema/AnalysisBasedWarnings.cpp:2152-2154
+  void handleUnsafeOperation(const Stmt *Operation) override {
+S.Diag(Operation->getBeginLoc(), diag::warn_unsafe_buffer_usage);
+  }

aaron.ballman wrote:
> I think this interface needs an additional `SourceRange` parameter that can 
> be passed in as a streaming argument, along these lines. This way you'll get 
> squiggles for the problematic part of the expression.
`Operation->getSourceRange()` does the trick right? Like this:
```
warning: unchecked operation on raw buffer in expression

  a[i];
  
```
I suspect that the Operation is forever going to be the exact thing we want to 
highlight, there's virtually no reason for it to be anything else.

Or do you think it'd be better to do it like this?
```
warning: unchecked operation on raw buffer in expression

  a[i];
  ~
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-11-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ updated this revision to Diff 479134.
NoQ marked 4 inline comments as done.
NoQ added a comment.

Add range highlighting, unhardcode language mode.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

Files:
  clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/Analysis/CMakeLists.txt
  clang/lib/Analysis/UnsafeBufferUsage.cpp
  clang/lib/Sema/AnalysisBasedWarnings.cpp
  clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp

Index: clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp
===
--- /dev/null
+++ clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -Wunsafe-buffer-usage -verify %s
+
+void testIncrement(char *p) {
+  ++p; // expected-warning{{unchecked operation on raw buffer in expression}}
+  p++; // expected-warning{{unchecked operation on raw buffer in expression}}
+  --p; // expected-warning{{unchecked operation on raw buffer in expression}}
+  p--; // expected-warning{{unchecked operation on raw buffer in expression}}
+}
Index: clang/lib/Sema/AnalysisBasedWarnings.cpp
===
--- clang/lib/Sema/AnalysisBasedWarnings.cpp
+++ clang/lib/Sema/AnalysisBasedWarnings.cpp
@@ -29,6 +29,7 @@
 #include "clang/Analysis/Analyses/ReachableCode.h"
 #include "clang/Analysis/Analyses/ThreadSafety.h"
 #include "clang/Analysis/Analyses/UninitializedValues.h"
+#include "clang/Analysis/Analyses/UnsafeBufferUsage.h"
 #include "clang/Analysis/AnalysisDeclContext.h"
 #include "clang/Analysis/CFG.h"
 #include "clang/Analysis/CFGStmtMap.h"
@@ -2138,6 +2139,23 @@
 } // namespace consumed
 } // namespace clang
 
+//===--===//
+// Unsafe buffer usage analysis.
+//===--===//
+
+class UnsafeBufferUsageReporter : public UnsafeBufferUsageHandler {
+  Sema 
+
+public:
+  UnsafeBufferUsageReporter(Sema ) : S(S) {}
+
+  void handleUnsafeOperation(const Stmt *Operation) override {
+S.Diag(Operation->getBeginLoc(), diag::warn_unsafe_buffer_usage)
+<< Operation->getSourceRange();
+  }
+};
+
+
 //===--===//
 // AnalysisBasedWarnings - Worker object used by Sema to execute analysis-based
 //  warnings on a function, method, or block.
@@ -2430,6 +2448,12 @@
   if (S.getLangOpts().CPlusPlus && isNoexcept(FD))
 checkThrowInNonThrowingFunc(S, FD, AC);
 
+  // Emit unsafe buffer usage warnings and fixits.
+  if (!Diags.isIgnored(diag::warn_unsafe_buffer_usage, D->getBeginLoc())) {
+UnsafeBufferUsageReporter R(S);
+checkUnsafeBufferUsage(D, R);
+  }
+
   // If none of the previous checks caused a CFG build, trigger one here
   // for the logical error handler.
   if (LogicalErrorHandler::hasActiveDiagnostics(Diags, D->getBeginLoc())) {
Index: clang/lib/Analysis/UnsafeBufferUsage.cpp
===
--- /dev/null
+++ clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -0,0 +1,79 @@
+//===- UnsafeBufferUsage.cpp - Replace pointers with modern C++ ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Analysis/Analyses/UnsafeBufferUsage.h"
+#include "llvm/ADT/SmallVector.h"
+
+using namespace llvm;
+using namespace clang;
+using namespace ast_matchers;
+
+namespace {
+// TODO: Better abstractions over gadgets.
+using GadgetList = std::vector;
+}
+
+// Scan the function and return a list of gadgets found with provided kits.
+static GadgetList findGadgets(const Decl *D) {
+
+  class GadgetFinderCallback : public MatchFinder::MatchCallback {
+GadgetList 
+
+  public:
+GadgetFinderCallback(GadgetList ) : Output(Output) {}
+
+void run(const MatchFinder::MatchResult ) override {
+  Output.push_back(Result.Nodes.getNodeAs("root_node"));
+}
+  };
+
+  GadgetList G;
+  MatchFinder M;
+
+  auto IncrementMatcher = unaryOperator(
+hasOperatorName("++"),
+hasUnaryOperand(hasType(pointerType()))
+  );
+  auto DecrementMatcher = unaryOperator(
+hasOperatorName("--"),
+hasUnaryOperand(hasType(pointerType()))
+  );
+
+  GadgetFinderCallback CB(G);
+
+  M.addMatcher(
+  stmt(forEachDescendant(
+stmt(
+  anyOf(
+IncrementMatcher,
+DecrementMatcher
+/* Fill me in! */
+  )
+  // FIXME: Idiomatically there should be a forCallable(equalsNode(D))
+  // here, to make sure that the statement actually belongs to the
+  // function and not 

[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-11-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:34
+// through the handler class.
+void checkUnsafeBufferUsage(const Decl *D, UnsafeBufferUsageHandler );
+

aaron.ballman wrote:
> Do we need the interface to accept a non-const reference?
It's same as asking whether the `handle...` methods should be const. Roughly 
similar to whether `MatchFinder::MatchCallback::run()` should be const. I don't 
see a reason why not, but also "Why say lot word when few word do trick", could 
have been a lambda.

As an opposite example, static analyzer's `Checker::check...` callbacks really 
needed to be const, because carrying mutable state in the checker is often 
*tempting* but always *terrible*, in a way that's not immediately obvious to 
beginners.



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:11731-11732
+// Unsafe buffer usage diagnostics.
+def warn_unsafe_buffer_usage : Warning<"unchecked operation on raw buffer in 
expression">,
+  InGroup>, DefaultIgnore;
 } // end of sema component.

aaron.ballman wrote:
> The diagnostic wording isn't wrong, but I am a bit worried about complex 
> expressions. Consider something like `void func(a, b, c + d, e++, f());` -- 
> if you got this warning on that line of code, how likely would you be to spot 
> what caused the diagnostic? I think we need to be sure that issuing this 
> warning *always* passes in an extra `SourceRange` for the part of the 
> expression that's caused the issue so users will at least get some underlined 
> squiggles to help them out.
Yeah, I agree. Later we'll additionally specialize this warning to be more 
specific than "expression" (eg. "pointer arithmetic", "array access", etc.).

Hmm, is there a way to write the .td file entry so that the source range 
becomes mandatory?



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:57
+  )
+  // FIXME: Idiomatically there should be a forCallable(equalsNode(D))
+  // here, to make sure that the statement actually belongs to the

aaron.ballman wrote:
> Heh, this is the sort of thing I was worried about, especially when you 
> consider things like lambda bodies or class definitions with member functions 
> being declared in a function, etc.
Yes, so @ziqingluo-90 is trying to combat this problem in D138329. His solution 
is non-expensive (single-pass through the AST without any backreference 
lookups) and I hope it can be standardized into a general-purpose matcher so 
that we can get rid of the `forCallable()` idiom entirely, even in existing 
clang-tidy checks, to win some performance and make the code less convoluted.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: [-Wunsafe-buffer-usage] Initial commit - Transition away from raw buffer accesses.

2022-11-21 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added subscribers: njames93, sammccall, LegalizeAdulthood, klimek.
aaron.ballman added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:34
+// through the handler class.
+void checkUnsafeBufferUsage(const Decl *D, UnsafeBufferUsageHandler );
+

Do we need the interface to accept a non-const reference?



Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:11731-11732
+// Unsafe buffer usage diagnostics.
+def warn_unsafe_buffer_usage : Warning<"unchecked operation on raw buffer in 
expression">,
+  InGroup>, DefaultIgnore;
 } // end of sema component.

The diagnostic wording isn't wrong, but I am a bit worried about complex 
expressions. Consider something like `void func(a, b, c + d, e++, f());` -- 
if you got this warning on that line of code, how likely would you be to spot 
what caused the diagnostic? I think we need to be sure that issuing this 
warning *always* passes in an extra `SourceRange` for the part of the 
expression that's caused the issue so users will at least get some underlined 
squiggles to help them out.



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:50
+  M.addMatcher(
+  stmt(forEachDescendant(
+stmt(

Errr, this looks expensive to me, but maybe I'm forgetting (I can't recall if 
it's ancestor or descendant that's more expensive, I think it's ancestor 
though). @njames93 @LegalizeAdulthood @klimek @sammccall -- do you have 
concerns here or know of a better approach?



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:57
+  )
+  // FIXME: Idiomatically there should be a forCallable(equalsNode(D))
+  // here, to make sure that the statement actually belongs to the

Heh, this is the sort of thing I was worried about, especially when you 
consider things like lambda bodies or class definitions with member functions 
being declared in a function, etc.



Comment at: clang/lib/Sema/AnalysisBasedWarnings.cpp:2152-2154
+  void handleUnsafeOperation(const Stmt *Operation) override {
+S.Diag(Operation->getBeginLoc(), diag::warn_unsafe_buffer_usage);
+  }

I think this interface needs an additional `SourceRange` parameter that can be 
passed in as a streaming argument, along these lines. This way you'll get 
squiggles for the problematic part of the expression.



Comment at: clang/test/SemaCXX/warn-unsafe-buffer-usage.cpp:1
+// RUN: %clang_cc1 -std=c++20 -Wunsafe-buffer-usage -verify %s
+

No need to pin to a specific language mode.


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: -Wunsafe-buffer-usage: Initial commit - Transition away from raw buffer accesses.

2022-11-15 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:29
+  /// Invoked when an unsafe operation over raw pointers is found.
+  virtual void handleUnsafeOperation(const Stmt *Operation) = 0;
+};

xazax.hun wrote:
>  What is the purpose of the handler, would this add the fixit hints? In that 
> case, is this the right abstraction level? E.g., do we want to group these by 
> the declarations so the handler can rewrite the declaration and its uses at 
> once? 
You're absolutely right, we want to group fixits by declarations when fixits 
are actually available.

But we still need to cover the case when fixits *aren't* available, and in this 
case it doesn't make sense to group anything, so this interface isn't going 
away.

And on top of that, there's layers to grouping. For instance, in this code
```
void foo(int *p, size_t size) {
  int *q = p;
  for (size_t i = 0; i < size; ++i)
q[i] = 0;
}
```
we'll need to attach the fix to the declaration `p` rather than `q`, as we aim 
to provide a single fixit for these two variables, because otherwise we end up 
with a low-quality fix that suppresses the warning but doesn't carry the span 
all the way from the caller to the use site.

Then if we have two parameters that we want to fix simultaneously this way, the 
fix will have to be inevitably grouped by *function* declaration.



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:22
+// Scan the function and return a list of gadgets found with provided kits.
+static GadgetList findGadgets(const Decl *D) {
+

xazax.hun wrote:
> Is this not a FunctionDecl because of ObjCMethod? At some point I wonder if 
> we should have an abstraction that works for both. Would NamedDecl work?
Yupp.

Well, we do have https://clang.llvm.org/doxygen/classclang_1_1AnyCall.html But 
it's not like the code actually cares at this point. And once it does (eg. for 
the purposes of fixits), it'll have to consider them separately anyway.



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:35-36
+
+  GadgetList G;
+  MatchFinder M;
+

xazax.hun wrote:
> Move these close to their uses?
The code in between is actually instantly deleted by the follow-up patch :)

(D137348)


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: -Wunsafe-buffer-usage: Initial commit - Transition away from raw buffer accesses.

2022-11-15 Thread Gábor Horváth via Phabricator via cfe-commits
xazax.hun added inline comments.



Comment at: clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h:29
+  /// Invoked when an unsafe operation over raw pointers is found.
+  virtual void handleUnsafeOperation(const Stmt *Operation) = 0;
+};

 What is the purpose of the handler, would this add the fixit hints? In that 
case, is this the right abstraction level? E.g., do we want to group these by 
the declarations so the handler can rewrite the declaration and its uses at 
once? 



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:22
+// Scan the function and return a list of gadgets found with provided kits.
+static GadgetList findGadgets(const Decl *D) {
+

Is this not a FunctionDecl because of ObjCMethod? At some point I wonder if we 
should have an abstraction that works for both. Would NamedDecl work?



Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:35-36
+
+  GadgetList G;
+  MatchFinder M;
+

Move these close to their uses?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D137346: -Wunsafe-buffer-usage: Initial commit - Transition away from raw buffer accesses.

2022-11-15 Thread Jan Korous via Phabricator via cfe-commits
jkorous accepted this revision.
jkorous added a comment.
This revision is now accepted and ready to land.

LGTM.

@aaron.ballman Do you have any objection to us landing this?


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137346/new/

https://reviews.llvm.org/D137346

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits