[PATCH] D64087: [clang] Correct source locations for instantiations of out-of-line defaulted special member functions. (PR25683)

2023-11-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

This patch appears to introduce a bug in some source ranges.
Reported the regression at https://github.com/llvm/llvm-project/issues/71161.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64087/new/

https://reviews.llvm.org/D64087

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-09-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

It's good enough already, given you apply my last suggestion.
References would be nice to support for notes, but not a blocker.




Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:199-201
+  // FIXME: This way of getting base types does not support reference types.
+  QualType SourceType = CastE->getSubExpr()->getType()->getPointeeType();
+  QualType TargetType = CastE->getType()->getPointeeType();

Discookie wrote:
> steakhal wrote:
> > What is the problem with this?
> > I thought `getPointeeType()` works for ReferenceTypes.
> Apparently not, because references aren't ReferenceTypes but qualified Types. 
> I could add support for it in a future commit, but I'd think casting and 
> deleting array-references wrongly is even less common than deleting 
> array-pointers.
Nop, this is for tracking conversions. That might be common.
Such as in our codebase, we usually prefer references. However, in clang we 
have pointers all over the place, thus internally one might cast references, 
but delete or pass them back to a legacy API by taking the address of the 
reference.



Comment at: clang/test/Analysis/ArrayDelete.cpp:102-109
+Base *b = new DoubleDerived[10]; // expected-note{{Casting from 
'DoubleDerived' to 'Base' here}}
+Base &b2 = *b; // no-warning
+
+// FIXME: Displaying casts of reference types is not supported.
+Derived &d2 = static_cast(b2); // no-warning
+
+Derived *d = &d2; // no-warning




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159549: [analyzer] Fix false negative when accessing a nonnull property from a nullable object

2023-09-27 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

For new revisions, prefer creating a PR at GitHub.




Comment at: clang/lib/StaticAnalyzer/Checkers/NullabilityChecker.cpp:903-904
   if (!TrackedNullability &&
-  getNullabilityAnnotation(ReturnType) == Nullability::Nullable) {
+  getNullabilityAnnotation(Call.getOriginExpr()->getType()) ==
+  Nullability::Nullable) {
 State = State->set(Region, Nullability::Nullable);

I'm not sure if `getOriginExpr` is always non-null. How do you know?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159549/new/

https://reviews.llvm.org/D159549

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-09-27 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

The code looks good now.
Let's have one last round.

Do you have real-world reports?




Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:762-763
+def CXXArrayDeleteChecker : Checker<"ArrayDelete">,
+  HelpText<"Reports destructions of arrays of polymorphic objects that are"
+   "destructed as their base class.">,
+  Documentation;

I thin these strings are concatenated like in C, thus it's gonna have 
"aredestructed" joined.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:174-177
+  OS << "Deleting an array of `" << TargetType.getAsString()
+ << "` objects as their base class `"
+ << SourceType.getAsString(C.getASTContext().getPrintingPolicy())
+ << "` is undefined";

Prefer single quotes over backticks.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:199-201
+  // FIXME: This way of getting base types does not support reference types.
+  QualType SourceType = CastE->getSubExpr()->getType()->getPointeeType();
+  QualType TargetType = CastE->getType()->getPointeeType();

What is the problem with this?
I thought `getPointeeType()` works for ReferenceTypes.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:218-219
+
+  OS << "Casting from `" << SourceType.getAsString() << "` to `"
+ << TargetType.getAsString() << "` here";
+

We use single apostrophes for quoting names in CSA.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154603: [analyzer][clangsa] Add new option to alpha.security.cert.InvalidPtrChecker

2023-09-12 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

Thanks for the ping. I have some concerns inline.




Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:1000-1008
+  CheckerOptions<[
+CmdLineOptionget())
+CurrentChainEnd = PlaceInvalidationNote(
+EnvPtr, "call may invalidate the environment returned by getenv",
+CurrentChainEnd);

I'd prefer if we wouldn't put N separate NoteTags, but rather iterate over this 
set of regions inside the NoteTag.
That way here you don't need the loop and play with Pred nodes.



Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:219-225
+  if (GetEnvCall.matches(Call)) {
+const MemRegion *Region = Call.getReturnValue().getAsRegion();
+if (Region) {
+  State = State->add(Region);
+  C.addTransition(State);
+}
+  }





Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:116-117
+const NoteTag *Note =
+C.getNoteTag([Region, FunctionName, Message](PathSensitiveBugReport 
&BR,
+ llvm::raw_ostream &Out) {
+  if (!BR.isInteresting(Region))

gamesh411 wrote:
> steakhal wrote:
> > `FunctionName` and `Message` will dangle inside the NoteTag.
> Good catch, thanks! Fixed this with a lambda capture initializer.
On second thought, I'm wrong. It won't dangle, because the 
StringRef(FunctionName) is owned by the identifier of the function, and thus 
lives as long as the ASTContext.
But `Message` would dangle :D


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154603/new/

https://reviews.llvm.org/D154603

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-09-12 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/test/Analysis/ArrayDelete.cpp:85-88
+Derived *d3 = new DoubleDerived[10];
+Base *b3 = d3; // expected-note{{Conversion from derived to base happened 
here}}
+delete[] b3; // expected-warning{{Deleting an array of polymorphic objects 
is undefined}}
+// expected-note@-1{{Deleting an array of polymorphic objects is 
undefined}}

donat.nagy wrote:
> steakhal wrote:
> > Hmm, the static type of `d3` doesn't tell us much about how it refers to an 
> > object of type `DoubleDerived`.
> > To me, it would make sense to have multiple `Conversion from derived to 
> > base happened here`, even telling us what static type it converted to what 
> > other static type in the message.
> > And it should add a new visitor of the same kind tracking the castee.
> > 
> > ```
> > Derived *d3 = new DoubleDerived[10]; // note: `DoubleDerived` -> `Derived` 
> > here
> > Base *b3 = d3; // note: `Derived` -> `Base` here
> > delete[] b3; // warn: Deleting `Derived` objects as `Base` objects.
> > ```
> > WDYT @donat.nagy ?
> I agree that it would be good to be good to mention the class names in the 
> message.
Do you also agree that we should have all steps where such a conversion 
happened?
Notice the 2 `note:` markers in my example. @donat.nagy 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159109: [analyzer] CStringChecker buffer access checks should check the first bytes

2023-09-11 Thread Balázs Benics via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0954dc3fb921: [analyzer] CStringChecker buffer access checks 
should check the first bytes (authored by steakhal).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159109/new/

https://reviews.llvm.org/D159109

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -71,6 +71,7 @@
 int scanf(const char *restrict format, ...);
 void *malloc(size_t);
 void free(void *);
+void *memcpy(void *dest, const void *src, unsigned long n);
 
 //===--===
 // strlen()
@@ -1713,3 +1714,21 @@
   free(dataBuffer);
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void CWE124_Buffer_Underwrite__malloc_char_memcpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+
+  memcpy(data, source, 100*sizeof(char)); // expected-warning {{Memory copy 
function overflows the destination buffer}}
+  data[100-1] = '\0';
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -480,6 +480,14 @@
   if (!Filter.CheckCStringOutOfBounds)
 return State;
 
+  SVal BufStart =
+  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
+
+  // Check if the first byte of the buffer is accessible.
+  State = CheckLocation(C, State, Buffer, BufStart, Access, CK);
+  if (!State)
+return nullptr;
+
   // Get the access length and make sure it is known.
   // FIXME: This assumes the caller has already checked that the access length
   // is positive. And that it's unsigned.
@@ -496,8 +504,6 @@
   NonLoc LastOffset = Offset.castAs();
 
   // Check that the first buffer is sufficiently long.
-  SVal BufStart =
-  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
   if (std::optional BufLoc = BufStart.getAs()) {
 
 SVal BufEnd =


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -71,6 +71,7 @@
 int scanf(const char *restrict format, ...);
 void *malloc(size_t);
 void free(void *);
+void *memcpy(void *dest, const void *src, unsigned long n);
 
 //===--===
 // strlen()
@@ -1713,3 +1714,21 @@
   free(dataBuffer);
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void CWE124_Buffer_Underwrite__malloc_char_memcpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+
+  memcpy(data, source, 100*sizeof(char)); // expected-warning {{Memory copy function overflows the destination buffer}}
+  data[100-1] = '\0';
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -480,6 +480,14 @@
   if (!Filter.CheckCStringOutOfBounds)
 return State;
 
+  SVal BufStart =
+  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
+
+  // Check if the first byte of the buffer is accessible.
+  State = CheckLocation(C, State, Buffer, BufStart, Access, CK);
+  if (!State)
+return nullptr;
+
   // Get the access length and make sure it is known.
   // FIXME: This assumes the caller has already checked that the access length
   // is positive. And that it's unsigned.
@@ -496,8 +504,6 @@
   NonLoc LastOffset = Offset.castAs();
 
   // Check that the first buffer is sufficiently long.
-  SVal BufStart =
-  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
   if (std::optional BufLoc = BufStart.getAs()) {
 
 SVal BufEnd =
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159108: [analyzer] CStringChecker should check the first byte of the destination of strcpy, strncpy

2023-09-11 Thread Balázs Benics via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGc3a87ddad62a: [analyzer] CStringChecker should check the 
first byte of the destination of… (authored by steakhal).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159108/new/

https://reviews.llvm.org/D159108

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1667,3 +1667,49 @@
   strcpy(x, "12\0");
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrcpyDestinationWritableFirstByte(void) {
+  char dst[10];
+  char *p = dst - 8;
+  strcpy(p, "src"); // expected-warning {{String copy function overflows the 
destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_cpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char * data = dataBuffer - 8;
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strcpy(data, source); // expected-warning {{String copy function overflows 
the destination buffer}}
+  free(dataBuffer);
+}
+#endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrncpyDestinationWritableFirstByte(void) {
+  char source[100];
+  use_string(source); // escape
+  char buf[100];
+  char *p = buf - 8;
+  strncpy(p, source, 100-1); // expected-warning {{String copy function 
overflows the destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_ncpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == 0) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strncpy(data, source, 100-1); // expected-warning {{String copy function 
overflows the destination buffer}}
+  data[100-1] = '\0'; // null terminate
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -2009,6 +2009,11 @@
   SVal maxLastElement =
   svalBuilder.evalBinOpLN(state, BO_Add, *dstRegVal, *maxLastNL, 
ptrTy);
 
+  // Check if the first byte of the destination is writable.
+  state = CheckLocation(C, state, Dst, DstVal, AccessKind::write);
+  if (!state)
+return;
+  // Check if the last byte of the destination is writable.
   state = CheckLocation(C, state, Dst, maxLastElement, AccessKind::write);
   if (!state)
 return;
@@ -2021,6 +2026,11 @@
 
   // ...and we haven't checked the bound, we'll check the actual copy.
   if (!boundWarning) {
+// Check if the first byte of the destination is writable.
+state = CheckLocation(C, state, Dst, DstVal, AccessKind::write);
+if (!state)
+  return;
+// Check if the last byte of the destination is writable.
 state = CheckLocation(C, state, Dst, lastElement, AccessKind::write);
 if (!state)
   return;


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1667,3 +1667,49 @@
   strcpy(x, "12\0");
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrcpyDestinationWritableFirstByte(void) {
+  char dst[10];
+  char *p = dst - 8;
+  strcpy(p, "src"); // expected-warning {{String copy function overflows the destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_cpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char * data = dataBuffer - 8;
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strcpy(data, source); // expected-warning {{String copy function overflows the destination buffer}}
+  free(dataBuffer);
+}
+#endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrncpyDestinationWritableFirstByte(void) {
+  char source[100];
+  use_string(source); // escape
+  char buf[100];
+  char *p = buf - 8;
+  strncpy(p, source, 100-1); // expected-warning {{String copy function overflows the destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_ncpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == 0) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with

[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-09-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I have concerns mostly about the cast visitor.




Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:192
+
+  const Stmt *S = N->getStmtForDiagnostics();
+  if (!S)

Discookie wrote:
> steakhal wrote:
> > Aren't you actually interested in 
> > N->getLocation().getAs().getStmt()?
> > 
> > The diag stmt can be fuzzy, but the PP is exact.
> As far as I can tell, getStmtForDiagnostics() does exactly that, but with a 
> bit more edge case handling and a couple fallbacks.
If they do the same, and it does not depend on the mentioned fallbacks, I think 
we should use the Stmt of the programpoint to be explicit.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:194-201
   // Only interested in DerivedToBase implicit casts.
   // Explicit casts can have different CastKinds.
+  // FIXME: The checker currently matches all explicit casts,
+  // but only ones casting to a base class (or simular) should be matcherd.
   if (const auto *ImplCastE = dyn_cast(CastE)) {
 if (ImplCastE->getCastKind() != CK_DerivedToBase)
   return nullptr;

Have you considered `dyn_cast()` to make it work for both implicit 
and explicit casts?
BTW `simular` -> `similar`



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:220
  N->getLocationContext());
-  return std::make_shared(Pos, OS.str(), true);
+  return std::make_shared(Pos, OS.str(), 
/*addPosRange=*/true);
+}

We should assert that this visitor always adds a Note. In other words, that it 
must find the Stmt where the derived->base conversion happened. If ever that's 
not true, we have a bug.



Comment at: clang/test/Analysis/ArrayDelete.cpp:85-88
+Derived *d3 = new DoubleDerived[10];
+Base *b3 = d3; // expected-note{{Conversion from derived to base happened 
here}}
+delete[] b3; // expected-warning{{Deleting an array of polymorphic objects 
is undefined}}
+// expected-note@-1{{Deleting an array of polymorphic objects is 
undefined}}

Hmm, the static type of `d3` doesn't tell us much about how it refers to an 
object of type `DoubleDerived`.
To me, it would make sense to have multiple `Conversion from derived to base 
happened here`, even telling us what static type it converted to what other 
static type in the message.
And it should add a new visitor of the same kind tracking the castee.

```
Derived *d3 = new DoubleDerived[10]; // note: `DoubleDerived` -> `Derived` here
Base *b3 = d3; // note: `Derived` -> `Base` here
delete[] b3; // warn: Deleting `Derived` objects as `Base` objects.
```
WDYT @donat.nagy ?



Comment at: clang/test/Analysis/ArrayDelete.cpp:90-93
+Base *b4 = new DoubleDerived[10];
+Derived *d4 = reinterpret_cast(b4);
+DoubleDerived *dd4 = reinterpret_cast(d4);
+delete[] dd4; // no-warning

I think in such cases a `static_cast` should suffice; unless you intentionally 
wanted to test `reinterpret_cast` of course.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159479: [ASTImport]enhance statement comparing

2023-09-07 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Make sure the commit message and content complies with the guidelines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159479/new/

https://reviews.llvm.org/D159479

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158813: [analyzer] MPIChecker: MPI_Waitall should respect count arg

2023-09-05 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I don't have time this week, sorry.
Maybe others can take this over.
@donat.nagy could you please have a look?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158813/new/

https://reviews.llvm.org/D158813

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-09-05 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Add a test where we have multiple imp derived to base casts but only one leads 
to a bug. Do it for both ways, once that the first is the offending, and once 
the second.




Comment at: clang/docs/analyzer/checkers.rst:1793-1803
+.. code-block:: cpp
+
+ Base *create() {
+   Base *x = new Derived[10]; // note: conversion from derived to base
+  //   happened here
+   return x;
+ }

Discookie wrote:
> steakhal wrote:
> > Make sure in the example the `create` is related (e.g. called/used).
> > Also, refrain from using `sink` in the docs. It's usually used in the 
> > context of taint analysis.
> Changed the example - should I change the DeleteWithNonVirtualDtor example as 
> well? That has the same issues as you said here.
I have no opinion. You could.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:126
+
+  if (!DerivedClass->isDerivedFrom(BaseClass))
+return;

Is this transitive?

BTW inheritance can only be expressed if the class is a definition, right?
Thus passing this should imply has definition.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:143
+  R->markInteresting(BaseClassRegion);
+  R->addVisitor(std::make_unique());
+  C.emitReport(std::move(R));

I think addVisitor already takes a template param, and it will call make unique 
for you.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:192
+
+  const Stmt *S = N->getStmtForDiagnostics();
+  if (!S)

Aren't you actually interested in N->getLocation().getAs().getStmt()?

The diag stmt can be fuzzy, but the PP is exact.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:196
+
+  const auto *CastE = dyn_cast(S);
+  if (!CastE)

If you dyncast to ImplicitCastExpr, couldn't you have done it here?



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:202
+  // Explicit casts can have different CastKinds.
+  if (const auto *ImplCastE = dyn_cast(CastE)) {
+if (ImplCastE->getCastKind() != CK_DerivedToBase)

How do you know that that this castexpr corresponds to the region for which you 
report the bug? To mez this might be some unrelated castexpr.
I was expecting the offending memregion being passed to the visitor, that it 
can check against.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:217
+  // Stop traversal on this path.
+  Satisfied = true;
+

There are so many early returns going on. I feel, we could miss the program 
point where it should have been marked satisfied. After this point, the visitor 
will never or should never emit a note.



Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:224
+ N->getLocationContext());
+  return std::make_shared(Pos, OS.str(), true);
+}

Use named argument passing.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159397: [StaticAnalyzer] Use switch statement in MallocChecker::performKernelMalloc. NFC

2023-09-04 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

It doesn't look much of an improvement TBH, but I won't stand against it either.
LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159397/new/

https://reviews.llvm.org/D159397

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157385: [clang][CFG] Cleanup functions

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

In D157385#4634591 , @tbaeder wrote:

> @steakhal Double lifetime ends should be fixed now.

I haven't verified, but it should be good now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157385/new/

https://reviews.llvm.org/D157385

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157385: [clang][CFG] Cleanup functions

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

When I added `-analyzer-config cfg-lifetime=true` to 
`clang/test/Analysis/scopes-cfg-output.cpp`, suddenly duplicated lifetime ends 
entries appeared where we have `CleanupFunctions`.
My output is:

  void test_cleanup_functions()
   [B2 (ENTRY)]
 Succs (1): B1
  
   [B1]
 1: CFGScopeBegin(i)
 2: int i __attribute__((cleanup(cleanup_int)));
 3: CleanupFunction (cleanup_int)
 4: [B1.2] (Lifetime ends)
 5: [B1.2] (Lifetime ends)
 6: CFGScopeEnd(i)
 Preds (1): B2
 Succs (1): B0
  
   [B0 (EXIT)]
 Preds (1): B1
  
  void test_cleanup_functions2(int m)
   [B4 (ENTRY)]
 Succs (1): B3
  
   [B1]
 1: 10
 2: i
 3: [B1.2] = [B1.1]
 4: return;
 5: CleanupFunction (cleanup_int)
 6: [B3.2] (Lifetime ends)
 7: [B3.2] (Lifetime ends)
 8: CFGScopeEnd(i)
 Preds (1): B3
 Succs (1): B0
  
   [B2]
 1: return;
 2: CleanupFunction (cleanup_int)
 3: [B3.2] (Lifetime ends)
 4: [B3.2] (Lifetime ends)
 5: CFGScopeEnd(i)
 Preds (1): B3
 Succs (1): B0
  
   [B3]
 1: CFGScopeBegin(i)
 2: int i __attribute__((cleanup(cleanup_int)));
 3: m
 4: [B3.3] (ImplicitCastExpr, LValueToRValue, int)
 5: 1
 6: [B3.4] == [B3.5]
 T: if [B3.6]
 Preds (1): B4
 Succs (2): B2 B1
  
   [B0 (EXIT)]
 Preds (2): B1 B2
  
  void test()
   [B2 (ENTRY)]
 Succs (1): B1
  
   [B1]
 1: CFGScopeBegin(f)
 2:  (CXXConstructExpr, [B1.3], F)
 3: F f __attribute__((cleanup(cleanup_F)));
 4: CleanupFunction (cleanup_F)
 5: [B1.3].~F() (Implicit destructor)
 6: [B1.3] (Lifetime ends)
 7: CFGScopeEnd(f)
 Preds (1): B2
 Succs (1): B0
  
   [B0 (EXIT)]
 Preds (1): B1

Notice the `[B3.2] (Lifetime ends)` lines for example.

The order in which the Lifetime, Scope and Cleanup elements appear looks 
correct; my only concern is the duplicate Lifetime ends marker.

About the `noreturn` cleanup function, well, GCC says: `It is undefined what 
happens if cleanup_function does not return normally.` here 
,
 thus I'm not sure what to do in that case. GCC seems to optimize accordingly, 
but clang does not. See https://godbolt.org/z/z8s6bPPjv.

FYI Unfortunately, I don't have much experience with CFG either.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157385/new/

https://reviews.llvm.org/D157385

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal closed this revision.
steakhal added a comment.

Landed as 
https://github.com/llvm/llvm-project/commit/12559064e05a11e8418425de59d8745f0cfb1122


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158499: [analyzer] Compute FAM dynamic size

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

@danix800 FYI I think you used the wrong revision link in the commit. Maybe 
mark this revision as "abandoned" again, to reflect the actual status.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158499/new/

https://reviews.llvm.org/D158499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159105#4633899 , @steakhal wrote:

> The measurement is still running, and I'll post a few words about that once I 
> had a look; but anyway, I feel that this patch is controversial and needs 
> more discussion before it could land.
> I plan to abandon this and think about it. We can continue the discussion of 
> course.

Here are the results after applying the suppression heuristic for `getenv`, 
plus D159108  and D159109 
:

- `=X`: Given message appeared X times in baseline.
- `+X (+Y%)`: Patch added X new reports with this message, and that is Y 
percent increase compared to the baseline.
- `-X (-Y%)`: Removed X reports with this message, and that is Y percent 
decrease compared to the baseline.

  * Out of bound memory access (accessed memory precedes memory block)
=1221, +0 (+0%), -4 (-0%)
  * Out of bound memory access (access exceeds upper limit of memory block)
=10547, +7 (+0%), -5 (-0%)
  * Out of bound memory access (index is tainted)
=624, +71 (+11%), -1 (-0%)
  
  * Memory copy function overflows the destination buffer
=436, +45 (+10%), -1 (-0%)
  * Memory copy function accesses out-of-bound array element
=443, +19 (+4%), -1 (-0%)
  * Memory comparison function accesses out-of-bound array element
=102, +1 (+1%), -1 (-1%)
  * Memory set function overflows the destination buffer
=119, +10 (+8%), -0 (-0%)
  * String copy function overflows the destination buffer
 =42, +3 (+7%), -0 (-0%)
  
  * This was not the most recently acquired lock. Possible lock order reversal
=81, +0 (+0%), -1 (-1%)
  * Division by zero
=775, +1 (+0%), -0 (-0%)
  * Called C++ object pointer is null
=3719, +0 (+0%), -2 (-0%)
  * Potential leak of memory pointed to by
=1996, +0 (+0%), -2 (-0%)
  * Returned pointer value points outside the original object (potential buffer 
overflow)
=625, +2 (+0%), -0 (-0%)

I think the mentioned heuristic works quite well, and really decreased the 
number of FPs for `Out of bound memory access (index is tainted)`.
I still don't want to commit to this patch, so I'm abandoning this.

I don't plan to repeat the measurement for D159108 
 and D159109 
, as the numbers related to those messages 
look nice.
I had a look at the reports for those, and as expected it's barely impossible 
to make sense of them. But at least nothing dumb stood out.
If you are okay, I'll commit D159108  and 
D159109  later today. @donat.nagy


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159105#4633864 , @donat.nagy 
wrote:

> As I thought more about this commit I realized that I have two vague but 
> significant concerns. I'm sorry that I wasn't able to describe these before 
> you started to dig into the details of the heuristics.
>
> (1) I don't think that code like `int *ptr; scanf("%ld", &ptr);` should be 
> covered by an ArrayBound checker: this is not just a "be careful with the 
> value of the index" issue, but a raw pointer that's completely controlled by 
> an untrusted source. I'd try to cover this kind of bug with either the 
> StdLibrary checker or a small new checker [or perhaps just a clang-tidy 
> check] which would report pointers coming from `scanf` or other user input 
> even if it doesn't see a dereference of the pointer value.
>
> (2) Taintedness of a memory region (i.e. `isTainted(state, location)` where 
> `location` is presumably a `MemRegionVal`) is poorly defined and might mean 
> three different things:
> (a) **The 'begin' pointer value of the region is tainted** (controlled by an 
> untrusted source). This may be a "we're already dead" situation (e.g. the 
> pointer is an arbitrary value from `scanf`) or something completely harmless 
> (we are examining something like the second subscript operator in 
> `large_array[user_input].field[idx]` after verifying that the first subscript 
> with the tainted index is valid), but in either case, the reliability of the 
> 'begin' pointer should not affect the comparison between the region extent 
> and the index value.
> (b)** The extent of the region is tainted** (controlled by an untrusted 
> source, e.g. it's an user input string with an unknown length). In this case 
> it's reasonable to turn on the "paranoid" comparison mode -- however, this 
> situation should be recorded by marking the //extent symbol// of the region 
> as tainted instead of marking the region itself as tainted. I'd be happy to 
> see a commit that ensures that the extent symbol of user input strings is 
> tainted and its taintedness is handled in ArrayBoundsV2.
> (c)** The contents of the region are tainted** (controlled by an untrusted 
> source). Arguably this should be recorded by marking the //contents// as 
> tainted, but that's difficult to implement on a string/array, so it's a 
> somewhat reasonable shortcut to put the taint on the region itself. This kind 
> of taint is completely irrelevant for out of bounds memory access and should 
> not affect the comparisons.
>
> In summary, I'd say that this checker should not be affected by the 
> taintedness of the memory region itself (but it should continue to monitor 
> tainted indices, and if possible, it should be extended to handle tainted 
> extent values). I'm not completely opposed to merging this patch if you add 
> enough heuristics and workarounds to make it stable in practice, but I feel 
> that this is not the right way forward.

I see your point, makes sense and I'd agree.
Probably, by another checker/check, we could indeed sidestep the problem by 
handling these situations separately. I'll think about it.

To me, the whole boils down to null-terminated buffers, such as one returned by 
`getenv`.
There, we have an implicit dependency between the extent of the region and the 
position of the null-terminator inside; as such it affects what locations are 
allowed to be read.
Locations are usually checked by the array-bounds checker (V1 xor V2). 
Consequently, it's still not clear to me if this falls in/out of this checker's 
responsibility.
We don't have many APIs like `getenv`, but users could define custom rules that 
would cause similar problems, thus ATM only `getenv` suffers from these new FPs.

The fact that `isTainted()` could mean so many things, as you enumerated has 3 
different meanings, which is a problem on its own. The design decision was to 
make it deliberately vague to let users express their custom APIs more easily 
in their yaml files.

The measurement is still running, and I'll post a few words about that once I 
had a look; but anyway, I feel that this patch is controversial and needs more 
discussion before it could land.
I plan to abandon this and think about it. We can continue the discussion of 
course.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/test/Analysis/memory-model.cpp:167
+  clang_analyzer_dumpExtent(a);   // expected-warning {{0 S64b}}
+  clang_analyzer_dumpElementCount(a); // expected-warning {{5 S64b}}
+  clang_analyzer_dumpExtent(t);   // expected-warning {{0 S64b}}

donat.nagy wrote:
> steakhal wrote:
> > If the array has zero extent, how can is have any elements?
> It has five elements, each element is a 0-element array, total size is 5*0 = 
> 0.
Okay, makes sense. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-09-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159105#4631504 , @steakhal wrote:

> There are still a few FPs of the kind, where they iterate over the result of 
> `getenv` in a loop, and continuously checks the character against the zero 
> terminator.
> I refined the suppression heuristic as follows:
>
> - If the offset is zero, don't report taint issue. (as I suggested in the 
> previous heuristic)
> - If the offset is non-zero, calculate the offset for the previous element 
> and check if the value there is proven to be non-zero. If it cannot be zero, 
> don't report this taint issue.
>
> I'll check the results tomorrow.

There are still FPs. I'll refine the heuristic to accept any constraint.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/test/Analysis/memory-model.cpp:167
+  clang_analyzer_dumpExtent(a);   // expected-warning {{0 S64b}}
+  clang_analyzer_dumpElementCount(a); // expected-warning {{5 S64b}}
+  clang_analyzer_dumpExtent(t);   // expected-warning {{0 S64b}}

If the array has zero extent, how can is have any elements?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

There are still a few FPs of the kind, where they iterate over the result of 
`getenv` in a loop, and continuously checks the character against the zero 
terminator.
I refined the suppression heuristic as follows:

- If the offset is zero, don't report taint issue. (as I suggested in the 
previous heuristic)
- If the offset is non-zero, calculate the offset for the previous element and 
check if the value there is proven to be non-zero. If it cannot be zero, don't 
report this taint issue.

I'll check the results tomorrow.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159107#4631069 , @donat.nagy 
wrote:

> In D159107#4630764 , @steakhal 
> wrote:
>
>> In D159107#4630573 , @donat.nagy 
>> wrote:
>>
>>> I don't think that the `&arr[N]` issue is too serious: we can just 
>>> increment the array extent when the parent expression of the array 
>>> subscript operator is the unary operator `&`. If the past-the-end pointer 
>>> ends up dereferenced later, the current code is sufficient to report it as 
>>> a bug (as the checker monitors all dereferences).
>>
>> My instinct suggests that there is more behind the curtain.
>
> As a rough solution we can simply say that this checker ignores `&arr[idx]` 
> (because it's just a different way of writing `arr + idx`) and only checks 
> expressions where "real" dereference happens. This way the checker would 
> won't emit false positives on past-the-end pointers and still report every 
> out of bound //memory access// (including off-by-one errors).
>
> This can be refined by adding a check that which validates that in an 
> expression like `&arr[idx]` the index satisfies `0 <= idx && idx <= 
> array_size`. This is conceptually independent (but can use the same 
> implementation as the real memory access check) and would add some useful 
> reports and constraints without breaking anything.

It makes sense, but it would need to special case on the parent or child AST 
node, which isn't really clean. But I think we think the same on this subject.
(Note that the `&` is a parent of the `arr[N]`, where the checker would 
trigger, thus we would need to check the parent node to ignore such cases; and 
that is not really supported in the AST, and using the ParentMap for it is 
expensive. Remember, that this would happen for any SubscriptExpr as many times 
they are visited.)

> In D159107#4630764 , @steakhal 
> wrote:
>
>> For example, on an expression `&arr[N]` (of type  `int arr[10]`), we would 
>> constrain `N`. We could say that we allow the one before and one after 
>> elements, thus introduce a constraint: `N: [-1, 11]`. This `-1` later could 
>> participate in casts, and end up interpreted as a really large unsigned 
>> number. I should also think about how would we detect off-by-one errors, 
>> which is a common category.
>
> Pointer arithmetic is very restricted in the standard, e.g. 
> http://eel.is/c++draft/basic.compound says that
>
>> Every value of pointer type is one of the following:
>> (3.1) a pointer to an object or function (the pointer is said to point to 
>> the object or function), or
>> (3.2) a pointer past the end of an object ([expr.add]), or
>> (3.3) the null pointer value for that type, or
>> (3.4) an invalid pointer value.
>
> As this explicitly rules out before-first-element pointers and they are not 
> too common (I don't recall any code that used them), I don't think that they 
> deserve a special case (just report them if we see them).

Yes. They are really restricted, and AFAIK only past-the-end pointers are 
allowed, unlike before-the-first element. Such code might be written for 
reverse iterations.
But we know that hardware would work like that, and people might rely on it.

> In D159107#4630764 , @steakhal 
> wrote:
>
>> The thing is, that I only have this week, aka. 1.5 days before I leave for 
>> vacation. :D
>> In the future, (no ETA), we likely come back to the Juliet improvements.
>
> Of course, it's completely reasonable to focus on a limited set of changes 
> before the vacation.
>
> In the meantime, I'll probably work on one or more commits that would (1) 
> switch to using the concrete `check::PostStmt` callbacks instead of 
> `check::Location` (and `Bind`) like this commit (2) improve the diagnostics, 
> add more details to the messages. When will you return from the vacation? 
> Should I wait for you with the review of these changes (if I can write and 
> test them before your return)?

I don't think you should wait for me. I'll be off for a week though, but I 
might not have the bandwidth to do reviews even after that.
Post it once you are fine with it. Prove it "behaves well", and just wait for 
someone to review it. It might be noq, xazax or me. We will see.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159107/new/

https://reviews.llvm.org/D159107

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I'm thinking of a heuristic like this:

  bool IsAtOffsetZero = [ByteOffset] {
const auto *Int = ByteOffset.getAsInteger();
return Int && Int->isZero();
  }();
  // ...
  if (state_precedesLowerBound && state_withinLowerBound && !IsAtOffsetZero &&
  isTainted(state, location)) {
reportTaintOOB(checkerContext, state_precedesLowerBound, location);
return;
  }

It will definitely not work for all cases but should filter out the most 
annoying ones.

  int testTaintedPtr() {
// Read a pointer from a tainted source and dereference it.
char *app = getenv("APP_NAME");
if (!app)
  return -1;
if (app[0] == '\0') return 0; // no-warning: Allow manual checking for null 
terminator at offset zero.
if (app[1] == '\0') return 1; // expected-warning {{Out of bound memory 
access (index is tainted)}} Any other index than 0 is disallowed.
if (app[2] == '\0') return 2; // no-warning: The path already sunk.
return -1;
  }




Comment at: clang/test/Analysis/taint-generic.c:1133-1141
+void testTaintedLowerConstrainedIndex() {
+  int n;
+  scanf("%d", &n);
+  if (n >= 0) {
+// We only constained the lower end, and it's tainted => report.
+Buffer[n] = 1; // expected-warning {{Out of bound memory access (index is 
tainted)}}
+  }

donat.nagy wrote:
> Also add the "opposite" of this where you test `if (n < BUFSIZE)`.
Makes sense.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159107#4630573 , @donat.nagy 
wrote:

> I don't think that the `&arr[N]` issue is too serious: we can just increment 
> the array extent when the parent expression of the array subscript operator 
> is the unary operator `&`. If the past-the-end pointer ends up dereferenced 
> later, the current code is sufficient to report it as a bug (as the checker 
> monitors all dereferences).

My instinct suggests that there is more behind the curtain.
For example, on an expression `&arr[N]` (of type  `int arr[10]`), we would 
constrain `N`. We could say that we allow the one before and one after 
elements, thus introduce a constraint: `N: [-1, 11]`. This `-1` later could 
participate in casts, and end up interpreted as a really large unsigned number. 
I should also think about how would we detect off-by-one errors, which is a 
common category.

> I'd be happy to see (a slightly extended variant of) this commit merged, 
> because I could provide much better warning messages if I can access the 
> concrete subscript/dereference operations. Of course if you don't have time 
> to work on this I can put this up for review myself (probably after your 
> other commits are handled).

The thing is, that I only have this week, aka. 1.5 days before I leave for 
vacation. :D
In the future, (no ETA), we likely come back to the Juliet improvements.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159107/new/

https://reviews.llvm.org/D159107

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154838: [analyzer] Add check for null pointer passed to the %p of printf family

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

I'd suggest renaming the checker to `alpha.unix.NullFormatSpecifier`.
Maybe add tests where multiple format specifiers are null, and also one test 
where no variadic args are present.

I also tried to simplify your solution a bit, so that we don't need to specify 
manually which arguments are for the "variadic" part.
Check it out here: F28923128: simplified.patch 


We should think about a shorter diagnostic message because it seems 
unnecessarily long to me.
The rest looks good to me.

I could do a measurement to see how this would behave in the wild.




Comment at: 
clang/lib/StaticAnalyzer/Checkers/UnixAPIPortabilityMinorChecker.cpp:70-71
+  new BugType(this,
+  "Passing a null pointer to the pointer conversion "
+  "specifier of ",
+  categories::UnixAPI));

Is this an unfinished sentence? I guess you could remove the ` of ` suffix.



Comment at: 
clang/lib/StaticAnalyzer/Checkers/UnixAPIPortabilityMinorChecker.cpp:86
+  ExplodedNode *N =
+  C.generateNonFatalErrorNode(nullState ? nullState : C.getState());
+  if (!N)

You always pass this state.



Comment at: clang/test/Analysis/unix-api-portability-minor.cpp:5-13
+struct FILE_t;
+typedef struct FILE_t FILE;
+
+typedef __typeof(sizeof(int)) size_t;
+
+int printf(const char *, ... );
+int fprintf(FILE *, const char *, ...);





Comment at: clang/test/Analysis/unix-api-portability-minor.cpp:30-35
+  printf(format, NULL); // expected-warning{{The result of passing a null 
pointer to the pointer conversion specifier of the printf family of functions 
is implementation defined}}
+  printf(format, 1, NULL); // expected-warning{{The result of passing a null 
pointer to the pointer conversion specifier of the printf family of functions 
is implementation defined}}
+  printf(format, 1, NULL, 2); // expected-warning{{The result of passing a 
null pointer to the pointer conversion specifier of the printf family of 
functions is implementation defined}}
+  printf(format, NULL, NULL); // expected-warning{{The result of passing a 
null pointer to the pointer conversion specifier of the printf family of 
functions is implementation defined}}
+  printf(format, NULL, 1, NULL); // expected-warning{{The result of passing a 
null pointer to the pointer conversion specifier of the printf family of 
functions is implementation defined}}
+  printf(format, 0); // no-warning

This format makes the pattern more apparent.



Comment at: clang/test/Analysis/unix-api-portability-minor.cpp:42-43
+  printf(format, pointer1); // no-warning
+  // Pointer argument should not get constrained after the check.
+  *pointer1 = 777; // no-warning
+

If `printf` had assumed that `pointer1` must be non-null to make the 
postcondition of the function pass, the dereference would succeed afterward 
anyway.

To test if `printf` didn't assume this, just have this afterward to make this 
observable:
```lang=C++
if (pointer1) {
  *pointer1 = 777;
} else {
  // We should have a warning here, as 'printf' didn't assume that the 
specifier is non-null.
  *pointer1 = 888; // expect-warning {{deref}}
}
```
If the pointer was constrained, then the `else` branch should be proven 
dead-code, and we should not have a warning


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154838/new/

https://reviews.llvm.org/D154838

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal abandoned this revision.
steakhal added a comment.

In D159107#4627903 , @donat.nagy 
wrote:

> Good direction of development, this will be useful for providing better bug 
> reports (in addition to ensuring correct behavior some situations). 
> Note that it's also possible to dereference pointers with the operator `->`, 
> which is represented by `MemberExpr`s in the AST; we should probably handle 
> that as if it was a `UO_Deref`.

+1, +1

> There is also a small corner case that for an array `some_type arr[N]` it's 
> well-defined to form the past-the-end pointer as `&arr[N]` (instead of `arr + 
> N`) -- while any other use of `arr[N]` is undefined behavior. If this occurs 
> in practice, then we'll probably need some additional logic to handle it. 
> (Note that the `check::Location` implementation dodged this question, because 
> it didn't report anything when the program formed `&arr[N]`, but later 
> created a bug report when this pointer value was dereferenced.)

Ah, this disappoints me. You are right. This delayed mechanism definitely needs 
a more careful approach.

Given these concerns and the quality of the diagnostics, I don't think it's 
something easy to fix. I'll abandon this for now, to focus on the low-hanging 
patches to upstream.

Thanks for the thoughtful review! @donat.nagy




Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:34
+class ArrayBoundCheckerV2
+: public Checker,
+ check::PostStmt> {

donat.nagy wrote:
> Which testcase would break without the `check::Bind` callback? (Not action 
> needed, I'm just curious.)
Good question. TO my surprise, no tests would be broken if we removed this 
callback.
Nice catch.
I thought it would break the new test added in D159106, but it would still pass.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159107/new/

https://reviews.llvm.org/D159107

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159106: [analyzer] ArrayBoundCheckerV2 should listen to check::Bind as well

2023-08-31 Thread Balázs Benics via Phabricator via cfe-commits
steakhal planned changes to this revision.
steakhal added a comment.

This patch would cause FPs on this code:

  struct S {};
  void zero_size_array() {
S arr[0];
(void)arr;
  }

Being short on time, I'll just drop this commit from the stack and come back in 
a late future.
This might be related to D131501 .


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159106/new/

https://reviews.llvm.org/D159106

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-08-30 Thread Balázs Benics via Phabricator via cfe-commits
steakhal planned changes to this revision.
steakhal added a comment.

In D159105#4627883 , @donat.nagy 
wrote:

> In D159105#4627785 , @steakhal 
> wrote:
>
>> In D159105#4627724 , @donat.nagy 
>> wrote:
>>
>>> I fear that the overzealous handling of tainted data would produce too many 
>>> false positives in situations when code indexes strings that contain user 
>>> input and the SA engine cannot understand the logic that calculates the 
>>> indices. For example if I understand it correctly a function like
>>>
>>>   char f(int n) {
>>> char *var = getenv("FOO");
>>> return var[n];
>>>   }
>>>
>>> would be reported as an out of bounds memory access, because the region is 
>>> tainted and the index value is not known. This is especially problematic on 
>>> the underflow side (which is introduced by your commit), because I'd assume 
>>> that it's common to have numeric values (e.g. function arguments) where the 
>>> programmer knows that they're nonnegative, but the analyzer cannot deduce 
>>> this.
>>
>> To me, this is a TP. The envvar might not be present, and even if it's 
>> present, the relation of the string length of `var` to `n` is not expressed 
>> or checked; thus this deserves a report.
>> But I see your concern, however, I cannot think of valid counterexamples, 
>> aka. FPs off the top of my head. I'll come back once I see the real results.
>
> Hmm, you're right that my example is a TP as written; however I could imagine 
> similar FPs where we e.g. check that `n < strlen(var)` but don't check that 
> `n` is nonnegative (because that's the responsibility of the caller).
>
> Anyway, let's wait for the test results, they'll be enlightening. (You mostly 
> convinced me that your commit will work, but Murphy's laws are still 
> standing...)

I have the results now, I just need to evaluate it. It turns out to require 
some scripting, as we sooner detect out-of-bounds accesses (by not allowing to 
form an lvalue to such) the locations change.
Because of this changed location, I have to filter out the cases when the "same 
line" has a report "disappearing" and "appearing".
The sooner detection also means that I have more paths killed, thus "in the 
close proximity below the appearing issue some others disappear - if they are 
dominated".
I have to manually check these because that diff is sort of intentional; to see 
what's left and look for unintentional cases.

I can already confirm an issue though, similar to the one you showed, and 
similar variants:

  char *str = getenv(name);
  if (str && str[0])
return atoi(str);
  return 0;

This sort of code looks good, and I'll think about how to suppress them.
Probably, because of this, we have a huge relative increase for `Out of bound 
memory access (index is tainted)` diagnostics (+72%, +447).
The second biggest relative change was for `Out of bound memory access (access 
exceeds upper limit of memory block)` (+18%, +1862).
These are preliminary numbers, and I'll continue the validation.

>>> Also note that the error message "Out of bound memory access (index is 
>>> tainted)" becomes misleading after this patch -- but don't spend too much 
>>> time on resolving this, because right now I'm working on a complete rewrite 
>>> the message generation to replace the current spartan messages with useful 
>>> error reporting.
>>
>> I agree, but I'm not sure how much more readable something more accurate 
>> like "the location where this pointer points to potentially depends on 
>> untrusted tainted data". (Note that this would also work for tainted indexes 
>> as well).
>>
>> I must admit, that reporting and diagnostics were low priorities for this 
>> patch stack, so I'd focus on addressing logic concerns first for the whole 
>> stack and then come back to reporting to make it consistent across the stack.
>
> Improving reporting and diagnostics is also on my TODO list, so we should 
> coordinate progress in that area to avoid redundant work. If you wish to work 
> on it, then I'm happy to review; otherwise I'll do something with it soon 
> (perhaps after you merged these commits to avoid merge conflicts).

I don't plan to work on the checker.

I'll post the result of the evaluation, once I'm happy with the findings. This 
might need some iterations.

Thanks for the review!
And sorry that I haven't done this evaluation before I posted these for review. 
Usually, it takes more time for the community to review patches - but this was 
a pleasant surprise!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159106: [analyzer] ArrayBoundCheckerV2 should listen to check::Bind as well

2023-08-30 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:46
 
+  void impl(SVal Loc, bool isLoad, const Stmt *S, CheckerContext &C) const;
+

donat.nagy wrote:
> I'd call this function `performCheck` or something similar, but "`impl`" is 
> also fine especially if it's a traditional name (that I haven't encountered 
> yet in clang SA code).
Yes, I'll rename it. I was already thinking about it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159106/new/

https://reviews.llvm.org/D159106

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-08-30 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D159105#4627724 , @donat.nagy 
wrote:

> I was thinking about adding an improvement like this for the sake of 
> consistency, but I fear that this might cause a surprising amount of false 
> positives. (Did you test it on larger codebases?)

I'm just done backporting the first batch to our fork and scheduling a 
large-scale measurement.
But my gut instinct suggests there won't be too many new reports, as I believe 
"tainted" pointers shouldn't be really a thing, except for really exotic cases, 
like `scanf` a pointer - which is dubious at best in presence of ASLR.

> As far as I understand (correct me if I'm wrong), there are situations when 
> TaintChecker marks the memory region of e.g. a returned string as tainted to 
> mark that the //contents// of the region are unreliable. (There may be other 
> more exotic situations when even the //pointer value// or the //extent// 
> becomes unreliable e.g. your testcases where you `scanf` into a pointer.)

Taint can only bind to symbols, thus a memregion per-say cannot be tainted, 
only the "conjured address of it" or the "symbol bound to the 
pointee's/referred lvalue" can be tainted.
WDYM by "are unreliable" and "becomes unreliable"?

> I fear that the overzealous handling of tainted data would produce too many 
> false positives in situations when code indexes strings that contain user 
> input and the SA engine cannot understand the logic that calculates the 
> indices. For example if I understand it correctly a function like
>
>   char f(int n) {
> char *var = getenv("FOO");
> return var[n];
>   }
>
> would be reported as an out of bounds memory access, because the region is 
> tainted and the index value is not known. This is especially problematic on 
> the underflow side (which is introduced by your commit), because I'd assume 
> that it's common to have numeric values (e.g. function arguments) where the 
> programmer knows that they're nonnegative, but the analyzer cannot deduce 
> this.

To me, this is a TP. The envvar might not be present, and even if it's present, 
the relation of the string length of `var` to `n` is not expressed or checked; 
thus this deserves a report.
But I see your concern, however, I cannot think of valid counterexamples, aka. 
FPs off the top of my head. I'll come back once I see the real results.

> Also note that the error message "Out of bound memory access (index is 
> tainted)" becomes misleading after this patch -- but don't spend too much 
> time on resolving this, because right now I'm working on a complete rewrite 
> the message generation to replace the current spartan messages with useful 
> error reporting.

I agree, but I'm not sure how much more readable something more accurate like 
"the location where this pointer points to potentially depends on untrusted 
tainted data". (Note that this would also work for tainted indexes as well).

I must admit, that reporting and diagnostics were low priorities for this patch 
stack, so I'd focus on addressing logic concerns first for the whole stack and 
then come back to reporting to make it consistent across the stack.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159105/new/

https://reviews.llvm.org/D159105

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159163: [analyzer][NFC] Workaround miscompilation on recent MSVC

2023-08-30 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Please add the exact versions to the commit message of Visual Studio where it 
worked, and where it didn't, thus we applied this change so that it would work 
on all (both) Visual Studio versions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159163/new/

https://reviews.llvm.org/D159163

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159163: [analyzer][NFC] Workaround miscompilation on recent MSVC

2023-08-30 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Feel free to merge it. Thanks!
I'd be curious to see if this bug is tracked at Microsoft.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159163/new/

https://reviews.llvm.org/D159163

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D159109: [analyzer] CStringChecker buffer access checks should check the first bytes

2023-08-29 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, xazax.hun, donat.nagy, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

By not checking if the first byte of the buffer is accessible,
we missed some reports in the Juliet benchmark.

(Juliet CWE-124 Buffer Underwrite: memcpy, memmove)

Depends on D159108 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159109

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -71,6 +71,7 @@
 int scanf(const char *restrict format, ...);
 void *malloc(size_t);
 void free(void *);
+void *memcpy(void *dest, const void *src, unsigned long n);
 
 //===--===
 // strlen()
@@ -1713,3 +1714,21 @@
   free(dataBuffer);
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void CWE124_Buffer_Underwrite__malloc_char_memcpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+
+  memcpy(data, source, 100*sizeof(char)); // expected-warning {{Memory copy 
function overflows the destination buffer}}
+  data[100-1] = '\0';
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -480,6 +480,14 @@
   if (!Filter.CheckCStringOutOfBounds)
 return State;
 
+  SVal BufStart =
+  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
+
+  // Check if the first byte of the buffer is accessible.
+  State = CheckLocation(C, State, Buffer, BufStart, Access, CK);
+  if (!State)
+return nullptr;
+
   // Get the access length and make sure it is known.
   // FIXME: This assumes the caller has already checked that the access length
   // is positive. And that it's unsigned.
@@ -496,8 +504,6 @@
   NonLoc LastOffset = Offset.castAs();
 
   // Check that the first buffer is sufficiently long.
-  SVal BufStart =
-  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
   if (std::optional BufLoc = BufStart.getAs()) {
 
 SVal BufEnd =


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -71,6 +71,7 @@
 int scanf(const char *restrict format, ...);
 void *malloc(size_t);
 void free(void *);
+void *memcpy(void *dest, const void *src, unsigned long n);
 
 //===--===
 // strlen()
@@ -1713,3 +1714,21 @@
   free(dataBuffer);
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void CWE124_Buffer_Underwrite__malloc_char_memcpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+
+  memcpy(data, source, 100*sizeof(char)); // expected-warning {{Memory copy function overflows the destination buffer}}
+  data[100-1] = '\0';
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -480,6 +480,14 @@
   if (!Filter.CheckCStringOutOfBounds)
 return State;
 
+  SVal BufStart =
+  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
+
+  // Check if the first byte of the buffer is accessible.
+  State = CheckLocation(C, State, Buffer, BufStart, Access, CK);
+  if (!State)
+return nullptr;
+
   // Get the access length and make sure it is known.
   // FIXME: This assumes the caller has already checked that the access length
   // is positive. And that it's unsigned.
@@ -496,8 +504,6 @@
   NonLoc LastOffset = Offset.castAs();
 
   // Check that the first buffer is sufficiently long.
-  SVal BufStart =
-  svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
   if (std::optional BufLoc = BufStart.getAs()) {
 
 SVal BufEnd =
_

[PATCH] D159108: [analyzer] CStringChecker should check the first byte of the destination of strcpy, strncpy

2023-08-29 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, xazax.hun, donat.nagy, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

By not checking if the first byte of the destination of strcpy and
strncpy is writable, we missed some reports in the Juliet benchmark.

(Juliet CWE-124 Buffer Underwrite: strcpy, strncpy)

Depends on D159107 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159108

Files:
  clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
  clang/test/Analysis/string.c


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1667,3 +1667,49 @@
   strcpy(x, "12\0");
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrcpyDestinationWritableFirstByte(void) {
+  char dst[10];
+  char *p = dst - 8;
+  strcpy(p, "src"); // expected-warning {{String copy function overflows the 
destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_cpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char * data = dataBuffer - 8;
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strcpy(data, source); // expected-warning {{String copy function overflows 
the destination buffer}}
+  free(dataBuffer);
+}
+#endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrncpyDestinationWritableFirstByte(void) {
+  char source[100];
+  use_string(source); // escape
+  char buf[100];
+  char *p = buf - 8;
+  strncpy(p, source, 100-1); // expected-warning {{String copy function 
overflows the destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_ncpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == 0) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char *data = dataBuffer - 8;
+
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strncpy(data, source, 100-1); // expected-warning {{String copy function 
overflows the destination buffer}}
+  data[100-1] = '\0'; // null terminate
+  free(dataBuffer);
+}
+#endif
Index: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -2009,6 +2009,11 @@
   SVal maxLastElement =
   svalBuilder.evalBinOpLN(state, BO_Add, *dstRegVal, *maxLastNL, 
ptrTy);
 
+  // Check if the first byte of the destination is writable.
+  state = CheckLocation(C, state, Dst, DstVal, AccessKind::write);
+  if (!state)
+return;
+  // Check if the last byte of the destination is writable.
   state = CheckLocation(C, state, Dst, maxLastElement, AccessKind::write);
   if (!state)
 return;
@@ -2021,6 +2026,11 @@
 
   // ...and we haven't checked the bound, we'll check the actual copy.
   if (!boundWarning) {
+// Check if the first byte of the destination is writable.
+state = CheckLocation(C, state, Dst, DstVal, AccessKind::write);
+if (!state)
+  return;
+// Check if the last byte of the destination is writable.
 state = CheckLocation(C, state, Dst, lastElement, AccessKind::write);
 if (!state)
   return;


Index: clang/test/Analysis/string.c
===
--- clang/test/Analysis/string.c
+++ clang/test/Analysis/string.c
@@ -1667,3 +1667,49 @@
   strcpy(x, "12\0");
 }
 #endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrcpyDestinationWritableFirstByte(void) {
+  char dst[10];
+  char *p = dst - 8;
+  strcpy(p, "src"); // expected-warning {{String copy function overflows the destination buffer}}
+}
+
+void CWE124_Buffer_Underwrite__malloc_char_cpy() {
+  char * dataBuffer = (char *)malloc(100*sizeof(char));
+  if (dataBuffer == NULL) return;
+  memset(dataBuffer, 'A', 100-1);
+  dataBuffer[100-1] = '\0';
+  char * data = dataBuffer - 8;
+  char source[100];
+  memset(source, 'C', 100-1); // fill with 'C's
+  source[100-1] = '\0'; // null terminate
+  strcpy(data, source); // expected-warning {{String copy function overflows the destination buffer}}
+  free(dataBuffer);
+}
+#endif
+
+#ifndef SUPPRESS_OUT_OF_BOUND
+void testStrncpyDestinationWritableFirstByte(void) {
+  char source[100];
+  use_string(source); // escape
+  char buf[100];
+  char *p = buf - 8;
+  strncpy(p, source, 100-1); // expected-warning {{String copy function overflows t

[PATCH] D159107: [analyzer] ArrayBoundCheckerV2 should disallow forming lvalues to out-of-bounds locations

2023-08-29 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, xazax.hun, donat.nagy, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

After this patch, we no longer need the check::Location callback, as we
caught bugs earlier when forming bad references - aka. before loading or
binding anything to it.

(CWE-122 Heap Based Buffer Overflow: CWE-805-class-loop)

Depends on D159106 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159107

Files:
  clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
  clang/test/Analysis/out-of-bounds-new.cpp

Index: clang/test/Analysis/out-of-bounds-new.cpp
===
--- clang/test/Analysis/out-of-bounds-new.cpp
+++ clang/test/Analysis/out-of-bounds-new.cpp
@@ -176,3 +176,63 @@
   break;
   }
 }
+
+void test_memberwise_copy() {
+  ManyInts dst[1];
+  ManyInts src[2] = {5};
+
+  dst[0] = src[0]; // ok
+  dst[0].a = src[0].a; // ok
+
+  // Positive indexing:
+  switch (rng()) {
+default: break;
+case 0: {
+  auto &z = dst[1]; // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+}
+case 1: {
+  dst[1].a = src[1].a; // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+}
+case 2: {
+  auto &z = *(dst + 1); // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+}
+case 3: {
+  auto &z = *dst;
+  auto &zz = 1[&z]; // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+}
+case 4: {
+  dst[1] = src[1]; // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+}
+  }
+
+  // Negative indexing:
+  switch (rng()) {
+default: break;
+case 0: {
+  auto &z = dst[-1]; // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+}
+case 1: {
+  dst[-1].a = src[1].a; // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+}
+case 2: {
+  auto &z = *(dst - 1); // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+}
+case 3: {
+  auto &z = *dst;
+  auto &zz = (-1)[&z]; // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+}
+case 4: {
+  dst[-1] = src[1]; // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+}
+  }
+}
Index: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -30,7 +30,9 @@
 using namespace taint;
 
 namespace {
-class ArrayBoundCheckerV2 : public Checker {
+class ArrayBoundCheckerV2
+: public Checker,
+ check::PostStmt> {
   mutable std::unique_ptr BT;
   mutable std::unique_ptr TaintBT;
 
@@ -46,9 +48,9 @@
   void impl(SVal Loc, bool isLoad, const Stmt *S, CheckerContext &C) const;
 
 public:
-  void checkLocation(SVal l, bool isLoad, const Stmt *S,
- CheckerContext &C) const;
   void checkBind(SVal Loc, SVal, const Stmt *S, CheckerContext &) const;
+  void checkPostStmt(const ArraySubscriptExpr *E, CheckerContext &C) const;
+  void checkPostStmt(const UnaryOperator *E, CheckerContext &C) const;
 };
 
 // FIXME: Eventually replace RegionRawOffset with this class.
@@ -142,13 +144,6 @@
   return {nullptr, nullptr};
 }
 
-void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad,
-const Stmt* LoadS,
-CheckerContext &checkerContext) const {
-  if (isLoad)
-impl(location, isLoad, LoadS, checkerContext);
-}
-
 void ArrayBoundCheckerV2::checkBind(SVal Loc, SVal, const Stmt *S,
 CheckerContext &C) const {
   impl(Loc, /*isLoad=*/false, S, C);
@@ -375,6 +370,21 @@
   return std::nullopt;
 }
 
+void ArrayBoundCheckerV2::checkPostStmt(const ArraySubscriptExpr *E,
+CheckerContext &C) const {
+  // TODO: Specialize the diagnostic message.
+  // It might not be a "load" operation.
+  impl(C.getSVal(E), /*isLoad=*/true, E, C);
+}
+void ArrayBoundCheckerV2::checkPostStmt(const UnaryOperator *E,
+CheckerContext &C) const {
+  if (E->getOpcode() != UO_

[PATCH] D159106: [analyzer] ArrayBoundCheckerV2 should listen to check::Bind as well

2023-08-29 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, xazax.hun, donat.nagy, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Turns out check::Location is not always called when storing a value to a
location. See the details here:
https://discourse.llvm.org/t/checklocation-vs-checkbind-when-isload-false/72728

To catch the remaining cases when binding to a location, subscribe to the
check::Bind callback.

(CWE-122 Heap Based Buffer Overflow: CWE-805-class-loop)

Depends on D554351


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159106

Files:
  clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
  clang/test/Analysis/out-of-bounds-new.cpp


Index: clang/test/Analysis/out-of-bounds-new.cpp
===
--- clang/test/Analysis/out-of-bounds-new.cpp
+++ clang/test/Analysis/out-of-bounds-new.cpp
@@ -154,3 +154,25 @@
   unsigned *U = nullptr;
   U = new unsigned[m + n + 1];
 }
+
+
+int rng();
+struct ManyInts {
+int a, b, c, d, e, f, g;
+};
+void test_trivial_copy_move_is_checked_by_the_checker() {
+  ManyInts v;
+  ManyInts *p = &v;
+
+  switch (rng()) {
+case 0:
+  *p = ManyInts{3,2,1}; // ok
+  break;
+case -1:
+  *--p = ManyInts{3,2,1}; // expected-warning {{Out of bound memory access 
(accessed memory precedes memory block)}}
+  break;
+case 1:
+  *++p = ManyInts{3,2,1}; // expected-warning {{Out of bound memory access 
(access exceeds upper limit of memory block)}}
+  break;
+  }
+}
Index: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -30,8 +30,7 @@
 using namespace taint;
 
 namespace {
-class ArrayBoundCheckerV2 :
-public Checker {
+class ArrayBoundCheckerV2 : public Checker {
   mutable std::unique_ptr BT;
   mutable std::unique_ptr TaintBT;
 
@@ -44,9 +43,12 @@
 
   static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
 
+  void impl(SVal Loc, bool isLoad, const Stmt *S, CheckerContext &C) const;
+
 public:
   void checkLocation(SVal l, bool isLoad, const Stmt *S,
  CheckerContext &C) const;
+  void checkBind(SVal Loc, SVal, const Stmt *S, CheckerContext &) const;
 };
 
 // FIXME: Eventually replace RegionRawOffset with this class.
@@ -143,6 +145,17 @@
 void ArrayBoundCheckerV2::checkLocation(SVal location, bool isLoad,
 const Stmt* LoadS,
 CheckerContext &checkerContext) const {
+  if (isLoad)
+impl(location, isLoad, LoadS, checkerContext);
+}
+
+void ArrayBoundCheckerV2::checkBind(SVal Loc, SVal, const Stmt *S,
+CheckerContext &C) const {
+  impl(Loc, /*isLoad=*/false, S, C);
+}
+
+void ArrayBoundCheckerV2::impl(SVal location, bool isLoad, const Stmt *LoadS,
+   CheckerContext &checkerContext) const {
 
   // NOTE: Instead of using ProgramState::assumeInBound(), we are prototyping
   // some new logic here that reasons directly about memory region extents.


Index: clang/test/Analysis/out-of-bounds-new.cpp
===
--- clang/test/Analysis/out-of-bounds-new.cpp
+++ clang/test/Analysis/out-of-bounds-new.cpp
@@ -154,3 +154,25 @@
   unsigned *U = nullptr;
   U = new unsigned[m + n + 1];
 }
+
+
+int rng();
+struct ManyInts {
+int a, b, c, d, e, f, g;
+};
+void test_trivial_copy_move_is_checked_by_the_checker() {
+  ManyInts v;
+  ManyInts *p = &v;
+
+  switch (rng()) {
+case 0:
+  *p = ManyInts{3,2,1}; // ok
+  break;
+case -1:
+  *--p = ManyInts{3,2,1}; // expected-warning {{Out of bound memory access (accessed memory precedes memory block)}}
+  break;
+case 1:
+  *++p = ManyInts{3,2,1}; // expected-warning {{Out of bound memory access (access exceeds upper limit of memory block)}}
+  break;
+  }
+}
Index: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -30,8 +30,7 @@
 using namespace taint;
 
 namespace {
-class ArrayBoundCheckerV2 :
-public Checker {
+class ArrayBoundCheckerV2 : public Checker {
   mutable std::unique_ptr BT;
   mutable std::unique_ptr TaintBT;
 
@@ -44,9 +43,12 @@
 
   static bool isFromCtypeMacro(const Stmt *S, ASTContext &AC);
 
+  void impl(SVal Loc, bool isLoad, const Stmt *S, CheckerContext &C) const;
+
 public:
   void checkLocation(SVal l, bool 

[PATCH] D159105: [analyzer] ArrayBoundCheckerV2 should check the region for taint as well

2023-08-29 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, xazax.hun, donat.nagy, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Previously, we didn't report OOB accesses if the pointer itself was
tainted. This looks weird, but there is weird code out there, code like
inside the Juliet benchmark.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159105

Files:
  clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
  clang/test/Analysis/taint-generic.c

Index: clang/test/Analysis/taint-generic.c
===
--- clang/test/Analysis/taint-generic.c
+++ clang/test/Analysis/taint-generic.c
@@ -55,10 +55,19 @@
 // CHECK-INVALID-ARG-SAME:that expects an argument number for propagation
 // CHECK-INVALID-ARG-SAME:rules greater or equal to -1
 
+typedef typeof(sizeof(int)) size_t;
 typedef long long rsize_t;
 void clang_analyzer_isTainted_char(char);
 void clang_analyzer_isTainted_charp(char*);
 void clang_analyzer_isTainted_int(int);
+void clang_analyzer_isTainted_int_ptr(const int *);
+size_t clang_analyzer_getExtent(const void *);
+void clang_analyzer_dump_int(int);
+void clang_analyzer_dump_int_ptr(const int *);
+void clang_analyzer_dump_char(char);
+void clang_analyzer_dump_char_ptr(const char*);
+void clang_analyzer_value(char);
+void clang_analyzer_printState();
 
 int scanf(const char *restrict format, ...);
 char *gets(char *str);
@@ -1107,3 +1116,36 @@
   setproctitle_init(1, argv, 0); // expected-warning {{Untrusted data is passed to a user-defined sink}}
   setproctitle_init(1, real_argv, argv); // expected-warning {{Untrusted data is passed to a user-defined sink}}
 }
+
+void testTaintedPtr() {
+  // Read a pointer from a tainted source and dereference it.
+  int *ptr;
+  scanf("%ld", &ptr); // expected-warning {{format specifies type 'long *' but the argument has type 'int **'}}
+  *ptr = 44; // expected-warning {{Out of bound memory access (index is tainted)}}
+}
+
+void testTaintedUnconstrainedIndex() {
+  int n;
+  scanf("%d", &n);
+  Buffer[n] = 1; // expected-warning {{Out of bound memory access (index is tainted)}}
+}
+
+void testTaintedLowerConstrainedIndex() {
+  int n;
+  scanf("%d", &n);
+  if (n >= 0) {
+// We only constained the lower end, and it's tainted => report.
+Buffer[n] = 1; // expected-warning {{Out of bound memory access (index is tainted)}}
+  }
+}
+
+void testUnknownExtentWithTaintedIndex(void) {
+  extern void v;
+  int *p = (int *)&v;
+  int n;
+  scanf("%d", &n);
+
+  clang_analyzer_isTainted_int(n); // expected-warning {{YES}}
+  clang_analyzer_dump_int(clang_analyzer_getExtent(p)); // expected-warning {{Unknown}}
+  p[n] = 42; // expected-warning {{Out of bound memory access (index is tainted)}}
+}
Index: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
+++ clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp
@@ -193,6 +193,11 @@
   reportOOB(checkerContext, state_precedesLowerBound, OOB_Precedes);
   return;
 }
+if (state_precedesLowerBound && state_withinLowerBound &&
+isTainted(state, location)) {
+  reportTaintOOB(checkerContext, state_precedesLowerBound, location);
+  return;
+}
 
 if (state_withinLowerBound)
   state = state_withinLowerBound;
@@ -204,17 +209,16 @@
 auto [state_withinUpperBound, state_exceedsUpperBound] =
 compareValueToThreshold(state, ByteOffset, *KnownSize, svalBuilder);
 
-if (state_exceedsUpperBound) {
-  if (!state_withinUpperBound) {
-// We know that the index definitely exceeds the upper bound.
-reportOOB(checkerContext, state_exceedsUpperBound, OOB_Excedes);
-return;
-  }
-  if (isTainted(state, ByteOffset)) {
-// Both cases are possible, but the index is tainted, so report.
-reportTaintOOB(checkerContext, state_exceedsUpperBound, ByteOffset);
-return;
-  }
+if (state_exceedsUpperBound && !state_withinUpperBound) {
+  // We know that the index definitely exceeds the upper bound.
+  reportOOB(checkerContext, state_exceedsUpperBound, OOB_Excedes);
+  return;
+}
+
+if (state_withinUpperBound && state_exceedsUpperBound &&
+isTainted(state, location)) {
+  reportTaintOOB(checkerContext, state_exceedsUpperBound, location);
+  return;
 }
 
 if (state_withinUpperBound)
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158707#4621270 , @donat.nagy 
wrote:

> In D158707#4621135 , @steakhal 
> wrote:
>
>> Oh, so we would canonicalize towards a signed extent type. I see. I think 
>> I'm okay with that.
>> However, I think this needs a little bit of polishing and testing when the 
>> region does not point to the beginning of an array or object, but rather 
>> inside an object, or like this:
>>
>>   int nums[] = {1,2,3};
>>   char *p = (char*)&nums[1];
>>   
>>   clang_analyzer_dumpExtent(p);
>>   clang_analyzer_dumpElementCount(p);
>>   ++p;
>>   clang_analyzer_dumpExtent(p);
>>   clang_analyzer_dumpElementCount(p);
>>   ++p;
>>   int *q = (int*)p;
>>   clang_analyzer_dumpExtent(q);
>>   clang_analyzer_dumpElementCount(q);
>
> Unfortunately the analyzer engine does not distinguish pointer arithmetic and 
> element access, and we cannot handle these situations while that limitation 
> exists. For example, if we have an array `int nums[3];`, then the element 
> `nums[1]` and the incremented pointer `int *ptr = nums + 1;` are represented 
> by the same `ElementRegion` object; so we cannot simultaneously say that (1) 
> `nums[1]` is an int-sized memory region, and (2) it's valid to access the 
> elements of the array as `ptr[-1]`, `ptr[0]` and `ptr[1]`.
>
> The least bad heuristic is probably `RegionRawOffsetV2::computeOffset()` from 
> ArrayBoundCheckerV2, which strips away all the `ElementRegion` layers, acting 
> as if they are all coming from pointer arithmetic. This behavior is incorrect 
> if we want to query the extent/elementcount of a "real" ElementRegion (e.g. a 
> slice of a multidimensional array or `(char*)&nums[1]` in your example), but 
> mostly leads to false negatives. On the other hand, if we process a pointer 
> increment as if it were an element access, we can easily run into false 
> positive reports -- this is why I had to abandon D150446 
> .
>
> I'd suggest that we should ignore pointer arithmetic in this commit (or 
> create a testcase that documents the current insufficient behavior) and 
> remember this as yet another reason to do through rewrite of the memory 
> model. This is related to the plan 
> https://github.com/llvm/llvm-project/issues/42709 suggested by @NoQ, but 
> perhaps it would be enough to do a less drastic change in that direction.

I'm aware of this, and I didn't mean to ask to fix this here.
I just want to see how these ExprInspection APIs will be affected in these 
edge-cases, where previously it returned "unknown", and now will return 
something else.
It's useful to demonstrate this if we ever come back to this commit.
Pinning down some tests and having some FIXMEs there should bring enough 
visibility.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Oh, so we would canonicalize towards a signed extent type. I see. I think I'm 
okay with that.
However, I think this needs a little bit of polishing and testing when the 
region does not point to the beginning of an array or object, but rather inside 
an object, or like this:

  int nums[] = {1,2,3};
  char *p = (char*)&nums[1];
  
  clang_analyzer_dumpExtent(p);
  clang_analyzer_dumpElementCount(p);
  ++p;
  clang_analyzer_dumpExtent(p);
  clang_analyzer_dumpElementCount(p);
  ++p;
  int *q = (int*)p;
  clang_analyzer_dumpExtent(q);
  clang_analyzer_dumpElementCount(q);




Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:47-48
+ const MemRegion *MR) {
+  if (!MR)
+return UnknownVal();
+  MR = MR->StripCasts();

I'd prefer hoisting this check to the callsite.
Usually, we assume `MR` is non-null. This is already the case for a sibling 
API, `getDynamicExtent()`.
Let's keep these "overloads" behave the same.



Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:56-57
+  QualType Ty = TVR->getValueType();
+  if (const auto *ConstArrayTy =
+  dyn_cast_or_null(Ty->getAsArrayTypeUnsafe()))
+return SVB.makeIntVal(ConstArrayTy->getSize(), false);

Funnily enough, one must use the `ASTContext::getAsConstantArrayType()` to 
achieve this.
I'm not sure why, but I was bitten by this.
It says something like this at the ASTContext:
```
  /// Type Query functions.  If the type is an instance of the specified class,
  /// return the Type pointer for the underlying maximally pretty type.  This
  /// is a member of ASTContext because this may need to do some amount of
  /// canonicalization, e.g. to move type qualifiers into the element type.
  const ArrayType *getAsArrayType(QualType T) const;
  const ConstantArrayType *getAsConstantArrayType(QualType T) const {
return dyn_cast_or_null(getAsArrayType(T));
  }
```



Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:58
+  dyn_cast_or_null(Ty->getAsArrayTypeUnsafe()))
+return SVB.makeIntVal(ConstArrayTy->getSize(), false);
+

For bool literal arguments we usually use "named parameter passing", aka. 
`/*paramname=*/false` constructs.



Comment at: clang/lib/StaticAnalyzer/Core/DynamicExtent.cpp:66-75
+  assert(!ElementSize.isZeroConstant() && "Non-zero element size expected");
+  if (Size.isUndef())
+return UnknownVal();
+
+  SValBuilder &SVB = State->getStateManager().getSValBuilder();
+
+  auto ElementCount =

If `ElementSize` would be some symbol, constrained to null, we would pass the 
assertion, but still end up modeling a division by zero, resulting in 
`Undefined`, which then turned into `Unknown` - I see.

Looking at this, the assertion seems misleading as it won't protect the 
division.
That said, the early return on undef could be also dropped.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158968: [analyzer] Fix assertion on casting SVal to NonLoc inside the IteratorRange checker

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158968#4620721 , @donat.nagy 
wrote:

> LGTM. I'm not familiar with the Iterator checker family, but this is a very 
> straightforward change.

Thanks for the review!
Well, it's an Ericsson checker ;)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158968/new/

https://reviews.llvm.org/D158968

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158968: [analyzer] Fix assertion on casting SVal to NonLoc inside the IteratorRange checker

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG985e399647d5: [analyzer] Fix assertion on casting SVal to 
NonLoc inside the IteratorRange… (authored by steakhal).

Changed prior to commit:
  https://reviews.llvm.org/D158968?vs=553858&id=553879#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158968/new/

https://reviews.llvm.org/D158968

Files:
  clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
  clang/test/Analysis/iterator-range.cpp


Index: clang/test/Analysis/iterator-range.cpp
===
--- clang/test/Analysis/iterator-range.cpp
+++ clang/test/Analysis/iterator-range.cpp
@@ -946,3 +946,14 @@
   // expected-warning@-1 {{The right operand of '-' is a garbage value}}
   // expected-note@-2 {{The right operand of '-' is a garbage value}}
 }
+
+namespace std {
+namespace ranges {
+  template 
+  InOutIter next(InOutIter, Sentinel);
+} // namespace ranges
+} // namespace std
+
+void gh65009__no_crash_on_ranges_next(int **begin, int **end) {
+  (void)std::ranges::next(begin, end); // no-crash
+}
Index: clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
@@ -228,7 +228,7 @@
 Value = State->getRawSVal(*ValAsLoc);
   }
 
-  if (Value.isUnknownOrUndef())
+  if (Value.isUnknownOrUndef() || !isa(Value))
 return;
 
   // Incremention or decremention by 0 is never a bug.


Index: clang/test/Analysis/iterator-range.cpp
===
--- clang/test/Analysis/iterator-range.cpp
+++ clang/test/Analysis/iterator-range.cpp
@@ -946,3 +946,14 @@
   // expected-warning@-1 {{The right operand of '-' is a garbage value}}
   // expected-note@-2 {{The right operand of '-' is a garbage value}}
 }
+
+namespace std {
+namespace ranges {
+  template 
+  InOutIter next(InOutIter, Sentinel);
+} // namespace ranges
+} // namespace std
+
+void gh65009__no_crash_on_ranges_next(int **begin, int **end) {
+  (void)std::ranges::next(begin, end); // no-crash
+}
Index: clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
@@ -228,7 +228,7 @@
 Value = State->getRawSVal(*ValAsLoc);
   }
 
-  if (Value.isUnknownOrUndef())
+  if (Value.isUnknownOrUndef() || !isa(Value))
 return;
 
   // Incremention or decremention by 0 is never a bug.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158855: [analyzer][NFC] Remove useless class BuiltinBug

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

I haven't checked the details, but it makes sense.
It's reassuring that all the tests pass :D, which is good enough for me.

Thanks for the cleanup!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158855/new/

https://reviews.llvm.org/D158855

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158968: [analyzer] Fix assertion on casting SVal to NonLoc inside the IteratorRange checker

2023-08-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, donat.nagy, xazax.hun, Szelethus.
Herald added subscribers: manas, ASDenysPetrov, martong, dkrupp, 
mikhail.ramalho, a.sidorin, szepet, baloghadamsoftware.
Herald added a project: All.
steakhal requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The checker assumed that it could safely cast an SVal to Nonloc.
This surfaced because, with std::ranges, we can unintentionally match
on other APIs as well, thus increasing the likelihood of violating
checker assumptions about the context it's invoked.

See the discourse post on CallDescriptions and std::ranges here.
https://discourse.llvm.org/t/calldescriptions-should-not-skip-the-ranges-part-in-std-names-when-matching/73076

Fixes https://github.com/llvm/llvm-project/issues/65009


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D158968

Files:
  clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
  clang/test/Analysis/iterator-range.cpp


Index: clang/test/Analysis/iterator-range.cpp
===
--- clang/test/Analysis/iterator-range.cpp
+++ clang/test/Analysis/iterator-range.cpp
@@ -946,3 +946,14 @@
   // expected-warning@-1 {{The right operand of '-' is a garbage value}}
   // expected-note@-2 {{The right operand of '-' is a garbage value}}
 }
+
+namespace std {
+namespace ranges {
+  template 
+  InOutIter next(InOutIter, Sentinel);
+} // namespace ranges
+} // namespace std
+
+void gh65009__no_crash_on_ranges_next(int **begin, int **end) {
+  (void)std::ranges::next(begin, end); // no-crash
+}
Index: clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
@@ -228,7 +228,7 @@
 Value = State->getRawSVal(*ValAsLoc);
   }
 
-  if (Value.isUnknownOrUndef())
+  if (Value.isUnknownOrUndef() || !isa(Value))
 return;
 
   // Incremention or decremention by 0 is never a bug.


Index: clang/test/Analysis/iterator-range.cpp
===
--- clang/test/Analysis/iterator-range.cpp
+++ clang/test/Analysis/iterator-range.cpp
@@ -946,3 +946,14 @@
   // expected-warning@-1 {{The right operand of '-' is a garbage value}}
   // expected-note@-2 {{The right operand of '-' is a garbage value}}
 }
+
+namespace std {
+namespace ranges {
+  template 
+  InOutIter next(InOutIter, Sentinel);
+} // namespace ranges
+} // namespace std
+
+void gh65009__no_crash_on_ranges_next(int **begin, int **end) {
+  (void)std::ranges::next(begin, end); // no-crash
+}
Index: clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/IteratorRangeChecker.cpp
@@ -228,7 +228,7 @@
 Value = State->getRawSVal(*ValAsLoc);
   }
 
-  if (Value.isUnknownOrUndef())
+  if (Value.isUnknownOrUndef() || !isa(Value))
 return;
 
   // Incremention or decremention by 0 is never a bug.
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158953: [analyzer] MmapWriteExecChecker: use getAs instead of castAs

2023-08-27 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Land it, whenever you wish. Thanks!
PS: usually we don't have a comment like that. The name of the function should 
give enough context, along with a // no-crash comment at the line where it was 
triggered in the past.
That usually just helps reviewers to understand the issue, but as I triaged 
this I'm aware what's going on.
In tricky situations.
I just let you know about these conventions, no need to adjust this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158953/new/

https://reviews.llvm.org/D158953

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158953: [analyzer] MmapWriteExecChecker: use getAs instead of castAs

2023-08-27 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Could you please simplify the test case?
You could basically get rid of everything there, except for forwarding one top 
level parameter as the proto argument to the mmap call.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158953/new/

https://reviews.llvm.org/D158953

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158858: [analyzer] MPIChecker: add defensive checking (check against nullptr)

2023-08-25 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Thanks. Feel free to merge this (before any of the others).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158858/new/

https://reviews.llvm.org/D158858

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158858: [analyzer] MPIChecker: add defensive checking (check against nullptr)

2023-08-25 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Makes sense.
Would thus test crash without the early returns? And now they wouldn't?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158858/new/

https://reviews.llvm.org/D158858

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D140828: [C++] Implement "Deducing this" (P0847R7)

2023-08-25 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I checked out the code to see how does the Static Analyzer work after this.
I'm impressed that it seems to work.
Do you mind adding my test file to this patch?
`clang/test/Analysis/cxx2b-deducing-this.cpp`:

  // RUN: %clang_analyze_cc1 -std=c++2b -verify %s \
  // RUN:   -analyzer-checker=core,debug.ExprInspection
  
  template  void clang_analyzer_dump(T);
  
  struct S {
int num;
S *orig;
  
void a(this auto Self) {
  clang_analyzer_dump(&Self); // expected-warning {{&Self}}
  clang_analyzer_dump(Self.orig); // expected-warning {{&s}}
  clang_analyzer_dump(Self.num);   // expected-warning {{5 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{5 S32b}}
  
  Self.num = 1;
  clang_analyzer_dump(Self.num);   // expected-warning {{1 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{5 S32b}}
}
  
void b(this auto& Self) {
  clang_analyzer_dump(&Self); // expected-warning {{&s}}
  clang_analyzer_dump(Self.orig); // expected-warning {{&s}}
  clang_analyzer_dump(Self.num);   // expected-warning {{5 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{5 S32b}}
  
  Self.num = 2;
  clang_analyzer_dump(Self.num);   // expected-warning {{2 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{2 S32b}}
}
  
void c(this S Self) {
  clang_analyzer_dump(&Self); // expected-warning {{&Self}}
  clang_analyzer_dump(Self.orig); // expected-warning {{&s}}
  clang_analyzer_dump(Self.num);   // expected-warning {{2 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{2 S32b}}
  
  Self.num = 3;
  clang_analyzer_dump(Self.num);   // expected-warning {{3 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{2 S32b}}
}
  
void c(this S Self, int I) {
  clang_analyzer_dump(I); // expected-warning {{11 S32b}}
  clang_analyzer_dump(&Self); // expected-warning {{&Self}}
  clang_analyzer_dump(Self.orig); // expected-warning {{&s}}
  clang_analyzer_dump(Self.num);   // expected-warning {{2 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{2 S32b}}
  
  Self.num = 4;
  clang_analyzer_dump(Self.num);   // expected-warning {{4 S32b}}
  clang_analyzer_dump(Self.orig->num); // expected-warning {{2 S32b}}
}
  };
  
  void top() {
S s = {/*num=*/5, /*orig=*/&s};
s.a();
s.b(); // This call changes 's.num' to 2.
s.c();
s.c(11);
  }

Thank you for implementing (deducing) this!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140828/new/

https://reviews.llvm.org/D140828

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158813: [analyzer] MPIChecker: MPI_Waitall should respect count arg

2023-08-25 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Let me try to summarize some of the variables:

So, given an invocation like `MPI_Waitall(C, Requests, Statues)`:

- `MR` is `lval(Requests)`
- `ElementCount` is the number of elements `Requests` consist of.
- `ArrSize` is basically `ElementCount` but an APSInt.
- `MROffset` is the offset and allocated region of the `Requests` region.
- `Index` is ConcreteInt(0, ..., ArrSize)
- `Count` is `C`, which is likely a ConcreteInt.

Please note that I have no idea what this checker does.




Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:46-48
+if (!ErrorNode)
+  return;
+

If these hunks are not closely related to the issue you intend to fix in this 
PR, I'd suggesst submitting it separately. That is a common API, and we 
wouldn't need tests for such defensive checks, as they rarely trigger.



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:185-188
+CharUnits ElemSizeInChars = ASTCtx.getTypeSizeInChars(ElemType);
+int64_t ElemSizeInBits =
+(ElemSizeInChars.isZero() ? 1 : ElemSizeInChars.getQuantity()) *
+ASTCtx.getCharWidth();

How can `ElemSizeInChar` be zero?
If it can be zero, could you demonstrate it by a test?



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:189-190
+ASTCtx.getCharWidth();
+const NonLoc MROffset =
+SVB.makeArrayIndex(MR->getAsOffset().getOffset() / ElemSizeInBits);
 

What implies that `MR->getAsOffset()` succeeds, and also what implies that the 
`Offset` within that is not symbolic?
Also, how can you use this without also using the result region?
Without using that you don't know what is the anchor point, from which this 
offset represent anything.
ATM I believe the code assumes that `MR->getRegion()` equals to `SuperRegion`, 
which might not be always the case.
This could materialize a problem when you construct the element region later.



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:192-198
+SVal Count = CE.getArgSVal(0);
+for (size_t i = 0; i < ArrSize; ++i) {
+  const NonLoc Idx = Ctx.getSValBuilder().makeArrayIndex(i);
+  auto CountReached = SVB.evalBinOp(State, BO_GE, Idx, Count, 
ASTCtx.BoolTy)
+  .getAs();
+  if (CountReached && State->assume(*CountReached, true))
+break;

It's suspicious to me to compare stuff in the symbolic domain, such as `Idx >= 
Count` here.
And let me elaborate on why I think so.

I'd assume that the first argument of `MPI_Waitall` is usually just a number 
literal, thus it's gonna be a ConcreteInt, so now we have `Count`.
The `Index` you just created, obviously is just a ConcreteInt.
After this, we could simply unwrap them and do the comparison without doing it 
in the symbolic domain.

---

As a sidenote, `State->assume(*CountReached, true)` would return a state in two 
cases:
 1) We can prove that the assumption holds.
 2) We cannot disprove the assumption, thus it might hold, so we assume it 
holds.
This is probably not what you wanted in the first place, but it's probably 
irrelelevant anyway.



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:195
+  const NonLoc Idx = Ctx.getSValBuilder().makeArrayIndex(i);
+  auto CountReached = SVB.evalBinOp(State, BO_GE, Idx, Count, 
ASTCtx.BoolTy)
+  .getAs();

In C for example, we might not have a Boot type AFAIK.
Usually, we use `SVB.getConditionType()` in such cases.



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:200-202
+  auto ElementRegionIndex =
+  SVB.evalBinOp(State, BO_Add, Idx, MROffset, SVB.getArrayIndexType())
+  .getAs();

My guess is that we only care about if `MROffset` is a concrete int, thus we 
could also just do this assidion in regular c++ and just transfer the result 
into the symbolic domain. 



Comment at: clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPIChecker.cpp:205
+const ElementRegion *const ER = RegionManager.getElementRegion(
+ElemType, *ElementRegionIndex, SuperRegion, Ctx.getASTContext());
+ReqRegions.push_back(ER);

I believe you have a variable `ASTCtx` that you could use.



Comment at: clang/test/Analysis/mpichecker-crash-gh64647.cpp:5-13
+bool contains();
+void do_a() {
+  if (contains()) {
+MPI_Request request_item;
+MPI_Wait(&request_item, MPI_STATUS_IGNORE);
+// expected-warning@-1 {{Request 'request_item' has no matching 
nonblocking call.}}
+  }

Unless you have a compelling reason creating this file, I'd suggest to append 
this to the end of the `test/Analysis/mpichecker.cpp`, and just leave a comment 
there like `// GithHub Issue

[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-24 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158707#4614268 , @danix800 wrote:

> In D158707#4614100 , @steakhal 
> wrote:
>
>> As a general comment on requiring all extents to be of unsigned APSInts.
>> Checkers, like the `MallocChecker`, binds the result of arbitrary 
>> expression's values (the argument of malloc, for example) as extents of some 
>> regions.
>> Usually, that argument is of an `unsigned` static type, thus even if some 
>> other expression is there, and implicit cast will turn it into `unsigned`.
>> However, in CSA, we don't respect such casts, thus if it was signed, we will 
>> bind a signed value as an extent.
>>
>> Here is an example  demonstrating this:
>>
>>   int n = conjure();
>>   clang_analyzer_dump(n);  // conj
>>   clang_analyzer_value(n); // i32
>>   
>>   char *p = (char*)malloc(n);
>>   clang_analyzer_dump(clang_analyzer_getExtent(p));  // conj
>>   clang_analyzer_value(clang_analyzer_getExtent(p)); // i32
>>
>> Consequently, I'm not sure it's possible to turn extents "unsigned", without 
>> fixing casts.
>
> Wow! Really another unexpectation to me!
>
> I think either signedness is ok, the stored data would not be lost. How 
> clients use these values are the actual problem.

I'm not sure how could the clients, aka. checkers use extents "correctly". They 
cannot even explicitly wrap the extent by an "unsigned" type e.g. by modeling a 
cast to "size_t" because that is a noop, thus adding casts is an identity 
operation. See here 
.
Consequently, checkers using extents in comparisons or doing arithmetic in the 
symbolic domain e.g. when doing bounds checks under the hood, will be affected 
by the missing cast there.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-24 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

As a general comment on requiring all extents to be of unsigned APSInts.
Checkers, like the `MallocChecker`, binds the result of arbitrary expression's 
values (the argument of malloc, for example) as extents of some regions.
Usually, that argument is of an `unsigned` static type, thus even if some other 
expression is there, and implicit cast will turn it into `unsigned`.
However, in CSA, we don't respect such casts, thus if it was signed, we will 
bind a signed value as an extent.

Here is an example  demonstrating this:

  int n = conjure();
  clang_analyzer_dump(n);  // conj
  clang_analyzer_value(n); // i32
  
  char *p = (char*)malloc(n);
  clang_analyzer_dump(clang_analyzer_getExtent(p));  // conj
  clang_analyzer_value(clang_analyzer_getExtent(p)); // i32

Consequently, I'm not sure it's possible to turn extents "unsigned", without 
fixing casts.
I was sort of expecting this, but I didn't know real world issues materialized 
by this in the domain of extents.
Usually, they just happen to work, so I didn't really bother with it.
This was actually what I intended to deliver with my comment at D158499#4609046 
.

By this comment I'm not suggesting that this is a lost cause, or we shouldn't 
move in this direction. I think something like this would make sense, but I'd 
need to delve into the change to make sure.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158285: [NFC][CLANG] Fix wrong orders of function arguments positions

2023-08-24 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1468-1477
   std::optional getRangeForNegatedSymSym(const SymSymExpr *SSE) {
 return getRangeForNegatedExpr(
 [SSE, State = this->State]() -> SymbolRef {
   if (SSE->getOpcode() == BO_Sub)
 return State->getSymbolManager().getSymSymExpr(
-SSE->getRHS(), BO_Sub, SSE->getLHS(), SSE->getType());
+SSE->getLHS(), BO_Sub, SSE->getRHS(), SSE->getType());
   return nullptr;

tahonermann wrote:
> steakhal wrote:
> > Now this is within my realm.
> > This code applies the following transformation: `-(X - Y)  =>  (Y - X)` .
> > Consequently, this is intentional.
> Ideally, this change would have tripped up a test. Do we have tests that 
> attempt to verify source locations such that one could be added? I know 
> testing source locations is difficult and tedious.
> Ideally, this change would have tripped up a test.
I think it did. I had a quick look at the premerge test bot on buildkite, and 
there is a test failure likely related to this hunk. See 
clang/test/Analysis/ptr-arith.c
Check [[ 
https://buildkite.com/llvm-project/premerge-checks/builds/171979#018a0995-4c0b-4703-aef0-75e0a8def7c7
 | buildkite ]].



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1529-1536
   // If ranges were not previously found,
   // try to find a reversed expression (y > x).
   if (!QueriedRangeSet) {
 const BinaryOperatorKind ROP =
 BinaryOperator::reverseComparisonOp(QueriedOP);
-SymSym = SymMgr.getSymSymExpr(RHS, ROP, LHS, T);
+SymSym = SymMgr.getSymSymExpr(LHS, ROP, RHS, T);
 QueriedRangeSet = getConstraint(State, SymSym);

steakhal wrote:
> If you have recommendations on improving the comments of the surrounding 
> context, let me know.
This breaks the test: clang/test/Analysis/constraint_manager_conditions.cpp
Check [[ 
https://buildkite.com/llvm-project/premerge-checks/builds/171979#018a0995-4c0b-4703-aef0-75e0a8def7c7
 | buildkite ]].


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158285/new/

https://reviews.llvm.org/D158285

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158707: [analyzer] Fix a few size-type signedness inconsistency related to DynamicExtent

2023-08-24 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Maybe its too early for the patch, but if a nonlocal change, like changing some 
Core functionality, we usually measure the real world implications and compare 
it against some baseline to get an idea how this affects the whole system. It 
also helps uncovering corner cases and crashes. Have you done some 
measurements? If not, @donat.Nagy might could help you out gathering some 
statistics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158707/new/

https://reviews.llvm.org/D158707

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157385: [clang][CFG] Cleanup functions

2023-08-23 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

This is a great improvement. When I saw that clang now supports it and e.g. the 
CSA operates on the CFG, I also considered adding this.
Now I don't need to do it, so many thanks!

The code looks correct to me, except that the cleanup should be before the dtor 
if it has any. Other than that, the code coverage also looks great, thus I 
would not oppose.




Comment at: clang/lib/Analysis/CFG.cpp:1901-1902
   appendLifetimeEnds(Block, VD, S);
-if (BuildOpts.AddImplicitDtors)
+if (BuildOpts.AddImplicitDtors && !hasTrivialDestructor(VD))
   appendAutomaticObjDtor(Block, VD, S);
+if (HasCleanupAttr)

This condition looks new. Is it an orthogonal improvement?



Comment at: clang/lib/Analysis/CFG.cpp:1901-1904
+if (BuildOpts.AddImplicitDtors && !hasTrivialDestructor(VD))
   appendAutomaticObjDtor(Block, VD, S);
+if (HasCleanupAttr)
+  appendCleanupFunction(Block, VD);

steakhal wrote:
> This condition looks new. Is it an orthogonal improvement?
Shouldn't the cleanup function run first, and then the dtor of the variable?
This way the object is already "dead" even before reaching the cleanup handler.
https://godbolt.org/z/sT65boooW



Comment at: clang/lib/Analysis/CFG.cpp:2113
+
+bool CFGBuilder::hasTrivialDestructor(VarDecl *VD) const {
   // Check for const references bound to temporary. Set type to pointee.

Can't this accept `const VarDecl*` instead? That way we could get rid of that 
`const_cast` above.



Comment at: clang/test/Analysis/scopes-cfg-output.cpp:1473-1474
+// CHECK-NEXT:3: F f __attribute__((cleanup(cleanup_F)));
+// CHECK-NEXT:4: [B1.3].~F() (Implicit destructor)
+// CHECK-NEXT:5: CleanupFunction (cleanup_F)
+// CHECK-NEXT:6: CFGScopeEnd(f)

tbaeder wrote:
> aaronpuchert wrote:
> > Interesting test! But it seems CodeGen has them swapped: compiling this 
> > snippet with `clang -c -S -emit-llvm` I get
> > ```lang=LLVM
> > define dso_local void @_Z4testv() #0 personality ptr @__gxx_personality_v0 {
> >   %1 = alloca %class.F, align 1
> >   %2 = alloca ptr, align 8
> >   %3 = alloca i32, align 4
> >   invoke void @_Z9cleanup_FP1F(ptr noundef %1)
> >   to label %4 unwind label %5
> > 
> > 4:; preds = %0
> >   call void @_ZN1FD2Ev(ptr noundef nonnull align 1 dereferenceable(1) %1) #3
> >   ret void
> > 
> > ; ...
> > }
> > ```
> > So first cleanup, then destructor. This is with 17.0.0-rc2.
> Interesting, I thought I checked this and used the correct order. Will 
> re-check, thanks.
I believe this demonstrates the wrong order I also spotted earlier.



Comment at: clang/test/Analysis/scopes-cfg-output.cpp:1480
+public:
+  ~F() {}
+};

aaronpuchert wrote:
> tbaeder wrote:
> > aaronpuchert wrote:
> > > As with the cleanup function, a definition shouldn't be necessary.
> > Is there a way to test whether the contents of the cleanup function are 
> > being checked as well? From these tests, I only know we consider them 
> > called, but not whether we (properly) analyze their bodies in the context 
> > as well. Or is that separate from this patch?
> For now we're just adding a new element to the CFG and adapting the 
> respective tests. The CFG is generated on a per-function basis, and the 
> generation of one function's CFG will never look into another function's 
> body. It might use some (declaration) properties of course, like whether it 
> has `[[noreturn]]` or `noexcept`. Of course we should also generate a CFG for 
> the cleanup function, but that's independent of this change.
> 
> Users of the CFG will naturally need to be taught about this new element type 
> to “understand” it. Otherwise they should simply skip it. Since the CFG 
> previously did not contain such elements, there should be no change for them. 
> So we can also initially just add the element and not tell anybody about it.
> 
> We could also add understanding of the new element type to other CFG users, 
> but you don't have to do this. If you only care about Thread Safety Analysis, 
> then it's totally fine to handle it only there.
> 
> But let's move all changes to Thread Safety Analysis into a follow-up, so 
> that we don't have to bother the CFG maintainers with that.
Speaking of `noreturn`, I think it is worth demonstrating that if the cleanup 
function has `noreturn` attribute, then the dtor is not called.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157385/new/

https://reviews.llvm.org/D157385

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154603: [analyzer][clangsa] Add new option to alpha.security.cert.InvalidPtrChecker

2023-08-23 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D154603#4609872 , @gamesh411 wrote:

> - try to consolidate the multiple warnings coming from this checker's 
> `checkLocation` callback
>
> category based filtering ( example from 
> lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:167 ):
>
>   If (!BR.isInteresting(CallLocation) ||
> BR.getBugType().getCategory() != categories::TaintedData) { //but this 
> would be InvalidPtr BugType's category, namely memory_error
> return "";
>   }
>
> or checker based filtering ( example from 
> lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp:397 )
>
>   if (&BR.getBugType() != smartptr::getNullDereferenceBugType() || // this is 
> a comparison of the address of a static bugtype
>   !BR.isInteresting(ThisRegion))
>
> This second one gives a more precise filtering, but the 
> implementation-specific detail of storing the bugtype by reference is what 
> seems to make this work, which I find hacky.

If the checker issues a NoteTag, it makes sense in certain situations to make 
sure that it acts on only reports issued by that checker. The standard way of 
achieving that is by comparing the tags, as you do in the second example.
There is nothing wrong with this, if that particular checker issued that 
NoteTag in the first place. It's marginally debatable, if it was not issued by 
the given checker but rather something else. That would suggest to me some 
logic flaw, or coupling issue. For cross-checker cases, I think the bug 
category would be the better option, but I would still need to think about 
that, so not set in stone :D

FYI I haven't looked at the patch yet, but I wanted to answer your question.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154603/new/

https://reviews.llvm.org/D154603

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158499: [analyzer] Compute FAM dynamic size

2023-08-22 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158499#4608847 , @danix800 wrote:

> We have `getDynamicExtentWithOffset` to use, which can handle more general
> dynamic memory based cases than this fix.
>
> I'll abandon this one.
>
> There are issues worth clarifying and fixing:
>
> 1). Debugging APIs like `clang_analyzer_dumpExtent` in `ExprInspection` might 
> be expanded
> to use `getDynamicExtentWithOffset` if it's intended to be used on not only 
> dynamic allocated
> regions but more general ones, and more testcases are needed to demonstrate 
> the usage.
>
> 2). Fix type-inconsistency of several size-related `SVal`s, e.g.
>
>   FAM fam;
>   clang_analyzer_dump(clang_analyzer_getExtent(&fam));
>   clang_analyzer_dump(clang_analyzer_getExtent(fam.data));
>   // expected-warning@-2 {{4 S64b}}  // U64b is more reasonable when used as 
> an extent
>   // expected-warning@-2 {{0 U64b}}
>
> `ArrayIndexType` might be misused here.
>
> Simple searching results listed here (might not be complete):
>
> 1. `getDynamicExtentWithOffset` return value
> 2. `getElementExtent` return value
> 3. `ExprEngineCallAndReturn.cpp` when calling `setDynamicExtent` the `Size` 
> arg

I've also thought of these two. I think we should indeed fix both.

In D158499#4608974 , @danix800 wrote:

> One of the observable issue with inconsistent size type is
>
>   void clang_analyzer_eval(int);
>   
>   typedef unsigned long long size_t;
>   void *malloc(unsigned long size);
>   void free(void *);
>   
>   void symbolic_longlong_and_int0(long long len) {
> char *a = malloc(5);
> (void)a[len + 1]; // no-warning
> // len: [-1,3]
> clang_analyzer_eval(-1 <= len && len <= 3); // expected-warning {{TRUE}}
> clang_analyzer_eval(0 <= len);  // expected-warning 
> {{UNKNOWN}}
> clang_analyzer_eval(len <= 2);  // expected-warning 
> {{UNKNOWN}}
> free(a);
>   }
>
> which is extracted from 
> `clang/test/Analysis/array-bound-v2-constraint-check.c`,
> with `DynamicMemoryModeling` turned on,
> the second warning does not hold anymore: `clang_analyzer_eval(0 <= len);`
> will be reported as `TRUE` which is not expected.
>
> `DynamicMemoryModeling` will record the extent of allocated memory as `5ULL`,
> `ArrayBoundV2` will do `len + 1 < 5ULL` assumption, simplified to `len < 
> 4ULL`,
> which casts `len` to unsigned, dropping `-1`, similar to
>
>   void clang_analyzer_eval(int);
>   
>   void test(int len) {
> if (len >= -1 && len <= 4U) { // len is promoted into unsigned, thus can 
> never be negative
> clang_analyzer_eval(0 <= len);  // TRUE
> }
>   }

I suspected that the wrong cast modeling and how we infer what value ranges are 
calculated is susceptible to such APSInt signedness issues, but I haven't seen 
a case in the wild for extents. Thus, I didn't think of fixing it either. But 
yes, we should.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158499/new/

https://reviews.llvm.org/D158499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158499: [analyzer] Compute FAM dynamic size

2023-08-22 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158499#4606337 , @danix800 wrote:

> In D158499#4606291 , @steakhal 
> wrote:
>
>> Thanks for submitting this.
>> A funny thing is that in my free time I was also working on this last week. 
>> I'll have a look at this more in depth during the week.
>> For the mean time here is my version, committed pretty much a couple days 
>> ago to my fork.
>> https://github.com/steakhal/llvm-project/commit/986059a146e036ec84db64868a079d3c4a0e5e16
>>
>> EDIT: fix the link to point to my fork.
>>
>> Your proposed change looks pretty similar to mine.
>
> Are you going to submit your changes? Which one do you think is more suitable?
>
> Please let me know whether I should continue on this one or wait for yours. 
> Thanks!

I think the difference between yours and my implementation is that I try to 
accept FAM like arrays even if they are nested inside other strict, if they are 
notionally the last field of the whole thing. I'm doing it because that is how 
the machines work, thus the model should model that and return the right extent 
even for those cases.
What I can tell, that I didn't check my implementation with the packed 
attribute. I assume it should work.
The decl has a tempting method checking if its a FAM, however that does not 
accept FAMs of size 1 or of any size.

So its hard to tell right away which approach is better, but I'll look into 
this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158499/new/

https://reviews.llvm.org/D158499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158499: [analyzer] Compute FAM dynamic size

2023-08-22 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Thanks for submitting this.
A funny thing is that in my free time I was also working on this last week. 
I'll have a look at this more in depth during the week.
For the mean time here is my version, committed pretty much a couple days ago 
to my fork.
https://github.com/llvm/llvm-project/commit/986059a146e036ec84db64868a079d3c4a0e5e16

Your proposed change looks pretty similar to mine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158499/new/

https://reviews.llvm.org/D158499

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-08-21 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

One more thing to mention. Its usually illadvised to rename files or do changes 
that would seriously impact git blame, unless we have a really good reason 
doing so.
Aesthetics usually don't count one, especially if the name is not user-facing. 
However, maintainability is another axis, thus as it's always, not black and 
white.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158156: [analyzer] Add C++ array delete checker

2023-08-21 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I share the raised concerns. And I think we are aligned.

PS: sorry if any of my comments are dups, or outdated. I only had a little time 
last week, and things have progressed since then I noticed. I still decided to 
submit my possibly outdated and partial review comments. I hope you understand.
Its quite difficult to allocate time to do reviews from day to to day work. I 
unfortunately usually do this in my spare time, if I have.




Comment at: clang/docs/analyzer/checkers.rst:1787-1804
+.. _alpha-cplusplus-ArrayDelete:
+
+alpha.cplusplus.ArrayDelete (C++)
+""
+Reports destructions of arrays of polymorphic objects that are destructed as 
their base class.
+
+.. code-block:: cpp

I think you should probably mention `EXP51-CPP CERT rule` somehow here.



Comment at: clang/docs/analyzer/checkers.rst:1789-1791
+alpha.cplusplus.ArrayDelete (C++)
+""
+Reports destructions of arrays of polymorphic objects that are destructed as 
their base class.





Comment at: clang/docs/analyzer/checkers.rst:1793-1803
+.. code-block:: cpp
+
+ Base *create() {
+   Base *x = new Derived[10]; // note: conversion from derived to base
+  //   happened here
+   return x;
+ }

Make sure in the example the `create` is related (e.g. called/used).
Also, refrain from using `sink` in the docs. It's usually used in the context 
of taint analysis.



Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:757-760
+def CXXArrayModeling: Checker<"CXXArrayModeling">,
+  HelpText<"The base of a few CXX array related checkers.">,
+  Documentation,
+  Hidden;





Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:50-51
+  class DeleteBugVisitor : public BugReporterVisitor {
+  public:
+DeleteBugVisitor() : Satisfied(false) {}
+void Profile(llvm::FoldingSetNodeID &ID) const override {





Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:61
+  private:
+bool Satisfied;
+  };





Comment at: clang/lib/StaticAnalyzer/Checkers/CXXDeleteChecker.cpp:65-68
+  bool ArrayDeleteEnabled = false;
+  CheckerNameRef ArrayDeleteName;
+  bool NonVirtualDeleteEnabled = false;
+  CheckerNameRef NonVirtualDeleteName;




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158156/new/

https://reviews.llvm.org/D158156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158285: [NFC][CLANG] Fix wrong orders of function arguments positions

2023-08-21 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D158285#4603477 , @aaron.ballman 
wrote:

>> I've seen a few similar patches from you @Manna and probably some related 
>> folks.
>> Could you please clarify the motivation and the expectations? Or feel free 
>> to point me to the Discuss post around the subject if I happened to miss it.
>
> Intel has been addressing issues found by a static analysis tool and when we 
> find the issue exists in community Clang, we address it upstream. Part of 
> addressing the issue is figuring out whether the issue is actually a false 
> positive or not; we try to limit the changes to only true positives (but some 
> will sometimes slip through the cracks). I don't believe there's been a 
> Discourse post on the topic as it's expected to be regular maintenance work 
> instead of something special.

Thanks for clarifying. It does resonate with me, and it is an important subject 
to work on. However, I sometimes feel that a little more effort in 
understanding the context where the suggestion is raised before review would 
sort out some of the FPs I usually encounter of such patches.
I find this important because it's difficult to allocate time for doing this.

To me, an important metric of a quality patch is if the title and the summary 
describe why we have this patch, and what we do about it. Maybe, if it matters, 
also describe what the author tried and didn't work and why.
An implicit benefit of this is that it helps the author to reflect and form a 
deeper understanding of the issue at hand (to be able to describe it).

In addition to this, if it's a behavioral change (like this), it's valuable to 
demonstrate the bug with a test case covering the changed parts.
I'm not enforcing this, as it doesn't always make sense. But thinking about it 
again helps to reflect. At worst, the result of this would materialize in "I 
don't have a test for this change because XYZ" in the summary of the revision.
Following these steps would improve review experience, which should also lead 
to happier patch authors in the form of quick and to-the-point review comments.

I hope I clarified my standpoint. @aaron.ballman


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158285/new/

https://reviews.llvm.org/D158285

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157104: [analyzer] Improve underflow handling in ArrayBoundV2

2023-08-21 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Hold on, make sure this is clang-format compliant before committing.
(Check buildkite)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157104/new/

https://reviews.llvm.org/D157104

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157104: [analyzer] Improve underflow handling in ArrayBoundV2

2023-08-21 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.

Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157104/new/

https://reviews.llvm.org/D157104

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158360: [docs] Update the static analyzer bug reporting page

2023-08-20 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Makes sense to me, thanks!

I'm not sure though how much we wanna invest into maintaining these handwritten 
htmls. I would prefer moving away from these. It would likely also fit more 
nicely with how the other tools look like in terms of styling.
No actions expected, this is more of just expressing my position on these 
analyzer HTML files.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158360/new/

https://reviews.llvm.org/D158360

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158360: [docs] Update the static analyzer bug reporting page

2023-08-20 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Makes sense to me. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158360/new/

https://reviews.llvm.org/D158360

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158285: [NFC][CLANG] Fix wrong orders of function arguments positions

2023-08-19 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added subscribers: tbaeder, ABataev.
steakhal added a comment.
This revision now requires changes to proceed.

I think we have two 2 TPs, that were interesting to see. I've CCd the code 
owners of the affected parts to confirm this.

As a general note,this patch is not NFC, thus it's expected to demonstrate the 
behavior difference by a test.
Without tests, at least in the Static Analyzer, we generally don't accept 
patches.

---

I've seen a few similar patches from you @Manna and probably some related folks.
Could you please clarify the motivation and the expectations? Or feel free to 
point me to the Discuss post around the subject if I happened to miss it.




Comment at: clang/lib/AST/Interp/InterpBlock.cpp:94-95
 
 DeadBlock::DeadBlock(DeadBlock *&Root, Block *Blk)
-: Root(Root), B(Blk->Desc, Blk->IsStatic, Blk->IsExtern, /*isDead=*/true) {
+: Root(Root), B(Blk->Desc, Blk->IsExtern, Blk->IsStatic, /*isDead=*/true) {
   // Add the block to the chain of dead blocks.

I think this is a TP.
I find this unfortunate, that while all the ˙Block` ctors accepts the 
parameters in the order how the backing fields are defined - except for the 
`isDead` ctor overload, where the `IsExtern` and `IsStatic`.

To address this, I'd recommend not swapping the parameters at the callsite, but 
rather fix the ctor overload to expect the parameters in the right order as the 
rest of the ctors do. And of course, check all the callsites to this weird 
constructor to fix them as well.

WDYT @tbaeder



Comment at: clang/lib/Sema/SemaOpenMP.cpp:18098-18101
   case OMPC_allocate:
 Res = ActOnOpenMPAllocateClause(Data.DepModOrTailExpr, VarList, StartLoc,
-LParenLoc, ColonLoc, EndLoc);
+ColonLoc, LParenLoc, EndLoc);
 break;

Looks like another TP.
WDYT @ABataev



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1468-1477
   std::optional getRangeForNegatedSymSym(const SymSymExpr *SSE) {
 return getRangeForNegatedExpr(
 [SSE, State = this->State]() -> SymbolRef {
   if (SSE->getOpcode() == BO_Sub)
 return State->getSymbolManager().getSymSymExpr(
-SSE->getRHS(), BO_Sub, SSE->getLHS(), SSE->getType());
+SSE->getLHS(), BO_Sub, SSE->getRHS(), SSE->getType());
   return nullptr;

Now this is within my realm.
This code applies the following transformation: `-(X - Y)  =>  (Y - X)` .
Consequently, this is intentional.



Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1529-1536
   // If ranges were not previously found,
   // try to find a reversed expression (y > x).
   if (!QueriedRangeSet) {
 const BinaryOperatorKind ROP =
 BinaryOperator::reverseComparisonOp(QueriedOP);
-SymSym = SymMgr.getSymSymExpr(RHS, ROP, LHS, T);
+SymSym = SymMgr.getSymSymExpr(LHS, ROP, RHS, T);
 QueriedRangeSet = getConstraint(State, SymSym);

If you have recommendations on improving the comments of the surrounding 
context, let me know.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158285/new/

https://reviews.llvm.org/D158285

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158295: [NFC][clang][analyzer] Avoid potential dereferencing of null pointer value

2023-08-18 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

Thanks for the PR.
I went over the code and concluded that it can never be null.
However, the code could be improved, while making this explicit.




Comment at: 
clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountChecker.cpp:1416-1433
   // Update counts from autorelease pools
   for (const auto &I: state->get()) {
 SymbolRef Sym = I.first;
 if (SymReaper.isDead(Sym)) {
   static CheckerProgramPointTag Tag(this, "DeadSymbolAutorelease");
   const RefVal &V = I.second;
   state = handleAutoreleaseCounts(state, Pred, &Tag, C, Sym, V);

So `state->get()` returns a map, and we iterate over the Key&Val 
pairs of that map.

```
const RefVal *getRefBinding(ProgramStateRef State, SymbolRef Sym) {
  return State->get(Sym);
}
```

So, this just returns the associated value with `Sym` in the map.
Instead of this lookup ,we really should have just used the `I.second`.

Now the question is, could have map a `Sym` to a null pointer?

```
static ProgramStateRef setRefBinding(ProgramStateRef State, SymbolRef Sym,
 RefVal Val) {
  assert(Sym != nullptr);
  return State->set(Sym, Val);
}
```
Nop. It's never supposed to be null.

---

Could you please just decompose in the loop with a structured-binding to `auto 
[Sym, V]` and use `V` instead of `T`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158295/new/

https://reviews.llvm.org/D158295

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-18 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

🎉 🚀


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-15 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.

In D156312#4588406 , @donat.nagy 
wrote:

> After investigating this issue, I added the testcases 
> `signed_aritmetic_{good,bad}` which document the current sub-optimal state. 
> The root cause of this problem is a high-level property of the engine (that 
> it assumes that signed overflows are always possible and acceptable) and I 
> don't see a local workaround that would silence or fix these incorrect error 
> messages.
>
> @steakhal @NoQ What do you know about these signed overflow issues (I presume 
> that analogous issues also appear in other checkers)? How should we handle 
> this limitation of this checker?

I don't think most checkers depend on value ranges explicitly, except for a 
handful of checkers maybe. So, I don't think it's such a huge deal, but I agree 
that it's a problem.
I would bet that the StdLibraryFunctionsChecker would suffer from the same 
issue, and the OOB checker(s) along with it.
I don't know of heuristics mitigating the damage.

I don't think we should do anything about it unless it's frequent enough.
Try to come up with a heuristic to be able to measure how often this happens, 
if you really care.
Once you have a semi-working heuristic for determining if that bugpath suffers 
from this, you can as well use it for marking the bugreport invalid if that's 
the case to lean towards suppressing those FPs.

I don't think it's completely necessary to fix those FPs to land this. I think 
of that as an improvement, on top of this one.
I hope I clarified my standpoint.




Comment at: clang/test/Analysis/bitwise-shift-sanity-checks.c:70-74
+  if (right < 0)
+return 0;
+  return left << (right + 32);
+  // expected - warning@-1 {{Left shift overflows the capacity of 'int'}}
+  // expected-warning@-2 {{Right operand is negative in left shift}}

Let's be explicit about the actual assumed value range, and use 
`clang_analyzer_value()`.
I also recommend using an explicit FIXME for calling out what should be there, 
instead of abusing a non-matching `expected` pattern. I know I used that in the 
past, and probably seen it from me. I feel ashamed that I did that. I know I 
did that to have cleaner diffs, once fixed, but I honestly think it does more 
harm than good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D126864: [clang] Introduce -fstrict-flex-arrays= for stricter handling of flexible arrays

2023-08-15 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Thanks for the xrefs! It turns out I've already seen that one :D
I'll look into it once more, with a fresh set of eyes. Thanks again!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126864/new/

https://reviews.llvm.org/D126864

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-15 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

The test coverage is also impressive. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157888: [NFC][Clang] Fix static analyzer concern about null value dereference

2023-08-14 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

If `getStmtForDiagnostics()` in general, never returns null, then shouldn't we 
mark the API with an appropriate attribute?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157888/new/

https://reviews.llvm.org/D157888

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157888: [NFC][Clang] Fix static analyzer concern about null value dereference

2023-08-14 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Hmm. I guess the assertion is to silence some tool. And I think actually that 
function might very well also return null in some cases.
Why do you think it cannot or at least should not return null in your context? 
I couldn't infer this from the context, neither from the description of this 
patch.

Without that, I would prefer an if to guard the code, instead of asserting this.




Comment at: clang/lib/StaticAnalyzer/Checkers/FuchsiaHandleChecker.cpp:657
+  const Stmt *S = AcquireNode->getStmtForDiagnostics();
+  assert(S && "Statmement cannot be null.");
   PathDiagnosticLocation LocUsedForUniqueing =

I think there is a typo in the word statement.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157888/new/

https://reviews.llvm.org/D157888

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D126864: [clang] Introduce -fstrict-flex-arrays= for stricter handling of flexible arrays

2023-08-13 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I notices some inconsistency between this `-fstrict-flex-arrays=N` flag and 
what the `RecordDecl::hasFlexibleArrayMember()` returns for an example like 
this:

  typedef unsigned long size_t;
  void *malloc(size_t);
  void free(void *);
  
  void field(void) {
struct vec { size_t len; int data[0]; };
struct vec *a = (struct vec*) malloc(sizeof(struct vec) + 10*sizeof(int));
free(a);
  }

In the example, I use don't specify the `-fstrict-flex-arrays` flag, thus it 
should default to `0`, which means that any trailing arrays (let it be 
incomplete or of any concrete size), to be considered as a 
flexible-array-member.
In the AST, for the `RecordDecl`, I'd expect that the 
`hasFlexibleArrayMember()` member function would reflect this, and return 
`true`, however, it returns `false` instead.
It does so because in SemaDecl.cpp Sema::ActOnFields() 

 before doing some FAM-related checks and diagnostics it performs this check, 
guarding that branch:

  bool IsLastField = (i + 1 == Fields.end());
  if (FDTy->isFunctionType()) {
// ...
  } else if (FDTy->isIncompleteArrayType() &&
 (Record || isa(EnclosingDecl))) {
//This is the FAM handling branch.
// ...
  }

Consequently, Sema will only set the bit for `hasFlexibleArrayMember` if the 
array is //incomplete//.
So my question is, if the `RecordDecl::hasFlexibleArrayMember()` should be 
consistent with the `-fstrict-flex-arrays` flag, or not?
@serge-sans-paille @aaron.ballman


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126864/new/

https://reviews.llvm.org/D126864

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D150647: [WIP][analyzer] Fix EnumCastOutOfRangeChecker C++17 handling

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Isn't D153954  superseded this one? If so, 
consider abandoning it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150647/new/

https://reviews.llvm.org/D150647

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154603: [analyzer][clangsa] Add new option to alpha.security.cert.InvalidPtrChecker

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I'm sorry starting the review of this one only now, but I'm quite booked.
Is it still relevant? If so, I'll continue.




Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:116-117
+const NoteTag *Note =
+C.getNoteTag([Region, FunctionName, Message](PathSensitiveBugReport 
&BR,
+ llvm::raw_ostream &Out) {
+  if (!BR.isInteresting(Region))

`FunctionName` and `Message` will dangle inside the NoteTag.



Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:125
 
-  const NoteTag *Note =
-  C.getNoteTag([SymbolicEnvPtrRegion, FunctionName](
-   PathSensitiveBugReport &BR, llvm::raw_ostream &Out) {
-if (!BR.isInteresting(SymbolicEnvPtrRegion))
-  return;
-Out << '\'' << FunctionName
-<< "' call may invalidate the environment parameter of 'main'";
-  });
+  ExplodedNode *CurrentChainEnd = nullptr;
+

donat.nagy wrote:
> Perhaps add a comment that clarifies that passing a `nullptr` as the 
> ExplodedNode to `addTransition` is equivalent to specifying the current node. 
> I remember this because I was studying its implementation recently, but I 
> would've been surprised and suspicious otherwise.
If `nullptr` is equivalent with `C.getPredecessor()` inside `addTransition()`, 
why not simply initialize it to that value instead of to `nullptr`?



Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:214
+  // Model 'getenv' calls
+  CallDescription GetEnvCall{{"getenv"}, 1};
+  if (GetEnvCall.matches(Call)) {

We should hoist this into a field, to only construct it once.



Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:216-217
+  if (GetEnvCall.matches(Call)) {
+State =
+State->add(Call.getReturnValue().getAsRegion());
+   C.addTransition(State);

What ensures that `Call.getReturnValue().getAsRegion()` is not null?



Comment at: clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp:218
+State->add(Call.getReturnValue().getAsRegion());
+   C.addTransition(State);
+  }

donat.nagy wrote:
> I fear that this state transition will go "sideways" and the later state 
> transitions (which add the note tags) will branch off instead of building 
> onto this. IIUC calling `CheckerContext::addTransition` registers the 
> transition without updating the "current ExplodedNode" field of 
> `CheckerContext`, so you need to explicitly store and pass around the 
> ExplodedNode returned by it if you want to build on it.
> 
> This is an ugly and counter-intuitive API, and I also ran into a very similar 
> issue a few weeks ago (@Szelethus helped me).
I think the usage here is correct.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154603/new/

https://reviews.llvm.org/D154603

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152436: [clang][analyzer] Move checker alpha.unix.StdCLibraryFunctions out of alpha.

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

As I don't use this checker, thus I cannot speak of my experience.
However, the reasons look solid and I'm fine with moving this checker to 
`unix.StdCLibraryFunctions`.
Let some time for other reviewers to object before landing this. Lets say, one 
week from now.
@xazax.hun @NoQ?

As next steps, I'd be glad set the default value of `ModelPOSIX`  to `true`. I 
don't see much harm doing so.
(And maybe getting rid of that checker option entirely.)




Comment at: clang/docs/analyzer/checkers.rst:979-982
+.. _unix-StdCLibraryFunctions:
+
+unix.StdCLibraryFunctions (C)
+"""




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152436/new/

https://reviews.llvm.org/D152436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157104: [analyzer] Improve underflow handling in ArrayBoundV2

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.

Looks safe and good. I'm interested in the diff though.
Let me know once you have the results.
I wanna have a look before we land this.




Comment at: clang/lib/StaticAnalyzer/Checkers/ArrayBoundCheckerV2.cpp:184
+// symbolic regions on the heap (which may be introduced by checkers that
+// call SValBuilder::getConjuredHeapSymbolVal()) and non-symbolic regions
+// (e.g. a field subregion of a symbolic region) in unknown space.

Such as the MallocChecker.



Comment at: clang/test/Analysis/out-of-bounds.c:176
 }
-

Try not to introduce unrelated hunks, as it might introduce more conflicts for 
downstream users than absolutely necessary.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157104/new/

https://reviews.llvm.org/D157104

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157702: [clang][analyzer][NFC] Change PlacementNewChecker into PlacementNew in documentation

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D157702#4580384 , @PiotrZSL wrote:

> In D157702#4580310 , @steakhal 
> wrote:
>
>> Do we have other typos like this?
>
> I don't think so, only mess in overall documentation: 
> https://github.com/llvm/llvm-project/issues/64614
> I updated gen-static-analyzer-docs.py, and it shown me that some 
> documentation is missing: 
> https://clang.llvm.org/extra/clang-tidy/checks/list.html.
> You can see it, as there are no links for some checks.

Yea, it's a complete mess, I agree.
I have no stakes on the docs, but I know that Ericsson folks do. They might 
wanna chim in. @donat.nagy @Szelethus


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157702/new/

https://reviews.llvm.org/D157702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157702: [clang][analyzer][NFC] Change PlacementNewChecker into PlacementNew in documentation

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Do we have other typos like this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157702/new/

https://reviews.llvm.org/D157702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157702: [clang][analyzer][NFC] Change PlacementNewChecker into PlacementNew in documentation

2023-08-11 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Correct. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157702/new/

https://reviews.llvm.org/D157702

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-10 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Great progress.
Now, moving to the docs.
Have you checked that the docs compile without errors and look as you would 
expect? Please post how it looks.

Please make a note about this new checker in the clang's release docs, as we 
usually announce new checkers and major changes there.




Comment at: clang/docs/analyzer/checkers.rst:58-64
+ int basic_examples(int a, int b) {
+   if (b < 0) {
+ return a >> b; // warn: right argument is negative
+   } else if (b >= 32) {
+ return a << b; // warn: right argument >= capacity
+   }
+ }

else after return



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:272
+  if (const auto ConcreteRight = Right.getAs()) {
+const std::int64_t RHS = ConcreteRight->getValue().getExtValue();
+assert(RHS > MaximalAllowedShift);

`getExtValue()`, I believe, asserts that it "fits", thus the concrete int is at 
most 64 bits wide. Does it work for `_BigInt`s? (or whatever we have like that 
:D)



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:299-303
+  if (Side == OperandSide::Left) {
+NonNegOperands |= NonNegLeft;
+  } else {
+NonNegOperands |= NonNegRight;
+  }

When I see something like this I always think of using a ternary.
Have you also considered?



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:335-339
+auto R =
+std::make_unique(BT, ShortMsg, Msg, ErrNode);
+bugreporter::trackExpressionValue(ErrNode, Op->getLHS(), *R);
+bugreporter::trackExpressionValue(ErrNode, Op->getRHS(), *R);
+return R;

AHA! Now I got you. You also call this `R`.
Just use `R` at the callsite as well. Job done.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:355-357
+if (!BTPtr)
+  BTPtr = std::make_unique(this, "Bitwise shift",
+"Suspicious operation");

How about eagerly inline initializing this field? And avoiding `mutable` as 
well.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:367-369
+
+  const AnalyzerOptions &Opts = Mgr.getAnalyzerOptions();
+





Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:53
+class BitwiseShiftValidator {
+  // Primary mutable state:
+  CheckerContext &Ctx;

donat.nagy wrote:
> steakhal wrote:
> > Where does this comment corresponds to?
> The two data members (`Ctx` and `State`) that are directly below the comment.
> 
> I tried to split the list of data members into three groups (primary mutable 
> state, secondary mutable state, `const`  members), but now I reconsider this 
> I feel that these header comments are actually superfluous.
> 
> I'll return to a variant of the earlier layout where I put 
> `NonNegFlags`/`UpperBound` to the end and comment that they are only used for 
> note tag creation; but don't emphasize the (immediately visible) `const`ness 
> / non-`const`-ness of other members with comments.
One way to emphasize this is by choosing names like `FoldedState`, or anything 
else that has something to do with "motion" or "change".
Given that if we have `ProgramStateRefs` in helper classes they usually remain 
still. Think of BugReportVisitors for example. Consequently,it's important to 
raise awareness when it's not like that. A different name would achieve this.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:57-59
+  // Secondary mutable state, used only for note tag creation:
+  enum { NonNegLeft = 1, NonNegRight = 2 };
+  unsigned NonNegFlags = 0;

donat.nagy wrote:
> donat.nagy wrote:
> > steakhal wrote:
> > > How about using the same enum type for these two? It would signify that 
> > > these are used together.
> > > Also, by only looking at the possible values, it's not apparent that 
> > > these are bitflag values. A suffix might help the reader.
> > Using the enum type is a good idea, however I'd probably put the `Flag` 
> > suffix into the name of the type:
> > ```
> > enum SignednessFlags : unsigned { NonNegLeft = 1, NonNegRight = 2 };
> > SignednessFlags NonNegOperands = 0;
> > ```
> I rename the member `NonNegFlags` to `NonNegOperands` but its type remains a 
> plain `unsigned` because otherwise using the or-assign operator like 
> `NonNegOperands |= NonNegLeft` produces the error
> > Assigning to 'enum SignednessFlags' from incompatible type 'unsigned int' 
> > (clang typecheck_convert_incompatible)
> (note that both operands were from the same enum type).
Okay, indeed one would need to define the `operator|(SignednessStatus, 
SignednessStatus)` and `operato&|(SignednessStatus, SignednessStatus)` as well.
It probably doesn't worth it. Such a pity that we can't have nice things in C++.



[PATCH] D154838: [analyzer] Add check for null pointer passed to %p of printf family

2023-08-10 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Looks pretty good.




Comment at: clang/docs/analyzer/checkers.rst:704
+optin.portabilityMinor.UnixAPI
+"
+Finds non-severe implementation-defined behavior in UNIX/Posix functions.

This line should be just as long, as the line above.



Comment at: clang/docs/analyzer/checkers.rst:705
+"
+Finds non-severe implementation-defined behavior in UNIX/Posix functions.
+

Our docs aren't great, but we should have a brief description what the checker 
detects, basically here it would be "Find null pointers being passed to 
printf-like functions. Usually, passing null pointers to such functions is 
implementation defined, thus non-portable. Printf-like functions are: x, y, z."
Illustrate one example for diagnosing a case. Also illustrate one case similar 
to the bad one, that is fixed (basically ensure that the pointer is not null 
there).
And that's it.



Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:45-50
+// The PortabilityMinor package is for checkers that find non-severe 
portability
+// issues (see also the Portability package). Such checks may be unwanted for
+// developers who want to ignore minor portability issues, hence they are put 
in
+// a separate package.
+def PortabilityMinorOptIn : Package<"portabilityMinor">, ParentPackage;
+

I don't think we should create a separate package. A separate checker should be 
enough in `alpha.unix`, or `alpha.optin`, or just `optin`.



Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:1671
+
+def UnixAPIPortabilityMinorChecker : Checker<"UnixAPI">,
+  HelpText<"Finds non-severe implementation-defined behavior in UNIX/Posix 
functions">,

Usually, the user-facing checker name which is the `<"...">` part there, 
matches the name of the checker's class name. It helps for maintenance.
I would strongly suggest keeping this naming convention.



Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:1673
+  HelpText<"Finds non-severe implementation-defined behavior in UNIX/Posix 
functions">,
+  Documentation;
+

Docs are not really optional these days, but we can come back later once the 
checker is ironed out.



Comment at: clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp:88
 
+class UnixAPIPortabilityMinorChecker
+: public Checker> {

Usually, we put separate checkers into their own file.



Comment at: clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp:89
+class UnixAPIPortabilityMinorChecker
+: public Checker> {
+public:

Prefer using the `check::PreCall` callback over the `PreStmt`.



Comment at: clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp:94
+private:
+  mutable std::unique_ptr BT_printfPointerConversionSpecifierNULL;
+

Nowdays, if I'm not mistaken, we just inline initialize these eagerly.
That way you could also avoid marking it mutable.



Comment at: clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp:102
+  void
+  ReportPrintfPointerConversionSpecifierNULL(clang::ento::CheckerContext &C,
+ ProgramStateRef nullState,

Feel free to just call it `reportBug`.
Also, in the cpp file, usually we use the namespaces `clang,llvm,ento` already. 
You don't need to fully-qualify these names.



Comment at: clang/lib/StaticAnalyzer/Checkers/UnixAPIChecker.cpp:593-594
+
+  if (FName == "printf")
+CheckPrintfPointerConversionSpecifierNULL(C, CE, 1);
+

For matching names, we usually use `CallDescriptions` to do it for us.
If think you could use them while also checking if the Call is a 
`CallEvent::isGlobalCFunction()`. I suspect, you only wanna check those.
I think you should define a `CallDescriptionMap`, because you will also need 
your specific `data_arg_index`...
Look for uses of such maps and you will get inspired.



Comment at: clang/test/Analysis/unix-fns.c:81-92
+#ifndef NULL
+#define NULL ((void*) 0)
+#endif
+
+struct FILE_t;
+typedef struct FILE_t FILE;
+

I'd highly recommend not touching this file to avoid the bulk change in the 
expected plist output. Please just introduce a new test file.



Comment at: clang/test/Analysis/unix-fns.c:280-289
+// Test pointer constraint.
+void printf_pointer_conversion_specifier_null_pointer_constraint(char *format, 
void *pointer1) {
+  void *pointer2 = NULL;
+  printf(format, pointer2); // expected-warning{{The result of passing a null 
pointer to the pointer conversion specifier of the printf family of functions 
is implementation defined}}
+  if (pointer1 != NULL) {
+printf(format, pointer1); // no-warning
+return;

I would appreciate a 

[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-10 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I haven't gone deep into the implementation, but by only looking at the quality 
of the code and the structure I assume it implements the right behavior.
I've checked the tests though, and they look good.
I only found irrelevant nits. Awesome work.

---

Have you considered checking the shift operands for taint?
I wonder if we could warn if we cannot prove the correct use of the shift 
operator when either of the operands is tainted.




Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:135
+  "If set to true, the checker reports undefined behavior even 
"
+  "if if is supported by most compilers. (This flag has no "
+  "effect in C++20 where these constructs are legal.)",

Check for subsequent duplicated words. They are usually typos.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:9-11
+// This file defines BitwiseShiftChecker, which is a path-sensitive checker
+// that looks for undefined behavior caused by incorrect use of the bitwise
+// shift operators '<<' and '>>'.

Please elaborate what counts as "incorrect use".



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:29-30
+
+using namespace clang;
+using namespace ento;
+

This is used quite often and hinders the readability by indenting the args a 
lot. Making the spelling shorter, would somewhat mitigate this.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:35
+
+typedef std::unique_ptr BugReportPtr;
+





Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:42
+
+constexpr NoteTagTemplate NoteTagTemplates[4] = {
+  {"", "right operand of bit shift is less than "},





Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:53
+class BitwiseShiftValidator {
+  // Primary mutable state:
+  CheckerContext &Ctx;

Where does this comment corresponds to?



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:57-59
+  // Secondary mutable state, used only for note tag creation:
+  enum { NonNegLeft = 1, NonNegRight = 2 };
+  unsigned NonNegFlags = 0;

How about using the same enum type for these two? It would signify that these 
are used together.
Also, by only looking at the possible values, it's not apparent that these are 
bitflag values. A suffix might help the reader.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:60-63
+  // If there is an upper bound, it is <= the size of a primitive type
+  // (expressed in bits), so this is a very safe sentinel value:
+  enum { NoUpperBound = 999 };
+  unsigned UpperBound = NoUpperBound;

Have you considered using `std::optional`?
Given that the dimension of this variable is "bits", I think it would be great 
to have that in its name.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:81
+
+  bool shouldPerformPedanticChecks() {
+// The pedantic flag has no effect under C++20 because the affected issues

Check if your member functions can be declared as `const`.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:106-109
+  if (auto BRP = checkOvershift()) {
+Ctx.emitReport(std::move(BRP));
+return;
+  }

I'd prefer `BR` or `Report`. Since we know already that we are in the 
path-sensitive domain, `P` has less value there.
I'd apply this concept everywhere in the patch.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:124-127
+// Report a bug if we are left shifting a concrete signed value and it
+// would overflow (e.g. 2 << 31 with a 32-bit int) or reach the sign bit
+// under C (e.g. 1 << 31 with a 32-bit int; under C++ this is a
+// well-defined way to get INT_MIN):

To me it feels as if this comment should be at the definition of 
`checkLeftShiftOverflow`, and only keep here a short reminder. Similarly to how 
you did with the previous cases.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:148
+
+  const auto Val = Ctx.getSVal(operandExpr(Side));
+  const auto LimitVal = SVB.makeIntVal(Limit, Ctx.getASTContext().IntTy);





Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:190-191
+  std::string Msg =
+  llvm::formatv("The result of {0} shift is undefined because the right "
+"operand{1} is not smaller than {2}, the capacity of 
'{3}'",
+isLeftShift() ? "left" : "right", RightOpStr, LHSBitWidth,





Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker

[PATCH] D152436: [clang][analyzer] Move checker alpha.unix.StdCLibraryFunctions out of alpha.

2023-08-09 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Additionally, it might make sense to first "release" the checker, and after 
another llvm release, turn this into a Core checker.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152436/new/

https://reviews.llvm.org/D152436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152436: [clang][analyzer] Move checker alpha.unix.StdCLibraryFunctions out of alpha.

2023-08-09 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Thanks for the sample reports.
I'm fine if we want to make it a non-alpha (released) checker.

An orthogonal question is, whether we want to have it under the code package.
I'm not sure there are official guidance for elevating a checker to code, but 
here are my assumptions.
To me, these are the questions to see how valuable our reports are for the user.

- How many issues does it raise? Would we flood the user?
- How "interesting" those issues are? Do they have *actual* value for the user? 
(Not only niece edge-cases, that is fancy to know about, but actual users would 
genuinely commit such mistakes)
- How long those bug-paths are in practice? I'd argue, the longer they are, 
usually the less actionable they are for the user. Less actionable reports are 
also less valuable, or even harmful.
- In general, how understandable these reports are? Do we have all the 
interesting "notes" or "events" on the path?

Please, reflect on these questions and argue why we should have diagnostics of 
this kind?
Note that, I'm (probably) fine enabling modeling parts by default,but having 
diagnostics by default is another thing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152436/new/

https://reviews.llvm.org/D152436

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157129: [NFC] Fix unnecessary copy with auto.

2023-08-07 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

I went over the patch and I found only a single debatable case for taking by 
reference.
To clarify why I would prefer taking by value: value semantics is good for 
local reasoning; thus improves maintainability. We should only deviate from 
that when there is actual benefit for doing so.
Static analysis tools, such as Coverity, have false-positives. Some rules are 
better than others.
As a static analysis tool developer myself, I'd recommend carefully evaluating 
the findings before taking action for resolving them.
And if you find anything interesting, tool vendors are generally happy to 
receive feedback about their tool. I guess, they should as well understand that 
taking a pointer by reference doesn't improve anything.




Comment at: clang/include/clang/Analysis/FlowSensitive/StorageLocation.h:129
 llvm::dbgs() << "Existing children:\n";
-for ([[maybe_unused]] auto [Field, Loc] : Children) {
+for ([[maybe_unused]] const auto &[Field, Loc] : Children) {
   llvm::dbgs() << Field->getNameAsString() << "\n";

This is only 2 pointers. I'd rather take them by value.



Comment at: clang/lib/AST/Interp/ByteCodeEmitter.cpp:64
 
-  for (auto Cap : LC) {
+  for (const auto &Cap : LC) {
 unsigned Offset = R->getField(Cap.second)->Offset;

Only 2 pointers.



Comment at: clang/lib/AST/RecordLayoutBuilder.cpp:3728
 const CXXRecordDecl *Base = nullptr;
-for (auto I : CXXRD->bases()) {
+for (const auto &I : CXXRD->bases()) {
   if (I.isVirtual())

It's 24 bytes, which is like 3 pointers.



Comment at: clang/lib/Analysis/FlowSensitive/RecordOps.cpp:38
 
-  for (auto [Field, DstFieldLoc] : Dst.children()) {
+  for (const auto &[Field, DstFieldLoc] : Dst.children()) {
 StorageLocation *SrcFieldLoc = Src.getChild(*Field);

2 pointers.



Comment at: clang/lib/Analysis/FlowSensitive/RecordOps.cpp:86
 
-  for (auto [Field, FieldLoc1] : Loc1.children()) {
+  for (const auto &[Field, FieldLoc1] : Loc1.children()) {
 StorageLocation *FieldLoc2 = Loc2.getChild(*Field);

2 pointers.



Comment at: clang/lib/Sema/SemaRISCVVectorLookup.cpp:305
 // Create masked policy intrinsic.
-for (auto P : SupportedMaskedPolicies) {
+for (const auto &P : SupportedMaskedPolicies) {
   llvm::SmallVector PolicyPrototype =

It's just a single `long`. 8 bytes.
Taking by ref would definitely not help the situation.



Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2621
   CStringLengthTy::Factory &F = state->get_context();
-  for (auto [Reg, Len] : Entries) {
+  for (auto &[Reg, Len] : Entries) {
 if (SymbolRef Sym = Len.getAsSymbol()) {

8 + 16 bytes. Basically 3 pointers.



Comment at: clang/lib/StaticAnalyzer/Checkers/LocalizationChecker.cpp:851-852
   if (const ObjCMethodDecl *OMD = dyn_cast_or_null(D)) {
-for (auto [Idx, FormalParam] : llvm::enumerate(OMD->parameters())) {
+for (const auto &[Idx, FormalParam] :
+ llvm::enumerate(OMD->parameters())) {
   if (isAnnotatedAsTakingLocalized(FormalParam)) {

size_t and a pointer. 16 bytes in total.



Comment at: clang/lib/StaticAnalyzer/Checkers/MacOSKeychainAPIChecker.cpp:534
   if (ReturnSymbol)
-for (auto [Sym, AllocState] : AMap) {
+for (auto &[Sym, AllocState] : AMap) {
   if (ReturnSymbol == AllocState.Region)

8 + 16 bytes.



Comment at: clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp:2821
   ReallocPairsTy RP = state->get();
-  for (auto [Sym, ReallocPair] : RP) {
+  for (auto &[Sym, ReallocPair] : RP) {
 if (SymReaper.isDead(Sym) || SymReaper.isDead(ReallocPair.ReallocatedSym)) 
{

8 + 16 bytes.



Comment at: clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp:3081
   ReallocPairsTy RP = state->get();
-  for (auto [Sym, ReallocPair] : RP) {
+  for (auto &[Sym, ReallocPair] : RP) {
 // If the symbol is assumed to be NULL, remove it from consideration.

8 + 16 bytes.



Comment at: clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp:3558
 Out << Sep << "MallocChecker :" << NL;
-for (auto [Sym, Data] : RS) {
+for (auto &[Sym, Data] : RS) {
   const RefState *RefS = State->get(Sym);

8 + 16 bytes.



Comment at: clang/lib/StaticAnalyzer/Checkers/NullabilityChecker.cpp:945
   PropertyAccessesMapTy PropertyAccesses = State->get();
-  for (auto [PropKey, PropVal] : PropertyAccesses) {
+  for (auto &[PropKey, PropVal] : PropertyAccesses) {
 if (!PropVal.isConstrainedNonnull) 

[PATCH] D157129: [NFC] Fix unnecessary copy with auto.

2023-08-04 Thread Balázs Benics via Phabricator via cfe-commits
steakhal requested changes to this revision.
steakhal added a comment.
This revision now requires changes to proceed.

I can only vouch for the SttaticAnalyzer portion.
Those parts makes sense to me.
However, I've seen cases where it doesn't resonate.
As a rule of thumb, if we are "copying" only at most 3 pointers, I'd rather 
still just make a copy instead of by reference.
Please check if your recommended changes reflect this.

If you decide to take it by reference, prefer taking it also by const.

Let me know if you oppose.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157129/new/

https://reviews.llvm.org/D157129

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D150446: [analyzer] Check ArraySubscriptExprs in ArrayBoundCheckerV2

2023-08-04 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D150446#4560892 , @donat.nagy 
wrote:

> I'm abandoning this commit because it amalgamates several unrelated changes 
> and I think it'd be better to handle them separately:
>
> 1. First, there is a very simple, independent improvement related to 
> underflows and UnknownSpaces. I already created the separate commit D157104 
>  for this (reviews are welcome ;) ).
> 2. As the title of this commit says, I wanted to turn this checker into a 
> `Checker>` instead of a 
> `Checker`. I'm still planning to do this transition in a 
> separate commit because I feel that this will be needed to compose good 
> warning messages and there are a few situations like the testcase 
> `test_field` where the `check::Location` model produces counter-intuitive 
> results. (When I implement this, I'll also ensure that `*p` and `p->field` 
> are handled analogously to `p[0]`, but perhaps these will be in a separate 
> commit.)
> 3. Finally there is the "simplify `RegionRawOffsetV2::computeOffset` and fix 
> multidimensional array handling" change. This is the modification that was 
> responsible for the multitude of false positives caused by this commit; and I 
> don't have a quick fix for it (the engine abuses `ElementRegion` to record 
> pointer arithmetic and I didn't find a clear way to distinguish it from real 
> element access). On the short-term I think I'll need to accept that this 
> checker will produce false negatives in situations when one element of a 
> multi-dimensional array is over-indexed without overflowing the whole array 
> (e.g. if `arr` is declared as `int arr[5][5]`, then `arr[1][10]` over-indexes 
> `arr[1]`, but points inside the full area of the matrix); but fortunately 
> this is not a fatal limitation (it only produces false negatives, 
> multi-dimensional arrays are not too common).

I'd like to thank you your effort on this subject.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150446/new/

https://reviews.llvm.org/D150446

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157114: [clang][ASTImporter] Improve StructuralEquivalence algorithm on repeated friends

2023-08-04 Thread Balázs Benics via Phabricator via cfe-commits
steakhal resigned from this revision.
steakhal added a comment.

In general, I'm not really involved with the ASTImporter.
Please, put me as a reviewer if the ASTImporter directly affects the Static 
Analyzer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157114/new/

https://reviews.llvm.org/D157114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156312: [analyzer] Upstream BitwiseShiftChecker

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/docs/analyzer/checkers.rst:35
+core.BitwiseShift (C, C++)
+"""
+





Comment at: clang/docs/analyzer/checkers.rst:44-50
+Moreover, if the pedantic mode is activated by
+``-analyzer-config core.BitwiseShift:Pedantic=true``, then this checker also
+reports situations where the _left_ operand of a shift operator is negative or
+overflow occurs during the right shift of a signed value. (Most compilers
+handle these predictably, but the C standard and the C++ standards before C++20
+say that they're undefined behavior. In the C++20 standard these constructs are
+well-defined, so activating pedantic mode in C++20 has no effect.)

I believe shifting out the "sign bit" is UB because the representation of the 
signed integer type was not defined.
Prior C++20 (?), it was allowed to represent signed integer types as 
`two's-complement` (virtually every platform uses this), `one's complement`, 
`sign-magnitude`. [[ 
https://en.wikipedia.org/wiki/Signed_number_representations | Wiki ]].
And it turns out, all these three representations behave differently with 
negative values.
Check out [[ https://youtu.be/JhUxIVf1qok?t=2089 | JF's talk ]] from 2018 about 
this.

I'd need to think about the relevance of C++20 in this context, but here is 
what I think of now:
My idea is that the observed behavior is not influenced by the "standard", 
rather by the time we released C++20, they realized that no actual platform 
uses weird signed integer representations.
Given this observation, I'd argue that we shouldn't have this flag at all.



Comment at: clang/docs/analyzer/checkers.rst:81
+
+Ensure the shift operands are in proper range before shifting.
+

We should exactly elaborate on what "proper" means here.



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:142
+  BinaryOperator::Opcode 
Comparison,
+  long long Limit) {
+  SValBuilder &SVB = Ctx.getSValBuilder();

One does not frequently see the `long long` type.
Do you have a specific reason why `int` or `unsigned` wouldn't be good?



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:150
+  SVB.evalBinOp(State, Comparison, Val, LimitVal, SVB.getConditionType());
+  if (auto DURes = ResultVal.getAs()) {
+auto [StTrue, StFalse] = State->assume(DURes.value());

How about swapping these branches on the `ResultVal.isUndef()` predicate?



Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:152-156
+if (!StTrue) {
+  // We detected undefined behavior (the caller will report it).
+  State = StFalse;
+  return false;
+}

Generally, I don't favor mutating member functions.
It makes it harder to track how we ended up with a given State.




Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:157-158
+}
+// The code may be valid, so let's assume that it's valid:
+State = StTrue;
+if (StFalse) {

I wonder if we should add a message here that we introduced a new constraint: 
"X assumed to be non-negative".
To add such a message, you should do something similar to what is done in the 
`StdLibraryFunctionsChecker`.
Take the `C.getPredecessor()` and chain all your NoteTags by calling the `Pred 
= C.addTransition(NewState, Pred, Tag)` like this. This way. This is how one 
"sequence of ExplodedNodes" can be made.



Comment at: clang/test/Analysis/analyzer-config.c:36
 // CHECK-NEXT: cfg-temporary-dtors = true
+// CHECK-NEXT: consider-single-element-arrays-as-flexible-array-members = true
+// CHECK-NEXT: core.BitwiseShift:Pedantic = false

I'm pretty sure I got rid of this one :D
Consider rebasing on top of `llvm/main`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D153954: [clang][analyzer] Fix empty enum handling in EnumCastOutOfRange checker

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.

LGTM




Comment at: clang/test/Analysis/enum-cast-out-of-range.cpp:203
+
+enum class empty_unfixed {};
+

donat.nagy wrote:
> Consider using "specified" and "unspecified" instead of "fixed" and 
> "unfixed", because the rest of the test file uses them and in my opinion 
> "unfixed" looks strange in this context.  (I know that e.g. 
> https://en.cppreference.com/w/cpp/language/enum speaks about "fixed" 
> underlying context, but it doesn't use "unfixed".) 
How about calling it "plain" or "common"?



Comment at: clang/test/Analysis/enum-cast-out-of-range.cpp:212-214
+  empty_unfixed eu = static_cast(0); //should always be OK to 
zero initialize any enum
+  empty_fixed ef = static_cast(0);
+  empty_fixed_unsigned efu = static_cast(0);




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153954/new/

https://reviews.llvm.org/D153954

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156724: [StaticAnalyzer] Fix incorrect link to "note" diagnostics in HTML output

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGe73ae745b0d6: [analyzer] Fix incorrect link to 
"note" diagnostics in HTML output (authored by gruuprasad, committed 
by steakhal).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156724/new/

https://reviews.llvm.org/D156724

Files:
  clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
  clang/test/Analysis/html_diagnostics/notes-links.cpp


Index: clang/test/Analysis/html_diagnostics/notes-links.cpp
===
--- /dev/null
+++ clang/test/Analysis/html_diagnostics/notes-links.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -fR %t
+// RUN: mkdir %t
+// RUN: %clang_analyze_cc1 
-analyzer-checker=optin.cplusplus.UninitializedObject \
+// RUN:-analyzer-output=html -o %t -verify %s
+// RUN: cat %t/report-*.html | FileCheck %s
+
+struct A {
+  int *iptr;
+  int a;  // expected-note{{uninitialized field 'this->a'}}
+  int b;  // expected-note{{uninitialized field 'this->b'}}
+
+  A (int *iptr) : iptr(iptr) {} // expected-warning{{2 uninitialized fields at 
the end of the constructor call [optin.cplusplus.UninitializedObject]}}
+};
+
+void f() {
+  A a(0);
+}
+
+//CHECK:  Note:
+//CHECK-NOT:  
+//CHECK-SAME: line 9, column 7
+//CHECK-SAME: line 10, column 7
Index: clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
===
--- clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
+++ clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
@@ -592,11 +592,11 @@
 P->getLocation().asLocation().getExpansionLineNumber();
 int ColumnNumber =
 P->getLocation().asLocation().getExpansionColumnNumber();
+++NumExtraPieces;
 os << "Note:"
<< "line "
<< LineNumber << ", column " << ColumnNumber << ""
<< P->getString() << "";
-++NumExtraPieces;
   }
 }
 


Index: clang/test/Analysis/html_diagnostics/notes-links.cpp
===
--- /dev/null
+++ clang/test/Analysis/html_diagnostics/notes-links.cpp
@@ -0,0 +1,22 @@
+// RUN: rm -fR %t
+// RUN: mkdir %t
+// RUN: %clang_analyze_cc1 -analyzer-checker=optin.cplusplus.UninitializedObject \
+// RUN:-analyzer-output=html -o %t -verify %s
+// RUN: cat %t/report-*.html | FileCheck %s
+
+struct A {
+  int *iptr;
+  int a;  // expected-note{{uninitialized field 'this->a'}}
+  int b;  // expected-note{{uninitialized field 'this->b'}}
+
+  A (int *iptr) : iptr(iptr) {} // expected-warning{{2 uninitialized fields at the end of the constructor call [optin.cplusplus.UninitializedObject]}}
+};
+
+void f() {
+  A a(0);
+}
+
+//CHECK:  Note:
+//CHECK-NOT:  
+//CHECK-SAME: line 9, column 7
+//CHECK-SAME: line 10, column 7
Index: clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
===
--- clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
+++ clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
@@ -592,11 +592,11 @@
 P->getLocation().asLocation().getExpansionLineNumber();
 int ColumnNumber =
 P->getLocation().asLocation().getExpansionColumnNumber();
+++NumExtraPieces;
 os << "Note:"
<< "line "
<< LineNumber << ", column " << ColumnNumber << ""
<< P->getString() << "";
-++NumExtraPieces;
   }
 }
 
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156724: [StaticAnalyzer] Fix incorrect link to "note" diagnostics in HTML output

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

In D156724#4556774 , @gruuprasad 
wrote:

> @steakhal, ok, this is my first contribution, I don't have the commit access 
> yet. 
> Can you please merge this?

What should be the commit author?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156724/new/

https://reviews.llvm.org/D156724

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156724: [StaticAnalyzer] Fix incorrect link to "note" diagnostics in HTML output

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal accepted this revision.
steakhal added a comment.
This revision is now accepted and ready to land.

Looks correct.
Please merge it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156724/new/

https://reviews.llvm.org/D156724

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156787: [analyzer][attr] Add docs to ownership_* attributes

2023-08-03 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: clang/test/Analysis/malloc-annotations.c:151-155
+void af6(void) {
+  int *p = my_malloc(12);
+  my_hold(p);
+  *p = 4;
+}

Please, consider elaborating on this test with some comments, as it's not clear 
to me at first glance - without reading anything about the attributes - what's 
going on here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156787/new/

https://reviews.llvm.org/D156787

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156201: [ASTImporter] Fix corrupted RecordLayout introduced by circular referenced fields

2023-08-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

Have you considered back porting this to clang-17?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156201/new/

https://reviews.llvm.org/D156201

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D156787: [analyzer][attr] Add docs to ownership_* attributes

2023-08-01 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added a comment.

I haven't looked at the content but I have the feeling that we should probably 
backport this to release/17.x


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156787/new/

https://reviews.llvm.org/D156787

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   5   6   7   8   9   10   >