[PATCH] D71155: [analyzer] CERT: STR30-C

2020-02-03 Thread Csaba Dabis via Phabricator via cfe-commits
Charusso added a comment.

Thanks! I have felt that may I create a too complex symbolic value. Sorry for 
stealing your time.




Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2097-2098
 
+void CStringChecker::evalCharPtrCommon(CheckerContext ,
+   const CallExpr *CE) const {
+  CurrentFunctionDescription = "char pointer returning function";

NoQ wrote:
> Ok, so this whole thing is trying to evaluate `strchr`-like functions, right? 
> I've no idea what it does but the problem is much more trivial that what 
> you're trying to do here.
> 
> Branch 1:
> 1. Conjure the offset.
> 2. Add it to the original pointer (using `evalBinOp`).
> 3. Bind the result.
> Branch 2:
> 1. Bind null.
> 
> And you probably should drop branch 2 completely because usually there's no 
> indication that the character may in fact be missing in the string, and i 
> don't want more null dereference false alarms. So it's just branch 1.
> 
> So the whole function should be 3 lines of code (maybe with a couple line 
> breaks).
> 
> Well, maybe you should also handle the case where the original pointer is 
> null (not sure if it's an immediate UB or not).
> 
> This could be improved by actually taking into account the contents of the 
> string, but you don't seem to be trying to do this here.
> I've no idea what it does
I wanted to represent the root string's VarDecl in the symbolic value so I 
could refer to it. I think it became too complex and printing out the root 
string's variable name does not worth that complexity. But here it is from 
`str30-c-explain.cpp`:
```
  char *slash;
  slash = strrchr(pathname, '/');

  clang_analyzer_dump(slash);
  // CHECK: {pathname,conj_$1{long long, LC1, S{{[0-9]+}}, #1},char}

  clang_analyzer_explain(slash);
  // CHECK: pointer to element of type 'char' with index 'symbol of type 'long 
long' conjured at statement 'pathname'' of parameter 'pathname'
```

> This could be improved by actually taking into account the contents of the 
> string, but you don't seem to be trying to do this here.
At first the name of the string would be cool to print out and we need dataflow 
support to make it cool. Given that it is too complex I have reverted the 
modeling part. Also I wanted to create a simpler modeling with your modeling 
solution for my name-storing purposes, but I cannot.



Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2101-2103
+  SValBuilder  = C.getSValBuilder();
+  ProgramStateRef State, StateNull;
+  std::tie(StateNull, State) = C.getState()->assume(SVB.makeNull());

NoQ wrote:
> So, like, `StateNull` is the state in which it is assumed that `0` is 
> non-zero and `State` is the state in which it is assumed that `0` is zero?
> 
> I mean, apart from the naming flip-flop - i can tell you in advance that `0` 
> is zero, it's not a matter of assumptions.
I wanted to force out the state split of an unknown indexing: null and non-null 
but may it was too explicit and useless, yes.



Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2127-2128
+  if (const auto *SL = dyn_cast(SrcExpr->IgnoreImpCasts())) {
+const StringRegion *ResultMR =
+C.getStoreManager().getRegionManager().getStringRegion(SL);
+SVal ResultV = loc::MemRegionVal(ResultMR);

NoQ wrote:
> This is guaranteed to return the string region that you already have as the 
> value of that expression.
Whoops. The idea was that to handle string regions explicitly and may calculate 
the returning index as well.



Comment at: clang/test/Analysis/cert/str30-c-notes.cpp:29
+  if (slash) {
+// expected-note@-1 {{'slash' is non-null}}
+// expected-note@-2 {{Taking true branch}}

Charusso wrote:
> Needs to be an assumption.
Let us say it is non-null.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71155/new/

https://reviews.llvm.org/D71155



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71155: [analyzer] CERT: STR30-C

2020-02-03 Thread Csaba Dabis via Phabricator via cfe-commits
Charusso updated this revision to Diff 242159.
Charusso marked 8 inline comments as done.
Charusso added a comment.

- Remove the modeling from CStringChecker.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71155/new/

https://reviews.llvm.org/D71155

Files:
  clang/docs/analyzer/checkers.rst
  clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
  clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp
  clang/test/Analysis/cert/str30-c-notes.cpp
  clang/test/Analysis/cert/str30-c.cpp

Index: clang/test/Analysis/cert/str30-c.cpp
===
--- /dev/null
+++ clang/test/Analysis/cert/str30-c.cpp
@@ -0,0 +1,59 @@
+// RUN: %clang_analyze_cc1 \
+// RUN:  -analyzer-checker=core,unix,alpha.security.cert.str.30c \
+// RUN:  -verify %s
+
+// See the examples on the page of STR30-C:
+// https://wiki.sei.cmu.edu/confluence/display/c/STR30-C.+Do+not+attempt+to+modify+string+literals
+
+#include "../Inputs/system-header-simulator.h"
+
+#define EOF -1
+typedef __SIZE_TYPE__ size_t;
+typedef __typeof__((char *)0 - (char *)0) ptrdiff_t;
+
+void free(void *memblock);
+void *malloc(size_t size);
+
+char *strrchr(const char *str, int c);
+int puts(const char *str);
+
+namespace test_strrchr_bad {
+const char *get_dirname(const char *pathname) {
+  char *slash;
+  slash = strrchr(pathname, '/');
+  if (slash) {
+*slash = '\0'; /* Undefined behavior */
+// expected-warning@-1 {{'slash' is pointing to a constant string}}
+  }
+  return pathname;
+}
+
+int main(void) {
+  puts(get_dirname(__FILE__));
+  return 0;
+}
+} // namespace test_strrchr_bad
+
+namespace test_strrchr_good {
+char *get_dirname(const char *pathname, char *dirname, size_t size) {
+  const char *slash;
+  slash = strrchr(pathname, '/');
+  if (slash) {
+ptrdiff_t slash_idx = slash - pathname;
+if ((size_t)slash_idx < size) {
+  memcpy(dirname, pathname, slash_idx);
+  dirname[slash_idx] = '\0';
+  return dirname;
+}
+  }
+  return 0;
+}
+
+int main(void) {
+  char dirname[260];
+  if (get_dirname(__FILE__, dirname, sizeof(dirname))) {
+puts(dirname);
+  }
+  return 0;
+}
+} // namespace test_strrchr_good
Index: clang/test/Analysis/cert/str30-c-notes.cpp
===
--- /dev/null
+++ clang/test/Analysis/cert/str30-c-notes.cpp
@@ -0,0 +1,44 @@
+// RUN: %clang_analyze_cc1 \
+// RUN:  -analyzer-checker=core,unix,alpha.security.cert.str.30c \
+// RUN:  -analyzer-output=text -verify %s
+
+// See the examples on the page of STR30-C:
+// https://wiki.sei.cmu.edu/confluence/display/c/STR30-C.+Do+not+attempt+to+modify+string+literals
+
+#include "../Inputs/system-header-simulator.h"
+
+#define EOF -1
+typedef __SIZE_TYPE__ size_t;
+
+void free(void *memblock);
+void *malloc(size_t size);
+
+char *strrchr(const char *str, int c);
+int puts(const char *str);
+
+namespace test_strrchr_bad {
+const char *get_dirname(const char *pathname) {
+  char *slash;
+  slash = strrchr(pathname, '/');
+  // expected-note@-1 {{'strrchr' returns a pointer to the constant 'pathname'}}
+
+  slash = strrchr(slash, '/');
+  // expected-note@-1 {{'strrchr' returns a pointer to the constant 'slash'}}
+
+  if (slash) {
+// expected-note@-1 {{'slash' is non-null}}
+// expected-note@-2 {{Taking true branch}}
+
+*slash = '\0'; /* Undefined behavior */
+// expected-note@-1 {{'slash' is pointing to a constant string}}
+// expected-warning@-2 {{'slash' is pointing to a constant string}}
+  }
+  return pathname;
+}
+
+int main(void) {
+  puts(get_dirname(__FILE__));
+  // expected-note@-1 {{Calling 'get_dirname'}}
+  return 0;
+}
+} // namespace test_strrchr_bad
Index: clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/cert/StrChecker.cpp
@@ -12,6 +12,9 @@
 //  https://wiki.sei.cmu.edu/confluence/pages/viewpage.action?pageId=87152038
 //
 //  This checker is a base checker which consist of the following checkers:
+//  - '30c'
+//  https://wiki.sei.cmu.edu/confluence/display/c/STR30-C.+Do+not+attempt+to+modify+string+literals
+//
 //  - '31c'
 //  https://wiki.sei.cmu.edu/confluence/display/c/STR31-C.+Guarantee+that+storage+for+strings+has+sufficient+space+for+character+data+and+the+null+terminator
 //
@@ -54,9 +57,6 @@
 
 class StrCheckerBase
 : public Checker> {
-  using StrCheck = std::function;
-
 public:
   // We report a note when any of the calls in 'CDM' is being used because
   // they can cause a not null-terminated string. In this case we store the
@@ -73,6 +73,19 @@
 
   void checkPostStmt(const DeclStmt *S, CheckerContext ) const;
 
+  // STR30-C.
+  void checkMemchr(const CallEvent , const CallContext ,
+   CheckerContext ) const;
+  void checkStrchr(const CallEvent , const CallContext ,
+   CheckerContext ) 

[PATCH] D71155: [analyzer] CERT: STR30-C

2020-02-03 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

Let's separate `CStringChecker` improvements into a separate patch and have a 
separate set of tests for it.




Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2097-2098
 
+void CStringChecker::evalCharPtrCommon(CheckerContext ,
+   const CallExpr *CE) const {
+  CurrentFunctionDescription = "char pointer returning function";

Ok, so this whole thing is trying to evaluate `strchr`-like functions, right? 
I've no idea what it does but the problem is much more trivial that what you're 
trying to do here.

Branch 1:
1. Conjure the offset.
2. Add it to the original pointer (using `evalBinOp`).
3. Bind the result.
Branch 2:
1. Bind null.

And you probably should drop branch 2 completely because usually there's no 
indication that the character may in fact be missing in the string, and i don't 
want more null dereference false alarms. So it's just branch 1.

So the whole function should be 3 lines of code (maybe with a couple line 
breaks).

Well, maybe you should also handle the case where the original pointer is null 
(not sure if it's an immediate UB or not).

This could be improved by actually taking into account the contents of the 
string, but you don't seem to be trying to do this here.



Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2101-2103
+  SValBuilder  = C.getSValBuilder();
+  ProgramStateRef State, StateNull;
+  std::tie(StateNull, State) = C.getState()->assume(SVB.makeNull());

So, like, `StateNull` is the state in which it is assumed that `0` is non-zero 
and `State` is the state in which it is assumed that `0` is zero?

I mean, apart from the naming flip-flop - i can tell you in advance that `0` is 
zero, it's not a matter of assumptions.



Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2127-2128
+  if (const auto *SL = dyn_cast(SrcExpr->IgnoreImpCasts())) {
+const StringRegion *ResultMR =
+C.getStoreManager().getRegionManager().getStringRegion(SL);
+SVal ResultV = loc::MemRegionVal(ResultMR);

This is guaranteed to return the string region that you already have as the 
value of that expression.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71155/new/

https://reviews.llvm.org/D71155



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71155: [analyzer] CERT: STR30-C

2019-12-13 Thread Csaba Dabis via Phabricator via cfe-commits
Charusso marked 3 inline comments as done.
Charusso added a comment.

In order to bypass the `CK_LValueToRValue` `evalCast()` we have to create en 
`ElementRegion` as a return-value of the problematic function call. In that 
case for a mythical reason we miss the fact the pointer is nullable. I have not 
figured out yet why, but tried to create an appropriate return-value.




Comment at: clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp:2122
+
+  SVal ConjuredV = SVB.getConjuredHeapSymbolVal(CE, LCtx, C.blockCount());
+  SVal ResultV = loc::MemRegionVal(SVB.getRegionManager().getElementRegion(

NoQ wrote:
> Charusso wrote:
> > NoQ wrote:
> > > Why "heap"?
> > Well, a string which length is at least 16 characters long is going to be 
> > allocated on the heap. I have to conjure the string here to create its 
> > element.
> o.o
> ```lang=c++
> void foo() {
>   // This string is 20 characters long
>   // but it's clearly on the stack.
>   char str[] = "12345678901234567890";
>   // This one is therefore also on the stack.
>   char *ptr = strchr(str, '0');
> }
> ```
> Well, a string which length is at least 16 characters long is going to be 
> allocated on the heap. I have to conjure the string here to create its 
> element.

I really felt that the `std::string` should behave like the C-strings, but 
C-strings are on the stack whatever it takes, yes, my bad. Thanks for pointing 
that out!



Comment at: clang/test/Analysis/cert/str30-c-notes.cpp:29
+  if (slash) {
+// expected-note@-1 {{'slash' is non-null}}
+// expected-note@-2 {{Taking true branch}}

Needs to be an assumption.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71155/new/

https://reviews.llvm.org/D71155



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits