NoQ added a comment.

A while back I had a look into implementing some of the CERT checkers and my 
vague recollection from these attempts was that it's a bad idea to take CERT 
examples literally. Most of the CERT rules are explained in an informal natural 
language and it makes CERT great for demonstrating common mistakes that 
programmers make when they write in C/C++. Humans can easily understand the 
problem by reading CERT pages. But it doesn't necessarily mean that static 
analysis tools can be built by generalizing over the examples from the CERT 
wiki.

So i suggest that we make one more step back and agree upon what exactly are we 
checking here. I.e., basically, agree on the tests. Because right now it looks 
like you're trying to blindly generalize over the examples: //"The CERT example 
has a `strcpy()` into a fixed-size buffer, let's warn on all `strcpy()`s into 
fixed-size buffers!"//. This way your check does indeed pass the tests, but it 
doesn't make sense both from the rule's perspective (the rule never said that 
it's wrong to `strcpy()` into a fixed-size buffer, it only said that you should 
anyhow guarantee that the storage is sufficient) and from the user's 
perspective (you won't be able to reduce the gap between false positives and 
false negatives enough for the check to be useful, even with the subsequent 
whack-a-mole of adding more heuristics, because what the check is checking is 
relatively orthogonal to the actual problem you're trying to solve).

For example, in the example titled "Noncompliant Code Example (argv)", it is 
explicitly stated that the problem is not only that the buffer is fixed-size 
and `strcpy()` is made, but also that the original string //is controlled by 
the attacker//. The right analysis to catch such issues is //taint analysis//. 
It's a typical taint problem. I honestly believe you should try to solve it by 
combining the taint checker with the string checker: "if string length metadata 
is tainted (in particular, if the string itself is tainted) and is potentially 
larger than the destination buffer size (eg., unconstrained), warn". With the 
recent advancements in the development of the taint checker, i think you can 
get pretty far this way without constantly struggling with false positives.



================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:801
+  Dependencies<[StrCheckerBase]>,
+  Documentation<HasDocumentation>;
+
----------------
Charusso wrote:
> In my mind the documentation is the CERT rule's page. Is it okay?
I'd prefer our own documentation that may link to the CERT page but should also 
explain, say, which parts of the rule are covered by the checker (even if it's 
"all of them", they may add more rules in the future), and probably some 
specifics of the checker's behavior if they aren't obvious from the rule.


================
Comment at: clang/test/Analysis/cert/str31-c-fp-suppression.cpp:33-34
+  // string, so that we could see whether the string is null-terminated on
+  // every path or not. Because the 'src' could be not null-terminated this
+  // report is fine, but if 'src' is null-terminated this report is a false
+  // positive because the 'src' fits into 'dest' with the null terminator
----------------
Mmm, no, it's not fine. The warning is saying something that's not correct. 
Even if `src` is not null-terminated, under the assumption that no 
out-of-bounds read UB occurs in the `strcpy()` call, `dest` is going to always 
be null-terminated and have a `strlen` of at most 127. And by continuing our 
analysis after `strcpy()`, we inform the user that we assume that no UB has 
happened on the current execution path yet.

Traditionally checkers either warn immediately when they can detect an error, 
or assume that the error has not happened. For example, when we dereference a 
pointer, if the pointer is not known to be null, we actively assume that it's 
non-null. Similarly, if the `src` string is not null-terminated, we should warn 
at the invocation of `strcpy`; if it's not known whether it's null-terminated 
or not, we should actively assume that it's null-terminated at `strcpy`.

Also, I usually don't agree with the statements like "This report helped us 
find a real bug, therefore it's a true positive". You can find a bug in 
literally any code if you stare at it long enough! I'm glad you found the bug 
anyway, but if the report is not pointing at the problem directly, we're not 
doing a great job.


================
Comment at: clang/test/Analysis/cert/str31-c-fp-suppression.cpp:51-53
+  strcpy(dest, src);
+  do_something(dest);
+  // expected-warning@-1 {{'dest' is not null-terminated}}
----------------
Looks like a false positive. Consider the following perfectly correct (even if 
a bit inefficient) code that only differs from the current code only in names 
and context:
```lang=c++
void copy_and_test_if_short(const char *src) {
  char dest[128];
  strcpy(dest, src);
  warn_if_starts_with_foo(dest);
}

void copy_and_test(const char *src) {
  if (strlen(src) < 64)
    copy_and_test_if_short(src);
  else
    copy_and_test_if_long(src);
}
```


================
Comment at: clang/test/Analysis/cert/str31-c-notes.cpp:22
+  if (gets(buf)) {}
+  // expected-note@-1 {{'gets' could write outside of 'buf'}}
+  // expected-note@-2 {{Assuming the condition is false}}
----------------
I still believe that this should be *the* warning in this code. This is already 
broken code.


================
Comment at: clang/test/Analysis/cert/str31-c-notes.cpp:26-27
+
+  free(buf);
+  // expected-warning@-1 {{'buf' is not null-terminated}}
+  // expected-note@-2 {{'buf' is not null-terminated}}
----------------
It is not wrong to free a buffer that isn't null-terminated. We shouldn't warn 
here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70411/new/

https://reviews.llvm.org/D70411



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to