[PATCH] D59516: [analyzer] Add custom filter functions for GenericTaintChecker

Artem Dergachev via Phabricator via cfe-commits Thu, 07 Nov 2019 16:57:06 -0800

NoQ accepted this revision.
NoQ added a comment.

@NoQ: "Why not simply remove taint?"
@boga95: //*removes taint*//
@NoQ: "Hmm, now that i think about it, adding a 'no taint' marker might 
actually work correctly more often."

Like, if you have taint on `$x` and try to remove taint from `derived<$x, 
x.a>`, your current implementation will do nothing. But the approach with 
adding a 'no taint' marker will actually add a new marker and make subsequent 
lookups to `derived<$x, x.a>` will return the newly added marker, which is the 
correct behavior; additionally, `derived<$x, x.b>` would remain tainted (where 
`b != a`), which is also the correct behavior. It would have still failed when 
you describe the sub-region slightly differently, but that'd be a fairly minor 
glitch.

The right way to proceed further with the `removeTaint()` approach on 
`SymbolDerived` is to introduce `removePartialTaint()` that would complement 
`addPartialTaint()`. But that will require changing the data structure in the 
program state from a simple set of tainted sub-regions to a sophisticated tree 
of sub-regions that are marked up as tainted or not tainted. Which might have 
as well been a marker.

I still tend to believe that `removeTaint()` is the right approach, but it's a 
bit harder to get right and a bit worse if not enough effort is invested into 
it.

There definitely is, however, a good use for the `removeTaint()` function in 
both approaches: namely, our taint checker still lacks `checkDeadSymbols` :D

================
Comment at: clang/lib/StaticAnalyzer/Checkers/Taint.cpp:111-112
+ProgramStateRef taint::removeTaint(ProgramStateRef State, SymbolRef Sym) {
+  // If this is a symbol cast, remove the cast before adding the taint. Taint
+  // is cast agnostic.
+  while (const SymbolCast *SC = dyn_cast<SymbolCast>(Sym))
----------------
That's not entirely true. Like, if you check `(char)x`, you still have 24 bytes 
of `x` controlled by the attacker. But that's a good false-positive-proof 
approach.

================
Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:474-476
+  const FunctionDecl *FDecl = C.getCalleeDecl(CE);
+  if (!FDecl || FDecl->getKind() != Decl::Function)
+    return;
----------------
boga95 wrote:
> Szelethus wrote:
> > When do these happen? For implicit functions in C? 
> For example, a lambda doesn't have an FunctionDecl.
Lambda most certainly has a `FunctionDecl`, which is the declaration of its 
`operator()()`, and that's exactly what's going to be in `FDecl` if a lambda is 
invoked. However, the `getKind()` of this `FunctionDecl` will not be 
`Decl::Function` but `Decl::CXXMethod`, like of any other member function.

So this second check is checking that the function is a simple global function.

I recommend replacing with `!isa<CXXMethodDecl>(FDecl)`, purely for 
readability, or at least adding a comment.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59516/new/

https://reviews.llvm.org/D59516

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D59516: [analyzer] Add custom filter functions for GenericTaintChecker

Reply via email to