[PATCH] D73536: [analyser][taint] Remove taint from symbolic expressions if used in comparisons

2020-02-03 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus added a comment.

In D73536#1845031 , @NoQ wrote:

> > Describing value constraints in the taint config file is unfeasible.
>
> This is the only correct way to go, because, as you yourself point out, every 
> sink function (or other use of tainted value) does indeed have different 
> constraint requirements.


Over the last couple months I've been pretty conflicted on config files. While 
I see that it is the correct solution, I also fear that just like attributes, 
they require tedious work to set up and maintain. With that said, its been a 
while since I've evaluated analyses that had taint analysis in the focus, so I 
have no concrete data on whether its worth trying to reduce their count, though 
I suspect they wouldn't show the entire picture, as very few checkers utilize 
taintedness.

> What exactly is preventing you from describing value constraints in the 
> config file?

This sounds like moving, or even worse duplicating the same checks both in a 
tool-specific config file and in the code. I sympathize with this as well:

>   int idx;
>   scanf("%d", );
>   
>   if (idx < 0 || 42 < idx) { // tainted
> return -1;
>   }
>   mySink(idx); // Warning {{Untrusted data is passed to a user-defined sink}}
>   return idx;
> 
> Even though we know at the point of `mySink` is called we know that `idx` is 
> properly constrained, `mySink` will emit warning since `idx` holds tainted 
> value.

This is valid, and I totally see how we can't possibly remove the taint (or in 
other words, prove to the analyzer that we properly checked the value) before 
passing it into a sink (as I understand it).

In summary, I think making decision like this is maybe a bit premature before 
we have some more results. It would be interesting to see what happens on 
larger projects once more checkers utilize taintedness, and act proactively, 
because

> Checking the wrong requirements is a very common source of security issues 
> and we cannot afford destroying our ability to catch them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73536/new/

https://reviews.llvm.org/D73536



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D73536: [analyser][taint] Remove taint from symbolic expressions if used in comparisons

2020-01-28 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ requested changes to this revision.
NoQ added a comment.
This revision now requires changes to proceed.

> Describing value constraints in the taint config file is unfeasible.

This is the only correct way to go, because, as you yourself point out, every 
sink function (or other use of tainted value) does indeed have different 
constraint requirements. Checking the wrong requirements is a very common 
source of security issues and we cannot afford destroying our ability to catch 
them.

Like, checking that the tainted value is non-zero is a good idea before 
dividing by that value, but it's clearly not sufficient before using the same 
value as an array index.

What exactly is preventing you from describing value constraints in the config 
file? Like, i get it that the generic case may get pretty rough (given that 
constraints may be potentially arbitrary algebraic expressions over function 
argument values and possibly other values), and i guess you could do a "poor 
man's" wildcard suppression for some sinks ("the constraint for this sink is so 
complicated that let's see if it was checked at all and think of it as fine if 
it was), but we definitely should be able to try harder when it matters.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73536/new/

https://reviews.llvm.org/D73536



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D73536: [analyser][taint] Remove taint from symbolic expressions if used in comparisons

2020-01-28 Thread Balázs Benics via Phabricator via cfe-commits
steakhal created this revision.
steakhal added reviewers: NoQ, Szelethus.
steakhal added a project: clang.
Herald added subscribers: cfe-commits, JDevlieghere.
steakhal added a subscriber: boga95.

**Remove taint from symbolic expressions if used in comparison expressions.**

**Problem statement and background:**
TaintConfig was introduced by D59555 .
In that config file users are able to specify functions (//sinks//) which are 
emitting warnings if tainted values are passed to it.
This is great, but we don't have the facilities to suppress those warning.

Consider this example:

  int idx;
  scanf("%d", );
  
  if (idx < 0 || 42 < idx) { // tainted
return -1;
  }
  mySink(idx); // Warning {{Untrusted data is passed to a user-defined sink}}
  return idx;

Even though we know at the point of `mySink` is called we know that `idx` is 
properly constrained, `mySink` will emit warning since `idx` holds tainted 
value.

**Considered solutions:**
Describing value constraints in the taint config file is unfeasible.
We could loosen the rules for evaluating sink functions by checking taint only 
if the value is not constrained //enough//, but this would require a heuristic 
to decide that. I believe that no such heuristic would be satisfying.

**Provided solution:**
AFAIK the option we have left is to remove taint from certain symbolic 
expressions when a tainted expression occur in a comparison expression. This 
could be fine tuned by a heuristic, let's say:
Remove taint if exactly one operand of the comparison is tainted.
Ignore equality comparisons against null pointer constants.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D73536

Files:
  clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  clang/lib/StaticAnalyzer/Checkers/Taint.cpp
  clang/lib/StaticAnalyzer/Checkers/Taint.h
  clang/test/Analysis/taint-tester.c

Index: clang/test/Analysis/taint-tester.c
===
--- clang/test/Analysis/taint-tester.c
+++ clang/test/Analysis/taint-tester.c
@@ -67,11 +67,14 @@
   int y = (in << (x << in)) * 5;// expected-warning + {{tainted}}
   // The next line tests integer to integer cast.
   int z = y & inn; // expected-warning + {{tainted}}
-  if (y == 5) // expected-warning + {{tainted}}
-m = z | z;// expected-warning + {{tainted}}
-  else
-m = inn;
-  int mm = m; // expected-warning + {{tainted}}
+  if (y == 5) {// expected-warning + {{tainted}}
+// Since the only tainted symbol y depended on was the value of x, the
+// check on y in the condition marked the value of x not tainted anymore.
+m = z | z; // no warning
+  } else {
+m = inn; // no warning
+  }
+  int mm = m; // no warning
 }
 
 // Test getenv.
@@ -168,6 +171,43 @@
   free(line); // expected-warning + {{tainted}}
 }
 
+int conditionRemovesTaintTest() {
+  int idx;
+  scanf("%d", ); // The value of idx become tainted.
+  // Relational operators comparing a tainted value to a non-tainted will
+  // remove taint.
+  if (idx < 0 || 42 < idx) { // expected-warning + {{tainted}}
+int idx2 = idx; // no warning
+return -1;
+  }
+  // Not tainted now, since appeared in the condition previously.
+  return idx; // no warning
+}
+
+int conditionDoesNotRemoveTaintTest() {
+  int idx1, idx2;
+  scanf("%d %d", , );
+
+  // Bot operands of the comparison are tainted.
+  // Taint won't be removed.
+  if (idx1 < idx2) { // expected-warning + {{tainted}}
+int tmp = idx1;  // expected-warning + {{tainted}}
+return -1;
+  }
+
+
+  int sum = idx1 + idx2; // expected-warning + {{tainted}}
+
+  // Relation operator removes taint from all dependent symbolic expressions.
+  if (0 <= sum && sum < 42) { // expected-warning {{tainted}}
+int tmp1 = idx1; // no warning
+int tmp2 = idx2; // no warning
+int tmp3 = sum;  // no warning
+  }
+
+  return idx1 + idx2 + sum; // no warning
+}
+
 // Test propagation functions - the ones that propagate taint from arguments to
 // return value, ptr arguments.
 
Index: clang/lib/StaticAnalyzer/Checkers/Taint.h
===
--- clang/lib/StaticAnalyzer/Checkers/Taint.h
+++ clang/lib/StaticAnalyzer/Checkers/Taint.h
@@ -45,6 +45,9 @@
 const MemRegion *R,
 TaintTagType Kind = TaintTagGeneric);
 
+LLVM_NODISCARD ProgramStateRef removeTaint(ProgramStateRef State, const Stmt *S,
+   const LocationContext *LCtx);
+
 LLVM_NODISCARD ProgramStateRef removeTaint(ProgramStateRef State, SVal V);
 
 LLVM_NODISCARD ProgramStateRef removeTaint(ProgramStateRef State,
Index: clang/lib/StaticAnalyzer/Checkers/Taint.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/Taint.cpp
+++ clang/lib/StaticAnalyzer/Checkers/Taint.cpp
@@ -92,6 +92,11 @@
   return NewState;
 }