[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-08 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 closed this revision.
boga95 marked 10 inline comments as done.
boga95 added a comment.

Closed by 080ecafdd8b3e990e5ad19202d089c91c9c9b164 
.




Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:251
+const llvm::StringLiteral GenericTaintChecker::MsgCustomSink =
+"Untrusted data is passed to a user defined sink";
+;

NoQ wrote:
> It should be "user-defined" because it's a single adjective.
> 
> I recommend against using the word "sink" in user-facing messages because 
> it's too jargony. Do we have a better word for this?
Currently, I don't have any better idea. I will think about it.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:118
   /// Check for CWE-134: Uncontrolled Format String.
-  static const char MsgUncontrolledFormatString[];
+  static constexpr llvm::StringLiteral MsgUncontrolledFormatString =
+  "Untrusted data is used as a format string "

steakhal wrote:
> Shouldn't we still need an out-of-class initializer part for each static 
> constexpr class member variable?
> These would provide the memory locations for the declarations.
> ```
> constexpr llvm::StringLiteral 
> GenericTaintChecker::MsgUncontrolledFormatString;
> constexpr llvm::StringLiteral GenericTaintChecker::MsgSanitizeSystemArgs;
> constexpr llvm::StringLiteral GenericTaintChecker::MsgTaintedBufferSize;
> constexpr llvm::StringLiteral GenericTaintChecker::MsgCustomSink;
> ```
Constexpr values cannot be initialized out of the class, that's why I moved 
them to here.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:191
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,

steakhal wrote:
> Szelethus wrote:
> > How about only passing `CustomPropagations`?
> I would even consider to move this function out of the whole class. (Not only 
> this function, but the others as well. Like isStdin, etc.)
> I think pure, free-functions (in an anonymous namespace) are easier to reason 
> about.
I could do that in a separate patch if necessary.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:605
 // Mark the given argument.
-assert(ArgNum < CE->getNumArgs());
 State = State->add(ArgNum);

Szelethus wrote:
> I get that there isn't much substance to this assert, but why remove it? We 
> might as well populate the lines in between that and the branch.
I think there is no need for this. Maybe I can make the ArgNum const.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-07 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:118
   /// Check for CWE-134: Uncontrolled Format String.
-  static const char MsgUncontrolledFormatString[];
+  static constexpr llvm::StringLiteral MsgUncontrolledFormatString =
+  "Untrusted data is used as a format string "

Shouldn't we still need an out-of-class initializer part for each static 
constexpr class member variable?
These would provide the memory locations for the declarations.
```
constexpr llvm::StringLiteral GenericTaintChecker::MsgUncontrolledFormatString;
constexpr llvm::StringLiteral GenericTaintChecker::MsgSanitizeSystemArgs;
constexpr llvm::StringLiteral GenericTaintChecker::MsgTaintedBufferSize;
constexpr llvm::StringLiteral GenericTaintChecker::MsgCustomSink;
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-06 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus accepted this revision.
Szelethus added a comment.

I'm fine with moving in-class function into anonymous namespace later. LGTM, 
feel free to commit!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-06 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 updated this revision to Diff 219172.
boga95 marked 7 inline comments as done.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637

Files:
  clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  clang/test/Analysis/taint-generic.c

Index: clang/test/Analysis/taint-generic.c
===
--- clang/test/Analysis/taint-generic.c
+++ clang/test/Analysis/taint-generic.c
@@ -338,3 +338,45 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2);
+  // expected-warning@-1 {{Untrusted data is passed to a user-defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x);
+  // expected-warning@-1 {{Untrusted data is passed to a user-defined sink}}
+}
Index: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -115,27 +115,44 @@
   static Optional getPointedToSVal(CheckerContext &C, const Expr *Arg);
 
   /// Check for CWE-134: Uncontrolled Format String.
-  static const char MsgUncontrolledFormatString[];
+  static constexpr llvm::StringLiteral MsgUncontrolledFormatString =
+  "Untrusted data is used as a format string "
+  "(CWE-134: Uncontrolled Format String)";
   bool checkUncontrolledFormatString(const CallExpr *CE,
  CheckerContext &C) const;
 
   /// Check for:
   /// CERT/STR02-C. "Sanitize data passed to complex subsystems"
   /// CWE-78, "Failure to Sanitize Data into an OS Command"
-  static const char MsgSanitizeSystemArgs[];
+  static constexpr llvm::StringLiteral MsgSanitizeSystemArgs =
+  "Untrusted data is passed to a system call "
+  "(CERT/STR02-C. Sanitize data passed to complex subsystems)";
   bool checkSystemCall(const CallExpr *CE, StringRef Name,
CheckerContext &C) const;
 
   /// Check if tainted data is used as a buffer size ins strn.. functions,
   /// and allocators.
-  static const char MsgTaintedBufferSize[];
+  static constexpr llvm::StringLiteral MsgTaintedBufferSize =
+  "Untrusted data is used to specify the buffer size "
+  "(CERT/STR31-C. Guarantee that storage for strings has sufficient space "
+  "for character data and the null terminator)";
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static constexpr llvm::StringLiteral MsgCustomSink =
+  "Untrusted data is passed to a user-defined sink";
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report if the expression is tainted or points to tainted data.
-  bool generateReportIfTainted(const Expr *E, const char Msg[],
+  bool generateReportIfTainted(const Expr *E, StringRef Msg,
CheckerContext &C) const;
 
+  struct TaintPropagationRule;
+  using NameRuleMap = llvm::StringMap;
+  using NameArgMap = llvm::StringMap;
+
   /// A struct used to specify taint propagation rules for a function.
   ///
   /// If any of the possible taint source arguments is tainted, all of the
@@ -175,7 +192,8 @@
 
 /// Get the propagation rule for a given function.
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const NameRuleMap &CustomPropagations,
+const FunctionDecl *FDecl, StringRef Name,
 CheckerContext &C);
 
 void addSrcArg(unsigned A) { SrcArgs.push_back(A); }
@@ -211,9 +229,6 @@
CheckerContext &C);
   };
 
-  using NameRuleMap = llvm::StringMap;
-  using NameArgMap = llvm::StringMap;
-
   /// Defines a map between the propagation function's name and
   /// TaintPropagationRule.
   NameRuleMap CustomPropagations;
@@ -2

[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-05 Thread Balázs Benics via Phabricator via cfe-commits
steakhal added inline comments.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:191
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,

Szelethus wrote:
> How about only passing `CustomPropagations`?
I would even consider to move this function out of the whole class. (Not only 
this function, but the others as well. Like isStdin, etc.)
I think pure, free-functions (in an anonymous namespace) are easier to reason 
about.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:844
   if (Config)
-Checker->parseConfiguration(Mgr, Option, std::move(Config).getValue());
+Checker->parseConfiguration(Mgr, Option, std::move(Config.getValue()));
 }

Szelethus wrote:
> Wasn't this commited before?
Yes it was 
(https://github.com/llvm/llvm-project/blob/master/clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp#L814).
I would kindly request a rebase.



Comment at: test/Analysis/taint-generic.c:377
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a 
user-defined sink}}
+  mySink(1, x, 2); // no-warning

We could use this syntacs to achieve shorter lines. Note that `@-1`. Same for 
all the other lines.
```
mySink(x, 1, 2);
// expected-warning@-1 {{Untrusted data is passed to a user-defined sink}}
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-05 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus added a comment.

Also, please mark inlines done as you fix them :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-05 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus accepted this revision.
Szelethus added a reviewer: steakhal.
Szelethus added a comment.
This revision is now accepted and ready to land.

Some nits inline, otherwise LGTM. @steakhal, do you have anything to add to 
this?




Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:191
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,

How about only passing `CustomPropagations`?



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:605
 // Mark the given argument.
-assert(ArgNum < CE->getNumArgs());
 State = State->add(ArgNum);

I get that there isn't much substance to this assert, but why remove it? We 
might as well populate the lines in between that and the branch.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:844
   if (Config)
-Checker->parseConfiguration(Mgr, Option, std::move(Config).getValue());
+Checker->parseConfiguration(Mgr, Option, std::move(Config.getValue()));
 }

Wasn't this commited before?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-09-02 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 updated this revision to Diff 218395.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637

Files:
  lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  test/Analysis/taint-generic.c

Index: test/Analysis/taint-generic.c
===
--- test/Analysis/taint-generic.c
+++ test/Analysis/taint-generic.c
@@ -338,3 +338,43 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a user-defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x); // expected-warning {{Untrusted data is passed to a user-defined sink}}
+}
Index: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -115,25 +115,38 @@
   static Optional getPointedToSVal(CheckerContext &C, const Expr *Arg);
 
   /// Check for CWE-134: Uncontrolled Format String.
-  static const char MsgUncontrolledFormatString[];
+  static constexpr llvm::StringLiteral MsgUncontrolledFormatString =
+  "Untrusted data is used as a format string "
+  "(CWE-134: Uncontrolled Format String)";
   bool checkUncontrolledFormatString(const CallExpr *CE,
  CheckerContext &C) const;
 
   /// Check for:
   /// CERT/STR02-C. "Sanitize data passed to complex subsystems"
   /// CWE-78, "Failure to Sanitize Data into an OS Command"
-  static const char MsgSanitizeSystemArgs[];
+  static constexpr llvm::StringLiteral MsgSanitizeSystemArgs =
+  "Untrusted data is passed to a system call "
+  "(CERT/STR02-C. Sanitize data passed to complex subsystems)";
   bool checkSystemCall(const CallExpr *CE, StringRef Name,
CheckerContext &C) const;
 
   /// Check if tainted data is used as a buffer size ins strn.. functions,
   /// and allocators.
-  static const char MsgTaintedBufferSize[];
+  static constexpr llvm::StringLiteral MsgTaintedBufferSize =
+  "Untrusted data is used to specify the buffer size "
+  "(CERT/STR31-C. Guarantee that storage for strings has sufficient space "
+  "for character data and the null terminator)";
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static constexpr llvm::StringLiteral MsgCustomSink =
+  "Untrusted data is passed to a user-defined sink";
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report if the expression is tainted or points to tainted data.
-  bool generateReportIfTainted(const Expr *E, const char Msg[],
+  bool generateReportIfTainted(const Expr *E, const StringRef Msg,
CheckerContext &C) const;
 
   /// A struct used to specify taint propagation rules for a function.
@@ -175,7 +188,8 @@
 
 /// Get the propagation rule for a given function.
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,
 CheckerContext &C);
 
 void addSrcArg(unsigned A) { SrcArgs.push_back(A); }
@@ -227,19 +241,6 @@
 
 const unsigned GenericTaintChecker::ReturnValueIndex;
 const unsigned GenericTaintChecker::InvalidArgIndex;
-
-const char GenericTaintChecker::MsgUncontrolledFormatString[] =
-"Untrusted data is used as a format string "
-"(CWE-134: Uncontrolled Format String)";
-
-const char GenericTaintChecker::MsgSanitizeSystemArgs[] =
-"Untrusted data is passed to a system call "
-"(CERT/STR02-C. Sanitize data passed to complex subsystems)";
-
-const char GenericTaintChecker::MsgTaintedBufferSize[] =
-"Untrusted data is used to specify the buffer s

[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-08-30 Thread Artem Dergachev via Phabricator via cfe-commits
NoQ added a comment.

I have a few immediate style issues but otherwise it looks good :)




Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:237
 
-const char GenericTaintChecker::MsgUncontrolledFormatString[] =
+const llvm::StringLiteral GenericTaintChecker::MsgUncontrolledFormatString =
 "Untrusted data is used as a format string "

`StringLiteral`s need to always be declared as `constexpr`.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:251
+const llvm::StringLiteral GenericTaintChecker::MsgCustomSink =
+"Untrusted data is passed to a user defined sink";
+;

It should be "user-defined" because it's a single adjective.

I recommend against using the word "sink" in user-facing messages because it's 
too jargony. Do we have a better word for this?



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:252
+"Untrusted data is passed to a user defined sink";
+;
 } // end of anonymous namespace

This semicolon looks unnecessary.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:595-596
+  StringRef Name = C.getCalleeName(C.getCalleeDecl(CE));
+  llvm::errs() << "Skip out of bound SrcArg: " << Name << ":" << ArgNum
+   << '\n';
+  continue;

These debug prints should not land.



Comment at: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:840
+
+  const GenericTaintChecker::ArgVector &Args = It->getValue();
+  for (unsigned ArgNum : Args) {

The `GenericTaintChecker::` qualifier is unnecessary.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-08-30 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 added a comment.

Should I do anything or it is ready?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-08-14 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 updated this revision to Diff 215268.
boga95 marked an inline comment as done.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637

Files:
  lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  test/Analysis/taint-generic.c

Index: test/Analysis/taint-generic.c
===
--- test/Analysis/taint-generic.c
+++ test/Analysis/taint-generic.c
@@ -338,3 +338,43 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a user defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x); // expected-warning {{Untrusted data is passed to a user defined sink}}
+}
Index: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -115,25 +115,30 @@
   static Optional getPointedToSVal(CheckerContext &C, const Expr *Arg);
 
   /// Check for CWE-134: Uncontrolled Format String.
-  static const char MsgUncontrolledFormatString[];
+  static const llvm::StringLiteral MsgUncontrolledFormatString;
   bool checkUncontrolledFormatString(const CallExpr *CE,
  CheckerContext &C) const;
 
   /// Check for:
   /// CERT/STR02-C. "Sanitize data passed to complex subsystems"
   /// CWE-78, "Failure to Sanitize Data into an OS Command"
-  static const char MsgSanitizeSystemArgs[];
+  static const llvm::StringLiteral MsgSanitizeSystemArgs;
   bool checkSystemCall(const CallExpr *CE, StringRef Name,
CheckerContext &C) const;
 
   /// Check if tainted data is used as a buffer size ins strn.. functions,
   /// and allocators.
-  static const char MsgTaintedBufferSize[];
+  static const llvm::StringLiteral MsgTaintedBufferSize;
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static const llvm::StringLiteral MsgCustomSink;
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report if the expression is tainted or points to tainted data.
-  bool generateReportIfTainted(const Expr *E, const char Msg[],
+  bool generateReportIfTainted(const Expr *E, const StringRef Msg,
CheckerContext &C) const;
 
   /// A struct used to specify taint propagation rules for a function.
@@ -175,7 +180,8 @@
 
 /// Get the propagation rule for a given function.
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,
 CheckerContext &C);
 
 void addSrcArg(unsigned A) { SrcArgs.push_back(A); }
@@ -228,18 +234,22 @@
 const unsigned GenericTaintChecker::ReturnValueIndex;
 const unsigned GenericTaintChecker::InvalidArgIndex;
 
-const char GenericTaintChecker::MsgUncontrolledFormatString[] =
+const llvm::StringLiteral GenericTaintChecker::MsgUncontrolledFormatString =
 "Untrusted data is used as a format string "
 "(CWE-134: Uncontrolled Format String)";
 
-const char GenericTaintChecker::MsgSanitizeSystemArgs[] =
+const llvm::StringLiteral GenericTaintChecker::MsgSanitizeSystemArgs =
 "Untrusted data is passed to a system call "
 "(CERT/STR02-C. Sanitize data passed to complex subsystems)";
 
-const char GenericTaintChecker::MsgTaintedBufferSize[] =
+const llvm::StringLiteral GenericTaintChecker::MsgTaintedBufferSize =
 "Untrusted data is used to specify the buffer size "
 "(CERT/STR31-C. Guarantee that storage for strings has sufficient space "
 "for character data and the null terminator)";
+
+const llvm::StringLiteral GenericTaintChecker::MsgCustomSink =
+"Untrusted data is passed

[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-07-31 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 marked 2 inline comments as done.
boga95 added a comment.

I think it shouldn't give compile error in case of incorrect configuration now 
(maybe warning) because:

- Without qualified names, I can create a code which cannot be configured 
properly.



- It can throw an error without configuration, for example:

  void read(int*); // There is an existing propagation rule for it

I suggest to let it unchanged now, and I will change it when the checker can 
handle qualified names.
On the other hand, I think we should make this type of error configurable (from 
the command line). So the user can select between warnings and errors.




Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:836
+   CheckerContext &C) const {
+  auto It = CustomSinks.find(Name);
+  if (It == CustomSinks.end())

Szelethus wrote:
> Hmmm, how do we do with qualified names (`MyClass::generateTaint()`, 
> `std::cin >>`)?
These patches focus on C style functions. I have implemented the uses of 
qualified names, but I intended to make a separate patch for that. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-07-31 Thread Kristóf Umann via Phabricator via cfe-commits
Szelethus added a comment.

In general, don't emit to stderr unless we either emit a warning/error about 
the incorrect configuration. As an experiment, what happens when you try to 
emit an error in the middle of the symbolic execution? You can retrieve a 
`DiagnosticsEngine` from any decl: `D->getASTContext().getDiagnostics()` (it's 
funny how you can retrieve almost all major manager objects if you try hard 
enough).




Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:136
+  /// Check if tainted data is used as a custom sink's parameter.
+  static const char MsgCustomSink[];
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,

How about `llvm::StringLiteral`?



Comment at: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp:836
+   CheckerContext &C) const {
+  auto It = CustomSinks.find(Name);
+  if (It == CustomSinks.end())

Hmmm, how do we do with qualified names (`MyClass::generateTaint()`, `std::cin 
>>`)?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-07-28 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 updated this revision to Diff 212119.
boga95 added a subscriber: steakhal.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637

Files:
  clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  clang/test/Analysis/taint-generic.c

Index: clang/test/Analysis/taint-generic.c
===
--- clang/test/Analysis/taint-generic.c
+++ clang/test/Analysis/taint-generic.c
@@ -338,3 +338,43 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a user defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x); // expected-warning {{Untrusted data is passed to a user defined sink}}
+}
Index: clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ clang/lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -132,6 +132,11 @@
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static const char MsgCustomSink[];
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report if the expression is tainted or points to tainted data.
   bool generateReportIfTainted(const Expr *E, const char Msg[],
CheckerContext &C) const;
@@ -175,7 +180,8 @@
 
 /// Get the propagation rule for a given function.
 static TaintPropagationRule
-getTaintPropagationRule(const FunctionDecl *FDecl, StringRef Name,
+getTaintPropagationRule(const GenericTaintChecker *Checker,
+const FunctionDecl *FDecl, StringRef Name,
 CheckerContext &C);
 
 void addSrcArg(unsigned A) { SrcArgs.push_back(A); }
@@ -240,6 +246,10 @@
 "Untrusted data is used to specify the buffer size "
 "(CERT/STR31-C. Guarantee that storage for strings has sufficient space "
 "for character data and the null terminator)";
+
+const char GenericTaintChecker::MsgCustomSink[] =
+"Untrusted data is passed to a user defined sink";
+;
 } // end of anonymous namespace
 
 using TaintConfig = GenericTaintChecker::TaintConfiguration;
@@ -330,7 +340,8 @@
 
 GenericTaintChecker::TaintPropagationRule
 GenericTaintChecker::TaintPropagationRule::getTaintPropagationRule(
-const FunctionDecl *FDecl, StringRef Name, CheckerContext &C) {
+const GenericTaintChecker *Checker, const FunctionDecl *FDecl,
+StringRef Name, CheckerContext &C) {
   // TODO: Currently, we might lose precision here: we always mark a return
   // value as tainted even if it's just a pointer, pointing to tainted data.
 
@@ -424,6 +435,10 @@
   // or smart memory copy:
   // - memccpy - copying until hitting a special character.
 
+  auto It = Checker->CustomPropagations.find(Name);
+  if (It != Checker->CustomPropagations.end())
+return It->getValue();
+
   return TaintPropagationRule();
 }
 
@@ -464,7 +479,7 @@
 
   // First, try generating a propagation rule for this function.
   TaintPropagationRule Rule =
-  TaintPropagationRule::getTaintPropagationRule(FDecl, Name, C);
+  TaintPropagationRule::getTaintPropagationRule(this, FDecl, Name, C);
   if (!Rule.isNull()) {
 State = Rule.process(CE, C);
 if (!State)
@@ -536,6 +551,9 @@
   if (checkTaintedBufferSize(CE, FDecl, C))
 return true;
 
+  if (checkCustomSinks(CE, Name, C))
+return true;
+
   return false;
 }
 
@@ -572,8 +590,12 @@
   // Check for taint in arguments.
   bool IsTainted = true;
   for (unsigned ArgNum : SrcArgs) {
-if (ArgNum >= CE->getNumArgs())
-  return State;
+if (ArgNum >= CE->getNumArgs()) {
+  StringRef Name = C.getCalleeName(C.getCalleeDecl(CE));
+  llvm::errs() << "Skip out of bound SrcArg: " << Name << ":" << ArgNum
+   << '\n';
+  continue;
+

[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-04-04 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 updated this revision to Diff 193676.
boga95 added a comment.

Rebase after https://reviews.llvm.org/D59861.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59637/new/

https://reviews.llvm.org/D59637

Files:
  lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  test/Analysis/Inputs/taint-generic-config.yaml
  test/Analysis/taint-generic.c

Index: test/Analysis/taint-generic.c
===
--- test/Analysis/taint-generic.c
+++ test/Analysis/taint-generic.c
@@ -1,5 +1,5 @@
-// RUN: %clang_analyze_cc1  -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -Wno-format-security -verify %s
-// RUN: %clang_analyze_cc1  -DFILE_IS_STRUCT -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -Wno-format-security -verify %s
+// RUN: %clang_analyze_cc1  -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -analyzer-config alpha.security.taint.TaintPropagation:Config=%S/Inputs/taint-generic-config.yaml -Wno-format-security -verify %s
+// RUN: %clang_analyze_cc1  -DFILE_IS_STRUCT -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -analyzer-config alpha.security.taint.TaintPropagation:Config=%S/Inputs/taint-generic-config.yaml -Wno-format-security -verify %s
 
 int scanf(const char *restrict format, ...);
 char *gets(char *str);
@@ -295,3 +295,43 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a user defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x); // expected-warning {{Untrusted data is passed to a user defined sink}}
+}
Index: test/Analysis/Inputs/taint-generic-config.yaml
===
--- /dev/null
+++ test/Analysis/Inputs/taint-generic-config.yaml
@@ -0,0 +1,51 @@
+# A list of source/propagation function
+Propagations:
+  # int x = mySource1(); // x is tainted
+  - Name: mySource1
+DstArgs:  [4294967294] # Index for return value
+
+  # int x;
+  # mySource2(&x); // x is tainted
+  - Name: mySource2
+DstArgs:  [0]
+
+  # int x, y;
+  # myScanf("%d %d", &x, &y); // x and y are tainted
+  - Name: myScanf
+VarType:  Dst
+VarIndex: 1
+
+  # int x; // x is tainted
+  # int y;
+  # myPropagator(x, &y); // y is tainted
+  - Name: myPropagator
+SrcArgs:  [0]
+DstArgs:  [1]
+
+  # const unsigned size = 100;
+  # char buf[size];
+  # int x, y;
+  # int n = mySprintf(buf, size, "%d %d", x, y); // If size, x or y is tainted
+  # // the return value and the buf will be tainted
+  - Name: mySnprintf
+SrcArgs:  [1]
+DstArgs:  [0, 4294967294]
+VarType:  Src
+VarIndex: 3
+
+# A list of filter functions
+Filters:
+  # int x; // x is tainted
+  # myFilter(&x); // x is not tainted anymore
+  - Name: myFilter
+Args: [0]
+
+# A list of sink functions
+Sinks:
+  # int x, y; // x and y are tainted
+  # mySink(x, 0, 1); // It will warn
+  # mySink(0, 1, y); // It will warn
+  # mySink(0, x, 1); // It won't warn
+  - Name: mySink
+Args: [0, 2]
+
Index: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -122,6 +122,11 @@
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static const char MsgCustomSink[];
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report if the expression is tainted or points to tainted data.
   bool generateReportIfTainted(const Expr *E, const char Msg[],
CheckerContext &C) const;
@@ -231,6 +236,9 @@
 "(CERT/STR31-C. Guarantee that storage for strings has sufficient space "
 "for character data and the null terminator)";
 
+const char GenericTaint

[PATCH] D59637: [analyzer] Use the custom propagation rules and sinks in GenericTaintChecker

2019-03-21 Thread Borsik Gábor via Phabricator via cfe-commits
boga95 created this revision.
boga95 added reviewers: NoQ, Szelethus, xazax.hun, dkrupp.
Herald added subscribers: cfe-commits, Charusso, donat.nagy, mikhail.ramalho, 
a.sidorin, rnkovacs, szepet, baloghadamsoftware, whisperity.
Herald added a project: clang.

The `TaintPropagationRule` deduction uses the custom rules from the config.
Check custom sinks when looking for errors. Give an error when at least one of 
the specified arguments is tainted.
Emit error message and omit a parameter if it's out of bound.


Repository:
  rC Clang

https://reviews.llvm.org/D59637

Files:
  lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
  test/Analysis/Inputs/taint-generic-config.yaml
  test/Analysis/taint-generic.c

Index: test/Analysis/taint-generic.c
===
--- test/Analysis/taint-generic.c
+++ test/Analysis/taint-generic.c
@@ -1,5 +1,5 @@
-// RUN: %clang_analyze_cc1  -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -Wno-format-security -verify %s
-// RUN: %clang_analyze_cc1  -DFILE_IS_STRUCT -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -Wno-format-security -verify %s
+// RUN: %clang_analyze_cc1  -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -analyzer-config alpha.security.taint.TaintPropagation:Config=%S/Inputs/taint-generic-config.yaml -Wno-format-security -verify %s
+// RUN: %clang_analyze_cc1  -DFILE_IS_STRUCT -analyzer-checker=alpha.security.taint,core,alpha.security.ArrayBoundV2 -analyzer-config alpha.security.taint.TaintPropagation:Config=%S/Inputs/taint-generic-config.yaml -Wno-format-security -verify %s
 
 int scanf(const char *restrict format, ...);
 char *gets(char *str);
@@ -295,3 +295,43 @@
   if (i < rhs)
 *(volatile int *) 0; // no-warning
 }
+
+
+// Test configuration
+int mySource1();
+void mySource2(int*);
+void myScanf(const char*, ...);
+int myPropagator(int, int*);
+int mySnprintf(char*, size_t, const char*, ...);
+void mySink(int, int, int);
+
+void testConfigurationSources1() {
+  int x = mySource1();
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources2() {
+  int x;
+  mySource2(&x);
+  Buffer[x] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSources3() {
+  int x, y;
+  myScanf("%d %d", &x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationPropagation() {
+  int x = mySource1();
+  int y;
+  myPropagator(x, &y);
+  Buffer[y] = 1; // expected-warning {{Out of bound memory access }}
+}
+
+void testConfigurationSinks() {
+  int x = mySource1();
+  mySink(x, 1, 2); // expected-warning {{Untrusted data is passed to a user defined sink}}
+  mySink(1, x, 2); // no-warning
+  mySink(1, 2, x); // expected-warning {{Untrusted data is passed to a user defined sink}}
+}
Index: test/Analysis/Inputs/taint-generic-config.yaml
===
--- /dev/null
+++ test/Analysis/Inputs/taint-generic-config.yaml
@@ -0,0 +1,51 @@
+# A list of source/propagation function
+Propagations:
+  # int x = mySource1(); // x is tainted
+  - Name: mySource1
+DstArgs:  [4294967294] # Index for return value
+
+  # int x;
+  # mySource2(&x); // x is tainted
+  - Name: mySource2
+DstArgs:  [0]
+
+  # int x, y;
+  # myScanf("%d %d", &x, &y); // x and y are tainted
+  - Name: myScanf
+VarType:  Dst
+VarIndex: 1
+
+  # int x; // x is tainted
+  # int y;
+  # myPropagator(x, &y); // y is tainted
+  - Name: myPropagator
+SrcArgs:  [0]
+DstArgs:  [1]
+
+  # const unsigned size = 100;
+  # char buf[size];
+  # int x, y;
+  # int n = mySprintf(buf, size, "%d %d", x, y); // If size, x or y is tainted
+  # // the return value and the buf will be tainted
+  - Name: mySnprintf
+SrcArgs:  [1]
+DstArgs:  [0, 4294967294]
+VarType:  Src
+VarIndex: 3
+
+# A list of filter functions
+Filters:
+  # int x; // x is tainted
+  # myFilter(&x); // x is not tainted anymore
+  - Name: myFilter
+Args: [0]
+
+# A list of sink functions
+Sinks:
+  # int x, y; // x and y are tainted
+  # mySink(x, 0, 1); // It will warn
+  # mySink(0, 1, y); // It will warn
+  # mySink(0, x, 1); // It won't warn
+  - Name: mySink
+Args: [0, 2]
+
Index: lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
===
--- lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
+++ lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp
@@ -116,6 +116,11 @@
   bool checkTaintedBufferSize(const CallExpr *CE, const FunctionDecl *FDecl,
   CheckerContext &C) const;
 
+  /// Check if tainted data is used as a custom sink's parameter.
+  static const char MsgCustomSink[];
+  bool checkCustomSinks(const CallExpr *CE, StringRef Name,
+CheckerContext &C) const;
+
   /// Generate a report