NoQ added a comment.

Your code is well-written and easy to understand, and makes me want to use it 
on my code! Added some minor comments, and there seems to be a small logic 
error in the compound statement hasher.

Not sure if that was already explained in detail, but i seem to agree that the 
only point for this project to integrate into Clang Static Analyzer is to 
integrate with the `BugReporter` facility - at least optionally: it would allow 
the reports of the clone detector checker to be gathered and viewed in a manner 
similar to other checkers - i.e. through scan-build/scan-view, probably 
CodeChecker and other tools. That should make the check easier to run and 
attract more users. However, the `BugReporter` mechanism is tweaked for the 
analyzer's path-sensitive checkers, so it'd need to be modified to suit your 
purpose, so in my opinion this is not critical for the initial commit.


================
Comment at: lib/AST/CloneDetection.cpp:52
@@ +51,3 @@
+      Other.getEndLoc() == getEndLoc();
+  return StartIsInBounds && EndIsInBounds;
+}
----------------
Perhaps we could early-return when we see that `StartIsInBounds` is false (for 
performance, because `isBeforeInTranslationUnit` looks heavy).

================
Comment at: lib/AST/CloneDetection.cpp:88
@@ +87,3 @@
+/// It is defined as:
+///   H(D) = PrimeFactor^(N-1) * D[0] + PrimeFactor^(N-2) * D[1] + ... + D[N]
+/// where
----------------
It seems that the last index is off-by-one, i think you meant something like:

```
H(D) = PrimeFactor^(N-1) * D[0] + PrimeFactor^(N-2) * D[1] + ... + D[N-1]
```

================
Comment at: lib/AST/CloneDetection.cpp:175
@@ +174,3 @@
+      ++StartIterator;
+    auto EndIterator = Iterator;
+    for (unsigned I = 0; I < Length; ++I)
----------------
Shouldn't it say something like "`EndIterator = StartIterator`"? Because when 
`StartPos` is non-zero, in your code we get [`StartPos`, `Length`) instead of 
[`StartPos`, `StartPos` + `Length`). In your code, `Length` acts as some kind 
of `EndPos` instead.

================
Comment at: lib/AST/CloneDetection.cpp:194
@@ +193,3 @@
+      } else {
+        StmtSequence ChildSequence(StmtSequence(Child, Context));
+
----------------
Simplifies to:

```
StmtSequence ChildSequence(Child, Context);
```

================
Comment at: lib/AST/CloneDetection.cpp:238
@@ +237,3 @@
+  // the CompoundStmt.
+  auto CS = dyn_cast<CompoundStmt>(CurrentStmt.front());
+  if (!CS)
----------------
I think that the code that deals with //computing// data for sections of 
compound statements (as opposed to //saving// this data) should be moved to 
`VisitStmt()`. Or, even better, to `VisitCompoundStmt()` - that's the whole 
point of having a visitor, after all. Upon expanding the complexity of the 
hash, you'd probably provide more special cases for special statement classes, 
which would all sit in their own Visit method.

That's for the future though, not really critical.

================
Comment at: lib/AST/CloneDetection.cpp:262
@@ +261,3 @@
+
+      CloneDetector::StmtData SubData(SubHash.getValue(), SubComplexity);
+      CD.add(Sequence, SubData);
----------------
This code is duplicated from the beginning of the function, which synergizes 
with my point above: if `SaveData()` contained only that much, then it could 
have been re-used on both call sites.

================
Comment at: lib/StaticAnalyzer/Checkers/CloneChecker.cpp:40
@@ +39,3 @@
+  CloneDetector.AnalyzeTranslationUnitDecl(
+      TU->getASTContext().getTranslationUnitDecl());
+
----------------
`TU->getASTContext().getTranslationUnitDecl()` is always equal to `TU` - there 
should, in most cases, be only one TranslationUnitDecl object around, but even 
if there'd be more some day, i think this invariant would still hold. 

================
Comment at: lib/StaticAnalyzer/Checkers/CloneChecker.cpp:43
@@ +42,3 @@
+  std::vector<CloneDetector::CloneGroup> CloneGroups;
+  CloneDetector.findClones(CloneGroups);
+
----------------
An idea: because this function optionally accepts `MinGroupComplexity`, you may 
probably want to expose this feature as a //checker-specific option// - see 
`MallocChecker::IsOptimistic` as an example, shouldn't be hard.

================
Comment at: lib/StaticAnalyzer/Checkers/CloneChecker.cpp:45
@@ +44,3 @@
+
+  DiagnosticsEngine &DiagEngine = Mgr.getDiagnostic();
+
----------------
zaks.anna wrote:
> Is it possible to report these through BugReporter?
I think that'd require changes in the BugReporter to allow easily adding extra 
pieces to path-insensitive reports (which is quite a wanted feature in many 
AST-based checkers - assuming we want more AST-based checkers to be moved into 
the analyzer specifically for better reporting, integration with scan-build, 
and stuff like that).

================
Comment at: test/Analysis/copypaste/test-min-max.cpp:39
@@ +38,3 @@
+
+int foo(int a, int b) {
+  return a + b;
----------------
Perhaps add a `// no-warning` comment to spots that should not cause a warning? 
This comment has no effect on the test suite, just a traditional way to notify 
the reader.


http://reviews.llvm.org/D20795



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to