ASDenysPetrov marked 10 inline comments as done.
ASDenysPetrov added a comment.

Thank you for the review @martong! Your work is not less harder then mine. I'll 
rework and update the revision ASAP.



================
Comment at: clang/lib/StaticAnalyzer/Checkers/ExprInspectionChecker.cpp:421-426
+  // Unwrap symbolic expression to skip argument casts on function call.
+  // This is useful when there is no way for overloading function in C
+  // but we need to pass different types of arguments and
+  // implicit cast occures.
+  Sym = Sym->ignoreCasts();
+
----------------
martong wrote:
> Does it really matter? I mean, why do we need this change?
I investigated. This changes is not obligatory now. I'll remove it.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2227-2251
+  llvm::SmallSet<EquivalenceClass, 4> SimplifiedClasses;
+  // Iterate over all equivalence classes and try to simplify them.
+  ClassMembersTy Members = State->get<ClassMembers>();
+  for (std::pair<EquivalenceClass, SymbolSet> ClassToSymbolSet : Members) {
+    EquivalenceClass Class = ClassToSymbolSet.first;
+    State = EquivalenceClass::simplify(Builder, RangeFactory, State, Class);
+    if (!State)
----------------
martong wrote:
> I think this hunk should remain in `assignSymExprToConst`. Why did you move 
> it?
I'll remove. It's unrelated one.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2280-2290
+/// Return a symbol which is the best canidate to save it in the constraint
+/// map. We should correct symbol because in case of truncation cast we can
+/// only reason about truncated bytes but not the whole value. E.g. (char)(int
+/// x), we can store constraints for the first lower byte but we still don't
+/// know the root value. Also in case of promotion or converion we should
+/// store the root value instead of cast symbol, because we can always get
+/// a correct range using `castTo` metho. And we are not intrested in any
----------------
martong wrote:
> I think, we should definitely store the constraints as they appear in the 
> analyzed code. This way we mix the infer logic into the constraint setting, 
> which is bad.
> I mean, we should simply store the constraint directly for the symbol as it 
> is. And then only in `VisitSymbolCast` should we infer the proper value from 
> the stored constraint (if we can).
> 
> (Of course, if we have related symbols (casts of the original symbol) then 
> their constraints must be updated.)
I see what you mean. I thought about this. Here what I've faced with.
# Let's say you meet `(wchar_t)x > 0`, which you store like a pair {(wchar_t)x, 
[1,32767]}.
# Then you meet `(short)x < 0`, where you have to answer whether it's `true` or 
`false`.
# So would be your next step? Brute forcing all the constraint pairs to find 
any `x`-related symbol? Obviously, O(n) complexity for every single integral 
symbol is inefficient.
What I propose is to "canonize" arbitrary types to a general form where this 
form could be a part of key along with `x` and we could get a constraint with a 
classic map complexity. So that:
# You meet `(wchar_t)x > 0`, which you convert `wchar_t` to `int16` and store 
like a pair {(int16)x, [1,32767]}.
# Then you meet `(short)x < 0`, where you convert `short` to `int16` and get a 
constraint.
That's why I've introduced `NominalTypeList`.
But now I admited your concern about arbitrary size of integers and will 
redesign my solution.


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:2346-2350
+/// FIXME: Update bigger casts. We only can reason about ranges of smaller
+/// types, because it would be too complicated to update, say, the entire
+/// `int` range if you only have knowledge that its lowest byte has been
+/// changed. So we don't touch bigger casts and they may be potentially
+/// invalid. For future, for:
----------------
martong wrote:
> Instead of a noop we should be more conservative in this case. We should 
> invalidate (remove) the constraints of all the symbols that have more bits 
> than the currently set symbol. However, we might be clever in cases of 
> special values (e.g `0` or in case of the `true` rangeSet {[MIN, -1], [1, 
> MAX]}).
No, it's incorrect. Consider next:
```
int x;
if(x > 1000000 || x < 100000) 
  return;
// x (100'000, 1000'000) 
if((int8)x != 42) 
  return;
// x (100'000, 1000'000) && (int8)x (42, 42) 
```
We can't just remove or invalidate `x (100'000, 1000'000)` because this range 
will still stay //true//.
Strictly speaking `x` range should be updated with values 100394, 102442, 
108586, ...,, 960554 and any other value within the range which has its lowest 
byte equals to 42.
We can't just update the `RangeSet` with such a big amount of values due to 
performance issues. So we just assume it as less accurate.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103096/new/

https://reviews.llvm.org/D103096

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to