ASDenysPetrov added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:48-49
+  //
+  // NOTE: User must provide canonical and unqualified QualType's for the
+  // correct result.
+  static bool canAccess(QualType From, QualType To, ASTContext &Ctx) {
----------------
NoQ wrote:
> `assert()` it? Or maybe canonicalize defensively so that not to duplicate 
> code in the caller, given that there's really only one correct way to do that?
>Or maybe canonicalize defensively so that not to duplicate code in the caller,
We need canonical types in the bug report as well, so we have to get them on 
the caller side.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:58
+      : From(From), To(To), Ctx(Ctx) {}
+  bool canAccessImpl() {
+    return isSame() || isCharOrByte() || isOppositeSign();
----------------
xazax.hun wrote:
> I'd love to see some detailed descriptions of the strict aliasing rule, what 
> parts are we checking for and what parts we don't. E.g. it would be nice to 
> document the differences between C and C++ aliasing rules. I do remember 
> something about prefixes, i.e.: it would be well defined to access something 
> with the dynamic type `struct { int x, y; };` and read `struct{ int x; };` 
> from it. Does that not apply to C++?
>I'd love to see some detailed descriptions of the strict aliasing rule, 
I've added a description in the header.
>it would be nice to document the differences between C and C++ aliasing rules.
I need some time to gather those differences and will add them to the 
description as well.
> it would be well defined to access something with the dynamic type struct { 
> int x, y; }; and read struct{ int x; }; from it. Does that not apply to C++?
AFAIK it's UB if you do any access to the object which lifetime is not started 
in C++.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:80
+class StrictAliasingChecker : public Checker<check::Location> {
+  mutable std::unique_ptr<BugType> BT;
+
----------------
NoQ wrote:
> The modern solution is
> ```lang=c++
> BugType BT{this, "...", "..."};
> ```
Got it.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:113
+  QualType getOriginalType(CheckerContext &C, SVal V, QualType T) const {
+    assert(V.getAs<Loc>() && "Location shall be a Loc.");
+    V = C.getState()->getSVal(V.castAs<Loc>(), T);
----------------
NoQ wrote:
> I suspect it might also be `UnknownVal` (?) It usually makes sense to protect 
> against such scenarios with an early return.
I'll try to add some tests to model this.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:141-145
+    auto ER = dyn_cast<ElementRegion>(R);
+    while (ER) {
+      R = ER->getSuperRegion();
+      ER = dyn_cast<ElementRegion>(R);
+    }
----------------
NoQ wrote:
> You're basically running into https://bugs.llvm.org/show_bug.cgi?id=43364 
> here. The element region is not necessarily a consequence of a pointer cast; 
> it may also be a legitimate array element access, and there's no way to 
> figure out what it really was. So I suspect that you may fail the following 
> test case:
> ```lang=c++
> void foo() {
>   int x[10];
>   int *p = &x[1];
>   *p = 1; // int incompatible with int[10]
> }
> ```
> 
> Separately from that, ultimatetly you'll most likely have to handle //regions 
> with symbolic base// separately. These regions are special because they are 
> built when the information about the original type is not really unavailable. 
> For now you're dodging this problem by only supporting `VarRegion`s with 
> element sub-regions. But in the general case you may encounter code like this
> ```lang=c++
> void foo(void *p) {
>   int *x = p;
>   float *y = x;
>   *y = 1.0;
> }
> ```
> which may be valid if the original pointer `p` points to a `float`. But this 
> can be delayed until later.
Oh, unfortunately I've stumbled upon these representation issues:
```
int arr[2][8]
auto p1 = (int(*)[8])arr[0]; // &Element{arr,0 S64b,int[8]}
auto p2 = (int(*)[8])arr;    // &Element{arr,0 S64b,int[8]}
auto p3 = (int(*)[7])arr[0]; // &Element{arr,0 S64b,int[7]}
auto x = *p1; // OK
auto y = *p2; // UB
auto z = *p3; // UB
```
I think we need some sort of a new region `CastRegion` that we can store the 
**provenance/origin** and **cast** type separately (in a canonical form), like 
I did for integers in  D103096. I can investigate it soon. And `ElementRegion` 
would be in charge of arrays only.
Or
We could modificate `ElementRegion` to be always in a canonical form which we 
could easily differentiate.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:157
+
+    ExplodedNode *Node = C.generateNonFatalErrorNode();
+    if (!BT)
----------------
NoQ wrote:
> I suggest a fatal error node. Undefined behavior already occured, there's 
> nothing interesting in the follow-up.
> I suggest a fatal error node. 
That's a big discussion what would be correct :-) since 98.99% of modern 
compilers treat e.g. `*((short*)x)` as a truncation and the following flow 
works as most users expect (who doesn't aware about UB). So here we could just 
warn about UB and explore the rest of the code for similar UB's without 
//sinking//. That's why I decided using non-fatal node.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/StrictAliasingChecker.cpp:178-179
+  const LangOptions &LO = CM.getLangOpts();
+  // Ideally, the condition should be `LO.CPlusPlus11 || LO.CPlusPlus2b` but
+  // implemented checks can be partially applied for C++17 and lower versions.
+  return LO.CPlusPlus;
----------------
NoQ wrote:
> We should probably keep the checker disabled in these language modes until 
> we're sure it works correctly (in a conservative sense, i.e. doesn't emit 
> false positives).
> 
> Can we check whether `-fstrict-aliasing` is enabled here? That's probably 
> appropriate.
> Can we check whether -fstrict-aliasing is enabled here? That's probably 
> appropriate.
Basically, I've already tried to do some preparations for using 
`-fstrict-aliasing` in D114006. But now it needs a bit modification. I'll 
update it.

>We should probably keep the checker disabled in these language modes until 
>we're sure it works correctly (in a conservative sense, i.e. doesn't emit 
>false positives).
I did it for the testing purposes. Several extisting tests help to find 
uncovered cases (As you can see, those, which have failed in //pre-merge 
checks//).
Agree. I'll change the condition to `LO.CPlusPlus20 || LO.CPlusPlus2b`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114718/new/

https://reviews.llvm.org/D114718

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to