[clang] [llvm] [clang-tools-extra] [analyzer] Trust base to derived casts for dynamic types (PR #69057)

Balazs Benics via cfe-commits Fri, 29 Dec 2023 06:51:53 -0800

steakhal wrote:

> Thanks for the replies. I'll come back to this PR once I have some time; 
> maybe during the holidays. Both assertions directly relate to this PR for 
> sure.


I think I've just fixed the reported crash for this PR. For some reason a 
`CastExpr` does not have a reference type (`getType()`), even though it casts 
lvalue expressions - in case casting references instead of pointers). I have no 
clue why that's the case, but the workaround seems to work now. Added a 
testcase demonstrating the fix.

---

> Hm, testing these patches on the original testcase in #62663 (the one where 
> we use statements 1B and 2B) - I don't think this patchset solves that 
> scenario...

I've looked into the case and I think now I see your problem.

In case we have `2B` instead of `2A`, we no longer have a cast to hint us the 
dynamic type of the object; consequently we can't follow/devirtualize the call. 
So, we are basically forced to conservatively evaluate the call, as if the body 
is not present.
This is strictly speaking correct, but also not particularly useful.

If I'm not mistaken, inheritance has "open-set" semantics, which means that in 
our current translation-unit we might not see all the derived classes, and 
could be that some derived classes are loaded only at runtime using shared 
libraries fox example.

What it means for us is that, by seeing a `Base*`, which has a pure virtual 
function, and that only a single class inherits transitively from `Base` (and 
defines the virtual function); we can't be really sure that by calling that 
virtual function it will invoke that particular function.

For example, in an other translation unit they might define a `NewDerived` 
implementing `Base` slightly differently and smuggle a  pointer into our 
context; breaking local reasoning. Which means that the call to 
`foo->OnRecvCancel(port)` could potentially invoke a completely different 
definition that we saw in our translation unit.

If we would have seen the allocation (thus the dynamic type) of the object of 
which we call functions, it's fairly easy to inline the right definitions. 
However, when we need to guess what could be the dynamic type, its no longer 
that easy.

In presence of static-casts, the developer at least hinted a type, so in that 
case we have some justification for trusting the type there; otherwise all bets 
are off.

---

What could we do about inlining function definitions without knowing precisely 
the dynamic type of the object?

I think it would still make sense to assume well architected code and if only 
one thing derives (implements) a Base class, then it's probably safe to assume 
that that's the definition we want to inline. However, what should we do if 
multiple (N) classes implement Base? Trying each N, and basically splitting the 
state to (N+1) ways is not going to scale. Unless N is of course really small, 
like 2 or 3 at most.

To conclude, I don't think we could get the `2B` case working (by inlining the 
`Handler::OnRecvCancel(int)`); but I'll think about it.
Maybe you have some thoughts @Xazax-hun.

https://github.com/llvm/llvm-project/pull/69057
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang-tools-extra] [analyzer] Trust base to derived casts for dynamic types (PR #69057)

Reply via email to