NagyDonat wrote:
> I'm gonna be honest with you that my approach was so far to find something
> similar to what I want and copy paste it. Then carefully check the exploded
> graph to see if it's correct.
After surveying code that uses `NodeBuilder`s my impression is that probably
everyone followed this approach (or at least the first part, I wouldn't bet on
everyone "carefully check"ing. There are several code fragments that are
completely non-functional (e.g. add a node to a set, then immediately remove
it), but as they do not cause any problems, they preserved and copied around.
This approach has the advantage that it actually gets things done (instead of
digging deep in rabbit holes and unearthing the skeleton armies buried in our
codebase), but I fear that it doesn't cover the "rare" cases.
In fact, even before I read this comment, I had the feeling that these part of
the engine are "fragile" in the sense that they have lots of quirks that are
usually irrelevant, but can differ from the "natural" behavior in some rare
corner case. (E.g. adding a node to a set then removing it is usually a no-op,
but if the node was somehow already present in the set, then it gets removed,
which may be an error.)
My intuition is that anything that uses `NodeBuilder`s has a chance to behave
incorrectly on exceptional input:
- encountering a sink node from a checker (on callbacks where that is rare),
- encountering a "node already exists" situation (cache out, node creation
returns `nullptr`),
- encountering a posteriorly overconstrained state (probably the most
exceptional case).
This is my primary motivation for cleaning up these parts of the codebase. Our
data model is full of rare-but-can-happen-anywhere exceptional things (e.g.
Undefined/Unknown values, sinks, caching out, posteriorly overconstrained
states, reaching various budgets and complexity limits etc.) so I feel that it
is infeasible to cover all corner cases experimentally with tests -- and
instead I'm trying to ground the correctness of the code in theoretical
understanding, invariants and readability. (Of course, we should _also_ have
tests in addition to theory, but I feel that it is not enough to test the
project as a black box full of skeletons.)
> Objectively, the engineers were really smart and had so many abstraction
> layers over time, and only few of them made long term sense.
I feel that code written around 2010-12 (IIUC the early days of the analyzer)
often contains overeager pretentious abstractions and generalizations that are
challenging the readers ("are you as smart as me?") but provide no practical
benefits. I don't know whether these are products of hubris ("I'm SO smart"),
written with well-meaning naivety ("generic solutions will help development in
the future") or just ruins of an ancient age -- maybe they _had_ some practical
benefits in the past which was lost during some refactoring.
I also admit that there were some old abstractions that appeared to be useless
at first, but since then I realized that they have some relevant (and sometimes
even important) role.
In fact, since I wrote my complaints about the overcomplicated management of
the `Frontier` (in `NodeBuilder`s), I realized that this is probably relevant
_during the evaluation of checker callbacks_ (when
`CheckerContext::addTransition` calls can and do perform arbitrarily complex
transition sequences and it is important to continue with the _frontier_ after
the checker callback).
My current plan is that I would move this `Frontier` management to
`CheckerContext` -- because it is irrelevant and confusing in the dozens of
other throwaway `NodeBuilder`s that are never used for checker execution.
> I don't mind and I actually want to encourage anyone simplifying things. One
> way is to try to get rid of something and if it was easy, then chances are
> that the "thing" wasn't really needed after all.
I like where this is going.
I hope that I will be able to meaningfully simplify the codebase :smile:
However it is likely that I won't do this immediately. During the next days
I'll try to make `switch` statements trigger the `BranchCondition` callback (I
have an unpublished almost-ready commit which should work correctly now that
`SwitchNodeBuilder`s are `NodeBuilder`s) and then merge
optin.core.UnconditionalVAArg
(https://github.com/llvm/llvm-project/pull/175602, which would work correctly
once `switch` triggers the `BranchCondition` callback). After these are done,
I'll switch to working on `security.ArrayBound` and generalizing its logic for
use in other checkers (like e.g. `ReturnPtrRange`), and I will work on this
`NodeBuilder` removal only after that (or within "breaks" when I'm blocked on
the bounds checking front).
https://github.com/llvm/llvm-project/pull/181431
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits