steakhal wrote: > I think every time we need to iterate over all member of an equivalence > class, we might do something wrong. The point of the equivalence class would > be to make sure those elements are equivalent. One way to avoid iteration > would be to always use the representative of the equivalence class. E.g., > each time we record a new constraint to a member of the class, we add this > information to the representative element. Every time we do a query, we first > look up the representative element which already should have all the info > from the class and use it instead of the original symbol. > > Would something like this work or am I missing something?
I had to think about it for a little while, and here are my thoughts: In an example like here: ```c++ void gh_62215(int x, int y, int z) { if (x != y) return; // x == y if (z <= x) return; // z > x if (z >= y) return; // z < y clang_analyzer_warnIfReached(); // no-warning: This should be dead code. (void)(x + y + z); // keep the constraints alive. } ``` Here, after the first if, we would have an eqclass of `{x,y}`; with a single constraint `x != y: [0,0]`. After the second if, we would have notionally 3 eqclasses: `{x,y}`, and two trivial ones `x` and `y`. We would also have 3 constraints: `x != y: [0,0]`, `z <= x: [0,0]`, `z <= y: [0,0]`. Note that the keys in the constraints map (symbols) can be 'arbitrarily' complex and refer to already dead symbols. In this example, I keep these symbols artificially alive to make the case slightly simpler to discuss. So, if I understand you correctly, at the 3rd if statement, we should canonicalize the symbol we are constraining by walking every sub-symbol and substituting it with its equivalent counterpart (if any), by basically with its eqclass' representative symbol. In this example, it would mean that instead of adding a constraint `z >= y: [0,0]`, we would instead `z >= x: [0,0]` (assuming that `x` is the representative symbol of eqclass `{x,y}`. I believe this canonicalization could work (and would solve the other demonstrative limitation I added in this PR), but would also incur overhead. And from a design point of view, an eqclass would need to keep the representative symbol alive, because otherwise, we can't substitute a symbol with the representative symbol. (Note that for this reason, we don't actually have a representative symbol, but rather a unique integral value as an ID for the eqclass - which happens to be equal to the `SymbolRef` pointer value of the representative symbol if that's still alive. Here is an example where the representative symbol `x` of the eqclass `{x,y}` would die, but remain the ID of the eqclass: ```c++ // same example, but with: (void)(y + z); // X is not mentioned!, thus reclaimed after the 2nd if statement. ``` https://github.com/llvm/llvm-project/pull/71284 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits