ymandel added a comment. Thanks for the very helpful feedback! I think I've addressed all of the suggestions/concerns. I suspect there's yet more room for improvement in the algorithm, but I think we're converging on "good enough".
================ Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:38 +// interval. struct CFGInterval { + CFGInterval() = default; ---------------- sammccall wrote: > Summarizing an offline discussion, IIUC... > - we build a tree of intervals that represents the whole sequence of interval > graphs > - for the current worklist algorithm, we use only the total order and process > the first invalid block, the tree is not needed > - the algorithm/data structure could be significantly simplified (no > variants, templates, recursion...) if it didn't build the tree, just the > interval graphs iteratively until a single node is reached > - however a different, non-worklist algorithm could use the tree information > to determine which blocks to visit next. (Given `a (b c d) e`, after `d` we > try only `b` or `e`). This avoids expensive "is invalid" checks > - so the motivation for preserving the full interval tree is future-proofing > for using a different algorithm > > I think it's worth documenting this motivation, and also considering not > paying this complexity cost for a feature we aren't adding yet. Sam, I've radically simplified the implementation to only build exactly what's needed -- the ordering. PTAL and let me know what further comments/documentation would help. ================ Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:56 + + template <typename Node> struct NodeData { + // The block from which the interval was constructed. It is either the ---------------- sammccall wrote: > xazax.hun wrote: > > Correct me if I'm wrong but I have the impression that `CFGInterval` might > > either contain only `CfgBlock`s or other `CfgInterval`s, but these might > > never be mixed. The current representation does allow such mixing. In case > > do not need that, I think it might be cleaner to make `CfgInterval` itself > > templated and remove the `std::variant`. > (Very optional idea, in case you like it) > > This data structure is complicated: a tree of Intervals with CFGBlocks at the > root, and at every level the Intervals have pred/succ edges that form part of > one of the intermediate flow graphs. > > It seems nice to get rid of the tree if possible (ultimately we're using > these for ordering) and the nested pred/succs - we seem to only need these > for the top level interval (and in the end, we have one interval and > pred/succs are empty. > > What if we made this a flat structure, preserving the bracket structure as > some kind of back-edge? > > ``` > 1 ( 2 ( 3 4 ) ) 5 > becomes > 1 2 3 4 !3 !2 5 > ``` > > AIUI, the dataflow algorithm can basically treat this as a program, we move > left to right and `!X` means "if (X is invalidated) goto X". > > This would give us something like: > ``` > using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a > conditional jump > struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, > Succs; } > using IntervalGraph = vector<Interval>; > ``` > > I also suspect that this would let us avoid the core algorithms > (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock > in a trivial Interval to start with instead. I guess this is OK > performance-wise, but the reason not to do it is ending up with those nodes > polluting your final data structure? > > > Anyway, all this is up to you - for me keeping more data/structure around > than we're going to use is confusing, but it does seem closer to the textbook. I went with this proposal and stripped it down to only track what we need. I didn't even preserve the backedge structure, because we can't use that yet. If and when we get there, I think you're approach or something similar, where we avoid the tree structure, is a great idea. ================ Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:56-64 + template <typename Node> struct NodeData { + // The block from which the interval was constructed. It is either the + // graph's entry block or has at least one predecessor outside the interval. + const Node *Header; + std::vector<const Node *> Nodes; + }; + using IntervalData = std::variant<NodeData<CFGBlock>, NodeData<CFGInterval>>; ---------------- ymandel wrote: > sammccall wrote: > > xazax.hun wrote: > > > Correct me if I'm wrong but I have the impression that `CFGInterval` > > > might either contain only `CfgBlock`s or other `CfgInterval`s, but these > > > might never be mixed. The current representation does allow such mixing. > > > In case do not need that, I think it might be cleaner to make > > > `CfgInterval` itself templated and remove the `std::variant`. > > (Very optional idea, in case you like it) > > > > This data structure is complicated: a tree of Intervals with CFGBlocks at > > the root, and at every level the Intervals have pred/succ edges that form > > part of one of the intermediate flow graphs. > > > > It seems nice to get rid of the tree if possible (ultimately we're using > > these for ordering) and the nested pred/succs - we seem to only need these > > for the top level interval (and in the end, we have one interval and > > pred/succs are empty. > > > > What if we made this a flat structure, preserving the bracket structure as > > some kind of back-edge? > > > > ``` > > 1 ( 2 ( 3 4 ) ) 5 > > becomes > > 1 2 3 4 !3 !2 5 > > ``` > > > > AIUI, the dataflow algorithm can basically treat this as a program, we move > > left to right and `!X` means "if (X is invalidated) goto X". > > > > This would give us something like: > > ``` > > using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a > > conditional jump > > struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, > > Succs; } > > using IntervalGraph = vector<Interval>; > > ``` > > > > I also suspect that this would let us avoid the core algorithms > > (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock > > in a trivial Interval to start with instead. I guess this is OK > > performance-wise, but the reason not to do it is ending up with those nodes > > polluting your final data structure? > > > > > > Anyway, all this is up to you - for me keeping more data/structure around > > than we're going to use is confusing, but it does seem closer to the > > textbook. > I went with this proposal and stripped it down to only track what we need. I > didn't even preserve the backedge structure, because we can't use that yet. > If and when we get there, I think you're approach or something similar, where > we avoid the tree structure, is a great idea. True, but the `Header` field gets in the way, since it needs to have a recursive type (it's type is always one level down from the interval's type). So, somewhere you'll need a variant and a recursive type. The current version avoids this by dropping `Header` entirely, since it isn't needed in the output. ================ Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:101-103 +/// groups topologically. As a result, the blocks in a series of loops are +/// ordered such that all nodes in loop `i` are earlier in the order than nodes +/// in loop `j`. This ordering, when combined with widening, bounds the number ---------------- xazax.hun wrote: > I wonder if we still want to somehow enforce that within a loop/interval the > we visit nodes in the RPO order. Starting with RPO should make this algorithm cheaper (and maybe simpler!). But, RPO construction isn't free. So, I think it is a matter of cost -- compare generating the RPO first and then using that here vs. just using this directly. I can put a note in the implementation as FIXME. WDYT? ================ Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:123 +/// This WTO construction is described in Section 4.2 of [Bourdoncle1993]. +std::optional<WeakTopologicalOrdering> getIntervalWTO(const CFG &Cfg); + ---------------- sammccall wrote: > Naming is confusing here: `getWTO` is the internal function whose API > involves intervals, and `getIntervalWTO` be the main function whose API > doesn't. I think literally swapping them might be clearer! > > in general it's hard to tell which parts of this API people are actually > meant to use. > Is it reasonable to wrap everything up to `WeakTopologicalOrdering` in > `namespace internal` or so? Or are there places this will be used outside > tests? Agreed. The only reason to expose is for tests. ================ Comment at: clang/lib/Analysis/IntervalPartition.cpp:55 + for (const Node *S : Header->succs()) if (S != nullptr) + if (auto SID = getID(S); !Partitioned.test(SID)) { ---------------- sammccall wrote: > I'm confused, here and below why are successors allowed to be nullptr? That's how the CFG encodes unreachable nodes. `succs` doesn't actually return CFGBlocks, but an intermediate type which encodes reachability info. When you (implicitly) cast it to CFGBlock*, unreachable nodes become a nullptr. ================ Comment at: clang/lib/Analysis/IntervalPartition.cpp:121 + Index.emplace(N, ID); + Graph.Intervals.emplace_back(ID, Header, std::move(Data.Nodes)); } ---------------- xazax.hun wrote: > ymandel wrote: > > xazax.hun wrote: > > > It would probably take for me some time to understand what is going on > > > here, but I think this part might worth a comment. In particular, I have > > > the impression that `Graph.Intervals` is the owner of all the intervals. > > > But we use pointers to refer to these intervals. What would guarantee > > > that those pointers are not being invalidated by this `emplace_back` > > > operations? > > Excellent question, not least because I *wasn't* careful the first around > > and indeed ran up against this as a memory bug. :) That's the reason I > > added IDs, so that I could _avoid_ taking the addresses of elements until > > after we finish growing the vector. See lines 148-165: after we finish > > building the intervals, we go through and take addresses. > > > > I can add comments to this effect, but perhaps the safe option is to use a > > std::dequeue. What do you think? > Oh, thanks! I don't have a strong preference between a comment or a deque. I ended up changing to a deque after almost making the same mistake again. vector is just not safe when you want to be taking addresses. :) I've dropped the use of IDs and added comments. Let me know if this clear enough. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153058/new/ https://reviews.llvm.org/D153058 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits