ymandel added a comment.

Thanks for the very helpful feedback! 
I think I've addressed all of the suggestions/concerns. I suspect there's yet 
more room for improvement in the algorithm, but I think we're converging on 
"good enough".



================
Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:38
+// interval.
 struct CFGInterval {
+  CFGInterval() = default;
----------------
sammccall wrote:
> Summarizing an offline discussion, IIUC...
> - we build a tree of intervals that represents the whole sequence of interval 
> graphs
> - for the current worklist algorithm, we use only the total order and process 
> the first invalid block, the tree is not needed
> - the algorithm/data structure could be significantly simplified (no 
> variants, templates, recursion...) if it didn't build the tree, just the 
> interval graphs iteratively until a single node is reached
> - however a different, non-worklist algorithm could use the tree information 
> to determine which blocks to visit next. (Given `a (b c d) e`, after `d` we 
> try only `b` or `e`). This avoids expensive "is invalid" checks
> - so the motivation for preserving the full interval tree is future-proofing 
> for using a different algorithm
> 
> I think it's worth documenting this motivation, and also considering not 
> paying this complexity cost for a feature we aren't adding yet.
Sam, I've radically simplified the implementation to only build exactly what's 
needed -- the ordering. PTAL and let me know what further 
comments/documentation would help.


================
Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:56
+
+  template <typename Node> struct NodeData {
+    // The block from which the interval was constructed. It is either the
----------------
sammccall wrote:
> xazax.hun wrote:
> > Correct me if I'm wrong but I have the impression that `CFGInterval` might 
> > either contain only `CfgBlock`s or other `CfgInterval`s, but these might 
> > never be mixed. The current representation does allow such mixing. In case 
> > do not need that, I think it might be cleaner to make `CfgInterval` itself 
> > templated and remove the `std::variant`.
> (Very optional idea, in case you like it)
> 
> This data structure is complicated: a tree of Intervals with CFGBlocks at the 
> root, and at every level the Intervals have pred/succ edges that form part of 
> one of the intermediate flow graphs.
> 
> It seems nice to get rid of the tree if possible (ultimately we're using 
> these for ordering) and the nested pred/succs - we seem to only need these 
> for the top level interval (and in the end, we have one interval and 
> pred/succs are empty.
> 
> What if we made this a flat structure, preserving the bracket structure as 
> some kind of back-edge?
> 
> ```
> 1  (  2  (  3  4  )  )  5
> becomes
> 1     2     3  4 !3 !2  5
> ```
> 
> AIUI, the dataflow algorithm can basically treat this as a program, we move 
> left to right and `!X` means "if (X is invalidated) goto X".
> 
> This would give us something like:
> ```
> using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a 
> conditional jump
> struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, 
> Succs; }
> using IntervalGraph = vector<Interval>;
> ```
> 
> I also suspect that this would let us avoid the core algorithms 
> (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock 
> in a trivial Interval to start with instead. I guess this is OK 
> performance-wise, but the reason not to do it is ending up with those nodes 
> polluting your final data structure?
> 
> 
> Anyway, all this is up to you - for me keeping more data/structure around 
> than we're going to use is confusing, but it does seem closer to the textbook.
I went with this proposal and stripped it down to only track what we need. I 
didn't even preserve the backedge structure, because we can't use that yet. If 
and when we get there, I think you're approach or something similar, where we 
avoid the tree structure, is a great idea.


================
Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:56-64
+  template <typename Node> struct NodeData {
+    // The block from which the interval was constructed. It is either the
+    // graph's entry block or has at least one predecessor outside the 
interval.
+    const Node *Header;
+    std::vector<const Node *> Nodes;
+  };
+  using IntervalData = std::variant<NodeData<CFGBlock>, NodeData<CFGInterval>>;
----------------
ymandel wrote:
> sammccall wrote:
> > xazax.hun wrote:
> > > Correct me if I'm wrong but I have the impression that `CFGInterval` 
> > > might either contain only `CfgBlock`s or other `CfgInterval`s, but these 
> > > might never be mixed. The current representation does allow such mixing. 
> > > In case do not need that, I think it might be cleaner to make 
> > > `CfgInterval` itself templated and remove the `std::variant`.
> > (Very optional idea, in case you like it)
> > 
> > This data structure is complicated: a tree of Intervals with CFGBlocks at 
> > the root, and at every level the Intervals have pred/succ edges that form 
> > part of one of the intermediate flow graphs.
> > 
> > It seems nice to get rid of the tree if possible (ultimately we're using 
> > these for ordering) and the nested pred/succs - we seem to only need these 
> > for the top level interval (and in the end, we have one interval and 
> > pred/succs are empty.
> > 
> > What if we made this a flat structure, preserving the bracket structure as 
> > some kind of back-edge?
> > 
> > ```
> > 1  (  2  (  3  4  )  )  5
> > becomes
> > 1     2     3  4 !3 !2  5
> > ```
> > 
> > AIUI, the dataflow algorithm can basically treat this as a program, we move 
> > left to right and `!X` means "if (X is invalidated) goto X".
> > 
> > This would give us something like:
> > ```
> > using Entry = PointerIntPair<CFGBlock*, 1>; // either an actual block, or a 
> > conditional jump
> > struct Interval { vector<Entry> Sequence; SmallDenseSet<Interval*> Preds, 
> > Succs; }
> > using IntervalGraph = vector<Interval>;
> > ```
> > 
> > I also suspect that this would let us avoid the core algorithms 
> > (buildInterval, addIntervalNode) being templated, simply wrap each CFGBlock 
> > in a trivial Interval to start with instead. I guess this is OK 
> > performance-wise, but the reason not to do it is ending up with those nodes 
> > polluting your final data structure?
> > 
> > 
> > Anyway, all this is up to you - for me keeping more data/structure around 
> > than we're going to use is confusing, but it does seem closer to the 
> > textbook.
> I went with this proposal and stripped it down to only track what we need. I 
> didn't even preserve the backedge structure, because we can't use that yet. 
> If and when we get there, I think you're approach or something similar, where 
> we avoid the tree structure, is a great idea.
True, but the `Header` field gets in the way, since it needs to have a 
recursive type (it's type is always one level down from the interval's type).  
So, somewhere you'll need a variant and a recursive type. The current version 
avoids this by dropping `Header` entirely, since it isn't needed in the output.


================
Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:101-103
+/// groups topologically. As a result, the blocks in a series of loops are
+/// ordered such that all nodes in loop `i` are earlier in the order than nodes
+/// in loop `j`. This ordering, when combined with widening, bounds the number
----------------
xazax.hun wrote:
> I wonder if we still want to somehow enforce that within a loop/interval the 
> we visit nodes in the RPO order.
Starting with RPO should make this algorithm cheaper (and maybe simpler!).  
But, RPO construction isn't free. So, I think it is a matter of cost -- compare 
generating the RPO first and then using that here vs. just using this directly. 
 I can put a note in the implementation as FIXME. WDYT?


================
Comment at: clang/include/clang/Analysis/Analyses/IntervalPartition.h:123
+/// This WTO construction is described in Section 4.2 of [Bourdoncle1993].
+std::optional<WeakTopologicalOrdering> getIntervalWTO(const CFG &Cfg);
+
----------------
sammccall wrote:
> Naming is confusing here: `getWTO` is the internal function whose API 
> involves intervals, and `getIntervalWTO` be the main function whose API 
> doesn't. I think literally swapping them might be clearer!
> 
> in general it's hard to tell which parts of this API people are actually 
> meant to use.
> Is it reasonable to wrap everything up to `WeakTopologicalOrdering` in 
> `namespace internal` or so? Or are there places this will be used outside 
> tests?
Agreed.  The only reason to  expose is for tests.


================
Comment at: clang/lib/Analysis/IntervalPartition.cpp:55
+  for (const Node *S : Header->succs())
     if (S != nullptr)
+      if (auto SID = getID(S); !Partitioned.test(SID)) {
----------------
sammccall wrote:
> I'm confused, here and below why are successors allowed to be nullptr?
That's how the CFG encodes unreachable nodes. `succs` doesn't actually return 
CFGBlocks, but an intermediate type which encodes reachability info. When you 
(implicitly) cast it to CFGBlock*, unreachable nodes become a nullptr.


================
Comment at: clang/lib/Analysis/IntervalPartition.cpp:121
+    Index.emplace(N, ID);
+  Graph.Intervals.emplace_back(ID, Header, std::move(Data.Nodes));
 }
----------------
xazax.hun wrote:
> ymandel wrote:
> > xazax.hun wrote:
> > > It would probably take for me some time to understand what is going on 
> > > here, but I think this part might worth a comment. In particular, I have 
> > > the impression that `Graph.Intervals` is the owner of all the intervals. 
> > > But we use pointers to refer to these intervals. What would guarantee 
> > > that those pointers are not being invalidated by this `emplace_back` 
> > > operations?
> > Excellent question, not least because I *wasn't* careful the first around 
> > and indeed ran up against this as a memory bug. :) That's the reason I 
> > added IDs, so that I could _avoid_ taking the addresses of elements until 
> > after we finish growing the vector. See lines 148-165: after we finish 
> > building the intervals, we go through and take addresses.
> > 
> > I can add comments to this effect, but perhaps the safe option is to use a 
> > std::dequeue. What do you think?
> Oh, thanks! I don't have a strong preference between a comment or a deque. 
I ended up changing to a deque after almost making the same mistake again. 
vector is just not safe when you want to be taking addresses. :) I've dropped 
the use of IDs and added comments. Let me know if this clear enough.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153058/new/

https://reviews.llvm.org/D153058

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to