[PATCH] D105904: [clangd] Support `#pragma mark` in the outline

David Goldman via Phabricator via cfe-commits Tue, 03 Aug 2021 06:54:42 -0700

dgoldman marked 2 inline comments as done.
dgoldman added inline comments.


================
Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535
+/// by range.
+std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms,
+                                         std::vector<PragmaMarkSymbol> &Pragmas,
----------------
kadircet wrote:
> dgoldman wrote:
> > kadircet wrote:
> > > dgoldman wrote:
> > > > kadircet wrote:
> > > > > dgoldman wrote:
> > > > > > sammccall wrote:
> > > > > > > FWIW the flow control/how we make progress seem hard to follow 
> > > > > > > here to me.
> > > > > > > 
> > > > > > > In particular I think I'm struggling with the statefulness of "is 
> > > > > > > there an open mark group".
> > > > > > > 
> > > > > > > Possible simplifications:
> > > > > > >  - define a dummy root symbol, which seems clearer than the 
> > > > > > > vector<symbols> + range
> > > > > > >  - avoid reverse-sorting the list of pragma symbols, and just 
> > > > > > > consume from the front of an ArrayRef instead
> > > > > > >  - make the outer loop over pragmas, rather than symbols. It 
> > > > > > > would first check if the pragma belongs directly here or not, and 
> > > > > > > if so, loop over symbols to work out which should become 
> > > > > > > children. This seems very likely to be efficient enough in 
> > > > > > > practice (few pragmas, or most children are grouped into pragmas)
> > > > > > > 
> > > > > > > > define a dummy root symbol, which seems clearer than the
> > > > > > > > vector<symbols> + range
> > > > > > 
> > > > > > > I guess? Then we'd take in a `DocumentSymbol &` and an
> > > > > > > `ArrayRef<PragmaMarkSymbol> &` (or just by value and then return it
> > > > > > > as well). The rest would be the same though.
> > > > > > 
> > > > > > > In particular I think I'm struggling with the statefulness of "is 
> > > > > > > there an open mark group".
> > > > > > 
> > > > > > We need to track the current open group if there is one in order to 
> > > > > > move children to it.
> > > > > > 
> > > > > > > make the outer loop over pragmas, rather than symbols. It would 
> > > > > > > first check if the pragma belongs directly here or not, and if 
> > > > > > > so, loop over symbols to work out which should become children. 
> > > > > > > This seems very likely to be efficient enough in practice (few 
> > > > > > > pragmas, or most children are grouped into pragmas)
> > > > > > 
> > > > > > > The important thing here is knowing where the pragma mark ends. If
> > > > > > > it doesn't end, it actually gets all of the children. So we'd have to
> > > > > > > peek at the next pragma mark, add all symbols before it to us as
> > > > > > > children, and then potentially recurse to nest it inside of a
> > > > > > > symbol. I'll try it out and see if it's simpler.
> > > > > > 
> > > > > > 
> > > > > ```
> > > > > while (Pragmas) {
> > > > >   // We'll figure out where Pragmas.front() should go.
> > > > >   Pragma P = Pragmas.front();
> > > > >   DocumentSymbol *Cur = Root;
> > > > >   while (Cur->contains(P)) {
> > > > >     auto *OldCur = Cur;
> > > > >     for (auto *C : Cur->children) {
> > > > >       // We assume at most 1 child can contain the pragma (as pragmas
> > > > >       // are on a single line, and children have disjoint ranges).
> > > > >       if (C->contains(P)) {
> > > > >         Cur = C;
> > > > >         break;
> > > > >       }
> > > > >     }
> > > > >     // Cur is the immediate parent of P.
> > > > >     if (OldCur == Cur) {
> > > > >       // Just insert P into children if it is not a group and we are
> > > > >       // done.
> > > > >       // Otherwise we need to figure out when the current pragma is
> > > > >       // terminated:
> > > > >       // - If the next pragma is not contained in Cur, or is contained
> > > > >       //   in one of the children, P runs to the end of Cur: nest all
> > > > >       //   the children that appear after P under the symbol node for P.
> > > > >       // - Otherwise nest all the children that appear after P but
> > > > >       //   before the next pragma under the symbol node for P.
> > > > >       // Pop Pragmas and break.
> > > > >     }
> > > > >   }
> > > > > }
> > > > > ```
> > > > > 
> > > > > Does that make sense? I hope I am not missing something obvious.
> > > > > Complexity-wise, in the worst case we'll go all the way down to a leaf
> > > > > once per pragma. Since there will only be a handful of pragmas most
> > > > > of the time, it shouldn't be too bad.
> > > > I've implemented your suggestion. I don't think it's simpler, but LMK, 
> > > > maybe it can be improved.
> > > Oops, I was looking at an older revision and missed `mergePragmas2`. I
> > > think it looks quite similar to this one, but we can probably get rid of
> > > the recursion as well and simplify a couple more cases.
> > This makes sense. I think that works for the most part besides dropping
> > the recursion, specifically for
> > 
> > ```
> >       // Next pragma is contained in the Sym, it belongs there and doesn't
> >       // affect us at all.
> >       if (Sym.range.contains(NextPragma.DocSym.range)) {
> >         Sym.children = mergePragmas2(Sym.children, Pragmas, Sym.range);
> >         continue;
> >       }
> > ```
> > 
> > I guess we could explicitly forbid 3+ layers of nesting and handle it 
> > inline there? But I'm not sure it's worth the effort to rewrite all of this 
> > - the recursion shouldn't be deep and we avoid needing to shift vector 
> > elements over by recreating a new one.
> Sorry, I don't follow why we can't get rid of the recursion in this case.
> 
> The two-loop solution I described above literally tries to find the document
> symbol node such that the current pragma is contained in that node && the
> current pragma isn't contained in any of that node's children. Afterwards it
> inserts the pragma into that node and starts traversing the tree from the root
> again for the next pragma.
> 
> Again I don't follow where the `3+ layers of nesting` constraint came from. 
> But I do feel like the iterative version is somewhat easier to reason about 
> (especially keeping track of what's happening with `pragmas.front()` and the 
> way it bails out via `parentrange` check). Shifting of the vector is 
> definitely unfortunate but I think it shouldn't imply big performance hits in 
> practice as we are only shifting the children of a single node.
Yeah, I understand the first part. I think specifically handling the group
case after you discover where it needs to be inserted is a bit more
complicated, something like the following:


```
        // Pragma is a group, so we need to figure out where it terminates:
        // - If the next Pragma is not contained in Cur, P owns all of its
        //   parent's children which occur after P.
        // - If the next pragma is contained in Cur but actually belongs to one
        //   of the parent's children, we temporarily skip over it and look at
        //   the next pragma to decide where we end.
        // - Otherwise nest all of its parent's children which occur after P but
        //   before the next pragma.
```
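
For illustration only, the three cases above could be sketched roughly as
follows. This is not the code in the patch: `Range`, `Sym`, and `groupEnd` are
hypothetical stand-ins (not clangd's `DocumentSymbol` or `PragmaMarkSymbol`),
ranges are simplified to half-open line intervals, and both the pragmas and
each node's children are assumed to be sorted by position.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; not clangd's real types.
struct Range {
  int Begin; // first line covered
  int End;   // one past the last line covered
  bool contains(const Range &O) const {
    return Begin <= O.Begin && O.End <= End;
  }
};

struct Sym {
  Range R;
  std::vector<Sym> Children; // assumed sorted and disjoint
};

// Returns the index one past the last child of Cur that the group pragma
// Pragmas[I] would own. Assumes Cur is the immediate parent of Pragmas[I]
// and that Pragmas is sorted by position.
std::size_t groupEnd(const Sym &Cur, const std::vector<Range> &Pragmas,
                     std::size_t I) {
  for (std::size_t N = I + 1; N < Pragmas.size(); ++N) {
    const Range &Next = Pragmas[N];
    // Case 1: the next pragma is outside Cur, so the group runs to the end.
    if (!Cur.R.contains(Next))
      break;
    // Case 2: the next pragma belongs to one of Cur's children; skip it and
    // look at the pragma after it.
    bool InChild = false;
    for (const Sym &C : Cur.Children) {
      if (C.R.contains(Next)) {
        InChild = true;
        break;
      }
    }
    if (InChild)
      continue;
    // Case 3: the next pragma is a direct child of Cur; the group stops at
    // the first child that starts after it.
    std::size_t End = 0;
    while (End < Cur.Children.size() && Cur.Children[End].R.Begin < Next.Begin)
      ++End;
    return End;
  }
  return Cur.Children.size();
}
```

The caller would then nest the children between P and the returned index under
the node for P.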

And yeah, shifting in the worst case is definitely worse (due to repeated
shifting), although it shouldn't be too common in practice (things like a large
@implementation block would probably have the most children).
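
To make the shifting cost concrete, here is a minimal sketch of nesting a run
of siblings under a group node in place. `Node` and `nestUnderGroup` are
hypothetical names used only for this example, not clangd's API. The erase and
the insert each shift the remaining siblings once, so the cost per group pragma
is proportional to the number of siblings that follow it.

```cpp
#include <cstddef>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a document symbol; not clangd's real type.
struct Node {
  std::string Name;
  std::vector<Node> Children;
};

// Moves Parent.Children[First, Last) under a new group node, which takes the
// place of the moved children at index First.
void nestUnderGroup(Node &Parent, std::string GroupName, std::size_t First,
                    std::size_t Last) {
  std::vector<Node> &Kids = Parent.Children;
  Node Group{std::move(GroupName), {}};
  // Move (rather than copy) the selected children into the group.
  Group.Children.assign(std::make_move_iterator(Kids.begin() + First),
                        std::make_move_iterator(Kids.begin() + Last));
  // Drop the moved-from slots; this shifts the tail left once.
  Kids.erase(Kids.begin() + First, Kids.begin() + Last);
  // Put the group where its first child used to be; this shifts the tail
  // right by one.
  Kids.insert(Kids.begin() + First, std::move(Group));
}
```

Rebuilding the children vector instead, as the recursive version does, avoids
these shifts at the price of allocating a new vector per node.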


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105904/new/

https://reviews.llvm.org/D105904

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
