dgoldman marked 2 inline comments as done. dgoldman added inline comments.
================ Comment at: clang-tools-extra/clangd/FindSymbols.cpp:535 +/// by range. +std::vector<DocumentSymbol> mergePragmas(std::vector<DocumentSymbol> &Syms, + std::vector<PragmaMarkSymbol> &Pragmas, ---------------- kadircet wrote: > dgoldman wrote: > > kadircet wrote: > > > dgoldman wrote: > > > > kadircet wrote: > > > > > dgoldman wrote: > > > > > > sammccall wrote: > > > > > > > FWIW the flow control/how we make progress seem hard to follow > > > > > > > here to me. > > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness of "is > > > > > > > there an open mark group". > > > > > > > > > > > > > > Possible simplifications: > > > > > > > - define a dummy root symbol, which seems clearer than the > > > > > > > vector<symbols> + range > > > > > > > - avoid reverse-sorting the list of pragma symbols, and just > > > > > > > consume from the front of an ArrayRef instead > > > > > > > - make the outer loop over pragmas, rather than symbols. It > > > > > > > would first check if the pragma belongs directly here or not, and > > > > > > > if so, loop over symbols to work out which should become > > > > > > > children. This seems very likely to be efficient enough in > > > > > > > practice (few pragmas, or most children are grouped into pragmas) > > > > > > > define a dummy root symbol, which seems clearer than the > > > > > > > vector<symbols> + range > > > > > > > > > > > > I guess? Then we'd take in a `DocumentSymbol & and a > > > > > > ArrayRef<PragmaMarkSymbol> & (or just by value and then return it > > > > > > as well). The rest would be the same though > > > > > > > > > > > > > In particular I think I'm struggling with the statefulness of "is > > > > > > > there an open mark group". > > > > > > > > > > > > We need to track the current open group if there is one in order to > > > > > > move children to it. > > > > > > > > > > > > > make the outer loop over pragmas, rather than symbols. It would > > > > > > > first check if the pragma belongs directly here or not, and if > > > > > > > so, loop over symbols to work out which should become children. > > > > > > > This seems very likely to be efficient enough in practice (few > > > > > > > pragmas, or most children are grouped into pragmas) > > > > > > > > > > > > The important thing here is knowing where the pragma mark ends - if > > > > > > it doesn't, it actually gets all of the children. So we'd have to > > > > > > peak at the next pragma mark, add all symbols before it to us as > > > > > > children, and then potentially recurse to nest it inside of a > > > > > > symbol. I'll try it out and see if it's simpler. > > > > > > > > > > > > > > > > > ``` > > > > > while(Pragmas) { > > > > > // We'll figure out where the Pragmas.front() should go. > > > > > Pragma P = Pragmas.front(); > > > > > DocumentSymbol *Cur = Root; > > > > > while(Cur->contains(P)) { > > > > > auto *OldCur = Cur; > > > > > for(auto *C : Cur->children) { > > > > > // We assume at most 1 child can contain the pragma (as pragmas > > > > > are on a single line, and children have disjoint ranges) > > > > > if (C->contains(P)) { > > > > > Cur = C; > > > > > break; > > > > > } > > > > > } > > > > > // Cur is immediate parent of P > > > > > if (OldCur == Cur) { > > > > > // Just insert P into children if it is not a group and we are > > > > > done. > > > > > // Otherwise we need to figure out when current pragma is > > > > > terminated: > > > > > // if next pragma is not contained in Cur, or is contained in one of > > > > > the children, It is at the end of Cur, nest all the children that > > > > > appear after P under the symbol node for P. > > > > > // Otherwise nest all the children that appear after P but before > > > > > next pragma under the symbol node for P. > > > > > // Pop Pragmas and break > > > > > } > > > > > } > > > > > } > > > > > ``` > > > > > > > > > > Does that make sense, i hope i am not missing something obvious? > > > > > Complexity-wise in the worst case we'll go all the way down to a leaf > > > > > once per pragma, since there will only be a handful of pragmas most > > > > > of the time it shouldn't be too bad. > > > > I've implemented your suggestion. I don't think it's simpler, but LMK, > > > > maybe it can be improved. > > > oops, i was looking into an older revision and missed mergepragmas2, i > > > think it looks quite similar to this one but we can probably get rid of > > > the recursion as well and simplify a couple more cases > > This makes sense, I think that works for the most part besides dropping > > the recursion, specifically for > > > > ``` > > // Next pragma is contained in the Sym, it belongs there and doesn't > > // affect us at all. > > if (Sym.range.contains(NextPragma.DocSym.range)) { > > Sym.children = mergePragmas2(Sym.children, Pragmas, Sym.range); > > continue; > > } > > ``` > > > > I guess we could explicitly forbid 3+ layers of nesting and handle it > > inline there? But I'm not sure it's worth the effort to rewrite all of this > > - the recursion shouldn't be deep and we avoid needing to shift vector > > elements over by recreating a new one. > Sorry I don't follow why we can't get rid of the recursion in this case. > > Two loop solution I described above literally tries to find the document > symbol node, such that the current pragma is contained in that node && > current pragma isn't contained in any of that node's children. Afterwards it > inserts the pragma into that node and starts traversing the tree from root > again for the next pragma. > > Again I don't follow where the `3+ layers of nesting` constraint came from. > But I do feel like the iterative version is somewhat easier to reason about > (especially keeping track of what's happening with `pragmas.front()` and the > way it bails out via `parentrange` check). Shifting of the vector is > definitely unfortunate but I think it shouldn't imply big performance hits in > practice as we are only shifting the children of a single node. Yeah, I understand the first part, I think specifically handling the group case after you discover where it needs to be inserted is a bit more complicated, something like the following: ``` // Pragma is a group, so we need to figure out where it terminates: // - If the next Pragma is not contained in Cur, P owns all of its // parent's children which occur after P. // - If the next pragma is contained in Cur but actually belongs to one // of the parent's children, we temporarily skip over it and look at // the next pragma to decide where we end. // - Otherwise nest all of its parent's children which occur after P but // before the next pragma. ``` And yeah, shifting in the worst case is definitely worst (due to repeat shifting) although it shouldn't be too common in practice (things like a large @implementation block would probably have the most children). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D105904/new/ https://reviews.llvm.org/D105904 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits