ioeric added a comment. First round of comments. Mostly around indexing actions and file records; I haven't started reviewing the data writing and storing code. I think it might make sense to split the index writing and storing logics into a separate patch, which should be possible if `writeUnitData` is abstracted into an interface.
================ Comment at: include/clang/Frontend/CompilerInstance.h:188 + /// produced to generate imported modules before they are executed. + std::function<std::unique_ptr<FrontendAction> + (const FrontendOptions &opts, std::unique_ptr<FrontendAction> action)> ---------------- It might make sense to define an alias for `std::function<std::unique_ptr<FrontendAction>(const FrontendOptions &opts, std::unique_ptr<FrontendAction> action)>`, which is used multiple times. ================ Comment at: include/clang/Frontend/FrontendOptions.h:262 + std::string IndexStorePath; + unsigned IndexIgnoreSystemSymbols : 1; ---------------- It might make sense to also have documentations for these options here. ================ Comment at: include/indexstore/IndexStoreCXX.h:84 + + std::pair<unsigned, unsigned> getLineCol() { + unsigned line, col; ---------------- nathawes wrote: > malaperle wrote: > > From what I understand, this returns the beginning of the occurrence. It > > would be useful to also have the end of the occurrence. From what I tested > > in Xcode, when you do "Find Selected Symbol in Workspace", it highlights > > the symbol name in yellow in the list of matches, so it mush use that > > LineCol then highlight the matching name. This is works in many situations > > but others occurrences won't have the name of the symbol. For example: > > "MyClass o1, o2;" > > If I use "Find Selected Symbol in Workspace" on MyClass constructor, if > > won't be able to highlight o1 and o2 > > Do you think it would be possible to add that (EndLineCol)? If not, how > > would one go about extending libindexstore in order to add additional > > information per occurrence? It is not obvious to me so far. > > We also need other things, for example in the case of definitions, we need > > the full range of the definition so that users can "peek" at definitions in > > a pop-up. Without storing the end of the definition in the index, we would > > need to reparse the file. > > > Our approach to related locations (e.g. name, signature, body, and doc > comment start/end locs) has been to not include them in the index and derive > them from the start location later. There's less data to collect and write > out during the build that way, and deriving the other locations isn't that > costly usually, as in most cases you 1) don't need to type check or even > preprocess to get the related locations, as with finding the end of o1 and o2 > in your example, and 2) usually only need to derive locations for a single or > small number of occurrences, like when 'peeking' at a definition. > > Are there cases where you think this approach won't work/perform well enough > for the indexer-backed queries clangd needs to support? > I agree that we should try to keep the serialized symbol size minimal, but I think it's worthwhile to also store end locations for occurrences because 1) it's cheap, 2) it's not necessary easy to compute without parsing for occurrences like `a::b::c` or `a::b<X>`, 3) it would be useful for many LSP use cases. ================ Comment at: lib/Frontend/CompilerInstance.cpp:1176 + new GenerateModuleFromModuleMapAction); + if (WrapGenModuleAction) { + Action = WrapGenModuleAction(FrontendOpts, std::move(Action)); ---------------- nit: no braces around one liners. ================ Comment at: lib/FrontendTool/ExecuteCompilerInvocation.cpp:170 + Act = index::createIndexDataRecordingAction(FEOpts, std::move(Act)); + CI.setGenModuleActionWrapper(&index::createIndexDataRecordingAction); + } ---------------- Could you comment on what this does? The `Act` above is already wrapped. Why do we need `setGenModuleActionWrapper` to `createIndexDataRecordingAction` again? Also, `createIndexDataRecordingAction` doesn't seem related to `GenModule`. ================ Comment at: lib/Index/FileIndexRecord.cpp:23 + ArrayRef<SymbolRelation> Relations) { + assert(D->isCanonicalDecl()); + ---------------- Why? ================ Comment at: lib/Index/FileIndexRecord.cpp:37 + + DeclOccurrence NewInfo(Roles, Offset, D, Relations); + auto It = std::upper_bound(Decls.begin(), Decls.end(), NewInfo); ---------------- Please comment when this would happen. ================ Comment at: lib/Index/FileIndexRecord.cpp:39 + auto It = std::upper_bound(Decls.begin(), Decls.end(), NewInfo); + Decls.insert(It, std::move(NewInfo)); +} ---------------- Why do we need `Decls` to be sorted by offset? If we want this for printing, it might make sense to just do a sort there. ================ Comment at: lib/Index/FileIndexRecord.h:24 + +class FileIndexRecord { +public: ---------------- Please add documentation. ================ Comment at: lib/Index/FileIndexRecord.h:41 + + friend bool operator <(const DeclOccurrence &LHS, + const DeclOccurrence &RHS) { ---------------- Is this clang-formatted? You might want to run git-clang-format on the whole patch. ================ Comment at: lib/Index/IndexingAction.cpp:284 + SourceManager &SourceMgr) : + IndexCtx(indexCtx), RecordOpts(recordOpts), + Includes(IncludesForFile), SourceMgr(SourceMgr) {} ---------------- Again, you don't need the full `IndexingContext` and `RecordOptions` here. ================ Comment at: lib/Index/IndexingAction.cpp:294 + + std::pair<FileID, unsigned> LocInfo = + SourceMgr.getDecomposedExpansionLoc(From); ---------------- Note that `getDecomposedExpansionLoc ` can also return invalid decomposed loc. ================ Comment at: lib/Index/IndexingAction.cpp:308 + if (!FE) + return; + auto lineNo = SourceMgr.getLineNumber(LocInfo.first, LocInfo.second); ---------------- Do we want better error handling here? ================ Comment at: lib/Index/IndexingAction.cpp:327 + +class IndexDependencyProvider { +public: ---------------- Please provide documentation. ================ Comment at: lib/Index/IndexingAction.cpp:504 + + CreatedASTConsumer = true; + std::vector<std::unique_ptr<ASTConsumer>> Consumers; ---------------- Can we get this state from the base class instead of maintaining a another state, which seems to be identical? ================ Comment at: lib/Index/IndexingAction.cpp:524 + std::string RepositoryPath = getClangRepositoryPath(); + StringRef BuildNumber = StringRef(RepositoryPath); + size_t DashOffset = BuildNumber.find('-'); ---------------- Just `StringRef BuildNumber = RepositoryPath;` ================ Comment at: lib/Index/IndexingAction.cpp:701 +namespace { +class ModuleFileIndexDependencyCollector : public IndexDependencyProvider { + serialization::ModuleFile &ModFile; ---------------- Please provide a brief documentation for this class. ================ Comment at: lib/Index/IndexingAction.cpp:703 + serialization::ModuleFile &ModFile; + RecordingOptions RecordOpts; + ---------------- Again, it doesn't seem necessary for this class to have information about all record options. It seems that you only need `RecordSystemDependencies` here. ================ Comment at: lib/Index/IndexingAction.cpp:713 + override { + auto Reader = CI.getModuleManager(); + Reader->visitInputFiles(ModFile, RecordOpts.RecordSystemDependencies, ---------------- readability nit: avoid using `auto` if the return type is short to spell but hard to infer from the value expression. Same else where. ================ Comment at: lib/Index/IndexingAction.cpp:762 + HeaderSearch &HS = CI.getPreprocessor().getHeaderSearchInfo(); + Module *UnitMod = HS.lookupModule(Mod.ModuleName, /*AllowSearch=*/false); + ---------------- Could you add a comment explaining why we are not allowing searching. ================ Comment at: lib/Index/IndexingAction.cpp:769 + IndexCtx.setSysrootPath(SysrootPath); + Recorder.init(&IndexCtx, CI); + ---------------- It's a bit worrying that `IndexDataRecorder` and `IndexContext` reference each other. If you only need some information from the `IndexingContext`, simply pass it into `Recorder`. In this case, I think you only need the `SourceManager` from the `ASTContext` in the recorder to calculate whether a file is a system header. I see you also cache result of `IndexingContext::isSystemFile` in the indexing context, but I think it would be more sensible for the callers to handle caching for this call. ================ Comment at: lib/Index/IndexingAction.cpp:771 + + for (const Decl *D : CI.getModuleManager()->getModuleFileLevelDecls(Mod)) { + IndexCtx.indexTopLevelDecl(D); ---------------- nit: no braces around one liners. ================ Comment at: lib/Index/IndexingAction.cpp:779 + Mod.FileName, /*RootFile=*/nullptr, UnitMod, SysrootPath); + +} ---------------- nit: redundant empty line ================ Comment at: lib/Index/IndexingAction.cpp:837 + index::RecordingOptions RecordOpts; + std::tie(IndexOpts, RecordOpts) = getIndexOptionsFromFrontendOptions(FEOpts); + return ::createIndexDataRecordingAction(IndexOpts, RecordOpts, ---------------- Just `auto pair = getIndexOptionsFromFrontendOptions(FEOpts);` and then use `pair.first` and `pair.second`? Same below. ================ Comment at: lib/Index/IndexingContext.h:39 class IndexingContext { IndexingOptions IndexOpts; ---------------- Please define the scope of this class to avoid throwing random states into it, which usually happens to a "context" class. https://reviews.llvm.org/D39050 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits