sammccall added a comment. Looks really nice! Only major issue is the query trigrams don't look right. Otherwise, some style nits and fixes that seem to have gotten lost.
================ Comment at: clang-tools-extra/clangd/index/dex/Token.cpp:28 + +llvm::StringRef Token::data() const { return Data; } + ---------------- nit: inline these trivial accessors, so they can be inlined at the callsite ================ Comment at: clang-tools-extra/clangd/index/dex/Token.h:33 +/// constructing complex iterator trees. +class Token { +public: ---------------- sammccall wrote: > As discussed offline: think we misunderstood each other about the separate > struct. > I rather meant *this* class could just be a struct. This one doesn't seem to be done - any reason to keep the class with accessors rather than just a struct for now? ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:35 + llvm::DenseSet<Token> UniqueTrigrams; + std::vector<Token> Trigrams; + ---------------- sammccall wrote: > just iterate at the end and collect them? order shouldn't matter here. this was not done (now also in `generateQueryTrigrams`) ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:32 +// +// Short names (1 segment with <3 characters) are currently ignored. +std::vector<Token> generateIdentifierTrigrams(std::string Identifier) { ---------------- nit: total segment length <3 characters? (e.g. u_p is also ignored) ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:38 + llvm::makeMutableArrayRef(Roles.data(), Identifier.size())); + std::transform(begin(Identifier), end(Identifier), begin(Identifier), + ::tolower); ---------------- nit: seems clearer to call `tolower` inline rather than mutating the input param - "identifier" doesn't really describe its new value ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:39 + std::transform(begin(Identifier), end(Identifier), begin(Identifier), + ::tolower); + ---------------- nit: prefer llvm::toLower, which returns `char` ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:94 + calculateRoles(Query, llvm::makeMutableArrayRef(Roles.data(), Query.size())); + std::transform(begin(Query), end(Query), begin(Query), ::tolower); + ---------------- (again, `toLower` and consider moving to Chars.push_back() call) ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.cpp:105 + + // New segment starts with Head: flush Chars. + if (Roles[I] == Head) ---------------- i.e. you don't want this ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.h:37 +/// First, given Identifier (unqualified symbol name) is segmented using +/// calculateRoles and downcasted to lowercase. After segmentation, the +/// following technique is applied for generating trigrams: for each letter or ---------------- nit: using the same logic as FuzzyMatch (callers don't know the name calculateRoles, probably) nit: lowercased (downcast seems a bit confusing) ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.h:49 +/// belongs to more than one class it is only inserted once. +std::vector<Token> generateIdentifierTrigrams(std::string Identifier); + ---------------- nit: StringRef here and below? ================ Comment at: clang-tools-extra/clangd/index/dex/Trigram.h:56 +/// segment are extracted and returned after deduplication. +std::vector<Token> generateQueryTrigrams(std::string Query); + ---------------- this appears to be too restrictive? e.g. for TUDecl you require [tud, ude, dec, ecl] I think you just want a sliding window of three successive head/tail characters. ================ Comment at: clang-tools-extra/unittests/clangd/DexIndexTests.cpp:75 + + EXPECT_EQ(generateQueryTrigrams("abc_def"), getTrigrams({"abc", "def"})); + ---------------- seems like it should be [abc, bcd, cde, def] https://reviews.llvm.org/D49591 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits