[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-20 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet closed this revision.
kadircet added a comment.

Landed in 001e88ac83b5c3a4d4f4e61480953ebcabc82b88 



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-20 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

(This is ready to land, right?)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-11 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added inline comments.



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:189
+  clang::Token Tok;
+  if (clang::Lexer::getRawToken(Loc, Tok, SM, LO))
+return false;

sammccall wrote:
> You've changed this from tokenizing the file with a cache.
> If I'm reading your benchmark spreadsheet right, this is ~3% speedup.
> 
> I'm not sure this is significant (I imagine not), but you don't actually have 
> to run the lexer here, since you already know what the string is going to be, 
> it's enough to grab the buffer pointer, check that it starts with the text, 
> check that the next character is not an identifier-continuer.
right, I actually did that at first, but it implies keeping the behaviour 
around unclean tokens broken, and there wasn't much of a win (delta was in the 
noise). I think because the expensive part is actually figuring out the fileid, 
and the lexing call can do it cheaply right now as it benefits from the single 
element cache.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-11 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added inline comments.



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:189
+  clang::Token Tok;
+  if (clang::Lexer::getRawToken(Loc, Tok, SM, LO))
+return false;

You've changed this from tokenizing the file with a cache.
If I'm reading your benchmark spreadsheet right, this is ~3% speedup.

I'm not sure this is significant (I imagine not), but you don't actually have 
to run the lexer here, since you already know what the string is going to be, 
it's enough to grab the buffer pointer, check that it starts with the text, 
check that the next character is not an identifier-continuer.



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:710
 // FIXME: Populate container information for macro references.
-MacroRefs[ID].push_back({Loc, Roles, /*Container=*/nullptr});
+// FIXME: All macro references are marked as Spelled now, but this should 
be
+// checked.

kadircet wrote:
> sammccall wrote:
> > What does a non-spelled macro reference look like?
> I don't fully understand it either (hence decided to keep it as-is), but my 
> initial guess is nested macro expansions, e.g:
> ```
> #define FOO(X) X
> #define BAR(X) FOO(X)
> 
> BAR(int x);
> ```
> 
> I suppose we assume there's a reference to FOO at expansion of BAR here 
> today. But I am not sure if `libIndex` will actually emit a macro reference 
> for `FOO` here.
oops nevermind, you're only moving this comment from elsewhere


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-11 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet added inline comments.



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:710
 // FIXME: Populate container information for macro references.
-MacroRefs[ID].push_back({Loc, Roles, /*Container=*/nullptr});
+// FIXME: All macro references are marked as Spelled now, but this should 
be
+// checked.

sammccall wrote:
> What does a non-spelled macro reference look like?
I don't fully understand it either (hence decided to keep it as-is), but my 
initial guess is nested macro expansions, e.g:
```
#define FOO(X) X
#define BAR(X) FOO(X)

BAR(int x);
```

I suppose we assume there's a reference to FOO at expansion of BAR here today. 
But I am not sure if `libIndex` will actually emit a macro reference for `FOO` 
here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-11 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet updated this revision to Diff 421855.
kadircet marked 6 inline comments as done.
kadircet edited the summary of this revision.
kadircet added a comment.



- Get rid of leftovers and update comments


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

Files:
  clang-tools-extra/clangd/index/SymbolCollector.cpp
  clang-tools-extra/clangd/index/SymbolCollector.h
  clang-tools-extra/clangd/index/SymbolID.cpp
  clang-tools-extra/clangd/index/SymbolID.h
  clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp

Index: clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
===
--- clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
+++ clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
@@ -1013,10 +1013,21 @@
   )cpp",
   "Foo::Foo" /// constructor.
 },
+{ // Unclean identifiers
+  R"cpp(
+struct Foo {};
+  )cpp",
+  R"cpp(
+$spelled[[Fo\
+o]] f{};
+  )cpp",
+  "Foo",
+},
   };
   CollectorOpts.RefFilter = RefKind::All;
   CollectorOpts.RefsInHeaders = false;
   for (const auto& T : TestCases) {
+SCOPED_TRACE(T.Header + "\n---\n" + T.Main);
 Annotations Header(T.Header);
 Annotations Main(T.Main);
 // Reset the file system.
@@ -1039,10 +1050,14 @@
 }
 const auto SpelledRefs = std::move(SpelledSlabBuilder).build(),
ImplicitRefs = std::move(ImplicitSlabBuilder).build();
-EXPECT_THAT(SpelledRefs,
-Contains(Pair(TargetID, haveRanges(SpelledRanges;
-EXPECT_THAT(ImplicitRefs,
-Contains(Pair(TargetID, haveRanges(ImplicitRanges;
+EXPECT_EQ(SpelledRanges.empty(), SpelledRefs.empty());
+EXPECT_EQ(ImplicitRanges.empty(), ImplicitRefs.empty());
+if (!SpelledRanges.empty())
+  EXPECT_THAT(SpelledRefs,
+  Contains(Pair(TargetID, haveRanges(SpelledRanges;
+if (!ImplicitRanges.empty())
+  EXPECT_THAT(ImplicitRefs,
+  Contains(Pair(TargetID, haveRanges(ImplicitRanges;
   }
 }
 
Index: clang-tools-extra/clangd/index/SymbolID.h
===
--- clang-tools-extra/clangd/index/SymbolID.h
+++ clang-tools-extra/clangd/index/SymbolID.h
@@ -14,6 +14,7 @@
 #include "llvm/Support/Error.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
+#include 
 #include 
 #include 
 
@@ -36,9 +37,7 @@
   bool operator==(const SymbolID ) const {
 return HashValue == Sym.HashValue;
   }
-  bool operator!=(const SymbolID ) const {
-return !(*this == Sym);
-  }
+  bool operator!=(const SymbolID ) const { return !(*this == Sym); }
   bool operator<(const SymbolID ) const {
 return HashValue < Sym.HashValue;
   }
@@ -60,7 +59,14 @@
   std::array HashValue{};
 };
 
-llvm::hash_code hash_value(const SymbolID );
+inline llvm::hash_code hash_value(const SymbolID ) {
+  // We already have a good hash, just return the first bytes.
+  static_assert(sizeof(size_t) <= SymbolID::RawSize,
+"size_t longer than SHA1!");
+  size_t Result;
+  memcpy(, ID.raw().data(), sizeof(size_t));
+  return llvm::hash_code(Result);
+}
 
 // Write SymbolID into the given stream. SymbolID is encoded as ID.str().
 llvm::raw_ostream <<(llvm::raw_ostream , const SymbolID );
Index: clang-tools-extra/clangd/index/SymbolID.cpp
===
--- clang-tools-extra/clangd/index/SymbolID.cpp
+++ clang-tools-extra/clangd/index/SymbolID.cpp
@@ -46,14 +46,5 @@
   return OS << llvm::toHex(ID.raw());
 }
 
-llvm::hash_code hash_value(const SymbolID ) {
-  // We already have a good hash, just return the first bytes.
-  static_assert(sizeof(size_t) <= SymbolID::RawSize,
-"size_t longer than SHA1!");
-  size_t Result;
-  memcpy(, ID.raw().data(), sizeof(size_t));
-  return llvm::hash_code(Result);
-}
-
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/SymbolCollector.h
===
--- clang-tools-extra/clangd/index/SymbolCollector.h
+++ clang-tools-extra/clangd/index/SymbolCollector.h
@@ -8,11 +8,12 @@
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H
 
-#include "index/CanonicalIncludes.h"
 #include "CollectMacros.h"
+#include "index/CanonicalIncludes.h"
 #include "index/Ref.h"
 #include "index/Relation.h"
 #include "index/Symbol.h"
+#include "index/SymbolID.h"
 #include "index/SymbolOrigin.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Decl.h"
@@ -21,6 +22,7 @@
 #include "clang/Index/IndexDataConsumer.h"
 #include "clang/Index/IndexSymbol.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include 
 
@@ -142,6 +144,10 @@
 
   

[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-11 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet updated this revision to Diff 421847.
kadircet added a comment.

- Address comments and more cleanups


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

Files:
  clang-tools-extra/clangd/index/SymbolCollector.cpp
  clang-tools-extra/clangd/index/SymbolCollector.h
  clang-tools-extra/clangd/index/SymbolID.cpp
  clang-tools-extra/clangd/index/SymbolID.h
  clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp

Index: clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
===
--- clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
+++ clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
@@ -1013,10 +1013,21 @@
   )cpp",
   "Foo::Foo" /// constructor.
 },
+{ // Unclean identifiers
+  R"cpp(
+struct Foo {};
+  )cpp",
+  R"cpp(
+$spelled[[Fo\
+o]] f{};
+  )cpp",
+  "Foo",
+},
   };
   CollectorOpts.RefFilter = RefKind::All;
   CollectorOpts.RefsInHeaders = false;
   for (const auto& T : TestCases) {
+SCOPED_TRACE(T.Header + "\n---\n" + T.Main);
 Annotations Header(T.Header);
 Annotations Main(T.Main);
 // Reset the file system.
@@ -1039,10 +1050,14 @@
 }
 const auto SpelledRefs = std::move(SpelledSlabBuilder).build(),
ImplicitRefs = std::move(ImplicitSlabBuilder).build();
-EXPECT_THAT(SpelledRefs,
-Contains(Pair(TargetID, haveRanges(SpelledRanges;
-EXPECT_THAT(ImplicitRefs,
-Contains(Pair(TargetID, haveRanges(ImplicitRanges;
+EXPECT_EQ(SpelledRanges.empty(), SpelledRefs.empty());
+EXPECT_EQ(ImplicitRanges.empty(), ImplicitRefs.empty());
+if (!SpelledRanges.empty())
+  EXPECT_THAT(SpelledRefs,
+  Contains(Pair(TargetID, haveRanges(SpelledRanges;
+if (!ImplicitRanges.empty())
+  EXPECT_THAT(ImplicitRefs,
+  Contains(Pair(TargetID, haveRanges(ImplicitRanges;
   }
 }
 
Index: clang-tools-extra/clangd/index/SymbolID.h
===
--- clang-tools-extra/clangd/index/SymbolID.h
+++ clang-tools-extra/clangd/index/SymbolID.h
@@ -14,6 +14,7 @@
 #include "llvm/Support/Error.h"
 #include "llvm/Support/raw_ostream.h"
 #include 
+#include 
 #include 
 #include 
 
@@ -36,9 +37,7 @@
   bool operator==(const SymbolID ) const {
 return HashValue == Sym.HashValue;
   }
-  bool operator!=(const SymbolID ) const {
-return !(*this == Sym);
-  }
+  bool operator!=(const SymbolID ) const { return !(*this == Sym); }
   bool operator<(const SymbolID ) const {
 return HashValue < Sym.HashValue;
   }
@@ -60,7 +59,14 @@
   std::array HashValue{};
 };
 
-llvm::hash_code hash_value(const SymbolID );
+inline llvm::hash_code hash_value(const SymbolID ) {
+  // We already have a good hash, just return the first bytes.
+  static_assert(sizeof(size_t) <= SymbolID::RawSize,
+"size_t longer than SHA1!");
+  size_t Result;
+  memcpy(, ID.raw().data(), sizeof(size_t));
+  return llvm::hash_code(Result);
+}
 
 // Write SymbolID into the given stream. SymbolID is encoded as ID.str().
 llvm::raw_ostream <<(llvm::raw_ostream , const SymbolID );
Index: clang-tools-extra/clangd/index/SymbolID.cpp
===
--- clang-tools-extra/clangd/index/SymbolID.cpp
+++ clang-tools-extra/clangd/index/SymbolID.cpp
@@ -20,8 +20,7 @@
 }
 
 llvm::StringRef SymbolID::raw() const {
-  return llvm::StringRef(reinterpret_cast(HashValue.data()),
- RawSize);
+  return llvm::StringRef(reinterpret_cast(), RawSize);
 }
 
 SymbolID SymbolID::fromRaw(llvm::StringRef Raw) {
@@ -46,14 +45,5 @@
   return OS << llvm::toHex(ID.raw());
 }
 
-llvm::hash_code hash_value(const SymbolID ) {
-  // We already have a good hash, just return the first bytes.
-  static_assert(sizeof(size_t) <= SymbolID::RawSize,
-"size_t longer than SHA1!");
-  size_t Result;
-  memcpy(, ID.raw().data(), sizeof(size_t));
-  return llvm::hash_code(Result);
-}
-
 } // namespace clangd
 } // namespace clang
Index: clang-tools-extra/clangd/index/SymbolCollector.h
===
--- clang-tools-extra/clangd/index/SymbolCollector.h
+++ clang-tools-extra/clangd/index/SymbolCollector.h
@@ -13,6 +13,7 @@
 #include "index/Ref.h"
 #include "index/Relation.h"
 #include "index/Symbol.h"
+#include "index/SymbolID.h"
 #include "index/SymbolOrigin.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Decl.h"
@@ -21,6 +22,8 @@
 #include "clang/Index/IndexDataConsumer.h"
 #include "clang/Index/IndexSymbol.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
+#include "clang/Tooling/Syntax/Tokens.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include 
 
@@ -142,6 +145,10 @@
 
   llvm::Optional 

[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-07 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Neat!

I think you've broken the FilesToTokenCache by moving it into the loop, so I'd 
expect further wins from fixing that.
(If you don't see any, something seems wrong: syntax::tokenize should be called 
a fair bit and should be pretty expensive!)




Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:589
+if (Opts.RefsInHeaders || FID == SM.getMainFileID()) {
+  // FIXME: It's better to use TokenBuffer by passing spelled tokens from
+  // the caller of SymbolCollector.

there's a big block of code here that's checking if the reference was spelled 
or not, pull out a function?



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:591
+  // the caller of SymbolCollector.
+  llvm::DenseMap> FilesToTokensCache;
+  if (!FilesToTokensCache.count(FID))

Um, is this cache meant to be a member? It's pretty well guaranteed to be empty 
on the next line :-)



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:710
 // FIXME: Populate container information for macro references.
-MacroRefs[ID].push_back({Loc, Roles, /*Container=*/nullptr});
+// FIXME: All macro references are marked as Spelled now, but this should 
be
+// checked.

What does a non-spelled macro reference look like?



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:810
   // At the end of the TU, add 1 to the refcount of all referenced symbols.
   auto IncRef = [this](const SymbolID ) {
 if (const auto *S = Symbols.find(ID)) {

no need for this to be a lambda anymore, a plain loop seems fine



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:814
   ++Inc.References;
   Symbols.insert(Inc);
 }

if you have timing set up, try making this 
`const_cast(S)->References++`.

Reinserting into a symbol slab is pretty expensive I think, and this is just an 
awkward gap in the API.
We could fix the API or live with const_cast if it matters.



Comment at: clang-tools-extra/clangd/index/SymbolCollector.cpp:964
+SymbolID SymbolCollector::getSymbolIDCached(const Decl *D) {
+  auto It = DeclToIDCache.try_emplace(D, SymbolID{});
+  if (It.second)

nit: just try_emplace(D). The rest of the arguments get forwarded to the V 
constructor, so you're calling an unneccesary move constructor here.

(Shouldn't matter here because SymbolID is trivial, but it can)



Comment at: clang-tools-extra/clangd/index/SymbolCollector.h:174
   // Symbols referenced from the current TU, flushed on finish().
-  llvm::DenseSet ReferencedDecls;
-  llvm::DenseSet ReferencedMacros;
-  llvm::DenseMap> DeclRefs;
-  llvm::DenseMap> MacroRefs;
+  llvm::DenseSet ReferencedSymbols;
   // Maps canonical declaration provided by clang to canonical declaration for

hash_code(SymbolID) is not defined in the header, but the hash function is 
trivial and could be inlined everywhere. Might be worth exposing.

(not really related to this patch, but you have some setup for benchmarking at 
the moment...)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123289/new/

https://reviews.llvm.org/D123289

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123289: [clangd][SymbolCollector] Introduce a cache for SymbolID generation and some cleanups

2022-04-07 Thread Kadir Cetinkaya via Phabricator via cfe-commits
kadircet created this revision.
Herald added subscribers: usaxena95, arphaman.
Herald added a project: All.
kadircet requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay, ilya-biryukov.
Herald added a project: clang-tools-extra.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D123289

Files:
  clang-tools-extra/clangd/index/SymbolCollector.cpp
  clang-tools-extra/clangd/index/SymbolCollector.h

Index: clang-tools-extra/clangd/index/SymbolCollector.h
===
--- clang-tools-extra/clangd/index/SymbolCollector.h
+++ clang-tools-extra/clangd/index/SymbolCollector.h
@@ -8,18 +8,21 @@
 #ifndef LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H
 #define LLVM_CLANG_TOOLS_EXTRA_CLANGD_INDEX_SYMBOLCOLLECTOR_H
 
-#include "index/CanonicalIncludes.h"
 #include "CollectMacros.h"
+#include "index/CanonicalIncludes.h"
 #include "index/Ref.h"
 #include "index/Relation.h"
 #include "index/Symbol.h"
+#include "index/SymbolID.h"
 #include "index/SymbolOrigin.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Decl.h"
+#include "clang/AST/DeclBase.h"
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Index/IndexDataConsumer.h"
 #include "clang/Index/IndexSymbol.h"
+#include "clang/Lex/MacroInfo.h"
 #include "clang/Sema/CodeCompleteConsumer.h"
 #include "llvm/ADT/DenseMap.h"
 #include 
@@ -141,6 +144,11 @@
   llvm::Optional getTokenLocation(SourceLocation TokLoc);
 
   llvm::Optional getIncludeHeader(const Symbol , FileID);
+  SymbolID getSymbolIDCached(const Decl *D);
+  SymbolID getSymbolIDCached(const llvm::StringRef MacroName,
+ const MacroInfo *MI, const SourceManager );
+  void addRef(SymbolID ID, SourceLocation RefLoc, FileID FID,
+  index::SymbolRoleSet Roles, const Decl *Container, bool Spelled);
 
   // All Symbols collected from the AST.
   SymbolSlab::Builder Symbols;
@@ -162,16 +170,8 @@
   std::shared_ptr CompletionAllocator;
   std::unique_ptr CompletionTUInfo;
   Options Opts;
-  struct SymbolRef {
-SourceLocation Loc;
-index::SymbolRoleSet Roles;
-const Decl *Container;
-  };
   // Symbols referenced from the current TU, flushed on finish().
-  llvm::DenseSet ReferencedDecls;
-  llvm::DenseSet ReferencedMacros;
-  llvm::DenseMap> DeclRefs;
-  llvm::DenseMap> MacroRefs;
+  llvm::DenseSet ReferencedSymbols;
   // Maps canonical declaration provided by clang to canonical declaration for
   // an index symbol, if clangd prefers a different declaration than that
   // provided by clang. For example, friend declaration might be considered
@@ -184,6 +184,8 @@
   // to insert for which symbol, etc.
   class HeaderFileURICache;
   std::unique_ptr HeaderFileURIs;
+  llvm::DenseMap DeclToIDCache;
+  llvm::DenseMap MacroToIDCache;
 };
 
 } // namespace clangd
Index: clang-tools-extra/clangd/index/SymbolCollector.cpp
===
--- clang-tools-extra/clangd/index/SymbolCollector.cpp
+++ clang-tools-extra/clangd/index/SymbolCollector.cpp
@@ -14,6 +14,7 @@
 #include "SourceCode.h"
 #include "URI.h"
 #include "index/CanonicalIncludes.h"
+#include "index/Ref.h"
 #include "index/Relation.h"
 #include "index/SymbolID.h"
 #include "index/SymbolLocation.h"
@@ -543,6 +544,9 @@
   const NamedDecl *ND = dyn_cast(D);
   if (!ND)
 return true;
+  auto ID = getSymbolIDCached(ND);
+  if (!ID)
+return true;
 
   // Mark D as referenced if this is a reference coming from the main file.
   // D may not be an interesting symbol, but it's cheaper to check at the end.
@@ -550,11 +554,7 @@
   if (Opts.CountReferences &&
   (Roles & static_cast(index::SymbolRole::Reference)) &&
   SM.getFileID(SM.getSpellingLoc(Loc)) == SM.getMainFileID())
-ReferencedDecls.insert(ND);
-
-  auto ID = getSymbolID(ND);
-  if (!ID)
-return true;
+ReferencedSymbols.insert(ID);
 
   // ND is the canonical (i.e. first) declaration. If it's in the main file
   // (which is not a header), then no public declaration was visible, so assume
@@ -575,13 +575,6 @@
   processRelations(*ND, ID, Relations);
 
   bool CollectRef = static_cast(Opts.RefFilter & toRefKind(Roles));
-  bool IsOnlyRef =
-  !(Roles & (static_cast(index::SymbolRole::Declaration) |
- static_cast(index::SymbolRole::Definition)));
-
-  if (IsOnlyRef && !CollectRef)
-return true;
-
   // Unlike other fields, e.g. Symbols (which use spelling locations), we use
   // file locations for references (as it aligns the behavior of clangd's
   // AST-based xref).
@@ -589,13 +582,33 @@
   if (CollectRef &&
   (!IsMainFileOnly || Opts.CollectMainFileRefs ||
ND->isExternallyVisible()) &&
-  !isa(ND) &&
-  (Opts.RefsInHeaders ||
-   SM.getFileID(SM.getFileLoc(Loc)) == SM.getMainFileID()))
-DeclRefs[ND].push_back(SymbolRef{SM.getFileLoc(Loc), Roles,
-