kadircet added inline comments.

================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:359
+    TransformedInc.Angled = WrittenRef.starts_with("<");
+    auto FE = SM.getFileManager().getFile(Inc.Resolved);
+    if (!FE)
----------------
unfortunately `getFile` returns an `llvm::Expected` which requires explicit 
error handling (or it'll trigger a crash). you can simply `elog` the issue:
```
if (!FE) {
  elog("IncludeCleaner: Failed to get an entry for resolved path {0}: {1}", 
Inc.Resolved, FE.takeError());
  continue;
}
```


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:519
+std::vector<include_cleaner::SymbolReference>
+collectMacroReferences(ParsedAST &AST) {
+  const auto &SM = AST.getSourceManager();
----------------
this should also be either static or put into anon namespace


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:554
+
+std::vector<Diag> generateMissingIncludeDiagnostics(
+    ParsedAST &AST, llvm::SmallVector<MissingIncludeDiagInfo> MissingIncludes,
----------------
`generateMissingIncludeDiagnostics` should also be either static or put into 
anon namespace


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:555
+std::vector<Diag> generateMissingIncludeDiagnostics(
+    ParsedAST &AST, llvm::SmallVector<MissingIncludeDiagInfo> MissingIncludes,
+    llvm::StringRef Code) {
----------------
i think it's better to use an `llvm::ArrayRef` instead of `llvm::SmallVector` 
for diag infos here, we don't need to copy.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:563
+  for (const auto &SymbolWithMissingInclude : MissingIncludes) {
+    std::string Spelling =
+        spellHeader(AST, MainFile, SymbolWithMissingInclude.Providers.front());
----------------
we've got `Config::Diagnostics::Includes::IgnoreHeader` that disables 
include-cleaner analysis on headers that match a pattern. we should be 
respecting those config options here too.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:582
+        HeaderRef.trim("\"<>"), Angled, tooling::IncludeDirective::Include);
+    if (!Replacement.has_value())
       continue;
----------------
could you add a comment here, as this is subtle, something like `We might 
suggest insertion of an existing include in edge cases, e.g. include is present 
in a PP-disabled region, or spelling of the header turns out to be the same as 
one of the unresolved includes in the main file`


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:589
+    D.InsideMainFile = true;
+    Result.push_back(D);
+  }
----------------
nit: `Result.push_back(std::move(D))`, as `D` has lots of strings in it, it 
might be expensive to copy it around. Even better if you construct the Diag in 
place via `Diag &D = Result.emplace_back()`, you can achieve this by moving all 
the logic that might bail out (e.g. replacement generation) to the top of the 
loop, and start forming the diagnostic only after we're sure that there'll be 
one.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:595
+std::vector<Diag>
+generateUnusedIncludeDiagnostics(PathRef FileName,
+                                 std::vector<const Inclusion *> UnusedIncludes,
----------------
this also needs to be static or put into anon namespace


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:596
+generateUnusedIncludeDiagnostics(PathRef FileName,
+                                 std::vector<const Inclusion *> UnusedIncludes,
+                                 llvm::StringRef Code) {
----------------
no need to copy the vector by taking a `std::vector` here, you can take an 
`llvm::ArrayRef` instead.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:623
     D.InsideMainFile = true;
-    Result.push_back(std::move(D));
+    Result.push_back(D);
+  }
----------------
can you restore `std::move`?


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:632
+  include_cleaner::Includes ConvertedIncludes =
+      convertIncludes(AST.getSourceManager(), Includes.MainFileIncludes);
+  const FileEntry *MainFile = SM.getFileEntryForID(SM.getMainFileID());
----------------
s/AST.getSourceManager()/SM


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:639
+  llvm::DenseSet<IncludeStructure::HeaderID> Used;
+  include_cleaner::walkUsed(
+      AST.getLocalTopLevelDecls(), /*MacroRefs=*/Macros,
----------------
can you introduce a `trace::Span` wrapping the call to `walkUsed` with name 
`IncludeCleanerAnalysis` so that we can collect some stats about latency here?


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:651
+          }
+          for (auto *&Inc : ConvertedIncludes.match(H)) {
+            Satisfied = true;
----------------
nit: drop either `*` or `&` (preferably `&`), having a reference vs a pointer 
doesn't make any differences performance wise, but creates a confusion (as we 
don't realy need a reference to a pointer here)


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:661
+        const syntax::TokenBuffer &Tokens = AST.getTokens();
+        if (!Satisfied && !Providers.empty() &&
+            Ref.RT == include_cleaner::RefType::Explicit) {
----------------
nit: we prefer `early exits` to extra `nesting`, e.g. rewriting this as:
```
if (Satisfied || Providers.empty() || Ref.RT != Explicit)
  continue;
const auto &TB = AST.getTokens();
auto SpelledTokens = TB.spelledForExpanded(...);
if (!SpelledTokens)
  continue;
...
```

increases readability by:
- reducing the nesting
- making it more explicit about under what assumptions the rest of the code is 
working


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:665
+                  Tokens.expandedTokens(Ref.RefLocation))) {
+            syntax::FileRange Range = syntax::Token::range(
+                AST.getSourceManager(), (*SpelledForExpanded)[0],
----------------
nit: `auto Range = syntax::Token::range(SM, SpelledForExpanded->front(), 
SpelledForExpanded->back());`


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:668-671
+            llvm::SmallVector<include_cleaner::Header> ProviderHeaders;
+            for (const auto H : Providers) {
+              ProviderHeaders.push_back(H);
+            }
----------------
you don't need to explicitly copy `Providers` into `ProviderHeaders`, you can 
pass it directly to `DiagInfo` below.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:680
+              SymbolName =
+                  static_cast<const NamedDecl &>(Ref.Target.declaration())
+                      .getQualifiedNameAsString();
----------------
we use llvm casts, specifically 
`llvm::dyn_cast<NamedDecl*>(&Ref.Target.declaration())->getQualifiedNameAsString()`


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:681
+                  static_cast<const NamedDecl &>(Ref.Target.declaration())
+                      .getQualifiedNameAsString();
+              break;
----------------
`getQualifiedNameAsString` is going to print names that are really ugly at 
certain times, but unfortunately that's a problem we don't have a great 
solution to. so no action needed ATM, but we might want to switch between 
qualified and unqualified name depending on the length at the very least (e.g. 
symbol is coming from a templated class, which has a nasty nested 
instantiation).


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:685
+            MissingIncludeDiagInfo DiagInfo{SymbolName, Range, 
ProviderHeaders};
+            MissingIncludes.push_back(std::move(DiagInfo));
+          }
----------------
nit: `MissingIncludes.emplace_back(std::move(SymbolName), Range, Providers);`


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:691
+      getUnused(AST, Used, /*ReferencedPublicHeaders*/ {});
+  return {UnusedIncludes, MissingIncludes};
+}
----------------
nit: you'd want `std::move`s here, around both of them


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:718
+
+  if (!Cfg.Diagnostics.Suppress.contains("unused-includes")) {
+    switch (Cfg.Diagnostics.UnusedIncludes) {
----------------
nit: it might be worth re-writing the following section as:
```
std::vector<Diag> Result = generateUnusedIncludeDiagnostics(AST.tuPath(),
                Cfg.Diagnostics.UnusedIncludes == Strict ? 
computeUnusedIncludes(AST) : Findings.UnusedIncludes, Code);
llvm::move(generateMissingIncludeDiagnostics(AST, MissingIncludes, Code),
             std::back_inserter(Result));
```

and move the checks like `if (Cfg.Diagnostics.MissingIncludes == 
Config::IncludesPolicy::Strict && 
!Cfg.Diagnostics.Suppress.contains("missing-includes"))` into the specific 
function, e.g. `generateUnusedIncludeDiagnostics`, as they already do some of 
the diagnostic filtering logic.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:523
+    TransformedInc.Angled = WrittenRef.starts_with("<");
+    if (auto FE = SM.getFileManager().getFile(Inc.Resolved))
+      TransformedInc.Resolved = *FE;
----------------
VitaNuo wrote:
> kadircet wrote:
> > i don't think we should convert any unresolved includes. it usually means 
> > header wasn't found, so we can't perform any reliable analysis on them. 
> > anything i am missing?
> Ok I will skip unresolved includes. But I am not sure I fully understand. We 
> do the following:
> 1. Convert clangd includes to include-cleaner includes.
> 2. Match include-cleaner includes with symbol providers.
> 3. If match found, symbol reference is satisfied.
> 
> How does it matter in this scenario if the include is resolved? AFAIU as long 
> as the header is spelled in the main file + it's matched with a symbol 
> provider, we should say that the symbol reference is satisfied.
> 
> Otherwise, it seems like we'll say that the header is missing, although it's 
> there in the main file and unresolved.
> 
> I don't know if this is in any way a realistic scenario. I am just 
> approaching it with general logic, and in this sense having more "satisfied" 
> symbols seems better than having less => leads to less false positives. It 
> can lead to false negatives, too, but AFAIU false negatives are much less of 
> a risk for missing include management. 
> How does it matter in this scenario if the include is resolved? AFAIU as long 
> as the header is spelled in the main file + it's matched with a symbol 
> provider, we should say that the symbol reference is satisfied.

It doesn't matter at a high-level view. But since the implementation recognizes 
headers based on the HeaderID and it's only defined for resolved includes, if 
we were to have any unresolved include matches somehow (e.g. it has spelling 
"foo/bar.h", but is unresolved, and we do a match based on spelling because 
some IWYU pragma pointed at this header), we would hit the assertion around 
HeaderID always having value.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:535
+  Diag D;
+  D.Message = llvm::formatv("header {0} is missing", Spelling);
+  D.Name = "missing-includes";
----------------
VitaNuo wrote:
> kadircet wrote:
> > i think the symbol name should also be part of the diagnostic. as editors 
> > can show these diagnostics without context (e.g. you've got a big file 
> > open, there's a diagnostic panel displaying messages/fixes for all of the 
> > file. you should be able to reason about the diagnostics and fixes without 
> > jumping all over the file). so maybe something like:
> > 
> > `formatv("{0} providing '{1}' is not directly included", Header, Symbol)`
> Sure. The new design does this, as well as skipping the header name.
i feel like there's actually value in keeping the header name around, i.e. the 
user will have some idea about the action, without triggering an extra 
interaction. this helps especially in cases where the finding is wrong, they'll 
discover this sooner, hence we'll be more likely to receive bug reports. but 
don't really have a strong preference, so feel free to keep it that way.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.cpp:549
+  bool Angled = HeaderRef.starts_with("<");
+  std::optional<tooling::Replacement> Replacement = HeaderIncludes.insert(
+      HeaderRef.trim("\"<>"), Angled, tooling::IncludeDirective::Include);
----------------
VitaNuo wrote:
> kadircet wrote:
> > this returns nullopt only if Header is already included in the main file. 
> > our analysis should never suggest such a header for include, unless the 
> > include is coming from a PP-disabled region.
> > 
> > so i think if Replacement generation fails, we should drop the diagnostic 
> > completely rather than just dropping the fix. WDYT?
> Ok, sure. What's a PP-disabled region? Are you talking about #ifdef's and 
> such?
> What's a PP-disabled region

Yes, I was trying to say "preprocessor disabled region". e.g. in a piece of 
code like:
```
#if 0
#include "foo.h"
#endif
```

preprocessor won't actually trigger inclusion of "foo.h", but most of the 
heuristic parsers (most importantly the logic in `HeaderIncludes`) will treat 
this include as usual.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.h:41
+struct MissingIncludeDiagInfo {
+  std::string SymbolName;
+  syntax::FileRange SymRefRange;
----------------
it seems unfortunate that we're duplicating these strings for each diag we want 
to emit. it might be better to just store a Symbol here (similar to Header) and 
delay spelling until needed.


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.h:46
+  bool operator==(const MissingIncludeDiagInfo &Other) const {
+    return SymbolName == Other.SymbolName && SymRefRange == Other.SymRefRange 
&&
+           Providers == Other.Providers;
----------------
nit: `std::tie(SymbolName, SymRefRange, Providers) == 
std::tie(Other.SymbolName, ...);`

I'd also put the SymbolName match to be the last, as it's a string match and 
might be more costly (if we can bail out early)


================
Comment at: clang-tools-extra/clangd/IncludeCleaner.h:53
+  std::vector<const Inclusion *> UnusedIncludes;
+  llvm::SmallVector<MissingIncludeDiagInfo> MissingIncludes;
+};
----------------
I don't think we've much to gain by using SmallVector here, instead of 
std::vector


================
Comment at: clang-tools-extra/clangd/unittests/IncludeCleanerTests.cpp:427
+
+  IncludeCleanerFindings Findings = computeIncludeCleanerFindings(AST);
+  const SourceManager &SM = AST.getSourceManager();
----------------
i don't think there's much value in testing out analysis here, we should rather 
focus on diagnostics generation, which isn't part of 
`computeIncludeCleanerFindings`. existing tests were focused on analysis, 
because legacy implementation for include-cleaner was actually performing these 
analysis itself.

so I'd rather suggest having trivial test cases (from include-cleaner analysis 
perspective, no need for complicated directory/file layouts) and rather test 
things out through calls to `generateMissingIncludeDiagnostics` to make sure 
diagnostics has the right ranges, text and fix contents.

right now we're not testing:
- header spelling
- symbol name generation
- ranges these diagnostics correspond to

and these are the main functionality we're adding on top of include-cleaner 
analysis.

you can take a look at the tests in 
llvm/llvm-project/clang-tools-extra/clangd/unittests/DiagnosticsTests.cpp to 
see how we're testing out diagnostics ranges, messages, fixes and what kind of 
helpers/matchers we have for them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143496/new/

https://reviews.llvm.org/D143496

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to