hokein marked 2 inline comments as done.
hokein added inline comments.

================
Comment at: clangd/index/CanonicalIncludes.cpp:47
+  // headers (e.g. std::move from <utility> and <algorithm>), using
+  // qualified name can not disambiguate headers. Instead we should return all
+  // headers and do the disambiguation in clangd side.
----------------
sammccall wrote:
> Looking at the actual data, I'm not sure this is the right strategy. (and 
> it's not clear what "the clangd side" is :-)
> 
> The ambiguous cases seem to break down into:
>  - cases where we should just pick one, e.g. cstdlib vs cmath for std::abs
>  - cases where a function is overloaded for a particular type (e.g. complex), 
> where passing more information in here (strawman: "the signature as a 
> string") would let us get this correct. Alternatively (and easier), I believe 
> always using the "primary" version of this function is going to be correct if 
> not *strictly* iwyu, as the type comes from the same header. So In the short 
> term I'd suggest just blacklisting the variants.
>  - cases like std::move and... maybe one or two more? In the short term 
> always returning `<utility>` seems good enough. In the long term, passing in 
> the signature again would let us disambiguate here.
> Looking at the actual data, I'm not sure this is the right strategy. (and 
> it's not clear what "the clangd side" is :-)

`clangd side` I mean the consumer side.

> cases where we should just pick one, e.g. cstdlib vs cmath for std::abs

For these cases, I think we can address them in generator side by using a 
hardcoded whitelist.

> cases where a function is overloaded for a particular type (e.g. complex), 
> where passing more information in here (strawman: "the signature as a 
> string") would let us get this correct. Alternatively (and easier), I believe 
> always using the "primary" version of this function is going to be correct if 
> not *strictly* iwyu, as the type comes from the same header. So In the short 
> term I'd suggest just blacklisting the variants.

Using primary version seems promising, and our indexer doesn't index the 
template specification symbols.

> cases like std::move and... maybe one or two more? In the short term always 
> returning <utility> seems good enough. In the long term, passing in the 
> signature again would let us disambiguate here.

For std::move this particular case, I don't think using signature (as a string) 
really helps to disambiguate, since both of them are templates, the signature 
string doesn't contain any word specific to the header names 
<utility>/<algorithm>. The best way to disambiguate them is to use the number 
of parameters IMO.

I'm +1 on returning `utility`, since `std::move` in `utility` is more 
widely-used.


Or another way to disambiguate: use symbol name => header for symbols with 
unique headers; otherwise fallback to use header  mapping.


================
Comment at: clangd/index/CanonicalIncludes.cpp:123
 
   static const std::vector<std::pair<const char *, const char *>>
       SystemHeaderMap = {
----------------
sammccall wrote:
> hokein wrote:
> > ioeric wrote:
> > > Can we remove the suffix header mapping now? Is it for the 
> > > `std::chrono::` symbols? What are the reasons not to include them in this 
> > > patch? 
> > Ideally, we could remove it once we get a perfect symbol mapping but I'd 
> > prefer to keep it (it serves as a fallback when we don't find symbols in 
> > the symbol mapping), so that we could add new symbol mapping increasingly 
> > without changing the current behavior.
> > 
> > Removing it should be done carefully without introducing regression, and is 
> > outside the scope of this patch.
> We do need fallback heuristics. We should conisder cases:
>  - for known `std::` symbols mapped to one system header, we don't need a 
> fallback
>  - for known `std::` mapped to multiple system headers (`std::move`), we need 
> some fallback. There are probably few enough of these that it doesn't justify 
> listing *all* system header paths.
>  - for known `std::` symbols with a primary template in one file and 
> specializations/overloads in another (swap?) always inserting the primary 
> seems fine 
>  - For "random" unknown symbols from system header `foo/bits/_bar.h`, we can 
> not insert, correct to `<bar>`, or use the full path name. I don't have a 
> strong preference here, maybe we should do what's easiest.
>  - for c standard library (`::printf` --> `<stdio.h>` etc) we probably want 
> mappings just like in this patch, I think cppreference has them.
>  - other "standardish" headers (POSIX etc) - hmm, unclear.
> 
> Thinking about all these cases, I'm thinking the nicest solution would be 
> having the symbol -> header mapping, having a small (symbol,header) -> header 
> mapping **only** for ambiguous cases, and not inserting "system" headers that 
> aren't in the main mapping at all. Then we can improve coverage of posix, 
> windows etc headers over time.
> Question is, how can we recognize "system" headers?
I think we were talking about C++ std symbols.

> for known std:: symbols mapped to one system header, we don't need a fallback
> for known std:: mapped to multiple system headers (std::move), we need some 
> fallback. There are probably few enough of these that it doesn't justify 
> listing *all* system header paths.
> for known std:: symbols with a primary template in one file and 
> specializations/overloads in another (swap?) always inserting the primary 
> seems fine

As discussed in this patch, we have other ways to disambiguate these symbols 
(rather than relying on the header mapping), so it is possible to remove STL 
headers mapping, but we'll lose correct headers for STL implement-related 
symbols (e.g. `__hash_code`), ideally we should drop these symbols in the 
indexer...

> for c standard library (::printf --> <stdio.h> etc) we probably want mappings 
> just like in this patch, I think cppreference has them.

Yes, cppreference has them.

> other "standardish" headers (POSIX etc) - hmm, unclear.

I think we should still keep the header mapping. Not sure whether there are 
some sources like cppreference that list all symbols.




Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58345/new/

https://reviews.llvm.org/D58345



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to