Re: [PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-17 Thread David Blaikie via cfe-commits
On Tue, Jan 17, 2017 at 12:57 AM Manuel Klimek  wrote:

> It's by design. Do we want to change this? I always had the impression
> turning on the state changes for the list is a bit noisy, but I'm happy if
> that's not the general sentiment.
>
For approval it's important that that reaches the mailing list so it's
accounted for/visible. Otherwise it looks like code was sent for review and
then committed without review being performed which is not generally
accepted.

- Dave


>
> On Mon, Jan 16, 2017, 11:07 PM Benjamin Kramer 
> wrote:
>
> I got an email where it says that I accepted the revision. Looks like
> phab didn't add cfe-commits to the list of recipients though :(
>
> On Mon, Jan 16, 2017 at 6:43 PM, David Blaikie  wrote:
> > Looks like Ben signed off on this on Phab - but the email didn't go to
> the
> > list (making this look like code was sent for review, then committed,
> > without review/approval happening)
> >
> > Ben: I think Phab doesn't send mail for an approval with no text, so at
> > least as a workaround you can write something in the comments section
> when
> > approving (people often write "LGTM" or similar) to ensure the approval
> is
> > reflected on the mailing list.
> >
> > On Wed, Jan 11, 2017 at 1:59 AM Manuel Klimek via Phabricator via
> > cfe-commits  wrote:
> >>
> >> klimek created this revision.
> >> klimek added a reviewer: bkramer.
> >> klimek added a subscriber: cfe-commits.
> >>
> >> Instead of just using popularity, we also take into account how similar
> >> the
> >> path of the current file is to the path of the header.
> >> Our first approach is to get popularity into a reasonably small scale by
> >> taking
> >> log2 (which is roughly intuitive to how humans would bucket popularity),
> >> and
> >> multiply that with the number of matching prefix path fragments of the
> >> included
> >> header with the current file.
> >> Note that currently we do not take special care for unclean paths
> >> containing
> >> "../" or "./".
> >>
> >>
> >> https://reviews.llvm.org/D28548
> >>
> >> Files:
> >>   include-fixer/IncludeFixer.cpp
> >>   include-fixer/SymbolIndexManager.cpp
> >>   include-fixer/SymbolIndexManager.h
> >>   include-fixer/tool/ClangIncludeFixer.cpp
> >>   test/include-fixer/Inputs/fake_yaml_db.yaml
> >>   test/include-fixer/ranking.cpp
> >>
> >> ___
> >> cfe-commits mailing list
> >> cfe-commits@lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-17 Thread Benjamin Kramer via cfe-commits
If we add all state transitions it will only create noise. I will
teach myself to always write "lg" into the text field when approving a
change instead.

On Tue, Jan 17, 2017 at 9:57 AM, Manuel Klimek via cfe-commits
 wrote:
> It's by design. Do we want to change this? I always had the impression
> turning on the state changes for the list is a bit noisy, but I'm happy if
> that's not the general sentiment.
>
>
> On Mon, Jan 16, 2017, 11:07 PM Benjamin Kramer  wrote:
>>
>> I got an email where it says that I accepted the revision. Looks like
>> phab didn't add cfe-commits to the list of recipients though :(
>>
>> On Mon, Jan 16, 2017 at 6:43 PM, David Blaikie  wrote:
>> > Looks like Ben signed off on this on Phab - but the email didn't go to
>> > the
>> > list (making this look like code was sent for review, then committed,
>> > without review/approval happening)
>> >
>> > Ben: I think Phab doesn't send mail for an approval with no text, so at
>> > least as a workaround you can write something in the comments section
>> > when
>> > approving (people often write "LGTM" or similar) to ensure the approval
>> > is
>> > reflected on the mailing list.
>> >
>> > On Wed, Jan 11, 2017 at 1:59 AM Manuel Klimek via Phabricator via
>> > cfe-commits  wrote:
>> >>
>> >> klimek created this revision.
>> >> klimek added a reviewer: bkramer.
>> >> klimek added a subscriber: cfe-commits.
>> >>
>> >> Instead of just using popularity, we also take into account how similar
>> >> the
>> >> path of the current file is to the path of the header.
>> >> Our first approach is to get popularity into a reasonably small scale
>> >> by
>> >> taking
>> >> log2 (which is roughly intuitive to how humans would bucket
>> >> popularity),
>> >> and
>> >> multiply that with the number of matching prefix path fragments of the
>> >> included
>> >> header with the current file.
>> >> Note that currently we do not take special care for unclean paths
>> >> containing
>> >> "../" or "./".
>> >>
>> >>
>> >> https://reviews.llvm.org/D28548
>> >>
>> >> Files:
>> >>   include-fixer/IncludeFixer.cpp
>> >>   include-fixer/SymbolIndexManager.cpp
>> >>   include-fixer/SymbolIndexManager.h
>> >>   include-fixer/tool/ClangIncludeFixer.cpp
>> >>   test/include-fixer/Inputs/fake_yaml_db.yaml
>> >>   test/include-fixer/ranking.cpp
>> >>
>> >> ___
>> >> cfe-commits mailing list
>> >> cfe-commits@lists.llvm.org
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-17 Thread Manuel Klimek via cfe-commits
It's by design. Do we want to change this? I always had the impression
turning on the state changes for the list is a bit noisy, but I'm happy if
that's not the general sentiment.

On Mon, Jan 16, 2017, 11:07 PM Benjamin Kramer  wrote:

I got an email where it says that I accepted the revision. Looks like
phab didn't add cfe-commits to the list of recipients though :(

On Mon, Jan 16, 2017 at 6:43 PM, David Blaikie  wrote:
> Looks like Ben signed off on this on Phab - but the email didn't go to the
> list (making this look like code was sent for review, then committed,
> without review/approval happening)
>
> Ben: I think Phab doesn't send mail for an approval with no text, so at
> least as a workaround you can write something in the comments section when
> approving (people often write "LGTM" or similar) to ensure the approval is
> reflected on the mailing list.
>
> On Wed, Jan 11, 2017 at 1:59 AM Manuel Klimek via Phabricator via
> cfe-commits  wrote:
>>
>> klimek created this revision.
>> klimek added a reviewer: bkramer.
>> klimek added a subscriber: cfe-commits.
>>
>> Instead of just using popularity, we also take into account how similar
>> the
>> path of the current file is to the path of the header.
>> Our first approach is to get popularity into a reasonably small scale by
>> taking
>> log2 (which is roughly intuitive to how humans would bucket popularity),
>> and
>> multiply that with the number of matching prefix path fragments of the
>> included
>> header with the current file.
>> Note that currently we do not take special care for unclean paths
>> containing
>> "../" or "./".
>>
>>
>> https://reviews.llvm.org/D28548
>>
>> Files:
>>   include-fixer/IncludeFixer.cpp
>>   include-fixer/SymbolIndexManager.cpp
>>   include-fixer/SymbolIndexManager.h
>>   include-fixer/tool/ClangIncludeFixer.cpp
>>   test/include-fixer/Inputs/fake_yaml_db.yaml
>>   test/include-fixer/ranking.cpp
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-16 Thread Benjamin Kramer via cfe-commits
I got an email where it says that I accepted the revision. Looks like
phab didn't add cfe-commits to the list of recipients though :(

On Mon, Jan 16, 2017 at 6:43 PM, David Blaikie  wrote:
> Looks like Ben signed off on this on Phab - but the email didn't go to the
> list (making this look like code was sent for review, then committed,
> without review/approval happening)
>
> Ben: I think Phab doesn't send mail for an approval with no text, so at
> least as a workaround you can write something in the comments section when
> approving (people often write "LGTM" or similar) to ensure the approval is
> reflected on the mailing list.
>
> On Wed, Jan 11, 2017 at 1:59 AM Manuel Klimek via Phabricator via
> cfe-commits  wrote:
>>
>> klimek created this revision.
>> klimek added a reviewer: bkramer.
>> klimek added a subscriber: cfe-commits.
>>
>> Instead of just using popularity, we also take into account how similar
>> the
>> path of the current file is to the path of the header.
>> Our first approach is to get popularity into a reasonably small scale by
>> taking
>> log2 (which is roughly intuitive to how humans would bucket popularity),
>> and
>> multiply that with the number of matching prefix path fragments of the
>> included
>> header with the current file.
>> Note that currently we do not take special care for unclean paths
>> containing
>> "../" or "./".
>>
>>
>> https://reviews.llvm.org/D28548
>>
>> Files:
>>   include-fixer/IncludeFixer.cpp
>>   include-fixer/SymbolIndexManager.cpp
>>   include-fixer/SymbolIndexManager.h
>>   include-fixer/tool/ClangIncludeFixer.cpp
>>   test/include-fixer/Inputs/fake_yaml_db.yaml
>>   test/include-fixer/ranking.cpp
>>
>> ___
>> cfe-commits mailing list
>> cfe-commits@lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-16 Thread David Blaikie via cfe-commits
Looks like Ben signed off on this on Phab - but the email didn't go to the
list (making this look like code was sent for review, then committed,
without review/approval happening)

Ben: I think Phab doesn't send mail for an approval with no text, so at
least as a workaround you can write something in the comments section when
approving (people often write "LGTM" or similar) to ensure the approval is
reflected on the mailing list.

On Wed, Jan 11, 2017 at 1:59 AM Manuel Klimek via Phabricator via
cfe-commits  wrote:

> klimek created this revision.
> klimek added a reviewer: bkramer.
> klimek added a subscriber: cfe-commits.
>
> Instead of just using popularity, we also take into account how similar the
> path of the current file is to the path of the header.
> Our first approach is to get popularity into a reasonably small scale by
> taking
> log2 (which is roughly intuitive to how humans would bucket popularity),
> and
> multiply that with the number of matching prefix path fragments of the
> included
> header with the current file.
> Note that currently we do not take special care for unclean paths
> containing
> "../" or "./".
>
>
> https://reviews.llvm.org/D28548
>
> Files:
>   include-fixer/IncludeFixer.cpp
>   include-fixer/SymbolIndexManager.cpp
>   include-fixer/SymbolIndexManager.h
>   include-fixer/tool/ClangIncludeFixer.cpp
>   test/include-fixer/Inputs/fake_yaml_db.yaml
>   test/include-fixer/ranking.cpp
>
> ___
> cfe-commits mailing list
> cfe-commits@lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-11 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rL291664: Improve include fixer's ranking by taking the paths 
into account. (authored by klimek).

Changed prior to commit:
  https://reviews.llvm.org/D28548?vs=83930=83937#toc

Repository:
  rL LLVM

https://reviews.llvm.org/D28548

Files:
  clang-tools-extra/trunk/include-fixer/IncludeFixer.cpp
  clang-tools-extra/trunk/include-fixer/SymbolIndexManager.cpp
  clang-tools-extra/trunk/include-fixer/SymbolIndexManager.h
  clang-tools-extra/trunk/include-fixer/tool/ClangIncludeFixer.cpp
  clang-tools-extra/trunk/test/include-fixer/Inputs/fake_yaml_db.yaml
  clang-tools-extra/trunk/test/include-fixer/ranking.cpp

Index: clang-tools-extra/trunk/include-fixer/IncludeFixer.cpp
===
--- clang-tools-extra/trunk/include-fixer/IncludeFixer.cpp
+++ clang-tools-extra/trunk/include-fixer/IncludeFixer.cpp
@@ -365,6 +365,9 @@
 .getLocWithOffset(Range.getOffset())
 .print(llvm::dbgs(), CI->getSourceManager()));
   DEBUG(llvm::dbgs() << " ...");
+  llvm::StringRef FileName = CI->getSourceManager().getFilename(
+  CI->getSourceManager().getLocForStartOfFile(
+  CI->getSourceManager().getMainFileID()));
 
   QuerySymbolInfos.push_back({Query.str(), ScopedQualifiers, Range});
 
@@ -385,9 +388,10 @@
   // context, it might treat the identifier as a nested class of the scoped
   // namespace.
   std::vector MatchedSymbols =
-  SymbolIndexMgr.search(QueryString, /*IsNestedSearch=*/false);
+  SymbolIndexMgr.search(QueryString, /*IsNestedSearch=*/false, FileName);
   if (MatchedSymbols.empty())
-MatchedSymbols = SymbolIndexMgr.search(Query);
+MatchedSymbols =
+SymbolIndexMgr.search(Query, /*IsNestedSearch=*/true, FileName);
   DEBUG(llvm::dbgs() << "Having found " << MatchedSymbols.size()
  << " symbols\n");
   // We store a copy of MatchedSymbols in a place where it's globally reachable.
Index: clang-tools-extra/trunk/include-fixer/SymbolIndexManager.h
===
--- clang-tools-extra/trunk/include-fixer/SymbolIndexManager.h
+++ clang-tools-extra/trunk/include-fixer/SymbolIndexManager.h
@@ -42,7 +42,8 @@
   ///
   /// \returns A list of symbol candidates.
   std::vector
-  search(llvm::StringRef Identifier, bool IsNestedSearch = true) const;
+  search(llvm::StringRef Identifier, bool IsNestedSearch = true,
+ llvm::StringRef FileName = "") const;
 
 private:
   std::vector> SymbolIndices;
Index: clang-tools-extra/trunk/include-fixer/tool/ClangIncludeFixer.cpp
===
--- clang-tools-extra/trunk/include-fixer/tool/ClangIncludeFixer.cpp
+++ clang-tools-extra/trunk/include-fixer/tool/ClangIncludeFixer.cpp
@@ -332,7 +332,8 @@
 
   // Query symbol mode.
   if (!QuerySymbol.empty()) {
-auto MatchedSymbols = SymbolIndexMgr->search(QuerySymbol);
+auto MatchedSymbols = SymbolIndexMgr->search(
+QuerySymbol, /*IsNestedSearch=*/true, SourceFilePath);
 for (auto  : MatchedSymbols) {
   std::string HeaderPath = Symbol.getFilePath().str();
   Symbol.SetFilePath(((HeaderPath[0] == '"' || HeaderPath[0] == '<')
Index: clang-tools-extra/trunk/include-fixer/SymbolIndexManager.cpp
===
--- clang-tools-extra/trunk/include-fixer/SymbolIndexManager.cpp
+++ clang-tools-extra/trunk/include-fixer/SymbolIndexManager.cpp
@@ -12,38 +12,66 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/Path.h"
 
 #define DEBUG_TYPE "include-fixer"
 
 namespace clang {
 namespace include_fixer {
 
 using clang::find_all_symbols::SymbolInfo;
 
-/// Sorts SymbolInfos based on the popularity info in SymbolInfo.
-static void rankByPopularity(std::vector ) {
-  // First collect occurrences per header file.
-  llvm::DenseMap HeaderPopularity;
-  for (const SymbolInfo  : Symbols) {
-unsigned  = HeaderPopularity[Symbol.getFilePath()];
-Popularity = std::max(Popularity, Symbol.getNumOccurrences());
+// Calculate a score based on whether we think the given header is closely
+// related to the given source file.
+static double similarityScore(llvm::StringRef FileName,
+  llvm::StringRef Header) {
+  // Compute the maximum number of common path segements between Header and
+  // a suffix of FileName.
+  // We do not do a full longest common substring computation, as Header
+  // specifies the path we would directly #include, so we assume it is rooted
+  // relatively to a subproject of the repository.
+  int MaxSegments = 1;
+  for (auto FileI = llvm::sys::path::begin(FileName),
+FileE = llvm::sys::path::end(FileName);
+   FileI != FileE; ++FileI) {
+

[PATCH] D28548: Improve include fixer's ranking by taking the paths into account.

2017-01-11 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: bkramer.
klimek added a subscriber: cfe-commits.

Instead of just using popularity, we also take into account how similar the
path of the current file is to the path of the header.
Our first approach is to get popularity into a reasonably small scale by taking
log2 (which is roughly intuitive to how humans would bucket popularity), and
multiply that with the number of matching prefix path fragments of the included
header with the current file.
Note that currently we do not take special care for unclean paths containing
"../" or "./".


https://reviews.llvm.org/D28548

Files:
  include-fixer/IncludeFixer.cpp
  include-fixer/SymbolIndexManager.cpp
  include-fixer/SymbolIndexManager.h
  include-fixer/tool/ClangIncludeFixer.cpp
  test/include-fixer/Inputs/fake_yaml_db.yaml
  test/include-fixer/ranking.cpp

Index: test/include-fixer/ranking.cpp
===
--- test/include-fixer/ranking.cpp
+++ test/include-fixer/ranking.cpp
@@ -1,6 +1,9 @@
 // RUN: clang-include-fixer -db=yaml -input=%S/Inputs/fake_yaml_db.yaml -output-headers %s -- | FileCheck %s
+// RUN: clang-include-fixer -query-symbol bar -db=yaml -input=%S/Inputs/fake_yaml_db.yaml -output-headers %s -- | FileCheck %s
 
 // CHECK: "HeaderInfos": [
+// CHECK-NEXT:  {"Header": "\"test/include-fixer/baz.h\"",
+// CHECK-NEXT:   "QualifiedName": "c::bar"},
 // CHECK-NEXT:  {"Header": "\"../include/bar.h\"",
 // CHECK-NEXT:   "QualifiedName": "b::a::bar"},
 // CHECK-NEXT:  {"Header": "\"../include/zbar.h\"",
Index: test/include-fixer/Inputs/fake_yaml_db.yaml
===
--- test/include-fixer/Inputs/fake_yaml_db.yaml
+++ test/include-fixer/Inputs/fake_yaml_db.yaml
@@ -9,7 +9,6 @@
 LineNumber:  1
 Type:Class
 NumOccurrences:  1
-...
 ---
 Name:   bar
 Contexts:
@@ -21,7 +20,7 @@
 LineNumber:  1
 Type:Class
 NumOccurrences:  1
-...
+---
 Name:   bar
 Contexts:
   - ContextType: Namespace
@@ -32,7 +31,7 @@
 LineNumber:  2
 Type:Class
 NumOccurrences:  3
-...
+---
 Name:   bar
 Contexts:
   - ContextType: Namespace
@@ -50,4 +49,12 @@
 LineNumber:  1
 Type:Variable
 NumOccurrences:  1
-...
+---
+Name:bar
+Contexts:
+  - ContextType:Namespace
+ContextName:c
+FilePath:test/include-fixer/baz.h
+LineNumber:  1
+Type:Class
+NumOccurrences:  1
Index: include-fixer/tool/ClangIncludeFixer.cpp
===
--- include-fixer/tool/ClangIncludeFixer.cpp
+++ include-fixer/tool/ClangIncludeFixer.cpp
@@ -332,7 +332,8 @@
 
   // Query symbol mode.
   if (!QuerySymbol.empty()) {
-auto MatchedSymbols = SymbolIndexMgr->search(QuerySymbol);
+auto MatchedSymbols = SymbolIndexMgr->search(
+QuerySymbol, /*IsNestedSearch=*/true, SourceFilePath);
 for (auto  : MatchedSymbols) {
   std::string HeaderPath = Symbol.getFilePath().str();
   Symbol.SetFilePath(((HeaderPath[0] == '"' || HeaderPath[0] == '<')
Index: include-fixer/SymbolIndexManager.h
===
--- include-fixer/SymbolIndexManager.h
+++ include-fixer/SymbolIndexManager.h
@@ -42,7 +42,8 @@
   ///
   /// \returns A list of symbol candidates.
   std::vector
-  search(llvm::StringRef Identifier, bool IsNestedSearch = true) const;
+  search(llvm::StringRef Identifier, bool IsNestedSearch = true,
+ llvm::StringRef FileName = "") const;
 
 private:
   std::vector> SymbolIndices;
Index: include-fixer/SymbolIndexManager.cpp
===
--- include-fixer/SymbolIndexManager.cpp
+++ include-fixer/SymbolIndexManager.cpp
@@ -12,38 +12,66 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/Support/Debug.h"
+#include "llvm/Support/Path.h"
 
 #define DEBUG_TYPE "include-fixer"
 
 namespace clang {
 namespace include_fixer {
 
 using clang::find_all_symbols::SymbolInfo;
 
-/// Sorts SymbolInfos based on the popularity info in SymbolInfo.
-static void rankByPopularity(std::vector ) {
-  // First collect occurrences per header file.
-  llvm::DenseMap HeaderPopularity;
-  for (const SymbolInfo  : Symbols) {
-unsigned  = HeaderPopularity[Symbol.getFilePath()];
-Popularity = std::max(Popularity, Symbol.getNumOccurrences());
+// Calculate a score based on whether we think the given header is closely
+// related to the given source file.
+static double similarityScore(llvm::StringRef FileName,
+  llvm::StringRef Header) {
+  // Compute the maximum number of common path segements between Header and
+  // a suffix of FileName.
+  // We do not do a full longest common substring computation, as Header
+  // specifies