[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2023-08-01 Thread Conrad Poelman via Phabricator via cfe-commits
poelmanc abandoned this revision.
poelmanc added a comment.
Herald added a project: All.

We eventually updated our coding standards and our fairly massive code base to 
change only the very top "main" include from `#include ` to 
`#include "Foo.h"`.

For all include statements after that, we still use framework-style includes 
(`#include `).

It worked great and now we're happily using clang-format to order and group all 
our include statements, with no need for any changes to clang-format. Thanks to 
all for maintaining such an amazing tool!


Herald added a comment.

NOTE: Clang-Format Team Automated Review Comment

It looks like your clang-format review does not contain any unit tests, please 
try to ensure all code changes have a unit test (unless this is an `NFC` or 
refactoring, adding documentation etc..)

Add your unit tests in `clang/unittests/Format` and you can build with `ninja 
FormatTests`.  We recommend using the `verifyFormat(xxx)` format of unit tests 
rather than `EXPECT_EQ` as this will ensure you change is tolerant to random 
whitespace changes (see FormatTest.cpp as an example)

For situations where your change is altering the TokenAnnotator.cpp which can 
happen if you are trying to improve the annotation phase to ensure we are 
correctly identifying the type of a token, please add a token annotator test in 
`TokenAnnotatorTest.cpp`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-03-02 Thread Mitchell via Phabricator via cfe-commits
mitchell-stellar added a comment.

I am in favor of a new option, `IncludeIsMainAllowBraces`. It keeps things 
simple and does not cause any backwards-compatibility issues. The `${}` 
variables are clever, but if "users unconcerned with formatting ... could get 
the current functionality by including a simple rule [that includes a 
variable]", I think it might be asking too much. Being able to write (and read) 
an include regex without referring to documentation seems ideal to me. You 
mentioned automatically adding variables in rules for backwards-compatibility, 
but I'd be a bit concerned about robustness and too much magic going on behind 
the scenes for users to understand if something goes wrong or has 
unexpected/unexplained results.

Admittedly I don't fully comprehend the proposed additions and all their 
ins-and-outs, but I do understand a new, simple option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-02-26 Thread Conrad Poelman via Phabricator via cfe-commits
poelmanc added a comment.

In D96744#2564828 , @MyDeveloperDay 
wrote:

> Does this need to be an option?

It's easy to add an option, but there are already two //main include//-related 
options, so before adding a third I wanted to give this some thought. As 
someone new to `IncludeCategories`, I was genuinely impressed at how easy it 
was to use and how it gave me such complete control over the grouping via 
regular expressions. Yet in comparison the determination of //main// headers 
was less clear and more hard-coded; I had to examine the source code to figure 
out that the comparison is case-insensitive, it doesn't consider `<>` includes, 
only file stems are considered (e.g. the `foo/bar` in `foo/bar/baz.h` is 
ignored), and the behaviors of `IncludeIsMainSourceRegex` and 
`IncludeIsMainRegex` were a bit murky.

That's all fixable with a patch and minor documentation tweaks, but I wanted to 
consider some alternatives. Users of the `IncludeCategories` feature are 
comfortable with regular expressions, so imagine eliminating special handling 
of //main headers// entirely, and instead enabling users to write their own 
`IncludeCategories` rules for putting main headers first? We'd need to give 
them some bits of the source file path to use in their `Regex`, say called 
`${SourceStem}` and `${SourcePath}`.

Users unconcerned with formatting `foo_test.cc` or `foo_proto.cc` files could 
get the current functionality by including a simple rule like:

  - Regex:   '"(.*\/)?${SourceStem}(\..*)?"'
Priority:0
CaseSensitive:   false

I want case sensitivity, matching at least one component of the path, and angle 
brackets, so I'd use something like:

  - Regex:   '[<"]${SourcePath:1}${SourceStem}\..*[>"]'
Priority:0
CaseSensitive:   true

Then adding a generally-useful `ApplyToSourcesRegex` feature to apply any 
category regex only to certain source files, and an ability to trim a regular 
expression from the end of `${SourceStem}`, gives users full control, including 
mimicking current `isMainHeader()` behavior with rules like:

  - Regex:   
'"(.*\/)?${SourceStem-}(\..*)?"'
Priority:0
CaseSensitive:   false
ApplyToSourcesRegex: 
`((.*\.(c|cc|cpp|c\+\+|cxx|m|mm))|)`

For backwards-compatibility we'd automatically add the above rule if no special 
`${Source` tokens appear in any rules, and we can deprecate 
`IncludeIsMainRegex` and `IncludeIsMainSourceRegex` at any point. The special 
tokens for use in `Regex` are defined as:

- `${SourcePath:}` is a regular expression matching `n` or more trailing 
components of the source directory file path, using forward slashes and 
including any trailing slash (`0 <= n < 10`)
- `${SourceStem}` is a string starting after the last `/` in the source file 
path and includes up to but excluding the first dot
- `${SourceStem-}` is `${SourceStem}` with any trailing characters matching 
regular expression `re` removed

I have this coded up and it passes the ToolingTests, but wanted to run the idea 
by others. Thoughts? Should I update this patch with these changes? Start a 
separate issue? Stick with a new option `IncludeIsMainAllowBraces`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-02-16 Thread Conrad Poelman via Phabricator via cfe-commits
poelmanc added a comment.

Thanks for the speedy reply and the great tool. With appropriate Regex and 
Priority settings, `IncludeCategories` worked flawlessly for me aside from not 
identifying the main header. Treating `#include "foo/bar/baz.h"` as the main 
header in file `bar/random/baz.h` seems like a bug, but I certainly see the 
dangers of changing current `<>` behavior. I also considered treating `<>` 
includes as main headers only if they also contain a forward-slash, e.g.:

  if (!IncludeName.startswith("\"") && !IncludeName.contains("/"))
return false;

That would resolve the `` case, although `#include ` in 
a file `anything/sys/types.h` would be identified as the main header. So making 
an option seems like the cleanest solution. Say, `bool 
IncludeIsMainAllowBraces`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-02-16 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay added a comment.

I think I'd be concern about the `` case (lets ignore the fact that I 
should be including 

so imagine I'm writing a string class and I have

string.cpp including "string.h"

but as part of my string class because I'm somehow extending std::string

  #include "string.h"
  #include 
  
  class mystring : public std::string
  {
  }

Which file wins?

I think lots of people will always consider `<>` to mean system paths, I think 
if there is any chance that `<>` could change the order then I think it needs 
to be an option that is off by default.

I don't think this is a bug, I think you are asking for an enhancement.




Comment at: clang/lib/Tooling/Inclusions/HeaderIncludes.cpp:237
-  if (!IncludeName.startswith("\""))
-return false;
-

Does this need to be an option?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-02-16 Thread Björn Schäpers via Phabricator via cfe-commits
HazardyKnusperkeks added inline comments.



Comment at: clang/unittests/Tooling/HeaderIncludesTest.cpp:106
+<< "for source file " << FileName;
+EXPECT_EQ(Manager.getIncludePriority("", true), 0)
+<< "for source file " << FileName;

I would also add `"baz.h"`, since the test is named `IsMainHeader`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96744: clang-format IncludeBlocks: Regroup determination of "main" for framework-style includes fix

2021-02-15 Thread Conrad Poelman via Phabricator via cfe-commits
poelmanc created this revision.
poelmanc added reviewers: MyDeveloperDay, hokein, sammccall.
poelmanc added projects: clang-tools-extra, clang-format.
poelmanc requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Most modern libraries and applications include files with relative paths from a 
root (e.g. `#include ` versus `#include "baz.h"`.) When 
regrouping, a file's "main include" should be left at the top (given priority 
0.) The existing `IncludeCategoryManager::isMainHeader()` checks only the file 
//stem// (e.g. `baz.h`), and fails to identify main includes with angle 
brackets despite a comment saying `// remove the surrounding "" or <>`.

This patch fixes these issues by comparing all directory components present in 
both the `#include` string and the file name, and by allowing angle bracket 
includes to be considered "main". It adds a new `IsMainHeader` unit test to 
check behavior of framework-style includes, which would fail without this patch.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D96744

Files:
  clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
  clang/unittests/Tooling/HeaderIncludesTest.cpp


Index: clang/unittests/Tooling/HeaderIncludesTest.cpp
===
--- clang/unittests/Tooling/HeaderIncludesTest.cpp
+++ clang/unittests/Tooling/HeaderIncludesTest.cpp
@@ -91,6 +91,28 @@
   EXPECT_EQ(Expected, insert(Code, "\"a.h\""));
 }
 
+TEST_F(HeaderIncludesTest, IsMainHeader) {
+  Style = format::getGoogleStyle(format::FormatStyle::LanguageKind::LK_Cpp)
+  .IncludeStyle;
+  std::vector FileNames{"foo/bar/baz.cpp", "foo/bar/baz.cu.cpp",
+   "foo/bar/baz_test.cu.cpp"};
+  for (const StringRef  : FileNames) {
+IncludeCategoryManager Manager(Style, FileName);
+// These framework-style includes should all be considered "main".
+EXPECT_EQ(Manager.getIncludePriority("", true), 0)
+<< "for source file " << FileName;
+EXPECT_EQ(Manager.getIncludePriority("\"bar/baz.h\"", true), 0)
+<< "for source file " << FileName;
+EXPECT_EQ(Manager.getIncludePriority("", true), 0)
+<< "for source file " << FileName;
+// These should not be considered "main" as the paths to baz.h differ.
+EXPECT_NE(Manager.getIncludePriority("", true), 0)
+<< "for source file " << FileName;
+EXPECT_NE(Manager.getIncludePriority("\"foo/baz.h\"", true), 0)
+<< "for source file " << FileName;
+  }
+}
+
 TEST_F(HeaderIncludesTest, InsertAfterMainHeader) {
   std::string Code = "#include \"fix.h\"\n"
  "\n"
Index: clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
===
--- clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
+++ clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
@@ -233,9 +233,6 @@
   return Ret;
 }
 bool IncludeCategoryManager::isMainHeader(StringRef IncludeName) const {
-  if (!IncludeName.startswith("\""))
-return false;
-
   IncludeName =
   IncludeName.drop_front(1).drop_back(1); // remove the surrounding "" or 
<>
   // Not matchingStem: implementation files may have compound extensions but
@@ -259,8 +256,22 @@
   if (!Matching.empty()) {
 llvm::Regex MainIncludeRegex(HeaderStem.str() + Style.IncludeIsMainRegex,
  llvm::Regex::IgnoreCase);
-if (MainIncludeRegex.match(Matching))
+if (MainIncludeRegex.match(Matching)) {
+  // Matching is non-empty so these should be non-empty as well, making
+  // ++rbegin(IncludeName) and ++rbegin(FileName) safe.
+  assert(!IncludeName.empty());
+  assert(!FileName.empty());
+  // Checked stems above. Check remaining common path components here.
+  auto IncludePathRIter = ++llvm::sys::path::rbegin(IncludeName);
+  auto FilePathRiter = ++llvm::sys::path::rbegin(FileName);
+  for (; IncludePathRIter != llvm::sys::path::rend(IncludeName) &&
+ FilePathRiter != llvm::sys::path::rend(FileName);
+   ++IncludePathRIter, ++FilePathRiter) {
+if (*IncludePathRIter != *FilePathRiter)
+  return false;
+  }
   return true;
+}
   }
   return false;
 }


Index: clang/unittests/Tooling/HeaderIncludesTest.cpp
===
--- clang/unittests/Tooling/HeaderIncludesTest.cpp
+++ clang/unittests/Tooling/HeaderIncludesTest.cpp
@@ -91,6 +91,28 @@
   EXPECT_EQ(Expected, insert(Code, "\"a.h\""));
 }
 
+TEST_F(HeaderIncludesTest, IsMainHeader) {
+  Style = format::getGoogleStyle(format::FormatStyle::LanguageKind::LK_Cpp)
+  .IncludeStyle;
+  std::vector FileNames{"foo/bar/baz.cpp", "foo/bar/baz.cu.cpp",
+   "foo/bar/baz_test.cu.cpp"};
+  for (const StringRef  : FileNames) {
+IncludeCategoryManager Manager(Style, FileName);
+// These framework-style includes