poelmanc added a comment.

In D96744#2564828 <https://reviews.llvm.org/D96744#2564828>, @MyDeveloperDay 
wrote:

> Does this need to be an option?

It's easy to add an option, but there are already two //main include//-related 
options, so before adding a third I wanted to give this some thought. As 
someone new to `IncludeCategories`, I was genuinely impressed at how easy it 
was to use and how it gave me such complete control over the grouping via 
regular expressions. Yet in comparison the determination of //main// headers 
was less clear and more hard-coded; I had to examine the source code to figure 
out that the comparison is case-insensitive, it doesn't consider `<>` includes, 
only file stems are considered (e.g. the `foo/bar` in `foo/bar/baz.h` is 
ignored), and the behaviors of `IncludeIsMainSourceRegex` and 
`IncludeIsMainRegex` were a bit murky.

That's all fixable with a patch and minor documentation tweaks, but I wanted to 
consider some alternatives. Users of the `IncludeCategories` feature are 
comfortable with regular expressions, so imagine eliminating special handling 
of //main headers// entirely, and instead enabling users to write their own 
`IncludeCategories` rules for putting main headers first? We'd need to give 
them some bits of the source file path to use in their `Regex`, say called 
`${SourceStem}` and `${SourcePath}`.

Users unconcerned with formatting `foo_test.cc` or `foo_proto.cc` files could 
get the current functionality by including a simple rule like:

  - Regex:           '"(.*\/)?${SourceStem}(\..*)?"'
    Priority:        0
    CaseSensitive:   false

I want case sensitivity, matching at least one component of the path, and angle 
brackets, so I'd use something like:

  - Regex:           '[<"]${SourcePath:1}${SourceStem}\..*[>"]'
    Priority:        0
    CaseSensitive:   true

Then adding a generally-useful `ApplyToSourcesRegex` feature to apply any 
category regex only to certain source files, and an ability to trim a regular 
expression from the end of `${SourceStem}`, gives users full control, including 
mimicking current `isMainHeader()` behavior with rules like:

  - Regex:           
'"(.*\/)?${SourceStem-<insert-old-IncludeIsMainRegex>}(\..*)?"'
    Priority:        0
    CaseSensitive:   false
    ApplyToSourcesRegex: 
`((.*\.(c|cc|cpp|c\+\+|cxx|m|mm))|<insert-old-IncludeIsMainSourceRegex>)`

For backwards-compatibility we'd automatically add the above rule if no special 
`${Source` tokens appear in any rules, and we can deprecate 
`IncludeIsMainRegex` and `IncludeIsMainSourceRegex` at any point. The special 
tokens for use in `Regex` are defined as:

- `${SourcePath:<n>}` is a regular expression matching `n` or more trailing 
components of the source directory file path, using forward slashes and 
including any trailing slash (`0 <= n < 10`)
- `${SourceStem}` is a string starting after the last `/` in the source file 
path and includes up to but excluding the first dot
- `${SourceStem-<re>}` is `${SourceStem}` with any trailing characters matching 
regular expression `re` removed

I have this coded up and it passes the ToolingTests, but wanted to run the idea 
by others. Thoughts? Should I update this patch with these changes? Start a 
separate issue? Stick with a new option `IncludeIsMainAllowBraces`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96744/new/

https://reviews.llvm.org/D96744

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to