[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-16 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

LegalizeAdulthood wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > LegalizeAdulthood wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > Comma operator?
> > > > > > > Remember that the use case here is identifying expressions that 
> > > > > > > are initializers for an enum.  If you were doing a code review 
> > > > > > > and you saw this:
> > > > > > > ```
> > > > > > > enum {
> > > > > > > FOO = (2, 3)
> > > > > > > };
> > > > > > > ```
> > > > > > > Would you be OK with that?  I wouldn't.  Clang even warns about 
> > > > > > > it: https://godbolt.org/z/Y641cb8Y9
> > > > > > > 
> > > > > > > Therefore I deliberately left comma operator out of the grammar.
> > > > > > This is another case where I think you're predicting that users 
> > > > > > won't be using the full expressivity of the language and we'll get 
> > > > > > bug reports later. Again, in insolation, I tend to agree that I 
> > > > > > wouldn't be happy seeing that code. However, users write some very 
> > > > > > creative code and there's no technical reason why we can't or 
> > > > > > shouldn't handle comma operators.
> > > > > "Don't let the perfect be the enemy of the good."
> > > > > 
> > > > > My inclination is to simply explicitly state that comma operator is 
> > > > > not recognized in the documentation.  It's already implicit by it's 
> > > > > absence from the list of recognized operators.
> > > > > 
> > > > > Again, the worst that happens is that your macro isn't converted.
> > > > > 
> > > > > I'm open to being convinced that it's important, but you haven't 
> > > > > convinced me yet `:)`
> > > > It wasn't much extra work/code to add comma operator support so I've 
> > > > done that.
> > > > "Don't let the perfect be the enemy of the good."
> > > 
> > > This is a production compiler toolchain. Correctness is important and 
> > > that sometimes means caring more about perfection than you otherwise 
> > > would like to.
> > > 
> > > > I'm open to being convinced that it's important, but you haven't 
> > > > convinced me yet :)
> > > 
> > > It's less about importance and more about maintainability coupled with 
> > > correctness. With your approach, we get something that will have a long 
> > > tail of bugs. If you used Clang's parser, you don't get the same issue -- 
> > > maintenance largely comes along for free, and the bugs are far less 
> > > likely.
> > > 
> > > About the only reason I like your current approach over using clang's 
> > > parsing is that it quite likely performs much better than doing an actual 
> > > token parsing of the expression. But as you pointed out, about the worst 
> > > thing for a check can do is take correct code and make it incorrect -- 
> > > doing that right requires some amount of semantic evaluation of the 
> > > expressions (which you're not doing). For example:
> > > ```
> > > #define FINE 1LL << 30LL;
> > > #define BAD 1LL << 31LL;
> > > #define ALSO_BAD 1LL << 32L;
> > > ```
> > > We'll convert this into an enumeration and break `-pedantic-errors` 
> > > builds in C. If we had a `ConstantExpr` object, we could see what it's 
> > > value is and note that it's greater than what fits into an `int` and 
> > > decide to do something smarter.
> > > 
> > > So I continue to see the current approach as being somewhat reasonable 
> > > (especially for experimentation), but incorrect in the long run. Not 
> > > sufficiently incorrect for me to block this patch on, but incorrect 
> > > enough that the first time this check becomes a maintenance burden, I'll 
> > > be asking more strongly to do this the correct way.
> > > > "Don't let the perfect be the enemy of the good."
> > > 
> > > This is a production compiler toolchain. Correctness is important and 
> > > that sometimes means caring more about perfection than you otherwise 
> > > would like to.
> > 
> > That's fair.
> > 
> > > For example:
> > > ```
> > > #define FINE 1LL << 30LL;
> > > #define BAD 1LL << 31LL;
> > > #define ALSO_BAD 1LL << 32L;
> > > ```
> > 
> > Oh this brings up the pesky "semicolons disappear from the AST" issue.  I 
> > wonder what happens when we're just processing tokens, though.  I will add 
> > a test to see.  This could be a case where my approach results in more 
> > correctness than `clangParse`!
> > 
> > > Not sufficiently incorrect for me to block this patch on, but incorrect 
> > > enough that the first time this check becomes a maintenance burden, I'll 
> > > be asking more strongly to do this the correct way.
> > 
> > I agree.
> So I was research the C standard for what it says are acceptable initializer 
> values for an enum and it //disallows// the comma operator:
> 
> 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked an inline comment as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > LegalizeAdulthood wrote:
> > > > aaron.ballman wrote:
> > > > > LegalizeAdulthood wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > Comma operator?
> > > > > > Remember that the use case here is identifying expressions that are 
> > > > > > initializers for an enum.  If you were doing a code review and you 
> > > > > > saw this:
> > > > > > ```
> > > > > > enum {
> > > > > > FOO = (2, 3)
> > > > > > };
> > > > > > ```
> > > > > > Would you be OK with that?  I wouldn't.  Clang even warns about it: 
> > > > > > https://godbolt.org/z/Y641cb8Y9
> > > > > > 
> > > > > > Therefore I deliberately left comma operator out of the grammar.
> > > > > This is another case where I think you're predicting that users won't 
> > > > > be using the full expressivity of the language and we'll get bug 
> > > > > reports later. Again, in insolation, I tend to agree that I wouldn't 
> > > > > be happy seeing that code. However, users write some very creative 
> > > > > code and there's no technical reason why we can't or shouldn't handle 
> > > > > comma operators.
> > > > "Don't let the perfect be the enemy of the good."
> > > > 
> > > > My inclination is to simply explicitly state that comma operator is not 
> > > > recognized in the documentation.  It's already implicit by it's absence 
> > > > from the list of recognized operators.
> > > > 
> > > > Again, the worst that happens is that your macro isn't converted.
> > > > 
> > > > I'm open to being convinced that it's important, but you haven't 
> > > > convinced me yet `:)`
> > > It wasn't much extra work/code to add comma operator support so I've done 
> > > that.
> > > "Don't let the perfect be the enemy of the good."
> > 
> > This is a production compiler toolchain. Correctness is important and that 
> > sometimes means caring more about perfection than you otherwise would like 
> > to.
> > 
> > > I'm open to being convinced that it's important, but you haven't 
> > > convinced me yet :)
> > 
> > It's less about importance and more about maintainability coupled with 
> > correctness. With your approach, we get something that will have a long 
> > tail of bugs. If you used Clang's parser, you don't get the same issue -- 
> > maintenance largely comes along for free, and the bugs are far less likely.
> > 
> > About the only reason I like your current approach over using clang's 
> > parsing is that it quite likely performs much better than doing an actual 
> > token parsing of the expression. But as you pointed out, about the worst 
> > thing for a check can do is take correct code and make it incorrect -- 
> > doing that right requires some amount of semantic evaluation of the 
> > expressions (which you're not doing). For example:
> > ```
> > #define FINE 1LL << 30LL;
> > #define BAD 1LL << 31LL;
> > #define ALSO_BAD 1LL << 32L;
> > ```
> > We'll convert this into an enumeration and break `-pedantic-errors` builds 
> > in C. If we had a `ConstantExpr` object, we could see what it's value is 
> > and note that it's greater than what fits into an `int` and decide to do 
> > something smarter.
> > 
> > So I continue to see the current approach as being somewhat reasonable 
> > (especially for experimentation), but incorrect in the long run. Not 
> > sufficiently incorrect for me to block this patch on, but incorrect enough 
> > that the first time this check becomes a maintenance burden, I'll be asking 
> > more strongly to do this the correct way.
> > > "Don't let the perfect be the enemy of the good."
> > 
> > This is a production compiler toolchain. Correctness is important and that 
> > sometimes means caring more about perfection than you otherwise would like 
> > to.
> 
> That's fair.
> 
> > For example:
> > ```
> > #define FINE 1LL << 30LL;
> > #define BAD 1LL << 31LL;
> > #define ALSO_BAD 1LL << 32L;
> > ```
> 
> Oh this brings up the pesky "semicolons disappear from the AST" issue.  I 
> wonder what happens when we're just processing tokens, though.  I will add a 
> test to see.  This could be a case where my approach results in more 
> correctness than `clangParse`!
> 
> > Not sufficiently incorrect for me to block this patch on, but incorrect 
> > enough that the first time this check becomes a maintenance burden, I'll be 
> > asking more strongly to do this the correct way.
> 
> I agree.
So I was research the C standard for what it says are acceptable initializer 
values for an enum and it //disallows// the comma operator:

https://en.cppreference.com/w/c/language/enum

> integer constant expression whose value is representable as a value of type 
> int

https://en.cppreference.com/w/c/language/constant_expression

> An integer 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked an inline comment as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

LegalizeAdulthood wrote:
> aaronpuchert wrote:
> > LegalizeAdulthood wrote:
> > > LegalizeAdulthood wrote:
> > > > aaronpuchert wrote:
> > > > > Seems to have caused a [build 
> > > > > failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):
> > > > > 
> > > > > ```
> > > > > FAILED: 
> > > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > > >  
> > > > > /home/buildbots/clang.11.0.0/bin/clang++ 
> > > > > --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 
> > > > > -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS 
> > > > > -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
> > > > > -Itools/clang/tools/extra/unittests/clang-tidy 
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
> > > > >  
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
> > > > >  -Itools/clang/include -Iinclude 
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
> > > > >  
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
> > > > >  
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
> > > > >  
> > > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
> > > > >  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
> > > > > -Werror=unguarded-availability-new -Wall -Wextra 
> > > > > -Wno-unused-parameter -Wwrite-strings -Wcast-qual 
> > > > > -Wmissing-field-initializers -pedantic -Wno-long-long 
> > > > > -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
> > > > > -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
> > > > > -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
> > > > > -Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
> > > > > -fdata-sections -fno-common -Woverloaded-virtual 
> > > > > -Wno-nested-anon-types -O3 -DNDEBUG-Wno-variadic-macros 
> > > > > -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti 
> > > > > -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT 
> > > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > > >  -MF 
> > > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
> > > > >  -o 
> > > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > > >  -c 
> > > > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
> > > > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
> > > > >  error: unused function 'operator<<' [-Werror,-Wunused-function]
> > > > > std::ostream <<(std::ostream ,
> > > > >   ^
> > > > > 1 error generated.
> > > > > ```
> > > > Simon Pilgrim fixed it, but I don't understand why clang calls this 
> > > > function unused.  When the test fails, gtest uses this function to 
> > > > pretty print the parameter.  I'm rebuilding with a forced test failure 
> > > > to validate.
> > > Yes, without this function the failing test prints results like this:
> > > ```
> > > [ RUN  ] TokenExpressionParserTests/MatcherTest.MatchResult/123
> > > D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200):
> > >  error: Value of: matchText(GetParam().Text) == GetParam().Matched
> > >   Actual: false
> > > Expected: true
> > > [  FAILED  ] TokenExpressionParserTests/MatcherTest.MatchResult/123, 
> > > where GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 
> > > F6-7F 00-00> (1 ms)
> > > ```
> > > which isn't particularly useful.
> > > 
> > > So how do we include pretty printers for tests without clang erroneously 
> > > flagging them as unused?
> > What got me wondering: this definition is last in the file, and there is no 
> > prior 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked an inline comment as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

aaronpuchert wrote:
> LegalizeAdulthood wrote:
> > LegalizeAdulthood wrote:
> > > aaronpuchert wrote:
> > > > Seems to have caused a [build 
> > > > failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):
> > > > 
> > > > ```
> > > > FAILED: 
> > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > >  
> > > > /home/buildbots/clang.11.0.0/bin/clang++ 
> > > > --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 
> > > > -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> > > > -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy 
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
> > > >  
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
> > > >  -Itools/clang/include -Iinclude 
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
> > > >  
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
> > > >  
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
> > > >  
> > > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
> > > >  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
> > > > -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
> > > > -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
> > > > -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
> > > > -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
> > > > -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
> > > > -Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
> > > > -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types 
> > > > -O3 -DNDEBUG-Wno-variadic-macros 
> > > > -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti 
> > > > -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT 
> > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > >  -MF 
> > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
> > > >  -o 
> > > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > > >  -c 
> > > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
> > > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
> > > >  error: unused function 'operator<<' [-Werror,-Wunused-function]
> > > > std::ostream <<(std::ostream ,
> > > >   ^
> > > > 1 error generated.
> > > > ```
> > > Simon Pilgrim fixed it, but I don't understand why clang calls this 
> > > function unused.  When the test fails, gtest uses this function to pretty 
> > > print the parameter.  I'm rebuilding with a forced test failure to 
> > > validate.
> > Yes, without this function the failing test prints results like this:
> > ```
> > [ RUN  ] TokenExpressionParserTests/MatcherTest.MatchResult/123
> > D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200):
> >  error: Value of: matchText(GetParam().Text) == GetParam().Matched
> >   Actual: false
> > Expected: true
> > [  FAILED  ] TokenExpressionParserTests/MatcherTest.MatchResult/123, where 
> > GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 F6-7F 
> > 00-00> (1 ms)
> > ```
> > which isn't particularly useful.
> > 
> > So how do we include pretty printers for tests without clang erroneously 
> > flagging them as unused?
> What got me wondering: this definition is last in the file, and there is no 
> prior declaration of this function. How can there be any uses of it? We're 
> not in class scope, so all prior uses of `operator<<` or perhaps `PrintTo` 
> must have been resolved to some other 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Aaron Puchert via Phabricator via cfe-commits
aaronpuchert added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

LegalizeAdulthood wrote:
> LegalizeAdulthood wrote:
> > aaronpuchert wrote:
> > > Seems to have caused a [build 
> > > failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):
> > > 
> > > ```
> > > FAILED: 
> > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > >  
> > > /home/buildbots/clang.11.0.0/bin/clang++ 
> > > --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 
> > > -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> > > -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy 
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
> > >  
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
> > >  -Itools/clang/include -Iinclude 
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
> > >  
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
> > >  
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
> > >  
> > > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
> > >  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
> > > -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
> > > -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
> > > -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
> > > -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
> > > -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
> > > -Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
> > > -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types 
> > > -O3 -DNDEBUG-Wno-variadic-macros 
> > > -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -UNDEBUG 
> > > -Wno-suggest-override -std=c++14 -MD -MT 
> > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > >  -MF 
> > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
> > >  -o 
> > > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> > >  -c 
> > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
> > > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
> > >  error: unused function 'operator<<' [-Werror,-Wunused-function]
> > > std::ostream <<(std::ostream ,
> > >   ^
> > > 1 error generated.
> > > ```
> > Simon Pilgrim fixed it, but I don't understand why clang calls this 
> > function unused.  When the test fails, gtest uses this function to pretty 
> > print the parameter.  I'm rebuilding with a forced test failure to validate.
> Yes, without this function the failing test prints results like this:
> ```
> [ RUN  ] TokenExpressionParserTests/MatcherTest.MatchResult/123
> D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200):
>  error: Value of: matchText(GetParam().Text) == GetParam().Matched
>   Actual: false
> Expected: true
> [  FAILED  ] TokenExpressionParserTests/MatcherTest.MatchResult/123, where 
> GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 F6-7F 00-00> 
> (1 ms)
> ```
> which isn't particularly useful.
> 
> So how do we include pretty printers for tests without clang erroneously 
> flagging them as unused?
What got me wondering: this definition is last in the file, and there is no 
prior declaration of this function. How can there be any uses of it? We're not 
in class scope, so all prior uses of `operator<<` or perhaps `PrintTo` must 
have been resolved to some other function already. Or am I missing something?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

LegalizeAdulthood wrote:
> aaronpuchert wrote:
> > Seems to have caused a [build 
> > failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):
> > 
> > ```
> > FAILED: 
> > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> >  
> > /home/buildbots/clang.11.0.0/bin/clang++ 
> > --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 -D_DEBUG 
> > -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> > -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy 
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
> >  
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
> >  -Itools/clang/include -Iinclude 
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
> >  
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
> >  
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
> >  
> > -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
> >  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
> > -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
> > -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
> > -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
> > -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
> > -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
> > -Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
> > -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 
> > -DNDEBUG-Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments 
> > -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT 
> > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> >  -MF 
> > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
> >  -o 
> > tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
> >  -c 
> > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
> > /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
> >  error: unused function 'operator<<' [-Werror,-Wunused-function]
> > std::ostream <<(std::ostream ,
> >   ^
> > 1 error generated.
> > ```
> Simon Pilgrim fixed it, but I don't understand why clang calls this function 
> unused.  When the test fails, gtest uses this function to pretty print the 
> parameter.  I'm rebuilding with a forced test failure to validate.
Yes, without this function the failing test prints results like this:
```
[ RUN  ] TokenExpressionParserTests/MatcherTest.MatchResult/123
D:\legalize\llvm\llvm-project\clang-tools-extra\unittests\clang-tidy\ModernizeModuleTest.cpp(200):
 error: Value of: matchText(GetParam().Text) == GetParam().Matched
  Actual: false
Expected: true
[  FAILED  ] TokenExpressionParserTests/MatcherTest.MatchResult/123, where 
GetParam() = 16-byte object <00-00 00-00 00-00 00-00 40-EC 3B-D6 F6-7F 00-00> 
(1 ms)
```
which isn't particularly useful.

So how do we include pretty printers for tests without clang erroneously 
flagging them as unused?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

aaronpuchert wrote:
> Seems to have caused a [build 
> failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):
> 
> ```
> FAILED: 
> tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
>  
> /home/buildbots/clang.11.0.0/bin/clang++ 
> --gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 -D_DEBUG 
> -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
> -D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy 
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
>  
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
>  -Itools/clang/include -Iinclude 
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
>  
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
>  
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
>  
> -I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
>  -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
> -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
> -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
> -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
> -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
> -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
> -Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
> -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 
> -DNDEBUG-Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments 
> -fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT 
> tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
>  -MF 
> tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
>  -o 
> tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
>  -c 
> /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
> /home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
>  error: unused function 'operator<<' [-Werror,-Wunused-function]
> std::ostream <<(std::ostream ,
>   ^
> 1 error generated.
> ```
Simon Pilgrim fixed it, but I don't understand why clang calls this function 
unused.  When the test fails, gtest uses this function to pretty print the 
parameter.  I'm rebuilding with a forced test failure to validate.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-14 Thread Aaron Puchert via Phabricator via cfe-commits
aaronpuchert added inline comments.



Comment at: 
clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210-214
+std::ostream <<(std::ostream ,
+ const clang::tidy::test::Param ) {
+  return Str << "Matched: " << std::boolalpha << Value.Matched << ", Text: '"
+ << Value.Text << "'";
+}

Seems to have caused a [build 
failure](https://lab.llvm.org/buildbot/#/builders/57/builds/17915):

```
FAILED: 
tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
 
/home/buildbots/clang.11.0.0/bin/clang++ 
--gcc-toolchain=/opt/rh/devtoolset-7/root/usr  -DGTEST_HAS_RTTI=0 -D_DEBUG 
-D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS 
-D__STDC_LIMIT_MACROS -Itools/clang/tools/extra/unittests/clang-tidy 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy
 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include
 -Itools/clang/include -Iinclude 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include
 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clang-tidy
 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googletest/include
 
-I/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/utils/unittest/googlemock/include
 -fPIC -fvisibility-inlines-hidden -Werror -Werror=date-time 
-Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter 
-Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic 
-Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough 
-Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor 
-Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion 
-Wmisleading-indentation -fdiagnostics-color -ffunction-sections 
-fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 
-DNDEBUG-Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments 
-fno-exceptions -fno-rtti -UNDEBUG -Wno-suggest-override -std=c++14 -MD -MT 
tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
 -MF 
tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o.d
 -o 
tools/clang/tools/extra/unittests/clang-tidy/CMakeFiles/ClangTidyTests.dir/ModernizeModuleTest.cpp.o
 -c 
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
/home/buildbots/docker-RHEL-buildbot/SetupBot/worker_env/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:210:15:
 error: unused function 'operator<<' [-Werror,-Wunused-function]
std::ostream <<(std::ostream ,
  ^
1 error generated.
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-13 Thread Richard via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG512273833136: [clang-tidy] Support expressions of literals 
in modernize-macro-to-enum (authored by LegalizeAdulthood).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,214 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// Accept integral literals.
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// Reject non-integral literals.
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+{false, "1i"},
+
+// Accept literals with these unary operators.
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// Reject invalid unary operators.
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// Accept valid binary operators.
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+{true, "1+ +1"}, // A space is needed to avoid being tokenized as ++ or --.
+{true, "1- -1"},
+{true, "1,1"},
+// Reject invalid binary operators.
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+{false, "1,"},
+{false, ",1"},
+
+// Accept valid ternary operators.
+{true, "1?1:1"},
+{true, "1?:1"}, // A gcc extension treats x ? : y as x ? x : y.
+// Reject invalid ternary operators.
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+
+// Accept parenthesized expressions.
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+  

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-13 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 429353.
LegalizeAdulthood added a comment.

Update documentation from review comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,214 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// Accept integral literals.
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// Reject non-integral literals.
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+{false, "1i"},
+
+// Accept literals with these unary operators.
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// Reject invalid unary operators.
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// Accept valid binary operators.
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+{true, "1+ +1"}, // A space is needed to avoid being tokenized as ++ or --.
+{true, "1- -1"},
+{true, "1,1"},
+// Reject invalid binary operators.
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+{false, "1,"},
+{false, ",1"},
+
+// Accept valid ternary operators.
+{true, "1?1:1"},
+{true, "1?:1"}, // A gcc extension treats x ? : y as x ? x : y.
+// Reject invalid ternary operators.
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+
+// Accept parenthesized expressions.
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-09 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > LegalizeAdulthood wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > LegalizeAdulthood wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > > > I know this is code moved from elsewhere, but I 
> > > > > > > > > > > > > suppose we never considered the odd edge case where a 
> > > > > > > > > > > > > user does something like `"foo"[0]` as a really awful 
> > > > > > > > > > > > > integer constant. :-D
> > > > > > > > > > > > It's always possible to write crazy contorted code and 
> > > > > > > > > > > > have a check not recognize it.  I don't think it's 
> > > > > > > > > > > > worthwhile to spend time trying to handle torture cases 
> > > > > > > > > > > > unless we can find data that shows it's prevalent in 
> > > > > > > > > > > > real world code.
> > > > > > > > > > > > 
> > > > > > > > > > > > If I was doing a code review and saw this:
> > > > > > > > > > > > ```
> > > > > > > > > > > > enum {
> > > > > > > > > > > > FOO = "foo"[0]
> > > > > > > > > > > > };
> > > > > > > > > > > > ```
> > > > > > > > > > > > I'd flag that in a code review as bogus, whereas if I 
> > > > > > > > > > > > saw:
> > > > > > > > > > > > ```
> > > > > > > > > > > > enum {
> > > > > > > > > > > > FOO = 'f'
> > > > > > > > > > > > };
> > > > > > > > > > > > ```
> > > > > > > > > > > > That would be acceptable, which is why character 
> > > > > > > > > > > > literals are accepted as an integral literal 
> > > > > > > > > > > > initializer for an enum in this check.
> > > > > > > > > > > >  I don't think it's worthwhile to spend time trying to 
> > > > > > > > > > > > handle torture cases unless we can find data that shows 
> > > > > > > > > > > > it's prevalent in real world code.
> > > > > > > > > > > 
> > > > > > > > > > > I think I'm okay agreeing to that in this particular 
> > > > > > > > > > > case, but this is more to point out that writing your own 
> > > > > > > > > > > parser is a maintenance burden. Users will hit cases 
> > > > > > > > > > > we've both forgotten about here, they'll file a bug, then 
> > > > > > > > > > > someone has to deal with it. It's very hard to justify to 
> > > > > > > > > > > users "we think you write silly code" because they often 
> > > > > > > > > > > have creative ways in which their code is not actually so 
> > > > > > > > > > > silly, especially when we support "most" valid 
> > > > > > > > > > > expressions.
> > > > > > > > > > Writing your own parser is unavoidable here because we 
> > > > > > > > > > can't just assume that any old thing will be a valid 
> > > > > > > > > > initializer just by looking at the set of tokens present in 
> > > > > > > > > > the macro body.  (There is a separate discussion going on 
> > > > > > > > > > about improving the preprocessor support and parsing things 
> > > > > > > > > > more deeply, but that isn't even to the point of a 
> > > > > > > > > > prototype yet.)  The worst thing we can do is create 
> > > > > > > > > > "fixits" that produce invalid code.
> > > > > > > > > > 
> > > > > > > > > > The worst that happens if your expression isn't recognized 
> > > > > > > > > > is that your macro isn't matched as a candidate for an 
> > > > > > > > > > enum.  You can always make it an enum manually and join it 
> > > > > > > > > > with adjacent macros that were recognized and converted.
> > > > > > > > > > 
> > > > > > > > > > As it stands, the check only recognizes a single literal 
> > > > > > > > > > with an optional unary operator.
> > > > > > > > > > 
> > > > > > > > > > This change expands the check to recognize a broad range of 
> > > > > > > > > > expressions, allowing those macros to be converted to 
> > > > > > > > > > enums.  I opened the issue because running 
> > > > > > > > > > modernize-macro-to-enum on the [[ 
> > > > > > > > > > https://github.com/InsightSoftwareConsortium/ITK | ITK 
> > > > > > > > > > codebase ]] showed some simple expressions involving 
> > > > > > > > > > literals that weren't recognized and converted.
> > > > > > > > > > 
> > > > > > > > > > If an expression isn't recognized and an issue is opened, 
> > > > > > > > > > it will be an enhancement request to support a broader 
> > > > > > > > > > range of expression, not a bug that this check created 
> > > > > > > > > > invalid code.
> > > > > > > > > > 
> > > > > > > > > > IMO, the more useful thing that's missing from the grammar 
> > > > > > > > > > is 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-09 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst:21-22
+  `,`, `-`, `+`, `*`, `/`, `%`, `&`, `|`, `^`, `<`, `>`, `<=`, `>=`,
+  `==`, `!=`, `||`, `&&`, `<<`, `>>` or `<=>`, and the ternary operator
+  `?:`.  Parenthesized expressions are also recognized.  This
+  recognizes the most valid expressions.  In particular, expressions

aaron.ballman wrote:
> Maybe we should also list the binary `?:` GNU extension and comma expressions?
Yes, I need to update the docs on that and also call out the potential false 
positives explicitly.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-09 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman accepted this revision.
aaron.ballman added a comment.
This revision is now accepted and ready to land.

In D124500#3499683 , 
@LegalizeAdulthood wrote:

> OK, so thinking about this review a little more, I propose this:
>
> - Take the check as is, but document that the initializing expressions may 
> result in an invalid enum, particularly for C which restricts the underlying 
> type to be `int`
> - Create a subsequent commit that rejects the enums where the language is C 
> and the initializing expression is a value larger than an `int` by rejecting 
> any macro where any integer token in the expression is larger than an `int`
> - Create an additional subsequent commit that not only matches the expression 
> but also computes the value and checks it for range.
>
> How does that sound?

I think that sounds like a good approach -- I expect the third bullet is when 
we'll likely learn whether we should have let Clang do the parsing or not 
(because then we're replicating not only the expression parsing but the 
constant evaluation calculations from the frontend).

I re-reviewed the patch as it stands today, and given the above plan, I think 
this is good to go. So it gets my LGTM and my thanks for the good discussions!




Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > aaron.ballman wrote:
> > > LegalizeAdulthood wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > > I know this is code moved from elsewhere, but I suppose 
> > > > > > > > > > > > we never considered the odd edge case where a user does 
> > > > > > > > > > > > something like `"foo"[0]` as a really awful integer 
> > > > > > > > > > > > constant. :-D
> > > > > > > > > > > It's always possible to write crazy contorted code and 
> > > > > > > > > > > have a check not recognize it.  I don't think it's 
> > > > > > > > > > > worthwhile to spend time trying to handle torture cases 
> > > > > > > > > > > unless we can find data that shows it's prevalent in real 
> > > > > > > > > > > world code.
> > > > > > > > > > > 
> > > > > > > > > > > If I was doing a code review and saw this:
> > > > > > > > > > > ```
> > > > > > > > > > > enum {
> > > > > > > > > > > FOO = "foo"[0]
> > > > > > > > > > > };
> > > > > > > > > > > ```
> > > > > > > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > > > > > > ```
> > > > > > > > > > > enum {
> > > > > > > > > > > FOO = 'f'
> > > > > > > > > > > };
> > > > > > > > > > > ```
> > > > > > > > > > > That would be acceptable, which is why character literals 
> > > > > > > > > > > are accepted as an integral literal initializer for an 
> > > > > > > > > > > enum in this check.
> > > > > > > > > > >  I don't think it's worthwhile to spend time trying to 
> > > > > > > > > > > handle torture cases unless we can find data that shows 
> > > > > > > > > > > it's prevalent in real world code.
> > > > > > > > > > 
> > > > > > > > > > I think I'm okay agreeing to that in this particular case, 
> > > > > > > > > > but this is more to point out that writing your own parser 
> > > > > > > > > > is a maintenance burden. Users will hit cases we've both 
> > > > > > > > > > forgotten about here, they'll file a bug, then someone has 
> > > > > > > > > > to deal with it. It's very hard to justify to users "we 
> > > > > > > > > > think you write silly code" because they often have 
> > > > > > > > > > creative ways in which their code is not actually so silly, 
> > > > > > > > > > especially when we support "most" valid expressions.
> > > > > > > > > Writing your own parser is unavoidable here because we can't 
> > > > > > > > > just assume that any old thing will be a valid initializer 
> > > > > > > > > just by looking at the set of tokens present in the macro 
> > > > > > > > > body.  (There is a separate discussion going on about 
> > > > > > > > > improving the preprocessor support and parsing things more 
> > > > > > > > > deeply, but that isn't even to the point of a prototype yet.) 
> > > > > > > > >  The worst thing we can do is create "fixits" that produce 
> > > > > > > > > invalid code.
> > > > > > > > > 
> > > > > > > > > The worst that happens if your expression isn't recognized is 
> > > > > > > > > that your macro isn't matched as a candidate for an enum.  
> > > > > > > > > You can always make it an enum manually and join it with 
> > > > > > > > > adjacent macros that were recognized and converted.
> > > > > > > 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-08 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked 2 inline comments as done.
LegalizeAdulthood added a comment.

OK, so thinking about this review a little more, I propose this:

- Take the check as is, but document that the initializing expressions may 
result in an invalid enum, particularly for C which restricts the underlying 
type to be `int`
- Create a subsequent commit that rejects the enums where the language is C and 
the initializing expression is a value larger than an `int` by rejecting any 
macro where any integer token in the expression is larger than an `int`
- Create an additional subsequent commit that not only matches the expression 
but also computes the value and checks it for range.

How does that sound?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-08 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked an inline comment as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > LegalizeAdulthood wrote:
> > > > aaron.ballman wrote:
> > > > > LegalizeAdulthood wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > LegalizeAdulthood wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > > I know this is code moved from elsewhere, but I suppose 
> > > > > > > > > > > we never considered the odd edge case where a user does 
> > > > > > > > > > > something like `"foo"[0]` as a really awful integer 
> > > > > > > > > > > constant. :-D
> > > > > > > > > > It's always possible to write crazy contorted code and have 
> > > > > > > > > > a check not recognize it.  I don't think it's worthwhile to 
> > > > > > > > > > spend time trying to handle torture cases unless we can 
> > > > > > > > > > find data that shows it's prevalent in real world code.
> > > > > > > > > > 
> > > > > > > > > > If I was doing a code review and saw this:
> > > > > > > > > > ```
> > > > > > > > > > enum {
> > > > > > > > > > FOO = "foo"[0]
> > > > > > > > > > };
> > > > > > > > > > ```
> > > > > > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > > > > > ```
> > > > > > > > > > enum {
> > > > > > > > > > FOO = 'f'
> > > > > > > > > > };
> > > > > > > > > > ```
> > > > > > > > > > That would be acceptable, which is why character literals 
> > > > > > > > > > are accepted as an integral literal initializer for an enum 
> > > > > > > > > > in this check.
> > > > > > > > > >  I don't think it's worthwhile to spend time trying to 
> > > > > > > > > > handle torture cases unless we can find data that shows 
> > > > > > > > > > it's prevalent in real world code.
> > > > > > > > > 
> > > > > > > > > I think I'm okay agreeing to that in this particular case, 
> > > > > > > > > but this is more to point out that writing your own parser is 
> > > > > > > > > a maintenance burden. Users will hit cases we've both 
> > > > > > > > > forgotten about here, they'll file a bug, then someone has to 
> > > > > > > > > deal with it. It's very hard to justify to users "we think 
> > > > > > > > > you write silly code" because they often have creative ways 
> > > > > > > > > in which their code is not actually so silly, especially when 
> > > > > > > > > we support "most" valid expressions.
> > > > > > > > Writing your own parser is unavoidable here because we can't 
> > > > > > > > just assume that any old thing will be a valid initializer just 
> > > > > > > > by looking at the set of tokens present in the macro body.  
> > > > > > > > (There is a separate discussion going on about improving the 
> > > > > > > > preprocessor support and parsing things more deeply, but that 
> > > > > > > > isn't even to the point of a prototype yet.)  The worst thing 
> > > > > > > > we can do is create "fixits" that produce invalid code.
> > > > > > > > 
> > > > > > > > The worst that happens if your expression isn't recognized is 
> > > > > > > > that your macro isn't matched as a candidate for an enum.  You 
> > > > > > > > can always make it an enum manually and join it with adjacent 
> > > > > > > > macros that were recognized and converted.
> > > > > > > > 
> > > > > > > > As it stands, the check only recognizes a single literal with 
> > > > > > > > an optional unary operator.
> > > > > > > > 
> > > > > > > > This change expands the check to recognize a broad range of 
> > > > > > > > expressions, allowing those macros to be converted to enums.  I 
> > > > > > > > opened the issue because running modernize-macro-to-enum on the 
> > > > > > > > [[ https://github.com/InsightSoftwareConsortium/ITK | ITK 
> > > > > > > > codebase ]] showed some simple expressions involving literals 
> > > > > > > > that weren't recognized and converted.
> > > > > > > > 
> > > > > > > > If an expression isn't recognized and an issue is opened, it 
> > > > > > > > will be an enhancement request to support a broader range of 
> > > > > > > > expression, not a bug that this check created invalid code.
> > > > > > > > 
> > > > > > > > IMO, the more useful thing that's missing from the grammar is 
> > > > > > > > recognizing `sizeof` expressions rather than indexing string 
> > > > > > > > literals with an integral literal subscript.
> > > > > > > > 
> > > > > > > > I had planned on doing another increment to recognize `sizeof` 
> > > > > > > > expressions.
> > > > > > > > Writing your own parser is unavoidable here because we can't 
> > > > > > > > just assume that any old thing will be a valid initializer just 
> > > > > > > > by looking 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-03 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > LegalizeAdulthood wrote:
> > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > I know this is code moved from elsewhere, but I suppose we 
> > > > > > > > > > never considered the odd edge case where a user does 
> > > > > > > > > > something like `"foo"[0]` as a really awful integer 
> > > > > > > > > > constant. :-D
> > > > > > > > > It's always possible to write crazy contorted code and have a 
> > > > > > > > > check not recognize it.  I don't think it's worthwhile to 
> > > > > > > > > spend time trying to handle torture cases unless we can find 
> > > > > > > > > data that shows it's prevalent in real world code.
> > > > > > > > > 
> > > > > > > > > If I was doing a code review and saw this:
> > > > > > > > > ```
> > > > > > > > > enum {
> > > > > > > > > FOO = "foo"[0]
> > > > > > > > > };
> > > > > > > > > ```
> > > > > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > > > > ```
> > > > > > > > > enum {
> > > > > > > > > FOO = 'f'
> > > > > > > > > };
> > > > > > > > > ```
> > > > > > > > > That would be acceptable, which is why character literals are 
> > > > > > > > > accepted as an integral literal initializer for an enum in 
> > > > > > > > > this check.
> > > > > > > > >  I don't think it's worthwhile to spend time trying to handle 
> > > > > > > > > torture cases unless we can find data that shows it's 
> > > > > > > > > prevalent in real world code.
> > > > > > > > 
> > > > > > > > I think I'm okay agreeing to that in this particular case, but 
> > > > > > > > this is more to point out that writing your own parser is a 
> > > > > > > > maintenance burden. Users will hit cases we've both forgotten 
> > > > > > > > about here, they'll file a bug, then someone has to deal with 
> > > > > > > > it. It's very hard to justify to users "we think you write 
> > > > > > > > silly code" because they often have creative ways in which 
> > > > > > > > their code is not actually so silly, especially when we support 
> > > > > > > > "most" valid expressions.
> > > > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > > > assume that any old thing will be a valid initializer just by 
> > > > > > > looking at the set of tokens present in the macro body.  (There 
> > > > > > > is a separate discussion going on about improving the 
> > > > > > > preprocessor support and parsing things more deeply, but that 
> > > > > > > isn't even to the point of a prototype yet.)  The worst thing we 
> > > > > > > can do is create "fixits" that produce invalid code.
> > > > > > > 
> > > > > > > The worst that happens if your expression isn't recognized is 
> > > > > > > that your macro isn't matched as a candidate for an enum.  You 
> > > > > > > can always make it an enum manually and join it with adjacent 
> > > > > > > macros that were recognized and converted.
> > > > > > > 
> > > > > > > As it stands, the check only recognizes a single literal with an 
> > > > > > > optional unary operator.
> > > > > > > 
> > > > > > > This change expands the check to recognize a broad range of 
> > > > > > > expressions, allowing those macros to be converted to enums.  I 
> > > > > > > opened the issue because running modernize-macro-to-enum on the 
> > > > > > > [[ https://github.com/InsightSoftwareConsortium/ITK | ITK 
> > > > > > > codebase ]] showed some simple expressions involving literals 
> > > > > > > that weren't recognized and converted.
> > > > > > > 
> > > > > > > If an expression isn't recognized and an issue is opened, it will 
> > > > > > > be an enhancement request to support a broader range of 
> > > > > > > expression, not a bug that this check created invalid code.
> > > > > > > 
> > > > > > > IMO, the more useful thing that's missing from the grammar is 
> > > > > > > recognizing `sizeof` expressions rather than indexing string 
> > > > > > > literals with an integral literal subscript.
> > > > > > > 
> > > > > > > I had planned on doing another increment to recognize `sizeof` 
> > > > > > > expressions.
> > > > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > > > assume that any old thing will be a valid initializer just by 
> > > > > > > looking at the set of tokens present in the macro body. 
> > > > > > 
> > > > > > If you ran the token sequence through clang's parser and got an AST 
> > > > > > node out, you'd have significantly *more* information as to whether 
> > > > > > 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > I know this is code moved from elsewhere, but I suppose we 
> > > > > > > > never considered the odd edge case where a user does something 
> > > > > > > > like `"foo"[0]` as a really awful integer constant. :-D
> > > > > > > It's always possible to write crazy contorted code and have a 
> > > > > > > check not recognize it.  I don't think it's worthwhile to spend 
> > > > > > > time trying to handle torture cases unless we can find data that 
> > > > > > > shows it's prevalent in real world code.
> > > > > > > 
> > > > > > > If I was doing a code review and saw this:
> > > > > > > ```
> > > > > > > enum {
> > > > > > > FOO = "foo"[0]
> > > > > > > };
> > > > > > > ```
> > > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > > ```
> > > > > > > enum {
> > > > > > > FOO = 'f'
> > > > > > > };
> > > > > > > ```
> > > > > > > That would be acceptable, which is why character literals are 
> > > > > > > accepted as an integral literal initializer for an enum in this 
> > > > > > > check.
> > > > > > >  I don't think it's worthwhile to spend time trying to handle 
> > > > > > > torture cases unless we can find data that shows it's prevalent 
> > > > > > > in real world code.
> > > > > > 
> > > > > > I think I'm okay agreeing to that in this particular case, but this 
> > > > > > is more to point out that writing your own parser is a maintenance 
> > > > > > burden. Users will hit cases we've both forgotten about here, 
> > > > > > they'll file a bug, then someone has to deal with it. It's very 
> > > > > > hard to justify to users "we think you write silly code" because 
> > > > > > they often have creative ways in which their code is not actually 
> > > > > > so silly, especially when we support "most" valid expressions.
> > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > assume that any old thing will be a valid initializer just by looking 
> > > > > at the set of tokens present in the macro body.  (There is a separate 
> > > > > discussion going on about improving the preprocessor support and 
> > > > > parsing things more deeply, but that isn't even to the point of a 
> > > > > prototype yet.)  The worst thing we can do is create "fixits" that 
> > > > > produce invalid code.
> > > > > 
> > > > > The worst that happens if your expression isn't recognized is that 
> > > > > your macro isn't matched as a candidate for an enum.  You can always 
> > > > > make it an enum manually and join it with adjacent macros that were 
> > > > > recognized and converted.
> > > > > 
> > > > > As it stands, the check only recognizes a single literal with an 
> > > > > optional unary operator.
> > > > > 
> > > > > This change expands the check to recognize a broad range of 
> > > > > expressions, allowing those macros to be converted to enums.  I 
> > > > > opened the issue because running modernize-macro-to-enum on the [[ 
> > > > > https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] 
> > > > > showed some simple expressions involving literals that weren't 
> > > > > recognized and converted.
> > > > > 
> > > > > If an expression isn't recognized and an issue is opened, it will be 
> > > > > an enhancement request to support a broader range of expression, not 
> > > > > a bug that this check created invalid code.
> > > > > 
> > > > > IMO, the more useful thing that's missing from the grammar is 
> > > > > recognizing `sizeof` expressions rather than indexing string literals 
> > > > > with an integral literal subscript.
> > > > > 
> > > > > I had planned on doing another increment to recognize `sizeof` 
> > > > > expressions.
> > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > assume that any old thing will be a valid initializer just by looking 
> > > > > at the set of tokens present in the macro body. 
> > > > 
> > > > If you ran the token sequence through clang's parser and got an AST 
> > > > node out, you'd have significantly *more* information as to whether 
> > > > something is a valid enum constant initializer because you can check 
> > > > that it's an actual constant expression *and* that it's within a valid 
> > > > range of values. This not only fixes edge case bugs with your approach 
> > > > (like the fact that you can generate a series of literal expressions 
> > > > that result in a value too large to store within an enumerator 
> > > > constant), but it enables 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > I know this is code moved from elsewhere, but I suppose we 
> > > > > > > > never considered the odd edge case where a user does something 
> > > > > > > > like `"foo"[0]` as a really awful integer constant. :-D
> > > > > > > It's always possible to write crazy contorted code and have a 
> > > > > > > check not recognize it.  I don't think it's worthwhile to spend 
> > > > > > > time trying to handle torture cases unless we can find data that 
> > > > > > > shows it's prevalent in real world code.
> > > > > > > 
> > > > > > > If I was doing a code review and saw this:
> > > > > > > ```
> > > > > > > enum {
> > > > > > > FOO = "foo"[0]
> > > > > > > };
> > > > > > > ```
> > > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > > ```
> > > > > > > enum {
> > > > > > > FOO = 'f'
> > > > > > > };
> > > > > > > ```
> > > > > > > That would be acceptable, which is why character literals are 
> > > > > > > accepted as an integral literal initializer for an enum in this 
> > > > > > > check.
> > > > > > >  I don't think it's worthwhile to spend time trying to handle 
> > > > > > > torture cases unless we can find data that shows it's prevalent 
> > > > > > > in real world code.
> > > > > > 
> > > > > > I think I'm okay agreeing to that in this particular case, but this 
> > > > > > is more to point out that writing your own parser is a maintenance 
> > > > > > burden. Users will hit cases we've both forgotten about here, 
> > > > > > they'll file a bug, then someone has to deal with it. It's very 
> > > > > > hard to justify to users "we think you write silly code" because 
> > > > > > they often have creative ways in which their code is not actually 
> > > > > > so silly, especially when we support "most" valid expressions.
> > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > assume that any old thing will be a valid initializer just by looking 
> > > > > at the set of tokens present in the macro body.  (There is a separate 
> > > > > discussion going on about improving the preprocessor support and 
> > > > > parsing things more deeply, but that isn't even to the point of a 
> > > > > prototype yet.)  The worst thing we can do is create "fixits" that 
> > > > > produce invalid code.
> > > > > 
> > > > > The worst that happens if your expression isn't recognized is that 
> > > > > your macro isn't matched as a candidate for an enum.  You can always 
> > > > > make it an enum manually and join it with adjacent macros that were 
> > > > > recognized and converted.
> > > > > 
> > > > > As it stands, the check only recognizes a single literal with an 
> > > > > optional unary operator.
> > > > > 
> > > > > This change expands the check to recognize a broad range of 
> > > > > expressions, allowing those macros to be converted to enums.  I 
> > > > > opened the issue because running modernize-macro-to-enum on the [[ 
> > > > > https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] 
> > > > > showed some simple expressions involving literals that weren't 
> > > > > recognized and converted.
> > > > > 
> > > > > If an expression isn't recognized and an issue is opened, it will be 
> > > > > an enhancement request to support a broader range of expression, not 
> > > > > a bug that this check created invalid code.
> > > > > 
> > > > > IMO, the more useful thing that's missing from the grammar is 
> > > > > recognizing `sizeof` expressions rather than indexing string literals 
> > > > > with an integral literal subscript.
> > > > > 
> > > > > I had planned on doing another increment to recognize `sizeof` 
> > > > > expressions.
> > > > > Writing your own parser is unavoidable here because we can't just 
> > > > > assume that any old thing will be a valid initializer just by looking 
> > > > > at the set of tokens present in the macro body. 
> > > > 
> > > > If you ran the token sequence through clang's parser and got an AST 
> > > > node out, you'd have significantly *more* information as to whether 
> > > > something is a valid enum constant initializer because you can check 
> > > > that it's an actual constant expression *and* that it's within a valid 
> > > > range of values. This not only fixes edge case bugs with your approach 
> > > > (like the fact that you can generate a series of literal expressions 
> > > > that result in a value too large to store within an enumerator 
> > > > constant), but it enables new 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > LegalizeAdulthood wrote:
> > > > aaron.ballman wrote:
> > > > > LegalizeAdulthood wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > I know this is code moved from elsewhere, but I suppose we never 
> > > > > > > considered the odd edge case where a user does something like 
> > > > > > > `"foo"[0]` as a really awful integer constant. :-D
> > > > > > It's always possible to write crazy contorted code and have a check 
> > > > > > not recognize it.  I don't think it's worthwhile to spend time 
> > > > > > trying to handle torture cases unless we can find data that shows 
> > > > > > it's prevalent in real world code.
> > > > > > 
> > > > > > If I was doing a code review and saw this:
> > > > > > ```
> > > > > > enum {
> > > > > > FOO = "foo"[0]
> > > > > > };
> > > > > > ```
> > > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > > ```
> > > > > > enum {
> > > > > > FOO = 'f'
> > > > > > };
> > > > > > ```
> > > > > > That would be acceptable, which is why character literals are 
> > > > > > accepted as an integral literal initializer for an enum in this 
> > > > > > check.
> > > > > >  I don't think it's worthwhile to spend time trying to handle 
> > > > > > torture cases unless we can find data that shows it's prevalent in 
> > > > > > real world code.
> > > > > 
> > > > > I think I'm okay agreeing to that in this particular case, but this 
> > > > > is more to point out that writing your own parser is a maintenance 
> > > > > burden. Users will hit cases we've both forgotten about here, they'll 
> > > > > file a bug, then someone has to deal with it. It's very hard to 
> > > > > justify to users "we think you write silly code" because they often 
> > > > > have creative ways in which their code is not actually so silly, 
> > > > > especially when we support "most" valid expressions.
> > > > Writing your own parser is unavoidable here because we can't just 
> > > > assume that any old thing will be a valid initializer just by looking 
> > > > at the set of tokens present in the macro body.  (There is a separate 
> > > > discussion going on about improving the preprocessor support and 
> > > > parsing things more deeply, but that isn't even to the point of a 
> > > > prototype yet.)  The worst thing we can do is create "fixits" that 
> > > > produce invalid code.
> > > > 
> > > > The worst that happens if your expression isn't recognized is that your 
> > > > macro isn't matched as a candidate for an enum.  You can always make it 
> > > > an enum manually and join it with adjacent macros that were recognized 
> > > > and converted.
> > > > 
> > > > As it stands, the check only recognizes a single literal with an 
> > > > optional unary operator.
> > > > 
> > > > This change expands the check to recognize a broad range of 
> > > > expressions, allowing those macros to be converted to enums.  I opened 
> > > > the issue because running modernize-macro-to-enum on the [[ 
> > > > https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] 
> > > > showed some simple expressions involving literals that weren't 
> > > > recognized and converted.
> > > > 
> > > > If an expression isn't recognized and an issue is opened, it will be an 
> > > > enhancement request to support a broader range of expression, not a bug 
> > > > that this check created invalid code.
> > > > 
> > > > IMO, the more useful thing that's missing from the grammar is 
> > > > recognizing `sizeof` expressions rather than indexing string literals 
> > > > with an integral literal subscript.
> > > > 
> > > > I had planned on doing another increment to recognize `sizeof` 
> > > > expressions.
> > > > Writing your own parser is unavoidable here because we can't just 
> > > > assume that any old thing will be a valid initializer just by looking 
> > > > at the set of tokens present in the macro body. 
> > > 
> > > If you ran the token sequence through clang's parser and got an AST node 
> > > out, you'd have significantly *more* information as to whether something 
> > > is a valid enum constant initializer because you can check that it's an 
> > > actual constant expression *and* that it's within a valid range of 
> > > values. This not only fixes edge case bugs with your approach (like the 
> > > fact that you can generate a series of literal expressions that result in 
> > > a value too large to store within an enumerator constant), but it enables 
> > > new functionality your approach currently disallows (like using constexpr 
> > > variables instead of just numeric literals).
> > > 
> > > So I don't agree that it's unavoidable to write another 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D124500#3485924 , 
@LegalizeAdulthood wrote:

> In D124500#3485462 , @aaron.ballman 
> wrote:
>
>> I largely agree, but I've found cases where we'll convert correct code to 
>> incorrect code, so it's a bit stronger than that.
>
> Are you talking generally, or with this check?  I can't see how this check
> is going to generate incorrect code (so far).

Specifically this check.

>> I think that's a reasonable goal, but we're not meeting the "never ever 
>> generate invalid code" part.
>
> How so?  Can you give an example where this check will produce invalid code?

As posted before:

  #define FINE 1LL << 30LL
  #define BAD 1LL << 31LL
  #define ALSO_BAD 1LL << 32L

Now with godbolt goodness: https://godbolt.org/z/Tzbe8qWT5




Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > I know this is code moved from elsewhere, but I suppose we never 
> > > > > > considered the odd edge case where a user does something like 
> > > > > > `"foo"[0]` as a really awful integer constant. :-D
> > > > > It's always possible to write crazy contorted code and have a check 
> > > > > not recognize it.  I don't think it's worthwhile to spend time trying 
> > > > > to handle torture cases unless we can find data that shows it's 
> > > > > prevalent in real world code.
> > > > > 
> > > > > If I was doing a code review and saw this:
> > > > > ```
> > > > > enum {
> > > > > FOO = "foo"[0]
> > > > > };
> > > > > ```
> > > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > > ```
> > > > > enum {
> > > > > FOO = 'f'
> > > > > };
> > > > > ```
> > > > > That would be acceptable, which is why character literals are 
> > > > > accepted as an integral literal initializer for an enum in this check.
> > > > >  I don't think it's worthwhile to spend time trying to handle torture 
> > > > > cases unless we can find data that shows it's prevalent in real world 
> > > > > code.
> > > > 
> > > > I think I'm okay agreeing to that in this particular case, but this is 
> > > > more to point out that writing your own parser is a maintenance burden. 
> > > > Users will hit cases we've both forgotten about here, they'll file a 
> > > > bug, then someone has to deal with it. It's very hard to justify to 
> > > > users "we think you write silly code" because they often have creative 
> > > > ways in which their code is not actually so silly, especially when we 
> > > > support "most" valid expressions.
> > > Writing your own parser is unavoidable here because we can't just assume 
> > > that any old thing will be a valid initializer just by looking at the set 
> > > of tokens present in the macro body.  (There is a separate discussion 
> > > going on about improving the preprocessor support and parsing things more 
> > > deeply, but that isn't even to the point of a prototype yet.)  The worst 
> > > thing we can do is create "fixits" that produce invalid code.
> > > 
> > > The worst that happens if your expression isn't recognized is that your 
> > > macro isn't matched as a candidate for an enum.  You can always make it 
> > > an enum manually and join it with adjacent macros that were recognized 
> > > and converted.
> > > 
> > > As it stands, the check only recognizes a single literal with an optional 
> > > unary operator.
> > > 
> > > This change expands the check to recognize a broad range of expressions, 
> > > allowing those macros to be converted to enums.  I opened the issue 
> > > because running modernize-macro-to-enum on the [[ 
> > > https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] showed 
> > > some simple expressions involving literals that weren't recognized and 
> > > converted.
> > > 
> > > If an expression isn't recognized and an issue is opened, it will be an 
> > > enhancement request to support a broader range of expression, not a bug 
> > > that this check created invalid code.
> > > 
> > > IMO, the more useful thing that's missing from the grammar is recognizing 
> > > `sizeof` expressions rather than indexing string literals with an 
> > > integral literal subscript.
> > > 
> > > I had planned on doing another increment to recognize `sizeof` 
> > > expressions.
> > > Writing your own parser is unavoidable here because we can't just assume 
> > > that any old thing will be a valid initializer just by looking at the set 
> > > of tokens present in the macro body. 
> > 
> > If you ran the token sequence through clang's parser and got an AST node 
> > out, you'd have significantly *more* information as to 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > LegalizeAdulthood wrote:
> > > > aaron.ballman wrote:
> > > > > I know this is code moved from elsewhere, but I suppose we never 
> > > > > considered the odd edge case where a user does something like 
> > > > > `"foo"[0]` as a really awful integer constant. :-D
> > > > It's always possible to write crazy contorted code and have a check not 
> > > > recognize it.  I don't think it's worthwhile to spend time trying to 
> > > > handle torture cases unless we can find data that shows it's prevalent 
> > > > in real world code.
> > > > 
> > > > If I was doing a code review and saw this:
> > > > ```
> > > > enum {
> > > > FOO = "foo"[0]
> > > > };
> > > > ```
> > > > I'd flag that in a code review as bogus, whereas if I saw:
> > > > ```
> > > > enum {
> > > > FOO = 'f'
> > > > };
> > > > ```
> > > > That would be acceptable, which is why character literals are accepted 
> > > > as an integral literal initializer for an enum in this check.
> > > >  I don't think it's worthwhile to spend time trying to handle torture 
> > > > cases unless we can find data that shows it's prevalent in real world 
> > > > code.
> > > 
> > > I think I'm okay agreeing to that in this particular case, but this is 
> > > more to point out that writing your own parser is a maintenance burden. 
> > > Users will hit cases we've both forgotten about here, they'll file a bug, 
> > > then someone has to deal with it. It's very hard to justify to users "we 
> > > think you write silly code" because they often have creative ways in 
> > > which their code is not actually so silly, especially when we support 
> > > "most" valid expressions.
> > Writing your own parser is unavoidable here because we can't just assume 
> > that any old thing will be a valid initializer just by looking at the set 
> > of tokens present in the macro body.  (There is a separate discussion going 
> > on about improving the preprocessor support and parsing things more deeply, 
> > but that isn't even to the point of a prototype yet.)  The worst thing we 
> > can do is create "fixits" that produce invalid code.
> > 
> > The worst that happens if your expression isn't recognized is that your 
> > macro isn't matched as a candidate for an enum.  You can always make it an 
> > enum manually and join it with adjacent macros that were recognized and 
> > converted.
> > 
> > As it stands, the check only recognizes a single literal with an optional 
> > unary operator.
> > 
> > This change expands the check to recognize a broad range of expressions, 
> > allowing those macros to be converted to enums.  I opened the issue because 
> > running modernize-macro-to-enum on the [[ 
> > https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] showed 
> > some simple expressions involving literals that weren't recognized and 
> > converted.
> > 
> > If an expression isn't recognized and an issue is opened, it will be an 
> > enhancement request to support a broader range of expression, not a bug 
> > that this check created invalid code.
> > 
> > IMO, the more useful thing that's missing from the grammar is recognizing 
> > `sizeof` expressions rather than indexing string literals with an integral 
> > literal subscript.
> > 
> > I had planned on doing another increment to recognize `sizeof` expressions.
> > Writing your own parser is unavoidable here because we can't just assume 
> > that any old thing will be a valid initializer just by looking at the set 
> > of tokens present in the macro body. 
> 
> If you ran the token sequence through clang's parser and got an AST node out, 
> you'd have significantly *more* information as to whether something is a 
> valid enum constant initializer because you can check that it's an actual 
> constant expression *and* that it's within a valid range of values. This not 
> only fixes edge case bugs with your approach (like the fact that you can 
> generate a series of literal expressions that result in a value too large to 
> store within an enumerator constant), but it enables new functionality your 
> approach currently disallows (like using constexpr variables instead of just 
> numeric literals).
> 
> So I don't agree that it's unavoidable to write another custom parser.
You keep bringing up the idea that the values have to be known, but so far they 
don't.

Remember, we are replacing macro identifiers with anonymous enum identifiers.  
We aren't specifying a restricting type to the enum, so as long as it's a valid 
integral literal expression, we're not changing any semantics.  Unscoped enums 
also allow arbitrary conversions to/from an underlying integral type 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added a comment.

In D124500#3485462 , @aaron.ballman 
wrote:

> In D124500#3483328 , 
> @LegalizeAdulthood wrote:
>
>> In D124500#3483224 , 
>> @aaron.ballman wrote:
>>
>>> To clarify my previous comments, I'm fine punting on the edge cases until 
>>> user reports come in, so don't let them block this review if you feel 
>>> strongly about not supporting them. But when user reports start coming in, 
>>> at some point I might start asking to replace the custom parser with 
>>> calling into the clangParse library through some invented utility interface 
>>> so that we don't have to deal with a long tail of bug reports.
>>
>> Yeah, I think punting on edge cases is the right thing to do here.  As I say,
>> the worst that happens is your macro isn't converted automatically when you
>> could convert it manually.
>
> I largely agree, but I've found cases where we'll convert correct code to 
> incorrect code, so it's a bit stronger than that.

Are you talking generally, or with this check?  I can't see how this check
is going to generate incorrect code (so far).

> I think that's a reasonable goal, but we're not meeting the "never ever 
> generate invalid code" part.

How so?  Can you give an example where this check will produce invalid code?

>> As for calling into `clangParse`, I think that would be overkill for a 
>> couple reasons.
>> First, the real parser is going to do a lot of work creating AST nodes which 
>> we will
>> never use, except to traverse the structure looking for things that would 
>> invalidate
>> it as a candidate for a constant initializing expression.  Second, we only 
>> need to
>> match the structure, we don't need to extract any information from the token 
>> stream
>> other than a "thumbs up" or "thumbs down" that it is a valid initializing 
>> expression.
>
> There's a few reasons I disagree with this. First, you need to know the value 
> of the
> constant expression in order to know whether it's valid as an enumeration 
> constant.

I'm not following you.  Nothing requires knowing this yet.

>   constexpr int a = 12;
>   constexpr int foo() { return 12; }
>   
>   #define FOO (a + 1)
>   #define BAR (a + 2)
>   #define BAZ (a + 3)
>   #define QUUX (foo() + 4)

`QUUX` will never be converted to an enum by this check.  It references an 
identifier `foo`.

> The situations under which it will break correct code should be something we 
> document explicitly

You haven't shown an example yet where it will break code.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-05-02 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D124500#3483328 , 
@LegalizeAdulthood wrote:

> In D124500#3483224 , @aaron.ballman 
> wrote:
>
>> To clarify my previous comments, I'm fine punting on the edge cases until 
>> user reports come in, so don't let them block this review if you feel 
>> strongly about not supporting them. But when user reports start coming in, 
>> at some point I might start asking to replace the custom parser with calling 
>> into the clangParse library through some invented utility interface so that 
>> we don't have to deal with a long tail of bug reports.
>
> Yeah, I think punting on edge cases is the right thing to do here.  As I say,
> the worst that happens is your macro isn't converted automatically when you
> could convert it manually.

I largely agree, but I've found cases where we'll convert correct code to 
incorrect code, so it's a bit stronger than that.

> Maybe we're thinking about this check differently.
>
> I want this check to do the majority of the heavy lifting so that I'm only 
> left with a
> few "weird" macros that I might have to convert by hand.  I never, ever, ever 
> want
> this check to generate invalid code.  Therefore it is of paramount importance 
> that
> it be conservative about what it recognizes as a candidate for conversion.

I think that's a reasonable goal, but we're not meeting the "never ever 
generate invalid code" part. I already know we can break correct C and C++ code 
through overflow. Should we ever allow an option to use an enum with a fixed 
underlying type in C++ (either `enum class` or `enum : type` form), we'll have 
the same breakage there but at different thresholds.

> I think this is the baseline for all the modernize checks, really.  There are 
> still cases
> where loop-convert doesn't recognize an iterator based loop that could be
> converted to a range-based for loop.  The most important thing is that 
> loop-convert
> doesn't take my weird iterator based loop and convert it to a range based for 
> loop
> that doesn't compile.
>
> As for calling into `clangParse`, I think that would be overkill for a couple 
> reasons.
> First, the real parser is going to do a lot of work creating AST nodes which 
> we will
> never use, except to traverse the structure looking for things that would 
> invalidate
> it as a candidate for a constant initializing expression.  Second, we only 
> need to
> match the structure, we don't need to extract any information from the token 
> stream
> other than a "thumbs up" or "thumbs down" that it is a valid initializing 
> expression.

There's a few reasons I disagree with this. First, you need to know the value 
of the constant expression in order to know whether it's valid as an 
enumeration constant. That alone requires expression evaluation capabilities, 
assuming you want the check to behave correctly for those kinds of cases. But 
second, without support for generating that AST and validating it, you can 
never handle cases like this:

  constexpr int a = 12;
  constexpr int foo() { return 12; }
  
  #define FOO (a + 1)
  #define BAR (a + 2)
  #define BAZ (a + 3)
  #define QUUX (foo() + 4)

because you have no way to know whether or not that constant expression is 
valid based solely on token soup (especially when you start to factor in 
namespaces, qualified lookup, template instantiations, etc). So I think 
avoiding the parser will limit the utility of this check. And maybe that's 
fine, maybe it's only ever intended to convert C code using defines to C code 
using enumerations or other such simple cases.

> Many times in clang-tidy reviews performance concerns are raised and I think
> matching the token stream with the recursive descent matcher here is going to 
> be
> much, much faster than invoking `clangParse`, particularly since the matcher 
> bails
> out early on the first token that doesn't match.

Absolutely 100% agreed on this point.

> The only thing I can think of that
> would make it faster is if we could get the lexed tokens from the preprocessor
> instead of making it re-lex the macro body, but that's a change beyond the 
> scope
> of this check or even clang-tidy.

Yeah, and that wouldn't even be sufficient because I still think we're going to 
want to know the *value* of the expression at some point.

All that "this is the wrong way!" doom and gloom aside, I still think 
you're fine to proceed with the current approach if you'd like to. The 
situations under which it will break correct code should be something we 
document explicitly in the check somewhere, but they feel sufficiently like 
edge cases to me that I'm fine moving forward with the caution in mind that if 
this becomes too much more problematic in the future, we have *a* path forward 
we could use to improve things should we decide we need to.




Comment at: 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-30 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added a comment.

FYI, once nice addition of the parsing of macro bodies is that it paves the way 
for
a modernize-macro-to-function check that converts function-like macros that
compute values to template functions.  Once this change has landed, I'll be 
putting
up a review for that.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-30 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 426263.
LegalizeAdulthood added a comment.

Recognize comma operator expressions


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,214 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// Accept integral literals.
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// Reject non-integral literals.
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+{false, "1i"},
+
+// Accept literals with these unary operators.
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// Reject invalid unary operators.
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// Accept valid binary operators.
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+{true, "1+ +1"}, // A space is needed to avoid being tokenized as ++ or --.
+{true, "1- -1"},
+{true, "1,1"},
+// Reject invalid binary operators.
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+{false, "1,"},
+{false, ",1"},
+
+// Accept valid ternary operators.
+{true, "1?1:1"},
+{true, "1?:1"}, // A gcc extension treats x ? : y as x ? x : y.
+// Reject invalid ternary operators.
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+
+// Accept parenthesized expressions.
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-30 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > Comma operator?
> > > Remember that the use case here is identifying expressions that are 
> > > initializers for an enum.  If you were doing a code review and you saw 
> > > this:
> > > ```
> > > enum {
> > > FOO = (2, 3)
> > > };
> > > ```
> > > Would you be OK with that?  I wouldn't.  Clang even warns about it: 
> > > https://godbolt.org/z/Y641cb8Y9
> > > 
> > > Therefore I deliberately left comma operator out of the grammar.
> > This is another case where I think you're predicting that users won't be 
> > using the full expressivity of the language and we'll get bug reports 
> > later. Again, in insolation, I tend to agree that I wouldn't be happy 
> > seeing that code. However, users write some very creative code and there's 
> > no technical reason why we can't or shouldn't handle comma operators.
> "Don't let the perfect be the enemy of the good."
> 
> My inclination is to simply explicitly state that comma operator is not 
> recognized in the documentation.  It's already implicit by it's absence from 
> the list of recognized operators.
> 
> Again, the worst that happens is that your macro isn't converted.
> 
> I'm open to being convinced that it's important, but you haven't convinced me 
> yet `:)`
It wasn't much extra work/code to add comma operator support so I've done that.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 426160.
LegalizeAdulthood marked an inline comment as done.
LegalizeAdulthood added a comment.

Update from review comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,210 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// Accept integral literals.
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// Reject non-integral literals.
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+{false, "1i"},
+
+// Accept literals with these unary operators.
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// Reject invalid unary operators.
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// Accept valid binary operators.
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+{true, "1+ +1"}, // A space is needed to avoid being tokenized as ++ or --.
+{true, "1- -1"},
+// Reject invalid binary operators.
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// Accept valid ternary operators.
+{true, "1?1:1"},
+{true, "1?:1"}, // A gcc extension treats x ? : y as x ? x : y.
+// Reject invalid ternary operators.
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+
+// Accept parenthesized expressions.
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked 7 inline comments as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h:18-20
+// Parses an array of tokens and returns true if they conform to the rules of
+// C++ for whole expressions involving integral literals.  Follows the operator
+// precedence rules of C++.

aaron.ballman wrote:
> Oh boy, I'm not super excited about having another parser to maintain...
> 
> It'd be nice if we had a ParserUtils.cpp/h file that made it easier to go 
> from an arbitrary array of tokens to AST nodes + success/failure information 
> on parsing the tokens. It's not strictly needed for what you're trying to 
> accomplish here, but it would be a much more general interface and it would 
> remove the support burden from adding another parser that's out of Clang's 
> tree.
Yeah, I'm not a fan of duplication either, but see my earlier comments about 
why I think clangParse is overkill here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked 3 inline comments as done.
LegalizeAdulthood added inline comments.



Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:66
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > ```
> > 12i
> > .0
> > ```
> > 
> `.0` is already covered by the case `1.23`.  I'm not home brewing 
> tokenization, but using the Lexer to do that.
> 
> `12i` I need to investigate to find out what the Lexer does.
OK, so `12i` turns into `numeric_constant` token, so I've added test cases to 
exclude those and enhanced the matcher.

Essentially that's a bug in the existing implementation that `12i` wasn't 
rejected outright.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added a comment.

In D124500#3483224 , @aaron.ballman 
wrote:

> To clarify my previous comments, I'm fine punting on the edge cases until 
> user reports come in, so don't let them block this review if you feel 
> strongly about not supporting them. But when user reports start coming in, at 
> some point I might start asking to replace the custom parser with calling 
> into the clangParse library through some invented utility interface so that 
> we don't have to deal with a long tail of bug reports.

Yeah, I think punting on edge cases is the right thing to do here.  As I say,
the worst that happens is your macro isn't converted automatically when you
could convert it manually.

Maybe we're thinking about this check differently.

I want this check to do the majority of the heavy lifting so that I'm only left 
with a
few "weird" macros that I might have to convert by hand.  I never, ever, ever 
want
this check to generate invalid code.  Therefore it is of paramount importance 
that
it be conservative about what it recognizes as a candidate for conversion.

I think this is the baseline for all the modernize checks, really.  There are 
still cases
where loop-convert doesn't recognize an iterator based loop that could be
converted to a range-based for loop.  The most important thing is that 
loop-convert
doesn't take my weird iterator based loop and convert it to a range based for 
loop
that doesn't compile.

As for calling into `clangParse`, I think that would be overkill for a couple 
reasons.
First, the real parser is going to do a lot of work creating AST nodes which we 
will
never use, except to traverse the structure looking for things that would 
invalidate
it as a candidate for a constant initializing expression.  Second, we only need 
to
match the structure, we don't need to extract any information from the token 
stream
other than a "thumbs up" or "thumbs down" that it is a valid initializing 
expression.
Many times in clang-tidy reviews performance concerns are raised and I think
matching the token stream with the recursive descent matcher here is going to be
much, much faster than invoking `clangParse`, particularly since the matcher 
bails
out early on the first token that doesn't match.  The only thing I can think of 
that
would make it faster is if we could get the lexed tokens from the preprocessor
instead of making it re-lex the macro body, but that's a change beyond the scope
of this check or even clang-tidy.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked 3 inline comments as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > I know this is code moved from elsewhere, but I suppose we never 
> > > considered the odd edge case where a user does something like `"foo"[0]` 
> > > as a really awful integer constant. :-D
> > It's always possible to write crazy contorted code and have a check not 
> > recognize it.  I don't think it's worthwhile to spend time trying to handle 
> > torture cases unless we can find data that shows it's prevalent in real 
> > world code.
> > 
> > If I was doing a code review and saw this:
> > ```
> > enum {
> > FOO = "foo"[0]
> > };
> > ```
> > I'd flag that in a code review as bogus, whereas if I saw:
> > ```
> > enum {
> > FOO = 'f'
> > };
> > ```
> > That would be acceptable, which is why character literals are accepted as 
> > an integral literal initializer for an enum in this check.
> >  I don't think it's worthwhile to spend time trying to handle torture cases 
> > unless we can find data that shows it's prevalent in real world code.
> 
> I think I'm okay agreeing to that in this particular case, but this is more 
> to point out that writing your own parser is a maintenance burden. Users will 
> hit cases we've both forgotten about here, they'll file a bug, then someone 
> has to deal with it. It's very hard to justify to users "we think you write 
> silly code" because they often have creative ways in which their code is not 
> actually so silly, especially when we support "most" valid expressions.
Writing your own parser is unavoidable here because we can't just assume that 
any old thing will be a valid initializer just by looking at the set of tokens 
present in the macro body.  (There is a separate discussion going on about 
improving the preprocessor support and parsing things more deeply, but that 
isn't even to the point of a prototype yet.)  The worst thing we can do is 
create "fixits" that produce invalid code.

The worst that happens if your expression isn't recognized is that your macro 
isn't matched as a candidate for an enum.  You can always make it an enum 
manually and join it with adjacent macros that were recognized and converted.

As it stands, the check only recognizes a single literal with an optional unary 
operator.

This change expands the check to recognize a broad range of expressions, 
allowing those macros to be converted to enums.  I opened the issue because 
running modernize-macro-to-enum on the [[ 
https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] showed some 
simple expressions involving literals that weren't recognized and converted.

If an expression isn't recognized and an issue is opened, it will be an 
enhancement request to support a broader range of expression, not a bug that 
this check created invalid code.

IMO, the more useful thing that's missing from the grammar is recognizing 
`sizeof` expressions rather than indexing string literals with an integral 
literal subscript.

I had planned on doing another increment to recognize `sizeof` expressions.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > Comma operator?
> > Remember that the use case here is identifying expressions that are 
> > initializers for an enum.  If you were doing a code review and you saw this:
> > ```
> > enum {
> > FOO = (2, 3)
> > };
> > ```
> > Would you be OK with that?  I wouldn't.  Clang even warns about it: 
> > https://godbolt.org/z/Y641cb8Y9
> > 
> > Therefore I deliberately left comma operator out of the grammar.
> This is another case where I think you're predicting that users won't be 
> using the full expressivity of the language and we'll get bug reports later. 
> Again, in insolation, I tend to agree that I wouldn't be happy seeing that 
> code. However, users write some very creative code and there's no technical 
> reason why we can't or shouldn't handle comma operators.
"Don't let the perfect be the enemy of the good."

My inclination is to simply explicitly state that comma operator is not 
recognized in the documentation.  It's already implicit by it's absence from 
the list of recognized operators.

Again, the worst that happens is that your macro isn't converted.

I'm open to being convinced that it's important, but you haven't convinced me 
yet `:)`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

To clarify my previous comments, I'm fine punting on the edge cases until user 
reports come in, so don't let them block this review if you feel strongly about 
not supporting them. But when user reports start coming in, at some point I 
might start asking to replace the custom parser with calling into the 
clangParse library through some invented utility interface so that we don't 
have to deal with a long tail of bug reports.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > I know this is code moved from elsewhere, but I suppose we never considered 
> > the odd edge case where a user does something like `"foo"[0]` as a really 
> > awful integer constant. :-D
> It's always possible to write crazy contorted code and have a check not 
> recognize it.  I don't think it's worthwhile to spend time trying to handle 
> torture cases unless we can find data that shows it's prevalent in real world 
> code.
> 
> If I was doing a code review and saw this:
> ```
> enum {
> FOO = "foo"[0]
> };
> ```
> I'd flag that in a code review as bogus, whereas if I saw:
> ```
> enum {
> FOO = 'f'
> };
> ```
> That would be acceptable, which is why character literals are accepted as an 
> integral literal initializer for an enum in this check.
>  I don't think it's worthwhile to spend time trying to handle torture cases 
> unless we can find data that shows it's prevalent in real world code.

I think I'm okay agreeing to that in this particular case, but this is more to 
point out that writing your own parser is a maintenance burden. Users will hit 
cases we've both forgotten about here, they'll file a bug, then someone has to 
deal with it. It's very hard to justify to users "we think you write silly 
code" because they often have creative ways in which their code is not actually 
so silly, especially when we support "most" valid expressions.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:185
+
+  if (Current->is(tok::TokenKind::question)) {
+if (!advance())

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > There is GNU extension in this space: https://godbolt.org/z/PrWY3T6hY
> Do you have a link for the extension?
https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > Comma operator?
> Remember that the use case here is identifying expressions that are 
> initializers for an enum.  If you were doing a code review and you saw this:
> ```
> enum {
> FOO = (2, 3)
> };
> ```
> Would you be OK with that?  I wouldn't.  Clang even warns about it: 
> https://godbolt.org/z/Y641cb8Y9
> 
> Therefore I deliberately left comma operator out of the grammar.
This is another case where I think you're predicting that users won't be using 
the full expressivity of the language and we'll get bug reports later. Again, 
in insolation, I tend to agree that I wouldn't be happy seeing that code. 
However, users write some very creative code and there's no technical reason 
why we can't or shouldn't handle comma operators.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood marked 2 inline comments as done.
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:32-33
+  // not a decimal floating-point literal
+  return std::none_of(
+  Begin, End, [](char C) { return C == '.' || std::toupper(C) == 'E'; });
+}

aaron.ballman wrote:
> Do we need to care about integer suffixes that make a non-integer type, like: 
> https://godbolt.org/z/vx3xbGa41
I don't think those will be parsed as literal tokens by the preprocessor, but 
I'll check.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

aaron.ballman wrote:
> I know this is code moved from elsewhere, but I suppose we never considered 
> the odd edge case where a user does something like `"foo"[0]` as a really 
> awful integer constant. :-D
It's always possible to write crazy contorted code and have a check not 
recognize it.  I don't think it's worthwhile to spend time trying to handle 
torture cases unless we can find data that shows it's prevalent in real world 
code.

If I was doing a code review and saw this:
```
enum {
FOO = "foo"[0]
};
```
I'd flag that in a code review as bogus, whereas if I saw:
```
enum {
FOO = 'f'
};
```
That would be acceptable, which is why character literals are accepted as an 
integral literal initializer for an enum in this check.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:185
+
+  if (Current->is(tok::TokenKind::question)) {
+if (!advance())

aaron.ballman wrote:
> There is GNU extension in this space: https://godbolt.org/z/PrWY3T6hY
Do you have a link for the extension?



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

aaron.ballman wrote:
> Comma operator?
Remember that the use case here is identifying expressions that are 
initializers for an enum.  If you were doing a code review and you saw this:
```
enum {
FOO = (2, 3)
};
```
Would you be OK with that?  I wouldn't.  Clang even warns about it: 
https://godbolt.org/z/Y641cb8Y9

Therefore I deliberately left comma operator out of the grammar.



Comment at: 
clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp:26
+#define EXPR11 (1 + (2))
+#define EXPR12 ((1) + (2 + 0) + (1 * 1) + (1 / 1) + (1 | 1 ) + (1 & 1) + (1 << 
1) + (1 >> 1) + (1 % 2) + (1 ^ 1))
+// CHECK-MESSAGES: :[[@LINE-12]]:1: warning: replace macro with enum 
[modernize-macro-to-enum]

aaron.ballman wrote:
> Other interesting tests I'd expect we could convert into an enum (at least 
> theoretically):
> ```
> #define A 12 + +1
> #define B 12 - -1
> #define C (1, 2, 3)
> #define D 100 ? : 8
> #define E 100 ? 100 : 8
> #define F 'f'
> #define G "foo"[0]
> #define H 1 && 2
> #define I 1 || 2
> ```
Most of these (except comma operator and string subscript, see my comments 
earlier) are covered in the unit test for the matcher.  I'll add tests for 
these:

```
12 + +1
12 - -1
100 ? : 8
```



Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:66
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},

aaron.ballman wrote:
> ```
> 12i
> .0
> ```
> 
`.0` is already covered by the case `1.23`.  I'm not home brewing tokenization, 
but using the Lexer to do that.

`12i` I need to investigate to find out what the Lexer does.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-29 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:25
+
+  // not a hexadecimal floating-point literal
+  if (Token.getLength() > 2 && Begin[0] == '0' && std::toupper(Begin[1]) == 
'X')

(Same suggestion elsewhere -- just double check that all the comments are full 
sentences with capitalization and punctuation.)



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:32-33
+  // not a decimal floating-point literal
+  return std::none_of(
+  Begin, End, [](char C) { return C == '.' || std::toupper(C) == 'E'; });
+}

Do we need to care about integer suffixes that make a non-integer type, like: 
https://godbolt.org/z/vx3xbGa41



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+  if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+  !isIntegralConstant(*Current)) {

I know this is code moved from elsewhere, but I suppose we never considered the 
odd edge case where a user does something like `"foo"[0]` as a really awful 
integer constant. :-D



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:185
+
+  if (Current->is(tok::TokenKind::question)) {
+if (!advance())

There is GNU extension in this space: https://godbolt.org/z/PrWY3T6hY



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+  return true;
+}
+

Comma operator?



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h:9
+
+#pragma once
+

We don't use #pragma once (not portable, not reliable).



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h:18-20
+// Parses an array of tokens and returns true if they conform to the rules of
+// C++ for whole expressions involving integral literals.  Follows the operator
+// precedence rules of C++.

Oh boy, I'm not super excited about having another parser to maintain...

It'd be nice if we had a ParserUtils.cpp/h file that made it easier to go from 
an arbitrary array of tokens to AST nodes + success/failure information on 
parsing the tokens. It's not strictly needed for what you're trying to 
accomplish here, but it would be a much more general interface and it would 
remove the support burden from adding another parser that's out of Clang's tree.



Comment at: 
clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp:26
+#define EXPR11 (1 + (2))
+#define EXPR12 ((1) + (2 + 0) + (1 * 1) + (1 / 1) + (1 | 1 ) + (1 & 1) + (1 << 
1) + (1 >> 1) + (1 % 2) + (1 ^ 1))
+// CHECK-MESSAGES: :[[@LINE-12]]:1: warning: replace macro with enum 
[modernize-macro-to-enum]

Other interesting tests I'd expect we could convert into an enum (at least 
theoretically):
```
#define A 12 + +1
#define B 12 - -1
#define C (1, 2, 3)
#define D 100 ? : 8
#define E 100 ? 100 : 8
#define F 'f'
#define G "foo"[0]
#define H 1 && 2
#define I 1 || 2
```



Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:66
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},

```
12i
.0
```




Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:99
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator

```
100 ? : 10
1, 2
```



Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:134
+{false, "1?1:"},
+{false, "1?:1"},
+

This one is valid


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-28 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 425700.
LegalizeAdulthood added a comment.

Update documentation


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/docs/clang-tidy/checks/modernize-macro-to-enum.rst
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,207 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// valid integral literals
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// invalid literals
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+
+// literals with unary operators
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// invalid unary operators
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// valid binary operators
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// valid ternary operator
+{true, "1?1:1"},
+// invalid ternary operator
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+{false, "1?:1"},
+
+// parenthesized expressions
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, "(1==1)"},
+{true, "(1!=1)"},
+{true, "(1&1)"},
+{true, "(1^1)"},
+{true, "(1|1)"},
+{true, "(1&&1)"},
+{true, "(1||1)"},
+{true, "(1?1:1)"},
+
+// more complicated expressions
+{true, "1+1+1"},
+{true, "1+1+1+1"},
+{true, "1+1+1+1+1"},
+{true, "1*1*1"},
+

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-27 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 425679.
LegalizeAdulthood added a comment.

Add file block comment


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,207 @@
+//=== ModernizeModuleTest.cpp - clang-tidy ===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// valid integral literals
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// invalid literals
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+
+// literals with unary operators
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// invalid unary operators
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// valid binary operators
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// valid ternary operator
+{true, "1?1:1"},
+// invalid ternary operator
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+{false, "1?:1"},
+
+// parenthesized expressions
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, "(1==1)"},
+{true, "(1!=1)"},
+{true, "(1&1)"},
+{true, "(1^1)"},
+{true, "(1|1)"},
+{true, "(1&&1)"},
+{true, "(1||1)"},
+{true, "(1?1:1)"},
+
+// more complicated expressions
+{true, "1+1+1"},
+{true, "1+1+1+1"},
+{true, "1+1+1+1+1"},
+{true, "1*1*1"},
+{true, "1*1*1*1"},
+{true, "1*1*1*1*1"},
+{true, "1<<1<<1"},
+

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-27 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp:1
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"

Needs a file header


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-27 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 425425.
LegalizeAdulthood added a comment.

Inline Variable


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,200 @@
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// valid integral literals
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// invalid literals
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+
+// literals with unary operators
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// invalid unary operators
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// valid binary operators
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// valid ternary operator
+{true, "1?1:1"},
+// invalid ternary operator
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+{false, "1?:1"},
+
+// parenthesized expressions
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, "(1==1)"},
+{true, "(1!=1)"},
+{true, "(1&1)"},
+{true, "(1^1)"},
+{true, "(1|1)"},
+{true, "(1&&1)"},
+{true, "(1||1)"},
+{true, "(1?1:1)"},
+
+// more complicated expressions
+{true, "1+1+1"},
+{true, "1+1+1+1"},
+{true, "1+1+1+1+1"},
+{true, "1*1*1"},
+{true, "1*1*1*1"},
+{true, "1*1*1*1*1"},
+{true, "1<<1<<1"},
+{true, "4U>>1>>1"},
+{true, "1<1<1"},
+{true, "1>1>1"},
+{true, "1<=1<=1"},
+{true, "1>=1>=1"},
+{true, "1==1==1"},
+{true, "1!=1!=1"},
+{true, "1&1&1"},
+{true, "1^1^1"},
+{true, "1|1|1"},
+{true, "1&&1&&1"},
+{true, "1||1||1"},
+};
+
+TEST_P(MatcherTest, MatchResult) {
+  EXPECT_TRUE(matchText(GetParam().Text) == GetParam().Matched);

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-27 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood updated this revision to Diff 425423.
LegalizeAdulthood added a comment.

- Add banner block comment for new files
- Extract Functions to eliminate duplication


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,200 @@
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// valid integral literals
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// invalid literals
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+
+// literals with unary operators
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// invalid unary operators
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// valid binary operators
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// valid ternary operator
+{true, "1?1:1"},
+// invalid ternary operator
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+{false, "1?:1"},
+
+// parenthesized expressions
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, "(1==1)"},
+{true, "(1!=1)"},
+{true, "(1&1)"},
+{true, "(1^1)"},
+{true, "(1|1)"},
+{true, "(1&&1)"},
+{true, "(1||1)"},
+{true, "(1?1:1)"},
+
+// more complicated expressions
+{true, "1+1+1"},
+{true, "1+1+1+1"},
+{true, "1+1+1+1+1"},
+{true, "1*1*1"},
+{true, "1*1*1*1"},
+{true, "1*1*1*1*1"},
+{true, "1<<1<<1"},
+{true, "4U>>1>>1"},
+{true, "1<1<1"},
+{true, "1>1>1"},
+{true, "1<=1<=1"},
+{true, "1>=1>=1"},
+{true, "1==1==1"},
+{true, "1!=1!=1"},
+{true, "1&1&1"},
+{true, "1^1^1"},
+{true, "1|1|1"},
+{true, "1&&1&&1"},
+{true, "1||1||1"},
+};
+
+TEST_P(MatcherTest, 

[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-26 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood added inline comments.



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:1
+#include "IntegralLiteralExpressionMatcher.h"
+

ditto



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:292
+
+bool IntegralLiteralExpressionMatcher::conditionalExpr() {
+  if (!logicalOrExpr())

Structure of these feels very similar, I'll see if I can squish out the 
duplication



Comment at: 
clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h:1
+#pragma once
+

Uh I guess this needs some sort of copyright notice?



Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:325
+  IntegralLiteralExpressionMatcher Matcher(MacroTokens);
+  bool matched = Matcher.match();
+  return matched;

inline variable made explicit for debugging


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124500/new/

https://reviews.llvm.org/D124500

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum

2022-04-26 Thread Richard via Phabricator via cfe-commits
LegalizeAdulthood created this revision.
LegalizeAdulthood added a reviewer: aaron.ballman.
Herald added subscribers: carlosgalvezp, xazax.hun, mgorny.
Herald added a project: All.
LegalizeAdulthood requested review of this revision.
Herald added a project: clang-tools-extra.
Herald added a subscriber: cfe-commits.

Add a recursive descent parser to match macro expansion tokens against
fully formed valid expressions of integral literals.  Partial expressions will
not be matched -- they can't be valid initializing expressions for an enum.

Fixes #55055


https://reviews.llvm.org/D124500

Files:
  clang-tools-extra/clang-tidy/modernize/CMakeLists.txt
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp
  clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.h
  clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp
  clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.cpp
  clang-tools-extra/unittests/clang-tidy/CMakeLists.txt
  clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp

Index: clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
===
--- /dev/null
+++ clang-tools-extra/unittests/clang-tidy/ModernizeModuleTest.cpp
@@ -0,0 +1,200 @@
+#include "ClangTidyTest.h"
+#include "modernize/IntegralLiteralExpressionMatcher.h"
+#include "clang/Lex/Lexer.h"
+#include "gtest/gtest.h"
+
+#include 
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace tidy {
+namespace test {
+
+static std::vector tokenify(const char *Text) {
+  LangOptions LangOpts;
+  std::vector Includes;
+  LangOptions::setLangDefaults(LangOpts, Language::CXX, llvm::Triple(),
+   Includes, LangStandard::lang_cxx20);
+  Lexer Lex(SourceLocation{}, LangOpts, Text, Text, Text + std::strlen(Text));
+  std::vector Tokens;
+  bool End = false;
+  while (!End) {
+Token Tok;
+End = Lex.LexFromRawLexer(Tok);
+Tokens.push_back(Tok);
+  }
+
+  return Tokens;
+}
+
+static bool matchText(const char *Text) {
+  std::vector Tokens{tokenify(Text)};
+  modernize::IntegralLiteralExpressionMatcher Matcher(Tokens);
+
+  return Matcher.match();
+}
+
+namespace {
+
+struct Param {
+  bool Matched;
+  const char *Text;
+};
+
+class MatcherTest : public ::testing::TestWithParam {};
+
+} // namespace
+
+static const Param Params[] = {
+// valid integral literals
+{true, "1"},
+{true, "0177"},
+{true, "0xdeadbeef"},
+{true, "0b1011"},
+{true, "'c'"},
+// invalid literals
+{false, "1.23"},
+{false, "0x1p3"},
+{false, R"("string")"},
+
+// literals with unary operators
+{true, "-1"},
+{true, "+1"},
+{true, "~1"},
+{true, "!1"},
+// invalid unary operators
+{false, "1-"},
+{false, "1+"},
+{false, "1~"},
+{false, "1!"},
+
+// valid binary operators
+{true, "1+1"},
+{true, "1-1"},
+{true, "1*1"},
+{true, "1/1"},
+{true, "1%2"},
+{true, "1<<1"},
+{true, "1>>1"},
+{true, "1<=>1"},
+{true, "1<1"},
+{true, "1>1"},
+{true, "1<=1"},
+{true, "1>=1"},
+{true, "1==1"},
+{true, "1!=1"},
+{true, "1&1"},
+{true, "1^1"},
+{true, "1|1"},
+{true, "1&&1"},
+{true, "1||1"},
+// invalid binary operator
+{false, "1+"},
+{false, "1-"},
+{false, "1*"},
+{false, "1/"},
+{false, "1%"},
+{false, "1<<"},
+{false, "1>>"},
+{false, "1<=>"},
+{false, "1<"},
+{false, "1>"},
+{false, "1<="},
+{false, "1>="},
+{false, "1=="},
+{false, "1!="},
+{false, "1&"},
+{false, "1^"},
+{false, "1|"},
+{false, "1&&"},
+{false, "1||"},
+
+// valid ternary operator
+{true, "1?1:1"},
+// invalid ternary operator
+{false, "?"},
+{false, "?1"},
+{false, "?:"},
+{false, "?:1"},
+{false, "?1:"},
+{false, "?1:1"},
+{false, "1?"},
+{false, "1?1"},
+{false, "1?:"},
+{false, "1?1:"},
+{false, "1?:1"},
+
+// parenthesized expressions
+{true, "(1)"},
+{true, "((+1))"},
+{true, "((+(1)))"},
+{true, "(-1)"},
+{true, "-(1)"},
+{true, "(+1)"},
+{true, "((+1))"},
+{true, "+(1)"},
+{true, "(~1)"},
+{true, "~(1)"},
+{true, "(!1)"},
+{true, "!(1)"},
+{true, "(1+1)"},
+{true, "(1-1)"},
+{true, "(1*1)"},
+{true, "(1/1)"},
+{true, "(1%2)"},
+{true, "(1<<1)"},
+{true, "(1>>1)"},
+{true, "(1<=>1)"},
+{true, "(1<1)"},
+{true, "(1>1)"},
+{true, "(1<=1)"},
+{true, "(1>=1)"},
+{true, "(1==1)"},
+{true, "(1!=1)"},
+{true, "(1&1)"},
+{true, "(1^1)"},
+{true, "(1|1)"},
+{true, "(1&&1)"},
+{true, "(1||1)"},
+{true, "(1?1:1)"},
+
+// more complicated expressions
+{true, "1+1+1"},
+{true, "1+1+1+1"},
+{true, "1+1+1+1+1"},
+{true, "1*1*1"},
+{true, "1*1*1*1"},
+{true, "1*1*1*1*1"},
+{true, "1<<1<<1"},
+