[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-09-13 Thread Owen Pan via Phabricator via cfe-commits
owenpan added a comment.
Herald added reviewers: rymiel, HazardyKnusperkeks, owenpan, MyDeveloperDay.

It seems that this patch caused a regression. See 
https://github.com/llvm/llvm-project/issues/62768.


Herald added a comment.

NOTE: Clang-Format Team Automated Review Comment

Your review contains a change to clang/include/clang/Format/Format.h but does 
not contain an update to ClangFormatStyleOptions.rst

ClangFormatStyleOptions.rst is generated via 
clang/docs/tools/dump_format_style.py,  please run this to regenerate the .rst

You can validate that the rst is valid by running.

  ./docs/tools/dump_format_style.py
  mkdir -p html
  /usr/bin/sphinx-build -n ./docs ./html


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-03-27 Thread Björn Schäpers via Phabricator via cfe-commits
HazardyKnusperkeks added inline comments.



Comment at: clang/include/clang/Format/Format.h:4343
+   WhitespaceSensitiveMacros == R.WhitespaceSensitiveMacros &&
+   Macros == R.Macros;
   }

I put a lot of effort into bringing the stuff sorted. And now a change which 
was completely transparent to me, because of the missing clang-format tag.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


Re: [PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-03-23 Thread Manuel Klimek via cfe-commits
I am so sorry, thanks for sending out the patch already and fixing the
layout!

On Thu, Mar 23, 2023 at 10:08 AM MyDeveloperDay via Phabricator <
revi...@reviews.llvm.org> wrote:

> MyDeveloperDay added a comment.
>
> I know this is like "telling my grandmother to suck eggs" but  @klimek the
> change here to Format.h means you need to regenerate
> ClangFormatStyleOptions.rst via docs/tools/dump_format_style.py
>
> Impacting the next change to generate - D125171: Add a new clang-format
> option AlwaysBreakBeforeFunctionParameters <
> https://reviews.llvm.org/D125171>
>
>
> Repository:
>   rG LLVM Github Monorepo
>
> CHANGES SINCE LAST ACTION
>   https://reviews.llvm.org/D144170/new/
>
> https://reviews.llvm.org/D144170
>
>
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-03-23 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay added a comment.

I know this is like "telling my grandmother to suck eggs" but  @klimek the 
change here to Format.h means you need to regenerate 
ClangFormatStyleOptions.rst via docs/tools/dump_format_style.py

Impacting the next change to generate - D125171: Add a new clang-format option 
AlwaysBreakBeforeFunctionParameters 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG01402831aaae: [clang-format] Add simple macro replacements 
in formatting. (authored by klimek).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander &Macros, llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander &Macros, llvm::StringRef Name,
- const std::vector &Args = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector &Args) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> &Args) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector &Args) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap &getUnexpanded() const { return Unexpanded; }
+
+  const TokenList &getTokens() const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  &Args) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 U

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > Strictly I don't think this comment is true, we hit this path when it's 
> > > **only** an object-like macro, used before parens.
> > > 
> > > For this reason you might *not* want the dbgs() here but rather in the 
> > > bottom `else`
> > I'm pretty sure we hit this when it's overloaded for both, but we call it 
> > with the wrong arity.
> > E.g. A=x, A(x, y, z)=x y z
> > 
> > A(1, 2)
> > - Macros.objectLike(A) -> true
> > - Args -> true
> > - !Macros.hasArity(A, 2) -> true
> I agree that (the case described in the comment) => (we get to this point in 
> the code).
> 
> I'm disputing that (we get to this point in the code) => (the case described 
> in the comment).
> 
> i.e. given `A=x`, code=`A(1, 2)`, we also get: `objectLike(A) -> true`, `Args 
> -> true`, `!Macros.hasArity(A, 2) -> true`, but this time the comment is 
> lying about the state.
Ah, your comment made me think it was the former; I think the dbgs() is still 
what I want, but with an "either or" dbgs().

Adapted the comment in the code and the dbgs() to match.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500191.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander &Macros, llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander &Macros, llvm::StringRef Name,
- const std::vector &Args = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector &Args) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> &Args) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector &Args) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap &getUnexpanded() const { return Unexpanded; }
+
+  const TokenList &getTokens() const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  &Args) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
   I->end());
@@ -57,16 +75,6 @@
 retur

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Still LG




Comment at: clang/lib/Format/MacroExpander.cpp:172
   assert(defined(ID->TokenText));
+  assert(
+  (!OptionalArgs && ObjectLike.find(ID->TokenText) != ObjectLike.end()) ||

klimek wrote:
> sammccall wrote:
> > this assertion failure isn't going to be as useful as it could be!
> > 
> > maybe rather
> > ```
> > if (OptionalArgs)
> >   assert(...);
> > else
> >   assert(...);
> > ```
> > 
> > Also I think we're now ensuring to only call this if arity matches, so I'd 
> > make this a precondition and replace the first assert with hasArity to make 
> > it easier to reason about
> Hopefully done, the comment took me a while to parse :)
Sorry about that, LG though



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

klimek wrote:
> sammccall wrote:
> > Strictly I don't think this comment is true, we hit this path when it's 
> > **only** an object-like macro, used before parens.
> > 
> > For this reason you might *not* want the dbgs() here but rather in the 
> > bottom `else`
> I'm pretty sure we hit this when it's overloaded for both, but we call it 
> with the wrong arity.
> E.g. A=x, A(x, y, z)=x y z
> 
> A(1, 2)
> - Macros.objectLike(A) -> true
> - Args -> true
> - !Macros.hasArity(A, 2) -> true
I agree that (the case described in the comment) => (we get to this point in 
the code).

I'm disputing that (we get to this point in the code) => (the case described in 
the comment).

i.e. given `A=x`, code=`A(1, 2)`, we also get: `objectLike(A) -> true`, `Args 
-> true`, `!Macros.hasArity(A, 2) -> true`, but this time the comment is lying 
about the state.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:172
   assert(defined(ID->TokenText));
+  assert(
+  (!OptionalArgs && ObjectLike.find(ID->TokenText) != ObjectLike.end()) ||

sammccall wrote:
> this assertion failure isn't going to be as useful as it could be!
> 
> maybe rather
> ```
> if (OptionalArgs)
>   assert(...);
> else
>   assert(...);
> ```
> 
> Also I think we're now ensuring to only call this if arity matches, so I'd 
> make this a precondition and replace the first assert with hasArity to make 
> it easier to reason about
Hopefully done, the comment took me a while to parse :)



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

sammccall wrote:
> Strictly I don't think this comment is true, we hit this path when it's 
> **only** an object-like macro, used before parens.
> 
> For this reason you might *not* want the dbgs() here but rather in the bottom 
> `else`
I'm pretty sure we hit this when it's overloaded for both, but we call it with 
the wrong arity.
E.g. A=x, A(x, y, z)=x y z

A(1, 2)
- Macros.objectLike(A) -> true
- Args -> true
- !Macros.hasArity(A, 2) -> true


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500181.
klimek marked 5 inline comments as done.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander &Macros, llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander &Macros, llvm::StringRef Name,
- const std::vector &Args = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector &Args) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> &Args) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector &Args) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap &getUnexpanded() const { return Unexpanded; }
+
+  const TokenList &getTokens() const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  &Args) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
  

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Sam McCall via Phabricator via cfe-commits
sammccall accepted this revision.
sammccall added a comment.
This revision is now accepted and ready to land.

Great!

The overloading impl is a little surprising to me.
I was assuming that object would always win over function (or that it would be 
disallowed to combine the two).
I think that is simpler to implement, as you know in advance what type of 
expansion you're attempting.
But what you have isn't that much more complicated, and the extra flexibility 
may prove useful.

One weird side-effect: if you define an object-like macro FOO and then write 
`FOO(` and never close the paren, then we die inside macro-arg-parsing rather 
than hitting the normal EOF-while-formatting behavior. I left a comment on 
parseMacroCall() about how to fix this but I'm not sure whether it matters at a 
high level.




Comment at: clang/lib/Format/MacroExpander.cpp:172
   assert(defined(ID->TokenText));
+  assert(
+  (!OptionalArgs && ObjectLike.find(ID->TokenText) != ObjectLike.end()) ||

this assertion failure isn't going to be as useful as it could be!

maybe rather
```
if (OptionalArgs)
  assert(...);
else
  assert(...);
```

Also I think we're now ensuring to only call this if arity matches, so I'd make 
this a precondition and replace the first assert with hasArity to make it 
easier to reason about



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4580
+  LLVM_DEBUG({
+llvm::dbgs() << "Call: " << ID->TokenText << "(";
+if (Args) {

nit: call => Macro call in dbgs()?



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

Strictly I don't think this comment is true, we hit this path when it's 
**only** an object-like macro, used before parens.

For this reason you might *not* want the dbgs() here but rather in the bottom 
`else`



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4610
+Unexpanded[ID] = std::move(UnexpandedLine);
+SmallVector New = Macros.expand(ID, Args);
+if (!New.empty())

(nit: std::move(args)?)



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4610
+Unexpanded[ID] = std::move(UnexpandedLine);
+SmallVector New = Macros.expand(ID, Args);
+if (!New.empty())

sammccall wrote:
> (nit: std::move(args)?)
nit: maybe "expansion" is more descriptive than "new"?



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4620
+});
+  } else {
+Tokens->setPosition(Position);

I think you may want a comment and/or dbgs() here explaining this case: `Didn't 
expand macro FOO because it was used {with 3|without} args, which doesn't match 
any definition`



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4696
+  } while (!eof());
+  return {};
+}

I believe this has changed meaning along with the return type: previously it 
returned an empty vector, now it returns `nullopt`. Not sure which you intend 
here.

I think the ideal thing here would be to **rewind the token stream back to 
before the `(`**, and return `nullopt`. We'll end up performing an object-like 
substitution if possible, and non-macro parsing will continue.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/unittests/Format/FormatTest.cpp:22584
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > If this is assign_or_return, the treatment of "c" is fairly strange. Also 
> > > you are mostly calling this with only two args. Maybe drop C and use a 
> > > different macro for the 3-arg case?
> > ASSIGN_OR_RETURN allows 3 args, though; this is basically the thing I'd 
> > propose to configure for ASSIGN_OR_RETURN in general; is there anything 
> > specific you don't like about this?
> OK, this is a bit sticky.
>  - `ASSIGN_OR_RETURN` is almost always called with two args
>  - some versions of `ASSIGN_OR_RETURN` support an optional third arg (there's 
> no canonical public version)
>  - these emulate overloading using variadic macro tricks
>  - this patch doesn't claim to support either variadic macros or overloading
> 
> So the principled options seem to be:
>  - macros have fixed arity: clang-format can support the two-arg version of 
> `ASSIGN_OR_RETURN` but not the three-arg version. (This is what I was 
> assuming)
>  - macros have variable arity: the public docs need to describe how passing 
> too many/too few args is treated, because this isn't obvious and it sounds 
> like we want people to rely on it. This feels like stretching "principled" to 
> me - we're relying on the expansion still parsing correctly when args are 
> missing, which basically seems like coincidence.
>  - macros support overloading: instead of looking up macros by name, the key 
> is `(string Name, optional Arity)`
Chose option 3, as this is pretty straight-forward and I believe the best user 
experience.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500121.
klimek marked 6 inline comments as done.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander &Macros, llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander &Macros, llvm::StringRef Name,
- const std::vector &Args = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector &Args) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> &Args) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector &Args) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap &getUnexpanded() const { return Unexpanded; }
+
+  const TokenList &getTokens() const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  &Args) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
  

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

Thanks, this all looks good, at least as far as I understand it! The 
two-passes-over-the-whole-file approach is easier for me to follow, too.

Mostly just comment nits, but I have one annoying question left about using 
macros with the wrong arity - I thought this was just an implementation 
limitation, but it sounds like you want it to be a feature, so I'd like to nail 
it down a little...




Comment at: clang/include/clang/Format/Format.h:2750
+  ///
+  /// Code will be parsed with macros expanded, and formatting will try to best
+  /// match the structure of the expanded call.

I find "try to match the structure of the expanded call" a bit hard to follow 
from a user perspective.

Maybe "Code will be parsed with macros expanded, in order to determine how to 
interpret and format the macro arguments"?



Comment at: clang/include/clang/Format/Format.h:2753
+  ///
+  /// For example, with the macro "A(x)=x", the code
+  /// \code

This is a good example. I think you should spell out both sides, and probably 
start with the simpler case (no macro definition) to motivate the problem.

```
For example, the code:
  A(a*b);
will usually be interpreted as a call to a function A, and the multiplication 
expression will be formatted as `a * b`.

If we specify the macro definition:
  Macros:
- A(x)=x

the code will now be parsed as a declaration of the variable b of type a*,
and formatted as `a* b` (depending on pointer-binding rules).
```



Comment at: clang/include/clang/Format/Format.h:2757
+  /// \endcode
+  /// will be formatted as a declaration of the variable \c b of type \c A*
+  /// (depending on pointer-binding rules)

type a*, not A*



Comment at: clang/include/clang/Format/Format.h:2762
+  /// \endcode
+  /// instead of as multiplication.
+  std::vector Macros;

I think the restrictions should be spelled out here: object-like and 
function-like macros are both supported, macro args should be used exactly 
once, macros should not reference other macros.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:225
+  }
+  Callback.finishRun();
+}

klimek wrote:
> sammccall wrote:
> > I'm sure you got this right, but it's hard for me to evaluate this code 
> > because I don't know what `UnwrappedLineConsumer::finishRun()` is for - 
> > it's not documented and the implementation is mysterious.
> > 
> > You might consider adding a comment to that interface/method, it's not 
> > really related to this change though.
> Done. Not sure I've put too much info about how it's going to be used into 
> the interface, where it doesn't really belong.
Thanks, comment seems just right to me.



Comment at: clang/lib/Format/UnwrappedLineParser.h:88
+/// - for different combinations of #if blocks
+/// - for code where macros are expanded and the code with the original
+///   macro calls.

this is a bit of a garden path sentence: `for code where ((macros are expanded) 
and (the code with (the original macro calls) ... ERR_MISSING_PREDICATE))`

maybe `when macros are involved, for the expanded code and the as-written code`?



Comment at: clang/lib/Format/UnwrappedLineParser.h:260
+  // was expanded from a macro call.
+  bool containsExpansion(const UnwrappedLine &Line);
+

nit: should these added functions be const?

(maybe this class doesn't bother with const - computePPHash is const though)



Comment at: clang/unittests/Format/FormatTest.cpp:22584
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+

klimek wrote:
> sammccall wrote:
> > If this is assign_or_return, the treatment of "c" is fairly strange. Also 
> > you are mostly calling this with only two args. Maybe drop C and use a 
> > different macro for the 3-arg case?
> ASSIGN_OR_RETURN allows 3 args, though; this is basically the thing I'd 
> propose to configure for ASSIGN_OR_RETURN in general; is there anything 
> specific you don't like about this?
OK, this is a bit sticky.
 - `ASSIGN_OR_RETURN` is almost always called with two args
 - some versions of `ASSIGN_OR_RETURN` support an optional third arg (there's 
no canonical public version)
 - these emulate overloading using variadic macro tricks
 - this patch doesn't claim to support either variadic macros or overloading

So the principled options seem to be:
 - macros have fixed arity: clang-format can support the two-arg version of 
`ASSIGN_OR_RETURN` but not the three-arg version. (This is what I was assuming)
 - macros have variable arity: the public docs need to describe how passing too 
many/too few args is treated, because this isn't obvious and it sounds like we 
want people to rely on it. This feels like stretching "principled" to me - 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 499777.
klimek added a comment.

Add comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -22568,6 +22568,230 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+  Style.Macros.push_back("MOCK_METHOD(r, n, a, s)=r n a s");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);
+  verifyIncompleteFormat(R"(ID3({, ID(a *b),
+  ;
+  });
+)",
+ Style);
+
+  verifyFormat("ID(CALL(CALL(return a * b;)));", Style);
+
+  verifyFormat("ASSIGN_OR_RETURN(MySomewhatLongType *variable,\n"
+   " MySomewhatLongFunction(SomethingElse()));\n",
+   Style);
+
+  verifyFormat(R"(
+#define MACRO(a, b) ID(a + b)
+)",
+   Style);
+  EXPECT_EQ(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(
+namespace foo {
+int a;
+}
+) // namespace k
+)",
+format(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(namespace foo { int a; })  // namespace k
+)",
+   Style));
+  verifyFormat(R"(ID(
+//
+({ ; }))
+)",
+   Style);
+
+  Style.ColumnLimit = 35;
+  // FIXME: Arbitrary formatting of macros where the end of the logical
+  // line is in the middle of a macro call are not working yet.
+  verifyFormat(R"(ID(
+void f();
+void)
+ID(g) ID(()) ID(
+;
+void g();)
+)",
+   Style);
+
+  Style.ColumnLimit = 10;
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+
+  EXPECT_EQ(R"(
+ID(CALL(CALL(
+a *b)));
+)",
+format(R"(
+ID(CALL(CALL(a * b)));
+)",
+   Style));
+
+  // FIXME: If we want to support unbalanced braces or parens from macro
+  // expansions we need to re-think how we propagate errors in
+  // TokenAnnotator::parseLine; for investigation, switching the inner loop of
+  // TokenAnnotator::parseLine to return LT_Other instead of LT_Invalid in case
+  // of !consumeToken() changes the formatting of the test below and makes it
+  // believe it has a fully correct formatting.
+  EXPECT_EQ(R"(
+ID3(
+{
+CLASS
+a *b;
+};
+},
+ID(x *y);
+,
+STMT
+STMT
+STMT)
+void f();
+)",
+format(R"(
+ID3({CLASS a*b; };}, ID(x*y);, STMT STMT STMT)
+void f();
+)",
+   Style));
+
+  verifyFormat("ID(a(\n"
+   "#ifdef A\n"
+   "b, c\n"
+   "#else\n"
+   "d(e)\n"
+   "#endif\n"
+   "))",
+   Style);
+  Style.ColumnLimit = 80;
+  verifyFormat(

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/ContinuationIndenter.cpp:744
+  // Align following lines within parenthesis / brackets if configured.
+  // For a line of macro parents, the commas that follow the opening 
parenthesis
+  // in the line come after the opening parenthesis' children - we want to 
align

sammccall wrote:
> This is a bit hard to follow without context, I had to look up 
> MacroCallReconstructor::takeResult() to recall the structure it's referring 
> to, and there's no obvious breadcrumb to point you there. And if you don't 
> follow the details, it's hard to understand what the relevance is.
> 
> Maybe something like: this doesn't apply to macro expansion lines, which are 
> `MACRO( , , )` with args as children of `(` and `,`. It does not make sense 
> to align the commas with the opening paren.
> 
> (I think describing here how the alignment ends up working in that case might 
> just add confusion: it's not handled by this code, right?)
Excellent wording, applied. And yes, alignment in the macro case is not handled 
here.



Comment at: clang/lib/Format/TokenAnnotator.h:68
 FormatToken *Current = First;
+addChildren(Line.Tokens.front(), Current);
 for (const UnwrappedLineNode &Node : llvm::drop_begin(Line.Tokens)) {

sammccall wrote:
> this seems like a separate bugfix: previously if the first node in 
> Line.Tokens had children, we wouldn't be adding them.
> 
> Is this true? Was there some reason the first node could never have children 
> before?
I think the main problem is that it's impossible to write C++ code that has a 
child in the first node based on how this worked. That is, previously we only 
added children for blocks within one line, that is, within a function or 
lambda. Those need intro tokens. So I really couldn't make this trigger, or I 
would have fixed it in a separate patch.



Comment at: clang/lib/Format/TokenAnnotator.h:76
+  addChildren(Node, Current);
+  // FIXME: if we add children, previous will point to the token before
+  // the children; changing this requires significant changes across

sammccall wrote:
> does "add children" here refer to the procedure in addChildren(), or to the 
> token insertion idea introduced in the previous patch?
> 
> (i.e. is this just describing that previous/next skip over trees of children 
> rather than including them in the chain, or is there something new going on?)
It's about adding children in general.
When fixing this I realized this is in principle wrong, but then noticed that 
we have code that depends on this behavior.



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:926
+} else {
+  Tok->Finalized = true;
+}

sammccall wrote:
> why do we no longer finalize the child tree?
We do - markFinalized() is called for the child tree again. Previously we 
marked multiple times, which doesn't work with the new algorithm, as it would 
move ExpandedArg -> UnexpandedArg -> Finalized in one go.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:225
+  }
+  Callback.finishRun();
+}

sammccall wrote:
> I'm sure you got this right, but it's hard for me to evaluate this code 
> because I don't know what `UnwrappedLineConsumer::finishRun()` is for - it's 
> not documented and the implementation is mysterious.
> 
> You might consider adding a comment to that interface/method, it's not really 
> related to this change though.
Done. Not sure I've put too much info about how it's going to be used into the 
interface, where it doesn't really belong.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4564
+  // We parse the macro call into a new line.
+  auto Args = parseMacroCall();
+  InExpansion = OldInExpansion;

sammccall wrote:
> it looks like you go into parsing the macro call including parens and 
> arguments regardless of whether the macro is object-like or function-like.
> 
> So given
> ```
> #define DEBUG puts
> DEBUG("foo");
> ```
> you're going to expand to `puts;` instead of `puts("foo");`
Excellent find. Completely invalidated the way I was doing formatting of 
expanded lines, but fixed the problem with the line indices needing resetting 
\o/

Now we're formatting all lines in the first round with the macro calls replaced 
with expanded lines, and then again format everything including the macro 
calls. That is more in line with clang-format's assumptions anyway.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4660
+}
+  } while (!eof());
+  return {};

sammccall wrote:
> hitting eof is an error condition here, so it seems more natural to handle it 
> as a case in the switch as a kind of early-return, and maybe it's worth 
> adding an LLVM_DEBUG() for that case. If anything it'd be nice to have 
> r

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 499771.
klimek marked 9 inline comments as done.
klimek added a comment.

Address reviewer comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -22568,6 +22568,230 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+  Style.Macros.push_back("MOCK_METHOD(r, n, a, s)=r n a s");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);
+  verifyIncompleteFormat(R"(ID3({, ID(a *b),
+  ;
+  });
+)",
+ Style);
+
+  verifyFormat("ID(CALL(CALL(return a * b;)));", Style);
+
+  verifyFormat("ASSIGN_OR_RETURN(MySomewhatLongType *variable,\n"
+   " MySomewhatLongFunction(SomethingElse()));\n",
+   Style);
+
+  verifyFormat(R"(
+#define MACRO(a, b) ID(a + b)
+)",
+   Style);
+  EXPECT_EQ(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(
+namespace foo {
+int a;
+}
+) // namespace k
+)",
+format(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(namespace foo { int a; })  // namespace k
+)",
+   Style));
+  verifyFormat(R"(ID(
+//
+({ ; }))
+)",
+   Style);
+
+  Style.ColumnLimit = 35;
+  // FIXME: Arbitrary formatting of macros where the end of the logical
+  // line is in the middle of a macro call are not working yet.
+  verifyFormat(R"(ID(
+void f();
+void)
+ID(g) ID(()) ID(
+;
+void g();)
+)",
+   Style);
+
+  Style.ColumnLimit = 10;
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+
+  EXPECT_EQ(R"(
+ID(CALL(CALL(
+a *b)));
+)",
+format(R"(
+ID(CALL(CALL(a * b)));
+)",
+   Style));
+
+  // FIXME: If we want to support unbalanced braces or parens from macro
+  // expansions we need to re-think how we propagate errors in
+  // TokenAnnotator::parseLine; for investigation, switching the inner loop of
+  // TokenAnnotator::parseLine to return LT_Other instead of LT_Invalid in case
+  // of !consumeToken() changes the formatting of the test below and makes it
+  // believe it has a fully correct formatting.
+  EXPECT_EQ(R"(
+ID3(
+{
+CLASS
+a *b;
+};
+},
+ID(x *y);
+,
+STMT
+STMT
+STMT)
+void f();
+)",
+format(R"(
+ID3({CLASS a*b; };}, ID(x*y);, STMT STMT STMT)
+void f();
+)",
+   Style));
+
+  verifyFormat("ID(a(\n"
+   "#ifdef A\n"
+   "b, c\n"
+   "#else\n"
+   "d(e)\n"
+   "#endif\n"
+   "))",
+   

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-16 Thread Sam McCall via Phabricator via cfe-commits
sammccall added a comment.

It's happening!




Comment at: clang/lib/Format/ContinuationIndenter.cpp:743
 
+  // Align following lines within parenthesis / brackets if configured.
+  // For a line of macro parents, the commas that follow the opening 
parenthesis

nit: parentheses



Comment at: clang/lib/Format/ContinuationIndenter.cpp:744
+  // Align following lines within parenthesis / brackets if configured.
+  // For a line of macro parents, the commas that follow the opening 
parenthesis
+  // in the line come after the opening parenthesis' children - we want to 
align

This is a bit hard to follow without context, I had to look up 
MacroCallReconstructor::takeResult() to recall the structure it's referring to, 
and there's no obvious breadcrumb to point you there. And if you don't follow 
the details, it's hard to understand what the relevance is.

Maybe something like: this doesn't apply to macro expansion lines, which are 
`MACRO( , , )` with args as children of `(` and `,`. It does not make sense to 
align the commas with the opening paren.

(I think describing here how the alignment ends up working in that case might 
just add confusion: it's not handled by this code, right?)



Comment at: clang/lib/Format/TokenAnnotator.h:68
 FormatToken *Current = First;
+addChildren(Line.Tokens.front(), Current);
 for (const UnwrappedLineNode &Node : llvm::drop_begin(Line.Tokens)) {

this seems like a separate bugfix: previously if the first node in Line.Tokens 
had children, we wouldn't be adding them.

Is this true? Was there some reason the first node could never have children 
before?



Comment at: clang/lib/Format/TokenAnnotator.h:76
+  addChildren(Node, Current);
+  // FIXME: if we add children, previous will point to the token before
+  // the children; changing this requires significant changes across

does "add children" here refer to the procedure in addChildren(), or to the 
token insertion idea introduced in the previous patch?

(i.e. is this just describing that previous/next skip over trees of children 
rather than including them in the chain, or is there something new going on?)



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:924
+  Tok->MacroCtx->Role = MR_UnexpandedArg;
+  Tok->SpacesRequiredBefore = 0;
+} else {

comment as to why?



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:926
+} else {
+  Tok->Finalized = true;
+}

why do we no longer finalize the child tree?



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:225
+  }
+  Callback.finishRun();
+}

I'm sure you got this right, but it's hard for me to evaluate this code because 
I don't know what `UnwrappedLineConsumer::finishRun()` is for - it's not 
documented and the implementation is mysterious.

You might consider adding a comment to that interface/method, it's not really 
related to this change though.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4198
   LLVM_DEBUG({
-if (CurrentLines == &Lines)
+if (CurrentLines == &Lines) {
+  llvm::dbgs() << "Adding unwrapped line:\n";

nit: you check this condition a bunch of times, and this patch adds more.

giving it a name like `!parsingPPDirective()` would be less cryptic



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4235
+// FIXME: We format the expanded lines in an extra step that does not give
+// the formatter all unwrapped lines, thus the indexes are invalid; to 
allow
+// all features during expanded line formatting, recalcuate the indexes

This is a bit vague as to what the "indexes" refer to and what the implications 
are (between comment & code "indexes" is mentioned 3 times but it's not clear 
without digging what they are).

AFAICS in practice this only affects indentation of braces in whitesmiths 
style, which may be worth mentioning. (It'll get stale, but at the moment the 
comment sounds like arbitrary things are likely to be broken)



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4564
+  // We parse the macro call into a new line.
+  auto Args = parseMacroCall();
+  InExpansion = OldInExpansion;

it looks like you go into parsing the macro call including parens and arguments 
regardless of whether the macro is object-like or function-like.

So given
```
#define DEBUG puts
DEBUG("foo");
```
you're going to expand to `puts;` instead of `puts("foo");`



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4573
+  Line = std::move(PreCall);
+  SmallVector New = Macros.expand(ID, Args);
+  if (!New.empty())

Calling a macro with the wrong arity might be worth flagging in 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-16 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added a project: All.
klimek requested review of this revision.
Herald added a project: clang.

Add configuration to specify macros.
Macros will be expanded, and the code will be parsed and annotated
in the expanded state. In a second step, the formatting decisions
in the annotated expanded code will be reconstructed onto the
original unexpanded macro call.

Eventually, this will allow to remove special-case code for
various macro options we accumulated over the years in favor of
one principled mechanism.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto &Line : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -66,7 +66,8 @@
 
   void _verifyFormat(const char *File, int Line, llvm::StringRef Expected,
  llvm::StringRef Code,
- const FormatStyle &Style = getLLVMStyle()) {
+ const FormatStyle &Style = getLLVMStyle(),
+ bool MessUp = true) {
 ScopedTrace t(File, Line, ::testing::Message() << Code.str());
 EXPECT_EQ(Expected.str(), format(Expected, Style))
 << "Expected code is not stable";
@@ -76,20 +77,24 @@
   // needs to be checked for Objective-C++ as well.
   FormatStyle ObjCStyle = Style;
   ObjCStyle.Language = FormatStyle::LK_ObjC;
-  EXPECT_EQ(Expected.str(), format(test::messUp(Code), ObjCStyle));
+  EXPECT_EQ(Expected.str(),
+format(MessUp ? test::messUp(Code) : Code, ObjCStyle));
 }
   }
 
   void _verifyFormat(const char *File, int Line, llvm::StringRef Code,
- const FormatStyle &Style = getLLVMStyle()) {
-_verifyFormat(File, Line, Code, test::messUp(Code), Style);
+ const FormatStyle &Style = getLLVMStyle(),
+ bool MessUp = true) {
+_verifyFormat(File, Line, Code, MessUp ? test::messUp(Code) : Code, Style,
+  MessUp);
   }
 
   void _verifyIncompleteFormat(const char *File, int Line, llvm::StringRef Code,
-   const FormatStyle &Style = getLLVMStyle()) {
+   const FormatStyle &Style = getLLVMStyle(),
+   bool MessUp = true) {
 ScopedTrace t(File, Line, ::testing::Message() << Code.str());
-EXPECT_EQ(Code.str(),
-  format(test::messUp(Code), Style, SC_ExpectIncomplete));
+EXPECT_EQ(Code.str(), format(MessUp ? test::messUp(Code) : Code, Style,
+ SC_ExpectIncomplete));
   }
 
   void _verifyIndependentOfContext(const char *File, int Line,
@@ -22568,6 +22573,189 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);