[PATCH] D146310: [clang-format] Fix dropped 'else' in 398cddf6acec

2023-04-21 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Submitted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146310/new/

https://reviews.llvm.org/D146310

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146310: [clang-format] Fix dropped 'else' in 398cddf6acec

2023-04-21 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9e9e096ae95b: [clang-format] Fix dropped else. 
(authored by klimek).
Herald added reviewers: rymiel, HazardyKnusperkeks, owenpan, MyDeveloperDay.
Herald added a comment.

NOTE: Clang-Format Team Automated Review Comment

It looks like your clang-format review does not contain any unit tests, please 
try to ensure all code changes have a unit test (unless this is an `NFC` or 
refactoring, adding documentation etc..)

Add your unit tests in `clang/unittests/Format` and you can build with `ninja 
FormatTests`.  We recommend using the `verifyFormat(xxx)` format of unit tests 
rather than `EXPECT_EQ` as this will ensure you change is tolerant to random 
whitespace changes (see FormatTest.cpp as an example)

For situations where your change is altering the TokenAnnotator.cpp which can 
happen if you are trying to improve the annotation phase to ensure we are 
correctly identifying the type of a token, please add a token annotator test in 
`TokenAnnotatorTest.cpp`


Changed prior to commit:
  https://reviews.llvm.org/D146310?vs=506133=515683#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146310/new/

https://reviews.llvm.org/D146310

Files:
  clang/lib/Format/UnwrappedLineFormatter.cpp


Index: clang/lib/Format/UnwrappedLineFormatter.cpp
===
--- clang/lib/Format/UnwrappedLineFormatter.cpp
+++ clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -511,10 +511,9 @@
 ShouldMerge = !Style.BraceWrapping.AfterClass ||
   (NextLine.First->is(tok::r_brace) &&
!Style.BraceWrapping.SplitEmptyRecord);
-  }
-  if (TheLine->InPPDirective ||
-  !TheLine->First->isOneOf(tok::kw_class, tok::kw_enum,
-   tok::kw_struct)) {
+  } else if (TheLine->InPPDirective ||
+ !TheLine->First->isOneOf(tok::kw_class, tok::kw_enum,
+  tok::kw_struct)) {
 // Try to merge a block with left brace unwrapped that wasn't yet
 // covered.
 ShouldMerge = !Style.BraceWrapping.AfterFunction ||


Index: clang/lib/Format/UnwrappedLineFormatter.cpp
===
--- clang/lib/Format/UnwrappedLineFormatter.cpp
+++ clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -511,10 +511,9 @@
 ShouldMerge = !Style.BraceWrapping.AfterClass ||
   (NextLine.First->is(tok::r_brace) &&
!Style.BraceWrapping.SplitEmptyRecord);
-  }
-  if (TheLine->InPPDirective ||
-  !TheLine->First->isOneOf(tok::kw_class, tok::kw_enum,
-   tok::kw_struct)) {
+  } else if (TheLine->InPPDirective ||
+ !TheLine->First->isOneOf(tok::kw_class, tok::kw_enum,
+  tok::kw_struct)) {
 // Try to merge a block with left brace unwrapped that wasn't yet
 // covered.
 ShouldMerge = !Style.BraceWrapping.AfterFunction ||
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147176: [clang-format] NFC ensure Style operator== remains sorted for ease of editing

2023-03-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

And thanks a lot for cleaning up, really appreciate it!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147176/new/

https://reviews.llvm.org/D147176

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147176: [clang-format] NFC ensure Style operator== remains sorted for ease of editing

2023-03-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

I apologize for not tagging appropriately - is that process documented 
somewhere?


Herald added a comment.

NOTE: Clang-Format Team Automated Review Comment

It looks like your clang-format review does not contain any unit tests, please 
try to ensure all code changes have a unit test (unless this is an `NFC` or 
refactoring, adding documentation etc..)

Add you unit tests in `clang/unittests/Format` and build `ninja FormatTests` we 
recommend using the `verifyFormat(xxx)` format of unit tests rather than 
`EXPECT_EQ` as this will ensure you change is tolerant to random whitespace 
changes (see FormatTest.cpp as an example)

For situations where your change is altering the TokenAnnotator.cpp which can 
happen if you are trying to improve the annotation phase to ensure we are 
correctly identifying the type of a token, please add a token annotator test in 
`TokenAnnotatorTest.cpp`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147176/new/

https://reviews.llvm.org/D147176

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146310: [clang-format] Fix dropped 'else' in 398cddf6acec

2023-03-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Thanks, yes, I did not intend to delete the else. This only triggers with 
fuzzing with rather involved inputs, thus I wasn't able to create a nice enough 
unit test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146310/new/

https://reviews.llvm.org/D146310

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D146704: [clang-format] NFC Format.h and ClangFormatStyleOptions.rst are out of date

2023-03-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

Thank you!! Sorry for forgetting that I needed to do this, 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146704/new/

https://reviews.llvm.org/D146704

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D145244: [clang-format] Add ability to trace tokens.

2023-03-06 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Answering the fundamental design question first.




Comment at: clang/lib/Format/ContinuationIndenter.cpp:289
Current.closesBlockOrBlockTypeList(Style))) 
{
+DEBUG_FORMAT_TRACE_TOKEN(Current, "!canBreak");
 return false;

sammccall wrote:
> annotating all exit paths from this function and `mustBreak` seems much more 
> verbose and fragile than wrapping the function (making this version private) 
> and adding the annotations in the wrapper.
How do we get the exact condition that triggered? The main trick here is the 
__FILE__:__LINE__ in the debug output.



Comment at: clang/lib/Format/FormatDebug.h:22
+
+#ifndef NDEBUG
+

sammccall wrote:
> it looks like you're not providing a dummy definition in NDEBUG mode - does 
> this still build in that config?
All use is within LLVM_DEBUG().



Comment at: clang/lib/Format/FormatDebug.h:32
+ << (Tok).TokenText << ": " << Debug << "\n";  
\
+  } else { 
\
+  }

sammccall wrote:
> why empty else?
The idea was dangling else protection, but I guess we have dangling else 
warnings, so this is not necessary?



Comment at: clang/lib/Format/TokenAnnotator.cpp:3258
   }
+  DEBUG_FORMAT_TRACE_TOKEN(*Current, (Current->MustBreakBefore ? "" : "!")
+ << "MustBreakBefore");

sammccall wrote:
> sammccall wrote:
> > move this out of the if() and remove the one on the other branch?
> this meaning of `<<` in `(...) << "MustBreakBefore` is confusing.
> 
> consider `<< (...) << "MustBreakBefore"` or `dbgs() << (...) << 
> "MustBreakBefore"` or a twine or formatv or really anything that isn't a 
> shift :-)
The idea was to see which branch of the if is taken here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145244/new/

https://reviews.llvm.org/D145244

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D145244: [clang-format] Add ability to trace tokens.

2023-03-03 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 502123.
klimek added a comment.

Remove superfluous include.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145244/new/

https://reviews.llvm.org/D145244

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatDebug.cpp
  clang/lib/Format/FormatDebug.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/UnwrappedLineFormatter.cpp

Index: clang/lib/Format/UnwrappedLineFormatter.cpp
===
--- clang/lib/Format/UnwrappedLineFormatter.cpp
+++ clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -7,6 +7,7 @@
 //===--===//
 
 #include "UnwrappedLineFormatter.h"
+#include "FormatDebug.h"
 #include "FormatToken.h"
 #include "NamespaceEndCommentsFixer.h"
 #include "WhitespaceManager.h"
@@ -1252,17 +1253,43 @@
   /// penalty of \p Penalty. Insert a line break if \p NewLine is \c true.
   void addNextStateToQueue(unsigned Penalty, StateNode *PreviousNode,
bool NewLine, unsigned *Count, QueueType *Queue) {
-if (NewLine && !Indenter->canBreak(PreviousNode->State))
+bool MustBreak = false;
+bool MustNotBreak = false;
+LLVM_DEBUG({
+  if (internal::DebugForceMustBreak().match(
+  PreviousNode->State.NextToken->TokenText)) {
+MustBreak = true;
+  }
+  if (internal::DebugForceMustNotBreak().match(
+  PreviousNode->State.NextToken->TokenText)) {
+MustNotBreak = true;
+  }
+});
+
+if (NewLine && (!Indenter->canBreak(PreviousNode->State) || MustNotBreak) &&
+!MustBreak) {
+  DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+   "Cannot break before");
   return;
-if (!NewLine && Indenter->mustBreak(PreviousNode->State))
+}
+if (!NewLine && (Indenter->mustBreak(PreviousNode->State) || MustBreak) &&
+!MustNotBreak) {
+  DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+   "Must break before");
   return;
+}
 
 StateNode *Node = new (Allocator.Allocate())
 StateNode(PreviousNode->State, NewLine, PreviousNode);
 if (!formatChildren(Node->State, NewLine, /*DryRun=*/true, Penalty))
   return;
 
-Penalty += Indenter->addTokenToState(Node->State, NewLine, true);
+unsigned AddPenalty = Indenter->addTokenToState(Node->State, NewLine, true);
+DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+ "Penalty for " << (!NewLine ? "not " : "")
+<< "breaking before token is "
+<< AddPenalty);
+Penalty += AddPenalty;
 
 Queue->push(QueueItem(OrderedPenalty(Penalty, *Count), Node));
 ++(*Count);
Index: clang/lib/Format/TokenAnnotator.cpp
===
--- clang/lib/Format/TokenAnnotator.cpp
+++ clang/lib/Format/TokenAnnotator.cpp
@@ -13,6 +13,8 @@
 //===--===//
 
 #include "TokenAnnotator.h"
+#include "ContinuationIndenter.h"
+#include "FormatDebug.h"
 #include "FormatToken.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Basic/TokenKinds.h"
@@ -3244,6 +3246,7 @@
 
 const auto  = Prev->Children;
 if (!Children.empty() && Children.back()->Last->is(TT_LineComment)) {
+  DEBUG_FORMAT_TRACE_TOKEN(*Current, "MustBreakBefore");
   Current->MustBreakBefore = true;
 } else {
   Current->MustBreakBefore =
@@ -3252,6 +3255,8 @@
   Current->is(TT_FunctionDeclarationName)) {
 Current->MustBreakBefore = mustBreakForReturnType(Line);
   }
+  DEBUG_FORMAT_TRACE_TOKEN(*Current, (Current->MustBreakBefore ? "" : "!")
+ << "MustBreakBefore");
 }
 
 Current->CanBreakBefore =
Index: clang/lib/Format/FormatDebug.h
===
--- /dev/null
+++ clang/lib/Format/FormatDebug.h
@@ -0,0 +1,41 @@
+//===--- FormatDebug.h - Format C++ code *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file
+/// This file declares utilities used in clang-format in debug mode.
+///
+//===--===//
+#ifndef LLVM_CLANG_LIB_FORMAT_FORMATDEBUG_H
+#define LLVM_CLANG_LIB_FORMAT_FORMATDEBUG_H
+
+#include "llvm/Support/Regex.h"
+
+namespace clang {
+namespace format {
+namespace internal {
+
+#ifndef NDEBUG
+

[PATCH] D145244: [clang-format] Add ability to trace tokens.

2023-03-03 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added a project: All.
klimek requested review of this revision.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D145244

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatDebug.cpp
  clang/lib/Format/FormatDebug.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/UnwrappedLineFormatter.cpp

Index: clang/lib/Format/UnwrappedLineFormatter.cpp
===
--- clang/lib/Format/UnwrappedLineFormatter.cpp
+++ clang/lib/Format/UnwrappedLineFormatter.cpp
@@ -7,6 +7,8 @@
 //===--===//
 
 #include "UnwrappedLineFormatter.h"
+#include "ContinuationIndenter.h"
+#include "FormatDebug.h"
 #include "FormatToken.h"
 #include "NamespaceEndCommentsFixer.h"
 #include "WhitespaceManager.h"
@@ -1252,17 +1254,43 @@
   /// penalty of \p Penalty. Insert a line break if \p NewLine is \c true.
   void addNextStateToQueue(unsigned Penalty, StateNode *PreviousNode,
bool NewLine, unsigned *Count, QueueType *Queue) {
-if (NewLine && !Indenter->canBreak(PreviousNode->State))
+bool MustBreak = false;
+bool MustNotBreak = false;
+LLVM_DEBUG({
+  if (internal::DebugForceMustBreak().match(
+  PreviousNode->State.NextToken->TokenText)) {
+MustBreak = true;
+  }
+  if (internal::DebugForceMustNotBreak().match(
+  PreviousNode->State.NextToken->TokenText)) {
+MustNotBreak = true;
+  }
+});
+
+if (NewLine && (!Indenter->canBreak(PreviousNode->State) || MustNotBreak) &&
+!MustBreak) {
+  DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+   "Cannot break before");
   return;
-if (!NewLine && Indenter->mustBreak(PreviousNode->State))
+}
+if (!NewLine && (Indenter->mustBreak(PreviousNode->State) || MustBreak) &&
+!MustNotBreak) {
+  DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+   "Must break before");
   return;
+}
 
 StateNode *Node = new (Allocator.Allocate())
 StateNode(PreviousNode->State, NewLine, PreviousNode);
 if (!formatChildren(Node->State, NewLine, /*DryRun=*/true, Penalty))
   return;
 
-Penalty += Indenter->addTokenToState(Node->State, NewLine, true);
+unsigned AddPenalty = Indenter->addTokenToState(Node->State, NewLine, true);
+DEBUG_FORMAT_TRACE_TOKEN(*PreviousNode->State.NextToken,
+ "Penalty for " << (!NewLine ? "not " : "")
+<< "breaking before token is "
+<< AddPenalty);
+Penalty += AddPenalty;
 
 Queue->push(QueueItem(OrderedPenalty(Penalty, *Count), Node));
 ++(*Count);
Index: clang/lib/Format/TokenAnnotator.cpp
===
--- clang/lib/Format/TokenAnnotator.cpp
+++ clang/lib/Format/TokenAnnotator.cpp
@@ -13,6 +13,8 @@
 //===--===//
 
 #include "TokenAnnotator.h"
+#include "ContinuationIndenter.h"
+#include "FormatDebug.h"
 #include "FormatToken.h"
 #include "clang/Basic/SourceManager.h"
 #include "clang/Basic/TokenKinds.h"
@@ -3244,6 +3246,7 @@
 
 const auto  = Prev->Children;
 if (!Children.empty() && Children.back()->Last->is(TT_LineComment)) {
+  DEBUG_FORMAT_TRACE_TOKEN(*Current, "MustBreakBefore");
   Current->MustBreakBefore = true;
 } else {
   Current->MustBreakBefore =
@@ -3252,6 +3255,8 @@
   Current->is(TT_FunctionDeclarationName)) {
 Current->MustBreakBefore = mustBreakForReturnType(Line);
   }
+  DEBUG_FORMAT_TRACE_TOKEN(*Current, (Current->MustBreakBefore ? "" : "!")
+ << "MustBreakBefore");
 }
 
 Current->CanBreakBefore =
Index: clang/lib/Format/FormatDebug.h
===
--- /dev/null
+++ clang/lib/Format/FormatDebug.h
@@ -0,0 +1,41 @@
+//===--- FormatDebug.h - Format C++ code *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+///
+/// \file
+/// This file declares utilities used in clang-format in debug mode.
+///
+//===--===//
+#ifndef LLVM_CLANG_LIB_FORMAT_FORMATDEBUG_H
+#define LLVM_CLANG_LIB_FORMAT_FORMATDEBUG_H
+
+#include "llvm/Support/Regex.h"
+
+namespace clang {
+namespace format {

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG01402831aaae: [clang-format] Add simple macro replacements 
in formatting. (authored by klimek).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander , llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander , llvm::StringRef Name,
- const std::vector  = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector ) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> ) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector ) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  ) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > Strictly I don't think this comment is true, we hit this path when it's 
> > > **only** an object-like macro, used before parens.
> > > 
> > > For this reason you might *not* want the dbgs() here but rather in the 
> > > bottom `else`
> > I'm pretty sure we hit this when it's overloaded for both, but we call it 
> > with the wrong arity.
> > E.g. A=x, A(x, y, z)=x y z
> > 
> > A(1, 2)
> > - Macros.objectLike(A) -> true
> > - Args -> true
> > - !Macros.hasArity(A, 2) -> true
> I agree that (the case described in the comment) => (we get to this point in 
> the code).
> 
> I'm disputing that (we get to this point in the code) => (the case described 
> in the comment).
> 
> i.e. given `A=x`, code=`A(1, 2)`, we also get: `objectLike(A) -> true`, `Args 
> -> true`, `!Macros.hasArity(A, 2) -> true`, but this time the comment is 
> lying about the state.
Ah, your comment made me think it was the former; I think the dbgs() is still 
what I want, but with an "either or" dbgs().

Adapted the comment in the code and the dbgs() to match.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500191.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander , llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander , llvm::StringRef Name,
- const std::vector  = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector ) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> ) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector ) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  ) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
   I->end());
@@ -57,16 +75,6 @@
 return UnexpandedTokens;
   }
 
-  TokenList expand(llvm::StringRef 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:172
   assert(defined(ID->TokenText));
+  assert(
+  (!OptionalArgs && ObjectLike.find(ID->TokenText) != ObjectLike.end()) ||

sammccall wrote:
> this assertion failure isn't going to be as useful as it could be!
> 
> maybe rather
> ```
> if (OptionalArgs)
>   assert(...);
> else
>   assert(...);
> ```
> 
> Also I think we're now ensuring to only call this if arity matches, so I'd 
> make this a precondition and replace the first assert with hasArity to make 
> it easier to reason about
Hopefully done, the comment took me a while to parse :)



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4592
+  !Macros.hasArity(ID->TokenText, Args->size())) {
+// The macro is overloaded to be both object-like and function-like,
+// but none of the function-like arities match the number of arguments.

sammccall wrote:
> Strictly I don't think this comment is true, we hit this path when it's 
> **only** an object-like macro, used before parens.
> 
> For this reason you might *not* want the dbgs() here but rather in the bottom 
> `else`
I'm pretty sure we hit this when it's overloaded for both, but we call it with 
the wrong arity.
E.g. A=x, A(x, y, z)=x y z

A(1, 2)
- Macros.objectLike(A) -> true
- Args -> true
- !Macros.hasArity(A, 2) -> true


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500181.
klimek marked 5 inline comments as done.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander , llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander , llvm::StringRef Name,
- const std::vector  = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector ) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> ) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector ) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  ) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
   I->end());
@@ -57,16 +75,6 @@
 return UnexpandedTokens;
   }
 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/unittests/Format/FormatTest.cpp:22584
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > If this is assign_or_return, the treatment of "c" is fairly strange. Also 
> > > you are mostly calling this with only two args. Maybe drop C and use a 
> > > different macro for the 3-arg case?
> > ASSIGN_OR_RETURN allows 3 args, though; this is basically the thing I'd 
> > propose to configure for ASSIGN_OR_RETURN in general; is there anything 
> > specific you don't like about this?
> OK, this is a bit sticky.
>  - `ASSIGN_OR_RETURN` is almost always called with two args
>  - some versions of `ASSIGN_OR_RETURN` support an optional third arg (there's 
> no canonical public version)
>  - these emulate overloading using variadic macro tricks
>  - this patch doesn't claim to support either variadic macros or overloading
> 
> So the principled options seem to be:
>  - macros have fixed arity: clang-format can support the two-arg version of 
> `ASSIGN_OR_RETURN` but not the three-arg version. (This is what I was 
> assuming)
>  - macros have variable arity: the public docs need to describe how passing 
> too many/too few args is treated, because this isn't obvious and it sounds 
> like we want people to rely on it. This feels like stretching "principled" to 
> me - we're relying on the expansion still parsing correctly when args are 
> missing, which basically seems like coincidence.
>  - macros support overloading: instead of looking up macros by name, the key 
> is `(string Name, optional Arity)`
Chose option 3, as this is pretty straight-forward and I believe the best user 
experience.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 500121.
klimek marked 6 inline comments as done.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/MacroCallReconstructorTest.cpp
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- clang/unittests/Format/MacroExpanderTest.cpp
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -19,9 +19,16 @@
Lex.Allocator, Lex.IdentTable);
   }
 
+  std::string expand(MacroExpander , llvm::StringRef Name) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
+return text(Macros.expand(Lex.id(Name), {}));
+  }
+
   std::string expand(MacroExpander , llvm::StringRef Name,
- const std::vector  = {}) {
-EXPECT_TRUE(Macros.defined(Name));
+ const std::vector ) {
+EXPECT_TRUE(Macros.defined(Name))
+<< "Macro not defined: \"" << Name << "\"";
 return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
   }
 
@@ -95,7 +102,7 @@
   EXPECT_EQ("", expand(*Macros, "A"));
   EXPECT_EQ("b", expand(*Macros, "B"));
   EXPECT_EQ("c+c", expand(*Macros, "C"));
-  EXPECT_EQ("", expand(*Macros, "D"));
+  EXPECT_EQ("", expand(*Macros, "D", {}));
 }
 
 TEST_F(MacroExpanderTest, ExpandsWithArguments) {
@@ -105,7 +112,6 @@
   });
   EXPECT_EQ("", expand(*Macros, "A", {"a"}));
   EXPECT_EQ("b1+b2+b3", expand(*Macros, "B", {"b1", "b2 + b3"}));
-  EXPECT_EQ("x+", expand(*Macros, "B", {"x"}));
 }
 
 TEST_F(MacroExpanderTest, AttributizesTokens) {
@@ -200,6 +206,14 @@
   EXPECT_ATTRIBUTES(Result, Attributes);
 }
 
+TEST_F(MacroExpanderTest, Overloads) {
+  auto Macros = create({"A=x", "A()=y", "A(a)=a", "A(a, b)=a b"});
+  EXPECT_EQ("x", expand(*Macros, "A"));
+  EXPECT_EQ("y", expand(*Macros, "A", {}));
+  EXPECT_EQ("z", expand(*Macros, "A", {"z"}));
+  EXPECT_EQ("xy", expand(*Macros, "A", {"x", "y"}));
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- clang/unittests/Format/MacroCallReconstructorTest.cpp
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -33,13 +33,31 @@
   TokenList
   expand(llvm::StringRef Name,
  const SmallVector, 1> ) {
+return expandInternal(Name, Args);
+  }
+
+  TokenList expand(llvm::StringRef Name) { return expandInternal(Name, {}); }
+
+  TokenList expand(llvm::StringRef Name, const std::vector ) {
+return expandInternal(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  TokenList expandInternal(
+  llvm::StringRef Name,
+  const std::optional, 1>>
+  ) {
 auto *ID = Lex.id(Name);
 auto UnexpandedLine = std::make_unique();
 UnexpandedLine->Tokens.push_back(ID);
-if (!Args.empty()) {
+if (Args && !Args->empty()) {
   UnexpandedLine->Tokens.push_back(Lex.id("("));
-  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
-if (I != Args.begin())
+  for (auto I = Args->begin(), E = Args->end(); I != E; ++I) {
+if (I != Args->begin())
   UnexpandedLine->Tokens.push_back(Lex.id(","));
 UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
   I->end());
@@ -57,16 +75,6 @@
 return UnexpandedTokens;
   }
 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 499777.
klimek added a comment.

Add comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -22568,6 +22568,230 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+  Style.Macros.push_back("MOCK_METHOD(r, n, a, s)=r n a s");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);
+  verifyIncompleteFormat(R"(ID3({, ID(a *b),
+  ;
+  });
+)",
+ Style);
+
+  verifyFormat("ID(CALL(CALL(return a * b;)));", Style);
+
+  verifyFormat("ASSIGN_OR_RETURN(MySomewhatLongType *variable,\n"
+   " MySomewhatLongFunction(SomethingElse()));\n",
+   Style);
+
+  verifyFormat(R"(
+#define MACRO(a, b) ID(a + b)
+)",
+   Style);
+  EXPECT_EQ(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(
+namespace foo {
+int a;
+}
+) // namespace k
+)",
+format(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(namespace foo { int a; })  // namespace k
+)",
+   Style));
+  verifyFormat(R"(ID(
+//
+({ ; }))
+)",
+   Style);
+
+  Style.ColumnLimit = 35;
+  // FIXME: Arbitrary formatting of macros where the end of the logical
+  // line is in the middle of a macro call are not working yet.
+  verifyFormat(R"(ID(
+void f();
+void)
+ID(g) ID(()) ID(
+;
+void g();)
+)",
+   Style);
+
+  Style.ColumnLimit = 10;
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+
+  EXPECT_EQ(R"(
+ID(CALL(CALL(
+a *b)));
+)",
+format(R"(
+ID(CALL(CALL(a * b)));
+)",
+   Style));
+
+  // FIXME: If we want to support unbalanced braces or parens from macro
+  // expansions we need to re-think how we propagate errors in
+  // TokenAnnotator::parseLine; for investigation, switching the inner loop of
+  // TokenAnnotator::parseLine to return LT_Other instead of LT_Invalid in case
+  // of !consumeToken() changes the formatting of the test below and makes it
+  // believe it has a fully correct formatting.
+  EXPECT_EQ(R"(
+ID3(
+{
+CLASS
+a *b;
+};
+},
+ID(x *y);
+,
+STMT
+STMT
+STMT)
+void f();
+)",
+format(R"(
+ID3({CLASS a*b; };}, ID(x*y);, STMT STMT STMT)
+void f();
+)",
+   Style));
+
+  verifyFormat("ID(a(\n"
+   "#ifdef A\n"
+   "b, c\n"
+   "#else\n"
+   "d(e)\n"
+   "#endif\n"
+   "))",
+   Style);
+  Style.ColumnLimit = 80;
+  

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/ContinuationIndenter.cpp:744
+  // Align following lines within parenthesis / brackets if configured.
+  // For a line of macro parents, the commas that follow the opening 
parenthesis
+  // in the line come after the opening parenthesis' children - we want to 
align

sammccall wrote:
> This is a bit hard to follow without context, I had to look up 
> MacroCallReconstructor::takeResult() to recall the structure it's referring 
> to, and there's no obvious breadcrumb to point you there. And if you don't 
> follow the details, it's hard to understand what the relevance is.
> 
> Maybe something like: this doesn't apply to macro expansion lines, which are 
> `MACRO( , , )` with args as children of `(` and `,`. It does not make sense 
> to align the commas with the opening paren.
> 
> (I think describing here how the alignment ends up working in that case might 
> just add confusion: it's not handled by this code, right?)
Excellent wording, applied. And yes, alignment in the macro case is not handled 
here.



Comment at: clang/lib/Format/TokenAnnotator.h:68
 FormatToken *Current = First;
+addChildren(Line.Tokens.front(), Current);
 for (const UnwrappedLineNode  : llvm::drop_begin(Line.Tokens)) {

sammccall wrote:
> this seems like a separate bugfix: previously if the first node in 
> Line.Tokens had children, we wouldn't be adding them.
> 
> Is this true? Was there some reason the first node could never have children 
> before?
I think the main problem is that it's impossible to write C++ code that has a 
child in the first node based on how this worked. That is, previously we only 
added children for blocks within one line, that is, within a function or 
lambda. Those need intro tokens. So I really couldn't make this trigger, or I 
would have fixed it in a separate patch.



Comment at: clang/lib/Format/TokenAnnotator.h:76
+  addChildren(Node, Current);
+  // FIXME: if we add children, previous will point to the token before
+  // the children; changing this requires significant changes across

sammccall wrote:
> does "add children" here refer to the procedure in addChildren(), or to the 
> token insertion idea introduced in the previous patch?
> 
> (i.e. is this just describing that previous/next skip over trees of children 
> rather than including them in the chain, or is there something new going on?)
It's about adding children in general.
When fixing this I realized this is in principle wrong, but then noticed that 
we have code that depends on this behavior.



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:926
+} else {
+  Tok->Finalized = true;
+}

sammccall wrote:
> why do we no longer finalize the child tree?
We do - markFinalized() is called for the child tree again. Previously we 
marked multiple times, which doesn't work with the new algorithm, as it would 
move ExpandedArg -> UnexpandedArg -> Finalized in one go.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:225
+  }
+  Callback.finishRun();
+}

sammccall wrote:
> I'm sure you got this right, but it's hard for me to evaluate this code 
> because I don't know what `UnwrappedLineConsumer::finishRun()` is for - it's 
> not documented and the implementation is mysterious.
> 
> You might consider adding a comment to that interface/method, it's not really 
> related to this change though.
Done. Not sure I've put too much info about how it's going to be used into the 
interface, where it doesn't really belong.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4564
+  // We parse the macro call into a new line.
+  auto Args = parseMacroCall();
+  InExpansion = OldInExpansion;

sammccall wrote:
> it looks like you go into parsing the macro call including parens and 
> arguments regardless of whether the macro is object-like or function-like.
> 
> So given
> ```
> #define DEBUG puts
> DEBUG("foo");
> ```
> you're going to expand to `puts;` instead of `puts("foo");`
Excellent find. Completely invalidated the way I was doing formatting of 
expanded lines, but fixed the problem with the line indices needing resetting 
\o/

Now we're formatting all lines in the first round with the macro calls replaced 
with expanded lines, and then again format everything including the macro 
calls. That is more in line with clang-format's assumptions anyway.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:4660
+}
+  } while (!eof());
+  return {};

sammccall wrote:
> hitting eof is an error condition here, so it seems more natural to handle it 
> as a case in the switch as a kind of early-return, and maybe it's worth 
> adding an LLVM_DEBUG() for that case. If anything it'd be nice to have 
> 

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 499771.
klimek marked 9 inline comments as done.
klimek added a comment.

Address reviewer comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144170/new/

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -22568,6 +22568,230 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+  Style.Macros.push_back("MOCK_METHOD(r, n, a, s)=r n a s");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);
+  verifyIncompleteFormat(R"(ID3({, ID(a *b),
+  ;
+  });
+)",
+ Style);
+
+  verifyFormat("ID(CALL(CALL(return a * b;)));", Style);
+
+  verifyFormat("ASSIGN_OR_RETURN(MySomewhatLongType *variable,\n"
+   " MySomewhatLongFunction(SomethingElse()));\n",
+   Style);
+
+  verifyFormat(R"(
+#define MACRO(a, b) ID(a + b)
+)",
+   Style);
+  EXPECT_EQ(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(
+namespace foo {
+int a;
+}
+) // namespace k
+)",
+format(R"(
+int a;
+int b;
+int c;
+int d;
+int e;
+int f;
+ID(namespace foo { int a; })  // namespace k
+)",
+   Style));
+  verifyFormat(R"(ID(
+//
+({ ; }))
+)",
+   Style);
+
+  Style.ColumnLimit = 35;
+  // FIXME: Arbitrary formatting of macros where the end of the logical
+  // line is in the middle of a macro call are not working yet.
+  verifyFormat(R"(ID(
+void f();
+void)
+ID(g) ID(()) ID(
+;
+void g();)
+)",
+   Style);
+
+  Style.ColumnLimit = 10;
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+
+  EXPECT_EQ(R"(
+ID(CALL(CALL(
+a *b)));
+)",
+format(R"(
+ID(CALL(CALL(a * b)));
+)",
+   Style));
+
+  // FIXME: If we want to support unbalanced braces or parens from macro
+  // expansions we need to re-think how we propagate errors in
+  // TokenAnnotator::parseLine; for investigation, switching the inner loop of
+  // TokenAnnotator::parseLine to return LT_Other instead of LT_Invalid in case
+  // of !consumeToken() changes the formatting of the test below and makes it
+  // believe it has a fully correct formatting.
+  EXPECT_EQ(R"(
+ID3(
+{
+CLASS
+a *b;
+};
+},
+ID(x *y);
+,
+STMT
+STMT
+STMT)
+void f();
+)",
+format(R"(
+ID3({CLASS a*b; };}, ID(x*y);, STMT STMT STMT)
+void f();
+)",
+   Style));
+
+  verifyFormat("ID(a(\n"
+   "#ifdef A\n"
+   "b, c\n"
+   "#else\n"
+   "d(e)\n"
+   "#endif\n"
+   "))",
+   

[PATCH] D144170: [clang-format] Add simple macro replacements in formatting.

2023-02-16 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added a project: All.
klimek requested review of this revision.
Herald added a project: clang.

Add configuration to specify macros.
Macros will be expanded, and the code will be parsed and annotated
in the expanded state. In a second step, the formatting decisions
in the annotated expanded code will be reconstructed onto the
original unexpanded macro call.

Eventually, this will allow to remove special-case code for
various macro options we accumulated over the years in favor of
one principled mechanism.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144170

Files:
  clang/include/clang/Format/Format.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/Macros.h
  clang/lib/Format/TokenAnalyzer.cpp
  clang/lib/Format/TokenAnalyzer.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/TokenAnnotator.h
  clang/lib/Format/UnwrappedLineFormatter.cpp
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -72,7 +72,8 @@
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
 auto Tokens = Lex.lex();
-UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
+UnwrappedLineParser Parser(SourceMgr.get(), Style, Lex.getKeywords(), 0,
+   Tokens, *this, Allocator, IdentTable);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (auto  : UnwrappedLines) {
Index: clang/unittests/Format/FormatTest.cpp
===
--- clang/unittests/Format/FormatTest.cpp
+++ clang/unittests/Format/FormatTest.cpp
@@ -66,7 +66,8 @@
 
   void _verifyFormat(const char *File, int Line, llvm::StringRef Expected,
  llvm::StringRef Code,
- const FormatStyle  = getLLVMStyle()) {
+ const FormatStyle  = getLLVMStyle(),
+ bool MessUp = true) {
 ScopedTrace t(File, Line, ::testing::Message() << Code.str());
 EXPECT_EQ(Expected.str(), format(Expected, Style))
 << "Expected code is not stable";
@@ -76,20 +77,24 @@
   // needs to be checked for Objective-C++ as well.
   FormatStyle ObjCStyle = Style;
   ObjCStyle.Language = FormatStyle::LK_ObjC;
-  EXPECT_EQ(Expected.str(), format(test::messUp(Code), ObjCStyle));
+  EXPECT_EQ(Expected.str(),
+format(MessUp ? test::messUp(Code) : Code, ObjCStyle));
 }
   }
 
   void _verifyFormat(const char *File, int Line, llvm::StringRef Code,
- const FormatStyle  = getLLVMStyle()) {
-_verifyFormat(File, Line, Code, test::messUp(Code), Style);
+ const FormatStyle  = getLLVMStyle(),
+ bool MessUp = true) {
+_verifyFormat(File, Line, Code, MessUp ? test::messUp(Code) : Code, Style,
+  MessUp);
   }
 
   void _verifyIncompleteFormat(const char *File, int Line, llvm::StringRef Code,
-   const FormatStyle  = getLLVMStyle()) {
+   const FormatStyle  = getLLVMStyle(),
+   bool MessUp = true) {
 ScopedTrace t(File, Line, ::testing::Message() << Code.str());
-EXPECT_EQ(Code.str(),
-  format(test::messUp(Code), Style, SC_ExpectIncomplete));
+EXPECT_EQ(Code.str(), format(MessUp ? test::messUp(Code) : Code, Style,
+ SC_ExpectIncomplete));
   }
 
   void _verifyIndependentOfContext(const char *File, int Line,
@@ -22568,6 +22573,189 @@
"llvm::outs()\n<<");
 }
 
+TEST_F(FormatTest, UnexpandConfiguredMacros) {
+  FormatStyle Style = getLLVMStyle();
+  Style.Macros.push_back("CLASS=class C {");
+  Style.Macros.push_back("SEMI=;");
+  Style.Macros.push_back("STMT=f();");
+  Style.Macros.push_back("ID(x)=x");
+  Style.Macros.push_back("ID3(x, y, z)=x y z");
+  Style.Macros.push_back("CALL(x)=f([] { x })");
+  Style.Macros.push_back("ASSIGN_OR_RETURN(a, b, c)=a = (b) || (c)");
+
+  verifyFormat("ID(nested(a(b, c), d))", Style);
+  verifyFormat("CLASS\n"
+   "  a *b;\n"
+   "};",
+   Style);
+  verifyFormat("SEMI\n"
+   "SEMI\n"
+   "SEMI",
+   Style);
+  verifyFormat("STMT\n"
+   "STMT\n"
+   "STMT",
+   Style);
+  verifyFormat("void f() { ID(a *b); }", Style);
+  verifyFormat(R"(ID(
+{ ID(a *b); });
+)",
+   Style);
+  verifyIncompleteFormat(R"(ID3({, 

[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-15 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG1995d4424505: [clang-format] Enable FormatTokenSource to 
insert tokens. (authored by klimek).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

Files:
  clang/lib/Format/FormatTokenSource.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/FormatTokenSourceTest.cpp

Index: clang/unittests/Format/FormatTokenSourceTest.cpp
===
--- clang/unittests/Format/FormatTokenSourceTest.cpp
+++ clang/unittests/Format/FormatTokenSourceTest.cpp
@@ -28,12 +28,17 @@
 #define EXPECT_TOKEN_KIND(FormatTok, Kind) \
   do { \
 FormatToken *Tok = FormatTok;  \
-EXPECT_EQ((Tok)->Tok.getKind(), Kind) << *(Tok);   \
+EXPECT_EQ(Tok->Tok.getKind(), Kind) << *Tok;   \
+  } while (false);
+#define EXPECT_TOKEN_ID(FormatTok, Name)   \
+  do { \
+FormatToken *Tok = FormatTok;  \
+EXPECT_EQ(Tok->Tok.getKind(), tok::identifier) << *Tok;\
+EXPECT_EQ(Tok->TokenText, Name) << *Tok;   \
   } while (false);
 
 TEST_F(IndexedTokenSourceTest, EmptyInput) {
-  TokenList Tokens = lex("");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex(""));
   EXPECT_FALSE(Source.isEOF());
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TRUE(Source.isEOF());
@@ -46,8 +51,7 @@
 }
 
 TEST_F(IndexedTokenSourceTest, NavigateTokenStream) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::kw_int);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::kw_int);
   EXPECT_EQ(Source.getPreviousToken(), nullptr);
@@ -60,11 +64,12 @@
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
 }
 
 TEST_F(IndexedTokenSourceTest, ResetPosition) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   Source.getNextToken();
   unsigned Position = Source.getPosition();
   Source.getNextToken();
@@ -73,6 +78,50 @@
   EXPECT_TOKEN_KIND(Source.setPosition(Position), tok::kw_int);
 }
 
+TEST_F(IndexedTokenSourceTest, InsertTokens) {
+  IndexedTokenSource Source(lex("A1 A2"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A2");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensAtEOF) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursive) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  // A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  // B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  // C1 B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  // D1 C1 B1 A1
+  EXPECT_TOKEN_ID(Source.getNextToken(), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursiveAtEndOfSequence) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/lib/Format/UnwrappedLineParser.h
===
--- clang/lib/Format/UnwrappedLineParser.h
+++ clang/lib/Format/UnwrappedLineParser.h
@@ -280,9 +280,6 @@
   FormatTokenSource *Tokens;
   

[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-15 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/UnwrappedLineParser.h:287
   // owned outside of and handed into the UnwrappedLineParser.
+  // FIXME: The above fixme doesn't work if we need to create tokens while
+  // parsing.

sammccall wrote:
> I'm not really sure how to read the combination of these two FIXMEs... does 
> it mean "we wanted to do this differently one day, but now we never can"?
> 
> Maybe either delete both, or if you think it's still potentially actionable, 
> something like FIXME: it would be better to have tokens created and owned 
> outside because XYZ, but it's hard because macro expansion mutates the stream
> 
> (I don't really understand what the prev comment is about: the tokens *are* 
> handed into the UnwrappedLineParser constructor. So I may be off base here)
Yeah, I think I don't fully understand what I wanted to fix, so deleted both.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-15 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 497637.
klimek marked an inline comment as done.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

Files:
  clang/lib/Format/FormatTokenSource.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/FormatTokenSourceTest.cpp

Index: clang/unittests/Format/FormatTokenSourceTest.cpp
===
--- clang/unittests/Format/FormatTokenSourceTest.cpp
+++ clang/unittests/Format/FormatTokenSourceTest.cpp
@@ -28,12 +28,17 @@
 #define EXPECT_TOKEN_KIND(FormatTok, Kind) \
   do { \
 FormatToken *Tok = FormatTok;  \
-EXPECT_EQ((Tok)->Tok.getKind(), Kind) << *(Tok);   \
+EXPECT_EQ(Tok->Tok.getKind(), Kind) << *Tok;   \
+  } while (false);
+#define EXPECT_TOKEN_ID(FormatTok, Name)   \
+  do { \
+FormatToken *Tok = FormatTok;  \
+EXPECT_EQ(Tok->Tok.getKind(), tok::identifier) << *Tok;\
+EXPECT_EQ(Tok->TokenText, Name) << *Tok;   \
   } while (false);
 
 TEST_F(IndexedTokenSourceTest, EmptyInput) {
-  TokenList Tokens = lex("");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex(""));
   EXPECT_FALSE(Source.isEOF());
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TRUE(Source.isEOF());
@@ -46,8 +51,7 @@
 }
 
 TEST_F(IndexedTokenSourceTest, NavigateTokenStream) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::kw_int);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::kw_int);
   EXPECT_EQ(Source.getPreviousToken(), nullptr);
@@ -60,11 +64,12 @@
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
 }
 
 TEST_F(IndexedTokenSourceTest, ResetPosition) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   Source.getNextToken();
   unsigned Position = Source.getPosition();
   Source.getNextToken();
@@ -73,6 +78,50 @@
   EXPECT_TOKEN_KIND(Source.setPosition(Position), tok::kw_int);
 }
 
+TEST_F(IndexedTokenSourceTest, InsertTokens) {
+  IndexedTokenSource Source(lex("A1 A2"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A2");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensAtEOF) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursive) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  // A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  // B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  // C1 B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  // D1 C1 B1 A1
+  EXPECT_TOKEN_ID(Source.getNextToken(), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursiveAtEndOfSequence) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/lib/Format/UnwrappedLineParser.h
===
--- clang/lib/Format/UnwrappedLineParser.h
+++ clang/lib/Format/UnwrappedLineParser.h
@@ -280,9 +280,6 @@
   FormatTokenSource *Tokens;
   UnwrappedLineConsumer 
 
-  // FIXME: This is a temporary measure until we have reworked the ownership
-  // of 

[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-14 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 497416.
klimek added a comment.

Undo changes to ownership of initial set of FormatTokens.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

Files:
  clang/lib/Format/FormatTokenSource.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/FormatTokenSourceTest.cpp

Index: clang/unittests/Format/FormatTokenSourceTest.cpp
===
--- clang/unittests/Format/FormatTokenSourceTest.cpp
+++ clang/unittests/Format/FormatTokenSourceTest.cpp
@@ -30,10 +30,15 @@
 FormatToken *Tok = FormatTok;  \
 EXPECT_EQ((Tok)->Tok.getKind(), Kind) << *(Tok);   \
   } while (false);
+#define EXPECT_TOKEN_ID(FormatTok, Name)   \
+  do { \
+FormatToken *Tok = FormatTok;  \
+EXPECT_EQ((Tok)->Tok.getKind(), tok::identifier) << *(Tok);\
+EXPECT_EQ((Tok)->TokenText, Name) << *(Tok);   \
+  } while (false);
 
 TEST_F(IndexedTokenSourceTest, EmptyInput) {
-  TokenList Tokens = lex("");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex(""));
   EXPECT_FALSE(Source.isEOF());
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TRUE(Source.isEOF());
@@ -46,8 +51,7 @@
 }
 
 TEST_F(IndexedTokenSourceTest, NavigateTokenStream) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::kw_int);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::kw_int);
   EXPECT_EQ(Source.getPreviousToken(), nullptr);
@@ -60,11 +64,12 @@
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
 }
 
 TEST_F(IndexedTokenSourceTest, ResetPosition) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   Source.getNextToken();
   unsigned Position = Source.getPosition();
   Source.getNextToken();
@@ -73,6 +78,50 @@
   EXPECT_TOKEN_KIND(Source.setPosition(Position), tok::kw_int);
 }
 
+TEST_F(IndexedTokenSourceTest, InsertTokens) {
+  IndexedTokenSource Source(lex("A1 A2"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A2");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensAtEOF) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursive) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  // A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  // B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  // C1 B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  // D1 C1 B1 A1
+  EXPECT_TOKEN_ID(Source.getNextToken(), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursiveAtEndOfSequence) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/lib/Format/UnwrappedLineParser.h
===
--- clang/lib/Format/UnwrappedLineParser.h
+++ clang/lib/Format/UnwrappedLineParser.h
@@ -89,7 +89,8 @@
 public:
   UnwrappedLineParser(const FormatStyle ,
   const AdditionalKeywords ,
-  unsigned FirstStartColumn, ArrayRef Tokens,
+  unsigned FirstStartColumn,
+  ArrayRef Tokens,
   UnwrappedLineConsumer );
 
   void parse();
@@ -283,6 +284,8 @@
   // FIXME: This is a temporary measure until we have 

[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-14 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/FormatTokenSource.h:74
 public:
-  IndexedTokenSource(ArrayRef Tokens)
+  IndexedTokenSource(SmallVectorImpl )
   : Tokens(Tokens), Position(-1) {}

sammccall wrote:
> As I understand it, this parameter is used:
>  - to provide the initial set of input tokens the source will iterate over
>  - as a common address space for input + synthesized tokens, to allow the 
> jump mechanism to work
>  - to make the caller responsible for ownership/lifetime of the synthesized 
> tokens too
> 
> This simplifies the implementation, my only problem with this is it seems 
> unusual and confusing.
> A comment explaining the roles of this `Tokens` param would help a bit.
> 
> Alternatively you could consider slightly different data structures just for 
> the purpose of making interfaces more obvious: e.g. pass in a 
> `BumpPtrAllocator&`, allocate scratch tokens there, and use pointers instead 
> of indices (jumps become a ptr->ptr map etc)
Noticing none of this makes sense. We should really just copy the tokens, given 
that we modify the vector.



Comment at: clang/lib/Format/FormatTokenSource.h:94
   FormatToken *getPreviousToken() override {
 return Position > 0 ? Tokens[Position - 1] : nullptr;
   }

sammccall wrote:
> this no longer seems valid, immediately after calling insertTokens(), 
> Position is at the start of a jump range, and Tokens[Position - 1] will be 
> the EOF at the end of the main stream or previous jump range.
> 
> if this is never going to happen, you can detect it (I don't think 
> Tokens[Position - 1] can be EOF in any other way)
Added comment in the interface and assert.



Comment at: clang/lib/Format/FormatTokenSource.h:115
 
   unsigned getPosition() override {
 LLVM_DEBUG(llvm::dbgs() << "Getting Position: " << Position << "\n");

sammccall wrote:
> maybe add a comment that positions don't have meaningful order and this is 
> only useful to restore the position?
> 
> (All the callsites look good, it just feels like a trap)
Added a comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-14 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 497415.
klimek marked 3 inline comments as done.
klimek added a comment.

Address reviewer comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143070/new/

https://reviews.llvm.org/D143070

Files:
  clang/lib/Format/FormatTokenSource.h
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/FormatTokenSourceTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -71,7 +71,8 @@
 
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
-auto Tokens = Lex.lex();
+auto Toks = Lex.lex();
+SmallVector Tokens(Toks.begin(), Toks.end());
 UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
Index: clang/unittests/Format/FormatTokenSourceTest.cpp
===
--- clang/unittests/Format/FormatTokenSourceTest.cpp
+++ clang/unittests/Format/FormatTokenSourceTest.cpp
@@ -30,10 +30,15 @@
 FormatToken *Tok = FormatTok;  \
 EXPECT_EQ((Tok)->Tok.getKind(), Kind) << *(Tok);   \
   } while (false);
+#define EXPECT_TOKEN_ID(FormatTok, Name)   \
+  do { \
+FormatToken *Tok = FormatTok;  \
+EXPECT_EQ((Tok)->Tok.getKind(), tok::identifier) << *(Tok);\
+EXPECT_EQ((Tok)->TokenText, Name) << *(Tok);   \
+  } while (false);
 
 TEST_F(IndexedTokenSourceTest, EmptyInput) {
-  TokenList Tokens = lex("");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex(""));
   EXPECT_FALSE(Source.isEOF());
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TRUE(Source.isEOF());
@@ -46,8 +51,7 @@
 }
 
 TEST_F(IndexedTokenSourceTest, NavigateTokenStream) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::kw_int);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::kw_int);
   EXPECT_EQ(Source.getPreviousToken(), nullptr);
@@ -60,11 +64,12 @@
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
 }
 
 TEST_F(IndexedTokenSourceTest, ResetPosition) {
-  TokenList Tokens = lex("int a;");
-  IndexedTokenSource Source(Tokens);
+  IndexedTokenSource Source(lex("int a;"));
   Source.getNextToken();
   unsigned Position = Source.getPosition();
   Source.getNextToken();
@@ -73,6 +78,50 @@
   EXPECT_TOKEN_KIND(Source.setPosition(Position), tok::kw_int);
 }
 
+TEST_F(IndexedTokenSourceTest, InsertTokens) {
+  IndexedTokenSource Source(lex("A1 A2"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A2");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensAtEOF) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1 B2")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursive) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  // A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  // B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  // C1 B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  // D1 C1 B1 A1
+  EXPECT_TOKEN_ID(Source.getNextToken(), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursiveAtEndOfSequence) {
+  IndexedTokenSource Source(lex("A1"));
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("B1")), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("C1")), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(lex("D1")), "D1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
 } // namespace
 } // 

[PATCH] D143070: [clang-format] Enable FormatTokenSource to insert tokens.

2023-02-01 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added a project: All.
klimek requested review of this revision.
Herald added a project: clang.

In preparation for configured macro replacements in formatting,
add the ability to insert tokens to FormatTokenSource, and implement
token insertion in IndexedTokenSource.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D143070

Files:
  clang/lib/Format/FormatTokenSource.h
  clang/lib/Format/UnwrappedLineParser.cpp
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/FormatTokenSourceTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- clang/unittests/Format/TestLexer.h
+++ clang/unittests/Format/TestLexer.h
@@ -71,7 +71,8 @@
 
   TokenList annotate(llvm::StringRef Code) {
 FormatTokenLexer Lex = getNewLexer(Code);
-auto Tokens = Lex.lex();
+auto Toks = Lex.lex();
+SmallVector Tokens(Toks.begin(), Toks.end());
 UnwrappedLineParser Parser(Style, Lex.getKeywords(), 0, Tokens, *this);
 Parser.parse();
 TokenAnnotator Annotator(Style, Lex.getKeywords());
Index: clang/unittests/Format/FormatTokenSourceTest.cpp
===
--- clang/unittests/Format/FormatTokenSourceTest.cpp
+++ clang/unittests/Format/FormatTokenSourceTest.cpp
@@ -30,6 +30,12 @@
 FormatToken *Tok = FormatTok;  \
 EXPECT_EQ((Tok)->Tok.getKind(), Kind) << *(Tok);   \
   } while (false);
+#define EXPECT_TOKEN_ID(FormatTok, Name)   \
+  do { \
+FormatToken *Tok = FormatTok;  \
+EXPECT_EQ((Tok)->Tok.getKind(), tok::identifier) << *(Tok);\
+EXPECT_EQ((Tok)->TokenText, Name) << *(Tok);   \
+  } while (false);
 
 TEST_F(IndexedTokenSourceTest, EmptyInput) {
   TokenList Tokens = lex("");
@@ -60,6 +66,8 @@
   EXPECT_TOKEN_KIND(Source.peekNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
   EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_KIND(Source.getPreviousToken(), tok::semi);
 }
 
 TEST_F(IndexedTokenSourceTest, ResetPosition) {
@@ -73,6 +81,62 @@
   EXPECT_TOKEN_KIND(Source.setPosition(Position), tok::kw_int);
 }
 
+TEST_F(IndexedTokenSourceTest, InsertTokens) {
+  TokenList TokensA = lex("A1 A2");
+  TokenList TokensB = lex("B1 B2");
+  IndexedTokenSource Source(TokensA);
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensB), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A2");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensAtEOF) {
+  TokenList TokensA = lex("A1");
+  TokenList TokensB = lex("B1 B2");
+  IndexedTokenSource Source(TokensA);
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensB), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B2");
+  EXPECT_TOKEN_KIND(Source.getNextToken(), tok::eof);
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursive) {
+  TokenList TokensA = lex("A1");
+  TokenList TokensB = lex("B1");
+  TokenList TokensC = lex("C1");
+  TokenList TokensD = lex("D1");
+  IndexedTokenSource Source(TokensA);
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  // A1
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensB), "B1");
+  // B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensC), "C1");
+  // C1 B1 A1
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensD), "D1");
+  // D1 C1 B1 A1
+  EXPECT_TOKEN_ID(Source.getNextToken(), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
+TEST_F(IndexedTokenSourceTest, InsertTokensRecursiveAtEndOfSequence) {
+  TokenList TokensA = lex("A1");
+  TokenList TokensB = lex("B1");
+  TokenList TokensC = lex("C1");
+  TokenList TokensD = lex("D1");
+  IndexedTokenSource Source(TokensA);
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensB), "B1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensC), "C1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+  EXPECT_TOKEN_ID(Source.insertTokens(TokensD), "D1");
+  EXPECT_TOKEN_ID(Source.getNextToken(), "A1");
+}
+
 } // namespace
 } // namespace format
 } // namespace clang
Index: clang/lib/Format/UnwrappedLineParser.h
===
--- clang/lib/Format/UnwrappedLineParser.h
+++ clang/lib/Format/UnwrappedLineParser.h
@@ -89,7 +89,8 @@
 public:
 

[PATCH] D119138: [clang-format] Further improve support for requires expressions

2022-11-28 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

For non-functional clean-ups generally llvm doesn't require pre-commit review - 
I did communicate here so people involved in the original change wouldn't miss 
the clean-up. I do agree that what commits to pre-review is a fine line, and 
usually try to err on the side of pre-review; I'll take your feedback into 
consideration for future changes.

Regarding a better overview, you're 100% right. This is something we've 
definitely not been good enough and we need to get better at.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119138/new/

https://reviews.llvm.org/D119138

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D119138: [clang-format] Further improve support for requires expressions

2022-11-26 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

I changed it in 49aca00d63e14df8bc68fc4329e6cbc9c9805eb8 
.

"We" is the people working on clang-format :) I hope that we have a common goal 
of making clang-format as easy to maintain as we can.

FWIW, I once had the same opinion as you about best doing all parsing as early 
as possible, but djasper convinced me that the split was a good idea, and in 
the end, I think it turns out to be significantly less brittle to do more 
complex annotation in TokenAnnotator. E.g. we now have a lookahead limit of 50, 
which seems rather arbitrary, while in TokenAnnotator we could simply limit 
lookahead towards the current UnwrappedLine. Similarly, in TokenAnnotator, we 
already have all the parens connected, so we could simply look from requires 
l_paren to the corresponding r_paren and whether the next token is an l_brace. 
If I can find a bit of time I'll take an attempt at implementing it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119138/new/

https://reviews.llvm.org/D119138

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D119138: [clang-format] Further improve support for requires expressions

2022-11-26 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Generally, why do we need to have that much information? I.e. why do we need to 
know the exact type of the "requires" keyword?
I do understand we need to know the brace type, but that seems like it would be 
easier to figure out in the TokenAnnotator (where we already parsed 
UnwrappedLines).
Do we ever parse UnwrappedLines differently depending on requires 
clauses/expressions?
If not, we should really do the annotation in TokenAnnotator, where we already 
have nice parsing bounds from the parsed UnwrappedLines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119138/new/

https://reviews.llvm.org/D119138

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D119138: [clang-format] Further improve support for requires expressions

2022-11-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.
Herald added a project: All.

Hey ho, sorry for the late comment here, but adding peekNextToken(n) is 
problematic, as this gets in the way of future changes we want to do to handle 
macros better.
Usually we want to use X = Tokens->getPosition() and FormatTok = 
Tokens->setPosition(X) pairs when doing look-ahead.
I did a quick attempt at fixing this, but ran into infinite loops later in the 
annotator :(


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119138/new/

https://reviews.llvm.org/D119138

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-09-21 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D88299#3804910 , @owenpan wrote:

> @sstwcw thanks for pointing it out. See D134329 
> .

Thanks Owen, really appreciate this! And sorry for not getting to it yet myself 
:(


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-07-18 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D88299#3660779 , @nridge wrote:

> Thanks! (I was intrigued by Sam's "solves a whole class of clang-format's 
> biggest problems" comment :-))

The end-result hopefully will :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-07-18 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D88299#3660772 , @nridge wrote:

> Does this patch change the formatting behaviour of clang-format?
>
> If so, are there any test cases that show before/after formatting? The 
> MacroCallReconstructorTest unit test seems like it's testing an internal 
> interface.

No, this is prep-work for the real change, which I'm planning to send out soon.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-07-12 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd6d0dc1f4537: [clang-format] Add MacroUnexpander. (authored 
by klimek).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroCallReconstructor.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroCallReconstructorTest.cpp

Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -0,0 +1,688 @@
+#include "../../lib/Format/Macros.h"
+#include "../../lib/Format/UnwrappedLineParser.h"
+#include "TestLexer.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace format {
+namespace {
+
+using UnexpandedMap =
+llvm::DenseMap>;
+
+// Keeps track of a sequence of macro expansions.
+//
+// The expanded tokens are accessible via getTokens(), while a map of macro call
+// identifier token to unexpanded token stream is accessible via
+// getUnexpanded().
+class Expansion {
+public:
+  Expansion(TestLexer , MacroExpander ) : Lex(Lex), Macros(Macros) {}
+
+  // Appends the token stream obtained from expanding the macro Name given
+  // the provided arguments, to be later retrieved with getTokens().
+  // Returns the list of tokens making up the unexpanded macro call.
+  TokenList
+  expand(llvm::StringRef Name,
+ const SmallVector, 1> ) {
+auto *ID = Lex.id(Name);
+auto UnexpandedLine = std::make_unique();
+UnexpandedLine->Tokens.push_back(ID);
+if (!Args.empty()) {
+  UnexpandedLine->Tokens.push_back(Lex.id("("));
+  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
+if (I != Args.begin())
+  UnexpandedLine->Tokens.push_back(Lex.id(","));
+UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
+  I->end());
+  }
+  UnexpandedLine->Tokens.push_back(Lex.id(")"));
+}
+Unexpanded[ID] = std::move(UnexpandedLine);
+
+auto Expanded = uneof(Macros.expand(ID, Args));
+Tokens.append(Expanded.begin(), Expanded.end());
+
+TokenList UnexpandedTokens;
+for (const UnwrappedLineNode  : Unexpanded[ID]->Tokens) {
+  UnexpandedTokens.push_back(Node.Tok);
+}
+return UnexpandedTokens;
+  }
+
+  TokenList expand(llvm::StringRef Name,
+   const std::vector  = {}) {
+return expand(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+  llvm::DenseMap> Unexpanded;
+  llvm::SmallVector Tokens;
+  TestLexer 
+  MacroExpander 
+};
+
+struct Chunk {
+  Chunk(llvm::ArrayRef Tokens)
+  : Tokens(Tokens.begin(), Tokens.end()) {}
+  Chunk(llvm::ArrayRef Children)
+  : Children(Children.begin(), Children.end()) {}
+  llvm::SmallVector Tokens;
+  llvm::SmallVector Children;
+};
+
+bool tokenMatches(const FormatToken *Left, const FormatToken *Right) {
+  if (Left->getType() == Right->getType() &&
+  Left->TokenText == Right->TokenText)
+return true;
+  llvm::dbgs() << Left->TokenText << " != " << Right->TokenText << "\n";
+  return false;
+}
+
+// Allows to produce chunks of a token list by typing the code of equal tokens.
+//
+// Created from a list of tokens, users call "consume" to get the next chunk
+// of tokens, checking that they match the written code.
+struct Matcher {
+  Matcher(const TokenList , TestLexer )
+  : Tokens(Tokens), It(this->Tokens.begin()), Lex(Lex) {}
+
+  Chunk consume(StringRef Tokens) {
+TokenList Result;
+for (const FormatToken *Token : uneof(Lex.lex(Tokens))) {
+  assert(tokenMatches(*It, Token));
+  Result.push_back(*It);
+  ++It;
+}
+return Chunk(Result);
+  }
+
+  TokenList Tokens;
+  TokenList::iterator It;
+  TestLexer 
+};
+
+UnexpandedMap mergeUnexpanded(const UnexpandedMap ,
+  const UnexpandedMap ) {
+  UnexpandedMap Result;
+  for (const auto  : M1) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  for (const auto  : M2) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  return Result;
+}
+
+class MacroCallReconstructorTest : public ::testing::Test {
+public:
+  MacroCallReconstructorTest() : Lex(Allocator, Buffers) {}
+
+  

[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-07-11 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 443729.
klimek added a comment.

Address review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroCallReconstructor.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroCallReconstructorTest.cpp

Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -0,0 +1,688 @@
+#include "../../lib/Format/Macros.h"
+#include "../../lib/Format/UnwrappedLineParser.h"
+#include "TestLexer.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace format {
+namespace {
+
+using UnexpandedMap =
+llvm::DenseMap>;
+
+// Keeps track of a sequence of macro expansions.
+//
+// The expanded tokens are accessible via getTokens(), while a map of macro call
+// identifier token to unexpanded token stream is accessible via
+// getUnexpanded().
+class Expansion {
+public:
+  Expansion(TestLexer , MacroExpander ) : Lex(Lex), Macros(Macros) {}
+
+  // Appends the token stream obtained from expanding the macro Name given
+  // the provided arguments, to be later retrieved with getTokens().
+  // Returns the list of tokens making up the unexpanded macro call.
+  TokenList
+  expand(llvm::StringRef Name,
+ const SmallVector, 1> ) {
+auto *ID = Lex.id(Name);
+auto UnexpandedLine = std::make_unique();
+UnexpandedLine->Tokens.push_back(ID);
+if (!Args.empty()) {
+  UnexpandedLine->Tokens.push_back(Lex.id("("));
+  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
+if (I != Args.begin())
+  UnexpandedLine->Tokens.push_back(Lex.id(","));
+UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
+  I->end());
+  }
+  UnexpandedLine->Tokens.push_back(Lex.id(")"));
+}
+Unexpanded[ID] = std::move(UnexpandedLine);
+
+auto Expanded = uneof(Macros.expand(ID, Args));
+Tokens.append(Expanded.begin(), Expanded.end());
+
+TokenList UnexpandedTokens;
+for (const UnwrappedLineNode  : Unexpanded[ID]->Tokens) {
+  UnexpandedTokens.push_back(Node.Tok);
+}
+return UnexpandedTokens;
+  }
+
+  TokenList expand(llvm::StringRef Name,
+   const std::vector  = {}) {
+return expand(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+  llvm::DenseMap> Unexpanded;
+  llvm::SmallVector Tokens;
+  TestLexer 
+  MacroExpander 
+};
+
+struct Chunk {
+  Chunk(llvm::ArrayRef Tokens)
+  : Tokens(Tokens.begin(), Tokens.end()) {}
+  Chunk(llvm::ArrayRef Children)
+  : Children(Children.begin(), Children.end()) {}
+  llvm::SmallVector Tokens;
+  llvm::SmallVector Children;
+};
+
+bool tokenMatches(const FormatToken *Left, const FormatToken *Right) {
+  if (Left->getType() == Right->getType() &&
+  Left->TokenText == Right->TokenText)
+return true;
+  llvm::dbgs() << Left->TokenText << " != " << Right->TokenText << "\n";
+  return false;
+}
+
+// Allows to produce chunks of a token list by typing the code of equal tokens.
+//
+// Created from a list of tokens, users call "consume" to get the next chunk
+// of tokens, checking that they match the written code.
+struct Matcher {
+  Matcher(const TokenList , TestLexer )
+  : Tokens(Tokens), It(this->Tokens.begin()), Lex(Lex) {}
+
+  Chunk consume(StringRef Tokens) {
+TokenList Result;
+for (const FormatToken *Token : uneof(Lex.lex(Tokens))) {
+  assert(tokenMatches(*It, Token));
+  Result.push_back(*It);
+  ++It;
+}
+return Chunk(Result);
+  }
+
+  TokenList Tokens;
+  TokenList::iterator It;
+  TestLexer 
+};
+
+UnexpandedMap mergeUnexpanded(const UnexpandedMap ,
+  const UnexpandedMap ) {
+  UnexpandedMap Result;
+  for (const auto  : M1) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  for (const auto  : M2) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  return Result;
+}
+
+class MacroCallReconstructorTest : public ::testing::Test {
+public:
+  MacroCallReconstructorTest() : Lex(Allocator, Buffers) {}
+
+  std::unique_ptr
+  createExpander(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   

[PATCH] D88299: [clang-format] Add MacroUnexpander.

2022-07-11 Thread Manuel Klimek via Phabricator via cfe-commits
klimek marked 7 inline comments as done.
klimek added inline comments.



Comment at: clang/lib/Format/Macros.h:201
+  /// Generally, this line tries to have the same structure as the expanded,
+  /// formatted unwrapped lines handed in via \c addLine(), with the exception
+  /// that for multiple top-level lines, each subsequent line will be the

sammccall wrote:
> you could give this a name like "tail form", and then refer to it in docs of 
> MacroCallReconstructor::Result, in MacroCallReconstructor.cpp:482, etc.
> 
> Up to you.
I'm somewhat unhappy with the term "tail form"; happy to do this in a 
subsequent change if we find a better name.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2021-11-26 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Noticed I should have waiting with the renaming of the file until the review is 
done :( Sorry for the extra confusion.




Comment at: clang/lib/Format/MacroUnexpander.cpp:77
+  // stream, we need to continue the unexpansion until we find the right token
+  // if it is part of the unexpanded token stream.
+  // Note that hidden tokens can be part of the unexpanded stream in nested

sammccall wrote:
> I can't work out what "it" refers to in this sentence.
> 
> (and "spelled" token stream?)
Changed to "given token" - it refers to the token.

It's reconstructed, not spelled, like the example below explains: we do have 
hidden tokens that we want to find when we're in the middle of a recursive 
expansion. I wouldn't call the hidden tokens here "spelled" - would you?



Comment at: clang/lib/Format/MacroUnexpander.cpp:82
+  // And the call: BRACED(BRACED(x))
+  // The outer macro call will be BRACED({a}), and the hidden tokens '{' and
+  // '}' can be found in the unexpanded macro stream of that level.

sammccall wrote:
> It would be helpful to complete the example by spelling out which token 
> you're adding, which the correct parent is, and which tokens you need to 
> "expand over" to make it available.
> 
> I *think* the answer to the first two is - when you're adding the `a` then 
> its proper parent is the inner `(` in `BRACED(BRACED(`... but I don't know 
> the last part.
Found out that I was missing a unit test. Added a unit test, and now explained 
the unit test here in the comment. PTAL.



Comment at: clang/lib/Format/MacroUnexpander.cpp:84
+  // '}' can be found in the unexpanded macro stream of that level.
+  if (!Unexpanded.empty() && Token->MacroCtx &&
+  (Token->MacroCtx->Role != MR_Hidden ||

sammccall wrote:
> is it possible that you may need to unexpand more than the innermost macro?
> 
> e.g. BRACED(ID(ID(BRACED(ID(ID(a)) expands to {{a}} and the parents of 
> `a` and inner `{` each come from a couple of macro-levels up.
> 
> (I don't totally understand the logic here, so the answer's probably "no"...)
A token on a higher level must always also be there on a lower level.
Calls in this example are:
BRACED({a})
ID({a})
ID({a})
BRACED(a)
ID(a)
ID(a)
When the next token comes in, we thus always find it in the higher level (open) 
macro call. When we reconstruct for that token, we then reconstruct multiple 
macro calls at once, if it is the first token in multiple macro calls.



Comment at: clang/lib/Format/MacroUnexpander.cpp:173
+  assert(Token->MacroCtx);
+  // A single token can be the only result of a macro call:
+  // Given: ID(x, y) ;

sammccall wrote:
> This raises another point: a macro can have an empty body.
> AFAICT this fundamentally isn't supported here, as we're driven by the 
> expanded token stream.
> I guess it makes sense to handle this elsewhere in clang-format (or even not 
> handle it) but it should be documented somewhere.
I think the right point to document and handle it is at the next layer, where 
we integrate all of this into clang-format.



Comment at: clang/lib/Format/MacroUnexpander.cpp:238
+  // generated by the call.
+  for (size_t I = Unexpanded.size(); I < Token->MacroCtx->ExpandedFrom.size();
+   ++I) {

sammccall wrote:
> nit: index arithmetic obscures what's going on a bit.
> You could write
> 
> ```
> ArrayRef StartedMacros = 
> makeArrayRef(Token->MacroCtx->ExpandedFrom).drop_front(Unexpanded.size());
> for (FormatToken *ID : llvm::reverse(StartedMacros)) {
> ...
> }
> ```
> but up to you
> 
> It's not obvious to me *why* we're iterating in reverse order BTW: i.e. why 
> the order of the `Unexpanded` stack is opposite the order of the 
> `ExpandedFrom` stack.
> So maybe just a comment to reinforce that, like (// ExpandedFrom is 
> outside-in order, we want inside-out)
Oooh, this is nice. s/drop_front/drop_back/. Added comment.



Comment at: clang/lib/Format/MacroUnexpander.cpp:282
+ (Unexpanded.size() >= Token->MacroCtx->EndOfExpansion));
+  if (!Token->MacroCtx)
+return;

sammccall wrote:
> nit: Token->MacroCtx is checked at the callsite, and startUnexpansion asserts 
> it as a precondition - do that here too for symmetry?
Thanks for spotting, this was from a time where the code looked differently 
(the assert above also didn't make sense any more in that form).



Comment at: clang/lib/Format/MacroUnexpander.cpp:310
+// Expand the closing parenthesis, if it exists, including an optional
+// trailing comment.
+for (auto T = Unexpanded.back().I; T != Unexpanded.back().End; ++T) {

sammccall wrote:
> I can't work out if this is supposed to say comma or comment :-)
> 
> If comma - is that a thing?
> If comment - why would a comment be considered part of the unexpansion of a 
> 

[PATCH] D88299: [clang-format] Add MacroUnexpander.

2021-11-26 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 390057.
klimek marked 49 inline comments as done.
klimek added a comment.

Work in review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroCallReconstructor.cpp
  clang/lib/Format/Macros.h
  clang/lib/Format/UnwrappedLineParser.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroCallReconstructorTest.cpp

Index: clang/unittests/Format/MacroCallReconstructorTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroCallReconstructorTest.cpp
@@ -0,0 +1,688 @@
+#include "../../lib/Format/Macros.h"
+#include "../../lib/Format/UnwrappedLineParser.h"
+#include "TestLexer.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace format {
+namespace {
+
+using UnexpandedMap =
+llvm::DenseMap>;
+
+// Keeps track of a sequence of macro expansions.
+//
+// The expanded tokens are accessible via getTokens(), while a map of macro call
+// identifier token to unexpanded token stream is accessible via
+// getUnexpanded().
+class Expansion {
+public:
+  Expansion(TestLexer , MacroExpander ) : Lex(Lex), Macros(Macros) {}
+
+  // Appends the token stream obtained from expanding the macro Name given
+  // the provided arguments, to be later retrieved with getTokens().
+  // Returns the list of tokens making up the unexpanded macro call.
+  TokenList
+  expand(llvm::StringRef Name,
+ const SmallVector, 1> ) {
+auto *ID = Lex.id(Name);
+auto UnexpandedLine = std::make_unique();
+UnexpandedLine->Tokens.push_back(ID);
+if (!Args.empty()) {
+  UnexpandedLine->Tokens.push_back(Lex.id("("));
+  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
+if (I != Args.begin())
+  UnexpandedLine->Tokens.push_back(Lex.id(","));
+UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
+  I->end());
+  }
+  UnexpandedLine->Tokens.push_back(Lex.id(")"));
+}
+Unexpanded[ID] = std::move(UnexpandedLine);
+
+auto Expanded = uneof(Macros.expand(ID, Args));
+Tokens.append(Expanded.begin(), Expanded.end());
+
+TokenList UnexpandedTokens;
+for (const UnwrappedLineNode  : Unexpanded[ID]->Tokens) {
+  UnexpandedTokens.push_back(Node.Tok);
+}
+return UnexpandedTokens;
+  }
+
+  TokenList expand(llvm::StringRef Name,
+   const std::vector  = {}) {
+return expand(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+  llvm::DenseMap> Unexpanded;
+  llvm::SmallVector Tokens;
+  TestLexer 
+  MacroExpander 
+};
+
+struct Chunk {
+  Chunk(llvm::ArrayRef Tokens)
+  : Tokens(Tokens.begin(), Tokens.end()) {}
+  Chunk(llvm::ArrayRef Children)
+  : Children(Children.begin(), Children.end()) {}
+  llvm::SmallVector Tokens;
+  llvm::SmallVector Children;
+};
+
+bool tokenMatches(const FormatToken *Left, const FormatToken *Right) {
+  if (Left->getType() == Right->getType() &&
+  Left->TokenText == Right->TokenText)
+return true;
+  llvm::dbgs() << Left->TokenText << " != " << Right->TokenText << "\n";
+  return false;
+}
+
+// Allows to produce chunks of a token list by typing the code of equal tokens.
+//
+// Created from a list of tokens, users call "consume" to get the next chunk
+// of tokens, checking that they match the written code.
+struct Matcher {
+  Matcher(const TokenList , TestLexer )
+  : Tokens(Tokens), It(this->Tokens.begin()), Lex(Lex) {}
+
+  Chunk consume(StringRef Tokens) {
+TokenList Result;
+for (const FormatToken *Token : uneof(Lex.lex(Tokens))) {
+  assert(tokenMatches(*It, Token));
+  Result.push_back(*It);
+  ++It;
+}
+return Chunk(Result);
+  }
+
+  TokenList Tokens;
+  TokenList::iterator It;
+  TestLexer 
+};
+
+UnexpandedMap mergeUnexpanded(const UnexpandedMap ,
+  const UnexpandedMap ) {
+  UnexpandedMap Result;
+  for (const auto  : M1) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  for (const auto  : M2) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  return Result;
+}
+
+class MacroCallReconstructorTest : public ::testing::Test {
+public:
+  MacroCallReconstructorTest() : Lex(Allocator, Buffers) {}
+
+  std::unique_ptr
+  createExpander(const std::vector ) {
+return 

[PATCH] D114430: [clang-format] NFC - recent changes caused clang-format to no longer be clang-formatted.

2021-11-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Thanks for cleaning up after me, and sorry for the mess - do y'all have 
clang-format set up as a presubmit or do you just remember to format manually?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114430/new/

https://reviews.llvm.org/D114430

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69764: [clang-format] Add East/West Const fixer capability

2021-08-09 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D69764#2934686 , @aaron.ballman 
wrote:

> In D69764#2934612 , @MyDeveloperDay 
> wrote:
>
>>> I was referring to @rsmith and @aaron.ballman (to clarify), both are 
>>> maintainers in 'clang', the former of which is the 'superset' maintainer of 
>>> this format project based on its directory. While Aaron is a 
>>> peer-maintainer to this project, his contributions are immense, and his 
>>> points are too-well-reasoned and motivated to be dismissed.
>>
>> Just so we are clear I'm not dismissing anyone opinions, @arron.ballman and 
>> I have been going at this quite a bit both on and off list. I have huge 
>> respect for these people, (even if the defence of my review it might not 
>> seem so). I can't even think to emulate their contributions.
>>
>> Ideally I'd like their blessing, but alas I fear that might be beyond my 
>> capabilities, but if there are things like the RFC that could even go some 
>> way to potentially them seeing a way forward that is mutually beneficial so 
>> that the door is even open a jar for this then I'm happy to try.
>>
>> If ultimately the answer is "no" then I need to understand what can be done 
>> if anything, if that means a new tool then I'm ok with that as the 
>> compromise. Out right dismissal, I would be very sorry to see.
>
> Here are my thoughts put in one place.
>
> 0) I like the idea of being able to format qualifier location, and I think 
> clang-format is the tool I would reach for to perform that work
> .33) I wish this was generalized to handle all qualifiers rather than just 
> `const`. (This is not a blocking wish.)
> .66) I wish this used "left" and "right" rather than "east" and "west" for 
> user-facing options and documentation. (This is Very Strong wish but I won't 
> block the patch over it.)
>
> 1. In general, I do not like the idea of my code formatting tool having 
> opinions about the input source's existing formatting (I think valid code 
> should remain valid). This is the area of concern causing me to ask for an 
> RFC because I'm operating under the impression that not breaking code is 
> (generally) the status quo for clang-format.
> 2. If the community consensus is that formatting can break code (blanket 
> permission, case by case basis, whatever), I'll definitely abide by that 
> decision. I just want it to be an explicit decision from a wider audience 
> than people who might be following this very long-running patch.
> 3. The proposed opt-in nature of the check is a blessing and a curse. It 
> alleviates some of the danger (it's not dangerous by default, you have to 
> manually say you want it). But it doesn't eliminate the danger (code can 
> still be broken) and it does raise the question of how often people opt into 
> this sort of thing and whether it's worth the maintenance costs. I don't 
> think any of us can answer those "do people opt in" questions though, so 
> that's not a concern I think we should hold anything up over unless someone 
> has actual data on usage of opt-in features for clang-format. I think opt-in 
> alleviates enough of the danger that it's a worthwhile approach for this 
> patch (and likely may be a good idea for a policy on future changes of the 
> same kind).
>
> tl;dr: if there's an RFC and the community says "breaking changes aren't that 
> big of a deal to us", it tips me from "against" to "in favor" for this 
> functionality.

Happy to go the RFC route if @MyDeveloperDay wants to do that.

FWIW, I don't understand the idea of the community being a democracy. It is 
not, and never was. Chris has designed an escalation process, when this was 
raised to him - I don't know whether this was ever made official or tested.
I also don't see this as a big change from clang-format's principles - C++ is 
(unfortunately) way too complex to not be able to introduce subtle bugs - the 
history of clang-format is full of them.

Note that I don't think the change will get consensus on an RFC, but not making 
the change will not get consensus, either - in that case, other than escalating 
to a clear higher decision power, or letting main contributors make the call, I 
don't know what to do.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69764/new/

https://reviews.llvm.org/D69764

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69764: [clang-format] Add East/West Const fixer capability

2021-08-09 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D69764#2934032 , @MyDeveloperDay 
wrote:

> In D69764#2932929 , @steveire wrote:
>
>> 
>
> @steveire, thanks for your comments, I also agree that a second tool 
> shouldn't be needed, especially as this functionality is off by default (and 
> needs to stay that way). This was my suggestion for a compromise.
>
> Its a tough call, but ultimately no on ever gave this a LGTM so I couldn't 
> proceed anyway. Whilst I think we can cover the other qualifiers with a 
> similar approach I don't want to waste more of my time until the fundamental 
> question has been answered if its ok for clang-format to insert/modify the 
> order of tokens and that is more a collective decision.
>
> It was suggested to me that I write up and RFC on the matter, I'm not a 
> massive fan of those as I don't really see a formal process for handling them 
> (the conversation seems to turn into a series of personal preferences then 
> dwindles and dies without conclusion). But if people think it would solve 
> something I guess I could try.
>
> From my perspective I'm more interested in the views of the major 
> clang-format contributors (@curdeius, @krasimir, @owenpan , @sammccall, 
> @HazardyKnusperkeks  and people like yourselves who have put time and effort 
> into blogging about these tools), ultimately it will be us who have to handle 
> the impact of this (and maybe the #clangd people) and I don't want to 
> inconvenience them or make work for them.
>
> So here is what is needed to land this in my view:
>
> 1. I'd want at least 2 to 3 of the current reviewers to LGTM.
> 2. I'd want 1 person (like yourself) a community contributor to also give 
> their approval (I guess I have that conceptually from you!)
>
> or
>
> 1. @klimek or @djasper to give us an its ok for "clang-format to 
> insert/modify the order of tokens" in a controlled manner decision
> 2. 1 LGTM
>
> Whilst I respect the views of those few who have objected, they don't to my 
> knowledge make significant contributions to clang-format (not in the last 5-6 
> years I've been following it anyway)
>
> I feel this team owns the decision as we are the ones that look after it, 
> warts and all. We'll always have a situation where some don't agree, thats 
> ok, but I feel there is more agreement that this funcationality is ok rather 
> than not.
>
> Until one of those 2 things happens, we are in limbo.

I think we had a really good, inclusive discussion on this change, so I don't 
think an RFC would add anything to it.

I think we have consensus on, if we want this, we:

1. want it behind an option that is not enabled in default-configs
2. we want users to understand that this has the potential to introduce bugs

I also think this is a feature that seems useful enough for a large enough 
group, to provide it in clang-format.
I'd put a lot of weight on @MyDeveloperDay's evaluation, as in the end, he's 
currently taking the most active role in keeping clang-format alive & well.

One idea for clang-format going forward is: can we make it easier for people to 
understand what flags can introduce non-whitespace-changes? E.g. have flag name 
prefixes like semantic-*.

So, I'm generally OK with this.

One minor nit I have is that I'd personally add unit tests to the 
EastWestConstFixer, as it's quite a large class, but given that I'll duck out 
after this comment, I'll not insist :)


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69764/new/

https://reviews.llvm.org/D69764

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96114: [ASTMatchers] Fix parent-child traversal between functions and parms

2021-02-08 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Can you explain the change and the before/after a bit more? Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96114/new/

https://reviews.llvm.org/D96114

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D91996: [clang-format] Remove double trim

2020-11-26 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Isn't the comment incorrect after this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91996/new/

https://reviews.llvm.org/D91996

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2020-10-28 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/FormatToken.h:449
 
+  /// When macro expansion introduces parents, those are marked as
+  /// \c MacroParent, so formatting knows their children need to be formatted.

sammccall wrote:
> I can't really understand from the comment when this is supposed to be set, 
> and there are no tests of it.
> 
> (The comment is vague: is a "parent" the inverse of FormatToken::Children 
> here? Is this scenario when the parents in question are new, or their 
> children are new, or both? What part of the code is "formatting", and why 
> would it otherwise skip the children?)
Rewrote comment.



Comment at: clang/lib/Format/Macros.h:134
 
+/// Matches formatted lines that were created by macro expansion to a format
+/// for the original macro call.

sammccall wrote:
> "matches formatted lines" probably describes the hard technical problem it 
> has to solve, but not so much what it does for the caller:  what the 
> transformation is between its inputs and its outputs.
> 
> Is it something like:
> 
> ```
> Rewrites expanded code (containing tokens expanded from macros) into spelled 
> code (containing tokens for the macro call itself). Token types are 
> preserved, so macro arguments in the output have semantically-correct types 
> from their expansion. This is the point of expansion/unexpansion: to allow 
> this information to be used in formatting.
> ```
> 
> [Is it just tokentypes? I guess it's also Role and MustBreakBefore and some 
> other stuff like that?]
It's not the token info, this we'd trivially have by using the original token 
sequence which is still lying around (we re-use the same tokens).

Reworded.



Comment at: clang/lib/Format/Macros.h:142
+/// When getting the formatted lines of the expansion via the \c addLine 
method:
+/// -> class A {
+/// -> public:

sammccall wrote:
> sammccall wrote:
> > I'm a bit confused by these arrows. It doesn't seem that they each point to 
> > an unwrappedline passed to addLine?
> This example didn't really help me understand the interface of this class, it 
> seems to be a special case:
>  - the input is a single block construct (rather than e.g. a whole program), 
> but it's not clear why that's the case.
>  - the output (unexpanded form) consists of exactly a macro call with no 
> leading/trailing tokens, which isn't true in general
> 
> If the idea is to provide as input the minimal range of lines that span the 
> macro, we should say so somewhere. But I would like to understand why we're 
> not simply processing the whole file.
Why not? That is the intention? Note that high-level we do not pass class 
definitions in one unwrapped line; if they ever go into a single line it's done 
in the end by a special merging pass (yes, that's a bit confusing).



Comment at: clang/lib/Format/Macros.h:142
+/// When getting the formatted lines of the expansion via the \c addLine 
method:
+/// -> class A {
+/// -> public:

klimek wrote:
> sammccall wrote:
> > sammccall wrote:
> > > I'm a bit confused by these arrows. It doesn't seem that they each point 
> > > to an unwrappedline passed to addLine?
> > This example didn't really help me understand the interface of this class, 
> > it seems to be a special case:
> >  - the input is a single block construct (rather than e.g. a whole 
> > program), but it's not clear why that's the case.
> >  - the output (unexpanded form) consists of exactly a macro call with no 
> > leading/trailing tokens, which isn't true in general
> > 
> > If the idea is to provide as input the minimal range of lines that span the 
> > macro, we should say so somewhere. But I would like to understand why we're 
> > not simply processing the whole file.
> Why not? That is the intention? Note that high-level we do not pass class 
> definitions in one unwrapped line; if they ever go into a single line it's 
> done in the end by a special merging pass (yes, that's a bit confusing).
Re: input is a single construct - that is not the case (see other comment)
Re: leading / trailing tokens: I didn't want to make it too complex in the 
example.



Comment at: clang/lib/Format/Macros.h:147
+///
+/// Creates the unwrapped lines containing the macro call tokens so that
+/// the macro call tokens fit the semantic structure of the expanded formatted

sammccall wrote:
> this says "creates the unwrapped lines" but getResult() returns only one.
> Does the plural here refer to the tree? Maybe just use singular or "the 
> unwrappedlinetree"?
Fixed comment.



Comment at: clang/lib/Format/Macros.h:154
+/// -> })
+class MacroUnexpander {
+public:

sammccall wrote:
> I get the symmetry between the expander/unexpander classes, but the name is 
> making it harder for me to follow the code.
>  - the extra compound+negation in the name 

[PATCH] D88299: [clang-format] Add MacroUnexpander.

2020-10-28 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 301253.
klimek marked 3 inline comments as done.
klimek added a comment.

Adapted based on review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88299/new/

https://reviews.llvm.org/D88299

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroUnexpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroUnexpanderTest.cpp

Index: clang/unittests/Format/MacroUnexpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroUnexpanderTest.cpp
@@ -0,0 +1,650 @@
+#include "../../lib/Format/Macros.h"
+#include "../../lib/Format/UnwrappedLineParser.h"
+#include "TestLexer.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace format {
+namespace {
+
+using UnexpandedMap = std::map>;
+
+// Keeps track of a sequence of macro expansions.
+//
+// The expanded tokens are accessible via getTokens(), while a map of macro call
+// identifier token to unexpanded token stream is accessible via
+// getUnexpanded().
+class Expansion {
+public:
+  Expansion(TestLexer , MacroExpander ) : Lex(Lex), Macros(Macros) {}
+
+  // Appends the token stream obtained from expanding the macro Name given
+  // the provided arguments, to be later retrieved with getTokens().
+  // Returns the list of tokens making up the unexpanded macro call.
+  TokenList
+  expand(llvm::StringRef Name,
+ const SmallVector, 1> ) {
+auto *ID = Lex.id(Name);
+auto UnexpandedLine = std::make_unique();
+UnexpandedLine->Tokens.push_back(ID);
+if (!Args.empty()) {
+  UnexpandedLine->Tokens.push_back(Lex.id("("));
+  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
+if (I != Args.begin())
+  UnexpandedLine->Tokens.push_back(Lex.id(","));
+UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
+  I->end());
+  }
+  UnexpandedLine->Tokens.push_back(Lex.id(")"));
+}
+Unexpanded[ID] = std::move(UnexpandedLine);
+
+auto Expanded = uneof(Macros.expand(ID, Args));
+Tokens.append(Expanded.begin(), Expanded.end());
+
+TokenList UnexpandedTokens;
+for (const UnwrappedLineNode  : Unexpanded[ID]->Tokens) {
+  UnexpandedTokens.push_back(Node.Tok);
+}
+return UnexpandedTokens;
+  }
+
+  TokenList expand(llvm::StringRef Name,
+   const std::vector  = {}) {
+return expand(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+  std::map> Unexpanded;
+  llvm::SmallVector Tokens;
+  TestLexer 
+  MacroExpander 
+};
+
+struct Chunk {
+  Chunk(llvm::ArrayRef Tokens)
+  : Tokens(Tokens.begin(), Tokens.end()) {}
+  Chunk(llvm::ArrayRef Children)
+  : Children(Children.begin(), Children.end()) {}
+  llvm::SmallVector Tokens;
+  llvm::SmallVector Children;
+};
+
+bool tokenMatches(const FormatToken *Left, const FormatToken *Right) {
+  if (Left->getType() == Right->getType() &&
+  Left->TokenText == Right->TokenText)
+return true;
+  llvm::dbgs() << Left->TokenText << " != " << Right->TokenText << "\n";
+  return false;
+}
+
+// Allows to produce chunks of a token list by typing the code of equal tokens.
+//
+// Created from a list of tokens, users call "consume" to get the next chunk
+// of tokens, checking that they match the written code.
+struct Matcher {
+  Matcher(const TokenList , TestLexer )
+  : Tokens(Tokens), It(this->Tokens.begin()), Lex(Lex) {}
+
+  Chunk consume(StringRef Tokens) {
+TokenList Result;
+for (const FormatToken *Token : uneof(Lex.lex(Tokens))) {
+  assert(tokenMatches(*It, Token));
+  Result.push_back(*It);
+  ++It;
+}
+return Chunk(Result);
+  }
+
+  TokenList Tokens;
+  TokenList::iterator It;
+  TestLexer 
+};
+
+UnexpandedMap mergeUnexpanded(const UnexpandedMap ,
+  const UnexpandedMap ) {
+  UnexpandedMap Result;
+  for (const auto  : M1) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  for (const auto  : M2) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  return Result;
+}
+
+class MacroUnexpanderTest : public ::testing::Test {
+public:
+  MacroUnexpanderTest() : Lex(Allocator, Buffers) {}
+
+  std::unique_ptr
+  createExpander(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+ 

[PATCH] D89920: Export TemplateArgumentMatcher so clients defining custom matchers don't need to use the internal namespace

2020-10-22 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89920/new/

https://reviews.llvm.org/D89920

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88952: [clang-format][tests] Fix MacroExpander lexer not parsing C++ keywords

2020-10-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

Nice catch, thanks! LG!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88952/new/

https://reviews.llvm.org/D88952

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88956: [clang-format] Fix misformatted macro definitions after D86959

2020-10-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/unittests/Format/FormatTest.cpp:8333
 
+TEST_F(FormatTest, UnderstandsUsesOfStarAndAmpInMacroDefinition) {
+  // This is a regression test for mis-parsing the & after decltype as a binary

arichardson wrote:
> arichardson wrote:
> > klimek wrote:
> > > I'd put this into a new test file that specifically tests annotations, 
> > > perhaps AnnotationTest.cpp?
> > > 
> > > Also, the test name seems misleading, as not all of these are macro 
> > > definitions?
> > The regression is the second definition which is a macro definition, I 
> > guess I could drop the first test.
> Moving it to a separate file probably makes sense, but then I'd have to drop 
> the verifyFormat() calls.
If we start a new file, we can have 2 tests.

I'd leave the verifyFormat tests here, potentially putting them into already 
existing test functions that test similar things.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88956/new/

https://reviews.llvm.org/D88956

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88956: [clang-format] Fix misformatted macro definitions after D86959

2020-10-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/unittests/Format/FormatTest.cpp:8333
 
+TEST_F(FormatTest, UnderstandsUsesOfStarAndAmpInMacroDefinition) {
+  // This is a regression test for mis-parsing the & after decltype as a binary

I'd put this into a new test file that specifically tests annotations, perhaps 
AnnotationTest.cpp?

Also, the test name seems misleading, as not all of these are macro definitions?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88956/new/

https://reviews.llvm.org/D88956

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D83296#2299062 , @nridge wrote:

> What does this change mean for users of clang-format -- better formatting of 
> complicated (e.g. multi-line) macro invocations?

In addition to what Sam said, this also attempts to be an improvement in 
maintainability. Given this is a fairly complex change, you might ask how this 
helps :)
The idea is that we bundle the complexity of macro handling in a clearly 
separated part of the code that can be tested and developed ~on its own.
Currently, we have multiple macro regex settings that then lead to random code 
throughout clang-format that tries to handle those identifiers special.
Once this is done, we can delete all those settings, as the more generalized 
macro configuration will supersede them.




Comment at: clang/lib/Format/MacroExpander.cpp:190
+  return true;
+for (const auto  : Args[I->getValue()]) {
+  // A token can be part of a macro argument at multiple levels.

sammccall wrote:
> nit: this is confusingly a const reference to a non-const pointer... `auto *` 
> or `FormatToken *`?
Yikes, thanks for catching!



Comment at: clang/unittests/Format/MacroExpanderTest.cpp:183
+  EXPECT_ATTRIBUTES(Result, Attributes);
+}
+

sammccall wrote:
> may want a test that uses of an arg after the first are not expanded, because 
> that "guards" a bunch of nasty potential bugs
Discussed offline: the above test tests exactly this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84306: [clang-format][NFC] Be more careful about the layout of FormatToken.

2020-09-28 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D84306#2294952 , @MyDeveloperDay 
wrote:

> Which bit do you find more complex? adding something to the FormatToken or 
> the use of the `is()` calls?

I think mainly that 
a) the language doesn't support bitfields well yet, as evidenced by having to 
pull the default values into the constructor, where it's harder to maintain imo 
(but that will change)
b) we'll need to continuously take care that the bitfields stay bundled

The setters / getters are generally fine, I think; I believe the decision to 
treat this as a struct from the very beginning of the project was bad, but 
refactoring this into something more cohesive seems like a daunting task.

To be clear, I don't think it's a lot of complexity, I was mainly wondering 
whether we have some measurement what it gets us.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84306/new/

https://reviews.llvm.org/D84306

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88299: [clang-format] Add MacroUnexpander.

2020-09-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added subscribers: cfe-commits, mgorny.
Herald added a project: clang.
klimek requested review of this revision.

MacroUnexpander applies the structural formatting of expanded lines into
UnwrappedLines to the corresponding unexpanded macro calls, resulting in
UnwrappedLines for the macro calls the user typed.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D88299

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroUnexpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroUnexpanderTest.cpp

Index: clang/unittests/Format/MacroUnexpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroUnexpanderTest.cpp
@@ -0,0 +1,636 @@
+#include "../../lib/Format/Macros.h"
+#include "../../lib/Format/UnwrappedLineParser.h"
+#include "TestLexer.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/StringRef.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+#include 
+#include 
+#include 
+
+namespace clang {
+namespace format {
+namespace {
+
+using UnexpandedMap = std::map>;
+
+class Expansion {
+public:
+  Expansion(TestLexer , MacroExpander ) : Lex(Lex), Macros(Macros) {}
+
+  TokenList
+  expand(llvm::StringRef Name,
+ const SmallVector, 1> ) {
+auto *ID = Lex.id(Name);
+auto UnexpandedLine = std::make_unique();
+UnexpandedLine->Tokens.push_back(ID);
+if (!Args.empty()) {
+  UnexpandedLine->Tokens.push_back(Lex.id("("));
+  for (auto I = Args.begin(), E = Args.end(); I != E; ++I) {
+if (I != Args.begin())
+  UnexpandedLine->Tokens.push_back(Lex.id(","));
+UnexpandedLine->Tokens.insert(UnexpandedLine->Tokens.end(), I->begin(),
+  I->end());
+  }
+  UnexpandedLine->Tokens.push_back(Lex.id(")"));
+}
+Unexpanded[ID] = std::move(UnexpandedLine);
+
+auto Expanded = uneof(Macros.expand(ID, Args));
+Tokens.append(Expanded.begin(), Expanded.end());
+
+TokenList UnexpandedTokens;
+for (const UnwrappedLineNode  : Unexpanded[ID]->Tokens) {
+  UnexpandedTokens.push_back(Node.Tok);
+}
+return UnexpandedTokens;
+  }
+
+  TokenList expand(llvm::StringRef Name,
+   const std::vector  = {}) {
+return expand(Name, lexArgs(Args));
+  }
+
+  const UnexpandedMap () const { return Unexpanded; }
+
+  const TokenList () const { return Tokens; }
+
+private:
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+  std::map> Unexpanded;
+  llvm::SmallVector Tokens;
+  TestLexer 
+  MacroExpander 
+};
+
+struct Chunk {
+  Chunk(llvm::ArrayRef Tokens)
+  : Tokens(Tokens.begin(), Tokens.end()) {}
+  Chunk(llvm::ArrayRef Children)
+  : Children(Children.begin(), Children.end()) {}
+  llvm::SmallVector Tokens;
+  llvm::SmallVector Children;
+};
+
+bool tokenMatches(const FormatToken *Left, const FormatToken *Right) {
+  if (Left->getType() == Right->getType() &&
+  Left->TokenText == Right->TokenText)
+return true;
+  llvm::dbgs() << Left->TokenText << " != " << Right->TokenText << "\n";
+  return false;
+}
+
+struct Matcher {
+  Matcher(const TokenList ) : Tokens(Tokens), It(this->Tokens.begin()) {}
+
+  Chunk consume(const TokenList ) {
+TokenList Result;
+for (const FormatToken *Token : Tokens) {
+  assert(tokenMatches(*It, Token));
+  Result.push_back(*It);
+  ++It;
+}
+return Chunk(Result);
+  }
+
+  TokenList Tokens;
+  TokenList::iterator It;
+};
+
+UnexpandedMap mergeUnexpanded(const UnexpandedMap ,
+  const UnexpandedMap ) {
+  UnexpandedMap Result;
+  for (const auto  : M1) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  for (const auto  : M2) {
+Result[KV.first] = std::make_unique(*KV.second);
+  }
+  return Result;
+}
+
+class MacroUnexpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  UnwrappedLine line(llvm::ArrayRef Tokens) {
+UnwrappedLine Result;
+for (FormatToken *Tok : Tokens) {
+  Result.Tokens.push_back(UnwrappedLineNode(Tok));
+}
+return Result;
+  }
+
+  UnwrappedLine line(llvm::StringRef Text) { return line({lex(Text)}); }
+
+  UnwrappedLine line(llvm::ArrayRef Chunks) {
+UnwrappedLine Result;
+for (const Chunk  : Chunks) {
+  Result.Tokens.insert(Result.Tokens.end(), Chunk.Tokens.begin(),
+   Chunk.Tokens.end());
+  

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-25 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
klimek marked 5 inline comments as done.
Closed by commit rGe336b74c995d: [clang-format] Add a MacroExpander. (authored 
by klimek).

Changed prior to commit:
  https://reviews.llvm.org/D83296?vs=292970=294285#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TESTLEXER_H
+#define CLANG_UNITTESTS_FORMAT_TESTLEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,187 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };

[PATCH] D84306: [clang-format][NFC] Be more careful about the layout of FormatToken.

2020-09-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

FWIW, finding this due to seeing the added complexity. Do you have data on what 
the difference is on clang-format for either overall memory use or performance? 
Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84306/new/

https://reviews.llvm.org/D84306

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 292970.
klimek marked 7 inline comments as done.
klimek added a comment.

Worked in review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,187 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };
+
+  void expectAttributes(const TokenList ,
+const std::vector ,
+const std::string , unsigned Line) {
+EXPECT_EQ(Tokens.size(), Attributes.size()) 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-09-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/FormatToken.h:177
+/// EndOfExpansion:   0  0 0   21 0 1
+struct MacroContext {
+  MacroContext(MacroRole Role) : Role(Role) {}

sammccall wrote:
> "context" is often pretty vague - "MacroSource" isn't a brilliant name but at 
> least seems to hint at the direction (that the FormatToken is the expanded 
> token and the MacroSource gives information about what it was expanded from)
> 
> I don't feel strongly about this though, up to you.
MacroSource sounds like it is about the macro source (i.e. the tokens written 
for the macro).
I'd be happy to rename to MacroExpansion or MacroExpansionInfo or somesuch if 
you think that helps?



Comment at: clang/lib/Format/MacroExpander.cpp:81
+  nextToken();
+}
+if (Current->isNot(tok::r_paren))

sammccall wrote:
> this accepts `FOO(A,B,)=...` as equivalent to `FOO(A,B)=...`. Not sure if 
> worth fixing.
We're generally accepting too much; I'd either want to restrict it fully, or 
basically be somewhat minimum/forgiving. Given that we can't get errors back to 
the user, I was aiming for the latter.



Comment at: clang/lib/Format/MacroExpander.cpp:88
+
+  bool parseExpansion() {
+if (!Current->isOneOf(tok::equal, tok::eof))

sammccall wrote:
> (nit: I'd probably find this easier to follow as `if (equal) else if (eof) 
> else` with parseTail inlined, but up to you)
I basically like having the implementation match the BNR.
That said, not feeling strongly about it. You're saying you'd duplicate the 
Def.Body.push_back in the if (eof)?

  if (Current->is(tok::equal) {
nextToken();
// inline parseTail
  } else if (Current->is(tok::eof) {
Def.Body.push_back(Current);
  } else {
return false;
  }

Generally, I personally find it easier to read the early exit.



Comment at: clang/lib/Format/MacroExpander.cpp:142
+  auto Definition = Parser.parse();
+  Definitions[Definition.Name] = Definition;
+}

sammccall wrote:
> nit: this is a copy for what seems like no reason - move `Parser.parse()` 
> inline to this line?
Reason is that we need the name.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

klimek wrote:
> sammccall wrote:
> > klimek wrote:
> > > sammccall wrote:
> > > > you're pushing here without copying. This means the original tokens 
> > > > from the ArgsList are mutated. Maybe we own them, but this seems at 
> > > > least wrong for multiple expansion of the same arg.
> > > > 
> > > > e.g.
> > > > ```
> > > > #define M(X,Y) X Y X
> > > > M(1,2)
> > > > ```
> > > > 
> > > > Will expand to:
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > > > - 2, ExpandedArg, ExpandedFrom = [M]
> > > > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token 
> > > > pointer as the first one
> > > > 
> > > > Maybe it would be better if pushToken performed the copy, and returned 
> > > > a mutable pointer to the copy. (If you can make the input const, that 
> > > > would prevent this type of bug)
> > > Ugh. I'll need to take a deeper look, but generally, the problem is we 
> > > don't want to copy - we're mutating the data of the token while 
> > > formatting the expanded token stream, and then re-use that info when 
> > > formatting the original stream.
> > > We could copy, add a reference to the original token, and then have a 
> > > step that adapts in the end, and perhaps that's cleaner overall anyway, 
> > > but will be quite a change.
> > > The alternative is that I'll look into how to specifically handle 
> > > double-expansion (or ... forbid it).
> > > (or ... forbid it).
> > 
> > I'm starting to think this is the best option.
> > 
> > The downsides seem pretty acceptable to me:
> >  - it's another wart to document: on the other hand it simplifies the 
> > conceptual model, I think it helps users understand the deeper behavior
> >  - some macros require simplification rather than supplying the actual 
> > definition: already crossed this bridge by not supporting macros in macro 
> > bodies, variadics, pasting...
> >  - loses information: one expansion is enough to establish which part of 
> > the grammar the arguments form in realistic cases. (Even in pathological 
> > cases, preserving the conflicting info only helps you if you have a plan to 
> > resolve the conflicts)
> >  - it's another wart to document: 
> > 
> > Are there any others?
> > 
> My main concern is that it's probably the most surprising feature to not 
> support.
Forbade multi-expansion.



Comment at: clang/lib/Format/Macros.h:82
+///
+/// Furthermore, expansion and definition are independent - a macro defined
+/// as "A()=a" and "A=a" can both be expanded  as "A()" and "A".


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-20 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

sammccall wrote:
> klimek wrote:
> > sammccall wrote:
> > > you're pushing here without copying. This means the original tokens from 
> > > the ArgsList are mutated. Maybe we own them, but this seems at least 
> > > wrong for multiple expansion of the same arg.
> > > 
> > > e.g.
> > > ```
> > > #define M(X,Y) X Y X
> > > M(1,2)
> > > ```
> > > 
> > > Will expand to:
> > > - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> > > - 2, ExpandedArg, ExpandedFrom = [M]
> > > - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer 
> > > as the first one
> > > 
> > > Maybe it would be better if pushToken performed the copy, and returned a 
> > > mutable pointer to the copy. (If you can make the input const, that would 
> > > prevent this type of bug)
> > Ugh. I'll need to take a deeper look, but generally, the problem is we 
> > don't want to copy - we're mutating the data of the token while formatting 
> > the expanded token stream, and then re-use that info when formatting the 
> > original stream.
> > We could copy, add a reference to the original token, and then have a step 
> > that adapts in the end, and perhaps that's cleaner overall anyway, but will 
> > be quite a change.
> > The alternative is that I'll look into how to specifically handle 
> > double-expansion (or ... forbid it).
> > (or ... forbid it).
> 
> I'm starting to think this is the best option.
> 
> The downsides seem pretty acceptable to me:
>  - it's another wart to document: on the other hand it simplifies the 
> conceptual model, I think it helps users understand the deeper behavior
>  - some macros require simplification rather than supplying the actual 
> definition: already crossed this bridge by not supporting macros in macro 
> bodies, variadics, pasting...
>  - loses information: one expansion is enough to establish which part of the 
> grammar the arguments form in realistic cases. (Even in pathological cases, 
> preserving the conflicting info only helps you if you have a plan to resolve 
> the conflicts)
>  - it's another wart to document: 
> 
> Are there any others?
> 
My main concern is that it's probably the most surprising feature to not 
support.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-08-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:186
+Tok->MacroCtx = MacroContext(MR_ExpandedArg);
+  pushToken(Tok);
+}

sammccall wrote:
> you're pushing here without copying. This means the original tokens from the 
> ArgsList are mutated. Maybe we own them, but this seems at least wrong for 
> multiple expansion of the same arg.
> 
> e.g.
> ```
> #define M(X,Y) X Y X
> M(1,2)
> ```
> 
> Will expand to:
> - 1, ExpandedArg, ExpandedFrom = [M, M] // should just be one M
> - 2, ExpandedArg, ExpandedFrom = [M]
> - 1, ExpandedArg, ExpandedFrom = [M, M] // this is the same token pointer as 
> the first one
> 
> Maybe it would be better if pushToken performed the copy, and returned a 
> mutable pointer to the copy. (If you can make the input const, that would 
> prevent this type of bug)
Ugh. I'll need to take a deeper look, but generally, the problem is we don't 
want to copy - we're mutating the data of the token while formatting the 
expanded token stream, and then re-use that info when formatting the original 
stream.
We could copy, add a reference to the original token, and then have a step that 
adapts in the end, and perhaps that's cleaner overall anyway, but will be quite 
a change.
The alternative is that I'll look into how to specifically handle 
double-expansion (or ... forbid it).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/MacroExpander.cpp:35
+  SmallVector Params;
+  SmallVector Tokens;
+};

sammccall wrote:
> Tokens -> Expansion? (semantics rather than type)
Changed to "Body".



Comment at: clang/lib/Format/MacroExpander.cpp:81
+do {
+  Def.Tokens.push_back(Current);
+  nextToken();

sammccall wrote:
> this assumes the expansion is nonempty, which the grammar doesn't. while{} 
> instead?
I have no clue how this ever worked tbh O.O
Has been reworked as part of the move to use = to separate the macro signature 
from the body.



Comment at: clang/lib/Format/MacroExpander.cpp:113
+void MacroExpander::parseDefinitions(
+const std::vector ) {
+  for (const std::string  : MacroExpander) {

sammccall wrote:
> weird param name!
Copy-paste gone wrong I assume.



Comment at: clang/lib/Format/MacroExpander.cpp:116
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Macro, ""));
+clang::FileID FID =

sammccall wrote:
> This is a slightly spooky buffer name - it's the magic name the PP uses for 
> pasted tokens.
> A closer fit for config is maybe "" (like macro definitions 
> passed with `-D`).
> Is it necessary to use one of clang's magic buffer names at all? If so, 
> comment! Else maybe "" or something?
We need source locations, and apparently only:
,  and 
are allowed to have source locations.




Comment at: clang/lib/Format/MacroExpander.cpp:133
+  ArgsList Args) {
+  assert(defined(ID->TokenText));
+  SmallVector Result;

sammccall wrote:
> is the caller responsible for checking the #args matches #params?
> If so, document and assert here?
> 
> Looking at the implementation, it seems you don't expand if there are too few 
> args, and expand if there are too many args (ignoring the last ones). Maybe 
> it doesn't matter, but it'd be nice to be more consistent here.
> 
> (Probably worth calling out somewhere explicitly that variadic macros are not 
> supported)
Added docs in the class comment for MacroExpander.
(so far I always expand, too few -> empty, too many -> ignore)



Comment at: clang/lib/Format/MacroExpander.cpp:194
+assert(New->MacroCtx.Role == MR_None);
+// Tokens that are not part of the user code do not need to be formatted.
+New->MacroCtx.Role = MR_Hidden;

sammccall wrote:
> (I don't know exactly how this is used, but consider whether you mean "do not 
> need to", "should not" or "cannot" here)
Replaced with "are not".



Comment at: clang/lib/Format/MacroExpander.h:10
+///
+/// \file
+/// This file contains the declaration of MacroExpander, which handles macro

sammccall wrote:
> I think this comment is too short (and doesn't really say much that you can't 
> get from the filename). IME many people's mental model of macros is based on 
> how they're used rather than how they formally work, so I think it's worth 
> spelling out even the obvious implementation choices.
> 
> I'd like to see:
>  - rough description of the scope/goal of the feature (clang-format doesn't 
> see macro definitions, so macro uses tend to be pseudo-parsed as function 
> calls. When this isn't appropriate [example], a macro definition can be 
> provided as part of the style. When such a macro is used in the code, 
> clang-format will expand it as the preprocessor would before determining the 
> role of tokens. [example])
>  - explicitly call out here that only a single level of expansion is 
> supported, which is a divergence from the real preprocessor. (This influences 
> both the implementation and the mental model)
>  - what the MacroExpander does and how it relates to MacroContext. I think 
> this should give the input and output token streams names, as it's often 
> important to clearly distinguish the two. (TokenBuffer uses "Spelled tokens" 
> and "Expanded tokens" for this, which is not the same as how these terms are 
> used in SourceManager, but related and hasn't been confusing in practice).
>  - a rough sketch of how the mismatch between what was annotated and what 
> must be formatted is resolved e.g. -- just guessing here -- the spelled 
> stream is formatted but for macro args, the annotations from the expanded 
> stream are used.
> 
> (I'm assuming this is the best file to talk about the implementation of this 
> feature in general - i'm really hoping that there is such a file. If there 
> are a bunch of related utilities you might even consider renaming the header 
> as an umbrella to make them easier to document. This is a question of 
> taste...)
Moved to Macros.h and added file comment that tries to address all of these.



Comment at: clang/lib/Format/MacroExpander.h:66
+  /// each element in \p Args is a positional 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 281611.
klimek marked 27 inline comments as done.
klimek added a comment.

Addressed code review comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/Macros.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,170 @@
+#include "../../lib/Format/Macros.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(MacroDefinitions,
+   Lex.SourceMgr.get(), Lex.Style,
+   Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return Result;
+  }
+
+  struct MacroAttributes {
+clang::tok::TokenKind Kind;
+MacroRole Role;
+unsigned Start;
+unsigned End;
+llvm::SmallVector ExpandedFrom;
+  };
+
+  void expectAttributes(const TokenList ,
+const std::vector ,
+const std::string , unsigned Line) {
+EXPECT_EQ(Tokens.size(), 

[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D83296#2146970 , @sammccall wrote:

> In D83296#2146870 , @klimek wrote:
>
> > Monday-morning ping.
>
>
> Thanks for the reminder here... however this is taking me a bit to get my 
> head around, and we've got a release branch cut scheduled for a couple of 
> days that we're trying to polish for.
>  AFAICT there's significant followup work still needed to make use of this - 
> are you wanting this to land in the 11 release? Else i'd probably come back 
> to this after the cut...


Sorry, this is not urgent - the 11 cut has clear priority. My default is to 
ping stuff every week unless somebody tells me they prefer me not doing that :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Monday-morning ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83296/new/

https://reviews.llvm.org/D83296



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83621: [clang][Tooling] Try to avoid file system access if there is no record for the file in compile_commads.json

2020-07-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a subscriber: djasper.
klimek added a comment.

@djasper wrote this iirc, but I doubt he'll have much time to look into this :)

IIRC the symlink checking was there because some part of the system on linux 
tends to give us realpaths (with CMake builds), and that leads to not finding 
anything if there are symlinks involved.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83621/new/

https://reviews.llvm.org/D83621



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D79773: [clang-format] Improve clang-formats handling of concepts

2020-07-09 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:631-636
+  else if (FormatTok->is(tok::arrow)) {
+// Following the } we can find a trailing return type arrow
+// as part of an implicit conversion constraint.
+nextToken();
+parseStructuralElement();
+  }

I'd have expected this to be just in front of line 629 (if... tok::semi) - does 
that break something? Don't we want to munch the semi after this?



Comment at: clang/lib/Format/UnwrappedLineParser.cpp:2314
+  nextToken();
+  if (FormatTok->Tok.is(tok::less)) {
+while (!FormatTok->Tok.is(tok::greater)) {

MyDeveloperDay wrote:
> miscco wrote:
> > miscco wrote:
> > > I guess you could use `parseBracedList(/*ContinueOnSemicolons=*/false, 
> > > /*IsEnum=*/false,/*ClosingBraceKind=*/tok::greater);` here?
> > To be more specific, I am concerned what happens if you have multiple 
> > template arguments where a linebreak should occur. Is this still happening 
> > here?
> > 
> > 
> > ```
> > template 
> > concept someVeryLongConceptName = 
> > someVeryLongConstraintName1;
> > ```
> This is formatted as
> 
> ```
> template 
> concept someVeryLongConceptName =
> someVeryLongConstraintName1;
> ```
This seems like a very detailed way to parse them; generally, we try to only 
parse rough structure here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D79773/new/

https://reviews.llvm.org/D79773



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83296: [clang-format] Add a MacroExpander.

2020-07-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added subscribers: cfe-commits, mgorny.
Herald added a project: clang.

The MacroExpander allows to expand simple (non-resursive) macro
definitions from a macro identifier token and macro arguments. It
annotates the tokens with a newly introduced MacroContext that keeps
track of the role a token played in expanding the macro in order to
be able to reconstruct the macro expansion from an expanded (formatted)
token stream.

Made Token explicitly copy-able to enable copying tokens from the parsed
macro definition.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D83296

Files:
  clang/lib/Format/CMakeLists.txt
  clang/lib/Format/FormatToken.h
  clang/lib/Format/MacroExpander.cpp
  clang/lib/Format/MacroExpander.h
  clang/unittests/Format/CMakeLists.txt
  clang/unittests/Format/MacroExpanderTest.cpp
  clang/unittests/Format/TestLexer.h

Index: clang/unittests/Format/TestLexer.h
===
--- /dev/null
+++ clang/unittests/Format/TestLexer.h
@@ -0,0 +1,88 @@
+//===--- TestLexer.h - Format C++ code --*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===--===//
+///
+/// \file
+/// This file contains a TestLexer to create FormatTokens from strings.
+///
+//===--===//
+
+#ifndef LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+#define LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
+
+#include "../../lib/Format/FormatTokenLexer.h"
+
+#include "clang/Basic/FileManager.h"
+#include "clang/Basic/SourceManager.h"
+
+#include 
+#include 
+
+namespace clang {
+namespace format {
+
+typedef llvm::SmallVector TokenList;
+
+inline std::ostream <<(std::ostream , const FormatToken ) {
+  Stream << "(" << Tok.Tok.getName() << ", \"" << Tok.TokenText.str() << "\")";
+  return Stream;
+}
+inline std::ostream <<(std::ostream , const TokenList ) {
+  Stream << "{";
+  for (size_t I = 0, E = Tokens.size(); I != E; ++I) {
+Stream << (I > 0 ? ", " : "") << *Tokens[I];
+  }
+  Stream << "}";
+  return Stream;
+}
+
+inline TokenList uneof(const TokenList ) {
+  assert(!Tokens.empty() && Tokens.back()->is(tok::eof));
+  return TokenList(Tokens.begin(), std::prev(Tokens.end()));
+}
+
+inline std::string text(llvm::ArrayRef Tokens) {
+  return std::accumulate(Tokens.begin(), Tokens.end(), std::string(),
+ [](const std::string , FormatToken *Tok) {
+   return (R + Tok->TokenText).str();
+ });
+}
+
+class TestLexer {
+public:
+  TestLexer() : SourceMgr("test.cpp", "") {}
+
+  TokenList lex(llvm::StringRef Code) {
+Buffers.push_back(
+llvm::MemoryBuffer::getMemBufferCopy(Code, ""));
+clang::FileID FID = SourceMgr.get().createFileID(SourceManager::Unowned,
+ Buffers.back().get());
+FormatTokenLexer Lex(SourceMgr.get(), FID, 0, Style, Encoding, Allocator,
+ IdentTable);
+auto Result = Lex.lex();
+return TokenList(Result.begin(), Result.end());
+  }
+
+  FormatToken *id(llvm::StringRef Code) {
+auto Result = uneof(lex(Code));
+assert(Result.size() == 1U && "Code must expand to 1 token.");
+return Result[0];
+  }
+
+  FormatStyle Style = getLLVMStyle();
+  encoding::Encoding Encoding = encoding::Encoding_UTF8;
+  std::vector> Buffers;
+  clang::SourceManagerForFile SourceMgr;
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable;
+};
+
+} // namespace format
+} // namespace clang
+
+#endif // LLVM_CLANG_UNITTESTS_FORMAT_TEST_LEXER_H
Index: clang/unittests/Format/MacroExpanderTest.cpp
===
--- /dev/null
+++ clang/unittests/Format/MacroExpanderTest.cpp
@@ -0,0 +1,167 @@
+#include "../../lib/Format/MacroExpander.h"
+#include "TestLexer.h"
+#include "clang/Basic/FileManager.h"
+
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+namespace clang {
+namespace format {
+
+namespace {
+
+class MacroExpanderTest : public ::testing::Test {
+public:
+  std::unique_ptr
+  create(const std::vector ) {
+return std::make_unique(
+MacroDefinitions, Lex.SourceMgr.get(), Lex.Style, Lex.Encoding,
+Lex.Allocator, Lex.IdentTable);
+  }
+
+  std::string expand(MacroExpander , llvm::StringRef Name,
+ const std::vector  = {}) {
+EXPECT_TRUE(Macros.defined(Name));
+return text(Macros.expand(Lex.id(Name), lexArgs(Args)));
+  }
+
+  llvm::SmallVector
+  lexArgs(const std::vector ) {
+llvm::SmallVector Result;
+for (const auto  : Args) {
+  Result.push_back(uneof(Lex.lex(Arg)));
+}
+return 

[PATCH] D83218: Hand Allocator and IdentifierTable into FormatTokenLexer.

2020-07-07 Thread Manuel Klimek via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG8c2a61397607: Hand Allocator and IdentifierTable into 
FormatTokenLexer. (authored by klimek).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83218/new/

https://reviews.llvm.org/D83218

Files:
  clang/lib/Format/FormatTokenLexer.cpp
  clang/lib/Format/FormatTokenLexer.h
  clang/lib/Format/TokenAnalyzer.cpp


Index: clang/lib/Format/TokenAnalyzer.cpp
===
--- clang/lib/Format/TokenAnalyzer.cpp
+++ clang/lib/Format/TokenAnalyzer.cpp
@@ -64,11 +64,16 @@
 
 std::pair TokenAnalyzer::process() {
   tooling::Replacements Result;
-  FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(),
-  Env.getFirstStartColumn(), Style, Encoding);
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable(getFormattingLangOpts(Style));
+  FormatTokenLexer Lex(Env.getSourceManager(), Env.getFileID(),
+   Env.getFirstStartColumn(), Style, Encoding, Allocator,
 
-  UnwrappedLineParser Parser(Style, Tokens.getKeywords(),
- Env.getFirstStartColumn(), Tokens.lex(), *this);
+   IdentTable);
+  ArrayRef Toks(Lex.lex());
+  SmallVector Tokens(Toks.begin(), Toks.end());
+  UnwrappedLineParser Parser(Style, Lex.getKeywords(),
+ Env.getFirstStartColumn(), Tokens, *this);
   Parser.parse();
   assert(UnwrappedLines.rbegin()->empty());
   unsigned Penalty = 0;
@@ -76,14 +81,14 @@
 LLVM_DEBUG(llvm::dbgs() << "Run " << Run << "...\n");
 SmallVector AnnotatedLines;
 
-TokenAnnotator Annotator(Style, Tokens.getKeywords());
+TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (unsigned i = 0, e = UnwrappedLines[Run].size(); i != e; ++i) {
   AnnotatedLines.push_back(new AnnotatedLine(UnwrappedLines[Run][i]));
   Annotator.annotate(*AnnotatedLines.back());
 }
 
 std::pair RunResult =
-analyze(Annotator, AnnotatedLines, Tokens);
+analyze(Annotator, AnnotatedLines, Lex);
 
 LLVM_DEBUG({
   llvm::dbgs() << "Replacements for run " << Run << ":\n";
Index: clang/lib/Format/FormatTokenLexer.h
===
--- clang/lib/Format/FormatTokenLexer.h
+++ clang/lib/Format/FormatTokenLexer.h
@@ -38,7 +38,9 @@
 class FormatTokenLexer {
 public:
   FormatTokenLexer(const SourceManager , FileID ID, unsigned Column,
-   const FormatStyle , encoding::Encoding Encoding);
+   const FormatStyle , encoding::Encoding Encoding,
+   llvm::SpecificBumpPtrAllocator ,
+   IdentifierTable );
 
   ArrayRef lex();
 
@@ -103,10 +105,10 @@
   const SourceManager 
   FileID ID;
   const FormatStyle 
-  IdentifierTable IdentTable;
+  IdentifierTable 
   AdditionalKeywords Keywords;
   encoding::Encoding Encoding;
-  llvm::SpecificBumpPtrAllocator Allocator;
+  llvm::SpecificBumpPtrAllocator 
   // Index (in 'Tokens') of the last token that starts a new line.
   unsigned FirstInLineIndex;
   SmallVector Tokens;
Index: clang/lib/Format/FormatTokenLexer.cpp
===
--- clang/lib/Format/FormatTokenLexer.cpp
+++ clang/lib/Format/FormatTokenLexer.cpp
@@ -22,13 +22,15 @@
 namespace clang {
 namespace format {
 
-FormatTokenLexer::FormatTokenLexer(const SourceManager , FileID ID,
-   unsigned Column, const FormatStyle ,
-   encoding::Encoding Encoding)
+FormatTokenLexer::FormatTokenLexer(
+const SourceManager , FileID ID, unsigned Column,
+const FormatStyle , encoding::Encoding Encoding,
+llvm::SpecificBumpPtrAllocator ,
+IdentifierTable )
 : FormatTok(nullptr), IsFirstToken(true), StateStack({LexerState::NORMAL}),
   Column(Column), TrailingWhitespace(0), SourceMgr(SourceMgr), ID(ID),
-  Style(Style), IdentTable(getFormattingLangOpts(Style)),
-  Keywords(IdentTable), Encoding(Encoding), FirstInLineIndex(0),
+  Style(Style), IdentTable(IdentTable), Keywords(IdentTable),
+  Encoding(Encoding), Allocator(Allocator), FirstInLineIndex(0),
   FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
   MacroBlockEndRegex(Style.MacroBlockEnd) {
   Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,


Index: clang/lib/Format/TokenAnalyzer.cpp
===
--- clang/lib/Format/TokenAnalyzer.cpp
+++ clang/lib/Format/TokenAnalyzer.cpp
@@ -64,11 +64,16 @@
 
 std::pair TokenAnalyzer::process() {
   tooling::Replacements Result;
-  FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(),
-  Env.getFirstStartColumn(), Style, Encoding);
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable 

[PATCH] D83218: Hand Allocator and IdentifierTable into FormatTokenLexer.

2020-07-07 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 275964.
klimek added a comment.

Address review comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83218/new/

https://reviews.llvm.org/D83218

Files:
  clang/lib/Format/FormatTokenLexer.cpp
  clang/lib/Format/FormatTokenLexer.h
  clang/lib/Format/TokenAnalyzer.cpp


Index: clang/lib/Format/TokenAnalyzer.cpp
===
--- clang/lib/Format/TokenAnalyzer.cpp
+++ clang/lib/Format/TokenAnalyzer.cpp
@@ -64,11 +64,16 @@
 
 std::pair TokenAnalyzer::process() {
   tooling::Replacements Result;
-  FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(),
-  Env.getFirstStartColumn(), Style, Encoding);
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable(getFormattingLangOpts(Style));
+  FormatTokenLexer Lex(Env.getSourceManager(), Env.getFileID(),
+   Env.getFirstStartColumn(), Style, Encoding, Allocator,
 
-  UnwrappedLineParser Parser(Style, Tokens.getKeywords(),
- Env.getFirstStartColumn(), Tokens.lex(), *this);
+   IdentTable);
+  ArrayRef Toks(Lex.lex());
+  SmallVector Tokens(Toks.begin(), Toks.end());
+  UnwrappedLineParser Parser(Style, Lex.getKeywords(),
+ Env.getFirstStartColumn(), Tokens, *this);
   Parser.parse();
   assert(UnwrappedLines.rbegin()->empty());
   unsigned Penalty = 0;
@@ -76,14 +81,14 @@
 LLVM_DEBUG(llvm::dbgs() << "Run " << Run << "...\n");
 SmallVector AnnotatedLines;
 
-TokenAnnotator Annotator(Style, Tokens.getKeywords());
+TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (unsigned i = 0, e = UnwrappedLines[Run].size(); i != e; ++i) {
   AnnotatedLines.push_back(new AnnotatedLine(UnwrappedLines[Run][i]));
   Annotator.annotate(*AnnotatedLines.back());
 }
 
 std::pair RunResult =
-analyze(Annotator, AnnotatedLines, Tokens);
+analyze(Annotator, AnnotatedLines, Lex);
 
 LLVM_DEBUG({
   llvm::dbgs() << "Replacements for run " << Run << ":\n";
Index: clang/lib/Format/FormatTokenLexer.h
===
--- clang/lib/Format/FormatTokenLexer.h
+++ clang/lib/Format/FormatTokenLexer.h
@@ -38,7 +38,9 @@
 class FormatTokenLexer {
 public:
   FormatTokenLexer(const SourceManager , FileID ID, unsigned Column,
-   const FormatStyle , encoding::Encoding Encoding);
+   const FormatStyle , encoding::Encoding Encoding,
+   llvm::SpecificBumpPtrAllocator ,
+   IdentifierTable );
 
   ArrayRef lex();
 
@@ -103,10 +105,10 @@
   const SourceManager 
   FileID ID;
   const FormatStyle 
-  IdentifierTable IdentTable;
+  IdentifierTable 
   AdditionalKeywords Keywords;
   encoding::Encoding Encoding;
-  llvm::SpecificBumpPtrAllocator Allocator;
+  llvm::SpecificBumpPtrAllocator 
   // Index (in 'Tokens') of the last token that starts a new line.
   unsigned FirstInLineIndex;
   SmallVector Tokens;
Index: clang/lib/Format/FormatTokenLexer.cpp
===
--- clang/lib/Format/FormatTokenLexer.cpp
+++ clang/lib/Format/FormatTokenLexer.cpp
@@ -22,13 +22,15 @@
 namespace clang {
 namespace format {
 
-FormatTokenLexer::FormatTokenLexer(const SourceManager , FileID ID,
-   unsigned Column, const FormatStyle ,
-   encoding::Encoding Encoding)
+FormatTokenLexer::FormatTokenLexer(
+const SourceManager , FileID ID, unsigned Column,
+const FormatStyle , encoding::Encoding Encoding,
+llvm::SpecificBumpPtrAllocator ,
+IdentifierTable )
 : FormatTok(nullptr), IsFirstToken(true), StateStack({LexerState::NORMAL}),
   Column(Column), TrailingWhitespace(0), SourceMgr(SourceMgr), ID(ID),
-  Style(Style), IdentTable(getFormattingLangOpts(Style)),
-  Keywords(IdentTable), Encoding(Encoding), FirstInLineIndex(0),
+  Style(Style), IdentTable(IdentTable), Keywords(IdentTable),
+  Encoding(Encoding), Allocator(Allocator), FirstInLineIndex(0),
   FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
   MacroBlockEndRegex(Style.MacroBlockEnd) {
   Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,


Index: clang/lib/Format/TokenAnalyzer.cpp
===
--- clang/lib/Format/TokenAnalyzer.cpp
+++ clang/lib/Format/TokenAnalyzer.cpp
@@ -64,11 +64,16 @@
 
 std::pair TokenAnalyzer::process() {
   tooling::Replacements Result;
-  FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(),
-  Env.getFirstStartColumn(), Style, Encoding);
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable(getFormattingLangOpts(Style));
+  FormatTokenLexer Lex(Env.getSourceManager(), 

[PATCH] D83218: Hand Allocator and IdentifierTable into FormatTokenLexer.

2020-07-06 Thread Manuel Klimek via Phabricator via cfe-commits
klimek created this revision.
klimek added a reviewer: sammccall.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This allows us to share the allocator in the future so we can create tokens 
while parsing.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D83218

Files:
  clang/lib/Format/FormatTokenLexer.cpp
  clang/lib/Format/FormatTokenLexer.h
  clang/lib/Format/TokenAnalyzer.cpp

Index: clang/lib/Format/TokenAnalyzer.cpp
===
--- clang/lib/Format/TokenAnalyzer.cpp
+++ clang/lib/Format/TokenAnalyzer.cpp
@@ -64,11 +64,16 @@
 
 std::pair TokenAnalyzer::process() {
   tooling::Replacements Result;
-  FormatTokenLexer Tokens(Env.getSourceManager(), Env.getFileID(),
-  Env.getFirstStartColumn(), Style, Encoding);
+  llvm::SpecificBumpPtrAllocator Allocator;
+  IdentifierTable IdentTable(getFormattingLangOpts(Style));
+  FormatTokenLexer Lex(Env.getSourceManager(), Env.getFileID(),
+   Env.getFirstStartColumn(), Style, Encoding, Allocator,
 
-  UnwrappedLineParser Parser(Style, Tokens.getKeywords(),
- Env.getFirstStartColumn(), Tokens.lex(), *this);
+   IdentTable);
+  ArrayRef Toks(Lex.lex());
+  SmallVector Tokens(Toks.begin(), Toks.end());
+  UnwrappedLineParser Parser(Style, Lex.getKeywords(),
+ Env.getFirstStartColumn(), Tokens, *this);
   Parser.parse();
   assert(UnwrappedLines.rbegin()->empty());
   unsigned Penalty = 0;
@@ -76,14 +81,14 @@
 LLVM_DEBUG(llvm::dbgs() << "Run " << Run << "...\n");
 SmallVector AnnotatedLines;
 
-TokenAnnotator Annotator(Style, Tokens.getKeywords());
+TokenAnnotator Annotator(Style, Lex.getKeywords());
 for (unsigned i = 0, e = UnwrappedLines[Run].size(); i != e; ++i) {
   AnnotatedLines.push_back(new AnnotatedLine(UnwrappedLines[Run][i]));
   Annotator.annotate(*AnnotatedLines.back());
 }
 
 std::pair RunResult =
-analyze(Annotator, AnnotatedLines, Tokens);
+analyze(Annotator, AnnotatedLines, Lex);
 
 LLVM_DEBUG({
   llvm::dbgs() << "Replacements for run " << Run << ":\n";
Index: clang/lib/Format/FormatTokenLexer.h
===
--- clang/lib/Format/FormatTokenLexer.h
+++ clang/lib/Format/FormatTokenLexer.h
@@ -38,7 +38,9 @@
 class FormatTokenLexer {
 public:
   FormatTokenLexer(const SourceManager , FileID ID, unsigned Column,
-   const FormatStyle , encoding::Encoding Encoding);
+   const FormatStyle , encoding::Encoding Encoding,
+   llvm::SpecificBumpPtrAllocator ,
+   IdentifierTable );
 
   ArrayRef lex();
 
@@ -103,10 +105,9 @@
   const SourceManager 
   FileID ID;
   const FormatStyle 
-  IdentifierTable IdentTable;
+  IdentifierTable 
   AdditionalKeywords Keywords;
   encoding::Encoding Encoding;
-  llvm::SpecificBumpPtrAllocator Allocator;
   // Index (in 'Tokens') of the last token that starts a new line.
   unsigned FirstInLineIndex;
   SmallVector Tokens;
@@ -117,6 +118,7 @@
 
   llvm::Regex MacroBlockBeginRegex;
   llvm::Regex MacroBlockEndRegex;
+  llvm::SpecificBumpPtrAllocator 
 
   // Targets that may appear inside a C# attribute.
   static const llvm::StringSet<> CSharpAttributeTargets;
Index: clang/lib/Format/FormatTokenLexer.cpp
===
--- clang/lib/Format/FormatTokenLexer.cpp
+++ clang/lib/Format/FormatTokenLexer.cpp
@@ -22,15 +22,17 @@
 namespace clang {
 namespace format {
 
-FormatTokenLexer::FormatTokenLexer(const SourceManager , FileID ID,
-   unsigned Column, const FormatStyle ,
-   encoding::Encoding Encoding)
+FormatTokenLexer::FormatTokenLexer(
+const SourceManager , FileID ID, unsigned Column,
+const FormatStyle , encoding::Encoding Encoding,
+llvm::SpecificBumpPtrAllocator ,
+IdentifierTable )
 : FormatTok(nullptr), IsFirstToken(true), StateStack({LexerState::NORMAL}),
   Column(Column), TrailingWhitespace(0), SourceMgr(SourceMgr), ID(ID),
-  Style(Style), IdentTable(getFormattingLangOpts(Style)),
-  Keywords(IdentTable), Encoding(Encoding), FirstInLineIndex(0),
-  FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
-  MacroBlockEndRegex(Style.MacroBlockEnd) {
+  Style(Style), IdentTable(IdentTable), Keywords(IdentTable),
+  Encoding(Encoding), FirstInLineIndex(0), FormattingDisabled(false),
+  MacroBlockBeginRegex(Style.MacroBlockBegin),
+  MacroBlockEndRegex(Style.MacroBlockEnd), Allocator(Allocator) {
   Lex.reset(new Lexer(ID, SourceMgr.getBuffer(ID), SourceMgr,
   getFormattingLangOpts(Style)));
   Lex->SetKeepWhitespaceMode(true);
___
cfe-commits mailing list

[PATCH] D83076: Revert AST Matchers default to AsIs mode

2020-07-03 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83076/new/

https://reviews.llvm.org/D83076



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

LG




Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:468
+// Memoize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;
 MemoizationMap::iterator I = ResultCache.find(Key);

sammccall wrote:
> sammccall wrote:
> > This line looks a lot like a cache-poisoning bug, until realizing that 
> > MaxDepth is never actually used as an integer, it's a boolean - 1 means 
> > non-recursive and 0 means recursive. (Note that the recursive call passes 
> > MaxDepth, not MaxDepth-1).
> > I guess we should just fix the type, actually bounding the depth doesn't 
> > seem likely to be done anytime soon.
> er, 1 means non-recursive and anything else means recursive...
That sounds like a historical reason - feel free to change :) (I don't even 
remember how that came to be).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D82771#2121924 , @hokein wrote:

> In D82771#2120214 , @klimek wrote:
>
> > In what situation are we calling child matchers repeatedly with the same 
> > matcher on the same node?
>
>
> I guess a pattern like 
> https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clang-tidy/readability/ContainerSizeEmptyCheck.cpp#L29-L40?


Ah, so this is when we go hasDeclaration.

I'm wondering whether what we really want to memoize are matchers that go 
across the AST, like hasType, hasDeclaration, etc, where we can expect a linear 
number of nodes hitting that  matcher.

hasChild specifically seems weird to memoize when we don't memoize other has 
matchers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:467
 Key.Traversal = Ctx.getParentMapContext().getTraversalKind();
-Key.Type = MatchType::Descendants;
+// Memorize result even doing a single-level match, it might be expensive.
+Key.Type = MaxDepth == 1 ? MatchType::Child : MatchType::Descendants;

Here and below: s/memorize/memoize/


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82771: [ASTMatcher] Fix a performance regression: memorize the child match.

2020-06-29 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In what situation are we calling child matchers repeatedly with the same 
matcher on the same node?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82771/new/

https://reviews.llvm.org/D82771



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80202: [ASTMatchers] Performance optimization for memoization

2020-06-22 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a reviewer: sammccall.
klimek added a comment.

+Sam for additional sanity checking.




Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:902
   // Maps (matcher, node) -> the match result for memoization.
-  typedef std::map MemoizationMap;
+  typedef std::map> MemoizationMap;
   MemoizationMap ResultCache;

Ok, if this actually matters, we should also not use a std::map here, but a 
llvm::DenseMap (or do we rely on iteration invalidation semantics here?).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80202/new/

https://reviews.llvm.org/D80202



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80961: Ignore template instantiations if not in AsIs mode

2020-06-18 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D80961#2099262 , @steveire wrote:

> In D80961#2095254 , @klimek wrote:
>
> > 1. the scare quotes around "standing objections" reads like you're not 
> > respecting the opinions of others here;
>
>
> Hmm, this wasn't intended. I sometimes quote things if they are a particular 
> term (ordinarily I would quote "particular term").
>
> Nevertheless, I'll take on your message. I don't have more to say.


Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80961/new/

https://reviews.llvm.org/D80961



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80961: Ignore template instantiations if not in AsIs mode

2020-06-16 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D80961#2090216 , @steveire wrote:

> In D80961#2079419 , @aaron.ballman 
> wrote:
>
> > In D80961#2079044 , @klimek wrote:
> >
> > > In D80961#2076242 , 
> > > @aaron.ballman wrote:
> > >
> > > > In D80961#2073049 , @klimek 
> > > > wrote:
> > > >
> > > > > Without jumping into the discussion whether it should be the default, 
> > > > > I think we should be able to control template instantiation 
> > > > > visitation separately from other implicit nodes.
> > > > >  Having to put AsIs on a matcher every time you need to match 
> > > > > template instantiations is a rather big change (suddenly you have to 
> > > > > change all the matchers you've written so far).
> > > >
> > > >
> > > > I think that's the intended meaning of `AsIs` though. Template 
> > > > instantiations are not source code the user wrote, they're source code 
> > > > the compiler stamped out from code the user wrote. I hope 
> > > > `IgnoreUnlessSpelledInSource` isn't a misnomer.
> > > >
> > > > > I love the idea of being able to control visitation of template 
> > > > > instantiation.
> > > > >  I am somewhat torn on whether it should be the default, and would 
> > > > > like to see more data.
> > > > >  I feel more strongly about needing AsIs when I want to match 
> > > > > template instantiations.
> > > >
> > > > FWIW, my experience in clang-tidy has been that template instantiations 
> > > > are ignored far more often than they're desired. In fact, 
> > > > instantiations tend to be a source of bugs for us because they're easy 
> > > > to forget about when writing matchers without keeping templates in 
> > > > mind. The times when template instantiations become important to *not* 
> > > > ignore within the checks is when the check is specific to template 
> > > > behavior, but that's a minority of the public checks thus far.
> > >
> > >
> > > So I assume the idea is that this will work & be what we'll want people 
> > > to write?
> > >  traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))
> >
> >
> > I believe so, yes. It's explicit about which traversal mode you want any 
> > given set of matchers to match with, so it is more flexible at the expense 
> > of being more verbose. Alternatively, you could be in `AsIs` mode for all 
> > of it and explicitly ignore the implicit nodes you wish to ignore (which 
> > may be even more verbose with the same results, depending on the matchers 
> > involved).
>


Support for clang-query is coming up, I assume? If that works & will work in 
clang-query that addresses my concern.

> As far as I can tell, the people who had "standing objections" to fixing this 
> bug have been convinced that it makes sense. At least that is implied.
> 
> The problem is they haven't explicitly reversed so progress is still blocked.
> 
> Please respond to either say you're no longer blocking this or say why you 
> continue to block it.
> 
> There is no reason not to fix this.

Stephen, I want to call this out, as I think the way you write here is not OK:

1. the scare quotes around "standing objections" reads like you're not 
respecting the opinions of others here; it's fine to be frustrated, it's fine 
to call out why you're frustrated (for example, reviews taking long time, or it 
being hard to find consensus), but it's not OK to use sarcasm to implicitly 
attack folks - the only result is that people are hurt
2. if you want somebody to chime in, please say so; implicitly alluding to 
people you want to respond reads as an implicit accusation of folks trying to 
block you; nobody wants to block you; we're professionals trying to create a 
great product, and we often have different opinions what that means; that's 
part of the process, and in the end makes for a better product
3. "There is no reason not to fix this" - asserting your believes is not an 
argument, and comes across as arrogant - the moment other people have 
objections there are reasons to object; whether those reasons in the end are 
more important than the reasons you cited for doing it is often a value call, 
and different people have different values

If you need to vent, feel free to write me (and only me) angry emails (if you 
preface with Vent: I also know the context, which helps :), or I'm also happy 
to add you to my work chat so you can vent at me in chat. I understand the need 
to vent - but it must not be done at the emotional expense of others on the 
project. In the end, behaving like this in code reviews will reduce the chance 
that people are willing to engage with you and spend time on reviewing your 
code, leading to even longer review times. This is a vicious cycle. It's in 
your hands to break it, but it will take time and patience.
As I've said before, I love what you're doing for the project, but 

[PATCH] D80025: [ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)

2020-06-16 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

LG. Thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80025/new/

https://reviews.llvm.org/D80025



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D81552: [ASTMatchers] Added hasDirectBase and hasClass Matchers

2020-06-15 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

We have some precedent for overloading has* matchers, but I'll defer to Sam's 
judgement whether that's a good idea here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81552/new/

https://reviews.llvm.org/D81552



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80961: Ignore template instantiations if not in AsIs mode

2020-06-08 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D80961#2076242 , @aaron.ballman 
wrote:

> In D80961#2073049 , @klimek wrote:
>
> > Without jumping into the discussion whether it should be the default, I 
> > think we should be able to control template instantiation visitation 
> > separately from other implicit nodes.
> >  Having to put AsIs on a matcher every time you need to match template 
> > instantiations is a rather big change (suddenly you have to change all the 
> > matchers you've written so far).
>
>
> I think that's the intended meaning of `AsIs` though. Template instantiations 
> are not source code the user wrote, they're source code the compiler stamped 
> out from code the user wrote. I hope `IgnoreUnlessSpelledInSource` isn't a 
> misnomer.
>
> > I love the idea of being able to control visitation of template 
> > instantiation.
> >  I am somewhat torn on whether it should be the default, and would like to 
> > see more data.
> >  I feel more strongly about needing AsIs when I want to match template 
> > instantiations.
>
> FWIW, my experience in clang-tidy has been that template instantiations are 
> ignored far more often than they're desired. In fact, instantiations tend to 
> be a source of bugs for us because they're easy to forget about when writing 
> matchers without keeping templates in mind. The times when template 
> instantiations become important to *not* ignore within the checks is when the 
> check is specific to template behavior, but that's a minority of the public 
> checks thus far.


So I assume the idea is that this will work & be what we'll want people to 
write?
traverse(AsIs, decl(traverse(IgnoreImplicit, hasFoo)))


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80961/new/

https://reviews.llvm.org/D80961



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80961: WIP: Ignore template instantiations if not in AsIs mode

2020-06-04 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Without jumping into the discussion whether it should be the default, I think 
we should be able to control template instantiation visitation separately from 
other implicit nodes.
Having to put AsIs on a matcher every time you need to match template 
instantiations is a rather big change (suddenly you have to change all the 
matchers you've written so far).
I love the idea of being able to control visitation of template instantiation.
I am somewhat torn on whether it should be the default, and would like to see 
more data.
I feel more strongly about needing AsIs when I want to match template 
instantiations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80961/new/

https://reviews.llvm.org/D80961



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80961: WIP: Ignore template instantiations if not in AsIs mode

2020-06-04 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D80961#2068865 , @ymandel wrote:

> Thank you for bringing up this issue. I think it highlights an underlying 
> problem -- editing templates is quite difficult -- that neither setting will 
> address, as Dmitri expanded on above. Given the parallel to macros, I'd say 
> your change is better than the status quo. Most clang tidies and other 
> rewriting tools that I've encountered simply skip code in macro expansions, 
> rather than reason about how to update the macro definition or whatnot. So, 
> by that reasoning, we should skip template instantations.


I've personally written quite a few tools that changed macro definitions. Do 
you have numbers on how many of the library team's tools generally want to 
match template instantiations?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80961/new/

https://reviews.llvm.org/D80961



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D50078: clang-format: support aligned nested conditionals formatting

2020-05-27 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

For the policy question: clang-format does intentionally not try to be stable - 
the "how to" suggestion for clang-format has always been to format changes 
lines and live with divergence (divergence from people manually formatting 
things is larger).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D50078/new/

https://reviews.llvm.org/D50078



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69764: [clang-format] Add East/West Const fixer capability

2020-05-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/docs/ClangFormatStyleOptions.rst:1378
 
+**ConstStyle** (``ConstAlignmentStyle``)
+  Different ways to arrange const.

MyDeveloperDay wrote:
> aaron.ballman wrote:
> > MyDeveloperDay wrote:
> > > klimek wrote:
> > > > Personally, I'm somewhat against having 3 different aliases for the 
> > > > options. I'd chose one, even though it doesn't make everybody happy, 
> > > > and move on. I'm fine with East/West as long as the documentation makes 
> > > > it clear what it is.
> > > If I have to drop the other options, I think I'd want to go with 
> > > East/West const as I feel it has more momentum, just letting people know 
> > > before I change the code back (to my original patch ;-) )
> > > 
> > > https://www.youtube.com/watch?v=gRmI_gsNqcI
> > > 
> > > {F10954065}
> > > 
> > @klimek I requested that we do not go with East/West the options and I'm 
> > still pretty insistent on it. East/West is a kitschy way to phrase it that 
> > I think is somewhat US-centric (where we make a pretty big distinction 
> > between the east and west coasts). I do not want to have to mentally map 
> > left/right to the less-clear east/west in the config file. Would you be 
> > fine if we only had Left/Right instead of East/West? I would be fine with 
> > that option, but figured enough people like the cute East/West designation 
> > that we might as well support it.
> Just for a reference, I'm not from the US and I think east/west still 
> translates pretty well. I was happy to support the others. 
I'd be fine with only having left/right; my personal feeling is also that 
west-const / east-const has kinda become a term of art, though, so I really 
don't know which one is "right" :)

Generally, I think this is one of the cases where, given good docs, we're 
quickly spending more engineering hours discussing the right solution than 
it'll cost aggregated over all future users, under the assumption that people 
do not write new configs very often, and the few who will, will quickly 
remember.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69764/new/

https://reviews.llvm.org/D69764



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80025: [ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)

2020-05-20 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:46-49
+enum class MatchDirection {
+  Ancestors,
+  Descendants
+};

loic-joly-sonarsource wrote:
> klimek wrote:
> > loic-joly-sonarsource wrote:
> > > klimek wrote:
> > > > Nice find! Why don't we need more states though?
> > > > 1. wouldn't hasParent() followed by a hasAncestor() also trigger the 
> > > > memoization? (if so, we'd need transitive / non-transitive)
> > > > 2. can we trigger a directional match and a non-directional 
> > > > (non-traversal, for example, explicit) match in the same memoization 
> > > > scope?
> > > 1. Yes, I'm going to add it
> > > 2. Sorry, I did not understand what you mean...
> > > 
> > Re 2: nevermind, we don't memoize non-transitive matches (other than 
> > hasParent / hasChild), right?
> > In that case, I'm wondering whether the simpler solution is to just not 
> > memoize hasParent / hasChild - wouldn't that be more in line with the rest 
> > of the memoization?
> If I correctly understand what you mean, you think that memoizing recursive 
> searches (ancestors/descendants) might not be worth it, and that just 
> memoizing direct searches (parent, child) would be enough.
> 
> I really don't know if it's true or not, and I believe that getting this kind 
> of data requires a significant benchmarking effort (which code base, which 
> matchers...). And there might also be other performance-related concerns (for 
> instance, the value of `MaxMemoizationEntries`, the choice of `std::map` to 
> store the cache...),  
> 
> In other words, I think this would go far beyond what this PR is trying to 
> achieve, which is only correcting a bug.
What I'm trying to say is:
Currently the only non-transitive matchers we memoize are hasChild and 
hasParent, which ... seems weird - my operating hypothesis is that back in the 
day I didn't realize that or I'd have changed it :)

Thus, it seems like a simpler patch is to not memoize if it's not a transitive 
match.

Sam also had a good idea, which is to change the ASTMatchFinder API to instead 
of the current:
matchesChildOf
matchesDescendantOf
matchesAncestorOf(..., MatchMode)

create different atoms:
matchesChildOf
matchesDescendantOfOrSelf
matchesParentOf
matchesAncestorOfOrSelf

then hasDescendant(m) would be implemented as hasChild(hasDescendantOrSelf(m)) 
and the direction would be part of the matcher already.
That as an aside, which I'd propose to not do in the current patch though :)

My proposal for the current patch is to not memoize if we're only matching a 
single child or parent.




CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80025/new/

https://reviews.llvm.org/D80025



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80202: [ASTMatchers] Performance optimization for memoization

2020-05-20 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

In D80202#2046321 , @njames93 wrote:

> In D80202#2046266 , @klimek wrote:
>
> > Given the extra complexity I'd like to see that it matters - bound nodes 
> > tend to be small.
>
>
> I put that in the description, but this is where i need help. whats the best 
> way to benchmark the matchers? 
>  Also, do you know how it was benchmarked when `MaxMemoizationEntries` was 
> decided upon?
>  There was also comments about some making some micro benchmarks but I don't 
> think that was acted upon.


I do remember benchmarking it when I wrote it, but that was 10 years ago :)
I mainly did benchmarks on large ASTs, trying various matchers.
Today we can benchmark for example whether running clang-tidy on all of LLVM 
makes a difference.
Micro-BMs would be nice to add, but are somewhat hard to get right.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80202/new/

https://reviews.llvm.org/D80202



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80202: [ASTMatchers] Performance optimization for memoization

2020-05-20 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

Given the extra complexity I'd like to see that it matters - bound nodes tend 
to be small.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80202/new/

https://reviews.llvm.org/D80202



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80025: [ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)

2020-05-19 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:46-49
+enum class MatchDirection {
+  Ancestors,
+  Descendants
+};

loic-joly-sonarsource wrote:
> klimek wrote:
> > Nice find! Why don't we need more states though?
> > 1. wouldn't hasParent() followed by a hasAncestor() also trigger the 
> > memoization? (if so, we'd need transitive / non-transitive)
> > 2. can we trigger a directional match and a non-directional (non-traversal, 
> > for example, explicit) match in the same memoization scope?
> 1. Yes, I'm going to add it
> 2. Sorry, I did not understand what you mean...
> 
Re 2: nevermind, we don't memoize non-transitive matches (other than hasParent 
/ hasChild), right?
In that case, I'm wondering whether the simpler solution is to just not memoize 
hasParent / hasChild - wouldn't that be more in line with the rest of the 
memoization?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80025/new/

https://reviews.llvm.org/D80025



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80025: [ASTMatcher] Correct memoization bug ignoring direction (descendants or ancestors)

2020-05-18 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/ASTMatchers/ASTMatchFinder.cpp:46-49
+enum class MatchDirection {
+  Ancestors,
+  Descendants
+};

Nice find! Why don't we need more states though?
1. wouldn't hasParent() followed by a hasAncestor() also trigger the 
memoization? (if so, we'd need transitive / non-transitive)
2. can we trigger a directional match and a non-directional (non-traversal, for 
example, explicit) match in the same memoization scope?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80025/new/

https://reviews.llvm.org/D80025



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D67405: Make FormatToken::Type private.

2020-05-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 263712.
klimek added a comment.

Update docs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67405/new/

https://reviews.llvm.org/D67405

Files:
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/FormatTokenLexer.cpp
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/UnwrappedLineParser.cpp

Index: clang/lib/Format/UnwrappedLineParser.cpp
===
--- clang/lib/Format/UnwrappedLineParser.cpp
+++ clang/lib/Format/UnwrappedLineParser.cpp
@@ -369,9 +369,9 @@
   bool SwitchLabelEncountered = false;
   do {
 tok::TokenKind kind = FormatTok->Tok.getKind();
-if (FormatTok->Type == TT_MacroBlockBegin) {
+if (FormatTok->getType() == TT_MacroBlockBegin) {
   kind = tok::l_brace;
-} else if (FormatTok->Type == TT_MacroBlockEnd) {
+} else if (FormatTok->getType() == TT_MacroBlockEnd) {
   kind = tok::r_brace;
 }
 
@@ -1033,11 +1033,11 @@
   case tok::kw_asm:
 nextToken();
 if (FormatTok->is(tok::l_brace)) {
-  FormatTok->Type = TT_InlineASMBrace;
+  FormatTok->setType(TT_InlineASMBrace);
   nextToken();
   while (FormatTok && FormatTok->isNot(tok::eof)) {
 if (FormatTok->is(tok::r_brace)) {
-  FormatTok->Type = TT_InlineASMBrace;
+  FormatTok->setType(TT_InlineASMBrace);
   nextToken();
   addUnwrappedLine();
   break;
@@ -1342,7 +1342,7 @@
 // for them (the one we know is missing are lambdas).
 if (Style.BraceWrapping.AfterFunction)
   addUnwrappedLine();
-FormatTok->Type = TT_FunctionLBrace;
+FormatTok->setType(TT_FunctionLBrace);
 parseBlock(/*MustBeDeclaration=*/false);
 addUnwrappedLine();
 return;
@@ -1658,7 +1658,7 @@
   // This might or might not actually be a lambda arrow (this could be an
   // ObjC method invocation followed by a dereferencing arrow). We might
   // reset this back to TT_Unknown in TokenAnnotator.
-  FormatTok->Type = TT_LambdaArrow;
+  FormatTok->setType(TT_LambdaArrow);
   SeenArrow = true;
   nextToken();
   break;
@@ -1666,8 +1666,8 @@
   return true;
 }
   }
-  FormatTok->Type = TT_LambdaLBrace;
-  LSquare.Type = TT_LambdaLSquare;
+  FormatTok->setType(TT_LambdaLBrace);
+  LSquare.setType(TT_LambdaLSquare);
   parseChildBlock();
   return true;
 }
@@ -1700,7 +1700,7 @@
 
   // Consume * (generator function). Treat it like C++'s overloaded operators.
   if (FormatTok->is(tok::star)) {
-FormatTok->Type = TT_OverloadedOperator;
+FormatTok->setType(TT_OverloadedOperator);
 nextToken();
   }
 
@@ -2684,8 +2684,8 @@
 E = Line.Tokens.end();
I != E; ++I) {
 llvm::dbgs() << I->Tok->Tok.getName() << "["
- << "T=" << I->Tok->Type << ", OC=" << I->Tok->OriginalColumn
- << "] ";
+ << "T=" << I->Tok->getType()
+ << ", OC=" << I->Tok->OriginalColumn << "] ";
   }
   for (std::list::const_iterator I = Line.Tokens.begin(),
 E = Line.Tokens.end();
@@ -2956,14 +2956,14 @@
   flushComments(isOnNewLine(*FormatTok));
   parsePPDirective();
 }
-while (FormatTok->Type == TT_ConflictStart ||
-   FormatTok->Type == TT_ConflictEnd ||
-   FormatTok->Type == TT_ConflictAlternative) {
-  if (FormatTok->Type == TT_ConflictStart) {
+while (FormatTok->getType() == TT_ConflictStart ||
+   FormatTok->getType() == TT_ConflictEnd ||
+   FormatTok->getType() == TT_ConflictAlternative) {
+  if (FormatTok->getType() == TT_ConflictStart) {
 conditionalCompilationStart(/*Unreachable=*/false);
-  } else if (FormatTok->Type == TT_ConflictAlternative) {
+  } else if (FormatTok->getType() == TT_ConflictAlternative) {
 conditionalCompilationAlternative();
-  } else if (FormatTok->Type == TT_ConflictEnd) {
+  } else if (FormatTok->getType() == TT_ConflictEnd) {
 conditionalCompilationEnd();
   }
   FormatTok = Tokens->getNextToken();
Index: clang/lib/Format/TokenAnnotator.cpp
===
--- clang/lib/Format/TokenAnnotator.cpp
+++ clang/lib/Format/TokenAnnotator.cpp
@@ -118,9 +118,9 @@
 if (Style.Language == FormatStyle::LK_TextProto ||
 (Style.Language == FormatStyle::LK_Proto && Left->Previous &&
  Left->Previous->isOneOf(TT_SelectorName, TT_DictLiteral)))
-  CurrentToken->Type = TT_DictLiteral;
+  CurrentToken->setType(TT_DictLiteral);
 else
-  CurrentToken->Type = TT_TemplateCloser;
+  CurrentToken->setType(TT_TemplateCloser);
 next();
 return true;
   }
@@ -151,7 +151,7 

[PATCH] D67405: Make FormatToken::Type private.

2020-05-13 Thread Manuel Klimek via Phabricator via cfe-commits
klimek updated this revision to Diff 263707.
klimek added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D67405/new/

https://reviews.llvm.org/D67405

Files:
  clang/lib/Format/Format.cpp
  clang/lib/Format/FormatToken.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/FormatTokenLexer.cpp
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/UnwrappedLineParser.cpp

Index: clang/lib/Format/UnwrappedLineParser.cpp
===
--- clang/lib/Format/UnwrappedLineParser.cpp
+++ clang/lib/Format/UnwrappedLineParser.cpp
@@ -369,9 +369,9 @@
   bool SwitchLabelEncountered = false;
   do {
 tok::TokenKind kind = FormatTok->Tok.getKind();
-if (FormatTok->Type == TT_MacroBlockBegin) {
+if (FormatTok->getType() == TT_MacroBlockBegin) {
   kind = tok::l_brace;
-} else if (FormatTok->Type == TT_MacroBlockEnd) {
+} else if (FormatTok->getType() == TT_MacroBlockEnd) {
   kind = tok::r_brace;
 }
 
@@ -1033,11 +1033,11 @@
   case tok::kw_asm:
 nextToken();
 if (FormatTok->is(tok::l_brace)) {
-  FormatTok->Type = TT_InlineASMBrace;
+  FormatTok->setType(TT_InlineASMBrace);
   nextToken();
   while (FormatTok && FormatTok->isNot(tok::eof)) {
 if (FormatTok->is(tok::r_brace)) {
-  FormatTok->Type = TT_InlineASMBrace;
+  FormatTok->setType(TT_InlineASMBrace);
   nextToken();
   addUnwrappedLine();
   break;
@@ -1342,7 +1342,7 @@
 // for them (the one we know is missing are lambdas).
 if (Style.BraceWrapping.AfterFunction)
   addUnwrappedLine();
-FormatTok->Type = TT_FunctionLBrace;
+FormatTok->setType(TT_FunctionLBrace);
 parseBlock(/*MustBeDeclaration=*/false);
 addUnwrappedLine();
 return;
@@ -1658,7 +1658,7 @@
   // This might or might not actually be a lambda arrow (this could be an
   // ObjC method invocation followed by a dereferencing arrow). We might
   // reset this back to TT_Unknown in TokenAnnotator.
-  FormatTok->Type = TT_LambdaArrow;
+  FormatTok->setType(TT_LambdaArrow);
   SeenArrow = true;
   nextToken();
   break;
@@ -1666,8 +1666,8 @@
   return true;
 }
   }
-  FormatTok->Type = TT_LambdaLBrace;
-  LSquare.Type = TT_LambdaLSquare;
+  FormatTok->setType(TT_LambdaLBrace);
+  LSquare.setType(TT_LambdaLSquare);
   parseChildBlock();
   return true;
 }
@@ -1700,7 +1700,7 @@
 
   // Consume * (generator function). Treat it like C++'s overloaded operators.
   if (FormatTok->is(tok::star)) {
-FormatTok->Type = TT_OverloadedOperator;
+FormatTok->setType(TT_OverloadedOperator);
 nextToken();
   }
 
@@ -2684,8 +2684,8 @@
 E = Line.Tokens.end();
I != E; ++I) {
 llvm::dbgs() << I->Tok->Tok.getName() << "["
- << "T=" << I->Tok->Type << ", OC=" << I->Tok->OriginalColumn
- << "] ";
+ << "T=" << I->Tok->getType()
+ << ", OC=" << I->Tok->OriginalColumn << "] ";
   }
   for (std::list::const_iterator I = Line.Tokens.begin(),
 E = Line.Tokens.end();
@@ -2956,14 +2956,14 @@
   flushComments(isOnNewLine(*FormatTok));
   parsePPDirective();
 }
-while (FormatTok->Type == TT_ConflictStart ||
-   FormatTok->Type == TT_ConflictEnd ||
-   FormatTok->Type == TT_ConflictAlternative) {
-  if (FormatTok->Type == TT_ConflictStart) {
+while (FormatTok->getType() == TT_ConflictStart ||
+   FormatTok->getType() == TT_ConflictEnd ||
+   FormatTok->getType() == TT_ConflictAlternative) {
+  if (FormatTok->getType() == TT_ConflictStart) {
 conditionalCompilationStart(/*Unreachable=*/false);
-  } else if (FormatTok->Type == TT_ConflictAlternative) {
+  } else if (FormatTok->getType() == TT_ConflictAlternative) {
 conditionalCompilationAlternative();
-  } else if (FormatTok->Type == TT_ConflictEnd) {
+  } else if (FormatTok->getType() == TT_ConflictEnd) {
 conditionalCompilationEnd();
   }
   FormatTok = Tokens->getNextToken();
Index: clang/lib/Format/TokenAnnotator.cpp
===
--- clang/lib/Format/TokenAnnotator.cpp
+++ clang/lib/Format/TokenAnnotator.cpp
@@ -118,9 +118,9 @@
 if (Style.Language == FormatStyle::LK_TextProto ||
 (Style.Language == FormatStyle::LK_Proto && Left->Previous &&
  Left->Previous->isOneOf(TT_SelectorName, TT_DictLiteral)))
-  CurrentToken->Type = TT_DictLiteral;
+  CurrentToken->setType(TT_DictLiteral);
 else
-  CurrentToken->Type = TT_TemplateCloser;
+  CurrentToken->setType(TT_TemplateCloser);
 next();
 return true;
   }
@@ -151,7 +151,7 @@

[PATCH] D73537: [clangd] add CODE_OWNERS

2020-01-28 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73537/new/

https://reviews.llvm.org/D73537



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69764: [clang-format] Add Left/Right Const (East/West , Before/After) fixer capability

2019-12-02 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

I'm not generally opposed to this, given that
a) clang-format already changes code; I think by now we're not fixing double 
semicolon mainly for workflow reasons, would be fine to add
b) the implementation is very self contained




Comment at: clang/docs/ClangFormatStyleOptions.rst:1378
 
+**ConstStyle** (``ConstAlignmentStyle``)
+  Different ways to arrange const.

Personally, I'm somewhat against having 3 different aliases for the options. 
I'd chose one, even though it doesn't make everybody happy, and move on. I'm 
fine with East/West as long as the documentation makes it clear what it is.



Comment at: clang/lib/Format/EastWestConstFixer.cpp:143
+SmallVectorImpl ,
+FormatTokenLexer ) {
+  const AdditionalKeywords  = Tokens.getKeywords();

This function is super large - can we split it up?



Comment at: clang/unittests/Format/FormatTest.cpp:30-32
+#define VERIFYFORMAT(expect, style) verifyFormat(expect, style, __LINE__)
+#define VERIFYFORMAT2(expect, actual, style)   
\
+  verifyFormat(expect, actual, style, __LINE__)

I'm somewhat opposed to introducing these macros in this patch. If we want 
them, we should create an extra patch, and figure out ho to use them in all 
format tests.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69764/new/

https://reviews.llvm.org/D69764



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70664: update string comparison in clang-format.py

2019-11-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70664/new/

https://reviews.llvm.org/D70664



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D70633: clang-format-vs : Fix Unicode formatting

2019-11-25 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

generally makes sense


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70633/new/

https://reviews.llvm.org/D70633



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69577: [clang-format] [PR35518] C++17 deduction guides are wrongly formatted

2019-10-30 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/lib/Format/TokenAnnotator.cpp:1400-1403
+} else if (Current.Previous && Current.Previous->is(tok::r_paren) &&
+   Current.startsSequence(tok::arrow, tok::identifier, tok::less)) 
{
+  // Deduction guides trailing arrow "...) -> A".
+  Current.Type = TT_TrailingReturnArrow;

Why doesn't this trigger on function templates:
  c()->f();



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69577/new/

https://reviews.llvm.org/D69577



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69122: Add support to find out resource dir and add it as compilation args

2019-10-24 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added a comment.

> since resource directory should be picked relative
>  to the path of the clang-compiler in the compilation command.

The resource dir should be the resource dir that shipped with the clang source 
code that the *tool* was built with.
We can think about the resource dir as files that should really have been 
compiled into the tool.
If the compiler has a different version, its resource dir might break the tool.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69122/new/

https://reviews.llvm.org/D69122



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69351: Improve Clang's getting involved document and make it more inclusive in wording.

2019-10-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69351/new/

https://reviews.llvm.org/D69351



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69351: Improve Clang's getting involved document and make it more inclusive in wording.

2019-10-23 Thread Manuel Klimek via Phabricator via cfe-commits
klimek added inline comments.



Comment at: clang/www/get_involved.html:94
+
+  A complete specification: The specification must be sufficient to
+  understand the design of the feature as well as interpret the meaning of

Remove "complete" - the explanation afterwards seems sufficient?



Comment at: clang/www/get_involved.html:107
+
+  A long-term support plan: Contributing a non-trivial extension to Clang
+  implies a commitment to supporting that extension, improving the

s/non-trivial/substantial/?



Comment at: clang/www/get_involved.html:108-109
+  A long-term support plan: Contributing a non-trivial extension to Clang
+  implies a commitment to supporting that extension, improving the
+  implementation and specification as Clang evolves. The capacity of the
+  contributor to make that commitment is as important as the commitment

s/,/and/?



Comment at: clang/www/get_involved.html:113
+
+  A high-quality implementation: The implementation must fit well into
+  Clang's architecture, follow LLVM's coding conventions, and meet Clang's

Perhaps: a standard-conforming implementation?



Comment at: clang/www/get_involved.html:115-116
+  Clang's architecture, follow LLVM's coding conventions, and meet Clang's
+  quality standards, including high-quality diagnostics and rich AST
+  representations. This is particularly important for language extensions,
+  because users will learn how those extensions work through the behavior of 
the

"high-quality diagnostics" is a bit unclear
"rich AST" that seems to imply a lot of domain knowledge




Comment at: clang/www/get_involved.html:120
+
+  A proper test suite: Extensive testing is crucial to ensure that the
+  language extension is not broken by ongoing maintenance in Clang. The test

s/proper//


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69351/new/

https://reviews.llvm.org/D69351



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D68969: [clang-format] Remove the dependency on frontend

2019-10-21 Thread Manuel Klimek via Phabricator via cfe-commits
klimek accepted this revision.
klimek added a comment.
This revision is now accepted and ready to land.

lg




Comment at: clang/tools/clang-format/ClangFormat.cpp:356
+
+  StringRef Line(StartBuf, (EndBuf - StartBuf) - 1);
+

- 1 is to exclude the \n I'd assume?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68969/new/

https://reviews.llvm.org/D68969



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   3   4   5   6   >