[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-20 Thread Arthur Eubanks via Phabricator via cfe-commits
aeubanks added a comment.

In D154093#4648986 , @aeubanks wrote:

> In D154093#4648931 , @eaeltsin 
> wrote:
>
>> This might have another issue with Verilog -
>>
>>   < import "AAA-BBB" foo bar baz
>>   ---
>>   > import {"AAA-",
>>   > "BBB"} foo bar baz
>>
>> I wonder if Verilog allows breaking strings in `import`?
>
> From the spec:
> `dpi_spec_string ::= "DPI-C" | "DPI"`
> Seems like it can't.

Sent out https://github.com/llvm/llvm-project/pull/66951


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-20 Thread Arthur Eubanks via Phabricator via cfe-commits
aeubanks added a comment.

In D154093#4648931 , @eaeltsin wrote:

> This might have another issue with Verilog -
>
>   < import "AAA-BBB" foo bar baz
>   ---
>   > import {"AAA-",
>   > "BBB"} foo bar baz
>
> I wonder if Verilog allows breaking strings in `import`?

From the spec:
`dpi_spec_string ::= "DPI-C" | "DPI"`
Seems like it can't.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-20 Thread Evgeny Eltsin via Phabricator via cfe-commits
eaeltsin added a comment.

This might have another issue with Verilog -

  < import "AAA-BBB" foo bar baz
  ---
  > import {"AAA-",
  > "BBB"} foo bar baz

I wonder if Verilog allows breaking strings in `import`?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-14 Thread Evgeny Eltsin via Phabricator via cfe-commits
eaeltsin added a comment.

This still misses contexts when string literal is required, for example 
https://github.com/search?q=repo%3Agoogle%2Fclosure-compiler%20%22%20must%20be%20a%20string%20literal%22=code

I wonder, if splitting the literal with `+` is a good option at all.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-12 Thread sstwcw via Phabricator via cfe-commits
sstwcw added a comment.

In D154093#4644727 , @alexfh wrote:

> This patch makes clang-format produce invalid JS/TS code
>
> In D154093#4644512 , @eaeltsin 
> wrote:
>
>> This introduces an invalid TS transformation, from
>>
>>   type x = 'ab';
>>
>> to
>>
>>   type x = ('a'+'b');
>>
>> Which doesn't compile - 
>> https://www.typescriptlang.org/play?#code/C4TwDgpgBAHlC8UDkBDARkg3AKG6SUICUAFKklANTIYCUOQA
>>
>> Should we revert?
>
> Similarly, strings in dictionary literals can't be broken: `var w = {'a' + 
> 'b': 0};` (https://gcc.godbolt.org/z/6renEv5sa). Please fix or revert.

Here is the fix. https://github.com/llvm/llvm-project/pull/66168


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-12 Thread Alexander Kornienko via Phabricator via cfe-commits
alexfh added a comment.

This patch makes clang-format produce invalid JS/TS code

In D154093#4644512 , @eaeltsin wrote:

> This introduces an invalid TS transformation, from
>
>   type x = 'ab';
>
> to
>
>   type x = ('a'+'b');
>
> Which doesn't compile - 
> https://www.typescriptlang.org/play?#code/C4TwDgpgBAHlC8UDkBDARkg3AKG6SUICUAFKklANTIYCUOQA
>
> Should we revert?

Similarly, strings in dictionary literals can't be broken: `var w = {'a' + 'b': 
0};` (https://gcc.godbolt.org/z/6renEv5sa). Please fix or revert.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-12 Thread Evgeny Eltsin via Phabricator via cfe-commits
eaeltsin added a comment.

This introduces an invalid TS transformation, from

  type x = 'ab';

to

  type x = ('a'+'b');


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-09-04 Thread sstwcw via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGddc80637ccbc: [clang-format] Break long string literals in 
C#, etc. (authored by sstwcw).

Changed prior to commit:
  https://reviews.llvm.org/D154093?vs=553729=555795#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

Files:
  clang/docs/ClangFormatStyleOptions.rst
  clang/include/clang/Format/Format.h
  clang/lib/Format/BreakableToken.cpp
  clang/lib/Format/BreakableToken.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTestCSharp.cpp
  clang/unittests/Format/FormatTestJS.cpp
  clang/unittests/Format/FormatTestJava.cpp
  clang/unittests/Format/FormatTestVerilog.cpp
  clang/unittests/Format/TokenAnnotatorTest.cpp

Index: clang/unittests/Format/TokenAnnotatorTest.cpp
===
--- clang/unittests/Format/TokenAnnotatorTest.cpp
+++ clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1868,6 +1868,30 @@
   EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_Unknown);
   EXPECT_TOKEN(Tokens[5], tok::comma, TT_Unknown);
   EXPECT_TOKEN(Tokens[8], tok::r_paren, TT_Unknown);
+
+  // String literals in concatenation.
+  Tokens = Annotate("x = {\"\"};");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = {\"\", \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = '{{\"\"}};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  // Cases where the string should not be annotated that type.  Fix the
+  // `TT_Unknown` if needed in the future.
+  Tokens = Annotate("x = {\"\" == \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_Unknown);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = {(\"\")};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = '{\"\"};");
+  ASSERT_EQ(Tokens.size(), 8u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandConstructors) {
Index: clang/unittests/Format/FormatTestVerilog.cpp
===
--- clang/unittests/Format/FormatTestVerilog.cpp
+++ clang/unittests/Format/FormatTestVerilog.cpp
@@ -1157,6 +1157,66 @@
   verifyFormat("{< Ranges(1, tooling::Range(Offset, Length));
-tooling::Replacements Replaces = reformat(Style, Code, Ranges);
-auto Result = applyAllReplacements(Code, Replaces);
-EXPECT_TRUE(static_cast(Result));
-LLVM_DEBUG(llvm::errs() << "\n" << *Result << "\n\n");
-return *Result;
-  }
-
-  static std::string
-  format(llvm::StringRef Code,
- const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-return format(Code, 0, Code.size(), Style);
+  FormatStyle getDefaultStyle() const override {
+return getGoogleStyle(FormatStyle::LK_Java);
   }
 
   static FormatStyle getStyleWithColumns(unsigned ColumnLimit) {
@@ -41,13 +26,6 @@
 Style.ColumnLimit = ColumnLimit;
 return Style;
   }
-
-  static void verifyFormat(
-  llvm::StringRef Code,
-  const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-EXPECT_EQ(Code.str(), format(Code, Style)) << "Expected code is not stable";
-EXPECT_EQ(Code.str(), format(test::messUp(Code), Style));
-  }
 };
 
 TEST_F(FormatTestJava, NoAlternativeOperatorNames) {
@@ -565,9 +543,9 @@
 }
 
 TEST_F(FormatTestJava, BreaksStringLiterals) {
-  // FIXME: String literal breaking is currently disabled for Java and JS, as it
-  // requires strings to be merged using "+" which we don't support.
-  verifyFormat("\"some text other\";", getStyleWithColumns(14));
+  verifyFormat("x = \"some text \"\n"
+   "+ \"other\";",
+   "x = \"some text other\";", getStyleWithColumns(18));
 }
 
 TEST_F(FormatTestJava, AlignsBlockComments) {
@@ -625,5 +603,7 @@
Style);
 }
 
+} // namespace
+} // namespace test
 } // namespace format
-} // end namespace clang
+} // namespace clang
Index: clang/unittests/Format/FormatTestJS.cpp
===
--- clang/unittests/Format/FormatTestJS.cpp
+++ clang/unittests/Format/FormatTestJS.cpp
@@ -1505,6 +1505,97 @@
 TEST_F(FormatTestJS, StringLiteralConcatenation) {
   verifyFormat("var literal = 'hello ' +\n"
"'world';");
+
+  // Long strings should be broken.
+  

[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-26 Thread sstwcw via Phabricator via cfe-commits
sstwcw updated this revision to Diff 553729.
sstwcw added a comment.

Stop using `SmallString` in the tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

Files:
  clang/docs/ClangFormatStyleOptions.rst
  clang/include/clang/Format/Format.h
  clang/lib/Format/BreakableToken.cpp
  clang/lib/Format/BreakableToken.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTestCSharp.cpp
  clang/unittests/Format/FormatTestJS.cpp
  clang/unittests/Format/FormatTestJava.cpp
  clang/unittests/Format/FormatTestVerilog.cpp
  clang/unittests/Format/TokenAnnotatorTest.cpp

Index: clang/unittests/Format/TokenAnnotatorTest.cpp
===
--- clang/unittests/Format/TokenAnnotatorTest.cpp
+++ clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1863,6 +1863,30 @@
   EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_Unknown);
   EXPECT_TOKEN(Tokens[5], tok::comma, TT_Unknown);
   EXPECT_TOKEN(Tokens[8], tok::r_paren, TT_Unknown);
+
+  // String literals in concatenation.
+  Tokens = Annotate("x = {\"\"};");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = {\"\", \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = '{{\"\"}};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  // Cases where the string should not be annotated that type.  Fix the
+  // `TT_Unknown` if needed in the future.
+  Tokens = Annotate("x = {\"\" == \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_Unknown);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = {(\"\")};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = '{\"\"};");
+  ASSERT_EQ(Tokens.size(), 8u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandConstructors) {
Index: clang/unittests/Format/FormatTestVerilog.cpp
===
--- clang/unittests/Format/FormatTestVerilog.cpp
+++ clang/unittests/Format/FormatTestVerilog.cpp
@@ -1157,6 +1157,66 @@
   verifyFormat("{< Ranges(1, tooling::Range(Offset, Length));
-tooling::Replacements Replaces = reformat(Style, Code, Ranges);
-auto Result = applyAllReplacements(Code, Replaces);
-EXPECT_TRUE(static_cast(Result));
-LLVM_DEBUG(llvm::errs() << "\n" << *Result << "\n\n");
-return *Result;
-  }
-
-  static std::string
-  format(llvm::StringRef Code,
- const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-return format(Code, 0, Code.size(), Style);
+  FormatStyle getDefaultStyle() const override {
+return getGoogleStyle(FormatStyle::LK_Java);
   }
 
   static FormatStyle getStyleWithColumns(unsigned ColumnLimit) {
@@ -41,13 +26,6 @@
 Style.ColumnLimit = ColumnLimit;
 return Style;
   }
-
-  static void verifyFormat(
-  llvm::StringRef Code,
-  const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-EXPECT_EQ(Code.str(), format(Code, Style)) << "Expected code is not stable";
-EXPECT_EQ(Code.str(), format(test::messUp(Code), Style));
-  }
 };
 
 TEST_F(FormatTestJava, NoAlternativeOperatorNames) {
@@ -565,9 +543,9 @@
 }
 
 TEST_F(FormatTestJava, BreaksStringLiterals) {
-  // FIXME: String literal breaking is currently disabled for Java and JS, as it
-  // requires strings to be merged using "+" which we don't support.
-  verifyFormat("\"some text other\";", getStyleWithColumns(14));
+  verifyFormat("x = \"some text \"\n"
+   "+ \"other\";",
+   "x = \"some text other\";", getStyleWithColumns(18));
 }
 
 TEST_F(FormatTestJava, AlignsBlockComments) {
@@ -625,5 +603,7 @@
Style);
 }
 
+} // namespace
+} // namespace test
 } // namespace format
-} // end namespace clang
+} // namespace clang
Index: clang/unittests/Format/FormatTestJS.cpp
===
--- clang/unittests/Format/FormatTestJS.cpp
+++ clang/unittests/Format/FormatTestJS.cpp
@@ -1505,6 +1505,97 @@
 TEST_F(FormatTestJS, StringLiteralConcatenation) {
   verifyFormat("var literal = 'hello ' +\n"
"'world';");
+
+  // Long strings should be broken.
+  verifyFormat("var literal =\n"
+   "' ' +\n"
+   "'';",
+   "var literal = ' ';",
+

[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-24 Thread Owen Pan via Phabricator via cfe-commits
owenpan added a comment.

In D154093#4615948 , @sstwcw wrote:

> In the JavaScript tests that I added, it was wrong to use `SmallString`.  
> Would you prefer me changing it to `string` or expanding the 6 test cases so 
> we don't need a variable for the string?

You mean `std::string`? That would be fine by me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-24 Thread sstwcw via Phabricator via cfe-commits
sstwcw added a comment.

In the JavaScript tests that I added, it was wrong to use `SmallString`.  Would 
you prefer me changing it to `string` or expanding the 6 test cases so we don't 
need a variable for the string?

Shortly after I committed this patch, the server builds caught the problem in 
the JavaScript tests.  When the program reads a file, a null byte is added at 
the end.  The program relies on this property because it means it doesn't have 
to do a bound check on the source string, including in some of the code I wrote 
though this time it was another piece of code that caught the problem.  Also 
the null byte is not copied when a SmallString is initialized from a `const 
char*`.  A null byte will be added if the `c_str` method is used at least once. 
 But it was not the case here.  So my test ended up passing a string which does 
not have a null byte at the end.  I didn't catch the problem when I ran the 
tests myself because ironically I had the address sanitizer enabled.  The 
address sanitizer didn't catch the problem because the byte past the end of the 
string in the `SmallString` is the struct padding which is still inside the 
object as far as the address sanitizer is concerned.  However the address 
sanitizer likes to add padding between variables.  So now the uninitialized 
part of the struct padding is the padding null bytes added by the address 
sanitizer instead of the variable that used to be there.  So the assertion that 
the byte is null now holds.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-24 Thread David Spickett via Phabricator via cfe-commits
DavidSpickett added a comment.

This was causing failures on a large number of Linaro's bots. Judging by how 
widespread it was, reproducing could be as simple as adding 
`-DLLVM_ENABLE_ASSERTIONS=True` to your CMake config.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-24 Thread Owen Pan via Phabricator via cfe-commits
owenpan added a comment.

@sstwcw this patch caused a crash in `FormatTestJS.StringLiteralConcatenation`:

  Assertion failed: ((!RequiresNullTerminator || BufEnd[0] == 0) && "Buffer is 
not null terminated!"), function init, file MemoryBuffer.cpp, line 54.

Can you revert it if you can reproduce the crash?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-23 Thread sstwcw via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG16ccba51072b: [clang-format] Break long string literals in 
C#, etc. (authored by sstwcw).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

Files:
  clang/docs/ClangFormatStyleOptions.rst
  clang/include/clang/Format/Format.h
  clang/lib/Format/BreakableToken.cpp
  clang/lib/Format/BreakableToken.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTestCSharp.cpp
  clang/unittests/Format/FormatTestJS.cpp
  clang/unittests/Format/FormatTestJava.cpp
  clang/unittests/Format/FormatTestVerilog.cpp
  clang/unittests/Format/TokenAnnotatorTest.cpp

Index: clang/unittests/Format/TokenAnnotatorTest.cpp
===
--- clang/unittests/Format/TokenAnnotatorTest.cpp
+++ clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1815,6 +1815,30 @@
   EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_Unknown);
   EXPECT_TOKEN(Tokens[5], tok::comma, TT_Unknown);
   EXPECT_TOKEN(Tokens[8], tok::r_paren, TT_Unknown);
+
+  // String literals in concatenation.
+  Tokens = Annotate("x = {\"\"};");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = {\"\", \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = '{{\"\"}};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  // Cases where the string should not be annotated that type.  Fix the
+  // `TT_Unknown` if needed in the future.
+  Tokens = Annotate("x = {\"\" == \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_Unknown);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = {(\"\")};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = '{\"\"};");
+  ASSERT_EQ(Tokens.size(), 8u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandConstructors) {
Index: clang/unittests/Format/FormatTestVerilog.cpp
===
--- clang/unittests/Format/FormatTestVerilog.cpp
+++ clang/unittests/Format/FormatTestVerilog.cpp
@@ -1157,6 +1157,66 @@
   verifyFormat("{< Ranges(1, tooling::Range(Offset, Length));
-tooling::Replacements Replaces = reformat(Style, Code, Ranges);
-auto Result = applyAllReplacements(Code, Replaces);
-EXPECT_TRUE(static_cast(Result));
-LLVM_DEBUG(llvm::errs() << "\n" << *Result << "\n\n");
-return *Result;
-  }
-
-  static std::string
-  format(llvm::StringRef Code,
- const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-return format(Code, 0, Code.size(), Style);
+  FormatStyle getDefaultStyle() const override {
+return getGoogleStyle(FormatStyle::LK_Java);
   }
 
   static FormatStyle getStyleWithColumns(unsigned ColumnLimit) {
@@ -41,13 +26,6 @@
 Style.ColumnLimit = ColumnLimit;
 return Style;
   }
-
-  static void verifyFormat(
-  llvm::StringRef Code,
-  const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-EXPECT_EQ(Code.str(), format(Code, Style)) << "Expected code is not stable";
-EXPECT_EQ(Code.str(), format(test::messUp(Code), Style));
-  }
 };
 
 TEST_F(FormatTestJava, NoAlternativeOperatorNames) {
@@ -565,9 +543,9 @@
 }
 
 TEST_F(FormatTestJava, BreaksStringLiterals) {
-  // FIXME: String literal breaking is currently disabled for Java and JS, as it
-  // requires strings to be merged using "+" which we don't support.
-  verifyFormat("\"some text other\";", getStyleWithColumns(14));
+  verifyFormat("x = \"some text \"\n"
+   "+ \"other\";",
+   "x = \"some text other\";", getStyleWithColumns(18));
 }
 
 TEST_F(FormatTestJava, AlignsBlockComments) {
@@ -625,5 +603,7 @@
Style);
 }
 
+} // namespace
+} // namespace test
 } // namespace format
-} // end namespace clang
+} // namespace clang
Index: clang/unittests/Format/FormatTestJS.cpp
===
--- clang/unittests/Format/FormatTestJS.cpp
+++ clang/unittests/Format/FormatTestJS.cpp
@@ -1505,6 +1505,97 @@
 TEST_F(FormatTestJS, StringLiteralConcatenation) {
   verifyFormat("var literal = 'hello ' +\n"
"'world';");
+
+  // Long strings should be broken.
+  verifyFormat("var literal =\n"
+ 

[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-08 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay accepted this revision.
MyDeveloperDay added a comment.

I'm ok with this if @owenpan is


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-07 Thread sstwcw via Phabricator via cfe-commits
sstwcw updated this revision to Diff 547770.
sstwcw added a comment.

- Add tests for code in JavaScript template strings


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

Files:
  clang/docs/ClangFormatStyleOptions.rst
  clang/include/clang/Format/Format.h
  clang/lib/Format/BreakableToken.cpp
  clang/lib/Format/BreakableToken.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTestCSharp.cpp
  clang/unittests/Format/FormatTestJS.cpp
  clang/unittests/Format/FormatTestJava.cpp
  clang/unittests/Format/FormatTestVerilog.cpp
  clang/unittests/Format/TokenAnnotatorTest.cpp

Index: clang/unittests/Format/TokenAnnotatorTest.cpp
===
--- clang/unittests/Format/TokenAnnotatorTest.cpp
+++ clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1815,6 +1815,30 @@
   EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_Unknown);
   EXPECT_TOKEN(Tokens[5], tok::comma, TT_Unknown);
   EXPECT_TOKEN(Tokens[8], tok::r_paren, TT_Unknown);
+
+  // String literals in concatenation.
+  Tokens = Annotate("x = {\"\"};");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = {\"\", \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = '{{\"\"}};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  // Cases where the string should not be annotated that type.  Fix the
+  // `TT_Unknown` if needed in the future.
+  Tokens = Annotate("x = {\"\" == \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_Unknown);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = {(\"\")};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = '{\"\"};");
+  ASSERT_EQ(Tokens.size(), 8u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandConstructors) {
Index: clang/unittests/Format/FormatTestVerilog.cpp
===
--- clang/unittests/Format/FormatTestVerilog.cpp
+++ clang/unittests/Format/FormatTestVerilog.cpp
@@ -1157,6 +1157,66 @@
   verifyFormat("{< Ranges(1, tooling::Range(Offset, Length));
-tooling::Replacements Replaces = reformat(Style, Code, Ranges);
-auto Result = applyAllReplacements(Code, Replaces);
-EXPECT_TRUE(static_cast(Result));
-LLVM_DEBUG(llvm::errs() << "\n" << *Result << "\n\n");
-return *Result;
-  }
-
-  static std::string
-  format(llvm::StringRef Code,
- const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-return format(Code, 0, Code.size(), Style);
+  FormatStyle getDefaultStyle() const override {
+return getGoogleStyle(FormatStyle::LK_Java);
   }
 
   static FormatStyle getStyleWithColumns(unsigned ColumnLimit) {
@@ -41,13 +26,6 @@
 Style.ColumnLimit = ColumnLimit;
 return Style;
   }
-
-  static void verifyFormat(
-  llvm::StringRef Code,
-  const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-EXPECT_EQ(Code.str(), format(Code, Style)) << "Expected code is not stable";
-EXPECT_EQ(Code.str(), format(test::messUp(Code), Style));
-  }
 };
 
 TEST_F(FormatTestJava, NoAlternativeOperatorNames) {
@@ -565,9 +543,9 @@
 }
 
 TEST_F(FormatTestJava, BreaksStringLiterals) {
-  // FIXME: String literal breaking is currently disabled for Java and JS, as it
-  // requires strings to be merged using "+" which we don't support.
-  verifyFormat("\"some text other\";", getStyleWithColumns(14));
+  verifyFormat("x = \"some text \"\n"
+   "+ \"other\";",
+   "x = \"some text other\";", getStyleWithColumns(18));
 }
 
 TEST_F(FormatTestJava, AlignsBlockComments) {
@@ -625,5 +603,7 @@
Style);
 }
 
+} // namespace
+} // namespace test
 } // namespace format
-} // end namespace clang
+} // namespace clang
Index: clang/unittests/Format/FormatTestJS.cpp
===
--- clang/unittests/Format/FormatTestJS.cpp
+++ clang/unittests/Format/FormatTestJS.cpp
@@ -1505,6 +1505,97 @@
 TEST_F(FormatTestJS, StringLiteralConcatenation) {
   verifyFormat("var literal = 'hello ' +\n"
"'world';");
+
+  // Long strings should be broken.
+  verifyFormat("var literal =\n"
+   "' ' +\n"
+   "'';",
+   "var literal = ' ';",

[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-07 Thread sstwcw via Phabricator via cfe-commits
sstwcw added inline comments.



Comment at: clang/unittests/Format/FormatTestCSharp.cpp:221
 TEST_F(FormatTestCSharp, CSharpFatArrows) {
-  verifyFormat("Task serverTask = Task.Run(async() => {");
+  verifyIncompleteFormat("Task serverTask = Task.Run(async() => {");
   verifyFormat("public override string ToString() => \"{Name}\\{Age}\";");

MyDeveloperDay wrote:
> not sure what changing here and why?
The brace is not paired.  The old `verifyFormat` function didn't care about it. 
 In `FormatTestBase.h` there is the function `verifyFormat` which requires the 
code to be complete and `verifyIncompleteFormat` which doesn't.



Comment at: clang/unittests/Format/FormatTestJS.cpp:1575
+   "/ /;",
+   getGoogleJSStyleWithColumns(18));
 }

MyDeveloperDay wrote:
> out of interest what happens with  `xx ${variable} 
> xx` type strings? can we add a test to show we won't break in 
> the middle of a ${vas} even though the column limit might be reached?
There can be a break inside the braces.  I added a test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-07 Thread sstwcw via Phabricator via cfe-commits
sstwcw updated this revision to Diff 547767.
sstwcw marked 2 inline comments as done.
sstwcw added a comment.

- Add tests for code in JavaScript template strings


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

Files:
  clang/docs/ClangFormatStyleOptions.rst
  clang/include/clang/Format/Format.h
  clang/lib/Format/BreakableToken.cpp
  clang/lib/Format/BreakableToken.h
  clang/lib/Format/ContinuationIndenter.cpp
  clang/lib/Format/FormatToken.h
  clang/lib/Format/TokenAnnotator.cpp
  clang/lib/Format/WhitespaceManager.cpp
  clang/unittests/Format/FormatTestCSharp.cpp
  clang/unittests/Format/FormatTestJS.cpp
  clang/unittests/Format/FormatTestJava.cpp
  clang/unittests/Format/FormatTestVerilog.cpp
  clang/unittests/Format/TokenAnnotatorTest.cpp

Index: clang/unittests/Format/TokenAnnotatorTest.cpp
===
--- clang/unittests/Format/TokenAnnotatorTest.cpp
+++ clang/unittests/Format/TokenAnnotatorTest.cpp
@@ -1815,6 +1815,30 @@
   EXPECT_TOKEN(Tokens[2], tok::l_paren, TT_Unknown);
   EXPECT_TOKEN(Tokens[5], tok::comma, TT_Unknown);
   EXPECT_TOKEN(Tokens[8], tok::r_paren, TT_Unknown);
+
+  // String literals in concatenation.
+  Tokens = Annotate("x = {\"\"};");
+  ASSERT_EQ(Tokens.size(), 7u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = {\"\", \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_StringInConcatenation);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  Tokens = Annotate("x = '{{\"\"}};");
+  ASSERT_EQ(Tokens.size(), 10u) << Tokens;
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_StringInConcatenation);
+  // Cases where the string should not be annotated that type.  Fix the
+  // `TT_Unknown` if needed in the future.
+  Tokens = Annotate("x = {\"\" == \"\"};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[3], tok::string_literal, TT_Unknown);
+  EXPECT_TOKEN(Tokens[5], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = {(\"\")};");
+  ASSERT_EQ(Tokens.size(), 9u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
+  Tokens = Annotate("x = '{\"\"};");
+  ASSERT_EQ(Tokens.size(), 8u) << Tokens;
+  EXPECT_TOKEN(Tokens[4], tok::string_literal, TT_Unknown);
 }
 
 TEST_F(TokenAnnotatorTest, UnderstandConstructors) {
Index: clang/unittests/Format/FormatTestVerilog.cpp
===
--- clang/unittests/Format/FormatTestVerilog.cpp
+++ clang/unittests/Format/FormatTestVerilog.cpp
@@ -1157,6 +1157,66 @@
   verifyFormat("{< Ranges(1, tooling::Range(Offset, Length));
-tooling::Replacements Replaces = reformat(Style, Code, Ranges);
-auto Result = applyAllReplacements(Code, Replaces);
-EXPECT_TRUE(static_cast(Result));
-LLVM_DEBUG(llvm::errs() << "\n" << *Result << "\n\n");
-return *Result;
-  }
-
-  static std::string
-  format(llvm::StringRef Code,
- const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-return format(Code, 0, Code.size(), Style);
+  FormatStyle getDefaultStyle() const override {
+return getGoogleStyle(FormatStyle::LK_Java);
   }
 
   static FormatStyle getStyleWithColumns(unsigned ColumnLimit) {
@@ -41,13 +26,6 @@
 Style.ColumnLimit = ColumnLimit;
 return Style;
   }
-
-  static void verifyFormat(
-  llvm::StringRef Code,
-  const FormatStyle  = getGoogleStyle(FormatStyle::LK_Java)) {
-EXPECT_EQ(Code.str(), format(Code, Style)) << "Expected code is not stable";
-EXPECT_EQ(Code.str(), format(test::messUp(Code), Style));
-  }
 };
 
 TEST_F(FormatTestJava, NoAlternativeOperatorNames) {
@@ -565,9 +543,9 @@
 }
 
 TEST_F(FormatTestJava, BreaksStringLiterals) {
-  // FIXME: String literal breaking is currently disabled for Java and JS, as it
-  // requires strings to be merged using "+" which we don't support.
-  verifyFormat("\"some text other\";", getStyleWithColumns(14));
+  verifyFormat("x = \"some text \"\n"
+   "+ \"other\";",
+   "x = \"some text other\";", getStyleWithColumns(18));
 }
 
 TEST_F(FormatTestJava, AlignsBlockComments) {
@@ -625,5 +603,7 @@
Style);
 }
 
+} // namespace
+} // namespace test
 } // namespace format
-} // end namespace clang
+} // namespace clang
Index: clang/unittests/Format/FormatTestJS.cpp
===
--- clang/unittests/Format/FormatTestJS.cpp
+++ clang/unittests/Format/FormatTestJS.cpp
@@ -1505,6 +1505,97 @@
 TEST_F(FormatTestJS, StringLiteralConcatenation) {
   verifyFormat("var literal = 'hello ' +\n"
"'world';");
+
+  // Long strings should be broken.
+  verifyFormat("var literal =\n"
+   "' ' +\n"
+   "'';",
+

[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-07 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay added inline comments.



Comment at: clang/unittests/Format/FormatTestCSharp.cpp:221
 TEST_F(FormatTestCSharp, CSharpFatArrows) {
-  verifyFormat("Task serverTask = Task.Run(async() => {");
+  verifyIncompleteFormat("Task serverTask = Task.Run(async() => {");
   verifyFormat("public override string ToString() => \"{Name}\\{Age}\";");

not sure what changing here and why?



Comment at: clang/unittests/Format/FormatTestJS.cpp:1575
+   "/ /;",
+   getGoogleJSStyleWithColumns(18));
 }

out of interest what happens with  `xx ${variable} 
xx` type strings? can we add a test to show we won't break in the 
middle of a ${vas} even though the column limit might be reached?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-06 Thread sstwcw via Phabricator via cfe-commits
sstwcw added a comment.

@MyDeveloperDay Can you take a look at the last test in 
FormatTestCSharp.Attributes?  I tried to enable breaking string literals by 
default.  The long string got broken.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D154093: [clang-format] Break long string literals in C#, etc.

2023-08-06 Thread sstwcw via Phabricator via cfe-commits
sstwcw added a comment.

In D154093#4544496 , @owenpan wrote:

> In D154093#4542339 , @sstwcw wrote:
>
>> @owenpan What do you think about this revision especially the replacement 
>> part?
>
> See D154093#4544495 . Then we can 
> extend it by using `,` instead of `+` for Verilog (plus inserting a pair of 
> braces).

I added support for that.  See the updated description.  It turned out
that because inserting plus signs may involve inserting parentheses, all
the replacement stuff are still needed.  So I didn't split the Verilog
stuff into a separate patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D154093/new/

https://reviews.llvm.org/D154093

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits