[PATCH] D75194: [clang-format] Do not merge very long C# automatic properties

2020-03-03 Thread MyDeveloperDay via Phabricator via cfe-commits
MyDeveloperDay added a comment.

@krasimir, well I learned something new, so thanks for taking the time to 
explain that to help improve our understanding.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75194/new/

https://reviews.llvm.org/D75194



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75194: [clang-format] Do not merge very long C# automatic properties

2020-03-03 Thread Krasimir Georgiev via Phabricator via cfe-commits
krasimir added a comment.

Here's some empirical ideas about the approach: In clang-format, braces can be 
handled in two quite distinct ways, controlled by BraceBlockKind 
:
 BK_Block and BK_BracedInit.
BK_Block is for braces that open blocks that usually are at one level deeper 
and consist of a sequence of statements and other constructs.
BK_BracedInit is for initializer lists, dictionaries and other similar 
syntactics that are somewhat more appropriate to put together into a line.
The level of granularity of detailed formatting in clang-format is an unwrapped 
line, which is conceptually a sequence of tokens that "make sense together as a 
single line" (without going much into style details and ignoring column 
limits). Separate statements are for example put on separate unwrapped lines. 
The formatting information flowing between unwrapped lines includes just 
higher-level data such as the current nesting depth.
The brace detection is handled primarily in 
UnwrappedLineParser::calculateBraceTypes 
,
 which happens quite early during the parsing of the input sequence.

If an opening brace is marked BK_Block there, later the stuff between it and 
the matching closing brace is parsed as a block: multiple "statements" are put 
on their own unrwapped lines, inside the block.
If the brace is marked BK_BracedInit, the staff following it is parsed more 
like dictionary-struct-array-literal stuff, and importantly is kept on the same 
unwrapped line as the surrounding code (as a braced list may occur as a 
subexpression of a larger expression, and the formatting of the larger 
expression may depend non-trivially by the formatting of the braced list).

For example, consider the following pseudo-C-family fragment:

  % cat test.cc  
  
  int f() {
block_example {
  get;
  set;
};
  
int init_list_example({1, 2, {more}}, other);
  }

If we examine the parsed unwrapped lines (`clang-format -debug`), they look 
like:

  Line(0, FSC=0): int[T=81, OC=0] identifier[T=81, OC=4] l_paren[T=81, OC=5] 
r_paren[T=81, OC=6] l_brace[T=21, OC=8] 
  Line(1, FSC=0): identifier[T=81, OC=2] l_brace[T=21, OC=16] 
  Line(2, FSC=0): identifier[T=81, OC=4] semi[T=81, OC=7] 
  Line(2, FSC=0): identifier[T=81, OC=4] semi[T=81, OC=7] 
  Line(1, FSC=0): r_brace[T=81, OC=2] semi[T=81, OC=3] 
  Line(1, FSC=0): int[T=81, OC=2] identifier[T=81, OC=6] l_paren[T=81, OC=23] 
l_brace[T=81, OC=24] numeric_constant[T=81, OC=25] comma[T=81, OC=26] 
numeric_constant[T=81, OC=28] comma[T=81, OC=29] l_brace[T=81, OC=31] 
identifier[T=81, OC=32] r_brace[T=81, OC=36] r_brace[T=81, OC=37] comma[T=81, 
OC=38] identifier[T=81, OC=40] r_paren[T=81, OC=45] semi[T=81, OC=46] 
  Line(0, FSC=0): r_brace[T=81, OC=0] 
  Line(0, FSC=0): eof[T=81, OC=0] 

The block-like thing is put on separate lines at level 2; the braced-list-like 
thing is kept on the same unwrapped line.

Currently, C# auto property get/set "blocks" are parsed as BK_Block, hence the 
need to later merge lines using a heuristic.
If you could instead mark those as BK_BracedInit in `calculateBraceTypes`, that 
will have the effect of keeping them inline with the surrounding code and might 
produce good formatting of the whole line with a few tweaks.
There's another complication: syntactically in C++, a semicolon is very 
special. For example, C/C++ requires a semicolon after class declarations. 
There's a bunch of places in clang-formatthat use the presence of semicolons to 
determine "end of statement/end of line". So the semicolons in `{ get; set; }` 
might cause a bit of trouble and throw the parser off a bit. Fortunately, 
clang-format already has some code that deals with similar complications for 
javascript, where stuff like `X<{a: string; b: number}>` is correctly handled 
as a braced list, even in the presence of semicolons inside. You can look at 
how this is handled for inspiration (I'm not 100% sure these are the only 
places in code that contribute to the formatting of these constructs):

- in calculateBraceTypes here: 
https://github.com/llvm/llvm-project/blob/9f8a7e82b85078b5afbbc44429355f156e044205/clang/lib/Format/UnwrappedLineParser.cpp#L446
- and later in parseBracedList the semicolons are specially treated here: 
https://github.com/llvm/llvm-project/blob/9f8a7e82b85078b5afbbc44429355f156e044205/clang/lib/Format/UnwrappedLineParser.cpp#L1698

I hope this is helpful, but please take it with a grain of salt as I'm not very 
familiar with those parts of clang-format and am mostly poking around it and 
looking at how similar constructs for other languages are handled.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75194/new/


[PATCH] D75194: [clang-format] Do not merge very long C# automatic properties

2020-03-02 Thread Jonathan B Coe via Phabricator via cfe-commits
jbcoe planned changes to this revision.
jbcoe added a comment.

I will try to handle this in unwrapped line parser.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75194/new/

https://reviews.llvm.org/D75194



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75194: [clang-format] Do not merge very long C# automatic properties

2020-02-28 Thread Jonathan B Coe via Phabricator via cfe-commits
jbcoe marked an inline comment as done.
jbcoe added inline comments.



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:489
+// If the Limit is broken, split the default value onto a new line and
+// indent it.
+//

krasimir wrote:
> Why not always (not just when Limit is broken) merge the whole automatic 
> property declaration into 1 line?
> Syntactically, this whole thing is a unit:
> `MyVeryLongTypeName Name { get; set } = new MyVeryLongTypeName();`
> My C# is rusty, what are the things after the `=` that are syntactically 
> allowed in general?
I think any expression is valid.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75194/new/

https://reviews.llvm.org/D75194



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75194: [clang-format] Do not merge very long C# automatic properties

2020-02-28 Thread Krasimir Georgiev via Phabricator via cfe-commits
krasimir added inline comments.



Comment at: clang/lib/Format/UnwrappedLineFormatter.cpp:489
+// If the Limit is broken, split the default value onto a new line and
+// indent it.
+//

Why not always (not just when Limit is broken) merge the whole automatic 
property declaration into 1 line?
Syntactically, this whole thing is a unit:
`MyVeryLongTypeName Name { get; set } = new MyVeryLongTypeName();`
My C# is rusty, what are the things after the `=` that are syntactically 
allowed in general?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75194/new/

https://reviews.llvm.org/D75194



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits