On Friday, 4 September 2015 at 14:14:43 UTC, Mike James wrote:
On Friday, 4 September 2015 at 14:05:09 UTC, Jonathan M Davis wrote:
On Friday, 4 September 2015 at 13:55:03 UTC, Jonathan M Davis wrote:
[...]
[snip]

[...]

Isn't it called Maximal Munch...

https://en.wikipedia.org/wiki/Maximal_munch

Regards, -<mike>-

Yes. That's how most languages split source text into tokens, but some programming languages are more willing to force formatting on you than others, even if they use maximal munch. You _can_ choose to make certain uses of whitespace illegal while still using maximal munch, since all that maximal munch does is decide whether an ambiguous sequence of characters is one token or several: the lexer always takes the longest match. It's why C++98 parsers treat the >> at the end of vector<pair<int, int>> as a shift operator rather than as the closing brackets of the two templates, and why C++11, Java, and C# have all had to _not_ use maximal munch in that particular case so that it isn't treated as the shift operator. It makes their grammars that much less context-free and is part of why D uses !() for template instantiations.
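To make the >> case concrete, here's a minimal sketch of a longest-match lexer in Python. The operator table and the tokenize function are made up for illustration; they aren't taken from any real compiler, and a real lexer handles far more than this.

```python
# Hypothetical operator table, just enough for the template example.
OPERATORS = {"<", ">", ">>", ","}


def tokenize(src):
    """Split src into tokens, always preferring the longest operator match."""
    # Try longer operators first so that ">>" wins over ">".
    ops_longest_first = sorted(OPERATORS, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            # Whitespace only separates tokens; it is never a token itself.
            i += 1
        elif ch.isalnum() or ch == "_":
            # Identifiers and numbers: consume the whole run of word characters.
            j = i
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
            tokens.append(src[i:j])
            i = j
        else:
            for op in ops_longest_first:
                if src.startswith(op, i):
                    tokens.append(op)
                    i += len(op)
                    break
            else:
                raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens


print(tokenize("vector<pair<int, int>> v"))   # ends with '>>', 'v'
print(tokenize("vector<pair<int, int> > v"))  # ends with '>', '>', 'v'
```

Without the space, the lexer hands the parser a single >> token, which is the C++98 behaviour described above. The space is the only thing that splits it into two > tokens.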

In any case, I didn't use the term maximal munch, because it describes how tokens are separated and says nothing about how you format your code (aside from the fact that you sometimes have to add whitespace to disambiguate if the grammar isn't clean enough). This discussion is really about making a particular way of formatting your code illegal (or at least having the compiler warn about it, which is essentially equivalent to making it illegal, since no one should leave warnings in their code, and -w turns all warnings into errors anyway). As far as the compiler is concerned, there is no ambiguity as to whether =+ is the same as = +: there is no =+ token, so both lex as = followed by +, and maximal munch doesn't really even come into play here.
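You can see the same thing with any tokenizer whose operator set has += but no =+. As a stand-in for D's lexer (an assumption for illustration only, since this is Python's standard tokenize module and not dmd), Python happens to have exactly that property. The helper name operator_tokens is made up.

```python
import io
import tokenize


def operator_tokens(line):
    """Return the operator tokens Python's own tokenizer finds in one line."""
    readline = io.StringIO(line + "\n").readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.OP
    ]


print(operator_tokens("x += 5"))   # ['+=']
print(operator_tokens("x =+ 5"))   # ['=', '+']
print(operator_tokens("x = + 5"))  # ['=', '+']
```

The last two lines produce identical token streams, so by the time the parser runs, the whitespace difference is already gone. Flagging =+ would have to be a separate check on the source formatting, not something the token rules give you.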

- Jonathan M Davis
