On Thursday, 12 September 2013 at 03:37:55 UTC, H. S. Teoh wrote:
I still don't understand why backtracking is necessary in the first place. I would've thought a modern parser should be well able to encode seen tokens in its state so that backtracking is never necessary. Or does D grammar have tricky bits that cannot be handled this way, that
I'm unaware of?


The problem is that it can cause a exponential (and I literally mean exponential here) amount of complexity.

In some cases, the complexity is manageable, but in other that don't make any sense (it has to be noted that even full lexing don't make any sens here). For instance :

int foo()() {}
       ^

When you are at the caret position, you don't know if you face a function declaration or a template declaration. You could go for some ambiguous parsing, but each template argument can itself be a type, an expression or a symbol.

In such situation, it is much simpler to skip input to the closing parentheses, check what's coming after and act accordingly. The alternative is to go for some ambiguous function/template parameters parsing and resolve at the end, but as template argument are themselves ambiguous type/expression/symbols, the amount of complexity in the parser is doomed to explode. Note that the example is not self contained, for instance, both template parameters and argument can imply template instantiation, which means ambiguous argument parsing.

SDC does a good deal of ambiguous parsing without backtracking (more than DMD does), but you got to draw the line somewhere.

What I'd like to see is a go to the closing token feature, that can eventually take a fast path to do so, or an efficient token buffering system (I'm not sure which feature would be the fastest, but I'm inclined to think this is the first one, however, that won't be suited for a dmd style parser).

Reply via email to