On Wednesday, 9 October 2013 at 03:55:42 UTC, Andrei Alexandrescu wrote:
On 10/8/13 6:26 PM, Walter Bright wrote:
On 10/4/2013 5:24 PM, Andrei Alexandrescu wrote:
[...]
Some points:
1. This is a replacement for the switch statement starting at around
line 505 in advance()
https://github.com/Hackerpilot/phobos/blob/9bdb7f97bb8021f3b0d0291896b8fe21a6fead23/std/d/lexer.d
It is not a replacement for the rest of the lexer.
2. Instead of explicit token type enums, such as:
mod, /// %
it would just be referred to as:
tok!"%"
Andrei pointed out to me that he has fixed the latter so it resolves
to a small integer - meaning it works efficiently as cases in switch
statements. This removes my primary objection to it.
3. This level of abstraction combined with efficient generation
cannot be currently done in any other language. Hence, it makes for a
sweet showcase of what D can do.
Hence, I think we ought to adapt Brian's lexer by replacing the
switch with Andrei's trie searcher, and replacing the enum TokenType
with the tok!"string" syntax.
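To illustrate point 2, here is a minimal sketch of how a tok!"..." template could map token spellings to small integers that are usable as case labels. It is not Andrei's actual implementation: the names tokenTable, TokId, tokenIndex and describe are hypothetical, and the table holds only a handful of tokens. It should build with something like `dmd -unittest -main`.

```d
// Hypothetical token table; a real lexer would list every operator
// and keyword of the language. Index 0 is reserved for "no token".
static immutable string[] tokenTable = ["", "%", "%=", "+", "++", "+="];

alias TokId = ubyte;

// Evaluated at compile time (CTFE): position of `symbol` in the table.
// An unknown spelling trips the assert and becomes a compile error.
TokId tokenIndex(string symbol)
{
    foreach (i, s; tokenTable)
        if (s == symbol)
            return cast(TokId) i;
    assert(0, "unknown token: " ~ symbol);
}

// tok!"%" is a compile-time constant of type TokId.
enum TokId tok(string symbol) = tokenIndex(symbol);

// Because tok!"..." is a small integer constant, it can be used
// directly as a case label.
string describe(TokId id)
{
    switch (id)
    {
        case tok!"%":  return "modulo";
        case tok!"%=": return "modulo-assign";
        case tok!"+":  return "plus";
        default:       return "other";
    }
}

unittest
{
    static assert(tok!"%" == 1); // resolved entirely at compile time
    assert(describe(tok!"%=") == "modulo-assign");
    assert(describe(tok!"++") == "other");
}
```

A misspelled token such as tok!"%%" would be rejected at compile time in this sketch, which is one advantage over comparing strings at run time.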
Thanks, that's exactly what I had in mind. Also the trie
searcher should be exposed by the library so people can
implement other languages.
Let me make another, more strategic, point. Projects like Rust
and Go have dozens of people getting paid to work on them. In
the time it takes us to crank one conventional lexer/parser for
a language, they can crank five. The answer is we can't win
with a conventional approach. We must leverage D's strengths to
amplify our speed of execution, and in this context an
integrated generic lexer generator is the ticket.
There is one thing I neglected to mention, and I apologize for
that. Coming with this all on the eve of voting must be quite
demotivating for Brian, who's been through all the arduous
steps to get his work to production quality. I hope the
compensating factor is that the proposed change is a net
positive for the greater good.
Andrei
Overall, I think this is going in the right direction. However,
there is one thing I don't like about that design.
When you go through the big switch of death, you match the
beginning of the string and then you go back to a function that
will test where it came from and act accordingly. That is
kind of wasteful.
What SDC does is call a function template with the part
matched by the big switch of death passed as a template argument.
The nice thing about it is that it is easy to transform this
compile-time argument into a runtime one by simply forwarding it
(which is what is done to parse an identifier that begins with a
keyword, for instance).
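A minimal sketch of that idea, under stated assumptions: this is not SDC's actual code, and Token, lexIdentifier, lexMatched and lexOne are hypothetical names. The hand-written switch in lexOne stands in for the generated "big switch of death", and only "%", "%=" and "if" are recognised.

```d
import std.ascii : isAlphaNum;

struct Token { string kind; string text; }

bool isIdentChar(char c) { return isAlphaNum(c) || c == '_'; }

// Runtime entry point: the first `prefixLen` characters of `src` are
// already known to belong to the identifier.
Token lexIdentifier(string src, size_t prefixLen)
{
    size_t end = prefixLen;
    while (end < src.length && isIdentChar(src[end]))
        ++end;
    return Token("identifier", src[0 .. end]);
}

// Compile-time entry point: `matched` is what the switch recognised,
// so there is no need to re-test it at run time.
Token lexMatched(string matched)(string src)
{
    enum isKeyword = matched == "if";
    static if (isKeyword)
    {
        // A keyword followed by an identifier character is really an
        // identifier: forward the compile-time prefix as a runtime length.
        if (src.length > matched.length && isIdentChar(src[matched.length]))
            return lexIdentifier(src, matched.length);
        return Token("keyword", matched);
    }
    else
        return Token("operator", matched);
}

// Stand-in for the generated switch over token prefixes.
Token lexOne(string src)
{
    assert(src.length > 0);
    switch (src[0])
    {
        case '%':
            if (src.length > 1 && src[1] == '=')
                return lexMatched!"%="(src);
            return lexMatched!"%"(src);
        case 'i':
            if (src.length > 1 && src[1] == 'f')
                return lexMatched!"if"(src);
            return lexIdentifier(src, 1);
        default:
            // Everything else is treated as an identifier in this sketch.
            return lexIdentifier(src, 0);
    }
}

unittest
{
    assert(lexOne("if (x)").kind == "keyword");
    assert(lexOne("iffy = 1").text == "iffy");
    assert(lexOne("%= 2").text == "%=");
}
```

Each instantiation of lexMatched is specialised for its prefix, so the keyword check is resolved by static if and costs nothing at run time for operators.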