On 05.04.2016 10:46, Walter Bright wrote:
On 1/16/2016 7:13 AM, H. S. Teoh via Digitalmars-d wrote:
I disagree. I think having dmd itself (lexer, parser, etc.) as a
library (with the dmd executable merely being the default frontend) will
do D a lot of good.
For one thing, IDEs will no longer need to reinvent a D parser for the
purposes of syntax highlighting;
On the other hand, using lexer.d and parse.d as a guide to build your
own is a trivial undertaking. The Boost license is designed so this can
be done without worrying about making a derived work.
I looked into doing syntax highlighting for my editor, MicroEmacs. It
turns out it is not so easy to just use a compiler lexer/parser for it.
For one thing, the one used in the compiler is optimized for speed in a
forward pass through the text.
But a syntax highlighter in a text editor is different. Suppose I change
a character in the middle of a line. All the highlighting from that
point forward may change. And to figure out what that change is, the
parser/lexer has to start over from the beginning of the file! (Think
string literals, nested comments, quoted string literals, etc.) This
would make editing slow.
...
Also, tools might want to parse a more intuitive grammar in order to be
able to give suggestions on how to adapt the code such that DMD's parser
will accept it.
I surmised that a solution is to have each line in the editor be tagged
with a state to show the lexing state at the beginning of that line.
Then, when a character in a line changes, the lexer can be restarted
from that state, and it continues forward until the next line state
matches the new computed state. This would make it enormously faster.
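
A minimal sketch of the per-line state idea, in Python rather than D and not taken from dmd or MicroEmacs. The names lex_line, initial_states and relex are hypothetical, and the lexer is deliberately simplified: it only tracks // line comments, /* */ block comments, nesting /+ +/ comments and double-quoted strings, and it ignores character literals and D's other string forms. The nesting depth is assumed to be part of the saved state, since two lines that both start inside a /+ comment are not equivalent if their depths differ.

```python
from typing import List, Tuple

# (mode, nesting depth); the depth is only nonzero inside /+ ... +/ comments.
State = Tuple[str, int]
CODE: State = ("code", 0)


def lex_line(line: str, state: State) -> State:
    """Scan one line starting in `state` and return the state at its end."""
    mode, depth = state
    i, n = 0, len(line)
    while i < n:
        two = line[i:i + 2]
        if mode == "code":
            if two == "//":
                break  # line comment: the rest of the line cannot change the state
            if two == "/+":
                mode, depth, i = "nest", 1, i + 2
                continue
            if two == "/*":
                mode, i = "block", i + 2
                continue
            if line[i] == '"':
                mode = "string"
        elif mode == "nest":
            if two == "/+":
                depth, i = depth + 1, i + 2
                continue
            if two == "+/":
                depth, i = depth - 1, i + 2
                if depth == 0:
                    mode = "code"
                continue
        elif mode == "block":
            if two == "*/":
                mode, i = "code", i + 2
                continue
        elif mode == "string":
            if line[i] == "\\":
                i += 2  # skip the escaped character
                continue
            if line[i] == '"':
                mode = "code"
        i += 1
    return (mode, depth)


def initial_states(lines: List[str]) -> List[State]:
    """Full pass: starts[i] is the state before line i, starts[-1] the state at EOF."""
    starts = [CODE]
    for line in lines:
        starts.append(lex_line(line, starts[-1]))
    return starts


def relex(lines: List[str], starts: List[State], changed: int) -> int:
    """Update `starts` after lines[changed] was edited in place.

    Always re-scans the edited line, then continues only while the newly
    computed state differs from the stored one. Returns the number of
    lines that were re-scanned.
    """
    i = changed
    state = starts[i]
    while i < len(lines):
        state = lex_line(lines[i], state)
        i += 1
        if starts[i] == state:
            break  # the rest of the file would lex exactly as before
        starts[i] = state
    return i - changed
```

With this sketch, an edit that leaves the line's end state unchanged re-scans only that one line, while typing an unterminated /+ still forces a re-scan of every following line until the stored and computed states agree again (possibly the end of the file).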
...
Some additional care will need to be taken, e.g.
/+ <backspace> <backspace>
or even simply
/+ comment +/
But the compiler's lexer is not designed to be restartable in the middle.
Similarly, a source code formatter would be different in a different
way. It would need, for example, extra information about comments, token
start/end positions, etc. Such data collection would be irrelevant to a
compiler, and would slow it down and consume unneeded memory.