On Wed, 28 Sep 2011 22:59:45 +0200, Jonathan M Davis <[email protected]> wrote:

On Wednesday, September 28, 2011 13:43 Nick Sabalausky wrote:
"Jonathan M Davis" <[email protected]> wrote in message
news:[email protected]...

> I would point out that there is an intention to eventually get a D lexer
> and parser into Phobos so that tools can take advantage of them. Those
> could eventually lead to a frontend in D but would provide benefits far
> beyond simply having the compiler in D.

Is the interest more in a D-specific lexer/parser or a generalized one? Or
is it more of a split vote? I seem to remember interest both ways, but I
don't know whether there's any consensus among the DMD/Phobos crew.

A generalized lexer is nothing more than a regex engine that has more than
one distinct accept state (which then gets run over and over until EOF).
And the FSM is made simply by doing a combined regex "(regexForToken1 |
regexForToken2 | regexForToken3 | ... )", and then each of those parts
just gets its own accept state. Which makes me wonder...
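Something like this, say (just a toy sketch, assuming std.regex's matchFirst
and (?P<name>...) named captures; the token names and patterns are made up,
it's not anything from Phobos):

import std.regex, std.stdio;

void main()
{
    // One combined regex: each alternative is one token's regex, and whichever
    // named capture comes back non-empty is the "accept state" we ended in.
    auto tokenRe = regex(`(?P<num>\d+)|(?P<id>[A-Za-z_]\w*)|(?P<op>[-+*/=])|(?P<ws>\s+)`);

    string src = "x = 42 + y";
    while (src.length)
    {
        auto m = matchFirst(src, tokenRe);
        if (m.empty || m.pre.length)        // no token starts at the current position
        {
            writeln("lex error at: ", src);
            break;
        }
        foreach (kind; ["num", "id", "op", "ws"])
        {
            if (m[kind].length)
            {
                if (kind != "ws")
                    writeln(kind, "\t'", m[kind], "'");
                break;
            }
        }
        src = m.post;                       // run the same FSM again until EOF
    }
}

A real lexer would want maximal munch (longest match wins) rather than
alternation order deciding ties, but the shape is the same.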

There was a GSoC project to overhaul Phobos's regex engine, wasn't there?
Is that done? Is it designed in a way that the stuff above wouldn't be
real hard to add?

And what about the algorithm? Is it a Thompson NFA, i.e., does it traverse
the NFA as if it were a DFA, effectively "creating" the DFA on-the-fly? Or
does it just traverse the NFA as an NFA? Or does it create an actual DFA and
traverse that? An actual DFA would probably be best for a lexer. If a DFA,
is it an optimized DFA? In my (limited) tests, it didn't seem like
DFA-optimization would yield a notable benefit on typical
programming-language tokens. It seems to be more suited to pathological
cases.
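For reference, here's roughly what the Thompson-style "traverse the NFA as if
it were a DFA" idea looks like in code (a toy with a hard-coded NFA for
(a|b)*abb; it has nothing to do with Phobos's actual engine):

import std.algorithm, std.array, std.stdio;

// The set of NFA states still alive after each input character *is* the DFA
// state; we just compute it on the fly instead of building a table up front.
bool runNfa(string input)
{
    // NFA transitions: state -> (input char -> possible next states).
    // States 0..3, accepting state 3, recognizes (a|b)*abb.
    int[][char][int] delta = [
        0 : ['a' : [0, 1], 'b' : [0]],
        1 : ['b' : [2]],
        2 : ['b' : [3]],
    ];
    enum acceptState = 3;

    int[] current = [0];                  // start state
    foreach (c; input)
    {
        int[] next;
        foreach (s; current)
            if (auto outgoing = s in delta)
                if (auto targets = c in *outgoing)
                    next ~= *targets;
        current = sort(next).uniq.array;  // canonical "DFA state" = sorted set
    }
    return current.canFind(acceptState);
}

void main()
{
    writeln(runNfa("ababb"));  // true
    writeln(runNfa("abab"));   // false
}

Caching the computed state sets as you go is what turns this into an actual
DFA.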

There is some desire to have a lexer and parser in Phobos which basically have the same implementation as dmd (only in D instead of C++). That way, they're
very close to the actual compiler, and it's easy to port fixes and
improvements between the two.

However, we definitely also want a more general lexer/parser generator which takes advantage of D's metaprogramming capabilities. Andrei was pushing more for that and doesn't really like the idea of the other, since it would reduce the desire to produce the more general solution. So, there _is_ some dissension on the matter. But there's definitely room for both. It's just a question of
time and manpower.
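To give a rough idea of the metaprogramming angle (purely a sketch of my own,
not a proposed API): the token spec can be ordinary compile-time data, the
combined pattern gets assembled with CTFE, and std.regex's ctRegex then
generates the matcher itself at compile time.

import std.regex, std.stdio;

struct Tok { string name, re; }

// Token spec as plain compile-time data (names/patterns made up).
enum tokenSpec = [ Tok("num", `\d+`), Tok("id", `[A-Za-z_]\w*`), Tok("op", `[-+*/=]`) ];

// Assemble "(\d+)|([A-Za-z_]\w*)|([-+*/=])" via CTFE.
string combine(const Tok[] spec)
{
    string p;
    foreach (t; spec)
        p ~= (p.length ? "|" : "") ~ "(" ~ t.re ~ ")";
    return p;
}

enum pattern = combine(tokenSpec);

void main()
{
    static re = ctRegex!pattern;   // matcher generated at compile time
    foreach (m; matchAll("x1 = 42", re))
        writeln("token: ", m.hit);
}

A real generator would also tag each alternative with its token kind and emit
an enum for it, but that's the flavor of it.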

- Jonathan M Davis

What's currently missing for writing lexers/parsers is an approach to range-based file reading with lookahead. Steven seems to be working on a new stdio which tries to solve this issue.
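Roughly the kind of primitive that's needed, I think (my own illustration
only, not Steven's design): wrap any input range in a small buffer so the
lexer can peek ahead without consuming anything.

import std.range, std.stdio;

struct Lookahead(R) if (isInputRange!R)
{
    private R src;
    private ElementType!R[] buf;

    this(R src) { this.src = src; }

    bool empty() { return buf.length == 0 && src.empty; }
    ElementType!R front() { return peek(0); }

    void popFront()
    {
        if (buf.length) buf = buf[1 .. $];
        else src.popFront();
    }

    // Look k elements past the current position without consuming it.
    // (No end-of-input handling, for brevity.)
    ElementType!R peek(size_t k)
    {
        while (buf.length <= k && !src.empty)
        {
            buf ~= src.front;
            src.popFront();
        }
        return buf[k];
    }
}

auto lookahead(R)(R r) { return Lookahead!R(r); }

void main()
{
    auto input = lookahead("<<= rest");
    // Three characters of lookahead distinguish "<", "<<" and "<<=".
    writeln(input.peek(0), input.peek(1), input.peek(2));  // prints <<=
}

The interesting part is doing this efficiently on top of buffered file I/O
rather than an in-memory string, which is presumably what the new stdio is
about.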
