Re: Request for comments: std.d.lexer

Andrei Alexandrescu Tue, 05 Feb 2013 19:55:34 -0800

On 2/5/13 10:29 PM, Jonathan M Davis wrote:

On Tuesday, February 05, 2013 08:34:29 Andrei Alexandrescu wrote:

As far as I could tell the dependencies of the lexer are fairly
contained (util, token, identifier) and conversion to input range is
immediate.


I don't remember all of the details at the moment, since it's been several
months since I looked at dmd's lexer, but a lot of the problem stems from the
fact that it's all written around the assumption that it's dealing with a
char*. Converting it to operate on string might be fairly straightforward, but
it gets more complicated when dealing with ranges. Also, both Walter and
others have stated that the lexer in D should be configurable in a number of
ways, and dmd's lexer isn't configurable at all. So, while a direct translation
would likely be quick, refactoring it to do what it's been asked to be able to
do would not be.

I'm quite a ways along with one that's written from scratch, but I need to find
the time to finish it. Also, doing it from scratch has had the added benefit of
helping me find bugs in the spec and in dmd.

I think it would be reasonable for a lexer to require a range of ubyte as
input, and carry its own decoding. In the first approximation it may even
require a random-access range of ubyte.


Andrei
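
For illustration only, here is a minimal sketch of what "require a
random-access range of ubyte and carry its own decoding" could look like.
It is not code from dmd or from any proposed std.d.lexer; the names
isByteSource and decodeOne are hypothetical, and the decoder is deliberately
simplified (it rejects truncated or malformed sequences but does not check
for overlong encodings or surrogates).

```d
import std.range : isRandomAccessRange, hasLength, ElementType;
import std.traits : Unqual;

// Hypothetical constraint (the name is an assumption, not a Phobos symbol):
// a random-access range with a length whose elements are ubyte.
enum isByteSource(R) = isRandomAccessRange!R && hasLength!R
    && is(Unqual!(ElementType!R) == ubyte);

// Decode one UTF-8 code point starting at src[i] and advance i past it.
// The caller is expected to ensure i < src.length on entry.
dchar decodeOne(R)(R src, ref size_t i)
if (isByteSource!R)
{
    immutable ubyte b0 = src[i++];
    if (b0 < 0x80)
        return cast(dchar) b0;           // ASCII fast path, no decoding

    size_t extra;                        // number of continuation bytes
    uint cp;                             // code point accumulated so far
    if ((b0 & 0xE0) == 0xC0)      { extra = 1; cp = b0 & 0x1F; }
    else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; }
    else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; }
    else throw new Exception("invalid UTF-8 lead byte");

    foreach (_; 0 .. extra)
    {
        if (i >= src.length)
            throw new Exception("truncated UTF-8 sequence");
        immutable ubyte b = src[i++];
        if ((b & 0xC0) != 0x80)
            throw new Exception("invalid UTF-8 continuation byte");
        cp = (cp << 6) | (b & 0x3F);     // append 6 payload bits
    }
    return cast(dchar) cp;
}
```

With this shape, a lexer would only call decodeOne when it sees a byte
>= 0x80 (for example inside identifiers or string literals) and could treat
everything else as plain ASCII. An existing string can be viewed as bytes
without copying via std.string.representation.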
