"Nick Sabalausky" <a@a.a> wrote in message 
news:imivp7$2fu$1...@digitalmars.com...
> "Alexey Prokhin" <alexey.prok...@yandex.ru> wrote in message 
> news:mailman.2713.1300954193.4748.digitalmars-d-le...@puremagic.com...
>>> Currently, as far as I know, there are only two lexers and two parsers
>>> for D: the C++ front end which dmd, gdc, and ldc use, and the D front
>>> end which ddmd uses and which is based on the C++ front end. Both of
>>> those are under the GPL (which makes them useless for a lot of stuff),
>>> and both of them are tied to compilers. Being able to lex D code and
>>> get the list of tokens in a D program, and being able to parse D code
>>> and get the resultant abstract syntax tree, would be very useful for a
>>> number of programs.
>> There is a third one: http://code.google.com/p/dil/. The main page says
>> that the lexer and the parser are fully implemented for both D1 and D2.
>> But the license is also the GPL.
>
> The nearly-done v0.4 of my Goldie parsing system (zlib/libpng license) 
> comes with a mostly-complete lexing-only grammar for D2.
>
> http://www.dsource.org/projects/goldie/browser/trunk/lang/dlex.grm
>
> The limitations of it right now:
>
> - Doesn't do nested comments. That requires a feature (one that's going
> to be introduced in the related tool GOLD Parsing System v4.2) that I
> haven't had a chance to add to Goldie just yet.
>

Note that this probably isn't as big a problem as it sounds:

For one thing, it still recognizes "/+" and "+/" as tokens; it just tries 
to lex everything in between as well. And when Goldie is used purely as a 
lexer, you still get the entire source lexed even if it contains errors, 
with any lex-error tokens included in the resulting token array. So it would 
be pretty easy to just call Goldie's lex function and then step through the 
token array, removing balanced /+ and +/ sections manually.
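To illustrate the idea, here's a minimal sketch of that post-lex pass. It's 
not Goldie code: tokens are modeled as plain strings, and the real token 
type and lex function would obviously differ. It just shows the balanced 
"/+" ... "+/" removal with a nesting counter:

```python
def strip_nested_comments(tokens):
    """Drop everything between balanced /+ and +/ markers.

    tokens: a list of token texts (a stand-in for Goldie's token array).
    Returns a new list with nested-comment sections removed.
    """
    out = []
    depth = 0  # current /+ nesting level
    for tok in tokens:
        if tok == "/+":
            depth += 1            # entering a (possibly nested) comment
        elif tok == "+/" and depth > 0:
            depth -= 1            # leaving one nesting level
        elif depth == 0:
            out.append(tok)       # only keep tokens outside comments
    return out

# Tokens inside the nested /+ ... /+ ... +/ ... +/ span are dropped:
print(strip_nested_comments(
    ["int", "/+", "a", "/+", "b", "+/", "c", "+/", "x", ";"]))
```

An unmatched "+/" at depth zero is passed through here; a real pass might 
instead flag it as a lex error.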
