On 20/06/2010 22:46, Alix Pexton wrote:
On 20/06/2010 21:37, Ellery Newcomer wrote:
On 06/20/2010 03:01 PM, Alix Pexton wrote:
On 19/06/2010 21:12, Alix Pexton wrote:
I've been sketching some grammar diagrams for D2.0, a little like those
on JSON.org, and of course I didn't get far before I ran into something
odd.
I think I will take the plunge and base my diagrams on the source of
DMD. After looking at the code in lexer.c, it does not seem as far
beyond my rusty old C++ parsing skills as I had expected! Massive credit
to Walter for having a codebase that is as mature as DMD without it
turning into a labyrinth of preprocessor macros and cryptic "comefrom"s.
This will mean however that my little project may take a little longer,
sigh...
A...
Do share. I've always been too lazy to read lexer.c, and from this
discussion, it sounds like there are a few spots where my own lexer
grammar is incorrect (or at least differs from dmd).
of course ^^
A...
Well, I think I have got my head around lexer.c now, and its various
peculiarities, like "000377." being a valid float (although not
according to my shiny new, limited edition copy of tDPL (fig2.2 p35)^^).
The weirdness occurs because some of the corner cases are handled not
by the neat little state machine that validates reals, but in the
scanner at the point where it recognises a number beginning with a zero.
The productions in lex.html represent the range of inputs that are
accepted by the state machine without taking into account that the
scanner rejects the sequence "._" (which makes sense as that is the
identifier "_" in the outer scope).
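
To make the split between the two layers concrete, here is a rough sketch of the idea in Python. It is not a transcription of lexer.c: the pattern PERMISSIVE_FLOAT and the function scan_number are made-up names, and the pattern is a deliberately simplified stand-in for the real productions. It only shows how a permissive validator plus a scanner-level veto on "._" can accept less than the grammar alone suggests.

```python
import re

# Hypothetical, simplified stand-in for the "state machine": digits and
# underscores on both sides of the dot. NOT the real lex.html production.
PERMISSIVE_FLOAT = re.compile(r"[0-9][0-9_]*\.[0-9_]*")
PLAIN_INTEGER = re.compile(r"[0-9][0-9_]*")


def scan_number(text):
    """Return the numeric prefix of text that this toy scanner would take."""
    m = PERMISSIVE_FLOAT.match(text)
    if m is None:
        # No dot after the digits: fall back to an integer, if any.
        m_int = PLAIN_INTEGER.match(text)
        return m_int.group(0) if m_int else ""
    dot = m.group(0).index(".")
    # Scanner-level veto: "._" begins an identifier lookup, so the number
    # ends before the dot even though the pattern above matched further.
    if text[dot + 1 : dot + 2] == "_":
        return text[:dot]
    return m.group(0)


print(scan_number("000377."))  # the trailing dot is kept
print(scan_number("1._5"))     # stops before the dot
```

With this arrangement the pattern on its own would happily take "1._5", which is the kind of gap between the written productions and the scanner's behaviour described above.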
Andrei's analysis in tDPL also points out that 0xp0 is a valid hexfloat,
but a strict reading of lex.html would not allow it.
Overall the diagram for hexfloat is much simpler than the one for
decimalfloat, which I think will have to be split into 3 ><
A...
PS, octal must die!