On 2/27/2013 11:55 PM, Jonathan M Davis wrote:
Again, please see how lexer.c works. I assure you, there is no double
copying going on, nor is there a double test for the terminating 0.

I know what the lexer does, and remember that it _doesn't_ operate on ranges,
and there are subtle differences between being able to just use char* and
trying to handle generic ranges.

Hence the need to invent SentinelInputRange.


Given how a lexer works (and I have been working on a lexer off and on
recently), the only real difference is that you'd just use a couple of static
ifs like

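// Generic ranges: test for the end explicitly before reading a character.
static if (!isSomeString!R)
{
    if (range.empty)
        break; // or whatever you do at the end
}

// Strings: the terminating 0 is handled as one more case of the switch.
static if (isSomeString!R)
{
    case 0:
        break; // or whatever you do at the end
}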

There are so many places where this would occur, it cries out for a new type.


So, in the case of a lexer, I don't see sentinel ranges as buying us much. You
end up having to wrap most any range that you pass to the lexer or whatever
(including strings so that they'll pass isSentinelRange), you lose out on any
optimizations of any functions that you call which special-case strings
(though there probably wouldn't be many of those in a lexer), and all you
avoid is a couple of static ifs.

And NO, THE SOURCE FILE INPUT IS NEITHER WRAPPED NOR DOUBLE COPIED. Here's how it's done:

https://github.com/D-Programming-Language/dmd/blob/master/src/root/root.c

See lines 1012 and 1038.
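
As a rough illustration of "neither wrapped nor double copied" (a minimal sketch, not the code in root.c): the file is read once, straight into a single buffer allocated with room for the sentinel, and the sentinel is stored just past the data. The function name read_with_sentinel and the choice of a single 0 byte are assumptions made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the code in root.c: read the whole stream into
 * one heap buffer and store a 0 sentinel just past the data.
 * Returns NULL on failure; the caller frees the buffer. */
unsigned char *read_with_sentinel(FILE *f, size_t *len_out)
{
    if (fseek(f, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(f);
    if (size < 0 || fseek(f, 0, SEEK_SET) != 0)
        return NULL;

    /* One allocation, sized for the data plus the sentinel. */
    unsigned char *buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;

    /* One read, directly into the buffer the lexer will scan. */
    if (fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        return NULL;
    }

    buf[size] = 0; /* sentinel: scanning stops here */
    if (len_out != NULL)
        *len_out = (size_t)size;
    return buf;
}
```

The lexer can then walk the returned pointer directly. There is no wrapper object around the buffer and no second copy of the source text.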

The idea of sentinels certainly isn't useless, but anything caring about that
sort of speed is likely to just use strings or arrays, and those can trivially
be special cased to avoid unnecessary empty checks and to add the check for
the sentinel, making the whole sentinel range idea an unnecessary complication
IMHO.

You can't do efficient lookahead without sentinels, either. Lexers are sensitive to every instruction executed per character read. Having no sentinels means double the number of instructions per source character.
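
To make the per-character cost concrete, here is a small sketch in C (hypothetical function names, not taken from lexer.c) that compares an identifier scan with an explicit end-of-input test against one that relies on a terminating 0. It assumes the buffer always ends in a 0 byte and that 0 never matches the character class being scanned.

```c
#include <ctype.h>
#include <stddef.h>

/* Without a sentinel: every step tests for end of input, then the character. */
const char *skip_ident_checked(const char *p, const char *end)
{
    while (p != end && (isalnum((unsigned char)*p) || *p == '_'))
        p++;
    return p;
}

/* With a 0 sentinel: 0 is not an identifier character, so the character
 * test alone terminates the loop at the end of the buffer. */
const char *skip_ident_sentinel(const char *p)
{
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;
    return p;
}

/* One-character lookahead: p[1] is only read when p[0] is '/', and in that
 * case p[1] exists because at worst it is the sentinel itself. */
int starts_block_comment(const char *p)
{
    return p[0] == '/' && p[1] == '*';
}
```

The checked version carries one extra comparison on every iteration, and the same extra test would be needed before each lookahead. The sentinel version folds the end-of-input test into the character test that has to run anyway.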

InputRanges are an abject failure if "anyone caring about speed" is not going to use them. And yes, I care very much about the D lexing speed.

