28-Jan-2013 15:48, Johannes Pfau writes:
> On Sun, 27 Jan 2013 11:48:23 -0800, Walter Bright <[email protected]> wrote:

>> On 1/27/2013 2:17 AM, Philippe Sigaud wrote:
>>> Walter seems to think that if a lexer is not able to vomit thousands
>>> of tokens a second, then it's not good.

>> Speed is critical for a lexer.

>> This means, for example, you'll need to squeeze pretty much all
>> storage allocation out of it.

> But to be fair, that doesn't fit ranges very well. If you don't want to
> do any allocation but still want to keep identifiers etc. in memory,
> this basically means you have to keep the whole source in memory, and
> that is conceptually an array, not a range.


Not the whole source, only enough to construct a table of all identifiers. The source is awfully redundant because of repeated identifiers, spaces, comments and whatnot. The set of unique identifiers is rather small.

I think the best course of action is to just provide a hook that triggers on every identifier encountered. That could be, as discussed earlier, a delegate.

> But you can of course write a lexer which accepts buffered ranges and
> does some allocation for those, and is special-cased for arrays to not
> allocate at all. (Unbuffered ranges should be supported using a
> generic BufferedRange.)

--
Dmitry Olshansky