On 02/28/13 02:11, Walter Bright wrote: > A SentinelInputRange is an InputRange with the following additions: > > 1. a compile time property 'sentinel' that is the terminating value of the > range > 2. empty is defined as: empty = (front == sentinel) > 3. it is not necessary for empty to be called before front > > A C style 0-terminated string is an example of a SentinelInputRange. > > The additions to std.range would be: > > 1. isSentinelInputRange(T) which returns true if T is a SentinelInputRange > 2. a unittest > 3. documentation of this > > An addition to std.string would be a function that takes a char* and returns > a SentinelInputRange. > > Motivation: > 1. easy conversion of C strings to ranges > 2. necessary for a fast implementation of a lexer
LookAheadRange, that not only defines 'sentinel', but also 'lookahead' which defines how far one can safely look ahead. When 'lookahead==0' it becomes equivalent to your SentinelInputRange, where 'front' and '[0]' is always valid and can contain the sentinel. When 'lookahead==N', accessing '[N]' is always valid. Having said that, I've used this approach in a D lexer, and it does not really matter in practice - avoiding the length (or '\0' sentinel) check makes a <~1ms difference when lexing "datetime.d" sized objects (1.5Mbytes+, 460k+ tokens). Which is practically irrelevant both in an IDE context and a compiler context - other processing will be be orders of magnitude more expensive. An IDE doesn't need to re-lex the whole file after every key press and 1ms won't make any difference for a compiler run. "Necessary for a fast implementation of a lexer" is an exaggeration, but yes, it is an obvious optimization that has a measurable impact. Before implementing it I was expecting a larger improvement too; and was somewhat surprised when the resulting gain was in the noise (~1ms +/- ~3ms). /Relatively/ it is significant (10% perf gain is never insignificant), but the /absolute/ perf difference isn't that large. Which of course isn't an argument against (SentinelInputLookAhead)Range; a std API like that would be a good thing. The advantage of SentinelInputRange/LookAheadRange is that it, unlike an array, can be lazy. Which starts to matter when you're dealing with /token/ streams. artur
