On 2/27/2013 8:47 PM, Jonathan M Davis wrote:
> Now, the only real benefit that I see for this is allowing you to make a
> string zero-terminated (which in the case of a lexer would probably mean
> copying the entire file into a new string which has a zero on the end).

Nawp, there is no extra copy. Take a look at the compiler source. You have to read the file into memory anyway - so make the buffer one byte longer, and put a 0 at the end.
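A minimal sketch of what that looks like in C (hypothetical helper, not the actual compiler code): allocate one byte beyond the file size and write the 0 sentinel there, so no second copy of the data is ever made.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into memory, allocating one extra byte so a
   0 sentinel can be appended in place. Returns NULL on failure.
   Hypothetical helper for illustration. */
char *read_file_with_sentinel(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *buf = malloc((size_t)size + 1);   /* +1 for the sentinel */
    if (buf && fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) {
        buf[size] = 0;                      /* sentinel appended, no copy */
        if (len) *len = (size_t)size;
    }
    return buf;
}
```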

> In general, you'll be forced to wrap a range in a sentinel range to get
> this behavior, which means that you're _still_ checking empty all the
> time, because it has to keep checking whether it's supposed to make
> front 0 now. And that probably means that it'll be slightly _more_
> expensive to do this for anything other than a string. That being the
> case, it might be better to just special case strings rather than come
> up with this whole new range idea.

Nope, not necessary to wrap it.


> I'm also not at all convinced that this is generally useful. It may be
> that it's a great idea for lexers, but what other use cases are there?

Anything that walks a C string. Lots of cases for that. Sentinels are often used where high speed processing of data is desired. Google sentinel-terminated data for more examples.
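For instance, a loop that walks a 0-terminated C string needs no separate length or empty test; the sentinel itself is the termination condition. A small illustrative sketch (hypothetical function, not from any particular library):

```c
#include <ctype.h>
#include <stddef.h>

/* Count the words in a 0-terminated C string. The 0 sentinel doubles
   as the end-of-input test, so the loop carries no length counter and
   performs no separate empty check. Illustrative sketch only. */
size_t count_words(const char *p)
{
    size_t words = 0;
    while (*p) {                            /* sentinel is the only test */
        while (*p && isspace((unsigned char)*p))
            p++;                            /* skip leading whitespace */
        if (*p) {
            words++;
            while (*p && !isspace((unsigned char)*p))
                p++;                        /* skip over the word */
        }
    }
    return words;
}
```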


> Also, I'd point out that even for strings, doing something like this
> means wrapping them, because their empty isn't defined in a manner
> which works with isSentinelRange.

For D strings, yes; for C strings, there's no need to wrap them.

> So, I'm inclined to believe that we'd be better off just special casing
> strings in any algorithms that can take advantage of this sort of thing
> than we would be creating this sentinel range idea.

0 terminated C strings are a classic case of this. Another case is a token stream ending with an EOF token.
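The EOF-token case works the same way as the 0 terminator: because every token stream is guaranteed to end with an EOF token, a consumer never needs a count or an empty test. A sketch, assuming hypothetical token-kind names:

```c
#include <stddef.h>

/* Hypothetical token kinds; TOK_EOF is the sentinel that terminates
   every stream, analogous to the 0 at the end of a C string. */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PLUS } TokenKind;

typedef struct { TokenKind kind; } Token;

/* Walk a token stream up to its EOF sentinel. No element count and
   no separate empty check is needed. Illustrative sketch only. */
size_t count_tokens(const Token *toks)
{
    size_t n = 0;
    while (toks->kind != TOK_EOF) {     /* sentinel ends the walk */
        n++;
        toks++;
    }
    return n;
}
```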
