On 2/27/2013 8:47 PM, Jonathan M Davis wrote:
> Now, the only real benefit that I see for this is allowing you to make a
> string zero-terminated (which in the case of a lexer would probably mean
> copying the entire file into a new string which has a zero on the end).

Nawp, there is no extra copy. Take a look at the compiler source. You have to read the file into memory anyway - so make the buffer one byte longer, and put a 0 at the end.
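A minimal sketch of what that looks like in C (hypothetical helper, not the actual compiler code): allocate one byte beyond the file size and write the 0 sentinel there, so no second copy of the data is ever made.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read an entire file into memory, allocating one extra byte so a
   0 sentinel can be appended in place. Returns NULL on failure.
   Hypothetical helper for illustration. */
char *read_file_with_sentinel(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    char *buf = malloc((size_t)size + 1);   /* +1 for the sentinel */
    if (buf && fread(buf, 1, (size_t)size, f) != (size_t)size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf) {
        buf[size] = 0;                      /* sentinel appended, no copy */
        if (len) *len = (size_t)size;
    }
    return buf;
}
```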

> In general, you'll be forced to wrap a range in a sentinel range to get
> this behavior, which means that you're _still_ checking empty all the
> time, because it has to keep checking whether it's supposed to make
> front 0 now. And that probably means that it'll be slightly _more_
> expensive to do this for anything other than a string. That being the
> case, it might be better to just special case strings rather than come
> up with this whole new range idea.

Nope, not necessary to wrap it.


> I'm also not at all convinced that this is generally useful. It may be
> that it's a great idea for lexers, but what other use cases are there?

Anything that walks a C string. Lots of cases for that. Sentinels are often used where high speed processing of data is desired. Google sentinel-terminated data for more examples.
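For instance, a loop that walks a 0-terminated C string needs no separate length or empty test; the sentinel itself is the termination condition. A small illustrative sketch (hypothetical function, not from any particular library):

```c
#include <ctype.h>
#include <stddef.h>

/* Count the words in a 0-terminated C string. The 0 sentinel doubles
   as the end-of-input test, so the loop carries no length counter and
   performs no separate empty check. Illustrative sketch only. */
size_t count_words(const char *p)
{
    size_t words = 0;
    while (*p) {                            /* sentinel is the only test */
        while (*p && isspace((unsigned char)*p))
            p++;                            /* skip leading whitespace */
        if (*p) {
            words++;
            while (*p && !isspace((unsigned char)*p))
                p++;                        /* skip over the word */
        }
    }
    return words;
}
```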


> Also, I'd point out that even for strings, doing something like this
> means wrapping them, because their empty isn't defined in a manner
> which works with isSentinelRange.

For D strings, yes; for C strings, there's no need to wrap them.

> So, I'm inclined to believe that we'd be better off just special casing
> strings in any algorithms that can take advantage of this sort of thing
> than we would be creating this sentinel range idea.

0 terminated C strings are a classic case of this. Another case is a token stream ending with an EOF token.
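The EOF-token case works the same way as the 0 terminator: because every token stream is guaranteed to end with an EOF token, a consumer never needs a count or an empty test. A sketch, assuming hypothetical token-kind names:

```c
#include <stddef.h>

/* Hypothetical token kinds; TOK_EOF is the sentinel that terminates
   every stream, analogous to the 0 at the end of a C string. */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PLUS } TokenKind;

typedef struct { TokenKind kind; } Token;

/* Walk a token stream up to its EOF sentinel. No element count and
   no separate empty check is needed. Illustrative sketch only. */
size_t count_tokens(const Token *toks)
{
    size_t n = 0;
    while (toks->kind != TOK_EOF) {     /* sentinel ends the walk */
        n++;
        toks++;
    }
    return n;
}
```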
