Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

Of course, he will not have other UTF-8-like features, such as avoidance of ASCII values in the final trail byte, and "fast forward parsing" by looking at the first byte.

The fast-forward feature is certainly not decisive, but random accessibility (from any position and in any direction) is much more decisive and is a real positive factor for UTF-8. The format proposed above can only be read in the forward direction, even if it can be accessed randomly to find the *next* character. To find the *previous* one, you have to scan backward until you consume at least one byte used to encode the character before it (otherwise, you don't know whether a 1xxxxxxx byte is the first one in a sequence, even if you can know whether a byte is the last one).

Kannan is looking for a format for a protocol that he is developing. Maybe scanning backwards through a string is not a scenario that will ever be encountered in this protocol. It's not for us to say.
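
For reference, the backward scan that UTF-8 allows can be sketched roughly as follows. This is only an illustration, assuming well-formed UTF-8 input; the function names are made up for this example and are not part of Kannan's proposal or of any library. It relies on the fact that UTF-8 trail bytes always have the form 10xxxxxx, so a lead byte can be recognized without looking at anything before it.

```python
def is_trail_byte(b: int) -> bool:
    # UTF-8 trail (continuation) bytes always match 10xxxxxx.
    return (b & 0xC0) == 0x80


def prev_char_start(buf: bytes, pos: int) -> int:
    """Return the index of the lead byte of the character that ends
    just before byte offset `pos`. Assumes `buf` is well-formed UTF-8
    and that `pos` lies on a character boundary."""
    if pos <= 0 or pos > len(buf):
        raise ValueError("pos must be in 1..len(buf)")
    i = pos - 1
    # Step back over trail bytes; the first non-trail byte is the lead byte.
    while i > 0 and is_trail_byte(buf[i]):
        i -= 1
    return i


def chars_backward(buf: bytes):
    """Yield the characters of `buf` from last to first."""
    pos = len(buf)
    while pos > 0:
        start = prev_char_start(buf, pos)
        yield buf[start:pos].decode("utf-8")
        pos = start


if __name__ == "__main__":
    data = "a\u00e9\u20ac\U0001F600".encode("utf-8")  # 1-, 2-, 3- and 4-byte characters
    print(list(chars_backward(data)))
```

In UTF-8 the loop never steps back more than three trail bytes and never has to touch the preceding character, which is the property the quoted paragraph says the proposed format lacks.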

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s

