Michael B. Allen wrote on 2002-04-03 19:47 UTC:
> > Whenever you are dealing with a variable-length encoding of characters,
> > you really don't want to specify anything in terms of a number of
> > characters.
> 
> So you should not use a variable-length encoding for any serious generic
> string processing like in this DOM example?

No, that's not what I said. You should address substrings by their offsets
in the encoding actually used, not by an abstract and useless character count.
You really want to use array indices here, which mean just that. If
people think they want to count characters, that is with very high
probability a sign that they have misunderstood what they really want
and just copied the notion of "address offset equals number of
characters in between times size of a character" from the fixed-length
character world. Time to think again. You can do perfectly serious
generic string processing without ever counting any characters between
two string positions.
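
A minimal sketch of the idea in C, assuming the text is UTF-8 held in a
plain char array. The names struct span, find_span and copy_span are
hypothetical, made up for this illustration and not taken from any library.
A substring is described purely by byte offsets (array indices), and no
characters are ever counted:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a substring addressed by byte offsets
 * into a UTF-8 buffer, not by character counts. */
struct span {
    size_t start; /* index of the first byte of the substring */
    size_t end;   /* index one past the last byte */
};

/* Locate needle in haystack. Returns 1 and fills *out on success,
 * 0 if needle does not occur. Because UTF-8 is self-synchronizing,
 * a byte-wise match of a valid UTF-8 needle in valid UTF-8 text
 * always starts and ends on a character boundary. */
static int find_span(const char *haystack, const char *needle,
                     struct span *out)
{
    assert(haystack != NULL && needle != NULL && out != NULL);
    const char *p = strstr(haystack, needle);
    if (p == NULL)
        return 0;
    out->start = (size_t)(p - haystack);
    out->end = out->start + strlen(needle);
    return 1;
}

/* Copy the bytes [sp.start, sp.end) of s into dst and NUL-terminate.
 * Returns 0 if dst is too small, 1 otherwise. */
static int copy_span(const char *s, struct span sp,
                     char *dst, size_t dstsize)
{
    size_t n = sp.end - sp.start;
    if (n + 1 > dstsize)
        return 0;
    memcpy(dst, s + sp.start, n);
    dst[n] = '\0';
    return 1;
}
```

With the text "na\xc3\xafve caf\xc3\xa9 menu", searching for
"caf\xc3\xa9" gives the span from byte 7 to byte 12. Those two indices
are all that is needed to extract, replace, or compare the substring;
how many characters lie before or inside it never enters the picture.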

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
