Michael B. Allen wrote on 2002-04-03 19:47 UTC:

> > Whenever you are dealing with a variable-length encoding of characters,
> > you really don't want to specify anything in terms of a number of
> > characters.
>
> So you should not use a variable-length encoding for any serious generic
> string processing like in this DOM example?

No, that's not what I said. You should address substrings by their offsets
in the encoding actually used, not by an abstract and useless character
count. You really want to use array indices here, which mean just that.

If people think they want to count characters, that is with very high
probability a sign that they have misunderstood what they really want and
have just copied the notion of "address offset equals number of characters
in between times size of a character" from the fixed-length character
world. Time to think again.

You can do perfectly serious generic string processing without ever
counting any characters between two string positions.

Markus

--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
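
A minimal sketch of the approach described in the reply, assuming the
text is held as UTF-8 bytes: substrings are located and extracted purely
by byte offsets (array indices), and no characters are ever counted. The
helper names find_span and next_boundary and the sample string are
hypothetical, chosen only for illustration; they are not from the
original post.

```python
def find_span(haystack: bytes, needle: bytes):
    """Return (start, end) byte offsets of the first match, or None.

    Both arguments are assumed to be valid UTF-8. Because UTF-8 is
    self-synchronizing, a byte-level match then always starts and ends
    on a character boundary.
    """
    start = haystack.find(needle)
    if start < 0:
        return None
    return start, start + len(needle)


def next_boundary(buf: bytes, off: int) -> int:
    """Advance a byte offset to the start of the next UTF-8 character."""
    if off >= len(buf):
        return len(buf)
    off += 1
    # Continuation bytes have the bit pattern 10xxxxxx; skip over them.
    while off < len(buf) and (buf[off] & 0xC0) == 0x80:
        off += 1
    return off


# Illustrative sample text: 14 characters, but 17 bytes in UTF-8.
text = "Grüße aus Köln".encode("utf-8")

# Positions are plain byte offsets into the array.
start, end = find_span(text, "Köln".encode("utf-8"))
city = text[start:end]   # the matched substring, sliced by byte offsets
prefix = text[:start]    # everything before it, again by byte offset
```

Here the match starts at byte offset 12 although only 10 characters
precede it. Nothing in the code needs the number 10: searching, slicing,
and stepping to the next character all work on offsets alone.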
