* David Cantrell <da...@cantrell.org.uk> [2005-10-17 16:40]:
> Variable width character sets are themselves hateful. I'll go
> further and say that they are a spectacularly stupid idea, and
> that whoever decreed them needs shootin'. Yes, being able to
> represent more than 220-odd characters is a Good Idea. So
> damnit, just use 32-bit - or 64-bit - characters. Then you'll
> be able to seek!

Yuck, then you’re saddled with endianness issues. Plus null bytes
can then be part of the data, so most charset-oblivious software
breaks.

Not worth it, considering that 99.99% of text processing is either
gluing strings together without looking inside, or processing them
character-by-character. Blindly indexing into a string without
having scanned it previously is so rare it doesn’t merit
consideration.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
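
To make the points in the reply above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample string and the helper name `count_chars_utf8` are made up for this example and do not come from the thread. It shows that a fixed-width 32-bit encoding yields different bytes depending on byte order and contains null bytes, while UTF-8 does neither for this input and can still be walked character by character in a single pass.

```python
# Illustrative sketch; the sample text and helper name are hypothetical.
text = "na\u00efve caf\u00e9"  # 10 characters, two of them non-ASCII

utf8 = text.encode("utf-8")
utf32_le = text.encode("utf-32-le")  # little-endian, no byte order mark
utf32_be = text.encode("utf-32-be")  # big-endian, no byte order mark

# Endianness: the same text gives two different byte sequences.
endianness_differs = utf32_le != utf32_be

# Null bytes: every character below U+01000000 has at least one 0x00
# byte in UTF-32, which breaks software that treats 0x00 as a terminator.
utf32_has_nul = b"\x00" in utf32_le

# UTF-8 only emits 0x00 for U+0000 itself, so this text has none.
utf8_has_nul = b"\x00" in utf8


def count_chars_utf8(data: bytes) -> int:
    """Count code points in valid UTF-8 with one sequential scan.

    Continuation bytes look like 0b10xxxxxx; every other byte starts
    a new character, so no random access is needed.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    print(len(utf8), len(utf32_le))
    print(endianness_differs, utf32_has_nul, utf8_has_nul)
    print(count_chars_utf8(utf8))
```

The trade-off is visible in the sizes as well: the 32-bit form spends four bytes on every character to buy constant-time indexing, which the reply argues is rarely what text processing needs.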