> You just seem to have Decided, for reasons known only to you, that
> The Character Length Of A String Is Not Useful.  Despite literally
> decades of programs that have used strlen() in various ways.
strlen was mostly used in a context where char-length = byte-length =
display-width.  Most of those calls to strlen have nothing to do with
char-length but are more interested in display-width or byte-length.

In the context of Unicode, using utf-8 doesn't make byte-length any
harder than with ASCII.  And in the context of Unicode, display-width
is a lot more complex than strlen regardless of which encoding you use,
because any given Unicode char can have a display-width of 0, 1, or 2
(even if you disregard proportional fonts and other fancy rendering
tricks).  So utf-8 doesn't make the computation of display-width any
more complex than utf-32.

> What if the question is "Find all the English words that have an E
> in the 5th position and a U in the 7th"?

That can be answered just as easily and efficiently from a utf-8
representation of the string as from a utf-32 representation.


        Stefan
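
PS: a rough Python sketch of the two points above, for illustration only.
The names display_width and has_e5_u7 are made up for this example, and
the width rule (0 for combining/format chars, 2 for East Asian wide or
fullwidth chars, 1 otherwise) is only an approximation of what a terminal
does, not a full wcwidth implementation.  The position check walks the
utf-8 bytes directly and counts a char at every byte that is not a
continuation byte (10xxxxxx).

```python
import unicodedata


def display_width(s: str) -> int:
    """Approximate column count of s on a fixed-width terminal.

    Assumption: combining marks and format chars take 0 columns,
    East Asian wide/fullwidth chars take 2, everything else takes 1.
    """
    width = 0
    for ch in s:
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue  # zero-width: stacks on the previous char or is invisible
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def has_e5_u7(word: bytes) -> bool:
    """True if the 5th char is 'e' and the 7th is 'u' in a utf-8 byte string.

    Works on the raw bytes: ASCII bytes never occur inside a multi-byte
    sequence, so comparing a char's first byte against 'e'/'u' is safe.
    """
    pos = 0
    for b in word:
        if b & 0xC0 == 0x80:
            continue  # continuation byte, not the start of a char
        pos += 1
        if pos == 5 and b != ord("e"):
            return False
        if pos == 7:
            return b == ord("u")
    return False  # fewer than 7 chars


if __name__ == "__main__":
    s = "日本"
    # Three different "lengths" of the same string.
    print(len(s), len(s.encode("utf-8")), display_width(s))
    print(has_e5_u7("château".encode("utf-8")))
```

For "日本" the three numbers differ (2 chars, 6 bytes, 4 columns), which
is why a single strlen-style count cannot stand in for all of them.  The
scan in has_e5_u7 is a single pass that stops at the 7th char, the same
amount of work as indexing into a utf-32 array for a word-sized string.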