Non-ascii string processing?

Theodore H. Smith Sat, 04 Oct 2003 12:23:41 -0700

Hi lists,

I'm wondering how people tend to do their non-ascii string processing.

I'm wondering whether anyone really needs anything other than byte-oriented code. I'm using UTF8 as my character format, and UTF8 is variable width, of course. However, I also offer the option of processing UTF8 with byte functions.

EG:

Start = MyString.InStr( "<" )
End = MyString.InStr( Start + 1, ">" )

For things like this, it really doesn't matter that your data is UTF8; you can still process it as bytes, which gives faster and simpler code.
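
To illustrate why the byte approach is safe here, a minimal sketch in Python (not the library discussed in this post): in UTF8 every byte of a multi-byte character has its high bit set, so an ASCII delimiter such as "<" or ">" can never turn up in the middle of another character. The function name find_tag and the sample text are made up for the example.

```python
def find_tag(data: bytes) -> bytes:
    """Return the first <...> span in data using byte offsets only, or b"" if none."""
    start = data.find(b"<")
    if start == -1:
        return b""
    # Search for the closing delimiter after the opening one, still byte-wise.
    end = data.find(b">", start + 1)
    if end == -1:
        return b""
    return data[start:end + 1]


# Non-ASCII text around the tag does not disturb the byte search.
text = "na\u00efve <b>caf\u00e9</b>".encode("utf-8")
print(find_tag(text).decode("utf-8"))  # prints <b>
```

The offsets returned by the search are byte offsets, which is fine as long as they are only fed back into other byte functions.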

So I'm wondering: is there in fact ANY code that needs explicit UTF8 processing? Here are a few cases I've thought of.

1) Spell checking - needs UTF8 character based iteration
2) Lexical processing - needs UTF8 mode to be able to match an accented letter such as "á" to "a".
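
A rough Python sketch of the two cases above, under the assumption that "matching" an accented letter to "a" means stripping combining marks after Unicode decomposition. That is one possible approach, not necessarily what a given library does, and the helper name fold_accents is invented for the example.

```python
import unicodedata


def fold_accents(s: str) -> str:
    """Decompose each character (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


word = "d\u00e9j\u00e0"  # "deja" with an acute and a grave accent, precomposed

# Byte count and character count differ, so per-character iteration
# cannot simply step through the bytes one at a time.
print(len(word.encode("utf-8")))  # prints 6
print(len(word))                  # prints 4

# Folding accents needs decoded characters, not raw bytes.
print(fold_accents(word))         # prints deja
```

Both operations have to know where one character ends and the next begins, which is exactly what byte functions ignore.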

Can anyone tell me any more? Please feel free to go into great detail in your answers. The more detail the better.

Thanks a lot!

I'm just wondering whether I can simplify my string processing library, and whether anyone really needs anything beyond byte-level processing for most functions, apart from maybe a few for the two cases I mentioned above!
