On Mon, Jan 7, 2013 at 2:34 PM, Costello, Roger L. <[email protected]> wrote:
> Are there "Unicode processors"?
>
> That is, are there processors that break up Unicode text into its parts --
> here's a character, here's another character, here's still another character,
> etc. -- and then makes those parts (along with information about each part
> such as "this part is the Latin Capital Letter T" and "this part is the Latin
> Small Letter o") available to Unicode applications (such as XML processors)
> via an API?
>
> I did a Google search for "Unicode processor" and came up empty so I am
> guessing the answer is that there are no Unicode processors. Or perhaps they
> go by a different name? If there are no Unicode processors, why not?

I'm not sure I understand what you want. K&R C already had this, at least for the ASCII subset of Unicode: it has arrays of characters, and you can access each character individually. If you want to know whether the third character in your array s is the Latin capital letter T, you write s[2] == 'T'. If you want to know whether it's a letter, you write isalpha(s[2]).

Naturally, Unicode support is somewhat more complex, but it's still a matter of sequences of characters and functions to query their properties. It's plain text; it doesn't have XML's complex hierarchical features.

-- 
Kie ekzistas vivo, ekzistas espero.
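
P.S. To make the C case concrete, here is a minimal sketch of the two checks mentioned above, for ASCII text only. The helper names is_capital_t and is_letter are made up for illustration; only isalpha comes from the standard library (<ctype.h>). Note that the comparison uses a character constant in single quotes.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: is the character at index i the Latin capital
   letter T? The caller must ensure i is within the string. */
int is_capital_t(const char *s, size_t i)
{
    assert(s != NULL);
    return s[i] == 'T';   /* character constant, not a string literal */
}

/* Hypothetical helper: is the character at index i a letter?
   The cast to unsigned char matters: passing a negative char value
   to isalpha is undefined behavior. */
int is_letter(const char *s, size_t i)
{
    assert(s != NULL);
    return isalpha((unsigned char)s[i]) != 0;
}
```

With these, is_capital_t("AbT", 2) is nonzero and is_letter("A1T", 1) is zero. For full Unicode you would work on code points rather than bytes and query a Unicode property database instead of isalpha, but the shape of the API stays the same.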

