Hi,

As I understand it, characters in .NET are 16-bit values. But what about Unicode characters that lie above the 16-bit limit, for example OLD ITALIC LETTER A (Unicode code point U+10300)? How do you represent those in .NET?

I tried opening a text file containing this Old Italic A:

- The Length property and the indexer of string both treat the Old Italic A as 2 characters, so they don't work for it.
- When I wrote the string back to a UTF-8 encoded text file, it was written correctly.

So it seems to me that .NET (Mono) uses UTF-16 as its internal encoding, but indexing (and length) work on the 16-bit units and don't take that encoding into account. Am I correct?

Are there any ways to handle those characters in .NET? For example, the new Java 1.5 has some new String methods that can handle these characters. It's not perfect in Java, but at least there is something.

If someone wants to play with it, I attached a text file containing the text "marrakesh", encoded in UTF-8, where I replaced the first "a" with the Old Italic A (it's easy to do with a little iconv to and from UCS-4 and a hex editor).

Thanks,
Gabor Farkas
m𐌀rrakesh
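
For comparison, here is a rough sketch of what the Java 1.5 String methods mentioned above give you on the same "marrakesh" text. The class and helper names (SurrogateDemo, codePointLength, codePointAtIndex) are made up for this illustration; only codePointCount, offsetByCodePoints and codePointAt are real java.lang.String methods. Java strings are sequences of 16-bit UTF-16 units too, so the Old Italic A is spelled here as its surrogate pair, D800 followed by DF00.

```java
class SurrogateDemo {
    // "marrakesh" with the first 'a' replaced by OLD ITALIC LETTER A (U+10300),
    // written as a UTF-16 surrogate pair so the source file stays pure ASCII.
    static final String TEXT = "m\uD800\uDF00rrakesh";

    // Hypothetical helper: number of Unicode code points in s,
    // as opposed to s.length(), which counts 16-bit code units.
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    // Hypothetical helper: the code point at a 0-based code-point index.
    // offsetByCodePoints walks from the start and skips over surrogate pairs,
    // so this is linear in the index, not constant time.
    static int codePointAtIndex(String s, int index) {
        int unitOffset = s.offsetByCodePoints(0, index);
        return s.codePointAt(unitOffset);
    }

    public static void main(String[] args) {
        System.out.println(TEXT.length());                                  // 10 code units
        System.out.println(codePointLength(TEXT));                          // 9 code points
        System.out.println(Integer.toHexString(codePointAtIndex(TEXT, 1))); // 10300
        System.out.println((char) codePointAtIndex(TEXT, 2));               // r
    }
}
```

On the .NET side, as far as I know the 2.0 class library has similar pieces (char.IsSurrogatePair, char.ConvertToUtf32 and System.Globalization.StringInfo), but whether a given Mono version ships them is something to check rather than assume.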