On Wed, 7 Jun 2006 12:24:15 +0300
ik <[EMAIL PROTECTED]> wrote:

> Hi,
>
> AnsiString can contain UTF-8 inside. WideString is for UTF-16 and
> UCS-2, which require more bytes than UTF-8.
>
> Please note that each char on UTF-8 is two bytes (like AnsiString),
> while UTF-16 is 4 bytes (if I remember correctly).

No. UTF-8 needs 1 to 4 bytes per character: for example, 1 byte for
ASCII characters and 2 bytes for German umlauts. UTF-16 needs 1 or 2
16-bit words per character. For example, the two-character string 'a'
plus umlaut 'o' needs 3 bytes as UTF-8 and 4 bytes as UTF-16. Both
encodings support the whole Unicode character set and there is a 1:1
mapping between them.

The LCL will support UTF-8 and provide some extra functions for UTF-16,
because UTF-8 is more compatible with existing Pascal programs. This is
not yet complete. See below.

To show a Unicode string (UTF-8 or UTF-16), you must have a Unicode
font. At the moment the Lazarus IDE under linux/gtk1 starts with a
'courier' font, which is an ANSI font and therefore does not correctly
show all UTF-8 characters, such as German umlauts. You have to choose a
'2-byte' font in the editor options. linux/gtk2 has almost only UTF-8
fonts, which is why the 'courier' font supports UTF-8 there. Of course
not every font supports all of the roughly one million Unicode
characters.

To work with special key codes, needed for instance for the French
accented characters, two or more keys must be combined into one. This is
done via a key mapping, which is not yet fully supported in the gtk
interfaces.

Maybe someone else can tell about the Unicode support of the
win32/wince/qt interfaces and we can add this to the FAQ.

Mattias

> On 6/7/06, Alexandre Leclerc <[EMAIL PROTECTED]> wrote:
> > Hi all,
> >
> > I see that many classes and functions are not using widestring. So I
> > must understand that all these are not UTF8 capable? (If I understand
> > UTF8, widestring would not support UTF16?).
> >
> > Ok, when coding a procedure reading files (LoadFromStream/File) should
> > I always use widestring and then try to guess if it is only normal
> > ansistring and if you convert ansistring to widestring in order to use
> > the functions?
> >
> > Or do we have to always code 2 times the same function: one with wide
> > string, the other not.
> > The goal of my question is to avoid coding two
> > times the same stuff (even if this is copy-paste, you need to maintain
> > the code).
> >
> > Right now I code for Windows; no UTF8 problems for now :) But I want my
> > app to run under linux.
> >
> > 1. how do I detect UTF8/ANSI files? Is this possible? (I talk about
> > text files for now).
>
> Theoretically you need to see how many bytes you need to enter in order
> to have one char (use a hex editor for a better view).
>
> But in real life, it's almost impossible to detect such things...
> that's why most programs give you the ability to change it while
> loading/working.
>
> > 2. what is a #32 char in widechar? #xx#32? (I do comparisons like 'if
> > pchar = #32 then'... I don't know how to translate this.
>
> In the Unicode world, each language has its own "space" char... But
> the regular ASCII English values are still the same, only that they have
> another byte before them.
>
> > 3. in TStringList.LoadFromStream... there are no widestring... how
> > could a widestring version be done so that LoadFromStream(ansi) would
> > call after conversion (?) to wide string LoadFromStream(widestring)?
>
> You need to implement it by yourself. In order to load a widestring
> file, you should do something like so:
>
> var
>   f : TextFile;    // ReadLn works on text files, not on typed files
>   s : widestring;
>
> assignfile (f, file_to_open);
> reset (f);
> readln (f, s);
> ...
> closefile (f);
>
> > (In fact I'm asking all this to start coding for the future right now.)
> >
> > Best regards.
> >
> > --
> > Alexandre Leclerc
>
> Ido
>
> _________________________________________________________________
> To unsubscribe: mail [EMAIL PROTECTED] with
> "unsubscribe" as the Subject
> archives at http://www.lazarus.freepascal.org/mailarchives
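
The byte counts given in the reply for 'a' plus umlaut 'o' can be checked with a few lines of Free Pascal. This is only a minimal sketch: the program name and variable names are made up for illustration, and it assumes the RTL function UTF8Decode from the System unit and a 2-byte WideChar. It should report 3 bytes for UTF-8 and 4 bytes for UTF-16.

```pascal
program Utf8Utf16Size;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  Utf8Text: AnsiString;  // used here as a container for raw UTF-8 bytes
  WideText: WideString;  // holds UTF-16 code units
begin
  // 'a' is one byte ($61) in UTF-8; umlaut 'o' is two bytes ($C3 $B6).
  Utf8Text := #$61#$C3#$B6;

  // Convert the UTF-8 bytes to UTF-16.
  WideText := UTF8Decode(Utf8Text);

  // Length of an AnsiString counts bytes, not characters.
  WriteLn('UTF-8 bytes:       ', Length(Utf8Text));
  // Length of a WideString counts 16-bit code units.
  WriteLn('UTF-16 bytes:      ', Length(WideText) * SizeOf(WideChar));
  WriteLn('UTF-16 code units: ', Length(WideText));
end.
```

Note that the AnsiString length differs from the number of characters as soon as a non-ASCII character is present, which is why code that indexes a UTF-8 string byte by byte needs care.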
