That’s all correct as far as I can see … 

 

About #3: string lists etc. using SaveToFile and LoadFromFile will default to a 
file format of AnsiStrings unless a BOM is found or unless one specifies an 
encoding otherwise.
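
A small sketch of specifying the encoding explicitly (the program and procedure names are made up for illustration; on XE2 the uses clause may need the System. prefix, depending on the project's unit scope names):

```delphi
program CopyUtf8Demo;  // hypothetical name
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Load a text file and write it back out as UTF-8.
procedure CopyAsUtf8(const SourceFile, TargetFile: string);
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // No encoding given: a BOM is honoured if present, otherwise ANSI is assumed.
    Lines.LoadFromFile(SourceFile);
    // Encoding given explicitly: the file is written as UTF-8.
    Lines.SaveToFile(TargetFile, TEncoding.UTF8);
  finally
    Lines.Free;
  end;
end;

begin
  if ParamCount = 2 then
    CopyAsUtf8(ParamStr(1), ParamStr(2));
end.
```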

 

TEncoding.GetBufferEncoding can be used to detect which encoding a file's 
content uses (like in this example here: 
http://docwiki.embarcadero.com/CodeExamples/en/UnicodeConversion_(Delphi) ). 
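
Roughly, the detection could look like the sketch below. LoadTextFile and the program name are hypothetical; it assumes the whole file fits in memory and that a file without a BOM should be treated as ANSI:

```delphi
program DetectEncodingDemo;  // hypothetical name
{$APPTYPE CONSOLE}

uses
  SysUtils, IOUtils;

// Decode a whole file into a string, using the BOM (if any) to pick the encoding.
function LoadTextFile(const FileName: string): string;
var
  Bytes: TBytes;
  Encoding: TEncoding;
  BomLen: Integer;
begin
  Bytes := TFile.ReadAllBytes(FileName);
  Encoding := nil;  // nil on entry, so GetBufferEncoding does the detection
  // Returns the BOM length; Encoding falls back to TEncoding.Default without a BOM.
  BomLen := TEncoding.GetBufferEncoding(Bytes, Encoding);
  if Length(Bytes) > BomLen then
    Result := Encoding.GetString(Bytes, BomLen, Length(Bytes) - BomLen)
  else
    Result := '';  // empty file, or nothing but a BOM
end;

begin
  if ParamCount = 1 then
    Writeln(Length(LoadTextFile(ParamStr(1))), ' characters read');
end.
```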

 

… and the other way around: to decide whether I have to save in Unicode or 
not, I usually just make a conversion to AnsiString and compare it back to the 
original (Unicode) string. If the characters are all the same (no replacements 
of Unicode characters with “?”), then I know I can save them as ANSI instead of 
UTF-8.
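
The round-trip check might be sketched like this (FitsInAnsi and PickEncoding are made-up names; the result depends on the ANSI code page of the machine it runs on):

```delphi
program AnsiCheckDemo;  // hypothetical name
{$APPTYPE CONSOLE}

uses
  SysUtils;

// True if S survives a round trip through the system ANSI code page unchanged.
function FitsInAnsi(const S: string): Boolean;
begin
  // Explicit casts avoid the implicit string conversion warnings.
  Result := string(AnsiString(S)) = S;
end;

// Pick ANSI when nothing would be lost, UTF-8 otherwise.
function PickEncoding(const S: string): TEncoding;
begin
  if FitsInAnsi(S) then
    Result := TEncoding.Default
  else
    Result := TEncoding.UTF8;
end;

begin
  if ParamCount = 1 then
    if PickEncoding(ParamStr(1)) = TEncoding.UTF8 then
      Writeln('needs UTF-8')
    else
      Writeln('ANSI is enough');
end.
```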

 

This will allow you to build your own load/save(/append) functions for 
text files without having to resort to TStringList. TStringList always adds a 
CR/LF to the last line of text when loading/saving, which can be a bit annoying 
if you don’t want that.
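
As an illustration only, a save routine that writes the text exactly as given, with no CR/LF added at the end, could be sketched as follows (SaveTextExact is a hypothetical helper; it always writes the encoding's BOM, which you may or may not want):

```delphi
program SaveExactDemo;  // hypothetical name
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Write Text to FileName in the given encoding, byte for byte, no trailing CR/LF.
procedure SaveTextExact(const FileName, Text: string; Encoding: TEncoding);
var
  Stream: TFileStream;
  Preamble, Bytes: TBytes;
begin
  Preamble := Encoding.GetPreamble;   // the BOM; empty for ANSI
  Bytes := Encoding.GetBytes(Text);
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Preamble) > 0 then
      Stream.WriteBuffer(Preamble[0], Length(Preamble));
    if Length(Bytes) > 0 then
      Stream.WriteBuffer(Bytes[0], Length(Bytes));
  finally
    Stream.Free;
  end;
end;

begin
  // Usage: SaveExactDemo <file> <text>
  if ParamCount = 2 then
    SaveTextExact(ParamStr(1), ParamStr(2), TEncoding.UTF8);
end.
```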

 


Kind Regards,
Stefan Mueller 
_______________________
R&D Manager
ORCL Toolbox LLP, Japan
http://www.orcl-toolbox.com

 

From: [email protected] 
[mailto:[email protected]] On Behalf Of John Bird
Sent: Friday, January 20, 2012 12:36 PM
To: 'NZ Borland Developers Group - Delphi List'
Subject: [DUG] XE2 string conversion notes

 

I am converting source to be D2007 and XE2 compatible, the main issue being 
just my own string and file reading functions.

 

I recall Jolyon writing about this some months ago, with his complaints about 
the confusing naming of some of the routines (ANSIUpperCase for uppercasing 
Unicode for instance).

 

From what I have been reading and researching I wanted to add a few points and 
list them here to make sure I am on the right track:

 

1 – Almost everything compiles and runs as is, especially if one has never 
tried to cater for WideChar and WideString before (that's where many of the 
problems come from IMHO)

 

2 – Some unusual cases – records with definitions eg Name: string[60] will need 
to be revisited (these are ShortString and still Ansi).

 

3 – string lists etc using SaveToFile and LoadFromFile will default to a file 
format of AnsiStrings unless a BOM is found or unless one specifies a format 
otherwise

 

4 – Source files similarly will remain as Ansi/Ascii unless Unicode characters 
are present

 

5 – statements like if ThisChar in ['a'..'z'] are replaced with CharInSet (the 
argument ['a'..'z'] is still AnsiChar/Ascii characters)

 

6 – Uppercase, lowercase and more general string functions like the above 
CharInSet are best replaced with the Character unit functions:

    eg

        IsLower

        IsUpper

        IsDigit

        ToUpper

        ToLower

 

    These are all general Unicode routines – many for either Char or String – 
and handle eg case conversion according to the general Unicode rules.   ie 
don’t use the AnsiUpperCase function, which converts according to the current 
locale (codepage) – ie not general Unicode conversion as far as I can figure.

 

7 – To compare strings, use CompareStr (case sensitive) and CompareText (not 
case sensitive).   As I understand it these use proper Unicode rules, so that 
the same character encoded differently in each string (eg as a surrogate pair) 
will still be matched if it is ultimately the same character.

 

8 – {$IFDEF UNICODE} blocks can be added for code that is only for XE2 etc and 
will be ignored by D2007.
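
To make points 5, 6 and 8 concrete, here is a rough sketch of helpers meant to compile under both compilers. IsLowerAscii and UpperChar are invented names; the sketch assumes the XE2 Character unit exposes ToUpper as a plain function and that D2007 has no Character unit, and on XE2 the unit may need to be written System.Character depending on unit scope names:

```delphi
program UnicodeCompatDemo;  // hypothetical name
{$APPTYPE CONSOLE}

uses
  SysUtils
  {$IFDEF UNICODE}, Character{$ENDIF};  // only pulled in by the Unicode compilers

// Point 5: CharInSet on Unicode compilers, the plain "in" test on D2007.
function IsLowerAscii(C: Char): Boolean;
begin
{$IFDEF UNICODE}
  Result := CharInSet(C, ['a'..'z']);
{$ELSE}
  Result := C in ['a'..'z'];
{$ENDIF}
end;

// Point 6: Character unit routine where available, UpCase otherwise.
function UpperChar(C: Char): Char;
begin
{$IFDEF UNICODE}
  Result := ToUpper(C);   // Unicode-aware case conversion
{$ELSE}
  Result := UpCase(C);    // 'a'..'z' only
{$ENDIF}
end;

var
  S: string;
begin
  S := 'john';
  if (S <> '') and IsLowerAscii(S[1]) then
    S[1] := UpperChar(S[1]);
  Writeln(S);
end.
```

Keeping the conditional code inside a couple of small helpers like these means the rest of the source does not need any {$IFDEF} blocks.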

 

Hope this research is of use to others, please tell me if any of these are 
wrong.

 

John Bird
