@ Colin : No worries.
@All: One other thing to point out is that when working with genuine, actual Unicode strings you should be careful to use the correct ANSI() functions... yes, you read that right.

    S := UpperCase(S);

will NOT convert non-ASCII Unicode characters (just as it would previously have not converted non-ASCII characters).

    S := AnsiUpperCase(S);

on the other hand will. The same goes for the likes of SameText() vs AnsiSameText() etc.

If you were writing for extended character sets in the past you were most likely already using these routines, but if you weren't (perhaps because Delphi doesn't support extended chars very well) and are now thinking that by simply upgrading to a Unicode Delphi all such things are magically taken care of, you will be in for a shock.

Better yet, use the routines introduced in the Character unit (why not UnicodeUtils? DOH!)

The only problem you then have is if you want to write string handling/manipulating code that will be portable between Unicode and non-Unicode Delphi compilers.

From: delphi-boun...@delphi.org.nz [mailto:delphi-boun...@delphi.org.nz] On Behalf Of Colin Johnsun
Sent: Tuesday, 23 November 2010 15:22
To: NZ Borland Developers Group - Delphi List
Subject: Re: [DUG] Upgrading to XE - Unicode strings questions

Doh! Thanks Jolyon for clearing that misunderstanding on my part. I was aware of the surrogate pair issue but I wrongly assumed that this might have been taken care of by the iterator implementation. I guess not.

Thanks again!

Cheers,
Colin

On 23 November 2010 13:06, Jolyon Smith <jsm...@deltics.co.nz> wrote:

Colin, the "for C in" loop and the "for i := 1 to Length()" loop are functionally identical! The only difference is that the for-in version incurs the slight overhead of the enumerator framework invoked by the compiler and runtime magic to support that syntax. But in neither case will the loop itself help detect/respond to surrogate pairs (a single WideChar is potentially only ½ the data required to form a complete character).
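
To make the surrogate pair point concrete, here is a minimal sketch (not from the thread) that assumes a Unicode Delphi (2009 or later). The program name and the test string are made up for illustration. It builds a string holding 'A' plus U+1D11E, a character outside the BMP, and counts what each loop style visits:

```delphi
program SurrogateLength;

{$APPTYPE CONSOLE}

var
  S: string;
  C: Char;
  I, ByIndex, ByEnumerator: Integer;
begin
  // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP, so UTF-16
  // stores it as the surrogate pair $D834 $DD1E.
  S := 'A' + Char($D834) + Char($DD1E);

  // Classic indexed loop: one iteration per UTF-16 code unit.
  ByIndex := 0;
  for I := 1 to Length(S) do
    Inc(ByIndex);

  // for-in loop: the enumerator also hands back one Char (code unit) at a time.
  ByEnumerator := 0;
  for C in S do
    Inc(ByEnumerator);

  Writeln('Length(S)     = ', Length(S));     // 3 code units for 2 characters
  Writeln('Indexed loop  = ', ByIndex);       // 3
  Writeln('for-in loop   = ', ByEnumerator);  // 3
end.
```

Both loops see the two halves of the pair as separate Chars, which is why neither one is any safer than the other.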
The only way to reduce an iterator over a string to a simple char-wise loop, whether explicit or using enumerators, is to first convert to UTF32, the facilities for which in the Delphi RTL are <cough> rudimentary, to put it politely. Non-existent may be nearer the mark. The precise mechanics of the loop construct used are not material to that problem.

However, just as before Unicode when most people didn't care and just wrote code that assumed ANSI==ASCII, these days people won't care and will write code that assumes that Unicode==BMP (Basic Multilingual Plane), ignoring surrogate pairs just as they used to ignore extended ASCII and ANSI characters.

And for most people, that will probably actually work.

J

From: delphi-boun...@delphi.org.nz [mailto:delphi-boun...@delphi.org.nz] On Behalf Of Colin Johnsun
Sent: Tuesday, 23 November 2010 14:31
To: NZ Borland Developers Group - Delphi List
Subject: Re: [DUG] Upgrading to XE - Unicode strings questions

I won't answer everything but just on this one question:

On 23 November 2010 11:04, John Bird <johnkb...@paradise.net.nz> wrote:

Extra question: It looks like code like

    for i := 1 to Length(string1) do
    begin
      DoSomethingWithOneChar(string1[i]);
    end;

cannot be used reliably. The problems are that Length(string1) looks like it cannot be safely used - as unicode characters may include 2 codepoints and Length(string1) highlights that there is a difference between the number of unicode characters in a string and the number of codepoints. Still figuring out what is the best practice here, as I have quite a lot of string routines. Should be OK as long as the unicode text actually is ASCII.

you can use something like this:

    var
      C: Char;
    ...
    for C in String1 do
    begin
      DoSomethingWithOneChar(C);
    end;

In this case you don't need to know the index of each character, you just get the char using the for..in..do loop.
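
For anyone who does need to step through a string one whole character at a time, here is a rough sketch of doing it by hand, assuming a Unicode Delphi. NextCodePoint is a hypothetical helper written for this example, not an RTL routine; it uses only the standard UTF-16 surrogate ranges ($D800..$DBFF high, $DC00..$DFFF low). The Character unit should offer equivalents such as IsHighSurrogate and ConvertToUtf32, but check what your compiler version actually ships before relying on them.

```delphi
program CodePointWalk;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the code point starting at S[Index] and
// advances Index past it (by 1 or 2 code units). An unpaired surrogate is
// returned unchanged rather than raising an error.
function NextCodePoint(const S: string; var Index: Integer): Integer;
var
  Hi, Lo: Integer;
begin
  Hi := Ord(S[Index]);
  Inc(Index);
  Result := Hi;
  if (Hi >= $D800) and (Hi <= $DBFF) and (Index <= Length(S)) then
  begin
    Lo := Ord(S[Index]);
    if (Lo >= $DC00) and (Lo <= $DFFF) then
    begin
      // Combine the two halves into a single code point above U+FFFF.
      Result := $10000 + ((Hi - $D800) shl 10) + (Lo - $DC00);
      Inc(Index);
    end;
  end;
end;

var
  S: string;
  I: Integer;
begin
  // 'A' followed by U+1D11E, stored as the surrogate pair $D834 $DD1E.
  S := 'A' + Char($D834) + Char($DD1E);
  I := 1;
  while I <= Length(S) do
    Writeln('U+', IntToHex(NextCodePoint(S, I), 4));
end.
```

Note that this only deals with surrogate pairs. Combining sequences (a base letter followed by a separate accent, say) still span several code points, so "one code point" is not always "one character on screen" either.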
_______________________________________________ NZ Borland Developers Group - Delphi mailing list Post: delphi@delphi.org.nz Admin: http://delphi.org.nz/mailman/listinfo/delphi Unsubscribe: send an email to delphi-requ...@delphi.org.nz with Subject: unsubscribe