Re: Elegant way to express constant UTF8 string in script?

Mark Schonewille Mon, 30 Jun 2014 08:10:54 -0700

Hi Ben,

The apostrophe doesn't work because you start from 8-bit native text, whose 
non-ASCII bytes mean different things on different platforms. If you don't use 
uniDecode and just set the unicodeText of a field to your Unicode string, it 
should work. If that's not practical, you could use macToISO() to convert your 
string to Latin-1.
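
The platform difference can be checked outside LiveCode. The following is a minimal sketch in Python, not LiveCode, and it is not part of anyone's shipped code: the helper names decode_as and utf8_escapes are made up for illustration, and it assumes Python's "mac_roman" and "latin-1" codecs are fair stand-ins for the Mac and Windows interpretations discussed in this thread. It decodes the two bytes from the Mac-authored script both ways and prints the UTF-8 escapes for the intended characters.

```python
# Bytes that a Mac-authored script holds for the two non-ASCII characters
# (MacRoman values, as quoted in the original question).
MAC_BYTES = {"copyright": b"\xa9", "apostrophe": b"\xd5"}


def decode_as(raw: bytes, encoding: str) -> str:
    """Interpret raw bytes under the given 8-bit encoding."""
    return raw.decode(encoding)


def utf8_escapes(text: str) -> str:
    """Return the UTF-8 bytes of text written as \\xNN escapes."""
    return "".join(f"\\x{b:02X}" for b in text.encode("utf-8"))


for name, raw in MAC_BYTES.items():
    as_mac = decode_as(raw, "mac_roman")   # what the author saw on Mac
    as_latin = decode_as(raw, "latin-1")   # same byte read as ISO-8859-1
    print(name, repr(as_mac), repr(as_latin), utf8_escapes(as_mac))
```

Byte 0xA9 decodes to the same character under both encodings, while 0xD5 does not, which matches the symptom described. The escapes printed for the MacRoman reading are the ones to use if the constant is written as explicit UTF-8 bytes.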


--
Kind regards,

Mark Schonewille
Economy-x-Talk
http://economy-x-talk.com


On 30 Jun 2014, at 16:38, Ben Rubinstein <benr...@cogapp.com> wrote:

> I think this problem should be solved in LC 7 (possibly using normaliseText); 
> but I need a solution that I can ship now (and it's been threatened that LC 7 
> will 'fix' a 'bug' which isn't, so I'm not sure if I'll ever be able to use it).
> 
> My app processes some data from - and then, re-organised, to - UTF8 text 
> files. Occasionally it needs to insert a constant string; and for various 
> reasons (all of them excellent) I want to specify these constant strings in 
> the script.  So far, so good.  Now however one of these constant strings 
> needs to contain a character which is not in ASCII.  Actually two of them.  
> So I need to express a UTF8 string in my script.  And I'm searching for an 
> elegant way to do this.
> 
> My constant string used to look something like this:
> 
>   constant kMyConstantString = "This is my ice cream"
> 
> but now it needs to read something like
>   constant kMyConstantString = "This ice cream is (c) Ben and Jerry's Inc"
> 
> (only with a smart apostrophe and a proper copyright symbol).
> 
> I thought I could just about manage with this
> 
>  put uniDecode(uniEncode("This ice cream is © Ben and Jerry’s Inc", "ANSI"), 
> "UTF8") into kMyConstantString
> 
> (that is, encode from ANSI to Unicode, then from Unicode into UTF8).
> 
> I tested it on Mac and it seemed to work.  The UTF8 file was generated and 
> this text came out just right.
> 
> 
> However, it turned out that when the code was compiled and run on Windows, 
> the copyright symbol came out OK, but the apostrophe came out as O-tilde (Õ).
> 
> This is because uniEncode(..., "ANSI") is a lie; "ANSI" is meaningless; 
> instead it treats the source as being in whatever encoding is native to the 
> operating system.  I wrote the script on Mac; in MacRoman, © is 0xA9 and 
> smart apostrophe is 0xD5; in ISO-8859-1 (and Unicode), 0xA9 is ©, but 0xD5 is 
> O-tilde (Õ).
> 
> So... what's the most elegant way to do this (is there one)?  Is there any 
> alternative to just looking up the UTF8 encodings and writing:
> 
>  put format("This ice cream is \xC2\xA9 Ben and Jerry\xE2\x80\x99s Inc") into 
> kMyConstantString
> 
> ?
> 
> TIA,
> 
> Ben
> 
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription 
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode

