Hi Ben,

The apostrophe doesn't work because the string you convert is 8-bit text in the platform's native encoding, and the same byte stands for a different character on different platforms. If you don't use uniDecode and just set the unicodeText of a field to your Unicode string, it should work. If that's not practical, you could use macToIso() to convert your string to Latin-1.
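To make the byte mismatch concrete, here is a small sketch in Python rather than LiveCode, chosen only because its codecs make the byte values easy to check. It assumes the script literal was saved as MacRoman and is later interpreted as Latin-1 or Windows-1252 on Windows; the variable names are illustrative. It also checks the hand-written UTF-8 escapes from the quoted message against an independent encoding of the same text.

```python
# Illustration only: which bytes a MacRoman-authored literal contains,
# and how those same bytes read under the usual Windows 8-bit encodings.
text = "This ice cream is \u00a9 Ben and Jerry\u2019s Inc"

# Bytes stored for the literal when the script is written on a Mac.
mac_bytes = text.encode("mac_roman")

# The same bytes interpreted as Latin-1 and as Windows-1252.
as_latin1 = mac_bytes.decode("latin-1")
as_cp1252 = mac_bytes.decode("cp1252")

# The UTF-8 form does not depend on the platform's native encoding.
utf8_bytes = text.encode("utf-8")
escaped = b"This ice cream is \xC2\xA9 Ben and Jerry\xE2\x80\x99s Inc"

print(as_latin1)              # copyright sign survives, apostrophe does not
print(utf8_bytes == escaped)  # the hand-written escapes match
```

The copyright sign survives only because 0xA9 happens to mean the same character in MacRoman and Latin-1; the apostrophe's byte, 0xD5, does not. Writing the UTF-8 bytes explicitly sidesteps the native encoding altogether.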
--
Kind regards,

Mark Schonewille

Economy-x-Talk
http://economy-x-talk.com

Share the clipboard of your computer over a local network with Clipboard Link
http://clipboardlink.economy-x-talk.com

On 30 Jun 2014, at 16:38, Ben Rubinstein <benr...@cogapp.com> wrote:

> I think this problem should be solved in LC 7 (possibly using normaliseText);
> but I need a solution that I can ship now (and it's been threatened that LC 7
> will 'fix' a 'bug' which isn't, so I'm not sure if I'll ever be able to use it).
>
> My app processes some data from - and then, re-organised, to - UTF8 text
> files. Occasionally it needs to insert a constant string; and for various
> reasons (all of them excellent) I want to specify these constant strings in
> the script. So far, so good. Now however one of these constant strings
> needs to contain a character which is not in ASCII. Actually two of them.
> So I need to express a UTF8 string in my script. And I'm searching for an
> elegant way to do this.
>
> My constant string used to look something like this:
>
>   constant kMyConstantString = "This is my ice cream"
>
> but now it needs to read something like
>
>   constant kMyConstantString = "This ice cream is (c) Ben and Jerry's Inc"
>
> (only with a smart apostrophe and a proper copyright symbol).
>
> I thought I could just about manage with this
>
>   put uniDecode(uniEncode("This ice cream is © Ben and Jerry’s Inc", "ANSI"), "UTF8") into kMyConstantString
>
> (that is, encode from ANSI to Unicode, then from Unicode into UTF8).
>
> I tested it on Mac and it seemed to work. The UTF8 file was generated and
> this text came out just right.
>
> However, it turned out that when the code was compiled and run on Windows,
> the copyright symbol came out OK, but the apostrophe came out as o-tilde.
>
> This is because uniEncode(..., "ANSI") is a lie; "ANSI" is meaningless;
> instead it interprets the source encoding as whatever is typical for the
> operating system. I wrote the script on Mac; in MacRoman, © is 0xA9 and
> smart apostrophe is 0xD5; in ISO-8859-1 (and UTF8), 0xA9 is ©, but 0xD5 is
> o-tilde.
>
> So... what's the most elegant way to do this (is there one)? Is there any
> alternative to just looking up the UTF8 encodings and writing:
>
>   put format("This ice cream is \xC2\xA9 Ben and Jerry\xE2\x80\x99s Inc") into kMyConstantString
>
> ?
>
> TIA,
>
> Ben
>
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode