Does this mean one could replace "\u" with "0x", then replace "uls" with empty, and end up with the correct result?
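For what it's worth, the JSON spec (RFC 8259) defines a \u escape as always followed by exactly four hex digits, so in "Ba\u00f1uls" the escape is just \u00f1 and "uls" is the rest of the name (Bañuls). Code points above U+FFFF, such as 0x111F4, are written as two escapes forming a UTF-16 surrogate pair (\ud804\uddf4). Below is a minimal sketch of the "loop over every \u" idea under that assumption. It is in Python rather than LiveCode, purely as an illustration; the helper names are made up for this example, and it does not handle an escaped backslash in front of a "u".

```python
import re

# Either a surrogate pair (two escapes) or a single four-digit escape.
_ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)

def _decode(match):
    # Hypothetical helper: turn one matched escape (or pair) into a character.
    high, low, single = match.groups()
    if single is not None:
        return chr(int(single, 16))
    # Combine the high and low surrogate into one code point above U+FFFF.
    code = 0x10000 + ((int(high, 16) - 0xD800) << 10) + (int(low, 16) - 0xDC00)
    return chr(code)

def unescape_json_unicode(text):
    # Replace every escape in one pass; all other text is left untouched.
    return _ESCAPE.sub(_decode, text)

print(unescape_json_unicode(r"Eduardo Ba\u00f1uls"))  # Eduardo Bañuls
```

The same fixed-width rule should carry over to a LiveCode loop: take exactly four characters after each "\u" and leave everything after them alone.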

On Wed, Mar 15, 2017 at 2:16 PM, Richmond Mathewson via use-livecode <
use-livecode@lists.runrev.com> wrote:

> Just knock off the last 3, and what is left is what you want.
>
> Richmond.
>
> On 3/15/17 6:43 pm, J. Landman Gay via use-livecode wrote:
>
>> The problem with the pseudo code is that there's no clear indication of
>> how many characters at the end to preserve. I'm not sure how the libraries
>> deal with that.
>>
>> --
>> Jacqueline Landman Gay | jac...@hyperactivesw.com
>> HyperActive Software | http://www.hyperactivesw.com
>>
>> On March 15, 2017 2:28:57 AM Richmond Mathewson via use-livecode <
>> use-livecode@lists.runrev.com> wrote:
>>
>>> No; it won't always be 4 characters, here's an admittedly extremely
>>> obscure ancient Sinhala number;
>>> 0x111F4.
>>>
>>> Of course the chances of encountering whacky characters like that are
>>> small, but you'll have to make sure you
>>> can cope with them should they crop up.
>>>
>>> If you look at Eduardo Ba\u00f1uls you will have to strip what comes
>>> after the '\' of the prefix 'u'
>>> and the suffix 'uls' and then you can cope with whatever is left:
>>>
>>> Reasonably pseudo-code following:
>>>
>>> set the item delimiter to \
>>> put what's after the item delimiter into HOLDER
>>> delete char 1 of HOLDER
>>> delete the last char of HOLDER
>>> delete the last char of HOLDER
>>> delete the last char of HOLDER
>>> put "0x" & HOLDER into NUNUM
>>>
>>> at this point "NUNUM" could be almost any length, but that should not
>>> matter unduly.
>>>
>>> Richmond.
>>>
>>> On 3/14/17 11:26 pm, J. Landman Gay via use-livecode wrote:
>>>
>>>> I'm dealing with non-English languages, and JSON data retrieved from a
>>>> database comes in with unicode escape sequences like this: Eduardo
>>>> Ba\u00f1uls.
>>>>
>>>> I need to translate those. I can do it by replacing the "\u" with "0x"
>>>> and then using numToCodepoint() to get the UTF-16 character. But there
>>>> could be many of these in the same string, so I'm looking for a
>>>> one-shot command that might just do them all. I don't think we have one.
>>>>
>>>> The alternative is to loop through all the text, getting an offset for
>>>> each "\u" and then calculating the number of characters after that to
>>>> use with numToCodepoint(). But will it always be 4 characters in any
>>>> language?
>>>>
>>>> Or is there an easier way?
>>>>
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode@lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode