Richard Gaskin wrote:

Converting ISO-8859-1 character references to displayable text is a snap:

If field 1 contains this:

      Don’t give up & call it quits.

...I can get the plain text like this:

     set the htmlText of fld 2 to the text of fld 1
     get the text of fld 2


But what do I do when the data I'm working with contains hex character references?:


     Don’t give up & call it quits.

I have a bunch of XML files that are UTF-8 encoded and chock full o' hex character references like that, and doing a replace on each or hunting them down to do a baseConvert would be inefficient.

I'd like to think some combination of Unicode functions/properties would do the trick, but alas I'm too braindead to come up with the winning solution.


Sorry, I'm clueless about Unicode; nothing leaps out of the docs to suggest itself.

If there isn't a clever Unicode method, you could do the following .... note it ignores the more complex parts of UTF-*, and deals only with those chars that can be represented in 2 hex digits ....

It uses replace to do the actual changes - but only does one replace for each character encoded in the original, so should be pretty fast (NB : not tested for speed - only for working correctly in simple cases).


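     on mouseUp
        local tText, tArr, tNew, tmp
        put the text of field "inField" into tText
        put tText into tmp
        -- split on "&" (elements) and ";" (keys), so the body of each
        -- reference, e.g. "#x92", ends up as an array key
        split tmp by "&" and ";"
        put the keys of tmp into tArr
        -- keep only the keys that look like hex references
        filter tArr with "#x*"
        repeat for each line L in tArr
           -- two hex digits to decimal, e.g. "92" becomes 146
           put baseConvert(char 3 to 4 of L, 16, 10) into tNew
           -- swap "x92" for "146", turning the hex reference into a decimal one
           replace (char 2 to 4 of L) with tNew in tText
        end repeat
        put tText after msg
     end mouseUp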
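
A possible extension, untested: the script above reads exactly two hex digits (char 3 to 4), so a longer reference would be converted wrongly. The sketch below takes every digit after "#x" and replaces the whole reference, delimiters included, so stray text such as "x92" elsewhere in the field is left alone. The function name hexRefsToDecimal is made up for illustration, and whether htmlText then displays values above 255 correctly is an assumption I have not checked.

```livecode
function hexRefsToDecimal pText
   local tTmp, tKeys, tHex
   put pText into tTmp
   -- same trick as the mouseUp script: the text between "&" and ";"
   -- becomes an array key
   split tTmp by "&" and ";"
   put the keys of tTmp into tKeys
   repeat for each line L in tKeys
      -- accept only "#x" followed by one or more hex digits;
      -- tHex receives the digits
      if matchText(L, "^#[xX]([0-9A-Fa-f]+)$", tHex) then
         -- rewrite the full reference as its decimal equivalent
         replace ("&" & L & ";") with ("&#" & baseConvert(tHex, 16, 10) & ";") in pText
      end if
   end repeat
   return pText
end hexRefsToDecimal
```

It would then slot into the original two-liner, something like:

     set the htmlText of fld 2 to hexRefsToDecimal(the text of fld 1)
     get the text of fld 2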

-- Alex.



_______________________________________________
use-revolution mailing list
[email protected]
http://lists.runrev.com/mailman/listinfo/use-revolution
