Re: Help with Unicode Text

Dar Scott Mon, 28 Mar 2005 11:30:26 -0800


On Mar 28, 2005, at 12:06 PM, Dan Friedman wrote:

Anyone know how to replace a return char in a unicode string?


There are two problems with your method.

Well, the first is really only a potential problem, depending on what you want to do. Do you mean the ASCII carriage return (code 13)? Or the Revolution newline character, which is coded the same as the ASCII line feed (code 10)?

The second is that the test character is a single-byte character. However, each character in a unicode string is two bytes, a 16-bit value in host byte order, that is, UTF-16. Even then you can't just convert the character to two bytes for the platform and search the raw bytes. A match can start on the second byte of one character and end on the first byte of the next, so you might match half of one character and half of the next.
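
To make the half-and-half match concrete, here is a small sketch. The byte values are written out by hand in big-endian order purely for illustration (on a little-endian host the two bytes of each character would be swapped), and the variable names are arbitrary. The two characters U+0100 and U+0A05 are stored as the bytes 01 00 0A 05, and a line feed as 00 0A, so a plain byte search can find a line feed that isn't there:

```livecode
on mouseUp
  local tText, tLF, tPos
  -- U+0100 followed by U+0A05, big-endian bytes written out by hand
  put numToChar(1) & numToChar(0) & numToChar(10) & numToChar(5) into tText
  -- ASCII line feed (10) as one 16-bit big-endian value
  put numToChar(0) & numToChar(10) into tLF
  -- offset() searches byte by byte and knows nothing about
  -- two-byte character boundaries
  put offset(tLF, tText) into tPos
  -- real characters start on odd byte positions (1, 3, 5, ...),
  -- so a hit on an even position straddles two characters
  if (tPos > 0) and ((tPos mod 2) is 0) then
    answer "False match at byte" && tPos & ": it straddles two characters"
  end if
end mouseUp
```

Neither character in the text is a line feed, yet the search pattern lines up with the low byte of the first character and the high byte of the second.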

The pattern for repeating for each unicode character is like this:

-- for each unicode char uc in sBMP
  repeat with i = 1 to length(sBMP)-1 step 2
    put char i to i+1 of sBMP into uc
    -- body
  end repeat

That assumes there are no surrogates, that is, every character is in the Basic Multilingual Plane and takes exactly two bytes.

One way to convert your ASCII test char c to its two-byte unicode form is this (the result ends up in the variable it):

  get uniEncode(c,"UTF8")

So, you can go through each unicode character, appending it to a result string, but substituting the replacement whenever the character matches the converted test char.
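
Putting the pieces together, a minimal sketch of that loop might look like the following. The function name and its parameter names are made up for this example. It assumes the text has no surrogates and that the character to find is a single ASCII character. The caseSensitive is set to true so that the two-byte comparison is not case-folded.

```livecode
-- Hypothetical helper: replace one ASCII character in a UTF-16 string.
--   pText        UTF-16 text, two bytes per character, no surrogates
--   pFind        a single ASCII character to look for
--   pReplacement the ASCII text to put in its place
function replaceUnicodeChar pText, pFind, pReplacement
  local tFind, tReplacement, tResult, uc
  -- compare the two bytes exactly rather than ignoring case
  set the caseSensitive to true
  -- convert the ASCII values to two-byte unicode in host order
  put uniEncode(pFind, "UTF8") into tFind
  put uniEncode(pReplacement, "UTF8") into tReplacement
  put empty into tResult
  -- for each unicode char uc in pText
  repeat with i = 1 to length(pText) - 1 step 2
    put char i to i + 1 of pText into uc
    if uc is tFind then
      put tReplacement after tResult
    else
      put uc after tResult
    end if
  end repeat
  return tResult
end replaceUnicodeChar

-- Possible use, with made-up field names:
--   put replaceUnicodeChar(the unicodeText of field "Source", return, space) into tClean
--   set the unicodeText of field "Result" to tClean
```

Because the loop only ever compares whole two-byte characters, it cannot match half of one character and half of the next.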

Dar

--
**********************************************
    DSC (Dar Scott Consulting & Dar's Lab)
    http://www.swcp.com/dsc/
    Programming Services and Software
**********************************************

_______________________________________________
use-revolution mailing list
[email protected]
http://lists.runrev.com/mailman/listinfo/use-revolution
