On Jun 8, 2009, at 2:40 PM, Andrew Lindesay wrote:

If you render the original string, I presume that it does not contain the corrupted UTF-8 sequence and renders the glyphs correctly?

Right. If I change the number of characters, I get different results: truncating to 12 bytes yields two Japanese characters, and truncating to 6 yields one.

returnValue  = new String(textBlock.toString().getBytes("UTF-8"), 0, lengthTruncated, "UTF-8");

^^^ I know you tried it using sub-strings, but this above would definitely cause trouble as it could break inside multi-byte sequences.
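
For illustration, here is a minimal sketch of one way to truncate to a byte budget without splitting a multi-byte sequence. It assumes the text is already a correctly decoded Java String; the class and method names (Utf8Truncate, truncateUtf8) are hypothetical and not taken from the code under discussion. Instead of cutting the encoded byte array, it walks the string one code point at a time and stops before the UTF-8 size would exceed the limit.

```java
public class Utf8Truncate {

    // Hypothetical helper: returns the longest prefix of s whose UTF-8
    // encoding fits in maxBytes, without cutting inside a multi-byte
    // sequence or between the two halves of a surrogate pair.
    static String truncateUtf8(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // Number of bytes UTF-8 uses for this code point.
            int size = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + size > maxBytes) {
                break; // the next character would not fit completely
            }
            bytes += size;
            i += Character.charCount(cp); // 2 chars for a surrogate pair
        }
        return s.substring(0, i);
    }

    public static void main(String[] args) {
        // Three Japanese characters, 3 bytes each in UTF-8 (9 bytes total).
        String text = "\u65E5\u672C\u8A9E";
        // A 7-byte budget only has room for two whole characters (6 bytes).
        System.out.println(truncateUtf8(text, 7).length()); // prints 2
    }
}
```

If the original string is already damaged, this will not repair it; it only avoids introducing a new break at the truncation point.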

I still get 'fractional' multi-byte characters but the results are different:


Previously:

Note that the length is different, so it does make an attempt to count the glyphs. This could mean that it's a different type of encoding, and so my data is corrupted, or at least not what I think it is.

Thanks

kib

"Success is not final, failure is not fatal: it is the courage to continue that counts."
Winston Churchill

Klaus Berkling
Systems Administrator
DynEd International, Inc.




