Hi all.

This seems like it should work, but it doesn't. I truncate a string that may contain Japanese characters, purely for display purposes. Double-byte or multi-byte characters are split apart. Results look like this:

    お使いのコンピュータにDVDドライブが搭載れているかは�?

Here is the code:

    public String stringWithNoHTML(String aStringWithHTML, int lengthTruncated) {
        String returnValue = null;

        if (aStringWithHTML != null && aStringWithHTML.length() > 0) {
            //StringBuffer textBlock = new StringBuffer(aStringWithHTML);
            StringBuffer textBlock = new StringBuffer();

            Pattern htmlTagPattern = Pattern.compile("<(.|\n|\r)+?>|&[a-zA-Z0-9]+;");
            Matcher lineBreakMatcher = htmlTagPattern.matcher(aStringWithHTML);

            boolean results = lineBreakMatcher.find();
            while (results) {
                lineBreakMatcher.appendReplacement(textBlock, " ");
                results = lineBreakMatcher.find();
            }
            lineBreakMatcher.appendTail(textBlock);

            if (lengthTruncated > 0 && textBlock.length() > SUMMARY_LENGTH) {
                try {
                    returnValue = new String(textBlock.toString().getBytes("UTF-8"), 0, lengthTruncated, "UTF-8");
                } catch (UnsupportedEncodingException ex) {
                    returnValue = null;
                }
                //returnValue = new String(textBlock.substring(0, lengthTruncated) + "...");
            } else
                returnValue = textBlock.toString();
        }

        return returnValue;
    }

The original string may contain single-byte characters as well. I expect the string to be truncated properly, without chopping off bytes of a character. It works fine with single-byte characters.

Using

    returnValue = new String(textBlock.toString().getBytes("UTF-8"), 0, lengthTruncated, "UTF-8");

or

    returnValue = new String(textBlock.substring(0, lengthTruncated) + "...");

makes no difference. I also bypassed the regex pattern and still see the same problem. Files, components, classes, etc. are in UTF-8.

Has anyone seen this before, and is there a workaround?

Thanks
kib

"Success is not final, failure is not fatal: it is the courage to continue that counts."
Winston Churchill

Klaus Berkling
Systems Administrator
DynEd International, Inc.
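For reference, a minimal sketch of one way to truncate without cutting a character in half. In the byte-based variant above, lengthTruncated is treated as a byte count, and Japanese characters generally take three bytes each in UTF-8, so a cut at an arbitrary byte offset can land inside a character and decode to U+FFFD. The sketch counts Unicode code points instead. The class name TruncateSketch and the method truncateByCodePoints are hypothetical names used only for illustration; they are not part of the posted code or of WebObjects.

```java
// Hypothetical helper, for illustration only.
final class TruncateSketch {

    // Returns text unchanged if it has at most maxCodePoints code points;
    // otherwise keeps the first maxCodePoints code points and appends "...".
    static String truncateByCodePoints(String text, int maxCodePoints) {
        if (text == null || maxCodePoints <= 0) {
            return text;
        }
        // Count code points, not UTF-16 chars and not UTF-8 bytes.
        if (text.codePointCount(0, text.length()) <= maxCodePoints) {
            return text;
        }
        // Translate the code point count into a char index that never
        // falls between the two halves of a surrogate pair.
        int end = text.offsetByCodePoints(0, maxCodePoints);
        return text.substring(0, end) + "...";
    }
}
```

In the posted method this would stand in for the new String(..., "UTF-8") call, for example returnValue = TruncateSketch.truncateByCodePoints(textBlock.toString(), lengthTruncated). Note that substring() already works on UTF-16 chars rather than bytes, so it should not split ordinary Japanese characters; if the substring variant still shows broken characters, the cause may lie outside this method (for instance in how the response is encoded), which this sketch does not address. The posted code also compares against SUMMARY_LENGTH rather than lengthTruncated, which may or may not be intended.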
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription: http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com