I don't have a JUnit test readily available, but I can provide a brief overview of what we are doing:
1. We are providing XML-compliant Unicode character references for Chinese text in the resource bundle, for instance:

printVatSummary.vatSummary=& #165; & #x589e; & #x503c; & #x7a0e; & #x6458; & #x8981;
(Extra spaces have been introduced above for better readability. The actual code does not have these spaces.)

2. The application generates the DataPOJO by reading properties from the resource bundle.

DATA POJO:
- taxSummaryDesc=& #165; & #x589e; & #x503c; & #x7a0e; & #x6458; & #x8981;
(Extra spaces have been introduced above for better readability. The actual code does not have these spaces.)

3. The DataPOJO is next marshalled into XML using the Castor APIs. The <tax-summary-desc> element below contains invalid Unicode characters. If we transform this XML to HTML by applying stylesheets, we get a UTFDataFormatException as follows:

<?xml version="1.0" encoding="UTF-8"?>
<tax-summary-desc>&#165;&#x589e;&#x503c;&#x7a0e;&#x6458;&#x8981;</tax-summary-desc>

error::java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence.

We'll need to configure Castor to render the special characters as they are.

Werner Guttmann wrote:
> I am not 100% sure whether this is a bug or not, but would like to look
> into this further. You wouldn't be able to supply us with e.g. a JUnit
> test that highlights the problem?
>
> Werner
>
> xsltuser wrote:
>> Dear All,
>>
>> I'm using Castor's Marshaller API to marshal java object to xml. The input
>> object has some unicode characters of the form & # x 1234; (without spaces)
>> representing chinese characters. The output xml after marshalling displays
>> & amp ; x 1234; (without spaces) in place of & # x 1234; (without spaces)
>> as a result of which chinese characters are not rendered correctly when the
>> XML is transformed into HTML using XSLT. I'm using UTF-8 encoding.
>>
>> Any help is greatly appreciated.
>>
>> Thanks.
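
One possible workaround, not confirmed anywhere in this thread: because the marshaller escapes the `&` in text content, the numeric character references could be decoded into real characters before the value is stored in the DataPOJO, so that the marshaller only ever sees plain Unicode text. The sketch below is a minimal illustration under that assumption. `NcrDecoder` and `decode` are hypothetical names and are not part of Castor; the code uses only the standard `java.util.regex` classes. (Java `.properties` files also accept `\uXXXX` escapes directly, which may avoid the need for decoding at all.)

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Castor): turns numeric character
// references such as &#165; or &#x589e; into the characters they denote.
final class NcrDecoder {

    // Group 1 captures hexadecimal digits, group 2 captures decimal digits.
    private static final Pattern NCR =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    static String decode(String text) {
        Matcher m = NCR.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the literal text between the previous match and this one.
            out.append(text, last, m.start());
            String ref = m.group();
            try {
                int codePoint = (m.group(1) != null)
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2), 10);
                if (Character.isValidCodePoint(codePoint)) {
                    out.appendCodePoint(codePoint);
                } else {
                    out.append(ref); // out of Unicode range: leave untouched
                }
            } catch (NumberFormatException e) {
                out.append(ref); // too many digits to parse: leave untouched
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Same value as printVatSummary.vatSummary, without the extra spaces.
        String raw = "&#165;&#x589e;&#x503c;&#x7a0e;&#x6458;&#x8981;";
        String decoded = decode(raw);
        System.out.println(decoded.length()); // prints 6
    }
}
```

With this approach the application would call `NcrDecoder.decode(...)` on each value read from the resource bundle before setting it on the DataPOJO. Named entities such as `&amp;` are deliberately left alone, since the pattern only matches references that start with `&#`.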

