Rewriting XML element: <tax-summary-desc> & amp;#165; & amp;#x589e; & amp;#x503c; & amp;#x7a0e; & amp;#x6458; & amp;#x8981;</tax-summary-desc>
xsltuser wrote:
>
> I don't have a JUnit test readily available, but I can provide a brief
> overview of what we are doing:
>
> 1. We provide XML-compliant Unicode character references for Chinese text
> in the resource bundle, for instance:
>
> printVatSummary.vatSummary=& #165; & #x589e; & #x503c; & #x7a0e; & #x6458; & #x8981;
>
> (Extra spaces have been introduced above for better readability. The actual
> code does not have these spaces.)
>
> 2. The application generates the DataPOJO, reading properties from the
> resource bundle.
>
> DataPOJO:
>
> - taxSummaryDesc=& #165; & #x589e; & #x503c; & #x7a0e; & #x6458; & #x8981;
>
> (Extra spaces have been introduced above for better readability. The actual
> code does not have these spaces.)
>
> 3. The DataPOJO is then marshalled into XML using the Castor APIs. The
> <tax-summary-desc> element below contains invalid Unicode characters. If we
> transform this XML to HTML by applying stylesheets, we get a
> UTFDataFormatException as follows:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <tax-summary-desc>&#165;&#x589e;&#x503c;&#x7a0e;&#x6458;&#x8981;</tax-summary-desc>
>
> error::java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8
> sequence.
>
> We'll need to configure Castor to render the special characters as is.
>
>
> Werner Guttmann wrote:
>>
>> I am not 100% sure whether this is a bug or not, but I would like to look
>> into this further. Would you be able to supply us with e.g. a JUnit
>> test that highlights the problem?
>>
>> Werner
>>
>> xsltuser wrote:
>>> Dear All,
>>>
>>> I'm using Castor's Marshaller API to marshal a Java object to XML. The
>>> input object has some Unicode characters of the form & # x 1234;
>>> (without spaces) representing Chinese characters. The output XML after
>>> marshalling displays & amp ; x 1234; (without spaces) in place of
>>> & # x 1234; (without spaces), as a result of which the Chinese characters
>>> are not rendered correctly when the XML is transformed into HTML using
>>> XSLT. I'm using UTF-8 encoding.
>>>
>>> Any help is greatly appreciated.
>>>
>>> Thanks.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe from this list, please visit:
>>
>> http://xircles.codehaus.org/manage_email
>>
>

--
View this message in context: http://www.nabble.com/Invalid-Unicode-characters-in-XML-tp15951719p16001446.html
Sent from the Castor - User mailing list archive at Nabble.com.
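
A possible workaround for the flow described in steps 1 to 3, offered only as a sketch and not as a confirmed Castor setting: the resource bundle value is a plain string that happens to contain XML character references, so a marshaller treating it as text would be expected to escape each leading ampersand. Decoding the references into real Unicode characters before the value reaches the DataPOJO may avoid that. The class name `CharRefDecoder` and the method `decode` are hypothetical names for illustration; the code uses only `java.util.regex` and assumes the references are well formed. Storing the text in the properties file as `\uXXXX` escapes might be another option.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Castor): turns XML numeric character
// references such as "&#165;" or "&#x589e;" into the characters they denote.
class CharRefDecoder {

    // Group 1 captures hexadecimal digits, group 2 captures decimal digits.
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    static String decode(String text) {
        Matcher m = CHAR_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: the number is a valid Unicode code point;
            // otherwise parseInt or toChars will throw.
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            String replacement = new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy whatever follows the last reference
        return out.toString();
    }

    public static void main(String[] args) {
        // The value from the resource bundle example, without the extra spaces.
        String raw = "&#165;&#x589e;&#x503c;&#x7a0e;&#x6458;&#x8981;";
        String decoded = decode(raw);
        // Print code points so the result is readable on any console encoding.
        for (int i = 0; i < decoded.length(); i++) {
            System.out.printf("U+%04X%n", (int) decoded.charAt(i));
        }
    }
}
```

With the decoded string in `taxSummaryDesc`, the marshalled UTF-8 output would carry the characters themselves, so no ampersands remain in the value to be escaped. Text that is not a numeric character reference passes through unchanged.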

