Jasper Krauter created BATIK-1328:
-------------------------------------

Summary: No support for unicode characters in U+10000 - U+10FFFF range
Key: BATIK-1328
URL: https://issues.apache.org/jira/browse/BATIK-1328
Project: Batik
Issue Type: Bug
Components: SVG DOM
Affects Versions: 1.13
Reporter: Jasper Krauter

The SVG Transcoder checks for valid XML characters but does not take into account characters in the U+10000 - U+10FFFF range, which the Java String implementation represents as two Java chars (a UTF-16 surrogate pair). Since neither of those individual chars is a valid XML character on its own, the transcoder fails, even though the [XML 1.0 specification|https://www.w3.org/TR/xml/#charsets] does allow those characters.

In {{org.apache.batik.dom.util.DOMUtilities#contentToString}}, {{String#codePointAt}} should be used instead of {{String#charAt}} to extract individual characters. Using {{StringBuffer#appendCodePoint}}, the code points can then be properly appended to the output string. The methods that check for character validity already account for code points.

Code example to reproduce the issue:

{code:java}
import java.io.OutputStreamWriter;
import org.apache.batik.anim.dom.SVGDOMImplementation;
import org.apache.batik.transcoder.TranscoderInput;
import org.apache.batik.transcoder.TranscoderOutput;
import org.apache.batik.transcoder.svg2svg.SVGTranscoder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

String svgNS = SVGDOMImplementation.SVG_NAMESPACE_URI;
Document doc = SVGDOMImplementation.getDOMImplementation().createDocument(svgNS, "svg", null);
Element text = doc.createElementNS(svgNS, "text");
// U+1F44B (waving hand) lies outside the BMP and is stored as a surrogate pair
text.setTextContent("Hello, world! 👋");
doc.getDocumentElement().appendChild(text);
var transcoder = new SVGTranscoder();
TranscoderOutput out = new TranscoderOutput(new OutputStreamWriter(System.out));
TranscoderInput in = new TranscoderInput(doc);
transcoder.transcode(in, out);
{code}

throws

{code:java}
Exception in thread "main" java.lang.RuntimeException: IO:Invalid character
  at batik.transcoder@1.13/org.apache.batik.transcoder.svg2svg.SVGTranscoder.transcode(SVGTranscoder.java:179)
{code}
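
The difference between the two iteration styles can be illustrated with a minimal, Batik-independent sketch. The class {{CodePointCopySketch}} and its methods are hypothetical names invented for this illustration; they are not Batik code and not the actual body of {{DOMUtilities#contentToString}}. The validity check is assumed to follow the XML 1.0 {{Char}} production, and an unchecked exception stands in for whatever error the real code raises.

```java
// Hypothetical illustration only: not part of Batik.
final class CodePointCopySketch {

    // XML 1.0 "Char" production, evaluated on a full Unicode code point.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Per-char walk: each half of a surrogate pair is checked on its own
    // and rejected, which mirrors the failure described in this issue.
    static String copyByChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!isXmlChar(c)) {
                throw new IllegalArgumentException("Invalid character");
            }
            out.append(c);
        }
        return out.toString();
    }

    // Per-code-point walk: a surrogate pair is read as one code point,
    // validated once, and appended as a unit.
    static String copyByCodePoint(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException("Invalid character");
            }
            out.appendCodePoint(cp);
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "Hello, world! \uD83D\uDC4B"; // ends with U+1F44B
        System.out.println("code-point copy equals input: " + copyByCodePoint(s).equals(s));
        try {
            copyByChar(s);
        } catch (IllegalArgumentException e) {
            System.out.println("per-char copy failed: " + e.getMessage());
        }
    }
}
```

Note that an unpaired surrogate is still rejected by the code-point walk, because {{String#codePointAt}} returns the lone surrogate value, which falls outside the ranges accepted by the check.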