jkesselm commented on code in PR #166:
URL: https://github.com/apache/xalan-java/pull/166#discussion_r1465440629
##########
serializer/src/main/java/org/apache/xml/serializer/ToStream.java:
##########
@@ -1595,23 +1599,40 @@ else if (m_encodingInfo.isInEncoding(ch)) {
                     // not in the normal ASCII range, we also
                     // just leave it get added on to the clean characters
                 }
-                else if (Encodings.isHighUTF16Surrogate(ch) && i < end-1 && Encodings.isLowUTF16Surrogate(chars[i+1])) {
-                    // So, this is a (valid) surrogate pair
-                    if (! m_encodingInfo.isInEncoding(ch, chars[i+1])) {
-                        int codepoint = Encodings.toCodePoint(ch, chars[i+1]);
-                        writeOutCleanChars(chars, i, lastDirtyCharProcessed);
-                        writer.write("&#");
-                        writer.write(Integer.toString(codepoint));
-                        writer.write(';');
-                        lastDirtyCharProcessed = i+1;
-                    }
-                    i++; // skip the low surrogate, too
+                else if (Encodings.isHighUTF16Surrogate(ch)) {
+                    // Store for later processing. We may be at the end of a buffer,
+                    // and must wait till low surrogate arrives
+                    // before we can do anything with this.
+                    writeOutCleanChars(chars, i, lastDirtyCharProcessed);
+                    m_highUTF16Surrogate = ch;
+                    lastDirtyCharProcessed = i;
+                }
+                else if (m_highUTF16Surrogate != 0 && Encodings.isLowUTF16Surrogate(ch)) {
+                    // The complete utf16 byte sequence is now available and may be serialized.
+                    if (! m_encodingInfo.isInEncoding(m_highUTF16Surrogate, ch)) {

Review Comment:
   I'm a bit confused by your intent here. The high surrogate is never in any of the encodings, and should never be output by itself, so this case is firing incorrectly.
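
To make the concern concrete, here is a minimal, self-contained sketch of holding a high surrogate back until its low half arrives, possibly in a later buffer. It is not the ToStream code and not a proposed patch: `SurrogateCarrySketch` and `pendingHigh` are hypothetical names (the latter standing in for the role of `m_highUTF16Surrogate`), it uses `java.lang.Character` rather than the `Encodings` helpers, it assumes every pair falls outside the output encoding and therefore always writes a numeric character reference, and throwing on an unpaired surrogate is only a placeholder policy.

```java
public class SurrogateCarrySketch {
    // Hypothetical stand-in for the role of m_highUTF16Surrogate; 0 means nothing is pending.
    private char pendingHigh = 0;

    /**
     * Appends chars[start .. start+length) to out. A high surrogate is never
     * written by itself: it is held until the matching low surrogate shows up,
     * which may happen in a later call.
     */
    public void write(char[] chars, int start, int length, StringBuilder out) {
        for (int i = start; i < start + length; i++) {
            char ch = chars[i];
            if (pendingHigh != 0) {
                if (!Character.isLowSurrogate(ch)) {
                    // Placeholder policy (an assumption of this sketch).
                    throw new IllegalArgumentException("unpaired high surrogate");
                }
                // Assumption: the pair is not representable in the output
                // encoding, so it is always written as a character reference.
                int codePoint = Character.toCodePoint(pendingHigh, ch);
                out.append("&#").append(codePoint).append(';');
                pendingHigh = 0;
            } else if (Character.isHighSurrogate(ch)) {
                // Hold it back; the low half may be in the next buffer.
                pendingHigh = ch;
            } else if (Character.isLowSurrogate(ch)) {
                // Placeholder policy (an assumption of this sketch).
                throw new IllegalArgumentException("unpaired low surrogate");
            } else {
                out.append(ch);
            }
        }
    }
}
```

In this sketch the encoding check would only make sense once both halves are known, which is the point of the comment above: nothing should be decided about, or written for, the high surrogate alone.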