Simon White created XERCESC-1987:
------------------------------------
Summary: Transcoding Issue with single XMLCh to utf8
Key: XERCESC-1987
URL: https://issues.apache.org/jira/browse/XERCESC-1987
Project: Xerces-C++
Issue Type: Bug
Components: Utilities
Affects Versions: 2.8.0
Environment: Windows XP
Reporter: Simon White
There appears to be an issue with transcoding to UTF-8. Conditions:
Input string = a single Chinese character (the XMLCh holds value 27493, i.e. U+6B65).
Problem code in TranscodeToStr::transcode:
unsigned int allocSize = len * sizeof(XMLCh);
fString = (XMLByte*)fMemoryManager->allocate(allocSize);
With len == 1 this code allocates an output buffer of only two bytes. The
character in question, however, encodes to three bytes in UTF-8. It therefore
hits this check in XMLUTF8Transcoder.cpp:
// If we cannot fully get this char into the output buffer,
// then leave it for the next time.
//
if (outPtr + encodedBytes > outEnd)
    break;
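To make the size mismatch concrete, here is a minimal standalone sketch of the arithmetic behind that check. The helpers utf8EncodedLength and wouldBreak are hypothetical names used for illustration only; they are not part of Xerces-C++, and the code does not call the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a Xerces-C++ function): number of bytes that
// UTF-8 needs to encode a single Unicode code point.
std::size_t utf8EncodedLength(std::uint32_t codePoint)
{
    assert(codePoint <= 0x10FFFF);       // outside this range is not Unicode
    if (codePoint < 0x80)    return 1;   // ASCII
    if (codePoint < 0x800)   return 2;
    if (codePoint < 0x10000) return 3;   // rest of the BMP, includes 27493
    return 4;                            // supplementary planes
}

// Hypothetical helper mirroring the quoted test
// "outPtr + encodedBytes > outEnd": true when the encoded character
// does not fit into an output buffer of outputBytes bytes.
bool wouldBreak(std::uint32_t codePoint, std::size_t outputBytes)
{
    return utf8EncodedLength(codePoint) > outputBytes;
}
```

Under these assumptions, wouldBreak(27493, 2) is true for the two-byte buffer described above, while wouldBreak(27493, 3) is false.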
Since only a single character is being converted, the transcoder returns 0
and the caller then hits this, because nothing could be transcoded:
if(charsRead == 0)
    ThrowXMLwithMemMgr(TranscodingException,
        XMLExcepts::Trans_BadSrcSeq, fMemoryManager);
The source sequence is not invalid; the output buffer has merely been limited
to the size of the input buffer. Is simply adding a few spare bytes to
allocSize the correct fix?
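One possible alternative to a fixed amount of slack is to size the buffer for the worst case. The sketch below is not the Xerces-C++ implementation and not necessarily the fix the project adopted; it is a standalone illustration that assumes 16-bit XMLCh (modelled here with char16_t). The names worstCaseUtf8Bytes and utf16ToUtf8 are hypothetical. The bound of 3 bytes per UTF-16 code unit holds because a BMP character needs at most 3 bytes, and a surrogate pair (2 units) needs 4 bytes, i.e. 2 per unit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: upper bound on the UTF-8 byte count for len
// UTF-16 code units (3 bytes per unit, see the reasoning above).
std::size_t worstCaseUtf8Bytes(std::size_t len)
{
    return len * 3;
}

// Hypothetical standalone encoder, for illustration only.
// Unpaired surrogates are passed through as 3-byte sequences for brevity;
// a real transcoder would treat them as an invalid source sequence.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    out.reserve(worstCaseUtf8Bytes(in.size()));
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        // Combine a high/low surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    // The worst-case bound must always cover the actual output.
    assert(out.size() <= worstCaseUtf8Bytes(in.size()));
    return out;
}
```

For the single character with value 27493, this sketch produces the three bytes E6 AD A5, which fit in the 3-byte worst-case buffer but not in a buffer of len * sizeof(XMLCh) = 2 bytes. A growing buffer that retries when the transcoder makes no progress would be another option; which approach suits the library is for the maintainers to decide.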