Hi, this has probably been asked before, but I did not find a reference.
According to the Xerces documentation, the internal representation of strings is UTF-16. On OS X, where XMLCh is defined as a 16-bit unsigned integer, I tried to convert a string from the current locale ("de-DE") to its XMLCh representation:

    char const* inputSource = "\U0001F600";  // U+1F600, a code point outside the BMP
    XMLCh* outputCharacter = XMLString::transcode(inputSource);

This does not work: the output is a single character with the value 62976 (0xF600), which is exactly U+1F600 truncated to its low 16 bits.

Looking at XMLString::transcode (which is based on IconvLCPTranscoder::transcode), the reason is clear, because on OS X:

- wchar_t is 32 bits wide, and
- XMLString::transcode copies the output of mbsrtowcs (32-bit wide characters) into a 16-bit XMLCh buffer:

    while (true)
    {
        // len counts 32-bit wchar_t units
        size_t len = ::mbsrtowcs(tmpString + dstCursor, &src, resultSize - dstCursor, &st);
        if (len == TRANSCODING_ERROR)
        {
            dstCursor = 0;
            break;
        }
        dstCursor += len;
        if (src == 0) // conversion finished
            break;
        if (dstCursor >= resultSize - 1)
            reallocString<wchar_t>(tmpString, resultSize, manager, tmpString != localBuffer);
    }

    // make a final copy, converting from wchar_t to XMLCh:
    XMLCh* resultString = (XMLCh*)manager->allocate((dstCursor + 1) * sizeof(XMLCh));
    size_t i;
    for (i = 0; i < dstCursor; ++i)
        resultString[i] = tmpString[i]; // each 32-bit value is truncated to 16 bits

Therefore, I have two questions:

1) Is "configure" wrong to select a 16-bit type (uint16_t) for XMLCh on this platform?
2) How is the result supposed to become valid UTF-16 in any case? Code points outside the BMP would have to be encoded as surrogate pairs, which the element-wise copy above cannot produce.

Best regards,
Hartwig
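P.S. For illustration only, here is a minimal sketch of what I would expect the final copy to do instead, assuming the intermediate buffer holds UTF-32 code points (the helper name is mine, not from the Xerces sources):

    #include <vector>

    // Encode one 32-bit code point as one or two UTF-16 code units.
    // (Sketch only; does not reject invalid code points such as lone
    // surrogates or values above U+10FFFF.)
    static void appendUtf16(std::vector<char16_t>& out, char32_t cp)
    {
        if (cp <= 0xFFFF) {
            // BMP code point: a single code unit suffices.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 | (cp >> 10)));   // high surrogate
            out.push_back(static_cast<char16_t>(0xDC00 | (cp & 0x3FF))); // low surrogate
        }
    }

With something like this, U+1F600 would come out as the surrogate pair 0xD83D 0xDE00 instead of the truncated 0xF600.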