IconvGNU and IconvFBSD based transcoders assume UCS-2 as XMLCh encoding
-----------------------------------------------------------------------
Key: XERCESC-1663
URL: http://issues.apache.org/jira/browse/XERCESC-1663
Project: Xerces-C++
Issue Type: Bug
Components: Utilities
Affects Versions: 2.7.0
Environment: any
Reporter: Boris Kolpackov
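
While studying the code in the IconvGNU and IconvFBSD transcoders, I noticed
that they appear to assume UCS-2 is the encoding for XMLCh when it is actually
UTF-16. I believe this can result in loss of data: UCS-2 cannot represent
characters outside the Basic Multilingual Plane, which UTF-16 encodes as
surrogate pairs.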
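
To make the difference concrete, here is a minimal sketch (not Xerces-C++
code; encodeUtf16 and fitsInUcs2 are hypothetical helper names) showing that a
code point above U+FFFF takes two UTF-16 code units, whereas UCS-2 has no way
to express it at all:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one Unicode scalar value as UTF-16 code units.
std::vector<std::uint16_t> encodeUtf16(std::uint32_t cp) {
    // Surrogate code points and values above U+10FFFF are not encodable.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        // BMP character: a single code unit, identical in UCS-2 and UTF-16.
        return { static_cast<std::uint16_t>(cp) };
    }
    // Supplementary character: split the 20-bit offset into a surrogate pair.
    cp -= 0x10000;
    return { static_cast<std::uint16_t>(0xD800 + (cp >> 10)),
             static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)) };
}

// UCS-2 is fixed-width (one 16-bit unit per character), so it can only
// carry BMP code points that are not surrogates.
bool fitsInUcs2(std::uint32_t cp) {
    return cp < 0x10000 && !(cp >= 0xD800 && cp <= 0xDFFF);
}
```

For example, U+1D11E encodes as the pair 0xD834 0xDD1E in UTF-16, and
fitsInUcs2(0x1D11E) is false. A converter that treats XMLCh data as UCS-2 may
reject or mangle such pairs, depending on the iconv implementation.
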

The encoding used for XMLCh is stored in the fUnicodeCP variable, which is
initialized in the Iconv{GNU,FBSD}TransServices constructor. The initialization
code simply tries each encoding in the gIconv{GNU,FBSD}Encodings array, which
for GNU contains only UCS-2 and for FreeBSD contains UCS-2 and UCS-4 encodings.

I tried adding UTF-16LE to this array (as the first item) and it works fine for
GNU (I double-checked that UTF-16LE ends up in fUnicodeCP). Does anybody know
what's going on here? Should we add UTF-16 to these arrays?