On Linux "Input data transcoding error" message does not contain invalid 
character if no proper locale is set
-------------------------------------------------------------------------------------------------------------

                 Key: XERCESC-1727
                 URL: https://issues.apache.org/jira/browse/XERCESC-1727
             Project: Xerces-C++
          Issue Type: Bug
          Components: Utilities
    Affects Versions: 2.7.0
         Environment: Linux, both 32- and 64-bit Red Hats, probably others
            Reporter: Sergey Melnikov


The test case involved a Russian letter placed in an XML file. Transcoding
failed with the "Input data transcoding error..." message, but the offending
character was not displayed. I traced the problem to the memory allocation
for the single character being transcoded. In the Linux implementation the
size of a character is determined through 'mblen' in calcRequiredSize(..),
and without a suitable locale it seems that nothing beyond plain 7-bit ASCII
is accepted.

Here is a source code snippet, based on the original calcRequiredSize, fed
with the character I used:
        
        // byte value 192 (0xC0)
        char sExp[2]={'\xC0','\0'};
        // The line below "fixes" the case: the character (Russian 'A') is
        // shown within the exception message.
//      setlocale(LC_ALL,"Russian");
        int iLen=std::mblen(&sExp[0],MB_CUR_MAX);
        if(-1 == iLen)
        {
                // ERROR! We should not be here, since no allocation will be
                // done, and that's what I get.
                throw 1;
        }

Other platforms (Win32, Solaris, etc.) worked fine.
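
One possible way to keep the offending byte available for the error message
is sketched below. It assumes that reserving a single byte is acceptable
whenever 'mblen' rejects the input in the current locale. The helper name
charByteLength is hypothetical and is not part of Xerces-C++; this is only an
illustration of the idea, not a proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not a Xerces-C++ function): returns the number of
// bytes to reserve for the character that starts at s.
std::size_t charByteLength(const char* s)
{
    assert(s != nullptr);

    // Reset any shift state left behind by an earlier call.
    std::mblen(nullptr, 0);

    const int len = std::mblen(s, MB_CUR_MAX);
    if (len <= 0)
    {
        // -1: the byte sequence is not valid in the current locale.
        //  0: s points at the terminating null character.
        // Reserve one byte either way, so the raw byte can still be
        // copied into the exception message.
        return 1;
    }
    return static_cast<std::size_t>(len);
}
```

With this fallback, charByteLength(sExp) yields 1 for the byte above whether
or not the current locale accepts it, so a buffer is always allocated.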


