Hi Alberto,

You were right: I was assuming the input to be UTF-8 (I thought that was the default charmap for the "ko" locale). Anyway, I have now set the shell locale to "ko.UTF-8" and removed the setlocale call from the code, so now I am sure that the input is in UTF-8.
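One way to double-check that assumption, independently of the shell settings, is a byte-level test on the string that actually arrives in argv[1]. The following is only a minimal sketch: `looksLikeUtf8` is a hypothetical helper, not part of Xerces, and it checks structural validity only (a short EUC-KR string will normally fail it, but passing it does not prove the data was meant as UTF-8).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Xerces): returns true when `bytes` is a
// structurally valid UTF-8 sequence. It checks lead/continuation byte
// patterns and rejects overlong forms, UTF-16 surrogates, and code points
// above U+10FFFF.
bool looksLikeUtf8(const std::string& bytes)
{
    std::size_t i = 0;
    const std::size_t n = bytes.size();
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;    // number of continuation bytes expected
        unsigned long cp = 0;     // decoded code point
        unsigned long minCp = 0;  // smallest code point allowed for this length

        if (lead < 0x80)                { extra = 0; cp = lead;        minCp = 0; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minCp = 0x10000; }
        else return false;        // stray continuation byte or invalid lead byte

        if (extra > n - i - 1) return false;          // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return false;     // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < minCp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate code point
        if (cp > 0x10FFFF) return false;                // outside Unicode range
        i += extra + 1;
    }
    return true;
}
```

Calling `looksLikeUtf8(argv[1])` before transcoding would show whether the bytes handed to the program are plausibly UTF-8 at all, which separates an input-encoding problem from a problem in how the XML is written.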
Using XMLString::transcode, I am able to convert char* to XMLCh* and XMLCh* back to char* without any loss. But if I write the same string to XML and view the file, the Korean string in the XML is different! Am I missing something while writing the XML?

As a secondary issue, even though my shell locale is UTF-8 now, the UTF-8 transcoder is still distorting the output.

Regards,
Pushkar

-----Original Message-----
From: Alberto Massari [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 28, 2008 1:53 PM
To: [email protected]
Subject: Re: Unable to transcode korean chars on Solaris

Hi Pushkar,
the result of XMLString::transcode depends on the current locale, so I wouldn't call setlocale if you know that the input string was entered using the current shell locale. As for the other attempt, you are creating a UTF-8 transcoder and asking it to convert the input string, but this will only work if your shell locale is UTF-8. So, either work with whatever locale is used by the shell (XMLString::transcode) or create the appropriate transcoder for the input string you are dealing with (don't blindly use UTF-8).

Alberto

Patil, Pushkar wrote:
> Hi All,
>
> I am facing a problem while transcoding Korean chars on Solaris.
> Some details:
> Xerces Version: 2.2
> Solaris: 5.8
> Locale: Korean
> The code works fine on AIX and Windows (for both en_US and Korean
> locale).
>
> I receive Korean data as multi-byte char* from the database, and to
> transcode it I used the "XMLString::transcode" method.
> When I write the transcoded XMLCh* to XML, the string is distorted.
> I tried using XMLTranscoder, with no results.
>
> To debug the problem I have written a small C-style program
> (OnlyXerces.cpp) which simulates the output (it receives the Korean
> chars as an argument).
> I have attached the program, the console output from the program, and
> the data from the generated XMLs.
>
> It would be great if someone could point out the problem in my code or
> point me to an alternative / better approach.
>
> Regards,
> Pushkar
>
> Snippets of code:
>
> setlocale(LC_ALL, ""); // output is received as "ko"
>
> ////////// Transcoding using XMLString::transcode //////////
> char* strIn = argv[1]; // argv[1] contains the input Korean characters
> XMLCh* tag = XMLString::transcode(strIn);
> ... write xml using "tag"
>
> ////////// Transcoding using XMLTranscoder //////////
> XMLRecognizer::Encodings encodingEnum = XMLRecognizer::UTF_8;
> XMLTranscoder* utf8Transcoder =
>     XMLPlatformUtils::fgTransService->makeNewTranscoderFor(encodingEnum,
>     failReason, 16*1024);
>
> XMLCh* outputStr = NULL;
> unsigned int charsEaten = 0;
> unsigned int length = strlen(strIn);
> unsigned char* sizes = new unsigned char[ length + 1 ];
> outputStr = new XMLCh[ length ];
> unsigned int chars_stored = utf8Transcoder->transcodeFrom((const
>     XMLByte*) strIn, length, outputStr, length, charsEaten, sizes );
> ... write xml using "outputStr"
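Alberto's advice to pick the transcoder that matches the input, rather than hard-coding UTF-8, needs the name of the encoding the shell locale actually uses. A minimal sketch of how that name might be obtained on a POSIX system is shown below. `currentCodeset` is a hypothetical helper, not a Xerces function; it relies on the POSIX call nl_langinfo(CODESET), and the spelling of the returned name varies between platforms.

```cpp
#include <clocale>
#include <langinfo.h>
#include <string>

// Hypothetical helper: reports the codeset name of the locale configured in
// the environment (LANG / LC_*), for example "UTF-8". The process locale is
// switched only for the duration of the query and then restored.
std::string currentCodeset()
{
    // Remember the LC_CTYPE locale currently in effect.
    const char* prev = std::setlocale(LC_CTYPE, NULL);
    const std::string saved = prev ? prev : "C";

    // Temporarily adopt the environment's locale and ask for its codeset.
    std::setlocale(LC_CTYPE, "");
    const char* name = nl_langinfo(CODESET);
    // Copy the result before touching the locale again, since the pointer
    // may be invalidated by a later setlocale call.
    const std::string codeset = name ? name : "";

    // Put the previous locale back so the caller sees no side effect.
    std::setlocale(LC_CTYPE, saved.c_str());
    return codeset;
}
```

The returned name could then be tried as the encoding name when creating a transcoder, instead of XMLRecognizer::UTF_8. Whether a given Xerces build recognizes the platform's spelling of the codeset is an assumption that would need to be checked.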
