Hi Alberto,

You were right: I was assuming the input to be UTF-8 (I thought that was the
default character map for the "ko" locale).
I have now set the shell locale to "ko.UTF-8" and removed the
setlocale call from the code,
so I am now sure that the input is in UTF-8.

Using XMLString::transcode I am able to convert char* to XMLCh* and
XMLCh* back to char* without any loss.
But if I write the same string to XML and view the file, the Korean
string in the XML is different!
Am I missing something while writing the XML?

As a secondary issue, even though my shell locale is UTF-8 now, the UTF-8
transcoder is still distorting the output.
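
Since "viewing" the XML depends on the encoding the viewer assumes, one way to
compare the input with what was written is to dump the raw bytes on both sides.
The following is only a minimal sketch using the standard library; hexDump is a
hypothetical helper name, not part of Xerces.

```cpp
#include <string>

// Hypothetical helper (not a Xerces API): render every byte of `bytes`
// as two uppercase hex digits, separated by single spaces.
std::string hexDump(const std::string& bytes) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::string::size_type i = 0; i < bytes.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];    // high nibble
        out += digits[b & 0x0F];  // low nibble
    }
    return out;
}
```

Calling hexDump(argv[1]) and comparing the result with a hex dump of the
element content in the generated file shows whether the bytes themselves
changed or only the way they are displayed.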

Regards,
Pushkar

-----Original Message-----
From: Alberto Massari [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 28, 2008 1:53 PM
To: [email protected]
Subject: Re: Unable to transcode korean chars on Solaris

Hi Pushkar,
XMLString::transcode depends on the current locale, so I
wouldn't call setlocale if you know that the input string was
entered using the current shell locale. As for the other attempt, you
are creating a UTF-8 transcoder and asking it to convert the input
string, but this only works if your shell locale is UTF-8.
So, either work with whatever locale is used by the shell
(XMLString::transcode) or create the appropriate transcoder for the
input string you are dealing with (don't blindly use UTF-8).
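
As an illustration of why "blindly" using UTF-8 can fail, the sketch below
checks whether a byte string is well-formed UTF-8 before it is handed to a
UTF-8 transcoder. It uses only the standard library; isValidUtf8 is a
hypothetical helper, and the assumption that a non-UTF-8 "ko" locale delivers a
legacy encoding such as EUC-KR is mine, not something stated in this thread.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces API): true if `s` is well-formed UTF-8.
bool isValidUtf8(const std::string& s) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long minForLen[] = {0x0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    const std::size_t n = s.size();
    while (i < n) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;  // number of continuation bytes expected
        unsigned long cp;   // code point being assembled
        if (c < 0x80)                { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;                     // invalid lead byte
        if (n - i - 1 < extra) return false;   // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < minForLen[extra]) return false;         // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
        if (cp > 0x10FFFF) return false;                 // beyond Unicode
        i += extra + 1;
    }
    return true;
}
```

For example, the UTF-8 bytes ED 95 9C pass the check, while the two-byte
sequence C7 D1 (a typical EUC-KR byte pair) is rejected because D1 is not a
continuation byte. Passing the check does not prove the input is UTF-8, but a
failure is a strong hint that a different transcoder is needed.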

Alberto


Patil, Pushkar wrote:
> Hi All,
>  
> I am facing a problem while transcoding Korean chars on Solaris.
> Some details:
> Xerces Version: 2.2
> Solaris: 5.8
> Locale: Korean
> The code works fine on AIX and Windows ( for both en_US and korean 
> locale )
>  
> I receive Korean data as multi-byte char* from the database, and to 
> transcode it I used the "XMLString::transcode" method.
> When I write the transcoded XMLCh* to XML, the string is distorted.
> I also tried using XMLTranscoder, with no better results.
>  
> To debug the problem I have written a small C-style program
> (OnlyXerces.cpp) which reproduces the output (it receives the Korean 
> chars as an argument).
> I have attached the program, the console output from the program and 
> the data from the generated xmls.
>  
> It would be great if someone could point out the problem in my code or 
> direct me to an alternative / better approach.
>  
> Regards,
> Pushkar
>  
> Snippets of code:
>  
> setlocale(LC_ALL, ""); // output is received as "ko"
>  
> ////////// Transcoding using XMLString::transcode //////////
> char* strIn = argv[1]; // argv[1] contains the input Korean characters
> XMLCh* tag = XMLString::transcode(strIn);
> ... write xml using "tag"
>  
> ////////// Transcoding using XMLTranscoder //////////
> XMLRecognizer::Encodings encodingEnum = XMLRecognizer::UTF_8;
> XMLTranscoder* utf8Transcoder =
>     XMLPlatformUtils::fgTransService->makeNewTranscoderFor(encodingEnum,
>     failReason, 16*1024);
>  
> XMLCh* outputStr = NULL;
> unsigned int charsEaten = 0;
> unsigned int length = strlen(strIn);
> unsigned char* sizes = new unsigned char[ length + 1 ];
> outputStr = new XMLCh[ length ];
> unsigned int chars_stored = utf8Transcoder->transcodeFrom((const
>     XMLByte*) strIn, length, outputStr, length, charsEaten, sizes );
> ... write xml using "outputStr"
>  
>  
