On Sat, 2008-12-13 at 05:04 +0900, Jeongkyu Kim wrote:
> However, there is one more homework left for me. When I opened the
> exported file with OO.o Writer, Korean characters were broken. I am
> playing around some functions in ww8par.cxx such as ReadPlainChars(),
> GetCurrentCharSet(), and Custom8BitToUnicode(), but I have no luck
> yet. Now, I need some hints on how to handle Korean characters
> correctly in importing filter

Hmm, well, first make sure that in SwWW8ImplReader::ReadPlainChars eSrcCharSet is equal to RTL_TEXTENCODING_MS_949. Assuming that is working correctly, then looking at the code it probably does not properly handle multi-byte encodings. So rather than sending each byte to be converted in

    for( nL2 = 0; nL2 < nLen; ++nL2, ++pWork )

it would likely be better to collect the whole set of bytes, adjust Custom8BitToUnicode to take a sequence of bytes, and send the whole lot to rtl_convertTextToUnicode so as not to break up multi-byte sequences into broken single characters.

If you have a simple sample document which reproduces this on import, then if you log an issue and put me as "cmc" on the cc I could take a look to see if that is the case.

C.
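
To illustrate why converting one byte at a time is a problem for MS-949, here is a minimal standalone sketch. It is not the ww8par.cxx code and does not call rtl_convertTextToUnicode; the helper names (isCp949LeadByte, completePrefixLength, countCharacters) are made up for this example, and the lead-byte range 0x81..0xFE is my assumption about code page 949. Trail bytes are not validated here. The idea is only to show how a caller could find how many bytes make up whole characters, so that the full run can be handed to the converter in a single call.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if b starts a two-byte sequence.
// Assumption: in code page 949 the lead bytes are 0x81..0xFE and
// everything below 0x80 is a single-byte (ASCII) character.
inline bool isCp949LeadByte(std::uint8_t b)
{
    return b >= 0x81 && b <= 0xFE;
}

// Returns the number of leading bytes in [p, p + len) that form
// complete characters. A lead byte at the very end with no trail
// byte is left out, so the caller can carry it over to the next read.
std::size_t completePrefixLength(const std::uint8_t* p, std::size_t len)
{
    std::size_t i = 0;
    while (i < len)
    {
        const std::size_t step = isCp949LeadByte(p[i]) ? 2 : 1;
        if (i + step > len)
            break;              // dangling lead byte, stop before it
        i += step;
    }
    return i;
}

// Counts the complete characters in the buffer. With a per-byte loop
// this would simply be len, which is how a two-byte Hangul syllable
// ends up as two broken single characters.
std::size_t countCharacters(const std::uint8_t* p, std::size_t len)
{
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < len)
    {
        const std::size_t step = isCp949LeadByte(p[i]) ? 2 : 1;
        if (i + step > len)
            break;              // incomplete trailing sequence
        i += step;
        ++count;
    }
    return count;
}
```

For the five bytes 0x41 0xC7 0xD1 0xB1 0xDB (an "A" followed by two two-byte Hangul syllables), countCharacters reports three characters, while a byte-at-a-time loop would attempt five separate conversions. If only the first four bytes have been read, completePrefixLength returns 3, so the trailing 0xB1 can be held back until its trail byte arrives.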
