On 19 Apr 2011, at 9:16 AM, William Lachance wrote:

> On Tue, Apr 19, 2011 at 8:58 AM, Edward Mendelson <e...@columbia.edu> wrote:
> > Hello,
> > 
> > Recently I was asked to help someone convert hundreds of WPMac files that 
> > include Japanese Kanji, files that were created on old Macs that had the 
> > Japanese Language Kits installed.
> > 
> > It seems - I could be wrong - that libwpd doesn't convert the characters in 
> > those files. The method I found for converting them was a bit roundabout:
> > ... 
> > Use a PowerPC Mac that runs OS 10.4 and "Classic" with the Japanese Language 
> > Kit installed. Open the WPMac files in WPMac in Classic. Copy the contents of 
> > the file to the clipboard. Paste the contents of the file from the Clipboard 
> > into OS X's TextEdit or any other unicode-aware Mac application. Save the 
> > resulting file as an RTF or DOC file. The resulting file opens correctly in 
> > LibreOffice, Pages, Word, etc.
> > 
> > This method obviously requires obsolete hardware and software. I would guess 
> > that it would require an enormous amount of effort to support double-byte CJK 
> > and other WorldScript-based scripts in libwpd, and that the potential need 
> > for it is far too small to justify the effort. But is this something that 
> > might someday be possible in the future?
> 
> Actually, it's not really that difficult. Unless Japanese is dramatically 
> different from what we've seen so far, all we should need to make this 
> conversion work is a table mapping from WordPerfect extended characters to 
> their Unicode equivalents. Over the years we've expanded support for 
> languages from only plain Latin to relatively obscure ones like Tibetan, 
> courtesy of mappings submitted by various people. 
> 
> If you don't have the expertise to create such a mapping yourself, we could 
> probably derive one from (1) a WP document containing all the characters in a 
> Japanese script and (2) one converted to RTF/DOC. If you're interested in 
> producing something like this, let us know!
> 
> -- 
> William Lachance
> wrl...@gmail.com

Thanks for that quick reply. I don't have such a document, but I'll ask the 
person who asked me for help, in the hope that he might be able to create one.

Japanese/Chinese/Korean text in WPMac format is, I *think*, handled completely 
differently from extended characters in the WPDOS/WPWin/WPUnix formats. It 
doesn't use the CharacterSet/CharacterNumber system, because each language 
needs thousands of characters. So a document that included every character 
would be enormous. My guess is that Tibetan fits fairly well into character set 
12, but this would be different. Or am I completely wrong about this?
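
For illustration only, here is a minimal Python sketch of the approach proposed 
in the quoted reply: deriving a mapping table from a WP document paired with 
its converted text, then using that table for lookup. It assumes each character 
can be addressed by a (CharacterSet, CharacterNumber) pair, which may well not 
hold for WPMac CJK files. The names derive_mapping and to_unicode and all 
sample values are hypothetical; none of this is libwpd code or real 
WordPerfect character assignments.

```python
from typing import Dict, Iterable, Tuple

# Assumption: one character is identified by (character set, character number).
WPCode = Tuple[int, int]

REPLACEMENT = "\ufffd"  # U+FFFD, used when a code has no known mapping


def derive_mapping(wp_codes: Iterable[WPCode], converted_text: str) -> Dict[WPCode, str]:
    """Pair each code from the WP file with the character at the same
    position in the converted (RTF/DOC) text."""
    codes = list(wp_codes)
    if len(codes) != len(converted_text):
        raise ValueError("the two documents do not contain the same number of characters")
    mapping: Dict[WPCode, str] = {}
    for code, char in zip(codes, converted_text):
        # setdefault stores the first pairing and returns it on later visits.
        previous = mapping.setdefault(code, char)
        if previous != char:
            raise ValueError(f"code {code} maps to both {previous!r} and {char!r}")
    return mapping


def to_unicode(wp_codes: Iterable[WPCode], mapping: Dict[WPCode, str]) -> str:
    """Convert a sequence of codes, substituting U+FFFD for unmapped ones."""
    return "".join(mapping.get(code, REPLACEMENT) for code in wp_codes)


# Invented sample data: three codes paired with two Tibetan letters.
sample_codes = [(12, 0), (12, 1), (12, 0)]
sample_text = "\u0f40\u0f41\u0f40"
table = derive_mapping(sample_codes, sample_text)
print(to_unicode([(12, 1), (12, 0)], table))
```

With thousands of characters per language the derived table would be large, 
but it is still just a dictionary; the harder question is whether the WPMac 
files expose per-character codes of this kind at all.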

Edward Mendelson

_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel
