On 19 Apr 2011, at 9:16 AM, William Lachance wrote:
> On Tue, Apr 19, 2011 at 8:58 AM, Edward Mendelson <e...@columbia.edu> wrote:
>> Hello,
>>
>> Recently I was asked to help someone convert hundreds of WPMac files that
>> include Japanese Kanji, files that were created on old Macs that had the
>> Japanese Language Kits installed.
>>
>> It seems - I could be wrong - that libwpd doesn't convert the characters in
>> those files. The method I found for converting them was a bit roundabout:
>> ...
>> Use a PowerPC Mac that runs OS 10.4 and "Classic" with the Japanese Language
>> Kit installed. Open the WPMac files in WPMac in Classic. Copy the contents of
>> the file to the clipboard. Paste the contents of the file from the Clipboard
>> into OS X's TextEdit or any other Unicode-aware Mac application. Save the
>> resulting file as an RTF or DOC file. The resulting file opens correctly in
>> LibreOffice, Pages, Word, etc.
>>
>> This method obviously requires obsolete hardware and software. I would guess
>> that it would require an enormous amount of effort to support double-byte CJK
>> and other WorldScript-based scripts in libwpd, and that the potential need
>> for it is far too small to justify the effort. But is this something that
>> might someday be possible in the future?
>
> Actually, it's not really that difficult. Unless Japanese is dramatically
> different from what we've seen so far, all we should need to make this
> conversion work is a table mapping from WordPerfect extended characters to
> their Unicode equivalents. Over the years we've expanded support for
> languages from only plain Latin to relatively obscure ones like Tibetan
> courtesy of mappings submitted by various people.
>
> If you don't have the expertise to create such a mapping yourself, we could
> probably derive one from (1) a WP document containing all the characters in a
> Japanese script and (2) one converted to RTF/DOC. If you're interested in
> producing something like this, let us know!
>
> --
> William Lachance
> wrl...@gmail.com

Thanks for that quick reply. I don't have such a document, but I'll ask the
person who asked me for help, in the hope that he might be able to create one.
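
For reference, the derivation suggested in the quoted message (pairing a WP
document with its RTF/DOC conversion) might look roughly like the Python sketch
below. This is only an illustration under strong assumptions: the character
codes have already been extracted from the WP file, the two sequences line up
one-to-one, and the name derive_mapping and the codes 1 and 2 are hypothetical
placeholders, not real WPMac values or libwpd functions.

```python
def derive_mapping(wp_codes, converted_text):
    """Pair each WP character code with the character at the same
    position in the converted text (assumes a one-to-one alignment)."""
    if len(wp_codes) != len(converted_text):
        raise ValueError("code list and converted text differ in length")
    mapping = {}
    for code, char in zip(wp_codes, converted_text):
        # setdefault stores the first pairing and returns it on repeats.
        known = mapping.setdefault(code, char)
        if known != char:
            raise ValueError(f"code {code!r} maps to both {known!r} and {char!r}")
    return mapping


# Placeholder codes, not real WPMac values.
wp_codes = [1, 2, 1]
# The same three characters as they appear after conversion.
converted_text = "\u65e5\u672c\u65e5"
mapping = derive_mapping(wp_codes, converted_text)
print(mapping)
```

The conflict check matters because a misaligned pair of documents would
otherwise produce a silently wrong table.
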
Japanese/Chinese/Korean text in WPMac formats is, I *think*, completely
different from extended characters in WPDOS/WPWin/WPUnix formats. It doesn't
use the CharacterSet/CharacterNumber system, because thousands of characters
are supported in each language, so a document that included all of those
characters would be enormous. My guess is that Tibetan fits fairly well into
character set 12, but this would be different. Or am I completely wrong about
this?
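
To make concrete what I mean by the CharacterSet/CharacterNumber system, here
is a rough Python sketch of that kind of lookup: each extended character is
identified by a (set, number) pair, and a table maps the pair to a Unicode
character. The table entries and the names EXTENDED_CHARS and to_unicode are
made up for illustration; they are not copied from libwpd's tables or from the
WordPerfect character set documentation.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER, used for unmapped pairs

# Made-up entries keyed by (character set, character number).
# Real tables would be filled in from the WordPerfect character set charts.
EXTENDED_CHARS = {
    (1, 1): "\u00e9",
    (1, 2): "\u00fc",
    (8, 1): "\u03b1",
}


def to_unicode(char_set, char_num, table=EXTENDED_CHARS):
    """Look up one extended character; fall back to U+FFFD if unmapped."""
    return table.get((char_set, char_num), REPLACEMENT)


print(to_unicode(1, 1))
print(to_unicode(12, 40) == REPLACEMENT)
```

A table like this stays manageable for a few hundred characters per set. If
the WPMac CJK files store their text some other way, as I suspect, a different
decoding step would be needed before any such table could apply.
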
Edward Mendelson
_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel