On 23 Apr 2011, at 4:19 PM, Fridrich Strba wrote:

> Edward,
> 
> 
> On Sat, 2011-04-23 at 11:13 -0400, Edward Mendelson wrote:
>> Also potentially useful: The character map document that shipped with 6.x 
>> for DOS is here:
>> http://dl.dropbox.com/u/271144/CHARACT6.DOC
>> It includes all 14 sets.
> 
> Thanks for this one. I ran it through wpd2text, and it may be that we
> actually do a good job here. Now I could use some help. Could you
> people run wpd2text over those characters and check whether all glyphs
> in all charsets are correctly mapped? If you find an error, just note
> the charset, the char number, and the correct Unicode mapping. For
> characters that would correspond to a sequence of two or more Unicode
> characters, please write that down, but also give me the closest
> one-to-one approximation.
> 
> 
>> I'll find and post the CHARMAP.TST that shipped with 5.1 Hebrew and Arabic 
>> later on.
> 
> The wp2rtf zip file contains a TEST.WP file that has all the charsets in
> it. If you have the visual representation, just run it through wpd2text
> and compare. Again, I would appreciate information on any wrongly
> mapped glyphs.
> 

Fridrich,

Both of these files (CHARACT6.DOC and TEST.WP, which is the same as the WP51 
CHARACTR.DOC) produced very good results when run through wpd2html.

Here are some quick notes:

1. Quite a few WP characters have no Unicode equivalents, and there is no way 
to fix that.

2. In TEST.WP (the WP5.1 file), characters 6,56 through 6,234 didn't convert at 
all, but the same characters are converted correctly in the WP6.x CHARACT6.DOC. 
You evidently have different tables for 5.x and 6+, and I think you can simply 
copy the 6,56 through 6,234 mappings from the 6+ table to the 5.x table.

3. In the converted CHARACT6.DOC, I think it may be possible to add these:

2,44 seems to be U+0361
2,45 seems to be U+035C
4,100 seems to be U+1D11E
4,101 seems to be U+1D122
6,83 seems to be U+2A38
9,83 seems to be U+05AA
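
Notes 2 and 3 can be sketched roughly as follows. This is only an
illustration in Python, not libwpd code: the dict-based tables, the names
copy_range and describe, and the Private Use Area placeholder values are
all assumptions made for the example. It also assumes that charset 6 uses
the same numbering in 5.x and 6+, which is what note 2 supposes. The
candidate values are the ones listed in note 3, and their names are looked
up from Python's own Unicode database rather than typed in by hand.

```python
import unicodedata

# Candidate additions from note 3, keyed by (charset, char number).
CANDIDATES = {
    (2, 44): 0x0361,
    (2, 45): 0x035C,
    (4, 100): 0x1D11E,
    (4, 101): 0x1D122,
    (6, 83): 0x2A38,
    (9, 83): 0x05AA,
}


def copy_range(src, dst, charset, first, last):
    """Copy src entries for charset,first..last into dst where dst has none."""
    copied = []
    for char in range(first, last + 1):
        key = (charset, char)
        if key in src and dst.get(key) is None:
            dst[key] = src[key]
            copied.append(key)
    return copied


def describe(table):
    """Return one readable line per mapping, flagging code points above U+FFFF."""
    lines = []
    for (charset, char), cp in sorted(table.items()):
        name = unicodedata.name(chr(cp), "<no name>")
        flag = " (outside the BMP)" if cp > 0xFFFF else ""
        lines.append(f"{charset},{char} -> U+{cp:04X} {name}{flag}")
    return lines


# Toy tables. The values are Private Use Area placeholders, NOT real mappings.
wp6_table = {(6, 56): 0xE000, (6, 57): 0xE001, (6, 234): 0xE002, (6, 235): 0xE003}
wp5_table = {(6, 55): 0xE100}

copied = copy_range(wp6_table, wp5_table, 6, 56, 234)
print("copied into the 5.x table:", copied)

for line in describe(CANDIDATES):
    print(line)
```

The copy only fills entries that the 5.x table lacks, so any mapping that
already converts correctly is left alone. The "outside the BMP" flag marks
the two charset 4 candidates, which lie above U+FFFF and may need special
handling wherever a table stores 16-bit values.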

I'll have to check the Hebrew tomorrow, but since I don't know any Hebrew, I'll 
be guessing.

Smokey, I think you know Arabic. Is there any chance you could check these?

I finally tested the Arabic WP 5.1 files from this page:

http://www.un.org/popin/unpopcom/32ndsess/gass.htm

wpd2odt says they are not WordPerfect files. Apparently libwpd doesn't handle 
documents created by Arabic WP5.1 or Hebrew WP5.1. 
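
One way to narrow this down would be to look at the first bytes of the
rejected files. The sketch below is an assumption-laden illustration, not
what libwpd does internally: it relies on my understanding that WP 5.x and
later files start with the four bytes FF 57 50 43 ("\xffWPC") and keep
product type, file type, and major/minor version at offsets 8 to 11. The
function name inspect_wp_header is made up for the example.

```python
WP_MAGIC = b"\xffWPC"  # assumed standard WordPerfect prefix


def inspect_wp_header(data):
    """Report whether data starts with the WP prefix and show bytes 8..11."""
    if len(data) < 12:
        return {"has_magic": False, "reason": "shorter than 12 bytes"}
    return {
        "has_magic": data[:4] == WP_MAGIC,
        # Offsets below are an assumption about the prefix layout.
        "product_type": data[8],
        "file_type": data[9],
        "major_version": data[10],
        "minor_version": data[11],
    }


# Synthetic header for demonstration only; not taken from a real file.
sample = WP_MAGIC + bytes(4) + bytes([1, 10, 0, 1])
print(inspect_wp_header(sample))
```

For a real document, passing the first 16 bytes read from the file in
binary mode would show whether the Arabic files differ from ordinary WP5.1
files in the prefix itself or only in one of the type or version bytes.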

I hope these details help somewhat, and will try to report more tomorrow.

Edward Mendelson
Contributing Editor
PC Magazine




_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel
