On 23 Apr 2011, at 2:18 AM, Smokey Ardisson wrote:

> At 11:23 PM +0200 on 4/21/11, Fridrich Strba wrote:
> 
>> On 21/04/11 21:52, Smokey Ardisson wrote:
>>> Do we have reason to believe that the mapping in WP5 is different from
>>> the ones I contributed for WP6?
>> 
>> Yes, we have a strong reason to believe that not all are the same. If
>> you take the arabic test in the zip I pointed to and convert with a
>> fresh checkout of libwpd to html, you will see remarkable differences.
>> The advantage though is that that test document is actually having the
>> names of the characters near to them, which will make the janitorial
>> work a bit easier.
> 
> I can't believe that a company would move around blocks of characters 
> from one version of the software to the next, especially when those 
> characters' codes are the canonical ways of identifying them :-P 
> :sigh:  However, there were not too many differences/corrections 
> between what's in libwpd_internal.cpp right now for set 13.

I won't be able to send more details until later today or tomorrow, but I can 
confirm (as I assume you already know) that there are two states of the 5.1 
character sets - one for non-Hebrew/Arabic 5.1, another for Hebrew and 
Arabic 5.1 - and that the 6.x character sets are very different from the 5.1 
sets.

About the two states of the 5.1 sets: until the Hebrew and Arabic version was 
released, there were only 12 sets; the Hebrew/Arabic version had a much larger 
set 9 (Hebrew) and added Arabic sets 13 and 14.

I don't have a full list of the differences between the 5.1 and 6.x in the 
other character sets, but briefly:

Set 1: 6.x adds some characters at the end

Set 2: 5.x and 6.x are completely different

Set 4: 6.x adds some characters at the end

Set 5: 5.x and 6.x are completely different

Set 6: 6.x adds some characters at the end

Set 8: 6.x adds some characters at the end

Set 9: 6.x is vastly larger (I haven't yet checked whether it's the same as 5.1 
Hebrew)

Set 10: 6.x adds some characters at the end

Set 11: 6.x is completely different

Set 13/14: I think you discovered that these are different in 5.1 Arabic and 
6.x?
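
In case it helps with the janitorial work, the list above could be written 
down roughly as follows. This is only a sketch: the names SetDelta, 
compareWp5ToWp6 and mayReuseWp6Mapping are made up for illustration and are 
not anything in libwpd. It also assumes that "adds some characters at the 
end" means the 5.x characters keep their positions in 6.x, which I have not 
verified. Sets missing from the list are reported as not listed rather than 
as identical.

```cpp
// Hypothetical sketch: records the per-set notes from this message, nothing more.
enum class SetDelta {
    ExtendedAtEnd,        // 6.x adds some characters at the end
    CompletelyDifferent,  // 5.x and 6.x layouts do not match
    MuchLarger,           // 6.x is vastly larger; relation to 5.1 Hebrew unchecked
    Unconfirmed,          // believed different (5.1 Arabic vs 6.x), not confirmed here
    NotListed             // set not covered by the list above
};

// Returns how a 5.x character set relates to its 6.x counterpart,
// according to the list in this message only.
SetDelta compareWp5ToWp6(int characterSet)
{
    switch (characterSet) {
    case 1: case 4: case 6: case 8: case 10:
        return SetDelta::ExtendedAtEnd;
    case 2: case 5: case 11:
        return SetDelta::CompletelyDifferent;
    case 9:
        return SetDelta::MuchLarger;
    case 13: case 14:
        return SetDelta::Unconfirmed;
    default:
        return SetDelta::NotListed;
    }
}

// Assumption: only sets that 6.x merely extends at the end are candidates
// for reusing the existing WP6 mapping for the 5.x range. Everything else
// would need its own 5.x table or a manual check first.
bool mayReuseWp6Mapping(int characterSet)
{
    return compareWp5ToWp6(characterSet) == SetDelta::ExtendedAtEnd;
}
```

So mayReuseWp6Mapping(1) would be true and mayReuseWp6Mapping(5) false, and 
anything for which it returns false would need checking against the test 
documents before a mapping is trusted.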

You can find on my Arabic and Hebrew WP page full sets of printer drivers for 
Arabic and Hebrew 5.1, and these may help in mapping characters:

http://www.columbia.edu/~em36/wpdos/arabicandhebrew.html

I'll get back to testing later today or tomorrow at the latest.

Edward

 
_______________________________________________
Libwpd-devel mailing list
Libwpd-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libwpd-devel
