At 11:23 PM +0200 on 4/21/11, Fridrich Strba wrote:

>On 21/04/11 21:52, Smokey Ardisson wrote:
>>  Do we have reason to believe that the mapping in WP5 is different from
>>  the ones I contributed for WP6?
>
>Yes, we have a strong reason to believe that not all are the same. If
>you take the arabic test in the zip I pointed to and convert with a
>fresh checkout of libwpd to html, you will see remarkable differences.
>The advantage, though, is that the test document actually has the
>names of the characters next to them, which will make the janitorial
>work a bit easier.

I can't believe that a company would move around blocks of characters 
from one version of the software to the next, especially when those 
characters' codes are the canonical ways of identifying them :-P 
:sigh:  However, there were not too many differences/corrections 
relative to what's in libwpd_internal.cpp right now for set 13.

Can someone (Edward?) extract the full character sets 13 and 14 from 
"TEST.WP" from the wp2rtf zip and either run those through wp2rtf or 
generate PDFs that show the Arabic characters for me?  The "TEST_ARA" 
pair of documents doesn't include the first dozen codepoints in set 13 
or any of set 14, so I don't have a visual reference for those 
:-(

I have set 13 all fixed (with the exception of those first dozen, 
whose appearance I don't know and whose names are useless), 
within the confines of what's available in Unicode and with the 
limitation of a single-codepoint-to-single-codepoint mapping (wp2rtf 
produces more "accurate" mappings of some fancy WP codepoints that 
don't have single matching Unicode glyphs by using two characters).
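
To make the single-codepoint limitation concrete, here is a minimal 
sketch, not the actual table in libwpd_internal.cpp.  The three 
entries, the combined "shadda + fatha" WP glyph, and the names 
kSingleMap, kSequenceMap, mapSingle and mapSequence are all 
hypothetical; only the Unicode codepoints themselves are real.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical three-entry excerpt; NOT the real WP5 set 13 table.
// One-to-one: each WP index maps to exactly one Unicode codepoint.
static const std::uint32_t kSingleMap[] = {
    0x0627, // ARABIC LETTER ALEF
    0x0628, // ARABIC LETTER BEH
    0x0651, // ARABIC SHADDA; a combined "shadda + fatha" glyph loses its fatha
};

// One-to-many alternative: each WP index maps to one or two codepoints.
struct Sequence {
    std::uint32_t cp[2];
    std::size_t len;
};

static const Sequence kSequenceMap[] = {
    {{0x0627, 0}, 1},
    {{0x0628, 0}, 1},
    {{0x0651, 0x064E}, 2}, // ARABIC SHADDA followed by ARABIC FATHA
};

const std::size_t kTableSize = sizeof(kSingleMap) / sizeof(kSingleMap[0]);

// Returns the single-codepoint mapping, or an empty vector if out of range.
std::vector<std::uint32_t> mapSingle(std::size_t wpIndex) {
    if (wpIndex >= kTableSize) return {};
    return {kSingleMap[wpIndex]};
}

// Returns the one- or two-codepoint mapping, or an empty vector if out of range.
std::vector<std::uint32_t> mapSequence(std::size_t wpIndex) {
    if (wpIndex >= kTableSize) return {};
    const Sequence &s = kSequenceMap[wpIndex];
    return std::vector<std::uint32_t>(s.cp, s.cp + s.len);
}
```

With a table shaped like the second one, a WP codepoint that has no 
single matching Unicode character can still come out as a base mark 
plus a second mark, which is roughly what wp2rtf's two-character 
output amounts to.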

I think new additions to Unicode since version 4.1 (when I did the 
WP6 mappings) will let us successfully map some of the additional 
random diacritical marks WP used to Unicode; if so, I can also fix 
the old WP6 mappings for those.

>>  Looking back through my files from that era, it looks like we ended up
>>  punting on "true" conversion for WP-Arabic and ended up mapping to
>>  Unicode presentation forms (except for the "stand-alone" forms of the
>>  letters, which got mapped to the normal, combining Unicode characters),
>>  so that the result was reverse-ordered, unconnected text (I think you
>>  were going to use Fribidi to try and reorder, but I remember vaguely
>>  that that effort had some problems and you intended to fix things
>>  elsewhere in another manner).  At some point in the future, we might
>>  want to revisit that and map all of the WP-Arabic codepoints to normal,
>>  combining forms where possible for WP5/WP6.
>
>Yeah, I tried to do that, but it is a bit too complicated and the
>reverse bidi algorithm is not even defined. Basically, you would have to
>have two marks like OOXML has, to tell that this and this span is part
>of the same run, because if not, you will not have a way to reorder
>spans with different character properties (bold, italic) but part of the
>same phrase. It was a huge mess to do and I really did not have the
>courage to dive into it.

Just pretend I never said that paragraph you replied to; I wasn't 
completely awake when I wrote that and I had forgotten the key bit of 
the problem: non-Mac WP required you to enter characters 
backwards/LTR to begin with (I was thinking for some reason that the 
characters were in the correct order in the file and would work 
properly in a bidi-aware word processor if only we'd mapped to the 
normal codepoints) :-P
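
For illustration only, this is a naive sketch of what undoing that 
backwards entry would look like for a run that is purely Arabic and 
carries a single set of character properties.  The function name 
visualToLogicalNaive is hypothetical, and this is not a reverse bidi 
algorithm: it does nothing sensible for embedded LTR text, digits, or 
runs split across bold/italic spans, which is exactly the hard part 
described above.

```cpp
#include <algorithm>
#include <string>

// Naive assumption: the whole run was entered backwards (visual order)
// and contains only right-to-left characters with uniform formatting.
// Reversing the codepoints then yields logical order.
std::u32string visualToLogicalNaive(std::u32string visual) {
    std::reverse(visual.begin(), visual.end());
    return visual;
}
```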

Smokey
-- 
Smokey Ardisson
alqah...@ardisson.org
http://www.ardisson.org/
------------------------------------------
"He is a fool who has forgotten what became of his ancestry
seven generations before him and who does not care what will
become of his progeny seven generations after him."
           --Kazakh Proverb
