> I understand that Mac developers would consider a conversion to unicode
> "lossy" or "non-reversible" if the directionality indicators are not
> preserved somehow (using RLE/LRE or RLO/LRO), and this might constitute
> an "algorithmic" approach that 'enc2xs' would not support.
> 
> Is there a work-around that will allow all the MacArabic code points to
> be converted successfully, given that their respective character
> semantics are all well established in unicode?  Even a "lossy" 
> conversion (ditching the directionality specs) would be better than the 
> failures I'm getting now.

(1) If you can tolerate losing the text direction information,
how about using a fallback?

e.g.

0x2B    <LR>+0x002B   # PLUS SIGN, left-right
0xAB    <RL>+0x002B   # PLUS SIGN, right-left

in http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/ARABIC.TXT
can be converted to

<U002B> \x2B |0 # PLUS SIGN
<U002B> \xAB |3 # PLUS SIGN, right-left

in Encode/ucm/macArabic.ucm.
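
The same rewrite could be scripted rather than done by hand. The sketch
below is only an illustration in Python, not part of Encode or enc2xs;
the function name apple_to_ucm is hypothetical. It assumes that the first
byte seen for a given Unicode code point becomes the round-trip mapping
(|0) and that any later byte for the same code point becomes a
decode-only fallback (|3). It handles only single-code-point lines and
skips everything else.

```python
import re

# Matches lines such as "0x2B    <LR>+0x002B   # PLUS SIGN, left-right".
# Assumption: only one-byte to one-code-point mappings are handled; the
# optional <LR>/<RL> direction tag is recognized and then dropped.
LINE = re.compile(
    r"^0x([0-9A-Fa-f]{2})\s+(?:<(?:LR|RL)>\+)?0x([0-9A-Fa-f]{4})\s*(?:#\s*(.*))?$"
)


def apple_to_ucm(lines):
    """Turn Apple mapping lines into ucm lines (hypothetical helper)."""
    seen = set()  # code points that already have a round-trip mapping
    out = []
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # comments, blank lines, multi-code-point sequences
        byte = m.group(1).upper()
        ucs = m.group(2).upper()
        comment = m.group(3) or ""
        # |0 = round trip, |3 = bytes-to-Unicode only (reverse fallback)
        flag = 3 if ucs in seen else 0
        seen.add(ucs)
        entry = f"<U{ucs}> \\x{byte} |{flag}"
        if comment:
            entry += f" # {comment}"
        out.append(entry)
    return out


if __name__ == "__main__":
    sample = [
        "0x2B    <LR>+0x002B   # PLUS SIGN, left-right",
        "0xAB    <RL>+0x002B   # PLUS SIGN, right-left",
    ]
    for entry in apple_to_ucm(sample):
        print(entry)
```

With this rule, decoding 0xAB still yields U+002B, while encoding U+002B
always produces 0x2B, so the direction information is lost only on the
way back to MacArabic.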

(2) I've quickly written a module (attached to this mail)
for MacArabic that works with Perl 5.6.1 or later.

I hope it can be built on a Mac,
but I have never worked with a Macintosh, and
I am not well informed about either Macintosh or "bidi",
so please let me know if something is wrong.
(At least, the version here doesn't support
embedding or nesting of directions.)

SADAHIRO Tomoyuki

Attachment: Lingua-AR-MacArabic-0.00.tar.gz
Description: Binary data
