Chris,

I think what you've done is very interesting, and
useful, so what I have to say below is not intended
as criticism of your work in any way.

It's curious that the Arabic Presentation Forms got
into Unicode at all, and a number of people still think
it was a mistake, a sell-out.  One of the Fathers of Unicode 
told me they were deprecated.  Even the Unicode specification
explains their presence rather apologetically.

According to the True Gospel of Unicode, Arabic text
should always be _encoded_ in the basic \u06HH characters,
and should not need to be re-encoded.  The rendering
(BIDI, shaping, ligatures) should be done in a separate
rendering engine that does whatever it needs to do internally
but does _not_ change the original encoding of the text.
The Unicode Presentation Forms are kludges for systems
that don't implement a proper rendering engine (and it
sounds like you are working within the limitations of
such a system).  According to the True Gospel, the Arabic
Presentation Forms are an unjustified waste of code space.
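
To make that concrete, here is a minimal sketch in Python (my choice
for illustration; the names is_presentation_form and to_basic are
made up).  It relies on the fact that the Presentation Forms carry
compatibility decompositions back to the basic characters, so
normalizing them with NFKC recovers the basic encoding.  It only
touches characters in the two Presentation Forms ranges and leaves
everything else alone.

```python
import unicodedata


def is_presentation_form(ch: str) -> bool:
    """True for code points in Arabic Presentation Forms-A or -B."""
    return "\uFB50" <= ch <= "\uFDFF" or "\uFE70" <= ch <= "\uFEFF"


def to_basic(text: str) -> str:
    """Map presentation forms back to basic characters (hypothetical helper).

    Each presentation form is normalized on its own with NFKC, which
    applies its compatibility decomposition; other characters pass through.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if is_presentation_form(ch) else ch
        for ch in text
    )


if __name__ == "__main__":
    # ALEF ISOLATED FORM (U+FE8D) goes back to basic ALEF (U+0627).
    print(to_basic("\uFE8D") == "\u0627")
    # The LAM WITH ALEF ligature (U+FEFB) goes back to LAM + ALEF.
    print(to_basic("\uFEFB") == "\u0644\u0627")
```

Text that has been through something like to_basic is back in the
form the standard wants stored; the shaping is then the renderer's
business.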

From a purely aesthetic point of view, rendering each
basic Arabic character with one of only four glyphs (isolated,
initial, medial, final) yields only marginally acceptable
results.  It's crude and butt ugly.  Before you get defensive,
let me point out that I myself use a four-glyph-to-each-char font
for rendering Arabic on my own webpage, www.arabic-morphology.com.
So I'm not criticising you.  My rendered Arabic is crude and
butt ugly.  So I'm just saying that really good Arabic
rendering would need to be far more subtle and flexible than
any four-glyph font could allow.  Good Arabic rendering also
needs to use a lot of ligatures, not just the laam-alif ones.
The Arabic Presentation Forms also provide a lot of ligatures,
but again a good rendering system would need ligatures not
included in the closed set of Arabic Presentation Forms.
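
For reference, the four-glyph approach boils down to something like
the following Python sketch.  This is my own simplified illustration,
not Chris's module and not the full algorithm in the standard: the
names build_form_table and shape are made up, the table is derived
from the decomposition data for Presentation Forms-B rather than
typed in by hand, and a letter is assumed to join forward only if it
has an initial form and backward only if it has a final form.  It
deliberately ignores diacritics (which should be transparent to
joining), the laam-alif ligatures, and BIDI reordering.

```python
import unicodedata

FORMS = ("isolated", "final", "initial", "medial")


def build_form_table():
    """Map (basic char, form name) -> presentation form, from Forms-B data."""
    table = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter entries such as "<final> 0628".
        if len(parts) == 2 and parts[0].strip("<>") in FORMS:
            base = chr(int(parts[1], 16))
            table[(base, parts[0].strip("<>"))] = chr(cp)
    return table


TABLE = build_form_table()


def shape(text: str) -> str:
    """Pick one of four forms per letter by looking at its two neighbours."""
    out = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Assumption: having an initial form means "can join forward",
        # having a final form means "can join backward".
        joins_prev = (ch, "final") in TABLE and (prev, "initial") in TABLE
        joins_next = (ch, "initial") in TABLE and (nxt, "final") in TABLE
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        out.append(TABLE.get((ch, form), ch))
    return "".join(out)


if __name__ == "__main__":
    # kaf + teh + beh, in logical order
    for ch in shape("\u0643\u062A\u0628"):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Note how little there is to it: one lookup per letter and two
neighbours consulted.  That is exactly why the results can't be
better than the four glyphs the font provides.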

As an example, the ArabTeX rendering system (a package for
TeX and LaTeX for rendering Arabic) uses a font of about
250 basic glyphs, with an auxiliary mechanism to connect
the basic shapes where necessary with smooth curves.

The ultimate Arabic rendering engine is probably that of Thomas
Milo, who is inspired by the best of the Turkish calligraphers.
His rendering algorithms first draw the basic skeleton of the
word according to the best classical proportions, and then
fit in the dots and diacritics afterward (this is how real
calligraphers do it, and it requires some squeezing and subtle
contextual adjustments).  A lot of people think that Milo's
work is _too_ calligraphic, and unsuitable for plain text
like newspapers and even most books, but he's a constant
reminder that most computer rendering of Arabic (including
my own) is still pretty crude.

Keep up the good work,

Ken

> From: "Chris Whiting" <[EMAIL PROTECTED]>
> Subject: Re: Bidirectional (bidi) Support?
> Date: Fri, 24 Oct 2003 22:24:38 -0400
> 
> 
> >"Bob Hallissy" <[EMAIL PROTECTED]> wrote in message
> news:OFA3E32D4C.D67B7CEA->[EMAIL PROTECTED]
> >
> >On 21/10/2003 01:09:32 "Chris Whiting" wrote:
> >
> >>I have implemented ... an Arabic shaping algorithm in
> >>Perl and was wondering if it would be useful to upload it to cpan.
> >
> >I presume your algorithm depends on the Arabic presentation forms available
> >as separately encoded characters in Unicode. If this is the case, and given
> >that lots of Arabic characters in Unicode do not have all their
> >presentation forms separately encoded, nor will any new presentation forms
> >be added to the standard, it would seem such an algorithm would be of
> >limited, and perhaps, misleading help.
> >
> 
> The algorithm, and all that I have seen, convert Arabic characters in the
> \x{06--} range to Arabic Presentation Forms A (starting at \x{FB50}) or B
> (starting at \x{FE70}) characters depending on their medial, isolated,
> initial, and final values per the Unicode standard.
> 
> I am not sure that I understand your point.  Isn't this the purpose of the
> Arabic Presentation Forms?
> 
> I have found several different algorithms that do this but have found none on
> CPAN.  Note that I may help someone else (who has a simpler module than
> mine) to upload his files.
> 
> I use the modules when rendering text on images using ImageMagick which
> performs no shaping nor bidi algorithm.  Without this shaping module many
> of the characters are not rendered correctly.  In my case the shaping module
> is not limited nor misleading but required.
> 
> Perhaps, you have a better way?
> 
> Chris
> 
> 
> >Bob
> 
> 


**********************************************************************
Kenneth R. Beesley              [EMAIL PROTECTED] 
Xerox Research Centre Europe    Tel from France:    04  76 61 50 64     
6, chemin de Maupertuis         Tel from Abroad: +33 4  76 61 50 64
38240 MEYLAN                    Fax from France:    04  76 61 50 99
France                          Fax from Abroad: +33 4  76 61 50 99

XRCE page:      http://www.xrce.xerox.com
Personal page:  http://www.xrce.xerox.com/people/beesley/beesley.html
**********************************************************************
