Hi Werner,
Thank you for commenting on this.

On Mon, 25 Nov 2002, Werner LEMBERG wrote:
> > The idea is:
> >  Assign codes and hot spots for all possible Glyph components,
> >   per script, per language system.
>
> How will you handle open-ended scripts like Urdu where the number of
> ligatures is changing while the language evolves?  For example, I was
> told by an Urdu computer scientist that during a visit of Margaret
> Thatcher (a former Prime Minister of England) the newspapers created a
> new ligature for her name.
>
> > Create a generic state machine that can step through the input
> >   unicode characters, and spit out Glyph components and their
> >   relative hot spot positions.
>
> This is far more complicated I fear.  You will need fallback
> algorithms for fonts which don't provide some glyphs/ligatures, etc.
> Some fonts have e.g. `Amacron' as a single glyph, others compose it
> from `A' with a macron accent.

Talking about ligatures, what I am really afraid of is having
a script encoded today, viewing it with tomorrow's font, and not
seeing what I wrote today.

What I was thinking is that all the compulsory ligatures
need to be defined. If a new ligature arrives, a new
scriptcode has to be created or ZWJ used to form it.

This way the ambiguity goes away. For non-compulsory
ligatures/non-ligatures we could still use ZWJ and ZWNJ
characters.
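
To make the rule concrete, here is a rough Python sketch of how I
imagine it. Everything in it is an assumption for illustration only:
the COMPULSORY table (with lam + alef as its single entry), the
function name cluster(), and the choice to let ZWNJ override even a
compulsory ligature.

```python
ZWJ = "\u200D"   # ZERO WIDTH JOINER: asks for a ligature
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER: forbids a ligature

# Hypothetical table of compulsory ligatures for one scriptcode and
# language system; a real table would have to be defined per script.
COMPULSORY = {("\u0644", "\u0627")}  # Arabic lam + alef


def cluster(text, compulsory=COMPULSORY):
    """Group characters into strings that should render as one ligature."""
    clusters = []
    join_next = False   # a ZWJ was seen just before this character
    block_next = False  # a ZWNJ was seen just before this character
    for ch in text:
        if ch == ZWJ:
            join_next, block_next = True, False
            continue
        if ch == ZWNJ:
            join_next, block_next = False, True
            continue
        wanted = join_next or (
            bool(clusters) and (clusters[-1][-1], ch) in compulsory
        )
        if clusters and wanted and not block_next:
            clusters[-1] += ch      # extend the current ligature
        else:
            clusters.append(ch)     # start a new cluster
        join_next = block_next = False
    return clusters
```

With this, "f" + ZWJ + "i" comes out as one cluster, plain "fi" as
two, and lam + alef as one without any joiner. The point is that the
result depends only on the text and the table, not on the font.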

I admit the task is not simple. That's why I posted it instead
of just implementing it straight away. The most complicated
part is the definitions. The hard question is: for a given scriptcode
and language system, which ligatures are compulsory?

> > . Create a generic inverse state machine. The input is
> >  components and their relative hot spot positions and the
> >  output is unicode stream.
>
> You can do that already by following the Adobe Glyph List (AGL)
> algorithm for naming glyphs.

Thanks for the reference. I also think there are a lot of things
out there from which we could learn.
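
As far as I understand the AGL glyph naming rules, the reverse mapping
could look roughly like the Python sketch below. Treat it as my
reading, not a reference implementation: AGL_EXCERPT is only a tiny
stand-in for the full list, and the function names are made up.

```python
import re

# Tiny excerpt standing in for the full Adobe Glyph List (assumption:
# a real implementation would load the complete table).
AGL_EXCERPT = {"A": "\u0041", "Amacron": "\u0100", "f": "\u0066", "i": "\u0069"}

_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")  # uniXXXX, uniXXXXYYYY, ...
_U = re.compile(r"u([0-9A-F]{4,6})")         # uXXXX up to uXXXXXX


def _is_scalar(cp):
    """True for code points in range that are not surrogates."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def component_to_text(name):
    """Map one glyph name component to a Unicode string ('' if unknown)."""
    if name in AGL_EXCERPT:
        return AGL_EXCERPT[name]
    m = _UNI.fullmatch(name)
    if m:
        digits = m.group(1)
        cps = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        if all(_is_scalar(cp) for cp in cps):
            return "".join(chr(cp) for cp in cps)
        return ""
    m = _U.fullmatch(name)
    if m:
        cp = int(m.group(1), 16)
        return chr(cp) if _is_scalar(cp) else ""
    return ""


def glyph_name_to_text(glyph_name):
    """Drop any '.suffix', split on '_' and map each component."""
    base = glyph_name.split(".", 1)[0]
    return "".join(component_to_text(c) for c in base.split("_"))
```

For example, glyph_name_to_text("f_i") gives "fi", and
glyph_name_to_text("uni01000304.alt") gives U+0100 followed by U+0304.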

Sorry if I cannot fully take part in the discussion I started,
but I am extremely tied up with other things at the moment.


-- 
G̳á̳s̳p̳á̳r̳
ガーシュパール・Гашпар・가스팔・Γασπαρ・גאשפאר
עברי 10-2*5


--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
