Re: About the European MES-2 subset

Philippe Verdy Fri, 18 Jul 2003 07:28:26 -0700

On Friday, July 18, 2003 1:13 PM, Peter Kirk <[EMAIL PROTECTED]> wrote:

> On 18/07/2003 03:16, Philippe Verdy wrote:
> 
> > I still note that modern Hebrew and Arabic are excluded from MES-2,
> > as they are not used in any official language in the European Union
> > or EFTA, or future EU candidates. ...
> > 
> But they are used in official publications within the EU, those
> targeted at minority communities. But then so are south Asian and
> east Asian scripts.

But for these Asian languages, I think it is best to have fonts designed
to handle their scripts correctly, rather than one giant font that is
poorly hinted for readability at small sizes and lacks support for common
ligatures.

Arabic, Hebrew and Brahmic scripts would be better supported by their
own fonts than partially by a general one. For example, including only the
Brahmic digits in Arial Unicode MS was an error, in my opinion; Microsoft
would have done better to provide separate fonts for these Brahmic
scripts, rather than stating that its fonts support them.

> > ... But they are certainly of great interest for countries with
> > which the EU is a major partner, and which are using these scripts.
> > In some future, it would be needed to include support for modern
> > Georgian (a subset of U+10A0..U+10FF), and modern Armenian (a subset
> > of U+0530..U+058F), as well as some characters from Cyrillic
> > Supplementary (in U+0500..U+052F).

As for Armenian and Georgian Mkhedruli, they do not seem complex to add
to a font.

> If this subset is to be enlarged very much, and to require complex
> script rendering etc for its implementation, surely there is little
> point in specifying anything less than the improper (in the
> mathematical sense!) subset which Ken mentioned, i.e. the whole of
> Unicode. 

I agree with this point. But it is no excuse for failing to implement and
support at least the NFC and case mapping closures in a decent font
for any script, even if coverage of the script is reduced to the letters
used in the modern language.
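
As a rough illustration of what "NFC and case mapping closures" mean for
a character repertoire, here is a minimal Python sketch. The name
close_repertoire is hypothetical, the code relies on Python's unicodedata
module and on str.upper()/str.lower() rather than on any font tool, and it
only tries pairs whose second character has a non-zero combining class, so
it approximates a real closure computation rather than defining one.

```python
import unicodedata
from itertools import product


def close_repertoire(chars):
    """Grow `chars` until it is stable under case mapping and NFC composition.

    Hypothetical helper for illustration only. It considers single
    characters and (base, combining mark) pairs, nothing longer.
    """
    repertoire = set(chars)
    while True:
        found = set()
        for ch in repertoire:
            # upper()/lower() may return several characters ("ß" -> "SS"),
            # so every resulting character is added individually.
            for mapped in (ch.upper(), ch.lower(),
                           unicodedata.normalize("NFC", ch)):
                found.update(mapped)
        for base, mark in product(repertoire, repeat=2):
            # Only marks with a non-zero combining class are tried here.
            if unicodedata.combining(mark):
                found.update(unicodedata.normalize("NFC", base + mark))
        if found <= repertoire:
            return repertoire
        repertoire |= found


# "e" plus U+0301 COMBINING ACUTE ACCENT
print(sorted(close_repertoire("e\u0301")))
```

Starting from "e" and the combining acute accent alone, the result also
contains "E", "é" and "É", which is the kind of character a font limited
to "modern letters" would still have to cover.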

But some optional ligatures that a set of modern written languages does
not strictly need may be left out if the font or renderer supports correct
fallback decompositions (for example with <fi>, <fl>, <ffi>, <ffl>). What
is important here is the legibility of the printed text, so that no
confusion is possible for a text written in any language.
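
The fallback idea can be sketched in Python as follows. This is only an
illustration under stated assumptions: with_fallback and the supported set
are hypothetical names, and the compatibility decomposition (NFKD) from
the standard unicodedata module stands in for whatever a real font or
renderer would do.

```python
import unicodedata


def with_fallback(text, supported):
    """Replace unsupported characters by their compatibility decomposition.

    Hypothetical helper: `supported` is the set of characters the font is
    assumed to cover. A character is replaced only when every character of
    its NFKD decomposition is supported; otherwise it is left untouched.
    """
    out = []
    for ch in text:
        if ch in supported:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFKD", ch)
        if decomposed != ch and all(c in supported for c in decomposed):
            out.append(decomposed)
        else:
            out.append(ch)  # no usable fallback for this character
    return "".join(out)


basic_latin = set("abcdefghijklmnopqrstuvwxyz ")
# U+FB03 is the <ffi> ligature, U+FB02 the <fl> ligature.
print(with_fallback("o\ufb03ce \ufb02ow", basic_latin))  # prints "office flow"
```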

One good source for the characters a language needs is the
Openi18n.org LDML database (notably the ICU section, which is the
most complete collection), which contains definitions of
<exemplarCharacters> for each supported language (though there may
be some omissions). One regret: some characters are listed as
exemplars although they are not mandatory to support the language, and
they should be listed separately, as should rare characters used only
in proper names, geographical names or transliterated foreign words,
which can often be written with the common letters using a phonetic
approach.

An example is Norsk "Bokmål", most often transcribed as norvégien
"bokmal" or "bokmâl" in French (where the circumflex is used as a way
to mark an open and/or lengthened vowel), or translated as
norvégien "classique" (as opposed to norvégien "réformé", or
"nouveau" norvégien).

So the <exemplarCharacters> of a language are a good indication of the
characters that language needs, even if an "official" transliteration
rule brings in more characters for imported foreign words.
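
To make the Bokmål example concrete, here is a small Python sketch that
checks a text against an exemplar set. Both outside_exemplars and
FRENCH_EXEMPLARS are hypothetical: the French letter list below is typed by
hand for illustration and is not copied from the LDML data, which may
differ.

```python
import unicodedata

# Hand-typed, illustrative set of French letters. This is an assumption
# made for the sketch, NOT the actual LDML exemplar data.
FRENCH_EXEMPLARS = set(unicodedata.normalize(
    "NFC", "abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ"))


def outside_exemplars(text, exemplars):
    """Return the sorted letters of `text` that are not in `exemplars`."""
    normalized = unicodedata.normalize("NFC", text).lower()
    return sorted({ch for ch in normalized
                   if ch.isalpha() and ch not in exemplars})


print(outside_exemplars("norvégien bokmål", FRENCH_EXEMPLARS))  # ['å']
print(outside_exemplars("norvégien bokmâl", FRENCH_EXEMPLARS))  # []
```

With this assumed list, only the "å" of the original spelling falls
outside the French set, while the transcription with a circumflex stays
inside it.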

-- 
Philippe.
Spam not tolerated: any unsolicited message will be
reported to your Internet service providers.
