From: <[EMAIL PROTECTED]>
> Don Osborn wrote on 06/05/2003 07:34:29 PM:
> > > There are probably some existing standards for keyboard mappings,
> > > promoted by UNESCO and published in an ISO standard.
> >
> > If there were such a thing (for Tamazight or any other African
> > language) I'd be very interested to know about it.  My impression is
> > that there are no such standards for African languages that use
> > extended Latin characters.  In fact SIL is apparently working with
> > UNESCO on a proposed keyboard layout for African languages precisely
> > because there is not yet any such standardization.
> 
> Just for clarification, what we are doing is *not* part of a
> standardization process; note that UNESCO is not a standards body.
> Rather, UNESCO is involved in policy recommendations, and what we are
> assisting them with is a set of documents providing recommendations
> related to support of the world's languages (emphasis on those languages
> on the other side of the digital divide) in ICTs. The keyboard layout in
> question is merely for a prototype implementation intended to demonstrate
> the ability to create keyboard layouts to meet the needs of languages
> less well supported in ICTs.

Thanks for this information. However, I don't think I stated that UNESCO was a standards 
body; still, an important part of its activity is to promote education and the 
preservation of world cultures by helping languages to be written and used with modern 
technologies.

So I do think that, for languages not officially supported by a country, most writing 
will in practice be done using the keyboard layouts already used in each country. 
Unicode can help those who want to create keyboard layouts and input methods by 
describing somewhere the subset of characters appropriate for each language (which 
would facilitate interchange), and the correct encoding of 
digraphs/trigraphs/polygraphs as sequences of characters.
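
To illustrate what such a description could look like, here is a small Python sketch; 
the repertoires and multigraphs below are invented examples, not an official list:

# Sketch: per-language recommended character subsets, with digraphs
# and trigraphs encoded as sequences of characters.  The repertoires
# below are invented examples, not an official list.
LANGUAGE_REPERTOIRES = {
    "br": {  # Breton
        "letters": "abcdefghijklmnoprstuvwyzñùê\u02bc",
        # the trigraph <c'h> is the sequence c + U+02BC + h
        "multigraphs": ["ch", "c\u02bch", "zh"],
    },
    "fr": {  # French
        "letters": "abcdefghijklmnopqrstuvwxyzàâçéèêëîïôùûüÿæœ",
        "multigraphs": [],
    },
}

def is_in_repertoire(word, lang):
    # True if every character of the word belongs to the declared subset.
    return all(c in LANGUAGE_REPERTOIRES[lang]["letters"] for c in word.lower())

print(is_in_repertoire("marc\u02bch", "br"))  # True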

It would be interesting to add some informative appendices to Unicode, and later make 
them normative, to clearly state the subset of characters that MUST be supported for 
each written language, along with a list of legacy equivalents that should be 
interpreted the same as their recommended encoding in the context of that language.

Of course the recommended characters should exclude compatibility characters (which 
would instead be listed among the legacy equivalences).
After this step, there could be statistical studies, based on many types of 
publications and published outside the Unicode standard, that list the combination 
properties of each letter or digram/trigram/polygram, with statistical indicators, 
allowing further identification of the language.
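
As a rough illustration of how such published statistics could then be used, here is a 
small Python sketch; the frequency tables are invented for the example, and real ones 
would come from the corpus studies described above:

from collections import Counter

# Sketch: identify the likely language of a text by comparing its
# letter-combination frequencies with published per-language tables.
# The tiny tables below are invented; real ones would come from
# large statistical studies of publications.
PROFILES = {
    "fr": {"ou": 0.012, "es": 0.020, "le": 0.018, "ch": 0.006},
    "br": {"c\u02bch": 0.015, "ar": 0.020, "zh": 0.010, "eu": 0.008},
}

def ngrams(text, n):
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def score(text, profile):
    grams = Counter(ngrams(text, 2) + ngrams(text, 3))
    total = sum(grams.values()) or 1
    # crude similarity: overlap between observed and published frequencies
    return sum(min(grams[g] / total, freq) for g, freq in profile.items())

def identify(text):
    return max(PROFILES, key=lambda lang: score(text, PROFILES[lang]))

print(identify("marc\u02bch ha plac\u02bch"))  # -> 'br'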

As Unicode has nearly finished its work on all major modern languages, such a 
specification could already exist for them; but for rarer languages, it would help if 
they were encoded more explicitly, with encoding guides or recommendations, in order 
to facilitate their interchange. When I look in Unicode, there are often many 
candidates for the encoding of a written language among the existing Unicode 
"abstract" characters.

We spoke about the case of Breton <c'h> (which is encoded using the same set of 
characters as French, and so uses the APOSTROPHE rather than the MODIFIER LETTER 
APOSTROPHE, simply because the input methods in use are based on the French keyboard), 
or of the Tifinagh <gamma> (which may have been encoded in various texts with a Greek 
gamma or with a Latin gamma, with additional variants such as a small capital 
gamma...).
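
For the Breton case, a text-preparation tool could repair this mechanically; here is a 
minimal Python sketch (the context test is a simplification, not a normative rule):

import re

# Sketch: repair the Breton trigraph <c'h> typed with the ASCII
# APOSTROPHE (U+0027, what a French keyboard produces) so that it
# uses U+02BC MODIFIER LETTER APOSTROPHE, which is a letter and thus
# keeps c'h inside the word for selection, casing and sorting.
# The context test (only between c and h) is a simplification.
def normalize_breton_trigraph(text):
    return re.sub(r"(?<=[cC])'(?=[hH])", "\u02bc", text)

print(normalize_breton_trigraph("Plac'h ha marc'h"))  # Plac\u02bch ha marc\u02bch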

The origin of some "compatibility characters" is not clear. They were certainly added 
because of legacy encodings, but if their use is not recommended for newly supported 
languages, this usage should be better described: these characters will continue to be 
recommended to support national standards for specific languages (which already 
defined these distinct variants).

For many of the remaining minority languages, there will certainly not be a single 
keyboard layout or input method. A model based on completely new keyboard layouts that 
do not correspond to any layout widely used in the countries where these languages are 
spoken would probably fail (people would not, for example, be able to find such a 
keyboard, or would be limited in their choice, notably for notebooks).

In my opinion, Tifinagh or Breton will often be written using an extended French 
keyboard, not a completely new layout (simply because people also need to use their 
national language, and won't use a distinct computer or keyboard).
The simplest way is then to modify or extend a national keyboard, even if this implies 
that a minority language will be supported by several input methods, one for each 
country.

So I do think that a single multinational Tifinagh keyboard will not exist; instead 
there will be keyboards for French+Tifinagh+Arabic, which may differ between Morocco, 
Tunisia, Algeria and Chad, or English+Tifinagh+Arabic in Libya and Egypt, or 
English+French+Tifinagh in Canada...

Keyboard layouts were initially created to support a single language in a single 
country. The way I see it evolving is that OSes will become more open and will allow 
each user to adapt their keyboard and input methods to the languages they want to 
support, using a national base layout which is simply extended to cover more languages.
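
A small Python sketch of that model, with a base layout extended by user overlays; the 
key positions and character assignments below are purely illustrative:

# Sketch: the "extend a national base layout" model.  A layout is a
# mapping from key chords to characters; user overlays add the extra
# letters of other languages on free positions.
FRENCH_AZERTY_BASE = {
    ("A",): "a",
    ("AltGr", "E"): "€",       # euro sign, present on recent French layouts
}

FRENCH_LETTER_OVERLAY = {
    ("AltGr", "O"): "œ",       # the OE ligature missing from the base layout
    ("AltGr", "A"): "æ",
}

BRETON_OVERLAY = {
    ("AltGr", "H"): "\u02bc",  # MODIFIER LETTER APOSTROPHE for <c'h>
}

def build_layout(base, *overlays):
    # Layer overlays on top of a base layout; later overlays win on conflicts.
    layout = dict(base)
    for overlay in overlays:
        layout.update(overlay)
    return layout

user_layout = build_layout(FRENCH_AZERTY_BASE, FRENCH_LETTER_OVERLAY, BRETON_OVERLAY)
print(user_layout[("AltGr", "O")])  # œ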

Or consider simply the base language: look at the standard French keyboard; two 
"characters" are still missing from it, the AE and OE ligatures (I say ligature and 
not letter for AE because this is how it is interpreted in French, but it is still 
needed because of orthographic, phonetic and grammatical rules which differentiate the 
ligature from the pair of separate letters, notably for syllable breaks).

Because of this absence, the original orthography is not respected, which creates 
confusion, and most texts are now encoded with separate letters (requiring the 
ligatures to be reconstructed for correct rendering, using dictionaries or 
morphological analysis, so that a word like "coeur" gets the ligature, but not 
"coexister"). If the keyboards (created at a time when the French character set needed 
to fit the 7-bit ISO 646 model) had contained those characters, more people would use 
the ligatures and interpret them as distinct letters. It's strange that French 
keyboards were gradually adapted to add new characters like the micro sign, which 
nearly nobody uses, or the newer euro sign, but no standard position was ever defined 
for OE and AE, which are full members of the French alphabet...
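
Here is a minimal Python sketch of the dictionary-based repair just described; the 
word list is a tiny sample and case handling is omitted for brevity:

import re

# Sketch: restore the OE ligature in text typed on a keyboard that
# lacks it, so that "coeur" becomes "cœur" while "coexister" is left
# alone.  The word list is a tiny sample; a real tool would use a
# full dictionary or morphological analysis.
OE_WORDS = {"coeur": "cœur", "soeur": "sœur", "oeuvre": "œuvre", "oeil": "œil"}

def restore_oe(text):
    def fix(match):
        word = match.group(0)
        return OE_WORDS.get(word.lower(), word)
    return re.sub(r"[A-Za-zœŒ]+", fix, text)

print(restore_oe("le coeur doit coexister avec l'oeil"))
# -> le cœur doit coexister avec l'œil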

So if input methods and layouts must be developed, it would be worthwhile to recommend 
how this should be done, avoiding the mistakes of the past: a language should be 
encoded with a preferred set of characters, and this should be reflected in the 
keyboard layouts, and standardized.

The publication of recommended alphabets (possibly with several "conformance levels") 
for each language would really help software and OS vendors to provide a richer set of 
input methods that can be learned and reused by people. Part of this job fits within 
Unicode, because it defines character properties (conformance of a Unicode character 
to a language is such a property), and another part belongs to other ISO working 
groups and UNESCO recommendations.

-- Philippe.

