Re: IUC27 Unicode, Cultural Diversity, and Multilingual Computing / Africa is forgotten once again.

2004-12-08 Thread Patrick Andries
John H. Jenkins a écrit :
On Dec 8, 2004, at 3:57 PM, Patrick Andries wrote:

Azzedine Ait Khelifa a écrit :
Hello All,
The subject of this conference is really interesting and very useful.
But once again Africa is forgotten.
I want to know if we can have the same conference, Africa-oriented, 
scheduled?
If not, what should we do to have this conference scheduled in a 
city accessible to the African community (like Paris)?

If this is possible, I would also add « and with much more content 
in a language understood in Africa and the host country: French ».

Well, and as with everything else associated with Unicode, feel free 
to volunteer.


Well, I do volunteer work...in English and in French (there are other 
fora where people talk about Unicode in French whether in Morocco or 
Lebanon for instance).

Merci du conseil et salutations cordiales,
P. A.




Re: Pour sauver la patrimoine de l'Imprimerie Nationale de France

2004-12-06 Thread Patrick Andries
Michael Everson a écrit :
Voir http://www.garamonpatrimoine.org/
Note the use of Unicode in http://www.garamonpatrimoine.org/petition.html
P. A.



[Fwd: Re: Re: Relationship between Unicode and 10646]

2004-11-29 Thread Patrick Andries






 Message original 
Sujet:  Re: Re: Relationship between Unicode and 10646
Date:   Mon, 29 Nov 2004 10:17:34 +0100
De:     Philippe Verdy [EMAIL PROTECTED]

From: "Patrick Andries" [EMAIL PROTECTED]
 In the end, I am no longer so sure that American companies still 
 consider Unicode something strategic; it is mostly a matter of 
 individual efforts by passionate technical people within those 
 companies, enthusiasts who are still left to get on with it, no doubt 
 because it builds up a good stock of multicultural goodwill.

[PA] This was extracted from a longer, private message to Philippe. 
It is out of context here. Unicode is still strategic; the new scripts
may be less so to the major software companies, although those
companies will most probably not be able to ignore the new versions of 
Unicode, which will contain more than simply new rare scripts. 

Anyway, this was a private discussion. Thanks, Philippe. That will teach me.

P. A.









Value of U+1E20

2004-09-16 Thread Patrick Andries
Would anyone know what the value of U+1E20 is?
Is it (also) used in Semitic transliterations? With which value? 
Could it be a fricative G?

Many thanks,
P. A.



Re: Unicode V4 and ISO

2004-09-01 Thread Patrick Andries




Martine Brunet a écrit :

  
Hello,
  
I am new on this list and I have a question about very special characters
and the Unicode standard, version 4. I searched at length for the answer at
www.unicode.org, but without success.
Can somebody tell me whether the characters of the following four standards,
ISO 5426-2:1996, ISO 6861:1996, ISO 8957:1996 and ISO 10754:1996, are
included in Unicode V4?

In detail, they are the following standards:
- ISO 5426-2:1996 Information and documentation - Extension of
the Latin alphabet coded character set for bibliographic information
interchange - Part 2: Latin characters used in minor European languages
and obsolete typography
- ISO 8957:1996 Information and documentation - Hebrew alphabet
coded character sets for bibliographic information interchange
- ISO 10754:1996 Information and documentation - Extension of
the Cyrillic alphabet coded character set for non-Slavic languages for
bibliographic information interchange
- ISO 6861:1996 Information and documentation - Glagolitic
alphabet coded character set for bibliographic information interchange

I believe the place to look is here: 

http://www.unicode.org/versions/Unicode4.0.0/References.pdf

At first sight, these all served as source references for ISO 10646,
except the Glagolitic one, whose script is part of Amd 1 to ISO 10646:2003
(and thus of an upcoming version of Unicode, after 4.0). I believe
(but I have not studied this in depth) that the Amd 1 proposal differs
slightly from ISO 6861 in that some glyph variants from ISO 6861
are not proposed in Amd 1.



Cordialement,

P. Andries
- o - O - o - 
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax









Re: Errors in TUS Figure 15.2?

2004-08-03 Thread Patrick Andries




Doug Ewell a écrit :

  Peter Kirk peterkirk at qaya dot org wrote:

  
  
The situation is even more confused in that some Unicode characters,
e.g. U+0152 LATIN CAPITAL LIGATURE OE, are called LIGATUREs in their
character names but are unambiguously single Unicode characters (e.g.
they have no decomposition even for compatibility). (These are in
addition to the characters named LIGATURE in the Alphabetic
Presentation Forms block, which mostly have compatibility
decompositions.)

  
  
The last thing you want to worry about is the correlation between
whether a character has the word LIGATURE in its name and whether it is
actually a ligature.  That way lies madness.
  

[PA] Incidentally, the French version of ISO 10646 does not name these
letters LIGATURE, but DIGRAMME SOUDÉ (e.g. U+0152 : DIGRAMME SOUDÉ
MAJUSCULE LATIN OE).

Also, the Unicode 1.0 name may have been better in this regard: 
LATIN CAPITAL LETTER O E.

P. A.
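
The point made in this thread (a LIGATURE in the name says nothing about decomposition) can be checked against the character database. The following is only a minimal sketch using Python's standard unicodedata module; the helper name `describe` is made up for illustration, and the results reflect whatever Unicode version the local Python ships with.

```python
import unicodedata

def describe(ch):
    """Return (character name, decomposition mapping) for one character.

    An empty decomposition string means the character has no canonical
    or compatibility decomposition at all.
    """
    return unicodedata.name(ch), unicodedata.decomposition(ch)

# U+0152 is named LIGATURE but has no decomposition of any kind.
oe = describe("\u0152")

# U+FB01, from the Alphabetic Presentation Forms block, has a
# compatibility decomposition (tagged <compat>) to "f" + "i".
fi = describe("\uFB01")

print(oe)
print(fi)
```

Normalizing U+0152 to NFKD should therefore leave it unchanged, while U+FB01 becomes the two letters "fi".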






Re: Much better Latin-1 keyboard for Windows

2004-07-27 Thread Patrick Andries




Mike Ayers a écrit :

  
  RE: Much better Latin-1 keyboard for Windows
[Alain] As I said in my previous mail, these definitions are 
 not the best of definitions. The distinction is but 
 intuitive, you have to see the diagrams where labeling makes 
 the difference: 
<SNIP/> 
 I don't have these diagrams. Are they published somewhere public? 

The only one I know of that does not infringe copyright (because it has 
not yet been published) is here:

http://www.cooptel.qc.ca/~pandries/ISO-CEI%209995-1-1994.pdf

I believe Alain was referring to figures 8 and 9 (end of document).

P. A.
- o - O - o - 
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax







Re: Changing UCA primar[l]y weights (bad idea)

2004-07-12 Thread Patrick Andries
Alain LaBonté a écrit :
   It would be much better to make sorting, matching and searching 
consistent with tailored tables of either the UCA or ISO/IEC 14651. 
Unfortunately that is not what happens in most products, except in 
some good search engines (Google, Altavista and the like, which are 
smart enough for this -- but are not tailorable, to my knowledge -- 
and there are slight differences in behaviour between Google and 
Altavista although it is very much better than Mozilla or MS products 
in all cases).
[PA] Sometimes too smart, when one wants to search for a word with an accent 
and not find the far more numerous forms without it. A small check box 
(« ignore diacritics ») would be welcome. (Anybody from Google reading the 
list?)

P. A.
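
The « ignore diacritics » switch asked for above can be illustrated with a rough sketch. This is not the UCA or ISO/IEC 14651 (no tailored tables, no multi-level weights); it only strips combining marks after canonical decomposition, and the function names `strip_diacritics` and `find` are invented for the example.

```python
import unicodedata

def strip_diacritics(text):
    """Decompose to NFD and drop combining marks (é -> e, ô -> o)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def find(word, candidates, ignore_diacritics=True):
    """Return the candidates that match word.

    With ignore_diacritics=True, accents and case are ignored.
    With ignore_diacritics=False, only canonically equivalent
    spellings match, so accented and unaccented forms stay distinct.
    """
    if not ignore_diacritics:
        target = unicodedata.normalize("NFC", word)
        return [c for c in candidates
                if unicodedata.normalize("NFC", c) == target]
    key = strip_diacritics(word).casefold()
    return [c for c in candidates
            if strip_diacritics(c).casefold() == key]

words = ["cote", "côte", "coté", "côté"]
print(find("côté", words))                           # loose match
print(find("côté", words, ignore_diacritics=False))  # strict match
```

The loose search returns all four words, the strict one only « côté », which is the behaviour the check box would toggle.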



Re: Arabic written in Syriac? Arabic written in Tifinagh?

2004-07-09 Thread Patrick Andries
E. Keown a écrit :
   Elaine Keown
   Tucson
Hi,
I'm trying to track down a reference for Arabic
written in Syriac (by Syriac Christians).
 

Well, the keyword « Garshuni » may help here.
I did a little work on Tifinagh 2-3 years ago.  I
discovered that it is used to write Arabic by Tuareg
women. I hope that the Moroccan Tifinagh proposal
includes those characters, if they are 'extras.'
 

Do you have any letters in mind ?  Some such letters could very well be 
missing

P. A.



Re: Arabic written in Syriac? Arabic written in Tifinagh?

2004-07-09 Thread Patrick Andries
E. Keown a écrit :

Aha!--thank you. Is there much Garshuni material,
some especially notable?

A recent (May 2004) paper, with references to Garshuni manuscripts:
17h15 Élie Kallas (Trieste)
/Le type linguistique garchouni du Mont-Liban (15^ème siècle) d'après 
les mss. Vat ar. 640 et Borg. ar. 136 d'Ibn el-Qila-^c i-./

http://www.fltr.ucl.ac.be/FLTR/GLOR/ORI/ColloqueArabe/programmeF.htm
« Danach widmete Naoum Faik seine
Zeit der eigenen Zeitschrift »Bethnahrin». Die
Besonderheit der Publikationen von Naoum Faik war,
dass die Beiträge in türkischer bzw. arabischer Sprache
jedoch in Syro-Aramäischen Alphabet. Dieser Stil ist u.a.
als Garschuni bekannt und war vor und nach dem I.
Weltkrieg vor allem innerhalb des Intellektuellenkreises,
die im Osmanischen Reich lebten, weit verbreitet. »
http://www.bethil-online.com/magazines/rh_2003/rh-61.pdf
So it seems like it was quite common in the Ottoman Empire before and 
after WWI among intellectual circles.

I think Google (English, French and German) will reveal a wealth of 
material or citations to material.


Tifinagh is used to write Arabic by Tuareg
women. I hope that the Moroccan Tifinagh
proposal includes those characters...

Patrick Andries wrote:

Do you have any letters in mind ? Some such letters
could very well be missing


I did have a short list of such Tifinagh characters (6
or fewer) from 3 years ago... but the U.S. Post
Office lost two of my boxes this spring, and the
Arabic etc. notes were in the box that's still
heaven-knows-where. Kamal Mansour had a copy of my
Arabic-script bibliography, but I am not sure that the
Tifinagh material was on that.
I know of at least one such letter from memory (because it
is easy to remember): a rectangle for emphatic s. But it is
debatable (only Hanoteau gives it, I think) and thus was
not a priority to encode in our first (modern-day) Tifinagh
proposal.
But Tifinagh is actually a really important
script... it's used to write many major dialects,
though maybe more by women... and it's caseless, so
the collation string can have the variants inserted in
the regular string of letters
I'm not sure I understand.
P. A.



Re: Looking for transcription or transliteration standards latin- arabic

2004-07-07 Thread Patrick Andries
Peter Kirk a écrit :
On 07/07/2004 07:08, Raymond Mercier wrote:
This is a possible derivation. If this is Gerd's source, he failed to 
make the point that istimboli was not a Greek name of the city but a 
colloquial pronunciation of a phrase. And the source of that may be 
the following old German text, from 
http://www.staff.ncl.ac.uk/jon.west/get/hc0144_3.htm:

Constantinopel hayssen die Chrichen Istimboli und die Thürcken 
hayssends Stambol;

And according to http://www.fotoist.8m.com/ad.htm (in Turkish) this 
information comes from the 14th-15th century German traveller Johan 
Schildtberger. But I have my suspicions about this information. The 
Greeks had no problem with initial consonant clusters but the Turks 
did, so it is much more likely that the Turks added the initial I to a 
Greek word starting with ST, just as Spanish and French add initial E 
before such clusters.

French has not added an initial E in front of ST for the last five 
centuries (see: stop, start, sport (*), stage, stature, station, etc.), 
but historically (in Old French) it did (estable [stable], estamper 
[to stamp], estat [state, station], esterlin [sterling], estrange 
[strange, stranger]). Old French predates the fall of Constantinople and 
the end of the Hundred Years' War (both in 1453, as all French-speaking 
schoolchildren learn).

Spanish still does (or at least did recently); see recent loanwords: 
esquí (ski) or esprint (sprint).

P. A.
(*) English word derived from an Old French word desport / deport  
(entertainment), see deporte in Spanish and desporto/desporte in 
Portuguese (but esporte in Brazil).
.




Re: How to find character corresponding to code

2004-07-07 Thread Patrick Andries
Mike Ayers a écrit :
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Behalf Of [EMAIL PROTECTED]
 Say, I have given a 2-Byte Unicode character code. How can I quickly
 find out, how the corresponding
 character *should* look like according to the standard?

 From the Unicode standards page (FAQ and Search), it seems that it is
 easy to find the code point,
 when one knows the character name. I would like to do the reverse,
 though.
Use the code charts:
http://www.unicode.org/charts/
As you hold the mouse over each link, look at the status bar 
of your browser which shows the link name.  You will see the final 
part of the link name is U followed by hex digits followed by 
.pdf.  The hex digits are the first codepoint in that block.  The 
charts are in ascending order - top to bottom, left to right.  Once 
you find the chart you want, finding the character should be no problem.

[PA] Personally, I often use BabelMap with Code2000 as the default font. 
It makes it easy to see the character properties and comes with an English 
or a French UI with the corresponding character names. It is also handy 
for testing a script, cutting and pasting characters, etc.

http://uk.geocities.com/BabelStone1357/Software/BabelMap_fr.html
http://uk.geocities.com/BabelStone1357/Software/BabelMap.html
P. A.
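
For readers who prefer a script to a chart or a tool, the reverse lookup discussed above (code point to character) can be sketched with Python's standard unicodedata module. The helper names `describe` and `chart_file_name` are invented here; the file-name pattern simply follows the description quoted above (U, then the hex digits of the block's first code point, then .pdf), and the block start has to be supplied by the caller because unicodedata does not expose block boundaries.

```python
import unicodedata

def describe(code_point):
    """Return the character, its name and general category for a code point."""
    ch = chr(code_point)
    return {
        "char": ch,
        # Unassigned or unnamed code points have no name; use a placeholder.
        "name": unicodedata.name(ch, "<no name>"),
        "category": unicodedata.category(ch),
    }

def chart_file_name(block_start):
    """Build a code chart file name from the first code point of a block."""
    return f"U{block_start:04X}.pdf"

info = describe(0x1E20)
print(f"U+{0x1E20:04X} {info['name']} ({info['category']})")

# U+1E20 sits in the block that starts at U+1E00.
print(chart_file_name(0x1E00))
```

This only gives the name and properties; what the character should look like still has to be checked in the chart itself or in a tool with a suitable font.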



Re: Looking for transcription or transliteration standards latin- arabic

2004-07-06 Thread Patrick Andries
Peter Kirk a écrit :
On 03/07/2004 00:07, Patrick Andries wrote:
Two very different political realities (before and after 1453). Cities
change names without going through transliterations, cf. Berlin
(Ontario) becoming Kitchener in 1916.
 

But Constantinople - Istanbul is not in fact this kind of name 
change, for Istanbul (that is, İstanbul) is probably a corrupted and 
shortened version of Constantinople, with the initial I added to fit 
Turkish phonology (cf. the old western version Stamboul, still used in 
Russian, also Smyrna - Izmir). (I have also heard it said that 
Istanbul comes from Greek EIS TĒN POLIN, "to the city", but that seems 
less likely to me.)
Yes, I have heard this.
So the change is more like Beijing - Peking than Berlin - Kitchener. 
Without a political change Constantinople would not have changed name in 
a matter of days (at least as far as the officials were concerned). In 
any case, it is not a transliteration problem (Beijing -- Pékin).

P. A.



Re: Looking for transcription or transliteration standards latin- arabic

2004-07-06 Thread Patrick Andries
Patrick Andries a écrit :

So the change is more like Beijing - Peking than Berlin - Kitchener. 

Without a political change Constantinople would not have changed name 
in a matter of days (at least as far as the officials were concerned). 
In any case, it is not a transliteration problem (Beijing -- Pékin).
[PA] I wrote this a bit too fast this morning (first message!). I 
believe the origin of Istanbul is too obscure to decide whether it 
is due to a transcription or a complete name change. Just to confuse 
things further, Konstantaniye was apparently used by the Turkish 
administration and a Greek form, Istimboli, is attested in the XIVth century.

P. A.



[OT] Dutch letters was [Fwd: Re: is n with tilde used in French language ?]

2004-07-05 Thread Patrick Andries
Patrick Andries a écrit :

http://www.evertype.com/alphabets/french.pdf
Several remarks :
ü seems not to be listed (see « würmien », « le würm », « argüer », now 
acceptable according to a recent spelling reform).

The population of France is now 61.7 million (including around 1.7 
million French citizens in French overseas territories), but French 
is also the native tongue of populations in Belgium, Luxembourg and 
Switzerland (all in Europe).
Haarmann's 1993 figure of 58.1 million was for Metropolitan France + 
overseas territories (1990 census). 

[PA] Incidentally, I notice that, unlike for French, the population figures 
for Dutch and German include the speakers of those languages in several 
countries.

Also for Dutch, I'm not convinced the list of letters is complete in 
http://www.evertype.com/alphabets/dutch.pdf

Most vowels can take an acute accent, I believe: attaché, logé 
(French words), dóórdringen, géén, búíten, drááien (stressed syllables, 
cf. http://www.geocities.com/tinnestaaltroep/tinnepick.html; marking 
stress graphically is common and (much) more frequent than in written 
English, while stressing words by adding accents is almost completely 
absent in French).

The circumflex is also used: enquête, gêne, fêteren,
as well as è in scène (in my Kramers Nederlands-Frans dictionary).
http://www.e-klas.net/ns/nlspelling.htm#acc
http://www.geocities.com/tinnestaaltroep/tinneaccentframe.htm


Re: Looking for transcription or transliteration standards latin- arabic

2004-07-04 Thread Patrick Andries
Philipp Reichmuth a écrit :
Except there is no v sound, only an f sound in the Russian 
pronunciation of Чайковский due to regressive assimilation. 
Chykoffskee is pretty accurate, actually. I'd say Tchaikovsky is 
just a spelling taken over from French at a time when French was 
pretty much the international common language at least in diplomacy 
and art.
[PA] And the prevalence of French in the Russian imperial nobility.
In French it is today Tchaïkovsky (with tréma), but the v looks like an 
attempt to transliterate; Russian names written in French in the XIXth 
century would usually transcribe в as ff: bœuf Strogonoff, Michel 
Strogoff (Jules Verne), *Princesse Demidoff* née Strogonoff, Tchékoff as 
an émigré name in France [2 born in Paris between 1916 and 1940].




Re: is n with tilde used in French language ?

2004-07-04 Thread Patrick Andries
Cristian Secară a écrit :
According to Michael Everson's site, The Alphabets of Europe page,
the French .pdf, character ñ and Ñ (Latin small / capital letter N
with tilde) is used by the French alphabet.
 

Not any alphabet taught in primary school I would say.
But cañon is in my Petit Larousse illustré (2004), but then it refers 
the reader to the more common canyon...

I looked at different other sources and found no other mention about
this character as being used for French language (however, my search
was not exhaustive).
The standard ISO/IEC 8859-16 claims coverage of the French language,
but character ñ and Ñ is not part of ISO/IEC 8859-16.
Should I understand that this character was only used in Old French?
 

As a ligature certainly and it was also proposed and used by Renaissance 
orthographical reformers to denote unambiguously nasal sounds (I have 
several books from around 1550 using the tilde in that fashion 
[facsimiles of such books of course...]).

Patrick
- o -O - o -
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax



[Fwd: Re: is n with tilde used in French language ?]

2004-07-04 Thread Patrick Andries

 Message original 
Sujet:  Re: is n with tilde used in French language ?
Date:   Sun, 4 Jul 2004 21:31:28 +0100
De: Michael Everson [EMAIL PROTECTED]
Pour:   [EMAIL PROTECTED] [EMAIL PROTECTED]
Références: [EMAIL PROTECTED]

At 21:50 +0300 2004-07-04, Cristian Secară wrote:
According to Michael Everson's site, The Alphabets of Europe page,
the French .pdf, character ñ and Ñ (Latin small / capital letter N
with tilde) is used by the French alphabet.

The reason it is in that list is because there 
are some loanwords in French which retain the 
letter. Cañon is one of these.

http://www.evertype.com/alphabets/french.pdf
Several remarks :
ü seems not to be listed (see « würmien », « le würm », « argüer », now acceptable 
according to a recent spelling reform).
The population of France is now 61.7 million (including around 1.7 million French 
citizens in French overseas territories), but French is also the native tongue of 
populations in Belgium, Luxembourg and Switzerland (all in Europe).
Haarmann's 1993 figure of 58.1 million was for Metropolitan France + overseas 
territories (1990 census).
[1] http://www.insee.fr/fr/ffc/pop_age4.htm





Re: Mandombe

2004-07-02 Thread Patrick Andries
Anto'nio Martins-Tuva'lkin a écrit :
Anyway, no clear indication on which language or languages is supposed
to be served by this script -- though it seems to be aimed for Bantu
languages, perhaps kiKongo (where ombe means black).
It apparently means (in kiKongo) the Black people's own or For the Black 
People (ma Ndombe = Celle des noirs / Propre au peuple noir/ Pour les 
noirs).
It is basically a script promoted by a Church (a rather important one), a 
bit like Deseret: the Église kimbanguiste (officially Église de 
Jésus-Christ sur Son Envoyé spécial Simon Kimbangu -- EJCSK), with around 6 
million members, mainly in the DR Congo (RDC).

Samples :
http://perso.wanadoo.fr/kimbangu.net/public1.htm
Two books I know of :
* WABELADIO PAYI D., 1996, Mandombe, Ecriture Négro-africaine : manuel 
d'apprentissage à l'usage des apprenants, Edition du CENA, RDC, 65 
pages.

* LOUTHES A., To tanga Mandombe - Manuel de lecture aux apprenants de 
l'écriture négro-africaine, Edition du CENA, 60 pages

And two dissertations :
* MALUEKI MATUASILUA S.H., 2000, L'impact de l'Ecriture Négro-africaine 
« Mandombe » dans le développement - Cas de quelques exemples à 
Kinshasa, Mémoire de fin de cycle de Technicien en Développement Rural, 
Institut Supérieur de Développement Rural, Luozi, Bas-Congo, RDC.

* LUSIKILA Kueno Buayi J.P., 1998, L'Ecriture Mandombe. Essai de 
signification theologico-sapientiale et culturelle, Mémoire de fin 
d'études de Licence en théologie, Université Simon Kimbangu, Kinshasa, RDC

P. A.



Re: Mandombe

2004-07-02 Thread Patrick Andries
Patrick Andries a écrit :
Anto'nio Martins-Tuva'lkin a écrit :
Anyway, no clear indication on which language or languages is supposed
to be served by this script -- though it seems to be aimed for Bantu
languages, perhaps kiKongo (where ombe means black).
It apparently means (in kiKongo) the Black people's own or For the 
Black People (ma Ndombe = Celle des noirs / Propre au peuple noir/ 
Pour les noirs).
It is basically a script promoted by a Church (rather important one), 
a bit like Deseret. The Église kimbanguiste (officially Église de 
Jésus-Christ sur Son Envoyé spécial Simon Kimbangu -- EJCSK) around 6 
million members. mainly in RDC Congo. 
Finger slipped : « Église de Jésus-Christ sur Terre par Son Envoyé 
spécial Simon Kimbangu ».

http://www.quid.fr/2000/Q014770.htm
P. A.



Re: Mandombe

2004-07-02 Thread Patrick Andries
Michael Everson a écrit :
At 07:00 -0400 2004-07-02, Patrick Andries wrote:
It is basically a script promoted by a Church (rather important one), 
a bit like Deseret.

It is a pretty dreadful writing system. I find it hard to believe that 
anyone could actually read it or that anyone actually learns it. I did 
photocopy Payi's book (46 pp) but the script's structure is not really 
explained well enough to do a ConScript registry for it!
I have contacted the Church to see if I could get more details.
P. A.



[totally OT] Mohawk, Re: Looking for transcription or transliteration standards latin- arabic

2004-07-02 Thread Patrick Andries
Mike Ayers a écrit :
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Behalf Of Chris Harvey
 Sent: Friday, July 02, 2004 11:17 AM
 Perhaps one could think of Ha Tinh as the English word for
 the city, like Rome (English) for Roma (Italian), or
 Tokyo (English) for Tōkyō (English transliteration of
Tōkyō is not an English transliteration of Japanese, as it 
uses diacritics not found in English.  The correct English 
transliteration is in fact Tokyo, which does not round trip.

 Japanese), or Kahnawake (English/French) for Kahnawà:ke
Errr - didn't the English/French usage predate the Mohawk 
alphabet?  Pretty perverse case there.

Yes, the Mohawk alphabet certainly postdates the French transcriptions.
Just a few pieces of information about Mohawk (Agnier in its traditional 
French form) names around Montreal (Kanesatake on the North Shore, 
Kahnawake on the South Shore):

   1) I heard one of the Mohawk leaders speak on the radio the other day 
and he pronounced the K of Kanesatake as Kansatgu to my French ear, 
which seems to be validated by the old French spelling Canessedage 
(first attested in 1695). The name was apparently first used when the 
Agniers found refuge at the foot of Mont Royal on Montréal Island, then 
already long occupied by the French, before the Sulpicians 
moved them to another area outside Montreal. The French adopted Oka (an 
Algonquian name, if I recall properly) to designate the same place the 
Mohawk named Kanesatake.

   2) As far as Kahnawake is concerned, the settlement again occurred 
when the French had already settled the area (long story, but the small 
group of Mohawk who had converted to Catholicism and found refuge around 
Montreal went through several settlements before settling in Kahnawake). 
At the same time, the priests and French settlers who accompanied the 
Mohawk called the place (now Kahnawake) Saint-François-Xavier-du-Sault 
or simply Le Sault. In Mohawk (agnier) the present-day Kahnawake was 
successively called Kahnawake (« au rapide », "by the rapids") in 
1676, Kahnawakon (« dans le rapide », "in the rapids") in 1690, 
Kanatakwenke (« d'où on est parti », "whence we left") in 1696 and 
Caughnawaga in 1716, with many other spellings thereafter until 1980, when 
Kahnawake was chosen as the official spelling.

P. A.



Re: Looking for transcription or transliteration standards latin- arabic

2004-07-02 Thread Patrick Andries
Jony Rosenne a écrit :

  

-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of John H. Jenkins



Peking for Běijīng.  :-)



Or Constantinople for Istanbul.  :-)

Two very different political realities (before and after 1453). Cities
change names without going through transliterations, cf. Berlin
(Ontario) becoming Kitchener in 1916.

In any case, it is Istamboul and Pékin.

P. A.





[OT] Re: Still some educational work to do

2004-06-30 Thread Patrick Andries
Ted Hopp a écrit :
I was listening to that program, too. When I heard the explanation of
Unicode, I turned off the radio. :(
 

[PA] This kind of experience always makes me wonder how much « 
misinformation » I'm listening to or viewing on subjects about which I 
know less...

P. A.



Re: Thrilling varia from the Library of Congress

2004-06-30 Thread Patrick Andries
Michael Everson a écrit :
Found a book on the Tulu script.
Found some of Doke's 1925 phonetic characters cited in a 1975 source. 
If a few citations of author-specific characters are 
sufficient for encoding, I have a few more characters to propose.

Note: I don't know which I really prefer (encoding this kind of rare 
character or not).



Re: Still some educational work to do

2004-06-30 Thread Patrick Andries
Michael Everson a écrit :
At 11:03 -0500 2004-06-30, Donald Z. Osborn wrote:
The flip side of this issue, which came up in the letter from the 
person who was just in Ouaga, is a question: what sort of African and 
other non-Western
representation is there on the Unicode consortium?

People like me take an interest; and the Agence intergouvernementale de 
la francophonie has joined the Consortium recently.
Canada and France (and Morocco) at the ISO level also take an interest 
and we have been in contact with the different centers mentioned by Don, 
sometimes for several years. We have also successfully proposed Tifinagh, 
a major script used in a large part of Africa (Morocco, Algeria, 
Tunisia, Libya, an oasis in Egypt, Mali, Niger and part of Burkina 
Faso...).

Patrick Andries
- o - O - o
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax



Re: Still some educational work to do

2004-06-30 Thread Patrick Andries
Donald Z. Osborn a écrit :
And a lot more yet... In some parts of the world that could benefit most from
actively working Unicode, such as much of Africa, there is still relatively
little knowledge of it. Even among techies.
In fact, there is still an undercurrent of dissatisfaction among some who know
something about Unicode with aspects of how it provides for some African
character needs. I was reminded of this by a letter I received not long ago
from someone who attended a recent colloquium on ICT in Ouagadougou.
Within the last year some of us began discussing possible conferences,
workshops, training modules, or a road show on Unicode in Africa and perhaps
other regions.
Yes, we did and this in a language understood in the given country.  I'm 
not sure a series of workshops in English in French-speaking Africa is 
for instance a good thing.

A series of workshops (in French, Arabic or Berber) is planned for 
Morocco later this year on this subject (Unicode, multilingual documents 
and  font technology).

P. A.



Tifinagh (Projet de norme marocaine 17.1.100) (was Re: lines 05-08, version 4.7 of Roadmap to BMP and 'Hebrew extensions')

2004-06-28 Thread Patrick Andries
Marco Cimarosti a écrit :
Rick McGowan wrote:
 

I mistakenly thought Tifinagh was rtl.
 

That's OK. It has been, and sometimes still is, written right 
to left, hence it was roadmapped in a right-to-left
allocation block. However, in modern usage, and in the
Moroccan national standard now being drafted, it  
is specifically left to right.
   

Is the draft of this Moroccan standard on-line somewhere?
TIA.
_ Marco
 

Ici : http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2739-1.pdf
P. A.
- o - 0 - o -
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax

 




Re: Tifinagh and Roadmap

2004-06-28 Thread Patrick Andries
Marco Cimarosti a écrit :
Is the draft of this Moroccan standard on-line somewhere?
TIA.
_ Marco
Speaking of Tifinagh, I notice the block allocated to it has been 
modified but not the document referenced in it.

See http://www.unicode.org/roadmaps/bmp/, row 2D.
I believe it should point to  
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2739 (as previously requested in 
Markham).
If need be I could produce a revised version updating N2739 with the 
recommendations of the Tifinagh ad-hoc (code points and names modified) 
for easy reference.

I have already done so for the French version of the proposal :  
http://cooptel.qc.ca/~pandries/propo_tifinagh.pdf

P. A.



Re: Bantu click letters

2004-06-10 Thread Patrick Andries
Michael Everson a écrit :
At 10:00 -0400 2004-06-10, John Cowan wrote:
And today, if I were reprinting it, I'd commission a digital font
(your effort, my expense) and put the characters in the PUA.

Not if you wanted, as an Africanist, to be able to represent the text 
as it was originally written.
Could you please explain this, how would using PUA characters prevent 
the text to be represented as it was originally written ?

P. A.



Re: Bantu click letters

2004-06-10 Thread Patrick Andries
Patrick Andries a écrit :
Michael Everson a écrit :
Practice your tongue-twisting.
Proposal to add Bantu phonetic click characters to the UCS
http://www.evertype.com/standards/iso10646/pdf/n2790-clicks.pdf
:-P

Are these letters used in any other book than Doke's book on Kalahari 
Bushmen ?

P. A.
[PA] I don't think I got a direct answer on whether these non-Bantu click 
symbols are used in any other book.
If these symbols are indeed used in a single book by a single 
author, I would put them in the PUA; I don't see any interchange 
requirement to do otherwise. If letters unique to one author may now be 
encoded in Unicode, I have many to propose to the enabling technology 
that Unicode is, and people will be free to use them or not.

P.A.





Re: Bantu click letters

2004-06-09 Thread Patrick Andries
Michael Everson a écrit :
Practice your tongue-twisting.
Proposal to add Bantu phonetic click characters to the UCS
http://www.evertype.com/standards/iso10646/pdf/n2790-clicks.pdf
:-P
Are these letters used in any other book than Doke's book on Kalahari 
Bushmen ?

P. A.




Re: Phoenician, Fraktur etc

2004-05-26 Thread Patrick Andries
Peter Kirk a écrit :

If Fraktur and ordinary Latin are the same script, then it couldn't be 
said that the Germans abandoned the Fraktur script after WWII. Yet, 
that is what available references say did happen. 

Fraktur was actually abandoned during the Nazi era.  In an ordinance 
dated 3/I/1941, the NSDAP Reichsleiter, Martin Bormann, on orders from 
Adolf Hitler, describes the « so-called Gothic script » as the « 
Schwabacher Jewish letters »; Antiqua (Latin) letters were to be used 
from then on and the script was to be called the « normal script ». At 
the party congress in Nuremberg in 1934, Hitler had already criticized the 
« Gothic script ».

http://www.deutsche-schutzgebiete.de/fraktur.htm (transcript of the said 
ordinance).

P. A.



Re: Multiple Writing Directions in One Script

2004-05-25 Thread Patrick Andries
Dean Snyder a écrit :
Archaic Greek could be written right-to-left, left-to-right, or boustrophedon.
I'm asking for technical advice as to how such variability in writing
direction streams in the same script can be, and should be, handled in
Unicode, and how it should be dealt with in a Unicode proposal.
 

I believe this is similar to what exists in Old Italic. Please refer to the 
Old Italic proposal.

P. A.



Re: Multiple Writing Directions in One Script

2004-05-25 Thread Patrick Andries
Michael Everson a écrit :
At 14:02 -0700 2004-05-25, Patrick Andries wrote:
I believe is similar to what exists in Old Italic. Please refer to 
the Old Italic proposal.

Old Italic is no longer a proposal. It has been encoded.
I know, Michael. But there is still a document called the Old Italic 
Proposal (or whatever it was first called: Etruscan, Oscan, ...). No 
need to be picky; being helpful would be better.

Since Dean was looking for a way to address multiple writing 
directions in a proposal, I was suggesting that he read the Old Italic 
Proposal, which led to the encoding of Old Italic. He should find 
language there that suits him, since Old Italic shares similar 
properties. Do you have a pointer to this proposal (yours, I believe) ? 
This would have been helpful, but I see Ken has answered the question 
quite well.

P. A.



Re: Response to Everson Phoenician and why June 7?

2004-05-24 Thread Patrick Andries
saqqara a écrit :
I showed my 5 year old some Fraktur (lower case only) for the first time
today. He is only just getting to grips with reading simple English words.
And the verdict .. 'funny and silly' but he could still read the words
back to me. Anecdotal perhaps but Dean, do you want me test the other 29 of
his class at school before we can be rid of this fallacious Fraktur analogy?
 

Try with Sütterlin also unified within Latin ;-)
http://www.cooptel.qc.ca/~pandries/suetterlin.jpg
(Sorry)
P. A.



Re: Response to Everson Phoenician and why June 7?

2004-05-24 Thread Patrick Andries
Doug Ewell a écrit :
And when shown the Sütterlin, he couldn't read it but
certainly recognized it as handwriting.
So would he when presented with Cyrillic handwriting ?
P. A.



Inscription in Punic and Neopunic

2004-05-24 Thread Patrick Andries
Apparently the following book
Kanaanäische und aramäische Inschriften, by H. Donner and W. Röllig, 
Wiesbaden, 1962-64 (3rd edition 1971-1976)

on page 161 (if I read properly the reference) contains a sample of an 
inscription that would be partly written in Punic and partly in Neo-Punic.

I have been travelling for a week now and I'm estranged from all decent 
libraries.

The inscription was found in Cherchel (Algeria) and is apparently 
dedicated to Micipsa.

Would anyone have access to the aforementioned book ? Could that person 
be so kind as to see whether such an inscription is indeed illustrated ?

Many thanks,
P. A.





Re: Inscription in Punic and Neopunic

2004-05-24 Thread Patrick Andries
James Kass a écrit :
Patrick Andries wrote,
 

The inscription was found in Cherchel (Algeria) and is apparently 
dedicated to Micipsa.

Would anyone have access to the aforementioned book ? Could that person 
be so kind as to see whether such an inscription is indeed illustrated ?
   

Is this it?
 

First of all thank you.
I believe this is not the inscription, it could be Cherchel N2.
*Cherchel N 2*
Berger 1889, pp. 35-46;
Lidzbarski 1898, p. 439, 3 Dd2, Taf. xvi, 4;
Van den Branden 1974, pp. 143-145;
Garbini 1974b. p. 33;
Roschinsky 1979, pp. 111-116;
NSI 57;
KAI 161. 
KAI = Kanaanäische und aramäische Inschriften and 161 is also the 
reference I have.

Unfortunately no illustration is provided.
P. A.



Re: Proposal to encode dominoes and other game symbols

2004-05-24 Thread Patrick Andries
John Hudson a écrit :
Michael Everson wrote:
Here. Chew on this. :-)
N2760
Proposal to encode dominoes and other game symbols
Michael Everson
2004-05-18
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2760.pdf

This could get out of hand very quickly. Chinese and Japanese (shogi) 
chess pieces? 

To complete U+2616 and U+2617 ?
P. A.



Re: Phoenician and software development

2004-05-21 Thread Patrick Andries
saqqara a écrit :
 
Unification of the Phoenician script with Hebrew would certainly 
eliminate some short term problems - the Hebrew script is fairly well 
supported nowadays among applications and we'd eliminate the Plane 1 
issue. Terribly confusing to users however - the majority do not read 
Hebrew and we'd be back to hacks to prevent modern Hebrew fonts 
sneaking in. Unicode is not meant to be purely about fixing short term 
problems, rather a platform for moving forward.
If many Israelis may not be able to read Phoenician or Neo-Punic, it is 
not obvious to me that Phoenician or Punic scholars -- presumably the 
intended users of  Phoenician/Canaanite -- do not read Square Hebrew. I 
have some testimony to the opposite : Lionel Galand (Tifinagh expert) 
saying he has often seen Punic inscriptions represented using Square 
Hebrew characters,  James Février (Punic expert) illustrating the 
Phoenician character names with Square Hebrew glyphs (and not Phoenician 
glyphs used in the previous pages), Dictionnaire de la civilisation 
phénicienne et punique 
http://www.amazon.fr/exec/obidos/ASIN/2503500331/171-9944786-8511424 
unifying Aramaic, Square Hebrew and Phoenician in its initial 
transliteration table and illustrating the 22 letters with Square Hebrew 
glyphs, etc.

This may have been due to technical reasons (easy availability of Square 
Hebrew fonts), but it looks like Punic scholars are able to read Square 
Hebrew fonts.

P. A.



Re: [OT] What is Langues'O

2004-05-21 Thread Patrick Andries
From: John Cowan [EMAIL PROTECTED]
Philippe Verdy scripsit:
   

Please go to Langues'O for this commentary. As I wrote, you will be
probably answered with the historical context.
   

C'est quoi Langues'O ? Où est-ce ?
Please check http://www.inalco.fr/
As the splash page shows it is « Langues O' ».
Merci
P. A.



Re: [OT] What is Langues'O

2004-05-21 Thread Patrick Andries
Philippe Verdy a écrit :
Please check http://www.inalco.fr/
As the splash page shows it is « Langues O' ».
   

Yes but only on the splash screen. Elsewhere on the site (the top banner, and
menu, and the logos in PDFs of its brochures, letters and publications) it uses
Langues'O which means Langues Orientales 

I know.
(so the quote should be after
rather than before,
So, a typo from the Webmaster and the splash screen is indeed correct.
and this site is not clear about its own logo)... The name
« Publications Langues O' » refers to the publisher name and is distinct from
the community name or the newsletter title.
 

Is it really important ?
P. A.



Re: ISO 15924 French name Gotique: a typo...???

2004-05-21 Thread Patrick Andries
Philippe Verdy a écrit :
To find proof that gotique is incorrect in French, I looked for some official
French resources, notably the list of language names published and used by the
BPI:
http://www.culture.gouv.fr/culture/dglf/bpi/list-langues.html
clicking in the allemand language name gives this:
http://www.culture.gouv.fr/culture/dglf/bpi/allemand.html
[quote]
Depuis 1941, l'allemand a abandonné l'écriture gothique.
[/quote]
However I wonder if this is related to the Sütterlin script.
 

[PA] Gothique = Fraktur (in fact a type of Fraktur, or écriture brisée), 
gotique = script of the Goths.


So maybe a better name would be ancien gothique.
[PA] No, I don't think so. If I remember correctly, in the Robert 
dictionary (a very standard work), gotique is the language of the ancient 
Goths. This is not a typo. I'm travelling and don't have my dictionaries 
with me, so I can't copy the definition, but here is a reference I have :
« /Le *gotique* (...) est antérieur de plusieurs siècles aux autres 
dialectes germaniques /(SAUSSURE, /Linguistique générale., /1916, p. 297). »

In the Académie française dictionary since 1718: « parfois /gotique, /en 
parlant du peuple ou de la langue ».

Gotique was chosen over gothique because gothique is the usual term for 
Fraktur in daily speech and ancien gothique makes you think of an old 
Fraktur style.

No correction is needed (the name is used in 10646 in any case); however, 
éthiopique is completely unknown in French except as the French name of a 
famous classical Greek book (Les Éthiopiques d'Héliodore).

Could we have these discussions somewhere else (in French ?). Merci.
P. A.



Re: ISO 15924 draft fixes

2004-05-20 Thread Patrick Andries
Antoine Leca a écrit :
The French name for Hang looks strange. It happened to be hangûl (hangul,
hangeul) (after quite a bit of discussion.)
 

The name in ISO/CEI 10646 (F) is « hangûl », taken from a Korean dictionary 
and a Korean grammar published by the Inalco (Langues O'). Another 
form suggested in some sources, to approximate the pronunciation, is 
« hangueul ».

P. A.



Re: Response to Everson Phoenician and why June 7?

2004-05-20 Thread Patrick Andries
James Kass a écrit :
Ernest Cline wrote,
 

In order for Phoenician to be disunified from Hebrew, it must
first have been unified with Hebrew.  This is not the case.
 

Well then, nonunification if you wish to be picky about it.
   

Sorry if I offended.  Many on this list have referred to the current
proposal as a disunification and seem to be arguing that accepting
this proposal would change and disrupt current Unicoding practices.
In this case, I think it's important to be picky because there are
no current Unicoding practices for Phoenician. 

You may mean that the Unicode book does not document how Phoenician (or 
Paleo-Hebrew) may be encoded. This is not to say that no one is using 
Unicode to encode Paleo-Hebrew texts.

P. A.



Re: Compatibility equivalents, was: Qamats Qatan

2004-05-16 Thread Patrick Andries
Peter Kirk a écrit :
Well, at least façade and facade collate together at the top 
level, with the default collation weights, and so one will match the 
other in simple searches.

[PA] I was simply trying to say -- not that I always express myself well 
-- that adding some characters may force additional processing (here in 
the collation, elsewhere if  a cedilla exists as a combining character 
in normalisations and rendering). Adding characters is not as innocent a 
process as some seem to say : «We just add characters and that's it, you 
are not forced to do anything about it». If it is true that one is not 
forced to use them as a writer in the script, when one does not control 
the writers or sources and one has to process several sources (collate, 
render, search them), one is then forced to implement certain additional 
processes (for excellent reasons if the characters are indeed 
necessary). This is why I believe one must carefully review the pros and 
cons before adding new characters, they may well be unified with 
existing ones, for example.
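
[Illustration] A minimal sketch of the kind of extra processing described above, using only Python's standard unicodedata module. The helper name primary_key is hypothetical, and stripping combining marks is only a rough stand-in for a top-level collation comparison; it is not the Unicode Collation Algorithm.

```python
import unicodedata

def primary_key(text: str) -> str:
    """Rough stand-in for a top-level collation key:
    decompose, drop combining marks, then casefold."""
    decomposed = unicodedata.normalize("NFD", text)
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base_only.casefold()

precomposed = "fa\u00e7ade"   # c with cedilla as a single code point
combining = "fac\u0327ade"    # c followed by COMBINING CEDILLA

# The two spellings differ as raw code point sequences ...
print(precomposed == combining)
# ... so a process that receives both must normalise before comparing,
print(unicodedata.normalize("NFC", combining) == precomposed)
# and must ignore the accent if it wants a match with the unaccented word.
print(primary_key(precomposed) == primary_key("facade"))
```

Every tool that searches or sorts mixed sources has to carry a step of this kind once the cedilla exists in the repertoire, whether or not its own authors ever type one.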


Again, if the separate Punic script were to be compatibility 
equivalent to Phoenician or Hebrew I would not have strong objections; 
but otherwise I am sure that there would be strong objections on the 
grounds that yet further splitting of what is logically the same 
script used for closely related languages leads to even more confusion.

[PA] I would have liked Michael to say that splitting may lead to 
confusion with little gain, since he suggested this unification.

Note that I believe unification of Neo-Punic with Phoenician is the 
prudent course to take (for the reasons I explained: introducing new 
characters has a cost and does force people to do something about them). 
Otherwise, if Unicode has space, if tailoring collations is The Proper 
Thing To Do, and if « Unicode doesn't force people to do anything. Unicode 
makes characters available for those who wish to use them. », why not 
encode Neo-Punic ? After all, one could make a case for it: Neo-Punic 
is a remote descendant of Canaanite (genealogically as much so as the 
Aramaic-Square Hebrew branch; it also retains the 22 primitive Canaanite 
characters), it is pretty different as far as glyphs are concerned (some 
simple strokes may represent a b, a d or an r; a Saint Andrew's 
cross may represent m or alef), it has three subcategories (Carthaginian, 
Tripolitanian and Maghrebine), some inscriptions (cf. Cherchel) are 
mixed Neo-Punic and Punic (how would one represent them in plain text?), 
it uses matres lectionis (reusing gutturals that had nearly completely 
disappeared from the spoken language), etc.

P. A.



Re: Qamats Qatan (was Majority of community important, inclusion not forcing people to do anything)

2004-05-15 Thread Patrick Andries
Jony Rosenne a écrit :
 

-Original Message-
From: [EMAIL PROTECTED] 
[mailto:[EMAIL PROTECTED] On Behalf Of Patrick Andries
Sent: Friday, May 14, 2004 11:16 PM
To: Michael Everson
Cc: [EMAIL PROTECTED]
Subject: Majority of community important, inclusion not 
forcing people to do anything (Re: [BULK] - Re: Interleaved 
collation of related scripts)

   

...
 

Unicode doesn't force people to do anything. (Well, apart from using
smart font technology for a lot of scripts, but that's not relevant 
here.) Unicode makes characters available for those who wish to use them.
[PA] Surely Unicode does not make all characters available : it rejects 
some and unifies some. Why reject or unify if their inclusion would not 
pose a problem ? I somehow have the impression that the sheer presence 
of characters (duplicates for instance) does have an effect on users and 
forces certain processing (normalisation sometimes, decomposition in 
some cases, changing transcoding filters in other cases (what are the 
Coptic users having Coptic texts encoded as Greek data going to do?), 
changing/adding Cmap for some fonts (Coptic ones previously indexed 
with Greek code points ?)), etc. to achieve the desired effect.

P. A.
   

Having Qamats Qatan as a regular Unicode character will have an effect on
the majority of users who do not know or care for the distinction.
If anything, it should be some kind of glyph variant.
 

[PA2] I suspect you are going to get an answer to the effect that you are 
no more forced to use Qamats Qatan in Hebrew than you are to use the 
cedilla in English for « façade ». But, while this is true, if you 
compare a Unicode script that used not to include ç or a combining 
cedilla with the new one that now includes it, this has an effect on 
algorithms (searching, transcoding, normalisation, even fonts for 
instance) and in this sense Unicode forces people to do something about it 
(not that it is bad to have this ripple effect).

If adding new scripts does not force one to use them, if « Unicode doesn't 
force people to do anything » and if space is not an issue, why not include 
new Punic and Neo-Punic scripts alongside the proposed Phoenician ? After 
all, I may want to show the diachronic evolution of Phoenician (Semitic) 
words (from 1200 BC to 200 AD for instance) in plain text (XML). Why 
unify Phoenician with Punic and Neo-Punic ? No one will be forced to 
use Punic and Neo-Punic after all. Surely there must be a reason why you 
proposed a unification (and it may make perfect sense). Is it only for 
genealogical reasons, or because the unconsulted community of Punic 
users (which is probably in any case too conservative in the eyes of some) 
did request unification ?

P. A.



Re: interleaved ordering (was RE: Phoenician)

2004-05-14 Thread Patrick Andries
[EMAIL PROTECTED] a écrit :
Dean A. Snyder wrote,
 

The issue is not what we CAN do; the issue is what will we be FORCED to
do that already happens right now by default in operating systems,
Google, databases, etc. without any end user fiddling?
   

That's the question.  

Since search engines like Google survive based on their ability to serve
users' wants and find what users seek, why wouldn't Google make such
a tailoring?  
 

Because the Phoenician user community is very very small ? Same goes for 
Microsoft on some collations already mentioned (French Canadian sorting, 
Khmer) and those are much larger communities.

P. A.



Re: [BULK] - Re: Interleaved collation of related scripts

2004-05-14 Thread Patrick Andries
[EMAIL PROTECTED] a écrit :
Peter Kirk scripsit:
 

Well, I accepted somewhat reluctantly that Phoenician should be 
separately encoded because a small number of users want it to be, 
although a majority apparently do not want it to be.
   

Neither you nor anyone else knows what the majority wants, because most
interested parties have never even heard of this debate.  It's natural
to suppose that The Majority R Us, but there's no evidence for it.
In any case, it's the majority in the UTC (and ultimately the Consortium)
that matters, and the UTC works mostly by consensus anyway.
 

There is such a thing as ISO JTC1/SC2/WG2.
P. A.



Majority of community important, inclusion not forcing people to do anything (Re: [BULK] - Re: Interleaved collation of related scripts)

2004-05-14 Thread Patrick Andries
Michael Everson a écrit :
At 12:08 -0700 2004-05-14, Peter Kirk wrote:
Well, I accepted somewhat reluctantly that Phoenician should be 
separately encoded because a small number of users want it to be, 
although a majority apparently do not want it to be.

I really don't know if those who spoke for the majority were really 
representative of a real majority.
[PA] Is representing the majority of a community of users important ? If 
so, how do we know what this majority thinks ? Or, as was mentioned, 
are these users sometimes too conservative, so that they don't really know 
what is good for them in terms of script analysis and their 
preferences should be ignored ?


This would not be an acceptable position if Unicode intended to force 
all users of Phoenician to move immediately to the new script - 
although it would actually make much more sense to do so.

Unicode doesn't force people to do anything. (Well, apart from using 
smart font technology for a lot of scripts, but that's not relevant 
here.) Unicode makes characters available for those who wish to use them.
[PA] Surely Unicode does not make all characters available : it rejects 
some and unifies some. Why reject or unify if their inclusion would not 
pose a problem ? I somehow have the impression that the sheer presence 
of characters (duplicates for instance) does have an effect on users and 
forces certain processing (normalisation sometimes, decomposition in 
some cases, changing  transcoding filters in other cases (what are the 
Coptic users having Coptic texts encoded as Greek data going to do?), 
changing/adding  Cmap for some fonts (Coptic ones previously indexed 
with Greek code points ?)), etc. to achieve the desired effect.

P. A.




Re: interleaved ordering (was RE: Phoenician)

2004-05-14 Thread Patrick Andries
Kenneth Whistler a écrit :
[on slow implementation of some collations by certain manufacturers and 
service providers]

And the answer is to democratize the approach.
I agree on the ideal solution. It has independently been mentioned to 
a large manufacturer's technical representative, who also seems to 
agree on this, but he is not the decision maker.

One shouldn't be
demanding that The Borg centrally define and implement all uses
for all users, so that users simply dial Channel 621 and then
sit there passively assimilating and get dished up their content. 
Instead, the users should demand of The Borg that user-definable
requirements be supported actively, so that the *people* get
to define what they do and how it is done at the point they
interact with the software.
 

It has actively been requested (for Canada for a few years, and even 
prospectively for Tifinagh). It is a slow-moving boat and I'm not sure 
all manufacturers and service providers can be convinced, some of them 
holding a virtual monopoly in the OS market or the search engine one. 
Though I must admit I don't quite see what they would relinquish or lose 
by allowing users to tailor collations.

P. A.



Re: Coptic/Greek (Re: Phoenician)

2004-05-13 Thread Patrick Andries
[EMAIL PROTECTED] a écrit :

Peter Kirk scripsit:

 

I support Coptic disunification on the grounds that it was requested by 
the user community. Initially I opposed Phoenician disunification 
because there was no evidence of demand for it from users. As such 
evidence has now been produced, I now support Phoenician disunification, 
according to Michael Everson's proposal. Please note carefully this last 
sentence.
   

Okay, I have no qualms with that. Note that the same rule applies in 
both of these cases: « because requested by the user community ».

I appreciate your making your reasons explicit.

P. A.

(Incidentally, how much input does one need before one can say what the 
user community wishes ?)





Re: Coptic/Greek (Re: Phoenician)

2004-05-12 Thread Patrick Andries
D. Starner a écrit :

Doug Ewell [EMAIL PROTECTED] writes:

 

Peter Kirk peterkirk at qaya dot org wrote:

   

Because each such case has to be judged on its individual merits,
according to proper justification and user requirements. There can be
no hard rules like always split or always join.
 

Nobody, neither Michael nor anyone else, ever advocates such a rule.
   

But that's what Patrick implied when he asked how you support the Hebrew/Phoenician
unification and the Coptic/Greek unification, that such a rule exists.
 

Well, yes. But, more specifically, why was unification ill-advised for 
Peter Kirk in the case of Coptic when it would not be in the case of 
Phoenician ? Unless, of course, one just follows the trend and says 
Coptic unification was ill-advised because it has been disunified. 
Somehow, I feel I should not have asked, since the argument often seems 
to be, in the case of neighbouring historical scripts, genealogy and 
user community feeling (as interpreted by the proposers).

P. A.




Coptic/Greek (Re: Phoenician)

2004-05-11 Thread Patrick Andries

Peter Kirk a écrit :

 

And these two cases are hardly a good advertisement for the expert's
reputation. The Coptic/Greek unification proved to be ill-advised and
is being undone. 

I'm rather surprised by this comment. If the Coptic/Greek unification 
proved to be ill-advised how could you defend what I see, if I recall 
properly, as your (original ?) position : Phoenician/Hebrew unification ?

P. A.




Script vs Writing System

2004-05-10 Thread Patrick Andries
At 12:12 -0700 2004-05-10, Mike Ayers wrote:

But all this leads me to finally ask:  what does script mean?  It 
seems clear to me that although the term has been used throughout the 
Phoenician debate, not everyone is using it the same way.  I know 
that there is a definition of script that is used for encoding 
purposes, but can I find it written anywhere, or is it more of an 
ephemeral thing?

[PA] The glossary has « A collection of symbols used to represent 
textual information in one or more writing systems. »

Chapter 6 also defines Writing Systems summarized by Table 6-1 Typology 
of Scripts (Writing Systems then Scripts) :

A writing system is then defined as « A set of rules for using one or 
more scripts to write a particular language. Examples include the 
American English writing System, the British English writing system, the 
French writing system, and the Japanese writing system. »

«
Writing System Type    Unicode Script(s)
---------------------  ------------------------------------------------
Alphabets              Latin, Greek, Cyrillic, Armenian, Thaana,
                       Georgian, Ogham, Runic, Mongolian, Old Italic,
                       Gothic, Ugaritic, Deseret, Shavian, Osmanya

Abjads                 Hebrew, Arabic, Syriac

Abugidas               Devanagari, Bengali, Gurmukhi, Gujarati, Oriya,
                       Tamil, Telugu, Kannada, Malayalam, Sinhala, Thai,
                       Lao, Tibetan, Myanmar, Tagalog, Hanunóo, Buhid,
                       Tagbanwa, Khmer, Limbu, Tai Le

Logosyllabaries        Han

Simple Syllabaries     Cherokee, Hiragana, Katakana, Bopomofo, Yi,
                       Linear B, Cypriot

Featural Syllabaries   Ethiopic, Canadian Aboriginal Syllabics, Hangul
»
Note : « Table 6-1 lists all of the scripts currently encoded in the 
Unicode Standard, showing the writing system type for each. The list is 
an approximate guide, rather than a definitive classification, because 
of the mix of features seen in many scripts. The writing systems for some 
languages may be quite complex, mixing more than one writing system 
together in a composite system. Japanese is the best example; it mixes a 
logosyllabary (Han), two syllabaries (Hiragana and Katakana), and one 
alphabet (Latin, for romaji). »



Re: Phoenician

2004-05-08 Thread Patrick Andries
Peter Jacobi a écrit :

Patrick Andries [EMAIL PROTECTED] wrote:
[on tailored collations]
 

[PA] I suppose this would be true in principle, but how long before this 
is implemented in the **actual tools** used by users, such as MS Word or 
MS SQL Server ?
[...] (yes, 
I know with a bit of tailoring ($) other tools from other manufacturers 
could fit the bill).
   

When you can get it from other sources, why lament the 
non-availability from Microsoft?
 

[PA] Not only Microsoft. I'm not sure that your average XML editor 
(XMetal, for instance) will allow this for some time (searching 
indiscriminately for a string you don't know is in Paleo-Hebrew or Square 
Hebrew).

There is enough software which offloads collation to IBM ICU,
where adding tailorings is very easy.
 

[PA] In principle yes, but this is still tailoring.

 


A modest contribution to the Firebird Foundation or any 
decent programmer working on this OSS SQL database, will give
you any collation for Firebird and Interbase. 

 

[PA] Maybe, but I fear this is not really practical in many cases : 
users may already have made a technological choice, and that choice often 
will not allow them to tailor the collation. Meanwhile there is already a 
solution that lets them do their work (unify, tag, use a 
stylesheet), the proposed block is commercially marginal, and there 
is little hope that tools will accommodate the new block for some time.

The argument 'we can't go this way, because Microsoft doesn't 
support it' is rather the wrong way around. 

[PA] Well, it reflects real problems and raises the question : what do you 
gain with disunification and the introduction of an additional block, 
given that this introduction has a practical impact ?

And it's even not
engraved in stone, that Microsoft won't support it.
 

[PA] This is true, but it may take some (very long ?) time if the 
non-availability of Khmer or French Canadian sorting is anything to go by.

Again, I'm not opposed to Phoenician in principle (it is intellectually 
pleasing and cleaner), I just don't know what you gain with this 
encoding that you would not be able to do today (right now, with no 
additional cost) using what Dean Snyder proposed (XML tags and a 
stylesheet for rendering) especially for the large bases where 
Paleo-Hebrew is mixed with Square Hebrew. Not very clear to me (this may 
have been explained in other emails, I will read them, apologies if the 
pragmatic gain has been explained and I'm just appearing a bit dumb here).
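
[Illustration] A minimal sketch of the tagged-text approach mentioned above, assuming one underlying Hebrew encoding with the historical script recorded in markup. The element name w, the attribute script, its values and the class names are all hypothetical; only Python's standard xml.etree module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: both words use the same Hebrew code points; the
# historical script lives in a made-up "script" attribute, not in the text.
SAMPLE = (
    '<text>'
    '<w script="hebr">\u05DE\u05DC\u05DA</w>'
    '<w script="phnx">\u05DE\u05DC\u05DA</w>'
    '</text>'
)

# Hypothetical mapping from the attribute to a stylesheet class (font choice).
FONT_CLASS = {"hebr": "square-hebrew", "phnx": "phoenician"}

def find_words(root, needle):
    """Plain substring search; the script tag plays no part in matching."""
    return [w for w in root.iter("w") if needle in (w.text or "")]

root = ET.fromstring(SAMPLE)
hits = find_words(root, "\u05DE\u05DC\u05DA")
for w in hits:
    # The tag only matters at rendering time, when a font is chosen.
    print(w.get("script"), FONT_CLASS[w.get("script")])
```

With this arrangement an untailored search finds both occurrences, and a missing Phoenician font simply falls back to Square Hebrew glyphs.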

Kind regards,

P. A.






Re: New contribution

2004-05-07 Thread Patrick Andries
Doug Ewell a écrit :

It's clear to me that the reason my colleague and I can read this font
is not that we have any special knowledge of both scripts, but because
it's a stylistic variant of Latin.
 

And thus he cannot read a Vietnamese text in Sütterlin, as you said, 
because it is not a stylistic variant of Latin ?

P. A.





Re: Phoenician

2004-05-07 Thread Patrick Andries
Dean Snyder a écrit :

Of course. But that does not make tagged text a minefield - in the
absence of your nice Phoenician font Hebrew would show up instead -
precisely what is used by and large by Semiticists right now.
 

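[PA] I also got this feedback from Lionel Galand (of Tifinagh and Libyan 
fame) about Punic : « Je peux vous dire que j'ai souvent travaillé sur 
des répertoires de documents puniques qui étaient publiés en caractères 
hébraïques. » (« I can tell you that I have often worked on collections 
of Punic documents that were published in Hebrew characters. »)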

P. A.



Re: Phoenician

2004-05-07 Thread Patrick Andries
Peter Constable a écrit :

[PA] I also got this feedback from Lionel Galand (of Tifinagh and Libyan
fame) about Punic :  «Je peux vous dire que j'ai souvent travaillé sur
des répertoires de documents puniques qui étaient publiés en caractères
hébraïques. »
 

This could be multiplied a hundredfold.
   

The same could be said of Devanagari or Arabic text published in Roman transcription.  That does not mean that we do not encode Devanagari or Arabic, or that encoding those scripts prevents the same people from continuing to publish in Roman transcription.

 

[PA] True. I am just stating that it is a common practice. People will not 
be unsettled by a plain text unification.

Personally, I'm still not very convinced there is anything to be gained 
by having two ways of encoding large document bases such as the Dead Sea 
Scrolls. I would have encoded these texts as Dean Snyder suggested (my 
CSS/XSLT bias, I suppose) : one underlying encoding, different 
rendering. But I'm no specialist in Semitic (or otherwise Indo-European 
for that matter) studies.

Just an inkling, not a dogmatic conviction.

P. A.





Re: Phoenician

2004-05-07 Thread Patrick Andries
[EMAIL PROTECTED] a écrit :

Jony Rosenne scripsit:

 

A possible strong negative argument would be if having it would cause
problems for those who do not think they need it. For example, if it would
make searching more difficult. This argument has been raised, but I am not
convinced the possible difficulties are significant.
   

This could be solved by making Phoenician and Hebrew base characters equivalent
at the first level of collation.
 

[PA] I suppose this would be true in principle, but how long before this 
is implemented in the **actual tools** used by users, such as MS Word or 
MS SQL Server ?

I think we have already discussed this here regarding the official French 
Canadian sorting and Khmer sorting, which are still unavailable 
on Windows. How much money is there for Microsoft in Phoenician sorting ? 
I'm not aware that the collation tables can be tailored by users in those 
tools (yes, I know that with a bit of tailoring ($) other tools from other 
manufacturers could fit the bill).

In theory, I believe either way (a separate encoding or a unification 
within the Hebrew block(*)) could be feasible. In practice, the 
unification approach is available right now. I suppose it depends 
on one's outlook and preference. But is Unicode concerned with current 
limitations ? Okay, put another way : is Unicode concerned with current 
pragmatic usability ?
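
[Illustration] A minimal sketch of what « equivalent at the first level » could amount to for searching, assuming Phoenician letters sit at U+10900..U+10915 (their positions in the published standard, not something settled in this thread). The names FOLD and search_key are hypothetical, and this is a simple folding table, not a real collation tailoring.

```python
# Phoenician letters alf..tau, assumed at U+10900..U+10915 (22 letters).
PHOENICIAN = [chr(cp) for cp in range(0x10900, 0x10916)]

# The 22 non-final Square Hebrew letters, alef..tav, in the same order.
HEBREW = [chr(cp) for cp in (
    0x05D0, 0x05D1, 0x05D2, 0x05D3, 0x05D4, 0x05D5, 0x05D6, 0x05D7,
    0x05D8, 0x05D9, 0x05DB, 0x05DC, 0x05DE, 0x05E0, 0x05E1, 0x05E2,
    0x05E4, 0x05E6, 0x05E7, 0x05E8, 0x05E9, 0x05EA,
)]

# Hebrew final forms fold to their non-final counterparts as well.
FINALS = {0x05DA: "\u05DB", 0x05DD: "\u05DE", 0x05DF: "\u05E0",
          0x05E3: "\u05E4", 0x05E5: "\u05E6"}

FOLD = {ord(p): h for p, h in zip(PHOENICIAN, HEBREW)}
FOLD.update(FINALS)

def search_key(text: str) -> str:
    """Fold Phoenician letters and Hebrew finals onto non-final Hebrew letters."""
    return text.translate(FOLD)

hebrew_mlk = "\u05DE\u05DC\u05DA"                   # mem, lamed, final kaf
phoenician_mlk = "\U0001090C\U0001090B\U0001090A"   # mem, lamd, kaf

print(search_key(hebrew_mlk) == search_key(phoenician_mlk))
```

The table itself is trivial; the practical question raised above is whether the tools people already use would ever ship with it or let users supply it.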

P. A.




Re: New contribution

2004-05-06 Thread Patrick Andries
Doug Ewell a écrit :
As I've said before, I don't know enough about the historical
relationship between Phoenician and Hebrew to get involved in this
bloodbath.  But for the life of me, I can't figure out how Fraktur keeps
getting dragged into it.  For heaven's sake, it's not THAT
unrecognizably different from Antiqua.
 

Fraktur is not that different, this is true. One could easily write 
Greek texts in Coptic and they would be legible (they would obviously 
not use the original Coptic letters for the original Coptic sounds).


Since the gauntlet had been thrown down, I did go ahead and format some
Vietnamese text samples in Fraktur or Sütterlin, and showed the samples
to a Vietnamese co-worker who moved to the U.S. sometime after high
school.  He had absolutely no problem reading the Fraktur, and said
there are plenty of examples of Fraktur in Vietnam (mostly decorative,
or in documents from the 1950s and earlier).
Which could maybe only show that he knows both scripts (Latin and 
Fraktur)...

 He couldn't understand the
Sütterlin at all, but did recognize it as handwriting and not, say, a
secret code or child's doodling.  

Yes, you are right, Sütterlin is that different, even if, with a little 
Fraktur training and knowledge of the language of the text written in 
it, the text would become legible by guessing the letters that are too 
different. But I am not sure this (guessing the unknown forms) would not 
also be true of a text written in a different but neighbouring script. I 
understand, though, that this would not even be possible for modern-day 
(Square) Hebrew readers confronted with Paleo-Hebrew, which seems to 
settle the script identity question for me.

P. A.




v and u positional variants (Re: New contribution)

2004-05-06 Thread Patrick Andries
Jim Allan a écrit :

Similarly _v_ and _u_ were for long only used as positional variants.
For a very long time, which explains, for example, why French has a 
non-etymological h in « huile » (oil): it was added to distinguish vile 
(she-bad) from vile (oil), which were written the same way but 
pronounced differently. Catach, in her Dictionnaire historique de 
l'orthographe française, calls this a diacritic h. It appeared around 
the XIIIth century.

P. A.
The same is true for huit (8) / vit (he lives or virile member) , huitre 
(oyster) / vitre (window pane), huis (door) / vis (you (sing.) live, 
live ! or screw), etc.



Sütterlin (was A New Contribution)

2004-05-06 Thread Patrick Andries
Peter Kirk a écrit :
OK, maybe not such a good example. So let's go back to Suetterlin. I 
would expect a much higher rate of recognition among German users of 
normal Latin script than among American users of normal Latin script. 
So a test of recognition in America might seem to indicate that 
Suetterlin should be disunified from Latin, on the same grounds that 
you want to disunify Phoenician and Hebrew (plus that Suetterlin has 
different cursive joining behaviour, just as Syriac does from Hebrew), 
but a test in Germany might provide evidence against this disunification.

[PA] Why is it important to go to Germany or even that one should 
understand the underlying text ? Germans understanding a text written in 
Sütterlin may only prove that some have been exposed to other scripts 
(Fraktur or Sütterlin for instance, if one says those aren't other 
scripts we are just having a circular argument) or that they are 
guessing and filling up the gaps (the very different letters) because 
they are interpreting the text, not that the characters are recognizably 
Latin characters.

P. A.



Re: New contribution

2004-05-05 Thread Patrick Andries
Patrick Andries a écrit :
Mark E. Shoulson a écrit :
 Well, it doesn't need to be a wedding invitation, does it?  I'll give 
it a try;
 I've downloaded a Sütterlin font, and I'll type up a small document and
 see if I can get some English-readers to read it or recognize it.

 Even if they can't read it, I'll bet they can recognize it as Latin 
letters and possibly English,
 which was not so for Paleo-Hebrew and Hebrew.

Not at all obvious to me :
http://www.cooptel.qc.ca/~pandries/suetterlin.jpg
(sorry already mentioned)
Could just as well be some Cyrillic or foreign (Tolkien ?) cursive for 
the average reader. But I agree -- as you mention in another message -- 
that people will not think this is a set of random symbols and would 
know how to turn the piece of paper on which it is written, mostly 
because of the cursivity and linking of the letters and the presence of 
numerals. Still, I believe this will not be perceived as the same script 
as Latin by readers of the Latin script (I'm not even sure young Germans 
would be able to recognize it without training).

P. A.
(who will also stop on this subject since we seem to be rehashing the 
same arguments)

(someone asked for a Phoenician / Hebrew dictionary sample to prove the 
need for plaintext distinctions; I have not found one, but would it be 
more convincing than this ?
http://www.cooptel.qc.ca/~pandries/dico-fraktur-latin.pdf)





Re: Yoruba Keyboard

2004-05-05 Thread Patrick Andries
John Hudson a écrit :
For details, see http://www.bisharat.net/ and, for mailing list 
subscription, 
http://lists.kabissa.org/mailman/listinfo/a12n-collaboration

If you are more at ease with French (yorouba ?), there is a 
Unicode-Afrique mailing list.

To subscribe send a message to  [EMAIL PROTECTED]
Also an initiative of http://www.bisharat.net/A12N/
(Don Osborne)
P. A.
- o - O - o -
ISO 10646 et Unicode en français
http://pages.infinit.net/hapax




Re: Pal(a)eo-Hebrew and Square Hebrew

2004-05-04 Thread Patrick Andries
Dean Snyder a écrit :
Patrick Andries wrote at 8:55 AM on Monday, May 3, 2004:
 

I got this answer from a forum dedicated to Ancient Hebrew :
« Very few people can read let alone recognize the paleo Hebrew font. 
Most modern Hebrew readers are not even aware that Hebrew was once 
written in the paleo Hebrew script.
   

The same could be said for archaic Greek versus modern Greek - do you
propose to encode archaic Greek separately?
 

[PA] I'm proposing nothing here, I'm just forwarding an answer.
When the text was written in the paleo Hebrew four of the 
Hebrew letters were used as vowels - aleph, hey, vav and yud, but were 
removed from the text when the masorites added the vowel pointings. This 
is evident in the Dead Sea Scrolls where the four letters are found in 
the words but removed in the Masoretic text.
   

This is simply not true.
[PA] So there were Dead Sea Scrolls written in Square Hebrew with matres 
lectionis ? (I don't know, I just would like to know.)

P.A.




[Fwd: Re: New contribution]

2004-05-04 Thread Patrick Andries
03/05/2004 05:19, Michael Everson wrote:

Suetterlin.

Oh shut UP about Sütterlin already. I don't know where you guys come 
up with this stuff. Sütterlin is a kind of stylized handwriting based on 
Fraktur letterforms and ductus. It is hard to read. It is not hard to 
learn, ...

Since when is this an argument ? Neither is Phoenician hard to learn (22 
letters with no contextual variants, etc.)... Could we please remain 
courteous ?


... and it is not hard to see the relationship between its forms and 
Fraktur. ...

The relationship is not at all apparent to someone that reads only the 
Latin Script and does not know the genealogy from the Fraktur Script to the 
German Script (as Sütterlin was also called). (I like mentioning that 
people saw them as different scripts.) Quite analogous to a set of 
historically related Northern Semitic scripts, and obviously if you have 
learned the genealogy of these scripts it is easy to recognize the 
relationship...

P. A.




Re: Pal(a)eo-Hebrew and Square Hebrew

2004-05-04 Thread Patrick Andries
Peter Kirk a écrit :
On 03/05/2004 05:55, Patrick Andries wrote:
Quoted...
...
When the Biblical text is written in paleo Hebrew there are no vowel 
pointings. When the text was written in the paleo Hebrew four of the 
Hebrew letters were used as vowels - aleph, hey, vav and yud, but 
were removed from the text when the masorites added the vowel 
pointings. This is evident in the Dead Sea Scrolls where the four 
letters are found in the words but removed in the Masoretic text.

No. 




Re: New contribution

2004-05-04 Thread Patrick Andries
Christian Cooke a écrit :
Surely a cipher is by definition after the event, i.e. there must be 
the parent script before the child. Does it not follow that, by John's 
reasoning, if one is no more than a cipher of the other then it is 
Hebrew that is the cipher and so the only way Phoenician and Hebrew 
can be unified (a suggestion you'll have to assume is suitably 
showered with smileys :-) is for the latter to be deprecated and the 
former encoded as the /real/ parent script? 
What is so important about genealogy ?
P. A. (immunity of the ill-informed also requested)


Re: New contribution

2004-05-04 Thread Patrick Andries
Patrick Andries a écrit :
Christian Cooke a écrit :
Surely a cipher is by definition after the event, i.e. there must 
be the parent script before the child. Does it not follow that, by 
John's reasoning, if one is no more than a cipher of the other then 
it is Hebrew that is the cipher and so the only way Phoenician and 
Hebrew can be unified (a suggestion you'll have to assume is suitably 
showered with smileys :-) is for the latter to be deprecated and the 
former encoded as the /real/ parent script? 

What is so important about genealogy ?
Let me clarify this : why does it matter so much whether we encode the 
father or one of the sons ?




Pal(a)eo-Hebrew and Square Hebrew

2004-05-03 Thread Patrick Andries
I got this answer from a forum dedicated to Ancient Hebrew :
 Very few people can read let alone recognize the paleo Hebrew font. 
Most modern Hebrew readers are not even aware that Hebrew was once 
written in the paleo Hebrew script. There are also many who believe that 
the square script is the original script and the paleo was a kind of 
handwritten script used by the commoners and was formed out of the 
original square script. This of course goes against the archeological 
record as the square script does not appear until around 500 BCE in 
Babylonia where it was used to write the Aramaic language and adopted by 
the Hebrews while in captivity in Babylon.

I am not aware of a program that will switch from square to paleo 
although there is a site that has the Torah in paleo Hebrew script - 
http://www.crowndiamond.org/cd/torah.html.

When the Biblical text is written in paleo Hebrew there are no vowel 
pointings. When the text was written in the paleo Hebrew four of the 
Hebrew letters were used as vowels - aleph, hey, vav and yud, but were 
removed from the text when the masorites added the vowel pointings. This 
is evident in the Dead Sea Scrolls where the four letters are found in 
the words but removed in the Masoretic text.

I do not know of a paleo Hebrew font used in the unicode though I heard 
of one who was working on that awhile ago but, I do not know what came 
about out of that.


P. A.



Re: New contribution

2004-05-03 Thread Patrick Andries
Michael Everson a écrit :
At 08:56 -0400 2004-05-03, John Cowan wrote:
Michael Everson scripsit:
 You can buy books to teach you how to learn Sütterlin. Germans who
 don't read Sütterlin recognize it as what it is -- a hard-to-read way
 that everyone used to write German not so long ago.

Sure.  At some point, the same was true of Palaeo-Hebrew and
Square Hebrew, no doubt.  Jews returning from Babylonian
exile with their nifty new Aramaic-style glyphs probably
saw PH inscriptions around them here and there.

And REJECTED them as being a different script.
What does this mean ? How do you know how they felt ? Any differently 
from the Germans that rejected Suetterlin as different script, etc. ?

While I'm rather for the Phoenician proposal, I believe one has to 
stress structural differences and objective arguments rather than simply 
repeating « it's a different script ». In this regard the treatment of 
matres lectionis found in Paleo-Hebrew (if I'm to believe *Jeff A. 
Benner* http://www.ancient-hebrew.org/jeffbenner(*) which I quoted in 
another message) and the Masoretic points in Square Hebrew may be a 
structural difference.

P. A.
(*) http://www.ancient-hebrew.org/bookstore/101.html




Re: New contribution

2004-05-03 Thread Patrick Andries
D. Starner a écrit :
Phoenician script, on the other hand, is so 
different that its use renders a ritual scroll 
unclean. 
   

And I've got Latin fonts, whose use will render a Bible unclean.
(Might come in handy for Tantric religious works, though.) More
seriously, I imagine some German religious communities were very
strict on the Bible in Fraktur instead of a radical new Roman font.
 

[PA] It is true of some Amish and Hutterite communities that have 
explicitly asked for Fraktur, and not Latin, to be used in hymn books (I 
know of a request to this effect made to a Mennonite printer in Manitoba 
known to me).

P. A.



Re: Arid Canaanite Wasteland (was: Re: New contribution)

2004-05-02 Thread Patrick Andries
Elliotte Rusty Harold a écrit :
At 9:43 AM -0700 5/1/04, Peter Kirk wrote:
For the record, I agree that Old Canaanite would be a better name. 
The reason for this is not primarily to be more Semito-centric, but 
rather to represent better the range of languages covered. For the 
same reason, Latin script should not be called English script, 
because English is only one of many languages using it.

Of course, Latin is also only one of many languages using the Latin 
script. Of course, the name Latin also has the nice political 
property that it's nobody's first language and only one very unusual 
state's official language any more (Vatican City). But is there some 
reason we call this the Latin script instead of the Roman script?
Roman Script to me is opposed to Latin Script, Uncial Script, Fraktur 
Script (all seen as scripts by Daniels & Bright).

P. A.





Re: New contribution

2004-04-30 Thread Patrick Andries
Ernest Cline a écrit :
[Original Message]
From: John Hudson [EMAIL PROTECTED]
But your proposal specifically states that the 'Phoenician' characters
   

should
 

be used to encode Palaeo-Hebrew, as if somehow Hebrew and Hebrew are
different languages when they look different.
   

No more so than Japanese becomes a different language when written
as romanji.  Language and script are distinct and a given language is often
encoded using several different scripts.  There may be points against
favoring writing Paleo-Hebrew with a Phoenician script instead of the Hebrew
script, but this isn't one of them. 

Well, since this seems to be the center of some controversy, shouldn't 
the methodology be to ask the community of users what it thinks : are 
these, for you (plural), two different scripts, or are they just 
stylistic variations of the same script (Hebrew) ?

And then to record the answer as an encoding guideline in the proposal 
(« Paleo-Hebrew texts should be encoded using Phoenician codepoints » or 
« Paleo-Hebrew texts should be encoded using the Hebrew codepoints »).

I don't really know, I just wish we could reconcile both sides here ;-) (*)
P. A.
(*)  I must be affected by the gorgeous weather we are at long last 
enjoying here.




Re: New contribution

2004-04-30 Thread Patrick Andries
Ernest Cline a écrit :
How about the following:
When deciding how to encode ancient scripts in Unicode, sometimes
arbitrary distinctions must be made between scripts that had a
continuous evolution from one form into another.  Depending upon
the point of view of the author, a text written in a transitional form,
such as Paleo-Hebrew, might be encoded in Unicode as either
of the two scripts that it serves as a bridge between, in this case,
Phoenician and Hebrew.
Depending upon how the passions run, this might mollify both sides
or it might make them both madder than they are. :)
 

[PA] I think this may only create confusion where there is none right 
now (if it is true that data are coded with Hebrew code points and a 
font change does the trick).

A standard should attempt to standardize and improve things.
P. A.



Re: U+0140

2004-04-15 Thread Patrick Andries
Anto'nio Martins-Tuva'lkin a écrit :

However I advise removal of the note Catalan under U+0140 and
U+013F, and perhaps replacement of the whole note with «for Catalan
use U+006C U+00B7» (resp. U+004C).
Did you get an answer on this ? Why is there no decomposition associated 
to this character ?

Also did someone mention why U+0140 is even in Unicode since it could 
be considered (by ignorami like me) as a precomposed character (l + 
middle dot) ? Is it due to the polysemy of the middle dot ?

P. A.





Re: U+0140

2004-04-15 Thread Patrick Andries
Patrick Andries a écrit :

Anto'nio Martins-Tuva'lkin a écrit :

However I advise removal of the note Catalan under U+0140 and
U+013F, and perhaps replacement of the whole note with «for Catalan
use U+006C U+00B7» (resp. U+004C).
Did you get an answer on this ? Why is there no decomposition 
associated to this character ?

Also did someone mention why U+0140 is even in Unicode since it could 
be considered (by ignorami like me) as a precomposed character (l + 
middle dot) ? Is it due to the polysemy of the middle dot ?
[PA] In the meantime Eric Muller forwarded some answers (dating back 
to 6/8/2002) in which Ken explains all this. Thank you Eric.

«

There is no particular reason to use the
l· as a single character, when all the 8859-based and Windows 1252
implementations would be using U+00B7 for the middle dot.
Consider U+0140 as effectively a compatibility character for
ISO 6937. It is mapped to 0xF7 in that standard. It is also
mapped to 0xA9A8 in Code Page 949 (Korean) -- which probably got
it from ISO 6937 in the first place.

Is U+0140 used in other languages?
 

Not that I know of.

--Ken
»
Patrick




Re: U+0140

2004-04-15 Thread Patrick Andries
Philippe Verdy a écrit :

From: Patrick Andries [EMAIL PROTECTED]
 

Anto'nio Martins-Tuva'lkin a écrit :
   

However I advise removal of the note Catalan under U+0140 and
U+013F, and perhaps replacement of the whole note with «for Catalan
use U+006C U+00B7» (resp. U+004C).
   

Did you get an answer on this ? Why is there no decomposition associated
to this character ?
Also did someone mention why U+0140 is even in Unicode since it could
be considered (by ignorami like me) as a precomposed character (l +
middle dot) ? Is it due to the polysemy of the middle dot ?
   

I thought it was already answered in this list by a Catalan speaking
contributor: the sequence L+middle-dot in Catalan is NOT a combining sequence.
Are you referring to the person I quoted ? Why doesn't the U+0140 have 
decomposition in Unicode ?

P. A.





Re: U+0140

2004-04-15 Thread Patrick Andries
Kenneth Whistler a écrit :

Did you get an answer on this ? Why is there no decomposition associated 
to this character ?
   

Thanks to Eric and Patrick for digging out my answer on this perennial
question from a couple years back, and saving me the trouble of
having to rummage around to find it. :-)
Also, it should be noted that there *is* a decomposition for
U+0140 in the Unicode Character Database, to wit:
0140;LATIN SMALL LETTER L WITH MIDDLE DOT;Ll;0;L;<compat> 006C 00B7;...
                                                 ^^^^^^^^
 

Oops. Looked at the wrong place in BabelMap.

Sorry (blushing).

Patrick
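
The compatibility decomposition Ken points to can be checked without BabelMap. What follows is only a minimal sketch, assuming Python's standard unicodedata module (its data version depends on the Python build); the helper name `describe` is made up for this illustration:

```python
import unicodedata

def describe(ch):
    """Return the name, raw decomposition field and normalized forms of one character."""
    return {
        "name": unicodedata.name(ch),
        # Raw decomposition field from the Unicode Character Database
        "decomposition": unicodedata.decomposition(ch),
        # Canonical normalization ignores <compat> mappings
        "NFD": unicodedata.normalize("NFD", ch),
        # Compatibility normalization applies them
        "NFKD": unicodedata.normalize("NFKD", ch),
    }

for ch in ("\u0140", "\u013F"):
    info = describe(ch)
    print(f"U+{ord(ch):04X} {info['name']}")
    print("  decomposition:", info["decomposition"])
    print("  NFD :", [f"U+{ord(c):04X}" for c in info["NFD"]])
    print("  NFKD:", [f"U+{ord(c):04X}" for c in info["NFKD"]])
```

Because the mapping is tagged as a compatibility one, NFD leaves U+0140 alone while NFKD turns it into U+006C U+00B7, which matches the advice to use the two-character sequence for Catalan.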








Re: U+0140

2004-04-15 Thread Patrick Andries
Philippe Verdy a écrit :

From: Patrick Andries [EMAIL PROTECTED]

 

Peter Kirk a écrit :

   

What is U+2027 intended for? The name suggests that it might be what
is needed for Catalan.
[PA] Isn't this the one that should be used in dictionaries ?
 

See http://www.unicode.org/unicode/standard/reports/tr14/tr14-6.html
2027
HYPHENATION POINT
Hyphenation point is primarily used to visibly indicate syllabification
of words. Syllable breaks are potential line breaking opportunities in
the middle of words. The hyphenation point It is mainly used in
dictionaries and similar works. When an actual line break falls inside a
word containing hyphenation point characters, the hyphenation point is
rendered as a regular hyphen at the end of the line.
   

This last sentence is wrong, at least in my Larousse dictionaries:

I believe it simply describes certain practices (Anglo-Saxon, American 
?), maybe this should be clearer.

P. A.




Re: names of the chars?

2004-04-07 Thread Patrick Andries
Tobias Stamm a écrit :

Greetings to all standartisers!

I'm new here so forgive me my stupidness.

I just have one little question to which I didn't found the answer in 
the whole homepage:

What is the standard of the characters names?


* The valid English names of ISO 10646 are defined in Annex L of ISO/IEC 
10646-1:2000(E)

Rule 1

By convention, only Latin capital letters A to Z, space, and hyphen are 
used for writing the names of characters.

NOTE  Names of characters may also include digits 0 to 9 (provided that 
a digit is not the first character in a word)

For more detail see http://std.dkuug.dk/JTC1/SC2/WG2/docs/principles.html



* For the French names more characters are allowed, see Annexe L of 
ISO/IEC 10646-1:2000(F) [OE digraph, apostrophe, accented letters]

Règle 1

Par convention, on n'utilisera que des lettres latines majuscules (y 
compris les lettres accentuées et les digrammes soudés), l'espace, 
l'apostrophe et le trait d'union pour la formation des noms de 
caractères. Ces caractères doivent faire partie du répertoire de 
l'alphabet latin n° 9 (ISO 8859-15).

NOTE : Les noms des caractères peuvent aussi comprendre les chiffres 0 à 
9 (en autant que le premier caractère d'un nom ne soit pas un chiffre) 
lorsque l'utilisation du nom de ce chiffre n'est pas appropriée.

P. A.
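
The English Rule 1 and its NOTE quoted above can be approximated with a regular expression. This is only a rough sketch in Python, not the wording of the standard: the names `NAME_RE` and `looks_like_valid_name` are invented for the example, and the pattern is deliberately stricter than the rule, since it treats words as separated by single spaces and requires each word to start with a letter.

```python
import re

# One "word": starts with A-Z, then capital letters, digits or hyphens.
# Requiring a leading letter enforces "a digit is not the first character in a word".
_WORD = r"[A-Z][A-Z0-9-]*"
NAME_RE = re.compile(rf"{_WORD}(?: {_WORD})*")

def looks_like_valid_name(name: str) -> bool:
    """Rough check of a character name against the English Rule 1 (sketch only)."""
    return NAME_RE.fullmatch(name) is not None

for candidate in (
    "LATIN SMALL LETTER L WITH MIDDLE DOT",  # accepted
    "HYPHENATION POINT",                     # accepted
    "Latin Small Letter L",                  # rejected: lowercase letters
    "DIGIT 9LIVES",                          # rejected: hypothetical name, word starts with a digit
    "LETTRE MINUSCULE LATINE \u00c9",        # rejected by the English rule: accented letter
):
    print(f"{candidate!r}: {looks_like_valid_name(candidate)}")
```

A check for the French names would have to widen the character class to the accented capitals, the OE digraph and the apostrophe allowed by the French annex.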




Re: French typographic thin space (was: Fixed Width Spaces)

2004-04-01 Thread Patrick Andries
Asmus Freytag [EMAIL PROTECTED] a écrit :

Have you folks noticed the addition of Narrow Non Break Space?


Yes,  but I have not been able to find a font with a narrow enough glyph 
(I just looked again at Code 2000).

Does anyone know of an appropriate font for French in this regard ?

P. A.







Re: Version(s) of Unicode supported by various versions of Microsoft Windows

2004-03-05 Thread Patrick Andries
Peter said:

People *really shouldn't* ask Does product X support Unicode version
N? They should be asking questions like Can product X correctly
perform function Y on such-and-such characters added in Unicode version
N?
   

This makes for a rather long list of questions if you want to know what 
Microsoft supports in a new OS or product release for instance.

One might think of how to best present the latest support level in a 
concise fashion and not on a per function per character basis.

P. A.




Re: Version(s) of Unicode supported by various versions of Microsoft Windows

2004-03-05 Thread Patrick Andries
Peter Constable a écrit :

Well, there is no way to answer a question like What version of Unicode
does Windows XP support with anything other than a vague summary
statement like somewhere between 3.0 and 4.0 or a bunch of details.
And since people tend not to find a vague summary very useful, I'm
suggesting we'd all be better off if they simply asked about what
specific functionality they need to know about. At least, until somebody
comes up with some bright idea about other ways to answer such
questions.
One other option is to ask what languages / locales are supported, and
that is how MS has been documenting things up to now. It's a slightly
different question, but it's one that is answerable.
Much better, IMO. MS must then provide coherent support for a 
language/locale at a given Unicode level.
(No one wants to ask how every function works for every codepoint for 
that locale, at least not before hitting a bug...)

P. A.




Re: [OT?] Modifying (Unicode) sorting of languages using diacritics in MS Word and MS SQL Server

2004-02-27 Thread Patrick Andries
Michael (michka) Kaplan a écrit :

From: Patrick Andries [EMAIL PROTECTED]

 

   I have the same question for MS SQL Server 2000...
   

Similar answer to the one Chris gave for Word, though with a slightly older
version of the Windows sort tables
 

   Finally, I would like to know if it is possible for a user  to add
an additional language to the ones appearing in the Windows regional and
language options, so as to assign to it, for instance, some keyboard
layouts.
   

This is not currently possible. But the user can certainly create a new
keyboard (now with an easy GUI tool) and the system will handle all that is
typed with it.
 

[PA] Yes, the GUI tool is very nice. So easy to use in theory that I 
don't understand why it is only available in English (i.e. one does not 
need to be a techie and thus know English to be able or want to use 
this tool).

P.-S. : Do Word, SQL Server 2000 and the Regional and Language options
window support all Unicode 4.0 associated languages as far as proper
sorting and addition of keyboards are concerned ?
   

It is hard to know what you mean here -- are you asking for when every
single character in Unicode 4.0 will be in some keyboard and some
linguistically appropriate sort, all built into Windows? Or did you have a
more practical (and reasonable) target in mind?
 

[PA] Let me be reasonable as you kindly suggest, how about proper French 
Canadian  (CAN/CSA Z243.4.1 standard (which you most probably know) and 
ISO/IEC 14651 with the delta corresponding to the latter) or Khmer sorting ?

P. A.





Re: [OT?] Modifying (Unicode) sorting of languages using diacritics in MS Word and MS SQL Server

2004-02-27 Thread Patrick Andries
Michael (michka) Kaplan a écrit :

[PA] Let me be reasonable as you kindly suggest, how about proper French
Canadian  (CAN/CSA Z243.4.1 standard (which you most probably know) and
ISO/IEC 14651 with the delta corresponding to the latter) or Khmer sorting
   

?

I am unaware of any specific non-conformant pieces in Windows in regard to
the former standard.
 

[PA] Well, may I suggest an offline discussion with Alain Labonté 
(cc'ed) ? He is more aware of this issue than I am.

I believe he has already transmitted his concern about this 
non-conformance through other channels in Microsoft (subsidiaries and 
other members of the Unicode consortium).

P. A.





[OT?] Modifying (Unicode) sorting of languages using diacritics in MS Word and MS SQL Server

2004-02-22 Thread Patrick Andries
Hello,

   I would like to know if the collating order used by Word may be 
tailored by the user to properly sort letters using diacritics in a 
language not appearing in the list of languages offered by Word. A 
simple sort by character number will obviously not work.

   I have the same question for MS SQL Server 2000...

   Finally, I would like to know if it is possible for a user  to add 
an additional language to the ones appearing in the Windows regional and 
language options, so as to assign to it, for instance, some keyboard 
layouts.
  
   Many thanks,

Patrick Andries

P.-S. : Do Word, SQL Server 2000 and the Regional and Language options 
window support all Unicode 4.0 associated languages as far as proper 
sorting and addition of keyboards are concerned ? If not, when will 
these products do so ?
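
To illustrate why a sort by character number does not work for letters with diacritics, here is a small Python sketch. It is not an implementation of CAN/CSA Z243.4.1 or ISO/IEC 14651; it only mimics one well-known feature of French ordering (accent differences are compared from the end of the word backwards), and the function name `french_key` is made up for the example.

```python
import unicodedata

def french_key(word):
    """Sort key: base letters first, then accents compared from the end of the word."""
    base = []
    marks = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the combining mark to the preceding base letter, if any
            if marks:
                marks[-1] += ch
        else:
            base.append(ch.lower())
            marks.append("")
    # Primary level: unaccented letters; secondary level: accents, last letter first
    return ("".join(base), tuple(reversed(marks)))

words = ["côté", "cote", "coté", "côte"]
print("code point order:", sorted(words))
print("accent-aware    :", sorted(words, key=french_key))
```

With precomposed input, the plain sort puts « coté » before « côte » simply because é has a lower character number than ô, whereas the accent-aware key gives cote, côte, coté, côté. A real tailoring would also have to handle case, ligatures and punctuation.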

  



Re: Detecting encoding in Plain text

2004-01-08 Thread Patrick Andries

- Message d'origine - 
De: John Delacour [EMAIL PROTECTED]


 Given any sizeable chunk of text, it ought to be possible to estimate 
 the statistical likelihood of its being in a certain 
 encoding/[language] even if it's in an unspecified 8859-* encoding. 
 It would be quite an interesting exercise, but I'd be surprised if 
 someone hasn't done it before.  Perhaps someone here knows.

See 

http://www.alis.com/fr/services_que.html
http://www.alis.com/en/services_que.html

P. A.
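
As a toy illustration of the idea John describes, and in no way a description of how the service linked above works, one can decode the same bytes under several candidate 8859-* encodings and keep the one that yields the most plausible text. The sketch below uses a very crude plausibility score (the share of letters and spaces); a serious detector would use per-language letter or n-gram statistics. The names `letter_ratio` and `guess_encoding` are assumptions of this example.

```python
def letter_ratio(text):
    """Share of characters that are letters or whitespace (crude plausibility score)."""
    if not text:
        return 0.0
    good = sum(1 for ch in text if ch.isalpha() or ch.isspace())
    return good / len(text)

def guess_encoding(data, candidates):
    """Return (best_candidate, scores) for the given bytes; sketch only."""
    scores = {}
    for enc in candidates:
        try:
            text = data.decode(enc)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes at all
        scores[enc] = letter_ratio(text)
    if not scores:
        raise ValueError("no candidate encoding could decode the data")
    return max(scores, key=scores.get), scores

# Byte 0xBD is the letter oe in ISO 8859-15 but the fraction 1/2 in ISO 8859-1
sample = "Meilleurs vœux pour l'an nouveau".encode("iso8859_15")
best, scores = guess_encoding(sample, ["latin_1", "iso8859_15"])
print(best, scores)
```

On such a short sample the two scores differ by a single character, which is why a sizeable chunk of text is needed before the statistics mean anything.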





U+0488 and U+0489

2004-01-01 Thread Patrick Andries
Hello everyone,

Does anyone have any background and usage information relative to the
two characters named below ? Some rendered examples would be very much
appreciated.

 U+0488 COMBINING CYRILLIC HUNDRED THOUSANDS SIGN
 U+0489 COMBINING CYRILLIC MILLIONS SIGN


Many thanks,

P. Andries

 - o - 0 - o -
Meilleurs vœux pour l'an nouveau !







Re: Mathematical exist and forall in Unicode

2003-12-30 Thread Patrick Andries

- Message d'origine - 
De: Mirek [EMAIL PROTECTED]


 Hello,

 I am not sure if it is the proper place to discuss the case if missing
 characters, but haven't found better place.

 I tried to find out two characters in unicode and encountered the
 following problem. There are two characters for logical EXISTS and FOR
 ALL signs.
 There exists old notation that is in unicode (exist =
 mirrored E, for all = inverted A)

U+2200
U+2203

and yet new notation (exist =  the  character similar to logical OR
OPERATOR but bigger, and for all =
 similar to logical AND OPERATOR, but bigger).

You mean similar to U+22C0 and U+22C1 ?

Do you have any reference as to the modernity of this V-like notation ?

May I add that, at first sight, I find this a very strange idea since
well-known and distinct signs would have been replaced by signs dangerously
close to other well-known ones.

 IMHO it's strange that unicode does not cover both types of notations,  or
maybe I missed something.

I don't know, but how about considering them as glyph variants ?

P. A.
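
For anyone wanting to compare the four code points mentioned in this exchange, their standard names can be listed with a few lines of Python. This is only a convenience sketch, assuming the standard unicodedata module; the helper name `names_of` is made up here.

```python
import unicodedata

def names_of(code_points):
    """Map each code point to its Unicode character name."""
    return {cp: unicodedata.name(chr(cp)) for cp in code_points}

# The two quantifiers and the two large n-ary operators discussed above
for cp, name in names_of([0x2200, 0x2203, 0x22C0, 0x22C1]).items():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

The names show that U+22C0 and U+22C1 are encoded as n-ary logical operators, not as quantifiers, whatever a particular tradition uses them for.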





Looking for more samples of _| (power tower)

2003-12-30 Thread Patrick Andries
 I have found an interesting form of a power tower (_|, see the third line
here http://pages.infinit.net/hapax/images/puissances.jpg).

 I was wondering if anyone else knew of other occurrences of this sign?

 Many thanks,

 Patrick





Re: Mathematical exist and forall in Unicode

2003-12-30 Thread Patrick Andries

- Message d'origine - 
De: D. Starner [EMAIL PROTECTED]


  These can probably be used as glyph variants, i.e., by selecting a US
vs. European font (or whatever
  is the distinction).

 I thought glyph variants were supposed to look at least somewhat similar.

Any reference to this similarity in appearance as a condition ?

(Is the Sütterlin »e« a glyph variant of standard Latin «e» then ? It does
not resemble any other e I know but rather an n.)

P.A.





Re: UNICODE OTHER STANDARDS

2003-12-29 Thread Patrick Andries

- Message d'origine - 
De: Markus Scherer [EMAIL PROTECTED]

 It looks to me like Christopher is not after an analysis of what standards
could somehow be squeezed
 to use Unicode charsets, but rather a list of standards that _specify_
(actively, not potentially)
 Unicode/10646.

 The obvious ones are of course
 HTML (at least since 4.01:
http://www.w3.org/TR/html401/charset.html#h-5.1)
 XML
 ECMAScript

 I do not have a complete list.

Another one : ISO 14651 (collation), I believe.

Ken Whistler (or Alain Labonté) can confirm (or deny) this.

P. A.







Re: [hebrew] Re: Ancient Northwest Semitic Script (was Re: why Aramaicnow)

2003-12-28 Thread Patrick Andries




-Message d'origine - 
De: "D. Starner" [EMAIL PROTECTED]
 Indeed, by 
the same argument, we could encode a lot of scripts together. ISCII did 
it for Indic scripts. I'm sure we could do some serious merging among 
syllabic scripts - 12A8 (ከ) is the same as 
13A7 (Ꭷ)

I understand this is said tongue in cheek, but even 
then:

This merging seems reasonable to you because 
you consider their similar English names, 
but not their different phonetic values ([k] vs [ka]) or their ISO 10646 
French names, for instance (respectively K for Ethiopic and KA for 
Cherokee), KA being 12AB in the French version. See Daniels-Bright 
(Table 51.5, which gives k (ka) for U+12A8 [k] and ka 
for U+12AB [ka] or [k]) and Amharique pour 
francophones (L'Harmattan) (p. 5, which gives ke/k for U+12A8 and ka for 
U+12AB). 

The English names are, of course, perfectly okay 
(don't want to open a can of worms here;-)).


P. A.
- o - O - o - 
ISO 10646 en français
http://pages.infinit.net/hapax



Re: Aramaic unification and information retrieval

2003-12-27 Thread Patrick Andries

- Message d'origine - 
De: Patrick Andries [EMAIL PROTECTED]



 - Message d'origine - 
 De: Michael Everson [EMAIL PROTECTED]

 At 17:46 + 2003-12-26, Christopher John Fynn wrote:

 (Though the Roman style & Fraktur style of Latin script are probably
more
 different from each other as some of the separately encoded Indic
 scripts [e.g. Kannada / Telugu])


  Sorry, Chris, this is unsubstantiated speculation, and it doesn't
  happen to be true.
 
  In 1997, I showed some comparisons between Coptic, Greek, Cyrillic,
  and Gothic showing that all of them but Greek were similar enough to
  be read with a minimum of training and practice.

 Very probable, but how did you measure those distances and the training
and
 practice necessary ?

  I revised this a bit
  in 2001: http://www.evertype.com/standards/cy/coptic.html. German,
  English, and Irish can all be read with similarly low learning curve
  whether the script is Fraktur or Gaelic; the number of letterforms
  which differ is small.


 Interesting, I wonder if you included Sütterlin in your study.

 http://pages.infinit.net/hapax/images/suetterlin.jpg

 To the average literate reader of the Latin script, and not scholars like
 Everson : what letters are written ?

Some people having enquired about what the Sütterlin letters could
correspond to (and some having mistakenly identified several), I have
written the document in a different « script ».

http://pages.infinit.net/hapax/images/SuetterlinEnAnglaise.jpg

I wonder how many letterforms could be considered different. If the first
three words (»Bin noch munter«) are anything to go by, I would say quite a
lot: B, c, h, u, t, e, r, with n deceptively close to e to the untrained
eye.

P. A.






Re: Aramaic unification and information retrieval

2003-12-24 Thread Patrick Andries

- Original message - 
From: D. Starner [EMAIL PROTECTED]


  Yup, if you make a grid pattern of sufficient size and complexity you can
  fit any relatively simple shape like a letterform into it.

 And this grid doesn't even particularly fit the characters.
 Two big rules of Latin typography are that the capital
 letters are all of the same size (visually, at least)

Is this true for accented capitals or only for English letters?

AUGJQO

 Did I yet again read too fast?

P. A.

Season's Greetings & Best Wishes to All!
Bonnes fêtes et meilleurs vœux à tous !





[OT] Size of Latin Capitals (was Re: Aramaic unification and information retrieval)

2003-12-24 Thread Patrick Andries

- Original message - 
From: Doug Ewell [EMAIL PROTECTED]



 Patrick Andries Patrick dot Andries at xcential dot com wrote:

  And this grid doesn't even particularly fit the characters.
  Two big rules of Latin typography are that the capital
  letters are all of the same size (visually, at least)
 
  Is this true for accented capitals or only for English letters?
 
  AUGJQO
 
   Did I yet again read too fast?

 Maybe.  Think base letter, not letter with combining diacritics.

Well, I think this is better said by the writer than implicitly thought by
the reader ;-)

 Also bear in mind that capital J and Q have no descender in some fonts.

A minority, I would think, for Q, and this has been so right from lapidary
capitals. There are also some capital P and Y that extend below the baseline
(Fraktur, for instance, unless this is not Latin). But okay, this is not a
Unicode issue but a font design issue.

P. A.






Re: [OT] CJK - CJC (Re: Corea?)

2003-12-15 Thread Patrick Andries

- Original message - 
From: Don Osborn [EMAIL PROTECTED]


 Although I admit to not quite understanding the motivation for this
 suggestion,

It is a request by 22 MPs who want to change the English spelling by law.

According to the articles, this was the original English spelling before the
occupying Japanese authorities replaced the initial C with a K so that Korea
would follow Japan in alphabetical order.

Apparently North and South Corea(s) want to participate in the 2004 Olympic
Games under the letter C (»It goes so far that the two countries want to
compete together at the 2004 Olympic Games with the C in their name. In fact,
the world sporting event is said to be the real reason for the colonial
alphabet soup.«, translated from the German)

P. Andries






