Re: Unicode 3.1: incomplete tags considered harmless/useful

2001-01-31 Thread DougEwell2
In a message dated 2001-01-31 12:19:33 Pacific Standard Time, [EMAIL PROTECTED] writes: > The section "Dangers of Incomplete Support" in section 13.7 seems to me > to be far too strongly worded; it should be weakened or removed > altogether. > > In particular, there is no reason why sequen

Fwd: Character encoding systems for Arabic Web pages ?

2001-01-31 Thread DougEwell2
There seems to be a lot of crossover between the Qalam and Unicode lists today. > Forward Header_ > Subject:Character encoding systems for Arabic Web pages ? > Author: <[EMAIL PROTECTED]> > Date: 2001-01-31 6:35 PM > > > Dear Arabist's,

Re: Unicode 3.1: UTF-8

2001-01-31 Thread David Starner
On Wed, Jan 31, 2001 at 11:18:37AM -0800, John Cowan wrote: > I propose that the distinction between illegal and irregular UTF-8 > code sequences (D36bc) be eliminated. Since there are no code points > between U+D7FF and U+E000 (the apparently intervening code points > are UTF-16 code units, but

Phaistos Disk (was Re: ConScript registry?)

2001-01-31 Thread P. T. Rourke
Sure enough. And I'm certainly never going to criticize someone for treating it as a script until it is proven otherwise - including for the purposes of Unicode. But one has to admit that one excellent piece of evidence that a script is a script is the existence of multiple texts, and that in th

Unicode 3.1: incomplete tags considered harmless/useful

2001-01-31 Thread John Cowan
The section "Dangers of Incomplete Support" in section 13.7 seems to me to be far too strongly worded; it should be weakened or removed altogether. In particular, there is no reason why sequences of tag characters not beginning with LANGUAGE TAG or CANCEL TAG cannot be used for various purposes b

Unicode 3.1: UTF-8

2001-01-31 Thread John Cowan
I propose that the distinction between illegal and irregular UTF-8 code sequences (D36bc) be eliminated. Since there are no code points between U+D7FF and U+E000 (the apparently intervening code points are UTF-16 code units, but not Unicode code points) the corresponding UTF-8 code sequences shou
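
A minimal sketch of the boundary discussed in this post, assuming a present-day Python 3 interpreter rather than any Unicode 3.1-era implementation. It applies the generic three-byte UTF-8 bit layout to the code points on either side of the gap between U+D7FF and U+E000, and to two values inside it, then asks Python's strict decoder whether it accepts each sequence. The helper names `encode_three_byte` and `is_accepted` are hypothetical and exist only for illustration; the decoder's behaviour reflects current Python, not the wording of definition D36.

```python
def encode_three_byte(cp: int) -> bytes:
    """Apply the generic three-byte UTF-8 bit layout to cp, with no validity checks."""
    if not 0x800 <= cp <= 0xFFFF:
        raise ValueError("only values in the three-byte range are handled here")
    return bytes([
        0xE0 | (cp >> 12),          # lead byte: 1110xxxx
        0x80 | ((cp >> 6) & 0x3F),  # continuation byte: 10xxxxxx
        0x80 | (cp & 0x3F),         # continuation byte: 10xxxxxx
    ])


def is_accepted(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Last code point before the gap, two values inside it, first one after it.
    for cp in (0xD7FF, 0xD800, 0xDFFF, 0xE000):
        seq = encode_three_byte(cp)
        print(f"U+{cp:04X} {seq.hex()} accepted={is_accepted(seq)}")
```

The sequences are built by hand because Python's own encoder refuses to produce them for values in the gap, so the decoder is the only place the check can be observed.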

Unicode 3.1: Georgian (editorial)

2001-01-31 Thread John Cowan
The 2nd paragraph in the revision of 7.5 appears to be a remnant. -- There is / one art || John Cowan <[EMAIL PROTECTED]> no more / no less || http://www.reutershealth.com to do / all things || http://www.ccil.org/~cowan with art- / lessness \\ -- P

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 08:21 -0800 2001-01-31, scríobh P. T. Rourke: >Thanks, but if you go back and read my original message, you'll find the >following sentences that continue from the point quoted by Mr. Everson: > >> Other than the Phaistos Disk "script," which may not >> be a script at all (it seems odd that the

Re: ConScript registry?

2001-01-31 Thread Michael Everson
The Phaistos disk is either a sample of writing or it is a board game. But as a board game it doesn't look very interesting. Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Mob +353 86 807 9169 ** Fax +353 1 478

Re: Property error for U+2118?

2001-01-31 Thread John Cowan
John O'Conner wrote: > Is this an error or intentional change? I noticed that all other > "SCRIPT CAPITAL *" character values are in the "Lu" category. > However, this particular character has changed to "So" in the 3.0, > 3.0.1, and 3.1 db. Why? Why not the other SCRIPT CAPITAL * char > val

Re: ConScript registry?

2001-01-31 Thread John Jenkins
On Wednesday, January 31, 2001, at 08:21 AM, P. T. Rourke wrote: > Thanks, but if you go back and read my original message, you'll find the > following sentences that continue from the point quoted by Mr. Everson: > >> Other than the Phaistos Disk "script," which may not >> be a script at all (i

Re: Daniels and Bright Tibetan Query

2001-01-31 Thread Valeriy E. Ushakov
On Wed, Jan 31, 2001 at 08:10:44 -0800, James E. Agenbroad wrote: > In the chapter on Tibetan in Daniels and Bright's The world's writing > systems (page 434) about prescript symbols: "There are six radicals that > never occur with a prescript: wa, ra, la, ha, and 'a chung." Does anyone > know wh

Property error for U+2118?

2001-01-31 Thread John O'Conner
Is this an error or intentional change? I noticed that all other "SCRIPT CAPITAL *" character values are in the "Lu" category. However, this particular character has changed to "So" in the 3.0, 3.0.1, and 3.1 db. Why? Why not the other SCRIPT CAPITAL * char values too? 2118;SCRIPT CAPITAL P;So;0;
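
A minimal sketch of how the comparison in this post could be repeated today, assuming Python 3 and its bundled `unicodedata` module. It lists the General Category of every character in the Letterlike Symbols block (U+2100 to U+214F) whose name begins with "SCRIPT CAPITAL". The function name `script_capital_categories` is hypothetical. The values printed come from whatever Unicode Character Database version ships with the interpreter, which is much later than the 3.0, 3.0.1, and 3.1 files mentioned above, so they need not match the "So" shown in the quoted data line.

```python
import unicodedata


def script_capital_categories() -> dict[str, str]:
    """Map each 'SCRIPT CAPITAL *' character name to its General Category."""
    result = {}
    for cp in range(0x2100, 0x2150):  # Letterlike Symbols block
        ch = chr(cp)
        name = unicodedata.name(ch, "")  # "" for unassigned code points
        if name.startswith("SCRIPT CAPITAL "):
            result[name] = unicodedata.category(ch)
    return result


if __name__ == "__main__":
    print("UCD version:", unicodedata.unidata_version)
    for name, category in script_capital_categories().items():
        print(f"{name}: {category}")
```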

Re: ConScript registry?

2001-01-31 Thread P. T. Rourke
Thanks, but if you go back and read my original message, you'll find the following sentences that continue from the point quoted by Mr. Everson: > Other than the Phaistos Disk "script," which may not > be a script at all (it seems odd that there would be a > script in as heavily studied a locatio

Daniels and Bright Tibetan Query

2001-01-31 Thread James E. Agenbroad
Wednesday, January 31, 2001 In the chapter on Tibetan in Daniels and Bright's The world's writing systems (page 434) about prescript symbols: "There are six radicals that never occur with a prescript: wa, ra, la, ha, and 'a chung." Does anyone know what the

Re: ConScript registry?

2001-01-31 Thread John Jenkins
On Wednesday, January 31, 2001, at 06:14 AM, Michael Everson wrote: > Ar 05:46 -0800 2001-01-31, scríobh P. T. Rourke: >> I'm curious: what are the historical scripts that have been proposed to >> Unicode that only exist in a handful of documents (note that I define >> handful as 20 or less)? >

Re: ConScript registry?

2001-01-31 Thread Thomas Chan
On Wed, 31 Jan 2001, Michael Everson wrote: > Ar 13:23 -0800 2001-01-30, scríobh Thomas Chan: > >I don't think that CSUR is conclusive proof that there wouldn't be a > >deluge of demands for encoding fictional or constructed scripts if the > >likes of Tengwar or Klingon were encoded. > > Well, I

Re: Benefits of Unicode

2001-01-31 Thread Antoine Leca
[EMAIL PROTECTED] wrote: > > > There are several features that make the TRON approach to multilingual > processing unique. One is that the TRON character set is "limitlessly > extensible," and thus it is capable of including all scripts that have ever > been used, and even new scripts that have

Re: ConScript registry?

2001-01-31 Thread DougEwell2
In a message dated 2001-01-31 5:46:59 Pacific Standard Time, [EMAIL PROTECTED] writes: > >Don't forget Deseret, which will, in fact, be part of Unicode 3.1. > > Version 2.1 of ConScript removes Deseret and points the user to the SMP. > (John Cowan hasn't updated the mirror site yet.) This is

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 05:46 -0800 2001-01-31, scríobh P. T. Rourke: >I'm curious: what are the historical scripts that have been proposed to >Unicode that only exist in a handful of documents (note that I define >handful as 20 or less)? Proto-Sinaitic, for instance. Possibly some of the badly-known South American s

Re: ConScript registry?

2001-01-31 Thread P. T. Rourke
I'm curious: what are the historical scripts that have been proposed to Unicode that only exist in a handful of documents (note that I define handful as 20 or less)? Other than the Phaistos Disk "script," which may not be a script at all (it seems odd that there would be a script in as heavily st

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 13:23 -0800 2001-01-30, scríobh Thomas Chan: >I don't think that CSUR is conclusive proof that there wouldn't be a >deluge of demands for encoding fictional or constructed scripts if the >likes of Tengwar or Klingon were encoded. Well, I think what David was saying is that there don't seem to

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 12:19 -0800 2001-01-30, scríobh David Starner: >The ConScript registry (http://www.egt.ie/standards/csur/index.html) is a >place where constructed/artificial scripts can be registered in a way >that they can be publicly transferred (among those who recognize the >encoding, of course.) "By ag

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 14:54 -0800 2001-01-30, scríobh David Starner: >On a calmer note, how many script submissions does Unicode and the >ISO 10646 working group get now? How about from people outside Unicode >and the working group? What about outside the standards bodies? The occasional Southeast Asian script we

Re: ConScript registry?

2001-01-31 Thread Michael Everson
Ar 13:56 -0800 2001-01-30, scríobh John Jenkins: > >> Of those in the registry, I would guess only 8 (Tengwar, Cirth, >> Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have >> any claim to be added to Unicode. 78 columns, less than 624 characters to be >> added. > >Don't forget