Re: ConScript registry?
On Wed, Jan 31, 2001 at 05:06:14AM -0800, Michael Everson wrote:
Of those in the registry, I would guess only 8 (Tengwar, Cirth, Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have any claim to be added to Unicode. 78 columns, less than 624 characters to be added.
These would appear to be in use by actual communities of some size. (Some of the other ConScripts appear to be in use only by their creators.)
The only reason I included Aiha in my list was that it's already got a block tentatively assigned on your roadmap to the SMP. I've never heard of this language before, and wouldn't have included it otherwise. Why was it included in the roadmap?
-- David Starner - [EMAIL PROTECTED] Pointless website: http://dvdeug.dhis.org
Re: ConScript registry?
At 13:56 -0800 2001-01-30, John Jenkins wrote:
Of those in the registry, I would guess only 8 (Tengwar, Cirth, Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have any claim to be added to Unicode. 78 columns, less than 624 characters to be added.
Don't forget Deseret, which will, in fact, be part of Unicode 3.1.
Version 2.1 of ConScript removes Deseret and points the user to the SMP. (John Cowan hasn't updated the mirror site yet.) This is an object lesson in the volatility of the Private Use Area. I suppose I ought to do up a mapping table for anyone who used the old Deseret.
Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Mob +353 86 807 9169 ** Fax +353 1 478 2597 ** Vox +353 1 478 2597
27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire
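
The kind of mapping table mentioned above, for text written with a private-use Deseret allocation, might look roughly like the following. This is only a sketch under stated assumptions: the thread does not give the old private-use range, so OLD_PUA_BASE and BLOCK_LENGTH are hypothetical placeholders, and the code assumes both allocations list the characters in the same order, so that a constant offset is enough. The only value taken as fact is that the standard Deseret block starts at U+10400 in Plane 1.

```python
# Sketch: remap text from a private-use Deseret allocation to the
# standard Deseret block. OLD_PUA_BASE and BLOCK_LENGTH are hypothetical
# placeholders; check the registry's actual allocation before relying on them.

OLD_PUA_BASE = 0xE830   # hypothetical start of the old private-use range
NEW_BASE = 0x10400      # start of the Deseret block in Plane 1 (SMP)
BLOCK_LENGTH = 0x50     # hypothetical number of code points to remap


def build_mapping(old_base=OLD_PUA_BASE, new_base=NEW_BASE, length=BLOCK_LENGTH):
    """Return a table for str.translate mapping old code points to new ones.

    Assumes both allocations use the same character order, so a constant
    offset works. A real table might need per-character entries instead.
    """
    return {old_base + i: new_base + i for i in range(length)}


def migrate(text, table=None):
    """Replace private-use code points with their standard equivalents."""
    if table is None:
        table = build_mapping()
    # Characters outside the table are left untouched by str.translate.
    return text.translate(table)
```

A caller would run `migrate(old_text)` once over stored documents. Because anything outside the assumed range passes through unchanged, text mixing Latin and Deseret would survive the conversion.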
Re: ConScript registry?
At 14:54 -0800 2001-01-30, David Starner wrote:
On a calmer note, how many script submissions do Unicode and the ISO 10646 working group get now? How about from people outside Unicode and the working group? What about outside the standards bodies?
The occasional Southeast Asian script we hadn't seen before is brought to our attention from outside, but in general we've identified a large set of scripts we need to work on (see the Roadmap) and we sort of focus on that. It is sometimes difficult to work on them because of the resources available, and for some of the scripts it is difficult to get in touch with users or experts.
If my guess is right, there are very few submissions from outside Unicode, and really no evidence that this would pick up significantly after Tengwar or Klingon got encoded.
I would tend to concur with this assessment.
Michael Everson
Re: ConScript registry?
At 12:19 -0800 2001-01-30, David Starner wrote:
The ConScript registry (http://www.egt.ie/standards/csur/index.html) is a place where constructed/artificial scripts can be registered in a way that they can be publicly transferred (among those who recognize the encoding, of course.)
"By agreement between sender and receiver" is the usual jargon.
It also is a 'proof' that there won't be a huge surge of constructed characters in Unicode if you let Klingon or Tengwar in.
Is it?
There are roughly 2000 characters encoded in the BMP Private-Use area, with another 6,000 in Plane 16. Even accepting them all, that would fit easily in the space that hasn't even been tentatively allocated in Plane 1's roadmap.
Oh, I see what you're saying. ConScript handles some 40 scripts -- even if they were **all** accepted into the SMP it wouldn't make that much difference. Not that we're thinking of that.
Cowan and Everson have not been very picky about which scripts they included in the ConScript registry.
Well, we tried to make sure the proposals were of quality. We preferred it a lot if fonts were available for the user.
Of those in the registry, I would guess only 8 (Tengwar, Cirth, Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have any claim to be added to Unicode. 78 columns, less than 624 characters to be added.
These would appear to be in use by actual communities of some size. (Some of the other ConScripts appear to be in use only by their creators.)
Michael Everson
Re: ConScript registry?
At 13:23 -0800 2001-01-30, Thomas Chan wrote:
I don't think that CSUR is conclusive proof that there wouldn't be a deluge of demands for encoding fictional or constructed scripts if the likes of Tengwar or Klingon were encoded.
Well, I think what David was saying is that there don't seem to be all that many of them.
CSUR is just a pair of websites with nowhere near the high profile or authority of Unicode.
I thought one of the Unicode web pages linked it. I could be wrong. And the CSUR states explicitly that it is just for fun. Having said that, I do know of some folks who have done implementations of one sort or another based on its specifications.
If, say, a fictional script were included and published by Unicode and ISO, then people all over would suddenly be aware of the fact that a fictional script got included, and perhaps they might conclude that they should submit their own pet scripts as well.
Thomas, if a script like Tengwar, which has thousands of users who are actually interested in writing texts in it, sorting, searching, and all that, gets into the UCS it is because there is a credible requirement to encode it. Plenty of "nonfictional" historical scripts have fewer users than Tengwar. For some of them we have a handful of texts. Tengwar on the other hand is studied by linguists, used by enthusiasts, and at any rate is an integral part of the work of one of the 20th century's finest and most influential writers.
Many people with very real scripts that they use in their daily lives were not aware of Unicode or that it would benefit them to have them encoded; I suspect the same is true for creators of fictional and constructed scripts.
Yes, of course.
For example, it is easy to find a variety of fonts for fantasy runes or other alphabets that people have created, some based off a description in published fiction, but they have not gotten in touch with CSUR.
Actually there aren't all that many.
Or take the case of the Hotsuma Tsutae syllabary, created in modern times to provide a fictional pre-Chinese writing system (http://www.jtc.co.jp/hotsuma/index-e.htm) for what is supposedly Old Japanese, which has books and articles published about it, and fonts in existence, but it has no contact with CSUR.
In fact, I *have* seen this. As I recall Ken Whistler and I looked at it when we were at the WG2 meeting in Fukuoka.
Michael Everson
Re: ConScript registry?
I'm curious: what are the historical scripts that have been proposed to Unicode that only exist in a handful of documents (note that I define handful as 20 or less)? Other than the Phaistos Disk "script," which may not be a script at all (it seems odd that there would be a script in as heavily studied a location as the Aegean with only one example; it probably is a script, but I would say that the jury is still out). Patrick Rourke [EMAIL PROTECTED]
Re: ConScript registry?
At 05:46 -0800 2001-01-31, P. T. Rourke wrote:
I'm curious: what are the historical scripts that have been proposed to Unicode that only exist in a handful of documents (note that I define handful as 20 or less)?
Proto-Sinaitic, for instance. Possibly some of the poorly known South American scripts like Paucartambo. There are some scripts whose names keep getting repeated in the literature but for which it's almost impossible to get any samples at all.
Michael Everson
Re: ConScript registry?
On Wed, 31 Jan 2001, Michael Everson wrote:
At 13:23 -0800 2001-01-30, Thomas Chan wrote:
I don't think that CSUR is conclusive proof that there wouldn't be a deluge of demands for encoding fictional or constructed scripts if the likes of Tengwar or Klingon were encoded.
Well, I think what David was saying is that there don't seem to be all that many of them.
My primary objection was that we don't have conclusive evidence for either scenario.
CSUR is just a pair of websites with nowhere near the high profile or authority of Unicode.
I thought one of the Unicode web pages linked it. I could be wrong. And the CSUR states explicitly that it is just for fun. Having said that, I do know of some folks who have done implementations of one sort or another based on its specifications.
I don't see any wording along the lines of "just for fun" on either CSUR website itself, except for a link on your http://www.egt.ie/sc2wg2.html page. The only thing that suggests their unofficialness and volatility is the mention of the Private Use Area, but perhaps that is not clear to people who see the words "Unicode" and "Registry" and think it is the real thing, or who have problems comprehending the concept of a Private Use Area. Or perhaps they have heard about it secondhand. For example, look through the Usenet newsgroup archives at deja.com or any discussion board online and see how often people believe Klingon is in Unicode, or "going to be in the next version" of Unicode, when there has only been a proposal. (And I doubt they are looking at the WG2 proposal itself, but the CSUR registration or derivative information.)
If, say, a fictional script were included and published by Unicode and ISO, then people all over would suddenly be aware of the fact that a fictional script got included, and perhaps they might conclude that they should submit their own pet scripts as well.
Thomas, if a script like Tengwar, which has thousands of users who are actually interested in writing texts in it, sorting, searching, and all that, gets into the UCS it is because there is a credible requirement to encode it. Plenty of "nonfictional" historical scripts have fewer users than Tengwar. For some of them we have a handful of texts. Tengwar on the other hand is studied by linguists, used by enthusiasts, and at any rate is an integral part of the work of one of the 20th century's finest and most influential writers.
Please note that I did not single Tengwar out for criticism. I believe it has a valid argument to be encoded because of the size of the user community. It is the fictional scripts with small user communities that are the problem, and how that relates to treatment of real-world historical scripts with small user communities.
For example, it is easy to find a variety of fonts for fantasy runes or other alphabets that people have created, some based off a description in published fiction, but they have not gotten in touch with CSUR.
Actually there aren't all that many.
Are we sure about this? It remains to be examined how they would be treated, but there are Chinese fictional scripts that have the potential capability of gobbling up codepoints like "ideographs" have done. For example, http://deall.ohio-state.edu/grads/chan.200/misc/100fu.jpg and http://deall.ohio-state.edu/grads/chan.200/misc/100shou.jpg each show a single character in what are supposedly a hundred different scripts. Most of these "scripts" could probably be conflated and treated as font variants, but a few are distinct. Multiply that by 4000-8000 each, and you might have an explosion. Or take the case of the bunch of obsoleted reformist alphabets and syllabaries of the late 19th and early 20th century, such as the Guanhua Zimu ("Mandarin letters") alphabet, which is to my knowledge only partially described in one Western source. If I understand correctly, these would be in the same category as Deseret or Visible Speech.
Or take the case of the Hotsuma Tsutae syllabary, created in modern times to provide a fictional pre-Chinese writing system (http://www.jtc.co.jp/hotsuma/index-e.htm) for what is supposedly Old Japanese, which has books and articles published about it, and fonts in existence, but it has no contact with CSUR.
In fact, I *have* seen this. As I recall Ken Whistler and I looked at it when we were at the WG2 meeting in Fukuoka.
How did that discussion turn out?
Thomas Chan [EMAIL PROTECTED]
Re: ConScript registry?
On Wednesday, January 31, 2001, at 06:14 AM, Michael Everson wrote:
At 05:46 -0800 2001-01-31, P. T. Rourke wrote:
I'm curious: what are the historical scripts that have been proposed to Unicode that only exist in a handful of documents (note that I define handful as 20 or less)?
Proto-Sinaitic, for instance. Possibly some of the poorly known South American scripts like Paucartambo. There are some scripts whose names keep getting repeated in the literature but for which it's almost impossible to get any samples at all.
Well, the best example of this sort of thing is the Phaistos disk script, which Michael and I have independently proposed. The entire corpus of known writings in this script was included in the proposal, and half of the corpus is found on your Unicode CD. Literally "on".
Re: ConScript registry?
Thanks, but if you go back and read my original message, you'll find the following sentences that continue from the point quoted by Mr. Everson: Other than the Phaistos Disk "script," which may not be a script at all (it seems odd that there would be a script in as heavily studied a location as the Aegean with only one example; it probably is a script, but I would say that the jury is still out).
Re: ConScript registry?
On Wednesday, January 31, 2001, at 08:21 AM, P. T. Rourke wrote: Thanks, but if you go back and read my original message, you'll find the following sentences that continue from the point quoted by Mr. Everson: Other than the Phaistos Disk "script," which may not be a script at all (it seems odd that there would be a script in as heavily studied a location as the Aegean with only one example; it probably is a script, but I would say that the jury is still out). You are of course correct. In my eagerness to point out that the entire Phaistos repertoire is included in the encoding proposal, I read too hastily. My apologies.
Re: ConScript registry?
The Phaistos disk is either a sample of writing or it is a board game. But as a board game it doesn't look very interesting.
Michael Everson
Re: ConScript registry?
At 08:21 -0800 2001-01-31, P. T. Rourke wrote:
Thanks, but if you go back and read my original message, you'll find the following sentences that continue from the point quoted by Mr. Everson: Other than the Phaistos Disk "script," which may not be a script at all (it seems odd that there would be a script in as heavily studied a location as the Aegean with only one example; it probably is a script, but I would say that the jury is still out).
The sample we have of Phaistos is, at least, well-designed, clear, and easily analyzable. Meaning, at least it's not a rumour.
Michael Everson
Phaistos Disk (was Re: ConScript registry?)
Sure enough. And I'm certainly never going to criticize someone for treating it as a script until it is proven otherwise - including for the purposes of Unicode. But one has to admit that one excellent piece of evidence that a script is a script is the existence of multiple texts, and that in this case that excellent piece of evidence happens to be missing. Not to say that there isn't other evidence that it is an example of a lost script (the fact that the characters seem to have been imprinted with some sort of stamp, for instance, which suggests that there are in fact multiple texts in the script, and the rest are lost). Another possible explanation of the disk is that it is a "dancing-men" cipher of some sort (though why a cipher would be imprinted on such a permanent medium is beyond me). I do not think there is anything controversial in expressing this modicum of doubt.
Anyway, I'm glad that there are folks like Mr. Everson and Mr. Jenkins willing to put in the time to keep up activity on encoding historical scripts.
Patrick Rourke [EMAIL PROTECTED]
Re: ConScript registry?
On Tue, Jan 30, 2001 at 11:02:29AM -0800, Elaine Keown wrote:
Hello, What's the ConScript registry?
The ConScript registry (http://www.egt.ie/standards/csur/index.html) is a place where constructed/artificial scripts can be registered in a way that they can be publicly transferred (among those who recognize the encoding, of course.)
Does it have a formal relationship with Unicode? Sounds like something designed to be used with the Private Use Area?
It doesn't have a formal relationship with Unicode, although it is being done by John Cowan and Michael Everson. It consists of allocations of characters in the Private Use Area.
It also is a 'proof' that there won't be a huge surge of constructed characters in Unicode if you let Klingon or Tengwar in. There are roughly 2000 characters encoded in the BMP Private-Use area, with another 6,000 in Plane 16. Even accepting them all, that would fit easily in the space that hasn't even been tentatively allocated in Plane 1's roadmap.
Cowan and Everson have not been very picky about which scripts they included in the ConScript registry. Of those in the registry, I would guess only 8 (Tengwar, Cirth, Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have any claim to be added to Unicode. 78 columns, less than 624 characters to be added.
-- David Starner - [EMAIL PROTECTED] Pointless website: http://dvdeug.dhis.org
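
For readers unfamiliar with the Private Use Area discussed above, a small sketch may help show what "allocations of characters in the Private Use area" means in practice. The three ranges below are the private-use ranges defined by the Unicode Standard; the function names are illustrative only and are not part of any registry or library. Nothing here says which script a private-use code point represents. That is exactly the part left to agreement between sender and receiver.

```python
# The three private-use ranges defined by the Unicode Standard.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Plane 15
    (0x100000, 0x10FFFD),  # Plane 16
)


def is_private_use(ch):
    """Return True if the single character ch lies in a private-use range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


def private_use_chars(text):
    """List the private-use code points found in text, in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]
```

A receiver could call `private_use_chars(message)` to learn whether a text depends on a private agreement such as the ConScript registry before trying to display it.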
Re: ConScript registry?
On Tuesday, January 30, 2001, at 12:19 PM, David Starner wrote: Of those in the registry, I would guess only 8 (Tengwar, Cirth, Engsvanyali, Shavian, Solresol, Visible Speech, Aiha, and Klingon) have any claim to be added to Unicode. 78 columns, less than 624 characters to be added. Don't forget Deseret, which will, in fact, be part of Unicode 3.1. (Shavian has also been accepted by UTC for encoding; it's just that nobody has really pushed on it so it's languished.)
Re: ConScript registry?
On Tue, Jan 30, 2001 at 01:23:02PM -0800, Thomas Chan wrote:
I don't think that CSUR is conclusive proof that there wouldn't be a deluge of demands for encoding fictional or constructed scripts if the likes of Tengwar or Klingon were encoded.
This is real life; we don't get much conclusive proof around here.
If, say, a fictional script were included and published by Unicode and ISO, then people all over would suddenly be aware of the fact that a fictional script got included, and perhaps they might conclude that they should submit their own pet scripts as well.
"Their own pet scripts"? Since when do the works of the greatest fantasy author of the 20th century, used by thousands, become "pet scripts"? On a calmer note, how many script submissions do Unicode and the ISO 10646 working group get now? How about from people outside Unicode and the working group? What about outside the standards bodies? If my guess is right, there are very few submissions from outside Unicode, and really no evidence that this would pick up significantly after Tengwar or Klingon got encoded.
Many people with very real scripts that they use in their daily lives were not aware of Unicode or that it would benefit them to have them encoded; I suspect the same is true for creators of fictional and constructed scripts.
("very real" and "fictional and constructed" not being disjunctive, of course.) True. And?
For example, it is easy to find a variety of fonts for fantasy runes or other alphabets that people have created, some based off a description in published fiction, but they have not gotten in touch with CSUR.
But those are the marginal cases that Unicode doesn't need to worry about. They won't mess with Unicode, either. They aren't going to be interested or patient enough to fill out the forms.
Or take the case of the Hotsuma Tsutae syllabary, created in modern times to provide a fictional pre-Chinese writing system (http://www.jtc.co.jp/hotsuma/index-e.htm) for what is supposedly Old Japanese, which has books and articles published about it, and fonts in existence, but it has no contact with CSUR.
Unsurprisingly, the CSUR covers Western scripts better than Eastern ones. You probably know better than I do how many Eastern fictional scripts there are. Even with that, is it right not to encode one language that deserves it, because there may be more that deserve to be encoded, or for fear (on what evidence) of spurious submissions?
-- David Starner - [EMAIL PROTECTED] Pointless website: http://dvdeug.dhis.org
Re: ConScript registry?
In a message dated 2001-01-30 15:29:04 Pacific Standard Time, [EMAIL PROTECTED] writes: For example, it is easy to find a variety of fonts for fantasy runes or other alphabets that people have created, some based off a description in published fiction, but they have not gotten in touch with CSUR. But those are the marginal cases that Unicode doesn't need to worry about. They won't mess with Unicode, either. They aren't going to be interested or patient enough to fill out the forms. Right at the moment I am trying to get my own constructed script encoded in CSUR. Although I would be perfectly willing to fill out the paperwork required by Unicode and ISO/IEC JTC1/SC2/WG2 (it's not that much really), I would never actually do so because the script simply doesn't belong in Unicode/10646. They know it and I know it. There are plenty of differences between Unicode and CSUR, as others have mentioned. Unicode is well known and getting better known every day; CSUR is relatively obscure except to Unicode insiders, people whose scripts have already been encoded, and probably some who stumble across John's or Michael's web sites by chance. CSUR is intentionally focused on recently constructed scripts (as I have suggested before, all scripts are "artificial" or "constructed" but some gain wider acceptance than others) and so it naturally contains some scripts of extremely limited use that would not be candidates for encoding in Unicode. I trust Unicode and WG2 not to accept just any old script. Many of the "proposed" but not yet "registered" CSUR scripts were invented by one guy whose hobby is creating fantasy worlds, languages, and scripts. I figure my script is just as worthy even though there is no fantasy world created around it. -Doug Ewell Fullerton, California