Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] On 09/11/2003 14:55, Philippe Verdy wrote: ... And canonical normalization _guarantees_ to preserve *only* starter sequences (defective or not), but not necessarily combining character sequences (defective or not), and that's where care must be taken

Re: Transliterating font

2003-11-10 Thread John Hudson
At 09:01 PM 11/9/2003, Chris Jacobs wrote: I tried to make Open Type tables like that for the Zigan Trad font, but I did not get it working. How do you do the n g - ng ligature? You cannot just use liga because n g h should become n gh, not ng h, and there will not yet be much support for clig.

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Peter Jacobi
Hi Doug, All, Doug Ewell [EMAIL PROTECTED] wrote: [..] Second, disunifying y would cause untold mapping nightmares. [..] Not exactly nightmares, but the Tamil case does cause some mapping discomfort. About three out of five 7bit/8bit encodings for Tamil have these two Unicode codepoints

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Philippe Verdy
There's still a problem between these "clarified" definitions, introduced by D14: "a combining character is a graphic character" means it must be a graphic character, and this excludes character category "Cf". "Combining characters consist of all characters with the General Category values
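
The General Category values this message refers to can be inspected directly. The following is a minimal sketch, assuming Python's standard `unicodedata` module; it reports the data of whatever Unicode version ships with the interpreter, which is not necessarily Unicode 4.0, and the `describe` helper is a hypothetical name used only for illustration.

```python
import unicodedata

# Characters named in the thread subject (code points from the Unicode Standard).
CHARS = {
    "ZWNJ": "\u200C",
    "ZWJ": "\u200D",
    "CGJ": "\u034F",
}

def describe(ch):
    """Return (General Category, canonical combining class) for one character."""
    return unicodedata.category(ch), unicodedata.combining(ch)

for name, ch in CHARS.items():
    cat, ccc = describe(ch)
    print(f"{name} U+{ord(ch):04X}: category={cat}, combining class={ccc}")
```

ZWJ and ZWNJ report category Cf, while CGJ reports Mn with combining class 0, which is the distinction the message turns on.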

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators in Unicode.

RE: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Simon Butcher
Hi Philippe! When dealing with protocol specifications, there's often a need for characters like these, too, since hex byte pictures are unambiguous. I have a DEC dumb terminal around here somewhere which also uses them when debugging control characters. I suppose you could argue it's

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
Well, obviously I support this totally, since I suggested the same thing myself on this list earlier this year (see http://groups.yahoo.com/group/unicode/message/20789). I am 100% in favor of adding hex digits to Unicode. I speak as a programmer, and as a designer of software architecture.

RE: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Simon Butcher
BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators in Unicode. Else it would mean that

RE: Hexadecimal digits?

2003-11-10 Thread Michael Everson
At 09:19 + 2003-11-10, Jill Ramonsky wrote: Well, obviously I support this totally, since I suggested the same thing myself on this list earlier this year (see http://groups.yahoo.com/group/unicode/message/20789). I am 100% in favor of adding hex digits to Unicode. I speak as a programmer,

Re: Berber/Tifinagh

2003-11-10 Thread Michael Everson
At 17:47 -0800 2003-11-09, Curtis Clark wrote: What determines whether a script is a cipher of another? Whim? Theban was rejected because Books of Shadows are usually handwritten and private and there is no requirement to exchange data. As a cipher, it is easy to determine when Latin

RE: Hexadecimal digits?

2003-11-10 Thread Michael Everson
At 09:19 + 2003-11-10, Jill Ramonsky wrote: Well, obviously I support this totally, since I suggested the same thing myself on this list earlier this year (see http://groups.yahoo.com/group/unicode/message/20789). And Ken Whistler responded to this then:

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Philippe Verdy
From: Peter Jacobi [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Monday, November 10, 2003 9:51 AM Subject: Re: Tamil 0BB3 and 0BD7 Hi Doug, All, Doug Ewell [EMAIL PROTECTED] wrote: [..] Second, disunifying y would cause untold mapping nightmares. [..] Not exactly nightmares, but

RE: Hexadecimal digits?

2003-11-10 Thread jarkko.hietaniemi
Well, obviously I support this totally, since I suggested the same thing myself on this list earlier this year (see http://groups.yahoo.com/group/unicode/message/20789). I am 100% in favor of adding hex digits to Unicode. I speak as a programmer, and as a designer of software

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
-Original Message- From: Michael Everson [mailto:[EMAIL PROTECTED] Sent: Monday, November 10, 2003 9:56 AM To: [EMAIL PROTECTED] Subject: RE: Hexadecimal digits? There are oceans of data out there with ABCDEF used already. What do you propose to do about that? Nothing. This is not

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
My question went unanswered, so I'll ask it again - do I get a vote? How does one go about registering support for a proposal? I consider myself a relevant interested party, someone who believes that hex should collate before hex 1 in a natural sort. Is it possible to add support to

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] BTW, Frank also had other proposals which included the IBM 3270 characters I think you were referring to (poke around the directory at http://www.funet.fi/pub/kermit/ucsterminal/).. I am not proposing to encode all terminal function indicators

RE: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Kent Karlsson
Peter Kirk wrote: But does the Khmer script follow this rule? Please bear in mind that I know nothing about this script. But in TUS v4.0 10.4 p.281 I read: Ordering of Syllable Components. The standard order of components in an orthographic syllable as expressed in BNF is B {R | C} {S

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Kent Karlsson
Subject: Re: Tamil 0BB3 and 0BD7 The Indic length marks should be seen as encoding mistakes. Ideally, none of them should have been encoded, since they do not have an independent usage. Compare Khmer, which does not have decompositions for its glyph composite dependent vowels (one thing that's

Re: Hexadecimal digits?

2003-11-10 Thread Philippe Verdy
From: [EMAIL PROTECTED] I think the proposals either to have the six hexalphadigits or the sixteen hexdigits or the 256 bytedigits are doomed to have about as much usage as the equally well-intentioned Unicode LS and PS. The bad thing about the proposed 6 HEX digits, is that it assumes that

Re: Hexadecimal digits?

2003-11-10 Thread Philippe Verdy
From: Jill Ramonsky [EMAIL PROTECTED] How does one go about registering support for a proposal? I consider myself a relevant interested party, someone who believes that hex should collate before hex 1 in a natural sort. Is it possible to add support to a proposal, or do I just have to

Re: Hexadecimal digits?

2003-11-10 Thread Philippe Verdy
From: Jill Ramonsky [EMAIL PROTECTED] My question went unanswered, so I'll ask it again - do I get a vote? I think that you get a right to vote when you pay your subscription to be a full member of the UTC (or the UTC votes to invite you to become a member, as it feels your liaison membership

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Philippe Verdy
From: Kent Karlsson [EMAIL PROTECTED] The Indic length marks should be seen as encoding mistakes. Could they be documented officially as deprecated in favor of another character, by assigning them a compatibility decomposition mapping (I mean with compat in the UCD)?

Re: Clarification, please, was Re: Berber/Tifinagh

2003-11-10 Thread jon
At this point, I'm a bit puzzled about the circumstances in which an alphabet is a cipher of another, and when it isn't. In an offlist conversation, you, I, and others seemed to arrive at the consensus that the Theban magickal script was a cipher of Latin. And many years ago, you raised

RE: Hexadecimal digits?

2003-11-10 Thread Michael Everson
At 10:23 + 2003-11-10, Jill Ramonsky wrote: There are oceans of data out there with ABCDEF used already. What do you propose to do about that? Nothing. This is not my problem, and I find it irrelevant. That attitude is why it might be good that you don't have a vote. Even I, who have

Re: Berber/Tifinagh

2003-11-10 Thread jon
Quoting Michael Everson [EMAIL PROTECTED]: At 17:47 -0800 2003-11-09, Curtis Clark wrote: What determines whether a script is a cipher of another? Whim? Theban was rejected because Books of Shadows are usually handwritten and private and there is no requirement to exchange data. This

RE: Hexadecimal digits?

2003-11-10 Thread Kent Karlsson
Jill Ramonsky wrote: My question went unanswered, so I'll ask it again - do I get a vote? Not from me anyway. And I'm not too worried, this kind of proposal has been rejected on very good grounds before... So I would guess the proposal has zero chance/risk of being accepted. myself a

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Peter Kirk
On 09/11/2003 22:45, Philippe Verdy wrote: From: Peter Kirk [EMAIL PROTECTED] On 09/11/2003 14:55, Philippe Verdy wrote: ... And canonical normalization _guarantees_ to preserve *only* starter sequences (defective or not), but not necessarily combining character sequences (defective or

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Kent Karlsson
From: Kent Karlsson [EMAIL PROTECTED] The Indic length marks should be seen as encoding mistakes. Could they be documented officially as deprecated in favor of another character, by assigning them a compatibility decomposition mapping (I mean with compat in the UCD)? By now you

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 09/11/2003 19:18, John Hudson wrote: ... Any sign can be made a cipher by changing the signified. Writing systems are collections of conventional signs, which means that there is conventional agreement as to the signified. For example, the signifier 'A' is conventionally agreed by users of

RE: Hexadecimal digits?

2003-11-10 Thread Michael Everson
At 10:42 + 2003-11-10, Jill Ramonsky wrote: My question went unanswered, so I'll ask it again - do I get a vote? How does one go about registering support for a proposal? I consider myself a relevant interested party, someone who believes that hex should collate before hex 1 in a

Re: Clarification, please, was Re: Berber/Tifinagh

2003-11-10 Thread Peter Kirk
On 10/11/2003 03:38, [EMAIL PROTECTED] wrote: At this point, I'm a bit puzzled about the circumstances in which an alphabet is a cipher of another, and when it isn't. In an offlist conversation, you, I, and others seemed to arrive at the consensus that the Theban magickal script was a cipher

Re: Berber/Tifinagh

2003-11-10 Thread Peter Kirk
On 10/11/2003 01:51, Michael Everson wrote: At 17:47 -0800 2003-11-09, Curtis Clark wrote: What determines whether a script is a cipher of another? Whim? Theban was rejected because Books of Shadows are usually handwritten and private and there is no requirement to exchange data. As a

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Peter Jacobi
[EMAIL PROTECTED] wrote: The Indic length marks should be seen as encoding mistakes. Ideally, none of them should have been encoded, since they do not have an independent usage. [...] But in the case of Tamil, the non-existence of U+0BD7 wouldn't remove the disparity with written Tamil, it

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
Microsoft Windows XP does a pretty good job of natural sort order. For example a file called File99 will sort just before File100. File99A will slot between them, but File992 will go after them. It's all pretty much exactly what you'd expect. To sort File1 immediately after File could
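
The ordering described in this message can be reproduced with a generic natural-sort key. This is a minimal sketch under stated assumptions: it is not Windows XP's actual algorithm, `natural_key` is a hypothetical helper, and only runs of ASCII digits are treated as numbers.

```python
import re

def natural_key(name):
    """Build a sort key that compares digit runs by numeric value."""
    # With one capture group, re.split puts the digit runs at odd indices.
    parts = re.split(r"([0-9]+)", name)
    key = []
    for i, part in enumerate(parts):
        if part == "":
            continue
        if i % 2 == 1:
            key.append((0, int(part), ""))        # numeric chunk
        else:
            key.append((1, 0, part.casefold()))   # text chunk
    return key

names = ["File992", "File100", "File99A", "File99"]
print(sorted(names, key=natural_key))
# -> ['File99', 'File99A', 'File100', 'File992']
```

Under this key the trailing "A" in File99A is an ordinary letter following the number 99, which is why it lands between File99 and File100.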

Re: Berber/Tifinagh

2003-11-10 Thread Michael Everson
At 11:47 + 2003-11-10, [EMAIL PROTECTED] wrote: Theban was rejected because Books of Shadows are usually handwritten and private and there is no requirement to exchange data. This is increasingly untrue. This is not to say that the increase in willingness to allow Books of Shadows and

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Michael Everson
At 04:04 -0800 2003-11-10, Peter Kirk wrote: Languages formerly written in Cyrillic are now being written in Latin script with a one to one mapping. Proposals are in preparation for extra Hebrew characters used by particular communities for western languages which are more commonly written in

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Cowan
Peter Kirk scripsit: But when does an unconventional use become a new convention? If a particular community chooses to write English (for example) using e.g. Cyrillic or Hebrew characters, with a one to one mapping, are they using a cipher or are they transliterating? Does it depend on how

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Philippe Verdy
From: Kent Karlsson [EMAIL PROTECTED] From: Kent Karlsson [EMAIL PROTECTED] The Indic length marks should be seen as encoding mistakes. Could they be documented officially as deprecated in favor of another character, by assigning them a compatibility decomposition mapping (I mean

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] This does not affect my argument. A combining character sequence, as defined, does not perfectly fit your definition an unordered set of sequences of characters having the same combining class. But it is preserved under canonical normalisation. Well, perhaps
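
The role of combining classes in canonical normalization, which this exchange is about, can be seen in a small experiment. The sketch below assumes Python's standard `unicodedata` module; `nfd_codepoints` is a hypothetical helper, and the example strings are illustrative rather than taken from the thread.

```python
import unicodedata

def nfd_codepoints(text):
    """Return the NFD form of text as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", text)]

# Acute (class 230) followed by dot below (class 220): different classes,
# so canonical ordering moves the lower class first.
mixed = "a\u0301\u0323"
# Acute followed by grave: both class 230, so their relative order is kept.
same = "a\u0301\u0300"

print(nfd_codepoints(mixed))  # -> ['U+0061', 'U+0323', 'U+0301']
print(nfd_codepoints(same))   # -> ['U+0061', 'U+0301', 'U+0300']
```

Marks with different non-zero combining classes may be reordered, while marks sharing a class keep their original order.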

Re: Hexadecimal digits?

2003-11-10 Thread Philippe Verdy
From: Jill Ramonsky [EMAIL PROTECTED] Microsoft Windows XP does a pretty good job of natural sort order. For example a file called File99 will sort just before File100. File99A will slot between them, but File992 will go after them. It's all pretty much exactly what you'd expect. To

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Kent Karlsson
Peter Jacobi wrote: U+0B95 U+0BC6 U+0BB3 and U+0B95 U+0BCC are indistinguishable in written Tamil. Then there is either a true ambiguity (perhaps resolvable by context), or one or the other is a spelling mistake (and just an apparent ambiguity). Compare again Khmer, where the register shift

RE: Hexadecimal digits?

2003-11-10 Thread Kent Karlsson
Jill Ramonsky wrote: example a file called File99 will sort just before File100. File99A will slot between them, but File992 will go after them. It's all pretty much exactly what you'd expect. Really!? Would you not expect 99A to come (much) after 99?! After all, 99A (in hexadecimal) is

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
Jill Ramonsky wrote: example a file called File99 will sort just before File100. File99A will slot between them, but File992 will go after them. It's all pretty much exactly what you'd expect. Really!? Would you not expect 99A to come (much) after 99?! After all, 99A (in hexadecimal) is

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Kent Karlsson
Philippe Verdy wrote: The decompositions cannot be changed. Is it true for compatibility decomposition? When I look at the Unicode stability policy, I thought it only meant the canonical mappings, or the fact that a canonical mapping cannot be changed to a compatibility mapping or the

RE: Hexadecimal digits?

2003-11-10 Thread Kent Karlsson
After all, 99A (in hexadecimal) is greater than 99 (hexadecimal). Oops. I missed the 2 key. E.g: After all, 99A (in hexadecimal) is greater than 992 (hexadecimal). Sorry (both about missing the 2 and that your argument doesn't work) /kent k
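
The corrected comparison can be checked mechanically. This is a minimal sketch; `compare_as_hex` and `leading_decimal` are hypothetical helpers introduced only to contrast a hexadecimal reading with the decimal-run reading a natural sort would apply.

```python
import re

def compare_as_hex(a, b):
    """Return 1, 0 or -1 comparing two strings read as hexadecimal numbers."""
    x, y = int(a, 16), int(b, 16)
    return (x > y) - (x < y)

def leading_decimal(s):
    """Return the value of the leading run of ASCII decimal digits."""
    return int(re.match(r"[0-9]+", s).group())

print(compare_as_hex("99A", "992"))                    # -> 1
print(leading_decimal("99A"), leading_decimal("992"))  # -> 99 992
```

Read as hexadecimal, 99A is the larger value; read as a decimal run followed by a letter, 99A starts with 99 and so sorts before 992.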

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Cowan
[EMAIL PROTECTED] scripsit: That would not describe the current use Theban (when it offers no real secrecy, and when most occultists are aware of modern computer-based encryption). The intention of secrecy is not the same thing, obviously, as actual secrecy, as too many have found out to

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread jon
Quoting John Cowan [EMAIL PROTECTED]: [EMAIL PROTECTED] scripsit: That would not describe the current use Theban (when it offers no real secrecy, and when most occultists are aware of modern computer-based encryption). The intention of secrecy is not the same thing, obviously, as

RE: Hexadecimal digits?

2003-11-10 Thread Jim Allan
Jill Ramonsky posted: However, File99A (where A is a hex digit) should sort (much) after both File99A (where A is a letter) and File100. The only way you can tell File99(letter)A apart from File99(digit)A is by giving the two As different codepoints. And the only way you can tell 7 decimal from

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Mark Davis
This is unpleasant; I wish I had taken a closer look at the structure for Khmer before it went in, because it is very problematic. At this point the UTC will have to take up this topic and figure out what to do. Mark __ http://www.macchiato.com - Original

Re: Hexadecimal digits?

2003-11-10 Thread Mark Davis
I agree -- this is pointless. The UTC has discussed this before, and I don't think there is any chance that the UTC would add either: (a) made-up hexadecimal digits that differ in shape from A-F, or (b) glyphic clones of A-F that were hexadecimal digits. Mark __

Line breaking with space followed by RLM/LRM/ZWJ/ZWNJ, and another TR14 issue

2003-11-10 Thread Peter Kirk
Some issues with TR14: 1) The version linked to from http://www.unicode.org/versions/Unicode4.0.0/ is an old version, http://www.unicode.org/reports/tr14/tr14-13.html. 2) I note from the latest version of TR14 (http://www.unicode.org/reports/tr14/) and the line breaking data

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Doug Ewell
Peter Jacobi peter underscore jacobi at gmx dot net wrote: So in effect, Unicode handling of this case, may actually change Tamil use - I've already seen proposals to a script reform dis-unifying the glyphs. Let's make sure we don't get started down that path. There is a Tamil script reform

Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Alexander Savenkov
Hello, 2003-11-09T21:41:25Z Michael Everson [EMAIL PROTECTED] wrote: At 19:30 +0100 2003-11-09, Philippe Verdy wrote: So my question is, once again: would a font that would display pointed Latin glyphs from Tifinagh script code points really break the Unicode model? Yes, Philippe. It is the

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
Sorry, but I have to correct you. You state below that "[my] argument doesn't work". This is slightly confusing because I haven't proposed any arguments, beyond that I support the inclusion into Unicode of hex digits which are distinct from the letters A to Z. I can only assume you are

RE: Hexadecimal digits?

2003-11-10 Thread Jill Ramonsky
I am not the one who has not thought it through. There _is_ no difference between decimal 7 and hex 7. They are the same digit. File777 sorts before File999 in _ALL_ radices. Jill -Original Message- From: Jim Allan [mailto:[EMAIL PROTECTED] Sent: Monday, November 10, 2003 3:29 PM

Re: Hex-byte pictures (WAS: RE: Hexadecimal digits?)

2003-11-10 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: - the attachment symbol (trombonne in French, Brobriefklammer in German, I don't know the term in English), Paper clip. -Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/

Re: Tamil 0BB3 and 0BD7

2003-11-10 Thread Peter Jacobi
Hi Doug, All, Doug Ewell [EMAIL PROTECTED] wrote in response to me: So in effect, Unicode handling of this case, may actually change Tamil use - I've already seen proposals to a script reform dis-unifying the glyphs. Let's make sure we don't get started down that path. There is a Tamil

RE: Tamil 0BB3 and 0BD7

2003-11-10 Thread Jungshik Shin
On Mon, 10 Nov 2003, Kent Karlsson wrote: Philippe Verdy wrote: Is it true for compatibility decomposition? When I look at the Unicode stability policy, I thought it only meant the canonical mappings, or Philippe, I wish you were right about this so that at least we could reinstate

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread John Cowan
Alexander Savenkov scripsit: I'm not sure I'm not taking your words out of the context, Michael. You are. Michael is complaining not about transliteration as such, but about instant transliteration by font substitution. -- John Cowan [EMAIL PROTECTED] www.ccil.org/~cowan

Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Michael Everson
At 17:36 +0300 2003-11-10, Alexander Savenkov wrote: The Wrong Thing To Do can be seen everywhere in the newspapers when the names and some other words originally written in Cyrillic and other scripts are letter-by-letter (mapped?) transliterated to the resulting script. That's transliteration,

Re: Hexadecimal digits?

2003-11-10 Thread Doug Ewell
Jill Ramonsky Jill dot Ramonsky at aculab dot com wrote: I am not the one who has not thought it through. There _is_ no difference between decimal 7 and hex 7. They are the same digit. File777 sorts before File999 in _ALL_ radices. No, what Jim said was: And the only way you can tell 7

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 10/11/2003 04:50, Michael Everson wrote: At 04:04 -0800 2003-11-10, Peter Kirk wrote: Languages formerly written in Cyrillic are now being written in Latin script with a one to one mapping. Proposals are in preparation for extra Hebrew characters used by particular communities for western

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Peter Kirk
On 10/11/2003 05:29, Philippe Verdy wrote: I did not say the opposite (that normalization could change semantics). But normalization does not work at the combining character sequence level but at the starter sequence level, ... No, it does not, in the same sense that word level processing does

Re: Handy table of combining character classes

2003-11-10 Thread Andrew C. West
On Fri, 7 Nov 2003 14:57:51 -0500, John Cowan wrote: Here's a little table of the combining classes, showing the value, the number of characters in the class, and a handy name (typically the one used in the Unicode Standard, or a CODE POINT NAME if there is only one; sometimes of my own
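
A tally of characters per combining class, like the one the quoted table describes, can be generated from the character database. The sketch below assumes Python's standard `unicodedata` module and a hypothetical `combining_class_counts` helper; the totals depend on the Unicode version bundled with the interpreter and will not necessarily match figures quoted for Unicode 4.0.

```python
import unicodedata
from collections import Counter

def combining_class_counts():
    """Count code points per non-zero canonical combining class."""
    counts = Counter()
    for cp in range(0x110000):
        ccc = unicodedata.combining(chr(cp))
        if ccc != 0:          # class 0 covers everything that is not reordered
            counts[ccc] += 1
    return counts

counts = combining_class_counts()
for value, n in sorted(counts.items()):
    print(f"class {value:3d}: {n} characters")
print("total with non-zero class:", sum(counts.values()))
```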

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Michael Everson
At 09:13 -0800 2003-11-10, Peter Kirk wrote: On 10/11/2003 04:50, Michael Everson wrote: At 04:04 -0800 2003-11-10, Peter Kirk wrote: Languages formerly written in Cyrillic are now being written in Latin script with a one to one mapping. Proposals are in preparation for extra Hebrew characters

Re: Berber/Tifinagh

2003-11-10 Thread Curtis Clark
on 2003-11-10 04:17 Michael Everson wrote: It still remains the case that Theban orthography is basically English, that is, it is Latin with funny glyphs. Why isn't Latin Serbian just Cyrillic Serbian with funny glyphs? I'm not trying to be intentionally dense here; Theban English and Serbian

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-10 Thread Peter Kirk
On 10/11/2003 09:09, Philippe Verdy wrote: ... Time to publish a public review for change of category of ZWJ and ZWNJ from Cf to Mn, so that they finally become acceptable within combining sequences? Maybe. Interesting that they are already treated as combining characters for line breaking

Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread John Hudson
At 06:36 AM 11/10/2003, Alexander Savenkov wrote: Yes, Philippe. It is the same thing as mapping Cyrillic to ASCII letters. It is a hack. It is to be avoided. It is the Wrong Thing To Do. I'm not sure I'm not taking your words out of the context, Michael. The Wrong Thing To Do can be seen

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 10/11/2003 10:36, John Hudson wrote: ... Well, Tifinagh is not a cipher and writing Tifinagh with a Latin cipher is a bad idea. But things like bidi properties are only an issue if you are employing a cipher at the glyph level. I've already explained why I think ciphers, masquerading and

Re: Berber/Tifinagh

2003-11-10 Thread Michael Everson
At 10:14 -0800 2003-11-10, Curtis Clark wrote: Why isn't Latin Serbian just Cyrillic Serbian with funny glyphs? Because Latin and Serbian are self-evidently different scripts. I'm not trying to be intentionally dense here; Theban English and Serbian are different in many ways. But are there

RE: Hexadecimal digits?

2003-11-10 Thread Jim Allan
Jill Ramonsky posted: I am not the one who has not thought it through. There _is_ no difference between decimal 7 and hex 7. They are the same digit. File777 sorts before File999 in _ALL_ radices. Exactly. So mixed hex and mixed decimal will not sort or compare properly using a natural sort

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Hudson
At 09:13 AM 11/10/2003, Peter Kirk wrote: So, if Masonic Samaritan script texts (no intention of secrecy there, by the way) should be encoded as a cipher of Latin and not with the Unicode Samaritan script, does that imply that Azerbaijani Latin texts should be encoded as a cipher or

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 10/11/2003 10:21, Michael Everson wrote: At 09:13 -0800 2003-11-10, Peter Kirk wrote: On 10/11/2003 04:50, Michael Everson wrote: At 04:04 -0800 2003-11-10, Peter Kirk wrote: Languages formerly written in Cyrillic are now being written in Latin script with a one to one mapping. Proposals

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Patrick Andries
- Original message - Philippe Verdy wrote: I was concerned recently by some people who wanted to better write the Tifinagh languages Stricto sensu, there are no Tifinagh languages, but languages (or dialects of the Berber language) written with the Tifinagh script. (such as

Re: Handy table of combining character classes

2003-11-10 Thread John Cowan
Andrew C. West scripsit: 589 ? Aren't all characters that are not 1-240 Combining Class 0 (i.e. Spacing, split, enclosing, reordrant, and Tibetan subjoined) ? 235,617 (including 2,048 surrogate code points) by my reckoning. Yes. I was enumerating only the combining characters, however. --

Re: Hexadecimal digits?

2003-11-10 Thread Curtis Clark
on 2003-11-10 07:28 Jim Allan wrote: And the only way you can tell 7 decimal from 7 hex is by giving 7 to different code points, that is File777 in hex should sort after File999 in decimal. The CSS guru Eric Meyer noted that Ohio license plates translate as hex RGB colors, mostly purple:

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Michael Everson
At 11:20 -0800 2003-11-10, Peter Kirk wrote: Who knows? You adduce no evidence. There is not much point in producing evidence if there are no agreed criteria. OK. In the absence of criteria the suspicion remains that decisions e.g. not to encode Theban and Klingon are purely subjective. OK. --

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Patrick Andries
From: Patrick Andries [EMAIL PROTECTED] - Original message - Philippe Verdy wrote: I was concerned recently by some people who wanted to better write the Tifinagh languages Stricto sensu, there are no Tifinagh languages, but languages (or dialects of the Berber language)

RE: Hexadecimal digits?

2003-11-10 Thread Murray Sargent
An important part of Ricardo Niemietz's hex digit proposal (http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2677) is to have columns of hexadecimal numbers line up properly as columns of decimal numbers do. This could be achieved using a font with a set of glyph variants for A-F with a hexadecimal

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Hudson
At 10:49 AM 11/10/2003, Peter Kirk wrote: Agreed. But if you want to write English with the Theban script, as there are no Theban characters? Or what if you want to write English with the RTL version of the Theban script which I found mentioned at http://catb.org/~esr/unicode/theban/? That

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Michael Everson
At 10:49 -0800 2003-11-10, Peter Kirk wrote: Agreed. But if you want to write English with the Theban script, as there are no Theban characters? So far we have not seen evidence that the Theban script is other than a cypher for the Latin script.. Or what if you want to write English with the

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 10/11/2003 12:53, Michael Everson wrote: At 10:49 -0800 2003-11-10, Peter Kirk wrote: Agreed. But if you want to write English with the Theban script, as there are no Theban characters? So far we have not seen evidence that the Theban script is other than a cypher for the Latin script..

Re: Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] At 17:36 +0300 2003-11-10, Alexander Savenkov wrote: The Wrong Thing To Do can be seen everywhere in the newspapers when the names and some other words originally written in Cyrillic and other scripts are letter-by-letter (mapped?) transliterated to the

Re: Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Philippe Verdy
From: John Hudson [EMAIL PROTECTED] At 06:36 AM 11/10/2003, Alexander Savenkov wrote: Yes, Philippe. It is the same thing as mapping Cyrillic to ASCII letters. It is a hack. It is to be avoided. It is the Wrong Thing To Do. I'm not sure I'm not taking your words out of the context,

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Don Osborn
Patrick's message on this topic gets to the heart of the issue of why to encode Tifinagh (as Tifinagh) in the first place. But I think that Philippe's sentiment is not misplaced, if one approaches transliteration on the character and not the glyph level, as John and others put it. But

Re: Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread John Hudson
At 02:22 PM 11/10/2003, Philippe Verdy wrote: In the case of Berber this is not true: it is the same language written with 2 scripts (actually 3 as Arabic is also used). The mapping is not perfect for now, but there are works to correct this and adopt a single convention in each script (but with

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Philippe Verdy
From: Patrick Andries [EMAIL PROTECTED] This makes no sense : the modern use of the Tifinagh script cannot be another script... You may have meant the modern day script used for the berber language. This is highly disputable (Morocco just started teaching Tifinagh in its schools and they are

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Patrick Andries
- Message d'origine - De: Don Osborn [EMAIL PROTECTED] I've thought for instance about the small number of schools here in Niger that teach in Tamajak, using the Latin based script and how easy it will or will not be for the students to make the connections with the Tifinagh that

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Hudson
At 01:57 PM 11/10/2003, Peter Kirk wrote: Define cypher, or cipher, and I will either provide evidence that the Theban script is not one or accept that, on your definition, it is one. In the absence of a definition this discussion is meaningless. Similarly if the definition is simply a whim as

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Patrick Andries
- Original Message - De: Philippe Verdy [EMAIL PROTECTED] This is the role of diacritics and symbols added to the target script, so that no information from the text written in the source script is lost. Yes, I know this but you cannot go from Berber written in Arabic to Tifinagh or

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Kenneth Whistler
Philippe Verdy wrote: You seem to forget that Tifinagh is not a unified script, but a set of separate scripts where the same glyphs are used with distinct semantic functions. I think Philippe is running off the rails here. Tifinagh is a script. It comes in a number of local varieties,

Re: Re[2]: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Michael Everson
I am not going to argue with you about Tifinagh, Philippe. -- Michael Everson * * Everson Typography * * http://www.evertype.com

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Michael Everson
At 13:57 -0800 2003-11-10, Peter Kirk wrote: So far we have not seen evidence that the Theban script is other than a cypher for the Latin script.. Define cypher, or cipher, and I will either provide evidence that the Theban script is not one or accept that, on your definition, it is one. This

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Michael Everson
Don, Berber is often written in Tifinagh without vowels. And sometimes with vowels. And the same in Arabic. There is no point worrying (without it even being encoded) about Latin transliteration standards for it at this point. -- Michael Everson * * Everson Typography * *

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Michael Everson
At 23:33 +0100 2003-11-10, Philippe Verdy wrote: You seem to forget that Tifinagh is not a unified script, but a set of separate scripts What? where the same glyphs are used with distinct semantic functions. We haven't decided what kind of unification is appropriate for Tifinagh entities yet.

Please help knock my FAQ into shape

2003-11-10 Thread Theodore H. Smith
Hi list, I have a FAQ on Unicode, for REALbasic programmers, at www.elfdata.com/plugin/unicodefaq.html Much of the information there isn't stuff you are familiar with, because it's all about REALbasic. However, much of it is, because it is about Unicode. Basically, I'm hoping people can see

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Philippe Verdy
From: Kenneth Whistler [EMAIL PROTECTED] Rather than encode a half dozen different scripts for this, one for each local orthographic tradition, the entire script was carefully unified to enable representation of any of the local varieties accurately with the overall script encoding. I suspect

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-10 Thread Patrick Andries
- Message d'origine - De: Philippe Verdy [EMAIL PROTECTED] From: Patrick Andries [EMAIL PROTECTED] In this condition, why couldn't Latin glyphs be among these, when they already have the merit of covering the whole abstract character set covered by all scripts in the

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread John Hudson
At 04:32 PM 11/10/2003, Kenneth Whistler wrote: And because there is consensus in both committees that encoding of the potentially very large number of arbitrary ciphers of Latin letters (and other scripts as well) is *not* appropriate for Unicode. Attempting to fix an approximate number to the

Re: Ciphers (Was: Berber/Tifinagh)

2003-11-10 Thread Peter Kirk
On 10/11/2003 14:53, John Hudson wrote: At 01:57 PM 11/10/2003, Peter Kirk wrote: Define cypher, or cipher, and I will either provide evidence that the Theban script is not one or accept that, on your definition, it is one. In the absence of a definition this discussion is meaningless.
