Re: Swahili Banthu

2003-11-09 Thread Don Osborn
Catching up with this belatedly... Swahili, like a number of languages just south of the Sahara, was - and I would guess still is by some - written using Arabic characters (Ajami). The Latin alphabet is indeed now dominant (and certainly official) for Swahili, and it uses ASCII characters

RE: Hexadecimal digits?

2003-11-09 Thread Simon Butcher
Hi :) http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677 N2677 Proposal for six Hexadecimal digits Ricardo Cancho Niemietz - individual contribution 2003-10-21 snip Could be interesting for processing, and I can see a reason for keeping these unique from U+0041-U+0046 but ultimately I thought the
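
The point about U+0041-U+0046 doing double duty as letters and as hexadecimal digits can be illustrated with a small sketch. It assumes Python's standard unicodedata module; the helper name describe is hypothetical and is not taken from N2677.

```python
import unicodedata

# Hypothetical helper (not from N2677): report the general category and the
# digit value, if any, that the Unicode Character Database assigns to a character.
def describe(ch):
    return (unicodedata.category(ch), unicodedata.digit(ch, None))

# U+0041..U+0046 are the Latin capitals A..F that double as hex digits.
letters = [chr(cp) for cp in range(0x41, 0x47)]
info = {ch: describe(ch) for ch in letters}

# The character database treats them as uppercase letters with no digit value;
# only the parser's convention makes them digits in base 16.
value = int("BEEF", 16)

for ch, (category, digit) in info.items():
    print(ch, category, digit)
print(value)
```

In this sketch the hexadecimal reading of A to F lives entirely in the parsing convention of int(), not in the character properties, which seems to be the gap the proposal is aimed at.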

Re: Hexadecimal digits?

2003-11-09 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677 N2677 Proposal for six Hexadecimal digits Ricardo Cancho Niemietz - individual contribution 2003-10-21 snip Could be interesting for processing, and I can see a reason for keeping these unique from

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Philippe Verdy
From: Don Osborn [EMAIL PROTECTED] As for other African scripts, they are most notable in the western and northern parts of the continent. Tifinagh and N'ko are in the process of being encoded. I just had a conversation with someone the other day who recounted seeing a letter written in

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Michael Everson
At 15:53 +0100 2003-11-09, Philippe Verdy wrote: I was concerned recently by some people who wanted to better write the Tifinagh languages (such as Berber) with the Latin script (notably for North Africa, but also in Europe due to the important North African community, notably in France). Why?

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Peter Kirk
On 08/11/2003 17:09, Mark Davis wrote: I agree with the first part of your analysis. By the phrase "requesting ligation of combining characters" it is unclear to me what you mean, and whether that is the right solution to whatever problem you are referring to. Mark

RE: Hexadecimal digits?

2003-11-09 Thread Simon Butcher
Hi Philippe, http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677 N2677 Proposal for six Hexadecimal digits Ricardo Cancho Niemietz - individual contribution 2003-10-21 snip Could be interesting for processing, and I can see a reason for keeping these unique from U+0041-U+0046

Re: [hebrew] Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Peter Kirk
Philippe, I was deliberately making different threads for the main Unicode list and for the Hebrew list. Please keep them distinct. On 08/11/2003 17:15, Philippe Verdy wrote: I'm curious about what name you would give to it. The name COMBINING CHARACTER JOINER is already used... Where? It is

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] When we encode Tifinagh we will encode Tifinagh. We will not meta-encode it for ease of transliteration to other scripts. Yes that was the intent of my suggestion, I don't say that this must be done. But what would be wrong if a font was created for the

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Peter Kirk
On 08/11/2003 17:09, Mark Davis wrote: I agree with the first part of your analysis. By the phrase "requesting ligation of combining characters" it is unclear to me what you mean, and whether that is the right solution to whatever problem you are referring to. Mark

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Michael Everson
At 17:54 +0100 2003-11-09, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] When we encode Tifinagh we will encode Tifinagh. We will not meta-encode it for ease of transliteration to other scripts. Yes that was the intent of my suggestion, I don't say that this must be done. But

Re: Hexadecimal digits?

2003-11-09 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] However personally, when dealing with an octet, or an arbitrary number of octets, I believe the byte-pictures would be much easier to deal with (especially when dealing with a lot of raw data). Except that it would require 256 new codepoints, instead of

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] At 17:54 +0100 2003-11-09, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] When we encode Tifinagh we will encode Tifinagh. We will not meta-encode it for ease of transliteration to other scripts. Yes that was the intent of my

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Michael Everson
At 19:30 +0100 2003-11-09, Philippe Verdy wrote: So my question is, once again: would a font that would display pointed Latin glyphs from Tifinagh script code points really break the Unicode model? Yes, Philippe. It is the same thing as mapping Cyrillic to ASCII letters. It is a hack. It is to

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Don Osborn
Philippe, I thought I understood the intent of your first letter, but now I'm not sure. So let me back up and go over some basics as I understand them: 1) The Berber languages as we know are written with three scripts, Tifinagh, Arabic, and Latin. I've been given to understand that the

Tamil 0BB3 and 0BD7

2003-11-09 Thread Peter Jacobi
Dear List Members, I understand that characters of different scripts with equal appearance are dis-unified and have different Unicode codepoints, Latin E vs Greek U+0395 vs Cyrillic U+0415 being a typical example. I also understand that characters of one script having equal shapes in some fonts
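
As a rough illustration of the cross-script disunification described in this message, the sketch below looks up three look-alike capitals with Python's standard unicodedata module. The choice of characters is an assumption made for the example: Latin E (U+0045), Greek Epsilon (U+0395), and Cyrillic IE (U+0415), the Cyrillic letter that shares the shape of Latin E.

```python
import unicodedata

# Three capitals that look alike in most fonts but belong to different scripts.
lookalikes = ["\u0045", "\u0395", "\u0415"]

# Each has its own code point and its own character name.
names = [unicodedata.name(ch) for ch in lookalikes]

# Normalization does not merge them: each one is unchanged even under NFKC.
unchanged = [unicodedata.normalize("NFKC", ch) == ch for ch in lookalikes]

for ch, name in zip(lookalikes, names):
    print(f"U+{ord(ch):04X} {name}")
```

Since no normalization form folds these together, any matching across scripts has to be done deliberately by the application.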

Re: Berber/Tifinagh

2003-11-09 Thread Mark E. Shoulson
Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] At 17:54 +0100 2003-11-09, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] When we encode Tifinagh we will encode Tifinagh. We will not meta-encode it for ease of transliteration to other scripts.

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Mark Davis
Let's try to be clear on the terms. Look at the definition of combining sequences: D17 Combining character sequence: A character sequence consisting of either a base character followed by a sequence of one or more combining characters, or a sequence of one or more combining characters. Thus a
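
The definition quoted in this message can be turned into a small check. This is a minimal sketch, assuming Python's standard unicodedata module, and it follows only the wording of D17 as quoted here, not any later revision of the standard. The function names are hypothetical, and treating "combining character" as "general category M*" is a simplification made for the example.

```python
import unicodedata

# Simplifying assumption: a combining character is one whose general
# category is Mn, Mc or Me.
def is_combining(ch):
    return unicodedata.category(ch).startswith("M")

# Hypothetical checker for D17 as quoted: a base followed by one or more
# combining characters, or one or more combining characters on their own.
def is_combining_character_sequence(s):
    if not s:
        return False
    if is_combining(s[0]):
        tail = s
    else:
        # Simplification: format and control characters are not bases.
        if unicodedata.category(s[0]) in ("Cc", "Cf"):
            return False
        tail = s[1:]
    return len(tail) > 0 and all(is_combining(ch) for ch in tail)

ZWJ, ZWNJ, CGJ, ACUTE = "\u200d", "\u200c", "\u034f", "\u0301"

print(unicodedata.category(ZWJ), unicodedata.category(CGJ))
print(is_combining_character_sequence("a" + ACUTE))
print(is_combining_character_sequence("a" + ZWJ + ACUTE))
print(is_combining_character_sequence("a" + CGJ + ACUTE))
```

Under these assumptions ZWJ and ZWNJ (category Cf) break the sequence, while CGJ (category Mn, combining class 0) does not.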

Clarification, please, was Re: Berber/Tifinagh

2003-11-09 Thread Curtis Clark
on 2003-11-09 10:41 Michael Everson wrote: I am appalled. I thought you understood something about Unicode, Philippe. At this point, I'm a bit puzzled about the circumstances in which an alphabet is a cipher of another, and when it isn't. In an offlist conversation, you, I, and others seemed to

LAST Call for Papers- Unicode IUC25-March 2004- Washington, D.C., USA

2003-11-09 Thread Tex Texin
Only 1 week left to propose papers for the next Unicode Conference! Submissions are due Nov. 14. In addition to the conference's highly-regarded ensemble of up-to-date information on internationalization and Unicode best practices, this conference will additionally focus on solutions that address

Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Peter Kirk
On 09/11/2003 11:11, Mark Davis wrote: ... Thus a combining character sequence *cannot* contain a ZWJ or any other Cf. ... Such a sequence would not correspond to anything used in a natural language. Mark __ http://www.macchiato.com But does the Khmer

RE: Hexadecimal digits?

2003-11-09 Thread Simon Butcher
Hi Philippe, However personally, when dealing with an octet, or an arbitrary number of octets, I believe the byte-pictures would be much easier to deal with (especially when dealing with a lot of raw data). Except that it would require 256 new codepoints, instead of just 6 for the

Re: Tamil 0BB3 and 0BD7

2003-11-09 Thread Philippe Verdy
From: Peter Jacobi [EMAIL PROTECTED] U+0B95 U+0BCC which is canonically equivalent to U+0B95 U+0BC6 U+0BD7 looks exactly the same as U+0B95 U+0BC6 U+0BB3 Isn't that a bit odd? Giving an analogy using Latin script, that would be the same as if Latin y U+0079 in vocalic and consonantic
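
The canonical equivalence under discussion can be checked directly. This is a minimal sketch, assuming Python's standard unicodedata module; in the Unicode Character Database, TAMIL VOWEL SIGN AU (U+0BCC) decomposes to U+0BC6 U+0BD7, and the variable names are made up for the example.

```python
import unicodedata

# KA followed by the precomposed vowel sign AU.
ka_au = "\u0b95\u0bcc"

# Canonical decomposition splits U+0BCC into U+0BC6 (vowel sign E)
# and U+0BD7 (AU length mark).
decomposed = unicodedata.normalize("NFD", ka_au)

# The visually similar sequence uses the letter LLA (U+0BB3) instead of
# the AU length mark; it is a different string and stays different.
lookalike = "\u0b95\u0bc6\u0bb3"

print([f"U+{ord(ch):04X}" for ch in decomposed])
print(unicodedata.name("\u0bd7"), "/", unicodedata.name("\u0bb3"))
print(unicodedata.normalize("NFC", lookalike) == ka_au)
```

Normalization links only the first two spellings; the look-alike with U+0BB3 remains distinct in every normalization form, so telling them apart depends on the code points rather than on rendering.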

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] At 19:30 +0100 2003-11-09, Philippe Verdy wrote: So my question is, once again: would a font that would display pointed Latin glyphs from Tifinagh script code points really break the Unicode model? Yes, Philippe. It is the same thing as mapping

Transliterating font

2003-11-09 Thread Chris Jacobs
- Original Message - From: Philippe Verdy [EMAIL PROTECTED] To: Michael Everson [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Sunday, November 09, 2003 5:54 PM Subject: Re: Berber/Tifinagh (was: Swahili Banthu) From: Michael Everson [EMAIL PROTECTED] When we encode Tifinagh we

Re: [hebrew] Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] A starter sequence (defective or not) is then an unordered set of sequences of characters having the same combining class. The relative order of each element of this set has no semantic value, and does not influence the canonical equivalence of strings. On

Re: Hexadecimal digits?

2003-11-09 Thread Philippe Verdy
From: Simon Butcher [EMAIL PROTECTED] When dealing with protocol specifications, there's often a need for characters like these, too, since hex byte pictures are unambiguous. I have a DEC dumb terminal around here somewhere which also uses them when debugging control characters. I suppose you

Re: [hebrew] Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Peter Kirk
On 09/11/2003 14:04, Philippe Verdy wrote: From: Peter Kirk [EMAIL PROTECTED] A starter sequence (defective or not) is then an unordered set of sequences of characters having the same combining class. The relative order of each element of this set has no semantic value, and does not

Re: Transliterating font

2003-11-09 Thread Mark E. Shoulson
Chris Jacobs wrote: As long as the font is explicitly advertized as a 'font with built-in transliterator', as long as the people know that what you see is not what is in the text, this seems to me indeed a good idea. Would be nice for Klingon too :-) Got one already. Several, really.

Re: Transliterating font

2003-11-09 Thread Philippe Verdy
From: Chris Jacobs [EMAIL PROTECTED] As long as the font is explicitly advertized as a 'font with built-in transliterator', as long as the people know that what you see is not what is in the text, this seems to me indeed a good idea. Would be nice for Klingon too :-) And in fact it's quite

Re: Re: ZWJ, ZWNJ, CGJ and combination

2003-11-09 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] Not at all! Maybe with some supplementary markup my sentence will be clearer: A starter sequence (defective or not) is then an _unordered_ set of { _ordered_ sequences of { characters having the same combining class
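
The distinction drawn here, ordered within one combining class but unordered across classes, matches how canonical ordering behaves in practice. The sketch below assumes Python's standard unicodedata module; the example strings are made up for illustration and do not come from the thread.

```python
import unicodedata

ACUTE, GRAVE, DOT_BELOW = "\u0301", "\u0300", "\u0323"

# Different combining classes (230 above, 220 below): the relative order
# carries no meaning, so normalization sorts the marks by class.
mixed_a = "a" + ACUTE + DOT_BELOW
mixed_b = "a" + DOT_BELOW + ACUTE
mixed_equal = unicodedata.normalize("NFD", mixed_a) == unicodedata.normalize("NFD", mixed_b)

# Same combining class (both 230): the order is significant and preserved,
# so the two strings are not canonically equivalent.
same_a = "a" + ACUTE + GRAVE
same_b = "a" + GRAVE + ACUTE
same_equal = unicodedata.normalize("NFD", same_a) == unicodedata.normalize("NFD", same_b)

print(unicodedata.combining(ACUTE), unicodedata.combining(DOT_BELOW))
print(mixed_equal, same_equal)
```

Comparing normalized forms is the usual way to test canonical equivalence, which is why the first pair compares equal and the second does not.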

Re: Berber/Tifinagh

2003-11-09 Thread Curtis Clark
on 2003-11-09 17:07 John Hudson wrote: I've given a lot of thought to transliteration and transcription at the glyph level: Which comes back to the issue of ciphers. It would seem to me that glyph-level transliteration is the accepted behavior for ciphers (else we would actually have to

Re: Transliterating font

2003-11-09 Thread jameskass
Philippe Verdy wrote, And in fact it's quite simple to do it with OpenType composite fonts that can be built to refer to glyphs searched in another font: such a transliterator font would not need any glyph, and thus does not require to buy a licence for a commercial design ... ... which is

Re: Berber/Tifinagh (was: Swahili Banthu)

2003-11-09 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: This has nothing to do with encoding. You are harkening back to the hideous world of 8-bit font hacks of twenty years ago. and Philippe Verdy verdy underscore p at wanadoo dot fr responded: In fact that's exactly the opposite which may be

Re: Transliterating font

2003-11-09 Thread Chris Jacobs
Got one already. Several, really. Including one I quite like, which displays sort of ligatures for ch/gh/ng/tlh and small-caps for the capital letters (plus a descending S) and two different flavors of ampersand for the two ands in Klingon... For instant transliteration, it has its

Re: Clarification, please, was Re: Berber/Tifinagh

2003-11-09 Thread Doug Ewell
Curtis Clark jcclark at mockfont dot com wrote: If Philippe were correct about the one-to-one correspondence, wouldn't the Latin glyphs be a cipher of the Tifinagh? And thus a glyph choice rather than a script choice? Probably. But judging from the chart in

Re: Transliterating font

2003-11-09 Thread Chris Jacobs
How do you do the n g - ng ligature? Got it already. It goes in the same liga but in a different lookup.

Re: Hexadecimal digits?

2003-11-09 Thread Doug Ewell
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: From: Simon Butcher pickle at alien dot net dot au However personally, when dealing with an octet, or an arbitrary number of octets, I believe the byte-pictures would be much easier to deal with (especially when dealing with a lot of