Catching up with this belatedly...
Swahili, like a number of languages just south of the Sahara, was - and I
would guess still is by some - written using Arabic characters (Ajami). The
Latin alphabet is indeed now dominant (and certainly official) for
Swahili, and it uses ASCII characters
Hi :)
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677
N2677
Proposal for six Hexadecimal digits
Ricardo Cancho Niemietz - individual contribution
2003-10-21
snip
Could be interesting for processing, and I can see a reason for keeping
these unique from U+0041-U+0046 but ultimately I thought the
From: Simon Butcher [EMAIL PROTECTED]
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677
N2677
Proposal for six Hexadecimal digits
Ricardo Cancho Niemietz - individual contribution
2003-10-21
snip
Could be interesting for processing, and I can see a reason for keeping
these unique from
From: Don Osborn [EMAIL PROTECTED]
As for other African scripts, they are most notable in the western
and northern parts of the continent. Tifinagh and N'ko are in the
process of being encoded. I just had a conversation with someone
the other day who recounted seeing a letter written in
At 15:53 +0100 2003-11-09, Philippe Verdy wrote:
I was concerned recently by some people who wanted to better write the
Tifinagh languages (such as Berber) with the Latin script (notably for North
Africa, but also in Europe due to the important North African community,
notably in France).
Why?
On 08/11/2003 17:09, Mark Davis wrote:
I agree with the first part of your analysis. By the phrase "requesting ligation
of combining characters" it is unclear to me what you mean, and whether that is
the right solution to whatever problem you are referring to.
Mark
Hi Philippe,
http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2677
N2677
Proposal for six Hexadecimal digits
Ricardo Cancho Niemietz - individual contribution
2003-10-21
snip
Could be interesting for processing, and I can see a reason
for keeping
these unique from U+0041-U+0046
Philippe, I was deliberately making different threads for the main
Unicode list and for the Hebrew list. Please keep them distinct.
On 08/11/2003 17:15, Philippe Verdy wrote:
I'm curious about what name you would give to it.
The name COMBINING CHARACTER JOINER is already used...
Where? It is
From: Michael Everson [EMAIL PROTECTED]
When we encode Tifinagh we will encode Tifinagh. We will not
meta-encode it for ease of transliteration to other scripts.
Yes that was the intent of my suggestion, I don't say that this must be
done. But what would be wrong if a font was created for the
On 08/11/2003 17:09, Mark Davis wrote:
I agree with the first part of your analysis. By the phrase "requesting ligation
of combining characters" it is unclear to me what you mean, and whether that is
the right solution to whatever problem you are referring to.
Mark
At 17:54 +0100 2003-11-09, Philippe Verdy wrote:
From: Michael Everson [EMAIL PROTECTED]
When we encode Tifinagh we will encode Tifinagh. We will not
meta-encode it for ease of transliteration to other scripts.
Yes that was the intent of my suggestion, I don't say that this must
be done. But
From: Simon Butcher [EMAIL PROTECTED]
However personally, when dealing with an octet, or an arbitrary number
of octets, I believe the byte-pictures would be much easier to deal with
(especially when dealing with a lot of raw data).
Except that it would require 256 new codepoints, instead of
From: Michael Everson [EMAIL PROTECTED]
At 17:54 +0100 2003-11-09, Philippe Verdy wrote:
From: Michael Everson [EMAIL PROTECTED]
When we encode Tifinagh we will encode Tifinagh. We will not
meta-encode it for ease of transliteration to other scripts.
Yes that was the intent of my
At 19:30 +0100 2003-11-09, Philippe Verdy wrote:
So my question is, once again: would a font that would display pointed Latin
glyphs from Tifinagh script code points really break the Unicode model?
Yes, Philippe. It is the same thing as mapping Cyrillic to ASCII
letters. It is a hack. It is to
Philippe, I thought I understood the intent of your first letter, but now
I'm not sure. So let me back up and go over some basics as I understand
them:
1) The Berber languages as we know are written with three scripts, Tifinagh,
Arabic, and Latin. I've been given to understand that the
Dear List Members,
I understand that characters of different scripts with
identical appearance are disunified and have different
Unicode code points; Latin E (U+0045) vs Greek U+0395 vs
Cyrillic U+0415 is a typical example.
I also understand that characters of one script having
equal shapes in some fonts
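
The disunification described here can be checked directly. The following is only a sketch using Python's standard unicodedata module; the names `lookalikes` and `describe` are illustrative and not from the thread. Note that the Cyrillic letter resembling Latin E is U+0415 (U+0414 is CYRILLIC CAPITAL LETTER DE).

```python
import unicodedata

# Three capital letters that look alike in most fonts
# but are encoded separately, one per script.
lookalikes = ["\u0045", "\u0395", "\u0415"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ch in lookalikes:
    print(describe(ch))
```

Each letter reports its own code point and its own script-specific character name, even though the glyphs are usually indistinguishable.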
Philippe Verdy wrote:
From: Michael Everson [EMAIL PROTECTED]
At 17:54 +0100 2003-11-09, Philippe Verdy wrote:
From: Michael Everson [EMAIL PROTECTED]
When we encode Tifinagh we will encode Tifinagh. We will not
meta-encode it for ease of transliteration to other scripts.
Let's try to be clear on the terms.
Look at the definition of combining sequences:
D17 Combining character sequence: A character sequence consisting of either a
base character followed by a sequence of one or more combining characters, or a
sequence of one or more combining characters.
Thus a
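
For readers who want to experiment with the D17 wording quoted above, here is a rough sketch in Python. It assumes that "combining character" means general category M (Mn, Mc, Me) and that a "base character" is any other graphic character (categories L, N, P, S, Zs); the function names are hypothetical, and other versions of the standard may word the definition differently.

```python
import unicodedata

def is_combining(ch):
    # Assumption: combining characters are those with general category Mn, Mc or Me.
    return unicodedata.category(ch).startswith("M")

def is_base(ch):
    # Assumption: a base character is a graphic character that is not a combining mark.
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"

def is_combining_character_sequence(s):
    """Check a string against the D17 wording quoted in this message."""
    if not s:
        return False
    if is_combining(s[0]):
        # Defective case: one or more combining characters with no base.
        return all(is_combining(c) for c in s)
    # Otherwise: a base character followed by at least one combining character.
    return is_base(s[0]) and len(s) >= 2 and all(is_combining(c) for c in s[1:])

print(is_combining_character_sequence("e\u0301"))        # base + COMBINING ACUTE ACCENT
print(is_combining_character_sequence("e\u200d\u0301"))  # ZWJ (category Cf) in the middle
```

Under this reading, a sequence containing ZERO WIDTH JOINER is rejected because ZWJ has category Cf and is neither a base nor a combining character.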
on 2003-11-09 10:41 Michael Everson wrote:
I am appalled. I thought you understood something about Unicode, Philippe.
At this point, I'm a bit puzzled about the circumstances in which an
alphabet is a cipher of another, and when it isn't. In an offlist
conversation, you, I, and others seemed to
Only 1 week left to propose papers for the next Unicode Conference!
Submissions are due Nov. 14.
In addition to the conference's highly-regarded ensemble of
up-to-date information on internationalization and Unicode best practices,
this conference will additionally focus on solutions that address
On 09/11/2003 11:11, Mark Davis wrote:
...
Thus a combining character sequence *cannot* contain a ZWJ or any other Cf.
... Such a sequence would not correspond to anything used in a natural
language.
Mark
__
http://www.macchiato.com
But does the Khmer
Hi Philippe,
However personally, when dealing with an octet, or an
arbitrary number
of octets, I believe the byte-pictures would be much easier
to deal with
(especially when dealing with a lot of raw data).
Except that it would require 256 new codepoints, instead of
just 6 for the
From: Peter Jacobi [EMAIL PROTECTED]
U+0B95 U+0BCC which is canonically equivalent to
U+0B95 U+0BC6 U+0BD7
looks exactly the same as
U+0B95 U+0BC6 U+0BB3
Isn't that a bit odd?
Giving an analogy using Latin script,
that would be the same as if Latin y U+0079
in vocalic and consonantic
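
The Tamil sequences in question can be inspected with a short Python sketch (standard unicodedata module only; the variable names are illustrative). As far as the Unicode Character Database records it, U+0BCC TAMIL VOWEL SIGN AU decomposes canonically to U+0BC6 U+0BD7, so those are the code points used here.

```python
import unicodedata

# KA followed by the precomposed vowel sign AU.
ka_au = "\u0b95\u0bcc"

# Canonical decomposition splits the vowel sign into its two parts.
decomposed = unicodedata.normalize("NFD", ka_au)
code_points = [f"U+{ord(c):04X}" for c in decomposed]
print(code_points)

# KA + vowel sign E + letter LLA: renders alike, but is a different string.
lookalike = "\u0b95\u0bc6\u0bb3"
print(unicodedata.normalize("NFD", lookalike) == decomposed)
```

The two strings remain distinct after normalization, which is the oddity being pointed out: they look the same but are not canonically equivalent.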
From: Michael Everson [EMAIL PROTECTED]
At 19:30 +0100 2003-11-09, Philippe Verdy wrote:
So my question is, once again: would a font that would display pointed
Latin
glyphs from Tifinagh script code points really break the Unicode model?
Yes, Philippe. It is the same thing as mapping
- Original Message -
From: Philippe Verdy [EMAIL PROTECTED]
To: Michael Everson [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Sunday, November 09, 2003 5:54 PM
Subject: Re: Berber/Tifinagh (was: Swahili Banthu)
From: Michael Everson [EMAIL PROTECTED]
When we encode Tifinagh we
From: Peter Kirk [EMAIL PROTECTED]
A starter sequence (defective or not) is then an unordered set of
sequences
of characters having the same combining class. The relative order of each
element of this set has no semantic value, and does not influence the
canonical equivalence of strings. On
From: Simon Butcher [EMAIL PROTECTED]
When dealing with protocol specifications, there's often a need for
characters like these, too, since hex byte pictures are unambiguous. I have
a DEC dumb terminal around here somewhere which also uses them when
debugging control characters.
I suppose you
On 09/11/2003 14:04, Philippe Verdy wrote:
From: Peter Kirk [EMAIL PROTECTED]
A starter sequence (defective or not) is then an unordered set of
sequences
of characters having the same combining class. The relative order of each
element of this set has no semantic value, and does not
Chris Jacobs wrote:
As long as the font is explicitly advertized as a 'font with built-in
transliterator', as long as the people know that what you see is not what is
in the text, this seems to me indeed a good idea.
Would be nice for Klingon too :-)
Got one already. Several, really.
From: Chris Jacobs [EMAIL PROTECTED]
As long as the font is explicitly advertized as a 'font with built-in
transliterator', as long as the people know that what you see is not what
is
in the text, this seems to me indeed a good idea.
Would be nice for Klingon too :-)
And in fact it's quite
From: Peter Kirk [EMAIL PROTECTED]
Not at all! Maybe with some supplementary markup my sentence
will be clearer:
A starter sequence (defective or not) is then an
_unordered_ set of {
_ordered_ sequences of {
characters having the same combining class
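
The ordering claim can be tried out with a small Python sketch. It is only an illustration, not part of the original argument: the helper `canonically_equivalent` and the constant names are hypothetical, and equivalence is tested by comparing NFD forms.

```python
import unicodedata

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
GRAVE = "\u0300"      # COMBINING GRAVE ACCENT, combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220

# Marks with different combining classes may be swapped freely.
print(canonically_equivalent("a" + ACUTE + DOT_BELOW, "a" + DOT_BELOW + ACUTE))

# Marks with the same combining class keep their relative order.
print(canonically_equivalent("a" + ACUTE + GRAVE, "a" + GRAVE + ACUTE))
```

The first comparison holds because canonical ordering sorts marks by combining class; the second does not, because marks of equal class are never reordered.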
on 2003-11-09 17:07 John Hudson wrote:
I've given a lot of thought to transliteration and transcription at the
glyph level:
Which comes back to the issue of ciphers. It would seem to me that
glyph-level transliteration is the accepted behavior for ciphers (else
we would actually have to
Philippe Verdy wrote,
And in fact it's quite simple to do it with OpenType composite fonts that
can be built to refer to glyphs searched in another font: such a
transliterator font would not need any glyph, and thus does not require
buying a licence for a commercial design ...
... which is
Michael Everson everson at evertype dot com wrote:
This has nothing to do with encoding. You are harkening back to the
hideous world of 8-bit font hacks of twenty years ago.
and Philippe Verdy verdy underscore p at wanadoo dot fr responded:
In fact that's exactly the opposite which may be
Got one already. Several, really. Including one I quite like, which
displays sort of ligatures for ch/gh/ng/tlh and small-caps for the
capital letters (plus a descending S) and two different flavors of
ampersand for the two ands in Klingon...
For instant transliteration, it has its
Curtis Clark jcclark at mockfont dot com wrote:
If Philippe were correct about the one-to-one correspondence, wouldn't
the Latin glyphs be a cipher of the Tifinagh? And thus a glyph choice
rather than a script choice?
Probably. But judging from the chart in
How do you do the n g -> ng ligature?
Got it already. It goes in the same liga but in a different lookup.
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
From: Simon Butcher pickle at alien dot net dot au
However personally, when dealing with an octet, or an arbitrary number
of octets, I believe the byte-pictures would be much easier to deal
with (especially when dealing with a lot of