Christopher John Fynn wrote on 06/21/2003 08:23:17 PM:
Any suggestions as to how to create a standardized work around
for these incorrect values?
Propose new characters, and deprecate the old ones?
- Peter
---
Peter
Philippe Verdy wrote on 06/24/2003 04:54:30 AM:
This symbol [fleur-de-lis] is commonly found and used in some printed
books, sometimes as a bullet-like character, but most often to terminate
a chapter or add fioritures near a title
Well, such examples are better than a sample showing a
Michael Everson wrote on 06/24/2003 05:52:09 AM:
Yes. Between the databases. For instance. Look, William, I was
saying that for instance, an Arizona number plate
Oh yeah, that reminds me. When are you going to propose the SUGUARO
SYMBOL? My wife's from Arizona; I'll back that one.
-
William Overington wrote on 06/24/2003 05:32:56 AM:
In that the document proposes U+2693 for FLEUR-DE-LIS it would seem not
unreasonable for fontmakers now to be able to produce fonts having a
FLEUR-DE-LIS glyph at U+2693.
Bad idea. Bad William. No biscuit.
However, what is the correct
At 00:56 -0500 2003-06-25, [EMAIL PROTECTED] wrote:
Christopher John Fynn wrote on 06/21/2003 08:23:17 PM:
Any suggestions as to how to create a standardized work around
for these incorrect values?
Propose new characters, and deprecate the old ones?
Fix the bloody errors, for heaven's sake.
On Wed, Jun 25, 2003 at 02:10:44 -0700, Andrew C. West wrote:
I've never really understood normalization, but it seems to me that
normalising bcuig 0F56, 0F45, 0F74, 0F72, 0F42 to bciug 0F56,
0F45, 0F72, 0F74, 0F42 is wrong as bciug could conceivably be a
shorthand abbreviation for a
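The reordering described in this message can be reproduced with a short sketch. It assumes Python's standard `unicodedata` module; the variable and helper names are made up for illustration. It only shows what canonical ordering does to the five code points quoted above, not whether that result is desirable.

```python
import unicodedata

# The "bcuig" sequence from the message: BA, CA, vowel sign U, vowel sign I, GA.
bcuig = "\u0F56\u0F45\u0F74\u0F72\u0F42"

def codepoints(s):
    """Return the code points of s as four-digit hex strings (illustrative helper)."""
    return [f"{ord(c):04X}" for c in s]

# Canonical combining class of each character; the two vowel signs have
# different non-zero classes, which is why normalization may reorder them.
classes = {f"{ord(c):04X}": unicodedata.combining(c) for c in bcuig}

normalized = unicodedata.normalize("NFC", bcuig)

print(classes)
print(codepoints(bcuig), "->", codepoints(normalized))
```

Vowel sign I (0F72) has a lower combining class than vowel sign U (0F74), so the normalized string carries them in the "bciug" order regardless of how they were typed.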
I am rather concerned that the name HANDICAPPED SIGN is being used without
any justification or discussion of the name of the character.
The Name Police approved. ;-)
I am rather concerned about the Orwellian nightmare possibilities of this
and believe that vigilance is a necessary activity to
On Wed, 25 Jun 2003 15:05:26 +0400, Valeriy E. Ushakov wrote:
Err, as in this particular case one vowel sign is above and the other
one is below the stack - i.e. they don't interact spatially - you
cannot really distinguish them. ;)
I know that the vowel signs do not interact with each other
Michael Everson everson at evertype dot com wrote:
Similarly, the fleur-de-lis is a well-known named symbol which can
be used to represent a number of things.
In text? I've seen it on flags, on license plates, on heraldic
crests, but can't recall seeing it in text.
I don't have access to
On Wed, Jun 25, 2003 at 07:31:51 -0700, Andrew C. West wrote:
Err, as in this particular case one vowel sign is above and the other
one is below the stack - i.e. they don't interact spatially - you
cannot really distinguish them. ;)
I know that the vowel signs do not interact with each
Let me add that this was the case recently for Hebrew (to mention one
example). So it is certainly not impossible.
But we have enough real work to do that we should do our best to veer from
the theoretical. :-)
MichKa
- Original Message -
From: Michael (michka) Kaplan [EMAIL PROTECTED]
At 08:44 -0700 2003-06-25, Doug Ewell wrote:
If it's true that either the UTC or WG2 has formally approved the
character, for a future version of Unicode or a future amendment to
10646, then I don't see any reason why font makers can't PRODUCE a font
with a glyph for the proposed character at the
Peter_Constable at sil dot org wrote:
William Overington wrote on 06/24/2003 05:32:56 AM:
In that the document proposes U+2693 for FLEUR-DE-LIS it would seem
not unreasonable for fontmakers now to be able to produce fonts
having a FLEUR-DE-LIS glyph at U+2693.
Bad idea. Bad William. No
On Wednesday, June 25, 2003 4:31 PM, Andrew C. West [EMAIL PROTECTED] wrote:
On Wed, 25 Jun 2003 15:05:26 +0400, Valeriy E. Ushakov wrote:
What I'm suggesting is that although cui 0F45, 0F74, 0F72 and
ciu 0F45, 0F72, 0F74 should be rendered identically, the logical
ordering of the codepoints
At 08:11 -0700 2003-06-25, Michael \(michka\) Kaplan wrote:
Do you (or does anyone) have an actual example where this is the case? It
may well be true but until someone has a proof there is not really an
indication of a specific problem for the UTC to address.
A document showing what happens in
Speaking of Orwellian nightmare scenarios, I don't get this reference. I
read Homage to Catalonia, but could someone please explain this Orwellian
nightmare? I can't figure out, what does the Spanish civil war have to do
with Unicode?
Yer ol' pal,
Youtie
Michael, that is like saying move the bloody character or remove
the bloody character.
Mark
__
http://www.macchiato.com
Eppur si muove
- Original Message -
From: Michael Everson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, June 25, 2003
At 8:11 AM -0700 6/25/03, Michael (michka) Kaplan wrote:
From: Andrew C. West [EMAIL PROTECTED]
What I'm suggesting is that although cui 0F45, 0F74, 0F72 and
ciu 0F45, 0F72, 0F74 should be rendered identically, the logical
ordering of the codepoints representing the vowels may represent
this was the case
Someone might misread your statement. We did not change the combining
classes for Hebrew.
Mark
__
http://www.macchiato.com
Eppur si muove
- Original Message -
From: Michael (michka) Kaplan [EMAIL PROTECTED]
To: [EMAIL PROTECTED];
On Wednesday, June 25, 2003 6:11 PM, Michael Everson [EMAIL PROTECTED] wrote:
At 08:44 -0700 2003-06-25, Doug Ewell wrote:
If it's true that either the UTC or WG2 has formally approved the
character, for a future version of Unicode or a future amendment to
10646, then I don't see any
At 09:13 -0700 2003-06-25, Mark Davis wrote:
Michael, that is like saying move the bloody character or remove
the bloody character.
Fix the bloody errors, for heaven's sake.
You'd like to think so. But Deprecate TIBETAN THINGY and add TIBETAN
THINGY BIS so that we can fix the problem is utterly
At 09:17 AM 6/25/2003, Youtie Effaight wrote:
Speaking of Orwellian nightmare scenarios, I don't get this reference. I
read Homage to Catalonia, but could someone please explain this
Orwellian nightmare? I can't figure out, what does the Spanish civil war
have to do with Unicode?
I missed the
At 00:56 -0500 2003-06-25, [EMAIL PROTECTED] wrote:
Michael Everson wrote on 06/24/2003 05:52:09 AM:
Yes. Between the databases. For instance. Look, William, I was
saying that for instance, an Arizona number plate
Oh yeah, that reminds me. When are you going to propose the SUGUARO
SYMBOL? My
On Wednesday, June 25, 2003 6:13 PM, Mark Davis [EMAIL PROTECTED] wrote:
Michael Everson wrote:
[EMAIL PROTECTED] wrote:
Christopher John Fynn wrote:
Any suggestions as to how to create a standardized work around
for these incorrect values?
Propose new characters, and
At 18:26 +0100 2003-06-25, Michael Everson wrote:
You'd like to think so. But Deprecate TIBETAN THINGY and add
TIBETAN THINGY BIS so that we can fix the problem is utterly
ridiculous.
And by that I mean, given the TWO standards Unicode and ISO/IEC
10646, adding duplicate characters is frowned
From: Michael (michka) Kaplan [EMAIL PROTECTED]
From: Michael (michka) Kaplan [EMAIL PROTECTED]
From: Andrew C. West [EMAIL PROTECTED]
What I'm suggesting is that although cui 0F45, 0F74, 0F72
and ciu 0F45, 0F72, 0F74 should be rendered identically,
the logical ordering of the
On Wednesday, June 25, 2003 8:14 PM, Peter Lofting [EMAIL PROTECTED] wrote:
At 7:41 PM +0200 6/25/03, Philippe Verdy wrote:
If there are real distinct semantics that were abusively unified
by the canonicalization, the only safe way would be to create a
second character that would have
At 7:41 PM +0200 6/25/03, Philippe Verdy wrote:
If there are real distinct semantics that were abusively unified
by the canonicalization, the only safe way would be to create a
second character that would have another combining class than the
existing one, to be used when lexical distinction
I am rather concerned about the Orwellian nightmare possibilities of this
and believe that vigilance is a necessary activity to protect freedom. Just
think, data about someone can be expressed with one character which can be
sent around the world to be stored in a database which is not necessarily
Oh yeah, that reminds me. When are you going to propose the SUGUARO
SYMBOL? My wife's from Arizona; I'll back that one.
Recte SAGUARO. I lived in Tucson from junior high to my B.A. I guess
I would propose one if it were, as the SHAMROCK is, used to indicate
something in lexicography or
On Wed, Jun 25, 2003 at 09:08:10 -0700, Peter Lofting wrote:
A list of common contractions would help here. I've seen at least one
such published collection in the past which listed common
contractions found in U-Med running text. However I don't have it
with me. Does anyone on-line have
Let me remind you: Talk on this list doesn't mean that the issue is
automatically brought up for UTC deliberation. If no documents are formally
submitted, nothing will happen.
After all the discussion of Tibetan, if anyone has a serious concrete
proposal for a specific change to the Unicode
At 12:15 -0700 2003-06-25, John Hudson wrote:
In this case, any existing normalisation for Hebrew is already
broken -- in the sense of destroying Biblical Hebrew text -- but
still the argument from the UTC seems to be that even broken
implementations -- broken because the standard is broken --
Rick McGowan posted and was answered by John Hudson:
If there isn't a visual difference here, how could there be a lexical
difference? Imagine the age before computers. All you have to go on is
what's on the page. There isn't an inherent order in those elements; they
could have been written by
William Overington wrote on 06/25/2003 06:26:25 AM:
Well, I realize that what I say may, at first glance, possibly appear
extreme at times, yet please do consider what I write in an objective
manner. If Unicode has a WHEELCHAIR SYMBOL then that is a symbol, if
Unicode encodes a HANDICAPPED
Michael Kaplan wrote on 06/25/2003 10:55:47 AM:
Let me add that this was the case recently for Hebrew (to mention one
example). So it is certainly not impossible.
The Hebrew issue is different: that involves things that *are* visually
distinct, and that distinction cannot be represented in a
John Hudson scripsit:
I'm not saying I like this, but this is how it has been explained to
me with regard to the very clearly erroneous Hebrew mark combining classes
which demonstrably break Biblical Hebrew text. In this case, any existing
normalisation for Hebrew is already broken -- in
Andrew C. West wrote on 06/25/2003 09:31:51 AM:
What I'm suggesting is that although cui 0F45, 0F74, 0F72 and ciu
0F45, 0F72, 0F74 should be rendered identically, the logical ordering of
the codepoints representing the vowels may represent lexical differences
that would be lost during the
Thank you for [indirectly] making my point for me. I am saying that if
someone has an issue that *does* make a difference then they should bring it
up.
Otherwise, I say that a difference that makes no difference, makes no
difference. And we can move on to actual problems. :-)
MichKa
-
Peter asked:
How can things that are visually indistinguishable be lexically different?
chat (en)
chat (fr)
We don't encode the phonological distinctions between homographs; we
encode text.
But I agree that we encode text. Both words above, which are
*lexically* distinct, would have the
At 18:26 +0100 2003-06-25, Michael Everson wrote:
You'd like to think so. But Deprecate TIBETAN THINGY and add
TIBETAN THINGY BIS so that we can fix the problem is utterly
ridiculous.
And by that I mean, given the TWO standards Unicode and ISO/IEC
10646, adding duplicate characters is
At 01:15 PM 6/25/2003, John Cowan wrote:
I don't understand how the current implementation breaks BH text.
At worst, normalization may put various combining marks in a non-traditional
order, but all alternative orders are canonically equivalent anyway, and
no (ordinary) Unicode process should
At 14:20 -0700 2003-06-25, John Hudson wrote:
John,
Write it up with glyphs and minimal pairs and people will see the
problem, if any. Or propose some solution. (That isn't add duplicate
characters.)
In Biblical Hebrew, it is possible for more than one vowel to be
attached to a single
John Hudson wrote:
In Biblical Hebrew, it is possible for more than one vowel to be attached
to a single consonant. This means that it is very important to maintain the
ordering of vowels applied to a single consonant. The Unicode Standard
assigns an individual combining class to every
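A small sketch of the behaviour being described, assuming Python's standard `unicodedata` module. The sequence lamed, qamets, hiriq, final mem is the one quoted elsewhere in this thread; the constant names are illustrative only.

```python
import unicodedata

# Illustrative names for the four characters in the sequence under discussion.
LAMED, QAMATS, HIRIQ, FINAL_MEM = "\u05DC", "\u05B8", "\u05B4", "\u05DD"

# Order as written in the text: qamets before hiriq on the same consonant.
written = LAMED + QAMATS + HIRIQ + FINAL_MEM
after_nfc = unicodedata.normalize("NFC", written)

# Show each character's canonical combining class.
for ch in written:
    print(f"U+{ord(ch):04X} ccc={unicodedata.combining(ch):>2} {unicodedata.name(ch)}")

# Show the order that survives normalization.
print([f"{ord(c):04X}" for c in after_nfc])
```

Because hiriq has a lower combining class than qamets, normalization moves hiriq in front of qamets; the two orders are canonically equivalent, which is the point under dispute.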
On Wed, 25 Jun 2003 19:47:26 +0400, Valeriy E. Ushakov wrote:
And given that the two look identical in writing in the first place,
this lexical difference had a chance to originate exactly *where*?
You are putting the cart before the horse.
Well, unless the text has been scanned with OCR, a
At 02:36 PM 6/25/2003, Michael Everson wrote:
Write it up with glyphs and minimal pairs and people will see the problem,
if any. Or propose some solution. (That isn't add duplicate characters.)
Peter Constable has written this up and submitted a proposal to the UTC.
Additional documentation of
On Wed, 25 Jun 2003 13:41:27 -0700 (PDT), Kenneth Whistler wrote:
Peter asked:
How can things that are visually indistinguishable be lexically different?
chat (en)
chat (fr)
And if Unicode reordered vowels in front of consonants, then we wouldn't be able
to distinguish :
chat (en)
At 03:29 PM 6/25/2003, Kenneth Whistler wrote:
This is not simply
'non-traditional' but results in incorrect rendering and a different
vocalisation of the text.
I don't think this is true.
First, the intent of the (admittedly problematical) fixed position
combining classes was that the
On Thursday, June 26, 2003 1:04 AM, Andrew C. West [EMAIL PROTECTED] wrote:
On Wed, 25 Jun 2003 13:41:27 -0700 (PDT), Kenneth Whistler wrote:
Peter asked:
How can things that are visually indistinguishable be lexically
different?
chat (en)
chat (fr)
And if Unicode
At 04:57 PM 6/25/2003, Kenneth Whistler wrote:
And I hate to have to continue being Mr. Negativity on this
list, but I remain unconvinced that the proposed solution
(of cloning 14 Hebrew points and vowels) just to fix an
unpreferred canonical reordering result represents the
sole remaining
John Hudson wrote:
At 02:36 PM 6/25/2003, Michael Everson wrote:
Write it up with glyphs and minimal pairs and people will see the problem,
if any. Or propose some solution. (That isn't add duplicate characters.)
Peter Constable has written this up and submitted a proposal to the UTC.
Valeriy E. Ushakov [EMAIL PROTECTED] wrote:
A sample list of dbu can contractions from Schmidt grammar:
http://snark.ptc.spbu.ru/~uwe/tibex/contractions/contractions.html
When these combinations are written in dbu-can script, as they
are here, the problem may not look too bad. - However
John Hudson wrote:
This idea of Hebrew vowels as 'fixed' marks is problematical, because in
Biblical Hebrew they are not fixed: they move relative to additional marks
(other vowels or cantillation marks).
It may be more *difficult* for applications to do correct rendering,
but there was
For example, the alleged problem of the vocalization order of
the Masoretes might be amenable to a much less drastic
solution. People could consider, for example, representation
of the required sequence:
lamed, qamets, hiriq, final mem
as:
lamed, qamets, ZWJ, hiriq, final mem
Difficulties due to the present combining class values attached
to these characters most frequently occur with
abbreviations/contractions and/or with cursive scripts. With
abbreviations it is common to have two or more vowels on a
consonant stack. In cursive or semi-cursive forms of Tibetan
At 06:22 PM 6/25/2003, Kenneth Whistler wrote:
Even if the ZWJ is stripped by the application before the actual
low-level paint API is called, so that instead of
lamed, qamets, ZWJ, hiriq, final mem
the renderer just sees
lamed, qamets, hiriq, final mem
you still end up with the order you need
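The suggestion quoted here can be checked mechanically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; it illustrates only the normalization side of the ZWJ idea (a character with combining class 0 sitting between the two vowels) and says nothing about rendering or about whether ZWJ is the right character to use.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER, canonical combining class 0

# lamed, qamets, ZWJ, hiriq, final mem
with_zwj = "\u05DC\u05B8" + ZWJ + "\u05B4\u05DD"

# Canonical reordering only swaps adjacent marks with non-zero classes,
# so the class-0 ZWJ keeps qamets and hiriq from being exchanged.
normalized_zwj = unicodedata.normalize("NFC", with_zwj)

# What a renderer would see if the application strips the ZWJ afterwards.
stripped = normalized_zwj.replace(ZWJ, "")

print([f"{ord(c):04X}" for c in normalized_zwj])
print([f"{ord(c):04X}" for c in stripped])
```

Note that normalizing the stripped string again would reorder the vowels, so the stripping would have to be the last step before display.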
The Free Standards Group Open Internationalization Initiative (OpenI18N)
announced the release of the locale data markup language specification
(LDML), Version 1.0: see http://www.openi18n.org/specs/ldml/.
To see the full announcement, please visit
When, in the Bible, one sees two vowels on a given consonant, it isn't so.
There is one vowel for the consonant one sees, and another vowel for an
invisible consonant. The proper way to encode it is to use some code to
represent the invisible consonant. Then the problem mentioned below does not
Hi,
Some weeks back there were a number of postings about software for
viewing Unicode Ranges in TrueType fonts and I had a few questions about
that. Most viewers listed seemed to only check the Unicode Range bits of
the fonts which can be misleading in certain cases.
Anyways I wanted to ask
59 matches