John Cowan va escriure:
Pavel Adamek scripsit:
From the viewpoint of sorting,
the coding H<COMBINING C BEFORE>
would be much better than
C<COMBINING H AFTER>.
For Czech, yes. For Spanish we want the latter.
What for?
Antoine
Antoine Leca scripsit:
John Cowan va escriure:
Pavel Adamek scripsit:
From the viewpoint of sorting,
the coding H<COMBINING C BEFORE>
would be much better than
C<COMBINING H AFTER>.
For Czech, yes. For Spanish we want the latter.
What for?
First of all, this is an extended
At 13:29 +0100 2004-03-22, Antoine Leca wrote:
John Cowan va escriure:
Pavel Adamek scripsit:
From the viewpoint of sorting,
the coding H<COMBINING C BEFORE>
would be much better than
C<COMBINING H AFTER>.
For Czech, yes. For Spanish we want the latter.
What for?
Irony.
--
Michael Everson *
From: John Cowan [EMAIL PROTECTED]
First of all, this is an extended joke.
The point of the joke is that Czech sorts ch as a single letter after
h, so using a COMBINING C BEFORE would make this happen automatically,
provided the combining character sorted after all letters.
Spanish also
The point of the joke is that Czech
sorts ch as a single letter after h,
so using a COMBINING C BEFORE
would make this happen automatically,
provided the combining character sorted after all letters.
Spanish also sorts ch as a single letter,
but after c, so here we
want a COMBINING H
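The ordering difference described above can be sketched in a few lines of Python. This is only an illustration, not a real collation implementation: the alphabets are plain a-z with "ch" inserted after "h" (Czech) or after "c" (Spanish, as described in this thread), diacritics are ignored, and the names make_sort_key, CZECH and SPANISH are made up for this example.

```python
def make_sort_key(alphabet):
    """Return a key function that treats multi-letter entries as single letters."""
    # Try longer entries first so that "ch" is matched before "c".
    letters = sorted(alphabet, key=len, reverse=True)
    rank = {letter: i for i, letter in enumerate(alphabet)}

    def key(word):
        word = word.lower()
        out = []
        i = 0
        while i < len(word):
            for letter in letters:
                if word.startswith(letter, i):
                    out.append(rank[letter])
                    i += len(letter)
                    break
            else:
                # Unknown character: place it after every known letter.
                out.append(len(alphabet) + ord(word[i]))
                i += 1
        return out

    return key


BASE = list("abcdefghijklmnopqrstuvwxyz")
# Assumption: simplified Czech alphabet, "ch" is one letter after "h".
CZECH = BASE[:BASE.index("h") + 1] + ["ch"] + BASE[BASE.index("h") + 1:]
# Assumption: Spanish ordering as described in the thread, "ch" after "c".
SPANISH = BASE[:BASE.index("c") + 1] + ["ch"] + BASE[BASE.index("c") + 1:]

words = ["cihla", "chata", "duha", "hrad", "ikona"]
czech_sorted = sorted(words, key=make_sort_key(CZECH))
spanish_sorted = sorted(words, key=make_sort_key(SPANISH))
plain_sorted = sorted(words)

print(czech_sorted)    # "chata" lands between "hrad" and "ikona"
print(spanish_sorted)  # "chata" lands between "cihla" and "duha"
print(plain_sorted)    # code point order puts "chata" first
```

The digraph is handled entirely in the sort key, so the stored text stays as plain C followed by H; no new combining character is needed for either ordering.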
I wrote:
For easy multi-level comparison,
let us define new characters:
COMBINING LEVEL 1 GRAPHEME JOINER
COMBINING LEVEL 2 GRAPHEME JOINER
...
Then, for example, instead of
C<COMBINING CARON>ESKY<COMBINING ACUTE>
code it as
C<COMBINING LEVEL 1 GRAPHEME JOINER>
<COMBINING CARON>ESKY
Pavel Adamek pavel dot adamek at ima dot cz wrote:
For easy multi-level comparison,
let us define new characters:
COMBINING LEVEL 1 GRAPHEME JOINER
COMBINING LEVEL 2 GRAPHEME JOINER
...
Please, let's not. There are many people who feel we already have one
CGJ too many.
Let's solve this
At 22:13 + 2004-03-19, Marion Gunn wrote:
Ar 03:17 -0800 2004/03/18, scríobh Peter Kirk [EMAIL PROTECTED]:
An alternative for Marion, if her company still has rights to the fonts
which it so expensively developed to serve her country, would be to
distribute those fonts widely (and that
At 19:46 + 2004-03-19, Marion Gunn wrote:
Ar 15:41 + 2004/03/18, scríobh [EMAIL PROTECTED]:
Anyone who feels that past monetary contributions towards encoding
efforts were made based on false pretenses may be able to seek legal
redress...
James Kass
An admission of having made a seemingly
At 14:57 -0600 2004-03-19, Unspecified wrote:
Quoting Peter Kirk [EMAIL PROTECTED]:
I don't think it affects Irish, unless you want to be dotless Marıon ın
Irısh even when usıng a non-Gaelıc font. The consensus on the list seems
to be that Irish should be written with a normal i character and
The letter í is the long form of i. It is encoded
0069 0301 (or its equivalent 00ED). It would also
be a spelling error to encode í with 0131.
Those are the facts. It is not a matter for dispute.
I'm sorry. I do not acknowledge the ISO's authority to dictate spelling
norms.
Like
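The equivalence claim about í can be checked with Python's standard unicodedata module. A minimal sketch; the variable names are only for illustration, and the precomposed code point used here is U+00ED (LATIN SMALL LETTER I WITH ACUTE).

```python
import unicodedata

decomposed = "\u0069\u0301"  # i + COMBINING ACUTE ACCENT
precomposed = "\u00ed"       # LATIN SMALL LETTER I WITH ACUTE
dotless = "\u0131"           # LATIN SMALL LETTER DOTLESS I

# The two spellings of i-acute are canonically equivalent.
nfc_matches = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_matches = unicodedata.normalize("NFD", precomposed) == decomposed

# Dotless i plus acute is a different sequence; normalization does not
# turn it into U+00ED, so searches and spell checkers would not match it.
dotless_matches = unicodedata.normalize("NFC", dotless + "\u0301") == precomposed

print(nfc_matches, nfd_matches, dotless_matches)
```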
why not introduce a seimhiu character
whose glyphic representation is
either h-following or dot-above?
Primarily for Unicode structural reasons:
Unicode needs to say a character is
either combining or not.
It would be combining in both fonts.
Although it is not usual in Latin script
to
Pavel Adamek pavel dot adamek at ima dot cz wrote:
COMBINING C BEFORE
H<COMBINING C BEFORE> = CH
Shhh. It's not April 1 yet.
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
Jon Hanna jon at hackcraft dot net wrote:
Whether an Irish person writes an i without a dot, an English person
writes it with a dot, or a 12 year old girl penning a valentine card
writes it with a heart it is still the letter i.
There was a great document related to this at
Fine. I concede that this is the case. Therefore, let's change the underlying
form of 0069 to a dotless i and let English speakers change it to a dotted
i with the font.
The Gaelic and Roman letterforms are glyph
variants of the Latin script. Changing the font
will lose the dot, if the
Fine. I concede that this is the case. Therefore, let's change the
underlying
form of 0069 to a dotless i and let English speakers change it to a
dotted
i with the font.
I am happy to inform you that the underlying form doesn't have a dot.
--
Jon Hanna
http://www.hackcraft.net/
it has
COMBINING C BEFORE
H<COMBINING C BEFORE> = CH
Shhh. It's not April 1 yet.
Of course I do not want to add this character to Unicode,
I was only thinking about possibilities.
The document
An operational model for characters and glyphs
says:
-
Even within the content domain,
the nature
Pavel Adamek scripsit:
From the viewpoint of sorting,
the coding H<COMBINING C BEFORE>
would be much better than
C<COMBINING H AFTER>.
For Czech, yes. For Spanish we want the latter.
--
Her he asked if O'Hare Doctor tidings sent from far John Cowan
coast and she with grameful sigh him
Ar 15:41 + 2004/03/18, scríobh [EMAIL PROTECTED]:
Anyone who feels that past monetary contributions towards encoding
efforts were made based on false pretenses may be able to seek legal
redress...
James Kass
An admission of having made a seemingly foolhardy investment hardly amounts
to making
Ar 19/03/2004 11:46, scríoḃ Marion Gunn (is that correct Irish old
orthography?):
... If there were text processing
resources
available for the Gaelic script, this could change.
I have to agree with the above paragraph of Brian's.
Well, any Unicode-compatible word processor, e-mailer etc
Quoting Peter Kirk [EMAIL PROTECTED]:
I don't think it affects Irish, unless you want to be dotless Marıon ın
Irısh even when usıng a non-Gaelıc font. The consensus on the list seems
to be that Irish should be written with a normal i character and the dot
removed in particular fonts.
Peter Kirk [EMAIL PROTECTED] wrote:
I don't think it affects Irish, unless you want to be dotless Marıon ın
Irısh even when usıng a non-Gaelıc font. The consensus on the list seems
to be that Irish should be written with a normal i character and the dot
removed in particular fonts.
I also approve,
Yes!
That is why Irish traditional spelling rendered in Gentium looks so silly!
I'm sure I, or almost anyone else on this august list, could easily adapt
Gentium to the small extent of removing that extraneous dot, but it would
probably be illegal to so alter it. Any point asking SIL for that
Lest I get jumped upon for inaccuracy!:-) I hasten to add that, if we can
get an undotted i in fine Gentium I don't care if it also provides dots for
every single consonant (we may be laughed at as ignorant peasants, but we
know enough to only use what we need in accordance with the practice of our
[EMAIL PROTECTED]
Ar 03:17 -0800 2004/03/18, scríobh Peter Kirk [EMAIL PROTECTED]:
An alternative for Marion, if her company still has rights to the fonts
which it so expensively developed to serve her country, would be to
distribute those fonts widely (and that probably means free of charge)
in
On 19/03/2004 13:41, Marion Gunn wrote:
Yes!
That is why Irish traditional spelling rendered in Gentium looks so silly!
I'm sure I, or almost anyone else on this august list, could easily adapt
Gentium to the small extent of removing that extraneous dot, but it would
probably be illegal to so
Anyone who feels that past monetary contributions towards encoding
efforts were made based on false pretenses may be able to seek legal
redress.
There's a certain barrister in Africa who might be able to help in this
regard. Of course, this barrister works under conditions of strict
Marion Gunn mgunn at egt dot ie wrote:
To recap: dot above is a traditional diacritic in Irish, reserved for
use with certain consonants (its function being served, in Roman
script, by placing the 'letter' h after those same consonants). I
suppose (with thanks to Antoine for reading my msg so
Quoting Doug Ewell [EMAIL PROTECTED]:
Marion Gunn mgunn at egt dot ie wrote:
To recap: dot above is a traditional diacritic in Irish, reserved for
use with certain consonants (its function being served, in Roman
script, by placing the 'letter' h after those same consonants). I
suppose
At 11:51 -0600 2004-03-18, Unspecified (i.e.
Brian, who should really put his name in his
e-mail program) wrote:
I disagree that the question is this simple. It is not just a font issue.
Yes, it is.
It is a matter of the writing system being used.
The writing system used by Irish is the Latin
[EMAIL PROTECTED] scripsit:
Thus, the digraph 0062+0068 (i.e., bh) represents the same conceptual
object as 1E03. Note that, if a selection of Irish text is set using one
convention or the other, problems with spell checkers will occur UNLESS there
is some metadata that indicates the writing
Quoting [EMAIL PROTECTED]:
[EMAIL PROTECTED] scripsit:
Thus, the digraph 0062+0068 (i.e., bh) represents the same
conceptual object as 1E03. Note that, if a selection of Irish text
is set using one convention or the other, problems with spell checkers
will occur UNLESS there is some
At 14:32 -0600 2004-03-18, Brian wrote:
Well, unless your spelling-checker author is bright enough (as is very
likely) to handle both dot-convention and h-convention spellings.
These are not intrinsically tied to Uncial vs. Antiqua font styles,
though; one can write perfectly good Irish
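As a rough illustration of how a spelling checker might accept both conventions, the Python sketch below folds dot-above spellings into the h-convention before any dictionary lookup. It is an assumption-laden sketch, not any existing checker's method: the function name to_h_convention is invented here, and the set of consonants that take the dot is assumed to be b, c, d, f, g, m, p, s, t.

```python
import unicodedata

COMBINING_DOT_ABOVE = "\u0307"
# Assumption: the consonants that take the dot (séimhiú) in traditional spelling.
LENITABLE = set("bcdfgmpst")


def to_h_convention(text):
    """Rewrite dot-above consonants (e.g. U+1E03) as consonant + h."""
    # NFD splits U+1E03 into "b" + U+0307, so one rule covers all nine letters.
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for ch in decomposed:
        if ch == COMBINING_DOT_ABOVE and out and out[-1].lower() in LENITABLE:
            out.append("h")
        else:
            out.append(ch)
    # Recompose so that vowels with acute come back as single code points.
    return unicodedata.normalize("NFC", "".join(out))


print(to_h_convention("scr\u00edo\u1e03"))  # dot-convention input
print(to_h_convention("scr\u00edobh"))      # h-convention input is unchanged
```

Both inputs come out as the same string, so a single word list in the h-convention would serve either spelling, and no font or script metadata is involved.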
[EMAIL PROTECTED] scripsit:
In this context, and if it's true that a spell checker could, in theory, be
programmed to handle parallel encoding conventions, then why shouldn't Irish
language traditionalists encode the i with a LATIN SMALL LETTER DOTLESS I
such as 0131?
It could be done, yes,
Quoting Michael Everson [EMAIL PROTECTED]:
he also acknowledges that current spell checkers
only work with the modern (Roman) orthography
and that there are no spell checkers that work
with the older orthography.
Because no one needs one, and no one has made a
corpus of texts in
On 18/03/2004 10:30, Michael Everson wrote:
...
You mistake orthography and glyph choice with character identity.
Dotless i as a *character* is used only in Turkic languages, has
nothing to do with Irish, and never has.
May I pick a nit here? Dotless i is used in the official orthography of
At 17:33 -0500 2004-03-18, [EMAIL PROTECTED] wrote:
You might say, then why not introduce a seimhiu character whose glyphic
representation is either h-following or dot-above? Primarily for Unicode
structural reasons: Unicode needs to say a character is either combining or
not.
I proposed one
At 16:37 -0600 2004-03-18, Brian wrote:
People do not create machine-readable texts in the old orthography because of
the technical challenges of reproducing them.
I have no difficulty reproducing machine-readable
texts in the old orthography. I typeset a version
of the Irish Constitution last
At 15:58 -0800 2004-03-18, Peter Kirk wrote:
On 18/03/2004 10:30, Michael Everson wrote:
You mistake orthography and glyph choice with character identity.
Dotless i as a *character* is used only in Turkic languages, has
nothing to do with Irish, and never has.
May I pick a nit here? Dotless i
On 16/03/2004 17:47, Mark E. Shoulson wrote:
...
Of course Celtic uncial fonts will have appeal only to a limited
market. But you shouldn't have to respell your words when the font
changes (as you would if Irish went to dotless-i, since when printed
in conventional fonts, it does have a dot
Michael Everson [EMAIL PROTECTED] wrote:
At 21:42 +0100 2004-03-16, Antoine Leca wrote:
Also, Michael, tell us if your name when written inside some Irish text,
should it be considered English, or Irish? Then, should the i be dotted?
My name should be written with U+0069 as has been stated
From: Mark E. Shoulson [EMAIL PROTECTED]
Peter Kirk wrote:
On the other hand, the change to Unicode required for Irish to use
dotless i would be rather trivial, simply adding Irish to the existing
list currently consisting of Turkish and Azeri, to which Tatar,
Bashkir, Gagauz, Karakalpak
At 02:04 -0800 2004-03-17, Peter Kirk wrote:
Or just use the accursed American Uncial, if there's a version of it
which supports more than Windows 1252.
It would not be suitable for Turkish, given its inherent ugliness.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
On 17/03/2004 03:16, Michael Everson wrote:
At 02:04 -0800 2004-03-17, Peter Kirk wrote:
Or just use the accursed American Uncial, if there's a version of it
which supports more than Windows 1252.
It would not be suitable for Turkish, given its inherent ugliness.
If I come across Turks or
Chuig: Unicode Mailing List [EMAIL PROTECTED]
Scríobh Carl W. Brown [EMAIL PROTECTED]:
Marion,
What exactly are you proposing? A glyph change so that the glyphs for
normal dotted I be rendered without the dot, or that Irish be added to the
list of languages that uses the dotless I such as
Marion Gunn wrote...
I do know my language is being badly served, however.
And I would conclude, given the discussion we've seen on this list, that
your language isn't being badly served by the Unicode Standard (or any
other character encoding), but by some fonts and their vendors.
You
[skipping past various grandiloquence...]
Having worked so hard (sweating long years at other sources of income) to
fund the price of developing fonts and attending mtgs to define not just
individual 10646/Unicode characters, but whole character blocks within
10646/Unicode, plus a series of
At 00:20 + 2004-03-18, Marion Gunn wrote:
I do know my language is being badly served, however.
The Irish language is in no way badly served by the Unicode
Standard or by ISO/IEC 10646.
Some Unicode oldtimers may recall the 'Irish long s' debate (before your
time, Jon), when, finally
Marion,
That particular campaign was such a resounding 'success' we went on to
spend thousands of quid each year, for many years, trekking one more
encoding campaign trail after another, in support of many other languages,
as well as our own.
It reminds me of my work on a multi-lingual
Marion,
Irish in Roman script is written i with dot above,
Irish in traditional script is written i without
dot above. The current flooding of our local
advertising and publishing markets by various
non-native uncial fonts to write our language goes
against tradition in imposing on us that
Carl W. Brown cbrown at xnetinc dot com wrote:
Languages that do not have the dotless I have different casing rules.
To implement dotless I support for Irish we would have to change
Unicode.
Only to the extent of adding a note that the casing rules for Turkish,
where I and ı are a case pair,
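To make the casing point concrete, here is a small Python sketch. Python's built-in str.upper() and str.lower() apply the default, language-independent mappings, so the Turkish pairing has to be layered on top. The helpers upper_tr and lower_tr are hypothetical names for this example, not part of any library.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def upper_tr(text):
    """Uppercase with Turkish/Azeri pairs: i -> U+0130, U+0131 -> I."""
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def lower_tr(text):
    """Lowercase with Turkish/Azeri pairs: U+0130 -> i, I -> U+0131."""
    text = text.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return text.lower()


word = "diyarbak\u0131r"
default_upper = word.upper()  # default mapping: both i and dotless i become I
turkish_upper = upper_tr(word)
round_trip = lower_tr(turkish_upper)

print(default_upper)
print(turkish_upper)
print(round_trip)
```

Under the default mapping the distinction between the two letters is lost on uppercasing, which is why text that stores U+0131 needs language-specific casing and text that stores U+0069 does not.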
Perhaps an anecdote about typesetting Irish text
with taste is in order. When I typeset Nicholas
Williams' wonderful Irish translation of Alice's
Adventures in Wonderland I had occasion to use
some fancy Roman fonts for the chapter headings.
One of these, Mona Lisa Recut, has a tall and
At 14:06 -0500 2004-03-16, [EMAIL PROTECTED] wrote:
Peter Kirk scripsit:
It has the disadvantage of making these fonts useless for Turkish and
Azeri, and more fundamentally so than fonts which have f,i ligatures
with no visible dot.
As soon as someone commissions a Gaelic font from me which
On Tuesday, March 16, 2004 5:48 PM
Peter Kirk [EMAIL PROTECTED] va escriure:
On 16/03/2004 07:35, Carl W. Brown wrote:
I suspect that just changing the font to eliminate the dot will be
easier. Software won't have to be changed, existing code pages will
not have to be changed, searches will
As soon as someone commissions a Gaelic font from me which needs
dotted lower-case i for Turkish or Azeri, I shall let you know.
Keep in mind that OpenType allows fonts to have language-specific
behaviours. You could create a font in which the glyph for 0069 is
dotless for Gaelic, and dotted
Peter Kirk wrote:
On 16/03/2004 07:35, Carl W. Brown wrote:
...
I suspect that just changing the font to eliminate the dot will be
easier.
Software won't have to be changed, existing code pages will not have
to be
changed, searches will work, etc.
It has the disadvantage of making these
At 21:42 +0100 2004-03-16, Antoine Leca wrote:
Also, Michael, tell us if your name when written inside some Irish text,
should it be considered English, or Irish? Then, should the i be dotted?
My name should be written with U+0069 as has been stated already. In
a Gaelic font, it would be drawn