John Hudson wrote:
By the way, although Unicode calls it a cedilla, the correct form to use
with G is the disconnected, 'under comma' form.
Ah yes, the cedillas; now these are ambiguous!
What is the correct form for cedillas under N, K, L, R, S and T? What
should these look like? The fonts
John Cowan wrote:
Digraphs and ligatures are both made by combining two glyphs. In a
digraph, the glyphs remain separate but are placed close together. In a
ligature, the glyphs are fused into a single glyph.
Oh, in that case I must say I think the UnicodeData.txt file doesn't do a very
Pim Blokland wrote:
Now I must admit, I haven't come across many texts which used Ts with
cedillas. Not in printed form, that is; the only ones I have seen were in
electronic form, where their appearance depends on the font used.
T with cedilla should never have existed. When s with comma
By the way, although Unicode calls it a cedilla, the correct form to use
with G is the disconnected, 'under comma' form.
Ah yes, the cedillas; now these are ambiguous!
What is the correct form for cedillas under N, K, L, R, S
and T? What should these look like?
Well, the easy (and
Pim Blokland wrote:
For instance, the Danish ae (U+00E6) is not designated a ligature,
It was in Unicode 1.0; I think politics were involved in that one.
In Latin use, ae is most certainly a ligature, and likewise in the
languages (including English) that have borrowed words involving it.
In
The names do NOT always provide correct descriptions of the
characters. This is especially true for digraph and ligature
(and in the case of U+00E6 too), as well as (e.g.) SCRIPT CAPITAL P,
which is neither script, nor capital (it's lowercase), though
it is a p... In addition, there are
Oh, in that case I must say I think the UnicodeData.txt file doesn't do a
very good job.
For instance, the Danish ae (U+00E6) is not designated a ligature, but the
Dutch ij (U+0133) is, even though the a and e are clearly fused
together, while the i and j aren't.
Hm, this whole concept seems
For instance, the Danish ae (U+00E6) is not designated a ligature,
It was in Unicode 1.0; I think politics were involved in that one.
In Latin use, ae is most certainly a ligature, and likewise in the
languages (including English) that have borrowed words involving it.
In Danish use,
On Friday, March 7, 2003, at 04:26 AM, Pim Blokland wrote:
Oh, in that case I must say I think the UnicodeData.txt file doesn't
do a very good job.
For instance, the Danish ae (U+00E6) is not designated a ligature, but
the Dutch ij (U+0133) is, even though the a and e are clearly fused
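As a rough check of the naming point raised here, the names and decomposition mappings of the two characters can be read from Python's standard unicodedata module. This is only a sketch: the helper name describe is made up for illustration, and the values come from whatever Unicode version the Python build bundles.

```python
import unicodedata


def describe(ch):
    """Return (character name, decomposition mapping) for one character."""
    # 'describe' is a hypothetical helper, not part of any library.
    return unicodedata.name(ch), unicodedata.decomposition(ch)


# U+00E6 and U+0133 are the two characters compared in the message.
for ch in ("\u00e6", "\u0133"):
    name, decomp = describe(ch)
    print(f"U+{ord(ch):04X} {name!r} decomposition={decomp!r}")
```

The name of U+00E6 uses LETTER and the character has no decomposition, while the name of U+0133 uses LIGATURE and the character carries a compatibility decomposition to i followed by j.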
David Oftedal wrote:
Hm, this whole concept seems stupid if you ask me.
That's beside the point. The issue of this discussion is not how stupid this
all is, but how consistent the descriptions in the UnicodeData.txt file are.
So I DO care whether I should call something a digraph or a ligature.
On Thu, Mar 06, 2003 at 02:25:19PM -0500, Dean Snyder wrote:
Ben Yehuda is a modern Hebrew dictionary, and, as I noted in my
original email, I have little experience in modern, Israeli Hebrew -
maybe the orthography is different there, I just don't know. Which is why
I was limiting my remarks
On Fri, 7 Mar 2003, John H. Jenkins wrote:
since different people speaking different languages
often have different perceptions of what a symbol is.
Reminds me of ISIRI 3342 that officially considered symbol and character
the same thing and used one word (namaad, Noon, Meem, Alef, Dal) for
Kent Karlsson wrote:
Typographically, it's a ligature either way.
You mean that both ae and ij should be called ligatures, although one is
fused and the other isn't?
OK, I can live with that. I'd rather the ij were called a digraph, though.
The ij is considered by some to be one letter in
At 15:36 +0100 2003-03-07, Pim Blokland wrote:
Kent Karlsson wrote:
Typographically, it's a ligature either way.
You mean that both ae and ij should be called ligatures, although one is
fused and the other isn't?
OK, I can live with that. I'd rather the ij were called a digraph, though.
These
Kent Karlsson wrote:
E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed
by an i, no ligation, whereas that is not allowed for the ae
ligature/letter, nor for the oe ligature.
How do you know that? Either Caesar or Cæsar is good Latin.
--
After fixing the Y2K bug
E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed
by an i, no ligation, whereas that is not allowed for the ae
ligature/letter, nor for the oe ligature.
How do you know that? Either Caesar or Cæsar is good Latin.
That's the other way around. Ligating ae into æ
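One way to see the asymmetry discussed here in the character data (as opposed to in rendering, which is what the message is actually about) is compatibility normalization. The sketch below uses Python's standard unicodedata module; the helper name compat_form is hypothetical, and treating NFKC as a stand-in for "may be shown as separate letters" is an assumption made for illustration only.

```python
import unicodedata


def compat_form(ch):
    """Return the NFKC (compatibility-composed) form of a string."""
    # 'compat_form' is a hypothetical helper name used only in this sketch.
    return unicodedata.normalize("NFKC", ch)


# U+FB01 LIGATURE FI, U+00E6 ae, U+0153 oe
for ch in ("\ufb01", "\u00e6", "\u0153"):
    result = compat_form(ch)
    print(f"U+{ord(ch):04X} -> {[f'U+{ord(c):04X}' for c in result]}")
```

U+FB01 folds to the two letters f and i, whereas U+00E6 and U+0153 have no decomposition and come back unchanged.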
Typographically, it's a ligature either way.
You mean that both ae and ij should be called ligatures,
although one is fused and the other isn't?
No. What I'm trying to say is that the names do not really matter.
While there is an effort to give characters good names,
they sometimes are
Pim Blokland wrote:
The ij is considered by some to be one letter in Dutch, and when written
down, an i and a j together look very much like a written y with
diaeresis. (See fonts like Script MT.) So I can understand foreigners
getting confused and encoding it that way (as a y with
Kent Karlsson wrote:
Ligating ae into æ works for Latin
and sometimes English (could be done via a smart font).
Always for English, I think: if someone finds a counterexample, let them
use a + ZWNJ + e.
Note that e.g. an fj
ligature is just as legitimate and useful as an fi ligature
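The a + ZWNJ + e suggestion can be written out as a string. This is a minimal sketch using only the Python standard library; the helper name block_ligature is made up, and whether a given font or renderer honours the ZWNJ is outside what the code can show.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def block_ligature(first, second):
    """Join two letters with a ZWNJ between them to request no ligation."""
    # 'block_ligature' is a hypothetical helper name used only in this sketch.
    return first + ZWNJ + second


s = block_ligature("a", "e")
print([f"U+{ord(c):04X}" for c in s])
print(unicodedata.name(ZWNJ))
```

The ZWNJ is an ordinary code point in the text, so it survives normalization and has to be ignored or stripped by any comparison that should treat the sequence as plain "ae".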
Michael Everson everson at evertype dot com wrote:
You mean that both ae and ij should be called ligatures, although one
is fused and the other isn't?
OK, I can live with that. I'd rather the ij were called a digraph,
though.
These terms are not normative. Get used to it.
The names
What an interesting character ij, or y is. It really shows how languages
evolve over time. As for the æ:
How do you know that? Either Caesar or Cæsar is good Latin.
We're not necessarily talking about Latin here. In Norwegian and Danish,
æ is not a ligature, but a separate sound almost
Actually, it is of orthographic significance: it is not
uncommon for good fonts to have an fj ligature.
That's typography, not orthography.
But I would appreciate it if more fonts had an fj ligature, and
(e.g.) a gj ligature too (in some fonts gj otherwise has
overlapping glyphs).
/kent
At 08:23 -0800 2003-03-07, Doug Ewell wrote:
The names themselves are normative, of course. What is not normative is
the distinction between the terms LETTER, LIGATURE, and DIGRAPH used in
the names. Just wanted to clarify that for Pim.
I didn't say the names are not normative. I said the terms
At 01:49 AM 3/7/2003, Pim Blokland wrote:
Ah yes, the cedillas; now these are ambiguous!
What is the correct form for cedillas under N, K, L, R, S and T? What
should these look like? The fonts I've seen disagree on all of them: some
have commas, others have real cedillas.
Since Unicode 3.0 came
John Hudson wrote:
The most problematical part of this is that 8-bit codepages supporting
Romanian use the old S and T with *cedilla* codepoints, not the new S and
T with comma codepoints.
Apple updated their Romanian codepage shortly after those new characters
appeared, five years ago.
Not
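The two sets of S and T codepoints mentioned here can be told apart by the combining mark in their canonical decompositions. The sketch below uses Python's standard unicodedata module; the helper name below_mark is hypothetical, and it assumes each character passed in has a canonical decomposition ending in a combining mark.

```python
import unicodedata


def below_mark(ch):
    """Name of the last code point in the canonical decomposition of ch."""
    # 'below_mark' is a hypothetical helper; it assumes ch decomposes
    # canonically into a base letter followed by a combining mark.
    fields = unicodedata.decomposition(ch).split()
    return unicodedata.name(chr(int(fields[-1], 16)))


# Capital S and T: the older cedilla codepoints and the newer comma ones.
for ch in ("\u015e", "\u0218", "\u0162", "\u021a"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {below_mark(ch)}")
```

U+015E and U+0162 decompose with COMBINING CEDILLA (U+0327), while U+0218 and U+021A decompose with COMBINING COMMA BELOW (U+0326), so the two sets are distinct characters and not just glyph variants.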
On Fri, Mar 07, 2003 at 17:27:08 +0100, David Oftedal wrote:
We're not necessarily talking about Latin here. In Norwegian and Danish,
æ is not a ligature, but a separate sound almost unpronounceable by
English speakers.
I believe æ is also a character in the IPA.
Noah
-Original Message-
Date/Time:Fri Mar 7 12:44:47 EST 2003
Contact: [EMAIL PROTECTED]
Report Type: Other Question, Problem, or Feedback
I was wondering, when writing code for a program in Visual
Basic.NET. Just a very, very simple piece of code that converts
characters to
Ram Viswanadha wrote:
There is also some information at
http://oss.software.ibm.com/icu/docs/papers/binary_ordered_compression_for_unicode.html#Test_Results
Not sure if this is what you are looking for.
Thanks. Not really. I am not looking into the
Mijan wrote:
Let's consider the ra+virama+ya case. For the most part,
ra+virama+ya is displayed as ya+reph. This obviously seems to be an
instance of ambiguous interpretation, because ra+virama+ya could
also represent ra+ja-phalaa. ya+reph and ra+ja-phalaa are used in
different words and