From: Roozbeh Pournader [EMAIL PROTECTED]
On Mon, 28 Apr 2003, Mark Davis wrote:
BTW, the ICU demos have been all upgraded to Unicode 4.0, on
http://oss.software.ibm.com/icu/demo/.
They include:
[...]
IDNA Demo
This simple demo performs IDNA transformations as described in RFC
Jim Allan wrote:
Kent Karlsson posted:
And I (still!) very strongly disagree. The empty set symbol stands
for the empty set (also written {}). But there is no set here, let alone
an empty one. Possibly an empty string (of phonetic symbols?).
Written as '' or in your favourite
At 14:19 +0200 2003-05-29, Kent Karlsson wrote:
(Remember that the empty set symbol really was an O with stroke,
originally!)
Surely a 0 with stroke, not a O with stroke.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
I'm not sure what other people experience, but I see a note saying the
attachment was (quite correctly I think) removed from the email, and
instead just lists the name and format of the attachment.
I'm on the digest format.
From: Carl W. Brown [EMAIL PROTECTED]
It looks to me like UNCODE. Has the UN taken a role in globalization? Maybe
the web page has no scripting but is still savvy.
Wrong! You strip the very visible dot from the letter i, and you also refuse to see that
there's a ligature between the U and N.
Compliant is a problem term, as compliance is a problem concept. I believe
we discussed, some months ago, the problem of claiming compliance for
systems or applications, since very little (any?) software implements
everything in Unicode or implements everything equally well. What would it
mean
Kent Karlsson [EMAIL PROTECTED] wrote on 05/29/2003 07:19:23 AM:
(Remember that the empty set symbol really was an O with stroke,
originally!)
And A was an Ancient Near-Eastern pictograph originally.
- Peter
---
there are still (even more) browsers that do not display UTF-8
correctly...
who still very often use a browser that supports some form of their
national encoding (SJIS, GB2312, Big5, KSC5601), sometimes with
ISO2022-*, but shamefully does not decode UTF-8 properly (even when the
page is correctly
Michael Everson scripsit:
Surely a 0 with stroke, not a O with stroke.
MathWorld shows a GIF that is definitely a circle with stroke
at http://mathworld.wolfram.com/EmptySet.html , and notes that 0
(no slash) is a variant but dispreferred notation.
--
Yes, chili in the eye is bad, but so is
Kent Karlsson scripsit:
Sorry for picking on every statement you make, but there is no such thing
as a null set or a null set symbol (null and empty aren't the same).
Null set is quite common, though not as common as empty set:
see http://mathworld.wolfram.com/EmptySet.html , which says they
From: Ben Dougall [EMAIL PROTECTED]
On Wednesday, May 28, 2003, at 06:59 pm, Otto Stolz wrote:
PS. In these two languages, the quote-marks are paired thusly:
en_US: U+201C ... U+201D, and U+2018 ... U+2019
de_DE: U+201E ... U+201C, and U+201A ... U+2018
are they the right way
At 07:57 -0500 2003-05-29, [EMAIL PROTECTED] wrote:
Kent Karlsson [EMAIL PROTECTED] wrote on 05/29/2003 07:19:23 AM:
(Remember that the empty set symbol really was an O with stroke,
originally!)
And A was an Ancient Near-Eastern pictograph originally.
Of a bull
--
Michael Everson * *
Ben Dougall wrote:
the reason i said that bit is html and xml (i know they're not human
languages and they're certainly not in the area i'm asking about)
So you were not talking about computer languages and I don't need to
point out Pascal's (* *) and C's /* */ delimiters for comments?
OK...
John Cowan wrote:
I have yet to see anyone quote a linguistic text that *explicitly* says that
they use the empty set symbol for this empty linguistic entity.
Well, a linguistics paper I read yesterday (citation on request) definitely
used the slashed-circle, aka empty set sign, to
Michael Everson wrote:
(Remember that the empty set symbol really was an O with stroke,
originally!)
Surely a 0 with stroke, not a O with stroke.
The empty set sign was originally definitely the Norwegian/Danish letter
CAPITAL O WITH STROKE. It never was related at all to a ZERO with
John Cowan wrote:
Netscape 4.x is dead.
Alas no. I have two recent (2002 and 2003) cases where the customers, with
large NS4.x installations they were not ready to upgrade, said in effect
your software must be NS4.x-compatible or no deal.
--
François Yergeau
These silly threads seem to indicate too many people on these two lists are
underemployed or interested in developing smokescreens for other activities.
When a reference to using embryonic ISO 639-3 to 'legitimize' SIL's flawed
Ethnologue is let pass with no comment, but followed only by a
Philippe,
From: Carl W. Brown [EMAIL PROTECTED]
It looks to me like UNCODE. Has the UN taken a role in
globalization? Maybe the web page has no scripting but is still savvy.
Wrong! You strip the very visible dot from the letter i, and you also
refuse to see that there's a ligature
From: [EMAIL PROTECTED]
there are still (even more) browsers that do not display UTF-8
correctly...
who still very often use a browser that supports some form of their
national encoding (SJIS, GB2312, Big5, KSC5601), sometimes with
ISO2022-*, but shamefully does not decode UTF-8 properly (even
Hi Philippe and Kazuhiro,
Thanks for your quick response.
I think I may have made a mistake in giving the code page alias name;
actually, my program doesn't specify the encoding value explicitly.
So by default, the JVM will take the system default when it's
initialized.
Do you know which one is being
Why is Ethnologue flawed?
on 5/29/03 9:15 AM, Marion Gunn at [EMAIL PROTECTED] wrote:
When a reference to using embryonic ISO 639-3 to 'legitimize' SIL's flawed
Ethnologue is let pass with no comment
The One Dot Leader is typically used to create a dotted line in a table of
contents between chapter or topic and page number. The Full Stop is a period
for running text. In good typography, the One Dot Leader is usually slightly
smaller than the Full Stop.
Donald Figge
//
-Original
[EMAIL PROTECTED] scripsit:
IIRC, there are still problems with recent versions of browsers in relation
to NCRs: some understand hex but not decimal, or vice versa.
I have not heard of any that don't support decimal NCRs.
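For reference, a decimal and a hexadecimal numeric character reference that name the same code point should decode to the same character in a conforming parser. A minimal sketch, assuming Python's standard `html` module as a stand-in for a browser's decoder (the sample character U+00E9 is arbitrary and does not come from the thread):

```python
import html

# Decimal and hexadecimal NCRs for U+00E9 (chosen arbitrarily for illustration).
decimal_ncr = "&#233;"
hex_ncr = "&#xE9;"

decoded_decimal = html.unescape(decimal_ncr)
decoded_hex = html.unescape(hex_ncr)

# Report whether the two notations decode to the same character.
print(decoded_decimal == decoded_hex)
```

This only shows what a correct decoder is expected to do; it says nothing about how any particular 2003-era browser behaved.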
--
Long-short-short, long-short-short / Dactyls in dimeter,
Verse form
Philippe Verdy scripsit:
French usage of these quotation marks is interesting: when a quotation
spans several paragraphs, each paragraph starts with a quotation mark,
but only the last one is terminated by the mirrored mark.
This is also the rule in English. However, it is usually only
On Wed, 28 May 2003, John Hudson wrote:
At 08:32 PM 5/28/2003, John Cowan wrote:
Netscape 4.x is dead.
I wish it were. Monitoring the web traffic at one of the sites I'm involved
with, I am dismayed to see that more than 5% of visitors are using Netscape
4.7.
You should not be dismayed.
Don't use Windows-31J; it is an encoding name alias that is not used by Microsoft for
its 932 codepage! So it would cause problems with other compliant JVMs.
Better use CP932 which seems to be the canonical name used by Sun in its reference
implementation, or windows-932 documented in the
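One reason the choice of name matters is that the Microsoft code page 932 table and the JIS-based Shift_JIS table do not map every byte sequence to the same code point. A rough illustration, using Python's codec names `cp932` and `shift_jis` instead of the Java names under discussion (an assumption made purely for the example; the JVM's tables and aliases may differ):

```python
# The same two bytes decoded under two mapping tables.
# Codec names are Python's, not the JVM's.
raw = b"\x81\x60"

as_cp932 = raw.decode("cp932")          # Microsoft code page 932 table
as_shift_jis = raw.decode("shift_jis")  # JIS X 0208 based table

# Print the resulting code point for each table so they can be compared.
for label, text in (("cp932", as_cp932), ("shift_jis", as_shift_jis)):
    print(label, f"U+{ord(text):04X}")
```

Naming the encoding explicitly, rather than relying on a platform default, makes it clear which of the two tables is intended.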
Brian Doyle wrote:
on 5/29/03 9:15 AM, Marion Gunn at [EMAIL PROTECTED] wrote:
When a reference to using embryonic ISO 639-3 to 'legitimize' SIL's flawed
Ethnologue is let pass with no comment
Why is Ethnologue flawed?
And how is this more on-topic on a mailing list called Unicode
From: Theodore H. Smith [EMAIL PROTECTED]
I'm not sure what other people experience, but I see a note saying the
attachment was (quite correctly I think) removed from the email, and
instead just lists the name and format of the attachment.
I'm on the digest format.
You may see the GIF
At 16:01 +0200 2003-05-29, Kent Karlsson wrote:
Michael Everson wrote:
(Remember that the empty set symbol really was an O with stroke,
originally!)
Surely a 0 with stroke, not a O with stroke.
The empty set sign was originally definitely the Norwegian/Danish letter
CAPITAL O WITH STROKE.
I do
On Thu, 29 May 2003, Marco Cimarosti wrote:
Rick McGowan wrote:
2. It is unlikely that the Unicode *logo* itself (i.e. the thing at
http://www.unicode.org/webscripts/logo60s2.gif) will be incorporated
directly in any image that people are allowed to put on their
websites, because to put
Edward H Trager wrote:
John Hudson wrote:
John Cowan wrote:
Netscape 4.x is dead.
I wish it were. Monitoring the web traffic at one of the sites I'm involved
with, I am dismayed to see that more than 5% of visitors are using Netscape
4.7.
Lots of organizations may have reasons like
Marion Gunn scripsit:
These silly threads seem to indicate too many people on these two lists are
underemployed or interested in developing smokescreens for other activities.
Insolence *and* paranoia. I see.
When a reference to using embryonic ISO 639-3 to 'legitimize' SIL's flawed
Surely a 0 with stroke, not a O with stroke.
The empty set sign was originally definitely the Norwegian/Danish letter
CAPITAL O WITH STROKE.
I do not believe you.
It never was related at all to a ZERO with stroke.
Why not? Zero and emptiness are closely related.
Did you read the
From: Philippe Verdy [EMAIL PROTECTED]
Subject: Re: Shift-JIS/Unicode mapping in JAVA
Date: Thu, 29 May 2003 17:11:05 +0200
Message-ID: [EMAIL PROTECTED]
So it would cause problems with other compliant JVMs.
Would you show the problem? I would like to analyze it. Anyway, I have
already announced
Kent Karlsson wrote on 05/29/2003 07:19:01 AM:
The empty set symbol is a math symbol, not expected to ever occur (properly)
in a word-like context. Capital O with stroke, however, is a letter, and can
easily and without any problems occur in a word-like context.
Which is exactly why it
Brian on 05/29/2003 09:37:31 AM:
Why is Ethnologue flawed?
Because:
1. research that has gone into it has only been going on for 50 years with
limited manpower, not 150 with unlimited manpower;
2. linguistic and sociolinguistic change is on-going, and it is difficult
to keep research current
Ben Dougall wrote:
On Wednesday, May 28, 2003, at 06:59 pm, Otto Stolz wrote:
PS. In these two languages, the quote-marks are paired thusly:
en_US: U+201C ... U+201D, and U+2018 ... U+2019
de_DE: U+201E ... U+201C, and U+201A ... U+2018
are they the right way round? so in german it'd be:
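For reference, the pairs exactly as listed in Otto Stolz's PS can be written out as data. This is a minimal sketch; the table name `QUOTE_PAIRS` and the helper `quote` are hypothetical names introduced only for illustration:

```python
# Opener/closer pairs as listed in the quoted PS: (first level, second level).
QUOTE_PAIRS = {
    "en_US": [("\u201C", "\u201D"), ("\u2018", "\u2019")],
    "de_DE": [("\u201E", "\u201C"), ("\u201A", "\u2018")],
}


def quote(text, locale, level=0):
    """Wrap text in the opener and closer listed for the locale and level."""
    opener, closer = QUOTE_PAIRS[locale][level]
    return f"{opener}{text}{closer}"


english = quote("Hello", "en_US")
german = quote("Hallo", "de_DE")

# U+201C is the opener in the en_US pair but the closer in the de_DE pair.
same_char_both_roles = english[0] == german[-1]
print(english, german, same_char_both_roles)
```

Written this way, U+201C opens a quotation in one locale and closes it in the other, so a character cannot be classified as opener or closer without knowing the language.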
Kent:
Others gave references where it in most cases did NOT look at all like the
empty set symbol.
Gustav Leunbach (1973), Morphological Analysis as a Step in
Automated Syntactic Analysis of a Text.
http://acl.ldc.upenn.edu/C/C73/C73-2022.pdf
uses an empty set symbol to denote a morphological
are they the right way round? so in german it'd be:
otto said So, there is not comprehensive list of
openers vs. closers
possible.
Does not look right here. The following is more like it:
So, there is not comprehensive list of openers vs.
closers possible.
No, as far as I can
On Thursday, May 29, 2003, at 02:16 pm, Pim Blokland wrote:
Ben Dougall wrote:
the reason i said that bit is html and xml (i know they're not human
languages and they're certainly not in the area i'm asking about)
So you were not talking about computer languages and I don't need to
point out
I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One
question: How is one supposed to tell apart the glyphs for U+1D29 and
U+1D18?... Or one isn't?... (OK, this question is probably more suited
to be posed to IPA, but.)
--
Kenneth Whistler scripsit:
I don't know what the origin of the mathematical integral sign
(U+222B) is, so cannot vouch for whether it is graphologically
connected to the long s or not, but it is clearly distinct
in usage and properties from the IPA esh.
It was devised by Leibniz, and
António Martins-Tuválkin wrote:
I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One
question: How is one supposed to tell apart the glyphs for U+1D29 and
U+1D18?... Or one isn't?...
In the same way that you tell apart the glyphs for U+0050 P LATIN
CAPITAL LETTER P and
On Thursday, May 29, 2003, at 08:08 pm, Markus Scherer wrote:
Ben Dougall wrote:
On Wednesday, May 28, 2003, at 06:59 pm, Otto Stolz wrote:
PS. In these two languages, the quote-marks are paired thusly:
en_US: U+201C ... U+201D, and U+2018 ... U+2019
de_DE: U+201E ... U+201C, and U+201A ...
On Thursday, May 29, 2003, at 02:10 pm, Philippe Verdy wrote:
Interestingly, the French first-level quotation marks use what we call
chevrons (double angle brackets).
However there are some typographical considerations that common fonts
forget when they design these characters:
They are
António asked:
I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One
question: How is one supposed to tell apart the glyphs for U+1D29 and
U+1D18?... Or one isn't?... (OK, this question is probably more suited
to be posed to IPA, but.)
Visually, you usually couldn't, any
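Even where two glyphs cannot be told apart by eye, the code points themselves carry distinct names in the character database. A minimal sketch, assuming Python's standard `unicodedata` module (whose data is newer than the Unicode 4.0 files discussed here):

```python
import unicodedata

# Look up the formal character name for each of the two code points in question.
names = {cp: unicodedata.name(chr(cp)) for cp in (0x1D18, 0x1D29)}

for cp, name in names.items():
    print(f"U+{cp:04X} {name}")
```

The names show that one character is encoded as a Latin letter and the other as a Greek letter, which is information a glyph alone does not convey.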
From: Philippe Verdy ([EMAIL PROTECTED])
From: Marco Cimarosti [EMAIL PROTECTED]
ISO 10646 has the French translation of all the character names. In most
cases, the French names are just literal translations of the English ones
but,
Even « literal » translations may help disambiguate
From: Ben Dougall [EMAIL PROTECTED]
On Thursday, May 29, 2003, at 02:10 pm, Philippe Verdy wrote:
Interestingly, the French first-level quotation marks use what we call
chevrons (double angle brackets).
However there are some typographical considerations that common fonts
forget
On Thursday, 29 May 2003 at 22:35, Kenneth Whistler wrote:
KW Kent:
Others gave references where it in most cases did NOT look at all like the
empty set symbol.
KW Gustav Leunbach (1973), Morphological Analysis as a Step in
KW Automated Syntactic Analysis of a
KW
Ben Dougall asked:
On Thursday, May 29, 2003, at 02:10 pm, Philippe Verdy wrote:
Interestingly, the French first-level quotation marks use what we call
chevrons (double angle brackets).
are they something that's in unicode? apart from the less than and
greater than symbols i can't
Philippe Verdy wrote:
Code positions 0xAB and 0xBB (in ISO-8859-1) are
canonically equivalent to Unicode U+00AB («) and
U+00BB (») code points.
One correction -- this has nothing to do with canonical equivalence.
This (as for all other ISO/IEC 8859-1 encoded characters)
is an example of
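The distinction can be sketched in code: converting ISO-8859-1 bytes to Unicode is a mapping between a byte value and a code point, whereas canonical equivalence relates two Unicode sequences to each other. A minimal illustration, assuming Python's `iso-8859-1` codec and `unicodedata` module (the precomposed/decomposed pair is an arbitrary example, not taken from the thread):

```python
import unicodedata

# Charset conversion: each ISO-8859-1 byte maps to the code point of the same value.
latin1_bytes = bytes([0xAB, 0xBB])
decoded = latin1_bytes.decode("iso-8859-1")
same_numbers = [ord(c) for c in decoded] == [0xAB, 0xBB]

# Canonical equivalence: two different Unicode sequences for the same text.
precomposed = "\u00e9"   # one code point
decomposed = "e\u0301"   # base letter plus combining acute accent
equivalent = unicodedata.normalize("NFC", decomposed) == precomposed

print(same_numbers, equivalent)
```

The first check involves an external encoding and Unicode; the second involves Unicode strings only, which is why the two notions should not be conflated.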
At 02:56 PM 5/29/03 -0700, Kenneth Whistler wrote:
António asked:
I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One
question: How is one supposed to tell apart the glyphs for U+1D29 and
U+1D18?... Or one isn't?... (OK, this question is probably more suited
to be posed to
This subject seems to come up periodically on French typographical lists, so I
would like to see what the answer of Unicode (unicore) to it might be.
What should be done with rare extinct Latin letters which usually can't
easily be mapped to a single modern letter (i.e. they are not simply
From: Patrick Andries [EMAIL PROTECTED]
From: Philippe Verdy ([EMAIL PROTECTED])
Microsoft displays these French translations for character names. There are
however some strange translations that lack a common formal format that
allows easier searching for related characters.
I would be
Ben Dougall wrote:
So, there is not comprehensive list of openers vs. closers possible.
so that's a 99 shaped quote on the baseline to open, and a 99 high
up to close. seems very odd to use 99 high or low to open, not a 66. but
if that's how it is, that's how it is.
Well, wait - I was
Philippe Verdy said:
So I think names in both Windows and this Hapax page come
from an ISO 10646 normative reference file in French, and it
contains the names for Unicode 3.2 characters (but still not
new characters added or modified in Unicode 4.0)
and then asked:
Also, as this alternate
Rick posted a message recently he intended as a personal contribution,
but it may have been interpreted as an official statement. Here is
some clarification of what he wrote.
1. His point about compliance and conformance was intended to indicate
that using the savvy logo would only indicate that
Marion Gunn crossposted:
John Cowan [EMAIL PROTECTED] wrote:
Jon Hanna scripsit:
...
It's funny, just earlier today, I castigated a member of a list I manage
for posting a contribution to another list without the author's
permission, an act which some of us regard as seriously
Philippe Verdy scripsit:
A file similarly formatted to
http://www.unicode.org/Public/4.0-Update/NamesList-4.0.0.txt exists here
http://pages.infinit.net/hapax/ListeDesNoms.htm .
Thanks for this reference (and also thanks for pointing out this excellent French
translation of the
- [LATIN CAPITAL LETTER O WITH STROKE] and [LATIN SMALL LETTER O
WITH STROKE] are both ruled out as their semantics is totally wrong.
Not at all (as seen by example Jarkko quoted!). In Danish and Norwegian,
yes. But in Swedish and Finnish that vowel is written (and ).
2. It is unlikely that the Unicode *logo* itself (i.e. the thing at
http://www.unicode.org/webscripts/logo60s2.gif) will be incorporated
directly in any image that people are allowed to put on their websites,
because to put the Unicode logo on a product or whatever requires a
license
I wonder how a character standardizer would like it if a bunch of
graphic artists criticized her character encoding.
A professional of any kind will listen to critique.
At 09:20 +0100 2003-05-30, William Overington wrote:
I wonder if Sarasvati herself, not one or more of the
non-Sarasvati-but-act-like-they-are-without-a-mandate people, could please
make a formal ruling on whether it is permitted to post a list of Private
Use Area encodings to the list and thus
On Thu, 29 May 2003 16:05:37 -0700 (PDT), Kenneth Whistler wrote:
In general, when people are interested in classes of characters,
like this, a quick trip into the Unicode Character Database is
a useful thing to do. In particular, look for the list of
characters with the property
- Original Message -
From: William Overington [EMAIL PROTECTED]
To: Magda Danish (Unicode) [EMAIL PROTECTED]; [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Friday, May 30, 2003 10:20 AM
Subject: Re: Announcement: New Unicode Savvy Logo
Now that Mark Davis has made a statement in the