I assume that "the ISO standard" refers to ISO/IEC 8859-1 and
possibly 8859-2 as well. Unicode is an ISO standard too (ISO/IEC
10646-1).
So if my browser is set to ISO 8859-1 or ISO 8859-2, but a
Central European or Western European site is only in Unicode, then all
will show up
Marco,
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 29, 2000 1:26 AM
[EMAIL PROTECTED] wrote:
In XNS 1.0, XNS personal, business, and general names all
follow the same normalization rules:
These normalization rules only work for ASCII, so why bother using
Hi, Carl.
(You replied privately; was this intentional? If not, you can resend it to
the list, and I will re-send this one).
A better choice, IMHO, would be to normalize by *decomposition*. In this
way, the problem above would be addressed by rule 3 below.
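To illustrate the decomposition idea, here is a minimal Python sketch. It assumes that Unicode canonical decomposition (NFD) is what is meant; the XNS rules themselves are not reproduced here, and `decompose` is a hypothetical helper name.

```python
import unicodedata


def decompose(name: str) -> str:
    """Return the canonical decomposition (NFD) of a name."""
    return unicodedata.normalize("NFD", name)


precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
combining = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

# The two spellings differ as raw strings but agree once decomposed.
print(precomposed == combining)                        # False
print(decompose(precomposed) == decompose(combining))  # True
```

After decomposition the accent is a separate code point, so a later rule that keeps or drops particular characters sees the same sequence for both spellings.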
I think you have a very good
[EMAIL PROTECTED] wrote:
Just to clarify, I have no connection with the XNS project
(other than as a
user), but posted the info about it as of possible interest
[...]
I am certainly one of those who gave the impression of addressing Tom
himself, as if he were the author of the proposal.
I
Marco,
I sent them an email and invited them to join the discussion.
Carl
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 02, 2000 6:59 AM
To: Unicode List
Subject: RE: New Name Registry Using Unicode
[EMAIL PROTECTED] wrote:
Just to
Marco,
It would certainly seem that the optimal solution would be to carry the locale.
Then you normalize according to the rules of the locale. Besides, the locale
could aid in the search. You would only have to be unique for your locale.
The drawback is that every search engine would have
There are a number of similarities between this XNS and IDN, so
http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-00.txt would be
worth reading.
On locales: using them is dangerous for matching. The only reason to add
locale is if it were to make a difference which letters match. But
Hello,
I'm writing to inquire about the "lag time" between when Unicode 3.0 hit the street
and when implementations in Windows NT, software tools, fonts, etc. came out. Does
stuff usually come out within 3 months, 6 months, ...?
Is there a central URL that keeps track of implementations, so
Windows NT's latest version, Windows 2000, does not support Unicode 3.0.
There are many scripts for which no keyboards exist, and which do not even
have fonts or shaping rules for rendering.
When it comes out is a generic question, so I will give you a generic
answer: when they get the work
There is no central way to do it in all applications, no. In fact, most do
not support a way to do it per application. :-(
It gets better or worse in Windows 2000 (depending on your point of view) as
it grabs whole clusters when you work per letter, thus Tamil TTA + I and
other ligatures are
It knows because:
1. You sent the page in that character set, or;
2. You embedded a token in the page to tell the CGI program what the
character set was, or;
3. You used the (IE only) hack to get the browser to embed it in a hidden
field, or;
4. You guessed it based on a heuristic (or from the
Put your characters not defined in Unicode 3.0 in the Private Use Area. If
you need more space, select one of the Private Use Planes, which require the
use of UTF-16. Anything else can lead to incompatible and unpredictable
results in the future. Now, how you obtain fonts that support the
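As a rough illustration of the ranges involved, the following Python sketch checks whether a code point is private use and shows how it is written in UTF-16. The helper names `is_private_use` and `utf16_units` are hypothetical; the ranges are the BMP Private Use Area (U+E000..U+F8FF) and the two private use planes, 15 and 16.

```python
def is_private_use(cp: int) -> bool:
    """True for the BMP Private Use Area and private use planes 15 and 16."""
    return (
        0xE000 <= cp <= 0xF8FF
        or 0xF0000 <= cp <= 0xFFFFD
        or 0x100000 <= cp <= 0x10FFFD
    )


def utf16_units(cp: int) -> list[int]:
    """Return the UTF-16 code units that encode one code point."""
    data = chr(cp).encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


for cp in (0xE000, 0xF0000, 0x10FFFD):
    units = " ".join(f"{u:04X}" for u in utf16_units(cp))
    print(f"U+{cp:04X} private use: {is_private_use(cp)} UTF-16: {units}")
```

A BMP private use character fits in one 16-bit unit, while a character from plane 15 or 16 takes a surrogate pair, which is why software that handles only single 16-bit units cannot use those planes.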
In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese,
and "zh-cn" is used for simplified. This occurs in places such as HTTP
headers and xml:lang tags.
In POSIX locale id usage, zh_CN and zh_TW are also simplified and
traditional, respectively.
However, what should be done for
Hello Everyone,
I have a simple servlet which gets the form fields and stores them in a SQL
Server db. Now I am trying to store and retrieve international characters
(charset EUC-JP).
The problem I am having here is:
The first time I send the characters, Java gets them as ASCII. It
returns
There are two edit controls in the apps you mention which do AWS, and the calling
application is responsible for turning this behaviour on/off (I am assuming it
can be turned on/off programmatically). They are mshtml.dll (OE, IE, and HTML
messages in Outlook) and riched20.dll (plain text
Hello James at RF.NET...
Parties are nice, but when you announce them, please
refrain from tag lines like:
Yes, they're hiring Perl and C++ people.
When I see naughty lines like that, I immediately think
"thinly veiled recruiting pitch", and "blatant advertising
gimmick"... The wrath-wreaking
Michka,
I would not expect Windows 2000 to support Unicode 3.0, especially since the
final build of W2K was sent to manufacturing in November of 1999, too late for
Unicode 3.0. Even if it had come out earlier in 1999 it would have been
difficult to implement late in the development cycle unless the
A little off topic...
Does anyone know whether Excel can read a Unicode CSV-format file? Although
such a file opens in Word 2000 fine, Excel seems terminally confused,
whether fed utf-16 or utf-8. The only documentation I could find seems to
consider Unicode files and csv files as separate
I do not think Excel 2000 will handle these files properly. The Jet 4.0 text
IISAM will, for what it's worth.
michka
a new book on internationalization in VB at
http://www.i18nWithVB.com/
- Original Message -
From: "Peck, Jon" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent:
I agree 100%, and I could make the same argument for surrogate support in
SQL Server 2000 (i.e. there are no characters, so support is not relevant at
ship time) but since I cannot ever state with certainty what the next version
of products (i.e. Whistler or Yukon) will support, I do not want to
I have a simple servlet which gets the form fields and stores them in a SQL
Server db. Now I am trying to store and retrieve international characters
(charset EUC-JP).
The problem I am having here is:
The first time I send the characters, Java gets them as ASCII. It
returns back to the
Hi Raghu,
Your problem is probably:
Your servlet is running in the default locale of your server and thus
assumes Latin-1 (ISO-8859-1) as the character set of the transaction. It
thus converts the individual bytes of your EUC-JP stream to the Java
internal representation of these characters
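The effect described above can be reproduced outside a servlet. The sketch below is in Python rather than Java, and the helper names `misread_as_latin1` and `repair` are hypothetical. It assumes the bytes on the wire really are EUC-JP and that the receiving side decoded them as ISO-8859-1.

```python
def misread_as_latin1(wire_bytes: bytes) -> str:
    """Decode bytes as ISO-8859-1, as a server assuming Latin-1 would."""
    return wire_bytes.decode("iso-8859-1")


def repair(misread: str, actual_charset: str = "euc_jp") -> str:
    """Undo the wrong decoding: recover the bytes, then decode them properly."""
    return misread.encode("iso-8859-1").decode(actual_charset)


original = "日本語"
wire_bytes = original.encode("euc_jp")     # what the browser sends
garbled = misread_as_latin1(wire_bytes)    # one Latin-1 character per byte

print(len(original), len(garbled))         # 3 characters become 6
print(repair(garbled) == original)         # True
```

The round trip works because ISO-8859-1 maps every byte value to exactly one character, so no information is lost by the wrong decoding. Telling the receiving side the correct charset before it decodes the request avoids the repair step altogether.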
-Original Message-
From: Rami Radi [mailto:[EMAIL PROTECTED]]
Sent: Saturday, September 30, 2000 5:36 AM
To: [EMAIL PROTECTED]
Subject: Need Help
Dear Reader
I am a WAP developer faced with
the problem of having to display Arabic characters on WAP phones. These phones
Hello Rami,
I am not sure I understand your question. If the phone uses the Unicode
standard, then you do not need the hex codes for Unicode characters; you
simply need the pages in question (HTML, XML, or otherwise) to be in a
Unicode encoding such as UTF-8.
Having a file in such an encoding
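To make the distinction between hex codes and a Unicode encoding concrete, here is a small Python sketch. It writes the same Arabic word once as UTF-8 bytes and once as hexadecimal numeric character references. The helper names `hex_refs` and `from_refs` are hypothetical, and nothing here is specific to WAP or to any particular phone.

```python
import html


def hex_refs(text: str) -> str:
    """Spell every character as a hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


def from_refs(markup: str) -> str:
    """Resolve character references back into the characters they name."""
    return html.unescape(markup)


word = "\u0633\u0644\u0627\u0645"    # the Arabic word "salam"

as_utf8 = word.encode("utf-8")       # the text itself, stored as UTF-8 bytes
as_refs = hex_refs(word)             # pure ASCII markup naming each code point

print(as_utf8)
print(as_refs)
print(from_refs(as_refs) == as_utf8.decode("utf-8"))   # True
```

Both forms carry the same characters. With a UTF-8 page the text is stored directly and no per-character codes are needed.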
Steven R. Loomis wrote:
In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese,
and "zh-cn" is used for simplified. This occurs in places such as HTTP
headers and xml:lang tags.
No. "zh-tw" only means Chinese used in Taiwan and "zh-cn" only means
Chinese used in China. It
No, it's not that at all. It is just that many products have a long history
of connection with the people who use the product who also happen to have a
bidirectional language as their native one. Many other products have a
development team with that expertise.
One example can be found in Mozilla
Yung-Fong "Frank" Tang [EMAIL PROTECTED] wrote:
Steven R. Loomis wrote:
In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese,
and "zh-cn" is used for simplified. This occurs in places such as HTTP
headers and xml:lang tags.
No. "zh-tw" only means Chinese used in Taiwan and
[EMAIL PROTECTED] wrote:
For purposes of name registration uniqueness, the only significant
characters are numbers and letters as defined by the Java isLetterOrDigit
function returning TRUE. This function determines if a character is a
letter or digit according to the Unicode 2.0 standard
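The following Python sketch approximates the filter described in the quoted rule. It is not the XNS implementation: `is_letter_or_digit` and `significant` are hypothetical names, the test is rebuilt from Unicode general categories (any letter category, or Nd for decimal digits), and Python's character database is much newer than Unicode 2.0, so individual characters may be classified differently.

```python
import unicodedata


def is_letter_or_digit(ch: str) -> bool:
    """Approximate Java's Character.isLetterOrDigit via general categories."""
    category = unicodedata.category(ch)
    return category.startswith("L") or category == "Nd"


def significant(name: str) -> str:
    """Keep only the characters that would count for uniqueness."""
    return "".join(ch for ch in name if is_letter_or_digit(ch))


print(significant("Jean-Luc O'Brien 3"))   # JeanLucOBrien3
print(significant("e\u0301"))              # e
```

In the second call the combining acute accent is dropped because combining marks are neither letters nor digits, so a decomposed accented letter reduces to its base letter under a filter of this kind.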
This discussion has become quite "surreal".
In the meantime, I and other people who have the need to write about these
characters have, with more or less encouragement from the Unicode Editorial
Committee, started to use the terms "Supplementary Planes", "Supplementary
Characters" etc. This