Re: Major site in unicode?

2000-10-02 Thread Doug Ewell
I assume that "the ISO standard" refers to ISO/IEC 8859-1 and possibly 8859-2 as well. Unicode is an ISO standard too (ISO/IEC 10646-1). So if my browser is set to ISO 8859-1 or ISO 8859-2, but a Central European or Western European site is only in Unicode, then all will show up

RE: New Name Registry Using Unicode

2000-10-02 Thread Carl W. Brown
Marco, From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, September 29, 2000 1:26 AM [EMAIL PROTECTED] wrote: In XNS 1.0, XNS personal, business, and general names all follow the same normalization rules: These normalization rules only work for ASCII, so why bother using

RE: New Name Registry Using Unicode

2000-10-02 Thread Marco . Cimarosti
Hi, Carl. (You replied privately; was this intentional? If not, you can resend it to the list, and I will re-send this one). A better choice, IMHO, would be to normalize by *decomposition*. In this way, the problem above would be addressed by rule 3 below. I think you have a very good
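
To make the decomposition idea concrete, here is a minimal sketch in Java. It assumes canonical decomposition (NFD) is what is meant, which the message above does not specify, and it uses java.text.Normalizer, which is a later JDK API and not something the XNS proposal refers to. The class and method names are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper: compares names after canonical decomposition (NFD).
// NFD is an assumption; the thread only says "decomposition".
class DecomposeNames {
    // Returns the canonically decomposed form of a name.
    static String decompose(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFD);
    }

    // Two names match if their decomposed forms are identical.
    static boolean sameName(String a, String b) {
        return decompose(a).equals(decompose(b));
    }

    public static void main(String[] args) {
        String precomposed = "caf\u00E9";  // e-acute as one code point
        String combining = "cafe\u0301";   // e followed by combining acute
        System.out.println(sameName(precomposed, combining));
    }
}
```

With this approach the precomposed and combining spellings of the same name compare equal, so later rules that drop or fold combining marks only have to deal with one form.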

RE: New Name Registry Using Unicode

2000-10-02 Thread Marco . Cimarosti
[EMAIL PROTECTED] wrote: Just to clarify, I have no connection with the XNS project (other than as a user), but posted the info about it as of possible interest [...] I am certainly one of those who made the impression of addressing Tom himself, as if he was the author of the proposal. I

RE: New Name Registry Using Unicode

2000-10-02 Thread Carl W. Brown
Marco, I sent them an email and invited them to join the discussion. Carl -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Monday, October 02, 2000 6:59 AM To: Unicode List Subject: RE: New Name Registry Using Unicode [EMAIL PROTECTED] wrote: Just to

RE: New Name Registry Using Unicode

2000-10-02 Thread Carl W. Brown
Marco, It would certainly seem that the optimal solution would be to carry the locale. Then you normalize according to the rules of the locale. Besides, the locale could aid in the search. You would only have to be unique for your locale. The drawback is that every search engine would have

Re: New Name Registry Using Unicode

2000-10-02 Thread Mark Davis
There are a number of similarities between this XNS and IDN, so http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-00.txt would be worth reading. On locales: using them is dangerous for matching. The only reason to add locale is if it were to make a difference which letters match. But

lag time in Unicode implementations in OS, etc?

2000-10-02 Thread Elaine Keown
Hello, I'm writing to inquire about the "lag time" between when Unicode 3.0 hit the street and when implementations in Windows NT, software tools, fonts, etc. came out. Does stuff usually come out within 3 months, 6 months, ? Is there a central URL that keeps track of implementations, so

Re: lag time in Unicode implementations in OS, etc?

2000-10-02 Thread Michael (michka) Kaplan
Windows NT's latest version, Windows 2000, does not support Unicode 3.0. There are many scripts for which no keyboards exist, and which do not even have fonts or shaping rules for rendering. When it comes out is a generic question, so I will give you a generic answer: when they get the work

Re: [OT] Word select in Microsoft products?

2000-10-02 Thread Michael (michka) Kaplan
There is no central way to do it in all applications, no. In fact, most do not support a way to do it per application. :-( It gets better or worse in Windows 2000 (depending on your point of view) as it grabs whole clusters when you work per letter, thus Tamil TTA + I and other ligatures are

RE: Major site in unicode?

2000-10-02 Thread addison
It knows because: 1. You sent the page in that character set, or; 2. You embedded a token in the page to tell the CGI program what the character set was, or; 3. You used the (IE only) hack to get the browser to embed it in a hidden field, or; 4. You guessed it based on a heuristic (or from the

RE: lag time in Unicode implementations in OS, etc?

2000-10-02 Thread Hart, Edwin F.
Put your characters not defined in Unicode 3.0 in the Private Use Area. If you need more space, select one of the Private Use Planes, which require the use of UTF-16. Anything else can lead to incompatible and unpredictable results in the future. Now, how you obtain fonts that support the
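
As an illustration of the difference between the BMP Private Use Area and the private use planes, the following Java sketch shows how a plane 15 code point becomes a surrogate pair in UTF-16. It relies on the code-point methods of java.lang.Character from later JDK releases, and the class name and the two sample code points are chosen here only for illustration.

```java
// Hypothetical demo: private use code points inside and outside the BMP.
class PrivateUsePlanes {
    // True if the code point has general category Co (private use).
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    // Number of UTF-16 code units needed for the code point (1 or 2).
    static int utf16Units(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int bmpPua = 0xE000;    // first code point of the BMP Private Use Area
        int plane15 = 0xF0000;  // first code point of plane 15
        System.out.println(isPrivateUse(bmpPua) + " " + utf16Units(bmpPua));
        System.out.println(isPrivateUse(plane15) + " " + utf16Units(plane15));
        char[] units = Character.toChars(plane15);
        System.out.printf("%04X %04X%n", (int) units[0], (int) units[1]);
    }
}
```

The plane 15 code point needs two 16-bit units, so software that handles only single 16-bit units will not process it correctly.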

Locale ID's again: simplified vs. traditional

2000-10-02 Thread Steven R. Loomis
In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese, and "zh-cn" is used for simplified. This occurs in places such as HTTP headers and xml:lang tags. In POSIX locale ID usage, zh_CN and zh_TW are also simplified and traditional, respectively. However, what should be done for

URLEncode international characters

2000-10-02 Thread Raghu Kolluru
Hello Everyone, I have a simple servlet which gets the form fields and stores them in a SQL Server db. Now I am trying to store and retrieve international characters (charset EUC-JP). The problem I am having here is: For the first time when I send the characters, Java gets it as ASCII. It returns

RE: [OT] Word select in Microsoft produ

2000-10-02 Thread Chris Pratley
There are two edit controls in the apps you mention which do AWS, and the calling application is responsible for turning this behaviour on/off (I am assuming it can be turned on/off programmatically). They are mshtml.dll (OE, IE, and HTML messages in Outlook) and riched20.dll (plain text

Re: FREE Perl Mongers Party Tue, Oct 3, 7 pm

2000-10-02 Thread Sarasvati
Hello James at RF.NET... Parties are nice, but when you announce them, please refrain from tag lines like: Yes, they're hiring Perl and C++ people. When I see naughty lines like that, I immediately think "thinly veiled recruiting pitch", and "blatant advertising gimmick"... The wrath-wreaking

RE: lag time in Unicode implementations in OS, etc?

2000-10-02 Thread Carl W. Brown
Michka, I would not expect Windows 2000 to support Unicode 3.0, especially since the final build of W2K was sent to manufacturing in November of 1999, too late for Unicode 3.0. Even if it had come out earlier in 1999 it would have been difficult to implement late in the development cycle unless the

Unicode in Excel 2000

2000-10-02 Thread Peck, Jon
A little off topic... Does anyone know whether Excel can read a Unicode CSV-format file? Although such a file opens in Word 2000 fine, Excel seems terminally confused, whether fed UTF-16 or UTF-8. The only documentation I could find seems to consider Unicode files and CSV files as separate

Re: Unicode in Excel 2000

2000-10-02 Thread Michael (michka) Kaplan
I do not think Excel 2000 will handle these files properly. The Jet 4.0 text IISAM will, for what it's worth. michka a new book on internationalization in VB at http://www.i18nWithVB.com/ - Original Message - From: "Peck, Jon" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent:

Re: lag time in Unicode implementations in OS, etc?

2000-10-02 Thread Michael (michka) Kaplan
I agree 100%, and I could make the same argument for surrogate support in SQL Server 2000 (i.e. there are no characters, so support is not relevant at ship time) but since I cannot ever state with certainty what the next version of products (i.e. Whistler or Yukon) will support, I do not want to

URLEncode international characters

2000-10-02 Thread Raghu Kolluru
I have a simple servlet which gets the form fields and stores them in a SQL Server db. Now I am trying to store and retrieve international characters (charset EUC-JP). The problem I am having here is: For the first time when I send the characters, Java gets it as ASCII. It returns back to the

Re: URLEncode international characters

2000-10-02 Thread addison
Hi Raghu, Your problem is probably: Your servlet is running in the default locale of your server and thus assumes Latin-1 (ISO-8859-1) as the character set of the transaction. It thus converts the individual bytes of your EUC-JP stream to the Java internal representation of these characters
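
The failure described above can be reproduced and undone in a few lines. The following is a minimal sketch, not the fix proposed in the thread: it assumes the text was decoded as ISO-8859-1, which maps every byte to exactly one character and is therefore reversible. The class and method names are hypothetical, and Charset.forName("EUC-JP") assumes a JDK that ships the extended charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: recovers text that was decoded as ISO-8859-1
// although the bytes on the wire were in another encoding.
class EucJpRedecode {
    // Turns each char back into its original byte, then decodes the
    // bytes with the encoding that was actually used.
    static String redecode(String misdecoded, Charset actual) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        Charset eucJp = Charset.forName("EUC-JP"); // assumed to be available
        String original = "\u65E5\u672C\u8A9E";    // three kanji
        // Simulate the servlet's mistake: EUC-JP bytes read as Latin-1.
        String wrong = new String(original.getBytes(eucJp), StandardCharsets.ISO_8859_1);
        System.out.println(redecode(wrong, eucJp).equals(original));
    }
}
```

This round trip only works because ISO-8859-1 loses no bytes; if the container had decoded the request as ASCII or another lossy encoding, the original bytes could not be recovered. Containers implementing Servlet 2.3 or later also allow calling request.setCharacterEncoding before the parameters are read, which avoids the wrong decoding in the first place.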

FW: Need Help

2000-10-02 Thread Magda Danish (Unicode)
-Original Message- From: Rami Radi [mailto:[EMAIL PROTECTED]] Sent: Saturday, September 30, 2000 5:36 AM To: [EMAIL PROTECTED] Subject: Need Help Dear Reader, I am a WAP developer who was put in front of the problem of having to display Arabic characters on WAP phones. These phones

Re: Need Help

2000-10-02 Thread Michael (michka) Kaplan
Hello Rami, I am not sure I understand your question. If the phone uses the Unicode standard, then you do not need the hex codes for Unicode characters; you simply need the pages in question (HTML, XML, or otherwise) to be in a Unicode encoding such as UTF-8. Having a file in such an encoding

Re: Locale ID's again: simplified vs. traditional

2000-10-02 Thread Yung-Fong Tang
Steven R. Loomis wrote: In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese, and "zh-cn" is used for simplified. This occurs in places such as HTTP headers and xml:lang tags. No. "zh-tw" only means Chinese used in Taiwan and "zh-cn" only means Chinese used in China. It

Re: please expand re bidi algorithm

2000-10-02 Thread Michael (michka) Kaplan
No, it's not that at all. It is just that many products have a long history of connection with the people who use the product who also happen to have a bidirectional language as their native one. Many other products have a development team with that expertise. One example can be found in Mozilla

Re: Locale ID's again: simplified vs. traditional

2000-10-02 Thread Doug Ewell
Yung-Fong "Frank" Tang [EMAIL PROTECTED] wrote: Steven R. Loomis wrote: In RFC1766 usage, "zh-tw" is often used to mean traditional Chinese, and "zh-cn" is used for simplified. This occurs in places such as HTTP headers and xml:lang tags. No. "zh-tw" only means Chinese used in Taiwan and

Re: New Name Registry Using Unicode

2000-10-02 Thread Antoine Leca
[EMAIL PROTECTED] wrote: For purposes of name registration uniqueness, the only significant characters are numbers and letters as defined by the Java isLetterOrDigit function returning TRUE. This function determines if a character is a letter or digit according to the Unicode 2.0 standard
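
For readers unfamiliar with the function, here is a small Java sketch of what keeping only "significant" characters could look like. It is an illustration under assumptions, not the XNS algorithm: the class and method names are hypothetical, and a current JVM classifies characters using a much later Unicode version than the 2.0 mentioned in the quoted text, so results can differ for characters added or reclassified since then.

```java
// Hypothetical helper: keeps only the characters that
// Character.isLetterOrDigit accepts, dropping everything else.
class SignificantChars {
    static String significant(String name) {
        StringBuilder sb = new StringBuilder();
        name.codePoints()
            .filter(Character::isLetterOrDigit)  // letters and digits only
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Spaces, apostrophes and hyphens are not letters or digits.
        System.out.println(significant("Tom O'Neil-3"));
    }
}
```

Because the filter works on code points, not on 16-bit chars, letters outside the BMP are tested as whole characters.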

Re: surrogate terminology

2000-10-02 Thread Asmus Freytag
This discussion has become quite "surreal". In the meantime, I and other people who have the need to write about these characters have, with more or less encouragement from the Unicode Editorial Committee started to use the terms "Supplementary Planes", "Supplementary Characters" etc. This