Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread Antoine Leca
Marco Cimarosti wrote: Actually, C does have different types for characters within strings and for characters in isolation. That is not my point of view. There is a special case for 'H', that holds int type rather than char, for backward compatibility reasons (such as because the first

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread Marco Cimarosti
Antoine Leca wrote: Marco Cimarosti wrote: Actually, C does have different types for characters within strings and for characters in isolation. That is not my point of view. There is a special case for 'H', that holds int type rather than char, for backward compatibility reasons (such

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread Michael (michka) Kaplan
From: "Marco Cimarosti" [EMAIL PROTECTED] the Surrogate (aka "Astral") Planes. I believe the UTC has deprecated the term Astral planes with extreme prejudice. HTH! michka a new book on internationalization in VB at http://www.i18nWithVB.com/

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread David Starner
On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote: From: "Marco Cimarosti" [EMAIL PROTECTED] the Surrogate (aka "Astral") Planes. I believe the UTC has deprecated the term Astral planes with extreme prejudice. HTH! The UTC has chosen not to use the term Astral Plane.

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread Michael (michka) Kaplan
[EMAIL PROTECTED] Sent: Monday, November 20, 2000 7:18 AM Subject: Re: string vs. char [was Re: Java and Unicode] On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote: From: "Marco Cimarosti" [EMAIL PROTECTED] the Surrogate (aka "Astral") Plan

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread John Cowan
David Starner wrote: I chose Astral Planes for perceived grace and beauty. Thank you! -- There is / one art || John Cowan [EMAIL PROTECTED] no more / no less|| http://www.reutershealth.com to do / all things ||

[totally OT] Unicode terminology (was Re: string vs. char [was Re: Java and Unicode])

2000-11-20 Thread Marco Cimarosti
David Starner wrote: Sent: 20 Nov 2000, Mon 16.18 To: Unicode List Subject: Re: string vs. char [was Re: Java and Unicode] On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote: From: "Marco Cimarosti" [EMAIL PROTECTED] the Surrogate (aka "Astral

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread addison
Hi Jani, I dunno. I oversimplified in that statement about exposing vs. hiding. ICU "hides" the facts about the Unicode implementation in macros, specifically a next and previous character macro and various other fillips. If you look very closely at the function (method) prototypes you can see

Re: string vs. char [was Re: Java and Unicode]

2000-11-20 Thread Mark Davis
ormation. Mark - Original Message - From: "Michael (michka) Kaplan" [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Monday, November 20, 2000 06:54 Subject: Re: string vs. char [was Re: Java and Unicode] From: "Marco Cimarosti" [EMAIL PROTEC

RE: string vs. char [was Re: Java and Unicode]

2000-11-17 Thread Marco Cimarosti
Addison P. Phillips wrote: I ended up deciding that the Unicode API for this OS will only work in strings. CTYPE replacement functions (such as isalpha) and character based replacement functions (such as strchr) will take and return strings for all of their arguments. Internally, my

RE: string vs. char [was Re: Java and Unicode]

2000-11-17 Thread Marco Cimarosti
Ooops! In my previous message, I wrote: wchar_t * _wcschr_32(const wint_t * s, wchar_t c); wchar_t * _wcsrchr_32(const wint_t * s, wchar_t c); What I actually wanted to write is: wchar_t * _wcschr_32(const wchar_t * s, wint_t c); wchar_t * _wcsrchr_32(const wchar_t * s, wint_t c); Sorry if

Re: string vs. char [was Re: Java and Unicode]

2000-11-17 Thread addison
for example. Mark - Original Message - From: [EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Thursday, November 16, 2000 13:24 Subject: string vs. char [was Re: Java and Unicode] Normally this thread would be of only academic interest to me... ...bu

RE: string vs. char [was Re: Java and Unicode]

2000-11-17 Thread addison
Well... I think you're right. I knew that char and string units weren't really the same thing. My concern was how to make it easy on developers to use the Unicode API using their "native intelligence". More thought makes me less certain of my approach. Specifically, as Mark points out, looping

Re: Java and Unicode

2000-11-16 Thread Elliotte Rusty Harold
At 4:44 PM -0800 11/15/00, Markus Scherer wrote: In the case of Java, the equivalent course of action would be to stick with a 16-bit char as the base type for strings. The int type could be used in _additional_ APIs for single Unicode code points, deprecating the old APIs with char. It's
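A minimal sketch of the approach described above: char stays a 16-bit code unit, and additional int-based methods handle whole code points. It uses Character.isLetter(int), String.codePointAt, and Character.toChars, which entered the standard library in J2SE 5.0, well after this thread. The class and helper names are hypothetical, chosen only for illustration.

```java
// Hypothetical demo class; the name does not come from the thread.
class CodePointApiSketch {
    // U+10400 DESERET CAPITAL LETTER LONG I, a letter outside the 16-bit range.
    static final int DESERET_LONG_I = 0x10400;

    // Old-style check: looks at a single 16-bit char, which may be half a pair.
    static boolean isLetterByChar(String s, int index) {
        return Character.isLetter(s.charAt(index));
    }

    // int-based check: looks at the whole code point that starts at index.
    static boolean isLetterByCodePoint(String s, int index) {
        return Character.isLetter(s.codePointAt(index));
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(DESERET_LONG_I));
        System.out.println("char-based:       " + isLetterByChar(s, 0));
        System.out.println("code-point-based: " + isLetterByCodePoint(s, 0));
    }
}
```

The char-based call sees only a lone surrogate, which is not a letter, while the int-based call sees the full code point.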

Re: Java and Unicode

2000-11-16 Thread Valeriy E. Ushakov
On Thu, Nov 16, 2000 at 05:58:27 -0800, Elliotte Rusty Harold wrote: public char charAt(int index) This method is used to walk strings, looking at each character in turn, a useful thing to do. Clearly it would be possible to replace it with a method with a String return type like this:
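One way to picture a string walk whose steps return a String rather than a char is sketched below. This is only an illustration under stated assumptions, not the method proposed in the thread: the helper name characters is hypothetical, and String.codePointAt and Character.charCount were added in J2SE 5.0, after this discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of any API discussed in the thread.
class CodePointWalkSketch {
    // Returns each code point of s as its own String, so a surrogate
    // pair comes back as one two-char String instead of two chars.
    static List<String> characters(String s) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int width = Character.charCount(cp); // 1 or 2 code units
            out.add(s.substring(i, i + width));
            i += width;
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x10400)) + "b";
        for (String ch : characters(s)) {
            System.out.println(ch.length() + " code unit(s)");
        }
    }
}
```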

Re: Java and Unicode

2000-11-16 Thread Elliotte Rusty Harold
At 7:26 AM -0800 11/16/00, Valeriy E. Ushakov wrote: On Thu, Nov 16, 2000 at 05:58:27 -0800, Elliotte Rusty Harold wrote: public char charAt(int index) This method is used to walk strings, looking at each character in turn, a useful thing to do. Clearly it would be possible to replace

Re: Java and Unicode

2000-11-16 Thread Thomas Chan
On Thu, 16 Nov 2000, Markus Scherer wrote: The ICU API was changed this way within a few months this year. Some of the higher-level implementations are still to follow until next summer, when there will be some 45000 CJK characters that will be infrequent but hard to ignore - the Chinese and

Re: Java and Unicode

2000-11-16 Thread Markus Scherer
Juliusz Chroboczek wrote: I believe that Java strings use UTF-8 internally. .class files use a _modified_ utf-8. at runtime, strings are always in 16-bit unicode. At any rate the internal implementation is not exposed to applications -- note that `length' is a method in class String (while
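The difference between modified UTF-8 and standard UTF-8 can be observed from ordinary Java code, as the sketch below tries to show. It assumes that DataOutputStream.writeUTF is a fair stand-in for the .class file encoding (its documentation describes the same modified UTF-8 scheme); the class and helper names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class comparing the two encodings.
class ModifiedUtf8Sketch {
    // Bytes produced by writeUTF, without its 2-byte length prefix.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String nul = String.valueOf((char) 0);
        String supplementary = new String(Character.toChars(0x10400));
        // U+0000 takes two bytes in modified UTF-8, one in standard UTF-8.
        System.out.println(modifiedUtf8(nul).length + " vs " + standardUtf8(nul).length);
        // A supplementary character is written as two 3-byte surrogates
        // in modified UTF-8, but as one 4-byte sequence in standard UTF-8.
        System.out.println(modifiedUtf8(supplementary).length + " vs "
                + standardUtf8(supplementary).length);
    }
}
```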

string vs. char [was Re: Java and Unicode]

2000-11-16 Thread addison
Normally this thread would be of only academic interest to me... ...but this week I'm writing a spec for adding Unicode support to an embedded operating system written in C. Due to Messrs. O'Conner and Scherer's presentations at the most recent IUC, I was aware of the clash between internal

Re: string vs. char [was Re: Java and Unicode]

2000-11-16 Thread Mark Davis
: Thursday, November 16, 2000 13:24 Subject: string vs. char [was Re: Java and Unicode] Normally this thread would be of only academic interest to me... ...but this week I'm writing a spec for adding Unicode support to an embedded operating system written in C. Due to Messrs. O'Conner and

Re: Java and Unicode

2000-11-15 Thread Elliotte Rusty Harold
One thing I'm very curious about going forward: Right now character values greater than 65535 are purely theoretical. However this will change. It seems to me that handling these characters properly is going to require redefining the char data type from two bytes to four. This is a major

Re: Java and Unicode

2000-11-15 Thread Michael (michka) Kaplan
[EMAIL PROTECTED] To: "Unicode List" [EMAIL PROTECTED] Sent: Wednesday, November 15, 2000 6:15 AM Subject: Re: Java and Unicode One thing I'm very curious about going forward: Right now character values greater than 65535 are purely theoretical. However this will change. It seems

RE: Java and Unicode

2000-11-15 Thread Marco . Cimarosti
Elliotte Rusty Harold wrote: One thing I'm very curious about going forward: Right now character values greater than 65535 are purely theoretical. However this will change. It seems to me that handling these characters properly is going to require redefining the char data type from two

Re: Java and Unicode

2000-11-15 Thread Jungshik Shin
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote: In any case, I think that UTF-16 is the answer here. Many people try to compare this to DBCS, but it really is not the same thing: understanding lead bytes and trail bytes in DBCS is *astoundingly* more complicated than handling
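Part of why UTF-16 is simpler than DBCS is that lead (high) and trail (low) surrogates occupy disjoint ranges, D800..DBFF and DC00..DFFF, so a single code unit can be classified without scanning backward. The sketch below spells out that arithmetic by hand; the class and method names are hypothetical and used only for illustration.

```java
// Hypothetical demo class showing UTF-16 surrogate arithmetic by hand.
class SurrogateMathSketch {
    // The two ranges never overlap each other or ordinary BMP characters.
    static boolean isLead(char c)  { return c >= 0xD800 && c <= 0xDBFF; }
    static boolean isTrail(char c) { return c >= 0xDC00 && c <= 0xDFFF; }

    // Combines a lead/trail pair into a code point in 0x10000..0x10FFFF.
    static int combine(char lead, char trail) {
        return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00);
    }

    // Splits a supplementary code point back into its two code units.
    static char leadOf(int cp)  { return (char) (0xD800 + ((cp - 0x10000) >> 10)); }
    static char trailOf(int cp) { return (char) (0xDC00 + ((cp - 0x10000) & 0x3FF)); }

    public static void main(String[] args) {
        int cp = combine((char) 0xD801, (char) 0xDC00);
        System.out.println(Integer.toHexString(cp));
        System.out.println(Integer.toHexString(leadOf(cp)) + " "
                + Integer.toHexString(trailOf(cp)));
    }
}
```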

Re: Java and Unicode

2000-11-15 Thread Thomas Chan
On Wed, 15 Nov 2000, Jungshik Shin wrote: On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote: In any case, I think that UTF-16 is the answer here. Many people try to compare this to DBCS, but it really is not the same thing: understanding lead bytes and trail bytes in DBCS is

Re: Java and Unicode

2000-11-15 Thread Roozbeh Pournader
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote: I do not think they are so theoretical, with both 10646 and Unicode including them in the very near future (unless you count it as theoretical when you drop an egg but it has not yet hit the ground!). Lemme think. You're saying that when I

Re: Java and Unicode

2000-11-15 Thread Jungshik Shin
On Wed, 15 Nov 2000, Thomas Chan wrote: On Wed, 15 Nov 2000, Jungshik Shin wrote: On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote: Many people try to compare this to DBCS, but it really is not the same thing: understanding lead bytes and trail bytes in DBCS is *astoundingly*

Re: Java and Unicode

2000-11-15 Thread John Jenkins
On Wednesday, November 15, 2000, at 12:08 PM, Roozbeh Pournader wrote: On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote: I do not think they are so theoretical, with both 10646 and Unicode including them in the very near future (unless you count it as theoretical when you drop

Re: Java and Unicode

2000-11-15 Thread Kenneth Whistler
John O'Conner wrote: Yes. If you have been involved with Unicode for any period of time at all, you would know that the Unicode consortium has advertised Unicode's 16-bit encoding for a long, long time, even in its latest Unicode 3.0 spec. The Unicode 3.0 spec clearly favors the 16-bit

Java and Unicode

2000-11-14 Thread Jani Kajala
As Unicode will soon contain characters defined beyond the code point range [0,65535], I'm wondering how Java is going to handle this. I didn't find any hints in the JDK documentation either; at least a few days ago, when I browsed the Java documentation about internationalization, I just saw a

Re: Java and Unicode

2000-11-14 Thread John O'Conner
You can currently store UTF-16 in the String and StringBuffer classes. However, all operations are on char values or 16-bit code units. The upcoming release of the J2SE platform will include support for Unicode 3.0 (maybe 3.0.1) properties, case mapping, collation, and character break iteration.
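The gap between storing UTF-16 and operating on 16-bit code units can be made concrete with a short sketch. It relies on StringBuffer.appendCodePoint and String.codePointCount, both added in J2SE 5.0, so it reflects later releases rather than the platform described in this message; the class and helper names are hypothetical.

```java
// Hypothetical demo class contrasting code units with code points.
class CodeUnitCountSketch {
    // Builds "A" followed by one supplementary character (U+10400).
    static String sample() {
        return new StringBuffer().append('A').appendCodePoint(0x10400).toString();
    }

    // length() counts 16-bit code units, not characters.
    static int codeUnits(String s) { return s.length(); }

    // codePointCount counts each surrogate pair once.
    static int codePoints(String s) { return s.codePointCount(0, s.length()); }

    public static void main(String[] args) {
        String s = sample();
        System.out.println("code units:  " + codeUnits(s));
        System.out.println("code points: " + codePoints(s));
        // charAt hands back one half of the pair, not the character.
        System.out.println("charAt(1) is a high surrogate: "
                + Character.isHighSurrogate(s.charAt(1)));
    }
}
```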

RE: Java, SQL, Unicode and Databases

2000-06-23 Thread Michael Kaplan (Trigeminal Inc.)
PROTECTED]] Sent: Friday, June 23, 2000 7:55 AM To: Unicode List Cc: Unicode List; [EMAIL PROTECTED] Subject: Re: Java, SQL, Unicode and Databases I think that this is also true for DB2 using UTF-8 as the database encoding. From an application perspective, MS SQL Server

Re: Java, SQL, Unicode and Databases

2000-06-23 Thread Joe_Ross
PROTECTED], Hossein Kushki@IBMCA, Vladimir Dvorkin [EMAIL PROTECTED], Steven Watt [EMAIL PROTECTED] Subject: Re: Java, SQL, Unicode and Databases Joe, Can you expand on this a bit more? Privately if you prefer. Do you mean version 7 of MS SQL Server? I assume if it doesn't have UTF-8, i

RE: Java, SQL, Unicode and Databases

2000-06-23 Thread Michael Kaplan (Trigeminal Inc.)
this case is hiding the differences. Michael -- From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]] Sent: Friday, June 23, 2000 2:27 PM To: Michael Kaplan (Trigeminal Inc.) Cc: Unicode List; [EMAIL PROTECTED] Subject: RE: Java, SQL, Unicode and Databases Michae

Java, SQL, Unicode and Databases

2000-06-22 Thread Tex Texin
I want to write an application in Java that will store information in a database using Unicode. Ideally the application will run with any database that supports Unicode. One would presume that the JDBC driver would take care of any differences between databases so my application could be