Marco Cimarosti wrote:
Actually, C does have different types for characters within strings and for
characters in isolation.
That is not my point of view.
There is a special case for 'H', which has type int rather than char, for
backward compatibility reasons (such as because the first
Antoine Leca wrote:
Marco Cimarosti wrote:
Actually, C does have different types for characters within strings and for characters in isolation.
That is not my point of view.
There is a special case for 'H', which has type int rather than char, for backward compatibility reasons (such
From: "Marco Cimarosti" [EMAIL PROTECTED]
the Surrogate (aka "Astral") Planes.
I believe the UTC has deprecated the term Astral planes with extreme
prejudice. HTH!
michka
a new book on internationalization in VB at
http://www.i18nWithVB.com/
On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote:
From: "Marco Cimarosti" [EMAIL PROTECTED]
the Surrogate (aka "Astral") Planes.
I believe the UTC has deprecated the term Astral planes with extreme
prejudice. HTH!
The UTC has chosen not to use the term Astral Plane.
From: [EMAIL PROTECTED]
Sent: Monday, November 20, 2000 7:18 AM
Subject: Re: string vs. char [was Re: Java and Unicode]
On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote:
From: "Marco Cimarosti" [EMAIL PROTECTED]
the Surrogate (aka "Astral") Plan
David Starner wrote:
I chose Astral Planes for perceived grace
and beauty.
Thank you!
--
There is / one art || John Cowan [EMAIL PROTECTED]
no more / no less|| http://www.reutershealth.com
to do / all things ||
David Starner wrote:
Sent: 20 Nov 2000, Mon 16.18
To: Unicode List
Subject: Re: string vs. char [was Re: Java and Unicode]
On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael (michka) Kaplan wrote:
From: "Marco Cimarosti" [EMAIL PROTECTED]
the Surrogate (aka "Astral
Hi Jani,
I dunno. I oversimplified in that statement about exposing vs. hiding.
ICU "hides" the facts about the Unicode implementation in macros,
specifically a next and previous character macro and various other
fillips. If you look very closely at the function (method) prototypes you
can see
Mark
- Original Message -
From: "Michael (michka) Kaplan" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Monday, November 20, 2000 06:54
Subject: Re: string vs. char [was Re: Java and Unicode]
From: "Marco Cimarosti" [EMAIL PROTECTED]
Addison P. Phillips wrote:
I ended up deciding that the Unicode API for this OS will work only with strings. CTYPE replacement functions (such as isalpha) and character-based replacement functions (such as strchr) will take and return strings for all of their arguments.
Internally, my
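The string-in/string-out design described above can be sketched as follows. This is a hypothetical helper (the name `s_isalpha` and ASCII-only test are illustrative; a real implementation would consult Unicode property tables):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string-based replacement for isalpha: it receives the
 * whole string and an index rather than a lone character, so a
 * surrogate pair (or any multi-unit character) stays visible to the
 * classifier without changing the signature. ASCII-only sketch. */
int s_isalpha(const uint16_t *s, size_t i) {
    uint16_t c = s[i];
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

The design point is that the signature never needs to change when characters stop fitting in one code unit.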
Ooops!
In my previous message, I wrote:
wchar_t * _wcschr_32(const wint_t * s, wchar_t c);
wchar_t * _wcsrchr_32(const wint_t * s, wchar_t c);
What I actually wanted to write is:
wchar_t * _wcschr_32(const wchar_t * s, wint_t c);
wchar_t * _wcsrchr_32(const wchar_t * s, wint_t c);
Sorry if
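A minimal sketch of what the corrected prototype could look like in implementation, assuming the same semantics as the standard wcschr (searching for L'\0' finds the terminator):

```c
#include <stddef.h>
#include <wchar.h>

/* Sketch of the corrected prototype above: the string is wchar_t*,
 * the sought character is widened to wint_t. Searching for the
 * terminator returns a pointer to it, matching wcschr semantics. */
wchar_t *_wcschr_32(const wchar_t *s, wint_t c) {
    for (;; s++) {
        if (*s == (wchar_t)c)
            return (wchar_t *)s;
        if (*s == L'\0')
            return NULL;
    }
}
```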
for example.
Mark
- Original Message -
From: [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, November 16, 2000 13:24
Subject: string vs. char [was Re: Java and Unicode]
Normally this thread would be of only academic interest to me...
...bu
Well... I think you're right. I knew that char and string units weren't
really the same thing. My concern was how to make it easy on developers to
use the Unicode API using their "native intelligence".
More thought makes me less certain of my approach. Specifically, as Mark
points out, looping
At 4:44 PM -0800 11/15/00, Markus Scherer wrote:
In the case of Java, the equivalent course of action would be to
stick with a 16-bit char as the base type for strings. The int type
could be used in _additional_ APIs for single Unicode code points,
deprecating the old APIs with char.
It's
On Thu, Nov 16, 2000 at 05:58:27 -0800, Elliotte Rusty Harold wrote:
public char charAt(int index)
This method is used to walk strings, looking at each character in
turn, a useful thing to do. Clearly it would be possible to replace
it with a method with a String return type like this:
At 7:26 AM -0800 11/16/00, Valeriy E. Ushakov wrote:
On Thu, Nov 16, 2000 at 05:58:27 -0800, Elliotte Rusty Harold wrote:
public char charAt(int index)
This method is used to walk strings, looking at each character in
turn, a useful thing to do. Clearly it would be possible to replace
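Walking a string by code point rather than by code unit is only a small amount of extra logic. A sketch in C (not ICU's actual macros, just the same idea):

```c
#include <stddef.h>
#include <stdint.h>

/* Return the code point starting at index *i in a UTF-16 buffer and
 * advance *i past it. An unpaired surrogate is returned as-is. */
uint32_t u16_next(const uint16_t *s, size_t len, size_t *i) {
    uint16_t lead = s[(*i)++];
    if (lead >= 0xD800 && lead <= 0xDBFF && *i < len) {
        uint16_t trail = s[*i];
        if (trail >= 0xDC00 && trail <= 0xDFFF) {
            (*i)++;
            return 0x10000 + (((uint32_t)(lead - 0xD800) << 10)
                              | (uint32_t)(trail - 0xDC00));
        }
    }
    return lead;
}
```

A loop over `u16_next` visits 'A' then U+10437 in the buffer {0x0041, 0xD801, 0xDC37}, whereas charAt-style indexing would see three separate units.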
On Thu, 16 Nov 2000, Markus Scherer wrote:
The ICU API was changed this way within a few months this year. Some of the
higher-level implementations are still to follow until next summer, when there will
be some 45000 CJK characters that will be infrequent but hard to ignore - the Chinese
and
Juliusz Chroboczek wrote:
I believe that Java strings use UTF-8 internally.
.class files use a _modified_ UTF-8; at runtime, strings are always 16-bit Unicode.
At any rate the
internal implementation is not exposed to applications -- note that
`length' is a method in class String (while
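The .class-file modification is small but concrete: in Java's modified UTF-8, U+0000 is written as the two-byte overlong sequence 0xC0 0x80 so that encoded strings never contain a NUL byte (and supplementary characters are encoded as surrogate pairs rather than four-byte sequences). A BMP-only sketch of the encoder:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one BMP code unit in Java's modified UTF-8. Identical to
 * standard UTF-8 except that U+0000 becomes the overlong pair
 * 0xC0 0x80, keeping NUL bytes out of the encoded form. */
size_t mutf8_encode(uint16_t c, unsigned char *out) {
    if (c == 0x0000) { out[0] = 0xC0; out[1] = 0x80; return 2; }
    if (c < 0x0080)  { out[0] = (unsigned char)c;    return 1; }
    if (c < 0x0800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (c >> 12));
    out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (c & 0x3F));
    return 3;
}
```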
Normally this thread would be of only academic interest to me...
...but this week I'm writing a spec for adding Unicode support to an
embedded operating system written in C. Due to Messrs. O'Conner and
Scherer's presentations at the most recent IUC, I was aware of the clash
between internal
: Thursday, November 16, 2000 13:24
Subject: string vs. char [was Re: Java and Unicode]
Normally this thread would be of only academic interest to me...
...but this week I'm writing a spec for adding Unicode support to an
embedded operating system written in C. Due to Messrs. O'Conner and
One thing I'm very curious about going forward: Right now character
values greater than 65535 are purely theoretical. However this will
change. It seems to me that handling these characters properly is
going to require redefining the char data type from two bytes to
four. This is a major
From: [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Wednesday, November 15, 2000 6:15 AM
Subject: Re: Java and Unicode
One thing I'm very curious about going forward: Right now character
values greater than 65535 are purely theoretical. However this will
change. It seems
Elliotte Rusty Harold wrote:
One thing I'm very curious about going forward: Right now character
values greater than 65535 are purely theoretical. However this will
change. It seems to me that handling these characters properly is
going to require redefining the char data type from two
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote:
In any case, I think that UTF-16 is the answer here.
Many people try to compare this to DBCS, but it really is not the same
thing; understanding lead bytes and trail bytes in DBCS is *astoundingly*
more complicated than handling
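The contrast can be made concrete: every UTF-16 code unit is self-identifying with a single range test, whereas a DBCS byte's role depends on the bytes before it. A sketch:

```c
#include <stdint.h>

/* UTF-16 code units are self-describing: one masked compare tells you
 * whether a unit is a lead surrogate, a trail surrogate, or a complete
 * BMP character, with no need to scan backwards as in DBCS. */
int u16_is_lead(uint16_t u)   { return (u & 0xFC00) == 0xD800; }
int u16_is_trail(uint16_t u)  { return (u & 0xFC00) == 0xDC00; }
int u16_is_single(uint16_t u) { return (u & 0xF800) != 0xD800; }
```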
On Wed, 15 Nov 2000, Jungshik Shin wrote:
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote:
In any case, I think that UTF-16 is the answer here.
Many people try to compare this to DBCS, but it really is not the same
thing; understanding lead bytes and trail bytes in DBCS is
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote:
I do not think they are so theoretical, with both 10646 and Unicode
including them in the very near future (unless you count it as theoretical
when you drop an egg but it has not yet hit the ground!).
Lemme think. You're saying that when I
On Wed, 15 Nov 2000, Thomas Chan wrote:
On Wed, 15 Nov 2000, Jungshik Shin wrote:
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote:
Many people try to compare this to DBCS, but it really is not the same
thing; understanding lead bytes and trail bytes in DBCS is *astoundingly*
On Wednesday, November 15, 2000, at 12:08 PM, Roozbeh Pournader wrote:
On Wed, 15 Nov 2000, Michael (michka) Kaplan wrote:
I do not think they are so theoretical, with both 10646 and Unicode
including them in the very near future (unless you count it as theoretical
when you drop
John O'Conner wrote:
Yes. If you have been involved with Unicode for any period of time at all, you
would know that the Unicode consortium has advertised Unicode's 16-bit
encoding for a long, long time, even in its latest Unicode 3.0 spec. The
Unicode 3.0 spec clearly favors the 16-bit
As Unicode will soon contain characters defined beyond the code point range
[0,65535], I'm wondering how Java is going to handle this.
I didn't find any hints in the JDK documentation either; at least, a few days
ago when I browsed the Java documentation about internationalization I just
saw a
You can currently store UTF-16 in the String and StringBuffer classes. However,
all operations are on char values or 16-bit code units. The upcoming release of
the J2SE platform will include support for Unicode 3.0 (maybe 3.0.1)
properties, case mapping, collation, and character break iteration.
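Because the operations work on 16-bit code units, a caller who wants counts in code points has to account for surrogate pairs itself. A sketch of what that looks like at the buffer level:

```c
#include <stddef.h>
#include <stdint.h>

/* Count code points in a UTF-16 buffer: every unit counts except
 * trail surrogates, which merely complete the preceding lead. */
size_t u16_count_code_points(const uint16_t *s, size_t len) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if ((s[i] & 0xFC00) != 0xDC00)
            n++;
    return n;
}
```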
From: [EMAIL PROTECTED]
Sent: Friday, June 23, 2000 7:55 AM
To: Unicode List
Cc: Unicode List; [EMAIL PROTECTED]
Subject: Re: Java, SQL, Unicode and Databases
I think that this is also true for DB2 using UTF-8 as the database
encoding.
From an application perspective, MS SQL Server
[EMAIL PROTECTED], Hossein Kushki@IBMCA, Vladimir Dvorkin
[EMAIL PROTECTED], Steven Watt [EMAIL PROTECTED]
Subject: Re: Java, SQL, Unicode and Databases
Joe,
Can you expand on this a bit more? Privately if you prefer.
Do you mean version 7 of MS SQL Server?
I assume if it doesn't have UTF-8, i
this case is hiding the differences.
Michael
--
From: [EMAIL PROTECTED][SMTP:[EMAIL PROTECTED]]
Sent: Friday, June 23, 2000 2:27 PM
To: Michael Kaplan (Trigeminal Inc.)
Cc: Unicode List; [EMAIL PROTECTED]
Subject: RE: Java, SQL, Unicode and Databases
Michae
I want to write an application in Java that will store information
in a database using Unicode. Ideally the application will run
with any database that supports Unicode. One would presume that the
JDBC driver would take care of any differences between databases
so my application could be