On Sun, Feb 08, 2004 at 10:37:45PM -0800, Chris Mullins wrote:
.NET has the ability to:
1) Iterate over strings by graphemes so that regardless of encoding,
developers can treat Unicode combining characters and surrogate pairs as
a single entity.
2) Build and manipulate strings that
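
For point 1, here is a minimal sketch of grapheme iteration. It assumes the API meant is StringInfo in System.Globalization (the message above does not name it) and a runtime that has char.ConvertFromUtf32. GraphemeDemo and TextElements are made-up names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class GraphemeDemo
{
    // Splits a string into text elements, keeping surrogate pairs and
    // base + combining-mark sequences together as one entry each.
    public static List<string> TextElements(string s)
    {
        var result = new List<string>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            result.Add(e.GetTextElement());
        }
        return result;
    }

    public static void Main()
    {
        // "e" + COMBINING ACUTE ACCENT (U+0301), then OLD ITALIC LETTER A (U+10300)
        string s = "e\u0301" + char.ConvertFromUtf32(0x10300);
        Console.WriteLine(s.Length);               // 4 UTF-16 code units
        Console.WriteLine(TextElements(s).Count);  // 2 text elements
    }
}
```

The point is that s.Length counts 16-bit code units, while the enumerator hands back what a reader would call characters.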
On Mon, 2004-02-09 at 02:22, gabor wrote:
snip/
i just can't understand why the designers of dotnet didn't look at the unicode
standards. i can understand that java has this problem, but java is much older
than dotnet.
maybe it's because winapi uses 16-bit characters?
I imagine it's due to
No, Gabor is not confused. Unicode has grown. It is now 20 bits, not 16.
See for example http://www.terena.nl/library/multiling/unicode/utf16.html
(which I just found by googling; it looks a bit out-of-date).
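
To make the "20 bits" concrete: a surrogate pair carries a 20-bit payload on top of the 0x10000 offset, so code points run up to U+10FFFF (which takes 21 bits to write as a plain integer). The sketch below does the UTF-16 arithmetic by hand; Utf16Math and ToSurrogates are hypothetical names, not a library API.

```csharp
using System;

public static class Utf16Math
{
    // Encodes a supplementary code point (U+10000..U+10FFFF) as a
    // high/low surrogate pair using the UTF-16 formula.
    public static char[] ToSurrogates(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException("codePoint");

        int v = codePoint - 0x10000;               // 20-bit payload
        char high = (char)(0xD800 + (v >> 10));    // top 10 bits
        char low = (char)(0xDC00 + (v & 0x3FF));   // bottom 10 bits
        return new[] { high, low };
    }

    public static void Main()
    {
        char[] pair = ToSurrogates(0x10300);       // OLD ITALIC LETTER A
        Console.WriteLine("{0:X4} {1:X4}", (int)pair[0], (int)pair[1]); // D800 DF00
    }
}
```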
I had absolutely no clue about this ;) I've been using unicode for years and I
As I recall, when the CM3 Modula-3 compiler added support for unicode, they
used a hybrid scheme where TEXTs (their equivalent of System.String) can
contain both 8-bit and 16-bit chars. So only the portions of the string
that require more than 8 bits use it. Something similar could be done with
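
A rough sketch of the hybrid idea, purely as an illustration: this is not CM3's TEXT implementation and not a .NET type. HybridText and its members are invented here, and the "one segment per Append" rule is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical type: segments are stored as byte[] when every char
// fits in 8 bits, and as char[] otherwise.
public sealed class HybridText
{
    private readonly List<object> segments = new List<object>();

    public int Length { get; private set; }

    public void Append(string s)
    {
        bool narrow = true;
        foreach (char c in s)
        {
            if (c > 0xFF) { narrow = false; break; }
        }

        if (narrow)
        {
            var bytes = new byte[s.Length];
            for (int i = 0; i < s.Length; i++) bytes[i] = (byte)s[i];
            segments.Add(bytes);
        }
        else
        {
            segments.Add(s.ToCharArray());
        }
        Length += s.Length;
    }

    // Bytes used for character data, ignoring object overhead.
    public int StorageBytes
    {
        get
        {
            int total = 0;
            foreach (object seg in segments)
            {
                byte[] b = seg as byte[];
                total += (b != null) ? b.Length : ((char[])seg).Length * 2;
            }
            return total;
        }
    }

    public override string ToString()
    {
        var sb = new StringBuilder(Length);
        foreach (object seg in segments)
        {
            byte[] b = seg as byte[];
            if (b != null) { foreach (byte x in b) sb.Append((char)x); }
            else { sb.Append((char[])seg); }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        var t = new HybridText();
        t.Append("hello ");               // 6 chars stored as 6 bytes
        t.Append("\u4E16\u754C");         // 2 chars stored as 4 bytes
        Console.WriteLine(t.Length);       // 8
        Console.WriteLine(t.StorageBytes); // 10
    }
}
```

The trade-off is that indexing has to walk the segments, so random access is no longer constant time.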
On Mon, 2004-02-09 at 19:21, Marcus wrote:
As I recall, when the CM3 Modula-3 compiler added support for unicode, they
used a hybrid scheme where TEXTs (their equivalent of System.String) can
contain both 8-bit and 16-bit chars. So only the portions of the string
that require more than 8
hi,
as i understand, characters in .net are 16-bit values.
but what about unicode characters, that are simply above the 16-bit
limit?
for example:
OLD ITALIC LETTER A (unicode code point: U+10300, i.e. hexadecimal 10300).
how do you represent those in .net?
i tried to open a textfile containing this old-italic-a:
- the
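
For what it's worth, a small sketch of how such a character ends up in a .NET string: as two 16-bit chars forming a surrogate pair. It assumes a runtime with char.ConvertFromUtf32 and char.ConvertToUtf32; OldItalicDemo and CountCodePoints are names made up for the example.

```csharp
using System;

public static class OldItalicDemo
{
    // Counts code points rather than 16-bit chars by treating each
    // surrogate pair as a single unit.
    public static int CountCodePoints(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i)) i++;   // skip the low half
            count++;
        }
        return count;
    }

    public static void Main()
    {
        string s = char.ConvertFromUtf32(0x10300);              // OLD ITALIC LETTER A
        Console.WriteLine(s.Length);                            // 2
        Console.WriteLine(char.IsHighSurrogate(s[0]));          // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));           // True
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // 10300
        Console.WriteLine(s == "\U00010300");                   // True
        Console.WriteLine(CountCodePoints("a" + s));            // 2
    }
}
```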
Hi Gabor,
I think you're confused. Characters in .NET are 16 bits BECAUSE they are
unicode. 16 bits = 2 bytes = 65536 values.
a way to check that is simple. here's some C# example code:
string s = "a";
s += (char)10300;
Console.WriteLine("s = " + s);
Console.WriteLine("len =
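
A note on the cast in that snippet, as a hedged sketch: (char)10300 uses decimal 10300, which is U+283C, not U+10300. The hexadecimal value 0x10300 does not fit in a 16-bit char at all. CastDemo and Hex are hypothetical names used only here.

```csharp
using System;

public static class CastDemo
{
    // Formats a char as four hex digits.
    public static string Hex(char c)
    {
        return ((int)c).ToString("X4");
    }

    public static void Main()
    {
        char c = (char)10300;                // decimal 10300 == 0x283C
        Console.WriteLine(Hex(c));           // 283C

        // 0x10300 needs more than 16 bits; an unchecked cast keeps only
        // the low 16 bits, so the result is a different character.
        char t = unchecked((char)0x10300);
        Console.WriteLine(Hex(t));           // 0300

        // Going through a string keeps the full code point.
        string s = "a" + char.ConvertFromUtf32(0x10300);
        Console.WriteLine("len = " + s.Length);   // len = 3
    }
}
```

So a length of 2 from the original test says nothing about characters above U+FFFF; the string never contained one.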
On 08-Feb-2004, max [EMAIL PROTECTED] wrote:
Hi Gabor,
I think you're confused. Characters in .NET are 16 bits BECAUSE they are
unicode. 16 bits = 2 bytes = 65536 values.
No, Gabor is not confused. Unicode has grown. It is now 20 bits, not 16.
See for example
represent those in .net?
Cheers!
Fabio Montoya
| -Original Message-
| From: [EMAIL PROTECTED]
| [mailto:[EMAIL PROTECTED] On Behalf Of max
| Sent: Sunday, February 08, 2004 10:04 PM
| To: gabor; [EMAIL PROTECTED]
| Subject: Re: [Mono-list] unicode trouble
|
| Hi Gabor,
| I think
'; [EMAIL PROTECTED]
| Subject: RE: [Mono-list] unicode trouble
|
|
|
| Gabor is right, Max! The Unicode standard defines characters
| in a 32-bit space: the Universal Character Set in 32 bits, or UCS-4.
|
| For practical reasons, the Unicode standard defines
| transformation formats,
| i.e.:
|
| UTF
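
To illustrate the transformation formats, a small sketch that measures how many bytes the same text takes in UTF-8, UTF-16 and UTF-32. It assumes the standard System.Text.Encoding classes (Encoding.UTF32 is not available on very old runtimes); EncodingSizes and ByteCounts are names invented for this example.

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Returns byte counts for s in UTF-8, UTF-16 and UTF-32, in that order.
    public static int[] ByteCounts(string s)
    {
        return new[]
        {
            Encoding.UTF8.GetByteCount(s),
            Encoding.Unicode.GetByteCount(s),   // UTF-16, little-endian
            Encoding.UTF32.GetByteCount(s),
        };
    }

    public static void Main()
    {
        // U+0061, U+00E9, U+4E16 and U+10300
        foreach (string s in new[] { "a", "\u00E9", "\u4E16", "\U00010300" })
        {
            int[] n = ByteCounts(s);
            Console.WriteLine("U+{0:X4}: UTF-8 {1}, UTF-16 {2}, UTF-32 {3}",
                char.ConvertToUtf32(s, 0), n[0], n[1], n[2]);
        }
        // U+0061: UTF-8 1, UTF-16 2, UTF-32 4
        // U+00E9: UTF-8 2, UTF-16 2, UTF-32 4
        // U+4E16: UTF-8 3, UTF-16 2, UTF-32 4
        // U+10300: UTF-8 4, UTF-16 4, UTF-32 4
    }
}
```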