- Original Message -
From: peter green [EMAIL PROTECTED]
To: FPC developers' list fpc-devel@lists.freepascal.org
Sent: Sunday, January 09, 2005 11:45 PM
Subject: RE: [fpc-devel] ansistrings and widestrings
Type
// Length parameters are number of CHARS, not bytes
TWide2AnsiMove
peter green wrote:
it should be noted that pascal classes are really not suited to doing
strings.
IMO we should distinguish Strings, as containers, from Text as an
interpretation of data as, ahem, text of some language, in some
encoding, possibly with attributes...
to do strings with
Florian Klaempfl wrote:
The only universal international representation for strings is Unicode
(currently 32 bit), that doesn't require any conversions.
That's not true. E.g. the German umlauts can be represented by 2 chars
when using UTF-32 (the char and the two dots); the same applies to a
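The point about umlauts can be checked with a small sketch. It is written in Python purely for illustration and is not part of the FPC RTL; the helper name utf32_units is hypothetical. It shows that the same visible character may occupy one or two UTF-32 code units depending on whether it is stored precomposed or as a base letter plus a combining diaeresis.

```python
import unicodedata

# 'ä' stored as a single precomposed code point (U+00E4).
precomposed = "\u00e4"
# The same character decomposed into 'a' (U+0061) + combining diaeresis (U+0308).
decomposed = unicodedata.normalize("NFD", precomposed)


def utf32_units(text):
    """Hypothetical helper: number of 32-bit code units needed for text."""
    return len(text.encode("utf-32-le")) // 4


print(utf32_units(precomposed))
print(utf32_units(decomposed))
print([f"U+{ord(c):04X}" for c in decomposed])
# Normalising back to the composed form makes the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

So even a fixed-width 32-bit encoding does not guarantee one unit per user-visible character; comparing such strings needs a normalisation step first.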
be replaced by versions supplied by a unit which
provides proper internationalisation.
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of DrDiettrich
Sent: 07 January 2005 15:06
To: FPC developers' list
Subject: Re: [fpc-devel] ansistrings and widestrings
OK, I see a MAJOR problem with the semantics of those functions.
They assume that one widechar is equivalent to one ansichar (that is, the
source count of widechars will equal the destination count of ansichars or
the source count of widechars will equal
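A rough sketch of why the two counts cannot be assumed equal, written in Python for illustration only. The helper names utf16_units and encoded_bytes are hypothetical, and UTF-8 stands in here for a multi-byte 8-bit destination encoding.

```python
def utf16_units(text):
    """Hypothetical helper: number of 16-bit units (widechars) in text."""
    return len(text.encode("utf-16-le")) // 2


def encoded_bytes(text, codec):
    """Hypothetical helper: number of bytes text occupies in the given codec."""
    return len(text.encode(codec))


samples = [
    "abc",            # plain ASCII
    "\u00e4\u20ac",   # 'ä' and the euro sign
    "\u65e5\u672c",   # two CJK ideographs
    "\U0001d11e",     # musical G clef, outside the BMP (surrogate pair)
]

for s in samples:
    # Source count in widechars versus destination count in bytes.
    print(repr(s), utf16_units(s), encoded_bytes(s, "utf-8"))
```

Only the ASCII sample has matching counts. A conversion routine that takes a single length for both source and destination would need either the converted length computed up front or a worst-case bound (for UTF-16 to UTF-8, three bytes per 16-bit unit is enough).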
Subject: Re: [fpc-devel] ansistrings and widestrings
Well, the functions are called ANSI to Unicode and vice versa. ANSI is
always single byte; by Unicode people usually refer to UTF-16, not a
multibyte encoding, and both Delphi and FPC define WideString as double
byte strings. So
Subject: RE: [fpc-devel] ansistrings and widestrings
In Windows terminology (which I presume is where the name ansistring comes
from), the Windows code page, which is often referred to in documentation as
the ANSI code page, CAN be multi-byte.
http://www.microsoft.com/globaldev/reference/WinCP.mspx
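As a small illustration (Python, not FPC code), code page 932, the Windows Japanese code page, is one such "ANSI" code page where a character takes either one or two bytes. The sample string is an arbitrary choice for this sketch.

```python
# 'A', two kanji, 'B': four characters in total.
text = "A\u65e5\u672cB"

# cp932 is Python's name for Windows code page 932 (Japanese).
encoded = text.encode("cp932")

# ASCII letters take one byte each, the kanji take two bytes each.
print(len(text), len(encoded))
print([len(ch.encode("cp932")) for ch in text])

# The conversion is lossless for this sample.
print(encoded.decode("cp932") == text)
```

On a system whose ANSI code page is of this kind, an ansistring holding this text is six bytes long while the corresponding widestring holds four widechars.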
PPS. AFAIK UTF-8 is not used internally in any OS - it's only used for
storing UNICODE text in more compact form - web site authors really like it.
I believe a lot of Linux distros are switching to it for the console at least
for less common languages I don't know how GUI stuff on Linux