Graeme Geldenhuys wrote:
GG> Now this brings me to another point which makes no sense! Naming
GG> convention of functions.
GG> Let's look at the following RTL function as an example:
GG> * StrPos is used for 1 byte (8 bit) ANSI strings.
GG> * AnsiStrPos is used for multi-byte (or is that 2 bytes max) UTF-8
GG> strings. aka WideString.
GG> So why did Borland name it AnsiStrPos, when it doesn't operate on ANSI
GG> strings!! Why not name it Utf8StrPos or WideStrPos? The prefix Ansi*
GG> completely goes against what it does (operates on)! It doesn't work
GG> with Ansi strings, but rather WideStrings.
No, AnsiStr* functions have nothing in common with Unicode. They work with
MBCS (MultiByte Character Set) strings. MBCS is like UTF-8 in the sense that
a single character may be represented by more than one byte, but the encoding
rules are different. And these MBCS encoding rules are
locale-dependent. Each locale defines a set of 'lead bytes' which signify the
start of a multibyte sequence. For most European locales this set is empty,
and therefore AnsiStr* functions operate exactly as Str*.
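
To illustrate the difference, here is a minimal sketch in Python (not the RTL
code). The names byte_pos, mbcs_pos and SJIS_LEAD_BYTES are made up for this
example, and it assumes a locale where every multibyte character is exactly
two bytes long, using the lead-byte ranges of Windows code page 932
(Shift-JIS). In that code page the katakana character "ソ" is encoded as
$83 $5C, and $5C on its own is the backslash, so a plain byte-wise search
finds a backslash in the middle of a character.

```python
# Lead-byte ranges of Windows code page 932 (Shift-JIS), used here as an
# example of a locale whose lead-byte set is not empty.
SJIS_LEAD_BYTES = set(range(0x81, 0xA0)) | set(range(0xE0, 0xFD))


def byte_pos(haystack: bytes, needle: bytes) -> int:
    """Plain byte-wise search, in the spirit of StrPos (hypothetical helper)."""
    return haystack.find(needle)


def mbcs_pos(haystack: bytes, needle: bytes, lead_bytes) -> int:
    """Search that accepts a match only at a character boundary,
    in the spirit of AnsiStrPos (hypothetical helper)."""
    i = 0
    while i < len(haystack):
        if haystack.startswith(needle, i):
            return i
        # A lead byte starts a two-byte character: step over both bytes.
        i += 2 if haystack[i] in lead_bytes else 1
    return -1


# "ソ" followed by a real backslash: bytes 83 5C 5C in code page 932.
text = "ソ".encode("cp932") + b"\\"

print(byte_pos(text, b"\\"))                   # 1: false hit inside "ソ"
print(mbcs_pos(text, b"\\", SJIS_LEAD_BYTES))  # 2: the real backslash
print(mbcs_pos(text, b"\\", set()))            # 1: empty lead-byte set
```

With an empty lead-byte set the boundary-aware search visits every byte, so it
returns the same result as the byte-wise one. That is the situation in most
European locales.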
--
Best regards,
Sergei