Mattias Gaertner wrote:
On Wed, 26 Nov 2014 11:23:17 +0100
Michael Schnell <mschn...@lumino.de> wrote:
Seemingly the "bytes per character" setting is implicitly thought of
here as part of the "code-page" definition. Correct?
Code pages define bytes per character.
Huh?
Not all codepages have a fixed number of bytes per character.
The string preamble contains the *element size* (1 for AnsiString), just
like with every dynamic array.
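To make the variable-width point concrete, here is a minimal sketch. It is in Python rather than Pascal, purely for illustration, and the helper name `bytes_per_char` and the sample characters are my own choices, not anything from FPC. It counts how many 1-byte storage elements a single character needs in a single-byte code page (CP1252), in an MBCS code page (CP932, Shift-JIS) and in UTF-8.

```python
# Hypothetical helper: how many 1-byte storage elements does one
# character occupy when encoded in the given code page?
def bytes_per_char(ch: str, codepage: str) -> int:
    return len(ch.encode(codepage))

samples = [
    ("A", "cp1252"),        # single-byte code page
    ("\u00e4", "cp1252"),   # a-umlaut, still one byte
    ("A", "cp932"),         # MBCS code page, ASCII range
    ("\u3042", "cp932"),    # hiragana A, needs a lead and a trail byte
    ("\u00e4", "utf-8"),
    ("\u20ac", "utf-8"),    # euro sign
]

for ch, cp in samples:
    print(f"{ascii(ch):10s} {cp:8s} {bytes_per_char(ch, cp)} byte(s)")
```

Within CP932 alone the count is already 1 or 2 depending on the character, so "bytes per character" cannot be a single property of the code page.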
As you know: Don't confuse character with glyph and codepoint.
Right, but what is what?
I feel a need for an exact (official) definition of such (and more)
terms, in order to prevent further misunderstandings of the
documentation and in discussions.
E.g. "code page" has different meanings when used with ANSI/ISO and
Unicode character sets.
While ANSI/ISO codepages describe different mappings of bytes into
characters, Unicode codepages define subsets of the whole Unicode range.
My understanding of "character" is a *logical* unit (letter), with
possibly different encodings, values and sizes in different codepages
(character sets).
What's the term for the *physical* unit (AnsiChar, WideChar)?
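The logical/physical distinction I have in mind can be sketched as follows (Python again, only as an illustration; the helper name `encoded_form` is hypothetical). The same logical character gets different values and different sizes depending on the encoding it is stored in.

```python
# Hypothetical helper: show the stored bytes of one logical character
# as space-separated hex values.
def encoded_form(ch: str, encoding: str) -> str:
    return " ".join(f"{b:02x}" for b in ch.encode(encoding))

ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, one logical character

for enc in ("cp1252", "cp437", "utf-8", "utf-16-le"):
    # Same character, but value and size depend on the encoding.
    print(f"{enc:10s} {encoded_form(ch, enc)}")
```

The two single-byte code pages disagree on the value, UTF-8 needs two 1-byte units, and UTF-16 needs one 2-byte unit.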
AnsiString supports only one-byte-per-character code pages.
Huh?
What's your definition of "character"?
AnsiString supports MBCS codepages as well. The restriction is the
physical storage unit (1 byte per string item), as imposed by AnsiChar.
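A rough sketch of what I mean, with a Python `bytes` object standing in for an AnsiString-like buffer of 1-byte items (that stand-in is my assumption for illustration, it is not how FPC implements anything): MBCS text fits into such a buffer without problems, only the item count no longer equals the character count.

```python
# Assumption: `bytes` models a string whose storage unit is one byte.
text = "\u65e5\u672c\u8a9e"          # three logical characters
buf = text.encode("cp932")           # stored in an MBCS (Shift-JIS) code page

item_count = len(buf)                    # number of 1-byte string items
char_count = len(buf.decode("cp932"))    # number of logical characters
first_item = buf[0]                      # a lead byte, not a whole character

print("items:", item_count)
print("characters:", char_count)
print("first item:", hex(first_item))
```

Indexing by item therefore yields bytes, which may be only part of a character.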
DoDi
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel