Mattias Gaertner wrote:
On Wed, 26 Nov 2014 11:23:17 +0100
Michael Schnell <mschn...@lumino.de> wrote:

Seemingly the "bytes per character" setting is implicitly thought of here as part of the "code-page" definition. Correct?

Code pages define bytes per character.

Huh?

Not all codepages have a fixed number of bytes per character.
The string preamble contains the *element size* (1 for AnsiString), just like with every dynamic array.
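To illustrate the point about variable-width code pages, here is a minimal sketch in Python (not Free Pascal). The helper name bytes_per_character is hypothetical, and Python's codec names merely stand in for the code pages discussed here.

```python
def bytes_per_character(text, codec):
    """Return the encoded size in bytes of each character in text.

    Hypothetical helper for illustration only: it encodes one character
    at a time with the given Python codec and records the byte count.
    """
    return [len(ch.encode(codec)) for ch in text]


# cp1252 is a single-byte code page: every character takes one byte.
print(bytes_per_character("A\u00e9", "cp1252"))

# UTF-8 and Shift-JIS are variable-width: the count differs per character.
print(bytes_per_character("A\u00e9\u65e5", "utf-8"))
print(bytes_per_character("A\u65e5", "shift_jis"))
```

In every case the storage element is still one byte wide; only the number of elements per character changes.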


As you know: Don't confuse character with glyph and codepoint.

Right, but what is what?

I feel a need for an exact (official) definition of such (and more) terms, in order to prevent further misunderstandings of the documentation and in discussions.

E.g. "code page" has different meanings when used with ANSI/ISO and Unicode character sets. While ANSI/ISO codepages describe different mappings of bytes into characters, Unicode codepages define subsets of the whole Unicode range.

My understanding of "character" is a *logical* unit (letter), with possibly different encodings, values and sizes in different codepages (character sets).
What's the term for the *physical* unit (AnsiChar, WideChar)?


AnsiString supports only one-byte-per-character code pages.

Huh?

What's your definition of "character"?

AnsiString supports MBCS codepages as well. The restriction is the physical storage unit (1 byte per string item), as imposed by AnsiChar.
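A rough sketch of that distinction, again in Python rather than Pascal: a bytes object plays the role of an AnsiString holding MBCS data, which is an assumption made for illustration only. The helper is_complete_character is hypothetical.

```python
def is_complete_character(units, codec):
    """Return True if the given storage units decode to exactly one character.

    Hypothetical helper: 'units' is a bytes object, i.e. a run of
    one-byte storage items like those in an AnsiString.
    """
    try:
        return len(units.decode(codec)) == 1
    except UnicodeDecodeError:
        return False


# Two characters stored in a double-byte code page (Shift-JIS here).
data = "\u65e5\u672c".encode("shift_jis")

# The string holds four one-byte storage units for two characters.
print(len(data))

# A single storage unit is only half of a character ...
print(is_complete_character(data[0:1], "shift_jis"))

# ... while two consecutive units form one complete character.
print(is_complete_character(data[0:2], "shift_jis"))
```

So the length in storage units and the length in characters differ, even though each item is exactly one byte.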

DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
