Flávio Etrusco wrote:
On Sat, Mar 8, 2014 at 9:30 PM, Giuliano Colla wrote:
I was aware of that. My problem is that the char I must add to the UTF-8 string is calculated at run time and lies in the Unicode range $A0-$BF. I had assumed (wrongly) that the compiler was smart enough to convert a value of type "char" to UTF-8 when concatenating it to a UTF-8 string. Instead, the character is appended as is, which yields an invalid UTF-8 sequence (a lone byte above 127) that displays as a crossed box. IMHO that's an FPC bug.
Are you aware of the $CODEPAGE directive? And that Lazarus, unless told otherwise, saves the source files in UTF-8 and tells FPC they are encoded in UTF-8?
There was specific discussion about conversions relating to characters in the range 0x80 through 0xff a few months ago. For parsing things like APL and ALGOL character sets I specifically use UTF-16 internally, and only convert to/from an 8-bit OEM codepage and/or UTF-8 at the margins of the program.
See discussion in http://mantis.freepascal.org/view.php?id=21195 and note change in the 2.7.1 compiler around revision 23613.
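To make the byte-level issue concrete, here is a minimal sketch in Python (not FPC, and not the compiler's actual code path). It only illustrates why a code point in the range $A0-$BF cannot be appended to a UTF-8 string as a single byte. The function names are hypothetical and chosen for this illustration.

```python
def append_raw(utf8_bytes: bytes, code_point: int) -> bytes:
    """Append the code point as one raw byte (the behaviour complained about)."""
    return utf8_bytes + bytes([code_point])


def append_encoded(utf8_bytes: bytes, code_point: int) -> bytes:
    """Encode the code point as UTF-8 first, then append the resulting bytes."""
    return utf8_bytes + chr(code_point).encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for cp in (0xA0, 0xBF):
        raw = append_raw(b"abc", cp)
        enc = append_encoded(b"abc", cp)
        print(hex(cp), raw, is_valid_utf8(raw), enc, is_valid_utf8(enc))
```

Code points U+0080 through U+07FF need two bytes in UTF-8, so U+00A0 becomes the bytes C2 A0. A lone A0 is a continuation byte with no lead byte, which is why it renders as a replacement glyph.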
-- Mark Morgan Lloyd markMLl .AT. telemetry.co .DOT. uk [Opinions above are the author's, not those of his employers or colleagues]
