[fpc-devel] Unicode conversion routines
Hi,

Has anybody written these conversion functions yet? If not, I am about to port the ConvertUTF.c file from Unicode.org to Object Pascal.

  UTF-32 to UTF-16
  UTF-32 to UTF-8
  UTF-16 to UTF-32
  UTF-16 to UTF-8
  UTF-8 to UTF-16
  UTF-8 to UTF-32

Regards,
  - Graeme -

___
fpGUI - a cross-platform Free Pascal GUI toolkit
http://opensoft.homeip.net/fpgui/

___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
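As a language-neutral illustration of the six conversions listed above, here is a minimal Python sketch. It is not a port of ConvertUTF.c: Python's built-in codecs stand in for the conversion routines, and the helper name `convert`, the `FORMS` tuple and the sample text are assumptions made up for this example. The point it shows is that each conversion is a decode to code points followed by an encode to the target form.

```python
# Hypothetical helper; Python's built-in codecs stand in for the
# conversion routines discussed in the thread.
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Convert encoded bytes from one UTF form to another via code points."""
    return data.decode(src).encode(dst)


# Byte order is fixed explicitly so that no byte order mark is emitted.
FORMS = ("utf-8", "utf-16-le", "utf-32-le")

# Made-up sample: characters that need 1, 2, 3 and 4 bytes in UTF-8.
sample = "A\u00e9\u20ac\U000289A8"

# Every ordered pair of distinct forms gives the six conversions in the list.
pairs = [(src, dst) for src in FORMS for dst in FORMS if src != dst]

for src, dst in pairs:
    encoded = sample.encode(src)
    # Converting src -> dst must match encoding the text directly as dst.
    assert convert(encoded, src, dst) == sample.encode(dst)
```

A hand-written port would replace the codec calls with explicit bit manipulation, but the set of source and target pairs stays the same.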
Re: [fpc-devel] Unicode conversion routines
Graeme Geldenhuys wrote:
> Hi,
>
> Has anybody written these conversion functions yet? If not, I am about to port the ConvertUTF.c file from Unicode.org to Object Pascal.
>
>   UTF-32 to UTF-16
>   UTF-32 to UTF-8
>   UTF-16 to UTF-32
>   UTF-16 to UTF-8
>   UTF-8 to UTF-16
>   UTF-8 to UTF-32

Please check first whether the RTL routines

  function UnicodeToUtf8(Dest: PChar; Source: PUnicodeChar; MaxBytes: SizeInt): SizeInt;{$ifdef SYSTEMINLINE}inline;{$endif}
  function UnicodeToUtf8(Dest: PChar; MaxDestBytes: SizeUInt; Source: PUnicodeChar; SourceChars: SizeUInt): SizeUInt;
  function Utf8ToUnicode(Dest: PUnicodeChar; Source: PChar; MaxChars: SizeInt): SizeInt;{$ifdef SYSTEMINLINE}inline;{$endif}
  function Utf8ToUnicode(Dest: PUnicodeChar; MaxDestChars: SizeUInt; Source: PChar; SourceBytes: SizeUInt): SizeUInt;
  function UTF8Encode(const s : Ansistring) : UTF8String; inline;
  function UTF8Encode(const s : UnicodeString) : UTF8String;
  function UTF8Decode(const s : UTF8String): UnicodeString;
  function AnsiToUtf8(const s : ansistring): UTF8String;{$ifdef SYSTEMINLINE}inline;{$endif}
  function Utf8ToAnsi(const s : UTF8String) : ansistring;{$ifdef SYSTEMINLINE}inline;{$endif}
  function UnicodeStringToUCS4String(const s : UnicodeString) : UCS4String;
  function UCS4StringToUnicodeString(const s : UCS4String) : UnicodeString;
  function WideStringToUCS4String(const s : WideString) : UCS4String;
  function UCS4StringToWideString(const s : UCS4String) : WideString;

are sufficient, or how they can be extended.
Re: [fpc-devel] Unicode conversion routines
On Sat, Nov 22, 2008 at 10:51 AM, Florian Klaempfl wrote:
> function UTF8Decode(const s : UTF8String): UnicodeString;

Is there some hard-coded limit on UTF8Decode? I am writing some unit tests for these methods, and it already fails on my third test. I'm using sample test data from unicode.org.

  var
    s8: UTF8String;
    s16: UnicodeString;
  begin
    // U+289A8 CJK UNIFIED IDEOGRAPH-289A8
    s8 := Char($F0) + Char($A8) + Char($A6) + Char($A8);
    s16 := UTF8Decode(s8);
    AssertEquals('Failed on 4', 4, Length(s16));
    AssertEquals('Failed on 5', UnicodeChar($D862) + UnicodeChar($DDA8), s16)
  end;

Test 4 fails: Expected 4 but was 0.
Test 5 fails: due to the Test 4 failure...

---[ from unicode description file ]---
U+289A8 CJK UNIFIED IDEOGRAPH-289A8

General Character Properties
  Unicode category: Letter, Other

Various Useful Representations
  UTF-8: 0xF0 0xA8 0xA6 0xA8
  UTF-16: 0xD862 0xDDA8
[ end ]

Regards,
  - Graeme -
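To make the sample data in the test above checkable by hand, here is a small Python sketch of the arithmetic that maps the four UTF-8 bytes to U+289A8 and then to the surrogate pair. The function names `decode_utf8_4byte` and `to_surrogates` are hypothetical and are not FPC RTL routines; the bit layouts are the standard UTF-8 and UTF-16 encoding forms. Note that the UTF-16 form consists of two code units, so, assuming Length on a UnicodeString counts 16-bit code units, a correct decode of this character would give a length of 2 rather than 4.

```python
def decode_utf8_4byte(b: bytes) -> int:
    """Decode a single 4-byte UTF-8 sequence into a code point."""
    # Lead byte must look like 11110xxx, trail bytes like 10xxxxxx.
    if len(b) != 4 or (b[0] & 0xF8) != 0xF0:
        raise ValueError("not a 4-byte UTF-8 sequence")
    if any((x & 0xC0) != 0x80 for x in b[1:]):
        raise ValueError("bad continuation byte")
    cp = ((b[0] & 0x07) << 18) | ((b[1] & 0x3F) << 12) \
        | ((b[2] & 0x3F) << 6) | (b[3] & 0x3F)
    # Reject overlong forms and values beyond the Unicode range.
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range for a 4-byte sequence")
    return cp


def to_surrogates(cp: int) -> tuple:
    """Split a supplementary-plane code point into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point does not need surrogates")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


cp = decode_utf8_4byte(bytes([0xF0, 0xA8, 0xA6, 0xA8]))
high, low = to_surrogates(cp)
print(f"U+{cp:05X} -> 0x{high:04X} 0x{low:04X}")
```

The values agree with the UTF-8 and UTF-16 representations quoted from the Unicode description file.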
Re: [fpc-devel] Unicode conversion routines
Graeme Geldenhuys wrote:
> On Sat, Nov 22, 2008 at 10:51 AM, Florian Klaempfl wrote:
>> function UTF8Decode(const s : UTF8String): UnicodeString;
>
> Is there some hard-coded limit on UTF8Decode? I am writing some unit tests for these methods, and it already fails on my third test. I'm using sample test data from unicode.org.

See http://bugs.freepascal.org/view.php?id=11791
[fpc-devel] Memory consumed by strings
Is there a way to determine how much memory strings consume in a running application? I'd like to know this for FPC and Lazarus in particular, to begin with.

The reason I'd like to know is this: whenever I suggest that the char size be increased to 4 bytes, the idea is opposed on the grounds that it would need a huge amount of memory, four times as much. There is of course some merit in that argument, but I have no idea what it is "four times" of. Leaving it unmeasured is not very engineer-like.

Can anyone suggest a way to measure the memory load caused by strings?
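One rough way to put a number on the "four times" claim is to take a set of strings and compare their payload size under a variable-width encoding with their size at a fixed 4 bytes per character. The Python sketch below does only that. It does not measure a running FPC or Lazarus process, the helper name `payload_sizes` is hypothetical, and the sample strings are made up. It also counts character data only and ignores per-string headers and allocator overhead.

```python
def payload_sizes(strings):
    """Total character-data bytes for the given strings in three encodings."""
    totals = {"utf-8": 0, "utf-16-le": 0, "utf-32-le": 0}
    for s in strings:
        for enc in totals:
            # len() of the encoded bytes is the payload size for that string.
            totals[enc] += len(s.encode(enc))
    return totals


# Made-up sample data; a real measurement would use the strings that an
# actual application holds in memory.
samples = ["begin", "end", "WriteLn", "Unicode"]

sizes = payload_sizes(samples)
ratio = sizes["utf-32-le"] / sizes["utf-8"]
print(sizes, ratio)
```

For pure ASCII text the ratio is exactly 4. Text with many non-ASCII characters already needs more than one byte per character in UTF-8, so the ratio there is lower, and the overall effect on an application depends on what share of its memory is string data in the first place.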