[fpc-devel] Unicode conversion routines

2008-11-22 Thread Graeme Geldenhuys
Hi,

Has anybody written these conversion functions yet?  If not, I am
about to port the ConvertUTF.c file from Unicode.org to Object Pascal.

UTF-32 to UTF-16
UTF-32 to UTF-8
UTF-16 to UTF-32
UTF-16 to UTF-8
UTF-8 to UTF-16
UTF-8 to UTF-32
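
As an illustration of what one of these conversions involves, here is a
minimal sketch of encoding a single UTF-32 code point as UTF-8. It is not
taken from ConvertUTF.c or from the RTL; the names TUtf8Bytes and
CodePointToUtf8 are hypothetical, and returning an empty array for
invalid input is an assumption made for brevity.

```pascal
program CodePointToUtf8Sketch;
{$mode objfpc}{$H+}

type
  TUtf8Bytes = array of Byte; // hypothetical type, for this sketch only

// Hypothetical helper (not an RTL routine): encodes one code point as UTF-8.
function CodePointToUtf8(CodePoint: Cardinal): TUtf8Bytes;
begin
  Result := nil;
  // Surrogates and values above U+10FFFF cannot be encoded: return empty.
  if ((CodePoint >= $D800) and (CodePoint <= $DFFF)) or
     (CodePoint > $10FFFF) then
    Exit;
  if CodePoint < $80 then
  begin
    SetLength(Result, 1);
    Result[0] := Byte(CodePoint);
  end
  else if CodePoint < $800 then
  begin
    SetLength(Result, 2);
    Result[0] := Byte($C0 or (CodePoint shr 6));
    Result[1] := Byte($80 or (CodePoint and $3F));
  end
  else if CodePoint < $10000 then
  begin
    SetLength(Result, 3);
    Result[0] := Byte($E0 or (CodePoint shr 12));
    Result[1] := Byte($80 or ((CodePoint shr 6) and $3F));
    Result[2] := Byte($80 or (CodePoint and $3F));
  end
  else
  begin
    SetLength(Result, 4);
    Result[0] := Byte($F0 or (CodePoint shr 18));
    Result[1] := Byte($80 or ((CodePoint shr 12) and $3F));
    Result[2] := Byte($80 or ((CodePoint shr 6) and $3F));
    Result[3] := Byte($80 or (CodePoint and $3F));
  end;
end;

var
  Bytes: TUtf8Bytes;
  I: Integer;
begin
  Bytes := CodePointToUtf8($289A8);
  // Print each byte as two hex digits.
  for I := 0 to High(Bytes) do
    Write(HexStr(Bytes[I], 2), ' ');
  WriteLn;
end.
```

For U+289A8 this prints F0 A8 A6 A8. The other five conversions can be
built from the same kind of bit arithmetic, with extra validation on the
decoding side.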


Regards,
  - Graeme -


___
fpGUI - a cross-platform Free Pascal GUI toolkit
http://opensoft.homeip.net/fpgui/
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel


Re: [fpc-devel] Unicode conversion routines

2008-11-22 Thread Florian Klaempfl
Graeme Geldenhuys wrote:
> Hi,
>
> Has anybody written these conversion functions yet?  If not, I am
> about to port the ConvertUTF.c file from Unicode.org to Object Pascal.
>
>   UTF-32 to UTF-16
>   UTF-32 to UTF-8
>   UTF-16 to UTF-32
>   UTF-16 to UTF-8
>   UTF-8 to UTF-16
>   UTF-8 to UTF-32

Please check first if the rtl routines


function UnicodeToUtf8(Dest: PChar; Source: PUnicodeChar; MaxBytes: SizeInt): SizeInt;{$ifdef SYSTEMINLINE}inline;{$endif}
function UnicodeToUtf8(Dest: PChar; MaxDestBytes: SizeUInt; Source: PUnicodeChar; SourceChars: SizeUInt): SizeUInt;
function Utf8ToUnicode(Dest: PUnicodeChar; Source: PChar; MaxChars: SizeInt): SizeInt;{$ifdef SYSTEMINLINE}inline;{$endif}
function Utf8ToUnicode(Dest: PUnicodeChar; MaxDestChars: SizeUInt; Source: PChar; SourceBytes: SizeUInt): SizeUInt;
function UTF8Encode(const s : Ansistring) : UTF8String; inline;
function UTF8Encode(const s : UnicodeString) : UTF8String;
function UTF8Decode(const s : UTF8String): UnicodeString;
function AnsiToUtf8(const s : ansistring): UTF8String;{$ifdef SYSTEMINLINE}inline;{$endif}
function Utf8ToAnsi(const s : UTF8String) : ansistring;{$ifdef SYSTEMINLINE}inline;{$endif}
function UnicodeStringToUCS4String(const s : UnicodeString) : UCS4String;
function UCS4StringToUnicodeString(const s : UCS4String) : UnicodeString;
function WideStringToUCS4String(const s : WideString) : UCS4String;
function UCS4StringToWideString(const s : UCS4String) : WideString;

are sufficient or how they can be extended.
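
One way to check whether these routines are sufficient is a small
round-trip test. The sketch below uses only UTF8Encode and UTF8Decode
from the list above; the sample text is an arbitrary choice, and the
sketch assumes a compiler version in which both routines handle the
input correctly.

```pascal
program Utf8RoundTripSketch;
{$mode objfpc}{$H+}

var
  Original, RoundTrip: UnicodeString;
  Encoded: UTF8String;
begin
  // 'caf' followed by U+00E9, built from its code unit so that the
  // result does not depend on the encoding of this source file.
  Original := UnicodeString('caf') + WideChar($00E9);

  Encoded := UTF8Encode(Original);    // UTF-16 -> UTF-8
  RoundTrip := UTF8Decode(Encoded);   // UTF-8 -> UTF-16

  WriteLn('UTF-16 code units: ', Length(Original));
  WriteLn('UTF-8 bytes:       ', Length(Encoded));
  WriteLn('Round trip equal:  ', RoundTrip = Original);
end.
```

Length counts code units in both cases (16-bit units for UnicodeString,
bytes for UTF8String), not characters, so the two values generally
differ for non-ASCII text.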


Re: [fpc-devel] Unicode conversion routines

2008-11-22 Thread Graeme Geldenhuys
On Sat, Nov 22, 2008 at 10:51 AM, Florian Klaempfl
[EMAIL PROTECTED] wrote:
> function UTF8Decode(const s : UTF8String): UnicodeString;

Is there some hard-coded limit on UTF8Decode?  I am writing some unit
tests for these methods, and they already fail on my 3rd test. I'm
using sample test data from unicode.org.

var
  s8: UTF8String;
  s16: UnicodeString;
begin

  // U+289A8 CJK UNIFIED IDEOGRAPH-289A8
  s8 := Char($F0) + Char($A8) + Char($A6) + Char($A8);
  s16 := UTF8Decode(s8);

  AssertEquals('Failed on 4', 4, Length(s16));
  AssertEquals('Failed on 5', UnicodeChar($D862) + UnicodeChar($DDA8), s16)

end;

Test 4 fails: Expected 4 but was 0.
Test 5 fails too, due to the Test 4 failure.


---[  from unicode description file ]---
U+289A8 CJK UNIFIED IDEOGRAPH-289A8

General Character Properties

Unicode category: Letter, Other

Various Useful Representations

UTF-8: 0xF0 0xA8 0xA6 0xA8
UTF-16: 0xD862 0xDDA8
[ end ]
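
The two representations listed above can be cross-checked by hand. The
sketch below decodes the four UTF-8 bytes to a code point and then
splits it into a UTF-16 surrogate pair. Both helper names are
hypothetical, and the decoder assumes a well-formed 4-byte sequence
(no validation). The result is two 16-bit code units, which is what
Length reports for a UnicodeString holding only this character.

```pascal
program SurrogatePairSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: decodes one well-formed 4-byte UTF-8 sequence.
function Decode4ByteUtf8(B0, B1, B2, B3: Byte): Cardinal;
begin
  Result := (Cardinal(B0 and $07) shl 18) or
            (Cardinal(B1 and $3F) shl 12) or
            (Cardinal(B2 and $3F) shl 6) or
            Cardinal(B3 and $3F);
end;

// Hypothetical helper: splits a code point in the range
// U+10000..U+10FFFF into its UTF-16 surrogate pair.
procedure ToSurrogates(CodePoint: Cardinal; out HighSur, LowSur: Word);
begin
  CodePoint := CodePoint - $10000;             // leaves a 20-bit value
  HighSur := Word($D800 + (CodePoint shr 10)); // top 10 bits
  LowSur := Word($DC00 + (CodePoint and $3FF)); // bottom 10 bits
end;

var
  CP: Cardinal;
  HighSur, LowSur: Word;
begin
  CP := Decode4ByteUtf8($F0, $A8, $A6, $A8);
  ToSurrogates(CP, HighSur, LowSur);
  WriteLn('U+', HexStr(LongInt(CP), 5));
  WriteLn(HexStr(HighSur, 4), ' ', HexStr(LowSur, 4));
end.
```

This prints U+289A8 followed by D862 DDA8, matching the description
file.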


Regards,
  - Graeme -




Re: [fpc-devel] Unicode conversion routines

2008-11-22 Thread Florian Klaempfl
Graeme Geldenhuys wrote:
> On Sat, Nov 22, 2008 at 10:51 AM, Florian Klaempfl
> [EMAIL PROTECTED] wrote:
>> function UTF8Decode(const s : UTF8String): UnicodeString;
>
> Is there some hard-coded limit on UTF8Decode?  I am writing some unit
> tests for these methods, and they already fail on my 3rd test. I'm
> using sample test data from unicode.org.

See http://bugs.freepascal.org/view.php?id=11791

 



[fpc-devel] Memory consumed by strings

2008-11-22 Thread listmember
Is there a way to determine how much memory strings consume in a
running application?


I'd like to know this, in particular, for FPC and Lazarus --to begin with.

The reason I'd like to know is this: whenever I suggest that
char size be increased to 4, the idea gets opposed on the grounds that it
will need huge memory --4 times as much.


There's of course some merit in that argument, but I have no idea what
it is '4 times' of.


This is not very engineer-like --it being unmeasured.

Can anyone suggest a way to measure the memory load caused by strings?
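
As a starting point, here is a rough sketch that compares the heap cost
of the same number of characters at 1, 2 and 4 bytes per character. It
does not isolate strings inside a running application; it only measures
the change in used heap around each allocation. It assumes the default
FPC memory manager (other memory managers may not fill in
GetFPCHeapStatus), and the HeapUsed helper and the sample size are
hypothetical choices.

```pascal
program StringHeapSketch;
{$mode objfpc}{$H+}

const
  CharCount = 1000000; // arbitrary sample size

// Hypothetical helper: bytes currently allocated from the FPC heap.
function HeapUsed: PtrUInt;
var
  Status: TFPCHeapStatus;
begin
  Status := GetFPCHeapStatus;
  Result := Status.CurrHeapUsed;
end;

var
  Start, Used: PtrUInt;
  S8: AnsiString;      // 1 byte per code unit
  S16: UnicodeString;  // 2 bytes per code unit
  S32: UCS4String;     // 4 bytes per code unit (dynamic array of UCS4Char)
begin
  Start := HeapUsed;
  SetLength(S8, CharCount);
  Used := HeapUsed - Start;
  WriteLn('AnsiString:    ', Used, ' bytes');

  Start := HeapUsed;
  SetLength(S16, CharCount);
  Used := HeapUsed - Start;
  WriteLn('UnicodeString: ', Used, ' bytes');

  Start := HeapUsed;
  SetLength(S32, CharCount);
  Used := HeapUsed - Start;
  WriteLn('UCS4String:    ', Used, ' bytes');
end.
```

The reported figures include the per-string header and any rounding by
the allocator, so they are slightly larger than CharCount multiplied by
the character size. Measuring a whole application would still require
tracking which allocations belong to strings.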