On Fri, 27 Nov 2009, Mindaugas Kavaliauskas wrote:

Hi,

> I'm trying to use UTF8 functions from string API. I found a problem
> using Cairo library (it accepts all string parameters in UTF8). If I
> want to print a "s caron" (\xF0 from 1257 code page), I must pass
> \xC5\xA1 to cairo functions. \xC2\xAD does not work. I guess it's
> the problem of little vs big endianness in UTF8.
> Do we need two UTF8 functions (or additional parameter) to manage
> the problem? :-(

UTF-8 is endian independent. There are no little-endian and big-endian
versions of UTF-8.
In CP-1257 the character at position 0xF0 has the Unicode value 0x0161,
which is encoded in UTF-8 as "\xC5\xA1". "\xC2\xAD" is 0x00AD. It simply
means that your program has not set the LTWIN codepage and is using some
other CP whose Unicode table has the value 0x00AD at position 0xF0,
e.g. CP-852.
The code below illustrates it.

best regards,
Przemek


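request HB_CODEPAGE_LTWIN
request HB_CODEPAGE_PL852
proc main()
   // U+0161 (s caron) encoded directly as UTF-8, expected: C5A1
   ? hb_strToHex( hb_utf8chr( 0x0161 ) )
   ?
   // LTWIN (CP-1257): 0xF0 maps to U+0161, expected: C5A1
   set( _SET_CODEPAGE, "LTWIN" )
   ? hb_strToHex( hb_strToUtf8( chr( 0xF0 ) ) )
   ? hb_strToHex( str2utf8( chr( 0xF0 ) ) )
   ?
   // PL852 (CP-852): 0xF0 maps to U+00AD, expected: C2AD
   set( _SET_CODEPAGE, "PL852" )
   ? hb_strToHex( hb_strToUtf8( chr( 0xF0 ) ) )
   ? hb_strToHex( str2utf8( chr( 0xF0 ) ) )
return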

#pragma begindump
#include "hbapi.h"
#include "hbapistr.h"
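HB_FUNC( STR2UTF8 )
{
   void * hStr;   /* handle owning the converted string */
   /* returns parameter 1 converted from the active codepage to UTF-8
      and stores the handle needed to release it in hStr */
   const char * psz = hb_parstr_utf8( 1, &hStr, NULL );
   hb_retc( psz );
   hb_strfree( hStr );
}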
#pragma enddump
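
The byte values quoted above can also be checked without Harbour. The
following is only a minimal sketch in plain C, not part of the Harbour API:
utf8_encode() is a hypothetical helper name, and it handles code points up
to U+FFFF only. It builds the UTF-8 sequence one byte at a time from the
code point value, which is why byte order of the machine never enters
into it.

```c
#include <stddef.h>
#include <stdint.h>

/* Unicode values found at position 0xF0 in the two codepages discussed */
#define CP1257_F0  0x0161u   /* LATIN SMALL LETTER S WITH CARON */
#define CP852_F0   0x00ADu   /* SOFT HYPHEN */

/* Hypothetical helper: encode one code point (U+0000..U+FFFF, surrogates
   rejected) as UTF-8. Stores the bytes in out[] and returns their count,
   or 0 when the code point is outside the supported range. */
size_t utf8_encode( uint32_t cp, unsigned char out[ 3 ] )
{
   if( cp < 0x80 )
   {
      out[ 0 ] = ( unsigned char ) cp;
      return 1;
   }
   if( cp < 0x800 )
   {
      out[ 0 ] = ( unsigned char ) ( 0xC0 | ( cp >> 6 ) );
      out[ 1 ] = ( unsigned char ) ( 0x80 | ( cp & 0x3F ) );
      return 2;
   }
   if( cp < 0x10000 && ( cp < 0xD800 || cp > 0xDFFF ) )
   {
      out[ 0 ] = ( unsigned char ) ( 0xE0 | ( cp >> 12 ) );
      out[ 1 ] = ( unsigned char ) ( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
      out[ 2 ] = ( unsigned char ) ( 0x80 | ( cp & 0x3F ) );
      return 3;
   }
   return 0;
}
```

Calling utf8_encode( CP1257_F0, buf ) fills buf with C5 A1, and
utf8_encode( CP852_F0, buf ) fills it with C2 AD, matching the two
results discussed in the reply.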
