Here it is. To confirm the brokenness of the old code I wrote a small test program:
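#include "gwlib.h"

int main(void)
{
    Octstr *os;

    gwlib_init();

    /* "aeiAEIÄÖÜäöü€ßµ" written as ISO-8859-15 bytes, so the test does
     * not depend on the encoding of this source file. */
    os = octstr_create("aeiAEI\xc4\xd6\xdc\xe4\xf6\xfc\xa4\xdf\xb5");
    octstr_dump(os, 0);

    /* ISO-8859-15 -> UTF-8 */
    charset_convert(os, "ISO8859-15", "UTF-8");
    octstr_dump(os, 0);

    /* UTF-8 -> UTF-16: the result contains zero bytes */
    charset_convert(os, "UTF-8", "UTF-16");
    octstr_dump(os, 0);

    /* UTF-16 -> ISO-8859-15: should reproduce the original string */
    charset_convert(os, "UTF-16", "ISO8859-15");
    debug("charsettest", 0, "Final result: %s", octstr_get_cstr(os));

    return 0;
}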
The result with the (broken) code from CVS is:
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:32 [1425] [0] DEBUG: len: 15
2004-11-25 16:40:32 [1425] [0] DEBUG: size: 16
2004-11-25 16:40:32 [1425] [0] DEBUG: immutable: 0
2004-11-25 16:40:32 [1425] [0] DEBUG: data: 61 65 69 41 45 49 c4 d6 dc e4 f6 fc a4 df b5  aeiAEI.........
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:32 [1425] [0] DEBUG: len: 25
2004-11-25 16:40:32 [1425] [0] DEBUG: size: 1024
2004-11-25 16:40:32 [1425] [0] DEBUG: immutable: 0
2004-11-25 16:40:32 [1425] [0] DEBUG: data: 61 65 69 41 45 49 c3 84 c3 96 c3 9c c3 a4 c3 b6  aeiAEI..........
2004-11-25 16:40:32 [1425] [0] DEBUG: data: c3 bc e2 82 ac c3 9f c2 b5  .........
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:32 [1425] [0] DEBUG: len: 3
2004-11-25 16:40:32 [1425] [0] DEBUG: size: 1024
2004-11-25 16:40:32 [1425] [0] DEBUG: immutable: 0
2004-11-25 16:40:32 [1425] [0] DEBUG: data: ff fe 61  ..a
2004-11-25 16:40:32 [1425] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:32 [1425] [0] ERROR: Failed to convert string from <UTF-16> to <ISO8859-15>, errno was <22>
2004-11-25 16:40:32 [1425] [0] DEBUG: Final result: ÿþa
With the attached patch I get the desired and expected result:
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:51 [1428] [0] DEBUG: len: 15
2004-11-25 16:40:51 [1428] [0] DEBUG: size: 16
2004-11-25 16:40:51 [1428] [0] DEBUG: immutable: 0
2004-11-25 16:40:51 [1428] [0] DEBUG: data: 61 65 69 41 45 49 c4 d6 dc e4 f6 fc a4 df b5  aeiAEI.........
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:51 [1428] [0] DEBUG: len: 25
2004-11-25 16:40:51 [1428] [0] DEBUG: size: 1024
2004-11-25 16:40:51 [1428] [0] DEBUG: immutable: 0
2004-11-25 16:40:51 [1428] [0] DEBUG: data: 61 65 69 41 45 49 c3 84 c3 96 c3 9c c3 a4 c3 b6  aeiAEI..........
2004-11-25 16:40:51 [1428] [0] DEBUG: data: c3 bc e2 82 ac c3 9f c2 b5  .........
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string at 0x81090e0:
2004-11-25 16:40:51 [1428] [0] DEBUG: len: 32
2004-11-25 16:40:51 [1428] [0] DEBUG: size: 1024
2004-11-25 16:40:51 [1428] [0] DEBUG: immutable: 0
2004-11-25 16:40:51 [1428] [0] DEBUG: data: ff fe 61 00 65 00 69 00 41 00 45 00 49 00 c4 00  ..a.e.i.A.E.I...
2004-11-25 16:40:51 [1428] [0] DEBUG: data: d6 00 dc 00 e4 00 f6 00 fc 00 ac 20 df 00 b5 00  ........... ....
2004-11-25 16:40:51 [1428] [0] DEBUG: Octet string dump ends.
2004-11-25 16:40:51 [1428] [0] DEBUG: Final result: aeiAEIÄÖÜäöü€ßµ
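The "len: 3" in the CVS run is what one would expect from a strlen()-based append: the UTF-16 output starts ff fe 61 00, and the 00 (the high byte of 'a') is taken as the string terminator. Below is a minimal standalone sketch of that effect. It is not the Kannel code: struct bytebuf, append_data() and append_cstr() are hypothetical stand-ins for Octstr, octstr_append_data() and octstr_append_cstr(), and the sample buffer is hand-written rather than produced by iconv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an Octstr: a fixed-size byte buffer plus length. */
struct bytebuf {
    unsigned char data[64];
    size_t len;
};

/* Append exactly n bytes, zero bytes included
 * (the role octstr_append_data() plays in the proposed fix). */
static void append_data(struct bytebuf *b, const void *src, size_t n)
{
    assert(n <= sizeof b->data - b->len);   /* sketch only: no resizing */
    memcpy(b->data + b->len, src, n);
    b->len += n;
}

/* Append up to the first zero byte
 * (the role octstr_append_cstr() plays in the broken code). */
static void append_cstr(struct bytebuf *b, const char *src)
{
    append_data(b, src, strlen(src));
}

/* "ae" as UTF-16LE with a byte order mark: 6 bytes of payload.
 * The extra trailing 0 only guarantees that strlen() terminates. */
#define UTF16_AE_LEN 6
static const char utf16_ae[] = {
    (char)0xff, (char)0xfe, 'a', 0, 'e', 0, 0
};
```

Calling append_cstr() on an empty buffer with utf16_ae stores only the 3 bytes ff fe 61, while append_data() with UTF16_AE_LEN stores all 6. In charset_convert() the corresponding length would be the number of bytes iconv actually wrote, i.e. pointer - to_buf.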
Regards
Jörg
-----Original Message-----
From: Alexander Malysh [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 24 November 2004 13:00
To: [EMAIL PROTECTED]
Subject: RE: Bug in gwlib/charset.c: int charset_convert(Octstr *string,
char *charset_from, char *charset_to)
Hi Joerg,
IMO, it's a bug. Please post a patch and I will commit it.
"Pommnitz, Jörg" wrote:
> No comment?
>
> -----Original Message-----
> From: Pommnitz, Jörg
> Sent: Monday, 22 November 2004 12:22
> To: Kannel-Devel (E-Mail)
> Subject: Bug in gwlib/charset.c: int charset_convert(Octstr *string,
> char *charset_from, char *charset_to)
>
>
> Hi List,
> the above-mentioned function seems buggy to me. It takes the result of the
> iconv operation and uses octstr_append_cstr(string, to_buf); to replace
> the old contents with the new ones. This cannot work when you convert to,
> say, UTF-16, where the result is quite likely to contain zero bytes in the
> middle: the append stops at the first one. I don't have a patch, but the
> right solution would probably be to use
>
> octstr_append_data(string, to_buf, pointer - to_buf);
>
> instead of
>
> octstr_append_cstr(string, to_buf);
>
> Regards
> Joerg
--
Thanks,
Alex
charset.diff
Description: Binary data
