Re: [PATH] convert internal charset to UTF-8

Peter Christensen Thu, 20 Jul 2006 03:51:40 -0700

Hi Alex,

Awesome initiative! I've been hoping for this to happen for quite a while. There are a few issues, though:

1. In gwlib/latin1_to_gsm.h, <SP> (space) is replaced with <ESC> (0x1B), and <ESC> is mapped to NRP instead of just <ESC> (if you follow me).

2. For some odd reason, smsbox trims the message to 160 characters while it is in UTF-8 format. My usual charset test message, which contains all GSM characters except the Greek ones (wasn't possible before now), looks like this:


Test: @£$¥èéùìòÇ
Øø

Åå_ÆæßÉ!"#¤%&'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà^{}\[~]|€

Which in UTF-8 takes up 163 octets, but only 141 septets in GSM. When transmitting, the € is omitted, and judging from an ngrep of the data transferred from smsbox to bearerbox, it is smsbox that does the trimming. For the record, the string is exactly 160 octets long when the € is omitted.

Apparently it uses the size of the GSM string to determine when to split, but the trimming/splitting is done on the UTF-8 string. Obviously it is sms_split that is to blame, but why is this function used at all if splitting is done in bearerbox (according to comments in the source)? This problem is probably not directly related to the UTF-8 patch.
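
To illustrate the octet/septet mismatch, here is a minimal sketch in Python (not Kannel's C code, and not what sms_split actually does). The helper names gsm_septets, utf8_octets and truncate_utf8_octets are made up for this example, and the character tables are typed in by hand from the GSM 03.38 default alphabet, so treat them as an assumption rather than a copy of Kannel's mapping tables.

```python
# Hypothetical helpers for illustration only; none of these exist in Kannel.

# GSM 03.38 default alphabet: each of these costs one septet.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these is sent as <ESC> + code, so two septets.
GSM_EXTENDED = set("^{}\\[~]|€\f")


def gsm_septets(text):
    """Number of septets the text needs in the GSM default alphabet."""
    total = 0
    for ch in text:
        if ch in GSM_BASIC:
            total += 1
        elif ch in GSM_EXTENDED:
            total += 2
        else:
            raise ValueError("not representable in GSM 03.38: %r" % ch)
    return total


def utf8_octets(text):
    """Number of octets the text needs when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def truncate_utf8_octets(text, limit):
    """Naive trim on the encoded bytes; a cut-off trailing character is dropped."""
    return text.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    sample = "æøå" * 30 + "€"
    print(gsm_septets(sample))                      # 92
    print(utf8_octets(sample))                      # 183
    print(len(truncate_utf8_octets(sample, 160)))   # 80
```

The sample fits easily in a single 160-septet message, yet a limit of 160 applied to the UTF-8 octets cuts off the last ten letters and the €. A length check done in septets would leave the string alone.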


Med venlig hilsen / Best regards

Peter Christensen

Developer
------------------
Cool Systems ApS

Tel: +45 2888 1600
Mail: [EMAIL PROTECTED]
www: www.coolsystems.dk


Alexander Malysh wrote:

Hi all,
at http://www.kannel.org/~amalysh/kannel-utf8.patch is a not-so-huge patch that converts the internal Kannel charset to UTF-8. Please note that I didn't add smsbox compatibility code, which means smsbox expects the text body to be encoded in UTF-8 by default, and MOs will also be forwarded in UTF-8. This can be worked around with the charset cgi variable.
Please test it and send feedback/patches.
I will maintain this patch for a while as long as we don't decide to commit it to CVS.

Thanks,
Alex
