Andreas Fink wrote:
>
> >> Index: gw/smsbox.c
> >> ===================================================================
> >> RCS file: /home/cvs/gateway/gw/smsbox.c,v
> >> retrieving revision 1.156
> >> diff -r1.156 smsbox.c
> >> 1392,1395d1391
> >> < if (charset_processing(charset, &body, coding) == -1) {
> >> < *status = 415;
> >> < ret = octstr_create("Charset or body misformed, rejected");
> >> < }
> >
> >votings from the smsbox hackers for the proposed change?! Andreas?
> >Nick?
>
> if its a bug, lets fix it. I had a user complaining that he has
> problems with greek characters. Sounds like the source of the problem.
>
Actually this is a bug (I think) that I found while trying to solve the
problem with Greek characters.
Removing these lines is not enough.
The code in charset_processing() works when coding==DC_UCS2 (that is
the easy case).
It also works when coding==DC_7BIT and charset=="ISO-8859-1" (even
easier: it just does nothing).
But when coding==DC_7BIT and charset!="ISO-8859-1" it seems to be trying
to do something like this:
first ... encode to UTF-8
then UTF-8 to ISO-8859-1
always using libxml calls.
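
To make the three cases easier to follow, here is a minimal sketch of the
dispatch as I read it. It is not the smsbox code: the enum values and the
function name charset_path() are stand-ins invented for illustration, and
the exact string comparison is an assumption.

```c
#include <string.h>

/* Stand-ins for Kannel's coding constants; names and values are illustrative. */
enum coding { CODING_7BIT, CODING_8BIT, CODING_UCS2 };

enum path {
    PATH_UCS2,          /* coding is UCS-2: the case reported to work */
    PATH_NOTHING,       /* 7-bit with ISO-8859-1: body is left untouched */
    PATH_VIA_UTF8,      /* 7-bit, other charset: to UTF-8, then to ISO-8859-1 */
    PATH_NOT_DESCRIBED  /* anything this message does not discuss */
};

/* Hypothetical helper: report which of the described paths applies. */
enum path charset_path(enum coding coding, const char *charset)
{
    if (coding == CODING_UCS2)
        return PATH_UCS2;
    if (coding == CODING_7BIT) {
        if (strcmp(charset, "ISO-8859-1") == 0)
            return PATH_NOTHING;
        return PATH_VIA_UTF8;   /* the branch that breaks Greek text */
    }
    return PATH_NOT_DESCRIBED;
}
```

With this, charset_path(CODING_7BIT, "ISO-8859-7") lands in the
two-step branch, which is where the trouble is.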
Actually the code for UTF-8 to ISO-8859-1 is wrong and commented out.
/* UTF-8 to ISO-8859-1 */
/* charset = octstr_create("ISO-8859-1");
   if (charset_from_utf8(new_body, &temp, charset) >= 0) {
       octstr_destroy(new_body);
       new_body = temp;
       octstr_dump(new_body, 0);
       octstr_destroy(charset);
   } else {
       octstr_destroy(charset);
       octstr_destroy(new_body);
       return NULL;
   }
   debug("sms.http", 0, "coding=7bit, after iso8859-1, msgdata is %s",
         octstr_get_cstr(new_body));
*/
Anyway it would *NOT* do the job. libxml maps any character that does
not map directly to ISO-8859-1 to a numeric character reference, for
example &#924; for Μ (the XML way).
Which is not good!
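
To show what ends up in the message body, here is a small standalone
sketch that imitates this fallback. It does not call libxml; the function
name utf8_to_latin1_with_refs() is made up for this example, and it only
handles UTF-8 sequences of up to three bytes.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Convert a NUL-terminated UTF-8 string to ISO-8859-1. Any code point
 * above U+00FF is written as a decimal numeric character reference.
 * Returns 0 on success, -1 on malformed or unsupported input or if
 * the output buffer is too small.
 */
int utf8_to_latin1_with_refs(const char *in, char *out, size_t outlen)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t o = 0;

    while (*p != '\0') {
        unsigned long cp;

        if (p[0] < 0x80) {                        /* one-byte sequence */
            cp = p[0];
            p += 1;
        } else if ((p[0] & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            cp = ((unsigned long)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
            p += 2;
        } else if ((p[0] & 0xF0) == 0xE0 && (p[1] & 0xC0) == 0x80
                   && (p[2] & 0xC0) == 0x80) {
            cp = ((unsigned long)(p[0] & 0x0F) << 12)
                 | ((unsigned long)(p[1] & 0x3F) << 6) | (p[2] & 0x3F);
            p += 3;
        } else {
            return -1;
        }

        if (cp <= 0xFF) {
            /* Representable in ISO-8859-1: copy as a single byte. */
            if (o + 1 >= outlen)
                return -1;
            out[o++] = (char)(unsigned char)cp;
        } else {
            /* Not representable: emit the reference as plain text. */
            int n = snprintf(out + o, outlen - o, "&#%lu;", cp);
            if (n < 0 || (size_t)n >= outlen - o)
                return -1;
            o += (size_t)n;
        }
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}
```

A single Greek capital Mu (bytes CE 9C in UTF-8) comes out as the six
characters of its reference, so one letter in the SMS turns into six.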
I am working on a solution for the Greek characters, which should be
relatively easy since the GSM default alphabet covers all the Greek
capital letters (some directly, the rest through identical-looking
Latin capitals).
But I don't know how to give a more general solution for all the other
possible charsets (that is, anything besides ISO-8859-7, which is Greek).
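
For the Greek case, the kind of mapping I have in mind could look
roughly like this. It is only a sketch, not what Kannel does today: the
function name greek_capital_to_gsm() is hypothetical, it maps straight
from ISO-8859-7 bytes to GSM 03.38 default alphabet codes, and lowercase
and accented letters are left out.

```c
/*
 * Map one ISO-8859-7 byte holding a Greek capital letter to its GSM 03.38
 * default alphabet code. Letters that look like Latin capitals reuse the
 * Latin code. Returns -1 for any other byte.
 */
int greek_capital_to_gsm(unsigned char c)
{
    /* Indexed by c - 0xC1, covering 0xC1 (Alpha) to 0xD9 (Omega).
       0xD2 is unassigned in ISO-8859-7. */
    static const signed char map[] = {
        /* Alpha Beta Gamma Delta Epsilon Zeta Eta Theta */
        'A', 'B', 0x13, 0x10, 'E', 'Z', 'H', 0x19,
        /* Iota Kappa Lambda Mu Nu Xi Omicron Pi */
        'I', 'K', 0x14, 'M', 'N', 0x1A, 'O', 0x16,
        /* Rho (unassigned) Sigma Tau Upsilon Phi Chi Psi */
        'P', -1, 0x18, 'T', 'Y', 0x12, 'X', 0x17,
        /* Omega */
        0x15
    };

    if (c < 0xC1 || c > 0xD9)
        return -1;
    return map[c - 0xC1];
}
```

Every other charset would need its own table like this, which is why I
don't see a general solution yet.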