On Wed, 2001-11-21 at 17:34, Nektarios K. Papadopoulos wrote:

Yes, it is probably a bug.
When I coded it, I asked for help with this part because, first, I didn't know the xml_* functions and,
second, I usually only use iso-8859-1 or ucs-2 directly.
That's why there are so many debug lines around it.

Feel free to correct it.
BTW, there is probably a bug somewhere in this code that causes a panic. I've seen it once but I don't
recall what I did (besides passing some different charsets and codings).
Andreas Fink wrote:
> 
> >  > Index: gw/smsbox.c
> >>  ===================================================================
> >>  RCS file: /home/cvs/gateway/gw/smsbox.c,v
> >>  retrieving revision 1.156
> >>  diff -r1.156 smsbox.c
> >>  1392,1395d1391
> >>  <       if (charset_processing(charset, &body, coding) == -1) {
> >>  <           *status = 415;
> >>  <           ret = octstr_create("Charset or body misformed, rejected");
> >>  <       }
> >
> >Votes from the smsbox hackers on the proposed change?! Andreas?
> >Nick?
> 
> If it's a bug, let's fix it. I had a user complaining that he has
> problems with Greek characters. Sounds like the source of the problem.
> 

Actually, this is a bug (I think) that I found while trying to solve the
problem with Greek characters.

Removing this line is not enough.

The code in charset_processing does well when coding==DC_UCS2 (well, this
is the easy case).

It also does well when coding==DC_7BIT and charset=="ISO-8859-1" (well,
that is even easier: just do nothing).

But when coding==DC_7BIT and charset!="ISO-8859-1" it seems to be trying
to do something like this:
first encode to UTF-8,
then  convert UTF-8 to ISO-8859-1,
always using libxml calls.
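
To make the three cases easier to see side by side, here is a minimal sketch of the decision only. It is not the code from smsbox.c: classify() and the PATH_* names are hypothetical, the DC_* values are local stand-ins whose numbers are assumptions, and the exact-match strcmp() is a simplification.

```c
#include <assert.h>
#include <string.h>

/* Local stand-ins for the gateway's coding constants.
 * The numeric values are assumptions made for this sketch. */
enum { DC_7BIT = 1, DC_UCS2 = 3 };

/* Hypothetical labels for the three cases described in the message. */
enum charset_path {
    PATH_UCS2,      /* coding == DC_UCS2: reported to work */
    PATH_NOTHING,   /* 7-bit and ISO-8859-1: body is left alone */
    PATH_VIA_UTF8,  /* 7-bit and any other charset: the two-step path */
    PATH_UNKNOWN    /* any coding not discussed in the message */
};

/* Decide which case a (charset, coding) pair falls into. */
enum charset_path classify(const char *charset, int coding)
{
    assert(charset != NULL);

    if (coding == DC_UCS2)
        return PATH_UCS2;
    if (coding == DC_7BIT) {
        /* Exact match only; a real implementation would likely
         * compare charset names case-insensitively. */
        if (strcmp(charset, "ISO-8859-1") == 0)
            return PATH_NOTHING;
        return PATH_VIA_UTF8;
    }
    return PATH_UNKNOWN;
}
```

With this, classify("ISO-8859-7", DC_7BIT) lands in PATH_VIA_UTF8, which is the case that goes wrong for Greek text.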

Actually the code for UTF-8 to ISO-8859-1 is wrong and commented out.
      /* UTF-8 to ISO-8859-1 */
/*    charset = octstr_create("ISO-8859-1");
      if (charset_from_utf8(new_body, &temp, charset) >= 0) {
          octstr_destroy(new_body);
          new_body = temp;
          octstr_dump(new_body, 0);

          octstr_destroy(charset);
      } else {
          octstr_destroy(charset);
          octstr_destroy(new_body);
          return NULL;
      }
      debug("sms.http", 0, "coding=7bit, after iso8859-1, msgdata is %s",
            octstr_get_cstr(new_body));
*/

Anyway, it would *NOT* do the job. libxml maps any character that does
not map directly to ISO-8859-1 to something like &#x39C; (the XML way),
which is not good!
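
To illustrate the effect, here is a small stand-alone sketch that imitates this fallback. It does not call libxml; latin1_with_charrefs() is a hypothetical helper, and taking already-decoded code points as input is a simplification so that no UTF-8 decoder is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a libxml or gateway function.
 * Writes each code point as one ISO-8859-1 byte when it fits and as
 * an XML numeric character reference otherwise. The result is
 * NUL-terminated; the return value is the number of bytes written,
 * not counting the terminator. */
size_t latin1_with_charrefs(const uint32_t *cps, size_t n,
                            char *out, size_t outsz)
{
    size_t used = 0;

    assert(out != NULL && outsz > 0);

    for (size_t i = 0; i < n; i++) {
        char tmp[16];
        int len;

        if (cps[i] <= 0xFF) {
            /* U+0000..U+00FF map one-to-one onto ISO-8859-1. */
            tmp[0] = (char)cps[i];
            len = 1;
        } else {
            /* Everything else becomes a reference such as &#x39C; */
            len = snprintf(tmp, sizeof tmp, "&#x%lX;",
                           (unsigned long)cps[i]);
        }
        if (used + (size_t)len >= outsz)
            break;  /* stop rather than overflow the buffer */
        memcpy(out + used, tmp, (size_t)len);
        used += (size_t)len;
    }
    out[used] = '\0';
    return used;
}
```

A single Greek capital Mu (U+039C) turns into seven bytes of markup, and that markup is what would end up in the SMS body.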

I am working on a solution for the Greek characters, which should be
relatively easy since the GSM default alphabet has all the Greek capital
letters.
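
As a rough idea of what such a mapping could look like (not the solution I am working on, just a sketch): the table below maps the ISO-8859-7 capitals 0xC1..0xD9 to GSM 03.38 default alphabet codes. The function names are hypothetical. Capitals that have no code of their own in the GSM alphabet (Alpha, Beta, Epsilon and so on) are sent to the Latin letter with the same glyph, which is an assumption about the intended behaviour. Lowercase and accented letters are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* GSM 03.38 default alphabet codes for ISO-8859-7 bytes 0xC1..0xD9.
 * -1 marks 0xD2, which is unassigned in ISO-8859-7. */
static const int greek_to_gsm[25] = {
    0x41, /* 0xC1 Alpha   -> A */
    0x42, /* 0xC2 Beta    -> B */
    0x13, /* 0xC3 Gamma      */
    0x10, /* 0xC4 Delta      */
    0x45, /* 0xC5 Epsilon -> E */
    0x5A, /* 0xC6 Zeta    -> Z */
    0x48, /* 0xC7 Eta     -> H */
    0x19, /* 0xC8 Theta      */
    0x49, /* 0xC9 Iota    -> I */
    0x4B, /* 0xCA Kappa   -> K */
    0x14, /* 0xCB Lambda     */
    0x4D, /* 0xCC Mu      -> M */
    0x4E, /* 0xCD Nu      -> N */
    0x1A, /* 0xCE Xi         */
    0x4F, /* 0xCF Omicron -> O */
    0x16, /* 0xD0 Pi         */
    0x50, /* 0xD1 Rho     -> P */
    -1,   /* 0xD2 unassigned */
    0x18, /* 0xD3 Sigma      */
    0x54, /* 0xD4 Tau     -> T */
    0x59, /* 0xD5 Upsilon -> Y */
    0x12, /* 0xD6 Phi        */
    0x58, /* 0xD7 Chi     -> X */
    0x17, /* 0xD8 Psi        */
    0x15  /* 0xD9 Omega      */
};

/* Return the GSM code for one ISO-8859-7 capital, or -1 if the byte
 * is not a Greek capital letter. */
int iso8859_7_capital_to_gsm(unsigned char c)
{
    if (c < 0xC1 || c > 0xD9)
        return -1;
    return greek_to_gsm[c - 0xC1];
}

/* Convert a run of Greek capitals; stops at the first byte that has
 * no mapping and returns the number of bytes converted. */
size_t greek_capitals_to_gsm(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < len; i++) {
        int code = iso8859_7_capital_to_gsm(in[i]);
        if (code < 0)
            break;
        out[i] = (unsigned char)code;
    }
    return i;
}
```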

But I don't know how to give a more general solution for all the other
possible charsets (that is, anything besides ISO-8859-7, which is for Greek).
