Friends
Why not use iconv?
I have been experimenting with it, as I need character sets that libxml
cannot deal with.
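
Here is a minimal sketch of what I mean, assuming POSIX iconv(3) is
available on the build host. convert_charset() is a made-up helper name,
not something in the gateway, and a real patch would presumably take and
return Octstr rather than raw buffers.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the gateway): convert `len` bytes at
 * `in` from charset `from` to charset `to`.
 * Returns a malloc()ed, NUL-terminated buffer that the caller frees, and
 * stores its length in *out_len. Returns NULL on any failure: unknown
 * charset, invalid input, output buffer too small, or a conversion that
 * iconv reports as irreversible.
 */
char *convert_charset(const char *from, const char *to,
                      const char *in, size_t len, size_t *out_len)
{
    assert(from != NULL && to != NULL && in != NULL && out_len != NULL);

    iconv_t cd = iconv_open(to, from);   /* note: target charset comes first */
    if (cd == (iconv_t) -1)
        return NULL;                     /* this charset pair is not supported */

    size_t cap = len * 4 + 16;           /* assumption: enough for common targets */
    char *out = malloc(cap + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *src = (char *) in;             /* iconv() wants a non-const pointer */
    char *dst = out;
    size_t src_left = len;
    size_t dst_left = cap;

    /* (size_t)-1 is an error; a positive count means lossy substitutions. */
    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc == 0)
        rc = iconv(cd, NULL, NULL, &dst, &dst_left);  /* flush shift state */
    iconv_close(cd);

    if (rc != 0) {
        free(out);
        return NULL;
    }

    *out_len = cap - dst_left;
    out[*out_len] = '\0';
    return out;
}
```

Something like convert_charset("ISO-8859-7", "UTF-8", body, len, &n) would
then cover the Greek case, provided the local iconv knows ISO-8859-7
(iconv -l lists what is installed). Treating lossy conversions as failures
is a choice on my part; the caller could equally decide to accept them.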
Worik
Bruno David Simões Rodrigues <[EMAIL PROTECTED]> writes:
> On Wed, 2001-11-21 at 17:34, Nektarios K. Papadopoulos wrote:
> Yes, it should be a bug.
> When I coded it, I've asked for help in this part because: 1st I didn't know
> the xml_* functions and
> 2nd: I usually only use iso-8859-1 or ucs-2 directly.
> That's why there's so many debug lines around it.
> Feel free to correct it.
> BTW, there should be a bug somewhere in this code that panics, I've seen it
> once but I don't
> recall what I've done (besides passing some different charsets and codings).
>
> Andreas Fink wrote:
> >
> > > > Index: gw/smsbox.c
> > >> ===================================================================
> > >> RCS file: /home/cvs/gateway/gw/smsbox.c,v
> > >> retrieving revision 1.156
> > >> diff -r1.156 smsbox.c
> > >> 1392,1395d1391
> > >> < if (charset_processing(charset, &body, coding) == -1) {
> > >> < *status = 415;
> > >> < ret = octstr_create("Charset or body misformed, rejected");
> > >> < }
> > >
> > >votings from the smsbox hackers for the proposed change?! Andreas?
> > >Nick?
> >
> > If it's a bug, let's fix it. I had a user complaining that he has
> > problems with greek characters. Sounds like the source of the problem.
> >
>
> Actually this is a bug (I think) I found trying to solve the problem
> with greek characters.
>
> Removing this line is not enough.
>
> The code in charset processing does well when coding==DC_UCS2 (well this
> is the easy case).
>
> It also does well when coding==DC_7BIT and charset=="ISO-8859-1" (well,
> that is even easier: just do nothing).
>
> But when coding==DC_7BIT and charset!="ISO-8859-1" it seems to be trying
> to do something like this:
> first ... encode to UTF-8
> then UTF-8 to ISO-8859-1
> always using libxml calls.
>
> Actually the code for UTF-8 to ISO-8859-1 is wrong and commented out.
> /* UTF-8 to ISO-8859-1 */
> /* charset = octstr_create("ISO-8859-1");
>    if (charset_from_utf8(new_body, &temp, charset) >= 0) {
>        octstr_destroy(new_body);
>        new_body = temp;
>        octstr_dump(new_body, 0);
>
>        octstr_destroy(charset);
>    } else {
>        octstr_destroy(charset);
>        octstr_destroy(new_body);
>        return NULL;
>    }
>    debug("sms.http", 0, "coding=7bit, after iso8859-1, msgdata is %s",
>          octstr_get_cstr(new_body));
> */
>
> Anyway it would *NOT* do the job. libxml maps any characters that do not
> map directly to ISO-8859-1 to something like this: &#924; (the XML way).
> Which is not good!
>
> I am working on a solution for the greek characters, which must be
> relatively easy since GSM default alphabet has all the GREEK capital
> letters.
>
> But I don't know how to give a more general solution for all the possible
> charsets (other than ISO-8859-7, which is for Greek).
>
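
For the Greek capitals mentioned above, a plain lookup table may be enough.
This is only a sketch, assuming the input bytes are ISO-8859-7 and the
output is a GSM 03.38 default alphabet code; greek_capital_to_gsm() is a
hypothetical name, not an existing gateway function. Capitals that look
like Latin letters (Alpha, Beta, Epsilon and so on) have no code of their
own in the GSM alphabet, so they are mapped to the Latin look-alike.

```c
#include <assert.h>

/*
 * GSM 03.38 default alphabet codes for the ISO-8859-7 capitals
 * 0xC1 (Alpha) through 0xD9 (Omega). 0xD2 is unassigned in ISO-8859-7.
 */
static const signed char greek_to_gsm[] = {
    0x41, /* 0xC1 Alpha   -> 'A' */
    0x42, /* 0xC2 Beta    -> 'B' */
    0x13, /* 0xC3 Gamma          */
    0x10, /* 0xC4 Delta          */
    0x45, /* 0xC5 Epsilon -> 'E' */
    0x5A, /* 0xC6 Zeta    -> 'Z' */
    0x48, /* 0xC7 Eta     -> 'H' */
    0x19, /* 0xC8 Theta          */
    0x49, /* 0xC9 Iota    -> 'I' */
    0x4B, /* 0xCA Kappa   -> 'K' */
    0x14, /* 0xCB Lambda         */
    0x4D, /* 0xCC Mu      -> 'M' */
    0x4E, /* 0xCD Nu      -> 'N' */
    0x1A, /* 0xCE Xi             */
    0x4F, /* 0xCF Omicron -> 'O' */
    0x16, /* 0xD0 Pi             */
    0x50, /* 0xD1 Rho     -> 'P' */
    -1,   /* 0xD2 unassigned     */
    0x18, /* 0xD3 Sigma          */
    0x54, /* 0xD4 Tau     -> 'T' */
    0x59, /* 0xD5 Upsilon -> 'Y' */
    0x12, /* 0xD6 Phi            */
    0x58, /* 0xD7 Chi     -> 'X' */
    0x17, /* 0xD8 Psi            */
    0x15  /* 0xD9 Omega          */
};

static_assert(sizeof greek_to_gsm == 0xD9 - 0xC1 + 1,
              "table must cover 0xC1..0xD9 exactly");

/*
 * Return the GSM default alphabet code for an ISO-8859-7 Greek capital,
 * or -1 if `c` is not one. Lowercase and accented letters are not handled.
 */
int greek_capital_to_gsm(unsigned char c)
{
    if (c < 0xC1 || c > 0xD9)
        return -1;
    return greek_to_gsm[c - 0xC1];
}
```

Lowercase and accented Greek letters would still need a policy of their
own (fold to the capital, or reject), and this does nothing for the other
charsets.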
--
Worik Macky Turei Stanton
Whew! [EMAIL PROTECTED]
Aotearoa