Hi

What happens is that the document typically contains characters from the
ISO-8859-1 charset, as allowed/suggested by the RFC. Kannel expects
something else, I do not know what.

The WML compiler fails, dumping something like this in the logs:

2004-12-03 23:58:57 [13183] [7] INFO: Mutex gwlib/list.c:134: 2 locks, 0 collisions.
2004-12-03 23:58:57 [13183] [7] DEBUG: WSP: Converting from <text/vnd.wap.wml> to <application/vnd.wap.wmlc>
2004-12-03 23:58:57 [13183] [7] ERROR: WML compiler: Compiling error: libxml returned a NULL pointer
2004-12-03 23:58:57 [13183] [7] WARNING: WSP: WML compilation failed.
2004-12-03 23:58:57 [13183] [7] DEBUG: WSP: Content convertion failed!
2004-12-03 23:58:57 [13183] [7] WARNING: WSP: All converters for `text/vnd.wap.wml' at `http://www.aftenposten.no/mobil/wap/' failed.

Consequently, no content is sent to the phone.

The test for whether the content should be transcoded to UTF-8 before
compiling the WML code fails if no charset is set on the received document
at all. Since RFC 2616 (HTTP 1.1) says these documents shall be
interpreted according to the ISO-8859-1 charset, I found gwlib/http.c to be
a nice (and easy) place to put this functionality.
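
To illustrate the idea, here is a minimal, self-contained sketch. It is not
the patch itself and does not use any Kannel functions; the names
find_nocase() and default_text_charset() are hypothetical, made up for this
example. It assumes the Content-Type value is available as a plain C string
and simply appends the RFC 2616 default when a "text" subtype carries no
charset parameter.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: case-insensitive substring search.
 * Returns a pointer to the first match, or NULL if there is none. */
const char *find_nocase(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);

    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return haystack;
    }
    return NULL;
}

/* Hypothetical helper: copy content_type into out, appending
 * "; charset=ISO-8859-1" if the type is text/... and no charset is given.
 * Returns 1 if the default was added, 0 if the value was copied unchanged,
 * and -1 if out is too small.
 * Simplification: a parameter is only recognised when written as
 * "charset=" with no whitespace before the equals sign. */
int default_text_charset(const char *content_type, char *out, size_t out_size)
{
    int is_text, has_charset, written;

    assert(content_type != NULL && out != NULL);

    /* Skip leading whitespace in the header value. */
    while (isspace((unsigned char)*content_type))
        content_type++;

    is_text = (find_nocase(content_type, "text/") == content_type);
    has_charset = (find_nocase(content_type, "charset=") != NULL);

    if (is_text && !has_charset)
        written = snprintf(out, out_size, "%s; charset=ISO-8859-1",
                           content_type);
    else
        written = snprintf(out, out_size, "%s", content_type);

    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return (is_text && !has_charset) ? 1 : 0;
}
```

With this sketch, "text/vnd.wap.wml" becomes
"text/vnd.wap.wml; charset=ISO-8859-1", while a header that already names a
charset, or one that is not a "text" subtype, is left alone.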

If you feel it would be better to modify the test for whether to convert to
UTF-8, that's OK with me. But I would be very happy if Kannel treated these
pages properly, since about a third of the pages I visit fail with the
stock Kannel.
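
For reference, the ISO-8859-1 to UTF-8 step itself is simple, because every
ISO-8859-1 byte maps directly to the Unicode code point with the same value.
The following is only a standalone sketch of that conversion; the function
name latin1_to_utf8() is hypothetical, and Kannel's own charset handling is
not used or shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert src_len bytes of ISO-8859-1 text to a
 * NUL-terminated UTF-8 string in dst.
 * Returns the number of bytes written (not counting the NUL), or
 * (size_t)-1 if dst is too small. */
size_t latin1_to_utf8(const unsigned char *src, size_t src_len,
                      char *dst, size_t dst_size)
{
    size_t i, o = 0;

    assert(src != NULL && dst != NULL);

    for (i = 0; i < src_len; i++) {
        unsigned char c = src[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Always leave room for the terminating NUL. */
        if (o + need >= dst_size)
            return (size_t)-1;

        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            dst[o++] = (char)c;
        } else {
            /* 0x80..0xFF becomes a two-byte sequence: 110000xx 10xxxxxx. */
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }

    if (o >= dst_size)
        return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

A byte such as 0xE6 (the letter ae in ISO-8859-1) comes out as the two bytes
0xC3 0xA6, which is the form libxml expects when the document is declared as
UTF-8.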

I have saved a page that I think failed here:
  http://folesvaert.dyndns.org/tst/ap3.wml
(If this does not fail to compile then let me know, and I will produce a
page that does.)

I have been running with this patch from the beginning of December 2004
without any problems.

Regards
Rune

---
Rune Sætre <[EMAIL PROTECTED]>
NetCom as, Infrastruktur
Telefon (mob): 934 34 285

On Tue, 15 Feb 2005, Paul P Komkoff Jr wrote:

> Replying to Rune Saetre:
> > I think this patch should be applied before 1.4.1 is released.
> >
> > I have submitted this before, but without the [PATCH] tag in the subject
> > line. And here it is again, since I cannot see it has been applied to the
> > CVS.
> >
> > This patches the gateway-1.4.0/gwlib/http.c file from the 1.4.0 version to
> > set ISO-8859-1 as charset for subtypes of "text" if no charset is
> > specified in the headers.
>
> What happens if you don't ?
> E.g. we receive some text/plain without explicit charset set, then ...
>
> > This is in accordance with RFC 2616, section 3.7.1.
> > It also addresses bug #0000068.
> > Moreover, wapbox is totally useless without it here in Norway...
>
> So ?
>
> I know the reasoning I had when I invented NEW_CHARSET. The phone claims
> it supports multiple encodings - like, for example, latin1, koi8-r, and
> utf8, but does this in-line and with q=0.x weights. Then the webserver
> decides to present Russian content in, for example, koi8. But there's
> <?xml encoding=utf-8?> or whatever in the WML source anyway, and the WML
> compiler bombs at the stage where it calls libxml.
> So I am explicitly requesting UTF-8 in that case, then recoding any
> received page to UTF-8 before feeding it to libxml.
>
>
> But text/plain ...
>
> --
> Paul P 'Stingray' Komkoff Jr // http://stingr.net/key <- my pgp key
>  This message represents the official view of the voices in my head
>
