On Sat, 15 Apr 2000, S Page wrote:

> Hey folks, my first post.
> 
> I'm writing an Apache SSI as a mod_perl module.  Apache 1.3.4 / mod_perl
> 1.18.
> When Web designers pass it HTML character entities, by the time they
> arrive at my module some become 8-bit characters.
> 
> For example,
> <!--#perl sub="Macromedia::BuildNav::code" args="
>         header_text=Gen&eacute;rat&ocirc;r reviews and Oscar&reg; awards
>         product_color=#FFCC33
> "-->
> 
> If I `print STDERR` or `$r->print` out args, it contains
> header_text=Gen \351 erat \364 r reviews and Oscar&reg; awards.  The
> accented characters are encoded as 0xe9 0xf4.
> 
> *   What Apache/mod_perl code is doing the entity decoding?

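mod_include.c. The SSI parser decodes entities in tag values as it reads
them, so they are already decoded by the time your sub sees its args.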

> *   What character encoding is it using?   It looks like ISO-8859-1.

Yes, ISO-8859-1. See decodehtml() in mod_include.c.
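
To make the behaviour concrete, here is a rough sketch of what this kind of
decoding does. It is not the Apache code: demo_entities and
demo_decode_entities() are names made up for illustration, and the table is a
tiny hand-picked subset. It only shows how a named entity collapses to a
single ISO-8859-1 byte while an entity missing from the table passes through
untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of an entity table: name -> ISO-8859-1 byte.
 * The real table in mod_include.c is larger and laid out differently. */
struct entity {
    const char *name;
    unsigned char latin1;
};

static const struct entity demo_entities[] = {
    { "amp",    0x26 },
    { "lt",     0x3C },
    { "gt",     0x3E },
    { "quot",   0x22 },
    { "eacute", 0xE9 },
    { "ocirc",  0xF4 },
};

/* Decode known "&name;" entities in place and return s.
 * Safe in place because the output is never longer than the input.
 * Unknown entities are copied through unchanged. */
static char *demo_decode_entities(char *s)
{
    const size_t n = sizeof demo_entities / sizeof demo_entities[0];
    char *in = s;
    char *out = s;

    while (*in) {
        if (*in == '&') {
            char *semi = strchr(in, ';');
            if (semi != NULL) {
                size_t len = (size_t)(semi - in) - 1;  /* name length */
                size_t i;
                for (i = 0; i < n; i++) {
                    if (strlen(demo_entities[i].name) == len &&
                        strncmp(in + 1, demo_entities[i].name, len) == 0) {
                        break;
                    }
                }
                if (i < n) {
                    *out++ = (char)demo_entities[i].latin1;
                    in = semi + 1;      /* skip past the ';' */
                    continue;
                }
            }
        }
        *out++ = *in++;                 /* ordinary byte, copy as-is */
    }
    *out = '\0';
    return s;
}
```

Run on the header_text value from your example, this turns &eacute; and
&ocirc; into the bytes 0xe9 and 0xf4 and leaves &reg; alone, because the
sketch's table has no entry for it. That would match what you are seeing,
though I have not checked which names the real table covers.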

> *   Is there a way to disable this or control it myself?

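Not really, but you can turn it off with this patch. It changes the last
argument of get_tag() (the decode flag) from 1 to 0 for the #perl directive,
so you will need to rebuild mod_include afterwards.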

--- mod_include.c~      Tue Mar 14 20:07:10 2000
+++ mod_include.c       Mon May 15 22:52:00 2000
@@ -989,7 +989,7 @@
         return DECLINED;
     }
     while (1) {
-        if (!(tag_val = get_tag(r->pool, in, tag, sizeof(tag), 1))) {
+        if (!(tag_val = get_tag(r->pool, in, tag, sizeof(tag), 0))) {
             break;
         }
         if (strnEQ(tag, "sub", 3)) {
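
If patching the server is not an option, one workaround is to undo the
damage on the way out: since ISO-8859-1 byte values are the same as the
corresponding Unicode code points, each 8-bit byte can be written back as a
numeric character reference. A minimal sketch, shown in C to match the patch
(the same loop is short in Perl); demo_reencode_latin1() is a made-up name,
not part of Apache or mod_perl.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy src to dst, replacing every byte >= 0x80 with "&#NNN;".
 * Returns the length written (excluding the NUL), or -1 if dst is too small.
 * Assumes src is ISO-8859-1, where the byte value equals the code point. */
static int demo_reencode_latin1(const char *src, char *dst, size_t dstlen)
{
    const unsigned char *p;
    size_t used = 0;

    for (p = (const unsigned char *)src; *p; p++) {
        if (*p >= 0x80) {
            /* 128..255 always prints as 3 digits: "&#" + 3 + ";" = 6,
             * plus one byte kept free for the terminating NUL. */
            if (dstlen - used < 7) {
                return -1;
            }
            snprintf(dst + used, dstlen - used, "&#%u;", (unsigned)*p);
            used += 6;
        } else {
            if (dstlen - used < 2) {
                return -1;
            }
            dst[used++] = (char)*p;
        }
    }
    if (used >= dstlen) {
        return -1;                      /* only reachable when dstlen == 0 */
    }
    dst[used] = '\0';
    return (int)used;
}
```

With the decoded text from your example, "Gen\351rat\364r" comes back as
"Gen&#233;rat&#244;r". Note that this cannot tell a decoded entity from an
8-bit character the designer typed directly; both end up as numeric
references.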

