On Sat, 15 Apr 2000, S Page wrote:
> Hey folks, my first post.
>
> I'm writing an Apache SSI as a mod_perl module. Apache 1.3.4 / mod_perl
> 1.18.
> When Web designers pass it HTML character entities, by the time they
> arrive at my module some become 8-bit characters.
>
> For example,
> <!--#perl sub="Macromedia::BuildNav::code" args="
> header_text=Gen&eacute;rat&ocirc;r reviews and Oscar&reg; awards
> product_color=#FFCC33
> "-->
>
> If I `print STDERR` or `$r->print` the args, they contain
> header_text=Gen\351rat\364r reviews and Oscar&reg; awards. The
> accented characters are encoded as 0xe9 and 0xf4.
>
> * What Apache/mod_perl code is doing the entity decoding?
mod_include.c
> * What character encoding is it using? It looks like ISO-8859-1.
ISO-8859-1; see decodehtml() in mod_include.c
> * Is there a way to disable this or control it myself?
not really, but you can turn it off with this patch, which passes 0
instead of 1 as the last (decode) argument to get_tag():
--- mod_include.c~ Tue Mar 14 20:07:10 2000
+++ mod_include.c Mon May 15 22:52:00 2000
@@ -989,7 +989,7 @@
         return DECLINED;
     }
     while (1) {
-        if (!(tag_val = get_tag(r->pool, in, tag, sizeof(tag), 1))) {
+        if (!(tag_val = get_tag(r->pool, in, tag, sizeof(tag), 0))) {
             break;
         }
         if (strnEQ(tag, "sub", 3)) {
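
For illustration only, here is a minimal standalone C sketch of the kind of decoding that the last argument to get_tag() switches on and off. It is not the code from mod_include.c: decode_entities() and its abbreviated table are hypothetical names made up for this example, and the table covers only a handful of entities. It maps known named entities to ISO-8859-1 bytes and copies anything it does not recognise (such as &reg; here) through unchanged, which matches the behaviour described in the original post.

```c
#include <string.h>

/* Hypothetical, abbreviated table: entity name -> ISO-8859-1 byte.
 * This subset is an assumption chosen for the example. */
struct entity {
    const char *name;
    unsigned char byte;
};

static const struct entity entities[] = {
    { "lt", 0x3C }, { "gt", 0x3E }, { "amp", 0x26 }, { "quot", 0x22 },
    { "eacute", 0xE9 }, { "ocirc", 0xF4 }
};

/* Decode known "&name;" entities in place when dodecode is nonzero.
 * With dodecode == 0 the string is left untouched, which is the
 * effect the patch above is after. Unknown entities are kept as-is. */
static void decode_entities(char *s, int dodecode)
{
    char *out = s;
    size_t i;
    size_t n = sizeof(entities) / sizeof(entities[0]);

    if (!dodecode)
        return;

    while (*s) {
        char *semi = (*s == '&') ? strchr(s, ';') : NULL;
        int matched = 0;

        if (semi) {
            size_t len = (size_t)(semi - s - 1);  /* name length */
            for (i = 0; i < n; i++) {
                if (strlen(entities[i].name) == len &&
                    strncmp(s + 1, entities[i].name, len) == 0) {
                    *out++ = (char)entities[i].byte;
                    s = semi + 1;                 /* skip past ';' */
                    matched = 1;
                    break;
                }
            }
        }
        if (!matched)
            *out++ = *s++;                        /* copy byte as-is */
    }
    *out = '\0';
}
```

Calling decode_entities(buf, 1) on the header_text value from the original post turns &eacute; and &ocirc; into the single bytes 0xE9 and 0xF4 and leaves &reg; alone; calling it with 0 returns the buffer exactly as the page author wrote it.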