I've been over this with Nick before: mod_proxy_html uses mod_xml2enc to do the detection magic, but mod_xml2enc fails to detect compressed content correctly. Hence a simple "ProxyHTMLEnable" fails when content compression is in place.
To work around this without dropping support for content compression you can do

    SetOutputFilter INFLATE;xml2enc;proxy-html;DEFLATE

or at least that was the kind-of-result of the half-finished discussion last time. Aside from being plain ugly and troublesome to use (e.g. if you want to use AddOutputFilter somewhere else), the above also has a major shortcoming, which lies with already-compressed content. Suppose the client does

    GET /something.tar.gz HTTP/1.1
    ...
    Accept-Encoding: gzip, deflate

to which the backend will respond with 200 but *not* send a "Content-Encoding" header, since the content is already encoded. Using the above filter chain "corrupts" the content because it will be inflated and then deflated, double compressing it in the end.

Imho this whole issue lies with proxy_html using xml2enc to do the content type detection and xml2enc failing to detect the content encoding. I guess all it really takes is to have xml2enc inspect the headers_in to see if there is a "Content-Encoding" header and then add the inflate/deflate filters (unless there is a general reason not to rely on the input headers, see below).

Adding the inflate/deflate filters inside xml2enc is where I need some advice. For the deflate part I can probably do something like

    const char *compression_method = apr_table_get(f->r->headers_in, "Content-Encoding");
    if (compression_method != NULL && strncasecmp(compression_method, "gzip", 4) == 0) {
        ap_add_output_filter("deflate", NULL, f->r, f->c);
    }

but what about the inflate part? I can't simply add the inflate input filter, because at that point (in mod_xml2enc's xml2enc_ffunc()) I would then need to "go back" in the input filter chain, which is afaik not possible. So I would have to run the inflate input filter "in place".

Of course, this whole issue would disappear if inflate/deflate were run automagically (upon seeing a Content-Encoding header) in general.
Anyway, what's the reasoning behind not having inflate/deflate always run and giving them the knowledge (e.g. about the input headers) to get out of the way if necessary?
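
One side note on the header check itself: a plain strncasecmp(..., "gzip", 4) only looks at the first four characters, so it would also accept a value like "gzipped" and would miss "x-gzip" or a list such as "deflate, gzip". Below is a rough sketch of a stricter token check in plain C. The helper name encoding_has_gzip() is made up for illustration and is not part of mod_xml2enc or APR, and the sketch only answers "is gzip listed at all"; it does not look at the order of multiple codings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Hypothetical helper (assumption, not an httpd/APR function):
 * returns true if a Content-Encoding header value lists "gzip" or
 * "x-gzip" as one of its comma-separated tokens.
 */
static bool encoding_has_gzip(const char *value)
{
    if (value == NULL) {
        return false;               /* header not present */
    }

    const char *p = value;
    while (*p != '\0') {
        /* Skip separators and optional whitespace before a token. */
        while (*p == ',' || *p == ' ' || *p == '\t') {
            p++;
        }

        /* Token length: everything up to the next separator or end. */
        size_t len = strcspn(p, ", \t");

        /* Compare the whole token, case-insensitively. */
        if ((len == 4 && strncasecmp(p, "gzip", 4) == 0) ||
            (len == 6 && strncasecmp(p, "x-gzip", 6) == 0)) {
            return true;
        }
        p += len;
    }
    return false;
}
```

Inside the filter, the value returned by apr_table_get() could be passed straight to such a helper, since it already copes with a NULL pointer when the header is missing.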