On Sun, 2014-09-28 at 23:10 +0200, Rainer Jung wrote:

> IMHO it is a useful approach. Whan I looked at the CGI topic, I noticed 
> that the safest thing is cleaning up in ap_create_environment(), because 
> you can be sure to get every env var in your hands there, not only the 
> ones coming from headers.

The "shellshock" recipe for mod_taint takes a bit of a kitchen-sink
approach:
 - The Request headers
 - The Request fields that haven't always been fully sanitised
   and that might try to smuggle something: PATH_INFO and
   QUERY_STRING (r->args).
 - subprocess_env

The point being, that comprehensively covers data coming from
an HTTP request and that might find its way into a subprocess
environment!  Unless someone embeds a backdoor in some trojan,
in which case all bets are off ...

>  Unfortunately that's not an extensible part of 
> code (no hook). Furthermore cleaning the env vars only works for CGI, 
> but not e.g. for FCGI, SCGI proxying etc. Sometimes it would be 
> convenient to sanitize data on e.g. a revproxy and not on every internal 
> web server maybe doing CGI. So I'd say mod_taint is a good compromise 
> between the slightly higher safety of directly sanitizing the env vars 
> and compatibility with current hooks.

You're looking at a WAF there.  mod_taint is but one small part of
the functionality of a WAF, but just happens to be the part that
sits between untrusted Request data and a subprocess environment!
Cleaning up headers and QEURY_STRING/PATH_INFO protects a generic
backend, anything from local CGI to proxy.

> I haven't looked at your updated code yet (only at the old one that 
> didn't easily allow to untaint all headers),

Indeed, it was only after your first email that I looked at it
and started thinking harder about whether it in fact did the job!

>  but could it be helpful to 
> allow narrowing down which headers and vars to untaint? Like e.g. with a 
> regexp against the header/var name? Something between listing individual 
> headers und untainting all headers? Or is there no real use case for this?

I don't really see a use case.  HTTP rules allow arbitrary
headers as extensions and CGI rules translate those by
formula to environment variables, so we always need some
kind of catch-all.  But I guess there's nothing to stop
wildcarding being incorporated into the apr_table_do
that iterates over request_headers and subprocess_env.

> Some headers have a more complex internal structure. Cookies come to 
> mind. There one might want to untaint using a regexp against the cookie 
> values, not the cookie header value.

What matters is the value passed to the subprocess as
HTTP_COOKIE.  We're sanitising that, but parsing the
(sanitised) value(s) is none of our business.

-- 
Nick Kew

Reply via email to