On Sun, 2014-09-28 at 23:10 +0200, Rainer Jung wrote:
> IMHO it is a useful approach. When I looked at the CGI topic, I noticed
> that the safest thing is cleaning up in ap_create_environment(), because
> you can be sure to get every env var in your hands there, not only the
> ones coming from headers.

The "shellshock" recipe for mod_taint takes a bit of a kitchen-sink
approach:
 - The Request headers
 - The Request fields that haven't always been fully sanitised and
   that might try to smuggle something: PATH_INFO and QUERY_STRING
   (r->args).
 - subprocess_env

The point being, that comprehensively covers data coming from an HTTP
request and that might find its way into a subprocess environment!
Unless someone embeds a backdoor in some trojan, in which case all
bets are off ...

> Unfortunately that's not an extensible part of
> code (no hook). Furthermore cleaning the env vars only works for CGI,
> but not e.g. for FCGI, SCGI proxying etc. Sometimes it would be
> convenient to sanitize data on e.g. a revproxy and not on every internal
> web server maybe doing CGI. So I'd say mod_taint is a good compromise
> between the slightly higher safety of directly sanitizing the env vars
> and compatibility with current hooks.

You're looking at a WAF there. mod_taint is but one small part of the
functionality of a WAF, but just happens to be the part that sits between
untrusted Request data and a subprocess environment! Cleaning up headers
and QUERY_STRING/PATH_INFO protects a generic backend, anything from
local CGI to proxy.

> I haven't looked at your updated code yet (only at the old one that
> didn't easily allow to untaint all headers),

Indeed, it was only after your first email that I looked at it and
started thinking harder about whether it in fact did the job!

> but could it be helpful to
> allow narrowing down which headers and vars to untaint? Like e.g. with a
> regexp against the header/var name? Something between listing individual
> headers and untainting all headers? Or is there no real use case for this?

I don't really see a use case. HTTP rules allow arbitrary headers as
extensions and CGI rules translate those by formula to environment
variables, so we always need some kind of catch-all.
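
To make the "by formula" point concrete, here is a rough sketch of the
CGI naming rule: upper-case the header name, turn "-" into "_", and
prepend "HTTP_". The function name header_to_cgi_env is hypothetical
(it is not an httpd or mod_taint API), and mapping every other
non-alphanumeric character to "_" is this sketch's own assumption, not
something the CGI rules spell out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of httpd or mod_taint.
 * Writes the CGI environment variable name for an HTTP request header
 * into out. Returns 0 on success, -1 if out is too small.
 */
int header_to_cgi_env(const char *header, char *out, size_t outlen)
{
    static const char prefix[] = "HTTP_";
    size_t plen = sizeof(prefix) - 1;
    size_t hlen = strlen(header);
    size_t i;

    /* Need room for the prefix, the name, and the terminating NUL. */
    if (outlen < plen + hlen + 1)
        return -1;

    memcpy(out, prefix, plen);
    for (i = 0; i < hlen; i++) {
        unsigned char c = (unsigned char)header[i];
        /* Letters and digits are upper-cased; "-" becomes "_".
         * Assumption: anything else is also mapped to "_". */
        out[plen + i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    out[plen + hlen] = '\0';
    return 0;
}
```

So "Cookie" comes out as HTTP_COOKIE and an arbitrary extension such as
"X-Custom-Header" as HTTP_X_CUSTOM_HEADER. Since any header name a
client invents gets a variable this way, a fixed list of names cannot
cover them all.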
But I guess there's nothing to stop wildcarding being incorporated
into the apr_table_do that iterates over request_headers and
subprocess_env.

> Some headers have a more complex internal structure. Cookies come to
> mind. There one might want to untaint using a regexp against the cookie
> values, not the cookie header value.

What matters is the value passed to the subprocess as HTTP_COOKIE.
We're sanitising that, but parsing the (sanitised) value(s) is none
of our business.

--
Nick Kew
