All, On Tue, Sep 15, 2015 at 11:15 AM, Arvids Godjuks <arvids.godj...@gmail.com> wrote: > I fully support your effort to get this into the PHP to be part of core > extensions, or at least one of those that keep up with the language > releases. > This is a very good tool to have, and you can actually run it in production > to catch things that may slipped the stating (things happen). And it's > invaluable during during testing.
I'm against any form of tainting for a few reasons. First, it's only really useful as a first-order heuristic. A failure does not indicate a security vulnerability, and a non-failure does not indicate the absence of one. An example of a false-positive would be if (preg_match('(^[a-z0-9]+$)', $input)) echo $input;. Taint-based systems would flag that as a possible XSS, where it's clearly not in any context. An example of a false-negative would be the following: <a href=<?php echo htmlspecialchars($input); ?>>Bar</a>. If the input is "foo onclick=dosomethingbad();", an XSS is still executed. Another example of a false negative would be: $query = "SELECT * FROM foo WHERE bar = " . mysqli_real_escape_string($c, $input) . ";"; A false positive is potentially OK. A false negative is not. Second, it will encourage the improper pattern of "sanitize" functions to de-taint input. These are functions which call `mysqli_real_escape_string(htmlspecialchars(whatever(etc($input))))`. No matter what the context the variables are used in. This doesn't improve security (and in many cases can demonstrably harm it). It just encourages people to work around it. And that ignores that there's an `untaint()` function that they can call on any variable to "silence the errors". Third, it ignores context. This is related to the first two, but I think is a separate concern. An example from the taint RFC (https://wiki.php.net/rfc/taint) is the shell-execution. If the variable is used in the context of command, one escape function is needed. If it's used as an argument, another is needed. Detecting which is not something that's trivial for a language-level taint function. And calling the wrong function defeats the purpose. So if we can't detect the context, we can't reliably say whether it's safe or not. In the hands of an experienced engineer/security researcher, taint systems can be a really handy tool to help find possible issues. In the hands of a junior engineer, it can actually make things worse because it encourages them to just "silence the errors". Considering that it's not a reliable tool, and is only a heuristic, I would suggest that this be implemented as an extension, and not in the core of a programming language. With that said, if there are engine hooks necessary to implement this feature better as an extension, I'm all for adding those hooks. I'm just against adding something to the language that won't actually be able to accomplish what it promises. Escaping is always context-sensitive, in every case. Tainting as has been proposed in the past doesn't (and literally can't in the general case) know the context. Therefore it can't reliably know the correct way to escape. Consider the case of addslashes() and magic_quotes_gpc. The reason that they are insecure is that they don't know the context that the variable is used in. This leads to character set issues, as well as cases where backslashes enter contexts they don't escape (like HTML output). We learned years ago that context-insensitive escaping doesn't work and is literally bad. Let's not make the same mistakes again (even if it's for the correct reasons this time). Anthony -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php