> On Jun 21, 2021, at 7:18 PM, Benjamin Morel <benjamin.mo...@gmail.com> wrote: > > On Tue, 22 Jun 2021 at 01:06, Derick Rethans <der...@php.net> wrote: > >> On 21 June 2021 23:37:56 BST, Yasuo Ohgaki <yohg...@ohgaki.net> wrote: >>> >>> The name "is_trusted" is misleading. >>> Literal is nothing but literal. >> >> I agree with this. The name is_trusted is going to be the same naming >> mistake as "safe mode" was. Developers will put their trust in it that it >> is 100% guaranteed safe. > > > FWIW, agreed, too. Trusted is vague and may imply some false sense of > security. Literal is literally what it says on the tin.
I proposed the name change originally. For my inspiration take a look at Trusted Types API in Javascript: https://developer.mozilla.org/en-US/docs/Web/API/Trusted_Types_API <https://developer.mozilla.org/en-US/docs/Web/API/Trusted_Types_API> And also Trusted Types API from the W3C https://w3c.github.io/webappsec-trusted-types/dist/spec/ <https://w3c.github.io/webappsec-trusted-types/dist/spec/> The reason is_trusted() would be better than is_literal() is so that the function could in the future handle other trusted types that are definitely not literal and allow most code that using is_trusted() to continue to work. So, for example, in the future we could add a new TrustedInterface and if an object implemented the trusted interface and a __ToString() method it could be considered trusted by code that uses the is_trusted() to guard against untrusted code. Alternately if in the future PHP added a TrustedAttribute we could use on the __ToString() method to declare an object that could be used when a trusted string is required. ----- Conversely, I am *strongly* opposed to is_literal() as proposed. The current RFC proposes half a solution without first envisioning what the other half of the solution will look like. If an is_literal() RFC were passed I would envision the following things happening: 1. Bloggers would start blogging about how all PHP users should use is_literal() when accepting SQL and HTML code, but won't provide any of the nuances about special-cases because many won't understand them, they will just see a new PHP feature to hype. 2. PHP listing tools will jump on is_literal() and champion its use. Just look at Psalm. 3. Coding standards teams in large organizations will see that the listing tools are flagging non-literals and wil start requiring all SQL and HTML etc. code to be literals. And since they already bar the use of eval() there will be no way to work around the requirement when you actually know what you are doing. And because they are in large organizations, they will be divorced from the developers in the trenches who 95% of the time will have no problem with the requirement, but 5% of the time will simply not be able to achieve their development needs. Unfortunately these trench developers will not have enough power in the organization to get the coding standards teams to even list to their issues, especially if they are an agency or contractor. 4. Library developers will similarly adopt is_literal() as a requirement to use their library. While the most popular ones will likely be enlightened enough they will address edge cases. many in the long tail will not. So lots of the long-tail PHP libraries out there — and especially the ones only needed by a small audience so there won't be alternatives — will not address exceptions. And for any of many reasons, many of those library developers will not be responsive to those who submit PRs to provide workarounds. THIS is the future I do not want to see, and if is_trusted() is renamed back to is_literal() this is the future I predict. Maybe it won't happen, but I personally do not want to risk it. The biggest problem with the RFC is it does not address how to bypass someone else's requirement for use of is_literal() when you *know* what you are doing and you absolutely need to. Craig and I have had several long discussions about this, and he said he did not want to allow that because someone might misuse it (Craig, if I am paraphrasing incorrectly, please do correct me here.) And I completely understand Craig's misgivings, but then he would not address mine. His only solution to get around the special-case needs that *will* exist is to use eval(). which is a lot like saying: "So you are in pain? Why not try Meth?" Craig shared this talk from Google which inspired him: https://www.youtube.com/watch?v=ccfEu-Jj0as <https://www.youtube.com/watch?v=ccfEu-Jj0as> I would highly recommend watching this before voting on Craig's RFC. The speaker talks about how they used "CompileTimeConstants" to reduce security bugs, BUT: 1. They used a @CompileTimeConstant attribute and a static checker to guard against unsafe strings instead of a function baked into Java. So convention and process over language enforcement. https://errorprone.info/bugpattern/CompileTimeConstant <https://errorprone.info/bugpattern/CompileTimeConstant> https://github.com/google/error-prone/blob/master/annotations/src/main/java/com/google/errorprone/annotations/CompileTimeConstant.java <https://github.com/google/error-prone/blob/master/annotations/src/main/java/com/google/errorprone/annotations/CompileTimeConstant.java> 2. They developed APIs that allow packaging data to be used in place of literals. See qb.setParameter() at 9:45. 3. They did this for code in Google's mono-repo, so they have control over how people will be using it, and they have the ability to address special cases. If PHP adds is_literal() the cat will be out the bag and we won't be able to address special cases easily. 4. Even Craig mentioned to me that a literal cannot be trusted in all contexts. An HTML literal passed to SQL is not what SQL wants, and might in some rare case being able to bring down a DB. If so, how then to do we quickly fix PHP so that can't happen? Adding a false sense of security does not seem like a good idea. 5. Craig complained about escaping because it can't be used universally without understanding use-case. So maybe what we need instead are use-case specific tools that can sanitize use-case specific string content? SqlSanitizer, HtmlSanitizer, etc? And/or templating systems built into PHP that can handle escaping (I'm not necessarily suggesting these, but do think they should be considered.) 6. At ~35:45 the speaker states it is their responsibility as API vendors to sanitize and escape strings. He also said most data is user-supplied data and not literals. So he is saying you MUST allow for user-supplied data, but is_literal() does not allow for that. 6. They implement a whitelist of exceptions, because they were Google they based the whitelist on their internal projects. But PHP cannot do that with is_literal() because it does not address exceptions. 7. He said you need a "lightweight approval process for exceptions" at around ~40:00. Again, including is_literal() results in an extremely heavyweight process for user's exceptions. 8. He mentions "Design for Reviewability" around 44:00; if there were a make_literal() it would be very easy to find places that need to be reviewed for literal exceptions. So, IF we add is_trusted() we can plan to add other trusted data types later. If we add is_literal(), we *should* map out all the issues in advance and make sure we know that workable solutions at least exist. In addition we will need some way to create data that can pass as literal (but then it's not a literal, hence why "is_trusted" makes more sense.) Google did it with a method attribute. Maybe PHP should too? In summary, is_literal() is a hammer and people will us it to treat everything like a nail. At least is_trusted() has the future potential to be treated like a screwdriver. -Mike P.S. If I had my druthers is_trusted() would not pass either, and we'd hash out the full vision for security before we go implementing something that could paint us into a corner forever, or worse one that would create a security hole that will be hard to fix.