Re: [PHP-DEV] [RFC] is_trusted - was is_literal

Mike Schinkel Tue, 22 Jun 2021 02:10:05 -0700

> On Jun 21, 2021, at 7:18 PM, Benjamin Morel <benjamin.mo...@gmail.com> wrote:
> 
> On Tue, 22 Jun 2021 at 01:06, Derick Rethans <der...@php.net> wrote:
> 
>> On 21 June 2021 23:37:56 BST, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
>>> 
>>> The name "is_trusted" is misleading.
>>> Literal is nothing but literal.
>> 
>> I agree with this. The name is_trusted is going to be the same naming
>> mistake as "safe mode" was. Developers will put their trust in it that it
>> is 100% guaranteed safe.
> 
> 
> FWIW, agreed, too. Trusted is vague and may imply some false sense of
> security. Literal is literally what it says on the tin.

I proposed the name change originally.

For my inspiration take a look at Trusted Types API in Javascript:

https://developer.mozilla.org/en-US/docs/Web/API/Trusted_Types_API
<https://developer.mozilla.org/en-US/docs/Web/API/Trusted_Types_API>

And also Trusted Types API from the W3C

https://w3c.github.io/webappsec-trusted-types/dist/spec/
<https://w3c.github.io/webappsec-trusted-types/dist/spec/>

The reason is_trusted() would be better than is_literal() is so that the
function could in the future handle other trusted types that are definitely not
literal and allow most code that using is_trusted() to continue to work.

So, for example, in the future we could add a new TrustedInterface and if an
object implemented the trusted interface and a __ToString() method it could be
considered trusted by code that uses the is_trusted() to guard against
untrusted code.

Alternately if in the future PHP added a TrustedAttribute we could use on the
__ToString() method to declare an object that could be used when a trusted
string is required.

-----

Conversely, I am *strongly* opposed to is_literal() as proposed. The current
RFC proposes half a solution without first envisioning what the other half of
the solution will look like.

If an is_literal() RFC were passed I would envision the following things
happening:

1. Bloggers would start blogging about how all PHP users should use
is_literal() when accepting SQL and HTML code, but won't provide any of the
nuances about special-cases because many won't understand them, they will just
see a new PHP feature to hype.

2. PHP listing tools will jump on is_literal() and champion its use. Just look
at Psalm.

3. Coding standards teams in large organizations will see that the listing
tools are flagging non-literals and wil start requiring all SQL and HTML etc.
code to be literals. And since they already bar the use of eval() there will be
no way to work around the requirement when you actually know what you are
doing.

And because they are in large organizations, they will be divorced from the
developers in the trenches who 95% of the time will have no problem with the
requirement, but 5% of the time will simply not be able to achieve their
development needs. Unfortunately these trench developers will not have enough
power in the organization to get the coding standards teams to even list to
their issues, especially if they are an agency or contractor.

4. Library developers will similarly adopt is_literal() as a requirement to use
their library. While the most popular ones will likely be enlightened enough
they will address edge cases. many in the long tail will not. So lots of the
long-tail PHP libraries out there — and especially the ones only needed by a
small audience so there won't be alternatives — will not address exceptions.
And for any of many reasons, many of those library developers will not be
responsive to those who submit PRs to provide workarounds.

THIS is the future I do not want to see, and if is_trusted() is renamed back to
is_literal() this is the future I predict. Maybe it won't happen, but I
personally do not want to risk it.

The biggest problem with the RFC is it does not address how to bypass someone
else's requirement for use of is_literal() when you *know* what you are doing
and you absolutely need to. Craig and I have had several long discussions
about this, and he said he did not want to allow that because someone might
misuse it (Craig, if I am paraphrasing incorrectly, please do correct me here.)

And I completely understand Craig's misgivings, but then he would not address
mine. His only solution to get around the special-case needs that *will* exist
is to use eval(). which is a lot like saying: "So you are in pain? Why not try
Meth?"

Craig shared this talk from Google which inspired him:

https://www.youtube.com/watch?v=ccfEu-Jj0as
<https://www.youtube.com/watch?v=ccfEu-Jj0as>

I would highly recommend watching this before voting on Craig's RFC.

The speaker talks about how they used "CompileTimeConstants" to reduce security
bugs, BUT:

1. They used a @CompileTimeConstant attribute and a static checker to guard
against unsafe strings instead of a function baked into Java. So convention and
process over language enforcement.

https://errorprone.info/bugpattern/CompileTimeConstant
<https://errorprone.info/bugpattern/CompileTimeConstant>
https://github.com/google/error-prone/blob/master/annotations/src/main/java/com/google/errorprone/annotations/CompileTimeConstant.java

<https://github.com/google/error-prone/blob/master/annotations/src/main/java/com/google/errorprone/annotations/CompileTimeConstant.java>

2. They developed APIs that allow packaging data to be used in place of
literals. See qb.setParameter() at 9:45.

3. They did this for code in Google's mono-repo, so they have control over how
people will be using it, and they have the ability to address special cases. If
PHP adds is_literal() the cat will be out the bag and we won't be able to
address special cases easily.

4. Even Craig mentioned to me that a literal cannot be trusted in all contexts.
An HTML literal passed to SQL is not what SQL wants, and might in some rare
case being able to bring down a DB. If so, how then to do we quickly fix PHP so
that can't happen? Adding a false sense of security does not seem like a good
idea.

5. Craig complained about escaping because it can't be used universally without
understanding use-case. So maybe what we need instead are use-case specific
tools that can sanitize use-case specific string content? SqlSanitizer,
HtmlSanitizer, etc? And/or templating systems built into PHP that can handle
escaping (I'm not necessarily suggesting these, but do think they should be
considered.)

6. At ~35:45 the speaker states it is their responsibility as API vendors to
sanitize and escape strings. He also said most data is user-supplied data and
not literals. So he is saying you MUST allow for user-supplied data, but
is_literal() does not allow for that.

6. They implement a whitelist of exceptions, because they were Google they
based the whitelist on their internal projects. But PHP cannot do that with
is_literal() because it does not address exceptions.

7. He said you need a "lightweight approval process for exceptions" at around
~40:00. Again, including is_literal() results in an extremely heavyweight
process for user's exceptions.

8. He mentions "Design for Reviewability" around 44:00; if there were a
make_literal() it would be very easy to find places that need to be reviewed
for literal exceptions.

So, IF we add is_trusted() we can plan to add other trusted data types later.

If we add is_literal(), we *should* map out all the issues in advance and make
sure we know that workable solutions at least exist. In addition we will need
some way to create data that can pass as literal (but then it's not a literal,
hence why "is_trusted" makes more sense.) Google did it with a method
attribute. Maybe PHP should too?

In summary, is_literal() is a hammer and people will us it to treat everything
like a nail. At least is_trusted() has the future potential to be treated like
a screwdriver.

-Mike
P.S. If I had my druthers is_trusted() would not pass either, and we'd hash out
the full vision for security before we go implementing something that could
paint us into a corner forever, or worse one that would create a security hole
that will be hard to fix.

Re: [PHP-DEV] [RFC] is_trusted - was is_literal

Reply via email to