Hi Craig,

> On Jun 12, 2021, at 1:00 PM, Craig Francis <cr...@craigfrancis.co.uk> wrote:
> 
> Hi Internals,
> 
> I'd like to start the discussion on the is_literal() RFC:
> 
> https://wiki.php.net/rfc/is_literal

Nice!  There is an awful lot to like here.

And few bits of concern.

What's to like?
-------------------

1. Adding a proactive method for guarding against injection vulnerabilities 
would be a huge step forward for PHP. This should not be underemphasized.

2. That you added the section "Previous Work" and covered what other languages 
are doing in this regard.

3. The "Thanks" section and all the external references and prior comments that 
provides insight into how this RFC was developed.

4. And I love that you publicly stated these points in an RFC:

        a. Why not use static analysis? It will never be used by most 
developers.
        b. Why not educate everyone? You can't - developer training simply does 
not scale, and mistakes still happen.


What's of concern? 
-------------------------
(Note: the last point could address concerns #2 and #4.)

1. Bike-shedding a bit, but I find a bit of cognitive dissonance over the 
semantic incorrectness of is_literal(), similar to Jakob Givoni's concern. 

I also fear it will be confusing for developers who know what a literal is, and 
thus I agree with Jakob. 

Minimally I'd ask for a separate vote on is_literal() vs. is_from_literal(). 

2. Regarding the sections "Non-Parameterised Values" and "Future Scope," if I 
understand correctly, your vision is to ultimately require all SQL and other 
vulnerable literals to be written as literals in the source code for them to be 
used or otherwise disallowed by future library code and/or PHP internal 
functionality?

If I did not misunderstand I believe you would disallow an important class of 
functionality from being implemented in PHP, including several existing and 
widely used applications such as but not limited to PhpMyAdmin and Adminer that 
open and interact with arbitrary SQL databases.  It would also disallow using 
any data returned by APIs where the PHP application cannot know in advance what 
data will be acquired, even if that data has been properly escaped and/or 
sanitized.

3. I notice your RFC does not grant sprintf() the ability to return a string 
with the "literal" flag even though sprintf() can frequently return strings 
composed from known literals and safe numeric value. And I have frequently see 
it used for this purpose, e.g.:

$fields = ['id','name'];
$table = 'my_table';

$sql = sprintf( 'SELECT %s FROM %s WHERE ID=%d', 
        implode( ',', $fields ),        
        $my_table,
        intval($_GET['id']));

Of course your section "WHERE IN" "addresses" this, but please see #4 below. 

I argue sprintf() should be able to pass thru literal flags and validate 
integer values to determine that the result is literal.  Am I missing something 
here?

4. Regarding the section "WHERE IN" you argue that we should push everyone to 
use parameters properly — which is a great idea in theory — but that would have 
a significant breakage in existing sites online.  While many here in Internals 
seem to look down on WordPress if does power over 40%[1] of the web at this 
point and this would break most if not all those sites if the site owners or 
their hosts were to upgrade to a version of PHP that was militant about 
requiring is_literal() for mysqli_query()[2].  

And yes, WordPress could upgrade their core code[3] to address this, but that 
would not also fix the 60k plugins and 10k themes on WordPress.org 
<http://wordpress.org/>, or likely the million+ combined custom plugins and 
themes in use on the web. 

Is it really reasonable to change PHP to just break of all those sites so that 
we can force all the site owners to beg and plead with their plugin and theme 
developers to better protect them from injection errors?  Seems that medicine 
might be worse than the cure, at least for all those site owners. 

5. Given #2 and #4, I think your section "Support Functions" is missing what 
would allow WordPress, PhpMyAdmin, Adminer and all other PHP applications to 
address the concerns in #2 and #4 and it is neither literal_concat() nor 
literal_implode(). 

The function we would need is the one Lauri Kenttä wrote (in jest?), but we'd 
need one that is performant, not a userland version. Name it as_literal() or 
make_literal() or even as_unsafe_literal() — whichever chosen is unimportant 
IMO — but applications like the ones I mentioned need to be able to say "I 
absolutely know what I am explicitly doing so do not throw a warning or an 
error telling me my value is `not a literal`, dammit!"  

Certainly most people would not use such a method by accident, especially if 
called something like `as_unsafe_literal()`.  There might even be ways to 
validate that this was not used by a copy-and-paste artist, but I will leave 
that validation to other people who are smarter than me.

6. Regarding the section "Reflection API," shouldn't their be an 
is_literal()/is_from_literal() method added to the ReflectionParameter() and 
ReflectionProperty() classes?

So assuming you address at least #5 — and thus #2 and #4 indirectly — I would 
really like to see this RFC pass.

-Mike


[1] https://w3techs.com/blog/entry/40_percent_of_the_web_uses_wordpress 
<https://w3techs.com/blog/entry/40_percent_of_the_web_uses_wordpress> 
[2] 
https://www.php.net/manual/en/mysqli.query.php#refsect1-mysqli.query-parameters 
<https://www.php.net/manual/en/mysqli.query.php#refsect1-mysqli.query-parameters>
 
[3] 
https://github.com/WordPress/WordPress/blob/master/wp-includes/wp-db.php#L2056 
<https://github.com/WordPress/WordPress/blob/master/wp-includes/wp-db.php#L2056>





Reply via email to