Hello internals,

For the past few weeks I have been refactoring the PDO extension to reduce the 
complexity of the implementation but also lower the memory requirement of the 
_pdo_stmt_t struct.
And in this process I rediscovered an oddity which prevents removing the usage 
of the FCI struct (which takes 64 bytes) from it.

The constructor arguments for the class are passed as a PHP array to 
PDOStatement::setFetchMode(), PDOStatement::fetchAll(), and 
PDOStatment::fetchObject().
However, this array of parameters does not have the same semantics as the $args 
parameter of call_user_func_array(). (CUFA for short.)
The reason for this is that it uses a rather unintuitive and obscure Zend API 
called zend_fcall_info_args_ex(),
this API sets the arguments of a call by copying each argument one after the 
other as a positional argument.
And if a zend_function pointer is provided, it will check if the function 
argument should be passed by reference and automatically wrap values into a 
reference without a warning.
This behaviour is at odds with passing a keyed array to CUFA, where the key 
will act as a named argument look-up, and with normal calling semantics where a 
by-ref argument being passed by value will trigger an E_WARNING.

Moreover, this API is effectively unused. A SourceGraph search [1] came up with 
9 results, where only 2 of them are relevant:
- The UOPZ extension
- The Zephir project, which allows generating a PHP C extension from PHPesque 
code
- All other 7 projects are extensions generated by using Zephir

I have submitted PRs to Zephir [2] and UOPZ [3] that have been merged that 
remove the usage of this API by instead passing the HashTable pointer of the 
array as the named_params field of the FCI.
Thus, the only remaining usage of this API I can find is now in PDO.
This conversion of using the named_params field of the FCI for calling the 
constructor can also be done for PDO,
allowing the removal of the FCI from _pdo_stmt_t, convert the ctor_arg zval 
(which takes 16 bytes) to a HashTable* (taking only 8 bytes), and remove 
various copies of the constructor arguments.
Doing this causes some minor BC breaks as this aligns the behaviour with CUFA 
which can be seen in the latest PR implementation. [4]

It is possible to emulate the previous behaviour, but this requires doing a 
deep copy of the constructor arguments, negating that benefit.

To me, these minor BC breaks are acceptable as:

- by-ref arguments in a constructor seems already rare, even more so in an 
object that represents a database row
- having an argument array behave differently than CUFA or what you would 
receive with variadic arugments is somewhat a language inconsistency.

Thoughts?

For reference, this is effectively the same point I raised in a previous 
internals thread, which didn't really get much attention. [5]

Best regards,

Gina P. Banyard

[1] 
https://sourcegraph.com/search?q=context:global+lang:C+zend_fcall_info_args_ex+-file:zend_API.h+-file:zend_API.c+-file:pdo_stmt.c&patternType=keyword&sm=0
[2] Zephir PR: https://github.com/zephir-lang/zephir/pull/2443
[3] UOPZ PR: https://github.com/krakjoe/uopz/pull/188
[4] https://github.com/php/php-src/pull/17427
[5] https://externals.io/message/118847

Reply via email to