Re: [PHP-DEV] [RFC][Discussion] Add MariaDB-specific features to mysqlnd and mysqli

Kamil Tekiela Tue, 14 Apr 2026 08:14:01 -0700

Hi Georg,

Thank you for addressing my questions.


Regarding the progress indicator: it looks to me like it will be very
difficult to implement properly in PHP, but I might be completely
misunderstanding the design. If you think a PoC is possible, could you
please prepare one and maybe then we can come back to this discussion.

----------------------------

>There is a fundamental difference between what a server might do (Version) and 
>what it can do (Capabilities). A version string is a static label. Capability 
>flags, however, reflect the server’s current build-time features and runtime 
>configuration.
>
>Exposing ext_server_capabilities allows developers and frameworks to implement 
>reliable feature detection. Without these flags, a developer cannot safely use 
>functions without risking a fatal error or needing to wrap every call in a 
>try-catch block.

This is exactly what exceptions and errors are for. The PHP user
SHOULD NOT be concerned with what capabilities the server offers and
which are compatible with PHP. Even if it could be potentially useful
to PHP developers, we should not expose this information as part of
mysqli API. Either mysqli supports the feature or it doesn't; there
should be no maybe.

----------------------------

>The performance improvement is not derived from the Generator itself, but from 
>the reduction of network round-trips.
>
>- standard foreach + execute(): 1,000 rows = 1,000 "Send" packets + 1,000 
>"Response" packets.
>- execute_many(): The driver pulls rows from the iterable and serializes them 
>into a large internal network buffer. 1,000 rows = 1 large Send packet + 1 
>Response packet.
>
>Regarding iterator_to_array($generator): This forces PHP to allocate memory 
>for every single row simultaneously. If you are importing 1 million rows, PHP 
>must allocate memory for 1 million arrays before the first byte is ever sent. 
>By using the Generator directly, we maintain O(1) memory usage, fetching and 
>serializing only one row at a time.

If mysqlnd is going to serialize it all into a large internal buffer
then it still needs to read all the data from the iterator before
making the query? It's still going to use the same amount of memory.
How is it going to maintain constant memory usage?

A generator might be a neat trick for users to prepare data on the go,
but it doesn't reduce the memory usage if the consumer needs to use
all of the data in one go. As I understand, execute_many will send it
all in one batch, so the memory footprint stays the same.

----------------------------

Re execute_many parameters:

Despite your convincing arguments for better network utilization by
providing the types, I still think we should not offer the possibility
of specifying the types. I don't know what other PHP developers on
this mailing list think about it, but for me the type feature goes
against the nature of PHP. Making the parameter optional is very good
choice and eases my concerns slightly, so if I am outnumbered in my
opinion, I won't be upset.

The number of mysqli users grows increasingly smaller. Out of this,
the number of people who will use execute_many and who will need to
optimize for TINYINT is unbelievably tiny. Any string easily
overshadows the numerical data. Thus, this feature won't see much
legitimate use.

> Because the MariaDB bulk protocol requires type declarations in the packet 
> header before the data rows are sent, the driver cannot "autodetect" binary 
> widths from a Generator without reading the entire stream into memory first. 
> Providing the $types string acts as a contract that allows for true, 
> constant-memory streaming.

Isn't that what it does anyway? You need to read all the data
(serialize) before you make the EXECUTE command, correct? I don't
understand why you can't prepare the type specification automatically
while serializing the data.

Regarding the control parameter:

Why not make it a callback? Provide the $row as an argument and let
the user modify it inside the callback, substituting values for
mysqli_indicator, and returning the row to be inserted. It would offer
a lot more flexibility to the user and would make the implementation
simpler. This way, you don't need to implement Null or None anymore.

----------------------------

>>Please be explicit what the failure conditions may be.
>
>Like for all other functions you have to retrieve error messages via 
>mysqli_stmt::error/errno.

I was asking you to list in the RFC all the possible client errors
that are added as part of this implementation. For example, "Row %lu
is not an array". This should be part of the RFC, in my opinion, as we
may want to discuss the error conditions and messages too.

----------------------------

Result sets:

> As mentioned in the beginning of the RFC, execute_many is used for UPSERT 
> (insert, replace,delete and update) operations. However it might contain a 
> result set, e.g. by having a RETURNING clause.

It can also be used with SELECT. The SQL is irrelevant.

>From what I can see, it is possible to request that MariaDB reply with
individual result sets. Wouldn't that be better? How would that impact
the performance?
>
>If you require the IDs of every row in a bulk insert, the MariaDB RETURNING 
>clause should be used:

That's not what I meant. I was asking whether it could be implemented
with MARIADB_CLIENT_BULK_UNIT_RESULTS instead. When users execute a
SELECT with 2 data rows, I would like to see it return 2 mysqli_result
objects. Same with INSERT statements, it should return a separate
result for each insertion. If there are arguments against that, they
should be explained in the RFC.

Anyway, the RFC should clearly explain how result fetching works with
all 3 methods (unbuffered, stored, and get_result) and what are the
possible gotchas.

----------------------------

> For transactional engines like InnoDB, atomicity is guaranteed. If a protocol 
> error occurs or a constraint is hit mid-batch, the server handles the 
> rollback, ensuring the database remains in a consistent state.

> Transaction Safety & Atomicity: In native MariaDB bulk mode, the entire batch 
> is sent as a single unit. In the fallback emulation, rows are executed one by 
> one. For non-transactional engines, a failure on row 500 would leave the 
> first 499 rows committed. To maintain consistency, we should recommend that 
> users wrap execute_many() in an explicit transaction when portability across 
> MySQL and MariaDB is required.

I am confused. Aren't both of these statements stating the same? Why
can't you wrap the fallback in an automatic transaction to make it
work exactly the same as the native MariaDB solution?

If execute_many implies an automatic transaction but only with
transactional engines, it should be clearly stated in the RFC so that
it can be later documented in PHP manual too.

----------------------------

> Limited Indicator Support: Since the standard MySQL COM_STMT_EXECUTE protocol 
> does not understand MariaDB-specific indicators, the fallback will only 
> support mysqli_indicator::Null (translated to a standard SQL NULL) and 
> mysqli_indicator::None. Indicators like DEFAULT and IGNORE are technically 
> impossible to implement in the fallback without complex SQL string 
> manipulation/rewriting, which would introduce unacceptable CPU overhead.

And for this reason, I think that maybe we shouldn't even implement
the control parameter at all. It sounds like a neat feature, but it
costs performance in a function that is all about improving
performance, and it is DB-version specific. The new execute_many
function doesn't need the control parameter to function properly, and
in my opinion, it would be better to keep it as simple as possible.
But I am curious to see what others think.

Regards,
Kamil

Re: [PHP-DEV] [RFC][Discussion] Add MariaDB-specific features to mysqlnd and mysqli

Reply via email to