Hello all,

@Hélder Gregório <[email protected]> and I identified a gap between common database API execution patterns and Arrow Flight SQL prepared statements. To address this, we propose adding an optional boolean field to ActionCreatePreparedStatementResult.

*Background*
A common pattern in database APIs is:

1. Create a prepared statement
2. Execute the prepared statement, returning either a result set or an update count

This pattern exists in:

- *JDBC* (Connection.prepareStatement() + PreparedStatement.execute())
- *Python PEP 249* (both steps condensed in cursor.execute())
- *ODBC* (SQLPrepare() + SQLExecute())

In Arrow Flight SQL, there are two mutually exclusive communication paths for executing prepared statements. Both begin with ActionCreatePreparedStatementRequest, after which the client must choose between:

- CommandPreparedStatementQuery (returns a result set), or
- CommandPreparedStatementUpdate (returns an update count).

(For simplicity, we ignore parameter binding here.)

The issue is that ActionCreatePreparedStatementResult, returned by the server in the first call, does not contain information indicating which execution path the client should take.

*Proposal*
We propose adding the following field to ActionCreatePreparedStatementResult:

    optional bool is_update = 4;

- true → clients should use CommandPreparedStatementUpdate
- false → clients should use CommandPreparedStatementQuery

This makes the intended execution path explicit.

The behavior of clients when the server does not set this field is outside the scope of this proposal, though discussion is welcome. We would be happy to open follow-up PRs to standardize client behavior across drivers if desired.

*Current state of driver implementations*
- The Arrow Flight SQL JDBC driver uses a heuristic to choose the execution path: https://github.com/apache/arrow-java/issues/797
- The PEP 249 Python Flight SQL driver (in ADBC) always uses CommandPreparedStatementQuery in cursor.execute().

We believe making the execution path explicit improves protocol completeness and alignment with widely used database APIs.

Let us know your thoughts.

Best,
Pedro Matias
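
P.S. To illustrate the client-side dispatch that is_update would enable, here is a minimal Python sketch. It is not taken from any existing driver: PreparedStatementResult, ExecutionPath, and choose_path are hypothetical names, and the fallback argument simply stands in for whatever a driver does today when the field is not set, which the proposal leaves open.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionPath(Enum):
    """The two execution commands a Flight SQL client can send."""
    QUERY = "CommandPreparedStatementQuery"
    UPDATE = "CommandPreparedStatementUpdate"


@dataclass
class PreparedStatementResult:
    """Hypothetical stand-in for ActionCreatePreparedStatementResult.

    Only the fields needed for the illustration are modeled here.
    """
    prepared_statement_handle: bytes
    # Proposed field 4. None models "the server did not set the field".
    is_update: Optional[bool] = None


def choose_path(result: PreparedStatementResult,
                fallback: ExecutionPath) -> ExecutionPath:
    """Pick the execution command from the proposed is_update field.

    When the field is unset, defer to the caller-supplied fallback
    (an assumption of this sketch; the proposal does not define it).
    """
    if result.is_update is None:
        return fallback
    return ExecutionPath.UPDATE if result.is_update else ExecutionPath.QUERY
```

With this shape, a driver that currently always issues CommandPreparedStatementQuery could pass ExecutionPath.QUERY as the fallback and keep its existing behavior against servers that do not set the field.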
