paul-rogers commented on PR #12636:
URL: https://github.com/apache/druid/pull/12636#issuecomment-1162659924

   @clintropolis, thanks for the history of parameters. I reread the links that 
you generously provided: very helpful. @gianm your advice to look into the JDBC 
(Avatica) code path was wise: doing so was insightful.
   
   It turns out that Druid's implementation of the JDBC protocol is a bit off, 
which may have led to some of the confusion around when to prepare and when to 
expect parameters. Issue #12682 spells out the gaps.
   
   This latest PR revises how we manage SQL processing in both HTTP and JDBC to 
try to get things a bit cleaner. A key challenge is that `SqlLifecycle` tries 
to implement two distinct processing flows, while strictly enforcing a set of 
states. This has caused the code to get rather complex. According to the 
comments, `SqlLifecycle` evolved from the native `QueryLifecycle` optimize for 
native queries. But SQL, for better or worse, works differently.
   
   SQL works around the idea of _statements_ and _cursors_. In JDBC, statements 
are of two kinds: `Statement` and `PreparedStatement`. The cursor is 
represented by a `ResultSet`. Each statement can have one or more result sets 
open, and each statement can give rise to any number of result sets (one after 
another.) In a prepared statement, each execution tends to take a set of 
parameter values.
   
   Then, on the HTTP side, we have a "direct execution" model: we get a SQL 
statement, context variables and parameters, and run through the planner 
sequence: parse, validate, authorize, plan and execute.
   
   To model this behavior, the latest commit refactors `SqlLifecycle` in to 
focused "statement" classes:
   
   * `DirectStatement`: Implements a one-pass, directly executed statement. 
Accepts parameters. Makes a single pass through the planner. Produces a 
`Sequence` (Druid's result set) as output.
   * `HttpStatement`: A direct statement for use in the HTTP endpoint.
   * `PreparedStatement`: The Druid implementation of the JDBC prepared 
statement. Produces a `PrepareResult` as output. Allows execution by creating a 
`DirectStatement` which runs the query, with parameters. Supports the JDBC 
prepared statement model.
   
   The resulting code is far simpler and easier to test. Check out the result 
to see if the refactoring makes sense, and resolves the issues we've been 
discussing.
   
   Then, the Avatica JDBC implementation is refactored to clearly model the 
JDBC concepts:
   
   * `DruidJdbcStatement` models a (non-prepared) `Statement`.
   * `DruidJdbcPreparedStatement` models a `PreparedStatement`.
   * `AbstractDruidJdbcStatement` is the common base class for the above.
   * `DruidJdbcResultSet` represents a result set (a running query).
   
   The JDBC classes are then implemented using the statement abstractions 
described above. A JDBC `ResultSet` is a `DirectStatement`. A JDBC 
`PreparedStatement` is a Druid `PreparedStatement`.
   
   The Avatica driver tests were modified to be consistent with the JDBC 
standard. The original code didn't pass, but the revised code does.
   
   The result is that we now have a clean SQL stack for JDBC, HTTP and internal 
use that make a single planner pass when possible, and use two passes for the 
JDBC prepare/execute model.
   
   Please take a look and see what you think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to