NGA-TRAN commented on issue #4426: URL: https://github.com/apache/arrow-datafusion/issues/4426#issuecomment-1332545456
Thanks @alamb and @thinkharderdev for the feedback. Here is my proposal. I will update the main section above after we agree on this. We will use [Postgres's prepared statement syntax](https://www.postgresql.org/docs/current/sql-prepare.html#:~:text=A%20prepared%20statement%20is%20a,statement%20is%20planned%20and%20executed) Let us create a table for our examples ```SQL CREATE TABLE cpu ("user" INT, region VARCHAR, time TMESTAMP); -- insert some data SELECT * FROM cpu; user | region | time ------+--------+--------------------- 1 | east | 2022-11-15 00:00:00 2 | west | 2022-11-30 19:10:25 (2 rows) ``` # New Feature: PREPARE a select statement and EXECUTE it We will support PREPARE and EXECUTE statement that are compatible with Postgres as follows: ### Example 1: ```SQL -- Create a prepared plan PREPARE my_plan_1(TIMESTAMP) AS SELECT "user", region FROM cpu WHERE time = $1; -- Execute the plan EXECUTE my_plan_1('2022-11-30 19:10:25'); user | region ------+-------- 2 | west (1 row) ``` ### Example 2 ```SQL -- Create a prepared plan PREPARE my_plan_2(INT, INT) AS SELECT region, count(*) FROM cpu GROUP BY region HAVING count(*) >= $1 and count(*) < $2 order by 1; -- Execute the plan EXECUTE my_plan_2(1, 10); region | count --------+------- east | 1 west | 1 (2 rows) ``` # Design proposal ## Prepare Statement sqlparser parses and provides output for PREPARE](https://github.com/sqlparser-rs/sqlparser-rs/blob/fa6bd01b19fabc9087d3b306e7c92f12cb8d368d/src/ast/mod.rs#L1444) and [EXECUTE](https://github.com/sqlparser-rs/sqlparser-rs/blob/fa6bd01b19fabc9087d3b306e7c92f12cb8d368d/src/ast/mod.rs#L1440) as follows ```rust Prepare { name: Ident, data_types: Vec<DataType>, statement: Box<Statement>, }, Execute { name: Ident, parameters: Vec<Expr> }, ``` We will generate the corresponding struct when we generate the statement logical plan ```rust PreparedPlan { name: Indent, // name provided with the PREPARE params: Vec<DataType>, // list of data types provided with the PREPARE plan: LogicalPlan, // logical plan that understands Params defined below } struct Param { type: DataType, idx: usize // index to the vector `params` } ``` Before showing examples of PreparedPlan, let us look into today logical plans. ```SQL -- Example 1: SELECT “user”, region FROM cpu WHERE time > to_timestamp('1970-01-01T00:00:00.000000120+00:00'); Logical Plan: Projection: cpu.user, cpu.region Filter: cpu.time > totimestamp(Utf8("1970-01-01T00:00:00.000000120+00:00")) TableScan: cpu -- Example 2: SELECT region, count(*) FROM cpu GROUP BY region HAVING count(*) >= 1 and count(*) < 10 order by 1; logical plan: Sort: cpu.region ASC NULLS LAST Projection: cpu.region, COUNT(UInt8(1)) Filter: COUNT(UInt8(1)) >= Int64(1) AND COUNT(UInt8(1)) < Int64(10) Aggregate: groupBy=[[cpu.region]], aggr=[[COUNT(UInt8(1))]] TableScan: cpu ``` Let see the full output of a prepared statement: ### Example 1: ```SQL PREPARE my_plan_1(TIMESTAMP) AS SELECT "user", region FROM cpu WHERE time = $1; ``` The output will look like ```rust my_plan_1 = PreparedPlan { name: "my_plan_1", params: [Timestamp] // vector of 1 plan: Logical Plan Projection: cpu.user, cpu.region Filter: cpu.time > Param { type: Timestamp, idx: 0} // idx points to the first item of the params TableScan: cpu } ``` ### Example 2: ```SQL PREPARE my_plan_2(INT, INT) AS SELECT region, count(*) FROM cpu GROUP BY region HAVING count(*) >= $1 and count(*) < $2 order by 1; ``` The output will look like ```rust my_plan_2 = PreparedPlan { name: "my_plan_2", params: [Int64, Int64], plan: Logical plan Sort: cpu.region ASC NULLS LAST Projection: cpu.region, COUNT(UInt8(1)) Filter: COUNT(UInt8(1)) >= Param { type: Int64, idx: 0 } // points to first item of the params AND COUNT(UInt8(1)) < Param { type: Int64, idx: 1 } // points to second items of the params Aggregate: groupBy=[[cpu.region]], aggr=[[COUNT(UInt8(1))]] TableScan: cpu } ``` **Question: How do we store all the generated prepared plans? In a hashMap?** ## Execute Plan sqlparser parses and provides output for [EXECUTE](https://github.com/sqlparser-rs/sqlparser-rs/blob/fa6bd01b19fabc9087d3b306e7c92f12cb8d368d/src/ast/mod.rs#L1440) as follows ```rust Execute { name: Ident, parameters: Vec<Expr> }, ``` We will search the the plan and replace Params in the Logical plan with the provided parameters. Then generate physical plan for it and execute. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
