Yoshi Matsuzaki created SPARK-32014:
---------------------------------------
             Summary: Support calling stored procedure on JDBC data source
                 Key: SPARK-32014
                 URL: https://issues.apache.org/jira/browse/SPARK-32014
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Yoshi Matsuzaki


Currently, every query sent through the JDBC data source is wrapped in an outer SELECT, as described in [https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html]:

{quote}
A query that will be used to read data into Spark. The specified query will be parenthesized and used as a subquery in the FROM clause. Spark will also assign an alias to the subquery clause. As an example, spark will issue a query of the following form to the JDBC Source.

SELECT <columns> FROM (<user_specified_query>) spark_gen_alias
{quote}

Because of this behavior, we cannot call a stored procedure in most major databases: stored procedure call syntax is usually not allowed inside a subquery, since the procedure's return value is optional.

For example, the following Scala code, which runs a query against Snowflake as a JDBC data source, raises a syntax error. The query "call proc()" is rewritten to "select * from (call proc()) where 1 = 0", which is invalid because CALL cannot appear in the middle of a query.

{code:scala}
val df: DataFrame = spark.read
  .format("snowflake")
  .options(options)
  .option("query", "call proc()")
  .load()
display(df)
{code}

I tested this with Snowflake, but it should happen with any major database system.

I understand that the JDBC data source is meant to read and write data through DataFrames, so the implemented interfaces only cover reading and writing. However, sometimes we need to execute a query before or after reading/writing, for example to preprocess the data with a stored procedure.

I would appreciate it if you could consider implementing some interface or mechanism that allows us to call a stored procedure.
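As a workaround for now, the procedure can be invoked over a plain java.sql connection before reading through Spark, since a top-level CALL is never subject to the subquery rewrite. A minimal sketch, assuming the Snowflake JDBC driver is on the classpath; the jdbcUrl value, the credentials, and result_table are placeholders, not real names:

{code:scala}
import java.sql.DriverManager

// Hypothetical connection details; substitute the real JDBC URL and credentials.
val jdbcUrl = "jdbc:snowflake://<account>.snowflakecomputing.com/"
val props = new java.util.Properties()
props.setProperty("user", "<user>")
props.setProperty("password", "<password>")

val conn = DriverManager.getConnection(jdbcUrl, props)
try {
  // CALL runs as a top-level statement here, so it is never wrapped
  // in Spark's SELECT ... FROM (...) spark_gen_alias subquery.
  val stmt = conn.prepareCall("{call proc()}")
  stmt.execute()
  stmt.close()
} finally {
  conn.close()
}

// The preprocessed data can then be read through the JDBC data source as usual
// ("result_table" stands for whatever table the procedure populates).
val df = spark.read
  .format("snowflake")
  .options(options)
  .option("query", "select * from result_table")
  .load()
{code}

This keeps the procedure call outside Spark entirely, but it also means the call is not tracked by Spark, which is why a proper interface on the data source would still be valuable.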