[
https://issues.apache.org/jira/browse/SPARK-32014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798563#comment-17798563
]
Sravan Kumar Vadaga commented on SPARK-32014:
---------------------------------------------
Can this be prioritized?
> Support calling stored procedure on JDBC data source
> ----------------------------------------------------
>
> Key: SPARK-32014
> URL: https://issues.apache.org/jira/browse/SPARK-32014
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yoshi Matsuzaki
> Priority: Major
>
> Currently, every query issued through the JDBC data source is wrapped in an
> outer SELECT, as described in the documentation:
> [https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html]
> {quote}
> A query that will be used to read data into Spark. The specified query will
> be parenthesized and used as a subquery in the FROM clause. Spark will also
> assign an alias to the subquery clause. As an example, spark will issue a
> query of the following form to the JDBC Source.
> SELECT <columns> FROM (<user_specified_query>) spark_gen_alias
> {quote}
> Because of this behavior, we cannot call a stored procedure in most major
> databases: CALL syntax is usually not allowed inside a subquery, since a
> procedure's return value is optional.
> For example, the Scala code below, which runs a query against Snowflake as a
> JDBC data source, raises a syntax error. The query "call proc()" is rewritten
> to "select * from (call proc()) where 1 = 0", which is invalid because CALL
> cannot appear in the middle of a query.
> {code:scala}
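> import org.apache.spark.sql.DataFrame
>
> // `options` holds the Snowflake connection settings (defined elsewhere).
> val df: DataFrame = spark.read
>   .format("snowflake")
>   .options(options)
>   .option("query", "call proc()") // gets wrapped in a subquery, so it fails
>   .load()
> display(df) // notebook helper (e.g. Databricks)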
> {code}
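> To make the rewriting concrete, here is a minimal sketch of the wrapping the
> documentation describes. The object and method names are hypothetical and
> this is not Spark's actual code; it only reproduces the documented query
> shape as a string.
> {code:scala}
> object SubqueryWrapping {
>   // Hypothetical helper: mimics the documented form
>   // SELECT <columns> FROM (<user_specified_query>) spark_gen_alias
>   def wrap(userQuery: String, columns: String = "*"): String =
>     s"SELECT $columns FROM ($userQuery) spark_gen_alias"
> }
>
> // An ordinary SELECT stays valid SQL once wrapped:
> val ok = SubqueryWrapping.wrap("select id from t")
> // A CALL ends up inside a FROM clause, which most databases reject:
> val bad = SubqueryWrapping.wrap("call proc()")
> {code}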
> I tested this with Snowflake, but it should happen in any major database
> system.
> I understand that the JDBC data source exists to read and write data through
> DataFrames, so its interfaces cover only reading and writing. Sometimes,
> however, we need to run a statement before or after a read or write, for
> example to preprocess the data with a stored procedure.
> I would appreciate it if you could consider implementing an interface or
> some other way to call a stored procedure.
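> Until such an interface exists, one possible workaround is to call the
> procedure from the driver over a plain java.sql connection and then read
> the results as usual. The following is only a sketch under assumptions: the
> object name, the JDBC URL, the connection properties, and the procedure are
> placeholders, and the database's JDBC driver must be on the driver
> classpath.
> {code:scala}
> import java.sql.{Connection, DriverManager}
> import java.util.Properties
>
> object StoredProcWorkaround {
>   // Runs a single CALL statement outside Spark's JDBC data source, so no
>   // outer SELECT is wrapped around it. All arguments are caller-supplied.
>   def callProcedure(jdbcUrl: String, props: Properties, callSql: String): Unit = {
>     val conn: Connection = DriverManager.getConnection(jdbcUrl, props)
>     try {
>       val stmt = conn.prepareCall(callSql)
>       try stmt.execute()
>       finally stmt.close()
>     } finally conn.close()
>   }
> }
> {code}
> A caller would run StoredProcWorkaround.callProcedure(url, props, "call proc()")
> before spark.read, then load the table that the procedure populated. This
> runs only on the driver and is not part of the DataFrame plan, so it does
> not replace the feature requested here.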