[ 
https://issues.apache.org/jira/browse/SPARK-32014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798563#comment-17798563
 ] 

Sravan Kumar Vadaga commented on SPARK-32014:
---------------------------------------------

Can this be prioritized? 

> Support calling stored procedure on JDBC data source
> ----------------------------------------------------
>
>                 Key: SPARK-32014
>                 URL: https://issues.apache.org/jira/browse/SPARK-32014
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Yoshi Matsuzaki
>            Priority: Major
>
> Currently, every query issued through the JDBC data source is wrapped in an
> outer SELECT, as described in the documentation:
> [https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html]
> {quote}
> A query that will be used to read data into Spark. The specified query will 
> be parenthesized and used as a subquery in the FROM clause. Spark will also 
> assign an alias to the subquery clause. As an example, spark will issue a 
> query of the following form to the JDBC Source.
> SELECT <columns> FROM (<user_specified_query>) spark_gen_alias
> {quote}
> Because of this behavior, we cannot call a stored procedure in major
> databases: a CALL statement is usually not allowed inside a subquery,
> because its return value is optional.
> For example, the Scala code below, which runs a query against Snowflake as a
> JDBC data source, raises a syntax error. The query "call proc()" is rewritten
> to "select * from (call proc()) where 1 = 0", which is invalid because CALL
> cannot appear in the middle of a query.
> {code:scala}
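> import org.apache.spark.sql.DataFrame
>
> // `options` holds the Snowflake connection settings, defined elsewhere.
> val df: DataFrame = spark.read
>   .format("snowflake")
>   .options(options)
>   .option("query", "call proc()") // wrapped in a subquery, so CALL fails
>   .load()
> display(df) // notebook helper, e.g. on Databricks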
> {code}
> I tested this with Snowflake, but it should happen in any major database
> system.
> I understand that the JDBC data source exists to read and write data through
> DataFrames, so its interfaces cover only reading and writing. Sometimes,
> though, we just need to execute a statement before or after a read or write,
> for example to preprocess the data with a stored procedure.
> I would appreciate it if you could consider implementing an interface or
> other way to call a stored procedure.
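
Until such an interface exists, one possible workaround is to run the CALL
statement over a plain JDBC connection on the driver, outside the Spark data
source, and then read the result with spark.read as usual. The following is
only a sketch under assumptions: the object and method names
(StoredProcedureCall, callProcedure) are hypothetical, the JDBC driver for the
target database is assumed to be on the driver's classpath, and whether the
database accepts the given call syntax over JDBC depends on the vendor.

{code:scala}
import java.sql.{Connection, DriverManager}
import java.util.Properties

// Hypothetical helper, not part of Spark: executes a stored procedure call
// directly over JDBC so that Spark never wraps it in a subquery.
object StoredProcedureCall {
  def callProcedure(url: String, props: Properties, callSql: String): Unit = {
    // Throws java.sql.SQLException if no registered driver accepts the URL.
    val conn: Connection = DriverManager.getConnection(url, props)
    try {
      val stmt = conn.prepareCall(callSql)
      try {
        // Any result set returned by the procedure is ignored here.
        stmt.execute()
        ()
      } finally {
        stmt.close()
      }
    } finally {
      conn.close()
    }
  }
}
{code}

Calling StoredProcedureCall.callProcedure(url, props, "call proc()") before
spark.read would run the preprocessing step once on the driver. It does not
turn the procedure's return value into a DataFrame, so it covers only the
"execute before or after reading/writing" case from the description.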



--
This message was sent by Atlassian Jira
(v8.20.10#820010)