akhalymon-cv opened a new pull request #34709: URL: https://github.com/apache/spark/pull/34709
### What changes were proposed in this pull request?

This PR adds a boolean option 'useRawQuery' that skips wrapping the query in a 'SELECT' statement when fetching the schema, since that wrapping breaks CTE queries with the MSSQL JDBC driver. When set to 'true', the user must also provide the result schema via the 'customSchema' option. The obvious downside of this approach is that the user has to determine the schema beforehand when running the query. The advantage is that the query is not run twice just to get the schema, and users can run the query without modification, unlike the solution described in https://github.com/apache/spark/pull/34693.

### Why are the changes needed?

These changes are needed to support CTE queries when using the MSSQL JDBC driver.

### Does this PR introduce _any_ user-facing change?

Yes, it adds the option 'useRawQuery' to the JDBC options. Example:
```
JdbcMsSqlDF = (
    spark.read.format("jdbc")
    .option("url", f"jdbc:sqlserver://{server}:{port};databaseName={database};")
    .option("user", user)
    .option("password", password)
    .option("useRawQuery", "true")  # <----------- Do not wrap the query
    .option("customSchema", schema)
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .option("query", query)
    .load()
)
```

### How was this patch tested?

The patch was tested manually; unit tests are pending, as this is still a WIP.
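
For illustration, here is a minimal sketch (with hypothetical table and column names) of the kind of CTE query this option targets, together with a matching 'customSchema' value in Spark SQL DDL column syntax. These values would be passed as the `query` and `schema` variables in the snippet above:
```
# Hypothetical example: a CTE query that fails when Spark wraps it in a
# subquery (SELECT ... FROM (<query>) ...) to infer the schema, because
# T-SQL does not allow a WITH clause inside a derived table.
query = """
WITH recent_orders AS (
    SELECT order_id, customer_id, amount
    FROM dbo.orders
    WHERE order_date >= '2021-01-01'
)
SELECT customer_id, SUM(amount) AS total_amount
FROM recent_orders
GROUP BY customer_id
"""

# With useRawQuery=true the schema is not inferred, so it must be supplied
# explicitly through customSchema (comma-separated column definitions).
schema = "customer_id INT, total_amount DECIMAL(18,2)"
```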
