Robert Beauchemin created SPARK-9078:
----------------------------------------
Summary: Use of non-standard LIMIT keyword in JDBC tableExists code
Key: SPARK-9078
URL: https://issues.apache.org/jira/browse/SPARK-9078
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.4.0, 1.3.1
Reporter: Robert Beauchemin
Priority: Minor
tableExists in
spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala uses
non-standard SQL (specifically, the LIMIT keyword) to determine whether a table
exists in a JDBC data source. This will cause an exception in the many JDBC
databases that don't support the LIMIT keyword. See
http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql
To check for table existence (or an exception), the probe could be recrafted
around "select 1 from $table where 0 = 1". This isn't equivalent (it returns an
empty result set rather than the value '1'), but it would support more data
sources and also work on empty tables. It's arguably ugly, and it may scan every
row on sources that don't support constant folding, but that's better than
failing outright on JDBC sources that don't support LIMIT.
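As a sketch of the suggestion above, the two probe queries differ only in the
trailing clause. The helper names below are hypothetical, not Spark's actual
code:

```scala
// Hypothetical helpers contrasting the current probe with the portable one.
// Neither name exists in Spark; they only illustrate the query shapes.
def limitProbe(table: String): String =
  s"SELECT 1 FROM $table LIMIT 1"       // fails on dialects without LIMIT

def portableProbe(table: String): String =
  s"SELECT 1 FROM $table WHERE 0 = 1"   // empty result set on any dialect
```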
Perhaps "supports LIMIT" could be a field in the JdbcDialect class that dialects
for databases supporting the keyword could override. The ANSI-standard
equivalent is (OFFSET and) FETCH.
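A minimal sketch of that dialect hook, assuming an overridable method rather
than a boolean field (the trait and method names here are assumptions, not
JdbcDialect's real API), could default to the ANSI row-limiting clause and let
per-database dialects override it:

```scala
// Assumed dialect hook; the real JdbcDialect API differs.
trait SqlDialect {
  // ANSI SQL:2008 row-limiting clause as the portable default
  def limitClause(n: Int): String = s"FETCH FIRST $n ROWS ONLY"
}

// A dialect for a database that does support the LIMIT keyword
object MySqlLikeDialect extends SqlDialect {
  override def limitClause(n: Int): String = s"LIMIT $n"
}
```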
The standard way to check for table existence would be to query
information_schema.tables, which is part of the SQL standard but won't work for
JDBC data sources that support SQL without implementing information_schema. The
JDBC DatabaseMetaData interface provides getSchemas(), which allows checking for
information_schema in drivers that support it.
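A sketch of that metadata check (the helper names are hypothetical), splitting
the pure matching logic from the JDBC call so the latter stays a thin wrapper
around DatabaseMetaData.getSchemas():

```scala
import java.sql.Connection
import scala.collection.mutable.ListBuffer

// Pure check: does any schema name match information_schema (case-insensitively)?
def containsInfoSchema(schemas: Seq[String]): Boolean =
  schemas.exists(_.equalsIgnoreCase("information_schema"))

// Thin JDBC wrapper, assuming the driver implements DatabaseMetaData.getSchemas().
def hasInformationSchema(conn: Connection): Boolean = {
  val rs = conn.getMetaData.getSchemas()
  try {
    val names = ListBuffer.empty[String]
    while (rs.next()) names += rs.getString("TABLE_SCHEM")
    containsInfoSchema(names.toSeq)
  } finally rs.close()
}
```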
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)