Robert Beauchemin created SPARK-9078:
----------------------------------------

             Summary: Use of non-standard LIMIT keyword in JDBC tableExists code
                 Key: SPARK-9078
                 URL: https://issues.apache.org/jira/browse/SPARK-9078
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.4.0, 1.3.1
            Reporter: Robert Beauchemin
            Priority: Minor


tableExists in 
spark/sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcUtils.scala uses 
non-standard SQL (specifically, the LIMIT keyword) to determine whether a table 
exists in a JDBC data source. This will cause an exception on the many JDBC 
databases that don't support the LIMIT keyword. See 
http://stackoverflow.com/questions/1528604/how-universal-is-the-limit-statement-in-sql

To check for table existence (or an exception), the query could be recrafted 
around "select 1 from $table where 0 = 1". This isn't quite the same (it 
returns an empty result set rather than the value '1'), but it would support 
more data sources and also work on empty tables. It's arguably ugly, and it may 
scan every row on sources that don't support constant folding, but that's 
better than failing outright on JDBC sources that don't support LIMIT. 
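
A minimal sketch of what that probe might look like (the class and method 
names here are illustrative, not the actual Spark code):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustrative sketch: probe for table existence with a query that
// avoids the non-standard LIMIT keyword. The probe returns an empty
// result set when the table exists and throws when it does not.
class TableProbe {
    static String existsQuery(String table) {
        return "SELECT 1 FROM " + table + " WHERE 0 = 1";
    }

    // True iff the probe query executes without throwing.
    static boolean tableExists(Connection conn, String table) {
        try (PreparedStatement ps = conn.prepareStatement(existsQuery(table))) {
            ps.executeQuery();
            return true;
        } catch (SQLException e) {
            return false;
        }
    }
}
```

The WHERE 0 = 1 predicate is constant-false, so engines that do constant 
folding never touch the table's rows; the statement still fails to prepare or 
execute if the table is absent, which is the signal being tested for.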

Perhaps "supports LIMIT" could be a field in the JdbcDialect class for 
databases that support the keyword to override. The ANSI-standard equivalent 
is (OFFSET and) FETCH. 
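
One hedged sketch of that idea (the class and method names below are 
hypothetical, not Spark's actual JdbcDialect API):

```java
// Hypothetical sketch: each dialect declares whether it supports LIMIT,
// and the existence probe is built accordingly. Dialects that support
// the keyword keep the LIMIT 1 probe; everything else falls back to the
// portable WHERE 0 = 1 form.
class ProbeDialect {
    protected boolean supportsLimit() {
        return false;  // conservative default for unknown JDBC sources
    }

    String tableExistsQuery(String table) {
        return supportsLimit()
            ? "SELECT 1 FROM " + table + " LIMIT 1"
            : "SELECT 1 FROM " + table + " WHERE 0 = 1";
    }
}

// Example override for a database known to support LIMIT.
class MySQLProbeDialect extends ProbeDialect {
    @Override
    protected boolean supportsLimit() {
        return true;
    }
}
```

Defaulting to "no LIMIT" keeps unknown data sources on the portable path, so 
only dialects that are positively known to support the keyword opt in.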

The standard way to check for table existence would be to query 
information_schema.tables, which is part of the SQL standard but may not work 
for JDBC data sources that support SQL but not the information_schema. The 
JDBC DatabaseMetaData interface provides getSchemas(), which allows checking 
for the information_schema in drivers that support it.
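
Along the same lines, DatabaseMetaData.getTables() can answer the existence 
question directly from driver metadata, without issuing any SQL at all. A 
hedged sketch (catalog/schema filtering behavior varies across drivers, so 
treat this as an assumption to verify per driver):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative sketch: ask the driver's metadata whether the table
// exists. getTables(catalog, schemaPattern, tableNamePattern, types)
// is part of the JDBC spec; passing null for the first two arguments
// means "don't filter by catalog/schema".
class MetadataProbe {
    static boolean tableExists(Connection conn, String table) throws SQLException {
        try (ResultSet rs = conn.getMetaData().getTables(null, null, table, null)) {
            return rs.next();  // at least one matching table reported
        }
    }
}
```

The caveat is case sensitivity: some drivers store identifiers upper-cased 
(e.g. Oracle) or lower-cased, so the pattern passed here may need normalizing 
before the lookup is reliable across sources.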



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
