Xiao Li created SPARK-24423:
-------------------------------

             Summary: Add a new option `query` for JDBC sources
                 Key: SPARK-24423
                 URL: https://issues.apache.org/jira/browse/SPARK-24423
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: Xiao Li

Currently, our JDBC connector provides the option `dbtable` for users to specify the JDBC source table to load.

    // jdbcCredentials: Map[String, String] holding url, user, password, etc.
    val jdbcDf = spark.read
      .format("jdbc")
      .option("dbtable", "dbName.tableName")
      .options(jdbcCredentials)
      .load()

Normally, users do not fetch the whole JDBC table because of the poor performance/throughput of JDBC. Instead, they usually fetch only a small subset of the data. Advanced users can pass a subquery as the `dbtable` option, wrapped in parentheses and given an alias.

    val query = """ (select * from tableName limit 10) as tmp """
    val jdbcDf = spark.read
      .format("jdbc")
      .option("dbtable", query)
      .options(jdbcCredentials)
      .load()

However, this is not straightforward for end users. We should simply allow users to specify the query through a new option `query`. We will handle the complexity for them.

    val query = """select * from tableName limit 10"""
    val jdbcDf = spark.read
      .format("jdbc")
      .option("query", query)
      .options(jdbcCredentials)
      .load()

Users are not allowed to specify `query` and `dbtable` at the same time.
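
To illustrate what "handle the complexity for them" could mean, here is a minimal sketch of how the two options might be resolved into the single table-or-subquery string that `dbtable` accepts today. The object name `JdbcQueryOption`, the method `resolveTableOrQuery`, the alias `tmp`, and the error messages are all hypothetical; this is not Spark's actual implementation, only one way to express the proposed rules (wrap the query as an aliased subquery, reject `query` and `dbtable` together).

```scala
// Hypothetical helper illustrating the proposal; names are assumptions,
// not part of Spark's API.
object JdbcQueryOption {
  // Alias for the generated subquery, mirroring the "as tmp" workaround above.
  private val SubqueryAlias = "tmp"

  /** Returns the string to use where `dbtable` is expected today. */
  def resolveTableOrQuery(options: Map[String, String]): String = {
    // Treat blank values the same as missing ones.
    val dbtable = options.get("dbtable").map(_.trim).filter(_.nonEmpty)
    val query = options.get("query").map(_.trim).filter(_.nonEmpty)

    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        throw new IllegalArgumentException(
          "Options 'dbtable' and 'query' cannot be specified at the same time.")
      case (None, None) =>
        throw new IllegalArgumentException(
          "One of the options 'dbtable' or 'query' is required.")
      case (Some(table), None) =>
        table
      case (None, Some(q)) =>
        // Wrap the user query so it can stand in for a table name.
        s"($q) as $SubqueryAlias"
    }
  }
}
```

With this sketch, passing `Map("query" -> "select * from tableName limit 10")` yields the same subquery string that advanced users currently write by hand for `dbtable`, while supplying both options fails fast with an `IllegalArgumentException`.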