[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494753#comment-16494753 ]
Dilip Biswal commented on SPARK-24423:
--------------------------------------

[~smilegator] Thanks Sean for pinging me. I would like to take a look at this one.

> Add a new option `query` for JDBC sources
> -----------------------------------------
>
>                 Key: SPARK-24423
>                 URL: https://issues.apache.org/jira/browse/SPARK-24423
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major
>
> Currently, our JDBC connector provides the option `dbtable` for users to
> specify the JDBC source table to load.
>
>   val jdbcDf = spark.read
>     .format("jdbc")
>     .option("dbtable", "dbName.tableName")
>     .options(jdbcCredentials) // Map[String, String] of connection options
>     .load()
>
> Normally, users do not fetch the whole JDBC table because of the poor
> performance/throughput of JDBC. Instead, they usually fetch only a small
> subset of the table. Advanced users can pass a subquery as the `dbtable`
> option.
>
>   val query = """ (select * from tableName limit 10) as tmp """
>   val jdbcDf = spark.read
>     .format("jdbc")
>     .option("dbtable", query)
>     .options(jdbcCredentials) // Map[String, String] of connection options
>     .load()
>
> However, this is not straightforward for end users. We should simply allow
> users to specify the query through a new option `query`. We will handle the
> complexity for them.
>
>   val query = """select * from tableName limit 10"""
>   val jdbcDf = spark.read
>     .format("jdbc")
>     .option("query", query)
>     .options(jdbcCredentials) // Map[String, String] of connection options
>     .load()
>
> Users are not allowed to specify query and dbtable at the same time.
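
As a rough illustration of the "handle the complexity for them" part, the two rules in the description (wrap the `query` text as an aliased subquery, and reject `query` together with `dbtable`) might be sketched as below. This is only a sketch under assumptions, not Spark's actual implementation: the object `JdbcQueryOption`, the method `resolveTableOrQuery`, the alias `tmp_alias`, and the error messages are all hypothetical names chosen for the example.

```scala
// Hypothetical helper; names and messages are assumptions, not Spark's API.
object JdbcQueryOption {

  /** Returns the table expression to hand to the JDBC source.
    *
    * - only `dbtable` set: returned as-is
    * - only `query` set: wrapped in parentheses and given an alias,
    *   which is what users previously had to write by hand
    * - both set, or neither set: rejected
    */
  def resolveTableOrQuery(options: Map[String, String]): String = {
    // Treat blank values the same as missing ones.
    val dbtable = options.get("dbtable").map(_.trim).filter(_.nonEmpty)
    val query   = options.get("query").map(_.trim).filter(_.nonEmpty)

    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        throw new IllegalArgumentException(
          "Options 'dbtable' and 'query' cannot be specified at the same time.")
      case (None, None) =>
        throw new IllegalArgumentException(
          "One of the options 'dbtable' or 'query' is required.")
      case (Some(table), None) => table
      case (None, Some(q))     => s"($q) tmp_alias" // alias name is hypothetical
    }
  }
}
```

With this sketch, `Map("query" -> "select * from tableName limit 10")` resolves to the same kind of aliased subquery that the second snippet in the description passes through `dbtable` manually, while a plain `dbtable` value passes through unchanged.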