[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514735#comment-16514735 ]
Takeshi Yamamuro commented on SPARK-24423:
------------------------------------------

Are you still working on this?

> Add a new option `query` for JDBC sources
> -----------------------------------------
>
>                 Key: SPARK-24423
>                 URL: https://issues.apache.org/jira/browse/SPARK-24423
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major
>
> Currently, our JDBC connector provides the option `dbtable` for users to
> specify the to-be-loaded JDBC source table.
> {code}
> val jdbcDf = spark.read
>   .format("jdbc")
>   .option("dbtable", "dbName.tableName")
>   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
>   .load()
> {code}
> Normally, users do not fetch the whole JDBC table due to the poor
> performance/throughput of JDBC. Thus, they usually fetch only a small
> portion of the data. Advanced users can pass a subquery as the `dbtable`
> option.
> {code}
> val query = """ (select * from tableName limit 10) as tmp """
> val jdbcDf = spark.read
>   .format("jdbc")
>   .option("dbtable", query)
>   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
>   .load()
> {code}
> However, this is not straightforward for end users. We should simply allow
> users to specify the query with a new option `query`. We will handle the
> complexity for them.
> {code}
> val query = """select * from tableName limit 10"""
> val jdbcDf = spark.read
>   .format("jdbc")
>   .option("query", query)
>   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
>   .load()
> {code}
> Users are not allowed to specify `query` and `dbtable` at the same time.
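
For illustration only, here is a minimal sketch of how the `query` option described in the issue could be resolved into the parenthesized, aliased subquery form that `dbtable` already accepts, while rejecting the case where both options are set. This is not Spark's actual implementation: the object `JdbcQueryOption`, the method `tableOrQuery`, and the alias `query_alias` are hypothetical names chosen for this example.

```scala
// Hypothetical helper, not part of Spark: resolves the user-supplied options
// into a table expression that can be placed after FROM.
object JdbcQueryOption {
  def tableOrQuery(options: Map[String, String]): String = {
    // Treat missing and blank values the same way.
    val dbtable = options.get("dbtable").map(_.trim).filter(_.nonEmpty)
    val query = options.get("query").map(_.trim).filter(_.nonEmpty)

    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        throw new IllegalArgumentException(
          "Options `dbtable` and `query` cannot be specified at the same time.")
      case (None, None) =>
        throw new IllegalArgumentException(
          "One of the options `dbtable` or `query` is required.")
      case (Some(table), None) =>
        table
      case (None, Some(q)) =>
        // Wrap the raw query so it reads like the hand-written subquery form;
        // the alias name is an assumption made for this sketch.
        s"($q) as query_alias"
    }
  }
}
```

With this sketch, `JdbcQueryOption.tableOrQuery(Map("query" -> "select * from tableName limit 10"))` yields `(select * from tableName limit 10) as query_alias`, which has the same shape as the subquery users currently have to write by hand for `dbtable`.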