[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702228#comment-16702228 ] Apache Spark commented on SPARK-24423: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/23170 > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Assignee: Dilip Biswal >Priority: Major > Fix For: 2.4.0 > > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > {code} > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > {code} > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > {code} > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > {code} > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > {code} > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > {code} > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516732#comment-16516732 ] Apache Spark commented on SPARK-24423: -- User 'dilipbiswal' has created a pull request for this issue: https://github.com/apache/spark/pull/21590 > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Priority: Major > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > {code} > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > {code} > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > {code} > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > {code} > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > {code} > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > {code} > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516071#comment-16516071 ] Dilip Biswal commented on SPARK-24423: -- [~maropu] Hello, yes. I will open a PR today/tomorrow. > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Priority: Major > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > {code} > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > {code} > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > {code} > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > {code} > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > {code} > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > {code} > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514735#comment-16514735 ] Takeshi Yamamuro commented on SPARK-24423: -- Are'u still working on this? > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Priority: Major > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > {code} > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > {code} > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > {code} > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > {code} > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > {code} > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > {code} > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494753#comment-16494753 ] Dilip Biswal commented on SPARK-24423: -- [~smilegator] Thanks Sean for pinging me. I would like to take a look at this one. > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Priority: Major > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources
[ https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494739#comment-16494739 ] Xiao Li commented on SPARK-24423: - cc [~dkbiswal] Are your team interested in this task? > Add a new option `query` for JDBC sources > - > > Key: SPARK-24423 > URL: https://issues.apache.org/jira/browse/SPARK-24423 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 2.3.0 >Reporter: Xiao Li >Priority: Major > > Currently, our JDBC connector provides the option `dbtable` for users to > specify the to-be-loaded JDBC source table. > > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", "dbName.tableName") > .options(jdbcCredentials: Map) > .load() > > Normally, users do not fetch the whole JDBC table due to the poor > performance/throughput of JDBC. Thus, they normally just fetch a small set of > tables. For advanced users, they can pass a subquery as the option. > > val query = """ (select * from tableName limit 10) as tmp """ > val jdbcDf = spark.read > .format("jdbc") > .option("*dbtable*", query) > .options(jdbcCredentials: Map) > .load() > > However, this is straightforward to end users. We should simply allow users > to specify the query by a new option `query`. We will handle the complexity > for them. > > val query = """select * from tableName limit 10""" > val jdbcDf = spark.read > .format("jdbc") > .option("*{color:#ff}query{color}*", query) > .options(jdbcCredentials: Map) > .load() > > Users are not allowed to specify query and dbtable at the same time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org