[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-11-28 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702228#comment-16702228
 ] 

Apache Spark commented on SPARK-24423:
--

User 'wangyum' has created a pull request for this issue:
https://github.com/apache/spark/pull/23170

> Add a new option `query` for JDBC sources
> -
>
> Key: SPARK-24423
> URL: https://issues.apache.org/jira/browse/SPARK-24423
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Assignee: Dilip Biswal
> Priority: Major
> Fix For: 2.4.0
>
>
> Currently, our JDBC connector provides the option `dbtable` for users to 
> specify the to-be-loaded JDBC source table. 
> {code} 
>  val jdbcDf = spark.read
>    .format("jdbc")
>    .option("dbtable", "dbName.tableName")
>    .options(jdbcCredentials) // jdbcCredentials: Map[String, String], e.g. url, user, password
>    .load()
> {code} 
>  Normally, users do not fetch the whole JDBC table because of the poor 
> performance/throughput of JDBC, so they usually fetch only a small part of 
> it. Advanced users can pass a subquery as the `dbtable` option.
> {code} 
>  val query = """ (select * from tableName limit 10) as tmp """
>  val jdbcDf = spark.read
>    .format("jdbc")
>    .option("dbtable", query)
>    .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
>    .load()
> {code} 
>  However, this is not straightforward for end users. We should simply allow 
> users to specify the query through a new option `query`. We will handle the 
> complexity for them. 
> {code} 
>  val query = """select * from tableName limit 10"""
>  val jdbcDf = spark.read
>    .format("jdbc")
>    .option("query", query)
>    .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
>    .load()
> {code} 
>  Users are not allowed to specify `query` and `dbtable` at the same time. 
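
As a rough illustration of the proposal (not the implementation in the linked pull requests), the handling of the two options could be sketched in plain Scala as below. The object name `JdbcSourceOptions`, the helper `tableOrQuery`, the alias `subq_alias`, and the error messages are all assumptions made for this sketch; only the option names `dbtable` and `query` come from the issue description.

```scala
// Hypothetical helper for illustration only; this is not Spark's actual code.
object JdbcSourceOptions {

  /** Returns the table expression that would be placed in the FROM clause. */
  def tableOrQuery(options: Map[String, String]): String = {
    // Treat missing and blank values the same way.
    val dbtable = options.get("dbtable").map(_.trim).filter(_.nonEmpty)
    val query = options.get("query").map(_.trim).filter(_.nonEmpty)

    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        // The issue states that both options must not be given together.
        throw new IllegalArgumentException(
          "Options 'dbtable' and 'query' cannot be specified at the same time.")
      case (None, None) =>
        // Assumed behaviour: one of the two options is required.
        throw new IllegalArgumentException(
          "Either 'dbtable' or 'query' must be specified.")
      case (Some(table), None) =>
        table
      case (None, Some(q)) =>
        // Wrap the bare query as an aliased subquery so the user does not
        // have to write the parentheses and alias by hand. The alias name
        // is made up for this sketch.
        s"($q) AS subq_alias"
    }
  }
}
```

With this sketch, `Map("query" -> "select * from tableName limit 10")` resolves to the same kind of aliased subquery that users currently have to write by hand for `dbtable`.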



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-06-19 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516732#comment-16516732
 ] 

Apache Spark commented on SPARK-24423:
--

User 'dilipbiswal' has created a pull request for this issue:
https://github.com/apache/spark/pull/21590







[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-06-18 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516071#comment-16516071
 ] 

Dilip Biswal commented on SPARK-24423:
--

[~maropu] Hello, yes. I will open a PR today/tomorrow.







[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-06-16 Thread Takeshi Yamamuro (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514735#comment-16514735
 ] 

Takeshi Yamamuro commented on SPARK-24423:
--

Are you still working on this?







[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Dilip Biswal (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494753#comment-16494753
 ] 

Dilip Biswal commented on SPARK-24423:
--

[~smilegator] Thanks Sean for pinging me. I would like to take a look at this 
one.







[jira] [Commented] (SPARK-24423) Add a new option `query` for JDBC sources

2018-05-30 Thread Xiao Li (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494739#comment-16494739
 ] 

Xiao Li commented on SPARK-24423:
-

cc [~dkbiswal] Is your team interested in this task?



