[jira] [Commented] (SPARK-2710) Build SchemaRDD from a JdbcRDD with MetaData (no hard code case class)

Apache Spark (JIRA) Sun, 27 Jul 2014 16:17:28 -0700

    [ 
https://issues.apache.org/jira/browse/SPARK-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075789#comment-14075789
 ]


Apache Spark commented on SPARK-2710:
-------------------------------------

User 'chutium' has created a pull request for this issue:
https://github.com/apache/spark/pull/1612

> Build SchemaRDD from a JdbcRDD with MetaData (no hard code case class)
> ----------------------------------------------------------------------
>
>                 Key: SPARK-2710
>                 URL: https://issues.apache.org/jira/browse/SPARK-2710
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>            Reporter: Teng Qiu
>
> Spark SQL can take Parquet files or JSON files as a table directly (without
> being given a case class to define the schema).
> As a component named SQL, it should also be able to take a ResultSet from an
> RDBMS easily.
> I found that there is a JdbcRDD in core:
> core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala
> so I want to make a small change in this file to allow SQLContext to read
> the MetaData from the PreparedStatement (reading the metadata does not
> require actually executing the query).
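A minimal sketch of the metadata idea described above, assuming the JDBC driver can describe a prepared statement before it runs. The names `JdbcSchemaSketch`, `ColumnInfo`, and `describeQuery` are hypothetical and are not part of Spark or of the pull request. `PreparedStatement.getMetaData` is standard JDBC, but it may return null, and it can be expensive on drivers whose DBMS lacks support for it.

```scala
import java.sql.{Connection, ResultSetMetaData}

object JdbcSchemaSketch {
  // Hypothetical container for one result column (illustration only).
  case class ColumnInfo(name: String, jdbcType: String, nullable: Boolean)

  // Describe the columns a query would return, without executing the query.
  def describeQuery(conn: Connection, sql: String): Seq[ColumnInfo] = {
    val stmt = conn.prepareStatement(sql)
    try {
      // Null if the driver cannot describe the result before execution.
      val meta: ResultSetMetaData = stmt.getMetaData
      if (meta == null) Seq.empty[ColumnInfo]
      else (1 to meta.getColumnCount).map { i =>
        ColumnInfo(
          meta.getColumnLabel(i),
          meta.getColumnTypeName(i),
          // Treat "unknown" nullability as nullable to stay on the safe side.
          meta.isNullable(i) != ResultSetMetaData.columnNoNulls)
      }
    } finally {
      stmt.close()
    }
  }
}
```

A caller would still have to map each JDBC type name to a Spark SQL data type; that mapping is left out here because the issue text does not specify it.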
> There is also a small bug in JdbcRDD, in the close() method inside
> compute():
> {code}
> if (null != conn && ! stmt.isClosed()) conn.close()
> {code}
> should be
> {code}
> if (null != conn && ! conn.isClosed()) conn.close()
> {code}
> just a small typo :)
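To illustrate why the guard matters: when the connection's close is guarded by the statement's state, a null statement throws before the connection is closed, and an already-closed statement skips the close entirely. The sketch below is one way to keep each resource's cleanup independent. It is not the code from JdbcRDD; `JdbcCloseSketch`, `closeQuietly`, and `closeAll` are hypothetical names. It also drops the `isClosed()` check, relying on the JDBC rule that calling `close()` on an already-closed object is a no-op.

```scala
import java.sql.{Connection, ResultSet, Statement}

object JdbcCloseSketch {
  // Close one resource, guarding only on that resource itself.
  // Returns true if close() was called and did not throw.
  def closeQuietly(resource: AutoCloseable): Boolean =
    if (resource == null) false
    else try {
      resource.close()
      true
    } catch {
      case _: Exception => false
    }

  // Close in reverse order of acquisition. Each step is independent, so a
  // null or failing statement cannot keep the connection from being closed.
  def closeAll(rs: ResultSet, stmt: Statement, conn: Connection): Unit = {
    closeQuietly(rs)
    closeQuietly(stmt)
    closeQuietly(conn)
  }
}
```

A real implementation would probably log the swallowed exception rather than discard it.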
> Then, in Spark SQL, SQLContext can create a SchemaRDD from a JdbcRDD and its
> MetaData.
> In the future, maybe we can add a feature to sql-shell, so that users can
> use spark-thrift-server to join tables from different sources,
> such as:
> {code}
> CREATE TABLE jdbc_tbl1 AS JDBC "connectionString" "username" "password" 
> "initQuery" "bound" ...
> CREATE TABLE parquet_files AS PARQUET "hdfs://tmp/parquet_table/"
> SELECT parquet_files.colX, jdbc_tbl1.colY
>   FROM parquet_files
>   JOIN jdbc_tbl1
>     ON (parquet_files.id = jdbc_tbl1.id)
> {code}
> I think such a feature would be useful, similar to what Facebook's Presto
> engine offers.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
