[
https://issues.apache.org/jira/browse/SPARK-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075789#comment-14075789
]
Apache Spark commented on SPARK-2710:
-------------------------------------
User 'chutium' has created a pull request for this issue:
https://github.com/apache/spark/pull/1612
> Build SchemaRDD from a JdbcRDD with MetaData (no hard code case class)
> ----------------------------------------------------------------------
>
> Key: SPARK-2710
> URL: https://issues.apache.org/jira/browse/SPARK-2710
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, SQL
> Reporter: Teng Qiu
>
> Spark SQL can load Parquet or JSON files as a table directly, without
> requiring a case class to define the schema.
> As the SQL component, it should also be able to consume a ResultSet from
> an RDBMS just as easily.
> I found that there is a JdbcRDD in core:
> core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala
> so I want to make a small change in this file to allow SQLContext to read
> the metadata from the PreparedStatement (reading the metadata does not
> require actually executing the query).
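To illustrate the idea of deriving a schema from statement metadata, here is a minimal sketch. It is not the code from the pull request: `ColumnInfo`, `JdbcSchemaSketch`, and the type-name mapping are hypothetical names chosen for illustration, and a real implementation would map to Spark SQL's own data types. Only standard `java.sql` calls are used. Note that `PreparedStatement.getMetaData` may return null when the driver cannot describe the result without executing the query.

```scala
import java.sql.{PreparedStatement, ResultSetMetaData, Types}

// Hypothetical column description; this is not a Spark SQL class.
final case class ColumnInfo(name: String, typeName: String, nullable: Boolean)

object JdbcSchemaSketch {
  // Illustrative mapping from java.sql.Types codes to simple type names.
  // Assumption: anything not listed is reported as "unknown".
  def typeName(jdbcType: Int): String = jdbcType match {
    case Types.TINYINT | Types.SMALLINT | Types.INTEGER  => "int"
    case Types.BIGINT                                    => "long"
    case Types.REAL                                      => "float"
    case Types.FLOAT | Types.DOUBLE                      => "double"
    case Types.BIT | Types.BOOLEAN                       => "boolean"
    case Types.CHAR | Types.VARCHAR | Types.LONGVARCHAR  => "string"
    case _                                               => "unknown"
  }

  // Walk the metadata; JDBC column indexes start at 1.
  def columns(md: ResultSetMetaData): Seq[ColumnInfo] =
    (1 to md.getColumnCount).map { i =>
      ColumnInfo(
        md.getColumnLabel(i),
        typeName(md.getColumnType(i)),
        md.isNullable(i) != ResultSetMetaData.columnNoNulls)
    }

  // Describe the result columns of a prepared statement without running it.
  // Returns an empty sequence if the driver provides no metadata up front.
  def describe(stmt: PreparedStatement): Seq[ColumnInfo] =
    Option(stmt.getMetaData).map(columns).getOrElse(Seq.empty)
}
```

With a live connection, `JdbcSchemaSketch.describe(conn.prepareStatement(sql))` would give the column names and types that a schema could be built from, assuming the driver supports metadata on prepared statements.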
> There is also a small bug in JdbcRDD, in the close() method defined
> inside compute():
> {code}
> if (null != conn && ! stmt.isClosed()) conn.close()
> {code}
> should be
> {code}
> if (null != conn && ! conn.isClosed()) conn.close()
> {code}
> just a small typo :)
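The fix above amounts to checking each resource's own state before closing it. A minimal sketch of that pattern follows; `JdbcCleanup` and `closeAll` are hypothetical names for illustration, not the actual JdbcRDD code. With the original guard, a null `stmt` alongside a live `conn` would throw a NullPointerException instead of closing the connection.

```scala
import java.sql.{Connection, Statement}

object JdbcCleanup {
  // Close the statement first, then the connection. Each guard inspects
  // the same object it is about to close, and a failure while closing
  // one resource does not prevent the attempt to close the other.
  def closeAll(stmt: Statement, conn: Connection): Unit = {
    try {
      if (stmt != null && !stmt.isClosed) stmt.close()
    } catch {
      case e: Exception => System.err.println(s"Error closing statement: $e")
    }
    try {
      if (conn != null && !conn.isClosed) conn.close()
    } catch {
      case e: Exception => System.err.println(s"Error closing connection: $e")
    }
  }
}
```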
> Then, in Spark SQL, SQLContext can create a SchemaRDD from a JdbcRDD and its
> metadata.
> In the future, maybe we can add a feature to sql-shell so that users of
> spark-thrift-server can join tables from different sources,
> such as:
> {code}
> CREATE TABLE jdbc_tbl1 AS JDBC "connectionString" "username" "password"
> "initQuery" "bound" ...
> CREATE TABLE parquet_files AS PARQUET "hdfs://tmp/parquet_table/"
> SELECT parquet_files.colX, jdbc_tbl1.colY
> FROM parquet_files
> JOIN jdbc_tbl1
> ON (parquet_files.id = jdbc_tbl1.id)
> {code}
> I think such a feature would be useful, similar to what Facebook's Presto engine offers.
--
This message was sent by Atlassian JIRA
(v6.2#6252)