[
https://issues.apache.org/jira/browse/SPARK-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272018#comment-14272018
]
Yin Huai commented on SPARK-4912:
---------------------------------
h3. Persistence of data
Right now the data sources API can only be used to read data that is already
present. It would be nice to also support creating new tables and inserting
into existing tables.
h4. Types of insertions
In SQL
{code:sql}
INSERT [OVERWRITE] INTO <existingTable> SELECT …
CREATE TABLE <tableName> [USING <data source>] [OPTIONS (key value, …)]
  [PARTITIONED BY (col1, col2, …)] AS SELECT …
{code}
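A concrete instance of the grammar above (the table names, the {{parquet}} source, and the path are purely illustrative):
{code:sql}
-- Append into an existing table registered through the data sources API
INSERT INTO logs SELECT * FROM staging_logs;

-- Create a new partitioned table backed by a data source, populated from a query
CREATE TABLE archived_logs
USING parquet
OPTIONS (path '/data/archived_logs')
PARTITIONED BY (year, month)
AS SELECT * FROM logs;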
Programmatic API
{code}
schemaRDD.insertInto(tableName: String [, overwrite: Boolean])
schemaRDD.saveAsTable(tableName: String [, sourceName: String])
schemaRDD.saveAsTable(
  tableName: String,
  partitionColumns: Seq[String]
  [, sourceName: String],
  options: Map[String, String])
{code}
When sourceName is not specified, it defaults to the value of
"spark.sql.defaultDataSource". When that setting is also unset, HiveContext
will fall back to its standard configuration.
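A sketch of how the proposed programmatic API might be used. The overloads are the ones proposed above and do not exist yet; the table names, partition columns, and options here are hypothetical:
{code}
// Assumes a HiveContext `sqlContext` and an existing SchemaRDD `events`.
// Set the fallback data source used when sourceName is omitted.
sqlContext.setConf("spark.sql.defaultDataSource", "org.apache.spark.sql.parquet")

// Append (overwrite = false) into an already-registered table.
events.insertInto("events_table", false)

// Create and populate a new table using the default data source.
events.saveAsTable("events_archive")

// Create a partitioned table, passing source-specific options.
events.saveAsTable(
  "events_by_day",
  Seq("year", "month", "day"),
  Map("path" -> "/data/events_by_day"))
{code}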
h4. Interfaces
{code}
trait InsertableRelation {
  def insertInto(overwrite: Boolean, data: SchemaRDD): Unit
}

trait CreateableRelation {
  def createRelation(
      name: String,
      options: Map[String, String],
      data: SchemaRDD): Map[String, String]
}

trait InsertablePartitionedRelation extends PartitionedRelation {
  def writeFile(path: String, schema: StructType, data: Iterator[Row]): Unit
}
{code}
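A minimal sketch of how a data source might implement the proposed InsertableRelation trait. Everything other than the trait itself (the TextRelation class, its line-per-row layout, and the elided overwrite cleanup) is hypothetical:
{code}
// A hypothetical relation backed by plain text files, one string column per row.
case class TextRelation(path: String)(@transient val sqlContext: SQLContext)
  extends BaseRelation with InsertableRelation {

  override val schema: StructType =
    StructType(StructField("value", StringType, nullable = true) :: Nil)

  override def insertInto(overwrite: Boolean, data: SchemaRDD): Unit = {
    val lines = data.map(row => row.getString(0))
    if (overwrite) {
      // Overwrite semantics: remove any existing output at `path`
      // before writing (deletion elided here).
    }
    lines.saveAsTextFile(path)
  }
}
{code}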
> Persistent data source tables
> -----------------------------
>
> Key: SPARK-4912
> URL: https://issues.apache.org/jira/browse/SPARK-4912
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Michael Armbrust
> Assignee: Michael Armbrust
> Priority: Blocker
>
> It would be good if tables created through the new data sources api could be
> persisted to the hive metastore.