[jira] [Commented] (SPARK-5501) Write support for the data source API

Yin Huai (JIRA) Tue, 03 Feb 2015 10:58:19 -0800

    [ 
https://issues.apache.org/jira/browse/SPARK-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303760#comment-14303760
 ]


Yin Huai commented on SPARK-5501:
---------------------------------

h3. End user APIs added to DataFrame
h4. Save a DataFrame as a table
When using *HiveContext*, a user can save a DataFrame as a table. The
metadata of this table will be stored in the metastore.
{code}
// When a data source name is not specified, we will use our default one
// (configured by spark.sql.default.datasource). Right now, it is Parquet.
def saveAsTable(tableName: String): Unit
def saveAsTable(
      tableName: String,
      dataSourceName: String,
      option: (String, String),
      options: (String, String)*): Unit
// This is for Java users.
def saveAsTable(
      tableName: String,
      dataSourceName: String,
      options: java.util.Map[String, String]): Unit
{code}
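
A minimal sketch of how the three overloads above might be called. {{SketchFrame}} is a hypothetical stand-in for DataFrame that only records its calls, so the snippet runs without Spark. The data source names ({{json}}, {{parquet}}), the {{path}} option key, and the table names are placeholders assumed for illustration, not values confirmed by this proposal.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for DataFrame: it only records what each call would save.
class SketchFrame {
  val calls: ArrayBuffer[(String, String, Map[String, String])] = ArrayBuffer.empty

  // Placeholder for whatever spark.sql.default.datasource resolves to (assumed name).
  private val defaultSource = "parquet"

  def saveAsTable(tableName: String): Unit =
    calls += ((tableName, defaultSource, Map.empty[String, String]))

  // The leading `option` parameter means callers must pass at least one option.
  def saveAsTable(
      tableName: String,
      dataSourceName: String,
      option: (String, String),
      options: (String, String)*): Unit =
    calls += ((tableName, dataSourceName, (option +: options).toMap))

  // Java-friendly overload: copy the java.util.Map into a Scala Map.
  def saveAsTable(
      tableName: String,
      dataSourceName: String,
      options: java.util.Map[String, String]): Unit = {
    var copied = Map.empty[String, String]
    val it = options.entrySet().iterator()
    while (it.hasNext) {
      val e = it.next()
      copied += (e.getKey -> e.getValue)
    }
    calls += ((tableName, dataSourceName, copied))
  }
}

object SaveAsTableSketch {
  def run(): SketchFrame = {
    val df = new SketchFrame
    // Default data source.
    df.saveAsTable("people")
    // Explicit data source with one required option (Scala varargs form).
    df.saveAsTable("people_json", "json", "path" -> "/tmp/people_json")
    // Same call shape as a Java user would write it.
    val javaOptions = new java.util.HashMap[String, String]()
    javaOptions.put("path", "/tmp/people_parquet")
    df.saveAsTable("people_parquet", "parquet", javaOptions)
    df
  }
}
```

The tuple argument selects the varargs overload and the {{java.util.Map}} argument selects the Java one, so the two forms do not collide at the call site.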

h4. Save a DataFrame to a data source
Users can save a DataFrame using a data source.
{code}
// This method is used to save a DataFrame to a file-based data source (e.g.
// Parquet). We will use the default data source. Right now, it is Parquet.
def save(path: String): Unit
def save(
      dataSourceName: String,
      option: (String, String),
      options: (String, String)*): Unit
// This is for Java users.
def save(
      dataSourceName: String,
      options: java.util.Map[String, String]): Unit
{code}

h4. Insert data into a table from a DataFrame
Users can insert the data of a DataFrame into an existing table created by the
data source API.
{code}
// Appends the data of this DataFrame to the table tableName.
def insertInto(tableName: String): Unit
// When overwrite is true, inserts the data of this DataFrame into the table
// tableName and overwrites existing data.
// When overwrite is false, appends the data of this DataFrame to the table
// tableName.
def insertInto(tableName: String, overwrite: Boolean): Unit
{code}
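
The append and overwrite semantics described in the comments above can be sketched with a small in-memory model. {{SketchCatalog}} is hypothetical and is not part of Spark: rows are plain strings, the rows are passed explicitly instead of coming from a DataFrame, and the check that the table already exists is an assumption drawn from the phrase "existing table".

```scala
// Hypothetical in-memory model of a table store; rows are plain strings here.
class SketchCatalog {
  private var tables = Map.empty[String, Vector[String]]

  def create(tableName: String): Unit =
    tables += (tableName -> Vector.empty[String])

  def rows(tableName: String): Vector[String] = tables(tableName)

  // The one-argument form of the proposal: always appends.
  def insertInto(tableName: String, newRows: Seq[String]): Unit =
    insertInto(tableName, newRows, overwrite = false)

  def insertInto(tableName: String, newRows: Seq[String], overwrite: Boolean): Unit = {
    // Assumption: inserting into a table that does not exist is an error.
    require(tables.contains(tableName), s"table $tableName does not exist")
    // Overwrite drops the existing rows; otherwise they are kept and extended.
    val kept = if (overwrite) Vector.empty[String] else tables(tableName)
    tables += (tableName -> (kept ++ newRows))
  }
}

object InsertIntoSketch {
  // Returns the table contents after the appends and after the overwrite.
  def run(): (Vector[String], Vector[String]) = {
    val catalog = new SketchCatalog
    catalog.create("events")
    catalog.insertInto("events", Seq("a", "b"))
    catalog.insertInto("events", Seq("c"), overwrite = false)
    val afterAppends = catalog.rows("events")
    catalog.insertInto("events", Seq("d"), overwrite = true)
    (afterAppends, catalog.rows("events"))
  }
}
```

In this model, {{insertInto(tableName)}} behaves the same as {{insertInto(tableName, overwrite = false)}}, which matches the two comments in the signatures above.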

> Write support for the data source API
> -------------------------------------
>
>                 Key: SPARK-5501
>                 URL: https://issues.apache.org/jira/browse/SPARK-5501
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>            Priority: Blocker
>             Fix For: 1.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
