[ 
https://issues.apache.org/jira/browse/SPARK-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated SPARK-5420:
----------------------------
    Comment: was deleted

(was: I am copying the summary of write related interfaces from 
[SPARK-5501|https://issues.apache.org/jira/browse/SPARK-5501?focusedCommentId=14303760&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14303760]
 to here.
h3. End user APIs added to DataFrame (write related)
h4. Save a DataFrame as a table
When using *HiveContext*, a user can save a DataFrame as a table. The 
metadata of this table will be stored in the metastore.
{code}
// When a data source name is not specified, we will use our default one
// (configured by spark.sql.default.datasource). Right now, it is Parquet.
def saveAsTable(tableName: String): Unit
def saveAsTable(
      tableName: String,
      dataSourceName: String,
      option: (String, String),
      options: (String, String)*): Unit
// This is for Java users.
def saveAsTable(
      tableName: String,
      dataSourceName: String,
      options: java.util.Map[String, String]): Unit
{code}
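
For example, a usage sketch (assuming a HiveContext-backed DataFrame {{df}}; the table names, data source name, and option key shown are illustrative):
{code}
// Save with the default data source (currently Parquet).
df.saveAsTable("users")
// Save with an explicit data source; the option key is an assumption.
df.saveAsTable(
  "users_json",
  "org.apache.spark.sql.json",
  "samplingRatio" -> "1.0")
{code}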

h4. Save a DataFrame to a data source
Users can save a DataFrame to a data source.
{code}
// This method is used to save a DataFrame to a file-based data source (e.g.
// Parquet). We will use the default data source. Right now, it is Parquet.
def save(path: String): Unit
def save(
      dataSourceName: String,
      option: (String, String),
      options: (String, String)*): Unit
// This is for Java users.
def save(
      dataSourceName: String,
      options: java.util.Map[String, String]): Unit
{code}
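
A usage sketch (assuming a DataFrame {{df}}; the path and the "path" option key are assumptions for illustration):
{code}
// Save to a path with the default data source (currently Parquet).
df.save("/tmp/users.parquet")
// Save with an explicit data source and options.
df.save(
  "org.apache.spark.sql.parquet",
  "path" -> "/tmp/users.parquet")
{code}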

h4. Insert data into a table from a DataFrame
Users can insert the data of a DataFrame into an existing table created 
through the data source API.
{code}
// Appends the data of this DataFrame to the table tableName.
def insertInto(tableName: String): Unit
// When overwrite is true, inserts the data of this DataFrame into the table
// tableName, overwriting its existing data.
// When overwrite is false, appends the data of this DataFrame to the table
// tableName.
def insertInto(tableName: String, overwrite: Boolean): Unit
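// A usage sketch (assumes a table `users` created through the data source
// API and a DataFrame `newUsers` with a matching schema):
//   newUsers.insertInto("users")                    // append
//   newUsers.insertInto("users", overwrite = true)  // replace existing data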
{code})

> Cross-language load/store functions for creating and saving DataFrames
> ----------------------------------------------------------------------
>
>                 Key: SPARK-5420
>                 URL: https://issues.apache.org/jira/browse/SPARK-5420
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Patrick Wendell
>            Assignee: Yin Huai
>            Priority: Blocker
>             Fix For: 1.3.0
>
>
> We should have standard APIs for loading or saving a table from a data 
> store. Per comment discussion:
> {code}
> def loadData(datasource: String, parameters: Map[String, String]): DataFrame
> def loadData(datasource: String, parameters: java.util.Map[String, String]): DataFrame
> def storeData(datasource: String, parameters: Map[String, String]): DataFrame
> def storeData(datasource: String, parameters: java.util.Map[String, String]): DataFrame
> {code}
> Python should have this too.
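
A minimal sketch of how the proposed methods might be called (assumptions: {{loadData}} lives on SQLContext, {{storeData}} on DataFrame, and the data source names and "path" option key are illustrative):
{code}
// Load a DataFrame from the Parquet data source.
val df = sqlContext.loadData(
  "org.apache.spark.sql.parquet",
  Map("path" -> "/tmp/users.parquet"))
// Store it through the JSON data source.
df.storeData(
  "org.apache.spark.sql.json",
  Map("path" -> "/tmp/users.json"))
{code}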



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
