[
https://issues.apache.org/jira/browse/SPARK-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318452#comment-14318452
]
Sean Owen commented on SPARK-5774:
----------------------------------
I don't think Spark does or should support append semantics in these methods.
An RDD is immutable, and persisting it writes that entire immutable state out
as whole files. Although it depends a lot on what you mean, INSERT INTO does
not require append semantics: you make new files.
If you mean that you can't automatically overwrite past output, that is why
the {{spark.hadoop.validateOutputSpecs}} option exists: you can turn it off
to allow overwriting.
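A minimal sketch of the option described above, assuming a local SparkContext; the app name and output path are illustrative, not from the ticket:

```scala
// Sketch: relaxing Hadoop output-spec validation so saveAsTextFile can
// write to a directory that already exists. Note this enables OVERWRITE
// semantics, not append: existing part files get replaced, and the RDD's
// full state is written out as new files.
import org.apache.spark.{SparkConf, SparkContext}

object OverwriteSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("overwrite-sketch") // illustrative name
      .setMaster("local[*]")
      // With validation off, Spark skips the "output directory already
      // exists" check instead of failing the save.
      .set("spark.hadoop.validateOutputSpecs", "false")
    val sc = new SparkContext(conf)
    // Second run over the same path succeeds instead of throwing.
    sc.parallelize(Seq("a", "b", "c")).saveAsTextFile("/tmp/spark-5774-out")
    sc.stop()
  }
}
```

Even with the check disabled, each save still emits a complete new set of part files; nothing is appended to the previous output.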
> Support save RDD append to file
> -------------------------------
>
> Key: SPARK-5774
> URL: https://issues.apache.org/jira/browse/SPARK-5774
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 1.3.0
> Reporter: Yanbo Liang
>
> Currently RDD.saveAsTextFile only supports writing to a path that does not
> already exist. In some cases we need to save an RDD by appending to an
> existing file. For example, when executing the SQL command "INSERT INTO ...",
> we need to append the RDD's output to an existing file.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)