[
https://issues.apache.org/jira/browse/SPARK-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318452#comment-14318452
]
Sean Owen commented on SPARK-5774:
----------------------------------
I don't think Spark does or should support append semantics in these methods.
An RDD is immutable, and persisting it writes that entire immutable state out
as whole files. Although it depends a lot on what you mean, INSERT INTO does
not require append semantics: you make new files.
If you mean that you can't automatically overwrite past output, that is why
the {{spark.hadoop.validateOutputSpecs}} option exists: you can turn it off
to allow overwriting.
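A minimal sketch of the option described above, assuming a local SparkContext; the app name and output path are illustrative, not from the ticket:

```scala
// Sketch: relaxing Hadoop output-spec validation so saveAsTextFile can
// write to a directory that already exists. Note this enables OVERWRITE
// semantics, not append: existing part files get replaced, and the RDD's
// full state is written out as new files.
import org.apache.spark.{SparkConf, SparkContext}

object OverwriteSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("overwrite-sketch") // illustrative name
      .setMaster("local[*]")
      // With validation off, Spark skips the "output directory already
      // exists" check instead of failing the save.
      .set("spark.hadoop.validateOutputSpecs", "false")
    val sc = new SparkContext(conf)
    // Second run over the same path succeeds instead of throwing.
    sc.parallelize(Seq("a", "b", "c")).saveAsTextFile("/tmp/spark-5774-out")
    sc.stop()
  }
}
```

Even with the check disabled, each save still emits a complete new set of part files; nothing is appended to the previous output.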
> Support save RDD append to file
> -------------------------------
>
> Key: SPARK-5774
> URL: https://issues.apache.org/jira/browse/SPARK-5774
> Project: Spark
> Issue Type: New Feature
> Components: Spark Core
> Affects Versions: 1.3.0
> Reporter: Yanbo Liang
>
> Currently RDD.saveAsTextFile only supports writing to a path that does not
> already exist. In some cases we need to save an RDD by appending to an
> existing file. For example, when executing the SQL command "INSERT INTO ...",
> we need to append the RDD's output to an existing file.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)