Re: No overwrite flag for saveAsXXFile

Sean Owen Fri, 06 Mar 2015 06:46:03 -0800

This was discussed in the past and viewed as dangerous to enable. The
biggest problem, by far, comes when you have a job that outputs M
partitions, 'overwriting' a directory of data containing N > M old
partitions. Only the part files the new job writes get replaced, so
you suddenly have a mix of new and old data.
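
To make the failure mode concrete, here is a small sketch in plain
Python (not Spark) that imitates writing part-NNNNN files into a local
directory. The helper names write_parts and read_all, and the file
layout, are assumptions for illustration only; they are not Spark APIs.

```python
import os
import tempfile


def write_parts(out_dir, records, num_partitions):
    """Write records round-robin into part-00000 .. part-(n-1).

    Hypothetical helper: it imitates a naive 'overwrite' that replaces
    same-named part files but never clears the directory first.
    """
    os.makedirs(out_dir, exist_ok=True)
    for p in range(num_partitions):
        path = os.path.join(out_dir, "part-%05d" % p)
        with open(path, "w") as f:
            for rec in records[p::num_partitions]:
                f.write(rec + "\n")


def read_all(out_dir):
    """Read every part file back, as a later job reading the directory would."""
    lines = []
    for name in sorted(os.listdir(out_dir)):
        with open(os.path.join(out_dir, name)) as f:
            lines.extend(line.strip() for line in f)
    return lines


out_dir = os.path.join(tempfile.mkdtemp(), "output")
write_parts(out_dir, ["old-%d" % i for i in range(8)], 4)  # N = 4 old partitions
write_parts(out_dir, ["new-%d" % i for i in range(4)], 2)  # M = 2 new partitions

mixed = read_all(out_dir)
# Records from the old part-00002 and part-00003 are still in the directory.
stale = [rec for rec in mixed if rec.startswith("old-")]
```

Here part-00000 and part-00001 hold new records, while part-00002 and
part-00003 still hold old ones, so a reader of the directory sees both.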


An overwrite flag wouldn't match Hadoop's semantics either: Hadoop
won't let you write into an output directory that already exists.
You can of course simply remove the output directory first.
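
As a rough sketch of the "remove the output directory first" approach,
the plain-Python example below (again not Spark) deletes a local
directory before writing new part files. The helper name save_fresh is
hypothetical. On HDFS the equivalent removal step would be something
like `hadoop fs -rm -r <path>` rather than shutil.rmtree.

```python
import os
import shutil
import tempfile


def save_fresh(out_dir, records, num_partitions):
    """Delete out_dir if it exists, then write new part files.

    Hypothetical helper for illustration; it works on the local
    filesystem only.
    """
    if os.path.isdir(out_dir):
        shutil.rmtree(out_dir)  # the explicit "remove the output directory" step
    os.makedirs(out_dir)
    for p in range(num_partitions):
        path = os.path.join(out_dir, "part-%05d" % p)
        with open(path, "w") as f:
            for rec in records[p::num_partitions]:
                f.write(rec + "\n")


out_dir = os.path.join(tempfile.mkdtemp(), "output")
save_fresh(out_dir, ["old-%d" % i for i in range(8)], 4)  # 4 old partitions
save_fresh(out_dir, ["new-%d" % i for i in range(4)], 2)  # 2 new partitions

# Only the part files from the second write remain.
remaining = sorted(os.listdir(out_dir))
```

Because the directory is cleared before each write, no part file from
the earlier, larger output survives.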

On Fri, Mar 6, 2015 at 2:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Adding support for an overwrite flag would make saveAsXXFile more user-friendly.
>
> Cheers
>
>
>
>> On Mar 6, 2015, at 2:14 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> Hi folks,
>>
>> I found that RDD.saveAsXXFile has no overwrite flag, which I think would be
>> very helpful. Is there any reason for this?
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>

