This was discussed in the past and viewed as dangerous to enable. The biggest problem, by far, comes when you have a job that outputs M partitions, 'overwriting' a directory of data containing N > M old partitions. You suddenly have a mix of new and old data.
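
To make the N > M case concrete, here is a minimal sketch in plain Python. It is not Spark code: it only imitates the part-NNNNN file layout on a local temp directory, and the helper names (write_partitions, read_all, and the two demo functions) are made up for illustration. It contrasts writing over an existing directory with removing the directory first.

```python
import shutil
import tempfile
from pathlib import Path


def write_partitions(out_dir: Path, records_per_part: list[list[str]]) -> None:
    """Write one part-NNNNN file per partition; only same-named files get replaced."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for i, records in enumerate(records_per_part):
        (out_dir / f"part-{i:05d}").write_text("\n".join(records) + "\n")


def read_all(out_dir: Path) -> list[str]:
    """Read every part file in name order, the way a downstream reader would."""
    lines: list[str] = []
    for part in sorted(out_dir.glob("part-*")):
        lines.extend(part.read_text().splitlines())
    return lines


def naive_overwrite_demo() -> list[str]:
    """Write N = 3 old partitions, then M = 2 new ones into the same directory."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output"
        write_partitions(out, [["old-0"], ["old-1"], ["old-2"]])
        write_partitions(out, [["new-0"], ["new-1"]])
        return read_all(out)  # part-00002 from the old job is still there


def clean_overwrite_demo() -> list[str]:
    """Same two jobs, but the output directory is removed before the second write."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "output"
        write_partitions(out, [["old-0"], ["old-1"], ["old-2"]])
        shutil.rmtree(out)  # drop the whole directory, not individual files
        write_partitions(out, [["new-0"], ["new-1"]])
        return read_all(out)


if __name__ == "__main__":
    print(naive_overwrite_demo())  # ['new-0', 'new-1', 'old-2']
    print(clean_overwrite_demo())  # ['new-0', 'new-1']
```

The naive version leaves old-2 behind because nothing in the second job ever touches part-00002; deleting the directory as a whole is what makes the result consistent.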
It doesn't match Hadoop's semantics either, which won't let you do this. You can of course simply remove the output directory.

On Fri, Mar 6, 2015 at 2:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Adding support for overwrite flag would make saveAsXXFile more user friendly.
>
> Cheers
>
>> On Mar 6, 2015, at 2:14 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>> Hi folks,
>>
>> I found that RDD:saveXXFile has no overwrite flag which I think is very
>> helpful. Is there any reason for this ?
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org