Is there a third way? Unless I'm missing something, Hadoop's OutputFormat wants the target dir to not exist no matter what, so it's just a question of whether Spark deletes it for you or errors out.
On Tue, Jun 3, 2014 at 12:22 AM, Patrick Wendell <pwend...@gmail.com> wrote:
> We can just add back a flag to make it backwards compatible - it was
> just missed during the original PR.
>
> Adding a *third* set of "clobber" semantics, I'm slightly -1 on that
> for the following reasons:
>
> 1. It's scary to have Spark recursively deleting user files, could
> easily lead to users deleting data by mistake if they don't understand
> the exact semantics.
> 2. It would introduce a third set of semantics here for saveAsXX...
> 3. It's trivial for users to implement this with two lines of code (if
> output dir exists, delete it) before calling saveAsHadoopFile.
>
> - Patrick
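
For anyone who wants the clobber behavior anyway, a minimal sketch of the workaround Patrick describes in point 3, using the Hadoop FileSystem API (the path, `sc`, and `rdd` below are hypothetical placeholders, not anything from this thread):

  import org.apache.hadoop.fs.{FileSystem, Path}

  // Hypothetical example values; substitute your own path, context, and RDD.
  val outputPath = new Path("hdfs:///tmp/output")
  val fs = FileSystem.get(outputPath.toUri, sc.hadoopConfiguration)

  // Recursively delete the output dir if it exists, so the Hadoop
  // OutputFormat's existence check won't make the save fail.
  if (fs.exists(outputPath)) {
    fs.delete(outputPath, true)  // true = recursive delete
  }

  rdd.saveAsTextFile(outputPath.toString)

This keeps the recursive delete in user code, where the user has explicitly opted into it, rather than baking it into Spark's save semantics.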