Have you tried spark.hadoop.validateOutputSpecs? A quick sketch of setting it is below.
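For reference, a minimal sketch of turning that property off on a SparkConf (the app name is illustrative; this validation only affects the saveAsHadoopFile-style output checks, it does not change how individual task files are handled on retry):

    // Minimal sketch: disable output-spec validation so a write does not fail
    // just because the output directory already exists. App name is made up.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("parquet-retry-example")                // illustrative name
      .set("spark.hadoop.validateOutputSpecs", "false")   // default is "true"

    val sc = new SparkContext(conf)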
On 01-Mar-2016 9:43 pm, "Peter Halliday" <pjh...@cornell.edu> wrote:

> http://pastebin.com/vbbFzyzb
>
> The problem seems to be twofold. First, the ParquetFileWriter in
> parquet-hadoop accepts an overwrite flag that Spark doesn't allow to be
> set. Second, the DirectParquetOutputCommitter has an abortTask that's
> empty. I see SPARK-8413 open on this too, but no plans on changing it.
> I'm surprised not to see this fixed yet.
>
> Peter Halliday
>
>
> On Mar 1, 2016, at 10:01 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Do you mind pastebin'ning the stack trace with the error so that we know
> which part of the code is under discussion?
>
> Thanks
>
> On Tue, Mar 1, 2016 at 7:48 AM, Peter Halliday <pjh...@cornell.edu> wrote:
>
>> I have a Spark application where a task seems to fail, but it actually
>> did write out some of the files assigned to it. Spark then assigns that
>> task to another executor, which gets a FileAlreadyExistsException. The
>> Hadoop code seems to allow for files to be overwritten, but I see the
>> 1.5.1 version of this code doesn't allow that to be passed in. Is that
>> correct?
>>
>> Peter Halliday
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
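To make the abortTask gap concrete: below is a hedged sketch (not Spark's DirectParquetOutputCommitter itself) of a direct-output committer whose abortTask actually deletes the files a failed attempt wrote, so a retried task does not trip over FileAlreadyExistsException. The class name and the part-file glob are assumptions for illustration; a real committer would track the exact files each attempt created.

    // Hedged sketch only: a Hadoop OutputCommitter that writes task output
    // straight to the final directory (no _temporary staging) but, unlike an
    // empty abortTask, cleans up after a failed attempt before it is retried.
    import org.apache.hadoop.fs.{FileStatus, Path}
    import org.apache.hadoop.mapreduce.{JobContext, OutputCommitter, TaskAttemptContext}

    class CleaningDirectCommitter(outputPath: Path) extends OutputCommitter {
      override def setupJob(jobContext: JobContext): Unit = {}
      override def setupTask(taskContext: TaskAttemptContext): Unit = {}
      // Output goes directly to outputPath, so there is nothing to move on commit.
      override def needsTaskCommit(taskContext: TaskAttemptContext): Boolean = false
      override def commitTask(taskContext: TaskAttemptContext): Unit = {}

      // The point of the sketch: remove whatever this attempt wrote, instead of
      // leaving it behind to collide with the retry.
      override def abortTask(taskContext: TaskAttemptContext): Unit = {
        val fs = outputPath.getFileSystem(taskContext.getConfiguration)
        val taskId = taskContext.getTaskAttemptID.getTaskID.getId
        // Assumed part-file naming scheme; a production committer would record
        // the exact files it created rather than rely on a glob.
        val pattern = new Path(outputPath, f"part-r-$taskId%05d-*")
        Option(fs.globStatus(pattern)).getOrElse(Array.empty[FileStatus])
          .foreach(status => fs.delete(status.getPath, false))
      }
    }

This is only meant to illustrate what a non-empty abortTask would do; SPARK-8413 is the place to follow for the actual fix in Spark.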