Have you tried spark.hadoop.validateOutputSpecs? A quick sketch of setting it is below.
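For reference, a minimal sketch of turning that property off on a SparkConf (the app name is illustrative; this validation only affects the saveAsHadoopFile-style output checks, it does not change how individual task files are handled on retry):

    // Minimal sketch: disable output-spec validation so a write does not fail
    // just because the output directory already exists. App name is made up.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("parquet-retry-example")                // illustrative name
      .set("spark.hadoop.validateOutputSpecs", "false")   // default is "true"

    val sc = new SparkContext(conf)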
On 01-Mar-2016 9:43 pm, "Peter Halliday" <pjh...@cornell.edu> wrote:

> http://pastebin.com/vbbFzyzb
>
> The problem seems to be twofold. First, the ParquetFileWriter in
> parquet-hadoop accepts an overwrite flag that Spark doesn't allow to be
> set. Second, the DirectParquetOutputCommitter has an abortTask that's
> empty. I see SPARK-8413 open on this too, but no plans on changing it.
> I'm surprised not to see this fixed yet.
>
> Peter Halliday
>
>
> On Mar 1, 2016, at 10:01 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> Do you mind pastebin'ning the stack trace with the error so that we know
> which part of the code is under discussion?
>
> Thanks
>
> On Tue, Mar 1, 2016 at 7:48 AM, Peter Halliday <pjh...@cornell.edu> wrote:
>
>> I have a Spark application where a task seems to fail, but it actually
>> did write out some of the files assigned to it. Spark then assigns that
>> task to another executor, which gets a FileAlreadyExistsException. The
>> Hadoop code seems to allow for files to be overwritten, but I see the
>> 1.5.1 version of this code doesn't allow that to be passed in. Is that
>> correct?
>>
>> Peter Halliday
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
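To make the abortTask gap concrete: below is a hedged sketch (not Spark's DirectParquetOutputCommitter itself) of a direct-output committer whose abortTask actually deletes the files a failed attempt wrote, so a retried task does not trip over FileAlreadyExistsException. The class name and the part-file glob are assumptions for illustration; a real committer would track the exact files each attempt created.

    // Hedged sketch only: a Hadoop OutputCommitter that writes task output
    // straight to the final directory (no _temporary staging) but, unlike an
    // empty abortTask, cleans up after a failed attempt before it is retried.
    import org.apache.hadoop.fs.{FileStatus, Path}
    import org.apache.hadoop.mapreduce.{JobContext, OutputCommitter, TaskAttemptContext}

    class CleaningDirectCommitter(outputPath: Path) extends OutputCommitter {
      override def setupJob(jobContext: JobContext): Unit = {}
      override def setupTask(taskContext: TaskAttemptContext): Unit = {}
      // Output goes directly to outputPath, so there is nothing to move on commit.
      override def needsTaskCommit(taskContext: TaskAttemptContext): Boolean = false
      override def commitTask(taskContext: TaskAttemptContext): Unit = {}

      // The point of the sketch: remove whatever this attempt wrote, instead of
      // leaving it behind to collide with the retry.
      override def abortTask(taskContext: TaskAttemptContext): Unit = {
        val fs = outputPath.getFileSystem(taskContext.getConfiguration)
        val taskId = taskContext.getTaskAttemptID.getTaskID.getId
        // Assumed part-file naming scheme; a production committer would record
        // the exact files it created rather than rely on a glob.
        val pattern = new Path(outputPath, f"part-r-$taskId%05d-*")
        Option(fs.globStatus(pattern)).getOrElse(Array.empty[FileStatus])
          .foreach(status => fs.delete(status.getPath, false))
      }
    }

This is only meant to illustrate what a non-empty abortTask would do; SPARK-8413 is the place to follow for the actual fix in Spark.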