I haven’t tried spark.hadoop.validateOutputSpecs. However, that setting seems
to concern the existence of the output directory itself, not the individual
files. Maybe I’m wrong?
Peter
> On Mar 1, 2016, at 11:53 AM, Sabarish Sasidharan wrote:
>
> Have you tried spark.hadoop.validateOutputSpecs?
>
Have you tried spark.hadoop.validateOutputSpecs?
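For reference, it would be set on the job's SparkConf, roughly like the
sketch below (the app name and output path are placeholders). Note it
governs the directory-level existence check, not per-file overwrites:

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: disable Spark's output-spec validation so saveAsTextFile /
    // saveAsHadoopFile-style actions can write into a directory that
    // already exists.
    val conf = new SparkConf()
      .setAppName("validate-output-specs-example") // placeholder name
      .set("spark.hadoop.validateOutputSpecs", "false")
    val sc = new SparkContext(conf)

    sc.parallelize(Seq("a", "b", "c"))
      .saveAsTextFile("hdfs:///tmp/existing-output-dir") // placeholder path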
On 01-Mar-2016 9:43 pm, "Peter Halliday" wrote:
http://pastebin.com/vbbFzyzb
The problem seems to be twofold. First, the ParquetFileWriter in Hadoop
allows for an overwrite flag that Spark doesn’t allow to be set. The second is
that the DirectParquetOutputCommitter has an abortTask that’s empty. I see
SPARK-8413 open on this too.
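To make the second point concrete, an abortTask that actually cleaned up
would have to delete whatever the failed attempt wrote. A rough sketch
against the Hadoop OutputCommitter API follows; the leftover-file lookup is
a hypothetical illustration, not Spark's actual committer code:

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.mapreduce.TaskAttemptContext
    import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter

    // Sketch only: unlike the empty abortTask under discussion, this
    // committer removes the files a failed task attempt left behind, so a
    // retried attempt does not trip over them.
    class CleaningCommitter(outputPath: Path, context: TaskAttemptContext)
        extends FileOutputCommitter(outputPath, context) {

      override def abortTask(taskContext: TaskAttemptContext): Unit = {
        val fs = outputPath.getFileSystem(taskContext.getConfiguration)
        // Hypothetical naming scheme: assume files written directly into
        // the output directory embed the task id, and delete any that match.
        val taskId = taskContext.getTaskAttemptID.getTaskID.toString
        val leftovers = fs.globStatus(new Path(outputPath, "*" + taskId + "*"))
        if (leftovers != null) {
          leftovers.foreach(status => fs.delete(status.getPath, false))
        }
      }
    }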
Do you mind pastebin'ning the stack trace with the error so that we know
which part of the code is under discussion?
Thanks
On Tue, Mar 1, 2016 at 7:48 AM, Peter Halliday wrote:
I have a Spark application where a task seems to fail, but the task actually
did write out some of the files that were assigned to it. Spark then assigns
the task to another executor, which gets a FileAlreadyExistsException. The
Hadoop code seems to allow for files to be overwritten, but I see the 1.5.
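For what it's worth, the distinction shows up directly in Hadoop's
FileSystem API, where create takes an overwrite flag. A small sketch of the
retry failure mode (the path is hypothetical):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Sketch of the failure mode: attempt 0 writes a part file, then a
    // retried attempt tries to create the same path with overwrite = false.
    val fs = FileSystem.get(new Configuration())
    val part = new Path("/output/part-00000") // hypothetical path

    val first = fs.create(part, false) // overwrite = false
    first.writeBytes("data from attempt 0")
    first.close()

    // The retry typically throws FileAlreadyExistsException here, whereas
    // create(part, true) would overwrite the existing file instead.
    val retry = fs.create(part, false)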