Re: DirectFileOutputCommitter in Spark 2.3.1

2018-09-19 Thread Dillon Dukek
I believe you need to set mapreduce.fileoutputcommitter.algorithm.version
to 2.
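
A minimal sketch of how that might look, assuming the property is passed
through Spark's "spark.hadoop." prefix when the session is built. The app
name, the sample data, and the output path are hypothetical placeholders,
not taken from the original job. Note that algorithm version 2 is not a
true direct committer: tasks still write to a temporary attempt directory,
but each task commit moves its files straight into the destination, so the
final job commit has far less renaming to do.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.col

object CommitterV2Sketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("committer-v2-sketch") // hypothetical app name
      // Forwarded to the Hadoop configuration as
      // mapreduce.fileoutputcommitter.algorithm.version=2
      .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
      .getOrCreate()

    // Hypothetical sample data with one partition column.
    val df = spark.range(10).withColumn("part", col("id") % 2)

    // Hypothetical output path; substitute the real S3 location.
    df.write
      .mode(SaveMode.Append)
      .partitionBy("part")
      .parquet("/tmp/committer-v2-sketch")

    spark.stop()
  }
}
```

The same property can presumably also be supplied at submit time with
--conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2
instead of in code.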

On Wed, Sep 19, 2018 at 10:45 AM Priya Ch wrote:

> Hello Team,
>
> I am trying to write a Dataset as Parquet files in Append mode, partitioned
> by a few columns. Since the job is time-consuming, I would like to
> enable DirectFileOutputCommitter (i.e., bypassing the writes to a temporary
> folder).
>
> The version of Spark I am using is 2.3.1.
>
> Can someone please help me enable the configuration that allows direct
> writes to S3 when appending, writing new files, and overwriting
> existing files?
>
> Thanks,
> Padma CH
>


DirectFileOutputCommitter in Spark 2.3.1

2018-09-19 Thread Priya Ch
Hello Team,

I am trying to write a Dataset as Parquet files in Append mode, partitioned
by a few columns. Since the job is time-consuming, I would like to
enable DirectFileOutputCommitter (i.e., bypassing the writes to a temporary
folder).

The version of Spark I am using is 2.3.1.

Can someone please help me enable the configuration that allows direct
writes to S3 when appending, writing new files, and overwriting
existing files?

Thanks,
Padma CH