Hi Michael,
Thanks for the hint! So if I turn off speculation, consecutive appends like
the ones above will not produce temporary files, right?
Which class is responsible for disabling the use of DirectOutputCommitter?
Thank you,
Jerry
On Tue, Jan 12, 2016 at 4:12 PM, Michael Armbrust wrote:
There can be data loss when you are using the DirectOutputCommitter and
speculation is turned on, so we disable it automatically.
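
For reference, the setup being discussed can be sketched roughly as below. This is only an illustration, not a confirmed fix: the application name is hypothetical, the committer class name is the one quoted later in this thread, and it assumes a Spark 1.x build where `spark.sql.*` entries set on the SparkConf are picked up by the SQLContext.

```scala
import org.apache.spark.SparkConf

// Hypothetical application name, used only for illustration.
val conf = new SparkConf()
  .setAppName("append-to-parquet")
  // Turn speculative execution off; with it on, the direct committer
  // is disabled automatically (as described above).
  .set("spark.speculation", "false")
  // Committer class as quoted in this thread; the package may differ
  // between Spark versions, so treat this value as an assumption.
  .set("spark.sql.parquet.output.committer.class",
       "org.apache.spark.sql.parquet.DirectParquetOutputCommitter")
```

Whether appends then skip the temporary files still depends on how the writer picks its committer in the Spark version in use.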
On Tue, Jan 12, 2016 at 1:11 PM, Jerry Lam wrote:
Hi spark users and developers,
I wonder if the following observed behaviour is expected. I'm writing a
DataFrame to Parquet on S3, and I'm using append mode when I write to it.
Since I'm using org.apache.spark.sql.parquet.DirectParquetOutputCommitter as
the spark.sql.parquet.output.committer.clas