Re: saveAsNewAPIHadoopDataset must not enable speculation for parquet file?

2018-04-26 Thread cane

Thanks, Steve! I will study the links you mentioned.

Re: saveAsNewAPIHadoopDataset must not enable speculation for parquet file?

2018-04-26 Thread Steve Loughran

Sorry, I had not noticed this follow-up; I have been busy with other issues.

On 3 Apr 2018, at 11:19, cane <zhoukang199...@gmail.com> wrote:

> Now, if we use saveAsNewAPIHadoopDataset with speculation enabled, it may cause data loss. I checked the comment on this API: We should make sure our tasks are idemp

Re: saveAsNewAPIHadoopDataset must not enable speculation for parquet file?

2018-04-07 Thread 周康

I have observed this. If the job commit runs on the driver while a task commit runs on an executor, then with speculation enabled it may cause data loss. The reason is that the job commit calls listStatus, while the task commit deletes the output file if it already exists and then renames its own file to the final output. When listStatus is called after the delete and before re
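For illustration only, here is a minimal sketch of the interleaving described above. It assumes the truncated sentence refers to the rename step. The names FakeFS, commit_task_steps, and run_scenario are hypothetical, and the in-memory file system is a stand-in; this is not Spark or Hadoop code.

```python
class FakeFS:
    """Hypothetical in-memory stand-in for a file system: path -> contents."""

    def __init__(self):
        self.files = {}

    def list_status(self, prefix):
        # Return every path currently visible under the prefix.
        return sorted(p for p in self.files if p.startswith(prefix))

    def delete(self, path):
        self.files.pop(path, None)

    def rename(self, src, dst):
        self.files[dst] = self.files.pop(src)


def commit_task_steps(fs, attempt_path, final_path):
    """Task commit as described in the message: delete, then rename.

    Yields after each step so the caller can interleave a listing.
    """
    if final_path in fs.files:
        fs.delete(final_path)
    yield "deleted"
    fs.rename(attempt_path, final_path)
    yield "renamed"


def run_scenario(list_between_steps):
    """Return what a job-commit listing of the output directory would see."""
    fs = FakeFS()
    # An earlier attempt has already committed this partition.
    fs.files["out/part-00000"] = "rows from attempt 0"
    # A speculative attempt of the same task is about to commit.
    fs.files["tmp/attempt_1/part-00000"] = "rows from attempt 1"

    steps = commit_task_steps(fs, "tmp/attempt_1/part-00000", "out/part-00000")
    next(steps)  # the delete has happened, the rename has not
    seen = fs.list_status("out/") if list_between_steps else None
    next(steps)  # the rename completes
    if seen is None:
        seen = fs.list_status("out/")
    return seen


if __name__ == "__main__":
    print("listing between delete and rename:", run_scenario(True))
    print("listing after rename:", run_scenario(False))
```

In this sketch a listing taken inside the window sees no file for the partition, while a listing taken after the rename sees it. A job commit that acts on the first listing would leave that partition out.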

Re: saveAsNewAPIHadoopDataset must not enable speculation for parquet file?

2018-04-03 Thread Steve Loughran

> On 3 Apr 2018, at 11:19, cane wrote:
>
> Now, if we use saveAsNewAPIHadoopDataset with speculation enabled, it may cause
> data loss.
> I checked the comment on this API:
>
> We should make sure our tasks are idempotent when speculation is enabled,
> i.e. do
> not use output committer that w

saveAsNewAPIHadoopDataset must not enable speculation for parquet file?

2018-04-03 Thread cane

Now, if we use saveAsNewAPIHadoopDataset with speculation enabled, it may cause data loss. I checked the comment on this API: We should make sure our tasks are idempotent when speculation is enabled, i.e. do not use output committer that writes data directly. There is an example in https://
