Re: Spark output data to S3 is very slow

2016-09-17 Thread Qiang Li
I tried several times and it is as slow as before. As a temporary solution, I will have Spark write the output to HDFS and then sync the data to S3. Thank you. On Sat, Sep 17, 2016 at 10:43 AM, Takeshi Yamamuro wrote: > Hi, > > Have you seen the previous thread? >
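
For readers who want to try the same workaround (write to HDFS first, then copy to S3), here is a minimal sketch in Python. It only wraps the standard `hadoop distcp` command. The paths, the bucket name, and the helper names `build_distcp_command` and `sync_to_s3` are hypothetical and are not taken from this thread. It also assumes the `hadoop` CLI is on the PATH and that S3 credentials are already configured for the `s3a://` scheme.

```python
import subprocess
from typing import List


def build_distcp_command(hdfs_dir: str, s3_dir: str) -> List[str]:
    """Build a `hadoop distcp` command that copies hdfs_dir to s3_dir.

    Both arguments are plain URIs; nothing is validated here.
    """
    return ["hadoop", "distcp", hdfs_dir, s3_dir]


def sync_to_s3(hdfs_dir: str, s3_dir: str, dry_run: bool = True) -> List[str]:
    """Copy finished job output from HDFS to S3.

    With dry_run=True the command is only returned, not executed, so the
    sketch can be inspected on a machine without Hadoop installed.
    """
    cmd = build_distcp_command(hdfs_dir, s3_dir)
    if not dry_run:
        # Raises CalledProcessError if distcp exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Step 1 (in the Spark job, not shown here): write to HDFS, for example
    #   df.write.parquet("hdfs:///tmp/job-output")   # hypothetical path
    # Step 2: copy the committed output to S3 in a separate pass.
    print(sync_to_s3("hdfs:///tmp/job-output", "s3a://example-bucket/job-output"))
```

The idea is that the job's final rename step happens on HDFS, and the copy to S3 runs afterwards as a separate step. Whether this is faster overall depends on the cluster and the data volume.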

Re: Spark output data to S3 is very slow

2016-09-16 Thread Takeshi Yamamuro
Hi, Have you seen the previous thread? https://www.mail-archive.com/user@spark.apache.org/msg56791.html // maropu On Sat, Sep 17, 2016 at 11:34 AM, Qiang Li wrote: > Hi, > > > I ran some jobs with Spark 2.0 on YARN. All tasks finished very > quickly, but the last

Spark output data to S3 is very slow

2016-09-16 Thread Qiang Li
Hi, I ran some jobs with Spark 2.0 on YARN. All tasks finished very quickly, but in the last step Spark spends a lot of time renaming or moving data from the S3 temporary directory to the final directory. I then tried to set