RE: Spark Tasks on second node never return in Yarn when I have more than 1 task node

2015-11-24 Thread Shuai Zheng
[mailto:jonathaka...@gmail.com]
Sent: Thursday, November 19, 2015 6:54 PM
To: Shuai Zheng
Cc: user
Subject: Re: Spark Tasks on second node never return in Yarn when I have more than 1 task node
I don't know if this actually has anything to do with why your job is hanging, but since you are using EMR you

Spark Tasks on second node never return in Yarn when I have more than 1 task node

2015-11-19 Thread Shuai Zheng
Hi All, I am facing a very weird case. I have already simplified the scenario as much as possible so that anyone can reproduce it. My env: AWS EMR 4.1.0, Spark 1.5. My code runs without any problem in local mode, and it has no problem when it runs on an EMR cluster with one

Re: Spark Tasks on second node never return in Yarn when I have more than 1 task node

2015-11-19 Thread Jonathan Kelly
I don't know if this actually has anything to do with why your job is hanging, but since you are using EMR you should probably not set those fs.s3 properties but rather let it use EMRFS, EMR's optimized Hadoop FileSystem implementation for interacting with S3. One benefit is that it will