RE: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yu, Yucai
It is possible this is not the first failure. Could you increase the setting below and rerun?

    spark.yarn.executor.memoryOverhead 4096

In my experience, Netty sometimes uses a lot of off-heap memory, which can push the executor past the container memory limit and get it killed by YARN's NodeManager. Thanks,
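For context, a minimal sketch of how this overhead might be raised at submission time, assuming the job is launched with spark-submit on YARN in cluster mode (the executor memory value and the jar name are placeholders, not from this thread):

    # spark.yarn.executor.memoryOverhead is the per-executor off-heap headroom in MB;
    # raising it leaves room for Netty's off-heap buffers before YARN kills the container.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --executor-memory 8g \
      --conf spark.yarn.executor.memoryOverhead=4096 \
      your-spark-sql-job.jar

The same key can also be set in spark-defaults.conf; either way it only takes effect when the application is submitted, not while it is running.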

Re: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yash Sharma
Hi Yucai, Thanks for the info. I have explored the container logs but did not get much information from them. I have seen the errors below for a few containers but am not sure of their cause.

1. java.lang.NullPointerException (DiskBlockManager.scala:167)
2. java.lang.ClassCastException:

RE: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yu, Yucai
Hi Yash, How about checking the executor (YARN container) logs? Most of the time they show more details. We are using CDH, and the logs are at:

[yucai@sr483 container_1457699919227_0094_01_14]$ pwd
/mnt/DP_disk1/yucai/yarn/logs/application_1457699919227_0094/container_1457699919227_0094_01_14
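If reaching the node-local directories is inconvenient, and assuming YARN log aggregation is enabled on the cluster, the container logs for a finished application can also be pulled with the YARN CLI (the application ID below is the one from the path above):

    # dump stdout/stderr of all containers for the application, including killed ones
    yarn logs -applicationId application_1457699919227_0094 > app_0094_containers.log

Messages such as "Container killed by YARN for exceeding memory limits" in that output would point back to the spark.yarn.executor.memoryOverhead suggestion earlier in the thread.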