Regarding your second problem, my best guess is that you’re seeing GC pauses. That’s not unusual given that you’re using 40GB heaps. See, for instance, this blog post: <http://gridgain.blogspot.ch/2014/06/jdk-g1-garbage-collector-pauses-for.html>

> From conducting numerous tests, we have concluded that unless you are utilizing some off-heap technology (e.g. GridGain OffHeap), no Garbage Collector provided with JDK will render any kind of stable GC performance with heap sizes larger than 16GB. For example, on 50GB heaps we can often encounter up to 5 minute GC pauses, with average pauses of 2 to 4 seconds.

Not sure if YARN can do this, but I would try running with a smaller executor heap and more executors per node.

iulian
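
P.S. Here is a rough sketch of the arithmetic behind that suggestion, in case it helps. It splits one large per-node heap into several executors that each stay under a chosen cap, then prints the matching `spark-submit` flags. The helper names, the 4-node cluster, and the use of 16GB as the cap (taken from the blog post quoted above) are assumptions for illustration only, not tested recommendations.

```python
import math


def split_executor_heap(total_heap_gb, max_heap_gb):
    """Split one big per-node heap into equal executors no larger than max_heap_gb.

    Both arguments are whole gigabytes. Returns (executors_per_node, heap_gb).
    Hypothetical helper; it ignores any per-container memory overhead.
    """
    if total_heap_gb <= 0 or max_heap_gb <= 0:
        raise ValueError("heap sizes must be positive")
    executors = math.ceil(total_heap_gb / max_heap_gb)
    heap_gb = total_heap_gb // executors  # round down so the node budget is not exceeded
    return executors, heap_gb


def spark_submit_flags(nodes, total_heap_gb, max_heap_gb):
    """Build spark-submit flags for a cluster of `nodes` identical machines (assumed)."""
    per_node, heap_gb = split_executor_heap(total_heap_gb, max_heap_gb)
    return [
        "--num-executors", str(nodes * per_node),  # total across the cluster, not per node
        "--executor-memory", f"{heap_gb}g",
    ]


if __name__ == "__main__":
    # 40GB per node, capped at 16GB per executor, on an assumed 4-node cluster.
    print(split_executor_heap(40, 16))
    print(" ".join(spark_submit_flags(4, 40, 16)))
```

Keep in mind that `--num-executors` is a cluster-wide total and YARN decides where the containers land, so this only gives you several executors per node if the node has room for them. YARN also adds some memory overhead on top of each executor's heap, so the real footprint per node will be a bit larger than the numbers above.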
