Hello! I am using hadoop-2.7.1. I have a large map job running (about 3000 cores available on the cluster in total, 35000 tasks in total). In the middle of this job, one server rebooted.
After the reboot, the nodemanager starts successfully and registers with the resource manager:

2015-09-23 01:06:24,656 INFO [main] nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:registerWithRM(311)) - Notifying ContainerManager to unblock new container-requests

In the YARN web interface I see this host as active, but "VCores used" remains zero (see screenshot). The map job mentioned above is still running and has about 12000 pending tasks. Why does this host not receive any tasks to run?

PS: I recently upgraded from 2.4.1, and I did not notice this problem with 2.4.1: new tasks were spawned immediately after a reboot.

Thanks!
