Hi,
I'm running 2.2.0 clusters, and my application is fairly disk-I/O intensive
(it processes huge zip files). Over time I have seen some jobs fail with
"no space on disk" errors. Normally the leftover files get cleaned up, but
when for some reason they are not, I would expect no new tasks to be
scheduled on that node; in reality, though, new tasks keep landing on that
node and keep failing. My application writes its data to /tmp (which is
where the disk can fill up), so I configured the properties below:
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/scratch/usr/software/hadoop2/hadoop-dc/temp/nm-local-dir,/tmp/nm-local-dir</value>
</property>
<property>
  <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
  <value>1.0</value>
</property>
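As a side note, I assume the node's health status as seen by the
ResourceManager can be checked with the yarn CLI roughly like this (the node
id below is just a placeholder for my NodeManager's id):

  yarn node -list
  yarn node -status <nm-host>:<nm-port>

and that the Health-Report field in the -status output would reflect the
disk checker's verdict.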
I have /tmp/nm-local-dir as part of yarn.nodemanager.local-dirs, and the
documentation for yarn.nodemanager.disk-health-checker.min-healthy-disks
says:

  "The minimum fraction of number of disks to be healthy for the nodemanager
  to launch new containers. This correspond to both
  yarn-nodemanager.local-dirs and yarn.nodemanager.log-dirs. i.e. If there
  are less number of healthy local-dirs (or log-dirs) available, then new
  containers will not be launched on this node."
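If I'm reading that right, my expectation is roughly:

  configured local-dirs = 2
  min-healthy-disks     = 1.0  (i.e. all dirs must be healthy)
  /tmp/nm-local-dir out of space -> healthy dirs = 1/2 = 0.5 < 1.0
  -> no new containers should be launched on that node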
Did I miss anything?
--
--Anfernee