Alejandro Fernandez created AMBARI-12252:
--------------------------------------------
Summary: Prevent datanode from creating datadir that becomes unmounted
Key: AMBARI-12252
URL: https://issues.apache.org/jira/browse/AMBARI-12252
Project: Ambari
Issue Type: Bug
Components: ambari-agent
Affects Versions: 1.7.0
Reporter: Alejandro Fernandez
Assignee: Alejandro Fernandez
Priority: Critical
Fix For: 2.1.0
This is related to AMBARI-7506.
Ambari keeps track of a file, /etc/hadoop/conf/dfs_data_dir_mount.hist
that contains a mapping of HDFS data dirs to the last known mount point.
This is used to detect when a data dir becomes unmounted, in order to prevent
HDFS from writing to the root partition.
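
As a rough illustration of the mapping described above, the sketch below
parses such a history file into a dictionary. The on-disk layout (one
"data_dir,mount_point" pair per line, with "#" starting a comment) and the
name parse_mount_history are assumptions made for this example; they are not
confirmed by this issue.

```python
def parse_mount_history(text):
    """Return {data_dir: last_known_mount_point} from the file's text.

    Assumed (hypothetical) format: one "data_dir,mount_point" pair per line;
    blank lines and lines starting with "#" are ignored.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        data_dir, sep, mount_point = line.partition(",")
        data_dir, mount_point = data_dir.strip(), mount_point.strip()
        # Skip malformed lines rather than guessing at their meaning.
        if sep and data_dir and mount_point:
            mapping[data_dir] = mount_point
    return mapping
```

With this layout, a line such as "/grid/0/data,/grid/0" would record that
/grid/0/data was last seen on the /grid/0 mount.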
Consider the example of a DataNode configured with these volumes:
/dev/sda -> /
/dev/sdb -> /grid/0
/dev/sdc -> /grid/1
/dev/sdd -> /grid/2
Typically, each /grid/#/ directory contains a data folder.
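
One way to find which mount point currently backs a data dir is to walk up
the path until a mount point is reached. This is a minimal sketch, not
Ambari's actual code: get_mount_point is a hypothetical helper, and the
is_mount parameter exists only so the check can be replaced in tests.

```python
import os


def get_mount_point(path, is_mount=os.path.ismount):
    """Walk up from path until a directory that is a mount point is found."""
    current = os.path.abspath(path)
    while not is_mount(current):
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the top of the filesystem; nothing further to check.
            break
        current = parent
    return current
```

With the volumes above, /grid/0/data resolves to /grid/0 while /dev/sdb is
mounted, and to / once it has been unmounted.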
Today, if a data directory becomes unmounted, the directory no longer exists
and Ambari does not create it automatically. Ambari simply logs a warning and
updates its cache with the new mount point, which is /. That cache update is
the underlying bug.
If hdfs-site contains dfs.datanode.failed.volumes.tolerated with a value > 0,
then DataNode will tolerate the failure; otherwise, the DataNode will die.
Because Ambari will already have "/" in its cache file, the fact that the data
dir used to be mounted on a non-root drive is lost. The next time DataNode is
restarted, Ambari will create the data dir, which now resides on the root
partition. This is really bad because HDFS will then fill up the root drive.
The admin can still remount the partition, but then needs to restart DataNode
so Ambari can update its cache.
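
The behavior this issue asks for can be sketched as follows: when a data dir
was last seen on a non-root mount and now resolves to /, skip creating it and
leave the cached entry unchanged. This is only an illustration under stated
assumptions; plan_data_dirs and resolve_mount are hypothetical names, and the
actual fix may differ.

```python
def plan_data_dirs(data_dirs, history, resolve_mount):
    """Decide which data dirs are safe to create.

    history maps data_dir -> last known mount point.
    resolve_mount(data_dir) returns the current mount point.
    Returns (dirs_to_create, updated_history).
    """
    to_create = []
    new_history = dict(history)
    for data_dir in data_dirs:
        last = history.get(data_dir)
        current = resolve_mount(data_dir)
        if last is not None and last != "/" and current == "/":
            # Looks unmounted: do not create the dir on the root partition,
            # and keep the old record so later restarts still detect it.
            continue
        to_create.append(data_dir)
        new_history[data_dir] = current
    return to_create, new_history
```

Because the old entry is preserved, repeated DataNode restarts keep skipping
the directory until the partition is remounted.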