[
https://issues.apache.org/jira/browse/AMBARI-12252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611612#comment-14611612
]
Hadoop QA commented on AMBARI-12252:
------------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12743229/AMBARI-12252.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new
or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results:
https://builds.apache.org/job/Ambari-trunk-test-patch/3334//testReport/
Console output:
https://builds.apache.org/job/Ambari-trunk-test-patch/3334//console
This message is automatically generated.
> Prevent datanode from creating an HDFS datadir when drive becomes unmounted
> ---------------------------------------------------------------------------
>
> Key: AMBARI-12252
> URL: https://issues.apache.org/jira/browse/AMBARI-12252
> Project: Ambari
> Issue Type: Bug
> Components: ambari-agent
> Affects Versions: 1.7.0
> Reporter: Alejandro Fernandez
> Assignee: Alejandro Fernandez
> Priority: Critical
> Fix For: 2.1.0
>
> Attachments: AMBARI-12252.branch-2.1.patch, AMBARI-12252.patch
>
>
> This is related to AMBARI-7506
> Ambari keeps track of a file, /etc/hadoop/conf/dfs_data_dir_mount.hist
> that contains a mapping of HDFS data dirs to the last known mount point.
> This is used to detect when a data dir becomes unmounted, in order to prevent
> HDFS from writing to the root partition.
> Consider the example of a DataNode configured with these volumes:
> /dev/sda -> /
> /dev/sdb -> /grid/0
> /dev/sdc -> /grid/1
> /dev/sdd -> /grid/2
> Typically, each /grid/#/ directory contains a data folder.
> Today, if a data directory becomes unmounted, the directory will not exist
> and Ambari will not create it automatically. Ambari simply logs a warning
> and updates its cache with the new mount point, which is /; that is the
> underlying bug.
> If hdfs-site contains dfs.datanode.failed.volumes.tolerated with a value > 0,
> then the DataNode will tolerate the failure; otherwise, the DataNode will die.
> Because Ambari will already have "/" in its cache file, the fact that the
> data dir used to be on a non-root mount is lost, so the next time the DataNode
> is restarted, Ambari will create the data dir, which now resides on the root
> partition; this is really bad because HDFS will now fill up the root drive.
> The admin can still remount the partition, but then needs to restart the
> DataNode so Ambari can update its cache.
> The ideal way to fix this in Ambari 2.2 is as follows:
> * Track which data dirs the admin wants mounted on a non-root partition. If
> the admin wishes all data dirs to be on non-root mounts, but the initial
> install is incorrect, then this should be reported as a problem.
> * Keep the history of the mount points in the database. Today, if the cache
> file is deleted or the host reimaged, then this information is lost.
> * Introduce a new state between FAILED and COMPLETED, such as
> COMPLETED_WITH_ERRORS, that will allow tasks to look different in the UI,
> so the user can clearly detect when a critical but non-fatal error happened.
> * Plug in to the Alert Framework
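
The check described in the issue can be sketched as follows. This is only an illustration of the idea, not the code in the attached patch: the function names (find_mount_point, plan_data_dirs) are hypothetical, and because the format of dfs_data_dir_mount.hist is not described here, the history is represented as a plain dict of data dir to last known mount point. The rule is that a data dir whose last known mount was non-root but which now resolves to "/" is neither created nor overwritten in the history.

```python
import os


def find_mount_point(path, is_mount=os.path.ismount):
    """Walk up from path until a mount point is found (assumes POSIX paths)."""
    path = os.path.abspath(path)
    while not is_mount(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def plan_data_dirs(data_dirs, last_known, is_mount=os.path.ismount):
    """Decide which data dirs are safe to create.

    last_known maps each data dir to its last known mount point
    (hypothetical in-memory form of the history file).
    Returns (dirs_to_create, dirs_skipped, updated_history).
    """
    to_create, skipped = [], []
    history = dict(last_known)
    for data_dir in data_dirs:
        current = find_mount_point(data_dir, is_mount)
        previous = last_known.get(data_dir)
        if previous not in (None, "/") and current == "/":
            # The drive appears to have been unmounted: do not create the
            # dir on the root partition, and keep the old history entry.
            skipped.append(data_dir)
            continue
        to_create.append(data_dir)
        history[data_dir] = current
    return to_create, skipped, history
```

With the example volumes above and /dev/sdc unmounted, /grid/1/data would be skipped and its history entry would still read /grid/1, so the information about the original mount survives a DataNode restart. The is_mount parameter exists only so the logic can be exercised without real mounts.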