> On Sept. 22, 2015, 10:31 p.m., Sumit Mohanty wrote:
> > ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_datanode_unmounted_data_dir.py, line 77
> > <https://reviews.apache.org/r/38651/diff/1/?file=1081590#file1081590line77>
> >
> >     Will it result in an alert after Ambari upgrade? Not sure if requiring
> >     DN restart to get rid of an alert is a good idea?
>
> Alejandro Fernandez wrote:
>     No upgrade is needed to pick up added alert definitions in Ambari 2.1;
>     ambari-server actually loads them from the json file on start.
>     It checks if the history file exists, if the data dirs exist, and if it's
>     possible for the data dirs to have become unmounted.
>     One way to fix the missing history file or missing data dir is to restart
>     DN, but that's not necessarily required.
>
> Sumit Mohanty wrote:
>     What I meant is when I upgrade from Ambari-2.1.0 to 2.1.2 then the
>     history file will not exist. Will we see a WARN alert?

The history file was added in either Ambari 1.7.0 or 2.0.0, and it is created
the first time that DataNode starts. This means that existing clusters should
not see any warnings; warnings only show up during the installation of a
brand new cluster.


- Alejandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38651/#review100084
-----------------------------------------------------------


On Sept. 22, 2015, 10:17 p.m., Alejandro Fernandez wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38651/
> -----------------------------------------------------------
>
> (Updated Sept. 22, 2015, 10:17 p.m.)
>
>
> Review request for Ambari, Andrew Onischuk, Dmitro Lisnichenko, Jonathan
> Hurley, Nate Cole, Sumit Mohanty, Srimanth Gunturi, and Sid Wagle.
>
>
> Bugs: AMBARI-13194
>     https://issues.apache.org/jira/browse/AMBARI-13194
>
>
> Repository: ambari
>
>
> Description
> -------
>
> Ambari uses the dfs.datanode.data.dir.mount.file property in HDFS, whose
> value is typically /etc/hadoop/conf/dfs_data_dir_mount.hist,
> to track the mount points for each of the data dirs.
>
> E.g.,
> {code}
> /hadoop01/data,/device1
> /hadoop02/data,/device2
> /hadoop03/data,/   # this one is on root, the others are all on mount points.
> {code}
>
> Whenever a drive becomes unmounted, Ambari detects that it was previously on
> a mount and will not create that data dir; HDFS can still tolerate the
> failure if dfs.datanode.failed.volumes.tolerated is greater than 0.
> Now, if the /etc/hadoop/conf/dfs_data_dir_mount.hist file is deleted, then
> Ambari won't have this knowledge, and will create the data dir (even if it's
> on the root partition).
>
> To improve tracking, create an alert definition that checks the following
> * warning status if the /etc/hadoop/conf/dfs_data_dir_mount.hist file is
>   deleted
> * critical status if at least one of the data dirs is mounted on the root
>   partition, and at least one data dir is on a mount
>
>
> Diffs
> -----
>
>   ambari-common/src/main/python/resource_management/core/providers/system.py 213adc5
>   ambari-common/src/main/python/resource_management/libraries/functions/dfs_datanode_helper.py a05e162
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json 477fd95
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/alerts/alert_datanode_unmounted_data_dir.py PRE-CREATION
>   ambari-server/src/test/python/stacks/2.0.6/HDFS/test_alert_datanode_unmounted_data_dir.py PRE-CREATION
>
> Diff: https://reviews.apache.org/r/38651/diff/
>
>
> Testing
> -------
>
> * Python unit tests passed
> * Verified that the alert worked on several hosts for all 3 types of statuses
>   (WARNING, CRITICAL, OK)
> * Also checked that it did not run on a host without DataNode, and it did run
>   once I added DataNode to that host
>
>
> Thanks,
>
> Alejandro Fernandez
>
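
For anyone following the thread without the diff open, the two checks listed in the description can be sketched roughly as below. This is only an illustration of the rules as worded in the review request, not the code in alert_datanode_unmounted_data_dir.py. The function names (parse_mount_history, evaluate_status) and the shape of the inputs are assumptions made for this example.

```python
# Hypothetical sketch; names and signatures are NOT taken from the Ambari patch.

def parse_mount_history(text):
    """Parse 'data_dir,mount_point' lines into a dict.

    Blank lines and anything after a '#' are ignored, matching the
    sample file shown in the description.
    """
    mounts = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        data_dir, _, mount_point = line.partition(",")
        if data_dir and mount_point:
            mounts[data_dir.strip()] = mount_point.strip()
    return mounts


def evaluate_status(history_file_exists, current_mounts):
    """Apply the two rules from the description, literally as worded.

    current_mounts maps each data dir to the mount point it is on now.
    """
    if not history_file_exists:
        return "WARNING"
    on_root = [d for d, m in current_mounts.items() if m == "/"]
    on_mount = [d for d, m in current_mounts.items() if m != "/"]
    # A mix of root-backed and mount-backed data dirs is flagged.
    if on_root and on_mount:
        return "CRITICAL"
    return "OK"
```

Feeding the sample file from the description through parse_mount_history and then into evaluate_status gives a mix of root and non-root data dirs, which the second rule flags. A missing history file short-circuits to the warning case before any mount comparison happens.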
