-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26065/#review55081
-----------------------------------------------------------
Ship it!

Ship It!

- Jonathan Hurley


On Sept. 30, 2014, 7:57 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26065/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2014, 7:57 p.m.)
> 
> 
> Review request for Ambari, Florian Barca, Jonathan Hurley, Mahadev Konar, Sid Wagle, and Tom Beerbower.
> 
> 
> Bugs: AMBARI-7506
>     https://issues.apache.org/jira/browse/AMBARI-7506
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When a drive fails and is unmounted for service, stopping and starting the DataNode process through Ambari re-creates the dfs.data.dir path that was housed on that drive, but this time on the / partition, leading to out-of-disk-space issues and data landing on the wrong volume.
> In this case we want the Ambari Agent to create dfs.data.dir directories only during installation, not afterwards, since silently re-creating them makes drive replacements difficult.
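[Editorial note: the guard described above can be illustrated with a minimal sketch. The names get_mount_point and create_data_dir_if_safe are hypothetical, not the actual API of the patch's dfs_datanode_helper.py or file_system.py; the sketch assumes the mount point recorded at creation time is available to the caller.]

    import os

    def get_mount_point(path):
        # Walk up from path until we reach a mount point ("/" at worst).
        # os.path.ismount returns False for nonexistent paths, so this also
        # works for a data dir that has not been created yet.
        path = os.path.abspath(path)
        while not os.path.ismount(path):
            path = os.path.dirname(path)
        return path

    def create_data_dir_if_safe(data_dir, last_known_mount=None):
        # Create data_dir only if its mount point has not silently changed.
        # last_known_mount is the mount point recorded when the directory
        # was first created; if the path would now land on a different
        # mount (typically "/" after a drive failure), refuse to create it.
        current_mount = get_mount_point(data_dir)
        if last_known_mount and current_mount != last_known_mount:
            raise Exception("Refusing to create %s: recorded mount %s, "
                            "resolved mount %s" %
                            (data_dir, last_known_mount, current_mount))
        if not os.path.exists(data_dir):
            os.makedirs(data_dir)
        return current_mount

Under such a scheme, a freshly unmounted /grid/0 resolves to mount point "/", which no longer matches the recorded /grid/0, so the data dir is not silently re-created on the root partition.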
> 
> Diffs
> -----
> 
>   ambari-agent/src/test/python/resource_management/TestFileSystem.py PRE-CREATION 
>   ambari-common/src/main/python/resource_management/core/logger.py e395bd7 
>   ambari-common/src/main/python/resource_management/core/providers/mount.py dc6d7d9 
>   ambari-common/src/main/python/resource_management/libraries/functions/dfs_datanode_helper.py PRE-CREATION 
>   ambari-common/src/main/python/resource_management/libraries/functions/file_system.py PRE-CREATION 
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/configuration/hadoop-env.xml 5da6484 
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/package/scripts/hdfs_datanode.py 2482f97 
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/package/scripts/params.py 245ad92 
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/configuration/hadoop-env.xml b3935d7 
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py e38d9af 
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/package/scripts/params.py 27cef20 
>   ambari-server/src/test/python/stacks/1.3.2/configs/default.json c80723c 
>   ambari-server/src/test/python/stacks/1.3.2/configs/secured.json 99e88b8 
>   ambari-server/src/test/python/stacks/2.0.6/configs/default.json 4e00086 
>   ambari-server/src/test/python/stacks/2.0.6/configs/secured.json d03be7a 
>   ambari-web/app/data/HDP2/site_properties.js 9886d56 
>   ambari-web/app/data/site_properties.js 0e6aa8e 
> 
> Diff: https://reviews.apache.org/r/26065/diff/
> 
> 
> Testing
> -------
> 
> Created unit tests and a simple end-to-end test on a sandbox VM.
> 
> Ran end-to-end tests on Google Compute Cloud with VMs that had an external drive mounted.
> 1. Created a cluster with 2 VMs and copied over the changed Python files.
> 2. To avoid copying the changed web files, saved the new property directly by running
>    /var/lib/ambari-server/resources/scripts/configs.sh set localhost dev hadoop-env dfs.datanode.data.dir.mount.file "/etc/hadoop/conf/dfs_data_dir_mount.hist"
>    and verified that the property appears in the API, e.g.,
>    http://162.216.150.229:8080/api/v1/clusters/dev/configurations?type=hadoop-env&tag=version1412115461978734672
> 3. Restarted HDFS on all agents.
> 4. cat /etc/hadoop/conf/dfs_data_dir_mount.hist
>    correctly showed the HDFS data dir and its mount point:
>    # data_dir,mount_point
>    /grid/0/hadoop/hdfs/data,/grid/0
> 5. Changed the HDFS data dir property from /grid/0/hadoop/hdfs/data to /grid/1/hadoop/hdfs/data, which was correctly recorded as mounted on root, and created the /grid/1/hadoop/hdfs/data directory.
> 6. Unmounted the drive, after first stopping HDFS and ZooKeeper:
>    cd /root
>    fuser -c /grid/0
>    lsof /grid/0
>    umount /grid/0
> 7. Restarted the HDFS services, which resulted in an error, as expected:
>    Fail: Execution of 'ulimit -c unlimited; su - hdfs -c 'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode'' returned 1. starting datanode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-datanode-alejandro-1.out
> 8. Incremented the "DataNode volumes failure toleration" property from 0 to 1 and restarted all of the DataNodes; this time the restart succeeded without error.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>
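[Editorial note: for reference, a minimal sketch of reading and rewriting a history file in the format shown in step 4. The assumption is one "data_dir,mount_point" pair per line below a "# data_dir,mount_point" header; read_mount_history and write_mount_history are illustrative names, not the patch's API.]

    import os

    HEADER = "# data_dir,mount_point"

    def read_mount_history(hist_file):
        # Return {data_dir: mount_point} parsed from hist_file, if present;
        # blank lines and the "#" header line are skipped.
        history = {}
        if not os.path.exists(hist_file):
            return history
        with open(hist_file) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                data_dir, _, mount_point = line.partition(",")
                history[data_dir] = mount_point
        return history

    def write_mount_history(hist_file, history):
        # Rewrite hist_file with the header and one entry per data dir.
        with open(hist_file, "w") as f:
            f.write(HEADER + "\n")
            for data_dir in sorted(history):
                f.write("%s,%s\n" % (data_dir, history[data_dir]))

With a file like the one in step 4, read_mount_history would yield {"/grid/0/hadoop/hdfs/data": "/grid/0"}, which is the recorded mount point the agent can compare against before re-creating a data dir.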
