-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26065/#review55081
-----------------------------------------------------------

Ship it!


Ship It!

- Jonathan Hurley


On Sept. 30, 2014, 7:57 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26065/
> -----------------------------------------------------------
> 
> (Updated Sept. 30, 2014, 7:57 p.m.)
> 
> 
> Review request for Ambari, Florian Barca, Jonathan Hurley, Mahadev Konar,
> Sid Wagle, and Tom Beerbower.
> 
> 
> Bugs: AMBARI-7506
>     https://issues.apache.org/jira/browse/AMBARI-7506
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> When a drive fails and is unmounted for servicing, and the DataNode process
> is then stopped/started through Ambari, the dfs.data.dir path that was housed
> on that drive is re-created, but this time on the / partition. This leads to
> out-of-disk-space issues and to data being written to the wrong volume.
> To avoid this, the Ambari Agent should create the dfs.data.dir directories
> only during installation, and never afterwards, since re-creating them later
> makes drive replacement difficult.
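> 
> The guard can be summarized in a few lines. This is a minimal sketch of the
> idea only, assuming the history-file path configured in Testing below; the
> helper names are illustrative, and the real logic lives in
> dfs_datanode_helper.py and file_system.py in this diff:
> 
> import os
> 
> # Default value of dfs.datanode.data.dir.mount.file (see Testing, step 2).
> HIST_FILE = "/etc/hadoop/conf/dfs_data_dir_mount.hist"
> 
> def get_mount_point(path):
>     # Walk up from the (possibly missing) dir until we reach a mount point;
>     # a dir whose drive was unmounted resolves to "/".
>     path = os.path.abspath(path)
>     while not os.path.ismount(path):
>         path = os.path.dirname(path)
>     return path
> 
> def load_history():
>     # Each history line is "data_dir,mount_point" (see Testing, step 4).
>     history = {}
>     if os.path.exists(HIST_FILE):
>         with open(HIST_FILE) as f:
>             for line in f:
>                 line = line.strip()
>                 if line and not line.startswith("#"):
>                     data_dir, mount = line.split(",", 1)
>                     history[data_dir] = mount
>     return history
> 
> def may_create(data_dir):
>     # Create on first install; afterwards refuse to re-create a dir whose
>     # recorded mount point no longer matches (its drive was unmounted).
>     last_mount = load_history().get(data_dir)
>     if last_mount is None:
>         return True
>     return get_mount_point(data_dir) == last_mount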
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/test/python/resource_management/TestFileSystem.py PRE-CREATION
>   ambari-common/src/main/python/resource_management/core/logger.py e395bd7
>   ambari-common/src/main/python/resource_management/core/providers/mount.py dc6d7d9
>   ambari-common/src/main/python/resource_management/libraries/functions/dfs_datanode_helper.py PRE-CREATION
>   ambari-common/src/main/python/resource_management/libraries/functions/file_system.py PRE-CREATION
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/configuration/hadoop-env.xml 5da6484
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/package/scripts/hdfs_datanode.py 2482f97
>   ambari-server/src/main/resources/stacks/HDP/1.3.2/services/HDFS/package/scripts/params.py 245ad92
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/configuration/hadoop-env.xml b3935d7
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/package/scripts/hdfs_datanode.py e38d9af
>   ambari-server/src/main/resources/stacks/HDP/2.0.6/services/HDFS/package/scripts/params.py 27cef20
>   ambari-server/src/test/python/stacks/1.3.2/configs/default.json c80723c
>   ambari-server/src/test/python/stacks/1.3.2/configs/secured.json 99e88b8
>   ambari-server/src/test/python/stacks/2.0.6/configs/default.json 4e00086
>   ambari-server/src/test/python/stacks/2.0.6/configs/secured.json d03be7a
>   ambari-web/app/data/HDP2/site_properties.js 9886d56
>   ambari-web/app/data/site_properties.js 0e6aa8e
> 
> Diff: https://reviews.apache.org/r/26065/diff/
> 
> 
> Testing
> -------
> 
> Created unit tests and ran a simple end-to-end test on a sandbox VM.
> 
> Ran end-to-end tests on Google Compute Cloud with VMs that had an external 
> drive mounted.
> 1. Created a cluster with 2 VMs, and copied the changed Python files.
> 2. To avoid having to copy the changed web files, saved the new property
> directly by running
> /var/lib/ambari-server/resources/scripts/configs.sh set localhost dev 
> hadoop-env dfs.datanode.data.dir.mount.file 
> "/etc/hadoop/conf/dfs_data_dir_mount.hist"
> and verified that the property appears in the API (a scripted version of
> this check is sketched after this step), e.g.,
> http://162.216.150.229:8080/api/v1/clusters/dev/configurations?type=hadoop-env&tag=version1412115461978734672
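> 
> As an illustration, the same verification can be scripted. A minimal sketch,
> assuming the default admin/admin credentials and the host, cluster, and tag
> from this test setup:
> 
> import base64, json, urllib.request
> 
> url = ("http://162.216.150.229:8080/api/v1/clusters/dev/configurations"
>        "?type=hadoop-env&tag=version1412115461978734672")
> req = urllib.request.Request(url)
> req.add_header("Authorization",
>                "Basic " + base64.b64encode(b"admin:admin").decode("ascii"))
> body = json.load(urllib.request.urlopen(req))
> # The configurations endpoint returns {"items": [{"properties": {...}}]}.
> props = body["items"][0]["properties"]
> assert props["dfs.datanode.data.dir.mount.file"] == \
>        "/etc/hadoop/conf/dfs_data_dir_mount.hist"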
> 3. Restarted HDFS on all agents
> 4. Ran cat /etc/hadoop/conf/dfs_data_dir_mount.hist, which correctly showed
> the HDFS data dir and its mount point:
> # data_dir,mount_point
> /grid/0/hadoop/hdfs/data,/grid/0
> 
> 5. Then changed the HDFS data dir property from /grid/0/hadoop/hdfs/data to
> /grid/1/hadoop/hdfs/data, which was correctly recorded as mounted on the
> root partition, and the /grid/1/hadoop/hdfs/data directory was created.
> 
> 6. Next, unmounted the drive by first stopping HDFS and ZooKeeper, then
> running (a quick Python check follows the commands):
> cd /root
> fuser -c /grid/0
> lsof /grid/0
> umount /grid/0
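> 
> A quick Python check (illustrative) that the unmount took effect and that
> the old data dir now resolves to the root partition:
> 
> import os
> 
> def mount_point(path):
>     # Walk up until we hit a mount point; terminates at "/".
>     path = os.path.abspath(path)
>     while not os.path.ismount(path):
>         path = os.path.dirname(path)
>     return path
> 
> print(os.path.ismount("/grid/0"))              # False once umount succeeds
> print(mount_point("/grid/0/hadoop/hdfs/data")) # now resolves to "/"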
> 
> 7. Restarted the HDFS services, which failed with an error, as expected:
> Fail: Execution of 'ulimit -c unlimited;  su - hdfs -c 'export 
> HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && 
> /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start 
> datanode'' returned 1. starting datanode, logging to 
> /var/log/hadoop/hdfs/hadoop-hdfs-datanode-alejandro-1.out
> 
> 8. Next, incremented the "DataNode volumes failure toleration" property
> (dfs.datanode.failed.volumes.tolerated) from 0 to 1 and restarted all of
> the DataNodes; this time the restart completed without error.
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>
