-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58208/#review171593
-----------------------------------------------------------
ambari-common/src/main/python/resource_management/libraries/script/script.py
Lines 323 (patched)
<https://reviews.apache.org/r/58208/#comment244535>

    What does "afix" mean?

ambari-common/src/main/python/resource_management/libraries/script/script.py
Line 328 (original), 347 (patched)
<https://reviews.apache.org/r/58208/#comment244534>

    Add some doc.

ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py
Line 24 (original), 24 (patched)
<https://reviews.apache.org/r/58208/#comment244533>

    Can we remove this import *?


- Alejandro Fernandez


On April 11, 2017, 3:22 p.m., Dmitro Lisnichenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58208/
> -----------------------------------------------------------
> 
> (Updated April 11, 2017, 3:22 p.m.)
> 
> 
> Review request for Ambari, Jonathan Hurley and Nate Cole.
> 
> 
> Bugs: AMBARI-20682
>     https://issues.apache.org/jira/browse/AMBARI-20682
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> During a rolling upgrade (especially on a large, heavily used cluster), the
> DataNodes do not shut down immediately. However, they do de-register from the
> NameNode, which tricks Ambari into thinking that they are down.
> 
> Since the rolling upgrade uses a {{RESTART}} command, we attempt to start the
> DataNode back up before the daemon has shut down:
> 
> {code}
> 2017-03-14 05:00:25,602 - call['/usr/hdp/current/hadoop-hdfs-datanode/bin/hdfs dfsadmin -fs hdfs://c1ha -shutdownDatanode 0.0.0.0:8010 upgrade'] {'user': 'hdfs'}
> 2017-03-14 05:00:28,438 - call returned (0, 'Submitted a shutdown request to datanode 0.0.0.0:8010')
> 2017-03-14 05:00:28,438 - Execute['/usr/hdp/current/hadoop-hdfs-datanode/bin/hdfs dfsadmin -fs hdfs://c1ha -D ipc.client.connect.max.retries=5 -D ipc.client.connect.retry.interval=1000 -getDatanodeInfo 0.0.0.0:8010'] {'tries': 1, 'user': 'hdfs'}
> 2017-03-14 05:00:35,976 - DataNode has successfully shutdown for upgrade.
> {code}
> 
> Even though ~6 seconds have passed, the daemon is still running as it
> drains. Therefore, we attempt to start it, which causes a NOOP.
> 
> Instead, we should also monitor the PID.
> 
> -----------------
> Now the STOP command waits until the component really dies. The motivation
> is that we don't want to execute START on a still-running component (e.g.
> during an upgrade/RESTART).
> 
> 
> Diffs
> -----
> 
>   ambari-common/src/main/python/resource_management/libraries/script/script.py 9a5da04278 
>   ambari-funtest/src/test/resources/stacks/HDP/2.0.7/services/HIVE/package/scripts/mysql_service.py 4716343fb2 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/datanode.py 151e26cace 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/datanode_upgrade.py b55237dd1f 
>   ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/HIVE/package/scripts/mysql_service.py 11bbdd8e6b 
>   ambari-server/src/main/resources/stacks/BIGTOP/0.8/services/HIVE/package/scripts/postgresql_service.py cc7b4cc14e 
>   ambari-server/src/test/python/stacks/2.0.6/HDFS/test_datanode.py 1c3c5b7932 
> 
> 
> Diff: https://reviews.apache.org/r/58208/diff/2/
> 
> 
> Testing
> -------
> 
> mvn clean test
> and test on live cluster
> 
> 
> Thanks,
> 
> Dmitro Lisnichenko
> 
>
