-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41691/#review112008
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
 (line 45)
<https://reviews.apache.org/r/41691/#comment172328>

    I spoke to Aravindan about this.
    Consider what happens when the server timeout value is:
    
    A. < 30 mins (default of 20): If NN takes longer than the server timeout
    to come out of safemode, then the task will be aborted and the user will
    have to retry the step (e.g., restart NameNode and wait again).
    
    B. 30 mins or higher: The script will wait up to 30 mins for NN to leave
    safemode. If NN is still in safemode after 30 mins, the task will proceed
    anyway.
    
    For a very large cluster, leaving safemode can take much longer than
    30 mins, and we'll be in the same boat again.
    There are 2 other potential solutions:
    1. Have a timeout value in ambari.properties that is specific to waiting
    for NN to leave safemode.
    2. Pass the value of the server timeout in to the command. So if the user
    bumps it up to 40 mins, then the NameNode script can always wait up to
    x-5 mins (35 here), where x is the server timeout.
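
    To make option 2 concrete, here is a rough sketch of how the wait could
    be derived from the server timeout, with option 1 modeled as an optional
    override. All names and defaults here (safemode_wait_secs, retry_plan,
    the 60-second floor, the 10-second sleep) are hypothetical and are not
    existing Ambari code; only the 5-minute buffer comes from the text above.

```python
# Sketch only: names and defaults below are hypothetical, not Ambari's API.
BUFFER_SECS = 5 * 60   # the "x-5 mins" margin from option 2
MIN_WAIT_SECS = 60     # assumed floor so the wait never drops to zero or below
SLEEP_SECS = 10        # assumed delay between safemode checks


def safemode_wait_secs(server_timeout_secs, override_secs=None):
    """Return how long the script may wait for NN to leave safemode.

    override_secs models option 1 (a dedicated property whose name is not
    decided here). Without it, the wait is derived from the server timeout
    as in option 2.
    """
    if override_secs is not None:
        return max(int(override_secs), MIN_WAIT_SECS)
    return max(int(server_timeout_secs) - BUFFER_SECS, MIN_WAIT_SECS)


def retry_plan(wait_secs, sleep_secs=SLEEP_SECS):
    """Convert a total wait into (number of tries, sleep between tries)."""
    return max(wait_secs // sleep_secs, 1), sleep_secs
```

    With a 40-minute server timeout this yields a 35-minute wait, and with
    the default 20 minutes it yields 15, so the script would give up before
    the server kills it. Note that an option 1 override larger than the
    server timeout would still run into the original problem.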
    
    What do you think?


- Apache Ambari


On Dec. 23, 2015, 5:20 p.m., Dmitro Lisnichenko wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41691/
> -----------------------------------------------------------
> 
> (Updated Dec. 23, 2015, 5:20 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Eugene Chekanskiy, Sumit 
> Mohanty, and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-14479
>     https://issues.apache.org/jira/browse/AMBARI-14479
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Issue
> Namenode safemode check timeout value of 30mins is more than the server 
> timeout of 20mins for a task. Hence, the server kills the namenode startup 
> script if it takes more than 20mins to get out of safemode.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 1766c44 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py 67db735 
>   ambari-server/src/test/python/stacks/2.0.6/HDFS/test_namenode.py 399fd8d 
> 
> Diff: https://reviews.apache.org/r/41691/diff/
> 
> 
> Testing
> -------
> 
> mvn clean test
> 
> 
> Thanks,
> 
> Dmitro Lisnichenko
> 
>
