Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-15 Thread Dmitro Lisnichenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137699
---


Ship it!

- Dmitro Lisnichenko


On June 15, 2016, 12:33 a.m., Jonathan Hurley wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48708/
> ---
> 
> (Updated June 15, 2016, 12:33 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Dmitro Lisnichenko, and Nate 
> Cole.
> 
> 
> Bugs: AMBARI-17236
> https://issues.apache.org/jira/browse/AMBARI-17236
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> When starting NN during an EU, we're hitting this when trying to create HDFS 
> directories:
> ```
> {
>   "RemoteException": {
>     "exception": "RetriableException",
>     "javaClassName": "org.apache.hadoop.ipc.RetriableException",
>     "message": "NameNode still not started"
>   }
> }
> ```
> 
> So, the heart of this issue is that, depending on topology and upgrade type, 
> we might not wait for NN to be out of Safe Mode after starting. However, we 
> are always creating directories, regardless of topology/upgrade:
> 
> ```
> # Always run this on non-HA, or active NameNode during HA.
> if is_active_namenode:
>   create_hdfs_directories()
>   create_ranger_audit_hdfs_directories()
> ```
> 
> NameNode, in Safe Mode, is read-only and would forbid this anyway, even if it 
> didn't throw a retryable exception:
> ```
> [hdfs@c6403 root]$ hadoop fs -mkdir /foo
> mkdir: Cannot create directory /foo. Name node is in safe mode.
> ```
> 
> So, it seems like we need to wait for NN to be out of Safe Mode no matter 
> what.
> 
> 
> Diffs
> ---
> 
>   ambari-common/src/main/python/resource_management/libraries/resources/hdfs_resource.py 18e61fb
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 635f159
> 
> Diff: https://reviews.apache.org/r/48708/diff/
> 
> 
> Testing
> ---
> 
> PENDING
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>



Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Alejandro Fernandez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137623
---


Ship it!

- Alejandro Fernandez





Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Nate Cole

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137599
---


Ship it!

- Nate Cole





Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Jonathan Hurley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137586
---




ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
 


`is_action_namenode_cmd` was no longer used anywhere; this was dead code.



ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
 (lines 168 - 170)


Because we create directories on startup no matter what, we need to ensure
NN is out of Safe Mode first. That is the major logic change.
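
For readers following along, here is a minimal sketch of the idea, not the code in the diff. It assumes the standard `hdfs dfsadmin -safemode get` command is on the path and prints `Safe mode is OFF` once the NameNode has left Safe Mode. The helper names (`safemode_is_off`, `wait_for_safemode_off`, `post_start_steps`), the retry budget, and the injected `create_directories` callback are all hypothetical and chosen only for illustration.

```python
import subprocess
import time


def safemode_is_off(run=subprocess.check_output):
    """Return True when `hdfs dfsadmin -safemode get` reports Safe Mode OFF.

    `run` is injectable so the check can be exercised without a cluster.
    """
    output = run(["hdfs", "dfsadmin", "-safemode", "get"])
    if isinstance(output, bytes):
        output = output.decode("utf-8", "replace")
    return "Safe mode is OFF" in output


def wait_for_safemode_off(check=safemode_is_off, tries=30, delay=10,
                          sleep=time.sleep):
    """Poll until Safe Mode is off; return the number of checks performed.

    Raises RuntimeError if the NameNode is still in Safe Mode after `tries`
    checks. The defaults (30 checks, 10 seconds apart) are assumptions.
    """
    for attempt in range(1, tries + 1):
        if check():
            return attempt
        if attempt < tries:
            sleep(delay)
    raise RuntimeError("NameNode still in Safe Mode after %d checks" % tries)


def post_start_steps(is_active_namenode, create_directories,
                     wait=wait_for_safemode_off):
    """Hypothetical wiring: always wait before creating directories,
    regardless of topology or upgrade type."""
    if is_active_namenode:
        wait()
        create_directories()
```

In an HA cluster the check would likely need to target a specific NameNode (for example through the `-fs` generic option); that detail is left out of the sketch.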


- Jonathan Hurley

