Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-15 Thread Dmitro Lisnichenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137699
---


Ship it!

- Dmitro Lisnichenko


On June 15, 2016, 12:33 a.m., Jonathan Hurley wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48708/
> ---
> 
> (Updated June 15, 2016, 12:33 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Dmitro Lisnichenko, and Nate 
> Cole.
> 
> 
> Bugs: AMBARI-17236
> https://issues.apache.org/jira/browse/AMBARI-17236
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> When starting the NameNode (NN) during an Express Upgrade (EU), we hit this 
> error when trying to create HDFS directories:
> ```
> {
>   "RemoteException": {
>     "exception": "RetriableException",
>     "javaClassName": "org.apache.hadoop.ipc.RetriableException",
>     "message": "NameNode still not started"
>   }
> }
> ```
> 
> The heart of the issue is that, depending on topology and upgrade type, we 
> might not wait for the NN to leave Safe Mode after starting it. However, we 
> always create directories, regardless of topology or upgrade type:
> 
> ```
> # Always run this on non-HA, or active NameNode during HA.
> if is_active_namenode:
>   create_hdfs_directories()
>   create_ranger_audit_hdfs_directories()
> ```
> 
> NameNode, in Safe Mode, is read-only and would forbid this anyway, even if it 
> didn't throw a retryable exception:
> ```
> [hdfs@c6403 root]$ hadoop fs -mkdir /foo
> mkdir: Cannot create directory /foo. Name node is in safe mode.
> ```
> 
> So, it seems like we need to wait for NN to be out of Safe Mode no matter 
> what.
> 
> 
> Diffs
> ---
> 
>   ambari-common/src/main/python/resource_management/libraries/resources/hdfs_resource.py 18e61fb 
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 635f159 
> 
> Diff: https://reviews.apache.org/r/48708/diff/
> 
> 
> Testing
> ---
> 
> PENDING
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>



Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Alejandro Fernandez

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137623
---


Ship it!

- Alejandro Fernandez





Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Nate Cole

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137599
---


Ship it!

- Nate Cole





Re: Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Jonathan Hurley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/#review137586
---




ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py
 


`is_action_namenode_cmd` was no longer used; this was dead code.



ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py (lines 168 - 170)


Because we create directories on startup no matter what, we need to ensure 
that the NN is out of Safe Mode first; that is the major logic change.


- Jonathan Hurley





Review Request 48708: Namenode start step failed during EU with RetriableException

2016-06-14 Thread Jonathan Hurley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48708/
---

Review request for Ambari, Alejandro Fernandez, Dmitro Lisnichenko, and Nate 
Cole.


Bugs: AMBARI-17236
https://issues.apache.org/jira/browse/AMBARI-17236


Repository: ambari


Description
---

When starting the NameNode (NN) during an Express Upgrade (EU), we hit this 
error when trying to create HDFS directories:
```
{
  "RemoteException": {
    "exception": "RetriableException",
    "javaClassName": "org.apache.hadoop.ipc.RetriableException",
    "message": "NameNode still not started"
  }
}
```
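
As a hedged illustration (not part of this patch), a caller could recognize 
this error body roughly as follows. The function name `is_retriable_error` is 
hypothetical, and the sketch assumes the response body has exactly the JSON 
shape shown above:

```python
import json

# Class name taken from the error body shown above.
RETRIABLE_CLASS = "org.apache.hadoop.ipc.RetriableException"


def is_retriable_error(response_body):
    """Return True if the body wraps a RetriableException (hypothetical helper)."""
    try:
        payload = json.loads(response_body)
    except ValueError:
        # Not JSON at all, so it cannot be the error shape we expect.
        return False
    if not isinstance(payload, dict):
        return False
    remote = payload.get("RemoteException")
    if not isinstance(remote, dict):
        return False
    return remote.get("javaClassName") == RETRIABLE_CLASS
```

A check like this only tells the caller that retrying is reasonable; it does 
not say when the NN will accept writes.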

The heart of the issue is that, depending on topology and upgrade type, we 
might not wait for the NN to leave Safe Mode after starting it. However, we 
always create directories, regardless of topology or upgrade type:

```
# Always run this on non-HA, or active NameNode during HA.
if is_active_namenode:
  create_hdfs_directories()
  create_ranger_audit_hdfs_directories()
```

NameNode, in Safe Mode, is read-only and would forbid this anyway, even if it 
didn't throw a retryable exception:
```
[hdfs@c6403 root]$ hadoop fs -mkdir /foo
mkdir: Cannot create directory /foo. Name node is in safe mode.
```

So, it seems like we need to wait for NN to be out of Safe Mode no matter what.
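
A minimal sketch of such a wait, for illustration only; it is not the code in 
this patch. The helper names `safemode_is_off` and `wait_for_safemode_off`, the 
retry count, and the delay are all assumptions. It also assumes the `hdfs` CLI 
is on the PATH and that `hdfs dfsadmin -safemode get` prints `Safe mode is OFF` 
once the NN has left Safe Mode:

```python
import subprocess
import time


def safemode_is_off():
    """Ask HDFS for the Safe Mode state (hypothetical helper).

    In an HA setup a real check would likely need to target one specific
    NameNode; this sketch does not do that.
    """
    try:
        output = subprocess.check_output(["hdfs", "dfsadmin", "-safemode", "get"])
    except (subprocess.CalledProcessError, OSError):
        # Command failed or `hdfs` is missing: treat as "not ready yet".
        return False
    return b"Safe mode is OFF" in output


def wait_for_safemode_off(check=safemode_is_off, tries=10, delay=10,
                          sleep=time.sleep):
    """Poll `check` until it returns True; return False if it never does."""
    for attempt in range(tries):
        if check():
            return True
        if attempt < tries - 1:
            sleep(delay)  # back off before asking again
    return False
```

With something like this in place, the directory creation shown earlier would 
run only after `wait_for_safemode_off()` returns True.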


Diffs
---

  ambari-common/src/main/python/resource_management/libraries/resources/hdfs_resource.py 18e61fb 
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py 635f159 

Diff: https://reviews.apache.org/r/48708/diff/


Testing
---

PENDING


Thanks,

Jonathan Hurley