Re: Review Request 51426: Journal node restart failing on RU from dergM10 to erie on Wire Encrypted cluster

2016-09-12 Thread Dmitro Lisnichenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51426/#review148478
---


Ship it!





- Dmitro Lisnichenko


On Aug. 25, 2016, 12:53 p.m., Andrew Onischuk wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51426/
> ---
> 
> (Updated Aug. 25, 2016, 12:53 p.m.)
> 
> 
> Review request for Ambari and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-18260
> https://issues.apache.org/jira/browse/AMBARI-18260
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Type of upgrade : RU  
> Upgrade from HDP Derg M10 (2.4.2.0) to Erie (on secure, Wire encrypted
> cluster)
> 
> Journal node logs show :
> 
> 
> 
> 
> org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write, no segment open
>   at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:484)
>   at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:353)
>   at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
>   at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
>   at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
> 2016-08-25 07:19:47,651 INFO  namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file /grid/0/hadoop/hdfs/journal/nameservice/current/edits_inprogress_0073697 -> /grid/0/hadoop/hdfs/journal/nameservice/current/edits_0073697-0073698
> 
> 
> Error at the exact RU task:
> 
> 
> 
> 
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode.py", line 198, in <module>
>     JournalNode().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
>     method(env)
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 731, in restart
>     self.post_upgrade_restart(env, upgrade_type=upgrade_type)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode.py", line 75, in post_upgrade_restart
>     journalnode_upgrade.post_upgrade_check()
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode_upgrade.py", line 64, in post_upgrade_check
>     namenode_ha.is_encrypted(), params.security_enabled)
>   File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 306, in get_jmx_data
>     data = urllib2.urlopen(nn_address).read()
>   File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
>     return _opener.open(url, data, timeout)
>   File "/usr/lib64/python2.6/urllib2.py", line 391, in open
>     response = self._open(req, data)
>   File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
>     '_open', req)
>   File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
>     result = func(*args)
>   File "/usr/lib64/python2.6/urllib2.py", line 1194, in https_open
>     return self.do_open(httplib.HTTPSConnection, req)
>   File "/usr/lib64/python2.6/urllib2.py", line 1161, in do_open
>     raise URLError(err)
> urllib2.URLError:  routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>
> 
> 
> Live cluster :  
> 
> 
> Artifacts:  dgm10toerienoha-s11/test-logs/ambariru-dgm10toerie-sec-noha/ambaritestartifacts/artifacts/screenshots/com.hw.ambari.ui.tests.monitoring.admin_page.TestQuickRollingUpgradeApi/test060_StartPerformUpgrade/_24_22_9_0_One_step_of_upgrade_failed_after_retry_group_UpgradeGroup_c

Re: Review Request 51426: Journal node restart failing on RU from dergM10 to erie on Wire Encrypted cluster

2016-08-25 Thread Andrew Onischuk

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51426/
---

(Updated Aug. 25, 2016, 9:53 a.m.)


Review request for Ambari and Vitalyi Brodetskyi.


Bugs: AMBARI-18260
https://issues.apache.org/jira/browse/AMBARI-18260


Repository: ambari


Description
---

Type of upgrade : RU  
Upgrade from HDP Derg M10 (2.4.2.0) to Erie (on secure, Wire encrypted
cluster)

Journal node logs show :




org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write, no segment open
  at org.apache.hadoop.hdfs.qjournal.server.Journal.checkSync(Journal.java:484)
  at org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:353)
  at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.journal(JournalNodeRpcServer.java:152)
  at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.journal(QJournalProtocolServerSideTranslatorPB.java:158)
  at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25421)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2016-08-25 07:19:47,651 INFO  namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(142)) - Finalizing edits file /grid/0/hadoop/hdfs/journal/nameservice/current/edits_inprogress_0073697 -> /grid/0/hadoop/hdfs/journal/nameservice/current/edits_0073697-0073698


Error at the exact RU task:




Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode.py", line 198, in <module>
    JournalNode().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 731, in restart
    self.post_upgrade_restart(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode.py", line 75, in post_upgrade_restart
    journalnode_upgrade.post_upgrade_check()
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/journalnode_upgrade.py", line 64, in post_upgrade_check
    namenode_ha.is_encrypted(), params.security_enabled)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py", line 306, in get_jmx_data
    data = urllib2.urlopen(nn_address).read()
  File "/usr/lib64/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1194, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1161, in do_open
    raise URLError(err)
urllib2.URLError: 
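
For anyone who wants to check the NameNode JMX endpoint outside the agent, here is a minimal sketch. It is not the change under review. It assumes Python 3.7+ on the test host, and it assumes the URLError raised by urllib2.urlopen comes from the Python 2.6 client and the HTTPS endpoint failing to agree on a protocol version, which the trace alone does not prove. The function names (build_tls_context, extract_metric, fetch_jmx_metric) are hypothetical, and the parser assumes the endpoint returns JSON with a top-level "beans" list.

```python
import json
import ssl
import urllib.request


def build_tls_context(verify=True):
    """Create a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if not verify:
        # Assumption: test clusters may use self-signed certificates.
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def extract_metric(jmx_json, modeler_type, metric):
    """Return `metric` from the first bean whose modelerType matches, else None."""
    for bean in json.loads(jmx_json).get("beans", []):
        if bean.get("modelerType") == modeler_type:
            return bean.get(metric)
    return None


def fetch_jmx_metric(nn_address, modeler_type, metric, verify=True):
    """Fetch <nn_address>/jmx over HTTPS and pull one metric out of the reply."""
    url = nn_address.rstrip("/") + "/jmx"
    context = build_tls_context(verify)
    with urllib.request.urlopen(url, context=context, timeout=30) as resp:
        return extract_metric(resp.read().decode("utf-8"), modeler_type, metric)
```

If fetch_jmx_metric succeeds from the same host where the agent script fails, that points at the client-side SSL stack rather than the NameNode configuration.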


Live cluster :  


Artifacts: 


Diffs
---

  
  ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py f6987b3

Diff: https://reviews.apache.org/r/51426/diff/


Testing
---

mvn clean test


Thanks,

Andrew Onischuk