[ 
https://issues.apache.org/jira/browse/AMBARI-24966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Grinenko updated AMBARI-24966:
-------------------------------------
    Resolution: Incomplete
        Status: Resolved  (was: Patch Available)

> Start Namenode failing during Move master NN wizard on non-HA cluster with 
> custom hdfs service user
> ---------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-24966
>                 URL: https://issues.apache.org/jira/browse/AMBARI-24966
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-web
>    Affects Versions: 2.7.3
>            Reporter: Andrii Tkach
>            Assignee: Andrii Tkach
>            Priority: Blocker
>              Labels: pull-request-available
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Starting the NameNode fails during the Move Master NameNode wizard.
> From the NameNode logs:
> {code:java}
> 2018-11-21 19:43:57,126 WARN Encountered exception loading fsimage
> java.io.IOException: NameNode is not formatted.
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:237)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-11-21 19:43:57,131 INFO Stopped o.e.j.w.WebAppContext@58359ebd{/,null,UNAVAILABLE}{/hdfs}
> 2018-11-21 19:43:57,135 INFO Stopped ServerConnector@75e91545{HTTP/1.1,[http/1.1]}{ctr-e139-1542663976389-4877-02-000004.hwx.site:50070}
> 2018-11-21 19:43:57,136 INFO Stopped o.e.j.s.ServletContextHandler@2f4205be{/static,file:///usr/hdp/3.1.0.0-13/hadoop-hdfs/webapps/static/,UNAVAILABLE}
> 2018-11-21 19:43:57,136 INFO Stopped o.e.j.s.ServletContextHandler@319bc845{/logs,file:///grid/0/log/hdfs/cstm-hdfs/,UNAVAILABLE}
> 2018-11-21 19:43:57,138 INFO Stopping NameNode metrics system...
> 2018-11-21 19:43:57,139 INFO timeline thread interrupted.
> 2018-11-21 19:43:57,140 INFO NameNode metrics system stopped.
> 2018-11-21 19:43:57,141 INFO NameNode metrics system shutdown complete.
> 2018-11-21 19:43:57,141 ERROR Failed to start namenode.
> java.io.IOException: NameNode is not formatted.
>     at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:237)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1090)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-11-21 19:43:57,143 INFO No live collector to send metrics to. Metrics to be sent will be discarded. This message will be skipped for the next 20 times.
> 2018-11-21 19:43:57,143 INFO Exiting with status 1: java.io.IOException: NameNode is not formatted.
> 2018-11-21 19:43:57,145 INFO SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at ctr-e139-1542663976389-4877-02-000004.hwx.site/172.27.25.135
> ************************************************************/
> {code}
> From the Ambari task logs:
> {code:java}
> 2018-11-22 03:31:30,588 - The NameNode is still in Safemode. Please be careful with commands that need Safemode OFF.
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/namenode.py", line 408, in <module>
>     NameNode().execute()
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/namenode.py", line 138, in start
>     upgrade_suspended=params.upgrade_suspended, env=env)
>   File "/usr/lib/ambari-agent/lib/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/hdfs_namenode.py", line 264, in namenode
>     create_hdfs_directories(name_service)
>   File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HDFS/package/scripts/hdfs_namenode.py", line 336, in create_hdfs_directories
>     nameservices=name_services
>   File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
>     self.env.run()
>   File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 677, in action_create_on_execute
>     self.action_delayed("create")
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 674, in action_delayed
>     self.get_hdfs_resource_executor().action_delayed(action_name, self)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 373, in action_delayed
>     self.action_delayed_for_nameservice(None, action_name, main_resource)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 395, in action_delayed_for_nameservice
>     self._assert_valid()
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 334, in _assert_valid
>     self.target_status = self._get_file_status(target)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 497, in _get_file_status
>     list_status = self.util.run_command(target, 'GETFILESTATUS', method='GET', ignore_status_codes=['404'], assertable_result=False)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 214, in run_command
>     return self._run_command(*args, **kwargs)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/providers/hdfs_resource.py", line 282, in _run_command
>     _, out, err = get_user_call_output(cmd, user=self.run_user, logoutput=self.logoutput, quiet=False)
>   File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/get_user_call_output.py", line 62, in get_user_call_output
>     raise ExecutionFailed(err_msg, code, files_output[0], files_output[1])
> resource_management.core.exceptions.ExecutionFailed: Execution of 'curl -sS -L -w '%{http_code}' -X GET -d '' -H 'Content-Length: 0' --negotiate -u : 'http://ctr-e139-1542663976389-4877-02-000004.hwx.site:50070/webhdfs/v1/tmp?op=GETFILESTATUS' 1>/tmp/tmpC3TR3n 2>/tmp/tmpTlrH_1' returned 7. curl: (7) Failed to connect to ctr-e139-1542663976389-4877-02-000004.hwx.site port 50070: Connection refused
> 000
> {code}
> Customized service users and the Ambari agent user are enabled in the test.
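
For anyone triaging a similar failure, here is a minimal Python 3 sketch of two quick checks suggested by the logs in the description. It is not taken from the Ambari scripts, and both helper names (`is_name_dir_formatted`, `is_port_open`) are hypothetical. The first check assumes that a formatted NameNode storage directory contains a `current/VERSION` file. The second only tests TCP reachability of the NameNode HTTP port (50070 in the logs), which is one plausible reason the WebHDFS `GETFILESTATUS` call was refused if the NameNode had already exited.

```python
import os
import socket
import tempfile


def is_name_dir_formatted(name_dir):
    """Return True if name_dir contains current/VERSION.

    Assumption: a formatted NameNode storage directory has this file.
    """
    return os.path.isfile(os.path.join(name_dir, "current", "VERSION"))


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Demonstration on a throwaway directory, not a real dfs.namenode.name.dir.
    scratch = tempfile.mkdtemp()
    print("formatted:", is_name_dir_formatted(scratch))
    # Hypothetical host; substitute the NameNode host and HTTP port to test.
    print("port open:", is_port_open("localhost", 50070, timeout=1.0))
```

On a real cluster the directory to pass would be each path listed in `dfs.namenode.name.dir` on the new NameNode host, checked as the custom HDFS service user so that permission problems also show up.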



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
