Zack Marsh created AMBARI-10977:
-----------------------------------
Summary: HDFS Rebalance failed with IllegalArgumentException: Does
not contain a valid host:port authority
Key: AMBARI-10977
URL: https://issues.apache.org/jira/browse/AMBARI-10977
Project: Ambari
Issue Type: Bug
Environment: ambari-2.1.0-376, hdp-2.3.0.0-1880, sles11sp3
Reporter: Zack Marsh
The HDFS Rebalance is failing with the following error messages:
stderr:
{code}
2015-05-06 12:31:46,656 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 214, in execute
method(env)
File
"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py",
line 243, in rebalancehdfs
logoutput = False,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py",
line 148, in __init__
self.env.run()
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 152, in run
self.run_action(resource, action)
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 118, in run_action
provider_action()
File
"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 269, in action_run
raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export
PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"'
; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''
returned 255. May 6, 2015 12:31:46 PM Balancing took 888.0 milliseconds
15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.IllegalArgumentException: Does not contain a valid host:port
authority: jolokia1.labs.teradata.com
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
at org.apache.hadoop.hdfs.DFSUtil.getNsServiceRpcUris(DFSUtil.java:978)
at
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
{code}
stdout:
{code}
Starting balancer with threshold = 10
Executing command ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export
PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"'
; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'
2015-05-06 12:31:43,096 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c
'export
PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"'
; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'']
{'logoutput': False, 'on_new_line': handle_new_line}
[balancer] May 6, 2015 12:31:46 PM [balancer] [balancer] Balancing took 888.0
milliseconds[balancer]
[balancer] 15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an
exception
java.lang.IllegalArgumentException: Does not contain a valid host:port
authority: jolokia1.labs.teradata.com
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
at org.apache.hadoop.hdfs.DFSUtil[balancer]
.getNsServiceRpcUris(DFSUtil.java:978)
at
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
2015-05-06 12:31:46,656 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 214, in execute
method(env)
File
"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py",
line 243, in rebalancehdfs
logoutput = False,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py",
line 148, in __init__
self.env.run()
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 152, in run
self.run_action(resource, action)
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 118, in run_action
provider_action()
File
"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 269, in action_run
raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export
PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"'
; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''
returned 255. May 6, 2015 12:31:46 PM Balancing took 888.0 milliseconds
15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.IllegalArgumentException: Does not contain a valid host:port
authority: jolokia1.labs.teradata.com
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
at org.apache.hadoop.hdfs.DFSUtil.getNsServiceRpcUris(DFSUtil.java:978)
at
org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at
org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
{code}
The NameNode RPC address properties are set as follows:
{code}
dfs.namenode.rpc-address = <NN1 FQDN>
dfs.namenode.rpc-address.<CLUSTER-NAME>.nn1 = <NN1 FQDN>:8020
dfs.namenode.rpc-address.<CLUSTER-NAME>.nn2 = <NN2 FQDN>:8020
{code}
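The stack trace shows NetUtils.createSocketAddr rejecting a value that has no port, which matches the plain dfs.namenode.rpc-address value listed above. A minimal Python sketch for spotting such values before running the balancer could look like the following. It is not Hadoop's own validation logic, only an approximation; the function names, the "mycluster" nameservice, and the example.com hosts are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def has_host_port(value):
    """Return True if value looks like host:port with a numeric port.

    Approximation only; Hadoop's NetUtils applies its own rules.
    """
    try:
        parts = urlsplit("//" + value.strip())
        return bool(parts.hostname) and parts.port is not None
    except ValueError:
        # Raised for a non-numeric or out-of-range port, or a malformed host.
        return False


def find_bad_rpc_addresses(props):
    """List dfs.namenode.rpc-address* keys whose value lacks a valid port."""
    return sorted(
        key
        for key, value in props.items()
        if key.startswith("dfs.namenode.rpc-address") and not has_host_port(value)
    )


if __name__ == "__main__":
    # Hypothetical values shaped like the configuration in this report.
    props = {
        "dfs.namenode.rpc-address": "nn1.example.com",
        "dfs.namenode.rpc-address.mycluster.nn1": "nn1.example.com:8020",
        "dfs.namenode.rpc-address.mycluster.nn2": "nn2.example.com:8020",
    }
    print(find_bad_rpc_addresses(props))
```

With values shaped like the ones in this report, only the plain property is flagged, because it is the only one without a port.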
Setting the plain dfs.namenode.rpc-address property, which has no port in the
configuration above, to the active NameNode at port 8020 allows the rebalance
to succeed.
However, if this property is set to the standby NameNode at port 8020, the
rebalance fails with the following error:
{code}
2015-05-06 12:45:59,673 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
File
"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py",
line 214, in execute
method(env)
File
"/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py",
line 243, in rebalancehdfs
logoutput = False,
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py",
line 148, in __init__
self.env.run()
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 152, in run
self.run_action(resource, action)
File
"/usr/lib/python2.6/site-packages/resource_management/core/environment.py",
line 118, in run_action
provider_action()
File
"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py",
line 269, in action_run
raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export
PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"'
; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''
returned 252. 15/05/06 12:45:56 INFO balancer.Balancer: Using a threshold of
10.0
15/05/06 12:45:56 INFO balancer.Balancer: namenodes =
[hdfs://jolokia1.labs.teradata.com:8020, hdfs://JOLOKIA]
15/05/06 12:45:56 INFO balancer.Balancer: parameters =
Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration =
5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move
Bytes Being Moved
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby
at
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1785)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1301)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1613)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:593)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)
at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
. Exiting ...
May 6, 2015 12:45:59 PM Balancing took 3.127 seconds
{code}
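In the log above, the balancer reports two namenode URIs: one with an explicit host and port, and one logical nameservice URI. The sketch below, in Python, pulls that list out of balancer output and separates the two kinds, which may help when checking whether a run is addressing a single NameNode directly (and could therefore land on the standby). The function names are hypothetical, and treating "has a port" as "direct address" is an assumption of this sketch, not something taken from the Hadoop source.

```python
import re
from urllib.parse import urlsplit


def parse_balancer_namenodes(log_text):
    """Extract the URIs from a 'namenodes = [...]' balancer log entry."""
    match = re.search(r"namenodes\s*=\s*\[([^\]]*)\]", log_text)
    if not match:
        return []
    return [uri.strip() for uri in match.group(1).split(",") if uri.strip()]


def split_direct_and_logical(uris):
    """Separate URIs with an explicit port from port-less (logical) ones."""
    direct, logical = [], []
    for uri in uris:
        if urlsplit(uri).port is not None:
            direct.append(uri)
        else:
            logical.append(uri)
    return direct, logical


if __name__ == "__main__":
    # Log entry copied from the failing run in this report.
    log_text = (
        "15/05/06 12:45:56 INFO balancer.Balancer: namenodes =\n"
        "[hdfs://jolokia1.labs.teradata.com:8020, hdfs://JOLOKIA]"
    )
    direct, logical = split_direct_and_logical(parse_balancer_namenodes(log_text))
    print("direct:", direct)
    print("logical:", logical)
```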