Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by DrakeMcSmooth:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

------------------------------------------------------------------------------
+ == Problem: Master initializes, but Region Servers do not ==
- == Problem: Master node initializes, but the datanodes of slave nodes do not ==
-  * Master node activates ''DataNode'' and ''TaskTracker'' on itself and the slave nodes, but ''dfshealth'' only shows 1 Live Node, the Master node.
-  * Slave node's tasktracker log contains repeated instances of the following block:
+  * Master's log contains repeated instances of the following block:
-   ~-2007-11-27 11:09:39,293 INFO org.apache.hadoop.ipc.RPC: Server at masternode/192.168.222.23:54311 not available yet, Zzzzz...[[BR]]
-   2007-11-27 11:09:40,299 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 1 time(s).[[BR]]
+   ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 1 time(s).[[BR]]
-   2007-11-27 11:09:41,303 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 2 time(s).[[BR]]
+   INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 2 time(s).[[BR]]
+   ...
+   INFO org.apache.hadoop.ipc.RPC: Server at /127.0.0.1:60020 not available yet, Zzzzz...-~
+  * Region Servers' logs contains repeated instances of the following block:
-   2007-11-27 11:09:42,309 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 3 time(s).[[BR]]
-   2007-11-27 11:09:43,314 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 4 time(s).[[BR]]
-   2007-11-27 11:09:44,319 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 5 time(s).[[BR]]
-   2007-11-27 11:09:45,324 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 6 time(s).[[BR]]
-   2007-11-27 11:09:46,329 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 7 time(s).[[BR]]
-   2007-11-27 11:09:47,332 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 8 time(s).[[BR]]
-   2007-11-27 11:09:48,336 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 9 time(s).[[BR]]
+   ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 9 time(s).
-   2007-11-27 11:09:49,342 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:54311. Already tried 10 time(s).[[BR]]
+   INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 10 time(s).
-   2007-11-27 11:09:50,347 INFO org.apache.hadoop.ipc.RPC: Server at masternode/192.168.100.50:54311 not available yet, Zzzzz...-~
+   INFO org.apache.hadoop.ipc.RPC: Server at masternode/192.168.100.50:60000 not available yet, Zzzzz...-~
-
+  * Note that the Master believes the Region Servers have the IP of 127.0.0.1 - which is localhost and resolves to the master's own localhost.
  === Causes ===
-  * That port on the master node is not accessible from other nodes on the network
+  * The Region Servers are erroneously informing the Master that their IP addresses are 127.0.0.1.
  === Resolution ===
-  * Modify <code>/etc/hosts</code> on the master node, from
+  * Modify <code>/etc/hosts</code> on the region servers, from
  {{{
  # Do not remove the following line, or various programs
  # that require network functionality will fail.
- 127.0.0.1 masternode localhost.localdomain localhost
+ 127.0.0.1 fully.qualified.regionservername regionservername localhost.localdomain localhost
  ::1 localhost6.localdomain6 localhost6
  }}}
@@ -34, +29 @@
  127.0.0.1 localhost.localdomain localhost
  ::1 localhost6.localdomain6 localhost6
  }}}
-
-  * As a result '''netstat''' should return the following
-   ~-$ netstat -an | grep LISTEN
-   tcp 0 0 0.0.0.0:756 0.0.0.0:* LISTEN[[BR]]
-   tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN[[BR]]
-   '''tcp 0 0 ::ffff:192.168.100.50:54310 :::* LISTEN'''[[BR]]
-   tcp 0 0 :::50090 :::* LISTEN[[BR]]
-   tcp 0 0 :::50070 :::* LISTEN-~
-
  == Problem: HRegionServers have lease issues on starting Hbase ==
   * HRegionServers connect initially, then drop off due to '''LeaseExpiredException'''
