Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by RongEnFan:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

------------------------------------------------------------------------------
  === Resolution ===
   * Either reduce the load or add more memory/machines.
+ 
+ == Problem: Master initializes, but Region Servers do not ==
+  * The Master's log contains repeated instances of the following block:
+  ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 1 time(s).[[BR]]
+  INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:60020. Already tried 2 time(s).[[BR]]
+  ...[[BR]]
+  INFO org.apache.hadoop.ipc.RPC: Server at /127.0.0.1:60020 not available yet, Zzzzz...-~
+  * The Region Servers' logs contain repeated instances of the following block:
+  ~-INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 1 time(s).[[BR]]
+  INFO org.apache.hadoop.ipc.Client: Retrying connect to server: masternode/192.168.100.50:60000. Already tried 2 time(s).[[BR]]
+  ...[[BR]]
+  INFO org.apache.hadoop.ipc.RPC: Server at masternode/192.168.100.50:60000 not available yet, Zzzzz...-~
+  * Note that the Master believes the Region Servers have the IP 127.0.0.1, i.e. localhost, which resolves back to the Master's own machine.
+ === Causes ===
+  * The Region Servers are erroneously informing the Master that their IP addresses are 127.0.0.1.
+ === Resolution ===
+  * Modify '''/etc/hosts''' on the region servers, from
+ {{{
+ # Do not remove the following line, or various programs
+ # that require network functionality will fail.
+ 127.0.0.1 fully.qualified.regionservername regionservername localhost.localdomain localhost
+ ::1 localhost6.localdomain6 localhost6
+ }}}
+ 
+  * to the following (removing the region server's own hostname from the 127.0.0.1 line):
+ {{{
+ # Do not remove the following line, or various programs
+ # that require network functionality will fail.
+ 127.0.0.1 localhost.localdomain localhost
+ ::1 localhost6.localdomain6 localhost6
+ }}}
+ 
+ == Problem: Created Root Directory for HBase through Hadoop DFS ==
+  * On startup, the Master says that you need to run the HBase migrations script. Upon running it, the migrations script says there are no files in the root directory.
+ === Causes ===
+  * HBase expects its root directory either not to exist, or to have already been initialized by a previous run of HBase. If you create a new directory for HBase yourself using Hadoop DFS, this error will occur.
+ === Resolution ===
+  * Make sure the HBase root directory either does not currently exist or was initialized by a previous run of HBase. A sure-fire solution is to use Hadoop DFS to delete the HBase root and let HBase create and initialize the directory itself (see the example at the bottom of this page).
+ 
+ == Problem: Lots of DFS errors saying a block cannot be found on any live node ==
+  * Under heavy read load, you may see many DFSClient complaints that no live node holds a particular block, while the Hadoop DataNode logs show xceiverCount exceeding the limit (256).
+ 
+ === Causes ===
+  * There are not enough xceiver threads on the DataNodes to serve the traffic.
+ 
+ === Resolution ===
+  * Either reduce the load or set dfs.datanode.max.xcievers (in hadoop-site.xml) to a value larger than the default (256); see the example below. Note that to change this tunable you need Hadoop 0.17.2 or 0.18.0 (HADOOP-3859).
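For example, a hadoop-site.xml entry raising the limit might look like the following. The value 1023 is only an illustrative choice, not a recommendation; pick a limit that suits your cluster, and restart the DataNodes so the new setting takes effect.
{{{
<property>
  <!-- note the property name really is spelled "xcievers" -->
  <name>dfs.datanode.max.xcievers</name>
  <!-- illustrative value; the default is 256 -->
  <value>1023</value>
</property>
}}}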
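As an illustration of the resolution for "Created Root Directory for HBase through Hadoop DFS" above: assuming hbase.rootdir points at /hbase on your DFS (substitute whatever your own configuration actually says), the pre-created directory can be removed with the Hadoop shell so that HBase can create and initialize it on its next start. Be aware that this deletes any existing HBase data under that path.
{{{
# stop HBase first, then recursively remove the pre-created root directory
bin/hadoop dfs -rmr /hbase
}}}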
