Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by AndrewPurtell:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

------------------------------------------------------------------------------
   * See an exception with the above message in the logs (usually hadoop 0.18.x).
  === Causes ===
   * Slow datanodes are marked as down by DFSClient; eventually all replicas 
are marked as 'bad' (HADOOP-3831).
+  * Insufficient file descriptors available at the OS level for DFS DataNodes.
  === Resolution ===
+  * Increase the file descriptor limit of the user account under which the DFS 
DataNode processes are operating. On most Linux systems, adding the following 
lines to /etc/security/limits.conf will increase the file descriptor limit from 
the default of 1024 to 32768. Substitute the actual user name for {{{<user>}}}. 
+    {{{
+ <user>          soft    nofile          32768
+ <user>          hard    nofile          32768
+ }}}
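+    Once the user has logged in again, the new limit can be verified from a 
shell; a quick check (assuming a Bourne-compatible shell such as bash):
+    {{{
+ $ ulimit -n
+ 32768
+ }}}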
-  * Apply HADOOP-4681 to your cluster or at least to the hadoop jar used by 
hbase.
+  * Apply HDFS-127 (formerly HADOOP-4681) to your cluster or at least to the 
hadoop jar used by hbase.
   * Try setting '''dfs.datanode.socket.write.timeout''' to zero (in hadoop 
0.18.x -- see HADOOP-3831 for details and for why this is not needed in hadoop 
0.19.x). See the thread at 
[http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%[email protected]%3e
 message from jean-adrien] for some background. Note that this is an HDFS 
client configuration, so it needs to be available in $HBASE_HOME/conf; making 
the change only in $HADOOP_HOME/conf is not sufficient. Copy your amended 
hadoop-site.xml to the hbase conf directory, or add this configuration to 
$HBASE_HOME/conf/hbase-site.xml.
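+    As a sketch, the entry in $HBASE_HOME/conf/hbase-site.xml would look 
something like the following (standard Hadoop configuration XML, with the 
property name and value as described above):
+    {{{
+ <property>
+   <name>dfs.datanode.socket.write.timeout</name>
+   <value>0</value>
+ </property>
+ }}}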
   * Try increasing '''dfs.datanode.handler.count''' from its default of 3. 
This is a server configuration change, so it must be made in 
$HADOOP_HOME/conf/hadoop-site.xml. Try 10 first, then increase in further 
increments of 10. It probably does not make sense to use a value larger than 
the total number of nodes in the cluster.
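+    For example, starting at 10 in $HADOOP_HOME/conf/hadoop-site.xml (the same 
standard configuration XML, sketched here):
+    {{{
+ <property>
+   <name>dfs.datanode.handler.count</name>
+   <value>10</value>
+ </property>
+ }}}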
  
