Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by AndrewPurtell:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

The comment on the change is:
some initial comments/recommendations about Amazon EC2 deployments 

------------------------------------------------------------------------------
   1. [#5 Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 
256"]
   1. [#6 Problem: "No live nodes contain current block"]
   1. [#7 Problem: DFS instability and/or regionserver lease timeouts]
+  1. [#8 Problem: Instability on Amazon EC2]
  
  [[Anchor(1)]]
  == 1. Problem: Master initializes, but Region Servers do not ==
@@ -127, +128 @@

      * [http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html 
Tuning garbage collector in Java SE 6]
   * For Java SE 6, some users have had success with {{{ 
-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode }}}
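  A minimal sketch of where such flags can be set, assuming the stock conf/hbase-env.sh shipped with HBase (paths and flags are illustrative; adjust for your install):
  {{{
  # conf/hbase-env.sh -- illustrative only
  # HBASE_OPTS is passed to every HBase JVM (master and regionservers)
  export HBASE_OPTS="$HBASE_OPTS -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
  }}}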
  
+ [[Anchor(8)]]
+ == 8. Problem: Instability on Amazon EC2 ==
+  * Various problems that suggest an overloaded Amazon EC2 deployment: Scanner timeouts, problems locating HDFS blocks, missed heartbeats, "We slept xxx ms, ten times longer than scheduled" messages, and so on. 
+  * These problems continue after following the other relevant advice on this 
page. 
+  * Or, you are trying to use Small or Medium instance types. (Do not.)
+ === Causes ===
+  * Hadoop and HBase daemons each require 1 GB of heap, and therefore RAM; in load intensive environments, HBase regionservers may require more than this. There must be enough RAM to comfortably hold the working sets of all Java processes running on the instance, including any mapper or reducer tasks co-located with the system daemons (see the heap sizing sketch after this list). Small and Medium instances do not have enough RAM for typical Hadoop+HBase deployments. 
+  * Hadoop and HBase daemons are latency sensitive. There should be enough free RAM that no swapping occurs; swapping during garbage collection can suspend JVM threads for a critically long time. There should also be sufficient virtual cores to service the JVM threads whenever they become runnable. Large instances have two virtual cores, so they can run the HDFS and HBase daemons concurrently, but nothing more. X-Large instances have four virtual cores, so they can run two mappers or reducers concurrently in addition to the HDFS and HBase daemons. Configure TaskTracker concurrency limits accordingly (see the configuration sketch after this list), or separate mapreduce computation from storage functions. 
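+ 
+ A minimal sketch of per-daemon heap sizing (HADOOP_HEAPSIZE and HBASE_HEAPSIZE are the stock settings in the respective env scripts; the values below are illustrative, not recommendations):
+ {{{
+ # conf/hadoop-env.sh -- heap, in MB, given to each Hadoop daemon on the instance
+ export HADOOP_HEAPSIZE=1000
+ 
+ # conf/hbase-env.sh -- heap, in MB, given to each HBase daemon; a loaded
+ # regionserver may need more than the 1000 MB default
+ export HBASE_HEAPSIZE=2000
+ }}}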
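+ 
+ A sketch of capping TaskTracker concurrency on an X-Large instance (two task slots alongside the HDFS and HBase daemons); the property names are the standard Hadoop ones, but the file they belong in (hadoop-site.xml or mapred-site.xml) and the right counts depend on your Hadoop version and workload:
+ {{{
+ <!-- allow at most one concurrent map task and one concurrent reduce task per TaskTracker -->
+ <property>
+   <name>mapred.tasktracker.map.tasks.maximum</name>
+   <value>1</value>
+ </property>
+ <property>
+   <name>mapred.tasktracker.reduce.tasks.maximum</name>
+   <value>1</value>
+ </property>
+ }}}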
+ === Resolution ===
+  * Use Large instances for HDFS and HBase storage tasks.
+  * Use X-Large instances if you are also running mappers and reducers 
co-located with system daemons.
+  * Consider splitting storage and computational functions across disjoint instance sets. 
+ 
