About your new crash:

2010-07-15 04:09:03,248 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.3544312MB (3517376), Free=405.42056MB (425114272), Max=408.775MB (428631648), Counts: Blocks=0, Access=1985476, Hit=0, Miss=1985476, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.0%, Miss Ratio=100.0%, Evicted/Run=NaN
2010-07-15 04:11:32,791 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.3544312MB (3517376), Free=405.42056MB (425114272), Max=408.775MB (428631648), Counts: Blocks=0, Access=1985759, Hit=0, Miss=1985759, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.0%, Miss Ratio=100.0%, Evicted/Run=NaN
2010-07-15 04:11:32,965 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.3544312MB (3517376), Free=405.42056MB (425114272), Max=408.775MB (428631648), Counts: Blocks=0, Access=1985768, Hit=0, Miss=1985768, Evictions=0, Evicted=0, Ratios: Hit Ratio=0.0%, Miss Ratio=100.0%, Evicted/Run=NaN
2010-07-15 04:11:33,330 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 176029ms for sessionid 0x129c87a7f980045, closing socket connection and attempting reconnect
2010-07-15 04:11:33,330 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 175969ms for sessionid 0x129c87a7f980052, closing socket connection and attempting reconnect

Your machine wasn't able to talk to the ZooKeeper server for almost 3
minutes (176029ms), and we can see that it stopped logging for roughly
that long too. Again, I'd like to point out that this is a
JVM/hardware issue.
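If you want to confirm the pause, turn on GC logging in hbase-env.sh,
something like this (standard HotSpot flags, untested here, adjust the
log path for your install):

  # hbase-env.sh: make long GC pauses visible with timestamps
  export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
      -XX:+PrintGCTimeStamps -Xloggc:/path/to/logs/gc-hbase.log"

A stop-the-world collection (or swapping) that outlasts the ZooKeeper
session timeout will show up in that log as a pause matching the gap
you see above.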
My other comments are inline.

J-D

On Thu, Jul 15, 2010 at 9:55 AM, Jinsong Hu <[email protected]> wrote:
> I tested HBase with client throttling and I noticed that in the
> beginning, the process is limited by CPU speed. After running for
> several hours, I noticed that the speed is limited by disk. Then I
> realized that there is a serious flaw in hbase design:
>
> In the beginning, the number of regions is very small, so hbase only
> does compaction for that limited number of regions. But as insertion
> continues, we will have more and more regions, and the clients keep
> inserting records into them, preferably uniformly. At that point lots
> of regions need to be compacted. Most clusters have a relatively
> stable number of physical machines, and more regions lead to more
> compactions, so a given physical machine needs to support more
> compaction work as time goes on. Compaction mostly involves disk IO,
> so sooner or later the disk IO capacity will be exhausted. At that
> moment the machine is no longer responsive, timeouts happen, and the
> hbase regionserver crashes.

To me it looks like a GC pause with swapping and/or CPU starvation,
which leads to the region server shutting itself down.

> This seems to be a design flaw in hbase. In my testing, even a
> sustained insertion rate of 1000 can crash the cluster. Is there any
> solution for this?

Well, first, the values you are inserting are a lot bigger than what
HBase is optimized for out of the box.

WRT your IO exhaustion description, what you really described is the
limit of scalability for a single region server. If HBase had a
single-server architecture I'd say you're right, but this is not the
case. In the Bigtable paper, they say they usually support 100 tablets
(regions) per tablet server (our RegionServer). When you grow past
that, you need to add more machines! This is the scalability model:
HBase can grow to hundreds or thousands of servers without any fancy
third-party software.

WRT solutions:

1- If this is your initial data import, I'd recommend you use
HFileOutputFormat since it will be much faster and you don't have to
wait for splits/compactions to happen. See
http://hbase.apache.org/docs/r0.20.5/api/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.html
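The job setup looks something like the sketch below. This is untested
and written from memory against the 0.20-era
org.apache.hadoop.hbase.mapreduce API, and ImportMapper is a made-up
mapper that assumes "rowkey<TAB>value" text input going into a single
column family, so treat it as a starting point, not a recipe:

  import java.io.IOException;

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
  import org.apache.hadoop.hbase.mapreduce.KeyValueSortReducer;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class BulkImport {

    // Made-up example mapper: turns "rowkey<TAB>value" lines into
    // KeyValues for family "f", qualifier "q".
    static class ImportMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
      @Override
      protected void map(LongWritable offset, Text line, Context ctx)
          throws IOException, InterruptedException {
        String[] parts = line.toString().split("\t", 2);
        byte[] row = Bytes.toBytes(parts[0]);
        KeyValue kv = new KeyValue(row, Bytes.toBytes("f"),
            Bytes.toBytes("q"), Bytes.toBytes(parts[1]));
        ctx.write(new ImmutableBytesWritable(row), kv);
      }
    }

    public static void main(String[] args) throws Exception {
      Job job = new Job(new HBaseConfiguration(), "bulk import");
      job.setJarByClass(BulkImport.class);
      job.setMapperClass(ImportMapper.class);
      job.setMapOutputKeyClass(ImmutableBytesWritable.class);
      job.setMapOutputValueClass(KeyValue.class);
      // Ships with HBase next to HFileOutputFormat; sorts each row's
      // KeyValues since HFiles must be written in order.
      job.setReducerClass(KeyValueSortReducer.class);
      job.setOutputFormatClass(HFileOutputFormat.class);
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }

The output directory then still has to be handed to HBase; in 0.20
that's done with the bin/loadtable.rb script that ships with the
distribution (and the row keys must be totally ordered across
reducers, see SimpleTotalOrderPartitioner).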
2- If you are trying to see how HBase handles your expected "normal"
load, then I would recommend getting production-grade machines.

3- If you wish to keep the same hardware, you need to give more
resources to HBase. Running maps and reduces on the same machine, when
you have only 4 cores, easily leads to resource contention. We
recommend running HBase with at least 4GB, and you have only 2GB. It
also needs cores: at StumbleUpon, for example, we use 2x i7, which
amounts to 16 CPUs when counting hyperthreading, so we run 8 maps and
6 reducers per machine, leaving at least 2 CPUs for HBase (it's a
database, it needs processing power). Even when we did the initial
import of our data, we didn't have any GC issues. This is also the
setup for our production cluster, although we (almost) don't run MR
jobs on it since they usually are IO-heavy. I put a concrete example
of those settings at the bottom of this mail.

> Jimmy.
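For #3, the knobs are the region server heap in hbase-env.sh and the
task slots in mapred-site.xml, something like the following (the
values are only an illustration for a small 4-core box, not a tuned
recommendation for your workload):

  # hbase-env.sh: heap size in MB, we recommend at least 4GB
  export HBASE_HEAPSIZE=4000

  <!-- mapred-site.xml: cap task slots so MR doesn't starve HBase -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>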