What Tim said, and then some comments inline below. What version of HBase?
> This happens every time when the first region starts to split. As far as I
> can see the table is set to enabled *false* (web admin), the web admin
> becomes a little bit less responsive - listing the table regions shows no
> regions - and after a while I can see 500 or more regions.

You go from zero to 500 regions with nothing showing in between? That's
pretty impressive. 500 regions in 256M on 3 servers is probably pushing it.

> Some of them, as the exception shows, are not fully available.

Identify the duff regions by running a full table scan in the shell with
DEBUG enabled on the client. It'll puke when it hits the first broken region
(a rough sketch is at the end of this mail).

> HDFS doesn't seem to be the main issue. When I run fsck it says the hbase
> dir is healthy apart from some under-replicated blocks. Occasionally I saw
> that some blocks were missing but I think this was due to "Too many files
> open" exceptions (too small a region size - now it's the default 64).

Too many open files is bad. Check out the HBase 'Getting Started' (the usual
ulimit fix is sketched at the end of this mail).

> Amount of data is not enormous - around 1GB in less than 100k rows - and
> then these problems start to occur. Requests per second are, I think,
> small - 20-30 per second.
> What else I can say is I've set the max hbase retry count to only 2
> because we can't allow clients to wait any longer for a response.

I would suggest you leave things at default until you are running smoothly,
then start optimizing.

> What I would like to know is whether the table is always disabled when
> performing region splits?

No. The region goes offline for some period of time. If machines are heavily
loaded it will take longer for it to come back on line again.

> And is it truly disabled then, so that clients can't do anything?
> It looks like status says disabled but requests are still processed,
> though with different results (some like above).

Disabled or 'offline'? Parents of region splits go offline and are replaced
by their new daughter splits.

> My cluster setup can probably be useful -
> 3 CentOS virtual machines based on Xen running DN/HR and zookeeper, plus
> one of them running NodeMaster and Secondary Master.
> 2 gigs of ram on each. Currently hadoop processes run with Xmx 512 and
> hbase with 256 but none of them is swapping nor going out of memory.
> GC logs look normal - stop-the-world is not occurring ;)

Really? No full GCs even though the heap is only 256M and there are about
100-plus regions per server? (GC logging flags to verify this are sketched
at the end of this mail.)

> top says cpus are nearly idle on all machines.
>
> It's far from ideal but we need to prove that this can work reliably to
> get more toys.
> Maybe next week we will be able to test on some better machines but for
> now that's all I've got.

Makes sense. You are starting very small though, and virtual machines have
proven a flaky foundation for hbase. Read back over the list and look for
ec2 mentions.

St.Ack

> Any advice is welcome.
>
> Thanks,
> Michal
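
For the duff-region hunt above, a rough sketch, assuming a stock log4j setup
and a placeholder table name 'mytable': turn the client-side logger up to
DEBUG, then run a full scan from the shell; it should fail on the first
broken region and the DEBUG output will name it.

    # conf/log4j.properties on the client (assumes the default log4j config)
    log4j.logger.org.apache.hadoop.hbase=DEBUG

    # then scan the whole table; 'mytable' is only a placeholder name
    $ bin/hbase shell
    hbase> scan 'mytable'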
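
On the "Too many files open" exceptions, the usual fix from 'Getting
Started' is raising the per-user file-descriptor limit for whatever user
runs the DataNode and region server processes; the user name 'hadoop' and
the value 32768 below are only examples.

    # check the current limit as the user that runs the daemons
    $ ulimit -n

    # /etc/security/limits.conf - raise the soft and hard nofile limits,
    # then log back in and restart the daemons
    hadoop  soft  nofile  32768
    hadoop  hard  nofile  32768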
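
And on the GC question, if you want to be sure there really are no long
pauses on a 256M heap, you can turn on GC logging in conf/hbase-env.sh with
the standard JVM flags; the log path below is arbitrary.

    # conf/hbase-env.sh - standard verbose GC flags; log path is arbitrary
    export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
        -XX:+PrintGCTimeStamps -Xloggc:/var/log/hbase/gc-hbase.log"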
