bq. capacity load on terms of numbers of regions per region server

I guess you meant to say 'in terms of ...'
Yes. 0.94 load balancer looks at region count only.

On Tue, Jan 21, 2014 at 9:39 PM, Asaf Mesika <[email protected]> wrote:

> If hot means many requests, then it's only in 0.96 right? 0.94 is only
> addressing capacity load on terms of numbers of regions per region server
> of the same table.
>
> On Monday, January 20, 2014, Ted Yu <[email protected]> wrote:
>
> > bq. under heavy load by serving to hot regions
> >
> > Did you mean 'two hot regions' ?
> > If so, the master will move one of them to another RS.
> >
> > Cheers
> >
> > On Mon, Jan 20, 2014 at 6:17 AM, Bill Q <[email protected]> wrote:
> >
> > > Hi Ted and Bharath,
> > > Thanks a lot for the replies.
> > >
> > > For question #1, if there is a RS is under heavy load by serving to hot
> > > regions, the HMaster will move one of the two regions to another RS, or
> > > HMaster will split both of them and move the newly created halves to other
> > > RSs?
> > >
> > > For question #3, does this mean that a HFile has many 64k blocks, but
> > > itself is around 64M (or 128M)?
> > >
> > > Many thanks.
> > >
> > > Bill
> > >
> > > On Mon, Jan 20, 2014 at 1:49 AM, Bharath Vissapragada <[email protected]> wrote:
> > >
> > > > For question #3, The block size Lars talks about is the blocksize inside a
> > > > HFile which is different from HDFS block size. Look at
> > > > http://hbase.apache.org/book/apes03.html . Hfile is indexed as blocks to
> > > > facilitate random access to data so that we can skip unnecessary disk
> > > > blocks while gets/scans. Smaller the hfile block size better is the random
> > > > read performance. You can see the detailed hfile layout in that link.
> > > >
> > > > For question #4, You are correct, since the data resides on HDFS, each
> > > > region server has access to all the storefiles (they just use hdfs api to
> > > > read them). The reason they are still available after a (RS+datanode) crash
> > > > is because of the replication in hdfs. The store files still have valid
> > > > replicas and namenode tries to maintain the replication factor by
> > > > re-replicating them eventually.
> > > >
> > > > On Mon, Jan 20, 2014 at 12:08 PM, Ted Yu <[email protected]> wrote:
> > > >
> > > > > For question #1, there is load balancer in HMaster which does the job of
> > > > > balancing region load.
> > > > >
> > > > > For number 2, the daughter regions stay on the same server as the parent
> > > > > after split. Later one or both of them may be moved to other region servers.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Jan 19, 2014, at 10:27 PM, Bill Q <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > > I am trying to get more information about HBase. I would appreciate some
> > > > > > answers to these few questions. Thanks a lot.
> > > > > >
> > > > > > 1. About load balancing: does HMaster monitor overloaded or low loaded
> > > > > > HRegionServer, and move some regions from the hot HRegionServer to low
> > > > > > loaded ones (with or without add new servers into the cluster,
> > > > > > respectively)?
> > > > > >
> > > > > > 2. About region splitting: when splitting a region, will the newly created
> > > > > > regions stay on the current HRegionServer, or will HMaster assign some new
> > > > > > HRegionServers to take the newly created two regions?
> > > > > >
> > > > > > 3. About HFile size: Lars mentioned here
> > > > > > http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html that
> > > > > > the HFile size is default to 64k. How does this work while the default HDFS
> > > > > > block is 64M/128M? Would the small HFile size waste lots of space on HDFS?
> > > > > >
> > > > > > 4. About data locality: if a HRegionServer fails, the HMaster would assign
> > > > > > a new HRegionServer to take its place. But does this new HRegionServer
> > > > > > should have access to the storeFiles? I assumed that's how it works by
> > > > > > using HDFS's data replication. But after some readings, I got confused. It
> > > > > > seems that the new HRegionServer can work without the storeFiles data a
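
To illustrate the point at the top of this thread that the 0.94 balancer looks at region count only, here is a minimal toy sketch in Python. It is not HBase's actual balancer code: the function name balance_by_region_count, the server names and the region names are all hypothetical, and it ignores the per-table aspect Asaf mentions. It only shows what "balancing by count" means: regions are moved from the server with the most regions to the server with the fewest until the counts differ by at most one, and request load is never consulted.

```python
def balance_by_region_count(assignment):
    """Toy count-only balancer (hypothetical, not the HBase implementation).

    `assignment` maps a server name to a list of region names. Only the
    number of regions per server is considered; request load is ignored.
    Returns (new_assignment, moves), where moves is a list of
    (region, source_server, target_server) tuples.
    """
    # Copy the input so the caller's data is left untouched.
    servers = {name: list(regions) for name, regions in assignment.items()}
    moves = []
    if not servers:
        return servers, moves

    while True:
        fullest = max(servers, key=lambda s: len(servers[s]))
        emptiest = min(servers, key=lambda s: len(servers[s]))
        # Stop once region counts are within one of each other.
        if len(servers[fullest]) - len(servers[emptiest]) <= 1:
            break
        region = servers[fullest].pop()
        servers[emptiest].append(region)
        moves.append((region, fullest, emptiest))
    return servers, moves


# Hypothetical cluster: rs1 holds five regions, rs2 one, rs3 none.
cluster = {
    "rs1": ["r1", "r2", "r3", "r4", "r5"],
    "rs2": ["r6"],
    "rs3": [],
}
balanced, moves = balance_by_region_count(cluster)
counts = sorted(len(regions) for regions in balanced.values())
print(counts)       # region count per server after balancing
print(len(moves))   # number of region moves performed
```

Under this count-only model, a server whose region count already matches its peers triggers no moves, however many requests its regions are serving. That is consistent with Asaf's remark that balancing on request load is a 0.96 matter.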
