Hi, I am trying to get more information about HBase. I would appreciate some answers to these few questions. Thanks a lot.
1. About load balancing: does HMaster monitor overloaded and lightly loaded HRegionServers, and move some regions from the hot HRegionServers to the lightly loaded ones (with or without adding new servers to the cluster)?

2. About region splitting: when a region is split, will the two newly created regions stay on the current HRegionServer, or will HMaster assign other HRegionServers to take them?

3. About HFile size: Lars mentioned here http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html that the HFile size defaults to 64 KB. How does this work when the default HDFS block size is 64 MB/128 MB? Would the small HFile size waste a lot of space on HDFS?

4. About data locality: if an HRegionServer fails, the HMaster assigns a new HRegionServer to take its place. But shouldn't this new HRegionServer have access to the storeFiles? I assumed that is how it works, by using HDFS's data replication. But after some reading, I got confused. It seems that the new HRegionServer can work without having the storeFiles data locally. How does this work at all?

Many thanks.
Bill
