On Thu, Jan 19, 2012 at 3:34 AM, Praveen Sripati <[email protected]> wrote:
> 1. When the memstore fills, is it flushed to HDFS or local file system?
>

HDFS.

> 2. If the region size (hbase.hregion.max.filesize) is set to 200MB and the
> HDFS Block Size is set to 64MB, will the region be split across 4 data
> nodes? I know that this doesn't make sense to split a single regions data
> across nodes in HDFS, but how is it handled in HBase?
>

Do you mean file in the above rather than region? If so, yes, the file will
be made of multiple HDFS blocks. The blocks will be replicated. Usually one
replica is on the datanode local to the regionserver. See the reference
guide for more on HBase locality.

> 3. Is region size (hbase.hregion.max.filesize) the size of commit log or
> the size of the file that has been flushed?
>

It's about the files under a region. WALs/logs have their own configs.

> 4. The commit log might become big over time, is there similar concept of
> checkpoint in HBase for the commit logs?
>

WALs are rolled at a configurable size -- usually 64MB. WALs whose edits
have all been flushed to hfiles are let go/deleted.

St.Ack
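
As a rough illustration of the arithmetic behind question 2, the sketch below works out how many HDFS blocks a single file occupies. It assumes, purely for the example, one store file of exactly 200MB and a 64MB block size; the helper name hdfs_block_sizes is hypothetical and is not part of HBase or HDFS.

```python
MB = 1024 * 1024

def hdfs_block_sizes(file_size, block_size):
    """Return the sizes (in bytes) of the blocks a file of file_size bytes occupies.

    Hypothetical helper for illustration only: every block is full except
    possibly the last, which holds whatever is left over.
    """
    if file_size <= 0:
        return []
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes

# Assumed figures from the question: a 200MB file and a 64MB HDFS block size.
blocks = hdfs_block_sizes(200 * MB, 64 * MB)
print(len(blocks), [b // MB for b in blocks])  # 4 [64, 64, 64, 8]
```

So such a file would span four blocks, the last only 8MB. Each block is replicated independently, which is why the replicas can end up on different datanodes while one copy usually stays local to the regionserver.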
