Hi All, I had some questions about the hbase architecture that I am a little confused about.
After doing reading over the internet / HBase book etc, my understanding of regions is the following -> When the cluster initially starts up (with no data), the regionservers come online. When the data starts flowing into HBase, the data starts being written into a single region. Each region is composed of a memstore, storefiles and a WAL. As memstores get filled up, they are written to storefiles on disk. A major compaction combines these storefiles together into a single store file. Storefiles/memstores exist on a per column family basis. What I do not understand is the following - the StoreFiles for a particular region/column family are written into HDFS as an HFile. This HFile then exists in HDFS as chunks (usually of 64 MB size) across the HDFS data nodes. When the region "splits", what happens to these store files ? If a region is already split into multiple store files and a compaction runs and a split was supposed to occur, the compaction process will only compact half of the files ? If a region only has a single store file, how is the data split into two ? Is half of the data written into a separate file and then the original file truncated ? Isn't this an expensive operation ? Secondly, what does it mean to "move" a region from one regionserver to another. Since a region is composed of a set of HFiles on the HDFS filesystem, and can potentially exist anywhere across the datanode cluster, is any data physically being moved from one machine to another ? or is it just that the regionserver that the region has "moved" to takes control of the file on HDFS that needs to be written to ? I guess what I am not being able to fully understand is how is HBase effectively using HDFS when it comes to region distribution. If a regionserver is hosting a hot region whose blocks sit on a certain set of datanodes, how does moving this region to another regionserver help ? The traffic ultimately would land on the same set of datanodes holding the different chunks of the region .. shouldn't Hbase then be doing load balancing based on where the regions are physically located on the datanodes as opposed to where they are *logically* located ? Please let me know if I am missing something truly fundamental here. Thank you, Sam
