Xin, Comments inline.
Regards,

J-D

On Tue, Jul 22, 2008 at 2:28 AM, Xin Jing <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am a new user of HBase, and I am curious about the insert process of HBase.
> Could you please explain it in detail?
>
> The question is: when I create a table (only one column, to make it easy
> to describe) and insert a huge amount of data into the table. I know it is
> a B-Tree-like storage structure; what is the mechanism to build the table?
>
> 1. When the table size is over a threshold, how to split it?

Each table is divided into regions, which are distributed among the region
servers (nodes). A region splits once it grows larger than a configured size.
This is described here:
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture#hregion

>
> 2. When inserting data into the table, is all the data in
> memory? If not, how to make sure the performance is good enough?

Also described in the link above.

>
> 3. When all the data has been inserted into the table, there must
> be a lot of files. And the file sizes may differ to some extent (some files
> are several M, while some may be several hundred M). Do I need to make the
> file sizes similar, and how?

This is also described in the link above.

>
> Thanks
> -Xin
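
As a rough illustration of the answer to question 1, here is a toy Python sketch of a table whose regions split once they grow past a configured size. It is not HBase code and does not use any HBase API: the `Region` and `Table` classes, the `max_region_size` parameter, the size estimate, and the choice of the middle row key as the split point are all assumptions made for the example. It also ignores region servers, in-memory buffering, and on-disk files entirely.

```python
import bisect


class Region:
    """Toy region: holds the rows whose keys are >= start_key
    and < the start_key of the next region (hypothetical model)."""

    def __init__(self, start_key, rows=None):
        self.start_key = start_key
        self.rows = rows or {}  # row key -> value

    def size(self):
        # Crude size estimate: total characters in keys and values.
        return sum(len(k) + len(v) for k, v in self.rows.items())


class Table:
    """Toy table: an ordered list of regions covering the whole key space."""

    def __init__(self, max_region_size):
        self.max_region_size = max_region_size  # assumed "configured size"
        self.regions = [Region("")]  # start with one region for all keys

    def _region_index(self, key):
        # Find the last region whose start_key is <= key.
        starts = [r.start_key for r in self.regions]
        return bisect.bisect_right(starts, key) - 1

    def put(self, key, value):
        i = self._region_index(key)
        region = self.regions[i]
        region.rows[key] = value
        # Split when the region has outgrown the configured size.
        if region.size() > self.max_region_size and len(region.rows) > 1:
            keys = sorted(region.rows)
            half = len(keys) // 2
            # Move the upper half of the keys into a new region.
            upper_rows = {k: region.rows.pop(k) for k in keys[half:]}
            self.regions.insert(i + 1, Region(keys[half], upper_rows))
```

Inserting many rows into such a table leaves it with several regions, each holding a contiguous range of row keys and each no larger than the threshold at the time it was last checked.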
