Praveenesh, I recommend reading the Google Bigtable paper (http://labs.google.com/papers/bigtable.html), which is the foundation for HBase. The terminology is a little different, though:
Mapping of terms (not exhaustive):

***********************************
Bigtable          HBase
***********************************
Master server     HMaster
Tablet            Region
Tablet server     RegionServer
Chubby            ZooKeeper (an Apache implementation of a distributed synchronization server)

HBase stores its data on Hadoop DFS; HBase is a client of HDFS. Hence Hadoop will automatically distribute and replicate the data across your Hadoop cluster. The HBase master and regionservers format/transform the data and rely on Hadoop for storage and retrieval.

An HBase cluster can run on a separate set of nodes, or it can share the Hadoop nodes. Once you have set up the HDFS cluster, the HBase cluster can be set up easily.

Thanks
Gaurav

On Tue, Apr 26, 2011 at 10:55 AM, praveenesh kumar <[email protected]> wrote:

> Hello everyone,
>
> Thanks everyone for guiding me everytime. I am able to setup hadoop cluster
> of 10 nodes.
> Now comes HBASE..!!!
>
> I am new to all this...
> My problem is I have huge data to analyze.
> so shall I go for single node Hbase installation on all nodes or go for
> distributed Hbase installation.??
>
> How distributed installation is different from single node installaion ??
> Now suppose if I have distributed Hbase...
> and If I design some table on my master node.. and then store data on it..
> say around 100M. How the data is going to be distributed.. Will HBASE do it
> automatically or we have to write codes for getting it distributed ??
> Is there any good tutorial that tells us more about HBase and how to work
> on it ???
>
> Thanks,
> Praveenesh
>
