Hello there,

IMHO, 5-8 servers are enough to start with. But it all depends on how much data you have and how intensive your reads and writes are. You should have different strategies, though, based on whether the workload is read-heavy or write-heavy. You can't really define 'big' in absolute terms: my cluster might be big for me, but for someone else it might still be too small, and for someone else again it might be very big. Long story short, it depends on your needs. If you can achieve your goal with 5-8 RSs (region servers), then adding more machines would be a waste, I think.
But you should always keep in mind that HBase is kind of greedy when it comes to memory. For a decent load, 4G is sufficient, IMHO. But again, it depends on the operations you are going to perform. If you have a large cluster where you plan to run MR (MapReduce) jobs frequently, you are better off with an additional 2G.

Warm Regards,
Tariq
cloudfront.blogspot.com

On Sat, Jun 22, 2013 at 7:51 PM, myhbase <myhb...@126.com> wrote:

> Hello All,
>
> I learn hbase almost from papers and books, according to my
> understanding, HBase is the kind of architecture which is more appliable
> to a big cluster. We should have many HDFS nodes, and many HBase(region
> server) nodes. If we only have several severs(5-8), it seems hbase is
> not a good choice, please correct me if I am wrong. In addition, how
> many nodes usually we can start to consider the hbase solution and how
> about the physic mem size and other hardware resource in each node, any
> reference document or cases? Thanks.
>
> --Ning