It says: "The master and namenode are the entry points of their respective levels, meaning that if an HBase client wants a specific data, it first has to ask to the master that knows which is the region server that stores it."
Which is wrong, quoting the Bigtable paper (which your team should consider reading):

"As with many single-master distributed storage systems [17, 21], client data does not move through the master: clients communicate directly with tablet servers for reads and writes. Because Bigtable clients do not rely on the master for tablet location information, most clients never communicate with the master. As a result, the master is lightly loaded in practice."

Which also impacts your conclusion:

"For example it can be interesting to see when a system based on an architecture using a single point of entry, such as HBase and its master, would be overload"

J-D

On Fri, May 13, 2011 at 4:06 AM, Thibault Dory <[email protected]> wrote:
> Hello,
>
> I have written with a few other people a paper for the ACM Symposium
> On Cloud Computing. This paper describes the methodology,
> infrastructure and configuration used as well as the results obtained
> for elasticity and scalability of three noSQL databases, of which
> HBase. The paper can be downloaded here:
> http://www.nosqlbenchmarking.com/wp-content/uploads/2011/05/paper.pdf
>
> Any feedback on the methodology used would be appreciated, we would
> like to know if HBase is used in a "fair" way in those tests.
>
> We also encountered a problem with the distribution of requests among region
> servers. This problem is described in section 5.4.2 and any hints on how to
> solve this problem would be appreciated. Please note that the request
> generation is independent of the specific database layer and that we did not
> observe this problem for the two other databases.
>
> Regards,
>
> Thibault Dory
>
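
To make the lookup path from the Bigtable quote concrete, here is a minimal Python sketch. It is a toy model, not the HBase client API: every class and method name (Master, MetaTable, RegionServer, Client, locate) is hypothetical, and the single-level meta table is a simplification of the real catalog lookup. The only point it illustrates is that the client resolves a row to a server through a location table, caches the answer, and then talks to that server directly, so the master never sees a client request.

```python
import bisect


class Master:
    """Hypothetical stand-in for the master. It would assign regions to
    servers, but in this sketch no client ever holds a reference to it."""

    def __init__(self):
        self.client_requests = 0


class RegionServer:
    """Hypothetical server that stores the rows of the regions it hosts."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def put(self, row, value):
        self.rows[row] = value

    def get(self, row):
        return self.rows.get(row)


class MetaTable:
    """Hypothetical location table: maps region start keys to servers.
    Assumption: it is served like any other table, not by the master."""

    def __init__(self, regions):
        # regions: list of (start_key, RegionServer), sorted by start_key,
        # with "" as the first start key so that every row falls in a region.
        self.start_keys = [key for key, _ in regions]
        self.servers = [server for _, server in regions]
        self.lookups = 0

    def locate(self, row):
        self.lookups += 1
        idx = bisect.bisect_right(self.start_keys, row) - 1
        has_next = idx + 1 < len(self.start_keys)
        end_key = self.start_keys[idx + 1] if has_next else None
        return self.start_keys[idx], end_key, self.servers[idx]


class Client:
    """Resolves rows via the meta table, caches region locations, and then
    sends reads and writes straight to the owning server."""

    def __init__(self, meta):
        self.meta = meta
        self.cache = []  # list of (start_key, end_key, server)

    def _locate(self, row):
        for start, end, server in self.cache:
            if start <= row and (end is None or row < end):
                return server  # cache hit: no lookup needed
        start, end, server = self.meta.locate(row)
        self.cache.append((start, end, server))
        return server

    def put(self, row, value):
        self._locate(row).put(row, value)

    def get(self, row):
        return self._locate(row).get(row)


# Usage: two regions split at "m", each on its own server.
master = Master()
rs1, rs2 = RegionServer("rs1"), RegionServer("rs2")
meta = MetaTable([("", rs1), ("m", rs2)])
client = Client(meta)

client.put("apple", "1")   # first touch of region ["", "m"): one meta lookup
client.put("zebra", "2")   # first touch of region ["m", end): one meta lookup
value = client.get("apple")  # served from the location cache
```

After the initial lookups, every request for a row in an already-seen region goes directly to its server, which is why the master stays lightly loaded regardless of client traffic.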
