On Sat, May 14, 2011 at 5:16 PM, Jean-Daniel Cryans <[email protected]> wrote:
> On Sat, May 14, 2011 at 6:40 AM, Thibault Dory <[email protected]> wrote:
> > I'm wondering what the possible bottlenecks of an HBase cluster are; even
> > if there are cache mechanisms, the fact that some data is centralized
> > could lead to a bottleneck (even if it's quite theoretical given the load
> > needed to achieve it).
>
> Isn't that what your paper is about?

Yes, that is part of what could be observed, but it looks like a much bigger
budget would be needed to get clusters big enough to observe it for HBase.
Anyway, the main thing we were interested in is elasticity.

> > Would it be right to say the following?
> >
> >   - The namenode is storing all the metadata and must scale vertically if
> >     the cluster becomes very big
>
> The fact that there's only 1 namenode is bad in multiple ways; generally
> people will be more bothered by the fact that it's a single point of
> failure. Larger companies do hit the limits of that single machine, so Y!
> worked on "Federated Namenodes" as a way to circumvent that. See
> http://www.slideshare.net/huguk/hdfs-federation-hadoop-summit2011
>
> This work is already available in hadoop's svn trunk.

Thanks, I did not know about "Federated Namenodes", this is interesting.

> >   - There is only one node storing the -ROOT- table and only one node
> >     storing the .META. table; if I'm doing a lot of random accesses and my
> >     dataset is VERY large, could I overload those nodes?
>
> Again, I believe this is the subject of your paper, right?

Indeed, this is part of it, but that does not mean that I'm an HBase
specialist; this is why I'm asking you here, as you may have more experience
with big clusters or a good knowledge of the internals of HBase.
Unfortunately I did not have the time to do this before the deadline of the
paper, but I'll be granted additional time if it is accepted, so it's not too
late.

> Anyways, so in general -ROOT- has 1 row, and that row is cached. Even if
> you have thousands of clients that need to update their .META. location
> (this would only happen at the beginning of a MR job or if .META. moves),
> serving from memory is fast.
>
> Next you have .META.; again, the clients cache their region locations, so
> once they have it they don't need to talk to .META. until a region moves
> or gets split. Also, .META. isn't that big and is usually served directly
> from memory.
>
> The BT paper mentions they allow the splitting of .META. when it grows a
> bit too much, and this is something we've blocked for the moment in HBase.
>
> J-D

Going back to my original problem, namely that one region server was always
overloaded with requests while the others were only serving a few, despite my
requests being generated with a uniform distribution, I would like to know
what you think about Ted Yu's idea that the overloaded region server could be
the one storing the .META. table. At that point in the tests, the cluster was
made of 24 nodes and was storing 40 million rows in HBase. As my requests are
fully random, there is a high probability, given the total number of entries,
that many of the requests issued by a client are for entries it did not
request before, leading to a lookup in the .META. table for almost every
request. Of course, this is only valid if the client does not know that an
entry it never asked for is in a region it has already accessed before.
Is that the case? For example, if a client asks for row 10 and sees that it is in region 2, will it know that row 15 is also in region 2 without making a new lookup in the .META. table?
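
To make the question concrete, here is a minimal sketch of how I could check this from the client side (assuming an HBase 0.90-style client API; the table name "usertable" and the plain string row keys are just placeholders, not my actual setup):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HRegionLocation;
  import org.apache.hadoop.hbase.client.HTable;

  public class RegionLookupCheck {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      // "usertable" is a placeholder for the table used in the benchmark.
      HTable table = new HTable(conf, "usertable");

      // Ask the client where each row lives. The first call may have to go
      // through .META.; the question is whether the second one does too when
      // both rows fall into the same region.
      HRegionLocation loc10 = table.getRegionLocation("10");
      HRegionLocation loc15 = table.getRegionLocation("15");

      System.out.println("row 10 -> "
          + loc10.getRegionInfo().getRegionNameAsString()
          + " on " + loc10.getServerAddress());
      System.out.println("row 15 -> "
          + loc15.getRegionInfo().getRegionNameAsString()
          + " on " + loc15.getServerAddress());

      table.close();
    }
  }

If both calls report the same region, then at least the two rows share a key range; whether the second request actually skips the .META. lookup is exactly what I'd like to confirm.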

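And just to be concrete about what I mean by "fully random" requests above, the read side of my client boils down to something like the following (a simplified, hypothetical sketch, not the actual benchmark code; the table name "usertable", the key format and the request count are assumptions):

  import java.util.Random;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  public class UniformRandomReads {
    public static void main(String[] args) throws Exception {
      final long totalRows = 40000000L;  // roughly the 40 million rows stored
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "usertable");
      Random rng = new Random();

      for (int i = 0; i < 100000; i++) {
        // Pick a key uniformly at random over the whole key space, so a
        // client almost never asks twice for the same row.
        long key = (long) (rng.nextDouble() * totalRows);
        Result r = table.get(new Get(Bytes.toBytes(Long.toString(key))));
        // ... verify/consume the result ...
      }
      table.close();
    }
  }

If every get that touches a not-yet-seen row had to go back to .META., that would match the hot region server I'm seeing; if the client caches locations per region rather than per row, it would not.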